Saturday, 3 August 2013

A Structured Look at Cellular Automatons




From Wikipedia: A cellular automaton is a discrete model studied in computability theory, mathematics, physics, complexity science, theoretical biology and microstructure modelling. It consists of a regular grid of cells, each in one of a finite number of states, such as "On" and "Off" (in contrast to a coupled map lattice). The grid can be in any finite number of dimensions. For each cell, a set of cells called its neighbourhood (usually including the cell itself) is defined relative to the specified cell. For example, the neighbourhood of a cell might be defined as the set of cells a distance of 2 or less from the cell. An initial state (time t=0) is selected by assigning a state for each cell. A new generation is created (advancing t by 1), according to some fixed rule (generally, a mathematical function) that determines the new state of each cell in terms of the current state of the cell and the states of the cells in its neighbourhood. For example, the rule might be that the cell is "On" in the next generation if exactly two of the cells in the neighbourhood are "On" in the current generation, otherwise the cell is "Off" in the next generation. Typically, the rule for updating the state of cells is the same for each cell and does not change over time, and is applied to the whole grid simultaneously, though exceptions are known.
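The update rule described above can be sketched in a few lines of Python. This is only an illustration, not the tool used for the measurements in this post. It assumes that the rule numbers quoted further down (30, 54, 62, 90, 190, 250) follow the standard Wolfram numbering for elementary automata (one dimension, two states, a neighbourhood made of the cell and its two immediate neighbours). The periodic boundary, the grid width and the single "On" cell used as the initial state are also assumptions. The sketch draws the patterns; it does not compute their complexity.

```python
def step(cells, rule):
    """Advance an elementary cellular automaton by one generation.

    cells: list of 0/1 states; rule: Wolfram rule number (0..255).
    A periodic (wrap-around) boundary is assumed here for simplicity.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]
        centre = cells[i]
        right = cells[(i + 1) % n]
        # The three neighbourhood states form a number between 0 and 7;
        # the matching bit of the rule number is the cell's next state.
        index = (left << 2) | (centre << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


def evolve(rule, width=31, generations=15):
    """Return all generations, starting from a single 'On' cell in the middle."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


def render(history):
    """Draw each generation as one text row ('#' = On, '.' = Off)."""
    return "\n".join("".join("#" if c else "." for c in row) for row in history)


if __name__ == "__main__":
    for rule in (30, 90, 250):
        print(f"Rule {rule}")
        print(render(evolve(rule)))
        print()
```

Running the script prints a small text picture for each rule, which is enough to compare the patterns by eye before looking at the measured values.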

We have measured the complexity and extracted the complexity map of a few cellular automatons which may be found here and are illustrated in the image below:




While humans are good at recognising patterns and structure, rapid classification of patterns in terms of their complexity is not easy. For example, which is more complex in the above figure, Rule 250 or Rule 190? The answer is below.


Rule 30



Rule 54




Rule 62




Rule 90




Rule 190




Rule 250




It appears that the Rule 250 automaton is the most complex of all (C = 186.25), while the one with the lowest complexity is Rule 90 (C = 64.31). Not very intuitive, is it? "Intuition is given only to him who has undergone long preparation to receive it" (L. Pasteur).






www.ontonix.com





Friday, 2 August 2013

Correlation, Regression and how to Destroy Information.




(The above image is from an article by Felix Salomon - 23/2/2009).
When a continuous domain is transferred onto another continuous domain, the process is called transformation.

When a discrete domain is transferred onto another discrete domain, the process is called mapping.

But when a discrete domain is transferred onto a continuous domain, what is the process called? Not clear, but in such a process information is destroyed. Regression is an example. Discrete (often expensive to get) data is used to build a function that fits the data, after which the data is gently removed and life continues on the smooth and differentiable function (or surface) to the delight of mathematicians. Typically, democratic-flavoured approaches such as Least Squares are adopted to perpetrate the crime.

The reason we call Least Squares (and other related methods) "democratic" (in a democracy everyone gets one vote, even assassins who have been re-inserted into society, just like respectable, hard-working and law-abiding citizens) is that every point contributes in equal measure to the construction of the best-fit function. In other words, data points sitting in a cluster are treated equally with dispersed points. All that matters is the vertical distance from the sought best-fit function.

Finally, we have the icing on the cake: correlation. Look at the figure below, depicting two sets of points lying along a straight line.



The regression model is the same in each case. The correlations too! But how can that be? These two cases correspond to two totally different situations. The physics needed to distribute points evenly is not the same as the physics which makes them cluster into two groups. And yet in both cases statistics yields a 100% correlation coefficient without distinguishing between two evidently different situations. What's more, in the void between the two clusters one cannot use the regression model just like that. Assuming continuity a priori can come at a heavy price.
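The point can be checked numerically with a minimal Python sketch. Everything specific in it is hypothetical and chosen only for illustration: the line y = 2x + 1, the positions of the points, and the helper names fit_line and largest_gap. One data set is spread evenly along the line, the other sits in two clusters at its ends.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares line plus the Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def largest_gap(xs):
    """Widest interval of x that contains no data at all."""
    ordered = sorted(xs)
    return max(b - a for a, b in zip(ordered, ordered[1:]))


def line(x):
    """Hypothetical 'true' relationship used to generate both data sets."""
    return 2.0 * x + 1.0


# Case 1: points spread evenly along the line.
even_x = [float(i) for i in range(11)]
# Case 2: the same line, but the points sit in two separate clusters.
clustered_x = [0.0, 0.25, 0.5, 0.75, 1.0, 9.0, 9.25, 9.5, 9.75, 10.0]

even_y = [line(x) for x in even_x]
clustered_y = [line(x) for x in clustered_x]

if __name__ == "__main__":
    for name, xs, ys in (("even", even_x, even_y),
                         ("clustered", clustered_x, clustered_y)):
        slope, intercept, r = fit_line(xs, ys)
        print(f"{name:9s} slope={slope:.3f} intercept={intercept:.3f} "
              f"r={r:.3f} largest gap in x={largest_gap(xs):.2f}")
```

Both data sets return the same slope, the same intercept and a correlation coefficient of 1 (up to rounding). Only a look at the data themselves, here reduced to the largest gap in x (1 in the first case, 8 in the second), shows that the second case has a void in which the model is not supported by any observation.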

Clearly this is a very simple example. The point, however, is that not many individuals out there are curious enough to look a bit deeper into data (yes, even visually!) and ask basic questions when using statistics or other methods.

By the way, "regression" is defined (Merriam-Webster Dictionary) as "trend or shift to a lower or less perfect state". Indeed, when you kill information - replacing the original data with a best-fit line - this is all you can expect.






Thursday, 1 August 2013

Complexity, drug toxicity and drug design.



As the patents on many drugs launched in the 1990s will soon expire, the pharmaceutical industry finds itself at a turning point in its evolution, particularly as far as research and development are concerned. As the pipelines of new products are shrinking, the exposure of many companies is increasing. This will surely hurt revenue in the mid and long term. Further pressure comes from an increasingly turbulent economy, shareholders, a greater regulatory burden and rising operating costs, not to mention growing R&D costs. It is the high clinical development costs, in conjunction with shrinking drug discovery rates, that are leading to a decline in productivity in the pharmaceutical industry. Moreover, during the last decade, R&D productivity has decreased. Even though emerging technologies have enabled companies to develop multiple parallel options, and to test numerous compounds in early stages, such techniques are effective only when there exist databases of candidates as well as drug evaluation criteria. An important improvement is expected from establishing new methods for identifying unwanted toxic effects in early development phases, as well as from reducing the late-stage failure rate. Bio-informatics and biomarkers are expected to play an important role.

However, independently of new technologies, and in order to adapt, the pharmaceutical industry must re-think its current business model, which appears to be unsustainable in a rapidly changing and demanding market. Innovative medicines will be in demand, as the need for more personalised treatment grows for a quickly growing and fragmented population. In fact, as diagnosis methods improve, the need for more personalised and focused drugs will be inevitable. Pharmaceutical companies must transition from the old blockbuster model to a more fragmented and diversified offering of products. It appears, therefore, that the economic sustainability of the pharmaceutical industry hinges on innovation.

A major concern shared by all drug manufacturers is that of drug toxicity. A candidate molecule under investigation must be validated on animals before authorisation for trials in humans is granted. If these preclinical studies show good results, clinical trials with healthy volunteers follow. These aim to study drug efficacy and to exclude the presence of toxic effects. Subsequent trials on numerous patients with the targeted disease provide a statistical description of the drug's efficacy. There exist essentially two approaches to drug toxicity determination: knowledge-based models and QSAR (Quantitative Structure-Activity Relationship) rule-based models, which relate variations in biological activity to molecular descriptors. Evidently, any expert or rule-based system will see its efficacy bounded by the quality and relevance of the employed rules. Because of the inability to predict drug toxicity successfully, drug manufacturers report billion-dollar losses every year.

We wish to formulate a conjecture in relation to drug toxicity: the toxic effects of a molecule are proportional to its complexity. In other words, we suggest that a more complex molecule has greater potential to do damage and over a broader spectrum and that higher complexity may also imply greater capacity to combine with other molecules. The underlying idea is to use complexity as a ranking and risk-stratification mechanism for molecules.

Over the last decade, Ontonix has been developing and validating a novel approach to measuring complexity. The metric is a function of structure, entropy, data granularity and coarse-graining. It has been used successfully as an innovative risk-stratification and crisis-anticipation system in economics, medicine and engineering. The metric possesses the following properties:
  • The existence of a lower and upper bound. The upper bound is known as critical complexity.
  • In the vicinity of its lower complexity bound, a generic dynamic system behaves in deterministic fashion.
  • In the vicinity of its critical complexity, a system possesses a very high number of potential behavioral modes and spontaneous mode-switching occurs even in the presence of injection of very small amounts of energy.
  • A large number of components is not necessary to lead to high complexity. Systems with a large number of components can be considerably less complex than systems with a very small number of components. In essence, complicated does not necessarily imply complex.
Based on molecular modelling and molecular simulation techniques (Monte Carlo Simulation), one may readily measure the complexity of compounds and use this measure to classify and rank them. In other words, we suggest using complexity as a “biomarker”. A simple example of the concept is illustrated below, where two so-called Process Maps are shown. Each map is determined automatically by OntoSpace™. Such maps represent the structural properties of a given system, whereby relevant parameters are aligned along the diagonal and are linked by means of connectors (blue dots) which correspond to significant rules. Critical parameters – shown in red – correspond to hubs. The map on the left corresponds to a system with 94 rules and has a complexity of 28.4. The one on the right exhibits 69 rules and a complexity of 19.2. Supposing that the two maps correspond to two candidate drugs for the same target disease, the one on the right could correspond to a potentially less toxic candidate. As mentioned, this is a conjecture and needs to be verified.


Clearly, the logic is that if one can perform a given task with a less complex solution, that is probably a better solution. However, we also suggest that substances which function in the proximity of their corresponding critical complexities are globally less robust and, potentially, more toxic. Therefore, a higher value of complexity does not necessarily imply a worse alternative – it is ultimately the relative distance to the corresponding critical complexity which may turn out to be a better discriminant.
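A minimal Python sketch shows how ranking by absolute complexity and ranking by proximity to critical complexity can disagree. The two complexity values (28.4 and 19.2) are those of the maps above. The critical complexities (40.0 and 21.0) are hypothetical numbers invented for the illustration, and the simple ratio used as "relative complexity" is an assumption, not the metric computed by OntoSpace.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    complexity: float            # value quoted in the text
    critical_complexity: float   # HYPOTHETICAL upper bound, for illustration only

    @property
    def relative_complexity(self) -> float:
        """Fraction of the critical complexity already used (assumed measure)."""
        return self.complexity / self.critical_complexity


candidates = [
    Candidate("left map", 28.4, 40.0),
    Candidate("right map", 19.2, 21.0),
]

# Ranking 1: lowest absolute complexity first.
by_absolute = sorted(candidates, key=lambda c: c.complexity)
# Ranking 2: furthest from its own critical complexity first.
by_relative = sorted(candidates, key=lambda c: c.relative_complexity)

if __name__ == "__main__":
    for c in candidates:
        print(f"{c.name}: C = {c.complexity}, "
              f"C / C_critical = {c.relative_complexity:.2f}")
    print("Preferred by absolute complexity:", by_absolute[0].name)
    print("Preferred by relative complexity:", by_relative[0].name)
```

With these made-up bounds the right-hand map has the lower complexity but sits at about 0.91 of its critical value, against 0.71 for the left-hand map, so the two rankings point to different candidates.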






 

Monday, 29 July 2013

Complexity? It's All Relative.





 
Complexity is a measure of the total amount of structured information (which is measured in bits) that is contained within a system and reflects many of its fundamental properties, such as:

  • Potential - the ability to evolve and survive
  • Functionality - the set of distinct functions the system is able to perform
  • Robustness - the ability to function correctly in the presence of endogenous/exogenous uncertainties

In biology, the above can be combined in one single property known as fitness.

Like any mathematically sound metric, our complexity metric is bounded (metrics that can attain infinite values are generally not so useful). The upper bound, which is of great interest, is called critical complexity and tells us how far the system can go with its current structure.

Because of the existence of critical complexity, complexity itself is a relative measure. This means that all statements such as "this system is very complex, that one is not" are without value until you refer complexity to its corresponding bounds. Each system in the Universe has its own complexity bounds, in addition to its current value. Because of this, a small company can, in effect, be relatively more complex than a large one, precisely because it operates closer to its own complexity limit. Let us see a hypothetical example.

Imagine two companies: one is very large, the other small. Suppose each one operates in a multi-storey building and that each one is hiring new employees. Imagine also that the small company has reached the limit in terms of office space while the larger company is constantly adding new floors. This is illustrated in the figure below.





In this hypothetical situation, the smaller company has reached its maximum capacity and adding new employees will only make things worse. It is critically complex and, with its current structure, it cannot grow - it has reached its physiological growth limit and can do two things:

  • "Add more floors" (this is equivalent to increasing its critical complexity - one way to achieve this is via acquisitions or mergers)
  • Restructure the business
If a growing business doesn't increase its own critical complexity at the appropriate rate it will reach a situation of "saturation". If you "add floors" at a rate that is not high enough, the business will become progressively less resilient and will ultimately reach a situation in which it will not be able to function properly, not to mention facing extreme events.

Complexity is a disease of our modern times (more or less like high cholesterol, which is often a consequence of our lifestyles). Globalisation, technology and uncertainty in the economy are making life complex and are increasing the complexity of businesses themselves. An apparently healthy business may hide (but not for long!) very high complexity. Just as very high cholesterol levels are rarely a good omen, the same may be said of high complexity. This is why companies should run a complexity health-check on a regular basis.

So, the next time you hear someone say that something is complex, ask them about critical complexity. It's all relative!
 
 
 
 
 
 

Complexity: A Link Between Science and Art?



Serious science starts when you begin to measure. According to this philosophy we constantly apply our complexity technology in attempts to measure entities/phenomena/situations that so far haven't been quantified in rigorous scientific terms. Of course we can always apply our subjective perceptions of the reality that surrounds us so as to classify and rank, for example, beauty, fear, risk, sophistication, stress, elegance, pleasure, anger, workload, etc. Based on our perceptions we make decisions, we select strategies, we make investments. When it comes to actually measuring certain perceptions, complexity may be a very useful proxy.

Let's consider, for example, art. Let's suppose that we wish to measure the amount of pleasure resulting from the contemplation of a work of art, say a painting. We can postulate the following conjecture: the pleasure one receives when contemplating a work of art is proportional to its complexity. This is of course a simple assumption but it will suffice to illustrate the main concept of this short note. Modern art often produces paintings which consist of a few lines or splashes on a canvas. You just walk past. When, instead, you stand in front of a painting by, say, Rembrandt van Rijn, you experience awe and admiration. Now why would that be the case? Evidently, painting something of the calibre of The Night Watch is not a matter of taking a spray gun and producing something with the aid of previously ingested chemical substances. Modern "art" versus a masterpiece. Minutes of delirium versus years of hard work. Splashes versus intricate details. Clear, but how do you actually compare them?

We have measured the complexity of ten paintings by Leonardo da Vinci and Rembrandt. The results are reported below without further comments.

Leonardo
and Rembrandt





Sunday, 28 July 2013

Why is Resilience (in economics) such a difficult concept to grasp?


Resilience, put in layman's terms, is the capacity to withstand shocks, or impacts. For an engineer it is a very useful characteristic of materials, just like Young's modulus, the Poisson ratio or the coefficient of thermal expansion. But high resilience doesn't necessarily mean high performance, or vice versa. Take carbon fibres, for example. They can have a Young's modulus of 700 gigapascals (GPa) and a tensile strength of 20 GPa while steel, for example, has a Young's modulus of 200 GPa and a tensile strength of 1-2 GPa. And yet, carbon fibres (as well as alloys with a high carbon content) are very fragile while steel is, in general, ductile. Basically, carbon fibres have fantastic performance in terms of stiffness and strength but respond very poorly to impacts and shocks.

What has all this got to do with economics? Our economy is extremely turbulent (and this is only just the beginning!) and chaotic, which means that it is dominated by shocks and, sometimes, by extreme events (like the unexpected failure of a huge bank or corporation, the default of a country which needs to be bailed out, like Ireland, Greece or Portugal, or natural events such as tsunamis). Such extreme events send out shock waves into the global economy which, by virtue of its interconnectedness, propagates them very quickly. This can cause problems to numerous businesses even on the other side of the globe. Basically, the economy is a super-huge dynamic and densely interconnected network in which the nodes are corporations, banks, countries and even single individuals (depending on the level of detail we are willing to go to). It so happens that today, very frequently, bad things happen at the nodes of this network. The network is in a state of permanent fibrillation. It appears that the intensity of this fibrillation will increase, as will the number of extreme events. Basically, our global economy will become more and more turbulent. By the way, we use the word 'turbulence' with nonchalance but it is an extremely complex phenomenon in fluid dynamics with very involved mathematics behind it - luckily, people somehow get it! And that's good. What is not so good is that people don't get the concept of resilience. And resilience is a very important concept not just in engineering but also in economics. This is because in turbulence it is high resilience that may mean the difference between survival and collapse. High resilience can in fact be seen as a sort of stability. It is not necessary to have high performance to be resilient (or stable). In general, these two attributes of a system are independent.
To explain this difficult (some say it is counter-intuitive) concept, let us consider Formula 1 cars: extreme performance, for very short periods of time, extreme sensitivity to small defects with, often, extreme consequences. Sometimes, it is better to sacrifice performance and gain resilience but this is not always possible. In Formula 1 there is no place for compromise. Winning is the only thing that counts.

But let's get back to resilience versus performance and try to reinforce the fact that the two are independent. Suppose a doctor analyzes blood and concentrates on the levels of cholesterol and, say, glucose. You can have the following combinations (this is of course a highly simplified picture):

Cholesterol: high, glucose: low
Cholesterol: low, glucose: high
Cholesterol: low, glucose: low
Cholesterol: high, glucose: high

You don't need to have high cholesterol to have high glucose concentration. And you don't need to have low glucose levels to have low levels of cholesterol.

Considering, say, the economy of a country, we can have the following conditions:

Performance: high, resilience: low
Performance: low, resilience: high
Performance: low, resilience: low
Performance: high, resilience: high

Just because the German economy performs better than that of many countries, it doesn't mean it is also more resilient. This is certainly not intuitive but there are many examples in which simplistic linear thinking and intuition fail. Where were all the experts just before the sub-prime bubble exploded?








Measuring the Complexity of Fractals



We don't need to convince anyone of the importance of fractals to science. The question we wish to address is measuring the complexity of fractals. When it comes to more traditional shapes, geometries or structures such as buildings, plants, works of art or even music, it is fairly easy to rank them according to what we perceive as intricacy or complexity. But when it comes to fractals the situation is a bit different. Fractals contain elaborate structures of immense depth and dimensionality that are not easy to grasp by simple visual inspection or intuition.

We have used OntoNet to measure the complexity of the two fractals illustrated above. Which one is the more complex of the two? And by how much? The answer is the following:

Fractal on the left - complexity = 968.8
Fractal on the right - complexity = 172.9

This means that the first fractal is about 5.6 times more complex. At first sight this may not be obvious, as the image on the right appears to be more intricate, with much more local detail. However, the image on the left presents more global structure, hence it is more complex. The other image is more scattered, with smaller local details, and globally speaking it is less complex. This means that it transmits less structured information, which is precisely what complexity quantifies. Finally, below we illustrate the complexity map of the fractal on the left-hand side.




