Wednesday, 3 July 2013

From Data to Knowledge Maps.


How is knowledge created? How do human beings create knowledge? Can synthetic knowledge be created? In this short article we look at these questions in a new light. But first of all, let's start with a definition of knowledge. According to Wikipedia:

"Knowledge is defined (Oxford English Dictionary) variously as (i) expertise, and skills acquired by a person through experience or education; the theoretical or practical understanding of a subject, (ii) what is known in a particular field or in total; facts and information or (iii) awareness or familiarity gained by experience of a fact or situation. Philosophical debates in general start with Plato's formulation of knowledge as "justified true belief". There is however no single agreed definition of knowledge presently, nor any prospect of one, and there remain numerous competing theories." 

Experience seems to be the key here. In the case of humans (and most probably animals too) experience (data) is collected by memorizing events. The brain then arranges these into organized sets of inter-related rules (i.e. a body of knowledge) which may be pictured via a Knowledge Map, such as the one above (related to air-traffic control). This is probably how the brain stores these rules: by organizing them into maps. The brain is also able to relate rules that belong to different pieces of knowledge. Imagine how the same knowledge of how the weather works influences both the way we drive and the way we dress: in this case the same rules belong to two maps. Rules are of course fuzzy and look more or less like this:

if A then B 

For example: when it rains, driving speed is low. Mature people, with more experience, attribute more importance to this and other rules than youngsters, who have very few data points (experience) and therefore tend not to take rain seriously when driving. Clearly, more data points help to strengthen the rule and make it more relevant when it comes to decision-making. Stronger rules mean a more consolidated body of knowledge (a more stable map topology). It is more difficult to fool a mature, experienced individual than a child precisely because of this.
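The idea that more data points strengthen a rule can be illustrated with a small sketch. It is not how OntoSpace (or the brain) weighs rules; it simply assumes that each past experience either supports the rule or does not, and uses the lower end of a Wilson confidence interval as a cautious "strength" score. The function name `rule_strength` and the counts are hypothetical.

```python
import math

def rule_strength(supporting: int, total: int, z: float = 1.96) -> float:
    """Wilson lower bound on the fraction of observations that support a rule.

    Assumption: every observation is a simple yes/no vote for the rule.
    """
    if total == 0:
        return 0.0  # no experience at all: the rule carries no weight
    p = supporting / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom

# Same observed proportion (90% of rainy trips were slow), different experience.
novice = rule_strength(supporting=9, total=10)
veteran = rule_strength(supporting=900, total=1000)
print(f"novice:  {novice:.2f}")
print(f"veteran: {veteran:.2f}")
```

Both drivers have seen the rule hold nine times out of ten, but the score for the driver with a thousand observations comes out clearly higher, which is the sense in which experience consolidates a rule.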

But rules may be weak not only because the underlying data is scarce. Many phenomena are intrinsically stochastic, and strong, crisp rules simply cannot be built for them. See, for example, the case below (a portion of the map shown above). While there is a clear and almost linear relationship between the take-off weight of an aircraft and the approach velocity of a landing aircraft, the relationship between landing weight and landing distance is far less clear. Adding more data points will not lead to a cleaner rule.
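A quick way to see why more data does not help in the stochastic case is to simulate it. The sketch below uses purely synthetic data (not the control-tower data discussed here) and assumes a linear relationship with Gaussian noise; the Pearson correlation stands in for the crispness of a rule, which is our simplification.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def sample_correlation(n, noise, seed=0):
    """Correlation of n synthetic points y = x + Gaussian noise, x uniform on [0, 1]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [rng.random() for _ in range(n)]
    ys = [x + rng.gauss(0.0, noise) for x in xs]
    return pearson(xs, ys)

for n in (200, 20_000):
    crisp = sample_correlation(n, noise=0.05)  # nearly deterministic relationship
    fuzzy = sample_correlation(n, noise=1.0)   # intrinsically noisy relationship
    print(f"n={n:>6}  crisp r={crisp:.2f}  fuzzy r={fuzzy:.2f}")
```

With a hundred times more points, the estimate for the noisy pair becomes more precise but stays low: the extra data pins down how weak the rule is, rather than making it stronger.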


OntoSpace generates rules and arranges them into maps in precisely the way described above. For the same data set numerous maps are constructed and many rules belong to the same maps. The above maps for air-traffic control have been generated by OntoSpace using real data collected in a control tower over a certain amount of time. Therefore, these rules are objective. However, one could construct them on a different basis, for example by interviewing dozens of air-traffic controllers and then processing the resulting data. The maps, in this case, would be subjective given the presence of a human component. Finally, one could simulate air traffic in an airport and use that data to generate the corresponding knowledge maps. In this case the human component would be less evident, in that it would appear only indirectly, via the model.

But what has complexity got to do with all this? Each body of knowledge (knowledge map) has its corresponding complexity measure. We could conjecture here that the brain builds and maintains maps according to the Principle of Minimum Entropy Production. This principle states that if more than one steady-state solution is compatible with the boundary conditions of the problem, then nature prefers the solution of minimum dissipative structure, i.e. the observed solution is the one with the minimum rate of entropy production. Our conjecture is that maps are built in a manner which reduces their complexity, i.e. Knowledge Maps in the brain are minimum-complexity maps. This, clearly, is a conjecture. However, since our complexity measure is a function of both map topology and map entropy, the conjecture seems quite plausible. Evidently, a mature brain holds and maintains a huge number of maps, which are dynamically related and stored. Having to manage a less complex set of maps is clearly more efficient, not only from an energetic standpoint but also from an organizational one.

This short note suggests that, to a certain degree, OntoSpace mimics the brain when it comes to transforming data into usable knowledge. We find increasing evidence for this conjecture. And how can the amount of knowledge be measured? Complexity is a good proxy. Because complexity is a measure of the "amount of structured information" and because it is measured in bits, it is a direct measure of how much information is contained in a set of inter-related (fuzzy) rules.
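The complexity measure itself is not spelled out in this note, so the following is not that measure. As a rough illustration of what it means to express the content of a single rule in bits, the sketch below computes the mutual information between two discretized variables, a standard information-theoretic quantity. The toy observations and the function name `mutual_information_bits` are hypothetical.

```python
import math
from collections import Counter

def mutual_information_bits(pairs):
    """Mutual information, in bits, between the two components of (a, b) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # The ratio p_ab / (p_a * p_b), written directly in terms of counts.
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi

# Hypothetical toy observations: (weather, driving speed).
crisp_rule = [("rain", "slow"), ("dry", "fast")] * 50
no_rule = [("rain", "slow"), ("rain", "fast"), ("dry", "slow"), ("dry", "fast")] * 25

print(mutual_information_bits(crisp_rule))  # weather fully determines speed
print(mutual_information_bits(no_rule))     # weather says nothing about speed
```

In this toy setting a perfectly crisp rule between two equally likely states carries one bit, and a pair of unrelated variables carries none; real, fuzzy rules fall somewhere in between.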