Sunday, 14 July 2013

How To Transform a Set of Curves Into One

Suppose you are monitoring a system, say a human brain, a chemical plant, an asset portfolio or a traffic system. Suppose there are hundreds of parameters to monitor. How do you get an idea of how things are going overall? Which parameter do you look at? How do you "add them up"? How can you blend all the information into one parameter that conveys the state of the system? The plot above shows a section of an electroencephalogram containing only a small number of channels (electrodes). Analysing a few curves at the same time is feasible, even by simple visual inspection, but with hundreds or thousands of channels it is not, regardless of the observer's experience.

One way to map (transform) multiple channels of data onto one scalar function is via complexity. Complexity is a scalar function obtained from a sampled vector x(t) of N channels. The function is computed as C = f(T; E), where T is the Topology of the corresponding System Map (see examples of such maps for an EEG or an ECG) and E is entropy. Given that entropy is measured in bits, C is also measured in bits, and represents the total amount of structured information within the N-channel data set.
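To make the idea of C = f(T; E) concrete, here is a minimal sketch in Python. It is not the formula used by OntoNet™, which is not given here. As a stand-in, it assumes that the topology T is the set of channel pairs whose mutual information exceeds a threshold, and that C is the sum of the mutual information (in bits) over those links. The names complexity_proxy, bins and threshold are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins=4):
    """Map each sample to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # a constant channel carries no information
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy_bits(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information_bits(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B), in bits."""
    return entropy_bits(a) + entropy_bits(b) - entropy_bits(list(zip(a, b)))


def complexity_proxy(channels, bins=4, threshold=0.1):
    """Return (C, links) for a list of equally long channels.

    links is the assumed topology: the channel pairs (i, j) whose mutual
    information exceeds the threshold. C is the sum over those links,
    so it is measured in bits. This is an illustrative proxy only.
    """
    coded = [discretize(ch, bins) for ch in channels]
    links = {}
    for i, j in combinations(range(len(coded)), 2):
        mi = mutual_information_bits(coded[i], coded[j])
        if mi > threshold:
            links[(i, j)] = mi
    return sum(links.values()), links


# Three channels: two that move together and one that never changes.
channels = [
    [0, 1, 2, 3, 0, 1, 2, 3],
    [0, 1, 2, 3, 0, 1, 2, 3],
    [5, 5, 5, 5, 5, 5, 5, 5],
]
c_value, links = complexity_proxy(channels)
print(c_value, links)
```

With this proxy, only channels that vary together contribute to C. A channel that never changes adds nothing, however many such channels are monitored.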

If each of the N channels is sampled at a certain frequency and complexity is computed within a moving window of a certain width, the result is a time-dependent function of data complexity, C(t). The process is fast and may be performed in real time using OntoNet™, our Quantitative Complexity Management engine, as illustrated in the scheme below (the blue arrow indicates the direction of time flow).
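The moving-window step can be sketched as follows. This is an illustration under stated assumptions, not the OntoNet™ implementation: the function name sliding_measure is hypothetical, and total_variance is only a placeholder for a complexity measure that returns one scalar per window.

```python
from collections import deque
from statistics import pvariance


def sliding_measure(samples, width, measure):
    """Yield (t, value) for every full window ending at sample index t.

    samples: iterable of N-channel sample vectors x(t), one per time step.
    width:   number of consecutive samples in the moving window.
    measure: callable that takes a list of per-channel series and
             returns one scalar (the stand-in for C).
    """
    window = deque(maxlen=width)  # the oldest sample drops out automatically
    for t, x in enumerate(samples):
        window.append(x)
        if len(window) == width:
            channels = list(zip(*window))  # transpose to one series per channel
            yield t, measure(channels)


def total_variance(channels):
    """Placeholder measure (an assumption): sum of per-channel variances."""
    return sum(pvariance(ch) for ch in channels)


# Two channels, five time steps; the last sample is a jump in channel 0.
samples = [(0, 0), (1, 0), (0, 0), (1, 0), (5, 0)]
for t, value in sliding_measure(samples, width=2, measure=total_variance):
    print(t, value)
```

Because the window is consumed one sample at a time, the same loop works on a recorded file or on a live stream.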

What the C(t) function represents is:

1. Total amount of structured information contained in the data window at time t. This includes all channel interactions.

2. An indication of the overall variability of the data. The higher the value of C(t), the more each channel varies in conjunction with other channels. This points to a general "increase of activity" within the system.

In addition, C(t) can be tracked to detect:

1. Imminent "traumas" within a given system. In general, traumas are preceded by sudden increases in C(t).

2. Anomalies, phase transitions and other situations which are generally "invisible" to conventional statistical methods.
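As an illustration of how a sudden increase in C(t) might be flagged, here is a minimal sketch. The rule is an assumption made for this example, not the detection criterion used in practice: a value is flagged when it exceeds the mean of the previous values by more than k standard deviations. The names sudden_increases, lookback and k are hypothetical, and the numbers are made up.

```python
from statistics import mean, pstdev


def sudden_increases(c_values, lookback=5, k=3.0):
    """Return the indices t at which C(t) jumps above its recent level.

    A value is flagged when it is greater than
    mean + k * standard deviation of the previous `lookback` values.
    If the recent history is perfectly flat, any increase is flagged.
    """
    flagged = []
    for t in range(lookback, len(c_values)):
        history = c_values[t - lookback:t]
        limit = mean(history) + k * pstdev(history)
        if c_values[t] > limit:
            flagged.append(t)
    return flagged


# Made-up complexity values, in bits, with one jump at index 6.
c_of_t = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 14.0, 10.0]
print(sudden_increases(c_of_t))
```

The values of lookback and k set the trade-off between early warning and false alarms, and would need tuning for each system.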

The same analysis can be performed off-line as well as in real time.

Today, as we quickly generate massive amounts of data, techniques such as the one described above can help significantly in extracting useful and actionable information.