Complex networks, data mining, causality, and beyond

Over the last few weeks Innaxis has published two papers that may be of interest to air transport researchers, among others.

The first paper is an extensive review of the combined use of complex network theory and data mining. Complex network analysis and data mining not only share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but also often address similar problems. Despite these commonalities, surprisingly few researchers take advantage of both methodologies, as many conclude that the two fields are either largely redundant or totally antithetic. In this review, we challenge this perception: we show that this state of affairs should be put down to contingent rather than conceptual differences, and that the two fields can in fact be used advantageously in a synergistic manner. The review starts by presenting an overview of both fields and by illustrating some of their fundamental concepts. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Finally, all discussed concepts are illustrated with worked examples through a series of hands-on sections, which we hope will help the reader put these ideas into practice. If you have ever wondered how a real-world problem can be tackled with these two techniques, you should definitely read this review!

The second paper addresses the common confusion between correlation and causality. Many causality metrics have been proposed in the literature, all sharing the same drawback: they are defined for time series. In other words, the system (or systems) under analysis must display a time evolution. Associating causality with the temporal domain is intuitive, given the way the human brain incorporates time into our perception of causality; nevertheless, this association creates some rather important problems.

For instance, suppose one is trying to detect whether there is a causal relation between the workload of an air traffic controller and the appearance of loss of separation events. These events are only defined at one point in time. To illustrate, one can detect an instance of a loss of separation and check the corresponding workload, then do the same for another event, and so forth. In the end, the researcher would get two vectors of features that do not encode any temporal evolution; in other words, consecutive values are not correlated. So, in this situation, how can we detect whether a true causality (and not just a correlation) is present?

In this paper we propose a novel metric able to detect causality within static data sets, by analysing how extreme events in one element correspond to the appearance of extreme events in a second element (refer to the picture above for a graphical representation). The metric is able to detect non-linear causalities, to analyse both cross-sectional and longitudinal data sets, and to discriminate between real causalities and correlations caused by confounding factors.
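To make the idea of looking at extreme events in static data more concrete, here is a minimal Python sketch. It is not the metric defined in the paper: it only compares how often the second variable is extreme when the first one is, against how often it is extreme overall, and it does nothing about confounding factors or the direction of the relation. The function names, the 0.8 quantile and the numbers are all assumptions made up for illustration.

```python
def extreme_mask(values, quantile=0.8):
    """Flag the observations at or above the given empirical quantile."""
    ordered = sorted(values)
    cutoff_index = min(int(quantile * len(ordered)), len(ordered) - 1)
    cutoff = ordered[cutoff_index]
    return [value >= cutoff for value in values]


def extreme_cooccurrence(first, second, quantile=0.8):
    """Return (P(second extreme | first extreme), P(second extreme))."""
    first_extreme = extreme_mask(first, quantile)
    second_extreme = extreme_mask(second, quantile)
    joint = sum(a and b for a, b in zip(first_extreme, second_extreme))
    conditional = joint / sum(first_extreme)
    base_rate = sum(second_extreme) / len(second)
    return conditional, base_rate


# Made-up paired observations, one pair per event; the order of the
# pairs carries no meaning (there is no time axis).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

conditional, base_rate = extreme_cooccurrence(x, y)
print(conditional, base_rate)
```

A conditional probability well above the base rate only says that extremes tend to coincide; telling a real causality from a confounded correlation is precisely what the paper's metric adds on top of this.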

If you are interested in these ideas, feel free to have a look at these two papers:

M. Zanin et al., Combining complex networks and data mining: why and how. Physics Reports (2016), pp. 1-44. http://authors.elsevier.com/a/1T3yF_8QfbYE-k. Also available at: http://arxiv.org/abs/1604.08816
M. Zanin, On causality of extreme events. PeerJ. Also available at: http://arxiv.org/abs/1601.07054

If you have questions about them, please contact M. Zanin at mzanin@innaxis.org

Finally, Seddik Belkoura is going to present a paper at the forthcoming ICRAT 2016, Philadelphia, about the use of the static causality metric to study delay propagation. You can find the paper on the official website of the conference (http://www.icrat.org/), and also by contacting him at sb@innaxis.org.

Innaxis farthest east: CHINA! 1st EU-US-Chinese symposium on CS in Air Transport

From April 10 to 12, our Principal Researcher Massimiliano Zanin made a long trip. The objective: to co-organise the first Chinese / EU / USA Symposium on Complexity Science in Air Transportation, which took place at Beihang University, Beijing.

The event was a success, with invited keynotes by leading researchers such as Shlomo Havlin of Bar-Ilan University and Mark Hansen of UC Berkeley. Also noteworthy was the participation of the ComplexWorld network, through Andrew Cook (University of Westminster) and Fabrizio Lillo (Scuola Normale Superiore di Pisa).

As for Massimiliano, he presented an unconventional idea in his talk “The air transport vs. the human brain: two worlds apart?”: the human brain and air transport are not such different systems, and they can (and ought to) be studied using the same techniques drawn from statistical physics. By considering air transport as an information processing system, the strategies used in neuroscience can be seamlessly adapted to improve our knowledge of processes like delay propagation. The idea was illustrated with several examples drawn from research done in collaboration with Seddik Belkoura and Andrew Cook over the past two years.

More information about the event is available at: http://airnets.de/Symposium2016/index.html

Additionally, if reading Chinese is not a problem for you, you may be interested in the press release on the Beihang University website: http://news.buaa.edu.cn/zhxw/95477.htm

Complex, functional and multi-layer networks: from the brain to air transport

This post was written by Innaxis researcher Massimiliano Zanin.

In the last few years, researchers have realized that interactions between the constituent elements of complex systems seldom develop on a single channel. Let’s take the case of a social network: information exchange may happen orally, electronically, or even indirectly; additionally, people interact according to different types of relationships, like friendship and co-working. This is important because the type of information shared may depend significantly on the channel and on the type of relation: you would probably not say the same thing to a co-worker in an email as you would to a significant other face to face. Because of this, it may be necessary to include different types, or layers, of links in order to obtain a meaningful representation of the system under study. Neglecting this multi-layer structure, or in other words working with the projected network, may alter our perception of the topology and dynamics, leading to a wrong understanding of the properties of the system.

For a couple of years now, I’ve been interested in the multi-layer structure of the air transport system; see for instance Refs. [1, 2]. Clearly, not all connections are the same: a clear multi-layer structure is created by airlines and airline alliances, which make it easy for passengers to move within a layer but difficult to move between layers. Last year we published a huge monograph on multi-layer networks, which covers all aspects: from the definition of topological metrics and the analysis of dynamical processes up to a review of applications. You can find it in Ref. [3]. (But please, do not print it before checking the number of pages!)

More recently, I’ve started asking myself: “what about multi-layer functional networks?” Let’s take one step back, and see what functional networks are.

In the early stages of complex network theory, this paradigm was mainly used to analyze systems whose structure, either physical or virtual, could be directly mapped into a network. Once again, this is the case of the air transport system, as links (direct flights between pairs of cities) have a physical nature and are easily accessible. It soon became clear that in certain cases this is not possible, as the only information obtainable from the system itself is the evolution through time of some observables. Such measurable variables reflect the behavior of the interacting elements constituting the system, and as such, the value of every observable is expected to be a “function” of the values of its peers. When the structure of such interactions is inferred from the dynamics of the observables, the result is called a functional network.

Many examples of functional networks are available, but probably the most famous is the study of brain dynamics. First of all, it should be noted that physical connections between brain regions do exist, but they are quite difficult to assess… especially if you don’t want to damage the brain! Also, physical connections are interesting, but much more important are the connections that actually activate when the brain is performing some kind of task. A functional network representation can be the perfect solution. By considering the magnetic or electric field generated by spiking neurons, links are established whenever some kind of synchronisation is detected between the recorded time series, usually by means of metrics like Pearson’s linear correlation, Synchronization Likelihood, or Granger Causality. When two regions are synchronised, they are (probably; this point is open to discussion!) exchanging some kind of information, and thus participating in a specific computation: functional networks thus represent these collaborative processes.
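As a rough illustration of how a functional network is reconstructed from recorded time series, the Python sketch below links two nodes whenever the absolute Pearson correlation between their series reaches a threshold. The node names, the threshold of 0.8 and the numbers are assumptions made up for this example; real studies typically use longer recordings, statistical significance tests, or other metrics such as those named above.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson's linear correlation between two equally long series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def functional_network(series, threshold=0.8):
    """Return the set of node pairs whose series are strongly correlated."""
    links = set()
    for u, v in combinations(sorted(series), 2):
        if abs(pearson(series[u], series[v])) >= threshold:
            links.add((u, v))
    return links


# Made-up recordings for three hypothetical regions.
recordings = {
    "region_1": [1, 2, 3, 4, 5],
    "region_2": [2, 4, 6, 8, 10],   # follows region_1 exactly
    "region_3": [1, -1, 1, -1, 1],  # unrelated oscillation
}

print(functional_network(recordings))
```

Only the pair (region_1, region_2) ends up linked here; the choice of threshold is a modelling decision and changes the resulting network.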

Now, what about the multi-layer structure of the brain? It is well known that the human cortex has a six-layer structure, in which each layer is responsible for a different level of information abstraction and integration. This structure is nevertheless neglected, due to the limited spatial resolution of magnetic and electric sensors, and the analysed time series just correspond to the global activity of the top-most layers. We are thus projecting the multi-layer network into a single layer. Are we confident that the resulting network is still representative of the original brain activity? Notice that the non-linear nature of the projection process can foster the appearance of constructive or destructive interferences: a link may appear in the projection even if no relationship is present in any layer; or links in two layers can interfere, and disappear from the projection.
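The two interference effects can be reproduced with a toy Python example. It is a sketch under strong assumptions: the projection is modelled as a plain sum of the layer signals (a simplification; the real projection need not be linear), the signals are short, mean-centred and made up, and a link is declared when the absolute Pearson correlation reaches an arbitrary threshold of 0.4.

```python
from math import sqrt


def pearson(a, b):
    """Pearson's linear correlation between two equally long series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def collapse(layers):
    """Project several layers onto one by summing the signals node by node."""
    return {
        node: [sum(values) for values in zip(*(layer[node] for layer in layers))]
        for node in layers[0]
    }


def linked(signals, threshold=0.4):
    """True when nodes p and q are functionally linked in these signals."""
    return abs(pearson(signals["p"], signals["q"])) >= threshold


# Three mutually uncorrelated, made-up signals.
x = [1, 1, 1, 1, -1, -1, -1, -1]
u = [1, 1, -1, -1, 1, 1, -1, -1]
v = [1, -1, 1, -1, 1, -1, 1, -1]

# Constructive interference: no link in either layer, a link in the projection.
layer_1 = {"p": x, "q": u}
layer_2 = {"p": v, "q": x}
print(linked(layer_1), linked(layer_2), linked(collapse([layer_1, layer_2])))

# Destructive interference: a link in both layers, none in the projection.
small = [-a + 0.1 * b for a, b in zip(x, u)]
layer_3 = {"p": x, "q": x}
layer_4 = {"p": small, "q": [2 * s for s in small]}
print(linked(layer_3), linked(layer_4), linked(collapse([layer_3, layer_4])))
```

In the first case the projected correlation is 0.5 although each layer has zero correlation; in the second it drops to roughly 0.2 although both layers are perfectly correlated.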

How can we validate this hypothesis? It cannot be done with brain data, as we still cannot solve the spatial resolution problem – let’s see how technology will evolve in the next decade. I found a solution by moving back to aviation. Specifically, we can create functional networks of delays: nodes are airports, pairwise connected when there is a correlation (or causality) between the time series representing their average hourly delay. Airports are thus connected if a delay propagation process is detected between them. The concept is not new: see for instance the work performed in the POEM WP-E project [4]; the advantage is that delay propagation can be assessed without any modelling process, and without having to gather information about aircraft turn-arounds, crews, etc. Moreover, the availability of high-resolution real data allows the reconstruction of a complete multi-layer picture, in which each layer corresponds to a different airline. But we can also collapse the dynamics, in order to simulate the creation of a single-layer representation, and compare the structures of the single- and multi-layer representations.
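The data preparation step described above can be sketched in a few lines of Python: the same flight records yield either one set of hourly average delay series per airline (the layers) or a single collapsed set in which the airline is ignored. The record layout, the airline and airport labels, and the convention of using 0.0 for hours without flights are assumptions chosen for this illustration, not details taken from Ref. [5].

```python
from collections import defaultdict


def hourly_average_delay(flights, hours):
    """Map each airport to its average delay per hour (0.0 when no flight)."""
    totals = defaultdict(lambda: [0.0] * hours)
    counts = defaultdict(lambda: [0] * hours)
    for _airline, airport, hour, delay in flights:
        totals[airport][hour] += delay
        counts[airport][hour] += 1
    return {
        airport: [t / c if c else 0.0
                  for t, c in zip(totals[airport], counts[airport])]
        for airport in totals
    }


def layered_series(flights, hours):
    """Build one set of airport time series per airline (one layer each)."""
    by_airline = defaultdict(list)
    for record in flights:
        by_airline[record[0]].append(record)
    return {airline: hourly_average_delay(records, hours)
            for airline, records in by_airline.items()}


# Made-up records: (airline, airport, hour of the day, delay in minutes).
flights = [
    ("AL1", "APT_X", 0, 10.0),
    ("AL1", "APT_X", 0, 20.0),
    ("AL1", "APT_X", 1, 30.0),
    ("AL2", "APT_X", 0, 0.0),
    ("AL2", "APT_X", 1, 60.0),
    ("AL2", "APT_Y", 1, 15.0),
]

layers = layered_series(flights, hours=2)        # multi-layer picture
collapsed = hourly_average_delay(flights, hours=2)  # single-layer picture
print(layers["AL1"]["APT_X"], layers["AL2"]["APT_X"], collapsed["APT_X"])
```

Each set of series can then be turned into a functional network by testing every pair of airports for correlation or causality, which is what allows the single- and multi-layer structures to be compared.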

This is exactly what I’ve done in a paper recently published in Physica A [5]. The results are quite startling! First, the most central nodes in the projection do not correspond to the nodes of high centrality in each layer; therefore, the projected analysis gives biased estimates, which cannot reliably be used to detect the most critical elements in the system. If you then try to use a single-layer model to allocate resources, you would probably end up giving money to the wrong airport! Furthermore, when a simple dynamical model is executed, the magnitude of the error introduced by considering a single-layer projection is as big as the results themselves, indicating that any estimate obtained with this simplification is meaningless.

So, what does this mean in terms of complex systems modelling? Can we neglect the multi-layer structure? The answer is clearly NO.

Let’s consider the problem of modeling and forecasting the dynamics of the air transport network. First, the results imply that any simulation performed to understand the dynamics of the system may yield misleading results when the multi-layer structure created by airlines is neglected. In spite of this, most recent research in this field fails to include this essential ingredient, both in the analysis of delay propagation and in that of network robustness to disruptions and attacks. Second, it should be noted that the air transport system is created by the interactions between a large number of agents, which may create different layers along different dimensions. For instance, flights do not just share an airline; they may also be connected by the crew operating them. Disregarding these additional layer dimensions, like crews, aircraft types or flight type (cargo or passengers), may further bias our understanding of the system.

The same reasoning applies to the brain. If the most important airports, in terms of delay propagation, cannot reliably be detected with a projected functional network, then the identification of functional hubs in brain dynamics may likewise be distorted by the fact that the multi-layer structure of the cortex is neglected. Global hubs may not correspond to the most important nodes in each layer, so the single-layer analysis may misinform us about the real structure created by information flows. In other words, it is possible that all results obtained in neuroscience by means of functional networks are biased… quite a big problem!

Summing up: complex networks, and their functional version, are very powerful tools to understand the hidden dynamics behind real complex systems. Yet, one has to remember this: one layer does not fit all!

P.S.: if you want to play with the data of Ref. [5], you can find an interactive version of the paper here.

References:

[1] Cardillo, A., Gómez-Gardeñes, J., Zanin, M., Romance, M., Papo, D., Del Pozo, F., & Boccaletti, S. (2013). Emergence of network features from multiplexity. Scientific Reports, 3. Freely accessible here.

[2] Cardillo, A., Zanin, M., Gómez-Gardeñes, J., Romance, M., del Amo, A. J. G., & Boccaletti, S. (2012). Modeling the multi-layer nature of the European Air Transport Network: Resilience and passengers re-scheduling under random failures. arXiv preprint arXiv:1211.6839. Preprint available here.

[3] Boccaletti, S., Bianconi, G., Criado, R., Del Genio, C. I., Gómez-Gardeñes, J., Romance, M., … & Zanin, M. (2014). The structure and dynamics of multilayer networks. Physics Reports, 544(1), 1-122. Preprint available here.

[4] Cook, A., Tanner, G., Cristóbal, S., & Zanin, M. (2013). New perspectives for air transport performance. Third SESAR Innovation Days, 26th – 28th November 2013. PDF available here.

[5] Zanin, M. (2015). Can we neglect the multi-layer structure of functional networks? Physica A: Statistical Mechanics and its Applications. Preprint available here.

Data Science and Complex Systems applied to Aviation – Innaxis Workshop

Businesses have entered a new era of decision-making and managing principles due to the pervasive availability of large amounts of data and the drastic growth, in the last decade, in the capacity to store and process data. Aviation is not an exception; Data Science principles have started to emerge through research programmes and practical applications in the field, albeit more slowly in some business functions than others.

Data Science, as a set of fundamental principles that support and guide the principled extraction of information and knowledge from data, leans on well-known data-mining techniques. However, it goes far beyond these techniques, with successful data-science paradigms that provide specific application guidelines. Data-driven decision making involves principles, processes and techniques for understanding phenomena via the automatic analysis of data.

A data-analytic thinking approach will help to envision opportunities for improving data-driven decision making in different contexts. There is strong evidence that aviation performance can be improved substantially via data-driven decision making and data-science techniques drawing on big data. Data-science will support data-driven decision making in the aviation field, where the underlying principles have yet to be established, in order to be able to realize its potential.

Innaxis participates in various research programmes and works on different applications in this field. We will be organizing a workshop on Data Science applied to Aviation in Madrid, Spain, during October 2013. Please write to us at innovation@innaxis.org if you find this of interest and would like to receive information on the workshop (please state “Data Science workshop” in the subject).

Innaxis holds parallel session at The European Future Technologies Conference and Exhibition (FET11)

Carlos Alvarez, President of The Innaxis Foundation & Research Institute, chaired the parallel session “Complex Systems for an ICT-enabled Energy System” at The European Future Technologies Conference and Exhibition (FET11), held this past 4-6 May in Budapest, Hungary.

FET11 aims to provide a unique forum dedicated to future and emerging information technologies. It has proved successful in bringing together scientists, policy makers, industrialists, and other stakeholders to discuss the latest breakthroughs and challenges within the industry.

The parallel session organised by Innaxis explored the ways in which Complex Systems science can play a role in the modelling, control, simulation, and governance of the future Energy System. During the session, a foundation was laid for a new research community able to formulate innovative approaches and help pave the way for future European-scale initiatives.

Other speakers included:

Pablo Viejo, European Institute for Energy Research,

Nikos Hatziargyriou, National Technical University of Athens, and

Daniele Miorandi, The ComplexEnergy Project

An insightful discussion was held throughout the session, which brought together a wide variety of professionals, including some from Salzburg Research, Innova, and Smartlab-University of Genova.

Copies of the ComplexEnergy White Paper were also distributed to the attendees.

The presentation given by Carlos Alvarez introducing and kicking off the parallel session can be downloaded here.

Any questions or comments can be sent directly to info@innaxis.org.

First ComplexWorld Annual Conference

ComplexWorld will hold its first Annual Conference in Seville on July 6-8, 2011, hosted by the School of Engineering of the University of Seville.

The ComplexWorld Annual Conference is intended as a forum for Air Traffic Management scientists and PhD students, Complexity Science researchers and the ComplexWorld Network community, including Members, Participants, and SESAR WP-E investigators.

The aim of the Conference is to bring together researchers from academia, research establishments, and industry that share common interests and expertise in the field of ATM Complexity Management that lies at the intersection of Complexity Science and ATM.

The Conference will focus on new concepts and developments in areas aligned with the WP-E research theme ‘Mastering Complex Systems Safely’, which explores how Complexity Science can contribute to understanding, modelling, and ultimately driving and optimizing the behaviour and evolution of the ATM system that emerges from the complex relationships between its different elements.

The topics of the Conference include (but are not limited to):
  • Multiple spatio-temporal scales in ATM
  • Non-determinism and uncertainty in ATM
  • Emergent behaviour in ATM
  • Complex modelling of the ATM system
  • Validation and Verification of Complex ATM models
  • Design, Control and Optimization of Complex ATM models
  • Applications of Complex Systems to improve ATM performance
  • Characterization of Disturbances in ATM
  • ATM Resilience: Analysis of disturbance propagation and system stability and agility
  • Information management and decision-making mechanisms in ATM
  • Metrics at different scales, ontology and measurement in ATM
  • Other complex socio-technical and/or transport systems.

Written contributions to topics mentioned above or similar ones are sought. Papers with innovative ideas and/or technical progress will be preferred. Please submit your abstract (one A4 page) to info@complexworld.eu. Deadlines:

April 1: Abstract submission deadline
April 18: Notification of acceptance
June 1: Full Paper submission deadline
July 6-8: Conference

We hope that this information is of interest to you, and we encourage you to submit a paper and to attend the First ComplexWorld Annual Conference.

LOCATION
Escuela Superior de Ingenieros, Camino de los Descubrimientos, 41092 Sevilla, Spain

REGISTRATION
Anyone interested in registering should contact Innaxis by e-mail (info@complexworld.eu) including the following information:
– Name and institution
– Arrival and departure days

ComplexEnergy 2nd Workshop proves successful

Innaxis and Create-Net lead the preparation of the Research Roadmap.

On November 16th, The Innaxis Foundation & Research Institute, along with Create-Net, organised the second workshop of the ComplexEnergy project, bringing together a group of experts from the ICT, Energy and Complex Systems fields. The goal of the workshop was to discuss the future roadmap for energy research in the context of complexity science and ICT.

The meeting was quite productive thanks to the scientific and technical level of the experts. At the workshop, officials from the European Commission also advised on the best strategies for fostering research in this area. Innaxis and Create-Net will lead the preparation of a Research Roadmap that will take the form of a public deliverable to the European Commission during the first quarter of 2011.

More information on the ComplexEnergy project, and on the Energy/ICT/Complex Systems research fields in general, can be found at www.ComplexEnergy.eu.

Madri+d, leader in Spanish R&D communication, announces ComplexWorld Network

On the 21st of October, Madri+d posted an announcement of the official launch of ComplexWorld in their online weekly newsletter.

Madri+d is a network that brings together public and private research institutions and industry with the aim of improving regional competitiveness through knowledge transfer. As one of the few resources that compile all Spanish R&D news, it is widely renowned as the go-to place for the latest articles from government, industry, and academia.

The article stresses the aim of the network as bringing together researchers from universities, research establishments, and companies that share common interests and expertise in the study of Air Traffic Management from the Complexity Science perspective. The ComplexWorld network is led by Innaxis. More information can be found here.

The article, written in Spanish, reaches out to a large Spanish community and motivates them to become involved with the network and with European initiatives in general.

The Network is moving at a fast pace, and news about it has been widely disseminated. Please refer to the past blog posts concerning the Network's first draft of the White Paper and the SESAR WP-E Network Call for PhDs. Any questions can be sent to info@complexworld.eu.

The ComplexWorld Network releases first draft of White Paper + Call for PhDs

A first draft of the Complex ATM White Paper has been finished. This first draft of the White Paper will be presented and discussed during the WP-E Networks Joint Session at the INO 2010.

The ComplexWorld Network will soon be opening a private space within Innaxis's collaborative wiki-like site, so as to share additional useful information with all Network Participants (e.g. pointers to the White Paper references). More news regarding this will be published next week.

In addition to the first draft of the White Paper, the Network has officially opened the Call for PhDs. The Call is open to anyone who is aiming to start a PhD in a theme related to the ComplexWorld Network. The Call will be closing on December 1st, 2010.

If you have not done so already, please remember to join the Network's LinkedIn Group. It is a great way to engage in discussions and network with other Participants.

Finally, if you are not a Participant and you are interested in receiving the White Paper and the Call for PhDs, please contact us about formally registering your entity. The Network is open to all types of entities that would like to join this collaborative initiative. Questions or comments can be sent to: info@complexworld.eu.

Dates set for the first ComplexWorld Annual Congress

The Members of the ComplexWorld Network have set the dates for the first ComplexWorld Annual Workshop.

The ComplexWorld Network, funded by the Single European Sky ATM Research (SESAR) Programme within Work-Package E, will be holding its first Annual Workshop on the 6th-8th of July, 2011, in Seville (Spain).

This Workshop will be pivotal, since the Network's development over its entire first year will be discussed. Findings from the White Paper will be presented, as well as progress from the WP-E Complexity projects and the ComplexWorld PhD Programme. The Workshop will also include selected presentations on Complexity Science and ATM by Network Members, Participants and external scientists.

All Participants and interested entities are invited to attend the Workshop, and we will keep them informed as the details are developed. If you would like to become a Participant and be included in our distribution list, please send us an email at info@complexworld.eu.

For more information regarding the ComplexWorld network and the research topics explored, please check out the network's website at: www.complexworld.eu
