When airlines and ANSPs come together

The SafeClouds.eu project team came together for the last Consortium Meeting on November 6th and 7th in Majorca. Big thanks to Air Europa who supported and hosted the meeting.

For two days, five airlines (Air Europa, Iberia, Norwegian, Pegasus and Vueling) met with the three ANSPs participating in the project (Austrocontrol, ENAIRE and LFV), along with Eurocontrol, AESA and EASA (the Spanish and European safety authorities respectively). The purpose of the meeting was to collaboratively discuss their broad experience in safety. The combination of airspace users, including pilots, ATCOs, FDM safety analysts and safety authority representatives, provided a very inspiring and clear overview of present-day aviation safety analysis and of its challenges and opportunities in the transition from event-driven to data-driven safety intelligence. These meetings provide critical insight for data scientists and are key to supporting the user-driven approach adopted for the project since its conception. The users have defined relevant safety scenarios where data science and ML techniques can provide added value over the incident-analysis tools they currently have. The scenarios (runway performance, unstable approaches, group proximity and airprox) drive the descriptive and predictive analytics for SafeClouds.eu. The consortium meetings are an important opportunity to present results from the data analysis to the users and to discuss and capture their requirements (both individually and in groups) for future work. As the users are the final recipients of the data analysis work performed within SafeClouds.eu, it is key to ensure this alignment so that their visualization dashboards provide relevant and usable ML tools.

SafeClouds is currently immersed in running the analytics based on three years of FDM data, which is merged with traffic data from Eurocontrol, weather data and surface radar data, among other data sources as required by the use case. This comes after investing the first months of the project in developing the legal and technical framework for securely managing and protecting the data. The result is DataBeacon, a data infrastructure that, through several security layers and innovative cryptographic techniques, enables data protection and merging while preserving confidentiality. This aviation ML platform, with its different features and applications, enables data analysts to perform their analysis over various aviation data sources without actually having access to the databases. In all, this provides the necessary level of trust to the users and data owners.

With these developments, SafeClouds.eu is going one step further by providing breakthrough analytics on safety precursors based on ML techniques. This analysis will combine airline FDM data with traffic, ADS-B and METEO data, providing improved information on the scenario that individual airspace users cannot otherwise access. This gives airlines, ANSPs and airports an enhanced understanding of the main causes that influence a safety incident, which can support decision making when developing customized mitigation actions. Interested in more details on the techniques and results? A follow-up post will be published soon.

SafeClouds Mid-term Review

 

On 11 April, we had a successful mid-term review for our H2020 project, SafeClouds. The meeting was hosted by Eurocontrol in Brussels, with participants from all entities involved in the project.

Read Eurocontrol's post on the mid-term review here!

SafeClouds presented in the EU-US workshop

Last January, a team of European and American entities organised a workshop on transatlantic research with the support of the European Commission. The event was hosted by the FAA in their facilities at the William J. Hughes Technical Center in Atlantic City. Attendees were mostly US and European companies interested in how the different research threads could be boosted through international cooperation.

Among the subjects discussed during the three-day event, data analytics was mentioned several times as an interesting area with applicability to different areas of industrial research. In particular, safety data analytics was covered in three presentations. First, the FAA presented ASIAS, a programme that collects data from more than 40 carriers and has been leading the developments in this field for more than a decade. Second, EASA presented the Data4Safety programme, recently launched and in a proof-of-concept stage. Lastly, Innaxis presented the research programme SafeClouds.eu, including the latest technological developments and how they could complement the existing initiatives by providing and exploring new research avenues.

Infrastructure needed for Aviation Data Analytics

Author: Jens Krueger

Safety is key in aviation. To reach maximum safety, stakeholders are collecting a large amount of data for analytics. Ultimately, researchers want to not only evaluate the causal dependencies of safety critical events, but to also enhance operational efficiency.

Presently, such data is stored in isolated data silos. The goal of SafeClouds.eu is twofold: to advance data-driven analytics for safety and efficiency, and to manipulate data outside of the silos to enable data sharing and merging between different stakeholders, including data owners. However, the infrastructure must ensure that personal or confidential data is not leaked to third parties, all while maintaining data sharing capabilities.
In order to address the requirements for data protection and analysis, the SafeClouds.eu infrastructure must enable the following data analysis paradigms:

  • Fusion of identified confidential data streams into a single de-identified data stream. Identified data is data that contains information that could be used to directly or indirectly (e.g. via linking attacks) expose personal data linked to a specific group of people or individuals.
  • Access to the de-identified data streams for SafeClouds.eu data analysis.
  • Information sharing of the analysis of restricted and confidential data from aviation stakeholders (airlines, ANSPs) for blind benchmarking.
  • Access governance should be in place, including specifics on data access (which should be continuously monitored) and its limitations.
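
The first paradigm in the list above, fusing identified streams into a single de-identified stream, can be illustrated with a keyed hash. The following is a minimal sketch under stated assumptions, not the SafeClouds.eu implementation: the record fields, the helper names (`pseudonymise`, `deidentify`, `fuse`) and the shared secret key are all hypothetical. The idea is that two data owners who apply the same keyed, non-reversible function to a flight identifier obtain the same token, so their records can still be joined after the identifier itself has been removed.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice it would be managed outside the code.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(flight_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a flight identifier with a keyed, non-reversible token."""
    return hmac.new(key, flight_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records: list[dict], key: bytes = SECRET_KEY) -> list[dict]:
    """Return copies of the records with 'flight_id' swapped for a token."""
    out = []
    for rec in records:
        clean = {k: v for k, v in rec.items() if k != "flight_id"}
        clean["token"] = pseudonymise(rec["flight_id"], key)
        out.append(clean)
    return out


def fuse(stream_a: list[dict], stream_b: list[dict]) -> list[dict]:
    """Join two de-identified streams on their shared token."""
    index = {rec["token"]: rec for rec in stream_b}
    return [{**rec, **index[rec["token"]]} for rec in stream_a if rec["token"] in index]


# Hypothetical example records from two different data owners.
fdm = [{"flight_id": "XY123-2017-11-06", "max_pitch_deg": 7.5}]
radar = [
    {"flight_id": "XY123-2017-11-06", "min_separation_nm": 5.2},
    {"flight_id": "ZZ999-2017-11-06", "min_separation_nm": 8.0},
]

fused = fuse(deidentify(fdm), deidentify(radar))
```

A keyed hash alone does not rule out the linking attacks mentioned above, since the remaining fields may still single out a flight; it only removes the direct identifier while keeping the records joinable.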

The infrastructure architecture must reflect data protection requirements in order to guarantee the different data confidentiality levels. The physically-independent components are as follows:

Local system:
The local system sits at the premises of the participating companies (e.g. airlines and ANSPs) and stores raw datasets from different source systems. The data leverages other sources to comprise a 360-scenario dataset with enhanced informational context and processing. The global cloud system should provide such datasets. Finally, the dataset is de-identified and made accessible. Authorised third parties are allowed access only for data management and administrative tasks.

Dedicated private cloud:
Each participating party will be provided with a private segment of the cloud infrastructure that is logically and physically independent. It is used for de-identified data storage and analytics. Data scientists from SafeClouds.eu official partners will have access to the de-identified data under the data protection agreements.

Global cloud system:
The global cloud system is divided into two parts. The global storage will hold all open datasets (Meteo, ADS-B, SWIM, Radar). It will also ensure dataset quality and accessibility through pre-processing. In addition, it will grant access from the local systems and the dedicated private cloud. Note that the global processing infrastructure performs analytics on joint datasets from all dedicated private clouds. 

Figure 1: Hierarchical architecture of the SafeClouds.eu infrastructure

The SafeClouds.eu Cloud Infrastructure

The SafeClouds.eu cloud infrastructure is built on Amazon Web Services (AWS). One of the main advantages of AWS is that it consists of several datacenters located around the world. This enables SafeClouds.eu to reduce communication latencies by choosing the most appropriate datacenter locations. Each AWS datacenter is located within a region, and each region has several datacenters, or Availability Zones. Each Availability Zone is attached to a different part of the power grid to limit the damage of a potential power outage. Any distributed cloud application running in AWS must consider the tradeoff between fault tolerance, which favours placing nodes in different Availability Zones, and performance, which favours keeping computational resources as close together as possible.

For SafeClouds.eu, AWS enables the infrastructure to horizontally scale with an increasing number of stakeholders or increased processing or storage requirements.

To ensure security, AWS Identity and Access Management (IAM), virtual private clouds (VPC) and encryption for data in motion and at rest are used.

Remarks

The SafeClouds.eu infrastructure enables data protection, data sharing and flexibility. Data safety and security are key to gaining trust from data providers; without them, the success of the overall project is at risk. This blog post stresses the importance of a distributed and secure infrastructure and gives a first look into how the overall infrastructure architecture is designed. However, although the base infrastructure technology supports scalability, security and other factors, the most important challenge is to leverage and implement those technological capabilities. The main security threats include human failure, bugs and wrong implementations. To account for user error, the infrastructure must be as automated as possible and rely on clearly defined, deterministic processes. In addition, each entry point must be defined and encapsulated while keeping accessibility and usability. SafeClouds.eu will be using this precise infrastructure for aviation data analytics, and will share those findings with the aviation and data science communities.

Discovering hidden knowledge in aviation data

Author: Paula Lopez (INX)

Machine learning is producing outstanding results, although we know it is still far from emulating human intelligence. Applying machine learning techniques, including multi-level artificial neural networks (deep learning), to tasks such as speech or image recognition has continuously produced better results (e.g. digital assistants like Apple's Siri or Amazon's Echo). In spite of the significant progress achieved so far, there are still some challenges that need to be resolved before machine learning can be applied in most industries. On the one hand, we face a fragmented ecosystem, meaning that there is a gap between the data scientists and the domain experts working in each particular sector. In order to convert data into knowledge, collaboration between both kinds of expertise is required. On the other hand, challenges related to data management and data analysis need to be addressed prior to implementing machine learning techniques in most industries. These challenges, just to name a few, include heterogeneous and distributed data sources, data validation, distributed data architectures, data security, scalability, real-time analysis and decision support, and data visualization.

However, we cannot fall into the error of assuming that a machine learning problem can be addressed through a generic standard application of a set of algorithms and techniques. Machine learning problems are highly case-dependent and, therefore, the purpose of the analysis needs to be carefully defined in advance. This is what we (at Innaxis) call Purposeful Knowledge Discovery which also was the title of the keynote speech made by Innaxis President Carlos Alvarez Pereira at the SESAR Innovation Days 2017 in Belgrade. And this is, precisely, the approach we follow at Innaxis in our data science research projects, like SafeClouds.eu: an H2020 project aimed at enhancing aviation safety through the application of data science techniques.

SafeClouds.eu includes a team of 16 partners including data scientists and engineers from several research entities (Innaxis, Tadorea, Fraunhofer, TU Munich, Linköping University, TU Delft and CRIDA) and a group of airlines, ANSPs and safety authorities (Iberia, Air Europa, Vueling, Norwegian, Pegasus, LFV, Eurocontrol, AESA and EASA). This group of airspace stakeholders is the user group of the project; in other words, they define the questions that the data is expected to answer. These questions can be of three types: descriptive (what happened?), predictive (what will happen?) or prescriptive (what should be done to make what we want happen?). Once the questions are defined (the SafeClouds.eu use cases), the team of data scientists and engineers work together and collaborate with users, covering the full cycle of data science techniques: data management, data processing architecture, deep analytics, data protection, pseudo-anonymization, advanced visualization and user experience. As previously mentioned, every step has its own challenges, as there are no standard data science tools that can be transferred automatically from one field to another. Below, we outline just two challenges: fusion of proprietary confidential data and benchmarking among these competing stakeholders.

  • Smart Data Fusion: Simply erasing the flight-identifier parameters would protect the data but not allow fusion of datasets. Many data require protection and cannot be shared (e.g. FDM data and radar tracks), so fusion needs sophisticated techniques coming from cryptography and enabling coding sensitive data in a non-reversible way.
  • Secure Blind Benchmarking: Benchmarking among stakeholders based on data that cannot be shared also requires the application of specific techniques. This includes secure multiparty computation enabling comparison between confidential data without disclosing the data, not even to a trusted third party.
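
As an illustration of the blind benchmarking idea, the sketch below uses additive secret sharing, one simple building block of secure multiparty computation. It is not the protocol used in SafeClouds.eu: the function names, the modulus and the example figures are assumptions made for illustration, the whole exchange is simulated in a single process, and the parties are assumed to follow the protocol honestly. Each party splits its confidential value into random shares, hands one share to each participant, and only the sums of shares are published, so the total (and hence the benchmark mean) is obtained without any single value being disclosed.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus; all arithmetic is done modulo this value


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split an integer into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: list[int]) -> int:
    """Simulate the protocol: every party shares its value, partial sums are combined."""
    n = len(private_values)
    # Row i holds the shares created by party i; column j is what party j receives.
    share_matrix = [make_shares(v, n) for v in private_values]
    # Each party publishes only the sum of the shares it received.
    partial_sums = [sum(row[j] for row in share_matrix) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical event counts per 10,000 flights for three airlines.
private_rates = [42, 57, 36]
total = secure_sum(private_rates)
benchmark_mean = total / len(private_rates)
```

Each airline can then compare its own figure with `benchmark_mean` without learning the figures of the others. The result is only correct while the true total stays below the modulus.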

These are just some examples of the challenges the SafeClouds.eu team is facing in the field of aviation safety data analysis. The solutions offered by these techniques make them ideal to be applied to other fields such as fuel consumption but, again, the purpose of the analysis will determine the following necessary steps.

 

SafeClouds presentation at the IATA ADS

On November 15-16, 2017, IATA organised the first Aviation Data Symposium in Miami, FL, USA. This event covered different angles of the application of engineering and data analytics to airline safety, operations, passenger distribution, sales, and air freight. These areas were complemented by a technology track, which covered techniques and tools to support data activities in airlines. The safety and operation tracks discussed how big data is helping airlines to optimise operations while maintaining safety, and also presented the main upcoming challenges.

The event also covered a review of the benefits from the various global information sharing and exchange networks, including the Global Aviation Data Management programmes coordinated by IATA. During the Symposium, Mr. Quevedo presented IATA data connect, the database of aviation accidents, IATA FDX, the GDDB and STEADES. ASIAS, the US data exchange programme, was also presented by Mr. Madar, Managing Director of Operation Safety of American Airlines. Then, Mr. Hernández-Coronado, Director of Safety Analysis and QM of the Spanish Aviation and Security Agency (AESA), presented the European programme Data4Safety, which was recently launched by EASA in Europe.

Concerns regarding privacy remain very strong: as the programme representatives explained, privacy protocols are often strict and de-identification can make data challenging to use. Mr. Madar stressed that new techniques and technologies that enable progress on data privacy, together with new tools such as machine learning that make it possible to move from descriptive to predictive analysis, will help the programmes evolve beyond the descriptive analysis done over the last decade, as with ASIAS.

Mr. Hernández-Coronado presented SafeClouds in detail. AESA participates in the SafeClouds project and helps the team understand how different technologies researched in the project can help aviation data exchange programmes overcome some of the presented challenges. These challenges include data fusion and integration, data protection and privacy, and computing infrastructures. SafeClouds also investigates predictive analytic concepts and techniques to help aviation stakeholders make decisions, even during the operations.

Mr. Hernández-Coronado also covered the activities performed by the Spanish Aviation and Security Agency, particularly the Spanish State Safety Programme (SSP). This system receives and collects around 300-400 safety events per week. He also presented the RIMAS system, showing its capability to provide a complete risk assessment picture of the national safety status by combining a variety of data sources, ultimately providing analytical support for AESA so that it can focus its attention on those areas that require supervision.

Blockchain and other data science applications for aviation digitalization

For the 5th consecutive year, Innaxis organized the Data Science in Aviation Workshop with much positive feedback. This 2017 edition took place last September at EASA HQ in Cologne, Germany, sponsored by the SafeClouds.eu project.

This series of annual workshops was created in 2013 to promote data science techniques applied to the aviation field. Initially, this was a breakthrough idea, as data analytics initiatives in the sector were very scarce. At the same time, the potential benefit of applying these techniques to aviation, with relatively limited investment, greatly supported the effort of pushing this paradigm shift. Now, only 5 years later, the number of ongoing data science initiatives in the aviation sector has continuously increased, demonstrating that the effort was really worth it.

Data has become the key driver of change all across aviation: from maintenance to training, from fuel efficiency to safety. There are ongoing examples, with different levels of maturity, in nearly every layer of the aviation sector, ranging from manufacturing to operations and coming from both industry and academia. The last DSIAW brought together this wide variety. Knowledge Discovery and Data Mining (KDD) will be, and already is, a key enabler of the digitalization of our industry.

The entire Horizon2020 transport research programme is driven by the overall objective of making “European transport greener, safer, more efficient and innovative”. These challenges were precisely the 4 pillars of the 2017 DSIAW, showing how data can play a key role in achieving them through the application of data science (DS) techniques. The presentations were distributed among 4 sessions: DS4Environment, DS4Safety, DS4Predictability, and innovative DS techniques and supporting tools. They introduced the audience to the following initiatives:

DS4Environment: While the development of greener technologies (engines, aerostructures, components, etc) requires several coordinated initiatives, data science offers cost-effective solutions based on real figures of fuel burnt and noise pollution. Applying data analytics techniques to these datasets enhances our knowledge of fuel consumption and noise emission patterns, which supports efficient resource use and thus reduces emissions and minimizes environmental impact. For this theme, the workshop featured the Boeing Global Services Fuel Dashboard solution and the Technical University of Madrid initiatives related to environmental and noise emissions studies.

DS4Safety: The aviation sector’s requirement for high safety levels has always been the main reason to avoid ‘radical’ changes in this industry or, at least, follow a very slow adoption path. Nevertheless, aviation safety has recently become a pioneering area in data science applications. We can’t neglect to mention the significant challenges in this line of research, such as data protection, data merging, pattern detection in rare events, secure data infrastructures, etc, but nonetheless there are very promising initiatives such as: the SafeClouds project coordinated by Innaxis, the EASA Data4Safety programme, or the activities from SafetyData in NLP applied to Occurrence Reports. All projects were presented at the workshop.

DS4Predictability: In air transportation, efficiency is closely linked to predictability, and predictability, in turn, is highly dependent on data. Improving predictability reduces uncertainty, which avoids losses and enables a more efficient aviation system, from reducing delays to predicting system failures. Ongoing studies, such as those presented by the University of Westminster or Atos, are good examples of how data can provoke a deep transformation of common airline procedures, like disruption management or maintenance scheduling.

DS techniques and supporting tools: Different KDD application techniques require appropriate infrastructures as well as supporting techniques that ensure various requirements are met. These include data protection, security, computation efficiency, flexibility, scalability, etc. During this last workshop, we learned from the Eurocontrol experience in using cloud-based infrastructures. We also learned about the Innaxis spin-off, TADOREA, which shared knowledge on crypto-economics as a potential solution for enabling secure data analytics while maintaining data privacy.

Still not convinced? Wanting to learn more? Visit the event page to watch the presentations and videos.

FDM Raw Data: Why Binary Data and How to Decode It?

Authors: Lukas Höhndorf & Javensius Sembiring (TU Munich)

SafeClouds.eu gathers 16 partners for research collaboration with a wide and diverse group of users, including air navigation services providers, airlines and safety agencies. SafeClouds.eu encourages active involvement from users, as the project aims to apply data science techniques to improve aviation safety. SafeClouds.eu is unique as it involves data combination and collaboration from ANSPs, airlines and authorities in order to improve our knowledge on safety risks, all while maintaining the confidentiality of the data. This safety analysis requires comprehensive understanding of various data sources, and supports the use case analysis as selected by the users.
The basics of FDM data, one of the main data sources for the project, are outlined in this post.

Onboard Recording

A large amount of data is recorded during civil aircraft flights. Apart from the “Flight Data Recorder” that is mainly used for accident investigations (widely known as “Black Box”), there are also recorders for regular operations. These recorders are often called “Quick Access Recorders” (QAR). QAR data is analysed in terms of safety, efficiency and other aspects in Flight Data Monitoring activities for airlines and is furthermore an integral part of the research project SafeClouds.eu.

Figure 1: Example for a QAR (Source: https://www.safran-electronics-defense.com/aerospace/commercial-aircraft/information-system/aircraft-condition-monitoring-system-acms)

Aircraft are very complex systems with a large number of sensors constantly recording measurements. Important parameters regarding the aircraft state, including position, altitude, speed, engine characteristics and many others are recorded by the QAR. Depending on the aircraft type and airline, the number of recorded parameters can reach several thousand.

As a digital device, the recorder uses a binary format. In other words, if we look at the QAR data we would only see a bit stream, i.e. a sequence of 0s and 1s. In order to use the data and investigate, for example, the aircraft position, two additional components are necessary. First, logic is needed to determine how the data is written into the bit stream. This is given by an ARINC standard, and two versions are presently used: the ARINC 717 standard is used for older aircraft types and ARINC 767 is used for newer aircraft types. Second, a detailed description of the location of any considered parameter in the bit stream is needed. This is given by a “dataframe”, which is a text document of up to several hundred pages.

Figure 2: Overview (Source: “Flight Data Decoding used for Generating En-Route Information based on Binary Quick Access Recorder Data”, Master thesis, Nils Mohr, Technical University of Munich)

File Formats

One of the advantages of data stored in binary format is storage efficiency. The same flight data file stored in binary format might be ten times smaller than when stored in engineering values (e.g. in a CSV file). Considering the research project SafeClouds.eu or shared frameworks for flight data such as ASIAS of the FAA, FDX of IATA or Data4Safety of EASA, which collect data from millions of flights, efficient storage is obviously needed.

However, storing flight data in binary format then requires an efficient way to transfer the binary data into engineering values. Two parts are necessary. First, the bit stream logic (provided by the ARINC standard) needs to be represented in a decoding algorithm. Second, the dataframe information, i.e. which parameter can be found in which part of the bit stream, needs to be accessible to the decoding algorithm.
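
The two parts described above can be sketched as two small functions: one that encodes the bit stream logic, and one that looks up a parameter using the dataframe. This is a simplified illustration under stated assumptions and not a real ARINC decoder: it assumes fixed 12-bit words and ignores synchronisation words, subframes and bit ordering, and the dataframe entries, parameter names and bit positions are invented for the example.

```python
from typing import NamedTuple


class ParameterLocation(NamedTuple):
    """Hypothetical dataframe entry: where the bits of a parameter live."""
    word_index: int  # which word in the frame holds the parameter
    lsb: int         # position of the least significant bit (0 = rightmost)
    n_bits: int      # number of bits used by the parameter


WORD_BITS = 12  # assumed word length for this sketch


def split_words(bit_stream: str, word_bits: int = WORD_BITS) -> list[int]:
    """Bit stream logic: cut a string of '0'/'1' into fixed-width integer words."""
    usable = len(bit_stream) - len(bit_stream) % word_bits
    return [int(bit_stream[i:i + word_bits], 2) for i in range(0, usable, word_bits)]


def extract_raw(words: list[int], loc: ParameterLocation) -> int:
    """Dataframe lookup: mask out the bits that belong to one parameter."""
    mask = (1 << loc.n_bits) - 1
    return (words[loc.word_index] >> loc.lsb) & mask


# Hypothetical dataframe with invented parameter names and positions.
DATAFRAME = {
    "TEMP_RAW": ParameterLocation(word_index=1, lsb=0, n_bits=8),
    "FLAG": ParameterLocation(word_index=2, lsb=11, n_bits=1),
}

stream = "000000000001" + "000001001001" + "100000000000"
words = split_words(stream)
raw_values = {name: extract_raw(words, loc) for name, loc in DATAFRAME.items()}
```

Keeping the dataframe as plain data, separate from the decoding functions, means that the same algorithm could serve different aircraft types by swapping the lookup table.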

Decoding

Recorded parameters have different characteristics. For example, they can be numeric, alphanumeric or characters. Depending on these characteristics, different decoding rules have to be applied. As an example, a temperature recording of 36.5 °C with a linear conversion rule is considered in the following figure.

Figure 3: Simple Decoding Example (Source: “Flight Data Decoding used for Generating En-Route Information based on Binary Quick Access Recorder Data”, Master thesis, Nils Mohr, Technical University of Munich)

Starting from the bit stream, only specific bits are relevant for the temperature recording. As mentioned above, this information can be found in the dataframe. The combination of these bits leads to a number in the binary system, which can then be transferred into the associated decimal value. Applying the conversion rule for linear parameters gives the result 36.5. Information about these rules as well as the unit, in this case degrees Celsius, can be found in the dataframe.
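
The steps just described can be written out in a few lines. The bit pattern, slope and offset below are hypothetical values, chosen only so that the result matches the 36.5 °C of the example; the actual values used in the figure come from the dataframe and are not reproduced here.

```python
def bits_to_decimal(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as an unsigned binary number."""
    return int(bits, 2)


def linear_conversion(raw: int, slope: float, offset: float) -> float:
    """Apply a linear rule: engineering value = slope * raw + offset."""
    return slope * raw + offset


# Hypothetical values: the bits, slope and offset are invented so that the
# result matches the 36.5 degrees Celsius of the example in the text.
temperature_bits = "01001001"  # 73 in decimal
raw_value = bits_to_decimal(temperature_bits)
temperature_c = linear_conversion(raw_value, slope=0.5, offset=0.0)
print(f"{temperature_c} degC")
```

Other parameter types, such as alphanumeric values, would need their own conversion rules in place of `linear_conversion`.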

Summary

The data that is recorded by civilian aircraft in their daily operation contains valuable information that can be used for airline safety analyses. Due to the nature of the recording, the data is generated in binary format. To make the data accessible and readable for the analysts, a decoding algorithm is applied. For the development of this algorithm, information about the recording logic and for all the considered parameters must be available.

Author: Lukas Höhndorf (TU Munich)

SafeClouds at EASA 2016 annual event and European Commission newsletter

SafeClouds.eu, the most advanced project to improve aviation safety through data analysis, was presented at the EASA 2016 Annual Safety Conference, held in Bratislava last November.
Carlos Alvarez, President of Innaxis, participated in the panel “Sharing and processing safety data: a vital step forward for safety?”. Carlos laid out the main goals of the project as well as our priorities for the next month, stressing the importance of an integrated data pipeline, from low-level raw data management to embedded analytics, driven by user operational questions. The integrated approach will be capable of developing data science solutions to provide all-new capabilities for safety improvements to aviation stakeholders.
You can watch the video of the session:
In parallel, SafeClouds.eu was also selected for the INEA/European Commission newsletter. This newsletter highlighted just 6 out of the hundreds of projects recently awarded within the EU H2020 programme. As pointed out in the newsletter, “The (SafeClouds) project will develop a novel data mining approach for aviation safety and design innovative representations of the results in order to effectively transfer the gained to such users as airlines and air navigation service providers”.

SafeClouds kick off meeting

SafeClouds.eu, an H2020 big data for safety project, coordinated by Innaxis, kicked off earlier this month.

SafeClouds is the recently launched H2020 aviation-safety project. It is coordinated by Innaxis, with 15 additional entities (including airlines, ANSPs, EASA, Eurocontrol, various research entities, etc) from 8 different countries.
The aim of SafeClouds is to improve aviation safety by developing state-of-the-art big data and data analysis tools. The consortium will build a coordinated platform to combine and share data among different aviation actors.
