Development of a Data Fusion Framework to support the Analysis of Aviation Big Data
Pinon, Olivia J.
Mavris, Dimitri N.
The Federal Aviation Administration (FAA) is primarily responsible for the advancement, safety, and regulation of civil aviation, as well as for overseeing the development of the air traffic control system in the United States. As such, it handles tremendous amounts of data on a daily basis. This data, which arrives in high volumes, in various formats, from disparate sources, and at various frequencies, is used by FAA analysts and researchers to make accurate forecasts, improve the safety and performance of operations, and streamline processes. However, by its very nature, aviation Big Data presents a number of challenges to analysts: it impedes their ability to obtain a real-time picture of the state of the system, identify trends and operational patterns, and make real-time predictions. The overarching objective of the present effort is therefore to support the FAA by developing a data fusion framework for the analysis of aviation Big Data. For the purpose of this research, three datasets were considered: System-Wide Information Management (SWIM) Flight Data Publication Service (SFDPS), Traffic Flow Management System (TFMS), and Meteorological Terminal Aviation Routine (METAR). The equivalent of one day of data was retrieved from each dataset, parsed, and fused. A use case then illustrated how FAA analysts and researchers could use such a data fusion framework. The use case focused on predicting the occurrence of weather-related Ground Delay Programs (GDP) at the Newark (EWR), LaGuardia (LGA), and Boston Logan (BOS) International Airports. This involved developing a prediction model using the Decision Tree machine learning technique. Evaluation metrics such as the Matthews Correlation Coefficient were then used to assess the model’s performance.
It is expected that a data fusion framework, once integrated within the FAA’s Computing and Analytics Shared Services Integrated Environment (CASSIE), could be used by analysts and researchers alike to identify trends and patterns and to develop efficient methods for ensuring that U.S. civil and general aviation remains the safest in the world.
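For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how the Matthews Correlation Coefficient can be computed for a binary "GDP / no GDP" classifier. It is not the authors' implementation: the function names and the hourly labels are hypothetical and chosen only for illustration. The metric is useful here because GDP events are rare, so plain accuracy can look high even for a model that never predicts a GDP.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = GDP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical hourly labels for one airport: 1 = GDP in effect, 0 = no GDP.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
mcc = matthews_corrcoef(*counts)
```

In this made-up example, 8 of the 10 hours are classified correctly, yet the coefficient is well below 1 because only one of the two GDP hours is caught and one false alarm is raised.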