ANALYZING AVIATION SAFETY REPORTS: FROM TOPIC MODELING TO SCALABLE MULTI-LABEL CLASSIFICATION

Metadata Updated: May 2, 2019

AMRUDIN AGOVIC*, HANHUAI SHAN, AND ARINDAM BANERJEE

Abstract. The Aviation Safety Reporting System (ASRS) collects voluntarily submitted aviation safety reports from pilots, controllers, and others, which makes it particularly useful for researching aviation safety deficiencies. In this paper we address two challenges in the analysis of ASRS data: (1) the unsupervised extraction of meaningful and interpretable topics from ASRS reports and (2) multi-label classification of ASRS data based on a set of predefined categories. For topic modeling, we investigate how useful Latent Dirichlet Allocation (LDA) is in practice for modeling ASRS reports in terms of interpretable topics. We also use LDA to generate a more compact representation of ASRS reports for use in multi-label classification. For multi-label classification, we propose a novel and highly scalable algorithm based on multivariate regression. Empirical results indicate that our approach outperforms several baseline and state-of-the-art approaches.
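
As a rough illustration of the second task, the sketch below treats multi-label classification as multivariate (multi-output) ridge regression on a compact document representation, then thresholds the regression scores. It is not the algorithm proposed in the paper, whose details are not given in this record. The toy topic proportions, the two categories, the regularization strength lam, the 0.5 threshold, and every function name are assumptions made for illustration; the rows of X_train merely stand in for the kind of topic proportions LDA might produce.

```python
# Illustrative sketch only: multi-label classification via multivariate ridge
# regression. NOT the paper's algorithm; all names and data are hypothetical.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ w = b for w using Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_ridge(x, y, lam=0.1):
    """Return W minimizing ||XW - Y||^2 + lam * ||W||^2, one column per label."""
    xt = transpose(x)
    gram = matmul(xt, x)
    for i in range(len(gram)):
        gram[i][i] += lam  # ridge penalty on the diagonal
    return solve(gram, matmul(xt, y))

def predict_labels(x, w, threshold=0.5):
    """Assign every label whose regression score reaches the threshold."""
    return [[1 if s >= threshold else 0 for s in row] for row in matmul(x, w)]

# Hypothetical topic proportions for five reports over three topics.
X_train = [
    [0.9, 0.1, 0.0],
    [0.8, 0.0, 0.2],
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.2],
    [0.5, 0.5, 0.0],
]
# Two hypothetical categories; a report may belong to both.
Y_train = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
    [1, 1],
]

W = fit_ridge(X_train, Y_train)
predictions = predict_labels([[0.85, 0.05, 0.10], [0.05, 0.85, 0.10]], W)
print(predictions)
```

All labels share one Gram matrix, so a single linear solve fits every category at once. That is one reason regression-style formulations can scale to many labels, although the paper's own method may achieve scalability differently.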

Access & Use Information

Public: This dataset is intended for public access and use. License: U.S. Government Work

Dates

Metadata Created Date August 1, 2018
Metadata Updated Date May 2, 2019
Data Update Frequency irregular

Metadata Source

Harvested from NASA Data.json

Additional Metadata

Resource Type Dataset
Publisher Dashlink
Unique Identifier DASHLINK_229
Maintainer Elizabeth Foughty
Public Access Level public
Bureau Code 026:00
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://data.nasa.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Datagov Dedupe Retained 20190501230127
Harvest Object Id aa98ccc9-78e2-4ffd-93df-be2248e655d5
Harvest Source Id 39e4ad2a-47ca-4507-8258-852babd0fd99
Harvest Source Title NASA Data.json
Data First Published 2010-10-13
Homepage URL https://c3.nasa.gov/dashlink/resources/229/
License http://www.usa.gov/publicdomain/label/1.0/
Data Last Modified 2018-07-19
Program Code 026:029
Source Datajson Identifier True
Source Hash 7cff0852615a932fc5bc4668cb0d50770112db3c
Source Schema Version 1.1
