Anomaly Detection with Text Mining

Metadata Updated: November 12, 2020

Many complex space systems have large historical maintenance and problem databases stored as unstructured text. This paper addresses the discovery of recurring anomalies, and of relationships between problem reports, that may indicate larger systemic problems. We illustrate our techniques on discrepancy reports about software anomalies in the Space Shuttle. These free-text reports are written by many different people, so their emphasis and wording vary considerably.
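
The description above does not say which techniques the paper uses. As a rough illustration only, one common baseline for spotting related free-text reports is to weight words by TF-IDF and compare reports by cosine similarity. The sketch below assumes that baseline; the function names, the 0.3 threshold, and the three sample reports are all hypothetical and are not taken from the dataset.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(reports):
    """Build one sparse TF-IDF vector (dict of term -> weight) per report."""
    token_lists = [tokenize(r) for r in reports]
    n = len(token_lists)
    # Document frequency: number of reports that contain each term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF so that terms found in every report keep a small weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_pairs(reports, threshold=0.3):
    """Return (i, j, similarity) for report pairs at or above the threshold."""
    vectors = tfidf_vectors(reports)
    pairs = []
    for i, j in combinations(range(len(reports)), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            pairs.append((i, j, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: -p[2])


# Invented sample reports, for illustration only (not from the dataset).
sample_reports = [
    "Telemetry buffer overflow during ascent phase display update",
    "Display update caused overflow of the telemetry buffer in ascent",
    "Incorrect unit conversion in fuel cell pressure calculation",
]
pairs = related_pairs(sample_reports)
```

Here the first two sample reports describe the same problem in different words and are the only pair that clears the threshold. Real reports would likely need stop-word removal, stemming, and a tuned threshold before such pairs were meaningful.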

With Mehran Sahami from Stanford University, I'm putting together a book on text mining called "Text Mining: Theory and Applications" to be published by Taylor and Francis.

Access & Use Information

Public: This dataset is intended for public access and use.
License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties, it is considered a U.S. Government Work.

Metadata Created Date November 12, 2020
Metadata Updated Date November 12, 2020
Data Update Frequency irregular

Metadata Source

Harvested from NASA Data.json

Additional Metadata

Resource Type Dataset
Publisher Dashlink
Unique Identifier Unknown
Identifier DASHLINK_4
Data First Published 2010-09-09
Data Last Modified 2020-01-29
Public Access Level public
Bureau Code 026:00
Program Code 026:029
Source Datajson Identifier True
Source Hash fe066787a4381b26d55a6154775db6e3cbc4b84f
Source Schema Version 1.1
