Anomaly Detection with Text Mining

Metadata Updated: May 2, 2019

Many existing complex space systems have large historical maintenance and problem databases that are stored as unstructured text. The problem that we address in this paper is the discovery of recurring anomalies and of relationships between problem reports that may indicate larger systemic problems. We illustrate our techniques on data from discrepancy reports regarding software anomalies in the Space Shuttle. These free-text reports are written by a number of different people, so the emphasis and wording vary considerably.
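The abstract does not say which technique the paper uses. As a rough illustration of what "relationships between problem reports" can mean in practice, the sketch below weights words with TF-IDF and flags report pairs whose cosine similarity exceeds a threshold. The similarity measure, the threshold value, the function names, and the three sample reports are all assumptions made up for this example; none of them come from the dataset or the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so rare terms weigh more.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_pairs(docs, threshold=0.3):
    """Return (i, j, similarity) for report pairs at or above the threshold."""
    vectors = tfidf_vectors(docs)
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical reports, invented for illustration only.
reports = [
    "Timer overflow caused display software to reset during ascent simulation",
    "Display software reset observed after timer overflow in simulation run",
    "Telemetry checksum mismatch found in downlink formatting routine",
]

pairs = related_pairs(reports)
for i, j, score in pairs:
    print(f"reports {i} and {j} look related (similarity {score:.2f})")
```

Because different authors describe the same anomaly in different words, a purely lexical measure like this one will miss pairs that share meaning but not vocabulary; it is meant only to make the problem statement concrete.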

With Mehran Sahami from Stanford University, I'm putting together a book on text mining called "Text Mining: Theory and Applications" to be published by Taylor and Francis.

Access & Use Information

Public: This dataset is intended for public access and use. License: U.S. Government Work

Metadata Created Date August 1, 2018
Metadata Updated Date May 2, 2019
Data Update Frequency irregular

Metadata Source

Harvested from NASA Data.json

Additional Metadata

Resource Type Dataset
Publisher Dashlink
Unique Identifier DASHLINK_4
Maintainer Ashok Srivastava
Public Access Level public
Bureau Code 026:00
Datagov Dedupe Retained 20190501230127
Harvest Object Id 97a00271-343b-4940-84b1-4bf4ec5a637d
Harvest Source Id 39e4ad2a-47ca-4507-8258-852babd0fd99
Harvest Source Title NASA Data.json
Data First Published 2010-09-09
Data Last Modified 2018-07-19
Program Code 026:029
Source Datajson Identifier True
Source Hash eb151c382bec29a4bdebce0e5ffd127cd89fda68
Source Schema Version 1.1
