Scalable, Asynchronous, Distributed Eigen-Monitoring of Astronomy Data Streams

Metadata Updated: February 28, 2019

In this paper, we develop a distributed algorithm for monitoring the principal components (PCs) for the next generation of petascale astronomy data pipelines, such as the Large Synoptic Survey Telescope (LSST). This telescope will take repeated images of the night sky every 20 s, thereby generating 30 terabytes of calibrated imagery every night that will need to be co-analyzed with other astronomical data stored at different locations around the world. Event detection, classification, and isolation in such data sets may provide useful insights into unique astronomical phenomena displaying astrophysically significant variations: quasars, supernovae, variable stars, and potentially hazardous asteroids. However, performing such data mining tasks is challenging for such high-throughput distributed data streams. We propose a highly scalable, distributed, asynchronous algorithm for monitoring the PCs of such dynamic data streams and discuss a prototype web-based system, PADMINI (Peer-to-Peer Astronomy Data Mining), which implements this algorithm for use by astronomers. We demonstrate the algorithm on a large set of distributed astronomical data to accomplish well-known astronomy tasks such as measuring variations in the fundamental plane of galaxy parameters. The proposed algorithm is provably correct (i.e., it converges to the correct PCs without centralizing any data) and can seamlessly handle changes to the data or the network. Experiments performed on real Sloan Digital Sky Survey (SDSS) catalogue data show the effectiveness of the algorithm.
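To make the idea of monitoring PCs over data held at separate sites more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each node shares only small summary statistics (row count, column sums, and sum of outer products) rather than raw rows, merges them, and checks whether a cached principal component still fits after new data arrives. All function names, the sample data, and the tolerance are hypothetical; the asynchronous peer-to-peer protocol described in the abstract is not reproduced here.

```python
import math

def local_summary(rows):
    """Per-node statistics: row count, column sums, sum of outer products."""
    d = len(rows[0])
    total = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            total[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), total, outer

def merge(a, b):
    """Combine two summaries; raw rows are never exchanged."""
    total = [p + q for p, q in zip(a[1], b[1])]
    outer = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return a[0] + b[0], total, outer

def covariance(summary):
    """Population covariance matrix recovered from a merged summary."""
    n, total, outer = summary
    mean = [t / n for t in total]
    d = len(mean)
    return [[outer[i][j] / n - mean[i] * mean[j] for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_component(cov, iters=500):
    """Leading eigenpair by power iteration (assumes the start vector
    is not orthogonal to the leading eigenvector)."""
    v = [1.0 / (i + 1) for i in range(len(cov))]
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return eigenvalue, v

def residual(cov, eigenvalue, v):
    """Norm of (cov @ v - eigenvalue * v); near zero while the PC is valid."""
    w = matvec(cov, v)
    return math.sqrt(sum((a - eigenvalue * b) ** 2 for a, b in zip(w, v)))

# Two hypothetical nodes, each holding rows that stay local.
node_a = [[1.0, 1.0], [2.0, 2.0]]
node_b = [[3.0, 3.0], [4.0, 4.0]]
merged = merge(local_summary(node_a), local_summary(node_b))
eigenvalue, pc = top_component(covariance(merged))

# New data arrives at node_b; test whether the cached PC still holds.
node_b.append([10.0, 0.0])
updated = merge(local_summary(node_a), local_summary(node_b))
drift = residual(covariance(updated), eigenvalue, pc)
needs_recompute = drift > 0.1  # hypothetical tolerance
```

In this sketch the merged summary equals what a single site would compute from all rows, so the recovered PCs match the centralized ones. The residual check is one simple way to decide when a recomputation is needed; the paper's own monitoring criterion may differ.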

Access & Use Information

Public: This dataset is intended for public access and use. License: U.S. Government Work

Dates

Metadata Created Date August 1, 2018
Metadata Updated Date February 28, 2019
Data Update Frequency irregular

Metadata Source

Harvested from NASA Data.json

Additional Metadata

Resource Type Dataset
Publisher Dashlink
Unique Identifier DASHLINK_366
Maintainer Kanishka Bhaduri
Public Access Level public
Bureau Code 026:00
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://data.nasa.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id d00302b3-1f64-4a40-b7ee-5d51ef32bfec
Harvest Source Id 39e4ad2a-47ca-4507-8258-852babd0fd99
Harvest Source Title NASA Data.json
Data First Published 2011-05-05
Homepage URL https://c3.nasa.gov/dashlink/resources/366/
License http://www.usa.gov/publicdomain/label/1.0/
Data Last Modified 2018-07-19
Program Code 026:029
Source Datajson Identifier True
Source Hash 562f7ac05a7fe643d954aa10f15a3c49c0daf364
Source Schema Version 1.1
