NASA's Heliophysics Data Environment:
Data and Services for the
Heliophysics System Observatory
Heliophysics (HP) studies the nature and dynamical interactions of the Sun, the heliosphere, and the plasma environments of the planets based on data from a fleet of spacecraft termed the "Heliophysics System Observatory" (HSO). This requires easy access to data and tools from a distributed set of active archives. The Heliophysics Data Environment (HPDE) is the collective set of data and tools resulting from and using the HSO. The HPDE maintains openly accessible data that are independently scientifically usable.
The NASA Heliophysics Science Data Management Policy (pdf version) presents an integrated view of the HPDE. Among other things, the HP Data Policy provides a summary of the components of the HPDE, gives a timeline for the data lifecycle, and provides guidelines for documents such as Project Data Management and Mission Archive Plans. The HPDE is built from peer-reviewed data systems driven by community needs and founded on community-based standards. Consistent with this approach, data providers and data users share responsibility for the quality and proper use of the data for research.
The Heliophysics Data Portal (HDP) provides access to a registry of HP data products that can be searched by time, measurement type, observed region, time resolution, spacecraft or observatory name, any text, and other means, in any order. It is made possible by metadata that follow the uniform "SPASE" terminology. A simple Help page will get you started. Many datasets can be accessed and plotted directly from the HDP, and in all cases the HDP provides the most direct access routes to the data.
NASA has a responsibility to keep data easily available long-term, and for this it cannot depend on mission archives after missions end. NASA HP now expects that missions will deliver data in standard formats (CDF, FITS, NetCDF) to the Space Physics Data Facility (SPDF; space physics and ITM data) or the Solar Data Analysis Center (SDAC; solar data) as the data are produced. Working with the archives improves the data quality, accessibility, and documentation. The archives are also responsible for a number of tools including SolarSoft, CDAWeb, and SSCWeb, as well as the maintenance and upgrading of the CDF data format standard.
There are many other archives that serve data of relevance to HP, both from other agencies and from other nations. Some of the larger ones are listed here. In addition, many missions serve data directly; check the HDP for access routes to specific products.
A general visualization tool for registered solar images from many missions, including some ground-based. Available data products include SDO, SOHO, Hinode, and other images in multiple wavelengths, with easy overplotting and movie-making. Many videos are already archived on YouTube. The European version, JHelioviewer, runs on desktops/laptops and adds such things as potential-field modeled magnetic field lines. Go to Helioviewer or JHelioviewer to try them out. Both applications include links to the Heliophysics Events Knowledgebase (not supported by HPDE but by HP missions) to provide solar events and structures.
Web browser and API access to most solar physics data. Searches can be made by time range, mission, instrument, observables, nicknames (e.g., “H-alpha”) and spectral ranges. VSO is integrated into SolarSoft to provide API access to data via IDL. See an info link and a basic data search link.
A mostly IDL-based collection of routines that provide everything from basic data access and processing of level-zero data to general plotting tools, to advanced analysis tools tailored to the needs of particular solar missions. This is gradually being complemented by routines in the scientific Python ecosystem and developed by the PyHC (see above), but it still serves as the major workhorse for solar data analysis. See the main page and an installation link.
Roughly speaking, SolarSoft for non-solar physics. A set of IDL routines, along with a version that does not require IDL, for loading, plotting, analyzing, and integrating data from many ground and space-based observatories. The latter include all the CDAWeb accessible data, with enhancements for some analysis routines (e.g., visualization of some 3D distributions) and continual expansion and improvement. A Python version of SPEDAS is being developed. A general SPEDAS webpage can be found here, and an overview publication is by Angelopoulos et al., 2018.
A Java application that reads and displays/plots all formats used in Heliophysics plus many others. Autoplot can be used to plot data for a server or as a standalone Webstart application. It can be used to form a data access layer, and can make “png walks” of images of data plots to provide rapid data surveys. A variety of output formats are also possible, and it is compatible with the HAPI access method. Autoplot is a downloadable application.
The Heliophysics Application Programming Interface is a generic API with a full and mature specification that provides uniform access to a wide range of HP data, including time series of scalars, vectors, spectrograms, and more complex matrix fields. It has a number of implementations, with servers including CDAWeb, iSWA at CCMC, LISRD at LASP, and a growing number of other places. Data access using a simple API from clients exists for IDL, Matlab, and Python. It is being built and maintained by an informal (but funded) group at a variety of institutions. Details including a specification of the protocol can be found here.
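As a rough illustration of what "uniform access" means in practice, the sketch below builds a HAPI data request and parses a CSV reply using only the Python standard library. It assumes the CDAWeb HAPI server root shown in the code and uses the HAPI 2.x request parameter names (`id`, `time.min`, `time.max`); HAPI 3.0 renamed these to `dataset`, `start`, and `stop`. The dataset and parameter names are examples only and should be checked against the server's catalog, and the parser assumes every non-time column is numeric, which is not true of all datasets.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed server root; CDAWeb is one of the HAPI servers named in the text.
CDAWEB_HAPI = "https://cdaweb.gsfc.nasa.gov/hapi"


def hapi_data_url(server, dataset, start, stop, parameters=None):
    """Build a HAPI 2.x 'data' request URL for one dataset and time range."""
    query = {"id": dataset, "time.min": start, "time.max": stop}
    if parameters:
        # HAPI takes a comma-separated list of parameter names.
        query["parameters"] = ",".join(parameters)
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{server}/data?{urlencode(query, safe=',:')}"


def parse_hapi_csv(text):
    """Turn a headerless HAPI CSV reply into (time_string, [floats]) rows.

    The first column of every HAPI record is the time stamp; this sketch
    assumes all remaining columns are numeric.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if record:
            rows.append((record[0], [float(v) for v in record[1:]]))
    return rows


if __name__ == "__main__":
    # Example identifiers (assumptions); confirm them in the server catalog.
    print(hapi_data_url(CDAWEB_HAPI, "AC_H0_MFI",
                        "2020-01-01T00:00:00Z", "2020-01-02T00:00:00Z",
                        ["Magnitude", "BGSEc"]))
```

The same two functions work against any HAPI server by changing the server root, which is the point of the common specification. For real work, the existing HAPI clients for IDL, Matlab, and Python are the better choice.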
The "creader" software is a set of routines to take advantage of CDAWeb's Web Service access. One line of IDL code issues a command to the CDAWeb server to bring in data from a particular dataset for a given time range, time resolution, and set of variables, and the variables are renamed to whatever is desired. The binning to a uniform resolution allows multiple datasets to be compared directly, no matter what the original time resolution was. Putting a creader saveset of routines into an accessible place in the IDL path makes the routines callable without a need for compilation. For more details, go to the "how to" page.
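The benefit of binning to a uniform resolution can be seen in a small Python sketch. This is not the creader code or its algorithm (which runs in IDL against the CDAWeb server); it is a minimal mean-per-bin illustration under the assumption that each dataset is a list of (time, value) samples. The function name `bin_average` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def bin_average(samples, start, bin_seconds):
    """Average (time, value) samples into uniform bins beginning at `start`.

    Returns a time-ordered list of (bin_start, mean_value). Samples earlier
    than `start` are ignored, and bins that receive no samples are omitted.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for when, value in samples:
        if when < start:
            continue
        index = int((when - start).total_seconds() // bin_seconds)
        sums[index] += value
        counts[index] += 1
    return [
        (start + timedelta(seconds=index * bin_seconds),
         sums[index] / counts[index])
        for index in sorted(sums)
    ]


if __name__ == "__main__":
    t0 = datetime(2020, 1, 1, tzinfo=timezone.utc)
    # Two made-up series with different native cadences (16 s and 64 s).
    fast = [(t0 + timedelta(seconds=16 * i), float(i)) for i in range(8)]
    slow = [(t0 + timedelta(seconds=64 * i), 10.0 * i) for i in range(2)]
    # After binning, both series share the same 1-minute time grid.
    print(bin_average(fast, t0, 60))
    print(bin_average(slow, t0, 60))
```

Once both series sit on the same grid, they can be compared point by point regardless of their original cadence.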
A web browser interface (CDAWeb) provides access to most of the SPDF active archive of old and new NASA mission products; it provides a variety of browse plots, quality plots, binning, bad point editing, and ASCII and CDF downloads. Direct application data access to subsets (by variable and time) and supersets (across file boundaries) is provided by web services of both “SOAP” and “RESTful” varieties in what is called the Coordinated Data Analysis System (CDAS). IDL and other libraries of routines provide various user capabilities.
The "OMNI" dataset is a collection of datasets at 1-min and hourly resolution that allow users to have uniform access to values of many variables at the nominal bow shock of the Earth, and is thus useful for both solar wind and magnetospheric studies. Many indices (Kp, Dst, AE, etc.) and other quantities are provided, along with tools for plotting, subsetting by various filters, and variable intercomparisons. Many of the data products are also accessible through CDAWeb, thus providing API access via applications such as IDL and Python.
SPDF provides continual upgrades in conjunction with user needs to assure that the standard CDF data format continues to provide efficient and complete access to HP data. This effort includes maintaining and updating the “ISTP” guidelines for the metadata in CDF (and now NetCDF) files.
There is an increasing use of Python for data access and analysis, particularly among younger researchers. The HPDE is unifying these efforts and has initiated funding of some projects. This will be a community-directed effort that builds on the models of other open-source communities. The SunPy effort is part of PyHC, along with a host of other projects.
The SSCWeb browser interface provides spacecraft orbit views in planar projections with options to plot multiple spacecraft orbits and to apply region and other query restrictions. A very large set of spacecraft are included. Also provided are 3-D space and time interactive views using "TIPSOD".
The HPDE is collaborating with the CCMC to develop SPASE-based descriptions of simulation code and its output, with the goals of making CCMC simulation and verification activities easier and of facilitating data-model comparisons. (See: https://ccmc.gsfc.nasa.gov.)
Small levels of funding are being supplied to keep some projects up-to-date and available to the community, such as the CHIANTI database, DSCOVR high-resolution data, and various others as needed and possible.
The SPASE Data Model needs to be continually updated, and new or revised product descriptions are needed frequently; the SPASE group provides these, partly as a contract-funded activity for continuity and community service, and partly as an open consortium that continues to refine and update the model. SPASE descriptions increasingly include Digital Object Identifiers (DOIs) for data reference and “product and parameter keys” to aid the use of APIs. (See: http://spase-group.org, and the publication by Roberts et al. (2018); https://doi.org/10.1029/2018SW002038.)
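To make the role of DOIs and parameter keys more concrete, the following Python sketch reads a SPASE-style XML description and pulls out the fields an API client would need. The XML snippet is a made-up, heavily trimmed example (the resource ID, DOI, and parameter names are hypothetical, and it omits elements a schema-valid record requires); the namespace URI is assumed to match the SPASE schema in use.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for SPASE XML documents; verify against the schema version.
NS = {"sp": "http://www.spase-group.org/data/schema"}

# Illustrative only: not a real registry record and not schema-complete.
SAMPLE = """<Spase xmlns="http://www.spase-group.org/data/schema">
  <NumericalData>
    <ResourceID>spase://Example/NumericalData/Demo/MAG/PT1M</ResourceID>
    <ResourceHeader>
      <ResourceName>Demo magnetometer, 1-min averages</ResourceName>
      <DOI>https://doi.org/10.0000/example</DOI>
    </ResourceHeader>
    <Parameter>
      <Name>Field magnitude</Name>
      <ParameterKey>Bmag</ParameterKey>
    </Parameter>
    <Parameter>
      <Name>Field vector, GSE</Name>
      <ParameterKey>Bgse</ParameterKey>
    </Parameter>
  </NumericalData>
</Spase>"""


def summarize(xml_text):
    """Extract the resource ID, DOI, and parameter keys from a description."""
    root = ET.fromstring(xml_text)
    product = root.find("sp:NumericalData", NS)
    return {
        "resource_id": product.findtext("sp:ResourceID", namespaces=NS),
        "doi": product.findtext("sp:ResourceHeader/sp:DOI", namespaces=NS),
        "parameter_keys": [
            p.findtext("sp:ParameterKey", namespaces=NS)
            for p in product.findall("sp:Parameter", NS)
        ],
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The DOI gives a citable reference for the product, while the parameter keys are the variable names a client would pass to a data service when requesting specific quantities.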
Central to the success of the Data Environment is a uniform set of terminology to describe products and their sources. This allows us to make a registry of data products that is useful for search and discovery of relevant data. A number of groups in the US and elsewhere have worked on this problem. To foster the interoperability of the various partners in the HPDE, NASA HP has sponsored the SPASE (Space Physics Archive Search and Extract) collaborative, consisting of scientists and software designers from a number of US and international institutions, to develop the SPASE Data Model that will allow uniform descriptions of products and services. The current official version of the Data Model is available for use, and suggestions for improvement are always welcome.
Since the SPASE effort grew up after the initiation of most of the current NASA missions, there was no mandate in their contracts or PDMPs for the provision of standardized metadata. Thus the NASA HDMC (see page link above) has formed a SPASE Metadata Working Team (SMWT) to work with missions to make the required XML files for describing data products. Much of this work for current and many past missions has been completed, but if you see any difficulties or need help with new SPASE product descriptions, please contact the SMWT via email to email@example.com or firstname.lastname@example.org; more streamlined access is in the works.
The recognition and documentation of data use in publications has become more formalized, with many journals adopting increasingly specific requirements for citation and referencing of datasets. Most people agree that these changes are beneficial to all concerned, including readers who want to reproduce research and data providers who want an easy way to keep track of and get credit for data use. To help NASA's Heliophysics (HP) division with adapting to this change, the SPASE group is offering a service that makes acquiring a "Digital Object Identifier" easy for data providers. DOIs have long been routinely used for assigning unique identifiers to journal articles or books, but they are now the de facto standard for registering datasets. What will be needed from data providers will be the author, publisher, and publication date. We will provide starting points for this information using current registry information, as suggested here for Wind spacecraft data products. Click here for more details.
Most NASA HP proposals now require a Data Management Plan that states what new datasets will be produced during the work. Such datasets need to be provided to the NASA HP Archives (SPDF, SDAC, and, for simulation output, CCMC). Here is a Research Data Management Plan Template, which includes a discussion of the issues involved, to use as a guide.
Mission planning includes the production of a Project Data Management Plan that describes how products are to be produced as well as what products are going to be provided to achieve the mission goals. This plan should be developed in conjunction with the HP Archives (SPDF or SDAC). See Appendix D of the NASA Heliophysics Science Data Management Policy.
Many of the basic ideas about access to large, distributed data holdings were well understood by those attempting to make a "Space Physics Data System" in the early 1980s. The technology has improved since, but these documents are still useful:
Here are the projects that have been funded by competition in the Heliophysics Data Environment Enhancements (HDEE) offering in ROSES, given by category of award and by the year of the ROSES call that is relevant for the task. Also included are a number of tasks that are not HDMC funded, but are relevant to the HPDE. Here also are the abstracts for all the awards: