HySDS - Powering NASA's Earth Science Data Processing at Scale
HySDS (Hybrid Cloud Science Data Processing System) is revolutionizing how NASA handles massive-scale Earth science data processing. Able to process over 300 TB of data and support 3 million processing jobs per day, it has become the backbone of NASA's most critical Earth Science missions.
The Birth of a Solution
What started in 2008 as a service-based science data processing initiative under ACCESS has evolved into one of NASA's most critical data processing frameworks. The motivation was clear: Earth science missions were facing a staggering 100x increase in daily data volumes, and traditional on-premises compute solutions could no longer keep up.
Breaking Records and Setting Standards
HySDS has achieved several groundbreaking milestones:
- First NASA Science Data System (SDS) to scale to over 8,000 parallel compute nodes in the cloud
- First to successfully utilize AWS spot market for cost-effective data processing
- Pioneer in running Earth Observation Science Data Systems operations in the cloud
- First to span operations across multiple cloud providers (AWS, GCP, Azure) and NASA's HECC
Real-World Impact
The numbers speak for themselves. HySDS:
- Processes over 300 TB of data per day for NISAR
- Handled 2 PB of data processing for SWOT in its first year
- Supports 3 million processing jobs per day
- Is used by 13 active NASA projects in 2024
- Has been integrated into 33 NASA-funded projects to date
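To put the daily figures in perspective, they can be converted into sustained per-second rates. The short Python sketch below is only back-of-the-envelope arithmetic on the numbers quoted above; it assumes decimal units (1 TB = 10^12 bytes) and a perfectly even load across a 24-hour day, neither of which is stated in the source. The function names are illustrative and are not part of HySDS.

```python
# Back-of-the-envelope conversion of the daily figures quoted above.
# Assumptions (not from the source): decimal units and an evenly spread load.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def tb_per_day_to_gb_per_s(tb_per_day: float) -> float:
    """Convert terabytes per day into gigabytes per second (decimal units)."""
    gb_per_day = tb_per_day * 1_000  # 1 TB = 1,000 GB
    return gb_per_day / SECONDS_PER_DAY


def jobs_per_second(jobs_per_day: float) -> float:
    """Convert a daily job count into an average jobs-per-second rate."""
    return jobs_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"300 TB/day      ~ {tb_per_day_to_gb_per_s(300):.2f} GB/s sustained")
    print(f"3 million jobs/day ~ {jobs_per_second(3_000_000):.1f} jobs/s on average")
```

Under these assumptions, 300 TB per day works out to roughly 3.5 GB of data every second, and 3 million jobs per day to roughly 35 jobs per second. Real workloads are bursty, so peak rates would be higher than these averages.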
Community-Driven Innovation
What makes HySDS truly special is its community approach. With over 50 developers and 30+ contributors, the system has fostered a collaborative environment where improvements made by one project benefit the entire ecosystem. The community coordinates through:
- Bi-weekly multi-mission meetings
- Active public GitHub repositories
- Community Slack channels
- Shared documentation and operational procedures
Supporting Critical NASA Missions
HySDS has become the backbone for several flagship NASA Earth Science missions:
- NISAR (NASA-ISRO Synthetic Aperture Radar)
- SWOT (Surface Water and Ocean Topography)
- SMAP (Soil Moisture Active Passive)
- SNWG OPERA
- OCO-2 and OCO-3 reprocessing
Looking to the Future
HySDS continues to evolve. The system's open-source nature and community-driven development model help keep it at the forefront of Earth science data processing technology. With upcoming missions promising even larger data volumes, HySDS's hybrid approach and scalable architecture position it well for future challenges.
Getting Involved
For those interested in joining the HySDS community, the project maintains an active presence on GitHub and welcomes contributions from developers, scientists, and enthusiasts alike. The codebase is available at https://github.com/hysds/, and the community wiki provides extensive documentation and resources for newcomers.
Copyright 2024, by the California Institute of Technology. ALL RIGHTS RESERVED. United States Government Sponsorship acknowledged.