
NASA opens earth science data, cloud computing to the public as part of new contest

NASA launched a new contest Tuesday for imagining and then building new uses for the space agency’s trove of earth sciences data. The challenge (actually two of them, split into an imagination stage and a building stage) kicks off on July 1, runs through Nov. 15 and uses the NASA Open Earth Exchange (OpenNEX) platform. The exchange’s datasets and informational material, as well as the computing resources for the challenge, are hosted on Amazon Web Services.

This is not the first time NASA has opened its data to the public as part of a challenge. It had previously partnered with TopCoder, a platform for crowdsourced challenges, on a number of competitions. The OpenNEX challenge is being run in partnership with InnoCentive.

Although agencies like NASA, other research institutions and universities still have some of the smartest people and best tools in their respective fields, there’s a growing acceptance of what’s possible when the public can access certain types of data and the computing resources to analyze them. Competitions on the Kaggle platform, for example, are routinely won by teams or individuals with little subject-matter expertise but lots of general experience in building predictive models.

Likewise, GDELT creator Kalev Leetaru recently partnered with Google to open that massive socio-political database, as well as tools for analyzing it, to the public. He told me at the time that he’s excited to see what happens when data scientists in all fields can experiment on the data with minimal effort on their part. “You’ve got all this pent-up [analytic] expertise out there,” he said. “Go run these big queries. Tell us what’s possible.”

Cloud computing is what makes these types of competitions possible. Researchers in fields such as genomics hope the cloud can transform their spaces, too, and lead to new scientific breakthroughs. The power comes from giving scientists access to data and computing resources in a centralized location, instead of having to send huge datasets over the network and process them locally using, in the best-case scenario, high-performance computers.