Bloom Won’t Micromanage Data So Apps Can Scale

Building webscale or cloud applications is hampered by the difficulty of spreading tasks across thousands of computers without slowing things down or requiring too many people to keep things running. Virtualization and faster storage help, as do new databases (GigaOM Pro sub req’d) and caching techniques, but right now folks are trying to adapt how they program computers to reflect that one has now become many.

Bloom, a programming language created at the University of California, Berkeley by a group led by Joseph Hellerstein, is one such effort. Bloom was profiled this week as one of the top 10 emerging technologies by MIT’s Technology Review, because it could help cloud computing continue to scale. Here’s how, according to Technology Review:

The challenge is that these languages process data in static batches. They can’t process data that is constantly changing, such as readings from a network of sensors. The solution, Hellerstein explains, is to build into the language the notion that data can be dynamic, changing as it’s being processed. This sense of time enables a program to make provisions for data that might be arriving later — or never.

Hellerstein also gave an extensive interview to HPC in the Cloud this week about what Bloom is and the problem it’s trying to solve. From that interview:

To put it simply, what our work is trying to do is start with the data itself and get people to talk about what should happen to the data step-by-step through a program without ever having them specify at all how many machines are involved. So, when you ask a query of a database you describe what data you want—not how to get it.
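
To make the "what, not how" distinction concrete, here is a minimal sketch in plain Python. It is not Bloom code and is not drawn from Hellerstein's work; the sensor data and function names are hypothetical, chosen only to contrast spelling out the steps with simply describing the result you want.

```python
# Hypothetical sensor readings; the data and names are made up for illustration.
readings = [
    {"sensor": "a", "temp": 71},
    {"sensor": "b", "temp": 64},
    {"sensor": "a", "temp": 75},
    {"sensor": "c", "temp": 80},
]


def hot_sensors_imperative(rows, threshold):
    """Spell out *how*: iteration order, bookkeeping, and result assembly."""
    seen = set()
    result = []
    for row in rows:
        if row["temp"] > threshold and row["sensor"] not in seen:
            seen.add(row["sensor"])
            result.append(row["sensor"])
    return sorted(result)


def hot_sensors_declarative(rows, threshold):
    """State *what* is wanted: the set of sensors with a reading above threshold."""
    return sorted({row["sensor"] for row in rows if row["temp"] > threshold})


if __name__ == "__main__":
    print(hot_sensors_imperative(readings, 70))
    print(hot_sensors_declarative(readings, 70))
```

The second version says nothing about iteration order or intermediate state, which is the property that, in principle, leaves a runtime free to decide where and on how many machines the work happens.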

The interview lays out how this programming effort came about (building network protocols) and who might care most about using Bloom (Amazon, Google or anyone with big data needs), but for me the best part of it was how Hellerstein underscored that the ability to harness a heck of a lot of servers and treat them as a single computer is the next big shift in information technology.

We can call it cloud computing, webscale applications or merely bigger data centers, but the key element here is that the hardware has gone social in ways that require many-to-many ways of communication and delivering instructions to the processors — inside the servers, between the servers, and soon, between data centers. The exciting aspect of this shift is that while larger companies like Google, Yahoo and Amazon are innovating, there is plenty of room for startups with a new appliance, server, networking technology or chunk of code to make waves — and hopefully, money.

For more on the effort, please check out the FAQs Hellerstein has posted on his blog.

Image courtesy of Flickr user tibchris

