Austin, Texas-based startup Ravel has released GoldenOrb, an open-source graph database that looks to bring the benefits of Google’s Pregel project to the masses. Graph databases don’t get the attention of other big-data technologies such as Hadoop or NoSQL, but every Twitter user is familiar with the results of what graph databases can do.
Essentially, graph databases excel at finding relationships between disparate pieces of data, with one major use case being social graphs. They run analyses over terabytes of graph data while maintaining the relationships between the data, even as the data and the relationships constantly evolve.
Twitter actually created its own graph database, called FlockDB, to help the site determine who’s connected to whom in the Twittersphere. Google uses Pregel to power its PageRank feature, although as it explained in a 2009 blog post introducing the technology, there are many other possibilities:
If you squint the right way, you will notice that graphs are everywhere. For example, social networks, popularized by Web 2.0, are graphs that describe relationships among people. Transportation routes create a graph of physical connections among geographical locations. Paths of disease outbreaks form a graph, as do games among soccer teams, computer network topologies, and citations among scientific papers. Perhaps the most pervasive graph is the web itself, where documents are vertices and links are edges. …
A relatively simple analysis of a standard map (a graph!) can provide the shortest route between two cities. But progressively more sophisticated analysis could be applied to richer information such as speed limits, expected traffic jams, roadworks and even weather conditions. In addition to the shortest route, measured as sheer distance, you could learn about the most scenic route, or the most fuel-efficient one, or the one which has the most rest areas. All these options, and more, can all be extracted from the graph and made useful — provided you have the right tools and inputs.
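To make the map example concrete, here is a minimal sketch of that kind of analysis using Dijkstra's algorithm in plain Python. It is not GoldenOrb or Pregel code; the road map, the city names, and the `best_route` helper are all invented for illustration. The point is that one graph can answer different questions depending on which edge attribute is treated as the cost.

```python
import heapq

# Hypothetical road map (illustrative data only): each road carries a
# distance in km and an estimated fuel cost in litres.
ROADS = {
    "A": {"B": {"km": 100, "fuel": 9.0}, "C": {"km": 40, "fuel": 5.0}},
    "B": {"D": {"km": 50, "fuel": 4.0}},
    "C": {"B": {"km": 30, "fuel": 6.0}, "D": {"km": 120, "fuel": 7.0}},
    "D": {},
}


def best_route(graph, start, goal, weight):
    """Return (total_cost, path) for the cheapest route under `weight`,
    or None if `goal` is unreachable. Uses Dijkstra's algorithm, so all
    edge weights must be non-negative."""
    queue = [(0.0, start, [start])]  # (cost so far, current node, path taken)
    settled = set()                  # nodes whose cheapest cost is final
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, attrs in graph[node].items():
            if neighbour not in settled:
                heapq.heappush(
                    queue, (cost + attrs[weight], neighbour, path + [neighbour])
                )
    return None


shortest = best_route(ROADS, "A", "D", weight="km")
thriftiest = best_route(ROADS, "A", "D", weight="fuel")
print(shortest)
print(thriftiest)
```

In this toy data the shortest route by distance and the most fuel-efficient route take different roads, which is the distinction the Google post draws. A system in the Pregel mold would distribute this sort of computation across many machines rather than run it in a single process.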
In spreading the word about GoldenOrb, Ravel expands upon the use cases, citing marketing analysis, pharmaceutical research and, essentially, any situation in which it would be beneficial to “run traditional analytics on entire data sets instead of only small samples … .”
A couple of things make GoldenOrb particularly worth watching: 1) it’s both open source and a product, which distinguishes it from Twitter’s open-source FlockDB project, Google’s proprietary Pregel project and Objectivity’s proprietary InfiniteGraph product; and 2) it’s based on Hadoop. Having an actual product to work on instead of just code could help it attract a large community, especially from the growing ranks of Hadoop developers.
Hadoop and NoSQL databases have both ridden the big data wave to form robust development communities, so why can’t graph databases be next? The GoldenOrb code is available at https://github.com/raveldata/goldenorb.
For more information about Ravel from the horse’s mouth, including plans to create an enterprise version of Apache’s Hadoop-based Mahout machine-learning platform, check out this video interview with Ravel president Zach Richardson:
Image courtesy of Ravel.