
Continuuity open sources Tephra to take care of multiple data updates

Continuuity, the self-proclaimed big data PaaS startup, has open sourced Tephra, a transaction engine tailored for Apache HBase, the NoSQL database that runs on top of the Hadoop Distributed File System.

While the startup uses HBase to store data for its Reactor application development platform, it found that HBase was not ideal for applications that need to update data in several parts of the system at once, explained Continuuity CEO Jonathan Gray.

With HBase, a user can only update data safely within a single region, which is a contiguous group of rows in a table; an update that spans multiple regions can leave the data in an inconsistent state if only part of it succeeds, because the database wasn’t designed to coordinate such changes.

To solve that problem, Continuuity created Tephra, which adds transactions to HBase so that multiple regions of the database can be altered together without the risk that bad data might be entered.

Here’s how developers could use Tephra, per a Continuuity blog post:
“Developers typically create secondary indexes on HBase by writing updates to a second table with additional rows that reference the rows in the main table based on the index values. The problem is that there isn’t consistency in operations across the two tables, so they can get out of sync. Based on their actual data access patterns and what their application cares about, developers are forced to adopt more complicated application logic to manage the data and work around the inconsistencies. In contrast, Tephra simplifies this use case by allowing updates to both tables to be performed in a single globally consistent transaction.”
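
To make the secondary-index case concrete, here is a toy, in-memory sketch in Python. It is not Tephra's API: the class names (ToyTransactionManager, ToyTransaction), the table names and the change_email helper are all hypothetical, and the sketch assumes conflicts are detected optimistically at commit time, which is one common way to build this kind of transaction and not necessarily how Tephra does it. It shows a main table and its index table being changed together or not at all.

```python
class ConflictError(Exception):
    """Raised when another transaction committed an overlapping change first."""


_DELETED = object()  # marker for a buffered delete


class ToyTransactionManager:
    """Hypothetical in-memory stand-in for a transaction service."""

    def __init__(self):
        self.tables = {}      # table name -> {row key: value}
        self.commit_log = []  # one set of (table, row) keys per committed transaction

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, manager):
        self.manager = manager
        self.start = len(manager.commit_log)  # commits that existed when we started
        self.writes = {}                      # (table, row) -> value, buffered until commit

    def put(self, table, row, value):
        self.writes[(table, row)] = value

    def delete(self, table, row):
        self.writes[(table, row)] = _DELETED

    def commit(self):
        changed = set(self.writes)
        # Abort if any transaction that committed after we started touched the same rows.
        for committed in self.manager.commit_log[self.start:]:
            overlap = committed & changed
            if overlap:
                raise ConflictError(f"conflict on {sorted(overlap)}")
        # No conflict: apply every buffered change, across all tables, in one step.
        for (table, row), value in self.writes.items():
            rows = self.manager.tables.setdefault(table, {})
            if value is _DELETED:
                rows.pop(row, None)
            else:
                rows[row] = value
        self.manager.commit_log.append(changed)


def change_email(tx, user_id, old_email, new_email):
    """Update the main table and its secondary index in the same transaction."""
    tx.put("users", user_id, {"email": new_email})
    tx.delete("users_by_email", old_email)
    tx.put("users_by_email", new_email, user_id)


manager = ToyTransactionManager()

setup = manager.begin()
setup.put("users", "u1", {"email": "a@example.com"})
setup.put("users_by_email", "a@example.com", "u1")
setup.commit()

# Two clients try to change the same user's email at the same time.
t1 = manager.begin()
t2 = manager.begin()
change_email(t1, "u1", "a@example.com", "b@example.com")
change_email(t2, "u1", "a@example.com", "c@example.com")

t1.commit()
try:
    t2.commit()
    second_committed = True
except ConflictError:
    second_committed = False  # the second writer must retry; nothing of it was applied
```

In this sketch the second transaction is rejected as a whole, so the index table never points at an email address that the main table does not hold. Without a transaction layer, the application would have to detect and repair that kind of mismatch itself.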

Tephra can also integrate with MongoDB, LevelDB and other data stores, including relational databases and data warehouses.

“One of the goals of Tephra was not to just be for HBase,” said Gray. “We have built it with the notion that you can do transactions across multiple systems.”

Tephra transaction life cycle

The startup currently uses Tephra to handle transactions for the stream-processing technology jetStream, developed by Continuuity and AT&T Labs, said Gray.

Continuuity isn’t the only company tweaking HBase for better performance. Facebook last month showed off HydraBase, an updated version of HBase that shortens the lag that occurs whenever a region server fails.
