When running databases, how do you get the speed you want while keeping the flexibility and cost savings of the cloud? At Structure:Data in New York City on Thursday, Wordnik co-founder Tony Tam talked about how his company was able to move its database from dedicated hardware to the cloud.
Wordnik provides a real-time engine for understanding the meaning of words that you read across various publisher sites. The best example of this might be a recent integration with SmartMoney, which provides relational links to a glossary that the publisher has assembled.
But managing dynamic, real-time understanding and meaning of words is no small task. To handle it, Wordnik has gone through various iterations of its product to reach the speed and capacity that it needed. Wordnik started out with a MySQL database on Amazon EC2, but found that setup wasn’t processing its data quickly enough. So it moved to dedicated hardware and MongoDB, which solved its processing problem: it went from processing about 50 records a second to processing 1,000 records a second with the change, Tam said.
That was great from a performance point of view, but it came at a huge cost. In order to run at scale and to be ready for peaks in usage, Wordnik had a whole lot of excess capacity that was running idle most of the time.
“I’m not going to tell you that EC2 is as fast as raw metal, because it’s not,” Tam said. But what Wordnik lost in pure performance it was able to make up in part by using management tools to quickly bring up and shut down EC2 instances as needed and by splitting processes across a number of clusters. Doing so cut costs by about half compared to owning servers with a bunch of capacity that was going unused most of the time.
To help with that, Wordnik relies on 10gen, which develops MongoDB and provides service and support to clients around it. 10gen founder Dwight Merriman said that the idea behind the database was in part to enable cloud-based instances. But while MongoDB is cloud-friendly, Merriman said a number of clients still run it on dedicated hardware as well.
“Databases are not cloud-friendly. They’re one of the hardest parts of stack to get into the cloud,” Merriman said. “We’re big believers that this tech needs to be able to run anywhere… We don’t want it to be constrained.”