Terracotta is bringing real-time analytics to the masses (of Java users, at least) by letting Ehcache users query data stored in the product’s in-memory cache. With Terracotta’s new Ehcache Search product, customers can perform simple queries in real time against terabytes of data stored in their transactional caches, without having to install any new servers or purchase new appliances. The approach won’t replace the data warehouse, but it could have a significant effect on the future of analytics software development.
The open-source Ehcache is to Java what memcached is to dynamic web languages, in that it lets developers store certain data in-memory to avoid the inherent latency of interacting with the database every time an application needs to serve a piece of data. This setup is great for transactional workloads, but, generally, any analysis still requires a trip to the database. For queries that could benefit from real-time results, this latency can become troublesome, especially if the database is being bombarded, and slowed down, by large numbers of requests. Enter Ehcache Search.
According to Terracotta CEO Amit Pandey, one early customer that manages logistics for a fast-food chain was able to reduce latency from nearly a minute to the sub-second range. Desperate for better performance, the company was considering Oracle’s high-performance Exadata Database Machine, but didn’t need all the complexity, and didn’t really want to pay the high price or deal with the 12-month installation process, either. Because Ehcache is a software-only product, it took only a month to install it, load the desired data into the memory of the existing application servers and start using the product.
But, as even Pandey acknowledges, databases and data warehouses aren’t going away. They’re still necessary for complex queries, especially against huge volumes of data that simply cannot be stored in-memory. A Terracotta sales rep might be quick to point out, though, that the line is blurring. When Ehcache Search is used in combination with Terracotta’s BigMemory product, users can store up to a terabyte of data in-memory (officially, although Pandey says users are storing up to 4 TB), and the company is planning to enrich the analytics capabilities within the next 18 months. Presently, Ehcache Search is available in both open-source and enterprise editions, and BigMemory is available solely as an enterprise edition.
This blend of transactional and analytical environments doesn’t start with Terracotta, however, and it likely won’t end with it. Already, SAP is selling its High-Performance Analytics Appliance (HANA), which relies on in-memory processing to let customers “instantly explore and analyze all of their transactional and analytical data,” and I have to think other vendors with their hands in both pots (e.g., Oracle and IBM) will roll out their own offerings, as well. Pandey thinks they might even roll out lightweight versions in the same vein as the open-source Ehcache Search, but said that will require strong customer demand. Considering those companies’ reliance on Java, and that Ehcache has a footprint in hundreds of thousands of Java applications, Terracotta might be the company that makes Oracle, IBM and SAP customers see the light.
If that happens, it could represent a real shift in the advanced analytics market, similar to the freemium trend we’re currently seeing in the SaaS space. Presently, vendors such as Terracotta, EMC (via Greenplum), and Jaspersoft and Pentaho are all approaching free, open-source analytics from different perspectives (real-time analytics, analytics database and BI, respectively), but getting huge software vendors on board with giving away advanced features to some degree would be a landscape shift that couldn’t have been imagined just a few years ago.