It looks like Oracle has some competition when it comes to selling big iron for big data. On Wednesday, Cray, the Seattle-based company best known for building some of the world’s fastest supercomputers, said it’s getting into the big data game. A new division within Cray, called YarcData, will leverage Cray’s experience working within data-intensive environments for customers such as Boeing in order to woo large enterprises with big data needs.
Cray was short on details in a press release announcing the new division, but new YarcData SVP and GM Arvind Parthasarathi, formerly of Informatica, is quoted as saying, “YarcData is the nexus of the world’s most advanced technologies from Cray being applied to solve the world’s most challenging Big Data problems.” The natural inference is that Cray will design parallel-processing systems that are capable of incredible data throughput (something already required in the supercomputing space, where incredible processing capacity would be wasted without a steady data stream) but that also support today’s popular big data tools, such as Hadoop, analytic databases and predictive analytics software.
This type of system could be very valuable for organizations such as banks and intelligence agencies that want to run big data workloads as fast as possible, or even process streaming data in real time, and that have the deep pockets to pay for Cray’s presumably pricey systems. Although the big data framework Hadoop gained popularity in part because it’s designed to run on commodity hardware, there’s always a place for high-end hardware when milliseconds really do matter, and there’s something to be said for pre-configured systems that take the guesswork out of building a big data environment, as I explained recently in a piece for GigaOM Pro (sub req’d).
Cray isn’t alone in pushing this high-performance, enterprise-focused big data vision, though. Oracle made a splash in October when it announced a Big Data Appliance that marries Hadoop, R, NoSQL and other technologies to the high-end hardware Oracle obtained when it bought Sun Microsystems. IBM has an extensive big data software portfolio, too, complemented by a systems business that includes supercomputers. And although it doesn’t have an HPC pedigree like the others, Teradata has years of experience building systems optimized for analytics.
Cray won’t likely become a household name in the big data world, and its notoriously secretive customers might never divulge what they’re using its analytics products for, but there certainly is a market — however small — for super-big, super-fast and super-expensive data.