Ready or not ...

Hortonworks sets its IPO price; CEO says Hadoop is ready to explode

Hadoop software company Hortonworks priced its initial public offering on Thursday at $16. That’s an increase of 23 percent over the midpoint of the company’s initial estimate, which valued its shares at $12 to $14. Hortonworks will start trading Friday morning on the Nasdaq under the ticker HDP.

However, the company caused minor shockwaves in November when it filed paperwork to go public. It’s not so much the IPO that surprised people (Hortonworks and blood rivals Cloudera and MapR have all discussed impending IPOs) but rather the timing. As Hortonworks lays out in its SEC S-1 form, selling Hadoop is still a long, hard and capital-intensive business.

Industry analysts and investors wasted no time poring over the company’s paperwork and finding reasons to be worried about its revenue, profits and customer counts.

Hortonworks CEO Rob Bearden has a different take on the matter. Numbers aside, he sees Hadoop as a market ready to explode and Hortonworks as a company ready to capitalize on it. Here’s what he had to say during an interview with Gigaom last month.

Hadoop is no longer a question

It’s not yet ubiquitous, but Bearden thinks Hadoop has already crossed an important inflection point in that companies are no longer asking themselves whether they’ll deploy Hadoop. The decision to deploy is now a “predetermined assumption,” he said, and the only real question is “how much, how fast.” Companies are willing to make significant commitments to Hadoop environments because they’re now confident it will benefit them in some way.

“The enterprise sees the functional and economic value that Hadoop enables them to achieve,” Bearden explained.

Data on how true this worldview is paints a conflicting picture. On the one hand, some analysts claim many CIOs are (somehow) still very confused about what Hadoop is and where it fits into their data strategies. On the other hand, some surveys show big data adoption and planning are picking up, and Hadoop is usually considered a very big part of any big data strategy.

To the degree Hadoop is synonymous with big data, things are looking good. Source: Gartner
From a Gartner survey asking respondents about their companies’ stage of big data adoption. Source: Gartner

Data lakes and new applications are coming next

However, getting companies to adopt Hadoop at all is just the first step. Bearden thinks the next big inflection point will come as companies move their Hadoop deployments outside of department-level clusters and start spreading them across the whole company. He, like a handful of other big data companies, uses the (sometimes-criticized) term “data lake” to describe the resulting architecture, which ideally is a large Hadoop Distributed File System environment storing data from many different applications across many different departments.

Once companies start consolidating their data into a central location and see what Bearden calls “the art of the possible,” that’s when the next generation of data-driven applications really begins to take hold. He pointed to things such as improving supply chain operations by predicting demand based on customer behavior patterns, or using sensor data to improve logistics for fleet vehicles.

“That’s what’s going to explode in 2015,” he said.

And over the next few years, he added, these new applications will pull Hadoop along into more companies, just as ERP applications helped pull in relational databases decades ago.

The Hortonworks view of YARN on Hadoop. Source: Hortonworks

YARN and the cloud make it possible

It’s no coincidence, Bearden said, that the coming wave of Hadoop applications is arriving alongside YARN, the resource-management framework that’s now part of Apache Hadoop and lets the same cluster run many different types of computing jobs. Users can still run their batch jobs using MapReduce, but also do interactive machine learning on Spark and real-time processing on Storm, for example.

Two or three years ago, companies were evaluating their proof-of-concept projects and deploying small clusters into production, and over the past year they’ve been moving toward the data lake architecture or, as Bearden put it, making Hadoop “the long pole in their tents.” Next year, he predicts they’ll really have a handle on what they can do because of YARN.

“YARN did for Hadoop what the combustion engine did for transportation,” he said.

The other big thing that’s changing how users are considering their Hadoop usage is the cloud, Bearden said. And, just like its competitors, Hortonworks has spent “a ton of dough” and engineering resources on making sure its software runs the same whether it’s installed on Windows or Linux machines, inside a customer data center or in the cloud. When customers get what he calls “pure architectural freedom,” they can begin thinking about whether it makes sense, for example, to query across different physical locations or to optimize for cost and performance by moving certain data onto lower-cost, lower-performance cloud storage.

Hadoop vendors have just made this a real possibility, but, he said, “That is the big exploration that’s happening right now.”

Yay, the cloud, for big data! Source: Gigaom Research