Open is the flavor du jour in enterprise software for big data. But it’s just the beginning.
The challenge of analyzing massive data sets has spurred a hotbed of creativity in open-source communities. While some of these projects are more novel than innovative, hindsight has taught us that open source succeeds in the enterprise when it meets relatively mundane requirements: security, reliability, manageability and cost-efficiency.
Enterprises need a platform to ingest, store and process data that is open and extensible but also robust and high-performing. Committed to developing Apache Hadoop as that platform for the long term, Intel is adding value to these essentials.
• First, we focused on bolstering security with hardware capabilities available today. In the Intel Distribution for Apache Hadoop, we built file-based encryption into HDFS, accelerated up to 20 times with Intel AES-NI. We also launched Project Rhino to offer a common framework for authentication, authorization and auditing across key Hadoop projects.
• Optimizing performance was another focus: Hive queries now run up to 8.5 times faster, and we added adaptive data replication and optimizations for SSDs.
• We’re also making secure, high-performing Hadoop clusters easier to manage. By simplifying deployment and monitoring, automating configuration with Intel® Manager, and enabling Hadoop across multiple data centers, we’re taking the complexity out of Hadoop deployments.
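To make the first point concrete, here is a minimal sketch of AES file encryption using plain JDK crypto, not the distribution's actual HDFS code. The class name, key size and sample plaintext are illustrative. The relevant mechanism is that on CPUs supporting AES-NI, the HotSpot JVM compiles these cipher operations down to the hardware instructions, so the same Java code runs hardware-accelerated transparently:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.SecureRandom;

public class AesDemo {
    public static void main(String[] args) throws Exception {
        // Generate a 128-bit AES key. On AES-NI-capable hardware,
        // HotSpot intrinsics execute the cipher in hardware.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        // A random IV, stored alongside the ciphertext in practice.
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        // Encrypt a sample block of data.
        Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = enc.doFinal("hello hdfs block".getBytes("UTF-8"));

        // Decrypt and verify the round trip.
        Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] plaintext = dec.doFinal(ciphertext);

        System.out.println(new String(plaintext, "UTF-8"));
    }
}
```

Transparent, file-level encryption of this kind is what lets data stay protected at rest without application changes, which is why hardware acceleration matters: the cost of encrypting every block becomes small enough to enable by default.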
While foresight is rarer than hindsight, we predict that this is the right approach for enterprises that are harnessing big data for the long haul.