Not originally created for the enterprise environment, Hadoop was built for internet data center environments like Google, Yahoo, Facebook, and Twitter. These are different IT environments, structured, supported, and managed in different ways from enterprise IT. As a result, Hadoop currently lacks many of the functions and internal processes that enterprise IT needs in terms of security, availability, data integrity, and data governance.
There is no question that there are enterprise-level industry segments where Hadoop has taken hold and is flourishing, such as financial services, health care, pharmaceuticals, and energy. Most deployments are in departments with centralized IT becoming involved from the standpoint of providing and integrating infrastructure (servers with embedded storage, networking gear, etc.). In addition, these grassroots Hadoop projects are still mainly at a secondary level and are not yet considered to be critical, production-level IT.
Hadoop must mature further in order to be regarded as a viable enterprise platform capable of supporting critical business functions running real-time applications. As Hadoop matures so will its critical nature within the organizations that are now learning its ins and outs. Enterprise IT will become more directly involved with managing and supporting Hadoop — a process that is by no means a given. In essence, Hadoop has to follow the rules of centralized IT, and therefore the platform will be subject to production data center security levels, management processes, data protection and data integrity guarantees, data governance policies, and, above all, service level agreements (SLAs).
This report will:
- Place Hadoop in the context of enterprise IT and help those managing Hadoop platforms make it responsive to the enterprise’s data governance policies and processes
- Outline these policies using the industry segments and data sources mentioned above
- Describe ways in which Hadoop can be made responsive to the enterprise’s IT infrastructure, security, auditing, and compliance stakeholders
- Make the point that by addressing these concerns, Hadoop can advance to full production status, including support for real-time applications
Thumbnail image courtesy of Thinkstock