More Hadoop Stories

server farm

Hadoop is a very valuable tool, but it’s far from perfect. While Apache, Cloudera, EMC, MapR and Yahoo focus on core architectural issues, there is a group of vendors trying to make Hadoop a more-fulfilling experience by focusing on business-level concerns such as applications and utilization. Read more »

Yahoo will be spinning off a separate company focused on the development and commercialization of Apache Hadoop, called HortonWorks. The official announcement likely will come tomorrow or Wednesday to coincide with Yahoo’s annual Hadoop Summit, but rumors have been circulating for months. Read more »

Being able to crunch terabytes of data is great, but having someone else do it for you is even better. HPCC Systems, which launched last week to challenge Hadoop’s big data dominance, is planning to do just that with a cloud service for big data processing. Read more »

Subscriber Content

Cloud computing has grown from a pie-in-the-sky vision to a major IT movement over the past few years. As its promise has grown, though, so too has its scope. This report covers six key sectors in cloud computing: commodity Infrastructure-as-a-Service (IaaS), enterprise IaaS, Platform-as-a-Service (PaaS), Software-as-a-Service (SaaS), cloud storage and private clouds. We highlight the current state of each and provide informed insights into where they — and cloud computing in general — are headed. Much like any market in a still-evolving state, the infrastructure of the cloud-computing transition is still being built by startups, practitioners and even a big-name company or two. Companies mentioned in this report include VMware, Amazon, Nasuni, Terremark and Heroku. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

SeaMicro's SM10000-64 server.

Online dating service eHarmony is using SeaMicro’s specialized Intel Atom-powered servers as the foundation of its Hadoop infrastructure, demonstrating that big data applications such as Hadoop might be a killer app for low-powered micro servers. Read more »

Is Hadoop our only hope for solving big data challenges? From scalability to fault tolerance, Hadoop does myriad things very well. Yet, Hadoop is not the solution to all big data problems and use cases. Several key issues remain, including investment, complexity and batch-only processing. Read more »

chain

Big data startup MapR is now an official corporate contributor to the Apache Hadoop project, a somewhat interesting turn of affairs given its corporate mission to lure users away from Apache’s Hadoop Distributed File System. However, other companies commercializing Hadoop should follow its lead. Read more »

cash

The global economy continues to face uncertainty, but despite this, many technology companies have cash on hand and are opting to spend it on mergers and acquisitions. Here we examine some likely strategies from five different companies: IBM, Oracle, HP, Cisco and Hewlett-Packard. Read more »

San Jose, Calif.-based storage startup MapR, which provides a high-performance alternative for the Hadoop Distributed File System, will serve as the storage component for EMC’s forthcoming Greenplum HD Enterprise Edition Hadoop distribution. Cloudera announced an HDFS partnership of its own with compression expert RainStor. Read more »

private property

As a recent McKinsey Global Institute report on big data points out, finding the appropriate balance between consumer privacy and business innovation will play a key role in ensuring that big data and the overall web advance at the pace required by both business and consumers. Read more »

EMC is throwing its weight behind Hadoop. Today, at EMC World, the storage giant announced a slew of Hadoop-centric products, including a specialized appliance for Hadoop-based big data analytics and two separate Hadoop distributions. EMC’s entry is going to shake up the Hadoop market. Read more »

hadoop logo

The recent excitement around Hadoop has culminated in five new Hadoop products today from EMC, NetApp, Mellanox, SnapLogic and DataStax. What’s interesting now is that we’re seeing large technology vendors with hardware expertise pushing gear optimized for Hadoop. Read more »

opera stack

Don’t feel bad if you haven’t heard of Opera Solutions. However, the analytics-as-a-service provider has been quietly building up its $100 million company since 2004 and, with big data on the tip of the IT world’s collective tongue, Opera is ready to start spreading the word. Read more »

donations

Data-integration specialist Syncsort is releasing two new Hadoop tools that it says will give Hadoop users a better, faster experience than they can achieve using Apache Hadoop alone. Unlike some other recent announcements, however, Syncsort is looking to improve Hadoop rather than replace aspects of it. Read more »

classroom

IBM today announced a new product dedicated to helping customers perform sentiment analysis of social media data, as well as a new program with the Yale School of Management’s Center for Customer Insight to train students in advanced data analysis skills. Read more »

fighting elephants

If Yahoo plans to spin off its white-hot Hadoop business, it would make Yahoo the third vendor operating alongside Cloudera and IBM — fighting for what, right now, are only speculative customer dollars. Would Yahoo’s spinout have what it takes to compete? Read more »

talented elephant

Hadoop is the talk of the town when it comes to big data, but it’s not without faults that have some users begging for an alternative. Like many open source projects, it’s relatively unpolished and often requires a great deal of learning and much strenuous customization […] Read more »

The most interesting part about yesterday’s announcement that Groupon is using the Cloudera Distribution of Hadoop wasn’t the actual use but, rather, the insight that Groupon is “building a world-class infrastructure” of which Hadoop will be a key part. But recruiting big-data-savvy talent is getting rather pricey. Read more »

Subscriber Content

Two markets stand out above all else when looking at the first quarter of 2011: infrastructure as a service (IaaS) — the epitome of cloud computing — and big data. Amazon Web Services continues to lead the IaaS space in terms of customers and innovation, while Rackspace, buoyed by momentum around OpenStack, will be its primary competitor for mainstream customers. In the big data space, there are so many players and terms floating about it’s difficult for outsiders to get a handle on who’s who and what’s what, though such activity validates the technologies. Other developments this quarter included HP’s impending presence in the cloud computing and big data spaces and the realization that Intel won’t be left to die if low-power servers based on x86 processors catch on like the buzz late last year suggests they will. Additional companies mentioned in this report include VMware, Microsoft, Cloudera, SeaMicro and Facebook. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

bonfire

Cloudera released version 3.0 of its distribution of Apache Hadoop (CDH3) Tuesday. CDH3 is a big reason why, despite a recent spate of Hadoop-based big data products either on the market or about to be there, Cloudera says it isn’t sweating all the new competition. Read more »

checklist

Despite an industry-wide push for better and more-complete big data strategies, it’s beginning to look like EMC and IBM will be the two technology vendors earning the most data-related dollars once the dust settles because they’ve embraced the new big data bundle while others have not. Read more »

Startup medical search company Apixio is trying to save lives by bringing machine-learning and natural-language-processing techniques to medical records, giving doctors a patient’s entire relevant medical history via a simple cloud-based search engine. The goal is to make information-sharing among medical providers far more intelligent. Read more »

numbers

A handful of new releases and partnerships this week — as well as a big award — illustrate just how versatile the data-processing tool Hadoop is and how widespread its use might become. Hadoop is becoming a more viable tool for everyone from business users to journalists. Read more »

speed

Hardware rarely comes up in discussions about big data, save for those centered on data warehouse appliances. But the omission hardly means hardware is irrelevant. In fact, big gear might become a big deal as companies look to bolster the performance of their big data systems. Read more »

Subscriber Content

bronze elephant

Hadoop has been used by large web companies for applications such as search engines, but the reality is that the project is so much more. This report takes a closer look, examining what Hadoop is (and isn’t), who’s doing what to productize it and why we can expect to see the market pick up serious steam in 2011. We profile the growing number of companies — from startups like MapR to Cloudera, the arguable leader in the space — using Hadoop, the challenges still hindering widespread adoption and where potential users can expect the market to go as we move through 2011 and beyond. Companies mentioned in this report include Yahoo, Facebook, EMC, Teradata and Appistry. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

wall street

High-performance computing leader Platform Computing hopes to capitalize on the big data movement by spreading its wings beyond its flagship business of managing clusters and grids and into managing MapReduce environments, too. Platform has a solid foundation among leading businesses, especially in the financial services industry. Read more »

tunnel vision

One of the statements that struck me most from Structure: Big Data was CA CTO Donald Ferguson’s notion that big data represents a “very promising” opportunity for startups, particularly those targeting specific use cases. I think he’s right, particularly with regard to the latter part. Read more »

ravel

Ravel wants to provide a supported open source version of Google’s Pregel software called Golden Orb to handle large-scale graph analytics. Ravel COO Zach Richardson told me in the following video interview that the startup would release the Golden Orb code on March 31st. Read more »

fighting elephants

It turns out that “big data” isn’t just a buzzword, but a legitimate concern for companies across the board. Their interest in the tools to take advantage of the opportunity for data analysis has sparked a land grab among software vendors centered around Hadoop. Read more »

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther

MapR, a stealth-mode startup with about 30 employees, is developing a version of Hadoop and plans to compete with the likes of Cloudera. The company is likely to launch later this year and has been funded by Lightspeed Venture Partners and NEA. Read more »

Knome, Metamarkets, ITA Software, OmniTI, Karmasphere at Structure Big Data 2011

As organizations strive to analyze more data than ever and to do it faster than ever, the results they’re getting might actually be worse than those in the pre-big-data and real-time world — at least temporarily. Read more »

Braxton Woodham, Tap11, at Structure Big Data 2011

When it comes to social data, one of the biggest firehoses around is the one that comes from Twitter. Trying to make sense of 140 million tweets a day in something close to real-time is a significant challenge, says Tap11 chief technology officer Braxton Woodham. Read more »

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther

During an afternoon panel entitled “The Many Faces of MapReduce — Hadoop and Beyond,” moderator Gary Orenstein compared the two primary Hadoop components — MapReduce and the Hadoop Distributed File System — to the meat and bread of a sandwich. Read more »

NoSQL startup DataStax officially entered the pantheon of Hadoop providers today, introducing its own distribution called “Brisk.” Brisk utilizes the open source NoSQL database Cassandra as a replacement for Apache’s Hadoop Distributed File System, as well as Cassandra’s built-in MapReduce engine and Hive. Read more »

hadoop logo

A Yale computer science project has turned into a company giving Hadoop the ability to perform analytics on both structured and unstructured data. Hadapt launched today with an undisclosed amount of funding and the goal of making Hadoop more broadly applicable for analytics. Read more »

Subscriber Content

datacenter

Business and IT leaders now face significant opportunities and challenges with big data, that is, data sets so large that they are difficult to store, manage and analyze. This report explores the rapidly evolving big data business and technology ecosystem. It examines big data in the context of several different industries: financial services, health care, sports, travel and media. We explore the different big data technologies (from Hadoop and NoSQL derivatives to cloud-based collaboration tools) and their various benefits for enterprises. And we examine some of the existing challenges big data poses, and what enterprise IT leaders can do to overcome them. Companies mentioned in this report include Amazon Web Services, Google, Teradata, IBM and Cloudera. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

cat-video

Using Hadoop to process data for targeted web advertising efforts is nothing new, but this week, two companies in the video advertising space also stepped forward to highlight how Hadoop is helping them deliver the right ads to the right viewers for their clients. Read more »
