More Hadoop Stories

Is Hadoop our only hope for solving big data challenges? From scalability to fault tolerance, Hadoop does myriad things very well. Yet, Hadoop is not the solution to all big data problems and use cases. Several key issues remain, including investment, complexity and batch-only processing. Read more »

Big data startup MapR is now an official corporate contributor to the Apache Hadoop project, a somewhat interesting turn of events given its corporate mission to lure users away from Apache’s Hadoop Distributed File System. However, other companies commercializing Hadoop should follow its lead. Read more »

The global economy continues to face uncertainty, but despite this, many technology companies have cash on hand and are opting to spend it on mergers and acquisitions. Here we examine some likely strategies from five different companies, including IBM, Oracle, HP and Cisco. Read more »

San Jose, Calif.-based storage startup MapR, which provides a high-performance alternative to the Hadoop Distributed File System, will serve as the storage component for EMC’s forthcoming Greenplum HD Enterprise Edition Hadoop distribution. Cloudera announced an HDFS partnership of its own with compression expert RainStor. Read more »

As a recent McKinsey Global Institute report on big data points out, finding the appropriate balance between consumer privacy and business innovation will play a key role in ensuring that big data and the overall web advance at the pace required by both business and consumers. Read more »

EMC is throwing its weight behind Hadoop. Today, at EMC World, the storage giant announced a slew of Hadoop-centric products, including a specialized appliance for Hadoop-based big data analytics and two separate Hadoop distributions. EMC’s entry is going to shake up the Hadoop market. Read more »

The recent excitement around Hadoop has culminated in five new Hadoop products today from EMC, NetApp, Mellanox, SnapLogic and DataStax. What’s interesting now is that we’re seeing large technology vendors with hardware expertise pushing gear optimized for Hadoop. Read more »

Don’t feel bad if you haven’t heard of Opera Solutions. However, the analytics-as-a-service provider has been quietly building up its $100 million company since 2004 and, with big data on the tip of the IT world’s collective tongue, Opera is ready to start spreading the word. Read more »

Data-integration specialist Syncsort is releasing two new Hadoop tools that it says will give Hadoop users a better, faster experience than they can achieve using Apache Hadoop alone. Unlike some other recent announcements, however, Syncsort is looking to improve Hadoop rather than replace aspects of it. Read more »

IBM today announced a new product dedicated to helping customers perform sentiment analysis of social media data, as well as a new program with the Yale School of Management’s Center for Customer Insight to train students in advanced data analysis skills. Read more »

If Yahoo plans to spin off its white-hot Hadoop business, it would make Yahoo the third vendor operating alongside Cloudera and IBM — fighting for what, right now, are only speculative customer dollars. Would Yahoo’s spinout have what it takes to compete? Read more »

Hadoop is the talk of the town when it comes to big data, but it’s not without faults that have some users begging for an alternative. Like many open source projects, it’s relatively unpolished and often requires a great deal of learning and much strenuous customization […] Read more »

The most interesting part about yesterday’s announcement that Groupon is using the Cloudera Distribution of Hadoop wasn’t the actual use but, rather, the insight that Groupon is “building a world-class infrastructure” of which Hadoop will be a key part. But recruiting big-data-savvy talent is getting rather pricey. Read more »

Subscriber Content

Two markets stand out above all else when looking at the first quarter of 2011: infrastructure as a service (IaaS) — the epitome of cloud computing — and big data. Amazon Web Services continues to lead the IaaS space in terms of customers and innovation, while Rackspace, buoyed by momentum around OpenStack, will be its primary competitor for mainstream customers. In the big data space, there are so many players and terms floating about it’s difficult for outsiders to get a handle on who’s who and what’s what, though such activity validates the technologies. Other developments this quarter included HP’s impending presence in the cloud computing and big data spaces and the realization that Intel won’t be left to die if low-power servers based on x86 processors catch on like the buzz late last year suggests they will. Additional companies mentioned in this report include VMware, Microsoft, Cloudera, SeaMicro and Facebook. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

Cloudera released version 3.0 of its distribution of Apache Hadoop (CDH3) Tuesday. CDH3 is a big reason why, despite a recent spate of Hadoop-based big data products either on the market or about to be there, Cloudera says it isn’t sweating all the new competition. Read more »

Despite an industry-wide push for better and more-complete big data strategies, it’s beginning to look like EMC and IBM will be the two technology vendors earning the most data-related dollars once the dust settles because they’ve embraced the new big data bundle while others have not. Read more »

Startup medical search company Apixio is trying to save lives by bringing machine-learning and natural-language-processing techniques to medical records, giving doctors a patient’s entire relevant medical history via a simple cloud-based search engine. The goal is to make information-sharing among medical providers far more intelligent. Read more »

A handful of new releases and partnerships this week — as well as a big award — illustrate just how versatile the data-processing tool Hadoop is and how widespread its use might become. Hadoop is becoming a more viable tool for everyone from business users to journalists. Read more »

Hardware rarely comes up in discussions about big data, save for those centered on data warehouse appliances. But the omission hardly means hardware is irrelevant. In fact, big gear might become a big deal as companies look to bolster the performance of their big data systems. Read more »

Subscriber Content

Hadoop has been used by large web companies for applications such as search engines, but the reality is that the project is so much more. This report takes a closer look, examining what Hadoop is (and isn’t), who’s doing what to productize it and why we can expect to see the market pick up serious steam in 2011. We profile the growing number of companies — from startups like MapR to Cloudera, the arguable leader in the space — using Hadoop, the challenges still hindering widespread adoption and where potential users can expect the market to go as we move through 2011 and beyond. Companies mentioned in this report include Yahoo, Facebook, EMC, Teradata and Appistry. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

High-performance computing leader Platform Computing hopes to capitalize on the big data movement by spreading its wings beyond its flagship business of managing clusters and grids and into managing MapReduce environments, too. Platform has a solid foundation among leading businesses, especially in the financial services industry. Read more »

One of the statements that struck me most from Structure: Big Data was CA CTO Donald Ferguson’s notion that big data represents a “very promising” opportunity for startups, particularly those targeting specific use cases. I think he’s right, particularly with regard to the latter part. Read more »

Ravel wants to provide a supported open source version of Google’s Pregel software called Golden Orb to handle large-scale graph analytics. Ravel COO Zach Richardson told me in the following video interview that the startup would release the Golden Orb code on March 31st. Read more »

It turns out that “big data” isn’t just a buzzword, but a legitimate concern for companies across the board. Their interest in the tools to take advantage of the opportunity for data analysis has sparked a land grab among software vendors centered around Hadoop. Read more »

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther

MapR, a stealth-mode startup with about 30 employees, is developing a version of Hadoop and plans to compete with the likes of Cloudera. The company is likely to launch later this year and has been funded by Lightspeed Venture Partners and NEA. Read more »

Knome, Metamarkets, ITA Software, OmniTI, Karmasphere at Structure Big Data 2011

As organizations strive to analyze more data than ever and to do it faster than ever, the results they’re getting might actually be worse than those in the pre-big-data and real-time world — at least temporarily. Read more »

Braxton Woodham, Tap11, at Structure Big Data 2011

When it comes to social data, one of the biggest firehoses around is the one that comes from Twitter. Trying to make sense of 140 million tweets a day in something close to real-time is a significant challenge, says Tap11 chief technology officer Braxton Woodham. Read more »

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther

During an afternoon panel entitled “The Many Faces of MapReduce — Hadoop and Beyond,” moderator Gary Orenstein compared the two primary Hadoop components — MapReduce and the Hadoop Distributed File System — to the meat and bread of a sandwich. Read more »

NoSQL startup DataStax officially entered the pantheon of Hadoop providers today, introducing its own distribution called “Brisk.” Brisk utilizes the open source NoSQL database Cassandra as a replacement for Apache’s Hadoop Distributed File System, as well as Cassandra’s built-in MapReduce engine and Hive. Read more »

A Yale computer science project has turned into a company giving Hadoop the ability to perform analytics on both structured and unstructured data. Hadapt launched today with an undisclosed amount of funding and the goal of making Hadoop more broadly applicable for analytics. Read more »

Subscriber Content

Business and IT leaders now face significant opportunities and challenges with big data, that is, data sets so large that they are difficult to store, manage and analyze. This report explores the rapidly evolving big data business and technology ecosystem. It examines big data in the context of several different industries: financial services, health care, sports, travel and media. We explore the different big data technologies (from Hadoop and NoSQL derivatives to cloud-based collaboration tools) and their various benefits for enterprises. And we examine some of the existing challenges big data poses, and what enterprise IT leaders can do to overcome them. Companies mentioned in this report include Amazon Web Services, Google, Teradata, IBM and Cloudera. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

Using Hadoop to process data for targeted web advertising efforts is nothing new, but this week, two companies in the video advertising space also stepped forward to highlight how Hadoop is helping them deliver the right ads to the right viewers for their clients. Read more »

Just over a month after discontinuing its Hadoop distribution to focus on the flagship Apache Hadoop project, Yahoo is proposing some changes to the Hadoop MapReduce component that could significantly improve processing performance. The proposal illustrates just how beneficial Yahoo’s renewed focus could be. Read more »

Microsoft is developing a new big data tool called Dryad. Dryad and the associated programming model, DryadLINQ, simplify the process of running data-intensive applications across hundreds, or even thousands, of machines running Windows HPC Server. Dryad builds upon lessons learned from Hadoop, but differs in some significant ways. Read more »

Facebook is working on a real-time analytics dashboard to let users determine which content is getting the most attention from visitors. As described in an educational session on Wednesday night in Facebook’s Seattle office, the service is built atop HBase and tracks about 100 metrics. Read more »

It was a big week for big data, with two key trends adding fuel to claims that data management and analysis will never be the same. Even laggards will be tempted to give big data tools a try to see what all the hype is about. Read more »

Few would argue that Hadoop doesn’t have a bright future as a foundational element of big data stacks, but Piccolo, a new project out of New York University, is moving data in-memory in an attempt to improve parallel-processing performance beyond what Hadoop and/or MapReduce can do. Read more »

With enterprise data volumes growing, business and IT leaders face significant opportunities and challenges from big data. The space, of course, is not without its obstacles — including plenty of privacy concerns — but in 2011, there are numerous sales-growth opportunities and new business models finally surfacing. Read more »
