As Hadoop moves towards establishing itself as a key data management platform for the enterprise, there is a new set of challenges it must meet to be regarded as a true contender in the field.
Jeff Dean, a Google Fellow who helped develop some of the web giant’s most innovative infrastructure projects, says focusing on one problem at a time is crucial for success.
Google announced a “patent pledge” in which it will donate 10 patents related to MapReduce to protect the emerging cloud and big data industry from lawsuits.
In the first of our four-part multi-media series on Hadoop, the people who helped build Hadoop talk about its birth, its promise and the challenges in moving it from webscale to just large-scale.
A group of Stanford researchers recently ran a complex fluid dynamics workload across more than a million cores on the Sequoia supercomputer. It’s an impressive feat and might foretell a future where parallel programming becomes commonplace even on our smartphones.
In just a few years, big data has turned from a buzzword and concept best left for large web companies into a force that drives much of our digital lives. Here are five technological trends that will change how data is processed and consumed going forward.
At some indeterminate time, very possibly this year, business intelligence favorite Tableau Software will file for its initial public offering. When it does, it will be in good company, along with others that were smart enough to ride the twin waves of consumerization and big data.
WibiData, a Hadoop-based startup focused on making it easier to analyze user behavior, has raised $5 million from New Enterprise Associates. The company, formerly known as Odiago, launched in late 2011 already claiming Wikipedia and Atlassian among its early customers.
For eBay, big data is serious business. Every day, the site stores and analyzes data from millions of users buying, selling and searching for hundreds of millions of products. It handles all this data with lots of Hadoop, although a good data warehouse doesn’t hurt either.
Cloud-based DNA-sequencing specialist DNAnexus has closed a $15 million second round led by Google Ventures and TPG Biotech. Elsewhere, we learned Wednesday that agribusiness giant Monsanto has deployed Cloudant’s NoSQL database as the underpinning of the company’s genomics system.
IBM on Tuesday acquired Platform Computing, a company that made a name for itself in high-performance computing but recently made a splash in the cloud computing and big data spaces. It’s likely these areas that had IBM in a buying mood.
Is Hadoop our only hope for solving big data challenges? From scalability to fault tolerance, Hadoop does myriad things very well. Yet, Hadoop is not the solution to all big data problems and use cases. Several key issues remain, including investment, complexity and batch-only processing.
High-performance computing leader Platform Computing hopes to capitalize on the big data movement by spreading its wings beyond its flagship business of managing clusters and grids and into managing MapReduce environments, too. Platform has a solid foundation among leading businesses, especially in the financial services industry.
It turns out that “big data” isn’t just a buzzword, but a legitimate concern for companies across the board. Their interest in the tools to take advantage of the opportunity for data analysis has sparked a land grab among software vendors centered around Hadoop.
During an afternoon panel entitled “The Many Faces of MapReduce — Hadoop and Beyond,” moderator Gary Orenstein compared the two primary Hadoop components — MapReduce and the Hadoop Distributed File System — to the meat and bread of a sandwich.
Aster Data, a San Carlos, Calif.-based company, is offering a free version of its MapReduce development environment for download, which will allow developers to build data analytics apps on top of it. MapReduce is a technology that was first used by Google for parallel processing of big data sets.
While it doesn’t produce the kind of instant name recognition that some other software platforms do, you’ve probably already used Hadoop many times. Hadoop is a free, open-source software framework, designed to work on clusters of computers, that can mine huge sets of data very quickly. It was inspired by components in Google’s search platform, particularly MapReduce, and is best known for powering fast, distributed searches at sites including Yahoo! and Facebook. But Hadoop’s transition from powering search tools at web sites to many other types of applications is already well underway.
We’re now entering what I call the “Industrial Revolution of Data,” where the majority of data will be stamped out by machines:…