More mapreduce Stories
Subscriber Content

In the tsunami of experimentation, investment, and deployment of systems that analyze big data, vendors have seemingly been trying approaches at two extremes: either embracing the Hadoop ecosystem or building increasingly sophisticated query capabilities into database management system (DBMS) engines. For some use cases, there appears to be room for a third approach that lies between the extremes and borrows from the best of each. Read more at GigaOM Pro »

Sequoia
photo: Lawrence Livermore National Laboratory

A group of Stanford researchers recently ran a complex fluid dynamics workload across more than a million cores on the Sequoia supercomputer. It’s an impressive feat and might foretell a future where parallel programming becomes commonplace even on our smartphones. Read more »

Subscriber Content

The delivery of real-time query makes Hadoop accessible to more users, and by orders of magnitude. Its significance goes well beyond delivering a database management system (DBMS) kind of query engine that other products have had for decades. Rather, Hadoop as a platform now supports a whole new paradigm of analytics. With the introduction of real-time query, Hadoop has taken a major step toward unifying the majority of big data analytic applications onto one platform. This research paper targets information technology professionals who have in-depth experience with traditional RDBMS and seek to understand where the Hadoop ecosystem and big data analytics fit. Read more at GigaOM Pro »

Subscriber Content

Organizations are coping with the challenge of processing unprecedented volumes of data. However, the processes involved with using a large cluster to run applications like Hadoop are error-prone. So IT managers are turning to cluster-management solutions to automate tasks associated with cluster creation, management and maintenance. Read more at GigaOM Pro »

Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it’s not without some flaws. Ironically, one of Hadoop’s biggest shortcomings right now is also one of its biggest strengths going forward: the Hadoop Distributed File System. Read more »

For better or worse, Hadoop has become synonymous with big data. In just a few years it has gone from a fringe technology to the de facto standard. But is the enterprise buying into a technology whose best day has already passed? Read more »

Subscriber Content

A major limitation of big data is that the technologies used to analyze it are not easy to learn. It doesn’t have to be that way, and technologies like data visualization and cloud-based tools target less-sophisticated users — from business users to receptionists to high school students. Read more at GigaOM Pro »

Subscriber Content

Discussions about the cloud now involve more than just the IT department. New developments in hardware architectures, more-energy-efficient data centers, regulatory concerns and simplifying analytics are all discussions currently circling through the industry. Here’s what to consider when thinking about your business in the cloud. Read more at GigaOM Pro »

Market research firm IDC released the first legitimate market forecast for Hadoop on Monday, claiming the ecosystem around the de facto big data platform will sell almost $813 million worth of software by 2016. But Hadoop’s actual economic impact is likely much, much larger. Read more »

Subscriber Content

Ask a VC about big data and she will probably tell you about visualization of the user interface. We’re talking about intuitive UIs that let users visually work with data using charts and tools, not algorithms. It’s hard to do right, but the payoff could be huge. Read more at GigaOM Pro »

Subscriber Content

This quarter saw Amazon Web Services finally relaxing its public-cloud-only stance and launching services to support hybrid-cloud deployments. Meanwhile, Hadoop players moved to make their platforms more accessible to mainstream BI analysts and database administrators. A new quarterly report analyzes these trends and provides a near-term outlook. Read more at GigaOM Pro »

For years, Oracle has wowed Wall Street with fat software margins: Large companies depending on Oracle relational databases pay what it takes to keep them up and running. It’s unclear whether Oracle can carry that dominance over into the Big Data era, however. Read more »

Subscriber Content

Big data now touches everything from enterprises to smart-meter startups, while Hadoop is fast becoming the leading tool to analyze that data, and debates around privacy abound. GigaOM Pro analysts offer insights on what to consider when it comes to big data decisions for your business. Read more at GigaOM Pro »

Subscriber Content

The emergence of the big data phenomenon is fundamentally changing everything from the way companies operate to the way people interact to how the world deals with outbreaks of infectious diseases. Here we highlight 10 case studies illustrating how big data is changing the world. Read more at GigaOM Pro »

Matt Howard of Norwest Venture Partners predicts that 2012 and 2013 will be Hadoop’s breakout years. Howard gives us insight into the five factors that will accelerate Hadoop’s mainstream adoption over the next 18 months. Read more »

At some indeterminate time, very possibly this year, business intelligence favorite Tableau Software will file for its initial public offering. When it does, it will be in good company, along with others that were smart enough to ride the twin waves of consumerization and big data. Read more »

Data-warehouse veteran Teradata has tightened its embrace of the Hadoop big data platform via a partnership with Hortonworks. The goal is to give customers big data environments that integrate everything from the Teradata Database for advanced SQL analytics to the Hortonworks Data Platform Hadoop distribution. Read more »

Hadoop features front and center in the discussion of how to implement a big data strategy, one of the biggest trends in IT. There’s just one problem that keeps cropping up: many people don’t seem to know exactly what it means when somebody says “Hadoop.” Read more »

For eBay, big data is serious business. Every day, the site stores and analyzes data from millions of users buying, selling and searching for hundreds of millions of products. It handles all this data with lots of Hadoop, although a good data warehouse doesn’t hurt either. Read more »

Pentaho is moving its business intelligence tools to the Apache license to make them more compatible with big data technologies that already operate under that license. Pentaho’s Kettle extract, transform, load (ETL) technology was previously available under the LGPL, or GNU Lesser General Public License. Read more »

This year may have been the beginning of the big data onslaught, but big data will only get bigger in 2012. Watch for companies to check out specialized databases for different data types and to segment their data centers for old and new workloads. Read more »

Subscriber Content

When it comes to the promise of data as the currency of the web, the current state of affairs has privacy advocates and many consumers up in arms. But it doesn’t have to be the one-sided affair it is today, in which companies have all the data and all the rights, and we shouldn’t have to be afraid of who’s doing what with our information. With laws, products, practices and education, data can become a far more valuable currency than cash ever was. Keeping that in mind, this research note examines five issues that must be addressed by policy makers and entrepreneurs so that they can deliver on our data-driven digital future. Companies mentioned in this report include Twitter, Facebook and Foursquare. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

Cloud-based DNA-sequencing specialist DNAnexus has closed a $15 million second round led by Google Ventures and TPG Biotech. Elsewhere, we learned Wednesday that agribusiness giant Monsanto has deployed Cloudant’s NoSQL database as the underpinning of the company’s genomics system. Read more »

IBM on Tuesday acquired Platform Computing, a company that made a name for itself in high-performance computing but recently made a splash in the cloud computing and big data spaces. It’s likely these areas that had IBM in a buying mood. Read more »

Cloudera and Hortonworks have been playing a game of one-upmanship over the past few weeks in an attempt to prove whose contributions to the Apache Hadoop project matter most. Reputation matters to both companies, but maybe not as much as fending off encroachments on their turf. Read more »

Attention, webscale aficionados: Twitter plans to open source its Hadoop-like real-time data processing tool known as Storm. The social service nabbed the code through its acquisition last month of BackType, and says it’s a better tool for processing streams of data. Read more »

Big data — as in managing and analyzing large volumes of information — has come a long way in the past couple of years. Among the greatest innovations might be the advent of real-time analytics, which allow the processing of information in real time to enable instantaneous decision-making. Read more »

The fight for Hadoop dominance is officially on. While Hortonworks is busy answering questions about its product strategy, Cloudera and MapR will demonstrate new versions of their distributions overflowing with bells and whistles. And there are several other competitive products lurking in the background. Read more »

Hadoop is a very valuable tool, but it’s far from perfect. While Apache, Cloudera, EMC, MapR and Yahoo focus on core architectural issues, there is a group of vendors trying to make Hadoop a more-fulfilling experience by focusing on business-level concerns such as applications and utilization. Read more »
