In late 2007, in a conversation with my colleagues I pointed out that over the next decade or so, the Internet’s infrastructure was going to be transformed in a very fundamental manner. It would start with the proliferation of Amazon Web Service-style core cloud computing and […] Read more »
Now six years old, the Apache Hadoop platform for storing and processing huge amounts of data, perhaps the catalyst of the current big data movement, appears ready for its closeup. According to the companies leading the Hadoop charge, they’re already beating away customers with a stick. Read more »
Netflix’s algorithms for recommending movies to customers might not be perfect, but it isn’t for lack of trying. The company is capturing and analyzing incredible amounts of data, even from the videos themselves, to try and figure out what you want to watch next. Read more »
It’s no secret that Facebook stores a lot of data in Hadoop, but how it keeps that data available whenever it needs it isn’t necessarily common knowledge. Today at the Hadoop Summit Facebook Engineer Andrew Ryan highlighted that solution, which Facebook calls AvatarNode. Read more »
Amazon Web Services already has a winner with its Elastic MapReduce Hadoop service, and now it’s turning up the heat by adding MapR’s Hadoop distribution as an option. Users can take advantage of MapR’s performance features while also having integration with AWS’s suite of cloud services. Read more »
VMware is launching a new open source project, called “Serengeti,” that aims to let the Hadoop data-processing platform run on the virtualization leader’s vSphere hypervisor. VMware apparently smells a lucrative opportunity in Hadoop and isn’t about to miss out on getting a piece of the pie. Read more »
Online genealogy service Ancestry.com is trying to become like the Amazon or Netflix of family trees. Much like those companies use customer data to recommend products or movies customers might like, Ancestry.com is using machine learning to make learning about ancestors a lot less work. Read more »
One year after launching into the Hadoop market with much anticipation, Yahoo spinoff Hortonworks finally has a product available. The company announced version 1.0 of its flagship Hortonworks Data Platform on Tuesday, as well as a High Availability version designed with new partner VMware. Read more »
Karmasphere CEO Gail Ennis told me recently she thinks “2013 is going to be the year when we see [Hadoop adoption] go a lot more mainstream and [turn] into a tornado.” I like the prediction, as much for its imagery as for its near-term certainty. Read more »
Two key members of the Facebook team that created the Hadoop query language Hive are launching their own big data startup called Qubole on Thursday. Qubole is a managed version of Hive that’s hosted on the Amazon Web Services cloud computing infrastructure. Read more »
If you just pay attention to the largest Hadoop users, you might think the platform is just a way of powering search engines or analyzing customer behavior for ad-serving. Of course that’s not the case, but finding those broader use cases can still be kind of difficult. Read more »
If you’ve ever wondered what big data means at an individual level, this realization about sums it up: “I could either keep dyeing my hair or retire a year earlier.” It’s those types of realizations Intuit hopes its heavy big data use will help uncover. Read more »
It’s neither easy nor glamorous — data scientists get all the love — but making sure your Hadoop cluster is properly configured and applications are running optimally is necessary, especially as applications move into production. Here are five tools to help you do it. Read more »
Facebook’s hyperinflated valuation heading into its IPO has everything to do with its promise, and very little to do with its actual profits. Here are some numbers we know about Facebook’s infrastructure that speak to its promise perhaps as much as its 900 million users. Read more »
There’s nothing quite like a hypothetical about someone setting a whole block on fire after cutting off the fire department’s electric supply in order to slow its response. Is it comforting to know that smart people and smart analytics could help stop it from happening? Read more »
Yahoo is looking to leverage its big data prowess with a new tool for marketers called Genome. It looks like an acknowledgement that while Yahoo might not rule the web anymore, it knows a heck of a lot about analytics. Read more »
As the world once again starts analyzing Yahoo’s myriad woes after Sunday morning’s ouster of embattled CEO Scott Thompson, I’m left wondering if its investment in Hadoop didn’t aid in the company’s demise, even if it’s a way down the long list of Yahoo’s mistakes. Read more »
The IT hype machine has everyone jumping on the big data bandwagon. But before we start saving every scrap of data in the enterprise for fear that we will miss a nugget of insight, shouldn’t we focus on what we already have? Read more »
Paul Doscher, CEO of Lucid Imagination, wants you to know that when it comes to enterprise-class search, open-source Lucene is a contender. And a strong contender that can face off against Google, Amazon and Microsoft in the big data search arena. Read more »
Finally the worlds of big data geeks and clean energy nerds have collided. Researchers have proposed building a “GreenHadoop,” that is a version of the MapReduce programming framework that could manage a data center’s computing workload to optimize clean energy from a solar system. Read more »
A cadre of DevOps experts will gather later this week at an undisclosed location in Northern California. The goal: To hash out issues they see in their own shops, to compare notes on problems and talk in a way that they cannot in vendor-driven conferences. Read more »
Market research firm IDC released the first legitimate market forecast for Hadoop on Monday, claiming the ecosystem around the de facto big data platform will sell almost $813 million worth of software by 2016. But Hadoop’s actual economic impact is likely much, much larger. Read more »
Ask a VC about big data and she will probably tell you about visualization of the user interface. We’re talking about intuitive UIs that let users visually work with data using charts and tools, not algorithms. It’s hard to do right, but the payoff could be huge. Read more at GigaOM Pro »
Known for integration and embeddable databases, Pervasive has all kinds of exciting technology plans for cloud and big data on its roadmap. ... Read more at GigaOM Pro »
When your business is to insure farmers against the effects of bad weather, you’d better have some seriously accurate data on your side. Mother Nature, after all, can be somewhat unpredictable. The Climate Corporation thinks the answer is lots of data and lots of computing power. Read more »
Cloud computing and big data are in the enterprise to stay, but making the most of them presents challenges for IT decision makers. The future belongs to those companies who can work through legacy tools, ongoing security issues and the data scientist shortage. Read more at GigaOM Pro »
There are now more than half a dozen commercial Hadoop distributions in the market, and almost every enterprise with big data challenges is tinkering with the Apache Foundation-licensed software. A new report examines the key disruptive trends shaping the Hadoop platform market. Read more at GigaOM Pro »
IBM’s big data platform will support the Cloudera Hadoop distribution, a surprising decision given the reservations the two companies had expressed about each other before. That gives IBM and rival Oracle at least one thing in common: Oracle’s Big Data Appliance runs Cloudera too. Read more »
It’s beginning to look like there will be no free-standing analytics companies left. IBM is buying Vivisimo for the “discovery and navigation” expertise that companies use to access and analyze (what else?) big data. The news comes a week after IBM bought Varicent, another analytics company. Read more »
VMware has acquired Cetas, a startup that provides analytics atop the Hadoop platform. Terms of the deal haven’t been disclosed, but Cetas is an 18-month-old company with tens of paying customers that didn’t need to rush into an acquisition. So, why did VMware buy it? Read more »
Big data and the marketing world go together like peanut butter and jelly. Marketers want to present their brands in the most-effective manner possible and always put the right ad in front of the right person. Big data makes that possible at a whole new level. Read more »
This quarter saw Amazon Web Services finally relaxing its public-cloud-only stance and launching services to support hybrid-cloud deployments. Meanwhile, Hadoop players moved to make their platforms more accessible to mainstream BI analysts and database administrators. A new quarterly report analyzes these trends and provides a near-term outlook. Read more at GigaOM Pro »
Skybox Imaging, a startup that wants to capture and analyze high-resolution photos and videos of the Earth, has raised $70 million in Series C funding. The money will help Skybox expand its lineup of software engineers and data scientists that might be its secret sauce. Read more »
This quarter the EV market struggled to find its footing. Meanwhile, the smart-grid sector solidified and low-power technology proved itself important in the data center. Read more to learn what these news pieces and others mean for the larger space over the next few months. Read more at GigaOM Pro »
If you’re an amateur poet and love big data, high-performance system vendor AMAX has a deal for you. The company is conducting a contest to find the best haiku on big data. But I’m sharing my poems right here. Read more »
TempoDB, a startup out of Chicago, has built a database-as-a-service offering specifically for time-series data thrown off by thermostats, servers and automotive telematics. Does the world (or the Internet of Things) need a specialty time-series database hosted in the cloud? Read more »
The headline might sound like buzzword stew, but it couldn’t be any truer. For companies willing to make the leap to cloud services, there will be a lot of companies willing to make big data as easy as paying your bill every month. Read more »
For years, Oracle has wowed Wall Street with fat software margins: Large companies depending on Oracle relational databases pay what it takes to keep them up and running. It’s unclear whether Oracle can carry that dominance over into the Big Data era, however. Read more »
If your organization doesn’t have a strategy for big data now, you will need one in the future. Here we discuss the difference between big data and traditional business intelligence, as well as the considerations executives should take into account as they plan their big data strategies. Read more at GigaOM Pro »
In a webscale data center, peak efficiency feels like a blast furnace. I stepped into the hot aisle of a Dell Modular Data Center and 1,920 servers blasted 115-degree air right in my face. If eBay’s Dean Nelson has his way, that was just the beginning. Read more »