photo: Christophe Bisciglia (left) and Aaron Kimball (right)
Startup WibiData has raised another $15 million and wants to turn the lessons it has learned in the field into generic software that can let anyone build predictive applications on Hadoop. Read more »
Cascading creator Concurrent has developed a new open source tool called Pattern for running machine learning models on Hadoop clusters. When combined with its SQL tool called Lingual, users can move data from one stage to another easily. Read more »
Graph databases and graph-processing applications have been popping up all over the place lately, and now they’re starting to go commercial. On Tuesday, popular open source project GraphLab joined the ranks of graph startups. Read more »
MetLife is building new products on new technologies thanks to a $300 million investment in new technology and new skills. One of the first products is a MongoDB-based app that puts all of customers’ information in one place. Read more »
MapR on Wednesday released M7, its commercial version of HBase and the first such product on the market, which the company claims is bigger, faster and better than the open source version. Read more »
Cloudera’s Impala engine for interactive SQL queries on Hadoop data is now generally available, and CEO Mike Olson gives his lay of the competitive landscape. Read more »
Hadoop not fast enough for you? Then you might want to get to know AMPLab, a University of California, Berkeley team developing faster versions of many core Hadoop components. Read more »
Cloud computing is finally starting to add value to business, as those in charge of cloud within enterprises are moving from talking to doing. That much was very evident in the first quarter of 2013. Read more »
Despite its initial efforts at building its own open-source online learning platform, Stanford said it will fold that platform into the edX platform launched by Harvard and MIT. Read more »
Facebook has built a new open source tool for benchmarking graph databases, called LinkBench. And although the chances are your infrastructure and workloads look nothing like Facebook’s, the good news is LinkBench was built with configurability in mind. Read more »
MapR is releasing open source code and partnering with Canonical on Ubuntu, while Netflix is releasing some data for developers to play with. Sounds like a good day for openness. Read more »
Cascading proprietor Concurrent has secured $4 million in venture capital in order to advance its efforts toward easing the development of big data applications. Read more »
Riak CS distributed cloud storage technology has always been sort of open-sourcey but not really open sourced. That’s changing now with Basho putting it under the Apache 2 license. Read more »
Hadoop vendor is racking up customers and on Monday it announced a $30 million venture-capital investment that brings its total funding to $59 million since launching in 2011. Read more »
Mega data centers’ innovations in serviceability, automatically detecting and recovering from failures, procurement practices, and so forth will become standard practice in all modern data centers. Read more »
Open source software is an easy punching bag when security breaches arise. But getting rid of open source isn’t the answer — it is too valuable. Instead, we need to take some key steps to ensure the security of components throughout development. Read more »
We were there very early on for the birth of Hadoop and its maturation into a vital data analysis tool. Here’s a look back at some of our best stories. Read more »
In the first of our four-part multi-media series on Hadoop, the people who helped build Hadoop talk about its birth, its promise and the challenges in moving it from webscale to just large-scale. Read more »
Five years ago, LinkedIn was a shell of the technology company it is today. Here’s an inside look at where it came from, what it’s become and where it’s going. Read more »
There has been a lot of data news already this week — some big, some interesting, and some both. Here’s a collection of the stuff you shouldn’t, or don’t want to, miss. Read more »
Barely two weeks on the job, Damon Sicore, ed tech startup Edmodo’s new VP of engineering, talks about the company’s technological priorities and challenges and how he plans to create an engineering culture. Read more »
Red Hat is the latest company offering an alternative to the Hadoop Distributed File System, only this one is open source and ties into Red Hat’s bigger vision of hybrid cloud computing. Read more »
Backblaze pioneered the concept of open source storage hardware in 2009, and its designs have caught on. Hundreds of institutions — including Netflix and Shutterfly — use the designs, which have just entered their third generation. Read more »
As part of its new big-data-focused XDATA initiative, DARPA has invested $3 million in a startup called Continuum Analytics. The company’s aim is to extend Python’s prowess in scientific computing into the world of big data and analytics. Read more »
We all know Europe’s a bit behind the curve on cloud, but that’s not the only reason the fast-growing IaaS platform is finding the going tougher there than elsewhere. Read more »
In a digital world, the recipe has transformed from a static set of instructions into a kind of open-source code that any cook can adjust or reformulate. Food Network’s Alton Brown proposes to embrace that trend to create a form of living recipe. Read more »
Writing Hadoop queries doesn’t have to be hard, and neither does sharing data, according to Mortar Data, which just released an open source framework for Hadoop applications. The idea is that groups of people can more easily collaborate on building apps around giant data sets. Read more »
Big data tools such as Cassandra and Hadoop are transforming how data is stored and exploited at scale. But without similarly capable search technologies, enterprise adopters face challenges when it comes to gaining insights from that data. Read more at GigaOM Pro »
HBase is a great option for developing big data applications, but it’s not necessarily easy to use. WibiData is addressing this by open sourcing a portion of its predictive analytics infrastructure that adds structure to data, followed eventually by a whole HBase development framework called Kiji. Read more »
Facebook has open sourced a new system called Corona for scheduling and managing Hadoop jobs. Corona attempts to do away with many of the problems that come along with massive-scale Hadoop operations, and soon looks to take Facebook’s Hadoop deployment beyond just MapReduce. Read more »
Rackspace is busy building a Hadoop service, giving the company one more avenue to compete with cloud kingpin Amazon Web Services. However, the two services — along with several others on the market — highlight just how different seemingly similar cloud services can be. Read more »
Red Hat CTO Brian Stevens sees it as a ‘personal disappointment’ that there aren’t more companies lining up to follow his firm’s highly lucrative open source subscription path. Read more »
Many cloud providers say they are open, or based on open-source technology, but to be truly open, a service has to be backed by a community of users who contribute to making the technology better, says the founder of OpenNebula. Read more »
A major cloud trend over the past decade has been open source, but at present there is no one standard all providers obey. But anyone looking for a longer-term alternative to AWS now has two exciting new prospects: OpenStack and OpenShift. Read more at GigaOM Pro »
It’s not home to Google, Amazon or Facebook, but from plucky entrepreneurs to the world’s most-advanced computing systems, Europe has a lot more to offer the world of cloud computing and web infrastructure than might meet the eye. Here are seven reasons why it matters. Read more »
Open-source principles have helped create a host of useful software, including the Linux operating system and the crowd-powered resource that is Wikipedia — but could the same approach be used to open up the process of producing government legislation? Clay Shirky argues that it could. Read more »
It’s not for everyone, but if you’re storing petabytes of data in Hadoop, Quantcast thinks it has the cure to your woes. Its newly open sourced Quantcast File System promises smaller clusters and better performance, and it has proven itself over exabytes of data inside Quantcast. Read more »
10gen, the company behind the popular MongoDB database, is working with the nonprofit open source education initiative edX to offer free online MongoDB courses. 10gen CEO and co-founder Dwight Merriman will teach one of the classes. Read more »
The big data world is full of small, scrappy startups using their ingenuity to build complex systems out of open source software, but the Walt Disney Company is not one of them. Here’s what goes into building a big data platform in a Fortune 100 company. Read more »