Sqrrl Enterprise, a commercial version of the National Security Agency’s Accumulo database technology, is now generally available. As one might expect, it’s all about security and analytics at a massive scale. Read more »
Accel has launched its Big Data Fund 2, a followup on the equally large fund the venture capital firm started in November 2011. Rather than seeking products that target data scientists, it wants those targeting business users. Read more »
This week, both Facebook and Yahoo detailed new efforts to manage real-time data flows within their myriad systems. Yahoo’s work is an open source implementation of Storm designed to run on the same cluster as Hadoop and even share resources. Read more »
Raymie Stata spent seven years working on the guts of Hadoop as a VP, chief architect and CTO at Yahoo. His new Hadoop startup, called Altiscale, has raised $12 million from some prominent investors. Read more »
There’s much debate still to be had over the NSA’s recently uncovered data-collection practices, but some of the technologies underlying them are out in the open. Here’s what we know already. Read more »
How does the NSA analyze all the data it’s collecting from cell phone users? With a massive database system built with just such scale and workloads in mind. Read more »
Cloudera’s new search feature, based on the Apache Solr project, is the latest move by the company to expand the utility of its Hadoop distribution. It’s also far from the last. Read more »
While the rest of the Hadoop world is trying to distance itself from Hive with new interactive engines, Hortonworks is trying to make it faster. It might actually be a sound strategy. Read more »
photo: Christophe Bisciglia (left) and Aaron Kimball (right)
Startup WibiData has raised another $15 million and wants to turn the lessons it has learned in the field into generic software that can let anyone build predictive applications on Hadoop. Read more »
Cascading creator Concurrent has developed a new open source tool called Pattern for running machine learning models on Hadoop clusters. When combined with its SQL tool called Lingual, users can move data from one stage to another easily. Read more »
Database startup Drawn to Scale is closing down. The company’s product, Spire, was one of the first SQL-on-Hadoop technologies. Read more »
Data-warehouse providers are quickly adding Hadoop distributions, or even their own versions of Hadoop, into their architecture, adding further cost advantages to collections of extremely large data sets. Finding the talent to manage this newly converged environment will not be easy, but it presents tremendous opportunity for companies willing to take some risk. Read more at GigaOM Pro »
Hadoop startup Mortar Data is offering to build recommendation systems for 10 companies, with help from Hilary Mason, Drew Conway and Max Shron. It’s part of a bigger plan to democratize the science behind online recommendations. Read more »
EMC CTO John Roese has a tough but important job: keeping EMC, VMware and Pivotal all moving in the same direction. While the three are separate companies, their fates are also very much aligned. Read more »
IBM’s entrant in the SQL-on-Hadoop competition has been flying under the radar, but is available as a technology preview. Called Big SQL, it’s a big deal if IBM wants to be a major player in the Hadoop space. Read more »
MapR on Wednesday released M7, its commercial version of HBase and the first such product on the market, which the company claims is bigger, faster and better than the open source version. Read more »
Analytics startup Precog is on a mission to make analytics on unstructured data as simple as possible with a new line of targeted appliances. Read more »
In the tsunami of experimentation, investment, and deployment of systems that analyze big data, vendors have seemingly been trying approaches at two extremes: either embracing the Hadoop ecosystem or building increasingly sophisticated query capabilities into database management system (DBMS) engines. For some use cases, there appears to be room for a third approach that lies between the extremes and borrows from the best of each. Read more at GigaOM Pro »
Cloudera’s Impala engine for interactive SQL queries on Hadoop data is now generally available, and CEO Mike Olson gives his lay of the competitive landscape. Read more »
Accurate timing has grown more important in distributed systems, not just for mobile networks, but also for tracking data between data centers. Our love of digital junk is pushing storage to the edge. Read more »
The advent of big data is affecting Ford Motor Co. in some significant ways, from how it analyzes its supply chain to the features it puts into its cars. Read more »
Hadoop experts Qubole have just closed a Series A funding round for their service, which lets users run Hive data warehouse jobs in Amazon’s cloud. Read more »
Gravity CTO Jim Benedetto knows his way around MySQL after managing a 600-instance cluster at MySpace, but he has found HBase religion as his real-time content-recommendation platform grew. And he’s not alone. Read more »
Hadoop not fast enough for you? Then you might want to get to know AMPLab, a University of California, Berkeley team developing faster versions of many core Hadoop components. Read more »
Data scientist might be the sexiest job of the 21st century, but it’s hardly an easy gig to land. Here is some advice from practitioners at Netflix, Orbitz and Hortonworks on how to get hired and even do the hiring. Read more »
Teradata introduces a new high-speed data-warehouse appliance and announces the ability to use insights from Hadoop as part of analysis in a data-warehouse appliance. Read more »
Cloud computing is finally starting to add value to business, as those in charge of cloud within enterprises are moving from talking to doing. That much was very evident in the first quarter of 2013. Read more at GigaOM Pro »
The new version aims to provide a simpler interface for wrangling hundreds of data points per site visit. Qubit has also released research about browser user value, with IE users coming out on top. Read more »
Almost every tech company claims to hate patent trolls, but they certainly don’t always back up their words with actions. Recent patent activity around the Hadoop big data platform might show how companies can effectively battle trolls — if they really want to. Read more »
IBM announced a new PureData appliance for Hadoop and technology for speeding up analytic databases. The announcements come at a good time, with data sets growing and enterprises hankering for easy and fast analysis capability. Read more »
MapR is releasing open source code and partnering with Canonical on Ubuntu, while Netflix is releasing some data for developers to play with. Sounds like a good day for openness. Read more »
Google announced a “patent pledge” in which it will donate 10 patents related to MapReduce to protect the emerging cloud and big data industry from lawsuits. Read more »
Teradata has been around forever, and its customer base full of huge companies suggests it will probably be around for a while to come. Here’s how some of its customers use the company’s analytics software. Read more »
Platfora, the San Mateo, Calif.-based startup that helped spur a general rethinking of business intelligence for a big data world, is finally exiting its beta period and is generally available. It’s no wonder the company has garnered so much attention given its stated mission to make […] Read more »
Entrepreneurs who build applications on top of Hadoop see lots of use cases, but the ecosystem needs to evolve further in order to support wider and more cost-effective implementations. Read more »
Asking how something is better than Hadoop is not the right question. For strategic thinking around big data, companies need to figure out what they want to achieve, not what tool to use. Read more »
The strategic partnership will see Cloudera’s enterprise Hadoop distribution, along with its Impala real-time query engine, running on top of T-Systems’ extensive cloud infrastructure in Europe and beyond. Read more »
Cascading proprietor Concurrent has secured $4 million in venture capital in order to advance its efforts toward easing the development of big data applications. Read more »
Today’s most successful companies are the ones with the ability to capture and analyze all data available to them. Enter SQL-on-Hadoop solutions, which increase the accessibility of Hadoop and allow organizations to reuse their investment in learning SQL. Read more at GigaOM Pro »
Database startup Drawn to Scale has extended its Spire distributed data platform from SQL to MongoDB. That means users can get high performance from the latter even across hundreds of terabytes. Read more »