Survey reveals a few interesting numbers about Apache Spark
A new survey from startups Databricks and Typesafe revealed some interesting insights into how software developers are using the Apache Spark data-processing…
Why data science matters and how technology makes it possible
When Hilary Mason talks about data, it’s a good idea to listen. She was chief data scientist at Bit.ly, data scientist in…
Cloudera tunes Google’s Dataflow to run on Spark
Hadoop software company Cloudera has worked with Google to make Google’s Dataflow programming model run on Apache Spark. Dataflow, which Google announced as a…
A startup wants to build a trading platform for sensor data
A startup out of Las Vegas is trying to capitalize on a very difficult, and potentially very lucrative, opportunity within the internet of things. The…
The team that created Kafka is leaving LinkedIn to build a company around it
A trio of LinkedIn engineers, led by Jay Kreps, is taking the Apache Kafka technology they helped create commercial. Their new company, called Confluent, wants to enable real-time data processing in all sorts of companies, and is backed by Benchmark, LinkedIn and Data Collective.…
Microsoft adds stream processing and pipeline tools to Azure
Microsoft announced a trio of new cloud data services on Wednesday aimed at stream processing and data pipelines. They're not revolutionary, but they appear to have their own advantages, and they also help ensure Azure keeps up with the Joneses in cloud computing.…
ZoomData raises $17M to become the visualization layer for big data
ZoomData has raised another $17 million for its visualization technology that was built with stream processing and iPads in mind, but has since made the leap to historical data and desktops, as well.…
Measuring the world’s emotions using Twitter and Amazon’s cloud
Amazon Web Services, Gnip and two Australian research institutions have teamed up to track the emotions of tweets in near real-time and offer the data to the public via visualizations, downloadable tables and an API.…
AT&T Labs, Continuuity will open source a Hadoop streaming engine called jetStream
Big data startup Continuuity has teamed with AT&T Labs on an open source project called jetStream that pairs a high-throughput SQL database with a real-time data-processing engine. The goal is to underpin applications that can handle multiple levels of latency, consistency and analysis on streaming data.…
DataTorrent’s Hadoop stream-processing engine is now for sale
DataTorrent, a startup building a stream-processing engine for Hadoop that it claims can analyze more than 1 billion data events per second, announced…
Hortonworks co-founder Baldeschwieler now advising DataTorrent
Eric Baldeschwieler, the founding CEO of Hortonworks and former Yahoo VP who led the company’s Hadoop development efforts, is now a strategic…
Why the internet of things is big data’s latest killer app — if you do it right
A Huntsville, Ala., company is moving from the machine-to-machine world into cloud platforms and big data. Here's how it did it and how it thinks its work could actually end up saving lives.…
This week in big data: Clouds, collaboration and Cassandra
There has been a lot of data industry news this week coming out of the Strata conference, and elsewhere. Here are some of the highlights.…
Amazon’s streaming data service, Kinesis, is now available
The service, announced in November as a tool for customers who want to process data in a timely fashion, gives Amazon a rival to Apache Storm.…
Netflix open sources its data traffic cop, Suro
Netflix has open sourced a tool called Suro that collects event data from disparate application servers before sending them to other data platforms such as Hadoop and Elasticsearch. It's more big data innovation that hopefully finds its way into the mainstream.…
On the path to personalization
This post from the New York Times’ Open blog talks about the architecture and algorithms underpinning its content-personalization engine. Its experience speaks…
Startup Dataminr claims it gave investors a three-minute headstart to dump BlackBerry stock
Dataminr, a startup dedicated to analyzing the Twitter firehose of real-time tweets, is using today's BlackBerry news as proof of its value. The company claims it gave users a three-minute head start on selling BlackBerry shares.…
Hortonworks has big plans to make Storm work for the enterprise
Hortonworks is working to integrate the Storm stream-processing engine with its Hadoop distro, and hopes to have it ready for enterprise apps within a year's time. It's the latest non-batch functionality for Hadoop thanks to YARN, which lets Hadoop run all sorts of processing frameworks.…
LinkedIn open sources stream-processing engine Samza, its take on Storm
Samza is LinkedIn's take on Twitter's Storm engine for stream processing, only built on top of LinkedIn's own Kafka messaging system. It's the latest in a growing line of open source efforts from LinkedIn, and another notch in the belt for Hadoop.…
Ad platform Adello acquires Hadoop startup HStreaming
Big data startup HStreaming is now part of Swiss advertising firm Adello Group. HStreaming had standout technology by all accounts, but the business never scaled enough to survive in a tough market.…
Twitter open sources Storm-Hadoop hybrid called Summingbird
Twitter has open sourced a "streaming MapReduce" system called Summingbird that makes Hadoop and Storm play nicer together so applications that require both batch and stream processing can do their jobs with as little complexity as possible.…
CSC buys Infochimps and its big data platform
IT services and consulting specialist CSC has acquired Infochimps, a startup that sells a big data query and processing platform. Infochimps had raised about $5 million in equity and debt financing since launching in 2009.…
GridGain gets $10M for in-memory computing
GridGain Systems has raised a $10 million series B investment round for its suite of in-memory computing technology. In-memory databases are popular…