More data Stories

In Brief

Hadoop startup WibiData has updated Kiji, its open source project that aims to make HBase a better (or easier) database for serving real-time applications. Among the updates in its latest SDK is an improved version of the KijiScoring feature. “Developers can now pass per-request settings to producer functions, greatly expanding the flexibility of real-time predictive model scoring. For example, a user’s current geolocation from mobile application can be factored in when re-computing which offers or recommendations to serve a user,” explains a press release.

In Brief

Guavus, a San Mateo, Calif.-based startup that specializes in analyzing the data coming off carrier networks, has hired former NetApp EVP Manish Goel as CEO. Goel replaces Anukool Lakhina, who founded the company and will stay on board to help drive its technology strategy, among other things. Guavus has raised $87 million in capital and claims some major wireless carriers as customers of its software that helps tie customer data to network activity.

Ahead of our Mobilize event Oct. 16 and 17, we asked experts how 50 billion connected devices and 6 billion people change their industry. In this essay designer Alexandra Deschamps-Sonsino tackles the topic of privacy. Read more »

In Brief

Yelp has announced the winners of its inaugural Yelp Dataset Challenge, and the four entries it chose actually seem pretty useful. They run the gamut from a technique to highlight key words so users can read reviews faster to helping businesses predict whether they’ll see an uptick in activity on Yelp. Having read countless reviews giving restaurants low ratings even though the food was good, I think the entry that extracts subtopics (e.g., food, service, ambience) from restaurant reviews might be my favorite.

Machine learning startup BigML now supports text data in its cloud-based prediction service. It has always analyzed numerical fields in complex datasets to determine the relationship between them and any given outcome, and now it will consider the importance of words, too. Read more »

In Brief

IBM is going to acquire a Dublin, Ireland-based company called The Now Factory, which specializes in providing customer and network analytics for wireless carriers. The idea is that better, faster data about their networks can help carriers optimize performance and better serve (or target) customers based on their usage behavior. The Now Factory seems similar in vision to the San Mateo, Calif.-based Guavus, and it seems logical the two will cross paths more often thanks to IBM’s global reach.

In Brief

Startup Dataguise has closed a $13 million series B investment round “led by Toba Capital with additional capital coming from the investment arm of a leading electronic conglomerate,” according to a press release. Dataguise’s biggest selling point might be its product designed to secure data within Hadoop. Aside from standard authentication, Fremont, Calif.-based Dataguise actually uses big data techniques to analyze data, determine what’s sensitive and then mask or encrypt it.

In Brief

Cloudera will be integrating with the Apache Accumulo database and, according to a press release, “devoting significant internal engineering resources to speed Accumulo’s development.” The National Security Agency created Accumulo and built in fine-grained access controls to ensure only authorized individuals could see any given piece of data. Cloudera’s support could be bittersweet for Sqrrl, an Accumulo startup composed of former NSA engineers and intelligence experts, which should benefit from a bigger ecosystem but whose sales might suffer if Accumulo makes its way into Cloudera’s Hadoop distribution.

EqualLogic and now DataGravity Co-founder Paula Long is very smart about storage technology. Right now, she’s looking at things like flash and cloud storage with a skeptical eye. They’re valuable and will become more valuable, she says, but only when they’re done right. Read more »

On The Web

It might have priced in the lower range of its purported value, but enterprise tech stocks have done pretty well recently and Violin has been one of the bigger companies in a red-hot flash market. More interesting in the long run might be how Violin’s IPO affects, or is affected by, planned IPOs for smaller flash vendors like Pure Storage and Nimble Storage. Expect an update on the Violin public offering on Friday.

On The Web

This article from Klint Finley at Wired Enterprise raises some good questions about the ideal integration of big data into nonprofits. I rather prefer the efforts of DataKind and the SumAll Foundation, which try to help nonprofits solve problems rather than harvest email addresses. The flipside, of course, is that individual donors are what keep the lights on in many cases, so access to more of them is good.

On The Web

This seems like good advice from Hortonworks’ Ofer Mendelevitch. Python? Check. Java? Check. Hadoop? Check. SQL? Check. Stats? Check. But his closing remark — “The road to data science is not a walk in the park. … This takes time, effort and a personal investment.” — might be the most important. We often talk about democratizing some of the data science tools, but the really good ones can do it all.

In Brief

Hadoop startup MapR has released a new version of its commercial HBase database, called M7. According to a press release, “HBase applications can now benefit from MapR’s high performance platform to address one of the major issues for on-line applications, consistent read latencies in the less than 20 millisecond range across varying workloads.” MapR released M7 in May and claims its architectural improvements over open source HBase result in a faster, easier experience.
