More Data Stories
On The Web

This is a good presentation about Facebook’s graph-processing engine, Giraph, from a big data event held at the company’s Menlo Park campus in early June. The PRISM story kind of took over the news cycle that week, but the event also produced some news (for big data geeks, at least): Facebook’s Presto engine for interactive queries of its 250-petabyte Hadoop data warehouse.

Researchers have devised a method for identifying fake Twitter accounts that proved highly accurate across 27 popular black-market merchants. With Twitter’s cooperation, they spotted and deleted millions of accounts, using only data generated during the account-registration process.

In Brief

For some time now, Foursquare has been fighting its critics by arguing that it is building the “location layer for the internet.” It has followed through on some of that claim by integrating itself into Instagram and other services, but now it could be on the verge of something much bigger: sources tell BuzzFeed it is close to signing a significant data deal with Yahoo.

In Brief

A database vendor called Objectivity has created a mobile app called GraphMyLife that aims to let consumers explore links between the people and content in their various social networks. I say “aims” because although the idea is pretty cool, the app is a bit laggy and confusing (at least on my phone). But cut Objectivity a break: it’s a specialized (and old) enterprise-tech company trying to humanize its graph database software.

In Brief

Business intelligence and analytics startup Birst has raised a $38 million Series E round led by Sequoia Capital. Birst has been very busy in the past couple of years, moving from SaaS to on-prem software, rethinking the data warehouse and even launching a Hadoop-based service. It looks like Birst is positioned to test the IPO waters, as Qliktech and Tableau did before it.

In Brief

Cleversafe, a Chicago-based provider of object-storage systems for housing massive amounts of data, has raised a $55 million Series D round led by New Enterprise Associates. Apart from traditional storage workloads, Cleversafe has also made a name for itself as a replacement for HDFS in Hadoop environments. According to Crunchbase, the company has now raised $91.4 million since 2007.

On The Web

Pamela “PJ” Jones, the proprietor of Groklaw, is shutting down operations in the wake of the Lavabit secure email service closure. Groklaw, which was originally set up to cover the long-running SCO v. Novell trial but went on to facilitate discussions around all sorts of open-source and patent issues, relied partly on anonymous user tips. Jones said email could no longer be trusted, and said she was personally trying to get off the internet as much as possible.

On The Web

The White House’s Office of Science and Technology Policy published new guidelines and FAQs to ease adoption of President Obama’s open data policy. In May, the president signed an order mandating that agencies use machine-readable and open data formats when they collect or create information so it can be re-used efficiently.

photo: Google's new deep learning architectures

Google researchers have developed new methods for analyzing language using deep learning techniques. They’ve also open-sourced an implementation of their work so any researcher can experiment with it. It could be the first of many deep learning tools designed for mass consumption.

In Brief

When it comes to data, soccer is the new baseball. The latest issue of the Economist has an article breaking down English Premier League soccer players using data, and a subsequent blog post includes an interactive tool from machine learning startup Ayasdi that lets readers explore the data. Earlier this week, Disney researchers presented their analysis of an entire year’s worth of ball-position data for a professional soccer league and how ball position can affect the outcome of games.
