More data Stories

photo: U.S. House of Representatives

When U.S. lawmakers and policy experts get tired of fighting ideological battles over the past, they might want to put a little effort into helping improve the country’s future. Here are four technology issues that could help improve the economy and outline Americans’ digital rights. Read more »

On The Web

This is an interesting patent application, in part because of its techniques and in part because — like many technology-related patent applications — it’s hard to see how it’s particularly novel. The idea of using someone’s social graph to find influential connections that could inform mobile-app recommendations is pretty good, but at the core aren’t we just talking about the decision to value one variable more than another in a recommendation system?

In Brief

Hadapt, a startup that has been pushing SQL on Hadoop since 2011, is rolling out a new technology it calls “schema-less SQL.” Essentially, the SQL portion of Hadapt’s platform will automatically form columns from the keys of JSON and other data types, thus making the associated values queryable like values in a standard relational database. This sort of joint SQL-NoSQL support is likely to become a lot more normal for analytic databases. Curt Monash has a good technical breakdown of the new Hadapt feature.
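To make the idea concrete, here is a minimal sketch of forming columns from JSON keys so the values can be queried with ordinary SQL. It is not Hadapt's implementation: it uses Python's built-in `sqlite3` module as a stand-in database, and the helper names (`flatten`, `load_documents`), the dotted naming for nested keys, and the sample documents are all assumptions made for illustration.

```python
import json
import sqlite3


def flatten(doc, prefix=""):
    """Flatten nested JSON objects into dotted column names (hypothetical convention)."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # SQLite cannot bind lists, so keep arrays as JSON text.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def load_documents(conn, table, raw_docs):
    """Create one column per key seen in any document, then insert the rows.

    Keys missing from a document become NULL in that row.
    Returns the sorted list of column names that were created.
    """
    rows = [flatten(json.loads(raw)) for raw in raw_docs]
    columns = sorted({key for row in rows for key in row})
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        conn.execute(
            f'INSERT INTO "{table}" VALUES ({placeholders})',
            [row.get(c) for c in columns],
        )
    return columns


if __name__ == "__main__":
    docs = [
        '{"name": "ann", "device": {"os": "ios"}, "clicks": 3}',
        '{"name": "bob", "clicks": 5, "plan": "pro"}',
    ]
    conn = sqlite3.connect(":memory:")
    print(load_documents(conn, "events", docs))
    print(conn.execute('SELECT name, clicks FROM events WHERE clicks > 4').fetchall())
```

A production system would presumably discover columns as data arrives rather than from one batch up front; this sketch only shows the key-to-column mapping.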

In Brief

The Comparative Constitutions Project has launched a new web tool called Constitute, which lets users search their way through the world’s constitutions by keyword or theme. Not only is the tool handy for gathering info on international laws, but it’s also indicative of how the web can ease access to valuable data via nice interfaces masking lots of complicated data-prep work. The organization’s website has lots of other constitutional data and visualizations, too.


Search is evolving to fit the needs of users who don’t just want a web site, but the actual answer to the question driving the search. To stay on top, semantic search technologies are key. Read more »

Users have grown accustomed to a real-time web, but now they want an easier-to-implement real-time integration between web services. REST Hooks seems to be the emerging standard for such integration. Read more »

In Brief

Randall Munroe, the man who writes the web comic xkcd, also runs a series called What If in which he answers questions using data gleaned from the web and physics. On Tuesday he tackled the question “If all digital data were stored on punch cards, how big would Google’s data warehouse be?” The result is a speculative blog post that estimates Google’s server count (between 1.8 million and 2.4 million) and total storage (10 exabytes), and tells you how to find the search giant’s secret data center locales (go read it to find out).

In Brief

Recommind, a San Francisco-based company that sells machine learning software optimized for e-discovery in the legal industry, has raised $15 million from SAP Ventures. The new money will go toward growing the company’s footprint outside the legal space via enterprise software that lets humans and machines work closely with one another around data analysis — something Recommind CTO Jan Puzicha discussed with me in March at Structure: Data.

In Brief

DataSift, one of the two companies (along with Gnip) granted real-time access to the Twitter firehose, now offers real-time and historical analysis of Tumblr data. While it’s best-known for Twitter, DataSift actually analyzes dozens of social media and commenting platforms, which is pretty handy if you want to compare sentiment, engagement or whatever else across platforms where people behave quite differently.

On The Web

The NSA may have found a way to monitor some credit card transactions, according to a Snowden-derived report from Germany’s Der Spiegel. The agency said in leaked documents that it found a way to access Visa transactions in Europe, the Middle East and Africa, but the financial services company denies the tapping of its networks. The report highlights an NSA financial database called Tracfin, into which SWIFT international transfer information also flows through the interception of “SWIFT printer traffic from numerous banks.”

In Brief

An MIT professor has conducted some handy research that could help make applications run faster and use less energy by overcoming an inherent drawback of multicore processors. The problem is that although the local caches on chips save them the latency of having to access RAM, the hard-wired algorithms powering them often assign data to cache locations randomly without considering the core trying to access it. The new software-based technique, called Jigsaw, tracks which cores are accessing what data, and how much, and assigns data locations accordingly. The paper detailing Jigsaw is available here.
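The general idea of access-aware placement can be illustrated with a toy sketch. This is not the algorithm from the Jigsaw paper: it ignores cache capacity and simply puts each block of data in the cache bank nearest the core that touches it most. The function name `place_data`, the access-log format and the core-to-bank mapping are all hypothetical.

```python
from collections import defaultdict


def place_data(access_log, bank_of_core):
    """Assign each data block to the bank nearest its heaviest-accessing core.

    access_log: iterable of (core_id, block_id, access_count) tuples.
    bank_of_core: dict mapping core_id to the name of its nearest cache bank.
    Ties are broken in favor of the lowest core_id so results are deterministic.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for core, block, n in access_log:
        counts[block][core] += n

    placement = {}
    for block, per_core in counts.items():
        heaviest = min(per_core, key=lambda c: (-per_core[c], c))
        placement[block] = bank_of_core[heaviest]
    return placement


if __name__ == "__main__":
    log = [(0, "A", 10), (1, "A", 3), (1, "B", 9), (0, "B", 2)]
    banks = {0: "bank0", 1: "bank1"}
    print(place_data(log, banks))
```

A realistic scheme would also have to respect how much space each bank has, which is where tracking "how much" data each core uses comes in.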

In Brief

New research out of Carnegie Mellon University shows that analyzing fans’ tweets can help gamblers make better bets on NFL games. Sometimes. Their technique wasn’t very effective at picking winners or betting the over/under, but it was 55 percent accurate on bets against the spread (and then only during the middle of the season). I doubt anyone will undertake this effort themselves for such a slight edge, but there might be a business here if someone can figure out a consistently accurate model.


Couchbase is officially opening up two new technologies to mobile developers as part of a public beta program. Couchbase Lite is a lightweight database designed specifically for iOS and Android devices, while Cloud Sync Gateway syncs local data with a bigger database in the cloud. Read more »
