More data Stories


Infochimps is attempting to build a data market, and in doing so, the company is wading into some of the messiest and most unstructured data around, attempting to clean it up and put it up for sale. I talk to co-founder Flip Kromer about the challenges. Read more »



According to one machine-learning expert, one key takeaway from Watson’s “Jeopardy!” victory is simple: humans are very smart. That a system such as Watson can understand natural language is a huge step forward, but it’s still only as good as its data and algorithms. Read more »


In a new Forrester report, authors James Staten and Lauren E. Nelson advise infrastructure and operations (I&O) leaders to encourage their data analysts to get hip to cloud-based analytics tools and to consider making their organizational data available to the public as a cloud resource. Read more »



HP announced this morning that it has signed an agreement to acquire analytical database provider Vertica for an undisclosed amount, a decision that finally puts HP into the data warehouse market and analytics space that is becoming more important by the day. Read more »


The interesting story behind OkCupid, the recently acquired online dating site, is OkTrends, its blog that analyzes the site’s wealth of data to shed light on our love lives. But the interesting story behind OkTrends is its use of R to power those analytics. Read more »


Taming Twitter’s stream of endless data can be daunting, especially the more people you follow. But start-up My6sense is bringing some order to the chaos with a new Chrome browser extension that prioritizes a user’s Twitter stream, making it relevant to their tastes and interests. Read more »


New start-up BillGuard is looking to build a crowd-sourced anti-virus billing protection system that digests a consumer’s transactional history and pulls in alerts from banks, existing members and the web. The system uses big data analysis and machine learning to help users spot fraud and errors. Read more »


Like most social games, Tribal Crossing applications have a very high database write rate: changes to the game state must be stored so the user doesn’t lose her game score, “loot” or location. Tribal Crossing migrated from MySQL to Membase to support a higher write rate. Read more »


Netflix offers rent-by-mail and streaming movies. The shift from mail-order to streaming video had fairly significant implications for Netflix’s application infrastructure. Netflix realized it would need multiple geographically dispersed data centers and far more processing capacity, so it turned to Amazon Web Services. Read more »


Few would argue that Hadoop doesn’t have a bright future as a foundational element of big data stacks, but Piccolo, a new project out of New York University, is moving data in-memory in an attempt to improve parallel-processing performance beyond what Hadoop and/or MapReduce can do. Read more »


With enterprise data volumes growing, business and IT leaders face significant opportunities and challenges from big data. The space, of course, is not without its obstacles — including plenty of privacy concerns — but in 2011, there are numerous sales-growth opportunities and new business models finally surfacing. Read more »


Openwave’s next generation platform must support geographic redundancy, massive scalability and high availability. It has to distribute databases redundantly across multiple data centers and handle large customer datasets – varying from hundreds of terabytes to petabytes, and supporting thousands of transactions per second from each customer. Read more »


Shutterfly is a popular Internet-based photo sharing and personal publishing company that manages a persistent store of more than 6 billion images with a transaction rate of up to 10,000 operations per second. Here’s why it made the journey from Oracle to MongoDB. Read more »


BI vendor Jaspersoft has expanded its software’s support to include pretty much the entire gamut of big data tools available. There might not be much business demand for all these connectors right now, but it’s wise for Jaspersoft to establish its presence in this area early. Read more »


Mobile-app-analytics startup Flurry is upgrading its data center network with Arista Networks 10 GigE switches, a move designed to improve network performance as Flurry continues to add both terabytes and nodes to its big data system. Is the network the hardware superstar in big data environments? Read more »


Today’s links offer further proof that technologies like Hadoop and NoSQL aren’t going anywhere — and might even be expanding — and that choosing the right cloud computing solution really should be about what’s best for the individual business (e.g., public vs. private, or available vs. reliable). Read more »


Hadoop startup Cloudera has rounded out its support of the Apache Software Foundation by becoming a Silver-level sponsor. Cloudera already contributes code and personnel to the Apache Hadoop project, and Cloudera’s Doug Cutting, the creator of Hadoop, is the ASF chairman. Read more »


On Friday, Microsoft’s HPC division opened up the company’s Dryad parallel-processing technologies as a Community Technology Preview (CTP). Dryad could be a rousing success, in part because Hadoop — which is written in Java — is not ideally suited to run atop Windows or support .NET applications. Read more »


Larry Ellison and Oracle aren’t interested in following technology trends. They do their own thing, whether it’s mocking cloud companies or hiring deposed chief executives of rivals. Somehow, it all works out. Oracle reported blowout results for the first quarter of 2011 on Friday. Read more »
