More data Stories

A few months ago, I posited that additional funding for Cloudera and Karmasphere signifies a large market opportunity for solutions that utilize the open-source analytics tool Hadoop. The news generated this week by Yahoo’s third annual Hadoop Summit has only affirmed that belief. Read more »

Hadoop creator and champion Yahoo is taking advantage of its annual Hadoop Summit today by rolling out some new features for its open-source Hadoop distribution. The new features tackle security and workflow management, which Yahoo hopes will help Hadoop continue its proliferation among mainstream users. Read more »

Now that AT&T, along with all the providers internationally, has scrapped unlimited data plans and introduced caps, you’ll need to keep an eye on how much data you’re using. Here are a few ways to make sure you don’t end up going over your monthly allowance. Read more »

DemandTec, a retail forecasting software provider, has convinced Target Corp. to hand over even more of its shopping data in order to better set prices and forecast demand. But DemandTec has needs of its own — partners that can help it filter unstructured social data. Read more »

Google revamped its search indexing methodology this week, which was quickly eclipsed by the chatter about background images on its home page. But those images were a red herring distracting us from technology changes that could influence those delivering the real-time web for years to come. Read more »

Want to know how Apple’s Genius song recommendation system for iTunes works? A post telling folks was deleted without explanation, but it’s worth reading since recommendation engines are the key to shoving the web onto devices like mobile phones and for creating a hyperpersonalized surfing experience. Read more »


Given the recent news about AT&T’s decision to shift from unlimited 3G data plans, we were curious how much data you actually use on your device. Taking a peek at my stats revealed that I’ve downloaded 4.1GB of data and uploaded nearly a gig. Read more »

Is there a business in providing intelligible data sets to information workers, application developers and analysts in a world where once expensive data such as turn-by-turn directions or real-time financial quotes are now free? Microsoft, with its Project Dallas, joins other firms hoping that there is. Read more »

We managed to create 800,000 petabytes of digital information last year, according to a study released today by IDC and EMC. The creation of digital data will increase to 1.2 million petabytes by the end of this year, which means we need fatter pipes. Read more »

The World Bank, which tracks everything from mortality rates to livestock production in hundreds of countries around the globe, said today it is opening up its data, including removing all of the pay walls around data that used to require a subscription fee to download. Read more »

Twitter today open-sourced the code that it used to build its database of users and manage their relationships to one another, called FlockDB. The move comes shortly after Twitter released its Gizzard framework, which it uses to send thousands of queries a second to FlockDB. Read more »

In some ways, the fact that Hadoop is mature enough to inspire commercial products — Cloudera and Karmasphere, e.g. — means it’s yesterday’s news. Which open-source, big-data-inspired product will be the next to launch a wave of startups and drive tens of millions in VC spending? Read more »

From a comparison of the auto and PC industries to problems associated with location-based advertising to tips and tricks for reading startup term sheets, here is a selection of five articles to read. And after you are done, check out the Hitchhiker’s guide to financial regulation. Read more »

Appistry today added another element to its cloud-computing application platform, announcing the April availability of CloudIQ Storage. With it, St. Louis-based Appistry joins the growing ranks of companies seizing on demand for cloud storage solutions that maintain performance in the face of rapidly growing data volumes. Read more »

Big data is on the tip of everyone’s tongues these days as more information is contributed to electronic records and more sources provide that information. We now have a river of data that we’re going to harness and use to make money and better decisions. Read more »

The Icelandic government is expected to put forward legislation that could turn the northern nation into an international freedom-of-information haven, thanks in part to the efforts of Wikileaks and the country’s recent experiences with corporate and government inaction and secrecy during its banking crisis. Read more »

There are a few widespread misconceptions about Cloudera, the promising, well-funded Burlingame, Calif.-based startup that offers services, training and support for the open-source software framework Hadoop. At least that’s what I found out during a talk earlier today with the company’s CEO, Mike Olson. Read more »

BlueKai, which aggregates and sells data on 200 million online shoppers to advertisers and publishers, today announced a $21 million third round of funding led by GGV Capital and including former investors Redpoint Ventures and Battery Ventures, bringing its total funding to $34.7 million. Read more »

Most cloud providers house services in only a few geographically distributed data centers, and national or continental data storage regulations can limit how, and if, organizations move their operations to the cloud. Can legislation be passed that takes into account such realities? Read more »

Google, nearly six years since it first applied for it, has finally received a patent for its MapReduce parallel programming model. The question now is how this will affect the various products and projects that utilize MapReduce, such as Apache’s MapReduce-inspired Hadoop project. Read more »


If you believe the marketing hype, you aren’t really a true Mac user unless you have MobileMe. MobileMe is pushed heavily in the Mac and iPhone UI as well as the Apple retail environment. Fancy terms like “beyond the box sales” are a clever way of […] Read more »

Berkeley Labs has been working on an open source version of a system for demand response services for the power grid (called openADR) for more than five years. But only one company in that time has commercialized a version of the open source platform — a […] Read more »

Can an open source data management system do for the smart grid what Google’s open mobile operating system Android is doing for cell phones — spawn innovation and low cost development? Execs at the Tennessee Valley Authority (TVA), the largest public power provider in the U.S., […] Read more »

When it comes to cloud computing, no one has entirely sorted out what’s hype and what isn’t, nor exactly how the enterprise will use it, but it is becoming increasingly clear that Big Data is the future of IT. To that end, tackling […] Read more »

Love it or fear it, there is no denying the impact cloud computing is having on IT practices. Despite a summer full of high-profile outages, cloud computing spent the season continuing its march toward ubiquity, as our third-quarter wrap-up at GigaOM Pro showed (subscription required). Read more »

Cloudera, a startup based in Burlingame, Calif., today announced the release of its first commercial product, Cloudera Desktop. It’s a graphical interface for managing Hadoop, the open-source framework that is catalyzing the data mining renaissance. Cloudera’s Hadoop now works on almost all major cloud platforms: Amazon […] Read more »

Hadoop, as a pivotal piece of the data mining renaissance, offers the ability to tackle large data sets in ways that weren’t previously feasible due to time and dollar constraints. But Hadoop can’t do everything quite yet, especially when it comes to real-time work flow. Fortunately, […] Read more »

Who doesn’t battle the Apple Sync Services dragon on a semi-regular basis? Here are a couple of examples from Apple Support about how to resolve problems with this very useful yet unfortunately flawed feature: Mac OS X 10.5: Resetting the SyncServices folder; Sync Services: Advanced troubleshooting […] Read more »

Lately, Google Voice is perhaps one of the most widely discussed products in the Apple blogosphere besides Apple’s own native devices. With its rejection from the App Store and people pointing fingers at Apple, AT&T, Steve Jobs and just about everyone and everything else in between, […] Read more »

Collectively, Yahoo, Facebook, Amazon and Google are rewriting the handbook for big data. Startups intending to reach these proportions must also change their thinking about data, and enterprises need this model for internal deployments as a way to retain an economic edge. The four leading web […] Read more »

Last week, Sam explored trends in the technology jobs market, suggesting that significant opportunities only reveal themselves when examining both the available jobs and the underlying trends in demand for skills. Coincidentally, on the same day that Sam’s piece was published, The New York Times suggested […] Read more »
