More Hadoop Stories

tumblr dashboard

Tumblr hits 500 million page views a day, deals with 40,000 requests per second and sends more than a terabyte of data into its Hadoop cluster. Here’s how it went from nothing to a startup that needed to serve 15 billion page views a month. Read more »

SBI

Companies are hot on social media for a number of reasons, but perhaps chief among them should be that social platforms can create focus groups at a scale never before possible. Given the right big data tools and techniques, the insights can be fantastic. Read more »

hadoop

Hadoop features front and center in the discussion of how to implement a big data strategy, one of the biggest trends in IT. There’s just one problem that keeps cropping up: many people don’t seem to know exactly what it means when somebody says “Hadoop.” Read more »

Facebook’s S-1 filing shows the company is all about infrastructure. The ad revenue and user experience it relies on to exist mean Facebook can’t afford to take it easy on IT, which means shareholders and users will both find plenty of reasons to get upset. Read more »

If you like the idea of your analytics system’s getting more accurate with each piece of data it ingests, it looks like you are in for an exciting run, because machine learning appears to be catching fire across the ecosystem of big data vendors. Read more »

ebay screen

For eBay, big data is serious business. Every day, the site stores and analyzes data from millions of users buying, selling and searching for hundreds of millions of products. It handles all this data with lots of Hadoop, although a good data warehouse doesn’t hurt either. Read more »

As promised, storage kingpin EMC has integrated its Isilon NAS product with Hadoop in a way that will bring Isilon’s OneFS file system to bear on data. EMC isn’t alone. Vendors from Amazon to Oracle are trying to tame this big data beast. Read more »

Subscriber Content

fireworks1

If you’re like many of us, you’re already thinking over some New Year’s resolutions that will make you a better “you” in 2012. But how are the tech industries’ thought leaders approaching the new year? We asked 12 of them for their resolutions. Read more at GigaOM Pro »

Was Bill Gates, chairman and co-founder of Microsoft, the power behind the proprietary Windows-and-Office juggernaut, really an open source champion? A new Wired article lays Microsoft’s wider embrace of open source technologies — including Node.js and Hadoop — squarely at Gates’ feet. Read more »

Pentaho is moving its business intelligence tools to the Apache license to make them more compatible with big data technologies that already operate under that license. Pentaho’s Kettle extract, transform, load (ETL) technology was previously available under the LGPL, or GNU Lesser General Public License. Read more »

visual

The great thing about big data is that there’s still plenty of room for new blood, especially for companies that want to leave infrastructure in the rearview mirror. At this point, the data-infrastructure space, including Hadoop, is well-funded and nearly saturated, but it also needs help. Read more »

jeremeyburton

With all the talk of big data, cynics think the whole notion has jumped the shark. Get ready, they say, for the next tech bubble to burst. EMC CMO Jeremy Burton is not among them. Granted, he’s a marketer, but what he says makes sense. Read more »

Big data has gotten very, very big if the elite talking heads at the World Economic Forum in Davos, Switzerland, are talking about it. And they are talking about it. Sessions include “Decoding the data deluge” and “Personal data: the ‘new oil’ of the 21st century.” Read more »

Ad-targeting company 33Across is acquiring link-tracking specialist Tynt Multimedia, resulting in a combined user graph spanning 1.25 billion users. Both are storing and analyzing billions of transactions daily, and they will use that data to help publishers compete on ad sales against mega sites like Google. Read more »

Joe Coyle, CTO of global integrator Capgemini, sees a lot of cloud pitches from all the major technology vendors — and God knows they all have a cloud strategy. Here’s what he thinks of the current state of the market. Read more »

netezza-twinfin-tour

IBM is working the reins of its Smarter Commerce initiative by rolling out a new Netezza analytics appliance designed to help retailers churn through potentially petabytes of consumer sales data in real time. It’s trying to capitalize on the increased importance of e-commerce to revenues. Read more »

big elephant

Although the first couple of years of commercial Hadoop attention have been characterized by an attitude of “Hadoop is great, but …”, the tone is changing as Hadoop vendors increase the platform’s palatability with each new iteration. No longer is Hadoop necessarily an epic undertaking rife with pitfalls. Read more »

Subscriber Content

Continuing a yearlong trend, the fourth quarter in big IT was all about big data, and Hadoop in particular. Still, many are beginning to recognize the software framework’s shortcomings, which is why this quarter also saw more attention for startups claiming easy analytics and real-time processing. Elsewhere in infrastructure, SaaS startups made out well and valuations for these companies are getting higher, and naturally there was news from the AWS camp. This quarterly wrap-up examines these events and more, including the quarter’s dark spot, the hike in prices in the hard-drive manufacturing space due to the floods in Thailand. Companies mentioned in this report include Calxeda, Heroku, Rackspace, Salesforce.com and Tier3. For a full list of companies, and to read the full report, sign up for a free trial. Read more at GigaOM Pro »

Splunk’s IPO has been much anticipated with good reason. Splunk’s machine data search, analytics and visualization technologies address the gap between the reams of big data generated by the second and the ability to parse and display that data in a meaningful way. Read more »

oracle hq

Oracle’s Big Data Appliance is now for sale, featuring Cloudera’s Hadoop distribution and management tools. Regardless of what anybody thinks about Oracle’s strategy, the deal is a coup for Cloudera as it tries to fend off competition from fellow Hadoop startups Hortonworks and MapR. Read more »

Cloud insights

I spent some time playing with Google Insights to find out what parts of the country are most interested in technology and when that interest hit its peak. It wasn’t surprising to see Silicon Valley rank highly, but did you know Utah was into next-generation programming? Read more »

hard_disk_head_on_platter

1010data says it now hosts more than 5 trillion records for its customers. If 1010data’s growth is a microcosm of the greater market, it’s no wonder there’s so much excitement around scalable data stores such as Hadoop, NoSQL databases and massively parallel analytic databases. Read more »

server farm

Google announced that it’s ending its Academic Cloud Computing Initiative, a joint program with IBM and the National Science Foundation that gave researchers access to a massive Hadoop cluster on which to run their data-intensive projects. The company says access to such resources is now common. Read more »

This year may have been the beginning of the big data onslaught, but big data will only get bigger in 2012. Watch for companies to check out specialized databases for different data types and to segment their data centers for old and new workloads. Read more »

Money has turned the Hadoop community, once united under the Apache banner and the cuddly stuffed-toy-elephant logo, into something resembling a frat house: Everyone’s under the same roof, but there’s plenty of machismo to go around. If it’s not good business, it is good theater. Read more »

cake pops

Beyond Hadoop, there’s a lot more to think about when it comes to big data, ranging from where companies will actually find workers to how they’ll deal with an impending privacy-policy onslaught. The answers won’t be easy to come by, but they could be critical. Read more »

Microsoft’s Windows Azure platform as a service (PaaS) now supports Node.js, the popular server-side JavaScript development framework. That could give Azure more traction beyond the Microsoft .NET faithful. Also new: a limited trial of a Hadoop-based distribution for Azure. Read more »

LexisNexis is pressing MarkLogic’s technology into service for its just-launched Lexis Advance legal service. MarkLogic’s document storage, search and analytics technology replaces legacy home-built code as part of a platform modernization and big data push. Read more »

chorus line

Greenplum has announced its Unified Analytics Platform, a packaging of the Greenplum Database and Hadoop distribution along with its long-awaited Chorus software. Chorus is really what ties everything together, providing a platform to explore both types of data and to share interesting data sets and findings. Read more »

Digital Reasoning, a somewhat shadowy specialist in big data analytics for the U.S. intelligence community, announced Series B funding and named industry vet John Brennan to its board. Funding came from CIA-backed In-Q-Tel, as well as some Silver Lake Sumeru partners and other unnamed investors. Read more »
