Hadoop is an open source software framework that was initially used for data-intensive queries at sites such as Yahoo, but has since spread to data-crunching tasks of many types at all kinds of organizations. Hadoop can bring great efficiencies to dealing with large data sets, and is available in several distributions. Keep up with your Hadoop news here.
Hadoop
How Caffeine Is Giving Google a Turbo Boost
Google revamped its search indexing methodology this week, which was quickly eclipsed by the chatter about background images on its home page. But those images were a red herring distracting us from technology changes that could influence those delivering the real-time web for years to come. Read More »
A decade ago, scientists would collect data over a period of years, upload that data to a supercomputer, then wait for the opportunity to run it during a scheduled time. The process took months — or even years. Now, thanks to cheap processing power and the …
What IBM Does With Big Data
In a world of billion-dollar web companies and VC-backed startups trying to forever change human interaction via software, IBM tends to look a little staid. But don’t let its deliberate pace, legacy-software-mongering ways and suited executives fool you. If you pull back the covers, you’ll find …
Can Greenplum Become the Sun Microsystems of Databases?
When I heard about Greenplum’s new Chorus platform, the first concrete step toward the company’s Enterprise Data Cloud vision, I couldn’t resist learning more – the combination of databases, social networking, the cloud and Scott McNealy is a powerful proposition. What I learned is that Greenplum …
Twitter today open-sourced the code that it used to build its database of users and manage their relationships to one another, called FlockDB. The move comes shortly after Twitter released its Gizzard framework, which it uses to send thousands of queries a second to FlockDB. Read More »
Appistry today added another element to its cloud-computing application platform, announcing the April availability of CloudIQ Storage. With it, St. Louis-based Appistry joins the growing ranks of companies seizing on demand for cloud storage solutions that maintain performance in the face of rapidly growing data volumes. Read More »
There are a few widespread misconceptions about Cloudera, the promising, well-funded Burlingame, Calif.-based startup that offers services, training and support for the open-source software framework Hadoop. At least that’s what I found out during a talk earlier today with the company’s CEO, Mike Olson. Read More »
Google, nearly six years since it first applied for it, has finally received a patent for its MapReduce parallel programming model. The question now is how this will affect the various products and projects that utilize MapReduce, such as Apache’s MapReduce-inspired Hadoop project. Read More »
As the year winds to a close, GigaOM Pro’s crack team of contributors takes a look back at what went right, what went wrong, and for … Read More »
It’s Tuesday and that can only mean it’s time for our weekly feature, the BlackBerry Buzz. The Buzz is where you’ll find out what’s been going on in the BlackBerry brambles. You’ll hear about everything that’s worth knowing in the awesome world of the BlackBerry. As … Read More »
Boxee CEO Avner Ronen stopped by our NewTeeVee Live conference last week to officially announce the first Boxee-branded hardware, a set-top box that will be available sometime next year. I interviewed Ronen after his keynote, and he gave me … Read More »
Berkeley Labs has been working on an open source version of a system for demand response services for the power grid (called openADR) for more than five years. But only one company in that time has commercialized a version of the open source platform — … Read More »
Can an open source data management system do for the smart grid what Google’s open mobile operating system Android is doing for cell phones — spawn innovation and low cost development? Execs at the Tennessee Valley Authority (TVA), the largest public power provider in the … Read More »
Love it or fear it, there is no denying the impact cloud computing is having on IT practices. Despite a summer full of high-profile outages, cloud computing spent the season continuing its march toward ubiquity, as our third-quarter wrap-up at GigaOM Pro showed (subscription … Read More »
Cloudera, a startup based in Burlingame, Calif., today announced the release of its first commercial product, Cloudera Desktop. It’s a graphical interface for managing Hadoop, the open-source framework that is catalyzing the data mining renaissance. Cloudera’s Hadoop now works on almost all … Read More »
Hadoop, as a pivotal piece of the data mining renaissance, offers the ability to tackle large data sets in ways that weren’t previously feasible due to time and dollar constraints. But Hadoop can’t do everything quite yet, especially when it comes to real-time work flow. … Read More »
After the Titanic hit an iceberg and started to sink, some of its passengers sat on the deck, playing cards — giving birth to the phrase “rearranging the deck chairs on the Titanic.” That’s precisely the image that came to mind when I heard … Read More »
With two major acquisitions announced today (the $420 million acquisition of SpringSource by VMware and Facebook buying FriendFeed for $50 million), I almost forgot to note that two good friends of this blog have switched jobs. Doug Cutting, creator of open-source software framework Hadoop, … Read More »
At the Hadoop Summit in Silicon Valley today, Yahoo announced the availability of the Yahoo Distribution of Hadoop, a source-only version of Apache Hadoop that Yahoo uses within its own search engine. That’s more good news for Cloudera, a Burlingame, Calif.-based startup that builds commercial … Read More »
Updated: Hadoop, the open-source software framework, is one of the technologies we have been following closely. If you are equally interested in Hadoop, then we have 10 free tickets for The Hadoop Summit that is going to be held this Wednesday, June 10, at the … Read More »
Hadoop, an open-source software program that helps process incredibly large data sets, has been generating plenty of buzz. The upcoming Hadoop Summit on June 10 marks a midway point in an eventful year for the technology. Cloudera, a … Read More »
At first glance it’s hard to see how the open-source software framework Hadoop, which was developed for analyzing large data sets generated by web sites, would be useful for the power grid — open-source tools and utilities don’t often mix. But that was before the … Read More »
“Hadoop is going to find potential markets in any industry where there are large data sets that need complex analysis,” Mike Olson, chief executive officer and one of the four co-founders of Cloudera, the startup that’s commercializing the open-source software framework Hadoop, told me … Read More »
Cloudera, a Burlingame, Calif.-based startup that is building commercial services around open-source software framework Hadoop, has closed $6 million in Series B funding, bringing the total raised by the company to $11 million. The latest round of funding was led by Greylock Partners. … Read More »
Earlier today, I stopped by at the Social Graph Symposium at Sun Microsystems’ Menlo Park campus. The event, which attracted some of the most well-known experts on social networks and social graphs, was organized to look at the various challenges and opportunities being presented by … Read More »
We are in the midst of a data mining renaissance. Traditionally, data warehousing implementations were large, complex and expensive, meaning only the top-ranking companies could afford them. Teradata pioneered the initial market for corporate data warehousing solutions and still maintains a segment lead, something HP’s … Read More »
You know the saying, “If you build it, they will come”? Well that certainly holds true for GPS functionality and mobile phones. Nearly 48 percent of the mobile app developers surveyed by Boston-based Skyhook Wireless said that location is what “sets their app apart, or … Read More »
Hadoop, Cassandra, HBase, Hypertable, Open Neptune… these are some open source projects that are being pursued by web technologists in order to deal with the explosion of digital data in a post-terabyte world. The traditional way to deal with unstructured data isn’t working. What we need is … Read More »
Cloudera, a Burlingame, Calif.-based company offering services around the open source software framework Hadoop, has raised $5 million in Series A funding led by Accel Partners. It has also attracted funding from seasoned infrastructure executives and Web veterans such as Caterina Fake (co-founder, Flickr), … Read More »
Yesterday, what’s old became new again for Aster Data Systems, when initial public customer MySpace released a video love letter to the analytic database specialist. The numbers in this relationship are well documented — 2-3 TB of new data each day, hundreds … Read More »
Aster Data Systems, which makes software that allows companies to build massively scalable databases on commodity hardware, has raised an additional $5 million as part of its Series B round of funding from Institutional Venture Partners. Aster had originally closed $12 million back in … Read More »
Today IBM announced that six universities are using its cloud computing expertise to set up and manage clouds located in Qatar, Africa and Japan. It is using Hadoop for allocating resources in the cloud — something it first began doing in 2007 when … Read More »
Aster Data Systems, a Redwood City, Calif.-based startup that makes data warehousing software, has raised $12 million in new funding from JAFCO Ventures along with participation from existing investors Sequoia Capital, Cambrian Ventures and First Round Capital. The company had previously raised $10 million … Read More »
GigaOM’s Structure 08 event offered a terrific opportunity to survey the changing landscape of computing infrastructure. But as with all technology shifts, innovation won’t just belong to the big established players like VMware, Amazon, Google, Sun Microsystems, Salesforce.com and … Read More »
Last week, OStatic noted the rumor, first reported by VentureBeat, that Microsoft intended to buy Silicon Valley semantic search engine Powerset for $100 million. Lo and behold, Microsoft and Powerset are confirming today that an acquisition agreement has been signed. … Read More »
Parascale, a Cupertino, Calif.-based start-up that has developed a storage file system for a cloud of computers, announced that it had attracted $11.37 million in Series A funding from Charles River Ventures and Menlo Ventures. The company recently changed its chief executive and brought … Read More »
We are only ten days away from Structure’08, our web infrastructure conference. As part of our preparation for this event, our team of reporters & bloggers is finding new & interesting open source projects that are tackling various aspects of Cloud Computing, a concept popularized … Read More »
As part of our renewed focus on technologies that matter, we are launching a series of events called GigaOM PM, occasional meetups at which we will gather to discuss topical and important technology breakthroughs. I will host these gatherings, and we will keep them small and … Read More »