VC-backed Upstart aims to help smart college graduates fund the startups of their dreams in return for a slice of their future earnings. So what’s so wrong with that? Plenty, according to Daniel Abadi, chief scientist of Hadapt and associate professor at Yale. Read more »
The delivery of real-time query makes Hadoop accessible to orders of magnitude more users. Its significance goes well beyond delivering the DBMS-style query engine that other products have had for decades. Rather, Hadoop as a platform now supports a whole new paradigm of analytics. With the introduction of real-time query, Hadoop has taken a major step toward unifying the majority of big data analytic applications onto one platform. This research paper targets information technology professionals who have in-depth experience with traditional RDBMS and seek to understand where the Hadoop ecosystem and big data analytics fit. Read more at GigaOM Pro »
In just a few years, big data has turned from a buzzword and concept best left for large web companies into a force that drives much of our digital lives. Here are five technological trends that will change how data is processed and consumed going forward. Read more »
Rackspace is busy building a Hadoop service, giving the company one more avenue to compete with cloud kingpin Amazon Web Services. However, the two services — along with several others on the market — highlight just how different seemingly similar cloud services can be. Read more »
Dell is donating an ARM-based server to the Apache Software Foundation so contributors can test their projects on new, energy-efficient hardware architectures. Big data projects such as Hadoop and Cassandra are low-hanging fruit, but many webscale applications likely could use them to save power. Read more »
Cloudera has joined the fray of Hadoop companies trying to turn the big data platform into an engine for exploring data interactively using standard SQL. As the biggest company in the space, its new technology called Impala could go a long way toward changing Hadoop’s image. Read more »
Former Yahoo cloud VP Todd Papaioannou and Facebook engineer Jonathan Gray are trying to simplify big data for programmers with a new platform service called Continuuity. It’s a development environment and runtime layer that sits atop a company’s Hadoop infrastructure and abstracts the complexity of writing apps. Read more »
Analytics startup Platfora is finally showing off its next-generation business intelligence software to the world, combining Hadoop, in-memory processing and HTML5 into an impressive product. It’s entering a competitive market full of large incumbents and other innovative startups all trying to change how we do BI. Read more »
Hadoop startup Hadapt has made its unified Hadoop-and-SQL analytic architecture even easier by adding native advanced analytic functions and integrating tightly with Tableau’s powerful BI software. It’s a sign of things to come as Hadoop and traditional SQL-based BI become cozy across the board. Read more »
Traditionally, databases have focused either on real-time transactions or on longer-term analytics, but newer technologies like Hadoop and a related open-source system called HBase can combine the two, according to a panel at GigaOM’s Structure Europe conference in Amsterdam. Read more »
According to Equinix CTO Lane Patterson, CIOs are still learning what can and what can’t be done in the cloud. Ultimately, it’s a matter of trust. Read more »
The usual suspects Amazon and VMware made significant announcements in cloud in the third quarter, while Hadoop remained the talk of the town in big data. Emerging trends in software-defined networking and flash storage stirred up lots of M&A and venture investment in the quarter. Read more at GigaOM Pro »
IBM and Cisco have both launched specialized hardware designed to securely and efficiently handle big data, but is there a large market for specialized big data gear? If there is such a market, are these the boxes that will fill it? Read more »
A new startup called Trifacta, founded by UC-Berkeley professor Joe Hellerstein and Stanford professor Jeffrey Heer, wants to eliminate much of the hassle of making messy data usable. The company combines machine learning and human-computer interaction, and has raised $4.3 million from Accel Partners. Read more »
Big data company RainStor has raised $12 million in Series C funding for its database that’s designed to shrink data footprints by at least 95 percent. It also plays nice with Hadoop, meaning a system can handle ad hoc SQL queries as well as MapReduce jobs. Read more »
NASA and a couple of other government agencies have kicked off a series of TopCoder challenges designed to find innovative solutions to the government’s big data problems. The first contest is all about making disparate, incompatible data sets usable and actually valuable across agencies. Read more »
Facebook knows something about big data — it collects more data and has built more tools than almost anybody else. Here, Facebook’s Jay Parikh and Accel Partners’ Ping Li talk about what lessons big data startups can take from Facebook to build businesses that can succeed. Read more »
Rather than rely on Hadoop or any other popular data-management tools to build a platform for democratizing data science, Precog decided to build its own system from scratch. That makes Precog stand out from the crowd, but it also means there’s little room for error. Read more »
It’s not for everyone, but if you’re storing petabytes of data in Hadoop, Quantcast thinks it has the cure to your woes. Its newly open sourced Quantcast File System promises smaller clusters and better performance, and it has proven itself over exabytes of data inside Quantcast. Read more »
Carter S. won his first-ever Kaggle competition — our own GigaOM WordPress Challenge — using a brute force method of data science he calls overkill analytics. Rather than spend untold hours perfecting complex models, Carter used simple algorithms and let powerful microprocessors do the rest. Read more »
I spent two days last week watching experts on big data and data science discuss how their companies are building businesses around data, or at least rethinking how they do business. Although most came from the web, these five ideas should matter across industries. Read more »
The big data world is full of small, scrappy startups using their ingenuity to build complex systems out of open source software, but the Walt Disney Company is not one of them. Here’s what goes into building a big data platform in a Fortune 100 company. Read more »
Observers of database technology should look closely at the non-relational database market to see where the most interesting growth lies in the world of applied information storage and retrieval. Read more at GigaOM Pro »
Etsy shared the details of its hardware architecture on Friday, showing the world a whole lot of Supermicro servers running everything from web servers to Hadoop. At this point, software is the name of the game at webscale, so hardware openness is just welcome community service. Read more »
How much of your data is Facebook collecting every day? Some new stats from the company reveal just how large its user base is, and what big data means to a company with 950 million users. Read more »
One big problem with big data is that most analytical queries are slow and non-interactive. That’s why MapR and the Apache Foundation are backing Drill, an open source version of Google’s Dremel, as a tool to address that problem. Read more »
Etsy, Airbnb and the Climate Corporation are all using a combination of Cascading and Amazon Elastic MapReduce to make creating Hadoop jobs as simple as possible. But they’re not the only options for doing so — simplifying Hadoop usage is big business in the IT world. Read more »
Although it’s still a work in progress, 0xdata thinks it has the answer to the problem of doing advanced statistical analysis at scale: Build on HDFS for scale, use the widely known R programming language and hide it all under a simple interface. Read more »
While computing in the cloud can cost less than running servers in your enterprise data center, the question of how much less isn’t an easy one to answer. The cloud will get cheaper in the future, but not before these challenges are addressed and overcome. Read more at GigaOM Pro »
Nimbula and MapR say that combining the former company’s scalable private cloud infrastructure with the latter’s Hadoop distribution will enable companies to run and manage big data applications much more easily. The idea is that a cloud infrastructure will make Hadoop much more flexible and available. Read more »
Infochimps has released version 1.1 of its platform, which the company has described as Heroku for Hadoop. The new version takes things a step further, though, turning the platform into an engine for easily creating streaming workflows that don’t require using Hadoop at all. Read more »
Cloud computing and open source software have freed IT practitioners from so much legacy vendor baggage over the past few years. Isn’t it time to free them from inane benchmark boasting, too? A crowdsourced platform where users share their real-world performance experiences could help. Read more »
Drawn to Scale’s Spire database is meant to be all things to all people — it combines Hadoop, HBase and SQL to provide a fast, scalable, robust experience — and now it has integrated with MapR’s Hadoop distribution. It’s no surprise the young company already claims big customers. Read more »
Organizations are coping with the challenge of processing unprecedented volumes of data. However, the processes involved with using a large cluster to run applications like Hadoop are error-prone. So IT managers are turning to cluster-management solutions to automate tasks associated with cluster creation, management and maintenance. Read more at GigaOM Pro »
Nodeable is now offering a cloud service for processing and analyzing streams of data in real time. Its new flagship service, called StreamReduce, is built atop Twitter’s open source Storm framework and acts as Hadoop’s faster, nimbler front-end partner that delivers users insights as they happen. Read more »
In cloud and big data, the second quarter of 2012 featured several high-profile deals and product launches that could reshape the marketplace for everyone. Google and Microsoft launched Infrastructure-as-a-Service offerings, software-defined networking took off, and all eyes stayed fixed on the continuing promise of data analytics. Read more at GigaOM Pro »
Slowly but surely, health care is becoming a killer app for big data. Whether it’s Hadoop, machine learning or natural-language processing, folks in the worlds of medicine and hospital administration understand that data is the key to helping them take their fields to the next level. Read more »
Big data has become the latest front for the patent troll epidemic as a shell company is suing firms for using a common software framework known as the Hadoop Distributed File System (HDFS). Read more »
Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it’s not without some flaws. Ironically, one of Hadoop’s biggest shortcomings right now is also one of its biggest strengths going forward: the Hadoop Distributed File System. Read more »
The market for business analytics software grew 14 percent in 2011 and will hit $50.7 billion in revenue by 2016, according to IDC. The category will grow at a 9.8-percent-a-year rate until then, driven in part by the hype around big data. Read more »