During the Open Compute Summit in San Jose, Facebook VP of Engineering Jay Parikh shared some big statistics for the company’s cold storage efforts, including those for a prototype Blu-ray system capable of storing a petabyte of data today. Read more »
Altiscale, the Hadoop-as-a-service startup co-founded by former Yahoo CTO Raymie Stata that launched in June, is now offering its Data Cloud platform to the public. It’s a cloud service in the same vein as Amazon Elastic MapReduce, although it’s probably more similar to fellow startup Qubole. Altiscale is custom-built to run Hadoop workloads (or Spark, or most anything that can run easily on YARN), is fully managed and automatically scales resources to meet the demands of a job. “There hasn’t been a customer yet that we haven’t been able to improve reliability for,” Stata told me recently, primarily by improving efficiency and eliminating failures.
IO, which is known for its modular data center designs and specialized data center management software, is getting into the cloud provider space with a new service called IO.Cloud. It’s very open at the foundational level, at least, running OpenStack software on Open Compute hardware. Read more »
The open source search tool Elasticsearch has been downloaded more than 6 million times and counts some household names among its customer base. Now, the company behind the software is launching its first commercial product: a management console called Marvel. Read more »
We have chosen eight of our favorite startups from 2013 as winners of the inaugural Gigaom Structure Data Awards, but readers will also have their chance to vote for the Readers’ Choice awards. Read more »
This isn’t your daddy’s, um, GoDaddy. The company is overhauling its technology platform in a mission to be a sort of Robin Hood for the world’s small businesses. It will take technology from the largest companies and bring it to the small ones. Read more »
Udacity founder and CEO Sebastian Thrun, also the famed inventor of self-driving cars and wearable technologies at Google, came on the Structure Show podcast this week to talk about the promise, limitations and future of online education. Here’s what he had to say. Read more »
One of the big themes at our Structure Data conference in March is the advent of new techniques to make sense of new data sources. One of the most promising is video, which has value well beyond capturing crimes and making us laugh on YouTube. Read more »
The White House announced on Thursday that it will form a working group to study big data and report on its implications to privacy, policy and society. That might be easier said than done. Read more »
The National Football League and General Electric announced on Thursday a list of 16 projects that will each receive $300,000 to advance their research in the field of diagnosing and preventing head injuries. Among the selected projects is a collaboration between the University of California, San Francisco, and machine learning startup Ayasdi to analyze CAT scan data to predict which players might have persistent symptoms. Another involves the Purdue Neurotrauma Group and a company called BrainScope that uses machine learning algorithms to power a device that it hopes can detect head injuries on the sidelines. As everything from algorithms to computing power improves, machine learning is actually becoming fairly common in medical research.
A Cambridge, Mass.-based startup called Kensho has raised a $10 million seed round from a group of investors that includes General Catalyst, NEA, Accel Partners and Google Ventures. The company’s product, called Warren (as in Warren Buffett, I presume), is a natural-language search engine for data on financial markets. You (assuming you’re a banker or very sophisticated day trader) type in a question — an example from the company’s website is “Which aerospace companies rally following major breakthroughs in drone technology?” — and it returns results in the form of data. It looks like a smart product from a smart team, especially if the UI and visualizations are as good as the algorithms.
Prize money aside, Statwing’s new contest to find the best insights from a 400-plus-variable, 40,000-row social science dataset should at least be fun — there are a lot of interesting angles to explore — and is a great example of putting tools and data in the people’s hands. Read more »
Database startup MemSQL has been on fire since it launched in mid-2012, and now it has a lot more money to keep up that momentum. The company has closed an oversubscribed series B round worth $35 million. Read more »
A new study of data from massive open online courses offered by Harvard and MIT professors paints a different, and welcome, picture of the state of online education. Completion rates might be low, the authors argue, but that’s a misleading stat. Read more »
Online building materials wholesaler BuildDirect has raised a $27.3 million series B funding round, but it’s not the company’s products or breadth that have investors pumping money into it; it’s how smartly the company uses data. Read more »
Cloud storage provider Backblaze is at it again, this time detailing which models of hard drives last the longest in its open source storage pod arrays. If money were no object, it would probably be Hitachi all the way. Read more »
I analyzed more than 5,000 posts by Gigaom writers in 2013 to identify the words and phrases we use the most. Can you guess what they are? Some of them might surprise you. Read more »
Pivotal’s new SVP of R&D Hugh Williams came on the Structure Show podcast this week to talk about the promise of big data and how he thinks his new employer is poised to deliver on it. But, he notes, there’s still work to do. Read more »
The Wall Street Journal is reporting that Dropbox has raised “about $250 million” at a valuation “close to $10 billion.” Given the crazy valuations of other startups, such as Snapchat, such a high number for Dropbox isn’t too surprising. It has hundreds of millions of users and has the personnel in place to start adding value beyond just syncing and storing data. Of course, it was just down for a couple of days, which is something that can’t happen too frequently if you’re also planning a move into the lucrative business-user market, which Dropbox is.
Google knows a lot about what people listen to because its Play Music service knows what’s in users’ MP3 libraries. A new tool lets people investigate which albums, artists and genres are most popular over time. Read more »
It’s not yet incorporated, but a Las Vegas startup called Skyworks Aerial Systems is trying to make a name for itself in the unmanned aerial vehicle, or drone, space. Read more »
EMC-VMware spinoff Pivotal has hired Hugh Williams as its senior vice president of research and development. Williams was most recently a VP at eBay responsible for the technological platforms that underpin the site’s customer experience. Read more »
Cloudera is touting the speed of its Impala query engine compared to Hive and a leading relational database system, but those aren’t really apples-to-apples comparisons. The real question is how all the SQL-on-Hadoop options stack up against one another. Read more »
Jason Hoffman, Joyent co-founder and former CTO, and current VP at Ericsson, shares his thoughts on all things cloud — from why Amazon Web Services is king in IaaS to why data prices for connected cars had better be reasonable. Read more »
AncestryDNA is getting much better at telling users where their ancestors hail from and who their relatives are, but all this improvement comes at a technological cost. The more data it gathers, the more it pushes its infrastructure and algorithms to the limit. Read more »
Facebook’s open source engine for interactive queries on Hadoop is now available as a cloud service thanks to startup Qubole. Facebook claims Presto is 10 times faster than Hive for most queries. Read more »
IBM has launched a whole new division around Watson, but a slow start in terms of uptake might be cause for concern. Watson’s best chances for success might lie in the cloud, where its capabilities can really be pushed to the limit. Read more »
You can’t talk about data without talking Hadoop. That’s why three CEOs (Rob Bearden of Hortonworks, Tom Reilly of Cloudera and Paul Maritz of Pivotal) will take the stage to talk about where the market is headed and how their companies are helping steer its direction. Read more »
With $15 million from Kleiner Perkins and Jafco Ventures, Zephyr Health thinks it’s well-suited to help pharmaceutical companies and medical device makers navigate tougher financial times by making better use of the data around them. Read more »
API specialist Apigee has acquired a predictive analytics startup called InsightsOne. The companies say the goal is to connect enterprise data and APIs to help predict business outcomes and serve up insights. Read more »
A startup called Lumiata is taking webscale graph analysis like Google and Facebook have perfected and turning it toward personalized health care. As we generate more digital data about research and even personal health, it’s an idea whose time has come. Read more »
This post from the MIT Technology Review discusses how Google used deep learning to recognize house numbers and make Street View more useful (the research paper it cites is here). It’s just the latest example of applied deep learning from Google, which already uses the technique to power speech recognition on Android phones and image recognition in Google+. And, as we’ve been noting for the past few months, other web companies are now getting on board, applying various forms of machine learning to take advantage of the immeasurable volumes of images and text they’ve accumulated over the years.
The streaming music space is heating up thanks to API services that put incredible amounts of music data in the hands of developers who want to build their own streaming services. Can Pandora’s “less is more” approach survive? Read more »
An Oakland-based startup called Omicia has raised a $6.8 million series A round of venture capital, led by Artis Ventures, for a cloud service that lets doctors analyze whole human genomes in order to identify the presence of diseases. The basic service is free (while more-advanced analyses and capabilities cost $99), and the whole process takes less than 3 hours for a whole genome (or less than 1 hour for an exome). Advances in sequencing, algorithms, data storage and cloud computing have been rapidly driving down the cost of genomic analysis over the past few years, leading to an uptick in startup activity in the space.
There has been a spate of acquisitions lately targeting companies and founders with expertise in using machine learning to analyze images and text. Although buyers such as Pinterest and Yahoo are usually pretty quiet about their plans, the writing on the wall is clear. Read more »
City of Palo Alto CIO Jonathan Reichental came on the Structure Show podcast this week and talked about the promise of open government data. However, he cautioned, we’re a long way from where we need to be, an end that will require governments to change, too. Read more »
PayPal is shaping up to be one of the biggest OpenStack users around, using it to manage a private cloud spanning a few thousand servers. Can its work on the platform help fill gaps that currently scare other enterprise users away? Read more »
Two new platforms for storing and analyzing genomic data have raised venture capital recently, with Curoverse announcing $1.5 million in seed funding in mid-December and Tute Genomics announcing $1.5 million in seed funding on Dec. 31. Curoverse is a specialized private-cloud system, while Tute Genomics is a pure cloud service. Both are riding the waves of cheaper gene sequencing, data storage and computing power, assuming these trends will result in a deluge of demand for genomic analysis over the next few years. They’re not alone: We’ve covered numerous startups trying to do the same thing, including DNAnexus, Bina Technologies, Spiral Genetics and Appistry.
Violin Memory Chief Operating Officer Dixon Doll, Jr., resigned just weeks after the resignation of CTO Jonathan Goldrick and termination of CEO Don Basile. The company hopes new leadership can get it back on track. Read more »
The legal profession has undergone a lot of unpleasant changes since the Great Recession struck in 2008. New data-analysis technologies and a new approach to thinking about data could help firms operate leaner, meaner and better. Read more »