Although techniques such as machine learning are taking off in the e-commerce and retail spaces as a way to display better recommendations or optimize product presentation, the smart money is still on humans getting the final say in what customers see. Read more »
RunKeeper tracked what its users were up to in Sochi during the Olympics and found they ran the equivalent of about 78 marathons. It’s an interesting nugget, but part of a much larger picture about learning how, when and where people exercise. Read more »
Website performance and security startup CloudFlare has acquired an anti-malware startup called StopTheHacker. The deal makes the popular CloudFlare service that much more useful and also gives the company a new business to take advantage of the global infrastructure it’s building out. CEO Matthew Prince recently suggested the company would get into the anti-malware space because it often has spare computing capacity that could be put to work scanning networks rather than sitting idle. Although it plans to integrate the two services more tightly, CloudFlare says it will continue operating and investing in the StopTheHacker service.
A company called Carrier IQ is trying to help mobile carriers serve their customers better by using machine learning algorithms to diagnose problems with customers’ smartphones, such as poor battery performance or call quality. A smart use of the technology would be for carriers to get proactive in helping customers resolve their problems before they get annoyed enough to call customer service or, in an increasingly non-contractual industry, just go elsewhere without letting a carrier know they’re leaving. The holy grail of big data, after all, is the ability to act proactively.
IBM has acquired cloud-based database startup Cloudant. It’s a smart move in terms of getting a foothold in the cloud database space, but it also seemingly forces IBM to embrace cloud providers and technologies outside its current umbrella. Read more »
Structure Data has a great lineup of speakers, including a handful that will be talking about how to take advantage of new types of data. Here is a list of sessions anyone interested in sensors, location or artificial intelligence won’t want to miss. Read more »
A startup called BlueTalon officially launched on Friday with a platform for helping people easily share and collaborate on data stored in commercial databases. Read more »
On Thursday, Facebook announced via a post on its engineering page that it has revamped the Thrift framework it built in 2006 (which has since become an Apache project) and is re-releasing it as open source code via GitHub under the fbthrift moniker. Thrift was created as a tool for helping build distributed applications that need to call different services written in different languages. Although it has been very useful, the post’s author explains, Facebook and other Thrift users ran into performance issues and feature deficiencies that have been resolved with fbthrift.
Apache Mesos is the open source cluster-management software that automates operations at companies such as Twitter and Airbnb. Now, a startup called Mesosphere is building a business around taking it mainstream. Read more »
Expect Labs has unveiled the MindMeld API, a set of artificial intelligence capabilities delivered as a service. Developers can use it to create smart applications that know what types of content and search results to recommend, and when. Read more »
A security startup called Elastica came out of stealth mode on Tuesday, and brought with it $6.3 million in venture capital from the Mayfield Fund. Elastica tries to protect corporate data scattered across the dozens of cloud services companies might be using and, like so many other security startups, is touting its use of data science techniques to accomplish its goal. Elastica does have an impressive pedigree, though, both with the Ph.D.s on its founding team and with advisers including Rayid Ghani (Obama for America, Edgeflip), Tom Reilly (ArcSight, Cloudera), M.C. Srivas (MapR) and Ion Stoica (UC Berkeley, Conviva, Databricks).
After weeks of voting, we’re pleased to announce the winners of the inaugural Structure Data Readers’ Choice awards. The eight winners represent some of the most innovative and promising startups that launched in 2013 and have made better data analysis their mission. Read more »
Analytics startup BeyondCore has raised $9 million for its technology that can analyze complex data sets and automatically highlight the strongest correlations. It’s a promising capability assuming companies are willing to open up analytics across the organization. Read more »
It didn’t take long for the Hadoop market to become a juggernaut, and it won’t take long for it to undergo some significant technological changes. Cloudera co-founder and chief strategy officer Mike Olson came on the Structure Show podcast to break it down. Read more »
Collecting student data digitally isn’t solely something for massive open online courses. Even university professors and their students can benefit from transforming the lecture experience into one designed to go anywhere and collect data all along the way. Read more »
Our Structure Data conference this year is about many things, but one big theme is the emergence of everything as data. Thanks to advances in sensors and machine learning, everything from soil to sounds can provide valuable data. Read more »
There has been a lot of data industry news this week coming out of the Strata conference, and elsewhere. Here are some of the highlights. Read more »
Health care startup Welltok, which has developed a platform to help consumers make wise choices about their health, has raised a $22 million Series C round of venture capital. New Enterprise Associates led the round, but IBM (via its new Watson group) and Qualcomm also pitched in. One of Welltok’s products, CafeConcierge, uses Watson’s cognitive computing capabilities as the basis of its personalized medicine approach. IBM, of course, is betting big on Watson as a source of future revenue and has vowed to invest $100 million in companies willing to integrate Watson into their products.
A robotics startup called Neurala has received a patent (No. 8,648,867) for a GPU-based system designed to run artificial neural network models. The patent covers the physical architecture of the system, which Neurala calls an “accelerator,” as well as aspects of data processing and user experience. It’s not clear whether the patent, which dates back to 2006, will affect other artificial intelligence efforts currently underway. Neurala’s business revolves around providing computer vision and navigation intelligence for robots, but GPUs are the computers of choice for many deep learning projects.
Netflix is the latest company to acknowledge that it’s researching new approaches to artificial intelligence that could help improve its products. Although it hasn’t said where it might apply deep learning models, the company has plenty of image and text data to learn from. Read more »
MapR is continuing along its path to Hadoop glory with new support for the YARN resource manager and a direct integration with the HP Vertica analytic database. In such a competitive space, every little edge matters. Read more »
Red Hat and Hortonworks are integrating a number of technologies to give joint customers a more seamless experience running their Hadoop workloads on private cloud or virtualized infrastructure. In an upstart market worth billions, it helps to have friends like Red Hat. Read more »
Splice Machine, a startup promising a SQL-on-Hadoop database that can handle both transactional and analytic workloads, has closed a $15 million series B round of venture capital from InterWest Partners, along with Mohr Davidow Ventures. Supporting transactional workloads would put Splice Machine in a good position among the glut of companies and projects letting users perform SQL operations on Hadoop, because most are strictly for analytics. The big question for Splice Machine, though, might be whether companies actually want to run transactions on that data or whether they’re willing to stick to a tried-and-true database for that.
Data means a lot to Ford, informing everything from product design to business intelligence. In this interview from the Structure Show podcast, Ford’s top data scientist talks all about how Ford approaches everything from deploying Hadoop to hiring the right people. Read more »
Twitter is offering up access to its entire corpus of tweets to a select group of researchers through a new data grant program. But the program raises a simmering question over whether such valuable data shouldn’t be more open in the first place. Read more »
MemSQL, the database startup from two former Facebook engineers, has already raised a lot of money and roped in some big customers. Now it’s looking to broaden its footprint with a flash-optimized columnar store to complement its in-memory row-based one. Read more »
Say what you will about Satya Nadella, but don’t say he doesn’t understand the value of Microsoft’s technology. In an era where it’s competing against Google and Amazon to become the default digital platform company, that’s more important than ever. Read more »
With a founding team out of Nutanix, Google and Yahoo, ThoughtSpot is pushing a new in-memory analytics and visualization engine delivered via appliance. Appliances can be risky, but when they work, they work. Read more »
Tableau had a huge fourth quarter and year in 2013, nearly doubling its year-over-year revenues for both periods, and putting it on a collision course with its larger competitors in a few years’ time. Read more »
Making sense of big data can be hard enough without spending untold hours having to write code or manually clean datasets that simply won’t work with existing BI tools. Trifacta is trying to automate that process with a new software product it announced on Tuesday. Read more »
A new research paper out of Carnegie Mellon University suggests that Facebook, LinkedIn, Netflix and other membership-based websites will see steady activity in daily active users while others will flounder. How their initial growth happens might play a big role in long-term success. Read more »
Twitter is fast becoming a platform that’s far more valuable for marketers, politicians, traders and journalists than for any given individual user. That’s because if you know how to use it, the breadth of raw data Twitter offers via its firehose can tell a lot of stories. Read more »
Troubled flash storage vendor Violin Memory has a new president and CEO, just months after a lackluster IPO and an ensuing scandal that resulted in the termination of its previous CEO and the departure of multiple executives. The company’s new leader, Kevin DeNuccio, has led infrastructure companies before, mostly in the networking space. He was CEO of a privately held London-based company called Metaswitch Networks, and before that was president and CEO of Redback Networks when Ericsson acquired it in 2006.
Cloud backup provider Backblaze has moved into a new data center in Sacramento capable of storing 500 petabytes, or half an exabyte, of data. It’s not full yet (the company was storing 75 petabytes as of November), but the pace is picking up and it probably will be sooner than some might expect. The crazy part is that Backblaze isn’t even that big a company or that widely used a service. Facebook alone is building enough capacity to house 3 exabytes of data in each of its 3 cold storage facilities. Sometimes, I can’t help but think that we’re just digitally hoarding.
Brian Burke followed up a career in the Navy by starting Advanced NFL Stats, and now his predictive models are powering the New York Times’ 4th Down Bot. Fans already love this kind of analysis, but will coaches ever come around? Read more »
Mailchimp chief data scientist John Foreman came on the Structure Show this week to report on his recent trip to Disneyland. It turns out the Magic Kingdom does indeed use data to deliver a personalized experience — and we’re fine with it because it’s fun(ish). Read more »
D-Wave Systems believes it has a real quantum computer and that future generations of its processor will prove what it’s capable of. What’s more, the company plans to deliver all the benefits of quantum computing via an API. Read more »
Facebook might have launched the Open Compute Project to force server vendors to build higher-efficiency gear, but it’s having a much greater impact than even Facebook anticipated. Read more »
Google spent a little less on infrastructure during the fourth quarter than during the third quarter, but it still spent a lot. Like $2.25 billion a lot. Read more »
Cloud collaboration star Box has filed for an IPO, according to a report in Quartz. It would be a wise move for the company, which is riding mega goodwill — and mega challenges — as it scales its business and its infrastructure. Read more »