Syapse, a startup trying to build something akin to Google’s Knowledge Graph for medical data, has raised a $10 million series B round of venture capital from Safeguard Scientifics and existing investor Social+Capital Partnership. We covered Syapse when it launched in January 2013, promising to help doctors make sense of the myriad data sources and data points associated with medical tests, from how a sample was extracted to the method used for analyzing it.
Prism Skylabs, a video analytics startup targeting retailers, is hoping that new WiFi-randomization features in iOS 8 will help cull the competitive market and improve consumer privacy. It’s a hope many consumers probably share, but it could also be wishful thinking. Read more »
A handful of technology companies big and small have vowed to support and contribute to Kubernetes, Google’s open source technology for managing Docker containers. That’s a big boon for portability in cloud computing, and a good way for Google to show off its infrastructure edge. Read more »
Two news stories from Wednesday (one about a startup trying to play data broker between user and website, and another about a study into what people would charge for their personal data) offer more evidence that there’s an appetite for a market where consumers sell their data to advertisers and websites. The idea isn’t new (we wrote about its traction back in 2012) and actually has merit because it puts money in consumers’ pockets and higher-quality data in advertisers’ databases. But monetizing the idea might be easier said than done: Enliken, one of the startups we covered in that 2012 piece, appears to have closed its doors.
The Nieman Journalism Lab published a thoughtful critique of data journalism on Wednesday, but there are additional things the emerging space could do to live up to its hype, including getting more creative about where writers source their data. Read more »
Expect Labs has released a new API service for automating voice search across various types of digital content. The big opportunity seems to be in movies and TV shows, where intelligent voice search theoretically yields better results and fewer lifted fingers. Read more »
Uber is claiming that late-night train service in Boston has resulted in decreased Uber rides after bars close on the weekends. And, the company claims, it’s happy about it because its customers are happier. Read more »
TempoDB, a Chicago-based startup that began as a database company for time-series data, has changed its name to TempoIQ and is now trying to become the go-to analytics platform for sensor data. Read more »
Tableau Software was born of academic research, and as the company grows it’s building an R&D division to help create a pipeline of innovation. Jock Mackinlay, a Tableau VP who heads up the team, explains how it works and what it’s working on. Read more »
Ben Uretsky, CEO of cloud computing startup DigitalOcean, came on the Structure Show this week to talk about how his company lures customers in the shadow of Amazon Web Services and Google. Read more »
There was a lot of news about Spark’s ascension in the big data ranks this week, as well as some speculation. According to Cloudera’s Mike Olson, his company is widely embracing Spark — including to run Hive — but not in place of Impala. Read more »
Twitter has released an analysis of activity on the social network during the overtime shootout period in last week’s World Cup match between Brazil and Chile. The pattern, which Twitter claims has repeated itself through every overtime shootout, is pretty interesting: people tweet like crazy leading up to the kick, watch intently (and with hands off keyboards) as the player gets ready and finally kicks, and then tweet like crazy again after the kick scores or misses. Seeing this phenomenon visualized is a small window into the relationships between our eyes, fingers, televisions and computer screens during big events.
Yahoo has released a massive dataset for researchers to experiment on. The dataset includes URLs for nearly 100 million images and 700,000 videos, as well as their metadata. Soon, a larger supercomputer-processed dataset that includes audio and visual features will be available. Read more »
Scientists have researched the effectiveness of deep learning techniques for discovering exotic particles and found some significant improvements over previous methods. They believe deep learning could help analyze data from the Large Hadron Collider. Read more »
Facebook’s study of how content manipulation can affect users’ moods has stirred up an ethical hornet’s nest, but there’s a bigger question beyond whether the study should have happened. Now that we know it’s possible, what’s to stop more ambitious attempts to manipulate consumers? Read more »
Startup Keen IO has a plan to become the premier platform for developers that want to analyze their data — a plan that doesn’t include being absorbed into the fold of a larger, less-innovative company. Now, it has $11.3 million from Sequoia Capital to help its cause. Read more »
Big data startup Databricks keeps humming along, announcing on Monday a large round of venture capital and a new cloud service that aims to seed adoption of Spark — a framework it says is faster, easier and more versatile than other options. Read more »
WANdisco, a company specializing in keeping Hadoop and HBase environments running in the case of system failures, has acquired a startup called OhmData that claims to have built a better version of HBase. Read more »
MapR has raised $110 million, $80 million of which is equity financing, in order to fuel its growing Hadoop business in the face of better-known rivals Cloudera and Hortonworks. Like those companies, MapR says it has the winning strategy and aims to be a public company. Read more »
Apache Spark might push MapReduce to the back burner faster than some people might like, but it will also boost the overall Hadoop ecosystem. The project’s co-creator Matei Zaharia explains why Spark is so popular now and where it fits into the big data ecosystem. Read more »
There’s a lot of research going on right now about how to teach robots to learn new things, and it all points to the same general conclusion: Without lots and lots of data to train on, an intelligent robot isn’t very smart at all. Read more »
The Aereo holding itself was questionable, but the broader opinion opened the door to some even bigger questions about the legality of DVRs. Could the spate of copyright lawsuits cease if networks and startups agreed on a new type of currency in data? Read more »
Researchers at Carnegie Mellon have created a method called LiveLight that they claim can watch generally uneventful videos and pick out the parts that viewers probably want, or need, to see. Read more »
Google rolled out a slew of new cloud services at I/O, including one called Dataflow that’s meant to put standard MapReduce to shame. It’s advertised as a much simpler way to build data pipelines that can handle both batch processing and streaming data. Read more »
Analytics startup DataHero hopes it can build “a BI layer over SaaS” and is opening up its cloud service in order to prove it. Users of the free version can now combine data from popular SaaS apps without a trial-period limitation. Read more »
NASA is launching a new challenge, hosted on Amazon Web Services, that gives the public access to a trove of earth sciences data and computational resources in the name of discovering new uses for all that information. Read more »
Dell, Cloudera and Intel are working together on an appliance designed to speed the performance of Hadoop environments by moving a lot more data into a shared memory space. Key to the performance improvement is Apache Spark, the in-memory data-processing framework that’s now included in Cloudera’s Hadoop distribution. At this point, it seems like Hadoop vendors are going to sell their wares regardless of where they run, so a deal like this really helps Dell make the case that hardware matters in big data environments. The companies claim it’s the first in a family of “Dell Engineered Systems for Cloudera Enterprise.”
Aerospike, a NoSQL startup that has garnered a fair number of advertising industry customers thanks to its in-memory technology, has raised $20 million in a series C round and is open sourcing its database under the same license used by MongoDB. Read more »
Intel on Monday announced a new brawny HPC processor and a family of network fabric components that incorporates silicon photonics technology. The new chip, the next generation of the Xeon Phi family, comes with up to 16 gigabytes of high-performance memory and more than 60 computing cores. The fabric lineup, called Omni Scale, will include PCIe adapters, switches and software, as well as director switches that replace electrical transceivers with silicon photonics for improved speed and fewer cables. Intel is promising better performance and greater efficiency with the new tech, something it might need considering the introduction of ARM-plus-GPU-based chips into the HPC market that Intel presently dominates.
A Structure conference panel discussing the state of open source cloud computing agreed that open source clouds need to get easier to use, but not on much else. Read more »
HP’s Bill Veghte told the audience at Gigaom’s Structure conference that enterprise cloud adoption is in the low single digits, and that HP thinks its big investment in OpenStack will help it capture those apps. Read more »
Mode is trying to do for data scientists and analysts what GitHub did for developers by giving them a place where they can find, collaborate on and work with data. Formation8 led the new round, which also included Reddit’s Alexis Ohanian. Read more »
Dropbox is acquiring a data visualization startup called Parastructure, according to TechCrunch. It’s one of those deals where nobody is talking yet, but what little info is publicly available about Parastructure helps shed some light on Dropbox’s motivation. It’s hard to imagine Dropbox getting into the analytics software business that Parastructure was targeting, but it’s not hard to imagine Dropbox acquiring some talent that can help it scale onto, and query, new data technologies. And, like Box did with its dLoop acquisition, Dropbox might also be looking to improve search for its business users.
Hadoop is a complex technology, so it helps to have friends in high places when you’re trying to develop it and integrate webscale tooling into enterprise environments. For Hortonworks, that friend is Yahoo, with which it continues a deep engineering partnership. Read more »
Microsoft, which recently showed off its machine learning research with Skype Translate, is opening up those capabilities with a new cloud service called Azure Machine Learning. Read more »
Apple, Cisco and AT&T have joined Verizon and the Electronic Frontier Foundation in supporting Microsoft’s attempt to quash a U.S. search warrant seeking email data about an Irish customer stored on Irish servers. Read more »
On this week’s Structure Show podcast, Barb Darrow, Stacey Higginbotham and I talk about which speakers and topics have us most excited about this week’s Structure conference. Needless to say, there’s a lot to like about it. Read more »
In a blog post this week, Netflix explained how it’s using data analysis to do more than recommend movies. From optimizing bitrate to churning through user feedback, advanced algorithms are helping ensure that minimal issues affect the streaming experience. Read more »
Business intelligence startup SiSense has raised a $30 million third round of venture capital from DFJ Growth, as well as existing investors Battery Ventures, Genesis Partners and Opus Capital. The company has now raised $44 million since it launched in 2010. Like most analytics startups, SiSense promises nice visualizations and a user-friendly experience, but its major bragging point is fast data processing thanks to an architecture that takes full advantage of the processor’s cache rather than just DRAM or disk. The company appears to be growing impressively, too, claiming triple-digit customer growth and some big-name accounts.
Google open sourced a Docker-centric tool called Kubernetes that lets its cloud computing customers automate their resource management similar to how Google does it internally. It’s part of a sustained approach to prove Google’s chops as a cloud provider by pushing its vision of computing. Read more »