The glut of research in teaching computers to analyze and understand images could prove very helpful in letting us take full advantage of the countless hours of video we’ll produce as wearable cameras go mainstream. Read more »
An MIT professor has conducted some handy research that could help make applications run faster and use less energy by overcoming an inherent drawback of multicore processors. The problem is that although the local caches on chips save them the latency of having to access RAM, the hard-wired algorithms powering them often assign data to cache locations randomly, without considering which core is trying to access it. The new software-based technique, called Jigsaw, tracks which cores are accessing which data (and how much) and places that data accordingly. The paper detailing Jigsaw is available here.
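To make the placement idea concrete, here is a toy sketch in Python. It is not the Jigsaw algorithm from the paper, just an illustration of the general principle under simplifying assumptions: every access is logged as a (core, block) pair, each core has one nearest cache bank, and cache capacity is ignored. The names `place_data`, `access_log` and `bank_of_core` are made up for this example.

```python
from collections import Counter, defaultdict


def place_data(access_log, bank_of_core):
    """Assign each data block to the cache bank nearest its heaviest user.

    access_log: iterable of (core_id, block_id) pairs, one per access.
    bank_of_core: dict mapping core_id -> the cache bank closest to that core.
    Both structures are hypothetical; real hardware exposes nothing this tidy.
    """
    # Count how often each core touches each block.
    counts = defaultdict(Counter)
    for core, block in access_log:
        counts[block][core] += 1

    placement = {}
    for block, per_core in counts.items():
        # Pick the core with the most accesses; break ties by lowest core id.
        top_core = min(per_core, key=lambda c: (-per_core[c], c))
        placement[block] = bank_of_core[top_core]
    return placement


if __name__ == "__main__":
    log = [(0, "a"), (0, "a"), (1, "a"), (1, "b")]
    banks = {0: "bank0", 1: "bank1"}
    print(place_data(log, banks))
```

Block "a" is mostly used by core 0, so it lands in that core's bank, while block "b" follows core 1. A real scheme would also have to weigh how much cache space each block deserves, which this sketch skips.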
New research out of Carnegie Mellon University shows that analyzing fans’ tweets can help gamblers make better bets on NFL games. Sometimes. Their technique wasn’t very effective at picking winners or betting the over/under, but it was 55 percent accurate on bets against the spread (and then only during the middle of the season). I doubt anyone will undertake this effort themselves for such a slight edge, but there might be a business here if someone can figure out a consistently accurate model.
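For a rough sense of how slight that edge is, here is some back-of-the-envelope arithmetic in Python. The odds are my assumption, not something from the research: I'm using the common convention of risking $110 to win $100 on a spread bet, and the function names are just for illustration.

```python
def break_even_rate(risk, win):
    """Win rate at which a bettor neither gains nor loses money."""
    return risk / (risk + win)


def expected_profit(win_rate, risk, win):
    """Average profit per bet at the given win rate."""
    return win_rate * win - (1 - win_rate) * risk


if __name__ == "__main__":
    # Assumed odds: risk $110 to win $100 (not stated in the research).
    risk, win = 110, 100
    print(f"break-even win rate: {break_even_rate(risk, win):.4f}")
    print(f"expected profit per bet at 55%: {expected_profit(0.55, risk, win):.2f}")
```

Under those assumed odds the break-even win rate is a bit above 52 percent, so a 55 percent hit rate leaves only a few points of margin.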
Couchbase is officially opening up two new technologies to mobile developers as part of a public beta program. Couchbase Lite is a lightweight database designed specifically for iOS and Android devices, while Cloud Sync Gateway syncs local data with a bigger database in the cloud. Read more »
Aquamatix, a Structure: Europe LaunchPad company based in London, is trying to improve the world’s water networks with lots and lots of sensors. Fixing outdated infrastructure is expensive, but real-time data from deep inside can help target specific problems. Read more »
Yup. Makes me wonder if the tech companies that have been lobbying for Patriot Act reform over the past few years were doing so in part to get out from under the NSA’s thumb. Policy discussions were always couched in geopolitical language, but they must have foreseen the backlash even from U.S. customers if word ever got out about what was up.
Dallas-based enterprise-search company PureDiscovery has closed a $10 million series C funding round that should help it bring its BrainSpace platform to the masses. The idea is to build knowledge about the content of documents rather than just an index of what’s where. Read more »
A San Mateo, Calif.-based startup called Space-Time Insight has raised a $20 million series C investment round led by London-based firm Zouk Capital. Space-Time provides a platform for analyzing and visualizing streaming data, and is gaining traction in the utility sector. We profiled the company in 2011, specifically its work with California ISO to put real-time energy data on an 80-foot screen in the agency’s control room. Space-Time closed a $14 million series B investment round last September.
Narrative Science, a startup that turns complex text documents into reports or articles that are supposed to resemble something written by a human being, has raised an $11.5 million series C funding round. News organizations have already used the company’s software to turn sports stats or corporate earnings statements into articles, but it has potential anywhere someone is trying to analyze loads of text documents. CIA-backed venture capital firm In-Q-Tel invested in Narrative Science in June.
Box Founder and CEO Aaron Levie has a lot to say about just about everything in the world of IT. Here’s a sampling of his thoughts about cloud computing, mobile software, the Microsoft-Nokia deal and finding time for Twitter. Read more »
It’s not just U.S. companies such as Pinterest, Netflix and every SaaS startup under the sun that are running on cloud infrastructure. There are a lot of major European companies and organizations using cloud computing, too. Many of them will be at Structure: Europe. Read more »
Hortonworks is making progress on its mission (via a project called Stinger) to speed up SQL-like queries in Hadoop using Apache Hive. New features in the latest version of Hortonworks’ Hadoop distribution have improved Hive performance tens of times in some instances, and the company is aiming for 100x improvements soon. Hortonworks has also added support for new types of SQL data. Competitor Cloudera opted to forgo Hive in favor of its own Impala technology for interactive queries.
eBay has acquired Seattle-based price-prediction startup Decide.com, and the service will shut down on Sept. 30. The entire team will head over to eBay to help the e-commerce giant improve its experience through predictive modeling. The entire team except Co-founder and CTO Oren Etzioni, that is: the University of Washington computer science professor, Madrona Venture Group partner and former Farecast founder is heading up Paul Allen’s new Allen Institute for Artificial Intelligence.
Google has added another new capability to its BigQuery analytics service. This one lets users derive correlation values between similar data points, something Google highlighted using sensor data from its recent I/O conference. Read more »
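For anyone unsure what a correlation value is, here is a minimal Python sketch of the Pearson correlation coefficient, one common measure. That Pearson is the measure BigQuery uses here is my assumption, and the function name and sample readings are made up rather than taken from Google's sensor data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical sensor readings, invented for illustration.
    temperature = [20.0, 21.5, 23.0, 24.5]
    humidity = [60.0, 57.0, 54.0, 51.0]
    print(pearson(temperature, humidity))
```

A value near 1 means the two series rise together, near -1 means one falls as the other rises, and near 0 means no linear relationship.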
Location-data startup Placed has been tracking the businesses that consumers visit for about a year, and now it’s tying that data to their TV habits and interests. Where do “The Biggest Loser” viewers hang out? Bakeries. Read more »
Big data startup HStreaming is now part of Swiss advertising firm Adello Group. HStreaming had standout technology by all accounts, but the business never scaled enough to survive in a tough market. Read more »
There’s some business context here around GoDaddy’s new focus on building products to help small businesses become relevant in a digital world, but there’s also video of Jean-Claude Van Damme playing the pan pipes. Read more »
This post from Slate is spot on, in my humble opinion. It might be overkill, but I can say the same about my own posting habits, and did last year. (I can’t say the same about my wife, though …) There are plenty of reasons to not want a digital profile you didn’t ask for, and advances in behavioral analysis and facial recognition are only making them worse.
Marathon is a new framework that turns Mesos — a favorite of Twitter — into a more dynamic tool for running different applications on a single set of machines. Marathon comes from a startup called Mesosphere, founded by two former Airbnb engineers who know Mesos cold. Read more »
SwiftKey, a London-based startup that sells a popular “smart” keyboard for Android devices, has closed a $17.5 million series B led by Index Ventures. The company plans to spend the money on research to “fuel further innovation in the fields of Natural Language Processing and Machine Learning,” among other things, according to a press release. That’s probably not a bad idea given Google’s vested interest in keyboard dominance and its focus on cutting-edge text analysis.
Twitter has open sourced a “streaming MapReduce” system called Summingbird that makes Hadoop and Storm play nicer together so applications that require both batch and stream processing can do their jobs with as little complexity as possible. Read more »
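One way to picture why batch and stream processing can share logic is a job whose partial results combine by simple addition. The Python sketch below is not Summingbird's API (Summingbird is a Scala library); it assumes a word-count job and uses made-up function names to show the same counting code running over a "batch" and a "stream," with the two results merged afterward.

```python
from collections import Counter


def word_counts(lines):
    """Count words in an iterable of text lines (the shared job logic)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge(batch_counts, realtime_counts):
    """Combine two partial results; addition is associative, so order is irrelevant."""
    return batch_counts + realtime_counts


if __name__ == "__main__":
    historical = ["hadoop and storm", "storm"]   # stands in for the batch layer
    recent = ["storm and hadoop"]                # stands in for the stream layer
    print(merge(word_counts(historical), word_counts(recent)))
```

Because the merge is plain addition, it doesn't matter which layer produced which partial count, and that is what lets one definition serve both.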
Former VMware CTO Steve Herrod joined General Catalyst Partners in January, and his first investment as a venture capitalist is a big one — $25 million in cloud backup service Datto. DealBook has a good writeup of Datto’s story, but the other angle is what the deal says about Herrod’s investment strategy and about GCP’s push into enterprise software.
Ex-NASA CIO and CTO, and current Nebula co-founder and CEO, Chris Kemp came on the Structure Show podcast this week to talk about everything from upgrading NASA’s infrastructure to commercializing the OpenStack software he helped create while there. Here are some highlights. Read more »
Facebook is hosting a Kaggle competition in order to identify candidates for a data scientist position. Résumés are so passé when you can just have applicants prove their skills first. Read more »
A provocative — and thoroughly researched — post from IEEE Spectrum about the shortage of workers with science, technology, engineering and math skills. I’m not skeptical enough to think it’s all manufactured concern so employers can keep salaries low, but I’ve read enough about the push for more immigrant visas for tech workers to know there’s something there.
Researchers have released a tool that lets anyone track the whereabouts of Twitter and Instagram users who allow geotagging of their posts. They want social media users to be aware that geotagging exists and what kind of information it provides. Read more »
I’d argue this is a prime example of when metadata is used correctly. If the other nearly 150,000 phone numbers were never investigated and the records were deleted once the feds found their guys, any invasion of privacy is only theoretical. There’s a big difference between this and GPS-tracking, or what the NSA is doing.
A London-based startup called import.io has built a service that lets users take information from websites and turn it into structured data that can populate a spreadsheet or feed an application via API. And it doesn’t require any coding. Read more »
File-sharing service Hotfile was found guilty of copyright infringement in a U.S. federal court case decided on Wednesday. But just because Hotfile appears guilty doesn’t mean cyberlockers are inherently evil, regardless of what the MPAA says. Read more »
Hortonworks has released a set of icons for illustrating the roles of various Hadoop-ecosystem components in flow charts and other architectural diagrams. Earth-shattering? No. Helpful if you’re stuck trying to build a PowerPoint slide about your big data environment? Probably. Read more »
LinkedIn’s new University Pages are a case study in how to build a big data application. Ideas are great and pretty web design is great, but you also need people who can find and format the data, and the systems in place to make everything work. Read more »
Couchbase, a startup selling a NoSQL database of the same name, has raised a $25 million series D round. Adams Street Partners led the round and was joined by existing investors Accel Partners, Mayfield Fund, North Bridge Venture Partners and Ignition Partners. Couchbase doesn’t have the huge user base of MongoDB or the edginess of HBase, but it does have some big-name users (including Orbitz) and the company claims sales jumped 400 percent in the last year.
How often does the U.S. government request data from U.S. web properties? A lot. Here are eight charts showing data from Facebook, Google, Microsoft and Twitter about how many requests they get from across the globe. Read more »
MongoDB creator 10gen has changed its name to MongoDB, Inc. It’s probably not a bad idea to align the company’s name with its sole product, but it will take a little getting used to. Read more »
Violin Memory has filed for a $173 million initial public offering, although it did so without much of the hype traditionally associated with Violin news. The company is on pace for $100 million in revenue this year, but it’s now part of a crowded flash market. Read more »
Hadoop-based analytics startup Tresata last week open sourced a set of machine learning libraries built on Scalding and designed to run in Hadoop and make use of the Apache Mahout project. Tresata is calling the project Ganita, and has also written a couple of explanatory blog posts about it, including how to do k-means clustering. The barriers to doing good work on big data just keep getting lower.
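For readers who haven't seen k-means before, here is a minimal plain-Python sketch of the standard iterative version (Lloyd's algorithm). It is not Ganita's implementation and doesn't touch Scalding, Hadoop or Mahout. The function name, the sample points and the choice to pass in starting centroids by hand are all assumptions made to keep the example small and deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Cluster points (tuples of numbers) around the given starting centroids.

    Returns (centroids, clusters), where clusters[i] holds the points
    assigned to centroids[i] in the last assignment step that ran.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid put

        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

The two obvious groups in the sample points each end up with a centroid at their midpoint. At real scale the work is in distributing those two steps across machines, which is the part a library like this takes off your hands.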
Publishing analytics startup Parse.ly has raised $5 million and has released its first report showing the top sources of traffic across its customer base. It claims hundreds of them, including big-name ones like Atlantic Media, Reuters and Mashable. Read more »
Based on the data scientists I’ve met and the “how to become a data scientist” talks I’ve seen, it’s hard to disagree. But SQL and coding skills can be really helpful if you need to get stuff done beyond pure statistical analysis.
Amazon Web Services experienced a brief outage on Sunday afternoon. It only lasted about 60 minutes, but appears to have taken down popular sites such as Instagram, Flipboard and Vine for short periods. Read more »
Google cloud platform manager Greg DeMichillie was on our Structure Show podcast this week to defend Google’s position in the cloud computing market. He makes some fair points, but will they be enough to lure in developers and companies en masse? Read more »