Hadapt, a startup that has been pushing SQL on Hadoop since 2011, is rolling out a new technology it calls “schema-less SQL.” Essentially, the SQL portion of Hadapt’s platform will automatically form columns from the keys of JSON and other data types, thus making the associated values queryable like values in a standard relational database. This sort of joint SQL-NoSQL support is likely to become a lot more normal for analytic databases. Curt Monash has a good technical breakdown of the new Hadapt feature.
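To make the idea concrete, here is a minimal sketch of forming columns from JSON keys so the values can be queried with plain SQL. It is not Hadapt's implementation: it uses Python's built-in SQLite module as a stand-in database, handles only top-level keys, and the `events` table and sample records are made up for illustration.

```python
import json
import sqlite3

def infer_columns(records):
    """Collect the union of top-level JSON keys, in first-seen order."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return columns

def to_rows(records, columns):
    """Turn each JSON object into a fixed-width row; missing keys become None (SQL NULL)."""
    return [tuple(record.get(col) for col in columns) for record in records]

# Hypothetical input: two JSON documents that do not share the same keys.
raw = [
    '{"user": "alice", "clicks": 3}',
    '{"user": "bob", "country": "DE"}',
]
records = [json.loads(line) for line in raw]
columns = infer_columns(records)
rows = to_rows(records, columns)

# Load the derived columns into an in-memory table (SQLite is only a stand-in here).
conn = sqlite3.connect(":memory:")
column_defs = ", ".join(f'"{col}"' for col in columns)
conn.execute(f"CREATE TABLE events ({column_defs})")
placeholders = ", ".join("?" for _ in columns)
conn.executemany(f"INSERT INTO events VALUES ({placeholders})", rows)

# The JSON values are now queryable like ordinary relational columns.
result = conn.execute(
    'SELECT "user" FROM events WHERE "country" = ?', ("DE",)
).fetchall()
```

Keys that are missing from a given document simply show up as NULL, which is roughly what lets documents with different shapes live in one table.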
The Comparing Constitutions Project has launched a new web tool called Constitute, which lets users search their way through the world’s constitutions by keyword or theme. Not only is the tool handy for gathering info on international laws, but it’s also indicative of how the web can ease access to valuable data via nice interfaces masking lots of complicated data-prep work. The organization’s website has lots of other constitutional data and visualizations, too.
At Structure: Europe 2013, New Relic Founder Lew Cirne, Kleiner Perkins General Partner Michael Abbott (former Twitter engineering VP) and North Bridge General Partner Jonathan Heiliger (former Facebook engineering VP) spoke about the business opportunities around next-gen analytics. Read more »
Structure: Europe was about many things (cloud computing, privacy, how to build a global business), but it might have been most about scale. The goal of any tech company is to handle untold millions of users and their data, and many speakers are doing just that. Read more »
A Denver-based startup called AlchemyAPI is close to rolling out deep-learning-based image recognition via its API service. The company has made something of a name for itself in the text-analysis world, and it says it can do image recognition as well as Google. Read more »
Rackspace VP of Technology Nigel Beighton shared his thoughts on the most important tools in the cloud at Structure: Europe. If you want to get the most out of the cloud, virtual servers alone won’t cut it. Read more »
Is it better for hosting providers to band together to take on Amazon Web Services or to focus on what each service provider does best? Read more »
Analytics database startup MemSQL has integrated JSON support into its big, fast in-memory SQL database. Bridging both worlds is a compelling idea, although execution isn’t always easy. Read more »
Recommind, a San Francisco-based company that sells machine learning software optimized for e-discovery in the legal industry, has raised $15 million from SAP Ventures. The new money will go toward growing the company’s footprint outside the legal space via enterprise software that lets humans and machines work closely with one another around data analysis — something Recommind CTO Jan Puzicha discussed with me in March at Structure: Data.
There are many ways to win cloud customers away from Amazon Web Services, a panel of European cloud providers said at Structure: Europe, and none of them involve trying to be Amazon Web Services. Read more »
Samza is LinkedIn’s take on Twitter’s Storm engine for stream processing, only built on top of LinkedIn’s own Kafka messaging system. It’s the latest in a growing line of open source efforts from LinkedIn, and another notch in the belt for Hadoop. Read more »
DataSift, one of the two companies (along with Gnip) granted real-time access to the Twitter firehose, now offers real-time and historical analysis of Tumblr data. While it’s best-known for Twitter, DataSift actually analyzes dozens of social media and commenting platforms, which is pretty handy if you want to compare sentiment, engagement or whatever else across platforms where people behave quite differently.
If you’ve ever wanted to see who follows you on Twitter, where they live and what they do, but don’t have a clue how to utilize the Twitter API, it’s your lucky day. Read more »
Veteran entrepreneur, investor and founding Vertica Systems CEO Andy Palmer has some thoughts about the most important trends and promising startups in the data space. Here’s what he had to say during our Structure Show podcast this week. Read more »
The glut of research in teaching computers to analyze and understand images could prove very helpful in letting us take full advantage of the countless hours of video we’ll produce as wearable cameras go mainstream. Read more »
An MIT professor has conducted some handy research that could help make applications run faster and use less energy by overcoming an inherent drawback of multicore processors. The problem is that although the local caches on chips save them the latency of having to access RAM, the hardware-wired algorithms powering them often assign data to cache locations randomly without considering the core trying to access it. The new software-based technique, called Jigsaw, tracks which cores are accessing what data — and how much — and assigns data locale accordingly. The paper detailing Jigsaw is available here.
New research out of Carnegie Mellon University shows that analyzing fans’ tweets can help gamblers make better bets on NFL games. Sometimes. Their technique wasn’t very effective at picking winners or betting the over/under, but it was 55 percent accurate on bets against the spread (and then only during the middle of the season). I doubt anyone will undertake this effort themselves for such a slight edge, but there might be a business here if someone can figure out a consistently accurate model.
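For a sense of how slight that edge is, here is a quick back-of-the-envelope calculation. It assumes the common -110 pricing on spread bets (risk $110 to win $100), which the research summary above does not specify, so treat the numbers as illustrative only.

```python
def break_even_rate(risk, win):
    """Fraction of bets that must win for the bettor to break even."""
    return risk / (risk + win)

def expected_profit_per_bet(hit_rate, risk, win):
    """Average profit per bet: wins pay `win`, losses cost `risk`."""
    return hit_rate * win - (1 - hit_rate) * risk

# Assumed odds: risk $110 to win $100 (not stated in the research summary).
RISK, WIN = 110, 100

needed = break_even_rate(RISK, WIN)               # about 0.524
edge = expected_profit_per_bet(0.55, RISK, WIN)   # about $5.50 per $110 risked

print(f"Break-even hit rate: {needed:.1%}")
print(f"Expected profit at 55% accuracy: ${edge:.2f} per bet")
```

Under those assumed odds, 55 percent accuracy clears the break-even point by less than three percentage points, and only for the part of the season where the model works.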
Couchbase is officially opening up two new technologies to mobile developers as part of a public beta program. Couchbase Lite is a lightweight database designed specifically for iOS and Android devices, while Cloud Sync Gateway syncs local data with a bigger database in the cloud. Read more »
Aquamatix, a Structure: Europe LaunchPad company based in London, is trying to improve the world’s water networks with lots and lots of sensors. Fixing outdated infrastructure is expensive, but real-time data from deep inside can help target specific problems. Read more »
Yup. Makes me wonder if the tech companies that have been lobbying for Patriot Act reform over the past few years were doing so in part to get out from under the NSA’s thumb. Policy discussions were always couched in geopolitical language, but they must have foreseen the backlash even from U.S. customers if word ever got out about what was up.
Dallas-based enterprise-search company PureDiscovery has closed a $10 million series C funding round that should help it bring its BrainSpace platform to the masses. The idea is to build knowledge about the content of documents rather than just an index of what’s where. Read more »
A San Mateo, Calif.-based startup called Space-Time Insight has raised a $20 million series C investment round led by London-based firm Zouk Capital. Space-Time provides a platform for analyzing and visualizing streaming data, and is gaining traction in the utility sector. We profiled the company in 2011, specifically its work with California ISO to put real-time energy data on an 80-foot screen in the agency’s control room. Space-Time closed a $14 million series B investment round last September.
Narrative Science, a startup that turns complex text documents into reports or articles that are supposed to resemble something written by a human being, has raised an $11.5 million series C funding round. News organizations have already used the company’s software to turn sports stats or corporate earnings statements into articles, but it has potential anywhere someone is trying to analyze loads of text documents. CIA-backed venture capital firm In-Q-Tel invested in Narrative Science in June.
Box Founder and CEO Aaron Levie has a lot to say about just about everything in the world of IT. Here’s a sampling of his thoughts about cloud computing, mobile software, the Microsoft-Nokia deal and finding time for Twitter. Read more »
It’s not just U.S. companies such as Pinterest, Netflix and every SaaS startup under the sun that are running on cloud infrastructure. There are a lot of major European companies and organizations using cloud computing, too. Many of them will be at Structure: Europe. Read more »
Hortonworks is making progress on its mission (via a project called Stinger) to speed up SQL-like queries in Hadoop using Apache Hive. New features in the latest version of Hortonworks’ Hadoop distribution have sped up Hive by tens of times in some instances, and the company is aiming for 100x improvements soon. Hortonworks has also added support for new types of SQL data. Competitor Cloudera opted to forgo Hive in favor of its own Impala technology for interactive queries.
eBay has acquired Seattle-based price-prediction startup Decide.com, and the service will shut down on Sept. 30. The entire team will head over to eBay to help the e-commerce giant improve its experience through predictive modeling. The entire team except Co-founder and CTO Oren Etzioni, that is: the University of Washington computer science professor, Madrona Venture Group partner and former Farecast founder is heading up Paul Allen’s new Allen Institute for Artificial Intelligence.
Google has added another new capability to its BigQuery analytics service. This one lets users derive correlation values between similar data points, something Google highlighted using sensor data from its recent I/O conference. Read more »
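For readers who want to see what a correlation value actually measures, here is a small sketch of the Pearson coefficient, the usual measure of linear correlation. This is plain Python, not BigQuery's own code or query syntax, and the sensor readings are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sensor readings: humidity falls steadily as temperature rises.
temperature = [20.0, 21.0, 22.0, 23.0]
humidity = [60.0, 58.0, 56.0, 54.0]

r = pearson(temperature, humidity)  # close to -1: a perfect inverse linear relationship
```

A value near 1 means the two series rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.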
Location-data startup Placed has been tracking the businesses that consumers visit for about a year, and now it’s tying that data to their TV habits and interests. Where do “The Biggest Loser” viewers hang out? Bakeries. Read more »
Big data startup HStreaming is now part of Swiss advertising firm Adello Group. HStreaming had standout technology by all accounts, but the business never scaled enough to survive in a tough market. Read more »
There’s some business context here around GoDaddy’s new focus on building products to help small businesses become relevant in a digital world, but there’s also video of Jean-Claude Van Damme playing the pan pipes. Read more »
This post from Slate is spot on, in my humble opinion. It might be overkill, but I can say the same about my own posting habits, and did last year. (I can’t say the same about my wife, though …) There are plenty of reasons to not want a digital profile you didn’t ask for, and advances in behavioral analysis and facial recognition are only making them worse.
Marathon is a new framework that turns Mesos — a favorite of Twitter — into a more dynamic tool for running different applications on a single set of machines. Marathon comes from a startup called Mesosphere, founded by two former Airbnb engineers who know Mesos cold. Read more »
SwiftKey, a London-based startup that sells a popular “smart” keyboard for Android devices, has closed a $17.5 million series B led by Index Ventures. The company plans to spend the money on research to “fuel further innovation in the fields of Natural Language Processing and Machine Learning,” among other things, according to a press release. That’s probably not a bad idea given Google’s vested interest in keyboard dominance and its focus on cutting-edge text analysis.
Twitter has open sourced a “streaming MapReduce” system called Summingbird that makes Hadoop and Storm play nicer together so applications that require both batch and stream processing can do their jobs with as little complexity as possible. Read more »
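The appeal is easier to see with a toy example. The sketch below is not Summingbird's API (Summingbird is a Scala library). It only illustrates the general idea of writing aggregation logic once, running it over historical data in batch and over new events as they arrive, and merging the two results. All names and data are hypothetical.

```python
from collections import Counter

def word_counts(text):
    """The shared job logic: count the words in one piece of text."""
    return Counter(text.lower().split())

def run_batch(documents):
    """Batch mode: fold the shared logic over a stored data set."""
    total = Counter()
    for doc in documents:
        total += word_counts(doc)
    return total

class StreamingCounter:
    """Streaming mode: apply the same logic to one event at a time."""

    def __init__(self):
        self.state = Counter()

    def on_event(self, text):
        self.state += word_counts(text)

# Hypothetical data: older posts already in storage, newer ones still arriving.
historical = ["storm and hadoop", "hadoop"]
live = ["storm"]

batch_result = run_batch(historical)

stream = StreamingCounter()
for event in live:
    stream.on_event(event)

# Counts add up the same way however the data is split, so the views merge cleanly.
merged = batch_result + stream.state
```

Because addition of counts is associative, the merged result matches what a single batch run over all the data would produce, which is what lets one piece of logic serve both modes.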
Former VMware CTO Steve Herrod joined General Catalyst Partners in January, and his first investment as a venture capitalist is a big one — $25 million in cloud backup service Datto. DealBook has a good writeup of Datto’s story, but the other angle is what the deal says about Herrod’s investment strategy and about GCP’s push into enterprise software.
Ex-NASA CIO and CTO, and current Nebula co-founder and CEO, Chris Kemp came on the Structure Show podcast this week to talk about everything from upgrading NASA’s infrastructure to commercializing the OpenStack software he helped create while there. Here are some highlights. Read more »
Facebook is hosting a Kaggle competition in order to identify candidates for a data scientist position. Résumés are so passé when you can just have applicants prove their skills first. Read more »
A provocative — and thoroughly researched — post from IEEE Spectrum about the shortage of workers with science, technology, engineering and math skills. I’m not skeptical enough to think it’s all manufactured concern so employers can keep salaries low, but I’ve read enough about the push for more immigrant visas for tech workers to know there’s something there.
Researchers have released a tool that lets anyone track the whereabouts of Twitter and Instagram users who allow geotagging of their posts. They want social media users to be aware that geotagging exists and what kind of information it provides. Read more »