According to a complaint filed by Zettaset, Intel’s Hadoop management software is so similar to Zettaset’s flagship Orchestrator product that “trying to run Zettaset on top of Intel is akin to trying to put a key into a lock already occupied by a key.” Read more »
Law professor and blogger Eric Goldman drops some knowledge on the ineffectiveness and, one could argue, innovation-hindering effects of these types of privacy laws. I think regulation is a good idea, but it must be flexible and it should be paired with better public education so consumers can make informed choices. I’d rather websites spend money protecting my data or asking me at the time of collection whether they can use data for ads.
Zettaset has sued Intel over its Intel Manager for Apache Hadoop product, claiming it misappropriates Zettaset’s trade secrets. The complaint was filed last week in California. Read more »
Microsoft says it has seen the light in terms of designing software that business users actually want to use. For its new Q&A feature for visualizing Excel data, for example, the product team spent six weeks thinking about UI before even thinking about technology. Read more »
Booz Allen Hamilton serves clients in areas such as national security, financial services and life sciences, and data science is an increasingly important part of the job. A VP in its Strategic Innovation Group talks about how he approaches hiring data scientists and analyzing clients’ data. Read more »
Google, along with its peers at NASA and D-Wave, has released a short video explaining its new quantum computer and the potential, albeit as-yet unimagined, things it will be able to do. Read more »
Car service startup Uber has produced some graphics showing the effect of the government shutdown on its ridership in the Washington, D.C., area. Certain routes, like between downtown and Capitol Hill, have shown a reduction that the company calls “significant.” Read more »
Cloud developers and engineers have probably heard about Netflix’s Chaos Monkey before, and now the company has turned the tool to its production Cassandra database clusters, as this post explains. Chaos Monkey isn’t just about spotting weaknesses in cloud architectures; the real goal is figuring out fixes. Netflix has improved its Cassandra clusters with real-time monitoring and automatic replacement of failed nodes.
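For readers who want a feel for the kill-then-heal loop described above, here is a toy sketch. It is not Netflix's code or the Chaos Monkey API; the `Cluster` class, its method names and the node naming scheme are all hypothetical, and "replacement" here just means swapping a failed entry for a fresh one in a dictionary.

```python
import random


class Cluster:
    """Hypothetical stand-in for a database cluster; not a real Cassandra client."""

    def __init__(self, size):
        # Map node name -> health flag (True means healthy).
        self.nodes = {f"node-{i}": True for i in range(size)}
        self.next_id = size

    def kill_random(self, rng):
        """Chaos step: mark one randomly chosen healthy node as failed."""
        healthy = sorted(name for name, up in self.nodes.items() if up)
        victim = rng.choice(healthy)
        self.nodes[victim] = False
        return victim

    def heal(self):
        """Monitoring step: replace every failed node with a fresh one."""
        failed = [name for name, up in self.nodes.items() if not up]
        for name in failed:
            del self.nodes[name]
            self.nodes[f"node-{self.next_id}"] = True
            self.next_id += 1
        return failed
```

In a sketch like this, the interesting property to check is that the cluster returns to full size and full health after each injected failure, which is the kind of outcome the automatic replacement is meant to guarantee.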
Two new research partnerships whose participants range from pharmaceutical companies to IT vendors are taking aim at improving disease treatment via data analysis. They’re targeting a handful of diseases specifically — heart disease and cancer among them — but they point toward a data-driven health care future. Read more »
A new research project from Carnegie Mellon University, funded by a $2.6 million grant from the National Science Foundation, aims to make microchips smarter and more efficient by analyzing the data they collect about themselves. The Statistical Learning in Chip project is focused on developing an integrated machine learning engine that can help chips dynamically manage their resource consumption and keep it at optimum levels. This would make the chips, and the devices running on them, more energy-efficient, resulting in longer battery life and cooler operating temperatures.
IBM has opened a new research lab in San Jose, Calif., called the Accelerated Discovery Lab. Its purpose is to bring together subject matter experts in key areas — the company cites drug discovery, social analytics and predictive maintenance (aka the industrial internet) — with the data and tools they need to make new discoveries in their fields. For IBM, which has billions in revenue riding on these industries, the more it can prove its worth to them, the better.
Dropcam has released a new monitoring camera called the Dropcam Pro that’s remarkably high-resolution, but also very smart. A new user experience enables advanced zooming from a smartphone, and cloud-based machine learning algorithms are letting users filter their video feeds. Read more »
IBM has been awarded a patent for moving virtual machines across physical servers in a cloud in order to ensure applications are receiving the bandwidth they need. It’s an interesting solution to a problem that has plagued some cloud users. Read more »
San Juan Capistrano, Calif.-based startup Cirro is betting that there’s real value in piles of data scattered across corporate data stores, and it has closed an $8 million series A round from Toba Capital, Frost Venture Partners and Miramar Venture Partners to help test its hypothesis. Its platform invokes a SQL-based analytic engine that hits all of a company’s various data stores (including big data stores such as Hadoop and NoSQL databases) while carrying out queries.
TransLattice, a Santa Clara, Calif.-based startup selling a geographically distributed relational database system, has acquired Red Bank, N.J.-based cloud-database startup StormDB. Both companies are pushing production-grade, distributed OLTP systems and the Postgres-based StormDB has some of its own IP around MPP analytics and geospatial data. It seems this means StormDB will stop taking new customers but, according to an FAQ on its site, “TransLattice will honor commitments to current StormDB customers.”
Time-series data is proliferating like mad in the era of the internet of things and the industrial internet, and Chicago-based startup TempoDB wants to capture it all. The company has $3.2 million to help it try to pull this off. Read more »
Remember when “polyglot PaaS” was the new thing? Five years after launching, App Engine now supports PHP, Python, Java and Google’s own Go programming language. Kidding aside, App Engine actually has matured quite a bit, has attracted some relatively big users and is part of an ever-impressive cloud platform at Google.
A new open source tool called RAW makes it remarkably easy to visualize any data that you can copy and paste from a table. Football on my mind, I chose to look at the performance of NFL quarterbacks so far this season. Read more »
Teradata has upped the capabilities of its Teradata Aster big data platform by adding in a native graph-processing engine called SQL-GR. Not a bad idea considering the increased attention around graph processing lately, as well as the need for an aging Teradata to keep up with (or ahead of) the Joneses in the big data space. And Teradata’s SNAP Framework, which ingests a query and then decides the right processing engines and data stores to invoke, is pretty sweet in theory.
Rayid Ghani might be best known for leading the Obama for America data science team, but his latest mission is to bring that experience to bear on the nonprofit world through a research director role at the University of Chicago and a startup called Edgeflip. Read more »
Execs are talking about measuring tweet volume and the reach of those tweets, but isn’t the real value in figuring out what people think? It’s not worth touting that 200,000 people tweeted and 4 million people saw those tweets if the overall sentiment is that the show sucks. But given the history of shows such as “Arrested Development,” 20,000 of the right people tweeting about how great something is might be worth noting even if ratings aren’t high.
HP is all about the enterprise cloud and all about OpenStack, although its approach might seem very different for devotees of open source software or Amazon Web Services. Here’s how HP’s Margaret Dawson explains the company’s strategy. Read more »
Cloudera, Hortonworks, MapR and others are battling to lock down market share for commercial Hadoop software, but they’re inherently limited when it comes to innovation. Why not take advantage of the work already done by big Hadoop users like Facebook, Twitter and LinkedIn? Read more »
GE is pushing new technology that uses hundreds of sensors and advanced analytics to make its fleet of gas turbines run more efficiently. GE is a big fan of big data, investing more than $100 million in tech companies this year alone. Read more »
Twitter’s IPO filing is full of nuggets about the company’s revenue and overall business, including our first real look into the company’s data centers. We still don’t know where they are, but we know what they cost. Read more »
A group of researchers from Stanford has been working on deep learning models that can make sense of whole sentences at a time, and has recently trained its models on a large collection of online movie reviews. Read more »
Hadoop startup WibiData has updated Kiji, its open source project that aims to make HBase a better (or easier) database for serving real-time applications. Among the updates in its latest SDK is an improved version of the KijiScoring feature. “Developers can now pass per-request settings to producer functions, greatly expanding the flexibility of real-time predictive model scoring. For example, a user’s current geolocation from mobile application can be factored in when re-computing which offers or recommendations to serve a user,” explains a press release.
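The idea of passing per-request settings into a scoring function can be sketched in a few lines. This is only an illustration of the concept in the press release, not the Kiji SDK (which is Java-based); `score_offers`, `request_settings`, the `location` key and the crude distance formula are all assumptions made for the example.

```python
import math


def distance_km(a, b):
    """Rough planar distance between two (lat, lon) pairs; fine for a toy example."""
    # Assumes about 111 km per degree and ignores longitude scaling.
    return 111.0 * math.hypot(a[0] - b[0], a[1] - b[1])


def score_offers(offers, request_settings=None):
    """Rank offer names; if the request carries a 'location', nearer offers rank higher."""
    settings = request_settings or {}
    loc = settings.get("location")

    def score(offer):
        base = offer["base_score"]
        if loc is None:
            return base  # No per-request context: fall back to the stored score.
        return base / (1.0 + distance_km(loc, offer["location"]))

    return [o["name"] for o in sorted(offers, key=score, reverse=True)]
```

Called without settings, the function ranks purely on the precomputed score; called with a current location, the same offers can come back in a different order, which is the flexibility the quote is describing.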
IBM is teaming with MIT, Carnegie Mellon University, New York University and the Rensselaer Polytechnic Institute to advance the state of the art in building smarter computer systems. Their research ranges from automatically classifying text and images to human-computer interaction. Read more »
Guavus, a San Mateo, Calif.-based startup that specializes in analyzing the data coming off carrier networks, has hired former NetApp EVP Manish Goel as CEO. Goel replaces Anukool Lakhina, who founded the company and will stay on board to help drive its technology strategy, among other things. Guavus has raised $87 million in capital and claims some major wireless carriers as customers of its software that helps tie customer data to network activity.
Yelp has announced the winners of its inaugural Yelp Dataset Challenge, and the four entries it chose actually seem pretty useful. They run the gamut from a technique to highlight key words so users can read reviews faster to helping businesses predict whether they’ll see an uptick in activity on Yelp. Having read countless reviews giving restaurants low ratings even though the food was good, I think the entry that extracts subtopics (e.g., food, service, ambience) from restaurant reviews might be my favorite.
Paypal is a finalist in the Netflix OSS Cloud Prize contest for a project called Aurora, which is Netflix’s Asgard cloud-management system rebuilt for OpenStack. Netflix is famously a big Amazon cloud user, so seeing its technology retooled for OpenStack is an interesting turn. Read more »
Machine learning startup BigML now supports text data in its cloud-based prediction service. It has always analyzed numerical fields in complex datasets to determine the relationship between them and any given outcome, and now it will consider the importance of words, too. Read more »
A scientist writing for Politico has equated government data mining with atomic bombs and is calling for disarmament. But if citizens are going to have a voice in this debate, we probably need to solve web privacy first. Read more »
IBM is going to acquire a Dublin, Ireland-based company called The Now Factory, which specializes in providing customer and network analytics for wireless carriers. The idea is that better, faster data about their networks can help carriers optimize performance and better serve (or target) customers based on their usage behavior. The Now Factory seems similar in vision to the San Mateo, Calif.-based Guavus, and it seems logical the two will cross paths more often thanks to IBM’s global reach.
Splunk is furthering its evolution beyond IT search with a new set of features that make it easier for business users to create, analyze and visualize machine-generated data sets. With lots of competition popping up every day, Splunk can’t rest on its laurels. Read more »
Fantasy football is a big business that thrives on data, making it a great way to prove out a new technology and possibly earn a few bucks. A startup called SkyPhrase, for example, is putting its natural-language processing technology to use on NFL statistics. Read more »
Startup Dataguise has closed a $13 million series B investment round “led by Toba Capital with additional capital coming from the investment arm of a leading electronic conglomerate,” according to a press release. Dataguise’s biggest selling point might be its product designed to secure data within Hadoop. Aside from standard authentication, Fremont, Calif.-based Dataguise actually uses big data techniques to analyze data, determine what’s sensitive and then mask or encrypt it.
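The detect-then-mask flow is easy to picture with a small sketch. To be clear, this is not how Dataguise works under the hood, which the press release does not describe; the two regular expressions and the function names below are hypothetical, and real sensitive-data discovery would need far richer detection than pattern matching.

```python
import re

# Hypothetical patterns chosen for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_sensitive(text):
    """Return the sorted labels of every pattern that appears in the text."""
    return sorted(label for label, pat in PATTERNS.items() if pat.search(text))


def mask(text):
    """Replace each match with a placeholder naming the type of data removed."""
    for label, pat in PATTERNS.items():
        text = pat.sub(f"[{label.upper()} MASKED]", text)
    return text
```

Masking keeps the record usable for analysis while hiding the value itself; encrypting instead would be the choice when authorized users still need to recover the original.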
Cloudera will be integrating with the Apache Accumulo database and, according to a press release, “devoting significant internal engineering resources to speed Accumulo’s development.” The National Security Agency created Accumulo and built in fine-grained authentication to ensure only authorized individuals could see any given piece of data. Cloudera’s support could be bittersweet for Sqrrl, an Accumulo startup composed of former NSA engineers and intelligence experts, which should benefit from a bigger ecosystem but whose sales might suffer if Accumulo makes its way into Cloudera’s Hadoop distribution.
EqualLogic and now DataGravity Co-founder Paula Long is very smart about storage technology. Right now, she’s looking at things like flash and cloud storage with a skeptical eye. They’re valuable and will become more valuable, she says, but only when they’re done right. Read more »
Shares of Violin Memory stock closed their first day of public availability at $7.02, down 22 percent from the morning’s initial asking price of $9 per share. However, CEO Don Basile is confident the market will come around in time. Read more »