More tech Stories
In Brief

If you’ve ever wanted to use the Couchbase NoSQL database but didn’t feel like managing servers, a San Mateo, Calif.-based startup called KuroBase says it has you covered with its new service. Cloud databases are already pretty popular with web developers running MongoDB, Postgres and even CouchDB (kind of, technically), but I believe this is a first for Couchbase. It could be popular, though, especially if developers are keen on Couchbase’s new ability to sync data between mobile devices and a central database.

In Brief

We have been hearing about things like YARN and high availability for a few years (they’ve even been incorporated into some commercial Hadoop distributions), but now they’re finally part of the official Apache Hadoop code base. The release is technically version 2.2.0. “The project’s latest release marks a major milestone more than four years in the making, and has achieved the level of stability and enterprise-readiness to earn the General Availability designation,” according to an Apache Software Foundation press release.

On The Web

I think this is more about Hadoop and other emerging technologies than the analysts quoted here are willing to admit. Why do you think Teradata is pushing its Hadoop story so much lately? There is, for example, crazy excitement around big data and Hadoop in China. Customers with blank slates center their efforts around Hadoop, while big existing customers are trying to offload more to Hadoop. Teradata sales are fairly flat right now even in the U.S. because big existing customers are getting bigger but fewer are signing up.

On The Web

IBM has shared some details about a new project called WatsonPaths that lets doctors actually interact with the system to understand how it came to its conclusions, and to tell it whether its “thinking” was right. This type of interaction is critical in any type of machine learning system where speed isn’t the primary objective, because it lets humans see things they might not have and also train the machines to be more accurate in the future. WatsonPaths is a GUI-based tool and is being developed along with doctors at the Cleveland Clinic.

On The Web

Law professor and blogger Eric Goldman drops some knowledge on the ineffectiveness and, one could argue, innovation-hindering effects of these types of privacy laws. I think regulation is a good idea, but it must be flexible and it should be paired with better public education so consumers can make informed choices. I’d rather websites spend money protecting my data or asking me at the time of collection whether they can use data for ads.

On The Web

Cloud developers and engineers have probably heard about Netflix’s Chaos Monkey before, and now the company has turned the tool on its production Cassandra database clusters, as this post explains. Chaos Monkey isn’t just about spotting weaknesses in cloud architectures; the real goal is figuring out fixes. Netflix has improved its Cassandra clusters with real-time monitoring and automatic replacement of failed nodes.

In Brief

A new research project from Carnegie Mellon University, funded by a $2.6 million grant from the National Science Foundation, aims to make microchips smarter and more efficient by analyzing the data they collect about themselves. The Statistical Learning in Chip project is focused on developing an integrated machine learning engine that can help chips dynamically manage their resource consumption and keep it at optimum levels. This would make the chips, and the devices running on them, more energy-efficient, resulting in longer battery life and cooler operating temperatures.

In Brief

IBM has opened a new research lab in San Jose, Calif., called the Accelerated Discovery Lab. Its purpose is to bring together subject matter experts in key areas — the company cites drug discovery, social analytics and predictive maintenance (aka the industrial internet) — with the data and tools they need to make new discoveries in their fields. For IBM, which has billions in revenue riding on these industries, the more it can prove its worth to them, the better.

In Brief

San Juan Capistrano, Calif.-based startup Cirro is betting that there’s real value in piles of data scattered across corporate data stores, and it has closed an $8 million series A round from Toba Capital, Frost Venture Partners and Miramar Venture Partners to help test its hypothesis. Its platform invokes a SQL-based analytic engine that hits all of a company’s various data stores, including big data stores such as Hadoop and NoSQL databases, while carrying out queries.

In Brief

TransLattice, a Santa Clara, Calif.-based startup selling a geographically distributed relational database system, has acquired Red Bank, N.J.-based cloud-database startup StormDB. Both companies are pushing production-grade, distributed OLTP systems and the Postgres-based StormDB has some of its own IP around MPP analytics and geospatial data. It seems this means StormDB will stop taking new customers but, according to an FAQ on its site, “TransLattice will honor commitments to current StormDB customers.”

On The Web

Remember when “polyglot PaaS” was the new thing? Five years after launching, App Engine now supports PHP, Python, Java and Google’s own Go programming language. Kidding aside, App Engine actually has matured quite a bit, has attracted some relatively big users and is part of an ever-impressive cloud platform at Google.

In Brief

Teradata has upped the capabilities of its Teradata Aster big data platform by adding in a native graph-processing engine called SQL-GR. Not a bad idea considering the increased attention around graph processing lately, as well as the need for an aging Teradata to keep up with (or ahead of) the Joneses in the big data space. And Teradata’s SNAP Framework, which ingests a query and then decides the right processing engines and data stores to invoke, is pretty sweet in theory.

On The Web

Execs are talking about measuring tweet volume and the reach of those tweets, but isn’t the real value in figuring out what people think? It’s not worth touting that 200,000 people tweeted and 4 million people saw those tweets if the overall sentiment is that the show sucks. But given the history of shows such as “Arrested Development,” 20,000 of the right people tweeting about how great something is might be worth noting even if ratings aren’t high.

In Brief

Hadoop startup WibiData has updated Kiji, its open source project that aims to make HBase a better (or easier) database for serving real-time applications. Among the updates in its latest SDK is an improved version of the KijiScoring feature. “Developers can now pass per-request settings to producer functions, greatly expanding the flexibility of real-time predictive model scoring. For example, a user’s current geolocation from mobile application can be factored in when re-computing which offers or recommendations to serve a user,” explains a press release.

In Brief

Guavus, a San Mateo, Calif.-based startup that specializes in analyzing the data coming off carrier networks, has hired former NetApp EVP Manish Goel as CEO. Goel replaces Anukool Lakhina, who founded the company and will stay on board to help drive its technology strategy, among other things. Guavus has raised $87 million in capital and claims some major wireless carriers as customers of its software that helps tie customer data to network activity.

In Brief

Yelp has announced the winners of its inaugural Yelp Dataset Challenge, and the four entries it chose actually seem pretty useful. They run the gamut from a technique to highlight key words so users can read reviews faster to helping businesses predict whether they’ll see an uptick in activity on Yelp. Having read countless reviews giving restaurants low ratings even though the food was good, I think the entry that extracts subtopics (e.g., food, service, ambience) from restaurant reviews might be my favorite.

Machine learning startup BigML now supports text data in its cloud-based prediction service. It has always analyzed numerical fields in complex datasets to determine the relationship between them and any given outcome, and now it will consider the importance of words, too.
