Netflix has open sourced a tool called Suro that collects event data from disparate application servers before sending them to other data platforms such as Hadoop and Elasticsearch. It’s more big data innovation that hopefully finds its way into the mainstream. Read more »
Facebook has hired deep learning expert Yann LeCun from New York University to head up its new artificial intelligence lab. It’s part of a bigger push along with — and against — companies like Google and Microsoft to advance research while improving their platforms. Read more »
Parse.ly Co-founder and CTO Andrew Montalenti shares his views on how startups can best keep their costs down and options open by using cloud computing wisely. But it’s a fast-moving market, so they have to keep abreast of what’s happening. Read more »
The New York Times has a new online tool and Twitter feed that analyzes every fourth down in every game and gives its analysis in real time. Fans, commentators and even bosses have yet another means by which to second guess coaches’ decisions. Read more »
If collaboration favorite Box is going to grow into its ambitions, it has to do more than just raise lots of venture capital. It also must build a more intelligent product, which is something dLoop co-founder Divya Jain will now focus her energy on doing. Read more »
Quasi-secret intelligence-software startup Palantir is reportedly in the process of raising more than $100 million at a $9 billion valuation. That says a lot about the value of its technology, which isn’t cloud-based or consumerized, but does what it does very well. Read more »
Data-munging specialist Trifacta has raised another $12 million for its mission to speed the process of going from raw data to usable data. As data volumes and types keep piling up, faster tools will mean a lot less wasted time. Read more »
Following on the heels of Apple buying Topsy, fellow Twitter-specialist DataSift has announced a $42 million round of venture capital financing. Read more »
A new algorithm from University of Toronto researchers can predict the identity of untagged photo subjects by analyzing the relationships of the other people (or things) in the photo. Read more »
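One simple way to get intuition for this kind of relational prediction is co-occurrence counting: people who often appear in photos together are likely to appear together again. Below is a toy sketch of that idea (illustrative only — it is not the Toronto researchers' actual algorithm, and the names and photo data are made up).

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(tagged_photos):
    """tagged_photos: list of sets of person names appearing together."""
    cooc = Counter()
    for people in tagged_photos:
        for a, b in combinations(sorted(people), 2):
            cooc[(a, b)] += 1
            cooc[(b, a)] += 1
    return cooc

def predict_subject(known_in_photo, candidates, cooc):
    """Score each candidate by how often they co-occur with the
    people already identified in the photo."""
    scores = {c: sum(cooc[(c, k)] for k in known_in_photo)
              for c in candidates}
    return max(scores, key=scores.get)

# Hypothetical training photos: each set is one photo's tags.
photos = [{"alice", "bob"}, {"alice", "bob", "carol"}, {"carol", "dave"}]
cooc = build_cooccurrence(photos)
# An untagged person appears next to alice; carol has history with alice.
print(predict_subject({"alice"}, ["carol", "dave"], cooc))  # carol
```

The real system presumably combines such relational signals with visual features, but even this bare counting scheme shows why context in a photo leaks identity.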
Google probably does need to become feature-competitive with AWS sooner rather than later, but that doesn’t mean it necessarily needs to match AWS tit for tat. Maybe being Google will actually pay off in the end. Read more »
The tech world is wondering how Apple plans to utilize the assets it acquired by buying Topsy, which focuses on collecting and analyzing Twitter data. I suspect Apple is trying to fill a big data void in its platform battle against Google. Read more »
Netflix is now running its streaming service live across two regions of the Amazon Web Services cloud platform, an architectural decision that should avoid a nasty service disruption like the one that struck last Christmas Eve. Read more »
Yahoo has acquired SkyPhrase and will incorporate the team into Yahoo Labs. SkyPhrase had built a natural-language processing platform that returned relevant statistics in response to search queries entered using everyday language. Read more »
Cloud platform provider Tier3 recently went from being a 60-person startup to part of a deep-pocketed telco with 55 data centers around the world. Here’s where Tier3 founder and now CenturyLink cloud CTO Jared Wray sees opportunities for startups and telcos alike. Read more »
The tech world is still enthralled by Yahoo, if only to watch if the CEO du jour can remake what was a hugely important company. Here, four former Yahoo technology executives talk about why the company failed, and the great work it did while doing so. Read more »
This is an interesting (and pretty funny) post from MailChimp data scientist John Foreman about analyzing email addresses. For example, Gmail and Hotmail are similar in terms of number and age of users (although possibly for different reasons), as well as preferred browser. AOL and Comcast email users, on the other hand, are older and interested in way different things than Gmail users. Oh, and a surprising number of people still use the AOL browser.
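The first step in an analysis like Foreman's is simply bucketing addresses by provider domain and tallying them — a one-liner in Python. A minimal sketch (the sample addresses are invented):

```python
from collections import Counter

# Hypothetical mailing-list addresses.
emails = [
    "a@gmail.com", "b@gmail.com", "c@hotmail.com",
    "d@aol.com", "e@comcast.net", "f@gmail.com",
]

# Split on the last "@" so addresses with quoted local parts still work.
domains = Counter(addr.rsplit("@", 1)[1] for addr in emails)
for domain, n in domains.most_common():
    print(domain, n)
```

From there you can join the domain buckets against age, browser, or interest data to get the kinds of provider-level contrasts the post describes.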
A Japanese project aimed at creating a computer system smart enough to pass the University of Tokyo entrance exam scored above average on a recent test run of sample math questions, highlighting some of its progress as well as some problems. Read more »
The platform-as-a-service market hasn’t caught on as wildly as some anticipated a few years ago, and Apprenda CEO Sinclair Schuller has some ideas why that is. He says his PaaS company is killing it because it made some smart — and prudent — decisions. Read more »
Alpine Data Labs, a San Francisco-based startup that has its roots in Greenplum, has raised a $16 million series B round of venture capital from Sierra Ventures, Mission Ventures, UMC Capital and Robert Bosch Venture Capital. The company touts its usefulness even to non-data scientists, who can create visual analytic workflows without having to write code in a language like R. Additionally, Alpine analyzes data within the database (or Hadoop) itself, so users don’t have to bother themselves with sampling or moving data.
Facebook has open sourced a new embedded database called RocksDB that’s meant to take advantage of all the performance flash has to offer, from right on the application server. It might be a sign of best practices to come. Read more »
Dropbox acquired computer vision startup Anchovi Labs and its Ph.D. founders in September 2012 to very little fanfare. But the skillset they bring could be integral as Dropbox seeks to grow into a platform and competitors like Google and Yahoo beef up their image-recognition capabilities. Read more »
This is a good blog post from Gartner analyst Alessandro Perilli about some of the problems facing vendors selling OpenStack as private-cloud software. You should read it. My two cents: If OpenStack vendors really are at a loss for how to describe their products, perhaps they should look at how the Hadoop market has been able to (seemingly) thrive thanks to a strong community and clear product visions among the vendors involved, beyond the open source code.
You didn’t think all the research Microsoft has done around deep learning was just for show, did you? The company’s deep learning models are now powering voice commands on the Xbox One platform, thanks to a direct connection to Bing. Read more »
The days of the cold call might be gone for salespeople. Actually, the days of the not-too-promising call might soon be gone, too. On Tuesday, a company called InsideSales introduced a new capability that infuses neural network technology (the basis of deep learning) into its products to help identify the best leads and even the best ways to approach them. Indeed, scoring sales leads is becoming the new black. We recently covered a company called Infer that delivers a similar service, and companies such as Intel are even doing some of this internally.
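At its core, lead scoring is supervised learning: train on past leads and their outcomes, then score new ones. Here is a deliberately tiny sketch — a single logistic neuron trained by gradient descent on hypothetical features (InsideSales' actual model and features are not public):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(leads, outcomes, epochs=2000, lr=0.5):
    """Fit one logistic neuron: weights plus a bias term."""
    w = [0.0] * len(leads[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(leads, outcomes):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Hypothetical features: [responded_to_email, visited_pricing_page]
history = [[1, 1], [1, 0], [0, 1], [0, 0]]
converted = [1, 1, 0, 0]  # in this toy data, responding predicts conversion
w, b = train(history, converted)

# Score a fresh lead who responded to email but skipped the pricing page.
score = sigmoid(sum(wi * xi for wi, xi in zip(w, [1, 0])) + b)
print(round(score, 2))  # close to 1: a promising lead
```

A production system would use far richer features and a deeper network, but the shape of the problem — features in, probability of conversion out — is the same.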
HP released a new version of its Vertica database that easily connects with other systems to bring in unstructured data. It’s a big update for a database built around analytic SQL workloads that needs to find a way to play with today’s data formats. Read more »
Intel is using big data to improve everything from manufacturing efficiency to sales, and is increasingly looking toward technologies such as Hadoop and machine learning to create new opportunities. Read more »
Anyone wondering how Amazon Web Services is able to roll out so many new features to its cloud platform each year might just want to read the new biography on Amazon CEO Jeff Bezos, whose management style touches everything within the company. Read more »
Cycle Computing CEO Jason Stowe dives deep into the economic and innovative benefits of running massive scientific workloads in the cloud. When researchers aren’t constrained by the systems they can afford, they can ask bigger questions and get better results. Read more »
This post from the New York Times’ Open blog talks about the architecture and algorithms underpinning its content-personalization engine. Its experience speaks to some larger trends around companies moving from batch to stream processing and to cloud services overall. The Times’ recommendation engine used to rely on MapReduce jobs that ran every 15 minutes, but now relies on a homegrown real-time system. It used to run on Cassandra, but now runs on Amazon’s DynamoDB service.
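The batch-to-streaming shift is easy to see in miniature: instead of recomputing every reader's interest profile in a periodic batch job, fold each click event into the profile the moment it arrives. A minimal sketch of that pattern (illustrative only — the Times' actual system, topics, and data model are far richer):

```python
from collections import Counter, defaultdict

profiles = defaultdict(Counter)  # reader -> topic counts

def on_click(reader, article_topics):
    """Streaming update: apply one click event immediately,
    rather than waiting for the next 15-minute batch run."""
    profiles[reader].update(article_topics)

def recommend(reader, candidates):
    """Rank candidate articles by overlap with the reader's profile."""
    prof = profiles[reader]
    return max(candidates,
               key=lambda art: sum(prof[t] for t in art["topics"]))

# Hypothetical click stream for one reader.
on_click("u1", ["politics", "economy"])
on_click("u1", ["economy"])
pick = recommend("u1", [{"id": "a", "topics": ["sports"]},
                        {"id": "b", "topics": ["economy"]}])
print(pick["id"])  # b: matches the reader's economy-heavy history
```

The payoff is freshness — a recommendation can reflect the article a reader finished seconds ago, not the state of the world as of the last batch run.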
Finnish researchers have devised an algorithm that accurately determines mobile phone users’ modes of transportation by analyzing data from their phones’ accelerometers. Useful? Absolutely! Annoying? Possibly … Read more »
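To get a feel for why accelerometer data gives transport mode away, consider just one feature: the variance of acceleration magnitude. Walking shakes a phone far more than riding in a vehicle. Below is a crude threshold classifier on that single feature (an illustration of the general idea, not the Finnish researchers' algorithm; the sample readings and threshold are invented):

```python
import statistics

def classify(samples, walking_threshold=1.0):
    """samples: accelerometer magnitudes in m/s^2.
    High variance suggests walking; low variance suggests a vehicle."""
    if statistics.variance(samples) > walking_threshold:
        return "walking"
    return "vehicle"

walking = [9.8, 12.1, 7.4, 13.0, 8.2, 11.5]  # bouncy readings
driving = [9.8, 9.9, 9.7, 9.8, 9.9, 9.8]     # smooth readings
print(classify(walking))  # walking
print(classify(driving))  # vehicle
```

A real system would use many features (frequency content, step periodicity) and a trained classifier rather than a hand-set threshold, but the signal itself is this accessible — which is exactly why it is both useful and a little unnerving.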
Amazon Web Services VP and Distinguished Engineer James Hamilton explained during a session at the AWS re:Invent conference how the cloud provider keeps costs as low as possible and innovation as high as possible. It’s all about being the master of your infrastructure. Read more »
It was a good day for anyone invested in the greater NoSQL market, as Riak creator Basho and Couchbase both announced big customer wins. Basho highlighted The Weather Company, which is running and replicating Riak across multiple global data centers, while travel-industry technology provider Amadeus is working with Couchbase to deploy that database across its customer-facing applications. It’s good news for the NoSQL space because every large company that chooses a database other than MongoDB validates that these products matter and signals they’ll be around for a while.
Amazon Kinesis is a new service for capturing and processing streaming data, and it’s also about the only thing of its ilk available as a cloud service. Will other cloud providers ever catch up with AWS? Read more »
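The core of the Kinesis model is that each record carries a partition key, which is hashed to pick a shard, and ordering is preserved per shard. Here is a bare-bones sketch of that routing model in plain Python (illustrative only — this is not the AWS API, and a real client would call the service over HTTP):

```python
import hashlib

NUM_SHARDS = 4
shards = {i: [] for i in range(NUM_SHARDS)}

def put_record(partition_key, data):
    """Hash the partition key to a shard and append the record there,
    mimicking how Kinesis keeps per-key ordering within a shard."""
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    shard_id = h % NUM_SHARDS
    shards[shard_id].append((partition_key, data))
    return shard_id

s1 = put_record("sensor-7", b"t=21.5")
s2 = put_record("sensor-7", b"t=21.9")
print(s1 == s2)  # True: the same key always lands on the same shard
```

That same-key-same-shard property is what lets downstream consumers process each device's or user's events in order while still scaling out across shards.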
IBM has upped the ante in the API game by making its Watson question-answering system available as a service. That’s right, Watson could soon power your smartphone app. Read more »
Amazon Web Services announced a new service called Amazon WorkSpaces during its re:Invent conference on Wednesday. If it can deliver VDI and gain traction where others have not, it could be a big boon for the company. Read more »
Machine learning startup Ayasdi has teamed up with Lawrence Livermore National Laboratory, as well as the Texas Medical Center, to help advance data analysis in a variety of complex fields. Read more »
Amazon Web Services is now offering up free access to three NASA datasets from the NASA Earth Exchange project about the world’s weather, geology and vegetation. The cloud is a natural place to house large datasets that many people or institutions might want to analyze, without requiring everyone to download, store and analyze the data locally. Scientific data has proven particularly appealing early on, with numerous cloud providers already hosting various datasets, often in the fields of genomics and biology.
IBM’s Steve Mills has been with the company for decades, and during that time has seen lots of technologies and trends come and go. Here are his thoughts on how the company approaches selling software in a changing IT world. Read more »
This survey from State Street and the Economist Intelligence Unit is a pretty good look at the opportunities and challenges of using data in the financial services industry. Many respondents noted the challenge of integrating lots of data sources, which is understandable and probably only going to get harder. It seems there’s a lot of promise in new services/data sources such as Dataminr and Premise Data, but they also represent a pretty big divergence from tradition.
Backblaze has shared the designs of its 180-terabyte storage pods, and now it’s sharing some details about how long the drives inside those boxes last. According to the company, nearly three-fourths of all the drives it has deployed are still running. Read more »