Hadoop-in-the-cloud startup Qubole says its customers used more than 100,000 nodes to run more than 350,000 jobs and process more than a petabyte of data in July. Those aren’t Facebook numbers, but they seem to signal an appetite among smaller users. Read more »
While some big data startups are thriving, others are shutting down or searching for buyers because it doesn’t look like a second round of venture capital is coming. Here are a few lessons I think I’ve gleaned from watching the space over the past few years. Read more »
Predictive analytics specialist NICE has acquired Causata, a marketing analytics startup built around a core of big data and machine learning technologies. Causata should bolster NICE’s customer-engagement platform that helps companies better understand their customers. The four-year-old Causata has raised $23 million in venture capital, all from Accel Partners.
IT services and consulting specialist CSC has acquired Infochimps, a startup that sells a big data query and processing platform. Infochimps had raised about $5 million in equity and debt financing since launching in 2009. Read more »
Curt Monash has some interesting data points on Hortonworks and the Hadoop market from its point of view — competitive landscape, cluster size, hardware setups, etc. Also word that Eric Baldeschwieler is doing “his own thing.”
Hortonworks CEO Rob Bearden has confirmed that co-founder and CTO Eric Baldeschwieler has left the company. No word as to why, but his departure is the latest event in a busy few months at Hortonworks. Read more »
10gen is announcing that energy demand specialist EnerNOC has rolled out MongoDB to help it analyze its power grid data in new ways. EnerNOC collects 1.5 billion data points every month, although it’s possible they won’t all find their way into the company’s MongoDB environment.
If the corporate website is any indication, Hortonworks co-founder Eric Baldeschwieler is no longer with the company. The former Hadoop boss at Yahoo was Hortonworks’ first CEO and was most recently CTO. Read more »
It’s not so much a new brand as a new offering from Cox Communications, called Contour. People often find Netflix’s recommendations less than ideal, but that’s only $8 a month. I hope it’s the massive DVR and second-screen experience that are supposed to hook users.
FBI CISO Patrick Reidy gave Black Hat attendees some advice on detecting insider threats inside their agencies or companies. Essentially, he said, there’s no Edward Snowden profile that should set off alarms, so organizations must know their people very, very well. Read more »
Tableau has pumped up the features on its free offering. Tableau Public, which runs on users’ desktops but stores data and visualizations in the cloud, now stores up to 1 gigabyte of data and can handle files with up to 1 million rows. The previous limits were 50 megabytes and 100,000 rows, respectively.
Researchers have simulated 1 second of real brain activity, on a network equivalent to 1 percent of an actual brain’s neural network, using the world’s fourth-fastest supercomputer. The results aren’t revolutionary just yet, but they do hint at what will be possible as computing power increases. Read more »
Rackspace VP of Intellectual Property Van Lindberg was one of six tech-industry executives testifying before the House Judiciary Committee about intellectual property on Thursday. He highlighted the value of open source and the sometimes ridiculous nature of DMCA takedown requests. Read more »
NSA Director Gen. Keith Alexander gave a contentious opening keynote at the Black Hat cybersecurity conference on Wednesday. Alexander defended the NSA’s activities, while some in the crowd hurled accusations of lying at him. Here are the links to a video of his keynote as well as his presentation slides. (Fair warning, the servers seem a bit bogged down.)
Image: Black Hat USA 2013
Hat tip to Nathan Yau at FlowingData for spotting the soon-to-be new and improved Data.gov site. Data.gov was one of Barack Obama’s early open-government initiatives, but as Yau points out, it wasn’t exactly user-friendly.
A startup called Pondera Solutions has built an entire business based on utilizing Google’s suite of services — its Prediction API most prominently — to power an offering it calls Fraud Detection as a Service. Read more »
ZestFinance, the machine learning meets personal loans startup from former Google CIO Douglas Merrill, has raised a $20 million series C round. The company’s model analyzes more than 70,000 variables in trying to provide good loans to folks with bad, or no, credit. Read more »
Fitness trackers and life logging apps might not add too much depth to our understanding of our daily routines, but they do provide a good judgmental eye. Who else is gonna call you out on being a hedonist? Read more »
Two-hour happy hours on slushies and optimally priced chili dogs aren’t the products of divination. Keeping a business like Sonic competitive means collecting and analyzing lots of data, something Sonic is now doing in the cloud instead of in its old data warehouse system. Read more »
Red Hat Enterprise Linux has some advanced identity management features, and now it has extended them to popular NoSQL database MongoDB. According to a 10gen press release, “IT departments now have access to centralized user, password and certificate management, and are empowered to provide secure MongoDB deployments that are tightly integrated into their back office infrastructure.”
Outgoing Bitly Chief Scientist Hilary Mason will be spending some of her time for the next year as a data scientist in residence at Accel Partners. Mason is a big name in data science circles and has been a big data adviser to Accel since 2011. Read more »
Like most web companies, Airbnb is trying to provide a better user experience by analyzing lots and lots of data. Here’s how the company built its big data infrastructure atop Amazon’s cloud and how all that data manifests itself in products. Read more »
Mona Chalabi tried to dig up some numbers about online abuse (in light of the recent Twitter rape-threat controversy) and found them hard to come by. Even in an age of over-sharing on social media, it’s hard to quantify some problems without access to sophisticated algorithms and people willing to spend lots of time on them.
GridGain Systems has raised a $10 million series B investment round for its suite of in-memory computing technology. In-memory databases are popular because of their low latency, but GridGain actually offers a whole line of other use-specific products, including for high-performance computing and Hadoop. Almaz Capital led the round, with participation from existing investor RTP Ventures.
America has millions of open jobs and not nearly enough people qualified to fill them. Sometimes, that’s because people don’t know they exist. Online education can change that. Read more »
Researchers at the University of Notre Dame have built a system for generating personalized health assessments that uses techniques common in web recommendation engines. The aptly named Collaborative Assessment and Recommendation Engine uses collaborative filtering to analyze the similarities among patients in hopes of identifying common symptoms, treatments and other things.
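For readers curious what collaborative filtering looks like in practice, here is a minimal sketch of the general idea, not the Notre Dame system itself. The patient IDs, the conditions and the `similarity` and `suggest` helpers are all made up for illustration, and the scoring (cosine similarity over sets of recorded conditions) is just one common choice.

```python
from math import sqrt

# Hypothetical toy records: patient ID -> set of recorded conditions.
patients = {
    "p1": {"hypertension", "diabetes", "obesity"},
    "p2": {"hypertension", "diabetes"},
    "p3": {"asthma", "eczema"},
    "p4": {"hypertension", "obesity", "sleep apnea"},
}


def similarity(a, b):
    """Cosine similarity between two condition sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def suggest(target, records, top_n=3):
    """Rank conditions the target lacks by similarity-weighted votes from other patients."""
    mine = records[target]
    scores = {}
    for other, theirs in records.items():
        if other == target:
            continue
        sim = similarity(mine, theirs)
        if sim == 0.0:
            continue  # patients with nothing in common contribute no votes
        for condition in theirs - mine:
            scores[condition] = scores.get(condition, 0.0) + sim
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    for condition, score in suggest("p2", patients):
        print(f"{condition}: {score:.2f}")
```

In this toy data, patient "p2" overlaps most with "p1" and somewhat with "p4", so the conditions those two have and "p2" lacks rise to the top. It is the same logic a shopping site uses to say "people like you also bought."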
Remember SOPA and PIPA, the two copyright-protection bills that stirred the internet into a frenzy in late 2011 and early 2012? Well, Harvard’s Berkman Center for Internet & Society just released some really interesting research and an interactive visualization mapping media coverage of the topic over time.
This probably isn’t a surprise to anyone, but Google makes a lot more money than Facebook and also spends a lot more on infrastructure. Take a look. Read more »
There has been a lot of talk about data after the success of “Orange is the New Black” on Netflix, but as the content competition picks up in streaming TV, it might be the little things where big data has the biggest impact. Read more »
A newcomer called Treasure Data has raised $5 million in its quest to take on the big boys of big data — names like Teradata, Cloudera and Amazon Web Services. Read more »
LinkedIn released its second-annual list of the Top 10 most in-demand tech startups on Monday. Cloudera tops the list, with other Hadoop and enterprise IT companies making up most of the rest. But do engineers love those technologies, or the payout that comes along with them? Read more »
DataStax has raised $45 million in a series D round to scale its business of selling enterprises on the Cassandra NoSQL database. It might not get all the attention, but Cassandra does have some big users. Read more »
Two separate research projects from Carnegie Mellon University and Disney Research have analyzed digital sketches in order to detect patterns in people’s drawings. As an exercise in data collection it’s cool, but do we really need to democratize artistic talent with data? Read more »
Airbnb recently analyzed its reviews to find out what cities are the most hospitable and what guest-and-host characteristics tend to influence positive reviews. Should Airbnb users care? Read more »
Here’s a look at the most popular topics, news and posts on Twitter over the past week — at least based on my analysis using a collection of free tools. Read more »
Google and Microsoft might have disappointed investors in the last quarter, but they didn’t disappoint their equipment manufacturers and data center partners. Both companies spent boatloads on infrastructure. Read more »
In-Q-Tel, the strategic investment arm of the U.S. intelligence community, has put money into an open source geospatial-data startup called OpenGeo. Read more »
Tableau has gotten into the SaaS game with a cloud-based version of its popular analytics software. Called Tableau Online, it’s essentially the company’s server-based version delivered as a service. Read more »
Whether it’s ethically right or wrong to investigate deep into suspects’ networks of connections, the NSA certainly has the processing power to do it. “Three hops” away isn’t much when you can map potentially trillions of identities. Read more »
Researchers have determined the most-controversial Wikipedia articles and topics across 10 different languages, and the results might surprise you. Religion, politics and war? Of course. But professional wrestling? Read more »