This is a good presentation about Facebook’s graph-processing engine, Giraph, from a big data event held at the company’s Menlo Park campus in early June. The PRISM story kind of took over the news cycle that week, but the event also produced some news (for big data geeks, at least): Facebook’s Presto engine for interactive queries of its 250-petabyte Hadoop data warehouse.
While the company is not likely to reach its Kickstarter goal next week, it’s secured enough outside funding that the project will continue. Read more »
Researchers have devised a method for identifying fake Twitter accounts that proved highly accurate across 27 popular black-market merchants. With Twitter’s cooperation, they spotted and deleted millions of accounts, using only data generated during the account-registration process. Read more »
On this week’s show we give you the scoop on Samsung’s watch, ponder if the Nook is cooked and learn all about deep learning. Read more »
The Independent claims reporting restrictions limited The Guardian’s recent surveillance coverage, but Edward Snowden claims The Independent’s new Middle East scoop is a government plant. Read more »
The last day’s NSA headlines have been about how it broke the law and even violated the Constitution. But that’s just a small part of an opinion that raises more questions than answers, and that underscores the complex nature of data privacy. Read more »
A Facebook initiative to give internet access to people in underdeveloped countries, and a proposal to teach a homeless man to program — two examples of a kind of tone-deaf mindset about how to fix the world’s problems. Read more »
Super data analyst Nate Silver talks about his plans for the new-and-expanded FiveThirtyEight blog from ESPN. Anyone wanting Silver to jump in with in-game analysis or predictions might be disappointed. Read more »
JackBe’s technology, already used by the likes of General Electric and Intel, will now provide the visualization and analytics layer across Software AG’s product suite, while also underpinning the new Intelligent Business Operations Platform. Read more »
Data Science is not a solitary endeavor. It requires a team of professionals with organization-wide big data analytics understanding, knowledge and skills — from management to the data science practitioner — to unlock insightful data analysis that drives meaningful, actionable business decisions. Read more »
It’s natural to hear all the hype about big data and sense a bubble is forming, but the speakers at this year’s Structure: Europe conference have proven it’s for real — and they know how to make it happen. Read more »
Four trillion transactions? That’s just one reason you should pay attention to Google’s growing cloud. Also, the raging cloud storage wars and why big data may be hyped but is still pretty big. Read more »
For some time now, Foursquare has been fighting its critics by arguing that it is building the “location layer for the internet.” It has followed through on some of that claim by integrating itself into Instagram and other services, but now it could be on the verge of something much bigger: sources tell BuzzFeed it is close to signing a significant data deal with Yahoo.
Tempo adds Company Cards to its contextual calendar, arming you with factoids and news before you enter a meeting. It’s not a huge feature, but Tempo says it’s a key step toward building your networking graph. Read more »
A database vendor called Objectivity has created a mobile app called GraphMyLife that aims to let consumers explore links between the people and content in their various social networks. I say “aims” because although the idea is pretty cool, the app is a bit laggy and confusing (at least on my phone). But cut Objectivity a break: it’s a specialized (and old) enterprise-tech company trying to humanize its graph database software.
A data science consultancy has published a report analyzing the design of retirement- and investment-industry websites, but the lessons are universal: Better design means better business. Read more »
According to the Wall Street Journal, the NSA’s surveillance program allows it to tap into more data than it has previously admitted — up to 75 percent of all internet traffic in the U.S., the newspaper says. Read more »
Facebook, Ericsson, MediaTek, Nokia, Opera, Qualcomm and Samsung are launching an initiative called internet.org that aims to connect the whole world with internet access via cheaper devices, better business models and better infrastructure. Read more »
The legal discussion forum Groklaw is the latest web service to shut down out of concern over the NSA’s surveillance program — and the latest sign of how much we are losing due to the chilling effects of that government behavior. Read more »
In a candid interview last week, Hortonworks CEO Rob Bearden discussed a variety of topics — including personnel, profitability and a public offering — in some detail. Hortonworks is a Hadoop startup that spun out of Yahoo in June 2011. Read more »
The $1.25 million program, with participants including Intel and Xively, will use sensors and data visualizations to train pupils in the use of emerging technologies and bring other subjects to life. Read more »
10gen has added some new features to its MongoDB connector for Hadoop, including support for Hive and the ability to back up MongoDB files in HDFS. Read more »
Business intelligence and analytics startup Birst has raised a $38 million Series E round led by Sequoia Capital. Birst has been very busy in the past couple of years, moving from SaaS to on-prem software, rethinking the data warehouse and even launching a Hadoop-based service. It looks like Birst is positioned to test the IPO waters like Qliktech and Tableau before it.
Cleversafe, a Chicago-based provider of object-storage systems for housing massive amounts of data, has raised a $55 million Series D round led by New Enterprise Associates. Apart from traditional storage workloads, Cleversafe has also made a name for itself as a replacement for HDFS in Hadoop environments. According to Crunchbase, the company has now raised $91.4 million since 2007.
While companies are starting to realize that cloud architects are vital to a well-designed cloud strategy and architecture, the problem is finding education programs that develop the design expertise that cloud architects require to ensure a smooth transition from physical to cloud to ITaaS environments. Read more »
Pamela “PJ” Jones, the proprietor of Groklaw, is shutting down operations in the wake of the Lavabit secure email service closure. Groklaw, which was originally set up to cover the long-running SCO v. Novell trial but went on to facilitate discussions around all sorts of open-source and patent issues, relied partly on anonymous user tips. Jones said email could no longer be trusted, and said she was personally trying to get off the internet as much as possible.
The details, which appear to be genuine, do not include passwords. They do include OAuth tokens, though, so Twitter users should probably revoke and re-establish access to connected third-party apps. Read more »
A recent New York Times article casts some doubt on the economic impact of big data. Here’s why I think we haven’t seen anything yet when it comes to big data and the global economy. Read more »
Genomic-analysis startup Bina Technologies is trying to grow its footprint by giving away its appliances on a pay-per-use basis. It’s also expanding its capabilities to include analysis of exomes, a much smaller but very valuable component of human genes. Read more »
Is it illegal to visit Craigslist when the site tells you not to? In a new ruling on a closely watched case about data scraping, a federal judge suggested that start-up 3Taps violated an anti-hacking law by disguising its IP address. Read more »
The tool, which forms part of Recommind’s cloud-based Axcelerate On-Demand package, aims to give non-technical users a faster and more informative e-discovery process. Read more »
NewSQL player will use funding to pursue opportunities in e-commerce, gaming and advertising, says CEO Robin Purohit. Read more »
The four GigaOM podcasts covered a range of topics this week: From SDNs and VMWare to BlackBerry’s past and present. We also discuss why the future of mobile banking will change due to connected devices, so tune in! Read more »
The White House’s Office of Science and Technology Policy published new guidelines and FAQs to ease adoption of President Obama’s open data policy. In May, the president signed an order mandating that agencies use machine-readable and open data formats when they collect or create information so it can be re-used efficiently.
Facebook has reportedly done away with its once-important EdgeRank system in favor of a system that considers about 100,000 factors in determining what content to show on users’ feeds. Read more »
Google researchers have developed new methods for analyzing language using deep learning techniques. They’ve also open sourced an implementation of their work so any researchers can experiment with it. It could be the first of many deep learning tools designed for mass consumption. Read more »
When it comes to data, soccer is the new baseball. The latest issue of the Economist has an article breaking down English Premier League soccer players using data, and a subsequent blog post includes an interactive tool from machine learning startup Ayasdi that lets readers explore the data. Earlier this week, Disney researchers presented their analysis of an entire year’s worth of ball-position data for a professional soccer league and how that can affect the outcome of games.
Todd Papaioannou is joining big data-focused venture capital firm Data Collective as an entrepreneur in residence. Papaioannou was most recently co-founder and CEO of Continuuity, and has held executive roles at companies including Yahoo and Teradata. Read more »
Documents leaked by Edward Snowden hint at the scale of human and system errors in the NSA’s surveillance apparatus that have led to many Americans’ phone calls and emails being intercepted. Read more »
A monthly look at where health tech investors are putting their money. Read more »