Yes, we told you this already
As we reported two weeks ago, after White House staff leaked the news on a conference call, DJ Patil, the former Chief…
What? More data silos?
So here’s some irony for you: For years, Andy Palmer and his oft-time startup partner Michael Stonebraker have pointed out that database…
Marriage of NoSQL, graph tech
DataStax, the rising NoSQL database vendor that hawks a commercial version of the open-source Apache Cassandra distributed database, plans to announce on…
Machines are tackling the NFL
It’s said that familiarity breeds contempt in personal relationships. In the NFL, it might also breed predictability. Although the New England Patriots and their…
Omaha, hadoop, set, hike
It was early in the 2015 NFC championship football game last Sunday when Green Bay Packers coach Mike McCarthy was twice confronted…
Another day, another study
When something is hyped as much as the notion of big data, there’s bound to be disappointment when results don’t meet expectations…
Even the White House is on it
Even as websites, wearable computers and, increasingly, every piece of technology we touch gathers and analyzes our data, there’s still hope that…
Computers became way more productive when operating systems allowed computers to run different programs without having to be re-jiggered. Palantir thinks the same thing is about to happen to data.
That security camera that’s staring at you in a store may just be there to scare off shoplifters — but it could also be used to help the store owner decide where to put which products.
Is it possible to build a successful app around data capture when the developer doesn’t own or control the data? Absolutely, and RunKeeper is a perfect example: its data reveals patterns at scale that can motivate users beyond their limits.
Trading on the New York Stock Exchange and internet traffic are predominantly made up of bots. Humans’ relationships with them will continue to evolve.
Here’s the back story on Amazon’s new data stream processing engine Kinesis.
For the internet of things to succeed, its parts have to be interoperable, Intel’s Boyd Davis said at Structure Data Thursday.
Microsoft Research distinguished scientist and manager John Platt explains the impact of machine learning on current and future Microsoft technologies.
AlchemyAPI CEO Elliot Turner and IBM Watson sales chief Stephen Gold took to the stage at Structure Data 2014 on Thursday to discuss the implications of cognitive computing’s rise.
There’s plenty of consumer data that can be mined in Africa, but it just happens to be controlled by the mobile operators. TA Telecom CEO Amr Shady explains how startups can put that data to use.
D-Wave Systems is moving closer all the time to commercializing its cloud quantum computing service. In the process, it is demonstrating just how fast quantum computing science is developing.
Food and water shortages can lead to violence and civil unrest. Tracking data about food availability at the local level can help predict problems before they happen.
Data from human trafficking hotlines can be mined to identify trends and shape policy.
Exposing data isn’t just a compliance challenge — it can also be creepy. The solution is to tag data and control access, thereby allowing only partial visibility.
Deploying a radical new customer experience platform in 90 days sounds like startup territory, but what about a major company? MetLife takes on its big data challenges by staying lean.
Ford is turning to data culled from social media to make design decisions on its new vehicles, according to data scientist Michael Cavaretta.
Years of data, user feedback and complicated simulations can help design both a million-dollar car and an auto that costs ten grand. McLaren Applied Technologies is doing just that, saving vast amounts of time to make smarter, higher performing products.
Pivotal’s CEO is seeing large companies bypass their legacy infrastructure and create new systems, thanks to pressure from the internet of things.
Foursquare co-founder and CEO Dennis Crowley explains that, in the future of Foursquare, you might not have to check in at all.
Lew Cirne said New Relic’s Insights will bring real-time analytics of application usage data to business users, not just data scientists.
Pivotal continues to integrate and enhance technologies bought by its parent companies to expand its big-data-for-the-enterprise vision.
Data science is hard, but focusing on these three things can help your company or organization unlock the value of its data.
Machine learning startup Wise.io, whose founders are University of California, Berkeley, astrophysicists, has raised a $2.5 million series A round of venture capital. The company claims its software, which was built to analyze telescope imagery, can simplify the process of predicting customer behavior.
For a deeper dive into the topics and technologies covered on Gigaom, check out the latest in-depth analyses on Gigaom Research. This week, we look at 3D printing, present our PaaS Sector Roadmap, and tackle the evolution of TV.
Declara co-founder and CEO Ramona Pierson knows a lot about overcoming adversity and the process of learning as an adult. She also knows a lot about algorithms. On the Structure Show podcast this week, she explained how the two have intersected in her company’s platform.
Another study is reporting on the inaccuracy of the Google Flu Trends project, which predicts seasonal flu rates based on search data. But Google’s algorithms don’t constitute the “big data” approach to this issue; they’re just one piece of a smart big data approach.
On this week’s Structure Show, Ramona Pierson talks about how her 100 surrogate grandparents influenced how Declara helps us learn better.
For fun, I decided to turn my iTunes library into a network graph and compare the language in Edward Snowden’s recent SXSW interview to Gen. Keith Alexander’s Black Hat talk in July. Just because you’re not a data scientist doesn’t mean you can’t enjoy data.
Spotify wants to keep the Echo Nest open for third parties, but Rdio’s CEO says that he doesn’t want to share his company’s data with a competitor.
Even Booz Allen Hamilton has dollar signs in its eyes when it thinks about sports data. The company is getting started on a new venture to apply its data science mastery to the piles of sensor and statistical data teams are generating.
As a new study about sex trafficking during the Super Bowl highlights, advances in data analysis are underpinning some powerful new ways of tackling very tough problems. Among all the stones hurled at the tech sector lately, this is an area in which it can take pride.
One blog post says, “Not only is Data Science not a science, it’s not even a good job prospect.” Another says, “[T]here will always be a place for those who excel at solving ambiguous technological & business problems. And they’ll cost more than $30/hr.” Who’s right?
This year’s Structure Data conference has a few new wrinkles, including a trivia night at a nearby pub and a series of Data Lab talks about using new types of data. Here are the details.
In this week’s Structure Show, Eucalyptus CEO Marten Mickos talks about the long and winding road to private cloud, an idea that arrived ahead of its time.
A Huntsville, Ala., company is moving from the machine-to-machine world into cloud platforms and big data. Here’s how it did it and how it thinks its work could actually end up saving lives.
Everyone is trying to be a platform company these days, but in this week’s podcast we explore the challenges of building a business around integrating various APIs and the need for a magical user experience.
Uber has published a blog post explaining the difference that median income makes to the company’s service in Chicago. Beauty might be in the eye of the beholder here, but the study itself reinforces how much today’s data-driven companies know about their businesses.
Sqrrl co-founder and VP of business development Ely Kahn came on the Structure Show this week to break down the state of cybersecurity and the cutting edge of data analysis within the Department of Defense.
Apache Spark, an in-memory data-processing framework, is now a top-level Apache project. That’s an important step for Spark’s stability as it increasingly replaces MapReduce in next-generation big data applications.
SEC filings don’t have to be dense PDFs filled with numbers. A former SEC analyst turned the data into an interactive database so companies and their filings could be easily searched.
What can the internet of things learn from modern farming? Plenty given that the industry is well on its way to building viable businesses around connectivity and data analysis.
RunKeeper tracked what its users were up to in Sochi during the Olympics and found they ran the equivalent of about 78 marathons. It’s an interesting nugget, but part of a much larger picture about learning how, when and where people exercise.
After weeks of voting, we’re pleased to announce the winners of the inaugural Structure Data Readers’ Choice awards. The eight winners represent some of the most innovative and promising startups that launched in 2013 and have made better data analysis their mission.
Analytics startup BeyondCore has raised $9 million for its technology that can analyze complex data sets and automatically highlight the strongest correlations. It’s a promising capability assuming companies are willing to open up analytics across the organization.
Could data and connected devices make personal trainers obsolete? Fitness equipment and quantified-self gear are coming onto the market armed with algorithms that know if your crunches are correct and how effective your squats are.
Our Structure Data conference this year is about many things, but one big theme is the emergence of everything as data. Thanks to advances in sensors and machine learning, everything from soil to sounds can provide valuable data.
There has been a lot of data industry news this week coming out of the Strata conference, and elsewhere. Here are some of the highlights.
MapR is continuing along its path to Hadoop glory with new support for the YARN resource manager and a direct integration with the HP Vertica analytic database. In such a competitive space, every little edge matters.
Data means a lot to Ford, informing everything from product design to business intelligence. In this interview from the Structure Show podcast, Ford’s top data scientist talks all about how Ford approaches everything from deploying Hadoop to hiring the right people.
Twitter is fast becoming a platform that’s far more valuable for marketers, politicians, traders and journalists than for any given individual user. That’s because if you know how to use it, the breadth of raw data Twitter offers via its firehose can tell a lot of stories.
We have chosen eight of our favorite startups from 2013 as winners of the inaugural Gigaom Structure Data Awards, but readers will also have their chance to vote for the Readers’ Choice awards.
I analyzed more than 5,000 posts by Gigaom writers in 2013 to identify the words and phrases we use the most. Can you guess what they are? Some of them might surprise you.
Jason Hoffman, Joyent co-founder and former CTO, and current VP at Ericsson, shares his thoughts on all things cloud — from why Amazon Web Services is king in IaaS to why data prices for connected cars had better be reasonable.
The streaming music space is heating up thanks to API services that put incredible amounts of music data in the hands of developers who want to build their own streaming services. Can Pandora’s “less is more” approach survive?
The Atlantic’s sister publication, Quartz (QZ), yesterday published a provocative piece under the headline “2013 was a lost year for tech.” It was a good way to boost attention, but it also highlights a trend of viewing technology through the narrow lens of consumer tech.
The New York Times has a new online tool and Twitter feed that analyzes every fourth down in every game and gives its analysis in real time. Fans, commentators and even bosses have yet another means by which to second guess coaches’ decisions.
Quasi-secret intelligence-software startup Palantir is reportedly in the process of raising more than $100 million at a $9 billion valuation. That says a lot about the value of its technology, which isn’t cloud-based or consumerized, but does what it does very well.
Developers love the latest and greatest tooling, whether it’s Sawzall, a Google language that bridges the declarative and procedural worlds, or Kafka, a real-time framework for managing data streams. Here are four or five tools that deserve a look.
As we move towards a quantified society, one shaped by data, we start to dismiss things that are unquantified. Empathy, emotion and storytelling — these are as much a part of business as they are of life. Here is why.
Google Drive and LinkedIn suffered snafus this week, and Amazon will reportedly build a not-so-secret cloud for the Central Intelligence Agency.
Call it whatever you want — big data, data science, data intelligence — but be prepared to have your mind blown. Imagination and technology are on a collision course that will change the world in profound ways.
A few trends emerged in more than 30 talks at this year’s GigaOM Structure:Data conference in New York on March 20-21. The big one: people play a crucial part in the big data equation.
In-memory, SQL, NoSQL and graph databases were on display in a feisty discussion about databases that don’t involve Hadoop. The distinctions stand out amid growing interest in specialized databases in a big-data age.
Financial institutions have a lot of data, as in multiple petabytes, so storing that data for use in new products and for regulatory compliance will move to the public cloud.
For a company like Eventbrite, which manages ticket sales and audiences for events across the world, keeping up with the constant flow of data and information is a challenge.
We’ve pretty much got the data scale problem solved; now we have to focus on speed. When that big data utopia of scale and performance is achieved, the applications could be awesome.
Rather than store lots of data and then analyze to draw insights, Guavus puts rapid analysis first, yielding considerable return on investment for telecommunications companies, said CEO Anukool Lakhina.
There’s no doubt that Hadoop is the data tool of the present and future, but more can be done to make it really shine for business intelligence.
Entrepreneurs who build applications on top of Hadoop see lots of use cases, but the ecosystem needs to evolve further in order to support wider and more cost-effective implementations.
LucidWorks CTO Grant Ingersoll made his case at Structure:Data on Thursday for why companies tackling big problems related to large sets of data should give search another look.
Asking how something is better than Hadoop is not the right question. For strategic thinking around big data, companies need to figure out what they want to achieve, not what tool to use.
Want to see big data in action? When it comes to planning out data center capacity, data can influence everything from the power usage to planning for disasters.
Former Yahoo Chief Cloud Architect Todd Papaioannou said his difficulties building a consistent, stable Hadoop platform at Yahoo directly led him to found his startup Continuuity.
All companies have data, all companies have people: the secret to big data analytics is incorporating people into the overall process, according to speakers at GigaOM’s Structure Data.
Wrapping up the first day of GigaOM’s 2013 Structure:Data conference, entrepreneurs from six startups shared big ideas for deriving valuable insights from large sets of data.
Just like internet cookies went from major privacy concern to an accepted part of web browsing, the sharing of personal information like location and DNA will become more acceptable once people understand the value they may get out of offering it.
Companies often need to decide between innovation and open standards when they put their data into the cloud. So how can we improve data portability?
While using machine learning over large data sets to serve up ads inside social networks isn’t new, there’s an era emerging where social network data can be used to help people solve important problems.
Big data and the horsepower needed to generate, store and manage it are all great. Now we need to make sure our data is reproducible, says an AWS principal data scientist.
When it comes to using big data, there are still bottlenecks. Many of these are around the tools that people use to try to make sense of massive amounts of information.
Representatives at IBM and the New York Stock Exchange laid out a schematic for doing big data analytics and showed how it can work in practice.
Harvard recently threw a tough genomics problem to TopCoder’s crowdsourced community and discovered the contest not only revealed a much broader field of investigation but also provided a high level of motivation to get the problem solved.
Just like any company, the Central Intelligence Agency is trying to filter the massive amounts of data that are being produced by both people and machines, and find the signal in a growing volume of digital noise.
What does it take to move companies toward a data-driven future? EMC chief strategist and Pivotal Initiative leader Paul Maritz spoke at Structure:Data in New York on how to get there through human leadership and strategy.
There’s already a ton of data flying around, but the amount just scratches the surface of the data deluge that will come with the Internet of Things. Better get ready, says Snaplogic CEO Gaurav Dhillon.
In his talk at Structure:Data, Quid’s Sean Gourley talked about the meaningful differences between “data science” and “data intelligence.” While one is concerned with correlations, the other is concerned with solving problems.
Even though a perception persists that machines can increasingly solve complex problems and process large amounts of data on their own, machine learning experts say humans still play a key role.
Are algorithms actually making society dumber? Yes, says at least one big data expert. We can’t throw computers at our problems until we better define those problems through human input.
Lending firms like Zest and Kabbage are doing a better job than banks at deciding whether a person or small business should receive credit. Their advantage is thousands of data signals that banks don’t even consider.
When building successful apps, both designers and engineers have to remember that they are on the same team, said Kleiner Perkins’ Michael Abbott at Structure: Data 2013 Wednesday.
Riak CS distributed cloud storage technology has always been sort of open-sourcey but not really open sourced. That’s changing now with Basho putting it under the Apache 2 license.
You can find all of our coverage of Structure:Data 2013 here, along with links to more info on the conference and a livestream of the action.
Who better to show the CIA how to build a cloud than Amazon Web Services? No one’s confirming anything but an AWS-CIA contract would make sense for both parties.
LucidWorks’ Grant Ingersoll argues that it’s time to stop using language to diminish the importance of text, one of the defining computational challenges of our time.
For a deeper dive into the topics and technologies covered on GigaOM, check out the latest in-depth analyses on GigaOM Pro, our subscription-based research service. This week: how to scale up a startup, new adventures in convergence, and more.
When Google launched its EC2 rival, Google Compute Engine, last June, it set some high expectations. Sebastian Stadil’s team at Scalr put the cloud infrastructure service through its paces, and was pleasantly surprised by what it found.
Google now allows joins within its BigQuery analytics service, as well as support for timestamped data and massive aggregations. Valuable stuff if you use BigQuery.
Pivotal Initiative is finally here. EMC Chairman Joe Tucci kicks off his latest venture in New York with help from Pat Gelsinger, Paul Maritz and others.
To bring in more data on the IT products that enterprises use, HG Data is taking on $2 million in venture funding, pointing to the role big data can play in lead generation.
Check out our special retrospective on the history of Hadoop, one of the most powerful open-source data tools ever developed, in this post.
Bit.ly, the company that helps customers understand who is clicking on their links and measure news in real time, announced Monday that its CEO will be stepping down.
We talk a lot about big data, but only analyze 1 percent of what’s available. In order to take advantage of the other 99 percent, we need to reconsider how we do big data.
As companies implement big data analytics strategies, they ought to consider some of the best practices in place before the rise of the term “big data.”
We were there very early on for the birth of Hadoop and its maturation into a vital data analysis tool. Here’s a look back at some of our best stories.
It is fashionable these days to either like big data or malign it. Regardless of your personal feelings, the question has always been, and will always be: what is data good for? Here are three stories that illustrate it.
How big an impact has Hadoop had on the technology world? Check out our infographic on the reach of the most important big data tool of our time.
Open source data warehousing models have a lot of advantages, the ability to scale horizontally and cheaply among them, but traditional warehousing techniques have their strengths as well, said Vipul Sharma, principal software engineer and engineering manager at Eventbrite, at Structure:Data.
The problem for many companies is that user information is spread across hundreds or even thousands of different fields in various databases, and it’s difficult to compile it in real time. But doing that successfully is becoming increasingly important, says WiBiData at Structure:Data.
We’re walking around with sensors in our pockets: those of us carrying smartphones, anyway. As speakers noted at Structure:Data, there are huge opportunities for companies to improve existing services and create new ones with the huge amount of data provided by mobile computers.
Scott Metzger, VP of analytics at flash-memory array maker Violin Memory, argued at Structure:Data that putting flash memory at the heart of big-data infrastructure is a must for any business that is worried about how long it takes to get results from data analysis.
Los Alamos National Laboratory is trying to build an exascale computer, which could process one billion billion calculations per second. The man in charge of executing that vision, however, sees a big obstacle to building it. That problem, discussed at Structure:Data, is resilience.
There are plenty of benefits to making data available to large repositories. But Trend Micro’s Dave Asprey said at Structure:Data that one thing holding enterprises back from putting their data in the cloud is the lack of security around what they’re sharing.
It’s easier to crunch massive amounts of data when you don’t have to reinvent the wheel for every scenario. Sultan Meghjji and his colleagues at Appistry are hoping to make this process run more smoothly, Meghjji explained at Structure:Data.
Is it possible that most people will have a DNA profile within the next five years? Yes, according to Andreas Sundquist, CEO and co-founder of DNAnexus, who suggested as much to the audience at Structure:Data.
When running databases, how do you get the speed you want while offering the flexibility and cost savings of the cloud? At Structure:Data, Wordnik co-founder Tony Tam described how his company was able to move its relational database from dedicated hardware to the cloud.
“Anticipation denotes intelligence.” Zubin Dowlaty, VP and head of innovation and development at analytics-outsourcing firm Mu Sigma, said at Structure:Data that’s what companies need to strive for, and that in this era of big data, the barriers to achieving it have fallen away.
Data collected can be useful to retailers in many ways, but not necessarily in the ways that one might expect, as discussed at Structure:Data. For example, did you know that there’s a correlation between the music you listen to and the things you might buy?
At Structure:Data, DataXu showed off its technology in the form of a writhing map of colors that reflected consumer sentiment to cell phone promotions. In practice, this means that a phone company’s ad campaign would automatically increase or decrease offers for contracts or free phones or pre-paid plans.
Consumers have long been trading their personal data in return for access to Web sites like Facebook. The tradeoff has worked well for companies and consumers but, as the pool of data grows, so have privacy concerns. At Structure:Data, panelists said the current so-called solutions are misguided.
Over time, we’re generating massive amounts of new data, and as it gets bigger, it becomes a challenge to gain insights through traditional database queries. At Structure:Data, SQLstream CEO Damian Black proposed a way to solve this problem.
Hadoop may be the current leader of the pack when it comes to handling big data, but LexisNexis said at Structure:Data that the system it developed for its own internal data use — and recently open-sourced — is a viable alternative and in some cases superior.
Your corporation is watching you, and it might be using Cataphora’s software, which mines employees’ emails, IMs and other electronic communications to determine how big a risk a corporation might face from one bad apple.
Machine-generated data, the non-intelligible zeros and ones generated by sensors and other devices, is no longer just for geeks. While it looks like gibberish, forward-thinking consumers are already pressing that “gibberish” data into service, according to speakers at Structure:Data 2012.
In the same way a microscope helps augment the innate ability of the human eye, Quid is trying to create tools to augment how we as humans process unstructured data and visualize it, said Sean Gourley, co-founder and CTO of Quid at Structure:Data 2012.
It’s tough to overcome some of the biases that have become second nature in most businesses. But if you’re John Lucker, a principal at Deloitte, overcoming the “human factor” is critical to the success of driving organizational change.
The amount of data processed by companies big and small increases every day – and data centers have a hard time keeping up. Not only is scaling the physical infrastructure costly, it also consumes vast amounts of energy. A solution was discussed at Structure:Data.
Next-generation online banking service ZestCash uses data to help qualify people for short term loans. The company explains at Structure:Data it has been giving $300 to $800 loans to users based on thousands of variables, which are boiled down to 10 models.
Hadoop is a great platform for storing and processing data, but it needs applications to make it truly valuable. At Structure:Data, Cloudera CEO Mike Olson discussed the dearth of pro analytics apps using Hadoop, and invited the startups building them to come to him for money.
Humans have to be part of the equation when it comes to interpreting and processing data if we want to get the most value out of it, argued Arnab Gupta, CEO and founder of Opera Solutions at Structure:Data. Gupta advocates for making data small.
If 80 percent of new data created is going to be unstructured, where is all that data coming from? It’s coming from consumers’ activities online and it requires real-time processing, said Continuuity’s Todd Papaioannou at Structure:Data.
Predicting risk is key to the way insurance companies figure out your monthly rates and premiums, and it needs data and time to do so. Allstate said at Structure:Data that the best use of this data was to give it to the 30,000 scientists competing on Kaggle.
Robert Lefkowitz, director of web development at 1010data, argued at Structure:Data 2012 for why desktop-based or browser-based spreadsheets aren’t the solution if you have tons of data you need to put to work…
If IBM Distinguished Engineer Jeff Jonas ever invites you over for a friendly jigsaw puzzle, be prepared: Jonas likes to leave out half of the pieces, or introduce false positives from other puzzle sets. These kinds of experiments have taught him a lot about data analytics.
How do you or your business get the most value out of all the data you’ve accessed about your customers? At Structure:Data, Nick Weir, CEO of ChoozOn, gave some basic tips on the most important things to consider when mining your data.
One of the oldest problems in business, getting different groups to communicate with each other, could be solved by one of the newest online phenomena: social tools. Two industry leaders explained how to break down these silos through social at the Structure:Data conference.
Companies are grappling with how to make use of all their data, facing the challenge of teasing out insights quickly and with flexibility. Moving to the cloud opens up security and privacy questions. But the effort can be worth it, says Google’s Ju-kay Kwek at Structure:Data.
Businesses not using machine learning to augment their products and services will find it difficult to compete in the future, according to panelists at GigaOM’s Structure:Data event on Wednesday. These companies’ competitive disadvantage will only get worse as machine learning solutions gain more intelligence.
Businesses understand now that big data can help them wring revenue out of once-unproductive assets. But that just fuels an exploding demand for bigger, faster, and more precise big data applications, experts speaking at Structure:Data say.
On March 21 and 22, at Structure:Data, we’ll look at how companies like @WalmartLabs, IBM, and PayPal are using big data and how technologies like Hadoop are evolving to help them analyze that data. Watch the livestream and read our blogs of the event here.
EMC Corp. (s EMC), the Hopkinton, Mass.-based storage and cloud hardware company, has bought Pivotal Labs, a San Francisco-based consulting firm well known for its tool Pivotal Tracker and for its pioneering work on agile development methodology. I first reported the deal earlier this week.
Legal scholars are always searching for ways to improve the patent system, sometimes via sweeping changes, but big data — especially techniques such as machine learning and natural-language processing — could help provide a technological fix to a big part of the problem.
Data, I believe, is like plastic. You can use it to make wonderful things. However, like plastic, it can be a great polluter and wreak havoc on the environment. Or as I like to say, data without context is dirt.
As the volume of enterprise data created has moved past terabytes and into tens of petabytes, companies need to figure out the…