
Data science: Not a real thing, or a thing not worth less than six figures?

You’d think by this point we might have settled on a definition of what a data scientist is, or at least on a general agreement that they’re important. It seems we have not.

Here are two blog posts published within the last day about the relative value of data scientists:

  • One is from Miko Matsumura, CEO of database vendor Hazelcast, who calls data scientists glorified database administrators who will find themselves the kings and queens of “a rotting whale-carcass of data.” You can read it here.
  • The other is from John Foreman, chief data scientist at MailChimp (and, full disclosure, a regular Gigaom contributor lately), who argues that the work of a good data scientist will never cost as little as $30 per hour. You can read it here.
John Foreman (center) at Structure Europe 2013.

We’ve covered this ground before, in posts trying to define the skills a good data scientist should have and whether massive open online courses can teach people the skills they need to really call themselves “data scientists.”

At Structure Data in two weeks, I’ll be sitting down with AnnaLee Saxenian, dean of the University of California, Berkeley’s School of Information, to talk about how universities and other institutions can and should try to train the next generation of data analysts everyone seems to agree we’ll need. She’s overseeing a $60,000-a-year graduate program in data science, so I’m really keen to hear what she has to say about it.

Given these two recent posts, though, I tend to agree with John’s take. Data scientists actually do a lot of complex work with data, sometimes at the intersection of data and business, and software can only automate so much of that. Even as software improves, one could argue, really skilled people will always be working ahead of the curve. Share your thoughts in the comments.

9 Responses to “Data science: Not a real thing, or a thing not worth less than six figures?”

  1. Back in the day, when I was studying Operations Research at a leading engineering school, they offered a great comment: “Anyone can do the plug-and-chug. However, understanding when a solution works – if it works – and when it fails is the critical distinction.”

    I’d argue we’ve got the semantics wrong. We should be calling this role ‘data engineers,’ not ‘data scientists.’ Scientists discover raw principles. Engineers apply them or find new uses for them.

  2. Ron Gutman

    That’s correct. Humans (with high levels of skill) will be overseeing the big data processing that we call data science for a long time to come. The job might change as more is automated, and the humans will take on new problems at higher levels. Possibly, as higher-level skills are required, the number of humans needed with those skills will eventually decline, but other factors will work in their favor, such as fragmentation of software environments for data science. Just one example of an evolution in data science that I don’t think has played out yet: when will advanced machine learning techniques be combined with big data tools? I haven’t seen much yet. So, no worries: as long as a data scientist pays attention to how things evolve and keeps skills up to date, they have career, if not job, security.

  3. Louis Dorard

    Then there’s the question of whether there will still be that strong a demand for full-time Data Scientists, with tools such as Prediction APIs that commoditize Machine Learning…

  4. Russell Greenberg

    We are definitely seeing overhype of Big Data, and I believe it is a result of the disruptive nature of the Internet and a need to honor/vilify the practitioners who cause this and stand to benefit. However, decision science and analytics have been with us since the mid-1900s, and I believe there is a genuine increase in opportunities caused by the digitization of large amounts of information and the power and ubiquity of data processing.

    I’ve been a management scientist since the mid-’70s, and although the technologies have advanced, the skillset of a data scientist remains the same: identify an objective, assemble the information needed to meet the objective, apply an effective algorithm or analysis, implement the solution, check with the ‘client’ that the objective has been met, look for the next objective.

    A good data scientist needs to excel at each of those skills.

  5. Tomas Ulicny

    Big Data and Data Science have been here all along; the current excitement and churn are, at bottom, the result of marketing trying to bring new life into the data analytics business. New-short-sexy-catchy terms like BIG Data and Data SCIENTIST support that attempt.
    Job well done…
    Let’s see the positives, as it can be beneficial to all involved: 1. To data analytics software providers, to be inventive and invest in the development of new-better-more clever software that targets the “new,” assumed great-margin market; 2. To current data analysts (“scientists”), who should realize what great value they provide, or can provide, and might start to fight more for deserved recognition; 3. To the data owners, to understand that their data can be used in very clever ways that can result in improving efficiency, in providing better customer service, in targeting customers more precisely, …, with a good bunch of cash possibly earned or saved as a result.
    BUT all involved should be cautious and realize that BIG Data and Data SCIENCE are just on their way to the peak of the Hype Cycle curve! Just don’t wear rose-colored glasses; make sure to do your homework well or rely on independent expert opinion.
    You don’t want to be left embarrassed once the bubble bursts…

  6. Rufus T. Fuss pucker

    In-depth business knowledge first; add mad statistical skills, fundamental data reduction and analysis, a solid grasp of data architecture appropriate to the business problem domain(s), and skills with at least 3-5 modern procedural and semantic application development languages, and your recruiter comes up with fewer than 100 remotely qualified candidates in N. America, most of them working at one of the large internet or financial companies for a damn sight more than low six figures.

  7. Chris Lay

    It seems to me that data analysis is essential to any functional organization. Anyone (or an automated report) can pull and report on numbers, but I’ve learned that it takes real skill to find the numbers that matter and identify the story those numbers are telling. My two cents: data science is just getting started.