Summary:

ShareThrough data scientist Ryan Weald shares his thoughts on the importance of making data science as much about building products as it is about math, and how to organize a company to ensure this happens.

When we think about data science, we think of a mythical laboratory where scientists are feverishly crunching numbers to provide a clear quantitative view of the future. We’ve been sold on the idea that these transformative findings will unlock increased performance numbers. Large companies hire teams of academics to sift through their massive repositories of data, believing in data science as it is currently described: academics doing glorified business intelligence.

Startups and small businesses must take a drastically different approach if they want to see real value.

For startups, data science should not be seen as a separate scientific initiative but as an integrated part of the product. Speed and efficiency are key for burgeoning companies, so hiring and building out a team of data scientists, or, more aptly, “data product engineers,” is paramount. Once you accept that data science is about building data products, you will see that your data engineers, contrary to popular belief, do not need PhDs. Instead, they need to be able to integrate into the core of your product and engineering organization.

Approaching data science through the product lens is not a completely new idea. DJ Patil, who previously led data products at LinkedIn and is currently VP of product at RelateIQ, discussed this in his book Data Jujitsu: The Art of Turning Data into Product. His thesis runs along the same lines as ours: product-focused data science is different from the current business-intelligence style of data science. BI initiatives are well understood, but integrating data into the heart of your product in real time is not.

We integrate data into our core product offering, a native advertising exchange. We combine terabytes of publisher content data with user interaction data to understand the context of a user on a publisher’s page, allowing us to deliver the most contextually relevant content.

For an example of a data-driven product outside of the advertising industry, let’s take a look at Pandora. Pandora consumes song metadata and combines it with user listening history to create customized radio stations for its users. This is essentially a gigantic recommendation system, which is inherently a data science problem. It is hard to imagine Pandora’s core product without data-driven customized radio stations, which require the tight data integration we have been discussing.

How do companies build out data product teams that are nimble enough to create such products? We approach the entire data science hiring paradigm differently.

Most traditional job postings look for individuals with a Ph.D. and industrial research experience, but most startups don’t need bleeding-edge machine learning to drive substantial business success. Advanced academic credentials are therefore not the right criteria when hiring someone to build great data products for your startup. Someone who only feels comfortable writing non-production code in Hive, SQL, R and Matlab can’t build great data products. Hiring on those terms creates a data science organization that works in a dark corner and throws algorithms over the fence, hoping the engineers can implement them in the product.

What you really need is someone who understands how to take data and transform it into a product.

So who should you look for? You need data engineers with the skills to build a product from the ground up and release it to your end users, just like your traditional engineering and product team. These skills can be found in entrepreneurial engineers who have a passion for science, math and discovery. These people need the intellectual curiosity, entrepreneurial instincts and data engineering skills necessary to deliver results in the form of phenomenal data products.

Take me as an example. I played a key role in developing our integrated data product engineering team, from building large-scale production data-processing pipelines to developing machine learning algorithms for click prediction, despite lacking the traditional academic credentials: I dropped out of college as an undergraduate. There are plenty of former engineers from Google, Twitter, Facebook and countless startups who would call themselves software engineers or developers rather than data scientists, and they possess all the skills your startup needs.

Finally, it is not just about hiring the right people, but also about properly weaving them into your culture and organizational structure. Our data product engineers sit with and even pair program with other engineers, creating the tight integration that facilitates quick iterations and keeps the engineering team nimble and delivering. If you separate your science team, you create a transactional relationship where “science” is thrown over the wall to engineering, and your science team ends up producing output that cannot be put into production or turned into a product.

The goal of a startup is to develop products that change the world, and often that starts with data. To do this, you need data product engineers who integrate tightly with your engineering team and have the skills to transform data into products.

Ryan Weald is a data scientist at ShareThrough.

Feature image courtesy of Shutterstock user Tatiana53.

  1. Reblogged this on Software Marketing Advisor Blog and commented:
    Worth reading if you are thinking about how to leverage data in your startup business planning. Data product engineers or data scientists? Whatever you call them, the key is to integrate into your core product team, not a separate “science” team.

  2. “For startups, data science should not be seen as a separate scientific initiative but as an integrated part of the product.” – couldn’t agree more. After all, startups need to start at the bottom. That means integrating every aspect of the product in one location, as much as possible.

    More surprising stories here about technology and the people behind them! — http://www.londonreal.tv/silicon/

  3. I agree with the essence of this article. You don’t need PhDs in maths or stats on your staff, especially in a startup. You need strong, quantitatively minded vertical business analysts to understand what questions to ask and what questions the business needs answered. Then you need a strong analytics leader who can translate the ask into a data science question. For a startup, use the 80:20 rule: most of the data science questions can be answered with access to the right data and the plethora of point-and-click analytics vendors in the market.

  4. This article applies primarily to ‘software’ products.

  5. I agree with you, but I think your perspective is too narrow. The point is that data science skills are important and increasingly necessary, but data science can’t be everything. Industry needs subject matter experts of a wide variety of stripes to work with data scientists in order to create the most valuable, useful, insightful teams. Data science is extremely important, but it can’t and shouldn’t be everything.
