As the cliché goes, “data is the new gold,” and many companies have realized this. The world generates more and more data while competition between organizations grows fiercer and habits change faster, so everyone wants an extra edge. Intuition, gut feeling, and common-sense rules are useful, but not enough. Data lets organizations understand clients, products, and processes much better.
This is not just for web companies. For example, Rolls-Royce has data scientists analyzing airplane engine data to determine when to schedule maintenance; L’Oreal, the cosmetics company, has data scientists studying the effect of several cosmetics on several types of skin; Fruition Sciences precisely determines when and how much to water grapes to produce better wines; FiveThirtyEight forecasts elections better than most; and Feedzai detects payment fraud. All this, and more, is possible only with data-driven decisions.
What’s the demand for data scientists in today’s market?
Data scientists are in high demand, and current research has found there is simply not enough PhD talent to fill the jobs. The shortage is especially severe in the U.S., where 80% of new data scientist jobs created between 2010 and 2011 have not been filled. The number of graduates with the requisite technical skills isn’t keeping up with rising demand. Training quants to become data scientists can help, but that training can take years. These figures are from Monster Worldwide, Inc.
What’s the difference between a data scientist and data analyst?
Data science differs from traditional statistical analysis and computer science in that it applies the scientific method to data collected using scientific principles.
The growing need for this new approach is related to big data, which requires a very different technology stack than statistical analysis does. In other words, statisticians 20 years ago were not required to analyze massive data sets on the almost real-time scale that today’s business applications often demand.
Boiled down, it’s the difference between explaining what data means now and predicting what a data set could mean in the future. Traditional data analysis in companies and retail environments has typically been used to explain trends by extracting interesting patterns from individual data sets with well-formulated queries. Data science, by contrast, seeks to uncover actionable knowledge in large, unwieldy data sets: knowledge that can be used to make decisions and predictions, not just interpret numbers.
Give an example of a typical day-in-the-life of a data scientist.
Data scientists typically have to do five different types of tasks: cleaning data (e.g., what to do if some expected data field is missing?); asking questions (e.g., what is the correct amount of water that provides the best-tasting grapes?); getting answers from data using statistics and machine learning models; visualizing results (to summarize and explain them to others); and improving models and algorithms to yield better results and run faster or at a bigger scale. There are, of course, other tasks, such as reading research papers, collaborating with other scientists, hiring, and more, but those five are the most common.
On any given day, a data scientist may be doing one or more of those tasks. My personal favorites are asking questions and getting answers! I love the feeling of thinking about a new question and discovering something new, some correlation that only a few people or maybe no one knows about. When that happens, I feel like the first astronaut on an unexplored planet of data: things are new, alien, and exciting!
Companies today are wrestling with a constant influx of information about their customers. What are some of the best ways to manage this data and leverage it to the company’s advantage?
We are in a time when data scientists are in such high demand that hiring good ones is one of the major challenges data-driven companies have to overcome. Hiring and retaining them is very difficult. It is common for a single position to take six to 12 months to fill and to require interviewing a large number of people.
After that, having a good company culture, where data questions are expected and welcomed and are addressed without prior personal biases, is also very important.
Companies, including retailers, also need to invest in better software and hardware tools to store and query their ever-growing datasets and to quickly iterate and test hypotheses. Finally, companies need the internal speed and agility to recognize the value of their data science findings and react to them. After all, what good is it to find data gold if you think it is just zinc?
Nuno Sebastiao is co-founder and CEO of Feedzai. He brings his experience in infrastructure and services in addition to corporate management skills to Feedzai. Previously, Sebastiao led the development of the European Space Agency Satellite Simulation Infrastructure. Prior to his tenure at the European Space Agency, he was a co-founder at Evolve Space Solutions, a services company working in the aerospace domain. Sebastiao holds an MBA from London Business School.