Data is growing exponentially, as we all know. To handle, analyze, and interpret such large amounts of data we need data scientists, who help organizations see the big picture and plan a better strategy for any possible scenario, present or future.
Therefore, we need to hire data scientists. Bear in mind that hiring a data scientist is not like hiring for any other role. Forget about ‘please send me your resume’, ‘interview questions’, ‘what is your data science education’, etc.
Many interviewers still use basic techniques to interview data scientists. Questionnaires during interviews are the most common method for deciding which candidate to hire.
Because of that, there is a lot of material available on the internet about how to interview a data scientist, such as ‘the best 50 questions to interview a data scientist’ and the like.
Well, I assure you, any aspiring data scientist has read and learned the very questions and answers you are likely to ask about. Those questions alone cannot determine the candidate’s skill level or qualifications.
After observing and working with many data scientists, I have reached some conclusions. At the end of the day, I am a data scientist, and I learn by observing and contrasting data and evidence and by identifying patterns, of course!
Having interviewed many aspiring data scientists, as well as having trained over 300 people in data science, I would like to share my experience in the hiring process, and some additional considerations.
How to interview a data scientist
The best strategy for interviewing a data scientist candidate is to focus on their problem-solving skills rather than on their technical knowledge or education (which are, of course, also necessary). I am not saying you should omit technical questions – statistical concepts, algorithms and models, programming, etc. – but rather that you should focus on the candidate’s ability to solve problems.
In real life, 90% of the problems you have to solve as a data scientist will seem unsolvable at first.
Do not hesitate to pose absurd and unrealistic scenarios to the candidate, and pay attention to how the candidate reasons toward a solution. The candidate may well offer an equally absurd solution (be prepared for it), but offering any solution is worth far more than contributing nothing.
“According to psychological studies, solving absurd scenarios improves creativity and pattern analysis”
A technique that has worked very well in my experience is to prepare a scenario-test for the candidate and observe how they solve it. Also, ask the candidate to explain their conclusions. The ability to communicate effectively and explain a technical solution to any audience is essential for a data scientist.
So, instead of conducting a regular ‘interview’ as such, prepare a scenario-test with a dataset, and ask the candidate to 1. meet an objective within a time limit, and 2. explain the conclusions and results.
Also, consider the business field where the candidate will be working and adapt the scenario-test and the dataset to that field. For example, if the candidate is going to work in a supply chain operations department, you can give them a dataset (there are thousands of open datasets available on the internet), ask them to build a predictive model of the delivery time of the goods, and ask them to explain the results. Let them use the programming language, framework, or tools of their choice. The important thing is that the candidate solves the problem, not which tools they use.
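To make the supply chain example concrete, here is a minimal sketch of what a candidate’s answer could look like. Everything in it is an assumption made for illustration: the data is synthetic (generated from an invented rule of 4 hours of handling plus 0.05 hours per km, with noise), the only feature is distance, and the function names are hypothetical. A real scenario-test would use an open dataset and probably more features, and the candidate is free to use any tools.

```python
import random
import statistics


def make_synthetic_deliveries(n=200, seed=42):
    """Generate hypothetical (distance_km, delivery_hours) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        distance_km = rng.uniform(5, 500)
        # Invented ground truth: 4 h handling + 0.05 h per km, plus noise.
        hours = 4.0 + 0.05 * distance_km + rng.gauss(0, 1.5)
        rows.append((distance_km, hours))
    return rows


def fit_line(rows):
    """Ordinary least squares for hours = intercept + slope * distance."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_absolute_error(rows, predict):
    """Average absolute gap between actual and predicted hours."""
    return statistics.fmean([abs(y - predict(x)) for x, y in rows])


rows = make_synthetic_deliveries()
train, holdout = rows[:150], rows[150:]  # simple holdout split

intercept, slope = fit_line(train)
baseline_hours = statistics.fmean([y for _, y in train])

# Compare the fitted line against always predicting the training average.
model_mae = mean_absolute_error(holdout, lambda x: intercept + slope * x)
baseline_mae = mean_absolute_error(holdout, lambda x: baseline_hours)

print(f"Each extra km adds about {slope * 60:.1f} minutes of delivery time.")
print(f"Holdout error: {model_mae:.2f} h (model) vs {baseline_mae:.2f} h (baseline).")
```

The code itself matters less than the last two lines: a good candidate can turn the fitted numbers into a sentence a supply chain manager understands, and can say how much better the model is than a naive guess.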
So, let’s understand a bit more about the data scientist role and some considerations around the hiring organization.
What Is a Data Scientist and What Do They Do?
A data scientist is a professional capable of solving operational problems with data, whether by interpreting the past, diagnosing the root causes of events, or estimating the most probable future event. In addition, a data scientist must know how to interpret the results of their investigations and, more importantly, must be able to explain these results to any audience – technical or not.
Yes, a data scientist is an individual well versed in both technical knowledge (applied statistics, programming, data engineering, business intelligence, etc.) and soft skills (communication, interpretation, critical thinking, etc.). And perhaps the most important and scarcest quality: they must know and understand the operational or business area where they are going to work. If not, they must be able to learn quickly how that area works.
Consider that a data scientist who works in human resources is not the same as one who works in marketing, finance, technical service, internal operations, engineering, pharmaceuticals, or industrial production. Data is data (we agree), but the environment determines many nuances that a data scientist needs to know.
I assure you that a good data scientist does not rest until they find a ‘seemingly impossible’ solution to a ‘seemingly impossible’ problem. In real life, this is a lot more common than you might think.
Organizational considerations
When you are planning to hire a data scientist, it is essential to be very clear about why you need one. In other words, define the scope of the role very well. It is just as important, if not more so, to know where the organization stands in the data science maturity cycle.
It is important to mention this because it is not always so clear. I have known managers who want to hire data scientists only to project an ‘image of digital modernity and progress’ to other managers or departments, without knowing what tasks or scope the new hire will be entrusted with. In that situation, the newly hired data scientist will burn out quickly and go elsewhere as soon as possible.
On the other hand, there are organizations that are convinced they are ready to apply Machine Learning or Artificial Intelligence, yet they still produce most of their reporting manually (copying and pasting) in spreadsheets or slides; or worse, they do not have a properly architected or organized database system.
In conclusion
Before hiring a data scientist, seriously consider retraining your existing staff and equipping them with technical knowledge related to data science. Your existing staff already know your business culture and its most common problems, and they know well what works and what does not work in your organization. This is something that can take a newly hired data scientist years to learn.
Over the last 5 or 6 years of my career as a data scientist, I have had the opportunity to interview (or to take part in interviews held by other departments) a good number of data scientist candidates, both novice (junior) and experienced (senior).
In addition to this, I have had the great joy of training over 500 co-workers in data science and advanced analytics in the last 2 years: professionals from different areas such as finance, operations, marketing, accounting, human resources, customer service, etc.
P. J. Moreno
Sr Data Scientist