A typical interview process for a data science position includes multiple rounds. Often, one of these rounds covers theoretical concepts, where the goal is to determine whether the candidate knows the fundamentals of machine learning.
In this post, I summarize my interviewing experience, both as an interviewer and as a candidate, in a list of 160+ theoretical data science questions.
This includes the following topics:
The number of questions in this post might seem overwhelming, and indeed it is. Keep in mind that the interview flow depends on what the company needs and what you have worked with, so if you haven't worked with time series or computer vision models, you shouldn't get questions about them.
Important: don’t feel discouraged if you don’t know the answers to some of the interview questions. This is absolutely fine.
Finally, to make it simpler, I grouped the questions into three categories, based on difficulty:
This grouping is, of course, subjective and based only on my personal opinion.
Let’s start!
Supervised machine learning
Linear regression
Validation
Classification
Regularization
Feature selection
Decision trees
Random forest
Gradient boosting
Parameter tuning
Neural networks
Optimization in neural networks
Neural networks for computer vision
Text classification
Clustering
Dimensionality reduction
Ranking and search
Recommender systems
Time series
That was a long list! I hope you found it useful. Good luck with your interviews!
The post is based on this thread on Twitter. Do you know the answers? Consider contributing to this GitHub repository!