What is the difference between classification and regression?

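Classification predicts a discrete category (for example, spam or not spam). Regression predicts a continuous value (for example, a house price). Despite its name, logistic regression is a classification method: it models the probability of a class, which is then thresholded into a label. This is a common exam trap.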
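To make the distinction concrete, here is a minimal sketch in plain Python. The weights, biases, and feature names (`area_sqft`, `num_links`) are made-up values chosen for illustration, not fitted parameters.

```python
import math


def predict_price(area_sqft, weight=150.0, bias=20000.0):
    """Regression: the output is a continuous number (hypothetical linear model)."""
    return weight * area_sqft + bias


def predict_spam_probability(num_links, weight=0.8, bias=-2.0):
    """Logistic regression: squash a linear score into a probability in (0, 1)."""
    score = weight * num_links + bias
    return 1.0 / (1.0 + math.exp(-score))


def classify_spam(num_links, threshold=0.5):
    """Classification: threshold the probability to get a discrete label."""
    if predict_spam_probability(num_links) >= threshold:
        return "spam"
    return "not spam"


price = predict_price(1000)   # a continuous value
label_few = classify_spam(1)  # a discrete category
label_many = classify_spam(5)
print(price, label_few, label_many)
```

The logistic model still computes a linear score internally, which is where the "regression" in its name comes from; the thresholding step is what turns it into a classifier.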

Explain the bias-variance trade-off. How does it guide model selection?

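Bias is error from overly simple or wrong assumptions (underfitting: the model is too simple). Variance is error from sensitivity to the particular training data (overfitting: the model is too complex). The goal is the sweet spot that generalises best to unseen data, so pick the model complexity that minimises validation error rather than training error. Regularisation accepts a little more bias in exchange for lower variance.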
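The trade-off can be seen on a toy problem. The sketch below is an illustration under assumed conditions: the synthetic data (y = 2x plus Gaussian noise), the alternating train/test split, and the three models are all choices made for this example.

```python
import random


def mse(model, xs, ys):
    """Mean squared error of a one-argument model over paired data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2x plus Gaussian noise, seeded for repeatability.
rng = random.Random(0)
xs = [i / 10 for i in range(40)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

# Alternate points go to the training and test sets.
train_x, train_y = xs[0::2], ys[0::2]
test_x, test_y = xs[1::2], ys[1::2]

# High bias: ignore x entirely and always predict the training mean.
mean_y = sum(train_y) / len(train_y)


def mean_model(x):
    return mean_y


# High variance: memorise the training set and copy the nearest label.
def nearest_neighbour_model(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


# In between: a straight line fitted by ordinary least squares.
mean_x = sum(train_x) / len(train_x)
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)
) / sum((x - mean_x) ** 2 for x in train_x)
intercept = mean_y - slope * mean_x


def linear_model(x):
    return slope * x + intercept


models = {
    "mean": mean_model,
    "nearest neighbour": nearest_neighbour_model,
    "linear": linear_model,
}
errors = {
    name: (mse(model, train_x, train_y), mse(model, test_x, test_y))
    for name, model in models.items()
}
for name, (train_err, test_err) in errors.items():
    print(f"{name:>18}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The nearest-neighbour model reaches zero training error by construction because it memorises every training point, yet its test error is not zero. The mean model is poor on both sets. Comparing train and test error in this way is how the trade-off guides model selection.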

What is cross-validation? Why is it better than a simple train-test split?

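k-fold CV splits the data into k folds and trains k times, each time holding out a different fold for validation and training on the remaining k-1 folds. Averaging performance across the folds gives a more reliable estimate than a single split, whose result depends on which rows happen to land in the test set. This matters most for small datasets.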
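The procedure can be sketched in a few lines of plain Python. The helper names `k_fold_splits` and `cross_validate` are hypothetical, and the "model" is deliberately trivial (it predicts the mean of the training targets) so that the fold mechanics stay visible. Folds are taken in order here; real pipelines usually shuffle first.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


def cross_validate(ys, k):
    """Return (mean validation MSE, per-fold MSEs) for a predict-the-mean model."""
    fold_scores = []
    for train_idx, val_idx in k_fold_splits(len(ys), k):
        prediction = sum(ys[i] for i in train_idx) / len(train_idx)
        fold_mse = sum((ys[i] - prediction) ** 2 for i in val_idx) / len(val_idx)
        fold_scores.append(fold_mse)
    return sum(fold_scores) / len(fold_scores), fold_scores


splits = list(k_fold_splits(10, 3))
mean_score, fold_scores = cross_validate([1.0, 2.0, 3.0, 4.0], k=2)
print(mean_score, fold_scores)
```

Every sample is used for validation exactly once and for training k-1 times, which is why the averaged score depends less on one lucky or unlucky split.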

What is the difference between L1 (Lasso) and L2 (Ridge) regularisation?

L1 (a penalty on the sum of absolute weights) produces sparse models by driving some weights to exactly 0, so it acts as feature selection. L2 (a penalty on the sum of squared weights) shrinks all weights towards 0 but rarely to exactly 0. Use L1 when you want feature selection and L2 for general-purpose regularisation.
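The difference in behaviour shows up clearly in a simplified setting. The sketch below assumes orthonormal features, so each weight can be shrunk independently of the others; that is an assumption made for illustration, not the general case. Under it, minimising 0.5(w - w_ols)^2 + lam|w| gives soft-thresholding (L1), and minimising 0.5(w - w_ols)^2 + 0.5 lam w^2 gives uniform scaling (L2). The starting weights and `lam` are made-up values.

```python
def l1_shrink(w_ols, lam):
    """Soft-thresholding: the L1 solution for one weight under the stated assumption."""
    if w_ols > lam:
        return w_ols - lam
    if w_ols < -lam:
        return w_ols + lam
    return 0.0  # small weights are set to exactly zero


def l2_shrink(w_ols, lam):
    """Uniform scaling: the L2 solution for one weight under the stated assumption."""
    return w_ols / (1.0 + lam)


unregularised = [3.0, -0.4, 0.2, -1.5]  # hypothetical least-squares weights
lam = 0.5

lasso = [l1_shrink(w, lam) for w in unregularised]
ridge = [l2_shrink(w, lam) for w in unregularised]
print("L1:", lasso)
print("L2:", ridge)
```

With these values, L1 zeroes the two small weights and keeps the two large ones, while L2 scales every weight down by the same factor and leaves none at zero.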

What is the confusion matrix? Define precision, recall, and F1 score.

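A confusion matrix tabulates predicted classes against actual classes; for a binary problem it holds the counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). Precision = TP/(TP+FP): of all predicted positives, how many are correct. Recall = TP/(TP+FN): of all actual positives, how many did we catch. F1 = 2 × precision × recall / (precision + recall), the harmonic mean of the two. Favour precision when false positives are costly and recall when false negatives are costly.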
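These definitions translate directly into code. The following is a small sketch in plain Python; the function names and the example labels are hypothetical, and returning 0.0 when a denominator is zero is one convention among several.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(counts, precision, recall, f1)
```

Here the counts are TP = 2, FP = 1, FN = 2, TN = 5, so precision is 2/3, recall is 1/2, and F1 is 4/7. Accuracy would be 7/10, which hides the fact that half of the actual positives were missed.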

Free tool · no sign-up · 10 seconds

Free Data Scientist Interview Question Generator

Generate AI-powered Data Scientist interview questions instantly — technical, behavioral, and situational. Calibrated for experienced-hire interviews at Indian tech companies.


How to generate Data Scientist interview questions

1. Enter your role
Type or select your target role in the question generator. You can also specify experience level and domain for more tailored output.
2. Generate questions
Click "Generate questions" to get 10 curated interview questions in under 10 seconds, with no account or sign-up needed.
3. Practice your answers
Work through each question aloud or in writing. Use the STAR method for behavioral questions and think through edge cases for technical questions.
4. Upgrade for scored mock interviews
For AI-scored practice with detailed feedback across 5 dimensions, start a full mock interview session on InterviewEra.

Sample Data Scientist interview questions

A preview from our curated question bank. The generator produces fresh, AI-tailored questions on each run.

1. What is the difference between classification and regression?
Tip: Classification predicts a discrete category (spam or not spam). Regression predicts a continuous value (a house price). Despite its name, logistic regression is a classification method, which is a common exam trap.
2. Explain the bias-variance trade-off. How does it guide model selection?
Tip: Bias is error from wrong assumptions (underfitting: the model is too simple). Variance is error from sensitivity to the training data (overfitting: the model is too complex). Aim for the sweet spot that generalises. Regularisation accepts a little more bias in exchange for lower variance.
3. What is cross-validation? Why is it better than a simple train-test split?
Tip: k-fold CV splits the data into k folds and trains k times, each time using a different fold for validation. Averaging performance across folds gives a more reliable estimate than a single split, which matters most for small datasets.
4. What is the difference between L1 (Lasso) and L2 (Ridge) regularisation?
Tip: L1 (sum of absolute weights) produces sparse models by driving some weights to exactly 0, so it acts as feature selection. L2 (sum of squared weights) shrinks all weights towards 0 but rarely to exactly 0. Use L1 for feature selection and L2 for general regularisation.
5. What is the confusion matrix? Define precision, recall, and F1 score.
Tip: The confusion matrix counts true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). Precision = TP/(TP+FP): of all predicted positives, how many are correct. Recall = TP/(TP+FN): of all actual positives, how many did we catch. F1 is the harmonic mean of precision and recall. Favour precision when false positives are costly and recall when false negatives are costly.

See all 12 curated Data Scientist questions →

Ready to practice your Data Scientist answers?

Go beyond reading questions — upload your resume and get AI-scored mock interview feedback across technical depth, communication, structure, confidence, and relevance.



