Bias-Variance Tradeoff
The Core Concept of Machine Learning — Explained for Engineering Students
Last Updated: March 2026
📌 Key Takeaways
- Bias: Error from wrong assumptions — causes underfitting. High-bias models are too simple.
- Variance: Error from over-sensitivity to training data — causes overfitting. High-variance models are too complex.
- Total Error = Bias² + Variance + Irreducible Noise
- The goal is to minimise total error — not just bias or variance individually.
- Fix high bias: Use a more complex model, add more features, reduce regularisation.
- Fix high variance: Get more data, add regularisation, use simpler model, use ensemble methods.
1. What is Bias? What is Variance?
Bias
Bias is the error introduced by approximating a real-world problem with a model that is too simple. A high-bias model makes strong, incorrect assumptions — for example, assuming the relationship is always linear when it is actually curved.
High bias leads to underfitting — the model fails to learn the underlying pattern, and its predictions are consistently off, even on training data.
Variance
Variance is the error introduced by a model that is overly sensitive to small fluctuations in the training data. A high-variance model learns not just the true pattern, but also the random noise in the training set.
High variance leads to overfitting — the model performs very well on training data but poorly on new, unseen data.
| | High Bias | High Variance |
|---|---|---|
| Training Error | High | Low |
| Test/Validation Error | High | High |
| Problem | Underfitting | Overfitting |
| Model Complexity | Too simple | Too complex |
| Example Algorithm | Linear Regression on complex data | Deep Decision Tree on small dataset |
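The error pattern in the table can be reproduced in a few lines of NumPy. The quadratic data, sample sizes, and polynomial degrees below are illustrative assumptions, not from any particular dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data: a quadratic relationship plus noise.
def make_data(n):
    x = rng.uniform(-3, 3, n)
    return x, x**2 + rng.normal(0, 1, n)

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

def mse(x, y, coeffs):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# High bias: a straight line cannot capture the curve -> high error everywhere.
linear = np.polyfit(x_train, y_train, 1)
# High variance: a degree-15 polynomial also fits the noise -> low train error,
# much higher test error.
wiggly = np.polyfit(x_train, y_train, 15)

print("linear: train", mse(x_train, y_train, linear), "test", mse(x_test, y_test, linear))
print("deg-15: train", mse(x_train, y_train, wiggly), "test", mse(x_test, y_test, wiggly))
```

The linear fit shows high error on both splits (underfitting); the degree-15 fit shows a large train-test gap (overfitting) — exactly the rows of the table.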
2. Analogy — The Archer
Imagine four archers shooting at a target:
- Low Bias, Low Variance: Shots clustered tightly around the bullseye — accurate and consistent. This is the ideal ML model.
- High Bias, Low Variance: Shots clustered tightly but all far from the bullseye — consistently wrong. This is underfitting.
- Low Bias, High Variance: Shots scattered all over but centred around the bullseye on average — inconsistent. This is overfitting.
- High Bias, High Variance: Shots scattered widely AND away from the bullseye — the worst case.
3. Error Decomposition Formula
The expected prediction error of a model can be mathematically decomposed as:
Expected Error = Bias² + Variance + Irreducible Noise
| Term | Meaning | Can We Reduce It? |
|---|---|---|
| Bias² | Squared difference between average model prediction and true value | Yes — use better model |
| Variance | How much model predictions vary across different training datasets | Yes — regularise/simplify |
| Irreducible Noise | Inherent randomness in the data itself | No — cannot be removed |
Even a perfect model cannot drive total error to zero — the irreducible noise sets a floor. The ML engineer's job is to push each of bias and variance as low as possible without unduly inflating the other.
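The decomposition can be verified numerically: draw many training sets from the same process, fit the same model to each, and compare the three terms against the measured error at one query point. The sine target, noise level, degree-3 model, and sample sizes below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

f = np.sin                 # true function (illustrative choice)
noise_sd = 0.3             # irreducible noise: variance 0.09
x0 = 1.0                   # fixed query point

preds, errors = [], []
for _ in range(2000):
    # a fresh training set from the same data-generating process each time
    x = rng.uniform(-np.pi, np.pi, 25)
    y = f(x) + rng.normal(0, noise_sd, x.size)
    coeffs = np.polyfit(x, y, 3)
    pred = float(np.polyval(coeffs, x0))
    preds.append(pred)
    # squared error against a fresh noisy observation at x0
    errors.append((pred - (f(x0) + rng.normal(0, noise_sd))) ** 2)

preds = np.array(preds)
bias_sq = (preds.mean() - f(x0)) ** 2   # (average prediction - truth)^2
variance = preds.var()                   # spread of predictions across datasets
noise = noise_sd ** 2                    # known by construction here

print(f"bias^2 + variance + noise = {bias_sq + variance + noise:.4f}")
print(f"measured expected error   = {np.mean(errors):.4f}")
```

The two printed numbers agree up to Monte Carlo error, confirming Expected Error = Bias² + Variance + Irreducible Noise.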
4. Connection to Underfitting & Overfitting
Underfitting (High Bias)
Occurs when the model is not complex enough to capture the data structure. Signs: high training error AND high test error; adding more data does not help; predictions are consistently off in the same direction.
Example: Fitting a straight line through data that follows a quadratic curve.
Overfitting (High Variance)
Occurs when the model is too complex and learns the noise in training data. Signs: very low training error BUT high test error; large gap between training and validation performance.
Example: A degree-15 polynomial that perfectly passes through all 10 training points but oscillates wildly between them.
5. The Tradeoff — Why You Cannot Minimise Both at Once
As model complexity increases, bias decreases and variance increases. There is an optimal point of complexity that minimises total error. This is the core tension — the bias-variance tradeoff.
- Simple models (high bias, low variance): Linear Regression, Naive Bayes, Linear SVM
- Complex models (low bias, high variance): Deep Decision Trees, K-Nearest Neighbours (K=1), large Neural Networks without regularisation
- Balanced models: Random Forest, Gradient Boosting, Ridge/Lasso Regression
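One practical way to locate the optimal complexity is a sweep: fit models of increasing complexity and keep the one with the lowest validation error. A minimal sketch using polynomial degree as the complexity knob (the cubic data and the split sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data from a cubic with noise.
x = rng.uniform(-2, 2, 60)
y = x**3 - x + rng.normal(0, 0.5, x.size)

# Simple split: first 40 points for training, last 20 for validation.
x_tr, y_tr = x[:40], y[:40]
x_va, y_va = x[40:], y[40:]

val_err = {}
for deg in range(1, 13):
    coeffs = np.polyfit(x_tr, y_tr, deg)
    val_err[deg] = float(np.mean((np.polyval(coeffs, x_va) - y_va) ** 2))

best = min(val_err, key=val_err.get)
print("validation MSE by degree:", {d: round(e, 3) for d, e in val_err.items()})
print("chosen degree:", best)
```

Validation error traces the classic U-shape: it falls as bias drops, then rises again as variance takes over; the minimum marks the sweet spot.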
6. How to Fix High Bias and High Variance
To Fix High Bias (Underfitting):
- Use a more complex model (switch from linear to polynomial, or add more layers)
- Add more relevant features (feature engineering)
- Reduce regularisation strength (lower λ in Ridge/Lasso)
- Train for more epochs (for neural networks)
- Ensure data quality — noisy labels create artificial bias
To Fix High Variance (Overfitting):
- Get more training data — the most reliable fix
- Add regularisation (L1/Lasso, L2/Ridge, Dropout for neural networks)
- Use a simpler model (reduce depth of decision tree, fewer layers in NN)
- Use ensemble methods (Random Forest averages many high-variance trees)
- Apply cross-validation to better estimate true model performance
- Use early stopping in neural network training
7. Common Mistakes Students Make
- Thinking high training accuracy means a good model: A model can memorise training data and still fail completely on new inputs. Always evaluate on a held-out test set.
- Only focusing on reducing bias: Students often add complexity until training error drops, without checking if test error also drops.
- Confusing irreducible noise with bias: Even the best model will have some error from the data itself — this is not a model flaw.
- Not using validation sets: Without a separate validation set, you cannot detect overfitting during training.
8. Frequently Asked Questions
What is the ideal bias-variance balance?
The ideal balance is the model complexity that minimises total generalisation error. In practice, train multiple models of different complexities, evaluate each on a validation set, and choose the one with the lowest validation error.
Does more data reduce bias or variance?
More data primarily reduces variance. It does not significantly reduce bias. If a model is underfitting, adding more data will not fix it — you need a better model.
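This can be seen directly by fitting the (deliberately too simple) straight line to quadratic data at two sample sizes: the fitted coefficients stabilise as n grows, but the systematic misfit stays. The data-generating details are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Wrong model family on purpose: a line fitted to quadratic data.
def fit_line(n):
    x = rng.uniform(-2, 2, n)
    y = x**2 + rng.normal(0, 0.5, n)
    return np.polyfit(x, y, 1)  # returns [slope, intercept]

results = {}
for n in (20, 2000):
    results[n] = np.array([fit_line(n) for _ in range(300)])
    print(f"n={n:>4}: slope std (variance) = {results[n][:, 0].std():.3f}, "
          f"mean intercept = {results[n][:, 1].mean():.3f}")
```

The slope's spread shrinks roughly tenfold from n=20 to n=2000 (variance down), while the average fitted line converges to the flat line y ≈ 4/3 — the best any straight line can do on this curve. The bias of the wrong model family does not budge, no matter how much data arrives.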
How does regularisation affect bias and variance?
Regularisation reduces variance by penalising model complexity. However, too much regularisation increases bias. The regularisation strength (λ) is a hyperparameter that controls this balance.
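The λ knob can be watched doing exactly this. Below, a degree-8 polynomial ridge regression (closed form, NumPy only) is refit on many fresh training sets at three λ values, tracking bias² and variance at one query point. The target function, noise level, and λ grid are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def features(x, deg=8):
    # polynomial feature map: [1, x, x^2, ..., x^deg]
    return np.vander(x, deg + 1, increasing=True)

def ridge(X, y, lam):
    # closed-form ridge solution: w = (X^T X + lam*I)^(-1) X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

x0 = 0.5
phi0 = features(np.array([x0]))[0]
true_y = np.sin(3 * x0)

stats = {}
for lam in (1e-6, 1.0, 100.0):
    preds = []
    for _ in range(400):
        x = rng.uniform(-1, 1, 30)
        y = np.sin(3 * x) + rng.normal(0, 0.3, 30)
        preds.append(float(phi0 @ ridge(features(x), y, lam)))
    preds = np.array(preds)
    stats[lam] = ((preds.mean() - true_y) ** 2, preds.var())
    print(f"lambda={lam:>7}: bias^2={stats[lam][0]:.4f}, variance={stats[lam][1]:.4f}")
```

As λ grows, variance collapses (the heavily shrunk weights barely react to the particular training sample) while bias² climbs (predictions are pulled towards zero, away from the true value) — the tradeoff made visible in two columns of numbers.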