Upskilling Made Easy.
Feature Scaling in Machine Learning: Min-Max Scaling, Standardization, and Robust Scaling
Published 14 May 2025
Feature scaling is a critical preprocessing step in machine learning that transforms features into a common scale, ensuring that no single feature dominates others due to its numeric range. Properly scaled features improve the performance of many algorithms, particularly those that rely on distance calculations, such as k-nearest neighbors (KNN) and support vector machines (SVM). This blog explores three popular feature scaling techniques: Min-Max Scaling, Standardization, and Robust Scaling.
Min-Max scaling is a technique that scales the features of a dataset to a fixed range, typically [0, 1]. This transformation is useful when you want to preserve the relationships between the original data values while bringing all features onto a uniform scale.
The Min-Max scaling formula is defined as:
X_scaled = (X - min(X)) / (max(X) - min(X))
Where:
- X is the original feature value
- min(X) and max(X) are the minimum and maximum values of the feature
Consider a dataset with feature values ranging from 50 to 200. After applying Min-Max scaling to the [0, 1] range, a value of 100 would be transformed to (100 - 50) / (200 - 50) = 50 / 150 ≈ 0.33.
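The transformation above can be sketched in a few lines of NumPy (the helper name `min_max_scale` is illustrative, not a library function):

```python
import numpy as np

def min_max_scale(x):
    """Scale values linearly to the [0, 1] range."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

values = np.array([50.0, 100.0, 200.0])
scaled = min_max_scale(values)
# 50 maps to 0.0, 200 maps to 1.0, and 100 maps to (100 - 50) / (200 - 50) ≈ 0.33
print(scaled)
```

In practice, scikit-learn's `MinMaxScaler` provides the same transformation with a fit/transform interface that remembers the training-set min and max for later use on test data.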
Standardization transforms the data so that it has a mean of 0 and a standard deviation of 1; if a feature follows a Gaussian (normal) distribution, its standardized values follow a standard normal distribution. This method is particularly beneficial when the features are approximately normally distributed.
The standardization formula is given by:
X_standardized = (X - mu) / sd
Where:
- X is the original feature value
- mu is the mean of the feature
- sd is the standard deviation of the feature
For a dataset with features that have a mean of 100 and a standard deviation of 20, a value of 120 would be standardized as follows:
X_standardized = (120 - 100) / 20 = 20 / 20 = 1
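A minimal sketch of this computation (the helper name `standardize` and its optional parameters are illustrative; by default it estimates the mean and standard deviation from the data):

```python
import numpy as np

def standardize(x, mu=None, sigma=None):
    """Transform values to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    mu = x.mean() if mu is None else mu
    sigma = x.std() if sigma is None else sigma
    return (x - mu) / sigma

# Reproduce the worked example: mean 100, standard deviation 20, value 120.
z = float(standardize(120, mu=100, sigma=20))
print(z)  # 1.0
```

scikit-learn's `StandardScaler` implements the same idea, fitting the mean and standard deviation on the training set so the identical transformation can be applied to new data.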
Robust scaling uses statistical measures that are robust to outliers, specifically the median and the interquartile range (IQR). It scales the data based on the central tendency and spread, making it an effective choice for datasets containing significant outliers.
The robust scaling formula is defined as:
X_scaled = (X - median) / IQR
Where:
- X is the original feature value
- median is the median of the feature
- IQR is the interquartile range of the feature (Q3 - Q1)
If a dataset has a median of 50 and an IQR of 40 (e.g., Q1 = 30, Q3 = 70), a value of 70 would be scaled as follows:
X_scaled = (70 - 50) / 40 = 20 / 40 = 0.5
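A sketch of robust scaling (the helper name `robust_scale` and its optional parameters are illustrative; by default the median and IQR are estimated from the data):

```python
import numpy as np

def robust_scale(x, median=None, iqr=None):
    """Scale values using the median and interquartile range (IQR)."""
    x = np.asarray(x, dtype=float)
    if median is None:
        median = np.median(x)
    if iqr is None:
        q1, q3 = np.percentile(x, [25, 75])
        iqr = q3 - q1
    return (x - median) / iqr

# Worked example from the text: median 50, IQR 40 (Q1 = 30, Q3 = 70), value 70.
r = float(robust_scale(70, median=50, iqr=40))
print(r)  # 0.5
```

Because the median and IQR are largely unaffected by extreme values, outliers do not distort the scale the way they do with Min-Max scaling; scikit-learn's `RobustScaler` offers the same transformation.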
Feature scaling is a vital preprocessing step that ensures all features contribute equally to the analysis and modeling process. Each scaling method—Min-Max scaling, standardization, and robust scaling—serves a different purpose and is suitable in different contexts depending on the nature of the dataset and the algorithms used. By selecting the appropriate feature scaling technique, you can enhance model performance and improve the outcomes of your machine learning tasks.
Happy scaling!