Evaluation Metrics in Regression: RMSE, MSE, MAE, R², and Adjusted R²
Published 12 May 2025
Multiple Linear Regression (MLR) is a statistical method used for modeling the relationship between one dependent variable and two or more independent variables. It is an extension of simple linear regression, where only one independent variable is considered. MLR enables us to assess how multiple factors impact a response variable, making it a powerful tool in data analysis, prediction, and decision-making.
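In MLR, the relationship between the dependent variable $Y$ and the independent variables $X_1, X_2, \dots, X_n$ is described by the following equation:
$$Y = c + m_1 X_1 + m_2 X_2 + \dots + m_n X_n + \epsilon$$
Where:
$Y$: the dependent (response) variable.
$X_1, X_2, \dots, X_n$: the independent variables.
$c$: the intercept, the expected value of $Y$ when every independent variable is zero.
$m_1, m_2, \dots, m_n$: the coefficients, where $m_i$ is the expected change in $Y$ for a one-unit increase in $X_i$ with the other variables held fixed.
$\epsilon$: the error term, the part of $Y$ that the linear terms do not explain.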
For MLR to be valid, certain assumptions must be satisfied:
Let’s consider a practical example where we want to predict house prices based on several independent variables: size of the house (in square feet), number of bedrooms, and age of the house.
Size (sq ft) | Bedrooms | Age (years) | Price ($) |
---|---|---|---|
1500 | 3 | 10 | 300,000 |
1600 | 3 | 15 | 320,000 |
1700 | 4 | 5 | 350,000 |
1800 | 4 | 20 | 280,000 |
2000 | 4 | 12 | 400,000 |
In this scenario, the dependent variable ($Y$) is the Price of the house. The independent variables ($X_1, X_2, X_3$) are:
$X_1$: Size of the house (in square feet)
$X_2$: Number of bedrooms
$X_3$: Age of the house (in years)
A possible MLR equation could be formed as:
$$\text{Price} = c + m_1 \cdot \text{Size} + m_2 \cdot \text{Bedrooms} + m_3 \cdot \text{Age}$$
By employing statistical software or programming libraries (like Python’s `statsmodels` or `sklearn`), you can fit your model to the data and estimate the intercept $c$ and the coefficients $m_1, m_2, m_3$.
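As a rough sketch of the fitting step, the snippet below estimates the intercept and coefficients from the five rows of the table by ordinary least squares. It uses NumPy's `numpy.linalg.lstsq` rather than `statsmodels` or `sklearn`, purely to keep the dependencies small; the helper name `fit_mlr` and the variable names are illustrative assumptions, not part of any library.

```python
import numpy as np

# Data from the table above: size (sq ft), bedrooms, age (years).
features = np.array([
    [1500, 3, 10],
    [1600, 3, 15],
    [1700, 4, 5],
    [1800, 4, 20],
    [2000, 4, 12],
], dtype=float)
prices = np.array([300_000, 320_000, 350_000, 280_000, 400_000], dtype=float)


def fit_mlr(X, y):
    """Return (intercept, coefficients) from an ordinary least-squares fit.

    Hypothetical helper for illustration only.
    """
    # Prepend a column of ones so the first solved parameter is the intercept c.
    design = np.column_stack([np.ones(len(X)), X])
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params[0], params[1:]


intercept, coefs = fit_mlr(features, prices)
fitted = intercept + features @ coefs   # model predictions for the five rows
residuals = prices - fitted             # observed minus predicted

print("c =", intercept)
print("m1, m2, m3 =", coefs)
print("residuals =", residuals)
```

With only five observations and four parameters, the estimates are very sensitive to each row, so they should not be expected to match the illustrative values assumed in the text.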
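Assume that after running the regression analysis, we get:
$c$ = 150,000 (intercept)
$m_1$ = 100 (per square foot)
$m_2$ = 20,000 (per bedroom)
$m_3$ = -5,000 (per year of age)
The regression equation would then be:
$$\text{Price} = 150{,}000 + 100 \cdot \text{Size} + 20{,}000 \cdot \text{Bedrooms} - 5{,}000 \cdot \text{Age}$$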
To predict the price of a house that is 1,800 sq ft with 3 bedrooms and 15 years old:
$$\text{Price} = 150{,}000 + 100 \cdot 1800 + 20{,}000 \cdot 3 - 5{,}000 \cdot 15$$
Calculating it gives:
$$\text{Price} = 150{,}000 + 180{,}000 + 60{,}000 - 75{,}000 = 315{,}000$$
This means that the predicted price for a house that is 1,800 square feet in size, has 3 bedrooms, and is 15 years old would be approximately $315,000.
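The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using the illustrative coefficients assumed in the text; the function name `predict` and the variable names are hypothetical.

```python
def predict(intercept, coefficients, features):
    """Evaluate c + m1*x1 + ... + mn*xn for a single observation."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return intercept + sum(m * x for m, x in zip(coefficients, features))


# Illustrative coefficients assumed in the text (not fitted from the table).
intercept = 150_000
coefficients = [100, 20_000, -5_000]   # size, bedrooms, age
house = [1800, 3, 15]                  # 1,800 sq ft, 3 bedrooms, 15 years old

# Contribution of each term, matching the intermediate step of the calculation.
contributions = [m * x for m, x in zip(coefficients, house)]
predicted_price = predict(intercept, coefficients, house)

print(contributions)     # [180000, 60000, -75000]
print(predicted_price)   # 315000
```

Listing the per-term contributions makes it easy to see which variable moves the prediction most: here size adds 180,000 while age subtracts 75,000.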
Multiple linear regression is particularly valuable for several reasons:
Understanding Relationships: With MLR, you can identify and quantify the relationships between a dependent variable and multiple independent variables, providing insights into how different factors influence outcomes.
Predictive Power: MLR allows for effective predictions of the dependent variable based on the input features, making it a staple for forecasting in various fields like economics, real estate, and healthcare.
Adaptability: MLR can be applied to a wide range of problems by changing which independent variables are included, which makes it a flexible tool for data analysis.
Multiple linear regression is a vital statistical tool that enables analysts and researchers to understand complex relationships within data. By fitting a model to input variables, one can quantify how each factor relates to the outcome and make predictions for new inputs. Understanding the underlying principles and assumptions of MLR will help you apply this technique effectively across numerous domains and support informed, data-driven decision-making.
Happy analyzing!