Week 4 – Monday: Assessing the Performance of a Linear Regression Model & Uncovering the Power of R-squared and Training Error Using 5-Fold Cross-Validation

In my last blog post, I implemented 5-fold cross-validation and obtained the scores for the linear regression model used in my analysis.

In this post, I fit the model to the entire data set to compute the training error and plotted the training data on a scatter plot. To see how the prediction error behaves, I also plotted the Mean Squared Error (MSE) scores across the 5 folds. The scatter plot of the training data (actual vs. predicted) shows that the points are spread unevenly, so the R-squared value becomes an important quantity to calculate. The plot of MSE scores across the 5 folds has a roughly parabolic shape. I take this as a tentative sign that the linear regression model fits the folds reasonably consistently, although five points are too few to draw a firm conclusion from the shape alone.
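The two quantities described above, the per-fold MSE and the training error on the full data set, can be sketched as follows. This is a minimal illustration, not the code from my notebook: the data are synthetic, the helper names (`fit_linear`, `predict`, `mse`, `kfold_mse`) are hypothetical, and the model is fitted with plain NumPy least squares.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to new inputs."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def mse(y_true, y_pred):
    """Mean squared error between actual and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

def kfold_mse(X, y, k=5, seed=0):
    """Return the held-out MSE of each of the k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = fit_linear(X[train_idx], y[train_idx])
        scores.append(mse(y[test_idx], predict(coef, X[test_idx])))
    return scores

# Synthetic data (an assumption for illustration, not the project data)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.8, size=200)

cv_scores = kfold_mse(X, y, k=5)

# Training error: fit on the entire data set and score on the same data
training_error = mse(y, predict(fit_linear(X, y), X))

print("MSE per fold:", cv_scores)
print("Mean CV MSE:", np.mean(cv_scores))
print("Training error:", training_error)
```

The training error is measured on the same data the model was fitted to, so it tends to be somewhat lower than the average held-out MSE from cross-validation.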

Next, I calculated R-squared to check how well the model predicts. The R-squared value gauges how well a linear regression model “fits” a dataset. R-squared, also known as the coefficient of determination, measures the proportion of the variance in the response variable that can be accounted for by the predictor variables. For a regression model to be regarded as reliable in some scientific studies, the R-squared may need to be higher than 0.95. In other domains, where the dataset exhibits extreme variability, an R-squared of only 0.3 may be sufficient.
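As a small illustration of the definition above, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the toy numbers below are my own assumptions for the sketch, not values from the project.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return float(1.0 - ss_res / ss_tot)

# Toy values (hypothetical)
y_true = [3.0, 5.0, 7.0, 9.0]

perfect = r_squared(y_true, y_true)                  # predictions match exactly
baseline = r_squared(y_true, [6.0, 6.0, 6.0, 6.0])   # always predict the mean
partial = r_squared(y_true, [4.0, 4.0, 8.0, 8.0])    # close but imperfect

print(perfect, baseline, partial)
```

A perfect prediction gives an R-squared of 1, always predicting the mean gives 0, and the imperfect prediction here lands at about 0.8, since it leaves 4 of the 20 units of total squared variation unexplained.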

Finally, I obtained a Mean Squared Error of 0.5964346993680371, a Training Error of 0.5831964900110125, a Test Error of 0.632505457320762 and an R-squared value of 0.36007498977268937. The Test Error is higher than the Training Error, which is expected because the test data were not seen during fitting. The gap of about 0.05 is small, which suggests that the model generalises to new data reasonably well. The R-squared value means the model explains about 36% of the variance in the response, which I consider acceptable for the linear regression model in this analysis.
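The comparison between the training and test errors can be made concrete with a little arithmetic on the reported values. The variable names below are only for illustration.

```python
# Values reported above
training_error = 0.5831964900110125
test_error = 0.632505457320762
r2 = 0.36007498977268937

# Absolute and relative gap between test and training error
gap = test_error - training_error
relative_gap = gap / training_error

print(f"Gap: {gap:.4f}")
print(f"Relative gap: {relative_gap:.1%}")
print(f"Variance explained: {r2:.1%}")
```

The test error is roughly 8 to 9 percent higher than the training error, which is the size of the gap behind the interpretation above.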

Project MTH522 - Jupyter Notebook_2
