BOSTON’S ECONOMIC DIARY: Tracing the Contours of Growth and Challenge: A Detailed Account from 2013-2019

In “Boston’s Economic Diary,” we embark on a comprehensive journey through the economic heartbeat of Boston from 2013 to 2019. This detailed study uses a range of data to paint a vivid picture of the city’s economic growth and challenges during this period. Read More…

Week 13 Monday – Analysis and Visualization on ACF and PACF to build the pre-model for AR and MA

Working on time series analysis of the economic indicators dataset, I have tentatively uncovered patterns in the Logan Passengers and Logan International Flights variables. So far in my analysis, I have computed the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF). Autocorrelation Function (ACF): The correlation between a time series and its lagged values, or earlier data, Read More…
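To make the ACF definition above concrete, here is a minimal sketch in plain Python. It is not the code from the post: the `acf` helper and the `passengers` series are hypothetical, and the numbers are made up with a repeating four-step pattern rather than taken from the Logan data.

```python
def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    # Lag-0 sum of squares, used to normalise every lag.
    total = sum(d * d for d in deviations)
    result = []
    for lag in range(max_lag + 1):
        # Sum of products between each value and the value `lag` steps earlier.
        cov = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        result.append(cov / total)
    return result


# Hypothetical series with a repeating 4-step pattern (not real Logan data).
passengers = [10, 12, 15, 11] * 6
values = acf(passengers, 8)
for lag, value in enumerate(values):
    print(f"lag {lag}: {value:+.3f}")
```

On a series like this, the autocorrelation is highest at lags that are multiples of the pattern length, which is the kind of signal an ACF plot is used to spot before choosing AR and MA orders.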

Week 12 Monday – Time Series approach for the analysis and predictive model

Going deeper into time series analysis, I have worked through topics such as autocorrelation, forecasting, and cyclical analysis. While considering regression analysis, I concluded that conventional linear regression may not be the best option for time series research, particularly if the data shows seasonality, trends, Read More…

BENEATH THE BADGE: An Insightful Exploration of Police Shooting Incidents in the USA

In “Beneath the Badge,” we delve deep into the critical and often contentious subject of police shooting incidents in the United States. This comprehensive study leverages detailed data analysis to uncover the underlying patterns and key insights into these incidents. Key Highlights of the Analysis: Demographic Trends: A closer look at the age profiles of Read More…

Week 5 Friday – Connecting the Dots: Understanding the Correlation Between Variables in Police Shootings Data Sets

In this blog, I mainly discuss the different variables, and the correlations between them, in the dataset provided by the Washington Post. There are two datasets with information about fatal police shootings. The second dataset provides information about the officers, including which of them are named for an Read More…

Week 4 – Monday: Assessing the Performance of Linear Regression Model & Uncovering the Power of R-squared and Training Error by using 5-Fold Cross-Validations

In my last blog, I implemented 5-fold cross-validation and obtained the scores for the linear regression model considered in my analysis. In this blog, I have computed the training error by fitting the model to the entire data set and plotted the training data on a scatter plot. To check Read More…
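The two quantities mentioned above, 5-fold cross-validation scores and the training error on the full data set, can be sketched as follows. This is an illustrative sketch only, assuming a one-predictor linear model and mean squared error as the score; the helper names and the synthetic `inactivity` and `diabetes` values are hypothetical and are not the CDC data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_mse(xs, ys, k=5, seed=0):
    """Return one held-out MSE per fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse([xs[i] for i in held_out], [ys[i] for i in held_out],
                          slope, intercept))
    return scores


# Synthetic stand-in data (assumption): a linear relation plus Gaussian noise.
rng = random.Random(42)
inactivity = [rng.uniform(10, 30) for _ in range(100)]
diabetes = [0.25 * x + 3 + rng.gauss(0, 0.5) for x in inactivity]

cv_scores = k_fold_mse(inactivity, diabetes, k=5)
slope, intercept = fit_line(inactivity, diabetes)
training_error = mse(inactivity, diabetes, slope, intercept)

print("held-out MSE per fold:", [round(s, 4) for s in cv_scores])
print("mean CV error:", round(sum(cv_scores) / len(cv_scores), 4))
print("training error (full fit):", round(training_error, 4))
```

The training error is measured on the same points the model was fitted to, so it tends to be optimistic compared with the held-out fold errors.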

Week 3 – Wednesday: Assessing Model Performance: Cross-Validation, Training Error, and Test Error in CDC Data Analysis

So far in the project, I have completed the Breusch-Pagan test and found no significant evidence of heteroskedasticity in the data. In today’s blog, I will talk about cross-validation, training error, and test error. An essential tool in data analysis and machine learning, cross-validation is used to Read More…

Week 3 – Monday: Assessing Heteroskedasticity in Regression Analysis for the CDC dataset

This blog mainly covers heteroskedasticity in statistical analysis. While analysing the data for three variables, % Diabetic, % Obesity, and % Inactivity, joined on their common factor (FIPS), I came to the conclusion that there is significant evidence of heteroskedasticity in the statistical analysis of the given data Read More…

Week 2 – Friday: Analysing Heteroskedasticity and Predictive Linear Models for Diabetes, Obesity, and Inactivity

So far in this blog, I have built two predictive linear models, %Diabetic-%Obesity and %Diabetic-%Inactivity, and analysed them by plotting histograms. Now, I have performed the regression analysis and the Breusch-Pagan test on the updated Excel sheet, which lets us run the statistical test on our data frames. To determine Read More…
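As a rough illustration of what the Breusch-Pagan test does, the sketch below implements the studentized (Koenker) form for a single predictor: fit the line, regress the squared residuals on the predictor, and compare LM = n·R² against a chi-square distribution with one degree of freedom. It is an assumption-laden sketch, not the post's code: the function names are hypothetical and the data is synthetic, generated so that the noise grows with x.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def breusch_pagan(xs, ys):
    """Return (LM statistic, p-value) for one predictor, Koenker form."""
    n = len(xs)
    slope, intercept = fit_line(xs, ys)
    sq_resid = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    # Auxiliary regression: squared residuals on the predictor.
    a_slope, a_intercept = fit_line(xs, sq_resid)
    mean_sq = sum(sq_resid) / n
    ss_tot = sum((r - mean_sq) ** 2 for r in sq_resid)
    if ss_tot == 0:
        return 0.0, 1.0  # constant squared residuals: nothing to explain
    ss_res = sum((r - (a_slope * x + a_intercept)) ** 2
                 for x, r in zip(xs, sq_resid))
    lm = max(n * (1 - ss_res / ss_tot), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Synthetic data (assumption): noise standard deviation proportional to x.
rng = random.Random(7)
x = [rng.uniform(10, 30) for _ in range(500)]
y_hetero = [0.25 * v + 3 + rng.gauss(0, 0.1 * v) for v in x]
y_homo = [0.25 * v + 3 + rng.gauss(0, 1.0) for v in x]

lm_hetero, p_hetero = breusch_pagan(x, y_hetero)
lm_homo, p_homo = breusch_pagan(x, y_homo)
print(f"noise grows with x: LM={lm_hetero:.2f}, p={p_hetero:.4g}")
print(f"constant noise:     LM={lm_homo:.2f}, p={p_homo:.4g}")
```

A small p-value is evidence against constant residual variance; a large one means the test found no significant evidence of heteroskedasticity.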

Week 2 – Monday: New excel sheet with the common factor FIPS and comparing %Obese and %Inactive by scatter plot

In today’s blog, I will report my progress on the CDC data analysis. I have performed the correlation analysis between the three data frames: Diabetes, Obesity, and Inactivity. Analysing the datasets through their common factor, the FIPS code, revealed some useful information. I wrote a Read More…

Week 1 – Friday

Today’s work was all about analysing %Obesity and %Inactivity in the CDC datasets. I went through the same concepts to understand the common factors related to linear regression and the behaviour of the plots for both datasets, obesity and inactivity. Similarly, I have written code to find out Read More…

Week 1 – Wednesday

Since we discussed the CDC datasets in class today, my project work has moved on to analysing data from different counties in the U.S. and the correlation between %diabetes and %inactivity. Moreover, for further analysis of the data and to understand the residual and scatter plots, I have Read More…

Week 1 Monday

In today’s class, I learned new topics related to linear regression and some concepts related to the Centers for Disease Control and Prevention (CDC) data sets, such as skewness, kurtosis, and heteroskedasticity. Though linear regression is an important tool in data analysis and statistics which can be used Read More…