
Multiple Linear Regression Analysis And Visualization


Dataset

The mtcars data set is a built-in data set in R that describes 32 car models (1973-74 models). The data were extracted from the 1974 Motor Trend US magazine and include the following variables:

mpg: Miles per gallon (response variable)

cyl: Number of cylinders in the engine

disp: Engine displacement (in cubic inches)

hp: Engine horsepower

drat: Rear axle ratio

wt: Weight (in 1000 lbs)

qsec: 1/4 mile time (in seconds)

vs: Engine shape (0 = V-shaped, 1 = straight)

am: Transmission (0 = automatic, 1 = manual)

gear: Number of forward gears

carb: Number of carburetors

This data set is commonly used as an example in statistics and data analysis tutorials and courses, as it is small and easy to work with, yet still contains a variety of variables that can be used to demonstrate different techniques and analyses.

The first line of code uses lm() to fit a multiple linear regression model to the mtcars data set, with mpg as the response variable and disp, hp, and drat as the predictor variables.

The second line of code displays a summary of the model, including the coefficient estimates, their standard errors and p-values, the residual standard error, R-squared, and the overall F-statistic.
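The two lines described above can be sketched as follows. The formula is taken from the Call line of the output; the object name model is an assumption, since the original listing is not shown here.

```r
# mtcars ships with base R, so no package needs to be loaded for this step.

# Line 1: regress mpg on displacement, horsepower, and rear axle ratio.
# `model` is an assumed name for the fitted object.
model <- lm(mpg ~ disp + hp + drat, data = mtcars)

# Line 2: print the coefficient table, residual quantiles, and fit statistics.
summary(model)
```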

Call:
lm(formula = mpg ~ disp + hp + drat, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.1225 -1.8454 -0.4456  1.1342  6.4958 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 19.344293   6.370882   3.036  0.00513 **
disp        -0.019232   0.009371  -2.052  0.04960 * 
hp          -0.031229   0.013345  -2.340  0.02663 * 
drat         2.714975   1.487366   1.825  0.07863 . 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.008 on 28 degrees of freedom
Multiple R-squared:  0.775,     Adjusted R-squared:  0.7509 
F-statistic: 32.15 on 3 and 28 DF,  p-value: 3.28e-09
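As a side check on the output above, and separate from the four lines of code this post walks through, the same numbers can be pulled out of the summary object programmatically. The sketch below refits the model under the assumed name fit and recomputes the adjusted R-squared from the ordinary R-squared using 1 - (1 - R²)(n - 1)/(n - p - 1), with n = 32 cars and p = 3 predictors.

```r
# Refit the model so this block is self-contained; `fit` is an assumed name.
fit <- lm(mpg ~ disp + hp + drat, data = mtcars)
s <- summary(fit)

# Coefficient table as a numeric matrix
# (columns: Estimate, Std. Error, t value, Pr(>|t|)).
coef(s)

# Recompute adjusted R-squared by hand from R-squared.
n <- nrow(mtcars)             # number of observations
p <- length(coef(fit)) - 1    # number of predictors, excluding the intercept
adj <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)

# Should agree with the value reported by summary().
all.equal(adj, s$adj.r.squared)
```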


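The third line of code loads the car package (Companion to Applied Regression), which provides functions for regression diagnostics and diagnostic plots.

The fourth line of code produces added-variable plots (also known as partial regression plots), one for each predictor. Each plot shows the relationship between the response variable and one predictor after both have been adjusted for the other predictors in the model, so the slope of the line in each panel equals that predictor's coefficient in the full model. These should not be confused with component-plus-residual (partial residual) plots, which are a different diagnostic.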
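A minimal sketch of these last two lines is given below, assuming the plots are drawn with avPlots() from car and that the fitted object is named model; neither detail is confirmed by the original listing. The second half rebuilds the disp panel by hand in base R to show what an added-variable plot actually plots.

```r
# install.packages("car")  # one-time install if the package is missing
library(car)

# Refit the model so this block is self-contained; `model` is an assumed name.
model <- lm(mpg ~ disp + hp + drat, data = mtcars)

# One added-variable panel per predictor (disp, hp, drat).
avPlots(model)

# By hand for disp: remove the effect of hp and drat from both mpg and disp,
# then regress the mpg residuals on the disp residuals.
res_y <- resid(lm(mpg ~ hp + drat, data = mtcars))
res_x <- resid(lm(disp ~ hp + drat, data = mtcars))
av_slope <- coef(lm(res_y ~ res_x))[["res_x"]]

# The slope matches the disp coefficient from the full model.
all.equal(av_slope, coef(model)[["disp"]])
```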


 
