Statistics

# Multiple Linear Regression in R

This slideshow requires JavaScript.

In the previous exercise: Why do we need N-2?, I show a simple 1 dimensional regression by hand, which is followed by an examination of sample standard errors.  Below I make more extensive use of R (and an additional package) to plot what linear regression looks like in multiple dimensions. This generates the images above, (along with several others).  This illustrates that linear regression remains flat even in N dimensions, the surface of the regression is linear in coefficients.

As a class exercise, I ask that you consider different pairs dependent variables that are  functions of one another. What happens if the function is linear? What happens if the function is nonlinear, for example, $cos(x_1)=x_2$? Examine what happens to the surface of your regression as compared to the shape of the relationship you are investigating.  Is there a way you can contort the regression estimate into a curved surface to better match?  Why or why not?

install.packages(“plot3D”) # we need 3d plotting
library(“plot3D”, lib.loc=”~/R/win-library/3.1″) #Load it into R’s current library, may vary by computer.

set.seed(2343) #ensures replicatation. Sets seed of random number generators.
n<-25 #number of samples
x_1<-rnorm(n) #Our x’s come from a random sampling of X’s.
x_2<-rnorm(n)
b_0<-10
b_1<-3 #Those cursed jello puddings are associated with increased crime. Linear regression is supportive of association- not causation.
b_2<-(-3) # But student transit programs are associated with a decline in crime.
u<-rnorm(n)
y<-b_0+b_1*x_1+b_2*x_2+u #This is defining our true Y. The true relationship is linear.

#look at data in each dimension
plot(x_1,y)
plot(x_2,y)
#look at data overall
points3D(x_1,x_2,y,xlab=”x_1″,ylab=”x_2″,zlab=”y”,phi=5) #look at data. phi/theta is tilt.

fit<-lm(y~x_1+x_2)  #fit it with a linear model, regressing y on x_1, x_2

#Make a surface
x_1.pred <- seq(min(x_1), max(x_1), length.out = n)
x_2.pred <- seq(min(x_2), max(x_2), length.out = n)
xy <- expand.grid(x_1=x_1.pred, x_2=x_2.pred)
y.pred <- matrix (nrow = n, ncol = n, data = predict(fit, newdata = data.frame(xy), interval = “prediction”))

summary(fit) #view output of variables.

fitpoints<-predict(fit)  #get predicted points, needed to make a surface.

scatter3D(x_1,x_2,y,xlab=”x_1″,ylab=”x_2″,zlab=”y”,phi=5 , surf=list(x = x_1.pred, y = x_2.pred, z = y.pred, facets = NA, fit = fitpoints)) #look at data. phi/theta is tilt.
scatter3D(x_1,x_2,y,xlab=”x_1″,ylab=”x_2″,zlab=”y”,phi=45, surf=list(x = x_1.pred, y = x_2.pred, z = y.pred, facets = NA, fit = fitpoints)) #From straight on it is a flat plane, residuals are highlighted
scatter3D(x_1,x_2,y,xlab=”x_1″,ylab=”x_2″,zlab=”y”,phi=30, surf=list(x = x_1.pred, y = x_2.pred, z = y.pred, facets = NA, fit = fitpoints)) #From other angles it is clear it is somewhat straight.
scatter3D(x_1,x_2,y,xlab=”x_1″,ylab=”x_2″,zlab=”y”,phi=60, surf=list(x = x_1.pred, y = x_2.pred, z = y.pred, facets = NA, fit = fitpoints)) #look at data. phi/theta is tilt.