R Loop for Variable Names to run linear regression model

Stick

First off, I am pretty new to this, so my method or thinking may be wrong. I have imported an xlsx data set into a data frame using R and RStudio. I want to loop through the column names to get all of the variables with exactly "10" in them, and then run a simple linear regression on each one. Here's my code:

indx <- grepl('_10_', colnames(data)) # logical vector: TRUE where the column name contains "_10_"
col10 <- names(data[indx]) # this gives me the names of the columns I want

Here is the for loop I have which returns an error:

temp <- c()
for(i in 1:length(col10)){
   temp = col10[[i]]
  lm.test <- lm(Total_Transactions ~ temp[[i]], data = data)
  print(temp) #actually prints out the right column names
  i + 1
}

Is it even possible to run a loop to place those variables in the linear regression model? The error I am getting is: "Error in model.frame.default(formula = Total_Transactions ~ temp[[i]], : variable lengths differ (found for 'temp[[i]]')". If anyone could point me in the right direction I would be very grateful. Thanks.

Rui Barradas

OK, I'll post an answer. I will use the dataset mtcars as an example; I believe the same approach will work with your dataset.
First, I create a store, lm.test, an object of class list. In your code you assign the output of lm(.) to the same variable every time through the loop, so in the end you would only have the last fit; all the others would have been overwritten by the newer ones.
Then, inside the loop, I use the function reformulate to put together the regression formula. There are other ways of doing this, but this one is simple.

# Use just some columns
data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
col10 <- names(data)[-1]

lm.test <- vector("list", length(col10))

for(i in seq_along(col10)){
    lm.test[[i]] <- lm(reformulate(col10[i], "mpg"), data = data)
}

lm.test
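As a rough sketch of how this might carry over to the question's setup: in the original loop, temp[[i]] is a single character string rather than a column, which is most likely why lm() complains that variable lengths differ. The data frame df and its column names below are made up purely for illustration (only Total_Transactions and the "_10_" pattern come from the question), and fits is a hypothetical name for the results list.

```r
# Hypothetical data: the column names only mimic the "_10_" pattern from the question
set.seed(1)
df <- data.frame(
  Total_Transactions = rnorm(20),
  A_10_x  = rnorm(20),
  B_10_y  = rnorm(20),
  C_100_z = rnorm(20)
)

# Keep only the column names that contain the literal text "_10_"
col10 <- grep("_10_", names(df), value = TRUE, fixed = TRUE)

# reformulate() builds a formula object from character strings
f <- reformulate(col10[1], response = "Total_Transactions")
print(f)

# One model per selected column, stored in a named list
fits <- vector("list", length(col10))
names(fits) <- col10
for (i in seq_along(col10)) {
  fits[[i]] <- lm(reformulate(col10[i], response = "Total_Transactions"), data = df)
}
```

Naming the list elements after the predictors makes it easier to look up a particular fit later, for example fits[["A_10_x"]].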

Now you can use the results list for all sorts of things. I suggest you start using lapply and friends for that.
For instance, to extract the coefficients:

cfs <- lapply(lm.test, coef)
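
If a table is easier to read than a list, the coefficient vectors can be stacked into a matrix. This is only one possible way to do it; cf_tab and the column labels "intercept" and "slope" are names chosen here for illustration, and the setup lines just repeat the mtcars example so the snippet runs on its own.

```r
# Rebuild the list of fits from the mtcars example
data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
col10 <- names(data)[-1]
lm.test <- vector("list", length(col10))
for (i in seq_along(col10)) {
  lm.test[[i]] <- lm(reformulate(col10[i], "mpg"), data = data)
}

cfs <- lapply(lm.test, coef)

# Every model has an intercept and one slope, so the vectors stack into two columns
cf_tab <- do.call(rbind, cfs)
dimnames(cf_tab) <- list(col10, c("intercept", "slope"))
print(cf_tab)
```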

In order to get the summaries:

smry <- lapply(lm.test, summary)
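
Individual statistics can then be pulled out of each summary. The sketch below assumes you are after R-squared and the p-value of the slope; r2 and slope_p are illustrative names, and the setup lines repeat the mtcars example so the snippet is self-contained.

```r
# Rebuild the list of fits from the mtcars example
data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
col10 <- names(data)[-1]
lm.test <- vector("list", length(col10))
for (i in seq_along(col10)) {
  lm.test[[i]] <- lm(reformulate(col10[i], "mpg"), data = data)
}

smry <- lapply(lm.test, summary)

# R-squared of each single-predictor model
r2 <- sapply(smry, function(s) s$r.squared)
names(r2) <- col10

# p-value of the slope: row 2 of the coefficient table in each summary
slope_p <- sapply(smry, function(s) coef(s)[2, "Pr(>|t|)"])
names(slope_p) <- col10

print(r2)
print(slope_p)
```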

It becomes very simple once you're familiar with *apply functions.
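
For example, the for loop itself could be replaced by a single lapply call. This is a sketch rather than the only way to write it; fit_one is a hypothetical helper name, and setNames is used so the resulting list is named after the predictors.

```r
data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
col10 <- names(data)[-1]

# Helper that fits mpg against one predictor, given its name as a string
fit_one <- function(v) lm(reformulate(v, "mpg"), data = data)

# lapply keeps the names of its input, so name the vector first
lm.test <- lapply(setNames(col10, col10), fit_one)

# Elements can now be looked up by predictor name
coef(lm.test[["wt"]])
```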

