Add a list column to a rowwise() data frame

pyg

I'm trying to understand the behaviour of dplyr::rowwise()'d dataframes, specifically whether I can apply dplyr::mutate() to them to create list columns that contain multiple objects.

In the example below I'm attempting to add a column to mtcars where each element/row is a list of length two, containing two lm models.

suppressPackageStartupMessages(library(dplyr))

## attempt to output two objects into a list
## -----> this does not work <-----
## why not?
mtcars %>%
  nest_by(cyl) %>%
  mutate(mods = list(
    mymod1 = list(lm(mpg ~ wt, data = data)),
    mymod2 = list(lm(disp ~ wt, data = data))
  ))
#> Error: Problem with `mutate()` input `mods`.
#> x Input `mods` can't be recycled to size 1.
#> ℹ Input `mods` is `list(...)`.
#> ℹ Input `mods` must be size 1, not 2.
#> ℹ Did you mean: `mods = list(list(...))` ?
#> ℹ The error occurred in row 1.

Created on 2021-06-06 by the reprex package (v1.0.0)

I can't quite make sense of the error message. Could anyone please help to clarify what's going on here?

Here are some related calls that do work, that led me to expect that the call above would also work.

suppressPackageStartupMessages(library(dplyr))

## output one object
## this works
mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(lm(mpg ~ wt, data = data)))
#> # A tibble: 3 x 3
#> # Rowwise:  cyl
#>     cyl                data mod   
#>   <dbl> <list<tbl_df[,10]>> <list>
#> 1     4           [11 × 10] <lm>  
#> 2     6            [7 × 10] <lm>  
#> 3     8           [14 × 10] <lm>

## output one object into a list
## this also works
mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(
    mymod = list(lm(mpg ~ wt, data = data))
  ))
#> # A tibble: 3 x 3
#> # Rowwise:  cyl
#>     cyl                data mod         
#>   <dbl> <list<tbl_df[,10]>> <named list>
#> 1     4           [11 × 10] <list [1]>  
#> 2     6            [7 × 10] <list [1]>  
#> 3     8           [14 × 10] <list [1]>

Created on 2021-06-06 by the reprex package (v1.0.0)

p.s. I understand that this can be achieved with other methods. This is mostly a learning opportunity :)

Ronak Shah

A data frame column needs exactly one element per row, so a one-row data frame can store a list of length 1.

df <- data.frame(a = 1)
tmp <- list(mymod1 = lm(mpg ~ wt, data = mtcars))
length(tmp)
#[1] 1

df$b <- tmp

But a one-row data frame cannot store a list of length greater than 1.

df <- data.frame(a = 1)
tmp <- list(mymod1 = lm(mpg ~ wt, data = mtcars), 
            mymod2 = lm(mpg ~ wt, data = mtcars))
length(tmp)
#[1] 2

df$b <- tmp

#Error in `$<-.data.frame`(`*tmp*`, b, value = list(mymod1 = list(coefficients = c(`(Intercept)` = 37.285126167342, : replacement has 2 rows, data has 1

To fix that, wrap the list in another list so that it has length 1 again:

df <- data.frame(a = 1)
tmp <- list(list(mymod1 = lm(mpg ~ wt, data = mtcars), 
                 mymod2 = lm(mpg ~ wt, data = mtcars)))
length(tmp)
#[1] 1
df$b <- tmp
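
As a rough sketch of the rule behind these three cases, the base R snippet below shows that a list column needs one element per row, and how to get the models back out after wrapping. The names `df1`, `df2` and `mods` are made up for illustration, and the second model uses `disp ~ wt` to mirror the question.

```r
# A list column needs one list element per row.
df2 <- data.frame(a = 1:2)
df2$b <- list("x", 1:3)   # length-2 list, 2 rows: accepted

df1 <- data.frame(a = 1)
mods <- list(
  mymod1 = lm(mpg ~ wt, data = mtcars),
  mymod2 = lm(disp ~ wt, data = mtcars)
)
df1$b <- list(mods)       # wrap so the whole pair is a single element

# The pair of models sits in the first (and only) element of column b.
names(df1$b[[1]])
#[1] "mymod1" "mymod2"
class(df1$b[[1]]$mymod1)
#[1] "lm"
```

So the limit is not "length 1" as such: the list just has to match the number of rows, which is 1 here.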

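The same rule applies to your example: on a rowwise data frame, mutate() evaluates each expression one row at a time, so the result must have size 1. Wrapping the two-element list in another list() gives it size 1, so this will work: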

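library(dplyr)

# nest_by() already returns a rowwise data frame, so no extra rowwise() is needed
mtcars %>%
  nest_by(cyl) %>%
  # the outer list() makes the result size 1 for each row
  mutate(mods = list(list(mymod1 = list(lm(mpg ~ wt, data = data)),
                          mymod2 = list(lm(disp ~ wt, data = data)))))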

#    cyl                data mods            
#  <dbl> <list<tibble[,10]>> <list>          
#1     4           [11 × 10] <named list [2]>
#2     6            [7 × 10] <named list [2]>
#3     8           [14 × 10] <named list [2]>

This is also what the error message suggests.
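
Once the column exists, the models can be pulled back out by row and by name. The snippet below is a minimal sketch: `res` and `first_row` are names made up here, and the inner `list()` wrappers around each `lm()` call are dropped as a simplification, so each element is the fitted model itself.

```r
library(dplyr)

res <- mtcars %>%
  nest_by(cyl) %>%
  mutate(mods = list(list(
    mymod1 = lm(mpg ~ wt, data = data),
    mymod2 = lm(disp ~ wt, data = data)
  )))

# Each element of `mods` is a named list holding the two fitted models.
first_row <- res$mods[[1]]
names(first_row)
#[1] "mymod1" "mymod2"

# Coefficients of the first model for the first cyl group
coef(first_row$mymod1)
```

If you keep the inner `list()` wrappers from the original call, each model sits one level deeper, e.g. `res$mods[[1]]$mymod1[[1]]`.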
