How do I write an R function that lets me manipulate multiple R variables using dplyr's %>% pipes?

Bear25

I am trying to create a function to manipulate different datasets but am facing several issues with this task. A simplified version of the data I am trying to manipulate is given in the dput() output below:

structure(list(id = structure(c(2, 4, 6, 8, 10), label = "iid", format.spss = "F4.0", display_width = 0L), A = c(13, 9, 14, 14, 13), B = c(12, 0, 9, 3, 10), C = c(13, 8, 14, 13, 11)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

There are several things I am trying to do, but I get stuck at different junctures because of the way the data is formatted. First, I need to sum the values from columns A:D for each row into a variable called total. Next, I need to compute the probability by dividing each of columns A:D by total.

Here is where I face some issues. I wrote a function to perform the above:

functa <- function(x, id, vars) {
  
  x %>%
    mutate(total = rowSums(.[vars])) %>%
    mutate(prob = .[vars]/total)

}

When I call the function using the following line:

test <- functa(df_ED, "pid", c("A", "B", "C", "D"))

I get an object with 5 observations, but only 7 variables (instead of 10). When I inspect the object, I see 4 new variables (i.e., prob.A, prob.B, prob.C, prob.D), but they are stored as a single variable.

Any subsequent manipulations I would like to perform on this dataset cannot proceed as intended because of this. I have been working on this for the past two days but cannot find any information about this phenomenon and am guessing I am way in over my head.

My eventual goal with this function is to:

  1. compute a total variable (sum of A:D)
  2. compute a prob variable that should output 4 variables (i.e., A/total, B/total, etc.)
  3. recode the prob variables such that all infinite values (i.e., "Inf") are recoded to 0
  4. sum all 4 prob variables into a single totalprob variable

Would appreciate any insights into this!

Ronak Shah

When you want to apply a function to multiple columns, use across():

library(dplyr)

# `id` is kept to match the original signature; it is not used in the body.
functa <- function(x, id, vars) {

  x %>%
    mutate(
      # Row-wise sum of the columns named in `vars`
      total = rowSums(.[vars]),
      # Divide each `vars` column by total; store the results as <col>_prob
      across(all_of(vars), ~ . / total, .names = '{col}_prob'),
      # Replace infinite values in the _prob columns with 0
      across(ends_with('_prob'), ~ replace(., is.infinite(.), 0))
    ) %>%
    # Row-wise sum of all _prob columns
    mutate(totalprob = rowSums(select(., ends_with('_prob'))))

}

functa(df_ED, "pid", c("A", "B", "C"))

#     id     A     B     C total A_prob B_prob C_prob totalprob
#  <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>     <dbl>
#1     2    13    12    13    38  0.342  0.316  0.342         1
#2     4     9     0     8    17  0.529  0      0.471         1
#3     6    14     9    14    37  0.378  0.243  0.378         1
#4     8    14     3    13    30  0.467  0.1    0.433         1
#5    10    13    10    11    34  0.382  0.294  0.324         1
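
As for why the original attempt showed the prob.* columns as a single variable: the expression .[vars]/total evaluates to a whole data frame, and giving it the single name prob makes mutate() store it as one data-frame column. The sketch below illustrates this, assuming dplyr 1.0 or later. The tibble df is a hypothetical stand-in for df_ED built from the A, B, C values in the question, and nested and flat are names introduced only for this example.

```r
library(dplyr)

# Hypothetical stand-in for df_ED, using the A, B, C values from the question
df <- tibble(
  id = c(2, 4, 6, 8, 10),
  A  = c(13, 9, 14, 14, 13),
  B  = c(12, 0, 9, 3, 10),
  C  = c(13, 8, 14, 13, 11)
)
vars <- c("A", "B", "C")

# Original approach: the right-hand side is a data frame, and because it is
# given the single name `prob`, it is stored as one data-frame column
nested <- df %>%
  mutate(total = rowSums(.[vars])) %>%
  mutate(prob = .[vars] / total)

ncol(nested)                 # 6: id, A, B, C, total, prob
is.data.frame(nested$prob)   # TRUE
names(nested$prob)           # "A" "B" "C"

# Passing the data frame WITHOUT a name lets mutate() add its columns
# one by one (assumes dplyr >= 1.0)
flat <- df %>%
  mutate(total = rowSums(.[vars])) %>%
  mutate(setNames(.[vars] / total, paste0(vars, "_prob")))

ncol(flat)                   # 8: id, A, B, C, total, A_prob, B_prob, C_prob
```

The across() version above is usually the more readable choice. The unnamed-data-frame form is shown only to make the difference between one nested column and several ordinary columns visible.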
