Sorting multiple columns by first letter and by numbers in R

melbez

I have created a dataframe that looks like the following:

item  mean
a_b   5
a_c   2
a_a   4
b_d   7
b_f   3
b_e   1

I would like to sort it first by whether the item begins with "a_" or "b_", and then by mean within each of those groups. The final dataframe should look like this:

item  mean
a_c   2
a_a   4
a_b   5
b_e   1
b_f   3
b_d   7

Note that the item column is not sorted perfectly alphabetically. It is only sorted by the first letter.

I have tried:

arrange(df, item, mean)

The problem is that this sorts by the entire item name rather than only by the "a_" or "b_" prefix.

I am open to separating the original dataframe into separate dataframes using filter and then sorting the mean within these smaller subsets. I do not need everything to stay in the same dataframe. However, I am unsure how to use filter to only select rows that have items beginning with "a_" or "b_".

avid_useR

One method using dplyr:

library(dplyr)
arrange(df, sub('_.+$', '', item), mean)

An alternative is to use str_extract from stringr to extract only the leading prefix (the first character plus the underscore) from item:

library(stringr)
arrange(df, str_extract(item, '^._'), mean)

Result:

  item mean
1  a_c    2
2  a_a    4
3  a_b    5
4  b_e    1
5  b_f    3
6  b_d    7

Data:

df <- data.frame(
  item = c("a_b", "a_c", "a_a", "b_d", "b_f", "b_e"),
  mean = c(5L, 2L, 4L, 7L, 3L, 1L),
  stringsAsFactors = FALSE
)
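For comparison, the same ordering can be sketched in base R without dplyr. This is only an illustration of the idea of sorting on a derived prefix; the names prefix and sorted are made up for the example and are not part of the original answer.

```r
# Same data as above, rebuilt here so the snippet runs on its own.
df <- data.frame(
  item = c("a_b", "a_c", "a_a", "b_d", "b_f", "b_e"),
  mean = c(5L, 2L, 4L, 7L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Derive the sort key: drop the underscore and everything after it.
prefix <- sub("_.+$", "", df$item)

# order() sorts by prefix first, then by mean within each prefix.
sorted <- df[order(prefix, df$mean), ]
rownames(sorted) <- NULL  # renumber rows 1..6 after reordering

print(sorted)
# item order: a_c, a_a, a_b, b_e, b_f, b_d
```

As with arrange, the key lives only in a separate vector, so no extra column ends up in the result.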

Notes:

sub('_.+$', '', item) creates a temporary variable by removing the underscore and everything after it from item. The pattern _.+$ matches a literal underscore (_) followed by one or more of any character (.+), running to the end of the string ($).
str_extract(item, '^._') creates a temporary variable by extracting any single character (.) followed by a literal underscore (_) at the beginning of the string (^).
The neat thing about dplyr::arrange is that you can create a temporary sorting variable within the function and not have it included in the output.
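The filter-based split mentioned in the question could look roughly like the following. It is a sketch that assumes the only prefixes are "a_" and "b_"; a_items, b_items and combined are illustrative names. startsWith() is a base R function (available since R 3.3.0), so no regular expression is needed for the prefix test.

```r
library(dplyr)

# Same data as in the answer, rebuilt so the snippet runs on its own.
df <- data.frame(
  item = c("a_b", "a_c", "a_a", "b_d", "b_f", "b_e"),
  mean = c(5L, 2L, 4L, 7L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Keep only rows whose item starts with the given prefix, then sort by mean.
a_items <- df %>% filter(startsWith(item, "a_")) %>% arrange(mean)
b_items <- df %>% filter(startsWith(item, "b_")) %>% arrange(mean)

# Optional: stack the two subsets back into one data frame.
combined <- bind_rows(a_items, b_items)
print(combined)
```

With more than a handful of prefixes, sorting on a derived key as in the arrange calls above avoids writing one filter per group.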

Edited at 2020-11-30
