I have a data frame with prices of products on different dates. If a product has the same price on several dates, I would like to keep only the row with the most recent date.
Example of my data frame:
Date Price Product
1 2019-08-28 10 product 1
2 2019-08-27 10 product 1
3 2019-08-28 15 product 2
4 2019-08-27 14 product 2
5 2019-08-23 15 product 2
6 2019-08-27 10 product 3
So I would like to get rid of row 2 and row 5 and only have:
Date Price Product
1 2019-08-28 10 product 1
3 2019-08-28 15 product 2
4 2019-08-27 14 product 2
6 2019-08-27 10 product 3
Any suggestions? I could not find a question with a similar problem.
Order the data by Date, newest first, then remove the rows that are duplicated in the other two columns (Price and Product):
df <- read.table(text = " Date Price Product
1 2019-08-28 10 product1
2 2019-08-27 10 product1
3 2019-08-28 15 product2
4 2019-08-27 14 product2
5 2019-08-23 15 product2
6 2019-08-27 10 product3",
header = TRUE, stringsAsFactors = FALSE)
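# Sort by the Date column (column 1), most recent first
df <- df[order(df[, 1], decreasing = TRUE), ]
# df[, -1] drops Date, leaving Price and Product; keep the first (newest) row of each pair
df[!duplicated(df[, -1]), ]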
# Date Price Product
#1 2019-08-28 10 product1
#3 2019-08-28 15 product2
#4 2019-08-27 14 product2
#6 2019-08-27 10 product3
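The same idea can be sketched with named columns instead of column positions, which may be easier to adapt if the real data frame has more columns. This is only a minimal base R sketch under a few assumptions: the data is rebuilt with `data.frame()`, the dates are converted with `as.Date()`, and the names `sorted` and `result` are made up for illustration. The last step, restoring the original row order, is optional and assumes the default integer row names.

```r
# Rebuild the example data (product names written without spaces, as in the answer)
df <- data.frame(
  Date    = as.Date(c("2019-08-28", "2019-08-27", "2019-08-28",
                      "2019-08-27", "2019-08-23", "2019-08-27")),
  Price   = c(10, 10, 15, 14, 15, 10),
  Product = c("product1", "product1", "product2",
              "product2", "product2", "product3"),
  stringsAsFactors = FALSE
)

# Sort newest first, so the first row seen for each Price/Product pair is the most recent
sorted <- df[order(df$Date, decreasing = TRUE), ]

# Drop the later (older) rows that repeat an already-seen Price/Product pair
result <- sorted[!duplicated(sorted[, c("Price", "Product")]), ]

# Optional: put the surviving rows back in their original order
# (assumes the default row names "1", "2", ...)
result <- result[order(as.integer(rownames(result))), ]
result
```

Selecting the columns by name in `duplicated()` means that extra columns, if any, would not affect which rows count as duplicates.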