If / Else statements in R

Posted by Katie in Dev

Katie

I have two data frames in R:

city         price    bedroom   
San Jose     2000        1          
Barstow      1000        1          
NA           1500        1

Code to recreate it:

data = data.frame(city = c('San Jose', 'Barstow', NA), price = c(2000, 1000, 1500), bedroom = c(1, 1, 1))

and:

Name       Density
San Jose    5358
Barstow      547

Code to recreate it:

population = data.frame(Name = c('San Jose', 'Barstow'), Density = c(5358, 547))

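I want to create an additional column, city_type, in the data data set based on a condition: if the city's population density is above 1000 it is Urban, if it is below 1000 it is Suburb, and an NA city stays NA.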

city         price    bedroom   city_type   
San Jose     2000        1        Urban
Barstow      1000        1        Suburb
NA           1500        1          NA

I am using a for loop for the conditional flow:

for (row in 1:length(data)) {
    if (is.na(data[row,'city'])) {
        data[row, 'city_type'] = NA
    } else if (population[population$Name == data[row,'city'],]$Density>=1000) {
        data[row, 'city_type'] = 'Urban'
    } else {
        data[row, 'city_type'] = 'Suburb'
   }
}

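The for loop runs without errors on my original data set, which has more than 20000 observations; however, it produces a lot of wrong results (mostly NA).

What is going wrong here? And how can I achieve the desired result in a better way?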

Oliver Baumann

I really like dplyr's pipelines for this kind of join/filter/mutate workflow. So here is my suggestion:

library(dplyr)

# The city vector needs a third entry (NA) so that all three columns have the same length
data <- data.frame(city = c('San Jose', 'Barstow', NA), price = c(2000,1000, 500), bedroom = c(1,1,1))
population <- data.frame(Name=c('San Jose', 'Barstow'), Density=c(5358, 547));

data %>% 
  # join the two dataframes by matching up the city name columns
  left_join(population, by = c("city" = "Name")) %>% 
  # add your new column based on the desired condition  
  mutate(
    city_type = ifelse(Density >= 1000, "Urban", "Suburb")
  )

Output:

      city price bedroom Density city_type
1 San Jose  2000       1    5358     Urban
2  Barstow  1000       1     547    Suburb
3     <NA>   500       1      NA      <NA>
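
As for what went wrong in the loop: judging only from the code as posted, a likely cause is that length(data) returns the number of columns of a data frame, not the number of rows. The loop would then visit only the first few rows, and every other row would keep NA in city_type. The sketch below is my reading of the problem, not something checked against the original 20000-row data. It walks over rows with seq_len(nrow(data)) and also shows a loop-free base R variant using match(). The names density and city_type_vec are introduced here purely for illustration.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps city names as
# plain character vectors on older R versions too
data <- data.frame(city = c("San Jose", "Barstow", NA),
                   price = c(2000, 1000, 1500),
                   bedroom = c(1, 1, 1),
                   stringsAsFactors = FALSE)
population <- data.frame(Name = c("San Jose", "Barstow"),
                         Density = c(5358, 547),
                         stringsAsFactors = FALSE)

# length() of a data frame counts columns; nrow() counts rows.
# Both happen to be 3 in this toy example, which hides the problem.
length(data)
nrow(data)

# Corrected loop: iterate over the rows, not the columns
data$city_type <- NA_character_
for (row in seq_len(nrow(data))) {
  city <- data[row, "city"]
  if (is.na(city)) next                      # city_type stays NA
  density <- population$Density[population$Name == city]
  if (length(density) != 1) next             # unknown or duplicated city: leave NA
  data[row, "city_type"] <- if (density >= 1000) "Urban" else "Suburb"
}

# Loop-free variant: look up each city's density with match(), then classify
density <- population$Density[match(data$city, population$Name)]
city_type_vec <- ifelse(density >= 1000, "Urban", "Suburb")
```

Both variants leave NA for cities that are missing from the lookup table. They differ when a city appears more than once in population: the loop above leaves such rows as NA, while match() takes the first matching row.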
