ggplot地图图例显示不一致

我会先向您道歉,以下示例不是“最小”的,但是除了完整数据集的特定实例外,我无法重现此行为。

我在这里之前问过这个问题,以为我找到了答案,但是这种行为又回来了,给我带来了困扰。基本上,我有一个脚本,可读取每日的COVID-19病例编号,并生成地图,其中各县按受感染人口的百分比进行颜色编码。该脚本产生了五张地图,一张是全国地图,一张是四个官方人口普查区域的一张,分别是东北,中西部,南部和西部。为了缩小尺寸,以下仅是国家和威德韦斯特地图。

我的原始脚本实际上产生了显示疾病传播的GIF动画,但是渲染它们需要很长时间。下面的版本仅给出了最新数据的单个图,应该可以很快运行。

我在下面的脚本中使用了dput,以避免您不得不读取文件和地理编码位置(我注释掉了代码),但是仍然需要读取较大的县人口csv文件。我已将其发布在这里的pastebin

    library(urbnmapr) # For map
    library(ggplot2)  # For map
    library(dplyr)    # For summarizing
    library(tidyr)    # For reshaping
    library(stringr)  # For padding leading zeros
    library(ggrepel)
    library(ggmap)
    library(usmap)
    library(gganimate)
    library(magrittr)
    library(gifski)
    library(ggnewscale)


    #if using Microsoft R, update checkpoint to get latest packages
    #checkpoint("2020-03-01")


    #start the clock
    ptm <- proc.time()

    set.seed(42)

    #first run setup tasks
    #these can be commented out once the data frames are in place

    ###################begin first run only################################

    #register_google(key = "your google map key here")

    #AFMCbases<-read.csv("C:/Users/jerem/Desktop/Work/covid_maps/AFMCbases.csv", header=TRUE, stringsAsFactors = FALSE)

    #geocode the place names
   # for(i in 1:nrow(AFMCbases)){
   #   result <- geocode(AFMCbases$Base[i])
   #   AFMCbases$longitude[i] <- as.numeric(result[1])
   #   AFMCbases$latitude[i] <- as.numeric(result[2])
   # }

    #transform the lat/lons to appropriate map projection
   # locations<-AFMCbases[,2:3]
   # new_locations <- usmap_transform(locations)
   # AFMCbases <- cbind(AFMCbases,new_locations[,3:4])
    AFMCbases <- structure(list(Base = c("Gunter AFB", "Davis Monthan AFB", "Edwards AFB", 
"Robins AFB", "Scott AFB", "Hanscom AFB", "Offutt AFB", "Holloman AFB", 
"Kirtland AFB", "Rome, NY", "Wright-Patterson AFB", "Tinker AFB", 
"Arnold AFB", "Joint Base San Antonio", "Hill AFB", "Arlington, VA", 
"Eglin AFB"), longitude = c(-86.244558, -110.8592578, -117.8912085, 
-83.591719, -89.8550095, -71.2743123, -95.9145568, -106.099291, 
-106.5338817, -75.4557303, -84.0537448, -97.4158295, -86.0303306, 
-98.4523675, -111.9826984, -77.1067698, -86.5533382), latitude = c(32.4083744, 
32.1675525, 34.9240314, 32.6400014, 38.5415911, 42.4579955, 41.1242718, 
32.8440404, 35.0539718, 43.2128473, 39.8137298, 35.4277, 35.3828616, 
29.4512786, 41.10968, 38.8799697, 30.4635583), personnel = c(820L, 
605L, 5317L, 14088L, 613L, 2906L, 177L, 699L, 1264L, 822L, 15299L, 
16032L, 389L, 3443L, 13679L, 1157L, 8143L), longitude.1 = c(1292311.33608434, 
-1025218.68277084, -1622487.54697885, 1533762.39465597, 881032.996527566, 
2296805.44531269, 342224.203588191, -572424.401062999, -596268.294707156, 
1951897.82199569, 1352969.1130143, 234917.935027853, 1263808.11814915, 
151230.865464104, -1000093.31185121, 1953459.66491185, 1292835.72883446
), latitude.1 = c(-1293180.11438144, -1358896.37536667, -946347.80198453, 
-1223833.19307048, -664025.051658055, 128586.352781279, -422393.887189579, 
-1328730.76688869, -1081540.1543388, 99159.9145180969, -445535.143260001, 
-1059563.46211616, -963250.657602903, -1722291.94024992, -359543.815036425, 
-408019.910644083, -1511165.09243038)), class = "data.frame", row.names = c(NA, 
-17L))

    #define census regions
    west_region <-c("WA", "OR","CA","NV","ID", "MT", "WY", "UT","CO", "AZ", "NM")
    NE_region <- c("ME","NH","VT","MA", "CT", "RI", "NY", "PA", "NJ")
    midwest_region <- c("ND", "SD", "NE", "KS", "MN", "IA", "MO", "WI", "IL","MI", "IN","OH")
    south_region <- c("TX", "OK", "AR", "LA", "MS", "TN", "KY", "AL", "GA","FL","SC","NC","VA","WV","DC","MD","DE")

    west_region_bases <- c("Davis Monthan AFB", "Edwards AFB","Holloman AFB","Kirtland AFB","Hill AFB")
    south_region_bases <- c("Robins AFB","Tinker AFB", "Arnold AFB", "Joint Base San Antonio", "Arlington, VA", "Eglin AFB")
    mw_region_bases <- c("Scott AFB", "Offutt AFB", "Wright-Patterson AFB")
    ne_region_bases <-c("Hanscom AFB", "Rome, NY")

    # Get COVID cases, available from:
    url <- "https://static.usafacts.org/public/data/covid-19/covid_confirmed_usafacts.csv"

    COV <- read.csv(url, stringsAsFactors = FALSE)

    #sometimes there are encoding issues with the first column name
    names(COV)[1] <- "countyFIPS"


    Covid <- pivot_longer(COV, cols=starts_with("X"),
                          values_to="cases",
                          names_to=c("X","date_infected"),
                          names_sep="X") %>%
      mutate(infected = as.Date(date_infected, format="%m.%d.%y"),
             countyFIPS = str_pad(as.character(countyFIPS), 5, pad="0"))

    # Obtain map data for counties (to link with covid data) and states (for showing borders)
    states_sf <- get_urbn_map(map = "states", sf = TRUE)
    counties_sf <- get_urbn_map(map = "counties", sf = TRUE)

    # Merge county map with total cases of cov

#this is the line to use for making animations
    #pop_counties_cov <- inner_join(counties_sf, Covid, by=c("county_fips"="countyFIPS")) %>%

    #to make last frame only
    pop_counties_cov <- inner_join(counties_sf, group_by(Covid, countyFIPS) %>%
                                 summarise(cases=max(cases)), by=c("county_fips"="countyFIPS"))


    #read the county population data
    counties_pop <- read.csv("C:/Users/jerem/Desktop/Work/covid_maps/countyPopulations.csv", header=TRUE, stringsAsFactors = FALSE)

    #pad the single digit state FIPS states
    counties_pop <- counties_pop %>% mutate(CountyFIPS=str_pad(as.character(CountyFIPS),5,pad="0"))

    #merge the population and covid data by FIPS
    pop_counties_cov$population <- counties_pop$Population[match(pop_counties_cov$county_fips,counties_pop$CountyFIPS)]

    #calculate the infection rate
    pop_counties_cov <- pop_counties_cov %>% mutate(infRate = (cases/population)*100)

    #counties with 0 infections don't appear in the usafacts data, so didn't get a population
    #set them to 0
    pop_counties_cov$population[is.na(pop_counties_cov$population)] <- 0
    pop_counties_cov$infRate[is.na(pop_counties_cov$infRate)] <- 0

    plotDate="April20"
    basepath = "C:/your/output/file/path"

    naColor = "white"
    lowColor = "green"
    midColor = "maroon"
    highColor = "red"
    baseFill = "dodgerblue4"
    baseColor = "firebrick"
    baseShape = 23
    scaleLow = "magenta"
    scaleHigh = "blue"


    ###################end first run only################################


    ###################National Map################################

    p <- pop_counties_cov %>%
      ggplot() +
      geom_sf(mapping = aes(fill = infRate, geometry=geometry), color = NA) +
      geom_sf(data = states_sf, fill = NA, color = "black", size = 0.25) +
      coord_sf(datum = NA) +   
      scale_fill_gradient(name = "% Pop \nInfected", trans = "log",low=lowColor, high=highColor,
                          breaks=c(0, round(max(pop_counties_cov$infRate),1)),
                          na.value = naColor) +
      new_scale_fill() +
      geom_point(data=AFMCbases, 
                 aes(x=longitude.1, y=latitude.1,fill=personnel), 
                 shape= baseShape,
                 color = "black",
                 size = 3) +
      scale_fill_gradient(name="AFMC \nMil + Civ",
                          low = scaleLow, high = scaleHigh,
                          breaks = c(1, max(AFMCbases$personnel)))+
      theme_bw() + 
      theme(legend.position="bottom", 
            panel.border = element_blank(),
            axis.title.x=element_blank(), 
            axis.title.y=element_blank()) +
      labs(title=paste('Confirmed COVID-19 Cases: ', max(Covid$infected),sep=""),
                         subtitle='HQ AFMC/A9A \nData: usafacts.org')

    # a <- p + transition_time(infected) + 
    #         labs(title='Confirmed COVID-19 Cases: {frame_time}',
    #              subtitle='HQ AFMC/A9A \nData: usafacts.org')
    # 
    # animate(a, 
    #        device="png", 
    #        renderer=file_renderer(paste(basepath,plotDate,"/national",sep=""),
    #                               prefix="gganim_plot",
    #                               overwrite=TRUE)
    # )
    # 
    # #make the national animated gif
    # png_files <- list.files(paste(basepath,plotDate,"/national",sep=""), pattern = ".*png$", full.names = TRUE)
    # st = format(Sys.time(), "%Y-%m-%d")
    # gifName <- paste(basepath,plotDate,"/national/COVID-19-Cases-byCounty_",st,".gif",sep="")
    # gifski(png_files, gif_file = gifName, width = 1000, height = 750, delay = 0.25, loop=FALSE)

    #save the image
    st = format(Sys.time(), "%Y-%m-%d")
    SaveFilename = paste(basepath,plotDate,"/national/COVID-19-Cases-byCounty_",st,".png",sep="")
    if(!dir.exists(paste(basepath,plotDate,"/national",sep=""))) dir.create(paste(basepath,plotDate,"/national",sep=""))
    ggsave(filename=SaveFilename, plot = p, dpi = 300)

    ###################End National Map################################


    ###################Midwest Map################################

    #filter out states
    #neCovid <- Covid %>% filter(State %in% NE_region )
    mw_pop_counties_cov <- pop_counties_cov %>% filter(state_abbv %in% midwest_region)
    mw_states_sf <- states_sf %>% filter(state_abbv %in% midwest_region)
    mw_counties_sf <- counties_sf %>% filter(state_abbv %in% midwest_region)

    #filter out bases
    mwBases <- AFMCbases %>% filter(Base %in% mw_region_bases)

    p <- mw_pop_counties_cov %>%
      ggplot() +
      geom_sf(mapping = aes(fill = infRate, geometry=geometry), color = NA) +
      geom_sf(data = mw_states_sf, fill = NA, color = "black", size = 0.25) +
      coord_sf(datum = NA) +   
      scale_fill_gradient(name = "% Pop \nInfected", trans = "log",low=lowColor, high=highColor,
                          breaks=c(0, round(max(mw_pop_counties_cov$infRate),1)),
                          na.value = naColor) +
      new_scale_fill() +
      geom_point(data=mwBases, 
                 aes(x=longitude.1, y=latitude.1,fill=personnel), 
                 shape = baseShape,
                 color = "black",
                 size=3) +
      scale_fill_gradient(name="AFMC \nMil + Civ",
                          low=scaleLow, high = scaleHigh,
                          breaks = c(1, max(mwBases$personnel)))+
      theme_bw() + 
      theme(legend.position="bottom", 
            panel.border = element_blank(),
            axis.title.x=element_blank(), 
            axis.title.y=element_blank()) +
      labs(title=paste('Confirmed COVID-19 Cases: ', max(Covid$infected),sep=""),
           subtitle='HQ AFMC/A9A \nData: usafacts.org')

    # a <- p + transition_time(infected) + 
    #   labs(title='Confirmed COVID-19 Cases: {frame_time}',
    #        subtitle='HQ AFMC/A9A \nData: usafacts.org')
    # 
    # animate(a, 
    #         device="png", 
    #         renderer=file_renderer(paste(basepath,plotDate,"/midwest",sep=""),
    #                                prefix="gganim_plot",
    #                                overwrite=TRUE)
    # )
    # 
    # #make the midwest animated gif
    # png_files <- list.files(paste(basepath,plotDate,"/midwest",sep=""), pattern = ".*png$", full.names = TRUE)
    # st = format(Sys.time(), "%Y-%m-%d")
    # gifName <- paste(basepath,plotDate,"/midwest/MW_COVID-19-Cases-byCounty_",st,".gif",sep="")
    # gifski(png_files, gif_file = gifName, width = 1000, height = 750, delay = 0.25, loop=FALSE)

    st = format(Sys.time(), "%Y-%m-%d")
    SaveFilename = paste(basepath,plotDate,"/midwest/MW_COVID-19-Cases-byCounty_",st,".png",sep="")
    if(!dir.exists(paste(basepath,plotDate,"/midwest",sep=""))) dir.create(paste(basepath,plotDate,"/midwest",sep=""))
    ggsave(filename=SaveFilename, plot = p, dpi = 300)

    ###################End Midwest Map################################

这是我今天早上运行代码时得到的国家地图

在此处输入图片说明

请注意,基地(彩色菱形)的人员数量是一个比例,而县的阴影则没有规模。

这是中西部的地图。您可以从代码中看到,这是同一ggplot,只是带有向下过滤到中西部地区县的数据集。

在此处输入图片说明

Now the scale is there. As mentioned in my previous question I thought that the answer had been something to do with the width of the image being insufficient to accommodate the scale. When I added a newline in the legend text to shorten it that appeared to do the trick. But now the legend is disappearing again, andmaking the output image wider has no effect. Plus, just by eyeball it would appear there is plenty of room in the national plot to accommodate the scale.

Another bizarre aspect is the behavior associated with rounding the breaks. Below is a west map where I applied no rounding to the breaks

scale_fill_gradient(name = "% Pop \nInfected",trans = "log", low=lowColor, high=highColor,
                       breaks=c(0, max(west_pop_counties_cov$infRate)),
                       na.value = naColor)

在此处输入图片说明

So the scale is back but it goes to 6 decimal places. If I try to round it to 2

 scale_fill_gradient(name = "% Pop \nInfected",trans = "log", low=lowColor, high=highColor,
                       breaks=c(0, round(max(west_pop_counties_cov$infRate),2)),
                       na.value = naColor)

I get this map

在此处输入图片说明

which surely indicates the horizontal space isn't the issue...if it can accommodate 6 decimal places then surely there's room for 2?

与编写原始脚本一样,我花了很多时间来尝试解决这种不一致的缩放行为。我需要这些东西保持一致,以便可以定期将它们作为工作产品提供。

dc37

您的问题是由于变量中存在0值,infRate导致此警告消息中所观察到log转换混乱scale_fill_gradient

警告消息:变换在离散的y轴上引入了无限值

在这里,您可以找到一种方法来解决此问题,方法是使用非0最小值进行设置limitsbreaks参数设置

> summary(pop_counties_cov$infRate)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00000 0.01543 0.03993 0.09178 0.09043 2.87425 
> summary(pop_counties_cov$infRate[pop_counties_cov$infRate != 0])
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.001537 0.023724 0.048169 0.105326 0.102350 2.874253 

设置这些新限制(并删除具有0个值的行)可为您提供:

pop_counties_cov %>% 
  filter(infRate != 0) %>%
  ggplot() +
  geom_sf(mapping = aes(fill = infRate, geometry=geometry), color = NA) +
  geom_sf(data = states_sf, fill = NA, color = "black", size = 0.25) +
  coord_sf(datum = NA) +   
  scale_fill_gradient(name = "% Pop \nInfected", trans = "log",low=lowColor, high=highColor,
                      breaks=c(0.001,2.9),
                      na.value = naColor, limits = c(0.001,2.9)) +
  new_scale_fill() +
  geom_point(data=AFMCbases, 
             aes(x=longitude.1, y=latitude.1,fill=personnel), 
             shape= baseShape,
             color = "black",
             size = 3) +
  scale_fill_gradient(name="AFMC \nMil + Civ",
                      low = scaleLow, high = scaleHigh,
                      breaks = c(1, max(AFMCbases$personnel)))+
  theme_bw() + 
  theme(legend.position="bottom", 
        panel.border = element_blank(),
        axis.title.x=element_blank(), 
        axis.title.y=element_blank()) +
  labs(title=paste('Confirmed COVID-19 Cases: ', max(Covid$infected),sep=""),
       subtitle='HQ AFMC/A9A \nData: usafacts.org')

在此处输入图片说明

它能回答您的问题吗?

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章