How do I plot a dendrogram next to a distance matrix in R?

Anirban_Mitra

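I am looking for an efficient way to plot a dendrogram obtained from data, but alongside the corresponding distance matrix instead of the original data. I have been curious how various papers show this, and it seems all they do is plot the heatmap and the dendrogram separately and combine them in image-editing software. Hopefully the following code makes clear what I want. Say I generate the following data and cluster it hierarchically, using one minus Pearson's correlation as the distance measure and complete linkage: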

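library(gplots)
set.seed(2)
# 5 rows (the items to cluster) by 20 columns of random data
x <- matrix(rnorm(100), nrow = 5)
# Distance between rows: 1 - Pearson correlation
dist.fn <- function(x) as.dist(1-cor(t(x)))
# Complete-linkage hierarchical clustering
hclust.com <- function(x) hclust(x, method="complete")
h.ori <- heatmap.2(x, trace="none", distfun=dist.fn, hclustfun=hclust.com, dendrogram = "row", main = "Fig1")
# Row order as drawn in the heatmap
h.ori$rowInd
# 1 3 5 4 2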

[Figure: Fig1, the heatmap of x with its row dendrogram]
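For reference, the same kind of clustering can be computed without drawing a heatmap. The following is a minimal sketch using only base R. As far as I know, heatmap.2 additionally reorders the dendrogram branches (by row means, by default), so the leaf order printed here need not match h.ori$rowInd exactly. The names d and hc are introduced only for this illustration.

```r
set.seed(2)
x <- matrix(rnorm(100), nrow = 5)

# Dissimilarity between rows: 1 - Pearson correlation, so values lie in [0, 2]
d <- as.dist(1 - cor(t(x)))

# Complete-linkage hierarchical clustering on that dissimilarity
hc <- hclust(d, method = "complete")

# Leaf order of the tree, left to right (a permutation of 1:5)
hc$order

# Base-graphics dendrogram of the same tree
plot(as.dendrogram(hc))
```

Keeping the hclust object around is convenient because both base graphics and other plotting packages can consume it directly.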

Now I can plot the corresponding distance matrix, with its rows and columns ordered as per the dendrogram in Fig1:

colfunc <- colorRampPalette(c("red", "white", "yellow")) # not really necessary
# Correlation matrix with rows and columns reordered to match the Fig1 dendrogram
dmat <- cor(t(x))[h.ori$rowInd,h.ori$rowInd]
# Rowv = NULL: plot in the given order, without reclustering
heatmap.2(dmat,Rowv = NULL,Colv = "Rowv",scale = 'none', 
          dendrogram='none',trace = 'none',density.info="none",
          labRow = h.ori$rowInd, labCol = h.ori$rowInd,
          col=colfunc(20))

[Figure: Fig2, the reordered correlation matrix without a dendrogram]
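The reordering step can be checked in isolation. The sketch below applies the row order reported above to both rows and columns of the correlation matrix, and confirms that the result is still a symmetric matrix with a unit diagonal. Note that the matrix holds correlations (similarities), not the 1 - correlation distances used for clustering. The names cmat, ord and cmat_ord are made up for this example.

```r
set.seed(2)
x <- matrix(rnorm(100), nrow = 5)
cmat <- cor(t(x))

# The row order reported by h.ori$rowInd above
ord <- c(1, 3, 5, 4, 2)

# Apply the same permutation to rows and columns
cmat_ord <- cmat[ord, ord]

# The reordered matrix is still symmetric with a unit diagonal
isSymmetric(cmat_ord)
all.equal(diag(cmat_ord), rep(1, 5))

# Entry [i, j] of the reordered matrix is entry [ord[i], ord[j]] of the original
cmat_ord[1, 2] == cmat[ord[1], ord[2]]
```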

Here is my question: how do I add the dendrogram plotted in Fig1 to the plot in Fig2 (preferably along both columns and rows)? The purpose is to view the clustering as produced by the dendrogram; for block models this would be a nice way to visualize the structure. As a side question: I know how to plot heatmaps with the ggplot2 library, i.e. using geom_tile(). Is there a way to do what I want above using ggplot2?

teunbrand

With regard to doing this in ggplot2: I wrote a function at some point that helps with this, though it is not without flaws. It takes an hclust object and uses it to plot a dendrogram as the axis guide. First, we'll grab the dendrogram from the heatmap you had before.

library(gplots)
#> Warning: package 'gplots' was built under R version 4.0.2
#> 
#> Attaching package: 'gplots'
#> The following object is masked from 'package:stats':
#> 
#>     lowess
library(ggplot2)
library(ggh4x) #devtools::install_github("teunbrand/ggh4x")

set.seed(2)
x <- matrix(rnorm(100), nrow = 5)
dist.fn <- function(x) as.dist(1-cor(t(x)))
hclust.com <- function(x) hclust(x, method="complete")
h.ori <- heatmap.2(x, trace="none", distfun=dist.fn, hclustfun=hclust.com,dendrogram = "row",main = "Fig1")
h.ori$rowInd
#> [1] 1 3 5 4 2

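Then we format that as an hclust object and feed it to the scales. The scales should (in theory) automatically order the variables according to the clustering.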
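Conceptually, ordering by the clustering amounts to arranging the axis categories in the leaf order of the tree. The sketch below does this by hand with factor levels in base R. It only illustrates the idea and is not a description of how ggh4x works internally. Here clust is recomputed directly with hclust() to keep the example free of gplots, which is an assumption: its leaf order may differ from that of the reordered tree returned by heatmap.2.

```r
set.seed(2)
x <- matrix(rnorm(100), nrow = 5)
clust <- hclust(as.dist(1 - cor(t(x))), method = "complete")

# Long-format correlation table: one row per (Var1, Var2) pair
cmat <- cor(t(x))
df <- data.frame(
  Var1  = as.vector(row(cmat)),
  Var2  = as.vector(col(cmat)),
  value = as.vector(cmat)
)

# Turn the indices into factors whose levels follow the leaf order,
# so that a discrete axis would draw them in dendrogram order
df$Var1 <- factor(df$Var1, levels = clust$order)
df$Var2 <- factor(df$Var2, levels = clust$order)

levels(df$Var1)
```

A discrete ggplot2 scale draws factor levels in their stored order, so a data frame prepared this way would already line up with a dendrogram drawn from the same tree.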

I'm just adding the dendrograms to every side of the plot, so you can choose which ones you really want.

# Plot prep: recover the hclust object and melt the correlation matrix to long format
clust <- as.hclust(h.ori$rowDendrogram)
df <- reshape2::melt(cor(t(x)))

ggplot(df, aes(Var1, Var2, fill = value)) +
  geom_raster() +
  scale_fill_gradient2(low = "red", mid = "white", high = "yellow")+
  scale_x_dendrogram(hclust = clust) +
  scale_y_dendrogram(hclust = clust) +
  guides(
    x.sec = guide_dendro(dendro = ggdendro::dendro_data(clust), position = "top"),
    y.sec = guide_dendro(dendro = ggdendro::dendro_data(clust), position = "right")
  ) +
  coord_equal()

A note of warning, though: control over the labels isn't great yet. If you run into any trouble with the function, let me know so I can improve it.
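If installing an extra package is not an option, base R's heatmap() can also draw one tree along both margins of the correlation matrix. This is an alternative sketch rather than the ggplot2 approach above, and it rebuilds the tree with hclust() instead of taking it from heatmap.2, so the branch order may differ from Fig1. The names cmat, dend and hm are hypothetical.

```r
set.seed(2)
x <- matrix(rnorm(100), nrow = 5)

# Correlation matrix to display, and the tree built from 1 - correlation
cmat <- cor(t(x))
dend <- as.dendrogram(hclust(as.dist(1 - cmat), method = "complete"))

# Same palette as in the question
colfunc <- colorRampPalette(c("red", "white", "yellow"))

# symm = TRUE reuses the row dendrogram for the columns;
# scale = "none" shows the raw correlations
hm <- heatmap(cmat, Rowv = dend, symm = TRUE, scale = "none",
              col = colfunc(20))

# Row and column orders used in the plot
hm$rowInd
hm$colInd
```

Passing a ready-made dendrogram as Rowv makes heatmap() use that tree as given instead of clustering the displayed matrix again.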

Good luck!
