Polymorphism in Rcpp vector subsetting

Mike Jiang

Here is an R method that I would like to translate to C++ to speed up processing:

setMethod("[[", signature=signature(x="ncdfFlowSet"),
               definition=function(x, i, j, use.exprs = TRUE, ...)
{
  #subset by j
  if(!missing(j)){
    if(is.character(j)){
      j <- match(j, localChNames)
      if(any(is.na(j)))
        stop("subscript out of bounds")
    }  

    fr@parameters <- fr@parameters[j, , drop = FALSE]
    localChNames <- localChNames[j]
  }

  #other stuff

})

Kevin's nice work on vector subsetting has made the `j` subsetting a lot easier for me:

    // [[Rcpp::export]]
    Rcpp::S4 readFrame(Rcpp::S4 x
                        , std::string sampleName
                        , Rcpp::RObject j_obj
                        , bool useExpr
                        )
    {
        Rcpp::Environment frEnv = x.slot("frames");
        Rcpp::S4 frObj = frEnv.get(sampleName);
        Rcpp::S4 fr = Rcpp::clone(frObj);

          //get local channel names
          Rcpp::StringVector colnames = x.slot("colnames");

          Rcpp::StringVector ch_selected;
         /*
          * subset by j if applicable
          */
         int j_type = j_obj.sexp_type();
         //creating j index used for subsetting colnames and pdata
         Rcpp::IntegerVector j_indx;

         if(j_type == STRSXP)//when character vector
         {
             ch_selected = Rcpp::StringVector(j_obj.get__());
             unsigned nCol = ch_selected.size();
             j_indx = Rcpp::IntegerVector(nCol);
             //match ch_selected to colnames
            for(unsigned i = 0 ; i < nCol; i ++)
            {
                const Rcpp::internal::string_proxy<STRSXP> &thisCh = ch_selected(i);
                Rcpp::StringVector::iterator match_id = std::find(colnames.begin(), colnames.end(), thisCh);
                if(match_id == colnames.end()){
                    std::string strCh = Rcpp::as<std::string>(thisCh);
                    Rcpp::stop("j subscript out of bounds: " + strCh);
                }else
                {
                    j_indx(i) = match_id - colnames.begin();
                }
            }
         }
         else if(j_type == NILSXP)//j is set to NULL in R when not supplied
         {
             ch_selected = colnames;
         }
         else if(j_type == LGLSXP)
         {
             Rcpp::LogicalVector j_val(j_obj.get__());
             ch_selected = colnames[j_val];
             // TODO: convert the logical mask to integer indices so that j_indx is set
         }
         else if(j_type == INTSXP)
         {
             Rcpp::IntegerVector j_val(j_obj.get__());
             j_indx = j_val - 1; //convert to 0-based index
             ch_selected = colnames[j_indx];
         }
         else if(j_type == REALSXP)
         {
             Rcpp::NumericVector j_val(j_obj.get__());
             // TODO: convert numeric indices to integer
         }
         else
             Rcpp::stop("unsupported j expression!");
        /*
         * subset annotationDataFrame (a data frame)
         * 
         */
         if(j_type != NILSXP)
         {
            Rcpp::S4 pheno = fr.slot("parameters");
            Rcpp::DataFrame pData = pheno.slot("data");

            Rcpp::CharacterVector pd_name = pData["name"];
            Rcpp::CharacterVector pd_desc = pData["desc"];
            Rcpp::NumericVector pd_range = pData["range"];
            Rcpp::NumericVector pd_minRange = pData["minRange"];
            Rcpp::NumericVector pd_maxRange = pData["maxRange"];

            Rcpp::DataFrame plist = Rcpp::DataFrame::create(Rcpp::Named("name") = pd_name[j_indx]
                                                        ,Rcpp::Named("desc") = pd_desc[j_indx]
                                                        ,Rcpp::Named("range") = pd_range[j_indx]
                                                        ,Rcpp::Named("minRange") = pd_minRange[j_indx]
                                                        ,Rcpp::Named("maxRange") = pd_maxRange[j_indx]
                                                        );
            pheno.slot("data") = plist;
         }

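However, the `j` index in R usually accepts different types of input (character, logical, or numeric). I wonder whether there is some kind of polymorphic mechanism (perhaps through an abstract vector pointer or reference) so that the redundant code for the later `[`-subsetting on the data.frame, which exists only because of the different Rcpp::**Vector types, can be avoided.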

Kevin Ushey

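We usually advocate splitting the logic into a dispatch step and a templated implementation step. So you should be able to solve your problem with something like the following: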

#include <Rcpp.h>
using namespace Rcpp;

template <typename T>
SEXP readFrame(Rcpp::S4 x, std::string sampleName, T const& j, bool useExpr) { 
    // use the typed 'j' expression
    return R_NilValue; // placeholder so the sketch compiles
}

// [[Rcpp::export(subset)]]
SEXP readFrame_dispatch(Rcpp::S4 x, std::string sampleName, SEXP j, bool useExpr) {
    switch (TYPEOF(j)) {
    case INTSXP: return readFrame<IntegerVector>(x, sampleName, j, useExpr);
    case REALSXP: return readFrame<NumericVector>(x, sampleName, j, useExpr);
    case STRSXP: return readFrame<CharacterVector>(x, sampleName, j, useExpr);
    case LGLSXP: return readFrame<LogicalVector>(x, sampleName, j, useExpr);
    default: stop("Unsupported SEXP type");
    }
    return R_NilValue;
}

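One of the design goals of Rcpp is to avoid runtime polymorphism wherever possible, for speed reasons. Almost all polymorphism is done statically, and ideally the runtime lookup should happen only once (except where we are forced to call back into R for some routines).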

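The dispatch code is a bit ugly and mechanical, but it makes this "style" of programming possible. Separating "dispatch" from "implementation" also makes the code more readable, because the ugliness of the dispatch is hidden in one place.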
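Applied to the `j` argument from the question, one way to get that separation is to convert every accepted index type into 0-based integer positions once, so that the later data frame subsetting is written a single time. The following is only a minimal sketch under stated assumptions: it uses `std::vector` as a stand-in for the Rcpp vector types so that it compiles without R, the names `to_index` and `take` are hypothetical, and it does not handle negative indices, `NA`, or recycling of logical vectors the way R does.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the column names (hypothetical; Rcpp code would use CharacterVector).
typedef std::vector<std::string> Names;

// Numeric j (int or double): R-style 1-based positions become 0-based.
template <typename Num>
std::vector<int> to_index(const std::vector<Num>& j, const Names& names) {
    std::vector<int> out;
    for (std::size_t k = 0; k < j.size(); ++k) {
        int pos = static_cast<int>(j[k]) - 1;
        if (pos < 0 || pos >= static_cast<int>(names.size()))
            throw std::out_of_range("j subscript out of bounds");
        out.push_back(pos);
    }
    return out;
}

// Character j: look each requested name up in the column names.
inline std::vector<int> to_index(const Names& j, const Names& names) {
    std::vector<int> out;
    for (std::size_t k = 0; k < j.size(); ++k) {
        Names::const_iterator hit = std::find(names.begin(), names.end(), j[k]);
        if (hit == names.end())
            throw std::out_of_range("j subscript out of bounds: " + j[k]);
        out.push_back(static_cast<int>(hit - names.begin()));
    }
    return out;
}

// Logical j: keep the positions whose flag is true (no recycling in this sketch).
inline std::vector<int> to_index(const std::vector<bool>& j, const Names& names) {
    if (j.size() != names.size())
        throw std::invalid_argument("logical j must match the number of columns");
    std::vector<int> out;
    for (std::size_t k = 0; k < j.size(); ++k)
        if (j[k]) out.push_back(static_cast<int>(k));
    return out;
}

// Written once: works for any column type because idx is always integer.
template <typename T>
std::vector<T> take(const std::vector<T>& col, const std::vector<int>& idx) {
    std::vector<T> out;
    for (std::size_t k = 0; k < idx.size(); ++k) out.push_back(col[idx[k]]);
    return out;
}
```

In an Rcpp setting, each `case` of the dispatch `switch` would presumably call the matching conversion on the typed vector, and everything after that point (subsetting `name`, `desc`, `range`, and so on) would only ever see the integer index.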

I do wonder whether there is some macro magic that could reduce the code duplication in this form of dispatch code, but...
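As a rough illustration of what such a macro could look like, here is a standalone sketch. It does not use Rcpp at all: `Value` and `Tag` are hypothetical stand-ins for a `SEXP` and its type code, and `DISPATCH_VECTOR`, `length_impl`, and `length_dispatch` are names invented for this example. The macro expands to the same `switch` that would otherwise be written by hand for every dispatching function.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an R object: a type tag plus one payload per type.
enum Tag { TAG_INT, TAG_REAL, TAG_STR, TAG_LGL, TAG_OTHER };

struct Value {
    Tag tag;
    std::vector<int> ints;
    std::vector<double> reals;
    std::vector<std::string> strs;
    std::vector<bool> lgls;
};

// Expands to the mechanical switch: inspect the tag once, then return the
// result of calling the function template FUN on the matching typed payload.
#define DISPATCH_VECTOR(FUN, VALUE)                                   \
    switch ((VALUE).tag) {                                            \
    case TAG_INT:  return FUN((VALUE).ints);                          \
    case TAG_REAL: return FUN((VALUE).reals);                         \
    case TAG_STR:  return FUN((VALUE).strs);                          \
    case TAG_LGL:  return FUN((VALUE).lgls);                          \
    default: throw std::invalid_argument("Unsupported type");         \
    }

// The "implementation": a template that only ever sees a concrete type.
template <typename Vec>
std::size_t length_impl(const Vec& v) {
    return v.size();
}

// The "dispatch": one line per exported function instead of one switch each.
std::size_t length_dispatch(const Value& v) {
    DISPATCH_VECTOR(length_impl, v)
}
```

Rcpp itself appears to ship a macro in this spirit, `RCPP_RETURN_VECTOR`, but its exact form and availability depend on the installed version, so it is worth checking the headers before relying on it.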

Edited on 2021-03-23