enhancement for RCTD function gather_results to increase speed. #196

@szimmerman92

Hello,

Thank you for creating this amazing tool. I noticed that the gather_results function can be quite slow on my Xenium data: it has taken several days for samples with hundreds of thousands of cells. I rewrote gather_results to build its outputs with dplyr's bind_rows, and it appears to be much faster. Do you think this approach could help speed up RCTD? The code is below. Thank you.

library(dplyr)
gather_results <- function(RCTD, results) {
  barcodes = colnames(RCTD@spatialRNA@counts)
  cell_type_names = RCTD@cell_type_info$renorm[[2]]
  # Stack each pixel's pair of doublet weights into one row per barcode
  weights_doublet = bind_rows(lapply(results, function(x) {
    doublet_weights_temp = x$doublet_weights
    names(doublet_weights_temp) = c("first_type","second_type")
    return(doublet_weights_temp)
  }))
  weights_doublet = as.data.frame(weights_doublet)
  rownames(weights_doublet) = barcodes


  weights = bind_rows(lapply(results, function(x) x$all_weights))
  weights = as.data.frame(weights)
  rownames(weights) = barcodes
  colnames(weights) = cell_type_names

  # Per-pixel scalar fields to collect, one column each
  result_fields = c("spot_class","first_type","second_type","first_class","second_class",
                    "min_score","singlet_score","conv_all","conv_doublet")
  results_df = bind_rows(lapply(results, function(x) x[result_fields]))
  results_df = as.data.frame(results_df)
  rownames(results_df) = barcodes
  results_df$first_type = factor(results_df$first_type,levels = cell_type_names)
  results_df$second_type = factor(results_df$second_type,levels = cell_type_names)

  score_mat = lapply(results, function(x) x$score_mat)
  singlet_scores = lapply(results, function(x) x$singlet_scores)
  RCTD@results <- list(results_df = results_df, weights = weights, weights_doublet = weights_doublet,
                       score_mat = score_mat, singlet_scores = singlet_scores)
  return(RCTD)
}
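For anyone who would rather avoid the dplyr dependency, here is a minimal base R sketch of the same idea. It assumes the slowdown comes from assigning into a data frame one row at a time, which this issue does not confirm. It swaps bind_rows for column-wise extraction with vapply. The helper name gather_columns and the mock data are hypothetical and are not part of RCTD. Only a subset of the fields is shown.

```r
# Mock per-pixel results. Field names mirror those in the function above;
# the values are invented for illustration only.
mock_results <- lapply(1:5, function(i) {
  list(spot_class = "singlet",
       first_type = "A",
       second_type = "B",
       min_score = i * 1.5,
       singlet_score = i * 2,
       conv_all = TRUE,
       conv_doublet = FALSE)
})
mock_barcodes <- paste0("cell_", 1:5)

# Hypothetical helper: build each column in a single pass with vapply()
# rather than assigning into a data frame one row at a time.
gather_columns <- function(results, barcodes) {
  pull_chr <- function(field) {
    vapply(results, function(x) as.character(x[[field]]), character(1))
  }
  pull_num <- function(field) {
    vapply(results, function(x) as.numeric(x[[field]]), numeric(1))
  }
  pull_lgl <- function(field) {
    vapply(results, function(x) as.logical(x[[field]]), logical(1))
  }
  data.frame(
    spot_class = pull_chr("spot_class"),
    first_type = pull_chr("first_type"),
    second_type = pull_chr("second_type"),
    min_score = pull_num("min_score"),
    singlet_score = pull_num("singlet_score"),
    conv_all = pull_lgl("conv_all"),
    conv_doublet = pull_lgl("conv_doublet"),
    row.names = barcodes,
    stringsAsFactors = FALSE
  )
}

mock_df <- gather_columns(mock_results, mock_barcodes)
print(dim(mock_df))
```

Because vapply checks the type and length of every element, a pixel with a missing or malformed field raises an error instead of silently producing a misaligned row. Whether this is faster than bind_rows on real Xenium-sized inputs has not been measured here.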
