Reading a .tsv.gz file #18

@slowkow

Suppose we have a matrix of single-cell RNA-seq data that looks like this:

zcat exprMatrix.tsv.gz | head | cut -f1-5

gene    sci3-me-001.GTCGGAGTTTGAGGTAGAA sci3-me-001.ATTAGTCTGTGTATAATACG        sci3-me-001.GAGGAACTTAATACCATCC sci3-me-001.TTCGCGGATACTCTCTCAA
ENSMUSG00000051951.5|Xkr4       0       0       0       0
ENSMUSG00000103377.1|Gm37180    0       0       0       0
ENSMUSG00000104017.1|Gm37363    0       0       0       0
ENSMUSG00000103025.1|Gm37686    0       0       0       0
ENSMUSG00000089699.1|Gm1992     0       0       0       0
ENSMUSG00000103201.1|Gm37329    0       0       0       0
ENSMUSG00000103161.1|Gm38148    0       0       0       0
ENSMUSG00000102331.1|Gm19938    0       0       0       0
ENSMUSG00000102343.1|Gm37381    0       0       0       0

Rows (including the header line):

zcat exprMatrix.tsv.gz | wc -l

26184

Columns (including the leading gene column):

zcat exprMatrix.tsv.gz | head -n1 | wc -w

2058653
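
Both counts above include one extra entry: the header line for rows and the leading gene column for columns. As a minimal sketch, the two numbers can be computed in a single pass with the extras subtracted. The helper name `matrix_dims` is hypothetical and not part of any package; the sketch assumes a single header line, tab-separated fields, and a first column holding the gene identifiers.

```shell
# Print the number of genes (data rows) and cells (data columns) of a
# gzipped TSV matrix. Assumes line 1 is the header and field 1 is the
# gene identifier. matrix_dims is a hypothetical helper name.
matrix_dims() {
  # gzip -dc decompresses to stdout, like zcat
  gzip -dc "$1" | awk -F'\t' '
    NR == 1 { cells = NF - 1 }   # header: drop the leading "gene" field
    END     { printf "genes=%d cells=%d\n", NR - 1, cells }
  '
}
```

It would be called as `matrix_dims exprMatrix.tsv.gz`. Under the same assumptions, the counts reported above correspond to 26183 genes and 2058652 cells.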

Could I please ask if you might be able to share an R code snippet for how to use the beachmat package to read this data into a sparse matrix (dgCMatrix)?
