TL;DR: use parquet instead of CSV. At a minimum, compress bcerror CSVs with gzip before adding them to the repo.
Parquet files are much smaller on disk and faster to parse than CSVs. Not high priority, but it would be useful to incorporate into remora pipelines where we just want per-base stats.
Might be as simple as:
```r
library(readr)
library(nanoparquet)

write_parquet(read_csv("file.csv"), "file.parquet")

# then compare file sizes
file.info("file.csv")
file.info("file.parquet")

# reload file in subsequent analyses
read_parquet("file.parquet")
```
Could also combine multiple CSVs into one parquet file with a column for sample name.
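A minimal sketch of the combining step, assuming dplyr is available and using hypothetical file names (swap in the repo's actual bcerror CSVs):

```r
library(readr)
library(dplyr)
library(nanoparquet)

# Hypothetical input files; replace with the real bcerror CSV paths
csv_files <- c("sampleA.csv", "sampleB.csv")

# Read each CSV, name the list elements by sample, and stack them;
# .id adds a "sample" column identifying the source file
tables <- lapply(csv_files, read_csv)
names(tables) <- sub("\\.csv$", "", csv_files)
combined <- bind_rows(tables, .id = "sample")

write_parquet(combined, "combined.parquet")
```

Keeping sample name as a column (rather than one parquet per sample) makes downstream grouped summaries a single `group_by(sample)` away.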