Add more data to recount3

This is a recurring goal, as new data are deposited in the Sequence Read Archive nearly every day.

- [ ] To add more data to `recount3`, we first need computing credits on a large computing cluster such as ACCESS (formerly called XSEDE) https://access-ci.org/.

- [ ] Next, we have to run Monorail https://github.com/langmead-lab/monorail-external to process new data.

- [ ] The outputs are then transferred to a local cluster where we can keep a backup of the data. In the `recount3` paper, this is called the aggregation node. There, files across studies are aggregated.

- [ ] The data are then uploaded to IDIES, the AWS Open Data Sponsorship Program https://aws.amazon.com/marketplace/pp/prodview-t3rflz3f557jq#resources, AnVIL, or any other active mirrors. The uploaded files have to follow the data structure that the `recount3` R package expects.
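
As a rough illustration of the last step, here is a minimal Python sketch that checks a local staging directory before uploading it to a mirror. The directory layout encoded in `gene_sums_relpath()` is an assumption based on how current public `recount3` URLs appear to be organized; it is not taken from this issue and should be confirmed against the `recount3` documentation. The function names and the `G026` annotation default are hypothetical and only serve this example.

```python
from pathlib import Path


def gene_sums_relpath(organism, data_source, project, annotation="G026"):
    """Relative path of a project's gene count file.

    ASSUMED layout (verify against the recount3 documentation):
    <organism>/data_sources/<source>/gene_sums/<last 2 chars of project>/
    <project>/<source>.gene_sums.<project>.<annotation>.gz
    """
    return (
        Path(organism) / "data_sources" / data_source / "gene_sums"
        / project[-2:] / project
        / f"{data_source}.gene_sums.{project}.{annotation}.gz"
    )


def missing_gene_sums(root, organism, data_source, projects):
    """Return the projects whose gene count file is absent under `root`."""
    root = Path(root)
    return [
        project
        for project in projects
        if not (root / gene_sums_relpath(organism, data_source, project)).is_file()
    ]
```

Running `missing_gene_sums()` on the aggregation node output before an upload would flag projects that the R package could not locate, under the assumed layout. A real check would also need to cover the metadata, exon, junction, and coverage files.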

There are additional steps that are part of the `recount3` world, such as:

* generating the tissue predictions and the other predictions that Shijie C. Zheng ran for the initial `recount3` release
* running the processing steps needed to generate the Snaptron compilations, which Christopher Wilks ran for the initial release
* obtaining genotype predictions, which Afrooz Razi et al. recently did using part of the Monorail output that is not publicly shared https://doi.org/10.1101/2023.10.21.562237
* updating the `recount3` study browser https://github.com/LieberInstitute/recount3-docs/tree/master/study-explorer


This goal falls outside the scope of the `recount3` R package, though the R package is one of the most commonly used interfaces for the data. Accomplishing this goal will likely need its own support and/or coordination with [Wilks et al.](https://doi.org/10.1186/s13059-021-02533-6) and/or [Razi et al.](https://doi.org/10.1101/2023.10.21.562237).
