Clinical Download for All Donors - Donor Download endpoint by Arranger Filter #699

@joneubank

Detailed Description

We want to be able to download clinical data for every donor matched by a File Repository query, including a query with no filters. To accomplish this, we want to add an endpoint to the gateway that accepts a File Repository SQON filter and returns the TSV download for all donors included in that filter.

Possible Implementation

This endpoint can be a GET request that takes the SQON as a query parameter. If no SQON is provided, we will use a default case of "all files" (no filter).
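As a rough illustration of the query-parameter handling, the sketch below parses an optional `sqon` parameter and falls back to the "all files" default. The function name `parseSqonParam`, the parameter name `sqon`, and the use of `null` to mean "no filter" are assumptions made for this example; the types only approximate Arranger's `{ op, content }` SQON shape.

```typescript
// Hypothetical SQON types, loosely following Arranger's { op, content } convention.
type SqonLeaf = { op: string; content: { field: string; value: unknown } };
type Sqon = { op: string; content: Array<Sqon | SqonLeaf> };

// Assumption: `null` represents the default "all files" case (no filter).
function parseSqonParam(raw: string | undefined): Sqon | null {
  // A missing or empty parameter means "no filter".
  if (raw === undefined || raw.trim() === '') return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    throw new Error('sqon query parameter is not valid JSON');
  }

  // Minimal structural check only; a real handler would validate more strictly.
  const candidate = parsed as { op?: unknown; content?: unknown } | null;
  if (
    typeof candidate !== 'object' ||
    candidate === null ||
    typeof candidate.op !== 'string' ||
    !Array.isArray(candidate.content)
  ) {
    throw new Error('sqon query parameter is not a valid SQON');
  }
  return candidate as Sqon;
}
```

A handler would pass the raw query-string value to this function and respond with a 400 if it throws.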

The handler should use this SQON to get the list of unique donor IDs from the ES file-centric index. It is important that this query applies the same server-side filters we apply to all Arranger requests, which restrict results based on user permissions and the file embargo stage metadata. Once the list of donors is retrieved, the donor data can be fetched from the clinical service.
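One way to make sure the server-side filter is always applied is to wrap the user's SQON in an `and` with the access filter before running the query. This is only a sketch: the issue does not say how the gateway builds its server-side filters, so `withServerSideFilter` and the `embargo_stage` field in the usage example are hypothetical.

```typescript
// Hypothetical minimal SQON node type for this example.
type SqonNode = { op: string; content: unknown };

// Combine the caller's filter with the server-side access filter.
// Assumption: `null` means the caller supplied no filter ("all files").
function withServerSideFilter(userSqon: SqonNode | null, accessFilter: SqonNode): SqonNode {
  // With no user filter, the access filter alone defines what is visible.
  if (userSqon === null) return accessFilter;
  // Otherwise both must hold, so the user can never widen their own access.
  return { op: 'and', content: [accessFilter, userSqon] };
}

// Usage with made-up field names and values:
const accessFilter: SqonNode = {
  op: 'in',
  content: { field: 'embargo_stage', value: ['PUBLIC'] },
};
const userFilter: SqonNode = {
  op: 'and',
  content: [{ op: 'in', content: { field: 'study_id', value: ['EXAMPLE-STUDY'] } }],
};
const effectiveFilter = withServerSideFilter(userFilter, accessFilter);
```

Building the combined filter in one place keeps the permission and embargo logic out of the download handler itself.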

Considerations for large queries

Since the number of matching files will likely be in the tens or hundreds of thousands, we should not fetch the file documents themselves; instead, we should retrieve the unique donor ID values through an aggregation. This will work well until we hit the ES max buckets limit (around 65k). A composite aggregation should allow streaming all unique donor IDs for the filter. A limit on the maximum number of donors per request may be needed as the total number of ARGO donors increases.
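The composite aggregation approach could look roughly like the sketch below, which pages through unique donor IDs using `after_key`. The field name `donors.donor_id`, the helper names, and the injected `search` function (standing in for whatever ES client the gateway uses) are assumptions for illustration. If the donor field is mapped as `nested`, the composite aggregation would likely need to be wrapped in a nested aggregation, which is not shown here.

```typescript
type CompositeKey = { donor_id: string };
type CompositeResponse = {
  aggregations: {
    donors: { buckets: Array<{ key: CompositeKey }>; after_key?: CompositeKey };
  };
};

// Hypothetical field name; the real one depends on the file-centric index mapping.
const DONOR_ID_FIELD = 'donors.donor_id';

// Build the search body for one page of unique donor IDs.
function buildDonorPageRequest(query: object, pageSize: number, after?: CompositeKey) {
  return {
    size: 0, // no file documents needed, only aggregation buckets
    query,
    aggs: {
      donors: {
        composite: {
          size: pageSize,
          sources: [{ donor_id: { terms: { field: DONOR_ID_FIELD } } }],
          ...(after ? { after } : {}), // resume after the previous page's last key
        },
      },
    },
  };
}

// Extract the donor IDs and the resume key from one response.
function readDonorPage(res: CompositeResponse) {
  const { buckets, after_key } = res.aggregations.donors;
  return { donorIds: buckets.map((b) => b.key.donor_id), after: after_key };
}

// Page through every unique donor ID, handing each page to `onPage`.
// Assumption: exceeding `maxDonors` is treated as an error.
async function forEachDonorPage(
  search: (body: object) => Promise<CompositeResponse>,
  query: object,
  onPage: (donorIds: string[]) => void,
  pageSize = 1000,
  maxDonors = Number.MAX_SAFE_INTEGER,
): Promise<void> {
  let after: CompositeKey | undefined;
  let seen = 0;
  do {
    const page = readDonorPage(await search(buildDonorPageRequest(query, pageSize, after)));
    seen += page.donorIds.length;
    if (seen > maxDonors) throw new Error('donor limit exceeded for this download');
    if (page.donorIds.length > 0) onPage(page.donorIds);
    // An empty page means the aggregation is exhausted.
    after = page.donorIds.length > 0 ? page.after : undefined;
  } while (after !== undefined);
}
```

Because each page carries at most `pageSize` buckets, no single request comes near the max buckets limit, and the donor IDs can be forwarded to the clinical service page by page instead of being held in memory all at once.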

Labels: new-feature
