Create script to auto-fill control/condition samples #55

@sgiannouk

Description

Create a helper script (pipelines/geo/autoassign_sample_groups.py) that automatically fetches GEO sample metadata and classifies samples as control or condition based on their titles/characteristics.

Goals
[ ] Parse mirna_experiment_info.tsv

[ ] For each GSE, query NCBI Entrez API to get all GSM sample metadata

[ ] Classify samples into Control and Condition using Claude Code skills

[ ] Output a draft TSV with control_samples and condition_samples columns filled

[ ] Report when classification is done and flag ambiguous cases for manual review
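
A minimal sketch of how these goals could fit together, under several assumptions. The issue calls for classification via Claude Code skills; the keyword heuristic below is only a stand-in so that the data flow is visible. The keyword lists, the `needs_review` column, the comma separator, and all function names are hypothetical. The ESearch query (`db=gds`, `[ACCN]` and `[ETYP]` field tags) follows the pattern used for GEO in NCBI E-utilities, but the exact term and the response layout should be checked against the NCBI documentation before relying on it.

```python
import re
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Hypothetical keyword lists; a real run would use Claude Code skills instead.
CONTROL_WORDS = {"control", "ctrl", "healthy", "normal", "untreated", "vehicle", "mock", "wt"}
CONDITION_WORDS = {"treated", "tumor", "tumour", "cancer", "disease", "patient",
                   "knockdown", "knockout", "ko", "kd", "mutant", "infected"}


def gsm_search_url(gse):
    """Build an ESearch URL for the GSM records of one series (assumed query form)."""
    params = {"db": "gds", "term": f"{gse}[ACCN] AND gsm[ETYP]",
              "retmax": 500, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def classify(title):
    """Return 'control', 'condition', or 'ambiguous' for one sample title."""
    # Split on anything that is not a letter or digit, so "KO_rep1" yields "ko".
    tokens = set(re.split(r"[^a-z0-9]+", title.lower()))
    is_control = bool(tokens & CONTROL_WORDS)
    is_condition = bool(tokens & CONDITION_WORDS)
    if is_control and not is_condition:
        return "control"
    if is_condition and not is_control:
        return "condition"
    return "ambiguous"  # no keyword, or keywords from both lists


def fill_row(row, samples):
    """Fill the draft columns of one TSV row from a {GSM accession: title} mapping."""
    groups = {"control": [], "condition": [], "ambiguous": []}
    for gsm, title in samples.items():
        groups[classify(title)].append(gsm)
    row["control_samples"] = ",".join(groups["control"])
    row["condition_samples"] = ",".join(groups["condition"])
    row["needs_review"] = ",".join(groups["ambiguous"])  # hypothetical flag column
    return row


# Placeholder accessions and titles, not real GEO records.
example = fill_row({}, {
    "GSM1": "Control_rep1",
    "GSM2": "Tumor_rep1",
    "GSM3": "Sample 3",
})
print(example)
```

Titles that match neither list, or both, end up in `needs_review` rather than being guessed, which keeps the draft conservative ahead of manual review.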

Final Workflow

  1. Run autoassign_sample_groups.py → generates draft with auto-filled columns
  2. Manual review and corrections
  3. Run RNA-seq pipeline with verified sample assignments
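
Step 2 could be supported by a small check that lists which draft rows still need attention before the pipeline runs. The sketch below is only an illustration: the `gse_id` and `needs_review` column names are hypothetical, and only `control_samples` and `condition_samples` come from this issue.

```python
import csv
import io


def rows_needing_review(tsv_text):
    """Return (line number, reasons) for every draft row that is not ready."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pending = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        reasons = []
        if not (row.get("control_samples") or "").strip():
            reasons.append("no control samples")
        if not (row.get("condition_samples") or "").strip():
            reasons.append("no condition samples")
        if (row.get("needs_review") or "").strip():
            reasons.append("ambiguous samples")
        if reasons:
            pending.append((line_no, reasons))
    return pending


# Placeholder draft content, not real GEO accessions.
draft = (
    "gse_id\tcontrol_samples\tcondition_samples\tneeds_review\n"
    "GSE0001\tGSM1,GSM2\tGSM3,GSM4\t\n"
    "GSE0002\t\tGSM5\tGSM6\n"
)
print(rows_needing_review(draft))
```

An empty result would mean every row has both groups filled and nothing flagged, so the draft could move on to step 3.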

Labels

enhancement: New feature or request
