This repository contains the code for Jiazheng Miao's (MBI Class of 2025) Capstone Project, a mathematical model of the expression level of espA in Mycobacterium tuberculosis.
Run the scripts in the following order:

1. `sbatch_download.sh`: Download the data listed in `dna_accession.txt` and `rna_accession.txt`
2. `sbatch_dnaseq_pe.sh`: Process paired-end DNA-seq data
3. `sbatch_dnaseq_se.sh`: Process single-end DNA-seq data
4. `sbatch_rnaseq.sh`: Process RNA-seq data
5. `organize_data.py`: Aggregate VCF files into a CSV file, convert RNA read counts to log FPKM, and screen for RD8/RD236a deletions
6. `merge_replicates.R`: Combine identified variants and average the expression level across technical replicates
7. `pca.R`: Decompose the variant matrix (espA regulatory region excluded) into principal components (PCs)
8. `modeling.Rmd`: Perform the modeling
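
For readers unfamiliar with the read-count conversion done in `organize_data.py`, here is a minimal sketch of a log FPKM calculation. It is not taken from the script itself: the function name `log_fpkm`, the gene names, the log base 2, and the pseudocount of 1 are all assumptions made for illustration, and the script may use a different base or pseudocount.

```python
import math


def log_fpkm(counts, lengths_bp, pseudocount=1.0):
    """Convert raw read counts to log2(FPKM + pseudocount).

    counts:      dict mapping gene name -> mapped fragment count
    lengths_bp:  dict mapping gene name -> gene length in base pairs
    pseudocount: assumed offset that keeps log2 defined for zero counts
    """
    total_fragments = sum(counts.values())
    if total_fragments == 0:
        raise ValueError("no mapped fragments in this sample")

    result = {}
    for gene, count in counts.items():
        # FPKM = fragments per kilobase of gene per million mapped fragments
        fpkm = count * 1e9 / (lengths_bp[gene] * total_fragments)
        result[gene] = math.log2(fpkm + pseudocount)
    return result


# Hypothetical toy sample: geneB has 3x the reads but is 3x longer,
# so both genes end up with the same FPKM.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
values = log_fpkm(counts, lengths)
```

Normalizing by gene length and by sequencing depth is what makes the values comparable across genes and samples, which matters before averaging over technical replicates.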