- 李柏漢
- 林穎彥
- 黃宇秀
- 邱淦均
For the full demo version, see docs/hic-preprocess-steps.md
# sra-tool env
docker pull ncbi/sra-tools
docker run -itd -v /mnt/e/workspace/bio-fp/:/home/yy/bio-fp --name yy-sra ncbi/sra-tools
docker attach yy-sra
prefetch SRR5579177
fasterq-dump SRR5579177 -p
# other process env
docker pull ubuntu
docker images
docker run -itd -v /mnt/e/workspace/bio-fp/:/home/yy/bio-fp/ --name yy-biofp ubuntu
docker attach yy-biofp
apt update
apt install fastqc
fastqc SRR5579177_1.fastq SRR5579177_2.fastq
apt install cutadapt
cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT \
-q 20,20 -m 36 \
-o trimmed_reads_SRR5579177_1.fastq -p trimmed_reads_SRR5579177_2.fastq \
SRR5579177_1.fastq SRR5579177_2.fastq
wget https://hgdownload.cse.ucsc.edu/goldenPath/dm3/bigZips/dm3.fa.gz
gunzip dm3.fa.gz
bowtie-build ../dm3.fa dm3_index
bowtie dm3_index -1 ../SRR5579177_1.fastq -2 ../SRR5579177_2.fastq -t -a -m 1 --best -S dm3_bowtie_align_output.sam
samtools view -bS dm3_bowtie_align_output.sam > output.bam
samtools sort output.bam -o sorted.bam
pairtools parse -c dm3.chrom.sizes -o output.pairsam dm3_bowtie_align_output.sam
pairtools sort -o sorted.pairsam output.pairsam
pairtools dedup -o dedup.pairsam sorted.pairsam
pairtools select '(pair_type == "UU")' -o output.pairs dedup.pairsam
cd code
Rscript contact_file_generate.R
Rscript contact_map_generate.R
python3 build-cooler.py
cooler zoomify output.cool
python3 mcool_map.py
Idea by Noble WS (2009) A Quick Guide to Organizing Computational Biology Projects. PLoS Comput Biol 5(7): e1000424.
Presentation: 1131_bioinformatics_FP_group1.pdf
Related Document
- docs/hic-preprocess-steps.md
- data/data-src.md
- Source
- data/data-src.md (experiment data source details)
- Format
- sra
- fastq
- fa (fasta)
- sam
- bam
- txt
- ebwt
- Size
- SRA: ~ 15.3 GB
- FASTQ: ~ 68 GB
- FASTA (dm3): ~ 164 MB
- EBWT Bowtie Index: ~ 1 KB to ~ 161 MB
- SAM: ~ 115 GB
- Sizes (dm3): ~ 1 KB
- PairSAM: ~ 60.8 GB to ~ 133 GB
- Pairs: ~ 60.8 GB
- BINS: ~ 332 KB
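The `dm3.chrom.sizes` file passed to `pairtools parse -c` appears in the size list, but the commands do not show how it was produced. One possible way is to derive it from the reference FASTA, as in the minimal sketch below. The function names `chrom_sizes` and `write_chrom_sizes` and the tiny demo sequences are hypothetical, introduced only for illustration. This is not necessarily how the file in this project was created.

```python
import io


def chrom_sizes(fasta_lines):
    """Return {sequence_name: length} for FASTA text given as an iterable of lines."""
    sizes = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep only the first whitespace-delimited token as the name.
            name = line[1:].split()[0]
            sizes[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current record.
            sizes[name] += len(line)
    return sizes


def write_chrom_sizes(sizes, out):
    """Write one 'name<TAB>length' line per sequence, in FASTA order."""
    for name, length in sizes.items():
        out.write(f"{name}\t{length}\n")


# Made-up two-record FASTA, used only to show the behaviour.
demo_fasta = io.StringIO(">chrA demo\nACGTAC\nGT\n>chrB\nNNNN\n")
demo_sizes = chrom_sizes(demo_fasta)

buf = io.StringIO()
write_chrom_sizes(demo_sizes, buf)
print(buf.getvalue(), end="")
```

For the real reference, the same functions could be applied to an open handle on `dm3.fa`, writing to `dm3.chrom.sizes`.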
Which packages do you use?
- original packages in the paper
- bowtie
- additional packages you found
- sra-tools
- wget
- fastqc
- cutadapt
- reptyr
- samtools
- pairtools
- R Lib: ggplot2
- R Lib: reshape2
- Cooler
- Python Lib: cooler
Analysis steps
- Download SRA/Reference Genome Files
- Convert SRA to FASTQ
- FASTQ Quality Control
- Build Bowtie Index
- Trimming
- Alignment
- Build Pairs
- Store SAM
- Create Contact File
- Build Contact Matrix
- Visualize Contact Map
- Reproduce Part
- Figure 1A Hi-C Contact Map
- result/res_10000_contact_heatmap.png
- QC Reports:
- results/fastqc-report/SRR5579177_1_fastqc.html
- results/fastqc-report/SRR5579177_2_fastqc.html
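The "Create Contact File" and "Build Contact Matrix" steps turn the filtered `output.pairs` file into per-bin contact counts. The R scripts that do this are not reproduced here, so the sketch below is only an assumption about the general technique, not their actual content. It assumes the standard `.pairs` column order (readID, chrom1, pos1, chrom2, pos2, strand1, strand2, pair_type) and a 10 kb bin size, guessed from the `res_10000` file name. The function name `bin_pairs` and the demo records are hypothetical.

```python
from collections import Counter


def bin_pairs(pairs_lines, resolution=10_000, chrom=None):
    """Count contacts per pair of genomic bins from .pairs-formatted lines.

    Returns a Counter keyed by ((chrom, bin_index), (chrom, bin_index)),
    with the two bins ordered so each contact is counted under one key.
    """
    counts = Counter()
    for line in pairs_lines:
        # Skip header lines (they start with '#') and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[3], int(fields[4])
        # Optionally restrict to contacts within a single chromosome.
        if chrom is not None and (chrom1 != chrom or chrom2 != chrom):
            continue
        # Integer division assigns each position to a fixed-width bin.
        # (The 1-based coordinate nuance at exact bin boundaries is ignored here.)
        a = (chrom1, pos1 // resolution)
        b = (chrom2, pos2 // resolution)
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


# Made-up records, for illustration only.
demo_pairs = [
    "## pairs format v1.0\n",
    "#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type\n",
    "r1\tchr2L\t1500\tchr2L\t25000\t+\t-\tUU\n",
    "r2\tchr2L\t9000\tchr2L\t21000\t-\t+\tUU\n",
    "r3\tchr2L\t12000\tchr2L\t3000\t+\t+\tUU\n",
]
contacts = bin_pairs(demo_pairs)
for (bin_a, bin_b), n in sorted(contacts.items()):
    print(bin_a, bin_b, n)
```

Storing only the ordered bin pair keeps the table sparse. A dense symmetric matrix for plotting can be filled from it by mirroring each count across the diagonal.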
| Information | |
|---|---|
| Organism | Drosophila melanogaster |
| Instrument Model | Illumina HiSeq 2500 |
| Mapping Genome | Drosophila melanogaster genome (assembly dm3) |
| Data Processing Software | bowtie (params: -t -a -m 1 --best) |
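With `-m 1`, bowtie suppresses alignments for reads that have more than one reportable alignment, so a noticeable share of records in the SAM output may be flagged as unaligned. A quick way to gauge this is to count records by the unmapped bit (`0x4`) of the SAM FLAG field. The sketch below is a minimal illustration with made-up records. The function name `mapping_summary` is hypothetical, and for a SAM file of the size listed above a streaming tool such as `samtools flagstat` would be the more practical choice.

```python
def mapping_summary(sam_lines):
    """Count total, mapped, and unmapped records in SAM-formatted lines."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Header lines start with '@' and carry no alignment record.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid alignment record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        total += 1
        # Bit 0x4 of FLAG is set when the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return {"total": total, "mapped": mapped, "unmapped": total - mapped}


# Made-up records: one mapped pair (flags 99/147), one unmapped pair (flags 77/141).
demo_sam = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "r1\t99\tchr2L\t100\t255\t4M\t=\t300\t204\tACGT\tIIII\n",
    "r1\t147\tchr2L\t300\t255\t4M\t=\t100\t-204\tACGT\tIIII\n",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
    "r2\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
summary = mapping_summary(demo_sam)
print(summary)
```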
- Any improvement or change in your package?
- SRA-Tools GitHub
- SRA-Tools Docker Hub
- SRA-Tools Docker Wiki
- Prefetch and Fasterq-Dump
- HowTo: Fasterq-Dump
- docs
- Related publications