Convert EDA analysis script to parametrized Quarto report #4

@jibarozzo

Overview

Convert the existing R script R/002_eda_analysis.R into a parametrized Quarto report to improve reproducibility, documentation, and flexibility.

Current State

  • Script performs exploratory data analysis for microbiome data
  • Hard-coded file paths and parameters
  • Limited documentation and visualization output
  • No structured reporting format

Proposed Changes

1. Create Quarto Document Structure

  • Convert to .qmd format with YAML header
  • Add parameter definitions for:
    • Input file paths (phyloseq object, metadata)
    • Analysis thresholds (prevalence cutoffs, read count filters)
    • Visualization options (colors, themes, figure dimensions)
    • Output options (file formats, directories)

2. Parameterize Key Variables

params:
  physeq_file: "data/output/processed/sabr_2023_physeq_object.rda"
  prevalence_threshold: 0.5
  min_reads: 5000
  output_dir: "reports/eda_output"
  figure_width: 10
  figure_height: 8
  theme: "minimal"
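With the knitr engine, values declared under `params` are available to R chunks as a list named `params`. The sketch below shows how the converted script might read them in place of hard-coded values. Outside a render that list does not exist, so the sketch builds a stand-in with the same defaults. The `.qmd` file name is hypothetical, and the render call runs only if the `quarto` R package and that file are present.

```r
# Stand-in for the `params` list that Quarto (knitr engine) creates at render
# time; the values mirror the defaults proposed above.
params <- list(
  physeq_file = "data/output/processed/sabr_2023_physeq_object.rda",
  prevalence_threshold = 0.5,
  min_reads = 5000,
  output_dir = "reports/eda_output",
  figure_width = 10,
  figure_height = 8,
  theme = "minimal"
)

# Inside a chunk, hard-coded values become lookups on `params`.
physeq_path <- params$physeq_file
fig_dims <- c(width = params$figure_width, height = params$figure_height)

# Overriding defaults for a single render without editing the document.
# "reports/002_eda_analysis.qmd" is a hypothetical file name.
qmd <- "reports/002_eda_analysis.qmd"
overrides <- list(prevalence_threshold = 0.25, min_reads = 1000)
if (requireNamespace("quarto", quietly = TRUE) && file.exists(qmd)) {
  quarto::quarto_render(qmd, execute_params = overrides)
}
```

Parameters left out of `overrides` keep the defaults from the YAML header, so one document can serve several datasets or thresholds.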

3. Improve Documentation

  • Add narrative text explaining each analysis section
  • Include interpretation of results
  • Add method descriptions and citations
  • Provide clear figure captions

4. Enhance Visualizations

  • Add consistent theming across plots
  • Include interactive plots where appropriate
  • Ensure all plots have proper titles, labels, and legends
  • Add summary tables for key findings

5. Structure Sections

  • Executive Summary
  • Data Overview
  • Read Count Analysis
  • Coverage Analysis
  • Prevalence & Core Microbiome
  • Conclusions and Next Steps
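To illustrate how the two thresholds could drive the Read Count Analysis and Prevalence & Core Microbiome sections, here is a minimal base-R sketch on a made-up count matrix. It assumes prevalence means the fraction of retained samples in which a taxon has a non-zero count. The issue does not say how the existing script defines it, and a phyloseq object would need its count table extracted first.

```r
# Toy count matrix: rows are taxa, columns are samples (made-up numbers).
counts <- matrix(
  c(4000, 3000,   0, 6000,
    2000,    0,   0, 1000,
       0, 2500, 100,    0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_a", "taxon_b", "taxon_c"),
                  c("s1", "s2", "s3", "s4"))
)

# In the report these would come from params$min_reads and
# params$prevalence_threshold.
min_reads <- 5000
prevalence_threshold <- 0.5

# Read count filter: keep samples whose total reads reach min_reads.
sample_depth <- colSums(counts)
kept <- counts[, sample_depth >= min_reads, drop = FALSE]

# Prevalence: fraction of retained samples in which a taxon is present.
prevalence <- rowMeans(kept > 0)
core_taxa <- names(prevalence)[prevalence >= prevalence_threshold]

print(colnames(kept))
print(core_taxa)
```

Here sample `s3` (100 reads) is dropped by the read count filter, and `taxon_c`, present in one of the three remaining samples, falls below the 0.5 prevalence cutoff.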

6. Output Options

  • Support multiple output formats (HTML, PDF, Word)
  • Include option for self-contained reports
  • Add table of contents and cross-references

Implementation Notes

  • Maintain compatibility with existing data objects
  • Ensure all dependencies are properly loaded
  • Add error handling for missing files or parameters
  • Include session information for reproducibility
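The error-handling and session-information notes could be covered by something like the following sketch. `validate_params` is a hypothetical helper, not part of the existing script, and which parameters count as required is an assumption.

```r
# Hypothetical helper: stop early with a readable message when a required
# parameter is absent or the input file does not exist.
validate_params <- function(params, required = c("physeq_file", "output_dir")) {
  missing_names <- setdiff(required, names(params))
  if (length(missing_names) > 0) {
    stop("Missing parameter(s): ", paste(missing_names, collapse = ", "),
         call. = FALSE)
  }
  if (!file.exists(params$physeq_file)) {
    stop("Input file not found: ", params$physeq_file, call. = FALSE)
  }
  # Create the output directory if needed; no warning if it already exists.
  dir.create(params$output_dir, recursive = TRUE, showWarnings = FALSE)
  invisible(TRUE)
}

# Example: a missing input file yields a clear message instead of an
# obscure failure later in the analysis.
result <- tryCatch(
  validate_params(list(physeq_file = "no/such/file.rda", output_dir = tempdir())),
  error = function(e) conditionMessage(e)
)
print(result)

# Typically placed in the final chunk of the report for reproducibility.
sessionInfo()
```

Running the check in the first chunk makes a render fail fast, before any expensive analysis step starts.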

This conversion will make the EDA more accessible to collaborators and easier to adapt for future projects.

Labels: documentation, enhancement
