hptmUsage

Preprocessing, visualization, and analysis (e.g., differential usage) of histone post-translational modifications (hPTMs). This package builds on the ‘msqrob2PTM’ workflow to enable robust and performant analysis of hPTM data.

Installation

Install the development version of hptmUsage from GitHub with:

# Install using "pak"; alternatively, use `devtools::install_github()` or `renv::install()`
# install.packages("pak")
pak::pak("rualmey/hptmUsage")

This package requires the following software to be installed (and to be found on PATH):

MAFFT for multiple sequence alignment
Quarto for reporting of results
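
Before running the alignment or reporting steps, it can help to confirm that both tools are visible from R. The following is a minimal sketch using base R only; the executable names `mafft` and `quarto` are assumptions and may differ on your system (for example, a wrapper script with another name):

```r
# Look up the executables on the PATH.
# Sys.which() returns an empty string for any program it cannot find.
# NOTE: the names "mafft" and "quarto" are assumed, not taken from hptmUsage.
tools <- Sys.which(c("mafft", "quarto"))

# Collect the names of the tools that were not found
not_found <- names(tools)[tools == ""]

if (length(not_found) > 0) {
  warning("Not found on PATH: ", paste(not_found, collapse = ", "))
} else {
  message("MAFFT and Quarto were both found on PATH")
}
```

If a tool is reported as missing even though it is installed, restarting the R session after updating PATH is usually needed, since R reads the environment at startup.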

Usage

Example code for processing a benchmark dataset is available in this repo.

Below, some basic functionality is shown:

# hptmUsage automatically loads QFeatures for its infrastructure
library(hptmUsage)
#> Loading required package: QFeatures
#> Loading required package: MultiAssayExperiment
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#> 
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#>     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#>     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#>     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#>     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#>     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#>     colWeightedMeans, colWeightedMedians, colWeightedSds,
#>     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#>     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#>     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#>     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#>     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#>     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#>     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#>     rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: generics
#> 
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#> 
#>     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#>     setequal, union
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
#>     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
#>     get, grep, grepl, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, saveRDS, table, tapply, unique,
#>     unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#> 
#>     findMatches
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> 
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#> 
#>     rowMedians
#> The following objects are masked from 'package:matrixStats':
#> 
#>     anyMissing, rowMedians
#> 
#> Attaching package: 'QFeatures'
#> The following object is masked from 'package:base':
#> 
#>     sweep

# Reading a Progenesis QIP all ion export
# As of now, this is the only supported input data format
hptmUsageData("all_ion_export.csv") |>
  readProgenesis()
#> Some features had a note:
#> * Feature 6: This feature has a note attached to it!
#> * Feature 38342: This feature lost its ID, for example due to feature editing without redoing tags
#> Warning in readProgenesis(hptmUsageData("all_ion_export.csv")): Some features
#> have no assigned sequence, please verify. These will be dropped: 38342
#> An instance of class QFeatures containing 1 set(s):
#>  [1] precursorRaw: SummarizedExperiment with 4 rows and 10 columns

# Retrieving histones from UniProt and aligning their sequences using MAFFT
# Note: this requires MAFFT to be on the system PATH
histonesFromUniprot() |>
  alignHistones()
#> $unaligned
#> AAStringSetList of length 5
#> [["H1"]] H10_HUMAN=MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQSIQKYIKSHY...
#> [["H2A"]] H2AY_HUMAN=MSSRGGKKKSTKTSRSAKAGVIFPVGRMLRYIKKGHPKYRIGVGAPVYMAAVLEYL...
#> [["H2B"]] H2B1K_HUMAN=MPEPAKSAPAPKKGSKKAVTKAQKKDGKKRKRSRKESYSVYVYKVLKQVHPDTGI...
#> [["H3"]] CENPA_HUMAN=MGPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLGASSHQHSRRRQGWLKEIRKLQK...
#> [["H4"]] H4_HUMAN=MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVL...
#> 
#> $msa
#> AAStringSetList of length 5
#> [["H1"]] H10_HUMAN=-----------------------------------MTENST-SAPAA-----------...
#> [["H2A"]] H2A1A_HUMAN=-MSGR-----GK-QGGKARAKSKSRSSRAGLQFPVGRIHRLLRKGNYAE-RIGAG...
#> [["H2B"]] H2B1A_HUMAN=--------MPEVSSKGAT---ISKK-----G-FKKAVV--------KTQKK-EGK...
#> [["H3"]] H33_HUMAN=MARTKQTARKSTGGKAPRKQLATKAAR----KSAPSTGGVKKPHRYRPGTVALREIRR...
#> [["H4"]] H4_HUMAN=MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVL...
#> 
#> $msa_ref
#> AAStringSetList of length 5
#> [["H1"]] ref_H11_HUMAN=-----------------------------------MSETVP-PAPAASAAP---...
#> [["H2A"]] ref_H2A1B_HUMAN=-MSGR-----GK-QGGKARAKAKTRSSRAGLQFPVGRVHRLLRKGNYSE-R...
#> [["H2B"]] ref_H2B1J_HUMAN=--------MPE-PAKSAP---APKK-----G-SKKAVT--------KAQKK...
#> [["H3"]] ref_H31_HUMAN=MARTKQTARKSTGGKAPRKQLATKAAR----KSAPATGGVKKPHRYRPGTVALR...
#> [["H4"]] ref_H4_HUMAN=MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEET...

# Match peptide sequences to the retrieved histones
ncbtoy |>
  matchHistones(aligned_histones$unaligned, 1)
#> ⠙ 0/5 ETA: ? | Matching sequences
#> ⠹ 1/5 ETA:  3s | Matching sequences
#> ⠸ 4/5 ETA:  1s | Matching sequences
#> An instance of class QFeatures containing 1 set(s):
#>  [1] precursorRaw: SummarizedExperiment with 5472 rows and 10 columns

# More functionality is covered in the example code mentioned above
