Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
a5dc3b5
first commit
lecorguille Apr 7, 2016
6bb9413
add a test data
lecorguille Apr 7, 2016
3bc1583
planemo tes using conda passed
lecorguille Apr 8, 2016
6b93b24
README
lecorguille Apr 8, 2016
11081c1
add tool_dep for pracma
lecorguille Apr 8, 2016
7aec8dc
add .shed.yml
lecorguille Apr 8, 2016
74ade3c
add planemo test results ; change version ; add README.md
lecorguille Apr 8, 2016
0ea7738
planemo shed_test passed
lecorguille Apr 8, 2016
057db9a
add info about conda
lecorguille May 16, 2016
5bbacfd
add travis test
lecorguille Apr 18, 2016
f6bb791
Update README.md
lecorguille Apr 18, 2016
b909b7d
small edit remove IPO mention in Conda
yguitton May 22, 2016
674a5a7
change output order to propose bucketedData to normalization by default
lecorguille Jul 4, 2016
5204b52
Add Metabolomics tag in Galaxy toolshed descriptions.
pkrog Jul 28, 2016
3f116ed
Update .travis.yml
lecorguille Jul 28, 2016
133f0a6
x-axis customization: add chemical shift labels
mtremblayfr May 24, 2016
93fee10
fix a bug on an object name
lecorguille Aug 12, 2016
dfd4777
minor
lecorguille Aug 12, 2016
7e495bd
remove the galaxy dev branch settings
lecorguille Aug 12, 2016
ee794a9
Update README.rst
mtremblayfr Nov 18, 2016
4f8eaf6
Change of file input type to allow bucketing of preprocessed files
mtremblayfr Oct 25, 2016
f9f1c1c
Bucketing depends on file type: if Bruker, read of raw files; if tsv,…
mtremblayfr Oct 25, 2016
dc498c8
Change of x-axis label
mtremblayfr Oct 25, 2016
d5c956b
remove r requirement
mtremblayfr Oct 27, 2016
5f46387
Input type change to allow preprocessed files: add of the tsv type in…
mtremblayfr Oct 28, 2016
3d7d477
File input type change to allow preprocessed file to be bucketed: add…
mtremblayfr Oct 28, 2016
190a3bd
Help change: input type
mtremblayfr Nov 7, 2016
ebd840a
Update NmrBucketing_wrapper.R
lecorguille Nov 14, 2016
826e69d
Update NmrBucketing_xml.xml
lecorguille Nov 14, 2016
0f24b23
Update NmrBucketing_xml.xml
lecorguille Nov 22, 2016
14743d4
remove library upload
lecorguille Apr 11, 2017
960fa82
Bucketing modification: if in exclusion zone, integration value = 0; …
mtremblayfr Apr 19, 2017
99c57d1
SampleMetadata and variableMetadata writing conditionned on the type …
mtremblayfr Apr 19, 2017
ff53de6
Graphical representation: split of the whole spectral window into "zo…
mtremblayfr Apr 20, 2017
a56a22c
Graphical function needed for spectra representation
mtremblayfr Apr 20, 2017
f8fe31a
update test-data
lecorguille Apr 21, 2017
4348159
minor change in travis.yml ; removing of deprecated files
lecorguille Apr 21, 2017
4aa413e
remove the old tool_dependencies
lecorguille Apr 21, 2017
fb95c3b
bump version ; add ChangeLog
lecorguille Apr 21, 2017
b989550
bump again the version number to 2.0.2
lecorguille Apr 21, 2017
2705e87
Removal of zero's for normalisation
mtremblayfr Sep 25, 2018
38e630f
Version change
mtremblayfr Sep 25, 2018
2b1e2da
Enhancement of output information: add of R package version
mtremblayfr Jun 6, 2017
22b42e2
Update .travis.yml
lecorguille Oct 2, 2018
4cd940e
nmr_bucketin - prepare to merge in tools-metabolomics
lecorguille Jul 29, 2019
74606e1
Merge nmr_bucketing repo - step2
lecorguille Jul 29, 2019
bd0bd3c
nmr_bucketing - NmrBucketing - add zip datatype
lecorguille Mar 17, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions .gitignore

This file was deleted.

7 changes: 7 additions & 0 deletions tools/nmr_bucketing/.shed.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
categories: [Metabolomics]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
categories: [Metabolomics]
categories: Metabolomics

description: '[Metabolomics][W4M][NMR] NMR Bucketing - Bucketing / Binning (spectra segmentation in fixed-size windows) and integration (sum of absolute intensities inside each bucket) to preprocess NMR data'
homepage_url: http://workflow4metabolomics.org
long_description: 'Part of the W4M project: http://workflow4metabolomics.org'
name: nmr_bucketing
owner: marie-tremblay-metatoul
remote_repository_url: https://github.com/workflow4metabolomics/nmr_bucketing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be updated to now point to this repository?

74 changes: 74 additions & 0 deletions tools/nmr_bucketing/DrawSpec.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
drawSpec <- function (X, startP = -1, endP = -1, groupLabel = NULL, useLog = -1, highBound = -1, lowBound = -1,
xlab = NULL, ylab = NULL, main = NULL, nAxisPos = 4, offside = 0)
Comment on lines +1 to +2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
drawSpec <- function (X, startP = -1, endP = -1, groupLabel = NULL, useLog = -1, highBound = -1, lowBound = -1,
xlab = NULL, ylab = NULL, main = NULL, nAxisPos = 4, offside = 0)
drawSpec <- function (X, startP = -1, endP = -1, groupLabel = NULL, useLog = -1, highBound = Inf, lowBound = -Inf, xlab = "index", ylab = "intensity", main = NULL, nAxisPos = 4, offside = 0)

{
groupLabel_name = groupLabel
X = as.data.frame(X)
# colnames(X) = c(1:ncol(X))
Comment on lines +5 to +6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
X = as.data.frame(X)
# colnames(X) = c(1:ncol(X))

X = as.matrix(X)
if (highBound != -1) {
for (i in 1:nrow(X)) {
myIndex = which(X[i, ] > highBound)
X[i, myIndex] = highBound
}
}
Comment on lines +8 to +13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (highBound != -1) {
for (i in 1:nrow(X)) {
myIndex = which(X[i, ] > highBound)
X[i, myIndex] = highBound
}
}
X[X > highBound] <- highBound

if (lowBound != -1) {
for (i in 1:nrow(X)) {
myIndex = which(X[i, ] < lowBound)
X[i, myIndex] = lowBound
}
}
Comment on lines +14 to +19
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (lowBound != -1) {
for (i in 1:nrow(X)) {
myIndex = which(X[i, ] < lowBound)
X[i, myIndex] = lowBound
}
}
X[X < lowBound] <- lowBound

if (is.null(groupLabel)) {
groupLabel = c(1:nrow(X))
groupLabel = as.factor(groupLabel)
}
else {
levels(groupLabel) = c(1:length(levels(groupLabel)))
}
Comment on lines +20 to +26
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section either inits groupLabel or sets levels of groupLabel, which are not set if it is null - this seems odd to me.

if (startP == -1)
startP = 1
if (endP == -1)
endP = ncol(X)
if (is.null(xlab)) {
xlab = "index"
}
if (is.null(ylab)) {
ylab = "intensity"
}
Comment on lines +31 to +36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (is.null(xlab)) {
xlab = "index"
}
if (is.null(ylab)) {
ylab = "intensity"
}

if (is.null(main)) {
main = paste(" ", startP + offside, "-", endP + offside)
}
GraphRange <- c(startP:endP)
yn <- X[, GraphRange]
if (useLog != -1)
yn = log(yn)
if (length(yn) > ncol(X))
{
plot(yn[1, ], ylim = c(min(yn), max(yn)), type = "n", ylab = ylab, xlab = xlab, main = main, xaxt = "n")
tempVal = trunc(length(GraphRange)/nAxisPos)
xPos = c(0:nAxisPos) * tempVal
axis(1, at = xPos, labels = colnames(X)[xPos + startP + offside])
for (i in 1:length(levels(groupLabel)))
{
groupLabelIdx = which(groupLabel == levels(groupLabel)[i])
color <- palette(rainbow(length(levels(groupLabel))))
for (j in 1:length(groupLabelIdx))
{
lines(yn[groupLabelIdx[j], ], col = color[i])
}
}
if (!is.null(groupLabel_name))
{
legendPos = "topleft"
legend(legendPos, levels(groupLabel_name), col = as.integer(levels(groupLabel)), text.col = "black", pch = c(19, 19), bg = "gray90")
}
}
if (length(yn) == ncol(X))
{
plot(yn, ylim = c(min(yn), max(yn)), type = "n", ylab = ylab, xlab = xlab, main = main, xaxt = "n")
tempVal = trunc(length(GraphRange)/nAxisPos)
xPos = c(0:nAxisPos) * tempVal
# axis(1, at = xPos, labels = xPos + startP + offside)
axis(1, at = xPos, labels = colnames(X)[xPos + startP + offside])
lines(yn)
}
}
273 changes: 273 additions & 0 deletions tools/nmr_bucketing/NmrBucketing_script.R
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run some auto formatting tool on this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could overall be cleaned up into some minimal CRAN package as it has minimal dependencies etc. which would also mkae the Galaxy tool a bit nicer and include all the R functions. That would also make this functionality available to more users or other workflow engines.

Original file line number Diff line number Diff line change
@@ -0,0 +1,273 @@
################################################################################################
# SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES #
# User : Galaxy #
# Original data : -- #
# Starting date : 20-10-2014 #
# Version 1 : 18-12-2014 #
# Version 2 : 07-01-2015 #
# Version 3 : 24-10-2016 #
# #
# Input files : modification on october 2016 #
# - Raw bruker files included in user-defined fileName #
# - Preprocessed files (alignment, ...) included in p x n dataframe #
################################################################################################
Comment on lines +1 to +13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
################################################################################################
# SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES #
# User : Galaxy #
# Original data : -- #
# Starting date : 20-10-2014 #
# Version 1 : 18-12-2014 #
# Version 2 : 07-01-2015 #
# Version 3 : 24-10-2016 #
# #
# Input files : modification on october 2016 #
# - Raw bruker files included in user-defined fileName #
# - Preprocessed files (alignment, ...) included in p x n dataframe #
################################################################################################

NmrBucketing <- function(fileType,fileName,leftBorder = 10.0,rightBorder = 0.5,bucketSize = 0.04,exclusionZones,
exclusionZonesBorders=NULL,graph=c("None","Overlay","One_per_individual"),
nomFichier,savLog.txtC = NULL)
{
## Option
##---------------
strAsFacL <- options()$stringsAsFactors
options(stingsAsFactors = FALSE)
options(warn = -1)
Comment on lines +20 to +22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really needed with modern R?



## Constants
##---------------
topEnvC <- environment()
flgC <- "\n"

## Log file (in case of integration into Galaxy)
##----------------------------------------------
if(!is.null(savLog.txtC))
sink(savLog.txtC, append = TRUE)

## Functions definition
##---------------------
## RAW BRUKER FILE READING FUNCTION
NmRBrucker_read <- function(DataDir,SampleSpectrum)
{

bruker.get_param <- function (ACQ,paramStr)
{
regexpStr <- paste("^...",paramStr,"=",sep="")
as.numeric(gsub("^[^=]+= ","" ,ACQ[which(simplify2array(regexpr(regexpStr,ACQ))>0)]))
}

ACQFILE <- "acqus"
SPECFILE <- paste(DataDir,"/1r",sep="")
PROCFILE <- paste(DataDir,"/procs",sep="")

ACQ <- readLines(ACQFILE)
TD <- bruker.get_param(ACQ,"TD")
SW <- bruker.get_param(ACQ,"SW")
SWH <- bruker.get_param(ACQ,"SW_h")
DTYPA <- bruker.get_param(ACQ,"DTYPA")
BYTORDA <- bruker.get_param(ACQ,"BYTORDA")
#ENDIAN = ifelse( BYTORDA==0, "little", "big")
ENDIAN <- "little"
SIZE = ifelse( DTYPA==0, 4, 8)

PROC <- readLines(PROCFILE)
OFFSET <- bruker.get_param(PROC,"OFFSET")
SI <- bruker.get_param(PROC,"SI")

to.read = file(SPECFILE,"rb")
maxTDSI = max(TD,SI)
# signal<-rev(readBin(to.read, what="int",size=SIZE, n=TD, signed = TRUE, endian = ENDIAN))
signal<-rev(readBin(to.read, what="int",size=SIZE, n=maxTDSI, signed = TRUE, endian = ENDIAN))
close(to.read)

td <- length(signal)

# dppm <- SW/(TD-1)
dppm <- SW/(td-1)
pmax <- OFFSET
pmin <- OFFSET - SW
ppmseq <- seq(from=pmin, to=pmax, by=dppm)
signal <- 100*signal/max(signal)

SampleSpectrum <- cbind(ppmseq,signal)
return(SampleSpectrum)
}

## SPECTRUM BUCKETING
NmrBrucker_bucket <- function(spectrum)
{
# Initialisations
b <- 1
j <- 1
# Variable number
J <- round((spectrum[1,1]-spectrum[dim(spectrum)[1],1])/bucketSize)
f.bucket <- matrix(rep(0,J*2),ncol=2)
colnames(f.bucket) <- c("Bucket",FileNames[i])


# Data bucketing
while (j < dim(spectrum)[1])
{
# chemical shift
BUB <- spectrum[j,1]

# In zone exclusion?
exclusion.in <- FALSE
if (!is.null(exclusionZonesBorders))
{
for (k in 1:nrow(exclusion.zone.m))
if (BUB <= exclusion.zone.m[k,1] && exclusion.zone.m[k,2] < BUB)
exclusion.in <- TRUE
}

if (exclusion.in)
{
# Bucketing
{
BLB <- BUB - bucketSize
bucket <- spectrum[j,]
while (j < dim(spectrum)[1] && spectrum[j,1] > BLB)
{
j <- j + 1
if (spectrum[j,1] > BLB)
bucket <- rbind(bucket,spectrum[j,])
}

f.bucket[b,] <- c(round(mean(bucket[,1]),3),0)

# Next bucket boundary
BUB <- spectrum[j,1]
b <- b + 1
}
}

if (!exclusion.in)
# Bucketing
{
BLB <- BUB - bucketSize
bucket <- spectrum[j,]
while (j < dim(spectrum)[1] && spectrum[j,1] > BLB)
{
j <- j + 1
if (spectrum[j,1] > BLB)
bucket <- rbind(bucket,spectrum[j,])
}

# Integration (trapezoid method)
s <- cumtrapz(bucket[,1],bucket[,2])
f.bucket[b,] <- c(round(mean(bucket[,1]),3),abs(s[length(s)][[1]]))

# Next bucket boundary
BUB <- spectrum[j,1]
b <- b + 1
}
}
return(f.bucket)
}

# Exclusion zones
if (!is.null(exclusionZonesBorders))
{
exclusion.zone.m <- matrix(exclusionZonesBorders[[1]],nrow=1)
if (length(exclusionZonesBorders) > 1)
for (k in 2:length(exclusionZonesBorders))
exclusion.zone.m <- rbind(exclusion.zone.m,exclusionZonesBorders[[k]])
}

## CHANGES
## Inputs from zip or library (raw files)
if (fileType == "zip")
{
# File names
FileNames <- list.files(fileName)
n <- length(FileNames)

# Reading and Bucketing
fileName <- paste(fileName,"/",sep="")

i <- 1
while (i <= n)
{
# File reading
SampleDir <- paste(fileName,FileNames[i],"/1/",sep="")
setwd(SampleDir)
DataDir <- "pdata/1"

rawSpectrum <- NmRBrucker_read(DataDir,rawSpectrum)

orderedSpectrum <- rawSpectrum[order(rawSpectrum[,1],decreasing=T), ]

# Removal of chemical shifts > leftBorder or < rightBorder boundaries
truncatedSpectrum <- orderedSpectrum[orderedSpectrum[,1] < leftBorder & orderedSpectrum[,1] > rightBorder, ]
truncatedSpectrum[,1] <- round(truncatedSpectrum[,1],3)

# Bucketing
spectrum.bucket <- NmrBrucker_bucket(truncatedSpectrum)
ppm <- spectrum.bucket[,1]

# spectrum Concatenation
if (i == 1)
bucketedSpectra <- spectrum.bucket
if (i > 1)
bucketedSpectra <- cbind(bucketedSpectra,spectrum.bucket[,2])
colnames(bucketedSpectra)[i+1] <- FileNames[i]

# Next sample
rm(spectrum.bucket)
i <- i +1
}
# Directory
cd(fileName)
}
Comment on lines +167 to +209
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make the code a bit easier to understand if this was extracted into its own function.


## Inputs from dataset (preprocessed files)
if (fileType=="tsv")
{
FileNames <- colnames(fileName)
n <- length(FileNames)

for (i in 1:ncol(fileName))
{
orderedSpectrum <- cbind(as.numeric(rownames(fileName)),fileName[,i])
orderedSpectrum <- orderedSpectrum[order(orderedSpectrum[,1],decreasing=T), ]

truncatedSpectrum <- orderedSpectrum[orderedSpectrum[,1] < leftBorder & orderedSpectrum[,1] > rightBorder, ]
truncatedSpectrum[,1] <- round(truncatedSpectrum[,1],3)

# Bucketing
spectrum.bucket <- NmrBrucker_bucket(truncatedSpectrum)
ppm <- spectrum.bucket[,1]

# spectrum Concatenation
if (i == 1)
bucketedSpectra <- spectrum.bucket
if (i > 1)
bucketedSpectra <- cbind(bucketedSpectra,spectrum.bucket[,2])
colnames(bucketedSpectra)[i+1] <- colnames(fileName)[i]
}
}
Comment on lines +212 to +236
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this - would be cool if this was its own function.


identifiants <- gsub("([- , * { } | \\[ ])","_",colnames(bucketedSpectra)[-1])
colnames(bucketedSpectra) <- c(colnames(bucketedSpectra)[1],identifiants)

bucketedSpectra <- bucketedSpectra[bucketedSpectra[,1]!=0,]
rownames(bucketedSpectra) <- paste("B",bucketedSpectra[,1],sep="")
bucketedSpectra <- bucketedSpectra[,-1]

# Metadata matrice outputs
sampleMetadata <- data.frame(1:n)
rownames(sampleMetadata) <- colnames(bucketedSpectra)
colnames(sampleMetadata) <- "SampleOrder"

variableMetadata <- data.frame(1:nrow(bucketedSpectra))
rownames(variableMetadata) <- rownames(bucketedSpectra)
colnames(variableMetadata) <- "VariableOrder"


return(list(bucketedSpectra,sampleMetadata,variableMetadata,ppm)) # ,truncatedSpectrum_matrice
}


#################################################################################################################
## Typical function call
#################################################################################################################
## StudyDir <- "K:/PROJETS/Metabohub/Bruker/Tlse_BPASourisCerveau/"
## upper <- 9.5
## lower <- 0.8
## bucket.width <- 0.01
## exclusion <- TRUE
## exclusion.zone <- list(c(5.1,4.5))
## graphique <- "Overlay"
## nomFichier <- "Tlse_BPASourisCerveau_NmrBucketing_graph.pdf"
## tlse_cerveaupnd21.bucket <- NmrBucketing(StudyDir,upper,lower,bucket.width,exclusion,exclusion.zone,graphique,nomFichier)
## write.table(tlse_cerveaupnd21.bucket,file=paste(StudyDir,"Tlse_BPASourisCerveau_NmrBucketing_dataMatrix.tsv",sep=""),
## quote=FALSE,row.nmaes=FALSE,sep="\t")
#################################################################################################################
Loading