This document lists a selection of motif sets exported from the MS2LDA MotifDB (https://ms2lda.org/motifdb) using the dump_motifdb.py script. The selection prioritizes scientifically relevant, community-contributed motif sets and excludes test entries and sets with unclear descriptions. The selected sets cover a range of sources, including plant extracts, bacterial symbionts, and chemical libraries.
The table below provides an overview of each selected motif set, including its ID, name, feature set, description, and the number of motifs it contains.
| Motifset ID | Motifset Name | Feature Set | Description | No. of Motifs |
|---|---|---|---|---|
| 1 | Urine derived Mass2Motifs | 0.005 Da | Mass2Motifs annotated from positive ionisation mode mass spectra generated from extracts of urine samples. | 134 |
| 2 | GNPS library derived Mass2Motifs | 0.005 Da | MS/MS spectra obtained from reference compounds and isolated molecules from diverse sources with a focus on bacterial and plant related molecules. | 78 |
| 3 | Euphorbia Plant Mass2Motifs | 0.005 Da | Annotated Mass2Motifs from data set comprising 43 Euphorbia plant extracts. | 66 |
| 4 | Massbank library derived Mass2Motifs | 0.005 Da | MS/MS spectra obtained from reference compounds and isolated molecules from diverse sources with a slight focus on plant related molecules. | 46 |
| 5 | Rhamnaceae Plant Mass2Motifs | 0.005 Da | Annotated Mass2Motifs from data set comprising 71 Rhamnaceae plant extracts. | 31 |
| 6 | Streptomyces and Salinispora Mass2Motifs | 0.1 Da | Annotated Mass2Motif set from Streptomyces and Salinispora extracts with 0.1 Da binned features. | 40 |
| 16 | Photorhabdus and Xenorhabdus Mass2Motifs | 0.005 Da | Mass2Motifs discovered in Molecular Networking clustered positive ionisation mode mass spectra generated from bacterial extracts of the Photorhabdus and Xenorhabdus genera. | 46 |
| 17 | LDB_NEG_MotifDB_01 | 0.005 Da | Motif DB applicable to lichen samples, produced with 1500 spectra from 250 lichen molecules. | 300 |
| 33 | LDB_NEG_MotifDB_02 | 0.005 Da | Motif DB applicable to lichen samples, produced with 816 spectra from 250 lichen molecules. | 100 |
| 37 | LDB MotifDB POS | 0.01 Da | MotifDB produced by the positive mode spectra of the LDB (250+ compounds with 745 spectra including different adducts and acquisition from three LC-MS instruments). | 100 |
| 32 | Streptomyces S29 | 0.005 Da | Mass2Motifs annotated from positive ionisation mode mass spectra generated from extracts of Streptomyces sp. S29. | 13 |
| 31 | MIADB_pos_100 | 0.005 Da | Motif DB applicable to MIA-containing plants, produced with 200 spectra from the MIADB (100 Mass2Motifs). | 10 |
| 30 | MIADB_pos_60 | 0.005 Da | Motif DB applicable to MIA-containing plants, produced with 200 spectra from the MIADB (60 Mass2Motifs). | 7 |
| 29 | MIADB_pos_indole | 0.005 Da | Motif DB applicable to MIA-containing plants, produced with 200 spectra from the MIADB - one annotated motif related to decorated indole substructures. | 1 |
| 38 | Planomonospora-associated Mass2Motifs | 0.005 Da | Annotated Mass2Motif set from Planomonospora extracts with 0.01 Da binned features. | 30 |
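
The table can also be read programmatically, for example to list only the motif sets that share a given feature set. The following is a minimal sketch under stated assumptions and is not part of dump_motifdb.py: the names `MotifSet`, `parse_motifset_table`, and `by_feature_set` are hypothetical, and the parser assumes that no cell contains a literal `|` character. Only three rows of the table are reproduced to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class MotifSet:
    """One row of the overview table (hypothetical helper type)."""
    motifset_id: int
    name: str
    feature_set: str
    description: str
    n_motifs: int


def parse_motifset_table(markdown: str) -> list[MotifSet]:
    """Parse the five-column pipe table into MotifSet records.

    Assumes no cell contains a literal '|'. Header and delimiter
    rows are skipped because their first cell is not an integer.
    """
    rows = []
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 5 or not cells[0].isdigit():
            continue
        rows.append(
            MotifSet(int(cells[0]), cells[1], cells[2], cells[3], int(cells[4]))
        )
    return rows


def by_feature_set(motifsets: list[MotifSet], feature_set: str) -> list[MotifSet]:
    """Return the motif sets whose Feature Set column equals feature_set."""
    return [m for m in motifsets if m.feature_set == feature_set]


# Three rows copied from the table above.
TABLE = """
| Motifset ID | Motifset Name | Feature Set | Description | No. of Motifs |
|---|---|---|---|---|
| 6 | Streptomyces and Salinispora Mass2Motifs | 0.1 Da | Annotated Mass2Motif set from Streptomyces and Salinispora extracts with 0.1 Da binned features. | 40 |
| 37 | LDB MotifDB POS | 0.01 Da | MotifDB produced by the positive mode spectra of the LDB (250+ compounds with 745 spectra including different adducts and acquisition from three LC-MS instruments). | 100 |
| 32 | Streptomyces S29 | 0.005 Da | Mass2Motifs annotated from positive ionisation mode mass spectra generated from extracts of Streptomyces sp. S29. | 13 |
"""

motifsets = parse_motifset_table(TABLE)
fine_binned = by_feature_set(motifsets, "0.005 Da")
for m in fine_binned:
    print(m.motifset_id, m.name, m.n_motifs)
```

Matching on the exact Feature Set string keeps the sketch simple; a real workflow might instead compare the numeric bin width.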