BGC Atlas is a web resource dedicated to exploring the diversity of biosynthetic gene clusters (BGCs) in metagenomes. Leveraging the power of metagenomics, BGC Atlas identifies and analyzes BGCs from diverse environmental samples, providing insights into the chemical diversity encoded in bacterial genomes. Our goal is to enhance the understanding of secondary metabolites produced by microorganisms and their ecological and evolutionary roles.
If you use BGC Atlas in your research, please cite:
Bağcı, C., Nuhamunada, M., Goyat, H., Ladanyi, C., Sehnal, L., Blin, K., Kautsar, S. A., Tagirdzhanov, A., Gurevich, A., Mantri, S., von Mering, C., Udwary, D., Medema, M. H., Weber, T., & Ziemert, N. (2025). BGC Atlas: A web resource for exploring the global chemical diversity encoded in bacterial genomes. Nucleic Acids Research, 53(D1), D618–D624. https://doi.org/10.1093/nar/gkae953
- Data Collection and Integration: Metagenomic datasets are collected from publicly available repositories (MGnify). Datasets are processed to extract assembled contigs and associated metadata, providing detailed environmental context for each BGC.
- BGC Identification and Annotation: The antiSMASH tool is used to identify and annotate BGCs within metagenomic assemblies.
- Clustering and Analysis: Identified BGCs are clustered into gene cluster families (GCFs) using BiG-SLiCE.
- User-Friendly Web Interface: The web interface allows users to explore BGCs, GCFs, and samples with ease. Users can filter and search for BGCs based on specific criteria, visualize their distribution across various biomes, and query the database for similar clusters.
The BGC Atlas interface consists of six main sections:
The Home page displays an overview of the BGC Atlas database, including the total number of samples, BGCs, and GCFs. It displays a global overview of the samples analyzed on a world map. Users can zoom in and out of the map, as well as pan to different regions. Users can highlight a section of the map using the rectangle tool and inspect the BGCs within that region.
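As a rough illustration of the rectangle selection described above, the following sketch filters samples by a bounding box. The field names (`lat`, `lng`) and the `samplesInBounds` helper are hypothetical and not taken from the BGC Atlas code; the sketch also assumes the rectangle does not cross the antimeridian.

```javascript
// Hypothetical helper: keep only samples whose coordinates fall inside a
// rectangular selection. Field names are assumptions for illustration.
function samplesInBounds(samples, { south, west, north, east }) {
  return samples.filter(
    ({ lat, lng }) => lat >= south && lat <= north && lng >= west && lng <= east
  );
}

// Made-up example data, not real BGC Atlas samples.
const exampleSamples = [
  { id: 'sample-a', lat: 48.5, lng: 9.0 },
  { id: 'sample-b', lat: -10.0, lng: 120.0 },
];
const selected = samplesInBounds(exampleSamples, {
  south: 47, west: 8, north: 49, east: 10,
});
```

A map library such as Leaflet reports the corners of a drawn rectangle, which could be mapped onto the `south`, `west`, `north`, and `east` values used here.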
The Samples section displays a table of metagenomic samples, including information on the sample name, biome, the number of BGCs identified, and their associated metadata. Users can filter and search for samples based on specific criteria.
The BGCs section provides detailed information on individual biosynthetic gene clusters identified in metagenomic samples. Users can view the list of all BGCs, their product categories and types, the GCFs they clustered into, and their membership value. Entries shown in red are putative members of their GCF (membership value above 0.4).
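The red highlighting rule can be expressed as a simple threshold check. This is a minimal sketch, assuming a BGC record exposes its membership value as a number; the `membershipValue` field and both function names are hypothetical.

```javascript
// Threshold quoted in the text above; values above it mark putative members.
const PUTATIVE_THRESHOLD = 0.4;

// Hypothetical helper: true when a BGC should be flagged as a putative member.
function isPutativeMember(membershipValue, threshold = PUTATIVE_THRESHOLD) {
  return membershipValue > threshold;
}

// Hypothetical helper: choose a display label for a table row.
function rowLabel(bgc) {
  return isPutativeMember(bgc.membershipValue) ? 'putative' : 'member';
}
```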
The GCFs section presents the gene cluster families (GCFs) identified in the database, along with the number of BGCs in each family, their product types, and their distribution across biomes. Opening a GCF entry displays detailed information on the family, including the list of associated BGCs and samples.
The Search section allows users to run homology searches against the BGC Atlas database using antiSMASH-compatible GenBank files of BGCs identified from other sources. Users can upload one or multiple GenBank files containing biosynthetic gene clusters and search the database for similar clusters.
The Download section provides access to the raw data (GenBank files for BGCs, the BiG-SLiCE clustering of the database, and the full dump of the database) used in the BGC Atlas database.
To set up a local instance of BGC Atlas, follow these steps. The project
requires Node.js 18 or later (the recommended version is defined in
the .nvmrc file):
- Clone the repository:

      git clone https://github.com/yourusername/bgc-atlas-web.git
      cd bgc-atlas-web

- Install dependencies:

      npm install
      npm run build-css  # or npm run watch-css for development

- Configure environment variables: Create a .env file in the root directory with the following variables (see .env.example for a complete template):

      # Database connection
      DB_USER=your_db_user
      DB_HOST=localhost
      DB_DATABASE=bgc_atlas
      DB_PASSWORD=change_me
      DB_PORT=5432

      # Redis configuration
      REDIS_HOST=127.0.0.1
      REDIS_PORT=6379

      # Application settings
      APP_URL=http://localhost:3000
      PORT=3000
      ENABLE_SSL=true  # Set to false to disable HTTPS
      SSL_CERT_PATH=/path/to/ssl/certs  # Optional

      # Paths used by the search feature
      MONTHLY_SOIL_BASE_DIR=/path/to/monthly-soil  # optional
      ULTRA_DEEP_SOIL_DIR=/path/to/ultra-deep-soil  # optional
      SEARCH_UPLOADS_DIR=/path/to/search/uploads  # optional
      SEARCH_SCRIPT_PATH=/path/to/search/script.py  # required
      REPORTS_DIR=/path/to/reports  # required

- Set up the database:
  - Install PostgreSQL if not already installed
  - Create a database named bgcatlas
  - Import the database dump (available in the Download section of the live site)
- Start the application:

      npm start

- Access the application at http://localhost:3000
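Because two of the variables above are marked as required, it can help to fail early when they are missing. The following is a minimal sketch of such a check, not the project's actual startup code: `loadConfig` is a hypothetical helper, and the fallback values simply mirror the example .env shown above.

```javascript
// Variables marked "required" in the example .env above.
const REQUIRED = ['SEARCH_SCRIPT_PATH', 'REPORTS_DIR'];

// Fallbacks are assumptions copied from the example values, not documented defaults.
const DEFAULTS = { DB_HOST: 'localhost', DB_PORT: '5432', PORT: '3000', ENABLE_SSL: 'true' };

// Hypothetical helper: validate and normalise settings from an env-like object.
function loadConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const merged = { ...DEFAULTS, ...env };
  return {
    dbHost: merged.DB_HOST,
    dbPort: Number(merged.DB_PORT),
    port: Number(merged.PORT),
    // Treated as enabled unless explicitly set to "false", following the comment in the example.
    enableSsl: merged.ENABLE_SSL !== 'false',
    searchScriptPath: merged.SEARCH_SCRIPT_PATH,
    reportsDir: merged.REPORTS_DIR,
  };
}
```

Calling `loadConfig()` once at startup, after the .env file has been loaded into `process.env`, would surface a missing path before the first search request is made.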
BGC Atlas is built with the following main dependencies:
- Express.js - Web framework
- Helmet - Security headers
- Express Rate Limit - API rate limiting
- Pug - Template engine
- PostgreSQL - Database
- Leaflet - Interactive maps
- Node.js - JavaScript runtime
For a complete list of dependencies, see the package.json file.
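To illustrate what API rate limiting does in an Express-style application, here is a small fixed-window limiter written as plain middleware. It is only a sketch of the general idea: it does not use the Express Rate Limit package's API, and `createRateLimiter`, its options, and the chosen limits are assumptions for illustration.

```javascript
// Hypothetical fixed-window limiter: allow at most `max` requests per client
// IP within each `windowMs` interval. `now` is injectable to ease testing.
function createRateLimiter({ windowMs = 60000, max = 100, now = Date.now } = {}) {
  const hits = new Map(); // client key -> { count, windowStart }

  return function rateLimit(req, res, next) {
    const key = req.ip;
    const currentTime = now();
    let entry = hits.get(key);

    // Start a new window for unseen clients or when the old window has expired.
    if (!entry || currentTime - entry.windowStart >= windowMs) {
      entry = { count: 0, windowStart: currentTime };
      hits.set(key, entry);
    }

    entry.count += 1;
    if (entry.count > max) {
      res.statusCode = 429; // Too Many Requests
      res.end('Too many requests');
      return;
    }
    next();
  };
}
```

A production setup would normally rely on a maintained package rather than an in-memory map like this one, which is not shared across multiple server processes.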
The user interface is built with Pug templates. Reusable pieces of markup live in the views/components directory as mixins. Core elements such as the navigation bar and footer are defined once and included across all pages. Additional components, like a generic card, can be composed to simplify future UI work.
The current version of BGC Atlas includes:
- 35,486 samples from MGnify
- 1,854,079 BGCs identified
- 13,854 GCFs identified
- Current: Added API rate limiting to prevent abuse of endpoints.
- 04.06.2025: Added ultra-deep and monthly-soil sampling data from Schönbuch.
- 15.08.2024: First release. 35,486 samples from MGnify analysed, and 1,854,079 BGCs and 13,854 GCFs identified.
Contributions to BGC Atlas are welcome! Please feel free to submit a Pull Request.
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
For any questions or feedback, please contact us at caner.bagci@uni-tuebingen.de.
- Create the database and schema by running the SQL script:

      psql -U postgres -f atlas_db_schema.sql

- Load the dummy data into the database:

      psql -U postgres -d atlas_v2025 -f atlas_dummy_data.sql

- Navigate to the server directory:

      cd server

- Install dependencies:

      npm install

- Create a .env file in the server directory with the following content (adjust as needed):

      # Database configuration
      DB_HOST=localhost
      DB_PORT=5432
      DB_NAME=atlas_v2025
      DB_USER=postgres
      DB_PASSWORD=postgres
      # Server configuration
      PORT=3000

- Start the server:

      npm run dev

  The server will be running at http://localhost:3000.
- Navigate to the client directory:

      cd client

- Install dependencies:

      npm install

- Create a .env.local file in the client directory with the following content:

      NEXT_PUBLIC_API_URL=http://localhost:3000/api

- Start the client:

      npm run dev

  The client will be running at http://localhost:3001.
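The database variables from the server's .env file above could be combined into a single PostgreSQL connection URI, which many clients accept. This is a sketch under that assumption; `buildPgUri` is a hypothetical helper, and how the server actually connects to the database is not specified here.

```javascript
// Hypothetical helper: assemble a PostgreSQL connection URI from the DB_*
// variables used in the server's .env example. User and password are
// percent-encoded so that characters such as "@" or "/" remain valid.
function buildPgUri(env) {
  const user = encodeURIComponent(env.DB_USER);
  const password = encodeURIComponent(env.DB_PASSWORD);
  return `postgresql://${user}:${password}@${env.DB_HOST}:${env.DB_PORT}/${env.DB_NAME}`;
}

// Values copied from the example .env above.
const exampleUri = buildPgUri({
  DB_HOST: 'localhost',
  DB_PORT: '5432',
  DB_NAME: 'atlas_v2025',
  DB_USER: 'postgres',
  DB_PASSWORD: 'postgres',
});
```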
The following API endpoints are available:
- GET /api/stats/kpi - Get key performance indicators (counts of studies, samples, runs, assemblies)
- GET /api/stats/bgc-classes - Get BGC class distribution
- GET /api/browse/studies - Browse studies with pagination
- GET /api/browse/samples - Browse samples with pagination
- GET /api/browse/runs - Browse runs with pagination
- GET /api/browse/biomes - Browse biomes with pagination
All browse endpoints support pagination with page and limit query parameters.
Example: GET /api/browse/studies?page=1&limit=100
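A paginated request like the example above could be built and sent from Node.js 18 or later as sketched below. The helper names are hypothetical, the default `limit` of 100 is taken from the example rather than from documented behaviour, and the shape of the JSON response is not specified here, so the sketch returns it unchanged.

```javascript
// Browse resources listed above.
const BROWSE_RESOURCES = ['studies', 'samples', 'runs', 'biomes'];

// Hypothetical helper: build a browse URL with page and limit query parameters.
function buildBrowseUrl(baseUrl, resource, { page = 1, limit = 100 } = {}) {
  if (!BROWSE_RESOURCES.includes(resource)) {
    throw new RangeError(`Unknown browse resource: ${resource}`);
  }
  if (!Number.isInteger(page) || page < 1 || !Number.isInteger(limit) || limit < 1) {
    throw new RangeError('page and limit must be positive integers');
  }
  const url = new URL(`/api/browse/${resource}`, baseUrl);
  url.searchParams.set('page', String(page));
  url.searchParams.set('limit', String(limit));
  return url.toString();
}

// Hypothetical helper: fetch one page using the global fetch available in Node.js 18+.
async function fetchBrowsePage(baseUrl, resource, options) {
  const response = await fetch(buildBrowseUrl(baseUrl, resource, options));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

For a local server, `fetchBrowsePage('http://localhost:3000', 'studies', { page: 1, limit: 100 })` would request the same URL as the example above.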