[MaveRegistry] Cloud Infrastructure

Jochen Weile edited this page Jun 5, 2023 · 1 revision

The MaveRegistry service is hosted on Amazon Web Services (AWS) and MongoDB Atlas (which uses AWS compute nodes as its backend). We chose not to use Google Cloud because of data security concerns people raised about storing unpublished data with Google, which also has a biotech arm.

MaveRegistry also relies on several third-party services for specific functions:

  • CAPTCHA requests are handled by hCaptcha (https://www.hcaptcha.com/).
  • Google Sign-in requests are handled by Google Cloud.
  • ORCID Sign-in requests are handled by ORCID.

AWS S3

AWS S3 (Simple Storage Service) is a cloud storage system that we use to store the front-end website and user-generated static assets (e.g. profile pictures). It also lets us serve static web content (the website) for a very small fee. S3 is fully managed by AWS and thus requires no maintenance from us.

Typically, we do not need to upload anything directly to the storage buckets because all code changes to the front-end website are deployed automatically via GitHub Actions.

You can access the S3 storage buckets (in region ca-central-1) here: https://s3.console.aws.amazon.com/s3/access?region=ca-central-1. Shown below is a screenshot of existing buckets as of August 4, 2022.

Elastic Beanstalk

AWS Elastic Beanstalk is a fully managed web app platform. We use it to run the API server of MaveRegistry. This is easier than managing the underlying compute nodes ourselves, as Elastic Beanstalk handles all the configuration and maintenance.

Typically, we do not need to upload anything directly to the Elastic Beanstalk environment because all code changes to the API server (parse server) are deployed automatically via GitHub Actions.

You can access the Elastic Beanstalk application (in region ca-central-1) here: https://ca-central-1.console.aws.amazon.com/elasticbeanstalk/home?region=ca-central-1#/applications.

Updating environment variables

We may need to update some environment variables (e.g. the Google Auth key) from time to time. To do so, open the environment's configuration page in the Elastic Beanstalk console and edit the Environment properties.
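The same change can also be scripted rather than made in the console. The following is a minimal sketch, assuming Python with the boto3 library and AWS credentials that are allowed to update the environment. The environment name and the variable name `GOOGLE_AUTH_KEY` are hypothetical placeholders, not the names actually used by MaveRegistry.

```python
from typing import Dict, List

# Namespace under which Elastic Beanstalk stores environment properties.
ENV_NAMESPACE = "aws:elasticbeanstalk:application:environment"


def build_option_settings(variables: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn {"NAME": "value"} pairs into Elastic Beanstalk option settings."""
    return [
        {"Namespace": ENV_NAMESPACE, "OptionName": name, "Value": value}
        for name, value in sorted(variables.items())
    ]


def update_environment_variables(environment_name: str, variables: Dict[str, str]) -> None:
    """Apply the variables to an environment (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_option_settings works without it

    client = boto3.client("elasticbeanstalk", region_name="ca-central-1")
    client.update_environment(
        EnvironmentName=environment_name,
        OptionSettings=build_option_settings(variables),
    )


# Hypothetical usage; the environment and variable names are placeholders:
# update_environment_variables("my-api-env", {"GOOGLE_AUTH_KEY": "new-value"})
```

Note that updating environment properties makes Elastic Beanstalk apply a configuration update to the environment, so expect the API server to be briefly affected while the change rolls out.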

AWS SES

AWS SES (Simple Email Service) is a fully managed email service that we use to send emails programmatically from within the MaveRegistry application. EmailOctopus, the email marketing service we use to send newsletters (details in [MaveRegistry] Sending emails to MaveRegistry users), also sends its emails through our AWS SES account.

You can access our SES account (in region ca-central-1) here: https://ca-central-1.console.aws.amazon.com/ses/home?region=ca-central-1#/account.
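To illustrate what sending an email through SES looks like, here is a minimal sketch using Python and boto3. It is not the code the MaveRegistry API server uses; the function names and the addresses in the usage comment are made up for this example, and the sender address would have to be a verified identity in the SES account.

```python
from typing import Any, Dict


def build_email(sender: str, recipient: str, subject: str, body: str) -> Dict[str, Any]:
    """Build the keyword arguments for the SES send_email call."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Data": subject, "Charset": "UTF-8"},
            "Body": {"Text": {"Data": body, "Charset": "UTF-8"}},
        },
    }


def send_email(sender: str, recipient: str, subject: str, body: str) -> str:
    """Send a plain-text email via SES and return the SES message ID."""
    import boto3  # imported lazily so build_email works without it

    client = boto3.client("ses", region_name="ca-central-1")
    response = client.send_email(**build_email(sender, recipient, subject, body))
    return response["MessageId"]


# Hypothetical usage; both addresses are placeholders:
# send_email("noreply@example.org", "user@example.org", "Welcome", "Hello!")
```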

MongoDB Atlas

MongoDB Atlas is a fully managed MongoDB service that we use to host the MaveRegistry database. We have two database deployments on MongoDB Atlas:

  1. maveregistry, which is the development database. The MaveRegistry codebase connects to it when running in the development environment (e.g. your home machine).
  2. production, which is the production database. The API server connects to it in production.

You can access the MaveRegistry database deployments here: https://cloud.mongodb.com/v2/5ecbf2259860263244819acc. Attached below is a screenshot of our databases.

Resuming the development database

We use the free M0 Sandbox tier database for development. Because it's a free database deployment, MongoDB Atlas pauses the database after a period of inactivity. If that has happened, you need to resume the database from the MongoDB Atlas website before the development environment can connect to it again.

hCaptcha

hCaptcha is a CAPTCHA service that we use to prevent bots from creating accounts on MaveRegistry. It is an alternative to the commonly used Google reCAPTCHA service.

As the service is already set up, we don’t expect it to require any maintenance.
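For context, a server typically checks an hCaptcha token by posting it, together with the account's secret key, to hCaptcha's siteverify endpoint. The sketch below shows that check in Python using only the standard library. It illustrates the general protocol and is not taken from the MaveRegistry codebase; the function names are invented for this example.

```python
import json
import urllib.parse
import urllib.request

# hCaptcha's server-side verification endpoint.
VERIFY_URL = "https://api.hcaptcha.com/siteverify"


def build_verify_request(secret: str, token: str) -> urllib.request.Request:
    """Build the form-encoded POST request that asks hCaptcha to check a token."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=data, method="POST")


def is_human(secret: str, token: str, timeout: float = 10.0) -> bool:
    """Return True if hCaptcha reports the token as valid (performs a network call)."""
    request = build_verify_request(secret, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return bool(payload.get("success"))
```

The token is the value the hCaptcha widget produces in the browser; the secret key must stay on the server.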

Google Sign-in

We use Google Sign-in to let users sign up and log in with their existing Google account.

Please note that we implemented the sign-in function ourselves by calling Google API endpoints directly instead of relying on the Google SDK. This ensures that only the data necessary for user verification is sent to Google, and that no unpublished research data is included.

As the service is already set up, we don’t expect it to require any maintenance. If you do need to update the credentials, you can access them here: https://console.cloud.google.com/apis/credentials?project=maveregistry-326220
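As an illustration of what calling the endpoints directly can look like, the sketch below builds a Google OAuth 2.0 authorization URL by hand and requests only the `openid` and `email` scopes. This page does not state which flow or scopes MaveRegistry actually uses, so the authorization-code flow, the scopes, and all parameter values here are assumptions made for the example.

```python
import urllib.parse

# Google's OAuth 2.0 authorization endpoint.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_auth_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build a sign-in URL that asks only for identity information."""
    query = urllib.parse.urlencode(
        {
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "response_type": "code",   # assumed: authorization-code flow
            "scope": "openid email",   # assumed: minimal identity scopes only
            "state": state,            # anti-CSRF value checked on return
        }
    )
    return f"{AUTH_ENDPOINT}?{query}"


# Hypothetical usage; all three values are placeholders:
# build_auth_url("my-client-id", "https://example.org/callback", "random-state")
```

Building the request by hand keeps the full list of parameters sent to Google visible in one place, which makes it easy to audit what is shared.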

ORCID Sign-in

We use ORCID Sign-in to let users sign up and log in with their existing ORCID account.

As the service is already set up, we don’t expect it to require any maintenance. If you do need to update the Client ID and secret, please contact Kevin Kuang.
