- Text2Comic is a web application that utilizes Stable Diffusion to transform ordinary text into visually captivating comic strips.
- User Input Handling: The user enters a text prompt on the frontend (Next.js) and submits it. On submission, the backend (Flask) validates the input and enqueues it onto a Google Pub/Sub message queue (see the first sketch after this list).
- Request Distribution: The message queue distributes requests to available worker nodes, which are managed within a Kubernetes cluster to keep workloads balanced.
- Text Processing and Image Generation: Worker nodes process the input using the OpenAI API to parse the text and create a story. We then use a Stable Diffusion model to convert this story into a comic-style visual (see the worker sketch below).
- Image Storage and Retrieval: Generated images are uploaded to Google Cloud Storage, and metadata, including user information and image URLs, is stored in a MongoDB database (see the storage sketch below).
- Caching: A Redis cache stores the prompt and the URL of the generated image for quicker access, reducing the need for re-computation.
- User Access: Users retrieve generated images via Flask, which fetches them from Google Cloud Storage after image generation, or from the cache if the same prompt has been used before.
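The submit path described in the first and last bullets can be sketched roughly as follows. This is a minimal illustration rather than the repo's actual code: the `/generate` route, the `PROJECT_ID`/`TOPIC_ID` environment variables, and the `text2comic:` Redis key prefix are all placeholder names.

```python
import json
import os
import uuid

import redis
from flask import Flask, jsonify, request
from google.cloud import pubsub_v1

app = Flask(__name__)
cache = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"), port=6379,
                    decode_responses=True)
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(os.getenv("PROJECT_ID"),
                                  os.getenv("TOPIC_ID", "comic-requests"))


@app.route("/generate", methods=["POST"])
def generate():
    prompt = (request.get_json(silent=True) or {}).get("prompt", "").strip()
    if not prompt:  # basic validation before anything is enqueued
        return jsonify({"error": "prompt is required"}), 400

    cached_url = cache.get(f"text2comic:{prompt}")  # same prompt seen before?
    if cached_url:
        return jsonify({"status": "done", "image_url": cached_url})

    request_id = str(uuid.uuid4())
    payload = json.dumps({"request_id": request_id, "prompt": prompt}).encode("utf-8")
    publisher.publish(topic_path, payload)  # hand the job to a worker via Pub/Sub
    return jsonify({"status": "queued", "request_id": request_id}), 202
```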
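A worker node along the lines described above might look like the sketch below, using the current `openai` Python SDK and the public Stability (DreamStudio) REST API. The subscription name, the `gpt-3.5-turbo` model, the Stability engine id, and the prompt wording are assumptions; the repo's `worker.py` is the authoritative version.

```python
import base64
import json
import os

import requests
from google.cloud import pubsub_v1
from openai import OpenAI

openai_client = OpenAI(api_key=os.environ["OPEN_AI_API"])
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(os.getenv("PROJECT_ID"),
                                                 "comic-requests-sub")


def build_story(prompt: str) -> str:
    """Expand the raw user prompt into a short comic-panel description."""
    response = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Describe a short comic strip scene for: {prompt}"}],
    )
    return response.choices[0].message.content


def generate_image(story: str) -> bytes:
    """Render the story as a comic-style image via the Stability REST API."""
    resp = requests.post(
        "https://api.stability.ai/v1/generation/stable-diffusion-xl-1024-v1-0/text-to-image",
        headers={"Authorization": f"Bearer {os.environ['STABLE_DIFFUSION_API']}",
                 "Accept": "application/json"},
        json={"text_prompts": [{"text": f"comic book style, {story}"}],
              "samples": 1},
        timeout=120,
    )
    resp.raise_for_status()
    return base64.b64decode(resp.json()["artifacts"][0]["base64"])


def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    job = json.loads(message.data)
    image_bytes = generate_image(build_story(job["prompt"]))
    # ...upload image_bytes and record metadata (see the storage sketch below)...
    message.ack()


if __name__ == "__main__":
    streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
    streaming_pull.result()  # block and process messages as they arrive
```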
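The storage and caching steps could be sketched like this, again with assumed names (bucket `text2comic-images`, database `text2comic`, collection `images`) and a publicly readable bucket; a signed URL would work equally well. Caching the prompt-to-URL mapping is what lets Flask answer repeated prompts without re-running the OpenAI and Stable Diffusion calls.

```python
import os

import redis
from google.cloud import storage
from pymongo import MongoClient


def store_image(request_id: str, prompt: str, image_bytes: bytes) -> str:
    # Upload the rendered comic panel to Cloud Storage.
    bucket = storage.Client().bucket(os.getenv("GCS_BUCKET", "text2comic-images"))
    blob = bucket.blob(f"comics/{request_id}.png")
    blob.upload_from_string(image_bytes, content_type="image/png")
    image_url = blob.public_url  # assumes a public bucket

    # Metadata lookup table used by the Flask retrieval endpoint.
    mongo = MongoClient(os.getenv("MONGO_URI", "mongodb://localhost:27017"))
    mongo.text2comic.images.insert_one(
        {"request_id": request_id, "prompt": prompt, "image_url": image_url}
    )

    # Cache prompt -> URL so repeated prompts skip the whole pipeline.
    cache = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"), port=6379,
                        decode_responses=True)
    cache.set(f"text2comic:{prompt}", image_url)
    return image_url
```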
During the development of Text2Comic, we encountered and overcame several challenges, including:
- Setting up the Kubernetes cluster, particularly pod scaling and networking configurations.
- Pub/Sub message handling issues, which we fixed by refining subscription configurations and abstracting Pub/Sub away from the backend.
- Saving and organizing generated images and displaying them on the frontend.
To use Text2Comic, you will need a Stable Diffusion API key, which can be acquired from https://beta.dreamstudio.ai/account.
- Clone the repo: `https://github.com/cu-csci-4253-datacenter-fall-2024/finalproject-final-project-team-39.git`
- Change directory to the client by running `cd client`
- Install npm packages by running `npm i`
- Start the dev server by running `npm run dev`
- Change directory to the server by running `cd server`
- Create a `.env` file and enter the OpenAI and Stable Diffusion API keys in the following format: `OPEN_AI_API = '<your-api-key>'` and `STABLE_DIFFUSION_API = '<your-api-key>'`
- Install the required packages using `pip install -r requirements.txt`
- Run the Flask server using `flask --app main run`
- Change directory to the worker by running `cd server/worker`
- Install the required packages using `pip install -r requirements.txt`
- Run the worker using `python worker.py`
The frontend will be accessible at localhost:5000.
To run this application on a Kubernetes cluster, navigate to the respective folders and run the following commands:
```
kubectl apply -f frontend-service.yaml
kubectl apply -f frontend-deployment.yaml
kubectl apply -f server-service.yaml
kubectl apply -f server-deployment.yaml
kubectl apply -f worker-service.yaml
kubectl apply -f worker-deployment.yaml
```
Upon deployment, run `kubectl get svc` and access the external IP corresponding to the frontend service.
- Obed Junias
- Sahasraditya Thyadi