CS6501: Serverless AI

Serverless computing has emerged as a transformative paradigm in cloud computing, offering elastic pay-per-use compute abstractions that free developers from managing infrastructure. At the same time, the rapid advancement of AI, particularly large language model (LLM) fine-tuning and inference, has introduced new challenges in cloud computing and systems, ranging from resource efficiency to scalability and performance. The confluence of these two critical domains is giving rise to Serverless AI: a new design space where serverless platforms are reimagined to support modern AI workloads.

This course explores both foundational and cutting-edge research in serverless computing and its intersection with AI systems. The course is divided into two parts:

  1. Serverless computing and Function-as-a-Service: In this part, we will study the design and implementation of serverless architectures and a set of well-known challenges in modern FaaS platforms, including cold starts, resource isolation, virtualization, state management, fault tolerance, and support for cloud-native applications.
  2. Serverless systems for AI applications: In this part, we will explore how serverless computing is being reimagined and extended to support emerging AI applications, including serverless LLM inference, serverless fine-tuning, on-demand GPU allocation, and serverless model hubs (e.g., the Hugging Face infrastructure).
