- LLM Evaluation & Benchmarking: Design and implement evaluation systems (LLM-as-a-Judge and human annotation) that track core metrics: accuracy, hallucination rate, latency, and cost.
- Prompt Engineering & Optimization: Systematic prompt iteration, chain-of-thought design, and few-shot learning to maximize model performance.
- Model Fine-tuning & Alignment: Proficient in end-to-end pipelines, including Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), for domain-specific model optimization and safety alignment.
- Emerging Architecture & Applied Research: Hands-on exploration of long-context optimization, multimodal understanding, agent collaboration, and AI memory systems.
- LLM Application Frameworks: Deep integration with DSPy, LangChain, AutoGen, CrewAI, and the ReAct paradigm.
- Advanced RAG Systems: Build enhanced retrieval pipelines incorporating vector databases, hybrid search, and custom retrievers.
- Agentic & Autonomous Systems: Develop multi-agent systems for research, process automation, and trading.
- AI Content Detection: Apply stylometric analysis and embedding techniques for AI-generated content identification.
- Internal Automation: Built a Slack → Notion → API → LLM automation workflow, reducing support response time by 60%.
- Multimodal Intelligence: Integrate CLIP for image tagging, YOLOv8 for content moderation, Whisper for speech recognition (ASR), and Tacotron 2 for text-to-speech (TTS).
- Frontend: Next.js, React, Vue
- Backend: Node.js, NestJS, FastAPI, Django
- Mobile: React Native, Expo, Flutter, Swift, Kotlin
- Cloud, DevOps & Databases: AWS, Azure, Docker, Kubernetes, PostgreSQL, MongoDB, Supabase
- Automation Tools: n8n, Zapier, Make.com, custom API workflows