Data Scientist / Machine Learning & AI Engineer | 15+ years architecting robust risk models and systems | Building intelligent products powered by Agentic AI, LLM Engineering, NLP & Deep Learning | Turning complex data into trustworthy, actionable solutions
- Languages: Python, SQL, R
- ML/DS: Scikit-learn, XGBoost, LightGBM, TensorFlow/Keras, PyTorch, Pandas, NumPy, Matplotlib/Seaborn
- LLM Engineering & GenAI: LangChain, LangGraph, Hugging Face Transformers, PEFT/LoRA, SFTTrainer, Ollama, OpenAI API, GPT-4o-mini, MedGemma, Phi-4-mini
- Agentic AI & RAG: LangGraph (stateful multi-node pipelines), LangChain RAG, ChromaDB, FAISS, Maximal Marginal Relevance (MMR), vector store engineering
- NLP: BERTopic, DistilBERT, spaCy, NLTK, topic modelling (LDA), clinical NLP, text preprocessing pipelines
- ML Engineering & Pipelines: End-to-end ML pipeline design, feature engineering, hyperparameter tuning (GridSearch, HyperBand), model validation, automated testing, production deployment
- Specialisations: Time-series forecasting (LSTM, LightGBM), LLM fine-tuning (LoRA/PEFT, knowledge distillation), agentic workflow design, clinical AI, unsupervised learning (K-Means, One-Class SVM, PCA, t-SNE)
- Infrastructure & Deployment: Docker (containerised deployment), CI/CD (GitHub → Render auto-deploy pipeline), Render, Google Colab (T4 GPU), edge deployment (GDPR-compliant, local inference)
I am a Data Scientist / AI-ML Engineer leveraging 15+ years of experience designing data-intensive intelligent systems for high-stakes environments. I combine the mathematical rigor of a Quantitative Risk veteran with the creative drive of a Product Builder to engineer AI solutions that are as reliable and trustworthy as they are innovative.
I pair deep technical ability with the leadership experience to guide teams, manage cross-functional stakeholders, and translate complex technical concepts for C-suite executives. This positions me to architect and build AI products that are not just powerful, but intuitive, commercially viable, and safe.
- 3C: Clinical Code Collector for NICE - Agentic LangGraph pipeline (4-node stateful graph) combining LangChain RAG (ChromaDB, MMR retrieval) with GPT-4o-mini and deterministic NHS FHIR API search to automate SNOMED CT clinical code discovery. The hybrid LLM + deterministic design provides full explainability and an audit trail, grounding every code selection in exact NICE/QOF guideline evidence. Deployed in the cloud at £0.01/query.
- Adversarial Multi-Agent Clinical Documentation System - Fine-tuned MedGemma 1.5 4B with LoRA to convert doctor shorthand into hallucination-checked SOAP EHR records. Three-tier adversarial pipeline with PII scrubbing and an independent Auditor agent: 100% local, GDPR-compliant, edge-deployable.
- Topic Modelling of Gym Reviews for Actionable Strategies - Used Transformer-based models (BERTopic, DistilBERT) alongside classical NLP methods (LDA) to draw insights from online customer reviews of a global gym company. Integrated an LLM (Phi-4) to distil the findings and generate solutions for the critical topics.
- Student Drop-Out Prediction Model - Used XGBoost and neural networks to predict student dropout rates, helping the business improve customer experience and retention while balancing resource utilisation.
- Commodity Market Prediction Models - LSTM models to capture temporal patterns across commodity and equity features. Used Random Forest for feature selection from a high-dimensional set of engineered features.
- Optimal S&P500 Position Prediction Model - An ensemble LightGBM approach to estimate optimal market position.
- Customer Segmentation - Unsupervised learning with K-Means clustering to optimise the marketing strategy of an e-commerce company, with PCA and t-SNE views of the data.
- Anomaly Detection - One-Class SVM for monitoring a ship's engine to develop an early failure detection system.
Seeking a forward-thinking team in FinTech, HealthTech, or Enterprise AI to architect and ship the next generation of reliable, intelligent AI products.
Visual arts and industrial design enthusiast, amateur DIY'er, 🎾 tennis player/fan. Music, novels and movies are essentials of life for me.
- LinkedIn: [www.linkedin.com/in/ali-aydin-yildiz]
- Email: aliaydinyildiz@gmail.com