- 🎓 M.S. in Data Science @ Texas A&M University–Corpus Christi (GPA 4.0/4.0 | Graduate Fellow)
- 🎓 B.S. in Culture & Technology + Biotechnology @ Sungkyunkwan University (SKKU)
- 🔬 Former Data Science Researcher @ Samsung Medical Center (Epidemiology & Clinical Data)
- 📊 Passionate about financial time-series forecasting, NLP, and turning messy real-world data into actionable insights
📄 KDD 2026 (ACM SIGKDD) — Research paper under review on multimodal financial forecasting
🔗 Financial Time-Series Forecasting with Deep Learning and Social Media Sentiment
🥇 Awarded the SKKU President Award (1st Place, R&D Innovation) for a methane-reducing functional gum project (₩2M prize, graduation evaluation waived)
🔗 View SKKU article (English)
🎤 Presented stock forecasting research at Coastal Bend Conference (2025) and NMSU Workshop (2025)
🔗 View LinkedIn post
Graduate Research Assistant @ Prof. Sreelekha Guggilam's Lab (TAMUCC) · 2024.09 – Present
- Thesis research at the intersection of financial time-series modeling and social sentiment analysis
- Designed a Reddit sentiment-enhanced forecasting framework using FinBERT + multivariate transformer models (TFT, Chronos-T5, TimesFM)
- Fine-tuning large foundation models on stock price data; building scalable inference pipelines for real-time market prediction
Clinical Data Research Assistant @ Samsung Medical Center – CCE · 2022.08 – 2023.08
- Supported data collection and analysis for clinical epidemiological studies under Prof. Juhee Cho (SAHIST / Johns Hopkins) and Prof. Danbee Kang (SAHIST)
- Worked on structured health datasets related to rare and oncological diseases
- Managed and cleaned clinical datasets; assisted in data coordination (protocols, data workflows) for epidemiology research
Research Assistant @ Prof. Jinhee Hur's Lab (SKKU) · 2022.03 – 2022.07
- Conducted scientific literature review and basic data handling for lab-scale experiments under Prof. Jinhee Hur (Ph.D. from Johns Hopkins, Postdoctoral Fellow at Harvard)
- Gained foundational experience in academic research and interdisciplinary collaboration
[KDD 2026 — Under Review] Financial Time-Series Forecasting with Deep Learning and Social Media Sentiment
🔗 Repo
- Full-scale multimodal forecasting framework integrating Reddit-derived FinBERT sentiment + LOESS spike features into a Temporal Fusion Transformer (TFT) across 10 large-cap U.S. equities and 1,000 trading days
- Achieved a +213.6% average improvement in information coefficient (IC) over the TFT baseline; the best model reached a 20.19% cumulative return and a Sharpe ratio of 2.74 over the test period
Expanded and formalized from Reddit-Driven-Stock-Forecasting for KDD 2026 submission
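
For readers unfamiliar with the metrics quoted for this project, here is a minimal sketch of how they are commonly computed. It is not the paper's evaluation code. It assumes IC means the Spearman rank correlation between predicted and realized returns, and that the Sharpe ratio is annualized over 252 trading days with a zero risk-free rate. All function names and toy numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def _ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def information_coefficient(predicted, realized):
    """Assumed definition: Spearman rank correlation (Pearson on ranks)."""
    return _pearson(_ranks(predicted), _ranks(realized))


def cumulative_return(daily_returns):
    """Compound simple daily returns into one total return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(daily_returns) / stdev(daily_returns) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Toy numbers for illustration only, not results from the project.
    predicted = [0.01, -0.02, 0.03, 0.00]
    realized = [0.02, -0.01, 0.01, 0.00]
    print("IC:", information_coefficient(predicted, realized))
    print("Cumulative return:", cumulative_return(realized))
    print("Sharpe:", sharpe_ratio(realized))
```
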
Reddit-Driven Stock Forecasting
- Clean, reproducible pipelines for forecasting stock prices (TSLA, NVDA) using ARIMA, Google TimesFM, Amazon Chronos, and TFT with Reddit sentiment and spike features
- TFT + Reddit achieved RMSE ↓ 40.2% on TSLA and ↓ 87.9% on NVDA vs. baseline TFT
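
The RMSE reductions above are relative changes against the baseline model. A minimal sketch of that arithmetic follows, assuming the percentage is computed as (baseline - model) / baseline. The helper names and numbers are hypothetical and are not taken from the repository.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pct_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0


if __name__ == "__main__":
    # Toy prices for illustration only.
    actual = [100.0, 102.0, 101.0, 105.0]
    baseline_pred = [98.0, 105.0, 99.0, 101.0]
    sentiment_pred = [99.5, 102.5, 100.0, 104.0]
    base = rmse(actual, baseline_pred)
    model = rmse(actual, sentiment_pred)
    print(f"RMSE reduction: {pct_reduction(base, model):.1f}%")
```
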
Bayesian Cost Prediction in Healthcare
- Compared Bayesian and Frequentist linear regression on individual medical cost prediction using the `rethinking` package in R
- Focused on uncertainty quantification and interpretability
- Data Science, Data Structures, Algorithms, Machine Learning
- Deep Learning, Numerical Methods, Computational Methods for Statistics
- Programming in Python, Problem Solving, Predictive Analytics, Biostatistics
- ✉️ Email: yejinh.cs@gmail.com
- 💻 GitHub