JC

Projects

Selected side projects.

  • Loan Underwriting
    A logistic regression model for predicting one-year probability of default and improving loan underwriting decisions.
  • Music Recommender System
    A PySpark recommender system for large-scale music listening data stored in HDFS.
  • NLP Topic Modeling
    Topic modeling on scraped news articles using LDA, K-means clustering, BERT, and BART summarization.
  • NLP Text Stock Prediction
    LSTM and BERT models that analyze daily news headlines to forecast stock price movement.
  • Twitter Data Pipeline
    An Airflow ETL pipeline for Twitter data using Docker, Python, and AWS S3.
  • Health Insurance Premium Price Analysis
    Health insurance premium analysis using causal inference, regression, clustering, and machine learning methods.
  • Bank Customer Segmentation Dashboard
    An interactive dashboard for analyzing dummy bank customer data from the United Kingdom.
  • Data Analysis on Research Text Data
    Exploratory text analysis on National Science Foundation research grant data.
  • NYC Park Crime Tableau Dashboard
    A Tableau dashboard analyzing crime incidents in New York City parks.
© 2026 Jin Choi