1 min read

NLP Topic Modeling

Topic modeling on scraped news articles using LDA, K-means clustering, BERT, and BART summarization.

NLP Topic Modeling

This NLP project performs topic modeling on news articles and refines topic groupings with clustering and transformer-based methods.

  • Web scraped news articles and stored the collected data in Amazon S3.
  • Built an LDA topic modeling baseline, then improved the topic workflow with K-means clustering and BERT.
  • Evaluated performance with topic coherence and silhouette score.
  • Visualized word importance per topic and summarized articles with BART.