Topic Modeling on BBC News
A project on topic modeling, which discovers the abstract “topics” that occur in a collection of documents.
Description
This project uses Python to implement clustering algorithms, including Gaussian mixture models (GMM) and K-means, to analyze word frequencies in the datasets.
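To make "word frequency" concrete, here is a minimal sketch of how documents could be turned into count vectors before clustering. It uses only the standard library; the function names build_vocabulary and to_frequency_vector and the toy corpus are hypothetical illustrations, not the project's actual code or data.

```python
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of every distinct lowercase token in the corpus."""
    vocab = set()
    for doc in documents:
        vocab.update(doc.lower().split())
    return sorted(vocab)


def to_frequency_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(document.lower().split())
    # Counter returns 0 for words that do not appear in the document.
    return [counts[word] for word in vocabulary]


# Hypothetical toy corpus, not taken from the BBC dataset.
docs = [
    "market shares rise as market rallies",
    "team wins match after late goal",
]
vocab = build_vocabulary(docs)
vectors = [to_frequency_vector(d, vocab) for d in docs]
```

Each document becomes one row with one column per vocabulary word, so the dimensionality grows with the vocabulary size. A clustering algorithm such as K-means or GMM can then group rows with similar counts.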
Features
Analyzes .pkl data and outputs an image that groups similar topics together.
Uses WordCloud to generate the output.
Uses unsupervised machine learning.
Can switch between standard K-means and K-means++ initialization for clustering.
Handles high-dimensional data.
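The K-means and K-means++ options differ only in how the initial cluster centres are chosen. The sketch below illustrates that switch in pure Python under stated assumptions: the function names, the init parameter values, and the sample points are hypothetical and are not the project's actual implementation. In scikit-learn, the same choice is exposed through the init parameter of KMeans ("random" or "k-means++").

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_random(points, k, rng):
    """Plain K-means seeding: pick k distinct points uniformly at random."""
    return [list(p) for p in rng.sample(points, k)]


def init_kmeans_pp(points, k, rng):
    """K-means++ seeding: favour points far from the centres chosen so far."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest centre.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a centre; fall back to a uniform pick.
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centers


def kmeans(points, k, init="k-means++", iterations=20, seed=0):
    """Run Lloyd's algorithm with the selected seeding strategy."""
    rng = random.Random(seed)
    seeding = init_kmeans_pp if init == "k-means++" else init_random
    centers = seeding(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centre.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical sample: two well-separated groups of 2-D points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels_pp, centers_pp = kmeans(points, 2, init="k-means++")
labels_rand, centers_rand = kmeans(points, 2, init="random")
```

On well-separated data both seedings reach the same grouping. K-means++ mainly helps on harder, high-dimensional data, where a poor random start can leave the algorithm in a worse local optimum.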