pyLDAvis Topic Modeling Visualization


  • Each bubble represents a topic. The larger the bubble, the more prevalent that topic is in the corpus.
  • Blue bars represent the overall frequency of each word in the corpus. If no topic is selected, the blue bars of the most salient terms across the whole corpus are displayed.
  • Red bars give the estimated number of times a given term was generated by a given topic.
  • The further apart two bubbles are, the more the corresponding topics differ.
  • The λ slider lets you rank the terms according to their relevance to the selected topic.
  • By default, the terms of a topic are ranked in decreasing order of their topic-specific probability (λ = 1). Moving the slider towards 0 gives more weight to terms that are distinctive for the topic, that is, frequent in the topic relative to their overall frequency in the corpus.
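
To make the effect of the λ slider concrete, here is a minimal sketch of the relevance score as defined by Sievert and Shirley for LDAvis: relevance = λ · log p(w | t) + (1 − λ) · log(p(w | t) / p(w)). The function names and the toy probabilities below are hypothetical and chosen only for illustration; this is not pyLDAvis's internal code.

```python
import math

def relevance(p_w_given_t, p_w, lam):
    """Relevance of a term for a topic.

    lam = 1 ranks purely by topic-specific probability p(w | t);
    lam = 0 ranks purely by lift, p(w | t) / p(w).
    """
    return lam * math.log(p_w_given_t) + (1 - lam) * math.log(p_w_given_t / p_w)

def rank_terms(topic_probs, corpus_probs, lam=1.0):
    """Return the terms sorted by decreasing relevance for one topic."""
    return sorted(
        topic_probs,
        key=lambda w: relevance(topic_probs[w], corpus_probs[w], lam),
        reverse=True,
    )

# Hypothetical toy numbers: p(w | t) for one topic and p(w) for the whole corpus.
topic_probs = {"the": 0.05, "model": 0.04, "dirichlet": 0.02}
corpus_probs = {"the": 0.06, "model": 0.01, "dirichlet": 0.001}

print(rank_terms(topic_probs, corpus_probs, lam=1.0))  # common words first
print(rank_terms(topic_probs, corpus_probs, lam=0.0))  # distinctive words first
```

With these toy values, λ = 1 puts the generic word "the" first, while λ = 0 promotes "dirichlet", which is rare overall but comparatively frequent in the topic. Intermediate values of λ blend the two rankings.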

LDA Mallet