Class MalletLdaWrapper.MalletLDATopicModeler

java.lang.Object
cc.mallet.topics.ParallelTopicModel
org.dice_research.topicmodeling.algorithm.mallet.MalletLdaWrapper.MalletLDATopicModeler
All Implemented Interfaces:
Serializable, org.dice_research.topicmodeling.algorithms.ClassificationModel, org.dice_research.topicmodeling.algorithms.LDAModel, org.dice_research.topicmodeling.algorithms.Model, org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel, org.dice_research.topicmodeling.algorithms.VocabularyContaining, org.dice_research.topicmodeling.algorithms.VocabularyContainingClassificationModel, org.dice_research.topicmodeling.algorithms.VocabularyContainingModel
Enclosing class:
MalletLdaWrapper

protected static class MalletLdaWrapper.MalletLDATopicModeler extends cc.mallet.topics.ParallelTopicModel implements org.dice_research.topicmodeling.algorithms.LDAModel
See Also:
Serialized Form
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
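    static int
    DEFAULT_INFERENCE_ITERATIONS
     
    protected int
    inferenceIterations
     
    protected MalletLdaInferenceWrapper
    inferencer
     
    protected int
    inferencerVersion
     
    protected int
    iteration
     
    protected static org.slf4j.Logger
    logger
     
    protected cc.mallet.topics.WorkerRunnable[]
    runnables
     
    private static long
    serialVersionUID
     
    protected double[]
    topicWeights
     
    protected org.dice_research.topicmodeling.utils.vocabulary.Vocabulary
    vocabulary
     
    protected double[][]
    wordTopicWeights
     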

    Fields inherited from class cc.mallet.topics.ParallelTopicModel

    alpha, alphabet, alphaSum, beta, betaSum, burninPeriod, data, DEFAULT_BETA, docLengthCounts, formatter, modelFilename, numIterations, numTopics, numTypes, optimizeInterval, printLogLikelihood, randomSeed, saveModelInterval, saveSampleInterval, saveStateInterval, showTopicsInterval, stateFilename, temperingInterval, tokensPerTopic, topicAlphabet, topicBits, topicDocCounts, topicMask, totalTokens, typeTopicCounts, UNASSIGNED_TOPIC, usingSymmetricAlpha, wordsPerTopic
  • Constructor Summary

    Constructors
    Constructor
    Description
    MalletLDATopicModeler(int numberOfTopics, double alphaSum, double beta, long seed)
     
    MalletLDATopicModeler(int numberOfTopics, long seed)
     
    MalletLDATopicModeler(cc.mallet.types.LabelAlphabet topicAlphabet, double alphaSum, double beta, long seed)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected void
    calculateSmoothedWeights()
     
    void
    estimate()
     
    double[]
    getAlphas()
     
    double
    getBeta()
     
    org.dice_research.topicmodeling.utils.doc.DocumentClassificationResult
    getClassificationForDocument(org.dice_research.topicmodeling.utils.doc.Document document)
     
    cc.mallet.topics.TopicInferencer
    getInferencer()
     
    int
    getNumberOfTopics()
     
    double
    getProbabilityOfWord(int wordId, int topicId)
     
    double
    getSmoothedProbabilityOfTopic(int topicId)
     
    double
    getSmoothedProbabilityOfWord(int wordId, int topicId)
     
    double[]
    getTopicProbabilitiesForDocument(org.dice_research.topicmodeling.utils.doc.DocumentWordCounts wordCounts)
     
    int
    getVersion()
     
    org.dice_research.topicmodeling.utils.vocabulary.Vocabulary
    getVocabulary()
     
    org.dice_research.topicmodeling.utils.vocabulary.VocabularyMapping
    getVocabularyMapping(org.dice_research.topicmodeling.utils.vocabulary.Vocabulary otherVocabulary)
     
    int[]
    inferTopicAssignmentsForDocument(int[] tokens)
     
    int[]
    inferTopicAssignmentsForDocument(org.dice_research.topicmodeling.utils.doc.Document document)
     
    int[]
    inferTopicAssignmentsForDocument(org.dice_research.topicmodeling.utils.doc.DocumentWordCounts wordCounts)
     
    void
    initialize(cc.mallet.types.InstanceList instances)
     
    void
    setInferenceIterations(int inferenceIterations)
     
    void
    setVersion(int version)
     
    void
    setVocabularyDecorator(org.dice_research.topicmodeling.utils.vocabulary.VocabularyDecorator vocabulary)
     

    Methods inherited from class cc.mallet.topics.ParallelTopicModel

    addInstances, buildInitialTypeTopicCounts, displayTopWords, getAlphabet, getData, getNumTopics, getProbEstimator, getSortedWords, getTopicAlphabet, getTopicProbabilities, getTopicProbabilities, getTopWords, initializeFromState, main, modelLogLikelihood, optimizeAlpha, optimizeBeta, printDocumentTopics, printDocumentTopics, printDocumentTopics, printState, printState, printTopicWordWeights, printTopicWordWeights, printTopWords, printTopWords, printTypeTopicCounts, read, setBurninPeriod, setNumIterations, setNumThreads, setOptimizeInterval, setRandomSeed, setSaveSerializedModel, setSaveState, setSymmetricAlpha, setTemperingInterval, setTopicDisplay, sumTypeTopicCounts, temperAlpha, topicPhraseXMLReport, topicXMLReport, write

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • serialVersionUID

      private static final long serialVersionUID
      See Also:
      Constant Field Values
    • logger

      protected static final org.slf4j.Logger logger
    • DEFAULT_INFERENCE_ITERATIONS

      public static int DEFAULT_INFERENCE_ITERATIONS
    • runnables

      protected transient cc.mallet.topics.WorkerRunnable[] runnables
    • iteration

      protected int iteration
    • inferencerVersion

      protected transient int inferencerVersion
    • inferencer

      protected transient MalletLdaInferenceWrapper inferencer
    • vocabulary

      protected org.dice_research.topicmodeling.utils.vocabulary.Vocabulary vocabulary
    • inferenceIterations

      protected int inferenceIterations
    • wordTopicWeights

      protected double[][] wordTopicWeights
    • topicWeights

      protected double[] topicWeights
  • Constructor Details

    • MalletLDATopicModeler

      public MalletLDATopicModeler(int numberOfTopics, long seed)
    • MalletLDATopicModeler

      public MalletLDATopicModeler(int numberOfTopics, double alphaSum, double beta, long seed)
    • MalletLDATopicModeler

      public MalletLDATopicModeler(cc.mallet.types.LabelAlphabet topicAlphabet, double alphaSum, double beta, long seed)
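      The constructors accept a single alphaSum rather than one alpha per topic. The superclass cc.mallet.topics.ParallelTopicModel starts from a symmetric prior by dividing alphaSum evenly across the topics; whether this subclass alters that behaviour is not documented on this page, so the following is only a sketch of that relationship. The class name SymmetricPrior and the method symmetricAlphas are hypothetical and are not part of this API.

```java
import java.util.Arrays;

public class SymmetricPrior {

    /**
     * Spreads alphaSum evenly over all topics.
     * Assumption: a symmetric Dirichlet prior, i.e. alpha_t = alphaSum / numberOfTopics.
     */
    public static double[] symmetricAlphas(int numberOfTopics, double alphaSum) {
        if (numberOfTopics <= 0) {
            throw new IllegalArgumentException("numberOfTopics must be positive");
        }
        double[] alphas = new double[numberOfTopics];
        Arrays.fill(alphas, alphaSum / numberOfTopics);
        return alphas;
    }

    public static void main(String[] args) {
        // Four topics sharing an alphaSum of 2.0.
        System.out.println(Arrays.toString(symmetricAlphas(4, 2.0)));
    }
}
```

      Under this assumption, getAlphas() would return an array of length getNumberOfTopics() whose entries sum to alphaSum until hyperparameter optimization changes them.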
  • Method Details

    • initialize

      public void initialize(cc.mallet.types.InstanceList instances)
    • estimate

      public void estimate()
      Overrides:
      estimate in class cc.mallet.topics.ParallelTopicModel
    • getSmoothedProbabilityOfWord

      public double getSmoothedProbabilityOfWord(int wordId, int topicId)
      Specified by:
      getSmoothedProbabilityOfWord in interface org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel
    • getProbabilityOfWord

      public double getProbabilityOfWord(int wordId, int topicId)
      Specified by:
      getProbabilityOfWord in interface org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel
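      This page does not state how the smoothed and unsmoothed word probabilities differ. A common convention for LDA, shown here as an assumption rather than as this class's implementation, is to use the raw relative frequency for the unsmoothed value and to add the beta prior for the smoothed one. The class WordTopicProbabilities and its count parameters are hypothetical.

```java
public class WordTopicProbabilities {

    /** Unsmoothed estimate: count(word, topic) / count(topic); 0 for an empty topic. */
    public static double probability(int wordTopicCount, int topicTokenCount) {
        return topicTokenCount == 0 ? 0.0 : (double) wordTopicCount / topicTokenCount;
    }

    /**
     * Smoothed estimate with a symmetric beta prior:
     * (count(word, topic) + beta) / (count(topic) + vocabularySize * beta).
     */
    public static double smoothedProbability(int wordTopicCount, int topicTokenCount,
                                             int vocabularySize, double beta) {
        return (wordTopicCount + beta) / (topicTokenCount + vocabularySize * beta);
    }

    public static void main(String[] args) {
        // A word never seen in a topic: unsmoothed is zero, smoothed stays positive.
        System.out.println(probability(0, 10));
        System.out.println(smoothedProbability(0, 10, 5, 0.5));
    }
}
```

      The practical difference under this convention is that the smoothed value is never zero when beta is positive, which matters when probabilities are multiplied or logged.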
    • getSmoothedProbabilityOfTopic

      public double getSmoothedProbabilityOfTopic(int topicId)
      Specified by:
      getSmoothedProbabilityOfTopic in interface org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel
    • getNumberOfTopics

      public int getNumberOfTopics()
      Specified by:
      getNumberOfTopics in interface org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel
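      The accessors listed so far can be combined on the caller's side. The sketch below computes a distribution over topics for a single word with Bayes' rule, assuming that getSmoothedProbabilityOfWord returns p(word | topic) and getSmoothedProbabilityOfTopic returns p(topic); neither meaning is confirmed on this page. The nested interface WordTopicModel is a hypothetical stand-in that copies three signatures from above so the example compiles without the library.

```java
public class TopicPosterior {

    /** Hypothetical stand-in mirroring three signatures documented on this page. */
    public interface WordTopicModel {
        int getNumberOfTopics();
        double getSmoothedProbabilityOfWord(int wordId, int topicId);
        double getSmoothedProbabilityOfTopic(int topicId);
    }

    /**
     * Returns p(topic | word) for every topic, assuming the model methods return
     * p(word | topic) and p(topic). The result sums to 1 unless all products are 0.
     */
    public static double[] topicPosteriorForWord(WordTopicModel model, int wordId) {
        int numberOfTopics = model.getNumberOfTopics();
        double[] posterior = new double[numberOfTopics];
        double norm = 0.0;
        for (int t = 0; t < numberOfTopics; t++) {
            posterior[t] = model.getSmoothedProbabilityOfWord(wordId, t)
                    * model.getSmoothedProbabilityOfTopic(t);
            norm += posterior[t];
        }
        if (norm > 0.0) {
            for (int t = 0; t < numberOfTopics; t++) {
                posterior[t] /= norm;
            }
        }
        return posterior;
    }
}
```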
    • calculateSmoothedWeights

      protected void calculateSmoothedWeights()
    • getTopicProbabilitiesForDocument

      public double[] getTopicProbabilitiesForDocument(org.dice_research.topicmodeling.utils.doc.DocumentWordCounts wordCounts)
      Specified by:
      getTopicProbabilitiesForDocument in interface org.dice_research.topicmodeling.algorithms.ProbabilisticWordTopicModel
    • getVocabulary

      public org.dice_research.topicmodeling.utils.vocabulary.Vocabulary getVocabulary()
      Specified by:
      getVocabulary in interface org.dice_research.topicmodeling.algorithms.VocabularyContaining
    • getInferencer

      public cc.mallet.topics.TopicInferencer getInferencer()
      Overrides:
      getInferencer in class cc.mallet.topics.ParallelTopicModel
    • setVocabularyDecorator

      public void setVocabularyDecorator(org.dice_research.topicmodeling.utils.vocabulary.VocabularyDecorator vocabulary)
    • setVersion

      public void setVersion(int version)
      Specified by:
      setVersion in interface org.dice_research.topicmodeling.algorithms.Model
    • getVersion

      public int getVersion()
      Specified by:
      getVersion in interface org.dice_research.topicmodeling.algorithms.Model
    • getVocabularyMapping

      public org.dice_research.topicmodeling.utils.vocabulary.VocabularyMapping getVocabularyMapping(org.dice_research.topicmodeling.utils.vocabulary.Vocabulary otherVocabulary)
      Specified by:
      getVocabularyMapping in interface org.dice_research.topicmodeling.algorithms.VocabularyContainingModel
    • getClassificationForDocument

      public org.dice_research.topicmodeling.utils.doc.DocumentClassificationResult getClassificationForDocument(org.dice_research.topicmodeling.utils.doc.Document document)
      Specified by:
      getClassificationForDocument in interface org.dice_research.topicmodeling.algorithms.ClassificationModel
    • inferTopicAssignmentsForDocument

      public int[] inferTopicAssignmentsForDocument(org.dice_research.topicmodeling.utils.doc.Document document)
      Specified by:
      inferTopicAssignmentsForDocument in interface org.dice_research.topicmodeling.algorithms.LDAModel
    • inferTopicAssignmentsForDocument

      public int[] inferTopicAssignmentsForDocument(org.dice_research.topicmodeling.utils.doc.DocumentWordCounts wordCounts)
      Specified by:
      inferTopicAssignmentsForDocument in interface org.dice_research.topicmodeling.algorithms.LDAModel
    • inferTopicAssignmentsForDocument

      public int[] inferTopicAssignmentsForDocument(int[] tokens)
      Specified by:
      inferTopicAssignmentsForDocument in interface org.dice_research.topicmodeling.algorithms.LDAModel
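      The three overloads above return an int[] without further description. Assuming each entry is the topic id assigned to the token at the same position, a caller could turn the assignments into document-level topic proportions as sketched below. The class TopicProportions is hypothetical, and the alphas argument is assumed to be the array returned by getAlphas().

```java
public class TopicProportions {

    /**
     * Converts per-token topic ids into smoothed topic proportions:
     * theta_t = (n_t + alpha_t) / (N + sum(alpha)).
     * Ids outside [0, alphas.length) are skipped (assumed to mark unassigned tokens).
     */
    public static double[] fromAssignments(int[] topicAssignments, double[] alphas) {
        double[] theta = new double[alphas.length];
        double total = 0.0;
        for (int t = 0; t < alphas.length; t++) {
            theta[t] = alphas[t];
            total += alphas[t];
        }
        for (int topic : topicAssignments) {
            if (topic < 0 || topic >= alphas.length) {
                continue; // skip tokens without a valid topic
            }
            theta[topic] += 1.0;
            total += 1.0;
        }
        if (total > 0.0) {
            for (int t = 0; t < theta.length; t++) {
                theta[t] /= total;
            }
        }
        return theta;
    }
}
```

      The assignments come from sampling, so repeated calls on the same document may give different arrays; this sketch only summarizes one such result.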
    • getBeta

      public double getBeta()
      Specified by:
      getBeta in interface org.dice_research.topicmodeling.algorithms.LDAModel
    • getAlphas

      public double[] getAlphas()
      Specified by:
      getAlphas in interface org.dice_research.topicmodeling.algorithms.LDAModel
    • setInferenceIterations

      public void setInferenceIterations(int inferenceIterations)
      Specified by:
      setInferenceIterations in interface org.dice_research.topicmodeling.algorithms.LDAModel