topic_coherence.indirect_confirmation_measure – Indirect confirmation measure module
This module contains functions to compute confirmation on a pair of words or word subsets.
The advantage of an indirect confirmation measure is that it compares the words in W’ and W* by how similarly they confirm all other words, rather than by their direct co-occurrence. For example, suppose x and z are competing car brands that semantically support each other but are seldom mentioned together in documents of the reference corpus. Their confirmations of other words, such as “road” or “speed”, nevertheless correlate strongly, and an indirect confirmation measure reflects this. Indirect confirmation measures can therefore capture semantic support that direct measures would miss.
The formula used to compute the indirect confirmation measure is

$$m_{sim(m,\gamma)}(W', W^{*}) = s_{sim}\big(\vec{V}_{m,\gamma}(W'),\ \vec{V}_{m,\gamma}(W^{*})\big)$$

where $s_{sim}$ can be cosine, Dice, or Jaccard similarity and

$$\vec{V}_{m,\gamma}(W') = \Bigg\{ \sum_{w_{i} \in W'} m(w_{i}, w_{j})^{\gamma} \Bigg\}_{j = 1, \ldots, |W|}$$
Here ‘m’ is the direct confirmation measure used.
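To make the definition concrete, here is a minimal pure-Python sketch of the context vector and its cosine similarity. It is not gensim's implementation: `context_vector`, `cosine`, `toy_measure`, and the score table are hypothetical names and values chosen only for illustration, and gamma is kept at 1.

```python
import math


def context_vector(word_ids, all_word_ids, m, gamma=1.0):
    """Build V(W'): one entry per word w_j in W, summing m(w_i, w_j) ** gamma over w_i in W'."""
    return [sum(m(w_i, w_j) ** gamma for w_i in word_ids) for w_j in all_word_ids]


def cosine(u, w):
    """Cosine similarity of two equal-length vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, w))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in w))
    return dot / norm if norm else 0.0


# Hypothetical direct-confirmation scores for a three-word topic W = [0, 1, 2].
# In practice these would come from a direct measure such as 'nlr'.
TOY_SCORES = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]


def toy_measure(w_i, w_j):
    """Stand-in for the direct confirmation measure m."""
    return TOY_SCORES[w_i][w_j]


topic = [0, 1, 2]
u = context_vector([0], topic, toy_measure)    # V(W') with W' = {0}
w = context_vector(topic, topic, toy_measure)  # V(W*) with W* = W
score = cosine(u, w)
print(u, w, score)
```

Because each entry of the vector describes how strongly the word set confirms one word of W, two sets that never co-occur can still score highly if their entries rise and fall together.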
gensim.topic_coherence.indirect_confirmation_measure.ContextVectorComputer(measure, topics, accumulator, gamma)
Bases: object
Lazily compute context vectors for topic segments.
compute_context_vector(segment_word_ids, topic_word_ids)
Return the context vector for (segment_word_ids, topic_word_ids). If the vector has already been cached, return the cached copy; otherwise compute it, cache it, and return it.
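The check-cache-then-compute behaviour described above can be sketched as follows. This is an illustration of the pattern only, not the `ContextVectorComputer` source: the class `LazyContextVectors`, its `get` method, and the placeholder `fake_compute` function are hypothetical, and word ids are assumed to be passed as sequences that can be turned into tuples.

```python
class LazyContextVectors:
    """Hypothetical stand-in showing lazy, cached computation of context vectors."""

    def __init__(self, compute_fn):
        self._compute_fn = compute_fn  # does the real work on a cache miss
        self._cache = {}               # (segment ids, topic ids) -> vector
        self.computations = 0          # counts cache misses, for demonstration

    def get(self, segment_word_ids, topic_word_ids):
        # Tuples are hashable, so the pair of id sequences can serve as a dict key.
        key = (tuple(segment_word_ids), tuple(topic_word_ids))
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = self._compute_fn(segment_word_ids, topic_word_ids)
        return self._cache[key]


def fake_compute(segment_word_ids, topic_word_ids):
    """Placeholder computation: one entry per topic word."""
    return [float(len(segment_word_ids))] * len(topic_word_ids)


vectors = LazyContextVectors(fake_compute)
first = vectors.get([1], [1, 2, 3])
second = vectors.get([1], [1, 2, 3])  # served from the cache
print(first, vectors.computations)
```

Caching pays off here because the same (segment, topic) pair typically recurs across the segmentations of a topic.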
gensim.topic_coherence.indirect_confirmation_measure.cosine_similarity(segmented_topics, accumulator, topics, measure='nlr', gamma=1, with_std=False, with_support=False)
This function calculates the indirect cosine measure.
Given context vectors u = V(W’) and w = V(W*) for the word sets of a pair S_i = (W’, W*), the indirect cosine measure is computed as the cosine similarity between u and w.
The formula used is

$$m_{sim(m,\gamma)}(W', W^{*}) = s_{sim}\big(\vec{V}_{m,\gamma}(W'),\ \vec{V}_{m,\gamma}(W^{*})\big)$$

where each vector

$$\vec{V}_{m,\gamma}(W') = \Bigg\{ \sum_{w_{i} \in W'} m(w_{i}, w_{j})^{\gamma} \Bigg\}_{j = 1, \ldots, |W|}$$
| Parameters: | segmented_topics, accumulator, topics, measure='nlr', gamma=1, with_std=False, with_support=False |
|---|---|
| Returns: | Indirect cosine similarity measure for each topic. |
| Return type: | list |
gensim.topic_coherence.indirect_confirmation_measure.word2vec_similarity(segmented_topics, accumulator, with_std=False, with_support=False)
For each topic segmentation, compute the average cosine similarity using a WordVectorsAccumulator.
| Parameters: | segmented_topics, accumulator, with_std=False, with_support=False |
|---|---|
| Returns: | Word2vec cosine similarities per topic. |
| Return type: | list |
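As a rough illustration of averaging cosine similarities over a topic's segments, the sketch below uses a hand-made embedding table instead of a WordVectorsAccumulator. Everything in it is an assumption made for the example: the names `TOY_VECTORS`, `mean_vector`, and `average_segment_similarity` are hypothetical, and comparing word sets through their mean vectors is one common choice rather than a statement of what gensim does internally.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical 2-d word embeddings; a trained word2vec model would supply these.
TOY_VECTORS = {
    "car": [1.0, 0.0],
    "road": [0.8, 0.6],
    "speed": [0.6, 0.8],
}


def average_segment_similarity(segments, vectors):
    """Average the cosine similarity of (W', W*) pairs for one topic."""
    sims = []
    for w_prime, w_star in segments:
        left = mean_vector([vectors[word] for word in w_prime])
        right = mean_vector([vectors[word] for word in w_star])
        sims.append(cosine(left, right))
    return sum(sims) / len(sims)


# One topic segmented into two (W', W*) pairs.
topic_segments = [(["car"], ["road"]), (["car"], ["speed"])]
topic_score = average_segment_similarity(topic_segments, TOY_VECTORS)
print(topic_score)
```

Applying `average_segment_similarity` to each topic's segmentation in turn yields one value per topic, matching the list return type documented above.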