Coherence Model - uncaught exceptions when multiprocessing


Jeff Abell

Apr 28, 2020, 6:03:17 PM
to Gensim

Hello, 

I am working through a simple LSA tutorial and have run into a problem with CoherenceModel: when my compute_coherence_values function runs it with multiprocessing, I get the following errors.


libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Couldn't close file
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Couldn't close file
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Couldn't close file
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Couldn't close file
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Couldn't close file
Traceback (most recent call last):

  File "/Users/J/Documents/codesource/python/projects/LSA/untitled0.py", line 158, in plot_graph
    model_list, coherence_values = compute_coherence_values(dictionary, doc_term_matrix, doc_clean, stop, start, step)

  File "/Users/J/Documents/codesource/python/projects/LSA/untitled0.py", line 149, in compute_coherence_values
    coherence_values.append(coherencemodel.get_coherence())

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/models/coherencemodel.py", line 609, in get_coherence
    confirmed_measures = self.get_coherence_per_topic()

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/models/coherencemodel.py", line 569, in get_coherence_per_topic
    self.estimate_probabilities(segmented_topics)

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/models/coherencemodel.py", line 541, in estimate_probabilities
    self._accumulator = self.measure.prob(**kwargs)

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/topic_coherence/probability_estimation.py", line 156, in p_boolean_sliding_window
    return accumulator.accumulate(texts, window_size)

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/topic_coherence/text_analysis.py", line 452, in accumulate
    accumulators = self.terminate_workers(input_q, output_q, workers, interrupted)

  File "/opt/anaconda3/lib/python3.7/site-packages/gensim/topic_coherence/text_analysis.py", line 525, in terminate_workers
    input_q.put(None, block=True)

  File "/opt/anaconda3/lib/python3.7/multiprocessing/queues.py", line 82, in put
    if not self._sem.acquire(block, timeout):
 ++++++++++

My CoherenceModel is defined as:

coherencemodel = CoherenceModel(model=model, texts=doc_clean, dictionary=dictionary, coherence='c_v', processes=5)

Notice that there are 5 uncaught exceptions above, presumably one for each worker process.

When I use processes=1 (no multiprocessing), the issue goes away.
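For context, here is a minimal sketch of the loop around that call. Only the CoherenceModel(...) call, the get_coherence() line, and the argument order of compute_coherence_values come from the traceback and the definition quoted in this post. The rest is an assumption modeled on the usual tutorial pattern: the use of LsiModel, the default values, and the processes, model_cls and coherence_cls parameters are hypothetical and exist here only so the loop can be exercised without the full pipeline.

```python
def compute_coherence_values(dictionary, doc_term_matrix, doc_clean,
                             stop, start=2, step=3, processes=1,
                             model_cls=None, coherence_cls=None):
    """Train one model per topic count and score each with c_v coherence.

    model_cls and coherence_cls are hypothetical hooks for stand-in classes;
    when omitted, gensim's LsiModel and CoherenceModel are used.
    """
    if model_cls is None or coherence_cls is None:
        # Imported lazily so the loop can also run with stand-in classes.
        from gensim.models import CoherenceModel, LsiModel
        model_cls = model_cls or LsiModel
        coherence_cls = coherence_cls or CoherenceModel

    model_list = []
    coherence_values = []
    for num_topics in range(start, stop, step):
        # Assumption: an LSI model per topic count, as in a typical LSA tutorial.
        model = model_cls(doc_term_matrix, num_topics=num_topics,
                          id2word=dictionary)
        model_list.append(model)
        # Same call as in my script; processes is the only value I vary.
        coherencemodel = coherence_cls(model=model, texts=doc_clean,
                                       dictionary=dictionary,
                                       coherence='c_v', processes=processes)
        coherence_values.append(coherencemodel.get_coherence())
    return model_list, coherence_values
```

With this shape, the working and failing runs differ only in the processes value passed to CoherenceModel (1 versus 5).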

Does anyone have any insight? Is this a package issue on my end, or a bug in gensim?

Thanks!


