get_coherence restarts/respawns my program


Dave -

Sep 13, 2018, 8:18:24 AM
to Gensim
I'm using gensim to compute the coherence (get_coherence) of my LDA model.
I wrote a small script (see attachment) based on the tutorial at https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/.
But when I run this program on Windows (Python 3.6.5 and Python 3.7, on different Windows 7 machines), the script gets restarted at the call
coherence_lda = coherence_model_lda.get_coherence()
on line 157.

When the program is restarted it executes the script again, but at line 157 I get a traceback, see below.
I tried this on Windows with both an Anaconda environment (Python 3.6.5) and 'pure' Python (Python 3.7). The program exits at the same line.
However, I also tried the script in a Linux environment (Python 3.6.5), and there it works as expected.

If I comment out lines 157 and 158 (the print statement), the program executes normally. Does anybody have a clue?

na Coherence model gecreerd
Coherence_Measure(seg=<function s_one_set at 0x0000000036EB0510>, prob=<function p_boolean_sliding_window at 0x0000000036EB0730>, conf=<function cosine_similarity at 0x0000000036F41D08>, aggr=<function arithmetic_mean at 0x0000000036F432F0>)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Traceback (most recent call last):
  File ".\python_machinelearningplus.py", line 157, in <module>
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    coherence_lda = coherence_model_lda.get_coherence()
    run_name="__mp_main__")  File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 603, in get_coh
erence

  File "D:\python3.7\lib\runpy.py", line 263, in run_path
    confirmed_measures = self.get_coherence_per_topic()
pkg_name=pkg_name, script_name=fname)  File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 563,
in get_coherence_per_topic

      File "D:\python3.7\lib\runpy.py", line 96, in _run_module_code
self.estimate_probabilities(segmented_topics)
      File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 535, in estimate_probabilities
mod_name, mod_spec, pkg_name, script_name)
      File "D:\python3.7\lib\runpy.py", line 85, in _run_code
self._accumulator = self.measure.prob(**kwargs)
      File "D:\python3.7\lib\site-packages\gensim\topic_coherence\probability_estimation.py", line 140, in p_boolean_sli
ding_window
exec(code, run_globals)
      File "H:\Data analyse\buitenland\python_machinelearningplus.py", line 157, in <module>
return accumulator.accumulate(texts, window_size)
  File "D:\python3.7\lib\site-packages\gensim\topic_coherence\text_analysis.py", line 436, in accumulate
    coherence_lda = coherence_model_lda.get_coherence()
      File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 603, in get_coherence
workers, input_q, output_q = self.start_workers(window_size)
      File "D:\python3.7\lib\site-packages\gensim\topic_coherence\text_analysis.py", line 470, in start_workers
confirmed_measures = self.get_coherence_per_topic()
      File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 563, in get_coherence_per_topic
worker.start()
      File "D:\python3.7\lib\multiprocessing\process.py", line 112, in start
self.estimate_probabilities(segmented_topics)
self._popen = self._Popen(self)  File "D:\python3.7\lib\site-packages\gensim\models\coherencemodel.py", line 535, in est
imate_probabilities

  File "D:\python3.7\lib\multiprocessing\context.py", line 223, in _Popen
    self._accumulator = self.measure.prob(**kwargs)
      File "D:\python3.7\lib\site-packages\gensim\topic_coherence\probability_estimation.py", line 140, in p_boolean_sli
ding_window
return _default_context.get_context().Process._Popen(process_obj)
      File "D:\python3.7\lib\multiprocessing\context.py", line 322, in _Popen
return accumulator.accumulate(texts, window_size)
      File "D:\python3.7\lib\site-packages\gensim\topic_coherence\text_analysis.py", line 436, in accumulate
return Popen(process_obj)
workers, input_q, output_q = self.start_workers(window_size)  File "D:\python3.7\lib\multiprocessing\popen_spawn_win32.p
y", line 65, in __init__

      File "D:\python3.7\lib\site-packages\gensim\topic_coherence\text_analysis.py", line 470, in start_workers
reduction.dump(process_obj, to_child)
      File "D:\python3.7\lib\multiprocessing\reduction.py", line 60, in dump
worker.start()
      File "D:\python3.7\lib\multiprocessing\process.py", line 112, in start
ForkingPickler(file, protocol).dump(obj)
    BrokenPipeErrorself._popen = self._Popen(self):
[Errno 32] Broken pipe  File "D:\python3.7\lib\multiprocessing\context.py", line 223, in _Popen

    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\python3.7\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\python3.7\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\python3.7\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

 

code_machinelearning.py

Dave -

Sep 13, 2018, 8:28:14 AM
to Gensim
If I run my script with verbose logging, I see the following:
na Coherence model gecreerd
Coherence_Measure(seg=<function s_one_set at 0x0000000020C7EF28>, prob=<function p_boolean_sliding_window at 0x0000000036EF21E0>, conf=<function cosine_similarity at 0x0000000036F69730>, aggr=<function arithmetic_mean at 0x0000000036F69C80>)
import _frozen_importlib # frozen

import _imp # builtin

import '_thread' # <class '_frozen_importlib.BuiltinImporter'>


import '_warnings' # <class '_frozen_importlib.BuiltinImporter'>
So the verbose output doesn't give me any hint about where to look to debug this either...


Yogesh Kothiya

Nov 11, 2018, 3:21:19 AM
to Gensim
Check this SO thread, if it helps.
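
For what it's worth, the RuntimeError at the end of the traceback already names the likely cause. On Windows, multiprocessing starts child processes with "spawn", which re-imports the main script in each child. Without an if __name__ == '__main__': guard, each child runs the whole script again and reaches get_coherence() a second time while it is still bootstrapping. On Linux these Python versions use "fork" by default, which would explain why the same script works there. Below is a minimal sketch of the idiom using only the standard library. square() and main() are made-up stand-ins, not taken from the attached script; in the real script, the model building and the get_coherence() call would presumably go where the Pool is used here.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical worker function. It lives at module top level so that
    # spawned child processes can find it when they re-import this file.
    return x * x


def main():
    # Everything that starts worker processes goes in here, not at module
    # top level. In the script from this thread, this is where the code
    # that ends in get_coherence() would be moved (an assumption, since
    # the attachment is not shown).
    with mp.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Only the original process has __name__ == "__main__"; spawned
    # children import this file under a different name and skip this block.
    mp.freeze_support()  # only matters when frozen into an executable
    print(main())
```

With the guard in place, importing the file has no side effects, so the children started by multiprocessing do not restart the whole program.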