For reference, the program I wrote runs fine on English text, where KoNLPy is not used.
2017-01-05 15:46:58,761 - gensim.models.word2vec - DEBUG - Fast version of gensim.models.word2vec is being used
2017-01-05 15:46:58,761 - gensim.models.doc2vec - INFO - collecting all words and their counts
2017-01-05 15:47:02,605 - gensim.models.doc2vec - INFO - PROGRESS: at example #0, processed 0 words (0/s), 0 word types, 0 tags
2017-01-05 15:48:11,999 - gensim.models.doc2vec - INFO - collected 9807 word types and 1000 unique tags from a corpus of 1000 examples and 313612 words
2017-01-05 15:48:12,000 - gensim.models.word2vec - INFO - Loading a fresh vocabulary
2017-01-05 15:48:12,050 - gensim.models.word2vec - INFO - min_count=5 retains 4599 unique words (46% of original 9807, drops 5208)
2017-01-05 15:48:12,052 - gensim.models.word2vec - INFO - min_count=5 leaves 303663 word corpus (96% of original 313612, drops 9949)
2017-01-05 15:48:12,105 - gensim.models.word2vec - INFO - deleting the raw counts dictionary of 9807 items
2017-01-05 15:48:12,107 - gensim.models.word2vec - INFO - sample=0.001 downsamples 43 most-common words
2017-01-05 15:48:12,107 - gensim.models.word2vec - INFO - downsampling leaves estimated 229548 word corpus (75.6% of prior 303663)
2017-01-05 15:48:12,109 - gensim.models.word2vec - INFO - estimated required memory for 4599 words and 300 dimensions: 14737100 bytes
2017-01-05 15:48:12,147 - gensim.models.word2vec - INFO - resetting layer weights
2017-01-05 15:48:12,566 - gensim.models.word2vec - INFO - training model with 48 workers on 4599 vocabulary and 300 features, using sg=0 hs=0 sample=0.001 negative=5 window=5
2017-01-05 15:48:12,566 - gensim.models.word2vec - INFO - expecting 1000 sentences, matching count from corpus used for vocabulary survey
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ff302aa7817, pid=79421, tid=0x00007ff1321fc700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 1.8.0_111-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [_jpype.so+0x62817] JPJavaEnv::NewObjectA(_jclass*, _jmethodID*, jvalue*)+0x47
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/opt/work/dnn4pat/hs_err_pid79421.log
#
# If you would like to submit a bug report, please visit:
#
Aborted
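The log above shows the crash inside `_jpype.so` right after training starts with 48 workers. One possible explanation, which is only an assumption since the program itself is not shown here, is that the corpus iterator calls the KoNLPy tagger lazily, so the JVM ends up being entered from threads other than the one that started it. A minimal sketch of a workaround under that assumption is to tokenize everything once, up front, in the main thread, and hand gensim only plain Python lists. The names `pretokenize` and `tokenize` are hypothetical, and `str.split` stands in for the real tagger so the sketch runs without a JVM.

```python
from typing import Callable, Iterable, List, Tuple


def pretokenize(
    docs: Iterable[str],
    tokenize: Callable[[str], List[str]],
) -> List[Tuple[List[str], List[int]]]:
    """Tokenize every document once, in the calling thread.

    Returns (words, tags) pairs held in memory, so no tokenizer call
    happens later while the model is training.
    """
    return [(tokenize(text), [i]) for i, text in enumerate(docs)]


# Stand-in tokenizer (assumption): with KoNLPy this would be a tagger
# method that returns a list of tokens for one string.
corpus = pretokenize(["a b c", "d e"], str.split)
```

Each `(words, tags)` pair could then be wrapped in gensim's `TaggedDocument(words, tags)` before training. The trade-off is memory: the whole tokenized corpus is kept in RAM, which should be acceptable for the 1000 documents and roughly 310,000 words reported in the log.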
* My environment is as follows.