I am facing the same MemoryError issue.
I have a dataset of 322 GB. I read the JSON files to prepare the training data, building two lists to supply at training time: one holds the file names (DocLabels) and the other holds the file text (Data). Processing all 322 GB at once raised a memory error, so I divided the dataset into 2 sets. With the first set of 110 GB, the dataset preparation step completed successfully, but I got a memory error during doc2vec training (inside build_vocab, according to the traceback). Here is the output:
2017-11-27 10:54:18,459 : INFO : collecting all words and their counts
2017-11-27 10:54:18,474 : INFO : PROGRESS: at example #0, processed 0 words (0/s), 0 word types, 0 tags
2017-11-27 10:55:03,105 : INFO : PROGRESS: at example #10000, processed 152663650 words (3419935/s), 5569472 word types, 10000 tags
2017-11-27 10:55:57,802 : INFO : PROGRESS: at example #20000, processed 331818820 words (3275295/s), 10035481 word types, 20000 tags
2017-11-27 10:56:52,864 : INFO : PROGRESS: at example #30000, processed 504357226 words (3133729/s), 14415155 word types, 30000 tags
2017-11-27 10:58:57,177 : INFO : PROGRESS: at example #40000, processed 889851113 words (3101062/s), 22031283 word types, 40000 tags
2017-11-27 11:00:04,242 : INFO : PROGRESS: at example #50000, processed 1072569770 words (2724355/s), 25509947 word types, 50000 tags
2017-11-27 11:01:02,655 : INFO : PROGRESS: at example #60000, processed 1241225166 words (2887482/s), 28479318 word types, 60000 tags
2017-11-27 11:02:02,529 : INFO : PROGRESS: at example #70000, processed 1409582269 words (2811964/s), 31192772 word types, 70000 tags
2017-11-27 11:02:42,173 : INFO : PROGRESS: at example #80000, processed 1508806632 words (2502602/s), 33227334 word types, 80000 tags
2017-11-27 11:04:32,243 : INFO : PROGRESS: at example #90000, processed 1809654284 words (2733423/s), 39412168 word types, 90000 tags
2017-11-27 11:07:20,119 : INFO : PROGRESS: at example #100000, processed 2234156699 words (2528533/s), 47147221 word types, 100000 tags
2017-11-27 11:09:52,196 : INFO : PROGRESS: at example #110000, processed 2607896686 words (2457627/s), 53897383 word types, 110000 tags
2017-11-27 11:10:44,446 : INFO : PROGRESS: at example #120000, processed 2745264867 words (2629450/s), 56001822 word types, 120000 tags
2017-11-27 11:11:16,137 : INFO : PROGRESS: at example #130000, processed 2827117610 words (2582841/s), 58366521 word types, 130000 tags
2017-11-27 11:11:50,825 : INFO : PROGRESS: at example #140000, processed 2912509508 words (2461196/s), 60757543 word types, 140000 tags
2017-11-27 11:12:21,437 : INFO : PROGRESS: at example #150000, processed 2995331490 words (2705918/s), 62875375 word types, 150000 tags
2017-11-27 11:12:52,543 : INFO : PROGRESS: at example #160000, processed 3078531060 words (2674653/s), 64960484 word types, 160000 tags
2017-11-27 11:13:37,401 : INFO : PROGRESS: at example #170000, processed 3160857061 words (1835253/s), 67044892 word types, 170000 tags
2017-11-27 11:14:11,191 : INFO : PROGRESS: at example #180000, processed 3245049235 words (2491577/s), 69110558 word types, 180000 tags
2017-11-27 11:14:43,832 : INFO : PROGRESS: at example #190000, processed 3330286435 words (2611326/s), 71095891 word types, 190000 tags
2017-11-27 11:15:15,394 : INFO : PROGRESS: at example #200000, processed 3412482675 words (2604299/s), 73032591 word types, 200000 tags
2017-11-27 11:15:41,117 : INFO : PROGRESS: at example #210000, processed 3484747568 words (2809144/s), 73978394 word types, 210000 tags
2017-11-27 11:16:03,545 : INFO : PROGRESS: at example #220000, processed 3539712618 words (2450890/s), 74930744 word types, 220000 tags
2017-11-27 11:16:24,595 : INFO : PROGRESS: at example #230000, processed 3594204897 words (2588179/s), 75823469 word types, 230000 tags
2017-11-27 11:16:47,161 : INFO : PROGRESS: at example #240000, processed 3651254919 words (2529385/s), 76733627 word types, 240000 tags
2017-11-27 11:17:09,582 : INFO : PROGRESS: at example #250000, processed 3708906038 words (2571235/s), 77635553 word types, 250000 tags
2017-11-27 11:17:46,730 : INFO : PROGRESS: at example #260000, processed 3766507618 words (1550176/s), 78535135 word types, 260000 tags
2017-11-27 11:18:09,684 : INFO : PROGRESS: at example #270000, processed 3826063201 words (2594230/s), 79430195 word types, 270000 tags
2017-11-27 11:18:34,592 : INFO : PROGRESS: at example #280000, processed 3885800529 words (2398620/s), 80409526 word types, 280000 tags
2017-11-27 11:19:08,168 : INFO : PROGRESS: at example #290000, processed 3968037506 words (2449717/s), 81997016 word types, 290000 tags
2017-11-27 11:19:35,923 : INFO : PROGRESS: at example #300000, processed 4036921076 words (2482351/s), 83125495 word types, 300000 tags
2017-11-27 11:19:58,625 : INFO : PROGRESS: at example #310000, processed 4097091596 words (2648966/s), 84031816 word types, 310000 tags
2017-11-27 11:20:23,447 : INFO : PROGRESS: at example #320000, processed 4159877051 words (2529729/s), 85001748 word types, 320000 tags
2017-11-27 11:20:50,767 : INFO : PROGRESS: at example #330000, processed 4227457905 words (2474381/s), 86012206 word types, 330000 tags
2017-11-27 11:21:19,819 : INFO : PROGRESS: at example #340000, processed 4295950182 words (2357029/s), 87081250 word types, 340000 tags
2017-11-27 11:21:52,331 : INFO : PROGRESS: at example #350000, processed 4371362975 words (2319315/s), 88480879 word types, 350000 tags
2017-11-27 11:23:00,957 : INFO : PROGRESS: at example #360000, processed 4461660292 words (1315870/s), 90085061 word types, 360000 tags
2017-11-27 11:23:31,667 : INFO : PROGRESS: at example #370000, processed 4532568324 words (2308694/s), 91532531 word types, 370000 tags
2017-11-27 11:24:04,960 : INFO : PROGRESS: at example #380000, processed 4608419619 words (2278579/s), 92988541 word types, 380000 tags
2017-11-27 11:24:45,655 : INFO : PROGRESS: at example #390000, processed 4701794826 words (2293378/s), 95603736 word types, 390000 tags
2017-11-27 11:25:27,592 : INFO : PROGRESS: at example #400000, processed 4798265510 words (2301310/s), 98167804 word types, 400000 tags
2017-11-27 11:26:11,569 : INFO : PROGRESS: at example #410000, processed 4895266540 words (2205785/s), 100786108 word types, 410000 tags
2017-11-27 11:26:55,226 : INFO : PROGRESS: at example #420000, processed 4993322826 words (2246206/s), 103370226 word types, 420000 tags
2017-11-27 11:27:37,148 : INFO : PROGRESS: at example #430000, processed 5088389983 words (2267482/s), 105677374 word types, 430000 tags
2017-11-27 11:28:03,908 : INFO : PROGRESS: at example #440000, processed 5147896915 words (2224477/s), 106601742 word types, 440000 tags
2017-11-27 11:28:29,671 : INFO : PROGRESS: at example #450000, processed 5208146581 words (2338848/s), 107402834 word types, 450000 tags
2017-11-27 11:29:19,598 : INFO : PROGRESS: at example #460000, processed 5265476538 words (1148079/s), 108331428 word types, 460000 tags
2017-11-27 11:29:45,993 : INFO : PROGRESS: at example #470000, processed 5325687199 words (2280751/s), 109300165 word types, 470000 tags
2017-11-27 11:30:12,549 : INFO : PROGRESS: at example #480000, processed 5385844852 words (2265788/s), 110205462 word types, 480000 tags
2017-11-27 11:30:51,648 : INFO : PROGRESS: at example #490000, processed 5466415924 words (2060612/s), 111889347 word types, 490000 tags
2017-11-27 11:31:34,576 : INFO : PROGRESS: at example #500000, processed 5555596021 words (2077843/s), 113791549 word types, 500000 tags
2017-11-27 11:32:16,072 : INFO : PROGRESS: at example #510000, processed 5644147903 words (2133446/s), 115671742 word types, 510000 tags
2017-11-27 11:32:57,910 : INFO : PROGRESS: at example #520000, processed 5728904460 words (2026359/s), 117484160 word types, 520000 tags
2017-11-27 11:33:41,608 : INFO : PROGRESS: at example #530000, processed 5818595331 words (2051948/s), 119334571 word types, 530000 tags
2017-11-27 11:34:25,165 : INFO : PROGRESS: at example #540000, processed 5908503009 words (2064164/s), 121185709 word types, 540000 tags
2017-11-27 11:35:09,950 : INFO : PROGRESS: at example #550000, processed 5998600120 words (2011960/s), 123056834 word types, 550000 tags
2017-11-27 11:35:54,576 : INFO : PROGRESS: at example #560000, processed 6085824310 words (1954382/s), 124871802 word types, 560000 tags
2017-11-27 11:36:38,542 : INFO : PROGRESS: at example #570000, processed 6174440471 words (2016116/s), 126700160 word types, 570000 tags
2017-11-27 11:37:51,989 : INFO : PROGRESS: at example #580000, processed 6259424856 words (1157025/s), 128397926 word types, 580000 tags
2017-11-27 11:38:32,204 : INFO : PROGRESS: at example #590000, processed 6342094693 words (2055857/s), 129951745 word types, 590000 tags
2017-11-27 11:39:12,381 : INFO : PROGRESS: at example #600000, processed 6424854207 words (2059776/s), 131514222 word types, 600000 tags
2017-11-27 11:39:56,421 : INFO : PROGRESS: at example #610000, processed 6508111128 words (1890202/s), 133098502 word types, 610000 tags
2017-11-27 11:40:40,098 : INFO : PROGRESS: at example #620000, processed 6591677968 words (1913836/s), 134647880 word types, 620000 tags
2017-11-27 11:41:21,648 : INFO : PROGRESS: at example #630000, processed 6672233996 words (1938550/s), 136188929 word types, 630000 tags
2017-11-27 11:42:08,092 : INFO : PROGRESS: at example #640000, processed 6762071533 words (1934173/s), 137963800 word types, 640000 tags
2017-11-27 11:42:56,043 : INFO : PROGRESS: at example #650000, processed 6852436879 words (1884525/s), 139744193 word types, 650000 tags
2017-11-27 11:43:40,808 : INFO : PROGRESS: at example #660000, processed 6938789040 words (1928968/s), 141359878 word types, 660000 tags
2017-11-27 11:44:22,221 : INFO : PROGRESS: at example #670000, processed 7019344331 words (1945305/s), 142766760 word types, 670000 tags
2017-11-27 11:45:04,053 : INFO : PROGRESS: at example #680000, processed 7098128924 words (1883938/s), 144166899 word types, 680000 tags
2017-11-27 11:46:00,631 : INFO : PROGRESS: at example #690000, processed 7205858371 words (1904226/s), 145908318 word types, 690000 tags
2017-11-27 11:47:30,378 : INFO : PROGRESS: at example #700000, processed 7374395882 words (1877839/s), 148396626 word types, 700000 tags
2017-11-27 11:48:20,276 : INFO : PROGRESS: at example #710000, processed 7465636870 words (1828456/s), 149581811 word types, 710000 tags
2017-11-27 11:49:24,095 : INFO : PROGRESS: at example #720000, processed 7578445224 words (1767523/s), 151437817 word types, 720000 tags
2017-11-27 11:50:28,154 : INFO : PROGRESS: at example #730000, processed 7689860414 words (1739420/s), 153993381 word types, 730000 tags
2017-11-27 11:52:13,573 : INFO : PROGRESS: at example #740000, processed 7804638995 words (1088868/s), 156147343 word types, 740000 tags
2017-11-27 11:53:14,137 : INFO : PROGRESS: at example #750000, processed 7914193226 words (1808665/s), 157850041 word types, 750000 tags
2017-11-27 11:54:09,155 : INFO : PROGRESS: at example #760000, processed 8014233229 words (1818562/s), 159461712 word types, 760000 tags
2017-11-27 11:55:11,368 : INFO : PROGRESS: at example #770000, processed 8123415169 words (1754935/s), 160976003 word types, 770000 tags
2017-11-27 11:56:17,418 : INFO : PROGRESS: at example #780000, processed 8239969464 words (1764421/s), 162868161 word types, 780000 tags
2017-11-27 11:57:28,443 : INFO : PROGRESS: at example #790000, processed 8357823905 words (1659331/s), 164571859 word types, 790000 tags
2017-11-27 11:58:31,115 : INFO : PROGRESS: at example #800000, processed 8465856989 words (1723943/s), 165855863 word types, 800000 tags
2017-11-27 11:59:38,081 : INFO : PROGRESS: at example #810000, processed 8575907418 words (1643363/s), 167675625 word types, 810000 tags
2017-11-27 12:00:45,845 : INFO : PROGRESS: at example #820000, processed 8689399163 words (1674742/s), 169385297 word types, 820000 tags
2017-11-27 12:01:57,246 : INFO : PROGRESS: at example #830000, processed 8804099667 words (1606546/s), 170984319 word types, 830000 tags
2017-11-27 12:02:59,476 : INFO : PROGRESS: at example #840000, processed 8907297453 words (1658177/s), 172494784 word types, 840000 tags
2017-11-27 12:04:03,980 : INFO : PROGRESS: at example #850000, processed 9012025720 words (1623877/s), 174406058 word types, 850000 tags
2017-11-27 12:05:12,709 : INFO : PROGRESS: at example #860000, processed 9125655859 words (1653227/s), 176375161 word types, 860000 tags
2017-11-27 12:06:23,759 : INFO : PROGRESS: at example #870000, processed 9238572341 words (1589284/s), 178354187 word types, 870000 tags
2017-11-27 12:07:56,710 : INFO : PROGRESS: at example #880000, processed 9346242729 words (1158270/s), 180768166 word types, 880000 tags
2017-11-27 12:09:09,290 : INFO : PROGRESS: at example #890000, processed 9462554828 words (1602516/s), 182803395 word types, 890000 tags
2017-11-27 12:10:15,506 : INFO : PROGRESS: at example #900000, processed 9572241994 words (1656738/s), 184671040 word types, 900000 tags
2017-11-27 12:11:25,417 : INFO : PROGRESS: at example #910000, processed 9685914858 words (1626053/s), 186261093 word types, 910000 tags
2017-11-27 12:12:23,052 : INFO : PROGRESS: at example #920000, processed 9783252247 words (1688488/s), 187368580 word types, 920000 tags
2017-11-27 12:13:37,819 : INFO : PROGRESS: at example #930000, processed 9908719074 words (1678122/s), 189112546 word types, 930000 tags
2017-11-27 12:14:43,667 : INFO : PROGRESS: at example #940000, processed 10014129862 words (1600898/s), 190420710 word types, 940000 tags
2017-11-27 12:16:47,460 : INFO : PROGRESS: at example #950000, processed 10125139689 words (896738/s), 191705137 word types, 950000 tags
2017-11-27 12:18:02,244 : INFO : PROGRESS: at example #960000, processed 10247521765 words (1636399/s), 192857787 word types, 960000 tags
2017-11-27 12:19:19,960 : INFO : PROGRESS: at example #970000, processed 10377655281 words (1674546/s), 194106565 word types, 970000 tags
2017-11-27 12:20:33,180 : INFO : PROGRESS: at example #980000, processed 10496949512 words (1629473/s), 195385814 word types, 980000 tags
2017-11-27 12:21:48,766 : INFO : PROGRESS: at example #990000, processed 10612291662 words (1525873/s), 196747237 word types, 990000 tags
2017-11-27 12:23:07,130 : INFO : PROGRESS: at example #1000000, processed 10732101821 words (1528907/s), 197874949 word types, 1000000 tags
2017-11-27 12:24:45,052 : INFO : PROGRESS: at example #1010000, processed 10890197142 words (1614528/s), 198769761 word types, 1010000 tags
2017-11-27 12:26:21,289 : INFO : PROGRESS: at example #1020000, processed 11047017190 words (1629567/s), 200228568 word types, 1020000 tags
2017-11-27 12:28:16,026 : INFO : PROGRESS: at example #1030000, processed 11229081122 words (1586725/s), 202459220 word types, 1030000 tags
2017-11-27 12:30:29,013 : INFO : PROGRESS: at example #1040000, processed 11424319613 words (1468068/s), 208260001 word types, 1040000 tags
2017-11-27 12:31:37,200 : INFO : PROGRESS: at example #1050000, processed 11532405079 words (1585337/s), 209843180 word types, 1050000 tags
2017-11-27 12:34:04,003 : INFO : PROGRESS: at example #1060000, processed 11755580678 words (1520111/s), 213984687 word types, 1060000 tags
2017-11-27 12:36:00,303 : INFO : PROGRESS: at example #1070000, processed 11934190372 words (1535802/s), 216992114 word types, 1070000 tags
2017-11-27 12:37:16,950 : INFO : PROGRESS: at example #1080000, processed 12050889393 words (1522698/s), 219795582 word types, 1080000 tags
2017-11-27 12:39:37,346 : INFO : PROGRESS: at example #1090000, processed 12303608363 words (1799926/s), 225353138 word types, 1090000 tags
2017-11-27 12:42:13,420 : INFO : PROGRESS: at example #1100000, processed 12542671389 words (1531697/s), 229901537 word types, 1100000 tags
2017-11-27 12:44:27,612 : INFO : PROGRESS: at example #1110000, processed 12744399342 words (1503340/s), 232543628 word types, 1110000 tags
2017-11-27 12:45:36,236 : INFO : PROGRESS: at example #1120000, processed 12845213636 words (1469749/s), 233897416 word types, 1120000 tags
2017-11-27 12:46:57,911 : INFO : PROGRESS: at example #1130000, processed 12958009078 words (1380796/s), 236809426 word types, 1130000 tags
2017-11-27 12:51:02,836 : INFO : PROGRESS: at example #1140000, processed 13292693659 words (1366412/s), 242258680 word types, 1140000 tags
2017-11-27 12:52:40,566 : INFO : PROGRESS: at example #1150000, processed 13427173349 words (1376080/s), 244299228 word types, 1150000 tags
2017-11-27 12:55:38,203 : INFO : PROGRESS: at example #1160000, processed 13683147891 words (1440981/s), 248662959 word types, 1160000 tags
2017-11-27 13:01:31,595 : INFO : PROGRESS: at example #1170000, processed 14248810893 words (1600715/s), 257698537 word types, 1170000 tags
2017-11-27 13:06:43,517 : INFO : PROGRESS: at example #1180000, processed 14730700025 words (1544859/s), 265550753 word types, 1180000 tags
2017-11-27 13:08:36,461 : INFO : PROGRESS: at example #1190000, processed 14882238992 words (1341595/s), 267362897 word types, 1190000 tags
2017-11-27 13:11:09,104 : INFO : PROGRESS: at example #1200000, processed 14994557079 words (735842/s), 268526014 word types, 1200000 tags
2017-11-27 13:13:35,226 : INFO : PROGRESS: at example #1210000, processed 15207374894 words (1456539/s), 271348701 word types, 1210000 tags
2017-11-27 13:17:06,999 : INFO : PROGRESS: at example #1220000, processed 15495935602 words (1362558/s), 275731241 word types, 1220000 tags
2017-11-27 13:21:30,391 : INFO : PROGRESS: at example #1230000, processed 15860509580 words (1384158/s), 280004206 word types, 1230000 tags
2017-11-27 13:22:54,453 : INFO : PROGRESS: at example #1240000, processed 15972466459 words (1331792/s), 281398502 word types, 1240000 tags
2017-11-27 13:24:50,621 : INFO : PROGRESS: at example #1250000, processed 16125513992 words (1317438/s), 283066579 word types, 1250000 tags
2017-11-27 13:30:00,828 : INFO : PROGRESS: at example #1260000, processed 16568002145 words (1426481/s), 288482853 word types, 1260000 tags
2017-11-27 13:31:07,852 : INFO : PROGRESS: at example #1270000, processed 16650673396 words (1233381/s), 289273909 word types, 1270000 tags
2017-11-27 13:32:14,180 : INFO : PROGRESS: at example #1280000, processed 16736681722 words (1296570/s), 290444309 word types, 1280000 tags
2017-11-27 13:34:00,434 : INFO : PROGRESS: at example #1290000, processed 16883054135 words (1377598/s), 292448959 word types, 1290000 tags
2017-11-27 13:35:24,700 : INFO : PROGRESS: at example #1300000, processed 16988089315 words (1246432/s), 294098789 word types, 1300000 tags
2017-11-27 13:38:51,888 : INFO : PROGRESS: at example #1310000, processed 17257650497 words (1301093/s), 298892240 word types, 1310000 tags
2017-11-27 13:43:37,499 : INFO : PROGRESS: at example #1320000, processed 17619644259 words (1267448/s), 303380082 word types, 1320000 tags
2017-11-27 13:46:17,831 : INFO : PROGRESS: at example #1330000, processed 17826979573 words (1293163/s), 305716169 word types, 1330000 tags
2017-11-27 13:49:40,045 : INFO : PROGRESS: at example #1340000, processed 18093105136 words (1316016/s), 308753076 word types, 1340000 tags
2017-11-27 13:51:11,434 : INFO : PROGRESS: at example #1350000, processed 18208800931 words (1266091/s), 309884698 word types, 1350000 tags
2017-11-27 13:53:05,289 : INFO : PROGRESS: at example #1360000, processed 18355434145 words (1287764/s), 311215118 word types, 1360000 tags
2017-11-27 13:53:08,221 : INFO : collected 311250382 word types and 1360633 unique tags from a corpus of 1360633 examples and 18358933358 words
2017-11-27 13:53:08,221 : INFO : Loading a fresh vocabulary
2017-11-27 14:20:14,513 : INFO : min_count=5 retains 31820897 unique words (10% of original 311250382, drops 279429485)
2017-11-27 14:20:14,513 : INFO : min_count=5 leaves 17982644368 word corpus (97% of original 18358933358, drops 376288990)
2017-11-27 14:22:06,467 : INFO : deleting the raw counts dictionary of 311250382 items
2017-11-27 14:23:23,805 : INFO : sample=0.001 downsamples 30 most-common words
2017-11-27 14:23:23,805 : INFO : downsampling leaves estimated 15316549016 word corpus (85.2% of prior 17982644368)
2017-11-27 14:23:23,805 : INFO : estimated required memory for 31820897 words and 300 dimensions: 94185487500 bytes
2017-11-27 14:26:48,005 : INFO : resetting layer weights
Traceback (most recent call last):
  File "C:/Users/Administrator/PycharmProjects/Concept_Hierarchical_Model/d2v_core.py", line 112, in <module>
    model.build_vocab(it)
  File "C:\Users\Administrator\Anaconda2\lib\site-packages\gensim\models\word2vec.py", line 546, in build_vocab
    self.finalize_vocab(update=update)  # build tables & arrays
  File "C:\Users\Administrator\Anaconda2\lib\site-packages\gensim\models\word2vec.py", line 717, in finalize_vocab
    self.reset_weights()
  File "C:\Users\Administrator\Anaconda2\lib\site-packages\gensim\models\doc2vec.py", line 655, in reset_weights
    super(Doc2Vec, self).reset_weights()
  File "C:\Users\Administrator\Anaconda2\lib\site-packages\gensim\models\word2vec.py", line 1109, in reset_weights
    self.syn1neg = zeros((len(self.wv.vocab), self.layer1_size), dtype=REAL)
MemoryError
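
For reference, the "estimated required memory" figure in the log can be reproduced with simple arithmetic from the retained vocabulary size (31820897), the vector size (300) and the number of tags (1360633). The sketch below is only a rough breakdown: the function name and the per-item overheads (500 bytes per vocabulary entry, 200 bytes per tag) are my assumptions, chosen because they make the total match the logged 94185487500 bytes. They are not a documented gensim guarantee.

```python
# Rough breakdown of the "estimated required memory" line in the log.
# ASSUMPTION: the overheads of 500 bytes per vocabulary entry and 200 bytes
# per doc tag are inferred so that the total matches the logged figure.

BYTES_PER_FLOAT32 = 4  # the weight arrays are float32


def estimate_doc2vec_bytes(vocab_size, vector_size, num_tags,
                           vocab_overhead=500, tag_overhead=200):
    """Return a dict with a rough per-component memory estimate in bytes."""
    word_matrix = vocab_size * vector_size * BYTES_PER_FLOAT32
    report = {
        "vocab": vocab_size * vocab_overhead,        # vocabulary dictionary
        "syn0": word_matrix,                         # word vectors
        "syn1neg": word_matrix,                      # negative-sampling weights
        "doctag_lookup": num_tags * tag_overhead,    # tag bookkeeping
        "doctag_syn0": num_tags * vector_size * BYTES_PER_FLOAT32,  # doc vectors
    }
    report["total"] = sum(report.values())
    return report


# Values taken from the log above.
report = estimate_doc2vec_bytes(vocab_size=31820897, vector_size=300,
                                num_tags=1360633)
for name, size in sorted(report.items()):
    print("%-14s %15d bytes (%.1f GiB)" % (name, size, size / 2.0 ** 30))
```

Under these assumptions the syn1neg array alone, which is the allocation that fails in the traceback, needs about 38 GB, and syn0 needs the same amount again. Both scale linearly with the retained vocabulary and the vector size.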