Hi everyone. We have recently started to see mongos crash on one of our Windows Server 2012 R2 machines, which hosts both our application and the mongos router for that application.
Our cluster has 8 shards and 8 mongos instances.
When the application starts, it needs a lot of data from the DB and pulls all of it through the local mongos. Many of the requests are async, which of course leads the official MongoDB C# driver to create a lot of connections/threads to mongos. Within seconds, however, mongos crashes; this happens consistently on roughly 50% of application starts.
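A minimal sketch of the kind of load pattern described above, for illustration only. It is written in Python rather than C#, it does not talk to MongoDB or use any driver, and `fetch_document`, `load_all`, and `run_startup_load` are hypothetical names. It only shows that firing all async requests at once puts every request in flight simultaneously, which is what drives the number of concurrent connections up.

```python
import asyncio


async def fetch_document(i, stats):
    # Hypothetical stand-in for one async query; no real driver or mongos involved.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # simulated network round trip
    stats["in_flight"] -= 1
    return i


async def load_all(n):
    # Start all n requests at once, with no limit on concurrency.
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(*(fetch_document(i, stats) for i in range(n)))
    return results, stats["peak"]


def run_startup_load(n=500):
    # Returns the fetched values and the peak number of simultaneous requests.
    return asyncio.run(load_all(n))


if __name__ == "__main__":
    results, peak = run_startup_load()
    print(f"requests: {len(results)}, peak in flight: {peak}")
```

In this simulation the peak number of in-flight requests equals the total number of requests, because nothing bounds the concurrency.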
Here is the last mongos log output before the crash:
I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\mongo\executor\connection_pool_asio.cpp(214) <lambda_35837b6b24c420cd94fefaa17131656f>::operator()+0x4c
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\third_party\asio-asio-1-11-0\asio\include\asio\detail\completion_handler.hpp(69) asio::detail::completion_handler<<lambda_35837b6b24c420cd94fefaa17131656f> >::do_complete+0xcc
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\third_party\asio-asio-1-11-0\asio\include\asio\detail\impl\strand_service.ipp(164) asio::detail::strand_service::do_complete+0xa7
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\third_party\asio-asio-1-11-0\asio\include\asio\detail\impl\win_iocp_io_service.ipp(404) asio::detail::win_iocp_io_service::do_one+0x2be
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\third_party\asio-asio-1-11-0\asio\include\asio\detail\impl\win_iocp_io_service.ipp(162) asio::detail::win_iocp_io_service::run+0xbb
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\third_party\asio-asio-1-11-0\asio\include\asio\impl\io_service.ipp(61) asio::io_service::run+0x32
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe ...\src\mongo\executor\network_interface_asio.cpp(116) <lambda_c44aeeec9ed6b3fe32a46d4f069876e0>::operator()+0x204
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe c:\program files (x86)\microsoft visual studio 12.0\vc\include\thr\xthread(187) std::_LaunchPad<std::_Bind<0,void,<lambda_c44aeeec9ed6b3fe32a46d4f069876e0> > >::_Go+0x1c
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe f:\dd\vctools\crt\crtw32\stdcpp\thr\threadcall.cpp(28) _Call_func+0x14
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe f:\dd\vctools\crt\crtw32\startup\threadex.c(376) _callthreadstartex+0x17
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] mongos.exe f:\dd\vctools\crt\crtw32\startup\threadex.c(354) _threadstartex+0x102
2016-08-17T15:27:59.690+0000 I CONTROL [NetworkInterfaceASIO-0] KERNEL32.DLL BaseThreadInitThunk+0x22
2016-08-17T15:27:59.690+0000 I - [NetworkInterfaceASIO-0]
2016-08-17T15:27:59.691+0000 I CONTROL [NetworkInterfaceASIO-0] writing minidump diagnostic file C:\MMSAutomation\versions\mongodb-win32-x86_64-3.2016-08-17T15-27-59.mdmp
Machine information:
OS Information:
OS Name Microsoft Windows Server 2012 R2 Standard
Version 6.3.9600 Build 9600
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name VS1GS3
System Manufacturer Dell Inc.
System Model PowerEdge R730
System Type x64-based PC
System SKU SKU=NotProvided;ModelName=PowerEdge R730
Processor Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2400 Mhz, 6 Core(s), 12 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2400 Mhz, 6 Core(s), 12 Logical Processor(s)
BIOS Version/Date Dell Inc. 1.0.4, 8/28/2014
SMBIOS Version 2.8
Embedded Controller Version 255.255
BIOS Mode Legacy
BaseBoard Manufacturer Dell Inc.
BaseBoard Model Not Available
BaseBoard Name Base Board
Platform Role Enterprise Server
Windows Directory C:\windows
System Directory C:\windows\system32
Installed Physical Memory (RAM) 64.0 GB
Total Physical Memory 63.9 GB
Available Physical Memory 46.6 GB
Total Virtual Memory 73.4 GB
Available Virtual Memory 55.2 GB
The full log file and crash dumps are attached.
Many thanks in advance to anyone who can help,
Valentin