Mongo production system started to abort

Tim Hawkins

Oct 14, 2017, 10:58:56 PM
to mongodb-user
I have a very long-running system on an EC2 instance, which I have kept regularly updated.

Just recently it has started to abort at random, with the following in the log:

2017-10-14T06:22:47.240+0000 W FTDC     [ftdc] Uncaught exception in 'FileStreamFailed: Failed to write to interim file buffer for full-time diagnostic data capture: /mnt/data/var/lib/mongo/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
2017-10-14T06:23:14.013+0000 E STORAGE  [thread2] WiredTiger error (28) [1507962194:13919][23447:0x7fbdb764e700], file:WiredTiger.wt, WT_SESSION.checkpoint: /mnt/data/var/lib/mongo/WiredTiger.turtle.set: handle-write: pwrite: failed to write 1028 bytes at offset 0: No space left on device
2017-10-14T06:23:14.014+0000 E STORAGE  [thread2] WiredTiger error (28) [1507962194:14205][23447:0x7fbdb764e700], file:WiredTiger.wt, WT_SESSION.checkpoint: /mnt/data/var/lib/mongo/WiredTiger.turtle.set: handle-write: pwrite: failed to write 1028 bytes at offset 0: No space left on device
2017-10-14T06:23:14.014+0000 E STORAGE  [thread2] WiredTiger error (0) [1507962194:14257][23447:0x7fbdb764e700], file:WiredTiger.wt, WT_SESSION.checkpoint: WiredTiger.turtle: encountered an illegal file format or internal value
2017-10-14T06:23:14.014+0000 E STORAGE  [thread2] WiredTiger error (-31804) [1507962194:14267][23447:0x7fbdb764e700], file:WiredTiger.wt, WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-10-14T06:23:14.015+0000 I -        [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2017-10-14T06:23:14.015+0000 I -        [thread2] 

***aborting after fassert() failure


2017-10-14T06:23:14.086+0000 I -        [WTJournalFlusher] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 64
2017-10-14T06:23:14.086+0000 I -        [WTJournalFlusher] 

***aborting after fassert() failure


2017-10-14T06:23:14.202+0000 F -        [thread2] Got signal: 6 (Aborted).

The log implies the system is out of disk space, but there is plenty of free space on the disk:

Filesystem     1K-blocks      Used Available Use% Mounted on
devtmpfs         7819972        68   7819904   1% /dev
tmpfs            7830912         0   7830912   0% /dev/shm
/dev/xvda1      51473000  19434416  31938336  38% /
/dev/nvme0n1   463640688 190978252 272662436  42% /mnt/data

$ uname -a
Linux ip-XXX.XXX.XXX.XXX 4.4.44-39.55.amzn1.x86_64 #1 SMP Mon Jan 30 18:15:53 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ more /etc/mongod.conf 
# mongod.conf

# for documentation of all options, see:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /mnt/data/var/lib/mongo
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile

# network interfaces
net:
  port: 27017
  #bindIp: 127.0.0.1  # Listen to local interface only, comment to listen on all interfaces.


#security:

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:

$ mongo
MongoDB shell version v3.4.9
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.9
Server has startup warnings: 
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] 
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] 
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] 
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-10-15T02:46:10.585+0000 I CONTROL  [initandlisten] 
> db.serverStatus()
{
"host" : "ip-XXX-XXX-XXX-XXX",
"version" : "3.4.9",
"process" : "mongod",
"pid" : NumberLong(27767),
"uptime" : 595,
"uptimeMillis" : NumberLong(595475),
"uptimeEstimate" : NumberLong(595),
"localTime" : ISODate("2017-10-15T02:56:02.612Z"),
"asserts" : {
"regular" : 0,
"warning" : 0,
"msg" : 0,
"user" : 0,
"rollovers" : 0
},
"connections" : {
"current" : 11,
"available" : 51189,
"totalCreated" : 16
},
"extra_info" : {
"note" : "fields vary by platform",
"page_faults" : 42
},
"globalLock" : {
"totalTime" : NumberLong(595055000),
"currentQueue" : {
"total" : 0,
"readers" : 0,
"writers" : 0
},
"activeClients" : {
"total" : 16,
"readers" : 0,
"writers" : 0
}
},
"locks" : {
"Global" : {
"acquireCount" : {
"r" : NumberLong(181020),
"w" : NumberLong(641),
"W" : NumberLong(3)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(90180),
"w" : NumberLong(126),
"R" : NumberLong(8),
"W" : NumberLong(515)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(90180),
"w" : NumberLong(126)
}
},
"Metadata" : {
"acquireCount" : {
"w" : NumberLong(1)
}
}
},
"network" : {
"bytesIn" : NumberLong(1428122),
"bytesOut" : NumberLong(424730507),
"physicalBytesIn" : NumberLong(1428122),
"physicalBytesOut" : NumberLong(424730507),
"numRequests" : NumberLong(11225)
},
"opLatencies" : {
"reads" : {
"latency" : NumberLong(15661955),
"ops" : NumberLong(4861)
},
"writes" : {
"latency" : NumberLong(52560),
"ops" : NumberLong(126)
},
"commands" : {
"latency" : NumberLong(29828),
"ops" : NumberLong(712)
}
},
"opcounters" : {
"insert" : 69,
"query" : 3719,
"update" : 57,
"delete" : 0,
"getmore" : 25,
"command" : 1657
},
"opcountersRepl" : {
"insert" : 0,
"query" : 0,
"update" : 0,
"delete" : 0,
"getmore" : 0,
"command" : 0
},
"storageEngine" : {
"name" : "wiredTiger",
"supportsCommittedReads" : true,
"readOnly" : false,
"persistent" : true
},
"tcmalloc" : {
"generic" : {
"current_allocated_bytes" : 880052496,
"heap_size" : 945741824
},
"tcmalloc" : {
"pageheap_free_bytes" : 41586688,
"pageheap_unmapped_bytes" : 0,
"max_total_thread_cache_bytes" : NumberLong(1073741824),
"current_total_thread_cache_bytes" : 16051128,
"total_free_bytes" : 24102640,
"central_cache_free_bytes" : 2986840,
"transfer_cache_free_bytes" : 5064672,
"thread_cache_free_bytes" : 16051128,
"aggressive_memory_decommit" : 0,
"formattedString" : "------------------------------------------------\nMALLOC:      880052496 (  839.3 MiB) Bytes in use by application\nMALLOC: +     41586688 (   39.7 MiB) Bytes in page heap freelist\nMALLOC: +      2986840 (    2.8 MiB) Bytes in central cache freelist\nMALLOC: +      5064672 (    4.8 MiB) Bytes in transfer cache freelist\nMALLOC: +     16051128 (   15.3 MiB) Bytes in thread cache freelists\nMALLOC: +      5005504 (    4.8 MiB) Bytes in malloc metadata\nMALLOC:   ------------\nMALLOC: =    950747328 (  906.7 MiB) Actual memory used (physical + swap)\nMALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)\nMALLOC:   ------------\nMALLOC: =    950747328 (  906.7 MiB) Virtual address space used\nMALLOC:\nMALLOC:          37884              Spans in use\nMALLOC:             22              Thread heaps in use\nMALLOC:           4096              Tcmalloc page size\n------------------------------------------------\nCall ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).\nBytes released to the OS take up virtual address space but no physical memory.\n"
}
},
"wiredTiger" : {
"uri" : "statistics:",
"LSM" : {
"application work units currently queued" : 0,
"merge work units currently queued" : 0,
"rows merged in an LSM tree" : 0,
"sleep for LSM checkpoint throttle" : 0,
"sleep for LSM merge throttle" : 0,
"switch work units currently queued" : 0,
"tree maintenance operations discarded" : 0,
"tree maintenance operations executed" : 0,
"tree maintenance operations scheduled" : 0,
"tree queue hit maximum" : 0
},
"async" : {
"current work queue length" : 0,
"maximum work queue length" : 0,
"number of allocation state races" : 0,
"number of flush calls" : 0,
"number of operation slots viewed for allocation" : 0,
"number of times operation allocation failed" : 0,
"number of times worker found no work" : 0,
"total allocations" : 0,
"total compact calls" : 0,
"total insert calls" : 0,
"total remove calls" : 0,
"total search calls" : 0,
"total update calls" : 0
},
"block-manager" : {
"blocks pre-loaded" : 9992,
"blocks read" : 27878,
"blocks written" : 625,
"bytes read" : 333848576,
"bytes written" : 9736192,
"bytes written for checkpoint" : 9736192,
"mapped blocks read" : 0,
"mapped bytes read" : 0
},
"cache" : {
"application threads page read from disk to cache count" : 26679,
"application threads page read from disk to cache time (usecs)" : 3307307,
"application threads page write from cache to disk count" : 0,
"application threads page write from cache to disk time (usecs)" : 0,
"bytes belonging to page images in the cache" : 742726558,
"bytes currently in the cache" : 783398397,
"bytes not belonging to page images in the cache" : 40671839,
"bytes read into cache" : 687709776,
"bytes written from cache" : 6472220,
"checkpoint blocked page eviction" : 0,
"eviction calls to get a page" : 25,
"eviction calls to get a page found queue empty" : 25,
"eviction calls to get a page found queue empty after locking" : 0,
"eviction currently operating in aggressive mode" : 0,
"eviction empty score" : 0,
"eviction server candidate queue empty when topping up" : 0,
"eviction server candidate queue not empty when topping up" : 0,
"eviction server evicting pages" : 0,
"eviction server slept, because we did not make progress with eviction" : 0,
"eviction server unable to reach eviction goal" : 0,
"eviction state" : 16,
"eviction walks abandoned" : 0,
"eviction worker thread active" : 0,
"eviction worker thread created" : 0,
"eviction worker thread evicting pages" : 0,
"eviction worker thread removed" : 0,
"eviction worker thread stable number" : 0,
"failed eviction of pages that exceeded the in-memory maximum" : 0,
"files with active eviction walks" : 0,
"files with new eviction walks started" : 0,
"force re-tuning of eviction workers once in a while" : 0,
"hazard pointer blocked page eviction" : 0,
"hazard pointer check calls" : 0,
"hazard pointer check entries walked" : 0,
"hazard pointer maximum array length" : 0,
"in-memory page passed criteria to be split" : 0,
"in-memory page splits" : 0,
"internal pages evicted" : 0,
"internal pages split during eviction" : 0,
"leaf pages split during eviction" : 0,
"lookaside table insert calls" : 0,
"lookaside table remove calls" : 0,
"maximum bytes configured" : 7481589760,
"maximum page size at eviction" : 0,
"modified pages evicted" : 0,
"modified pages evicted by application threads" : 0,
"overflow pages read into cache" : 0,
"overflow values cached in memory" : 0,
"page split during eviction deepened the tree" : 0,
"page written requiring lookaside records" : 0,
"pages currently held in the cache" : 27435,
"pages evicted because they exceeded the in-memory maximum" : 0,
"pages evicted because they had chains of deleted items" : 0,
"pages evicted by application threads" : 0,
"pages queued for eviction" : 0,
"pages queued for urgent eviction" : 0,
"pages queued for urgent eviction during walk" : 0,
"pages read into cache" : 27428,
"pages read into cache requiring lookaside entries" : 0,
"pages requested from the cache" : 3854871,
"pages seen by eviction walk" : 0,
"pages selected for eviction unable to be evicted" : 0,
"pages walked for eviction" : 0,
"pages written from cache" : 457,
"pages written requiring in-memory restoration" : 0,
"percentage overhead" : 8,
"tracked bytes belonging to internal pages in the cache" : 6640435,
"tracked bytes belonging to leaf pages in the cache" : 776757962,
"tracked dirty bytes in the cache" : 250212,
"tracked dirty pages in the cache" : 10,
"unmodified pages evicted" : 0
},
"connection" : {
"auto adjusting condition resets" : 266,
"auto adjusting condition wait calls" : 3908,
"detected system time went backwards" : 0,
"files currently open" : 617,
"memory allocations" : 841338,
"memory frees" : 207484,
"memory re-allocations" : 2600,
"pthread mutex condition wait calls" : 9945,
"pthread mutex shared lock read-lock calls" : 56215,
"pthread mutex shared lock write-lock calls" : 8342,
"total fsync I/Os" : 148,
"total read I/Os" : 29259,
"total write I/Os" : 676
},
"cursor" : {
"cursor create calls" : 1807,
"cursor insert calls" : 940,
"cursor next calls" : 3868181,
"cursor prev calls" : 964,
"cursor remove calls" : 16,
"cursor reset calls" : 476178,
"cursor restarted searches" : 0,
"cursor search calls" : 2564396,
"cursor search near calls" : 1416819,
"cursor update calls" : 0,
"truncate calls" : 0
},
"data-handle" : {
"connection data handles currently active" : 614,
"connection sweep candidate became referenced" : 0,
"connection sweep dhandles closed" : 0,
"connection sweep dhandles removed from hash list" : 76,
"connection sweep time-of-death sets" : 445,
"connection sweeps" : 59,
"session dhandles swept" : 0,
"session sweep attempts" : 44
},
"lock" : {
"checkpoint lock acquisitions" : 8,
"checkpoint lock application thread wait time (usecs)" : 0,
"checkpoint lock internal thread wait time (usecs)" : 0,
"handle-list lock eviction thread wait time (usecs)" : 0,
"metadata lock acquisitions" : 8,
"metadata lock application thread wait time (usecs)" : 0,
"metadata lock internal thread wait time (usecs)" : 0,
"schema lock acquisitions" : 625,
"schema lock application thread wait time (usecs)" : 2234,
"schema lock internal thread wait time (usecs)" : 0,
"table lock acquisitions" : 0,
"table lock application thread time waiting for the table lock (usecs)" : 0,
"table lock internal thread time waiting for the table lock (usecs)" : 0
},
"log" : {
"busy returns attempting to switch slots" : 0,
"consolidated slot closures" : 39,
"consolidated slot join active slot closed" : 0,
"consolidated slot join races" : 0,
"consolidated slot join transitions" : 39,
"consolidated slot joins" : 148,
"consolidated slot transitions unable to find free slot" : 0,
"consolidated slot unbuffered writes" : 0,
"log bytes of payload data" : 711592,
"log bytes written" : 720512,
"log files manually zero-filled" : 0,
"log flush operations" : 5933,
"log force write operations" : 6588,
"log force write operations skipped" : 6561,
"log records compressed" : 98,
"log records not compressed" : 39,
"log records too small to compress" : 11,
"log release advances write LSN" : 10,
"log scan operations" : 3,
"log scan records requiring two reads" : 5,
"log server thread advances write LSN" : 29,
"log server thread write LSN walk skipped" : 1567,
"log sync operations" : 37,
"log sync time duration (usecs)" : 12121,
"log sync_dir operations" : 1,
"log sync_dir time duration (usecs)" : 1,
"log write operations" : 148,
"logging bytes consolidated" : 720128,
"maximum log file size" : 104857600,
"number of pre-allocated log files to create" : 2,
"pre-allocated log files not ready and missed" : 1,
"pre-allocated log files prepared" : 2,
"pre-allocated log files used" : 0,
"records processed by log scan" : 9,
"total in-memory size of compressed records" : 1007181,
"total log buffer size" : 33554432,
"total size of compressed records" : 692690,
"written slots coalesced" : 0,
"yields waiting for previous log file close" : 0
},
"reconciliation" : {
"fast-path pages deleted" : 0,
"page reconciliation calls" : 428,
"page reconciliation calls for eviction" : 0,
"pages deleted" : 0,
"split bytes currently awaiting free" : 0,
"split objects currently awaiting free" : 0
},
"session" : {
"open cursor count" : 650,
"open session count" : 18,
"table alter failed calls" : 0,
"table alter successful calls" : 0,
"table alter unchanged and skipped" : 0,
"table compact failed calls" : 0,
"table compact successful calls" : 0,
"table create failed calls" : 0,
"table create successful calls" : 0,
"table drop failed calls" : 0,
"table drop successful calls" : 0,
"table rebalance failed calls" : 0,
"table rebalance successful calls" : 0,
"table rename failed calls" : 0,
"table rename successful calls" : 0,
"table salvage failed calls" : 0,
"table salvage successful calls" : 0,
"table truncate failed calls" : 0,
"table truncate successful calls" : 0,
"table verify failed calls" : 0,
"table verify successful calls" : 0
},
"thread-state" : {
"active filesystem fsync calls" : 0,
"active filesystem read calls" : 0,
"active filesystem write calls" : 0
},
"thread-yield" : {
"application thread time evicting (usecs)" : 0,
"application thread time waiting for cache (usecs)" : 0,
"page acquire busy blocked" : 0,
"page acquire eviction blocked" : 0,
"page acquire locked blocked" : 0,
"page acquire read blocked" : 1106,
"page acquire time sleeping (usecs)" : 1107000
},
"transaction" : {
"number of named snapshots created" : 0,
"number of named snapshots dropped" : 0,
"transaction begins" : 35005,
"transaction checkpoint currently running" : 0,
"transaction checkpoint generation" : 8,
"transaction checkpoint max time (msecs)" : 45,
"transaction checkpoint min time (msecs)" : 4,
"transaction checkpoint most recent time (msecs)" : 8,
"transaction checkpoint scrub dirty target" : 0,
"transaction checkpoint scrub time (msecs)" : 0,
"transaction checkpoint total time (msecs)" : 174,
"transaction checkpoints" : 8,
"transaction checkpoints skipped because database was clean" : 2,
"transaction failures due to cache overflow" : 0,
"transaction fsync calls for checkpoint after allocating the transaction ID" : 8,
"transaction fsync duration for checkpoint after allocating the transaction ID (usecs)" : 0,
"transaction range of IDs currently pinned" : 0,
"transaction range of IDs currently pinned by a checkpoint" : 0,
"transaction range of IDs currently pinned by named snapshots" : 0,
"transaction sync calls" : 0,
"transactions committed" : 130,
"transactions rolled back" : 34864
},
"concurrentTransactions" : {
"write" : {
"out" : 0,
"available" : 128,
"totalTickets" : 128
},
"read" : {
"out" : 1,
"available" : 127,
"totalTickets" : 128
}
}
},
"mem" : {
"bits" : 64,
"resident" : 895,
"virtual" : 1803,
"supported" : true,
"mapped" : 0,
"mappedWithJournal" : 0
},
"metrics" : {
"commands" : {
"buildInfo" : {
"failed" : NumberLong(0),
"total" : NumberLong(17)
},
"count" : {
"failed" : NumberLong(0),
"total" : NumberLong(1118)
},
"createIndexes" : {
"failed" : NumberLong(0),
"total" : NumberLong(140)
},
"filemd5" : {
"failed" : NumberLong(0),
"total" : NumberLong(29)
},
"getLastError" : {
"failed" : NumberLong(0),
"total" : NumberLong(29)
},
"getLog" : {
"failed" : NumberLong(0),
"total" : NumberLong(1)
},
"insert" : {
"failed" : NumberLong(0),
"total" : NumberLong(69)
},
"isMaster" : {
"failed" : NumberLong(0),
"total" : NumberLong(153)
},
"ping" : {
"failed" : NumberLong(0),
"total" : NumberLong(167)
},
"replSetGetStatus" : {
"failed" : NumberLong(1),
"total" : NumberLong(1)
},
"serverStatus" : {
"failed" : NumberLong(0),
"total" : NumberLong(1)
},
"update" : {
"failed" : NumberLong(0),
"total" : NumberLong(57)
},
"whatsmyuri" : {
"failed" : NumberLong(0),
"total" : NumberLong(1)
}
},
"cursor" : {
"timedOut" : NumberLong(0),
"open" : {
"noTimeout" : NumberLong(0),
"pinned" : NumberLong(0),
"total" : NumberLong(0)
}
},
"document" : {
"deleted" : NumberLong(0),
"inserted" : NumberLong(69),
"returned" : NumberLong(35912),
"updated" : NumberLong(57)
},
"getLastError" : {
"wtime" : {
"num" : 0,
"totalMillis" : 0
},
"wtimeouts" : NumberLong(0)
},
"operation" : {
"scanAndOrder" : NumberLong(227),
"writeConflicts" : NumberLong(0)
},
"queryExecutor" : {
"scanned" : NumberLong(2638449),
"scannedObjects" : NumberLong(1450604)
},
"record" : {
"moves" : NumberLong(0)
},
"repl" : {
"executor" : {
"counters" : {
"eventCreated" : 0,
"eventWait" : 0,
"cancels" : 0,
"waits" : 0,
"scheduledNetCmd" : 0,
"scheduledDBWork" : 0,
"scheduledXclWork" : 0,
"scheduledWorkAt" : 0,
"scheduledWork" : 0,
"schedulingFailures" : 0
},
"queues" : {
"networkInProgress" : 0,
"dbWorkInProgress" : 0,
"exclusiveInProgress" : 0,
"sleepers" : 0,
"ready" : 0,
"free" : 0
},
"unsignaledEvents" : 0,
"eventWaiters" : 0,
"shuttingDown" : false,
"networkInterface" : "\nNetworkInterfaceASIO Operations' Diagnostic:\nOperation:    Count:   \nConnecting    0        \nIn Progress   0        \nSucceeded     0        \nCanceled      0        \nFailed        0        \nTimed Out     0        \n\n"
},
"apply" : {
"attemptsToBecomeSecondary" : NumberLong(0),
"batches" : {
"num" : 0,
"totalMillis" : 0
},
"ops" : NumberLong(0)
},
"buffer" : {
"count" : NumberLong(0),
"maxSizeBytes" : NumberLong(0),
"sizeBytes" : NumberLong(0)
},
"initialSync" : {
"completed" : NumberLong(0),
"failedAttempts" : NumberLong(0),
"failures" : NumberLong(0)
},
"network" : {
"bytes" : NumberLong(0),
"getmores" : {
"num" : 0,
"totalMillis" : 0
},
"ops" : NumberLong(0),
"readersCreated" : NumberLong(0)
},
"preload" : {
"docs" : {
"num" : 0,
"totalMillis" : 0
},
"indexes" : {
"num" : 0,
"totalMillis" : 0
}
}
},
"storage" : {
"freelist" : {
"search" : {
"bucketExhausted" : NumberLong(0),
"requests" : NumberLong(0),
"scanned" : NumberLong(0)
}
}
},
"ttl" : {
"deletedDocuments" : NumberLong(0),
"passes" : NumberLong(9)
}
},
"ok" : 1
}


Weishan Ang

Oct 15, 2017, 7:00:49 PM
to mongodb-user
How about inodes?

Tim Hawkins

Oct 16, 2017, 9:30:38 AM
to mongodb-user
Re inodes: good point, I will check. There is plenty of disk space left, over 270G, and growth is only about 1G a week.

Tim Hawkins

Oct 16, 2017, 9:33:43 AM
to mongodb-user
Nope, no problems with space or inodes. /mnt/data is where the database and its log files live.

sh-4.2# df -i
Filesystem        Inodes  IUsed     IFree IUse% Mounted on
devtmpfs         1954993    465   1954528    1% /dev
tmpfs            1957728      1   1957727    1% /dev/shm
/dev/xvda1       3276800 817284   2459516   25% /
/dev/nvme0n1   463867136   4062 463863074    1% /mnt/data
sh-4.2# df
Filesystem     1K-blocks      Used Available Use% Mounted on
devtmpfs         7819972        68   7819904   1% /dev
tmpfs            7830912         0   7830912   0% /dev/shm
/dev/xvda1      51473000  16761328  34611424  33% /
/dev/nvme0n1   463640688 192048284 271592404  42% /mnt/data
sh-4.2# 
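
Note that these figures are a snapshot taken after the restart, while the abort itself was logged at 06:23 on Oct 14. To compare against the state of the disk at the time of the next abort, free space can be sampled over time. A minimal sketch, assuming POSIX `df -P` column order; the `avail_kb` and `log_free_space` helpers, the `MOUNT` variable and the cron log path are made up for this example:

```shell
#!/bin/sh
# Print the available space (in 1K blocks) for the filesystem holding a path.
# With `df -Pk`, the second output line has the "Available" value in column 4.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print one sample line: UTC timestamp followed by available 1K blocks.
log_free_space() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(avail_kb "$1")"
}

# Mount point taken from the df output in this thread.
MOUNT=${MOUNT:-/mnt/data}
if [ -d "$MOUNT" ]; then
    log_free_space "$MOUNT"
fi

# Hypothetical cron entry to take one sample per minute:
#   * * * * * /usr/local/bin/log_free_space.sh >> /var/log/mnt_data_free.log
```

If the samples never come close to zero around the time of an abort, a transient fill-up of the volume becomes less likely as an explanation.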



Weishan Ang

Oct 16, 2017, 10:24:40 AM
to mongodb-user
After mongod aborted, does it work when you start mongod up again?

I'm wondering about permission issues on /mnt/data/var/lib/mongo/diagnostic.data/metrics.interim.temp or /mnt/data/var/lib/mongo/WiredTiger.turtle.set.
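
One quick way to rule permissions in or out is to try creating a file in the affected directories as the user mongod runs under. A minimal sketch, assuming the dbPath from the posted mongod.conf; the `can_write_dir` helper and the `DBPATH` variable are names made up for this example, and running it as root would report success regardless of ownership:

```shell
#!/bin/sh
# Return 0 if the current user can create a file in the given directory.
can_write_dir() {
    dir=$1
    probe="$dir/.write_probe.$$"
    # Create an empty probe file in a subshell so that a failed redirection
    # cannot terminate the calling shell; remove the probe again on success.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# dbPath as given in the mongod.conf posted in this thread.
DBPATH=${DBPATH:-/mnt/data/var/lib/mongo}
for d in "$DBPATH" "$DBPATH/diagnostic.data"; do
    if can_write_dir "$d"; then
        echo "writable: $d"
    else
        echo "NOT writable: $d"
    fi
done
```

`ls -ld` on the same directories shows the owner and mode if the probe fails.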

Kevin Adistambha

Oct 16, 2017, 6:43:11 PM
to mongodb-user

Hi Tim

Is it possible that you’re running into ulimit issues, e.g. lower than recommended ulimit settings?

You might want to confirm that your deployment follows the recommendations outlined on the Amazon EC2 page, specifically the Manually Deploy MongoDB on EC2 section.
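
The limits that actually apply to the running process can be read from `/proc/<pid>/limits` on Linux, which is more reliable than running `ulimit` in an interactive shell. A minimal sketch, assuming a single mongod process on the host; the `soft_limit` helper is a name made up for this example:

```shell
#!/bin/sh
# Read /proc/<pid>/limits-style text on stdin and print the soft limit
# for the named resource. Columns are separated by two or more spaces:
# name, soft limit, hard limit, units.
soft_limit() {
    awk -v name="$1" -F'  +' '$1 == name { print $2 }'
}

# Take the first mongod process, if any is running.
pid=$(pgrep -x mongod 2>/dev/null | head -n 1)
if [ -n "$pid" ]; then
    echo "open files (soft): $(soft_limit 'Max open files' < "/proc/$pid/limits")"
    echo "processes (soft): $(soft_limit 'Max processes' < "/proc/$pid/limits")"
else
    echo "mongod is not running"
fi
```

For comparison, the serverStatus output earlier in the thread reports 617 files currently open in WiredTiger.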

On another note, please consider setting up authentication in your MongoDB deployment and disabling transparent hugepages, as per the startup warnings.
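
The current transparent hugepage setting can be read from sysfs, where the active value is the one shown in square brackets. A minimal sketch, assuming the sysfs paths quoted in the startup warning; the `active_thp_value` helper is a name made up for this example:

```shell
#!/bin/sh
# Read a THP sysfs line such as "always madvise [never]" on stdin and
# print the bracketed (currently active) value.
active_thp_value() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

for f in enabled defrag; do
    path=/sys/kernel/mm/transparent_hugepage/$f
    if [ -r "$path" ]; then
        echo "$f: $(active_thp_value < "$path")"
    fi
done

# To change the setting until the next reboot (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

The `echo never` lines only last until the next reboot, so they would need to go into an init script or similar to persist.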

Best regards,
Kevin
