Unexpected WiredTigerLAS.wt file growth


Роман Бирюлькин

Apr 10, 2018, 8:54:10 PM
to mongodb-user
Hi everyone. We have a MongoDB cluster with 3 instances. After an increase in the amount of writes, WiredTigerLAS.wt grew to 20 GB on all nodes in just a few minutes. This led to an out-of-space issue and, of course, downtime. This is strange, because usually this file is less than 1.5 GB. I tried to find out from the documentation how to control its size, but had no luck. Can you suggest how to limit its size, and do you know what could cause this situation?

Stephen Steneker

Apr 11, 2018, 6:10:17 AM
to mongodb-user
On Wednesday, 11 April 2018 10:54:10 UTC+10, Роман Бирюлькин wrote:
Hi everyone. We have a MongoDB cluster with 3 instances. After an increase in the amount of writes, WiredTigerLAS.wt grew to 20 GB on all nodes in just a few minutes. This led to an out-of-space issue and, of course, downtime. This is strange, because usually this file is less than 1.5 GB. I tried to find out from the documentation how to control its size, but had no luck. Can you suggest how to limit its size, and do you know what could cause this situation?

Hi,

WiredTigerLAS.wt is an overflow file for the WiredTiger in-memory cache. Excessive growth is definitely not expected and will likely be detrimental to performance. The file size cannot be directly configured, but we should be able to determine the configuration or workload issue that is leading to this outcome.

Can you please confirm some more details for your environment:
  • specific version of MongoDB server (x.y.z)
  • O/S and version
  • assuming your three node cluster is a single replica set:
    • the rs.conf().protocolVersion
    • the roles & storage engines for your three nodes
  • total RAM
  • any non-default storage configuration options aside from dbPath (db.serverCmdLineOpts().parsed.storage)
  • host environment type: bare metal, VM, container, ... ?
Also, were there any recent changes in your deployment (such as a MongoDB upgrade) that preceded this change in behaviour or does this only appear to coincide with increased write activity?
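
For reference, most of these can be pulled from the mongo shell on each node; a quick sketch (the exact output shape may vary slightly by version):

    // Run in the mongo shell, connected to each member
    db.version()                           // server version (x.y.z)
    rs.conf().protocolVersion              // replica set protocol version
    db.serverCmdLineOpts().parsed.storage  // storage options the node was started with
    db.hostInfo().system.memSizeMB         // total RAM visible to the host, in MB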

Thanks,
Stennie

Роман Бирюлькин

Apr 12, 2018, 1:05:37 AM
to mongodb-user
Hi!

1. MongoDB 3.6.2
2. Ubuntu 14.04.5, kernel 3.13.0-121-generic
3. protocol version: 1; roles: 1 primary, 2 secondaries; storage engine: wiredTiger
4. total RAM: 15 GB
5. the only non-default storage option is that the journal is disabled
6. AWS EC2 instances
7. no recent changes

Thanks

On Wednesday, 11 April 2018 at 15:10:17 UTC+5, Stephen Steneker wrote:

Kevin Adistambha

Apr 30, 2018, 2:19:15 AM
to mongodb-user

Hi

Could you post some additional details:

  • Did you upgrade this deployment from an earlier MongoDB version, or did you create a new replica set with MongoDB 3.6.2?
  • Could you post the output of db.version() from all three nodes?
  • Could you post the output of rs.conf() and rs.status() from all three nodes?
  • Could you post the output of db.serverStatus().metrics.cursor from all three nodes?
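
If it helps, something along these lines can collect those from all three members in one go (a sketch only; the host:port values below are placeholders for your members' addresses):

    // Run from a mongo shell that can reach every member; replace the placeholder hosts.
    ["host1:27017", "host2:27017", "host3:27017"].forEach(function (host) {
        var admin = new Mongo(host).getDB("admin");
        print(host, admin.version());
        printjson(admin.serverStatus().metrics.cursor);
    });
    rs.conf()    // and rs.status(), run while connected to the replica set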

I’d also like to clarify that there is no advantage in running WiredTiger with no journaling. In fact, in a replica set, it will force MongoDB to perform a checkpoint for every write (instead of writing to the journal which was optimized for writes and is much faster than a checkpoint), which will slow down your writes tremendously. In the future, running a replica set with WiredTiger and nojournal will not be a valid configuration (see SERVER-30347).
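
A quick way to confirm how a node was started (a sketch; the exact shape depends on whether options came from the command line or a config file):

    // In the mongo shell: the storage options this mongod was started with.
    // A disabled journal would typically show up as journal: { enabled: false }
    // (from --nojournal, or storage.journal.enabled: false in mongod.conf).
    db.serverCmdLineOpts().parsed.storage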

Best regards
Kevin

Bruce Zu

Aug 10, 2018, 6:51:08 PM
to mongodb-user
On Aug 6 the WiredTigerLAS.wt issue was found in our replica set.
Please see https://jira.mongodb.org/browse/WT-4238
It was closed by Nick without confirming with me, and I cannot reopen it.
The tests I did suggest his answer is not right.

He believes: "MongoDB 3.6 enables read concern "majority" - with the three 3.4 nodes down, your replica set did not have enough members to satisfy the read concern, as the arbiter does not contain data. The majority read concern ensures that the data returned will not subsequently be rolled back, by confirming that it is acknowledged by the majority of data-bearing replica set members.

Because of this, your primary node was required to store an increasing amount of data in the cache, which ultimately overflowed into the lookaside table (represented by WiredTigerLAS.wt). If you've removed the 3.4 nodes, you should have a majority of data-bearing nodes to satisfy the read concern." 

Here are the tests I did. The replica set members:

primary     172-31-82-157  (was a secondary before the issue was found)

secondary   172.31.54.204  (was the primary before the issue was found)

secondary   172-31-66-130  (became unreachable before the issue was found)

secondary   172-31-67-188  (new member added after the issue)

arbiter     172-31-5-208   (was version 3.4.7 before the issue was found; now 3.6.6)

All members are now v3.6.6.

TEST1: stopping the mongod service on all 3 secondaries, or on any 2 secondaries plus the arbiter, makes the primary step down to secondary, after which it no longer accepts any read or write operations because we use the default read preference. So how can it be possible that "your primary node was required to store an increasing amount of data in the cache"?

TEST2: stopping any 2 secondaries for 3+ hours, so that the condition "have a majority of data-bearing nodes" is not satisfied. Nothing happens to the WiredTigerLAS.wt file; it stays at 4 KB.

I am wondering how to re-trigger the situation where "your primary node was required to store an increasing amount of data in the cache". Any help is appreciated.


Another weird thing:

On Monday, July 23, the replica set also had another 3 secondary members running v3.4.7.

That day I removed them from the replica set configuration and terminated them. This can be seen in the mongod.log.

But after that, the primary kept trying to contact them until I restarted the primary's mongod service on Aug 6. This can also be seen in the mongod.log; details are in https://jira.mongodb.org/browse/WT-4238.

This can be fixed by restarting the mongod service.

This is apparently a bug. Before 3.6, removing a replica set member or running rs.reconfig() did not require restarting the mongod service.

Kevin Adistambha

Aug 14, 2018, 1:44:51 AM
to mongodb-user

Hi Bruce,

WiredTiger will write to the WiredTigerLAS.wt file when it encounters a situation where it is required to keep things in its cache for an extended period of time. To be clear, it will not write anything to this file unless it absolutely has to.

The read concern majority feature introduced in MongoDB 3.6 allows you to read data that has been acknowledged by the majority of data-bearing nodes in a replica set (see read concern majority page)

To paraphrase from the read concern majority page:

read concern "majority" guarantees that the data read has been acknowledged by a majority of the replica set members (i.e. the documents read are durable and guaranteed not to roll back).

This feature was developed to enable causally consistent sessions.
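
As an illustration, a majority read can be requested explicitly from the mongo shell (a sketch; the "test.orders" namespace is just a placeholder):

    // Returns only documents whose writes have been acknowledged by a majority
    // of data-bearing members, and which therefore cannot be rolled back.
    db.getSiblingDB("test").orders.find({ status: "shipped" }).readConcern("majority")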

To enable this feature, WiredTiger will keep data that has not been committed by the majority of data-bearing nodes in its cache. Consequently, when you have write/update workload that cannot be written to the majority of data-bearing nodes, it will accumulate in the WiredTiger cache, eventually spilling over to the WiredTigerLAS.wt file.
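
One way to watch this pressure building up is via the WiredTiger cache statistics in serverStatus (a sketch; the statistic names below are from 3.6-era builds and can differ between versions):

    var cache = db.serverStatus().wiredTiger.cache;
    print("configured cache size :", cache["maximum bytes configured"]);
    print("bytes currently cached:", cache["bytes currently in the cache"]);
    print("tracked dirty bytes   :", cache["tracked dirty bytes in the cache"]);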

The WiredTigerLAS.wt file is ephemeral, and will not survive a restart of the node.

The read concern majority situation becomes more complex if you include arbiters in your setup. Arbiters are voting-only nodes, thus cannot participate in acknowledging majority writes. When you do not have the majority of data-bearing nodes available in your replica set, the WiredTigerLAS.wt file will continue to grow if you have ongoing write/update operations on the set.

To put it simply, you should not see the WiredTigerLAS.wt file growing if you remove the arbiters from your replica set.
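
For reference, dropping an arbiter is an ordinary reconfiguration (a sketch; the host below is your arbiter's address with an assumed default port):

    // Run on the primary; the remaining data-bearing members keep the set running.
    rs.remove("172-31-5-208:27017")
    rs.conf().members    // confirm the arbiter is no longer listed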

If you have further questions, please:

  • Remove the arbiters from your replica set and confirm if the issue persists
  • Create a new thread describing your situation, the exact steps you performed to arrive at such a situation, and the output of rs.status() during the issue.

Best regards,
Kevin

Bruce Zu

Aug 14, 2018, 2:13:44 PM
to mongod...@googlegroups.com
Hi Kevin,
Thank you for your detailed feedback.
Your solution works.
Bruce
