Consul KV export of Vault causes OOM.

supreme patadia

Jan 5, 2018, 1:43:51 PM
to Consul
Hello everyone, I've been lurking here for a while, but I finally came across an issue that I can't seem to crack.

I have 3 Consul servers running in HA. They are responsible for replicating my Vault data.

I get an OOM when I try to do a "consul kv export vault/". Below is the Go fatal error that I get every time I try to do a KV export. The Consul KV store has about 900 MB of data that needs to be exported. KV export works for me in my stage env, since the data set there is around 100 MB. Are there any config parameters I can change to allow a KV export without an OOM? Is there a limit to the amount of KV data that Consul can handle? I plan on adding a lot more over the next year.

Thanks again for any help.

Consul Version: 0.9.0
Vault Version: 0.8.3

[root@prod-db-03 ~]# free -mt
             total       used       free     shared    buffers     cached
Mem:          7483       2320       5162          0         80       1100
-/+ buffers/cache:       1138       6344
Swap:            0          0          0
Total:        7483       2320       5162




[root@prod~]# consul kv export vault/

fatal error: runtime: out of memory


runtime stack:

runtime.throw(0x11fbb1b, 0x16)

/goroot/src/runtime/panic.go:596 +0x95

runtime.sysMap(0xc4d4fc0000, 0x46660000, 0xc42006bd00, 0x19dae58)

/goroot/src/runtime/mem_linux.go:216 +0x1d0

runtime.(*mheap).sysAlloc(0x19c1bc0, 0x46660000, 0x1e90)

/goroot/src/runtime/malloc.go:440 +0x374

runtime.(*mheap).grow(0x19c1bc0, 0x2332d, 0x0)

/goroot/src/runtime/mheap.go:774 +0x62

runtime.(*mheap).allocSpanLocked(0x19c1bc0, 0x2332d, 0x5d)

/goroot/src/runtime/mheap.go:678 +0x44f

runtime.(*mheap).alloc_m(0x19c1bc0, 0x2332d, 0x100000000, 0xc42006bf70)

/goroot/src/runtime/mheap.go:562 +0xe2

runtime.(*mheap).alloc.func1()

/goroot/src/runtime/mheap.go:627 +0x4b

runtime.systemstack(0xc42006bf10)

/goroot/src/runtime/asm_amd64.s:343 +0xab

runtime.(*mheap).alloc(0x19c1bc0, 0x2332d, 0x10100000000, 0x0)

/goroot/src/runtime/mheap.go:628 +0xa0

runtime.largeAlloc(0x46659d5c, 0xc42006bf01, 0xc420013170)

/goroot/src/runtime/malloc.go:807 +0x93

runtime.mallocgc.func1()

/goroot/src/runtime/malloc.go:702 +0x3e

runtime.systemstack(0xc42001e600)

/goroot/src/runtime/asm_amd64.s:327 +0x79

runtime.mstart()

/goroot/src/runtime/proc.go:1132


goroutine 1 [running]:

runtime.systemstack_switch()

/goroot/src/runtime/asm_amd64.s:281 fp=0xc4204b3458 sp=0xc4204b3450

runtime.mallocgc(0x46659d5c, 0xf964c0, 0xc420013101, 0xc4204b3520)

/goroot/src/runtime/malloc.go:703 +0x930 fp=0xc4204b34f8 sp=0xc4204b3458

runtime.makeslice(0xf964c0, 0x46659d5c, 0x46659d5c, 0xc4b1c8a000, 0x2332b520, 0x2332b520)

/goroot/src/runtime/slice.go:54 +0x7b fp=0xc4204b3548 sp=0xc4204b34f8

bytes.makeSlice(0x46659d5c, 0x0, 0x0, 0x0)

/goroot/src/bytes/buffer.go:201 +0x77 fp=0xc4204b3588 sp=0xc4204b3548

bytes.(*Buffer).grow(0xc4201cc840, 0x331c, 0x1)

/goroot/src/bytes/buffer.go:109 +0x177 fp=0xc4204b35d8 sp=0xc4204b3588

bytes.(*Buffer).WriteString(0xc4201cc840, 0xc47e08b500, 0x331c, 0x0, 0xc4204b3688, 0xc4204b3698)

/goroot/src/bytes/buffer.go:146 +0x41 fp=0xc4204b3608 sp=0xc4204b35d8

encoding/json.(*encodeState).string(0xc4201cc840, 0xc47e08b500, 0x331c, 0xc47e08b501, 0x331c)

/goroot/src/encoding/json/encode.go:952 +0x50f fp=0xc4204b3680 sp=0xc4204b3608

encoding/json.stringEncoder(0xc4201cc840, 0xf96200, 0xc47ddb7788, 0x198, 0x100)

/goroot/src/encoding/json/encode.go:608 +0x214 fp=0xc4204b3740 sp=0xc4204b3680

encoding/json.(*structEncoder).encode(0xc451777e60, 0xc4201cc840, 0x10ca540, 0xc47ddb7770, 0x199, 0x756ea130630100)

/goroot/src/encoding/json/encode.go:645 +0x253 fp=0xc4204b38a0 sp=0xc4204b3740

encoding/json.(*structEncoder).(encoding/json.encode)-fm(0xc4201cc840, 0x10ca540, 0xc47ddb7770, 0x199, 0xc47ddb0100)

/goroot/src/encoding/json/encode.go:659 +0x64 fp=0xc4204b38e0 sp=0xc4204b38a0

encoding/json.(*ptrEncoder).encode(0xc42012a188, 0xc4201cc840, 0xf51720, 0xc4401895d8, 0x196, 0x23320100)

/goroot/src/encoding/json/encode.go:786 +0xe3 fp=0xc4204b3920 sp=0xc4204b38e0

encoding/json.(*ptrEncoder).(encoding/json.encode)-fm(0xc4201cc840, 0xf51720, 0xc4401895d8, 0x196, 0xf50100)

/goroot/src/encoding/json/encode.go:791 +0x64 fp=0xc4204b3960 sp=0xc4204b3920

encoding/json.(*arrayEncoder).encode(0xc42012a190, 0xc4201cc840, 0xf71960, 0xc42000a200, 0x97, 0xc420000100)

/goroot/src/encoding/json/encode.go:767 +0xf5 fp=0xc4204b39b8 sp=0xc4204b3960

encoding/json.(*arrayEncoder).(encoding/json.encode)-fm(0xc4201cc840, 0xf71960, 0xc42000a200, 0x97, 0xf70100)

/goroot/src/encoding/json/encode.go:774 +0x64 fp=0xc4204b39f8 sp=0xc4204b39b8

encoding/json.(*sliceEncoder).encode(0xc42012a198, 0xc4201cc840, 0xf71960, 0xc42000a200, 0x97, 0xf70100)

/goroot/src/encoding/json/encode.go:741 +0xc1 fp=0xc4204b3a38 sp=0xc4204b39f8

encoding/json.(*sliceEncoder).(encoding/json.encode)-fm(0xc4201cc840, 0xf71960, 0xc42000a200, 0x97, 0xc420000100)

/goroot/src/encoding/json/encode.go:753 +0x64 fp=0xc4204b3a78 sp=0xc4204b3a38

encoding/json.(*encodeState).reflectValue(0xc4201cc840, 0xf71960, 0xc42000a200, 0x97, 0x100)

/goroot/src/encoding/json/encode.go:323 +0x82 fp=0xc4204b3ab0 sp=0xc4204b3a78

encoding/json.(*encodeState).marshal(0xc4201cc840, 0xf71960, 0xc42000a200, 0xc420020100, 0x0, 0x0)

/goroot/src/encoding/json/encode.go:296 +0xb8 fp=0xc4204b3ae8 sp=0xc4204b3ab0

encoding/json.Marshal(0xf71960, 0xc42000a200, 0x410a9c, 0xc42000a200, 0xc4204b3cc0, 0x18, 0xc451777c01)

/goroot/src/encoding/json/encode.go:161 +0x6e fp=0xc4204b3b30 sp=0xc4204b3ae8

encoding/json.MarshalIndent(0xf71960, 0xc42000a200, 0x0, 0x0, 0x11e57d3, 0x1, 0x32a16, 0xc4201e5380, 0x0, 0x0, ...)

/goroot/src/encoding/json/encode.go:170 +0x3f fp=0xc4204b3bb0 sp=0xc4204b3b30

github.com/hashicorp/consul/command.(*KVExportCommand).Run(0xc42026db20, 0xc42000e130, 0x1, 0x1, 0xc420276560)

/gopath/src/github.com/hashicorp/consul/command/kv_export.go:85 +0x3fd fp=0xc4204b3d50 sp=0xc4204b3bb0

github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420236240, 0xc420236240, 0x40, 0xc42021fd00)

/gopath/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:160 +0x1cc fp=0xc4204b3e18 sp=0xc4204b3d50

main.realMain(0xc4200001a0)

/gopath/src/github.com/hashicorp/consul/main.go:50 +0x3fd fp=0xc4204b3f70 sp=0xc4204b3e18

main.main()

/gopath/src/github.com/hashicorp/consul/main.go:19 +0x22 fp=0xc4204b3f88 sp=0xc4204b3f70

runtime.main()

/goroot/src/runtime/proc.go:185 +0x20a fp=0xc4204b3fe0 sp=0xc4204b3f88

runtime.goexit()

/goroot/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4204b3fe8 sp=0xc4204b3fe0


goroutine 5 [syscall]:

os/signal.signal_recv(0x0)

/goroot/src/runtime/sigqueue.go:116 +0x104

os/signal.loop()

/goroot/src/os/signal/signal_unix.go:22 +0x22

created by os/signal.init.1

/goroot/src/os/signal/signal_unix.go:28 +0x41


goroutine 34 [semacquire]:

sync.runtime_notifyListWait(0xc420236340, 0xc400000000)

/goroot/src/runtime/sema.go:298 +0x10b

sync.(*Cond).Wait(0xc420236330)

/goroot/src/sync/cond.go:57 +0x89

io.(*pipe).read(0xc420236300, 0xc420294000, 0x1000, 0x1000, 0x0, 0x0, 0x0)

/goroot/src/io/pipe.go:47 +0x104

io.(*PipeReader).Read(0xc42012a238, 0xc420294000, 0x1000, 0x1000, 0x1000, 0x1000, 0x0)

/goroot/src/io/pipe.go:130 +0x4c

bufio.(*Scanner).Scan(0xc420135480, 0x0)

/goroot/src/bufio/scan.go:207 +0x294

github.com/hashicorp/consul/command.(*BaseCommand).NewFlagSet.func2(0xc420135480, 0xc42026db20)

/gopath/src/github.com/hashicorp/consul/command/base.go:162 +0x2f

created by github.com/hashicorp/consul/command.(*BaseCommand).NewFlagSet

/gopath/src/github.com/hashicorp/consul/command/base.go:165 +0x232

James Phillips

Jan 9, 2018, 10:36:12 PM
to consu...@googlegroups.com
Hi,

The culprit here is the HTTP response buffering and the JSON
serialization, which is tracked in
https://github.com/hashicorp/consul/issues/1571.

Memory use can spike to ~3X the size of the exported KV data, so it's
good to size the memory on your Consul servers with plenty of margin
to avoid this kind of scenario (Consul holds the full KV contents in
RAM). In addition to the above issue, we are working on some better
ways to back up parts of the KV store for Vault use cases in a way
that will be safer and more efficient.

In the meantime, you can use the Consul snapshot capability
(https://www.consul.io/docs/commands/snapshot.html, there's also an
API), which is designed to stream the contents of Consul's state store
without buffering it. This backs up the whole KV store, along with
things like ACLs and prepared queries, so it's a full snapshot of the
cluster state for disaster recovery.

Another option is to run your KV export against a Consul client's
HTTP API, and not against the Consul servers directly. This will
prevent your servers from OOM-ing by putting the JSON serialization
and HTTP response buffering on the client machine. Since that client
machine won't have the full KV contents in memory, it's easier to
provision it with headroom to do the export.

-- James

supreme patadia

Jan 10, 2018, 4:22:27 PM
to Consul
Thank you so much, I will try these new approaches.