Re: How to improve index update performance while document insertion?

Nat

Oct 14, 2012, 4:55:16 PM
to mongod...@googlegroups.com
Can you provide mongostat and iostat output from when the insert speed starts to slow down, as well as a sample document and the indexes you have?

On Saturday, October 13, 2012 8:51:47 PM UTC-7, golangnewbie wrote:
I have 100 million documents to insert into a single collection. Each document has four fields, and the collection has two unique indexes.
In the beginning, the insert speed reported by the "mongostat" tool was about 10,000 per second. Once the number of inserted documents reached
20 million, the insert speed dropped dramatically and fluctuated between 1,000 and 5,000 per second.
Does anyone have a good solution for this problem? Splitting the collection into multiple collections would cause many other problems for my requirements, and the two unique indexes are necessary. Thanks.

golangnewbie

Oct 15, 2012, 8:24:12 AM
to mongod...@googlegroups.com
mongostat monitor:
  3522      0      0      0       0    3522       0  5.95g  14.2g  3.89g      5     53.3          0       0|0     0|0   621k   332k    26   19:53:29 
  7031      0      0      0       0    7033       0  5.95g  14.2g  3.92g     10     51.3          0       0|0     0|0     1m   662k    26   19:53:30 
  5410      0      0      0       0    5411       0  5.95g  14.2g  3.92g      8     62.4          0       0|0     0|0   953k   509k    26   19:53:31 
   896      0      0      0       0     896       0  5.95g  14.2g  3.93g      2      3.6          0      0|10    0|10   157k    85k    26   19:53:32 
  4641      0      0      0       0    4642       0  5.95g  14.2g  3.93g      8      149          0       0|0     0|0   818k   437k    26   19:53:33 
  2914      0      0      0       0    2915       0  5.95g  14.2g  3.92g      4     51.7          0       0|9    0|10   513k   275k    26   19:53:34 
  3931      0      0      0       0    3935       0  5.95g  14.2g  3.93g      6     63.4          0       0|9    0|10   693k   371k    26   19:53:35 
  5139      0      0      0       0    5140       0  5.95g  14.2g  3.94g      8     64.5          0       0|9    0|10   906k   484k    26   19:53:36 
  3636      0      0      0       0    3638       0  5.95g  14.2g   3.9g      6     77.8          0       0|0     0|0   641k   343k    26   19:53:37 
insert  query update delete getmore command flushes mapped  vsize    res faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time 
  7652      0      0      0       0    7653       0  5.95g  14.2g  3.92g     12     57.6          0       0|0     0|0     1m   720k    26   19:53:38 
  5621      0      0      0       0    5622       0  5.95g  14.2g  3.94g      9     29.1          0       0|0     0|0   991k   529k    26   19:53:39 
  7518      0      0      0       0    7519       0  5.95g  14.2g  3.92g     13     36.8          0       0|0     0|0     1m   708k    26   19:53:40 
   182      0      0      0       0     183       0  5.95g  14.2g  3.91g      0      1.4          0       0|0     0|0    32k    18k    26   19:53:41
iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.04    0.00    0.22   10.82    0.00   88.92

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  3938.00    0.00  472.00     0.00    17.46    75.75   287.55  798.53   2.12 100.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-4              0.00     0.00    0.00 4421.00     0.00    17.27     8.00  3123.16  719.77   0.23 100.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    2.00     0.00     0.01     8.00     1.17    0.00 315.00  63.00


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.90    0.00    1.39    7.02    0.00   89.68

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  7789.00    0.00  849.00     0.00    37.60    90.69   285.24  379.61   1.18 100.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-4              0.00     0.00    0.00 8629.00     0.00    34.27     8.13  2493.07  403.58   0.12  99.90
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    1.00     0.00     0.00     8.00     0.89 2065.00 569.00  56.90

load average: 2.85, 2.77, 2.16

Sample document, with two unique indexes on _id and phone:

{
      _id : "mongodbnewbiethanks.",
      phone: "123-123-456-123",
      sex: "m"
}

The external storage is a hard disk.

Jason Zucchetto

Nov 22, 2012, 12:46:46 PM
to mongod...@googlegroups.com
Hello, this degradation in performance is most likely caused by an index on your collection that could not fit in memory and had to be paged in from disk.  Pay close attention to the "idx miss %" in your mongostat output.  This explains why your performance was very fast in the beginning (the index was still small and could fit in memory) and degraded over time as the index grew (and could no longer fit in memory).
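
To make the "idx miss %" and "faults" columns easier to watch over a long run, captured mongostat output can be summarized with a small script. The following is only a rough sketch in Python: the column order is copied from the mongostat header posted earlier in this thread, the four sample rows are taken from that same output, and the names `COLUMNS`, `parse_row`, and `summarize` are hypothetical helpers, not part of any MongoDB tool. Other mongostat versions print different columns, so the list would need adjusting.

```python
# Column order assumed from the mongostat header shown in this thread.
# "locked %" and "idx miss %" each hold one value per row.
COLUMNS = [
    "insert", "query", "update", "delete", "getmore", "command", "flushes",
    "mapped", "vsize", "res", "faults", "locked_pct", "idx_miss_pct",
    "qr_qw", "ar_aw", "net_in", "net_out", "conn", "time",
]


def parse_row(line):
    """Split one mongostat data row into a dict keyed by column name."""
    tokens = line.split()
    if len(tokens) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(tokens)}")
    row = dict(zip(COLUMNS, tokens))
    for key in ("insert", "faults"):
        row[key] = int(row[key])
    for key in ("locked_pct", "idx_miss_pct"):
        row[key] = float(row[key])
    return row


def summarize(lines):
    """Return insert-rate, page-fault, and index-miss statistics."""
    rows = [
        parse_row(line)
        for line in lines
        # Skip blank lines and the header that mongostat repeats periodically.
        if line.strip() and not line.lstrip().startswith("insert")
    ]
    inserts = [r["insert"] for r in rows]
    return {
        "insert_min": min(inserts),
        "insert_max": max(inserts),
        "insert_mean": sum(inserts) / len(inserts),
        "faults_max": max(r["faults"] for r in rows),
        "idx_miss_max": max(r["idx_miss_pct"] for r in rows),
    }


# Four rows copied from the mongostat output earlier in this thread.
SAMPLE = """
  3522      0      0      0       0    3522       0  5.95g  14.2g  3.89g      5     53.3          0       0|0     0|0   621k   332k    26   19:53:29
  7031      0      0      0       0    7033       0  5.95g  14.2g  3.92g     10     51.3          0       0|0     0|0     1m   662k    26   19:53:30
   896      0      0      0       0     896       0  5.95g  14.2g  3.93g      2      3.6          0      0|10    0|10   157k    85k    26   19:53:32
   182      0      0      0       0     183       0  5.95g  14.2g  3.91g      0      1.4          0       0|0     0|0    32k    18k    26   19:53:41
"""

summary = summarize(SAMPLE.splitlines())
print(summary)
```

The script only surfaces the numbers; deciding whether the index has outgrown memory still means comparing them against the "res" column and the RAM actually available on the machine.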

There are a few possibilities to improve performance in this situation: use a machine with more RAM, set up a sharded cluster to distribute the RAM requirements across more machines, or, if this is a one-time insert of bulk data, add the index after the bulk insert has completed.
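
For the last option, a minimal sketch of "load first, index afterwards" in Python might look like the following. It assumes the `collection` argument behaves like a PyMongo `Collection` (its `insert_many` and `create_index` methods); `batches`, `bulk_load_then_index`, and the `RecordingCollection` stand-in are hypothetical names used only for illustration, and the stand-in exists so the sketch runs without a server. Note that the index on _id always exists, so only the secondary unique index on phone can be deferred, and building a unique index afterwards will fail if the loaded data contains duplicate phone values.

```python
from itertools import islice


def batches(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_load_then_index(collection, docs, batch_size=1000):
    """Insert every document first, then build the unique index on phone.

    `collection` is assumed to expose PyMongo's Collection interface.
    """
    inserted = 0
    for chunk in batches(docs, batch_size):
        # ordered=False lets the server continue past an individual
        # failed document instead of stopping at the first error.
        collection.insert_many(chunk, ordered=False)
        inserted += len(chunk)
    # Deferred index build; raises if duplicate phone values were loaded.
    collection.create_index("phone", unique=True)
    return inserted


class RecordingCollection:
    """Stand-in for a real collection (illustration only): records calls."""

    def __init__(self):
        self.calls = []

    def insert_many(self, documents, ordered=True):
        self.calls.append(f"insert_many:{len(documents)}")

    def create_index(self, keys, **kwargs):
        self.calls.append(f"create_index:{keys}")


fake = RecordingCollection()
docs = ({"_id": f"user{i}", "phone": str(i), "sex": "m"} for i in range(2500))
count = bulk_load_then_index(fake, docs, batch_size=1000)
print(count, fake.calls)
```

With a real deployment, the stand-in would be replaced by a collection obtained from a PyMongo client, and the batch size tuned to the document size.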

I hope this helps.

--Jason 