Images in documents and memory usage
Date: Thu, 7 Jul 2011 11:34:52 -0700 (PDT)
Subject: Images in documents and memory usage
From: "Maeldron T." <maeld...@gmail.com>
To: mongodb-user <mongodb-user@googlegroups.com>
Hello,
As a newcomer to MongoDB, I would like to migrate my existing apps to it. I
also want to build new HA websites on this nice database.
After reading the docs, I am considering storing images in MongoDB. The main
reason would be the easy replication and high availability.
After reading a dozen threads about this, I am still not sure.
There are thousands or tens of thousands of images. They are
uploaded by the users.
Every image file is below 4MB. Should I use GridFS? I suppose I
shouldn't. What is the advantage of GridFS compared to simple MongoDB
documents (one document per image) if all the files are below 4MB?
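To make the two layouts concrete, here is a rough sketch in plain Python, with no database involved. It only shows the shape of the data: one document holding the whole image versus a metadata document plus numbered chunk documents in the style of GridFS. The chunk size, the helper names and the id scheme are my own assumptions for illustration, not what any driver actually does.

```python
import hashlib

# Assumed chunk size for illustration only; the real GridFS default
# depends on the driver and its version.
CHUNK_SIZE = 256 * 1024


def as_single_document(name, data):
    """One document per image: all bytes live in a single field."""
    return {"filename": name, "length": len(data), "data": data}


def as_gridfs_style(name, data, chunk_size=CHUNK_SIZE):
    """GridFS-style layout: one metadata document plus numbered chunks."""
    # Hypothetical id derived from the name; a real driver would
    # generate an ObjectId instead.
    file_id = hashlib.md5(name.encode("utf-8")).hexdigest()
    file_doc = {
        "_id": file_id,
        "filename": name,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_id, "n": i, "data": data[offset:offset + chunk_size]}
        for i, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return file_doc, chunks


# A fake 600KB "image" made of zero bytes.
image = b"\x00" * (600 * 1024)
single = as_single_document("photo.jpg", image)
file_doc, chunks = as_gridfs_style("photo.jpg", image)
print(len(chunks), "chunks for", file_doc["length"], "bytes")
```

With the single-document layout a read always pulls the whole image; with the chunked layout a reader could fetch only some of the chunks, which seems to matter mostly for files larger than mine.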
I see that MongoDB likes to eat all the available RAM. On my servers I
have 4GB RAM. The total size of the images is above 4GB.
I know that Linux uses all the available memory for the filesystem cache.
All my databases can fit in RAM without the images.
1.: Suppose that I store the images in the filesystem.
That means that every read from the database will be served from RAM. It
will be fast. In this case some of the images will be cached by the OS
and some will not.
2.: Suppose that I put all the images into MongoDB. Now the whole
database will be larger than the amount of RAM, which means that
some database reads *not against the images* will have to go to disk.
What happens with the indexes? Do they have higher priority to stay in
RAM than other data?
Which solution is better? I am not sure, but I'd guess the first
one. However, replicating all the images with a standard filesystem
is a big pain.