Storing JPEGs in PouchDB


Colin

Sep 13, 2012, 4:44:06 PM
to pou...@googlegroups.com
I would like to store a jpeg in PouchDB as an attachment and then render it back to a user as an image in the browser or as a file download. What's the proper way to do that? I see that the Pouch API includes a putAttachment function, but I don't see a getAttachment function. How do I get my attachment back from Pouch after I've saved it? 

Putting that aside for a moment, suppose I just save the base64 encoded contents of my file into some other regular field in my json doc. Any suggestions as to how to render that as an image in the web browser from Pouch? There's no http server so I'm not sure how to build an img or href for it.

Thanks for any help,

Colin

Colin

Sep 14, 2012, 10:05:12 AM
to pou...@googlegroups.com
I figured out how to do this using the approach below, but I am worried about the performance of storing 5 MB+ JPEGs in PouchDB. It seems to have a noticeable effect on the performance of Pouch: once a 5 MB JPEG is in there, all Pouch queries become slow. Is it a bad idea to be using PouchDB to store multiple 5 MB+ JPEGs? Should I instead use the filesystem and do my own puts to the server-side CouchDB from there?

To store and retrieve binary files in PouchDB, do this:
1. Read the file in using the standard input type=file and a FileReader object.

2. Use readAsBinaryString to get the contents of the file and save this as a new document in Pouch. I'm saving it as a regular document, not as an inline or standalone Couch attachment, because I have no need to serve it back out over HTTP.

3. To read the document back out and render it to the user I do this:

$('#attachmentLink').click(function () {
  Pouch(FIELDLINK.localDB, function (err, db) {
    db.get(expenseRow.attachmentID, function (err, doc) {
      // Rebuild the bytes from the binary string stored in the document.
      var byteArray = new Uint8Array(doc.contents.length);
      for (var i = 0; i < doc.contents.length; i++) {
        byteArray[i] = doc.contents.charCodeAt(i) & 0xff;
      }
      // BlobBuilder was the prefixed API of the day; newer code would use
      // the Blob constructor directly.
      var builder = new (window.BlobBuilder || window.WebKitBlobBuilder)();
      builder.append(byteArray.buffer);
      var blob = builder.getBlob(doc.type);

      var aURL = (window.webkitURL || window.URL).createObjectURL(blob);
      $('#attachmentLink').attr('href', aURL);
    });
  });
});
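Steps 1 and 2 can be sketched roughly like this. The FileReader part assumes a browser environment; `binaryStringToBytes` is plain JavaScript and mirrors the conversion done in step 3:

```javascript
// Convert a "binary string" (as produced by FileReader.readAsBinaryString,
// one byte per character) into a Uint8Array.
function binaryStringToBytes(str) {
  var bytes = new Uint8Array(str.length);
  for (var i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// Read a user-selected File and hand its bytes to a callback
// (browser-only; FileReader does not exist outside the DOM).
function readFileAsBytes(file, cb) {
  var reader = new FileReader();
  reader.onload = function () {
    cb(binaryStringToBytes(reader.result));
  };
  reader.readAsBinaryString(file);
}
```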

Colin

Sep 14, 2012, 1:19:20 PM
to pou...@googlegroups.com
UPDATE: I realized I should use Pouch's putAttachment and then read the attachment contents back using db.get with {attachments: true}. This "hides" the attachment contents inside an _attachments object on my documents. The db.query method seems to ignore this content (which is good) and Pouch's query performance is snappy again.

I think my problem was that I was storing my attachments as regular documents and Pouch was having to load all of their (huge) contents any time I did a db.query.
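For reference, a minimal sketch of that flow. The putAttachment signature used here, db.putAttachment('docid/attname', rev, data, contentType, callback), is an assumption based on the 2012-era API; check the current PouchDB docs, as the API has changed since:

```javascript
// Sketch: store base64 contents as a standalone attachment, then read the
// document back with {attachments: true}. The putAttachment signature is an
// assumption based on the era of this thread.
function saveAndReload(db, docId, mimeType, base64Data, cb) {
  db.putAttachment(docId + '/file', null, base64Data, mimeType, function (err) {
    if (err) return cb(err);
    db.get(docId, { attachments: true }, function (err, doc) {
      if (err) return cb(err);
      // Attachment bodies come back base64-encoded under _attachments.
      cb(null, doc._attachments.file.data);
    });
  });
}
```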

James Mayes

Sep 14, 2012, 1:43:08 PM
to pou...@googlegroups.com
I believe all PouchDB queries do a full scan.


--
You received this message because you are subscribed to the Google Groups "PouchDB" group.
To post to this group, send email to pou...@googlegroups.com.
To unsubscribe from this group, send email to pouchdb+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/pouchdb/-/RyGrIN7xOzkJ.

For more options, visit https://groups.google.com/groups/opt_out.
 
 

Colin

Sep 14, 2012, 4:38:25 PM
to pou...@googlegroups.com
Dale, would love to hear your advice / opinion on this...

Contrary to my last post, I'm finding that adding a couple of big attachments (about 5 MB each) to a couple of documents using db.putAttachment does in fact cause all other Pouch queries (calls to db.query) to be slow. The slowness doesn't seem to grow beyond the first couple of attachments, which is good, but it is still more than I'd like.

I see from the Pouch source code that you've given the attachments their own IndexedDB objectStore called attach-store and that this objectStore is only referenced when you need to retrieve or save attachments. It is not referenced when doing a db.query. That's good.

However, IndexedDB doesn't seem to like these big documents inside it. Do you have any advice on how to use attachments in Pouch and get good performance or am I just out of luck?

Dale Harvey

Sep 14, 2012, 5:03:40 PM
to pou...@googlegroups.com
Hey Colin, sorry for the delay.

As you have noticed, we have a separate store for attachments. In theory attachments definitely should not slow us down noticeably; however, very little work has been done on this area.

The best thing to do would be to create a performance test for this; then we can pin down where the slowdown is and make sure there are no regressions.

One thing I have noticed in your code that is worrying is the use of createObjectURL: those objects are held indefinitely by the browser, and if you create a lot of them the browser will inevitably slow down. Are you seeing the same slowness after a refresh? Remember that any time you call createObjectURL you should also release it (usually in onload):

https://developer.mozilla.org/en-US/docs/DOM/window.URL.revokeObjectURL
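A minimal sketch of that pattern, assuming the element passed in is an img (or anything with an onload hook and a src):

```javascript
// Create an object URL for a Blob, attach it to an element, and revoke it
// once the element reports it has loaded, so the browser can free the Blob.
function showBlob(el, blob) {
  var url = URL.createObjectURL(blob);
  el.onload = function () {
    URL.revokeObjectURL(url);
  };
  el.src = url;
  return url;
}
```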


Colin

Sep 14, 2012, 5:47:04 PM
to pou...@googlegroups.com
Good suggestion on revoking object URLs. I'd been meaning to add that; I've done so now, but it didn't change the behavior.

I've added some simple timing log statements into pouch.js to better understand what's going on. Before I add any attachments, my db.query calls all take 200-300 ms, which is fine. After I add one attachment, they are about the same. After a second attachment, they shoot up to 3000 ms. Three attachments get me to 9000 ms, but after attachments 4 and 5 I was back down to 6000 ms queries.

Then I happened to wait around for a while and my query time mysteriously returned to 300 ms. But after saving a couple more attachments it's back up to 3000 ms and isn't going back down again.

Does IndexedDB do any compaction or indexing that could be happening in the background? I can't find any information about such a thing, but these fluctuating times make it seem like something of the sort is going on.

Any other thoughts or suggestions about things I could try? Would you like me to submit a performance test to you showing this behavior? How do I do that? I don't think I can live with query times this long so I have to figure something out.

Colin

Sep 15, 2012, 2:22:02 PM
to pou...@googlegroups.com
I added some further timing log messages into pouch.js and found that after adding two 5 MB attachments to the attach_store, the time to execute a single txn.objectStore(BY_SEQ_STORE).get(metadata.seq) call goes from 1 ms to about 50 ms. Obviously, that's not going to result in acceptable overall performance. I suspect this is IndexedDB's fault, as the db.query function in Pouch completely ignores the attach_store (as it should). I'm running Chrome 21.0.1180.89 on OS X.

Would it make sense to consider putting the attach_store in an entirely separate IndexedDB database rather than just in a separate objectStore? Maybe that would fix the performance problem.
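The measurements above came from ad-hoc log statements; for synchronous work, a small wrapper like this (hypothetical, not part of Pouch) does the same job:

```javascript
// Hypothetical timing helper: run fn, log how long it took, return its result.
function timed(label, fn) {
  var start = Date.now();
  var result = fn();
  console.log(label + ' took ' + (Date.now() - start) + ' ms');
  return result;
}
```

For the asynchronous IndexedDB calls in pouch.js, the start/end timestamps have to bracket the request's success callback instead of a plain return.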


Dale Harvey

Sep 15, 2012, 2:48:11 PM
to pou...@googlegroups.com
If you would like to write a test,

https://github.com/daleharvey/pouchdb/blob/master/tests/perf.html

is an extremely outdated example. Just make a PR including any assets you are testing with, and try to write a test that can run without any user intervention. Once I am running the same code and seeing the same issues you are, it will be much easier to start reasoning about. I am fairly familiar with debugging Firefox stuff under the hood, and the Chrome devs have been really helpful with any issues I have come up against in Chrome's implementation.

We might even want to save files via the FileSystem API; it's something I have seen requested in CouchDB. But I would definitely like to be on the exact same page as you before I start guessing at solutions.



Emiliano Heyns

Mar 3, 2014, 9:24:45 AM
to pou...@googlegroups.com
Has this seen any activity since the exchange above? I'm planning to do something similar, storing blobs in Pouch, including sync to a Couch instance, but I'd be looking at at least 3000 blobs, varying between 800 KB (most of them) and 150 MB (10-100 of them).