Hi,
I have an application that uses uploads_in_blob feature as follows :
db._adapter.uploads_in_blob = True
This creates an additional column of BYTEA type at the DB Level (I am using postgres, if it matters)
Here is a sample table (This is just to give you the idea, this is not the exact definition):
db.define_table('uploaded_docs',
Field('document_name', 'string', default=''),
Field('doc', 'upload', label=T('Document'),
uploadfield=True,
requires=IS_NOT_EMPTY(
error_message=T('Select a document to upload'))),
Field('doc_property1', 'string'),
Field('doc_property2', 'string'),
Field('document_date', 'date'),
Field('status', 'string', default='Open',
requires=IS_IN_SET(['Draft', 'Approved', 'Under Review'])),
Field('remarks', 'string', default='')
)
- User can upload documents in the "doc" field (which is the upload field)
- There is some meta data as described by other fields.
- There is no restriction on the size of the document (Customer requirement, can't negotiate)
These documents are shown using SQLFORM.grid widget (automatic pagination, search, all the cool things)
Here is the problem :
Each time a DB query is run (and results returned to web2py), the size of each row returned also includes the size of the uploaded document.
e.g. If each row has a document of say 5MB, then 20 rows that are returned by default pagination, consumes 100MB
(I am not sure when this memory is released/GC'ed) So after going thru say 5 such queries, memory consumed is 500MB
I have deployed the app on webfaction, with default memory block of 512MB
So at this point, the "app" is killed, resulting into "502-Bad gateway" error to the end user.
Customer may not always "download" the file, customer may be just looking at the records' metadata, so access to the BLOB isn't needed till user clicks the download link (denoted by "file" URL)
When NOT using uploads_in_blob, the uploads folder only contains a filename, and the file actually resides on the disk. IMO the filesystem is accessed only when needed.
Is there a way to handle BLOB field in similar fashion ? (Access only when needed)
Are there any suggestions on how to limit the memory usage ?
(The app is already in production, so if I handle this via code changes, this is definitely preferred over data migration)
Thanks,
-Mandar