Digging through the tickets, it appears we have a few near-duplicates requesting some form of binary storage inside the database. I'm -0 (RDBMSes are not for binary data!), but I can see the use for it in some circumstances (#2417 lists a few), and this is likely to be a fairly commonly requested feature.
We have an ancient ticket at #250 from Jacob detailing the need for this, and #652 wants an upload-to-database facility for Image- and FileFields. Finally, there's a patch in #2417 which implements a small binary field which JKM says is "very very good". Marc Fargas has added docs, etc.
So - if we do want a BinaryField we could use #2417 and make it suitable for larger binary stores (e.g. the VARBINARY used for MySQL has a max length of 255 bytes - perfect for the small bin. chunks wanted in #2417, but not for larger data), and then hook it up to Image/FileFields for #652.
An alternate solution is to check in #2417 for small binary chunks, and then hold 652 back until we decide if we want a LargeBinaryField for large binary chunks suitable for file uploads.
Whether you think RDBMSes are for binary data is mostly by the by.
IMO the patch for #2417 is sub-optimal as it:
1) Subclasses the Char field.
2) Does not provide an intelligent manipulator.
The solution to this problem is to provide a form upload field with the addition of a checkbox to signify deletion of the currently saved binary data. There is no sensible way to SHOW the current binary data - we can leave that up to the application designer.
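To make the upload-plus-checkbox idea concrete, here is a rough sketch of the decision such a manipulator would have to make on save. The function name and arguments are hypothetical, and the rule that a fresh upload wins over a ticked delete checkbox is my assumption, not something the tickets specify.

```python
def resolve_binary_value(current, uploaded, delete_checked):
    """Return the bytes to store after a form submission (hypothetical helper).

    current        -- bytes already stored, or None
    uploaded       -- bytes from the upload field, or None/empty if nothing was sent
    delete_checked -- True if the "delete current data" checkbox was ticked
    """
    if uploaded:
        # Assumption: a new upload replaces the old data, even if "delete" is ticked.
        return uploaded
    if delete_checked:
        # No new upload and the user asked for deletion: clear the stored data.
        return None
    # Nothing uploaded, nothing deleted: keep what is already stored.
    return current
```

Note that the form never needs to display the stored bytes; it only needs to know whether something is stored in order to offer the checkbox.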
I think a BinaryField could even reduce the amount of data saved in the database: some binary data fits well in the database, but currently has to be stored as ASCII (either converted or in a different format).
For example, the full-history branch uses pickle (cPickle) to serialize the data of an object. pickle knows multiple methods to store the data. By default it creates an ASCII dump, but you can change this behavior. From the docs (http://docs.python.org/lib/node316.html): "By default, the pickle data format uses a printable ASCII representation. This is slightly _more voluminous_ than a binary representation." Besides, the other methods provide some optimization: "Protocol version 2 was introduced in Python 2.3. It provides much _more efficient_ pickling of new-style classes."
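A quick way to see the difference is to pickle the same object with both protocols and compare the lengths. This is only a sketch: the `snapshot` dict is made-up data rather than anything from the full-history branch, and the protocol is passed explicitly because the default differs between Python versions. How much you save depends on the data; number-heavy objects shrink a lot, short strings may not shrink at all.

```python
import pickle

# Made-up stand-in for the serialized state of a model instance.
snapshot = {
    "title": "Binary fields",
    "version": 3,
    "history": list(range(500)),
}

ascii_dump = pickle.dumps(snapshot, protocol=0)   # printable ASCII representation
binary_dump = pickle.dumps(snapshot, protocol=2)  # binary protocol added in Python 2.3

print("protocol 0: %d bytes" % len(ascii_dump))
print("protocol 2: %d bytes" % len(binary_dump))

# Both dumps restore to the same object; only the on-disk form differs.
restored = pickle.loads(binary_dump)
```

The protocol 2 output contains arbitrary bytes, which is exactly why it needs a binary column rather than a text one.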
On 26 mar, 07:14, "Noah Slater" <nsla...@gmail.com> wrote:
> Saving binary data as printable ASCII seems extremely hack^H^H^H suboptimal to me.
You can have a TextField and save the data as base64, or save it as binary. In postgres, binary data is not saved inside the database, only a link is saved, and I'm not sure whether this kind of data can be replicated through slony-1 or similar.
> To make a case for binary data - how about when you want to store a small image for a UserProfile.
> Saving to the local file-system doesn't work when you are clustering your application servers.
On 26 mar, 10:04, "Noah Slater" <nsla...@gmail.com> wrote:
> > it depends on the FS you're using.
> To get round this problem I am using NFS. Do you have any other suggestions?
Hmm, you could run some tests using GFS.
> Either way, I still think it would be nice to be able to store binary blobs directly via models.
I agree with you. However, Django must be database independent, so are sqlite, MySQL and the others able to handle binary data? One solution is to write a Field that converts everything _to_ base64 on write and converts everything _from_ base64 on read. That way it doesn't matter which DB you are using.
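A minimal sketch of that conversion, kept independent of Django on purpose: the class and method names (`Base64Codec`, `to_db`, `from_db`) are hypothetical, and a real Field would call something like this from whatever hooks it uses for saving and loading.

```python
import base64


class Base64Codec(object):
    """Hypothetical helper: turns raw bytes into text any backend can store."""

    def to_db(self, raw):
        # On write: bytes -> ASCII-only text, safe for a plain text column.
        if raw is None:
            return None
        return base64.b64encode(raw).decode("ascii")

    def from_db(self, stored):
        # On read: text from the column -> the original bytes.
        if stored is None:
            return None
        return base64.b64decode(stored.encode("ascii"))


codec = Base64Codec()
payload = bytes(bytearray(range(256)))      # every possible byte value
stored = codec.to_db(payload)
assert codec.from_db(stored) == payload     # round trip is lossless
```

The price is size: base64 emits four characters for every three bytes, so the stored text is roughly a third larger than the raw data.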
> So - if we do want a BinaryField we could use #2417 and make it suitable for larger binary stores (e.g. the VARBINARY used for MySQL has a max length of 255 bytes - perfect for the small bin. chunks wanted in #2417, but not for larger data), and then hook it up to Image/FileFields for #652.
> An alternate solution is to check in #2417 for small binary chunks, and then hold 652 back until we decide if we want a LargeBinaryField for large binary chunks suitable for file uploads.
+1 on having a BinaryField. I'd actually like to see BinaryField be the "larger" binary field, and have a SmallBinaryField alongside for databases with those types.
-1 on allowing File/ImageField to be stored in the database. That's bad design 99% of the time, and will needlessly complicate file upload code.
On Mon, 2007-03-26 at 10:03 -0500, Jacob Kaplan-Moss wrote:
> -1 on allowing File/ImageField to be stored in the database. That's bad design 99% of the time, and will needlessly complicate file upload code.
If people want to do it themselves, it's pretty easy to create a DBFile model with name and data members, so leaving it out of core doesn't preclude people from doing it if they're set on the idea. (I just want a BinaryField; I'll take it however I can get it.)
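For what a "DBFile with name and data members" boils down to at the storage level, here is a sketch using the stdlib sqlite3 module instead of the Django ORM (so it runs on its own). The table name, column names and helper functions are all made up for illustration; with a BinaryField the `data` column would simply be a model field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per stored file, keyed by name.
conn.execute("CREATE TABLE dbfile (name TEXT PRIMARY KEY, data BLOB NOT NULL)")


def save_file(conn, name, data):
    """Store (or overwrite) the bytes for `name`."""
    conn.execute(
        "INSERT OR REPLACE INTO dbfile (name, data) VALUES (?, ?)",
        (name, sqlite3.Binary(data)),
    )
    conn.commit()


def load_file(conn, name):
    """Return the stored bytes for `name`, or None if there is no such row."""
    row = conn.execute(
        "SELECT data FROM dbfile WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])


save_file(conn, "avatar.png", b"\x89PNG\r\n\x1a\n\x00\xff")
```

It also shows that sqlite, at least, stores raw bytes without any base64 detour.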
Hi, if you provide a BinaryField it's just a matter of time before "hacks" start to appear on blogs, the wiki or even django-users to get ImageField and FileField into the database (there's a hack for this already). Maybe it's bad 99% of the time, but if those fields are provided inside Django it will be much better than having lots of hackish workarounds.
And anyway, there's still the 1% of cases in which it's good design, normally in big applications.
An argument for supporting Image/FileField in the DB: consider a case of multiple frontends with a big, big database. Having File and Image fields on the filesystem forces you to keep the filesystem in sync among frontends. Now imagine you upload a file which is, e.g., the image for an article. The article is inserted into the database and the file onto the filesystem. All frontends will **immediately** show the article, but only one will have the image, unless you start playing around with NFS or other networked filesystems.
It can also be a bit messy to do point-in-time recoveries. With everything in the database you can do a nice PITR without any trouble; if there are things on the filesystem you must make sure both get recovered to the same point in time, and it's rare to see filesystems backed up **permanently**, while point-in-time recoveries in databases (at least postgresql) are heavily documented and a good resource for some kinds of applications.
Third case: imagine having one single directory holding a project, but you run multiple instances of it over different databases (yes, doing tricky things to settings). Having things on the filesystem makes this a bit harder.
I'm +1 on providing database-backed File and Image fields while heavily discouraging their use in the documentation, with clear examples of the 99% and 1% sides of the thing so users are aware of which storage method they should choose.
Also +1 on the BinaryField; then at least if one **really** needs to store things in the DB it can be done :)
Cheers, Marc
On 3/27/07, Marc Fargas Esteve <teleni...@gmail.com> wrote:
> Hi, if you provide a BinaryField it's just a matter of time before "hacks" start to appear on blogs, the wiki or even django-users to get ImageField and FileField into the database (there's a hack for this already). Maybe it's bad 99% of the time, but if those fields are provided inside Django it will be much better than having lots of hackish workarounds.
> And anyway, there's still the 1% of cases in which it's good design, normally in big applications.
For big applications it's especially important not to have their DB serving static files. One of the first pieces of optimization advice that comes with Django is: use a separate web server for serving static files and use a separate box for your DB.
this would be doing the exact opposite: putting the load of serving static files onto the already busy DB server
> An argument for supporting Image/FileField in the DB: consider a case of multiple frontends with a big, big database. Having File and Image fields on the filesystem forces you to keep the filesystem in sync among frontends. Now imagine you upload a file which is, e.g., the image for an article. The article is inserted into the database and the file onto the filesystem. All frontends will **immediately** show the article, but only one will have the image, unless you start playing around with NFS or other networked filesystems.
a network filesystem or NAS is the 'right solution' here. even if NFS is not the best solution performance-wise (NFS is slow), it's still WAY faster than the DB and can be on a separate box. again, this is important especially for large projects that need their DB server to be unburdened by static junk
> It can also be a bit messy to do point-in-time recoveries. With everything in the database you can do a nice PITR without any trouble; if there are things on the filesystem you must make sure both get recovered to the same point in time, and it's rare to see filesystems backed up **permanently**, while point-in-time recoveries in databases (at least postgresql) are heavily documented and a good resource for some kinds of applications.
OK, good point here, but it's a very minor issue: basically only media-rich servers would use this, and they would completely kill their DB server with the traffic (if it's so high that ''rsync'' cannot be used every 2 minutes)
> Third case: imagine having one single directory holding a project, but you run multiple instances of it over different databases (yes, doing tricky things to settings). Having things on the filesystem makes this a bit harder.
just put the DIR in the settings as well; I see no point here
> I'm +1 on providing database-backed File and Image fields while heavily discouraging their use in the documentation, with clear examples of the 99% and 1% sides of the thing so users are aware of which storage method they should choose.
> Also +1 on the BinaryField; then at least if one **really** needs to store things in the DB it can be done :)
even though you raised some good arguments, I still believe BLOBs are mostly evil, and will be misused.
that said, I am -0 on this. it's inviting people to shoot themselves in the foot.
I would be +1 on a BIG RED SIGN saying not to use it unless you REALLY KNOW what you are doing... ;)