Django as huge image host


Shamail Tayyab

Dec 17, 2010, 4:28:42 AM
to Django users
Hi,

I have this situation..

- We need to serve static files (images), lots of them, probably 100s
of them per second.

Our website is in Django and we need to support something like this:

1. a URL like /xyz.jpg should open an image.
2. the image should be accessible via this path only, because we record
some data whenever an image is served, viz. number of hits, bandwidth
consumed, hits per second etc.

Options:
1. Create something like cdn.sitename.com, and on a hit to /xyz.png
do something like

    def servefile(request):
        # save hit data
        # "identifier" is some id for this image
        return HttpResponseRedirect(
            "http://cdn.sitename.com/images/identifier.jpg")

(The problem here is that users may start using the CDN version of the
URL directly, and we'll never know what's going on.)

2. Serve the image directly, something like

    import os
    from wsgiref.util import FileWrapper
    from django.conf import settings
    from django.http import HttpResponse

    def servefile(request):
        # save hit data
        # opath is the path on disk; it is opened once below
        opath = os.path.join(settings.UPLOAD_DIR, "identifier.jpg")

        wrapper = FileWrapper(open(opath, "rb"))
        response = HttpResponse(wrapper, content_type='image/jpeg')
        # uiImage is the record for this image (defined elsewhere)
        response['Content-Disposition'] = (
            'inline; filename=' + uiImage.filerealname)
        response['Content-Length'] = os.path.getsize(opath)
        return response

(The problems here are:
1. caching (which can be overcome with ETag and similar methods)
2. it is super slow (on a normal Amazon EC2 instance this cannot
scale past 3-4 images served simultaneously :( ))


Please give me some good ways to do this.

Thanks and regards

Erik Cederstrand

Dec 17, 2010, 4:53:57 AM
to django...@googlegroups.com

On 17/12/2010 at 10:28, Shamail Tayyab wrote:

> I have this situation..
>
> - We need to serve static files (images), lots of them, probably 100s
> of them per second.
>
> Our website is in Django and we need to support something like this:
>
> 1. a URL like /xyz.jpg should open an image.
> 2. image should be accessible via this path only because we associate
> some data while an image is served, viz number of hits, bandwidth
> consumed, hits per second etc.

I'm not sure this is even a task for Django. If your primary concern is performance, try doing this directly in Apache or another web server via rewrite rules and use the access logs to do the accounting. Apache is good at access statistics.
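
As a rough illustration of the access-log accounting suggested here, the sketch below tallies hits and bytes per image path from lines in Apache's Common Log Format. The regular expression, the `summarize` name and the sample lines are assumptions made for illustration; a server configured with a different LogFormat would need a different pattern.

```python
import re
from collections import defaultdict

# Matches the Common Log Format:
#   host ident authuser [date] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Return {path: {"hits": n, "bytes": n}} for responses with status 200."""
    stats = defaultdict(lambda: {"hits": 0, "bytes": 0})
    for line in lines:
        match = LOG_LINE.match(line)
        # Only full 200 responses are counted; 304 and 206 are ignored here.
        if not match or match.group("status") != "200":
            continue
        entry = stats[match.group("path")]
        entry["hits"] += 1
        size = match.group("size")
        entry["bytes"] += 0 if size == "-" else int(size)
    return dict(stats)


# Made-up sample lines, not taken from any real log.
SAMPLE = [
    '203.0.113.5 - - [17/Dec/2010:09:28:42 +0000] "GET /xyz.jpg HTTP/1.1" 200 51234',
    '203.0.113.9 - - [17/Dec/2010:09:28:43 +0000] "GET /xyz.jpg HTTP/1.1" 200 51234',
    '203.0.113.9 - - [17/Dec/2010:09:28:44 +0000] "GET /abc.png HTTP/1.1" 404 209',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Hits per second could be derived the same way by also capturing the timestamp field and grouping on it.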

If your primary concern is statistics, there is no way you can get exact figures. There may be any number of caching mechanisms between you and the client that you don't control so GET requests may never reach your server. Anyway, at this level there's no such thing as a "cdn version" because you are supposedly in control of the web server that does the mapping from a GET request to a file on the filesystem.

Erik

Michel Thadeu Sabchuk

Dec 17, 2010, 5:43:39 AM
to Django users
Hi,

> I'm not sure this is even a task for Django. If your primary concern is performance, try doing this directly in Apache or another web server via rewrite rules and use the access logs to do the accounting. Apache is good at access statistics.

Does your CDN offer statistics? I use Rackspace Cloud Files and I
know there are log files I can rely on. I haven't used them yet, but
maybe it's an option...

--
Michel Sabchuk
http://turbosys.com.br/

Tom Evans

Dec 17, 2010, 5:53:50 AM
to django...@googlegroups.com
On Fri, Dec 17, 2010 at 9:28 AM, Shamail Tayyab <pleo...@gmail.com> wrote:
> Hi,
>
>    I have this situation..
>
> - We need to serve static files (images), lots of them, probably 100s
> of them per second.
>

Google for X-SendFile
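
For context, X-Sendfile lets the application record the hit and then hand the actual file transfer back to the web server, by returning an empty response that carries an `X-Sendfile` header. A minimal sketch, assuming Apache with mod_xsendfile enabled (nginx offers the similar `X-Accel-Redirect`); `UPLOAD_DIR` and `sendfile_headers` are hypothetical names, not part of Django:

```python
import mimetypes
import os

# Hypothetical location; stands in for settings.UPLOAD_DIR from the question.
UPLOAD_DIR = "/srv/uploads"


def sendfile_headers(identifier, upload_dir=UPLOAD_DIR):
    """Build the headers of an X-Sendfile response for one stored image."""
    # Keep only the final path component so a name such as
    # "../../etc/passwd" cannot point outside upload_dir.
    name = os.path.basename(identifier)
    content_type, _encoding = mimetypes.guess_type(name)
    return {
        "X-Sendfile": os.path.join(upload_dir, name),
        "Content-Type": content_type or "application/octet-stream",
    }


# In a Django view (not executed here) the headers would be copied onto an
# empty response after the hit data has been saved:
#
#     response = HttpResponse()
#     for key, value in sendfile_headers("xyz.jpg").items():
#         response[key] = value
#     return response

if __name__ == "__main__":
    print(sendfile_headers("xyz.jpg"))
```

The view stays in the request path, so the statistics are still recorded, but Python no longer reads or streams the image bytes itself.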


Cheers

Tom

Shamail Tayyab

Dec 17, 2010, 6:07:19 AM
to Django users

> I'm not sure this is even a task for Django. If your primary concern is performance, try doing this directly in Apache or another web server via rewrite rules and use the access logs to do the accounting. Apache is good at access statistics.
>
> If your primary concern is statistics, there is no way you can get exact figures. There may be any number of caching mechanisms between you and the client that you don't control so GET requests may never reach your server. Anyway, at this level there's no such thing as a "cdn version" because you are supposedly in control of the web server that does the mapping from a GET request to a file on the filesystem.
>

Hi,

Yes, the site and its other functionality are powered by Django, and
this task is hardly related to Django. True, we are in control of the
server, but we can use another virtual host for a CDN or something.
Yes, log files can be a good idea, but they are not optimal for many of
the stats. I'll get to them as a last resort. Thanks

Shamail Tayyab

Dec 17, 2010, 6:10:20 AM
to Django users
> Does your CDN offer statistics? I use Rackspace Cloud Files and I
> know there are log files I can rely on. I haven't used them yet, but
> maybe it's an option...

Hi,

No, we don't have a CDN yet; it's a prospect, and a costly one.
Thanks

Shamail Tayyab

Dec 17, 2010, 6:11:09 AM
to Django users

> Google for X-SendFile
>
Hey! This one looks like exactly what I want... I'll give it a shot.

Thanks

Stephen Waterbury

Dec 30, 2010, 6:17:04 PM
to django...@googlegroups.com
I am baffled, and it's probably something simple I'm missing ...
I just need to send a message for help and then I'll see it ...
(maybe ... ;)

My setup:
* apache2 on Ubuntu 10.04
* mod_wsgi 3.3, compiled with python 2.6.5 (the system python)
(but Ubuntu's mod_wsgi package was apparently not the problem)
* django installed in a virtualenv with python 2.6.5 also
* 2 django apps, one running on ':80' virtual host and one on
':8000' virtual host, each with a separate wsgi script (of course)
* apache server config (apache.conf) has WSGIPythonHome directive:
'WSGIPythonHome [path to virtualenv directory]'
* virtualenv directory has python interpreter in its bin dir and
python packages installed in its lib dir, including all django
libs
* I have verified that AuthenticationMiddleware can be imported
successfully from the command line within the virtualenv using
the virtualenv's python interpreter
* both django apps live within the virtualenv directory,
each in its own "project" directory there
* both apps are configured in apache with WSGIDaemonProcess directive

The error I continue to get is:

"ImproperlyConfigured at /

"The Django remote user auth middleware requires the
authentication middleware to be installed. Edit your
MIDDLEWARE_CLASSES setting to insert
'django.contrib.auth.middleware.AuthenticationMiddleware' before
the RemoteUserMiddleware class."

The relevant sections of the settings.py files for the apps are:
--------------
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.auth.middleware.RemoteUserMiddleware',
)

AUTHENTICATION_BACKENDS = (
    'django.contrib.auth.backends.RemoteUserBackend',
)
---------------

Ultimately I'll be using Active Directory (Kerberos) auth, but for
purposes of testing the REMOTE_USER stuff I've configured Basic
Authentication, and it is working (I authenticate successfully
before seeing that error message).

One thing that puzzles me is that in spite of a WSGIPythonHome
directive that points to the virtualenv, the error message lists
the "Python Executable" as "/usr/bin/python". The "Python Path"
shown in the error message seems correct: it includes the
virtualenv's site-packages directory, which is where django is
installed (there is no system-level django installed, Ubuntu's or
otherwise), and as I say I tested that the middleware classes
that the error message complains about can be imported using the
virtualenv's python interpreter on the command line.

Any suggestions welcome! (I can send the whole error page if it
would help, but I'll need to launder it a little.)

TIA!
Steve

Stephen Waterbury

Dec 30, 2010, 6:20:15 PM
to django...@googlegroups.com
Sorry for the repeats! I was already a member of the
group from another email address, but Google kept sending
me bounce messages even after I joined from this address
and resent -- then all 3 copies came at once ... oops. :(

Steve

Stephen Waterbury

Jan 3, 2011, 4:36:36 PM
to django...@googlegroups.com
I posted the traceback I'm getting -- http://dpaste.com/293813/
I'm still completely mystified. The traceback directs me to:

"Edit your MIDDLEWARE_CLASSES setting to insert
'django.contrib.auth.middleware.AuthenticationMiddleware'
before the RemoteUserMiddleware class." But looking further
up in the traceback, it sees that d.c.a.m.AuthenticationMiddleware
*is* there, and it's true -- I already have it in my
MIDDLEWARE_CLASSES setting before the RemoteUserMiddleware class,
so how could mod_wsgi possibly not be finding it?

See below for more details on my configuration ...

Steve

Stephen Waterbury

Jan 3, 2011, 6:54:48 PM
to django...@googlegroups.com
Ach, permissions problem. After I disabled the error message in
d.c.a.m.RemoteUserMiddleware, I got an error message about the
database not being writable -- I'm using sqlite and forgot that
the *directory* containing the db file has to be writable (for the
temp file). Doh! So the real problem that the
d.c.a.m.AuthenticationMiddleware message got triggered by was that
the User it tried to create couldn't be saved (I infer) -- not a
very good error message for that! (mod_wsgi works just fine! ;)
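
A quick way to rule this out before digging into middleware is to check both the database file and its directory for write access. This is only a sketch: `sqlite_write_problems` is a hypothetical helper, and it tests the permissions of whichever user runs it, so it would have to be run as the Apache/mod_wsgi user to be meaningful.

```python
import os
import tempfile


def sqlite_write_problems(db_path):
    """Return a list of reasons why SQLite could not write to db_path."""
    problems = []
    directory = os.path.dirname(os.path.abspath(db_path))
    # The database file itself must be writable if it already exists.
    if os.path.exists(db_path) and not os.access(db_path, os.W_OK):
        problems.append("database file is not writable: %s" % db_path)
    # SQLite also creates temporary files next to the database, so the
    # containing directory needs write (and search) permission as well.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append("directory is missing or not writable: %s" % directory)
    return problems


if __name__ == "__main__":
    # Demonstration against a throwaway directory and an empty file.
    with tempfile.TemporaryDirectory() as tmp:
        demo_db = os.path.join(tmp, "site.db")
        open(demo_db, "w").close()
        print(sqlite_write_problems(demo_db))
```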

Steve

Karen Tracey

Jan 4, 2011, 10:16:29 PM
to django...@googlegroups.com
On Mon, Jan 3, 2011 at 4:36 PM, Stephen Waterbury <stephen.c...@nasa.gov> wrote:
> I posted the traceback I'm getting -- http://dpaste.com/293813/
> I'm still completely mystified. The traceback directs me to:
> "Edit your MIDDLEWARE_CLASSES setting to insert
> 'django.contrib.auth.middleware.AuthenticationMiddleware'
> before the RemoteUserMiddleware class." But looking further
> up in the traceback, it sees that d.c.a.m.AuthenticationMiddleware
> *is* there, and it's true -- I already have it in my
> MIDDLEWARE_CLASSES setting before the RemoteUserMiddleware class,
> so how could mod_wsgi possibly not be finding it?

The code issuing this message here is guessing about the cause of the real problem it has run into, which is that the request object it has been handed has no user attribute. The most likely reason for no user attribute on the request at this point is missing AuthenticationMiddleware, but if AuthenticationMiddleware is in place then there must be something else causing the problem.

I happened to just stumble across one of these other possible problems: some other error in the project code. In my case I had a model with a method with a @Property decorator applied (leftover deliberate error to see what message that would generate -- should be @property). With that error in place, adding RemoteUserMiddleware to my config and trying to access the site via apache/mod_wsgi produced the error message you are seeing. The real error was obvious when I tried running the dev server, which would not even start properly with that error in place. Also, removing the RemoteUserMiddleware from the settings and accessing the site via apache/mod_wsgi showed the true error. Once I removed the erroneous @Property decorator, the RemoteUserMiddleware error went away.

I'm not sure why this RemoteUserMiddleware error is having the effect of hiding other errors, and don't have time to dig into that right now. But one possible way to figure out what is going on in your case would be to try removing that middleware and seeing if some other error is reported.
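
The guessing described above can be mimicked without Django. The sketch below uses stand-in classes (all names are hypothetical, and this is not Django's actual code) to show that a `hasattr`-style check cannot tell "no user attribute was ever set" apart from "looking up the user failed". Whether that is what happened in this thread is an assumption, not something confirmed here.

```python
class ImproperlyConfigured(Exception):
    """Stand-in for django.core.exceptions.ImproperlyConfigured."""


class PlainRequest(object):
    """A request no middleware has touched: it has no user attribute."""


class BrokenUserRequest(object):
    """A request whose user attribute exists but fails when it is read."""

    @property
    def user(self):
        # Simulates an unrelated failure while the user is being loaded.
        raise AttributeError("failure while loading the user")


def require_user(request):
    """Mimic a middleware that insists on request.user being present."""
    if not hasattr(request, "user"):
        raise ImproperlyConfigured(
            "requires the authentication middleware to be installed"
        )
    return request.user


def check(request):
    """Return a short description of what require_user does with request."""
    try:
        require_user(request)
    except ImproperlyConfigured as exc:
        return "ImproperlyConfigured: %s" % exc
    return "ok"


if __name__ == "__main__":
    # Both cases end in the same, misleading configuration error.
    print(check(PlainRequest()))
    print(check(BrokenUserRequest()))
```

In Python 3 `hasattr` hides only `AttributeError`; in the Python 2 of this thread it hid any exception raised during the lookup, which would make this kind of masking more likely.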

Karen
--
http://tracey.org/kmt/
