Message from discussion Feature request: collectstatic shouldn't recopy files that already exist in destination
Dan Loewenherz  
From: Dan Loewenherz <dloewenh...@gmail.com>
Date: Thu, 27 Sep 2012 09:51:52 -0700
Local: Thurs, Sep 27 2012 12:51 pm
Subject: Feature request: collectstatic shouldn't recopy files that already exist in destination

Hey all!

This is a feature request / proposal (one which I'm willing to build out,
given that I've already developed a solution for my own uploader).

I run a consulting business that helps small startups build initial MVPs.
When the time ultimately comes to decide how to store static assets, my
preference (as is that of many others) is to use Amazon S3, with Amazon
CloudFront acting as a CDN to the appropriate buckets. For the purposes of
this ticket, s/S3/your object storage of choice/g.

Now, Django has an awesome mechanism for making sure static assets are up
on S3. With the appropriate static storage backend, running ./manage.py
collectstatic just searches through your static media folders and copies
the files.

The problem I've run into is that collectstatic copies all files,
regardless of whether they already exist on the destination. Uploading
5-10MB of files is pretty wasteful (and time consuming) when none of the
files have changed and no new files have been added.

As I wrote in the trac ticket (https://code.djangoproject.com/ticket/19021),
my current solution is a management command that does essentially the same
thing as collectstatic, with a few differences. Here's a rundown (copied
straight from the trac ticket).

> I currently solve this problem by creating a file containing metadata of
> all the static media at the root of the destination. This file is a JSON
> object that contains file paths as keys and checksum as values. When an
> upload is started, the uploader checks to see if the file path exists as a
> key in the dictionary. If it does, it checks to see if the checksums have
> changed. If they haven't changed, the uploader skips the file. At the end
> of the upload, the checksum file is updated on the destination.
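
To make the idea concrete, here's a rough sketch of the approach described in the quote above. This is not the actual patch or my uploader's code; it's a standalone illustration with some assumptions: a local directory stands in for the S3 bucket, the manifest file name (checksums.json) and the function names are made up for the example, and SHA-256 is used as the checksum since the description doesn't pin down an algorithm.

```python
import hashlib
import json
import shutil
from pathlib import Path

# Hypothetical manifest name; the proposal does not specify one.
MANIFEST_NAME = "checksums.json"


def file_checksum(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_static(source_dir, dest_dir):
    """Copy only new or changed files to dest_dir.

    Returns the list of relative paths that were actually copied.
    dest_dir is a local directory standing in for the remote storage.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    manifest_path = dest_dir / MANIFEST_NAME

    # Load the path -> checksum mapping stored at the root of the
    # destination, or start with an empty one on the first upload.
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    else:
        manifest = {}

    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir).as_posix()
        checksum = file_checksum(src)
        if manifest.get(rel) == checksum:
            continue  # path is known and checksum unchanged: skip it
        target = dest_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, target)
        manifest[rel] = checksum
        copied.append(rel)

    # At the end of the upload, write the updated manifest back.
    dest_dir.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return copied
```

Calling sync_static("static", "bucket") twice in a row copies everything the first time and nothing the second. Note that this sketch doesn't handle files deleted from the source, and a real version would go through the storage backend instead of shutil.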

I'll contribute the patch. I know there is not a lot of time before the
feature freeze, but I'm willing to make this happen if there's interest.

If we don't want to change behavior, perhaps adding a flag such as
--skip-unchanged-files to the collectstatic command is the way to go?

All the best,
Dan


 