FormWizard and __call__

3 views
Skip to first unread message

Jay Parlar

unread,
Apr 10, 2008, 10:10:17 PM4/10/08
to django-d...@googlegroups.com
I originally posted this in -users, but since it deals with internals,
maybe it's more appropriate for -dev.

I'm looking at the FormWizard code, and in __call__, we have the following:

for i in range(current_step):
form = self.get_form(i, request.POST)
if request.POST.get("hash_%d" % i, '') != self.security_hash(request, form):
return self.render_hash_failure(request, i)
self.process_step(request, form, i)

My concern is that self.get_form() gets called everytime through. This
means that our form instances are being recreated/instantisated over
and over again.

I understand wanting to handle the case where some data has changed,
but shouldn't there be some sort of cache so we don't need to keep
recreating the instances?

Jay P.

Honza Král

unread,
Apr 11, 2008, 8:03:16 AM4/11/08
to django-d...@googlegroups.com

the short answer is I know of no other way how to make sure the
previous steps are intact.

you could perhaps postpone the validation of data till the end of the
wizard, so only run this once, but that could cause problems for
example if the user chose to override process_step and do something
based on the unsaved forms, for example generate third step
dynamically based on data from the second. If we didn't instantiate
the form and check the hash, we wouldn't be able to guarantee
correctness of the data in previous steps.

If you have any proposals on how to go around this, I will be happy to help out.

>
> Jay P.
>
> >
>

--
Honza Král
E-Mail: Honza...@gmail.com
ICQ#: 107471613
Phone: +420 606 678585

Jay Parlar

unread,
Apr 11, 2008, 10:15:43 AM4/11/08
to django-d...@googlegroups.com
On Fri, Apr 11, 2008 at 8:03 AM, Honza Král <honza...@gmail.com> wrote:
> you could perhaps postpone the validation of data till the end of the
> wizard, so only run this once, but that could cause problems for
> example if the user chose to override process_step and do something
> based on the unsaved forms, for example generate third step
> dynamically based on data from the second. If we didn't instantiate
> the form and check the hash, we wouldn't be able to guarantee
> correctness of the data in previous steps.

Overriding process_step is how I originally ran into this. I wanted to
dynamically create new Form classes (and add them to form_list), based
on the results of the first submitted forms. The problem was this
ended up recreating the new Form classes over and over in
process_step, and it kept appending them to form_list.

> If you have any proposals on how to go around this, I will be happy to help out.

Currently when we call get_form(i, request.POST) in __call__, we just
arbitrarily recreate the Form instance. What if on the first time we
create that instance, we hash and cache the components of request.POST
that are associated with that form? We can tie parts of request.POST
to a particular step, because the step is prepended to each key in
request.POST. Then when get_form() gets called again, we can just
check whether the POST data has changed. If it hasn't, we can return
the same Form instance.

So I'm essentially wondering whether we can do the reverse of what's
being done now. You currently instantiate the form, and do a
security_hash on it. I'd like to do a hash on the POST data, and then
instantiate if necessary (i.e. if we haven't already instantiated for
this hash).

Or am I missing something here?

Thanks,
Jay P.

Honza Král

unread,
Apr 11, 2008, 9:04:25 PM4/11/08
to django-d...@googlegroups.com
On Fri, Apr 11, 2008 at 4:15 PM, Jay Parlar <par...@gmail.com> wrote:
>
> On Fri, Apr 11, 2008 at 8:03 AM, Honza Král <honza...@gmail.com> wrote:
> > you could perhaps postpone the validation of data till the end of the
> > wizard, so only run this once, but that could cause problems for
> > example if the user chose to override process_step and do something
> > based on the unsaved forms, for example generate third step
> > dynamically based on data from the second. If we didn't instantiate
> > the form and check the hash, we wouldn't be able to guarantee
> > correctness of the data in previous steps.
>
> Overriding process_step is how I originally ran into this. I wanted to
> dynamically create new Form classes (and add them to form_list), based
> on the results of the first submitted forms. The problem was this
> ended up recreating the new Form classes over and over in
> process_step, and it kept appending them to form_list.
>
>
> > If you have any proposals on how to go around this, I will be happy to help out.
>
> Currently when we call get_form(i, request.POST) in __call__, we just
> arbitrarily recreate the Form instance. What if on the first time we
> create that instance, we hash and cache the components of request.POST
> that are associated with that form? We can tie parts of request.POST
> to a particular step, because the step is prepended to each key in
> request.POST. Then when get_form() gets called again, we can just
> check whether the POST data has changed. If it hasn't, we can return
> the same Form instance.

yes, but the POST can contain any data, only the form can know which
are interesting for us. The problem also is how do you want to cache
this between requests? Some sites do not have caching enabled, so
unless you want to store it in sessions (so there would be absolutely
no point in storing it in hidden fields as well) you have no place to
put that.

>
> So I'm essentially wondering whether we can do the reverse of what's
> being done now. You currently instantiate the form, and do a
> security_hash on it. I'd like to do a hash on the POST data, and then
> instantiate if necessary (i.e. if we haven't already instantiated for
> this hash).
>
> Or am I missing something here?

the hash is computed from cleaned data, from data that represent the
form fields, without prefixes etc, unlike the data contained in POST,
that doesn't know which form they belong to etc.

If you feel that I didn;t understand your proposal you are probably
right, a code snippet would definitely help ;)

thanks

>
> Thanks,

Jay Parlar

unread,
Apr 11, 2008, 9:37:03 PM4/11/08
to django-d...@googlegroups.com
On 4/11/08, Honza Král <honza...@gmail.com> wrote:
> On Fri, Apr 11, 2008 at 4:15 PM, Jay Parlar <par...@gmail.com> wrote:
> > Currently when we call get_form(i, request.POST) in __call__, we just
> > arbitrarily recreate the Form instance. What if on the first time we
> > create that instance, we hash and cache the components of request.POST
> > that are associated with that form? We can tie parts of request.POST
> > to a particular step, because the step is prepended to each key in
> > request.POST. Then when get_form() gets called again, we can just
> > check whether the POST data has changed. If it hasn't, we can return
> > the same Form instance.
>
>
> yes, but the POST can contain any data, only the form can know which
> are interesting for us. The problem also is how do you want to cache
> this between requests? Some sites do not have caching enabled, so
> unless you want to store it in sessions (so there would be absolutely
> no point in storing it in hidden fields as well) you have no place to
> put that.

I was going to say "Put the cache into self.foo", but now I'm just
realizing that there is just one FormWizard instantiated in urls.py,
so different people could be using the same FormWizard instance.

Hmmm.... Is that right? Because that really screws up my perception
now of what's going on. (For example, is it safe to put dynamic data
into self.form_list, because multiple people could be getting their
form_list affected by that. Sometimes working in just the dev server
makes me forget to think about issues like this).

I was hoping (assuming some kind of cache is available, even if it's
sessions) that the first time a form is instantiated, we cache the
names and values of the POST data that the form uses. Next time
get_form() is called, we check if those same values are still there.

We'd have to be careful though: If a user is on step Y, and a change
is noticed when calling get_form() on a step X (where X < Y), we'd
have to invalidate the caches for all steps after X, and reinstantiate
those forms when get_form() is called for them.

> the hash is computed from cleaned data, from data that represent the
> form fields, without prefixes etc, unlike the data contained in POST,
> that doesn't know which form they belong to etc.

That's why I was thinking of caching the names/values of the required
data in POST, for each step.

> If you feel that I didn;t understand your proposal you are probably
> right, a code snippet would definitely help ;)

I'll wait for your answers to what I've thrown up above, and then see
what I can do. Probably no time this weekend to try though :)

BTW: Web programming is not my forte. I try to get into Django
whenever I can, but I'm mostly a server-side and embedded guy, so
please forgive me if I get some concepts regarding requests wrong.

Thanks,
Jay P.

Honza Král

unread,
Apr 13, 2008, 6:13:49 PM4/13/08
to django-d...@googlegroups.com
On Sat, Apr 12, 2008 at 3:37 AM, Jay Parlar <par...@gmail.com> wrote:
>
> On 4/11/08, Honza Král <honza...@gmail.com> wrote:
> > On Fri, Apr 11, 2008 at 4:15 PM, Jay Parlar <par...@gmail.com> wrote:
>
> > > Currently when we call get_form(i, request.POST) in __call__, we just
> > > arbitrarily recreate the Form instance. What if on the first time we
> > > create that instance, we hash and cache the components of request.POST
> > > that are associated with that form? We can tie parts of request.POST
> > > to a particular step, because the step is prepended to each key in
> > > request.POST. Then when get_form() gets called again, we can just
> > > check whether the POST data has changed. If it hasn't, we can return
> > > the same Form instance.
> >
> >
> > yes, but the POST can contain any data, only the form can know which
> > are interesting for us. The problem also is how do you want to cache
> > this between requests? Some sites do not have caching enabled, so
> > unless you want to store it in sessions (so there would be absolutely
> > no point in storing it in hidden fields as well) you have no place to
> > put that.
>
> I was going to say "Put the cache into self.foo", but now I'm just
> realizing that there is just one FormWizard instantiated in urls.py,
> so different people could be using the same FormWizard instance.

there is no easy solution for this one - you could create your own
instance evrytime in a view, that would get rid of the problem with
more users on one instance, but what about multi-process (let alone
multi box) deployment, where there is no guarantee that request from
the same user will end up in the same process.

>
> Hmmm.... Is that right? Because that really screws up my perception
> now of what's going on. (For example, is it safe to put dynamic data
> into self.form_list, because multiple people could be getting their
> form_list affected by that. Sometimes working in just the dev server
> makes me forget to think about issues like this).
>
> I was hoping (assuming some kind of cache is available, even if it's
> sessions) that the first time a form is instantiated, we cache the
> names and values of the POST data that the form uses. Next time
> get_form() is called, we check if those same values are still there.

you could do that, but the it would just be easier to store all the
values in session. Why not, but its not what the wizard class does
(although it might be very easy to enable multiple storage options via
hooks)

>
> We'd have to be careful though: If a user is on step Y, and a change
> is noticed when calling get_form() on a step X (where X < Y), we'd
> have to invalidate the caches for all steps after X, and reinstantiate
> those forms when get_form() is called for them.
>
>
> > the hash is computed from cleaned data, from data that represent the
> > form fields, without prefixes etc, unlike the data contained in POST,
> > that doesn't know which form they belong to etc.
>
> That's why I was thinking of caching the names/values of the required
> data in POST, for each step.
>
>
> > If you feel that I didn;t understand your proposal you are probably
> > right, a code snippet would definitely help ;)
>
> I'll wait for your answers to what I've thrown up above, and then see
> what I can do. Probably no time this weekend to try though :)
>
> BTW: Web programming is not my forte. I try to get into Django
> whenever I can, but I'm mostly a server-side and embedded guy, so
> please forgive me if I get some concepts regarding requests wrong.


the easiest solution to your issues with the wizard class would be to
modify it to allow storing of the data in session, so that you could
move all this there. I am still -1 on storing the data on the server,
but if a clean implementation of pluggable storage (get_data(request):
return request.POST ??? ) is in place, I will have no objections
against the possibility.

Jay Parlar

unread,
Apr 14, 2008, 11:56:41 AM4/14/08
to django-d...@googlegroups.com
On Sun, Apr 13, 2008 at 6:13 PM, Honza Král <honza...@gmail.com> wrote:
> > I was going to say "Put the cache into self.foo", but now I'm just
> > realizing that there is just one FormWizard instantiated in urls.py,
> > so different people could be using the same FormWizard instance.
>
> there is no easy solution for this one - you could create your own
> instance evrytime in a view, that would get rid of the problem with
> more users on one instance, but what about multi-process (let alone
> multi box) deployment, where there is no guarantee that request from
> the same user will end up in the same process.

Yeah, I definitely see that. I guess it's a fairly typical problem in
Django/webdevelopment land. I always tend to forget about it when
developing with the dev server.

> > I was hoping (assuming some kind of cache is available, even if it's
> > sessions) that the first time a form is instantiated, we cache the
> > names and values of the POST data that the form uses. Next time
> > get_form() is called, we check if those same values are still there.
>
> you could do that, but the it would just be easier to store all the
> values in session. Why not, but its not what the wizard class does
> (although it might be very easy to enable multiple storage options via
> hooks)

Maybe I'll look into that.


> the easiest solution to your issues with the wizard class would be to
> modify it to allow storing of the data in session, so that you could
> move all this there. I am still -1 on storing the data on the server,
> but if a clean implementation of pluggable storage (get_data(request):
> return request.POST ??? ) is in place, I will have no objections
> against the possibility.

I'll see what I can do. Here's a question though: Do you think the
documentation for FormWizard should be changed, especially around
process_step, and mentions of form_list? This is the part that ended
up really confusing me:

"This method should not modify any of that data. Rather, it might want
to set self.extra_context or dynamically alter self.form_list, based
on previously submitted forms."

From my experience here, it seems one really shouldn't be dynamically
altering form_list, unless you want that alteration to globally affect
all users. Because this chunk of text is listed in the process_step()
documentation, I got the impression that form_list was much more
fine-grained than that.

Jay P.

Honza Král

unread,
Apr 15, 2008, 10:58:46 AM4/15/08
to django-d...@googlegroups.com


Yeah, a line saying that if you wish to modify some of the attributes,
you should create new instance every time a view is called ( == wrap
it in a view function) would be nice.

Reply all
Reply to author
Forward
0 new messages