CRC32 was chosen for a few reasons:
1) Availability in all PHP versions (although that is less of an issue
now that Smarty 3 requires PHP 5+)
2) Performance. These hashes happen on every execution, so performance
is very important. CRC32 is 10x faster than MD5(). I think Uwe is going
to run a benchmark for base64_encode, but again I think CRC32 will win
out in speed.
As for collision, although it is possible, it is very unlikely. In 8
years of Smarty 2 we have never encountered a collision issue with
anyone using Smarty. So, CRC32 has been a fast and valid choice for
filepath creation.
3) Shortness, which keeps filenames readable and manageable.
base64_encode output could get very long.
I'm open to other solutions that avoid collisions, are as fast or
faster, and keep filenames manageable. So far CRC32 has been the choice.
Monte
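
To make the filename-length point concrete, here is a minimal sketch that prints the identifier each candidate would produce for one path. The template path is made up for illustration, and the replacement characters used for Base64 ('_' and '-') are an assumption, since the thread does not say what / and + were replaced with.

```php
<?php
// Hypothetical template path, used only for illustration.
$path = 'templates/admin/reports/monthly_summary.tpl';

$candidates = [
    // abs() removes the sign that crc32() can return on 32-bit builds.
    'crc32'  => (string) abs(crc32($path)),
    'md5'    => md5($path),
    'sha1'   => sha1($path),
    // Assumed substitution: swap the two Base64 characters that are
    // awkward in filenames for '_' and '-'.
    'base64' => strtr(base64_encode($path), '/+', '_-'),
];

// Print each identifier with its length in characters.
foreach ($candidates as $name => $value) {
    printf("%-6s %3d chars  %s\n", $name, strlen($value), $value);
}
```

The crc32 value is at most 10 digits, md5 is always 32 hex characters, sha1 is always 40, and the Base64 form grows with the length of the path.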
> --
>
> You received this message because you are subscribed to the Google
> Groups "Smarty Developers" group.
> To post to this group, send email to smarty-d...@googlegroups.com.
> To unsubscribe from this group, send email to
> smarty-develop...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/smarty-developers?hl=en.
> > See http://en.wikipedia.org/wiki/Base64. I choose to replace / and +, to
> > make the generated filenames compatible with all file systems, according to
> > http://en.wikipedia.org/wiki/Filename#Comparison_of_file_name_limitat....
>
> > This could create longer file names. But again according to
> > http://en.wikipedia.org/wiki/Filename#Comparison_of_file_name_limitat...,
Thue Janus Kristensen wrote:
> Hmm. Then the only solution I can see is to save the filepath to the
> first line of the compiled template / cached output, and then check
> that it matches the expected filename when loading it.
>
> Regards, Thue
>
> On Sun, Dec 27, 2009 at 1:57 AM, uwe.tews <uwe....@googlemail.com> wrote:
>
> That works for subfolders of template_dir. But template_dir itself can
> be an array of folders, so we still need unique IDs.
>
> On Dec 27, 1:40 am, Thue Janus Kristensen <thu...@gmail.com> wrote:
> > The best solution is to save the cached files in a directory structure
> > mirroring the template files themselves. IMO.
> >
> > That would also be user friendly, as it would be more obvious where to
> > find the cached file, given the location of the original template.
> >
> > Regards, Thue
> >
> > On Sat, Dec 26, 2009 at 10:04 PM, Thue Janus Kristensen <thu...@gmail.com> wrote:
At first I was thinking this meant loading/reading a file as well as the
file timestamp check. If this is merely an assertion test added to the
top, I think it would be an acceptable integrity test.
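
A minimal sketch of what such a check could look like, assuming the compiled file records its source template path on the first line and the loader compares that line before using the file. The function names and the marker format are hypothetical, not Smarty's actual API.

```php
<?php
// Hypothetical helpers for illustration; these are not Smarty functions.

/** Build the marker line that records the source template path. */
function sourceMarker(string $templatePath): string
{
    // json_encode escapes "/" and newlines, so the path cannot end the
    // comment early or spill onto a second line.
    return '<?php /* source: ' . json_encode($templatePath) . ' */ ?>' . "\n";
}

/** Write a compiled file whose first line is the marker. */
function writeCompiled(string $compiledFile, string $templatePath, string $body): void
{
    file_put_contents($compiledFile, sourceMarker($templatePath) . $body);
}

/** Return true only if the file's first line names the expected template. */
function compiledMatches(string $compiledFile, string $templatePath): bool
{
    if (!is_readable($compiledFile)) {
        return false;
    }
    $handle = fopen($compiledFile, 'r');
    if ($handle === false) {
        return false;
    }
    $firstLine = fgets($handle);
    fclose($handle);
    return $firstLine === sourceMarker($templatePath);
}

// Usage: pretend two template paths hashed to the same compiled filename.
$file = tempnam(sys_get_temp_dir(), 'tpl');
writeCompiled($file, 'site_a/index.tpl', "compiled output\n");
var_dump(compiledMatches($file, 'site_a/index.tpl')); // bool(true)
var_dump(compiledMatches($file, 'site_b/index.tpl')); // bool(false)
unlink($file);
```

In this sketch the comparison is done by the loader, reading one line, rather than by code inside the compiled file, so it also rejects a file that carries no marker at all.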
Thue Janus Kristensen wrote:
> Well, you need to do something :). The current lack of collision
> detection is obviously unacceptable.
>
> And is it so unjustifiable? It takes almost no work to fetch the
> first line of an already loaded file and compare it to an expected
> value. Or perhaps make the first line be a <?php assert that
> $template_path = "...." ?>, that should be very fast too. I just don't
> see the big performance problem.
>
> Regards, Thue
>
> On Sun, Dec 27, 2009 at 8:45 PM, Monte Ohrt <mo...@ohrt.com> wrote:
>
> .. a runtime overhead we can't justify.
Also note that in this academic case, if a CRC32 value were duplicated,
the assertion test might not exist in the exploited file, so an external
assertion test might be required.
> > > > > 2) If an attacker could control some component of
This is assuming you have filesystem access to create these paths, so
you can already view whatever files you want anyways :)
I see the academic case you present, and I am open to alternates to
crc32 if we can keep generated files manageable and not take a
performance hit.
My tests show that the built-in md5 is faster than the crc32 call as
implemented.
CODE:
<?php
// Each section times 100000 iterations of one expression.

// Baseline: a plain assignment, to measure the loop overhead itself.
$start = microtime(true);
$iterations = 100000;
while ($iterations--) {
    $hash = 'this/is/typicalstring.tpl';
}
printf("nothing time: %.3f \n", microtime(true) - $start);

// Raw crc32() call.
$start = microtime(true);
$iterations = 100000;
while ($iterations--) {
    $hash = crc32('this/is/typicalstring.tpl');
}
printf("crc32 time: %.3f \n", microtime(true) - $start);

// crc32() wrapped in abs() and a string cast ("as implemented").
$start = microtime(true);
$iterations = 100000;
while ($iterations--) {
    $hash = (string) abs(crc32('this/is/typicalstring.tpl'));
}
printf("crc32 as implemented time: %.3f \n", microtime(true) - $start);

// md5() call.
$start = microtime(true);
$iterations = 100000;
while ($iterations--) {
    $hash = md5('this/is/typicalstring.tpl');
}
printf("md5 time: %.3f \n", microtime(true) - $start);

// sha1() call.
$start = microtime(true);
$iterations = 100000;
while ($iterations--) {
    $hash = sha1('this/is/typicalstring.tpl');
}
printf("sha1 time: %.3f \n", microtime(true) - $start);
RESULT:
nothing time: 0.009
crc32 time: 0.033
crc32 as implemented time: 0.072
md5 time: 0.058
sha1 time: 0.068
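
To read these totals as per-call costs, a small sketch like the following subtracts the empty-loop baseline and divides by the iteration count. The numbers are copied from the RESULT block above; the helper name is made up for illustration.

```php
<?php
// Totals in seconds for 100000 iterations, copied from the RESULT block.
$iterations = 100000;
$totals = [
    'nothing'              => 0.009,
    'crc32'                => 0.033,
    'crc32 as implemented' => 0.072,
    'md5'                  => 0.058,
    'sha1'                 => 0.068,
];

/** Per-call cost in microseconds after subtracting the empty-loop baseline. */
function perCallMicroseconds(float $total, float $baseline, int $iterations): float
{
    return ($total - $baseline) / $iterations * 1e6;
}

foreach ($totals as $label => $total) {
    if ($label === 'nothing') {
        continue; // the baseline itself is not a hash call
    }
    printf(
        "%-22s %.2f us per call\n",
        $label,
        perCallMicroseconds($total, $totals['nothing'], $iterations)
    );
}
```

Every variant comes out below one microsecond per call on the machine that produced these figures.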
If sha1 avoids the extra cross-check for collisions at compile time, I'd
vote to go with it, as this will be faster overall. md5() could also work
and is slightly quicker than sha1(), but it is less collision-resistant,
since md5 produces a shorter hash (128 bits versus 160 for sha1).
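
As a rough sketch of what a sha1-based filename could look like, assuming the template directory is hashed together with the template name so that the same name under two entries of a template_dir array gets distinct identifiers. The function name, the directories, and the naming scheme are hypothetical, not what Smarty actually does.

```php
<?php
// Hypothetical helper; the naming scheme is an assumption for illustration.
function compiledFilename(string $templateDir, string $templateName): string
{
    // Hash directory and name together so identical names in different
    // template directories do not share an identifier.
    $id = sha1(rtrim($templateDir, '/') . '/' . $templateName);

    // Keep the base name as a readable suffix.
    return $id . '.' . basename($templateName) . '.php';
}

// Same template name under two made-up template directories.
echo compiledFilename('/var/www/site_a/templates', 'news/index.tpl'), "\n";
echo compiledFilename('/var/www/site_b/templates', 'news/index.tpl'), "\n";
```

The identifier part is always 40 characters, so the filename length depends only on the base name of the template.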
Thue Janus Kristensen wrote:
> Perhaps worth noting is that the longest time, 0.068 seconds for
> 100000 calls to sha1, corresponds to 6.8 × 10^-7 s for one call to sha1.
On Dec 29, 8:54 pm, Thue Janus Kristensen <thu...@gmail.com> wrote:
> Personally I would tend to go with both sha1 and the run-time check, just to
> be ridiculously secure :).
>
> But using sha1 only and dropping the run-time check is also a reasonable
> choice.
>
> Regards, Thue
>
> On Tue, Dec 29, 2009 at 8:41 PM, Monte Ohrt <mo...@ohrt.com> wrote:
> > If sha1 avoids the extra cross-check for collisions at compile time, I'd
> > vote to go with it as this will be faster overall. md5() could also work
> > and is slightly quicker than sha1(), but not as secure from collisions
> > as md5 is a shorter bit length.
>
> > Thue Janus Kristensen wrote:
> > > Perhaps worth noting is that the longest time, 0.068 seconds for
> > > 100000 calls to sha1, corresponds to 6.8 × 10^-7 s for one call to sha1.
>
> > > So assuming one sha1 use per page served, after one million pages
> > > served you will have used a total of less than 1 second of server time
> > > on calculating sha1 hashes.
>
> > > So what I conclude from this is that the hash calculation part of the
> > > code is not performance critical, so we might as well use sha1.
>
> > > (Real run time might be slightly longer due to cache misses, but still)
>
> > > Regards, Thue
The odds of a random collision are something like 1 in 10^30, which
is about the same odds as all the oxygen molecules in your room
randomly migrating to one side of the room and you suffocating.
It theoretically could happen, but I wouldn't worry about it
:)
-John Campbell
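
For anyone who wants to plug in their own numbers, here is a minimal sketch of the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 · 2^bits)), for n distinct paths hashed into a space of the given bit width. The figure of 10000 templates is an arbitrary assumption, and the estimate only covers accidental collisions between random inputs, not deliberately constructed ones.

```php
<?php
/**
 * Approximate probability that at least two of $items random inputs
 * share a hash value of $bits bits (birthday approximation).
 */
function collisionProbability(float $items, int $bits): float
{
    $space = pow(2.0, $bits);
    $x = $items * ($items - 1.0) / (2.0 * $space);

    // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -expm1(-$x);
}

// Assumed workload: 10000 distinct template paths.
$templates = 10000.0;
foreach ([32 => 'crc32', 128 => 'md5', 160 => 'sha1'] as $bits => $name) {
    printf("%-5s (%3d bits): %.3e\n", $name, $bits, collisionProbability($templates, $bits));
}
```

With these assumptions the 32-bit case comes out on the order of one percent, while the 128-bit and 160-bit cases are vanishingly small.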