The digests from the hashlib module could be converted to base 96, as
could the random bytes used in crypto (to create the salt, the IV, or a key).
As you can see here [2], there are 94 printable ASCII characters
(decimal codes 33-126). So only 2 more characters need to be added:
the space (code 32) and, last, one non-printable character
(one that doesn't create any problem).
[1] http://svn.python.org/view/python/trunk/Modules/binascii.c
[2] http://en.wikipedia.org/wiki/ISO/IEC_8859-1
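The character count and the gain over base 64 can be checked with a few lines of Python. This is only a rough sketch: the helper name `bits_per_char` is made up for illustration, and it measures the theoretical information per character, not the output of any particular encoder.

```python
import math

# Printable ASCII characters excluding the space: codes 33..126.
visible = [chr(code) for code in range(33, 127)]
print(len(visible))  # 94

def bits_per_char(base):
    """Theoretical number of bits carried by one character in this base."""
    return math.log2(base)

# Compare a few candidate alphabet sizes.
for base in (64, 85, 94, 96, 128):
    bits = bits_per_char(base)
    # 8 / bits is the ideal number of output characters per input byte.
    print(base, round(bits, 3), round(8 / bits, 3))
```

Base 96 carries about 6.585 bits per character against exactly 6 for base 64, so the best possible saving is a little under 9% in output length.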
_______________________________________________
Python-Dev mailing list
Pytho...@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-dev2-garchive-22421%40googlegroups.com
On Fri, Aug 1, 2008 at 4:06 PM, Kless <jona...@googlemail.com> wrote:
> I think it would be very interesting if Python had a module
> for working with base 96 too. [1]
>
> It could be converted to base 96 the digests from hashlib module, and
> random bytes used on crypto (to create the salt, the IV, or a key).
>
> As you can see here [2], the printable ASCII characters are 94
> (decimal code range of 33-126). So only left to add another 2
> characters more; the space (code 32), and one not-printable char
> (which doesn't create any problem) by last.
>
>
> [1] http://svn.python.org/view/python/trunk/Modules/binascii.c
> [2] http://en.wikipedia.org/wiki/ISO/IEC_8859-1
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
For some reason, integral powers of two seem so much more, well,
POWERFUL, if you know what I mean. Frankly I think you are being either
optimistic or charitable in suggesting that such a use case might exist.
There's a reason that DEC called their equivalent of base64 "6-bit
encoding".
But then I wanted to keep integer division as it was, so I am clearly a
techno-luddite. If the world wants fractional bits I'm sure it's only a
matter of time before some genius decides to design a 67.9-bit computer.
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
So the next possible encoding would be base-128 (7-bit encoding),
although I don't know whether that would be feasible, since it would
then use non-printable characters that could change the text
(characters such as Backspace or Delete).
On 2 Aug, 03:21, Steve Holden <st...@holdenweb.com> wrote:
> 96 is approximately 2^6.585
>
> For some reason, integral powers of two seem so much more, well,
> POWERFUL, if you know what I mean. Frankly I think you are being either
> optimistic or charitable in suggesting that such a use case might exist.
>
> There's a reason that DEC called their equivalent of base64 "6-bit
> encoding".
>
> But then I wanted to keep integer division as it was, so I am clearly a
> techno-luddite. If the world wants fractional bits I'm sure it's only a
> matter of time before some genius decides to design a 67.9-bit computer.
- Josiah
RFC 1924, published on April 1, 1996, to shorten the representation
of IPv6 addresses, so that you can write
    ssh '4)+k&C#VzJ4br>0wv%Yp'
instead of having to write
    ssh 1080:0:0:0:8:800:200C:417A
Most notably, section 7 (implementation issues) points out
    Many current processors do not find 128 bit integer arithmetic, as
    required for this technique, a trivial operation. This is not
    considered a serious drawback in the representation, but a flaw of
    the processor designs.
For arbitrary-sized data, you'd have to give up 128-bit arithmetic,
of course, and represent the input data to encode as a long integer.
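In Python the 128-bit arithmetic is not an issue, since integers are arbitrary-precision. A minimal sketch of the technique follows. It assumes the 85-character alphabet below is transcribed correctly from RFC 1924, and the function names `encode_ipv6` and `decode_ipv6` are made up for illustration.

```python
import ipaddress

# Alphabet assumed to match the one listed in RFC 1924.
ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "!#$%&()*+-;<=>?@^_`{|}~"
)
assert len(ALPHABET) == 85

def encode_ipv6(address):
    """Treat the address as one 128-bit integer and write it in base 85."""
    number = int(ipaddress.IPv6Address(address))
    digits = []
    # 85**20 > 2**128, so 20 digits always suffice; pad to a fixed width.
    for _ in range(20):
        number, digit = divmod(number, 85)
        digits.append(ALPHABET[digit])
    return "".join(reversed(digits))

def decode_ipv6(text):
    """Inverse of encode_ipv6; returns the compressed textual form."""
    number = 0
    for char in text:
        number = number * 85 + ALPHABET.index(char)
    return str(ipaddress.IPv6Address(number))

encoded = encode_ipv6("1080:0:0:0:8:800:200C:417A")
print(encoded, decode_ipv6(encoded))
```

For arbitrary-sized data the same loop works on `int.from_bytes(data, "big")`, at the cost of quadratic time for long inputs, which is why practical encoders work on small fixed-size blocks instead.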
Regards,
Martin
P.S. Just in case it isn't clear: I would oppose any specific proposal
to add this Ascii85 algorithm to the standard library. It would sound
like we don't have any real problems to solve.
Original intent (encoding IPV6 addresses) != current usefulness (a
more efficient ascii encoding of binary data). Generally, I'm of the
opinion that base64 (as an ascii encoding of binary data) is
sufficient for any needs I have, but there are cases where having a
more efficient representation would be useful. I would also not
suggest addition in the 2.6/3.0 timeframe, at best it would be
2.7/3.1, and only if someone submits a patch with testcases (note that
the wiki page provides C source for one-shot encoding and decoding
that doesn't require 128-bit arithmetic).
Sounds to me like a project for the OP.
- Josiah
Same here.
> Original intent (encoding IPV6 addresses) != current usefulness (a
> more efficient ascii encoding of binary data).
That was an April Fool's RFC.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
According to Wikipedia, "its main modern use is in Adobe's PostScript and
Portable Document Format file formats".
It is also used by git for diffs of binary files, and those diffs are supposedly
understood by other VCSes like Mercurial... indeed, Mercurial has a Python
extension for base85 encoding (but licensed under the GPL):
http://selenic.com/hg/index.cgi/file/cbdfd08eabc9/mercurial/base85.c
(I suppose Bazaar has something similar)
Finally, since this encoding packs more bytes into the same number of
ASCII characters than its traditional alternatives, it is likely to gain
traction in applications that need to create a pure ASCII representation of
binary data.
Regards
Antoine.
I'm very interested in this (for Rietveld). Where can I learn more
about how git handles diffs of binary files? Does it actually show
adds and deletes of sections of the file?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
See also http://en.wikipedia.org/wiki/April_Fools%27_Day_RFC -- it has
a ton of these. Great fun reading through some of them on an idle
Saturday afternoon. :-)
Well, I'm not sure. I just tried with Mercurial, first committing a
binary file with the following structure:
part1 part3
and then changing it to the following structure:
part1 part2 part3 part2
(part{1,2,3} being some binary chunks of 400 bytes each
from /dev/urandom)
The "git-style" diff given by Mercurial is then:
diff --git a/binfile b/binfile
index acfa6ffc5287c6e9cd400af7b8ab09d072a28b02..5b9a69212ae8f39bf41fbf2194db2b730dcb0ae9
GIT binary patch
literal 1600
zc%1Fi`#%#1003~2SQat~^D2)~Q>OB2Y-`ijkz&}wZbC#-o=0t7jaVepQ|D0_8*3gJ
zMSY0rj%^W39zC2fTEo0@yVCs|_rrbvhcAJotX5m|hB+RX(Aa5xSa4Y^GkS%y10Hva
z3^q{I&mAF9vs@GpEP1!lCxtq*vdKD&+M&87%65P%egC&>7+Bgzx0-lUziyCW?%ELc
z;eHsAnXOY+YY~y3f6~CD+?JujZGa*JV=V-x-twhC^~z}e+->VcW=&UqfNg97Mxf3d
zP2!VM#<4|n+(B|5rOUMBfQ=w}vEdoi_TK&saEG1S{mn@ndj^rKLkR~K7EJZGGO3U9
z<g>qYkn__U%akFI(fHNxLoP?qI1sW!5@Div?MoBQU<}W8T2DXa{`gkjO1RO?{-Yz3
z-yd-sVx%pSu0elCXI*-RPErV~&bEbl*yk6ff?mV<KJ-WwL$dK$cWOO8v8nQ6i6@~u
zvw~4tJo7xMUB<3mj-!r@xsl3A+<JySyN4MA5V)G1_~N92FPXjkyeyjjlF2Cwdfx=~
zrIy88pZDkmmlT6BVEB&HsO>pWHk^!$4x~AJ$M!$>LrjA^UDjlQD0igDHf?>kfTL8C
zT`$ToUViVhgSRSdJij81#v>3jDpj>m&hGZ(a4UQqYSc{FI)@&=mHL-D&8&McIqHsU
zfI-aCDfNmLo!AZjn=0JV{sMJcsTSiO;$}r>P(?>*s6&cpc-Lu5__+c38NPW>O=Ze3
zuFNT^b+XahK^P*sIU93zA4b<Bi<K+TWMzdB4}W%epKy!i*d>tzW?3CiviE&Xi9>Bo
zUeM*=71Le)&!sv725-*vg<uCI_{^T8FxI^)wh*sYr0H@is@*)iWzn066>S1naIlau
zItQO-*hG^8yP=|O<8TL=MWBR_ts5qYMlzBvx9@assRs0B-+%0N-!#0o=9E#}7KwgS
zL(7(rp`ZFnuqIS(+L{IckNZ%BsAvS7nS6!Qvhd<7{m}yj9>#ZU@DGy9C0o;vRo8VN
z{7^{BvC^#ss2zI20|7B%d7&`+;}M|Lb4kcpnT&a>ztw$PNWnC?zuEIbUm<$TLPu{@
zFY}#Iu~hB^StZra*ohA%zVI_1hLduWZa6S~5X9x-A^(HQYtpfwIrX2BO0ucP;atl(
z9w7PlD`@LzhQ`qgJ$YA&U#BnZ>E7FQ)o02@hJW4jTYGs3*i0#Kwn2EEC`iaj#%!n!
zBp#{xc{d%j`Q}GwfJ*5YN5L1jO$egiqV_Dr=2I4()xVmuAhxjtq%Tv8Xpj0eNbDC`
zr)~pTMOKjpr4)Ts!=w3K?TKMAmsB;SKU4G(tZRNi_28i83YGM!sFTf@cL9z%07l$5
zz=K?Re2pPf)sp-XNp`QKAIQ}l7Qjr5SbMa;Q&#UT4(|#}kQ>Zx58&XEdwZV}rT}gi
zAJ(u*feBc}#SFv;?WH=Nl}c*ntY4Tbs@l#;Ya(NdbFhN{QV43Y-kINHGC2@E)ms1`
zEe~IK=-G~B2d_gQ&0mTKxO#ub8Q)9wI!6xVMVM}&lvlFHsV+BT+H~fhnnW~!EEnc+
zD=DAs=Y01@@>AfcR=Yz7$Z5mZ8P~c0<c+6^Ln)%3kuKh-Dp%rRlz3?{E)fP1KT406
Xzmcn%Pn@uyr&=uAM*jcfzxCr^k_Znj
From that I don't know what can be done with the diff. Looking at the
Mercurial source code suggests that you can encode deltas in the patch,
but that Mercurial doesn't support it (see "# TODO: deltas"):
http://www.selenic.com/hg/index.cgi/file/cbdfd08eabc9/mercurial/patch.py#l1117
A basic explanation of binary diffs here:
http://www.selenic.com/pipermail/mercurial/2008-July/020184.html
The explanation mentions base-64 but it was corrected in a later message
here:
http://www.selenic.com/pipermail/mercurial/2008-July/020192.html
Regards
Antoine.
PS: here are the commands I've typed:
$ hg init bindiff
$ cd bindiff/
$ dd if=/dev/urandom of=part1 bs=1 count=400
[snip output]
$ dd if=/dev/urandom of=part2 bs=1 count=400
[snip output]
$ dd if=/dev/urandom of=part3 bs=1 count=400
[snip output]
$ cat part1 part3 > binfile
$ hg add binfile
$ hg ci -m "added binfile"
$ cat part1 part2 part3 part2 > binfile
$ hg di
diff -r 19cfb10c4a01 binfile
Binary file binfile has changed
$ hg di --git
[produces the patch above]
> So the next possible encoding would be base-128 (7-bit encoding)
A while ago I wanted to pack as much information as
possible into a string of printable characters, and
I came up with a base-95 encoding that packs 9 bytes
into 11 characters.
The application involved representing data using
Python string literals, so it was important that only
printable characters were used. I settled on the
9/11 combination as a reasonable compromise between
packing efficiency and not having the block size
too long.
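The 9/11 combination works because 95**11 is slightly larger than 256**9, so any 9-byte block fits in 11 base-95 digits. The following is only a minimal sketch of such a block codec, not the routines mentioned in this message: it assumes the alphabet is simply ASCII codes 32 to 126 in order, and it leaves out the handling of a final partial block and the escaping of quote and backslash characters that a string literal would need.

```python
# The 95 printable ASCII characters, space through tilde (assumed ordering).
ALPHABET = "".join(chr(code) for code in range(32, 127))
BLOCK_BYTES = 9
BLOCK_CHARS = 11

# The whole scheme rests on this inequality.
assert 95 ** BLOCK_CHARS >= 256 ** BLOCK_BYTES

def encode_block(block):
    """Encode exactly 9 bytes as 11 printable characters."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("block must be exactly 9 bytes")
    number = int.from_bytes(block, "big")
    chars = []
    for _ in range(BLOCK_CHARS):
        number, digit = divmod(number, 95)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def decode_block(text):
    """Decode 11 printable characters back into 9 bytes."""
    if len(text) != BLOCK_CHARS:
        raise ValueError("text must be exactly 11 characters")
    number = 0
    for char in text:
        number = number * 95 + ALPHABET.index(char)
    return number.to_bytes(BLOCK_BYTES, "big")

block = bytes(range(9))
print(decode_block(encode_block(block)) == block)
```

The expansion is 11/9, about 22%, compared with 33% for base64. Longer blocks can get closer to the theoretical limit for 95 characters, but the integers involved grow with the block size.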
If anyone's interested, I could dig out the
encoding and decoding routines I wrote.
--
Greg
There were a lot of Python jokes for April 1st. What a pity we have
stopped making such jokes.
http://mail.python.org/pipermail/python-list/2001-April/076593.html
http://mail.python.org/pipermail/python-list/2003-April/197232.html
http://mail.python.org/pipermail/python-list/2004-April/256320.html
(Despite being a joke it really works!)
http://mail.python.org/pipermail/python-list/2005-April/315453.html
http://mail.python.org/pipermail/python-list/2005-April/315457.html
http://mail.python.org/pipermail/python-list/2006-April/375866.html
Oleg.
--
Oleg Broytmann http://phd.pp.ru/ p...@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
You forget the April 1st PEPs:
http://www.python.org/dev/peps/pep-0313/
http://www.python.org/dev/peps/pep-3117/
Georg
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
I figured it's a slow Sunday so I'd collect them on the wiki:
http://wiki.python.org/moin/AprilFools
I found the Python/Perl joint development press release, but only on the
Wayback machine. It appears that when redesigning the python.org website
that page was deemed inappropriate.
Skip
Although it wasn't April 1, here's one I posted
in response to python-dev discussions.
http://mail.python.org/pipermail/python-list/2001-May/084169.html
There was also another one concerning how to reduce
the number of ways of copying a list, but Google
doesn't seem to want to find it.
--
Greg
Great!
> I found the Python/Perl joint development press release, but only on the
> Wayback machine. It appears that when redesigning the python.org website
> that page was deemed inappropriate.
Alas, way too much stuff was dropped by the redesign. ;-(
I should track down the Tim Peters award (or whatever it was called)
and link that.
We should probably cross-link with the Python humor page on python.org
(unless that's also been axed).
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
IIRC, the humor page was axed due to lack of updates -- I recommend
finding the material using Wayback and just adding it to the wiki.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
Adopt A Process -- stop killing all your children!
http://pyfound.blogspot.com/2006/04/python-25-licensing-change.html
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
It's still there:
http://www.python.org/doc/humor/
it's just been absorbed into the documentation. ;-)
I can't find Tim Peters' award page and don't know if you can search the
Wayback Machine. (I suspect you have to know a precise URL.)
Skip
I added a link to the wiki page.
> I can't find Tim Peters' award page and don't know if you can search the
> Wayback Machine. (I suspect you have to know a precise URL.)
Added this too; searching for <pythonic award tim peters> gave me an
email that had the correct URLs. It's still one of my favorites.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
> Martin v. Löwis <martin <at> v.loewis.de> writes:
>>
>> P.S. Just in case it isn't clear: I would oppose any specific proposal
>> to add this Ascii85 algorithm to the standard library. It would sound
>> like we don't have any real problems to solve.
>
> According to Wikipedia, "its main modern use is in Adobe's PostScript and
> Portable Document Format file formats".
... git ... mercurial ... bzr
It's sort of too bad about the April Fool's RFC, because now people
tend to think that an encoding with a non-power-of-2 base is just a
joke.
I had to overcome that when working with my programming partner, but
he eventually decided that base-62 was indeed a useful encoding for
our purposes. :-)
I've written a few ascii encoders over the years, mostly in Python,
plus an optimized C version of base-32 (with a real live Duff's Device):
base62.py:
http://allmydata.org/source/z-base-62/trunk-hashedformat/z-base-62/base62/base62.py
base36.py:
http://allmydata.org/source/z-base-36/trunk-hashedformat/z-base-36/base36/base36.py
base32.py:
http://allmydata.org/source/z-base-32/trunk-hashedformat/base32/base32/base32.py
base32.c:
http://allmydata.org/source/z-base-32/trunk-hashedformat/base32/base32.c
Regards,
Zooko
The best April Fool's jokes (imo) are the ones that are obviously
silly right off, but that 1) work, 2) no sane person would ever use,
and 3) offer up something useful hidden in the joke. The April Fool's
RFC fits the bill perfectly, because out of it all comes base85, which
is an actual improvement over base64 (25% expansion of data vs. 33%).
That some people missed that part of the joke isn't terribly
surprising (I have in other situations).
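The 25% vs. 33% figures are easy to check with a current Python. Note that `base64.b85encode` only appeared in Python 3.4, well after this thread, so this is a sketch for modern readers rather than something available at the time. The 60-byte payload is chosen because 60 is a multiple of both 3 and 4, which keeps padding out of the picture.

```python
import base64
import os

payload = os.urandom(60)  # multiple of 3 (base64) and of 4 (base85)

b64 = base64.b64encode(payload)  # 4 characters per 3 bytes
b85 = base64.b85encode(payload)  # 5 characters per 4 bytes

print(len(b64), len(b64) / len(payload))  # 80 characters, about 1.33x
print(len(b85), len(b85) / len(payload))  # 75 characters, 1.25x

# Both encodings decode back to the original bytes.
assert base64.b64decode(b64) == payload
assert base64.b85decode(b85) == payload
```

For inputs whose length is not a multiple of the group size, the last group adds a little extra overhead in both encodings.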
- Josiah