Managing Google Groups headaches

rusi

Nov 28, 2013, 8:52:21 AM

This silly google-groups does not reflect changed subject lines!!
That means that GG users who may want to read this may not see it.
So reposting as a new thread:
--------------------------------------
Here's what I do to manage the GG-headaches:

1. Firefox needs to have the "It's All Text!" add-on installed
https://addons.mozilla.org/en-US/firefox/addon/its-all-text/

2. Set the editor in "It's All Text!" to emacs
[You can use anything… including pure python… more on that below]

3. Put the following into your emacs init
-----------------
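;; Clean up Google Groups extra newlines containing only "> "

(defun clean-gg ()
  "Remove the quoted blank lines that Google Groups inserts."
  (interactive)
  ;; Protect runs of three quoted blank lines with a temporary marker
  (replace-regexp "^> *\n> *\n> *$" "-=\=-" nil 0 (point-max))
  ;; Delete the remaining lines that end in a bare "> "
  (flush-lines "> *$" 0 (point-max))
  ;; Turn each marker back into a single empty line
  (replace-regexp "-=\=-" "" nil 0 (point-max))
  ;; (save-buffers-kill-terminal t)
  )

(global-set-key (kbd "<f9>") 'clean-gg)

;; (push 'clean-gg find-file-hook)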
----------------

Now firefox will show a small new "edit" button in the text window.
Clicking that puts you into emacs with the text of the message.

F9 will clean up the double-spacing.

Depending on whether you are comfortable with emacs or not you can do
either of:

1. Continue editing in emacs.
M-q and/or auto-fill-mode will clean up long-line paras
Save-quit will put you back into firefox with cleaned up text

2. Not comfortable with emacs? Just F9 and save-quit will get you back
to firefox with the double-spacing cleaned up.
The long lines problem remains in this case.

Don't like emacs?

1. If you know how to write similar code for vi (or whatever) you are
set.
2. You can also set up emacs to clean up and close immediately
3. You can also set up your 'editor' to be a pure python script
[I've not got round to doing it because I'm not sure how to
catch-report errors in a proper cross-platform way.]
4. If you are a javascript/greasemonkey expert I guess you can convert
the emacs-code to JS/GM code and that would be a zero-click
solution.

Usually use emacs? (ie have it running usually)
You may prefer emacsclient to emacs for the editor.
It will be more instantaneous.

Chris Angelico

Nov 28, 2013, 8:58:14 AM

to pytho...@python.org

On Fri, Nov 29, 2013 at 12:52 AM, rusi <rusto...@gmail.com> wrote:
> Here's what I do to manage the GG-headaches:

Useful tips, I am sure, but they solve the problem only for you.
Everyone who reads python-list/c.l.p will have to implement equivalent
patches. Every archive of the newsgroup or mailing list suffers from
the same problems, too, and it's not going to be easy to solve that
for people.

The true solution is either to fix Google Groups or to not use it.

ChrisA

rusi

Nov 28, 2013, 9:17:23 AM

On Thursday, November 28, 2013 7:28:14 PM UTC+5:30, Chris Angelico wrote:

> On Fri, Nov 29, 2013 at 12:52 AM, rusi wrote:
> > Here's what I do to manage the GG-headaches:

> Useful tips, I am sure, but they solve the problem only for you.
> Everyone who reads python-list/c.l.p will have to implement equivalent
> patches. Every archive of the newsgroup or mailing list suffers from
> the same problems, too, and it's not going to be easy to solve that
> for people.

The problems with GG as I understand are
1. Double spacing
2. Long lines

As far as I can see both are cured with the method outlined.
If it's not for others and only for me, I'd like to know.
That 2 is a problem was only brought to my notice recently.
And so my fix for it is recent.
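
For the long-lines half of the problem, a script-based editor could re-wrap the text itself. What follows is a minimal Python sketch, assuming a 72-column limit and quote prefixes of the form "> " or ">". The names `rewrap` and `QUOTE_PREFIX` are hypothetical, and unlike M-q this only splits over-long lines; it does not refill whole paragraphs.

```python
import re
import textwrap

# Leading run of quote markers ("> > ", ">>", ...) followed by the line body.
QUOTE_PREFIX = re.compile(r"^((?:> ?)*)(.*)$")


def rewrap(text, width=72):
    """Split lines longer than `width`, repeating each line's quote prefix."""
    out = []
    for line in text.splitlines():
        prefix, body = QUOTE_PREFIX.match(line).groups()
        if len(line) <= width or not body.strip():
            out.append(line)  # short or blank lines pass through unchanged
            continue
        pieces = textwrap.wrap(
            body,
            width=max(20, width - len(prefix)),
            break_long_words=False,   # keep URLs in one piece
            break_on_hyphens=False,
        )
        out.extend(prefix + piece for piece in pieces)
    result = "\n".join(out)
    # splitlines() drops the final newline, so restore it if it was there.
    return (result + "\n") if text.endswith("\n") else result
```

Calling `rewrap(post_text)` on the text handed over by the add-on would leave short lines alone and break the long ones at word boundaries.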

Chris Angelico

Nov 28, 2013, 9:25:52 AM

to pytho...@python.org

On Fri, Nov 29, 2013 at 1:17 AM, rusi <rusto...@gmail.com> wrote:
> The problems with GG as I understand are
> 1. Double spacing
> 2. Long lines
>
> As far as I can see both are cured with the method outlined.
> If its not for others and only for me, I'd like to know.
> That 2 is a problem was only brought to my notice recently.
> And so my fix for it is recent.

Yes. Those are the problems. Are you suggesting this as a way to post
via GG without it being a nuisance, or to read news without seeing
those problems? If the former, it is surely far FAR easier to just
read and write mail on python-list, or use Thunderbird, or somesuch,
than to go through these hoops just to be able to keep using buggy
software. People won't do it. And if the latter, well, that's my point
about it solving things only for you.

In fact, either way, it solves things only for you. The problem is
that there are a huge number of users who are not you.

ChrisA

rusi

Nov 28, 2013, 10:04:53 AM

On Thursday, November 28, 2013 7:55:52 PM UTC+5:30, Chris Angelico wrote:

> On Fri, Nov 29, 2013 at 1:17 AM, rusi wrote:
> > The problems with GG as I understand are
> > 1. Double spacing
> > 2. Long lines
> > As far as I can see both are cured with the method outlined.
> > If its not for others and only for me, I'd like to know.
> > That 2 is a problem was only brought to my notice recently.
> > And so my fix for it is recent.

> Yes. Those are the problems. Are you suggesting this as a way to post
> via GG without it being a nuisance, or to read news without seeing
> those problems?

The former.

> If the former, it is surely far FAR easier to just
> read and write mail on python-list, or use Thunderbird, or somesuch,
> than to go through these hoops just to be able to keep using buggy
> software.

It's a one-time setup -- as is thunderbird.

It's really quite unclear to me why GG is a problem if all the problems
of GG are obviated.

Chris Angelico

Nov 28, 2013, 10:08:17 AM

to pytho...@python.org

On Fri, Nov 29, 2013 at 2:04 AM, rusi <rusto...@gmail.com> wrote:
> Its really quite unclear to me why GG is a problem if all the problems
> of GG are obviated.

Which is easier, fiddling around with your setup so you can post
reasonably on Google Groups, or just getting a better client? With
your setup, you have to drop out to another editor and press F9 for it
to work. With pretty much any other newsreader on the planet, this
works straight off, no setup necessary.

I'm still going to advise people to stop using buggy rubbish.

ChrisA

Michael Torrie

Nov 28, 2013, 10:40:47 AM

to pytho...@python.org

On 11/28/2013 08:08 AM, Chris Angelico wrote:
> Which is easier, fiddling around with your setup so you can post
> reasonably on Google Groups, or just getting a better client? With
> your setup, you have to drop out to another editor and press F9 for it
> to work. With pretty much any other newsreader on the planet, this
> works straight off, no setup necessary.
>
> I'm still going to advise people to stop using buggy rubbish.

My opinion is that the Python list should dump the Usenet tie-in and
just go straight e-mail. Python is the only list I'm on that has a
usenet gateway.

I used to love usenet back in the day, but the present internet
climate makes it unworkable, though I concede that e-mail is reaching
the end of its usefulness as well.

I wouldn't oppose a dual e-mail list and web-based forum system,
provided the forum system supported threaded conversations in a clean
and useful way (maybe like google wave used to).

Alister

Nov 28, 2013, 10:50:39 AM

Whilst I agree with Chris A's main points I would at least say thank you
for :-

A) finding a solution that works for you.
B) Posting it so that others can try it to see if it works for them.

Perhaps the best option is for everybody to bombard Google with bug
reports (preferably typed with extra long lines & double spaced as that
is clearly what they are used to & we would not want to upset them would
we? )

--
It is better for civilization to be going down the drain than to be
coming up it.
-- Henry Allen

Alister

Nov 28, 2013, 10:49:09 AM

On Fri, 29 Nov 2013 02:08:17 +1100, Chris Angelico wrote:

Alister

Nov 28, 2013, 10:49:39 AM

On Fri, 29 Nov 2013 02:08:17 +1100, Chris Angelico wrote:

Alister

Nov 28, 2013, 10:50:09 AM

On Fri, 29 Nov 2013 02:08:17 +1100, Chris Angelico wrote:

rusi

Nov 28, 2013, 11:22:27 AM

On Thursday, November 28, 2013 9:20:39 PM UTC+5:30, Alister wrote:
> On Fri, 29 Nov 2013 02:08:17 +1100, Chris Angelico wrote:

> > On Fri, Nov 29, 2013 at 2:04 AM, rusi wrote:
> >> Its really quite unclear to me why GG is a problem if all the problems
> >> of GG are obviated.
> > Which is easier, fiddling around with your setup so you can post
> > reasonably on Google Groups, or just getting a better client? With your
> > setup, you have to drop out to another editor and press F9 for it to
> > work. With pretty much any other newsreader on the planet, this works
> > straight off, no setup necessary.
> > I'm still going to advise people to stop using buggy rubbish.
> > ChrisA

> Whilst I agree with Chris A's main points I would at least say thankyou
> for :-

Well thanks for the thanks :-)

> A) finding a solution that works for you.
> B) Posting it so that others can try it to see if it works for them.

> Perhaps the best option is for everybody to bombard Google with bug
> reports (preferably typed with extra long lines & double spaced as that
> is clearly what they are used to & we would not want to upset them would
> we? )

If that has even a small likelihood of succeeding I heartily support
it. My impression is it's been done with no result -- Usenet is too
fringe and obsolete a technology for Google to bother.

On a different note your message has arrived 4 times.
What client did you use?

Travis Griggs

Nov 28, 2013, 11:23:31 AM

to pytho...@python.org

Sent from my iPhone

> On Nov 28, 2013, at 7:40, Michael Torrie <tor...@gmail.com> wrote:
>
>> On 11/28/2013 08:08 AM, Chris Angelico wrote:

>> Which is easier, fiddling around with your setup so you can post
>> reasonably on Google Groups, or just getting a better client? With
>> your setup, you have to drop out to another editor and press F9 for it
>> to work. With pretty much any other newsreader on the planet, this
>> works straight off, no setup necessary.
>>
>> I'm still going to advise people to stop using buggy rubbish.
>

> My opinion is that the Python list should dump the Usenet tie-in and
> just go straight e-mail. Python is the only list I'm on that has a
> usenet gateway.
>
> I used to love usenet back in the day, but in the present internet
> climate makes it unworkable, though I concede that e-mail is reaching
> the end of its usefulness as well.
>
> I wouldn't oppose a dual e-mail list and web-based forum system,
> provided the forum system supported threaded conversations in a clean
> and useful way (maybe like google wave used to).

> --
> https://mail.python.org/mailman/listinfo/python-list

Hear! Hear! Well said and amen. My thoughts exactly.

Alister

Nov 28, 2013, 11:33:57 AM

I thought I had resolved that last week (using Pan under linux)

--
A bureaucrat's idea of cleaning up his files is to make a copy of
everything before he destroys it.

Roy Smith

Nov 28, 2013, 11:43:44 AM

In article <RCJlu.5$rx...@fx05.am4>,

Alister <aliste...@ntlworld.com> wrote:

> Perhaps the best option is for everybody to bombard Google with bug
> reports (preferably typed with extra long lines & double spaced as that
> is clearly what they are used to & we would not want to upset them would
> we? )

It's pretty clear Google doesn't care about Google Groups. Or, at
least, they don't care that it interacts badly with newsgroups, and in
particular with bidirectional newsgroup/mailing-list gateways.

The purpose of Google Groups is to generate traffic to their site, which
it does just fine. Making it behave better with newsgroups won't change
that, so there's no incentive for them to do so.

Ned Batchelder

Nov 28, 2013, 12:23:06 PM

to pytho...@python.org

On 11/28/13 11:23 AM, Travis Griggs wrote:
>
>
> Sent from my iPhone
>
>> On Nov 28, 2013, at 7:40, Michael Torrie <tor...@gmail.com> wrote:
>>
>>> On 11/28/2013 08:08 AM, Chris Angelico wrote:

>>> Which is easier, fiddling around with your setup so you can post
>>> reasonably on Google Groups, or just getting a better client? With
>>> your setup, you have to drop out to another editor and press F9 for it
>>> to work. With pretty much any other newsreader on the planet, this
>>> works straight off, no setup necessary.
>>>
>>> I'm still going to advise people to stop using buggy rubbish.
>>

>> My opinion is that the Python list should dump the Usenet tie-in and
>> just go straight e-mail. Python is the only list I'm on that has a
>> usenet gateway.
>>
>> I used to love usenet back in the day, but in the present internet
>> climate makes it unworkable, though I concede that e-mail is reaching
>> the end of its usefulness as well.
>>
>> I wouldn't oppose a dual e-mail list and web-based forum system,
>> provided the forum system supported threaded conversations in a clean
>> and useful way (maybe like google wave used to).
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>
> Hear! Hear! Well said and amen. My thoughts exactly.
>

Funny, I thought the sentiment of many here was, "let's just keep this
as a newsgroup, why do we need the mailing list also?" but I'll admit to
being confused about what people have been proposing for alternate
topologies.

--Ned.

Chris Angelico

Nov 28, 2013, 12:29:43 PM

to pytho...@python.org

On Fri, Nov 29, 2013 at 3:43 AM, Roy Smith <r...@panix.com> wrote:
> The purpose of Google Groups is to generate traffic to their site, which
> it does just fine. Making it behave better with newsgroups won't change
> that, so there's no incentive for them to do so.

Which is why the solution is to tell people to get off it. So long as
people still use it, Google has no incentive to make it better.

ChrisA

rusi

Nov 28, 2013, 1:13:36 PM

Here's a 1-click pure python solution.

As I said I don't know how to manage errors!

1. Put it in a file say cleangg.py and make it executable
2. Install it as the 'editor' for the "It's All Text!" firefox addon
3. Click the edit button and you should get a cleaned out post

------------------------------
#!/usr/bin/env python3

from sys import argv
import re
from re import sub

def clean(s):
    # Mark each pair of quoted blank lines, drop the remaining quoted
    # blank lines, then turn each mark back into a single ">" line.
    s1 = sub("^> *\n> *$", "¶", s, flags=re.M)
    s2 = sub("^> *\n", "", s1, flags=re.M)
    s3 = sub("¶\n", ">\n", s2, flags=re.M)
    return s3

def main():
    print("argv[1] %s" % argv[1])
    with open(argv[1]) as f:
        s = f.read()
    with open(argv[1], "w") as f:
        f.write(clean(s))

main()
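
On the open question of managing errors, one possible approach is to catch any exception and write the traceback to a log file in the system temp directory, since output printed by an editor launched from the browser is easy to miss. This is only a sketch under assumptions: the log file name, the `run` function, and the UTF-8 encoding are hypothetical choices, and `clean` repeats the three substitutions from the script above.

```python
import re
import sys
import tempfile
import traceback
from pathlib import Path

MARK = "\u00b6"  # pilcrow, used as a temporary placeholder
# Hypothetical log location; tempfile.gettempdir() exists on every platform.
LOG_PATH = Path(tempfile.gettempdir()) / "cleangg-error.log"


def clean(s):
    """Same three substitutions as the script above."""
    s1 = re.sub(r"^> *\n> *$", MARK, s, flags=re.M)
    s2 = re.sub(r"^> *\n", "", s1, flags=re.M)
    return re.sub(MARK + r"\n", ">\n", s2, flags=re.M)


def run(path, log_path=LOG_PATH):
    """Clean the file in place; on any failure log the traceback, return 1."""
    try:
        target = Path(path)
        original = target.read_text(encoding="utf-8")  # assumed encoding
        cleaned = clean(original)
        # The file is only overwritten once cleaning has succeeded.
        target.write_text(cleaned, encoding="utf-8")
        return 0
    except Exception:
        Path(log_path).write_text(traceback.format_exc(), encoding="utf-8")
        return 1


if __name__ == "__main__":
    if len(sys.argv) == 2:
        sys.exit(run(sys.argv[1]))
    print("usage: cleangg.py FILE", file=sys.stderr)
```

The nonzero exit status is there in case the caller checks it; the log file is the part a user would actually look at when a post comes back uncleaned.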

Michael Torrie

Nov 28, 2013, 1:29:13 PM

to pytho...@python.org

On 11/28/2013 10:23 AM, Ned Batchelder wrote:
> Funny, I thought the sentiment of many here was, "let's just keep this
> as a newsgroup, why do we need the mailing list also?" but I'll admit to
> being confused about what people have been proposing for alternate
> topologies.

That may well be the majority sentiment here. I only state my opinion.

Seems like 90% of the problems on this list come from the unchecked
usenet side of things. Such as trolls or spam. For example a certain
iron-skulled person who posted his whining rants and threats from half a
dozen different addresses to the annoyance of all. Despite many calls
to banish him from the list for his blatant disregard for list
etiquette, with usenet it's just not possible. Although I'm sure some
would argue that's a good thing to be unable to kick offenders off the list.

I've used mailing lists for many years and they seem to be a good
compromise between an open community and a controlled forum.

rusi

Nov 28, 2013, 1:37:29 PM

On Thursday, November 28, 2013 11:59:13 PM UTC+5:30, Michael Torrie wrote:
> On 11/28/2013 10:23 AM, Ned Batchelder wrote:
> > Funny, I thought the sentiment of many here was, "let's just keep this
> > as a newsgroup, why do we need the mailing list also?" but I'll admit to
> > being confused about what people have been proposing for alternate
> > topologies.
>
> That may well be the majority sentiment here. I only state my opinion.
>
> Seems like 90% of the problems on this list come from the unchecked
> usenet side of things. Such as trolls or spam. For example a certain
> iron-skulled person who posted his whining rants and threats from half a
> dozen different addresses to the annoyance of all. Despite many calls
> to banish him from the list for his blatant disregard for list
> etiquette, with usenet it's just not possible. Although I'm sure some
> would argue that's a good thing to be unable to kick offenders off the list.

Do you realize that that person was not using GG?

IOW we are unfortunately conflating two completely unrelated things:
1. GG has some technical problems which are fairly easy to solve
2. All kinds of people hop onto the list. In addition to genuine ones there are
spammers, trolls, dicks, nuts, philosophers, help-vampires etc etc.

rusi

Nov 28, 2013, 2:00:22 PM

To add to that:
1. In this thread itself there is a quadruple-post
2. In an adjacent thread there is the mess due to html mail
3. Sometime ago there was a long argument around the old and unsettled:
Reply vs Reply-all debate

All these are due to NON use of GG.
Does that mean everyone should use GG?
Heck no!

Just this: Technology will occasionally have problems and these can
usually be solved technically.

Walter Hurry

Nov 28, 2013, 2:40:13 PM

On Thu, 28 Nov 2013 11:00:22 -0800, rusi wrote:

> On Friday, November 29, 2013 12:07:29 AM UTC+5:30, rusi wrote:

> To add to that:
> 1. In this thread itself there is a quadruple-post
> 2. In an adjacent thread there is the mess due to html mail
> 3. Sometime ago there was a long argument around the old and unsettled:
> Reply vs Reply-all debate
>
> All these are due to NON use of GG.
> Does that mean everyone should use GG?
> Heck no!
>
> Just this: Technology will occasionally have problems and these can
> usually be solved technically.

All true, but the fact remains that the vast majority of GG posters can't
be bothered to do the necessary, or are too stupid, or simply don't care.

You are the exception which proves the rule. I'm with Chris Angelico on
this one.

Michael Torrie

Nov 28, 2013, 1:50:47 PM

to pytho...@python.org

On 11/28/2013 11:37 AM, rusi wrote:
> Do you realize that that person was not using GG?

I do but he was using usenet.

> IOW we are unfortunately conflating two completely unrelated things:
> 1. GG has some technical problems which are fairly easy to solve
> 2. All kinds of people hop onto the list. In addition to genuine ones there are
> spammers, trolls, dicks, nuts, philosophers, help-vampires etc etc.

What they have in common is usenet. Ditching usenet would solve both
problems.

Michael Torrie

Nov 28, 2013, 2:55:27 PM

to pytho...@python.org

My point was that the list problems in general seem to be related to
usenet. GG formatting, spam, trolls. I guess I should have changed the
subject line. Ditching usenet solves the GG problem and a number of
other problems as well.

>> IOW we are unfortunately conflating two completely unrelated things:
>> 1. GG has some technical problems which are fairly easy to solve
>> 2. All kinds of people hop onto the list. In addition to genuine ones there are
>> spammers, trolls, dicks, nuts, philosophers, help-vampires etc etc.
>
> To add to that:
> 1. In this thread itself there is a quadruple-post

Again, sure this was not due to GG, but it was due to a usenet client.
So again, while usenet isn't the problem per se here, moving away from
usenet would have prevented that particular problem.

> 2. In an adjacent thread there is the mess due to html mail

Guess I never see this since I use thunderbird and I can configure it to
always show plain text.

> 3. Sometime ago there was a long argument around the old and unsettled:
> Reply vs Reply-all debate

I think the debate was not that but rather should the list messages
default to reply to list or reply to sender. And I haven't seen that
argument in many years now. Certainly not in the context of usenet vs
e-mail, which I was addressing.

Walter Hurry

Nov 28, 2013, 3:39:27 PM

On Thu, 28 Nov 2013 11:50:47 -0700, Michael Torrie wrote:

> On 11/28/2013 11:37 AM, rusi wrote:

>> Do you realize that that person was not using GG?
>

> I do but he was using usenet.
>

>> IOW we are unfortunately conflating two completely unrelated things:
>> 1. GG has some technical problems which are fairly easy to solve
>> 2. All kinds of people hop onto the list. In addition to genuine ones there are
>> spammers, trolls, dicks, nuts, philosophers, help-vampires etc etc.
>

> What they have in common is usenet. Ditching usenet would solve both
> problems.

Sledgehammer to crack a nut IMO. It's only Alister who appears to suffer
from these "multiple post" problems. And Pan is not the culprit - I'm
using Pan on both Linux and FreeBSD without issues, as doubtless are many
others.

Zero Piraeus

Nov 28, 2013, 11:29:23 AM

to pytho...@python.org

:

On Thu, Nov 28, 2013 at 08:40:47AM -0700, Michael Torrie wrote:
> My opinion is that the Python list should dump the Usenet tie-in and
> just go straight e-mail.

+1 Hell yes.

--
Zero Piraeus: coram publico
http://etiol.net/pubkey.asc

Terry Reedy

Nov 28, 2013, 5:32:04 PM

to pytho...@python.org

On 11/28/2013 10:40 AM, Michael Torrie wrote:
> On 11/28/2013 08:08 AM, Chris Angelico wrote:
>> Which is easier, fiddling around with your setup so you can post
>> reasonably on Google Groups, or just getting a better client? With
>> your setup, you have to drop out to another editor and press F9 for it
>> to work. With pretty much any other newsreader on the planet, this
>> works straight off, no setup necessary.
>>
>> I'm still going to advise people to stop using buggy rubbish.
>

> My opinion is that the Python list should dump the Usenet tie-in

I am beginning to think this also.

> and just go straight e-mail.

email + gmane newsgroup mirror

> Python is the only list I'm on that has a usenet gateway.

1000s of technical mlists have a gmane mirror. There are over 200 just for
Python.

--
Terry Jan Reedy

Terry Reedy

Nov 28, 2013, 5:44:15 PM

to pytho...@python.org

On 11/28/2013 1:29 PM, Michael Torrie wrote:

> Seems like 90% of the problems on this list come from the unchecked
> usenet side of things. Such as trolls or spam.

...
> Despite many calls to banish [such] ...

> with usenet it's just not possible.

The usenet gateway has been changed recently to no longer pass
everything to python-list (and on to gmane) without question. If you
want the benefit of such moderation as there is, use either of those two.

--
Terry Jan Reedy

Arif Khokar

Nov 28, 2013, 7:46:21 PM

The problem could also be solved through client side filtering (i.e.,
killfiles). I usually killfile posters who crosspost to unrelated
groups (which filters 99% of the spam that comes through). I'm sure
that the usenet/email gateway could be configured to filter such posts
on the server side so those who read this list via email won't have
those problems.
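
A crosspost rule of this kind could be sketched in Python as follows. This is an illustration under assumptions, not how any particular news client or gateway implements killfiles: `is_crosspost` and `max_groups` are hypothetical names, and simply counting the entries in the Newsgroups header stands in for the judgement of whether the groups are unrelated.

```python
from email import message_from_string


def is_crosspost(raw_message, max_groups=2):
    """Return True if the Newsgroups header names more than max_groups groups."""
    msg = message_from_string(raw_message)
    header = msg.get("Newsgroups") or ""
    # The header is a comma-separated list of group names.
    groups = [g.strip() for g in header.split(",") if g.strip()]
    return len(groups) > max_groups
```

A filter would call `is_crosspost(article_text)` on each raw article and skip those for which it returns True.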

The problem with just using email is that it's a bit more difficult to
browse archived posts to this group. After I subscribed to this group
(comp.lang.python) using my news client, I could immediately browse
posts made as far back as April.

Mark Lawrence

Nov 29, 2013, 9:39:08 AM

to pytho...@python.org

On 28/11/2013 16:29, Zero Piraeus wrote:
> :
>
> On Thu, Nov 28, 2013 at 08:40:47AM -0700, Michael Torrie wrote:

>> My opinion is that the Python list should dump the Usenet tie-in and
>> just go straight e-mail.
>
> +1 Hell yes.
>

I'd happily use semaphore but given time you're bound to find someone
who could screw that up. So I'll stick with Thunderbird and gmane,
reading some 40-ish Python lists and blogs. Well, I think they're blogs :)

--
Python is the second best programming language in the world.
But the best has yet to be invented. Christian Tismer

Mark Lawrence

Mark Lawrence

Nov 29, 2013, 9:41:35 AM

to pytho...@python.org

On 29/11/2013 00:46, Arif Khokar wrote:
> On 11/28/2013 1:50 PM, Michael Torrie wrote:
>> On 11/28/2013 11:37 AM, rusi wrote:
>
>>> 2. All kinds of people hop onto the list. In addition to genuine ones
>>> there are
>>> spammers, trolls, dicks, nuts, philosophers, help-vampires etc etc.
>>
>> What they have in common is usenet. Ditching usenet would solve both
>> problems.
>
> The problem could also be solved through client side filtering (i. e.,
> killfiles). I usually killfile posters who crosspost to unrelated
> groups (which filters 99% of the spam that comes through). I'm sure
> that the usenet/email gateway could be configured to filter such posts
> on the server side so those who read this list via email won't have
> those problems.
>

Read through gmane, it's effectively spam free.

Grant Edwards

Nov 29, 2013, 11:15:08 AM

On 2013-11-28, Zero Piraeus <z...@etiol.net> wrote:
>:
>
> On Thu, Nov 28, 2013 at 08:40:47AM -0700, Michael Torrie wrote:
>> My opinion is that the Python list should dump the Usenet tie-in and
>> just go straight e-mail.
>
> +1 Hell yes.

I'd have to reluctantly agree. I've been using Usenet for 25 years,
and I still read this as comp.lang.python, but this is practically the
only Usenet group left that I follow. There are a number of mailing
lists I follow via gmane's NNTP server, and I can certainly do the
same for this one.

I've been filtering out all postings from GG for years, so it doesn't
really matter to me, but apparently there are a lot of people with
defective mail/news clients for whom that's not possible?
[Otherwise, I don't understand what all the complaining is about.]
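
For readers whose client has no such filter, the test could be sketched in Python. This rests on assumptions that should be checked against real headers: that articles posted through Google Groups carry a Message-ID ending in "@googlegroups.com" or a User-Agent beginning with "G2/". The name `from_google_groups` is hypothetical.

```python
from email import message_from_string


def from_google_groups(raw_message):
    """Guess from the headers whether an article was posted via Google Groups."""
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip().lower()
    user_agent = (msg.get("User-Agent") or "").strip()
    # Both header patterns are assumptions, not a documented contract.
    return message_id.endswith("@googlegroups.com>") or user_agent.startswith("G2/")
```

Articles for which `from_google_groups(article_text)` returns True could then be hidden or scored down.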

--
Grant

Grant Edwards

Nov 29, 2013, 11:17:45 AM

You're assuming that Usenet === NNTP. You can point your news client
at gmane.org's NNTP server and get all the benefits of "news" for
regular mailing lists.

--
Grant

pec...@pascolo.net

Nov 30, 2013, 8:25:45 AM

Dennis Lee Bieber <wlf...@ix.netcom.com> writes:

> [NNTP] clients provide full-fledged editors
and conversely full-fledged editors provide
NNTP clients

Neil Cerutti

Dec 2, 2013, 8:03:52 AM

to pytho...@python.org

The current situation does force a lot of technology-focused
people, programmers in particular, into a low opinion of Google.
The crappy usenet portal is poor marketing.

I wish they'd never bought dejanews.

--
Neil Cerutti

Roy Smith

Dec 2, 2013, 8:29:47 AM

In article <mailman.3461.1385989...@python.org>,

If you think, "The set of people who are still trying to use usenet
groups for anything serious" is a lot of people, you don't understand
the scale on which Google operates.

Neil Cerutti

Dec 2, 2013, 9:04:33 AM

to pytho...@python.org

It's probably hard to even visualize.

--
Neil Cerutti

rusi

Dec 2, 2013, 12:11:01 PM

On Monday, December 2, 2013 7:34:33 PM UTC+5:30, Neil Cerutti wrote:

> On 2013-12-02, Roy Smith wrote:
> >> The current situation does force a lot of technology-focused
> >> people, programmers in particular, into a low opinion of Google.
> >> The crappy usenet portal is poor marketing.
> >
> > If you think, "The set of people who are still trying to use
> > usenet groups for anything serious" is a lot of people, you
> > don't understand the scale on which Google operates.
>
> It's probably hard to even visualize.

I was dreaming about in an alternate surreal world…
And now you guys have crashed me back to planet-earth-2013

!MEAN!

Mark Lawrence

Dec 2, 2013, 12:48:25 PM

to pytho...@python.org

On 02/12/2013 17:11, rusi wrote:
> On Monday, December 2, 2013 7:34:33 PM UTC+5:30, Neil Cerutti wrote:
>> On 2013-12-02, Roy Smith wrote:
>>>> The current situation does force a lot of technology-focused
>>>> people, programmers in particular, into a low opinion of Google.
>>>> The crappy usenet portal is poor marketing.
>>>
>>> If you think, "The set of people who are still trying to use
>>> usenet groups for anything serious" is a lot of people, you
>>> don't understand the scale on which Google operates.
>>
>> It's probably hard to even visualize.
>

> I was dreaming about in an alternate surreal world…

> And now you guys have crashed me back to planet-earth-2013
>
> !MEAN!
>

¿As this is an international group why not ¡MEAN!? :)

Quickly runs off to hide...

Chris Angelico

Dec 2, 2013, 12:54:36 PM

to pytho...@python.org

On Tue, Dec 3, 2013 at 4:48 AM, Mark Lawrence <bream...@yahoo.co.uk> wrote:
> ¿As this is an international group why not ¡MEAN!? :)

¿Does punctuation nest to any level when you ask, ¿Shouldn't it be ¡MEAN!??

ChrisA

Mark Lawrence

Dec 2, 2013, 1:07:27 PM

to pytho...@python.org

Yes.

Michael Torrie

Dec 2, 2013, 8:17:51 PM

to pytho...@python.org

On 12/02/2013 06:03 AM, Neil Cerutti wrote:
> I wish they'd never bought dejanews.

I wish Google hadn't bought a lot of things. Seems like they buy up a
lot of cool, nerd-centric apps and companies and then turned them into
apps that do less and do it poorly, but in a slick way that appeals to
the unwashed masses. And add "social" to it. Great for their bottom
line, but horrible for those of us that actually use things as tools.

Besides the dejanews thing, another one is Google Voice. Used to be a
great tool but now they are trying to integrate it with Google Hangouts,
reduce its functionality, reduce interoperability, and otherwise ruin
it. I fear next year is the last year for Google Voice in any usable
form for me. Might just have to bite the bullet and set up my own PBX
and pay for a voip provider and port my google voice number over to it.
I'd hate to lose the number; I've used it since Grand Central times.

And Gmail is also becoming less useful to me. I don't want to use
hangouts; xmpp and google talk worked just fine. But alas that's
disappearing.

And the list goes on.

Roy Smith

Dec 2, 2013, 8:43:03 PM

to

In article <mailman.3495.1386033...@python.org>,

Michael Torrie <tor...@gmail.com> wrote:

> I wish Google hadn't bought a lot of things. Seems like they buy up a
> lot of cool, nerd-centric apps and companies and then turned them into
> apps that do less and do it poorly, but in a slick way that appeals to
> the unwashed masses. And add "social" to it. Great for their bottom
> line, but horrible for those of us that actually use things as tools.

And this is surprising, why?

rusi

Dec 2, 2013, 9:27:42 PM

On Tuesday, December 3, 2013 7:13:03 AM UTC+5:30, Roy Smith wrote:

> Michael Torrie wrote:
> > I wish Google hadn't bought a lot of things. Seems like they buy up a
> > lot of cool, nerd-centric apps and companies and then turned them into
> > apps that do less and do it poorly, but in a slick way that appeals to
> > the unwashed masses. And add "social" to it. Great for their bottom
> > line, but horrible for those of us that actually use things as tools.
> And this is surprising, why?

Something floating around here (was it Ben Finney's footer??) went
something like:

We must expect it; else we would be surprised

Put differently: One evidence of being awake (and not in dreamland) is
surprise

A directly related piece by Nicholas Carr
http://www.theatlantic.com/magazine/archive/2008/07/is-google-making-us-stupid/306868/

Relevant at a deeper level is his "IT doesn't matter"
http://www.roughtype.com/?p=644

We software professionals cannot agree with this and keep our
self-respect/sanity/identity. However it's true; so denial remains the
only option.

Michael Torrie

Dec 2, 2013, 10:09:02 PM

to pytho...@python.org

Well back when Google was a young hip company they billed themselves as
a bunch of nerds making stuff for nerds. But yes we should have seen
this coming.

rusi

Dec 2, 2013, 10:26:31 PM

to

So were Bill Gates and Jobs -- nerdy youths.
We tend to not think them so because they are an earlier generation.

Grant Edwards

Dec 2, 2013, 11:27:07 PM

to

On 2013-12-03, Michael Torrie <tor...@gmail.com> wrote:
> On 12/02/2013 06:03 AM, Neil Cerutti wrote:
>> I wish they'd never bought dejanews.
>
> I wish Google hadn't bought a lot of things. Seems like they buy up a
> lot of cool, nerd-centric apps and companies and then turned them into
> apps that do less and do it poorly,

Or they just shut them down. I still wish SageTv was in business. My
SageTv stuff is still running fine, but I don't know what I'm going to
do when it dies. I guess I'll have to go back to MythTV and the
associated huge, loud, noisy front-end boxes.

That said, I'm still pretty happy with Gmail (I use it mostly via
mutt/IMAP rather than the WebUI), and it sure beats the e-mail service
I paid for in the past [it's certainly _way_ better than the Outlook
server they run at work]. The Google search engine still works fine
for me, and my Nexus Galaxy phone has been great.

--
Grant

alex23

Dec 3, 2013, 1:30:05 AM

to

On 3/12/2013 11:17 AM, Michael Torrie wrote:
> And Gmail is also becoming less useful to me. I don't want to use
> hangouts; xmpp and google talk worked just fine. But alas that's
> disappearing.

I really hate Hangouts. If I wanted to use Skype I would be using Skype.

I'm also still unable to understand why Google scrapped Reader and kept
Groups, although I suspect it's because the latter will eventually
integrate more closely with Plus & Hangouts.

Steven D'Aprano

Dec 3, 2013, 2:13:16 AM

to

Not aimed specifically at either Michael or Alex, but a general
observation aimed at you all.

You poor fools you, this is what happens when you give control of the
tools you use to a (near) monopolist whose incentives are not your
incentives.

I mean, Microsoft was bad enough, but they could never reach through the
aether into your computer and remove software they no longer wanted you
to use. The worst that could happen was that they would stop supporting
it and you'd be stuck with old obsolete hardware running old obsolete
software that nevertheless did exactly what you want. Google, on the
other hand, can and will take away software that you use.

--
Steven

Chris Angelico

Dec 3, 2013, 2:01:59 AM

to pytho...@python.org

On Tue, Dec 3, 2013 at 3:27 PM, Grant Edwards <inv...@invalid.invalid> wrote:
> That said, I'm still pretty happy with Gmail (I use it mostly via
> mutt/IMAP rather than the WebUI), and it sure beats the e-mail service
> I paid for in the past [it's certainly _way_ better than the Outlook
> server they run at work]. The Google search engine still works fine
> for me, and my Nexus Galaxy phone has been great.

Things Google does well are those that take advantage of the corpus of
searchable data - like Translate. I wouldn't bother with any other
online translation tool than Google Translate.

ChrisA

Mark Lawrence

Dec 3, 2013, 7:09:41 AM

to pytho...@python.org

On 03/12/2013 01:17, Michael Torrie wrote:
>
> And the list goes on.
>

The love of money...

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

alex23

Dec 3, 2013, 7:23:43 PM

to

On 3/12/2013 5:13 PM, Steven D'Aprano wrote:
> You poor fools you, this is what happens when you give control of the
> tools you use to a (near) monopolist whose incentives are not your
> incentives.

To paraphrase Franklin: those who would give up control to purchase
convenience deserve neither. A lesson hard learned :(

Cameron Simpson

Dec 3, 2013, 7:38:19 PM

to pytho...@python.org

On 28Nov2013 19:46, Arif Khokar <akhok...@wvu.edu> wrote:
> The problem with just using email is that it's a bit more difficult
> to browse archived posts to this group. After I subscribed to this
> group (comp.lang.python) using my news client, I could immediately
> browse posts made as far back as April.

I vastly prefer email.

My first act on joining any mailing list is to download the entire
archive into my local mail store. I have a script for this, for
mailman at least.

Example:

get-mailman-archive http://mail.python.org/pipermail/pythonmac-sig/ >python-mac.mbox

I then suck the whole thing into the folder to which future list
posts will get filed. That way I have the whole archive, and it is
local, and I can examine it with whatever tools take my fancy
(mairix, mutt, grep, etc).
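
For anyone without such a script, here is a rough sketch of the idea in
Python. It is not the get-mailman-archive script mentioned above; the
function names are made up for illustration, and it assumes the usual
pipermail layout where the index page links to one .txt or .txt.gz file
per month, newest month first. Note that pipermail text archives are
only approximately mbox format, so some mail tools may need the result
massaged first.

```python
import gzip
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: monthly archives are linked as href="2013-December.txt.gz"
# (or plain .txt) from the pipermail index page.
ARCHIVE_LINK = re.compile(r'href="([^"]+\.txt(?:\.gz)?)"')


def monthly_archive_urls(index_url, index_html):
    """Return absolute URLs of the monthly archive files linked from the index."""
    return [urljoin(index_url, href) for href in ARCHIVE_LINK.findall(index_html)]


def fetch_archive(index_url, out):
    """Append every monthly archive to the binary file object `out`."""
    with urlopen(index_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # Assumed newest-first on the index page, so reverse to get oldest first.
    for url in reversed(monthly_archive_urls(index_url, html)):
        with urlopen(url) as resp:
            data = resp.read()
        if url.endswith(".gz"):
            data = gzip.decompress(data)
        out.write(data)
        if not data.endswith(b"\n"):
            out.write(b"\n")
```

Usage would be something like opening "python-mac.mbox" in "wb" mode and
passing it to fetch_archive() along with the list's pipermail URL
(with its trailing slash).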

Most mailman lists make their archives readily available.

This cannot be said for the travesty that is Google Groups, and in
fact almost any other list/group/forum run with other software.

Really, most mailing list archives are easily small enough to do
this routinely.

Happy to assist anyone with scripts etc.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

Carpe Daemon - Seize the Background Process
- Paul Tomblin <ab...@freenet2.carleton.ca>

Cameron Simpson

Dec 3, 2013, 7:40:05 PM

to pytho...@python.org

GNU Emacs is a LISP operating system disguised as a word processor.
- Doug Mohney, in comp.arch

Sorry, could not resist. I am, of course, a vi user.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

[...] look at yourself and, while you're at it, drag your eyes over some of
your mates who also ride bikes. They are doers, are they not, and what of the
rest of the population, that tragic ants nest of ever so busy bodies who
scurry from nest to work and back to the nest again at night, to recharge
themselves for another day of spirit-crushing toil? They are the watchers.
- Sanford, REVS, 23dec93

rusi

Dec 3, 2013, 8:39:16 PM

to

On Wednesday, December 4, 2013 6:10:05 AM UTC+5:30, Cameron Simpson wrote:

> > Dennis Lee Bieber writes:
> > > [NNTP] clients provide full-fledged editors
> > and conversely full-fledged editors provide
> > NNTP clients
> GNU Emacs is a LISP operating system disguised as a word processor.
> - Doug Mohney, in comp.arch

In a similar vein, most phones nowadays are just computers
with a pocket-size form-factor and some wireless networking.

So when you say…

> My first act on joining any mailing list is to download the entire
> archive into my local mail store. I have a script for this, for
> mailman at least.

and you happen to own >1 thingys that have general computing
functionality -- phones, laptops, desktops, etc -- do you sync
all your mailing-lists with all of them?

I know friends who have installed a home-data-store…

[I've been resisting getting something like a NAS because each new
thingabob I own is one more thing to maintain. I also know from
past experience that such luddite battles are in the end always
lost -- I'm no technophile but I expect to live and die a techie]

And in spite of all that it still sometimes happens that one has
to work on a 'machine' that is not one's own. What then?

The unfortunate and inexorable conclusion is that when the
(wo)man <-> computer relation goes from 1-1 to 1-many, data and
functionality will move away from 'own-machine' to the cloud.

Will the data be subject to privacy-abuse and worse? Sure
Will the functionality be as good as something one can fine-tune
on one's own computer? heck no!

But in the end, uniform access will trump all that -- compare the
number of vi+emacs+eclipse users with google-doc users…

So to come back full-circle:

Earlier (your quote paraphrased)
Emacs is a full-blown OS -- only lacks a good editor.
Now: replace 'emacs' with 'firefox'.

Chris Angelico

Dec 3, 2013, 9:03:03 PM

to pytho...@python.org

On Wed, Dec 4, 2013 at 12:39 PM, rusi <rusto...@gmail.com> wrote:
> The unfortunate and inexorable conclusion is that when the
> (wo)man <-> computer relation goes from 1-1 to 1-many, data and
> functionality will move away from 'own-machine' to the cloud.
>
> Will the data be subject to privacy-abuse and worse? Sure
> Will the functionality be as good as something one can fine-tune
> on one's own computer? heck no!

The solution often is to run your own central server and have all
devices connect to that. You get full control and still allow any
device to access the same content.

ChrisA

Neil Cerutti

Dec 4, 2013, 9:34:41 AM

to pytho...@python.org

But Franklin's quote doesn't apply when free alternatives exist.
I can use a non-open email system until I don't want to any more,
and switch out when it no longer pleases me.

The cost of switching isn't zero, but it's much easier than
emigrating from a police state.

Moreover, I'll always feel that I deserve more than I actually
do deserve.

--
Neil Cerutti

Rich Kulawiec

Dec 4, 2013, 9:52:23 AM

to pytho...@python.org

(comments from a lurker on python-list)

- Google "groups" is a disaster. It's extremely poorly-run, and is in
fact a disservice to Usenet -- which is alive and well, tyvm, and still used
by many of the most senior and experienced people on the Internet. (While
some newsgroups are languishing and some have almost no traffic, others
are thriving. As it should be.) I could catalog the litany of egregious
mistakes that Google has made, but what's the point? They're clearly
uninterested in fixing them. Their only interest is in slapping the
"Google" label on Usenet -- which is far more important in the evolution
of the Internet than Google will ever be -- so that they can use it
as a marketing vehicle. Worse, Google has completely failed to control
outbound abuse from Google groups, which is why many consider it a
best practice to simply drop all Usenet traffic originating there.

- That said, there is value in bidirectionally gatewaying mailing lists
with corresponding Usenet newsgroups. Usenet's propagation properties often
make it the medium of choice for many people, particularly those in areas
with slow, expensive, erratic, etc. connectivity. Conversely, delivery
of Usenet traffic via email is a better solution for others. Software
like Mailman facilitates this fairly well, even given the impedance
mismatch between SMTP and NNTP.

- Mailing lists/Usenet newsgroups remain, as they've been for a very
long time, the solutions of choice for online discussions. Yes, I'm
aware of web forums: I've used hundreds of them. They suck. They ALL
suck, they just all suck differently. I could spend the next several
thousand lines explaining why, but instead I'll just abbreviate: they
don't handle threading, they don't let me use my editor of choice,
they don't let me build my own archive that I can search MY way including
when I'm offline, they are brittle and highly vulnerable to abuse
and security breaches, they encourage worst practices in writing
style (including top-posting and full-quoting), they translate poorly
to other formats, they are difficult to archive, they're even more
difficult to migrate (whereas Unix mbox format files from 30 years ago
are still perfectly usable today), they aren't standardized, they
aren't easily scalable, they're overly complex, they don't support
proper quoting, they don't support proper attribution, they can't
be easily forwarded, they...oh, it just goes on. My point being that
there's a reason that the IETF and the W3C and NANOG and lots of other
groups that could use anything they want use mailing lists: they work.

- That said, they work *if configured properly*, which unfortunately
these days includes a hefty dose of anti-abuse controls. This list
(for the most part) isn't particularly targeted, but it is occasionally,
and in the spirit of trying to help out, I can assist with that. (I think
it's fair to say I have a little bit of email expertise.) If any of
the list's owners are reading this and want help, please let me know.

- They also work well *if used properly*, which means that participants
should use proper email/news etiquette: line wrap, sane quoting style,
reasonable editing of followups, preservation of threads, all that stuff.
The more people do more of that, the smoother things work. On the other
hand, if nobody does that, the result is impaired communication and
quite often, a chorus of "mailing lists suck" even though the problem
is not the mailing lists: it's the bad habits of the users on them.
(And of course changing mediums won't fix that.)

- To bring this back around to one of the starting points for this
discussion: I think the current setup is functioning well, even given
the sporadic stresses placed on it. I think it would be best to invest
effort in maintaining/improving it as it stands (which is why I volunteered
to do so, see above) rather than migrating to something else.

---rsk

Mark Lawrence

Dec 4, 2013, 10:21:29 AM

to pytho...@python.org

On 04/12/2013 14:34, Neil Cerutti wrote:
> On 2013-12-04, alex23 <wuw...@gmail.com> wrote:

> But Franklin's quote doesn't apply when free alternatives exist.

Free at the point of delivery, someone, somewhere, has given blood,
toil, tears and sweat.

Chris Angelico

Dec 4, 2013, 10:46:22 AM

to pytho...@python.org

On Thu, Dec 5, 2013 at 1:52 AM, Rich Kulawiec <r...@gsp.org> wrote:
> Mailing lists/Usenet newsgroups remain, as they've been for a very
> long time, the solutions of choice for online discussions. Yes, I'm
> aware of web forums: I've used hundreds of them. They suck. They ALL
> suck, they just all suck differently.

I absolutely agree. And Mailman lists are both easy and powerful -
I've deployed a number of them and subscribed to many MANY more - and
play nicely with other internet standards. Instead of having to
remember to check umpteen web-based forums, I just check my emails,
which I do constantly anyway. Adding another mailing list costs me
nothing; adding another forum costs me quite a bit of time.

Ultimately it comes down to this: It would take an enormous amount of
effort for something else to replicate the power of SMTP and/or NNTP,
ergo nothing has achieved that. The open standards mean there are
myriad clients available, and no new protocol or system can ever hope
to compete with that.

ChrisA

Grant Edwards

Dec 4, 2013, 10:50:11 AM

to

On 2013-12-04, Cameron Simpson <c...@zip.com.au> wrote:
> On 30Nov2013 14:25, pec...@pascolo.net <pec...@pascolo.net> wrote:
>> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
>> > [NNTP] clients provide full-fledged editors
>> and conversely full-fledged editors provide
>> NNTP clients
>
> GNU Emacs is a LISP operating system disguised as a word processor.
> - Doug Mohney, in comp.arch

Unix: A set of device drivers used to support the Emacs operating
system.

- Don't remember who, where, or when

--
Grant Edwards               grant.b.edwards        Yow! I feel like a wet
                                  at               parking meter on Darvon!
                              gmail.com

Mark Lawrence

Dec 4, 2013, 11:07:45 AM

to pytho...@python.org

On 04/12/2013 15:50, Grant Edwards wrote:
> On 2013-12-04, Cameron Simpson <c...@zip.com.au> wrote:
>> On 30Nov2013 14:25, pec...@pascolo.net <pec...@pascolo.net> wrote:
>>> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
>>>> [NNTP] clients provide full-fledged editors
>>> and conversely full-fledged editors provide
>>> NNTP clients
>>
>> GNU Emacs is a LISP operating system disguised as a word processor.
>> - Doug Mohney, in comp.arch
>
> Unix: A set of device drivers used to support the Emacs operating
> system.
>
> - Don't remember who, where, or when
>

It's a funny thing the computing world, with some people deriving
operating systems from raincoats, and others editing code with a
domestic household cleaner, what next, I ask myself?

Ned Batchelder

Dec 4, 2013, 11:21:07 AM

to pytho...@python.org

On 12/4/13 11:07 AM, Mark Lawrence wrote:
> On 04/12/2013 15:50, Grant Edwards wrote:

>> On 2013-12-04, Cameron Simpson <c...@zip.com.au> wrote:
>>> On 30Nov2013 14:25, pec...@pascolo.net <pec...@pascolo.net> wrote:
>>>> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
>>>>> [NNTP] clients provide full-fledged editors
>>>> and conversely full-fledged editors provide
>>>> NNTP clients
>>>
>>> GNU Emacs is a LISP operating system disguised as a word processor.
>>> - Doug Mohney, in comp.arch
>>
>> Unix: A set of device drivers used to support the Emacs operating
>> system.
>>
>> - Don't remember who, where, or when
>>
>

> It's a funny thing the computing world, with some people deriving
> operating systems from raincoats, and others editing code with a
> domestic household cleaner, what next, I ask myself?
>

Computing with vacuum cleaners is on the decline at least:
http://www.vax.co.uk/vacuum-cleaners

--Ned.

Travis Griggs

Dec 4, 2013, 11:31:12 AM

to pytho...@python.org

On Dec 4, 2013, at 6:52 AM, Rich Kulawiec <r...@gsp.org> wrote:

> Yes, I'm
> aware of web forums: I've used hundreds of them. They suck. They ALL

> suck, they just all suck differently. I could spend the next several
> thousand lines explaining why, but instead I'll just abbreviate: they
> don't handle threading, they don't let me use my editor of choice,
> they don't let me build my own archive that I can search MY way including
> when I'm offline, they are brittle and highly vulnerable to abuse
> and security breaches, they encourage worst practices in writing
> style (including top-posting and full-quoting), they translate poorly
> to other formats, they are difficult to archive, they're even more
> difficult to migrate (whereas Unix mbox format files from 30 years ago
> are still perfectly usable today), they aren't standardized, they
> aren't easily scalable, they're overly complex, they don't support
> proper quoting, they don't support proper attribution, they can't
> be easily forwarded, they...oh, it just goes on. My point being that
> there's a reason that the IETF and the W3C and NANOG and lots of other
> groups that could use anything they want use mailing lists: they work.

One of the best rants I’ve ever read. Full mental harmonic resonance
while I read this. Hope you don’t mind, but I think I’ll be plagiarizing
your comments in the future. Maybe I’ll post it on a couple of the web
forums I currently have the luxury of regularly hating.

Mark Lawrence

Dec 4, 2013, 11:33:44 AM

to pytho...@python.org

On 04/12/2013 16:21, Ned Batchelder wrote:
> On 12/4/13 11:07 AM, Mark Lawrence wrote:
>> On 04/12/2013 15:50, Grant Edwards wrote:

>>> On 2013-12-04, Cameron Simpson <c...@zip.com.au> wrote:
>>>> On 30Nov2013 14:25, pec...@pascolo.net <pec...@pascolo.net> wrote:
>>>>> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
>>>>>> [NNTP] clients provide full-fledged editors
>>>>> and conversely full-fledged editors provide
>>>>> NNTP clients
>>>>
>>>> GNU Emacs is a LISP operating system disguised as a word processor.
>>>> - Doug Mohney, in comp.arch
>>>
>>> Unix: A set of device drivers used to support the Emacs operating
>>> system.
>>>
>>> - Don't remember who, where, or when
>>>
>>

>> It's a funny thing the computing world, with some people deriving
>> operating systems from raincoats, and others editing code with a
>> domestic household cleaner, what next, I ask myself?
>>
>
> Computing with vacuum cleaners is on the decline at least:
> http://www.vax.co.uk/vacuum-cleaners
>
> --Ned.
>

Well it shouldn't be. It's a well known fact that VMS stands for Very
Much Safer. I'd compare it to inferior products, but not even the
threat of The Comfy Chair will make me type the names.

Cameron Simpson

Dec 4, 2013, 5:47:11 PM

to pytho...@python.org

On 03Dec2013 17:39, rusi <rusto...@gmail.com> wrote:
> On Wednesday, December 4, 2013 6:10:05 AM UTC+5:30, Cameron Simpson wrote:
> > My first act on joining any mailing list is to download the entire
> > archive into my local mail store. I have a script for this, for
> > mailman at least.
>
> and you happen to own >1 thingys that have general computing
> functionality -- phones, laptops, desktops, etc -- do you sync
> all your mailing-lists with all of them?

No. I'm using a laptop as my primary host, and it has the mailing
lists (and all my email). It is usually on and fetches and files
my email; it also forwards _specific_ stuff to a separate mail
account accessed by my phone.

I used to use a home server, but the remote access, while fairly
transparent (script to "ssh then run mutt"), was irritating. And
when I didn't have remote access, very very irritating.

So I'm choosing the better environment with my email local to the laptop and
a select copy of important things (work and friends) copied to an account for
my phone.

[...]

> And in spite of all that it still sometimes happens that one has
> to work on a 'machine' that is not one's own. What then?

Fingers crossed the important stuff gets to my phone. If urgent I
can reply from that, and I'm somewhat up to date on what I care
about. The phone also has (disabled) access to my primary mail spool
for circumstances when the laptop is offline. When online, the
laptop empties that spool and forwards particulars. When offline, I
can consult what's queuing up.

> The unfortunate and inexorable conclusion is that when the
> (wo)man <-> computer relation goes from 1-1 to 1-many, data and
> functionality will move away from 'own-machine' to the cloud.
> Will the data be subject to privacy-abuse and worse? Sure
> Will the functionality be as good as something one can fine-tune
> on one's own computer? heck no!

I'm striving to resist that for now. Privacy. Security. Dependence
on others' hardware and (not mine => wrong!) technical choices of
software.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

All it takes is working on someone elses program to understand why they call
it "code". - Henry O. Farad <l...@netcom.COM>

Roy Smith

Dec 4, 2013, 7:58:54 PM

to

In article <mailman.3565.1386170...@python.org>,

Rich Kulawiec <r...@gsp.org> wrote:

> Yes, I'm
> aware of web forums: I've used hundreds of them. They suck. They ALL
> suck, they just all suck differently. I could spend the next several
> thousand lines explaining why, but instead I'll just abbreviate: they
> don't handle threading, they don't let me use my editor of choice,
> they don't let me build my own archive that I can search MY way including
> when I'm offline, they are brittle and highly vulnerable to abuse
> and security breaches, they encourage worst practices in writing
> style (including top-posting and full-quoting), they translate poorly
> to other formats, they are difficult to archive, they're even more
> difficult to migrate (whereas Unix mbox format files from 30 years ago
> are still perfectly usable today), they aren't standardized, they
> aren't easily scalable, they're overly complex, they don't support
> proper quoting, they don't support proper attribution, they can't
> be easily forwarded, they...oh, it just goes on.

The real problem with web forums is they conflate transport and
presentation into a single opaque blob, and are pretty much universally
designed to be a closed system. Mail and usenet were both engineered to
make a sharp division between transport and presentation, which meant it
was possible to evolve each at their own pace.

Mostly that meant people could go off and develop new client
applications which interoperated with the existing system. But, it also
meant that transport layers could be switched out (as when NNTP
gradually, but inexorably, replaced UUCP as the primary usenet transport
layer).
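
To make the split concrete, here is a small Python sketch using only
the standard library. The function names and the example address are
invented for illustration. The point is just that the rendering
functions never ask where the messages came from, so either side can be
swapped without touching the other.

```python
import mailbox
from email.message import EmailMessage


# --- Presentation: two interchangeable views of the same message ---
def render_plain(msg):
    """Compact one-line summary."""
    return f"{msg['From']}: {msg['Subject']}"


def render_verbose(msg):
    """Headers followed by the body."""
    return f"From: {msg['From']}\nSubject: {msg['Subject']}\n\n{msg.get_payload()}"


def show(messages, render):
    """Apply a renderer; knows nothing about how messages were obtained."""
    return [render(m) for m in messages]


# --- Transport/storage: two interchangeable sources of messages ---
def in_memory_messages():
    """Stand-in for messages that arrived over some network protocol."""
    m = EmailMessage()
    m["From"] = "roy@example.invalid"  # hypothetical address
    m["Subject"] = "transport vs presentation"
    m.set_content("Keep them separate.")
    return [m]


def mbox_messages(path):
    """Messages read from a local mbox file."""
    return list(mailbox.mbox(path))
```

show(in_memory_messages(), render_plain) and
show(mbox_messages("some.mbox"), render_verbose) both work, and adding a
third source or a third view does not disturb the existing ones.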

rusi

Dec 6, 2013, 2:13:54 AM

to

On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

There is a deep assumption hovering round-about the above -- what I
will call the 'Unix assumption(s)'. But before that, just a check on
terminology. By 'presentation' you mean what people normally call
'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
sendmail, exim, qmail etc etc -- what normally are called
'mail-servers.' Right??

Assuming this is the intended meaning of the terminology (yeah it's
clearer terminology than the usual and yeah I'm also a 'Unix-guy'),
here's the 'Unix-assumption':

- human communication…
(is not very different from)
- machine communication…
(can be done by)
- text…
(for which)
- ASCII is fine…
(which is just)
- bytes…
(inside/between byte-memory-organized)
- von Neumann computers

To the extent that these assumptions are invalid, the 'opaque-blob'
may well be preferable.

Roy Smith

Dec 6, 2013, 2:36:30 AM

to

In article <51007240-6bc9-4f0b...@googlegroups.com>,

rusi <rusto...@gmail.com> wrote:

> On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

> > The real problem with web forums is they conflate transport and
> > presentation into a single opaque blob, and are pretty much universally
> > designed to be a closed system. Mail and usenet were both engineered to
> > make a sharp division between transport and presentation, which meant it
> > was possible to evolve each at their own pace.
>
> > Mostly that meant people could go off and develop new client
> > applications which interoperated with the existing system. But, it also
> > meant that transport layers could be switched out (as when NNTP
> > gradually, but inexorably, replaced UUCP as the primary usenet transport
> > layer).
>
> There is a deep assumption hovering round-about the above -- what I
> will call the 'Unix assumption(s)'.

It has nothing to do with Unix. The separation of transport from
presentation is just as valid on Windows, Mac, etc.

> But before that, just a check on
> terminology. By 'presentation' you mean what people normally call
> 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
> sendmail, exim, qmail etc etc -- what normally are called
> 'mail-servers.' Right??

Yes.

> Assuming this is the intended meaning of the terminology (yeah it's
> clearer terminology than the usual and yeah I'm also a 'Unix-guy'),
> here's the 'Unix-assumption':
>
> - human communication�
> (is not very different from)
> - machine communication�
> (can be done by)
> - text�
> (for which)
> - ASCII is fine�
> (which is just)
> - bytes�
> (inside/between byte-memory-organized)
> - von Neumann computers
>
> To the extent that these assumptions are invalid, the 'opaque-blob'
> may well be preferable.

I think you're off on the wrong track here. This has nothing to do with
plain text (ascii or otherwise). It has to do with divorcing how you
store and transport messages (be they plain text, HTML, or whatever)
from how a user interacts with them.

Take something like Wikipedia (by which, I really mean, MediaWiki, which
is the underlying software package). Most people think of Wikipedia as
a web site. But, there's another layer below that which lets you get
access to the contents of articles, navigate all the rich connections
like category trees, and all sorts of metadata like edit histories.
Which means, if I wanted to (and many examples of this exist), I can
write my own client which presents the same information in different
ways.
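
As a rough illustration of that lower layer, the sketch below builds a
request URL for the MediaWiki web API asking for the raw wikitext of a
page. The helper name is made up, the parameters are the ones the API
accepts as far as I know it today, and nothing is actually fetched here;
a client would request the URL and render the returned JSON however it
likes.

```python
from urllib.parse import urlencode

# English Wikipedia's API endpoint; other MediaWiki sites have their own.
API = "https://en.wikipedia.org/w/api.php"


def wikitext_query_url(title):
    """Build a URL asking for the latest revision's wikitext of `title`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    return API + "?" + urlencode(params)
```

For example, wikitext_query_url("Emacs") gives a URL whose response
contains the article source rather than the rendered web page.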

rusi

Dec 6, 2013, 2:42:57 AM

to

On Thursday, December 5, 2013 4:17:11 AM UTC+5:30, Cameron Simpson wrote:

Thanks Cameron. I am not sure how to parse the last sentence but on the
whole thanks for a fair, balanced and honest review.

I think I have similar sentiments, viz. I am not one to gush about
the latest gizmodic blissiness; however, whenever I've resisted and been
a late adopter -- color monitor, laptop, cellphone, credit card etc
etc -- in the end I've had to move with the times and have not been
better off for my earlier resistance.

rusi

Dec 6, 2013, 8:03:57 AM

to

On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote:

Evidently (and completely inadvertently) this exchange has just
illustrated one of the inadmissible assumptions:

"unicode as a medium is universal in the same way that ASCII used to be"

I wrote a number of ellipsis characters, i.e. code point U+2026, as in:

- human communication…
(is not very different from)

- machine communication…

Somewhere between my sending and your quoting those ellipses became
the replacement character U+FFFD:

> > - human communication�
> > (is not very different from)
> > - machine communication�

Leaving aside whose fault this is (very likely buggy google groups),
this mojibaking cannot happen if the assumption "All text is ASCII"
were to uniformly hold.
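
One plausible way this can happen (a guess, not a diagnosis of what
Google Groups or any newsreader actually did here) is that the ellipsis
is encoded with one codec and decoded with another. A Python sketch:

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS, the character originally typed

# The same character is a different byte sequence in each encoding.
as_utf8 = ELLIPSIS.encode("utf-8")      # three bytes: E2 80 A6
as_cp1252 = ELLIPSIS.encode("cp1252")   # one byte: 85

# Decode with the wrong codec: a lone 0x85 byte is not valid UTF-8,
# so a lenient decoder substitutes U+FFFD, the replacement character.
garbled = as_cp1252.decode("utf-8", errors="replace")

# ASCII-only text goes through the same mismatch unharmed.
safe = "...".encode("cp1252").decode("utf-8", errors="replace")
```

Here garbled is the single character U+FFFD while safe is still three
plain dots, which is the asymmetry being described.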

Of course with Unicode also this can be made to not happen, but that
is fragile and error-prone. And that is because ASCII (not extended)
is ONE thing, whereas Unicode is a hopelessly motley, inconsistent
variety.

With Unicode there are in-memory formats, transportation formats e.g.
UTF-8, strange beasties like FSR (which then hopelessly and
inveterately tickle our resident trolls!), multi-layer encodings (in
HTML), BOMs and unnecessary/inconsistent BOMs (in Microsoft Notepad).
With ASCII, ASCII is ASCII; i.e. "ABC" is 65,66,67 whether it's in-core,
in-file, in-pipe or whatever. OK, there are a few wrinkles to this,
e.g. the null-terminator in C-strings. I think this is the exception to
the rule that in classic Unix, ASCII is completely inter-operable and
therefore a universal data-structure for inter-process or inter-machine
communication.
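
A quick way to see the "one thing versus many" contrast in Python is to
look at the byte values the same three letters produce under a few
codecs (the variable names are only for illustration):

```python
text = "ABC"

ascii_bytes = list(text.encode("ascii"))    # 65, 66, 67
utf8_bytes = list(text.encode("utf-8"))     # same as ASCII for this text
utf16_le = list(text.encode("utf-16-le"))   # each letter followed by a zero byte
utf8_bom = list(text.encode("utf-8-sig"))   # BOM bytes 239, 187, 191 come first
```

The ASCII and UTF-8 results agree for ASCII-only text, but the UTF-16
and BOM-prefixed forms do not, so a byte-oriented filter has to know
which one it has been handed.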

It is this universal data structure that makes classic unix pipes and
filters possible and easy (of which your separation of presentation
and transportation is just one case).

Give it up and the composability goes with it.

Go up from the ASCII -> Unicode level to the plain-text -> hypertext
(aka html) level and these composability problems hit with redoubled
force.

> Take something like Wikipedia (by which, I really mean, MediaWiki, which
> is the underlying software package). Most people think of Wikipedia as
> a web site. But, there's another layer below that which lets you get
> access to the contents of articles, navigate all the rich connections
> like category trees, and all sorts of metadata like edit histories.
> Which means, if I wanted to (and many examples of this exist), I can
> write my own client which presents the same information in different
> ways.

Not sure what your point is.
HTML is a universal data-structuring format -- ok for presentation, bad for
data-structuring.
SQL databases (assuming that's the MediaWiki backend) are another -- ok for
data-structuring, bad for presentation.

MediaWiki mediates between the two formats.

Beyond that I lost you... what are you trying to say??

Chris Angelico

Dec 6, 2013, 8:19:04 AM

to pytho...@python.org

On Sat, Dec 7, 2013 at 12:03 AM, rusi <rusto...@gmail.com> wrote:
> SQL databases (assuming that's the MediaWiki backend) are another -- ok for
> data-structuring, bad for presentation.

No, SQL databases don't store structured text. MediaWiki just stores a
single blob (not in the database sense of that word) of text.

ChrisA

rusi

Dec 6, 2013, 8:32:35 AM

to

On Friday, December 6, 2013 6:49:04 PM UTC+5:30, Chris Angelico wrote:

> On Sat, Dec 7, 2013 at 12:03 AM, rusi wrote:
> > SQL databases (assuming that's the MediaWiki backend) are another -- ok for
> > data-structuring, bad for presentation.

> No, SQL databases don't store structured text. MediaWiki just stores a
> single blob (not in the database sense of that word) of text.

I guess we are using 'structured' in different ways. All I am saying
is that mediawiki which seems to present as html, actually stores its
stuff as SQL -- nothing more or less structured than the schemas here:
http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

Chris Angelico

Dec 6, 2013, 8:48:19 AM

to pytho...@python.org

On Sat, Dec 7, 2013 at 12:32 AM, rusi <rusto...@gmail.com> wrote:
> I guess we are using 'structured' in different ways. All I am saying
> is that mediawiki which seems to present as html, actually stores its
> stuff as SQL -- nothing more or less structured than the schemas here:
> http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

Yeah, but the structure is all about the metadata. Ultimately, there's
one single text field containing the entire content as you would see
it in the page editor: wiki markup in straight text. MediaWiki uses an
SQL database to store that lump of text, but ultimately the
relationship is between wikitext and HTML, no SQL involvement.

Wiki markup is reasonable for text structuring. (Not for generic data
structuring, but it's decent for text.) Same with reStructuredText,
used for PEPs. An SQL database is a good way to store mappings of
"this key, this tuple of data" and retrieve them conveniently,
including (and this is the bit that's more complicated in a straight
Python dictionary) using any value out of the tuple as the key, and
(and this is where a dict *really* can't hack it) storing/retrieving
more data than fits in memory. The two are orthogonal. Your point is
better supported by wikitext than by SQL, here, except that there
aren't fifty other systems that parse and display wikitext. In fact,
what you're suggesting is a good argument for deprecating HTML email
in favour of RST email, and using docutils to render the result either
as HTML (for webmail users) or as some other format. And I wouldn't be
against that :) But good luck convincing the world that Microsoft
Outlook is doing the wrong thing.
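
A minimal sketch of the dict-versus-SQL contrast above, using the standard
sqlite3 module. The table, the column names and the rows are invented for
illustration and have nothing to do with MediaWiki's real schema.

```python
import sqlite3

# Hypothetical rows: (title, last_editor, year). Invented for illustration.
rows = [
    ("Main_Page", "alice", 2011),
    ("Sandbox", "bob", 2012),
    ("Help", "alice", 2013),
]

# A dict maps one chosen key to a tuple of data ...
by_title = {title: (editor, year) for title, editor, year in rows}

# ... so looking up by any other field means scanning every entry.
titles_by_alice_dict = sorted(
    title for title, (editor, _year) in by_title.items() if editor == "alice"
)

# An SQL table can use any column as the lookup key, optionally indexed,
# and the data may live on disk rather than in memory.
conn = sqlite3.connect(":memory:")  # a file path here would store it on disk
conn.execute(
    "CREATE TABLE pages (title TEXT PRIMARY KEY, last_editor TEXT, year INTEGER)"
)
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX pages_by_editor ON pages (last_editor)")

titles_by_alice_sql = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM pages WHERE last_editor = ? ORDER BY title",
        ("alice",),
    )
]
conn.close()

print(titles_by_alice_dict)
print(titles_by_alice_sql)
```

Both lookups give the same answer; the difference is that the dict has to
be scanned (or a second dict maintained by hand), while the database does
the secondary-key bookkeeping itself.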

ChrisA

rusi

Dec 6, 2013, 9:11:23 AM

On Friday, December 6, 2013 7:18:19 PM UTC+5:30, Chris Angelico wrote:

> On Sat, Dec 7, 2013 at 12:32 AM, rusi wrote:
> > I guess we are using 'structured' in different ways. All I am saying
> > is that mediawiki which seems to present as html, actually stores its
> > stuff as SQL -- nothing more or less structured than the schemas here:
> > http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

> Yeah, but the structure is all about the metadata.

Ok (I'd drop the 'all')

> Ultimately, there's one single text field containing the entire content

Right

> as you would see it in the page editor: wiki markup in straight text.

Aha! There you are! It's the 'page editor' text here, and not the HTML that
'view source' (control-u) in a browser would show. And MediaWiki
is the software that mediates.

The usual direction (seen by users of Wikipedia) is that MediaWiki
takes this text, along with the other unrelated stuff (metadata?) seen
around it -- sidebar, tabs, CSS settings etc. -- and munges it all into HTML.

The other direction (seen by editors of Wikipedia) is that you edit a
page, and that page, its history etc. will show the changes,
reflecting the fact that the SQL content has changed.

> MediaWiki uses an SQL database to store that lump of text, but
> ultimately the relationship is between wikitext and HTML, no SQL
> involvement.

Dunno what you mean. Every time someone browses wikipedia, things are
getting pulled out of the SQL and munged into the html (s)he sees.

Chris Angelico

Dec 6, 2013, 9:51:13 AM

to pytho...@python.org

On Sat, Dec 7, 2013 at 1:11 AM, rusi <rusto...@gmail.com> wrote:
> Aha! There you are! Its 'page editor' here and not the html which
> 'display source' (control-u) which a browser would show. And wikimedia
> is the software that mediates.
>
> The usual direction (seen by users of wikipedia) is that wikimedia
> takes this text, along with the other unrelated (metadata?) seen
> around -- sidebar, tabs etc, css settings and munges it all into html
>
> The other direction (seen by editors of wikipedia) is that you edit a
> page and that page and history etc will show the changes,
> reflecting the fact that the SQL content has changed.

MediaWiki is fundamentally very similar to a structure that I'm trying
to deploy for a community web site that I host, approximately thus:

* A git repository stores a bunch of RST files
* A script auto-generates index files based on the presence of certain
file names, and renders via rst2html
* The HTML pages are served as static content

MediaWiki is like this:

* Each page has a history, represented by a series of state snapshots
of wikitext
* On display, the wikitext is converted to HTML and served.

The main difference is that MediaWiki is optimized for rapid and
constant editing, where what I'm pushing for is optimized for less
common edits that might span multiple files. (MW has no facility for
atomically changing multiple pages, and atomically reverting those
changes, and so on. Each page stands alone.) They're still broadly
doing the same thing: storing marked-up text and rendering HTML. The
fact that one uses an SQL database and the other uses a git repository
is actually quite insignificant - it's as significant as the choice of
whether to store your data on a hard disk or an SSD. The system is no
different.

>> MediaWiki uses an SQL database to store that lump of text, but
>> ultimately the relationship is between wikitext and HTML, no SQL
>> involvement.
>
> Dunno what you mean. Every time someone browses wikipedia, things are
> getting pulled out of the SQL and munged into the html (s)he sees.

Yes, but that's just mechanics. The fact that the PHP scripts to
operate Wikipedia are being pulled off a file system doesn't mean that
MediaWiki is an ext3-to-HTML renderer. It's a wikitext-to-HTML
renderer.

Anyway. As I said, your point is still mostly there, as long as you
use wikitext rather than SQL.

ChrisA

Steven D'Aprano

Dec 6, 2013, 2:00:18 PM

On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:

> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissable assumptions:
>
> "unicode as a medium is universal in the same way that ASCII used to be"

Ironically, your post was not Unicode.

Seriously. I am 100% serious.

Your post was sent using a legacy encoding, Windows-1252, also known as
CP-1252, which is most certainly *not* Unicode. Whatever software you
used to send the message correctly flagged it with a charset header:

Content-Type: text/plain; charset=windows-1252

Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle
encodings correctly (or at all!), it screws up the encoding then sends a
reply with no charset line at all. This is one bug that cannot be blamed
on Google Groups -- or on Unicode.
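
As a rough sketch of what honouring that header means, here is how the
standard email package reads the declared charset and uses it to decode
the body. The message below is a made-up two-line example, not the actual
post.

```python
from email import message_from_bytes

# A made-up raw message: one header, a blank line, then a body whose
# last byte before the line ending is 0x85.
raw = (
    b"Content-Type: text/plain; charset=windows-1252\r\n"
    b"\r\n"
    b"human communication\x85\r\n"
)

msg = message_from_bytes(raw)

# The charset parameter declared in the Content-Type header.
charset = msg.get_content_charset()

# decode=True returns the body as bytes; turning those bytes into text
# is only correct if the declared charset is used.
body = msg.get_payload(decode=True).decode(charset)

print(charset)
print(body.strip())
```

A client that ignores the header, or drops it when replying, has thrown
away the one piece of information needed to interpret byte 0x85.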

> I wrote a number of ellipsis characters ie codepoint 2026 as in:

Actually you didn't. You wrote a number of ellipsis characters, hex byte
\x85 (decimal 133), in the CP1252 charset. That happens to be mapped to
code point U+2026 in Unicode, but the two are as distinct as ASCII and
EBCDIC.
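
The byte-versus-code-point distinction is easy to see at a Python 3
prompt. This is only an illustration of how the two encodings relate and
of what a wrong guess produces; it is not a diagnosis of what any
particular newsreader or gateway actually did.

```python
ellipsis = "\u2026"  # HORIZONTAL ELLIPSIS, one Unicode code point

# The same character is different bytes in different encodings.
cp1252_bytes = ellipsis.encode("cp1252")   # a single byte, 0x85
utf8_bytes = ellipsis.encode("utf-8")      # three bytes, 0xE2 0x80 0xA6

# Guessing the wrong encoding when decoding is where things go wrong.
# A lone 0x85 is not valid UTF-8, so it becomes the replacement character.
misread_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as CP-1252 turn into three unrelated characters.
misread_as_cp1252 = utf8_bytes.decode("cp1252")

print(cp1252_bytes, utf8_bytes)
print(ascii(misread_as_utf8), ascii(misread_as_cp1252))
```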

> Somewhere between my sending and your quoting those ellipses became the
> replacement character FFFD

Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
encodings and character sets. It doesn't just assume things are ASCII;
it makes a half-hearted attempt to be charset-aware, and does it badly. I can
only imagine that it was written back in the Dark Ages where there were a
lot of different charsets in use but no conventions for specifying which
charset was in use. Or perhaps the author was smoking crack while coding.

> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII" were
> to uniformly hold.

This is incorrect. People forget that ASCII has evolved since the first
version of the standard in 1963. There have actually been five versions
of the ASCII standard, plus one unpublished version. (And that's not
including the things which are frequently called ASCII but aren't.)

ASCII-1963 didn't even include lowercase letters. It was also missing some
graphic characters like braces, and included at least two characters no
longer used, the up-arrow and left-arrow. The control characters were
also significantly different from today's.

ASCII-1965 was unpublished and unused. I don't know the details of what
it changed.

ASCII-1967 is a lot closer to the ASCII in use today. It made
considerable changes to the control characters, moving, adding, removing,
or renaming at least half a dozen control characters. It officially added
lowercase letters, braces, and some others. It replaced the up-arrow
character with the caret and the left-arrow with the underscore. It was
ambiguous, allowing variations and substitutions, e.g.:

- character 33 was permitted to be either the exclamation
  mark ! or the logical OR symbol |

- consequently character 124 (vertical bar) was always
  displayed as a broken bar ¦, which explains why even today
  many keyboards show it that way

- character 35 was permitted to be either the number sign # or
  the pound sign £

- character 94 could be either a caret ^ or a logical NOT ¬

Even the humble comma could be pressed into service as a cedilla.

ASCII-1968 didn't change any characters, but allowed the use of LF on its
own. Previously, you had to use either LF/CR or CR/LF as newline.

ASCII-1977 removed the ambiguities from the 1967 standard.

The most recent version is ASCII-1986 (also known as ANSI X3.4-1986).
Unfortunately I haven't been able to find out what changes were made -- I
presume they were minor, and didn't affect the character set.

So as you can see, even with actual ASCII, you can have mojibake. It's
just not normally called that. But if you are given an arbitrary ASCII
file of unknown age, containing code 94, how can you be sure it was
intended as a caret rather than a logical NOT symbol? You can't.

Then there are at least 30 official variations of ASCII, strictly
speaking part of ISO-646. These 7-bit codes were commonly called "ASCII"
by their users, despite the differences, e.g. replacing the dollar sign $
with the international currency sign ¤, or replacing the left brace
{ with the letter s with caron š.

One consequence of this is that the MIME type for ASCII text is called
"US ASCII", despite the redundancy, because many people expect "ASCII"
alone to mean whatever national variation they are used to.

But it gets worse: there are proprietary variations on ASCII which are
commonly called "ASCII" but aren't, including dozens of 8-bit so-called
"extended ASCII" character sets, which is where the problems *really*
pile up. Invariably back in the 1980s and early 1990s people used to call
these "ASCII" no matter that they used 8-bits and contained anything up
to 256 characters.

Just because somebody calls something "ASCII", doesn't make it so; even
if it is ASCII, doesn't mean you know which version of ASCII; even if you
know which version, doesn't mean you know how to interpret certain codes.
It simply is *wrong* to think that "good ol' plain ASCII text" is
unambiguous and devoid of problems.

> With unicode there are in-memory formats, transportation formats eg
> UTF-8,

And the same applies to ASCII.

ASCII is a *seven-bit code*. It will work fine on computers where the
word-size is seven bits. If the word-size is eight bits, or more, you
have to pad the ASCII code. How do you do that? Pad the most-significant
end or the least significant end? That's a choice there. How do you pad
it, with a zero or a one? That's another choice. If your word-size is
more than eight bits, you might even pad *both* ends.

In C, a char is defined as the smallest addressable unit of the machine
that can contain the basic character set, not necessarily eight bits.
Implementations of C and C++ sometimes reserve 8, 9, 16, 32, or 36 bits
as a "byte" and/or char. Your in-memory representation of ASCII "a" could
easily end up as bits 001100001 or 0000000001100001.
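
The padding choices can be made visible with format(). This sketch just
prints the bit patterns mentioned above; the 9-bit and 16-bit widths are
the examples from the text, not a claim about any particular machine.

```python
code = ord("a")  # 97

seven_bits = format(code, "07b")      # the bare 7-bit ASCII code
nine_bits = format(code, "09b")       # zero-padded at the most significant end
sixteen_bits = format(code, "016b")   # the same, in a 16-bit cell

# Padding at the least significant end instead is a different choice
# and yields a different number.
padded_low_end = seven_bits + "00"

print(seven_bits, nine_bits, sixteen_bits, padded_low_end)
```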

And then there is the question of whether ASCII characters should be Big
Endian or Little Endian. I'm referring here to bit endianness, rather
than bytes: should character 'a' be represented as bits 1100001 (most
significant bit to the left) or 1000011 (least significant bit to the
left)? This may be relevant with certain networking protocols. Not all
networking protocols are big-endian, nor are all processors. The Ada
programming language even supports both bit orders.

When transmitting ASCII characters, the networking protocol could include
various start and stop bits and parity codes. A single 7-bit ASCII
character might be anything up to 12 bits in length on the wire. It is
simply naive to imagine that the transmission of ASCII codes is the same
as the in-memory or on-disk storage of ASCII.
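
To give a feel for the framing overhead, here is a sketch of one common
asynchronous serial layout: a start bit, seven data bits sent least
significant bit first, an even parity bit, and one or two stop bits. The
function name and this particular layout are assumptions chosen for
illustration; real links vary in every one of these choices.

```python
def frame_7bit(ch, stop_bits=1):
    """Return the list of bits put on the wire for one 7-bit character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character: %r" % ch)
    # Seven data bits, least significant bit first.
    data = [(code >> i) & 1 for i in range(7)]
    # Even parity: the parity bit makes the total count of 1 bits even.
    parity = sum(data) % 2
    # Start bit is 0, stop bits are 1.
    return [0] + data + [parity] + [1] * stop_bits


frame = frame_7bit("a")
long_frame = frame_7bit("a", stop_bits=2)
print(frame, len(frame))
print(long_frame, len(long_frame))
```

Seven bits of payload become ten or eleven bits on the wire, which is the
kind of gap between storage and transmission described above.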

You're lucky to be active in a time when most common processors have
standardized on a single bit-order, and when most (but not all) network
protocols have done the same. But that doesn't mean that these issues
don't exist for ASCII. If you get a message that purports to be ASCII
text but looks like this:

"\tS\x1b\x1b{\x01u{'\x1b\x13!"

you should suspect strongly that it is "Hello World!" which has been
accidentally bit-reversed by some rogue piece of hardware.
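
A small sketch for checking this by hand. The helper names are made up
for the example. One thing worth noting: the string quoted above is what
you get by reversing the binary digits of each code *without* padding to
seven bits; a strict 7-bit reversal differs in two places, the space and
the exclamation mark, since those two codes have only six significant
bits.

```python
def reverse_bits(code, width=7):
    """Reverse the lowest `width` bits of an integer character code."""
    bits = format(code, "0{}b".format(width))
    return int(bits[::-1], 2)


def reverse_unpadded(code):
    """Reverse the binary digits of a code with no leading-zero padding."""
    return int(bin(code)[2:][::-1], 2)


message = "Hello World!"

# Strict reversal of all seven bits of every character.
strict = "".join(chr(reverse_bits(ord(c))) for c in message)

# Reversal of the unpadded digits, which reproduces the quoted string.
unpadded = "".join(chr(reverse_unpadded(ord(c))) for c in message)

# A strict 7-bit reversal is its own inverse, so it can be undone.
restored = "".join(chr(reverse_bits(ord(c))) for c in strict)

print(repr(strict))
print(repr(unpadded))
print(restored)
```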

--
Steven

Gene Heskett

Dec 6, 2013, 2:34:54 PM

to pytho...@python.org

On Friday 06 December 2013 14:30:06 Steven D'Aprano did opine:

> displayed as a broken bar ¦, which explains why even today
> many keyboards show it that way
>
> - character 35 was permitted to be either the number sign # or
> the pound sign £
>
> - character 94 could be either a caret ^ or a logical NOT ¬
>
> Even the humble comma could be pressed into service as a cedilla.
>
> ASCII-1968 didn't change any characters, but allowed the use of LF on
> its own. Previously, you had to use either LF/CR or CR/LF as newline.
>
> ASCII-1977 removed the ambiguities from the 1967 standard.
>
> The most recent version is ASCII-1986 (also known as ANSI X3.4-1986).
> Unfortunately I haven't been able to find out what changes were made --
> I presume they were minor, and didn't affect the character set.
>
> So as you can see, even with actual ASCII, you can have mojibake. It's
> just not normally called that. But if you are given an arbitrary ASCII
> file of unknown age, containing code 94, how can you be sure it was
> intended as a caret rather than a logical NOT symbol? You can't.
>
> Then there are at least 30 official variations of ASCII, strictly
> speaking part of ISO-646. These 7-bit codes were commonly called "ASCII"
> by their users, despite the differences, e.g. replacing the dollar sign
> $ with the international currency sign ¤, or replacing the left brace
> { with the letter s with caron š.

You can lay a lot of the ASCII ambiguity on D.E.C. and their VT series
terminals; anything newer than a VT100 made liberal use of the most
significant bit in a character. Having written an emulator for the VT220,
I can testify that really getting it right was a right pain in the ass.
And then I added zmodem triggers and detections.

Cheers, Gene
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>

Mother Earth is not flat!
A pen in the hand of this president is far more
dangerous than 200 million guns in the hands of
law-abiding citizens.

Roy Smith

Dec 6, 2013, 3:54:03 PM

to pytho...@python.org

Steven D'Aprano <steve+comp.lang.python <at> pearwood.info> writes:

> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
> encodings and character sets. It doesn't just assume things are ASCII,
> but makes a half-hearted attempt to be charset-aware, but badly. I can
> only imagine that it was written back in the Dark Ages

Indeed. The basic codebase probably goes back 20 years. I'm posting this
from gmane, just so people don't think I'm a total luddite.

> When transmitting ASCII characters, the networking protocol could include
> various start and stop bits and parity codes. A single 7-bit ASCII
> character might be anything up to 12 bits in length on the wire.

Not to mention that some really old hardware used 1.5 stop bits!

Gregory Ewing

Dec 6, 2013, 6:27:58 PM

The point is the existence of an alternative interface that's
designed for use by other programs rather than humans.

This is what web forums are missing. If it existed, one could
easily create an alternative client with a newsreader-like
interface. Without it, such a client would have to be a
monstrosity that worked by screen-scraping the html.

It's not about the format of the messages themselves -- that
could be text, or html, or reST, or bbcode or whatever. It's
about the *framing* of the messages, and being able to
query them by their metadata.
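
For comparison, this is roughly what asking MediaWiki for a page's raw
wikitext looks like through its web API, as opposed to scraping the
rendered HTML. The sketch only builds the request URL and makes no
network call; the helper name is made up, and the parameter set shown is
one plausible query, so check the API documentation before relying on it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# English Wikipedia's API endpoint; other MediaWiki sites have their own.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def wikitext_query_url(title):
    """Build a URL asking for the latest revision content of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = wikitext_query_url("Python (programming language)")
query = parse_qs(urlsplit(url).query)
print(url)
```

The reply to such a request is structured data keyed by metadata (page,
revision, format), which is exactly the kind of framing a web forum would
need to expose for third-party clients to exist.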

--
Greg

Chris Angelico

Dec 6, 2013, 6:42:13 PM

to pytho...@python.org

On Sat, Dec 7, 2013 at 6:00 AM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> - character 33 was permitted to be either the exclamation
> mark ! or the logical OR symbol |
>
> - consequently character 124 (vertical bar) was always
> displayed as a broken bar ¦, which explains why even today
> many keyboards show it that way
>
> - character 35 was permitted to be either the number sign # or
> the pound sign £
>
> - character 94 could be either a caret ^ or a logical NOT ¬

Yeah, good fun stuff. I first met several of these ambiguities in the
OS/2 REXX documentation, which detailed the language's operators by
specifying their byte values as well as their characters - for
instance, this quote from the docs (yeah, I still have it all here):

"""
Note: Depending upon your Personal System keyboard and the code page
you are using, you may not have the solid vertical bar to select. For
this reason, REXX also recognizes the use of the split vertical bar as
a logical OR symbol. Some keyboards may have both characters. If so,
they are not interchangeable; only the character that is equal to the
ASCII value of 124 works as the logical OR. This type of mismatch can
also cause the character on your screen to be different from the
character on your keyboard.
"""
(The front material on the docs says "(C) Copyright IBM Corp. 1987,
1994. All Rights Reserved.")

It says "ASCII value" where on this list we would be more likely to
call it "byte value", and I'd prefer to say "represented by" rather
than "equal to", but nonetheless, this is still clearly distinguishing
characters and bytes. The language spec is on characters, but
ultimately the interpreter is going to be looking at bytes, so when
there's a problem, it's byte 124 that's the one defined as logical OR.
Oh, and note the copyright date. The byte/char distinction isn't new.

ChrisA

Ned Batchelder

Dec 6, 2013, 9:24:50 PM

to pytho...@python.org

On 12/6/13 8:03 AM, rusi wrote:
>> I think you're off on the wrong track here. This has nothing to do with
>> plain text (ascii or otherwise). It has to do with divorcing how you
>> store and transport messages (be they plain text, HTML, or whatever)
>> from how a user interacts with them.
>
> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissable assumptions:
>
> "unicode as a medium is universal in the same way that ASCII used to be"
>
> I wrote a number of ellipsis characters ie codepoint 2026 as in:
>
> - human communication…
> (is not very different from)
> - machine communication…
>
> Somewhere between my sending and your quoting those ellipses became
> the replacement character FFFD
>
>>> > > - human communication�
>>> > >(is not very different from)
>>> > > - machine communication�
> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII"
> were to uniformly hold.
>
> Of course with unicode also this can be made to not happen, but that
> is fragile and error-prone. And that is because ASCII (not extended)
> is ONE thing in a way that unicode is hopelessly a motley inconsistent
> variety.

You seem to be suggesting that we should stick to ASCII. There are of
course languages that need more than just the Latin alphabet. How would
you suggest we support them? Or maybe I don't understand?

--Ned.

rusi

Dec 6, 2013, 9:33:39 PM

Oof! That's a lot of data to digest! Thanks anyway.

There's one thing I want to get into:

> Your post was sent using a legacy encoding, Windows-1252, also known as
> CP-1252, which is most certainly *not* Unicode. Whatever software you
> used to send the message correctly flagged it with a charset header:

What the hell! I am using firefox 25.0 in debian-testing and posting via GG.

$ locale
shows me:
LANG=en_US.UTF-8

and a bunch of other things all en_US.UTF-8.

For the most part when I point FF at any site and go to view ->
character-encoding, it says Unicode (UTF-8).

However when I go to anything in the python archives:
https://mail.python.org/pipermail/python-list/2013-December/

FF shows it as Western (Windows-1252)

That seems to suggest that something is not right with the python
mailing list config. No??

Chris Angelico

Dec 6, 2013, 9:41:45 PM

to pytho...@python.org

On Sat, Dec 7, 2013 at 1:33 PM, rusi <rusto...@gmail.com> wrote:
> That seems to suggest that something is not right with the python
> mailing list config. No??

If in doubt, blame someone else, eh?

I'd first check what your browser's actually sending. Firebug will
help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
That's the first step.

ChrisA

Steven D'Aprano

Dec 6, 2013, 10:07:25 PM

On Thu, 05 Dec 2013 23:13:54 -0800, rusi wrote:

> On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

>> The real problem with web forums is they conflate transport and
>> presentation into a single opaque blob, and are pretty much universally
>> designed to be a closed system. Mail and usenet were both engineered
>> to make a sharp division between transport and presentation, which
>> meant it was possible to evolve each at their own pace.
>
>> Mostly that meant people could go off and develop new client
>> applications which interoperated with the existing system. But, it
>> also meant that transport layers could be switched out (as when NNTP
>> gradually, but inexorably, replaced UUCP as the primary usenet
>> transport layer).
>
> There is a deep assumption hovering round-about the above -- what I will
> call the 'Unix assumption(s)'. But before that, just a check on
> terminology. By 'presentation' you mean what people normally call
> 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
> sendmail, exim, qmail etc etc -- what normally are called
> 'mail-servers.' Right??

Presentation means how the data is presented. Transport means how the
data is transported. It doesn't refer to a specific piece of software
like Thunderbird, but to the logical fact that what people see (the
presentation) is not identical to what gets transported from one computer
to another.

All programs make *some* distinction between the two. Email is encoded,
wrapped with normally-hidden headers, and then sent, before being
displayed at the other end sans such headers. But some programs make a
nice clean distinction. If your mail client converts emails to sound for
the benefit of the blind, that is easy to do because there is a clean
*and public* distinction between the transport and presentation of email
-- everybody can agree on how to extract the message ("Hi Bob, are we
still meeting up for drinks tomorrow night?") from the transportation
layer (the email envelope).
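
A minimal sketch of that clean separation, using the standard email
package. The addresses and the Message-ID are placeholders invented for
the example.

```python
from email import message_from_string

# A made-up message: headers, a blank line, then the text a human cares about.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Drinks\n"
    "Message-ID: <1234@example.com>\n"
    "\n"
    "Hi Bob, are we still meeting up for drinks tomorrow night?\n"
)

msg = message_from_string(raw)

# The normally-hidden headers, available to any program that wants them.
headers = dict(msg.items())

# The message itself, independent of how any client chooses to present it.
text = msg.get_payload().strip()

print(sorted(headers))
print(text)
```

Any client, whether it renders text, reads it aloud or files it by
Message-ID, can start from the same few lines because the format is
public.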

In contrast, that is not the case with nearly all web forums. By
deliberate design, or mere ignorance and neglect, they mix up the message
you care about ("Hi Bob...") and the stuff you need to get that message
(the HTML and Javascript code) in one big ball of mud, and don't have
APIs for getting messages. Or worse, they deliberately obfuscate the
content, in an attempt to lock people in to only using the specific
interface they want you to use.

Consider the difference between (say) Twitter, which has published
standard APIs for reading and writing tweets, and StackOverflow, which as
far as I can tell insists that the one and only way to read and write
comments is via their website. The internal formatting of the website is
not public and is subject to change without notice.

(If I have unfairly maligned StackOverflow, substitute any number of
dozens or hundreds of web forums.)

[...]

> To the extent that these assumptions are invalid, the 'opaque-blob' may
> well be preferable.

No. Nice clean interfaces separating concerns (such as transport and
presentation) have little to do with ASCII text. One can define clear and
open binary protocols too.

--
Steven

rusi

Dec 6, 2013, 10:16:35 PM

On Saturday, December 7, 2013 8:11:45 AM UTC+5:30, Chris Angelico wrote:

> On Sat, Dec 7, 2013 at 1:33 PM, rusi wrote:
> > That seems to suggest that something is not right with the python
> > mailing list config. No??

> If in doubt, blame someone else, eh?

> I'd first check what your browser's actually sending. Firebug will
> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
> That's the first step.

If you give me some tip on where to look, I'll do that.
But I don't see what this has to do with forms.

Everything in the python archive (not just my posts) shows as Windows-1252
[I checked about 6]

Every other page that I checked (most having nothing to do with the python
list, GG etc) shows UTF-8. [I checked about 5]

None of these checks involved filling in a form.

MRAB

Dec 6, 2013, 10:19:15 PM

to pytho...@python.org

On 07/12/2013 02:41, Chris Angelico wrote:

> On Sat, Dec 7, 2013 at 1:33 PM, rusi <rusto...@gmail.com> wrote:
>> That seems to suggest that something is not right with the python
>> mailing list config. No??
>

> If in doubt, blame someone else, eh?
>
> I'd first check what your browser's actually sending. Firebug will
> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
> That's the first step.
>

Looking back through the thread, it looks like:

Roy posted a reply in us-ascii.

rusi replied in windows-1252, adding the '…'.

Roy replied in us-ascii, but with 'Š' in place of '…'.

rusi replied in utf-8, with '�' in place of '…'

Roy Smith

Dec 6, 2013, 10:40:58 PM

In article <52a290ed$0$30003$c3e8da3$5496...@news.astraweb.com>,

Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> In contrast, that is not the case with nearly all web forums. By
> deliberate design, or mere ignorance and neglect, they mix up the message
> you care about ("Hi Bob...") and the stuff you need to get that message
> (the HTML and Javascript code) in one big ball of mud, and don't have
> APIs for getting messages.

BTW, I was going to bring up vBulletin as an example of a typical web
forum which suffers from the "big ball of mud" syndrome. Then I
discovered that it does indeed have a reasonable looking API
(http://www.vbulletin.com/vbcms/content.php/367-API-Overview).

Beautiful Soup is an awesome tool. Even more awesome is when you don't
have to use it :-)

Chris Angelico

Dec 6, 2013, 11:08:41 PM

to pytho...@python.org

On Sat, Dec 7, 2013 at 2:16 PM, rusi <rusto...@gmail.com> wrote:
> On Saturday, December 7, 2013 8:11:45 AM UTC+5:30, Chris Angelico wrote:
>> On Sat, Dec 7, 2013 at 1:33 PM, rusi wrote:
>> > That seems to suggest that something is not right with the python
>> > mailing list config. No??
>
>> If in doubt, blame someone else, eh?
>
>> I'd first check what your browser's actually sending. Firebug will
>> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
>> That's the first step.
>
> If you give me some tip where to look, I'll do that.
> But I dont see what this has to do with forms.
>

Page encodings specify what comes from the server to your browser.
Your post went the other way. Tracing the data going back to the
server would tell you how it's encoded.
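
As a sketch of what to look for in the outgoing request: the same
ellipsis character shows up as different percent-escapes in form data
depending on which encoding the browser used. This only illustrates the
two candidates; it says nothing about what Firefox or Google Groups
actually sent.

```python
from urllib.parse import quote

ellipsis = "\u2026"

# How the character would appear in URL-encoded form data under each encoding.
as_utf8 = quote(ellipsis, encoding="utf-8")
as_cp1252 = quote(ellipsis, encoding="cp1252")

print(as_utf8)
print(as_cp1252)
```

Seeing %E2%80%A6 in the traced request would point to UTF-8, while a bare
%85 would point to CP-1252.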

ChrisA

rusi

Dec 7, 2013, 2:43:05 AM

Heh! Yes I guess that can be read into what I was saying.

Practically: I don't see that as an option, nor does the question of
going back to ASCII even arise.

I was talking more philosophically/historically.

Up until the time of Unix a file for example was a structured
heavy-duty concept motivated by entirely technological considerations:
http://en.wikipedia.org/wiki/Data_set_%28IBM_mainframe%29

By simplifying that into the modern concept of file -- just a stream
of bytes -- and allowing the puns:

byte string
= char list
= text

some elegant systems could be made with people having 'beautiful thoughts:'

Everything that could be stored anywhere -- core or disk -- being bytes
one could go to the next stage and pass around these bytes between
processes. And so we get the elegant -- pipeline -- beauty of Unix
scripts.

Of course there was a catch (Isn't there always?):

Things that did not fit in with this philosophy -- eg clicks of a mouse,
bits on display -- were modelled badly or not at all.

Not-at-all: CLI
Badly: Monstrosity called X

And this explains some of the cultural kinks of our field:

Unix guys invariably think of CLIs as natural and obvious whereas GUIs
are just wasteful eye-candy.

[Yours truly is one of those old geezers who does not know how to
write a GUI to save his life. Almost normal in the Unix world except
that he's not proud of it]

Windows/Mac people do not suffer these delusions but then they don't think of
programming as natural or obvious at all.

I've often been amused at Windows folk: They don't think of Word as a program.
Rather docs are things that magically open when clicked :-)

Brings me to the point I was trying to make (got side-tracked by
the failure of a character to roundtrip between me and Roy -- I'm none the
wiser as to why)

The ASCII = Text = Unicode (non)equation is a relatively minor point.

The more central point is that humans use and need more than just
words to communicate. By straitjacketing communication into the thin
channel of text we are severely impoverishing ourselves.

We communicate with systems through programs that are unstructured
text files even though programs are conceptually highly structured entities.

Likewise we communicate with each other by this obscenely obsolete
textual mode that I am using right now when rich text formats have been
available for decades.

Some of my more detailed writings on this:

http://blog.languager.org/2013/09/poorest-computer-users-are-programmers.html

http://blog.languager.org/2012/10/html-is-why-mess-in-programming-syntax.html

wxjm...@gmail.com

Dec 7, 2013, 5:16:02 AM

Rusi:

"unicode as a medium is universal in the same way that
ASCII used to be"

You probably do not realize how deeply correct this
sentence is. Unicode and ASCII are constructed in the
same way. It does not even have to do with "characters", but
with mathematics.

It is on this level the FSR fails. It is mathematically
wrong by design!

jmf

Steven D'Aprano

Dec 7, 2013, 6:25:07 AM

I'm reminded of that fellow, I don't remember his name, who *years* after
the Wright Brothers had flown, and there were dozens of people building
aeroplanes, was still trying to convince everyone that heavier-than-air
flight was mathematically impossible.

--
Steven

Chris Angelico

Dec 7, 2013, 6:49:51 AM

to pytho...@python.org

Nearest I can find is:

https://en.wikipedia.org/wiki/Simon_Newcomb#On_the_impossibility_of_a_flying_machine

He at least accepted the Wrights' work once he found out about it.
Also, he didn't make repeated usenet posts that torpedo you in the
face and leave an "Uh?"-shaped hole. [1] I'm still not sure what jmf
meant by the above.

ChrisA

[1] http://bofh.ntk.net/BOFH/1999/bastard99-24.php

giacomo boffi

Dec 7, 2013, 11:05:34 AM

Steven D'Aprano <steve+comp....@pearwood.info> writes:

> Ironically, your post was not Unicode. [...] Your post was sent

> using a legacy encoding, Windows-1252, also known as CP-1252

I access rusi's post using an NNTP server,
and in his post I see

Content-Type: text/plain; charset=UTF-8

Is it possible that what you see is an artifact
of the gateway?

Roy Smith

Dec 7, 2013, 11:08:08 AM

In article <31f1bb84-1432-446c...@googlegroups.com>,

wxjm...@gmail.com wrote:

> It is on this level the FSR fails.

What is "FSR"? I apologize if this was explained earlier in the thread
and I can't find the reference.

https://en.wikipedia.org/wiki/FSR#Science_and_technology was no help.

Rotwang

Dec 7, 2013, 11:15:33 AM

On 07/12/2013 16:08, Roy Smith wrote:
> In article <31f1bb84-1432-446c...@googlegroups.com>,
> wxjm...@gmail.com wrote:
>
>> It is on this level the FSR fails.
>
> What is "FSR"? I apologize if this was explained earlier in the thread
> and I can't find the reference.

It's the Flexible String Representation, introduced in Python 3.3:

http://www.python.org/dev/peps/pep-0393/
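
A quick way to see the effect, as a sketch: under PEP 393 the storage per
character depends on the widest code point in the string. The exact byte
counts are a CPython implementation detail and vary between versions and
platforms, so only the ordering is meaningful here.

```python
import sys

ascii_text = "a" * 1000             # every code point below 128
bmp_text = "\u20ac" * 1000          # euro sign, needs two bytes per character
astral_text = "\U0001F600" * 1000   # outside the BMP, four bytes per character

# Same length in characters, different sizes in memory on CPython 3.3+.
sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]
print(sizes)
```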

Tim Chase

Dec 7, 2013, 11:19:32 AM

to pytho...@python.org

On 2013-12-07 11:08, Roy Smith wrote:
> In article <31f1bb84-1432-446c...@googlegroups.com>,
> wxjm...@gmail.com wrote:
>
> > It is on this level the FSR fails.
>
> What is "FSR"? I apologize if this was explained earlier in the
> thread and I can't find the reference.

Flexible String Representation = PEP393

http://www.python.org/dev/peps/pep-0393/

-tkc