Phil
--
Philip Chee <phi...@aleytys.pc.my>, <phili...@gmail.com>
http://flashblock.mozdev.org/ http://xsidebar.mozdev.org
Guard us from the she-wolf and the wolf, and guard us from the thief,
oh Night, and so be good for us to pass.
Does W3C have any recommendations on HTML intended for archival
purposes, i.e. web pages that will never be touched after being
created? Or is it all about whatever is new and shiny, with nobody
caring about last week's content? Thanks.
--
Keith F. Lynch - http://keithlynch.net/
Please see http://keithlynch.net/email.html before emailing me.
Anything which is strictly standards compliant now will continue to
work. The problem with XHTML2 was that it wasn't backward compatible
with anything.
<http://dbaron.org/log/20090707-ex-html>
I'm afraid I took only a quick look at
<http://www.w3.org/TR/html5-diff/> and the HTML5 spec. Can you easily
explain this statement?
1.2 Backwards Compatible
HTML 5 is defined in a way that it is backwards compatible with
the way user agents handle deployed content. To keep the authoring
language relatively simple for authors several elements and
attributes are not included as outlined in the other sections of
this document, such as presentational elements that are better
dealt with using CSS.
User agents, however, will always have to support these older
elements and attributes and this is why the specification clearly
separates requirements for authors and user agents. ...
How does the spec define what to do with <center>, <marquee>, <a
name=...>, <table summary=...>, <meta http-equiv=...>, <img border="0"
...>, when they are "not included"? (I'm going by
<http://www.w3.org/TR/html5/the-xhtml-syntax.html#obsolete-features>
for a listing.) If they're mentioned in the spec, they're "included".
--
Tim McDaniel, tm...@panix.com
>>Anything which is strictly standards compliant now will continue to
>>work.
>
> I'm afraid I took only a quick look at
> <http://www.w3.org/TR/html5-diff/> and the HTML5 spec. Can you easily
> explain this statement?
No.
>Philip Chee <phi...@aleytys.pc.my> wrote:
>> An Unofficial Q&A about the Discontinuation of the XHTML2 WG
>> <http://hsivonen.iki.fi/xhtml2-html5-q-and-a/>
>Does W3C have any recommendations on HTML intended for archival
>purposes, i.e. web pages that will never be touched after being
>created? Or is it all about whatever is new and shiny, with nobody
>caring about last week's content? Thanks.
Nothing lasts forever.
But specifying the DTD helps.
>--
>Keith F. Lynch - http://keithlynch.net/
>Please see http://keithlynch.net/email.html before emailing me.
Ben
--
Ben Yalow yb...@panix.com
Not speaking for anybody
> Nothing lasts forever.
> But specifying the DTD helps.
Nothing lasts forever, but I don't think there's any excuse for
continuing to let information be lost. This isn't the dark ages,
when it was thought reasonable to destroy the last copy of some work
of antiquity to free up the vellum for yet another Bible.
There's no reason why every rasff post, for instance, shouldn't last
as long as civilization. And no reason why civilizations shouldn't
last trillions of eons, or perhaps much longer.
Thanks.
> The problem with XHTML2 was that it wasn't backward compatible with
> anything.
So? Neither was the first version of HTML, or of anything else.
Neither, for instance, is digital television. I feel sorry for
anyone who created much content in XHTML2.
>Ben Yalow <yb...@panix.com> wrote:
>> "Keith F. Lynch" <k...@KeithLynch.net> writes:
>>> Does W3C have any recommendations on HTML intended for archival
>>> purposes, i.e. web pages that will never be touched after being
>>> created? Or is it all about whatever is new and shiny, with nobody
>>> caring about last week's content? Thanks.
>> Nothing lasts forever.
>> But specifying the DTD helps.
>Nothing lasts forever, but I don't think there's any excuse for
>continuing to let information be lost. This isn't the dark ages,
>when it was thought reasonable to destroy the last copy of some work
>of antiquity to free up the vellum for yet another Bible.
It isn't lost. It just can't be displayed/understood without special
tools. And those tools may not be web browsers.
>There's no reason why every rasff post, for instance, shouldn't last
>as long as civilization. And no reason why civilizations shouldn't
>last trillions of eons, or perhaps much longer.
That depends on somebody being willing to pay Google enough to keep the
data they have (where "enough" is defined by Google -- so far, the amount
is $0).
>--
>Keith F. Lynch - http://keithlynch.net/
>Please see http://keithlynch.net/email.html before emailing me.
Ben
All major browsers support what they call "quirks mode", so that version
should still work in these browsers. And as long as your current HTML
documents conform to HTML 4.01 Strict, they will continue to be supported
indefinitely.
> I feel sorry for
> anyone who created much content in XHTML2.
Given that no major browser (IE, Firefox, Safari/Webkit, Opera) ever
implemented XHTML2, I have no sympathy for anyone who created any
content in XHTML2 (as opposed to XHTML1).
Phil (imho xhtml1 was a solution looking for a problem)
> That depends on somebody being willing to pay Google enough to keep
> the data they have (where "enough" is defined by Google -- so far,
> the amount is $0).
They've made enough of a botch of their user interface that it's
almost worthless as is. For instance I was unable to retrieve
James Nicoll's original "The problem with defending the purity
of the English language" post recently, when the subject came up
in alt.folklore.urban.
Disk space is cheap enough today that it would be practical for
individuals to keep a copy of their complete Usenet database -- if
Google was willing to sell copies. It's not as if it was doing them
or anyone else any good where it is.
What it means is that, for backwards compatibility, a browser must
correctly display these elements; however, they do not form part of the
spec for new code and should be flagged as incorrect by an authoring
application. That is, they must be supported by browsers but are strongly
deprecated as bad practice and should not be used. Effectively, they are
incorrect but must be understood.
--
Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm
Livejournal http://brett-dunbar.livejournal.com/
Brett Paul Dunbar
To email me, use reply-to address
>Ben Yalow <yb...@panix.com> wrote:
>> "Keith F. Lynch" <k...@KeithLynch.net> writes:
>>> There's no reason why every rasff post, for instance, shouldn't
>>> last as long as civilization. And no reason why civilizations
>>> shouldn't last trillions of eons, or perhaps much longer.
>> That depends on somebody being willing to pay Google enough to keep
>> the data they have (where "enough" is defined by Google -- so far,
>> the amount is $0).
>They've made enough of a botch of their user interface that it's
>almost worthless as is. For instance I was unable to retrieve
>James Nicoll's original "The problem with defending the purity
>of the English language" post recently, when the subject came up
>in alt.folklore.urban.
>Disk space is cheap enough today that it would be practical for
>individuals to keep a copy of their complete Usenet database -- if
>Google was willing to sell copies. It's not as if it was doing them
>or anyone else any good where it is.
They clearly think it is. But feel free to ask them to quote a price.
And, of course, Usenet takes a lot of disk space -- pr0n and warez are
big. The text groups are small, of course -- but that's a tiny fraction
of Usenet.
>--
>Keith F. Lynch - http://keithlynch.net/
>Please see http://keithlynch.net/email.html before emailing me.
Ben
>> Disk space is cheap enough today that it would be practical for
>> individuals to keep a copy of their complete Usenet database -- if
>> Google was willing to sell copies. It's not as if it was doing
>> them or anyone else any good where it is.
> They clearly think it is. But feel free to ask them to quote
> a price.
I may do so once my economic position improves.
> And, of course, Usenet takes a lot of disk space -- pr0n and warez
> are big. The text groups are small, of course -- but that's a tiny
> fraction of Usenet.
Does Google save the binaries, or just the text groups? I was
thinking only of the latter -- and mostly of the earlier postings. If
I were to dedicate a 1 terabyte drive to it, how many years of Usenet,
starting at the beginning, excluding binaries, would that hold? Would
it cover all of the '80s? All of the '90s? More? Does anyone know?
They don't have binaries.
> thinking only of the latter -- and mostly of the earlier postings. If
> I were to dedicate a 1 terabyte drive to it, how many years of Usenet,
> starting at the beginning, excluding binaries, would that hold? Would
> it cover all of the '80s? All of the '90s? More? Does anyone know?
Google says their archive has over a billion messages. What is a
reasonable average size of a Usenet post?
rgds,
netcat
When DejaNews first started, they included binaries in their archives.
However, these were soon dropped for space reasons. This was well
before Google's purchase of DN.
I'm another person who'd pay a reasonable fee for an unencumbered copy of
Google's complete Usenet archives on DVDs or a hard drive, but only if
they included all the messages that have been dropped over the years.
pt
Probably a couple of K. So the non-binaries will probably all fit onto a
few T. Not shippable over the net in any reasonable amount of time, but
trivial to move around on a directly connected hard drive.
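
For anyone who wants to check the arithmetic, here is a rough sketch
using only the figures from this thread: "over a billion messages" and
roughly 2000 bytes per post. Both are loose estimates, the function
names are just illustrative, and "terabyte" is taken as 10**12 bytes.

```python
# Back-of-envelope size of the text-only archive.
# Both inputs are rough figures from the thread, not measured values.
MESSAGES = 1_000_000_000   # "over a billion messages" (a lower bound)
BYTES_PER_POST = 2_000     # "a couple of K" per post (assumed average)
DRIVE_BYTES = 10**12       # one 1 TB drive, decimal terabytes (assumption)


def archive_bytes(messages: int, bytes_per_post: int) -> int:
    """Total uncompressed size in bytes."""
    return messages * bytes_per_post


def drives_needed(total_bytes: int, drive_bytes: int) -> int:
    """Number of whole drives required, rounded up."""
    return -(-total_bytes // drive_bytes)


def posts_per_drive(drive_bytes: int, bytes_per_post: int) -> int:
    """How many average-sized posts fit on one drive."""
    return drive_bytes // bytes_per_post


total = archive_bytes(MESSAGES, BYTES_PER_POST)
print(f"archive: about {total / 10**12:.1f} TB")
print(f"1 TB drives needed: {drives_needed(total, DRIVE_BYTES)}")
print(f"posts per 1 TB drive: {posts_per_drive(DRIVE_BYTES, BYTES_PER_POST):,}")
```

On those assumptions a single 1 TB drive holds about half the archive
by message count. Turning that into "how many years" would need
per-year posting volumes, which nobody here has quoted.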
>rgds,
>netcat
I wonder how practical it would be for them to sell selections--all
groups in a hierarchy, say, or the whole archive up to 1995, or ... .
--
http://www.daviddfriedman.com/ http://daviddfriedman.blogspot.com/
Author of
_Future Imperfect: Technology and Freedom in an Uncertain World_,
Cambridge University Press.
>>> That depends on somebody being willing to pay Google enough to keep
>>> the data they have (where "enough" is defined by Google -- so far,
>>> the amount is $0).
>
>>They've made enough of a botch of their user interface that it's
>>almost worthless as is. For instance I was unable to retrieve
>>James Nicoll's original "The problem with defending the purity
>>of the English language" post recently, when the subject came up
>>in alt.folklore.urban.
>>Disk space is cheap enough today that it would be practical for
>>individuals to keep a copy of their complete Usenet database -- if
>>Google was willing to sell copies. It's not as if it was doing them
>>or anyone else any good where it is.
>And, of course, Usenet takes a lot of disk space -- pr0n and warez are
>big. The text groups are small, of course -- but that's a tiny fraction
>of Usenet.
Interestingly, rec.arts.sf.written is often listed by them as being
one of the top-ten traffic groups (as is another that I read).
--
Michael F. Stemper
#include <Standard_Disclaimer>
No animals were harmed in the composition of this message.
I wonder if anyone has asked them.
You don't think they've permanently and irrevocably deleted those
messages from their archives?
> Google says their archive has over a billion messages. What is a
> reasonable average size of a Usenet post?
Back in the days when I was downloading newsgroups from Demon, over a
phone line, reckoning 2000 bytes per article was near enough for a
useful prediction of the time I'd need.
--
David G. Bell -- SF Fan, Filker, and Punslinger.
On the horizon, a carrier task force of the Salvation Navy was
turning into the wind, preparing to launch Zeppelins.
No, I don't.
Google isn't noted for either refusing to gather, or for throwing away
information. Short of a court order to delete it, I expect they still
have it.
pt
If you mean articles with expiration headers, they haven't deleted
them, they just won't show them to free searches.
Seth
What about x-no-archive messages? What about old messages whose
authors asked Google to remove all of them?
Welcome back to rasff. I missed you. It's been six months and 14,000
messages. Are you reading *all* of them?
It is rumoured that there are people who would sue if certain deleted
messages were published, and who routinely Google for their names
appearing. Long-time net-users in the UK may recall a court case
involving one of the pioneer ISPs in the UK.
(I also recall that the NNTP articles which triggered that case were
pretty vile stuff.)