Google and validation

Big Bill

Sep 2, 2006, 12:43:09 AM

Some interesting stuff here:

http://www.searchengineguide.com/searchbrief/senews/008222.html

BB
--

http://www.crystal-liaison.com/crystal-world/index.html
http://www.crystal-liaison.com/crystal-world/african-elephant.html
http://www.crystal-liaison.com/crystal-world/american-eagle.html

John Bokma

Sep 2, 2006, 1:25:54 AM

Big Bill <kr...@cityscape.co.uk> wrote:

> http://www.searchengineguide.com/searchbrief/senews/008222.html

Thanks.

2 notes: she should talk more with her husband: CSS programmer???

Second note: 4 944 vs 3 902 sounds like quite a saving. The problem is
that nowadays quite a few sites send out their HTML compressed (gzip), and
it might very well be the case that, once compressed, the former is
smaller than the latter.
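
A rough way to test the gzip point is to compress both variants and compare
the results. The sketch below uses two made-up fragments (the markup and the
names `verbose` and `terse` are assumptions, not the pages from the article);
it only shows that a saving measured on raw markup need not carry over once
gzip has removed the repetition.

```python
import gzip


def gzipped_size(html: str) -> int:
    """Return the number of bytes the markup occupies after gzip compression."""
    return len(gzip.compress(html.encode("utf-8")))


# Two hypothetical renderings of the same table: one verbose, one hand-shortened.
verbose = (
    '<table class="results">'
    + '<tr><td class="result">hit</td></tr>' * 50
    + "</table>"
)
terse = "<table class=r>" + "<tr><td class=t>hit" * 50 + "</table>"

for name, markup in (("verbose", verbose), ("terse", terse)):
    raw = len(markup.encode("utf-8"))
    print(f"{name}: {raw} bytes raw, {gzipped_size(markup)} bytes gzipped")
```

On real pages the outcome depends on the content, so the numbers are worth
measuring rather than guessing.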

But Google *should* have a serious look at their HTML, I agree on that
point.

--
John Need help with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-help.html

Roy Schestowitz

Sep 2, 2006, 1:55:25 AM

__/ [ John Bokma ] on Saturday 02 September 2006 06:25 \__

> Big Bill <kr...@cityscape.co.uk> wrote:
>
>> http://www.searchengineguide.com/searchbrief/senews/008222.html
>
> Thanks.
>
> 2 notes: she should talk more with her husband: CSS programmer???
>
> second note: 4 944 vs 3 902 sounds like quite a saving. The problem is
> that nowadays quite some site send out their HTML compressed (gzip), and
> it might very well be the case that the former is smaller then the latter.
>
> But Google *should* have a serious look at their HTML, I agree on that
> point.

I can't recall if it was Cutts or DiBona (I searched, but couldn't find it
after a 2-minute effort) who said that Google's broken code is intended to
make pages more compact and thus quicker to deliver. This sounds like utter BS
to me because if standards are compromised for speed, then developers will
design for IE just because the majority of people use it. /Ad hoc/
workarounds that supersede principles are unacceptable. Maybe we should also
start throwing bottles out of the car window. You know, because it's
quicker...

Best wishes,

Roy

--
Roy S. Schestowitz | "Error, no keyboard - press F1 to continue"
http://Schestowitz.com | GNU is Not UNIX | PGP-Key: 0x74572E8E
roy pts/1 Fri Sep 1 16:58 still logged in
http://iuron.com - proposing a non-profit search engine

Borek

Sep 2, 2006, 4:19:36 AM

On Sat, 02 Sep 2006 07:25:54 +0200, John Bokma <jo...@castleamber.com>
wrote:

> second note: 4 944 vs 3 902 sounds like quite a saving. The problem is
> that nowadays quite some site send out their HTML compressed (gzip), and
> it might very well be the case that the former is smaller then the
> latter.
>
> But Google *should* have a serious look at their HTML, I agree on that
> point.

Google is a bunch of morons when it comes to HTML. Look at their page -
for ages (two years at least IIRC) they have used one-letter ids to make the
code shorter and to save on bandwidth, but they can't understand that they
could save hugely by using CSS properly. That's old news for some.

Best,
Borek
--
http://www.chembuddy.com
http://www.ph-meter.info
http://www.terapia-kregoslupa.waw.pl

Roy Schestowitz

Sep 2, 2006, 9:29:17 AM

__/ [ Borek ] on Saturday 02 September 2006 09:19 \__

> On Sat, 02 Sep 2006 07:25:54 +0200, John Bokma <jo...@castleamber.com>
> wrote:
>
>> second note: 4 944 vs 3 902 sounds like quite a saving. The problem is
>> that nowadays quite some site send out their HTML compressed (gzip), and
>> it might very well be the case that the former is smaller then the
>> latter.
>>
>> But Google *should* have a serious look at their HTML, I agree on that
>> point.
>
> Google is a bunch of morons when it comes to HTML. Look at their page -
> they for ages use one-letter ids (two years at least IIRC) to make the
> code shorter and to save on bandwidth, but they can't understand that they
> can save huge properly using css. That's an old news for some.

Very sad news, too. To elaborate on my other post, this sets
a terrible example for Webmasters (think along the lines
of: "well, even Google don't make it valid, so why should
/I/?"). What's more, how are newer and less mature browsers
supposed to cope with attributes that intentionally neglect
quotes/apostrophes? Isn't that what
specifications/standards/recommendations are for? Equality
and independence from a product? That which doesn't involve
hacks, workarounds and undocumented exception handling? What
about OpenDocument? I am glad that Google don't have a go at
making /that/ 'efficient'... I am worried that Google is
beginning to adopt Microsoft's habits of 'extending'
standards to suit their own convenience and agenda
(compromising for speed in that case). Microsoft Office
formats, for example, use binary because it's quicker than
XML or a well-structured and easily interpretable
(backward-'engineerable') form, among other reasons.

Best wishes,

Roy

--
Roy S. Schestowitz | "Far away from home, robots build people"
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
2:20pm up 44 days 2:32, 7 users, load average: 0.00, 0.00, 0.00
http://iuron.com - Open Source knowledge engine project

Big Bill

Sep 2, 2006, 12:59:20 PM

I hope you didn't mention that at your interviews... :-)

Roy Schestowitz

Sep 2, 2006, 1:33:50 PM

__/ [ Big Bill ] on Saturday 02 September 2006 17:59 \__

> I hope you didn't mention that at your interviews... :-)

I already have too much anti-Google material on the Web, so they might as well
accept that. I refuted the claim that Google was innovative because, just like
Microsoft, it inherited most of its technologies through (potentially hostile)
acquisitions.

http://en.wikipedia.org/wiki/List_of_Acquisitions_by_Google
http://en.wikipedia.org/wiki/List_of_companies_acquired_by_Microsoft_Corporation

Oracle is worse.

John Bokma

Sep 2, 2006, 2:02:09 PM

Roy Schestowitz <newsg...@schestowitz.com> wrote:

> sounds like utter BS to me because if standards are compromised for
> speed, then developers will design for IE just because the majority of
> people use it.

Which makes it more standard than W3C recommendations ;-) Something to
think about.

John Bokma

Sep 2, 2006, 2:42:24 PM

Roy Schestowitz <newsg...@schestowitz.com> wrote:

> __/ [ Borek ] on Saturday 02 September 2006 09:19 \__
>
>> On Sat, 02 Sep 2006 07:25:54 +0200, John Bokma <jo...@castleamber.com>
>> wrote:
>>
>>> second note: 4 944 vs 3 902 sounds like quite a saving. The problem
>>> is that nowadays quite some site send out their HTML compressed
>>> (gzip), and it might very well be the case that the former is
>>> smaller then the latter.
>>>
>>> But Google *should* have a serious look at their HTML, I agree on
>>> that point.
>>
>> Google is a bunch of morons when it comes to HTML. Look at their page
>> - they for ages use one-letter ids (two years at least IIRC) to make
>> the code shorter and to save on bandwidth, but they can't understand
>> that they can save huge properly using css. That's an old news for
>> some.
>
> Very sad news, too. To elaborate on my other post, this sets
> a terrible examples for Webmasters (think along the lines

I am thinking about the lines right now. Fully justified means there are no
easy anchors anymore. Especially with monospaced fonts, fully justified
text is a pain in the ass to read.

> of: "well, even Google don't make it valid, so why should
> /I/?").

An important question that more people should ask themselves: wtf is
W3C? Their ideas are not always the best ones, sadly. They are
"everywhere", but I sometimes have the idea that they should focus on
HTML and CSS, and not come up with wild ideas like XML for mobile
vibrators that soon everybody will call a standard no matter how badly
thought out it is :-)

> What's more, how are newer and less mature browsers
> supposed to cope with attributes that intentionally neglect
> quotes/apostrophes? Isn't that what
> specification/standards/recommendations are for? Equality
> and independence on a product?

For HTML it's specified in the recommendation(s) when attributes must be
quoted and when it's ok to leave them out. No idea if Google follows
this. Also, a lot of people forget that HTML 4.01 has a lot of optional
stuff, a page can sometimes be made way shorter by leaving out all
implied stuff.
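
As an illustration of the quoting rule: HTML 4.01 lets an attribute value go
unquoted only when it consists solely of letters, digits, hyphens, periods,
underscores and colons. The sketch below checks that condition and shows that
Python's standard HTML parser reads the quoted and unquoted forms identically.
The helper names (`may_omit_quotes`, `parse_attrs`) are made up for this
example, and Python's parser is only a stand-in for a browser.

```python
import re
from html.parser import HTMLParser

# Characters HTML 4.01 permits in an unquoted attribute value.
UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_omit_quotes(value: str) -> bool:
    """True if the value may legally appear without quotes in HTML 4.01."""
    return UNQUOTED_OK.fullmatch(value) is not None


class AttrCollector(HTMLParser):
    """Records every start tag together with its attribute list."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def parse_attrs(markup: str):
    """Parse markup and return the (tag, attributes) pairs that were found."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


print(may_omit_quotes("index.html"))   # allowed: letters and a period only
print(may_omit_quotes("/search?q=x"))  # not allowed: contains '/', '?' and '='
print(parse_attrs("<a href=index.html class=nav>"))
print(parse_attrs('<a href="index.html" class="nav">'))
```

Whether a given page stays within these rules is a separate question; the
check only covers attribute values, not the optional start and end tags.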

On the other hand, read my gzip story, a lot of webservers now serve
compressed content to browsers that tell them they can handle it. Why
gzip was chosen is a bit beyond me, because there are better compression
algorithms, and maybe even better results can be obtained with a
dedicated one for HTML.

> That which doesn't involve
> hacks, workarounds and undocumented exception handling? What
> about OpenDocument? I am glad that Google don't have a go at
> making /that/ 'efficient'... I am worried that Google is
> beginning to adopt Microsoft's habits of 'extending'
> standards to suit their own convenience and agenda

You think that w3c's agenda is different? Or any OS project for that
matter? Most OS projects are a bunch of followers with one ego at the
wheel. Sometimes one minor change isn't accepted because the head didn't
think of it first, so instead of saying: wow, nifty! You get 10001
arguments (most wrong) why it shouldn't be added.

Maybe I bumped into the wrong people, but with the OS projects I
contacted I always got:

- deny
- make it sound insignificant
- argue why it shouldn't be added no matter what

A very common mistake, which you also seem to make, is that you think
that OS is different from a company with a bunch of people all
developing. The only difference is that you can take the source, and
modify it to your own needs. Things like support and speed of fixes all
depend on the bunch of people.

> (compromising for speed in that case). Microsoft Office
> formats, for example, use binary because it's quicker than
> XML or a well-structured and easily interpertable
> (backward-'engineerable') form, among other reasons.

XML is often mistaken for a better solution, especially compared to
binary, because it's human readable. For most people a hex dump and an
XML dump is equally readable: not.

One thing you see a lot in IT is that suddenly a "new technique" pops up
and a lot of people jump on that and claim that they are better than the
competition because of their use of that technique. XML is as good an
example as any.

A well documented binary format is as good as a well documented XML
format for implementing it. For the majority of users it doesn't really
matter. And yes, the binary format can be made 10-50 times smaller
compared to XML, which is noticeable in speed.

Also often forgotten is that XML is a very bare-bones specification. It's
not more complicated than: an int is always 64 bits, a character always
16 bits. An implementation is nothing more than describing the structure
(records) and one can do that as well with XML as with a binary format,
since nothing is stopping you from replacing each XML element with, for
example, an 8-bit value, and packing attributes similarly. Or: hexdumping
XML just shows a binary format which uses a lot of space to store a
little information, nothing more, nothing less.
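
To make the comparison concrete, here is a small sketch that stores the same
record once as a packed binary structure and once as XML. The record layout
(a 32-bit id plus 16-bit width and height) and the element names are invented
for the example; the size ratio it prints says nothing about any real format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout: little-endian 32-bit id, 16-bit width, 16-bit height.
RECORD = struct.Struct("<IHH")


def to_binary(rec_id: int, width: int, height: int) -> bytes:
    """Pack the record into a fixed-size binary structure."""
    return RECORD.pack(rec_id, width, height)


def to_xml(rec_id: int, width: int, height: int) -> bytes:
    """Serialise the same record as a small XML document."""
    root = ET.Element("image", id=str(rec_id))
    ET.SubElement(root, "width").text = str(width)
    ET.SubElement(root, "height").text = str(height)
    return ET.tostring(root)


def from_xml(data: bytes):
    """Read the record back from its XML form."""
    root = ET.fromstring(data)
    return (
        int(root.get("id")),
        int(root.findtext("width")),
        int(root.findtext("height")),
    )


binary = to_binary(7, 640, 480)
text = to_xml(7, 640, 480)
print(len(binary), "bytes as binary")
print(len(text), "bytes as XML:", text.decode("ascii"))
```

Both forms carry the same information and both need documentation of the
layout to be implemented; the difference is in size and in who can read them
without tooling.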

Big Bill

Sep 2, 2006, 3:24:30 PM

I think the thing with xml - I actually studied xml, whodathunkit, eh?
got my little certificate somewhere too - was that it was
intentionally barebones and that it presented the world with a
communications format that was infinitely malleable. Totally
web-dependent, though: dent the infrastructure and it's blown. But, if
you believed in the original purpose of the internet, (not so much the
arpanet) surviving nukes etc, then maybe that made sense too. XML is
still very much in its infancy, it's got a long way to go yet I think.

John Bokma

Sep 2, 2006, 6:26:40 PM

Big Bill <kr...@cityscape.co.uk> wrote:

[ XML ]

> I think the thing with xml - I actually studied xml, whodathunkit, eh?
> got my little certificate somewhere too - was that it was
> intentionally barebones and that it presented the world with a
> communications format that was infinitely malleable.

Yes, it's not that far from 8 bits = 1 byte :-)

> Totally
> web-dependant, though, dent the infrastructure and it's blown.

Huh? XML can be used anywhere, it's not limited to the web.

> But, if
> you believed in the original purpose of the internet, (not so much the
> arpanet) surviving nukes etc, then maybe that made sense too. XML is
> still very much in its infancy, it's got a long way to go yet I think.

I can't think of anything that can be added to XML itself :-)

Big Bill

Sep 2, 2006, 6:38:55 PM

On 2 Sep 2006 22:26:40 GMT, John Bokma <jo...@castleamber.com> wrote:

>Big Bill <kr...@cityscape.co.uk> wrote:
>
>[ XML ]
>> I think the thing with xml - I actually studied xml, whodathunkit, eh?
>> got my little certificate somewhere too - was that it was
>> intentionally barebones and that it presented the world with a
>> communications format that was infinitely malleable.
>
>Yes, it's not that far from 8 bits = 1 byte :-)
>
>> Totally
>> web-dependant, though, dent the infrastructure and it's blown.
>
>Huh? XML can be used anywhere, it's not limited to the web.

Um... it can? Where would you use it? I don't remember that.

>> But, if
>> you believed in the original purpose of the internet, (not so much the
>> arpanet) surviving nukes etc, then maybe that made sense too. XML is
>> still very much in its infancy, it's got a long way to go yet I think.
>
>I can't think of anything that can be added to XML itself :-)

In its use, I meant. A replacement for html.

John Bokma

Sep 2, 2006, 8:24:16 PM

Big Bill <kr...@cityscape.co.uk> wrote:

> On 2 Sep 2006 22:26:40 GMT, John Bokma <jo...@castleamber.com> wrote:
>
>>Big Bill <kr...@cityscape.co.uk> wrote:
>>
>>[ XML ]
>>> I think the thing with xml - I actually studied xml, whodathunkit, eh?
>>> got my little certificate somewhere too - was that it was
>>> intentionally barebones and that it presented the world with a
>>> communications format that was infinitely malleable.
>>
>>Yes, it's not that far from 8 bits = 1 byte :-)
>>
>>> Totally
>>> web-dependant, though, dent the infrastructure and it's blown.
>>
>>Huh? XML can be used anywhere, it's not limited to the web.
>
> Um... it can? Where would you use it? I don't remember that.

Think of XML as you would of JPEG. JPEG is for storing images, XML is more
generic. But the use of JPEG is not limited to pictures on the Internet.

For example a Usenet program can decide to use something like:

<?xml version="1.0"?>
<usenet>
<subscribed>
<group>alt.internet.search-engines</group>
<group>comp.lang.perl.misc</group>
</subscribed>
</usenet>

for storing program settings. If you use MSN Messenger, and keep
conversations, have a peek into My Received Files\accountnr\History

You will see one xml file per person you have been talking with (or more
if you keep archives as well). You also will see an xsl file, which is
used when you view the xml file in IE to present the information.
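
Reading such a settings file back takes only a few lines. The sketch below
parses a copy of the newsreader example with Python's standard library; the
function name `subscribed_groups` is made up, and no real newsreader is
claimed to work this way.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the hypothetical newsreader settings shown above.
SETTINGS = """<?xml version="1.0"?>
<usenet>
  <subscribed>
    <group>alt.internet.search-engines</group>
    <group>comp.lang.perl.misc</group>
  </subscribed>
</usenet>
"""


def subscribed_groups(xml_text: str) -> list:
    """Return the newsgroup names listed under <subscribed>."""
    root = ET.fromstring(xml_text)
    return [group.text for group in root.findall("./subscribed/group")]


print(subscribed_groups(SETTINGS))
```

Nothing here touches the network: the XML is just a local file format.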

>>I can't think of anything that can be added to XML itself :-)
>
> In it's use, I meant. A replacement for html.

XML is not a replacement for HTML. XHTML is an application of XML. XHTML
might replace HTML, but I hope not.

You can use XML for anything you want. XML is not a replacement for INI
files, but one can use an XML application that stores the same information
as is stored in the INI file it replaces.
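
A minimal sketch of that last point: the same sections and keys an INI file
holds can be carried by a small XML application. The element names used here
(`settings`, `section`, `option`) and the sample INI content are invented for
illustration.

```python
import configparser
import xml.etree.ElementTree as ET


def ini_to_xml(ini_text: str) -> ET.Element:
    """Convert INI text into an XML tree holding the same sections and keys."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    root = ET.Element("settings")
    for section in parser.sections():
        node = ET.SubElement(root, "section", name=section)
        for key, value in parser.items(section):
            ET.SubElement(node, "option", name=key).text = value
    return root


# Hypothetical INI content for a newsreader.
INI = "[news]\nserver = news.example.com\nport = 119\n"

tree = ini_to_xml(INI)
print(ET.tostring(tree, encoding="unicode"))
```

The XML version stores nothing the INI file did not; it is the application
built on XML, not XML itself, that replaces the INI format.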

David

Sep 2, 2006, 10:54:09 PM

On Sat, 02 Sep 2006 06:55:25 +0100, Roy Schestowitz
<newsg...@schestowitz.com> wrote:

>I can't recall if it was Cutts or DiBona (I searched, but couldn't find it
>after a 2-minute effort) who said that Google's broken code is intended to
>make pages more compact and thus quick to deliver. This sounds like utter BS
>to me because if standards are compromised for speed, then developers will
>design for IE just because the majority of people use it. Principles and /ad
>hoc/ workarounds that supersede them are unacceptable. Maybe we should also
>start throwing bottle out the car's windows. You know, because it's
>quicker...
>
>Best wishes,
>
>Roy

Yeah, I always thought it was because Google didn't have the SEO
skills needed to optimise their code for better rankings etc..., they
probably use Dreamweaver or something :-)

David
--
SEO Services http://www.seo-gold.com/expert-seo-consultant.php
Ode to Ethical SEO http://www.totallyduh.com/ethical-seo-expert.html

Roy Schestowitz

Sep 3, 2006, 12:27:08 AM

__/ [ David ] on Sunday 03 September 2006 03:54 \__

> On Sat, 02 Sep 2006 06:55:25 +0100, Roy Schestowitz
> <newsg...@schestowitz.com> wrote:
>
>>I can't recall if it was Cutts or DiBona (I searched, but couldn't find it
>>after a 2-minute effort) who said that Google's broken code is intended to
>>make pages more compact and thus quick to deliver. This sounds like utter
>>BS to me because if standards are compromised for speed, then developers
>>will design for IE just because the majority of people use it. Principles
>>and /ad hoc/ workarounds that supersede them are unacceptable. Maybe we
>>should also start throwing bottle out the car's windows. You know, because
>>it's quicker...
>>
>>Best wishes,
>>
>>Roy
>
> Yeah, I always thought it was because Google didn't have the SEO
> skills needed to optimise their code for better rankings etc..., they
> probably use Dreamweaver or something :-)

Google doesn't need to optimise. That's like God praying to God. *smile*

Roy Schestowitz

Sep 3, 2006, 12:45:45 AM

__/ [ John Bokma ] on Saturday 02 September 2006 19:42 \__

> Roy Schestowitz <newsg...@schestowitz.com> wrote:
>
>> __/ [ Borek ] on Saturday 02 September 2006 09:19 \__
>>
>>> On Sat, 02 Sep 2006 07:25:54 +0200, John Bokma <jo...@castleamber.com>
>>> wrote:
>>>
>>>> second note: 4 944 vs 3 902 sounds like quite a saving. The problem
>>>> is that nowadays quite some site send out their HTML compressed
>>>> (gzip), and it might very well be the case that the former is
>>>> smaller then the latter.
>>>>
>>>> But Google *should* have a serious look at their HTML, I agree on
>>>> that point.
>>>
>>> Google is a bunch of morons when it comes to HTML. Look at their page
>>> - they for ages use one-letter ids (two years at least IIRC) to make
>>> the code shorter and to save on bandwidth, but they can't understand
>>> that they can save huge properly using css. That's an old news for
>>> some.
>>
>> Very sad news, too. To elaborate on my other post, this sets
>> a terrible examples for Webmasters (think along the lines
>
> I am right thinking about the lines. Fully justified means there are no
> easy anchors anymore. Especially with monospaced fonts fully justified
> text if a pain in the ass to read.

Okay, okay. *smile* It's just CTRL+ALT+F7 away, so I'm tempted to give it a
go every now and then...

>> of: "well, even Google don't make it valid, so why should
>> /I/?").
>
> An important question that more people should ask themselves: wtf is
> W3C. Their ideas are not always the best ones, sadly. They are
> "everywhere", but I have sometimes the idea that they should focus on
> HTML and CSS, and not coming up with wild ideas like XML for mobile
> vibrators that soon everybody will call a standard no matter how crappy
> it's thought out :-)

XML and standards facilitate modularity. Where would OSS be without
specifications? Look at the mess Windows has reached (and Apple's Mac OS
before it took Darwin). Windows still requires a 60% rewrite of the code
(Jim Allchin) because it's utterly unmaintainable (all the planned features
are conceded because they can't be implemented). The need to accommodate
many implementations gives power in maintaining a system and replacing weaker
components with superior ones. That's why OSS is winning.

>> What's more, how are newer and less mature browsers
>> supposed to cope with attributes that intentionally neglect
>> quotes/apostrophes? Isn't that what
>> specification/standards/recommendations are for? Equality
>> and independence on a product?
>
> For HTML it's specified in the recommendation(s) when attributes must be
> quoted and when it's ok to leave them out. No idea if Google follows
> this. Also, a lot of people forget that HTML 4.01 has a lot of optional
> stuff, a page can sometimes be made way shorter by leaving out all
> implied stuff.

This should not be done. If you code something quick-and-dirty, then fine.
If you build a system which delivers billions of pages a day and you spew out
junk, then it's just irresponsible and selfish. WordPress, for example, was
built as a standards-compliant and accessible CMS from the start. And look
where it stands today. People should stop programming browsers to render site
X correctly just as Web developers should stop wasting their time on hacks.
Standards resolve it /all/.

> On the other hand, read my gzip story, a lot of webservers now serve
> compressed content to browsers that tell them they can handle it. Why
> gzip was chosen is a bit beyond me, because there are better compression
> algorithms, and maybe even better results can be obtained with a
> dedicated one for HTML.

Gzip is a /de facto/ standard.

>> That which doesn't involve
>> hacks, workarounds and undocumented exception handling? What
>> about OpenDocument? I am glad that Google don't have a go at
>> making /that/ 'efficient'... I am worried that Google is
>> beginning to adopt Microsoft's habits of 'extending'
>> standards to suit their own convenience and agenda
>
> You think that w3c's agenda is different? Or any OS project for that
> matter? Most OS projects are a bunch of followers with one ego at the
> wheel. Sometimes one minor change isn't accepted because the head didn't
> think of it first, so instead of saying: wow, nifty! You get 10001
> arguments (most wrong) why it shouldn't be added.
>
> Maybe I bumped into the wrong people, but with the OS projects I
> contacted I always got:
>
> - deny
> - make it sound insignificant
> - argue why it shouldn't be added no matter what

I can attest to the same experience. Probably self-centred programmers who
are possessive.

> A very common mistake, which you also seem to make, is that you think
> that OS is different from a company with a bunch of people all
> developing. The only difference is that you can take the source, and
> modify it to your own needs. Things like support and speed of fixes all
> depend on the bunch of people.
>
>> (compromising for speed in that case). Microsoft Office
>> formats, for example, use binary because it's quicker than
>> XML or a well-structured and easily interpertable
>> (backward-'engineerable') form, among other reasons.
>
> XML is often mistaken for a better solution, especially compared to
> binary, because it's human readable. For most people a hex dump and an
> XML dump is equally readable: not.

I strongly disagree. Many people don't document their hex dump. Trust me,
they don't. And sometimes, human-readable has its merits. My Palm archives
are utterly useless if they are a mishmash of binary and ASCII. If it was
XML, I could at least migrate my data manually, understanding what I'm
doing. I have also done some mass alteration of configurations in programs
using search and replace in XML settings files. Why? Because it's quicker.
You don't get this flexibility with 'binary blobs'.

> One thing you see a lot in IT is that suddenly a "new technique" pops up
> and a lot of people jump on that and claim that they are better then the
> competition because of their use of that technique. XML is a good
> example as any.

XML is a concept. Let's just replace "XML" with the term "structured data".

> A well documented binary format is as good as a well documented XML
> format for implementing it. For the majority of users it doesn't really
> matter. And yes, the binary format can be made 10-50 times smaller
> compared to XML, which is noticeable in speed.

Computers are fast nowadays. Some people still work with bloatware. There are
also /ad hoc/ methods for making things quicker, e.g. cumulative read/write.
Speaking of which, OpenDocument speeds will improve. Let the implementation
mature. And disregard the Microsoft FUD. They are just afraid because their
cash cow is in jeopardy as many countries are putting ODF policies in place.

> Also often forgotten is that XML is a very bare bone specification. It's
> not more complicated then: an int is always 64 bits, a character always
> 16 bits. An implementation is nothing more then describing the structure
> (records) and one can do that as good with XML as with a binary format,
> since nothing is stopping you from replacing each XML element with a,
> for example, 8 bit value, and pack attributes similar. Or: hexdumping
> XML just shows a binary format which uses a lot of space to store a
> little information nothing more, nothing less.

Binary is serial. XML is by nature hierarchical and explicitly so. That's why
folks like Tim Bray had it proposed in the first place, I assume.

Best wishes,

Roy

--
Roy S. Schestowitz | "Oops. My brain just hit a bad sector"
http://Schestowitz.com | Free as in Free Beer | PGP-Key: 0x74572E8E
Load average (/proc/loadavg): 1.21 1.00 0.96 4/140 13374
http://iuron.com - semantic search engine project initiative

Message has been deleted

Big Bill

Sep 3, 2006, 2:50:19 AM

You don't see it used in off-web applications though. Mind you, these
days, what's an off-web application?

Big Bill

Sep 3, 2006, 2:50:19 AM

On Sun, 03 Sep 2006 01:23:53 -0400, John A.
<no....@spammers.virg.iniaqu.ilter.allowed.com> wrote:

>It's also used to describe the UI in Mozilla/Seamonkey/Firefox:
>http://www.mozilla.org/projects/xul/
>
>as well as, IIRC, Google's AdWords Editor:
>http://services.google.com/adwordseditor/
>which I believe runs on Mozilla's XULRunner:
>http://developer.mozilla.org/en/docs/XULRunner
>
>I believe QuickBooks' SDK uses XML as well in its calls and replies.
>http://developer.intuit.com/QuickBooksSDK/Briefing/

Well I suppose Quickbooks is usually used off-line.

Els

Sep 3, 2006, 3:00:08 AM

Big Bill wrote:

[XML]

> You don't see it used in off-web applications though. Mind you, these
> days, what's an off-web application?

Would the archives of my MSN chats be off-web?
(I surely hope so btw! ;-))
They are in XML.

--
Els http://locusmeus.com/
accessible web design: http://locusoptimus.com/

John Bokma

Sep 3, 2006, 3:01:04 AM

John A. <no....@spammers.virg.iniaqu.ilter.allowed.com> wrote:

> On 3 Sep 2006 00:24:16 GMT, John Bokma <jo...@castleamber.com> wrote:

>>You can use XML for anything you want. XML is not a replacement for
>>INI files, but one can use an XML application that stores the same
>>information as is stored in the INI file it replaces.
>

> It's also used to describe the UI in Mozilla/Seamonkey/Firefox:
> http://www.mozilla.org/projects/xul/
>
> as well as, IIRC, Google's AdWords Editor:
> http://services.google.com/adwordseditor/
> which I believe runs on Mozilla's XULRunner:
> http://developer.mozilla.org/en/docs/XULRunner
>
> I believe QuickBooks' SDK uses XML as well in its calls and replies.
> http://developer.intuit.com/QuickBooksSDK/Briefing/
>

> I'm sure the list could go on and on.

Of the programs I use, besides the already mentioned MSN Messenger:

OOo (OpenOffice.org)
Ant (apache)

Probably a few more I forgot :-)

John Bokma

Sep 3, 2006, 3:30:11 AM

Roy Schestowitz <newsg...@schestowitz.com> wrote:

> __/ [ John Bokma ] on Saturday 02 September 2006 19:42 \__

Short, it's late, and I really like you as a person, and I hope that
reality one day is able to slap you, since I seem to fail :-)

> XML and standards facilitate modularity. Where would OSS be without
> specifications?

Where it is now? Most OS projects evolve: someone has an idea, writes
code, decides to OS it, and it grows and grows, until the team discovers
that many people are not happy with the lack of documentation and
specifications ("use the source Luke" is not funny).

Examples: PHP and Perl (Perl 6 is now specified, which I am happy
about).

CS OTOH is often developed on a tight budget and, hurray, some
companies understand that writing specs before starting to program might
offer at least some way to make things work within c * budget (with c
hopefully less than or equal to 2).

Of course there are exceptions on both, but most OS projects I am aware
of suck at 2 of the following 3: code, specification, documentation. And
most on all 3 :-D.

> Look at the mess Windows has reached (and Apple's Mac
> OS before it took Darwin). Windows still requires a 60% rewrite of the
> code (Jim Allchin) because it's utterly unmaintainable (all the
> planned features are conceded because they can't be implemented).

The thing with any large project is that there comes a time when writing
new specs and entirely *rewriting* core code is the best thing to do.
Netscape did it with their rendering engine (hurray, now we have Gecko),
and Perl is doing it with Perl 6.

Firefox is replacing their bookmarks system (because bookmarks *do*
suck), and their history format (because Mork does suck donkey ass).

> The
> need to accommodate many implementation gives power in maintaining a
> system and replacing weaker components with superior ones. That's why
> OSS is winning.

I am not going to hold my breath. Personally the winning team for me is
a mix of CS and OSS (as probably a lot of people are using atm).
Whichever does the best job is the winner, and I don't care if it's CS or
OSS. I don't have the time to manually patch OSS to my requirements, so
if there is a better CS solution, I go for it, even if that means a
vendor lock-in (which is a joke, since if the OSS goes in a different
direction I am fcked as well).

>> For HTML it's specified in the recommendation(s) when attributes must
>> be quoted and when it's ok to leave them out. No idea if Google
>> follows this. Also, a lot of people forget that HTML 4.01 has a lot
>> of optional stuff, a page can sometimes be made way shorter by
>> leaving out all implied stuff.
>
>
> This should not be done. If you code a quick-and-dirty, then fine.

Read again Roy, I guess you missed something. From the HTML 4.01
Specification (not standard):

"7.4.1 The HEAD element

[... brevity ... ]

Start tag: *optional*, End tag: *optional*
"

How can coding according to the standard^H^H^H^H^H^Hspecification be
quick and dirty?

> If
> you build a system which delivers billions of pages a day and you spew
> out junk, then it's just irresponsible and selfish.

See above.

> WordPress, for
> example, was built as a standards-compliant and accessible CMS from
> the start. And look where it stands today. People should stop
> programming browsers to render site X correctly just as Web developers
> should stop wasting their times on hacks. Standards resolve it /all/.

You must really be kidding yourself. Anyway, WP uses XHTML, bad choice
IMNSHO. As soon as XHTML is used as XML I am afraid quite a few bloggers
might end up with "Parsing error 131313 unclosed element foo at line
1232" on their page...
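
The failure mode is easy to reproduce with any XML parser: markup that an
HTML parser would tolerate is rejected outright once it is treated as XML. A
small sketch, using Python's standard XML parser as a stand-in for a
browser's (the helper name `is_well_formed` is made up):

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False on any parse error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>Hello<br/>world</p>"))  # <br/> is closed
print(is_well_formed("<p>Hello<br>world</p>"))   # <br> is never closed
```

One unclosed element anywhere in a post or comment is enough to make the
whole page fail in this mode.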

>> On the other hand, read my gzip story, a lot of webservers now serve
>> compressed content to browsers that tell them they can handle it. Why
>> gzip was chosen is a bit beyond me, because there are better
>> compression algorithms, and maybe even better results can be obtained
>> with a dedicated one for HTML.
>
> Gzip is a /de facto/ standard.

Your point is? Why does this stop W3C from creating a dedicated
compression algorithm for HTML and XML? (There is already an application
that "optimizes" XML so gzip and friends work better, forgot the name,
will look up later).

Anyway, http://en.wikipedia.org/wiki/Bzip2

No reason why gzip was picked over bzip2 IMO, unless someone did a test
with 10,000 HTML pages and gzip was the clear winner (which I doubt, but
I can be wrong sometimes :-D)
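
A quick way to try this on your own pages, as a rough sketch only: the Python below compresses the same HTML with gzip and bzip2 from the standard library and reports the sizes. The function name compare_sizes and the sample markup are made up for illustration, and a serious test would need a large set of real pages rather than one repetitive string.

```python
import bz2
import gzip


def compare_sizes(html: str) -> dict:
    """Return the raw, gzip, and bzip2 sizes (in bytes) of an HTML string."""
    raw = html.encode("utf-8")
    return {
        "raw": len(raw),
        "gzip": len(gzip.compress(raw, compresslevel=9)),
        "bzip2": len(bz2.compress(raw, compresslevel=9)),
    }


if __name__ == "__main__":
    # Hypothetical sample page; repetitive markup compresses well either way.
    sample = (
        "<html><head><title>Test</title></head><body>"
        + '<p class="item">Some repeated paragraph text.</p>\n' * 200
        + "</body></html>"
    )
    for name, size in compare_sizes(sample).items():
        print(f"{name:>5}: {size} bytes")
```

Which of the two wins depends on the input, so run it over the pages you actually serve before drawing conclusions.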

[ OS hackers ]

>> - deny
>> - make it sound insignificant
>> - argue why it shouldn't be added no matter what
>
> I can attest to the same experience. Probably self-centred programmers
> who are possessive.

When people create, it's their baby :-D. I have used both non-OS and OS
libraries, and so far have had better and faster support in the former
case. Paying does now and then have advantages :-)

>> XML is often mistaken for a better solution, especially compared to
>> binary, because it's human readable. For most people a hex dump and
>> an XML dump is equally readable: not.
>
> I strongly disagree. Many people don't document their hex dump. Trust
> me, they don't.

And they document their XML dumps?

Can you say what the legal values for size are in:

Have a look at machine generated XML, and wonder :-) To most people it
doesn't differ from a hex dump. Of course if you are familiar with the
format it *is* readable, and yes, more readable than a hex dump. But I
am afraid that to most people: look for the foo element in the bar
element around line 14 and change it into baz is as magic as: fire up a
hex editor, goto line 3efad and type deadbeef

> And sometimes, human-readable has its merits. My Palm
> archives are utterly useless if they are a mishmash of binary and
> ASCII. If it was XML, I could at least migrate my data manually,
> understanding what I'm doing.

Yes, for some people it *is* more readable. Like I wrote: "For most
people a hex dump and an XML dump is equally readable: not."

> I have also done some mass alteration of
> configurations in programs using search and replace in XML settings
> files. Why? Because it's quicker. You don't get this flexibility with
> 'binary blobs'.

My point was that to most people the tasks are equal: beyond their
reach.

>> A well documented binary format is as good as a well documented XML
>> format for implementing it. For the majority of users it doesn't
>> really matter. And yes, the binary format can be made 10-50 times
>> smaller compared to XML, which is noticeable in speed.
>
> Computers are fast nowadays. Some people still work with bloatware.

Firefox and OOo? Why shouldn't they, it works ok :-D.

> There are also /ad hoc/ methods for making things quicker, e.g.
> cumulative read/write. Speaking of which, OpenDocument speeds will
> improve. Let the implementation mature. And disregard the Microsoft
> FUD. They are just afraid because their cash cow is in jeopardy as
> many countries are putting ODF policies in place.

I do my best to ignore MS FUD as well as GNU/Linux, Firefox, and general
OS FUD.

> Binary is serial. XML is by nature hierarchical and explicitly so.
> That's why folks like Tim Bray had it proposed in the first place, I
> assume.

XML *is* binary. Like I said, make a hexdump of an XML file, it's
educational.
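
For anyone who wants to try it, here is a minimal sketch of such a hex dump in Python. The hexdump function and the XML fragment are made up for illustration; any hex editor or the xxd tool shows the same thing.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII, `width` bytes per row."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{hex_width}}  {text_part}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up XML fragment, only for illustration.
    xml = '<?xml version="1.0"?>\n<foo><bar>baz</bar></foo>\n'
    print(hexdump(xml.encode("utf-8")))
```

The left column is what is actually stored; the right column is the interpretation as text that an editor gives you.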

John Bokma

Sep 3, 2006, 3:36:03 AM9/3/06

to

Big Bill <kr...@cityscape.co.uk> wrote:

[ XML ]

> You don't see it used in off-web applications though. Mind you, these
> days, what's an off-web application?

OpenOffice.org uses it, Inkscape uses it (SVG is IIRC an XML application),
ant uses it (Apache Ant). 3 examples of pure off line applications :-D

Roy Schestowitz

Sep 3, 2006, 3:58:07 AM9/3/06

to

__/ [ John Bokma ] on Sunday 03 September 2006 08:01 \__

> John A. <no....@spammers.virg.iniaqu.ilter.allowed.com> wrote:
>
>> On 3 Sep 2006 00:24:16 GMT, John Bokma <jo...@castleamber.com> wrote:
>
>>>You can use XML for anything you want. XML is not a replacement for
>>>INI files, but one can use an XML application that stores the same
>>>information as is stored in the INI file it replaces.
>>
>> It's also used to describe the UI in Mozilla/Seamonkey/Firefox:
>> http://www.mozilla.org/projects/xul/
>>
>> as well as, IIRC, Google's AdWords Editor:
>> http://services.google.com/adwordseditor/
>> which I believe runs on Mozilla's XULRunner:
>> http://developer.mozilla.org/en/docs/XULRunner
>>
>> I believe QuickBooks' SDK uses XML as well in its calls and replies.
>> http://developer.intuit.com/QuickBooksSDK/Briefing/
>>
>> I'm sure the list could go on and on.
>
> Of the programs I use, besides the already mentioned MSN Messenger:
>
> OOo (OpenOffice.org)
> Ant (apache)
>
> Probably a few more I forgot :-)

Almost every program that I use makes use of XML/RDF: RSSOwl, KNode,
AmaroK... it's nice to stroll between different programs and be able to take
data and settings along with you. It's a freedom that's appreciated by those
not stubborn enough to justify persistence with just one package/vendor.

John, I appreciate the long reply (other path down this thread). We're both
argumentative and this could go on forever. *smile*

Best wishes,

Roy

--
Roy S. Schestowitz | Useless fact: 111111 X 111111 = 12345654321

http://Schestowitz.com | GNU is Not UNIX | PGP-Key: 0x74572E8E

root pts/7 baine.wiau.man.a Sun Sep 3 04:07 - 04:09 (00:01)

Big Bill

Sep 3, 2006, 4:13:17 AM9/3/06

to

On 3 Sep 2006 07:36:03 GMT, John Bokma <jo...@castleamber.com> wrote:

>Big Bill <kr...@cityscape.co.uk> wrote:
>
>[ XML ]
>> You don't see it used in off-web applications though. Mind you, these
>> days, what's an off-web application?
>
>OpenOffice.org uses it, Inkscape uses it (SVG is IIRC an XML application),
>ant uses it (Apache Ant). 3 examples of pure off line applications :-D

None of which I'm familiar with, though I may have used OpenOffice in
the past. Do they use it internally? Or as a means of information
exchange with other programs?

Big Bill

Sep 3, 2006, 4:13:17 AM9/3/06

to

On Sun, 3 Sep 2006 09:00:08 +0200, Els <els.a...@tiscali.nl> wrote:

>Big Bill wrote:
>
>[XML]
>> You don't see it used in off-web applications though. Mind you, these
>> days, what's an off-web application?
>
>Would the archives of my MSN chats be off-web?
>(I surely hope so btw! ;-))
>They are in XML.

They're web-oriented though. XML is for information exchange and I
suppose more of that is exchanged over the web than anywhere else.

John Bokma

Sep 3, 2006, 4:18:17 AM9/3/06

to

Big Bill <kr...@cityscape.co.uk> wrote:

> On 3 Sep 2006 07:36:03 GMT, John Bokma <jo...@castleamber.com> wrote:
>
>>Big Bill <kr...@cityscape.co.uk> wrote:
>>
>>[ XML ]
>>> You don't see it used in off-web applications though. Mind you,
>>> these days, what's an off-web application?
>>
>>OpenOffice.org uses it, Inkscape uses it (SVG is IIRC an XML
>>application), ant uses it (Apache Ant). 3 examples of pure off line
>>applications :-D
>
> None of which I'm familiar with, though I may have used OpenOffice in
> the past. Do they use it internally? Or as a means of information
> exchange with other programs?

Ooo and Inkscape both. Ant uses XML to create build files (like: copy
those files there, do this on them, then zip them, and then email them),
which I guess are used by Ant (and clones) only, but I am not 100% sure
about that. OTOH clones implies exchange with other programs :-)

Big Bill

Sep 3, 2006, 6:03:56 AM9/3/06

to

On 3 Sep 2006 08:18:17 GMT, John Bokma <jo...@castleamber.com> wrote:

>Big Bill <kr...@cityscape.co.uk> wrote:
>
>> On 3 Sep 2006 07:36:03 GMT, John Bokma <jo...@castleamber.com> wrote:
>>
>>>Big Bill <kr...@cityscape.co.uk> wrote:
>>>
>>>[ XML ]
>>>> You don't see it used in off-web applications though. Mind you,
>>>> these days, what's an off-web application?
>>>
>>>OpenOffice.org uses it, Inkscape uses it (SVG is IIRC an XML
>>>application), ant uses it (Apache Ant). 3 examples of pure off line
>>>applications :-D
>>
>> None of which I'm familiar with, though I may have used OpenOffice in
>> the past. Do they use it internally? Or as a means of information
>> exchange with other programs?
>
>Ooo and Inkscape both. Ant uses XML to create build files (like: copy
>those files there, do this on them, then zip them, and then email them),
>which I guess are used by Ant (and clones) only, but I am not 100% sure
>about that. OTOH clones implies exchange with other programs :-)

Why would a program need to put files in anything other than a native
format to communicate with a clone of itself though? Unless I suppose
XML is adopted as the native format on the premise that information
exchange with other programs will be a feature in the future.

Nikita the Spider

Sep 3, 2006, 12:46:33 PM9/3/06

to

In article <Xns9833197557...@130.133.1.4>,

John Bokma <jo...@castleamber.com> wrote:
> >
> > Binary is serial. XML is by nature hierarchical and explicitly so.
> > That's why folks like Tim Bray had it proposed in the first place, I
> > assume.
>
> XML *is* binary. Like I said, make a hexdump of an XML file, it's
> educational.

Either you're joking or you have an unusual definition of the word
"binary". Technically, every file on a computer is binary since they're
all just numbers, but that definition makes the word kind of useless, no?

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

John Bokma

Sep 3, 2006, 3:56:07 PM9/3/06

to

Big Bill <kr...@cityscape.co.uk> wrote:

> On 3 Sep 2006 08:18:17 GMT, John Bokma <jo...@castleamber.com> wrote:
>
>>Big Bill <kr...@cityscape.co.uk> wrote:
>>
>>> On 3 Sep 2006 07:36:03 GMT, John Bokma <jo...@castleamber.com> wrote:
>>>
>>>>Big Bill <kr...@cityscape.co.uk> wrote:
>>>>
>>>>[ XML ]
>>>>> You don't see it used in off-web applications though. Mind you,
>>>>> these days, what's an off-web application?
>>>>
>>>>OpenOffice.org uses it, Inkscape uses it (SVG is IIRC an XML
>>>>application), ant uses it (Apache Ant). 3 examples of pure off line
>>>>applications :-D
>>>
>>> None of which I'm familiar with, though I may have used OpenOffice
>>> in the past. Do they use it internally? Or as a means of information
>>> exchange with other programs?
>>
>>Ooo and Inkscape both. Ant uses XML to create build files (like: copy
>>those files there, do this on them, then zip them, and then email
>>them), which I guess are used by Ant (and clones) only, but I am not
>>100% sure about that. OTOH clones implies exchange with other programs
>>:-)
>
> Why would a program need to put files in anything other than a native
> format to communicate with a clone of itself though? Unless I suppose
> XML is adopted as the native format on the premise that information
> exchange with other programs will be a feature in the future.

Good question. With Ant the answer is: XML was chosen because it's
somewhat human readable. But yeah, they could have developed their own
format, and Ant would have worked the same. Maybe one can even say that
one of the issues with Ant is that they tried too hard to use XML, not
sure.

http://johnbokma.com/mexit/2006/02/21/ant-mail-task.html has a short
example. What it does: if I type

ant mail

on the command line, that piece of XML is interpreted (<target
name="mail">). It sends an email message and attaches 0 or more files
that are in the given directory (specified with ${dist.dir}, which is
assigned a path somewhere at the top of the file, and not shown on that
page).
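
To give an idea of how a tool gets from a target name to the work it has to do, here is a rough Python sketch. It is not how Ant itself is implemented, and the build file in it is a made-up, simplified one (not the one from the linked page); describe_target and expand are hypothetical helper names.

```python
import re
import xml.etree.ElementTree as ET

# Made-up, simplified build file; not the one from the linked page.
BUILD_XML = """\
<project name="demo" default="mail">
  <property name="dist.dir" value="dist"/>
  <target name="mail">
    <mail subject="New build">
      <fileset dir="${dist.dir}"/>
    </mail>
  </target>
</project>
"""


def expand(value: str, props: dict) -> str:
    """Replace ${name} references with property values; unknown names stay as-is."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        value,
    )


def describe_target(build_xml: str, target_name: str) -> list:
    """List (tag, attributes) for every element inside the named target."""
    root = ET.fromstring(build_xml)
    # Collect the top-level properties so ${...} references can be resolved.
    props = {p.get("name"): p.get("value") for p in root.findall("property")}
    target = root.find(f"target[@name='{target_name}']")
    if target is None:
        raise KeyError(f"no target named {target_name!r}")
    return [
        (el.tag, {k: expand(v, props) for k, v in el.attrib.items()})
        for el in target.iter()
        if el is not target
    ]


if __name__ == "__main__":
    for tag, attrs in describe_target(BUILD_XML, "mail"):
        print(tag, attrs)
```

The sketch only looks up the target and resolves the property; actually zipping and mailing the files is left out.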

For me this works great, I edit some files, I test the project, and when
I think I am done, I just type ant mail, and the result is emailed to
the customer :-) No need to zip manually, type a message in Thunderbird,
attach the zip, and press send.

(I currently use a more advanced version that also has the version and
name of the project in the Subject of the email :-) ).

Since I keep what I have done on the project (what's new) in a txt file
that's part of the project, I don't have to type it in an email, and I
keep this part of the documentation in the (IMO) right place.

Big Bill

Sep 3, 2006, 4:42:14 PM9/3/06

to

Else - he's doing it again!

BB :-)