
A good representation of XML in Java?


Spud

Sep 5, 2010, 12:15:09 PM
I'm designing an API that is intended to make it easy to manipulate XML.
The idea is that you read XML from disk into a "document", and then you
can add/delete nodes or attributes, iterate over nodes, search for
nodes, etc.

The overall design goal is simplicity, speed, and memory efficiency, and
not necessarily correctness.

I'd really like to use an existing API, but haven't found one I like.
I've looked at DOM, JDOM, dom4j, and XOM. (XOM is the most interesting,
but it forces the use of Strings, which is a memory management nightmare
when you're dealing with millions of files.)

Does anyone know of a good DOM-like API for XML?

Arne Vajhøj

Sep 5, 2010, 2:14:32 PM

I don't know of any that you have not listed.

And in general I would recommend "widely used" over
"personally like" any day.

Arne

jaap

Sep 5, 2010, 3:25:08 PM
On 05-09-10 18:15, Spud wrote:

Oh my god! I have met a rather experienced programmer who had the same
idea. When he left the company, I had the honour to inherit his code. It
was a nightmare.
Use a standard API; if you don't like it or if you find bugs, repair it.

What do you want to use code for if it does not have to be correct?

Jaap

Daniel Pitts

Sep 5, 2010, 3:43:37 PM

Also, for processing XML:
StAX, SAX.

Correctness is always a requirement. Without it, speed and efficiency
are meaningless.

As far as I know, you are going to need to use Strings for any kind of
text you get out of an XML document, or at least something equivalent.
Strings are not any more troublesome than anything else as far as memory
management is concerned.

Unless you find a road that modern wheels don't work on, then don't
bother reinventing the wheel. In other words, unless you have an actual
need for a new "DOM" implementation, don't waste your effort.

To carry the analogy further, if you go off-roading, look at things
other than wheels. If you're dealing with large XML files, use a push
or pull based parser (SAX or StAX respectively) instead of a Document
based one.
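
As a rough illustration of the pull-based approach, here is a minimal StAX sketch that counts elements by name without ever building a document tree. The class name ElementCounter, the method countElements, and the sample XML are assumptions made up for this example; only the javax.xml.stream calls are standard JDK API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams through XML and counts matching elements.
final class ElementCounter {

    // Returns how many start tags named localName occur in the given XML text.
    static int countElements(String xml, String localName) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                int count = 0;
                while (reader.hasNext()) {
                    // next() pulls one event at a time; no tree is kept in memory.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && localName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><book id=\"1\"/><book id=\"2\"/><dvd/></library>";
        System.out.println(countElements(xml, "book"));
    }
}
```

Memory use here stays roughly constant regardless of file size, which is the usual reason to prefer a streaming parser for very large inputs.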

Have you tried using an existing DOM implementation for your use-cases?
What trouble did you run into?

Good luck,
Daniel.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

Arne Vajhøj

Sep 5, 2010, 4:24:08 PM
On 05-09-2010 15:43, Daniel Pitts wrote:
> On 9/5/2010 9:15 AM, Spud wrote:
>> The overall design goal is simplicity, speed, and memory efficiency, and
>> not necessarily correctness.

> Correctness is always a requirement. With out it, speed and efficiency
> are meaningless.

Good advice.

If correctness is not required, then the following is
simple, fast, and has a low memory footprint:

public class SuperFastXmlParser {
    public SuperFastXmlParser(String fnm) {
        // nothing
    }
    public String getAnythingYouWant() {
        return "42";
    }
}

:-)

Arne

Lew

Sep 5, 2010, 4:28:37 PM
Spud wrote:
>> I'm designing an API that is intended to make it easy to manipulate XML.
>> The idea is that you read XML from disk into a "document", and then you
>> can add/delete nodes or attributes, iterate over nodes, search for
>> nodes, etc..

Foolish, but good luck anyway.

>> The overall design goal is simplicity, speed, and memory efficiency, and
>> not necessarily correctness.

Yeah, because fast and incorrect is so-o-o much better than slower but correct.

If I could've gotten away with wrong answers, I'd've been a much faster
programmer all these years.

>> I'd really like to use an existing API, but haven't found one I like.
>> I've looked at DOM, JDOM, dom4j, and XOM. (XOM is the most interesting,
>> but it forces the use of Strings, which is a memory management nightmare
>> when you're dealing with millions of files.)

What is your criterion for "like"?

>> Does anyone know of a good DOM-like API for XML?

Daniel Pitts wrote:
> DOM, JDOM, dom4j, and XOM.
>
> Also, for processing XML:
> StAX, SAX.
>
> Correctness is always a requirement. With out it, speed and efficiency
> are meaningless.

What he said.

> As far as I know, you are going to need to use Strings for any kind of
> text you get out of an XML document, or at least something equivalent.
> Strings are not any more troublesome than anything else as far as memory
> management it concerned.

JAXB, baby!

It derives types from the schema, so you're not limited to Strings.

> Unless you find a road that modern wheels don't work on, then don't
> bother reinventing the wheel. In other words, unless you have an actual
> need for a new "DOM" implementation, don't waste your effort.

> To carry the analogy further, if you go off-roading, look at things
> other than wheels. If you're dealing with large XML files, use a push or
> pull based parser (SAX or StAX respectively) instead of a Document based
> one.
>
> Have you tried using existing DOM implementation for your use-cases?
> What trouble did you run into?

As Arne said, there's a value to using the standards even if you don't
entirely "like" them.

How is "like" an engineering evaluation? I propose that you use
"maintainable, correct and allows me to be replaced as the programmer" instead.

--
Lew

Eric Sosman

Sep 5, 2010, 4:43:58 PM
On 9/5/2010 12:15 PM, Spud wrote:
> [...]

> The overall design goal is simplicity, speed, and memory efficiency, and
> not necessarily correctness. [...]

public class SpudsXML {
    public SpudsXML() {
        throw new UnsupportedOperationException("SAFUYOYO");
    }
}

... seems to meet the requirements.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Joshua Cranmer

Sep 5, 2010, 4:58:07 PM
On 09/05/2010 12:15 PM, Spud wrote:
> The overall design goal is simplicity, speed, and memory efficiency, and
> not necessarily correctness.

There once was a numerical analysis professor who often got requests for
solving some equations. When he asked about what error the person
wanted, the response was often "I don't care." In those cases, he would
respond "the answer is 7... with unknown error."

> Does anyone know of a good DOM-like API for XML?

Java's DOM stuff has worked well enough for me. Then again, I don't do
heavy XML coding.
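
For reference, the operations the original post lists (read, add/delete, iterate, search) look roughly like this with the JDK's built-in W3C DOM. This is only a sketch: the class DomSketch, its method names, and the "item"/"name" vocabulary are invented for illustration, and malformed input is simply reported as an unchecked exception.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper showing basic W3C DOM manipulation.
final class DomSketch {

    // Parses XML text into an in-memory DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Appends <item name="..."/> under the root element.
    static void addItem(Document doc, String name) {
        Element item = doc.createElement("item");
        item.setAttribute("name", name);
        doc.getDocumentElement().appendChild(item);
    }

    // Deletes every <item> whose name attribute matches; returns the number removed.
    static int removeItems(Document doc, String name) {
        NodeList items = doc.getElementsByTagName("item");
        int removed = 0;
        // The NodeList is live, so walk it backwards while removing.
        for (int i = items.getLength() - 1; i >= 0; i--) {
            Element item = (Element) items.item(i);
            if (name.equals(item.getAttribute("name"))) {
                item.getParentNode().removeChild(item);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Document doc = parse("<list><item name=\"a\"/><item name=\"b\"/></list>");
        addItem(doc, "c");
        removeItems(doc, "a");
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```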

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Daniel Pitts

Sep 5, 2010, 8:35:21 PM
On 9/5/2010 1:28 PM, Lew wrote:
> How is "like" an engineering evaluation? I propose that you use
> "maintainable, correct and allows me to be replaced as the programmer"
> instead.
Change "replaced as the programmer" to "collaborate with other
programmers", and you have yourself a deal :-)

Actually, now that I said that, I realize the mistake.

It may allow me to be replaced as the programmer, but only as I become
the engineer dictating the structure of the system ;-)

The real benefit of not writing your own wheel is this: You come to
your boss and say "You wanted something round: Here it is!" and present
him with an existing wheel. He then is grateful that this didn't turn
into one of those long, expensive, buggy, failed projects, and you are
kept on. Then, when your boss asks you to "integrate this wheel with an
axle", you look at the wheel you chose. It happens to have that
integration built in, because that is a common thing that needs to be
done. You show your boss how fortuitous your thinking was when you
originally presented the wheel, and you get a promotion.

Also, you got all of that benefit WITHOUT the extra work of having to
prove how "clever" you were by reinventing the wheel.

"Can we add brakes too?" "Already done sir!"

Lew

Sep 5, 2010, 9:57:10 PM
Lew wrote:
>> How is "like" an engineering evaluation? I propose that you use
>> "maintainable, correct and allows me to be replaced as the programmer"
>> instead.

Daniel Pitts wrote:
> Change "replaced as the programmer" with "collaborate with other
> programmers", and you have yourself a deal :-)
>
> Actually, now that I said that, I realize the mistake.
>
> It may allow me to be replaced as the programmer, but only as I become
> the engineer dictating the structure of the system ;-)

The mistake is when people code things to attempt to become irreplaceable.
The best coding is when you code the program to be taken over easily by
someone else. I call that "allowing myself to be replaced". There's no
scarcity of new and exciting projects to take on, and proper coding prevents
one from being trapped in the same project for the remainder of one's career.

By doing that you *are* the engineer dictating the structure of the system.
If you don't code to make yourself replaceable you will never leave the cellar
of maintenance for that code to progress to architect or chief engineer or
whatever.

So I reject the reinterpretation and aver that it's wise to "code to be replaced".

--
Lew

BGB / cr88192

Sep 5, 2010, 11:29:24 PM

"Daniel Pitts" <newsgroup....@virtualinfinity.net> wrote in message
news:eHWgo.105116$lS1....@newsfe12.iad...

there are reasons when and when not to reinvent.

sometimes, there are cases where a small and specific piece of code to solve
a problem is preferable to a large and complex 3rd party library, as
dragging along a library can be more of a problem than simply having a piece
of code to get the job done.

but, if the logic is well outside the realm of trivial, one may consider
looking into existing options and seeing if there is anything which fits
one's requirements.

if no options are a good fit, or other notable issues exist (terrible code,
licensing issues, ...), then there is IMO not much problem in coding up
one's own solution.


even then, I prefer to try to stay within applicable standards, reuse
whatever designs and code can be used effectively, ... since novel solutions
tend to create more effort in the long run, and typically, the more
conservative the solution, the easier it is to keep everything going well
later.


granted, if the feature is part of the standard library, then unless there
is a very good reason, it is ill-advised to do one's own version.


in all though, there is no simple "cut and dry" answer...


bugbear

Sep 6, 2010, 4:55:33 AM
Spud wrote:
> I'm designing an API that is intended to make it easy to manipulate XML.
> The idea is that you read XML from disk into a "document", and then you
> can add/delete nodes or attributes, iterate over nodes, search for
> nodes, etc..

Have you looked at XSLT?
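
To give a feel for what that might look like from Java, here is a small sketch that runs an XSLT 1.0 identity transform with one extra template that deletes a particular element. The class XsltSketch, the method dropObsolete, and the element name "obsolete" are all hypothetical; the javax.xml.transform classes are the JDK's standard transformation API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: deletes nodes declaratively via XSLT.
final class XsltSketch {

    // Identity transform plus one empty template that drops every <obsolete> element.
    private static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"@*|node()\">"
          + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"obsolete\"/>"
          + "</xsl:stylesheet>";

    // Returns the input XML with all <obsolete> elements (and their content) removed.
    static String dropObsolete(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dropObsolete("<root><keep/><obsolete>x</obsolete></root>"));
    }
}
```

The more specific "obsolete" template wins over the generic copy template because XSLT gives it a higher default priority.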

BugBear

Tom Anderson

Sep 6, 2010, 8:20:43 AM

Where this analogy fails is that DOM is not the equivalent of a decent
off-the-shelf wheel. It's the equivalent of a wheel which is round for
most of its circumference, flat for a third of it, and comprises a rim
made of cast iron with spokes made of string. Your boss and colleagues may
not be impressed if you choose it.

It lets you very quickly procure a wheel, which will get your project
rolling, but it's not one that's going to give you a comfortable ride. In
this situation, building your own wheel will take longer, but give you a
better ride and a higher top speed.

However, in this case, there is another option, in the shape of dom4j,
which is a wheel of much better craftsmanship.

There is another consideration, of course, and that is the availability of
wheelwrights, and interoperability with other wheel-using machinery. W3C
DOM is the standard, so everyone knows how to use it, and everyone can
interoperate with it. dom4j may be much better to work with directly, but
it's a pain for interoperation. It does have a W3C wrapper that lets it
masquerade as DOM, but from what I remember, it can take some contortions
to make this work.

tom

--
Women are monsters, men are clueless, everyone fights and no-one ever
wins. -- cleanskies

David Lamb

Sep 6, 2010, 10:24:50 AM
On 05/09/2010 9:57 PM, Lew wrote:
> The mistake is when people code things to attempt to become
> irreplaceable.

I once heard the aphorism, "if an employee is irreplaceable, fire him --
and his manager."

Arved Sandstrom

Sep 6, 2010, 11:28:28 AM

I've heard that. This is not to be confused with the oft-encountered
situation where there are no in-company backstops or readily available
external replacements for certain critical people; if someone is a superstar
and is doing everything correctly you'd have to be an idiot to fire them,
you just hope they don't get a better offer somewhere else.

What we're talking about is where a person has deliberately engineered a
situation where they cannot be replaced without damage to budget and
schedules, and it was avoidable. So this doesn't include a very good
developer using techniques (that someone at his experience level ought to
have) to solve a problem that are outside the grasp of his less-skilled
colleagues; in this case I'd hire better people, not fire the single
excellent dude that you've got. But it does include, for example, a
technically skilled programmer going outside the box and using a
non-authorized programming language to solve a problem (even if that
language was indisputably the best technical choice) when he knows damned
well that only 3 guys in the entire metropolis know how to code in that
language.

AHS
--
Before a man speaks it is always safe to assume that he is a fool.
After he speaks, it is seldom necessary to assume it. -- H.L. Mencken


Joshua Cranmer

Sep 6, 2010, 12:25:09 PM

I know someone who was hired to make an irreplaceable employee replaceable.

David Lamb

Sep 6, 2010, 6:18:55 PM
On 06/09/2010 11:28 AM, Arved Sandstrom wrote:
> David Lamb wrote:
>> On 05/09/2010 9:57 PM, Lew wrote:
>>> The mistake is when people code things to attempt to become
>>> irreplaceable.
>>
>> I once heard the aphorism, "if an employee is irreplaceable, fire him
>> -- and his manager."
>
> I've heard that. This is not to be confused with the oft-encountered
> situation where there are no in-company backstops or readily available
> external replacements for certain critical people; if someone is a superstar
> and is doing everything correctly you'd have to be an idiot to fire them,
> you just hope they don't get a better offer somewhere else.

Yes, but if they're hit by a bus tomorrow, you'd better have some idea
of how to cope.

Eric Sosman

Sep 6, 2010, 9:46:12 PM

A recent example of the irreplaceable employee is that of Mr.
James Gandolfini, who worked for HBO. If the press reports are to
be believed, he extracted a two-point-five-times salary raise on
the grounds that he was indispensable to their enterprise.

But imagine yourself as an HBO executive: Are you *really*
gonna put out a contract on Tony Soprano? If you do, do you think
the contractor will whack him, or whack you?

Sometimes you gotta just suck it up.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Lew

Sep 7, 2010, 8:32:24 AM

Why is it "hit by a bus"? Bus accidents are quite rare, by comparison to
other ways of dying. Why isn't it "hit by a train" or "... meteorite" or "...
stray bullet" or "... myocardial infarction"?

--
Lew

Spud

Sep 7, 2010, 1:27:09 PM

A clarification on what I meant by "not necessarily correct": this app
needs to support nodes, attributes, and text, and nothing else. Many XML
libraries add much complexity handling namespaces, comments, processing
instructions, DTDs, and schemas. That's not necessary here. The
overriding goal is to make it fast and simple for developers to
manipulate the specific type of XML that we have.

On strings: when you're doing (very) large scale data processing,
strings are a no-no. The only answer is to have some kind of reusable
buffer for storing text. Here's a discussion of the topic:

http://lingpipe-blog.com/2010/06/22/the-unbearable-heaviness-jav-strings/

Lew

Sep 7, 2010, 3:56:27 PM
Spud wrote:
> A clarification on what I meant by "not necessarily correct": this app
> needs to support nodes, attributes, and text, and nothing else. Many XML
> libraries add much complexity handling name spaces, comments, processing
> instructions, DTDs, and schemas. That's not necessary here. The
> overriding goal is to make it fast and simple for developers to
> manipulate the specific type of XML that we have.
>

Most XML parsers have the ability to switch namespace processing on or
off. However, switching it off is not a good idea.

Since the existing libraries handle namespaces just fine, there's no
need to worry about how hard it is to implement.
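
With the JDK's JAXP parsers, for example, that switch is DocumentBuilderFactory.setNamespaceAware. Here is a minimal sketch of what changes when it is flipped; the class NamespaceToggle, the method rootNamespace, and the sample document are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: parses with namespace processing on or off.
final class NamespaceToggle {

    // Returns the root element's namespace URI, or null if the parser did not record one.
    static String rootNamespace(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // JAXP factories default to namespace-unaware parsing.
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNamespaceURI();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p:doc xmlns:p=\"urn:example\"/>";
        System.out.println(rootNamespace(xml, true));
        System.out.println(rootNamespace(xml, false));
    }
}
```

With processing switched off, the prefix is treated as part of the element name and the namespace information is not available to the caller, which is one reason switching it off tends to cause trouble later.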

> On strings: when you're doing (very) large scale data processing,
> strings are a no-no. The only answer is to have some kind of reusable
> buffer for storing text. Here's a discussion of the topic:
>

> http://lingpipe-blog.com/2010/06/22/the-unbearable-heaviness-jav-stri...
>

The other only answer is to use JAXB.

--
Lew

Daniel Pitts

Sep 7, 2010, 4:26:44 PM
On 9/7/2010 10:27 AM, Spud wrote:
> On strings: when you're doing (very) large scale data processing,
> strings are a no-no. The only answer is to have some kind of reusable
> buffer for storing text.
Define very large scale? Also, "strings" are not the same as "Strings",
so let's make sure you are specifically talking about "using String
instances is a no-no".

All this says is that there is a 60 byte overhead per String instance.
Your original objection was that it had something to do with bogging
down the GC.

While 60 byte overhead seems to be a lot, keep in mind that any object
holding string-like data will have the same overhead. Reusable objects
(char buffers) may interfere with the GC more than short lived objects,
depending on the GC implementation. I've heard this is the case for most
modern JVM GCs, which is why Object pools are not common in modern Java
programs.

It sounds to me like you read in a few places about potential
inefficiencies with Java Strings, and have decided to avoid them before
finding actual, practical problems.

Arne Vajhøj

Sep 7, 2010, 6:38:25 PM

Why are demo classes/methods often called foobar?

That is just common practice.

Same with walking out in front of a bus, hit by a bus, bus number,
etc.

http://en.wikipedia.org/wiki/High_bus_number

Arne

Peter Duniho

Sep 7, 2010, 8:02:47 PM
Arne Vajhøj wrote:
> Why are demo classes/methods often called foobar?

"Foobar" derives from "fubar", an old (military) slang for "fucked up
beyond all recognition". The word didn't just show up out of thin air;
there's some history behind it.

> That is just common practice.
>
> Same with walking out in front of a bus, hit by a bus, bus number
> etc..

Right. And likewise, the idea of being hit by a bus didn't just come
out of thin air. Culturally, there was a fixation on that particular
accident and it gained traction as a metaphor for any sudden and
unexpected demise.

One explanation notes a number of literary appearances of the phrase,
going back to 1907 (when buses were relatively new, and probably the
most likely thing to kill a random person wandering across a street).
Like many pop culture references, once started it can feed on itself,
until it's just a standard catch-phrase.
http://www.slate.com/id/2223749/

What any of this has to do with XML, never mind XML in Java, I have no
idea. :)

Pete

Arne Vajhøj

Sep 7, 2010, 9:25:23 PM
On 07-09-2010 20:02, Peter Duniho wrote:
> Arne Vajhøj wrote:
>> Why are demo classes/methods often called foobar?
>
> "Foobar" derives from "fubar", an old (military) slang for "fucked up
> beyond all recognition".

Wikipedia just considers that a "may":

http://en.wikipedia.org/wiki/Foobar
http://en.wikipedia.org/wiki/FUBAR

and given that the one is neutral and the other one is
very negative, then it is not that obvious.

> What any of this has to do with XML, never mind XML in Java, I have
> no idea. :)

Usenet !

Arne

Lew

Sep 7, 2010, 9:42:51 PM
Peter Duniho wrote:
>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>> beyond all recognition".

Arne Vajhøj wrote:
> Wikipedia just consider that a "may":
>
> http://en.wikipedia.org/wiki/Foobar
> http://en.wikipedia.org/wiki/FUBAR
>
> and given that the one is neutral and the other one is
> very negative, then it is not that obvious.

A link from that second Wikipedia entry points to RFC 3092 which gives more
details.
<http://www.faqs.org/rfcs/rfc3092.html>

The only question seems to be which derived from which; it's utterly obvious
that the terms are tightly related regardless of etymology.

Certainly it's no coincidence that "foo" and "bar" appear together in computer
texts; the relationship there to "fubar" is deliberate and, of course, utterly
obvious.

--
Lew

Mike Schilling

Sep 8, 2010, 2:08:18 AM

"Peter Duniho" <NpOeS...@NnOwSlPiAnMk.com> wrote in message
news:Hbedncef3os0TxvR...@posted.palinacquisition...

What's odd is that

1. I've heard that expression many, many times in the context of "What would
we do if this person with lots of valuable information in his head
disappears?"
2. But it's always been "What if he gets hit by a *truck*?"

Peter Duniho

Sep 8, 2010, 2:13:40 AM
Arne Vajhøj wrote:
> On 07-09-2010 20:02, Peter Duniho wrote:
>> Arne Vajhøj wrote:
>>> Why are demo classes/methods often called foobar?
>>
>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>> beyond all recognition".
>
> Wikipedia just consider that a "may":
>
> http://en.wikipedia.org/wiki/Foobar
> http://en.wikipedia.org/wiki/FUBAR
>
> and given that the one is neutral and the other one is
> very negative, then it is not that obvious.

Whatever.

I'm willing to entertain the idea that "fubar" appeared as an acronym
and became popular because of a previously existing term "foo".

But it's clear enough that whether the initial military slang was
"fubar" or "foobar", the use of "foo" and "bar" in CS derives from the
earlier military usage.

Pete

Lew

Sep 8, 2010, 7:47:53 AM
Mike Schilling wrote:
> What's odd is that
>
> 1. I've heard that expression many, many times in the context of "What
> would we do if this person with lots of valuable information in his head
> disappears?"
> 2. But it's always been "What if he gets hit by a *truck*?"

I've actually heard, "What if (s)he gets abducted by aliens?"

If that's the risk, job security is assured.

--
Lew

David Lamb

Sep 8, 2010, 12:22:36 PM
On 08/09/2010 2:13 AM, Peter Duniho wrote:
> Arne Vajhøj wrote:
>> On 07-09-2010 20:02, Peter Duniho wrote:
>>> Arne Vajhøj wrote:
>>>> Why are demo classes/methods often called foobar?
>>>
>>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>>> beyond all recognition".
>>
>> Wikipedia just consider that a "may":
>>
>> http://en.wikipedia.org/wiki/Foobar
>> http://en.wikipedia.org/wiki/FUBAR
>>
>> and given that the one is neutral and the other one is
>> very negative, then it is not that obvious.

Sure it is. The term arose decades ago when computers (both hardware
and software) were even more unreliable than now.

> But it's clear enough that whether the initial military slang was
> "fubar" or "foobar", the use of "foo" and "bar" in CS derives from the
> earlier military usage.

For what it's worth when considering etymology: When I was in the
Computer Science Department at Carnegie-Mellon University in the
mid-1970's, it was generally accepted without question that foobar came
from FUBAR.

Lew

Sep 8, 2010, 1:20:21 PM
David Lamb wrote:
> For what it's worth when considering etymology: When I was in the
> Computer Science Department at Carnegie-Mellon University in the
> mid-1970's, it was generally accepted without question that foobar came
> from FUBAR.
>

It's worth something, but it is not proof.

It's worth observing that "foobar" and "fubar" (or the upper-case
version thereof) are culturally intertwined, regardless of whether the
derivation is folk etymology. However, if the one doesn't derive from
the other then the coincidence is just too incredible. Occam's Razor
suggests acceptance of the etymology.

<http://en.wikipedia.org/wiki/False_etymology>
<http://en.wikipedia.org/wiki/Occams_razor>

--
Lew

Tom Anderson

Sep 8, 2010, 7:09:03 PM
On Tue, 7 Sep 2010, Peter Duniho wrote:

> Arne Vajhøj wrote:
>> Why are demo classes/methods often called foobar?
>
> "Foobar" derives from "fubar", an old (military) slang for "fucked up
> beyond all recognition". The word didn't just show up out of thin air;
> there's some history behind it.

Controversial!

>> That is just common practice.
>>
>> Same with walking out in front of a bus, hit by a bus, bus number
>> etc..
>
> Right. And likewise, the idea of being hit by a bus didn't just come out of
> thin air. Culturally, there was a fixation on that particular accident and
> it gained traction as a metaphor for any sudden and unexpected demise.
>
> One explanation notes a number of literary appearances of the phrase, going
> back to 1907 (when buses were relatively new, and probably the most likely
> thing to kill a random person wandering across a street). Like many pop
> culture references, once started it can feed on itself, until it's just a
> standard catch-phrase.
> http://www.slate.com/id/2223749/
>
> What any of this has to do with XML, never mind XML in Java, I have no idea.
> :)

Because XML experts are most at risk of getting run over by an enterprise
service bus DUH.

tom

--
If you're going to print crazy, ridiculous things, you might as well
make them extra crazy. -- Mark Rein

Tom Anderson

Sep 8, 2010, 7:12:31 PM
On Tue, 7 Sep 2010, Lew wrote:

> Peter Duniho wrote:
>>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>>> beyond all recognition".
>

> Arne Vajhøj wrote:
>> Wikipedia just consider that a "may":
>>
>> http://en.wikipedia.org/wiki/Foobar
>> http://en.wikipedia.org/wiki/FUBAR
>>
>> and given that the one is neutral and the other one is
>> very negative, then it is not that obvious.
>
> A link from that second Wikipedia entry points to RFC 3092 which gives more
> details.
> <http://www.faqs.org/rfcs/rfc3092.html>
>
> The only question seems to be which derived from which; it's utterly obvious
> that the terms are tightly related regardless of etymology.
>
> Certainly it's no coincidence that "foo" and "bar" appear together in
> computer texts; the relationship there to "fubar" is deliberate and, of
> course, utterly obvious.

I don't see a relationship at all, and deny entirely that it is deliberate
or obvious. The words seem completely unrelated to me.

I think, like "bog standard", this is one of those phrases whose origin,
despite lying in the most comprehensively documented century in human
history, will remain forever shrouded in mystery.

Arne Vajhøj

Sep 8, 2010, 8:24:17 PM

Good one!

Arne

RedGrittyBrick

Sep 9, 2010, 4:50:44 AM
On 09/09/2010 00:12, Tom Anderson wrote:
> On Tue, 7 Sep 2010, Lew wrote:
>
>> Peter Duniho wrote:
>>>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>>>> beyond all recognition".
>>
>> Arne Vajhøj wrote:
>>> Wikipedia just consider that a "may":
>>>
>>> http://en.wikipedia.org/wiki/Foobar
>>> http://en.wikipedia.org/wiki/FUBAR
>>>
>>> and given that the one is neutral and the other one is
>>> very negative, then it is not that obvious.
>>
>> A link from that second Wikipedia entry points to RFC 3092 which gives
>> more details.
>> <http://www.faqs.org/rfcs/rfc3092.html>
>>
>> The only question seems to be which derived from which; it's utterly
>> obvious that the terms are tightly related regardless of etymology.
>>
>> Certainly it's no coincidence that "foo" and "bar" appear together in
>> computer texts; the relationship there to "fubar" is deliberate and,
>> of course, utterly obvious.
>
> I don't see a relationship at all, and deny entirely that it is
> deliberate or obvious. The words seem completely unrelated to me.
>

Bloomin 'eck, you're yanking our chains! Sling yer hook!

--
RGB

Spud

Sep 11, 2010, 8:00:56 PM
No. It's based on actual experience, the same experience the blogger had.

One large reusable buffer, containing many strings, is far more
efficient than creating and destroying String objects millions of times
over. Simply have a large array of char, and two parallel arrays of int
that contain pointers to the start and end of each string. This doesn't
work for every app, but it does for this one. Once you've filled the
buffer, processed your batch of strings, and no longer need them, just
reset the pointers and you're done. Zero GC required.
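
A minimal sketch of that layout, for readers who want to picture it: one shared char array plus two parallel int arrays of start and end offsets. This is not the actual class from the post; the name CharBatchBuffer, the initial capacities, and the doubling growth policy are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical batch buffer: many strings stored in one reusable char array.
final class CharBatchBuffer {
    private char[] chars = new char[1024];
    private int[] starts = new int[64];
    private int[] ends = new int[64];
    private int used;   // number of chars filled so far
    private int count;  // number of strings stored so far

    // Copies text into the shared buffer and returns its index.
    int add(CharSequence text) {
        int length = text.length();
        if (used + length > chars.length) {
            // Growth allocates, but only until the buffer fits the largest batch.
            chars = Arrays.copyOf(chars, Math.max(chars.length * 2, used + length));
        }
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            ends = Arrays.copyOf(ends, count * 2);
        }
        for (int i = 0; i < length; i++) {
            chars[used + i] = text.charAt(i);
        }
        starts[count] = used;
        used += length;
        ends[count] = used;
        return count++;
    }

    int size() {
        return count;
    }

    // Compares a stored string with text without allocating a String.
    boolean contentEquals(int index, CharSequence text) {
        Objects.checkIndex(index, count);
        int start = starts[index];
        int length = ends[index] - start;
        if (length != text.length()) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (chars[start + i] != text.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Materializes a String only when the caller really needs one.
    String get(int index) {
        Objects.checkIndex(index, count);
        return new String(chars, starts[index], ends[index] - starts[index]);
    }

    // Forgets every stored string; the arrays are kept and reused for the next batch.
    void reset() {
        used = 0;
        count = 0;
    }

    public static void main(String[] args) {
        CharBatchBuffer buffer = new CharBatchBuffer();
        int index = buffer.add("node");
        System.out.println(buffer.get(index));
        buffer.reset();
    }
}
```

Once the arrays have grown to fit the largest batch, add and reset allocate nothing, which is the property being claimed; get still creates a String, so a caller chasing zero garbage would stick to contentEquals and similar in-place operations.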

This is how my primitive XML DOM class works. It works fine; I just
don't want to maintain it any more.

Generational garbage collectors are nice, but they're not magic.

Arne Vajhøj

Sep 19, 2010, 7:26:11 PM
On 05-09-2010 23:29, BGB / cr88192 wrote:
> "Daniel Pitts"<newsgroup....@virtualinfinity.net> wrote in message

>> The real benefit of not writing your own wheel is this: You come to your
>> boss and say "You wanted something round: Here it is!" and present him
>> with an existing wheel. He then is grateful that this didn't turn into
>> one of those long, expensive, buggy, failed projects, and you are kept on.
>> Then, when your boss asks you to "integrate this wheel with an axle", you
>> look at the wheel you chose. It happens to have that integration built
>> in, because that is a common thing that needs to be done. You show your
>> boss how fortuitous your thinking was when you originally presented the
>> wheel, and you get a promotion.
>>
>> Also, you got all of that benefit WITHOUT the extra work of having to
>> prove how "clever" you were by reinventing the wheel.
>>
>> "Can we add brakes too?" "Already done sir!"
>
> there are reasons when and when not to reinvent.
>
> sometimes, there are cases where a small and specific piece of code to solve
> a problem is preferable to a large and complex 3rd party library, as
> dragging along a library can be more of a problem than simply having a piece
> of code to get the job done.

If the library is well documented and well supported, then does it
matter how big it is?

Arne

Arne Vajhøj

Sep 19, 2010, 8:20:16 PM
On 08-09-2010 12:22, David Lamb wrote:
> On 08/09/2010 2:13 AM, Peter Duniho wrote:
>> Arne Vajhøj wrote:
>>> On 07-09-2010 20:02, Peter Duniho wrote:
>>>> Arne Vajhøj wrote:
>>>>> Why are demo classes/methods often called foobar?
>>>>
>>>> "Foobar" derives from "fubar", an old (military) slang for "fucked up
>>>> beyond all recognition".
>>>
>>> Wikipedia just consider that a "may":
>>>
>>> http://en.wikipedia.org/wiki/Foobar
>>> http://en.wikipedia.org/wiki/FUBAR
>>>
>>> and given that the one is neutral and the other one is
>>> very negative, then it is not that obvious.
>
> Sure it is. The term arose decades ago when computers (both hardware and
> software) were even more unreliable than now.

So if a term was neutral back when computers were unreliable, then
it matches fine with a negative term?

Sorry - that is not obvious to me.

>> But it's clear enough that whether the initial military slang was
>> "fubar" or "foobar", the use of "foo" and "bar" in CS derives from the
>> earlier military usage.
>
> For what it's worth when considering etymology: When I was in the
> Computer Science Department at Carnegie-Mellon University in the
> mid-1970's, it was generally accepted without question that foobar came
> from FUBAR.

That just proves that the opinion is common and has been for many
years.

Arne

Arne Vajhøj

Sep 19, 2010, 8:22:40 PM

That is not clear from the Wikipedia articles.

Arne

Eric Sosman

unread,
Sep 19, 2010, 8:24:30 PM9/19/10
to

Yes, if the dependencies are unmanageable. You'd like to use
WizardXML, a nice, svelte, open-source, virtuous, pacifist library
with a low carbon footprint; fine. WizardXML in turn uses MagicBuzz
dispatching framework, which makes heavy use of OccultObject, which
itself inherits from IncantationInstantiator. Version 3, naturally,
since Version 4 introduced an API incompatibility in order to permit
the use of an advanced spell-spelling package (spelling a spell wrong
can get you in *enormous* trouble ...).

Unfortunately, you also need PsychicPsimulator. PsychicPsimulator
relies on Pseeress, which also uses IncantationInstantiator -- version
5, since there's just no hope of implementing the advanced capabilities
of Pseeress with the antiquated pre-V4 API ...

Componentized architectures have many advantages, but there's a
trade-off: They exacerbate the problems of "version skew" and make them
much nastier than they usually are in monolithic schemes. Which is not
to say that the monolith is somehow "superior," just to say that there
are trade-offs, and that TANSTAAFL.
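
For what it's worth, the clash above is easy to detect mechanically even if it is hard to fix. A rough sketch, assuming each component simply declares "I need library X at version V": the class VersionSkew and its methods are made up for illustration, the version numbers for everything except IncantationInstantiator (3 and 5, as above) are placeholders, and a library's own dependencies are treated as the same for every version, which real build tools do not assume.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: finds libraries that are required at more than one version. */
final class VersionSkew {

    /** One "needs this library at this version" edge in the dependency graph. */
    static final class Requirement {
        final String library;
        final int version;

        Requirement(String library, int version) {
            this.library = library;
            this.version = version;
        }
    }

    private final Map<String, List<Requirement>> requirements = new HashMap<>();

    /** Records that the given component needs the given library at the given version. */
    void declare(String component, String library, int version) {
        requirements.computeIfAbsent(component, k -> new ArrayList<>())
                    .add(new Requirement(library, version));
    }

    /** Walks everything reachable from the roots and returns the libraries needed at two or more versions. */
    Map<String, Set<Integer>> conflicts(String... roots) {
        Map<String, Set<Integer>> versionsSeen = new TreeMap<>();
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(Arrays.asList(roots));
        while (!pending.isEmpty()) {
            String component = pending.pop();
            if (!visited.add(component)) {
                continue; // already expanded this component
            }
            for (Requirement r : requirements.getOrDefault(component, Collections.emptyList())) {
                versionsSeen.computeIfAbsent(r.library, k -> new TreeSet<>()).add(r.version);
                pending.push(r.library);
            }
        }
        // Keep only the libraries that were requested at more than one version.
        versionsSeen.values().removeIf(versions -> versions.size() < 2);
        return versionsSeen;
    }

    public static void main(String[] args) {
        VersionSkew graph = new VersionSkew();
        graph.declare("WizardXML", "MagicBuzz", 1);               // placeholder version
        graph.declare("MagicBuzz", "OccultObject", 1);            // placeholder version
        graph.declare("OccultObject", "IncantationInstantiator", 3);
        graph.declare("PsychicPsimulator", "Pseeress", 1);        // placeholder version
        graph.declare("Pseeress", "IncantationInstantiator", 5);
        System.out.println(graph.conflicts("WizardXML", "PsychicPsimulator"));
    }
}
```

Run with both roots it reports {IncantationInstantiator=[3, 5]}; with WizardXML alone it reports nothing, which is the whole problem: each library looks fine until you need both.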

--
Eric Sosman
eso...@ieee-dot-org.invalid

Arne Vajhøj

unread,
Sep 19, 2010, 8:29:11 PM9/19/10
to

The question of whether such a coincidence is too incredible can
be somewhat verified by checking whether the English language
has other words that are very similar but do not have a
common origin.

See:
http://en.wikipedia.org/wiki/Homophone

Arne

Arne Vajhøj

unread,
Sep 19, 2010, 8:32:15 PM9/19/10
to

Well - if you have to get the right versions of everything, then
it can become a huge problem.

But if everything comes packaged, then it is usually just some
MB of disk space.

Arne

Arne Vajhøj

unread,
Sep 19, 2010, 8:45:36 PM9/19/10
to
On 06-09-2010 11:28, Arved Sandstrom wrote:
> David Lamb wrote:
>> On 05/09/2010 9:57 PM, Lew wrote:
>>> The mistake is when people code things to attempt to become
>>> irreplaceable.
>>
>> I once heard the aphorism, "if an employee is irreplaceable, fire him
>> -- and his manager."
>
> I've heard that. This is not to be confused with the oft-encountered
> situation where there are no in-company backstops or readily available
> external replacements for certain critical people; if someone is a superstar
> and is doing everything correctly you'd have to be an idiot to fire them,
> you just hope they don't get a better offer somewhere else.
>
> What we're talking about is where a person has deliberately engineered a
> situation where they cannot be replaced without damage to budget and
> schedules, and it was avoidable. So this doesn't include a very good
> developer using techniques (that someone at his experience level ought to
> have) to solve a problem that are outside the grasp of his less-skilled
> colleagues; in this case I'd hire better people, not fire the single
> excellent dude that you've got. But it does include, for example, a
> technically skilled programmer going outside the box and using a
> non-authorized programming language to solve a problem (even if that
> language was indisputably the best technical choice) when he knows damned
> well that only 3 guys in the entire metropolis know how to code in that
> language.

It is obviously good to have superstar programmers.

But the superstar techniques should only be used if needed.

If a dumb solution is good, then it is better than the
sophisticated solution doing the same.

There is an old joke about hello world programs that goes
something like:

beginner - Basic
intermediate - Pascal
good - C
advanced - C++
expert - C++ with MFC
guru - Basic

(with code snippets shown)

It is both funny and has a good point.

Unfortunately I am out of luck with Google, so I can
not find it online.

Arne

markspace

unread,
Sep 19, 2010, 9:44:23 PM9/19/10
to
On 9/19/2010 5:45 PM, Arne Vajhøj wrote:

> There is an old joke about hello world programs that goes
> something like:
>
> beginner - Basic
> intermediate - Pascal
> good - C
> advanced - C++
> expert - C++ with MFC
> guru - Basic
>
> (with code snippets shown)
>
> It is both funny and has a good point.


I've seen that one. I also wish I could find it again, although I think
I remember it using some esoteric Ada constructs instead of C++ and MFC.

It started with (Java equivalent, I've forgotten too much C):

Beginner Hello World:

public class Hello {
    public static void main( String[] args ) {
        System.out.println( "Hello World" );
    }
}

and got progressively more baroque as you progressed, until you hit
expert or so and then they got progressively simpler. I think there was
an "Experienced Programmer" and a "Very Experienced Programmer" after
Expert.

Very Experienced Programmer Hello World:

$ cc ~/misc/hw/hw.c; ~/misc/hw/a.out
Hello World

and then finally for the guru version:

Guru Hello World:

$ cat
Hello World^D
Hello World

and the joke (and point) was made.

>
> Unfortunately I am out of luck with Google, so I can
> not find it online.


Same, although if somebody can find a version of it, that would be great
and much appreciated.

Arne Vajhøj

unread,
Sep 19, 2010, 10:08:29 PM9/19/10
to
On 07-09-2010 13:27, Spud wrote:
> A clarification on what I meant by "not necessarily correct": this app
> needs to support nodes, attributes, and text, and nothing else. Many XML
> libraries add much complexity handling name spaces, comments, processing
> instructions, DTDs, and schemas. That's not necessary here. The
> overriding goal is to make it fast and simple for developers to
> manipulate the specific type of XML that we have.

If you were to write the code from scratch, then there would be
huge benefits by making it simple.

But when the advanced code already exists, then it is not your cost.
And feature rich code that is highly optimized because many
people are using it is often faster than the homebrewed code.
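
As an illustration only: a minimal sketch of the nodes, attributes and text case using nothing but the DOM parser that ships with the JDK. The class name PlainDomDemo and the sample XML are made up; the parser calls are the standard javax.xml.parsers and org.w3c.dom ones. Whether it is fast enough for millions of files is something you would have to measure.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical demo class: nodes, attributes and text with the JDK's own DOM parser. */
final class PlainDomDemo {

    /** Parses XML text into a DOM tree, leaving comments out of the tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);  // no namespace handling wanted
            factory.setIgnoringComments(true); // comments are not added to the tree
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<catalog><!-- skipped --><item id=\"1\">first</item><item id=\"2\">second</item></catalog>");
        Element root = doc.getDocumentElement();

        // Add a node with an attribute and some text.
        Element added = doc.createElement("item");
        added.setAttribute("id", "3");
        added.setTextContent("third");
        root.appendChild(added);

        // Search for nodes by name, then delete the first match.
        NodeList items = root.getElementsByTagName("item");
        root.removeChild(items.item(0));

        // Iterate over what is left.
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            System.out.println(item.getAttribute("id") + " = " + item.getTextContent());
        }
    }
}
```

None of the namespace, DTD or schema machinery gets in the way here; it is simply not used.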

> On strings: when you're doing (very) large scale data processing,
> strings are a no-no. The only answer is to have some kind of reusable
> buffer for storing text. Here's a discussion of the topic:
>
> http://lingpipe-blog.com/2010/06/22/the-unbearable-heaviness-jav-strings/

No no no.

That article just spends some time figuring out how much memory
the internal data structures that hold a string take up.

It does not claim that you can not use strings for large scale
data processing.

Instead you should read what
http://www.ibm.com/developerworks/java/library/j-jtp01274.html
has to say about object pooling.

Arne

John B. Matthews

unread,
Sep 19, 2010, 10:13:10 PM9/19/10
to
In article <i76e9p$vla$1...@news.eternal-september.org>,
markspace <nos...@nowhere.com> wrote:

> Same, although if somebody can find a version of it, that would be great
> and much appreciated.

<http://www.astahost.com/info.php/Joke-Evolution-Programmers_t14168.html>

--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>

Arne Vajhøj

unread,
Sep 19, 2010, 10:18:59 PM9/19/10
to
On 11-09-2010 20:00, Spud wrote:
> On 9/7/2010 3:26 PM, Daniel Pitts wrote:
>> On 9/7/2010 10:27 AM, Spud wrote:
>>> On strings: when you're doing (very) large scale data processing,
>>> strings are a no-no. The only answer is to have some kind of reusable
>>> buffer for storing text.
>> Define very large scale? Also, "strings" are not the same as "Strings",
>> so let's make sure you are specifically talking about "using String
>> instances is a no-no".
>>
>>> Here's a discussion of the topic:
>>>
>>> http://lingpipe-blog.com/2010/06/22/the-unbearable-heaviness-jav-strings/
>>>
>> All this says is that there is a 60 byte overhead per String instance.
>> Your original objection was that it had something to do with bogging
>> down the GC.
>>
>> While 60 byte overhead seems to be a lot, keep in mind that any object
>> holding string-like data will have the same overhead. Reusable objects
>> (char buffers) may interfere with the GC more than short lived objects,
>> depending on the GC implementation. I've heard this is the case for most
>> modern JVM GCs, which is why Object pools are not common in modern Java
>> programs.
>>
>> It sounds to me like you read in a few places about potential
>> inefficiencies with Java Strings, and have decided to avoid them before
>> finding actual, practical problems.
>>
> No. It's based on actual experience, the same experience the blogger had.

Actually if you bother to read the blog, then the blogger
did not have such experience; he just claimed without any evidence
that "It’s just too expensive to allocate objects in tight loops".

And serious research in the topic directly contradicts it.

Brian Goetz in
http://www.ibm.com/developerworks/java/library/j-jtp01274.html

Joshua Bloch, Effective Java, 1st Ed., Item 4, final remarks

> One large reusable buffer, containing many strings, is far more
> efficient than creating and destroying String objects millions of times
> over. Simply have a large array of char, and two parallel arrays of int
> that contain pointers to the start and end of each string. This doesn't
> work for every app, but it does for this one. Once you've filled the
> buffer, processed your batch of strings, and no longer need them, just
> reset the pointers and you're done. Zero GC required.
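
For reference, the scheme described in the quoted text would look roughly like the sketch below: one char array, two parallel int arrays with the start and end of each string, and a reset that just rewinds the counters. The class name CharArena and all its methods are made up for illustration. Whether this actually beats plain String allocation on a modern JVM is exactly the point in dispute, and the sketch proves nothing either way without measurements.

```java
import java.util.Arrays;

/** Hypothetical sketch of the "one big reusable buffer" idea from the quoted text. */
final class CharArena {
    private char[] chars;  // all string data, back to back
    private int[] starts;  // starts[i] = offset of the first char of string i
    private int[] ends;    // ends[i]   = offset just past the last char of string i
    private int used;      // number of chars currently in use
    private int count;     // number of strings currently stored

    CharArena(int charCapacity, int stringCapacity) {
        chars = new char[Math.max(1, charCapacity)];
        starts = new int[Math.max(1, stringCapacity)];
        ends = new int[Math.max(1, stringCapacity)];
    }

    /** Copies the text into the buffer and returns the index it was stored under. */
    int add(CharSequence text) {
        int len = text.length();
        if (used + len > chars.length) {
            chars = Arrays.copyOf(chars, Math.max(chars.length * 2, used + len));
        }
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            ends = Arrays.copyOf(ends, count * 2);
        }
        for (int i = 0; i < len; i++) {
            chars[used + i] = text.charAt(i);
        }
        starts[count] = used;
        used += len;
        ends[count] = used;
        return count++;
    }

    int size() {
        return count;
    }

    int length(int index) {
        return ends[checked(index)] - starts[index];
    }

    /** Compares a stored string with other text without creating a String. */
    boolean contentEquals(int index, CharSequence other) {
        int start = starts[checked(index)];
        int len = ends[index] - start;
        if (other.length() != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (chars[start + i] != other.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Materialises a real String; this allocates, so call it only when needed. */
    String asString(int index) {
        int start = starts[checked(index)];
        return new String(chars, start, ends[index] - start);
    }

    /** Forgets every stored string; the arrays are kept and reused for the next batch. */
    void reset() {
        used = 0;
        count = 0;
    }

    private int checked(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("no string at index " + index);
        }
        return index;
    }
}
```

Note that any API which needs a real String (map keys, most library calls) forces asString() and brings the allocation back.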

You can't read a blog text correctly and you want us to believe
you can measure performance?

Arne

Arne Vajhøj

unread,
Sep 19, 2010, 10:20:12 PM9/19/10
to
On 19-09-2010 22:13, John B. Matthews wrote:
> In article<i76e9p$vla$1...@news.eternal-september.org>,
> markspace<nos...@nowhere.com> wrote:
>> Same, although if somebody can find a version of it, that would be great
>> and much appreciated.
>
> <http://www.astahost.com/info.php/Joke-Evolution-Programmers_t14168.html>

Yep.

That is it.

Arne

Tom Anderson

unread,
Sep 20, 2010, 11:13:44 AM9/20/10
to
On Sun, 19 Sep 2010, markspace wrote:

Surely:

echo "Hello World"

?

And i bet there's a perl version that goes:

perl -MCPAN -e 'install HelloWorld'

tom

--
Hawaii may be many things, but it is not the sort of place you go to
for smart, sexy, geeky women. It's more the place for luau excess,
non-literate arts, and maritime athleticism. -- applez

markspace

unread,
Sep 20, 2010, 12:58:19 PM9/20/10
to
On 9/19/2010 7:13 PM, John B. Matthews wrote:
> In article<i76e9p$vla$1...@news.eternal-september.org>,
> markspace<nos...@nowhere.com> wrote:
>
>> Same, although if somebody can find a version of it, that would be great
>> and much appreciated.
>
> <http://www.astahost.com/info.php/Joke-Evolution-Programmers_t14168.html>
>


Yup! That looks like it, or a recent version of it anyway. Thanks John!


markspace

unread,
Sep 20, 2010, 1:00:41 PM9/20/10
to
On 9/20/2010 8:13 AM, Tom Anderson wrote:

> On Sun, 19 Sep 2010, markspace wrote:
>> $ cat
>> Hello World^D
>> Hello World


> Surely:
>
> echo "Hello World"


*shrug* I recall the joke as using "cat". My version takes 3 fewer
keystrokes, I guess. My memory could be wrong of course.

David Lamb

unread,
Sep 23, 2010, 12:44:38 PM9/23/10
to

I know; thus "for what it's worth" with "worth" likely depending on the
pre-existing opinion of the reader.
