http://groups.google.com/group/lock-free/topics
Everybody is welcome to post/comment some thoughts/ideas/algos to this
new group.
Dmitriy V'jukov
> I think that interested people can move away from
> comp.programming.threads with detailed questions about lock-free tech.
> So I create dedicated group 'lock-free':
That would be a good thing if in cpt the traffic would be very high, but
it isn't. Lock-free programming relates to threading as much as locking
does. And not every one who read news want to create a google account.
regards
Torsten
> That would be a good thing if in cpt the traffic would be very high, but
> it isn't. Lock-free programming relates to threading as much as locking
> does.
But it is not prohibited to subdivide comp.programming.threads
> And not every one who read news want to create a google account.
I permit posting to everyone, not only members. So now there is no
need to have google account.
Dmitriy V'jukov
>>And not every one who read news want to create a google account.
>
>
> I permit posting to everyone, not only members. So now there is no
> need to have google account.
You don't understand. usenet != google groups. For the usenet, lots
of specialized applications exist that support posting and reading in
various ways. For google groups, only google is available.
So long,
Thomas
> You don't understand. usenet != google groups. For the usenet, lots
> of specialized applications exist that support posting and reading in
> various ways. For google groups, only google is available.
>
> So long,
> Thomas
google groups have enough comfortable interface... so I think it is
not a problem...
btw you can work with google group with your favorite mail client
Dmitriy V'jukov
> On May 31, 2:55 pm, Thomas Richter <t...@math.TU-Berlin.DE> wrote:
>
>> You don't understand. usenet != google groups. For the usenet, lots
>> of specialized applications exist that support posting and reading in
>> various ways. For google groups, only google is available.
>>
>> So long,
>> Thomas
>
> google groups have enough comfortable interface... so I think it is
> not a problem...
Actually, the google interface is horrible, and has all but destroyed
what was good about Usenet, and flooded it with spam and worse.
> btw you can work with google group with your favorite mail client
Why would I want to use a mail program to read news?
--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw
Please, create dedicated usenet group for lock-free subjects. I just
don't know how to do it.
Dmitriy V'jukov
Usenet explicitly prohibits groups with low traffic?
Point for new group you can see here:
http://groups.google.ru/group/comp.programming.threads/tree/browse_frm/thread/60177465eed45c5b/f83460262d48e916?rnum=1&_done=%2Fgroup%2Fcomp.programming.threads%2Fbrowse_frm%2Fthread%2F60177465eed45c5b%2Fd349afa89c24c006%3F#doc_af4ca37ac8368f32
Dmitriy V'jukov
Not realy, but why create more groups then nessary?
> Point for new group you can see here:
> http://groups.google.ru/group/comp.programming.threads/tree/browse_frm/thread/60177465eed45c5b/f83460262d48e916?rnum=1&_done=%2Fgroup%2Fcomp.programming.threads%2Fbrowse_frm%2Fthread%2F60177465eed45c5b%2Fd349afa89c24c006%3F#doc_af4ca37ac8368f32
I don't get your point. Someone asks an othere one to stop posting code
undocumented code in this newsgroup and you think a new group would
solve this? When a new group, why not comp.programming.threads.code?
regards
Torsten
> I don't get your point. Someone asks an othere one to stop posting code
> undocumented code in this newsgroup and you think a new group would
> solve this? When a new group, why not comp.programming.threads.code?
Because I don't know how to create usenet group.
Dmitriy V'jukov
>
> Point for new group you can see here:
>
So you want a group where you can read undocumented and buggy code and, at the
same time, keep out people who actually point out that the code _has_ problems?
Well, good luck.
I've been lurking for long on this newsgroup, and I have frequently seen
threads (no pun intended) where Chris posts some code and then follows up
to his own posts where he corrects bugs. Sorry, but in my eyes, such approach
raises strong doubts about both the author and the code reliability. Not
to mention the inavoidable question raised: how many bugs are remaining in
the code?
So, for production use, I consider Chris's code next to useless (due to serious
reliability concerns). For _learning_, it's much better to read research
articles than reverse-engineer someone's ASM code.
As for usable code, that too can be found out there. Here's for example a
new scalable memory allocator:
> > Point for new group you can see here:
>
> So you want a group where you can read undocumented and buggy code and, at the
> same time, keep out people who actually point out that the code _has_ problems?
> Well, good luck.
>
> I've been lurking for long on this newsgroup, and I have frequently seen
> threads (no pun intended) where Chris posts some code and then follows up
> to his own posts where he corrects bugs. Sorry, but in my eyes, such approach
> raises strong doubts about both the author and the code reliability. Not
> to mention the inavoidable question raised: how many bugs are remaining in
> the code?
You don't understand at all. This is not the code that you can copy-
paste to your project. If you want reliable code and if you don't want
to go into details of how it works, you better just buy some
commercial library (btw Chris can sell one to you :) ).
> So, for production use, I consider Chris's code next to useless (due to serious
> reliability concerns). For _learning_, it's much better to read research
> articles than reverse-engineer someone's ASM code.
Most of the time there are sh#tty stuff. There are really very few
articles with worthwhile ideas.
> As for usable code, that too can be found out there. Here's for example a
> new scalable memory allocator:
>
> http://people.cs.vt.edu/~scschnei/streamflow/
"As a part of our evaluation of Streamflow, we implemented Maged
Michael's lock-free allocator..."
lol
Chris' "buggy" code blow this "usable" code out of the water
completely!
If I remember right, Maged Michael's allocator has 3 CAS per
operation, when Chris' allocator has 0 CAS per operation.
Dmitriy V'jukov
> > As for usable code, that too can be found out there. Here's for example a
> > new scalable memory allocator:
>
> >http://people.cs.vt.edu/~scschnei/streamflow/
>
> "As a part of our evaluation of Streamflow, we implemented Maged
> Michael's lock-free allocator..."
>
> lol
> Chris' "buggy" code blow this "usable" code out of the water
> completely!
> If I remember right, Maged Michael's allocator has 3 CAS per
> operation, when Chris' allocator has 0 CAS per operation.
It seems that I am wrong.
Chris, it's better you to see the link. It seems that they steal your
idea! Paper dated by 2006. So when you make patent application?
Dmitriy V'jukov
>
> Most of the time there are sh#tty stuff. There are really very few
> articles with worthwhile ideas.
>
_Any_ idea is worthwhile until a better one comes along. So what is exactly
your point?
>
> lol
> Chris' "buggy" code blow this "usable" code out of the water
> completely!
>
Well, correct and slow is better than fast and buggy. Your opinion
might differ.
> > You don't understand at all. This is not the code that you can copy-
> > paste to your project. If you want reliable code and if you don't want
> > to go into details of how it works, you better just buy some
> > commercial library (btw Chris can sell one to you :) ).
>
> I don't want a commercial library, and I'm able to write the components
> I need myself.
So what is the problem? Get Chris' *idea* and write the component! It
would be faster and more scalable then most you can find out there. I
assume that you write unbuggy code...
> > Most of the time there are sh#tty stuff. There are really very few
> > articles with worthwhile ideas.
>
> _Any_ idea is worthwhile until a better one comes along. So what is exactly
> your point?
Better ones already here. Just see some posts in this group... But if
you can see only on unbuggy code... so you miss better ones...
> Well, correct and slow is better than fast and buggy.
Please point to some bugs.
Dmitriy V'jukov
http://www.cs.uu.nl/wais/html/na-dir/usenet/creating-newsgroups/part1.html
Why don't you Google my record: You just might be able to find out that I at
least make an effort to document the appropriate corrections!
Agreed?
--
Ian Collins.
Humm. It seems like they WERE using the following algorithm:
http://www.research.ibm.com/people/m/michael/pldi-2004.pdf
Which uses an interlocked op every time you call malloc or free.
> It seems that they steal your idea! Paper dated by 2006.
http://people.cs.vt.edu/~scschnei/papers/ismm06.pdf
God damn! It sure as hell looks like they ripped me off big time! Those
fucking punk bastards!!!
I developed my algorithm back in early 2004, and post it here in Oct. 2005:
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/8245f4b48591fc69
> So when you make patent application?
I would rather not say but some e-mails or on the way to every single author
of that paper! Perhaps you should e-mail them as well.
This kind of crap happened to Joe Seighs atomic_ptr... Sun Microsystems has
a patent application out there that describes Joe's algorithm in great
detail...
I am pissed off!
I don't think they swiped it. They just independently reinvented it
because they assumed nothing of consequence ever gets posted on usenet.
It's their problem and maybe a problem for anyone who ends up licensing
the patent if it's granted. The patent application does have nice
illustrations. I post ascii diagrams once in a while but there's no
comparison.
> I am pissed off!
Look at it as independent confirmation of your idea. :)
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software.
Zelijko, that allocator design is NOTHING NEW. Sorry to burst your bubble.
But, I can prove this with some of my what you call buggy, and crappy,
prior-art posted on this very group. Funny how you ignore my invention, and
go ahead and give a plug to somebody who might be a rip off artist. Their
claims that the allocator is new are COMPLETELY false. If I were you
Zelijko, I would ask them about this.
You ignore the inventor of the algorithm, and try to help somebody who might
be in the practice of infringing on other peoples established prior-art.
Are you referring to Sun and atomic_ptr, or this streamflow thing which is
most likely a rip off of my allocator?
[...]
>> I am pissed off!
>
> Look at it as independent confirmation of your idea. :)
LOL! ;^)
What do you think I should do about this Joe? IMHO, my per-thread allocator
technique is just too big of a issue to let this one just skate by... The
algorithm is just too darn good and clean to give up without a fight. I
think I should aggressively follow up on this... But, damn...
Son-of-a-bitch! I really don't want to dedicate actual time and resources
for lawyer fees on crap like this. I want them to post a correction in the
paper. They can't say the abstract idea is anything new at all. They have to
reference my work on the original invention. They got grants to do the
paper... Read the Acknowledgements in the paper...
That's crap!
:^0
Oops. You were talking about Sun. I don't think the authors of the allocator
paper have filed patents. Or at least they don't seem to mention anything
about it.
[...]
Here are the websites for the authors of the questionable paper:
http://people.cs.vt.edu/~scschnei/
(the phd sudent)
http://www.inf.uth.gr/~cda/
(the faculty)
http://people.cs.vt.edu/~dsn/
(the professor)
Their methodologies seems to be a carbon copy of my 2004 invention which I
published on Oct 2 2005 in this group.
Damn.
Now, a question: the advantage of Maged's allocator is that it uses
*per-processor* heaps, where your algorithm and Streamflow use *per-thread*.
Any other ("better") algorithms that use only per-processor heaps?
>
> But, I can prove this with some of my what you call buggy, and crappy,
> prior-art posted on this very group. Funny how you ignore my invention, and
> go ahead and give a plug to somebody who might be a rip off artist. Their
>
That's because you failed to present your invention in a usable and easily
comprehensible form. And I don't really care who was "the first".
>
> You ignore the inventor of the algorithm, and try to help somebody who might
> be in the practice of infringing on other peoples established prior-art.
>
Of course I ignore the inventor. The _invention_ is what I care about, not
the inventor or the recency of the invention. [If it is any comfort to you,
I never, ever, remember names of book authors, names of actors in movies,
names of bands, etc. I remember only the particular pieces that I liked -
titles of books, movies or songs.] As for helping them - I'm not doing that.
I gave their work as an example of the kind of work that i appreciate:
properly documented and benchmarked code.
We might argue about the meaning of "properly", but they have at least shown
*some* benchmarks, given *usable* source code package, and also *documented*
(in human-readable text) their algorithm in detail. Your work, compared to
theirs, has only maybe-usable source code. Is it so hard to comprehend why I
prefer their work?
>
> We might argue about the meaning of "properly", but they have at least shown
> *some* benchmarks, given *usable* source code package, and also *documented*
> (in human-readable text) their algorithm in detail. Your work, compared to
> theirs, has only maybe-usable source code. Is it so hard to comprehend why I
> prefer their work?
>
And please consider this as a constructive criticism. Your ideas might indeed
have some merit or even be better than the (academic) "state of the art", but
unless you present them in nicer form, they will probably be left overseen.
I have been saying that on this group since 2005.
> Now, a question: the advantage of Maged's allocator is that it uses
> *per-processor* heaps, where your algorithm and Streamflow use
> *per-thread*.
> Any other ("better") algorithms that use only per-processor heaps?
Not that I know of. Basically, Per-CPU can't really compete with per-thread
because of the synchronization overhead involved. Random threads could be
scheduled to run on the same core, so the per-cpu sync is necessary. In some
situations you can get rid of the LOCK prefix on x86 arch if you use clever
per-cpu. For instance, you could bind a group of threads to the same CPU,
therefore you can use interlocked rmw on x86 without the LOCK prefix.
However, even with this type of optimization, the per-thread stuff still
beats it by a fairly WIDE margin because its not making frequent use of
interlocked rmw or memory barriers. This is sort of unfair the various
per-cpu synchronization schemes.
>> But, I can prove this with some of my what you call buggy, and crappy,
>> prior-art posted on this very group. Funny how you ignore my invention,
>> and
>> go ahead and give a plug to somebody who might be a rip off artist. Their
>>
> That's because you failed to present your invention in a usable and easily
> comprehensible form. And I don't really care who was "the first".
Damn. Well, will you at least look at my website when I announce it in this
group in the next week or two?
>> You ignore the inventor of the algorithm, and try to help somebody who
>> might
>> be in the practice of infringing on other peoples established prior-art.
>>
> Of course I ignore the inventor.
IMHO, that's not a good practice.
> The _invention_ is what I care about, not
> the inventor or the recency of the invention.
IMHO, you must give the inventor of the things you care about some credit.
No?
> [If it is any comfort to you,
> I never, ever, remember names of book authors, names of actors in movies,
> names of bands, etc.
Perhaps, you should remember the names of some authors. It could turn out to
be helpful sometime down the road...
> I remember only the particular pieces that I liked -
> titles of books, movies or songs.] As for helping them - I'm not doing
> that.
> I gave their work as an example of the kind of work that i appreciate:
> properly documented and benchmarked code.
Fine.
> We might argue about the meaning of "properly", but they have at least
> shown
> *some* benchmarks, given *usable* source code package, and also
> *documented*
> (in human-readable text) their algorithm in detail. Your work, compared
> to
> theirs, has only maybe-usable source code. Is it so hard to comprehend
> why I
> prefer their work?
Okay. I see your point. I hope the Website, and the diagrams and samples of
user-level source code will change your mind...
So be it.
[...]
> And please consider this as a constructive criticism. Your ideas might
> indeed
> have some merit or even be better than the (academic) "state of the art",
> but
> unless you present them in nicer form, they will probably be left
> overseen.
I will be counting on the content of my upcoming website to be able to help
change your mind about my work. Hopefully!
;^)
[...]
> For instance, you could bind a group of threads to the same CPU, therefore
> you can use interlocked rmw on x86 without the LOCK prefix.
[...]
Only the threads in the group can skip the LOCK prefix wrt interlocked ops
on shared locations that are bound to the group. Of course this would be the
per-cpu data. So, the actually data-structures needed for this could be as
simple as this:
<warning, buggy pseudo-code of data-structures!>
// declare per-thread / cpu structures
typedef struct per_cpu_s per_cup_t;
typedef struct per_thread_s per_thread_t;
typedef struct per_thread_group_s per_thread_group_t;
// per-thread represents a single thread
struct per_thread_s {
per_thread_t *next; // the next thread
per_thread_group_t *group; // the thread group we belong to
[...]; // fill-in-the-blank
};
// per-thread group represents a list of threads
struct per_thread_group_s {
per_thread_group_t *next; // the next group
per_thread_t *threads; // the threads in the group
per_cpu_t *cpu; // set affinity for threads to this cpu
[...]; // fill-in-the-blank
};
// rper-cpu represents a list of thread groups
struct per_cpu_s {
per_thread_group_t *groups; // this cpu's thread groups
[...]; // fill-in-the-blank
};
// You bind the affinity of each thread in a group to the cpu structure the
group belongs to.
// Threads within a group can communicate with each other by using
interlocked rmw operations without the LOCK prefix
// Threads that wish to communicate with another threads that belongs to
another group on another cpu, well it must use interlocked with the LOCK
prefix
>
> Okay. I see your point. I hope the Website, and the diagrams and samples of
> user-level source code will change your mind...
>
I will for sure look at the new web-site :)
I am going to include links to as many posts which include pseudo-code that
I have posted as I can in a dedicated USENET Prior-Art sub-section under the
References main-section of a website I am working on; I am going to post
notice of it here. That would at least get all the links in one place... I
guess that would be some sort of a start in clearing up the mess of tangled
pseudo-code I have created!
:^0
So it would seem... Strange why he would choose that particular picture to
show off... Interesting indeed.
> PhD students are inclined to read usenet :)
I hope they don't read USENET after a "long" nights of heated poker games
and, perhaps, the possible consumption of a "questionable" amount of beer...
Just kidding! ;^)
1. cmpxchg on a single cpu
2. lock cmpxchg on a single cpu (no contention)
3. lock cmpxchg on 2,3,4.. cpus (heavy contention)
I guess the metric I'm interested in is the # of operations / second. I'm
particularly interested in the difference between 1 and 2.
> Look at it as independent confirmation of your idea. :)
For Chris it would be better if they confirm his idea by buying his
library... :)
Btw if they just publish the algorithm and not patent it, afaik the
algorithm now in public domain, so nobody else can't patent it, and
everybody is free to use it. At least Chris still can use it in his
library...
Dmitriy V'jukov
> Btw if they just publish the algorithm and not patent it, afaik the
> algorithm now in public domain, so nobody else can't patent it, and
> everybody is free to use it. At least Chris still can use it in his
> library...
Btw are posts in usenet, posts in blogs etc considered as
"Publications" by US patent law? I mean "Publications" which can
prevent patenting. Or they just close eyes on this, and say that there
are no usenet/internet? And consider only publications in journals?
Dmitriy V'jukov
--
Ian Collins.
> I don't think the authors of the allocator
> paper have filed patents. Or at least they don't seem to mention anything
> about it.
Michael, Maged M publications about SMR don't mention anything about
it too. And publications about ROP too...
At least search in patent applications doesn't get any result on it...
Dmitriy V'jukov
> I want them to post a correction in the
> paper. They can't say the abstract idea is anything new at all. They have to
> reference my work on the original invention.
Am I understand right that this give nothing to you in the sense of
patenting? You can't patent it anyway since there are publication in
public domain on it?
Dmitriy V'jukov
:^)
Are you searching on the inventor's names? Try
in/Michael-Maged-M
in/McKenney-Paul-e
in/Moir-Mark-S
in/Herlihy-Maurice-P
Most people won't mention patent applications publically because they
get a lot of heat on that. But researchers especially are required
by their employers to file invention disclosures and are likely
performance reviewed on number of patents and other revenue generating
things they do. Ditto for universities since they now have and license
patents as a source of revenue. Profs probably have to sign IP aggreements.
Grad students probably get screwed on this.
Any new non trvial algorithm that gets a paper published likely has
a patent application in the works. It might take a while to find
it as the patent applications don't get published for at least a year
or more.
According to the IP lawyers I've spoken to, yes. Actually, anything
published on the internet. Google "archives" the former. The
latter, you'd probably have to prove some members of the public saw
it around the time in question.
Doesn't stop anyone from patenting existing ideas. Happens all the
time.
And though in theory publishing something protects your right to
use your own ideas, it really depends on whether their lawyers
are bigger than your lawyers.
[...]
Well, so far things are looking good in the initial round of e-mails I had
with the authors of the paper. The professor, who turns out to be a nice
guy, is going to inform the student of my claims. I just don't want to get a
lawyer involved yet. Luckily for me, so far, it appears like I might not
have to do that... The professor understands a great deal of the technical
aspects and from our initial correspondence I think its going to be easy to
talk to him in order to resolve the issue at hand. He seems to be agreeing
to consider revising the paper and referencing my works.
Humm... I can't remember if you sent the authors of the Sun patent any
emails. Did you? If so, care to mention a little something about what was
said Joe?
No, no. I mean patent applications for the memory allocator.
Dmitriy V'jukov
The professor has asked me to please refrain from using any derogatory
comments wrt his well respected and hard working phd students on this group.
I pledge to do just that, and have apologized for any juvenile comments I
made. I wrote that stuff when I just found out about this issue, and I was
under the influence of anger. I need to work on that...
Anyway, it looks like we are going to be able to resolve the issue rather
quickly.
I promise to keep this group informed of any progress.
Thank you all.
The Professor (Dimitris Nikolopoulos) has assures me that they have no plans
to patent their algorithm. So the answer to your question is no, there is no
patent application...
> Humm... I can't remember if you sent the authors of the Sun patent any
> emails. Did you? If so, care to mention a little something about what
> was said Joe?
They said they'd refer it to their lawyers since at that point it's
not really something they're allowed to discuss directly with outsiders.
I don't know what the status on the application is but I haven't seen
any papers written on it which is what you'd expect from a research
group that made a breakthough (they've been working on that problem
a long time).
Did you take a look in Google?
Yuck, there's gotta be a cleaner URL. Oh well, that one was sufficient.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.5" / 37N 20' 15.3"
Internet: steve @ Watt.COM Whois: SW32-ARIN
Free time? There's no such thing. It just comes in varying prices...
Just delete everything after the second forward slash starting at
'.../thread/'
So, the long links you posted could be trimmed to:
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/7c2556f454cbd268
Why would you imagine that they even knew that you'd had the same
idea? All of this stuff is completely obvious. (That is, the solutions
are obvious to anyone who arrives at a need for them.)
--
David Hopwood <david....@industrial-designers.co.uk>
Really? The scheduler can only switch between "random threads" on
instruction boundaries. So you can achieve the effect of expensive
interlocked instructions using non-interlocked instructions.
(This is even if we don't use pinning of threads to CPUs combined
with a user-level load balancer.)
> In some situations you can get rid of the LOCK prefix on
> x86 arch if you use clever per-cpu.
This is not dependent on x86, and there's no particular need for
cleverness.
--
David Hopwood <david....@industrial-designers.co.uk>
In article <bqFai.119232$4a.8...@fe3.news.blueyonder.co.uk>,
David Hopwood <david....@industrial-designers.co.uk> writes:
> Chris Thomasson wrote:
> > [ ... ] Random
> > threads could be scheduled to run on the same core, so the per-cpu sync
> > is necessary.
>
> Really? The scheduler can only switch between "random threads" on
> instruction boundaries. So you can achieve the effect of expensive
> interlocked instructions using non-interlocked instructions.
This claim as stated is false. A scheduler can, in general, switch
between "random threads" anytime that it gains control. So, the
issue comes down to whether an instruction can be interrupted in
mid-execution. And the answer is yes:
Probably the best known example is the Motorola 680x0 family.
Upon encountering a page fault or protection exception, this type
of processor suspends itself mid-instruction and traps to the OS.
Typically, the OS initiates any I/O needed to resolve the fault
and schedules a different thread on the processor. The information
needed to continue the processor is stored on a privileged stack
and the OS must take care to restore this state data prior to
resuming the thread later.
Other, mostly historic, processors have been capable of taking
even interrupts in the middle of executing an instruction. Typically
these were instructions that handled long strings in a single
instruction.
Even the Intel x86 family has non-atomic instructions: A fault
or interrupt can interrupt an instruction with the REP prefix at
certain points before the instruction completes. In this situation,
the ISP will remain at the "current" instruction (the REP), but
various processor registers (in particular, (E)SI and (E)DI) will
have been modified.
> (This is even if we don't use pinning of threads to CPUs combined
> with a user-level load balancer.)
>
> > In some situations you can get rid of the LOCK prefix on
> > x86 arch if you use clever per-cpu.
>
> This is not dependent on x86, and there's no particular need for
> cleverness.
This is the claim that I am particularly concerned about.
First, some readers may interpret this claim to mean that the C
language construct "++var" is atomic, which is almost never is.
Second, some readers may interpret this statement to mean that
some instructions, such as the x86 ADD instruction, operate
atomically on memory, which they do not on multi-CPU systems unless
you properly incorporate the LOCK prefix.
Granted, a programmer can use non-atomic instructions to achieve
coherent concurrency, but I claim that such use constitutes
"cleverness." Even use of common mechanisms as Peterman's and
Lamport's algorithms requires assurance of "proper" compiler
generated code and the existence of coherent caches.
In summary, while your claims are likely to be true in most currently
popular desktop and server systems, they are NOT universally true --
even in POSIX and other less-popular systems.
> --
> David Hopwood <david....@industrial-designers.co.uk>
- dmw
--
. Douglas Wells . Connection Technologies .
. Internet: -sp9804- -at - contek.com- .
I'm under the impression that on 680x0 processors, this is only
visible in the case of floating-point instructions. If that is
correct, then it doesn't interfere significantly with the use of
this technique.
> Other, mostly historic, processors have been capable of taking
> even interrupts in the middle of executing an instruction. Typically
> these were instructions that handled long strings in a single
> instruction.
Then this technique cannot be used *for those instructions*. They
are usually instructions that could not have been interlocked anyway.
> Even the Intel x86 family has non-atomic instructions: A fault
> or interrupt can interrupt an instruction with the REP prefix at
> certain points before the instruction completes.
An instruction with the REP prefix cannot be interlocked (on 80286+,
combining the REP and LOCK prefixes causes an invalid opcode exception).
I said only that interlocked instructions can be replaced with
non-interlocked ones.
[...]
>>This is not dependent on x86, and there's no particular need for
>>cleverness.
>
> This is the claim that I am particularly concerned about.
>
> First, some readers may interpret this claim to mean that the C
> language construct "++var" is atomic, which is almost never is.
"++var" is obviously not an instruction. Anyone who thinks that it
is (or that it necessarily compiles to a single instruction) is
indeed likely to misinterpret what I said, but they are likely to
have bigger problems than that.
> Second, some readers may interpret this statement to mean that
> some instructions, such as the x86 ADD instruction, operate
> atomically on memory, [...]
Only if they are not paying attention to the context.
--
David Hopwood <david....@industrial-designers.co.uk>
No. This could occur on any 680x0 instruction that referenced
more than one page. Examples include instructions that spanned a
page boundary and instructions where a memory operand was in
different page than the instruction (as almost always happened for
references to shared variables). I tried to do a quick web search
but didn't find anything explicit other than processor manuals.
However, I can attest to this on personal experience of having
implemented several SMP OSs, at least for x=1,2,3.
> > Other, mostly historic, processors have been capable of taking
> > even interrupts in the middle of executing an instruction. Typically
> > these were instructions that handled long strings in a single
> > instruction.
>
> Then this technique cannot be used *for those instructions*. They
> are usually instructions that could not have been interlocked anyway.
Agreed. As I acknowledged in my original article, I was mostly
concerned that the literal discussion had become so detailed that
the larger context was being omitted. I attempted to search the
immediately preceding articles and could find no references that
the context was other than the literal stated claim that ""The
scheduler can only switch between 'random threads' on instruction
boundaries."
> > Even the Intel x86 family has non-atomic instructions: A fault
> > or interrupt can interrupt an instruction with the REP prefix at
> > certain points before the instruction completes.
>
> An instruction with the REP prefix cannot be interlocked (on 80286+,
> combining the REP and LOCK prefixes causes an invalid opcode exception).
> I said only that interlocked instructions can be replaced with
> non-interlocked ones.
Ah, it was this context that I couldn't find in the discussion. (That
search was hindered somewhat by the fact that the article history
was so long and some news readers/systems had failed to follow
best practices in maintaining the references header line.) Thanks
for making it explicit.
> [...]
> >>This is not dependent on x86, and there's no particular need for
> >>cleverness.
> >
> > This is the claim that I am particularly concerned about.
> >
> > First, some readers may interpret this claim to mean that the C
> > language construct "++var" is atomic, which is almost never is.
>
> "++var" is obviously not an instruction. Anyone who thinks that it
> is (or that it necessarily compiles to a single instruction) is
> indeed likely to misinterpret what I said, but they are likely to
> have bigger problems than that.
>
> > Second, some readers may interpret this statement to mean that
> > some instructions, such as the x86 ADD instruction, operate
> > atomically on memory, [...]
>
> Only if they are not paying attention to the context.
Agreed on both points, but my intent was to correct the impression
for the casual reader.
The patent application is made public and they can talk about how it
basically claims prior-art in detail?
:^0
> I don't know what the status on the application is but I haven't seen
> any papers written on it which is what you'd expect from a research
> group that made a breakthough (they've been working on that problem
> a long time).
Working hard on researching USENET an ripping off algorithms.
The lock-free mechanism for the vZOOM and StreamFlow allocators were not
that obvious because. The vZOOM allocator framework is based on a unique
lock-free mechnisim I invented back in December of 2004, and posted some
pseudo-code on Oct. 3 2005. According to the Professor, they created their
algorithm (StreamFlow) in the Summer of 2005, and published it in the
referenced paper in 2006. I can prove that I developed the algorithm
in-house in late 2004.
__________________
The algorithms are rather recent. So, the actual lock-free method was not
that obvious. If it was, it would have been developed and patented in the
last 20 years or something...
__________________
> (That is, the solutions
> are obvious to anyone who arrives at a need for them.)
Highly scaleable memory allocators are in need indeed, agreed?
;^)
Is suppose to read:
The patent application is made public and they "can't" talk about how it
The opposite I think. If they paid attention to usenet they'd
realize there was orginal research going on there and could have
save themselves some time.
For the time being though it's safe for them to continue research
as I'm taking a break from inventing new lock-free algorithms.
For one thing, it's too easy. The novelty has definitely worn
off. So no more public domain stuff from me.
Indeed. I think the review board of the paper referenced in this thread:
http://people.cs.vt.edu/~scschnei/streamflow/
FAILED to examine USENET...
> For the time being though it's safe for them to continue research
> as I'm taking a break from inventing new lock-free algorithms.
> For one thing, it's too easy. The novelty has definitely worn
> off. So no more public domain stuff from me.
I have a strong feeling of agreement.... Humm...
Indeed. I think the review board of the paper referenced in this thread:
http://people.cs.vt.edu/~scschnei/streamflow/
FAILED to examine USENET...
[...]
____________
Here is the exact paper:
http://people.cs.vt.edu/~scschnei/papers/ismm06.pdf
The "academic world" needs to learn how to search through USENET for a
couple of hours before inventing anything "new"...
:^0