How can " dup drop " be anything other than a nop

Bob Armstrong

Sep 30, 2015, 10:20:46 PM
Here's a question for those who really grok x86 . I'm working to implement my abstraction of a life spent in APL , more recently K , into Forth and chose Ron Aaron's Reva Forth because it was practical and built directly in a machine language , the WinTel in my notebook and on my desk .

I'm definitely working at my Peter Principle limit .

Here are my relevant notes from yesterday :

| ======================== | Mon.Sep,20150928 | ======================== |
| ... weird issue w simply nested fns remains

text> "lf tokcut rho | |>| 1432 | works
: tst "lf tokcut ; text> tst rho | works not
: tst "lf dup drop tokcut ; text> tst rho | works
: tst dup drop "lf tokcut ; text> tst rho | works not

Without even knowing anything else , how can a " dup drop " have any effect ? How for that matter can a call and a return have any effect ?

Here are some details . I've uploaded a copy of 4th.CoSy as of the 29th to http://cosy.com/CoSy/4th.CoSy.html , 4thCoSy1509.zip , for anyone who wants to try the whole thing , and , since a basic use of CoSy is as a daily log , of which programming is just part of life , you can see the rather messy path of development leading up to the present .

One of the core concepts of APLs is "atomic apply" . In CoSy , nouns generically are reference-counted lists of lists with modulo indexing . Think of them as trees or bushes with individual items as leaves or "atoms" . And many verbs apply "atomically" . That is , the verb applies to each atom , or for a "dyadic" verb , to each corresponding pair of atoms ( leaves ) . For example , in current 4th.CoSy :

i( 0 -1 )i 10 _i iota +i
0 0 2 2 4 4 6 6 8 8

But when lists are nested more deeply , the function has to recursively go out to the simple lists to apply to corresponding leaves . My friend Morten Kromberg recently gave an excellent Google Tech Talk on APL , https://youtu.be/PlM9BXfu7UY , specifically on the market leader , Dyalog , which he and his wife Gitte run . I recommend Morten's talk for a much deeper introduction to the nature and thrust of APL .
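For readers who don't know APL , here is a minimal Python sketch of a recursive dyadic atomic apply with modulo indexing . It is only an illustration , not CoSy's implementation : the name atomic_apply is hypothetical , nested Python lists stand in for CoSy nouns , and it assumes the result takes the length of the longer argument while the shorter one wraps around .

```python
from operator import add


def atomic_apply(fn, a, b):
    """Apply a two-argument fn to corresponding leaves of a and b.

    Assumptions (not taken from CoSy's source): nouns are nested Python
    lists, a bare scalar is treated as a one-item list, and the shorter
    argument is indexed modulo its own length.
    """
    a_is_list, b_is_list = isinstance(a, list), isinstance(b, list)
    if not a_is_list and not b_is_list:
        return fn(a, b)              # two atoms: apply the verb itself
    a = a if a_is_list else [a]      # promote a scalar to a one-item list
    b = b if b_is_list else [b]
    if not a or not b:
        return []                    # nothing to pair an empty list with
    n = max(len(a), len(b))
    # Recurse on each pair of items; the modulo makes the shorter list wrap.
    return [atomic_apply(fn, a[i % len(a)], b[i % len(b)]) for i in range(n)]


if __name__ == "__main__":
    # Mirrors  i( 0 -1 )i 10 _i iota +i  from the post.
    print(atomic_apply(add, [0, -1], list(range(10))))
    # A more deeply nested case: the scalar is paired with every leaf.
    print(atomic_apply(add, [[1, 2], [3]], 10))
```

The first call reproduces the 0 0 2 2 4 4 6 6 8 8 result shown above ; the second shows the recursion reaching the leaves of a nested list .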

In properly implementing the recursive atomic apply "adverb" , as Ken Iverson would call it because it takes verbs as arguments , the desire for a "stack frame" vocabulary to simplify the writing and reading of these ubiquitous sorts of recursive verbs arose . It's in implementing the notions of George B. Lyons : Stack Frames and Local Variables : http://www.forth.com/archive/jfar/vol3/no1/article3.pdf that this bug emerged .

Here's the vocabulary :

s0 cell- dup constant s1 dup dup ! variable, SFptr
| initialize StackFrame pointer
| I'm not sure that what SFptr gets initialized to matters
| | relies for stopping on the 0th stack cell being set to itself

: SF+ | puts previous esi on the stack and saves current
esi@ cell- SFptr xchg ;

: SFx cells SFptr @ + ; | ( n -- n offset by current pointer )

: SF@ SFx @ ; : SF! SFx ! ; | Fetch and store relative to current pointer

: SF- | ( ... n -- drop n ) restores previous stack pointer and drops n items beyond current pointer .
>aux SFptr @ dup @ SFptr ! cell+ esi! aux> ndrop ;


These rely on the words ' esi@ and ' esi! which are the most complicated x86 I've ever written :) .


: esi@ inline{ 8D 76 FC 89 F0 } ; | lea esi,[esi-04] mov eax, esi
| esi contains the current stack ptr ,
| , ie , the address of the item which was ToS when it was called .

: esi! asm{ mov esi, eax } drop ;
| |(| esi@ esi! |)| ends up doing nothing |

: ndrop ( ... n -- drops n cells from stk ) | optimized from ' drop loop .
esi@ swap 1+ cells + esi! ;
| don't understand why :| : ndrop 1+ cells esi@ + esi! ; |: doesn't work .


Here are the definitions of ' dup and ' drop in Reva :

see dup see drop
00427A79 8D 76 FC lea esi,[esi-04]
00427A7C 89 06 mov [esi],eax
00427A7E C3 ret

0042799E 8B 06 mov eax,[esi]
004279A0 8D 76 04 lea esi,[esi+04]
004279A3 C3 ret

That pretty much defines the structure of the stack .


The problem could conceivably be in the reference incrementing and decrementing functions which call ' SF+ and ' SF- on entry and exit of functions . But I figure someone who really groks x86 can point out how : dupdrop dup drop ; can have any effect anywhere in the code . I probably could finesse the problem by sprinkling dupdrop in the right places in my code , but that's no substitute for understanding and eliminating the cause .

Thanks for any help .

Peace thru Freedom

Bob A

Rod Pemberton

Oct 2, 2015, 1:02:07 AM
On Wed, 30 Sep 2015 22:20:42 -0400, Bob Armstrong <b...@cosy.com> wrote:

> | ... weird issue w simply nested fns remains
>
> text> "lf tokcut rho | |>| 1432 | works
> : tst "lf tokcut ; text> tst rho | works not
> : tst "lf dup drop tokcut ; text> tst rho | works
> : tst dup drop "lf tokcut ; text> tst rho | works not
>
> Without even knowing anything else , how can a " dup drop " have any effect ?

Is it that "dup drop" is doing something or fixing something,
or rather is it that _anything_ in between "lf and tokcut fixes
the issue? ...

I.e., my instinct would be that there is something wrong
with either "lf or your outer interpreter which is being reset
by "dup drop" prior to tokcut.


FYI, you seem to think DUP DROP is a NOP, which it effectively
is, but the way it's implemented, it has two side effects which
could be the sources of problems or fixes.

1) It puts a value on the stack, which is still present even
though the stack pointer has been repointed, i.e., the value
is still present above the top of stack location. E.g., it's
possible something may be accessing the value there and then
working.

2) It saves and restores EAX. Yes, that's a side effect.
I.e., perhaps corruption of EAX occurs between "lf and
tokcut without the DUP DROP sequence being present. With the
sequence present, it works correctly. This could be an issue if you're mixing
C (or some other high-level language) and assembly and haven't
preserved the compiler's in-use registers when switching between
assembly and C. These in-use registers get "clobbered" or
destroyed, unless your assembly preserves their contents. EAX
is usually a register which is destroyed by high-level languages.
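Both side effects can be seen in a toy Python model of the two listings from "see dup see drop". This is a sketch under stated assumptions, not Reva itself: the dict keys "eax", "esi" and "mem" and the function dup_drop_trace are hypothetical names, and the stack is indexed in whole cells rather than byte addresses.

```python
def dup(s):
    # lea esi,[esi-04] ; mov [esi],eax
    s["esi"] -= 1                      # one cell down (stack grows downward)
    s["mem"][s["esi"]] = s["eax"]      # spill the cached top of stack


def drop(s):
    # mov eax,[esi] ; lea esi,[esi+04]
    s["eax"] = s["mem"][s["esi"]]      # reload the top of stack from memory
    s["esi"] += 1                      # one cell back up


def dup_drop_trace(tos=7, size=8, sp=4):
    """Return (eax, esi, cell just below the stack pointer) before and after."""
    s = {"eax": tos, "esi": sp, "mem": [None] * size}
    before = (s["eax"], s["esi"], s["mem"][sp - 1])
    dup(s)
    drop(s)
    after = (s["eax"], s["esi"], s["mem"][sp - 1])
    return before, after


if __name__ == "__main__":
    before, after = dup_drop_trace()
    print("before:", before)   # (7, 4, None)
    print("after: ", after)    # (7, 4, 7)
```

In this model EAX and ESI end up unchanged, but EAX has made a round trip through memory, and the cell just below the stack pointer now holds a copy of the top of stack.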


Rod Pemberton


--
Just how many texting and calendar apps does humanity need?
Just how many food articles from neurotic millennials do we need?

Bob Armstrong

Oct 2, 2015, 8:05:43 PM
On Thursday, October 1, 2015 at 11:02:07 PM UTC-6, Rod Pemberton wrote:
> On Wed, 30 Sep 2015 22:20:42 -0400, Bob Armstrong <b...@cosy.com> wrote:
>
> > | ... weird issue w simply nested fns remains
> >
> > text> "lf tokcut rho | |>| 1432 | works
> > : tst "lf tokcut ; text> tst rho | works not
> > : tst "lf dup drop tokcut ; text> tst rho | works
> > : tst dup drop "lf tokcut ; text> tst rho | works not
> >
> > Without even knowing anything else , how can a " dup drop " have any effect ?
>
> Is it that "dup drop" is doing something or fixing something,
> or rather is it that _anything_ in between "lf and tokcut fixes
> the issue? ...

Good point . Seems to be anything that touches the stack . I first saw the behavior when I stuck in a ' $.s for debugging . I just tried

: ts ;
: tst "lf ts tokcut ; text> tst rho | works not .

So seems dependent on stack action .
also

: tst tokcut ; text> "lf tst rho | works just to test that .

BTW : this line retrieves the contents of the text window I'm working in and splits it into , at the moment , 1465 individual ref counted strings .

> I.e., my instinct would be that there is something wrong
> with either "lf or your outer interpreter which is being reset
> by "dup drop" prior to tokcut.

"lf is just a constant holding a 1 character ( CoSy ) string containing a line feed .

Get the same problem with any string
daylnTok
s"
| ======================== | "

: daylncut ( str -- listOFstrings ) daylnTok dup drop tokcut ;

text> daylncut rho |>| 150 | split on days

> FYI, you seem to think DUP DROP is a NOP, which it effectively
> is, but the way it's implemented, it has two side effects which
> could be the sources of problems or fixes.

This is the question on which my minimal , grudging knowledge of x86 leaves me wondering .

> 1) It puts a value on the stack, which is still present even
> though the stack pointer has been repointed, i.e., the value
> is still present above the top of stack location. E.g., it's
> possible something may be accessing the value there and then
> working.

Good point . It occurred to me that maybe an extra copy of the "lf was getting used .

But
: tst "lf 1 drop tokcut ; text> tst rho |>| 1475 | works

and a raw 1 would bomb tokcut .

> 2) It saves and restores EAX. Yes, that's a side effect.
> I.e., perhaps corruption of EAX occurs between "lf and
> tokcut without the DUP DROP sequence being present. Therefore,
> it works correctly. This could be an issue if you're mixing
> C (or some other high-level language) and assembly and haven't
> preserved the compiler's in-use registers when switching between
> assembly and C. These in-use registers get "clobbered" or
> destroyed, unless your assembly preserves their contents. EAX
> is usually a register which is destroyed by high-level languages.

As I mentioned , a reason I chose Reva is that it is completely in a machine language and I didn't want any layer which was not open to me between me and the chip language . I first programmed a computer ( an IBM 1620 ) in 1963 and by the time C/UNIX came along I had left all lesser languages behind for APL .

So the problem has to be right here in very basic code .
I just quit the IUP gui my interface is normally in and executed directly in the Reva console in case the gui could possibly be affecting execution ( I don't really know x86 interrupt facilities ) , but same behavior .

> Rod Pemberton

Thanks for the thoughts . For now I'm finessing the problem with dup drops to get some of my important search functions running again , but also stripping down some examples of the reference incrementing and decrementing functions which call these to see if I can zero in on it that way .

Thanks again
Bob A -- http://CoSy.com

Rod Pemberton

Oct 2, 2015, 10:29:24 PM
On Fri, 02 Oct 2015 20:05:39 -0400, Bob Armstrong <b...@cosy.com> wrote:

> On Thursday, October 1, 2015 at 11:02:07 PM UTC-6, Rod Pemberton wrote:
>> On Wed, 30 Sep 2015 22:20:42 -0400, Bob Armstrong <b...@cosy.com> wrote:

>> > | ... weird issue w simply nested fns remains
>> >
>> > text> "lf tokcut rho | |>| 1432 | works
>> > : tst "lf tokcut ; text> tst rho | works not
>> > : tst "lf dup drop tokcut ; text> tst rho | works
>> > : tst dup drop "lf tokcut ; text> tst rho | works not
>> >
>> > Without even knowing anything else , how can a " dup drop " have any effect ?
>>
>> Is it that "dup drop" is doing something or fixing something,
>> or rather is it that _anything_ in between "lf and tokcut fixes
>> the issue? ...
>
> Good point . Seems to be anything that touches the stack .
> I first saw the behavior when I stuck in a ' $.s for debugging .
> I just tried
>
> : ts ;
> : tst "lf ts tokcut ; text> tst rho | works not .
>
> So seems dependent on stack action .
> also
>
> : tst tokcut ; text> "lf tst rho | works just to test that .
>
> [...]
>
> It occurred to me that maybe an extra copy of the "lf was getting used .
>
> But
> : tst "lf 1 drop tokcut ; text> tst rho |>| 1475 | works
>
> and a raw 1 would bomb tokcut .

Since Reva or Cosy keeps the TOS (top-of-stack) in a register,
pushing a one (1) to the stack places one (1) in EAX and
saves EAX's prior value on the stack, which is then restored
by DROP, the same as DUP DROP. So, '1 DROP' can't be used
to rule out corruption of EAX or modification of its value.

At this point, my suspicions would be:

1) corruption of EAX

2) a saved address on the return stack not accounted for
and in the way of needed data

3) something else off-by-one etc

Is tokcut a Reva or Cosy word?


Rod Pemberton

Anton Ertl

Oct 8, 2015, 2:37:53 AM
Bob Armstrong <b...@cosy.com> writes:
>Good point . It occurred to me that maybe an extra copy of the "lf was getting used .
>
>But
>: tst "lf 1 drop tokcut ; text> tst rho |>| 1475 | works

There is obviously a bug in play here. As the potential number of
bugs is infinite, we can only guess about the bug if you don't show
the code that the Forth system generates here.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2015: http://www.rigwit.co.uk/EuroForth2015/

Bob Armstrong

Oct 12, 2015, 2:35:55 AM
Rod and Anton , thanks for your feedback . It's hard to find the time to attack this problem .

I've been trying to strip it down to the nub . Here's where I am :

: SF+ esi@ cell- SFptr xchg ;
| puts previous esi on the stack and saves current

reset 1 SF+ 0 SF+ $.s
(4) 1 427870 0 427868
reset : stst SF+ 0 SF+ ; 1 stst $.s | 0 could be any number . gets replaced
(4) 1 427870 4 427868

reset 1 SF+ 0 0 SF+ $.s
(5) 1 427870 0 0 427868
reset : stst SF+ 0 0 SF+ ; 1 stst $.s
(5) 1 427870 0 432078 427868

| as per Anton's important point :
see SF+
0062AD8C E8 DB BE FF FF call esi@
0062AD91 8D 40 FC lea eax,[eax-04]
0062AD94 8D 76 FC lea esi,[esi-04]
0062AD97 89 06 mov [esi],eax
0062AD99 B8 5C AD 62 00 mov eax,0062AD5C
0062AD9E E9 91 0B FF FF jmp xchg
see esi@
00626C6C 8D 76 FC lea esi,[esi-04]
00626C6F 89 F0 mov eax,esi
00626C71 C3 ret
see cell-
00427D2F 8D 40 FC lea eax,[eax-04]
00427D32 C3 ret
see xchg
0061B934 8B 18 mov ebx,[eax]
0061B936 8B 0E mov ecx,[esi]
0061B938 89 08 mov [eax],ecx
0061B93A 89 D8 mov eax,ebx
0061B93C 8D 76 04 lea esi,[esi+04]
0061B93F C3 ret

| Sucks that Google of all people can't handle html .

Thanks for the intelligent eyes .

B...@CoSy.com
--

Bob Armstrong

Oct 12, 2015, 2:51:25 AM
I realized I forgot the most important layer :

see stst
0063621C E8 6B 4B FF FF call SF+
00636221 8D 76 FC lea esi,[esi-04]
00636224 89 06 mov [esi],eax
00636226 B8 00 00 00 00 mov eax,00000000
0063622B 8D 76 FC lea esi,[esi-04]
0063622E 89 06 mov [esi],eax
00636230 B8 00 00 00 00 mov eax,00000000
00636235 E9 52 4B FF FF jmp SF+

| Or fix the font .

Anton Ertl

Oct 12, 2015, 6:00:54 AM
It seems that you have TOS in eax and SP in esi. ESI@ fails to store
the previous value of the TOS into memory, so whatever was there
previously remains there. This could be the bug you are after.
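The effect can be modelled in a few lines of Python. This is a toy sketch, not Reva: the class Machine, its method names, the STALE marker and the value 42 (standing in for "lf) are all hypothetical, and cells are indexed in whole cells rather than byte addresses. The suggestion that DUP DROP and 1 DROP masked the bug by leaving a copy of the TOS in the cell that ESI@ skips over is one plausible reading of the listings in this thread, not something anyone has confirmed.

```python
class Machine:
    """Toy model: eax caches the top of stack, esi indexes the rest in mem."""

    STALE = 0xDEAD  # hypothetical leftover from some earlier stack activity

    def __init__(self, size=8):
        self.mem = [self.STALE] * size
        self.esi = size          # empty stack; it grows downward
        self.eax = 0

    def push(self, n):
        # literal: lea esi,[esi-04] ; mov [esi],eax ; mov eax,n
        self.esi -= 1
        self.mem[self.esi] = self.eax
        self.eax = n

    def dup(self):
        # lea esi,[esi-04] ; mov [esi],eax
        self.esi -= 1
        self.mem[self.esi] = self.eax

    def drop(self):
        # mov eax,[esi] ; lea esi,[esi+04]
        self.eax = self.mem[self.esi]
        self.esi += 1

    def esi_fetch_buggy(self):
        # lea esi,[esi-04] ; mov eax,esi   (old TOS is never written out)
        self.esi -= 1
        self.eax = self.esi

    def esi_fetch_fixed(self):
        # lea esi,[esi-04] ; mov [esi],eax ; mov eax,esi
        self.esi -= 1
        self.mem[self.esi] = self.eax
        self.eax = self.esi


def second_item(fetch, warm_up="none"):
    """Push 42, optionally run a 'no-op' sequence, fetch esi, and return
    the second stack item, which should be the 42 that was on top."""
    m = Machine()
    m.push(42)
    if warm_up == "dup drop":
        m.dup()
        m.drop()
    elif warm_up == "1 drop":
        m.push(1)
        m.drop()
    getattr(m, fetch)()
    return m.mem[m.esi]


if __name__ == "__main__":
    print(hex(second_item("esi_fetch_buggy")))        # 0xdead: stale cell
    print(second_item("esi_fetch_buggy", "dup drop"))  # 42: masked by dup drop
    print(second_item("esi_fetch_buggy", "1 drop"))    # 42: masked by 1 drop
    print(second_item("esi_fetch_fixed"))              # 42: correct regardless
```

In this model the version without the store returns whatever happened to be in memory, unless an earlier DUP or literal push already spilled the TOS into that cell; the version that stores EAX first does not depend on such leftovers.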

As for the font, yes, a fixed-width font is a good idea for
programming newsgroups.

Bob Armstrong

Oct 12, 2015, 7:19:37 PM
Anton ,

Thanks . I think Rod was pointing to the same problem . I'll plead general ignorance of x86 as an excuse , but I really should have understood the absolute need for saving eax even when getting the stack pointer itself . Here's the now-working definition :

: esi@ inline{ 8D 76 FC 89 06 89 F0 } ;
| lea esi,[esi-04] mov [esi],eax mov eax, esi
| esi contains the current stack ptr ,
| , ie , the address of the item which was ToS when it was called .

The Stack Frame vocabulary makes writing recursive functions , in particular , much more readable .

I think one problem I had was in considering interactive mode "correct" , and compiled suspect . Now , I'm not about to spend time figuring out why interactive happened to work ok .

Thanks again ,

Bob A
--