HELP - ASSEMBLER NEWBIE
Frank Kotler  
Newsgroups: comp.lang.asm, alt.lang.asm
From: Frank Kotler <fbkot...@ne.mediaone.net>
Date: Thu, 21 Feb 2002 00:08:13 GMT
Local: Wed, Feb 20 2002 7:08 pm
Subject: Re: HELP - ASSEMBLER NEWBIE

Dzemal Kulenovic wrote:

Hi Dzemal. This newsgroup is pretty much dead, which is why you aren't
getting any responses. Try posting in news:comp.lang.asm.x86 - that's a
moderated group, and you *might* have to cc clax-sub...@crayne.org to
get through. Also news:alt.lang.asm is a lot more active than this one,
tho not limited to x86.

I don't think I'll be able to help you very much.

> Ok, here's the c++ function:

In the first place, I don't know C++ :) (but this doesn't look too
tough)

> inline void _stdcall SetPixel(int y,int x,char color)
> {
>      __int16 q=y>>1;
>      (pBitmapRow[x])[q] &= 0xF << (y & 1 ? 4 : 0);
>      (pBitmapRow[x])[q] |= color << (y & 1 ? 0 : 4);
> }

In the second place, I'm not sure I understand what we're doing here.
What video mode is this? (I only know the really simple-minded ones) It
*looks* to me like maybe a 16-color mode - that is, 4 bits per pixel. Is
that right? I think I'm confused, because it looks to me that while we
set the color of an "odd" pixel, we zero out the "even" pixel, and vice
versa. I think I'm missing something - no great surprise. Wait, we're
basing this on whether the *row* is odd or even. I'm just confused.
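Tracing the two statements by hand suggests the neighbouring pixel is not zeroed after all: the AND keeps the *other* nibble and clears the one about to be written, and the OR then drops the color into the cleared nibble. Since `pBitmapRow[x]` selects the row, `y` appears to be the position within that row, whatever the parameter names suggest. A minimal sketch in portable C++, assuming 4 bits per pixel and a color in the range 0..15; `PutNibble` is a hypothetical helper that works on one byte instead of touching `pBitmapRow`:

```cpp
#include <cstdint>

// Hypothetical helper: returns the new value of one packed byte.
// Mirrors the two statements of SetPixel:
//   odd y  -> color goes into the low nibble, high nibble is preserved
//   even y -> color goes into the high nibble, low nibble is preserved
constexpr std::uint8_t PutNibble(std::uint8_t old, int y, std::uint8_t color) {
    const int keepShift  = (y & 1) ? 4 : 0;   // position of the nibble we keep
    const int colorShift = (y & 1) ? 0 : 4;   // position of the nibble we overwrite
    return static_cast<std::uint8_t>((old & (0xF << keepShift)) | (color << colorShift));
}

// Odd y: high nibble 0xA is kept, low nibble becomes 0x3.
static_assert(PutNibble(0xAB, 1, 0x3) == 0xA3, "odd y writes the low nibble");
// Even y: low nibble 0xB is kept, high nibble becomes 0x3.
static_assert(PutNibble(0xAB, 0, 0x3) == 0x3B, "even y writes the high nibble");
```

Note that, like the original, this does not mask `color`, so a value above 0xF would spill into the neighbouring nibble.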

It's really important to understand exactly what we *need* to do here,
because the first step is to optimize the algorithm, before we even
*think* about asm. A crappy algorithm implemented in exquisite assembly
is still crappy code!

> VC++ disassembly window (debug build of course therefore unoptimized) says
> this translate to:

Can't you get VC++ to spit out asm for an optimized compile?

> mov      eax,dword ptr [ebp+0Ch]
> sar        eax,1
> mov         word ptr [ebp-4],ax

Okay, this is just calculating "q" and storing it in a temporary
variable. I'm not sure we need to do this at all (store it, that is...).
I think we'd be better off if we hadn't made it int16, too - the
processor is generally happier operating on its "native" size.

> movsx       edx,word ptr [ebp-4]

Move "q" into edx - we do this repeatedly, and I don't think we need to.
I think we could just use edx in the first place, and leave it there.

> mov         eax,dword ptr [ebp+10h]

"x"

> mov         eax,dword ptr [eax*4+4369E8h]

"*x", if I understand it...

> mov         ecx,dword ptr [ebp+0Ch]

"y"

> and         ecx,1
> neg         ecx
> sbb         ecx,ecx
> and         ecx,4

Make cl either 4 or 0 ...

> mov         ebx,0Fh
> shl         ebx,cl

Make ebx either 0F0h or 0Fh...

> mov         cl,byte ptr [eax+edx]

Get our "destination" byte.

> and         cl,bl

Mask out just the nibble we want (whyever we want it! :)

> movsx       edx,word ptr [ebp-4]

Get "q" into edx - wasn't it already there?

> mov         eax,dword ptr [ebp+10h]
> mov         eax,dword ptr [eax*4+4369E8h]

"x" and "*x"

> mov         byte ptr [eax+edx],cl

Move our "anded" byte back to "destination".

> movsx       edx,word ptr [ebp-4]

Get "q"... again.

> mov         eax,dword ptr [ebp+10h]
> mov         eax,dword ptr [eax*4+4369E8h]

"*x"

> movsx       ebx,byte ptr [ebp+14h]

Color into ebx. (bl is all we really need)

> mov         ecx,dword ptr [ebp+0Ch]
> and         ecx,1
> neg         ecx
> sbb         ecx,ecx
> and         ecx,0FCh
> add         ecx,4

Make ecx our shift count depending on "odd" or "even"...

> shl         ebx,cl

Shift the color - or not.
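The `and` / `neg` / `sbb` runs look like the compiler's branchless way of evaluating `y & 1 ? a : b`: `neg` sets the carry flag when its operand is non-zero, and `sbb ecx,ecx` then leaves either all ones or zero in the register. A rough C++ rendering of both shift-count sequences (the function names are made up for illustration):

```cpp
#include <cstdint>

// and ecx,1 / neg ecx / sbb ecx,ecx  ->  0xFFFFFFFF if y is odd, else 0
constexpr std::uint32_t OddMask(std::uint32_t y) {
    return 0u - (y & 1u);
}

// ... followed by and ecx,4  ->  (y & 1 ? 4 : 0), the shift for the 0Fh mask
constexpr std::uint32_t MaskShift(std::uint32_t y) {
    return OddMask(y) & 4u;
}

// ... followed by and ecx,0FCh / add ecx,4  ->  cl is (y & 1 ? 0 : 4),
// the shift for the color. For odd y the sum is 100h, whose low byte is 0.
constexpr std::uint8_t ColorShift(std::uint32_t y) {
    return static_cast<std::uint8_t>((OddMask(y) & 0xFCu) + 4u);
}
```

Only `cl` is used as the shift count, which is why the 100h in the odd case behaves as a shift by zero.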

> mov         cl,byte ptr [eax+edx]

Our destination byte.

> or          cl,bl

Or it with our shifted color.

> movsx       edx,word ptr [ebp-4]

Get "q", in case we mislaid it.

> mov         eax,dword ptr [ebp+10h]
> mov         eax,dword ptr [eax*4+4369E8h]

"*x" again.

> mov         byte ptr [eax+edx],cl

Move our byte back into its destination.

> Now I don't know a first thing about asm but still, 36 INSTRUCTIONS ???!!

Hehe! The "first thing about asm" is that it's most likely *going* to
take a lot of instructions :) Don't put too much stock in the
instruction count - often speed-optimized code is a *lot* longer than
the size-optimized version. But I think an optimized version of this
would be shorter just by eliminating duplicate code. I'd think in terms
of calculating the destination address *once* and hanging on to it.
Likewise, the "shift-count" could probably be figured just once - I
still don't get why we're basing this on "y"! (so my analysis above may
be totally off-base)
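Before dropping to asm, the "calculate it once" idea can be tried in C++ itself. This is only a sketch under assumptions: the type of `pBitmapRow` isn't shown in the post, so it is taken here to be an array of pointers to bytes and passed in as the hypothetical parameter `rows`; `SetPixelOnce` is likewise a made-up name.

```cpp
#include <cstdint>

// Assumption: pBitmapRow is an array of pointers to packed 4-bit pixel data.
// It is passed in as `rows` so the sketch stands alone.
inline void SetPixelOnce(std::uint8_t* const* rows, int y, int x, std::uint8_t color) {
    std::uint8_t* const dest = rows[x] + (y >> 1);   // destination address, computed once
    const int shift = (y & 1) ? 0 : 4;               // shift count, computed once
    const unsigned keep = 0xF0u >> shift;            // mask for the nibble we preserve
    *dest = static_cast<std::uint8_t>((*dest & keep) |
                                      (static_cast<unsigned>(color) << shift));
}
```

With one read and one write of the destination byte, an optimizing compiler has a much easier job than with the two separate read-modify-write statements.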

> I figure there must be a more optimal way to encode this in assembler (I
> will use C++ inline assembler)

I don't know the syntax for C++ inline assembler (I'm a devout Nasmist),
so I'm not going to be able to help you there. It's a problem, because I
can't readily test any "bright ideas" I might come up with :)

> So, can anyone PLEASE help me with better asm code- I desperately need this
> routine to run as fast as possible.

You may need to re-think what you're doing "from the top". If you're
filling any horizontal "runs" of pixels, for example, there are faster
ways of doing this than by calling setpixel repeatedly - and other
speedups. You might want to store your data in a different manner for
faster access - "offset thinking" is faster than "row-column thinking".
At the very least, we can get rid of some of the unnecessary code!

It would be interesting to see what your compiler comes up with for
optimized code, if you can get it to do it.

But there are surely optimized setpixel's available, if you know where
to look. And folks more knowledgeable than I in a livelier newsgroup.
I'm going to cross-post this to alt.lang.asm - I hope you can look for
replies there(?). We *desperately* need some asm to talk about over
there! :)

Best,
Frank


Debs  
Newsgroups: comp.lang.asm, alt.lang.asm
From: Debs <d...@spamfree.net>
Date: Thu, 21 Feb 2002 02:57:47 +0000
Local: Wed, Feb 20 2002 9:57 pm
Subject: Re: HELP - ASSEMBLER NEWBIE
Hello cyberfolk!

On Thu, 21 Feb 2002 00:08:13 GMT, Frank Kotler spake thus:

>Dzemal Kulenovic wrote:

>Hi Dzemal. This newsgroup is pretty much dead, which is why you aren't
>getting any responses. Try posting in news:comp.lang.asm.x86 - that's a
>moderated group, and you *might* have to cc clax-sub...@crayne.org to
>get through. Also news:alt.lang.asm is a lot more active than this one,
>tho not limited to x86.

Hi :) As Frank says, c.l.a is not the liveliest of assembly language
groups; I read this post in a.l.a.

>> Ok, here's the c++ function:

>In the first place, I don't know C++ :) (but this doesn't look too
>tough)

>> inline void _stdcall SetPixel(int y,int x,char color)
>> {
>>      __int16 q=y>>1;
>>      (pBitmapRow[x])[q] &= 0xF << (y & 1 ? 4 : 0);
>>      (pBitmapRow[x])[q] |= color << (y & 1 ? 0 : 4);
>> }

If I understand this correctly, we are saying (in PseudoCode):

{
define z = (pBitmapRow[x])[q]

        int     q = y shr 1     ; y is at ebp+0Ch

        if(y AND 1)             ; y is odd
        {
                z = z  AND 0xF0
        }
        else
        {
                z = z  AND 0xF
        }

        if(y AND 1)             ; y is odd
        {
                z = z OR color
        }
        else
        {
                z = z OR (color shl 4)
        }

}

I'm not sure how you would code (pBitmapRow[x])[q] in assembler, so
I'll leave that part of it the same as in the original code (you'll
have to convert to make use of the library code).

inline void _stdcall SetPixel(int y,int x,char color)
{
     __int32 q=y>>1;
     (pBitmapRow[x])[q] &= 0xF << (y & 1 ? 4 : 0);
     (pBitmapRow[x])[q] |= color << (y & 1 ? 0 : 4);

}

I changed q to a 32-bit int, for the same reason Frank gave (smaller,
faster code).

>> mov      eax,dword ptr [ebp+0Ch]
>> sar        eax,1

#define z = (pBitmapRow[x])[q]  ; this is just to make the code easier
                                ; to show you

        mov     eax,dword ptr[ebp+0Ch]  ; y
        mov     ecx,eax                 ; less memory reads
        sar     eax,1                   ; ecx = y, eax = q

>> mov         word ptr [ebp-4],ax
>> movsx       edx,word ptr [ebp-4]

        mov     esi,eax                 ; 2 less memory accesses
                                        ; esi = q

>> mov         eax,dword ptr [ebp+10h]

        mov     eax,dword ptr[ebp+10h]  ; x

>> mov         eax,dword ptr [eax*4+4369E8h]

        mov     eax,dword ptr[eax*4+4369E8h]
                                        ; eax = (pBitmapRow[x])

; the above constant is set up by the system and is dependent on the
; position in memory of the resident code. I don't know how you would
; code that in assembler.

>> mov         ecx,dword ptr [ebp+0Ch]
>> and         ecx,1

        and     ecx,1                   ; ecx = y & 1

>> neg         ecx
>> sbb         ecx,ecx
>> and         ecx,4

        shl     ecx,2                   ; ecx = (y & 1 ? 4 : 0)

>> mov         ebx,0Fh
>> shl         ebx,cl

        mov     ebx,0Fh
        shl     ebx,cl                  ; ebx = 0xF << (y&1 ? 4:0)

>> mov         cl,byte ptr [eax+edx]
>> and         cl,bl

        xor     edx,edx
        mov     dl,byte ptr[eax+esi]    ; edx = z, zero-extended
        and     edx,ebx         ; edx = z & (0xF << (y&1 ? 4:0))

The following code makes little sense to me (basically because it
makes no attempt to optimise as a routine), so I'll ignore it and just
show you how I would write it without trying to compare to the code
MSVC gave you.

        xor     cl,4                    ; ecx = (y & 1 ? 0 : 4)
        mov     ebx,[ebp+14h]           ; ebx low byte = color
                                ; the processor would have pushed as
                                ; a dword, so this is OK.

        shl     ebx,cl                  ; ebx = color<<(y&1 ? 0:4)

        or      edx,ebx                 ; z = z OR color<<(y&1 ? 0:4)
                                        ; edx = dl = z
        mov     byte ptr[eax+esi],dl

I'm sure I haven't provided the most optimal way of doing it, but 18
instructions with 6 memory accesses, instead of 36 instructions and 21
memory accesses!
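One cheap way to gain some confidence in a rewrite like this, without an assembler to hand, is to transcribe it register by register into C++ and compare it against the original two statements for every input. The sketch below does that for a single destination byte; the function names are hypothetical, and the address calculation is left out because it is the same in both versions.

```cpp
#include <cstdint>

// The two C++ statements from SetPixel, applied to one byte z.
constexpr std::uint8_t Reference(std::uint8_t z, int y, std::uint8_t color) {
    z = static_cast<std::uint8_t>(z & (0xF << ((y & 1) ? 4 : 0)));
    z = static_cast<std::uint8_t>(z | (color << ((y & 1) ? 0 : 4)));
    return z;
}

// Register-by-register transcription of the shortened sequence.
constexpr std::uint8_t Shortened(std::uint8_t z, int y, std::uint8_t color) {
    std::uint32_t ecx = static_cast<std::uint32_t>(y);
    ecx &= 1;                      // and ecx,1
    ecx <<= 2;                     // shl ecx,2   -> 4 or 0
    std::uint32_t ebx = 0x0F;      // mov ebx,0Fh
    ebx <<= ecx;                   // shl ebx,cl  -> 0F0h or 0Fh
    std::uint32_t edx = z;         // load destination byte, zero-extended
    edx &= ebx;                    // and edx,ebx
    ecx ^= 4;                      // xor cl,4    -> 0 or 4
    ebx = color;                   // load color (only the low byte matters)
    ebx <<= ecx;                   // shl ebx,cl
    edx |= ebx;                    // or edx,ebx
    return static_cast<std::uint8_t>(edx);   // store dl
}

// Exhaustive comparison: every byte value, both parities, all 16 colors.
constexpr bool SequencesAgree() {
    for (int z = 0; z < 256; ++z)
        for (int y = 0; y < 2; ++y)
            for (int c = 0; c < 16; ++c)
                if (Reference(static_cast<std::uint8_t>(z), y, static_cast<std::uint8_t>(c)) !=
                    Shortened(static_cast<std::uint8_t>(z), y, static_cast<std::uint8_t>(c)))
                    return false;
    return true;
}

static_assert(SequencesAgree(), "shortened sequence matches the C++ statements");
```

This checks the logic only; it says nothing about speed, which still has to be measured on the real routine.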

The part that you will have to work out (if nobody else is able to
help) is how to get the address for (pBitmapRow[x]), as it won't be
4369E8h every time you run it.

>> I figure there must be a more optimal way to encode this in assembler (I
>> will use C++ inline assembler)

>I don't know the syntax for C++ inline assembler (I'm a devout Nasmist),
>so I'm not going to be able to help you there. It's a problem, because I
>can't readily test any "bright ideas" I might come up with :)

Me neither, but I figure that using the syntax in the posted code will
provide something that should work :)

>> So, can anyone PLEASE help me with better asm code- I desperately need this
>> routine to run as fast as possible.

Like Frank I'm a committed nasm user, and I haven't got a lot of idea
how to do things with other assemblers. I pick it up as I go, by
looking at how others write code for those other assemblers (and, of
course, by looking at code samples online) :) The above should do the
same as the code VC produced, but it should be a lot faster (I'd guess
more than twice the speed, because of the number of memory accesses
saved as well as being half the number of instructions).

I agree with what Frank said about rethinking the algorithm if you are
writing a lot of pixels. A line can be drawn far faster than writing
it pixel by pixel, whether it's horizontal, vertical or diagonal, as
coding the entire line in assembler allows you to write multiple
pixels in one write (if they are adjacent horizontally) and to reuse
values in registers without saving in memory between calls when they
are not horizontally adjacent.
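As a rough illustration of writing several adjacent pixels at once, here is a C++ sketch of a run fill for one packed row. It assumes the same layout as SetPixel (even positions in the high nibble, odd positions in the low nibble) and a color in the range 0..15; `FillRun` and its parameters are hypothetical names, not part of the posted code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: set pixels [first, last) of one packed row to `color`.
// Partial bytes at either end get a nibble write; everything in between is
// filled a whole byte (two pixels) at a time.
inline void FillRun(std::uint8_t* row, int first, int last, std::uint8_t color) {
    if (first >= last) return;
    if (first & 1) {                       // leading odd pixel: low nibble only
        row[first >> 1] = static_cast<std::uint8_t>((row[first >> 1] & 0xF0) | color);
        ++first;
    }
    if (first < last && (last & 1)) {      // trailing even pixel: high nibble only
        --last;
        row[last >> 1] = static_cast<std::uint8_t>((row[last >> 1] & 0x0F) | (color << 4));
    }
    if (first < last) {                    // whole bytes in the middle
        std::memset(row + (first >> 1), (color << 4) | color,
                    static_cast<std::size_t>((last - first) >> 1));
    }
}
```

For example, filling positions 1 through 6 of a four-byte row touches the first and last bytes with a nibble write each and sets the two middle bytes with a single `memset`.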

--
Debs
d...@dwiles.nospam.demon.co.uk
----
If you're not part of the solution, start another problem!


Randall Hyde  
Newsgroups: comp.lang.asm, alt.lang.asm
From: "Randall Hyde" <rh...@cs.ucr.edu>
Date: Thu, 28 Feb 2002 17:14:10 GMT
Local: Thurs, Feb 28 2002 12:14 pm
Subject: Re: HELP - ASSEMBLER NEWBIE

Frank Kotler wrote in message <3C743A59.61DFD...@ne.mediaone.net>...
>Dzemal Kulenovic wrote:

>Hi Dzemal. This newsgroup is pretty much dead, which is why you aren't
>getting any responses. Try posting in news:comp.lang.asm.x86 - that's a
>moderated group, and you *might* have to cc clax-sub...@crayne.org to
>get through. Also news:alt.lang.asm is a lot more active than this one,
>tho not limited to x86.

It seemed to be dead.
Then we get this post from PacBell and discover that
it's really been alive, but stuff has been getting lost.
Shades of comp.lang.asm.x86!
Randy Hyde
