--
Patrick D. Rockwell
mailto:proc...@thegrid.net
mailto:HNH...@prodigy.net
mailto:patri...@aol.com
I doubt that coding in assembly is going to dramatically increase the
speed of your routine. Your Pascal compiler should be using the math
coprocessor, which is what you would be using in assembly too. (At least,
I hope that Borland Pascal is no longer using its own floating-point
format for "real" variables. If it is, you can switch to "single" or "double",
which use the standard IEEE format that the coprocessor supports.) However, if
you insist, my on-line assembly tutorial covers basic coprocessor programming.
Go to my URL below to find it.
BTW, why did you post this to comp.lang.c++??
--
Paul Carter [http://www.geocities.com/pacman0x80]
> How do you do exponentiation in assembly language? For example, how do
> you raise 3 to the power of
> 4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> Pascal unit which will integrate the
> area under a curve with assembly language which is much faster than
> pascal is. Thanks in advance.
>
See chapter 14 of the Art of Assembly at http://webster.cs.ucr.edu
Keep in mind that your code won't be tremendously faster than Pascal
unless the Pascal code is really bad. The FPU is probably going to
be the limiting factor here. OTOH, you can probably double the
speed with a little effort.
Randy Hyde
"Patrick D. Rockwell" wrote:
>
> How do you do exponentiation in assembly language?
Depends what processor you're on. And why do you ask a question
about speeding up PASCAL performance in ASSEMBLY in a C++ group?
I have no idea whether you're limited to using the CPU only or not. But using the FPU
would be much faster and simpler. Of course, if you want to spend a lot
of time figuring out how to implement this in software w/o an FPU, you can. You
will be cool after you finish. :)
Good Luck.
--
Alexei A. Frounze
alexfru [AT] chat [DOT] ru
frounze [AT] ece [DOT] rochester [DOT] edu
http://alexfru.chat.ru
http://members.xoom.com/alexfru/
http://welcome.to/pmode/
"Patrick D. Rockwell" <proc...@thegrid.net> wrote in message
news:FPvQ5.1223$ue.1...@newsread1.prod.itd.earthlink.net...
> How do you do exponentiation in assembly language? For example, how do
> you raise 3 to the power of
> 4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> Pascal unit which will integrate the
> area under a curve with assembly language which is much faster than
> pascal is. Thanks in advance.
>
Very easy. You write the math function yourself, or you obtain a math
library for the assembler you are working with.
BTW, assembly is faster to execute but much longer to write in than a high
level language.
Why don't you just profile your program (find out where the bottlenecks
are) then optimize those parts?
-- Thomas
The typical way on 486 or better processors is to use the built-in FPU. Some
of the elementary mathematical functions are performed directly by the FPU,
but in other cases - you are provided with the necessary building blocks.
For example, sin() and cos() are performed directly using the FSIN and FCOS
instructions, but exp() has to be performed by a combination of FLDL2E,
F2XM1 and a few "glue" floating-point operations.
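To make the building-block idea concrete, here is a minimal sketch in portable C++ rather than x87 assembly. It only imitates the instruction sequence: the function names exp_via_pow2 and pow_via_pow2 are made up for illustration, and std::nearbyint, std::exp2 and std::ldexp stand in for FRNDINT, F2XM1 and FSCALE. Treat it as an outline of the arithmetic, not as code taken from the Intel manuals.

```cpp
#include <cassert>
#include <cmath>

// e^x computed the way the x87 building blocks suggest: rewrite it as 2^t,
// split t into an integer and a small fraction, and handle each separately.
double exp_via_pow2(double x) {
    const double log2e = 1.4426950408889634;     // the constant FLDL2E loads
    double t = x * log2e;                        // e^x == 2^t
    double n = std::nearbyint(t);                // FRNDINT: nearest integer
    double f = t - n;                            // fraction, |f| <= 0.5
    double two_f_minus_1 = std::exp2(f) - 1.0;   // what F2XM1 returns
    return std::ldexp(two_f_minus_1 + 1.0, static_cast<int>(n));  // FSCALE
}

// x^y for x > 0 uses the same split with t = y * log2(x), the value that
// FYL2X leaves on the FPU stack.
double pow_via_pow2(double x, double y) {
    assert(x > 0.0);
    double t = y * std::log2(x);
    double n = std::nearbyint(t);
    double f = t - n;
    return std::ldexp(std::exp2(f), static_cast<int>(n));
}
```

With these definitions, pow_via_pow2(3.0, 4.37) agrees with std::pow(3.0, 4.37) to many digits. 2^x and 10^x follow the same pattern with t = x and t = x * log2(10).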
Note that some operations are only defined by the FPU for a limited range,
even when the mathematical range is larger. If you anticipate values outside
that range, you must reduce the argument using mathematical identities before
applying the operation. Examples are FSIN and FCOS, which are only defined
for the range +/-2^63 (calculation of trig functions on larger values is
essentially meaningless; all information about the position of the angle in
the circle has been lost).
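The remark about lost information can be checked numerically. The sketch below (ulp_above is a made-up helper name) measures the gap between neighbouring representable values at a given magnitude. It uses the 64-bit double type as an assumption; the FPU's 80-bit extended format has 11 more mantissa bits, so its gap at 2^63 is 1 rather than 2048, which is still a sizeable fraction of a full turn.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
double ulp_above(double x) {
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

For example, ulp_above(std::ldexp(1.0, 63)) is 2048, far more than 2*pi, so two adjacent doubles of that size are hundreds of full turns apart.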
I suggest that you visit Intel's site (http://develop.intel.com) and
download the 486 or Pentium series programmer's manual.
Daniel Pfeffer
Before you do that, make sure that your integration routine is fast
enough.
What exactly do you mean? Do you want to enter the formula as
a string and then compile it into machine code instructions instead
of evaluating it with an interpretive method such as an expression tree or
stacks?
Osmo
>How do you do exponentiation in assembly language? For example, how do
>you raise 3 to the power of
>4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
>Pascal unit which will integrate the
>area under a curve with assembly language which is much faster than
>pascal is. Thanks in advance.
Use the Power function in the math unit, which is available in
the professional and C/S version. It might not be in the basic
version, I'm not sure.
Assembly won't be significantly faster than Pascal.
Jud McCranie
>Use the Power function in the math unit, which is available in
>the professional and C/S version. ...
Whoops, I thought I was in a Delphi NG when I wrote this reply.
If you have a numeric coprocessor and use the single, double, or
extended type, exp(y*ln(x)) for x^y, where y is a real number and x > 0, is
probably fast enough. If y is an integer, there are obviously
better methods.
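A minimal C++ sketch of both cases, assuming nothing beyond the standard math library. The names power_real and power_int are hypothetical, and repeated squaring is only one of the "better methods" for integer exponents; the post does not say which one was meant.

```cpp
#include <cassert>
#include <cmath>

// x^y for real y; only valid for x > 0 because ln(x) must exist.
double power_real(double x, double y) {
    assert(x > 0.0);
    return std::exp(y * std::log(x));
}

// x^n for integer n by repeated squaring: about log2(n) multiplications
// instead of n - 1.
double power_int(double x, int n) {
    long long e = n;                 // widen so that negating cannot overflow
    bool negative = e < 0;
    if (negative) e = -e;
    double result = 1.0;
    while (e > 0) {
        if (e & 1) result *= x;      // this bit of the exponent is set
        x *= x;                      // square the base for the next bit
        e >>= 1;
    }
    return negative ? 1.0 / result : result;
}
```

power_real(3.0, 4.37) covers the 3^4.37 case from the original question, while power_int(2.0, 10) needs no logarithm at all.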
Jud McCranie
That reminds me Randal. I saw a bug in your TimerISR in chapter 17. This ISR
is loading the EAX register, but only saving and restoring the AX register:
TimerISR        proc    near
---->           push    ax
---->           mov     eax, 0                  ;Ch 0, latch & read data.
                out     Cntrl_8254, al          ;Output to 8253 cmd register.
                in      al, Timer0_8254         ;Read latch #0 (LSB) & ignore.
                mov     ah, al
                jmp     SettleDelay             ;Settling delay for 8254 chip.
SettleDelay:    in      al, Timer0_8254         ;Read latch #0 (MSB)
                xchg    ah, al
                neg     ax                      ;Fix, 'cause timer counts down.
                add     cseg:SumLatency, eax
                inc     cseg:Executions
                pop     ax
                jmp     cseg:OldInt8
TimerISR        endp
--
Jay
Jason Burgon - Author of "Graphic Vision" GUI for DOS/DPMI
=== Free LFN capable Dos/WinDos replacement and ===
=== New Graphic Vision version 2.10 available from: ===
http://www.jayman.demon.co.uk
>How do you do exponentiation in assembly language? For example, how do
>you raise 3 to the power of
>4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
>Pascal unit which will integrate the
>area under a curve
What are you trying to integrate, and how do you intend to do
it? If you are integrating x^c for some constant c, you can do
that analytically.
Jud McCranie
Yes, but I'm talking about implementing exponentiation in assembly
language.
Thanks for the update.
I don't know that I'll ever update AoA
again (alas, the HTML version was
created manually and updating it is
a *lot* of work, especially given that
the Win32 version is on the verge of appearing).
OTOH, maybe at some point I should collect
all the bug reports I've received over the
years (if I can still find them) and
post an errata page.
Randy Hyde
Can't you just call exp and ln?
--
Rudolf Polzer
REBOUNCE - http://www.rebounce.de.vu
I wish I was what I was when I wished I was what I am now.
Puzzle:
1
11
21
1211
111221
312211
13112221
1113213211
??????????????
>Yes, but I'm talking about implementing exponentiation in assembly
>language.
You need to examine why you're doing exponentiation. If y is a
real number, then you can use x^y = exp(y*ln(x)). If y is an
integer, there are better ways. If y=2, use sqr(x), if y=3, use
x*sqr(x), if y=4, use sqr(sqr(x)). If your problem is
integration, you need to examine your method of integration.
What are you integrating, and how are you doing it?
Jud McCranie
Years ago I put a vector arithmetic unit into SWAG, which amongst other
things demonstrates the use of 80x87 assembler to speed up repetitive
calculations by a factor of 2-3 compared with the direct implementation
in Pascal. 'Amongst other things' means that you will find it in the
memory section.
Note that simply replacing a single instruction with assembler won't cut it,
but if you replace the whole loop (I presume you want to use Simpson's
algorithm), savings could be significant. The disadvantage is that you
are limited to the Wintel platform; even porting to Linux will be
laborious, as the assembler uses a different format.
Here is a simple example:
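const MaxVektor = 8000;
type  float       = double;  { assumption: not declared in the original post; }
                             { the qword ptr accesses below need an 8-byte type }
      VektorStruc = record
                      Spalten : word;    { number of elements ("columns") }
                      Daten   : array [1..MaxVektor] of float;    { the data }
                    end;
      VektorTyp   = ^VektorStruc;
      PtrParts    = record
                      Ofs, Seg : word;
                    end;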
procedure AddConstant (A, B : VektorTyp; C : float);
{ add constant C to every element of A and put the result in B }
var i, j, k, l, m : word;
    p : PtrParts absolute A;
    q : PtrParts absolute B;
begin
  if (A^.Spalten <= 0) or (A^.Spalten <> B^.Spalten) then exit;
  i := SizeOf(Float);
  j := p.Ofs;
  k := p.Seg;
  l := q.Ofs;
  m := q.Seg;
  asm
    push ds
    mov  ax,i                      { AX contains SizeOf(float) }
    (* les bx,A^                     { this doesn't work, } *)
    mov  es,k
    mov  bx,j                      { so we work around the problem }
    mov  cx,word ptr es:[bx]       { CX contains length of vector }
    add  bx,2                      { ES:[BX] = A.Daten[1] }
    mov  si,l
    mov  ds,m
    add  si,2                      { DS:[SI] = B.Daten[1] }
    fld  qword ptr C               { bring C to ST(0) }
  @AddLoop:
    fld  qword ptr ES:[BX]         { A.Daten[j] to ST(0) }
    fadd st,st(1)                  { ST(0) = A.Daten[j] + C }
    fstp qword ptr DS:[SI]         { save result to B.Daten[j] }
    add  bx,ax                     { ES:[BX] points to next element of A }
    add  si,ax                     { DS:[SI] points to next element of B }
    loop @AddLoop
    fstp st(0)                     { clear coprocessor stack }
    pop  ds
  end;
end;
function Betrag (Vek : VektorTyp) : float;
{ calculates the absolute value (Euclidean length) of a vector }
var i, j, k : word;
    p : PtrParts absolute Vek;
begin
  if (Vek^.Spalten = 0) then
    begin
      Betrag := 0;
      exit;
    end;
  i := SizeOf(Float);
  j := p.Ofs;
  k := p.Seg;
  asm
    mov   ax,i
    (* les bx,Vektor^ *)
    mov   es,k
    mov   bx,j
    mov   cx,word ptr es:[bx]
    add   bx,2
    fldz                           { initialise ST(0), this will contain the sum of squared elements }
  @MulLoop:
    fld   qword ptr ES:[BX]        { A[j] to ST(0), => Sum to ST(1) }
    fmul  st,st                    { ST(0) = A[j]^2 }
    faddp st(1),st                 { Sum in ST(0), discard intermediate result }
    add   bx,ax
    loop  @MulLoop
    fsqrt                          { ST(0) := sqrt(ST(0)) }
    fstp  qword ptr @Result        { Assign ST(0) to function result and clear coprocessor stack }
  end;
end;
This should demonstrate the basic principle of how to integrate
assembler into Pascal routines. A good introduction to assembler is
Shoemaker's "Assembly Language and Hardware of the IBM PC". Only thing
to keep in mind: Make sure you always clear the coprocessor stack after
you are finished, otherwise your program will sooner or later crash.
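When hand-coding loops like these, it helps to keep a plain high-level version around to check the assembler results against. A minimal C++ sketch of what the two routines compute is shown here; add_constant and euclidean_norm are illustrative names, std::vector replaces the Pascal record, and a size mismatch is treated as a programming error (an assert) where the Pascal code returns silently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counterpart of AddConstant: b[i] = a[i] + c for every element.
void add_constant(const std::vector<double>& a, std::vector<double>& b, double c) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        b[i] = a[i] + c;
    }
}

// Counterpart of Betrag: square root of the sum of squared elements.
// An empty vector yields 0, matching the early exit in the Pascal code.
double euclidean_norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        sum += v[i] * v[i];
    }
    return std::sqrt(sum);
}
```

Running both versions on the same input and comparing the results is a cheap way to catch addressing or stack-balance mistakes in the assembler version.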
>Can't you just call exp and ln?
How do you do 2^x or 10^x with the FPU?
Regards
Julian aba JMMR
(To reply simply remove .nospam in my email address.)
Then check the source of the Math unit of Free Pascal, or a different free
compiler or math library.
But IIRC it is in AT&T assembler syntax.
> "Patrick D. Rockwell" <proc...@thegrid.net> wrote:
>
> >How do you do exponentiation in assembly language? For example, how do
> >you raise 3 to the power of
> >4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> >Pascal unit which will integrate the
> >area under a curve
>
> What are you trying to integrate, and how do you intend to do
> it? If you are integrating x^c for some constant c, you can do
> that analytically.
> Jud McCranie
Different things. I'm thinking of writing a program which can compute
probabilities, so I'm interested in integrating different probability
distributions quickly and efficiently. From the replies that I've received, I
gather that it depends on which formula I'm going to integrate, although
Simpson's rule is faster and better than the trapezoid rule.
Good Luck
--
Alexei A. Frounze
"Patrick D. Rockwell" <proc...@thegrid.net> wrote in message
news:t2iebuo...@corp.supernews.com...
>> What are you trying to integrate, and how do you intend to do
>> it?
>
>Different things. I'm thinking of writing a program which can compute
>probabilities, so I'm interested in
>integrating different probability distributions quickly and efficiently.
>From the replies that I've received, I
>gather that it depends on which formula I'm going to integrate, although
>Simpson's rule is faster and better than the trapezoid rule.
There are a lot of issues here. Yes, Simpson's is better than
trapezoid, but I almost always use Gaussian.
Initially, you were asking about raising to a power. The normal
distribution, etc., usually use e^x, so just use exp(x) in those cases.
For the common probability distributions, there are well-known
approximations (polynomials and ratios of polynomials) that can
be as good as integrating, and are much faster.
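As one example of such an approximation, here is a minimal C++ sketch of formula 26.2.17 from Abramowitz and Stegun's Handbook of Mathematical Functions for the cumulative normal distribution. Choosing this particular formula is an assumption; the post does not name one. The function names are made up, and the handbook states an absolute error below 7.5e-8 for x >= 0.

```cpp
#include <cassert>
#include <cmath>

// Standard normal density.
double normal_pdf(double x) {
    return 0.3989422804014327 * std::exp(-0.5 * x * x);   // 1/sqrt(2*pi)
}

// Cumulative normal distribution via Abramowitz & Stegun 26.2.17:
// a fifth-degree polynomial in t = 1/(1 + p*x), scaled by the density.
double normal_cdf_approx(double x) {
    assert(!std::isnan(x));
    if (x < 0.0) return 1.0 - normal_cdf_approx(-x);      // use symmetry
    const double p  = 0.2316419;
    const double b1 = 0.319381530, b2 = -0.356563782, b3 = 1.781477937,
                 b4 = -1.821255978, b5 = 1.330274429;
    double t = 1.0 / (1.0 + p * x);
    double poly = t * (b1 + t * (b2 + t * (b3 + t * (b4 + t * b5))));
    return 1.0 - normal_pdf(x) * poly;
}
```

The area under the normal curve from 0 to 2.5 is then normal_cdf_approx(2.5) - 0.5, at the cost of one exp call and a handful of multiplications.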
Another thing to do is to build a table, say in increments of
0.001 of the cumulative normal distribution by integrating or
using the approximations, and use that table instead of
recomputing. If your goal is to save a significant amount of
time, then you must be asking for the integrals many, many
times, so the table is a good idea.
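A minimal C++ sketch of the table idea, under some assumptions that are not in the post: the table stores the integral of the normal density from 0 to each grid point, each cell is integrated with a single Simpson panel, and lookups interpolate linearly between neighbouring entries. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal density.
double normal_pdf(double x) {
    return 0.3989422804014327 * std::exp(-0.5 * x * x);   // 1/sqrt(2*pi)
}

// table[i] approximates the integral of the density from 0 to i*step.
std::vector<double> build_normal_table(double x_max, double step) {
    assert(x_max > 0.0 && step > 0.0);
    std::size_t n = static_cast<std::size_t>(std::ceil(x_max / step));
    std::vector<double> table(n + 1, 0.0);
    for (std::size_t i = 1; i <= n; ++i) {
        double a = (i - 1) * step;
        double b = i * step;
        double m = 0.5 * (a + b);
        // One Simpson panel per cell, accumulated as a running sum.
        table[i] = table[i - 1]
                 + (b - a) / 6.0 * (normal_pdf(a) + 4.0 * normal_pdf(m) + normal_pdf(b));
    }
    return table;
}

// Linear interpolation between the two table entries that bracket x.
double lookup(const std::vector<double>& table, double step, double x) {
    assert(x >= 0.0 && x <= (table.size() - 1) * step);
    double pos = x / step;
    std::size_t i = static_cast<std::size_t>(pos);
    if (i + 1 >= table.size()) return table.back();
    double frac = pos - static_cast<double>(i);
    return table[i] + frac * (table[i + 1] - table[i]);
}
```

Building the table once with build_normal_table(4.0, 0.001) costs a few thousand exp calls; after that, each lookup is an index computation and one interpolation.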
So choose a good integration method, using approximations, or
using a table - any of these should be more effective than
coding in assembler.
I'll try to get you a comparison of the speed of integration
methods.
Jud McCranie
>Different things. I'm thinking of writing a program which can compute
>probabilities, so I'm interested in
>integrating different probability distributions quickly and efficiently.
>From the replies that I've received, I
>gather that it depends on which formula I'm going to integrate, although
>Simpson's rule is faster and better than the trapezoid rule.
I tested integrating the normal curve from 0 to 2.5 with
Gaussian integration and the trapezoid method. Accuracy was 6
digits or more. My program is written in Pascal and run on a
300 MHz P-II. The Gaussian took 0.000037 seconds, the trapezoid
took 0.0013 seconds. So the Gaussian was 35 times faster than
the trapezoid. But even if you do 10,000 calls to the trapezoid
method, it takes a total of about 13 seconds. Furthermore, if
you use it to build a table and integrate only between the
points in the table and sum them, it will be much faster than
that - maybe under a second.
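For readers who want to repeat this kind of comparison, here is a minimal C++ sketch of the two methods. The post does not say which Gaussian rule or how many points were used, so the composite 5-point Gauss-Legendre rule below is an assumption, and the function names are made up. It counts accuracy per function evaluation and makes no attempt to reproduce the quoted timings.

```cpp
#include <cassert>
#include <cmath>

// Standard normal density.
double normal_pdf(double x) {
    return 0.3989422804014327 * std::exp(-0.5 * x * x);   // 1/sqrt(2*pi)
}

// Composite trapezoid rule with n equal panels (n + 1 evaluations).
double trapezoid(double (*f)(double), double a, double b, int n) {
    assert(n >= 1);
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i) sum += f(a + i * h);
    return sum * h;
}

// Composite 5-point Gauss-Legendre rule (5 evaluations per panel).
double gauss5(double (*f)(double), double a, double b, int panels) {
    assert(panels >= 1);
    static const double node[5] = {
        0.0, -0.5384693101056831, 0.5384693101056831,
        -0.9061798459386640, 0.9061798459386640};
    static const double weight[5] = {
        0.5688888888888889, 0.4786286704993665, 0.4786286704993665,
        0.2369268850561891, 0.2369268850561891};
    double h = (b - a) / panels;
    double total = 0.0;
    for (int p = 0; p < panels; ++p) {
        double mid = a + (p + 0.5) * h;     // centre of this panel
        double half = 0.5 * h;              // maps [-1, 1] onto the panel
        double s = 0.0;
        for (int k = 0; k < 5; ++k) s += weight[k] * f(mid + half * node[k]);
        total += half * s;
    }
    return total;
}
```

On the 0 to 2.5 interval, gauss5(normal_pdf, 0.0, 2.5, 4) uses 20 evaluations and trapezoid(normal_pdf, 0.0, 2.5, 100) uses 101, yet the Gaussian result is the more accurate of the two.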
Jud McCranie
>From the replies that I've received, I
>gather that it depends on which formula I'm going to integrate, although
>Simpson's rule is faster and better than the trapezoid rule.
Here are my results for integrating the normal curve 0 to 2.5,
written in Pascal, 300 MHz P-II, at least 6 digits of accuracy:
method          time in microseconds
Trapezoid       1200
Simpson           74
Gaussian          34
approximation      0.6   (approximation from Handbook of math functions)

I was a little surprised that Simpson's was so much closer to
Gaussian than to the trapezoid method. Simpson's is only slightly
more complicated than the trapezoid method, and you can use it
on a table of equally-spaced points.
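A minimal C++ sketch of Simpson's rule applied to a table of equally spaced samples, as the last sentence suggests. The names sample and simpson_samples are hypothetical; the rule needs an even number of intervals, that is, an odd number of samples.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal density.
double normal_pdf(double x) {
    return 0.3989422804014327 * std::exp(-0.5 * x * x);   // 1/sqrt(2*pi)
}

// Tabulate f at n + 1 equally spaced points on [a, b].
std::vector<double> sample(double (*f)(double), double a, double b, std::size_t n) {
    assert(n >= 1);
    double h = (b - a) / static_cast<double>(n);
    std::vector<double> y(n + 1);
    for (std::size_t i = 0; i <= n; ++i) y[i] = f(a + static_cast<double>(i) * h);
    return y;
}

// Composite Simpson's rule on tabulated values with spacing h.
// Weights follow the pattern 1, 4, 2, 4, ..., 2, 4, 1.
double simpson_samples(const std::vector<double>& y, double h) {
    assert(y.size() >= 3 && y.size() % 2 == 1);
    double sum = y.front() + y.back();
    for (std::size_t i = 1; i + 1 < y.size(); ++i) {
        sum += (i % 2 == 1 ? 4.0 : 2.0) * y[i];
    }
    return sum * h / 3.0;
}
```

For the normal curve from 0 to 2.5, simpson_samples(sample(normal_pdf, 0.0, 2.5, 100), 0.025) uses the same 101 points as a 100-panel trapezoid rule but is considerably more accurate, which fits the gap in the timings above.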
Jud McCranie
> "Patrick D. Rockwell" wrote:
> >
> > How do you do exponentiation in assembly language?
>
> Depends what processor you're on. And why do you ask a question
> about speeding up PASCAL performance in ASSEMBLY in a C++ group?
I figured that the C++ people might know something about Assembly.
> In comp.lang.asm.x86 Patrick D. Rockwell <proc...@thegrid.net> wrote:
> > How do you do exponentiation in assembly language? For example, how do
> > you raise 3 to the power of
> > 4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> > Pascal unit which will integrate the
> > area under a curve with assembly language which is much faster than
> > pascal is. Thanks in advance.
>
> > --
> > Patrick D. Rockwell
> > mailto:proc...@thegrid.net
> > mailto:HNH...@prodigy.net
> > mailto:patri...@aol.com
>
> I doubt that coding in assembly is going to dramatically increase the
> speed of your routine. Your Pascal compiler should be using the math
> coprocessor, which is what you would be using in assembly too. (At least,
> I hope that Borland Pascal is no longer using its own floating-point
> format for "real" variables. If it is, you can switch to "single" or "double",
> which use the standard IEEE format that the coprocessor supports.) However, if
> you insist, my on-line assembly tutorial covers basic coprocessor programming.
> Go to my URL below to find it.
Thanks!
> BTW, why did you post this to comp.lang.c++??
It was probably unnecessary, but I figured that the people who frequent
comp.lang.c++ might know some assembly too. :-)
> "Patrick D. Rockwell" <proc...@thegrid.net> wrote:
>
> >How do you do exponentiation in assembly language? For example, how do
> >you raise 3 to the power of
> >4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> >Pascal unit which will integrate the
> >area under a curve with assembly language which is much faster than
> >pascal is. Thanks in advance.
>
> Use the Power function in the math unit, which is available in
> the professional and C/S version. It might not be in the basic
> version, I'm not sure.
I don't know how to do that. I haven't seen it in my assembly book. BTW, I
use a 586.
> "Patrick D. Rockwell" <proc...@thegrid.net> wrote in message
> news:FPvQ5.1223$ue.1...@newsread1.prod.itd.earthlink.net...
> > How do you do exponentiation in assembly language? For example, how do
> > you raise 3 to the power of
> > 4.37 or 3^4.37? How do you do logarithms or roots? I want to write a
> > Pascal unit which will integrate the
> > area under a curve with assembly language which is much faster than
> > pascal is. Thanks in advance.
>
> The typical way on 486 or better processors is to use the built-in FPU. Some
> of the elementary mathematical functions are performed directly by the FPU,
> but in other cases - you are provided with the necessary building blocks.
> For example, sin() and cos() are performed directly using the FSIN and FCOS
> instructions, but exp() has to be performed by a combination of FLDL2E,
> F2XM1 and a few "glue" floating-point operations.
You mean that FSIN & FCOS are assembly commands? I didn't see them in my Turbo
assembler book.
> Note that some operations are only defined by the FPU for a limited range,
> even when the mathematical range is larger. If you anticipate values this
> large, you must reduce the argument using mathematical identities before
> applying the operation. Examples are FSIN and FCOS, which are only defined
> for the range +/-2^63 (calculation of trig functions on larger values is
> essentially meaningless; all information about the position of the angle in
> the circle has been lost).
>
> I suggest that you visit Intel's site (http://develop.intel.com) and
> download the 486 or Pentium series programmer's manual.
The link above doesn't seem to exist.
"Patrick D. Rockwell" <proc...@thegrid.net> wrote in message
news:HWQ46.255$3t2....@newsread1.prod.itd.earthlink.net...
Frank
<OT>
Well that is probably because Assembly is not portable
and is different for every platform. There are definitely
FSIN & FCOS present on my Pentium II...
But wasn't that the point you were trying to make ;-) ?
It is a 1991 compiler. I don't assume Turbo C is much better.
> Ron Natalie wrote:
>
> > "Patrick D. Rockwell" wrote:
> > >
> > > How do you do exponentiation in assembly language?
> >
> > Depends what processor you're on. And why do you ask a question
> > about speeding up PASCAL performance in ASSEMBLY in a C++ group?
>
> I figured that the C++ people might know something about Assembly.
Yeah, right, C++ --- nothing but a glorified macro assembler.
But why did you think the msdos-crowd would know about it?
Sven
--
_ __ The Cognitive Systems Group
| |/ /___ __ _ ___ University of Hamburg
| ' </ _ \/ _` (_-< phone: +49 (0)40 42883-2576 Vogt-Koelln-Strasse 30
|_|\_\___/\__, /__/ fax : +49 (0)40 42883-2572 D-22527 Hamburg
|___/ http://kogs-www.informatik.uni-hamburg.de/~utcke/home.html
The sin and cos functions in an HLL sometimes aren't the same as the FSIN and
FCOS instructions in assembly language. The ones in the compiler's library often
have some extra code to check for errors, convert arguments to a suitable range, etc.
The "simple" math operations can expand to a lot of code. Borland has a
fastmath.h header for recent versions of their C/C++ compilers that replaces
some standard math functions with shorter versions, leaving most error
checking to the programmer. MSVC may have a similar thing, but I haven't
used that one.
Something really bad (for performance) can also happen in a C/C++ compiler.
The language standards specify that if a floating-point number must be
converted to an int, the rounding happens by rounding towards zero. So, the
generated code must first save the FPU control word (usually has rounding
control set to 0 by default), set rounding control to 3, store the number,
and then reset the control word to the previous value. So for floating-point
calculations in C, it's probably best to keep everything as a float all the
time, and only convert to int when you really have to.
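The difference between the two kinds of conversion is easy to see in a minimal C++ sketch. The function names are made up. std::lrint is a standard library call that rounds according to the current rounding mode, so an implementation does not have to switch the control word for it, whereas a cast must always truncate.

```cpp
#include <cassert>
#include <cmath>

// Conversion by cast: the language requires the fraction to be discarded
// (round toward zero), whatever the FPU's current rounding mode is.
long truncate_to_long(double x) {
    assert(std::isfinite(x));
    return static_cast<long>(x);
}

// Conversion with std::lrint: uses the current rounding mode, which is
// round-to-nearest (ties to even) unless the program has changed it.
long round_to_long(double x) {
    assert(std::isfinite(x));
    return std::lrint(x);
}
```

With the default rounding mode, truncate_to_long(2.7) gives 2 and round_to_long(2.7) gives 3; for the tie 2.5, round_to_long returns 2 because ties go to the even neighbour.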
All of this becomes much easier to discover if you have the source code for
the RTL, especially if it's commented well.
By the way, the following code for such a conversion (see bottom of message)
comes from an Intel document. Now, obviously, there are some inaccuracies
there, which cause wrong results in some cases.
OK, so when you ignore the bugs, one question remains: why is that code so
long? I mean, you either have a float that can be converted to a 64-bit int,
or you have one that won't fit in that int for some reason (too big/small,
NaN, whatever). (Disable FPU exceptions before testing that.)
What are they trying to accomplish with all of those manipulations? If there
is no way to correctly convert that float to an int, I'd say all you can do
is give up. But the writer of this code obviously thought that it was a good
idea to store some value in there, even if no correct one existed. Why?
Perhaps this code is just meant to confuse people, or to prove that
cut-and-paste of code is a bad idea. But who knows, there might be some
interesting stuff in this madness.
Does Intel often have inaccuracies like these in their manuals? And where
are the errata? I'm also interested to know where documentation for other PC
processor makers can be downloaded. Does anybody know if Intel still has
hardcopy versions of the instruction list? I have a few books with
instruction documentation up to Pentium, but not for FPU, MMX, and SIMD. The
pdf files from Intel have the information, but they're not very convenient
to search through.
--
Wim Libaers
Remove DONTSPAM from my reply address to send me mail.
----
From: Intel® Architecture Optimization Reference Manual
When implementing an application, consider if the rounding mode is
important to the results. If not, use the algorithm in Example 2-14 to avoid the
synchronization and overhead of the fldcw instruction and changing the
rounding mode.
Example 2-14 Algorithm to Avoid Changing the Rounding Mode
_fto132proc
        lea     ecx,[esp-8]
        sub     esp,16              ; allocate frame
        and     ecx,-8              ; align pointer on boundary of 8
        fld     st(0)               ; duplicate FPU stack top
        fistp   qword ptr[ecx]
        fild    qword ptr[ecx]
        mov     edx,[ecx+4]         ; high dword of integer
        mov     eax,[ecx]           ; low dword of integer
        test    eax,eax
        je      integer_QnaN_or_zero
arg_is_not_integer_QnaN:
        fsubp   st(1),st            ; TOS=d-round(d),
                                    ; { st(1)=st(1)-st & pop ST }
        test    edx,edx             ; what's sign of integer
        jns     positive            ; number is negative
                                    ; dead cycle
                                    ; dead cycle
        fstp    dword ptr[ecx]      ; result of subtraction
        mov     ecx,[ecx]           ; dword of diff(single-
                                    ; precision)
        add     esp,16
        xor     ecx,80000000h
        add     ecx,7fffffffh       ; if diff<0 then decrement
                                    ; integer
        adc     eax,0               ; inc eax (add CARRY flag)
        ret
positive:
        fstp    dword ptr[ecx]      ; 17-18 result of subtraction
        mov     ecx,[ecx]           ; dword of diff(single-
                                    ; precision)
        add     esp,16
        add     ecx,7fffffffh       ; if diff<0 then decrement
                                    ; integer
        sbb     eax,0               ; dec eax (subtract CARRY flag)
        ret
integer_QnaN_or_zero:
        test    edx,7fffffffh
        jnz     arg_is_not_integer_QnaN
        add     esp,16
        ret
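One reading of the listing is: let FISTP round in whatever mode is current, then use the sign of d - round(d) to step the integer back toward zero when rounding moved it away. A minimal C++ sketch of that idea follows. It is not a translation of the Intel code, it ignores the NaN and overflow cases the listing tries to handle, trunc_via_nearest is a hypothetical name, and inputs are restricted to |d| < 2^53 so that the difference is computed exactly.

```cpp
#include <cassert>
#include <cmath>

// Truncate toward zero without touching the rounding mode: round with the
// current mode first, then undo any step that went away from zero.
long long trunc_via_nearest(double d) {
    assert(std::fabs(d) < 9007199254740992.0);   // 2^53: keeps d - n exact
    long long n = std::llrint(d);                // rounds in the current mode
    double diff = d - static_cast<double>(n);    // sign shows the direction
    if (d >= 0.0 && diff < 0.0) {
        --n;                                     // rounded up past d
    } else if (d < 0.0 && diff > 0.0) {
        ++n;                                     // rounded down past d
    }
    return n;
}
```

For example, 3.5 first rounds to 4, the difference is negative, and the result is corrected to 3, which is what a C cast would produce.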