Just be careful about starting on odd byte addresses. (It slows my computer down real fast.)
--
Charles Wood
REMOVEME...@worldnet.att.net
MikeC wrote in message <38e3b933...@news.cig.mot.com>...
With regards,
Michael Kochetkov.
MikeC <user...@remove.hotmail.com> wrote in message
news:38e3b933...@news.cig.mot.com...
start:
do whatever here
loop start
Was slower than:
start:
cmp cx,0
jg start
Whereas my compiler generates:
repne movsd
for Pentium-class code.
Odd.
--
Charles Wood
REMOVEME...@worldnet.att.net
Nightcap wrote in message
<6bSE4.1677$is2.1...@bgtnsc05-news.ops.worldnet.att.net>...
>MikeC wrote:
>> Is there any version of memset() which would work on words instead of
>> bytes? For Visual C++ on Intel processors.
>I doubt that, as such a function couldn't be generic. You however can make
>one yourself, and not only a word-wide but a two-word one. Well, depending
>on your processor, of course, but with i86s you can plop a double-word at a
>time, so depending on what exactly you need and what your data are you
>might speed up this operation significantly. Also, keep in mind, if you're
>on a Pentium type of processor (what else is there these days?) "rep" is
>not faster and _is_ slower than a simple loop.
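A minimal sketch of the "make one yourself" idea, writing two 16-bit words per iteration. The name fill_words is hypothetical (it is not a standard or Visual C++ function), and whether this beats a plain loop depends on the compiler and processor, so it should be measured rather than assumed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a library function): fill `count` 16-bit words
// with `value`, storing two words (32 bits) per iteration where possible.
inline void fill_words(std::uint16_t* dst, std::uint16_t value, std::size_t count)
{
    assert(dst != nullptr || count == 0);

    // Both halves of `pair` hold the same word, so byte order does not matter.
    const std::uint32_t pair = (static_cast<std::uint32_t>(value) << 16) | value;

    std::size_t i = 0;
    for (; i + 2 <= count; i += 2) {
        // memcpy of a fixed 4-byte size is usually compiled to a single store.
        std::memcpy(dst + i, &pair, sizeof pair);
    }
    if (i < count) {
        dst[i] = value; // odd word left over at the end
    }
}
```

Calling fill_words(buffer, 0x1234, n) is meant to leave every one of the n words equal to 0x1234, the same result as a simple word-by-word loop.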
I know what compiler you use.
I know this is now going off-topic (so last response ;-) for both newsgroups
but I could not resist.
The advice is too generic. It does not take Level 1 or Level 2 caches into
account, nor the differences between 586-class and 686-class Pentiums.
Sometimes repne movsd is faster than an unrolled loop and sometimes it is
not, depending on the situation. Then again, if the source or destination is
video memory, either will do, since the video memory will be the limiting
factor.
What _does_ speed things up is to use 8-byte moves. So using MOVQ if MMX is
available, or FILD/FISTP if a coprocessor is available, is faster.
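As a rough C++ illustration of moving 8 bytes at a time (not the MMX or FPU code itself), something like the following could be used. The name copy_by_qwords is made up for this sketch, the buffers are assumed not to overlap, and nothing here guarantees which instructions the compiler actually emits:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy `n` bytes in 8-byte chunks, then finish the
// remainder one byte at a time. Source and destination must not overlap.
inline void copy_by_qwords(void* dst, const void* src, std::size_t n)
{
    assert((dst != nullptr && src != nullptr) || n == 0);

    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);

    while (n >= 8) {
        std::uint64_t chunk;
        std::memcpy(&chunk, s, 8); // 8-byte load
        std::memcpy(d, &chunk, 8); // 8-byte store
        s += 8;
        d += 8;
        n -= 8;
    }
    for (; n > 0; --n) {
        *d++ = *s++; // tail bytes
    }
}
```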
Stephen Howe
memset should be as efficient as you can make it (assuming a good
quality compiler). If you start on an unsuitable boundary it should
clear memory until it gets to a suitable boundary then use a faster
(e.g. word/quadlet based) method.
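The boundary handling described above can be sketched as follows. This is only an illustration of the idea, not how any particular library implements memset; the name fill_bytes_aligned and the choice of a 4-byte boundary are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: byte stores until the pointer reaches a 4-byte
// boundary, then 4-byte stores, then byte stores for whatever is left.
inline void fill_bytes_aligned(unsigned char* dst, unsigned char value, std::size_t n)
{
    assert(dst != nullptr || n == 0);

    // Head: advance to a 4-byte boundary.
    while (n > 0 && reinterpret_cast<std::uintptr_t>(dst) % 4 != 0) {
        *dst++ = value;
        --n;
    }

    // Body: replicate the byte into all four bytes of a 32-bit word.
    const std::uint32_t wide = static_cast<std::uint32_t>(value) * 0x01010101u;
    for (; n >= 4; n -= 4, dst += 4) {
        std::memcpy(dst, &wide, 4);
    }

    // Tail: remaining 0 to 3 bytes.
    for (; n > 0; --n) {
        *dst++ = value;
    }
}
```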
Paul
>
>Is there any version of memset() which would work on words instead of
>bytes? For Visual C++ on Intel processors.
>
A standard way would be:
#include <algorithm>
int main()
{
    const int size = 512;
    // whatever your WORD type is:
    unsigned short* ptr = new unsigned short[size];
    // the memset-equivalent line:
    std::fill(ptr, ptr + size, 0);
    delete[] ptr; // release the buffer
    return 0;
}
I have no idea how fast this would be - I suppose you could specialize
std::fill for various types like unsigned short, uint, etc, writing it
with the relevant memset. This might actually degrade performance
however.
Something like:
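typedef unsigned short ushort;
// Caution: memset stores only the low byte of _X into every byte, so this
// matches std::fill only when both bytes of _X are equal (e.g. 0 or 0xFFFF).
template<> inline
void fill<ushort*, ushort>(ushort* _F, ushort* _L, const ushort& _X)
{
    memset(static_cast<void*>(_F), _X, sizeof(ushort) * (_L - _F));
}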
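Since memset writes one byte value into every byte, a memset-based fill only agrees with std::fill when all bytes of the fill value are the same. A hedged sketch of a wrapper that checks this first (fill_ushort is a hypothetical free function, deliberately not a specialization of std::fill):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: fill [first, last) with `value`, using memset only
// when that produces the same result as std::fill.
inline void fill_ushort(unsigned short* first, unsigned short* last,
                        unsigned short value)
{
    assert(first <= last);
    if (first == last) {
        return; // nothing to do
    }

    // Inspect the bytes that make up the value.
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);

    bool uniform = true;
    for (std::size_t i = 1; i < sizeof value; ++i) {
        if (bytes[i] != bytes[0]) {
            uniform = false;
        }
    }

    if (uniform) {
        // e.g. 0x0000 or 0x0101: a byte-wise fill gives the right words.
        std::memset(first, bytes[0],
                    sizeof value * static_cast<std::size_t>(last - first));
    } else {
        // e.g. 0x0001: must be written element by element.
        std::fill(first, last, value);
    }
}
```

Whether the extra check pays for itself is not clear; as noted above, it might actually degrade performance.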
Tom
What everyone (except perhaps the original poster) seems to be forgetting
is that memset can be used for things *besides* zeroing out memory. You can
actually use a real value in it, e.g.:
char a[100];
int b[100];
memset(a,0,sizeof(a));
memset(b,0,sizeof(b));
have roughly the same effect on both a & b. However,
memset(a,1,sizeof(a)); // fills every element of a with 0x01
memset(b,1,sizeof(b)); // fills every element of b with 0x01010101
Now, I assume the original question was not how to use x86 assembler to
speed this up, but what command to use to fill every element of b with
0x00000001, and the answer to that is:
#include <algorithm>
std::fill(b, b + sizeof(b)/sizeof(b[0]), 1); // or
std::fill_n(b, sizeof(b)/sizeof(b[0]), 1);
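To make the difference concrete, here is a small sketch that puts the two kinds of fill side by side. The function names are invented for the example, and std::uint32_t is used instead of int so that the element size is known to be 4 bytes:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte-wise fill: every byte of every element becomes 0x01,
// so each 32-bit element ends up as 0x01010101.
inline void fill_bytes_with_one(std::uint32_t* p, std::size_t count)
{
    assert(p != nullptr);
    std::memset(p, 1, count * sizeof *p);
}

// Element-wise fill: every element becomes the value 1 (0x00000001).
// Note that `count` is a number of elements, not a number of bytes.
inline void fill_elements_with_one(std::uint32_t* p, std::size_t count)
{
    assert(p != nullptr);
    std::fill_n(p, count, std::uint32_t(1));
}
```

For an array declared as std::uint32_t b[100], the count to pass is sizeof(b)/sizeof(b[0]), which is 100, whereas sizeof(b) alone is the size in bytes.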
--
Truth,
James Curran
http://www.NJTheater.com
http://www.NJTheater.com/JamesCurran
"Tom" <rhino...@hotmail.com> wrote in message
news:38e4b649...@news.demon.co.uk...
I have calculated that many copies inside the loop are faster:
; Assume R1 is source, R0 destination, R2 counter (a multiple of 4), R3 data transfer:
loop:
ldrh r3, [r1], #2
strh r3, [r0], #2
ldrh r3, [r1], #2
strh r3, [r0], #2
ldrh r3, [r1], #2
strh r3, [r0], #2
ldrh r3, [r1], #2
strh r3, [r0], #2
subs r2, r2, #4
bgt loop
{Arm7 thumb platform}
#pragma intrinsic(memset)
Depending on the count argument, the compiler chooses the better implementation.
It also works for other functions, such as memcpy, str*, ...
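A small sketch of how the pragma might be used in a source file. The #ifdef guard is an addition so the same file still builds with other compilers, and clear_block is a made-up function for the example; whether the call is actually expanded inline is up to the compiler:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifdef _MSC_VER
// Visual C++ only: ask the compiler to treat these calls as intrinsics.
#pragma intrinsic(memset, memcpy)
#endif

// Hypothetical example function: the size is a compile-time constant, which
// gives the compiler the best chance to pick an inline expansion.
inline void clear_block(unsigned char (&block)[64])
{
    assert(sizeof block == 64);
    std::memset(block, 0, sizeof block);
}
```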
Cheers,
Guillaume.
MikeC wrote in message <38e3b933...@news.cig.mot.com>...
>