C2P

Juergen Rally Fischer

Jan 5, 1996, 3:00:00 AM
Sven Steiniger (ss...@irz.inf.tu-dresden.de) wrote:

: Hi Everyone !

: I have several questions about c2p-converters
: - what are the fastest ones (ideas, theory, sources) ?
Depends on the CPU. On an 040 the CPU does it all; other CPUs mix CPU & blitter.

What does "fast" mean? :) No, not a silly question. There are 2 possibilities for
non-040 c2p:

a) the c2p procedure has to finish as soon as possible
b) the c2p may take a while as it is done by the blitter, while the cpu already
does other things (this is for games).

for b) there's blitterscreen:
BLITTERSCREEN sources: http://www.informatik.tu-muenchen.de/~fischerj/
;)
you won't need to switch back to 6 planes using that ;)

: - what the hell is a scrambled (maybe spelled wrong) buffer ?

the buffer still consists of chunky bytes, but rearranged in a way
the c2p routine can do conversion faster. only good for special fx,
not for games.

: (I have heard this sometimes in this group)
: and what are his adventages/disadvantages ?
: - are there any very fast c2p which use a comparebuffer ?
yes, AFAIK Peter McGavin has the most experience and data on this.
AFAIK the routines get very fast when the source to convert doesn't
change much.

: - ... anything to this topic
: I have never written an c2p by myself but I converted and modified
: converters I found in aminet. All of them do not use fixed plsize and the
: most are for 8 bitplanes.
: Are there really fast 6 bitplane c2p-converter ?
: My actual approachs are (50Mhz 030, OS-conform (except screens of
: gfxcards), 320x256)

aah 320x256. forget blitterscreen ;) but if you do last conversion pass with
blitter, you'd still get more speed!

: 8bpls : 45ms
: 6bpls : 43ms

The reason 6 planes is not much faster is that you still parse through an
8bit chunky source. But when it comes to blitter assistance, the
difference might be bigger (don't know, just a feeling).

: 4bpls : 30ms
: These c2p-converter(s) use LONG-writes to CHIPmem.

I wonder why an 040/25 is said to do c2p at 4MB/sec but an 030-50 is a lot slower?

: Thanks.

: Sven Steiniger
------------------------------------------------------------------------
fisc...@Informatik.TU-Muenchen.DE (Juergen "Rally" Fischer) =:)

Patrick Hanevold

Jan 5, 1996, 3:00:00 AM
>for b) there's blitterscreen:
>BLITTERSCREEN sources: http://www.informatik.tu-muenchen.de/~fischerj/
>;)
SLOOOOOOOOOOOW!!! :)
In a couple of days, you'll get ours. :)

<sb>Patrick Hanevold - Virtual Reality developer
<sb>patrick....@login.eunet.no
<sb>Amiga and official Be developer


Sven Steiniger

Jan 5, 1996, 3:00:00 AM

Hi Everyone !

I have several questions about c2p-converters
- what are the fastest ones (ideas, theory, sources) ?

- what the hell is a scrambled (maybe spelled wrong) buffer ?

(I have heard this sometimes in this group)
and what are his adventages/disadvantages ?
- are there any very fast c2p which use a comparebuffer ?

- ... anything to this topic
I have never written an c2p by myself but I converted and modified
converters I found in aminet. All of them do not use fixed plsize and the
most are for 8 bitplanes.
Are there really fast 6 bitplane c2p-converter ?
My actual approachs are (50Mhz 030, OS-conform (except screens of
gfxcards), 320x256)

8bpls : 45ms
6bpls : 43ms


4bpls : 30ms
These c2p-converter(s) use LONG-writes to CHIPmem.

Thanks.

Sven Steiniger

Patrick Hanevold

Jan 5, 1996, 3:00:00 AM

We are going to release one in a couple of days.

2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)
Thats 160x128 bytes.
Can't get faster!

Andrew Bennett

Jan 5, 1996, 3:00:00 AM
In a message of 05 Jan 96 Sven Steiniger wrote to All:

SS> I have several questions about c2p-converters
SS> - what are the fastest ones (ideas, theory, sources) ?

Depends a lot on what machine you're using, and also on the types of images
used... (eg if there's not much difference between frames a smart routine
usually wins)

SS> - what the hell is a scrambled (maybe spelled wrong) buffer ?

The bytes for each pixel aren't sequential but are in an order that means less
work for the c2p routine.. eg instead of 1234567... you might use
1,5,9,13,2,6,10,14,3,.....

This is only really useful for a vertical raycasting engine though... ie one
table lookup per column.
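
A minimal sketch of the ordering described above, assuming the pattern 1,5,9,13,2,6,10,14,3,... repeats in groups of 16 pixels with a stride of 4. The post only gives the start of the sequence, so the group size, and the names `scrambled_order` and `scramble`, are illustrative assumptions rather than any particular routine's layout.

```python
def scrambled_order(group=16, stride=4):
    """For each slot of a scrambled group, return the linear pixel index stored there (0-based)."""
    rows = group // stride
    return [(slot % rows) * stride + slot // rows for slot in range(group)]


def scramble(pixels, group=16, stride=4):
    """Rearrange a linear chunky buffer into scrambled order.

    Assumes len(pixels) is a multiple of `group`.
    """
    order = scrambled_order(group, stride)
    out = []
    for base in range(0, len(pixels), group):
        out.extend(pixels[base + src] for src in order)
    return out


# Pixels numbered 1..16 as in the post (1-based labels).
print(scramble(list(range(1, 17))))
```

A renderer that draws vertical columns can write straight into the scrambled slots, which is why the trick suits a column-based raycaster and little else.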

SS> - are there any very fast c2p which use a comparebuffer ?

Yep. I've tried a few methods involving this technique. The last idea I had
was to detect which planes needed to be re-written, as opposed to simply
testing for no change in a group of pixels.

It works fairly well. The benefit is that as long as large regions only use a
small range of colours,then most of the time only a few planes actually need
re-writing. The problem is doing this calculation fast enough to make it
worthwhile.

SS> My actual approachs are (50Mhz 030, OS-conform (except screens of
SS> gfxcards), 320x256)
SS> 8bpls : 45ms

SS> 4bpls : 30ms

The 4 plane one should be much faster.. IMO

Ben... :)


Stephan Schaem

Jan 6, 1996, 3:00:00 AM
Stephan Schaem (ssc...@teleport.com) wrote:
: I get 21ms for 4bpl on a 25mhz 030 ECS (The nibble pass is not
: done, so 4 pass VS 5),

I want to add that I can get 20ms for 4bpl 5 pass (non-scrambled) if
I use a 512K table. A little bit silly, but I get roughly 4.1 Mpixel per
second: 320x210 pixels converted per frame (NTSC).
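
The throughput figure can be checked with a few lines of arithmetic. This sketch assumes the 20 ms refers to a 320x256 screen (the size under discussion in the thread) and that an NTSC frame is 1/60 s; both are assumptions, and the function names are made up for the illustration.

```python
def mpixels_per_second(width, height, milliseconds):
    """Conversion throughput in millions of pixels per second."""
    return width * height / (milliseconds / 1000.0) / 1e6


def pixels_per_frame(mpix_per_second, frames_per_second):
    """How many pixels can be converted within one display frame."""
    return mpix_per_second * 1e6 / frames_per_second


rate = mpixels_per_second(320, 256, 20)   # about 4.1 Mpixel/s
lines = pixels_per_frame(rate, 60) / 320  # 320-pixel lines per NTSC frame
print(rate, lines)
```

Under these assumptions the result is a little over 210 lines of 320 pixels per NTSC frame, which is in line with the figure quoted above.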

Stephan

Andrew Bennett

Jan 6, 1996, 3:00:00 AM
In a message of 05 Jan 96 Patrick Hanevold wrote to All:

PH> We are going to release one in a couple of days.

PH> 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)
PH> Thats 160x128 bytes.
PH> Can't get faster!

In principle you can actually...

Quite a lot of the planar data may not need re-writing in the first place.

Ben... :)


Jyrki Saarinen

Jan 6, 1996, 3:00:00 AM

> I wonder why 040/25 is said to do it 4MB/sec c2p but 030-50 is lot
> slower ?

Maybe not the correct routine? I think that a 030/50
really is able to do c2p free. Maybe 4th pass done with
the blitter.

-- _
a Stellar programmer _ //
"Amiga - back for the future" \X/

Jyrki Saarinen

Jan 6, 1996, 3:00:00 AM

> I personaly use the 5 pass methode.
> swap word
> swap byte
> swap nyble
> swap bitpair
> swap bit

Oh, word swapping is the 5th pass. What is your opinion,
I think 030/50 should be able to do c2p free. Perhaps
three passes with CPU and the last one with the blitter.
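
For readers who have not seen the swap passes written out, here is a small sketch of the idea on just 8 pixels. The word and byte passes only move whole bytes between longwords, so with a single group of 8 pixels only the three bit-level passes (nibble, bitpair, bit) are left; together they transpose an 8x8 bit matrix. The name `c2p_8pixels` and the 8-pixel granularity are assumptions for illustration, not the layout of any routine mentioned in the thread.

```python
def c2p_8pixels(pixels):
    """Convert 8 chunky bytes into 8 plane bytes (plane 0 first).

    The leftmost pixel ends up in bit 7 of each plane byte.
    """
    assert len(pixels) == 8
    # Reverse so that pixel 0 lands in the most significant bit.
    rows = list(reversed(pixels))
    # Each pass swaps blocks between two rows: nibbles, then bit pairs, then single bits.
    for shift, mask in ((4, 0x0F), (2, 0x33), (1, 0x55)):
        for i in range(8):
            if i & shift:
                continue  # each pair (i, i + shift) is handled once
            a, b = rows[i], rows[i + shift]
            t = ((a >> shift) ^ b) & mask
            rows[i] = a ^ (t << shift)
            rows[i + shift] = b ^ t
    return rows


# Only the first pixel has bit 0 set, so plane 0 has its leftmost bit set.
print([hex(v) for v in c2p_8pixels([1, 0, 0, 0, 0, 0, 0, 0])])
```

A real routine does the same exchanges on 32-bit registers so that every instruction handles several pixel groups at once.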

> Aminet is not a good image of where c2p is today.

Well, there is the only 040 routine, by Peter McGavin.
(Or was it James McCoull, I wonder where he has vanished)

> I'm personally waiting for some new C2p code to be posted soon
> that is suposed to be much faster then what I wrote so far.

Who is doing this..? ;)

Andrew Bennett

Jan 6, 1996, 3:00:00 AM
In a message of 06 Jan 96 Patrick Hanevold wrote to All:

PH>>> 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)

PH>>> Can't get faster!

>> In principle you can actually...
>> Quite a lot of the planar data may not need re-writing in the first
>> place.

PH> re-writing? What the heck are you talking about?

Assuming that you still have a screenful of planar data from the last
conversion,it's very likely that a lot of the new planar data you write will be
identical to the old copy and thus doesn't need to be overwritten with the same
data!! (Depending on the type of graphics used and the amount of change per
frame...)

The end result is a c2p routine which takes a variable amount of time to do
the conversion,but may well be faster.

eg: Assuming that the first four pixels are colours 12,13,220,114. On the next
pass let's say they are now 15,12,223,116.

If you eor the old set with the new set, and re-order the data into
planar, you'll find that only the three lowest planes have actually changed
(the eor results are 3, 1, 3 and 6)... So you can save 5 chip writes and a
fair bit of manipulation, but at the expense of
reading twice as much data from (hopefully) fast memory.
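
The check can be sketched in a few lines: OR together the eor of each old/new pixel pair and see which bit positions are set. Planes are counted from 0 here, and `changed_planes` is a hypothetical helper name. For the four example pixels the eor values are 3, 1, 3 and 6, so bits 0 to 2 differ.

```python
def changed_planes(old_pixels, new_pixels):
    """Return the bitplane numbers (0 = lowest) that differ between two pixel runs."""
    mask = 0
    for old, new in zip(old_pixels, new_pixels):
        mask |= old ^ new
    return [plane for plane in range(8) if (mask >> plane) & 1]


print(changed_planes([12, 13, 220, 114], [15, 12, 223, 116]))
```

Planes that are not in the returned list hold the same bits as before and need no chip write for this group.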

Ben... :)


Patrick Hanevold

Jan 6, 1996, 3:00:00 AM

PH>> We are going to release one in a couple of days.

PH>> 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)
PH>> Thats 160x128 bytes.


PH>> Can't get faster!

>In principle you can actually...
>Quite a lot of the planar data may not need re-writing in the first place.

re-writing? What the heck are you talking about?

<sb>Patrick Hanevold - Virtual Reality developer

Jyrki Saarinen

Jan 6, 1996, 3:00:00 AM

> - what are the fastest ones (ideas, theory, sources) ?

The ideal situation is to do as many c2p passes with the CPU as possible,
depending on the CPU speed, but in such a way that the
chunky2planar CPU process is as fast as normal copying
from fast ram to chip ram. So the c2p passes are "free"
between the chip writes.
The remaining passes are done with the blitter using QBlit().

Starting from the 040, the blitter is not needed since the
040 is able to do all c2p passes completely free!

> - what the hell is a scrambled (maybe spelled wrong) buffer ?

> (I have heard this sometimes in this group)
> and what are his adventages/disadvantages ?

Pass1 is skipped by writing vertical columns in non-linear
order. Only good for vertical drawing, like Wolf3D.

Pass1 only rearranges bytes.

> Are there really fast 6 bitplane c2p-converter ?

> My actual approachs are (50Mhz 030, OS-conform (except screens of

> gfxcards), 320x256)
> 8bpls : 45ms
> 6bpls : 43ms
> 4bpls : 30ms
> These c2p-converter(s) use LONG-writes to CHIPmem.

What routine is this? I guess an 030/50 should be able
to get c2p free --> 320x256x8 screen should be about 20ms.
This could perhaps be achieved by doing three passes
with CPU and the last one with the blitter.

-- _

Stephan Schaem

Jan 7, 1996, 3:00:00 AM
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

: > I personaly use the 5 pass methode.


: > swap word
: > swap byte
: > swap nyble
: > swap bitpair
: > swap bit

: Oh, word swapping is the 5th pass. What is your opinion,

: I think 030/50 should be able to do c2p free. Perhaps
: three passes with CPU and the last one with the blitter.

You can do 13 inst between chip writes on a 50MHz 030, right?
That's 52 inst. Almost, yeah. (4bit)
Actually my table method uses 50 inst to do the 5 passes, including
load/store; 32 inst without the last pass. But on an 030 you
can't do a fast read together with a chip write.
Basically:
        move.w  (a0)+,d0        ; two chunky pixels form the table index
        move.l  (a1,d0*4),d1    ; look up the first table
        move.w  (a0)+,d0
        add.l   (a2,d0*4),d1    ; merge the lookup from the second table
        .. 3 more times (does passes 1, 2 and 4)
pass 16 + pass 8 is 30 inst ... easy to make it overlap with
the 4 chipram writes. (You write the 4 previously converted
longs, which means you have to unroll the loop and have 2 sets of
registers.) But there are enough registers & the whole thing fits in the
cache. So the blitter is not needed...
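
The table idea can be illustrated with a much smaller table than the 512K one mentioned earlier. This is a simplified sketch, not the routine from the post: it assumes 4 bitplanes, looks up one pixel at a time in a 256-entry table (the post indexes with a 16-bit word) and merges with OR, which gives the same result as `add.l` because the bits never overlap. `build_table` and `c2p_4bpl_lookup` are hypothetical names.

```python
def build_table():
    """table[v] has bit b of v placed at the top bit of byte b, for planes 0..3."""
    table = []
    for value in range(256):
        entry = 0
        for plane in range(4):
            if (value >> plane) & 1:
                entry |= 1 << (8 * plane + 7)
        table.append(entry)
    return table


TABLE = build_table()


def c2p_4bpl_lookup(pixels):
    """Convert 8 chunky pixels to 4 plane bytes (plane 0 first, leftmost pixel in bit 7)."""
    assert len(pixels) == 8
    merged = 0
    for i, pixel in enumerate(pixels):
        # Shifting right by i moves the pre-spread bits to this pixel's column.
        merged |= TABLE[pixel] >> i
    return [(merged >> (8 * plane)) & 0xFF for plane in range(4)]


print([hex(v) for v in c2p_4bpl_lookup([15, 0, 0, 0, 0, 0, 0, 1])])
```

The larger the index, the more passes a single lookup replaces, at the cost of table size and cache misses.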

Stephan

Patrick Hanevold

Jan 7, 1996, 3:00:00 AM

[c2p]
>Ben... :)
OK. All true, not implemented, but anyone can do that in a minute.

Juergen Rally Fischer

Jan 8, 1996, 3:00:00 AM
Patrick Hanevold (patrick....@login.eunet.no) wrote:

: PH>> We are going to release one in a couple of days.

: PH>> 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)
: PH>> Thats 160x128 bytes.
: PH>> Can't get faster!

: >In principle you can actually...
: >Quite a lot of the planar data may not need re-writing in the first place.

Depends on the cpu: on an 020-14, not writing is slower than a plain copy.

btw, on 040, has anyone ever thought of not writing 0's to the planes and clearing them
with the blitter?

: re-writing? What the heck are you talking about?

: <sb>Patrick Hanevold - Virtual Reality developer

Juergen Rally Fischer

Jan 8, 1996, 3:00:00 AM
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

: What routine this is? I guess 030/50 should be able
: to get c2p free --> 320x256x8 screen should be about 20ms.

How fast is the 040 version? Is the one from aminet free on 040?

: This could be perhaps achieved by doing three passes


: with CPU and the last one with the blitter.

if the new A1200 is maybe only 030-40 or something, maybe the blitter has to
take over 1.5 passes ;)

: -- _


: a Stellar programmer _ //
: "Amiga - back for the future" \X/

Juergen Rally Fischer

Jan 8, 1996, 3:00:00 AM
Patrick Hanevold (patrick....@login.eunet.no) wrote:

: We are going to release one in a couple of days.

: 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)

nice, but which cpu and which framerate (for the case of blitter assistance) ?

: Thats 160x128 bytes.
: Can't get faster!

: <sb>Patrick Hanevold - Virtual Reality developer

Jyrki Saarinen

Jan 8, 1996, 3:00:00 AM

> : We are going to release one in a couple of days.
> : 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)
>
> nice, but which cpu and which framerate (for the case of blitter
> assistance) ?

It does two passes almost free even on a 28MHz 020/030. Completely
free on a 40MHz 030. The blitting time was quite small if
I remember correctly, the whole time was 17ms, and the CPU
part is ~5ms, so ~12ms blitting.

Jyrki Saarinen

Jan 8, 1996, 3:00:00 AM

> : What routine this is? I guess 030/50 should be able
> : to get c2p free --> 320x256x8 screen should be about 20ms.
>
> how fast is it donig the 040 version. Is the one from aminet free on
> 040?

c2p_040_FastRam4.s 4543 ----rwed 29-Elo-93 14:53:40

This routine is from Aminet, if I recall correctly from a
package called FASTC2P.LZH or something. It does chunky2planar
"free". Funny how the c2p_020_FastRam2.s is about 50% slower
on the 040 than the 040 routine..

> if maybe new A1200 is only 030-40 or something, maybe blitter has to
> overtake 1.5 passes ;)

Could you see a 030/50 doing three passes free? It should
be possible since Ludde & Guys made a routine that does
two passes _almost_ free on a 28MHz 020/030.

Christopher Dyken

Jan 8, 1996, 3:00:00 AM
PH>>> We are going to release one in a couple of days.

PH>>> 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)


PH>>> Thats 160x128 bytes.
PH>>> Can't get faster!

>>In principle you can actually...
>>Quite a lot of the planar data may not need re-writing in the first place.

>re-writing? What the heck are you talking about?

Compare buffers. The routine detects differences and only updates changed areas.
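
A minimal sketch of such a compare buffer, assuming the previous frame's chunky data is kept around and the screen is checked in fixed-size blocks. The block size of 32 pixels and the name `dirty_blocks` are assumptions for illustration.

```python
def dirty_blocks(previous, current, block=32):
    """Return the start offsets of blocks whose chunky data changed since the last frame."""
    return [start
            for start in range(0, len(current), block)
            if current[start:start + block] != previous[start:start + block]]


old_frame = bytes(128)
new_frame = bytearray(old_frame)
new_frame[40] = 5                      # one pixel changes in the second block
print(dirty_blocks(old_frame, bytes(new_frame)))
```

Only the listed blocks would then be converted and written to chip memory, so the cost varies with how much of the picture changed.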


--
Christopher Dyken (chri...@sn.no | http://www.sn.no/~christod)
LoungeBar Development


Sam Yee

Jan 9, 1996, 3:00:00 AM
(please e-mail to jer...@wimsey.com)

patrick....@login.eunet.no (Patrick Hanevold) writes:

>>for b) there's blitterscreen:
>>BLITTERSCREEN sources: http://www.informatik.tu-muenchen.de/~fischerj/
>>;)
>SLOOOOOOOOOOOW!!! :)
>In a couple of days, you'll get ours. :)

><sb>Patrick Hanevold - Virtual Reality developer


><sb>patrick....@login.eunet.no
><sb>Amiga and official Be developer

Are you using HAM mode? You can get excellent speed only updating the lower
four bitplanes and interlacing lores screens, alternating odd/even chunky
pixels.

jer...@wimsey.com
------------------
"Let he who has no sword sell his cloak and buy one." - Jesus, Luke 22:36


Jorge Acereda Macia

Jan 9, 1996, 3:00:00 AM
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

> What routine this is? I guess 030/50 should be able
> to get c2p free --> 320x256x8 screen should be about 20ms.

> This could be perhaps achieved by doing three passes
> with CPU and the last one with the blitter.

20 ms? Using scrambled chunky buffer I get 320x200x8 in 25.7
on 030@50 (slow :-( ). Well, I keep multitasking enabled using a 127
TaskPri, so interrupts should suck the cache... But 20 ms. for
320x256... Could you give some hints?

TIA,
--
---------------------------- --------------------------------------------
| Jorge Acereda | Dream the same thing everynight |
| ii...@rossegat.uji.es | I see our freedom in my sight |
| Intel Outside | No locked doors, no windows barred |
| Amiga Rules | No things to make my brain seem scarred |
---------------------------- --------------------------------------------

Stephan Schaem

Jan 9, 1996, 3:00:00 AM
Stephan Schaem (ssc...@teleport.com) wrote:
: Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

: : Oh, word swapping is the 5th pass. What is your opinion,
: : I think 030/50 should be able to do c2p free. Perhaps
: : three passes with CPU and the last one with the blitter.

After twiddling I got the 4bit version to reach 18ms at 320x256.
On my 25MHz 030 this equals half the bandwidth speed.
Because it uses tables it will never be able to c2p for 'free'.
But for a 25MHz 030 18ms is pretty acceptable if you have some
use for the unused blitter. (Because the first 3 passes are done with
a lookup table, the 2 other passes are overlapped with the chipmem writes,
so the blitter has nothing to do to help.)


Stephan

Thore Bjerklund Karlsen

Jan 9, 1996, 3:00:00 AM
(Jyrki Saarinen)

>> : We are going to release one in a couple of days.
>> : 2x2x256 just as fast as it takes to copy the chunky-buffer to chip.


>>
>> nice, but which cpu and which framerate (for the case of blitter
>> assistance) ?

>It does two passes almost free even on a 28MHz 020/030. Completely
>free on a 40MHz 030. The blitting time was quite small if
>I remeber correctly, the whole time was 17ms, and the CPU
>part is ~5ms, so ~12ms blitting.

Why not just measure it in scanlines? Why ms?

__
\\\__ Thore B. Karlsen % t...@sn.no % C64-C128D-A1200-A2000C
\XX/ Wowbagger/AFL&SSN % -c0d3r- % A1230/50MHz-2C4F/340MB

Acts.9.5: .. And the Lord said, I am Jesus whom thou persecutest: it is
hard for thee to kick against the pricks.

Jyrki Saarinen

Jan 9, 1996, 3:00:00 AM

> After twidling I got the 4bit version to reach 18ms at 320x256.
> On my 25mhz 030 this equal to half the bandwidth speed.
> Because it uses tables it will never be able to c2p for 'free'.
> But for a 25mhz 030 18ms is pretty acceptable if you have some
> use of the unused blitter.(Because the first 3 pass are done with
> a lookup table the 2 other pass overlaped during the chipmem write,
> so the blitter as nothing to do to help)

Yep, 16c just is not enough .. What do you think, could
it be possible to do three passes free on a 030/50?
On the other hand, two passes free and more blitting
should be OK too, because on 320x256 the 50MHz 68030
surely is not able to render a complex tmapped
scene faster than 25fps.

Jyrki Saarinen

Jan 10, 1996, 3:00:00 AM

> 20 ms? Using scrambled chunky buffer I get 320x200x8 in 25.7
> on 030@50 (slow :-( ). Well, I keep multitasking enabled using a 127
> TaskPri, so interrupts should suck the cache... But 20 ms. for
> 320x256... Could you give some hints?

I was just speculating. Copying a 320x256x8 data area from
fast to chip ram takes about 20ms. Now, if three c2p passes
could be done "free" on 030/50 and the last pass with
the blitter, it would be great.

I don't think your system-friendliness slows it down..
Do you use the blitter?

Juergen Rally Fischer

Jan 10, 1996, 3:00:00 AM

In article <4cskrt$k...@morgoth.sfu.ca>, sa...@news.sfu.ca (Sam Yee) writes:
|>
|> (please e-mail to jer...@wimsey.com)
|>
|> patrick....@login.eunet.no (Patrick Hanevold) writes:
|>
|> >>for b) there's blitterscreen:
|> >>BLITTERSCREEN sources: http://www.informatik.tu-muenchen.de/~fischerj/
|> >>;)
|> >SLOOOOOOOOOOOW!!! :)

naaah, whats cooler than 020-14 c2p for free ;) I see that for realtimish
fx it's better to accept cpu doing 10ms for doing one pass for a 2x2 screen.
I already got the routine, wait for bltscr_v1.3 :)

|> >In a couple of days, you'll get ours. :)

oh, yes I'm interested!

|>
|> ><sb>Patrick Hanevold - Virtual Reality developer
|> ><sb>patrick....@login.eunet.no
|> ><sb>Amiga and official Be developer
|>
|> Are you using HAM mode? You can get excellent speed only updating the lower
|> four bitplanes and interlacing lores screens, alternating odd/even chunky
|> pixels.

cool, have you done a demo with your idea ?

if the cpu is quick enough, and the renderer is dirty enough to get conversion
for free ;) then you should even be able to copy a 1x1 screen in 1/2 frame! =:o

but maybe you got to wait for PPC to get it for free ;)

would be especially interesting for all A3000 users! (quick cpu but no AGA)

|>
|> jer...@wimsey.com
|> ------------------
|> "Let he who has no sword sell his cloak and buy one." - Jesus, Luke 22:36
|>

Juergen Rally Fischer

Jan 10, 1996, 3:00:00 AM

In article <3823...@kone.fipnet.fi>, "Jyrki Saarinen" <jsaa...@kone.fipnet.fi> writes:
|>
|> > : We are going to release one in a couple of days.
|> > : 2x2x256 just as fast as it takes to copy the chunky-buffer to chip. :)

|> >
|> > nice, but which cpu and which framerate (for the case of blitter
|> > assistance) ?
|>
|> It does two passes almost free even on a 28MHz 020/030. Completely
|> free on a 40MHz 030. The blitting time was quite small if
|> I remeber correctly, the whole time was 17ms, and the CPU
|> part is ~5ms, so ~12ms blitting.

5ms is the time for copying a 2x2 frame.
10ms is the time for one blitter pass.

The values seem to make sense, but isn't it rather 7ms for cpu doing
passes not completely free, and 10ms for one blitter pass ? ;)

BTW how much is the 2pass routine on 020 ?

My one pass routine needs 10ms for a 2x2 screen, so 2times more than
copying, I hope doing 2 passes is ~15ms and not ~20ms...

|>
|> -- _
|> a Stellar programmer _ //
|> "Amiga - back for the future" \X/

Juergen Rally Fischer

Jan 10, 1996, 3:00:00 AM

In article <3823...@kone.fipnet.fi>, "Jyrki Saarinen" <jsaa...@kone.fipnet.fi> writes:

|> c2p_040_FastRam4.s 4543 ----rwed 29-Elo-93 14:53:40
|>
|> This routine is from Aminet, if I recall correctly from a
|> package called FASTC2P.LZH or something. It does chunky2planar
|> "free". Funny how the c2p_020_FastRam2.s is about 50% slower
|> on the 040 that the 040 routine..
|>
|> > if maybe new A1200 is only 030-40 or something, maybe blitter has to
|> > overtake 1.5 passes ;)
|>
|> Could you see a 030/50 doing three passes free? It should

If it can do this, and you use 3 pass c2p, then it does c2p for free.

|> be possible since Ludde & Guys made a routine that does
|> two passes _almost_ free on a 28MHz 020/030.

So remaining blitter action (one pass) is only 2 frames (for fullscreen 1x1!).

Stephan Schaem

Jan 10, 1996, 3:00:00 AM
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:
: Yep, 16c just is not enough .. What do you think, could

: it be possible to do three passes free on a 030/50?
: On the other hand, two passes free and more blitting
: should be OK too, because on 320x256 the 50MHz 68030
: surely is not able to render a complex tmapped
: scene faster than 25fps.

Basically 40 inst per pass, that equals 96 cycles * 3 / 8 = 36...
Can you fit 36 cycles in between chipmem writes on a 50MHz 030?
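
One rough way to estimate the budget is to take the roughly 20 ms quoted elsewhere in the thread for a plain 320x256x8 copy from fast to chip ram and divide it over the longword writes. That 20 ms is a speculative figure from the discussion, not a measurement, and `cycles_between_chip_writes` is a hypothetical helper, so treat the result as a ballpark only.

```python
def cycles_between_chip_writes(width, height, copy_ms, cpu_mhz, bytes_per_write=4):
    """CPU cycles that elapse per chip write if the copy is limited by chip bandwidth."""
    writes = width * height // bytes_per_write
    microseconds_per_write = copy_ms * 1000.0 / writes
    return microseconds_per_write * cpu_mhz


# 8 bits per pixel, longword writes, 50 MHz CPU
print(cycles_between_chip_writes(320, 256, 20, 50))
```

Under that assumption a 50 MHz CPU sees a bit under 49 cycles per longword write, and half that at 25 MHz.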

Stephan

Jyrki Saarinen

Jan 11, 1996, 3:00:00 AM

> |> It does two passes almost free even on a 28MHz 020/030. Completely
> |> free on a 40MHz 030. The blitting time was quite small if
> |> I remeber correctly, the whole time was 17ms, and the CPU
> |> part is ~5ms, so ~12ms blitting.
>
> 5ms is the time for copying a 2x2 frame.
> 10ms is the time for one blitter pass.

Well, something around those figures.

> The values seem to make sense, but isn't it rather 7ms for cpu doing
> passes not completely free, and 10ms for one blitter pass ? ;)

Nope, starting from a 40MHz 68030 it does those two passes free.
Almost on a 28MHz Amiga, and in a real-life situation (8bpl lores)
I could probably claim that even a 28MHz 020/030 does those
two passes free.

> BTW how much is the 2pass routine on 020 ?

I have not tested.. the old one was 20ms (7ms on 030/50),
I think the new 2pass routine could be about 10-15ms on a
020/14. I have to test it..

> My one pass routine needs 10ms for a 2x2 screen, so 2times more than
> copying, I hope doing 2 passes is ~15ms and not ~20ms...

Yep, Luddes one pass routine took also 10ms on 020/14.

Jyrki Saarinen

Jan 11, 1996, 3:00:00 AM

> |> be possible since Ludde & Guys made a routine that does
> |> two passes _almost_ free on a 28MHz 020/030.
>
> So remaining blitter action (one pass) is only 2 frames (for fullscreen
> 1x1!).

How about doing the last pass, too..? ;) I calculated
about 40ms for blitting when converting a 320x128 screen,
two passes with a CPU. This is nice, too, since a 28MHz does
the CPU part free, and the "large" blitting time is not a problem
since the CPU probably is not able to render faster than
25fps.

Doug Reed

Jan 11, 1996, 3:00:00 AM
In article <4d16bt$4...@sunsystem5.informatik.tu-muenchen.de>,

fisc...@informatik.tu-muenchen.de (Juergen "Rally" Fischer) wrote:

>|> Are you using HAM mode? You can get excellent speed only updating the lower
>|> four bitplanes and interlacing lores screens, alternating odd/even chunky
>|> pixels.
>
>cool, have you done a demo with your idea ?
>
>if the cpu is quick enough, and the renderer is dirty enough to get
>conversion for free ;) then you should even be able to copy a 1x1
>screen in 1/2 frame! =:o but maybe you got to wait for PPC to get it
>for free ;)
>would be especially interesting for all A3000 users! (quick cpu but no
>AGA)

I tried this using SuperHires Ham-6 (Soz A3000 users). On my 030-50F it
does a 12-Bit truecolour 1X1 screen (256*256 [1024*256]) in about 12.5
fps, and the processor is currently idle for >= 3/4 of the time (no
rendering just F>C conversion), as the blitter is doing the bulk of the
work and so there is plenty of time to render the screen with the
processor. The only problems that I can see with it is that colours
have to be scrambled to get this speed (this could be a pain when
shading Etc.) and it requires s**tloads of memory. It does look good
though. If only I had time to do something with it ;)

Doug.

Jyrki Saarinen

Jan 11, 1996, 3:00:00 AM

> Why not just measure it in scanlines? Why ms?

Because I have a routine that measures the time that
a subroutine takes in milliseconds.

Jorge Acereda Macia

Jan 12, 1996, 3:00:00 AM
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

> This routine is from Aminet, if I recall correctly from a
> package called FASTC2P.LZH or something. It does chunky2planar
> "free". Funny how the c2p_020_FastRam2.s is about 50% slower
> on the 040 that the 040 routine..

Yep, but check the chip writes in the 020 version. These writes
can be pipelined a lot more.

Juergen Rally Fischer

Jan 12, 1996, 3:00:00 AM

In article <4d3j36$70g...@salford.ac.uk>, D.A....@chemistry.salford.ac.uk (Doug Reed) writes:
|>
|> In article <4d16bt$4...@sunsystem5.informatik.tu-muenchen.de>,
|> fisc...@informatik.tu-muenchen.de (Juergen "Rally" Fischer) wrote:
|>
|> >|> Are you using HAM mode? You can get excellent speed only updating the lower
|> >|> four bitplanes and interlacing lores screens, alternating odd/even chunky
|> >|> pixels.
|> >
|> >cool, have you done a demo with your idea ?
|> >
|> >if the cpu is quick enough, and the renderer is dirty enough to get
|> >conversion for free ;) then you should even be able to copy a 1x1
|> >screen in 1/2 frame! =:o but maybe you got to wait for PPC to get it
|> >for free ;)
|> >would be especially interesting for all A3000 users! (quick cpu but no
|> >AGA)
|>
|> I tried this using SuperHires Ham-6 (Soz A3000 users). On my 030-50F it
|> does a 12-Bit truecolour 1X1 screen (256*256 [1024*256]) in about 12.5
|> fps, and the processor is currently idle for >= 3/4 of the time (no
|> rendering just F>C conversion), as the blitter is doing the bulk of the

how does it look ? tried watching a pic of a typical doom-scene on it ? :)

|> work and so there is plenty of time to render the screen with the
|> processor. The only problems that I can see with it is that colours
|> have to be scrambled to get this speed (this could be a pain when

scrambled _colors_ ? you mean words ?

|> shading Etc.) and it requires s**tloads of memory. It does look good
|> though. If only I had time to do something with it ;)
|>
|> Doug.

Stephan Schaem

Jan 12, 1996, 3:00:00 AM
Thore Bjerklund Karlsen (t...@sn.no) wrote:

: Why not just measure it in scanlines? Why ms?

An even better 'standard' measure: pixels per second :)

Stephan

Juergen Rally Fischer

Jan 12, 1996, 3:00:00 AM

In article <4cu7vo$1...@sinsen.sn.no>, t...@sn.no (Thore Bjerklund Karlsen) writes:
|> (Jyrki Saarinen)

|>
|> >> : We are going to release one in a couple of days.
|> >> : 2x2x256 just as fast as it takes to copy the chunky-buffer to chip.
|> >>
|> >> nice, but which cpu and which framerate (for the case of blitter
|> >> assistance) ?
|>
|> >It does two passes almost free even on a 28MHz 020/030. Completely
|> >free on a 40MHz 030. The blitting time was quite small if
|> >I remeber correctly, the whole time was 17ms, and the CPU
|> >part is ~5ms, so ~12ms blitting.
|>
|> Why not just measure it in scanlines? Why ms?

he gave the numbers in ms.

after having counted the scanlines the effect needs,
(by measuring the height of it on the TV! :D) you calculate

time= (measured_height/height_of_wbscreen)*(256/312.5)*20ms

for a wb screen of 256 pixel height.

;)

Rallys demo-coding formulas: ;)
----------------------------

nr_rasterlines_fx = (measured_height/measured_height_screen)*nr_lines_screen

time = (nr_rasterlines_fx/nr_rasterlines_tv) * frametime_tv

with nr_rasterlines_tv = 312.5 and frametime_tv = 20ms on PAL.

note that (frametime_tv/nr_rasterlines_tv) is almost the same on both PAL and NTSC;
the reason is they have almost the same horizontal frequency, with a rasterline being
64us on PAL (20ms / 312.5).

So we can approximate

time = (nr_rasterlines_fx*64us)
= (measured_height/measured_height_screen)*nr_lines_screen*64us
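
The formulas above translate directly into a small helper. This is just the same arithmetic written out for PAL; the function and constant names are invented for the example, and the 256-line screen is the case used in the post.

```python
PAL_LINES_PER_FRAME = 312.5
PAL_FRAME_MS = 20.0


def effect_time_ms(measured_height, measured_screen_height, screen_lines=256):
    """Estimate how long an effect takes from its measured height on the display.

    measured_height and measured_screen_height may be in any unit (cm on the TV
    will do) as long as both use the same one.
    """
    raster_lines = measured_height / measured_screen_height * screen_lines
    return raster_lines / PAL_LINES_PER_FRAME * PAL_FRAME_MS


# An effect that covers half the height of a 256-line screen.
print(effect_time_ms(10, 20))
```

Half of a 256-line screen is 128 raster lines, which at about 64 microseconds per line comes to roughly 8.2 ms.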


=;)

what do you think about my formulas ? imho much more interesting than
the crap at school ;)

|>
|> __
|> \\\__ Thore B. Karlsen % t...@sn.no % C64-C128D-A1200-A2000C
|> \XX/ Wowbagger/AFL&SSN % -c0d3r- % A1230/50MHz-2C4F/340MB
|>
|> Acts.9.5: .. And the Lord said, I am Jesus whom thou persecutest: it is
|> hard for thee to kick against the pricks.
|>

Juergen Rally Fischer

Jan 12, 1996, 3:00:00 AM

In article <3823...@kone.fipnet.fi>, "Jyrki Saarinen" <jsaa...@kone.fipnet.fi> writes:
|>
|> > |> be possible since Ludde & Guys made a routine that does
|> > |> two passes _almost_ free on a 28MHz 020/030.
|> >
|> > So remaining blitter action (one pass) is only 2 frames (for fullscreen
|> > 1x1!).
|>
|> How about doing the last pass, too..? ;) I calculated
|> about 40ms for blitting when converting a 320x128 screen,
|> two passes with a CPU. This is nice, too, since a 28MHz does
|> the CPU part free, and the "large" blitting time is not a problem
|> since the CPU propably is not able to render faster than
|> 25fps.

uhm well. I thought we were talking about more demoish (realtimish) fx here.

|>
|> -- _
|> a Stellar programmer _ //
|> "Amiga - back for the future" \X/

Lasse Olsen

unread,
Jan 12, 1996, 3:00:00 AM1/12/96
to
Doug Reed (D.A....@chemistry.salford.ac.uk) wrote:
: In article <4d16bt$4...@sunsystem5.informatik.tu-muenchen.de>,

: fisc...@informatik.tu-muenchen.de (Juergen "Rally" Fischer) wrote:

: >|> Are you using HAM mode? You can get excellent speed only updating the lower
: >|> four bitplanes and interlacing lores screens, alternating odd/even chunky
: >|> pixels.
: >
: >cool, have you done a demo with your idea ?
: >
: >if the cpu is quick enough, and the renderer is dirty enough to get
: >conversion for free ;) then you should even be able to copy a 1x1
: >screen in 1/2 frame! =:o but maybe you got to wait for PPC to get it
: >for free ;)
: >would be especially interesting for all A3000 users! (quick cpu but no
: >AGA)

: I tried this using SuperHires Ham-6 (Soz A3000 users). On my 030-50F it
: does a 12-Bit truecolour 1X1 screen (256*256 [1024*256]) in about 12.5
: fps, and the processor is currently idle for >= 3/4 of the time (no
: rendering just F>C conversion), as the blitter is doing the bulk of the
: work and so there is plenty of time to render the screen with the
: processor. The only problems that I can see with it is that colours
: have to be scrambled to get this speed (this could be a pain when
: shading Etc.) and it requires s**tloads of memory. It does look good
: though. If only I had time to do something with it ;)

: Doug.

Very interesting indeed. :)
Anyone up for a test of this thesis?
Cheers...


Juergen Rally Fischer

unread,
Jan 12, 1996, 3:00:00 AM1/12/96
to

In article <4d27jf$g...@morgoth.sfu.ca>, sa...@news.sfu.ca (Sam Yee) writes:
|> Organization: Simon Fraser University
|> Lines: 41
|> Message-ID: <4d27jf$g...@morgoth.sfu.ca>
|> NNTP-Posting-Host: fraser.sfu.ca
|> X-Newsreader: NN version 6.5.0 #5 (NOV)
|>
|>
|> (please e-mail to jer...@wimsey.com)
|>
|> fisc...@informatik.tu-muenchen.de (Juergen "Rally" Fischer) writes:
|>
|>
|> >|> Are you using HAM mode? You can get excellent speed only updating the lower
|> >|> four bitplanes and interlacing lores screens, alternating odd/even chunky
|> >|> pixels.
|>
|> >cool, have you done a demo with your idea ?
|>
|> Not yet. Have no access to an Amiga right now. The basic idea is to use
|> 16-bit chunky pixels in R4G4B4B4R3G3B3B3R2G2B2B2R1G1B1B1R0G0B0B0 format

what about a very dirty 1x1 mode:

R0G1B2R4G5B6

;) only one component of each pixel is carried over, and not the one needed
most but just an arbitrary one (well, R, G and B cyclic).

Does anyone know how pictures look after this kind of conversion ?


|> and do a two pass (4- and 8-bit) C2P on the lower four bitplanes. The upper
|> two bitplanes in HAM mode are set to display RGBBRGBB... If you use
|> movep.w to output your chunky words, you don't need to do the 8-bit pass,
|> if you do the 4-bit with blitter.
|> This gives you a 16-bit 2x2/2x1 truecolour mode on all Amigas. Also you need to
|> setup the copper to swap screens every 1/60th or 1/50th of a second, and shift
|> the screen display right two pixels to fill in the odd/even pixels.
|> You can do the same with an interlaced screen, but the non-interlaced
|> flicker version looks cleaner (tried a dpaint anim using the same framerate
|> and it looks better than interlaced).

you mean the pixels get flicker-mixed ? theoretically you'd need a mask, but
maybe this looks better.

|>
|> And if you have the luxury of an AGA machine, you can set up a hires HAM
|> screen (640x100 or 640x200) and get a 160x100 or 160x200 display without
|> the need to flicker screens back and forth.

my picture viewer uses a render method that just looks at which component
needs updating most (the one with the biggest difference to the new
wannabe-24bit value). looks good for photos, and is quite quick (compared to
the speed of picture viewers which include the palette for rendering).
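
A rough sketch of that "update the component with the biggest difference" idea, assuming one hold-and-modify operation per pixel and no palette pixels at all (a real HAM encoder can also load a palette colour). The function name and the (component, value) output format are invented here; the actual bitplane encoding is left out.

```python
def ham_greedy_line(targets, start=(0, 0, 0)):
    """Greedy hold-and-modify pass over one line.

    targets: list of wanted (r, g, b) tuples.
    Returns (ops, shown): the component changed at each pixel and the
    colour that actually ends up on screen there.
    """
    cur = list(start)
    ops, shown = [], []
    for t in targets:
        # error of each held component against the wanted colour
        diffs = [abs(t[i] - cur[i]) for i in range(3)]
        i = diffs.index(max(diffs))      # ties go to R, then G, then B
        cur[i] = t[i]                    # modify that one, hold the others
        ops.append(("RGB"[i], t[i]))
        shown.append(tuple(cur))
    return ops, shown


# Three pixels of the same wanted colour: it takes three pixels to settle.
ops, shown = ham_greedy_line([(10, 12, 3)] * 3)
print(ops)
print(shown)
```

The example shows why colour edges smear: after a big change the displayed colour only reaches the wanted one a few pixels later.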

With using tables and a PPC maybe a way for 24bit games ?

|>
|> >if the cpu is quick enough, and the renderer is dirty enough to get conversion
|> >for free ;) then you should even be able to copy a 1x1 screen in 1/2 frame! =:o
|>
|> >but maybe you got to wait for PPC to get it for free ;)
|>
|> >would be especially interesting for all A3000 users! (quick cpu but no AGA)
|>
|>

|> jer...@wimsey.com
|> ------------------
|> "Let he who has no sword sell his cloak and buy one." - Jesus, Luke 22:36
|>

Jyrki Saarinen

unread,
Jan 13, 1996, 3:00:00 AM1/13/96
to

> |> How about doing the last pass, too..? ;) I calculated
> |> about 40ms for blitting when converting a 320x128 screen,
> |> two passes with a CPU. This is nice, too, since a 28MHz does
> |> the CPU part free, and the "large" blitting time is not a problem
> |> since the CPU probably is not able to render faster than
> |> 25fps.
>
> uhm well. I thought we were talking about more demoish (realtimish) fx
> here.

Isn't 25fps enough for you for a 3d scene with thousands of polygons?

Thore Bjerklund Karlsen

unread,
Jan 14, 1996, 3:00:00 AM1/14/96
to
(Stephan Schaem)

>: Why not just measure it in scanlines? Why ms?

> An even better measure 'standart' : pixel second :)

Well, mostly when you talk about C2P conversion, it is for realtime
use, meaning a decent framerate. So it makes sense to measure it in
something frame-relative, like pixels/frame, scanlines or something
else.

__
\\\__ Thore B. Karlsen % t...@sn.no % C64-C128D-A1200-A2000C
\XX/ Wowbagger/AFL&SSN % -c0d3r- % A1230/50MHz-2C4F/340MB

Psa.137.9: Happy shall he be, that taketh and dasheth thy little ones
against the stones.

Thore Bjerklund Karlsen

unread,
Jan 14, 1996, 3:00:00 AM1/14/96
to
(Juergen "Rally" Fischer)

>|> >> : We are going to release one in a couple of days.
>|> >> : 2x2x256 just as fast as it takes to copy the chunky-buffer to c
>|> >>

>|> >> nice, but which cpu and which framerate (for the case of blitter
>|> >> assistance) ?
>|>
>|> >It does two passes almost free even on a 28MHz 020/030. Completely
>|> >free on a 40MHz 030. The blitting time was quite small if
>|> >I remember correctly, the whole time was 17ms, and the CPU
>|> >part is ~5ms, so ~12ms blitting.
>|>

>|> Why not just measure it in scanlines? Why ms?

>he gave the numbers in ms.

Yes, indeed he did!

>after having counted the scanlines the effect needs,
>(by measuring the hight of it on the TV! :D) you calculate

>time= (measured_height/height_of_wbscreen)*(256/312.5)*20ms

>for a wb screen of 256 pixel height.

>;)

Hey, I KNOW how to convert it.. :) But scanlines mean a lot more to me
than ms, it's not obvious from those numbers how much of my frame(s) is
wasted doing C2P. You don't do C2P only once per second!

>Rallys demo-coding formulas: ;)
>----------------------------

[...]

>what do you think about my formulas ? imho much more interesting than
>the crap at school ;)

Isn't everything more interesting than school though.. :)

__
\\\__ Thore B. Karlsen % t...@sn.no % C64-C128D-A1200-A2000C
\XX/ Wowbagger/AFL&SSN % -c0d3r- % A1230/50MHz-2C4F/340MB

4.Kings.2.23: .. and as he was going up by the way, there came forth
little children out of the city, and mocked him, and said
unto him, Go up, thou bald head; go up, thou bald head.
And he .. cursed them in the name of the LORD. And there
came forth two she bears out of the wood, and tare forty
and two children of them.

Jyrki Saarinen

unread,
Jan 14, 1996, 3:00:00 AM1/14/96
to

> Well, mostly when you talk about C2P conversion, it is for realtime
> use, meaning a decent framerate. Alas, it makes sense to measure it in
> something frame-relative, like pixels/frame, scanlines or something
> else.

Framerate = 1000ms/(time in ms)
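
A small sketch of that conversion. The second function is an assumption that goes beyond the formula above: it supposes the display is only swapped at the vertical blank of a 20ms PAL frame, so every update is rounded up to whole frames. Both function names are invented.

```python
import math


def fps_free_running(ms_per_update):
    """Framerate = 1000ms / (time in ms), as in the formula above."""
    return 1000.0 / ms_per_update


def fps_vblank_synced(ms_per_update, frame_ms=20.0):
    """Framerate if each update has to wait for the next vertical blank
    (assumption: buffers are swapped once per whole PAL frame)."""
    frames = max(1, math.ceil(ms_per_update / frame_ms))
    return 1000.0 / (frames * frame_ms)


for ms in (17.0, 25.7, 40.0):
    print(ms, "ms ->", round(fps_free_running(ms), 1), "fps free-running,",
          fps_vblank_synced(ms), "fps synced")
```

Under the synced assumption a 25.7ms routine and a 40ms routine end up at the same framerate, which is the kind of thing a scanline or frame count shows directly.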

Jorge Acereda Macia

unread,
Jan 15, 1996, 3:00:00 AM1/15/96
to
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:

> > 20 ms? Using scrambled chunky buffer I get 320x200x8 in 25.7
> > on 030@50 (slow :-( ). Well, I keep multitasking enabled using a 127
> > TaskPri, so interrupts should suck the cache... But 20 ms. for
> > 320x256... Could you give some hints?

> I was just speculating. Copying a 320x256x8 data area from
> fast to chip ram takes about 20ms. Now, if three c2p passes
> could be done "free" on 030/50 and the last pass with
> the blitter, it would be great.

Using a scrambled buffer, I assume...

> I dont think your systemfriendliness slows down..
> Do you use the blitter?

No blitter. 320x200 with CPU only takes 25.7 ms: the first two passes
take 5.something ms, and the other two plus the writes to chip ram take the
other 20 ms (030@50 times). It's pretty slow. I think three totally free
passes are not possible on 030, but the times should not be more than 22 or
23 ms (?).

So, how long would the last pass take with the blitter? Will we
see full screen 25 fps doom on 030?

Greets,
Jorge Acereda (ii...@rossegat.uji.es)

Stephan Schaem

unread,
Jan 15, 1996, 3:00:00 AM1/15/96
to
Thore Bjerklund Karlsen (t...@sn.no) wrote:
: (Stephan Schaem)

: >: Why not just measure it in scanlines? Why ms?

: > An even better measure 'standart' : pixel second :)

: Well, mostly when you talk about C2P conversion, it is for realtime


: use, meaning a decent framerate. Alas, it makes sense to measure it in
: something frame-relative, like pixels/frame, scanlines or something
: else.

Most numbers I see are in pixels per second... From pc card spec sheets to SGI
tech references; in both examples they apply to realtime operation.

Having a ms report of converting a 320x256 screen is totally weird.
Scanlines are the same thing... you need to define your scanline timing.

A pixel and a second are well defined... What is more logical?

1) I can c2p 4.4 mpixel per second on a 25mhz 030
or
2) I can c2p a 320x256 screen in 18.5 ms ...
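
The two ways of quoting a speed above are the same number once the resolution is known. A quick sketch of the conversion (function names are invented; "mpixel" is taken as 10^6 pixels):

```python
def mpix_per_s(width, height, ms):
    """Throughput in Mpixel/s for converting one width x height screen in ms."""
    return width * height / (ms / 1000.0) / 1e6


def ms_for_screen(width, height, mpix):
    """Time in ms to convert one width x height screen at mpix Mpixel/s."""
    return width * height / (mpix * 1e6) * 1000.0


print(round(mpix_per_s(320, 256, 18.5), 2), "Mpixel/s")   # statement 2 as a rate
print(round(ms_for_screen(320, 256, 4.4), 2), "ms")       # statement 1 at 320x256
print(round(ms_for_screen(256, 200, 4.4), 2), "ms")       # same rate, other screen
```

So 18.5 ms for 320x256 and 4.4 mpixel per second describe roughly the same routine; the rate just makes it easier to rescale to another resolution.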

Stephan

Jorge Acereda Macia

unread,
Jan 15, 1996, 3:00:00 AM1/15/96
to
Juergen "Rally" Fischer (fisc...@informatik.tu-muenchen.de) wrote:

> |> Are you using HAM mode? You can get excellent speed only updating the lower
> |> four bitplanes and interlacing lores screens, alternating odd/even chunky
> |> pixels.

> cool, have you done a demo with your idea ?

I remember a demo with a BIG morphing texturemapped torus (from Sonik ?).
This seems to be using HAM8. It has some glitches, and shows horizontal
lines on screen from time to time.

Any idea how they are doing the c2p?

Greets,
Jorge Acereda (ii...@rossegat.uji.es)

John Hendrikx

unread,
Jan 15, 1996, 3:00:00 AM1/15/96
to
In a message of 14 Jan 96 Thore Bjerklund Karlsen wrote to All:

>> : Why not just measure it in scanlines? Why ms?

>> An even better measure 'standart' : pixel second :)

TBK> Well, mostly when you talk about C2P conversion, it is for
TBK> realtime use, meaning a decent framerate. Alas, it makes sense to
TBK> measure it in something frame-relative, like pixels/frame,
TBK> scanlines or something else.

I personally prefer ms, as 'frame-speed' is not always the same which could
lead to wrong conclusions.

Grtz John

-----------------------------------------------------------------------
John.H...@grafix.xs4all.nl TextDemo/FastView/Etc... development
-----------------------------------------------------------------------
-- Via Xenolink 1.985B3, XenolinkUUCP 1.1

Juergen Rally Fischer

unread,
Jan 16, 1996, 3:00:00 AM1/16/96
to
Jyrki Saarinen (jsaa...@kone.fipnet.fi) wrote:


: > uhm well. I thought we were talking about more demoish (realtimish) fx
: > here.

: Isnt 25fps enough you for a 3d scene with thousands of polygons?

I wasn't thinking of any particular fps.
well, if the blitter can keep up doing normal c2p then it's ok
(which depends not only on fps but on cpu & number of passes)

otherwise removing 1 pass can help: with the cpu doing 2 passes and the
blitter 1 instead of 2, you get a factor of 2 then...

: -- _

Juergen Rally Fischer

unread,
Jan 16, 1996, 3:00:00 AM1/16/96
to
Stephan Schaem (ssc...@teleport.com) wrote:

: Thore Bjerklund Karlsen (t...@sn.no) wrote:
: : (Stephan Schaem)

: : >: Why not just measure it in scanlines? Why ms?

: : > An even better measure 'standart' : pixel second :)

you mean pixel/second I guess.

: having a ms report of convereting a 320x256 screen is totaly weird.
noooo.
: scanline is the same thing... you need to define your scanline timing.
a scanline is 64us on amiga (i.e. anyone talking about scanlines means
a PAL/NTSC one).

: A pixel and a second is well defined... What is more logical?

: 1) I can c2p 4.4 mpixel per second on a 25mhz 030
: or
: 2) I can c2p a 320x256 screen in 18.5 ms ...

logic ? both are information. and the latter one tells me directly
that it'll do it in under one frame.

: Stephan

Iain McCord

unread,
Jan 17, 1996, 3:00:00 AM1/17/96
to
In article <4d6e7q$j...@sunsystem5.informatik.tu-muenchen.de>,

Juergen "Rally" Fischer <fisc...@informatik.tu-muenchen.de> wrote:
>R0G1B2R4G5B6
>
>;) only one component of a pixel is overtaken, but not the one needed most
>but just random (well, R,G and B cyclic).
>
>Anyone knows how picutres look after this kind of conversion ?

Look at a tv picture, after all that's how the screen is made up, of tiny
pixels of rgb.
How exactly is a HAM screen generated, i.e. what's the largest screen width?
I only ask because I gather that you can change 1 colour component at a time,
so in effect you have overlapping RGB planes each 3 pixels wide. You could
then say that a 1280 pixel wide HAM8 screen is equivalent to a 427 pixel wide
24 bit chunky display. If you need a more accurate representation of a
picture, leave the green image alone and adjust the red and blue images to suit.
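
A sketch of the two cyclic schemes being discussed, under the assumption that every output pixel is a hold-and-modify pixel changing exactly one component in R, G, B order. The function names and the (component, value) output format are invented for illustration; no real bitplane layout is implied.

```python
def cyclic_spread(pixels):
    """Three output pixels per chunky pixel: set R, then G, then B.
    This is the '1280 wide behaves like about 427 wide' reading."""
    ops = []
    for r, g, b in pixels:
        ops.extend([("R", r), ("G", g), ("B", b)])
    return ops


def cyclic_pick(pixels):
    """One output pixel per chunky pixel (the 'very dirty 1x1' reading):
    pixel i only contributes component i mod 3, the rest is held."""
    return [("RGB"[i % 3], px[i % 3]) for i, px in enumerate(pixels)]


src = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)]
print(cyclic_spread(src[:2]))
print(cyclic_pick(src))
print(round(1280 / 3), "full-colour pixels across a 1280 wide screen")
```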

Jorge Acereda Macia

unread,
Jan 17, 1996, 3:00:00 AM1/17/96
to
Thore Bjerklund Karlsen (t...@sn.no) wrote:

> Why not just measure it in scanlines? Why ms?

20 ms. = 1 frame.

IMHO it's better to measure ms. instead of scanlines... Scanlines
measuring is harder in OS-friendly code.

Greets,

Patrick Hanevold

unread,
Jan 17, 1996, 3:00:00 AM1/17/96
to

>|> >SLOOOOOOOOOOOW!!! :)

>naaah, whats cooler than 020-14 c2p for free ;) I see that for realtimish
>fx it's better to accept cpu doing 10ms for doing one pass for a 2x2 screen.
>I already got the routine, wait for bltscr_v1.3 :)

>|> >In a couple of days, you'll get ours. :)

>oh, yes I'm interested!

Hehe.. Must write docs first. The source is ready.

>|>
>|> Are you using HAM mode? You can get excellent speed only updating the
>|> lower four bitplanes and interlacing lores screens, alternating odd/even
>|> chunky pixels.

I haven't replied to this. Sorry to whoever wrote it.
We are using 256 colors 2x2.
Have plans to test 65536 colors. NO HAM! :)

>cool, have you done a demo with your idea ?

>if the cpu is quick enough, and the renderer is dirty enough to get
>conversion for free ;) then you should even be able to copy a 1x1 screen in
>1/2 frame! =:o

Sorry, it's 2x2. Will try it at 1x1. It will be fast. Fast as it can get.

>but maybe you got to wait for PPC to get it for free ;)

>would be especially interesting for all A3000 users! (quick cpu but no AGA)

Hehe.. :)

<sb>Patrick Hanevold - Virtual Reality developer
<sb>patrick....@login.eunet.no
<sb>Amiga and official Be developer


Juergen Rally Fischer

unread,
Jan 17, 1996, 3:00:00 AM1/17/96
to

In article <4de4s1$k...@oreig.uji.es>, ii...@rossegat.uji.es (Jorge Acereda Macia) writes:
|> Organization: Universitat Jaume I. Castelló de la Plana. Spain
|> Lines: 27
|> Distribution: world
|> Message-ID: <4de4s1$k...@oreig.uji.es>
|> References: <3823...@kone.fipnet.fi> <4cuhbv$l...@oreig.uji.es> <3823...@kone.fipnet.fi>
|> NNTP-Posting-Host: @rossegat.uji.es
|> X-Newsreader: TIN [version 1.2 PL2]
a blitter pass on 320x256x8planes needs about 0.04 sec.

|> see full screen 25 fps doom on 030?

well, if the copying to vram (assuming free 2-pass action) needs a frame
there's not much left.

25fps is rather the rating for a well cached 486-66 (no, rather a 486-80).

but maybe a 030-50 with a good mem interface doing 320x160 can go 3 frames:
about 1/2 frame for the copy, 1.5 frames for raw mapping (assuming a 20-cycle
inner loop. shading ?), so 1 frame left for polygon 1st-outer-loops+2nd-outer,

well, a A1200+ with 030-40 and 4 frame doom, aaaah :)
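
A back-of-envelope sketch for the blitter side, assuming blitter time scales linearly with pixel count and number of passes from the "about 0.04 sec per pass at 320x256x8" figure above. The linear scaling and the function names are assumptions, not measurements.

```python
REF_PIXELS = 320 * 256
REF_PASS_MS = 40.0      # "about 0.04 sec" for one blitter pass at 320x256x8


def blit_ms(width, height, passes=1):
    """Estimated blitter time in ms, scaled linearly from the reference."""
    return REF_PASS_MS * passes * (width * height) / REF_PIXELS


def pal_frames(ms, frame_ms=20.0):
    """The same time expressed in 20ms PAL frames."""
    return ms / frame_ms


print(pal_frames(blit_ms(320, 256, passes=1)), "frames: 1 pass, 320x256")
print(blit_ms(320, 128, passes=2), "ms: 2 passes, 320x128")
print(pal_frames(blit_ms(320, 160, passes=1)), "frames: 1 pass, 320x160")
```

With these numbers one fullscreen pass comes out at 2 frames and two passes at 320x128 at 40ms, which agrees with the figures quoted earlier in the thread.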

|>
|> Greets,
|> Jorge Acereda (ii...@rossegat.uji.es)

Stephan Schaem

unread,
Jan 18, 1996, 3:00:00 AM1/18/96
to
Jorge Acereda Macia (ii...@rossegat.uji.es) wrote:

: Thore Bjerklund Karlsen (t...@sn.no) wrote:

: > Why not just measure it in scanlines? Why ms?

: 20 ms. = 1 frame.

: IMHO it's better to measure ms. instead of scanlines... Scanlines
: measuring is harder in OS-friendly code.

pixel/second makes even more sense... 20ms on its own means nothing to me.
"20ms to convert 320x256 pixels" is what you need to type.
Or just type: 4mpix/s

Stephan

Juergen Rally Fischer

unread,
Jan 18, 1996, 3:00:00 AM1/18/96
to

In article <4djdpi$i...@neilson.cs.strath.ac.uk>, im...@cs.strath.ac.uk (Iain McCord) writes:
|> In article <4d6e7q$j...@sunsystem5.informatik.tu-muenchen.de>,
|> Juergen "Rally" Fischer <fisc...@informatik.tu-muenchen.de> wrote:
|> >R0G1B2R4G5B6
|> >
|> >;) only one component of a pixel is overtaken, but not the one needed most
|> >but just random (well, R,G and B cyclic).
|> >
|> >Anyone knows how picutres look after this kind of conversion ?
|>
|> Look at a tv picture, after all thats how the screen is made up, of tiny
|> pixels of rgb.

well, it's a bit different there I guess.

|> How exactly is a HAM screen generated, i.e. what's the largest screen width?
|> I only ask because I gather that you can change 1 colour at a time, in effect
|> you have overlaping RGB planes each 3pixels wide. You could then say that a

yes!


|> 1280 pixel wide HAM8 screen is equivalent to a 427 pixel wide 24 bit chunky
|> display. If you need a more accurate representation of a picture, leave the
|> green image alone and adjust the red and blue images to suit.

theoretically it's 3x1.
I tried it. Looking at a photo on 640x512 via composite you don't notice
much. Looking at it on LORES you can see a bit of blockiness. If the colors do
not change too quickly the blockiness is less than on 2x1, as it's smoothed.

could be maybe the only fast way for 24bit animations (for the case 24bit
games get common. It's quite difficult for cpu to do shading on 24bit
values, so 8bit will stay for a while).

and maybe it's best way to do 256 color games on a fast ECS machine.
and maybe conversion time is not that much compared to rendering a
doom scene on A500 ;)

Jyrki Saarinen

unread,
Jan 18, 1996, 3:00:00 AM1/18/96
to

> > I was just speculating. Copying a 320x256x8 data area from
> > fast to chip ram takes about 20ms. Now, if three c2p passes
> > could be done "free" on 030/50 and the last pass with
> > the blitter, it would be great.
>
> Using a scrambled buffer, I assume...

No. Three passes with the CPU and the last one with the blitter.

> No blitter. 320x200 with CPU only, takes 25.7 ms, beeing the first two
> passes 5.something ms and the other two+writes to chip ram the other 20
> ms. (030@50 times) It's pretty slow. I think three totally free passes
> are not possible on 030, but the times should not be more than 22 or 23 ms

Should be, because two passes are free on a 28MHz 020/030.

> So, how much would require the last pass with the blitter? Will we

> see full screen 25 fps doom on 030?

Hmm. Probably about 40ms..

Juergen Rally Fischer

unread,
Jan 18, 1996, 3:00:00 AM1/18/96
to

In article <4dkbml$e...@maureen.teleport.com>, ssc...@teleport.com (Stephan Schaem) writes:
|> Organization: Teleport - Portland's Public Access (503) 220-1016
|> Lines: 15
|> Message-ID: <4dkbml$e...@maureen.teleport.com>
|> References: <4cu7vo$1...@sinsen.sn.no> <4djdu8$b...@oreig.uji.es>
|> NNTP-Posting-Host: kelly.teleport.com

|> X-Newsreader: TIN [version 1.2 PL2]
|>
|> Jorge Acereda Macia (ii...@rossegat.uji.es) wrote:
|> : Thore Bjerklund Karlsen (t...@sn.no) wrote:
|>
|> : > Why not just measure it in scanlines? Why ms?
|>
|> : 20 ms. = 1 frame.
|>
|> : IMHO it's better to measure ms. instead of scanlines... Scanlines
|> : measuring is harder in OS-friendly code.

no timer is more accurate than showing the timing via raster display.
you know any method with less overhead than the 8 cycles for a write
to colorregister ?

|>
|> pixel/second make even more sense... 20ms mean nothing to me?

well, pixel/second is not interesting, who wants to look at an animation
at 1fps ?

|> 20ms to convert 320x256 pixel: is what you need to type.
|> Or Just say type: 4mpix/s

ok, the latter is less chars, true :)

|>
|> Stephan

Stephan Schaem

unread,
Jan 19, 1996, 3:00:00 AM1/19/96
to
Juergen "Rally" Fischer (fisc...@informatik.tu-muenchen.de) wrote:

: In article <4dkbml$e...@maureen.teleport.com>, ssc...@teleport.com (Stephan Schaem) writes:
: |> Organization: Teleport - Portland's Public Access (503) 220-1016
: |> Lines: 15
: |> Message-ID: <4dkbml$e...@maureen.teleport.com>
: |> References: <4cu7vo$1...@sinsen.sn.no> <4djdu8$b...@oreig.uji.es>
: |> NNTP-Posting-Host: kelly.teleport.com
: |> X-Newsreader: TIN [version 1.2 PL2]
: |>
: |> Jorge Acereda Macia (ii...@rossegat.uji.es) wrote:
: |> : Thore Bjerklund Karlsen (t...@sn.no) wrote:
: |>
: |> : > Why not just measure it in scanlines? Why ms?
: |>
: |> : 20 ms. = 1 frame.
: |>
: |> : IMHO it's better to measure ms. instead of scanlines... Scanlines
: |> : measuring is harder in OS-friendly code.

: no timer is more accurate than showing the timing via raster display.
: you know any method with less overhead than the 8 cycles for a write
: to colorregister ?

I use the color register myself... then do iteration*videorate = mpixel

: |>
: |> pixel/second make even more sense... 20ms mean nothing to me?

: well, pixel/second is not interestong, who wants to look an animation
: at 1fps ?

Who wants to look at a 320x256 display when I'm interested in 256x200 ?
mpixel is easy to manipulate to your needs. And you won't come across
a claim like "I can c2p in 8ms"... and then realize it was a 160x200 screen
2x2 he was talking about :)

: |> 20ms to convert 320x256 pixel: is what you need to type.


: |> Or Just say type: 4mpix/s

: ok, the latter is less chars, true :)

I wanted to point out the resolution.. it's totally non-standard

Stephan

Jorge Acereda Macia

unread,
Jan 22, 1996, 3:00:00 AM1/22/96
to
Stephan Schaem (ssc...@teleport.com) wrote:
> Who want to look at a 320x256 display when I'm interested in 256x200 ?
> mpixel is eassy to manipulate to your need.And you wont come across
> a claim like I can c2p in 8ms... and then realize it was 160x200 screen
> 2x2 he was talking about :)

Then we should talk about X NxN mpix/sec

I agree X ms for 320x256 is not a good nomenclature. Many people
use 256x256 or 320x200 in their code.

Jorge Acereda Macia

unread,
Jan 22, 1996, 3:00:00 AM1/22/96
to
Juergen "Rally" Fischer (fisc...@informatik.tu-muenchen.de) wrote:

> |> : IMHO it's better to measure ms. instead of scanlines... Scanlines
> |> : measuring is harder in OS-friendly code.

> no timer is more accurate than showing the timing via raster display.
> you know any method with less overhead than the 8 cycles for a write
> to colorregister ?

Yeah, but when testing multitasking code this can be a bit harder.
ReadEClock() is fast. A scanline is a lot of time. ms are more accurate
in this sense if you measure in "integer rasterlines".

And what's that of writing to colorregisters? Are you counting scanlines
directly in the monitor??? %-)

Juergen Rally Fischer

unread,
Jan 23, 1996, 3:00:00 AM1/23/96
to

In article <4e0s27$h...@oreig.uji.es>, ii...@rossegat.uji.es (Jorge Acereda Macia) writes:
|> Organization: Universitat Jaume I. Castelló de la Plana. Spain
|> Lines: 19
|> Distribution: world
|> Message-ID: <4e0s27$h...@oreig.uji.es>
|> References: <4cu7vo$1...@sinsen.sn.no> <4djdu8$b...@oreig.uji.es> <4dkbml$e...@maureen.teleport.com> <4dm3sm$p...@sunsystem5.informatik.tu-muenchen.de> <4dp3vb$n...@maureen.teleport.com>
|> NNTP-Posting-Host: @rossegat.uji.es

|> X-Newsreader: TIN [version 1.2 PL2]
|>
|> Stephan Schaem (ssc...@teleport.com) wrote:
|> > Who want to look at a 320x256 display when I'm interested in 256x200 ?
|> > mpixel is eassy to manipulate to your need.And you wont come across
|> > a claim like I can c2p in 8ms... and then realize it was 160x200 screen
|> > 2x2 he was talking about :)
|>
|> Then we should talk about X NxN mpix/sec
|>
|> I agree X ms for 320x256 is not a good nomenclature. Many people
|> use 256x256 or 320x200 in their code.

well it's a difference if you use it for coding or for telling others.

pix/sec is best for telling the speed to others (you should mention if
you do fewer than 8 planes ;)

|>
|> Greets,
|> --
|> ---------------------------- --------------------------------------------
|> | Jorge Acereda | Dream the same thing everynight |
|> | ii...@rossegat.uji.es | I see our freedom in my sight |
|> | Intel Outside | No locked doors, no windows barred |
|> | Amiga Rules | No things to make my brain seem scarred |
|> ---------------------------- --------------------------------------------

John Hendrikx

unread,
Jan 24, 1996, 3:00:00 AM1/24/96
to
In a message of 19 Jan 96 Jorge Acereda Macia wrote to All:

>> after having counted the scanlines the effect needs, (by measuring the
>> hight of it on the TV! :D) you calculate

>> time= (measured_height/height_of_wbscreen)*(256/312.5)*20ms

JAM> Why don't you use ReadEClock()?

Works great, in fact, when used properly (and when you time sufficient loops
and take some other stuff into account) you can create a routine which is
accurate enough to tell you that instructions like 'Move.l d0,d0' take 2.00
cycles on a 68030 (so it is accurate to about 1/100th of a CPU cycle -- I guess
that beats counting rasterlines)

Juergen Rally Fischer

unread,
Jan 29, 1996, 3:00:00 AM1/29/96
to

In article <john.hend...@grafix.xs4all.nl>, john.h...@grafix.xs4all.nl (John Hendrikx) writes:
|> In a message of 19 Jan 96 Jorge Acereda Macia wrote to All:
|>
|> >> after having counted the scanlines the effect needs, (by measuring the
|> >> hight of it on the TV! :D) you calculate
|>
|> >> time= (measured_height/height_of_wbscreen)*(256/312.5)*20ms
|>
|> JAM> Why don't you use ReadEClock()?
|>
|> Works great, in fact, when used properly (and when you time sufficient loops
|> and take some other stuff into account) you can create a routine which is
|> accurate enough to tell you that instructions like 'Move.l d0,d0' take 2.00
|> cycles on a 68030 (so it is accurate to about 1/100th of a CPU cycle -- I guess
|> that beats counting rasterlines)

hehe and what about estimating the ratio outer vs inner loop of a
polyengine ? I let outer loop appear yellow and inner red.

Then I could see how the ratios change when polywidth
(a horiz line of it) increases. Outer loop the same, but increased
storage time, so the display became more red at the bottom.

Show me one clock that can perform this without making the
outer loop 2 times slower ;) I bet it won't include realtime
visualisation of the timings ;)

|>
|> Grtz John
|>
|> -----------------------------------------------------------------------
|> John.H...@grafix.xs4all.nl TextDemo/FastView/Etc... development
|> -----------------------------------------------------------------------
|> -- Via Xenolink 1.985B3, XenolinkUUCP 1.1

------------------------------------------------------------------------

Michael van Elst

unread,
Jan 30, 1996, 3:00:00 AM1/30/96
to
fisc...@informatik.tu-muenchen.de (Juergen "Rally" Fischer) writes:

>hehe and what about estimating the ratio outer vs inner loop of a
>polyengine ? I let outer loop appear yellow and inner red.

What about actually _measuring_ it ? :)

>Show me one clock that can perform this without making the
>outer loop 2 times slower ;)

The trick is to measure with varying loop counts. You only take
the time for whole outer loop (or many runs through this loop).
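
One way to read "measure with varying loop counts", sketched under the assumption that one outer iteration costs a fixed outer part plus the inner loop count times a fixed per-iteration cost. Timing the whole thing at two different inner counts then gives both parts without putting any timer call inside the loops. The function name and the example timings are made up.

```python
def split_loop_cost(n1, t1, n2, t2):
    """Solve t = outer + n * inner from two timings of one outer
    iteration run with n1 and n2 inner iterations respectively."""
    inner = (t2 - t1) / (n2 - n1)
    outer = t1 - n1 * inner
    return outer, inner


# Hypothetical timings (in cycles or microseconds, any unit works):
# 10 inner iterations took 70 units, 50 inner iterations took 230 units.
outer, inner = split_loop_cost(10, 70.0, 50, 230.0)
print("outer:", outer, "inner per iteration:", inner)
print("inner share at n=50:", round(50 * inner / 230.0, 2))
```

In practice each timing would be averaged over many runs of the outer loop to smooth out interrupts and cache effects.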

>I bet it won't include realtime
>visualisation of the timings ;)

This is true. But then you might be interested in exact results
instead of a visualization.

--
Michael van Elst

Internet: mle...@serpens.rhein.de
"A potential Snark may lurk in every tree."

John Hendrikx

unread,
Jan 31, 1996, 3:00:00 AM1/31/96
to
In a message of 29 Jan 96 Juergen "rally" Fischer wrote to All:

|>> JAM> Why don't you use ReadEClock()?

|>> Works great, in fact, when used properly (and when you time sufficient
|>> loops and take some other stuff into account) you can create a routine
|>> which is accurate enough to tell you that instructions like 'Move.l
|>> d0,d0' take 2.00 cycles on a 68030 (so it is accurate to about 1/100th
|>> of a CPU cycle -- I guess that beats counting rasterlines)

JrF> hehe and what about estimating the ratio outer vs inner loop of a
JrF> polyengine ? I let outer loop appear yellow and inner red.

That would be simple. Just time the outerloop + innerloop, and then time the
outerloop alone (ie, with a non-existent innerloop). Do some math and you
should get an acceptable answer.

Grtz John

-----------------------------------------------------------------------
John.H...@grafix.xs4all.nl TextDemo/FastView/Etc... development
-----------------------------------------------------------------------

-- Via Xenolink 1.981, XenolinkUUCP 1.1

Jorge Acereda Macia

unread,
Feb 5, 1996, 3:00:00 AM2/5/96
to
Juergen "Rally" Fischer (fisc...@informatik.tu-muenchen.de) wrote:

> |> Yeah, but when testing multitasking code this can be a bit harder.
> |> ReadEClock() is fast. A scanline is a lot of time. ms are more accurate

> mhm in which lib is this ? OS3.0 ? Do I need to open something before
> using it ?

Yeah, OS3.0. You need to open lowlevel.library, but if you need to
use it on <3.0 systems, you can write your replacement using timer.device.

Michael van Elst

unread,
Feb 5, 1996, 3:00:00 AM2/5/96
to
ii...@rossegat.uji.es (Jorge Acereda Macia) writes:

>Yeah, OS3.0. You need to open lowlevel.library,

In fact you need OS3.1 to have lowlevel.library and it contains just a similar
function called ElapsedTime.

>but if you need to
>use it on <3.0 systems, you can write your replacement using timer.device.

Actually you can just call ReadEClock() from the timer.device library.
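
ReadEClock() fills in a 64-bit tick count as two 32-bit halves (ev_hi and ev_lo) and returns the E-clock frequency in ticks per second. A sketch of turning two such readings into milliseconds follows; it only does the arithmetic and does not call the Amiga API. The function names are invented, and the 709379 Hz figure is the usual PAL value, used here purely as example input (the real call tells you the frequency).

```python
def eclock_ticks(ev_hi, ev_lo):
    """Combine the two 32-bit halves of an E-clock reading."""
    return (ev_hi << 32) | ev_lo


def elapsed_ms(start, end, eclock_hz):
    """Milliseconds between two (ev_hi, ev_lo) readings."""
    ticks = eclock_ticks(*end) - eclock_ticks(*start)
    return ticks * 1000.0 / eclock_hz


PAL_ECLOCK_HZ = 709379   # example value; use what ReadEClock() returns

# Hypothetical readings taken before and after a c2p call.
before = (0, 1000)
after = (0, 1000 + 14188)
print(round(elapsed_ms(before, after, PAL_ECLOCK_HZ), 2), "ms")
```

One tick is about 1.4us, so a single reading is far finer than a rasterline; averaging over many loops, as described above, is what gets below that.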

Jorge Acereda Macia

unread,
Feb 7, 1996, 3:00:00 AM2/7/96
to
Michael van Elst (mle...@serpens.rhein.de) wrote:
> ii...@rossegat.uji.es (Jorge Acereda Macia) writes:

> >Yeah, OS3.0. You need to open lowlevel.library,

> In fact you need OS3.1 to have lowlevel.library and it contains just a similar
> function called ElapsedTime.

!?!?!? Really? Well, I have a 1200 with OS3.0, and it came with lowlevel.library
(I bought it used). Haven't reinstalled the OS though.

> >but if you need to
> >use it on <3.0 systems, you can write your replacement using timer.device.

> Actually you can just call ReadEClock() from the timer.device library.

Hmmm... True. Seems like I confused both functions :-)
