I've seen a more clever scheme for 6502s, possibly in a really old
Dr. Dobb's. A pair of 6502s is run on opposite clock phases. Each has
complete access to the memory without the need for an arbiter (or,
rather, the difference in clock phase is the arbiter).
--
Roger Ivie
ri...@ridgenet.net
I have a Rockwell databook from '84 or so that has preliminary info on
a chip (6529?) that has two 6502 cores using the scheme described
above. Don't know if it ever saw production.
-Dave
Any chance of scanning the article and posting it?
My subscription started with the Aug '78 issue.
Jim
I remember spending much time with timing diagrams etc. about 1974
trying to interlock two 8080's on shared memory. I eventually
concluded it was not possible, or that the net timing gains
available would be negligible. There was to be no arbiter;
accesses would be interlocked to avoid any such need.
My reason was that I wanted two separate processes to function,
one for i/o, and one for supervision. The supervisor was supposed
to set up the matrix in memory, and the i/o processor handled
continuous output frames.
I never attempted it with Z80's, but I believe their timings were
similar.
--
Chuck F (cbfal...@yahoo.com) (cbfal...@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
I wonder if the shared memory is static (no refresh interference)
and twice the usual speed so both ports can access it without waiting.
Or is a wait cycle or 2 inevitable?
>I've seen a more clever scheme for 6502s, possibly in a really old
>Dr. Dobbs. A pair of 6502s are run on opposite clock phases. Each has
>complete access to the memory without the need for an arbiter (or,
>rather, the difference in clock phase is the arbiter).
Long ago I pondered how to cycle-steal on a Z80.
I figured that static memory could be available for one cycle
after each opcode fetch during the refresh cycle.
Or use M1 to separate I and D (instruction and data)
to some simplified memory management unit
so data RAM is available during the opcode fetch
so long as the PC never points into that chip's address space!
If the system was EPROM for instruction, the RAM is available
during all M1 cycles (opcode fetch, refresh)
so 2 cycles are possible per opcode!
Has anyone really done that to squeeze max performance out of a Z80?
--
Jeffrey Jonas
jeffj@panix(dot)com
The original Dr. JCL and Mr .hide
I don't think you'd be able to do it with a pair
of Z80s if each chip was executing a different
sequence of instructions since the Z80 doesn't have
uniform machine instruction cycles.
As in, instruction fetch has 4 cycles. Memory
read and write have 3 cycles each if no wait states
are used. Input and output cycles have one automatically inserted
wait state, for 4 cycles each. Some kind of arbiter would be needed
to insert wait states to keep the chips in sync.
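
To make the drift concrete, here is a minimal sketch that adds up machine
cycle lengths for two CPUs running different code. It uses only the fetch
and memory cycle lengths quoted above; the two instruction mixes and the
function name are invented for illustration.

```python
from itertools import accumulate

# T-states per machine cycle with no wait states, as quoted in the post.
T_STATES = {"fetch": 4, "read": 3, "write": 3}

def cycle_starts(machine_cycles):
    """Clock number on which each machine cycle in the sequence begins."""
    lengths = [T_STATES[c] for c in machine_cycles]
    return [0] + list(accumulate(lengths))[:-1]

cpu_a = ["fetch", "read", "fetch", "write"]  # hypothetical instruction mix
cpu_b = ["fetch", "fetch", "fetch", "read"]  # a different hypothetical mix

print(cycle_starts(cpu_a))
print(cycle_starts(cpu_b))
```

The two lists agree for the first two machine cycles and then diverge, so
a fixed time slot per CPU cannot line up with both without inserted waits.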
Given today's fast static rams, you could probably
have four 6 MHz Z80s sharing memory with no wait
states if you could design a suitable arbiter.
A dual ported static ram would also allow you to
have a pair of Z80s with no arbiter.
-Frank
I discovered a useful hack to multiplex the Z80 in the Heath H19
terminal. The H19 has a 2.5 MHz Z80, 2k bytes of static RAM, and a 6845
video controller chip. There is a multiplexer to switch the RAM's data
and address bus to the 6845 when the Z80 is not addressing it. But every
time the Z80 accesses that RAM, the 6845 cannot read RAM data. So they
blank the video during Z80 accesses. This results in annoying black
"dashes" on the screen.
Looking at the data sheet, I noticed that the Z80 latches the data on a
memory read cycle on the falling edge of the clock. Likewise, memory
write data is latched into the memory chip on the falling edge of the
clock.
Conversely, the 6845 was latching the RAM data it read on the rising
edge of the Z80's clock.
At 2.5 MHz, one clock cycle is 400 nsec, so the high and low times are
each 200 nsec. I replaced the stock 450 nsec 2114 RAMs with faster 200
nsec parts, so they could complete a read or write cycle in the 200 nsec
time that the Z80 clock was low. Then I cut the video RAM's multiplexer
line from the address decoder, and connected it instead to the Z80 clock.
The result was that the video RAM was read by the 6845 only when the
clock was low, and was read/written by the Z80 when the clock was high.
The two were neatly interleaved, and never collided. The black "flecks"
were gone, which actually allowed me to considerably speed up the video.
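
The timing budget here is simple arithmetic. A minimal sketch, assuming a
50% duty cycle and ignoring propagation delays and multiplexer switching
time (which a real design cannot ignore); the function names are made up
for illustration.

```python
def half_period_ns(clock_hz):
    """Length of one clock phase (high or low) in ns, assuming 50% duty."""
    period_ns = 1e9 / clock_hz
    return period_ns / 2

def fits_in_phase(ram_access_ns, clock_hz):
    """True if a RAM cycle of the given length fits in one clock phase.

    Idealized: propagation delays and mux switching time are ignored.
    """
    return ram_access_ns <= half_period_ns(clock_hz)

H19_CLOCK_HZ = 2_500_000  # the clock rate quoted in the post

print(half_period_ns(H19_CLOCK_HZ))      # length of each phase in ns
print(fits_in_phase(450, H19_CLOCK_HZ))  # stock 2114 parts
print(fits_in_phase(200, H19_CLOCK_HZ))  # faster replacement parts
```

In this idealized model the 200 nsec parts fit with no margin at all,
which is why the real circuit delays had to be worked out carefully.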
Hundreds of H19s were modified this way. There were some details to
finesse; to make it reliable on all units, I had to carefully work out
the propagation delays of the rest of the circuitry, switch the
multiplexers to faster chips, and add a delay capacitor to ensure that
it switched at exactly the right time to give equal access time to both
6845 and Z80.
Note that the Z80 instruction fetch latches data on the RISING edge of
the clock. So, I could not store programs in video RAM (hardly a
limitation).
So, for the special case of two Z80s sharing a single memory that is NOT
used for program storage, they can share RAM by running one Z80 with an
inverted version of the other's clock, and using this clock to control
the multiplexer between the memory's address and data buses.
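
As a toy model of that arrangement: the clock level itself is the
multiplexer select, so each half cycle belongs to exactly one master. The
names "A" and "B" and both functions are hypothetical; this is a sketch of
the idea, not of any real circuit.

```python
def bus_owner(clock_high):
    """The clock level is the multiplexer select: one master per phase."""
    return "A" if clock_high else "B"

def simulate(half_cycles):
    """Return the bus owner for each successive half clock cycle."""
    owners = []
    clock_high = True
    for _ in range(half_cycles):
        owners.append(bus_owner(clock_high))
        clock_high = not clock_high
    return owners

schedule = simulate(8)
print(schedule)
# Ownership alternates every half cycle, so neither master ever waits.
assert all(a != b for a, b in zip(schedule, schedule[1:]))
```

Master "B" stands for the Z80 running on the inverted clock: its own
"clock high" phase is the other CPU's "clock low" phase.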
--
Lee A. Hart Ring the bells that still can ring
814 8th Ave. N. Forget your perfect offering
Sartell, MN 56377 USA There is a crack in everything
leeahart_at_earthlink.net That's how the light gets in - Leonard Cohen
Very neat hack!
The Apple II was designed around a similar hack, since the 6502 only
accesses memory on one phase of its clock. The other phase was
used (with double-speed DRAM) to access the video area of memory.
And the mapping of raster location to memory address was chosen
so that video refresh also performed DRAM refresh. (This is why the
video memory of an Apple II is not "linear" in the vertical dimension.)
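
For anyone curious what "not linear" looks like, here is a sketch of the
commonly documented hi-res page 1 layout, given from memory rather than
taken from this post. The function name is made up for illustration.

```python
HIRES_PAGE1 = 0x2000  # base address of hi-res page 1

def hires_row_address(y, base=HIRES_PAGE1):
    """Address of the first byte of hi-res scan line y (0..191)."""
    if not 0 <= y < 192:
        raise ValueError("scan line out of range")
    return base + 1024 * (y % 8) + 128 * ((y // 8) % 8) + 40 * (y // 64)

for y in (0, 1, 2, 8, 64):
    print(y, hex(hires_row_address(y)))
```

Adjacent scan lines land 1024 bytes apart, which is the vertical
interleave referred to above.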
-michael
Check out 8-bit Apple sound that will amaze you on my
Home page: http://members.aol.com/MJMahon/
Back in 1980 I was hacking a multiple Z80 system at high speeds
(first 4, then later 6 and 8 MHz). The solution I arrived at, due to the
asynchronous timing, was to build a first-come, first-served arbiter:
the first one there gets the memory and the other gets a clock delay
as a wait. It worked very well and was reproducible.
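
A rough software model of a first-come, first-served arbiter, purely as a
sketch: the request times, the one-clock access time and the tie-break
rule are all assumptions for illustration, not details of the hardware
described above.

```python
def arbitrate(requests, access_time):
    """First-come, first-served grant of one shared memory.

    requests: list of (arrival_time, cpu_name) tuples.
    Returns a list of (cpu_name, start_time, wait_time) in grant order.
    Ties on arrival time are broken by cpu_name (an arbitrary assumption).
    """
    grants = []
    bus_free_at = 0
    for arrival, cpu in sorted(requests):
        start = max(arrival, bus_free_at)
        grants.append((cpu, start, start - arrival))
        bus_free_at = start + access_time
    return grants

# Hypothetical request times in clock periods; each access takes 1 clock.
demo = [(0, "cpu1"), (0, "cpu2"), (5, "cpu2"), (9, "cpu1")]
for cpu, start, wait in arbitrate(demo, access_time=1):
    print(cpu, "starts at", start, "after waiting", wait)
```

Only the request that collides with another one is delayed; the rest run
at full speed, which matches the small net slowdown reported.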
Allison
>> I never attempted it with Z80's, but I believe their timings were
>> similar.
Yes, they were. Hence the idea of making the system async
with holdoff.
> I don't think you'd be able to do it with a pair
>of Z80s if each chip was executing a different
>sequence of instructions since the Z80 doesn't have
>uniform machine instruction cycles.
Right. However, the synchronous scheme used with the 6800 or 6500
can't work with the Z80. If you make memory fast and available to the
first arrival it can be OK.
> Given today's fast static rams, you could probably
>have four 6 MHz Z80s sharing memory with no wait
>states if you could design a suitable arbiter.
There was little problem getting fast parts then; they weren't cheap,
that's all. Examples: 2147 (slow parts were 55ns), 2167 (fast ones
were 35ns), Upd410-3 (85ns in 1978!).
> A dual ported static ram would also allow you to
>have a pair of Z80s with no arbiter.
You still need one. The Z80s could be miles apart or almost in
lockstep at any given moment. It would make the arbiter easier.
I took the easiest path: two Z80s, common clock, shared buses.
Any time one wanted to read or write, the arbiter would delay the clock
and assert wait on the other. Net slowdown was small, but the bus gets
rather nasty busy and it was hard to do on S100 (split data buses).
In the end I took it apart and went with independent CPUs with local
memory and MMU, high speed DMA and several peripheral CPUs
to unload burdensome tasks. The net result was cleaner, more
modular programming as well as data flows.
Allison
(snip)
> Hundreds of H19s were modified this way. There were some details to
> finesse; to make it reliable on all units, I had to carefully work out
> the propagation delays of the rest of the circuitry, switch the
> multiplexers to faster chips, and add a delay capacitor to ensure that
> it switched at exactly the right time to give equal access time to both
> 6845 and Z80.
To this day, I have a complete, unused H19 mod kit on the shelf. Never
had a chance to install it, too nostalgic to throw it out. <g>
Steve
I have a similar article for Z80s, using opposite clock phases, the
arbiter, though, is used for only one reason - to stretch instructions
that take an odd number of clock cycles to the next even clock cycle.
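
The stretching rule can be sketched in a few lines. This is only an
illustration of the idea as described, not the article's circuit; the
instruction lengths are example T-state counts and the function names are
made up.

```python
def stretch_to_even(t_states):
    """Round an instruction's length up to the next even number of clocks."""
    return t_states + (t_states % 2)

def start_clocks(instruction_lengths):
    """Clock on which each instruction starts once lengths are stretched."""
    starts, now = [], 0
    for length in instruction_lengths:
        starts.append(now)
        now += stretch_to_even(length)
    return starts

# Illustrative instruction lengths in T-states (not taken from the article).
lengths = [4, 7, 10, 13, 6]
starts = start_clocks(lengths)
print(starts)
# Every instruction now begins on an even clock, so the CPU keeps its phase.
assert all(s % 2 == 0 for s in starts)
```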
Julian
By the way, it is possible to make a multi Z80 (or 6502) using a similar
method to the two CPU method ie:
(odd cycle)
CPU1-----------------
 |                  |
RAMBANK1        RAMBANK2       expansion
 |                  |              |
expansion         CPU2--------------------
                  (even cycle)
If you were to make a CPU board with the above layout with 2 CPU's per
board, you could daisy chain the boards to an infinite level.
I thought of getting one made before, but no-one else I knew was interested
in the project.
If you design it a little more specific you can come up with a quad ie:
(odd)                      (even)
CPU1--------RAMBANK1--------CPU2
 |                           |
 |                           |
RAMBANK2                  RAMBANK3
 |                           |
 |                           |
CPU3--------RAMBANK4--------CPU4
(even)                      (odd)
Regards, Julian
6502 (plus 6800 and 6809) are synchronous machines and that works well
for them.
>
>By the way, it is possible to make a multi Z80 (or 6502) using a similar
>method to the two CPU method ie:
Can't do that with the Z80, as reads and writes to the bus occupy more than
one half clock cycle and typically closer to a full clock minus
propagation delays. Look at the Z80 timing: it's an asynch machine
with cycle timing that does vary, such as M1 (instruction fetch) states
having a shorter read cycle than any subsequent memory read. Also,
write cycles have slightly different timing from reads. The last item is
a significant difference from the 6502/6800/6809: the Z80 has a distinct
set of instructions for I/O, also with slightly different timing, that are
simply not present in the others.
In the end, yes, you can have multiple Z80s, but the arbiter has to do a
lot more work than it would for the 6502/6800/6809. It's not too bad to
do once you understand the timing and accept the fact that at times
there will be a bus collision and the only solution is to hold off one
of the CPUs by delaying the clock or relying on the DMA logic for part of
the arbitration.
Allison