I have an idea for an "art" project, of sorts - streaming video from a PC over serial to an Apple II. The idea is to present the appearance of a very "retro client" version of the immersive 3D client/server application my company creates on the Apple - just for fun.
The basic idea is as follows:
* Service on the contemporary machine that takes frame-grabs of an application window at some frequency (say, 5Hz)
* The service then processes the frames to an Apple-friendly format. Initially, probably uncompressed lo-res (40x48) display.
* The data is streamed out over serial to the Apple
* The Apple is running a fairly simple loop that reads and displays frames on the lo-res screen
* The Apple is non-interactive while this is running.
A lo-res screen is 40x48 or 1920 pixels/frame; 4 bits/pixel = 7680 bits/frame. At 9600 bps that's about 1Hz refresh, which is a reasonable lower bound.
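The arithmetic above can be sketched as a small calculation. This is only a rough check, assuming 8N1 framing (10 bits on the wire per byte), which the post does not state; the function and constant names are made up for illustration.

```python
LORES_PIXELS = 40 * 48   # 1920 pixels per lo-res frame
BITS_PER_PIXEL = 4       # 16 colors

def lores_frame_rate(baud, bits_per_byte_on_wire=10):
    """Frames per second for uncompressed lo-res at a given line rate.

    bits_per_byte_on_wire=10 is an assumption (start + 8 data + stop).
    """
    payload_bytes = LORES_PIXELS * BITS_PER_PIXEL // 8   # 960 bytes per frame
    bytes_per_second = baud / bits_per_byte_on_wire
    return bytes_per_second / payload_bytes

if __name__ == "__main__":
    for baud in (9600, 19200, 115200):
        print(baud, "bps:", round(lores_frame_rate(baud), 2), "frames/sec")
```

Under that framing assumption, 9600 bps works out to about one frame per second and 19200 bps to about two, matching the estimates in the post.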
Questions:
* Can the IIc (a free one landed on my lap, which inspired this; sans power brick tho) do faster than 9600 reliably? Googling around, 19200bps seems possible (2Hz refresh) - 115k is only available on the IIgs, correct?
* Has anyone already implemented this? In 2006, Frank M. was doing experiments with pre-converted lo-res video (e.g. http://groups.google.com/group/comp.sys.apple2/browse_thread/thread/4...) which is an inspiration, but not directly applicable.
Notes:
* For maximum ease of setup, it would be ideal if the Apple side of the code could be bootstrapped like ADT
* Based on David Schmenk's HBCC game, full-screen updates to the lo-res screen while doing lots of other processing seem entirely feasible, so I'm assuming that there are CPU cycles available for intra-frame compression and even inter-frame compression.
* As an added feature... if the scene stops changing (to within some threshold), the code could switch from streaming a lo-res screen to a hi-res screen. This would take ~8 seconds (at 9600bps), but the result would be that after the scene becomes static for 8 seconds the image quality would increase dramatically, until the next change. To maintain the low latency, the hi-res stream would need to be interruptible (i.e. if change is detected by the encoder, abort the transfer of the hi-res frame and resume lo-res frames), requiring a modicum of protocol design. Compression here would also be nice, but decompression on the Apple might be a killer.
* Obviously, there's the possibility of adding additional bells and whistles, such as letting the Apple send keystrokes back "upstream" to drive the source application, or sidechannel streams e.g. to display text status displays on the Apple, trigger beeps, etc.
I haven't done any work on this and, truth be told, this is likely to be one of those projects that never goes anywhere. But I wanted to throw out the idea for comments. Is there any other interest in this?
On Mar 29, 9:38 am, "Joshua Bell" <inexorablet...@hotmail.com> wrote:
> I haven't done any work on this and, truth be told, this is likely to be one of those projects that never goes anywhere. But I wanted to throw out the idea for comments. Is there any other interest in this?
I think it is a great idea... maybe have the option of doing framegrabs of a video file instead of framegrabs of an application...
On Mar 30, 3:38 am, "Joshua Bell" <inexorablet...@hotmail.com> wrote:
> Questions:
> * Can the IIc (a free one landed on my lap, which inspired this; sans power brick tho) do faster than 9600 reliably? Googling around, 19200bps seems possible (2Hz refresh) - 115k is only available on the IIgs correct?
The 6551 in a IIc can do 115kb/s - ADTPro allows this speed. Depending on the vintage of your IIc the serial port may be slightly (3%) slow and this may cause problems with some serial devices. My old IIc has no problems uploading and downloading disk images with ADTPro.
Joshua Bell wrote:
> I have an idea for an "art" project, of sorts - streaming video from a PC over serial to an Apple II. The idea is to present the appearance of a very "retro client" version of the immersive 3D client/server application my company creates on the Apple - just for fun.
Sounds like great fun!
> The basic idea is as follows:
> * Service on the contemporary machine that takes frame-grabs of an application window at some frequency (say, 5Hz)
> * The service then processes the frames to an Apple-friendly format. Initially, probably uncompressed lo-res (40x48) display.
> * The data is streamed out over serial to the Apple
> * The Apple is running a fairly simple loop that reads and displays frames on the lo-res screen
> * The Apple is non-interactive while this is running.
Since the time to transfer a frame is certainly known as soon as it is captured (even if it is, say, run-length compressed), the frame rate can be made adaptive to the content. In other words, if it can run faster, it does.
> A lo-res screen is 40x48 or 1920 pixels/frame; 4 bits/pixel = 7680 bits/frame. At 9600 bps that's about 1Hz refresh, which is a reasonable lower bound.
> Questions:
> * Can the IIc (a free one landed on my lap, which inspired this; sans power brick tho) do faster than 9600 reliably? Googling around, 19200bps seems possible (2Hz refresh) - 115k is only available on the IIgs correct?
> * Has anyone already implemented this? In 2006, Frank M. was doing experiments with pre-converted lo-res video (e.g. http://groups.google.com/group/comp.sys.apple2/browse_thread/thread/4...) which is an inspiration, but not directly applicable.
As David pointed out, the IIc is quite capable of 115kbps with just a single extra POKE for configuration.
And what you propose is *very* much like video streaming from a disk file, but with a serial link instead and possibly some decompression code.
> Notes:
> * For maximum ease of setup, it would be ideal if the Apple side of the code could be bootstrapped like ADT
> * Based on David Schmenk's HBCC game, full-screen updates to the lo-res screen while doing lots of other processing seem entirely feasible, so I'm assuming that there are CPU cycles available for intra-frame compression and even inter-frame compression.
> * As an added feature... if the scene stops changing (to within some threshold), the code could switch from streaming a lo-res screen to a hi-res screen. This would take ~8 seconds (at 9600bps), but the result would be that after the scene becomes static for 8 seconds the image quality would increase dramatically, until the next change. To maintain the low latency, the hi-res stream would need to be interruptible (i.e. if change is detected by the encoder, abort the transfer of the hi-res frame and resume lo-res frames), requiring a modicum of protocol design. Compression here would also be nice, but decompression on the Apple might be a killer.
Not if it saves more data transfer time than it costs in processor time. That's the tradeoff to examine, and it, of course, depends critically on the actual serial data transfer rate. At 115kbps, it may be necessary to forego decompression just to maintain the UART transfer rate. For example, interrupt processing of UART data is not the way to go at this rate--just simple polling and data storage.
There will be time in the loop to detect a simple protocol escape... I'd recommend using it to trigger page-flipping at end of page and any other things you might want to add. If you want to add keyboard sensing, then that will cost another 6 cycles per received character to test and a loss of some incoming characters when a keypress is sensed and acted upon. A back of the envelope estimate indicates that this will fit into the receive loop at 115kbps (~78 cycles/char).
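As a rough check on the cycle budget, here is a small sketch. The 9 bit times per character and the 1 MHz clock are assumptions chosen because they reproduce the ~78 figure; the per-instruction cycle counts are standard 6502 timings for the common path (character ready, not an escape, no page carry, no key pressed) through the receive loop given in this post.

```python
def cycles_per_char(baud, cpu_hz=1_000_000, bits_per_char=9):
    """CPU cycles available between received characters.

    cpu_hz and bits_per_char are assumptions, not measured values.
    """
    return cpu_hz * bits_per_char / baud

# Common path through the receive loop (standard 6502 cycle counts).
LOOP_PATH = [
    ("lda UAstat,x", 4), ("and #mask", 2), ("bxx loop (not taken)", 2),
    ("lda UAdata,x", 4), ("cmp #escape", 2), ("beq cmd (not taken)", 2),
    ("sta (zp),y", 6), ("iny", 2), ("bne nocar (taken)", 3),
    ("lda kbd", 4), ("bpl loop (taken)", 3),
]

def loop_cycles(path=LOOP_PATH):
    return sum(cycles for _, cycles in path)

if __name__ == "__main__":
    budget = cycles_per_char(115200)
    used = loop_cycles()
    print(f"budget {budget:.1f} cycles/char, loop uses {used}, spare {budget - used:.1f}")
```

With 10 bit times per character (8N1) the budget is somewhat larger, so the 9-bit figure is the more conservative one.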
The basic loop looks like it's about 30 cycles with a simple escape test, so it shouldn't be critical as long as one byte value can be dedicated to the escape function. And even in this case, a second consecutive "escape" could result in a return to the storage loop to store the "escape" byte:
loop    lda UAstat,x    ; UART have char?
        and #mask       ; Mask status
        bxx loop        ; -No, wait.
        lda UAdata,x    ; -Yes, get char.
        cmp #escape     ; Cmd escape?
        beq cmd         ; -Yes, do it.
        sta (zp),y      ; -No, store it
        iny             ;  and increment
        bne nocar       ;  store address.
        inc zp+1
nocar   lda kbd         ; Key pressed?
        bpl loop        ; -No.
        ...             ; -Yes, process key.
        jmp loop        ; Return to loop, having missed x chars.

<Since the sending machine cannot anticipate this, it will be necessary to either "flush" the in-process page, for which sync has been lost, or keep accurate track of how many characters were lost and resume the loop with Y and (zp) incremented accordingly (to just re-use the previous values for the skipped bytes).>

cmd     <receive another character as above>
        <switch on char value to cmd routine>
        <each cmd routine misses x chars, where x could be zero for very simple commands, like page flips>
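On the sending side, the matching encoder might look something like the sketch below. The escape value and the command code are hypothetical placeholders (the thread does not pick any); the only behaviour taken from the description above is that a doubled escape stands for a literal escape byte and that an escape followed by another byte is a command.

```python
ESCAPE = 0xFF          # hypothetical escape byte value
CMD_FLIP_PAGE = 0x01   # hypothetical command code

def encode_data(payload: bytes) -> bytes:
    """Frame data for the wire, doubling any byte equal to ESCAPE."""
    out = bytearray()
    for b in payload:
        out.append(b)
        if b == ESCAPE:
            out.append(ESCAPE)   # second escape means "store a literal escape"
    return bytes(out)

def encode_command(cmd: int) -> bytes:
    """Emit ESCAPE followed by a command byte (which must not be ESCAPE)."""
    if cmd == ESCAPE:
        raise ValueError("command code collides with the escape byte")
    return bytes([ESCAPE, cmd])

if __name__ == "__main__":
    frame = encode_data(bytes([0x01, 0xFF, 0x02])) + encode_command(CMD_FLIP_PAGE)
    print(frame.hex())
```

Picking an escape value that rarely occurs in screen data keeps the doubling overhead small.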
When you decide to switch to hi-res, you'll be filling the hi-res buffer while continuing to display the last full lo-res buffer, I presume. So if you don't finish transferring the full hi-res screen, the partial frame transfer would never be visible.
> * Obviously, there's the possibility of adding additional bells and whistles, such as letting the Apple send keystrokes back "upstream" to drive the source application, or sidechannel streams e.g. to display text status displays on the Apple, trigger beeps, etc.
Those can all be done, but will necessarily require either a "pause" in data transmission, or, perhaps better, a pre-computed number of nulls inserted in the stream after a protocol command to allow for processing time on the Apple side. Each "long" command could finish in a routine to receive bytes until a "data restart" byte is received, which would then return to the main loop.
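A sketch of how the sender might compute that padding, assuming it knows a worst-case cycle count for each "long" command. The byte values and the 78 cycles/char default are illustrative assumptions, not part of any agreed protocol.

```python
import math

NUL = 0x00      # filler byte (assumed value)
RESUME = 0x55   # hypothetical "data restart" byte

def pad_for_command(worst_case_cycles, cycles_per_char=78):
    """Return the filler to send after a long command.

    Sends enough nulls to cover the Apple's worst-case processing time,
    then the restart byte that returns control to the main receive loop.
    """
    nulls = math.ceil(worst_case_cycles / cycles_per_char)
    return bytes([NUL] * nulls + [RESUME])

if __name__ == "__main__":
    print(len(pad_for_command(1000)), "bytes of padding for a 1000-cycle command")
```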
For highest speed, the protocol will need to be pretty "fragile", with recovery from a data transmission error essentially awaiting the next re-syncing event, but it should be fine for a local serial link. Command decoding may need to include some redundancy to prevent wild transfers of control (I would favor a direct vector through a table with, perhaps, a requirement that the command byte be sent twice, in both true and complement form to add some robustness.)
> I haven't done any work on this and, truth be told, this is likely to be one of those projects that never goes anywhere. But I wanted to throw out the idea for comments. Is there any other interest in this?
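The true-plus-complement idea can be illustrated with a short sketch. It is written in Python for clarity only; on the Apple side the same check would be a couple of instructions (for example an EOR against the first byte and a compare with $FF). The function names are made up.

```python
def encode_cmd(cmd: int) -> bytes:
    """Send the command byte followed by its one's complement."""
    cmd &= 0xFF
    return bytes([cmd, cmd ^ 0xFF])

def decode_cmd(pair: bytes):
    """Return the command if the two bytes are complements, else None."""
    true_form, complement = pair
    if true_form ^ complement == 0xFF:
        return true_form
    return None   # corrupted: ignore rather than vector through the table

if __name__ == "__main__":
    print(encode_cmd(0x03).hex(), decode_cmd(encode_cmd(0x03)))
```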
I hope you give it a shot--it would be fun, and might open the door to other "streaming video" activities.
On Mar 31, 8:32 am, "Michael J. Mahon" <mjma...@aol.com> wrote:
> There will be time in the loop to detect a simple protocol escape... I'd recommend using it to trigger page-flipping at end of page and any other things you might want to add. If you want to add keyboard sensing, then that will cost another 6 cycles per received character to test and a loss of some incoming characters when a keypress is sensed and acted upon. A back of the envelope estimate indicates that this will fit into the receive loop at 115kbps (~78 cycles/char).
The IIc is the only 8-bit Apple II that can have keyboard generated interrupts. This reduces the cost of monitoring the keyboard latch to zero cycles but increases the number of cycles used to actually handle the key press.
Michael J. Mahon wrote:
> Since the time to transfer a frame is certainly known as soon as it is captured (even if it is, say, run-length compressed), the frame rate can be made adaptive to the content. In other words, if it can run faster, it does.
Yep. In this scenario, since it's doing realtime, maintaining a consistent frame rate isn't necessary - "as fast as possible" is what's desired. The closest analogy would be the "remote desktop" applications like VNC, which attempt to give you a generic mechanism for the local display (and optionally interactivity) of a remote computer's screen (or app window).
>> Compression here would also be nice, but decompression on the Apple might be a killer.
> Not if it saves more data transfer time than it costs in processor time. That's the tradeoff to examine, and it, of course, depends critically on the actual serial data transfer rate. At 115kbps, it may be necessary to forego decompression just to maintain the UART transfer rate. For example, interrupt processing of UART data is not the way to go at this rate--just simple polling and data storage.
Back-of-the-envelope for 115kbps gives me about a 2Hz update rate for hi-res graphics without compression. Given that, I'd probably skip lo-res entirely since that rate is "good enough" for what I'm imagining.
You're right about compression. However, in the particular case I'm thinking about, the graphics will not be amenable to the sort of decompression the Apple could do in realtime, at least for intra-frame. Inter-frame compression a la MPEG would be feasible to get higher than 2Hz (if less than a full frame changes, transmit only the rectangle that does), but since it's an interactive first-person immersive 3D environment, the cases where higher than 2Hz refresh is compelling are when the viewpoint is changing rapidly (i.e. you're trying to navigate), which are the worst-case scenarios for inter-frame compression.
(That's not to say that either the more complex intermixed lo-res/hi-res or intra-frame compression wouldn't be useful in more general applications of this notion, i.e. streaming arbitrary video. I just know enough about the data in this case to shelve those ideas for now.)
> There will be time in the loop to detect a simple protocol escape...
I confess to not knowing enough about serial transmission to know how much putting a handshake between frames would slow things down, but I have to assume "not much". If we're transmitting a frame in 0.5 seconds, spending 0.01 second between frames for the client to say "user pressed 'W' key" is feasible, so doing this within the frame loop is not necessary.
I'm also not sure how the sender knows when the receiver is done pulling bits out of a buffer; to reduce perceived latency in this realtime client/server app, the server should capture the frame to send as close to when the client is done receiving the previous frame as possible. That's probably serial communication 101, tho.
> When you decide to switch to hi-res, you'll be filling the hi-res buffer while continuing to display the last full lo-res buffer, I presume. So if you don't finish transferring the full hi-res screen, the partial frame transfer would never be visible.
Yep. (Again, it seems that at 115kbps I'd just skip lo-res which simplifies things)
> For highest speed, the protocol will need to be pretty "fragile", with recovery from a data transmission error essentially awaiting the next re-syncing event
Agreed. Again, for this scenario, that's fine.
It's looking like the simplest client implementation is basically:
* initialize the UART to 115kbps
* start a loop -
  * read from UART until a "start of frame" signature is seen
  * start filling the non-visible hires page
  * after 8192 bytes, flip hires pages
  * jump to start of loop (i.e. wait for a signature)
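Since a client like this simply stores 8192 bytes at increasing addresses, the sender has to lay the frame out in the Apple II's interleaved hi-res memory order. A minimal sketch of that step, assuming the frame is already available as 192 rows of 40 packed bytes; the function names are made up, and the offset formula is the standard hi-res row layout.

```python
ROWS, ROW_BYTES, PAGE_SIZE = 192, 40, 8192

def hires_row_offset(y: int) -> int:
    """Offset of scan line y from the start of a hi-res page."""
    return 0x400 * (y % 8) + 0x80 * ((y // 8) % 8) + 0x28 * (y // 64)

def frame_to_page(rows) -> bytes:
    """Arrange 192 rows of 40 bytes into an 8192-byte page image.

    Bytes not covered by any row (the "screen holes") are left as zero.
    """
    if len(rows) != ROWS or any(len(r) != ROW_BYTES for r in rows):
        raise ValueError("expected 192 rows of 40 bytes")
    page = bytearray(PAGE_SIZE)
    for y, row in enumerate(rows):
        offset = hires_row_offset(y)
        page[offset:offset + ROW_BYTES] = row
    return bytes(page)

if __name__ == "__main__":
    page = frame_to_page([bytes([y & 0x7F]) * ROW_BYTES for y in range(ROWS)])
    print(len(page), hex(hires_row_offset(191)))
```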
The next thing to implement would be a simple 1-byte sync signal every 256 bytes; if not seen, assume there was a glitch somewhere and abort the display of this frame, waiting for the next start-of-frame signature.
I've never done serial programming on either side, but I'm assuming the initialization is trivial and you've provided the guts of the loop already. (Thanks!)
The hard part now seems like it's actually on the sending side, where we need to tackle algorithms for generating decent color hi-res screens. I'd probably start by assuming it's a 140x192 with a fixed 6 color palette, do an error-diffusion dither, and ignore the artifacts from trying to put blue and green within (etc) a byte. Which would make Rich happy, I'm assuming.
>>Since the time to transfer a frame is certainly known as soon as it is captured (even if it is, say, run-length compressed), the frame rate can be made adaptive to the content. In other words, if it can run faster, it does.
> Yep. In this scenario, since it's doing realtime, maintaining a consistent frame rate isn't necessary - "as fast as possible" is what's desired. The closest analogy would be the "remote desktop" applications like VNC, which attempt to give you a generic mechanism for the local display (and optionally interactivity) of a remote computer's screen (or app window).
>>>Compression here would also be nice, but decompression on the Apple might be a killer.
>>Not if it saves more data transfer time than it costs in processor time. That's the tradeoff to examine, and it, of course, depends critically on the actual serial data transfer rate. At 115kbps, it may be necessary to forego decompression just to maintain the UART transfer rate. For example, interrupt processing of UART data is not the way to go at this rate--just simple polling and data storage.
> Back-of-the-envelope for 115kbps gives me about a 2Hz update rate for hi-res graphics without compression. Given that, I'd probably skip lo-res entirely since that rate is "good enough" for what I'm imagining.
Well, you'll need at least 9 bit times per byte, and that puts you at 12,800 bytes/sec, or about 1.5 seconds per hi-res screen. That's actually pretty slow.
Run-length compression could save you some, depending on whether you compressed the actual screen or an XOR with the previous screen, and, of course, on how much of the screen is changing each frame.
Compression would have to be *very* simple to fit in the <80 cycles you have per byte, but since run-length uses character pairs, it might work out well.
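A sketch of what pair-based run-length coding of frame differences could look like on the sending side. The (count, value) pair format is an assumption for illustration; a real protocol would also have to fit around the escape byte, and note that this naive format doubles the size of data with no runs.

```python
def xor_frames(current: bytes, previous: bytes) -> bytes:
    """Byte-wise difference between frames; unchanged bytes become 0."""
    return bytes(a ^ b for a, b in zip(current, previous))

def rle_encode(data: bytes) -> bytes:
    """Encode as (count, value) pairs with count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Inverse of rle_encode, shown only to check the round trip."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)

if __name__ == "__main__":
    prev = bytes(8192)
    cur = bytearray(prev)
    cur[100:110] = b"\x7f" * 10          # a small changed region
    packed = rle_encode(xor_frames(bytes(cur), prev))
    print(len(packed), "bytes instead of 8192")
```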
> You're right about compression. However, in the particular case I'm thinking about, the graphics will not be amenable to the sort of decompression the Apple could do in realtime, at least for intra-frame. Inter-frame compression a la MPEG would be feasible to get higher than 2Hz (if less than a full frame changes, transmit only the rectangle that does), but since it's an interactive first-person immersive 3D environment, the cases where higher than 2Hz refresh is compelling are when the viewpoint is changing rapidly (i.e. you're trying to navigate), which are the worst-case scenarios for inter-frame compression.
> (That's not to say that either the more complex intermixed lo-res/hi-res or intra-frame compression wouldn't be useful in more general applications of this notion, i.e. streaming arbitrary video. I just know enough about the data in this case to shelve those ideas for now.)
Give up on anything more complicated than run-length compression of data or data differences. 80 cycles per byte is a harsh mistress. ;-)
>>There will be time in the loop to detect a simple protocol escape...
> I confess to not knowing enough about serial transmission to know how much putting a handshake between frames would slow things down, but I have to assume "not much". If we're transmitting a frame in 0.5 seconds, spending 0.01 second between frames for the client to say "user pressed 'W' key" is feasible, so doing this within the frame loop is not necessary.
You don't want a handshake, since the only recovery is to keep going!
If the sending machine is sent data, it simply acts on it without any housekeeping handshakes. All "service" should simply be "best effort".
> I'm also not sure how the sender knows when the receiver is done pulling bits out of a buffer; to reduce perceived latency in this realtime client/server app, the server should capture the frame to send as close to when the client is done receiving the previous frame as possible. That's probably serial communication 101, tho.
The receiver doesn't "pull" bits; they are "sent" to the receiver by the sender without handshaking.
Since the sender knows exactly when the end-of-frame 2-byte sequence has been sent, it knows exactly when the receiver has displayed it.
>>When you decide to switch to hi-res, you'll be filling the hi-res buffer while continuing to display the last full lo-res buffer, I presume. So if you don't finish transferring the full hi-res screen, the partial frame transfer would never be visible.
> Yep. (Again, it seems that at 115kbps I'd just skip lo-res which simplifies things)
You'd be surprised how much better a 16-color, 6 FPS display looks than a 6-color 0.67 FPS display! (If you want 16-color hi-res, that's double hi-res, and requires twice as long to send.)
At 6 FPS you can see motion pretty well, but at under 1 FPS it's pretty hard unless things are changing quite slowly.
It's also quite easy to change, since a simple "switched" command handler on the Apple II side can be told exactly which screen to fill next and which to display now. There are really no smarts on the Apple side at all--in fact, it doesn't even keep count of bytes transferred, it just stores them in increasing addresses until told to switch buffers and screens.
>>For highest speed, the protocol will need to be pretty "fragile", with recovery from a data transmission error essentially awaiting the next re-syncing event
> Agreed. Again, for this scenario, that's fine.
> It's looking like the simplest client implementation is basically:
> * initialize the UART to 115kbps
> * start a loop -
>   * read from UART until a "start of frame" signature is seen
>   * start filling the non-visible hires page
>   * after 8192 bytes, flip hires pages
>   * jump to start of loop (i.e. wait for a signature)
Actually, switching buffers and screens is fast enough that you can probably do it between the end-of-frame command and the next character, so no "wait for start of frame" is needed.
If you want to accept other, longer to process commands (like play a sound on the Apple speaker), then the sending machine would anticipate the delay and insert an appropriate number of in-line nulls and finish (at just after the time when the Apple would be done in the worst case) with a "resume data" byte that would send control back into the "receive data" loop. This could even happen inside a frame.
> The next thing to implement would be a simple 1-byte sync signal every 256 bytes; if not seen, assume there was a glitch somewhere and abort the display of this frame, waiting for the next start-of-frame signature.
No, just run open-loop. There's nothing the sender can do to improve things in case of an error--this is real-time, after all--and you can easily verify correct or incorrect operation by observing the screen.
> I've never done serial programming on either side, but I'm assuming the initialization is trivial and you've provided the guts of the loop already. (Thanks!)
> The hard part now seems like it's actually on the sending side, where we need to tackle algorithms for generating decent color hi-res screens. I'd probably start by assuming it's a 140x192 with a fixed 6 color palette, do an error-diffusion dither, and ignore the artifacts from trying to put blue and green within (etc) a byte. Which would make Rich happy, I'm assuming.
I agree completely. Palette reduction is non-trivial (maybe with an animation, you can get some advantage from the fact that many colors do not change from frame to frame).
Don't ignore the "color set" bit, or what you produce will be suitable only for monochrome viewing. Trust me, with only 40 bytes across the screen, changing random bytes' color sets makes a very visible mess.
Maybe a good way to start would be to make a monochrome version. The conversion is much easier to write. ;-)
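A monochrome conversion along those lines might be sketched as below. It uses Floyd-Steinberg error diffusion, which is one common choice and not something the thread specifies, and it assumes the input is already a 280x192 grayscale image given as rows of 0..255 values. Seven pixels are packed per byte with bit 0 as the leftmost pixel and the high ("color set") bit left clear.

```python
WIDTH, HEIGHT = 280, 192

def dither_mono(gray):
    """Dither a HEIGHT x WIDTH grayscale image to 192 rows of 40 bytes."""
    work = [[float(v) for v in row] for row in gray]   # mutable copy with error
    rows = []
    for y in range(HEIGHT):
        packed = bytearray(WIDTH // 7)
        for x in range(WIDTH):
            old = work[y][x]
            new = 255.0 if old >= 128 else 0.0
            if new:
                packed[x // 7] |= 1 << (x % 7)         # bit 0 = leftmost pixel
            err = old - new
            # Push the quantization error onto unvisited neighbours.
            if x + 1 < WIDTH:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < HEIGHT:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < WIDTH:
                    work[y + 1][x + 1] += err * 1 / 16
        rows.append(bytes(packed))
    return rows

if __name__ == "__main__":
    ramp = [[x * 255 // (WIDTH - 1) for x in range(WIDTH)] for _ in range(HEIGHT)]
    print(dither_mono(ramp)[0].hex())
```

The rows still need to be placed in hi-res memory order before they are sent.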
BTW, although as David points out, the IIc keyboard can generate interrupts, I would very much recommend *against* using interrupts. They will make your character receive loop non-deterministic and random failures will occur. Further, if you skip keyboard (or any other) interrupts, *all* Apple II's will run the code perfectly.
>> Back-of-the-envelope for 115kbps gives me about a 2Hz update rate for hi-res graphics without compression. Given that, I'd probably skip lo-res entirely since that rate is "good enough" for what I'm imagining.
> Well, you'll need at least 9 bit times per byte, and that puts you at 12,800 bytes/sec, or about 1.5 seconds per hi-res screen.
Can you double check your math? I think you inverted the ratio...
So lower than the 2Hz I guessed, but > 1Hz, which is okay for me... unless I am doing something stupid (it is just 9 bits of the transmission rate per byte, yes?) Also, if we ignore the screen holes, it's only 7680 bytes/frame:
(This optimization might be worth it; the 40-byte scan line loop could be unrolled if necessary.)
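For reference, the hi-res numbers under discussion can be reproduced with a short calculation. The 9 bits per byte figure is the one quoted above; 10 (8N1 framing) is shown as an alternative assumption, and the function name is made up.

```python
def frames_per_second(baud, frame_bytes, bits_per_byte=9):
    """Uncompressed frame rate for a given frame size and line rate."""
    return baud / bits_per_byte / frame_bytes

FULL_PAGE = 8192        # whole hi-res page, including screen holes
VISIBLE = 192 * 40      # 7680 bytes actually displayed

if __name__ == "__main__":
    for bits in (9, 10):
        print(bits, "bits/byte:",
              round(frames_per_second(115200, FULL_PAGE, bits), 2), "Hz full page,",
              round(frames_per_second(115200, VISIBLE, bits), 2), "Hz visible only")
```

At 9 bits per byte this gives roughly 1.56 frames per second for the full page (about 0.64 seconds per screen) and roughly 1.67 if the screen holes are skipped.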
> That's actually pretty slow.
In my specific scenario, where I'm not actually streaming video but an interactive application, quality actually trumps frame rate at about the 1Hz rate. Since the app in question is interactive, the usual case will be long periods of slowly changing content where quality is key, followed by bursts of activity. I agree that for actual *usability* from an Apple client, dropping to lores would be great, but it's not necessary in this case.
(That's not to say we shouldn't go all out and design it! I'm just justifying my shortcuts.)
> Give up on anything more complicated than run-length compression of data or data differences. 80 cycles per byte is a harsh mistress.
Data differences, definitely. If only part of the screen is updating, transmit the bounds and then the contents of the rectangle (on byte boundaries, of course).
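Finding that rectangle on the sending side could be sketched like this, assuming both frames are held as lists of 40-byte rows in screen order, so that byte columns are the natural horizontal unit. The function names and the inclusive-bounds convention are choices made for the example.

```python
def dirty_rect(prev_rows, cur_rows):
    """Bounding box of changed bytes as (top, left, bottom, right), inclusive.

    Rows are scan lines and columns are byte columns (0..39).
    Returns None when the frames are identical.
    """
    changed = [
        (y, x)
        for y, (p, c) in enumerate(zip(prev_rows, cur_rows))
        for x, (a, b) in enumerate(zip(p, c))
        if a != b
    ]
    if not changed:
        return None
    ys = [y for y, _ in changed]
    xs = [x for _, x in changed]
    return min(ys), min(xs), max(ys), max(xs)

def rect_payload(cur_rows, rect):
    """Bytes inside the rectangle, row by row, ready to follow the bounds."""
    top, left, bottom, right = rect
    return b"".join(row[left:right + 1] for row in cur_rows[top:bottom + 1])

if __name__ == "__main__":
    prev = [bytes(40) for _ in range(192)]
    cur = [bytearray(r) for r in prev]
    cur[10][5] = 0x7F
    cur[20][8] = 0x01
    cur = [bytes(r) for r in cur]
    rect = dirty_rect(prev, cur)
    print(rect, len(rect_payload(cur, rect)), "bytes")
```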
> If the sending machine is sent data, it simply acts on it without any housekeeping handshakes. All "service" should simply be "best effort".
Ah, right - I'm thinking Ethernet, where at the lowest level you can't count on delivery.
> The receiver doesn't "pull" bits; they are "sent" to the receiver by the sender without handshaking.
> Since the sender knows exactly when the end-of-frame 2-byte sequence has been sent, it knows exactly when the receiver has displayed it.
I guess I was expecting buffering to be occurring on some level... working too high on the stack for too long, I guess. I'll read up on serial programming before I ask more dumb questions. :)
> You'd be surprised how much better a 16-color, 6 FPS display looks than a 6-color 0.67 FPS display! (If you want 16-color hi-res, that's double hi-res, and requires twice as long to send.)
> At 6 FPS you can see motion pretty well, but at under 1 FPS it's pretty hard unless things are changing quite slowly.
Agreed - the lores video demos definitely inspired this. Again, due to the *specific* nature of what I'm thinking of streaming, at about 1 FPS I think the hires option will be superior. But I'll have to try it out to be sure.
> It's also quite easy to change, since a simple "switched" command handler on the Apple II side can be told exactly which screen to fill next and which to display now. There are really no smarts on the Apple side at all--in fact, it doesn't even keep count of bytes transferred, it just stores them in increasing addresses until told to switch buffers and screens.
That makes sense. Keep it simple!
> Actually, switching buffers and screens is fast enough that you can probably do it between the end-of-frame command and the next character, so no "wait for start of frame" is needed.
That's residue from me thinking about a dumb, error-prone protocol where you aren't reliably getting bytes from one end to another in predictable form. Serial = simple, got it!
> No, just run open-loop. There's nothing the sender can do to improve things in case of an error--this is real-time, after all--and you can easily verify correct or incorrect operation by observing the screen.
I was more thinking that the Apple could discard a frame entirely if the results are bad, but if the error cases are more likely to be corrupt data than out of sync reads (which would corrupt the whole rest of the stream) then this isn't necessary. (Now, instead of thinking high level buffering, I'm thinking low level latching onto a stream. Duh...) So it would need to be an actual checksum... and this is probably overdesigning before we even know what the error rate is. So.... scrap it!
> I agree completely. Palette reduction is non-trivial (maybe with an animation, you can get some advantage from the fact that many colors do not change from frame to frame).
We'll have to time it and see... I haven't actually coded up a diffusion dither myself, but I'm expecting it can crank out 2Hz.
> Maybe a good way to start would be to make a monochrome version. The conversion is much easier to write. ;-)
Definitely phase 1! Phase 2 is probably "ignore the high bit and hope for the best" and see how that looks. Phase 3 is where magic happens. I was also pondering the style where alternate scan lines have alternate color set bits set consistently. This would make pure-blue unachievable, but gives you a larger palette of blended colors, and a screen resolution of 140x96.
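The alternating-scan-line idea could be prototyped with something as small as the sketch below. It assumes each row is 40 packed hi-res bytes with the color set selector in bit 7; which rows get the bit set (odd rows here) is an arbitrary choice for the example.

```python
def set_alternating_color_sets(rows):
    """Force the color set bit: clear on even scan lines, set on odd ones."""
    return [
        bytes((b | 0x80) if y % 2 else (b & 0x7F) for b in row)
        for y, row in enumerate(rows)
    ]

if __name__ == "__main__":
    rows = [bytes([0x2A]) * 40 for _ in range(192)]
    out = set_alternating_color_sets(rows)
    print(out[0][:2].hex(), out[1][:2].hex())
```

Because every byte in a scan line shares the same setting, this avoids the per-byte color set changes warned about earlier in the thread.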