BBC BASIC, Long Strings, And BeagleBoard

Michael

Jun 4, 2012, 6:19:04 AM

Hi Guys.

I have finally decided to re-write my code to pick up music tracks,
with the ability to pick up meta data, or if lacking, decipher the
meta data from the filepath (for WAV).

This of course means I will need to handle long filenames, and thus
long strings.

I have tried Steve Drain's Basalt and Long Strings library, but they
appear to struggle with internal errors on my BeagleBoard. I will
contact Steve about these issues soon.

I have turned to using Routines (as I am also using this in my main
application) and its Long Strings.

How do I now save, and reload these from a basic data file? I am
currently using PRINT# and INPUT#.

Cheers
Michael

David Holden

Jun 4, 2012, 9:37:26 AM

On 4-Jun-2012, Michael <michael...@gmail.com> wrote:

> I have finally decided to re-write my code to pick up music tracks,
> with the ability to pick up meta data, or if lacking, decipher the
> meta data from the filepath (for WAV).
>
> This of course means I will need to handle long filenames, and thus
> long strings.
>

Two points. Firstly it's very bad practice to allow anything on your
computer to have a path/file name longer than about 240 characters as the
Wimp message system can't handle anything longer than this so all sorts of
programs will crash.

Secondly it's best not to use Basic strings for this sort of thing anyway. A
'string' is just a collection of bytes, and you can put them in an integer
array and access them with the indirection operators rather than the Basic
string functions. That way they can be as long as you like.

> How do I now save, and reload these from a basic data file? I am
> currently using PRINT# and INPUT#.

Yuk! Those should have been shot long ago. Either arrange your data in a
block of memory and load and save that (much quicker), or if you must save
and load it a bit at a time use BPUT# and GET$# or BGET#. They don't do
everything backwards.
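
A minimal sketch of the BPUT#/GET$# route, for names that still fit in a Basic string. The filename "Tracks" and the sample names are made up for illustration; the point is only that the text lands in the file in reading order, one name per line.

```bbcbasic
REM Sketch: write names with BPUT#, read them back with GET$#.
REM "Tracks" and the sample names are assumptions for illustration.
DIM name$(1)
name$(0)="Magical Mystery Tour"
name$(1)="Why Does It Always Rain On Me"
out%=OPENOUT("Tracks")
FOR i%=0 TO 1
  BPUT#out%,name$(i%) :REM characters in order, then a linefeed
NEXT
CLOSE#out%
in%=OPENIN("Tracks")
WHILE NOT EOF#in%
  PRINT GET$#in% :REM reads up to the linefeed, 255 characters at most
ENDWHILE
CLOSE#in%
```

This does nothing for names over 255 characters; for those the memory block approach is needed.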

--
David Holden - APDL - <http://www.apdl.co.uk>

Gavin Wraith

Jun 4, 2012, 10:49:10 AM

In message <a33s0...@mid.individual.net>

"David Holden" <Spa...@apdl.co.uk> wrote:

> Secondly it's best not to use Basic strings for this sort of thing anyway. A
> 'string' is just a collection of bytes, and you can put them in an integer
> array and access them with the indirection operators rather than the Basic
> string functions. That way they can be as long as you like.
>
> > How do I now save, and reload these from a basic data file? I am
> > currently using PRINT# and INPUT#.
>
> Yuk! Those should have been shot long ago. Either arrange your data in a
> block of memory and load and save that (much quicker), or if you must save
> and load it a bit at a time use BPUT# and GET$# or BGET#. They don't do
> everything backwards.

I second David Holden's advice. My kneejerk reaction when I see issues
about long strings in Basic is to beat the drum and say "Why use Basic
when you could use RiscLua, which does not have those problems?". However,
I recently wrote an article for Drag'n'Drop, "The difference words can make
(compared to bytes)", comparing two RiscLua programs to extract data from
toolbox Res files. One program used Lua strings and the io library (ANSI
compatible approach), the other used dim-ed arrays and sys-calls to OS_File
(RISC OS only approach). The latter program was 150% faster and 3/4
the size - basically because loading a word at a time is faster than loading
four individual bytes. The program was also much closer in spirit and
appearance to what the corresponding Basic program would be, following
the afore-mentioned advice.

--
Gavin Wraith (ga...@wra1th.plus.com)
Home page: http://www.wra1th.plus.com/

Michael

Jun 4, 2012, 8:20:03 PM

First of all, sorry, I accidentally removed the quotes before I
realised that in FF (or at least my setup of it) under RISC OS, I
can't copy and paste!

Thanks for the advice so far, but they don't really help me:

1) I am a BASIC Coder, and although I am sure Lua is better (let's
face it max string length in 2012??), I have learnt BASIC and can code
in BASIC. I already use PHP and JavaScript for my web based
programming. I have used Basalt for a few years under RISC OS Adjust,
and have only recently had to remove the dependency since getting a
BeagleBoard (although issues arose under RPCEmu with RISC OS 5, and
not with Adjust).

2) Messing with strings as byte arrays is not what I am used to.
In PHP, $MyVar can be (as far as I am aware, and I use it as such) a
string (or other type) of any length, so surely messing with
memory like this is dangerous for a person like me, who doesn't do
this sort of thing. In my stupid attempts at byte indirection, I
managed to overwrite something nasty, and it took a reboot to cure.
In other attempts, when picking up a filepath from OS_GBPB 12, I also
picked up BASIC instructions along with it, before realising the
strings were not terminated as expected and having to add a
+CHR$(13) to the end.

3) This is the main issue I have: the files are on a FAT32-formatted
hard disc (SATA > USB I/F) originating from my PC. DigitalCD can access
these files, and therefore so can my program (as it uses DiskSample).
It is aimed at people who (like me) have used iTunes / MusicMan /
ripped CDs to get their music, where the filepaths are usually created
automatically. As I said in my earlier post, I have been
tasked with stripping meta data from the file path for WAVs:

Fat32Fs::Fat32_4.$.Music.T.The Beatles.The Magical Mystery Tour -
1967.The Magical Mystery Tour - 01 - Magical Mystery Tour/ogg

Fat32Fs::Fat32_4.$.Music.T.The Royal Philharmonic Orchestra.Dinner
Party Classics - 2007.The Royal Philharmonic Orchestra - Dinner Party
Classics - 06 - Why Does It Always Rain On Me (Travis Instrumental
Cover)/ogg

Ok these aren't exactly right (as I still don't know which file does
it, and would rather solve my issue than use a work around like
truncate the path), so I have mocked a few extra bits to make the
point, but I need to be able to handle this type of thing.

Although I am developing this, it is in my spare time, and due to
other reasons, this is so little that this simple issue has been going
on for nearly 3 weeks, and it is only now (with an extra 2 days) that
I have been able to even post here on it!

David:

I agree, the backwards thing did bug me a little, but as they worked -
> "If it ain't broke, don't fix it"

As I am using arrays of strings and integers, how would I arrange my
memory into a block and save/read that? I have always steered away
from looking into this, as my system has worked until now, and in case
I bugger something up ;@)

Last thought, should the Wimp Message System be pointing to a byte
array as well to overcome this? (not knowing anything really about
it) :)

I really appreciate the time you guys take to help me, and I hope that
by munging something together here, I can make a useful app for people
(it certainly helps me categorise my music).

Oh and in case anyone has issue with it, the naming of "The Beatles"
as opposed to "Beatles" was down to DBPowerAmp's CD listing library...
I am aiming to detect "Beatles" "The Beatles" and "Beatles, The", and
potentially correct them on the filesystem as well as the ID3 and Meta
Tags. This also goes for "Nilson, Harry", "Harry Nilson", "Harry
Nillson" and "Nilson".

Cheers Michael

David Holden

Jun 5, 2012, 2:37:57 AM

On 5-Jun-2012, Michael <michael...@gmail.com> wrote:

> David:
>
> I agree, the backwards thing did bug me a little, but as they worked -
> > "If it ain't broke, don't fix it"
>

The problem is that it *is* broke. It's a hangover from the original BBC
6502 Basic and exists only because with a 6502 and limited RAM code could be
faster and more compact if you counted down to 0 instead of up to a certain
number. Hence starting at the end of a string and working down to the start
was 'better'. The trouble is that it makes the data effectively unreadable
for anything except BBC Basic. If you use BPUT# instead of PRINT# and then
GET$# to retrieve a string then the strings are just put into the file
'normally' so they can be read by any other method, including simply
visually.

> As I am using arrays of strings and integers, how would I arrange my
> memory into a block and save/read that? I have always steered away
> from looking into this, as my system has worked until now, and in case
> I bugger something up ;@)
>

Create a byte array big enough to hold your data eg. DIM array% 8000 for
8000 (actually 8001) bytes. You now need a pointer to the data, eg. ptr%.
Assuming you've got a Basic string and 2 integers you could use -

ptr%=array% :REM set ptr% to start of data block
$ptr%=name$ :REM insert string at ptr%
ptr%=ptr%+LEN(name$)+1 :REM point to next byte after end of string
ptr%=(ptr%+3) AND NOT 3 :REM word align ptr%
!ptr%=data1%:ptr%+=4 :REM insert 1st data word and inc ptr%
!ptr%=data2%:ptr%+=4 :REM insert 2nd data word and inc ptr%

repeat until done (obviously without the 1st line for subsequent records).

It would also be sensible to leave (say) 4 bytes empty at the start and keep
a count of the number of records as you go then put this into the start of
the data so you know how many records there are when you retrieve it; eg.,
start out with ptr%=array%+4, then before you save !array%=count%

Note. It's not strictly necessary to word align the integers in a Basic
array but it's neater and would make it easier for another language to
retrieve the data from the file.
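
Put together, the steps above might look like the following sketch. The arrays name$(), data1%() and data2%() and their sample contents are assumptions standing in for whatever the real program holds, and the space check is an addition that is not in the fragment above.

```bbcbasic
REM Sketch: pack count% records into one block, record count in word 0.
REM The arrays and their contents are made up for illustration.
count%=2
DIM name$(count%-1),data1%(count%-1),data2%(count%-1)
name$(0)="Track one":data1%(0)=1:data2%(0)=1967
name$(1)="Track two":data1%(1)=2:data2%(1)=2007
DIM array% 8000
ptr%=array%+4 :REM leave the first word free for the count
FOR i%=0 TO count%-1
  REM worst case per record: string, terminator, 3 pad bytes, 2 words
  IF ptr%+LEN(name$(i%))+12>array%+8000 THEN ERROR 1,"Block full"
  $ptr%=name$(i%) :REM string plus CR terminator
  ptr%+=LEN(name$(i%))+1
  ptr%=(ptr%+3) AND NOT 3 :REM word align
  !ptr%=data1%(i%):ptr%+=4
  !ptr%=data2%(i%):ptr%+=4
NEXT
!array%=count%
PRINT "Block uses ";ptr%-array%;" bytes"
```

After the loop ptr% points just past the last record, which is the end address needed when saving.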

If the names are longer than 255 characters then instead of just inserting
Basic strings you'd have to copy them. If (as you say) you've used OS_GBPB
to get the names then set a pointer (name%) to point to the name returned by
OS_GBPB and instead of lines 2 and 3 above use -

WHILE ?name% >13 :REM repeat until end of name found
?ptr%=?name% :REM copy a character
ptr%+=1:name%+=1 :REM inc pointers
ENDWHILE
?ptr%=0:ptr%+=1 :REM terminate string and inc ptr%

To save the data it's best to use the appropriate OS_File command but you
could just use -

OSCLI "save "+filename$+" "+STR$~array%+" "+STR$~ptr%

Retrieving the data is the reverse of the above.
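
As a rough sketch of the whole round trip, here is a tiny block saved with OS_File 10 and read back with OS_File 17 and 16. The filename "TrackData", PROCadd and the sample records are made up for illustration, and the layout assumed is the short-string one above (CR terminated names). It also assumes DIM hands back word-aligned addresses, so the alignment padding still lines up after reloading.

```bbcbasic
REM Sketch: build, save, reload and walk a small record block.
REM "TrackData", PROCadd and the sample records are assumptions.
filename$="TrackData"
DIM block% 255
ptr%=block%+4 :REM first word is reserved for the record count
PROCadd("Track one",1,1967)
PROCadd("Track two",2,2007)
!block%=2
REM OS_File 10: save memory from block% up to (not including) ptr%
SYS "OS_File",10,filename$,&FFD,,block%,ptr%
REM OS_File 17: read catalogue info to get the file length
SYS "OS_File",17,filename$ TO type%,,,,size%
IF type%<>1 THEN ERROR 1,"Not a file"
DIM load% size%
REM OS_File 16: load at the address given in R2 (R3=0)
SYS "OS_File",16,filename$,load%,0
ptr%=load%+4
FOR i%=1 TO !load%
  name$=$ptr% :REM reads up to the CR terminator
  ptr%+=LEN(name$)+1
  ptr%=(ptr%+3) AND NOT 3
  PRINT name$,!ptr%,ptr%!4
  ptr%+=8
NEXT
END

DEF PROCadd(n$,a%,b%)
$ptr%=n$
ptr%+=LEN(n$)+1
ptr%=(ptr%+3) AND NOT 3
!ptr%=a%:ptr%!4=b%
ptr%+=8
ENDPROC
```

For names copied in with a 0 terminator, $ptr% will not stop in the right place, so the name would have to be walked a byte at a time instead.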

> Last thought, should the Wimp Message System be pointing to a byte
> array as well to overcome this? (not knowing anything really about
> it) :)

All text passed to and from the Wimp message system *is* passed in that way.
Basic will hide this from you by letting you put a Basic string as a
parameter but it converts it to a 0 terminated string before passing it to
the SWI.

The limitation is because the Wimp message system itself passes only 256
bytes. Some of these are used for various parameters so any text will be
limited to about 240. Of course, this won't matter if you're not using the
Wimp message system, eg. to load/save data by dragging.

spampling

Jun 5, 2012, 5:13:38 AM

In article <a35nq6...@mid.individual.net>, David Holden

<Spa...@apdl.co.uk> wrote:
> The problem is that it *is* broke. It's a hangover from the original BBC
> 6502 Basic and exists only because with a 6502 and limited RAM code
> could be faster and more compact if you counted down to 0 instead of up
> to a certain number. Hence starting at the end of a string and working
> down to the start was 'better'.

Ah, I knew there had to be a reason for that back to front stuff.

> The trouble is that it makes the data
> effectively unreadable for anything except BBC Basic. If you use BPUT#
> instead of PRINT# and then GET$# to retrieve a string then the strings
> are just put into the file 'normally' so they can be read by any other
> method, including simply visually.

The latter was the reason I always use the put-get version - while I find
reading handwriting upside down [1] and print backwards relatively easy
reading things the right way round is always easier.

Mind you I've come across commercial software that screwed up the byte
count and thus over-wrote the data separation markers. I had to hack a
"database" used at the place my mother worked to stop it accepting data
strings that were too long.

[1] Handy in various across the desk situations.

--

Steve Pampling

John Williams (News)

Jun 5, 2012, 6:06:55 AM

In article <529afdca16...@btinternet.com>,

spampling <spam....@btinternet.com> wrote:

> [1] Handy in various across the desk situations.

Steve - too much information!

Seriously though, as a teacher I also acquired that skill of reading, even
spotting errors, upside-down!

Also eyes in the back of the head.

John

--
John Williams, Brittany, Northern France - no attachments to these addresses!
Non-RISC OS posters change user to johnrwilliams or put 'risc' in subject!
Who is John Williams? http://petit.four.free.fr/picindex/author/

spampling

Jun 5, 2012, 11:25:07 AM

In article <529b02aa...@tiscali.co.uk>,

John Williams (News) <UCE...@tiscali.co.uk> wrote:
> In article <529afdca16...@btinternet.com>,
> spampling <spam....@btinternet.com> wrote:

> > [1] Handy in various across the desk situations.

> Steve - too much information!

> Seriously though, as a teacher I also acquired that skill of reading,
> even spotting errors, upside-down!

Mastered as a pupil - keeping one step ahead.
Also need to show no awareness of content just read.

--

Steve Pampling

Jonathan Graham Harston

Jun 5, 2012, 3:59:11 PM

gavin wrote:

> David Holden wrote:
> > Secondly it's best not to use Basic strings for this sort of thing anyway.

> I second David Holden's advice.

Agreed. The answer to "I want to use greater-than-255-character
strings" is "they aren't strings, they're text blocks, treat them
as such."

And if you have any media with pathnames longer than 240
characters, it desperately needs tidying up, not just from a Wimp
point of view, but from a general usability and understandable
structural point of view.

--
J.G.Harston - j...@mdfs.net - mdfs.net/jgh
RISC OS Internationalisation - http://mdfs.net/Software/RISCOS

Richard Russell

Jun 7, 2012, 9:29:38 AM

On Jun 5, 1:20 am, Michael <michaelremer...@gmail.com> wrote:
> 1) I am a BASIC Coder, and although I am sure Lua is better (let's
> face it max string length in 2012??), I have learnt BASIC and can code
> in BASIC.

Have you considered using Brandy? That's very compatible with BASIC
V, but supports strings up to 65535 bytes.

Richard.
http://www.rtrussell.co.uk/

Rick Murray

Jun 15, 2012, 2:37:32 AM

On 05/06/2012 08:37, David Holden wrote:

> The trouble is that it makes the data effectively unreadable for anything
> except BBC Basic.

It's a doddle in C (just byte read it backwards into the string) and was
fairly easy also under VB. Hell, it was a lot easier than recoding the
string read functions myself so that VB could cope with RISC OS
terminated text without freaking out.
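
The same trick, sketched in Basic itself with BGET#. This assumes the PRINT# string record is a type byte of 0, a length byte, then the characters stored last to first; "PrintTest" is a made-up filename.

```bbcbasic
REM Sketch: read one PRINT# string record by hand.
REM Assumed layout: type byte 0, length byte, characters reversed.
REM "PrintTest" is an assumption for illustration.
out%=OPENOUT("PrintTest")
PRINT#out%,"Magical Mystery Tour"
CLOSE#out%
in%=OPENIN("PrintTest")
type%=BGET#in%
IF type%<>0 THEN CLOSE#in%:ERROR 1,"Not a string record"
len%=BGET#in%
s$=""
FOR i%=1 TO len%
  s$=CHR$(BGET#in%)+s$ :REM prepending undoes the reversal
NEXT
CLOSE#in%
PRINT s$
```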

> so they can be read by any other method, including simply visually.

You might be surprised to find that some long-time Beeb/BASIC coders can
actually read backwards strings visually.

Once upon a time, I could write stuff backwards as well. A skill learned
mostly to freak out teachers, but it's a skill long since lost (rather
like knowing about the Frasch process - learned it, was tested on it,
forgot it, never needed to care about sulphur after chem class!).

> Create a byte array big enough to hold your data eg. DIM array% 8000 for
> 8000 (actually 8001) bytes.

So, erm, DIM 7999 then? ;-)

> WHILE ?name%>13 :REM repeat until end of name found
> ?ptr%=?name% :REM copy a character
> ptr%+=1:name%+=1 :REM inc pointers
> ENDWHILE
> ?ptr%=0:ptr%+=1 :REM terminate string and inc ptr%

In case it was missed, note the terminator looked for (13) and the
terminator set in the array (0) are not the same.

Best wishes,

Rick.

David Holden

Jun 15, 2012, 4:58:27 AM

On 15-Jun-2012, Rick Murray <heyrickma...@yahoo.co.uk> wrote:

> On 05/06/2012 08:37, David Holden wrote:
>
> > The trouble is that it makes the data effectively unreadable for
> > anything except BBC Basic.
>
> It's a doddle in C (just byte read it backwards into the string) and was
> fairly easy also under VB. Hell, it was a lot easier than recoding the
> string read functions myself so that VB could cope with RISC OS
> terminated text without freaking out.
>

Yes, it's a doddle in anything, but, as you pointed out, you have to write
special routines to deal with it.

>
> > WHILE ?name%>13 :REM repeat until end of name found
> > ?ptr%=?name% :REM copy a character
> > ptr%+=1:name%+=1 :REM inc pointers
> > ENDWHILE
> > ?ptr%=0:ptr%+=1 :REM terminate string and inc ptr%

> In case it was missed, note the terminator looked for (13) and the
> terminator set in the array (0) are not the same.

Looked for by what? Just about everything except BBC Basic recognises 0 as a
terminator. Even the BBC Basic GET$# does, so it would seem more sensible to
use 0. Remember the whole point of the exercise is to *avoid* Basic strings.

Rick Murray

Jun 15, 2012, 11:37:36 AM

On 15/06/2012 10:58, David Holden wrote:

> Yes, it's a doddle in anything, but, as you pointed out, you have to write
> special routines to deal with it.

Er... ;-) That's what programming is, once you move to something a
little more serious than AppInventor.

>>> WHILE ?name%>13 :REM repeat until end of name found

> Looked for by what?

That.

> Just about everything except BBC Basic recognises 0 as a terminator.

Yup. 'Tis a shame BASIC V didn't introduce support for null terminated
strings. Especially since the OS uses them itself [*], what with the
likes of OS_Write0 (but no OS_WriteCR equivalent).

> Remember the whole point of the exercise is to *avoid* Basic strings.

Which implies a potential need to write custom routines for string
manipulation (instr, strstr, strlen, strcmp, etc). Not complicated, but
best done in assembler unless you enjoy pouring treacle.
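
For what it's worth, plain Basic versions are short. A sketch of two of them for zero terminated text in memory follows; FNzlen and FNzcmp are made-up names, not part of BASIC or any library, and no claim is made about their speed.

```bbcbasic
REM Sketch: strlen and strcmp equivalents for 0 terminated text.
REM FNzlen and FNzcmp are hypothetical names for illustration.
DIM a% 15,b% 15
$a%="Beatles":a%?7=0 :REM swap the CR terminator for a 0
$b%="Beatles, The":b%?12=0
PRINT FNzlen(a%),FNzlen(b%)
IF FNzcmp(a%,b%)<0 THEN PRINT "First sorts before second"
END

DEF FNzlen(p%)
LOCAL n%
WHILE p%?n%<>0:n%+=1:ENDWHILE
=n%

DEF FNzcmp(p%,q%)
REM Returns 0 if equal, negative if p% sorts first, else positive
WHILE ?p%=?q% AND ?p%<>0:p%+=1:q%+=1:ENDWHILE
=?p%-?q%
```

The comparison is a raw byte comparison, so it is case sensitive.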

Best wishes,

Rick.

Paul Sprangers

Jun 15, 2012, 12:34:37 PM

In article <4fdb56be$0$6168$ba4a...@reader.news.orange.fr>,

Rick Murray <heyrickma...@yahoo.co.uk> wrote:

> Which implies a potential need to write custom routines for string
> manipulation (instr, strstr, strlen, strcmp, etc). Not complicated, but
> best done in assembler unless you enjoy pouring treacle.

My experience is that memory block manipulation, as a replacement for
string manipulation, is not noticeably slower than the original BASIC
calls, despite the multiple lines of code. On the contrary even, at least
when the BASIC program gets compiled - memory block manipulation seems to
benefit strongly from compilation, up to a factor of 40, where the overall
performance of compiled BASIC rarely exceeds a factor 2, compared to the
interpreted source.

Kind regards,
Paul Sprangers

Frank de Bruijn

Jun 16, 2012, 2:25:29 AM

In article <a40bpn...@mid.individual.net>,

David Holden <Spa...@apdl.co.uk> wrote:
> On 15-Jun-2012, Rick Murray <heyrickma...@yahoo.co.uk> wrote:
> > In case it was missed, note the terminator looked for (13) and the
> > terminator set in the array (0) are not the same.

> Looked for by what? Just about everything except BBC Basic recognises
> 0 as a terminator. Even the BBC Basic GET$# does,

In which version of BBC Basic? Certainly not in anything up to 1.48.

Regards,
Frank

David Holden

Jun 16, 2012, 3:57:35 AM

Well, from personal experience I know it's true but to quote from the GET$#
entry in the BBC Basic Reference Manual;

"A string of characters is read until a linefeed (ASCII 10) a carriage
return (ASCII 13), null character (ASCII 0) or the end of the file is
encountered or else the maximum of 255 characters is reached."

Though I'm not sure why you feel the need to bring this up. The whole point
of the exercise was to bypass BPUT# and GET$# because some of the entries
might be longer than 255 characters. The only reason BPUT# and GET$# were
mentioned was as a better alternative to PRINT# and INPUT#. The code you
quoted was an alternative for strings longer than 255 characters so it
wouldn't be used with GET$# anyway.

Steve Drain

Jun 16, 2012, 6:00:46 AM

On 16/06/2012 08:57, David Holden wrote:

> Frank de Bruijn wrote:
>>> Looked for by what? Just about everything except BBC Basic recognises
>>> 0 as a terminator. Even the BBC Basic GET$# does,
>>
>> In which version of BBC Basic? Certainly not in anything up to 1.48.
>
> Well, from personal experience I know it's true but to quote from the GET$#
> entry in the BBC Basic Reference Manual;
>
> "A string of characters is read until a linefeed (ASCII 10) a carriage
> return (ASCII 13), null character (ASCII 0) or the end of the file is
> encountered or else the maximum of 255 characters is reached."

In the StrongHelp BASIC manual I have documented for a very long time
that the Reference Manual is not correct here. That was checked from the
source.

> Though I'm not sure why you feel the need to bring this up. The whole point
> of the exercise was to bypass BPUT# and GET$# because some of the entries
> might be longer than 255 characters. The only reason BPUT# and GET$# were
> mentioned was as a better alternative to PRINT# and INPUT#. The code you
> quoted was an alternative for strings longer than 255 characters so it
> wouldn't be used with GET$# anyway.

This thread began with the OP saying he had tried my Strings library and
that it did not work with his Beagle board, but I have not yet had any
details.

To recap, the Strings library deals with strings of any length using
BASIC string variables. The equivalent to GET$#, FNsBGET, reads strings
up to the first control character.

There was a StringUtils module published in Acorn User Jan 1995, and you
can get a SH manual for it from my site. Interestingly, String_Get$#
allows you to specify which terminating character you want. It also
seems to be designed so that PRINT# strings can be read.

If anyone cares to wait a week or so, Basalt will do the equivalent of
my Strings library. The code is all written, but it has been quite
tricky deciding on the best API, because I have to work around what
BASIC will allow. ATM the file handling uses only ASCII 0 as a
terminator, which should please you. ;-)

Steve

Frank de Bruijn

Jun 16, 2012, 6:05:13 AM

In article <a42sji...@mid.individual.net>,

David Holden <Spa...@apdl.co.uk> wrote:
> Though I'm not sure why you feel the need to bring this up.

It's something that goes against my own experience so I asked a question
about it. Nothing more. I'm curious which version of BBC Basic does
that. It doesn't work here.

I just checked the StrongHelp manual for Basic and it has this to say
about it:
"The BASIC manual says that a null character (0) is also a terminator,
but it is certain that this is not so in later versions."
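
Given that disagreement, one way to avoid depending on GET$# at all is to read a byte at a time with BGET# and stop on 0 yourself, which also copes with names over 255 characters. A rough sketch; FNreadz and the filename "ZTest" are made up for illustration.

```bbcbasic
REM Sketch: read 0 terminated text of any length into a buffer.
REM FNreadz and "ZTest" are hypothetical, not from any library.
DIM buf% 1023
out%=OPENOUT("ZTest")
BPUT#out%,"The Magical Mystery Tour"; :REM semicolon: no linefeed added
BPUT#out%,0
CLOSE#out%
in%=OPENIN("ZTest")
n%=FNreadz(in%,buf%,1024)
CLOSE#in%
PRINT n%;" bytes read"
END

DEF FNreadz(ch%,b%,max%)
REM Copies bytes up to a 0 or end of file; returns the length stored.
REM At most max%-1 characters are kept so the terminator always fits.
LOCAL n%,c%
c%=-1
WHILE c%<>0 AND NOT(EOF#ch%) AND n%<max%-1
  c%=BGET#ch%
  IF c%<>0 THEN b%?n%=c%:n%+=1
ENDWHILE
b%?n%=0
=n%
```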

> The code you quoted was an alternative for strings longer than 255
> characters so it wouldn't be used with GET$# anyway.

I didn't quote anything. You're mistaking me for somebody else.

Michael

Aug 6, 2012, 4:28:10 AM

> This thread began with the OP saying he had tried my Strings library and
> that it did not work with his Beagle board, but I have not yet had any
> details.

Ah balls, I sent it to my close friend Steve...didn't even bother to reply to me!

I will need to power my board up again to get the errors and dumps for you!

Sorry!

Michael