--
Peter McMurray
Excal...@bigpond.com
--
Kevin Powick
Yeah, I thought I was tired.
What the heck did he just say?
Tony Gravagno wrote:
>Yeah, I thought I was tired.
>What the heck did he just say?
OK, after some coffee here I get it. He flash-compiled a program and
it processed a large block of text faster. That's good news - a
decade late perhaps but good things come to those who wait I guess.
It's a relatively common operation to move data between the host OS
and D3 so I'll post some notes which I hope will be helpful.
Regarding COPY DOS: and similar mechanisms to import data to/from the
host OS, the transfer may or may not be "instantaneous" but the time
to move data to/from D3 is not related to the time taken to parse the
strings within that data. In fact, copying the data into D3 is one
"hit" and parsing it is another, and you will want to minimize both
hits when possible. With %functions (see below) you can import/parse
in one step and that might have made this experience truly as
instantaneous as it could get.
Joseba posted code for using %functions to move data to/from the host
OS. I refer people to it a lot. I recommend developers use the
information to create their own subroutines, and add them to their bag
of tricks, so you can simply pass a file name and quickly read or
write massive dynamic arrays:
http://forums.tigerlogic.com/index.php?showtopic=904
If you are importing data from the host OS via OSFI, using the proper
Hosts driver will convert EOL delimiters for you: CR and/or LF to AM.
Note that this is independent of the OS you're on. So if the file is
LF-delimited, you can use COPY UNIX:/dos/path/file.ext and it will
convert LF to AM even if you're on Windows. If the file is
CRLF-delimited and you COPY UNIX:, note that you're going to see CR:AM
sequences in the resulting data. Use the right driver for the data,
regardless of the current OS.
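To make the CR:AM point concrete, here is a small Python model of the two cases. It is only a sketch: the two functions are hypothetical stand-ins for "a driver that maps LF to AM" and "a driver that maps CRLF to AM", not the real OSFI drivers, and the only D3 fact assumed is that AM is CHAR(254).

```python
AM = "\xfe"  # attribute mark, CHAR(254) in Pick/D3

def lf_only_driver(raw: str) -> str:
    """Hypothetical stand-in for a driver that maps LF to AM only."""
    return raw.replace("\n", AM)

def crlf_driver(raw: str) -> str:
    """Hypothetical stand-in for a driver that maps the CRLF pair to AM."""
    return raw.replace("\r\n", AM)

crlf_text = "one\r\ntwo\r\n"   # CRLF-delimited sample data
lf_text = "one\ntwo\n"         # LF-delimited sample data

# The LF-only model leaves a stray CR in front of every AM on CRLF data.
print(repr(lf_only_driver(crlf_text)))
# Matching the model to the data gives clean attributes.
print(repr(crlf_driver(crlf_text)))
print(repr(lf_only_driver(lf_text)))
```

The takeaway is the same as in the text: pick the conversion that matches the delimiters in the data, not the OS you happen to be running on.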
CONVERT only operates on one character. I believe CONVERT CRLF TO AM
IN REC would actually result in double AMs. Multiple chars are
converted with SWAP(), or better with CHANGE(), because it's a
cross-platform function.
- These sorts of operations are completely independent of Excel, file
extension names, OpenOffice, and other such things, so I have no idea
what those digressions were about. Data is data, full stop.
Joseba also posted the following code in the RD/TL forum. It may be
helpful for others doing import of very large data sets:
execute "qselect unix:/tmp file.ext"
eod=0
loop
   readnext line else eod=1
until eod=1 do
   * add line to a new record,
   * or simply don't add if line is to be deleted
repeat
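For readers who don't have D3 in front of them, the shape of that loop (take one line at a time, keep it or skip it) can be sketched in Python. This is only an analogue of the idea, not a translation of the D3 code: filter_lines and the "#" rule for lines to delete are made up for illustration, and an in-memory StringIO stands in for the host file.

```python
import io

def filter_lines(stream, should_drop):
    """Yield lines from stream one at a time, skipping those to be deleted."""
    for raw in stream:
        line = raw.rstrip("\r\n")      # strip the host EOL, whichever it is
        if not should_drop(line):
            yield line                 # "add line to a new record"

# Hypothetical sample input; a real run would pass an open file instead.
sample = io.StringIO("keep 1\n# drop me\nkeep 2\n")
kept = list(filter_lines(sample, lambda s: s.startswith("#")))
print(kept)
```

Because the lines are consumed one at a time, the whole file never has to sit in a single variable, which is the attraction of the select/readnext approach for very large inputs.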
(Should have more info about this but...) I believe D3 BASIC includes
functions for operating on fixed-length blocks. I don't think I've
used them since the 90's, and they're platform-specific, but they're
worth investigating by anyone doing this sort of work. I don't mean
the Tape block operations, though those could help too.
As a final note: For years I've been trying to get Raining Data /
TigerLogic to publish the API for the OSFI so that we could create our
own OSFI drivers. The benefit of this would be new abilities to
read/write Excel, Word, Web Services, HTTP, FTP, Email, Fax, and many
other document and transaction types using standard Open/Read/Write
statements, and TCL file operations. They never got the idea and
always ask for a business case. I'm not asking them to write anything
new, just asking for the documentation for a protocol that they
advertise as being "Open". If we could write our own drivers then
many of these "how do I get data to/from D3" problems would go away
because it would all be nothing but pure and familiar BASIC. I'd say
there's a good business case there which requires them to do nothing
but let us use the software that they provide. If you understand the
potential of this dormant resource, please file an action item for TL
to publish the API for the OSFI so that third-party developers can
make use of it. I will warn you in advance however that solutions
created with this will be very easy to use, but will almost certainly
not be free or open source.
Tony Gravagno
Nebula Research and Development
TG@ remove.pleaseNebula-RnD.com
My point is that a 2 megabyte file is CONVERTED instantaneously in a program
compiled with option O and takes over a minute when run in a program that is
not optimised. I was prompted to check this because the COPY works so
quickly and there had to be a reason.
> If you are importing data from the host OS via OSFI, using the proper
> Hosts driver will convert EOL delimiters for you: CR and/or LF to AM.
> Note that this is independent of the OS you're on. So if the file is
> LF-delimited, you can use COPY UNIX:/dos/path/file.ext and it will
> convert LF to AM even if you're on Windows. If the file is
> CRLF-delimited and you COPY UNIX:, note that you're going to see CR:AM
> sequences in the resulting data. Use the right driver for the data,
> regardless of the current OS.
If you are uncertain as to the source of the file (for example, a person may
import a Unix file into Excel and then re-export it to your program without
telling you, an actual event), then my trick is the simplest way of doing it.
>
> CONVERT only operates on one character. I believe CONVERT CRLF TO AM
> IN REC would actually result in double AMs. Multiple chars are
> converted with SWAP(), or better with CHANGE(), because it's a
> cross-platform function.
CONVERT does work on multiple characters, by dumping the second one, which is
what I wanted to happen and is as stated in the manual.
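The two readings of CONVERT in this exchange can be compared with a small Python model. This is a sketch of the semantics as described in this thread, not D3 code and not a statement of what the D3 manual says: convert_charwise models a character-by-character mapping in which source characters with no partner in the target string are dropped, and change_substring models a whole-substring replacement in the style of CHANGE(). Both function names are made up for illustration.

```python
AM = "\xfe"          # attribute mark, CHAR(254)
CR, LF = "\r", "\n"

def convert_charwise(src: str, dst: str, text: str) -> str:
    """Map src[i] to dst[i]; src chars with no partner in dst are dropped."""
    table = {}
    for i, ch in enumerate(src):
        table[ord(ch)] = dst[i] if i < len(dst) else None  # None deletes
    return text.translate(table)

def change_substring(old: str, new: str, text: str) -> str:
    """Replace each whole occurrence of old with new."""
    return text.replace(old, new)

crlf_text = "a\r\nb\r\n"
lf_text = "a\nb\n"

# On CRLF data both models give one AM per line end.
print(repr(convert_charwise(CR + LF, AM, crlf_text)))
print(repr(change_substring(CR + LF, AM, crlf_text)))
# On LF-only data they differ: the char-wise model drops every LF,
# while the substring model finds no CRLF pair and changes nothing.
print(repr(convert_charwise(CR + LF, AM, lf_text)))
print(repr(change_substring(CR + LF, AM, lf_text)))
```

Under this model, passing two AMs as the target (AM + AM) maps CR and LF each to an AM, which produces the doubled AMs expected in the earlier post. Which behaviour a given D3 release actually has is best checked against its own documentation.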
> - These sorts of operations are complete independent of Excel, file
> extension names, OpenOffice, and other such things, so I have no idea
> what those digressions were about. Data is data, full stop.
These operations are totally dependent on what is reading them, which is why I
mentioned it. Excel expects a csv to be a comma-separated item and splits
on commas as well as CRLF, but it will allow you to specify how a txt file is
split up. Open Office, on the other hand, will not open a txt file in a
spreadsheet but forces it to a document. It only allows one to read and
specify the split up if you call it csv.
I am aware of the %functions but also somewhat chary of them given the
confusion in the manual. I did not go there in this case because the large
file is a one off by a muddled programmer and the daily file is typically
one thirtieth of the size.
Summary Fact: CONVERT works extremely fast on large items if compiled with
(O). However, you have to jump through the hoop that an O-compiled program
will only run as optimised if called from an optimised program; it falls
back to standard BASIC if called from a standard BASIC compile. The simple
solution in a program that is not optimised is to EXECUTE a COPY of the item
to Windows and COPY it back.
End of my input.
Peter McMurray
No argument. That's what it's for.
> however you have to jump through the hoop that an O Compiled program
>will only run as optimised if called from an optimised program. It falls
>back to standard Basic if called from a standard BASIC compile.
What you're describing is what I explain in page 3 of the following
blog:
nospamNebula-RnD.com/blog/tech/mv/2008/04/d3flash1.html
>Simple solution in a program not optimised is EXECUTE COPY it to
> WINDOWS and COPY it back.
Peter, you're confusing movement of data from OS into D3 with the
processing of the data after it's in D3. FlashBASIC has nothing to do
with a COPY command because COPY is done outside of BASIC. The speed
of copy from Windows to D3 or back will be exactly the same because
you're dealing with the file tier, not BASIC workspace. Your
performance is attained when you're doing string operations on the
dynamic array because now you're operating in a single block of fast
memory with C++ rather than in paged frame space where links need to
be chased by assembler code.
Now, if you are doing the following:
OPEN 'C:/path' TO FV ELSE STOP
READ INFO FROM FV,'filename.ext' ELSE STOP
I could be wrong but I believe the performance of that READ will be
better with Flashed code than with non-flashed because as stated above
INFO is in memory and you're not grabbing a new frame of overflow
every 4k or so. But again, this is not related to this:
EXECUTE "COPY C:/path file.ext":@AM:"(MYFILE"
Part 2 of the above blog discusses some nuances of %functions which I
mentioned earlier:
nospamNebula-RnD.com/blog/tech/mv/d3/2009/06/d3flash2.html
HTH
T
Peter,
COPY is a TCL2 verb
:ed md copy
top
.p
001 vz
002 3]e
003 uz
eoi 003
No Basic code, flashed or otherwise, to be seen
That's right... Blame the readers. :-)
--
Kevin Powick
Peter McMurray
"Kevin Powick" <kpo...@gmail.com> wrote in message
news:48572f69-f2a3-4185...@j28g2000vbl.googlegroups.com...
The website was taken down for a security check at about the time you
were going over it - sorry about that. It's back now.
Peter, why do you argue with people who have been working for years
with something you just discovered? Just learn and ask questions.
Stop professing about stuff you've never used. You do that a lot and
I've grown quite weary of it.
As Ross said, COPY has nothing to do with FlashBASIC. The OSFI driver
converted the characters quickly because that operation is done at the
C/monitor/filesystem level rather than by BASIC.
You believe "debug vanishes"? What does that mean? People debug
running FlashBASIC code all the time - some of the functionality is
different but it's all fully functional.
"The pain with optimise is that it is all or nothing."
Uh, yeah. So? Why do you want to run non-flashed anymore?
Update your verbs COMPILE and :CCOMPILE with the letter "o" in atb 6
and you won't have to manually use the O option again. Your code will
all automatically be flashed, it will all run much faster, and you'll
never know the difference between flashed and unflashed anymore.
Nuff outta me, got a plane to catch.
T
Peter McMurray
"Tony Gravagno" <address.i...@removethis.com.invalid> wrote in
message news:nbgoc5hdbil5f4kb7...@4ax.com...
The copy switch (O) means overwrite.
If you run your program the first time the overflow part of the file
has to be allocated while the subsequent runs have the overflow space
allocated already.
Run from scratch by re-creating the files and probably you'll see no
difference between Flashed and non-Flashed code.
The compile method should have no bearing over executed or chained
programs as they are outside the scope of current code.
Lucian
I don't suppose you recall that I was QA Manager and later Product
Manager for the software that we're discussing? I don't make this
crap up, I do have some clue.
The OSFI driver is not related to the compilation method of code
invoking it. A simple proof is that you can use all OSFI functions
from TCL, entirely outside of BASIC.
Please feel free to contact your support provider or TL Support
directly, and ask them to explain how it works. Until then, please
stop insisting that you know the internals of this stuff when you just
discovered it a couple days ago! C'mon, how rational is that?
T
>> EXECUTE "COPY DOS:D:/ ":RECKEY:" (O)"
>
>The copy switch (O) means overwrite.
>If you run your program the first time the overflow part of the file
>has to be allocated while the subsequent runs have the overflow space
>allocated already.
Respectfully, that's not correct either. An Overwrite option works
faster than a copy without overwrite because the code that determines
if the item already exists is skipped and the write is simply
performed. As far as allocating overflow, a new item with overflow is
written to disk and then a new pointer item is written to the actual
frame space. This is what update protection does for us anyway. If
you write two items with the same ID and the exact same number of
bytes, and overflow protection is off, then I believe you are correct
that the same frames are overwritten rather than new frames being
retrieved from overflow. So many details...
>Run from scratch by re-creating the files and probably you'll see no
>difference between Flashed and non-Flashed code.
With the exception of data that's already in memory I think that
should be about right.
>The compile method should have no bearing over executed or chained
>programs as they are outside the scope of current code.
>
>Lucian
Well, he is correct that flashed code must call to flashed code and
unflashed must call to unflashed. See page 3 of the mentioned blog.
> Well, he is correct that flashed code must call to flashed code and
> unflashed must call to unflashed. See page 3 of the mentioned blog.
This is an "execute" not a "call".
"Execute" just passes control to another program and works regardless
of program type, be it Basic, Proc, Access or a non-PICK program as in
EXECUTE CHAR(255):"sh"
Lucian
No disagreement. You are correct in what you are saying, as he was
correct in what he was saying about this specific point: In addition
to Executing a copy where flash has no bearing, he discovered that
flash must call to flash and non-flash must call to non-flash.
T
However I find it quite amazing that people who one would believe are
analysts by trade seem incapable of understanding a simple statement of
fact.
It is extremely unlikely that a system that is absolutely based on string
handling would have multiple low level versions of code to do the same job.
At no time did I say that the conversion of text was done at the OSFI level.
I stated the opposite: that in my opinion the switch of characters from CRLF
to AM would be done after the block was moved, and therefore in the D3
environment. I believe that this is the same routine that optimised code
would use.
If anyone has actual knowledge of the details of the operation I am willing
to listen.
With regard to the previous poster's comments I did it as an EXECUTE because
that provides a separate environment without requiring the Basic to be
optimised. I did that because the client managed to get himself into a hole
with other suppliers and I had to dig him out ASAP.
Peter McMurray
<address.i...@removethis.com.invalid> wrote in message
news:p50rc5tkl1ptbcber...@4ax.com...