A Caché of Tips: Streams

Emily Haggstrom

Apr 9, 2012, 10:35:41 PM
to intersy...@googlegroups.com
Streams are a great way of representing data in your application. They can be used to represent binary data like images, or very long character data. 

There are four types of stream classes Caché provides for this purpose:

%Stream.GlobalCharacter
%Stream.GlobalBinary
%Stream.FileCharacter
%Stream.FileBinary

Caché also supports an older set of stream APIs in the %Library package for legacy applications, but new development should use the %Stream package.

Whether it is a Global stream or a File stream refers to where the data is stored. Global streams are stored as global data inside the Caché database, while File streams are stored as OS level files. The methods you can use to operate on them are largely the same.

For a simple example, let's look at a program that creates and writes to a new file:

 FS="%Stream.FileCharacter"->%New()
 SC=FS->LinkToFile("C:\temp\myFile.txt")
 IF SC#1 THEN CRT "ERROR CREATING FILE"
 SC=FS->Write("This is some text")
 IF SC#1 THEN CRT "ERROR WRITING TO FILE"
 SC=FS->%Save()
 IF SC#1 THEN CRT "ERROR SAVING FILE" 

This shows a few key methods. The LinkToFile() method is useful for tying an OS file to a filestream object (either Character or Binary). You can then read from or write to the file. The file need not exist at the time you call the method (although the directory it's in must exist).

This also shows the Write() method. This writes a string of characters (or data) to the stream. You can also use the WriteLine() method, which writes the given string plus an endline character. The WriteLine() method should not be used with binary streams, since line terminators have no meaning in a binary file.

At this point, the filestream exists only as an object in memory. You need to call %Save() to write the stream to disk.

All of these methods return a status code as their result. They will return 1 on success, or some error code on failure. It's a good practice to check these status codes before moving on with your application.
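
If you want to see why a call failed rather than just that it failed, the status can be turned into readable text. The following is only a sketch: it assumes that class methods of %SYSTEM.Status can be called from MVBasic with the same "ClassName"->Method() syntax used for %New() above, and that GetErrorText() accepts the status value as returned. MSG is a variable name made up for the illustration.

 FS="%Stream.FileCharacter"->%New()
 SC=FS->LinkToFile("C:\temp\myFile.txt")
 IF SC#1 THEN
  * Assumption: GetErrorText() converts the status into a readable message
  MSG="%SYSTEM.Status"->GetErrorText(SC)
  CRT "ERROR CREATING FILE: ":MSG
  STOP
 END

The same pattern can follow any of the calls that return a status, such as Write() or %Save().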

Both write methods start writing at the current pointer location. For a file that has just been opened, that is always the beginning of the file, so the write will overwrite any existing contents. You can use this boolean property to check whether the pointer is at the end of a file:

IF FS->AtEnd#1 THEN ...

And you can use this method to move to the end of a file:

FS->MoveToEnd()

So, let's modify the first example to open and append a statement to a log file. This will create the file if it doesn't exist, and if it does, it will make sure the pointer is at the end of it before inserting a new line:

 FS="%Stream.FileCharacter"->%New()
 SC=FS->LinkToFile("C:\temp\myFile.txt")
 IF SC#1 THEN CRT "ERROR CREATING FILE"
 FS->MoveToEnd()
 SC=FS->WriteLine("This is some text")
 IF SC#1 THEN CRT "ERROR WRITING TO FILE"
 SC=FS->%Save()
 IF SC#1 THEN CRT "ERROR SAVING FILE" 

The next example covers reading from a file. Since there is a limit to how long a string can be in Caché (large though it may be), it is wise to read the data in manageable chunks:

 FS="%Stream.FileCharacter"->%New()
 SC=FS->LinkToFile("C:\temp\Input.txt")
 IF SC#1 THEN CRT "ERROR OPENING FILE"
 LOOP UNTIL FS->AtEnd=1
  LEN=100 ; * reset on every pass, because ReadLine() overwrites LEN
  LINE=FS->ReadLine(LEN,SC,EOL)
  IF SC#1 THEN CRT "ERROR READING FILE"
  CRT LINE
 REPEAT

The Read() and ReadLine() methods return the chunk of text they've read, as opposed to a status code. Instead, the status code is returned by reference as the second argument of the method. They take a length parameter as the first argument, which defines how big a chunk will be read. On return, this variable is set to the number of characters actually read, which is why the loop resets it on every pass. The third parameter of ReadLine() is the end-of-line flag. It is set to 1 if the read stopped at a line terminator, and to 0 if it stopped because it hit the length limit or the end of the stream. It does not tell you that the stream is finished, so to loop over the whole stream, test the AtEnd property as shown above.

If you ever need to return to the beginning of the stream, you can use the Rewind() method to set the pointer to the start of the stream.
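
When line boundaries don't matter, Read() can be used the same way. Here is a rough sketch that counts the characters in a file by reading fixed-size chunks and then rewinds for a second pass. The chunk size of 32000 and the variable names TOTAL and CHUNK are arbitrary choices for the illustration, not anything the stream classes require.

 FS="%Stream.FileCharacter"->%New()
 SC=FS->LinkToFile("C:\temp\Input.txt")
 IF SC#1 THEN CRT "ERROR OPENING FILE"
 TOTAL=0
 LOOP UNTIL FS->AtEnd=1
  LEN=32000 ; * requested chunk size; Read() replaces it with the count actually read
  CHUNK=FS->Read(LEN,SC)
  IF SC#1 THEN CRT "ERROR READING FILE"
  IF LEN>0 THEN TOTAL=TOTAL+LEN
 REPEAT
 CRT "READ ":TOTAL:" CHARACTERS"
 SC=FS->Rewind() ; * the pointer is back at the start, ready for another pass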

As I mentioned, there is a variable size limit in Caché of 3.5 MB. In the case of CMQL statements with very large output, this can be a problem. To work around this, you can use the EXECUTE statement with an output variable. Here's an example:

 FS = "%Stream.FileCharacter"->%New()
 EXECUTE "LIST BIGFILE ID-SUPP A1" OUTPUT FS

At this point, the pointer is at the end of the stream after writing the output. Use Rewind() to reset the pointer to the beginning; you can then process the data as normal using Read() and ReadLine().
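
Putting those two steps together, a sketch of capturing a report and then walking through it might look like this. It assumes the captured output is split into lines that ReadLine() recognizes with the stream's default line terminator; the chunk size of 1000 and the names COUNT and LINE are made up for the example.

 FS="%Stream.FileCharacter"->%New()
 EXECUTE "LIST BIGFILE ID-SUPP A1" OUTPUT FS
 SC=FS->Rewind()
 COUNT=0
 LOOP UNTIL FS->AtEnd=1
  LEN=1000 ; * reset on every pass, because ReadLine() overwrites LEN
  LINE=FS->ReadLine(LEN,SC,EOL)
  IF EOL THEN COUNT=COUNT+1 ; * count only reads that ended at a line terminator
 REPEAT
 CRT COUNT:" LINES CAPTURED"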

One more useful method you should know about is the CopyFrom() method. CopyFrom() can be used to copy any stream to another stream. This is particularly useful for saving streams in your database to OS files, and vice versa. Take for example an image saved as a %Stream.GlobalBinary object, GS. You can save it to a local file using this code:

 FS="%Stream.FileBinary"->%New()
 SC=FS->LinkToFile("C:\temp\myImage.jpg")
 IF SC#1 THEN CRT "ERROR CREATING FILE"
 SC=FS->CopyFrom(GS)
 IF SC#1 THEN CRT "ERROR COPYING STREAM"
 SC=FS->%Save()
 IF SC#1 THEN CRT "ERROR SAVING FILE"

That's it for this tip. For more information, check out this Chapter in the documentation, and don't forget to have a look at the class reference for the four stream classes to see what other methods and properties are available for your use.

Bill Farrell

Apr 10, 2012, 8:33:49 AM
to intersy...@googlegroups.com

Hi Emily,

Streams are wonderful creatures that can speed up a lot of programming tasks.  Out in the wild (especially in communications routines), there’s a lot of something like:

SomeDynamicArray = ""
Loop
    ; * do some code
    SomeDynamicArray< -1 > = ResultOfComputation
Repeat

A global character stream can help speed that up.  To create a dynamic array of some result (like gathering packets/chunks/globs of data from a TCP/IP connection) you could do something like:

ResultFromConnection = "%Stream.GlobalCharacter"->%New()
ResultFromConnection->LineTerminator = @fm ; * create a dynamic array by terminating each “line” with an attribute mark

Loop
    ; * do some computation
    sc = ResultFromConnection->WriteLine( ResultOfComputation ) ; * or use …->Write() to add to the current attribute without adding a new @fm
Repeat

Later, when you want to retrieve the result you could do:

ResultFromConnection->Rewind()
SomeDynamicArray = ResultFromConnection->Read()

 

We all learned (or should have learned, heh) that doing DynArray<-1>’s gets slower with each successive iteration.  A global character stream will allow you to gather results in a loop and tack more data onto the end (as you might have done with <-1> ) without losing any speed.  Setting the line terminator to a field mark lets you have a familiar dynamic array at the end of your computation.

I’ve begun to use streams a good deal in my code where I have a TCP/IP connection open and I’m gathering and evaluating chunks of data as it arrives from the line.  Using a stream instead of a dynamic array dramatically speeds up handling and response time.

 

Bill


Emily Haggstrom

Apr 10, 2012, 9:01:41 AM
to intersy...@googlegroups.com

That’s a neat trick! I hadn’t thought of using a field mark as a line terminator. Thanks for sharing!

 


Jason Warner

Apr 10, 2012, 11:33:06 AM
to intersy...@googlegroups.com
Emily,

I did not know that you could use OUTPUT to pipe data to a stream. That is very cool and could help generate some really quick and dirty reports. Thanks for the tip.

Jason

Lee Burstein

Apr 10, 2012, 1:34:28 PM
to <intersystems-mv@googlegroups.com>
We added the EXECUTE ... OUTPUT syntax variation some time ago because CAPTURING a CMQL report will likely exceed the 3.5 MB limit for a single variable.

Lee H. Burstein
Technical Trainer
InterSystems
Office: 302-477-0180
Cell: 302-345-0810
