
Make TStringlist.DelimitedText ignore spaces


David Sampson

Mar 6, 2007, 2:03:57 AM
I am designing a system that must process many text files with different
file structures. Some are fixed length, some comma delimited, some are HL7
(for those who know what that is) and many others.

For those files which are not fixed length (which is most of them) I want to
use the "DelimitedText" property of the TStringlist class to parse each
line. The idea is to tell TStringlist what the current delimiter is then
simply assign the string to the DelimitedText property.


For example:
var
  lclList: TStringList;
const
  SrcStr: String = 'MSH|12459|972-596-5432|San Diego|California';

// assume the TStringList has been duly instantiated
lclList.Delimiter := '|'; // the delimiter for this file is the vertical bar
lclList.DelimitedText := SrcStr;

This should return the following:
lclList[0] = 'MSH'
lclList[1] = '12459'
lclList[2] = '972-596-5432'
lclList[3] = 'San Diego'
lclList[4] = 'California'

But... instead it returns:
lclList[0] = 'MSH'
lclList[1] = '12459'
lclList[2] = '972-596-5432'
lclList[3] = 'San'
lclList[4] = 'Diego' *** note this being split into two strings
lclList[5] = 'California'

This is because (as you probably already know) in addition to the delimiter
I tell it to use, it will always use a space as well.

I know this is a common problem (I've been dealing with it for years)... I
don't know why Borland hasn't given us, at least, an option to turn the
treatment of the space as a delimiter on or off.

I could rewrite this routine to ignore the space but I would certainly
prefer finding a way to make TStringlist work.

Gary Williams

Mar 6, 2007, 2:52:46 AM

David Sampson wrote:
> For those files which are not fixed length (which is most of them) I want
> to use the "DelimitedText" property of the TStringlist class to parse each
> line. The idea is to tell TStringlist what the current delimiter is then
> simply assign the string to the DelimitedText property.


I know your pain.

In frustration, long ago I wrote the following code to work around the
problem. I'm certain I could devise a more elegant approach if I were to do
it again today, but so far this has worked for me.

-Gary


procedure Split(const Rec: String; const Fields: TStrings;
  const Delimiter: Char);
var
  I: Integer;
  J: Integer;
  Temp: String;
  OldDelimiter: Char;
  TempSL: TStrings;
begin
  if (Fields is TStringList) and (TStringList(Fields).Sorted) then
  begin
    // Items of a sorted list cannot be assigned by index, so parse into a
    // temporary unsorted list first and copy the result over.
    TempSL := TStringList.Create;
    try
      Split(Rec, TempSL, Delimiter);
      Fields.Clear;
      Fields.AddStrings(TempSL);
    finally
      TempSL.Free;
    end;
  end
  else
  begin
    // DelimitedText (like CommaText) annoyingly treats spaces as delimiters
    // if they are not surrounded by double-quotes.

    Temp := Rec;

    // Hide the spaces behind a placeholder character before parsing.
    for I := 1 to Length(Temp) do
      if (Temp[I] = ' ') then
        Temp[I] := #255;

    OldDelimiter := Fields.Delimiter;
    Fields.Delimiter := Delimiter;
    Fields.DelimitedText := Temp;
    Fields.Delimiter := OldDelimiter;

    // Put the spaces back into each parsed field.
    for I := 0 to (Fields.Count - 1) do
    begin
      Temp := Fields[I];

      for J := 1 to Length(Temp) do
        if (Temp[J] = #255) then
          Temp[J] := ' ';

      Fields[I] := Trim(Temp);
    end;
  end;
end;

function Concatenate(const Fields: TStrings; const Delimiter: Char): String;
var
  I: Integer;
  Temp: String;
  NeedsQuotes: Boolean;
begin
  Assert(Delimiter <> '"');
  Assert(Delimiter <> ' ');
  Assert(Delimiter <> #255);

  Result := '';
  for I := 0 to (Fields.Count - 1) do
  begin
    Temp := Fields[I];

    // A field containing the delimiter or a quote has to be wrapped in
    // quotes; doubled quotes are only collapsed again inside a quoted field.
    NeedsQuotes := (Pos(Delimiter, Temp) <> 0) or (Pos('"', Temp) <> 0);

    if (Pos('"', Temp) <> 0) then
      Temp := StringReplace(Temp, '"', '""', [rfReplaceAll]);

    if NeedsQuotes then
      Temp := '"' + Temp + '"';

    if (I > 0) then
      Result := Result + Delimiter;

    Result := Result + Temp;
  end;
end;

procedure ParseCSVRecord(const Rec: String; const Fields: TStrings);
begin
  Split(Rec, Fields, ',');
end;

// Both BuildCSVRecord versions share a name, so each needs "overload".
function BuildCSVRecord(const Fields: TStrings): String; overload;
begin
  Result := Concatenate(Fields, ',');
end;

function BuildCSVRecord(const Fields: array of String): String; overload;
var
  Temp: TStringList;
  I: Integer;
begin
  Temp := TStringList.Create;
  try
    for I := 0 to High(Fields) do
      Temp.Add(Fields[I]);

    Result := Concatenate(Temp, ',');
  finally
    Temp.Free;
  end;
end;

Mark Patterson

Mar 6, 2007, 3:28:52 AM
David Sampson wrote:
>
> // assume the TStringlist has been duly instantiated
> lclList.Delimiter := '|'; // the delimiter for this file is the vertical

Here is a form with a couple of simple procedures to convert between
TStrings and '|' delimited strings:

First the pascal unit:

****************************************************************

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Edit1: TEdit;
    Memo1: TMemo;
    procedure Edit1Change(Sender: TObject);
    procedure Memo1Change(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure StrToStrings(s: string; list: TStrings);
var i: integer;
begin
  list.Clear;
  repeat
    i := Pos('|', s);
    if i = 0 then begin
      list.Add(s);
      break;
    end{if};
    list.Add(Copy(s, 1, i - 1));
    s := Copy(s, i + 1, MaxInt);
  until false;
end;// StrToStrings


function StringsToStr(list: TStrings): string;
var i: integer;
begin
  result := '';
  for i := 0 to list.Count - 1 do begin
    if i > 0 then begin
      result := result + '|';
    end{if};
    result := result + list[i];
  end{for};
end;// StringsToStr


procedure TForm1.Edit1Change(Sender: TObject);
var LOnChange: TNotifyEvent;
begin
  // LOnChange := Memo1.OnChange;
  // Memo1.OnChange := nil;
  StrToStrings(Edit1.Text, Memo1.Lines);
end;

procedure TForm1.Memo1Change(Sender: TObject);
var LOnChange: TNotifyEvent;
begin
  LOnChange := Edit1.OnChange;
  Edit1.OnChange := nil;
  Edit1.Text := StringsToStr(Memo1.Lines);
  Edit1.OnChange := LOnChange;
end;

end.

*****************************************************************

Next the DFM:

*****************************************************************

object Form1: TForm1
  Left = 173
  Top = 200
  Width = 445
  Height = 202
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  DesignSize = (
    437
    173)
  PixelsPerInch = 96
  TextHeight = 13
  object Edit1: TEdit
    Left = 5
    Top = 6
    Width = 425
    Height = 21
    Anchors = [akLeft, akTop, akRight]
    TabOrder = 0
    OnChange = Edit1Change
  end
  object Memo1: TMemo
    Left = 4
    Top = 32
    Width = 425
    Height = 137
    Anchors = [akLeft, akTop, akRight, akBottom]
    TabOrder = 1
    OnChange = Memo1Change
  end
end
*****************************************************************

That should give you the ability to test and refine the functions if
they're not quite what you're after.

--
Mark Patterson
www.piedsoftware.com

JD

Mar 6, 2007, 7:02:50 AM

"David Sampson" <sam...@directlink.net> wrote:
>
> [...] but I would certainly prefer finding a way to make
> TStringlist work.

If you have control over how the files are generated and are
willing to change their formats, TStringList::CommaText is the
way to go.

Instead of formatting each record in the file as:

"MSH|12459|972-596-5432|San Diego|California"

format it like:

"MSH","12459","972-596-5432","San Diego","California"

Then you can use TStringList::LoadFromFile to load the file
into one TStringList and then use a second TStringList and
its CommaText property to parse any of the Strings (records)
from the file. It works every time, no matter what's between
the quotes, with one exception: if the substring contains a
double quote, that quote must be escaped by doubling it.

I don't know if Delphi has it but CBuilder has a method that
will do that for you (AnsiQuotedStr). For example (saving a
TStringGrid in this format):

void __fastcall TForm1::GridSaveToFileClick(TObject *Sender)
{
    if( SaveDialog1->Execute() )
    {
        TStringGrid* pGrid = StringGrid1;
        char Quote = '"';
        TStringList *pList = new TStringList;

        for( int Row = pGrid->FixedRows; Row < pGrid->RowCount; ++Row )
        {
            String tmpString = "";
            for( int Col = pGrid->FixedCols; Col < pGrid->ColCount; ++Col )
            {
                tmpString += AnsiQuotedStr( pGrid->Cells[ Col ][ Row ], Quote );
                if( Col < pGrid->ColCount - 1 ) tmpString += ",";
            }
            pList->Add( tmpString );
        }
        pList->SaveToFile( SaveDialog1->FileName );
        delete pList;
    }
}

~ JD
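
For Delphi readers: AnsiQuotedStr is also available in Delphi's SysUtils
unit. A rough sketch of the same quote-every-field idea, round-tripped
through CommaText, might look like this. BuildQuotedRecord is a hypothetical
helper name chosen for illustration, not part of the RTL.

```pascal
program QuotedCsvDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: joins Fields with commas, quoting every field.
// AnsiQuotedStr wraps the field in quotes and doubles any embedded quote.
function BuildQuotedRecord(Fields: TStrings): string;
var
  I: Integer;
begin
  Result := '';
  for I := 0 to Fields.Count - 1 do
  begin
    if I > 0 then
      Result := Result + ',';
    Result := Result + AnsiQuotedStr(Fields[I], '"');
  end;
end;

var
  Source, Parsed: TStringList;
  I: Integer;
begin
  Source := TStringList.Create;
  Parsed := TStringList.Create;
  try
    Source.Add('MSH');
    Source.Add('San Diego');   // embedded space
    Source.Add('say "hi"');    // embedded quotes

    // Parse the quoted record back; quoted fields keep their spaces.
    Parsed.CommaText := BuildQuotedRecord(Source);

    for I := 0 to Parsed.Count - 1 do
      Writeln(I, ': ', Parsed[I]);
  finally
    Parsed.Free;
    Source.Free;
  end;
end.
```

Because every field is quoted, the parser never sees a bare space, so
'San Diego' should come back as a single item.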

Chris Morgan

Mar 6, 2007, 9:16:21 AM
>I am designing a system that must process many text files with different
>file structures. Some are fixed length, some comma delimited, some are HL7
>(for those who know what that is) and many others.
>
> For those files which are not fixed length (which is most of them) I want
> to use the "DelimitedText" property of the TStringlist class to parse each
> line. The idea is to tell TStringlist what the current delimiter is then
> simply assign the string to the DelimitedText property.
>

Hi,

D2006 has the TStringList.StrictDelimiter property, which turns the use of
a space as a delimiter on or off. It was introduced at some time after D6,
since I had written a TStringList descendant class to do exactly this.

Cheers,

Chris
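
A minimal sketch of how StrictDelimiter could be applied to the original
example, assuming D2006 or later. SplitStrict is a hypothetical helper name
used only for illustration.

```pascal
program StrictDelimiterDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: splits Line on Delim only, so spaces inside a
// field are preserved.
procedure SplitStrict(const Line: string; Delim: Char; Fields: TStrings);
begin
  Fields.Delimiter := Delim;
  // Set StrictDelimiter before assigning DelimitedText; it only affects
  // parsing that happens after it is set.
  Fields.StrictDelimiter := True;
  Fields.DelimitedText := Line;
end;

var
  List: TStringList;
  I: Integer;
begin
  List := TStringList.Create;
  try
    SplitStrict('MSH|12459|972-596-5432|San Diego|California', '|', List);
    // With StrictDelimiter on, 'San Diego' stays a single item.
    for I := 0 to List.Count - 1 do
      Writeln(I, ': ', List[I]);
  finally
    List.Free;
  end;
end.
```

Note that QuoteChar is still honoured in strict mode, so fields wrapped in
double quotes are unquoted as before.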


NovaKane

Mar 6, 2007, 4:56:21 PM
On Mar 6, 8:16 am, "Chris Morgan" <chris.nospam at lynxinfo.co.uk>
wrote:

Thanks, Chris. I was having the same problem. StrictDelimiter works
great.

Iain Macmillan

Mar 6, 2007, 10:36:15 PM
In article <45ed...@newsgroups.borland.com>, "Gary Williams"
<gray...@gmail.com> wrote:

> I know your pain.
>
> In frustration, long ago I wrote the following code to work around the
> problem. I'm certain I could devise a more elegant approach if I were to do
> it again today, but so far this has worked for me.

Wouldn't it have been shorter to add " to the beginning and end of the text,
and replace every | with "|" ??
;)
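
For versions without StrictDelimiter, the quoting idea above could be
sketched roughly as follows. QuoteFields is a hypothetical helper, and the
sketch assumes the delimiter is never the double quote itself and that any
quotes already in the data are meant literally.

```pascal
program QuoteFieldsDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: wraps every Delim-separated field of Line in double
// quotes so the parser keeps embedded spaces. Assumes Delim <> '"'.
function QuoteFields(const Line: string; Delim: Char): string;
begin
  // Double any quotes already in the data so they survive as literals.
  Result := StringReplace(Line, '"', '""', [rfReplaceAll]);
  // Turn every delimiter into "<delim>" and close off both ends.
  Result := '"' +
    StringReplace(Result, Delim, '"' + Delim + '"', [rfReplaceAll]) + '"';
end;

var
  List: TStringList;
  I: Integer;
begin
  List := TStringList.Create;
  try
    List.Delimiter := '|';
    List.QuoteChar := '"';
    List.DelimitedText :=
      QuoteFields('MSH|12459|972-596-5432|San Diego|California', '|');
    for I := 0 to List.Count - 1 do
      Writeln(I, ': ', List[I]);
  finally
    List.Free;
  end;
end.
```

One difference from plain DelimitedText worth keeping in mind: an empty
input line becomes '""', which parses as one empty field rather than none.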

JED

Mar 6, 2007, 10:42:05 PM
Chris Morgan wrote:

> This was introduced at some time after D6, since I had
> written a TStringList descendent class to do exactly this.

This was new in D2006

http://jedqc.blogspot.com/2005/12/d2006-new-strictdelimiter-property.html

--
TJSDialog - TaskDialog for other operating systems:
http://www.jed-software.com/jsd.htm
Visual Forms IDE Add In: http://www.jed-software.com/vf.htm

Blog: http://jedqc.blogspot.com

Gary Williams

Mar 7, 2007, 4:05:40 AM
Chris Morgan wrote:
> D2006 has the TStringList.StrictDelimiter property to turn on or off using
> a space as a delimiter. This was introduced at some time after D6, since I
> had written a TStringList descendent class to do exactly this.

I'm surprised that I hadn't heard about this property before now. Now I can
go back and simplify a lot of code! Thank you for mentioning the property.

-Gary

