Ben Knight
St. George Consulting Group, Inc.
P.S. Can anyone (read Peter Below) tell me where the ByteOrderMark := 65279
comes from?

procedure TMainForm.Open1Click(Sender: TObject);
var
  s : String;
  wc : array of WideChar;
  fs : TFileStream;
  ByteOrderMark : Word;
  x : integer;
begin
  If OpenDialog1.Execute then
  begin
    ByteOrderMark:=65279;
    fs:=TFileStream.Create(OpenDialog1.FileName, fmOpenRead);
    try
      // Read the leading 2-byte mark, then the rest as 16-bit characters.
      fs.ReadBuffer(ByteOrderMark, 2);
      SetLength(wc, (fs.Size-2) div 2);
      if Length(wc) > 0 then
        fs.ReadBuffer(wc[0], Length(wc)*SizeOf(WideChar));
      // High(wc) is the last valid index; "fs.Size-1 div 2" parses as
      // fs.Size-(1 div 2) and runs past the end of the array.
      for x:=0 to High(wc) do s:=s+wc[x];
      RichEdit1.Text:=s;
    finally
      fs.free
    end;
  end;
end;

65279 is $FEFF in hexadecimal, the Unicode byte-order mark. Starting a Unicode
text file with it is the recommended practice, documented in win32.hlp (topic
"Byte-order Mark"). Its purpose is to handle Unicode files written on other
platforms that use a different byte order (big-endian instead of the Intel
little-endian). If you read the first word of a Unicode file and see $FFFE
instead of $FEFF, you know that you have to swap the bytes in each word you
read to get a valid WideChar for your platform. Delphi has a Swap function
that performs this byte-order switch.
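
As a small illustration of the point about Swap, here is a minimal console
sketch. The program name and the two constants (BOM_NATIVE, BOM_SWAPPED) are
made up for this example and are not part of the original routine; it only
assumes the standard Swap and IntToHex routines.

```pascal
program BomSwapDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  BOM_NATIVE  = $FEFF;  // 65279 decimal: the mark in this platform's byte order
  BOM_SWAPPED = $FFFE;  // the same mark when the file's byte order is reversed

var
  w: Word;
begin
  // Pretend this is the first word read from a file written big-endian.
  w := BOM_SWAPPED;

  if w = BOM_SWAPPED then
    // Swap exchanges the high and low byte of a 16-bit value.
    w := Word(Swap(w));

  if w = BOM_NATIVE then
    WriteLn('Mark after swapping: $', IntToHex(w, 4))
  else
    WriteLn('No byte-order mark found');
end.
```

The same cast-and-swap would have to be applied to every following word of
such a file, not only to the mark itself.
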

So if you read a Unicode file you have to be prepared to deal with three
cases: files that have the correct byte-order mark for your platform, files
whose bytes need to be swapped, and files that have no byte-order mark at all
(since it is recommended but not enforceable, of course).
Your routine could be modified to deal with this as follows (untested):

Procedure SwapBytesInWideString( Var ws: WideString );
var
  i: Integer;
begin
  // Swap works on integer types, so each character is cast to Word and back.
  for i:= 1 to Length( ws ) do
    ws[i] := WideChar( Swap( Word( ws[i] )));
end;

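procedure TMainForm.Open1Click(Sender: TObject);
var
  ws : WideString;
  fs : TFileStream;
begin
  If OpenDialog1.Execute then
  begin
    fs:=TFileStream.Create(OpenDialog1.FileName, fmOpenRead);
    try
      SetLength( ws, fs.Size div 2 );
      // Skip empty files; read whole characters only, so an odd trailing
      // byte cannot overrun the buffer.
      If Length( ws ) > 0 Then
      begin
        fs.ReadBuffer( ws[1], Length( ws ) * SizeOf( WideChar ));
        If ws[1] = #$FFFE Then
          SwapBytesInWideString( ws );  // mark arrived byte-swapped
        If ws[1] = #$FEFF Then
          Delete( ws, 1, 1 );           // strip the mark before display
      end;
      RichEdit1.Text:= ws;
    finally
      fs.free
    end;
  end;
end;
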
Peter Below (TeamB) 10011...@compuserve.com)
No replies in private e-mail, please, unless explicitly requested!