What encoding does VBScript use for string variables?
Single byte or double byte?
I'm using VBScript in the WSH environment.
Thanks,
Vilius
The best discussion of variable typing in VBScript is in the Scripting
Guide. For example:
http://www.microsoft.com/technet/scriptcenter/guide/sas_vbs_eves.mspx
Use the Table of Contents at the left in the page to navigate to more
information.
--
Richard Mueller
MVP Directory Services
Hilltop Lab - http://www.rlmueller.net
--
Looking again, I mis-read your question. I'm out of town and don't have my
references, but I believe VBScript handles strings one byte per character.
But VBScript can use ADO, where you can specify drivers that support other
encodings.
sOmega = ChrW( &H03C9 )
MsgBox sOmega & "," & UCase( sOmega )
to make sure of this. Depending on the codepage and font used by
your console,
WScript.Echo sOmega & "," & UCase( sOmega )
may look less nice.
Your source code may be written in UTF-16 or in the encoding of your
standard regional settings.
cscript <yourfile>.vbs
should work for both cases.
Data/text files can be in either format too. There are (optional)
format parameters for functions like OpenTextFile, CreateTextFile, or
OpenAsTextStream.
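As a minimal sketch of that format parameter, the following writes one line as Unicode and reads it back with the same flag. The constant names are declared by hand because VBScript does not predefine them, and the file name is hypothetical.

```vbscript
Option Explicit
' OpenTextFile constants, declared here because VBScript does not predefine them.
Const ForReading = 1, ForWriting = 2
Const TristateTrue = -1   ' open the file as Unicode (UTF-16)

Dim fso, ts, sPath, sLine
Set fso = CreateObject("Scripting.FileSystemObject")
' 2 = temporary folder; "omega-demo.txt" is a hypothetical file name.
sPath = fso.BuildPath(fso.GetSpecialFolder(2).Path, "omega-demo.txt")

' Write a line containing a non-ANSI character, creating the file if needed.
Set ts = fso.OpenTextFile(sPath, ForWriting, True, TristateTrue)
ts.WriteLine "omega: " & ChrW(&H03C9)
ts.Close

' Read the line back using the same format flag.
Set ts = fso.OpenTextFile(sPath, ForReading, False, TristateTrue)
sLine = ts.ReadLine
ts.Close

MsgBox sLine & vbCrLf & "Last code point: &H" & Hex(AscW(Right(sLine, 1)))
```

Writing the same string without the Unicode flag would be expected to fail or lose the character on a code page that has no omega, which is one way to see why the flag matters.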
To process UTF-8 data, you'll have to resort to extensions, e.g. the
ADODB.Stream component.
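A rough sketch of the ADODB.Stream approach, assuming ADO is installed (it normally is on Windows). The function name ReadUtf8File and the file path are hypothetical, and the numeric constants are declared by hand since VBScript has no built-in ADO constants.

```vbscript
Option Explicit
' ADO enum values, declared here because VBScript has no built-in ADO constants.
Const adTypeText = 2
Const adReadAll = -1

' Hypothetical helper: returns the contents of a UTF-8 file as a VBScript string.
Function ReadUtf8File(sPath)
    Dim stm
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"      ' decode the file's bytes as UTF-8
    stm.Open
    stm.LoadFromFile sPath
    ReadUtf8File = stm.ReadText(adReadAll)
    stm.Close
End Function

Dim sText
sText = ReadUtf8File("C:\temp\utf8-sample.txt")   ' hypothetical path
MsgBox sText
```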
Best regards,
Frits de Boer
ActiveXperts Software B.V.
http://www.activexperts.com
Dim s
s = "word"
MsgBox TypeName(s) & vbCrLf & VarType(s)
' Returns "String" and 8
It is kind of complex. You can only see the content of those strings by
writing them to the screen with a message box or a WScript.Echo statement,
which forces the string through the locale filter; what you see can look
quite different from what you think is in the string. You can also display
the Asc() and AscW() values for each character. I have not been able to
resolve in my mind what the "true" encoding is within VBScript
strings.
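One way to probe this, as a minimal sketch: compare Len and LenB for a string and list the Asc and AscW values of each character. The sample string is an assumption chosen for illustration, and the Asc column depends on the current locale.

```vbscript
Option Explicit
Dim s, i, ch, sMsg
s = "A" & ChrW(&H03C9)   ' sample string: "A" plus Greek small omega (assumed for illustration)

' Len counts characters; LenB counts the bytes used to store the string.
sMsg = "Len = " & Len(s) & ", LenB = " & LenB(s)

For i = 1 To Len(s)
    ch = Mid(s, i, 1)
    ' AscW reports the stored code unit; Asc goes through the locale's code page.
    sMsg = sMsg & vbCrLf & i & vbTab & "AscW = " & AscW(ch) & _
        vbTab & "Asc = " & Asc(ch)
Next
MsgBox sMsg
```

If LenB comes out as twice Len, that suggests two bytes of storage per character, whatever the message box happens to display.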
The URL:
http://msdn.microsoft.com/en-us/library/aa212305(office.11).aspx
titled:
HTML Character Sets
has a gap from &#128; through &#159;:
}     &#125;   ---       Right curly brace
~     &#126;   ---       Tilde
---   &#127;   ---       Unused
      &#160;   &nbsp;    Nonbreaking space
¡     &#161;   &iexcl;   Inverted exclamation
¢     &#162;   &cent;    Cent sign
The character set code points within this gap are now 'undefined', or
seemingly inconsistently defined in the scripting regular expression engine.
Some of these code points have a kind of duality, partially dependent on the
'locale' that CScript/WScript is running under.
Try the following short script, which demonstrates this duality: the Asc
and AscW values for some characters are different, which further clouds
the issue of what encoding is really used for the characters in the
string:
Dim i, sMsg
' List code, character, Asc and AscW for each value in the 128-159 gap.
For i = 128 To 159
    sMsg = sMsg & vbCrLf & i & vbTab & Chr(i) & vbTab & _
        Asc(Chr(i)) & vbTab & AscW(Chr(i))
Next
MsgBox sMsg
In the 1082 locale (Maltese), the Asc and AscW values for all characters are
the same, and the Asc value can be greater than 255; this boggles my mind.
The scripting help file script56.chm talks a little about the use of its
locale functions like SetLocale.
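A rough sketch of how the comparison might be repeated under a different locale, assuming SetLocale influences how Chr and Asc map the 128 to 159 range (the observations above suggest it does, but that is not verified here). The LCID 1033 (English, United States) is an arbitrary choice.

```vbscript
Option Explicit
Dim lcidOld, sMsg

' SetLocale returns the previous locale, so it can be restored afterwards.
lcidOld = SetLocale(1033)   ' 1033 = English (United States), chosen arbitrarily

sMsg = "Locale " & GetLocale() & ": Asc = " & Asc(Chr(128)) & _
    ", AscW = " & AscW(Chr(128))

SetLocale lcidOld           ' restore the original locale
MsgBox sMsg
```

Running it with several LCIDs and comparing the two values is one way to see how much of the duality comes from the locale.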
-Paul Randall