
VBScript string encoding?


Vilius Mockūnas

Oct 11, 2009, 10:20:31 AM
Hello,

What encoding does VBScript use for string variables?
Single byte or double byte?
I'm using VBScript in the WSH environment.

thanks
Vilius


Richard Mueller [MVP]

Oct 11, 2009, 11:29:43 AM

"Vilius Mockūnas" <v_moc...@yahoo.com> wrote in message
news:%23DuCg3n...@TK2MSFTNGP06.phx.gbl...

The best discussion of variable typing in VBScript is in the Scripting
Guide. For example:

http://www.microsoft.com/technet/scriptcenter/guide/sas_vbs_eves.mspx

Use the Table of Contents at the left in the page to navigate to more
information.

--
Richard Mueller
MVP Directory Services
Hilltop Lab - http://www.rlmueller.net
--


Richard Mueller [MVP]

Oct 11, 2009, 12:10:10 PM

"Richard Mueller [MVP]" <rlmuelle...@ameritech.nospam.net> wrote in
message news:uMAyteoS...@TK2MSFTNGP06.phx.gbl...

Looking again, I mis-read your question. I'm out of town and don't have my
references, but I believe VBScript handles strings one byte per character.
But VBScript can use ADO, where you can specify drivers that support other
encodings.

ekkehard.horner

Oct 11, 2009, 12:30:05 PM
Vilius Mockūnas wrote:
[...]

> What encoding vbscript uses for string vars ?
> Single byte, double byte ?
> I'm using vbscript in wsh environment.
[...]
Internally, strings are stored as sequences of 16-bit Unicode (UTF-16)
code units. You can use code like

sOmega = ChrW( &H03C9 )
MsgBox sOmega & "," & UCase( sOmega )

to make sure of this. Depending on the codepage and font used by
your console,

WScript.Echo sOmega & "," & UCase( sOmega )

may look less nice.

Your source file may be saved as UTF-16 or in the encoding of your
standard regional settings.

cscript <yourfile>.vbs

should work for both cases.

Data/text files can be read and written in both formats too. There are
(optional) parameters for this on methods like OpenTextFile,
CreateTextFile, or OpenAsTextStream.
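
A minimal sketch of those optional parameters, assuming the standard Scripting.FileSystemObject. The file name demo_utf16.txt is hypothetical and only for illustration. It writes one character to a Unicode (UTF-16) file and reads it back:

```vbscript
Option Explicit

' Hypothetical file name, used only for illustration.
Const DEMO_FILE = "demo_utf16.txt"
Const ForReading = 1
Const TristateTrue = -1   ' format argument: open the file as Unicode

Dim fso, ts, sText
Set fso = CreateObject("Scripting.FileSystemObject")

' CreateTextFile(filename, overwrite, unicode): the third argument
' set to True requests a Unicode file instead of an ANSI one.
Set ts = fso.CreateTextFile(DEMO_FILE, True, True)
ts.WriteLine ChrW(&H03C9)
ts.Close

' OpenTextFile(filename, iomode, create, format): pass TristateTrue
' so the file is read back as Unicode.
Set ts = fso.OpenTextFile(DEMO_FILE, ForReading, False, TristateTrue)
sText = ts.ReadLine
ts.Close

' Show the code of the character that was read back.
WScript.Echo Hex(AscW(sText))
```

Leaving the last argument out of either call falls back to the ANSI default, which is why characters outside the current code page can get lost without it.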

To process utf-8 data, you'll have to resort to extensions, e.g. the
ADODB.Stream component.
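
A rough sketch of the ADODB.Stream approach, assuming ADO is installed (it normally is on Windows). The file name input_utf8.txt is a hypothetical placeholder:

```vbscript
Option Explicit

' Hypothetical input file, used only for illustration.
Const UTF8_FILE = "input_utf8.txt"
Const adTypeText = 2   ' stream holds text rather than binary data

Dim stm, sText
Set stm = CreateObject("ADODB.Stream")
stm.Type = adTypeText
stm.Charset = "utf-8"      ' decode the file content as UTF-8
stm.Open
stm.LoadFromFile UTF8_FILE
sText = stm.ReadText       ' reads the whole stream by default
stm.Close

' sText is now an ordinary VBScript string.
MsgBox sText
```

Writing works the same way in reverse: set Charset, call WriteText, then SaveToFile.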

msnews.microsoft.com

Oct 11, 2009, 3:13:57 PM
A BSTR holds wide (16-bit) characters. But note that it starts with a
4-byte length prefix, and its length is taken from that prefix rather than
from a terminating null, so unlike ANSI C strings it can contain embedded
null characters.
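
A small sketch that makes both points visible from script, using only the built-in Len and LenB functions:

```vbscript
Dim s

' Len counts characters, LenB counts bytes: two bytes per character.
s = "word"
WScript.Echo Len(s) & " characters, " & LenB(s) & " bytes"

' An embedded null character does not end the string,
' because the length comes from the prefix.
s = "a" & Chr(0) & "b"
WScript.Echo Len(s) & " characters"
```

The first line reports 4 characters and 8 bytes, the second reports 3 characters.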

Best regards,

Frits de Boer
ActiveXperts Software B.V.
http://www.activexperts.com

"Vilius Mockūnas" <v_moc...@yahoo.com> wrote in message
news:%23DuCg3n...@TK2MSFTNGP06.phx.gbl...

mayayana

Oct 11, 2009, 8:50:20 PM
In addition to the other info. already provided,
you should know that there are only objects
and variants in VBS. So what's thought of as
a string is actually a variant of subtype string.

Dim s
s = "word"
MsgBox TypeName(s) & vbCrLf & VarType(s)
' Returns "String" and 8

Paul Randall

Oct 11, 2009, 11:50:45 PM

"Vilius Mockūnas" <v_moc...@yahoo.com> wrote in message
news:%23DuCg3n...@TK2MSFTNGP06.phx.gbl...

It is kind of complex. You can only see the content of those strings by
writing them to the screen with a message box or WScript.Echo statement,
which forces the string to be filtered through the Locale filter; what you
see can seem far different from what you think is in the string. You can
also display the Asc() and AscW() functions for each character. I have not
been able to resolve in my mind what the "True" encoding is within VBScript
strings.

The URL:
http://msdn.microsoft.com/en-us/library/aa212305(office.11).aspx
titled:
HTML Character Sets
has a gap from &#128 through &#159:

}    &#125;  ---      Right curly brace
~    &#126;  ---      Tilde
---  &#127;  ---      Unused
     &#160;  &nbsp;   Nonbreaking space
¡    &#161;  &iexcl;  Inverted exclamation
¢    &#162;  &cent;   Cent sign

The character set code points within this gap are now 'undefined', or
seemingly inconsistently defined in the scripting regular expression engine.
Some of these code points have a kind of duality, partially dependent on the
'locale' that CScript/WScript is running under.

Try the following short script, which kind of demonstrates this duality: the
Asc and AscW values for some characters are different, which further clouds
the issue of what encoding really is used for the characters in the string:

Dim i, sMsg
' Columns: code, character, Asc value, AscW value
For i = 128 To 159
    sMsg = sMsg & vbCrLf & i & vbTab & Chr(i) & vbTab & _
        Asc(Chr(i)) & vbTab & AscW(Chr(i))
Next
MsgBox sMsg

In the 1082 locale (Maltese), the Asc and AscW values for all characters are
the same, and the Asc value can be greater than 255; this boggles my mind.
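
One way to separate the two effects, offered as a sketch rather than a full explanation: Chr(i) appears to translate i through the ANSI code page of the current locale, while ChrW(i) takes i directly as a 16-bit character code. The assumption that the code page is what causes the difference is mine, not something the help file states in these terms. Comparing the two side by side:

```vbscript
Dim i, sMsg
' Columns: code, AscW of Chr(i), AscW of ChrW(i)
For i = 128 To 159
    sMsg = sMsg & vbCrLf & i & vbTab & _
        AscW(Chr(i)) & vbTab & AscW(ChrW(i))
Next
MsgBox sMsg
```

The last column always equals i, since ChrW and AscW round-trip the character code. The middle column is the one that depends on the locale the script runs under.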

The scripting help file script56.chm talks a little about the use of its
locale functions like SetLocale.

-Paul Randall

