Suggested Alternative Unicode Implementation (for Rudy+ misc others)

Jolyon Smith

Mar 6, 2008, 11:01:28 PM

Without going into the whys and wherefores behind this post, here for
the benefit of anyone that missed it before, is one idea for an
alternative Unicode implementation in Tiburon that would avoid many of
the pitfalls that the implementation (as currently described) is going
to encounter (and cause).

The suggestion/idea:

Extend String RTTI (for the purposes of this post, RTTI here refers to
the runtime properties of a string, i.e. Length and Reference Count).

String RTTI would be extended to include encoding information. For
access efficiency this would likely be a 32-bit value.

Some encoding values would be reserved for specific system
interpretation, representing:

UTF8
UTF16
UTF32
ANSI (system cp)

Remaining values would identify a specific codepage of an ANSI encoded
string.

i.e. at the implementation level, there would continue to be only one
actual "type" of string, but the formal type of any given instance of a
String would include its encoding.
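
As a rough model of the idea above (Python is used only as a neutral modelling language; none of this is Delphi RTL), the string header could be pictured as length, reference count and an encoding tag next to the payload. The tag values, `TaggedString`, and `codec_for` are assumptions made for illustration. The numbers happen to be the Windows code page identifiers for these encodings, but the post does not say which values would actually be reserved.

```python
from dataclasses import dataclass

# Hypothetical tag values (Windows code page identifiers are borrowed here;
# the proposal itself does not specify any numbers).
ENC_ANSI_SYSTEM = 0      # "ANSI (system cp)"
ENC_UTF16 = 1200         # UTF-16, little-endian
ENC_UTF32 = 12000        # UTF-32, little-endian
ENC_UTF8 = 65001

_RESERVED = {ENC_UTF8: "utf-8", ENC_UTF16: "utf-16-le", ENC_UTF32: "utf-32-le"}


def codec_for(encoding: int, system_codepage: int = 1252) -> str:
    """Map an encoding tag to a Python codec name."""
    if encoding in _RESERVED:
        return _RESERVED[encoding]
    if encoding == ENC_ANSI_SYSTEM:
        encoding = system_codepage   # assumed system codepage for the sketch
    return f"cp{encoding}"           # any other value names an ANSI codepage


@dataclass
class TaggedString:
    """One string representation; the encoding travels in the header."""
    payload: bytes
    encoding: int
    ref_count: int = 1

    @property
    def length(self) -> int:
        # Counted in bytes here; the post does not say what Length would count.
        return len(self.payload)

    def text(self) -> str:
        return self.payload.decode(codec_for(self.encoding))


russian = TaggedString("Привет".encode("cp1251"), encoding=1251)
utf8 = TaggedString("Привет".encode("utf-8"), encoding=ENC_UTF8)
print(russian.text() == utf8.text())   # same text, different payloads
```

The point of the sketch is only that two instances of the same underlying type can carry different payloads and still be interpreted correctly, because the tag says how to read the bytes.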

There would exist, for the purposes of declarations in code:

UTF8String
UTF16String
UTF32String
ANSIString

and

String

String would "map" to one of the string types based on a project
setting. i.e. for an existing application one would most likely choose
to continue with String => ANSIString, but for a new application one
could choose to map String to the UTF encoding of Unicode most
appropriate to that application's needs.

RTL support for strings would be extended to incorporate appropriate,
implicit transcodings. For ANSI => Unicode these would be lossless. For
Unicode => ANSI the compiler could emit a warning.

Specific transcoding support would provide the means for addressing such
warnings if it were not desirable to simply disable that warning in a
project.

e.g. given that the VCL would be fully Unicode

var
  s: ANSIString;   // or String, where String => ANSIString

s := Edit1.Text;   // WARN: Implicit conversion from Unicode to ANSI

The warning could be addressed by either:

- Changing the declaration of 's' to any Unicode string type
(UTF8, 16, 32)

or

- Utilising an explicit transcode:

s := UnicodeToANSI(Edit1.Text);
or
s := UnicodeToANSI(Edit1.Text, cp1251); // etc

or

- Disabling the warning in the project options (likely to be
acceptable for the majority of existing ANSI applications)

Note that explicit transcoding for ANSI=>Unicode is not required (in
order to address warnings), since such transcoding could be lossless:
both the specific codepage of the source and the required UTF encoding
of the destination are available in the RTTI, so no warning would be
needed:

e.g.

Edit1.Text := s; // Edit1.Text is UTF16, codepage of ANSI s
// is in RTTI. Compiler silently injects RTL
// transcoding for lossless conversion
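
The asymmetry between the two directions can be sketched in a few lines. This is a Python model, not Delphi; `transcode` is a hypothetical helper, and it detects loss at run time from the actual data, whereas the proposal above talks about a compile-time warning.

```python
def transcode(payload: bytes, src_codec: str, dst_codec: str):
    """Re-encode payload from src_codec to dst_codec; return (bytes, lossy)."""
    text = payload.decode(src_codec)
    try:
        return text.encode(dst_codec), False
    except UnicodeEncodeError:
        # At least one character has no slot in the destination codepage.
        return text.encode(dst_codec, errors="replace"), True


ansi = "Привет".encode("cp1251")

# ANSI with a known codepage -> UTF-16: nothing is lost.
utf16, lossy_up = transcode(ansi, "cp1251", "utf-16-le")

# UTF-16 -> a codepage without Cyrillic: characters degrade to "?".
latin, lossy_down = transcode(utf16, "utf-16-le", "cp1252")

print(lossy_up, lossy_down)
```

Converting the UTF-16 payload back to cp1251 round-trips exactly, which is why only the Unicode => ANSI direction needs attention, and only when the target codepage cannot hold the text.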

In general, the only encoding characteristic of a string that may be
changed would be the codepage of an ANSI string.

It would not be possible to otherwise change the encoding of a string
"in place". Attempting to do so, or attempting operations that rely on
it being possible, would result in a compilation error:

i.e.

var
s: UTF8String;

s := UnicodeToANSI(Edit1.Text); // ERROR: Incompatible types

That's covered the basics I think. I'm running out of time (now gone
5pm on a Friday afternoon and I have to go collect my daughters from
after school care).

IANACW, so I would prefer it if people commenting on the idea could
concentrate on the idea and NOT on nitpicking about what is or isn't
"RTTI", what is or isn't an "encoding", what is or isn't "transcoding"
etc etc.

If any questions arise from inappropriate use of such terminology kindly
restrict comments on that score to clarifying for others, if such
clarification is genuinely needed.

Enjoy,

Jolyon Smith

Pavel S

Mar 7, 2008, 12:47:27 AM

Seconded.
This is the solution I would expect from CodeGear: new possibilities for
those who need them, backward compatibility for all.

I am unable to understand how they could propose such a poor solution,
giving up Delphi's biggest asset: excellent backward compatibility.
Micro$oft virus in action ?
Bad joke ?

Pavel

"Jolyon Smith" <jsm...@deltics.co.nz> wrote in message
news:MPG.223b85438...@newsgroups.borland.com...

Eric Grange

Mar 7, 2008, 3:24:33 AM

> Note that explicit transcoding for ANSI=>Unicode is not required (in
> order to address warnings) since such transcoding could be lossless

Wrong. The codepage of an ANSI string isn't related to the source or
destination encoding; it's related to the ANSI string itself, and where
it comes from (which is neither system nor source code dependent; there
is no way to guess it, it has to be intelligently specified by the developer).

Such an assumption is the reason for the DFM transcoding bugs in the Delphi IDE.
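
The objection is easy to demonstrate: the same bytes are valid text under more than one codepage, and nothing in the bytes says which one was meant. A minimal sketch in Python (used here only for illustration; the sample bytes are made up):

```python
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])   # six bytes, no label attached

as_cyrillic = raw.decode("cp1251")   # read as Windows Cyrillic
as_western = raw.decode("cp1252")    # read as Windows Western European

print(as_cyrillic)
print(as_western)
```

Both decodes succeed without error, so a converter that guesses the wrong codepage produces plausible-looking but wrong Unicode text rather than a failure.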

Eric

Michael C.

Mar 7, 2008, 3:24:19 AM

Pavel S wrote:
> Seconded.
> This is a solution I would expect from CodeGear : new possibilities for
> those who need it, backward compatibility for all.

I agree with this sentiment.

Kryvich

Mar 7, 2008, 3:43:05 AM

Jolyon Smith wrote:

> Without going into the whys and wherefores behind this post, here for
> the benefit of anyone that missed it before, is one idea for an
> alternative Unicode implementation in Tiburon that would avoid many of
> the pitfalls that the implementation (as currently described) is going
> to encounter (and cause).

Such an approach will add unnecessary runtime type checks to every
single string manipulation. Do you want to get the .NET String class in
native applications?

I think CodeGear has chosen the optimal path for the transition to a
Unicode Delphi.

Arthur Hoornweg

Mar 7, 2008, 4:45:03 AM

Kryvich wrote:

> I think CodeGear has chosen an optimal variant of transition to the
> Unicode Delphi.

I disagree. I fear that CodeGear will now have to bundle Delphi 2007
with Delphi 2008 for compatibility's sake. It would be catastrophic if
people weren't able to buy a compiler that's compatible with
legacy projects.

--
Arthur Hoornweg

(In order to reply per e-mail, please just remove the ".net"
from my e-mail address. Leave the rest of the address intact
including the "antispam" part. I had to take this measure to
counteract unsolicited mail.)

Paul Scott

Mar 7, 2008, 5:14:25 AM

On Fri, 07 Mar 2008 08:24:19 -0000, Michael C. <Mic...@Anonymous.net>
wrote:

Me three!

--
Paul Scott
Information Management Systems
Macclesfield, UK.

Paul Scott

Mar 7, 2008, 5:47:03 AM

On Fri, 07 Mar 2008 09:11:44 -0000, Marco Caspers <Hexor...@Vaxor.Com>
wrote:

> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.
>
> For new users of the next version there is no problem at all since they
> don't have any projects written in older versions of Delphi it isn't an
> issue that everything is unicode.

New users will indeed have no problems - That's great for them!

But for existing users AND CodeGear there may be negative consequences:

Existing users who DO still decide to upgrade, would have to install both
systems.
a) Even more disk space needed (a problem for those whose main development
machines are laptops)
b) A timebomb waiting to explode when you open/compile the wrong project
with the wrong IDE
c) Doubts that CodeGear will continue to support (Bug-Fix and VistaTweak)
the current system

Existing users who DO NOT decide to upgrade (thinking that this hassle may
be too great) are the problem for CodeGear.

Dave Nottage [TeamB]

Mar 7, 2008, 5:02:29 AM

Jolyon Smith wrote:

> Extend String RTTI (for the purposes of this post, RTTI here refers
> to the runtime properties of a string, i.e. Length and Reference
> Count).
>
> String RTTI would be extended to include encoding information. For
> access efficiency this would likely be a 32-bit value.
>
> Some encoding values would be reserved for specific system
> interpretation, representing:
>
> UTF8
> UTF16
> UTF32
> ANSI (system cp)
>
> Remaining values would identify a specific codepage of an ANSI
> encoded string.

Interesting idea. Perhaps they're thinking of something like this
anyway? AFAICT, there has been little or no detail about implementation.

--
Dave Nottage [TeamB]

Chris Rolliston

Mar 7, 2008, 6:04:55 AM

(Genuine questions:) what does PChar map to? And more to the point,
how do WinAPI calls work? By which I mean, one benefit of CodeGear's
intended approach is that existing code that casts to and from
PChar/string for the purpose of calling a WinAPI function will still
work (given 'string' becomes UTF16, PChar becomes PWideChar, and all
the APIs get mapped to the W versions). Given PChar types can't have
the 'RTTI' you propose for string, it appears to me that sometimes a
PChar cast will work on your proposal and sometimes not, or so it
seems. Maybe I'm missing something obvious though?

Richie B.

Mar 7, 2008, 8:05:42 AM

> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.

That "old" project is our cash cow. We don't have any other. We certainly
cannot go to the new compiler due to the breaking changes!

So we are forced by CodeGear NOT to buy the new Delphi IDE/compiler.

Sad to see a 20+ person company being dependent on the Delphi 2007 compiler
for years to come. We are certainly disappointed in CodeGear (and
that's an understatement).

Richie

OBones

Mar 7, 2008, 8:22:10 AM

What I find interesting is the ongoing and raging debate over which
method is best and why CodeGear is wrong. But unless I'm grossly
mistaken, nobody from CodeGear has said that they would use any specific
method, nor has anyone speaking here had a look at an implementation.
I do appreciate the flow of ideas coming around, but what I don't get is
why so many people say that they are disappointed by a solution they
have not yet been able to see...

RandomAccess

Mar 7, 2008, 9:09:45 AM

"Marco Caspers" <Hexor...@Vaxor.Com> wrote in message
news:47d1...@newsgroups.borland.com...
> ...

> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.

You can't be serious!!!

best regards

Kostya

Mar 7, 2008, 9:12:08 AM

> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.

1. The application is sold to a client
with source code so they can
maintain and customize it
themselves.

2. Extra developers are hired to add features
to the product.

These situations happen very often, and
both become impossible if the older compiler
is no longer supplied. One more reason not
to rely on CG as a serious vendor.

Kostya

Mar 7, 2008, 9:15:39 AM

OBones wrote:
> But unless I'm grossly
> mistaken, nobody from CodeGear said that they would use any specific
> method

I think that you are "grossly mistaken". Somebody
from CG stated in his blog that there will be
no "compiler switch" and the code would have
to be manually changed where necessary to
work correctly

Kostya

Mar 7, 2008, 9:47:01 AM

I think your solution could lead
to slower string operations and that
is not always a good "feature". It also
leaves the ANSI to Unicode case open,
because codepage information may be missing.

I think I have a couple of workable solutions.

Solution 1:

1. There should be 2 explicit types:
ANSIString and UnicodeString.
2. There should be a directive with
unit-wide scope, something like
{$DEFAULTSTRINGTYPE=ANSIString} or
{$DEFAULTSTRINGTYPE=UnicodeString}.
3. There should be a project-wide directive
of the same nature.

So for explicit types the compiler
knows how to deal with the strings,
and for the implicit type (String) the
compiler should follow the unit
scope directive or, if that is missing,
the project-wide directive, and if
that one is missing too, ask the user to
supply one before compiling.
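
The lookup order described above can be sketched as a small function. This is a Python model for illustration only; `resolve_default_string_type` and its parameters are hypothetical names, and `DEFAULTSTRINGTYPE` is the directive suggested in this post, not an existing compiler option.

```python
VALID_TYPES = {"ANSIString", "UnicodeString"}


def resolve_default_string_type(unit_directive=None, project_directive=None):
    """Decide what plain 'String' means for one unit.

    The unit-scope directive wins; the project-wide directive is the
    fallback; with neither present, compilation is refused.
    """
    for scope, value in (("unit", unit_directive), ("project", project_directive)):
        if value is None:
            continue
        if value not in VALID_TYPES:
            raise ValueError(f"unknown string type in {scope} directive: {value}")
        return value
    raise LookupError("no DEFAULTSTRINGTYPE directive: supply one before compiling")


print(resolve_default_string_type("ANSIString", "UnicodeString"))
print(resolve_default_string_type(None, "UnicodeString"))
```

Explicitly typed declarations never reach this lookup; it only applies to the implicit `String` type.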

Solution 2:

Much the same as Solution 1, but
leave out the project-wide directive,
make the unit scope directive mandatory,
and refuse to compile a unit until one
is supplied.

While not perfect, both of these solutions
would, at least in my case, make for a happy
transition, and I think they should be almost
trivial for CG to implement, unless their
project is already in such a state that it is
too late to change anything, in which case it
is a shame.

Arthur Hoornweg

Mar 7, 2008, 9:54:54 AM

Marco Caspers wrote:

> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.

But what about your customers? What if you sell a product including
sourcecode? Are you going to tell them, "sorry, but the sourcecode isn't
going to compile on legally available Delphi versions?"

I have millions of lines of Ansi sourcecode in legacy applications plus
third-party libraries (in source code) worth many thousands of $$$.

Arthur Hoornweg

Mar 7, 2008, 10:13:08 AM

Paul Scott wrote:

> Existing users who DO NOT decide to upgrade (thinking that this hassle
> may be too great) are the problem for CodeGear.

I'm already looking forward to having this conversation with our
management.

Q: Will this Delphi upgrade save us time?
A: No, it will involve a complete review of all our source code before we
can think about compiling our existing projects with it.

Q: Will it save us money, then?
A: No, in fact we will have to upgrade every single component library
we own. Some suppliers, such as Turbopower, no longer exist, so
we will have to edit their source code ourselves.

Q: Will it improve our software, then?
A: For new projects, certainly.
For our existing projects, we expect some timebombs in every unit.

Paul Scott

Mar 7, 2008, 10:18:20 AM

On Fri, 07 Mar 2008 15:13:08 -0000, Arthur Hoornweg
<antispam...@casema.nl.net> wrote:

>> Existing users who DO NOT decide to upgrade (thinking that this hassle
>> may be too great) are the problem for CodeGear.
>
> I'm already looking forward to having this conversation with our
> management.

...

I am already having the same conversation with my co-directors.

Arthur Hoornweg

Mar 7, 2008, 10:21:39 AM

Marco Caspers wrote:

> But if you start to use the information that has already been given
> from today on, you can plan ahead for the future that will come.

Thing is, the Delphi DFM editor gets new properties in every
new release. The consequence is that if you open and save
a form, the application will throw exceptions if you compile
and run it with a previous version of Delphi.

Assuming that Delphi 2008 also introduces new properties in
its components (which I think is a reasonable assumption), this will
make it very difficult to gradually port applications to the new
situation. Just *using* the new IDE would introduce timebombs.

Arthur Hoornweg

Mar 7, 2008, 10:29:39 AM

Chris Rolliston wrote:
> (Genuine questions:) what does PChar map to? And more to the point,
> how do WinAPI calls work? By which I mean, one benefit of CodeGear's
> intended approach is that existing code that casts to and from
> PChar/string for the purpose of calling a WinAPI function will still
> work (given 'string' becomes UTF16, PChar becomes PWideChar, and all
> the APIs get mapped to the W versions).

This may indeed work for the WinAPI because CodeGear takes care of
the A/W DLL mapping in unit Windows. However, if you use third-party
DLLs, you'll have to manually change all header declarations.
PChar becomes PAnsiChar and the corresponding strings become
AnsiStrings.

If I were forced to convert all my existing projects in one
day, I would do a global search/replace on my hard drive in
all *.pas, *.inc and *.dpr files.
I would change "string" into "AnsiString", "PChar" into "PAnsiChar"
and "Char" into "AnsiChar".
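
A whole-word, case-insensitive replace along those lines might look like the sketch below (Python, as an illustration only; `pin_to_ansi` is a made-up name). It is deliberately naive: it would also rewrite matches inside comments and string literals, and short-string declarations such as `string[20]`, so the output would still need review.

```python
import re

# Pascal identifiers are case-insensitive, so match regardless of case.
ANSI_NAMES = {"string": "AnsiString", "pchar": "PAnsiChar", "char": "AnsiChar"}
_PATTERN = re.compile(r"\b(string|pchar|char)\b", re.IGNORECASE)


def pin_to_ansi(source: str) -> str:
    """Replace whole-word string/PChar/Char with their explicit ANSI types."""
    return _PATTERN.sub(lambda m: ANSI_NAMES[m.group(1).lower()], source)


before = "var s: String; p: PChar; c: char; w: WideString;"
after = pin_to_ansi(before)
print(after)
```

The word boundaries keep existing `WideString` and `AnsiString` declarations untouched, so running the replace twice does no further damage.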

Arthur Hoornweg

Mar 7, 2008, 10:30:18 AM

Kostya wrote:

> I think that you are "grossly mistaken". Somebody
> from CG stated in his blog that there will be
> no "compiler switch" and the code would have
> to be manually changed where necessary to
> work correctly

I read that too.

Wayne Niddery (TeamB)

Mar 7, 2008, 11:08:25 AM

"Kostya" <thanks@but_no_thanks.com> wrote in message
news:47d14dd8$1...@newsgroups.borland.com...

>
> I think that you are "grossly mistaken". Somebody
> from CG stated in his blog that there will be
> no "compiler switch" and the code would have
> to be manually changed where necessary to
> work correctly

But what everyone here is assuming, without any foundation yet, is to what
extent "where necessary" will come into play. I think it would be grossly
premature, not to mention without precedent, to assume that this means CG is
going to do essentially nothing to minimize migration issues. It would make
far more sense to interpret "where necessary" to mean: whatever pieces CG
cannot handle for you in the most sensible way possible.

There will no doubt be some issues, but I would suggest everyone hold off
on this argument - and accusations about wrong decisions - until it becomes
more clear on what the actual issues will be and to what extent they will
affect existing code.

--
Wayne Niddery - TeamB (www.teamb.com)
Winwright, Inc. (www.winwright.ca)

Richie B.

Mar 7, 2008, 11:16:03 AM

> The current proposed change is more significant that previous changes,
> yes. It will require extra work, definitly.

Yes, at least 6000 places where we use string and PChar like they are one
byte a character. Not to forget all our third party components, some of
which are older and might be maintained by us. Not an easy thing to do in
this case.

> But if you start to use the information that has already been given
> from today on, you can plan ahead for the future that will come.

We could. The CodeGear solution is worthless to us. We cannot go to 2
bytes per character since it will double our memory consumption (already a
few times 10 GB!).

But again, it is so much work and we gain nothing. We have to do all this
work just to keep our software running the same! We can't make a business
doing that.

> Not at all, that is a descision that you make yourself based on your
> beliefs and needs.

You are talking like you know our situation better than we do.

> No-one from CodeGear has you at gunpoint telling you that you cannot
> buy their product.

That's true. Delphi 2008+ is worthless to us, so we stick with 2007. On the
UI side we already did some stuff in C#/.NET and maybe this CG decision will
accelerate the UI becoming a full C#/.NET product.

> Will it cause extra effort, yes.
> Will it cost extra money, certainly.
> Who will pay for it?
> Ultimately? Your customers.
> Is this a bad thing? No not at all, they pay for something you provide.
> If that something requires more effort, then it's perfectly reasonable
> to ask more money for it.

Again, we have to put effort into the product just to keep it working the same
as before! So I should ask my customer for money so I can keep up with CG's
newest compiler functions without delivering him extra functionality?

I think CodeGear made a big gamble here. I think there are a lot of big
Win32 Delphi products that won't be able to make the switch to the new
compiler due to the amount of work and the risk. I also think that a large
part of the CG Delphi income is based on these "old" systems being developed.
So if most "old" system developers can't make the jump to Delphi 2008 it
might be another nail in Delphi's coffin (like losing all .NET developers
before).

Richie.

Mar 7, 2008, 12:53:45 PM

On 7 Mar 2008 02:11:44 -0700, "Marco Caspers" <Hexor...@Vaxor.Com>
wrote:

>I disagree, those that have older projects written in Delphi will have
>most likely also older versions of Delphi with which they can maintain
>the old project.
>

>For new users of the next version there is no problem at all since they
>don't have any projects written in older versions of Delphi it isn't an
>issue that everything is unicode.

Do you throw away all of your existing code every time you start a new
project?
People with years worth of developed and debugged code need to be able
to continue to use that code with the next version of Delphi.

Markus.Humm

Mar 7, 2008, 1:03:03 PM

Hello,

does M$ still supply VB6?

Greetings

Markus

Richie B.

Mar 7, 2008, 12:56:10 PM

> but I don't see the value in getting upset over *assumptions* based on
> premature information and/or lack of it.

Mar 7, 2008, 3:42:48 PM

Jolyon Smith wrote:

>
> Without going into the whys and wherefores behind this post, here for
> the benefit of anyone that missed it before, is one idea for an
> alternative Unicode implementation in Tiburon that would avoid many
> of the pitfalls that the implementation (as currently described) is
> going to encounter (and cause).
>
>
> The suggestion/idea:
>
>
> Extend String RTTI (for the purposes of this post, RTTI here refers
> to the runtime properties of a string, i.e. Length and Reference
> Count).

If I understand this right, this would mean that the encoding (UTF8,
UTF16, etc.) is stored with the string, as an extra hidden field, just
like length or reference count. I get that, so far.

But the data, the payload, for each string would be different, right?
This would mean RTL functions for each type of payload. It would also
mean that code compiled with "string = UTF8String" interfacing with
code compiled with "string = UTF16String" would constantly have to
convert back and forth? Since you call it RTTI, it would also mean that
code using strings constantly checks the type/encoding of the string
and then calls the appropriate RTL routines?

Dunno, but I think that doing the break once is much better. Most code,
unless explicitly declared as AnsiString, will be UnicodeString. There
will be a lot less converting to and fro going on. Of course the data
can/should still have the encoding stored with the text data, but there
should only be one "generic" string type, which will apparently be
UnicodeString.

Please tell me if I misunderstood something.

--
Rudy Velthuis [TeamB] http://www.teamb.com

"War doesn't make boys men, it makes men dead." -- Ken Gillespie

Wayne Niddery (TeamB)

Mar 7, 2008, 5:01:14 PM

"Richie B." <firs...@lastname.com> wrote in message

news:47d1...@newsgroups.borland.com...

>
> Changing "string" is enough for us. It will simply double our memory
> consumption and make our solution = product (having a huge amount of data,
> more than 10 GB, in memory for quick querying) simply impossible. It takes
> too much time to change and test all our strings/pchars.

Fair enough. Your product would, I think, be an exception in needing that
much memory - it definitely seems extreme. For you, clearly, the best
solution would be a simple checkbox to cause String to map to AnsiString.
I'm sure this would be preferable for many anyway, even where memory usage
is not a problem, if it means they need to make few, if any, changes in
code.

But once again we don't know what they may have in mind for this yet.

On the one hand, they are making this change because they know for a fact
that *many* of their customers need it and have been asking for it for a
very long time already, as well as it is crucial if they hope to keep the
product attractive going forward.

On the other hand, you can be sure they *absolutely* want to avoid doing
something that will block a significant number of current customers from
upgrading - they've already been dealing with that issue and have worked
very hard to get themselves back to a position with a product sufficient in
both quality and features to finally attract a large number of customers
that have not upgraded for many years now. Somehow I doubt they *want* to
start that over yet again; they know very well that a majority of their
current Delphi business depends on existing customers.

So, while it is always possible they may fumble this in some way, I really
think it's too early and unfair to assume that to be the case. Also, as
usual, they won't be able to satisfy everyone no matter what they do, they
have to do what they think will satisfy the majority as well as being
feasible for them. That *might* leave you out in the cold, but maybe (and
hopefully) not.

And again, hopefully more official info will be made available soon - at
least some reasonable amount of time before the product nears release.

Richie B.

Mar 7, 2008, 5:46:48 PM

Thank you for taking the time to answer. Our software is extreme, yes. The
paradox is, we will have to support Unicode for some strings as well. Right
now going to Delphi 2008 seems to be too risky and too much work. As for
the Unicode we need, Delphi 2008 looks set to be released after a decision has
to be made and code writing should have started. For Unicode we might
prefer UTF8 due to our software's memory consumption. We do however
understand the UTF16 choice CG has made since it is the best choice for the
UI.
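
For a rough feel of the memory trade-off, the byte cost of the same text under each encoding can be measured directly. The sketch below is Python and purely illustrative; `encoded_sizes` and the sample strings are made up, and cp1252 stands in for "ANSI".

```python
def encoded_sizes(text: str) -> dict:
    """Payload size in bytes for the same text under each encoding."""
    codecs = ("cp1252", "utf-8", "utf-16-le", "utf-32-le")
    return {codec: len(text.encode(codec)) for codec in codecs}


print(encoded_sizes("Invoice 2008-03"))   # pure ASCII
print(encoded_sizes("Crème brûlée"))      # a few accented characters
```

For mostly Western text, UTF-8 stays close to the ANSI size while UTF-16 doubles it, which matches the preference stated above. String headers and allocator overhead are not counted here.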

Richie

Jason Burgon

Mar 7, 2008, 7:07:49 PM

"Arthur Hoornweg" <antispam...@casema.nl.net> wrote in message
news:47d15f62$1...@newsgroups.borland.com...

> If I would be forced to convert all my existing projects in one
> day, I would do a global search/replace on my hard drive in
> all *.pas, *.inc and *.dpr files.
> I would change "string" into "Ansistring", "pchar" into "pansichar"
> and "char" into "ansichar".

Well I wouldn't. What I'd do is replace all "string" declarations with
"TStdString", PChar with "PStdChar", etc. Then I'd put a "StdTypes.pas" or
similar in the uses clause of *every* unit. StdTypes.pas would be the unit
where "TStdString" and the other string alias types are declared. It's
more flexible and maintainable that way.

IMO, anyone who is considering buying Tiburon and has a lot of AnsiString
code to maintain (but is planning on changing this to Unicode in the future)
should be doing something similar NOW.

--
Jay

Jason Burgon - author of Graphic Vision
http://homepage.ntlworld.com/gvision

Alexandre Machado

Mar 7, 2008, 8:08:37 PM

> New users will indeed have no problems - That's great for them!
>
> But for existing users AND CodeGear there may be negative consequences:

>
> Existing users who DO still decide to upgrade, would have to install both
> systems.

> a) Even more disk space needed (a problem for those whose main development
> machines are laptops)

> b) A timebomb waiting to explode when you open/compile the wrong project
> with the wrong IDE
> c) Doubts that CodeGear will continue to support (Bug-Fix and VistaTweak)
> the current system

And what about DCU's??? How could I change a switch in my project and keep
compiling it using DCU's compiled with a different option?

I have *a lot* of libraries distributed in DCU's only. How could I handle
that unicode switch?

Regards

Alexandre Machado

Mar 7, 2008, 8:36:48 PM

> Well I wouldn't. What I'd do is replace all "string" declarations into
> "TStdString", PChar into "PStdChar", etc. Then I'd put a "StdTypes.pas" or
> similar in the uses clause of *every* unit. StdTypes.pas would be the unit
> where the "TStdString" and the other string alias types are declared.
> It's more flexible and maintainable that way.

Excellent. The best of all solutions.

Regards

Michael C.

Mar 8, 2008, 4:20:25 AM

Marco Caspers wrote:
<snip>

> The current proposed change is more significant that previous changes,
> yes. It will require extra work, definitly.
>

> But if you start to use the information that has already been given
> from today on, you can plan ahead for the future that will come.
>

Shouldn't CodeGear be planning ahead so that they are making things
easier for the customer?
Of course! That's the proper order of things.
CodeGear should be striving to make the customer happy, not upset.
Period.

Michael C.

Mar 8, 2008, 4:24:00 AM

Kostya wrote:
>> I disagree, those that have older projects written in Delphi will have
>> most likely also older versions of Delphi with which they can maintain
>> the old project.
>

> 1. Application is sold to client
> with source code so they can
> can maintain and customize it
> themselves.
>
> 2. Extra developers hired to add features
> to product
>
> This situations happen very often and
> both become impossible if older compiler
> no longer supplied. One more reason not
> to rely on CG as a serious vendor

They really should be selling older versions of Delphi.
Sadly, I don't think they are smart enough to do it.

Rudy Velthuis [TeamB]

Mar 8, 2008, 4:31:25 AM

Michael C. wrote:

> > The current proposed change is more significant that previous
> > changes, yes. It will require extra work, definitly.
> >
> > But if you start to use the information that has already been given
> > from today on, you can plan ahead for the future that will come.
> >
>
> Shouldn't CodeGear be planing ahead so that they are making things
> easier for the customer?

Not if the requirements of many (usually non-US) customers are
something that is not easy at all. In the olden days, when you only
had 7 bit ASCII, things were easier. Since then, things have become
more complicated, with code pages, UTF8, UTF16, UCS-2, different
encodings, code points, code units, etc.

CodeGear is trying to make the transition to this rather complicated
matter as easy as possible.

Of course, those who live in the US and only write for the US market
can probably still do fairly well with 7 bit ASCII or 8 bit and one
single code page. So for them a move to Unicode is not a direct
advantage. To all other customers (in Asia, South America, Europe) it
makes a big difference and Unicode is a must.

--
Rudy Velthuis [TeamB] http://www.teamb.com

"Raymond's Law of Software: Given a sufficiently large number of
eyeballs, all bugs are shallow." -- Eric S. Raymond

Michael C.

Mar 8, 2008, 4:44:17 AM

Do you realize that a simple compiler flag would've saved
thousands of customers hours or days worth of work?
Why are you giving in so easily here?
Why not tell CodeGear what a disservice this is to you instead of
acting like you're a compiler?

CodeGear should be trying to win the hearts and minds of its customers.
They shouldn't be trying to make us work for them.

Michael C.

Mar 8, 2008, 4:50:36 AM

That's still not a good solution if you keep tons of backups
of previous projects from YEARS ago.
Remember, a lot of us out here have been coding in Delphi for many years.
You can't honestly expect us to update all our "string" source every time we
pull out an old project or download one off the internet.
Do you guys really understand how stupid that is for CodeGear's customers?
Do you realize how much time is wasted here?
Do you realize how much I will think ( as well as other people will think )
CodeGear is stupid for making me do such a thing?!?!

Michael C.

Mar 8, 2008, 4:59:33 AM

Rudy Velthuis [TeamB] wrote:
> Michael C. wrote:
>
>>> The current proposed change is more significant that previous
>>> changes, yes. It will require extra work, definitly.
>>>
>>> But if you start to use the information that has already been given
>>> from today on, you can plan ahead for the future that will come.
>>>
>> Shouldn't CodeGear be planning ahead so that they are making things
>> easier for the customer?
>
> Not if the requirements of many (usually non-US) customers are
> something that is not as easy at all.

<snip>
We are meant to disagree here, Rudy.
I don't believe
CodeGear is making anything "easy as possible" if they require
the software developer to do work that CodeGear's compiler should have done
in the first place.

Moreover, a person in the United States shouldn't be "punished"
just because someone around the world doesn't speak English.

Rudy Velthuis [TeamB]

Mar 8, 2008, 5:14:33 AM

Michael C. wrote:

> Do you realize that a simple compiler flag would've saved
> thousands of customers hours or days worth of work?

How?

--
Rudy Velthuis [TeamB] http://www.teamb.com

"Military justice is to justice what military music is to music."
-- Groucho Marx

Rudy Velthuis [TeamB]

Mar 8, 2008, 5:13:48 AM

Michael C. wrote:

> We are meant to disagree here Rudy.

Are we? If you mean you disagree, you can simply say so. <g>

> I don't believe
> CodeGear is making anything "easy as possible" if they require
> the software developer to do work that CodeGear's compiler should
> have done in the first place.

OK, you tell me how.

--
Rudy Velthuis [TeamB] http://www.teamb.com

"I'm Jewish. I don't work out. If God had wanted us to bend over,
He would have put diamonds on the floor." -- Joan Rivers.

Mar 8, 2008, 7:13:32 AM

Michael C. schrieb:

[snip]

>
> They really should be selling older versions of Delphi.
> Sadly, I don't think they are smart enough to do it.
>

a) if you ask CG they might be able to make some deal with you on that

b) it's maybe not a matter of smartness but more of contracts with
3rd party vendors from which stuff is included

But: with D8 there came a copy of D7 out of the box. So I never used D8.

Greetings

Markus

Mar 8, 2008, 10:05:44 AM

>
> Do you realize that a simple compiler flag would've saved
> thousands of customers hours or days worth of work?
> Why are you giving in so easily here?
> Why not tell CodeGear what a disservice this is to you instead of
> acting like you're a compiler?

This would make deployment and team development
even more complicated than it already is.
There would need to be parallel versions of every single BPL.

IMHO Codegear have chosen the best option.

cheers,

Chris

Chris Morgan

Mar 8, 2008, 10:07:30 AM

> When Delphi 2 was introduced, it came with a compiler flag that allowed
> us to decide whether we wanted huge strings or not, to facilitate the
> porting of existing software. That was deemed important enough
> then. Now it suddenly isn't?

But D2 did not come with a compiler flag which allowed you
to specify whether integers were 2 bytes or 4 bytes.

This is similar. So no compiler flag.

cheers,

Chris

Ray Porter

Mar 8, 2008, 10:45:49 AM

I've read through the CodeGear blogs about Unicode, particularly Allen
Bauer's blog, and I've followed the threads here.

I'd very much appreciate it if someone would correct me if I'm wrong (and
point out how) but it seems to me that for very many of us, this conversion
will be relatively painless. We use very few third party components
(IntraWeb, TMS TAdvString, TFlexCel, Instrumentation Workshop, TWebUpdate
and a few old freeware components for which I have source code). Things
like Length(s) and indexing into a string will continue to work as expected.
It just seems to me that most of us who write straight
business/administrative software for internal customers or for the
shrink-wrap market, may get by with recompiling and need very few (if any)
actual code changes.
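
A minimal sketch of the kind of "plain" string code this paragraph has in mind. It assumes, as the CodeGear blogs describe, that string becomes a UTF-16 string whose Length and indexing count Char elements rather than bytes. The program and variable names are made up for illustration.

```pascal
program PlainStringCode;
{$APPTYPE CONSOLE}

var
  S: string;
  I: Integer;
begin
  S := 'Delphi';

  // Indexing and Length work in Char elements, so this loop reads the same
  // whether Char is one byte or two.
  for I := 1 to Length(S) do
    Write(S[I], ' ');
  Writeln;

  // Only code that needs a byte count has to care about the Char size.
  Writeln('Chars: ', Length(S));
  Writeln('Bytes: ', Length(S) * SizeOf(Char));
end.
```

Code that never computes the byte count itself (the last line) should be the code most likely to survive a plain recompile.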

If I'm missing something and suddenly every app I have is going to require
dramatic rewrites, I'd very much appreciate someone clearing things up. I'm
not saying there'll be no changes and certainly everything will need to be
tested thoroughly but it sounds like for the most part, this won't be a
really big deal for a great many Delphi developers.

Ray Porter

"Arthur Hoornweg" <antispam...@casema.nl.net> wrote in message

news:47d28fcd$1...@newsgroups.borland.com...

Richie B.

Mar 8, 2008, 11:29:17 AM

>> That's true. Delphi 2008+ is worthless to use, so we stick with 2007. On
>> the UI side we already did some stuff in C#/.NET and maybe this CG
>> decision will accelerate the UI being a full C#/.NET product.
>
> .NET strings already use 2-byte characters (even Delphi.NET).

I know, that's why we cannot go to .NET for our server (because speed and
memory must be controlled to have a valid solution). For the UI it is less
of a problem.

Richie

Chris Rolliston

Mar 8, 2008, 11:45:48 AM

> This may indeed work for the WinAPI because CodeGear takes care of
> the A/W DLL mapping in Unit Windows. However, if you use third-party
> DLLs, you'll have to manually change all header declarations.

Sure, but my point was merely to suggest that interacting with the
WinAPI (and casts to/from PChar generally) is possibly one area in
which Jolyon's proposed solution is worse than the one CodeGear appear
to be implementing. (This is on the assumption that Jolyon's idea
looks superior otherwise.) In fact, my point would go for those
proposing a compiler switch to toggle the meaning of 'string' to/from
UTF16 too, given that when UTF16 is not the default, the 'A' versions
of API functions will have to be called explicitly.
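
To make the 'A' versus 'W' point concrete, here is a small sketch using lstrlenA and lstrlenW from the Windows unit. It assumes a Windows target; the program and variable names are invented for the example.

```pascal
program ExplicitApiVariants;
{$APPTYPE CONSOLE}

uses
  Windows;

var
  A: AnsiString;
  W: WideString;
begin
  A := 'ANSI text';
  W := 'UTF-16 text';

  // Naming the variant explicitly keeps each call correct no matter what
  // the unsuffixed name (lstrlen) or the plain "string" type maps to.
  Writeln(lstrlenA(PAnsiChar(A)));  // counts single-byte units
  Writeln(lstrlenW(PWideChar(W)));  // counts 2-byte units
end.
```

The cost is that every such call site has to be written (or rewritten) with the suffix, which is the extra work being pointed out here.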

Aleksander Oven

Mar 8, 2008, 6:45:21 PM

Arthur, this is not necessarily meant for you specifically, but for
anyone "rebelling" against Tiburon's implicit Unicode support...

I'm sure someone else has pointed this out already:

Delphi's string type came with a warning about it being prone to change
for as long as I can remember. So it doesn't really matter which way we
turn it - we've all had many, many years to prepare for the change. But
as always, some chose to ignore the warning...

Being mindful, I've been declaring my strings explicitly as AnsiString
or WideString for years now, so there are no instances of the default
string type in any of *my* code (or at least not in the one that still
matters).

Sure, this doesn't solve the 3rd-party code in all those components. But
the way I see it there's only one sure recipe for that problem: get
everything with source, so you can modify it if necessary. It will take
time and a fair amount of care, but at least it's possible. Thankfully,
it's not much more than a thorough search-and-replace of variable
declarations.

FWIW, I'm sure that responsible component vendors will update their code
in a matter of months tops, just as they always have when a new version
of Delphi came out. IMO, the only real problem is those monstrous open
source libraries out there, that have no real owner. I'm really glad
I stopped using them a couple of years back. :)

On a similar note, we all better prepare for another shocker: Integer
and Cardinal also came with a similar warning about their implicit size,
and are most probably going to grow with Delphi for 64-bit.
I hope you've all been using DWORDs, LongInts, etc. in your fixed-sized
structures. <g>
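
A hedged sketch of one way to protect a fixed-size structure against such a change: let the compiler check the total size. The record, its field names and the 10-byte layout are hypothetical, and the sketch assumes a compiler that supports {$IF} with SizeOf on a declared type.

```pascal
program FixedLayout;
{$APPTYPE CONSOLE}

type
  // Hypothetical on-disk header; the field names are illustrative only.
  TFileHeader = packed record
    Magic: LongWord;    // intended to be exactly 4 bytes
    Count: LongInt;     // intended to be exactly 4 bytes
    Flags: Word;        // intended to be exactly 2 bytes
  end;

// Fail the build if the layout ever stops matching the file format.
{$IF SizeOf(TFileHeader) <> 10}
  {$MESSAGE ERROR 'TFileHeader must be exactly 10 bytes'}
{$IFEND}

begin
  Writeln(SizeOf(TFileHeader));
end.
```

The guard does not depend on which type names stay fixed; it simply stops the build if any of them changes size.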

--
Regards,
Aleksander Oven

Michael C.

Mar 9, 2008, 1:04:25 AM

Ray Porter wrote:
<snip>

>
> If I'm missing something and suddenly every app I have is going to require
> dramatic rewrites, I'd very much appreciate someone clearing things up. I'm
> not saying there'll be no changes and certainly everything will need to be
> tested thoroughly but it sounds like for the most part, this won't be a
> really big deal for a great many Delphi developers.
>

Let me make some counterpoints.
First, some of us loyal customers in the world have written a lot of sources
in the past that use Delphi strings.
Now, they would require "editing" to make them work correctly.
( In fact, that might introduce more bugs - but that's another point some people
wouldn't understand here in this newsgroup.)
Second,
many of us out here have many backups of Delphi source code
for the last 10 years. They would require "editing" now.
Third,
all the Delphi/Pascal sources available on the internet
would require "editing" to work without issues.

So, without the compiler switch,
there is a tremendous waste of developers time that could
actually be spent doing something "productive" instead
of something the compiler could've done in the first place.

Michael C.

Mar 9, 2008, 1:16:23 AM

Aleksander Oven wrote:
> Arthur, this is not necessarily meant for you specifically, but for
> anyone "rebelling" against Tiburon's implicit Unicode support...
>
> I'm sure someone else has pointed this out already:
>
> Delphi's string type came with a warning about it being prone to change
> for as long as I can remember. So it doesn't really matter which way we
> turn it - we've all had many, many years to prepare for the change. But
> as always, some chose to ignore the warning...

That doesn't make it a good idea to introduce the change.
In fact, it's a pretty stupid idea that needs vocal support.

> Being mindful, I've been declaring my strings explicitly as AnsiString
> or WideString for years now, so there are no instances of the default
> string type in any of *my* code (or at least not in the one that still
> matters).

Here's something mindful for CodeGear:
Make a compiler switch so you wouldn't have to waste your customers' time
requiring them to edit their sources.
If not, your customers are going to be thinking you guys have dumb solutions.
Think "source code on the internet" - hey CodeGear way to screw that up.

<snip>

>
> On a similar note, we all better prepare for another shocker: Integer
> and Cardinal also came with a similar warning about their implicit size,
> and are most probably going to grow with Delphi for 64-bit.
> I hope you've all been using DWORDs, LongInts, etc. in your fixed-sized
> structures. <g>
>

A Delphi integer isn't as complex as a Delphi string.
Moreover, a Delphi string is a special ADT that made Delphi special.
Special support is required.

Michael C.

Mar 9, 2008, 1:18:40 AM

I think your conclusion is in error.
You don't need parallel versions of every BPL when you add new features.
That's silly.
If it's really that bad,
I need to rethink using Delphi.

Michael C.

Mar 9, 2008, 1:25:07 AM

Markus.Humm wrote:
> Michael C. schrieb:
>
> [snip]
>
>>
>> They really should be selling older versions of Delphi.
>> Sadly, I don't think they are smart enough to do it.
>>
>
> a) if you ask CG they might be able to make some deal with you on that

Most customers probably don't like to ask stupid questions like:
"Please can I please order this product with sugar on top?"
Instead, most customers want to order a product in a hassle-free manner.
In today's world, they just want an easy way to order it on the internet!
A world-wide fixed price would be a good thing too.
Why punish some customers and reward others?

> b) it's maybe not a matter of smartness but more of contracts with
> 3rd party vendors from which stuff is included

Well, they have the power to change their stupid contracts.
If the 3rd party doesn't agree - dump them.
Who's da man CodeGear?

Michael C.

Mar 9, 2008, 1:34:13 AM

Rudy Velthuis [TeamB] wrote:
> Michael C. wrote:
>
>> We are meant to disagree here Rudy.
>
> Are we? If you mean you disagree, you can simply say so. <g>
>
>> I don't believe
>> CodeGear is making anything "easy as possible" if they require
>> the software developer to do work that CodeGear's compiler should
>> have done in the first place.
>
> OK, you tell me how.
>

Future versions of Delphi should include an option where the developer can select which
string type he or she is going to be using.
This way the default "string" type can be mapped to a specific string type.
Is this over your head Rudy? <g>
Let me know. :-)

Michael C.

Mar 9, 2008, 1:37:09 AM

Michael Bickel wrote:
> ... sorry, that some people have another mother language.
>
> You should consider that Codegear doesn't produce software only for
> english speaking countries.
>

That's why they should have really good project options
and support for multiple languages.
I'm actually agreeing with you.

However, I was saying that a person who just speaks English shouldn't
be "punished" for the new support.

Michael C.

Mar 9, 2008, 1:41:29 AM

I think some people here are taking what I posted in the wrong way.
I support multiple languages.
Many people in my extended family speak 2 languages.

I just don't want to have to edit all my old projects just
because of this new "string" support.
I also don't want to have to edit other people's examples from the
internet just to make it run in the new Delphi.
I think it's a tremendous waste of time for myself and other Delphi developers.

Michael C.

Mar 9, 2008, 1:46:46 AM

I think it would be useful to have an option
that indicates which integer type you are using.
I think that's a good idea. :-)

However, strings are not similar to integers because
a Delphi string is a special abstract data type.
Unicode strings are much different than ansi strings,
so a compiler flag is needed to
indicate what the developer is "actually using".

Mike B

Mar 9, 2008, 6:00:24 AM

"Michael C." <Mic...@Anonymous.net> wrote in message
news:47d38476$1...@newsgroups.borland.com...

>
> Future versions of Delphi should include an option where the developer can
> select which string type he or she is going to be using.
> This way the default "string" type can be mapped to a specific string
> type.
> Is this over your head Rudy? <g>
> Let me know. :-)
>

Perhaps it's over my head as well.

By changing the "default" string type to ANSI will this mean that the
Unicode VCL will miraculously morph back into an ANSI version? How can the
compiler assume string to be one thing for a user's code and another thing
for the VCL?

Mike

Aleksander Oven

Mar 9, 2008, 6:14:56 AM

Michael C. wrote:

> A Delphi integer isn't as complex as a Delphi string.

That may be, but it will still break your code the same. Also, the
solution will be the same - replace all Integer instances that assume
32-bit with LongInts. But I guess some people will again call for a
compiler switch.

IMO, this could very well lead to far worse problems due to the many
combinations of switches. In Delphi 2009+, you won't be able to
reliably look at code and tell what "string" and "Integer" mean, since
their definitions could change at compile time.

--
Regards,
Aleksander Oven

Mike B

Mar 9, 2008, 6:36:02 AM

"Michael C." <Mic...@Anonymous.net> wrote in message

news:47d38048$1...@newsgroups.borland.com...

>
> That doesn't make it a good idea to introduce the change.
> In fact, it's a pretty stupid idea that needs vocal support.
>

You really are very vocal, aren't you?

Reading your posts, you use the words "idiot, fools, ..." etc quite freely.
You are coming across as a little girlie-man who wants to get his way, at
all costs.

Aleksander is quite right about the warnings in respect of assumptions about
string storage and layout in memory. If you ignored these, then tough for
you. If your third-party vendors ignored these, then tough for you again.

CodeGear has more to gain than lose by delivering a Unicode VCL.
Potentially, hundreds of thousands of programmers could be new customers.
All those who code software for markets where the language needs Unicode to
be supported. Microsoft delivered this years ago, and I bet they didn't
suffer because of it.

The reality is, the vast majority of applications will compile with few if
any changes. For those that can't, I am sure that CodeGear will find some way
to make the transition as painless as possible.

Mike B

Mar 9, 2008, 6:41:58 AM

"Michael C." <Mic...@Anonymous.net> wrote in message

news:47d3...@newsgroups.borland.com...

>
> I need to rethink using Delphi.
>

I am happy with that!

I don't want to be using the same toolset as you.

Mike

Roger Lascelles

Mar 9, 2008, 7:21:07 AM

"Michael C." <Mic...@Anonymous.net> wrote in message

news:47d2631d$1...@newsgroups.borland.com...

> Moreover, a person in the United States shouldn't be "punished"
> just because someone around the world doesn't speak English.

Your statement shows how insular you are. CodeGear must support the rest of
the world or become irrelevant because it depends on sales outside the USA -
the US sales are not enough. And the USA must come to terms with the rest
of the world, or it will become irrelevant. The EU and Asian countries will
eat your lunch if you don't watch out.

Here in Australia, Unicode is a given for exportable products, because we
understand we are just one country in the world. Our Asian customers have
good money to spend on our products - provided they can read the screens.
The lack of Delphi Unicode has hurt the company I work for, and wasted a lot
of time on workarounds.

Delphi with Unicode will get a new lease of life as international customers
continue to purchase Delphi, thus giving you in the USA a healthier CodeGear
with better prospects.

Roger Lascelles

John Herbster

Mar 9, 2008, 7:22:21 AM

Arthur Hoornweg <antispam...@casema.nl.net> wrote

>> ... I fear that Codegear will now have to bundle Delphi 2007 with Delphi 2008 for compatibility's sake.
>> It would be catastrophic if people wouldn't be able to buy a compiler that's compatible with legacy projects.

"Marco Caspers" <Hexor...@Vaxor.Com> wrote

> I disagree, those that have older projects written in Delphi will have most likely also older versions of Delphi with which they can maintain the old project.

Historically, Borland/CodeGear does not like to sell old compilers, which makes support of legacy projects (and long-term support of new projects) dicey.

--JohnH
(Maybe I should have stuck with FORTRAN. <g>)

42

Mar 9, 2008, 8:00:31 AM

"Mike B" <mi...@winx-soft.com> wrote in message news:47d3...@newsgroups.borland.com...

Actually, it's pretty easy, if the default stays Ansi, they can put a {$UNICODE}
switch in the VCL units. They could also have an option to automatically include
the switch in the generated/new units.

Ray Porter

Mar 9, 2008, 9:07:43 AM

"Michael C." <Mic...@Anonymous.net> wrote in message

news:47d37d7a$1...@newsgroups.borland.com...

> Ray Porter wrote:
> <snip>
>>
>> If I'm missing something and suddenly every app I have is going to
>> require dramatic rewrites, I'd very much appreciate someone clearing
>> things up. I'm not saying there'll be no changes and certainly
>> everything will need to be tested thoroughly but it sounds like for the
>> most part, this won't be a really big deal for a great many Delphi
>> developers.
>>
>
> Let me make some counterpoints.
> First, some of us loyal customers in the world have written a lot of
> sources
> in the past that use Delphi strings.
> Now, they would require "editing" to make them work correctly.
> ( In fact, that might introduce more bugs - but that's another point some
> people
> wouldn't understand here in this newsgroup.)
> Second,
> many of us out here have many backups of Delphi source code
> for the last 10 years. They would require "editing" now.

Here's where I'm not at all sure you're right, Mike. I also have code going
back to Delphi 1.0 and it uses strings extensively (what business
application doesn't?). After reading Allen's blogs on the topic, I strongly
suspect a recompile will be all that's needed in many cases. I don't use a
lot of third-party components and I have source code for those I do use but
I don't really anticipate any problems there, given the nature of those
components.

The one area I will have to look at is code that reads data coming from our
mainframe or writes files out intended for uploading to the mainframe. I
already specify the size for those strings since the mainframe expects fixed
length files but I'll probably have to change those declarations to
AnsiString since the mainframe still speaks EBCDIC.
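
A rough sketch of what byte-exact declarations for such a fixed-length record might look like. The record, field names and widths are invented for illustration, and the ASCII-to-EBCDIC translation itself is left out.

```pascal
program FixedRecord;
{$APPTYPE CONSOLE}

type
  // Hypothetical 30-byte upload record; field names and widths are made up.
  TUploadRecord = packed record
    AccountId: array[0..9] of AnsiChar;   // 10 bytes
    Name: array[0..19] of AnsiChar;       // 20 bytes
  end;

// Copies Value into a fixed-width byte field, padding with spaces and
// truncating anything that does not fit.
procedure SetField(var Field: array of AnsiChar; const Value: AnsiString);
var
  I: Integer;
begin
  for I := 0 to High(Field) do
    if I < Length(Value) then
      Field[I] := Value[I + 1]
    else
      Field[I] := ' ';
end;

var
  Rec: TUploadRecord;
begin
  SetField(Rec.AccountId, '0001234567');
  SetField(Rec.Name, 'PORTER');

  // The record size stays in bytes regardless of what Char maps to.
  Writeln(SizeOf(Rec));
end.
```

Because every field is declared in AnsiChar, the layout does not depend on the size of the default Char type.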

Again, if someone can point out specifics of how I'm wrong, I'd greatly
appreciate it.

Ray

Rod

Mar 9, 2008, 9:56:45 AM

>
> 1. There should be 2 explicit types -
> ANSIString and Unicode string.
> 2. There should be directive with the unit
> wide scope something like
> {$DEFAULTSTRINGTYPE=ANSIString} or
> {$DEFAULTSTRINGTYPE=UnicodeString}.
> 3. There should be project wide directive
> of the same nature.

1. There will be
- String (=UnicodeString)
- AnsiString (=AnsiString ;-) )
- WideString (remains)

2. Would not be possible, because that would force an IDE/RTL/VCL
recompilation.

3. ditto.

Mar 9, 2008, 11:22:10 AM

> 2. Would not be possible, because that would force an IDE/RTL/VCL
> recompilation.

Why? There is absolutely no need for a recompile here.

42

Mar 9, 2008, 12:29:44 PM

"Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message news:xn0fnhf86untgm...@rvelthuis.de...

> 42 wrote:
>
>> Actually, it's pretty easy, if the default stays Ansi, they can put a
>> {$UNICODE} switch in the VCL units.
>
> You would have two versions of the (compiled) VCL then, or you would
> constantly recompile the VCL? Packages would have to come in two (or
> more) versions?

No, just one, the Unicode version. The compiler would perform the
conversions when calling a Unicode function from Ansi and vice versa.
No need to recompile anything, unless I missed something (if I did,
I would like to know what).

42

Mar 9, 2008, 12:24:56 PM

"Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message news:xn0fnhf6junr3h...@rvelthuis.de...

> Mike B wrote:
>
>> By changing the "default" string type to ANSI will this mean that the
>> Unicode VCL will miraculously morph back into an ANSI version?
>
> Of course not. One would be constantly converting from Ansi to Unicode
> and back, especially if the RTL is also Unicode. Even if the conversion
> were transparent (if e.g. the string type were stored with the string
> data, so conversions could take place without any work from the user),
> it would still have to occur, and take a lot of precious time.

The exact same thing happened when you converted your D1 programs.
You changed every occurrence of String to ShortString and everything
was fine, except, unbeknownst to most, a lot of conversion took place
"behind the scenes", so you took a performance hit. However, in most
cases this performance hit (even if it was noticed) didn't matter at all
and when it did, it was simple enough to make the necessary changes
in the affected units. If 10+ years ago people could live with this, I'm
pretty sure that for the vast majority, the loss of this "precious time"
you mention is simply a non-issue.

Rod

Mar 9, 2008, 1:05:10 PM

The switch you suggested is a compile time switch.

Rudy Velthuis [TeamB]

Mar 9, 2008, 1:38:33 PM

42 wrote:

>
> "Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message
> news:xn0fnhf86untgm...@rvelthuis.de...
> > 42 wrote:
> >
> >> Actually, it's pretty easy, if the default stays Ansi, they can put a
> >> {$UNICODE} switch in the VCL units.
> >
> > You would have two versions of the (compiled) VCL then, or you would
> > constantly recompile the VCL? Packages would have to come in two (or
> > more) versions?
>
> No, just one, the Unicode version. The compiler would perform the
> conversions when calling a Unicode function from Ansi and vice versa.

That is exactly what they want to prevent, I guess. You can, of course,
still use AnsiString in your apps, but I would not recommend it.

> No need to recompile anything, unless I missed something (if I did,
> I would like to know what).

I guess conversion will be implicit, although I feel that it SHOULDN'T
BE. A conversion from Ansi to Unicode is generally lossless, but vice
versa is not. In that case, I feel that a conversion ought to be
explicit.
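
A small sketch of why the two directions differ. It assumes a compiler with a UnicodeString type (as announced for Tiburon) and a single-byte system ANSI code page; the sample text and variable names are arbitrary.

```pascal
program RoundTrip;
{$APPTYPE CONSOLE}

var
  U, Back: UnicodeString;
  A: AnsiString;
begin
  // 'Gr' followed by U+00FC (u with diaeresis) and U+03B2 (Greek beta).
  U := 'Gr' + WideChar($00FC) + WideChar($03B2);

  // Unicode -> ANSI: anything the system code page cannot represent is
  // replaced, so this is the direction that can lose data.
  A := AnsiString(U);

  // ANSI -> Unicode: every ANSI character has a Unicode equivalent.
  Back := UnicodeString(A);

  if Back = U then
    Writeln('round trip preserved the text')
  else
    Writeln('round trip lost characters');
end.
```

Which branch runs depends on the machine's code page, which is the argument for making the narrowing direction visible in the source.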

But someone who doesn't use too much low level code shouldn't have any
problems at all. Simply recompile the code, and all code will use
Unicode. If there are some mismatches because of a Move or FillChar
assuming a certain Char size, you'll soon find out anyway. It can't be
too hard to search for Move or FillChar in code.
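
A minimal sketch of the Move/FillChar mismatch mentioned above and a Char-size independent way to write it. Variable names are illustrative.

```pascal
program CharSizeSafe;
{$APPTYPE CONSOLE}

var
  Src, Dst: string;
begin
  Src := 'Tiburon';
  SetLength(Dst, Length(Src));

  // Fragile: Move(Src[1], Dst[1], Length(Src)) copies Length(Src) *bytes*,
  // which is only half the data once Char occupies two bytes.

  // Char-size independent: scale the element count by SizeOf(Char).
  Move(Src[1], Dst[1], Length(Src) * SizeOf(Char));
  Writeln(Dst);

  // Same rule for FillChar: the count is in bytes, and the fill value is a
  // single byte, so a zero fill is the only pattern that still means the
  // same thing for 2-byte Chars.
  FillChar(Dst[1], Length(Dst) * SizeOf(Char), 0);
end.
```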

Here we see again that "clever" coding doesn't pay off. Anyone using a
cast to PChar because it allows pointer arithmetic will have a slight
problem now. <g>

--
Rudy Velthuis [TeamB] http://www.teamb.com

"A mathematician is a device for turning coffee into theorems."
-- Paul Erdos

Rudy Velthuis [TeamB]

Mar 9, 2008, 1:40:58 PM

42 wrote:

>
> "Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message
> news:xn0fnhf6junr3h...@rvelthuis.de...
> > Mike B wrote:
> >
> >> By changing the "default" string type to ANSI will this mean that the
> >> Unicode VCL will miraculously morph back into an ANSI version?
> >
> > Of course not. One would be constantly converting from Ansi to
> > Unicode and back, especially if the RTL is also Unicode. Even if
> > the conversion were transparent (if e.g. the string type were
> > stored with the string data, so conversions could take place
> > without any work from the user), it would still have to occur, and
> > take a lot of precious time.
>
> The exact same thing happened when you converted your D1 programs.

The char size was the same, so the conversion was extremely simple, and
lossless (unless you converted from an AnsiString that was longer than
255 characters to a ShortString, of course), especially since
ShortStrings were (well, are) fixed size. Conversions between Ansi and
Unicode are a little different.

--
Rudy Velthuis [TeamB] http://www.teamb.com

"Sometimes I lie awake at night, and I ask, 'Where have I gone
wrong?' Then a voice says to me, 'This is going to take more
than one night.'" -- Charlie Brown.

Rudy Velthuis [TeamB]

Mar 9, 2008, 1:51:16 PM

Kostya wrote:

No? Either your AnsiString-based code will constantly have to convert
between Ansi and Unicode, or there are two versions of the RTL and VCL,
one Unicode, one Ansi.

But why cling on to Ansi? If your code uses string, Char and PChar only
and does no or hardly any low level tricks, conversion should be only a
matter of recompiling (which you must generally do, with a new version
of the compiler, anyway).

--
Rudy Velthuis [TeamB] http://www.teamb.com

"Once you eliminate the impossible, whatever remains, no matter
how improbable, must be the truth."
-- Sherlock Holmes (by Sir Arthur Conan Doyle, 1859-1930)

Andreas Hausladen

Mar 9, 2008, 2:06:52 PM

Rudy Velthuis [TeamB] wrote:

> Anyone using a
> cast to PChar because it allows pointer arithmetic will have a slight
> problem now. <g>

I don't see a problem with that unless you use code that assumes
SizeOf(Char) = 1 (like some JclStrings assembler functions that I
converted to pascal recently).

--
Regards,

Andreas Hausladen

Rudy Velthuis [TeamB]

Mar 9, 2008, 2:42:36 PM

Andreas Hausladen wrote:

> Rudy Velthuis [TeamB] wrote:
>
> > Anyone using a
> > cast to PChar because it allows pointer arithmetic will have a
> > slight problem now. <g>
>
> I don't see a problem with that unless you use code that assumes
> SizeOf(Char) = 1

Well, that is exactly why people use it. They cast a pointer to PChar,
add something to it, and then cast it back to the original pointer
type. Problem is that adding something to a PWideChar will give you the
wrong increment. You'll see code like:

P := PInteger(PChar(P) + 3 * SizeOf(Integer));

(Yes, I know this can be done with Inc(P, 3). <g>)

If PChar is PAnsiChar, that will work. But if PChar is PWideChar, you
will add 24 to the pointer, which is in reality 6 * SizeOf(Integer).
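
A self-contained sketch of the same idiom written so that it does not depend on the size of Char. The array and variable names are made up for the example.

```pascal
program PointerStep;
{$APPTYPE CONSOLE}

var
  Values: array[0..7] of Integer;
  P: PInteger;
begin
  Values[3] := 42;

  // Fragile: PChar(P) + N advances N * SizeOf(Char) bytes, so the result
  // depends on whether Char is one byte or two.
  //   P := PInteger(PChar(P) + 3 * SizeOf(Integer));

  // Byte-exact alternative: PAnsiChar always steps in single bytes.
  P := @Values[0];
  P := PInteger(PAnsiChar(P) + 3 * SizeOf(Integer));
  Writeln(P^);  // reads Values[3]

  // Simplest form: Inc scales by the pointed-to type.
  P := @Values[0];
  Inc(P, 3);
  Writeln(P^);  // reads Values[3] again
end.
```

Replacing PChar with PAnsiChar in such casts is a mechanical change that keeps the arithmetic in bytes under either definition of Char.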

I saw Allen Bauer blog about pointer math for all pointers
(switchable). That would be cool, of course.

http://blogs.codegear.com/abauer/2008/01/24/38852

--
Rudy Velthuis [TeamB] http://www.teamb.com

"We all agree that your theory is crazy, but is it crazy enough?"
-- Niels Bohr (1885-1962)

Kostya

Mar 9, 2008, 2:48:39 PM

Mar 9, 2008, 3:23:59 PM

In article <47d0...@newsgroups.borland.com>, egra...@SPAMglscene.org
says...
> > Note that explicit transcoding for ANSI=>Unicode is not required (in
> > order to address warnings) since such transcoding could be lossless
>
> Wrong. The codepage of an ANSI string isn't related to the source or
> destination encoding, it's related to the ANSI string itself

Right, and the specific codepage for an ANSIString (in the proposed
implementation) is embedded in the individual string's RTTI. So the
transcoding from a specific ANSI string to Unicode is performed using
that specific string's codepage.

> it has to be intelligently developer specified).

In the proposed implementation that is exactly what would happen.

The developer would either set a specific codepage in the instance of
the ANSI string involved, or if no explicit codepage has been specified
the string would adopt the system default codepage.
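
To illustrate the idea (not the actual proposal's implementation), here is a sketch that models "an ANSI string carrying its own codepage" with a hypothetical TTaggedAnsi record and transcodes it with the Win32 MultiByteToWideChar call. It assumes a Windows target with code pages 1252 and 1251 available.

```pascal
program TaggedCodePage;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

type
  // Hypothetical stand-in for "an ANSI string that knows its code page".
  TTaggedAnsi = record
    CodePage: Cardinal;   // 0 (CP_ACP) means "system default"
    Data: AnsiString;
  end;

// Transcodes using the code page stored with the string itself.
function ToUnicode(const S: TTaggedAnsi): WideString;
var
  Len: Integer;
begin
  Result := '';
  if S.Data = '' then
    Exit;
  Len := MultiByteToWideChar(S.CodePage, 0, PAnsiChar(S.Data),
    Length(S.Data), nil, 0);
  SetLength(Result, Len);
  MultiByteToWideChar(S.CodePage, 0, PAnsiChar(S.Data),
    Length(S.Data), PWideChar(Result), Len);
end;

var
  T: TTaggedAnsi;
  W: WideString;
begin
  SetLength(T.Data, 1);
  T.Data[1] := AnsiChar($E4);   // one raw byte; its meaning depends on the tag

  T.CodePage := 1252;           // Western European
  W := ToUnicode(T);
  if W <> '' then
    Writeln(IntToHex(Ord(W[1]), 4));

  T.CodePage := 1251;           // Cyrillic
  W := ToUnicode(T);
  if W <> '' then
    Writeln(IntToHex(Ord(W[1]), 4));
end.
```

The same byte yields a different code point under each tag, which is why the codepage has to travel with the string rather than with the source or destination.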

That's about as good as can be reasonably expected I think (and is
certainly a vast improvement on both the current situation AND the
proposed Unicode implementation in Tiburon).

42

Mar 9, 2008, 3:32:13 PM

"Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message news:xn0fnhkgguuyfz...@rvelthuis.de...

> I guess conversion will be implicit, although I feel that it SHOULDN'T
> BE. A conversion from Ansi to Unicode is generally lossless, but vice
> versa is not. In that case, I feel that a conversion ought to be
> explicit.

I strongly disagree.

> But someone who doesn't use too much low level code shouldn't have any
> problems at all. Simply recompile the code, and all code will use
> Unicode. If there are some mismatches because of a Move or FillChar
> assuming a certain Char size, you'll soon find out anyway. It can't be
> too hard to search for Move or FillChar in code.

You have to ask yourself the question: If it's so easy and trivial to switch
to Unicode from Ansi string, why was this such a big deal for CG and why
is this such a major feature? After all, apparently "all" they had to do was
just recompile their stuff after performing a few "search and replace"
operations, right?

I think you vastly underestimate the work involved in making this conversion.
IMO, in any decent sized project, the potential for error is fairly large and we
haven't even considered the time and effort it takes to test all the functions.