The suggestion/idea:
Extend String RTTI (for the purposes of this post, RTTI here refers to
the runtime properties of a string, i.e. Length and Reference Count).
String RTTI would be extended to include encoding information. For
access efficiency this would likely be a 32-bit value.
Some encoding values would be reserved for specific system
interpretation, representing:
UTF8
UTF16
UTF32
ANSI (system cp)
Remaining values would identify a specific codepage of an ANSI encoded
string.
i.e. at the implementation level, there would continue to be only one
actual "type" of string, but the formal type of any given instance of a
String would include its encoding.
There would exist, for the purposes of declarations in code:
UTF8String
UTF16String
UTF32String
ANSIString
and
String
String would "map" to one of the string types based on a project
setting. i.e. for an existing application one would most likely choose
to continue with String => ANSIString, but for a new application one
could choose to map String to the UTF encoding of Unicode most
appropriate to that application's needs.
RTL support for strings would be extended to incorporate appropriate,
implicit transcodings. For ANSI => Unicode these would be lossless. For
Unicode => ANSI the compiler could emit a warning.
Specific transcoding support would provide the means for addressing such
warnings if it were not desirable to simply disable that warning in a
project.
e.g. given that the VCL would be fully Unicode
var
s: ANSIString; (or String where String => ANSIString)
s := Edit1.Text; // WARN: Implicit conversion from Unicode to ANSI
The warning could be addressed by either:
- Changing the declaration of 's' to any Unicode string type
(UTF8, 16, 32)
or
- Utilising an explicit transcode:
s := UnicodeToANSI(Edit1.Text);
or
s := UnicodeToANSI(Edit1.Text, cp1251); // etc
or
- Disabling the warning in the project options (likely to be
acceptable for the majority of existing ANSI applications)
Note that explicit transcoding for ANSI=>Unicode is not required (in
order to address warnings) since such transcoding could be lossless:
the specific codepage of the source and the required UTF encoding of
the destination are both available in the RTTI, so no warning would be
needed:
e.g.
Edit1.Text := s; // Edit1.Text is UTF16, codepage of ANSI s
// is in RTTI. Compiler silently injects RTL
// transcoding for lossless conversion
In general, the only encoding characteristic of a string that may be
changed would be the codepage of an ANSI string.
It would not be possible to otherwise change the encoding of a string
"in place". Attempting to do so, or attempting operations that rely on
it being possible, would result in a compilation error:
i.e.
var
s: UTF8String;
s := UnicodeToANSI(Edit1.Text); // ERROR: Incompatible types
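A rough sketch of the idea described above, not an actual RTL design. The header layout, the field order and the reserved tag values are all assumptions made for illustration (the numbers simply reuse Windows codepage identifiers, which the proposal does not specify). It shows an encoding tag stored next to the length and reference count, and the rule that only a Unicode => ANSI conversion would warrant a warning:

```pascal
program EncodingTagSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Hypothetical reserved tag values; any other value would be read
  // as the codepage of an ANSI encoded string.
  encANSISystem = 0;
  encUTF16      = 1200;
  encUTF32      = 12000;
  encUTF8       = 65001;

type
  // Hypothetical string header: the existing runtime properties
  // (reference count, length) plus a 32-bit encoding tag.
  TStrHeader = packed record
    Encoding: LongWord;
    RefCount: LongInt;
    Len:      LongInt;
  end;

function IsUnicode(Encoding: LongWord): Boolean;
begin
  Result := (Encoding = encUTF8) or (Encoding = encUTF16) or
            (Encoding = encUTF32);
end;

// True when an implicit conversion could lose data, i.e. Unicode => ANSI.
function ShouldWarn(Src, Dst: LongWord): Boolean;
begin
  Result := IsUnicode(Src) and not IsUnicode(Dst);
end;

var
  H: TStrHeader;
begin
  H.Encoding := 1251;  // an ANSI string tagged with a specific codepage
  H.RefCount := 1;
  H.Len      := 0;
  WriteLn('ANSI cp1251 -> UTF16 warns: ', ShouldWarn(H.Encoding, encUTF16));
  WriteLn('UTF16 -> ANSI (system) warns: ', ShouldWarn(encUTF16, encANSISystem));
end.
```

The tag only records what the string claims to be. As with any such scheme, an ANSI => Unicode conversion is lossless only if the recorded codepage is actually the right one for the data.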
That's covered the basics I think. I'm running out of time (now gone
5pm on a Friday afternoon and I have to go collect my daughters from
after school care).
IANACW, so I would prefer it if people commenting on the idea could
concentrate on the idea and NOT on nitpicking about what is or isn't
"RTTI", what is or isn't an "encoding", what is or isn't "transcoding"
etc etc.
If any questions arise from inappropriate use of such terminology kindly
restrict comments on that score to clarifying for others, if such
clarification is genuinely needed.
Enjoy,
Jolyon Smith
I am unable to understand how they could propose such a poor solution,
giving up Delphi's biggest asset: excellent backward compatibility.
Micro$oft virus in action ?
Bad joke ?
Pavel
"Jolyon Smith" <jsm...@deltics.co.nz> wrote in message
news:MPG.223b85438...@newsgroups.borland.com...
Wrong. The codepage of an ANSI string isn't related to the source or
destination encoding, it's related to the ANSI string itself, and where
it comes from (which is neither system nor source code dependent; there
is no way to guess it, it has to be intelligently specified by the developer).
Such an assumption is the reason for the DFM transcoding bugs in the Delphi IDE.
Eric
I agree with this sentiment.
> Without going into the whys and wherefores behind this post, here for
> the benefit of anyone that missed it before, is one idea for an
> alternative Unicode implementation in Tiburon that would avoid many of
> the pitfalls that the implementation (as currently described) is going
> to encounter (and cause).
Such an approach will add unnecessary runtime type checks to every
single string manipulation. Do you want to get the .NET String class in
native applications?
I think CodeGear has chosen an optimal variant of transition to the
Unicode Delphi.
> I think CodeGear has chosen an optimal variant of transition to the
> Unicode Delphi.
I disagree. I fear that Codegear will now have to bundle Delphi 2007
with Delphi 2008 for compatibility's sake. It would be catastrophic if
people wouldn't be able to buy a compiler that's compatible with
legacy projects.
--
Arthur Hoornweg
(In order to reply per e-mail, please just remove the ".net"
from my e-mail address. Leave the rest of the address intact
including the "antispam" part. I had to take this measure to
counteract unsolicited mail.)
Me three!
--
Paul Scott
Information Management Systems
Macclesfield, UK.
> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.
>
> For new users of the next version there is no problem at all since they
> don't have any projects written in older versions of Delphi it isn't an
> issue that everything is unicode.
New users will indeed have no problems - That's great for them!
But for existing users AND CodeGear there may be negative consequences:
Existing users who DO still decide to upgrade, would have to install both
systems.
a) Even more disk space needed (a problem for those whose main development
machines are laptops)
b) A timebomb waiting to explode when you open/compile the wrong project
with the wrong IDE
c) Doubts that CodeGear will continue to support (Bug-Fix and VistaTweak)
the current system
Existing users who DO NOT decide to upgrade (thinking that this hassle may
be too great) are the problem for CodeGear.
> Extend String RTTI (for the purposes of this post, RTTI here refers
> to the runtime properties of a string, i.e. Length and Reference
> Count).
>
> String RTTI would be extended to include encoding information. For
> access efficiency this would likely be a 32-bit value.
>
> Some encoding values would be reserved for specific system
> interpretation, representing:
>
> UTF8
> UTF16
> UTF32
> ANSI (system cp)
>
> Remaining values would identify a specific codepage of an ANSI
> encoded string.
Interesting idea. Perhaps they're thinking of something like this
anyway? AFAICT, there has been little or no detail about implementation.
--
Dave Nottage [TeamB]
That "old" project is our money cow. We don't have any other. We certainly
cannot go to the new compiler due to the breaking changes!
So we are forced by CodeGear NOT to buy the new Delphi IDE/Compiler.
Sad to see a 20+ people company being dependent on the Delphi 2007 compiler
for the years to come. We are certainly disappointed in CodeGear (and
that's an understatement).
Richie
You can't be serious!!!
best regards
1. Application is sold to client
with source code so they can
maintain and customize it
themselves.
2. Extra developers hired to add features
to product
These situations happen very often and
both become impossible if the older compiler
is no longer supplied. One more reason not
to rely on CG as a serious vendor
I think that you are "grossly mistaken". Somebody
from CG stated in his blog that there will be
no "compiler switch" and the code would have
to be manually changed where necessary to
work correctly
I think I have a couple of workable solutions
Solution 1:
1. There should be 2 explicit types -
ANSIString and UnicodeString.
2. There should be a directive with
unit-wide scope, something like
{$DEFAULTSTRINGTYPE=ANSIString} or
{$DEFAULTSTRINGTYPE=UnicodeString}.
3. There should be a project-wide directive
of the same nature.
So with explicit types the compiler
would know how to deal with strings,
and with the implicit type the compiler
should treat it according to the unit
scope directive or, if one is missing,
take the project-wide directive and, if
that one is missing too, ask the user to
supply one before compiling.
Solution 2:
Is much the same as Solution 1, but
leaves out the project-wide directive,
makes the unit scope directive mandatory
and refuses to compile a unit until one
is supplied.
While not perfect, both of these solutions,
at least in my case, would make for a happy
transition, and I think it should be almost
trivial for CG to implement, unless their
project is already in such a state that it is
too late to change anything, in which case it
is a shame
> I disagree, those that have older projects written in Delphi will have
> most likely also older versions of Delphi with which they can maintain
> the old project.
But what about your customers? What if you sell a product including
sourcecode? Are you going to tell them, "sorry, but the sourcecode isn't
going to compile on legally available Delphi versions?"
I have millions of lines of Ansi sourcecode in legacy applications plus
third-party libraries (in source code) worth many thousands of $$$.
> Existing users who DO NOT decide to upgrade (thinking that this hassle
> may be too great) are the problem for CodeGear.
I'm already looking forward to having this conversation with our
management.
Q: Will this Delphi upgrade save us time?
A: No, it will involve a complete review of all our sourcecode before we
can think about compiling our existing projects with it.
Q: Will it save us money, then?
A: No, in fact we will have to upgrade every single component library
we own. Some suppliers, such as Turbopower, no longer exist, so
we will have to edit their source code ourselves.
Q: Will it improve our software, then?
A: For new projects, certainly.
For our existing projects, we expect some timebombs in every unit.
>> Existing users who DO NOT decide to upgrade (thinking that this hassle
>> may be too great) are the problem for CodeGear.
>
> I'm already looking forward to having this conversation with our
> management.
...
I am already having the same conversation with my co-directors.
> But if you start to use the information that has already been given
> from today on, you can plan ahead for the future that will come.
Thing is, the Delphi DFM editor gets new properties in every
new release. The consequence is that if you open and save
a form, the application will throw exceptions if you compile
and run it with a previous version of Delphi.
Assuming that Delphi 2008 also introduces new properties in
its components (which I think is a reasonable assumption), this will
make it very difficult to gradually port applications to the new
situation. Just *using* the new IDE would introduce timebombs.
This may indeed work for the WinAPI because Codegear takes care of
the A/W DLL mapping in Unit Windows. However, if you use third-party
DLLs, you'll have to manually change all header declarations.
PChar becomes PAnsiChar and the corresponding strings become
AnsiStrings.
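As an illustration of the kind of header change meant here, a minimal sketch of an import unit for a third-party DLL that expects 1-byte characters. The DLL name 'thirdparty.dll' and the function GetDeviceName are made up for this example; only the PChar => PAnsiChar and string => AnsiString substitution reflects the point above.

```pascal
unit ThirdPartyApi;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

interface

// Hypothetical import. Before the change the buffer would typically
// have been declared as PChar; it is now pinned to 1-byte characters.
function GetDeviceName(Buffer: PAnsiChar; BufSize: Integer): Integer; stdcall;

// Wrapper that keeps the 1-byte string type explicit for callers.
function DeviceName: AnsiString;

implementation

// 'thirdparty.dll' is a made-up library name.
function GetDeviceName; external 'thirdparty.dll' name 'GetDeviceName';

function DeviceName: AnsiString;
var
  Buf: array[0..255] of AnsiChar;
begin
  Buf[0] := #0;                          // empty result if the call fails
  GetDeviceName(@Buf[0], SizeOf(Buf));   // SizeOf = byte count = char count here
  Result := PAnsiChar(@Buf[0]);          // copy up to the terminating #0
end;

end.
```

Because every type in the declaration is explicit, the unit means the same thing whichever way the default string type is defined.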
If I would be forced to convert all my existing projects in one
day, I would do a global search/replace on my hard drive in
all *.pas, *.inc and *.dpr files.
I would change "string" into "Ansistring", "pchar" into "pansichar"
and "char" into "ansichar".
> I think that you are "grossly mistaken". Somebody
> from CG stated in his blog that there will be
> no "compiler switch" and the code would have
> to be manually changed where necessary to
> work correctly
I read that too.
But what everyone here is assuming, without any foundation yet, is to what
extent "where necessary" will come into play. I think it would be grossly
premature, not to mention without precedent, to assume that this means CG is
going to do essentially nothing to minimize migration issues. It would make
far more sense to interpret "where necessary" to mean: whatever pieces CG
cannot handle for you in the most sensible way possible.
There will no doubt be some issues, but I would suggest everyone hold off
on this argument - and accusations about wrong decisions - until it becomes
clearer what the actual issues will be and to what extent they will
affect existing code.
--
Wayne Niddery - TeamB (www.teamb.com)
Winwright, Inc. (www.winwright.ca)
Yes, at least 6000 places where we use string and PChar like they are one
byte a character. Not to forget all our third party components, some of
which are older and might be maintained by us. Not an easy thing to do in
this case.
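To make the "one byte a character" assumption concrete, here is a small sketch (the variable names and buffer size are arbitrary) of the pattern that breaks when Char grows to 2 bytes, next to a form that does not depend on the character size:

```pascal
program CharSizeSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  S: string;
  Buf: array[0..63] of Byte;
  ByteCount: Integer;
begin
  S := 'Delphi';
  FillChar(Buf, SizeOf(Buf), 0);

  // Fragile: Move(S[1], Buf, Length(S)) treats the character count as a
  // byte count, so it copies only half the data once Char is 2 bytes wide.

  // Size-independent: count bytes, not characters.
  ByteCount := Length(S) * SizeOf(Char);
  if ByteCount <= SizeOf(Buf) then
    Move(S[1], Buf, ByteCount);

  WriteLn('Characters: ', Length(S), ', bytes copied: ', ByteCount);
end.
```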
> But if you start to use the information that has already been given
> from today on, you can plan ahead for the future that will come.
We could. The CodeGear solution is worthless to us. We cannot go to 2
bytes per character since it will double our memory consumption (already a
few times 10 GB!).
But again, it is so much work and we win nothing. We have to do all this
work just to keep our software running the same! We can't make a business
doing that.
> Not at all, that is a descision that you make yourself based on your
> beliefs and needs.
You are talking like you know our situation better than we do.
> No-one from CodeGear has you at gunpoint telling you that you cannot
> buy their product.
That's true. Delphi 2008+ is worthless to us, so we stick with 2007. On the
UI side we already did some stuff in C#/.NET and maybe this CG decision will
accelerate the UI being a full C#/.NET product.
> Will it cause extra effort, yes.
> Will it cost extra money, certainly.
> Who will pay for it?
> Ultimately? Your customers.
> Is this a bad thing? No not at all, they pay for something you provide.
> If that something requires more effort, then it's perfectly reasonable
> to ask more money for it.
Again, we have to put effort in the product just to keep it working the same
as before! So I should ask my customer for money so I can keep up with CG's
newest compiler functions without delivering him extra functionality?
I think CodeGear made a big gamble here. I think there are a lot of big
Win32 Delphi products that won't be able to make the switch to the new
compiler due to the amount of work and the risk. I also think that a large
part of the CG Delphi income is based on these "old" systems being developed.
So if most "old" system developers can't make the jump to Delphi 2008 it
might be another nail in Delphi's coffin (like losing all .NET developers
before).
Richie.
From the blogs it is clear that they aren't doing anything to minimize
migration issues. First they made a decision to change, then they try to
minimize the migration issues. It should have been the other way around. The
compiler switch would be a good example of minimizing migration issues.
Change of code files might be much more risky (who's going to test our
software of over 1.000.000 lines of code?)
Richie
Barring a better or more flexible way of doing so, yes it would. I don't
know if that's the case or not (a better alternative), I just know that
beyond a few blog entries giving some insight into a work in progress, there
is no official information on exact implementation and its consequences for
all of us.
As always, I understand the earlier such information can be made available,
the better - people need to be able to plan as much as possible, so I
understand the frustration when that info is lacking, but I don't see the
value in getting upset over *assumptions* based on premature information
and/or lack of it.
> Existing users who DO still decide to upgrade, would have to install
> both systems. a) Even more disk space needed (a problem for those
> whose main development machines are laptops)
This shouldn't be an issue - any laptop capable of running CDS at
acceptable speeds will be new enough to have more than adequate hard
drive capacity.
My 75GB laptop, with D5 and CDS2007 installed, as well as the complete
Office suite, VS2005 and several database engines, still has over 30GB
free. Unless the next release of CDS jumps from a several-GB to a
several-tens-of-GB installation, I think the vast majority of
developers won't have a problem with this.
>I disagree, those that have older projects written in Delphi will have
>most likely also older versions of Delphi with which they can maintain
>the old project.
>
>For new users of the next version there is no problem at all since they
>don't have any projects written in older versions of Delphi it isn't an
>issue that everything is unicode.
Do you throw away all of your existing code every time you start a new
project?
People with years worth of developed and debugged code need to be able
to continue to use that code with the next version of Delphi.
Does M$ still supply VB6?
Greetings
Markus
Changing "string" is enough for us. It will simply double our memory
consumption and make our solution = product (having a huge amount of data, >
10 GB, in memory for quick querying) simply impossible. It takes too much
time to change and test all our strings/pchars.
Richie
I do not care what MS does.
They can afford to piss on their
customers. If that is example
of proper behavior then
I hope one day they get
nice payback
> Existing users who DO still decide to upgrade, would have to install
> both systems.
Hmmm... not on this one (I've become lazy), but on my other system I
had BP7, D1, D6, D7, D8, D2005 and BDS2006 plus BCB5 and BCB6.
IOW, having to install two Delphis wouldn't really scratch me. <g>
--
Rudy Velthuis [TeamB] http://www.teamb.com
"If Tyranny and Oppression come to this land, it will be in
the guise of fighting a foreign enemy."
-- James Madison
> Kryvich wrote:
>
> > I think CodeGear has chosen an optimal variant of transition to the
> > Unicode Delphi.
>
> I disagree. I fear that Codegear will now have to bundle Delphi 2007
> with Delphi 2008 for compatibility's sake. It would be catastrophic if
> people wouldn't be able to buy a compiler that's compatible with
> legacy projects.
Agreed. Some version of D2007 could be shipped with the new version.
Just like, for a while, D1 was shipped with the Win32 Delphis.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"God is real unless declared integer" -- david
>
> Without going into the whys and wherefores behind this post, here for
> the benefit of anyone that missed it before, is one idea for an
> alternative Unicode implementation in Tiburon that would avoid many
> of the pitfalls that the implementation (as currently described) is
> going to encounter (and cause).
>
>
> The suggestion/idea:
>
>
> Extend String RTTI (for the purposes of this post, RTTI here refers
> to the runtime properties of a string, i.e. Length and Reference
> Count).
<snip proposal>
If I understand this right, this would mean that the encoding (UTF8,
UTF16, etc.) is stored with the string, as an extra hidden field, just
like length or reference count. I get that, so far.
But the data, the payload, for each string would be different, right?
This would mean RTL functions for each type of payload. It would also
mean that code compiled with "string = UTF8String" interfacing with
code compiled with "string = UTF16String" would constantly have to
convert back and forth? Since you call it RTTI, it would also mean that
code using strings constantly checks the type/encoding of the string
and then calls the appropriate RTL routines?
Dunno, but I think that doing the break once is much better. Most code,
unless explicitly declared as AnsiString, will be UnicodeString. There
will be a lot less converting to and fro going on. Of course the data
can/should still have the encoding stored with the text data, but there
should only be one "generic" string type, which will apparently be
UnicodeString.
Please tell me if I misunderstood something.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"War doesn't make boys men, it makes men dead." -- Ken Gillespie
Fair enough. Your product would, I think, be an exception in needing that
much memory - it definitely seems extreme. For you, clearly, the best
solution would be a simple checkbox to cause String to map to AnsiString.
I'm sure this would be preferable for many anyway, even where memory usage
is not a problem, if it means they need to make few, if any, changes in
code.
But once again we don't know what they may have in mind for this yet.
On the one hand, they are making this change because they know for a fact
that *many* of their customers need it and have been asking for it for a
very long time already, as well as it is crucial if they hope to keep the
product attractive going forward.
On the other hand, you can be sure they *absolutely* want to avoid doing
something that will block a significant number of current customers from
upgrading - they've already been dealing with that issue and have worked
very hard to get themselves back to a position with a product sufficient in
both quality and features to finally attract a large number of customers
that have not upgraded for many years now. Somehow I doubt they *want* to
start that over yet again; they know very well that a majority of their
current Delphi business depends on existing customers.
So, while it is always possible they may fumble this in some way, I really
think it's too early and unfair to assume that to be the case. Also, as
usual, they won't be able to satisfy everyone no matter what they do, they
have to do what they think will satisfy the majority as well as being
feasible for them. That *might* leave you out in the cold, but maybe (and
hopefully) not.
And again, hopefully more official info will be made available soon - at
least some reasonable amount of time before the product nears release.
Richie
> If I would be forced to convert all my existing projects in one
> day, I would do a global search/replace on my hard drive in
> all *.pas, *.inc and *.dpr files.
> I would change "string" into "Ansistring", "pchar" into "pansichar"
> and "char" into "ansichar".
Well I wouldn't. What I'd do is replace all "string" declarations into
"TStdString", PChar into "PStdChar", etc. Then I'd put a "StdTypes.pas" or
similar in the uses clause of *every* unit. StdTypes.pas would be the unit
where the "TStdString" and the other string alias types are declared. It's
more flexible and maintainable that way.
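A minimal sketch of what such a unit might look like. The names TStdString and PStdChar come from the post; TStdChar and the STD_UNICODE conditional define are assumptions added for illustration, and WideString is used on the Unicode side only because it already exists in current compilers.

```pascal
unit StdTypes;

interface

type
{$IFDEF STD_UNICODE}
  // Hypothetical define: flip this once the code base is Unicode-ready.
  TStdString = WideString;
  TStdChar   = WideChar;
  PStdChar   = PWideChar;
{$ELSE}
  // Default: keep the existing 1-byte-per-character behaviour.
  TStdString = AnsiString;
  TStdChar   = AnsiChar;
  PStdChar   = PAnsiChar;
{$ENDIF}

implementation

end.
```

A unit that lists StdTypes in its uses clause would then declare, say, `Name: TStdString;` instead of `Name: string;`, so the whole code base can be switched in one place.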
IMO, anyone who is considering buying Tiburon and has a lot of AnsiString
code to maintain (but is planning on changing this to Unicode in the future)
should be doing something similar NOW.
--
Jay
Jason Burgon - author of Graphic Vision
http://homepage.ntlworld.com/gvision
And what about DCUs??? How could I change a switch in my project and keep
compiling it using DCUs compiled with a different option?
I have *a lot* of libraries distributed as DCUs only. How could I handle
that unicode switch?
Regards
Excellent. The best of all solutions.
Regards
Shouldn't CodeGear be planning ahead so that they are making things
easier for the customer?
Of course! That's the proper order of things.
Codegear should be striving to make the customer happy not upset.
Period.
They really should be selling older versions of Delphi.
Sadly, I don't think they are smart enough to do it.
> > The current proposed change is more significant that previous
> > changes, yes. It will require extra work, definitly.
> >
> > But if you start to use the information that has already been given
> > from today on, you can plan ahead for the future that will come.
> >
>
> Shouldn't CodeGear be planing ahead so that they are making things
> easier for the customer?
Not if the requirements of many (usually non-US) customers are
something that is not as easy at all. In the olden days, when you only
had 7 bit ASCII, things were easier. Since then, things have become
more complicated, with code pages, UTF8, UTF16, UCS-2, different
encodings, code points, code units, etc.etc.
CodeGear is trying to make the transition to this rather complicated
matter as easy as possible.
Of course, those who live in the US and only write for the US market
can probably still do fairly well with 7 bit ASCII or 8 bit and one
single code page. So for them a move to Unicode is not a direct
advantage. To all other customers (in Asia, South America, Europe) it
makes a big difference and Unicode is a must.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"Raymond's Law of Software: Given a sufficiently large number of
eyeballs, all bugs are shallow." -- Eric S. Raymond
Do you realize that a simple compiler flag would've saved
thousands of customers hours or days worth of work?
Why are you giving in so easily here?
Why not tell CodeGear what a disservice this is to you instead of
acting like you're a compiler?
Codegear should be trying to win the hearts and minds of its customers.
They shouldn't be trying to make us work for them.
That's still not a good solution if you keep tons of backups
of previous projects from YEARS ago.
Remember, a lot of us out here have been coding in Delphi for many years.
You can't honestly expect us to update all our "string" source every time we
pull out an old project or download one off the internet.
Do you guys really understand how stupid that is for CodeGear's customers?
Do you realize how much time is wasted here?
Do you realize how much I will think ( as well as other people will think )
CodeGear is stupid for making me do such a thing?!?!
<snip>
We are meant to disagree here Rudy.
I don't believe
Codegear is making anything "as easy as possible" if they require
the software developer to do work that CodeGear's compiler should have done
in the first place.
Moreover, a person in the United States shouldn't be "punished"
just because someone around the world doesn't speak English.
> Do you realize that a simple compiler flag would've saved
> thousands of customers hours or days worth of work?
How?
--
Rudy Velthuis [TeamB] http://www.teamb.com
"Military justice is to justice what military music is to music."
-- Groucho Marx
> We are meant to disagree here Rudy.
Are we? If you mean you disagree, you can simply say so. <g>
> I don't believe
> Codegear is not making anything "easy as possible" if they require
> the software developer to do work that CodeGear's compiler should
> have done in the first place.
OK, you tell me how.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"I'm Jewish. I don't work out. If God had wanted us to bend over,
He would have put diamonds on the floor." -- Joan Rivers.
You should consider that Codegear doesn't produce software only for english
speaking countries.
Michael
I didn't claim that being proper behaviour as I've got a technical
problem with them currently. I'd just wanted to broaden the look on that
topic.
Greetings
Markus
Have you read the explanation about why there won't be such a compiler
switch in one of CGs blogs lately?
Greetings
Markus
[snip]
>
> They really should be selling older versions of Delphi.
> Sadly, I don't think they are smart enough to do it.
>
a) if you ask CG they might be able to make some deal with you on that
b) it's maybe not a matter of smartness but more of agreements with
3rd party vendors from which stuff is included
But: with D8 there came a copy of D7 out of the box. So I never used D8.
Greetings
Markus
But back in the ASCII days non-US people were severely punished!
So that's not really an argument, or if you honestly meant it, I'd say:
a system for string representation that isn't usable worldwide shouldn't
have been even remotely considered by international IT corporations.
Greetings
Markus
> have you read the explanation about why there won't be such a compiler
> switch in one of CGs blogs lately?
That doesn't necessarily mean we agree with it.
When Delphi 2 was introduced, it came with a compiler flag that allowed
us to decide whether we wanted huge strings or not, to facilitate the
porting of existing software. That was deemed important enough
then. Now it suddenly isn't?
--
Arthur Hoornweg
...
> That's true. Delphi 2008+ is worthless to use, so we stick with 2007. On
> the UI side we already did some stuff in C#/.NET and maybe this CG
> decision will accelarate the UI being a full C#/.NET product.
.NET strings already use 2-byte characters (even Delphi.NET).
cheers,
Chris
This would make deployment and team development
even more complicated than it already is.
There would need to be parallel versions of every single BPL.
IMHO Codegear have chosen the best option.
cheers,
Chris
But D2 did not come with a compiler flag which allowed you
to specify whether integers were 2 bytes or 4 bytes.
This is similar. So no compiler flag.
cheers,
Chris
I'd very much appreciate it if someone would correct me if I'm wrong (and
point out how) but it seems to me that for very many of us, this conversion
will be relatively painless. We use very few third party components
(IntraWeb, TMS TAdvString, TFlexCel, Instrumentation Workshop, TWebUpdate
and a few old freeware components for which I have source code). Things
like Length(s) and indexing into a string will continue to work as expected.
It just seems to me that most of us who write straight
business/administrative software for internal customers or for the
shrink-wrap market may get by with recompiling and need very few (if any)
actual code changes.
If I'm missing something and suddenly every app I have is going to require
dramatic rewrites, I'd very much appreciate someone clearing things up. I'm
not saying there'll be no changes and certainly everything will need to be
tested thoroughly but it sounds like for the most part, this won't be a
really big deal for a great many Delphi developers.
Ray Porter
"Arthur Hoornweg" <antispam...@casema.nl.net> wrote in message
news:47d28fcd$1...@newsgroups.borland.com...
I know, that's why we cannot go to .NET for our server (because speed and
memory must be controlled to have a valid solution). For the UI it is less
of a problem.
Richie
Sure, but my point was merely to suggest that interacting with the
WinAPI (and casts to/from PChar generally) is possibly one area in
which Jolyon's proposed solution is worse than the one CodeGear appear
to be implementing. (This is on the assumption that Jolyon's idea
looks superior otherwise.) In fact, my point would go for those
proposing a compiler switch to toggle the meaning of 'string' to/from
UTF16 too, given that when UTF16 is not the default, the 'A' versions
of API functions will have to be called explicitly.
I'm sure someone else has pointed this out already:
Delphi's string type came with a warning about it being prone to change
for as long as I can remember. So it doesn't really matter which way we
turn it - we've all had many, many years to prepare for the change. But
as always, some chose to ignore the warning...
Being mindful, I've been declaring my strings explicitly as AnsiString
or WideString for years now, so there are no instances of the default
string type in any of *my* code (or at least not in the one that still
matters).
Sure, this doesn't solve the 3rd-party code in all those components. But
the way I see it there's only one sure recipe for that problem: get
everything with source, so you can modify it if necessary. It will take
time and a fair amount of care, but at least it's possible. Thankfully,
it's not much more than a thorough search-and-replace of variable
declarations.
FWIW, I'm sure that responsible component vendors will update their code
in a matter of months tops, just as they always have when a new version
of Delphi came out. IMO, the only real problem is those monstrous open
source libraries out there, that have no real owner. I'm really glad
I stopped using them a couple of years back. :)
On a similar note, we all better prepare for another shocker: Integer
and Cardinal also came with a similar warning about their implicit size,
and are most probably going to grow with Delphi for 64-bit.
I hope you've all been using DWORDs, LongInts, etc. in your fixed-sized
structures. <g>
--
Regards,
Aleksander Oven
Let me make some counterpoints.
First, some of us loyal customers in the world have written a lot of sources
in the past that use Delphi strings.
Now, they would require "editing" to make them work correctly.
(In fact, that might introduce more bugs - but that's another point some people
wouldn't understand here in this newsgroup.)
Second,
many of us out here have many backups of Delphi source code
for the last 10 years. They would require "editing" now.
Third,
all the Delphi/Pascal sources available on the internet
would require "editing" to make them work without issues.
So, without the compiler switch,
there is a tremendous waste of developers' time that could
actually be spent doing something "productive" instead
of something the compiler could've done in the first place.
That doesn't make it a good idea to introduce the change.
In fact, it's a pretty stupid idea that needs vocal support.
> Being mindful, I've been declaring my strings explicitly as AnsiString
> or WideString for years now, so there are no instances of the default
> string type in any of *my* code (or at least not in the one that still
> matters).
Here's something mindful for CodeGear:
Make a compiler switch so you wouldn't have to waste your customers' time
requiring them to edit their sources.
If not, your customers are going to be thinking you guys have dumb solutions.
Think "source code on the internet" - hey CodeGear way to screw that up.
<snip>
>
> On a similar note, we all better prepare for another shocker: Integer
> and Cardinal also came with a similar warning about their implicit size,
> and are most probably going to grow with Delphi for 64-bit.
> I hope you've all been using DWORDs, LongInts, etc. in your fixed-sized
> structures. <g>
>
A Delphi integer isn't as complex as a Delphi string.
Moreover, a Delphi string is a special ADT that made Delphi special.
Special support is required.
I think your conclusion is in error.
You don't need parallel versions of every BPL when you add new features.
That's silly.
If it's really that bad,
I need to rethink using Delphi.
Most customers probably don't like to ask stupid questions like:
"Please can I please order this product with sugar on top?"
Instead, most customers want to order a product in a hassle-free manner.
In today's world, they just want an easy way to order it on the internet!
A world-wide fixed price would be a good thing too.
Why punish some customers and reward others?
> b) it's maybe not a matter of smartness but more of treaties with
> 3rd party vendors from which stuff is included
Well, they have the power to change their stupid contracts.
If the 3rd party doesn't agree - dump them.
Who's da man CodeGear?
Future versions of Delphi should include an option where the developer can select which
string type he or she is going to be using.
This way the default "string" type can be mapped to a specific string type.
Is this over your head Rudy? <g>
Let me know. :-)
That's why they should have really good project options
and support for multiple languages.
I'm actually agreeing with you.
However, I was saying that a person who just speaks English shouldn't
be "punished" for the new support.
I think some people here are taking what I posted in the wrong way.
I support multiple languages.
Many people in my extended family speak 2 languages.
I just don't want to have to edit all my old projects just
because of this new "string" support.
I also don't want to have to edit other people's examples from the
internet just to make it run in the new Delphi.
I think it's a tremendous waste of time for myself and other Delphi developers.
I think it would be useful to have an option
that indicates which integer type you are using.
I think that's a good idea. :-)
However, strings are not similar to integers because
a Delphi string is a special abstract data type.
Unicode strings are much different than ansi strings,
so a compiler flag is needed to
indicate what the developer is "actually using".
By changing the "default" string type to ANSI will this mean that the
Unicode VCL will miraculously morph back into an ANSI version? How can the
compiler assume string to be one thing for a user's code and another thing
for the VCL?
Mike
> A Delphi integer isn't as complex as a Delphi string.
That may be, but it will still break your code the same. Also, the
solution will be the same - replace all Integer instances that assume
32-bit with LongInts. But I guess some people will again call for a
compiler switch.
IMO, this could very well lead to far worse problems due to the many
combinations of switches. In Delphi 2009+, you won't be able to
reliably look at code and tell what "string" and "Integer" mean, since
their definitions could change at compile time.
--
Regards,
Aleksander Oven
Reading your posts, you use the words "idiot, fools, ..." etc quite freely.
You are coming across as a little girlie-man who wants to get his way, at
all costs.
Aleksander is quite right about the warnings in respect of assumptions about
string storage and layout in memory. If you ignored these, then tough for
you. If your third-party vendors ignored these, then tough for you again.
CodeGear has more to gain than lose by delivering a Unicode VCL.
Potentially, hundreds of thousands of programmers could be new customers.
All those who code software for markets where the language needs Unicode to
be supported. Microsoft delivered this years ago, and I bet they didn't
suffer because of it.
The reality is, the vast majority of applications will compile with few if
any changes. For those that can't, I am sure that CodeGear will find some way
to make the transition as painless as possible.
Mike B
I don't want to be using the same toolset as you.
Mike
> Moreover, a person in the United States shouldn't be "punished"
> just because someone around the world doesn't speak English.
Your statement shows how insular you are. CodeGear must support the rest of
the world or become irrelevant because it depends on sales outside the USA -
the US sales are not enough. And the USA must come to terms with the rest
of the world, or it will become irrelevant. The EU and Asian countries will
eat your lunch if you don't watch out.
Here in Australia, Unicode is a given for exportable products, because we
understand we are just one country in the world. Our Asian customers have
good money to spend on our products - provided they can read the screens.
The lack of Delphi Unicode has hurt the company I work for, and wasted a lot
of time on workarounds.
Delphi with Unicode will get a new lease of life as international customers
continue to purchase Delphi, thus giving you in the USA a healthier CodeGear
with better prospects.
Roger Lascelles
>> ... I fear that Codegear will now have to bundle Delphi 2007 with Delphi 2008 for compatibility's sake.
>> It would be catastrophic if people wouldn't be able to buy a compiler that's compatible with legacy projects.
"Marco Caspers" <Hexor...@Vaxor.Com> wrote
> I disagree, those that have older projects written in Delphi will have most likely also older versions of Delphi with which they can maintain the old project.
Historically, Borland/CodeGear does not like to sell old compilers; this makes support of legacy projects (and long-term support of new projects) dicey.
--JohnH
(Maybe I should have stuck with FORTRAN. <g>)
Actually, it's pretty easy, if the default stays Ansi, they can put a {$UNICODE}
switch in the VCL units. They could also have an option to automatically include
the switch in the generated/new units.
Here's where I'm not at all sure you're right, Mike. I also have code going
back to Delphi 1.0 and it uses strings extensively (what business
application doesn't?). After reading Allen's blogs on the topic, I strongly
suspect a recompile will be all that's needed in many cases. I don't use a
lot of third-party components and I have source code for those I do use but
I don't really anticipate any problems there, given the nature of those
components.
The one area I will have to look at is code that reads data coming from our
mainframe or writes files out intended for uploading to the mainframe. I
already specify the size for those strings since the mainframe expects fixed
length files but I'll probably have to change those declarations to
AnsiString since the mainframe still speaks EBCDIC.
Again, if someone can point out specifics of how I'm wrong, I'd greatly
appreciate it.
Ray
1. There will be
- String (=UnicodeString)
- AnsiString (=AnsiString ;-) )
- WideString (remains)
2. Would not be possible, because that would force a IDE/RTL/VCL
recompilation.
3. ditto.
> Future version of Delphi should include an option where the developer
> can select which string type he or she is going to be using.
I don't think that will happen or makes any sense.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"The Bible was a consolation to a fellow alone in the old cell.
The lovely thin paper with a bit of matress stuffing in it, if
you could get a match, was as good a smoke as I ever tasted."
-- Brendan Behan.
> By changing the "default" string type to ANSI will this mean that the
> Unicode VCL will miraculously morph back into an ANSI version?
Of course not. One would be constantly converting from Ansi to Unicode
and back, especially if the RTL is also Unicode. Even if the conversion
were transparent (if e.g. the string type were stored with the string
data, so conversions could take place without any work from the user),
it would still have to occur, and take a lot of precious time.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"While we are postponing, life speeds by."
-- Seneca (3BC - 65AD)
> I just don't want to have to edit all my old projects just
> because of this new "string" support.
That is the problem with any change. There will always be people who
don't benefit from it and don't want to bother to adapt to the new
situation.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"I never forget a face, but in your case I'll be glad to make an
exception." -- Groucho Marx
> Actually, it's pretty easy, if the default stays Ansi, they can put a
> {$UNICODE} switch in the VCL units.
You would have two versions of the (compiled) VCL then, or you would
constantly recompile the VCL? Packages would have to come in two (or
more) versions?
--
Rudy Velthuis [TeamB] http://www.teamb.com
"If absolute power corrupts absolutely, where does that leave
God?" -- George Deacon.
> However, I was saying that a person who just speaks English shouldn't
> be "punished" for the new support.
You can't make an omelette without breaking eggs.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"There's many a bestseller that could have been prevented by a
good teacher." -- Flannery O'Connor (1925-1964)
Why? There is absolutely no need to recompile here.
No, just one, the Unicode version. The compiler would perform the
conversions when calling a Unicode function from Ansi and vice versa.
No need to recompile anything, unless I missed something (if I did,
I would like to know what).
The exact same thing happened when you converted your D1 programs.
You changed every occurrence of String to ShortString and everything
was fine, except, unbeknownst to most, a lot of conversion took place
"behind the scenes", so you took a performance hit. However, in most
cases this performance hit (even if it was noticed) didn't matter at all
and when it did, it was simple enough to make the necessary changes
in the affected units. If 10+ years ago people could live with this, I'm
pretty sure that for the vast majority, the loss of this "precious time"
you mention is simply a non-issue.
The switch you suggested is a compile time switch.
>
> "Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message
> news:xn0fnhf86untgm...@rvelthuis.de...
> > 42 wrote:
> >
> >> Actually, it's pretty easy, if the default stays Ansi, they can put a
> >> {$UNICODE} switch in the VCL units.
> >
> > You would have two version of the (compiled) VCL then, or you would
> > constantly recompile the VCL? Packages would have to come in two (or
> > more) versions?
>
> No, just one, the Unicode version. The compiler would perform the
> conversions when calling a Unicode function from Ansi and vice versa.
That is exactly what they want to prevent, I guess. You can, of course,
still use AnsiString in your apps, but I would not recommend it.
> No need to recompile anything, unless I missed something (if I did,
> I would like to know what).
I guess conversion will be implicit, although I feel that it SHOULDN'T
BE. A conversion from Ansi to Unicode is generally lossless, but vice
versa is not. In that case, I feel that a conversion ought to be
explicit.
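
To make the asymmetry concrete, here is a minimal sketch using the string
types that exist today (WideString and AnsiString). The sample text and the
program name are made up for illustration, and the result of the narrowing
step depends on the system code page, so no output is claimed here.

```pascal
program LossyConversion;
{$APPTYPE CONSOLE}
var
  W, Back: WideString;
  A: AnsiString;
begin
  // Cyrillic letter U+0416 followed by plain ASCII (illustrative sample text).
  W := WideChar($0416) + WideString('abc');

  // Unicode -> Ansi: narrowed through the system code page. On a code page
  // without Cyrillic, the first character is typically replaced by '?'.
  A := AnsiString(W);

  // Ansi -> Unicode: every Ansi character has a Unicode equivalent, so this
  // direction does not lose anything that is still present in A.
  Back := WideString(A);

  if Back = W then
    Writeln('round trip preserved the text')
  else
    Writeln('round trip lost information');
end.
```

Writing the narrowing step as an explicit cast keeps the lossy direction
visible in the source, which is the argument for making it explicit.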
But someone who doesn't use too much low level code shouldn't have any
problems at all. Simply recompile the code, and all code will use
Unicode. If there are some mismatches because of a Move or FillChar
assuming a certain Char size, you'll soon find out anyway. It can't be
too hard to search for Move or FillChar in code.
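
As a rough illustration of the kind of mismatch meant here (a sketch, not
taken from any real code base): Move and FillChar take a count in bytes, so
code that passes a character count is only correct while SizeOf(Char) = 1.
Scaling by SizeOf(Char) works at either Char size.

```pascal
program CharSizeSafe;
{$APPTYPE CONSOLE}
var
  Src, Dst: string;
begin
  Src := 'Hello';
  SetLength(Dst, Length(Src));

  // Fragile: passes a character count where a byte count is expected.
  // Only correct while SizeOf(Char) = 1.
  //   Move(Src[1], Dst[1], Length(Src));

  // Robust: scale the character count by the element size.
  Move(Src[1], Dst[1], Length(Src) * SizeOf(Char));
  Writeln(Dst);

  // Same rule for FillChar: the count is in bytes. The fill value is a
  // single byte, so zeroing is safe at any Char size, while filling with
  // a character only works for single-byte Chars.
  FillChar(Dst[1], Length(Dst) * SizeOf(Char), 0);
end.
```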
Here we see again that "clever" coding doesn't pay off. Anyone using a
cast to PChar because it allows pointer arithmetic will have a slight
problem now. <g>
--
Rudy Velthuis [TeamB] http://www.teamb.com
"A mathematician is a device for turning coffee into theorems."
-- Paul Erdos
>
> "Rudy Velthuis [TeamB]" <newsg...@rvelthuis.de> wrote in message
> news:xn0fnhf6junr3h...@rvelthuis.de...
> > Mike B wrote:
> >
> >> By changing the "default" string type to ANSI will this mean that the
> >> Unicode VCL will miraculously morph back into an ANSI version?
> >
> > Of course not. One would be constantly converting from Ansi to
> > Unicode and back, especially if the RTL is also Unicode. Even if
> > the conversion were transparent (if e.g. the string type were
> > stored with the string data, so conversions could take place
> > without any work from the user), it would still have to occur, and
> > take a lot of precious time.
>
> The exact same thing happened when you converted your D1 programs.
The char size was the same, so the conversion was extremely simple, and
lossless (unless you converted from an AnsiString that was longer than
255 characters to a ShortString, of course), especially since
ShortStrings were (well, are) fixed size. Conversions between Ansi and
Unicode are a little different.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"Sometimes I lie awake at night, and I ask, 'Where have I gone
wrong?' Then a voice says to me, 'This is going to take more
than one night.'" -- Charlie Brown.
No? Either your AnsiString-based code will constantly have to convert
between Ansi and Unicode, or there are two versions of the RTL and VCL,
one Unicode, one Ansi.
But why cling on to Ansi? If your code uses string, Char and PChar only
and does no or hardly any low level tricks, conversion should be only a
matter of recompiling (which you must generally do, with a new version
of the compiler, anyway).
--
Rudy Velthuis [TeamB] http://www.teamb.com
"Once you eliminate the impossible, whatever remains, no matter
how improbable, must be the truth."
-- Sherlock Holmes (by Sir Arthur Conan Doyle, 1859-1930)
> Anyone using a
> cast to PChar because it allows pointer arithmetic will have a slight
> problem now. <g>
I don't see a problem with that unless you use code that assumes
SizeOf(Char) = 1 (like some JclStrings assembler functions that I
converted to Pascal recently).
--
Regards,
Andreas Hausladen
> Rudy Velthuis [TeamB] wrote:
>
> > Anyone using a
> > cast to PChar because it allows pointer arithmetic will have a
> > slight problem now. <g>
>
> I don't see a problem with that unless you use code that assumes
> SizeOf(Char) = 1
Well, that is exactly why people use it. They cast a pointer to PChar,
add something to it, and then cast it back to the original pointer
type. Problem is that adding something to a PWideChar will give you the
wrong increment. You'll see code like:
P := PInteger(PChar(P) + 3 * SizeOf(Integer));
(Yes, I know this can be done with Inc(P, 3). <g>)
If PChar is PAnsiChar, that will work. But if PChar is PWideChar, you
will add 24 to the pointer, which is in reality 6 * SizeOf(Integer).
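
A minimal sketch of two ways to write that step so it does not depend on
what PChar maps to. The array and program name are made up for illustration;
the array has eight elements so that even the mistaken PWideChar step
(element 6) would stay inside it.

```pascal
program PointerStep;
{$APPTYPE CONSOLE}
var
  Data: array[0..7] of Integer;
  P: PInteger;
  I: Integer;
begin
  for I := 0 to High(Data) do
    Data[I] := I * 10;

  // Byte-sized steps: PAnsiChar always advances one byte per unit,
  // regardless of whether PChar is PAnsiChar or PWideChar.
  P := @Data[0];
  P := PInteger(PAnsiChar(P) + 3 * SizeOf(Integer));
  Writeln(P^);  // prints the value stored in Data[3]

  // Typed step: Inc on a typed pointer scales by SizeOf(Integer) itself.
  P := @Data[0];
  Inc(P, 3);
  Writeln(P^);  // prints the value stored in Data[3] again
end.
```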
I saw Allen Bauer blog about pointer math for all pointers
(switchable). That would be cool, of course.
http://blogs.codegear.com/abauer/2008/01/24/38852
--
Rudy Velthuis [TeamB] http://www.teamb.com
"We all agree that your theory is crazy, but is it crazy enough?"
-- Niels Bohr (1885-1962)
Yes it is, and if the compiler sees that my source uses
ANSIString and the VCL is UnicodeString, and I for example
write something like Edit1.Text := MyANSIString, there
is more than enough information for the compiler to
figure out what has to be done without recompiling
the VCL.
The problem is not our own code. That we can handle.
But digging into the guts of 3rd-party libraries (and
we use a lot of those) IS a time bomb. I think that is the
main concern. For many libraries, vendors no longer
provide support, or have written new
versions that are sometimes incompatible with the
old ones. It is just too much risk, headache
and work without any ROI. Also, our apps run 24x7
and customers (and there are a lot of them) in the
entertainment/hospitality industries do not
tolerate service interruptions.
> > But why cling on to Ansi? If your code uses string, Char and PChar
> > only and does no or hardly any low level tricks, conversion should
> > be only a matter of recompiling (which you must generally do, with
> > a new version of the compiler, anyway).
>
> The problem is not our own code. That we can handle.
> But to dig in the guts of 3rd party libraries and
> we use a lot of those IS a time bomb.
Don't you buy upgrades when you get a new compiler/RTL/VCL? IOW, let
the 3rd party vendors worry about that.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"So I went to the dentist. He said "Say Aaah." I said "Why?"
He said "My dog's died.'" -- Tommy Cooper
No need for a switch. I guess it will be like the current automatic
conversion between WideString and AnsiString or between AnsiString and
ShortString.
But would you like to constantly convert between WideString and
AnsiString? It would be pretty slow and memory consuming, IMO.
--
Rudy Velthuis [TeamB] http://www.teamb.com
"I was thrown out of college for cheating on the metaphysics
exam; I looked into the soul of the boy next to me."
-- Woody Allen.
Right, and the specific codepage for an ANSIString (in the proposed
implementation) is embedded in the individual string's RTTI. So the
transcoding from a specific ANSI string to Unicode is performed using
that specific string's codepage.
> it has to be intelligently developer specified).
In the proposed implementation that is exactly what would happen.
The developer would either set a specific codepage in the instance of
the ANSI string involved, or if no explicit codepage has been specified
the string would adopt the system default codepage.
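
A rough sketch of that fallback rule, for illustration only. The record
layout, the field names, the reserved value and the function name are all
hypothetical; they are not the RTL's actual string header, and the reserved
UTF values from the proposal are left out to keep the sketch short.

```pascal
program ProposedHeader;
{$APPTYPE CONSOLE}
type
  // Hypothetical per-string header: the existing reference count and length,
  // plus the proposed 32-bit encoding value.
  TProposedStrHeader = packed record
    Encoding: LongWord;  // hypothetical: a reserved value, or an ANSI code page
    RefCount: LongInt;
    Len: LongInt;
  end;

const
  // Hypothetical reserved value meaning "no explicit code page was set".
  encSystemAnsi = 0;

// Returns the code page to transcode with: the string's own code page if
// one was set, otherwise the system default passed in by the caller.
function EffectiveCodePage(const H: TProposedStrHeader;
  SystemCP: LongWord): LongWord;
begin
  if H.Encoding = encSystemAnsi then
    Result := SystemCP
  else
    Result := H.Encoding;
end;

var
  H: TProposedStrHeader;
begin
  H.RefCount := 1;
  H.Len := 0;

  H.Encoding := 1251;                   // explicit code page on this instance
  Writeln(EffectiveCodePage(H, 1252));  // uses the instance's code page

  H.Encoding := encSystemAnsi;          // nothing specified
  Writeln(EffectiveCodePage(H, 1252));  // falls back to the system default
end.
```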
That's about as good as can be reasonably expected I think (and is
certainly a vast improvement on both the current situation AND the
proposed Unicode implementation in Tiburon).
I strongly disagree.
> But someone who doesn't use too much low level code shouldn't have any
> problems at all. Simply recompile the code, and all code will use
> Unicode. If there are some mismatches because of a Move or FillChar
> assuming a certain Char size, you'll soon find out anyway. It can't be
> too hard to search for Move or FillChar in code.
You have to ask yourself the question: If it's so easy and trivial to switch
to Unicode from Ansi string, why was this such a big deal for CG and why
is this such a major feature? After all, apparently "all" they had to do was
just recompile their stuff after performing a few "search and replace"
operations, right?
I think you vastly underestimate the work involved in making this conversion.
IMO, in any decent sized project, the potential for error is fairly large and we
haven't even considered the time and effort it takes to test all the functions.