Case conversion without the loopiness

Clive Page

Jan 14, 2009, 2:46:41 PM

I guess many of us have at some time or another had to write a little
routine to convert mixed-case text to all upper (or all lower) case.
It's not hard to do with the aid of a DO loop working through the string
one character at a time. But that's awfully tedious and it's
essentially using the methods of Fortran77, so I kept thinking that
there ought to be a more modern way of doing case conversion, ideally in
just one simple statement. Well this works:

string = TRANSFER(MERGE(ACHAR(IACHAR( &
TRANSFER(string,'a',LEN(string)))-32), &
TRANSFER(string,'a',LEN(string)), &
TRANSFER(string,'a',LEN(string))>='a' .AND. &
TRANSFER(string,'a',LEN(string))<='z'), string)

And it's efficient enough, yet it doesn't have the essential elegance
that I was hoping to achieve. Maybe others can suggest improvements?

--
Clive Page

Paul van Delst

Jan 14, 2009, 3:33:40 PM

Yes. Use ruby. Examples below from irb (interactive ruby).

Some test data:

irb> string=<<-EOT
irb" TYPE :: Structure_type
irb" INTEGER :: comp1 = 0
irb" INTEGER :: comp2 = 0
irb" INTEGER :: comp3 = 0
irb" INTEGER, POINTER :: comp4(:) => NULL()
irb" REAL(fp), POINTER :: comp5(:) => NULL()
irb" END TYPE Structure_type
irb" EOT

Display what we just entered:

irb> puts string
TYPE :: Structure_type
INTEGER :: comp1 = 0
INTEGER :: comp2 = 0
INTEGER :: comp3 = 0
INTEGER, POINTER :: comp4(:) => NULL()
REAL(fp), POINTER :: comp5(:) => NULL()
END TYPE Structure_type

Make it all lower case:

irb> puts string.downcase
type :: structure_type
integer :: comp1 = 0
integer :: comp2 = 0
integer :: comp3 = 0
integer, pointer :: comp4(:) => null()
real(fp), pointer :: comp5(:) => null()
end type structure_type

...or upper:

irb> puts string.upcase
TYPE :: STRUCTURE_TYPE
INTEGER :: COMP1 = 0
INTEGER :: COMP2 = 0
INTEGER :: COMP3 = 0
INTEGER, POINTER :: COMP4(:) => NULL()
REAL(FP), POINTER :: COMP5(:) => NULL()
END TYPE STRUCTURE_TYPE

:o)

cheers,

paulv

Richard Maine

Jan 14, 2009, 4:16:10 PM

Clive Page <ju...@main.machine> wrote:

Yukk!! My suggestion is that the "traditional" f77 loop is hugely better
than this. I'll take your word for it that this does the job. It
wouldn't even take me very long to work through it without taking your
word, but work through it I'd have to do.

I fail to see what is wrong with the traditional loop. The "tediousness"
took less time to write than it would take me to read the above, much
less write it. Furthermore, that tediousness was taken care of many
years ago when I wrote the subroutine. I redid it slightly to put it in
a module about a decade and a half ago, when first working with f90.
That was among the utility routines I did very early on. So there is
nothing tedious about it now.

You do, I would presume, have a subroutine for it instead of rewriting
the code inline for each occurrence. If you are rewriting it for each
occurrence, I would see why you would think it tedious, though I'd find
the above far more so.

I think that old conversion routine is an excellent example of something
that should not be fixed because it isn't broken. In any case, I'd call
the above a major step in the wrong direction, in my opinion. It isn't
going to be any more efficient than the loop (it could well be far
worse, depending on how smart the compiler is about things - I could
imagine compilers that would make a multitude of temporary
dynamically-sized arrays for the above), it isn't any more flexible
(what happens when someone asks you to deal with alphabets other than
straight ASCII?), and it *SURE* isn't any more clear (see above about me
taking your word for its operation). In short, I can't see a single way
in which it is an improvement.

People here might have noticed that I tend to be on the side of liking
some of the "shiny" newfangled things. But I'm not quite so much that
way that I'm blind to the fact that "newfangled" doesn't always mean
better. I don't think it does in this case.
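
For readers who have not seen it, the traditional loop under discussion might look roughly like the sketch below. This is not Richard's actual routine: the module name case_utils and the function name to_upper are hypothetical, and the code assumes that only the 26 unaccented letters a-z need converting.

```fortran
module case_utils
  implicit none
  private
  public :: to_upper
contains
  ! Return a copy of s with the letters a-z converted to A-Z.
  ! All other characters are passed through unchanged.
  pure function to_upper(s) result(t)
    character(len=*), intent(in) :: s
    character(len=len(s)) :: t
    integer :: i, c, diff
    diff = iachar('a') - iachar('A')
    t = s
    do i = 1, len(s)
      c = iachar(s(i:i))
      if (c >= iachar('a') .and. c <= iachar('z')) t(i:i) = achar(c - diff)
    end do
  end function to_upper
end module case_utils

program demo
  use case_utils, only: to_upper
  implicit none
  print '(a)', to_upper('Hi ThErE!')   ! prints HI THERE!
end program demo
```

Once the function lives in a module, every call site is a single statement, which is the point being made about the loop no longer being tedious.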

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

glen herrmannsfeldt

Jan 14, 2009, 4:59:43 PM

Clive Page <ju...@main.machine> wrote:
> I guess many of us have at some time or another had to write a little
> routine to convert mixed-case text to all upper (or all lower) case.

Unix has the tr program that is well designed to do that...

> It's not hard to do with the aid of a DO loop working through the string
> one character at a time. But that's awfully tedious and it's
> essentially using the methods of Fortran77, so I kept thinking that
> there ought to be a more modern way of doing case conversion, ideally in
> just one simple statement. Well this works:

> string = TRANSFER(MERGE(ACHAR(IACHAR( &
> TRANSFER(string,'a',LEN(string)))-32), &
> TRANSFER(string,'a',LEN(string)), &
> TRANSFER(string,'a',LEN(string))>='a' .AND. &
> TRANSFER(string,'a',LEN(string))<='z'), string)

One statement, but not simple.

Using ACHAR and IACHAR is nice in that it will work for
non-ASCII systems as long as the characters have an ASCII
representation. Not all EBCDIC characters can be
converted to ASCII and back, though.

> And it's efficient enough, yet it doesn't have the essential elegance
> that I was hoping to achieve.

Do you mean more efficient than the DO version using a
lookup table?

> Maybe others can suggest improvements?

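character*1 tr(0:255)
character*10 in,out
integer i,j,k
! Start with the identity mapping.
do i=0,255
  tr(i)=char(i)
enddo
! Map each upper-case letter to its lower-case partner.
do i=iachar('A'),iachar('Z')
  j=ichar(achar(i))
  tr(j)=char(ichar(tr(j))+(ichar('a')-ichar('A')))
enddo

in='Hi ThErE!'
! String -> character array -> table lookup -> string.
out=transfer(tr(ichar(transfer(in,in(1:1),len(in)))),out)
print *,in,out
end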

In this case, I use DO loops to generate tr, but no DO loops
in the actual conversion. The complication with transfer
is needed to convert a string to an array and back
to a string again. Also, I am not sure how to do it
without a constant 255, as HUGE doesn't work on characters.
(It seems that huge(char('a')) is 2147483647 on my system.)

The tr array can probably be done using implied-DO,
I didn't try that at all.
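
One way the implied-DO version might be written is sketched below. It is untested against glen's compiler and makes assumptions he did not state: that the letters A-Z have contiguous codes (true for ASCII, not for EBCDIC), and that the Fortran 2003 bracket syntax for array constructors is available. The program and variable names are made up for illustration.

```fortran
program tr_implied_do
  implicit none
  character(len=1) :: tr(0:255)
  character(len=10) :: src, dst
  integer :: i, shift

  shift = ichar('a') - ichar('A')
  ! Three implied-DO runs: codes below 'A' map to themselves, the
  ! letters A-Z are shifted to lower case, codes above 'Z' map to
  ! themselves.
  tr = [ (char(i),         i = 0, ichar('A') - 1),       &
         (char(i + shift), i = ichar('A'), ichar('Z')),  &
         (char(i),         i = ichar('Z') + 1, 255) ]

  src = 'Hi ThErE!'
  ! Same lookup as in the DO-loop version: string -> array -> string.
  dst = transfer(tr(ichar(transfer(src, 'a', len(src)))), dst)
  print '(a)', dst
end program tr_implied_do
```

The table is built at run time here rather than as a PARAMETER, to avoid relying on how a given compiler evaluates CHAR in constant expressions.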

-- glen

paul.rich...@gmail.com

Jan 15, 2009, 3:45:37 AM

Dear Clive,

I must thank you for your upper case routine; not for its utility but
for exercising TRANSFER in gfortran :-)

I share Richard's revulsion because the code is unnecessarily concise
to the extent that it is totally opaque.

After working on gfortran to fix your original, I put this in my
utilities:

  character(80) :: str
  str = "aBcDeFgHiJkLmNoPqRsTuVwXyZ"
  call Upper (str)
  print *, str
contains
  subroutine Upper (arg)
    character(*) :: arg
    character(len = 1) :: chr(len (arg))
    integer :: l, diff
    l = len (arg)
    diff = iachar ('a') - iachar ('A')
    ! Convert argument to an array.
    chr = transfer (arg, chr, l)
    ! Use merge to convert lower case to upper.
    chr = merge (achar (iachar (chr) - diff), chr, &
                 chr >= 'a' .and. chr <= 'z')
    ! Convert back to argument from array.
    arg = transfer (chr, arg)
  end subroutine
end

It does the same technique but is rather more comprehensible.

Cheers

Paul

Clive Page

Jan 15, 2009, 12:49:36 PM

In message
<c22c6439-2fb8-4003...@m22g2000vbl.googlegroups.com>,
paul.rich...@gmail.com writes

>I must thank you for your upper case routine; not for its utility but
>for exercising TRANSFER in gfortran :-)

Don't mention it.

>I share Richard's revulsion because the code is unnecessarily concise
>to the extent that it is totally opaque.

Well I do too, actually. As the subject line suggested, it wasn't
entirely serious.

>After working on gfortran to fix your original, I put this in my
>utilities:

I'm not sure I follow you - it seemed to work fine using gfortran as it
was (and also g95, I haven't yet tried it with other compilers).
[snip]

>It does the same technique but is rather more comprehensible.

Indeed it does, and is. But I'm not sure it will beat mine in the 2009
Obfuscated Fortran Contest (well C has one, I don't see why Fortran
shouldn't too).

Regards

--
Clive Page

James Van Buskirk

Jan 15, 2009, 8:41:17 PM

"Clive Page" <ju...@main.machine> wrote in message
news:GtC1GvAh...@page.demo.co.uk...

> string = TRANSFER(MERGE(ACHAR(IACHAR( &
> TRANSFER(string,'a',LEN(string)))-32), &
> TRANSFER(string,'a',LEN(string)), &
> TRANSFER(string,'a',LEN(string))>='a' .AND. &
> TRANSFER(string,'a',LEN(string))<='z'), string)

> And it's efficient enough, yet it doesn't have the essential elegance
> that I was hoping to achieve. Maybe others can suggest improvements?

Although normally the more invocations of TRANSFER, the more readable
the program, in this case a couple of the invocations are redundant,
so they may more advantageously be eliminated:

C:\gfortran\clf\to_upper>type to_upper_string.f90
program to_upper_string
   implicit none
   external sub
   character(32) string
   integer i
   integer j

   do i = 32,126,32
      do j = i,min(i+31,126)
         string(j-i+1:j-i+1) = achar(j)
      end do
      call sub(string(1:j-i))
      if(i+32 <= 126) write(*,'()')
   end do
end program to_upper_string

subroutine sub(string1)
   implicit none
   character(*) string1
   character(len(string1)) string2

   string2 = TRANSFER(achar(ieor( &
      iachar(TRANSFER(string1,['x'])),merge(32,0,(2* &
      iachar(TRANSFER(string1,['x']))-219)**2 <= 625))),string2)
   write(*,'(a)') string1
   write(*,'(a)') string2
end subroutine sub

C:\gfortran\clf\to_upper>gfortran to_upper_string.f90 -oto_upper_string

C:\gfortran\clf\to_upper>to_upper_string
!"#$%&'()*+,-./0123456789:;<=>?
!"#$%&'()*+,-./0123456789:;<=>?

@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_

`abcdefghijklmnopqrstuvwxyz{|}~
`ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~
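
The range test in sub rests on a small piece of arithmetic: (2*c - 219)**2 <= 625 holds exactly when |2*c - 219| <= 25, that is when 97 <= c <= 122, the ASCII codes of 'a' through 'z'. IEOR with 32 then clears the one bit that separates lower case from upper case. A minimal check of both claims, assuming ASCII codes 0 to 127 (the program name is made up for illustration):

```fortran
program check_range_trick
  implicit none
  integer :: c
  logical :: ok

  ok = .true.
  do c = 0, 127
    ! The squared test must agree with the plain range test for every code.
    if (((2*c - 219)**2 <= 625) .neqv. (c >= 97 .and. c <= 122)) ok = .false.
    ! Inside the range, flipping bit 5 must equal subtracting 32.
    if (c >= 97 .and. c <= 122) then
      if (ieor(c, 32) /= c - 32) ok = .false.
    end if
  end do
  print *, ok   ! T if both claims hold for all 128 codes
end program check_range_trick
```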

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

Terence

Jan 16, 2009, 2:34:32 AM

When you have two tables (e.g. 256 words containing integers in the
range 1 to 256) and you use an integer I to index into the first
table to pull out an integer J, which you then use to index the
second table and retrieve a third number K, then,

if I always equals K for any value of I in the range 1 through 256,
you have two MAPS with a one-to-one correspondence.

In theory you might use such a pair of tables for transforming WORD
symbols to ASCII-256 symbols and the reverse. But there is no such
one-to-one correspondence, so you have to have a signal (e.g. 0) saying
"there's no equivalent".

Or with smaller tables obeying the same mapping rules, you could
change upper case to lower case and back. This one works because
there IS a one-to-one correspondence.

DOS services used a table of 64 character pairs to change upper to
lower case and the reverse. Depending on which side of the table, left
or right, you entered with the numeric value of the character, you got
back the opposite character, which was what you needed.

Terence

Jan 16, 2009, 3:06:31 AM

Terence wrote:
etc:
Yes, I know a simple way with English is to add or subtract 32, but
this is valid only over the range A through Z. The general case of non-
sequential mapping uses tables, where two columns of character pairs
is a neat trick. Think of Spanish ñ, Ñ, as well as the accented
vowels. Even Ch was once considered a single character, and W was non-
existent.
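
The two-column pair table can be sketched in Fortran with two parallel strings and INDEX. This is only an illustration of the idea, not the DOS table: the names pair_table, translate, src_set and dst_set are hypothetical, and only the plain letters a-z are listed. In a single-byte encoding, further pairs (for example the accented letters mentioned above) could presumably be appended to both strings in matching positions.

```fortran
program pair_table
  implicit none
  ! Two "columns" of the table: position k in one string pairs with
  ! position k in the other. The codes need not be sequential.
  character(len=*), parameter :: lower = 'abcdefghijklmnopqrstuvwxyz'
  character(len=*), parameter :: upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

  print '(a)', translate('Hi ThErE!', lower, upper)   ! prints HI THERE!
  print '(a)', translate('Hi ThErE!', upper, lower)   ! prints hi there!
contains
  ! Replace each character found in src_set by its partner in dst_set;
  ! characters with no entry in the table are left unchanged.
  pure function translate(s, src_set, dst_set) result(t)
    character(len=*), intent(in) :: s, src_set, dst_set
    character(len=len(s)) :: t
    integer :: i, k
    t = s
    do i = 1, len(s)
      k = index(src_set, s(i:i))   ! 0 means "there's no equivalent"
      if (k > 0) t(i:i) = dst_set(k:k)
    end do
  end function translate
end program pair_table
```

Which column you search decides the direction of the conversion, so one table serves for both upper-to-lower and lower-to-upper.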

robin

Jan 26, 2009, 7:02:32 AM

"Paul van Delst" <Paul.v...@noaa.gov> wrote in message news:gkli74$50h$1...@news.nems.noaa.gov...

> Clive Page wrote:
> > I guess many of us have at some time or another had to write a little
> > routine to convert mixed-case text to all upper (or all lower) case.
> > It's not hard to do with the aid of a DO loop working through the string
> > one character at a time. But that's awfully tedious and it's
> > essentially using the methods of Fortran77, so I kept thinking that
> > there ought to be a more modern way of doing case conversion, ideally in
> > just one simple statement. Well this works:
> >
> > string = TRANSFER(MERGE(ACHAR(IACHAR( &
> > TRANSFER(string,'a',LEN(string)))-32), &
> > TRANSFER(string,'a',LEN(string)), &
> > TRANSFER(string,'a',LEN(string))>='a' .AND. &
> > TRANSFER(string,'a',LEN(string))<='z'), string)
> >
> > And it's efficient enough, yet it doesn't have the essential elegance
> > that I was hoping to achieve. Maybe others can suggest improvements?
>
> Yes. Use ruby.

Even better, use PL/I:

S = UPPERCASE (S);

and for lower case:

S = LOWERCASE(S);

robin

Jan 26, 2009, 7:02:33 AM

"Richard Maine" <nos...@see.signature> wrote in message
news:1itjasr.15mo4rhu48h3mN%nos...@see.signature...

There's such a thing as copy and paste.
But more appropriate would be to make it a function / module
and to use it that way, as you actually did with your version.