On Monday, December 11, 2017 at 6:25:39 AM UTC-5, Ian Harvey wrote:
> ..
>
> You don't want that. The type bound procedure reference strongly
> implies that the type is extensible, and it makes very little sense for
> a string type to be extensible. Implementation wise, a type bound
> procedure call may also incur the overhead of construction of a
> descriptor for the argument.
> ..
The above comments by @Ian Harvey in this thread are rather didactic, presumptuous, and misleading enough that they hurt any healthy engagement and discussion of any future revisions to the language related to strings.
For Fortran, it really comes down to: WHAT FOR and FOR WHOM?
Is the advancement of Fortran only intended for the likes of *certain users* and them only and that too only for the applications they think the language makes sense to use?
And is the performance and/or the sense of appropriateness by these users the only items of consideration for any revisions to the language?
Or is it possible to make it a *wider tent* and consider the needs of a broader set of users who may also seek convenience and ease-of-use, consistency, improved readability and packaging in terms of software design, etc.?
@spectrum,
On Sunday, December 10, 2017 at 2:59:00 PM UTC-5, spectrum wrote:
> ..
> and I think it would be very nice if we could write this ..
Please note it's trivially possible now to write in Fortran what you indicate will be "very nice" - please see below and try it out - you can make it a lot more efficient, what's below is just a quick first-pass attempt:
--- begin code ---
module string_m
implicit none
private
type, public :: string_t
private
character(len=:), allocatable :: m_s
contains
private
procedure, pass(this) :: assign_s
procedure, pass(this) :: write_s
procedure, pass(this), public :: split
procedure, pass(this), public :: s => get_s
generic, public :: assignment(=) => assign_s
generic, public :: write(formatted) => write_s
end type string_t
contains
elemental subroutine assign_s( this, rhs )
class(string_t), intent(inout) :: this
character(len=*), intent(in) :: rhs
this%m_s = rhs
return
end subroutine assign_s
subroutine write_s(this, lun, iotype, vlist, istat, imsg)
! argument definitions
class(string_t), intent(in) :: this
integer, intent(in) :: lun
character(len=*), intent(in) :: iotype
integer, intent(in) :: vlist(:)
integer, intent(out) :: istat
character(len=*), intent(inout) :: imsg
! local variable
character(len=9) :: sfmt
sfmt = "(A)"
if ( (iotype == "DT").and.(size(vlist) >= 1) ) then
! vlist(1) to be used as the field width of the character component.
write(sfmt,"(A,I2,A)", iostat=istat, iomsg=imsg ) "(A", vlist(1), ")"
if (istat /= 0) return
end if
write(lun, fmt=sfmt, iostat=istat, iomsg=imsg) this%m_s
return
end subroutine write_s
elemental function get_s( this ) result( s )
class(string_t), intent(in) :: this
! Function result
character(len=len(this%m_s)) :: s
s = this%m_s
end function get_s
subroutine split( this, token, strings )
! Argument list
class(string_t), intent(in) :: this
character(len=1), intent(in) :: token
type(string_t), allocatable, intent(out) :: strings(:)
! Local variables
integer :: numstrings
integer :: idx_token
if ( allocated(this%m_s) ) then
if ( len(this%m_s) <= 1) then
return
end if
else
return
end if
numstrings = numtoken( this%m_s, token ) + 1
if ( numstrings > 0 ) then
allocate( strings(numstrings) )
idx_token = len( this%m_s )
do while (numstrings > 1 )
call crop_right( this%m_s(1:idx_token), token, idx_token, strings(numstrings)%m_s )
numstrings = numstrings - 1
end do
! Fill left-most string
strings(numstrings)%m_s = this%m_s(1:idx_token)
end if
return
end subroutine
function numtoken( string, token ) result( num )
character(len=*), intent(in) :: string
character(len=1), intent(in) :: token
integer :: num
num = count( transfer(source=string, mold="a", size=len(string)) == token)
return
end function
subroutine crop_right( string, token, idx_token, crop )
! Argumwnt list
character(len=*), intent(in) :: string
character(len=1), intent(in) :: token
integer, intent(inout) :: idx_token
character(len=:), allocatable, intent(out) :: crop
idx_token = scan(string, token, back=.true. )
if ( idx_token == 0 ) return
if ( len(string) > idx_token ) then
crop = string( idx_token+1: )
end if
idx_token = idx_token - 1
return
end subroutine crop_right
end module string_m
program main
use, intrinsic :: iso_fortran_env, only : compiler_version
use string_m, only : string_t
implicit none
type( string_t ) :: inpstr
type( string_t ), allocatable :: words( : )
integer :: i
print *, "Compiler Version: ", compiler_version()
inpstr = "04-DEC-2015,10-DEC-2015,23-DEC-2015,25-DEC-2015"
call inpstr%split( ",", words )
do i = 1, size( words )
print *, "-", words( i )
end do
end
--- end code ---
Upon execution, you should get something along the following lines assuming the processor you employ supports the standard language features:
--- begin output ---
Compiler Version: GCC version 8.0.0 20171112 (experimental)
- 04-DEC-2015
- 10-DEC-2015
- 23-DEC-2015
- 25-DEC-2015
--- end output ---
You have hit the nail on the head with your point about "the user can use it (with convenient methods) out-of-the-box (= directly, easily) for string handling. So, if the standard (or de fact or semi-standard) library provides an anlog for this (e.g. type(string_t)), I believe it would also be extremely useful. "
I agree with you wholeheartedly, that for many containers and algorithms, there is a need for some form of *standard* solution of users, the precise nature of which is something that can be deferred to the Fortran standards committee, but god damnit, it's high time something got done about this. I hope this will get through the thick-heads of the committee members.
For improved string handling, an infinite number of libraries can be put together - the above snippet being just a small illustration - using the existing facilities in the language. But that is besides the point. For what is a long-solved computer science problem, what is needed are standardized interfaces. Consider your SPLIT example: you can call it that, I can call it TOKENIZE, someone else PARSE, others DECODE, and so forth; some can make it a FUNCTION subprogram, others a SUBROUTINE subprogram. For a variety of situations, it will help if the users have the option to employ *a standard way* of doing things that they can be sure will work with all the conforming compilers.
Just as the Fortran standard extends its foot into the math business with a standardized interface for intrinsic procedures of DOT_PRODUCT and MATMUL, etc., it will be very useful - as you indicate - for the standard to have a set of *intrinsic derived types with bound procedures* for certain commonly used aspects such as 'strings'.
Separately, I find comments such as "You don't want that. The type bound procedure reference strongly implies that the type is extensible, and it makes very little sense for a string type to be extensible. Implementation wise, a type bound procedure call may also incur the overhead of construction of a
descriptor for the argument. " rather ill-informed and narrow-minded too:
Note many programming paradigms and language developments have found the concept of SEALED CLASSES quite practical and useful and have utilized the concepts in their 'string class' design:
https://msdn.microsoft.com/en-us/library/system.string(v=vs.110).aspx
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/sealed
Fortran can consider bringing in SEALED or NON_OVERRIDABLE attribute into play for its derived types and perhaps stipulate its intrinsic 'string_t' derived type too be sealed. These are all topics that should be actively discussed on forums such as comp.lang.fortran and not be simply misdirected by a few here with these "you don't want that" remarks or "there be ghosts and gremlins lurking around" type of fear-mongering whenever anyone has a suggestion.
The point about the overhead is mostly about the performance aspect. As I indicate above, users will have situations where other factors such as convenience and ease-of-coding, packaging, readability, etc. matter more and under the circumstances they may have prefer the approach you indicate with 'inpstr%split( .. '. Why should their points-of-view not be considered in the evolution of Fortran? Other languages such as C++, Python, Microsoft .NET, etc. constantly strive to serve its practitioners, why not Fortran?
I find users are quite discerning in terms of what they seek in Fortran and strings are at the very core of the needs.
Something is "deeply rotten in the state of" Fortran that the regular pleas of many Fortran users are being ignored when it comes to strings.
Cheers,