
wordposAtIndex() / wordAtIndex(): find the number of the word at a given index position


muk...@gmail.com

Aug 28, 2012, 10:22:37 AM
I am looking for a function that would return the number of a word at a given index position.

Example: assuming function syntax is wordposAtIndex(n, haystack)

wordposAtIndex(10, "Georg Friedrich Händel, a great componist") should return "3"

This way I can retrieve the word at that given index or perform other word proximity processing for that matter. The index would generally result from a rexx pos() or a regular expression match.

Cheers,
Madou



muk...@gmail.com

Aug 28, 2012, 10:24:55 AM
oops, the example should have been
wordposAtIndex(17, "Georg Friedrich Händel, a great componist") should return "3"

On Tuesday, 28 August 2012 16:22:38 UTC+2, (unknown) wrote the following:

Dave Saville

Aug 28, 2012, 11:44:42 AM
On Tue, 28 Aug 2012 14:22:37 UTC, muk...@gmail.com wrote:

> I am looking for a function that would return the number of a word at a given index position.
>
> Example: assuming function syntax is wordposAtIndex(n, haystack)
>
> wordposAtIndex(10, "Georg Friedrich Händel, a great componist") should return "3"
>
> This way I can retrieve the word at that given index or perform other word proximity processing for that matter. The index would generally result from a rexx pos() or a regular expression match.

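string = "Georg Friedrich Händel, a great componist"
n = 17
do i = 1 to words(string)
  x = wordindex(string, i)              /* start position of word i */
  if x > n then leave                   /* already past n: n is not in a word */
  if n < x + wordlength(string, i) then do
    say word(string, i) i               /* n falls inside word i */
    leave
  end
end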

Never got the hang of REXX functions but you get the idea. But, what
about if n hits punctuation or white space - which way do you jump?

--
Regards
Dave Saville

Barry Schwarz

Aug 28, 2012, 12:20:26 PM
On Tue, 28 Aug 2012 07:24:55 -0700 (PDT), muk...@gmail.com wrote:

>oops, the example should have been
>wordposAtIndex(17, "Georg Friedrich Händel, a great componist") should return "3"

Do you want the number of words from 1 to 17 or from 17 to end of
string? If the former, does WORDS(SUBSTR("Geor...",1,17)) do what
you want? If the latter, use WORDS(SUBSTR("Geor...",17)) but then
the answer should be 4.
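
As a quick check of the two interpretations, here is a minimal sketch using the example string from the thread. It assumes only the standard built-ins WORDS and SUBSTR; the variable name s is just for illustration.

```rexx
/* Sketch: both interpretations evaluated for index 17 */
s = "Georg Friedrich Händel, a great componist"
say words(substr(s, 1, 17))   /* words in positions 1 to 17: 3 */
say words(substr(s, 17))      /* words from position 17 to the end: 4 */
```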

>
>On Tuesday, 28 August 2012 16:22:38 UTC+2, (unknown) wrote the following:
>> I am looking for a function that would return the number of a word at a given index position.
>>
>> Example: assuming function syntax is wordposAtIndex(n, haystack)
>>
>> wordposAtIndex(10, "Georg Friedrich Händel, a great componist") should return "3"
>>
>> This way I can retrieve the word at that given index or perform other word proximity processing for that matter. The index would generally result from a rexx pos() or a regular expression match.
>>
>> Cheers,
>> Madou

--
Remove del for email

Gerard_Schildberger

Aug 28, 2012, 2:45:10 PM
Here is a short ditty:

/**/ parse arg n .; if n=='' then n=17
yyy='Georg Friedrich Händel, a great componist (sic)'
say '         1         2         3         4       '
say '12345678901234567890123456789012345678901234567'
say yyy
say
say 'word index at' n "is" words(left(yyy,n))
____________________________ Gerard Schildberger

muk...@gmail.com

Aug 28, 2012, 4:30:00 PM
thanks for the code bits.

Regarding the case when the index falls on a space character '20'x (a case I would normally not expect, but...) I would argue that the function should return a value indicating the absence of a word, e.g. 0. The same would apply to a tab character '09'x, to stay compliant with the rexx definition of whitespace.

So in the preceding example, an if-statement would probably do the trick:

if substr(yyy,n,1)==' ' | substr(yyy,n,1)=='09'x then
  return 0
else
  return words(left(yyy,n))

LesK

Aug 29, 2012, 12:47:17 AM
That's the ooRexx definition of whitespace. Other interpreters may only
define blank, which is required by ANSI.

--

Les (Change Arabic to Roman to email me)

Glenn Knickerbocker

Aug 29, 2012, 10:39:54 AM
On Wed, 29 Aug 2012 00:47:17 -0400, LesK wrote:
>That's the ooRexx definition of whitespace. Other interpreters may only
>define blank, which is required by ANSI.

If you want your function to be consistent with the implementation's
specification of white space, just use inexact comparison:

If substr(yyy, n, 1) = '' then Return 0

That's what it's there for.
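
Putting the pieces of the thread together, a complete function might look roughly like the sketch below. The name wordposAtIndex and its argument order come from the original question; the range check and the choice to return 0 for an out-of-range index are assumptions added here, not something the thread settled. Punctuation is treated as part of the word it is attached to, as WORDS does.

```rexx
/* Sketch only: wordposAtIndex is the hypothetical name from the question */
s = "Georg Friedrich Händel, a great componist"
say wordposAtIndex(17, s)    /* index 17 is the "H" of the third word */
say wordposAtIndex(16, s)    /* index 16 is the blank before that word */
exit

wordposAtIndex: procedure
  parse arg n, haystack
  /* assumption: an index outside the string means "no word" */
  if n < 1 | n > length(haystack) then return 0
  /* inexact "=" lets the implementation decide what counts as white space */
  if substr(haystack, n, 1) = '' then return 0
  /* n is inside a word, so count every word that has begun by position n */
  return words(left(haystack, n))
```

With the example string this should report 3 for index 17 and 0 for index 16. The word itself can then be fetched with word(s, result) whenever the result is greater than 0.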

¬R http://users.bestweb.net/~notr You are already too educated stupid to
understand the truth of nature's harmonic simultaneous 4-liter wine cube