substring finding problem!

fedora

Feb 14, 2010, 8:18:57 AM

Hi group!

Reading all the posts about Spinoza's efforts to create a string substitute
program, I wanted to code mine too, and the spec was not to use the
<string.h> library. But I already have a problem finding all the places where
a substring occurs in a string. I've been looking for a long time but am not
able to see where the error is. Any help on where I made the mistake is
appreciated. TIA:)

Code is :-

#include<stdlib.h>
#include<stdio.h>
#include<string.h>

unsigned strLength(char *s)
{
    unsigned idx;
    for (idx = 0; s[idx] != '\0'; idx++)
        ;
    return idx;
}

char *strSubstr(char *s, char *t, unsigned ls, unsigned lt)
{
    char *substr = 0;
    unsigned end = ls - lt, i, j;

    for (i = 0; i <= end; i++) {
        if (s[i] == t[0] && s[i + (lt-1)] == t[lt-1]) {
            for (j = 1; j < lt && s[i + j] == t[j]; j++)
                ;
            if (j == lt) {
                substr = s + i;
                i = end;
            }
        }
    }
    return substr;
}

unsigned findSubstr(char *s, char *t, unsigned ls, unsigned lt, char ***sp)
{
    unsigned n, m, lu;
    char *u;

    for (n = 0, u = s, lu = ls;
         lu >= lt && (u = strSubstr(u, t, lu, lt));
         n++, u += lt, lu = ((s+ls) - u))
        ;
    if (sp && (*sp = malloc(n * sizeof **sp))) {
        for (m = 0, u = s, lu = ls; m < n; m++, u += lt, lu = ((s+ls) - u))
            sp[0][m] = strSubstr(u, t, lu, lt);
    }
    return n;
}

int main()
{
    char **p;
    unsigned found, i;

    printf("heee e\n");
    found = findSubstr("heee", "e", strlen("heee"), strlen("e"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

    printf("hee e\n");
    found = findSubstr("hee", "e", strlen("hee"), strlen("e"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

    printf("hhhh h\n");
    found = findSubstr("hhhh", "h", strlen("hhhh"), strlen("h"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

}

Output :-

heee e
3 times
0x400a66, e
0x400a66, e
0x400a67, e
hee e
2 times
0x400a84, e
0x400a84, e
hhhh h
4 times
0x400a90, h
0x400a91, h
0x400a92, h
0x400a93, h

It is correct when the substring starts at the first character of the string,
but if not, the first match is always repeated twice...

Richard Heathfield

Feb 14, 2010, 8:27:36 AM

fedora wrote:
> Hi group!
>
> Reading all posts about Spinozas efforts to create string substitute
> program, i wanted to code mine too and specs was that not to use the
> <string.h> library. But already problem in finding all places where
> substring occurs in a string. i'm looking for long time but not able to see
> where error is. Any help on where i made mistake is appreciated. TIA:)
>
> Code is :-
>
> #include<stdlib.h>
> #include<stdio.h>
> #include<string.h>

Better:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

>
> unsigned strLength(char *s)
> {
> unsigned idx;
> for (idx = 0; s[idx] != '\0'; idx++)
> ;
> return idx;
> }

To find the length of a string, use strlen unless you have a compelling
reason not to. There is no evidence above of a compelling reason not to
use strlen.

> char *strSubstr(char *s, char *t, unsigned ls, unsigned lt)
> {
> char *substr = 0;
> unsigned end = ls - lt, i, j;
>
> for (i = 0; i <= end; i++) {
> if (s[i] == t[0] && s[i + (lt-1)] == t[lt-1]) {
> for (j = 1; j < lt && s[i + j] == t[j]; j++)
> ;
> if (j == lt) {
> substr = s + i;
> i = end;

What are you trying to do in this function?

> unsigned findSubstr(char *s, char *t, unsigned ls, unsigned lt, char ***sp)
> {
> unsigned n, m, lu;
> char *u;
>
> for (n = 0, u = s, lu = ls;
> lu >= lt && (u = strSubstr(u, t, lu, lt));
> n++, u += lt, lu = ((s+ls) - u))
> ;
> if (sp && (*sp = malloc(n * sizeof **sp))) {
> for (m = 0, u = s, lu = ls; m < n; m++, u += lt, lu = ((s+ls) - u))
> sp[0][m] = strSubstr(u, t, lu, lt);
> }
> return n;
> }

To find a substring, use strstr unless you have a compelling reason not
to. There is no evidence above of a compelling reason not to use strstr.

If you want help debugging your code, your first step is to choose more
meaningful names for your objects, so that we can more easily see what
you think you're doing.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

fedora

Feb 14, 2010, 9:11:35 AM

Richard Heathfield wrote:

> fedora wrote:
>> Hi group!
>>
>> Reading all posts about Spinozas efforts to create string substitute
>> program, i wanted to code mine too and specs was that not to use the
>> <string.h> library. But already problem in finding all places where
>> substring occurs in a string. i'm looking for long time but not able to
>> see where error is. Any help on where i made mistake is appreciated.
>> TIA:)
>>
>> Code is :-
>>
>> #include<stdlib.h>
>> #include<stdio.h>
>> #include<string.h>
>
> Better:
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
>
>>
>> unsigned strLength(char *s)
>> {
>> unsigned idx;
>> for (idx = 0; s[idx] != '\0'; idx++)
>> ;
>> return idx;
>> }
>
> To find the length of a string, use strlen unless you have a compelling
> reason not to. There is no evidence above of a compelling reason not to
> use strlen.

Not using string.h was the rule i thought.

>> char *strSubstr(char *s, char *t, unsigned ls, unsigned lt)
>> {
>> char *substr = 0;
>> unsigned end = ls - lt, i, j;
>>
>> for (i = 0; i <= end; i++) {
>> if (s[i] == t[0] && s[i + (lt-1)] == t[lt-1]) {
>> for (j = 1; j < lt && s[i + j] == t[j]; j++)
>> ;
>> if (j == lt) {
>> substr = s + i;
>> i = end;
>
> What are you trying to do in this function?

It returns the same value as strstr in string.h. It gives a pointer to the
first occurrence of string t in string s, or a null pointer otherwise. ls is
the length of s and lt is the length of t, the same values that strlen gives.

>> unsigned findSubstr(char *s, char *t, unsigned ls, unsigned lt, char
>> ***sp)
>> {
>> unsigned n, m, lu;
>> char *u;
>>
>> for (n = 0, u = s, lu = ls;
>> lu >= lt && (u = strSubstr(u, t, lu, lt));
>> n++, u += lt, lu = ((s+ls) - u))
>> ;
>> if (sp && (*sp = malloc(n * sizeof **sp))) {
>> for (m = 0, u = s, lu = ls; m < n; m++, u += lt, lu = ((s+ls) - u))
>> sp[0][m] = strSubstr(u, t, lu, lt);
>> }
>> return n;
>> }
>
>
> To find a substring, use strstr unless you have a compelling reason not
> to. There is no evidence above of a compelling reason not to use strstr.
>
> If you want help debugging your code, your first step is to choose more
> meaningful names for your objects, so that we can more easily see what
> you think you're doing.

findSubstr returns the no. of times t occurs in s (not overlapping) and if
sp is not a null ptr, it sets *sp to point to a list of pointers that point to
each start of t in s. i got this method from someone else in another
thread.. I think Ben Bacarisse.

Thanks for your comments! I'll post another version using strlen and better
var names shortly.

Richard Heathfield

Feb 14, 2010, 9:18:44 AM

fedora wrote:
> Richard Heathfield wrote:
>
<snip>

>> To find the length of a string, use strlen unless you have a compelling
>> reason not to. There is no evidence above of a compelling reason not to
>> use strlen.
>
> Not using string.h was the rule i thought.

It's a silly rule, best ignored.

<snip>

fedora

Feb 14, 2010, 9:40:35 AM

Posting program again with longer var names. i prefer short names since long
names run out of 80x25 screen. Also now using strlen from string.h.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char *strSubstr(char *str, char *subStr, unsigned lstr, unsigned lsubStr)
{
    char *substr = 0;
    unsigned lastIdx = lstr - lsubStr, firstIdx, subStrIdx;

    // locate first char of subStr in str
    for (firstIdx = 0; firstIdx <= lastIdx; firstIdx++) {
        if (str[firstIdx] == subStr[0] &&
            str[firstIdx + (lsubStr-1)] == subStr[lsubStr-1]) {

            // check if complete subStr occurs at this pos in str
            for (subStrIdx = 1; subStrIdx < lsubStr &&
                 str[firstIdx + subStrIdx] == subStr[subStrIdx]; subStrIdx++)
                ;
            if (subStrIdx == lsubStr) {
                // subStr found, so return its start and break frm loop
                substr = str + firstIdx;
                firstIdx = lastIdx;
            }
        }
    }
    return substr;
}

unsigned findSubstr(
    char *str,
    char *subStr,
    unsigned lstr,
    unsigned lsubStr,
    char ***sp)
{
    unsigned found, ctr, lu;
    char *u;

    // find how many times subStr is in str
    for (found = 0, u = str, lu = lstr;
         lu >= lsubStr && (u = strSubstr(u, subStr, lu, lsubStr));
         found++, u += lsubStr, lu = ((str + lstr) - u))
        ;

    // alloc space and copy the start of all subStr in str
    if (sp && (*sp = malloc(found * sizeof **sp))) {
        for (ctr = 0, u = str, lu = lstr;
             ctr < found;
             ctr++, u += lsubStr, lu = ((str + lstr) - u))
            sp[0][ctr] = strSubstr(u, subStr, lu, lsubStr);
    }
    return found;
}

int main()
{
    char **p;
    unsigned found, i;

    printf("heee e\n");
    found = findSubstr("heee", "e", strlen("heee"), strlen("e"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

    printf("hee e\n");
    found = findSubstr("hee", "e", strlen("hee"), strlen("e"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

    printf("hhhh h\n");
    found = findSubstr("hhhh", "h", strlen("hhhh"), strlen("h"), &p);
    printf("%u times\n", found);
    for (i = 0; i < found; i++)
        printf("\t%p, %c\n", (void*)p[i], p[i][0]);

}

Output is :-

heee e
3 times
0x400a46, e
0x400a46, e
0x400a47, e
hee e
2 times
0x400a64, e
0x400a64, e
hhhh h
4 times
0x400a70, h
0x400a71, h
0x400a72, h
0x400a73, h

So if the start of the substring is not at the first position in the string,
then it's always repeated twice in the output. And i can't see where the error
is for this. Any help is appreciated. TIA:)

spinoza1111

Feb 14, 2010, 10:43:49 AM

On Feb 14, 4:18 pm, fedora <no_m...@invalid.invalid> wrote:
> Hi group!
>
> Reading all posts about Spinozas efforts to create string substitute

It is not an effort. I produced one in two hours and worked
collaboratively with a couple of posters other than the regs to find
the bugs in a few more hours. It now works, as far as I know, and I am
adding a stress test and further improvements.

It is being deliberately renarrated as an "effort" by the regs owing
to their inability to solve the problem.

Ben Bacarisse

Feb 14, 2010, 12:39:23 PM

fedora <no_...@invalid.invalid> writes:

> Posting program again with longer var names. i prefer short names since long
> names run out of 80x25 screen.

Long is not the same as good! Choosing good names is very hard and to
a large extent it is not culturally neutral. For example, when
searching for a sub-string, I'd call the first location "anchor"
because I am used to the term "anchored search".

> Also now using strlen from string.h.

That's a good idea, but there is no reason to abandon the other
string functions. If you want an exercise, then you could try to
write a fast search and replace. That will, most likely, lead to you
looking for an alternative to using strstr rather than simply
re-writing it (your strSubstr is very similar to strstr) but you will
learn practical things along the way, such as how to find where your
code is spending time.

Note, there will never be a "fastest" version because what is fast
will depend on all sorts of variables such as the quality of your C
implementation and the kind of search and replace calls you do. For
example, my simple version is still the fastest I can write for very
long strings with only a few replacements because the strstr in glibc
is very good at searching long strings.

> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
>
> char *strSubstr(char *str, char *subStr, unsigned lstr, unsigned lsubStr)
> {
> char *substr = 0;
> unsigned lastIdx = lstr - lsubStr, firstIdx, subStrIdx;

You should use size_t for sizes like this -- it is the type returned
by strlen, but because it is an unsigned type you will need to think
about your method. You can't subtract the lengths like you do.
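To make the wraparound concrete, here is a minimal sketch (the helper name last_start is made up for illustration and is not part of the posted code). With unsigned types, lstr - lsubStr never goes negative when the sub-string is the longer one; it wraps to a huge value, so the comparison has to come before the subtraction:

```c
#include <stddef.h>

/* Hypothetical helper: stores in *last the last index at which a match of
   length sub_len could start in a string of length str_len.
   Returns 1 on success, 0 if the sub-string cannot fit at all. */
int last_start(size_t str_len, size_t sub_len, size_t *last)
{
    if (sub_len > str_len)
        return 0;               /* str_len - sub_len would wrap around */
    *last = str_len - sub_len;  /* safe: cannot wrap after the check */
    return 1;
}
```

With this shape, searching for "ab" in "a" is simply reported as "cannot match" instead of producing a loop bound near SIZE_MAX.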

> // locate first char of subStr in str
> for (firstIdx = 0; firstIdx <= lastIdx; firstIdx++) {
> if (str[firstIdx] == subStr[0] &&
> str[firstIdx + (lsubStr-1)] == subStr[lsubStr-1]) {

I am not sure it is worth doing this second test.

> // check if complete subStr occurs at this pos in str
> for (subStrIdx = 1; subStrIdx < lsubStr &&
> str[firstIdx + subStrIdx] == subStr[subStrIdx]; subStrIdx++)
> ;
> if (subStrIdx == lsubStr) {
> // subStr found, so return its start and break frm loop
> substr = str + firstIdx;
> firstIdx = lastIdx;

There is a statement to do that: break. Altering the loop variable to
break out of the loop makes the code harder to change.
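As a rough sketch of what that could look like (the name find_first and the size_t parameters are choices made for this illustration, and the first/last character pre-test is left out to keep it short):

```c
#include <stddef.h>

/* Sketch of the same search using break instead of overwriting the loop
   variable. Returns NULL when sub is empty, longer than str, or absent. */
char *find_first(char *str, const char *sub, size_t str_len, size_t sub_len)
{
    char *match = NULL;
    size_t start, k;

    if (sub_len == 0 || sub_len > str_len)
        return NULL;
    for (start = 0; start <= str_len - sub_len; start++) {
        /* compare the whole of sub against str at this position */
        for (k = 0; k < sub_len && str[start + k] == sub[k]; k++)
            ;
        if (k == sub_len) {
            match = str + start;
            break;              /* leave the outer loop at the first match */
        }
    }
    return match;
}
```

The loop bound can now change without anyone having to remember that some other line relies on it to end the loop.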

> }
> }
> }
> return substr;
> }
>
> unsigned findSubstr(
> char *str,
> char *subStr,
> unsigned lstr,
> unsigned lsubStr,
> char ***sp)
> {
> unsigned found, ctr, lu;
> char *u;
>
> // find how many times subStr is in str
> for (found = 0, u = str, lu = lstr;
> lu >= lsubStr && (u = strSubstr(u, subStr, lu, lsubStr));
> found++, u += lsubStr, lu = ((str + lstr) - u))
> ;

You have picked up an odd style. Using lots of variables in a for
loop is not very clear. How about:

size_t matches = 0, remaining_len = lstr;
char *u = str;
while (u = strSubstr(u, subStr, remaining_len, lsubStr)) {
matches++;
u += lsubStr;
remaining_len = str + lstr - u;
}

This is untested (and almost certainly wrong) but it shows another way
to write such loops.

Note that I've removed the lu >= lsubStr test. In effect this is the
first thing that strSubstr tests so there is no obvious need to
repeat it here.

>
> // alloc space and copy the start of all subStr in str
> if (sp && (*sp = malloc(found * sizeof **sp))) {
> for (ctr = 0, u = str, lu = lstr;
> ctr < found;
> ctr++, u += lsubStr, lu = ((str + lstr) - u))
> sp[0][ctr] = strSubstr(u, subStr, lu, lsubStr);

I'd rewrite that rather than packing all the work into the for loop
controls. Loops are clearer when you can see what is being done in
the loop.

> }
> return found;
> }
>
> int main()
> {
> char **p;
> unsigned found, i;
>
> printf("heee e\n");
> found = findSubstr("heee", "e", strlen("heee"), strlen("e"), &p);
> printf("%u times\n", found);
> for (i = 0; i < found; i++)
> printf("\t%p, %c\n", (void*)p[i], p[i][0]);
>
> printf("hee e\n");
> found = findSubstr("hee", "e", strlen("hee"), strlen("e"), &p);
> printf("%u times\n", found);
> for (i = 0; i < found; i++)
> printf("\t%p, %c\n", (void*)p[i], p[i][0]);
>
> printf("hhhh h\n");
> found = findSubstr("hhhh", "h", strlen("hhhh"), strlen("h"), &p);
> printf("%u times\n", found);
> for (i = 0; i < found; i++)
> printf("\t%p, %c\n", (void*)p[i], p[i][0]);
>
>
> }

A general point. Lots of people seem to pack tests into main. I
never do. I make main use its arguments so I have a general-purpose
test program. The tests I want to run often can then be put into a
script but I can quickly test new cases simply by typing a command.

<snip>
--
Ben.

fedora

Feb 14, 2010, 1:04:37 PM

Ben Bacarisse wrote:

> fedora <no_...@invalid.invalid> writes:
>
>> Posting program again with longer var names. i prefer short names since
>> long names run out of 80x25 screen.
>
> Long is not the same as good! Choosing good names is very hard and to
> a large extent it is not culturally neutral. For example, when
> searching for a sub-string, I'd call the first location "anchor"
> because I am used to the term "anchored search".

Good point:) i know my longer names are terrible but i wanted to post the
code quickly. I think the code is readable with shorter names but giving
names with good meaning is very hard!

>> Also now using strlen from string.h.
>
> That's a good idea, but there is not reason to abandon the other
> string functions. If you want an exercise, then you could try to
> write a fast search and replace. That will, most likely, lead to you
> looking for an alternative to using strstr rather than simply
> re-writing it (your strSubstr is very similar to strstr) but you will
> learn practical things along the way, such as how to find where you
> code is spending time.
>
> Note, there will never be a "fastest" version because what is fast
> will depend on all sorts of variables such as the quality of your C
> implementation and the kind of search and replace calls you do. For
> example, my simple version is still the fastest I can write for very
> long strings with only a few replacements because the strstr in glibc
> is very good at searching long strings.

Yes. my aim was to write a prog for the Spinoza contest ongoing but am still
stuck with the same error.

>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <string.h>
>>
>> char *strSubstr(char *str, char *subStr, unsigned lstr, unsigned lsubStr)
>> {
>> char *substr = 0;
>> unsigned lastIdx = lstr - lsubStr, firstIdx, subStrIdx;
>
> You should use size_t for sizes like this -- it is the type returned
> by strlen, but because it is an unsigned type you will need to think
> about your method. You can't subtract the lengths like you do.

Can you explain in which case (for length of str and subStr) the subtraction
will be wrong? i considered both strings of same length but cant see any
error.

lsubStr has to be <= lstr. i just assume that condition from calling code!

>> // locate first char of subStr in str
>> for (firstIdx = 0; firstIdx <= lastIdx; firstIdx++) {
>> if (str[firstIdx] == subStr[0] &&
>> str[firstIdx + (lsubStr-1)] == subStr[lsubStr-1]) {
>
> I am not sure it is worth doing this second test.

I just thought if the last position also matched then the chance that it is
the substring is much higher, so testing for both 1st & last positions would
decrease the no. of times the inner loop has to run...

>> // check if complete subStr occurs at this pos in str
>> for (subStrIdx = 1; subStrIdx < lsubStr &&
>> str[firstIdx + subStrIdx] == subStr[subStrIdx]; subStrIdx++)
>> ;
>> if (subStrIdx == lsubStr) {
>> // subStr found, so return its start and break frm loop
>> substr = str + firstIdx;
>> firstIdx = lastIdx;
>
> There is a statement to do that: break. Altering the loop variable to
> break out of the loop makes the code harder to change.

i read somewhere that break and continue are almost as bad as goto and that
loops should terminate by their test expressions for readable elegant code.

>
>> }
>> }
>> }
>> return substr;
>> }
>>
>> unsigned findSubstr(
>> char *str,
>> char *subStr,
>> unsigned lstr,
>> unsigned lsubStr,
>> char ***sp)
>> {
>> unsigned found, ctr, lu;
>> char *u;
>>
>> // find how many times subStr is in str
>> for (found = 0, u = str, lu = lstr;
>> lu >= lsubStr && (u = strSubstr(u, subStr, lu, lsubStr));
>> found++, u += lsubStr, lu = ((str + lstr) - u))
>> ;
>
> You have picked up an odd style. Using lots of variables in a for
> loop is not very clear. How about:
>
> size_t matches = 0, remaining_len = lstr;
> char *u = str;
> while (u = strSubstr(u, subStr, remaining_len, lsubStr)) {
> matches++;
> u += lsubStr;
> remaining_len = str + lstr - u;
> }
>
> This is untested (and almost certainly wrong) but it shows another way
> to write such loops.

Ok! it's just so hard to be sure that all these lengths and offsets will
always behave right for all cases of string and substrings! i thought
through my loops for cases of equal lengths etc, and i just can't spot where
the mistake is...

> Note that I've removed the lu >= lsubStr test. In effect this is the
> first thing that strSubstr tests so there is no obvious need to
> repeat it here.

hmm okay. any minute change and i'm not sure where all it'd affect the code.
pointers+strings is tricky!

Yeah, i wrote a version where main will accept strings from user in a loop
and feed them to findSubstr and return the results and keep looping till
CTRL-C but that version is sometimes giving segfault and sometimes working
okay for same string pairs!!

right now i cant understand my own code. now i want to rewrite everything
but in very simple statements and loops so i can figure out where i'm going
wrong.

pete

Feb 14, 2010, 1:31:36 PM

fedora wrote:
> Richard Heathfield wrote:

>>What are you trying to do in this function?
>
>
> It returns same value as strstr in string.h.

If you write
str_len, str_chr, and str_ncmp
first, then
str_str
is pretty simple to write without using string.h.

http://www.mindspring.com/~pfilandr/C/library/str_ing.c
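For readers who cannot reach that link, here is one way the layering could look. This is only a sketch written for this thread, not the code from the linked file; the four names come from the post, the bodies are assumptions:

```c
#include <stddef.h>

/* Length of s, not counting the terminating null. */
size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* First occurrence of c in s, or NULL; c == '\0' finds the terminator. */
char *str_chr(const char *s, int c)
{
    for (;; s++) {
        if (*s == (char)c)
            return (char *)s;
        if (*s == '\0')
            return NULL;
    }
}

/* Compares at most n characters; returns <0, 0 or >0 like strncmp. */
int str_ncmp(const char *a, const char *b, size_t n)
{
    for (; n > 0; a++, b++, n--) {
        if (*a != *b)
            return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
        if (*a == '\0')
            break;
    }
    return 0;
}

/* First occurrence of t in s, or NULL; an empty t matches at s. */
char *str_str(const char *s, const char *t)
{
    size_t t_len = str_len(t);

    if (t_len == 0)
        return (char *)s;
    /* jump to each candidate first character, then compare all of t */
    while ((s = str_chr(s, t[0])) != NULL) {
        if (str_ncmp(s, t, t_len) == 0)
            return (char *)s;
        s++;
    }
    return NULL;
}
```

Each helper is small enough to test on its own, which is most of the point of building str_str this way.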

--
pete

Ike Naar

Feb 14, 2010, 2:24:33 PM

In article <hl8gek$ev1$1...@news.eternal-september.org>,

fedora <no_...@invalid.invalid> wrote:
>unsigned findSubstr(
> char *str,
> char *subStr,
> unsigned lstr,
> unsigned lsubStr,
> char ***sp)
>{
> unsigned found, ctr, lu;
> char *u;
>
> // find how many times subStr is in str
> for (found = 0, u = str, lu = lstr;
> lu >= lsubStr && (u = strSubstr(u, subStr, lu, lsubStr));
> found++, u += lsubStr, lu = ((str + lstr) - u))
> ;
>
> // alloc space and copy the start of all subStr in str
> if (sp && (*sp = malloc(found * sizeof **sp))) {
> for (ctr = 0, u = str, lu = lstr;
> ctr < found;
> ctr++, u += lsubStr, lu = ((str + lstr) - u))
> sp[0][ctr] = strSubstr(u, subStr, lu, lsubStr);

Here is your bug; you want

sp[0][ctr] = u = strSubstr(u, subStr, lu, lsubStr);

> }
> return found;
>}

Ben Bacarisse

Feb 14, 2010, 2:38:31 PM

fedora <no_...@invalid.invalid> writes:

> Ben Bacarisse wrote:
>
>> fedora <no_...@invalid.invalid> writes:

<snip>

>>> Also now using strlen from string.h.
>>
>> That's a good idea, but there is not reason to abandon the other
>> string functions. If you want an exercise, then you could try to
>> write a fast search and replace. That will, most likely, lead to you
>> looking for an alternative to using strstr rather than simply
>> re-writing it (your strSubstr is very similar to strstr) but you will
>> learn practical things along the way, such as how to find where you
>> code is spending time.
>>
>> Note, there will never be a "fastest" version because what is fast
>> will depend on all sorts of variables such as the quality of your C
>> implementation and the kind of search and replace calls you do. For
>> example, my simple version is still the fastest I can write for very
>> long strings with only a few replacements because the strstr in glibc
>> is very good at searching long strings.
>
> Yes. my aim was to write a prog for the Spinoza contest ongoing but am still
> stuck with the same error.

Sorry, I missed that you had an error you were stuck on. The problem is
that the first and second loops in findSubstr don't do the same
thing. The first correctly advances u by the match length *from the
last match*, because it assigns the result of strSubstr back to u. The
second loop never assigns to u, so it advances u by only the match
length (from its last value).

The fix (using your style):

    if (sp && (*sp = malloc(found * sizeof **sp))) {
        for (ctr = 0, u = str, lu = lstr;
             ctr < found;
             ctr++, lu = ((str + lstr) - u)) {
            sp[0][ctr] = strSubstr(u, subStr, lu, lsubStr);
            u = sp[0][ctr] + lsubStr;
        }
    }

This could be written much more clearly. I don't want to bang on
about the same thing all the time, but bugs are the compiler's way of
telling you that your code needs to be clearer!
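One possible shape for a clearer version, as a sketch only: it leans on strstr and strlen from <string.h> rather than the hand-written search, and the name find_all and its exact interface are assumptions made for this example. The point is that both passes use the identical traversal, so they cannot drift apart the way the two loops above did:

```c
#include <stdlib.h>
#include <string.h>

/* Counts non-overlapping occurrences of sub in str. If out is not NULL,
   *out receives a malloc'd array of the match positions (caller frees),
   or NULL if there are no matches or the allocation fails. */
size_t find_all(char *str, const char *sub, char ***out)
{
    size_t sub_len = strlen(sub), count = 0, i = 0;
    char *pos;

    if (out)
        *out = NULL;
    if (sub_len == 0)
        return 0;

    /* Pass 1: count matches, always resuming after the previous match. */
    for (pos = strstr(str, sub); pos; pos = strstr(pos + sub_len, sub))
        count++;

    if (!out || count == 0 || !(*out = malloc(count * sizeof **out)))
        return count;

    /* Pass 2: the same traversal, this time recording each position. */
    for (pos = strstr(str, sub); pos; pos = strstr(pos + sub_len, sub))
        (*out)[i++] = pos;
    return count;
}
```

For "heee" and "e" this records three distinct positions (offsets 1, 2 and 3) instead of repeating the first one.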

>>> #include <stdlib.h>
>>> #include <stdio.h>
>>> #include <string.h>
>>>
>>> char *strSubstr(char *str, char *subStr, unsigned lstr, unsigned lsubStr)
>>> {
>>> char *substr = 0;
>>> unsigned lastIdx = lstr - lsubStr, firstIdx, subStrIdx;
>>
>> You should use size_t for sizes like this -- it is the type returned
>> by strlen, but because it is an unsigned type you will need to think
>> about your method. You can't subtract the lengths like you do.
>
> Can you explain in which case (for length of str and subStr) the subtraction
> will be wrong? i considered both strings of same length but cant see any
> error.
>
> lsubStr has to be <= lstr. i just assume that condition from calling
> code!

Then there is no error! I think, though, that this is a rather strong
condition to place on the calling code. That the pointers are not
NULL seems a reasonable condition; even that the match string is not
zero length; but that you can't search for "ab" in "a" seems rather
too restrictive.

BTW, you can document these "caller contract" restrictions in the code
by including assert calls:

assert(str && subStr && lsubStr <= lstr && lsubStr > 0);

(#include <assert.h> at the top).
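A small self-contained illustration of the idea (starts_with is a made-up function for this example, not part of the posted program):

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if sub occurs at the very start of str, 0 otherwise.
   Caller contract, checked only when NDEBUG is not defined:
   both pointers non-null and 0 < sub_len <= str_len. */
int starts_with(const char *str, const char *sub,
                size_t str_len, size_t sub_len)
{
    size_t k;

    assert(str && sub && sub_len <= str_len && sub_len > 0);
    for (k = 0; k < sub_len && str[k] == sub[k]; k++)
        ;
    return k == sub_len;
}
```

A caller that breaks the contract gets an immediate abort with the file and line, rather than a wrapped-around index somewhere later. Compiling with NDEBUG defined removes the check, so it documents the contract without replacing real input validation.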

>>> // locate first char of subStr in str
>>> for (firstIdx = 0; firstIdx <= lastIdx; firstIdx++) {
>>> if (str[firstIdx] == subStr[0] &&
>>> str[firstIdx + (lsubStr-1)] == subStr[lsubStr-1]) {
>>
>> I am not sure it is worth doing this second test.
>
> I just thought if last position also matched then chance that it is the sub-
> striong is much higher, so testing for both 1st & last positions would
> decrease the no. of times the inner loop has to run...

... at the expense of more code. I'd put this in only after testing
that, in general, it pays off.

>>> // check if complete subStr occurs at this pos in str
>>> for (subStrIdx = 1; subStrIdx < lsubStr &&
>>> str[firstIdx + subStrIdx] == subStr[subStrIdx]; subStrIdx++)
>>> ;
>>> if (subStrIdx == lsubStr) {
>>> // subStr found, so return its start and break frm loop
>>> substr = str + firstIdx;
>>> firstIdx = lastIdx;
>>
>> There is a statement to do that: break. Altering the loop variable to
>> break out of the loop makes the code harder to change.
>
> i read somewhere that break and continue are almost as bad as go to and
> loops should terminate by their test expressions for readable elegant code.

Yes, some people are of that opinion. I am not, but I doubt that even
people with that opinion would advocate terminating the loop by
setting the loop variable. It is likely that they'd re-write the loop
in some new way but maybe if there is anyone of that opinion here they
could chip in. I don't like speaking for views I don't hold!

Ah, you need to free yourself from that fear. Experience helps, but
striving to write the clearest code you can makes it much simpler to
be sure of your code. Do you ever reason about your code? For
example, do you assert the negation of a loop's condition at the end of
the loop to see what it really means for the code that follows? Over the
years, I've found more bugs doing this than by any other method.
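A tiny example of that kind of reasoning, written for illustration (index_of is a hypothetical function, not something from the program under discussion):

```c
#include <stddef.h>

/* Returns the index of the first c in s, or the length of s if c is absent. */
size_t index_of(const char *s, char c)
{
    size_t i = 0;

    while (s[i] != '\0' && s[i] != c)
        i++;
    /* The loop condition is now false, so by De Morgan:
           s[i] == '\0' || s[i] == c
       Either we stopped on the terminator (c not found) or s[i] is the
       first match. Nothing else is possible, and i is in bounds either way. */
    return i;
}
```

Writing the negated condition down as a comment (or an assert) makes it obvious what the code after the loop is allowed to assume.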

I would not use input, I'd use argc and argv. For example, here is
what I wrote to investigate your bug:

int main(int argc, char **argv)
{
    if (argc == 3) {
        char **p;
        const char *s = argv[1], *m = argv[2];
        size_t mlen = strlen(m);
        unsigned i, found = findSubstr(s, m, strlen(s), mlen, &p);

        printf("In \"%s\" find \"%s\"\n%u times:\n", s, m, found);
        for (i = 0; i < found; i++) {
            int off = p[i] - s;
            printf("\t%d, %.*s<%.*s>%s\n",
                   off, off, s, (int)mlen, p[i], s + off + mlen);
        }
    }
    return 0;
}

> right now i cant understand my own code. now i want to rewrite everything
> but in very simple statements and loops so i can figure out where i'm going
> wrong.

It will come. BTW, kudos for your (void *) cast when printing with %p!

--
Ben.

bartc

Feb 14, 2010, 4:28:42 PM

"fedora" <no_...@invalid.invalid> wrote in message
news:hl8blk$s5$1...@news.eternal-september.org...

> Reading all posts about Spinozas efforts to create string substitute
> program, i wanted to code mine too and specs was that not to use the
> <string.h> library. But already problem in finding all places where
> substring occurs in a string. i'm looking for long time but not able to
> see
> where error is. Any help on where i made mistake is appreciated. TIA:)

> #include<string.h>

You won't need this then...

> char *strSubstr(char *s, char *t, unsigned ls, unsigned lt)

> unsigned findSubstr(char *s, char *t, unsigned ls, unsigned lt, char
> ***sp)

Some comments about what each of these do wouldn't be amiss.

> printf("heee e\n");
> found = findSubstr("heee", "e", strlen("heee"), strlen("e"), &p);
> printf("%u times\n", found);
> for (i = 0; i < found; i++)
> printf("\t%p, %c\n", (void*)p[i], p[i][0]);

...
You're repeating code here that's best in a loop or in a function.

I've put together some code that also counts substrings, and also avoids
string.h, although it allows them to overlap so the results may not be the
same (so that "AA" is 3 substrings of "AAAA", not 2):

#include <stdio.h>

/* return how many times t occurs in s */
int findsubstrings(char *s, char *t){
    int count=0;
    char *p,*q;

    if (*s==0 || *t==0) return 0;

    while (*s) {
        p=s;
        q=t;
        while (*p && *q && *p++==*q) ++q;
        if (*q==0) ++count;
        ++s;
    }
    return count;
}

void test(char *s,char *t){
    printf("\"%s\" occurs %d times in \"%s\"\n",t,findsubstrings(s,t),s);
}

int main(void){
    test("sisisisisisis","sis");
}
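
To show the difference between the two counting rules side by side, here is a sketch of a variant with a switch for overlapping matches. The function name and the allow_overlap flag are inventions for this example; the scanning idea is the same as above:

```c
/* Counts occurrences of t in s. If allow_overlap is zero, the scan resumes
   just past the end of each match instead of one character further on. */
int count_substrings(const char *s, const char *t, int allow_overlap)
{
    int count = 0;

    if (*t == '\0')
        return 0;
    while (*s) {
        const char *p = s, *q = t;

        /* advance both pointers while t keeps matching at this position */
        while (*q && *p == *q) {
            ++p;
            ++q;
        }
        if (*q == '\0') {
            ++count;
            s = allow_overlap ? s + 1 : p;   /* p is just past the match */
        } else {
            ++s;
        }
    }
    return count;
}
```

With "sis" in "sisisisisisis" this gives 6 when overlaps are allowed and 3 when they are not; "AA" in "AAAA" gives 3 and 2.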

--
bartc

Seebs

Feb 14, 2010, 4:59:30 PM

On 2010-02-14, fedora <no_...@invalid.invalid> wrote:
> Not using string.h was the rule i thought.

No, that was just Nilges being contrary to the point of stupidity.

That said, it could be a good exercise. You've already had one of the
key insights: The thing replacing strstr() should be written as a function
which is called by other functions, so as not to overcomplicate a single
gigantic function.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Ben Bacarisse

Feb 14, 2010, 5:49:52 PM

Seebs <usenet...@seebs.net> writes:

> On 2010-02-14, fedora <no_...@invalid.invalid> wrote:
>> Not using string.h was the rule i thought.
>
> No, that was just Nilges being contrary to the point of stupidity.
>
> That said, it could be a good exercise. You've already had one of the
> key insights: The thing replacing strstr() should be written as a function
> which is called by other functions, so as not to overcomplicate a single
> gigantic function.

Another insight is that if one is not using strstr then its
replacement should be more helpful than strstr is.

The trouble is that strstr(x, y); returns NULL when y is not in x. It
scans x and then tells you almost nothing. If I were re-doing this
I'd at least make my strstr replacement act like GNU's strchrnul
(there is no strstrnul in GNU's library). I.e. str_str_nul should
return a pointer to the end of its first argument string when the
search fails. At the least this would allow one to avoid re-scanning
just to find the length[1].

Even when the search string /is/ present, the final call always scans
that tail with no valuable data being returned.

I'd argue that strstr should have been defined this way from the
start, but such is the C library.

[1] At the expense of limiting the code to strings no longer than
PTRDIFF_MAX characters. I think it is quite fiddly to avoid this
restriction so I am not too bothered by that.
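
A rough sketch of what such a function might look like, assuming the behaviour described above (the body is a plain naive search written for illustration; only the name str_str_nul comes from this post):

```c
#include <stddef.h>

/* Hypothetical str_str_nul: like strstr, but on failure it returns a
   pointer to the terminating null of s instead of NULL.
   An empty t matches at s. */
char *str_str_nul(char *s, const char *t)
{
    for (; *s; s++) {
        size_t k = 0;

        while (t[k] && s[k] == t[k])
            k++;
        if (t[k] == '\0')
            return s;       /* all of t matched starting here */
    }
    return s;               /* not found: s now points at the '\0' */
}
```

The caller then tests *result rather than result, and on failure result - s is the length of the string, so no second scan is needed to find it.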

--
Ben.

Seebs

Feb 14, 2010, 5:49:24 PM

On 2010-02-14, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> Another insight is that if one is not using strstr then its
> replacement should be more helpful that strstr is.

That's a point.

> The trouble is that strstr(x, y); returns NULL when y is not in x. It
> scans x and then tells you almost nothing. If I were re-doing this
> I'd at least make my strstr replacement act like GNU's strchrnul
> (there is no strstrnul in GNU's library). I.e. str_str_nul should
> return a pointer to the end of its first argument string when the
> search fails. At the least this would allow one to avoid re-scanning
> just to find the length[1].

And you'd replace the if (ptr) with if (*ptr), which would be fine. Hmm,
I like that.

> I'd argue that strtsr should have been defined this way from the
> start, but such is the C library.

Hmm.

Here's my thought: The C library functions we currently have are defined
in a way that's simple and well-defined; "return a pointer to X", and if
you can't, don't return a valid pointer. I think that's easier to specify
or discuss than "return a pointer to X, or possibly a pointer to Y".

> [1] At the expense of limiting the code to strings no longer than
> PTRDIFF_MAX characters. I think it is quite fiddly to avoid this
> restriction so I am not too bothered by that.

Yeah.

Chris M. Thomasson

Feb 14, 2010, 7:53:19 PM

Here is my humble little entry that took me around a half an hour or so to
create:

http://clc.pastebin.com/f62504e4c

If you want to avoid using `string.h' then you are going to have to implement
the following functions:
_________________________________________________
#define xstrstr strstr
#define xstrlen strlen
#define xstrcmp strcmp
#define xmemcpy memcpy
_________________________________________________

I personally don't see any need to do that unless you want to go through a
learning experience. Or perhaps if you just "know" that those functions are
very poorly implemented on your platform. Anyway, this code pre-computes all
of the substring matches and stores them in a linked-list. This gets around
having to scan the source string twice. It's fairly good at reducing the
number of list nodes by allowing a single node to hold multiple offsets into
the source string. So, in the code as-is, `malloc()/free()' is completely
avoided on list nodes _if_ there are less than or equal to 256 matches.
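
The pastebin code is not reproduced in this thread, so what follows is only a sketch of the general shape described above: a list whose nodes each hold many offsets, with the first node embedded so that nothing is allocated until it fills up. Every name here (`struct chunk`, `match_list_push`, `CHUNK_SLOTS`) is hypothetical, and the 256 is simply the figure mentioned in the post:

```c
#include <stdlib.h>
#include <stddef.h>

#define CHUNK_SLOTS 256   /* offsets per node; figure taken from the post */

struct chunk {
    struct chunk *next;
    size_t used;                   /* slots filled in this node */
    size_t offset[CHUNK_SLOTS];    /* match offsets into the source */
};

struct match_list {
    struct chunk first;            /* embedded: no malloc for the first 256 */
    struct chunk *tail;            /* node currently being filled */
    size_t count;                  /* total offsets stored */
};

void match_list_init(struct match_list *ml)
{
    ml->first.next = NULL;
    ml->first.used = 0;
    ml->tail = &ml->first;
    ml->count = 0;
}

/* Append one offset. Returns 1 on success, 0 if malloc fails. */
int match_list_push(struct match_list *ml, size_t off)
{
    if (ml->tail->used == CHUNK_SLOTS) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return 0;
        c->next = NULL;
        c->used = 0;
        ml->tail->next = c;
        ml->tail = c;
    }
    ml->tail->offset[ml->tail->used++] = off;
    ml->count++;
    return 1;
}

/* Free only the heap nodes; the embedded first node is reset. */
void match_list_free(struct match_list *ml)
{
    struct chunk *c = ml->first.next;
    while (c != NULL) {
        struct chunk *next = c->next;
        free(c);
        c = next;
    }
    match_list_init(ml);
}
```

Because `tail` points into the structure itself, a `struct match_list` in this sketch should not be copied by value after `match_list_init`.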

Any questions?

;^)

Chris M. Thomasson

Feb 15, 2010, 12:41:28 AM

"Stefan Ram" <r...@zedat.fu-berlin.de> wrote in message
news:rand-2010...@ram.dialup.fu-berlin.de...

> "Chris M. Thomasson" <n...@spam.invalid> writes:
>>Or perhaps if you just "know" that those functions are
>>very poorly implemented on your platform.
>

> Often, one can indeed assume that "stdlib.h:rand()" is
> implemented in such a manner that it is not very random
> in the less significant bits - although this knowledge
> has nothing to do with the language C but only with the
> culture of C implementations.

good point. Humm... I would hope that `strstr()' does not commonly use a
naive algorithm to search for substrings.

spinoza1111

Feb 15, 2010, 4:35:18 PM

On Feb 15, 12:59 am, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-02-14, fedora <no_m...@invalid.invalid> wrote:
>
> > Not using string.h was the rule i thought.
>
> No, that was just Nilges being contrary to the point of stupidity.
>
> That said, it could be a good exercise. You've already had one of the
> key insights: The thing replacing strstr() should be written as a function
> which is called by other functions, so as not to overcomplicate a single
> gigantic function.
>

Hee hee. Sure is taking you tomatoes a long time to "ketchup" with me.
Peter, shouldn't you be checking my code like I told you? If you could
find a bug in the latest version I posted (and only that version, dear
heart), it would be a real feather in your little cap. In fact, I'll
send you a check for 25 Hong Kong dollars.
> -s
> --
> Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
> http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
> http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

spinoza1111

Feb 15, 2010, 4:53:03 PM

Yeah, Chris. I have a question. Why did you call it an "entry" when
this (to me, anyway) implied that it was a contest entry to the
Spinoza challenge? Please don't be corrupted by the dishonesty and
brutality of these newsgroups.

Sure, you do say that I have to implement FOUR (4) non-trivial library
functions.

But by saying "it took me a half hour" I read an implied, perhaps
unintended slight at the approximately six hours I took ... where only
in dysfunctional corporations is it a bad thing to take a little extra
care and a little extra time in anticipation of difficulty.

A week late, in this thread, Seebach, Bacarisse et al. seem to be
running into confusion trying to help the OP meet the original
challenge. But I note nobody harassing them or the original poster,
targeting them for abuse.

In a sense, "I have only myself to blame" for this, because a year or
so ago, I jumped all over Seebach for his attacks on Schildt. I feel I
was right to do so, *et je ne regrette rien*. Nonetheless, I'm tired
of his lies.

If your "slight" was unintended, I apologize.

Without knowing as much about C as the regs, esp. postmodern C and the
standards (I'll be the first to concede this), I've left them in the
dust as regards my challenge. I've completed my solution, although I
am refining it by adding a stress test and may post a proof elsethread
proving that the code matches the algorithm statement I've posted
elsethread, and in so doing I may find a bug.

This is because "knowing C" is different from "knowing how to program"
and given the serious design flaws of C, there could be "knowing too
much about C".

But as far as I can see, no one's mastered the problem to the same
extent, Seebach perhaps least of all, because he wasted too much time
last week attacking me. He may redeem himself by helping the OP of
this thread, but he's never even tried to write his own solution.

>
> ;^)

spinoza1111

Feb 15, 2010, 5:02:49 PM

I'd say that it better. It cannot use Boyer Moore or Knuth Morris
Pratt IF they use tables, and I believe they do, since that implies
(as far as I can tell) state (like malloc) or else an extra parameter
in the call.
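
Whether any particular library's strstr uses a table-driven search is not something this thread establishes. As a side note, though, a skip table does not by itself require persistent state or an extra parameter: it can live in automatic storage for the duration of one call. A sketch using the Horspool simplification of Boyer-Moore, with a hypothetical function name and no use of <string.h>:

```c
#include <limits.h>
#include <stddef.h>

static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Horspool search. The skip table is a local array, so the function
   keeps no state between calls and needs no extra argument.
   Returns a pointer to the first match, or NULL if there is none. */
const char *find_horspool(const char *hay, const char *needle)
{
    size_t skip[UCHAR_MAX + 1];
    size_t n = str_len(hay), m = str_len(needle), i, pos;

    if (m == 0)
        return hay;          /* same convention as strstr */
    if (m > n)
        return NULL;

    /* Default shift is the whole needle length. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = m;
    /* Characters in the needle (except its last) allow shorter shifts. */
    for (i = 0; i + 1 < m; i++)
        skip[(unsigned char)needle[i]] = m - 1 - i;

    pos = 0;
    while (pos <= n - m) {
        size_t j = m;
        /* Compare right to left. */
        while (j > 0 && hay[pos + j - 1] == needle[j - 1])
            j--;
        if (j == 0)
            return hay + pos;
        pos += skip[(unsigned char)hay[pos + m - 1]];
    }
    return NULL;
}
```

The cost is filling the table on every call, which only pays off when the text is long relative to the needle.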

cf. John Hennessy's books from Morgan Kaufmann on computer
architecture: I'm willing to bet that strstr uses a straightforward
algorithm, that runs best on RISC, and RISC-influenced chips
(including modern Intel chips, influenced as they are), without
microcoded or hardwired special-purpose scan instructions...just
optimized character operations.

But take this cum grano salis. Although I took Computer Architecture
in grad school (me got an A) and worked on an architecture team in
Silicon Valley, I'm an English teacher today, and may not be *au
courant*.

And note that "using strstr" has its own dangers. IT FINDS OVERLAPPING
STRINGS. If you use it to construct a table of replace points you're
gonna have an interesting bug-o-rama:

replace("banana", "ana", "ono")

IF you restart one position after the find point, and not at its end.
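
The difference between the two restart rules can be sketched with a small counting function. The names are hypothetical and the matcher is a plain character-by-character one, written without <string.h>:

```c
#include <stddef.h>

static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Nonzero if the text at s begins with t. */
static int starts_with(const char *s, const char *t)
{
    while (*t != '\0' && *s == *t) {
        s++;
        t++;
    }
    return *t == '\0';
}

/* Count occurrences of t in s. If allow_overlap is nonzero the scan
   restarts one position after each find point; otherwise it restarts
   at the end of the match. An empty t counts as zero matches. */
size_t count_matches(const char *s, const char *t, int allow_overlap)
{
    size_t lt = str_len(t), n = 0;

    if (lt == 0)
        return 0;
    while (*s != '\0') {
        if (starts_with(s, t)) {
            n++;
            s += allow_overlap ? 1 : lt;
        } else {
            s++;
        }
    }
    return n;
}
```

For "banana" and "ana" this counts 2 matches (offsets 1 and 3) with overlap allowed and 1 match (offset 1) without, which is exactly the discrepancy that would corrupt a table of replace points.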

Moral: don't let the library do your thinking for you.

Tom St Denis

Feb 15, 2010, 5:41:32 PM

On Feb 14, 3:27 am, Richard Heathfield <r...@see.sig.invalid> wrote:
> To find the length of a string, use strlen unless you have a compelling
> reason not to. There is no evidence above of a compelling reason not to
> use strlen.

In production code, perhaps. But if a student is trying to learn comp. sci.
through expressing ideas in C, they are actually BETTER served by
writing their own versions of algorithms we take for granted.

Tom

spinoza1111

Feb 15, 2010, 6:08:59 PM

Update: Willem posted an exciting, if apparently buggy, solution in
the thread where Peter mis-spelled "efficiency" in the name, and I
just had a chance to look at it. It uses recursion in place of any
data structure whatsoever. And in the same thread another poster seems
to claim he did a working (if for me hard to read) solution on day
one: I have asked him to use my test suite.

And yes. A solution WITH BUGS can be more intelligent than a clean
one. A solution that TAKES A LONG TIME can likewise be better than one
that doesn't. I was also prepared to concede victory to Willem despite
his one character identifiers because of the beauty of his idea: use
recursion and not a data structure.
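
Willem's code is not reproduced in this thread, so the following is not his solution. It is only a sketch, under that assumption, of how recursion can stand in for a table of match points: each call handles the text up to one match, the deepest call allocates the result once the final length is known, and each call fills in its own piece on the way back. All names are hypothetical:

```c
#include <stdlib.h>
#include <stddef.h>

static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Nonzero if the text at s begins with t. */
static int starts_with(const char *s, const char *t)
{
    while (*t != '\0' && *s == *t) {
        s++;
        t++;
    }
    return *t == '\0';
}

/* out_pos is where this call's output begins in the final string. */
static char *build(const char *src, const char *cmp, size_t lc,
                   const char *xchg, size_t lx, size_t out_pos)
{
    size_t lit = 0, i;
    char *dst;

    /* Length of the literal text before the next match (or the end). */
    while (src[lit] != '\0' && !starts_with(src + lit, cmp))
        lit++;

    if (src[lit] == '\0') {
        /* Deepest call: the total length is now known. */
        dst = malloc(out_pos + lit + 1);
        if (dst == NULL)
            return NULL;
        dst[out_pos + lit] = '\0';
    } else {
        /* Recurse past the match, then write the replacement. */
        dst = build(src + lit + lc, cmp, lc, xchg, lx, out_pos + lit + lx);
        if (dst == NULL)
            return NULL;
        for (i = 0; i < lx; i++)
            dst[out_pos + lit + i] = xchg[i];
    }
    for (i = 0; i < lit; i++)
        dst[out_pos + i] = src[i];
    return dst;
}

/* Returns a malloc'd string, or NULL on failure or an empty cmp. */
char *replace_rec(const char *src, const char *cmp, const char *xchg)
{
    size_t lc = str_len(cmp);

    if (lc == 0)
        return NULL;
    return build(src, cmp, lc, xchg, str_len(xchg), 0);
}
```

In this particular sketch the recursion depth is one frame per match, so stack use grows with the number of matches in the input. That property belongs to the sketch and says nothing about the depth of Willem's version.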

Corporate thinking is applied Positivism and reductionism. Instead of
the intolerable to many effort of thinking, everything becomes a
reductionistic saw or maxim applied without feeling in the manner of
the performance review: so and so uses one character identifiers, or
likes to take more time, or is "verbose", or doesn't have comments, or
has too many comments, so therefore he's "reduced" to the incompetent
cipher, the infinitesimal that everyone in capitalism feels himself to
be, and fears himself to be.

Whereas I was, while setting up a test of Willem's code, rooting for
him. I was willing to hand the gold medal over to him, one character
identifiers and all, because of the beauty of his idea.

"The Good" in capitalist society is always reduced in the Positivist
spirit to something else because of the cash nexus and its alienation
of us from our selves and each other, which leads people around in a
(recursive) ring, chasing goals that in all cases are subgoals of
another goal...on the model of the toxic derivative which is found to
point back to itself.

Whereas if we could just say that The Good is a simple, recognizable
thing, whether a piece of code or a symphony...if we could just trust
our common humanity...imagine there's no Heaven.

But this I know. There are people in this newsgroup driven mad by
reductionism, who have been told that if they follow an external,
alienated code, whether a bunch of cargo-cult programming maxims or
"Jesus", they will be "saved", and it is these people who are starting
the fights when they are confronted with their own alienation.
>
>
>
>
>
> > ;^)

spinoza1111

Feb 15, 2010, 6:12:50 PM

I think you're talking to a "wall, wondrous high, and covered with
serpent shapes" (the Wanderer): I don't think Heathfield wants people
to learn, at least as free, critical human beings. I think he wants
them to listen, and repeat rote maxims. If he's a teacher, as he seems
to want to be, he's the gym coach in the History Boys:

"Jesus didn't ask to be excused!"

"Actually, sir, he did."
>
> Tom

Chris M. Thomasson

Feb 15, 2010, 6:32:53 PM

"spinoza1111" <spino...@yahoo.com> wrote in message
news:d520a640-1606-407e...@s36g2000prf.googlegroups.com...

On Feb 15, 8:41 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> > "Stefan Ram" <r...@zedat.fu-berlin.de> wrote in message
> >
> > news:rand-2010...@ram.dialup.fu-berlin.de...
> >
> > > "Chris M. Thomasson" <n...@spam.invalid> writes:
> > >>Or perhaps if you just "know" that those functions are
> > >>very poorly implemented on your platform.
> >
> > > Often, one can indeed assume that "stdlib.h:rand()" is
> > > implemented in such a manner that it is not very random
> > > in the less significant bits - although this knowledge
> > > has nothing to do with the language C but only with the
> > > culture of C implementations.
> >
> > good point. Humm... I would hope that `strstr()' does not commonly use a
> > naive algorithm to search for substrings.
>
> I'd say that it better. It cannot use Boyer Moore or Knuth Morris
> Pratt IF they use tables, and I believe they do, since that implies
> (as far as I can tell) state (like malloc) or else an extra parameter
> in the call.

It fuc%ing better be more efficient than a naive algorithm!

:^o

[...]

>
> And note that "using strstr" has its own dangers. IT FINDS OVERLAPPING
> STRINGS. If you use it to construct a table of replace points you're
> gonna have an interesting bug-o-rama:
>
> replace("banana", "ana", "ono")
>
> IF you restart one position after the find point, and not at its end.

Well, I simply did not construct my `replace()' function to detect
overlapping strings. Therefore, if I pass your input to my implementation I
get:
_______________________________________________
src: banana
cmp: ana
xchg: ono
expect: bonona
result: bonona
_______________________________________________
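
The result above is what a left-to-right, non-overlapping rule produces. A minimal two-pass sketch of that rule follows. It is not the pastebin code (which, as described earlier, records matches in a list to avoid a second scan), and the function name is hypothetical:

```c
#include <stdlib.h>
#include <stddef.h>

static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Nonzero if the text at s begins with t. */
static int starts_with(const char *s, const char *t)
{
    while (*t != '\0' && *s == *t) {
        s++;
        t++;
    }
    return *t == '\0';
}

/* Replace every non-overlapping occurrence of cmp in src with xchg,
   scanning left to right and resuming after each match.
   Returns a malloc'd string, or NULL on failure or an empty cmp. */
char *replace_all(const char *src, const char *cmp, const char *xchg)
{
    size_t lc = str_len(cmp), lx = str_len(xchg), out_len = 0, i;
    const char *p;
    char *dst, *q;

    if (lc == 0)
        return NULL;         /* an empty cmp would never advance */

    /* Pass 1: compute the length of the result. */
    for (p = src; *p != '\0'; ) {
        if (starts_with(p, cmp)) {
            out_len += lx;
            p += lc;
        } else {
            out_len++;
            p++;
        }
    }

    dst = malloc(out_len + 1);
    if (dst == NULL)
        return NULL;

    /* Pass 2: build the result using the same matching rule. */
    for (p = src, q = dst; *p != '\0'; ) {
        if (starts_with(p, cmp)) {
            for (i = 0; i < lx; i++)
                *q++ = xchg[i];
            p += lc;
        } else {
            *q++ = *p++;
        }
    }
    *q = '\0';
    return dst;
}
```

Under this rule replace_all("banana", "ana", "12345") gives "b12345na", since the second "ana" shares a character with the first and is never seen.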

That result is fine with me. Humm... It might be interesting to see if I can
use `strstr()' to build a table that can handle overlapping strings. For the
`banana' example I would have two entries in the table:

1: offset 1
2: offset 3

After processing 1, the destination string is:

bono

After processing 2, the destination string is:

bonono

But that was easy because the exchange string is the exact same size as
comparand string. Things could get "dicey" if the exchange string were,
let's say, bigger '12345'. So, what should the final result look like in an
overlapping replace function for the following input:

replace("banana", "ana", "12345");

?

Would it be:

b1234512345

?

If so, would it be okay for replace("banana", "ana", "ono") to result in:

bonoono

?

We need to work out some rules here... ;^)

> Moral: don't let the library do your thinking for you.

How do you feel about a garbage collector doing all the thinking for you? I
think a GC is convenient, and I also feel the same way about certain library
functions. However, there are times when you do want to "re-invent"
something. For instance, I am okay with using various manual memory
management techniques to help relieve the pressure on a GC.

Chris M. Thomasson

Feb 15, 2010, 6:45:19 PM

"spinoza1111" <spino...@yahoo.com> wrote in message

news:dd0b52a1-6d35-4faf...@b7g2000pro.googlegroups.com...
[...]

> Update: Willem posted an exciting, if apparently buggy, solution in
> the thread where Peter mis-spelled "efficiency" in the name, and I
> just had a chance to look at it. It uses recursion in place of any
> data structure whatsoever. And in the same thread another poster seems
> to claim he did a working (if for me hard to read) solution on day
> one: I have asked him to use my test suite.
>
> And yes. A solution WITH BUGS can be more intelligent than a clean
> one. A solution that TAKES A LONG TIME can likewise be better than one
> that doesn't.
>
>
> I was also prepared to concede victory to Willem despite

Humm... I need to ask why would you feel the need to concede victory to
anybody? I thought this was not a contest. What am I missing?

> his one character identifiers because of the beauty of his idea: use
> recursion and not a data structure.

Can I pass it a bomb that can possibly blow the stack? I cannot seem to find
Willem's posting in the thread entitled "Efficency and the standard
library".

[...]

Seebs

Feb 15, 2010, 6:45:37 PM

On 2010-02-15, Chris M. Thomasson <n...@spam.invalid> wrote:
> Humm... I need to ask why would you feel the need to concede victory to
> anybody? I thought this was not a contest. What am I missing?

When has Nilges ever acted in a way that suggested that he did not view
everything as a contest with winners and losers? I think you're inventing
rationality not in evidence.

-s
--

Chris M. Thomasson

Feb 15, 2010, 7:13:54 PM

"spinoza1111" <spino...@yahoo.com> wrote in message

news:292491f2-c3ca-45ae...@x1g2000prb.googlegroups.com...

On Feb 15, 3:53 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> Here is my humble little entry that took me around a half an hour or so to
> create:
>
> http://clc.pastebin.com/f62504e4c
>
> If you want to avoid using `string.h' then you are going to have to
> implement
> the following functions:
> _________________________________________________

[...]
> _________________________________________________
>
[...]
> Any questions?

> Yeah, Chris. I have a question. Why did you call it an "entry" when
> this (to me, anyway) implied that it was a contest entry to the
> Spinoza challenge? Please don't be corrupted by the dishonesty and
> brutality of these newsgroups.

Ahh crap. I was thinking that it was sort of a "challenge" so to speak.
Anyway, I apologize for misrepresenting you.

> Sure, you do say that I have to implement FOUR (4) non-trivial library
> functions.
>
> But by saying "it took me a half hour" I read an implied, perhaps
> unintended slight at the approximately six hours I took ... where only
> in dysfunctional corporations is it a bad thing to take a little extra
> care and a little extra time in anticipation of difficulty.

I actually meant nothing by it. Quite frankly, now that I think about it, I
don't actually know why I posted how long it took me. I mean, who cares
right?

> A week late, in this thread, Seebach, Bacarisse et al. seem to be
> running into confusion trying to help the OP meet the original
> challenge. But I note nobody harassing them or the original poster,
> targeting them for abuse.

What challenge?

> In a sense, "I have only myself to blame" for this, because a year or
> so ago, I jumped all over Seebach for his attacks on Schildt. I feel I
> was right to do so, *et je ne regrette rien*. Nonetheless, I'm tired
> of his lies.
>
> If your "slight" was unintended, I apologize.

It was totally unintended Edward. I did not even think of insulting anybody
by posting how long it took me to flesh out that code.

> Without knowing as much about C as the regs, esp. postmodern C and the
> standards (I'll be the first to concede this), I've left them in the
> dust as regards my challenge.

Again, what challenge are you referring to?

> I've completed my solution, although I
> am refining it by adding a stress test and may post a proof elsethread
> proving that the code matches the algorithm statement I've posted
> elsethread, and in so doing I may find a bug.

Finding a bug is a damn good thing! Nothing wrong with that. I hate it when
somebody gets pissed off when I find a bug in some of their code. They don't
even thank you for pointing it out to them!

Bastards!

> This is because "knowing C" is different from "knowing how to program"
> and given the serious design flaws of C, there could be "knowing too
> much about C".
>
> But as far as I can see, no one's mastered the problem to the same
> extent, Seebach perhaps least of all, because he wasted too much time
> last week attacking me. He may redeem himself by helping the OP of
> this thread, but he's never even tried to write his own solution.

I just cannot really understand why you are trying to avoid `string.h' in
all cases. I mean, if you wanted to re-implement `strstr()', well, that's
fine. However, I don't see a real need to roll your own version of
`strlen()' or `memcpy()'. I mean, how can you do better than a good
implementation of the standard C library? An implementation of `memcpy()'
will most likely be using processor specific instructions that provide a
level of efficiency that cannot be reached with 100% pure portable C code.

Tom St Denis

Feb 15, 2010, 7:20:37 PM

Nobody, least of all anyone looking like me asked you for your
opinion.

Tom

Chris M. Thomasson

Feb 15, 2010, 7:22:52 PM

"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:4Dgen.97765$CM7....@newsfe04.iad...

> "spinoza1111" <spino...@yahoo.com> wrote in message
> news:dd0b52a1-6d35-4faf...@b7g2000pro.googlegroups.com...
> [...]
>
>> Update: Willem posted an exciting, if apparently buggy, solution in
>> the thread where Peter mis-spelled "efficiency" in the name, and I
>> just had a chance to look at it. It uses recursion in place of any
>> data structure whatsoever. And in the same thread another poster seems
>> to claim he did a working (if for me hard to read) solution on day
>> one: I have asked him to use my test suite.
>>
>> And yes. A solution WITH BUGS can be more intelligent than a clean
>> one. A solution that TAKES A LONG TIME can likewise be better than one
>> that doesn't.
>>
>>
>> I was also prepared to concede victory to Willem despite

[...]

>> his one character identifiers because of the beauty of his idea: use
>> recursion and not a data structure.
>
> Can I pass it a bomb that can possibly blow the stack? I cannot seem to
> find Willem's posting in the thread entitled "Efficency and the standard
> library".

Ahhh, I found his code in the "Warning to newbies" thread:

http://groups.google.com/group/comp.lang.c/msg/7c6bf8fae5249919

I don't think it can blow the stack because the recursion level is limited.
Also, I found a response to Willem from you:

http://groups.google.com/group/comp.lang.c/msg/5b2b278673c86951

in which you clearly state that this is indeed a "Spinoza challenge":
_____________________________________________________________
spinoza1111: "2. I was, and remain, very impressed by your solution and as I
created the source file you see below I was rooting for you: for had
it ran with my test suite, I would have handed the "Olympic gold
medal" that I've awarded myself in the Spinoza challenge to you."
_____________________________________________________________

Now you are confusing me here. First you say it's not a challenge, then you
seem to contradict yourself. Can you please clear this up for me? Thanks.

Seebs

Feb 15, 2010, 7:47:10 PM

On 2010-02-15, Chris M. Thomasson <n...@spam.invalid> wrote:
> "spinoza1111" <spino...@yahoo.com> wrote in message
> news:292491f2-c3ca-45ae...@x1g2000prb.googlegroups.com...

>> But by saying "it took me a half hour" I read an implied, perhaps
>> unintended slight at the approximately six hours I took ... where only
>> in dysfunctional corporations is it a bad thing to take a little extra
>> care and a little extra time in anticipation of difficulty.

> I actually meant nothing by it. Quite frankly, now that I think about it, I
> don't actually know why I posted how long it took me. I mean, who cares
> right?

Right.

That said, I *did* mean a slight at the alleged six hours Nilges took to
produce a much buggier implementation, because that suggests that his
methodology is bad -- and since he posted his originally specifically as
a criticism of my off-the-cuff example I posted in another thread, I
figured he was looking for comparisons.

In short, he posted a large rant about how unprofessional and unconsidered
and badly-designed my code was, which point he "proved" by demonstrating
that, in only ten times as many lines of code, with four or five times as
many bugs, he could nearly solve a sort of similar problem. Very persuasive.

>> A week late, in this thread, Seebach, Bacarisse et al. seem to be
>> running into confusion trying to help the OP meet the original
>> challenge. But I note nobody harassing them or the original poster,
>> targeting them for abuse.

> What challenge?

I have no clue.

This whole thing started because I posted a snippet of code I found
interesting to make for some vaguely topical stuff. Nilges responded with
angry rants about how bad my code was and a gigantic, buggy, "solution"
to the problem.

>> Without knowing as much about C as the regs, esp. postmodern C and the
>> standards (I'll be the first to concede this), I've left them in the
>> dust as regards my challenge.

> Again, what challenge are you referring to?

The one he thinks it's very offensive that you implied existed.

He's not exactly consistent. At any given time, he believes whatever he
feels makes him look best. This can result in him believing wildly
contradictory things over the course of a post.

> Finding a bug is a damn good thing! Nothing wrong with that. I hate it when
> somebody gets pissed off when I find a bug in some of their code. They don't
> even thank you for pointing it out to them!

Agreed!

> I just cannot really understand why you are trying to avoid `string.h' in
> all cases. I mean, if you wanted to re-implement `strstr()', well, that's
> fine. However, I don't see a real need to roll your own version of
> `strlen()' or `memcpy()'. I mean, how can you do better than a good
> implementation of the standard C library? An implementation of `memcpy()'
> will most likely be using processor specific instructions that provide a
> level of efficiency that cannot be reached with 100% pure portable C code.

I have no clue. I also don't see why he thinks my posts about his buggy
code have anything to do with the time or effort it takes to get this done.
When he proposed a more general problem than the one my original effort
solved, I posted a proposed solution, which took about ten minutes to write,
and in which one bug was found so far. (It went into an infinite loop if you
had it matching a zero-length substring.) I fixed that, and it's done. So
far as I can tell, it works for all inputs that don't exhaust memory or
size_t or something similar, and is otherwise unexceptional because the
task is fundamentally a very trivial one.

Willem

Feb 15, 2010, 8:04:07 PM

Seebs wrote:
) On 2010-02-15, Chris M. Thomasson <n...@spam.invalid> wrote:
)> I just cannot really understand why you are trying to avoid `string.h' in
)> all cases. I mean, if you wanted to re-implement `strstr()', well, that's
)> fine. However, I don't see a real need to roll your own version of
)> `strlen()' or `memcpy()'. I mean, how can you do better than a good
)> implementation of the standard C library? An implementation of `memcpy()'
)> will most likely be using processor specific instructions that provide a
)> level of efficiency that cannot be reached with 100% pure portable C code.
)
) I have no clue. I also don't see why he thinks my posts about his buggy
) code have anything to do with the time or effort it takes to get this done.
) When he proposed a more general problem than the one my original effort
) solved, I posted a proposed solution, which took about ten minutes to write,
) and in which one bug was found so far. (It went into an infinite loop if you
) had it matching a zero-length substring.) I fixed that, and it's done. So
) far as I can tell, it works for all inputs that don't exhaust memory or
) size_t or something similar, and is otherwise unexceptional because the
) task is fundamentally a very trivial one.

I wouldn't call matching a zero-length substring a bug, really. More of an
oversight in the specification. It's comparable to dividing by zero.

The reason I enjoyed coding it up without using string.h functions is
because it's an academic challenge/puzzle. Not a hard one, mind.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Seebs

Feb 15, 2010, 8:05:43 PM

On 2010-02-15, Willem <wil...@snail.stack.nl> wrote:
> I wouldn't call matching a zero-length substring a bug, really. More of an
> oversight in the specification. It's comparable to dividing by zero.

My code's behavior was a bug, though -- bad inputs shouldn't cause an
infinite loop.

> The reason I enjoyed coding it up without using string.h functions is
> because it's an academic challenge/puzzle. Not a hard one, mind.

It's interesting, and the recursive strategy is fascinating.

Tim Streater

Feb 15, 2010, 8:27:52 PM

On 15/02/2010 18:08, spinoza1111 wrote:

[snip]

> And yes. A solution WITH BUGS can be more intelligent than a clean
> one. A solution that TAKES A LONG TIME can likewise be better than one
> that doesn't. I was also prepared to concede victory to Willem despite
> his one character identifiers because of the beauty of his idea: use
> recursion and not a data structure.

Yes - VICTORY! - that's what we need. Doubtless you will be increasing
the chocolate ration from 30gm/week to 20gm/week in the near future too.

--
Tim

"That the freedom of speech and debates or proceedings in Parliament
ought not to be impeached or questioned in any court or place out of
Parliament"

Bill of Rights 1689

Chris M. Thomasson

Feb 15, 2010, 8:35:06 PM

"Seebs" <usenet...@seebs.net> wrote in message
news:slrnhnjagd.fm3...@guild.seebs.net...

> On 2010-02-15, Willem <wil...@snail.stack.nl> wrote:
>> I wouldn't call matching a zero-length substring a bug, really. More of
>> an
>> oversight in the specification. It's comparable to dividing by zero.
>
> My code's behavior was a bug, though -- bad inputs shouldn't cause an
> infinite loop.
>
>> The reason I enjoyed coding it up without using string.h functions is
>> because it's an academic challenge/puzzle. Not a hard one, mind.
>
> It's interesting, and the recursive strategy is fascinating.

Yes, I agree that the solution based on recursion is neat. However, any
recursive function tends to make me worry about blowing the stack. Perhaps I
worry too much!

;^)

Walter Banks

Feb 15, 2010, 9:49:40 PM

"Chris M. Thomasson" wrote:

> Yes, I agree that the solution based on recursion is neat. However, any
> recursive function tends to make me worry about blowing the stack. Perhaps I
> worry too much!

As much as I like recursive solutions for many things, including most of the
parsers I have written, there are some application areas where recursion is
avoided. Most of the
automotive bugs 10 or 15 years ago had a stack depth component and most
code is now written with predictable run time requirements.

Regards

Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

spinoza1111

Feb 16, 2010, 5:25:29 AM

On Feb 16, 2:32 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "spinoza1111" <spinoza1...@yahoo.com> wrote in message

Fine, since garbage collection is simpler than software design. We
have the right to think of software entities coming into existence and
dying without having to be midwives or funeral directors.

spinoza1111

Feb 16, 2010, 5:27:47 AM

On Feb 16, 2:45 am, Seebs <usenet-nos...@seebs.net> wrote:
> On 2010-02-15, Chris M. Thomasson <n...@spam.invalid> wrote:
>
> > Humm... I need to ask why would you feel the need to concede victory to
> > anybody? I thought this was not a contest. What am I missing?
>
> When has Nilges ever acted in a way that suggested that he did not view
> everything as a contest with winners and losers? I think you're inventing

It's better to have a contest with winners and losers than a rigged
game with bullies and victims, Seebach. You've forced me to
demonstrate that you're not competent to judge Schildt, but I would
much prefer not having to do this. If you'd behave yourself and start
a night school course in comp sci, then I won't start contests.

> rationality not in evidence.
>
> -s
> --

spinoza1111

Feb 16, 2010, 5:34:37 AM

On Feb 16, 3:13 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "spinoza1111" <spinoza1...@yahoo.com> wrote in message

>
> news:292491f2-c3ca-45ae...@x1g2000prb.googlegroups.com...
> On Feb 15, 3:53 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
>
>
>
>
>
> > Here is my humble little entry that took me around a half an hour or so to
> > create:
>
> >http://clc.pastebin.com/f62504e4c
>
> > If you want to avoid using `string.h' then you are going to have to
> > implement
> > the following functions:
> > _________________________________________________
> [...]
> > _________________________________________________
>
> [...]
> > Any questions?
> > Yeah, Chris. I have a question. Why did you call it an "entry" when
> > this (to me, anyway) implied that it was a contest entry to the
> > Spinoza challenge? Please don't be corrupted by the dishonesty and
> > brutality of these newsgroups.
>
> Ahh crap. I was thinking that it was sort of a "challenge" so to speak.
> Anyway, I apologize for misrepresenting you.
>

No prob.

> > Sure, you do say that I have to implement FOUR (4) non-trivial library
> > functions.
>
> > But by saying "it took me a half hour" I read an implied, perhaps
> > unintended slight at the approximately six hours I took ... where only
> > in dysfunctional corporations is it a bad thing to take a little extra
> > care and a little extra time in anticipation of difficulty.
>
> I actually meant nothing by it. Quite frankly, now that I think about it, I
> don't actually know why I posted how long it took me. I mean, who cares
> right?

Sorry for gettin' hot under the collar but as you can see there are a lot
of ill-intentioned people here.

>
> > A week late, in this thread, Seebach, Bacarisse et al. seem to be
> > running into confusion trying to help the OP meet the original
> > challenge. But I note nobody harassing them or the original poster,
> > targeting them for abuse.
>
> What challenge?
>

Write a replace() function without using string.h.

> > In a sense, "I have only myself to blame" for this, because a year or
> > so ago, I jumped all over Seebach for his attacks on Schildt. I feel I
> > was right to do so, *et je ne regrette rien*. Nonetheless, I'm tired
> > of his lies.
>
> > If your "slight" was unintended, I apologize.
>
> It was totally unintended Edward. I did not even think of insulting anybody
> by posting how long it took be to flesh out that code.
>
> > Without knowing as much about C as the regs, esp. postmodern C and the
> > standards (I'll be the first to concede this), I've left them in the
> > dust as regards my challenge.
>
> Again, what challenge are you referring to?
>
> > I've completed my solution, although I
> > am refining it by adding a stress test and may post a proof elsethread
> > proving that the code matches the algorithm statement I've posted
> > elsethread, and in so doing I may find a bug.
>
> Finding a bug is damn good thing! Nothing wrong with that. I hate it when
> somebody gets pissed off when I find a bug in some of their code. They don't
> even thank you for pointing it out to them!
>
> Bastards!

Swine!

Let's hear it for the good guys! YAY
Let's hear it for the bad guys! BOO

>
> > This is because "knowing C" is different from "knowing how to program"
> > and given the serious design flaws of C, there could be "knowing too
> > much about C".
>
> > But as far as I can see, no-ones mastered the problem to the same
> > extent, Seebach perhaps least of all, because he wasted too much time
> > last week attacking me. He may redeem himself by helping the OP of
> > this thread, but he's never even tried to write his own solution.
>
> I just cannot really understand why you are trying to avoid `string.h' in
> all cases. I mean, if you wanted to re-implement `strstr()', well, that's
> fine. However, I don't see a real need to roll your own version of
> `strlen()' or `memcpy()'. I mean, how can you do better than a good
> implementation of the standard C library? An implementation of `memcpy()'

Actually, in terms of efficiency one often can. Library writers are
men of flesh and blood, and women too.

> will most likely be using processor specific instructions that provide a
> level of efficiency that cannot be reached with 100% pure portable C code.

How is that possible? The compiler of the library code will emit
"processor specific" instructions, to be sure, but it will do the same
for me, or any man. And if the library code forces out assembler code,
then it will only work on one processor, or at best small n processor.

Without any examples in front of me at the time, I'd say that well-
written library routines are basically simple and correct. They have
to run on multiple processors, and can nowhere assume instructions
that execute in one or small n machine cycles.

spinoza1111

Feb 16, 2010, 5:35:19 AM

On Feb 16, 3:20 am, Tom St Denis <t...@iahu.ca> wrote:

Too bad. It was free. You don't have to pay me.
>
> Tom

spinoza1111

Feb 16, 2010, 5:35:54 AM

On Feb 16, 3:22 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "Chris M. Thomasson" <n...@spam.invalid> wrote in messagenews:4Dgen.97765$CM7....@newsfe04.iad...
>
>
>
>
>

> > "spinoza1111" <spinoza1...@yahoo.com> wrote in message

It is a challenge.

Richard Heathfield

Feb 16, 2010, 6:34:52 AM

spinoza1111 wrote:
> On Feb 16, 3:20 am, Tom St Denis <t...@iahu.ca> wrote:

<snip>

>> Nobody, least of all anyone looking like me asked you for your
>> opinion.
>
> Too bad. It was free.

And overpriced.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

spinoza1111

Feb 16, 2010, 7:46:02 AM

On Feb 16, 3:47 am, Seebs <usenet-nos...@seebs.net> wrote:
> On 2010-02-15, Chris M. Thomasson <n...@spam.invalid> wrote:
>

> > "spinoza1111" <spinoza1...@yahoo.com> wrote in message

> >news:292491f2-c3ca-45ae...@x1g2000prb.googlegroups.com...
> >> But by saying "it took me a half hour" I read an implied, perhaps
> >> unintended slight at the approximately six hours I took ... where only
> >> in dysfunctional corporations is it a bad thing to take a little extra
> >> care and a little extra time in anticipation of difficulty.
> > I actually meant nothing by it. Quite frankly, now that I think about it, I
> > don't actually know why I posted how long it took me. I mean, who cares
> > right?
>
> Right.
>
> That said, I *did* mean a slight at the alleged six hours Nilges took to
> produce a much buggier implementation, because that suggests that his
> methodology is bad -- and since he posted his originally specifically as
> a criticism of my off-the-cuff example I posted in another thread, I
> figured he was looking for comparisons.
>
> In short, he posted a large rant about how unprofessional and unconsidered
> and badly-designed my code was, which point he "proved" by demonstrating
> that, in only ten times as many lines of code, with four or five times as
> many bugs, he could nearly solve a sort of similar problem. Very persuasive.

You only knew about the bugs because I fixed them, Peter, whereas you
never fixed the bugs you reported in your first attempt. I haven't
been tracking your other code in detail because it doesn't interest
me, but I see in your posts nothing like diligence in testing. You're
so worried about taking "too long", having been in my view corrupted
by corporate life, that you created nothing like a systematic and
growing test suite (I did) nor did you systematically track and
document bugs and changes (I did).

In fact, most of the "six hours" I spent was in documentation and test
creation, not coding. How dare you even compare your buggy and
amateurish work?

If you look at the latest text of "my" replace, you'll see in the
Change Record that each bug save one is labeled with "bug:": one is
labeled "bug fix".

There were, in fact, only five bugs.

And, of course, you're counting in "lines of code" the test suite that
many other posters including Willem have found useful, but which you
probably dare not use.

>
> >> A week late, in this thread, Seebach, Bacarisse et al. seem to be
> >> running into confusion trying to help the OP meet the original
> >> challenge. But I note nobody harassing them or the original poster,
> >> targeting them for abuse.
> > What challenge?
>
> I have no clue.
>
> This whole thing started because I posted a snippet of code I found
> interesting to make for some vaguely topical stuff. Nilges responded with
> angry rants about how bad my code was and a gigantic, buggy, "solution"
> to the problem.

Five bugs. All fixed. YOU NEVER FIXED the %s bug.

>
> >> Without knowing as much about C as the regs, esp. postmodern C and the
> >> standards (I'll be the first to concede this), I've left them in the
> >> dust as regards my challenge.
> > Again, what challenge are you referring to?
>
> The one he thinks it's very offensive that you implied existed.
>
> He's not exactly consistent. At any given time, he believes whatever he
> feels makes him look best. This can result in him believing wildly
> contradictory things over the course of a post.

Your limitations are not my contradictions,
Your failures are not mine,
Your misery is your own history,
So, dear boy, don't whine.

>
> > Finding a bug is damn good thing! Nothing wrong with that. I hate it when
> > somebody gets pissed off when I find a bug in some of their code. They don't
> > even thank you for pointing it out to them!
>
> Agreed!

Asshole. I've bent over backward to acknowledge whatever contributions
you've made. You won't even respond to email.

>
> > I just cannot really understand why you are trying to avoid `string.h' in
> > all cases. I mean, if you wanted to re-implement `strstr()', well, that's
> > fine. However, I don't see a real need to roll your own version of
> > `strlen()' or `memcpy()'. I mean, how can you do better than a good
> > implementation of the standard C library? An implementation of `memcpy()'
> > will most likely be using processor specific instructions that provide a
> > level of efficiency that cannot be reached with 100% pure portable C code.
>
> I have no clue. I also don't see why he thinks my posts about his buggy
> code have anything to do with the time or effort it takes to get this done.
> When he proposed a more general problem than the one my original effort
> solved, I posted a proposed solution, which took about ten minutes to write,

It uses string.h. It doesn't meet the challenge. God damn, boy, you
are a liar, aren't you?

If I missed where you wrote a bug free replace() without using
string.h, post it

*HERE*

so we can evaluate it.

> and in which one bug was found so far. (It went into an infinite loop if you
> had it matching a zero-length substring.) I fixed that, and it's done. So
> far as I can tell, it works for all inputs that don't exhaust memory or
> size_t or something similar, and is otherwise unexceptional because the
> task is fundamentally a very trivial one.
>
> -s
> --

bartc

Feb 16, 2010, 11:50:29 AM

spinoza1111 wrote:
> On Feb 16, 3:13 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:

>> roll your own version of `strlen()' or `memcpy()'. I mean, how can
>> you do better than a good implementation of the standard C library?
>> An implementation of `memcpy()'
>
> Actually, in terms of efficiency one often can. Library writers are
> men of flesh and blood, and women too.

But different men for different implementations. When you write your
strlen() equivalent, you are only going to write one, not dozens. And if you
stick to portable C, it might not be faster.

>> will most likely be using processor specific instructions that
>> provide a level of efficiency that cannot be reached with 100% pure
>> portable C code.
>
> How is that possible? The compiler of the library code will emit
> "processor specific" instructions, to be sure, but it will do the same
> for me, or any man. And if the library code forces out assembler code,
> then it will only work on one processor, or at best small n processor.

I think standard library routines can be written in a language other than C,
or some mix. For example, hand-written assembly.

And the library you use comes with the processor; switch processors, and
there could be a different library routine, optimised a different way (or
just optimised down by the compiler to a couple of inline machine
instructions).

--
Bartc

spinoza1111

Feb 16, 2010, 12:22:59 PM

On Feb 16, 7:50 pm, "bartc" <ba...@freeuk.com> wrote:
> spinoza1111 wrote:

> > On Feb 16, 3:13 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> >> roll your own version of `strlen()' or `memcpy()'. I mean, how can
> >> you do better than a good implementation of the standard C library?
> >> An implementation of `memcpy()'
>
> > Actually, in terms of efficiency one often can. Library writers are
> > men of flesh and blood, and women too.
>
> But different men for different implementations. When you write your
> strlen() equivalent, you are only going to write one, not dozens. And if you
> stick to portable C, it might not be faster.
>
> >> will most likely be using processor specific instructions that
> >> provide a level of efficiency that cannot be reached with 100% pure
> >> portable C code.
>
> > How is that possible? The compiler of the library code will emit
> > "processor specific" instructions, to be sure, but it will do the same
> > for me, or any man. And if the library code forces out assembler code,
> > then it will only work on one processor, or at best small n processor.
>
> I think standard library routines can be written in a language other than C,
> or some mix. For example, hand-written assembly.

Correct. Wonder how many library routines are written in assembler.
Don't know.

>
> And the library you use comes with the processor; switch processors, and
> there could be a different library routine, optimised a different way (or
> just optimised down by the compiler to a couple of inline machine
> instructions).

Correct o mundo
>
> --
> Bartc

Nick Keighley

Feb 16, 2010, 2:08:14 PM

You've been spoilt by the "portable assembler" nature of C. C is
unusual in that much of its standard library can be written in C.
Since Windows and most Unixes are also written in C, calls into the OS
are easy as well.

Many other languages will have the low level parts of their libraries
written in C.

fedora

Feb 16, 2010, 8:24:18 PM

Hi all!

Have finished my program for spinoza's challenge. Rewrote everything, and
this time I made each statement as simple as possible so that I can
understand the program. The allSubstr procedure can search for overlapping
substrings too, like spinoza wanted, but the replace routine doesn't use that
since I can't think how to replace overlapping ones!

Haven't used any function from string.h! It works for the strings I could think
of, but maybe it's got bugs since I'm just a beginner...

How's mine, spinoza1111? :)

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

size_t strLength(char *cstr) {
    size_t index = 0;

    while (cstr[index] != '\0') ++index;
    return index;
}

char *strFirstCh(char *str, char ch, size_t lstr) {
    char *chpos = 0;
    size_t current;

    for (current = 0; current < lstr; current++) {
        if (str[current] == ch) {
            chpos = str + current;
            break;
        }
    }
    return chpos;
}

int strComp(char *s, char *t, size_t len) {
    int ret = 0;
    size_t index;

    for (index = 0; index < len; index++) {
        if (s[index] != t[index]) {
            ret = 1;
            break;
        }
    }
    return ret;
}

char *strSubstr(
        char *str,
        char *sub,
        size_t lstr,
        size_t lsub) {
    char *substr = 0;
    char *anchor = str;
    size_t remaining_len = (lstr - lsub) + 1;

    assert(str && sub && lstr && lsub && lstr >= lsub);
    while (remaining_len > 0 && anchor) {
        if (anchor = strFirstCh(anchor, *sub, remaining_len)) {
            if (strComp(anchor, sub, lsub) == 0) {
                substr = anchor;
                break;
            }
            anchor++;
            remaining_len--;
        }
    }
    return substr;
}

unsigned allSubstr(
        char *str,
        char *sub,
        size_t lstr,
        size_t lsub,
        char ***ps,
        int overlap) {
    unsigned occurs = 0;
    unsigned ctr;
    char *orig_str = str;
    size_t orig_lstr = lstr;
    size_t step;

    if (overlap == 1)
        step = 1;
    else
        step = lsub;

    while (lstr >= lsub) {
        str = strSubstr(str, sub, lstr, lsub);
        if (str == 0)
            break;
        occurs++;
        str += step;
        lstr = (orig_str + orig_lstr) - str;
    }

    if (occurs > 0 && ps) {
        str = orig_str;
        lstr = orig_lstr;
        *ps = malloc(occurs * sizeof **ps);
        if (*ps) {
            for (ctr = 0; ctr < occurs; ctr++) {
                ps[0][ctr] = str = strSubstr(str, sub, lstr, lsub);
                str += step;
                lstr = (orig_str + orig_lstr) - str;
            }
        }
    }
    return occurs;
}

char *replace(char *str, char *substr, char *rep) {
    char *new = 0;
    size_t lstr, lsubstr, lrep, lnew, strc, newc, repc, replaced;
    unsigned replacements;
    char **subpos;

    assert(str && substr && rep);
    lstr = strLength(str);
    lsubstr = strLength(substr);
    lrep = strLength(rep);
    if (lstr == 0 || lsubstr == 0 || lsubstr > lstr)
        return 0;
    replacements = allSubstr(str, substr, lstr, lsubstr, &subpos, 0);
    if (replacements > 0) {
        lnew = (lstr - (replacements * lsubstr)) + (replacements * lrep);
        new = malloc(lnew + 1);
        if (!new)
            return 0;
        strc = newc = replaced = 0;
        while (strc <= lstr) {
            if (str + strc == subpos[replaced]) {
                for (repc = 0; repc < lrep; repc++) {
                    new[newc] = rep[repc];
                    newc++;
                }
                replaced++;
                strc += lsubstr;
            }
            else {
                new[newc] = str[strc];
                strc++;
                newc++;
            }
        }
        free(subpos);
    }
    else {
        new = malloc(lstr + 1);
        if (!new)
            return 0;
        for (strc = 0; strc <= lstr; strc++)
            new[strc] = str[strc];
    }
    return new;
}

int main(int argc, char **argv) {
    char *newstr;

    assert(argc == 4);
    newstr = replace(argv[1], argv[2], argv[3]);
    if (newstr)
        printf("%s\n", newstr);
    else
        printf("replace() -> null\n");
    free(newstr);
    return 0;
}

thanks a lot all who helped!

bartc

Feb 16, 2010, 9:27:01 PM

"fedora" <no_...@invalid.invalid> wrote in message
news:hleutd$utr$1...@news.eternal-september.org...

> Hi all!
>
> Have finished my program for spinoza's challenge. rewrote everything and
> this time i made each statement as simple as posible, so that i can
> understand the program. The allSubstr procedure can search for over
> lapping
> sub-string too like spinoza wanted, but the replace routine doesnt use
> that
> since i cant think how to replace over lapping ones!
>
> haven't used any function from string.h! it works for strings i could
> think
> of but maybe it got bugs since i'm just a beginner...

Seems to be solid enough.

Except, if it can't find a substring (I think for substrings longer than the
text), sometimes it returns the original text unchanged, and sometimes it
returns NULL, eg.

"a", "ab", "" returns NULL, but:

"ab", "x", "" returns "ab"

--
Bartc

fedora

Feb 16, 2010, 10:10:42 PM

Oops... mem leak here! I return without giving back the subpos array. That
should be :-

if (!new) {
    free(subpos);
    return 0;
}

fedora

Feb 16, 2010, 10:20:04 PM

bartc wrote:

thanks bartc!

The reason the first one returns a null pointer is that I assume it's
incorrect to call the routine with a substring bigger than the target string.
Maybe I should've put it into the assert, but I thought it wasn't serious
enough to crash, so I return a null ptr.

In the second case it returns the original string (not exactly, but a copy of
the original, because the main routine free()s replace's returned pointer and
we can't free() argv[] strings!!) because the sub-string doesn't occur and
there's nothing to replace. replace("ab", "ab", "") will give an empty string.
Hope all cases are logical and consistent!

For right to left, we can simply reverse the target string before sending it to
replace(), but I didn't put in the functionality for that. The replace
procedure is very untidy to me! It could've been made much more neat and
efficient if I'd written strcpy() too, but that's not yet done!

Anyways, seeing Willem's recursive program makes me ashamed I'm such a stupid
beginner!!

thanks all

Phil Carmody

Feb 16, 2010, 10:35:20 PM

"Chris M. Thomasson" <n...@spam.invalid> writes:

> "Chris M. Thomasson" <n...@spam.invalid> wrote in message
> news:4Dgen.97765$CM7....@newsfe04.iad...
>> "spinoza1111" <spino...@yahoo.com> wrote in message

> Now you are confusing me here. First you say it's not a challenge,
> then you seem to contradict yourself. Can you please clear this up for
> me? Thanks.

Do not feed the troll.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Ben Bacarisse

Feb 16, 2010, 10:42:03 PM

fedora <no_...@invalid.invalid> writes:

> Have finished my program for spinoza's challenge. rewrote everything and
> this time i made each statement as simple as posible, so that i can
> understand the program. The allSubstr procedure can search for over lapping
> sub-string too like spinoza wanted, but the replace routine doesnt use that
> since i cant think how to replace over lapping ones!
>
> haven't used any function from string.h! it works for strings i could think
> of but maybe it got bugs since i'm just a beginner...

The result is good, but I am not sure you were right to accept the
peculiar notion of not using standard string functions. If you felt
you had to, why not use standard functions but then plug in your own
versions? That way you learn about the standard library and get to
write the character-fiddling functions that can be useful learning
exercises.

I'll make a few detailed comments (one is a bug), but on the "big
picture" I don't see why you mix size_t and unsigned all over the
place. I'd stick to size_t.

Finally, I think it is odd to return a null result when the substring
is too long to match. I'd treat it like any other substring that does
not match.

<snip>

> #include <stdlib.h>
> #include <stdio.h>
> #include <assert.h>
>
> size_t strLength(char *cstr) {
> size_t index = 0;
>
> while (cstr[index] != '\0') ++index;
> return index;
> }
>
> char *strFirstCh(char *str, char ch, size_t lstr) {
> char *chpos = 0;
> size_t current;
>
> for (current = 0; current < lstr; current++) {
> if (str[current] == ch) {
> chpos = str + current;
> break;
> }
> }
> return chpos;
> }
>
> int strComp(char *s, char *t, size_t len) {
> int ret = 0;
> size_t index;
>
> for (index = 0; index < len; index++) {
> if (s[index] != t[index]) {
> ret = 1;
> break;
> }
> }
> return ret;

Small point: ret is the same as index != len. Can you see why? Many
C programmers would just return index != len here. Also, I'd reverse
the sense of the returned value and call the function strEqual.
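
As a rough sketch of that suggestion (the name strEqual comes from the comment above; the const qualifiers are an added assumption, not part of the posted code):

```c
#include <stddef.h>

/* Hypothetical rename of strComp with the sense of the result reversed:
   returns 1 when the first len characters of s and t are equal,
   0 otherwise. */
int strEqual(const char *s, const char *t, size_t len) {
    size_t index;

    for (index = 0; index < len; index++) {
        if (s[index] != t[index])
            break;
    }
    /* index only reaches len when no mismatch stopped the loop early. */
    return index == len;
}
```

Callers would then write `if (strEqual(anchor, sub, lsub))` instead of comparing the result against 0.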

You have a bug here. replaced can become equal to the size of the
subpos array and, hence, you index outside of it.

> for (repc = 0; repc < lrep; repc++) {
> new[newc] = rep[repc];
> newc++;
> }
> replaced++;
> strc += lsubstr;
> }
> else {
> new[newc] = str[strc];
> strc++;
> newc++;
> }
> }
> free(subpos);
> }
> else {
> new = malloc(lstr + 1);
> if (!new)
> return 0;
> for (strc = 0; strc <= lstr; strc++)
> new[strc] = str[strc];
> }
> return new;
> }
>
> int main(int argc, char **argv) {
> char *newstr;
>
> assert(argc == 4);

I don't think this is a good use of assert. It is almost always
wrong to use it to check user input. I'd just use an "if".

> newstr = replace(argv[1], argv[2], argv[3]);
> if (newstr)
> printf("%s\n", newstr);
> else
> printf("replace() -> null\n");
> free(newstr);
> return 0;
> }
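
The smallest fix for the indexing bug in the posted code would be to test `replaced < replacements` before reading `subpos[replaced]`. The sketch below takes a different route and is only one possible shape, not the poster's design: it drops the position array altogether, and it treats an empty or too-long substring like any other non-match, so NULL means allocation failure only. The name replaceChecked and the two-pass layout are assumptions made for illustration.

```c
#include <stdlib.h>

/* Length of a NUL-terminated string, without string.h. */
static size_t strLength(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

/* 1 if the first len characters of s and t match, else 0. */
static int strEqual(const char *s, const char *t, size_t len) {
    size_t i;
    for (i = 0; i < len && s[i] == t[i]; i++)
        ;
    return i == len;
}

/* Hypothetical variant of replace(): returns a malloc'd copy of str with
   each non-overlapping occurrence of sub replaced by rep.
   Returns NULL only when malloc fails. */
char *replaceChecked(const char *str, const char *sub, const char *rep) {
    size_t lstr = strLength(str), lsub = strLength(sub), lrep = strLength(rep);
    size_t i, j, count = 0, out = 0;
    char *result;

    /* Pass 1: count matches. An empty or too-long sub never matches. */
    if (lsub > 0 && lsub <= lstr) {
        for (i = 0; i + lsub <= lstr; ) {
            if (strEqual(str + i, sub, lsub)) { count++; i += lsub; }
            else i++;
        }
    }

    /* count * lsub <= lstr, so the subtraction cannot wrap around. */
    result = malloc(lstr - count * lsub + count * lrep + 1);
    if (!result)
        return NULL;

    /* Pass 2: copy, checking the bound before every comparison. */
    for (i = 0; i < lstr; ) {
        if (lsub > 0 && i + lsub <= lstr && strEqual(str + i, sub, lsub)) {
            for (j = 0; j < lrep; j++)
                result[out++] = rep[j];
            i += lsub;
        } else {
            result[out++] = str[i++];
        }
    }
    result[out] = '\0';
    return result;
}
```

With this shape, both replaceChecked("a", "ab", "") and replaceChecked("ab", "x", "") hand back a copy of the original text, and the caller frees the result in every case.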

--
Ben.

Chris M. Thomasson

Feb 16, 2010, 10:58:55 PM

"Walter Banks" <wal...@bytecraft.com> wrote in message
news:4B79C174...@bytecraft.com...

>
>
> "Chris M. Thomasson" wrote:
>
>> Yes, I agree that the solution based on recursion is neat. However, any
>> recursive function tends to make me worry about blowing the stack.
>> Perhaps I
>> worry to much!
>
> As much as I like recursive solutions for many things including most of
> the
> parsers I have written.
>
> There are some application areas where recursion is avoided. Most of the
> automotive bugs 10 or 15 years ago had a stack depth component and most
> code is now written with predictable run time requirements.

I do not "necessarily" want to restrict "potential" user input in order to
get around the limitations of a recursive function in an environment that
has a rather small per-thread stack size. If you can create an iterative
solution, then I say go ahead and do it. This may work out when you realize
that the limits you set on a recursive function are too great to run on a
system that has lower per-task/thread stack size.

Chris M. Thomasson

Feb 16, 2010, 11:25:16 PM

"spinoza1111" <spino...@yahoo.com> wrote in message

news:f12dcf90-987d-4b72...@z10g2000prh.googlegroups.com...

On Feb 16, 7:50 pm, "bartc" <ba...@freeuk.com> wrote:
> > spinoza1111 wrote:
> > > On Feb 16, 3:13 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> > >> roll your own version of `strlen()' or `memcpy()'. I mean, how can
> > >> you do better than a good implementation of the standard C library?
> > >> An implementation of `memcpy()'
> >
> > > Actually, in terms of efficiency one often can. Library writers are
> > > men of flesh and blood, and women too.
> >
> > But different men for different implementations. When you write your
> > strlen() equivalent, you are only going to write one, not dozens. And if
> > you
> > stick to portable C, it might not be faster.
> >
> > >> will most likely be using processor specific instructions that
> > >> provide a level of efficiency that cannot be reached with 100% pure
> > >> portable C code.
> >
> > > How is that possible? The compiler of the library code will emit
> > > "processor specific" instructions, to be sure, but it will do the same
> > > for me, or any man. And if the library code forces out assembler code,
> > > then it will only work on one processor, or at best small n processor.
> >
> > I think standard library routines can be written in a language other
> > than C,
> > or some mix. For example, hand-written assembly.

> Correct. Wonder how many library routines are written in assembler.
> Don't know.

Well, a standard library implementation in the form of a deferment to native
OS library calls can be written in 100% assembly language. Perhaps a
standard C header can defer to highly efficient native OS provided
primitives.

Chris M. Thomasson

Feb 16, 2010, 11:30:02 PM

"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:tPFen.186968$Fm7.1...@newsfe16.iad...
[...]

> Well, a standard library implementation in the form of a deferment to
> native OS library calls can be written in 100% assembly language.

Ummm... The above should probably read as:
__________________________________________________________
Well, a standard library implementation can defer to native OS provided
library calls that happen to be implemented in 100% assembly language...
__________________________________________________________

Is that any better?

;^o

spinoza1111

Feb 17, 2010, 3:37:15 AM

On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:

> "Chris M. Thomasson" <n...@spam.invalid> writes:
>
> > "Chris M. Thomasson" <n...@spam.invalid> wrote in message
> >news:4Dgen.97765$CM7....@newsfe04.iad...

> >> "spinoza1111" <spinoza1...@yahoo.com> wrote in message

> > Now you are confusing me here. First you say it's not a challenge,
> > then you seem to contradict yourself. Can you please clear this up for
> > me? Thanks.
>
> Do not feed the troll.

I am not a "troll", and "troll" is a Nordic racist word, referring as
it does to peoples pushed out of Western Europe by invaders after the
fall of Rome. I have been discussing and submitting code written in C.

A "troll" is one who posts insincerely in order to get a rise out of
people. I do not do so.

Richard Heathfield is by no means a friend of mine, yet he has gone on
record to say that I am not a "troll".

Please keep your comments on-topic if you cannot keep them civil.

spinoza1111

Feb 17, 2010, 3:49:39 AM

Recursive solutions are in the thread where Seebs misspelled
"efficiency" in the header, in C by Willem and in C Sharp by myself.

Each solution will put something on the stack for each occurrence of
the target. But if you don't recurse, then you need to create one of
my segments in my linked list. The stack frame is a couple of
addresses as is the segment.

Therefore, it seems to me that there's a minimum storage complexity to
the problem and not just to these two solutions. In a language/
computer where strings could grow magically this would not be the
case, but this would be the existence as if by magic of hardware that
could hey presto realloc strings "under the covers".

Here, the storage complexity is real if hidden. It is the need to keep
finding and reallocing small strings, or getting a big string and
discarding what you don't need.

And if you implement replace() in assembler, you still have the same
storage complexity.

The most storage efficient solution for C's format for strings would
examine the target and replacement strings. If the target length is
greater than or equal to that of the replacement, do the
transformation *in situ* by shifting bytes left. If the target length
is less, you must reallocate, perhaps more than once.
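
A minimal sketch of the in-place case, assuming the caller owns a writable buffer and the replacement is no longer than the target. The name replaceInPlace and its return convention are hypothetical, and the helpers are hand-rolled to stay within the no-string.h rule:

```c
#include <stddef.h>

/* Length of a NUL-terminated string, without string.h. */
static size_t strLength(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

/* 1 if the first len characters of s and t match, else 0. */
static int strEqual(const char *s, const char *t, size_t len) {
    size_t i;
    for (i = 0; i < len && s[i] == t[i]; i++)
        ;
    return i == len;
}

/* Hypothetical helper: rewrites str in its own storage.
   Returns 0 on success, -1 if sub is empty or rep is longer than sub
   (the case that would need a larger buffer). rep must not point
   into str. */
int replaceInPlace(char *str, const char *sub, const char *rep) {
    size_t lstr = strLength(str), lsub = strLength(sub), lrep = strLength(rep);
    size_t rd = 0, wr = 0, j;

    if (lsub == 0 || lrep > lsub)
        return -1;

    /* wr never passes rd, so unread characters are never overwritten. */
    while (rd < lstr) {
        if (rd + lsub <= lstr && strEqual(str + rd, sub, lsub)) {
            for (j = 0; j < lrep; j++)
                str[wr++] = rep[j];
            rd += lsub;
        } else {
            str[wr++] = str[rd++];
        }
    }
    str[wr] = '\0';
    return 0;
}
```

Because the write index trails the read index, a single left-to-right pass does all the shifting and no extra storage is needed; the longer-replacement case still has to allocate.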

Which is why C programmers need to understand that C doesn't support
strings out of the box. Instead, it provides a silly set of solutions
based on the absurd idea that it makes sense to terminate a string
with a Nul.

C without strings could be a sensible language for low-level
programming of toys. C with strings needs to use a standardized, open
source modern string.H that represents strings as linked lists.

Chris M. Thomasson

Feb 17, 2010, 5:06:33 AM

"spinoza1111" <spino...@yahoo.com> wrote in message

news:6f22a1d8-6b97-4821...@t17g2000prg.googlegroups.com...
[...]

> > > Moral: don't let the library do your thinking for you.
> >
> > How do you feel about a garbage collector doing all the thinking for
> > you? I

> Fine, since garbage collection is simpler than software design. We
> have the right to think of software entities coming into existence and
> dying without having to be midwifes or funeral directors.

What about forgetting to set a reference to NULL? Sometimes, you can
unnecessarily extend the lifetimes of objects in a pure GC system if you
forget to set certain object references to NULL. IMHO, a GC does not mean
you have to check your brain at the door.

spinoza1111

Feb 17, 2010, 6:46:56 AM

On Feb 17, 1:06 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "spinoza1111" <spinoza1...@yahoo.com> wrote in message

True. In my book "Build Your Own .Net Language and Compiler" I have a
methodology for stateless objects which requires the user to dispose
the object calling a dispose() method. This allows the object to set
all its references to objects from the heap to null.

Precisely because you still need a brain when you have a garbage
collector, you need the garbage collector: there's no reason to waste
fine brains on manual memory management when those brains could be
solving more important problems.

It is true, however, that you might run out of Fun Stuff to Think
About just as airline pilots in modern high tech cockpits traveling
across the Pacific might get bored. Too bad. I need the pilot to be
bored. I don't want him to have fun or be challenged.

Branimir Maksimovic

Feb 17, 2010, 6:55:53 AM

spinoza1111 wrote:
>
> C without strings could be a sensible language for low-level
> programming of toys. C with strings needs to use a standardized, open
> source modern string.H that represents strings as linked lists.

I think that you could write a C program that will beat the current C++ program
in this benchmark.
All in all it seems that g++ is 5 times faster than gcc, and also the
Java server is slightly faster than gcc, according to this benchmark.
I guess they don't test different algorithms, rather the same algorithm
in different language implementations, but I could be wrong.
According to this, C is slower than Java, both consume more memory
than C++, and C is 5 times slower for the same algorithm?

http://shootout.alioth.debian.org/u64/performance.php?test=knucleotide&sort=fullcpu

×    Program Source Code   CPU secs  Elapsed secs  Memory KB  Code B  ≈ CPU Load
1.0  C++ GNU g++ #6           11.22         11.23    142,288    3415  0% 0% 0% 100%
2.0  C++ GNU g++              22.30         22.30    135,788    2106  0% 0% 0% 100%
3.1  Ada 2005 GNAT #2         34.30         34.36    256,660    4865  0% 0% 0% 100%
4.2  Java 6 -server #2        47.33         47.41    490,660    1602  0% 0% 0% 100%
5.0  Java 6 -server           56.15         56.29  1,295,096    1330  0% 0% 0% 100%
5.0  C GNU gcc #6             56.21         56.25    180,540    2439  0% 0% 0% 100%

For example, on my machine the classic C++ program that I would write
takes:
real 0m58.659s
user 0m58.390s
sys 0m0.270s
I need 10 seconds just for getline from std::cin into a string and then
packing it into an array of chars, let alone the rest of the processing.
But the benchmark C++ program from this site takes this time on my
machine:
real 0m4.306s
user 0m7.880s
sys 0m0.120s

Whoa!

Greets

Richard Heathfield

Feb 17, 2010, 9:29:50 AM

spinoza1111 wrote:
> On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
> wrote:

<snip>

>> Do not feed the troll.
>

<snip>

> Richard Heathfield is by no means a friend of mine, yet he has gone on
> record to say that I am not a "troll".

That is true. I have also gone on record as saying that you're an idiot.

spinoza1111

Feb 17, 2010, 9:59:32 AM

On Feb 17, 5:29 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
> spinoza1111 wrote:

> > On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
> > wrote:
> <snip>
> >> Do not feed the troll.
>
> <snip>
>
> > Richard Heathfield is by no means a friend of mine, yet he has gone on
> > record to say that I am not a "troll".
>
> That is true. I have also gone on record as saying that you're an idiot.

That is correct. And I have gone on record as saying you're a fool.
So, we agree that I am not a troll.

Phil Carmody

Feb 17, 2010, 10:20:08 AM

Richard Heathfield <r...@see.sig.invalid> writes:
> spinoza1111 wrote:
>> On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
>> wrote:
> <snip>
>>> Do not feed the troll.
>>
> <snip>
>
>> Richard Heathfield is by no means a friend of mine, yet he has gone on
>> record to say that I am not a "troll".
>
> That is true. I have also gone on record as saying that you're an idiot.

He does it for the strokes. His motives may be different, and the
strokes he seeks may be different (though not vastly different from
some of the sci.math cranks I've encountered), but he's still doing
it to elicit responses. Troll, crank, idiot - you can tick all three
with him.

And Chris' post did nothing but invite him to spew more inane crap
onto c.l.c. Why would anyone want that?

Richard Heathfield

Feb 17, 2010, 10:30:17 AM

spinoza1111 wrote:
> On Feb 17, 5:29 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> spinoza1111 wrote:
>>> On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
>>> wrote:
>> <snip>
>>>> Do not feed the troll.
>> <snip>
>>
>>> Richard Heathfield is by no means a friend of mine, yet he has gone on
>>> record to say that I am not a "troll".
>> That is true. I have also gone on record as saying that you're an idiot.
>
> That is correct. And I have gone on record as saying you're a fool.
> So, we agree that I am not a troll.

As usual, my point has gone way over your head.

You cite me as someone who says you are not a troll, in support of an
argument that you are not a troll. In so doing, you are appealing to
people's trust in my good judgement, whether or not you realise it.

For those people who do trust my good judgement, the argument is a
powerful one, but the argument that you are an idiot is equally
powerful. And for those people who do not trust my good judgement, the
argument carries no conviction.

As for your opinion of me, I ascribe it no value whatsoever, so it has
no bearing on this discussion.

ObTopic: finding substrings is easy with strstr(). If you need to find a
substring, use strstr() until and unless profiling demonstrates that
it's a significant bottleneck.
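
A minimal sketch of the strstr() approach for the original question (finding every place a substring occurs). The name find_all and its parameters are assumptions made for this illustration, not code from the thread, and the overlap flag only chooses where the next search starts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the occurrences of `sub` in `str` by calling strstr() repeatedly.
 * If `overlap` is nonzero the search resumes one character after each
 * match; otherwise it resumes just past the whole match.
 * Up to `max` match positions are stored in `pos` (which may be NULL). */
size_t find_all(const char *str, const char *sub, int overlap,
                const char **pos, size_t max)
{
    size_t n = 0;
    size_t lsub;
    const char *p = str;

    assert(str && sub);
    lsub = strlen(sub);
    if (lsub == 0)
        return 0;               /* an empty pattern is treated as "no match" */

    while ((p = strstr(p, sub)) != NULL) {
        if (pos && n < max)
            pos[n] = p;         /* remember where this match starts */
        n++;
        p += overlap ? 1 : lsub;
    }
    return n;
}
```

With the test strings from the first post, find_all("heee", "e", 0, NULL, 0) counts three matches and find_all("hhhh", "h", 0, NULL, 0) counts four.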

fedora

Feb 17, 2010, 10:17:49 AM

Ben Bacarisse wrote:

> fedora <no_...@invalid.invalid> writes:
>
>> Have finished my program for spinoza's challenge. rewrote everything and
>> this time i made each statement as simple as posible, so that i can
>> understand the program. The allSubstr procedure can search for over
>> lapping sub-string too like spinoza wanted, but the replace routine
>> doesnt use that since i cant think how to replace over lapping ones!
>>
>> haven't used any function from string.h! it works for strings i could
>> think of but maybe it got bugs since i'm just a beginner...
>
> The result is good, but I am not sure you were right to accept the
> peculiar notion of not using standard string functions. If you felt
> you had to, why not use standard functions but then plug-in you own
> versions? That way you learn about the standard library and get to
> write the character-fiddling functions that can be useful learning
> exercises.

Thanks Ben! I didn't use stdlib functions because i wanted to see how
easy/difficult it would be to write my own. also spinoza didn't accept
programs that called functions in string.h.

About plugging in my own versions, do you mean writing routines with the
same name so that at linking the linker finds my lib first and plug it in?
but i read somewhere that using ansi c's namespace and relying on linker is
all undefined and gcc can replace calls to stdlib functions with inline code
so bypassing my code too... but i'll try it.

> I'll make a few detailed comments (one is a bug), but on the "big
> picture" I don't see why you mix size_t and unsigned all over the
> place. I'd stick to size_t.

ok.

> Finally, I think it is odd to return a null result when the substring
> is too long to match. I'd treat is like any other substring that does
> not match.

and just return copy of original target str? Okay, i'll change code to do
that but right now i'm having bigger problems.

okay. that's better than modelling after stdlib strcmp. my reasoning was
since there's only one case where two strings can be identical but many
cases where they can differ i'd use the one bool value for the first case
(false) and return different true values for un-equal cases. but i think
i've thought of it wrongly. boolean false and true are both unique values
and not like C's definition of true.

Thanks for spotting! i'd never have seen that. I replaced that line by

if (replaced < replacements && str + strc == subpos[replaced]) {

i made no other changes and compiled and ran... but strange errors are
occurring. below i'm pasting session from gdb... i really appreciate if
anyone can tell me why it's happening. what's irritating is there is no
pattern for the seg faults. sometimes it happens, sometimes not.

compiled with gcc -Wall -Wextra -std=c99 -pedantic -o replace replace.c -ggdb3

# gdb ./replace
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...

(gdb) run "sdakfhaskfhaskdfhaskdfhaksdfhksdfhajksdfhkjsdfhsdfasd" "sda" "+"
Starting program: /home/fedora/c/replace
"sdakfhaskfhaskdfhaskdfhaksdfhksdfhajksdfhkjsdfhsdfasd" "sda" "+"

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400648 in strFirstCh (str=0x7fff1da36feb "rc/c/replace",
ch=115 's', lstr=18446744073709551546) at replace.c:17
17 if (str[current] == ch) {

how can str be "rc/c/replace"?? It should be some part of the "sdak..."
string as i see...
ch is with right value. but lstr is wrong. how did it get that value? i cant
see any thing in my code that is wrong but i'm not intel.

big thanks to any one who can say why str is "rc/c/replace" and lstr is
wrong...

am thinking if string programming in C is naturally so difficult or i'm just
stupid:(

fedora

Feb 17, 2010, 11:19:54 AM

Hi all!

Here's more session output from gdb... i cant figure out why it crashes for
random strings but not for others...

(gdb) run "askdjfhakdfhakdfhasdjkhaksdfhaklsdjfhlakfhlakfhaklfh" "ask" "+"
Starting program: /home/fedora/c/replace
"askdjfhakdfhakdfhasdjkhaksdfhaklsdjfhlakfhlakfhaklfh" "ask" "+"

Program received signal SIGSEGV, Segmentation fault.

0x0000000000400648 in strFirstCh (str=0x7fffe11e6ff5 "ce", ch=97 'a',
lstr=18446744073709551559) at replace.c:17

17 if (str[current] == ch) {

(gdb) run
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaskadhfklajsdhfkajfhklajdfhjkafhkdf"
"a" "+"
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /home/fedora/c/replace
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaskadhfklajsdhfkajfhklajdfhjkafhkdf"
"a" "+"
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++sk+dhfkl+jsdhfk+jfhkl+jdfhjk+fhkdf

Program exited normally.
(gdb)

i can see that str and lstr are getting corrupted but i cant see where in my
code that is happening... and putting lots of printfs into it is really
discouraging.

thanks.

fedora

Feb 17, 2010, 11:58:54 AM

Hi all!

now I added two printf calls to strSubstr and allSubstr to trace where str
and lstr are getting wrong values.

I also added one if statement to check if adding step to str would result in
it going beyond str+lstr. that is :-

if (((orig_str + orig_lstr) - str) <= step) break;

below is full code and run in gdb with still seg fault. suddenly lstr is
getting a wrong value i can see, but which statement is doing that i can't see:(

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

size_t strLength(char *cstr) {
    size_t index = 0;

    while (cstr[index] != '\0') ++index;
    return index;
}

char *strFirstCh(char *str, char ch, size_t lstr) {
    char *chpos = 0;
    size_t current;

    for (current = 0; current < lstr; current++) {
        if (str[current] == ch) {
            chpos = str + current;
            break;
        }
    }
    return chpos;
}

int strComp(char *s, char *t, size_t len) {
    int ret = 0;
    size_t index;

    for (index = 0; index < len; index++) {
        if (s[index] != t[index]) {
            ret = 1;
            break;
        }
    }
    return ret;
}

char *strSubstr(

char *str,
char *sub,
size_t lstr,
size_t lsub) {
char *substr = 0;
char *anchor = str;
size_t remaining_len = (lstr - lsub) + 1;

assert(str && sub && lstr && lsub && lstr >= lsub);

printf("in strSubstr: str = %p\tlstr = %zu\n", (void*)str, lstr);

if (((orig_str + orig_lstr) - str) <= step) break;

str += step;
lstr = (orig_str + orig_lstr) - str;

printf("in allSubstr: str = %p\tlstr = %zu\n", (void*)str, lstr);
}

if (!new) {
free(subpos);

return 0;
}
strc = newc = replaced = 0;
while (strc <= lstr) {

if (replaced < replacements && str + strc == subpos[replaced]) {

for (repc = 0; repc < lrep; repc++) {
new[newc] = rep[repc];
newc++;
}
replaced++;
strc += lsubstr;
}
else {
new[newc] = str[strc];
strc++;
newc++;
}
}
free(subpos);
}
else {
new = malloc(lstr + 1);
if (!new)
return 0;
for (strc = 0; strc <= lstr; strc++)
new[strc] = str[strc];
}
return new;
}

int main(int argc, char **argv) {
    char *newstr;

    assert(argc == 4);

    newstr = replace(argv[1], argv[2], argv[3]);
    if (newstr)
        printf("%s\n", newstr);
    else
        printf("replace() -> null\n");
    free(newstr);
    return 0;
}

# gdb ./replace
(gdb) run "asdkhfaklsdjfhakldfhaklsdjfhakldfhakldfhaskldfh" "asd" "+"
Starting program: /home/fedora/c/replace
"asdkhfaklsdjfhakldfhaklsdjfhakldfhakldfhaskldfh" "asd" "+"
in strSubstr: str = 0x7fff646a86e4 lstr = 47
in allSubstr: str = 0x7fff646a86e7 lstr = 44
in strSubstr: str = 0x7fff646a86e7 lstr = 44
in allSubstr: str = 0x7fff646a8717 lstr = 18446744073709551612
in strSubstr: str = 0x7fff646a8717 lstr = 18446744073709551612

Program received signal SIGSEGV, Segmentation fault.

0x0000000000400698 in strFirstCh (str=0x7fff646a8ff5 "ce", ch=97 'a',

lstr=18446744073709551559) at replace.c:17
17 if (str[current] == ch) {

(gdb) run "asdkhfaklsdjfhakldfhaklsdjfhakldfhakldfhaskldfh" "as" "+"

The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /home/fedora/c/replace
"asdkhfaklsdjfhakldfhaklsdjfhakldfhakldfhaskldfh" "as" "+"
in strSubstr: str = 0x7fff27e376e5 lstr = 47
in allSubstr: str = 0x7fff27e376e7 lstr = 45
in strSubstr: str = 0x7fff27e376e7 lstr = 45
in allSubstr: str = 0x7fff27e3770f lstr = 5
in strSubstr: str = 0x7fff27e3770f lstr = 5
in strSubstr: str = 0x7fff27e376e5 lstr = 47
in strSubstr: str = 0x7fff27e376e7 lstr = 45
+dkhfaklsdjfhakldfhaklsdjfhakldfhakldfh+kldfh

Program exited normally.

as its shown using "as" for sub-string instead of "asd" makes it work ok.
somewhere some length calculation is getting wrapped around but am not able
to pin point...

thanks all.
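
The huge lstr in the first trace is consistent with plain unsigned wrap-around. The string starts at 0x7fff646a86e4 and is 47 characters long, and allSubstr reports str = 0x7fff646a8717, which is 51 bytes past the start, so 47 - 51 = -4 becomes 18446744073709551612 once it is stored in a 64-bit size_t. A small sketch of that arithmetic, done on offsets rather than pointers; both function names are hypothetical and not part of the posted program.

```c
#include <assert.h>
#include <stddef.h>

/* Characters left after `offset` in a string of `len` characters,
 * computed the unchecked way: wraps to a huge value when offset > len. */
size_t remaining_wrapping(size_t len, size_t offset)
{
    return len - offset;
}

/* Checked variant: returns 0 when `offset` lies beyond the end,
 * otherwise stores the remaining length in *out and returns 1. */
int remaining_checked(size_t len, size_t offset, size_t *out)
{
    assert(out);
    if (offset > len)
        return 0;
    *out = len - offset;
    return 1;
}
```

On an implementation with a 64-bit size_t, remaining_wrapping(47, 51) yields the 18446744073709551612 seen in the trace, while remaining_checked(47, 51, &r) reports failure instead.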

Walter Banks

Feb 17, 2010, 12:25:34 PM

Branimir Maksimovic wrote:

All in all it seems that g++ is 6 times faster than gcc and also
java server is slightly faster than gcc according to this benchmark.
I guess they don't test different algorithms, rather same algorithm
in different language implementations, but I could be wrong.
According to this C is slower than java and both consumes more memory
than c++ and is 6 times slower for same algorithm?

http://shootout.alioth.debian.org/u64/performance.php?test=knucleotide&sort=fullcpu
×    Program Source Code    CPU secs  Elapsed secs  Memory KB  Code B  ≈ CPU Load
1.0  C++ GNU g++ #6            11.22         11.23    142,288    3415  0% 0% 0% 100%
2.0  C++ GNU g++               22.30         22.30    135,788    2106  0% 0% 0% 100%
3.1  Ada 2005 GNAT #2          34.30         34.36    256,660    4865  0% 0% 0% 100%
4.2  Java 6 -server #2         47.33         47.41    490,660    1602  0% 0% 0% 100%
5.0  Java 6 -server            56.15         56.29  1,295,096    1330  0% 0% 0% 100%
5.0  C GNU gcc #6              56.21         56.25    180,540    2439  0% 0% 0% 100%

> According to this C is slower than java and both consumes more memory

> than c++ and is 6 times slower for same algorithm?

The stats that are quoted are for GNU. At the risk of causing a flame war,
this is far from a state-of-the-art compiler.

The shootout benchmarks have done a lot to highlight the language
component to code generation.

Regards

Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

Ben Bacarisse

Feb 17, 2010, 12:31:02 PM

fedora <no_...@invalid.invalid> writes:

> Ben Bacarisse wrote:
>
>> fedora <no_...@invalid.invalid> writes:
>>
>>> Have finished my program for spinoza's challenge. rewrote everything and
>>> this time i made each statement as simple as posible, so that i can
>>> understand the program. The allSubstr procedure can search for over
>>> lapping sub-string too like spinoza wanted, but the replace routine
>>> doesnt use that since i cant think how to replace over lapping ones!
>>>
>>> haven't used any function from string.h! it works for strings i could
>>> think of but maybe it got bugs since i'm just a beginner...
>>
>> The result is good, but I am not sure you were right to accept the
>> peculiar notion of not using standard string functions. If you felt
>> you had to, why not use standard functions but then plug-in you own
>> versions? That way you learn about the standard library and get to
>> write the character-fiddling functions that can be useful learning
>> exercises.
>
> Thanks Ben! I didn't use stdlib functions because i wanted to how
> easy/difficult it would be to write my own. also spinoza didn't accept
> program that called function in string.h.

Why do you want to do what spinoza1111 says? I am curious about how
you decided it was something you wanted to do.

His initial programs all used string.h and some versions erroneously
called strlen without including string.h. I don't know why he decided
to change his mind, but a skilled C programmer would use the standard
library and only do something more complex if there was a compelling
reason.

> About plugging in my own versions, do you mean writing routines with the
> same name so that at linking the linker finds my lib first and plug it in?
> but i read somewhere that using ansi c's namespace and relying on linker is
> all undefined and gcc can replace calls to stdlib functions with inline code
> so bypassing my code too... but i'll try it.

You do it like this:

#define STRLEN my_strlen

and you write STRLEN, STRSTR etc in your code.
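
A slightly fuller sketch of this macro plug-in idea. The USE_OWN_STRINGS switch and the total_length caller are assumptions added for illustration; only the STRLEN macro itself comes from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-written stand-in for strlen(). */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    assert(s);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Build with -DUSE_OWN_STRINGS to plug in the hand-written version;
 * without it the standard library function is used. */
#ifdef USE_OWN_STRINGS
#define STRLEN my_strlen
#else
#define STRLEN strlen
#endif

/* Calling code only ever spells the macro name, so no identifier
 * reserved by the standard library is redefined. */
size_t total_length(const char *a, const char *b)
{
    return STRLEN(a) + STRLEN(b);
}
```

Because the program never defines a function called strlen, the concern about the linker or about gcc inlining the built-in does not arise with this arrangement.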

<snip>

>>> char *strSubstr(
>>> char *str,
>>> char *sub,
>>> size_t lstr,
>>> size_t lsub) {
>>> char *substr = 0;
>>> char *anchor = str;
>>> size_t remaining_len = (lstr - lsub) + 1;

Your problems below come from this + 1, I think. It looks wrong.
Sorry I did not spot this first time round.

>>> assert(str && sub && lstr && lsub && lstr >= lsub);
>>> while (remaining_len > 0 && anchor) {
>>> if (anchor = strFirstCh(anchor, *sub, remaining_len)) {
>>> if (strComp(anchor, sub, lsub) == 0) {
>>> substr = anchor;
>>> break;
>>> }
>>> anchor++;
>>> remaining_len--;
>>> }
>>> }
>>> return substr;
>>> }

> i made no other changes and compiled and ran... but strange errors are
> occurring. below i'm pasting session from gdb... i really appreciate if
> anyone can tell me why it's happening. what's irritating is there is no
> patter for the seg faults. sometimes it happens, sometimes not.

If you run off an array, all kinds of strange things can happen. It
is often not worthwhile trying to work out exactly why (at least I've
stopped trying -- I just fix the problem).

> compiled with gcc -Wall -Wextra -std=c99 -pedantic -o replace replace.c -ggdb3
>
> # gdb ./replace
> GNU gdb 6.8-debian
> Copyright (C) 2008 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu"...
>
> (gdb) run "sdakfhaskfhaskdfhaskdfhaksdfhksdfhajksdfhkjsdfhsdfasd" "sda" "+"
> Starting program: /home/fedora/c/replace
> "sdakfhaskfhaskdfhaskdfhaksdfhksdfhajksdfhkjsdfhsdfasd" "sda" "+"
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000400648 in strFirstCh (str=0x7fff1da36feb "rc/c/replace",
> ch=115 's', lstr=18446744073709551546) at replace.c:17
> 17 if (str[current] == ch) {
>
> how can str be "rc/c/replace"?? It should be some part of the "sdak..."
> string as i see...
> ch is with right value. but lstr is wrong. how did it get that value? i cant
> see any thing in my code that is wrong but i'm not intel.
>
> big thanks to any one who can say why str is "rc/c/replace" and lstr is
> wrong...

See above.

> am thinking if string programing in C is naturally so difficult or i'm just
> stupid:(

No, it is hard to get the details right but you are not helping
yourself by not breaking your program up into helpful simple
functions. You have some nice functions to help find the strings, but
you stopped there. I'd have some functions to help build the copy
with the replacements.

*Actually*, I'd use (and did use) memcpy and strcpy, but if I had
decided to drink the "no string.h" cool aid, I'd write these myself.
Neither is more than a line or two and they simplify the replace
function a lot.

For my own amusement, I've studied what slows up this replace function
and I have written a reasonably fast version that avoids strstr because,
as I've described elsewhere, it forces the program to re-scan strings
unnecessarily. For very long strings, strstr still wins because of the
sophisticated algorithm that glibc's version uses, but for anything
else it is slightly faster to scan for all the sub-string positions
"by hand". Of course, it still uses the other string functions.
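
To illustrate how small helper functions shorten the replace routine, here is a sketch under the "no string.h" rule. It is not the version described above: all names (copy_bytes, length_of, matches_at, replace_all) are hypothetical, and for simplicity it compares at each position twice (once to count, once to copy) rather than recording the match positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for memcpy(); returns the position just past the copied bytes. */
static char *copy_bytes(char *dst, const char *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
    return dst;
}

/* Stand-in for strlen(). */
static size_t length_of(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Nonzero if the first lsub characters at s equal those at sub. */
static int matches_at(const char *s, const char *sub, size_t lsub)
{
    size_t i;
    for (i = 0; i < lsub; i++)
        if (s[i] != sub[i])
            return 0;
    return 1;
}

/* Returns a malloc'd copy of str with every non-overlapping occurrence
 * of sub replaced by rep, or NULL if allocation fails. Caller frees. */
char *replace_all(const char *str, const char *sub, const char *rep)
{
    size_t lstr, lsub, lrep, i, count = 0;
    char *out, *w;

    assert(str && sub && rep);
    lstr = length_of(str);
    lsub = length_of(sub);
    lrep = length_of(rep);

    /* First pass: count matches so the size of the result is known. */
    if (lsub > 0 && lsub <= lstr) {
        for (i = 0; i <= lstr - lsub; ) {
            if (matches_at(str + i, sub, lsub)) {
                count++;
                i += lsub;
            } else {
                i++;
            }
        }
    }

    out = malloc(lstr - count * lsub + count * lrep + 1);
    if (!out)
        return NULL;

    /* Second pass: copy characters, writing rep in place of each match. */
    w = out;
    for (i = 0; i < lstr; ) {
        if (lsub > 0 && lsub <= lstr - i && matches_at(str + i, sub, lsub)) {
            w = copy_bytes(w, rep, lrep);
            i += lsub;
        } else {
            *w++ = str[i++];
        }
    }
    *w = '\0';
    return out;
}
```

Every length comparison is written so that the subtraction cannot go below zero (lstr - lsub is only evaluated after lsub <= lstr is known, and lstr - i only while i < lstr), which sidesteps the unsigned wrap-around discussed elsewhere in the thread.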

<snip>
--
Ben.

bartc

Feb 17, 2010, 12:46:48 PM

"fedora" <no_...@invalid.invalid> wrote in message

news:hlggj2$t3m$1...@news.eternal-september.org...

> am thinking if string programing in C is naturally so difficult or i'm
> just
> stupid:(

It's just very fiddly.

--
Bartc

spinoza1111

Feb 17, 2010, 1:04:11 PM

On Feb 17, 6:20 pm, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:
> Richard Heathfield <r...@see.sig.invalid> writes:
> >spinoza1111 wrote:

> >> On Feb 17, 6:35 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
> >> wrote:
> > <snip>
> >>> Do not feed the troll.
>
> > <snip>
>
> >> Richard Heathfield is by no means a friend of mine, yet he has gone on
> >> record to say that I am not a "troll".
>
> > That is true. I have also gone on record as saying that you're an idiot.
>
> He does it for the strokes. His motives may be different, and the
> strokes he seeks may be different (though not vastly different from
> some of the sci.math cranks I've encountered), but he's still doing
> it to elicit responses. Troll, crank, idiot - you can tick all three
> with him.

All of which may be true. The problem is that in every discussion I'm
in, I am (like Shakespeare's Falstaff) not only Witty (IMO) in myself
but the cause that Wit is in others. It's always a
refreshing change from the tedious efforts in this and other
newsgroups to recreate the dull spirit of the nastiest type of
business office, which is anhedonia gone insane. I somehow manage to
drive discussions that are always above the usual level.

Oh yes, and this "troll, crank, and idiot" here was the first to post
a solution to the problem, the first to debug it, and is the only one
to have posted anything like a correct solution other than Willem and
(as far as I can tell) io_x. The Regular Guys have been posting
idiotic nonsolutions with far more bugs, all of which use string.h.
It's far too late for the Chomsky Type 3 Guys to post anything, since
they'd probably plagiarize Willem or myself.

But if nonconforming to normalized deviance and Eunuch programming is
to be a troll, a crank and an idiot in your book, hey, so I am.

spinoza1111

Feb 17, 2010, 1:05:14 PM

THOU SHALT NOT says Richard NOT USE STRING.H, even for shits and
giggles.

Because thou are virtuous, shall there be no more cakes and ale?

Ike Naar

Feb 17, 2010, 1:25:51 PM

In article <hlglm1$r14$1...@news.eternal-september.org>,

fedora <no_...@invalid.invalid> wrote:
>
>I also added one if statement to check if adding step to str would result in
>it going beyond str+lstr. that is :-
>
>if (((orig_str + orig_lstr) - str) <= step) break;

Not sure if this is your problem, but you should be aware that
you're doing a comparison between a signed number (the difference
between the pointers (orig_str+orig_lstr) and str), and an unsigned
number (step).
This will give unexpected results if the left-hand side of the comparison
is a small negative value; the value will be converted to unsigned
before it is compared to the right-hand side, and becomes a huge positive
value. As a result, the comparison yields false.

You can work around the problem by rewriting the comparison as

(orig_str + orig_lstr <= str + step)

where both sides of the comparison are pointers.
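
A small sketch of the conversion described above. The function names are hypothetical, and the explicit cast in the first function spells out what the implicit conversion amounts to on typical platforms where ptrdiff_t and size_t have the same width.

```c
#include <assert.h>
#include <stddef.h>

/* The test as originally written: `left` stands for the pointer
 * difference (orig_str + orig_lstr) - str. Converting it to the
 * unsigned type of `step` turns a negative value into a huge one. */
int stop_mixed(ptrdiff_t left, size_t step)
{
    return (size_t)left <= step;
}

/* Handling the sign first gives the intended answer: stop when the
 * position is already at or past the end, or when fewer than or exactly
 * `step` characters remain. */
int stop_signed_first(ptrdiff_t left, size_t step)
{
    assert(step > 0);
    return left <= 0 || (size_t)left <= step;
}
```

With left = -4 (four bytes past the end), stop_mixed returns 0 and the loop keeps going, whereas stop_signed_first returns 1.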

fedora

Feb 17, 2010, 1:30:24 PM

Ike Naar wrote:

Hi Ike!

Gcc warned me of this too, but i ignored it since i couldn't think of a
better way.

Initially i thought of writing it exactly like yours but then when str is <
step bytes before it's end, adding step to it would be undefined behaviour
so i coded like i did. Maybe i can cast step to signed long and then compare
with (orig_str+orig_lstr) - str? is this ok?

thanks

fedora

Feb 17, 2010, 1:39:44 PM

Ben Bacarisse wrote:

> fedora <no_...@invalid.invalid> writes:
>
>> Ben Bacarisse wrote:
>>
>>> fedora <no_...@invalid.invalid> writes:
>>>
>>>> Have finished my program for spinoza's challenge. rewrote everything
>>>> and this time i made each statement as simple as posible, so that i can
>>>> understand the program. The allSubstr procedure can search for over
>>>> lapping sub-string too like spinoza wanted, but the replace routine
>>>> doesnt use that since i cant think how to replace over lapping ones!
>>>>
>>>> haven't used any function from string.h! it works for strings i could
>>>> think of but maybe it got bugs since i'm just a beginner...
>>>
>>> The result is good, but I am not sure you were right to accept the
>>> peculiar notion of not using standard string functions. If you felt
>>> you had to, why not use standard functions but then plug-in you own
>>> versions? That way you learn about the standard library and get to
>>> write the character-fiddling functions that can be useful learning
>>> exercises.
>>
>> Thanks Ben! I didn't use stdlib functions because i wanted to how
>> easy/difficult it would be to write my own. also spinoza didn't accept
>> program that called function in string.h.
>
> Why do you want to do what spinoza1111 says? I am curious about how
> you decided it was something you wanted to do.

I wanted to write my own functions partly for learning as i'm just beginning
and also because if i used string.h, i couldn't have submitted to spinoza's
challenge...

but mostly it was just for learning. as we see, i was correct since i'm
having so much difficulty.

> His initial programs all used string.h and some versions erroneously
> called strlen without including string.h. I don't know why he decided
> to change his mind, but a skilled C programmer would use the standard
> library and only do something more complex if there was a compelling
> reason.

i'm not a skilled programmer:)

>> About plugging in my own versions, do you mean writing routines with the
>> same name so that at linking the linker finds my lib first and plug it
>> in? but i read somewhere that using ansi c's namespace and relying on
>> linker is all undefined and gcc can replace calls to stdlib functions
>> with inline code so bypassing my code too... but i'll try it.
>
> You do it like this:
>
> #define STRLEN my_strelen
>
> and you write STRLEN, STRSTR etc in your code.

Okay i see. Thanks.

> <snip>
>>>> char *strSubstr(
>>>> char *str,
>>>> char *sub,
>>>> size_t lstr,
>>>> size_t lsub) {
>>>> char *substr = 0;
>>>> char *anchor = str;
>>>> size_t remaining_len = (lstr - lsub) + 1;
>
> Your problems below come from this + 1, I think. It looks wrong.
> Sorry I did not spot this first time round.

I added the one because when both string and sub string are equal length the
diff will give zero, so to make the while loop below work properly, i add
one.

i cant compare remaining_len >= 0 since that'll always be true for unsigned.
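
One way around the "always true for unsigned" problem is to count start positions upward instead of counting a remaining length downward. A sketch of that idea follows; find_first is a hypothetical name, and unlike the posted strSubstr it returns a null result (rather than asserting) when the substring is longer than the string.

```c
#include <assert.h>
#include <stddef.h>

/* First occurrence of the lsub characters at `sub` within the lstr
 * characters at `str`, or NULL if there is none. */
const char *find_first(const char *str, const char *sub,
                       size_t lstr, size_t lsub)
{
    size_t i, j;

    assert(str && sub);
    if (lsub == 0 || lsub > lstr)
        return NULL;

    /* lstr - lsub cannot wrap here because lsub <= lstr was checked. */
    for (i = 0; i <= lstr - lsub; i++) {
        for (j = 0; j < lsub && str[i + j] == sub[j]; j++)
            ;
        if (j == lsub)
            return str + i;
    }
    return NULL;
}
```

The equal-length case needs no special "+ 1": when lstr == lsub the outer loop runs exactly once, with i == 0.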

>>>> assert(str && sub && lstr && lsub && lstr >= lsub);
>>>> while (remaining_len > 0 && anchor) {
>>>> if (anchor = strFirstCh(anchor, *sub, remaining_len)) {
>>>> if (strComp(anchor, sub, lsub) == 0) {
>>>> substr = anchor;
>>>> break;
>>>> }
>>>> anchor++;
>>>> remaining_len--;
>>>> }
>>>> }
>>>> return substr;
>>>> }
>
> <snip bug fix>
>> i made no other changes and compiled and ran... but strange errors are
>> occurring. below i'm pasting session from gdb... i really appreciate if
>> anyone can tell me why it's happening. what's irritating is there is no
>> patter for the seg faults. sometimes it happens, sometimes not.
>
> If you run off an array, all kinds of strange things can happen. It
> is often not worthwhile trying to work out exactly why (at least I've
> stopped trying -- I just fix the problem).

i hate it when i cant get a mental image of how a function will behave for
all combinations of legal input values. boundary values complicate
everything and unsigned seems to be more trouble than i thought.

Yes i'm not happy with the replace routine. It's too big. i'll code versions
of strcpy and make it shorter and try again. maybe the bug is in replace...

> *Actually*, I'd use (and did use) memcpy and strcpy, but if I had
> decided to drink the "no string.h" cool aid, I'd write these myself.
> Neither is more than a line or two and they simplify the replace
> function a lot.
>
> For my own amusement, I've studied what slows up this replace function
> and I have written reasonably fast version that avoids strstr because,
> as I've descried elsewhere, it forces the program to re-scan strings
> unnecessarily.

by this replace function do you mean mine above?

Ike Naar

Feb 17, 2010, 1:53:28 PM

In article <hlgr1g$qp1$1...@news.eternal-september.org>,

fedora <no_...@invalid.invalid> wrote:
>Ike Naar wrote:
>
>> In article <hlglm1$r14$1...@news.eternal-september.org>,
>> fedora <no_...@invalid.invalid> wrote:
>>>
>>>if (((orig_str + orig_lstr) - str) <= step) break;
>>

>> [snip]

>>
>> (orig_str + orig_lstr <= str + step)
>

> [snip]

>
>Initially i thought of writing it exactly like yours but then when str is <
>step bytes before it's end, adding step to it would be undefined behaviour
>so i coded like i did.

Whoops you're right I did not think about that.

> Maybe i can cast step to signed long and then compare
>with (orig_str+orig_lstr) - str? is this ok?

That is a possibility. If your compiler supports ptrdiff_t then perhaps
it's better to cast step to that type, so that the types on both sides
of the comparison operator are the same.

What about (orig_str + orig_lstr - step <= str) ?

Kenny McCormack

Feb 17, 2010, 2:00:39 PM

In article <250f529b-a204-4d8a...@w27g2000pre.googlegroups.com>,
spinoza1111 <spino...@yahoo.com> wrote:
...

>But if nonconforming to normalized deviance and Eunuch programming is
>to be a troll, a crank and an idiot in your book, hey, so I am.

Now you've gone and done it! From here on in, they will gleefully refer
to you as a "self-confessed troll".

It happened to me about 5 years ago - and they still use it in their
diatribes. The fact, of course, being that they define things so that
anyone with even a glimmer of intelligence becomes defined as "troll".

fedora

Feb 17, 2010, 2:02:58 PM

Ike Naar wrote:

> In article <hlgr1g$qp1$1...@news.eternal-september.org>,
> fedora <no_...@invalid.invalid> wrote:
>>Ike Naar wrote:
>>
>>> In article <hlglm1$r14$1...@news.eternal-september.org>,
>>> fedora <no_...@invalid.invalid> wrote:
>>>>
>>>>if (((orig_str + orig_lstr) - str) <= step) break;
>>>
>>> [snip]
>>>
>>> (orig_str + orig_lstr <= str + step)
>>
>> [snip]
>>
>>Initially i thought of writing it exactly like yours but then when str is
>>< step bytes before it's end, adding step to it would be undefined
>>behaviour so i coded like i did.
>
> Whoops you're right I did not think about that.
>
>> Maybe i can cast step to signed long and then compare
>>with (orig_str+orig_lstr) - str? is this ok?
>
> That is a possibility. If your compiler supports prtdiff_t then perhaps
> it's better to cast step to that type, so that the types on both sides
> of the comparison operator are the same.
>
> What about (orig_str + orig_lstr - step <= str) ?

This is perfect thanks!! After changing that line to the comparison above,
i've stopped getting the strange seg faults as far as i can see.

so it seems this failed test allowed lstr to wrap around to very big values
and access foreign memory locations. now it works!

thanks again Ike
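
For comparison, a sketch of the same bounds test done entirely on size_t offsets, so that no pointer is ever formed outside the string and no signed value is involved. The function advance and its parameters are assumptions for illustration only, not part of the posted program.

```c
#include <assert.h>
#include <stddef.h>

/* Move *offset forward by `step` within a string of `len` characters.
 * Returns 1 and updates *offset if at least one character would remain
 * after the move; returns 0 and leaves *offset untouched otherwise. */
int advance(size_t len, size_t *offset, size_t step)
{
    assert(offset && *offset <= len);
    if (len - *offset <= step)      /* cannot wrap: *offset <= len */
        return 0;
    *offset += step;
    return 1;
}
```

A caller would keep the current position as an offset from orig_str and stop scanning as soon as advance() returns 0.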

Richard Harter

Feb 17, 2010, 3:29:50 PM

Clearly then, you are not a troll.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.

spinoza1111

Feb 17, 2010, 3:54:54 PM

On Feb 17, 10:00 pm, gaze...@shell.xmission.com (Kenny McCormack)
wrote:

> In article <250f529b-a204-4d8a-b670-2b555ea6a...@w27g2000pre.googlegroups.com>,
> spinoza1111 <spinoza1...@yahoo.com> wrote:
>
> ...
>
> >But if nonconforming to normalized deviance and Eunuch programming is
> >to be a troll, a crank and an idiot in your book, hey, so I am.
>
> Now you've gone and done it! From here on in, they will gleefully refer
> to you as a "self-confessed troll".
>
> It happened to me about 5 years ago - and they still use it in their
> diatribes. The fact, of course, being that they define things so that
> anyone with even a glimmer of intelligence becomes defined as "troll".

There's no point, Kenny, in evading their stupid, childish labels. But
I predict that if I continue posting great code and incisive writing,
things will change here.

Hasta la victoria siempre!

spinoza1111

Feb 17, 2010, 4:03:23 PM

On Feb 17, 11:29 pm, c...@tiac.net (Richard Harter) wrote:
> On Wed, 17 Feb 2010 14:00:39 +0000 (UTC),
>

> gaze...@shell.xmission.com (Kenny McCormack) wrote:
> >In article <250f529b-a204-4d8a-b670-2b555ea6a...@w27g2000pre.googlegroups.com>,

> >spinoza1111 <spinoza1...@yahoo.com> wrote:
> >...
> >>But if nonconforming to normalized deviance and Eunuch programming is
> >>to be a troll, a crank and an idiot in your book, hey, so I am.
>
> >Now you've gone and done it! From here on in, they will gleefully refer
> >to you as a "self-confessed troll".
>
> >It happened to me about 5 years ago - and they still use it in their
> >diatribes. The fact, of course, being that they define things so that
> >anyone with even a glimmer of intelligence becomes defined as "troll".
>
> Clearly then, you are not a troll.

They think they can define the world,
Their poison they have hurled,
Because they can't stand freedom:
They recreate their corporate kingdom.
They actually do this for fun
Thinking by lies they have won,
But when you look at their code,
You see quite a load,
Of the bugs they condemn in others.
Seebach can't get string length right
He's off by one in plain sight,
And they dare to call you a "troll"
When you can write above the level of O,
When you code above the level of Joe.

They've taken structured programming
And made it a dog's dinner:
They think it is a ban on thinking.
Dijkstra died at seventy-two
In part from life long depression
At the nonsense and voodoo
That passed for good computing.
Now they invoke his name
To speak it they should feel shame.
>
> Richard Harter, c...@tiac.net
> http://home.tiac.net/~cri, http://www.varinoma.com

Seebs

Feb 17, 2010, 5:16:34 PM

On 2010-02-17, fedora <no_...@invalid.invalid> wrote:
> Thanks Ben! I didn't use stdlib functions because i wanted to see how
> easy/difficult it would be to write my own. also spinoza didn't accept
> program that called function in string.h.

Unless you're expecting to spend most of your programming career working
for people who have major obsessive problems with using technology in ways
that generally work, because of personal grudges never adequately explained,
I would suggest that perhaps what Nilges accepts or doesn't accept should
not be a component of any technical decision, ever.

> About plugging in my own versions, do you mean writing routines with the
> same name so that at linking the linker finds my lib first and plug it in?
> but i read somewhere that using ansi c's namespace and relying on linker is
> all undefined and gcc can replace calls to stdlib functions with inline code
> so bypassing my code too... but i'll try it.

The obvious way to do it would be:

size_t
callstrlen(const char *s) {
#ifdef ME
    /* your code here */
#else
    return strlen(s);
#endif
}
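
To make the wrapper idea concrete, here is a minimal sketch with the placeholder filled in. The name my_strlen is an assumption for illustration only; the point is that the rest of the program calls callstrlen() and never needs to know which version is behind it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-written replacement; the name is an assumption. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}

/* Compile with -DME to route every call to the hand-written version. */
size_t callstrlen(const char *s)
{
#ifdef ME
    return my_strlen(s);
#else
    return strlen(s);
#endif
}
```

Because the two versions sit behind one name, it is easy to build the program both ways and compare the results.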

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Seebs

Feb 17, 2010, 5:48:12 PM

On 2010-02-17, fedora <no_...@invalid.invalid> wrote:

> am thinking if string programing in C is naturally so difficult or i'm just
> stupid:(

It takes quite a while to get used to it, at the very least. When you're
doing multiple string operations, you have a lot of things to keep track
of. Consider a simple strcpy() replacement:

char *cpystr(char *dest, char *src) {
    char *start = dest;
    while ((*(dest++) = *(src++)) != '\0')
        ;
    return start;
}

This isn't actually all that simple; I write a lot of code which is
much harder for me to figure out, but which would be easier to figure
out without years of experience with strings. It can be a bit easier
to read done the other way:

char *cpystr(char *dest, char *src) {
    int i;
    for (i = 0; src[i]; ++i) {
        dest[i] = src[i];
    }
    dest[i] = '\0';
    return dest;
}

This one is often easier for people to read because the pointers stay
pointing to the same things. So for some cases, you may find array
notation simpler. What I have found is usually that array notation is
simpler as long as I only have one index to use. If I have multiple indexes,
it can be easier for me to think about the problem if I write it in
terms of pointers, but not necessarily tersely:

while (*src) {
    *dest = *src;
    ++dest;
    ++src;
}
*dest = '\0';

That's certainly simpler to understand than the original version. What this
does is trade the simplicity of having the pointers stay fixed -- they always
point to the same things -- for the simplicity of having the semantic content
of the pointers stay fixed -- they always point to the next character you
need to copy.
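
For comparison with the two complete functions earlier, the stepping loop can be wrapped into a function of its own. This is only a sketch: the name cpystr_steps and the const qualifier on src are assumptions, and like strcpy() it trusts the caller to supply a large enough destination.

```c
#include <stddef.h>

/* Copies src into dest one character at a time and returns dest.
   Both pointers always refer to the next character to copy. */
char *cpystr_steps(char *dest, const char *src)
{
    char *start = dest;   /* remember the beginning for the return value */

    while (*src) {
        *dest = *src;
        ++dest;
        ++src;
    }
    *dest = '\0';
    return start;
}
```

Since dest moves, the start of the buffer has to be saved separately, which is the price of letting the pointers track the current position.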

Which of those is better is a bit arbitrary, and may vary from one person to
another, or one algorithm to another. Once you start getting into stuff like
strstr() implementations, it's often important to pick descriptive names.
I don't recommend names like thePointerToTheThingIWasGoingToCopyInto. I tend
to prefer short nickname length things; something that's easy to recognize,
visually distinct, and tells me what I'm doing.

So...

char *findstr(char *needle, char *haystack) {
    while (*haystack) {
        if (*haystack == *needle) {
            char *h = haystack, *n = needle, *first = NULL;
            while (*n && *n == *h) {
                if (!first && *h == *needle) {
                    first = h;
                }
                ++n;
                ++h;
            }
            if (!*n) {
                return haystack;
            } else {
                haystack = first - 1;
            }
        }
        ++haystack;
    }
    return NULL;
}

I don't know whether this will work, but it's approximately a standard cheap
strstr(), with the one optimization being an attempt to not rescan a chunk
of the haystack looking for the first character of the needle when it's not
needed. The names aren't very long, and the inner loop uses short names
which clearly refer back to their sources.

You're invited to look for errors in this, as there may well be some, not
the least of which is that I have no idea whether or not it will compile.
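
One reading of the code as posted: the inner loop is only entered when *haystack == *needle, so first is set to haystack on the very first pass, and "haystack = first - 1" followed by "++haystack" leaves haystack where it was. A partial match (say needle "ab" against "ac") would then never advance. The sketch below drops the skip-ahead optimization entirely and is a plain quadratic search, not a repaired version of the original. The name findstr_sketch and the const qualifiers are assumptions; the argument order follows the post.

```c
#include <stddef.h>

/* Naive search: returns a pointer to the first occurrence of needle
   in haystack, or NULL. An empty needle matches at the start,
   as with strstr(). */
char *findstr_sketch(const char *needle, const char *haystack)
{
    if (*needle == '\0')
        return (char *)haystack;

    for (; *haystack != '\0'; ++haystack) {
        const char *h = haystack;
        const char *n = needle;

        /* Stops at the first mismatch, including the end of haystack. */
        while (*n != '\0' && *n == *h) {
            ++n;
            ++h;
        }
        if (*n == '\0')
            return (char *)haystack;   /* the whole needle matched here */
    }
    return NULL;
}
```

The outer loop always advances by exactly one character, so termination does not depend on any bookkeeping inside the inner loop.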

Kenny McCormack

Feb 17, 2010, 7:29:33 PM

In article <4b7c0b54....@text.giganews.com>,
Richard Harter <c...@tiac.net> wrote:
...

>>It happened to me about 5 years ago - and they still use it in their
>>diatribes. The fact, of course, being that they define things so that
>>anyone with even a glimmer of intelligence becomes defined as "troll".
>
>Clearly then, you are not a troll.

Don't quit your day job.

Malcolm McLean

Feb 17, 2010, 7:42:02 PM

On Feb 17, 7:16 pm, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-02-17, fedora <no_m...@invalid.invalid> wrote:
>
> > Thanks Ben! I didn't use stdlib functions because i wanted to see how
> > easy/difficult it would be to write my own. also spinoza didn't accept
> > program that called function in string.h.
>
> Unless you're expecting to spend most of your programming career working
> for people who have major obsessive problems with using technology in ways
> that generally work, because of personal grudges never adequately explained,
> I would suggest that perhaps what Nilges accepts or doesn't accept should
> not be a component of any technical decision, ever.
>

That's the ad hominem fallacy. It's not a pretentious term for
"insult" but a common fallacy, which is to suppose that an argument is
wrong because of the person who is making it.

In fact there are good reasons for deprecating string.h. chars
effectively have to be octets, whilst often programs need to accept
non-Latin strings. Then the functions are all very old, with certain
weaknesses (no protection from buffer overrun in strcpy, an O(N)
performance for strcat and strlen, an inconvenient interface for
strcat, const inconsistencies with strchr, very poor functionality
with strfind and const inconsistencies here too, very serious buffer
problems with sprintf, an overly difficult interface and buffer
problems with sscanf, thread problems with strtok and a non-intuitive
interface).

Seebs

Feb 17, 2010, 7:49:33 PM

On 2010-02-17, Malcolm McLean <malcolm...@btinternet.com> wrote:
> On Feb 17, 7:16 pm, Seebs <usenet-nos...@seebs.net> wrote:
>> Unless you're expecting to spend most of your programming career working
>> for people who have major obsessive problems with using technology in ways
>> that generally work, because of personal grudges never adequately explained,
>> I would suggest that perhaps what Nilges accepts or doesn't accept should
>> not be a component of any technical decision, ever.

> That's the ad hominem fallacy. It's not a pretentious term for
> "insult" but a common fallacy, which is to suppose that an argument is
> wrong because of the person who is making it.

No, it's not an ad hominem fallacy. It's the very well supported view
that what Nilges accepts or doesn't accept should not be a component of any
technical decision. I'm not saying that an argument is wrong because of
the person who is making it. I'm saying that a conclusion should be ignored
(neither accepted nor rejected) based on the person who has offered it.

Which is to say, if you know someone is a clown, knowing his position on an
issue tells you nothing for or against the issue. Now, if he had advanced
an argument, it could be worth discussing that argument, but as long as
we're just talking about his conclusion, it's not an ad hominem fallacy to
suggest disregarding the conclusions reached by someone who is demonstrably
very bad at the topic in question.

> In fact there are good reasons for deprecating string.h.

For some purposes, yes.

For manipulation of sequences of non-NUL chars, terminated by a NUL char, not so
much.

> chars
> effectively have to be octets, whilst often programs need to accept
> non-Latin strings.

True. This is addressed in no small part by the multibyte stuff, which you
would presumably use for multibyte strings.

> Then the functions are all very old, with certain
> weaknesses (no protection from buffer overrun in strcpy, an O(N)
> performance for strcat and strlen, an inconvenient interface for
> strcat, const inconsistencies with strchr, very poor functionality
> with strfind and const inconsistencies here too, very serious buffer
> problems with sprintf, an overly difficult interface and buffer
> problems with sscanf, thread problems with strtok and a non-intuitive
> interface).

I'm not aware of "strfind".

While the various interfaces are certainly flawed, consider that the Nilges
alternative is to duplicate the flaws of the interface without even the
benefit of already having been debugged. Or, worse, to just not even come
close.

There are certainly cases where the <string.h> functions are not the right
tool. However, that Nilges argues against it is not an argument either way
in that. His arguments might be an argument either way; his conclusion is
not.

Ben Bacarisse

Feb 17, 2010, 8:50:10 PM

Malcolm McLean <malcolm...@btinternet.com> writes:
<snip>

> In fact there are good reasons for deprecating string.h. chars
> effectively have to be octets, whilst often programs need to accept
> non-Latin strings.

It's easy to switch to wide character versions if you used the
equivalent str* versions. A few macros and you can build versions for
either character type very simply.

> Then the functions are all very old, with certain
> weaknesses (no protection from buffer overrun in strcpy, an O(N)
> performance for strcat and strlen, an inconvenient interface for
> strcat, const inconsistencies with strchr, very poor functionality
> with strfind and const inconsistencies here too, very serious buffer
> problems with sprintf, an overly difficult interface and buffer
> problems with sscanf, thread problems with strtok and a non-intuitive
> interface).

Those are arguments for using something better, not arguments for not
using C's string functions. If the "challenge" had been: "use this
improved string library to write replace" or "design a string library
so that replace is easy to write" I for one would have no objection.

The problem is that rejecting what is already there (rather than using
something better) leads to a /more/ complex and buggy solution.

--
Ben.

Kaz Kylheku

Feb 17, 2010, 8:58:12 PM

On 2010-02-17, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
><snip>
>> In fact there are good reasons for deprecating string.h. chars
>> effectively have to be octets, whilst often programs need to accept
>> non-Latin strings.
>
> It's easy to switch to wide character versions if you used the
> equivalent str* versions.

Just watch out for C99 braindamage!

swprintf has an argument interface similar to the wide character
equivalent of snprintf, but you may be bitten by the gratuitously
different return value convention.
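
A minimal sketch of the difference, assuming a C99-conforming library: when the output does not fit, snprintf returns the length the complete output would have had, whereas swprintf returns a negative value. The helper names and the 4-element buffers are assumptions chosen only to force truncation.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Returns 1 if text does not fit in a 4-char buffer, else 0. */
int narrow_truncated(const char *text)
{
    char buf[4];
    int n = snprintf(buf, sizeof buf, "%s", text);

    /* snprintf: n is the length that would have been written,
       so truncation shows up as n >= buffer size. */
    return n < 0 || (size_t)n >= sizeof buf;
}

/* Returns 1 if text does not fit in a 4-wchar_t buffer, else 0. */
int wide_truncated(const wchar_t *text)
{
    wchar_t buf[4];
    int n = swprintf(buf, sizeof buf / sizeof buf[0], L"%ls", text);

    /* swprintf: a negative return is the only sign of truncation;
       the needed length is not reported. */
    return n < 0;
}
```

Code that sizes a buffer by calling snprintf once to learn the required length has no direct wide-character equivalent under this convention.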

Richard Heathfield

Feb 17, 2010, 9:14:41 PM

Malcolm McLean wrote:
> On Feb 17, 7:16 pm, Seebs <usenet-nos...@seebs.net> wrote:
>> On 2010-02-17, fedora <no_m...@invalid.invalid> wrote:
>>
>>> Thanks Ben! I didn't use stdlib functions because i wanted to see how
>>> easy/difficult it would be to write my own. also spinoza didn't accept
>>> program that called function in string.h.
>> Unless you're expecting to spend most of your programming career working
>> for people who have major obsessive problems with using technology in ways
>> that generally work, because of personal grudges never adequately explained,
>> I would suggest that perhaps what Nilges accepts or doesn't accept should
>> not be a component of any technical decision, ever.
>>
> That's the ad hominem fallacy. It's not a pretentious term for
> "insult" but a common fallacy, which is to suppose that an argument is
> wrong because of the person who is making it.

No, he's not saying that an argument is wrong because Mr Nilges is
making it. He's saying that Mr Nilges's support for an argument is of no
value in determining whether or not the argument is valid. That's very
different.

Here is an example of "ad hominem" argument:

"Mr X says strcpy is dangerous. Mr X doesn't know spit about C.
Therefore strcpy is not dangerous". Poor reasoning.

Here is an example of what Seebs is saying, using the above
(hypothetical) case:

"Mr X says strcpy is dangerous. Mr X doesn't know spit about C.
Therefore X's claim that strcpy is dangerous adds no value to the
argument that strcpy is dangerous. Whether it is or isn't dangerous is
another matter entirely."

That is, he is in effect claiming that it is possible that even someone
who knows little about a subject may know enough about it to make a
correct claim, or may simply make a correct claim by chance. So, moving
back from the abstract to the concrete, he is not dismissing any claim
made by Mr Nilges as being necessarily incorrect. That would be foolish,
not only because it would be an invalid "ad hominem" argument, but also
because Mr Nilges (who is on record as saying that he wishes to cause
maximum damage to this newsgroup) could exploit such a position by
deliberately making claims that are clearly true.

> In fact there are good reasons for deprecating string.h. chars
> effectively have to be octets, whilst often programs need to accept
> non-Latin strings. Then the functions are all very old, with certain
> weaknesses (no protection from buffer overrun in strcpy, an O(N)
> performance for strcat and strlen, an inconvenient interface for
> strcat, const inconsistencies with strchr, very poor functionality
> with strfind and const inconsistencies here too, very serious buffer
> problems with sprintf, an overly difficult interface and buffer
> problems with sscanf, thread problems with strtok and a non-intuitive
> interface).

Taking your specific points one at a time:

Whilst it is true that strcpy offers no added protection against buffer
overrun, careful programming overcomes this problem. Thus, strcpy does
not get in the way of the programmer who knows full well that his buffer
is sufficiently large - no performance penalty is imposed.

Yes, strcat and strlen are O(N) - so, where it matters, you remember the
string length, having found it out the first time. These two functions
offer simple solutions to a simple task, and as such are very often a
good solution to the task at hand. Where that is not the case, we have
the option of building more powerful tools. (And yes, I agree that
strcat's interface could be improved; for example, it could return a
pointer to the null terminator rather than to the beginning of the string.)
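
As a rough illustration of that last point, a concatenation helper that returns the position of the terminator lets the caller chain appends without rescanning. This is a sketch only: the name append_at is hypothetical, and like strcat it does no bounds checking, so the caller must know the buffer is large enough.

```c
#include <stddef.h>

/* Copies src to the position `end`, which must point at the current
   terminator of the destination string. Returns a pointer to the new
   terminator, so the next append starts where this one stopped. */
char *append_at(char *end, const char *src)
{
    while ((*end = *src) != '\0') {
        ++end;
        ++src;
    }
    return end;
}
```

With this shape, building a string from k pieces costs time proportional to the total length, and the final length is available as end - buffer without a separate strlen call.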

Again, I must agree that the const inconsistency with strchr is a bit of
a wart. But the input is const purely to constitute a promise that the
function won't write to the input string. The return value is non-const
because strchr would otherwise be a real pain to use. How could it be
done better?

As for strfind, that's not C's problem. Take it up with the vendor.

To my mind, the sprintf function does not have serious buffer problems.
Nevertheless, some people obviously disagree, and C99 provides snprintf
for such people.

The scanf function is basically a mess, and is rarely used correctly. I
am at a loss to understand why it is introduced so early in programming
texts.

The strtok function is of limited use, but there are times when it is
just the ticket. It would be better, however, for it to take a state
pointer. I'm not convinced that its interface is particularly non-intuitive.
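
To show what "take a state pointer" could look like, here is a minimal sketch in the spirit of POSIX strtok_r, written only with standard strspn and strcspn. The name tok_next and its exact interface are assumptions; like strtok, it writes terminators into the string being scanned.

```c
#include <stddef.h>
#include <string.h>

/* Returns the next token from *state, or NULL when none remain.
   The scan position is kept in *state rather than in hidden static
   storage, so independent scans do not interfere with each other. */
char *tok_next(char **state, const char *delims)
{
    char *tok = *state;
    char *end;

    if (tok == NULL)
        return NULL;
    tok += strspn(tok, delims);       /* skip leading delimiters */
    if (*tok == '\0') {
        *state = NULL;
        return NULL;
    }
    end = tok + strcspn(tok, delims); /* find the end of the token */
    if (*end != '\0') {
        *end = '\0';                  /* terminate the token in place */
        *state = end + 1;
    } else {
        *state = NULL;                /* that was the last token */
    }
    return tok;
}
```

The caller initializes a char * to the start of a writable string and passes its address on every call.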

Walter Banks

Feb 17, 2010, 9:19:53 PM

Ben Bacarisse wrote:

> Those are arguments for using something better, not arguments for not
> using C's string functions. If the "challenge" had been: "use this
> improved string library to write replace" or "design a string library
> so that replace is easy to write" I for one would have no objection.
>
> The problem is that rejecting what is already there (rather than using
> something better) leads to a /more/ complex and buggy solution.

This whole project thread has been filled with examples of how not to
engineer software. Application code specifically avoiding libraries,
Not Invented Here, random design with moving target specifications or
no specifications, ad hoc testing with a dose of 20+ year old unresolved
office battles, interpersonal rivalry and off topic rants.

As several have stated, this is not the environment that we are used to.

Regards

w..

Chris M. Thomasson

Feb 17, 2010, 9:51:29 PM

"Richard Heathfield" <r...@see.sig.invalid> wrote in message
news:FrCdnTZJc8ajweHW...@bt.com...

[...]

> Yes, strcat and strlen are O(N) - so, where it matters, you remember the
> string length, having found it out the first time.

Bingo! :^)

Seebs

Feb 17, 2010, 9:39:45 PM

On 2010-02-17, Walter Banks <wal...@bytecraft.com> wrote:
> This whole project thread has been filled with examples of how not to
> engineer software. Application code specifically avoiding libraries,
> Not Invented Here, random design with moving target specifications or
> no specifications, ad hoc testing with a dose of 20+ year old unresolved
> office battles, interpersonal rivalry and off topic rants.

> As several have stated, this is not the environment that we are used to.

Yes, but it's important to be prepared to program in some of the many
environments which real programmers often end up having to work in.

To be fair, I've never had a coworker in the same class as Nilges. Not
even particularly close. But I have had to work with arbitrary or
bad specifications, specifications which change repeatedly during
implementation, old office battles, and vehement opposition to things which
were Not Invented Here.

At one point, I was asked to develop a linked list implementation. The
proposed design looked like this:

struct list_node {
    struct list_node *next;
    void *data;
};

struct list {
    struct list_node *head;
    struct list_node *tail;
};

The specification was much as you'd expect. Except for one TINY detail.
Which was that the formal specification was that
(struct list *) (x->tail->next) == x
whenever tail was not null.

That is to say, if the list contained any members, the "next" pointer for
the last member of the list was a pointer (suitably converted) to the list
object.

So iteration would look roughly like:
for (l = x->head; l->next != x; l = l->next) {
    /* ... */
}

It took a day or so of effort for me to round up enough senior developers
to all sit on the guy and tell him that:

1. He was wrong.
2. He was micro-managing, which is presumptively wrong.

before we were allowed to use a more conventional design.
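
For contrast, a minimal sketch of what a conventional design with the same two structs might look like: the last node's next is simply NULL, so iteration needs no cast and no comparison against the list object. The function names list_append and list_count are assumptions, not the code that was actually written.

```c
#include <stddef.h>

struct list_node {
    struct list_node *next;
    void *data;
};

struct list {
    struct list_node *head;
    struct list_node *tail;
};

/* Appends a caller-owned node; the last node's next is always NULL. */
void list_append(struct list *x, struct list_node *node)
{
    node->next = NULL;
    if (x->tail != NULL)
        x->tail->next = node;
    else
        x->head = node;     /* first node in an empty list */
    x->tail = node;
}

/* Walks the list from head until the NULL terminator. */
size_t list_count(const struct list *x)
{
    const struct list_node *l;
    size_t n = 0;

    for (l = x->head; l != NULL; l = l->next)
        ++n;
    return n;
}
```

The loop also behaves sensibly on an empty list, where head is NULL and the body never runs.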

Having had to deal with things like a database in which the formal schema
description begins with "all fields are VARCHAR for simplicity", I found the
Nilges String Replace Challenge to be a surprisingly good approximation of
what programming work is often like in the real world.

(Disclaimer: All the above memories are faded with age. My current
environment is pretty good about this kind of stuff. I have no clue about
the office politics, as our management put a great deal of time and effort
into ensuring that they are Not Our Problem.)

Walter Banks

Feb 17, 2010, 10:08:42 PM

Seebs wrote:

> On 2010-02-17, Walter Banks <wal...@bytecraft.com> wrote:
> > This whole project thread has been filled with examples of how not to
> > engineer software. Application code specifically avoiding libraries,
> > Not Invented Here, random design with moving target specifications or
> > no specifications, ad hoc testing with a dose of 20+ year old unresolved
> > office battles, interpersonal rivalry and off topic rants.
>
> > As several have stated, this is not the environment that we are used to.
>
> Yes, but it's important to be prepared to program in some of the many
> environments which real programmers often end up having to work in.
>
> To be fair, I've never had a coworker in the same class as Nilges. Not
> even particularly close. But I have had to work with arbitrary or
> bad specifications, specifications which change repeatedly during
> implementation, old office battles, and vehement opposition to things which
> were Not Invented Here.

I saw this 20 years ago and it became nothing but a memory as better
more effective approaches prevailed.

> (Disclaimer: All the above memories are faded with age. My current
> environment is pretty good about this kind of stuff. I have no clue about
> the office politics, as our management put a great deal of time and effort
> into ensuring that they are Not Our Problem.)

Software development practices have improved a lot as applications
have become more complex and requirements better defined.

w..

Ben Bacarisse

Feb 18, 2010, 12:02:13 AM

fedora <no_...@invalid.invalid> writes:

> Ben Bacarisse wrote:
>
>> fedora <no_...@invalid.invalid> writes:

<snip>

>>>> fedora <no_...@invalid.invalid> writes:
<snip>
>>>>> char *strSubstr(
>>>>> char *str,
>>>>> char *sub,
>>>>> size_t lstr,
>>>>> size_t lsub) {
>>>>> char *substr = 0;
>>>>> char *anchor = str;
>>>>> size_t remaining_len = (lstr - lsub) + 1;
>>
>> Your problems below come from this + 1, I think. It looks wrong.
>> Sorry I did not spot this first time round.
>
> I added the one because when both string and sub string are equal length the
> diff will give zero, so to make the while loop below work properly, i add
> one.
>
> i cant compare remaining_len >= 0 since that'll always be true for
> unsigned.

I think you are right. The bug is a bit further on... I mixed two
related issues up.

>>>>> assert(str && sub && lstr && lsub && lstr >= lsub);
>>>>> while (remaining_len > 0 && anchor) {
>>>>> if (anchor = strFirstCh(anchor, *sub, remaining_len)) {
>>>>> if (strComp(anchor, sub, lsub) == 0) {
>>>>> substr = anchor;
>>>>> break;
>>>>> }
>>>>> anchor++;
>>>>> remaining_len--;
>>>>> }
>>>>> }
>>>>> return substr;
>>>>> }

The problem is not the + 1, but the fact that you subtract one from
remaining_len every time, even when anchor has jumped by more than
one. Put:

assert(!substr || substr + lsub <= str + lstr);

just before the return and run with "aaxbx" and "xa" and you will see
that the assert will fire because the second time round the loop
anchor points to the 'b' and remaining_len is 3 (when it should be 2).

One solution is to replace the decrement of remaining_len with a new
calculation of it:

remaining_len = str + lstr - anchor - lsub + 1;

If you do that, there is really no need to have it in a variable.
Just pass this value to strFirstCh and loop while anchor is not null
but I suspect that you can come up with a simpler way to write the
whole function if you try.
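
One possible simpler shape, offered as a sketch rather than as the intended answer: drop the running remaining_len and bound the outer loop directly by i + lsub <= lstr, so no comparison can reach past the end of str. The name strSubstr_sketch is an assumption; the parameter order follows the quoted function, and the assert is the one suggested above.

```c
#include <assert.h>
#include <stddef.h>

/* Looks for sub[0..lsub) inside str[0..lstr).
   Returns a pointer to the first match, or NULL if there is none. */
char *strSubstr_sketch(char *str, const char *sub, size_t lstr, size_t lsub)
{
    char *substr = NULL;
    size_t i, j;

    if (lsub == 0 || lsub > lstr)
        return NULL;
    /* i + lsub <= lstr keeps every str[i + j] access inside the string. */
    for (i = 0; i + lsub <= lstr; i++) {
        for (j = 0; j < lsub && str[i + j] == sub[j]; j++)
            ;
        if (j == lsub) {
            substr = str + i;
            break;
        }
    }
    assert(!substr || substr + lsub <= str + lstr);
    return substr;
}
```

With "aaxbx" and "xa" this returns NULL and the assert stays quiet, because the candidate starting at the final 'x' is never examined.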

--
Ben.

Ben Bacarisse

Feb 18, 2010, 12:06:09 AM

fedora <no_...@invalid.invalid> writes:

No, sorry, your program is still wrong. You are looking at a symptom
not a cause. The change alters things so it /seems/ to work but the
bug I pointed out (though I misdescribed it) is still there. This
new test (which I don't think is needed if you correct strSubstr) is
hiding the undefined effect of going outside the array, but the
behaviour is still undefined even though you may not see a crash. See
my other post about that.

lstr "wrapped round" only because strSubstr is returning a pointer too
close to the end of the string. This line:

lstr = (orig_str + orig_lstr) - str;

sets an unsigned lstr to a possibly signed pointer difference. With
the + error in strSubstr, this difference can be -1. If you fix
strSubstr everything is OK again. Of course, I may have missed
another problem, but I am pretty sure about this one!
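
The "wrap round" itself is easy to reproduce in isolation. The sketch below only shows the conversion, assuming the difference has somehow come out as -1; the helper name as_unsigned is hypothetical. Forming an out-of-range pointer in the first place is still the real error.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts a signed pointer difference to an unsigned length.
   The conversion is well defined (reduction modulo SIZE_MAX + 1),
   so -1 becomes the largest possible size_t value. */
size_t as_unsigned(ptrdiff_t diff)
{
    return (size_t)diff;
}
```

A length of SIZE_MAX passes any "lstr >= lsub" style check, which is why the loop kept going instead of stopping.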

--
Ben.

spinoza1111

Feb 18, 2010, 3:09:15 AM

On Feb 18, 3:49 am, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-02-17, Malcolm McLean <malcolm.mcle...@btinternet.com> wrote:
>
> > On Feb 17, 7:16 pm, Seebs <usenet-nos...@seebs.net> wrote:
> >> Unless you're expecting to spend most of your programming career working
> >> for people who have major obsessive problems with using technology in ways
> >> that generally work, because of personal grudges never adequately explained,
> >> I would suggest that perhaps what Nilges accepts or doesn't accept should
> >> not be a component of any technical decision, ever.
> > That's the ad hominem fallacy. It's not a pretentious term for
> > "insult" but a common fallacy, which is to suppose that an argument is
> > wrong because of the person who is making it.
>
> No, it's not an ad hominem fallacy. It's the very well supported view
> that what Nilges accepts or doesn't accept should not be a component of any
> technical decision. I'm not saying that an argument is wrong because of
> the person who is making it. I'm saying that a conclusion should be ignored
> (neither accepted nor rejected) based on the person who has offered it.

...and that's an ad hominem fallacy. If you'd troubled to take a class
in informal logic, you'd have discovered that there is a VALID
argument based on applicable authority, but none based on anti-
authority. Nobody except a Fascist or a child refuses to believe
something because of his hatred of a person.

>
> Which is to say, if you know someone is a clown, knowing his position on an
> issue tells you nothing for or against the issue. Now, if he had advanced
> an argument, it could be worth discussing that argument, but as long as
> we're just talking about his conclusion, it's not an ad hominem fallacy to
> suggest disregarding the conclusions reached by someone who is demonstrably
> very bad at the topic in question.
>

I'd be careful with this, dear boy. You're the one who posted a
"solution" for replacing %s (and bugger all) with something else, that
didn't work, and who posted a strlen with a crude off by one bug. Many
people here (but not me) may well use your demonstrated incompetence
as a reason for ignoring your input on technical matters. I haven't:
most recently, I used the fact that your latest stupidity in the off-
by-one strlen was so glaringly obvious, because you used short
identifiers, to admit that my long and literate identifiers might need
to be reduced.

> > In fact there are good reasons for deprecating string.h.
>
> For some purposes, yes.
>
> For manipulation of sequences of non-NUL chars, terminated by a NUL char, not so
> much.

You haven't responded to Malcolm's concerns at all.

>
> > chars
> > effectively have to be octets, whilst often programs need to accept
> > non-Latin strings.
>
> True. This is addressed in no small part by the multibyte stuff, which you
> would presumably use for multibyte strings.

...as an exception (in other words, a bug in waiting)

>
> > Then the functions are all very old, with certain
> > weaknesses (no protection from buffer overrun in strcpy, an O(N)
> > performance for strcat and strlen, an inconvenient interface for
> > strcat, const inconsistencies with strchr, very poor functionality
> > with strfind and const inconsistencies here too, very serious buffer
> > problems with sprintf, an overly difficult interface and buffer
> > problems with sscanf, thread problems with strtok and a non-intuitive
> > interface).
>
> I'm not aware of "strfind".
>
> While the various interfaces are certainly flawed, consider that the Nilges
> alternative is to duplicate the flaws of the interface without even the
> benefit of already having been debugged. Or, worse, to just not even come
> close.

Incoherent. How do I "duplicate the flaws of the interface?" My
replace merely starts with Nul terminated strings because this is what
I'm given. You're just issuing words out of hatred at this point.

>
> There are certainly cases where the <string.h> functions are not the right
> tool. However, that Nilges argues against it is not an argument either way
> in that. His arguments might be an argument either way; his conclusion is
> not.

Empty words.

There is a techie named Seebach
Who sense and skill doth lack
Who uses words
Like little turds
That incompetent techie named Seebach

His code it had an ugly Bug
Off by one, snug as a rug
It was obvious to all
And the sensitive this bug did appall
It made them say, oh Ugh

But we would like to forgive him
Programming is hard for all men
And when he withdraws his Schildt rant
We will do so, like Immanuel Kant.
>
> -s
> --
> Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
> http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
> http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

spinoza1111

Feb 18, 2010, 3:12:48 AM

You're in denial. In fact, Seebach and Heathfield insist on making
this recreational environment, in which we could in the absence of job
pressures engage in a common search for truth, representative of the
usual sort of development environment. The fact is that most high tech
firms produce low tech software consistently, and they do so because
correct software takes "too much time", and requires for its
developers, free human beings unafraid of being laid off into a savage
state of unemployment when they are the targets of office bullies.

spinoza1111

Feb 18, 2010, 3:18:42 AM

> Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
> http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
> http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Fascists like to tell stories about fantasy people and other people in
hopes that their listeners will then confuse the mythical or alternate
person with their target. This of course is easier than going head to
head, where Peter consistently loses.

Peter has not, to my knowledge, posted a solution to my challenge that
works. The only people to have done so are Willem and io_x. Instead,
he started the ball rolling with code with bugs (%s and bugger all to
bugger all) and has since that time posted more code with bugs, that
uses string.h, along with a marvelously compact example of off by one
bugosity.

So of course, it's time to start telling war stories about his
victories over "incompetent" coworkers.

Instead of developing his own competence (for example, by taking
university courses in comp sci), Peter insists on advancing his career
by trashing good people like Schildt and paying his way onto standards
committees.

He is a most amusing creature.

Walter Banks

Feb 18, 2010, 3:57:02 AM

spinoza1111 wrote:

This project and your comments show just how far out of touch
you are with software technology. It goes back a long time,
to when with pride you were debugging in machine code on a 1401
instead of using tools created to debug software.

NIH and other juvenile behavior at best illustrates the way
not to implement a project and at worst is misleading and intellectually
dishonest.

The broad-based generalizations that you attribute to software
development companies just don't fit the facts.

w..


spinoza1111

Feb 18, 2010, 4:23:39 AM

On Feb 18, 5:39 am, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-02-17, Walter Banks <wal...@bytecraft.com> wrote:
>
> > This whole project thread has been filled with how not to engineer software.
> > Application code specifically avoiding libraries, Not Invented Here,
> > random design with moving target specifications or no specifications,
> > ad hoc testing with a dose of 20+ year old unresolved office battles,
> > interpersonal rivalry and off topic rants.
> > As several have stated, not the environment that we are used to.
>
> Yes, but it's important to be prepared to program in some of the many
> environments which real programmers often end up having to work in.
>
> To be fair, I've never had a coworker in the same class as Nilges. Not

No, I don't think you have.

> even particularly close. But I have had to work with arbitrary or
> bad specifications, specifications which change repeatedly during
> implementation, old office battles, and vehement opposition to things which
> were Not Invented Here.

You've memorized the mere names of thought crimes ("not invented
here") without learning your trade; as you have told us, you haven't
taken any university computer science classes at all.

>
> At one point, I was asked to develop a linked list implementation. The
> proposed design looked like this:
>
> struct list_node {
> struct list_node *next;
> void *data;

This is as we know a mistake. You're retailing a story, in the
Fascist's way, in hopes that your auditors will confuse the story with
me. William Butler Yeats noticed that both sides in the Irish
"troubles" preferred to read their own biased newspapers and were
uninterested in dialog with the other side, and in the case of the
Republicans further split when Michael Collins tried to establish the
Irish free state:

The bees build in the crevices
Of loosening masonry, and there
The mother birds bring grubs and flies.
My wall is loosening; honey-bees,
Come build in the empty house of the state.

We are closed in, and the key is turned
On our uncertainty; somewhere
A man is killed, or a house burned,
Yet no clear fact to be discerned:
Come build in the empty house of the stare. (Yeats: The Stare's Nest by
My Window, to be continued)

> };
>
> struct list {
> struct list_node *head;
> struct list_node *tail;
> };
>
> The specification was much as you'd expect. Except for one TINY detail.
> Which was that the formal specification was that
> (struct list *) (x->tail->next) == x
> whenever tail was not null.
>
> That is to say, if the list contained any members, the "next" pointer for
> the last member of the list was a pointer (suitably converted) to the list
> object.
>
> So iteration would look roughly like:
> for (l = x->head; l->next != x; l = l->next) {
> /* ... */
> }
>
> It took a day or so of effort for me to round up enough senior developers
> to all sit on the guy and tell him that:
>
> 1. He was wrong.
> 2. He was micro-managing, which is presumptively wrong.
>
> before we were allowed to use a more conventional design.
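
For anyone who wants to try the design described in the quoted post, here is a
minimal C sketch of it. The helper names list_append and list_count are
hypothetical (they appear nowhere in the thread). The loop here compares the
node pointer itself against the list object, rather than l->next as in the
quoted "roughly" loop, so that the last node is visited as well.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
    void *data;
};

struct list {
    struct list_node *head;
    struct list_node *tail;
};

/* Hypothetical helper: append n while keeping the quoted invariant,
 * (struct list *)(x->tail->next) == x whenever tail is not null. */
static void list_append(struct list *x, struct list_node *n)
{
    assert(x != NULL && n != NULL);
    n->next = (struct list_node *)x;   /* last node points back at the list object */
    if (x->tail != NULL)
        x->tail->next = n;
    else
        x->head = n;
    x->tail = n;
}

/* Hypothetical helper: count nodes by walking until the pointer that
 * refers back to the list object is reached. The converted pointer is
 * only compared, never dereferenced. */
static int list_count(const struct list *x)
{
    const struct list_node *l;
    int count = 0;

    if (x->head == NULL)
        return 0;
    for (l = x->head; l != (const struct list_node *)x; l = l->next)
        count++;
    return count;
}
```

The cast is what makes every traversal depend on knowing which list a node
belongs to, which a NULL-terminated list does not require.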

OK, he specified a circular singly-linked list, and (because by your
own admission you have never taken a single computer science class)
you had never seen this simple data structure, and you wasted an
entire day trying to figure this code out in consequence. For the same
reason you preferred to try to destroy the career of Herb Schildt and you
hound me here, this enraged you and you maliciously went after the
original designer in order to ruin his position, because you didn't
know how to code a simple loop (we've seen how incompetent you are at
this task in your strlen that was off by one).

if (x->head == NULL) return;
struct list_node *p = x->head;
struct list_node *p2 = p; /* remember the starting node */
do { /* ... */ p = p->next; } while (p != p2);
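
The do/while traversal above can be sketched as a complete C function. This
assumes a true ring, in which the last node's next pointer refers to the first
node; the struct cnode type and the ring_sum function are hypothetical names
used only for illustration. Note that the design in the quoted post instead
points tail->next at the struct list object, so a loop that stops on reaching
the head node again would also pass through that list object.

```c
#include <stddef.h>

/* Hypothetical node type for a circular singly-linked list. */
struct cnode {
    struct cnode *next;   /* last node points back to the first node */
    int value;
};

/* Visit every node exactly once, starting at head, and sum the values.
 * An empty ring is represented by head == NULL. */
static long ring_sum(const struct cnode *head)
{
    const struct cnode *p = head;
    long total = 0;

    if (head == NULL)
        return 0;
    do {
        total += p->value;
        p = p->next;
    } while (p != head);   /* stop once we are back at the start */
    return total;
}
```

The do/while form matters here: a plain while (p != head) test would exit
before visiting any node, because the walk begins at head.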

You may in fact know more about C than I do in the infantile rote
register, but you can't program, and instead of learning your trade,
you try to destroy reputations. You're a Fascist and an incompetent.
Oh honey-bees, come and build in the empty house of the stare: oh
honey-bees, come and build in the empty house of the state.

>
> Having had to deal with things like a database in which the formal schema
> description begins with "all fields are VARCHAR for simplicity", I found the
> Nilges String Replace Challenge to be a surprisingly good approximation of
> what programming work is often like in the real world.

You can't deal. Like your hero George Bush (whom you supported in
2000) you're incurious and you jump to conclusions about concepts and
people all too readily, because you're not qualified for the job
you're in and you don't want to learn anything except slogans.

A barricade of stone or of wood;
Some fourteen days of civil war;
Last night they trundled down the road
That dead young soldier in his blood:
Come build in the empty house of the stare.

We had fed the heart on fantasies,
The heart's grown brutal from the fare;
More substance in our enmities
Than in our love; O honey-bees,
Come build in the empty house of the stare. (Yeats)