
SAFEcode


G G

Apr 14, 2015, 12:19:49 AM
from:

http://safecode.cs.illinois.edu/docs/UsersGuide.html

 1 #include "stdio.h"
 2 #include "stdlib.h"
 3
 4 int
 5 foo (char * bar) {
 6   for (unsigned index = 0; index < 10; ++index)
 7     bar[index] = 'a';
 8   return 0;
 9 }
10
11 int
12 main (int argc, char ** argv) {
13   char * array[100];
14   int max = atoi (argv[1]);
15
16   for (int index = max; index >= 0; --index) {
17     array[index] = malloc (index+1);
18   }
19
20   for (int index = max; index >= 0; --index) {
21     foo (array[index]);
22   }
23
24   exit (0);
25 }


reformatted:

#include "stdio.h"
#include "stdlib.h"

int foo (char * bar)
{
    for ( unsigned index = 0; index < 10; ++index )
        bar[ index ] = 'a';

    return 0;
}

int main( int argc, char **argv )
{
    char * array[ 100 ];

    int max = atoi( argv[ 1 ] );

    for ( int index = max; index >= 0; --index )
    {
        array[ index ] = malloc( index + 1 );
    }

    for ( int index = max; index >= 0; --index )
    {
        foo( array[ index ] );
    }

    exit (0);
}


the author explains how SAFECode alerts one to the problem of accessing
memory it should not.

uses ./test 10, so i guess they compiled the program as test. then ran it.

as they explain, "Lines 16-18 allocate character arrays of decreasing size,
starting with the argument plus one specified by the user down to an array
of one character. Lines 20-22 then call the function foo() which accesses
elements 0-9 of the array."

please explain what happens when foo is called. i mean, at what point is
it accessing space outside its range?

and ah, why "stdio.h", "stdlib.h" as opposed to <stdio.h>, <stdlib.h>?
is it that they have rewritten those header files and want to use
their own, or is it related to where they placed their headers?

thanks everyone,

g.




Ian Collins

Apr 14, 2015, 1:12:40 AM
<snip>

> the author explains how SAFECode alerts one to the problem of accessing
> memory it should not.
>
> uses ./test 10, so i guess they compiled the program as test. then ran it.
>
> as they explain, "Lines 16-18 allocate character arrays of decreasing size,
> starting with the argument plus one specified by the user down to an array
> of one character. Lines 20-22 then call the function foo() which accesses
> elements 0-9 of the array."
>
> please explain what happens when foo is called. i mean, at what point is
> it accessing space outside its range?

None, the behaviour is undefined.

> and ah, why "stdio.h", "stdlib.h" as opposed to <stdio.h>, <stdlib.h>?
> is it that they have rewritten those header files and want to use
> their own, or is it related to where they placed their headers?

Bad practice. Same with using atoi().

--
Ian Collins

Ian Collins

Apr 14, 2015, 1:19:32 AM
Ian Collins wrote:
> G G wrote:
>>
>> please explain what happens when foo is called. i mean, at what point is
>> it accessing space outside its range?
>
> None, the behaviour is undefined.

Ah, did you actually read the full page?

--
Ian Collins

Morris Dovey

Apr 14, 2015, 1:24:43 AM
On 4/13/15 11:19 PM, G G wrote:
> please explain what happens when foo is called. i mean, at what point is
> it accessing space outside its range?

When foo() is called, it stores 'a' into ten bytes pointed to by bar.

If bar points to an allocation smaller than ten bytes then foo() will
still attempt to store ten bytes, and storing the final byte(s) will
access memory beyond the allocation.

The out of range access will happen whenever foo(array[index]) is called
with index < 9.

> and ah, why "stdio.h", "stdlib.h" as opposed to <stdio.h>, <stdlib.h>?
> is it that they have rewritten those header files and want to use
> their own, or is it related to where they placed their headers?

Your guess is as good as any. Ask the prof.

--
Morris Dovey
http://www.iedu.com/Solar

asetof...@gmail.com

Apr 14, 2015, 1:30:35 AM
For me, many very ugly errors would happen at the line containing
"bar[ index ] = 'a';".
As far as I know, <stdio.h> searches for the stdio.h file in the
system directory that comes with the C compiler, whereas "stdio.h"
searches for stdio.h first in the directory of the program being
compiled. Possibly they assign "file.h" to some other system
directory that is not the compiler's one.

G G

Apr 14, 2015, 2:45:17 AM
On Tuesday, April 14, 2015 at 1:24:43 AM UTC-4, Morris Dovey wrote:
> On 4/13/15 11:19 PM, G G wrote:

> When foo() is called, it stores 'a' into ten bytes pointed to by bar.
>
> If bar points to an allocation smaller than ten bytes then foo() will
> still attempt to store ten bytes, and storing the final byte(s) will
> access memory beyond the allocation.
>
> The out of range access will happen whenever foo(array[index]) is called
> with index < 9.

ah, ok, thanks.

Robbie Hatley

Apr 14, 2015, 3:14:30 AM

On 4/13/2015 9:19 PM, G G wrote:

> from:
>
> http://safecode.cs.illinois.edu/docs/UsersGuide.html

From that site:

"The SAFECode compiler is a memory safety compiler built using the
LLVM Compiler Infrastructure and the Clang compiler driver.
A memory safety compiler is a compiler that inserts run-time checks
into a program during compilation to catch memory safety errors at
run-time. Such errors can include buffer overflows, invalid frees,
and dangling pointer dereferences."

Given the code below, I'd say such a compiler would be needed!!!
It's anything *but* "safe".

#include "stdio.h"
#include "stdlib.h"

int foo (char * bar)
{
    for ( unsigned index = 0; index < 10; ++index )
        bar[ index ] = 'a';

    return 0;
}

int main( int argc, char **argv )
{
    char * array[ 100 ];

    int max = atoi( argv[ 1 ] );

    for ( int index = max; index >= 0; --index )
    {
        array[ index ] = malloc( index + 1 );
    }

    for ( int index = max; index >= 0; --index )
    {
        foo( array[ index ] );
    }

    exit (0);
}

Though I dare say a good compiler, with warnings turned on,
should give some warnings at compile-time. Let's find out.


Aragorn@Ketch
/rhe/src/test
%make memory-violation-test.exe
Using pattern rule %.exe:%.c to compile memory-violation-test.c
to memory-violation-test.exe:
gcc -I /rhe/include -Wall -std=c11 -Os -s memory-violation-test.c
-L/rhe/lib -lrh -lfl -ly -lm -o /rhe/bin/test/memory-violation-test.exe
memory-violation-test.exe is up to date.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aragorn@Ketch
/rhe/src/test
%


Huh. No errors, no warnings. Probably because the character arrays
are allocated at run-time.


OK, how about run-time errors? I'm almost afraid to run this.


Aragorn@Ketch
/rhe/src/test
%memory-violation-test 20

Aragorn@Ketch
/rhe/src/test
%


Great, I just wrote a bunch of copies of the letter 'a' to
some unknown locations on my computer, with no run-time
errors. I feel raped. I think I'll restart my OS now.



--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
http://www.well.com/user/lonewolf/
https://www.facebook.com/robbie.hatley

David Brown

Apr 14, 2015, 4:33:22 AM
It is common practice for the library to ask the OS for large lumps of
memory, and then malloc divides it up locally (to save using a system
call for every little malloc). So when your program first asks for 20
bytes, the library is probably given a few hundred KB from the OS - your
program can stomp all over that without causing problems for anything
else. And of course you should get a seg fault rather than cause damage
when you go out of those bounds.


#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char * p = malloc(1);

    for (int i = 0; i < 30; i++) {
        int x = 1 << i;
        printf("Trying index %d\r\n", x);
        p[x] = 123;
    }
    return 0;
}

Trying index 1
Trying index 2
Trying index 4
Trying index 8
Trying index 16
Trying index 32
Trying index 64
Trying index 128
Trying index 256
Trying index 512
Trying index 1024
Trying index 2048
Trying index 4096
Trying index 8192
Trying index 16384
Trying index 32768
Trying index 65536
Trying index 131072
Trying index 262144
Segmentation fault (core dumped)

(64-bit Linux)



Malcolm McLean

Apr 14, 2015, 5:01:44 AM
When searching, it's best to use the Golden Ratio (1.618). We know that x is greater
than our test value, tx. But whilst we've got some idea of the process that generates x, we don't
know whether it is arithmetical or geometrical - x could be growing 1, 2, 3 ... or 1, 2, 4 ...
(however, it's not something so stupidly huge that we'll never get to x in a month of
Sundays by searching logarithmically).
Phi is the compromise between exponential and arithmetical growth.

Bartc

Apr 14, 2015, 5:23:24 AM
In other words, you've finally found a use for a Fibonacci sequence.

Instead of x = 1,2,3,4,5... or 1,2,4,8,16,..., you do (1),1,2,3,5,8,....

--
Bartc

David Brown

Apr 14, 2015, 5:50:09 AM
Using the golden ratio here is exponential growth - it is not a
"compromise between exponential and arithmetical growth".

You could argue that it is a convenient way to get an exponential growth
that is lower than powers of 2 while still being possible to calculate
with easy integer arithmetic. And while there are some algorithms that
have best amortized performance when using Fibonacci numbers (such as
the Fibonacci heap for priority queues - or perhaps when choosing the
size of lumps for malloc to request from the OS memory manager), it is a
very big stretch to generalise that to saying the Golden Ratio is /the/
best choice for searching.

In this case, powers of two is clearly the best choice for searching
since we are merely looking for an indication of the scale of the value,
using a simple and quick program.


Malcolm McLean

Apr 14, 2015, 7:31:17 AM
On Tuesday, April 14, 2015 at 10:50:09 AM UTC+1, David Brown wrote:
>
> You could argue that it is a convenient way to get an exponential growth
> that is lower than powers of 2 while still being possible to calculate
> with easy integer arithmetic. And while there are some algorithms that
> have best amortized performance when using Fibonacci numbers (such as
> the Fibonacci heap for priority queues - or perhaps when choosing the
> size of lumps for malloc to request from the OS memory manager), it is a
> very big stretch to generalise that to saying the Golden Ratio is /the/
> best choice for searching.
>
It's the accepted way of searching for a minimum along a continuous function of
a continuous variable, when you can't easily characterise the function other
than to say x is not so large that exponential growth will never reach it in reasonable
time.
>
> In this case, powers of two is clearly the best choice for searching
> since we are merely looking for an indication of the scale of the value,
> using a simple and quick program.
>
Actually powers of two are about the worst choice, if you've reason to believe that
the answer might be either exactly a power of two or one minus a power of two.
It's also a bad choice for growing a buffer - which is a very similar problem -
largely because the growth is too aggressive. You can grow by length + length/2,
which isn't quite the golden ratio but close enough to stretch a point.
I think the golden ratio has some deeper mathematical properties which make it
a good choice, but I've got to confess that I don't know them off pat.

Ben Bacarisse

Apr 14, 2015, 8:03:52 AM
Robbie Hatley <see.m...@for.my.address> writes:

> On 4/13/2015 9:19 PM, G G wrote:
>
>> from:
>>
>> http://safecode.cs.illinois.edu/docs/UsersGuide.html
<snip>
> #include "stdio.h"
> #include "stdlib.h"
>
> int foo (char * bar)
> {
>     for ( unsigned index = 0; index < 10; ++index )
>         bar[ index ] = 'a';
>
>     return 0;
> }
>
> int main( int argc, char **argv )
> {
>     char * array[ 100 ];
>
>     int max = atoi( argv[ 1 ] );
>
>     for ( int index = max; index >= 0; --index )
>     {
>         array[ index ] = malloc( index + 1 );
>     }
>
>     for ( int index = max; index >= 0; --index )
>     {
>         foo( array[ index ] );
>     }
>
>     exit (0);
> }
<snip>
> OK, how about run-time errors? I'm almost afraid to run this.
>
> Aragorn@Ketch
> /rhe/src/test
> %memory-violation-test 20
<snip>
> Great, I just wrote a bunch of copies of the letter 'a' to
> some unknown locations on my computer, with no run-time
> errors. I feel raped. I think I'll restart my OS now.

That's a little strong! Unless your OS is some throw-back to the 70s,
the program wrote inappropriate bytes to known locations in the virtual
memory space allocated to it.

BTW, valgrind is good for such things -- I always run programs I've
written under valgrind (when it's available):

$ valgrind ./mtest 10
==9831== Memcheck, a memory error detector
==9831== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==9831== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==9831== Command: ./mtest 10
==9831==
==9831== Invalid write of size 1
==9831== at 0x4005E1: foo (mtest.c:7)
==9831== by 0x4006A1: main (mtest.c:25)
==9831== Address 0x51fb0e9 is 0 bytes after a block of size 9 alloc'd
==9831== at 0x4C2ABA0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9831== by 0x400658: main (mtest.c:20)

<snip some more diagnostics>

--
Ben.

David Brown

Apr 14, 2015, 8:54:52 AM
On 14/04/15 13:31, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 10:50:09 AM UTC+1, David Brown wrote:

And I assume you agree that you were completely wrong about Fibonacci
numbers being anything other than exponential growth. It would be nice
if you didn't keep snipping your mistakes - admitting that you are wrong
(or providing justification for why you are not) is common courtesy.

>>
>> You could argue that it is a convenient way to get an exponential growth
>> that is lower than powers of 2 while still being possible to calculate
>> with easy integer arithmetic. And while there are some algorithms that
>> have best amortized performance when using Fibonacci numbers (such as
>> the Fibonacci heap for priority queues - or perhaps when choosing the
>> size of lumps for malloc to request from the OS memory manager), it is a
>> very big stretch to generalise that to saying the Golden Ratio is /the/
>> best choice for searching.
>>
> It's the accepted way of searching for a minimum along a continuous function of
> a continuous variable, when you can't easily characterise the function other
> than to say x is not so large that exponential growth will never reach it in reasonable
> time.

It is not /the/ accepted way - merely /an/ accepted way. For some types
of searching problems, it is useful (or even theoretically optimal).
But that does not mean it is always the best way, no matter what your
definition of "best" happens to be.


>>
>> In this case, powers of two is clearly the best choice for searching
>> since we are merely looking for an indication of the scale of the value,
>> using a simple and quick program.
>>
> Actually powers of two are about the worst choice, if you've reason to believe that
> the answer might be either exactly a power of two or one minus a power of two.

If I had been trying to find an exact value, then I agree that a power
of two would have been a poor choice - at least, when used like this.
But that was not the aim of the code. Re-read my paragraph above, and
perhaps you will understand.

(If I had wanted an exact value, I would have looked it up online in the
glibc documentation.)


> It's also a bad choice for growing a buffer - which is a very similar problem -
> largely because the growth is too aggressive. You can grow by length + length/2,
> which isn't quite the golden ratio but close enough to stretch a point.

There are situations when an exponential growth factor of 2 is too
aggressive - and situations when it is not aggressive enough, and a
factor of 10 might be a better choice. For small values, factors of 1.5
or sqrt(2) are also popular.

> I think the golden ratio has some deeper mathematical properties which make it
> a good choice, but I've got to confess that I don't know them off pat.
>

For some types of uses, it has theoretically optimal amortized
behaviour. But that is only for some uses - and its use must be
balanced against additional complexity compared to choices that may be
simpler to implement (such as growing by 1.5) or may have additional
efficiencies (it may be more efficient to allocate memory in powers of 2
rather than powers of phi).

This is why it is wrong to say powers of phi is /the/ best choice or
/the/ accepted value - the best way to grow a value will depend on the
circumstances and the requirements.


Malcolm McLean

Apr 14, 2015, 9:39:16 AM
On Tuesday, April 14, 2015 at 1:54:52 PM UTC+1, David Brown wrote:
>
> And I assume you agree that you were completely wrong about Fibonacci
> numbers being anything other than exponential growth. It would be nice
> if you didn't keep snipping your mistakes - admitting that you are wrong
> (or providing justification for why you are not) is common courtesy.
>
The base exponential function is e^x. The base arithmetical function is of course
x + 1. The series phi^-1, 1, phi, phi^2 is a kind of midpoint between the two
series.
Say we want to find x for which f(x) is zero. We know that x is N steps away
from unity, and we know that N is less than, say, 100. But we don't know if
the function is x - N or ln(x) - N. We can't plot it, all we can do is call f(x) and see
if it is positive or negative.
So what's our optimal strategy?

(I'm sure it's a golden section search, I think it is on the basis of reading, but
I haven't actually tried to prove it from first principles).
>
> It is not /the/ accepted way - merely /an/ accepted way. For some types
> of searching problems, it is useful (or even theoretically optimal).
> But that does not mean it is always the best way, no matter what your
> definition of "best" happens to be.
>
> This is why it is wrong to say powers of phi is /the/ best choice or
> /the/ accepted value - the best way to grow a value will depend on the
> circumstances and the requirements.
>
it won't always be the best way. For example if f(x) is tetration then a golden section
search won't be optimal. But we live a world of three levels of recursion.

David Brown

Apr 14, 2015, 10:10:29 AM
On 14/04/15 15:39, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 1:54:52 PM UTC+1, David Brown wrote:
>>
>> And I assume you agree that you were completely wrong about Fibonacci
>> numbers being anything other than exponential growth. It would be nice
>> if you didn't keep snipping your mistakes - admitting that you are wrong
>> (or providing justification for why you are not) is common courtesy.
>>
> The base exponential function is e^x. The base arithmetical function is of course
> x + 1. The series phi^-1, 1, phi, phi^2 is a kind of midpoint between the two
> series.

No.

The series 1, phi, phi^2, etc., is exponential. The base (phi in this
case) gives the speed of growth, but it is exponential growth regardless
of the base.

phi^x is just e^(ln(phi) * x) . The base is just a scale factor in the
x direction.


> Say we want to find x for which f(x) is zero. We know that x is N steps away
> from unity, and we know that N is less than, say, 100. But we don't know if
> the function is x - N or ln(x) - N. We can't plot it, all we can do is call f(x) and see
> if it is positive or negative.
> So what's our optimal strategy?

If you know nothing about f(x), then a linear search is optimal - there
is no reason for expecting to find the 0 any faster with any other
method (it is like searching an unsorted list).

If you know f(x) is monotone, then a binary search is optimal (like
searching a sorted list).

>
> (I'm sure sure it's a golden section search, I think it is on the basis of reading, but
> I haven't actually tried to prove it from first principles).

The golden section search is something different, and can be useful for
finding the extrema of certain types of function. It does not work by
using powers of phi.

I recommend reading the Wikipedia article, to see how it does not apply
here.

Bartc

Apr 14, 2015, 10:24:24 AM
On 14/04/2015 14:39, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 1:54:52 PM UTC+1, David Brown wrote:
>>
>> And I assume you agree that you were completely wrong about Fibonacci
>> numbers being anything other than exponential growth. It would be nice
>> if you didn't keep snipping your mistakes - admitting that you are wrong
>> (or providing justification for why you are not) is common courtesy.
>>
> The base exponential function is e^x. The base arithmetical function is of course
> x + 1. The series phi^-1, 1, phi, phi^2 is a kind of midpoint between the two
> series.
> Say we want to find x for which f(x) is zero. We know that x is N steps away
> from unity, and we know that N is less than, say, 100. But we don't know if
> the function is x - N or ln(x) - N. We can't plot it, all we can do is call f(x) and see
> if it is positive or negative.
> So what's our optimal strategy?

I'm not sure if there is one.

What's the best approach here, assuming we don't know the contents of
the function:

int f(long long int x){
    return x!=6839003422641637678;
}

--
Bartc

Malcolm McLean

Apr 14, 2015, 11:05:00 AM
On Tuesday, April 14, 2015 at 3:24:24 PM UTC+1, Bart wrote:
>
> I'm not sure if there is one.
>
> What's the best approach here, assuming we don't know the contents of
> the function:
>
> int f(long long int x){
> return x!=6839003422641637678;
> }
>
You can't minimise a golf course function. (That's one with a flat area and a tiny
minimum). You have to make assumptions that the function will be pointing
towards the answer, or will be positive one side and negative the other if you
want zero rather than the minimum.
The assumption is that the mathematical function reflects some sort of real
underlying physical process which you don't know much about, but you don't
know absolutely nothing - you know that the answers will be clustered towards
zero, you know that the process won't be wildly super-exponential.

A real example, which I did for my PhD, is an energy surface in protein conformation
space. The bonds rotate. Some conformations, such as those where the atoms overlap
each other, are so entropically unfavourable that for all practical purposes we can
say they never occur. But others involve milder forces, such as van der Waals interactions
that tend to draw particles together - so as you rotate two bonds, you'll see an energy
surface in the two angles. Even with two bonds it's going to be too difficult to model
it analytically, but you can sample it and find the minimum.

Malcolm McLean

Apr 14, 2015, 11:11:50 AM
On Tuesday, April 14, 2015 at 3:10:29 PM UTC+1, David Brown wrote:
>
> If you know f(x) is monotone, then a binary search is optimal (like
> searching a sorted list).
>
That's only the case if you know the maximum value, and you've reason to believe
that the probability distribution for the answer is uniform. Which is the case if you
know the target is an entry in a sorted list. But that's information you've supplied.

In my problem (I'm not sure I've set it up rigorously, but it basically reflects real
situations where we know f(x) represents some biological process, but we don't
really know anything else about it), we know that our answer has a probability
density function which is somewhere between uniform and
an exponential decay in e^x.

Keith Thompson

Apr 14, 2015, 11:33:41 AM
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Tuesday, April 14, 2015 at 1:54:52 PM UTC+1, David Brown wrote:
>> And I assume you agree that you were completely wrong about Fibonacci
>> numbers being anything other than exponential growth. It would be nice
>> if you didn't keep snipping your mistakes - admitting that you are wrong
>> (or providing justification for why you are not) is common courtesy.
>>
> The base exponential function is e^x. The base arithmetical function
> is of course x + 1. The series phi^-1, 1, phi, phi^2 is a kind of
> midpoint between the two series.

The series x, x+2, x+4, x+6, x+8, ... is also a kind of "midpoint"
between the two series.

> Say we want to find x for which f(x) is zero. We know that x is N
> steps away from unity, and we know than N is less than, say, 100. But
> we don't know if the function is x - N or ln(x) - N. We can't plot it,
> all we can do is call f(x) and see if it is positive or negative. So
> what's our optimal strategy?

So we know that f(x) is a function, that f(x) == 0 for some x in the
range -99 .. +99, and we want to find such an x. Since you say we
"can't plot it", I presume that either the parameter is real rather than
integer or that calling the function is quite expensive; otherwise I'd
just call the function up to 199 times and be done with it.

You're asking about an "optimal strategy" over the set of all functions.
You've given no other information about how this function behaves, so
knowing the values of f(40) and f(42) tells us nothing about the value
of f(41). With the given information, there is no strategy better than
an exhaustive search.

Provide some more information about any restrictions on the function and
*maybe* we can reasonably discuss an optimal strategy.

> (I'm sure it's a golden section search, I think it is on the
> basis of reading, but I haven't actually tried to prove it from first
> principles).

Citation?

[...]

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Malcolm McLean

Apr 14, 2015, 11:55:00 AM
On Tuesday, April 14, 2015 at 4:33:41 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
> So we know that f(x) is a function, that f(x) == 0 for some x in the
> range -99 .. +99, and we want to find such an x. Since you say we
> "can't plot it", I presume that either the parameter is real rather than
> integer or that calling the function is quite expensive; otherwise I'd
> just call the function up to 199 times and be done with it.
>
Exactly.
F(x) is either


int f(double x)
{
    return sign( x - n );
}

or

int f(double x)
{
    return sign(ln(x) - n);
}

We know that n isn't huge, in fact I'll tell you that it's between zero and 100.

So the task is to devise a search strategy that homes in on x (to machine
precision), with the fewest possible calls to f(), to make f return 0.
>
> Provide some more information about any restrictions on the function and
> *maybe* we can reasonably discuss an optimal strategy.
>
I think I've set it up right.
I've called it n rather than k because n will represent the number of "steps"
in some physical process. That's basically what we're after. However it's a real.

There might be a hole in it that means a trivial test will tell us which function
we're dealing with.

Ben Bacarisse

Apr 14, 2015, 1:48:57 PM
Malcolm McLean <malcolm...@btinternet.com> writes:

> On Tuesday, April 14, 2015 at 3:24:24 PM UTC+1, Bart wrote:
>>
>> I'm not sure if there is one.
>>
>> What's the best approach here, assuming we don't know the contents of
>> the function:
>>
>> int f(long long int x){
>>     return x!=6839003422641637678;
>> }
>>
> You can't minimise a golf course function. (That's one with a flat area and a tiny
> minimum).

Clearly one can. However, as in so many of your threads, your words are
no doubt to be interpreted with all the extra words added in to make
what you are saying something correct. We may never know the full set
of constraints you are assuming for f which make your claims (which are
flat-out wrong on the face of it) right in your mind.

<snip>
--
Ben.

Keith Thompson

Apr 14, 2015, 2:43:38 PM
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Tuesday, April 14, 2015 at 4:33:41 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> So we know that f(x) is a function, that f(x) == 0 for some x in the
>> range -99 .. +99, and we want to find such an x. Since you say we
>> "can't plot it", I presume that either the parameter is real rather than
>> integer or that calling the function is quite expensive; otherwise I'd
>> just call the function up to 199 times and be done with it.
>>
> Exactly.
> F(x) is either
>
>
> int f(double x)
> {
>     return sign( x - n );
> }
>
> or
>
> int f(double x)
> {
>     return sign(ln(x) - n);
> }
>
> We know that n isn't huge, in fact I'll tell you that it's between zero and 100.

The "optimal strategy" for determining the argument value for which a
function returns 0 (or perhaps "close enough" to 0) depends critically
on what information we have about the function. Is it continuous
(subject to the fact that floating-point representations are discrete)?
Is it monotone? Is its first derivative continuous?

> So the task is to devise a search strategy that homes in on x (to machine
> precision), with the fewest possible calls to f(), to make f return 0.
>>
>> Provide some more information about any restrictions on the function and
>> *maybe* we can reasonably discuss an optimal strategy.
>>
> I think I've set it up right.
> I've called it n rather than k because n will represent the number of "steps"
> in some physical process. That's basically what we're after. However it's a real.
>
> There might be a hole in it that means a trivial test will tell us which function
> we're dealing with.

I don't think so. Both functions return -1 for x < foo, 0 for x == foo,
1 for x > foo (assuming that's what you mean by "sign()"), where "foo"
is either n or exp(n). Applying sign() destroys the information we
could have used to distinguish between the two versions.

The big problem is that ln(x) is undefined for x <= 0.0. A secondary
problem is that there might not be a value of x such that ln(x) - n == 0
(mathematically there is, but floating-point is not so well behaved).

Ignoring those problems (assuming a similar function that's actually
defined over -99..+99), and given information about the probability
distribution of the value of n and of the choice of which of the two
functions is being used, I have little doubt that an optimal strategy
could be devised, probably some kind of biased binary search. I'm
skeptical that it would be worth the effort over a simple unbiased
binary search, but perhaps in some applications it would be.

Malcolm McLean

Apr 14, 2015, 3:41:22 PM
On Tuesday, April 14, 2015 at 7:43:38 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
>The "optimal strategy" for determining the argument value for which a
>function returns 0 (or perhaps "close enough" to 0) depends critically
>on what information we have about the function. Is it continuous
>(subject to the fact that floating-point representations are discrete)?
>Is it monotone? Is its first derivative continuous?
>
The real issue is that you don't know much about the function;
you're probing it. If it's pathological, such as a golf course function,
then there's nothing you can do. But you know that it represents a physical
process, so that's unlikely - not impossible, but there probably
isn't a demon setting things up so you can't work out what is
going on.
It's not continuous: it consists of three sections, one which returns
-1, one which returns 1, and a very small one (limited by machine precision)
which returns 0. You know the machine precision, so if it so happens
that one value returns -1 and the next representable value returns 1,
then you've bracketed zero.
That's a bit unrealistic. The idea that you can't take derivatives isn't
unrealistic at all.
>
> I don't think so. Both functions return -1 for x < foo, 0 for x == foo,
> 1 for x > foo (assuming that's what you mean by "sign()"), where "foo"
> is either n or exp(n). Applying sign() destroys the information we
> could have used to distinguish between the two versions.
>
Exactly, that's the problem. You know whether you're under or over,
but you don't know the gradient.
>
> The big problem is that ln(x) is undefined for x <= 0.0. A secondary
> problem is that there might not be a value of x such that ln(x) - n == 0
> (mathematically there is, but floating-point is not so well behaved).
>
That's not a big problem, since I've told you that n is under 100
and all runs have produced values over 1, and n represents the
number of steps in a physical process, with a bit of fuzziness to
make the problem continuous.
Clearly we can't pass x < 0.0. We've got one limit, but not the other
one.
>
> Ignoring those problems (assuming a similar function that's actually
> defined over -99..+99), and given information about the probability
> distribution of the value of n and of the choice of which of the two
> functions is being used, I have little doubt that an optimal strategy
> could be devised, probably some kind of biased binary search. I'm
> skeptical that it would be worth the effort over a simple unbiased
> binary search, but perhaps in some applications it would be.
>
It is, yes. Obviously for the problem as I have defined it you can
get the result in under just noticeable time on a modern machine,
so it wouldn't really be worth the effort. But in reality f(x)
might represent a physical measurement and be quite costly.
Another common situation is that you spend several days of processor
time minimising a function that takes several parameters.

Malcolm McLean

unread,
Apr 14, 2015, 3:43:34 PM4/14/15
to
The meaning is defined by the natural language of the words.

Johannes Bauer

unread,
Apr 14, 2015, 3:52:31 PM4/14/15
to
On 14.04.2015 21:43, Malcolm McLean wrote:

>>>> int f(long long int x){
>>>> return x!=6839003422641637678;
>>>> }
>>>>
>>> You can't minimise a golf course function. (That's one with a flat area and a tiny
>>> minimum).
>>
>> Clearly one can. However, as in so many of your threads, your words are
>> no doubt to be interpreted with all the extra words added in to make
>> what you are saying something correct.
>
> The meaning is defined by the natural language of the words.

So then can I assume that by you "can't minimise" a function you mean
that I can't write this:

void minimize_unminimizable(void) {
for (long long int i = 0; i < LLONG_MAX; i++) {
if (f(i) == 0) {
printf("Minimum: %lld\n", i);
}
}
}

That would be a very peculiar natural language meaning of the word
"can't" because for something that I "can't" do I sure as hell just did.

Cheers,
Johannes

--
>> Wo hattest Du das Beben nochmal GENAU vorhergesagt?
> Zumindest nicht öffentlich!
Ah, der neueste und bis heute genialste Streich unsere großen
Kosmologen: Die Geheim-Vorhersage.
- Karl Kaos über Rüdiger Thomas in dsa <hidbv3$om2$1...@speranza.aioe.org>

Richard Heathfield

unread,
Apr 14, 2015, 4:02:19 PM4/14/15
to
On 14/04/15 20:43, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 6:48:57 PM UTC+1, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
<snip>
>> >>
>> > You can't minimise a golf course function. (That's one with a flat area and a tiny
>> > minimum).
>>
>> Clearly one can. However, as in so many of your threads, your words are
>> no doubt to be interpreted with all the extra words added in to make
>> what you are saying something correct. We may never know the full set
>> of constraints you are assuming for f which make your claims (which are
>> flat-out wrong on the face of it) right in your mind.
>>
> The meaning is defined by the natural language of the words.


The principal function of a golf course is to determine how few strokes
can be used to sink the ball into every one of eighteen holes.
Therefore, the minimum for a golf course function is 18.

So "the natural language of the words" doesn't actually define the
meaning you intended.

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

Keith Thompson

unread,
Apr 14, 2015, 4:08:54 PM4/14/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Tuesday, April 14, 2015 at 7:43:38 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>>The "optimal strategy" for determining the argument value for which a
>>function returns 0 (or perhaps "close enough" to 0) depends critically
>>on what information we have about the function. Is it continuous
>>(subject to the fact that floating-point representations are discrete)?
>>Is it monotone? Is its first derivative continuous?
>>
> The real issue is that you don't know much about the function,
> you're probing it. If it's pathological, such as a golf course function,
> then there's nothing you can do. But you know that it represents a physical
> process, so that's unlikely - not impossible, but there probably
> isn't a demon setting things up so you can't work out what is
> going on.
> It's not continuous, it consists of three sections, one which returns
> -1, one which returns 1, and a very small one (limited by machine precision)
> which returns 0. You know the machine precision, so if it so happens
> that one value returns 1 and the next representable value returns -1,
> then you've bracketed zero.
> That's a bit unrealistic. The idea that you can't take derivatives isn't
> unrealistic at all.

Ok, so the fact that the function is monotonic increasing is enough to
let us search for the argument value for which it returns 0, or at least
where it transitions from negative to positive. A binary search would
work. A biased binary search might be preferable, but we can't
implement it without more specific information about how we *expect* the
function to behave. (I'm making a general point, not asking for more
details about this particular function.)
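A biased binary search of this kind differs from plain bisection only in where the probe is placed. In the sketch below, the oracle (crossing at 3.0) and the choice of bias weight are illustrative assumptions, not anything specified in the thread:

```c
#include <assert.h>

/* Illustrative monotone oracle: sign(x - 3.0). */
static int f(double x)
{
    return (x > 3.0) - (x < 3.0);
}

/* Bisection with the probe at fraction w of the interval.
   w == 0.5 is the unbiased search; w < 0.5 biases toward low values,
   which pays off when the answer usually lies near the lower bound. */
static double find_zero_biased(double lo, double hi, double w)
{
    for (;;) {
        double probe = lo + (hi - lo) * w;
        if (probe <= lo || probe >= hi)
            return probe;              /* interval exhausted */
        int s = f(probe);
        if (s < 0)
            lo = probe;
        else if (s > 0)
            hi = probe;
        else
            return probe;
    }
}
```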

[...]

>> The big problem is that ln(x) is undefined for x <= 0.0. A secondary
>> problem is that there might not be a value of x such that ln(x) - n == 0
>> (mathematically there is, but floating-point is not so well behaved).
>>
> That's not a big problem, since I've told you that n is under 100
> and all runs have produced values over 1, and n represents the
> number of steps in a physical process, with a bit of fuzziness to
> make the problem continuous.
> Clearly we can't pass x < 0.0. We've got one limit, but not the other
> one.

The second version of your function computes ln(x), which is undefined
for x <= 0.0. In your original description, you said the function is
defined over the range -99 .. +99. If you were computing ln(n) and we
know that n >= 1, that would be ok -- but you're computing ln(x).

Your description is inconsistent.

This probably isn't relevant to the general problem, except that we
can't work with the function unless we have a consistent description of
its behavior.

>> Ignoring those problems (assuming a similar function that's actually
>> defined over -99..+99), and given information about the probability
>> distribution of the value of n and of the choice of which of the two
>> functions is being used, I have little doubt that an optimal strategy
>> could be devised, probably some kind of biased binary search. I'm
>> skeptical that it would be worth the effort over a simple unbiased
>> binary search, but perhaps in some applications it would be.
>>
> It is, yes. Obviously for the problem as I have defined it you can
> get the result in under just noticeable time on a modern machine,
> so it wouldn't really be worth the effort. But in reality f(x)
> might represent a physical measurement and be quite costly.
> Another common situation is that you spend several days of processor
> time minimising a function that takes several parameters.

Sure, all that seems plausible.

Your original claim was something about a "golden section search".
Perhaps later I'll read up on that. But I think we've gone rather far
afield from that claim.

Malcolm McLean

unread,
Apr 14, 2015, 4:16:16 PM4/14/15
to
On Tuesday, April 14, 2015 at 8:52:31 PM UTC+1, Johannes Bauer wrote:
> On 14.04.2015 21:43, Malcolm McLean wrote:
>
> >>>> int f(long long int x){
> >>>> return x!=6839003422641637678;
> >>>> }
> >>>>
> >>> You can't minimise a golf course function. (That's one with a flat area and a tiny
> >>> minimum).
> >>
> >> Clearly one can. However, as in so many of your threads, your words are
> >> no doubt to be interpreted with all the extra words added in to make
> >> what you are saying something correct.
> >
> > The meaning is defined by the natural language of the words.
>
> So then can I assume that by you "can't minimise" a function you mean
> that I can't write this:
>
> void minimize_unminimizable(void) {
> for (long long int i = 0; i < LLONG_MAX; i++) {
> if (f(i) == 0) {
> printf("Minimum: %lld\n", i);
> }
> }
> }
>
> That would be a very peculiar natural language meaning of the word
> "can't" because for something that I "can't" do I sure as hell just did.
>
It wouldn't be an unusual use of natural language.

I said "a golf course function". Add a dimension.

Richard Heathfield

unread,
Apr 14, 2015, 4:39:47 PM4/14/15
to
That would be a rather uninteresting golf course. Add /two/ dimensions.

Malcolm McLean

unread,
Apr 14, 2015, 5:03:00 PM4/14/15
to
On Tuesday, April 14, 2015 at 9:08:54 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
> Ok, so the fact that the function is monotonic increasing is enough to
> let us search for the argument value for which it returns 0, or at least
> where it transitions from negative to positive. A binary search would
> work. A biased binary search might be preferable, but we can't
> implement it without more specific information about how we *expect* the
> function to behave. (I'm making a general point, not asking for more
> details about this particular function.)
>
That's exactly the point. If you literally know nothing about the
function, then of course there's no strategy other than trying numbers
at random.
>
> The second version of your function computes ln(x), which is undefined
> for x <= 0.0. In your original description, you said the function is
> defined over the range -99 .. +99. If you were computing ln(n) and we
> know that n >= 1, that would be ok -- but you're computing ln(x).
>
> Your description is inconsistent.
>
It's hard to say what the computer will do if you pass x < 0, but
it's quite common to have a constraint that the argument shall be
positive.
>
> This probably isn't relevant to the general problem, except that we
> can't work with the function unless we have a consistent description of
> its behavior.
>
The point is you don't know what you're dealing with. You probe it
to characterise it. If you literally know nothing, then probing
isn't going to help, but you've been told that n represents the
number of steps in a physical process.
>
> Your original claim was something about a "golden section search".
> Perhaps later I'll read up on that. But I think we've gone rather far
> afield from that claim.
>
You want to work out how much memory you have. You can call malloc(),
which will return a pointer if the amount requested is under the amount
available, or null if it is over. It's the same problem.

Or a slightly different variant, you have a stream of text data
coming in. You can allocate a small buffer with certainty, but
you don't want to gobble a large buffer. However, reallocations are
also expensive. So given that our input is at least N bytes,
what's our best estimate of its likely size?

The point is that text is produced by a physical process. We can't
characterise it exactly - there might be fashion for short and snappy
tweets, followed by a later fashion for voluminous emails, followed
by another one for messages of exactly 100 words. But on the other
hand, we can't say absolutely nothing about it either.
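The buffer scenario above is usually handled with geometric growth. The sketch below grows the capacity by roughly 1.5x per reallocation; that factor, and the starting size of 16, are arbitrary compromises chosen for the example between wasted space and the number of expensive realloc() calls:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *data;
    size_t len;
    size_t cap;
} Buffer;

/* Append n bytes, growing the capacity geometrically on overflow.
   Returns 0 on success, -1 on allocation failure (old data intact). */
static int buffer_append(Buffer *b, const char *src, size_t n)
{
    if (b->len + n > b->cap) {
        size_t newcap = b->cap ? b->cap : 16;
        while (newcap < b->len + n)
            newcap += newcap / 2 + 1;   /* grow by ~1.5x */
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```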

David Brown

unread,
Apr 14, 2015, 5:59:03 PM4/14/15
to
On 14/04/15 17:11, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 3:10:29 PM UTC+1, David Brown wrote:
>>

Again, I have to assume that you agree with all my other points - but I
would prefer if you were explicit.

>> If you know f(x) is monotone, then a binary search is optimal (like
>> searching a sorted list).
>>
> That's only the case if you know the maximum value, and you've reason to believe
> that the probability distribution for the answer is uniform. Which is the case if you
> know the target is an entry in a sorted list. But that's information you've supplied.

As I said, "if you know f(x) is monotone, then a binary search is
optimal (like searching a sorted list)". Perhaps I should have
specified that you know f(x) is monotone, but know nothing else about it?

>
> In my problem (I'm not sure I've set it up rigorously, but it basically reflects real
> situations where we know f(x) represents some biological process, but we don't
> really know anything else about it), we know that our answer has a probability
> density function which is somewhere between uniform and somewhere between
> an exponential decay in e^x.
>

Your description doesn't really make sense, but in general if you know
more about a function or its distribution, then of course you can pick
better ways to divide up your search (for maximum, minimum, or specific
values).

Keith Thompson

unread,
Apr 14, 2015, 6:29:11 PM4/14/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Tuesday, April 14, 2015 at 9:08:54 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> Ok, so the fact that the function is monotonic increasing is enough to
>> let us search for the argument value for which it returns 0, or at least
>> where it transitions from negative to positive. A binary search would
>> work. A biased binary search might be preferable, but we can't
>> implement it without more specific information about how we *expect* the
>> function to behave. (I'm making a general point, not asking for more
>> details about this particular function.)
>>
> That's exactly the point. If you literally know nothing about the
> function, then of course there's no strategy other than trying numbers
> at random.

Right. And when I jumped into this thread, I knew almost literally
nothing about the function. You added some qualifications that might or
might not, for all I know, imply that a search using the golden ratio
would be optimal. The details are unlikely to be related to C, so I
think I'll bow out of this.

>> The second version of your function computes ln(x), which is undefined
>> for x <= 0.0. In your original description, you said the function is
>> defined over the range -99 .. +99. If you were computing ln(n) and we
>> know that n >= 1, that would be ok -- but you're computing ln(x).
>>
>> Your description is inconsistent.
>>
> It's hard to say what the computer will do if you pass x < 0, but
> it's quite common to have a constraint that the argument shall be
> positive.

I think I misunderstood your description. Upthread, you wrote:

Say we want to find x for which f(x) is zero. We know that x
is N steps away from unity, and we know than N is less than,
say, 100. But we don't know if the function is x - N or ln(x)
- N. We can't plot it, all we can do is call f(x) and see if
it is positive or negative.

I think I confused x and N, and thought that you were saying the
function is defined for argument values from -99 to +99.

So the use of ln(x) implicitly constrains x to be positive. I should
have picked up on that, but it's something that would have been good to
mention explicitly.

[...]

Ben Bacarisse

unread,
Apr 14, 2015, 8:20:34 PM4/14/15
to
(1) So you placed that comment after an example of a one-argument
function in a thread about minimising one-argument functions, but you
did not intend your words to be interpreted as referring to one-argument
functions? Would you agree that that is, at best, rather deceptive?

(2) Are you claiming that Johannes Bauer's example code can not be
extended to functions of more than one argument?

--
Ben.

Richard Heathfield

unread,
Apr 14, 2015, 8:29:42 PM4/14/15
to
On 15/04/15 01:20, Ben Bacarisse wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:

<snip>

>> I said "a golf course function". Add a dimension.
>
> (1) So you placed that comment after an example of a one-argument
> function in a thread about minimising one-argument functions, but you
> did not intend your words to be interpreted as referring to one-argument
> functions? Would you agree that that is, at best, rather deceptive?

I wouldn't. With the greatest possible respect to Malcolm, I'm afraid
Hanlon's Razor must be applied.

Johannes Bauer

unread,
Apr 15, 2015, 2:14:21 AM4/15/15
to
On 14.04.2015 22:16, Malcolm McLean wrote:

>>>>> You can't minimise a golf course function. (That's one with a flat area and a tiny
>>>>> minimum).
>
>> That would be a very peculiar natural language meaning of the word
>> "can't" because for something that I "can't" do I sure as hell just did.
>>
> It wouldn't be an unusual use of natural language.

So what you're saying is that it's natural in language that "can't" may
mean "can", yes?

Just to confirm, does your "wouldn't" in your sentence maybe also mean
"would" or am I reading too much into it?

> I said "a golf course function". Add a dimension.

You do realize that YOU provided the original function, right? And you
do also realize that adding a dimension changes nothing about the
problem, right?

You're doing *exactly* the weaseling about words which Ben predicted you
would instead of admitting that your original wording was -- at best --
imprecise.

David Brown

unread,
Apr 15, 2015, 4:02:20 AM4/15/15
to
On 14/04/15 17:54, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 4:33:41 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> So we know that f(x) is a function, that f(x) == 0 for some x in the
>> range -99 .. +99, and we want to find such an x. Since you way we
>> "can't plot it", I presume that either the parameter is real rather than
>> integer or that calling the function is quite expensive; otherwise I'd
>> just call the function up to 199 times and be done with it.
>>
> Exactly.
> F(x) is either
>
>
> int f(double x)
> {
> return sign( x - n );
> }
>
> or
> int f(double x)
> {
> return sign(ln(x) - n);
> }
>
> We know that n isn't huge, in fact I'll tell you that it's between zero and 100.
>
> So the task is to devise a search strategy that homes in on x (to machine
> precision), with the fewest possible calls to f(), to make f return 0.

This is a completely and utterly pointless example. I am assuming that
x must be greater than 0, otherwise the log function is undefined. And
since you've called n a "number of steps", it must be a non-negative
integer.

A binary search of integer x between 1 and 100 is optimal.






David Brown

unread,
Apr 15, 2015, 4:04:06 AM4/15/15
to
On 14/04/15 21:41, Malcolm McLean wrote:
> On Tuesday, April 14, 2015 at 7:43:38 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> The "optimal strategy" for determining the argument value for which a
>> function returns 0 (or perhaps "close enough" to 0) depends critically
>> on what information we have about the function. Is it continuous
>> (subject to the fact that floating-point representations are discrete)?
>> Is it monotone? Is its first derivative continuous?
>>
> The real issue is that you don't know much about the function,
> you're probing it.

Please make up your mind. In half your posts, you are telling us a
great deal of information about the function (it is either sign(ln(x)
- n) or sign(x - n)), and in the other half you are telling us that we
know little or nothing about the function.


Malcolm McLean

unread,
Apr 15, 2015, 4:26:05 AM4/15/15
to
On Tuesday, April 14, 2015 at 10:59:03 PM UTC+1, David Brown wrote:
> On 14/04/15 17:11, Malcolm McLean wrote:
>
> As I said, "if you know f(x) is monotone, then a binary search is
> optimal (like searching a sorted list)". Perhaps I should have
> specified that you know f(x) is monotone, but know nothing else about it?
>
You know that n represents the number of steps in a physical process.
Almost everything, but not absolutely everything, clusters towards
low values. The likely distributions are top hat (n goes from 0 to
a constant, and each step adds an increment), gaussian
(n is usually constant and low, but each step adds a uniform number,
the entire system constrained to be above zero), log gaussian (each
step creates a factor which multiplies), and broken stick (you get this
distribution by taking a stick, breaking it in random places, then
taking the distribution of the lengths of the fragments). There are others,
but they all have the characteristic that negative values are
impossible (typical of physical processes), and as we go towards infinity
the probability density function goes to zero.

So if you've got two constraints, zero and a sanity test or an
inherent limit (e.g. the size of a fish is limited by the volume
of the ocean), then a uniform distribution between zero and
your limit, whilst not impossible, is unlikely.
>
> Your description doesn't really make sense, but in general if you know
> more about a function or its distribution, then of course you can pick
> better ways to divide up your search (for maximum, minimum, or specific
> values).
>
Exactly. We're back to our previous discussion.
Data means something. You're not shuffling about random bits, at least
usually. You're taking measurements or observations or simulations
of something in the world outside the computer.

Malcolm McLean

unread,
Apr 15, 2015, 4:32:11 AM4/15/15
to
It's a case of understanding how linguistically competent adult
speakers communicate with each other.

David Brown

unread,
Apr 15, 2015, 5:11:36 AM4/15/15
to
I'm sorry, but I am going to have to stop replying to you in this
thread. The /world/ may not shuffle around random bits, but /you/
certainly do. You leap randomly between talking about invented
meaningless functions, distributions we know /nothing/ about, and
distributions that we know /something/ about (but you vary what that
"something" might be). You make totally incorrect mathematical claims
(like "powers of phi is something between exponential and linear"), and
totally incorrect programming claims (like "golden section search" is
/the/ best approach). And every time you are challenged or shown to be
wrong, you ignore the comment and jump off at a new tangent.

It is just getting too tedious and painful explaining the same things
repeatedly to someone who is convinced that he already knows everything.



Malcolm McLean

unread,
Apr 15, 2015, 5:35:34 AM4/15/15
to
It's really very basic.
If I say "we're trying to minimise a function we know nothing about", and you
interpret "nothing" in the most literal way possible, then there's an infinite
number of possible functions, and you're left with some rarefied branch of set
theory.
But if you interpret "nothing" as meaning, "I know a few basics, such as that
I can represent the values adequately using my programming language, such
as that the function represents something in the real world, but essentially
I'm dealing with an unknown function" then you're out of the rarefied branch
of mathematical set theory, and you can make a few common sense observations
that require a little bit of mathematical insight, but no special training.

Fish size is limited at one extreme by zero, at another by the volume of the ocean.
So that gives us two limits. But a fish is likely to be closer to the lower limit
than the upper. Replace fish by tree and replace "size of the ocean" by "height
of earth's atmosphere" and you've got the same issue. Replace tree by salary and
"height of the earth's atmosphere" by 'total world gross GDP" and you've got the
same issue. Replace salary by "star" and "total world GDP" with "so massive that
light can't escape" and what? Well I'm not an astrophysicist. Maybe we've got an
exception here. Maybe there are a lot of stars just on the cusp of becoming
black holes. So we need to be aware that we've got to support the upper limit.

It's a case of making the leap from naive literalism "you can't minimise a golf course
function" "well you can, here's a program that minimises Bart's function" to
a more flexible, more mature, more informed way of talking and of thinking.


David Brown

unread,
Apr 15, 2015, 7:52:42 AM4/15/15
to
On 15/04/15 11:35, Malcolm McLean wrote:
<snip>

When you are prepared to use the same language as the rest of us, maybe
we can make progress. But as long as you expect people to interpret "a
function we know nothing about" as "the size of a fish", there is little
point in the thread.

At least with fir, asetofsymbols, and Rosario19 it is obvious that they
can't (or won't) communicate normally. It is much more frustrating when
you write perfectly clear English but mean something entirely different.


Malcolm McLean

unread,
Apr 15, 2015, 9:04:32 AM4/15/15
to
A useful function represents something in the real world.
In this case, the function is the size of the fish, or, if you insist,
x - n where n is the size of the fish (in metres), and we're looking
for the value of x where that function is zero.

That's not a good way of talking.

So what's x going to be? I have told you almost nothing about the fish, other
than that it's fish. But that's crucial information, x can't be negative, and
it can't be bigger than the maximum dimension of the ocean. We need something
else, which is the infinitesimal limit, but ignore that for the time being. I'm
trying to put over the essential simplicity of the concept.

But is x likely to be close to zero, or close to the maximum dimension of the
ocean? What's our best search strategy?

Now the leap you need to make is that I'm not telling you that n is the length of a fish.
All I'm telling you is that it represents something in the real world, and we know
it can't be lower than zero, and we know we can represent it adequately using
the numerical facilities of our computer. Now what's our optimal search strategy
for n?

Keith Thompson

unread,
Apr 15, 2015, 11:08:16 AM4/15/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
[...]
> It's really very basic.
> If I say "we're trying to minimise a function we know nothing about",
> and you interpret "nothing" in the most literal way possible, then
> there's an infinite number of possible functions, and you're left with
> some rarefied branch of set theory.
[...]

It might have helped if you had defined what it means to "minimise" a
function. (One possible meaning is "belittle"; I can certainly do
that.)

You've presented functions that can return -1, 0, or 1, and discussed
finding an argument for which they return 0. If that's what you mean by
"minimise", you need to be a *lot* clearer in your descriptions.

Keith Thompson

unread,
Apr 15, 2015, 11:17:33 AM4/15/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
[...]
>> Malcolm McLean <malcolm...@btinternet.com> writes:
[...]
>> >> On 14.04.2015 21:43, Malcolm McLean wrote:
[...]
>> >> >>> You can't minimise a golf course function. (That's one with a
>> >> >>> flat area and a tiny
>> >> >>> minimum).
[...]
>> >> > The meaning is defined by the natural language of the words.
[...]
>> > It wouldn't be an unusual use of natural language.
>> >
>> > I said "a golf course function". Add a dimension.
[...]
> It's a case of understanding how linguistically competent adult
> speakers communicate with each other.

Malcolm, if other people repeatedly fail to understand what I'm talking
about, I usually consider the possibility that it's because I'm being
unclear.

Malcolm McLean

unread,
Apr 15, 2015, 11:57:31 AM4/15/15
to
On Wednesday, April 15, 2015 at 4:08:16 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
> [...]
> > It's really very basic.
> > If I say "we're trying to minimise a function we know nothing about",
> > and you interpret "nothing" in the most literal way possible, then
> > there's an infinite number of possible functions, and you're left with
> > some rarefied branch of set theory.
> [...]
>
> It might have helped if you had defined what it means to "minimise" a
> function. (One possible meaning is "belittle"; I can certainly do
> that.)
>
> You've presented functions that can return -1, 0, or 1, and discussed
> finding an argument for which they return 0. If that's what you mean by
> "minimise", you need to be a *lot* clearer in your descriptions.
>
That's just usage.
People say "minimisation" when they're trying to find a maximum; conventionally
you invert the function. Normally you say "solve" when you're trying to find zero.
In this case I set it up so that you can think of it as a minimum and two derivatives,
which you can take accurately enough to know the sign and nothing else, or
you can think of it as a step with a tiny vertical which is the point we want.

Malcolm McLean

unread,
Apr 15, 2015, 12:13:38 PM4/15/15
to
On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
> Malcolm, if other people repeatedly fail to understand what I'm talking
> about, I usually consider the possibility that it's because I'm being
> unclear.
>
But I'm not being unclear.

These are fairly simple ideas. Data means something. Values tend to cluster towards
the low end of the possible range. If we want to find a value, we need to bias our
search to the low range, but not by so much that we'll take too long to find a
high value.
The Fibonacci sequence goes 0, 1, 1, 2, 3, 5, 8, 13 ... Ignoring for a moment the theoretical
difficulty of the two ones, you can see that it starts out linear, then becomes exponential.
Or, if we approximate it as Phi^x, we can see that Phi is roughly intermediate between 1
and e. So if you don't know if you're probing something which is linear or exponential, it's
a compromise. Replace the second 1 with 1 plus epsilon, and you have quite a nice
strategy for probing something you know next to nothing about.
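The Phi^x probe described above amounts to an exponential (sometimes called galloping) search: step out by powers of phi until the sign flips, then bisect the resulting bracket. A sketch, with an invented oracle crossing at 37:

```c
#include <assert.h>
#include <math.h>

/* Invented monotone oracle: sign(x - 37.0). */
static int f(double x)
{
    return (x > 37.0) - (x < 37.0);
}

/* Probe at 1, phi, phi^2, ... until f(x) >= 0, then bisect the
   bracket [prev, x]. Any base > 1 works; phi is the compromise
   suggested above between linear steps and powers of e. */
static double find_zero_gallop(void)
{
    const double phi = (1.0 + sqrt(5.0)) / 2.0;
    double prev = 0.0, x = 1.0;
    while (f(x) < 0) {
        prev = x;
        x *= phi;
    }
    for (;;) {
        double mid = prev + (x - prev) / 2.0;
        if (mid <= prev || mid >= x)
            return mid;                /* bracket exhausted */
        int s = f(mid);
        if (s < 0)
            prev = mid;
        else if (s > 0)
            x = mid;
        else
            return mid;
    }
}
```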

David Brown

unread,
Apr 15, 2015, 12:24:57 PM4/15/15
to
On 15/04/15 17:57, Malcolm McLean wrote:
> On Wednesday, April 15, 2015 at 4:08:16 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> [...]
>>> It's really very basic.
>>> If I say "we're trying to minimise a function we know nothing about",
>>> and you interpret "nothing" in the most literal way possible, then
>>> there's an infinite number of possible functions, and you're left with
>>> some rarefied branch of set theory.
>> [...]
>>
>> It might have helped if you had defined what it means to "minimise" a
>> function. (One possible meaning is "belittle"; I can certainly do
>> that.)
>>
>> You've presented functions that can return -1, 0, or 1, and discussed
>> finding an argument for which they return 0. If that's what you mean by
>> "minimise", you need to be a *lot* clearer in your descriptions.
>>
> That's just usage.
> People say "minimisation" when they're trying to find a maximum, conventionally
> you invert the function.

Let me get this straight - when /you/ say "minimise", you sometimes mean
"find a 0", and sometimes "find a maximum". What do you call it when
you want to find a minimum? Painting the function pink? Translating it
to Chinese?

I don't even know where to start with your complete misunderstanding of
the word "derivative" - perhaps you forgot to mention that you use the
word "derivative" here to mean different species of fish.

Johannes Bauer
Apr 15, 2015, 12:33:49 PM
On 15.04.2015 18:13, Malcolm McLean wrote:
> On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> Malcolm, if other people repeatedly fail to understand what I'm talking
>> about, I usually consider the possibility that it's because I'm being
>> unclear.
>>
> But I'm not being unclear.

No you're being flat out *wrong* and instead of doing what
(linguistically competent) adults would do, you try to cover up your
mistakes and weasel out of responsibility. Instead of simply saying
"yeah, sorry, wrong about this".

> The Fibonacci sequence goes 0, 1, 1, 2, 3, 5, 8, 13 ... Ignoring for a moment the theoretical
> difficulty of the two ones, you can see that it starts out linear, then becomes exponential.

So the Fibonacci sequence "starts out linear"? That is like saying that
the sine function is a linear function because for x close to 0 it's a
reasonable approximation.

But it's still wrong. The Fibonacci sequence is an exponential sequence,
there's no question about it. You claiming it "starts out linear" is as
wrong as me saying that it "starts out constant" until it "turns
quadratic" to finally "become exponential". It's nonsense.

> Or, if we approximate it as Phi^x, we can see that Phi is roughly intermediate between 1
> and e. So if you don't know if you're probing something which is linear or exponential, it's
> a compromise. Replace the second 1 with 1 plus epsilon, and you have quite a nice
> strategy for probing something you know next to nothing about.

Very poor argument for choosing phi as the exponential base for probing.
Essentially all your evidence is that it's somewhere halfway between 1
and e. Cool beans, but why would that be impressive?

Why don't you choose a number that is halfway between 0 and Pi and take
that? How come the midpoint of 1 and e is - according to you -
especially well suited to probe something unknown?

It isn't. And you can't back your claims up, not even with empirical
data. It's just nonsense that you pile on top of other nonsense. Anyone
who points it out is just "linguistically incompetent" according to your
implication.

That's pretty pathetic scientific standards, let me tell you.

Malcolm McLean
Apr 15, 2015, 12:45:23 PM
On Wednesday, April 15, 2015 at 5:24:57 PM UTC+1, David Brown wrote:
> On 15/04/15 17:57, Malcolm McLean wrote:
> > On Wednesday, April 15, 2015 at 4:08:16 PM UTC+1, Keith Thompson wrote:
> >> Malcolm McLean <malcolm...@btinternet.com> writes:
> >>
> > That's just usage.
> > People say "minimisation" when they're trying to find a maximum, conventionally
> > you invert the function.
>
> Let me get this straight - when /you/ say "minimise", you sometimes mean
> "find a 0", and sometimes "find a maximum". What do you call it when
> you want to find a minimum? Painting the function pink? Translating it
> to Chinese?
>
Minimise often means maximise. Conventionally you search for a minimum, and
invert the sign if you actually want a maximum.
That's legitimately confusing, but it's how people who deal with these things
day in and day out talk.

>
> I don't even know where to start with your complete misunderstanding of
> the word "derivative" - perhaps you forgot to mention that you use the
> word "derivative" here to mean different species of fish.
>
You can think of the function as a step, with the points above the critical value
as one and the points below as minus one. Or you can think of it as a valley
with the critical value as the lowest point, and one side pointing down,
and the other side also pointing down, but from the other direction.

Malcolm McLean
Apr 15, 2015, 1:02:33 PM
On Wednesday, April 15, 2015 at 5:33:49 PM UTC+1, Johannes Bauer wrote:
> On 15.04.2015 18:13, Malcolm McLean wrote:
> > On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
>
> > The Fibonacci sequence goes 0, 1, 1, 2, 3, 5, 8, 13 ... Ignoring for a moment the theoretical
> > difficulty of the two ones, you can see that it starts out linear, then becomes exponential.
>
> So the Fibonacci sequence "starts out linear"? That is like saying that
> the sine function is a linear function because for x close to 0 it's a
> reasonable approximation.
>
Ignore the two ones.
0, 1, 2, 3 ... a step of one, a step of one, a step of one, that's linear.
it then converges on an exponential function at the higher end.
>
> Very poor argument for choosing phi as the exponential base for probing.
> Essentially all your evidence is that it's somewhere halfway between 1
> and e. Cool beans, but why would that be impressive?
>
Choosing e gives us a nice curve.
Choosing one gives us a flat line.
So we want something midway between the two. It would be nice if we applied
a golden section search to the interval 1, e we got Phi, but we don't. We
do get something fairly close. We're on the right lines with 1.6 or so.
>
> Why don't you choose a number that is halfway between 0 and Pi and take
> that? How come the midpoint of 1 and e is - according to you -
> especially well suited to probe something unknown?
>
exponentials and e like each other. You can of course raise PI to a power,
but it's not especially interesting if you do that.
>
> It isn't. And you can't back your claims up, not even with empirical
> data. It's just nonsense that you pile on top of other nonsense. Anyone
> who points it out is just "linguistically incompetent" according to your
> implication.
>
We hear claims that it's nonsense. I said "you can't minimise a golf
course function" in response to Bart's objection. That's "nonsense" according to
some people. That's linguistic incompetence. Kenny regularly makes the
same point.
>
> That's pretty pathetic scientific standards, let me tell you.
>
No, it's an explanation of how to characterise things you don't know much about.
If you're not a scientist, or not the right kind of scientist, it's not something
you'll ever have to do. If you're a mathematician, you'll probably regard what I
say as absurdly simplistic and irrelevant to real mathematics. But if you've
got data points and you want to understand where to sample to understand
the underlying processes which generate them, you'll see the sense.

Johannes Bauer
Apr 15, 2015, 1:09:37 PM
On 15.04.2015 18:45, Malcolm McLean wrote:

> Minimise often means maximise. Conventionally you search for a minimum, and
> invert the sign if you actually want a maximum.
> That's legitimately confusing, but it's how people who deal with these things
> day in and day out talk.

If I have the function x^3 - 2x^2 and I ask you for the X value of the
local minimum, I mean the X value of the local minimum.

Not the X value of the local maximum.
Not the Y value of the local minimum.
Not the Y value of the local maximum.
Nor any other nonsense.

The burden to get your ideas across is on *you*. You're being sloppy in
your language and when you're confronted with that you go into
full-arrogance mode where you try to defend yourself by claiming "pretty
much everyone does it like this" and that completely opposite concepts
are similar to the point where terms can be used interchangeably.

You're the one being sloppy and you expect everyone to clean up your
mess after you and interpret your words in the way you MEANT, not the
way you WROTE them. That's incredibly arrogant.

If you want to express an idea, write down the IDEA not something that
is vaguely or even vastly different and then crying that it's not what
you ACTUALLY meant. That's just being obnoxious and annoying.

Johannes Bauer
Apr 15, 2015, 2:56:04 PM
On 15.04.2015 19:02, Malcolm McLean wrote:

> Ignore the two ones.
> 0, 1, 2, 3 ... a step of one, a step of one, a step of one, that's linear.
> it then converges on an exponential function at the higher end.

Yes, if you cherrypick X-values of any sequence you can draw conclusions.

Just like
exp(0.000) = 1
exp(0.693) \approx 2
exp(1.099) \approx 3
exp(1.386) \approx 4

You can clearly see the exp() function is linear at first but it then
becomes an exponential function at the higher end.

Does it sound ridiculous yet? I surely hope so.

>> Very poor argument for choosing phi as the exponential base for probing.
>> Essentially all your evidence is that it's somewhere halfway between 1
>> and e. Cool beans, but why would that be impressive?
>>
> Choosing e gives us a nice curve.
> Choosing one gives us a flat line.

Choosing Pi gives us a nice curve.
Choosing one gives us a flat line.
Thus the conclusion is, use Pi/2 as the base.

> exponentials and e like each other. You can of course raise PI to a power,
> but it's not especially interesting if you do that.

Sure, magically (without explanation) your choice is superior to mine,
the only evidence being that "e gives a nice curve". Seriously?

> We hear claims that it's nonsense. I said "you can't minimise a golf
> course function" in response to Bart's objection. That's "nonsense" according to
> some people. That's linguistic incompetence. Kenny regularly makes the
> same point.

You not only heard claims it's nonsense, I outright proved you
wrong by presenting a way to minimize the golf-course function
that YOU provided. I therefore provided EVIDENCE that it's nonsense.

Then you claimed that the function you provided wasn't a golf-course
function after all.

>> That's pretty pathetic scientific standards, let me tell you.
>>
> No, it's an explanation of how to characterise things you don't know much about.
> If you're not a scientist, or not the right kind of scientist, it's not something
> you'll ever have to do.

Funny that you would say that because it just so happens that I *am* a
scientist.

And that kind of "hear what I mean, not what I say" kind of attitude is
what makes me go up a wall precisely *because* I am a scientist. When
talking about stuff in a scientific manner I expect people to be precise
and sharp in their expressions.

I do work in the field of applied cryptography. Regularly people will
say things like "we need to encrypt a certificate" or the likes. And
later claim that - well OF COURSE they didn't mean the certificate but
rather the private key that corresponds to the public key embedded in
the certificate. Then just say so, please.

> But if you've
> got data points and you want to understand where to sample to understand
> the underlying processes which generate them, you'll see the sense.

The only thing I see is sloppiness and weak excuses for said sloppiness.

Keith Thompson
Apr 15, 2015, 3:29:57 PM
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Wednesday, April 15, 2015 at 5:24:57 PM UTC+1, David Brown wrote:
>> On 15/04/15 17:57, Malcolm McLean wrote:
>> > On Wednesday, April 15, 2015 at 4:08:16 PM UTC+1, Keith Thompson wrote:
>> >> Malcolm McLean <malcolm...@btinternet.com> writes:
>> >>
>> > That's just usage.
>> > People say "minimisation" when they're trying to find a maximum,
>> > conventionally you invert the function.
>>
>> Let me get this straight - when /you/ say "minimise", you sometimes mean
>> "find a 0", and sometimes "find a maximum". What do you call it when
>> you want to find a minimum? Painting the function pink? Translating it
>> to Chinese?
>>
> Minimise often means maximise. Conventionally you search for a minimum, and
> invert the sign if you actually want a maximum.
> That's legitimately confusing, but it's how people who deal with these things
> day in and day out talk.

Whether that remarkable claim is accurate or not, *I* have never heard
the word "minimise" (or "minimize") used in that way. I do not deal
with these things day in and day out, and chances are most other people
here don't either.

If you're using technical jargon from a field that most of your readers
are not familiar with, it is *your* responsibility to be clear, not ours
to infer that "minimise" means finding the argument for which a function
that can return -1, 0, or +1 returns 0, or failing that transitions from
-1 to +1.

>> I don't even know where to start with your complete misunderstanding of
>> the word "derivative" - perhaps you forgot to mention that you use the
>> word "derivative" here to mean different species of fish.
>>
> You can think of the function as a step, with the points above the critical value
> as one and the points below as minus one. Or you can think of it as a valley
> with the critical value as the lowest point, and one side pointing down,
> and the other side also pointing down, but from the other direction.

You're saying that you can think of the function (which function?) in
two entirely inconsistent ways. Using crude ASCII graphics (you may
need to use a fixed-width font to view this), you first describe a
function whose graph would look something like this (the + and |
characters are meant to be vertically aligned):

            +-------------
            |
------------+

and then a function whose graph would look something like this:

----------\   /---------
           \ /
            V

You say these are two ways to think of the same function. I say
they're different functions.

The closest I can get to making your descriptions make sense is
that the second graph is of the negated derivative of the function
represented by the first (negated so that using the word "minimise"
makes some kind of sense) -- except that the functions you've been
desribing have a discontinuity where they transition from -1 to +1,
and have no defined derivative at that point.

And what does this have to do either with C or with the point you
were trying to make about the golden ratio?

Robert Wessel
Apr 15, 2015, 8:27:42 PM
On Wed, 15 Apr 2015 10:02:18 -0700 (PDT), Malcolm McLean
<malcolm...@btinternet.com> wrote:

>On Wednesday, April 15, 2015 at 5:33:49 PM UTC+1, Johannes Bauer wrote:
>> On 15.04.2015 18:13, Malcolm McLean wrote:
>> > On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
>>
>> > The Fibonacci sequence goes 0, 1, 1, 2, 3, 5, 8, 13 ... Ignoring for a moment the theoretical
>> > difficulty of the two ones, you can see that it starts out linear, then becomes exponential.
>>
>> So the Fibonacci sequence "starts out linear"? That is like saying that
>> the sine function is a linear function because for x close to 0 it's a
>> reasonable approximation.
>>
>Ignore the two ones.
>0, 1, 2, 3 ... a step of one, a step of one, a step of one, that's linear.
>it then converges on an exponential function at the higher end.


That's an interesting claim, and I'd say based on cherry picked data
points. Consider, instead Binet's formula, which clearly shows the
Fibonacci series to be exponential.

Ian Collins
Apr 15, 2015, 10:51:30 PM
Malcolm McLean wrote:
> On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> Malcolm, if other people repeatedly fail to understand what I'm talking
>> about, I usually consider the possibility that it's because I'm being
>> unclear.
>>
> But I'm not being unclear.

But you do have a knack for talking bollocks with great clarity.

--
Ian Collins

Malcolm McLean
Apr 16, 2015, 1:30:09 AM
On Wednesday, April 15, 2015 at 6:09:37 PM UTC+1, Johannes Bauer wrote:
> On 15.04.2015 18:45, Malcolm McLean wrote:
>
> > Minimise often means maximise. Conventionally you search for a minimum, and
> > invert the sign if you actually want a maximum.
> > That's legitimately confusing, but it's how people who deal with these things
> > day in and day out talk.
>
> If I have the function x^3 - 2x^2 and I ask you for the X value of the
> local minimum, I mean the X value of the local minimum.
>
> Not the X value of the local maximum.
> Not the Y value of the local minimum.
> Not the Y value of the local maximum.
> Nor any other nonsense.
>
But if you wanted the y value of the local minimum, and I gave you
the x value, would you complain?

In fact we say "the function minimisation problem", "techniques for
minimisation" and so on, and everyone understands that if you are
looking for a maximum you simply invert the sign. It's legitimately
a bit confusing if you're not familiar with the jargon.

Malcolm McLean
Apr 16, 2015, 1:57:02 AM
On Wednesday, April 15, 2015 at 8:29:57 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
> You're saying that you can think of the function (which function?) in
> two entirely inconsistent ways. Using crude ASCII graphics (you may
> need to use a fixed-width font to view this), you first describe a
> function whose graph would look something like this (the + and |
> characters are meant to be vertically aligned):
>
>             +-------------
>             |
> ------------+
>
> and then a function whose graph would look something like this:
>
> ----------\   /---------
>            \ /
>             V
>
> You say these are two ways to think of the same function. I say
> they're different functions.
>
They're not really. The function is "too low", "too high" or "just
right". So you can think of it as looking for a step, or you can
think of it as looking for the low point of a valley with two
walls pointing you to the answer. It's whatever visualisation works
for you.
>
> The closest I can get to making your descriptions make sense is
> that the second graph is of the negated derivative of the function
> represented by the first (negated so that using the word "minimise"
> makes some kind of sense) -- except that the functions you've been
> describing have a discontinuity where they transition from -1 to +1,
> and have no defined derivative at that point.
>
> And what does this have to do either with C or with the point you
> were trying to make about the golden ratio?
>
The question is how to probe something you know next to nothing about.
For example, we've opened a stream, which we know will send us a
human-composed message in English text. We want to store it in a
memory buffer. Every time we ask for memory, we get charged on a
per byte basis (the malloc zeroes out memory, so if you ask for 20K,
that's ten times as expensive as asking for 2K). realloc() has
exactly the same cost as malloc().
So, clearly, our best strategy is to get a statistical distribution
of the messages coming in, and then work out some sort of formula
that minimises the number of calls. But we can't do that if we're
on version 0. We haven't had any messages to analyse yet. All we
know is that the program is expected to handle "human-composed
messages in English text".
So what's our best strategy now?

Richard Heathfield
Apr 16, 2015, 5:29:46 AM
On 16/04/15 06:56, Malcolm McLean wrote:

<snip>

> The question is how to probe something you know next to nothing about.
> For example, we've opened a stream, which we know will send us a
> human-composed message in English text. We want to store it in a
> memory buffer. Every time we ask for memory, we get charged on a
> per byte basis (the malloc zeroes out memory, so if you ask for 20K,
> that's ten times as expensive as asking for 2K). realloc() has
> exactly the same cost as malloc().
> So, clearly, our best strategy is to get a statistical distribution
> of the messages coming in, and then work out some sort of formula
> that minimises the number of calls.

No, the best strategy is to change what we want, very slightly. Instead
of storing it in memory, we store it in a file, counting bytes as we go.
We can then, if we wish, use that count to define a VLA and then read
the whole file into that VLA, reducing the number of calls to
malloc/realloc to 0, and thus eliminating the cost entirely.

The trouble with analogies is that they break so easily.

Malcolm McLean
Apr 16, 2015, 7:46:20 AM
On Thursday, April 16, 2015 at 10:29:46 AM UTC+1, Richard Heathfield wrote:
> On 16/04/15 06:56, Malcolm McLean wrote:
>
> > The question is how to probe something you know next to nothing about.
> > For example, we've opened a stream, which we know will send us a
> > human-composed message in English text. We want to store it in a
> > memory buffer. Every time we ask for memory, we get charged on a
> > per byte basis (the malloc zeroes out memory, so if you ask for 20K,
> > that's ten times as expensive as asking for 2K). realloc() has
> > exactly the same cost as malloc().
> > So, clearly, our best strategy is to get a statistical distribution
> > of the messages coming in, and then work out some sort of formula
> > that minimises the number of calls.
>
> No, the best strategy is to change what we want, very slightly. Instead
> of storing it in memory, we store it in a file, counting bytes as we go.
> We can then, if we wish, use that count to define a VLA and then read
> the whole file into that VLA, reducing the number of calls to
> malloc/realloc to 0, and thus eliminating the cost entirely.
>
> The trouble with analogies is that they break so easily.
>
If you've genuinely got no other constraints than calls to realloc() then
you can do things differently.
But it's a real problem. The statistical distribution answer is likely to be
impractical. The "farm out to disk" answer is going to be too slow and
put junk on someone's hard drive. The "allocate lots of sections and join
them up" solution means your peak memory use is double - in the
problem as I posed it, not a prohibitive penalty, but likely a constraint,
and it's too much complicated code.
People genuinely do have streams of data coming in, they want to increase
buffer size to accommodate the stream, and they want to minimise
both memory take and reallocations. (The unrealistic part is that you
have a balance between memory take and reallocations/copies in real
life, which means that it's a much harder optimisation problem).

Richard Heathfield
Apr 16, 2015, 8:08:28 AM
On 16/04/15 12:46, Malcolm McLean wrote:

<snip>

> If you've genuinely got no other constraints than calls to realloc() then
> you can do things differently.

In other words, the problem statement was imprecise.

> But it's a real problem. The statistical distribution answer is likely to be
> impractical. The "farm out to disk" answer is going to be too slow and
> put junk on someone's hard drive.

You'd be surprised. Firstly, disks are astoundingly fast nowadays.
Secondly, servers are cheap, so it doesn't have to be "someone's" hard
drive. It can be a RAID in the data centre. That way, it's the
shareholders' hard drive, and they don't have access anyway, so that's
all right.

> The "allocate lots of sections and join
> them up" solution means your peak memory use is double - in the
> problem as I posed it, not a prohibitive penalty, but likely a constraint,
> and it's too much complicated code.

Actually, the real, true, genuine, correct, legitimate, copper-bottomed
answer to your question is that malloc and realloc are far cheaper than
you suggested, and your constraint was therefore an artificial one.

> People genuinely do have streams of data coming in, they want to increase
> buffer size to accommodate the stream, and they want to minimise
> both memory take and reallocations. (The unrealistic part is that you
> have a balance between memory take and reallocations/copies in real
> life, which means that it's a much harder optimisation problem).

If you have more data coming in than one machine can comfortably handle,
then either you're the Met Office, in which case you already have a good
solution to this problem, or (almost certainly) your data don't actually
need to be related to each other in anything other than a
meta-relationship, so you can simply farm out different tasks to
different machines. For example, if three thousand people are all
uploading videos to you at the same time, you may need to be able to
store them all, but you probably don't need to have them all in one
particular computer's memory at the same time. Or if you're receiving
image data for rendering, you don't need to have every image in the same
computer's memory at the same time - you can hurl the images into the
rendering farm.

I'm not saying we should be wasteful of resources - of course I'm not -
but you'll not be saving money by trying to squeeze every last
milli-ounce of performance out of a machine by coming up with tricksy
algorithms when it's quicker and cheaper to throw another box at the
problem instead.

Keith Thompson
Apr 16, 2015, 11:23:10 AM
Malcolm McLean <malcolm...@btinternet.com> writes:
[...]
> In fact we say "the function minimisation problem", "techniques for
> minimisation" and so on, and everyone understands that if you are
> looking for a maximum you simply invert the sign. It's legitimately
> a bit confusing if you're not familiar with the jargon.

You've asserted repeatedly that "minimisation" is commonly used as
jargon for finding either the minimum or the maximum value of a
function. If that's true, you should be able to cite sources that use
the word that way.

Please do so.

In my own cursory Google search, I see an explicit distinction between
"minimization" and "maximization", with the non-jargon meanings I'd
expect. Of course that doesn't prove anything, but it is suggestive.

Malcolm McLean
Apr 16, 2015, 11:43:01 AM
On Thursday, April 16, 2015 at 1:08:28 PM UTC+1, Richard Heathfield wrote:
> On 16/04/15 12:46, Malcolm McLean wrote:
>
> > If you've genuinely got no other constraints than calls to realloc() then
> > you can do things differently.
>
> In other words, the problem statement was imprecise.
>
I should have specified that you can only have one buffer, and the data
must always be maintained in a contiguous state in the buffer.
That does make the problem sound artificial.
>
> Actually, the real, true, genuine, correct, legitimate, copper-bottomed
> answer to your question is that malloc and realloc are far cheaper than
> you suggested, and your constraint was therefore an artificial one.
>
Ok, so think of a C++ vector.
We have to implement push_back().
Or think of the original problem where we're trying to find the point at which
memory runs out. Or, as I recast it, we're doing a search for the size of a
fish.
Or you're probing the free energy landscape of a biomolecule and you want to
find the local minimum in a given direction in configuration space.
It's all essentially the same problem.
The last one could gobble up weeks of supercomputer time.
>
> I'm not saying we should be wasteful of resources - of course I'm not -
> but you'll not be saving money by trying to squeeze every last
> milli-ounce of performance out of a machine by coming up with tricksy
> algorithms when it's quicker and cheaper to throw another box at the
> problem instead.
>
I try to give a simple, familiar example.
What we're really after is sampling free energy surfaces. But a lot of people aren't
very familiar with entropy, enthalpy, partial derivatives, Hessian matrices and all
the apparatus of that subset of programming. Most people have seen a few fishes,
maybe even caught one.

Ben Bacarisse
Apr 16, 2015, 11:47:17 AM
Malcolm McLean <malcolm...@btinternet.com> writes:
<snip>
> For example, we've opened a stream, which we know will send us a
> human-composed message in English text. We want to store it in a
> memory buffer. Every time we ask for memory, we get charged on a
> per byte basis (the malloc zeroes out memory, so if you ask for 20K,
> that's ten times as expensive as asking for 2K). realloc() has
> exactly the same cost as malloc().
> So, clearly, our best strategy is to get a statistical distribution
> of the messages coming in, and then work out some sort of formula
> that minimises the number of calls.

Can you prove that (or have I made the mistake of taking you literally
again)? Given the cost model you've proposed, the optimal strategy
would seem to involve minimising the number of bytes allocated, not the
number of calls made.

<snip>
--
Ben.

Richard Heathfield
Apr 16, 2015, 11:53:26 AM
Since minimising is the same as maximising, you can use as many bytes as
you like and it won't cost you any extra. Or perhaps even using as few
bytes as you can will still result in an unmanageable fee. So the best
strategy would appear to be to buy your memory from a more reasonable supplier.

Keith Thompson
Apr 16, 2015, 11:55:44 AM
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Wednesday, April 15, 2015 at 8:29:57 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> You're saying that you can think of the function (which function?) in
>> two entirely inconsistent ways. Using crude ASCII graphics (you may
>> need to use a fixed-width font to view this), you first describe a
>> function whose graph would look something like this (the + and |
>> characters are meant to be vertically aligned):
>>
>>             +-------------
>>             |
>> ------------+
>>
>> and then a function whose graph would look something like this:
>>
>> ----------\   /---------
>>            \ /
>>             V
>>
>> You say these are two ways to think of the same function. I say
>> they're different functions.
>>
> They're not really. The function is "too low", "too high" or "just
> right". So you can think of it as looking for a step, or you can
> think of it as looking for the low point of a valley with two
> walls pointing you to the answer. It's whatever visualisation works
> for you.

Seriously, what the *hell* are you talking about?

My crude ASCII art was intended to be straightforward graphs of two
*different* functions as you described them. They are not different
visualizations of the same function, and I honestly cannot figure out
how you think they could be. The first one could be, for example:

double func1(double x) {
    if (x < 0.0) return -1.0;
    else if (x == 0.0) return 0.0;
    else return 1.0;
}

and the second could be:

double func2(double x) {
    if (x < -1.0) return 1.0;
    else if (x > -1.0 && x < 0.0) return -x;
    else if (x >= 0.0 && x < 1.0) return x;
    else return 1.0;
}

(with a "valley" of width 2.0).

Do those look like the same function to you?

[...]

>> And what does this have to do either with C or with the point you
>> were trying to make about the golden ratio?
>>
> The question is how to probe something you know next to nothing about.
> For example, we've opened a stream, which we know will send us a
> human-composed message in English text. We want to store it in a
> memory buffer.
[snip]
> So what's our best strategy now?

And you assert that a Fibonacci sequence is the best strategy, yes?

For all I know, that might be true. But your hand-waving arguments
about *why* it's true (because it's some kind of "compromise" between
linear and exponential growth?) have not furthered your claim.

(I found a Wikipedia article discussing something like this, suggesting
a Fibonacci-like sequence as an optimal strategy for this kind of thing.
I don't remember the title or URL.)

Steve Thompson
Apr 16, 2015, 12:58:20 PM
On Wed, Apr 15, 2015 at 07:09:30PM +0200, Johannes Bauer wrote:
> On 15.04.2015 18:45, Malcolm McLean wrote:
>
> > Minimise often means maximise. Conventionally you search for a minimum, and
> > invert the sign if you actually want a maximum.
> > That's legitimately confusing, but it's how people who deal with these things
> > day in and day out talk.
>
> If I have the function x^3 - 2x^2 and I ask you for the X value of the
> local minimum, I mean the X value of the local minimum.
>
> Not the X value of the local maximum.
> Not the Y value of the local minimum.
> Not the Y value of the local maximum.
> Nor any other nonsense.
>
> The burden to get your ideas across is on *you*. You're being sloppy in
> your language and when you're confronted with that you go into
> full-arrogance mode where you try to defend yourself by claiming "pretty
> much everyone does it like this" and that completely opposite concepts
> are similar to the point where terms can be used interchangeably.
>
> You're the one being sloppy and you expect everyone to clean up your
> mess after you and interpret your words in the way you MEANT, not the
> way you WROTE them. That's incredibly arrogant.
>
> If you want to express an idea, write down the IDEA not something that
> is vaguely or even vastly different and then crying that it's not what
> you ACTUALLY meant. That's just being obnoxious and annoying.

In Malcolm's defense he is correct in a quasi-general sense in that
people use language and words in complex and contradictory ways in
some situations. People will say they want peace when they desire
'war'; politicians say they want to reduce poverty when they may in
fact wish to increase it to divert more money to otherwise redundant
socialist programs; some say they want simplicity when they wish to
conceal complexity; or they may advocate tolerance when what they mean
is tolerance for intolerance. The most striking dissonance is
observed when pronoun pairs such as 'I' and 'you' are used
interchangeably. It is a fairly commonplace pathology and it often
seems to arise when someone wishes to say something without seeming to
say it. I believe linguistic identification of the basic phenomenon
at work is known as dynamic grammar, but I do not recall where I read
that. This is all to say that a superficially contradictory dialectic
is nothing new.

In any event people are rather good at "correctly" parsing the
intended meaning (authorial intentionality) in such instances from
context as there are almost always telling clues accompanying these
contradictory usages. It is necessary to be sensitive to the shared
context, which may not be obvious or easily apprehended by those of us
who are more literal in our thinking. This sort of thing is firmly
entrenched in the mainstream of society and can be trivially observed
when people use 'sick' to signify that something is 'cool', or 'bad'
when they mean 'good'.

What is more difficult is inferring the correct meaning in the case
where the nominal subject under discussion is some aspect of
mathematics. Math simply does not work if the definition of "minimum"
or "maximum" deviates from the accepted definition, or if the meaning
of terms is otherwise similarly distorted. I doubt Malcolm would ever
assert 4 = 5 even for large values of 4, so it is extremely difficult
to follow what he means when he suggests a minimum is the same as a
maximum along with a bit of handwavium about signs.

In this light his usage seems rather Orwellian, if not outright wrong.
Perhaps he is simply attempting to convey a subtle joke, and if true I
suggest it is in bad taste. But then I am not much of an expert on
mathematics so I will leave it to others to clarify this issue in this
particular instance.



Regards,

Steve Thompson

--
"If I had a nickel for every time some idiot called me about a
computer problem that turned out to be user error, I would be able to
retire and spend the rest of my days cultivating clues in my backyard
hillside garden." -- MysteryDog in 24hoursupport.helpdesk.

Malcolm McLean

unread,
Apr 16, 2015, 1:44:37 PM4/16/15
to
My second hit on "function minimisation algorithms"

https://www.gnu.org/software/gsl/manual/html_node/Multidimensional-Minimization.html


This chapter describes routines for finding minima of arbitrary multidimensional functions. The library provides low level components for a variety of iterative minimizers and convergence tests. These can be combined by the user to achieve the desired solution, while providing full access to the intermediate steps of the algorithms. Each class of methods uses the same framework, so that you can switch between minimizers at runtime without needing to recompile your program. Each instance of a minimizer keeps track of its own state, allowing the minimizers to be used in multi-threaded programs. The minimization algorithms can be used to maximize a function by inverting its sign.

Malcolm McLean

unread,
Apr 16, 2015, 1:52:24 PM4/16/15
to
malloc(10) costs 10 units, malloc(20) costs 20 units
realloc(ptr, 10) costs 10 units, realloc(ptr, 20) costs 20 units.

So if we start with ptr = malloc(1) and then call
ptr = realloc(ptr, 2), ptr = realloc(ptr, 3) etc it's an expensive algorithm,
costing N^2/2 (plus N/2, since I'm under attack for being unclear and wrong
in every respect).

We're only allowed one buffer, I must clarify. Otherwise you can generate lots
of buffers of one byte and trivially get an optimal answer.
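A quick sketch of that cost model in C (the function names and loop bounds are illustrative, not from the thread) shows the quadratic cost of growing one byte at a time against the cost of doubling:

```c
#include <assert.h>

/* Hypothetical cost model from the post: each malloc/realloc of n bytes
   costs n units, and only one buffer is allowed. */

/* Cost of growing one byte at a time until the buffer holds n bytes:
   realloc(ptr, 1) + realloc(ptr, 2) + ... + realloc(ptr, n). */
static long cost_linear(long n) {
    long total = 0;
    for (long size = 1; size <= n; size++)
        total += size;          /* realloc(ptr, size) costs size units */
    return total;
}

/* Cost of doubling until the buffer holds at least n bytes:
   1 + 2 + 4 + ... stopping at the first size >= n. */
static long cost_doubling(long n) {
    long total = 0;
    for (long size = 1; ; size *= 2) {
        total += size;
        if (size >= n)
            return total;
    }
}
```

For n = 1000 the linear policy costs 500500 units against 2047 for doubling, which is the N^2/2 versus O(N) difference under discussion.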

Malcolm McLean

unread,
Apr 16, 2015, 2:03:21 PM4/16/15
to
On Thursday, April 16, 2015 at 4:55:44 PM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
> > On Wednesday, April 15, 2015 at 8:29:57 PM UTC+1, Keith Thompson wrote:
> >> Malcolm McLean <malcolm...@btinternet.com> writes:
> >> You're saying that you can think of the function (which function?) in
> >> two entirely inconsistent ways. Using crude ASCII graphics (you may
> >> need to use a fixed-width font to view this), you first describe a
> >> function whose graph would look something like this (the + and |
> >> characters are meant to be vertically aligned):
> >>
> >> +-------------
> >> |
> >> ------------+
> >>
> >> and then a function whose graph would look something like this:
> >>
> >> ----------\ /---------
> >> \ /
> >> V
> >>
> >> You say these are two ways to think of the same function. I say
> >> they're different functions.
> >>
> >

Graph 2 is the integral of graph 1, assuming graph 1 has been surrounded by zeros
(it should have been drawn with the V extending indefinitely).
So we can think of the function as something that gives us a height (y is a height,
and so we have a step), or we can think of it as something that gives us a derivative
(so we've got a valley).
The second way is the way most people prefer to visualise what we're doing, but
it doesn't matter if another mental image works better for you. The fundamental
thing to understand is that all the information has been stripped except "too
high", "too low" and "just right".
>
> For all I know, that might be true. But your hand-waving arguments
> about *why* it's true (because it's some kind of "compromise" between
> linear and exponential growth?) have not furthered your claim.
>
Because we don't know the probability density function of the point at which
f(x) = 0. We do know that it's somewhere between uniform with a limit and
exponential decay from zero.
So we need to search using a function that somehow compromises between
a binary search in linear space and a binary search in log space.

Keith Thompson

unread,
Apr 16, 2015, 2:22:14 PM4/16/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Thursday, April 16, 2015 at 4:23:10 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> [...]
>> > If fact we say "the function minimisation problem", "techniques for
>> > minimisation" and so on, and everyone understands that if you are
>> > looking for a maximum you simply invert the sign. It's legitimately
>> > a bit confusing if you're not familiar with the jargon.
>>
>> You've asserted repeatedly that "minimisation" is commonly used as
>> jargon for finding either the minimum or the maximum value of a
>> function. If that's true, you should be able to cite sources that use
>> the word that way.
>>
>> Please do so.
>>
>> In my own cursory Google search, I see an explicit distinction between
>> "minimization" and "maximization", with the non-jargon meanings I'd
>> expect. Of course that doesn't prove anything, but it is suggestive.
>
> My second hit on "function minimisation algorithms"
>
> https://www.gnu.org/software/gsl/manual/html_node/Multidimensional-Minimization.html
>
> This chapter describes routines for finding minima of arbitrary
> multidimensional functions. The library provides low level components
> for a variety of iterative minimizers and convergence tests. These can
> be combined by the user to achieve the desired solution, while
> providing full access to the intermediate steps of the
> algorithms. Each class of methods uses the same framework, so that you
> can switch between minimizers at runtime without needing to recompile
> your program. Each instance of a minimizer keeps track of its own
> state, allowing the minimizers to be used in multi-threaded
> programs. The minimization algorithms can be used to maximize a
> function by inverting its sign.

Read the last sentence:

The minimization algorithms can be used to maximize a function by
inverting its sign.

That supports my position, not yours. The quoted text does not use the
word "minimization" to refer to finding the maximum value of a function.
It suggests a technique that can use a minimization algorithm to
determine the maximum value of a function *by determining the minimum
value of a different function*, namely the original function with the
result's sign inverted. That's both useful and obvious.

Minimization means finding the minimum value of a function. It does
not mean finding the maximum value of a function; that's called
"maximimization".

At this point, my working hypothesis is that you're simply wrong
and unwilling to admit it. You've done this kind of thing before,
and it makes trying to communicate with you very frustrating.

Keith Thompson

unread,
Apr 16, 2015, 2:35:22 PM4/16/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Thursday, April 16, 2015 at 4:55:44 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> > On Wednesday, April 15, 2015 at 8:29:57 PM UTC+1, Keith Thompson wrote:
>> >> Malcolm McLean <malcolm...@btinternet.com> writes:
>> >> You're saying that you can think of the function (which function?) in
>> >> two entirely inconsistent ways. Using crude ASCII graphics (you may
>> >> need to use a fixed-width font to view this), you first describe a
>> >> function whose graph would look something like this (the + and |
>> >> characters are meant to be vertically aligned):
>> >>
>> >> +-------------
>> >> |
>> >> ------------+
>> >>
>> >> and then a function whose graph would look something like this:
>> >>
>> >> ----------\ /---------
>> >> \ /
>> >> V
>> >>

(I cleaned up the alignment.)

>> >> You say these are two ways to think of the same function. I say
>> >> they're different functions.
>
> Graph 2 is the integral of graph 1, assuming graph 1 has been surrounded by zeros


It's been a long time since I studied calculus, but I'm pretty sure
that's nonsense. Graph 2 might be the derivative of the inverse of
graph 1.

Graphs 1 and 2 were my attempt to visualize the functions that you
described upthread.

[SNIP]

Clearly trying to figure out what you actually mean is a waste of my
time.

Martin Shobe

unread,
Apr 16, 2015, 3:12:42 PM4/16/15
to
On 4/15/2015 11:38 PM, Steve Thompson wrote:
> What is more difficult is inferring the correct meaning in the case
> where the nominal subject under discussion is some aspect of
> mathematics. Math simply does not work if the definition of "minimum"
> or "maximum" deviates from the accepted definition, or if the meaning
> of terms is otherwise similarly distorted. I doubt Malcolm would ever
> assert 4 = 5 even for large values of 4, so it is extremely difficult
> to follow what he means when he suggests a minimum is the same as a
> maximum along with a bit of handwavium about signs.

> In this light his usage seems rather Orwellian, if not outright wrong.
> Perhaps he is simply attempting to convey a subtle joke, and if true I
> suggest it is in bad taste. But then I am not much of an expert on
> mathematics so I will leave it to others to clarify this issue in this
> particular instance.

I can make some sense of what Malcolm is trying to say about minimum and
maximum. < and > are duals. This means that you can, through the
appropriate transformations, apply any knowledge you have of one to the
other. In particular, any technique for finding a minimum has a
corresponding technique for finding a maximum with the same costs,
benefits, etc.

The bit about roots and minimums is more obscure. Yes, you can transform
the problem of finding a root for a function, f, into the problem of
finding the minimum of |f|. However, I can't see any advantage to it.
You've lost information by doing so (in particular, the sign of f(x)).
In the particular case given, you are stuck with a try everything until
you find a zero approach instead of being able to perform a binary search.

Martin Shobe

Ben Bacarisse

unread,
Apr 16, 2015, 5:11:47 PM4/16/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:

> On Thursday, April 16, 2015 at 4:47:17 PM UTC+1, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> <snip>
>> > For example, we've opened a stream, which we know will send us a
>> > human-composed message in English text. We want to store it in a
>> > memory buffer. Every time we ask for memory, we get charged on a
>> > per byte basis (the malloc zeroes out memory, so if you ask for 20K,
>> > that's ten times as expensive as asking for 2K). realloc() has
>> > exactly the same cost as malloc().
>> > So, clearly, our best strategy is to get a statistical distribution
>> > of the messages coming in, and then work out some sort of formula
>> > that minimises the number of calls.
>>
>> Can you prove that (or have I made the mistake of taking you literally
>> again)? Given the cost model you've proposed, the optimal strategy
>> would seem to involve minimising the number of bytes allocated, not the
>> number of calls made.
>>
> malloc(10) costs 10 units, malloc(20) costs 20 units
> realloc(ptr, 10) costs 10 units, realloc(ptr, 20) costs 20 units.

Ah, so the cost is not per byte used but per byte asked for. I suppose
I could have inferred that from the remark about realloc having the same
cost as malloc, but I took that to mean that reallocated memory costs
the same as allocated-once memory.

> So if we start with ptr = malloc(1) and then call
> ptr = realloc(ptr, 2), ptr = realloc(ptr, 3) etc it's an expensive algorithm,
> costing N^2/2 (plus N/2, since I'm under attack for being unclear and wrong
> in every respect).

That's what big O and friends are for -- the cost function is O(N^2).
I've lost track of where this was going, but if there is a connection
with golden section search I am now in a position to follow it.

> We're only allowed one buffer, I must clarify. Otherwise you can generate lots
> of buffers of one byte and trivially get an optimal answer.

It sounds like you are setting this as a challenge. I am not aware of
any optimal strategy, but, in terms of memory actually used and copied,
any exponential growth pattern avoids the quadratic behaviour of the
naive approach. Are you saying that a Fibonacci pattern of allocations
is known to have some theoretical advantage? I don't recall ever seeing
that used (though a Fibonacci-like pattern of sizes is sometimes used in
allocators that have fixed-sized blocks held in free lists).

--
Ben.

David Brown

unread,
Apr 17, 2015, 3:28:47 AM4/17/15
to
On 16/04/15 21:12, Martin Shobe wrote:
> On 4/15/2015 11:38 PM, Steve Thompson wrote:
>> What is more difficult is inferring the correct meaning in the case
>> where the nominal subject under discussion is some aspect of
>> mathematics. Math simply does not work if the definition of "minimum"
>> or "maximum" deviates from the accepted definition, or if the meaning
>> of terms is otherwise similarly distorted. I doubt Malcolm would ever
>> assert 4 = 5 even for large values of 4, so it is extremely difficult
>> to follow what he means when he suggests a minimum is the same as a
>> maximum along with a bit of handwavium about signs.
>
>> In this light his usage seems rather Orwellian, if not outright wrong.
>> Perhaps he is simply attempting to convey a subtle joke, and if true I
>> suggest it is in bad taste. But then I am not much of an expert on
>> mathematics so I will leave it to others to clarify this issue in this
>> particular instance.
>
> I can make some sense of what Malcolm is trying to say about minimum and
> maximum. < and > are duals. This means that you can, through the
> appropriate transformations, apply any knowledge you have of one to the
> other. In particular, any technique for finding a minimum has a
> corresponding technique for finding a maximum with the same costs,
> benefits, etc.

Of course maximums and minimums are duals, and problems of one type can
be easily transformed into problems of the other type. But if I want
someone to pass me the biggest cake from a tray of cakes, I ask for the
biggest one - I don't ask for the smallest one and assume they know what
I meant. This is what Malcolm has been doing.

>
> The bit about roots and minimums is more obscure. Yes, you can transform
> the problem of finding a root for a function, f, into the problem of
> finding the minimum of |f|. However, I can't see any advantage to it.
> You've lost information by doing so (in particular, the sign of f(x)).
> In the particular case given, you are stuck with a try everything until
> you find a zero approach instead of being able to perform a binary search.
>

Correct. Mathematically, functions like abs() and sign() are extremely
unpleasant to work with. If you want to transform finding a root into
finding a minimum, you would work with f²(x) instead. But it is
unlikely to be of any help - using numerical methods it is usually
easier to find a zero than to find a minimum (and using algebraic
methods, you find minimums by finding the zeros of f'(x) ).
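A sketch of the point about sign information: bisection uses the sign of f(x) at the midpoint to halve the search interval, which is exactly what minimising |f| or f²(x) throws away. (Names and the sample cubic are mine; this assumes f(a) and f(b) have opposite signs.)

```c
#include <assert.h>
#include <math.h>

/* Bisection root-finding: the sign of f at the midpoint tells us which
   half the root is in. Minimising |f| or f^2 discards that sign, which
   is why the transformation loses the ability to do a binary search. */
static double bisect(double (*f)(double), double a, double b, double tol) {
    while (b - a > tol) {
        double m = (a + b) / 2.0;
        if ((f(a) < 0.0) == (f(m) < 0.0))
            a = m;               /* same sign: root is in the right half */
        else
            b = m;               /* sign change: root is in the left half */
    }
    return (a + b) / 2.0;
}

/* The example from upthread: x^3 - 2x^2, with a root at x = 2. */
static double cubic(double x) { return x * x * x - 2.0 * x * x; }
```

With f(1) < 0 and f(3) > 0, bisect converges to the root at 2 in about 30 halvings, where a sign-blind search over |f| has nothing to steer by.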


David Brown

unread,
Apr 17, 2015, 4:23:09 AM4/17/15
to
There is /no/ theoretical optimal strategy, because the size of the
incoming data is unbounded. For any strategy (exponential, linear,
quadratic, constant, Fibonacci, whatever you like), you would generate a
sequence of allocation sizes - s_i, (say 1, 1, 2, 3, 5, 8, 13, etc. for
a Fibonacci strategy). No matter what sequence you pick, there will be
a size N of the incoming data such that for all data sizes n > N, the
sequence 2*s_i (2, 2, 4, 6, 10, 16, 26, etc., in this case) will be more
efficient.

I haven't proved this assertion, but I am quite confident that I /could/
prove it. I know that you, Ben, have a lot of experience in algorithm
complexity - does your gut feeling agree with me here?
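A quick numeric check of the assertion, under the thread's bytes-asked-for cost model (the helper is my own sketch): growing through scale*F_1, scale*F_2, ... until the buffer reaches n, the doubled Fibonacci sequence is already cheaper than the plain one for n = 1000.

```c
#include <assert.h>

/* Total cost (in per-byte units) of reallocating through the sequence
   scale*F_1, scale*F_2, ... until the buffer holds at least n bytes.
   scale = 1 gives the plain Fibonacci strategy 1, 1, 2, 3, 5, 8, ...;
   scale = 2 gives the doubled sequence 2, 2, 4, 6, 10, 16, ... */
static long fib_cost(long n, long scale) {
    long a = 1, b = 1, total = 0;
    for (;;) {
        long size = scale * a;
        total += size;           /* realloc(ptr, size) costs size units */
        if (size >= n)
            return total;
        long next = a + b;       /* advance to the next Fibonacci number */
        a = b;
        b = next;
    }
}
```

For n = 1000 the plain sequence costs 4180 units (it has to climb all the way to 1597), while the doubled one costs 3192 (it stops at 1220), consistent with the claim that 2*s_i eventually wins for any fixed sequence s_i.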


In the real world, of course, there are different factors - the cost of
a malloc/realloc has a significant constant factor as well as a cost
based on the size, and there are practical considerations about the size
of the memory blocks that work best. You might find that allocating in
units of 4K - 16 bytes is the most efficient because it suits the malloc
implementation - and you would use 16 bytes from that block to implement
a linked list and avoid the realloc and copying. And possibly,
depending on the source of the data, you would use statistical analysis
on the incoming streams to guess the sizes of the data and thus make
fewer, bigger allocations without wasting too much space.


In similar situations, it is not uncommon to use an exponential growth
(to avoid the quadratic behaviour, as you said) - but powers of 2 are
more typical because they are simple and work well. Sometimes you want
exponential growth but not as aggressive, or with more steps - and
Fibonacci growth would be one possibility. But there is no theoretical
advantage over it compared to, say, powers of 1.5 or powers of sqrt(2),
or non-uniform allocation (B, 1.5*B, 2B, 3B, 4B, 6B, 8B, 12B, 16B -
approximating powers of sqrt(2) but using nicer numbers).


David Brown

unread,
Apr 17, 2015, 4:28:38 AM4/17/15
to
On 16/04/15 20:35, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>> On Thursday, April 16, 2015 at 4:55:44 PM UTC+1, Keith Thompson wrote:
>>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>>> On Wednesday, April 15, 2015 at 8:29:57 PM UTC+1, Keith Thompson wrote:
>>>>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>>>> You're saying that you can think of the function (which function?) in
>>>>> two entirely inconsistent ways. Using crude ASCII graphics (you may
>>>>> need to use a fixed-width font to view this), you first describe a
>>>>> function whose graph would look something like this (the + and |
>>>>> characters are meant to be vertically aligned):
>>>>>
>>>>> +-------------
>>>>> |
>>>>> ------------+
>>>>>
>>>>> and then a function whose graph would look something like this:
>>>>>
>>>>> ----------\ /---------
>>>>> \ /
>>>>> V
>>>>>
>
> (I cleaned up the alignment.)
>
>>>>> You say these are two ways to think of the same function. I say
>>>>> they're different functions.
>>
>> Graph 2 is the integral of graph 1, assuming graph 1 has been surrounded by zeros
>
>
> It's been a long time since I studied calculus, but I'm pretty sure
> that's nonsense. Graph 2 might be the derivative of the inverse of
> graph 1.

You are correct, assuming that the vertical line in graph 1 is at an
angle rather than vertical - or that the valley in graph 2 is zero width
and infinitely deep, and assuming we are using engineer-style maths
(which tends to gloss over details like infinities) and not
mathematician maths (since mathematicians get upset about "infinitely
deep" valleys).

David Brown

unread,
Apr 17, 2015, 4:32:43 AM4/17/15
to
On 16/04/15 17:55, Keith Thompson wrote:

>
> And you assert that a Fibonacci sequence is the best strategy, yes?
>
> For all I know, that might be true. But your hand-waving arguments
> about *why* it's true (because it's some kind of "compromise" between
> linear and exponential growth?) have not furthered your claim.
>
> (I found a Wikipedia article discussing something like this, suggesting
> a Fibonacci-like sequence as an optimal strategy for this kind of thing.
> I don't remember the title or URL.)
>

Fibonacci-like sequences can be optimal for certain specific search types:

<http://en.wikipedia.org/wiki/Golden_section_search>
<http://en.wikipedia.org/wiki/Fibonacci_search_technique>

I don't think these particular Wikipedia articles are very clear, but
they are far better than Malcolm's random walk!

There is no reason that I can see to correlate the descriptions in these
articles with any of the problems Malcolm has been describing, no matter
which combination of unspecified assumptions he chooses to apply.
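For concreteness, here is a minimal golden-section search along the lines of the first article (the function name and tolerance handling are my choices; it re-evaluates both interior points each pass rather than reusing one, so it is a sketch, not a tuned implementation):

```c
#include <assert.h>
#include <math.h>

/* Golden-section search for the minimum of a unimodal function on [a, b].
   Each pass shrinks the bracket by the factor 1/phi ~ 0.618. */
static double golden_min(double (*f)(double), double a, double b, double tol) {
    const double invphi = (sqrt(5.0) - 1.0) / 2.0;
    double c = b - invphi * (b - a);     /* left interior point  */
    double d = a + invphi * (b - a);     /* right interior point */
    while (fabs(b - a) > tol) {
        if (f(c) < f(d))
            b = d;                       /* minimum lies in [a, d] */
        else
            a = c;                       /* minimum lies in [c, b] */
        c = b - invphi * (b - a);
        d = a + invphi * (b - a);
    }
    return (a + b) / 2.0;
}

/* A unimodal test function with its minimum at x = 2. */
static double valley(double x) { return (x - 2.0) * (x - 2.0); }
```

Note this requires a unimodal function on a known bracket, which is why it does not obviously map onto the open-ended buffer-growing problem in this thread.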

Richard Heathfield

unread,
Apr 17, 2015, 4:51:47 AM4/17/15
to
On 17/04/15 09:28, David Brown wrote:

<snip>

> You are correct, assuming that the vertical line in graph 1 is at an
> angle rather than vertical - or that the valley in graph 2 is zero width
> and infinitely deep, and assuming we are using engineer-style maths
> (which tends to gloss over details like infinities) and not
> mathematician maths (since mathematicians get upset about "infinitely
> deep" valleys).

To be fair, they're not the only ones who get upset about infinitely
deep valleys. Geographers, polar explorers, bridge-builders...

Malcolm McLean

unread,
Apr 17, 2015, 5:08:20 AM4/17/15
to
You could square a function, then pass it to a general-purpose minimiser.

Tim Rentsch

unread,
Apr 17, 2015, 5:13:23 AM4/17/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:

> On Wednesday, April 15, 2015 at 4:17:33 PM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> Malcolm, if other people repeatedly fail to understand what I'm talking
>> about, I usually consider the possibility that it's because I'm being
>> unclear.
>
> But I'm not being unclear.

Considering the reactions of numerous responders, you might want
to reconsider that opinion. Just a suggestion.

Malcolm McLean

unread,
Apr 17, 2015, 5:17:30 AM4/17/15
to
I've been accused of talking nonsense.
Now I might well be wrong, I'm not a mathematician.

But I've posed a problem and proposed an answer. No-one else has proposed
an alternative solution, much less proved that their alternative is correct.
They've just asserted that I am talking nonsense and nit picked at things
like the jargon use of "minimise" to mean "any search for extrema".
(It's fair enough for someone who doesn't understand to be confused, I
don't blame anyone for that. It's the nit picking that gets wearing).

But now you can check me out.

Write that function, with various buffer-growing strategies.

Then test it on a series of inputs with various hypotheses for what the statistical
distribution of "human-composed English language messages" might be. Try linear
between 1 word and 1000 words, linear between 20,000 words and 200,000
words (novels), all 25 word tweets, half gaussian, log-normal, and anything else
you can think of that might be realistic.

Am I right, wrong, or talking nonsense?

Morris Dovey

unread,
Apr 17, 2015, 6:56:32 AM4/17/15
to
On 4/17/15 4:17 AM, Malcolm McLean wrote:

> Am I right, wrong, or talking nonsense?

Yes. I went through this exercise and settled for a function that
recursed until end of input, then allocated an exact-sized buffer, and
stored the input into that buffer as each instance returned.

The cost of char input was small, and the recursive call imposed very
little execution overhead. There is a risk of running out of (stack)
space with a 'large' input, so a certain amount of prudence is required
- but 200000 words wouldn't be a problem on any of my last half-dozen or
so machines.

Code has already been posted to c.l.c - and it's online at
http://www.iedu.com/mrd/c/getsm.c
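A minimal sketch of that scheme (my reconstruction for illustration, not the code at that URL): recurse once per character until end of input, allocate a single exact-sized buffer at the deepest call, and fill it in as the calls unwind.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line recursively: each call consumes one character; the call
   that sees EOF or '\n' makes the single, exact-sized allocation, and
   each caller stores its character as the recursion returns.
   Deep inputs consume stack proportionally, as noted above. */
static char *read_rec(FILE *fp, size_t depth) {
    int ch = getc(fp);
    if (ch == EOF || ch == '\n') {
        char *buf = malloc(depth + 1);   /* one allocation, exact size */
        if (buf != NULL)
            buf[depth] = '\0';
        return buf;
    }
    char *buf = read_rec(fp, depth + 1);
    if (buf != NULL)
        buf[depth] = (char)ch;
    return buf;
}
```

Under the per-byte cost model this is optimal in allocation cost (exactly the bytes needed, once), trading it for O(length) stack depth.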

--
Morris Dovey
http://www.iedu.com/Solar

Malcolm McLean

unread,
Apr 17, 2015, 7:37:38 AM4/17/15
to
Ok, so that's the first best attempt to solve the problem.

Anyone offering anything superior?

Martin Shobe

unread,
Apr 17, 2015, 10:33:31 AM4/17/15
to
On 4/17/2015 2:28 AM, David Brown wrote:
> On 16/04/15 21:12, Martin Shobe wrote:
>> I can make some sense of what Malcolm is trying to say about minimum and
>> maximum. < and > are duals. This means that you can, through the
>> appropriate transformations, apply any knowledge you have of one to the
>> other. In particular, any technique for finding a minimum has a
>> corresponding technique for finding a maximum with the same costs,
>> benefits, etc.
>
> Of course maximums and minimums are duals, and problems of one type can
> be easily transformed into problems of the other type. But if I want
> someone to pass me the biggest cake from a tray of cakes, I ask for the
> biggest one - I don't ask for the smallest one and assume they know what
> I meant. This is what Malcolm has been doing.

Yes. His behavior reminds me of a "game" I sometimes see children play
where someone announces that they know something and then dribbles just
enough bits and pieces to keep the audience going.

>> The bit about roots and minimums is more obscure. Yes, you can transform
>> the problem of finding a root for a function, f, into the problem of
>> finding the minimum of |f|. However, I can't see any advantage to it.
>> You've lost information by doing so (in particular, the sign of f(x)).
>> In the particular case given, you are stuck with a try everything until
>> you find a zero approach instead of being able to perform a binary search.

> Correct. Mathematically, functions like abs() and sign() are extremely
> unpleasant to work with. If you want to transform finding a root into
> finding a minimum, you would work with f²(x) instead. But it is
> unlikely to be of any help - using numerical methods it is usually a
> easier to find a zero than to find a minimum (and using algebraic
> methods, you find minimums by finding the zeros of f'(x) ).

Agreed. I really can't think of a situation where you'd want to make
such a transformation.

Martin Shobe

Martin Shobe

unread,
Apr 17, 2015, 10:43:09 AM4/17/15
to
Squaring has the same problems. In the case given, squaring the function
and taking the absolute value of the function result in the exact same
function. You are once again forced into a brute-force search of the
entire space because you've lost all the information that could have
been gleaned from each test point.

Martin Shobe

Martin Shobe

unread,
Apr 17, 2015, 11:04:41 AM4/17/15
to
There is nothing superior in terms of the metric you've chosen. This
method mallocs exactly the amount needed for each input. Malloc any
less, and you don't have enough for that input.

Martin Shobe

Malcolm McLean

unread,
Apr 17, 2015, 11:04:49 AM4/17/15
to
On Friday, April 17, 2015 at 3:43:09 PM UTC+1, Martin Shobe wrote:
>
> Squaring has the same problems. In the case given, squaring the function
> and taking the absolute value of the function result in the exact same
> function. You are once again forced into a brute-force search of the
> entire space because you've lost the all the information that could have
> been gleaned from each test point.
>
The entire space is zero to infinity. So if our only option is a brute force search
of the entire space, the algorithm will never terminate.
But that doesn't mean we don't have a better strategy.

Tim Rentsch

unread,
Apr 17, 2015, 11:07:54 AM4/17/15
to
An extremely poor idea. In almost all cases this will entail
more work and also produce worse results.

Malcolm McLean

unread,
Apr 17, 2015, 11:38:43 AM4/17/15
to
Ok, so that's the best attempt from Martin Shobe.



Keith Thompson

unread,
Apr 17, 2015, 1:31:08 PM4/17/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:
[...]
> I've been accused of talking nonsense.
> Now I might well be wrong, I'm not a mathematician.
>
> But I've posed a problem and proposed an answer. No-one else has proposed
> an alternative solution, much less proved that their alternative is correct.
> They've just asserted that I am talking nonsense and nit picked at things
> like the jargon use of "minimise" to mean "any search for extrema".
> (It's fair enough for someone who doesn't understand to be confused, I
> don't blame anyone for that. It's the nit picking that gets wearing).

I do understand your claim that "minimise" is a jargon term for
"any search for extrema". I simply don't believe it.

Ben Bacarisse

unread,
Apr 17, 2015, 3:52:25 PM4/17/15
to
Martin Shobe did provide a solution. He commented on the fact that you
appear to have misunderstood the code posted by Morris Dovey.

Morris's code is not a solution to the problem you posed -- it's a way
to avoid the problem altogether by using the stack instead. Your
problem (as posed) seemed to be to find the optimal multiplier for
growing an input buffer.

--
Ben.

Ben Bacarisse

unread,
Apr 17, 2015, 4:09:33 PM4/17/15
to
Malcolm McLean <malcolm...@btinternet.com> writes:

> On Thursday, April 16, 2015 at 10:11:47 PM UTC+1, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>>
>> It sounds like you are setting this as a challenge. I am not aware of
>> any optimal strategy, but, in terms of memory actually used and copied,
>> any exponential growth pattern avoids the quadratic behaviour of the
>> naive approach. Are you saying that a Fibonacci pattern of allocations
>> is known to have some theoretical advantage? I don't recall ever seeing
>> that used (though a Fibonacci-like pattern of sizes is sometimes used in
>> allocators that have fixed-sized blocks held in free lists).
>>
> I've been accused of talking nonsense.
> Now I might well be wrong, I'm not a mathematician.
>
> But I've posed a problem and proposed an answer.

I missed the proposed answer. What was it?

> No-one else has proposed
> an alternative solution, much less proved that their alternative is correct.
> They've just asserted that I am talking nonsense and nit picked at things
> like the jargon use of "minimise" to mean "any search for extrema".
> (It's fair enough for someone who doesn't understand to be confused, I
> don't blame anyone for that. It's the nit picking that gets wearing).
>
> But now you can check me out.
>
> Write that function, with various buffer-growing strategies.
>
> Then test it on a series of inputs with various hypotheses for what
> the statistical distribution of "human-composed English language
> messages" might be. Try linear between 1 word and 1000 words, linear
> between 20,000 words and 200,000 words (novels), all 25-word tweets,
> half-Gaussian, log-normal, and anything else you can think of that
> might be realistic.

Why would anyone spend time doing this work, especially with the rather
artificial cost model you've proposed?

In practice, it's a non-problem for most people. They use some fixed
multiplier (so as to avoid the O(N^2) cost from a linear policy) and
they are done. I doubt anyone uses a multiplier much larger than 3 or
one much smaller than 1.5, but I also doubt anyone chooses it based on
much study of the distribution of inputs. I'd consider setting a
minimum start size based on such a study, but that's about it.

> Am I right, wrong, or talking nonsense?

This is a new topic. Previously you appeared to be talking about
searching for extrema (or possibly zeros) in costly-to-evaluate
functions. The only claim I can recall you've made in relation to
allocation strategies is this:

"So, clearly, our best strategy is to get a statistical distribution
of the messages coming in, and then work out some sort of formula that
minimises the number of calls".

That's hard to assess as right or wrong because of the vagueness of "some
sort of formula". Given the cost model you've proposed, it's certainly
not obvious that fewer calls corresponds to lower costs, though it will
do for some formulas and some distributions of inputs.

--
Ben.
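Ben's "fixed multiplier" approach can be sketched in C. The function name
and details below are mine, not from the thread: a line reader that grows
its buffer geometrically, so the total bytes copied across all realloc
calls stay O(N) rather than the O(N^2) of byte-at-a-time growth.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of unknown length from fp, growing the buffer by a
   fixed multiplier (here 2x) so total copying is O(N), not O(N^2).
   Caller frees the result.  Returns NULL on EOF or allocation failure. */
static char *read_line(FILE *fp)
{
    size_t cap = 16;          /* modest start; a study of typical input
                                 lengths might justify a larger minimum */
    size_t len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 >= cap) {               /* room for char + '\0'? */
            char *tmp = realloc(buf, cap * 2);   /* the fixed multiplier */
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char) ch;
    }
    if (len == 0 && ch == EOF) {            /* nothing read at all */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

With a multiplier k > 1, each byte is copied at most about k/(k-1) times
on average, which is why any factor comfortably above 1 avoids the
quadratic behaviour.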

Martin Shobe

Apr 17, 2015, 6:08:59 PM
In the given case, the search space is countable and the functions all
have roots. There are brute force searches which always terminate.

> But that doesn't mean we don't have better strategy.

Such as don't square it and send it to a general find minimum function.

Martin Shobe
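Martin's suggestion can be illustrated with a small sketch. The function
f below is a stand-in example of mine, not from the thread: if what you
actually want is a root of f, bisect for it directly rather than squaring
f and handing f^2 to a general minimiser, since near a root f^2 has a zero
derivative (which slows a generic minimiser) while bisection on f's sign
is guaranteed to terminate.

```c
#include <math.h>

/* Stand-in cost function with a single root at cbrt(2). */
static double f(double x)
{
    return x * x * x - 2.0;
}

/* Bisection: assumes f(lo) and f(hi) have opposite signs.  Halves the
   bracket each step, so it always terminates within log2(range/tol)
   iterations. */
static double bisect(double lo, double hi, double tol)
{
    while (hi - lo > tol) {
        double mid = (lo + hi) / 2.0;
        if (f(lo) * f(mid) <= 0.0)
            hi = mid;       /* root is in the lower half */
        else
            lo = mid;       /* root is in the upper half */
    }
    return (lo + hi) / 2.0;
}
```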

Malcolm McLean

Apr 17, 2015, 10:01:30 PM
On Friday, April 17, 2015 at 9:09:33 PM UTC+1, Ben Bacarisse wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
> Why would anyone spend time doing this work, especially with the rather
> artificial cost model you've proposed?
>
> In practice, it's a non-problem for most people. They use some fixed
> multiplier (so as to avoid the O(N^2) cost from a linear policy) and
> they are done. I doubt anyone uses a multiplier much larger than 3 or
> one much smaller than 1.5, but I also doubt anyone chooses it based on
> much study of the distribution of inputs. I'd consider setting a
> minimum start size based on such a study, but that's about it.
>
So most people say "the optimal answer is to grow the buffer by a
factor of between 1.5 and 3". Fair enough. Doesn't that observation
interest you?
You've got two costs, the cost of making a call / doing a probe,
and the cost of taking a chunk. I've unified them, not in a very
artificial way.
The problem's not contrived, either, lots of people knock up such
code to cope with more or less that situation.
>
> "So, clearly, our best strategy is to get a statistical distribution
> of the messages coming in, and then work out some sort of formula that
> minimises the number of calls".
>
> That's hard to assess as right or wrong because of the vagueness of "some
> sort of formula". Given the cost model you've proposed, it's certainly
> not obvious that fewer calls corresponds to lower costs, though it will
> do for some formulas and some distributions of inputs.
>
If you know the statistical distribution of the messages coming in,
then you can optimise the strategy to cope with that distribution.
To take a trivial example, if you know that every message is
exactly 1K, you've got an optimal strategy there and then.

But all you know is that the messages are English language text composed
by a human. No one has actually sent you any messages yet, you don't
really know what to expect, other than some very vague ballparks.

So really the question boils down to, what's the statistical distribution
of statistical distributions? But we don't need to go there yet.

David Brown

Apr 18, 2015, 7:16:32 AM
And for this problem, there is /no/ solution.

Malcolm McLean

Apr 18, 2015, 7:22:48 AM
On Saturday, April 18, 2015 at 12:16:32 PM UTC+1, David Brown wrote:
>
> And for this problem, there is /no/ solution.
>
You can easily test various strategies, then feed them messages with
plausible length distributions, and see how each performs under
each hypothesis.
However, you can't really say which of the plausible distributions
is going to be most likely. I'm not quite sure what a "statistical
distribution of statistical distributions" really means in statistical
theory. Whilst I use a lot of mathematics for my job, I'm ultimately
not a mathematician.

David Brown

Apr 18, 2015, 8:09:04 AM
On 18/04/15 13:22, Malcolm McLean wrote:
> On Saturday, April 18, 2015 at 12:16:32 PM UTC+1, David Brown wrote:
>>
>> And for this problem, there is /no/ solution.
>>
> You can easily test various strategies, then feed them messages with
> plausible length distributions, and see how each performs under
> each hypothesis.

No, you can't do that. Your problem (as posed) is completely
hypothetical and unrealistic. You can't use trial and error. And even
if you /did/ use trial and error, you can't claim to have found an
optimum strategy - at best you could see that one strategy was better
than another on the test sample used.

Plenty of people (myself included) have given ideas and suggestions for
efficient and practical ways of dealing with a real-world variation of
your problem.

Maybe I'll write out a proof of my claim that there is no optimum
strategy for the problem you posed. (Of course, in trying to do so I
might find I was wrong!).

Malcolm McLean

Apr 18, 2015, 9:00:55 AM
On Saturday, April 18, 2015 at 1:09:04 PM UTC+1, David Brown wrote:
> On 18/04/15 13:22, Malcolm McLean wrote:
> > On Saturday, April 18, 2015 at 12:16:32 PM UTC+1, David Brown wrote:
> >>
> >> And for this problem, there is /no/ solution.
> >>
> > You can easily test various strategies, then feed them messages with
> > plausible length distributions, and see how each performs under
> > each hypothesis.
>
> No, you can't do that. Your problem (as posed) is completely
> hypothetical and unrealistic. You can't use trial and error. And even
> if you /did/ use trial and error, you can't claim to have found an
> optimum strategy - at best you could see that one strategy was better
> than another on the test sample used.
>
> Plenty of people (myself included) have given ideas and suggestions for
> efficient and practical ways of dealing with a real-world variation of
> your problem.
>
> Maybe I'll write out a proof of my claim that there is no optimum
> strategy for the problem you posed. (Of course, in trying to do so I
> might find I was wrong!).
>
We know that if we grow the buffer by one byte each time, we'll
get a very poor result. And we know that if we allocate space for the
maximum physically possible text message up front, we'll also get a
poor result. So there's a strategy in between which is better than
either, and the idea that the problem has no solution at all is
obviously false.

Ben's suggested that the optimal answer is to grow the buffer by a
factor between 1.5 and 3. Does that sound reasonable to you? What
about growing by 1.1 or by 10?
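One rough way to "check me out", as Malcolm puts it, is to count the bytes
copied by realloc-style growth for a message of n bytes under a given
growth factor. The function below is a sketch of that cost model (name and
starting capacity are mine); it counts only copying, not the per-call cost
Malcolm wants folded in.

```c
/* Bytes copied while growing a 16-byte buffer geometrically until it
   holds n bytes.  Each grow step copies the old capacity.  Pure cost
   model: no real allocation is performed. */
static unsigned long copy_cost(unsigned long n, double factor)
{
    unsigned long cap = 16;
    unsigned long copied = 0;

    while (cap < n) {
        copied += cap;                          /* bytes moved this step */
        unsigned long next = (unsigned long)(cap * factor);
        cap = (next > cap) ? next : cap + 1;    /* guard factors near 1 */
    }
    return copied;
}
```

For n = 1,000,000, a factor of 2 copies roughly n bytes in total, a factor
of 1.1 copies roughly 10n, and growing one byte at a time copies about
n^2/2 bytes: anything between 1.5 and 3 behaves much the same, and only
the extremes look terrible.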