Testing randomized algorithms


Aleksandar Makelov

Jun 13, 2012, 9:58:02 AM
to sy...@googlegroups.com
How does this work (and does it work) in sympy?

A classical example is a Monte Carlo algorithm. Is there an accepted way to test these in sympy? For one-sided Monte Carlo algorithms, we can easily test one direction in a deterministic way, but this doesn't feel very satisfactory.

Another example is the algorithms outputting random group elements that are supposed to be nearly uniformly distributed in the group. I can't think of any way to test these except by asking if the returned value is an element of the group. Again, not much fun.

Chris Smith

Jun 14, 2012, 1:13:57 AM
to sy...@googlegroups.com

Aleksandar Makelov

Jun 14, 2012, 2:04:19 AM
to sy...@googlegroups.com
Sorry, but the link appears to be broken.

Chris Smith

Jun 14, 2012, 3:38:17 AM
to sy...@googlegroups.com
On Thu, Jun 14, 2012 at 11:49 AM, Aleksandar Makelov
<amak...@college.harvard.edu> wrote:
> Sorry but the link appears to be broken
>

I copied the dot-dot link on the google search page. Try this:
http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf

Chris Smith

Jun 14, 2012, 3:55:59 AM
to sy...@googlegroups.com
On Wed, Jun 13, 2012 at 7:43 PM, Aleksandar Makelov
<amak...@college.harvard.edu> wrote:
> How does this work (and does it work) in sympy?
>
> A classical example is a Monte Carlo algorithm. Is there an accepted way to
> test these in sympy? For one-sided Monte Carlo algorithms, we can easily
> test one direction in a deterministic way, but this doesn't feel very
> satisfactory.

I see two aspects to a randomized algorithm: doing the right thing
with the random input and using good random number generation. Python
has addressed the latter, so we just need to make sure that the right
thing is being done with it. Let's say I had an MC integration
routine. If I sample the region randomly I should get a good result.
But however I sample it I need to be sure that I am doing the right
thing with the results. The latter (though it may give a bad answer
for bad input) is what I should test in sympy and I don't need more
than a minimum number of points to test it (depending on how many
branches in decisions there are for the algorithm) e.g.

```
def MC_integr(xrange, yrange, shape, rand_seq):
    area = xrange*yrange
    hit = miss = 0
    for x in rand_seq:
        for y in rand_seq:
            if Point(x, y) in shape:
                hit += 1
            else:
                miss += 1
        if tolerance(hit, miss):  # some routine that figures out
            break                 # if the precision is ok
    return hit/(hit + miss)*area
```

This routine could be tested with a rand_seq of 4 elements: one giving
a point inside and one outside; the returned area should be
xrange*yrange/2. The tolerance function could be tested independently
to see that it quits once hit + miss > 1000 (or whatever).
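To make that concrete, here is a runnable sketch along those lines. All names (mc_integrate, in_shape, lower_left) are invented for illustration, and the tolerance check is omitted to keep it minimal:

```python
# Sketch: testing the deterministic part of a hit-or-miss integrator
# by feeding it a fixed "random" sequence instead of an RNG.

def mc_integrate(xrange, yrange, in_shape, seq):
    """Hit-or-miss integration over [0, xrange] x [0, yrange],
    drawing the sample coordinates from `seq` instead of an RNG."""
    hit = miss = 0
    for x in seq:
        for y in seq:
            if in_shape(x, y):
                hit += 1
            else:
                miss += 1
    return hit / (hit + miss) * (xrange * yrange)

def lower_left(x, y):
    # "shape" = points strictly below the diagonal of the unit square
    return y < x

# With the fixed sequence [0.25, 0.75], exactly one of the four sample
# points, (0.75, 0.25), lies inside, so the answer is known a priori:
assert mc_integrate(1, 1, lower_left, [0.25, 0.75]) == 0.25
```

The test pins down the deterministic bookkeeping (the hit/miss counting and the area formula), not the quality of the sampling.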

>
> Another example is the algorithms outputting random group elements that are
> supposed to be nearly uniformly distributed in the group.

If they are selected by integers which themselves have been tested to
be uniformly distributed (i.e. python's generator) then I don't think
you have to test the sympy output. But perhaps I am misunderstanding.

krastano...@gmail.com

Jun 14, 2012, 4:26:09 AM
to sy...@googlegroups.com
I agree with what Chris said. I am not sure whether anybody has
expressed this idea clearly until now, but the goodness of random
numbers and the correctness of an algorithm based on random numbers
are two different things.

Chris Smith

Jun 14, 2012, 4:40:35 AM
to sy...@googlegroups.com
That's saying it more clearly. That's what I wanted to convey.

/c

krastano...@gmail.com

Jun 14, 2012, 4:48:21 AM
to sy...@googlegroups.com
> That's saying it more clearly. That's what I wanted to convey.
I actually thought I was quoting you :) Anyway, the credit is yours.

Chris Smith

Jun 14, 2012, 4:50:00 AM
to sy...@googlegroups.com
LOL! (OK, at least we know -- in this instance, after one iteration --
that I am self-consistent.)

/c

Aleksandar Makelov

Jun 14, 2012, 4:53:20 AM
to sy...@googlegroups.com
Yes, this way of testing randomized output (testing whether the right thing is done with random numbers) seems both reasonable and feasible. For anyone else following this discussion, see here: https://github.com/sympy/sympy/pull/1353#issuecomment-6322159 which is also relevant. One bad thing about it is that we'll have to add more arguments to the functions being tested (to store a particular precomputed randomized output) and make their code a bit more complex solely for the purposes of testing. I don't think that's such a drawback (well, it will slow down some functions that are frequently used by a really minor amount, but I guess that's OK).

Another potential downside is that in some algorithms many random choices are made, so coming up with a test might become ugly. I guess if we stick to small examples this won't be such a big pain.

My third concern is that this kind of testing is rather implementation-specific, so if someone replaces the algorithm with another possible randomized implementation, the tests are going to have to be rewritten.
>
> Another example is the algorithms outputting random group elements that are
> supposed to be nearly uniformly distributed in the group.

> If they are selected by integers which themselves have been tested to
> be uniformly distributed (i.e. python's generator) then I don't think
> you have to test the sympy output. But perhaps I am misunderstanding.
 
Well, python's generator is certainly used, but things are more complicated than that. For starters, the algorithm outputs elements that are *nearly* uniformly distributed, and if you look at them they don't really seem uniformly distributed at all (their *properties* are said to be uniform enough for most algorithms using random elements). I think that some sort of statistical testing is not going to help here (and such tests have a small probability of failing anyway), however something like the idea described above might work. Again, loading the precomputed random elements is probably going to be ugly.
 
 

krastano...@gmail.com

Jun 14, 2012, 6:58:06 AM
to sy...@googlegroups.com
By the way, if we are starting to implement nontrivial,
high-performance randomized algorithms, we should start thinking about
making it easy for the user to supply advanced random number
generators: quasi-random vs. pseudo-random, GNU Scientific Library
generators, numpy, etc.

And if the code is to be useful for research, we may want to add
optional cythonized modules.

All I am saying is that to compete with established software, we
should really think about performance.

Most of my comments are probably applicable only to the group theory
module, but I am mentioning it anyway.

Joachim Durchholz

Jun 14, 2012, 1:52:04 PM
to sy...@googlegroups.com
On 14.06.2012 09:38, Chris Smith wrote:
> I copied the dot-dot link on the google search page. Try this:
> http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf

That's related to testing RNGs.
I think Aleksandar was asking about testing code that uses RNGs.
(Yes, there's some overlap, but most issues are unique to each field.)

Aleksandar Makelov

Jun 17, 2012, 8:13:12 PM
to sy...@googlegroups.com
Yes, that's true - I choose to trust the RNGs :) I'm concerned with testing whether the actual algorithms use the random numbers in the right way. But it seems that we've reached some consensus on that. Anyway, what do you guys think about the concerns I pointed out in my previous post?

Joachim Durchholz

Jun 18, 2012, 1:06:27 AM
to sy...@googlegroups.com
On 18.06.2012 02:13, Aleksandar Makelov wrote:
> Yes, that's true, I choose to trust the RNGs :) I'm concerned with testing
> if the actual algorithms use the random numbers in the right way. But it
> seems that we've reached some consensus on that. Anyway, what do you guys
> think about the concerns I pointed out in my previous post though?

We've had that kind of problem in the past.
Current policy is to acquire a random seed and print it in the test
output, so that tests are free to use an RNG but a failing run can be
repeated if a bug shows up.
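A minimal sketch of that policy (illustrative only - this is not SymPy's actual test-runner code): pick a seed, print it, and seed the RNG with it, so a failing run can be reproduced by passing the printed seed back in.

```python
import random

def run_random_test(test, seed=None):
    # Acquire a seed (or reuse a supplied one to reproduce a failure),
    # print it, and seed the RNG before running the test.
    if seed is None:
        seed = random.randrange(2**32)
    print("random seed:", seed)
    random.seed(seed)
    return test()

# The same seed reproduces the same "random" run exactly:
a = run_random_test(random.random, seed=42)
b = run_random_test(random.random, seed=42)
assert a == b
```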

Aleksandar Makelov

Jun 23, 2012, 10:53:16 AM
to sy...@googlegroups.com
OK, I implemented the use of precomputed values instead of random numbers for testing randomized algorithms (among other things) in this pull request: https://github.com/sympy/sympy/pull/1377
My current solution for the implementation is to use a dictionary in order to supply all the variables that are otherwise provided by RNG in the body of a function. You're welcome to take a look!

Joachim Durchholz

Jun 23, 2012, 11:29:23 AM
to sy...@googlegroups.com
On 23.06.2012 16:53, Aleksandar Makelov wrote:
> OK, I implemented the use of precomputed values instead of random numbers
> for testing randomized algorithms (among other things) in this pull
> request: https://github.com/sympy/sympy/pull/1377

I haven't found that aspect (which may be because I didn't look too
closely).

What's the purpose of using precomputed values?

krastano...@gmail.com

Jun 23, 2012, 11:35:22 AM
to sy...@googlegroups.com
> What's the purpose of using precomputed values?
I am not sure how useful it is as a test; the idea, however, is to
check that the algorithm works correctly without the problems that
testing with random numbers brings.

Even if it is actually useless for testing the algorithm, it is still
useful for testing the 2to3 translator that we use for python3.

Aleksandar Makelov

Jun 23, 2012, 11:38:52 AM
to sy...@googlegroups.com
We want to make sure that the right thing is done with the output from the RNGs, so we manually supply as an additional argument to a given function some particular choice for all the variables inside the function that come from RNGs. The reason that we use certain precomputed values is that doing the test with some randomly generated set of values as an additional argument is essentially going to have to repeat the calculations in the function itself (which we want to test) - whereas for concrete values we know the answer right away. Does that make sense?
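As a concrete sketch of that pattern (the function and parameter names here are entirely invented, not SymPy code):

```python
import random

def random_subproduct(gens, _random_vals=None):
    # Multiply a random subset of `gens` together. For testing, the
    # coin flips that random.random() would otherwise produce can be
    # supplied via `_random_vals`.
    if _random_vals is None:
        _random_vals = [random.random() for _ in gens]
    result = 1
    for g, r in zip(gens, _random_vals):
        if r < 0.5:
            result *= g
    return result

# With the "random" choices fixed, the answer is known in advance:
# 0.1 picks 2, 0.9 skips 3, 0.2 picks 5.
assert random_subproduct([2, 3, 5], _random_vals=[0.1, 0.9, 0.2]) == 10
```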

Aleksandar Makelov

Jun 23, 2012, 11:43:46 AM
to sy...@googlegroups.com
Such tests may indeed be very implementation-specific (is that why you think they might be useless?), but I wouldn't say that they don't help at all - there are still many things that can break inside a function and make such tests fail.

krastano...@gmail.com

Jun 23, 2012, 11:49:04 AM
to sy...@googlegroups.com
I was saying that they might be useless, because all that the test
does is copy the logic of the function.

For instance:

```
a = 1
b = 2
assert a + b == 1 + 2
```

is not a good unit test, because all you do is exactly copy the
function to be tested.

I feel that this is true for many of the random tests that we have. I
do not know if it is true in your case.


And about my comment concerning the 2to3 script:

In order to use sympy on python3 you need to run a translator script.
Even if your tests are useless for algorithm checking, they will
stress the translator script and ensure that there are no errors in
it.

Aleksandar Makelov

Jun 23, 2012, 11:54:56 AM
to sy...@googlegroups.com
OK, I see. That's what I was talking about in my reply to Joachim:

The reason that we use certain precomputed values is that doing the test with some randomly generated set of values as an additional argument is essentially going to have to repeat the calculations in the function itself (which we want to test) - whereas for concrete values we know the answer right away

so the situation is not as tautological as in your example (some nontrivial computations are needed to get from the precomputed random input to the answer in the tests in PR1377).

Joachim Durchholz

Jun 23, 2012, 12:32:36 PM
to sy...@googlegroups.com
On 23.06.2012 17:38, Aleksandar Makelov wrote:
> We want to make sure that the right thing is done with the output from the
> RNGs, so we manually supply as an additional argument to a given function
> some particular choice for all the variables inside the function that come
> from RNGs.

Ah, I see.
I'm not convinced that it's the best way to design such a thing. Adding
parameters to a function that are purely there for testing purposes is
going to confuse people who aren't into testing. It's also in
contradiction to the "keep interfaces as narrow as possible" principle -
a narrow interface means less things that need to be remembered by
programmers, less things that need to be set up by the caller, less
things that might be misinterpreted.
Also, it'd be adding code to the functions. Which means adding bugs -
which may affect the function if it's running in production. Which
kind of defeats the purpose of testing in the first place.

> The reason that we use certain precomputed values is that doing
> the test with some randomly generated set of values as an additional
> argument is essentially going to have to repeat the calculations in the
> function itself (which we want to test) - whereas for concrete values we
> know the answer right away. Does that make sense?

Not very much, I fear.
As Stefan said, repeating a calculation in test code isn't a useful unit
test, even if you place the unit test into another module. Or if you're
doing the calculation by hand - unless those calculations have been done
by experts in the field and verified by other experts in the field, of
course.

Expanding on Stefan's example.
Assuming you're testing an array-inversion routine.

We agree on the worst approach to test it: repeat the array inversion
algorithm in the test and see whether it gives the same result as the
code in SymPy.
Actually this kind of test isn't entirely pointless - if the test code
remains stable but the SymPy code evolves through optimizations, it
could serve a useful purpose. On the other hand, you still wouldn't
write this kind of test code until you actually do the optimization.

The other approach would be to add an "expected result" parameter, and
fail if the result isn't the expected one.
This has two problems:
a) It adds an unwanted dependency to the testing modules. At least if
you want to give better diagnostics than just throwing an exception (for
example, you may want to test internal workings that throw exceptions
which get caught).
b) You're supplying precomputed results. You'd still need to explain why
the results are correct. Somebody has to verify that they are, indeed,
correct.

My approach for that would be to test the defining property of the function:
(matrix_inv(A) * A).is_unit_matrix()
(sorry for ad-hoc invention of matrix functions)

I.e. you're testing the purpose of the function, not its inner workings.

Oh, and this kind of testing can uncover more bugs.
For example, the above reasoning overlooks that not all matrices can be
inverted. If I'm testing the algorithm, I'll simply overlook the case of
a singular matrix because I'm all thinking inside the algorithm.
If I write my test code with the purpose in mind, I have a better chance
to stumble over the singular case - either because I'm thinking about
matrix theory instead of my algorithm, or because some tests
mysteriously fail, namely, when the RNG happens to generate a singular
matrix.
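A self-contained sketch of this "test the purpose, not the inner workings" idea, using plain Python 2x2 matrices rather than SymPy's Matrix class to keep it minimal (inv2 and mul2 are invented stand-ins):

```python
import random
from fractions import Fraction

def inv2(m):
    # Exact inverse of a 2x2 matrix [[a, b], [c, d]] using Fractions.
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("singular matrix")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mul2(m, n):
    # 2x2 matrix product.
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]

# Property test: inv2(m) * m must be the identity for every invertible
# random matrix. Note how writing the test forces us to notice the
# singular case, which an algorithm-centric test would overlook.
random.seed(0)
tested = 0
while tested < 20:
    m = [[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
    if m[0][0] * m[1][1] - m[0][1] * m[1][0] == 0:
        continue  # skip singular matrices
    assert mul2(inv2(m), m) == IDENTITY
    tested += 1
```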

I hope that's all understandable.
And I hope I'm not missing the point entirely :-)

Aleksandar Makelov

Jun 23, 2012, 1:39:02 PM
to sy...@googlegroups.com


On Saturday, 23 June 2012, 19:32:36 UTC+3, Joachim Durchholz wrote:
On 23.06.2012 17:38, Aleksandar Makelov wrote:
> We want to make sure that the right thing is done with the output from the
> RNGs, so we manually supply as an additional argument to a given function
> some particular choice for all the variables inside the function that come
> from RNGs.

Ah, I see.
I'm not convinced that it's the best way to design such a thing. Adding
parameters to a function that are purely there for testing purposes is
going to confuse people who aren't into testing. It's also in
contradiction to the "keep interfaces as narrow as possible" principle -
a narrow interface means less things that need to be remembered by
programmers, less things that need to be set up by the caller, less
things that might be misinterpreted.
Yeah, I realized that earlier (it's in one of my posts from above): adding code to the body of the function that is purely intended for testing is ugly from my perspective, too. But on the other hand we need some way of testing these algorithms, don't we?
 
Also, it'd be adding code to the functions. Which means adding bugs - which
may affect the function if it's running in production. Which kind of
defeats the purpose of testing in the first place.

The amount of added code the user sees is quite minimal. If you don't set the test parameter, the only difference the flow of execution sees is an additional if statement that checks whether the parameter is there, and then goes back to the logic of the function.
> The reason that we use certain precomputed values is that doing
> the test with some randomly generated set of values as an additional
> argument is essentially going to have to repeat the calculations in the
> function itself (which we want to test) - whereas for concrete values we
> know the answer right away. Does that make sense?

Not very much, I fear.
As Stefan said, repeating a calculation in test code isn't a useful unit
test, even if you place the unit test into another module. Or if you're
doing the calculation by hand - unless those calculations have been done
by experts in the field and verified by other experts in the field, of
course.
Yes, I think my situation is closer to "doing it by hand" - and even if I'm nowhere near an expert in the field, the calculations haven't required much ingenuity so far :)

Expanding on Stefan's example.
Assuming you're testing an array-inversion routine.

We agree on the worst approach to test it: repeat the array inversion
algorithm in the test and see whether it gives the same result as the
code in SymPy.
Actually this kind of test isn't entirely pointless - if the test code
remains stable but the SymPy code evolves into optimizations, this could
serve a useful purpose. On the other hand, you still don't write this
kind of test code until you actually do the optimization.

The other approach would be to add an "expected result" parameter, and
fail if the result isn't the expected one.
Yeah, that makes sense, but the algorithms I had to test don't return the same answer every time - for example,
consider an algorithm that returns a random group element - there is no way to test the result apart from asserting that
it belongs to the group, and this test can be cheated rather easily (for example, always return the identity).
This has two problems:
a) It adds an unwanted dependency to the testing modules. At least if
you want to give better diagnostics than just throwing an exception (for
example, you may want to test internal workings that throw exceptions
which get caught).
b) You're supplying precomputed results. You'd still need to explain why
the results are correct. Somebody has to verify that they are, indeed,
correct.
I agree on that, but don't you do that all the time with tests for deterministic algorithms as well?
The majority of tests in sympy rely partly on some outside - and not defining - knowledge (for example, the symmetric group on 5 elements having order 120), or on an example calculated by hand; is the validity of these claims explained somewhere (maybe it should be included as comments in the test files)?

My approach for that would be to test the defining property of the function:
   (matrix_inv(A) * A).is_unit_matrix()
(sorry for ad-hoc invention of matrix functions)

I.e. you're testing the purpose of the function, not its inner workings.

Yes, this is obviously a good way to test any algorithm, however in many cases the answer returned
by a randomized algorithm (and, not too rarely, by a deterministic algorithm) has no clear defining property (see above, when I talk about random group elements).

Aaron Meurer

Jun 23, 2012, 9:22:56 PM
to sy...@googlegroups.com
On Jun 23, 2012, at 10:32 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> On 23.06.2012 17:38, Aleksandar Makelov wrote:
>> We want to make sure that the right thing is done with the output from the
>> RNGs, so we manually supply as an additional argument to a given function
>> some particular choice for all the variables inside the function that come
>> from RNGs.
>
> Ah, I see.
> I'm not convinced that it's the best way to design such a thing. Adding parameters to a function that are purely there for testing purposes is going to confuse people who aren't into testing. It's also in contradiction to the "keep interfaces as narrow as possible" principle - a narrow interface means less things that need to be remembered by programmers, less things that need to be set up by the caller, less things that might be misinterpreted.
> Also, it'd be adding code to the functions. Which means adding bugs - which may affect the function if it's running in production. Which kind of defeats the purpose of testing in the first place.

This is fixed by the ideas of this pull request:
https://github.com/sympy/sympy/pull/1375.

Aaron Meurer


Aleksandar Makelov

Jul 4, 2012, 4:20:31 PM
to sy...@googlegroups.com


On Sunday, 24 June 2012, 04:22:56 UTC+3, Aaron Meurer wrote:
On Jun 23, 2012, at 10:32 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> On 23.06.2012 17:38, Aleksandar Makelov wrote:
>> We want to make sure that the right thing is done with the output from the
>> RNGs, so we manually supply as an additional argument to a given function
>> some particular choice for all the variables inside the function that come
>> from RNGs.
>
> Ah, I see.
> I'm not convinced that it's the best way to design such a thing. Adding parameters to a function that are purely there for testing purposes is going to confuse people who aren't into testing. It's also in contradiction to the "keep interfaces as narrow as possible" principle - a narrow interface means less things that need to be remembered by programmers, less things that need to be set up by the caller, less things that might be misinterpreted.
> Also, it'd be adding code to the functions. Which means adding bugs - which may affect the function if it's running in production. Which kind of defeats the purpose of testing in the first place.

This is fixed by the ideas of this pull request:
https://github.com/sympy/sympy/pull/1375.
I'm confused about this - how does the PR avoid the use of an additional parameter?

Aaron Meurer

Jul 4, 2012, 5:25:49 PM
to sy...@googlegroups.com
On Jul 4, 2012, at 2:20 PM, Aleksandar Makelov <amak...@college.harvard.edu> wrote:



On Sunday, 24 June 2012, 04:22:56 UTC+3, Aaron Meurer wrote:
On Jun 23, 2012, at 10:32 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> On 23.06.2012 17:38, Aleksandar Makelov wrote:
>> We want to make sure that the right thing is done with the output from the
>> RNGs, so we manually supply as an additional argument to a given function
>> some particular choice for all the variables inside the function that come
>> from RNGs.
>
> Ah, I see.
> I'm not convinced that it's the best way to design such a thing. Adding parameters to a function that are purely there for testing purposes is going to confuse people who aren't into testing. It's also in contradiction to the "keep interfaces as narrow as possible" principle - a narrow interface means less things that need to be remembered by programmers, less things that need to be set up by the caller, less things that might be misinterpreted.
> Also, it'd be adding code to the functions. Which means adding bugs - which may affect the function if it's running in production. Which kind of defeats the purpose of testing in the first place.

This is fixed by the ideas of this pull request:
https://github.com/sympy/sympy/pull/1375.
I'm confused about this - how does the PR avoid the use of an additional parameter?

You still have an additional parameter, but there's no more code duplication. 

Aaron Meurer


Aleksandar Makelov

Jul 4, 2012, 5:50:44 PM
to sy...@googlegroups.com


On Thursday, 5 July 2012, 00:25:49 UTC+3, Aaron Meurer wrote:
On Jul 4, 2012, at 2:20 PM, Aleksandar Makelov <amak...@college.harvard.edu> wrote:



On Sunday, 24 June 2012, 04:22:56 UTC+3, Aaron Meurer wrote:
On Jun 23, 2012, at 10:32 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> On 23.06.2012 17:38, Aleksandar Makelov wrote:
>> We want to make sure that the right thing is done with the output from the
>> RNGs, so we manually supply as an additional argument to a given function
>> some particular choice for all the variables inside the function that come
>> from RNGs.
>
> Ah, I see.
> I'm not convinced that it's the best way to design such a thing. Adding parameters to a function that are purely there for testing purposes is going to confuse people who aren't into testing. It's also in contradiction to the "keep interfaces as narrow as possible" principle - a narrow interface means less things that need to be remembered by programmers, less things that need to be set up by the caller, less things that might be misinterpreted.
> Also, it'd be adding code to the functions. Which means adding bugs - which may affect the function if it's running in production. Which kind of defeats the purpose of testing in the first place.

This is fixed by the ideas of this pull request:
https://github.com/sympy/sympy/pull/1375.
I'm confused about this - how does the PR avoid the use of an additional parameter?

You still have an additional parameter, but there's no more code duplication. 

Aaron Meurer
 
Thanks. So the only thing that makes this different from manually supplying the values for the random variables is the use of the (predefined) generator object that spits out whatever we tell it to? Also, does the code duplication you mention refer to repeating the logic of the function in the tests?

Chris Smith

Jul 4, 2012, 6:19:00 PM
to sy...@googlegroups.com
>> This is fixed by the ideas of this pull request:
>> https://github.com/sympy/sympy/pull/1375.
>
> I'm confused about this - how does the PR avoid the use of an additional
> parameter?


The seed parameter feeds _randint or _randrange. When None, random
assigns its own seed; when an int, that acts as the seed; when a list,
the list becomes the source of values. So, for example, if a routine
needs two random integers and you want to manually specify those, you
send those two ints in a list, e.g. seed=[1, 2].

Joachim Durchholz

Jul 4, 2012, 7:15:54 PM
to sy...@googlegroups.com
On 04.07.2012 23:25, Aaron Meurer wrote:
> On Jul 4, 2012, at 2:20 PM, Aleksandar Makelov <amak...@college.harvard.edu>
> wrote:
>
>> On Sunday, 24 June 2012, 04:22:56 UTC+3, Aaron Meurer wrote:
>>> This is fixed by the ideas of this pull request:
>>> https://github.com/sympy/sympy/pull/1375.
>>>
>> I'm confused about this - how does the PR avoid the use of an additional
>> parameter?
>
> You still have an additional parameter, but there's no more code
> duplication.

I'm not seeing much of a difference now - from the "breadth of
interface" point of view, it's the same whether you send in a seed or a
generator.

I'm wondering whether what we're gaining is worth the additional
parameter and the ~100 LoC we're paying here.
What problems does this solve, and what benefits does it bring? I'm a
bit fuzzy about them; the various possibilities I've been considering
can all be achieved at a much lower total line count. But I'm not sure
that I have considered what's relevant in this situation, which is why
I'm asking.