Yes, factorint is a better example of something that can be tested
with hypothesis. It's the example I gave on the issue
https://github.com/sympy/sympy/issues/20914.
It's also a good example of how we can start with something simple and
build out a more rigorous test.
There are other properties that could be added to the test as well, for instance:

    assert isprime(prime)
    assert exp >= 1
    assert isinstance(prime, int)
    assert isinstance(exp, int)

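Put together, a test with those checks might look roughly like the sketch below. The upper bound on n and the test name are my own choices for illustration, not something taken from the issue.

```python
from math import prod

from hypothesis import given, settings, strategies as st
from sympy import factorint, isprime


# deadline=None because factoring time varies a lot between inputs.
# The range 2..10**9 is an arbitrary choice for this sketch.
@settings(deadline=None)
@given(st.integers(min_value=2, max_value=10**9))
def test_factorint_properties(n):
    factors = factorint(n)

    # The factorization must multiply back to the original number.
    assert prod(prime**exp for prime, exp in factors.items()) == n

    for prime, exp in factors.items():
        assert isprime(prime)
        assert exp >= 1
        assert isinstance(prime, int)
        assert isinstance(exp, int)
```

Calling test_factorint_properties() directly (or collecting it with pytest) runs it against the generated inputs.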
And we can also test the various flags to factorint.
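As a rough sketch of what testing the flags could look like, the example below checks multiple=True, limit, and switching off use_rho or use_pm1 one at a time. The input range, the limit value, and the choice of which flags to cover are assumptions made for illustration; the real test would want to cover more of them.

```python
from math import prod

from hypothesis import given, settings, strategies as st
from sympy import factorint


# The bound 10**12 is arbitrary; it is large enough that some inputs
# have factors beyond the initial small trial-division range.
@settings(deadline=None, max_examples=50)
@given(st.integers(min_value=2, max_value=10**12))
def test_factorint_flags(n):
    expected = factorint(n)

    # multiple=True returns a flat list of primes with repetition.
    flat = factorint(n, multiple=True)
    assert sorted(flat) == sorted(
        p for p, e in expected.items() for _ in range(e)
    )

    # Disabling one algorithm at a time should not change the answer.
    assert factorint(n, use_rho=False) == expected
    assert factorint(n, use_pm1=False) == expected

    # With limit, the result may contain a composite leftover factor,
    # but the product must still equal n.
    partial = factorint(n, limit=100)
    assert prod(p**e for p, e in partial.items()) == n
```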
As for the existing test, for now, we should generally leave any
existing manual tests intact. Hypothesis should be treated as an
extension to manual testing, not a complete replacement. For instance,
some of the assertions in that test you showed are based on specific
inputs that are known to potentially cause issues. Hypothesis might
not necessarily generate an example like them. Plus, you'll notice
that that test is marked as @slow, meaning some of the numbers being
tested are too slow compared to the inputs we might want to generate
from hypothesis.
This is actually one thing that will need to be considered in this
project. Hypothesis tries to always generate "interesting" examples in
its strategies, in addition to random ones. But what hypothesis
considers "interesting" is based on some heuristics that apply to a
broad category of programming. For instance, the "interesting"
integers from st.integers() are things like -1, 0, 1, etc. These are
important to test, but for factorint, we also want to make sure we
test "interesting" integers in terms of their prime factorizations.
This might mean numbers that have both small and large prime factors,
numbers that have many prime factors, numbers that have very few
prime factors, numbers with factors that are interesting corner cases
in terms of the specific algorithms that are implemented, etc. Some
of these are not distributed very well on the number line, so we might
have to create a custom strategy that generates them with higher
likelihood. Otherwise, they would basically never be chosen at random.
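One possible shape for such a strategy is to generate the factorization first and multiply it out, so that the expected answer is known by construction. This is only a sketch: the names small_primes, large_primes and factorizations, and all of the bounds, are made up for illustration and would need tuning against the actual algorithms in factorint.

```python
from math import prod

from hypothesis import given, settings, strategies as st
from sympy import factorint, nextprime, primerange

# Small primes are picked from a fixed list.
small_primes = st.sampled_from(list(primerange(2, 100)))

# "Large" primes: take the next prime after a random integer.
# The range here is arbitrary and kept modest so the test stays fast.
large_primes = st.integers(min_value=10**6, max_value=10**7).map(
    lambda k: int(nextprime(k))
)

# A factorization is a dict {prime: exponent} mixing both kinds.
factorizations = st.dictionaries(
    keys=st.one_of(small_primes, large_primes),
    values=st.integers(min_value=1, max_value=3),
    min_size=1,
    max_size=4,
)


@settings(deadline=None, max_examples=50)
@given(factorizations)
def test_factorint_recovers_known_factorization(factors):
    n = prod(p**e for p, e in factors.items())
    assert factorint(n) == factors
```

Because n is built as a product, it can also end up much larger than the integers that st.integers() tends to produce on its own.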
Hypothesis also limits, in practice, how large the integers generated by
integers() can get (probably to something like 2**64). But factorint can
handle numbers much larger than that. Creating custom input strategies
is going to be a big part of this project, so it's something you
should be thinking about and learning how to do (it can also be one of
the more challenging parts of using hypothesis effectively). As a
start, I would learn how to run hypothesis in verbose mode, so that
you can see the actual inputs it is generating, and then take a look at
those inputs to see whether they actually cover all the important
cases for the given function.
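For reference, one way to turn on verbose mode for a single test is through the settings decorator, as in the small sketch below. The test name and the seen list are hypothetical and only there to make the generated inputs easy to inspect afterwards.

```python
from hypothesis import Verbosity, given, settings, strategies as st

# Collect every generated input so it can be examined after the run.
seen = []


@settings(verbosity=Verbosity.verbose, max_examples=20)
@given(st.integers())
def test_show_inputs(n):
    # In verbose mode hypothesis prints each example it tries.
    seen.append(n)
```

When running under pytest, the same thing can be done for the whole run with the --hypothesis-verbosity=verbose option (together with -s so that the output is not captured).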
The code for factorint is very complex, and testing it rigorously
requires testing a lot of different kinds of corner cases. Hypothesis
is very good at this sort of thing, but it wasn't built with these
specific types of corner cases in mind, so it will need some help to
get there.
Aaron Meurer