Why Arbitrary and not simply Gen?


etorreborre

Dec 16, 2014, 11:17:30 PM
to scala...@googlegroups.com
I think this question has bugged me for a while without my really being conscious of it.

What does Arbitrary[T] bring to Gen[T]? Why not simply Gen[T] and a bunch of standard Gens in the Gen companion object?

object Gen {

  // some syntactic sugar to get Gen[Int] in scope
  def apply[T : Gen]: Gen[T] = implicitly[Gen[T]]

  // standard Gen for Int
  implicit def genInt: Gen[Int] = ???

  ...

}

I'm saying this because I frequently have to jump in and out of Arbitrary and Gen when I feel I could use Gen all the time.
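
To make the in/out friction concrete, here is a minimal sketch against the existing ScalaCheck API (Gen.choose, Arbitrary.apply, Arbitrary.arbitrary, Prop.forAll). The names InAndOut, genSmallInt, arbSmallInt and genPair are made up for illustration and are not part of ScalaCheck.

```scala
import org.scalacheck.{Arbitrary, Gen, Prop}

object InAndOut {
  // A plain generator value (hypothetical name).
  val genSmallInt: Gen[Int] = Gen.choose(0, 10)

  // Jump "in": wrap the Gen so that forAll can find it implicitly.
  implicit val arbSmallInt: Arbitrary[Int] = Arbitrary(genSmallInt)

  // Jump "out": unwrap the implicit instance to use Gen combinators again.
  val genPair: Gen[(Int, Int)] =
    for {
      a <- Arbitrary.arbitrary[Int]
      b <- Arbitrary.arbitrary[Int]
    } yield (a, b)

  // Resolves the implicit Arbitrary[Int] defined above.
  val prop: Prop = Prop.forAll { (n: Int) => n >= 0 && n <= 10 }
}
```

The wrapping and unwrapping steps carry no logic of their own, which is presumably the overhead the question is about.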

Thanks.

Eric.

Tony Morris

Dec 16, 2014, 11:19:38 PM
to scala...@googlegroups.com
I think it's just to make a clear split between what is a data type and what is a type-class.

Gen is a data type, while Arbitrary is a type-class. Of course, Scala gives us the ability to treat Gen/Arbitrary as one or the other, but if you agree that this runs into problems, it easily follows that we would at least have Gen and then, well, a mapping of an implementation to a type (Arbitrary).
--
You received this message because you are subscribed to the Google Groups "scalacheck" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scalacheck+...@googlegroups.com.
To post to this group, send email to scala...@googlegroups.com.
Visit this group at http://groups.google.com/group/scalacheck.
For more options, visit https://groups.google.com/d/optout.

etorreborre

Dec 16, 2014, 11:33:40 PM
to scala...@googlegroups.com
I would understand that if Arbitrary, as a type class, were not defined by only one function returning a Gen.

If it had some functions distilling the "essence" of being Arbitrary, plus some related laws, that would make a lot more sense (I don't know what they would be though).

Eric. 

Tony Morris

Dec 16, 2014, 11:40:42 PM
to scala...@googlegroups.com
I agree that at the level you are describing, there are problems with talking about Arbitrary as a type-class. I was ignoring those design issues in the earlier response and just pointing out the motivation for a clean split of data type and type-class.

I am in complete agreement, however, that you have identified, or at least alluded to, a design fault of QuickCheck. But there are lots of those.

Dave Stevens

Dec 16, 2014, 11:43:32 PM
to scala...@googlegroups.com
Perhaps I'm doing it wrong, but I frequently require different ways to get a Gen[A] for a given A. For example, I may need B => Gen[A] and a B => Gen[C] in order to have a Gen[(A, C)]. Other times, I just need a Gen[A]. I always only have one definition of Arbitrary[A] for a given A though. As Tony said, it is a typeclass and there must only be one. If Gen and Arbitrary were combined, all hell would break loose attempting to ensure that there was only one implicit Gen[A]. I would probably need to use newtype all over, constantly wrapping and unwrapping.
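
A rough sketch of that situation, assuming made-up case classes A, B and C (the post does not say what they are): several B-dependent generators coexist as ordinary values, while a single implicit Arbitrary[A] stays the canonical instance. Only Gen.choose, Gen.listOfN, Gen.alphaChar and Arbitrary.apply come from ScalaCheck; everything else is hypothetical.

```scala
import org.scalacheck.{Arbitrary, Gen}

object ManyGensOneArbitrary {
  // Hypothetical stand-ins for A, B and C.
  final case class B(limit: Int)
  final case class A(value: Int)
  final case class C(label: String)

  // Two generators that both depend on a B.
  def genA(b: B): Gen[A] = Gen.choose(0, b.limit).map(A(_))
  def genC(b: B): Gen[C] =
    Gen.listOfN(b.limit, Gen.alphaChar).map(cs => C(cs.mkString))

  // Combine them into a Gen[(A, C)] for a given B.
  def genAC(b: B): Gen[(A, C)] =
    for {
      a <- genA(b)
      c <- genC(b)
    } yield (a, c)

  // The one implicit, canonical instance for A.
  implicit val arbA: Arbitrary[A] =
    Arbitrary(Gen.choose(-1000, 1000).map(A(_)))
}
```

Because genA and genAC are plain methods rather than implicits, they never compete with arbA during implicit resolution.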

etorreborre

Dec 16, 2014, 11:48:12 PM
to scala...@googlegroups.com
That's interesting.

In our practice we frequently create new types for generation, like Gen[SmallString], Gen[BigString], Gen[NonEmptyString],...

Stefan Höck

Dec 17, 2014, 12:01:49 AM
to scala...@googlegroups.com
> Perhaps I'm doing it wrong, but I frequently require different ways to get
> a Gen[A] for a given A. For example, I may need B => Gen[A] and a B =>
> Gen[C] in order to have a Gen[(A, C)]. Other times, I just need a Gen[A]. I
> always only have one definition of Arbitrary[A] for a given A though. As
> Tony said, it is a typeclass and there must only be one. If Gen and
> Arbitrary were combined, all hell would break loose attempting to ensure
> that there was only one implicit Gen[A]. I would probably need to use
> newtype all over, constantly wrapping and unwrapping.

I think this is *exactly* why there should be no Arbitrary (although I
use the type class a lot in my own code). How often does it make sense
to have only one instance of Arbitrary? Hardly ever, from my experience.
But of course, if there is no type class, you must not use implicits
but pass your Gens (not genes) around explicitly.

etorreborre

Dec 17, 2014, 12:12:49 AM
to scala...@googlegroups.com
That's interesting.

We generally come up with lots of newtypes: Gen[BigString], Gen[SmallString], Gen[NonEmptyString]...

E.


Dave Stevens

Dec 17, 2014, 1:01:42 AM
to scala...@googlegroups.com
> That's interesting.
>
> We generally come up with lots of newtypes: Gen[BigString], Gen[SmallString], Gen[NonEmptyString]...

That situation is a bit different than what I had in mind.

My example below is also not exactly what I had in mind, but it is another example of when I may have multiple definitions of a Gen[A]. Consider a type with multiple data constructors. Once these data constructors reach a certain complexity, I create separate functions returning generators for each. How would you define a Gen[A] given the example below?

sealed trait A
object A {
  private case class AA(....) extends A
  ....
  private case class AE(....) extends A

  def aa(...): A = AA(...)
  ...
  def ae(...): A = AE(...)
}

def genAA: Gen[A] = ???
...
def genAE: Gen[A] = ???

def genA: Gen[A] = Gen.oneOf(genAA, ..., genAE)
def arbA: Arbitrary[A] = Arbitrary(genA)
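
For anyone who wants to compile the pattern above, here is one possible concrete version. Shape, Circle and Rect are hypothetical stand-ins for A and its elided constructors; the structure (private case classes, smart constructors, one generator per constructor, Gen.oneOf to combine them) follows the outline.

```scala
import org.scalacheck.{Arbitrary, Gen}

object ShapeGens {
  // Hypothetical ADT standing in for A and its private data constructors.
  sealed trait Shape
  object Shape {
    private final case class Circle(radius: Int) extends Shape
    private final case class Rect(width: Int, height: Int) extends Shape

    def circle(radius: Int): Shape = Circle(radius)
    def rect(width: Int, height: Int): Shape = Rect(width, height)
  }

  // One generator per data constructor, all typed as Gen[Shape].
  val genCircle: Gen[Shape] = Gen.choose(1, 100).map(r => Shape.circle(r))
  val genRect: Gen[Shape] =
    for {
      w <- Gen.choose(1, 100)
      h <- Gen.choose(1, 100)
    } yield Shape.rect(w, h)

  // The combined generator and the single implicit instance built from it.
  val genShape: Gen[Shape] = Gen.oneOf(genCircle, genRect)
  implicit val arbShape: Arbitrary[Shape] = Arbitrary(genShape)
}
```

Here three values of type Gen[Shape] coexist, and only arbShape is implicit.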

etorreborre

Dec 17, 2014, 1:29:57 AM
to scala...@googlegroups.com
In that case what would happen if you just set "def genA" as implicit?

Rickard Nilsson

Dec 22, 2014, 8:22:27 PM
to scala...@googlegroups.com, etorr...@gmail.com
Hi Eric,

On 12/17/2014 05:33 AM, etorreborre wrote:
> I would understand that if Arbitrary, as a type class was not defined by
> only one function, returning a Gen.
>
> If it had some functions distilling what is the "essence" of being
> Arbitrary + some related laws that would make a lot more sense (I don't
> know what they would be though).

Well, I don't think the fact that a type class is defined only by one
method makes it invalid.

The "essence" of being Arbitrary is (although loosely defined, poorly
documented and probably not consistently implemented in ScalaCheck) to
provide a generator that generates uniformly distributed values
(possibly weighted by the size parameter) with edge cases thrown in.

You could very well imagine another type class called SmallGen[T] with
the purpose of providing a generator that produces "small" values of T.
That type class would have the exact same signature as Arbitrary[T].

You could then have:

def forAllSmall[T](f: T => Prop)(implicit s: SmallGen[T]) = ...

and use it like this:

val p = forAllSmall { n: Int => time(myFun(n)) <= REASONABLE_TIME }

Now, the value of using type classes like Arbitrary can be argued -
maybe it always makes more sense to write properties in terms of
specific generators - but I don't think it would make sense to directly
use implicit generators as you suggest, since it would block the
possibility to define and use type classes like Arbitrary or the
SmallGen example above. (OK, it wouldn't really block that possibility,
but you would not be able to implement the scenario I've described in
your way).
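
One way the SmallGen scenario could be spelled out, as a sketch only: SmallGen, its method name small, the 0 to 100 range for Int and the body of forAllSmall are all assumptions, since the post leaves them open. Prop.forAllNoShrink, Prop.apply(Boolean) and Pretty are existing ScalaCheck API; shrinking is skipped to keep the example short.

```scala
import org.scalacheck.{Gen, Prop}
import org.scalacheck.util.Pretty

object SmallGenSketch {
  // Hypothetical type class with the same shape as Arbitrary[T].
  trait SmallGen[T] { def small: Gen[T] }

  object SmallGen {
    // Assumed meaning of "small" for Int: a value between 0 and 100.
    implicit val smallInt: SmallGen[Int] =
      new SmallGen[Int] { val small: Gen[Int] = Gen.choose(0, 100) }
  }

  // Like Prop.forAll, but driven by SmallGen instead of Arbitrary.
  def forAllSmall[T](f: T => Prop)(implicit s: SmallGen[T], pp: T => Pretty): Prop =
    Prop.forAllNoShrink(s.small)(f)

  // Only small Ints are generated, so this property holds.
  val p: Prop = forAllSmall { (n: Int) => Prop(n >= 0 && n <= 100) }
}
```

With a bare implicit Gen[Int] there would be only one slot per type, so Arbitrary[Int] and SmallGen[Int] could not both be selected by type alone.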

On a side-note, maybe it would be good to use the method name
"forAllArbitrary" when the Arbitrary type class is used, and use
"forAll(gen)" only for specific generators.

In the end, choosing between a type-class approach and a more
value-centric approach is probably a matter of personal taste. For me, I
think I would pick "forAll(smallInt: Gen[Int])" over defining an
implicit "Arbitrary[SmallInt]" instance. I mean, if you have refined
your integer to SmallInt it seems strange to then talk about
Arbitrary[SmallInt], since (to me) "Arbitrary" means something un-refined.


/ Rickard

etorreborre

Dec 23, 2014, 12:31:01 AM
to scala...@googlegroups.com, etorr...@gmail.com
Thanks Rickard for your explanation.

I still need to think this through in my head. For example, it is not clear to me why Arbitrary.arbitrary should return a Gen with all these methods.
Wouldn't a Gen trait with just a `doApply` method be enough? Then the current Gen would be a GenData implementing Gen but having additional combinators to build GenData instances.


> I mean, if you have refined your integer to SmallInt it seems strange to then talk about 
> Arbitrary[SmallInt], since (to me) "Arbitrary" means something un-refined. 

For me the Arbitrary[SmallInt] is still an un-refined value but for the type SmallInt. 

Also in practice we often create things like:

case class Line(fields: List[String]) {
  def line = fields.mkString("|")
}

Then we get an Arbitrary[Line] (instead of a straight Arbitrary[String] representing a line), we can feed a (l: Line).line to a parser for example and check that we get all the original (line: Line).fields.
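
A sketch of that round-trip check, under two assumptions that are not stated above: fields are non-empty and never contain the "|" separator, and the parser is replaced by a hypothetical one-line stand-in named parse.

```scala
import org.scalacheck.{Arbitrary, Gen, Prop}

object LineRoundTrip {
  case class Line(fields: List[String]) {
    def line: String = fields.mkString("|")
  }

  // Assumption: non-empty alphanumeric fields, so '|' never appears in a field.
  val genField: Gen[String] =
    Gen.nonEmptyListOf(Gen.alphaNumChar).map(_.mkString)

  implicit val arbLine: Arbitrary[Line] =
    Arbitrary(Gen.nonEmptyListOf(genField).map(fs => Line(fs)))

  // Hypothetical stand-in for the real parser.
  def parse(s: String): List[String] = s.split('|').toList

  // Feeding the rendered line to the parser should give back the fields.
  val roundTrip: Prop = Prop.forAll { (l: Line) => parse(l.line) == l.fields }
}
```

Generating the structured Line rather than a raw String is what makes the expected parse result available to the property.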

But that's beside the point.

More to the point, I am going to propose in specs2 (in 3.0) some additional sugar on top of ScalaCheck to be able to:

 - set specific parameters on a property
 - set specific Pretty values for each parameter on a property
 - set specific Arbitrary/Shrink/Gen instance on a property
 - have a way to declare which of the arguments should be collected (opposed to any value at the moment)
 - declare the verbosity of the property
 - pass parameters from the command line to override the property parameters (that one is really nice when testing)
 - output the args for a failed property so that strings are quoted (if the arg is a case class) in order to be able to more easily copy and paste from the console and recreate a failed example

If you implement #120, some of this might become deprecated but in the meantime that could be pretty helpful to ScalaCheck users out there (I know it will be for me :-)).

Eric.

Rickard Nilsson

Dec 23, 2014, 10:25:03 AM
to scala...@googlegroups.com, etorreborre@gmail.com >> Eric Torreborre
Hi Eric,

On 12/23/2014 06:31 AM, etorreborre wrote:
> Thanks Rickard for your explanation.
>
> I still need to think this through in my head. For example it is not clear to
> me why Arbitrary.arbitrary should return a Gen with all these methods.
> Wouldn't a Gen trait with just a `doApply` method be enough? Then the
> current Gen would be a GenData implementing Gen but having additional
> combinators to build GenData instances.

I'm not really following what you mean. The core methods of Gen are
really map and flatMap, to support the current monadic style when
constructing generators. I don't see how separating stuff into Gen and
GenData would work (or make things better).

My suggestion was simply that if you want to support several different
type classes similar to Arbitrary, it wouldn't work just using "implicit
val genInt: Gen[Int]". Of course, you could still rely on using scope
for picking the right implicit generator, but I don't see how that would
be better than having different type classes.

> > I mean, if you have refined your integer to SmallInt it seems strange
> to then talk about
> > Arbitrary[SmallInt], since (to me) "Arbitrary" means something
> un-refined.
>
> For me the Arbitrary[SmallInt] is still an un-refined value but for the
> type SmallInt.
>
> Also in practice we often create things like:
>
> case class Line(fields: List[String]) {
> def line = fields.mkString("|")
> }
>
> Then we get an Arbitrary[Line] (instead of a straight Arbitrary[String]
> representing a line), we can feed a (l: Line).line to a parser for
> example and check that we get all the original (line: Line).fields.
>
> But that's beside the point.

Yes, there's absolutely nothing wrong with your approach. As I said, I
think it depends entirely on personal taste whether you define a
"newtype" or use a specific generator.


> More to the point, I am going to propose in specs2 (in 3.0) some
> additional sugar on top of ScalaCheck to be able to:
>
> - set specific parameters on a property

This will be in ScalaCheck 2.0 or before.

> - set specific Pretty values for each parameter on a property

This might be useful, but I'm not sure. I'd be happy to see use cases
where the same type should be presented in different ways for different
properties.

> - set specific Arbitrary/Shrink/Gen instance on a property

I'm not sure about this. Arbitrary is tied to Prop.forAll or explicit
use of Arbitrary.arbitrary, and maybe that should be emphasized by
something like forAllArbitrary.

Shrink will be more tied to Gen in ScalaCheck 2.0. Basically a generator
that is responsible for constructing a value should also be responsible
for deconstructing it (by shrinking it). This should help with the more
unexpected shrinking effects you currently can see in ScalaCheck, since
currently Shrink knows nothing about how a value has been produced.

So other than picking a specific generator (by using something like
"forAll(gen) { x => ... }") I don't think there needs to be a way to
bring in different Arbitrary instances for the same types. My feeling is
that Arbitrary should have the specific meaning I tried to describe
earlier, and then there ideally should be just one Arbitrary instance
per type. I don't know if we can enforce this in some way other than
convention though. And I'm keen on hearing other people's opinions on that
idea.

> - have a way to declare which of the arguments should be collected
> (opposed to any value at the moment)

I'm not sure what you mean here. But I fully agree that there can be
much better support for collecting data samples and statistics during a
test run.

> - declare the verbosity of the property

This should probably be handled like the rest of the params.

> - pass parameters from the command line to override the property
> parameters (that one is really nice when testing)

Yeah, the whole parameter thing should be cleaned up and improved, so
you can set parameters (ideally down to individual properties) on all
levels (command line, sbt, Properties, Prop), with sensible override
semantics.

> - output the args for a failed property so that strings are quoted (if
> the arg is a case class) in order to be able to more easily copy and
> paste from the console and recreate a failed example

In ScalaCheck 2.0, generators should be deterministic, and there should
be some support to output the seed that was used for a failed test, and
that seed should be everything you need to exactly reproduce the test
(within ScalaCheck). But maybe you are after something more like: I want
to create a traditional unit test out of this property failure. That's
an interesting idea, and I wonder if it could be built into Pretty and
controlled by some flag (like verbosity).


> If you implement #120
> <https://github.com/rickynils/scalacheck/issues/120>, some of this might

etorreborre

Dec 23, 2014, 4:47:30 PM
to scala...@googlegroups.com, etorr...@gmail.com
> I'm not really following what you mean. The core methods of Gen are
> really map and flatMap, to support the current monadic style when 
> constructing generators. I don't see how separating stuff into Gen and 
> GenData would work (or make  things better)

Sorry if I was not very clear. What I meant is that the definition of Arbitrary requires returning a Gen object.
But once the Gen object is built (using map and flatMap), only the doApply method is used when evaluating a Prop (as far as I understand).
So the requirement on the Arbitrary typeclass could be lowered to just returning a Gen object with a doApply method, with GenData subclassing or implementing Gen.

> My suggestion was simply that if you want to support several different 
> type classes similar to Arbitrary, it wouldn't work just using "implicit 
> val genInt: Gen[Int]". Of course, you could still rely on using scope 
> for picking the right implicit generator, but I don't see how that would 
> be better than having different type classes. 

Right. I think that part of my frustration is that there's only one Arbitrary typeclass at the moment. So when I define a Gen[T] and I have to declare a trivial Arbitrary[T] returning that Gen, it feels like I'm doing unnecessary work.
But now I get the point about the necessity of keeping the Gen *datatype* and the Arbitrary *typeclass* apart.

>>   - set specific Pretty values for each parameter on a property 

> This might be useful, but I'm not sure. I'd be happy to see use cases 
> where the same type should be presented in different ways for different 
> properties. 

Let's say I have:

  case class Member(age: Int, gender: String, transactions: Transactions)

  implicit def arbitraryMember: Arbitrary[Member] = ???

and then I want to test a property only about the age of the member

  Prop.forAll((member: Member) => (member.age >= 18) ==> isMajor(member))

If the property is failing then it will display a member but will also display all its transactions which might be very verbose in that case. Having the ability to trim down the output is useful in that case.
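
As a sketch of what trimming the output could look like with what ScalaCheck already offers (not the specs2 sugar proposed above): an implicit conversion to org.scalacheck.util.Pretty placed in scope of the property. The Transactions alias, the generator and the name prettyMember are assumptions made for this example, and I am assuming the conversion in lexical scope is preferred over the default Pretty for any value.

```scala
import scala.language.implicitConversions
import org.scalacheck.{Arbitrary, Gen, Prop}
import org.scalacheck.util.Pretty

object TrimmedOutput {
  // Transactions is simplified to a list of amounts for this sketch.
  type Transactions = List[Int]
  case class Member(age: Int, gender: String, transactions: Transactions)

  implicit val arbitraryMember: Arbitrary[Member] = Arbitrary(
    for {
      age    <- Gen.choose(0, 120)
      gender <- Gen.oneOf("F", "M")
      txs    <- Gen.listOf(Gen.choose(1, 1000))
    } yield Member(age, gender, txs)
  )

  // Report only the field this property cares about.
  implicit def prettyMember(m: Member): Pretty =
    Pretty(_ => s"Member(age = ${m.age}, ...)")

  def isMajor(m: Member): Boolean = m.age >= 18

  val prop: Prop =
    Prop.forAll { (member: Member) => isMajor(member) == (member.age >= 18) }
}
```

A different property on Member could bring a different conversion into scope, which is the per-property flexibility being asked for.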

> Shrink will be more tied to Gen in ScalaCheck 2.0. Basically a generator 
> that is responsible for constructing a value should also be responsible 
> for deconstructing it (by shrinking it). This should help with the more 
> unexpected shrinking effects you currently can see in ScalaCheck, since 
> currently Shrink knows nothing about how a value has been produced. 

That's really cool because I had issues with that in the past as well.

 
> But maybe you are after something more like: I want 
> to create a traditional unit test out of this property failure. That's 
> an interesting idea, and I wonder if it could be built into Pretty and 
> controlled by some flag (like verbosity). 

Yes, I want to be able to easily re-run a failed example in order to analyse / debug it. 
But that doesn't have to be by copy/pasting values from the console. 
Maybe if the test output would return the seed and test number to feed this back again to the Prop evaluation then we could just re-generate and re-evaluate only that precise test case.

Now about ScalaCheck 2.0, have you started working on it already? Do you have any time frame for releasing it?

Thanks,

Eric.

Rickard Nilsson

Dec 23, 2014, 6:40:18 PM
to scala...@googlegroups.com, etorr...@gmail.com
On 12/23/2014 10:47 PM, etorreborre wrote:
> > I'm not really following what you mean. The core methods of Gen are
> > really map and flatMap, to support the current monadic style when
> > constructing generators. I don't see how separating stuff into Gen and
> > GenData would work (or make things better)
>
> Sorry if I am not very clear. What I meant is that the definition of
> Arbitrary requires to return a Gen object.
> But once the Gen object is built (using map and flatMap) only the
> doApply method is used when evaluating a Prop (as far as I understand)
> So the requirement on the Arbitrary typeclass could be lowered to just
> returning a Gen object with a doApply method and have GenData subclass
> or implement Gen.

We still want to support using Arbitrary.arbitrary as an ordinary
generator, like:

val myGen: Gen[List[Char]] =
  Arbitrary.arbitrary[String].map(_.toList)


> > My suggestion was simply that if you want to support several different
> > type classes similar to Arbitrary, it wouldn't work just using "implicit
> > val genInt: Gen[Int]". Of course, you could still rely on using scope
> > for picking the right implicit generator, but I don't see how that would
> > be better than having different type classes.
>
> Right. I think that part of my frustration is that there's only one
> Arbitrary typeclass at the moment. So when I define a Gen[T] and I have
> to declare a trivial Arbitrary[T] returning that Gen, it feels like I'm
> doing unnecessary work.
> But now I get the point about the necessity of keeping the Gen
> *datatype* and the Arbitrary *typeclass* apart.

Yes, I understand your frustration. I guess part of it could be resolved
by better documentation/naming. Even though "forAll { xs: List[Int] =>
... }" looks nice and magical, I think it is better to introduce new
ScalaCheck users to the form "forAll(gen) { xs => ... }" first.


> >> - set specific Pretty values for each parameter on a property
>
> > This might be useful, but I'm not sure. I'd be happy to see use cases
> > where the same type should be presented in different ways for different
> > properties.
>
> Let's say I have:
>
> case class Member(age: Int, gender: String, transactions: Transactions)
>
> implicit def arbitraryMember: Arbitrary[Member] = ???
>
> and then I want to test a property only about the age of the member
>
> Prop.forAll((member: Member) => (member.age >= 18) ==> isMajor(member))
>
> If the property is failing then it will display a member but will also
> display all its transactions which might be very verbose in that case.
> Having the ability to trim down the output is useful in that case.

That is a nice example. I think we can benefit both from using specific
Pretty instances, but also from more convenient ways of labeling
arguments and parts of properties.


> That's really cool because I had issues with that in the past as well.
> > But maybe you are after something more like: I want
> > to create a traditional unit test out of this property failure. That's
> > an interesting idea, and I wonder if it could be built into Pretty and
> > controlled by some flag (like verbosity).
>
> Yes, I want to be able to easily re-run a failed example in order to
> analyse / debug it.
> But that doesn't have to be by copy/pasting values from the console.
> Maybe if the test output would return the seed and test number to feed
> this back again to the Prop evaluation then we could just re-generate
> and re-evaluate only that precise test case.

Yes, this was my thought, to output the seed and any other necessary
parameters (like size, possibly baked together in a single string), to
be able to use it in a new test run.


> Now about ScalaCheck 2.0, have you started working on it already? Do you
> have any time frame for releasing it?

I have started, and I'm trying out a new generator implementation that
looks promising so far. But I have no time frame estimates yet.


/ Rickard

etorreborre

Dec 23, 2014, 7:08:35 PM
to scala...@googlegroups.com, etorr...@gmail.com
Oh, something else I forgot to mention in my new "ScalaCheck" trait:

 - the possibility to add before/after (or setup/teardown) actions around properties

We sometimes test applications and not simple pure functions (let's say file system interactions) and it is useful to add this kind of pre/post processing behaviour at the property level (without going through a full "Command" modelling).

E.

Rickard Nilsson

Dec 23, 2014, 7:35:18 PM
to scala...@googlegroups.com, etorreborre@gmail.com >> Eric Torreborre
On 12/24/2014 01:08 AM, etorreborre wrote:
> Oh, something else I forgot to mention in my new "ScalaCheck" trait:
>
> - the possibility to add before/after (or setup/teardown) actions
> around properties
>
> We sometimes test applications and not simple pure functions (let's say
> file system interactions) and it is useful to add this kind of pre/post
> processing behaviour at the property level (without going through a full
> "Command" modelling).

Yes, I've been thinking of this too. The cleanest way to solve it would
be to simply introduce a property combinator that just wraps another
property and inserts callbacks before and after. I'd like to also
replace the current TestCallback thingy with such an approach. This
would work for stuff that is needed on *each property evaluation*, but
not if we need setup/teardown once for each property evaluation round
("test" in ScalaCheck terms).
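
A wrapping combinator of that kind might look roughly like the sketch below. The name around and the two counters are hypothetical and nothing like this exists in ScalaCheck under that name; it only relies on Prop.apply(Gen.Parameters => Prop.Result) and on a Prop being applicable to Gen.Parameters.

```scala
import org.scalacheck.{Gen, Prop}

object AroundSketch {
  // Hypothetical combinator: run `before` and `after` around every single
  // evaluation of the wrapped property.
  def around(before: () => Unit, after: () => Unit)(p: Prop): Prop =
    Prop { (params: Gen.Parameters) =>
      before()
      try p(params)
      finally after()
    }

  // Counters used only to observe how often the callbacks run.
  var setups = 0
  var teardowns = 0

  val wrapped: Prop =
    around(() => setups += 1, () => teardowns += 1)(
      Prop.forAll { (n: Int) => n + 0 == n }
    )
}
```

As noted, this covers per-evaluation callbacks only; a once-per-test setup would need a hook at a different level.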


/ Rickard

etorreborre

Dec 23, 2014, 7:50:31 PM
to scala...@googlegroups.com, etorr...@gmail.com
> This would work for stuff that is needed on *each property evaluation*, but
> not if we need setup/teardown once for each property evaluation round
> ("test" in ScalaCheck terms).

In my experience both methods are needed. This is why in specs2 you can do this at the specification level (beforeAll) or at the example level (before).
But sometimes when the example is a ScalaCheck property it is necessary to do it at the property execution level as well.

E. 