How to address sample size in the test matrix?

Amit Gelber

Oct 8, 2020, 3:57:22 AM

to indeedeng-proctor-users

Hey,
Great work! Proctor looks like a great framework.
Two questions:

1. Is the project still maintained? Is it under active development?

2. In the presentation (https://engineering.indeedblog.com/talks/managing-experiments-behavior-dynamically-proctor/, slide 49), sample size is addressed, but there is no explanation of how to use it in the test matrix.

What I mean is that, before creating an experiment, the analyst usually calculates the minimum sample size needed to decide which group won, for example with a calculator such as https://www.surveysystem.com/sscalc.htm.

Once the desired sample size is reached, we want to stop allocating users to that specific experiment.

Is this subject taken care of in the Proctor framework?
Thank you!

Ketan Gangatirkar

Oct 8, 2020, 1:07:27 PM

to Amit Gelber, indeedeng-proctor-users

Hi, Amit. Yes, Proctor is in very active use with thousands of experiments being executed through it at Indeed. It's pretty stable at this point, but there are incremental fixes and improvements as needed, though most of our effort has been dedicated to other (internal only) tools in our broader experimentation efforts.

With respect to your sample size question, Proctor doesn't specify the sample size directly. That would require it to know how much traffic is coming in at what rate, or to maintain state somewhere counting how many visitors you have had. Proctor is meant to be stateless and have minimal dependencies, so the only parameter available here is a proportional test group size, that is to say the proportion of overall users that should be in the test population.
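To illustrate why a proportional group size needs no stored state, here is a minimal Python sketch of hash-based allocation. It is not Proctor's actual implementation; the function names, the salt string, and the group layout are all hypothetical, chosen only to show that the same identifier always lands in the same group without any counter or database.

```python
import hashlib

def unit_interval(salt: str, user_id: str) -> float:
    """Map (salt, user_id) deterministically to a number in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # The first 6 bytes form a 48-bit integer, which a float represents exactly.
    return int.from_bytes(digest[:6], "big") / 2**48

def assign_group(salt: str, user_id: str, allocations: list[tuple[str, float]]) -> str:
    """Pick a group from (name, proportion) pairs whose proportions sum to 1."""
    point = unit_interval(salt, user_id)
    upper = 0.0
    for name, proportion in allocations:
        upper += proportion
        if point < upper:
            return name
    # Guard against floating-point rounding in the cumulative sum.
    return allocations[-1][0]

# Hypothetical layout: 5% test, 5% control, 90% not in the experiment.
ALLOCATIONS = [("test", 0.05), ("control", 0.05), ("inactive", 0.90)]

if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_group("button_color_tst", uid, ALLOCATIONS))
```

Because the group depends only on the salt, the identifier, and the configured proportions, the number of users exposed grows with traffic; nothing in this scheme can stop at a fixed count.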

I recommend taking the calculated minimum with a grain of salt. There's no small fraction sample size that guarantees accuracy; you just have to decide what level of incorrectness you're willing to accept. As well, the confidence in a result doesn't vary only with the sample size but also with the magnitude of the effect being detected. Also, many products have user bases that vary over the course of a day or week, and there's a higher risk of getting a non-representative sample if you run the test for shorter periods of time. For these reasons, I strongly recommend that you configure your test group size so that it exposes your test to a sample substantially larger than this calculated minimum over the course of a full week of use.
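As a rough illustration of how the required sample depends on the effect size, and how a minimum could be turned into a proportional group size, here is a Python sketch. It uses the standard normal-approximation formula for a two-sided comparison of two proportions, which may differ from what the calculator linked earlier in the thread computes. The baseline rate, the weekly traffic figure, and the safety factor of 2 are made-up assumptions, not recommendations from Proctor or Indeed.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_control: float, p_test: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per group to tell p_control from p_test
    with a two-sided z-test for two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = p_test - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)

def group_fraction(n_per_group: int, weekly_users: int,
                   safety_factor: float = 2.0) -> float:
    """Fraction of traffic per group so that one full week of use yields
    safety_factor times the calculated minimum (capped at 100%)."""
    return min(1.0, safety_factor * n_per_group / weekly_users)

if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline rate, 500,000 users per week.
    for lift in (0.02, 0.01, 0.005):
        n = sample_size_per_group(0.10, 0.10 + lift)
        frac = group_fraction(n, 500_000)
        print(f"lift={lift:.3f}  n_per_group={n}  fraction_per_group={frac:.3f}")
```

Halving the effect to be detected roughly quadruples the required sample, which is one reason a single "minimum sample size" is only as good as the assumed effect that went into it.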

--
You received this message because you are subscribed to the Google Groups "indeedeng-proctor-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to indeedeng-proctor...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/indeedeng-proctor-users/fb08df7d-c2ab-412e-9ca1-331b05782cc3n%40googlegroups.com.

--

Ketan Gangatirkar
ke...@indeed.com

Job Seeker Engineering

Amit

Oct 9, 2020, 8:33:19 AM

to indeedeng-proctor-users

Hey Ketan, thank you for your answer.
Following this, since we do want to limit the exposure for a certain experiment, do you suggest creating some kind of alert mechanism that notifies us when the experiment reaches a certain level of effect? The motivation here is to expose small groups of users to tests, and that way enable more tests to run in parallel.

Thank you again, really appreciate it.


Henry Stokeley

Oct 12, 2020, 7:01:51 AM

to indeedeng-proctor-users

Hello,

As Ketan mentioned, Proctor doesn’t keep any state and so isn’t aware of the sample size of an experiment. If you wish to turn an experiment off after a certain sample size, you would need to do the sample size calculations manually. However, to expand on the above, it is usually desirable to collect more data than a minimum sample size calculation suggests, for a few reasons:

1. This improves the chances of detecting a genuine change and can be relatively easy to do in an online context

2. The sample size (Power Analysis) calculations themselves contain assumptions regarding 'nuisance parameters' which aren't exact

3. As Ketan mentioned, it is common to run online experiments for whole units of a certain time window (full weeks, for example) in addition to reaching a specific sample size

Note that Proctor already supports concurrent experiments, as well as multiple variants in a single experiment. So you shouldn’t need to turn off one experiment to run parallel experiments.

If your motivation for not running many parallel experiments is a concern regarding interaction effects, you may need to organize an experiment schedule outside of Proctor. You may also find points 3.7 and 5.2 in this paper interesting; they discuss concurrency and interactions.

Ketan Gangatirkar

Oct 12, 2020, 10:25:41 AM

to Henry Stokeley, indeedeng-proctor-users

If you still have a strong need to strictly limit the count, then it would have to be built on top of Proctor. A simple version of this would be storing a count in some database and guarding the test membership check with a check of the count. You probably don't need to build the former since you must have something that's recording your test results, so you can probably just check that. You could wedge this into an extension of the functions available in Proctor's rule language, but that's not likely worth it unless you're going to be doing this sort of thing a lot.
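A minimal Python sketch of such a guard follows, under stated assumptions. `ExposureLimiter` is a hypothetical in-memory stand-in for the database or results store, and `lookup_group` is a hypothetical callable standing in for whatever code asks Proctor for the user's group; neither is part of Proctor. The sketch stores the admitted identifiers rather than a bare count, so that users admitted before the cap was reached keep their group on later visits.

```python
import threading
from typing import Callable

class ExposureLimiter:
    """In-memory stand-in for the store that tracks who has been exposed."""

    def __init__(self, cap: int) -> None:
        self._cap = cap
        self._admitted: set[str] = set()
        self._lock = threading.Lock()

    def admit(self, user_id: str) -> bool:
        """Return True if the user is, or can still become, part of the sample."""
        with self._lock:
            if user_id in self._admitted:
                return True          # already counted; keep the experience stable
            if len(self._admitted) >= self._cap:
                return False         # cap reached; admit no new users
            self._admitted.add(user_id)
            return True

def guarded_group(user_id: str, limiter: ExposureLimiter,
                  lookup_group: Callable[[str], str],
                  fallback: str = "inactive") -> str:
    """Return the allocated group while under the cap, else the fallback."""
    group = lookup_group(user_id)
    if group == fallback:
        return fallback              # users outside the test never use up the cap
    return group if limiter.admit(user_id) else fallback

if __name__ == "__main__":
    limiter = ExposureLimiter(cap=2)
    always_test = lambda uid: "test"  # hypothetical stub for the real group lookup
    for uid in ("a", "b", "c", "a"):
        print(uid, guarded_group(uid, limiter, always_test))
```

In a real deployment the set and lock would be replaced by a shared store, and that store becomes a dependency on every request, which is the statelessness trade-off described earlier in the thread.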
