random seed


claudia

Sep 29, 2007, 2:12:54 PM
to Maxent
hi to everybody,

I couldn't find an explanation of what "random seed" (in the
settings) actually does. Does it set a starting point for the
randomization of the points selected as training samples?
thank you for your help, best greetings

claudia

Dan.L....@gmail.com

Sep 30, 2007, 11:02:49 AM
to Maxent
Hi Claudia,
It is extremely difficult for a computer to generate truly random
numbers, so many computer programs use pseudorandom number generators
instead. These pseudorandom number generators are algorithms that
calculate a set of numbers that approximate the properties of random
numbers. The seed is a number that you put into this algorithm to
start the process. Usually people don't put a seed in, and if that's
the case the generator will just use the time from your system clock
as its seed. This ensures that repeated runs don't use the same set
of pseudorandom numbers. Sometimes people actually will to use the
same set of pseudorandom numbers repeatedly, though, and in those
cases they can do so by specifying the same seed in repeated
analyses. There's more info here:

http://en.wikipedia.org/wiki/Pseudorandom_number_generator

Hope that helps!

Dan
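
Dan's description of seeding can be tried out with a short Python sketch. It uses Python's standard `random` module, not Maxent's own generator, so treat it only as an illustration of the general idea; the helper name `draw` is made up for this example.

```python
import random

def draw(seed=None, n=5):
    """Return n pseudorandom integers from a generator started with `seed`."""
    # seed=None lets Python choose a seed itself (system entropy or the clock),
    # which is the "no seed given" case described in the post.
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]

first = draw(seed=42)
second = draw(seed=42)   # same seed, so the same sequence again
unseeded = draw()        # no seed, so the sequence normally changes per run

print(first == second)   # True
print(first, unseeded)
```

Fixing the seed is what makes an analysis repeatable; leaving it unset is what makes repeated runs differ.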

Dan.L....@gmail.com

Sep 30, 2007, 11:06:44 AM
to Maxent
That was supposed to say "sometimes people will WANT to use".

claudia

Sep 30, 2007, 11:50:36 AM
to Maxent
Hi Dan,

thank you very much for your answer.
The reason why I was asking for "random seed" is that, after running
different runs in maxent with my dataset, the occurrences chosen for
test (25%) seem to be the same in every run.
So if I want the test fraction to be randomly chosen, do I need to
tick "random seed" in the settings menu?
I read that the test samples are chosen randomly in maxent, so in the
first runs, I didn't care about it. But then it struck me that the
purple test occurrences were always in the same locations.
I tried now "random seed" for subsets of my data, and it seems to
work. The test sample locations differ, but the number of test samples
may differ as well (strange).
So did I use the "random seed" correctly for my purpose (to randomize
occurrences chosen for testing), or does it refer to other processes
in maxent?
I would appreciate very much your help, best regards,

Claudia



Dan.L....@gmail.com

Oct 1, 2007, 4:03:31 PM
to Maxent
Yeah, it looks like that's telling it to change the seed each time,
which is what you want. As for the number of test samples differing,
I'm not sure. Do you have duplicate occurrences in your file, and
"remove duplicate presence records" checked? If so, it might be
picking duplicate points and then eliminating, leading to some
variation in number of test points. I'm not sure if Maxent does this,
but it's a possibility. It's also possible that you don't have
environmental data for some of your occurrence points, which means
that they would be left out of your analyses causing some variation in
number of test points. Those are just my first guesses as to what
might cause that behavior.

Hope that helps!

Klaus

Oct 2, 2007, 2:32:13 AM
to Maxent
Hi Claudia

Bernoulli sampling could be another explanation for the varying sample
sizes. With Bernoulli sampling, each element is included independently
with probability q, so the sample size itself varies between runs.

Klaus


Steven Phillips

Oct 3, 2007, 10:10:56 AM
to Max...@googlegroups.com
I think Dan's second idea is the right diagnosis. Removal of
duplicates happens before the presence data are split into training
and test sets, and the split is done exactly, not with Bernoulli
sampling.

-- Steven
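
The difference between an exact split and Klaus's Bernoulli sampling can be sketched in Python as follows. This is not Maxent's code: the function names, the truncating rounding rule, and the toy coordinates are all assumptions made for illustration. Only the order of operations (duplicates removed first, then an exact split) comes from Steven's message.

```python
import random

def remove_duplicates(points):
    """Keep the first copy of each point, preserving order."""
    return list(dict.fromkeys(points))

def exact_split(points, test_fraction, rng):
    """Shuffle, then cut off a fixed number of test points."""
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounding rule is an assumption
    return shuffled[n_test:], shuffled[:n_test]

def bernoulli_split(points, q, rng):
    """Send each point to the test set independently with probability q."""
    train, test = [], []
    for p in points:
        (test if rng.random() < q else train).append(p)
    return train, test

# 20 distinct toy points plus two duplicates, removed before splitting.
points = remove_duplicates([(i, i) for i in range(20)] + [(0, 0), (1, 1)])

exact_sizes = [len(exact_split(points, 0.25, random.Random(s))[1])
               for s in range(10)]
bernoulli_sizes = [len(bernoulli_split(points, 0.25, random.Random(s))[1])
                   for s in range(10)]

print(exact_sizes)      # the same test-set size (5) for every seed
print(bernoulli_sizes)  # test-set size typically changes from seed to seed
```

With an exact split, a changing seed changes which points are held out but not how many, so a varying test count has to come from somewhere else.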

claudia

Oct 3, 2007, 4:18:09 PM
to Maxent
Hi Steven,
Thank you for your comment. I guess that the right diagnosis was Dan's
second idea. A few of my occurrences have no background data. I
checked it in my runs, and it is true: if a species has only a few
occurrences, e.g. 5, so that the test set is small (e.g. 1 test
sample), and an occurrence without background data is chosen as the
test point, there will be no test sample. If a species has more
occurrences and an occurrence without background data is chosen for
the test set, there is one test sample fewer than in runs where no
such occurrence was chosen.
I chose to use all occurrence data, even though maxent was warning me,
because I have many species with only few occurrences, and wanted to
have a visual aid to see whether the model fits. Do you know what
the maxent warning "experimental" means?
I also have the problem that more than 10% of my species have only
2 occurrence points. Given that maxent is an algorithm that tries to
find the probability distribution which is most spread out within a
constrained space, would it be advisable to model 2 occurrences?
In many studies, if there are not enough occurrences, the species are
left out, but for conservation analyses, in particular species with
only few occurrences are important. It would be great if you had an
idea what to do with species having few occurrences.
Best regards, claudia
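
The effect Claudia reports in this message (an exact split, followed by the loss of test points that have no environmental data) can be mimicked with a small Python sketch. It assumes, based only on her observation, that such points are dropped after the split; the function name, the point labels, and the truncating rounding rule are hypothetical and do not come from Maxent.

```python
import random

def split_then_drop(occurrences, missing_env, test_fraction, rng):
    """Pick an exact number of test points, then drop those lacking data."""
    shuffled = list(occurrences)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounding rule is an assumption
    test = shuffled[:n_test]
    # Assumed behaviour: points without environmental data are discarded
    # only after they have already been assigned to the test set.
    return [p for p in test if p not in missing_env]

occurrences = ["p1", "p2", "p3", "p4", "p5"]  # a species with 5 occurrences
missing_env = {"p5"}                          # one has no environmental data

counts = [len(split_then_drop(occurrences, missing_env, 0.25, random.Random(s)))
          for s in range(50)]
print(sorted(set(counts)))  # each run has either 0 or 1 usable test points
```

Under that assumption, a species with 5 occurrences and a 25% test fraction ends up with no test sample at all whenever the single held-out point is the one without data.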



Sam Veloz

Oct 3, 2007, 4:24:36 PM
to Max...@googlegroups.com
I wouldn't have much confidence in a model built from only 2 points.
However, it seems like you could use the prediction to direct future
surveys and either validate the model or identify new occurrences with
which you could make more robust models.
Sam

claudia

Oct 3, 2007, 4:54:27 PM
to Maxent
Hi Sam,
thank you for your reply! I think you are right about the limited use
of 2-occurrences-models.
Best regards, claudia