I am trying to get to grips with SLiM, which seems awesomely powerful and spectacularly fast!
All of my questions are (will be!) extremely basic - the
problem is that I am really struggling to use the manual. I am not used to
anything object oriented, and I find it extremely unintuitive. My questions all
really amount to "How do I find out how to...?" In particular
"How do I find all of the methods and properties of an object, and how do
I understand what those methods/properties do?" The recipes take me a long way, in the sense
that I can (almost) always guess what they are doing and how they are doing it,
but I cannot ever seem to work out how to generalise from the recipes.
(1) I want to count all of the mutations of type 'm1' in a sample of individuals:
From the manual I can see how to do this for the whole simulation:
D = sum(sim.substitutions.mutationType == m1);
And I can see how to sample individuals:
allIndividuals = sim.subpopulations.individuals;
sampledIndividuals = sample(allIndividuals, 30);
But how do I do the equivalent of "sum(sim.substitutions.mutationType == m1)" on those individuals?
I see individuals are objects so sampledIndividuals will have a property uniqueMutations and there is a method uniqueMutationsOfType(), but I cannot tell what they do. i.e. What *is* a unique mutation in this sense? is it one that is unique within the individual, or unique within the sample?
They both return mutation objects, but how (from this object) would I find out how many distinct mutations there are in the sample? countOfMutationsOfType() seems to count within individuals, not within the sample.
(2) How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?
More importantly - and to avoid me becoming an irritation on the Google group! - how should I go about trying to answer these questions for myself? i.e. what should I be searching for, or how should I be reading the manual? (I do not have access to the GUI.)
Thanks,
Darren
- As Peter noted, the reference sheets are useful for finding the methods and properties associated with each class, and the functions available in Eidos. As well as being at the back of the manuals, they can be downloaded as separate PDFs from the SLiM home page at https://messerlab.org/slim/
- The second half of the SLiM manual also has full reference documentation for each SLiM class. That would be the place to look to answer your question "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?" if you want more detail than is provided by the reference sheets.
It sounds like you don't have much experience with programming, particularly object-oriented programming, and the learning curve can be a bit steep at the beginning.
Eidos is quite similar to R, at least superficially, so it might be easier for you to go learn basic R programming first, as there are many more online resources available for that. If you get to the point where basic R programming is natural for you, switching to Eidos will be trivial.
- For experimenting and practicing, by far the best method is to work in SLiMgui. [SNIP] Find a way to run SLiMgui,
This gives you very quick turnaround time on your experiments, shows you visually what your model is doing (much easier to interpret and understand than data dumps), provides built-in help on all the classes/methods/properties/functions, gives you code completion facilities that can also help you find the method/property you need, gives you all the recipes from the manual at your fingertips, and more. You say you don't have access to it; I would strongly suggest that you find a way to get such access. Trying to learn how to use SLiM without the GUI, particularly with little programming experience, is like trying to type on a typewriter using a meter-long "typing stick" instead of your fingers, while blindfolded.
- It is also not clear to me whether you've looked at the Eidos manual, but if you're having trouble with the basic syntax / semantics of programming in Eidos, that is the place to look. If the Eidos language itself is a challenge for you right now, then trying to understand the SLiM manual without getting a grip on Eidos first may be tricky. Again, starting with R might actually be easier (especially if you're going to end up needing to know R anyway, which most grad students eventually do).
- Note that "mutations" and "substitutions" are not the same thing; "substitutions" are created when a mutation becomes fixed. So the code you wrote using sim.substitutions may not be doing what you want it to; sim.mutations may be what you want.
- If you look at the documentation for the Individual class, there is a method, countOfMutationsOfType(), that does precisely what its name suggests.
So your code would simply be: sum(sampledIndividuals.countOfMutationsOfType(m1)).
- You write "What *is* a unique mutation in this sense?" There is documentation for uniqueMutationsOfType() that explains exactly what it does, but if you had further questions, the best way would be to simply try it and see what happens. This would be easiest in the Eidos console in SLiMgui, but failing that, just add a line of code to your model script that calls uniqueMutationsOfType() and prints out the result. Figure out a way to conduct an experiment that answers the question you have. Your question also seems like perhaps part of the confusion is over how vectorized method calls in Eidos work – what the semantics of them are – which the Eidos manual can answer for you.
- You write "how (from this object) would I find out how many distinct mutations there are in the sample?" The size() method, or the size() function (either will work for your purposes) tells you the size of a vector. They are documented in the Eidos manual, and are (or should be) on the Eidos reference sheet.
Regarding your next question, "How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?", here's how I would go about answering that question for myself. (1) Try searching on "frequency" in the second half of the SLiM manual (the reference section) to find anything relevant. That will turn up the mutationFrequencies() method, which sounds promising but returns fractional frequencies, not counts. The documentation for it says "See the -mutationCounts() method to obtain integer counts instead of float frequencies." So look for the reference doc for mutationCounts(), and it looks like exactly what you need. It takes a vector of subpopulations or NULL, so you will need to decide which subpops you want to analyze (or pass NULL for all of them). And it takes a vector of mutations; so you'll need to get the mutations you want, which I guess are all of the mutations of a particular mutation type. That is easily achieved with, for example, sim.mutations[sim.mutations.mutationType == m2] or whatever; you arrive at that by asking "where would I get all the mutations from?" (the simulation object) and "how would I select just the ones of mutation type m2?" (find out which ones are type m2 and subset). If that sort of construction is not obvious, you should read the Eidos manual, which discusses such things in detail. Working out the picky details of exactly how to code what you want would be easy in the Eidos console in SLiMgui.
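The two steps described above (get an integer count per mutation, then keep only one mutation type) can be sketched outside SLiM. This is a minimal Python illustration, not Eidos and not the SLiM API: the function name sfs_counts, the representation of a genome as a set of mutation IDs, and the toy data are all assumptions made for the example.

```python
from collections import Counter

def sfs_counts(genomes, mutation_types, wanted_type):
    """Return {copies_in_sample: number_of_mutations} for one mutation type.

    genomes: list of sets of mutation IDs, one set per sampled genome
             (a hypothetical stand-in for real simulation output).
    mutation_types: dict mapping mutation ID -> type label.
    """
    # Step 1: integer count of each mutation across the sampled genomes.
    per_mutation = Counter(m for g in genomes for m in g)
    # Step 2: keep only the requested type, then tabulate the counts.
    return Counter(
        n for m, n in per_mutation.items() if mutation_types[m] == wanted_type
    )

# Toy sample of four genomes; mutation "c" is of a different type.
genomes = [{"a", "b"}, {"a"}, {"a", "c"}, {"b", "d"}]
mutation_types = {"a": "m1", "b": "m1", "c": "m2", "d": "m1"}
sfs = sfs_counts(genomes, mutation_types, "m1")
print(sorted(sfs.items()))
```

The keys of the result are integer copy numbers in the sample rather than fractional frequencies, which is the distinction drawn above between mutationCounts() and mutationFrequencies().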
- As Peter noted, the reference sheets are useful for finding the methods and properties associated with each class, and the functions available in Eidos. As well as being at the back of the manuals, they can be downloaded as separate PDFs from the SLiM home page at https://messerlab.org/slim/
I had found them, but they're extremely hard to follow.
- The second half of the SLiM manual also has full reference documentation for each SLiM class. That would be the place to look to answer your question "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?" if you want more detail than is provided by the reference sheets.
OK - I was struggling to guess whether this was comprehensive.
It sounds like you don't have much experience with programming, particularly object-oriented programming, and the learning curve can be a bit steep at the beginning.
I do. At least a reasonable amount - say 12 years in R, and quite a bit of C in the early 2000s. Plus some simple bash scripting.
Eidos is quite similar to R, at least superficially, so it might be easier for you to go learn basic R programming first, as there are many more online resources available for that. If you get to the point where basic R programming is natural for you, switching to Eidos will be trivial.
No, it's not trivial. As I say, I spend around 10 hours a week coding in R.
The few people I've spoken to so far seem to have given up on Eidos, and just written their own code to parse ms or VCF output.
- For experimenting and practicing, by far the best method is to work in SLiMgui. [SNIP] Find a way to run SLiMgui,
No Mac, headless Linux machine.
Why do you think I'm a grad student?
But you're right, dumping output in some other format and parsing it in R would almost certainly be easier. Seems a shame not to make use of all the effort you put into Eidos though ...
- Note that "mutations" and "substitutions" are not the same thing; "substitutions" are created when a mutation becomes fixed. So the code you wrote using sim.substitutions may not be doing what you want it to; sim.mutations may be what you want.
That was an analogy. I'm aware that they're different.
- If you look at the documentation for the Individual class, there is a method, countOfMutationsOfType(), that does precisely what its name suggests.
So your code would simply be: sum(sampledIndividuals.countOfMutationsOfType(m1)).
Hmmm.... this was the root of my frustration.
I guess I did something wrong, but sum(sim.mutations.mutationType == m1) gave me a smaller number than sum(sampledIndividuals.countOfMutationsOfType(m1)), which cannot be true. This led me to think that sampledIndividuals.countOfMutationsOfType(m1) was giving me the sum across individuals in the sample, whereas I thought sim.mutations.mutationType was giving me the number of segregating loci.
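One way to see how the two numbers can differ, sketched in Python rather than Eidos. The data and variable names are made up for illustration and are not SLiM API. Summing a per-individual count adds a shared mutation once for every individual that carries it, whereas a count of segregating mutations includes each distinct mutation only once, so the summed figure can be the larger of the two.

```python
# Toy model, not SLiM API: each sampled individual is the set of
# mutation IDs it carries (all assumed to be of one mutation type).
sample = [
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "d"},
]

# Per-individual counts summed over the sample: a shared mutation is
# counted once for every individual that carries it.
summed_counts = sum(len(ind) for ind in sample)

# Distinct mutations in the sample: each mutation is counted once.
distinct = set().union(*sample)
n_distinct = len(distinct)

print(summed_counts, n_distinct)
```

Under this reading, the "distinct mutations in the sample" figure corresponds to the size of the de-duplicated set, not to the sum of the per-individual counts.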
- You write "What *is* a unique mutation in this sense?" There is documentation for uniqueMutationsOfType() that explains exactly what it does, but if you had further questions, the best way would be to simply try it and see what happens. This would be easiest in the Eidos console in SLiMgui, but failing that, just add a line of code to your model script that calls uniqueMutationsOfType() and prints out the result. Figure out a way to conduct an experiment that answers the question you have. Your question also seems like perhaps part of the confusion is over how vectorized method calls in Eidos work – what the semantics of them are – which the Eidos manual can answer for you.
It's not that I hadn't read it, it just made very little sense to me.
- You write "how (from this object) would I find out how many distinct mutations there are in the sample?" The size() method, or the size() function (either will work for your purposes) tells you the size of a vector. They are documented in the Eidos manual, and are (or should be) on the Eidos reference sheet.
So I discovered - the problem is that it was not easy to discover this.
Regarding your next question, "How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?", here's how I would go about answering that question for myself. (1) Try searching on "frequency" in the second half of the SLiM manual (the reference section) to find anything relevant. That will turn up the mutationFrequencies() method, which sounds promising but returns fractional frequencies, not counts. The documentation for it says "See the -mutationCounts() method to obtain integer counts instead of float frequencies." So look for the reference doc for mutationCounts(), and it looks like exactly what you need. It takes a vector of subpopulations or NULL, so you will need to decide which subpops you want to analyze (or pass NULL for all of them). And it takes a vector of mutations; so you'll need to get the mutations you want, which I guess are all of the mutations of a particular mutation type. That is easily achieved with, for example, sim.mutations[sim.mutations.mutationType == m2] or whatever; you arrive at that by asking "where would I get all the mutations from?" (the simulation object) and "how would I select just the ones of mutation type m2?" (find out which ones are type m2 and subset). If that sort of construction is not obvious, you should read the Eidos manual, which discusses such things in detail. Working out the picky details of exactly how to code what you want would be easy in the Eidos console in SLiMgui.
This very usefully describes the challenges involved, thank you.
I shall probably just parse the MS-style output in R / bash. Given that what I want out is quite simple, that will certainly be quicker and easier than learning Eidos, and much cheaper than buying a Mac.
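For reference, a minimal sketch of that parsing route, written in Python rather than R or bash. It assumes the usual ms layout (a "//" line, then "segsites:", then "positions:", then one 0/1 string per sampled haplotype) and reads only the first replicate. The function name ms_derived_counts and the example text are hypothetical, and any extra header lines before "//" are simply skipped.

```python
def ms_derived_counts(text):
    """Return the derived-allele count at each segregating site of the
    first replicate found in ms-style text."""
    lines = [ln.strip() for ln in text.splitlines()]
    start = lines.index("//")  # marker that opens a replicate
    haplotypes = []
    for ln in lines[start + 1:]:
        if ln == "//":  # a second replicate begins; stop here
            break
        # Haplotype rows consist only of 0s and 1s; the "segsites:" and
        # "positions:" rows contain other characters and are skipped.
        if ln and set(ln) <= {"0", "1"}:
            haplotypes.append(ln)
    # Column j of the haplotype block is site j; sum the 1s per column.
    return [sum(int(c) for c in col) for col in zip(*haplotypes)]

# Made-up example: 4 haplotypes, 3 segregating sites.
example = """\
//
segsites: 3
positions: 0.10 0.45 0.80
010
110
011
000
"""
counts = ms_derived_counts(example)
print(counts)
```

The ms format itself carries no mutation-type labels, so this gives counts over whatever mutations were written out; restricting to a single mutation type is not addressed by this sketch.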
Hi Ben,
My apologies for my aggressive tone, I'm simply frustrated by the learning curve.
I'm used to being able to look at any script and guess how to do things.
[I'm afraid that I was also unreasonably irked by your advice (to summarise my take-home) of "You grad students should learn to script, and you need to buy a Mac"]
I really am immensely impressed with SLiM - it is spectacularly powerful & blazingly fast. There are a lot of things I'd like to do with it.
I've been trying to work out why I find the manual difficult to follow, and I think the following things are adding to my frustration:
(1) It's long and dense and I don't want to have to read 380 pages of text and (admittedly great) recipes that are not useful to me until after I've exhausted all other possibilities. These things need to be there, but shouldn't be the starting point.
It would also be great if the recipes were commented line by line (with explanation) and the reference was spread out/bulleted (see next point).
(2) I don't want to read text unless I have to; I want to be able to see how to use something by example. I think it would be great to start the manual with a 1-page "for the impatient who can script", followed by the reference, then the recipes. Or even have two separate docs for the recipes and reference (as for R references/manuals/vignettes). I think a good layout would be for the document to start with the reference (after a 'for the impatient' and one simple recipe) and have the recipes as appendices or as a separate document (to aid finding the reference section by searching). Within each reference definition give 3 clear example use cases, flag up any common pitfalls, and hyperlink the recipes that use it and any related/alternative functions. Spread it all out and use bullets/lists, so that I don't have to read text. It would help if the reference looked more like "man grep" or "?grep()" in R, so that I can skim over it for a case that looks like mine, rather than it being a dense paragraph of text that I have to locate by repeatedly searching and then read carefully and dissect.
(3) The .pdf I'm using (am I wrong in this?) doesn't seem to have any hyperlink cross-referencing except from the contents, so I simply have to search up and down it to find examples. I'm used to having a man, -h, or ?table() for each function that includes some examples.
(4) I can understand why you provide an Eidos reference separate from the SLiM reference for non-SLiM users, but I think it would be better to put the content of the Eidos reference within the SLiM reference for SLiM users. My thought process had to run: "How do I do this in SLiM? [search manual 1] -> Maybe I can't -> How can I do this myself with Eidos? [search manual 2] -> Perhaps I missed something obvious in SLiM? [search manual 1 again] -> No, I didn't [search manual 2 again]".
(6) The fact that it is so R-like is both great and frustrating. My failure to find size() was because I was looking for length(), and couldn't guess what it might be called if not length(). [size() puts me in mind of sizeof().] It would be helpful if, any time SLiM does something R-like but differs in the syntax or idiom, there were a flag in the manual for the reader. Similarly for any really Python-like aspects (rather like the R manual/reference does for S). If Eidos looked very different I probably wouldn't have this problem.
(7) Ultimately I'm not about to buy (and learn to use) a Mac for a single piece of software, however great that software is (and SLiM is great!).
An interactive interpreter for the Linux command line (without any of the plotting or IDE-like properties) would solve most (all?) of my problems, especially being able to interrogate the properties of an object without having to go back and read the manual. This would be akin to my continual use of str() in R.