Counting mutations in samples of individuals and generating the site frequency spectrum from samples of individuals


Darren Obbard

Apr 25, 2018, 10:50:33 AM
to slim-discuss

I am trying to get to grips with SLiM, which seems awesomely powerful and spectacularly fast!

 

All of my questions are (will be!) extremely basic - the problem is that I am really struggling to use the manual. I am not used to anything object oriented, and I find it extremely unintuitive. My questions all really amount to "How do I find out how to...?" In particular "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?"  The recipes take me a long way, in the sense that I can (almost) always guess what they are doing and how they are doing it, but I cannot ever seem to work out how to generalise from the recipes.

 

(1) I want to count all of the mutations of type 'm1' in a sample of individuals:

 

From the manual I can see how to do this for the whole simulation:

                D = sum(sim.substitutions.mutationType == m1);

And I can see how to sample individuals:

                allIndividuals = sim.subpopulations.individuals;

                sampledIndividuals = sample(allIndividuals, 30);

 

But how do I do the equivalent of "sum(sim.substitutions.mutationType == m1)" on those individuals?

 

I see that individuals are objects, so sampledIndividuals will have a property uniqueMutations, and there is a method uniqueMutationsOfType(), but I cannot tell what they do. What *is* a unique mutation in this sense? Is it one that is unique within the individual, or unique within the sample?


They both return mutation objects, but how (from this object) would I find out how many distinct mutations there are in the sample? countOfMutationsOfType() seems to count within individuals, not within the sample.

 

(2) How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample? 



More importantly - and to avoid me becoming an irritation on the google group! - how should I go about trying to answer these questions for myself? i.e. what should I be searching for, or how should I be reading the manual? (I do not have access to the GUI)

 

Thanks,

 

Darren

 

 

Darren Obbard

Apr 25, 2018, 11:04:39 AM
to slim-discuss

(1) I want to count all of the mutations of type 'm1' in a sample of individuals:

 


Does this do what I want?
 
allIndividuals = sim.subpopulations.individuals;
sampledIndividuals = sample(allIndividuals, 30);
size(unique(sampledIndividuals.uniqueMutationsOfType(m1)));


Meta-questions:
 
How should I go about finding out whether this does what I want?
How should I find out whether this is the sensible way to do it?

Peter Ralph

Apr 25, 2018, 12:58:17 PM
to Darren Obbard, slim-discuss
Have you noticed the quick reference sheets at the end of the manual?


Ben Haller

Apr 25, 2018, 3:15:11 PM
to slim-discuss
Hi Darren.  Regarding getting up to speed on SLiM and Eidos:

- As Peter noted, the reference sheets are useful for finding the methods and properties associated with each class, and the functions available in Eidos.  As well as being at the back of the manuals, they can be downloaded as separate PDFs from the SLiM home page at https://messerlab.org/slim/

- The second half of the SLiM manual also has full reference documentation for each SLiM class.  That would be the place to look to answer your question "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?" if you want more detail than is provided by the reference sheets.

- Regarding how to generalize from the recipes, etc., there isn't any magic bullet.  It sounds like you don't have much experience with programming, particularly object-oriented programming, and the learning curve can be a bit steep at the beginning.  The only real answer is: experiment and practice until you get over the hump.  Eidos is quite similar to R, at least superficially, so it might be easier for you to go learn basic R programming first, as there are many more online resources available for that.  If you get to the point where basic R programming is natural for you, switching to Eidos will be trivial.

- For experimenting and practicing, by far the best method is to work in SLiMgui.  This gives you very quick turnaround time on your experiments, shows you visually what your model is doing (much easier to interpret and understand than data dumps), provides built-in help on all the classes/methods/properties/functions, gives you code completion facilities that can also help you find the method/property you need, gives you all the recipes from the manual at your fingertips, and more.  You say you don't have access to it; I would strongly suggest that you find a way to get such access.  Trying to learn how to use SLiM without the GUI, particularly with little programming experience, is like trying to type on a typewriter using a meter-long "typing stick" instead of your fingers, while blindfolded.

- The single best tool in SLiMgui for experimenting and practicing is the Eidos console window, which lets you interactively code in Eidos using the objects that exist in your simulation.  Find a way to run SLiMgui, and then start trying to attack your problems by experimenting in the Eidos console once you've stepped your model forward to the appropriate point (so that subpopulations, individuals, etc. have been created that you can then interact with in the console).

- It is also not clear to me whether you've looked at the Eidos manual, but if you're having trouble with the basic syntax / semantics of programming in Eidos, that is the place to look.  If the Eidos language itself is a challenge for you right now, then trying to understand the SLiM manual without getting a grip on Eidos first may be tricky.  Again, starting with R might actually be easier (especially if you're going to end up needing to know R anyway, which most grad students eventually do).

  Regarding your first question, about counting mutations of a given type:

- Note that "mutations" and "substitutions" are not the same thing; "substitutions" are created when a mutation becomes fixed.  So the code you wrote using sim.substitutions may not be doing what you want it to; sim.mutations may be what you want.

- If you look at the documentation for the Individual class, there is a method, countOfMutationsOfType(), that does precisely what its name suggests.  So your code would simply be: sum(sampledIndividuals.countOfMutationsOfType(m1)).  In general, in object-oriented programming, you look to the relevant objects to have methods or properties that do what you want.  In this case, since you have a vector of individuals, and want to know how many mutations of a given type they have, you just look at the Individual class and see what APIs it offers.  If it doesn't do what you need, then you look at what it *does* provide, and try to figure out a way to get where you want to go; if the countOfMutationsOfType() method didn't exist, maybe you would have to go down to the level of Genome to get the answer you want.

- You write "What *is* a unique mutation in this sense?"  There is documentation for uniqueMutationsOfType() that explains exactly what it does, but if you had further questions, the best way would be to simply try it and see what happens.  This would be easiest in the Eidos console in SLiMgui, but failing that, just add a line of code to your model script that calls uniqueMutationsOfType() and prints out the result.  Figure out a way to conduct an experiment that answers the question you have.  Your question also seems like perhaps part of the confusion is over how vectorized method calls in Eidos work – what the semantics of them are – which the Eidos manual can answer for you.

- You write "how (from this object) would I find out how many distinct mutations there are in the sample?"  The size() method, or the size() function (either will work for your purposes) tells you the size of a vector.  They are documented in the Eidos manual, and are (or should be) on the Eidos reference sheet.
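
The point about vectorized method calls can be illustrated outside SLiM. The following is only a toy sketch in Python, not Eidos: the data layout (each individual as a pair of sets of mutation IDs) and the helper name unique_mutations are assumptions made for illustration. It mimics the documented idea that a per-individual "unique mutations" call lists each mutation once per individual, that calling it on a vector of individuals concatenates the per-individual results, and that a further unique() followed by size() is therefore needed to count distinct mutations in the sample.

```python
# Toy stand-in for SLiM objects (an assumption for illustration only):
# each individual carries two genomes, and each genome is a set of
# mutation IDs, all taken to be of the same mutation type.
sampled_individuals = [
    ({"a", "b"}, {"a"}),   # individual 0: "a" on both genomes, "b" on one
    ({"a", "c"}, {"c"}),   # individual 1
    ({"b"}, set()),        # individual 2
]

def unique_mutations(individual):
    """Mutations carried by one individual, each listed once (hypothetical helper)."""
    genome1, genome2 = individual
    return sorted(genome1 | genome2)

# A vectorized call applies the method to every element and concatenates
# the results, so a mutation shared by several individuals is repeated.
concatenated = [m for ind in sampled_individuals for m in unique_mutations(ind)]

# Deduplicating the concatenated vector gives the distinct mutations in the
# sample; its length is the analogue of size(unique(...)).
distinct_in_sample = sorted(set(concatenated))

print(concatenated)             # ['a', 'b', 'a', 'c', 'b']
print(len(distinct_in_sample))  # 3
```

In this toy, "unique" in the per-individual call means unique within the individual, which is why the concatenated result still contains repeats across the sample.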

  Regarding your next question, "How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?", here's how I would go about answering that question for myself.  (1) Try searching on "frequency" in the second half of the SLiM manual (the reference section) to find anything relevant.  That will turn up the mutationFrequencies() method, which sounds promising but returns fractional frequencies, not counts.  The documentation for it says "See the -mutationCounts() method to obtain integer counts instead of float frequencies."  So look for the reference doc for mutationCounts(), and it looks like exactly what you need.  It takes a vector of subpopulations or NULL, so you will need to decide which subpops you want to analyze (or pass NULL for all of them).  And it takes a vector of mutations; so you'll need to get the mutations you want, which I guess are all of the mutations of a particular mutation type.  That is easily achieved with, for example, sim.mutations[sim.mutations.mutationType == m2] or whatever; you arrive at that by asking "where would I get all the mutations from?" (the simulation object) and "how would I select just the ones of mutation type m2?" (find out which ones are type m2 and subset).  If that sort of construction is not obvious, you should read the Eidos manual, which discusses such things in detail.  Working out the picky details of exactly how to code what you want would be easy in the Eidos console in SLiMgui.
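
The "count, then tabulate" idea behind a site frequency spectrum can be sketched independently of SLiM. This is a minimal Python toy, not Eidos, under stated assumptions: the sample is represented as a list of genomes, each a set of mutation IDs of the type of interest, and the names sample_genomes and sfs are hypothetical. It counts within the sampled genomes only, whereas a whole-subpopulation count as described above would use every genome in the chosen subpopulations.

```python
from collections import Counter

# Toy sample (an assumption for illustration): 3 diploid individuals,
# i.e. 6 genomes, each represented as a set of mutation IDs.
sample_genomes = [
    {"a", "b"}, {"a"},
    {"a", "c"}, {"c"},
    {"b"}, set(),
]

# Step 1: integer count of each mutation across the sampled genomes.
mutation_counts = Counter(m for genome in sample_genomes for m in genome)

# Step 2: tabulate how many mutations are present in exactly k genomes,
# for k = 1 .. n, where n is the number of sampled genomes.
n = len(sample_genomes)
count_of_counts = Counter(mutation_counts.values())
sfs = [count_of_counts.get(k, 0) for k in range(1, n + 1)]

print(dict(mutation_counts))  # counts: a -> 3, b -> 2, c -> 2
print(sfs)                    # [0, 2, 1, 0, 0, 0]
```

Entry k of the result (1-based) is the number of mutations seen in exactly k sampled genomes, so the spectrum is in integer counts rather than proportions.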

  I hope this helps – and I hope you understand that I can't answer more questions of this sort at this level of detail.  I've tried to show you the resources at your disposal, and how to approach these sorts of questions when programming; this is the "teach a man to fish" approach, rather than the "give a man a fish" approach.  So now it is time for you to go practice catching your own fish.  :->  But step one: find a way to get access to SLiMgui.

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Darren Obbard

Apr 25, 2018, 4:25:50 PM
to slim-discuss
Hi!

Thanks for this,


- As Peter noted, the reference sheets are useful for finding the methods and properties associated with each class, and the functions available in Eidos.  As well as being at the back of the manuals, they can be downloaded as separate PDFs from the SLiM home page at https://messerlab.org/slim/

I had found them, but they're extremely hard to follow.
 
- The second half of the SLiM manual also has full reference documentation for each SLiM class.  That would be the place to look to answer your question "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?" if you want more detail than is provided by the reference sheets.

OK - I was struggling to guess whether this was comprehensive
 
  It sounds like you don't have much experience with programming, particularly object-oriented programming, and the learning curve can be a bit steep at the beginning.  

I do. At least a reasonable amount - say 12 years in R, and quite a bit of C in the early 2000s. Plus some simple bash scripting.
 
Eidos is quite similar to R, at least superficially, so it might be easier for you to go learn basic R programming first, as there are many more online resources available for that.  If you get to the point where basic R programming is natural for you, switching to Eidos will be trivial.

 
No, it's not trivial.  As I say, I spend around 10 hours a week coding in R.

The few people I've spoken to so far seem to have given up on Eidos, and just written their own code to parse ms or VCF output.
 
- For experimenting and practicing, by far the best method is to work in SLiMgui. [SNIP]  Find a way to run SLiMgui,

No mac, headless linux machine.
 
This gives you very quick turnaround time on your experiments, shows you visually what your model is doing (much easier to interpret and understand than data dumps), provides built-in help on all the classes/methods/properties/functions, gives you code completion facilities that can also help you find the method/property you need, gives you all the recipes from the manual at your fingertips, and more.  You say you don't have access to it; I would strongly suggest that you find a way to get such access.  Trying to learn how to use SLiM without the GUI, particularly with little programming experience, is like trying to type on a typewriter using a meter-long "typing stick" instead of your fingers, while blindfolded.
 
- It is also not clear to me whether you've looked at the Eidos manual, but if you're having trouble with the basic syntax / semantics of programming in Eidos, that is the place to look.  If the Eidos language itself is a challenge for you right now, then trying to understand the SLiM manual without getting a grip on Eidos first may be tricky.  Again, starting with R might actually be easier (especially if you're going to end up needing to know R anyway, which most grad students eventually do).
 
Why do you think I'm a grad student? But you're right, dumping output in some other format and parsing it in R would almost certainly be easier. Seems a shame not to make use of all the effort you put into Eidos though ...

 

- Note that "mutations" and "substitutions" are not the same thing; "substitutions" are created when a mutation becomes fixed.  So the code you wrote using sim.substitutions may not be doing what you want it to; sim.mutations may be what you want.


That was an analogy. I'm aware that they're different.
 

- If you look at the documentation for the Individual class, there is a method, countOfMutationsOfType(), that does precisely what its name suggests.

 So your code would simply be: sum(sampledIndividuals.countOfMutationsOfType(m1)).

 Hmmm.... this was the root of my frustration.

I guess I did something wrong, but sum(sim.mutations.mutationType == m1) gave me a smaller number than sum(sampledIndividuals.countOfMutationsOfType(m1)), which cannot be true. This led me to think that sampledIndividuals.countOfMutationsOfType(m1) was giving me the sum across individuals in the sample, whereas I thought sim.mutations.mutationType was giving me the number of segregating loci.

- You write "What *is* a unique mutation in this sense?"  There is documentation for uniqueMutationsOfType() that explains exactly what it does, but if you had further questions, the best way would be to simply try it and see what happens.  This would be easiest in the Eidos console in SLiMgui, but failing that, just add a line of code to your model script that calls uniqueMutationsOfType() and prints out the result.  Figure out a way to conduct an experiment that answers the question you have.  Your question also seems like perhaps part of the confusion is over how vectorized method calls in Eidos work – what the semantics of them are – which the Eidos manual can answer for you.

It's not that I hadn't read it, it just made very little sense to me.


- You write "how (from this object) would I find out how many distinct mutations there are in the sample?"  The size() method, or the size() function (either will work for your purposes) tells you the size of a vector.  They are documented in the Eidos manual, and are (or should be) on the Eidos reference sheet.

So I discovered - the problem is that it was not easy to discover this.
 
  Regarding your next question, "How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?", here's how I would go about answering that question for myself.  (1) Try searching on "frequency" in the second half of the SLiM manual (the reference section) to find anything relevant.  That will turn up the mutationFrequencies() method, which sounds promising but returns fractional frequencies, not counts.  The documentation for it says "See the -mutationCounts() method to obtain integer counts instead of float frequencies."  So look for the reference doc for mutationCounts(), and it looks like exactly what you need.  It takes a vector of subpopulations or NULL, so you will need to decide which subpops you want to analyze (or pass NULL for all of them).  And it takes a vector of mutations; so you'll need to get the mutations you want, which I guess are all of the mutations of a particular mutation type.  That is easily achieved with, for example, sim.mutations[sim.mutations.mutationType == m2] or whatever; you arrive at that by asking "where would I get all the mutations from?" (the simulation object) and "how would I select just the ones of mutation type m2?" (find out which ones are type m2 and subset).  If that sort of construction is not obvious, you should read the Eidos manual, which discusses such things in detail.  Working out the picky details of exactly how to code what you want would be easy in the Eidos console in SLiMgui.

This very usefully describes the challenges involved, thank you.

I shall probably just parse the MS-style output in R / bash. Given that what I want out is quite simple, that will certainly be quicker and easier than learning eidos, and much cheaper than buying a mac.

Thanks!

Darren

Darren Obbard

Apr 25, 2018, 4:43:24 PM
to slim-discuss
Hi!

I thought I ought to go back and check what I had done wrong, given that you suggest sum(sampledIndividuals.countOfMutationsOfType(m1)) is what I want.

As far as I can tell, sum(sim.mutations.mutationType == m1) seems to give me the number of currently segregating sites in the simulation [or is this what I have wrong?], but sum(sampledIndividuals.countOfMutationsOfType(m1)) gives me a massively larger number, closer to summing the number of mutations in each individual in the sample.

Is this the intended behaviour?

Thanks,

Darren


Ben Haller

Apr 25, 2018, 5:18:18 PM
to slim-discuss
- As Peter noted, the reference sheets are useful for finding the methods and properties associated with each class, and the functions available in Eidos.  As well as being at the back of the manuals, they can be downloaded as separate PDFs from the SLiM home page at https://messerlab.org/slim/

I had found them, but they're extremely hard to follow.

  What about them is "extremely hard to follow"?  Please try to make your criticism constructive.
 
- The second half of the SLiM manual also has full reference documentation for each SLiM class.  That would be the place to look to answer your question "How do I find all of the methods and properties of an object, and how do I understand what those methods/properties do?" if you want more detail than is provided by the reference sheets.

OK - I was struggling to guess whether this was comprehensive

  It is reference documentation, so it is comprehensive.  If you find something that is missing, please let me know.
 
  It sounds like you don't have much experience with programming, particularly object-oriented programming, and the learning curve can be a bit steep at the beginning.  
 
I do. At least a reasonable amount - say 12 years in R, and quite a bit of C in the early 2000s. Plus some simple bash scripting.

   OK. I was going from what you wrote earlier, "I am not used to anything object oriented".

Eidos is quite similar to R, at least superficially, so it might be easier for you to go learn basic R programming first, as there are many more online resources available for that.  If you get to the point where basic R programming is natural for you, switching to Eidos will be trivial.
 
No, it's not trivial.  As I say, I spend around 10 hours a week coding in R.

  OK.  Again, if you have constructive criticism, I am happy to hear it.  If there are places in the Eidos manual that are unclear and could be rewritten for greater clarity, that would be welcome feedback.
 
The few people I've spoken to so far seem to have given up on Eidos, and just written their own code to parse ms or VCF output.

  I assure you there is a large community of SLiM users out there coding quite happily in Eidos.
 
- For experimenting and practicing, by far the best method is to work in SLiMgui. [SNIP]  Find a way to run SLiMgui,

No mac, headless linux machine.

  The solution, of course, being: find a Mac.  I am not trying to trivialize that undertaking; I realize Macs are scarce and expensive in some countries.  But if you want an easier learning curve for SLiM, using SLiMgui is the way to get it.
 
Why do you think I'm a grad student?

  I have no idea what you are, and I apologize if what seemed like a logical inference – most of the users who write to ask me questions are grad students – was mistaken.  Is there a reason you're taking such a hostile tone?  I spent more than an hour this morning trying to carefully and politely answer your questions, and yet I'm feeling attacked, throughout this thread.
 
But you're right, dumping output in some other format and parsing it in R would almost certainly be easier. Seems a shame not to make use of all the effort you put into Eidos though ...

  If your use case is simple, and can be achieved as you describe, then indeed, the effort required to learn a new language and a new tool may not be worth the payoff.  Nobody is forcing you to do your analysis in Eidos if you prefer R.
 
- Note that "mutations" and "substitutions" are not the same thing; "substitutions" are created when a mutation becomes fixed.  So the code you wrote using sim.substitutions may not be doing what you want it to; sim.mutations may be what you want.

That was an analogy. I'm aware that they're different.

  OK.
 
- If you look at the documentation for the Individual class, there is a method, countOfMutationsOfType(), that does precisely what its name suggests.

 So your code would simply be: sum(sampledIndividuals.countOfMutationsOfType(m1)).

 Hmmm.... this was the root of my frustration.

I guess I did something wrong, but sum(sim.mutations.mutationType == m1) gave me a smaller number than sum(sampledIndividuals.countOfMutationsOfType(m1)), which cannot be true. This led me to think that sampledIndividuals.countOfMutationsOfType(m1) was giving me the sum across individuals in the sample, whereas I thought sim.mutations.mutationType was giving me the number of segregating loci.

  sum(sim.mutations.mutationType == m1) gives you the number of mutations of type m1 in the simulation.  sum(sampledIndividuals.countOfMutationsOfType(m1)) gives you the sum of (# of m1 mutations possessed by an individual), across the sampledIndividuals vector.  Since the same mutation can be possessed by more than one individual, the latter number can be larger than the former.  Indeed, as the documentation for countOfMutationsOfType() explains, "a mutation that is present in both genomes counts twice", so in the most extreme case where each m1 mutation is possessed by every individual in the sample in both of each individual's genomes, the latter could be 2N times larger than the former, where N is the size of sampledIndividuals.
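
The relationship between the two numbers can be checked with a toy model. This is a hedged Python sketch, not Eidos; the representation of an individual as a pair of sets of mutation IDs and the helper names count_of_mutations and compare are assumptions for illustration. It shows the summed per-individual count exceeding the number of distinct mutations, and reaching 2N times that number in the extreme case described above.

```python
def count_of_mutations(individual):
    """Per-individual count; a mutation present on both genomes counts twice."""
    genome1, genome2 = individual
    return len(genome1) + len(genome2)

def compare(individuals):
    """Return (sum of per-individual counts, number of distinct mutations)."""
    summed = sum(count_of_mutations(ind) for ind in individuals)
    distinct = len(set().union(*(g for ind in individuals for g in ind)))
    return summed, distinct

# A mixed sample: some mutations are shared, some are on both genomes.
mixed = [({"a", "b"}, {"a"}), ({"a", "c"}, {"c"}), ({"b"}, set())]
print(compare(mixed))    # (7, 3)

# Extreme case: N = 3 individuals, every mutation on both genomes of everyone.
N = 3
extreme = [({"a", "b"}, {"a", "b"}) for _ in range(N)]
summed, distinct = compare(extreme)
print(summed, distinct)  # 12 2
print(summed == 2 * N * distinct)  # True
```

So a summed count much larger than the number of distinct mutations is expected whenever mutations are shared among sampled individuals.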
 
- You write "What *is* a unique mutation in this sense?"  There is documentation for uniqueMutationsOfType() that explains exactly what it does, but if you had further questions, the best way would be to simply try it and see what happens.  This would be easiest in the Eidos console in SLiMgui, but failing that, just add a line of code to your model script that calls uniqueMutationsOfType() and prints out the result.  Figure out a way to conduct an experiment that answers the question you have.  Your question also seems like perhaps part of the confusion is over how vectorized method calls in Eidos work – what the semantics of them are – which the Eidos manual can answer for you.

It's not that I hadn't read it, it just made very little sense to me.

  Again, if you have constructive criticism, that would be welcome.
 
- You write "how (from this object) would I find out how many distinct mutations there are in the sample?"  The size() method, or the size() function (either will work for your purposes) tells you the size of a vector.  They are documented in the Eidos manual, and are (or should be) on the Eidos reference sheet.

So I discovered - the problem is that it was not easy to discover this.

  The Eidos manual introduces this quite clearly; near the beginning of the manual, one of the first examples of calling a function is "You can find out the number of values in a vector using the size() function".  The reference sheet for Eidos also has: "(integer$)size(* x): count elements in x".  How could it be any easier to discover this?  I don't mean that sarcastically; I am genuinely asking.  Again: constructive criticism is welcome.
 
  Regarding your next question, "How would I get the site frequency spectrum for a particular mutation type (as integer counts, not decimal proportions) from a sample?", here's how I would go about answering that question for myself.  (1) Try searching on "frequency" in the second half of the SLiM manual (the reference section) to find anything relevant.  That will turn up the mutationFrequencies() method, which sounds promising but returns fractional frequencies, not counts.  The documentation for it says "See the -mutationCounts() method to obtain integer counts instead of float frequencies."  So look for the reference doc for mutationCounts(), and it looks like exactly what you need.  It takes a vector of subpopulations or NULL, so you will need to decide which subpops you want to analyze (or pass NULL for all of them).  And it takes a vector of mutations; so you'll need to get the mutations you want, which I guess are all of the mutations of a particular mutation type.  That is easily achieved with, for example, sim.mutations[sim.mutations.mutationType == m2] or whatever; you arrive at that by asking "where would I get all the mutations from?" (the simulation object) and "how would I select just the ones of mutation type m2?" (find out which ones are type m2 and subset).  If that sort of construction is not obvious, you should read the Eidos manual, which discusses such things in detail.  Working out the picky details of exactly how to code what you want would be easy in the Eidos console in SLiMgui.

This very usefully describes the challenges involved, thank you.

  You're welcome.
 
I shall probably just parse the MS-style output in R / bash. Given that what I want out is quite simple, that will certainly be quicker and easier than learning eidos, and much cheaper than buying a mac. 

   Feel free to use SLiM in whatever way works for you.

Darren Obbard

Apr 26, 2018, 6:23:46 AM
to slim-discuss
Hi Ben,

My apologies for my aggressive tone, I'm simply frustrated by the learning curve. I'm used to being able to look at any script and guess how to do things. [I'm afraid that I was also unreasonably irked by your advice  (to summarise my take-home) of "You grad students should learn to script, and you need to buy a mac"]

I really am immensely impressed with SLiM - it is spectacularly powerful & blazingly fast. There are a lot of things I'd like to do with it. 

I've been trying to work out why I find the manual difficult to follow, and I think the following things are adding to my frustration:

(1) It's long and dense and I don't want to have to read 380 pages of text and (admittedly great) recipes that are not useful to me until after I've exhausted all other possibilities. These things need to be there, but shouldn't be the starting point. It would also be great if the recipes were commented line by line (with explanation) and the reference was spread out /bulleted (see next point). 

(2) I don't want to read text unless I have to, I want to be able to see how to use something by example. I think it would be great to start the manual with a 1-page "for the impatient who can script", followed by the reference, then by the recipes. Or even have two separate docs for the recipes and the reference (as for R references / manuals / vignettes). I think a good layout would be for the document to start with the reference (after a 'for the impatient' and one simple recipe) and have the recipes as appendices or as a separate document (to aid finding the reference section by searching). Within each reference definition, give 3 clear example use cases, flag up any common pitfalls, and hyperlink the recipes that use it and any related/alternative functions. Spread it all out and use bullets/lists, so that I don't have to read text. It would help if the reference looked more like "man grep" or "?grep()" in R, so that I can skim over it for a case that looks like mine, rather than it being a dense paragraph of text that I have to locate by repeatedly searching and then read carefully and dissect.

(3) The .pdf I'm using (am I wrong in this?) doesn't seem to have any hyperlink cross-referencing except from the contents, so I simply have to search up and down it to find examples. I'm used to having a man,  -h,  or ?table() for each function that includes some examples. 

(4) I can understand why you provide an Eidos reference separate from the SLiM reference for non-SLiM users, but I think it would be better to put the content of the Eidos reference within the SLiM reference for SLiM users. My thought process had to run "How do I do this in SLiM? [search manual 1] -> Maybe I can't -> How can I do this myself with Eidos? [search manual 2] -> Perhaps I missed something obvious in SLiM? [search manual 1 again] -> No I didn't [search manual 2 again]".

(6) The fact that it is so R-like is both great and frustrating. My failure to find size() was because I was looking for length(), and couldn't guess what it might be called if not length(). [size() puts me in mind of sizeof().] It would be helpful if, any time SLiM does something R-like but differs in the syntax or idiom, there were a flag in the manual for the reader. Similarly for any really Python-like aspects (rather like the R manual/reference does for S). If Eidos looked very different I probably wouldn't have this problem.
 
(7) Ultimately I'm not about to buy (and learn to use) a Mac for a single piece of software, however great that software is (and SLiM is great!). An interactive interpreter for the Linux command line (without any of the plotting or IDE-like properties) would solve most (all?) of my problems. Especially being able to interrogate the properties of an object without having to go back and read the manual. This would be akin to my continual use of str() in R.

Thanks again for your help,

Darren

Ben Haller

Apr 26, 2018, 12:55:48 PM
to slim-discuss
Hi Ben,

  Hi Darren,
 
My apologies for my aggressive tone, I'm simply frustrated by the learning curve.

  Thank you for your more constructive tone here.
 
I'm used to being able to look at any script and guess how to do things.

  SLiM is quite a complex piece of software, and it takes some investment of time and effort to become conversant with it.  That said, we (Philipp and I) have done everything we can think of to smooth out that learning curve, including providing about a hundred example recipes, and providing SLiMgui as a learning / modelling environment.
 
[I'm afraid that I was also unreasonably irked by your advice (to summarise my take-home) of "You grad students should learn to script, and you need to buy a Mac"]

  I'm sorry that was your take-home; it is certainly not what I wrote.  What I wrote was that you would find it much easier to learn SLiM if you found a way to get access to a Mac so you could use SLiMgui; and I stand by that advice.  That does not have to mean buying a Mac (although I know of more than one lab that has, in fact, bought a Mac specifically so that they could run SLiMgui); it could mean finding a friend or colleague with a Mac you could use for a little while, or even finding an internet cafe or a library with a Mac you could use.  If that is simply impossible for you, for whatever reason, that is unfortunate, as the fact remains that using SLiMgui is the best way to learn SLiM.
 
I really am immensely impressed with SLiM - it is spectacularly powerful & blazingly fast. There are a lot of things I'd like to do with it. 

  I'm glad to hear it.  Philipp and I have invested years of our lives, literally, into making it the best software we can.
 
I've been trying to work out why I find the manual difficult to follow, and I think the following things are adding to my frustration:

(1) It's long and dense and I don't want to have to read 380 pages of text and (admittedly great) recipes that are not useful to me until after I've exhausted all other possibilities. These things need to be there, but shouldn't be the starting point.

  You certainly don't need to read all of the recipes, and in fact the manual, in its introduction, specifically recommends that you should not do so.  But I don't think dropping people right into the reference doc after a one-page example would work for most beginning users.  The intent is that people will read forward through as many of the introductory recipes as they personally find to be useful, and when they get tired/bored of that, they will then branch out in whatever direction they find most useful for themselves, whether that is jumping forward to a specific recipe relevant to their work, or jumping to the reference doc, or jumping into playing around in SLiMgui.
 
It would also be great if the recipes were commented line by line (with explanation) and the reference was spread out/bulleted (see next point).

  The recipes are generally explained line by line in the text associated with them.  That text could perhaps be recast in the form of comments on the code instead; that seems like six of one, half a dozen of the other, to me, *except* that some single lines of code would require a paragraph or more of explanatory comments, which, to me, would make the code for the recipe rather ungainly.  It would then be hard to see the model as a whole, since one screenful of code would contain only a handful of lines of code, interspersed with large multi-line comments, in many cases.

  So that is a specific response to this particular comment from you; but the more general point is that Philipp and I have, in fact, considered many different options for how to present the recipes and the manuals, and have arrived at the present situation as the best solution, for reasons that may not always be immediately obvious.  A different mode of presentation might work better for you personally (or you might find, after actually experiencing it, that in fact it did not); but we have reasons for the choices we have made in this area.  I'll try to explain those reasons in response to your points below as well.
 
(2) I don't want to read text unless I have to; I want to be able to see how to use something by example. I think it would be great to start the manual with a 1-page "for the impatient who can script", followed by the reference, then followed by the recipes. Or even have two separate docs for the recipes and reference (as for R references/manuals/vignettes). I think a good layout would be for the document to start with the reference (after a 'for the impatient' and one simple recipe) and have recipes as appendices or as a separate document (to aid finding the reference section by searching). Within each reference definition, give 3 clear example use cases, flag up any common pitfalls, and hyperlink the recipes that use it and any related/alternative functions. Spread it all out and use bullets/lists, so that I don't have to read text. It would help if the reference looked more like "man grep" or "?grep()" in R, so that I can skim over it for a case that looks like mine, rather than it being a dense paragraph of text that I (have to locate by repeatedly searching and) then read carefully and dissect.

  Again, there are reasons for the choices we have made.  I would first note that the manual does, in fact, start with quick "for the impatient who can script" sections, and that quick-reference sheets are provided for both Eidos and SLiM, and that you are free to skip ahead to the reference section at any point; the manual specifically recommends that you not read through all of the recipes in order, in fact.  As you note, it makes sense for the "reference doc" to be effectively a separate document, and that is in fact pretty much what we have done by making it a separate "Part II" of the manual; if you really want it to be a completely separate PDF, you could presumably achieve that with a PDF editor with a few seconds of splicing work, but having them together in one PDF seems like the better option for various reasons.  Similarly, with each recipe, you don't have to read the long text if you don't want to; the whole recipe is provided as code which you can simply look at if you wish, so that would seem to allow the usage pattern you want.  If you want an example for a particular property/method in the reference doc, all you need to do is search the PDF for that name, and you should be able to easily find a recipe that uses that property/method; adding example recipes to the documentation for every property/method would make the reference doc easily a hundred pages longer or more (since each example would need to be a full executable SLiM recipe that actually did something to illustrate the purpose/utility of the property/method in question), and would end up being redundant with the examples provided by the recipes.  I like your idea of having the reference doc have hyperlinks to recipes, though, and I have added it to my list of new features to consider.

  So again, the important points here are that (1) Philipp and I have thought carefully about these sorts of issues, and have reasons for the choices we have made, and (2) it seems to me that your desired mode of interacting with the documentation is easily accommodated by the doc as it now stands.  We have gotten enthusiastic compliments from many users on the documentation for SLiM, so I hope you understand that we are not likely to embark upon the (enormous) task of rewriting hundreds of pages of documentation from the ground up based upon one user's comments.  Where a change can be made relatively easily, however, such as your idea about adding hyperlinks from the reference doc to the recipes, I am happy to consider it.
 
(3) The .pdf I'm using (am I wrong in this?) doesn't seem to have any hyperlink cross-referencing except from the contents, so I simply have to search up and down it to find examples. I'm used to having a man, -h, or ?table() for each function that includes some examples.

  Since PDFs are easily searchable, it didn't seem necessary to add cross-reference hyperlinks everywhere; a quick search will produce more "hits" than hyperlinks ever could anyway.  Adding hyperlinks is also not at all easy in the editor I use to produce the manual; just adding the hyperlinks for the sections and subsections was quite a lot of work.  Finally, having the text regularly interrupted by blue underlined hyperlinks would, I feel, interrupt the flow and readability, although that is a matter of personal taste.  Instead, I have chosen to reference related material using section numbers; it is quite easy to follow those, either by searching or using the table of contents, although admittedly not as easy as it would be if all of those references were hyperlinks.
 
(4) I can understand why you provide an Eidos reference separate from the SLiM reference for non-SLiM users, but I think it would be better to put the content of the Eidos reference within the SLiM reference for SLiM users. My thought process had to run "How do I do this in SLiM? [search manual 1] -> Maybe I can't -> How can I do this myself with Eidos? [search manual 2] -> Perhaps I missed something obvious in SLiM? [search manual 1 again] -> No I didn't [search manual 2 again]".

  If you want to join the two manuals together, for easier searchability, that is a few seconds' work in a PDF editor.  We think it makes sense for the manuals to be separate because they are, in fact, quite separate conceptually.  The Eidos manual is the specification for the Eidos language, which is likely to be used in other projects besides SLiM eventually, and is a standalone language with no dependencies upon SLiM.  If you are trying to understand something about the Eidos language itself, look in the Eidos manual; if your question is about the SLiM classes/properties/methods or SLiM model recipes, look in the SLiM manual.  But as I said, if you want to join the manuals together into one PDF, you can certainly do so.
 
(6) The fact it is so R-like is both great and frustrating. My failure to find size() was because I was looking for length(), and couldn't guess what it might be called if not length(). [size() puts me in mind of sizeof().] It would be helpful if, any time SLiM does something R-like but differs in the syntax or idiom, there were a flag in the manual for the reader. Similarly for any really Python-like aspects. (Rather like the R manual/reference does for S.) If Eidos looked very different I probably wouldn't have this problem.

  I'm sorry this was such a source of frustration.  This is good feedback, and in fact, in response to it, I have just added length() as a synonym for size() in Eidos to prevent others from hitting this problem.  I no longer recall why I chose size() for Eidos; perhaps I made that decision very early in the development of Eidos, before I had decided to pattern the language APIs so strongly on R, and perhaps I was unduly influenced by C++, which of course I was writing Eidos in (all else being equal, "size" is shorter and easier to type than "length", too).  In any case, I agree that "size()" was perhaps the wrong decision, in hindsight.  Happily, that is easy to remedy; the new length() synonym will be available in the next release of SLiM.
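
  As a minimal sketch of the two spellings discussed above (assuming an Eidos version that already includes the length() synonym, which per this reply arrives in the next release; the variable name x is purely illustrative):

```
// Build a small vector to measure; "x" is a hypothetical name for illustration
x = c(10, 20, 30);

// size() is the original Eidos spelling for the number of elements in a vector
catn(size(x));

// length() is the R-style synonym described in the reply above;
// it is expected to fail in releases that predate the synonym
catn(length(x));
```

  Both calls should report the same element count, so scripts written with R habits and scripts written in the older Eidos idiom behave identically once the synonym is available.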
 
(7) Ultimately I'm not about to buy (and learn to use) a Mac for a single piece of software, however great that software is (and SLiM is great!).

  Other labs have found it worthwhile to do so; your mileage may vary.  But in any case, as I explained above, buying a Mac is not necessary anyway unless there is really no other way for you to get even temporary access to a Mac.  Even a day or two of working in SLiMgui is enormously beneficial for the early stages of learning SLiM.
 
An interactive interpreter for the Linux command line (without any of the plotting or IDE-like properties) would solve most (all?) of my problems. Especially being able to interrogate the properties of an object without having to go back and read the manual. This would be akin to my continual use of str() in R.

  It would, of course, be nice to have an interactive environment for Linux as well.  SLiMgui is quite a large piece of software that has taken quite a lot of effort to develop; we simply don't have the resources in our group to replicate that effort on Linux.  We also don't have the skills; I have no familiarity with the Linux platform, and would have no idea where to even start coding such an app.  In the end, SLiM is free, open-source software, and anyone with an interest in it is free to contribute to it.  If you, or anybody else, wants to write an interactive Linux front end for SLiM, that would be great; have at it.  :->  "Make an interactive environment for Linux" has, in fact, been on my to-do list for several years, but I am unlikely to ever address it myself, since I have neither the skills nor the time to do so.  So it goes.

  I think at this point we have probably talked all this to death, and I imagine others on this mailing list are feeling a bit spammed by all these emails, so unless there's something that really needs further follow-up, how about we leave it at that?  Thanks for your input, and happy modelling.