Here is a solution for identifying your top leech items on a Linux
machine. The program it uses is also available for Windows
(http://sourceforge.net/projects/xmlstar/), but I did not test it on
Windows. The procedure produces a simple list of your flashcards,
sorted by lapses in descending order.
Background information about leech items:
http://ichi2.net/anki/wiki/Leeches
http://www.supermemo.com/help/leech.htm
Requirements for the procedure:
- Linux
- xmlstarlet installed (http://sourceforge.net/projects/xmlstar/)
Procedure:
- export the database to XML, including learning data
- run the following command in a Linux console, in the same directory
where you put the XML file:
xmlstarlet sel -T -t -m /mnemosyne/item -s D:N:- "@lps" -v "concat(@lps,'|',Q,'|',A)" -n default.xml >> lps_ranking.txt
Result/Output:
The output file (lps_ranking.txt) lists all your flashcards, sorted by
lapses in descending order.
Here are the first few lines of my output file:
8|<img src="media/ui/semester1/eff1/Sorbus_aria-knospe.jpg">|Sorbus_aria
8|Terminus von "Gips" |Calciumsulfat
7|Ethanol|<$>C_2H_6O</$>
7|Japanischer Pagodenbaum|Styphnolobium japonicum
6|<img src="media/ui/semester1/eff1/Ulmus_minor-knospe1.jpg">|Ulmus_minor
6|<img src="media/ui/semester1/eff1/Sorbus_torminalis-knospe.jpg">|Sorbus_torminalis
6|Sanddorn|Hippophaë rhamnoides
5|Berg-Föhre|Pinus mugo
5|incite|to encourage someone to do or feel something unpleasant or violent
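For readers without xmlstarlet, the same ranking could probably be
sketched with Python's standard library. This is only an illustration,
not a tested replacement for the command above: the sample XML is made
up, the function name rank_by_lapses is hypothetical, and it assumes
that every item has an lps attribute plus Q and A child elements, as
the XPath in the command implies.

```python
import xml.etree.ElementTree as ET

# Made-up miniature export; a real export would be read from default.xml.
SAMPLE_XML = """<mnemosyne>
  <item id="a" gr="4" e="2.462" lps="1"><Q>Ethanol</Q><A>C2H6O</A></item>
  <item id="b" gr="2" e="1.300" lps="8"><Q>Gips</Q><A>Calciumsulfat</A></item>
  <item id="c" gr="3" e="1.900" lps="5"><Q>Sanddorn</Q><A>Hippophae rhamnoides</A></item>
</mnemosyne>"""


def rank_by_lapses(xml_text):
    """Return 'lps|Q|A' lines, highest lapse count first."""
    root = ET.fromstring(xml_text)
    items = root.findall("item")  # the same nodes as /mnemosyne/item
    # Numeric, descending sort on the lps attribute (like -s D:N:- "@lps").
    items.sort(key=lambda item: int(item.get("lps", "0")), reverse=True)
    return [
        "|".join([item.get("lps", "0"), item.findtext("Q", ""), item.findtext("A", "")])
        for item in items
    ]


if __name__ == "__main__":
    for line in rank_by_lapses(SAMPLE_XML):
        print(line)
```

To run it on a real export, the file contents could be passed in with
rank_by_lapses(open("default.xml", encoding="utf-8").read()).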
Hopefully this post is useful to some of you!
Greetings,
Roger
Your formatting seems to be off. I had to remove the space between
"D:N:-" and "@lps" to get it to work. The error messages were
incomprehensible, but I think that was the fix:
xmlstarlet sel -T -t -m /mnemosyne/item -s D:N:-"@lps" -v "concat(@lps,'|',Q,'|',A)" -n default.xml
(Alas for me, I've never got the hang of the 0/1 rankings and so all
my leeches are marked '2', so this script doesn't help me.)
--
gwern
Back up a second - what is the meaning of "leech" in this vernacular?
Why would I want to find them? (I'm not being argumentative, I'm truly
curious.)
I think the 2 links to Anki & SuperMemo documentation addressed that
question well enough.
The name conveys the meaning: cards that for whatever reason have
proven really difficult to remember, and for that reason keep coming
up and 'leeching' your review time. They are time-wasters,
discouraging, ineffective, and indicative of being either irrelevant
or badly formulated. Any of those is enough to want to know about them
for analysis & action.
--
gwern
Interesting. Perhaps it's just a matter of perception, but I thought
the whole point of these algorithms is more or less FOR the "leeches".
If they're hard and tend to monopolize your review time, they should.
That's the idea, right? Until they stop being hard? And if they
practically never do, then perhaps they shouldn't and the spacing
algorithm is doing its job.
I get the points about a card being badly formulated or discouraging,
I guess, and maybe I just don't have enough cards to make it a big
deal, but I relish the leeches; those are specifically the ones I
want to focus on. I guess the whole thing just sounds like a "math is
hard, let's go shopping" simile.
<shrug> To each their own.
They don't stop being hard. That's the point. Until you take the time
to refactor the question, understand it better, or do something else,
they will continue to take up a disproportionate amount of your time.
Also, I wonder if the -v option is in the right place. Moving it to
after the 'sel' quieted errors, but it also produces nothing if I
replace 'lps' with 'ac_rp', which makes me wonder whether the entire
thing is broken.
That said, would it be possible to extract cards ranked 2, and sort by
either difficulty or age?
(Which would reveal leeches better? Probably difficulty; gr2s seem to
hover pretty consistently in the ~1-2 interval, and 4 and 5s are all
higher.)
--
gwern
Your version of the command doesn't work on my system, but it's good
that you found a solution that works for you and maybe for others with
the same issue.
> (Alas for me, I've never got the hang of the 0/1 rankings and so all
> my leeches are marked '2', so this script doesn't help me.)
In this case your easiness factor should be a good indicator for the
leech items. You can change the two @lps in the command to @e, and
change the sort option from D:N:- to A:N:- to sort *A*scending.
The command should then look like this:
xmlstarlet sel -T -t -m /mnemosyne/item -s A:N:- "@e" -v "concat(@e,'|',Q,'|',A)" -n default.xml >> easyness_ranking.txt
In fact, you can sort by any attribute of the item element. A sample
item element looks like this:
<item id="9ded5f9b.inv" u="0" gr="4" e="2.462" ac_rp="5" rt_rp="1"
lps="1" ac_rp_l="2" rt_rp_l="0" l_rp="347" n_rp="349">
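To see which of these attributes are usable as numeric sort keys, a
small Python sketch like the following could help. It is only an
illustration: the function name numeric_attributes is hypothetical,
and the sample element is closed with "/>" here so that it parses on
its own.

```python
import xml.etree.ElementTree as ET

# The sample element from the post, closed with "/>" so it parses alone.
SAMPLE_ITEM = (
    '<item id="9ded5f9b.inv" u="0" gr="4" e="2.462" ac_rp="5" rt_rp="1" '
    'lps="1" ac_rp_l="2" rt_rp_l="0" l_rp="347" n_rp="349"/>'
)


def numeric_attributes(item_xml):
    """Map each attribute whose value parses as a number to a float."""
    item = ET.fromstring(item_xml)
    numeric = {}
    for name, value in item.attrib.items():
        try:
            numeric[name] = float(value)
        except ValueError:
            pass  # for example, the id attribute is not a number
    return numeric


if __name__ == "__main__":
    for name, value in numeric_attributes(SAMPLE_ITEM).items():
        print(name, value)
```

Every attribute this reports should be a candidate for a numeric sort
such as "-s A:N:-" or "-s D:N:-"; the id attribute is skipped because
it is not a number.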
I got the syntax from the xmlstarlet documentation:
http://xmlstar.sourceforge.net/doc/UG/
The specific example, which is quite similar to my command, can be
found on this page: http://xmlstar.sourceforge.net/doc/UG/ch04s01.html
(search for the example named "Query XML document and produce sorted
text table").
>
> That said, would it be possible to extract cards ranked 2, and sort by
> either difficulty or age?
This is possible for me. But note that I DON'T remove the space
between "D:N:-" and "@lps" as you suggested! If I do, the command
fails with an error.
BTW, I use xmlstarlet version 1.0.1-2ubuntu1
>
> (Which would reveal leeches better? Probably difficulty; gr2s seem to
> hover pretty consistently in the ~1-2 interval, and 4 and 5s are all
> higher.)
I agree with you that difficulty might be the better indicator.
However, in the end everybody has to decide for themselves how they
use the grading and what indicates their leeches best.
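The question about extracting only the cards graded 2 and sorting them
by difficulty could be sketched in Python as follows. Treat it as an
assumption-laden illustration: the sample XML is made up, the function
name grade_by_easiness is hypothetical, and it assumes gr holds the
grade and e the easiness, as in the sample item element. With
xmlstarlet, an XPath predicate in the -m path such as
/mnemosyne/item[@gr='2'] would presumably do the same, but that is
untested here.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; a real export would be read from default.xml.
SAMPLE_XML = """<mnemosyne>
  <item id="a" gr="2" e="1.800" lps="0"><Q>incite</Q><A>to encourage</A></item>
  <item id="b" gr="4" e="2.462" lps="1"><Q>Ethanol</Q><A>C2H6O</A></item>
  <item id="c" gr="2" e="1.300" lps="0"><Q>Sanddorn</Q><A>Hippophae rhamnoides</A></item>
</mnemosyne>"""


def grade_by_easiness(xml_text, grade="2"):
    """Return 'e|Q|A' lines for one grade, lowest easiness first."""
    root = ET.fromstring(xml_text)
    # ElementTree supports the [@attrib='value'] predicate.
    picked = root.findall(f"item[@gr='{grade}']")
    picked.sort(key=lambda item: float(item.get("e", "0")))
    return [
        "|".join([item.get("e", ""), item.findtext("Q", ""), item.findtext("A", "")])
        for item in picked
    ]


if __name__ == "__main__":
    for line in grade_by_easiness(SAMPLE_XML):
        print(line)
```

Sorting ascending puts the lowest easiness (the hardest cards) at the
top, which matches the idea of using difficulty as the leech
indicator.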
After a fair bit of tinkering, and examining the example in the docs,
xml sel -T -t -m /xml/table/rec -s D:N:- "@id" -v "concat(@id,'|',numField,'|',stringField)" -n xml/table.xml
I finally came up with something that works:
[12:31 PM] 160Mb$ xmlstarlet sel -T -t -m /mnemosyne/item -s A:N:- '@e' -v "concat('Easiness=',@e,'||',Q,'|',A)" -n default.xml
I was a little shocked that 700 of my cards were at 1.300.
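A quick way to check how many cards share each easiness value might be
a count like the one below. This is a rough Python sketch under the
same assumptions as the command (items under /mnemosyne with an e
attribute); the sample XML is made up and the function name
easiness_counts is hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Made-up sample data; a real export would be read from default.xml.
SAMPLE_XML = """<mnemosyne>
  <item id="a" e="1.300"><Q>q1</Q><A>a1</A></item>
  <item id="b" e="2.462"><Q>q2</Q><A>a2</A></item>
  <item id="c" e="1.300"><Q>q3</Q><A>a3</A></item>
</mnemosyne>"""


def easiness_counts(xml_text):
    """Return (easiness, number of cards) pairs, lowest easiness first."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        item.get("e") for item in root.findall("item") if item.get("e") is not None
    )
    return sorted(counts.items(), key=lambda pair: float(pair[0]))


if __name__ == "__main__":
    for easiness, count in easiness_counts(SAMPLE_XML):
        print(easiness, count)
```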
--
gwern
Like I said, maybe I don't have enough cards, or mine are already
"factored down" enough that it hasn't affected me yet.