identify leech items on Mnemosyne 1.X (solution inside)


Roger

Jan 13, 2010, 5:27:10 AM
to mnemosyne-proj-users
Hi All,

Here is a solution for identifying your top leech items on a Linux
machine. The program used, xmlstarlet, is also available for Windows
(http://sourceforge.net/projects/xmlstar/), but I have not tested it
there. The procedure produces a simple list of your flashcards sorted
by lapses in descending order.

Background information about leech items:
http://ichi2.net/anki/wiki/Leeches
http://www.supermemo.com/help/leech.htm

Requirements for the procedure:
- Linux
- xmlstarlet installed (http://sourceforge.net/projects/xmlstar/)

Procedure:
- Export the database to XML, including learning data.
- Run the following command in a console on Linux, in the directory
where you put the XML file:
xmlstarlet sel -T -t -m /mnemosyne/item -s D:N:- "@lps" -v "concat(@lps,'|',Q,'|',A)" -n default.xml >> lps_ranking.txt

Result/Output:
The output file (lps_ranking.txt) lists all your flashcards, sorted by
lapses in descending order.
Here are the first few lines of my output file:
8|<img src="media/ui/semester1/eff1/Sorbus_aria-knospe.jpg">|Sorbus_aria
8|Terminus von "Gips" |Calciumsulfat
7|Ethanol|<$>C_2H_6O</$>
7|Japanischer Pagodenbaum|Styphnolobium japonicum
6|<img src="media/ui/semester1/eff1/Ulmus_minor-knospe1.jpg">|Ulmus_minor
6|<img src="media/ui/semester1/eff1/Sorbus_torminalis-knospe.jpg">|Sorbus_torminalis
6|Sanddorn|Hippophaë rhamnoides
5|Berg-Föhre|Pinus mugo
5|incite|to encourage someone to do or feel something unpleasant or violent
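
For readers who cannot get the xmlstarlet command to run, the same ranking can be sketched with Python's standard library. This is a minimal sketch, not part of the original procedure. It assumes the export has a <mnemosyne> root with <item> children that carry an lps attribute and Q and A child elements, as the XPath in the command above implies. The function name rank_by_lapses and the three-card sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def rank_by_lapses(root):
    """Return 'lapses|question|answer' lines, highest lapse count first."""
    items = root.findall("item")  # direct <item> children of the root element
    # list.sort is stable, so cards with equal lapses keep their file order
    items.sort(key=lambda it: int(it.get("lps", "0")), reverse=True)
    return [
        "|".join([it.get("lps", "0"), it.findtext("Q", ""), it.findtext("A", "")])
        for it in items
    ]


# Tiny made-up export for illustration only
SAMPLE = """<mnemosyne>
<item id="a" lps="1"><Q>incite</Q><A>to encourage</A></item>
<item id="b" lps="8"><Q>Terminus von "Gips"</Q><A>Calciumsulfat</A></item>
<item id="c" lps="7"><Q>Ethanol</Q><A>C_2H_6O</A></item>
</mnemosyne>"""

lps_lines = rank_by_lapses(ET.fromstring(SAMPLE))
print("\n".join(lps_lines))
```

For a real export, the root would come from ET.parse("default.xml").getroot() instead of the sample string.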


Hopefully this post is useful to some of you!

Greetings,
Roger

Gwern Branwen

Jan 13, 2010, 8:19:39 AM
to mnemosyne-...@googlegroups.com

Your formatting seems to be off. I had to remove the space between
"D:N:-" and "@lps" to get it to work. The error messages were
incomprehensible, but I think that was the solution:

xmlstarlet sel -T -t -m /mnemosyne/item -s D:N:-"@lps" -v "concat(@lps,'|',Q,'|',A)" -n default.xml

(Alas for me, I've never got the hang of the 0/1 rankings and so all
my leeches are marked '2', so this script doesn't help me.)

--
gwern

Michael Campbell

Jan 13, 2010, 8:52:39 AM
to mnemosyne-...@googlegroups.com
Roger wrote:
> Hi All,
>
> here is a solution for identifying your top leech items on a Linux
> machine. The used program is available for Windows too (http://
> sourceforge.net/projects/xmlstar/), but I did not test it on Windows.
> The procedure produces a simple list of your flashcards sorted
> descending by lapses.
>


Back up a second - what is the meaning of "leech" in this vernacular?
Why would I want to find them? (I'm not being argumentative, I'm truly
curious.)

Gwern Branwen

Jan 13, 2010, 9:16:05 AM
to mnemosyne-...@googlegroups.com

I think the 2 links to Anki & SuperMemo documentation addressed that
question well enough.

The name conveys the meaning: cards that for whatever reason have
proven really difficult to remember, and for that reason keep coming
up and 'leeching' your review time. They are time-wasters,
discouraging, ineffective, and indicative of being either irrelevant
or badly formulated. Any of those is enough to want to know about them
for analysis & action.

--
gwern

Michael Campbell

Jan 13, 2010, 9:34:31 AM
to mnemosyne-...@googlegroups.com
Gwern Branwen wrote:
> On Wed, Jan 13, 2010 at 8:52 AM, Michael Campbell
> <michael....@unixgeek.com> wrote:
>
>> Back up a second - what is the meaning of "leech" in this vernacular? Why
>> would I want to find them? (I'm not being argumentative, I'm truly
>> curious.)
>>
>
> ... cards that for whatever reason have
> proven really difficult to remember, and for that reason keep coming
> up and 'leeching' your review time. They are time-wasters,
> discouraging, ineffective, and indicative of being either irrelevant
> or badly formulated. Any of those is enough to want to know about them
> for analysis & action.

Interesting. Perhaps it's just a matter of perception, but I thought
the whole point of these algorithms is more or less FOR the "leeches".
If it's hard and tends to monopolize your review time, they should.
That's the idea, right? Until they stop being hard? And if they
practically never do, then perhaps they shouldn't and the spacing
algorithm is doing its job.

I get the points about a card being badly formulated or discouraging,
I guess, and maybe I just don't have enough cards for it to be a
big deal, but I relish the leeches; those are specifically the ones I
want to focus on. I guess the whole thing just sounds like a "math is
hard, let's go shopping" simile.

<shrug> To each their own.

Damien Elmes

Jan 13, 2010, 9:38:31 AM
to mnemosyne-proj-users
> Interesting.  Perhaps it's just a matter of perception, but I thought the
> whole point of these algorithms is more or less FOR the "leeches".  If it's
> hard and tends to monopolize your review time, they should.  That's the
> idea, right?  Until they stop being hard?

They don't stop being hard. That's the point. Until you take the time
to refactor the question, understand it better, or do something else,
they will continue to take up a disproportionate amount of your time.

Gwern Branwen

Jan 13, 2010, 11:06:13 AM
to mnemosyne-...@googlegroups.com

Also, I wonder if the -v option is in the right place. Moving it to
after the 'sel' quieted errors, but it also produces nothing if I
replace 'lps' with 'ac_rp', which makes me wonder whether the entire
thing is broken.

That said, would it be possible to extract cards ranked 2, and sort by
either difficulty or age?

(Which would reveal leeches better? Probably difficulty; gr2s seem to
hover pretty consistently in the ~1-2 interval, and 4 and 5s are all
higher.)

--
gwern

Roger

Jan 13, 2010, 11:07:39 AM
to mnemosyne-proj-users
> Your formatting seems to be off. I had to remove the space between
> "D:N:-" and ""@lps"" to get it to work - the error messages were
> incomprehensible, but I think that was the solution:
>
>      xmlstarlet sel -T -t -m /mnemosyne/item -s D:N:-"@lps" -v "concat
> (@lps,'|',Q,'|',A)" -n default.xml

Your version of the command doesn't work on my machine, but it's good
that you found a solution that works for you, and maybe for others with
the same issue.

> (Alas for me, I've never got the hang of the 0/1 rankings and so all
> my leeches are marked '2', so this script doesn't help me.)

In this case your easiness factor should be a good indicator of the
leech items. You can change the two occurrences of @lps in the command
to @e, and sort *A*scending by changing -s D:N:- to -s A:N:-.
The command should end up looking like this:
xmlstarlet sel -T -t -m /mnemosyne/item -s A:N:- "@e" -v "concat(@e,'|',Q,'|',A)" -n default.xml >> easyness_ranking.txt

In fact, you can sort by any attribute of the item element. A sample
item element looks like this:
<item id="9ded5f9b.inv" u="0" gr="4" e="2.462" ac_rp="5" rt_rp="1"
lps="1" ac_rp_l="2" rt_rp_l="0" l_rp="347" n_rp="349">
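
The idea of sorting by any attribute can also be sketched in Python, as a rough alternative for anyone who prefers it to the command line. The sketch below is an illustration under assumptions: the helper name rank_items is hypothetical, the second item in the sample is invented, and the attribute values are assumed to be numeric, as they are in the sample element above.

```python
import xml.etree.ElementTree as ET


def rank_items(root, attr, descending=False):
    """Return (value, question, answer) tuples sorted numerically by `attr`."""
    rows = [
        (float(it.get(attr)), it.findtext("Q", ""), it.findtext("A", ""))
        for it in root.findall("item")
        if it.get(attr) is not None  # skip items that lack the attribute
    ]
    rows.sort(key=lambda row: row[0], reverse=descending)
    return rows


# First item copies the attributes of the sample element; the second is made up
SAMPLE_ATTRS = """<mnemosyne>
<item id="9ded5f9b.inv" u="0" gr="4" e="2.462" ac_rp="5" rt_rp="1"
lps="1" ac_rp_l="2" rt_rp_l="0" l_rp="347" n_rp="349"><Q>q1</Q><A>a1</A></item>
<item id="made-up" gr="2" e="1.300" ac_rp="9" lps="0"><Q>q2</Q><A>a2</A></item>
</mnemosyne>"""

attrs_root = ET.fromstring(SAMPLE_ATTRS)
easiness_rows = rank_items(attrs_root, "e")                      # lowest easiness first
repetition_rows = rank_items(attrs_root, "ac_rp", descending=True)
```

Sorting ascending by e puts the hardest cards first, while lps or ac_rp are more naturally sorted descending.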

Roger

Jan 13, 2010, 11:26:02 AM
to mnemosyne-proj-users
> Also, I wonder if the -v option is in the right place. Moving it to
> after the 'sel' quieted errors, but it also produces nothing if I
> replace 'lps' with 'ac_rp', which makes me wonder whether the entire
> thing is broken.

I got the syntax from the xmlstarlet documentation:
http://xmlstar.sourceforge.net/doc/UG/
The specific example, which is quite similar to my command, can be
found on this page: http://xmlstar.sourceforge.net/doc/UG/ch04s01.html
(search for the example named "Query XML document and produce
sorted text table").

>
> That said, would it be possible to extract cards ranked 2, and sort by
> either difficulty or age?

This is possible for me. But note that I do NOT remove the space
between "D:N:-" and "@lps" as you suggested! If I do, the command
fails with an error.
BTW, I use xmlstarlet version 1.0.1-2ubuntu1.

>
> (Which would reveal leeches better? Probably difficulty; gr2s seem to
> hover pretty consistently in the ~1-2 interval, and 4 and 5s are all
> higher.)

I agree with you that difficulty might be the better indicator.
However, in the end everyone has to decide for themselves how to use
the grades and what indicates their leeches best.
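
The question above (extract cards graded 2, then sort by difficulty) can be sketched in Python as well. This is only an illustration, assuming that the gr attribute holds the current grade and e the easiness, as in the sample item element posted earlier. The function name and the sample cards are made up.

```python
import xml.etree.ElementTree as ET


def graded_by_easiness(root, grade="2"):
    """Return 'easiness|question|answer' lines for cards with the given grade,
    lowest easiness (hardest cards) first."""
    # ElementTree's limited XPath supports the [@attrib='value'] predicate
    hits = root.findall(f"item[@gr='{grade}']")
    hits.sort(key=lambda it: float(it.get("e", "0")))
    return [
        f"{it.get('e')}|{it.findtext('Q', '')}|{it.findtext('A', '')}"
        for it in hits
    ]


# Made-up cards for illustration only
SAMPLE_GR = """<mnemosyne>
<item id="a" gr="2" e="1.9"><Q>q-a</Q><A>a-a</A></item>
<item id="b" gr="4" e="2.462"><Q>q-b</Q><A>a-b</A></item>
<item id="c" gr="2" e="1.3"><Q>q-c</Q><A>a-c</A></item>
</mnemosyne>"""

grade_two_lines = graded_by_easiness(ET.fromstring(SAMPLE_GR))
print("\n".join(grade_two_lines))
```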

Gwern Branwen

Jan 13, 2010, 12:52:25 PM
to mnemosyne-...@googlegroups.com

After a fair bit of tinkering, and examining the example in the docs,
"xml sel -T -t -m /xml/table/rec -s D:N:- "@id" -v
"concat(@id,'|',numField,'|',stringField)" -n xml/table.xml", I
finally came up with something that works:

xmlstarlet sel -T -t -m /mnemosyne/item -s A:N:- '@e' -v "concat ('Easiness=',@e,'||',Q,'|',A)" -n default.xml

I was a little shocked that 700 of my cards were at 1.300.
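
A quick way to see how many cards sit at each easiness value is to count them rather than read through the sorted list. The following is a rough sketch using Python's standard library, assuming the same <item e="..."> layout as in the sample element posted earlier. The function name and sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def easiness_counts(root):
    """Count how many cards share each easiness value (kept as a string)."""
    return Counter(
        it.get("e") for it in root.findall("item") if it.get("e") is not None
    )


# Made-up cards for illustration only
SAMPLE_E = """<mnemosyne>
<item id="a" e="1.300"><Q>q-a</Q><A>a-a</A></item>
<item id="b" e="2.462"><Q>q-b</Q><A>a-b</A></item>
<item id="c" e="1.300"><Q>q-c</Q><A>a-c</A></item>
</mnemosyne>"""

e_counts = easiness_counts(ET.fromstring(SAMPLE_E))
for value, count in e_counts.most_common():
    print(value, count)
```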

--
gwern

Oisín

Jan 13, 2010, 1:38:07 PM
to mnemosyne-...@googlegroups.com


2010/1/13 Michael Campbell <michael....@unixgeek.com>

> Interesting.  Perhaps it's just a matter of perception, but I thought the whole point of these algorithms is more or less FOR the "leeches".  If it's hard and tends to monopolize your review time, they should.  That's the idea, right?  Until they stop being hard?  And if they practically never do, then perhaps they shouldn't and the spacing algorithm is doing its job.

No, a leech means a piece of information that is formulated in an ineffective way (generally a single card that could and should be replaced by a set of much simpler cards that collectively carry the same information).

For example, a beginning user might make a single card which tests the user on the full conjugation of a verb in a particular tense (or even multiple tenses). Such a card would include far too much information in one place, would make it too easy to fail (since passing the card requires getting every piece right), would discourage the learner and would waste a lot of time needlessly. Every time you fail the card due to one single bit being wrong, you have to repeat the card again many times, including all the other unrelated bits that were correct. This is a gigantic waste of time and effort.
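
To put rough numbers on this: if a card bundles several pieces and each piece is recalled independently with the same probability, the chance of passing the whole card is that probability raised to the number of pieces. The independence assumption and the 90% figure below are purely illustrative, not measurements.

```python
def pass_probability(p_piece, n_pieces):
    """Chance of recalling every piece, assuming independent, equally likely recall."""
    return p_piece ** n_pieces


single_piece = pass_probability(0.9, 1)  # a simple one-fact card
six_pieces = pass_probability(0.9, 6)    # e.g. a full six-form conjugation on one card
print(f"{single_piece:.2f} vs {six_pieces:.2f}")
```

Under these assumptions the bundled card is passed only about half the time, even though each individual piece is known nine times out of ten.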

Learning leech cards is like trying to perform delicate surgery with a sledgehammer, bashing and bashing wastefully.

It has nothing to do with the difficulty of the source material; it simply comes down to poor card-design choices. Making good, efficient flashcards is a craft.
It's really worth reading the "Twenty rules of formulating knowledge" article on the Supermemo site - http://www.supermemo.com/articles/20rules.htm - it explains things better.
 
> I get the points about a card being badly formulated or discouraging, I guess, and maybe I just don't have enough cards for it to be a big deal, but I relish the leeches; those are specifically the ones I want to focus on.  I guess the whole thing just sounds like a "math is hard, let's go shopping" simile.

Nope, it's more like a "learning a language painfully in 10 years is hard, let's learn a language easily in 2 years" simile!
Any automated tool that can find possible inefficiencies in your deck is worth using. I once had a sentence card in my Chinese deck that was 3 or 4 lines long and felt like a horrible chore to read and translate, since I always got one little part wrong and had to repeat the entire thing over and over. When I realised that I had spent a total of over 30 minutes looking at that single card, which wasn't worth that amount of time compared to simpler, more granular cards, I deleted it.

> <shrug>  To each their own.

I think pretty much everyone prefers faster learning over slow learning :D

Oisín

Michael Campbell

Jan 13, 2010, 3:42:07 PM
to mnemosyne-...@googlegroups.com

Like I said, maybe I don't have enough cards, or mine are already
"factored down" enough that it hasn't affected me yet.
