scheduler feedback needed

Peter Bienstman

Jun 28, 2009, 7:11:04 AM

to mnemosyne-...@googlegroups.com

Hi,

Right now the long term scheduler avoids scheduling related cards (e.g. vice
versa) on the same day.

The near term scheduler (i.e. when you just use grades 0 and 1) works
differently: as soon as you have memorised the first card of a vice versa
pair, the next card of the pair is scheduled right away.

Are you happy with this behaviour, or would you prefer that the near term
scheduler also tries to postpone seeing the second card to the next day?

Peter

querido

Jun 28, 2009, 9:03:30 AM

to mnemosyne-proj-users

The current behavior is good.

Ben

Jun 28, 2009, 9:33:15 AM

to mnemosyne-...@googlegroups.com

I like the behavior currently, but I don't use the near-term scheduler much so I wouldn't care either way in that case. But I like the way it splits the vice-versa cards in the long-term case.

--
Ben

Laurent

Jun 28, 2009, 11:10:37 AM

to mnemosyne-proj-users

Hello,

As mentioned in another message some time ago, I do not see the point
in showing the opposite card right away, as it does not help me
memorize it any better (seeing the answer to the first card of the
pair already helped me remember it equally both ways), and I find it
confusing as to how I should grade the 2nd card (even though I know
the answer, because I just saw it from the 1st card, I find myself
answering 0 or 1 again because I only know it from seeing it just
before, not from memory).

I hope my explanation makes sense...

David A. Harding

Jun 28, 2009, 1:05:10 PM

to mnemosyne-...@googlegroups.com

On Sun, Jun 28, 2009 at 01:11:04PM +0200, Peter Bienstman wrote:
> as soon as you memorise the first card of a vice versa pair, the next
> card of the pair is scheduled right away.

When I memorize a card for the first time, can you send all the other
closely related unmemorized cards to the end of the learning card pool
*for this session only*? For example, say I have four cards entered in
the following order:

1. vice
2. versa
3. three
4. four

When I learn "vice", Mnemosyne re-orders the remaining queue as follows:

1. three
2. four
3. versa

But if I close Mnemosyne, ending the current session, the next time I
start Mnemosyne it reverts to showing unmemorized cards in the order of
addition:

1. versa
2. three
3. four
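
A minimal sketch of the session-only reordering in the example above. The function name `defer_related` and the `related` mapping are hypothetical and are not taken from Mnemosyne's code; the sketch only assumes that the queue starts in order of addition and that the deferral is forgotten when the session ends.

```python
def defer_related(queue, learned, related):
    """Drop `learned` from the queue and move its related cards to the end.

    `queue` is a list of card names in presentation order; `related` maps a
    card to the set of cards closely related to it (hypothetical structure).
    """
    siblings = related.get(learned, set())
    remaining = [card for card in queue if card != learned]
    front = [card for card in remaining if card not in siblings]
    back = [card for card in remaining if card in siblings]
    return front + back


added_order = ["vice", "versa", "three", "four"]
related = {"vice": {"versa"}, "versa": {"vice"}}

# Within the session, learning "vice" pushes "versa" to the back.
session_queue = defer_related(added_order, "vice", related)
print(session_queue)  # ['three', 'four', 'versa']

# A new session rebuilds the queue from the order of addition.
next_session_queue = [card for card in added_order if card != "vice"]
print(next_session_queue)  # ['versa', 'three', 'four']
```

Because the reordered list is never written back, nothing stays hidden across sessions.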

I also want to add that I dislike the way default Anki hides unmemorized
cards after you learn 20 cards in one day. I prefer that Mnemosyne never
hide any cards from me unless I ask it to hide them.

Thanks,

-Dave
--
David A. Harding Website: http://dtrt.org/
1 (609) 997-0765 Email: da...@dtrt.org
Jabber/XMPP: dhar...@jabber.org

Damien Elmes

Jun 29, 2009, 1:12:43 AM

to mnemosyne-...@googlegroups.com

> I also want to add that I dislike the way default Anki hides unmemorized
> cards after you learn 20 cards in one day. I prefer that Mnemosyne never
> hide any cards from me unless I ask it to hide them.

It stops after 20 cards by default because many people new to SRSes
are excited about their newfound study tool, learn many new cards in
the first day or two, and then get bogged down with a mountain of
reviews and give up. You have the option of increasing the daily limit
from the first screen you're presented with, and you also have the
option of clicking 'learn more' if you decide you want to keep
studying.

David A. Harding

Jun 29, 2009, 2:36:13 AM

to mnemosyne-...@googlegroups.com

On Mon, Jun 29, 2009 at 02:12:43PM +0900, Damien Elmes wrote:
> many people new to SRSes [...] get bogged down with a mountain of
> reviews and give up.

I agree that many new SRS users give up and that a mountain of reviews
dispirits even long-time users, but do you have any proof that many new
users give up *because* they had a mountain of reviews?

Damien Elmes

Jun 29, 2009, 3:03:09 AM

to mnemosyne-...@googlegroups.com

My previous statement is based on anecdotal evidence acquired over a
few years of supporting Anki users and my students. Take it as you
will.

As for how useful the default is, I gathered some statistics from a
small portion of the users of AnkiOnline.

count | maxNew
   18 | 0
    1 | 1
    1 | 2
    6 | 3
    2 | 4
   17 | 5
    2 | 6
    1 | 7
    3 | 8
   38 | 10
    1 | 12
    2 | 14
    2 | 15
 2320 | 20
    2 | 22
   17 | 25
    1 | 28
   43 | 30
    1 | 32
    1 | 33
    6 | 35
    1 | 36
   39 | 40
    1 | 43
    2 | 45
   38 | 50
    1 | 52
    1 | 55
   10 | 60
    1 | 62
    2 | 70
    1 | 75
    1 | 80
    1 | 90
    1 | 99
   51 | 100
    1 | 108
    1 | 110
    1 | 115
    2 | 116
    1 | 130
    5 | 150
    1 | 156
    1 | 160
   20 | 200
    2 | 214
    2 | 250
    7 | 300
    2 | 400
   20 | 500
    2 | 600
    2 | 700
    1 | 800
    1 | 842
    3 | 999
   12 | 1000
    1 | 1648
    4 | 2000
    2 | 3000
    1 | 4000
    2 | 5000
    2 | 9999
    4 | 10000
    2 | 20000
    6 | 99999
    1 | 999999
    1 | 9999999
    1 | 999999999
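
For readers who want to summarise the table, here is a small sketch that totals the counts and splits them around the default. The `rows` list is copied from the table above; treating 20 as the default follows the "20 cards by default" remark earlier in the thread, and the variable names are illustrative only.

```python
# (count, maxNew) pairs copied from the table above.
rows = [
    (18, 0), (1, 1), (1, 2), (6, 3), (2, 4), (17, 5), (2, 6), (1, 7),
    (3, 8), (38, 10), (1, 12), (2, 14), (2, 15), (2320, 20), (2, 22),
    (17, 25), (1, 28), (43, 30), (1, 32), (1, 33), (6, 35), (1, 36),
    (39, 40), (1, 43), (2, 45), (38, 50), (1, 52), (1, 55), (10, 60),
    (1, 62), (2, 70), (1, 75), (1, 80), (1, 90), (1, 99), (51, 100),
    (1, 108), (1, 110), (1, 115), (2, 116), (1, 130), (5, 150), (1, 156),
    (1, 160), (20, 200), (2, 214), (2, 250), (7, 300), (2, 400), (20, 500),
    (2, 600), (2, 700), (1, 800), (1, 842), (3, 999), (12, 1000),
    (1, 1648), (4, 2000), (2, 3000), (1, 4000), (2, 5000), (2, 9999),
    (4, 10000), (2, 20000), (6, 99999), (1, 999999), (1, 9999999),
    (1, 999999999),
]

DEFAULT_MAX_NEW = 20  # the default daily limit mentioned earlier in the thread

total = sum(count for count, _ in rows)
at_default = sum(count for count, max_new in rows if max_new == DEFAULT_MAX_NEW)
below = sum(count for count, max_new in rows if max_new < DEFAULT_MAX_NEW)
above = sum(count for count, max_new in rows if max_new > DEFAULT_MAX_NEW)

print(total, at_default, below, above)  # 2750 2320 94 336
print(f"{at_default / total:.0%} of sampled accounts keep the default")  # 84% ...
```

The sample cannot say whether those accounts kept 20 deliberately or simply never opened the setting.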

Peter Bienstman

Jun 29, 2009, 3:29:32 AM

to mnemosyne-...@googlegroups.com

On Sunday 28 June 2009 07:05:10 pm David A. Harding wrote:
> On Sun, Jun 28, 2009 at 01:11:04PM +0200, Peter Bienstman wrote:
> > as soon as you memorise the first card of a vice versa pair, the next
> > card of the pair is scheduled right away.
>
> When I memorize a card for the first time, can you send all the other
> closely related unmemorized cards to the end of the learning card pool
> *for this session only*? For example, say I have four cards entered in
> the following order:
>
> 1. vice
> 2. versa
> 3. three
> 4. four
>
> When I learn "vice", Mnemosyne re-orders the remaining queue as follows:
>
> 1. three
> 2. four
> 3. versa
>
> But if I close Mnemosyne, ending the current session, the next time I
> start Mnemosyne it reverts to showing unmemorized cards in the order of
> addition:
>
> 1. versa
> 2. three
> 3. four

OK, and how exactly do you want to change that behaviour? It looks to me like
what you are describing is exactly "send all the other closely related
unmemorized cards to the end of the learning card pool *for this session
only*".

Would you be happy if 'versa' was only shown for the first time the next day?

Also note that as soon as you start giving a grade to a card, it will become
randomly intermixed with the other cards.

Peter

David A. Harding

Jun 29, 2009, 4:28:50 AM

to mnemosyne-...@googlegroups.com

On Mon, Jun 29, 2009 at 04:03:09PM +0900, Damien Elmes wrote:
> [In a sample of 2750 users, 2320 (84%) have the default setting.]

Do you include new and abandoned accounts in your statistics?

Thanks,

David A. Harding

Jun 29, 2009, 4:49:02 AM

to mnemosyne-...@googlegroups.com

On Mon, Jun 29, 2009 at 09:29:32AM +0200, Peter Bienstman wrote:
| On Sunday 28 June 2009 07:05:10 pm David A. Harding wrote:
| | When I memorize a card for the first time, can you send all the other
| | closely related unmemorized cards to the end of the learning card pool
| | *for this session only*? For example, say I have four cards entered in
| | the following order:
|
| It looks to me like what you are describing is exactly "send all the
| other closely related unmemorized cards to the end of the learning
| card pool *for this session only*".

As indicated in the quote above, the example was an example of the
desired behavior. I'm glad it looked exactly like my one-sentence
description. :-)

> Would you be happy if 'versa' was only shown for the first time the
> next day?

I don't want Mnemosyne to hide any cards from me unless I tell it to
hide them. If I run out of other unmemorized cards, I want to see
"versa". If you do make Mnemosyne hide some cards by default, I'd
appreciate if you add an option to show them.

Peter Bienstman

Jun 29, 2009, 4:58:36 AM

to mnemosyne-...@googlegroups.com

On Monday 29 June 2009 10:49:02 am David A. Harding wrote:
> On Mon, Jun 29, 2009 at 09:29:32AM +0200, Peter Bienstman wrote:
> | On Sunday 28 June 2009 07:05:10 pm David A. Harding wrote:
> | | When I memorize a card for the first time, can you send all the other
> | | closely related unmemorized cards to the end of the learning card pool
> | | *for this session only*? For example, say I have four cards entered in
> | | the following order:
> |
> | It looks to me like what you are describing is exactly "send all the
> | other closely related unmemorized cards to the end of the learning
> | card pool *for this session only*".
>
> As indicated in the quote above, the example was an example of the
> desired behavior. I'm glad it looked exactly like my one-sentence
> description. :-)

Strangely enough, it is also the current Mnemosyne behaviour :-) But as I
mentioned before, you will never be able to observe this cleanly, as grade 0
and 1 cards are always randomised.

> > Would you be happy if 'versa' was only shown for the first time the
> > next day?
>
> I don't want Mnemosyne to hide any cards from me unless I tell it to
> hide them. If I run out of other unmemorized cards, I want to see
> "versa".

That would indeed be the plan: if the only cards that are unmemorised are
actually the related cards that were postponed to tomorrow, show them now.
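
A minimal sketch of that fallback, assuming a hypothetical `postponed` set that holds the related cards held back until tomorrow. The names are illustrative only; this is not Mnemosyne's scheduler code.

```python
def next_unmemorised(unmemorised, postponed):
    """Pick the next unmemorised card, preferring cards that are not postponed.

    Falls back to a postponed card when nothing else is left, so no card is
    ever hidden from the user. Returns None when there is nothing to learn.
    """
    for card in unmemorised:
        if card not in postponed:
            return card
    return unmemorised[0] if unmemorised else None


print(next_unmemorised(["versa", "three", "four"], {"versa"}))  # three
print(next_unmemorised(["versa"], {"versa"}))                   # versa
```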

Peter

Peter Bienstman

Jun 29, 2009, 5:02:57 AM

to mnemosyne-...@googlegroups.com

On Monday 29 June 2009 10:28:50 am David A. Harding wrote:
> On Mon, Jun 29, 2009 at 04:03:09PM +0900, Damien Elmes wrote:
> > [In a sample of 2750 users, 2320 (84%) have the default setting.]
>
> Do you include new and abandoned accounts in your statistics?

Also, many people don't bother to change the default. Which brings me to my
next question: which values of 'grade 0 cards to hold in your hand' do people
use?

Peter

David A. Harding

Jun 29, 2009, 5:05:50 AM

to mnemosyne-...@googlegroups.com

On Mon, Jun 29, 2009 at 10:58:36AM +0200, Peter Bienstman wrote:
> Strangely enough, it is also the current Mnemosyne behaviour :-)

Whoops! I knew it sounded like a good idea. :)

> That would indeed be the plan: if the only cards that are unmemorised
> are actually the related cards that were postponed to tomorrow, show
> them now.

Sounds awesome. That has my vote.

Thanks, Peter.

David A. Harding

Jun 29, 2009, 5:12:25 AM

to mnemosyne-...@googlegroups.com

On Mon, Jun 29, 2009 at 11:02:57AM +0200, Peter Bienstman wrote:
> which values of 'grade 0 cards to hold in your hand' do people use?

I sometimes adjust it to suit the subject, but I use 15 the most often.

Damien Elmes

Jun 29, 2009, 5:51:04 AM

to mnemosyne-...@googlegroups.com

I prune old accounts after about 6 months of inactivity, which puts an
upper limit on the number of abandoned accounts. I made no attempt at
accounting for fresh accounts, as it would take more time.

Meishu

Jun 29, 2009, 9:43:15 AM

to mnemosyne-proj-users

Same as David.

On Jun 29, 5:12 pm, "David A. Harding" <d...@dtrt.org> wrote:
> On Mon, Jun 29, 2009 at 11:02:57AM +0200, Peter Bienstman wrote:
> > which values of 'grade 0 cards to hold in your hand' do people use?
>
> I sometimes adjust it to suit the subject, but I use 15 the most often.
>
> -Dave
> --
> David A. Harding Website: http://dtrt.org/

> 1 (609) 997-0765 Email: d...@dtrt.org
> Jabber/XMPP: dhard...@jabber.org

Francisco Fiuza Jr

Jun 29, 2009, 4:26:40 PM

to mnemosyne-...@googlegroups.com

I use 5.

Oisin Mac Fhearai

Jun 29, 2009, 8:08:54 PM

to mnemosyne-...@googlegroups.com, mnemosyne-...@googlegroups.com

On 29 Jun 2009, at 07:36, "David A. Harding" <da...@dtrt.org> wrote:

>
> On Mon, Jun 29, 2009 at 02:12:43PM +0900, Damien Elmes wrote:
>> many people new to SRSes [...] get bogged down with a mountain of
>> reviews and give up.
>
> I agree that many new SRS users give up and that a mountain of reviews
> dispirits even long-time users, but do you have any proof that many new
> users give up *because* they had a mountain of reviews?

I can't speak for others, but I fell behind over a couple of weeks,
only doing 50-100 of the ~200 reviews scheduled each day. When it got
to about 1500 due cards, I finally lost any remaining motivation and
stopped studying. That was about 4 months ago, and I've been using
srses for over 4 years.
So I would say that there is a strong chance that others, especially
newbies, have fallen into the same trap.

Even at 12 new cards per day, the reviews creep up to well over 100 a
day within a few months, and if answering a card takes 20 seconds (eg
scribbling Chinese chars on a tablet), you're talking 30-60 min
sessions daily. If you take a week off, that's four hours of backlog.
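
The arithmetic behind those figures can be checked with a short sketch. The function names are illustrative, and the backlog estimate makes the simplifying assumption that a missed day just adds one normal day of reviews.

```python
def daily_minutes(reviews_per_day, seconds_per_card):
    """Minutes of review per day at a fixed answering speed."""
    return reviews_per_day * seconds_per_card / 60


def backlog_hours(days_missed, reviews_per_day, seconds_per_card):
    """Hours needed to clear the reviews that piled up over the missed days."""
    return days_missed * daily_minutes(reviews_per_day, seconds_per_card) / 60


print(round(daily_minutes(100, 20)))        # 33 minutes a day at 100 reviews
print(round(backlog_hours(7, 100, 20), 1))  # 3.9 hours after a week off
```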

I'd recommend a clear limit for new cards daily as Anki does, tweaked
by the user conservatively, so they don't get proportionally bogged
down 6 months later and quit. When I return, I'll probably only take
on 5 or 6 new cards a day to avoid burnout again.

Peter Bienstman

Jun 30, 2009, 3:38:34 AM

to mnemosyne-...@googlegroups.com

On Tuesday 30 June 2009 02:08:54 am Oisin Mac Fhearai wrote:

> I can't speak for others, but I fell behind over a couple of weeks,
> only doing 50-100 of the ~200 reviews scheduled each day. When it got
> to about 1500 due cards, I finally lost any remaining motivation and
> stopped studying. That was about 4 months ago, and I've been using
> srses for over 4 years.
> So I would say that there is a strong chance that others, especially
> newbies, have fallen into the same trap.
>
> Even at 12 new cards per day, the reviews creep up to well over 100 a
> day within a few months, and if answering a card takes 20 seconds (eg
> scribbling Chinese chars on a tablet), you're talking 30-60 min
> sessions daily. If you take a week off, that's four hours of backlog.
>
> I'd recommend a clear limit for new cards daily as Anki does, tweaked
> by the user conservatively, so they don't get proportionally bogged
> down 6 months later and quit. When I return, I'll probably only take
> on 5 or 6 new cards a day to avoid burnout again.

Interesting observation, thanks!

It's easy to add a warning + explanation when you reach e.g. 10 newly learned
cards. I'd prefer this gentler approach as opposed to a hard limit where
you forbid people to go on.

On a more personal note, I've been using (the predecessor of) Mnemosyne since
2003, learning roughly 5 cards a day. I'm now at 175 scheduled cards a day on
average, and 8500 cards in my database. There is still plenty of stuff I want
to learn, and if I keep up what I think is this steady, gentle pace, I could
be at 350 reps daily in 5 more years...

This very long term aspect is definitely something that needs thinking about,
either by making a more thorough analysis of the logs and tweaking the
algorithm, or by pruning the cards in my database.

Peter

Ben

Jun 30, 2009, 8:34:47 AM

to mnemosyne-...@googlegroups.com

Thanks for sharing---to me those statistics are very interesting and
show some possible limits on what memory can achieve in the long term.
But having to review 175 cards a day seems rough to me. I guess it's
all about how difficult the things are that you memorize.

Here's the way I was looking at it, as summarized by Sherlock Holmes
in "A Study in Scarlet":

"You see," he explained, "I consider that a man's brain originally
is like a little empty attic, and you have to stock it with such
furniture as you choose. A fool takes in all the lumber of every
sort that he comes across, so that the knowledge which might be
useful to him gets crowded out, or at best is jumbled up with a lot
of other things so that he has a difficulty in laying his hands
upon it. Now the skilful workman is very careful indeed as to what
he takes into his brain-attic. He will have nothing but the tools
which may help him in doing his work, but of these he has a large
assortment, and all in the most perfect order. It is a mistake to
think that that little room has elastic walls and can distend to
any extent. Depend upon it there comes a time when for every
addition of knowledge you forget something that you knew before. It
is of the highest importance, therefore, not to have useless facts
elbowing out the useful ones."

At first I thought Holmes was being too pessimistic, but there is a
sense in which he's right---you can only remember so many "things",
even with spaced repetition. After that you spend too much time
reviewing cards every day.

Now, if I only wanted to review, say, 50 cards every day, then I was
hoping that I could still have 10K cards in my deck. This corresponds
to a 0.5% (=50/10K) "review rate". But Peter has a review rate of
175/8500 = 2%, which seems pretty high to me. In order to determine
how big my practical mental attic is, it seems useful to know whether
achievable review rates are more like 2% or 0.2%.
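
The "review rate" used here is simply daily reviews divided by deck size. The sketch below reproduces the two figures and, as an illustration only, turns the question around to ask how large a deck a 50-cards-a-day budget would support at each candidate rate. The function names are hypothetical.

```python
def review_rate(reviews_per_day, deck_size):
    """Fraction of the deck that comes up for review on an average day."""
    return reviews_per_day / deck_size


def sustainable_deck_size(reviews_per_day, rate):
    """Deck size a daily review budget can support at a given review rate."""
    return reviews_per_day / rate


print(f"{review_rate(50, 10_000):.1%}")  # 0.5%
print(f"{review_rate(175, 8_500):.1%}")  # 2.1%

for rate in (0.02, 0.002):
    print(rate, round(sustainable_deck_size(50, rate)))  # 2500, then 25000
```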

I realize it's probably all about how hard the things are (e.g. most
people have a vocabulary of 10000+ words, and they manage to remember
them without any software at all) but perhaps in practice users'
long-term review rates cluster in a narrow band.

--
Ben

----------------- Original message -----------------
From: Peter Bienstman <Peter.B...@ugent.be>
To: mnemosyne-...@googlegroups.com

Patrick Kenny

Jun 30, 2009, 9:19:20 AM

to mnemosyne-...@googlegroups.com

It certainly depends on how you design the cards. I would be very much
against a hard limit, as for several months I was adding 50-100 cards a
day. I have been using Mnemosyne for about eighteen months and have
over 27,000 cards in my database now, but I am averaging only
200-250 reviews per day (assuming I don't add any new cards).

Perhaps the reason why I have so few reviews (given the size of my
collection) is that nearly all of my cards are very easy and only take a
few seconds to review each, and thus I miss very few. In any case, I
would be hesitant to change the standard algorithm; if something were to
be changed, perhaps a way to more naturally postpone and spread out a
pile of overdue cards would be best.

Cheers,
Patrick

Peter Bienstman

Jun 30, 2009, 9:36:24 AM

to mnemosyne-...@googlegroups.com

On Tuesday 30 June 2009 02:34:47 pm Ben wrote:
> Thanks for sharing---to me those statistics are very interesting and
> show some possible limits on what memory can achieve in the long term.
> But having to review 175 cards a day seems rough to me. I guess it's
> all about how difficult the things are that you memorize.
>
> Here's the way I was looking at it, as summarized by Sherlock Holmes
> in "A Study in Scarlet":

Nice quote, thanks!

> I realize it's probably all about how hard the things are (e.g. most
> people have a vocabulary of 10000+ words, and they manage to remember
> them without any software at all) but perhaps in practice users'
> long-term review rates cluster in a narrow band.

The answers to these questions lie locked in the data we've collected. Once
2.0 is out the door, I hope to have more time to look at those then.

Cheers,

Peter

--
------------------------------------------------
Peter Bienstman
Ghent University, Dept. of Information Technology
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium
tel: +32 9 264 34 46, fax: +32 9 264 35 93
WWW: http://photonics.intec.UGent.be
email: Peter.B...@UGent.be
------------------------------------------------

Peter Bienstman

Jun 30, 2009, 9:41:03 AM

to mnemosyne-...@googlegroups.com

On Tuesday 30 June 2009 03:19:20 pm Patrick Kenny wrote:

> Perhaps the reason why I have so few reviews (given the size of my
> collection) is that nearly all of my cards are very easy and only take a
> few seconds to review each, and thus I miss very few. In any case, I
> would be hesitant to change the standard algorithm;

That is definitely not the idea here.

> if something were to
> be changed, perhaps a way to more naturally postpone and spread out a
> pile of overdue cards would be best.

That is in fact already there: Mnemosyne presents you with the most urgent
cards first and takes the lateness of the review into account when updating
the intervals. The only thing which is not there is 'cheating' with the
scheduler counter to limit it to a certain number of cards. But this is a purely
cosmetic aspect and does not require a change to the scheduler itself.

Cheers,

Peter

Gwern Branwen

Jun 30, 2009, 10:18:40 AM

to mnemosyne-...@googlegroups.com

On Tue, Jun 30, 2009 at 8:34 AM, Ben<mi...@emerose.org> wrote:
>
> Now, if I only wanted to review, say, 50 cards every day, then I was
> hoping that I could still have 10K cards in my deck. This corresponds
> to a 0.5% (=50/10K) "review rate". But Peter has a review rate of
> 175/8500 = 2%, which seems pretty high to me. In order to determine
> how big my practical mental attic is, it seems useful to know whether
> achievable review rates are more like 2% or 0.2%.

FWIW, I'm at 1.8% myself. But I'm a little unclear here; I thought the
idea of spaced repetition was that the review rate would decrease over
time. So wouldn't the question really be 'how long would it take to
hit 0.2%?' and not 'whether'?

--
gwern

Peter Bienstman

Jun 30, 2009, 10:47:52 AM

to mnemosyne-...@googlegroups.com

It will hit any low number, *provided* you don't add new cards in the
meantime...

Peter

Francisco Fiuza Jr

Jun 30, 2009, 4:39:39 PM

to mnemosyne-...@googlegroups.com

I have about 3k cards, and the review rate is about 50 cards a day.
I really sucks when you have a week of backlog.

Francisco Fiuza Jr

Jun 30, 2009, 4:39:53 PM

to mnemosyne-...@googlegroups.com

*it really

Francisco Fiuza Jr

Jun 30, 2009, 5:58:47 PM

to mnemosyne-...@googlegroups.com

What do you guys think about limiting the number of cards to review each day?
For example, if I set this value to 80 cards, it will show me only those cards even though there should be around 200 for this day.

If you feel you can see more cards, pressing the button 'view more late cards' would get the next 80 cards.
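
A minimal sketch of the proposal, assuming each due card carries a hypothetical `days_late` field and that "most urgent" means most overdue. Each press of 'view more late cards' would then show the next batch. None of these names come from Mnemosyne itself.

```python
def review_batches(due_cards, cap):
    """Split the due cards into batches of at most `cap`, most overdue first."""
    ordered = sorted(due_cards, key=lambda card: card["days_late"], reverse=True)
    return [ordered[i:i + cap] for i in range(0, len(ordered), cap)]


# 200 due cards with made-up lateness values between 0 and 29 days.
due_cards = [{"id": n, "days_late": n % 30} for n in range(200)]

batches = review_batches(due_cards, cap=80)
print([len(batch) for batch in batches])  # [80, 80, 40]
```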

It would be very useful for me. Earlier this year when I went on vacation, I had about 1400 scheduled cards. It took me like 3 weeks to review them, and at the end I had about 350 grade 1 cards. After a session of 80 cards, I would rather go through those that I forgot and gave a grade 1 than review all the cards and watch the pile of grade 1 cards increase dramatically.

This is probably not the way it was designed, but I think it will have a good impact on motivation. It's like: "Oh well, I have 1500 cards to review, but I won't need to sit for over 3 weeks to keep up with my studies."

Regards,

Frank

David A. Harding

Jun 30, 2009, 6:52:30 PM

to mnemosyne-...@googlegroups.com

On Tue, Jun 30, 2009 at 06:58:47PM -0300, Francisco Fiuza Jr wrote:
> What do you guys think about limiting the number of cards to review
> each day?

I think you're better off setting a time limit instead of a card limit.
For example, I use Mnemosyne in 30 minute increments. If I still have
reviews left after 30 minutes, I congratulate myself on getting started
and happily move on to another task.

Ben

Jun 30, 2009, 9:33:23 PM

to mnemosyne-...@googlegroups.com

Are you sure about this? Unscientifically, it seems more plausible to
me that there is some limit on how seldom you can review
something and still remember it. For instance, for a specific card,
here are two sequences of optimal times between review:

A: 1mo, 3mo, 6mo, 1yr, 2yr, 3yr, 5yr, 7yr, 10yr ...
B: 1mo, 3mo, 6mo, 1yr, 1yr, 1yr, 1yr, 1yr, 1yr ...

I would think that pattern B is more plausible than A---eventually
memory would peak. In fact, as a person reaches middle age and
beyond, wouldn't their memory deteriorate?

I'm wondering about this because one reason I use Mnemosyne is that
I'm 30 now and, when I'm 60, I'd like to still remember a lot of the
stuff I remember "naturally" now. I'd assume just retaining what I have
now is going to be harder and harder as I age.

--
Ben

----------------- Original message -----------------
From: Peter Bienstman <Peter.B...@ugent.be>
To: mnemosyne-...@googlegroups.com

Peter Bienstman

Jul 1, 2009, 2:44:28 AM

to mnemosyne-...@googlegroups.com

As I mentioned here yesterday, Mnemosyne already shows you the most urgent
cards first, and takes into account how late you are when scheduling. So every
time you do X cards, you can be sure that you are doing the most important
ones.

I'm not sure how useful it would be to have your future schedule read 80 cards
every day for several weeks into the future, as then you'd have no way of
knowing how much behind you really are...

Still, this could be implemented in a plugin.

Cheers,

Peter

Peter Bienstman

Jul 1, 2009, 2:48:07 AM

to mnemosyne-...@googlegroups.com

On Wednesday 01 July 2009 03:33:23 am Ben wrote:
> Are you sure about this? Unscientifically, it seems more plausible to
> me that there is some limit on how seldom you can review
> something and still remember it. For instance, for a specific card,
> here are two sequences of optimal times between review:
>
> A: 1mo, 3mo, 6mo, 1yr, 2yr, 3yr, 5yr, 7yr, 10yr ...
> B: 1mo, 3mo, 6mo, 1yr, 1yr, 1yr, 1yr, 1yr, 1yr ...

Right now the algorithm does behave like A. Whether that is the best way on a
30y time span, nobody knows, but that's one of the things I hope to find out by
collecting stats. I'm in it for the long run :-)

Peter

Meishu

Jul 1, 2009, 7:33:42 AM

to mnemosyne-proj-users

If I may repeat what was said by SuperMemo's people, they emphasize
the following: reviewing is not learning.

Even if one accumulates 1000 cards to be reviewed, not just 200, and
has truly learned them first, the reviewing procedure wouldn't
be that long, nor would it be painful. I somehow suspect that newbies
to the program abandon it not because of the large number of cards to
review, but rather due to the large number AND the fact that they
don't remember anything. This is more due to misuse of the core
requirements of SRS and not so much due to the quantity. If you
follow the core requirement, then the quantity is only secondary and
quite personal.

Ben

Jul 1, 2009, 8:24:39 AM

to mnemosyne-...@googlegroups.com

But won't the algorithm only behave like A if you keep getting
the card right? Assuming the ideal sequence is as in B, then by
definition when Mnemosyne asks you about the card after a period
longer than a year, you won't do well. So won't the actual interval
bounce around the ideal interval? For instance, suppose the ideal
interval is always 1 year. I thought Mnemosyne would behave like
this:

Mnemosyne interval   Grade user
(in years)           gives card
.5                   5
1                    4
1.3                  2
1.1                  3
.9                   4
1.1                  3
1.0                  4
1.2                  3
...                  ...

The numbers are just illustrations, but I thought Mnemosyne would
shorten the interval if you gave the card less than a 4, and would
lengthen it if you gave the card a 4 or 5. So if the ideal interval
is always 1 yr, won't Mnemosyne approximate that, at least to
some degree?

--
Ben

----------------- Original message -----------------
From: Peter Bienstman <Peter.B...@ugent.be>
To: mnemosyne-...@googlegroups.com

Peter Bienstman

Jul 1, 2009, 8:38:46 AM

to mnemosyne-...@googlegroups.com

On Wednesday 01 July 2009 02:24:39 pm Ben wrote:
> But won't the algorithm only behave like A if you keep getting
> the card right?

Yes, good catch, I was assuming that you made no errors.

> Assuming the ideal sequence is as in B, then by
> definition when Mnemosyne asks you about the card after a period
> longer than a year, you won't do well. So won't the actual interval
> bounce around the ideal interval? For instance, suppose the ideal
> interval is always 1 year. I thought Mnemosyne would behave like
> this:
>
> Mnemosyne interval   Grade user
> (in years)           gives card
> .5                   5
> 1                    4
> 1.3                  2
> 1.1                  3
> .9                   4
> 1.1                  3
> 1.0                  4
> 1.2                  3
> ...                  ...
>
> The numbers are just illustrations, but I thought Mnemosyne would
> shorten the interval if you gave the card less than a 4, and would
> lengthen it if you gave the card a 4 or 5. So if the ideal interval
> is always 1 yr, won't Mnemosyne approximate that, at least to
> some degree?

Actually, in your example all the grades are 'pass' grades. And even if you
grade a card 2, the interval will still lengthen a bit, but not as much as
with higher grades.
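
The point above can be sketched as follows: every pass grade (2 to 5) lengthens the interval, only by a different amount, so the interval does not shrink back unless the card is actually failed. The growth factors and the function name are invented for illustration and are not Mnemosyne's actual algorithm.

```python
# Made-up growth factors per pass grade; NOT Mnemosyne's real numbers.
GROWTH = {2: 1.1, 3: 1.5, 4: 2.0, 5: 2.5}


def next_interval(interval_days, grade):
    """Return the next interval in days under the illustrative rule above."""
    if grade < 2:
        # Assumption: a failed card leaves the long-term schedule and is
        # relearned with grades 0 and 1.
        return 0.0
    return interval_days * GROWTH[grade]


interval = 365.0
for grade in (5, 4, 2, 3):
    interval = next_interval(interval, grade)
    print(grade, round(interval))  # the interval grows after every pass grade
```

Under such a rule, a run of grade 2 and 3 answers slows the growth but never brings the interval back down to a fixed one-year ideal.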

Cheers,

Peter

Oisín

Jul 1, 2009, 9:48:25 AM

to mnemosyne-...@googlegroups.com

2009/7/1 Meishu <meis...@gmail.com>:

I think you're underestimating how much variation there is in the
difficulty in learning and remembering the information in each card,
between different subjects. When I studied German briefly before going
on a short holiday, I was able to cover about 80 new cards a day in
about 45 minutes, without feeling burdened - my reviews took about 3-5
seconds per card and were very easy.
However, to learn Chinese characters/words/grammar takes me ~17
seconds per card, because I speak the word and write the character
before checking the answer.

There is an atomic, indivisible amount of work required for cards of
various topics which cannot be broken down much further, which impacts
the difficulty of learning _and_ reviewing cards in an unavoidable
way. Granted, I could separate the pronunciation from the written
form, but that wouldn't have any real benefit (the speaking part takes
two or three seconds - writing the characters is the bottleneck).

However, I certainly agree with you in the presumption that a lot of
newbies to SRS systems make mistakes in how they learn the material
(e.g. assuming that the brain is a perfectly functioning lossless
database where true learning and comprehension will automatically
happen just by reviewing cards - we see this attitude when people ask
for pre-built decks to download... it's wishful thinking that we could
almost upload knowledge into our brains like in the Matrix), and how
they construct their cards (e.g. one card covering every conjugation of a
verb in a particular tense, when it would be much better to break it
into a single conjugation per card).

Oisín

Francisco Fiuza Jr

Jul 1, 2009, 12:38:56 PM

to mnemosyne-...@googlegroups.com

Hello Peter,

> I'm not sure how useful it would be to have your future schedule read 80 cards
> every day for several weeks into the future, as then you'd have no way of
> knowing how much behind you really are...

The status bar would show something like this:

Scheduled: 80; late cards: 220; unlearned: 30

You can know how behind you are.
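
A small sketch of how such a status line could be derived, assuming 300 cards are due in total and the daily cap is 80. The function and its parameters are hypothetical and not part of Mnemosyne.

```python
def status_line(due_total, cap, unlearned):
    """Format a status bar that shows the capped count and the hidden backlog."""
    scheduled = min(due_total, cap)
    late = due_total - scheduled
    return f"Scheduled: {scheduled}; late cards: {late}; unlearned: {unlearned}"


print(status_line(due_total=300, cap=80, unlearned=30))
# Scheduled: 80; late cards: 220; unlearned: 30
```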

My point is, for example, let's say I review 80 cards, and I forget 8 of them giving a grade of 1. I would rather go through those 8 cards I forgot than to keep reviewing the other ones.

That's because if I have 2000 to review, at the end I'll have about 200 grade 1 cards.

Regards,

Frank

Peter Bienstman

Jul 1, 2009, 12:46:04 PM

to mnemosyne-...@googlegroups.com

On Wednesday 01 July 2009 06:38:56 pm Francisco Fiuza Jr wrote:

> My point is, for example, let's say I review 80 cards, and I forget 8 of
> them giving a grade of 1. I would rather go through those 8 cards I forgot
> than to keep reviewing the other ones.

My personal opinion is that if you have such a big backlog, then it's better
to do the scheduled cards first so as to avoid forgetting any more cards.

Still, the behaviour you describe could be implemented by a custom scheduler
plugin.

Cheers,

Peter

Meishu

Jul 6, 2009, 6:11:56 AM

to mnemosyne-proj-users

Oisin,

Well, you don't have to review it all in one day; you can spread it
over several days. If the cards were truly absorbed, it shouldn't make
much of a difference. If they are just in the beginning stages, then
it's even less important IMHO.

Having said this, Mnemosyne or any other program is no magic
bullet. If we head off to Thailand for a couple of weeks while
we should be studying, there will be consequences (other than a nice
tan).

You mentioned 17 seconds as if it's a long period of time. For some
cards I might actually be spending much more than that trying to
"retrieve" the right answer. It seems to be a crucial element in
retaining and learning new cards. Not just clicking on the answer, but
really focusing hard about it.

On Jul 1, 9:48 pm, Oisín <denpasho...@gmail.com> wrote:
> 2009/7/1 Meishu <meishu...@gmail.com>:

querido

Jul 24, 2009, 7:40:07 PM

to mnemosyne-proj-users

On Jun 30, 3:38 am, Peter Bienstman <Peter.Bienst...@ugent.be> wrote:

> On a more personal note, I've been using (the predecessor of) Mnemosyne since
> 2003, learning roughly 5 cards a day. I'm now at 175 scheduled cards a day on
> average, and 8500 cards in my database. There is still plenty of stuff I want
> to learn, and if I keep up what I think is this steady, gentle pace, I could
> be at 350 reps daily in 5 more years...
>
> This very long term aspect is definitely something that needs thinking about,
> either by making a more thorough analysis of the logs and tweaking the
> algorithm, or by pruning the cards in my database.
>
> Peter

I know you're waiting for the hard data, but allow me to share a few
ideas.

1. One resists the idea of pruning cards, but after thinking about it
I have found a good rationale (not proof). Some have opined that at
very short intervals, beginning from the first glance at a fact,
flashcards are not yet the ideal tool, that one should first learn the
fact to some (arguable) degree. (I agree, without knowing how well
that should be. I've tried stretching this out as long as eight days,
memorizing material before flashcarding it for retention-only. An
ideal is probably in there somewhere.) Now, similarly, maybe the
single-fact, atomized-data style flashcard system becomes non-ideal
again at long intervals too. For example, one of my earlier Chinese
textbooks broke down into 900 flashcards total, which are still in
Mnemosyne. I can now read, aloud or not, this book fairly rapidly, and
understand its audio. So, reading or listening I zoom over hundreds of
"atoms", all nicely connected with context and grammar, etc.- real
language. At some point, it might be a good idea to prune all 900
cards and make a "review scheduling style" card maybe like this:
Front- "read Modern Chinese Reader aloud" Back- "Did you know (almost)
everything?" (Where "almost everything" concedes that your brain is
not a machine, after all; we all have a standard, and compromise on
"perfection" for the sake of just carrying on living and learning.) I
now intend to do this when I get around to it. (By the way, this would
make it even more important that you're learning from something
cohesive, like a book with lots of context, *so that* you could later
prune all of the cards, knowing you can still hold them all securely
in one hand.)

I had thought that once cards were known perfectly well, each card
would become sufficiently effortless. It doesn't; it is still many
times harder than flying over that same fact in context while reading
or listening. (You could prove that.) I had also thought that once
they were promoted far enough, they would practically disappear. Well,
your testimony above confirms my impression that they don't disappear
quite well enough. This is what motivated me to think about this again.

2. Back to your discussion above about rep-buildup: Again without
proof, after some years tweaking your program (trying to stay within
the bounds of something that would fit into your project), the most
convincing improvement I think is the idea of demotion by some factor
instead of failing all the way back. I have a strong common-sense
rationale for this too. At some point, we think we "know" a card, but
a cautious compromise is to promote it by some factor instead of "all
the way". When we miss a card, we could employ a symmetrical "cautious
compromise" (cautiously avoiding a buildup at shorter intervals that
would crowd other cards we *know* we don't know yet), and demote by a
factor, instead of all the way. Price paid: we won't know until we see
it again, at some shorter interval, whether we really still have some
grip on it. Benefit gained: less clogging of the cards we *know* we
don't yet retain well. (Common sense exercise: jot down the ones
missed and review outside the program.) This is a common sense
compromise, the degree of which is powerfully controlled by this one
variable: very convenient and easy to understand.

3. Lastly, I'll mention this here because it is the *only* other
important thing I finally decided on; I posted about it here some
months ago. This is the idea of accelerated promote/demote, where your
EF (easiness factor) is still calculated as usual per button and saved,
but an additional (one time, *this* time) factor accelerates the desired
promotion/slowing/demotion.
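
As a rough illustration of points 2 and 3, here is a minimal Python sketch. It is not Mnemosyne's scheduler: the function names, the default `demotion_factor` of 0.5, the one-day floor and the `boost` parameter are all assumptions made up for this example.

```python
def demote_interval(interval_days, demotion_factor=0.5, minimum_days=1):
    """On a lapse, shrink the interval by a factor instead of resetting it.

    demotion_factor is a hypothetical tuning knob: 0.5 halves the interval.
    """
    if not 0 < demotion_factor < 1:
        raise ValueError("demotion_factor must be between 0 and 1")
    return max(minimum_days, round(interval_days * demotion_factor))


def next_interval(interval_days, easiness, boost=1.0):
    """Usual promotion (interval * easiness), times a one-time boost.

    The boost applies to this repetition only; the stored easiness
    factor is not modified here.
    """
    return max(1, round(interval_days * easiness * boost))


# A missed card with a 120-day interval comes back at 60 days, not at day 1.
print(demote_interval(120))
# Normal promotion versus a one-time accelerated promotion.
print(next_interval(30, 2.5), next_interval(30, 2.5, boost=2.0))
```

The single `demotion_factor` is the "one variable" mentioned in point 2: values close to 1 barely demote a missed card, values close to 0 approach the usual full reset.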

I contributed a tiny bit of code a couple of years ago. Now I have
1.99 running with virtualenv, etc., and hope to produce algorithm
plugins. I won't mind if someone beats me to it though.
I exchanged my very first all-Chinese emails the other day. Now I'm
trying not to burst with premature satisfaction. Mnemosyne is at the
heart of my language-learning world, every day, and I appreciate it
very much.

Peter Bienstman

Jul 25, 2009, 2:22:04 AM

to mnemosyne-...@googlegroups.com

On Saturday 25 July 2009 01:40:07 am querido wrote:

> I contributed a tiny bit of code a couple of years ago. Now I have
> 1.99 running with virtualenv, etc., and hope to produce algorithm
> plugins.

Be my guest!

BTW, by using grade 5 more often, and holding off from adding new cards for a
while, my rep count is down from 175 to 130 a day :-)

Peter

Ben

Jul 25, 2009, 8:46:28 PM

to mnemosyne-...@googlegroups.com

Thanks, this is an interesting email. I agree with your thoughts that I quoted below. I think a program like Mnemosyne seems pretty optimal if your goal is, for the rest of your life, to be able to recall a fact at random in a few seconds from your deck. For languages, this may fit real life very well---you never know when you will run across some word and you want to be able to recall the meaning in a few seconds.

However, this context-free benchmark may not be appropriate for some other knowledge. For instance, suppose I wanted to remember linear/abstract algebra for the rest of my life well enough that, should I run across a paper that uses basic linear algebra, I could spend 10 minutes reviewing and be able to understand the paper. This is different from being able to remember in a few seconds the definition of a "homology" or whatever.

This isn't a complaint about Mnemosyne of course, I'm just agreeing that a card-based system may not be the ultimate answer to retaining all knowledge and skills.

--
Ben

----------------- Original message -----------------
From: querido <tworoads...@gmail.com>
To: mnemosyne-proj-users <mnemosyne-...@googlegroups.com>
Date: Fri, 24 Jul 2009 16:40:07 -0700 (PDT)

...

Gwern Branwen

Jul 25, 2009, 9:05:11 PM

to mnemosyne-...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sat, Jul 25, 2009 at 8:46 PM, Ben wrote:
> However, this context-free benchmark may not be appropriate for some other knowledge. For instance, suppose I wanted to remember linear/abstract algebra for the rest of my life well enough that, should I run across a paper that uses basic linear algebra, I could spend 10 minutes reviewing and be able to understand the paper. This is different from being able to remember in a few seconds the definition of a "homology" or whatever.

What if you have a deck principally of small examples and questions?

I've been learning Scheme through SICP, the SICP online tester, and
the R5RS report defining Scheme; I have essentially copied all the
small examples of syntax and semantics I've come across (and added new
ones by modifying those examples to cover in detail edge cases I
didn't understand). While some of my cards are definition-style (for
fundamental functions), most of them are those examples - 'evaluate
these 3 expressions' ultimate result', 'is this syntax correct:
yes/no' etc.

Why wouldn't this approach let me understand random Scheme I come
across in ten years - modulo the advanced stuff I simply haven't
gotten to yet, or the use of libraries I don't know? Certainly I would
expect it to. I don't see any reason why this couldn't be true of
linear algebra. Is it that you don't have a mass of problems and
examples for linear algebra, only an impoverished set of definitions
and theorems?

- --
gwern
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkprq8YACgkQvpDo5Pfl1oLMUQCeOadWjQUJxm6AiKGDWmoRIBce
m1YAn1cwF0D+a/Y8PLOR9knXmptS9Uva
=EM6a
-----END PGP SIGNATURE-----

Ben

Jul 25, 2009, 9:33:11 PM

to mnemosyne-...@googlegroups.com

This might work, but would it be the most efficient way of going about this? I have two thoughts:

1. I mentioned this because of the "context" problem that querido brought up. For instance, the strategy that you mention would allow you to retain the ability (call it ability A) to arbitrarily evaluate some Scheme code in a few seconds, regardless of the context. This is nice, but what if that's more than what you wanted? Suppose instead you didn't want to know any Scheme at all offhand, but you wanted the ability (call this ability B) to be able to review SICP for 30 minutes and then be able to evaluate some Scheme code. Ability A implies ability B perhaps, but suppose all you really want is ability B. Doesn't it stand to reason that, over the years, ability A will take longer to maintain than ability B?

2. Not all mental skills can be called memory. Suppose you wanted to retain the ability to multiply two arbitrary 3 digit numbers in your head in 30 seconds. It wouldn't make sense to make a bunch of cards depicting various specific numbers to multiply. To me, remembering how to program or how to do linear algebra falls in the grey area between remembering the definition of a word and "remembering" how to ride a bicycle.

Cheers,
--
Ben

----------------- Original message -----------------
From: Gwern Branwen <gwe...@gmail.com>
To: mnemosyne-...@googlegroups.com
Date: Sat, 25 Jul 2009 21:05:11 -0400

...

Jason Axelson

Jul 25, 2009, 9:51:27 PM

to mnemosyne-...@googlegroups.com

On Sat, Jul 25, 2009 at 3:33 PM, Ben<mi...@emerose.org> wrote:
> 2. Not all mental skills can be called memory. Suppose you wanted to retain the ability to multiply two arbitrary 3 digit numbers in your head in 30 seconds. It wouldn't make sense to make a bunch of cards depicting various specific numbers to multiply. To me, remembering how to program or how to do linear algebra falls in the grey area between remembering the definition of a word and "remembering" how to ride a bicycle.

I think that for things like this it would be cool to program some
type of "dynamic card" that would make a math problem out of random
numbers and quiz you on it. I agree that this could be very helpful;
it might make sense to use a different scheduling algorithm for this.
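
A "dynamic card" of this kind could be as small as the following Python sketch. It only shows the idea of generating a fresh question and answer on each review; the function name is hypothetical and nothing here is wired into Mnemosyne.

```python
import random


def make_multiplication_card(rng=random):
    """Return a fresh (question, answer) pair for two 3-digit numbers.

    rng can be the random module itself or a seeded random.Random
    instance, which makes the card reproducible for testing.
    """
    a = rng.randint(100, 999)
    b = rng.randint(100, 999)
    return f"{a} x {b} = ?", str(a * b)


# Each call yields a different problem, so there is nothing to memorise.
question, answer = make_multiplication_card()
print(question)
```

Because the numbers change on every review, the grade would reflect the skill of multiplying rather than recall of one particular product, which is presumably why a different scheduling algorithm might suit it.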

Jason

Gwern Branwen

Jul 25, 2009, 9:59:57 PM

to mnemosyne-...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sat, Jul 25, 2009 at 9:33 PM, Ben wrote:
> This might work, but would it be the most efficient way of going about this? I have two thoughts:
>
> 1. I mentioned this because of the "context" problem that querido brought up. For instance, the strategy that you mention would allow you to retain the ability (call it ability A) to arbitrarily evaluate some Scheme code in a few seconds, regardless of the context. This is nice, but what if that's more than what you wanted? Suppose instead you didn't want to know any Scheme at all offhand, but you wanted the ability (call this ability B) to be able to review SICP for 30 minutes and then be able to evaluate some Scheme code. Ability A implies ability B perhaps, but suppose all you really want is ability B. Doesn't it stand to reason that, over the years, ability A will take longer to maintain than ability B?

I hadn't thought of that, as I want A & B, so A implying B didn't
bother me. There's surely a minimal subset of Scheme one needs to
understand random SICP sections, though. I'm having a hard time seeing
what example might have A -> B, but not have B a subset of A.
The only thing I can think of are fields that overlap, but then why is
one targeting/studying A in the first place?

As for the time investment: I currently have 390 cards; assume I
increase to 1000 by the time I finish - which should be in the right
ballpark, Scheme is known as a minimalistic language - and further
assume that the SuperMemo people are right that the lifetime effort
devoted to studying each card is ~5 minutes. That means ~5000 minutes,
or ~80 hours over my life; if we assume the minimal subset is half
that and I don't actually want to know Scheme-in-general just
Scheme-for-SICP, then the time I'm wasting over my life is 40 hours.
Which doesn't seem too bad - I could recoup those 40 hours just by
laying off Reddit a bit.
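
The estimate above is easy to check. A small Python sketch, taking the 5-minutes-per-card figure at face value (it is the rule of thumb quoted here, not a measured value) and treating the half-size "minimal subset" as a hypothetical:

```python
MINUTES_PER_CARD = 5  # rule of thumb quoted in the thread, not a measurement


def lifetime_hours(cards, minutes_per_card=MINUTES_PER_CARD):
    """Total lifetime review time in hours for a deck of `cards` cards."""
    return cards * minutes_per_card / 60


full = lifetime_hours(1000)   # the whole projected deck
subset = lifetime_hours(500)  # hypothetical minimal subset: half the cards
print(f"{full:.1f} h total, {full - subset:.1f} h beyond the minimal subset")
```

This gives roughly 83 hours for the full deck and roughly 42 hours for the extra half, in line with the rounded "~80" and "40" used in the message.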

Incidentally, Peter, if you're reading this thread: *are* the
SuperMemo folks right about each card taking 5 minutes? I've added a
number of cards based on that belief, and maybe the preliminary
statistics have something to say about that rule of thumb.

> 2. Not all mental skills can be called memory. Suppose you wanted to retain the ability to multiply two arbitrary 3 digit numbers in your head in 30 seconds. It wouldn't make sense to make a bunch of cards depicting various specific numbers to multiply. To me, remembering how to program or how to do linear algebra falls in the grey area between remembering the definition of a word and "remembering" how to ride a bicycle.

Well, hold on. Why wouldn't it make sense? We're technically inclined
folks, it wouldn't be hard for us to write a quick script or macro to
generate, say, 500 random cards which ask us to multiply abc by xyz,
and import them at grade 4 or 5.
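
Such a script might look like the sketch below. It assumes the cards are imported from a plain text file with one tab-separated question/answer pair per line; the function names and the file name in the usage note are made up for illustration.

```python
import random


def multiplication_cards(count=500, seed=None):
    """Return `count` distinct (question, answer) pairs of 3-digit products."""
    rng = random.Random(seed)
    seen = set()
    cards = []
    while len(cards) < count:
        a, b = rng.randint(100, 999), rng.randint(100, 999)
        if (a, b) in seen:
            continue  # skip duplicates so every card is a new problem
        seen.add((a, b))
        cards.append((f"{a} x {b} = ?", str(a * b)))
    return cards


def write_tab_separated(cards, path):
    """Write one card per line as question<TAB>answer."""
    with open(path, "w", encoding="utf-8") as handle:
        for question, answer in cards:
            handle.write(f"{question}\t{answer}\n")
```

Calling `write_tab_separated(multiplication_cards(500, seed=1), "multiplication.txt")` would produce a file ready for import; choosing the initial grade of 4 or 5 is left to the import step.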

Which is your mind going to do - get good at multiplying two 3-digit
numbers (generate on-demand), or memorize 500 different multiplication
problems (memoize)?

True, Mnemosyne won't enforce the 30-second stricture, but review has
always required honesty of the user.

(From my experience with multiple subtle variants on a card, the mind
gives up after just a few and falls back on a problem-solving approach
- - which is exactly what one wants to exercise, in this case.)

- --
gwern
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkpruJsACgkQvpDo5Pfl1oLWhwCfbFiv4xH7VPMXf4C5rv/o0nS9
8yUAn31GIDfwBAENM9ZkIU2rWelxGNQ3
=qGID
-----END PGP SIGNATURE-----

Gwern Branwen

Jul 25, 2009, 10:03:31 PM

to mnemosyne-...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sat, Jul 25, 2009 at 9:51 PM, Jason Axelson wrote:
> I think that for things like this it would be cool to program some
> type of "dynamic card" that would make a math problem out of random
> numbers and quiz you on it. I agree that this could be very helpful,
> it might make sense to use a different scheduling algorithm for this.
>
> Jason

Sure. But we can do better even for static pre-generated cards like I
suggested up above. For example, Peter mentioned that Mnemosyne 2's
cloze plugin will avoid scheduling cloze deletions too close together
since if you see one, you are contaminated for the next. This
expansion would work perfectly well for multiplication cards - if
you've done 1 or 2 multiplies today, then push the rest to tomorrow.
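
The "push the rest to tomorrow" rule can be sketched in Python as follows. This is a simplified illustration, not the cloze plugin's actual logic: the `group` labels and the `max_per_group` limit are assumptions.

```python
from collections import defaultdict


def spread_related(due_cards, max_per_group=2):
    """Split today's due cards into (today, postponed).

    due_cards is a list of (card_id, group) pairs, where `group` marks
    cards that contaminate each other (for example "multiplication").
    At most max_per_group cards from any one group stay on today's list.
    """
    seen = defaultdict(int)
    today, postponed = [], []
    for card_id, group in due_cards:
        if seen[group] < max_per_group:
            seen[group] += 1
            today.append(card_id)
        else:
            postponed.append(card_id)
    return today, postponed


due = [("m1", "mult"), ("m2", "mult"), ("m3", "mult"), ("v1", "vocab")]
print(spread_related(due))
```

With the default limit of 2, the third multiplication card is held back while the unrelated vocabulary card is still shown today.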

(Coolest would be, as you say, a dynamic card. This could be quite
easy: add a markup type like , except make it for Python code. The
arbitrary python code gets evaluated in Mnemosyne, and the 2 results
substituted in as question and answer. I can't even begin to imagine
all the possibilities for something like that.)

- --
gwern
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkpruXIACgkQvpDo5Pfl1oIewwCeLBt/vPHaHQw+PtskchH8jsYG
2eMAoJLLAxL1HOKzLU6hphV77Dk4N2KD
=pzkl
-----END PGP SIGNATURE-----

Peter Bienstman

Jul 26, 2009, 2:19:43 AM

to mnemosyne-...@googlegroups.com

On Sunday 26 July 2009 04:03:31 am Gwern Branwen wrote:
> On Sat, Jul 25, 2009 at 9:51 PM, Jason Axelson wrote:
> > I think that for things like this it would be cool to program some
> > type of "dynamic card" that would make a math problem out of random
> > numbers and quiz you on it. I agree that this could be very helpful,
> > it might make sense to use a different scheduling algorithm for this.
> >
> > Jason
>
> Sure. But we can do better even for static pre-generated cards like I
> suggested up above. For example, Peter mentioned that Mnemosyne 2's
> cloze plugin will avoid scheduling cloze deletions too close together
> since if you see one, you are contaminated for the next. This
> expansion would work perfectly well for multiplication cards - if
> you've done 1 or 2 multiplies today, then push the rest to tomorrow.
>
> (Coolest would be, as you say, a dynamic card. This could be quite
> easy: add a markup type like , except make it for Python code. The
> arbitrary python code gets evaluated in Mnemosyne, and the 2 results
> substituted in as question and answer. I can't even begin to imagine
> all the possibilities for something like that.)

The 2.x code base was specifically designed to be able to handle this. So,
download the code and eat your heart out with implementing sophisticated card
types :-)

The cloze card type is the best example of cards that are more than just a
concatenation of predefined fields, so that's a good place to start looking once
you've studied the regular card types.

Peter

Peter Bienstman

Jul 26, 2009, 2:25:37 AM

to mnemosyne-...@googlegroups.com

On Sunday 26 July 2009 03:59:57 am Gwern Branwen wrote:

> Incidentally, Peter, if you're reading this thread: *are* the
> SuperMemo folks right about each card taking 5 minutes? I've added a
> number of cards based on that belief, and maybe the preliminary
> statistics have something to say about that rule of thumb.

I haven't gotten around to looking at the stats yet in great detail. I am
however working now on code to import your 1.x cards and history into the 2.0
SQL database.

As a side effect of testing this, I hope to generate a big scary SQL database
with all the collected logs in the near future. I'd rather continue to work on
2.0 at that stage as opposed to analysing the logs in detail, but if someone
wants to play with it then (and perhaps further optimise the code in the
process), feel free to do so.

Cheers,

Peter

Oisín

Jul 26, 2009, 9:11:35 AM

to mnemosyne-...@googlegroups.com

2009/7/26 Peter Bienstman <Peter.B...@ugent.be>:

>
> On Sunday 26 July 2009 03:59:57 am Gwern Branwen wrote:
>
>> Incidentally, Peter, if you're reading this thread: *are* the
>> SuperMemo folks right about each card taking 5 minutes? I've added a
>> number of cards based on that belief, and maybe the preliminary
>> statistics have something to say about that rule of thumb.
>
> I haven't gotten around to looking at the stats yet in great detail. I am
> however working now on code to import your 1.x cards and history into the 2.0
> SQL database.

I exported my Chinese deck to Anki a year and a half ago when
Mnemosyne wouldn't work on my new Macbook (due to either pygame or
pyqt not compiling on OS X 10.5), while keeping Mnemosyne on my
Windows and Linux boxes for French and German (what a mess :D).
Having a quick look at my deck, I see stats recorded by Anki of
between 5 minutes for easy cards and 19 minutes for very difficult,
mature cards (9 months old or so). I don't know what the average is,
but I'd expect something more like 10 minutes. Certainly 5 minutes as
a lifetime card maximum seems like a very hopeful estimate, for a
learner who never misses reviews, with easy cards.

As usual, it comes down to a question of how difficult the material in
each card is. E.g. I have a few English-English cards (for words like
"hinterland" and "overweening") in the same deck, which are a year and
a half old and on a ~1.6 year interval, with about 30 sec up to 2 mins
on each.

Personally, I'm not sure if using Mnemosyne to learn (memorise?)
Scheme is a productive use of time - programming being less about a
large atomic vocabulary than a small language with many ways to apply
it. Since you already have the knack of programming, I would suggest
that all programming languages are just tiny dialects that sit atop
your existing programming knowledge.

Most of programming is about developing abstract skills, somewhat
similar to driving a car. I wouldn't use an SRS to learn how to drive
a car :D

That said, I'd love to hear how it pans out and if it can work well.
Perhaps my prejudice against SRS use for more difficult subjects than
vocabulary/grammar/facts comes from my failure to use it successfully
when studying a couple of final year compsci courses. Which is
probably down to poor application by myself rather than limited
applicability!

Oisín

Ben

Jul 26, 2009, 10:42:42 AM

to mnemosyne-...@googlegroups.com

You're probably right that that approach would work, but would it be
the most efficient way to practice? I think the problem is that
Mnemosyne would not see the connections between the 500 different
cards. Even if each card takes 5 minutes to learn for a lifetime,
that's still 5*500/60 = 42 hours of time. Perhaps it would be much
more efficient to review the single card that generated random
problems each time.

I suppose part of this discussion is just about whether 40 hours is a
long time or not :). I think of it as being a long time, so the idea
of a strategy being slightly inefficient bothers me. But your point
about reddit applies to me too---I definitely have "wasted" waay over
40 hours in my life.

--
Ben

----------------- Original message -----------------
From: Gwern Branwen <gwe...@gmail.com>
To: mnemosyne-...@googlegroups.com

Date: Sat, 25 Jul 2009 21:59:57 -0400

...

Gwern Branwen

Jul 26, 2009, 12:59:21 PM

to mnemosyne-...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sun, Jul 26, 2009 at 9:11 AM, Oisín wrote:
> I exported my Chinese deck to Anki a year and a half ago when
> Mnemosyne wouldn't work on my new Macbook (due to either pygame or
> pyqt not compiling on OS X 10.5), while keeping Mnemosyne on my
> Windows and Linux boxes for French and German (what a mess :D).
> Having a quick look at my deck, I see stats recorded by Anki of
> between 5 minutes for easy cards and 19 minutes for very difficult,
> mature cards (9 months old or so). I don't know what the average is,
> but I'd expect something more like 10 minutes. Certainly 5 minutes as
> a lifetime card maximum seems like a very hopeful estimate, for a
> learner who never misses reviews, with easy cards.

Hm. The SuperMemo answer no doubt is that anything much beyond 5
minutes represents a card which needs to be broken down and made
easier, or studied better somehow.

I installed Anki and imported my deck, only to find that apparently
tracking the time like that only works if you do your reviews in Anki.
Drat! I would've liked to find the leeches in my deck.

I wasn't considering switching to Anki before, but between the web
review stuff, and this timing feature, it seems tempting. As someone
who used both simultaneously for quite a while, how do they stack up?

> As usual, it comes down to a question of how difficult the material in
> each card is. E.g. I have a few English-English cards (for words like
> "hinterland" and "overweening") in the same deck, which are a year and
> a half old and on a ~1.6 year interval, with about 30 sec up to 2 mins
> on each.

Yes, that seems pretty reasonable to me.

> Personally, I'm not sure if using Mnemosyne to learn (memorise?)
> Scheme is a productive use of time - programming being less about a
> large atomic vocabulary than a small language with many ways to apply
> it. Since you already have the knack of programming, I would suggest
> that all programming languages are just tiny dialects that sit atop
> your existing programming knowledge.

By that argument, isn't this the best way of going about learning
Scheme given that I already have the knack of programming/know
functional Haskell programming?

If the differences between them are dialectal, then that suggests
that only the vocab and syntax differ substantially - and what's the
killer app for SRS? Vocab...

> Most of programming is about developing abstract skills, somewhat
> similar to driving a car. I wouldn't use an SRS to learn how to drive
> a car :D

Hmm. Maybe in 2.0 we can add a 'joystick' card type, which fires up a
3D driving simulator! You can have cards covering every aspect of
intersections, icy bridges, rights of way... Bwa ha ha.

> That said, I'd love to hear how it pans out and if it can work well.
> Perhaps my prejudice against SRS use for more difficult subjects than
> vocabulary/grammar/facts comes from my failure to use it successfully
> when studying a couple of final year compsci courses. Which is
> probably down to poor application by myself rather than limited
> applicability!
>
> Oisín

It can be difficult. I don't think I would be studying Scheme this way
if I didn't have hundreds of practice problems and all the examples -
it would just be expecting too much of myself. If I knew the right
examples to create, I wouldn't need to study them...

- --
gwern
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkpsi2gACgkQvpDo5Pfl1oK1zwCdHyp1n/cosYKmdUu/UAWr/kfC
3D8An3LXVmE8YGacl8Jgz10YNBX8Jd1c
=ILB1
-----END PGP SIGNATURE-----

Peter Bienstman

Jul 26, 2009, 1:36:39 PM

to mnemosyne-...@googlegroups.com

On Sunday 26 July 2009 06:59:21 pm Gwern Branwen wrote:

> I installed Anki and imported my deck, only to find that apparently
> tracking the time like that only works if you do your reviews in Anki.
> Drat! I would've liked to find the leeches in my deck.

BTW, the leeches thing is really trivial and will be in 2.0: it's just the
cards with lapses > 15.
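
In Python terms the rule is a one-line filter. The sketch below assumes each card exposes a lapse count; the `lapses` key and the dictionary layout are hypothetical, not necessarily how 2.0 stores cards.

```python
def find_leeches(cards, threshold=15):
    """Return the cards whose lapse count exceeds `threshold`."""
    return [card for card in cards if card["lapses"] > threshold]


deck = [
    {"question": "hinterland", "lapses": 2},
    {"question": "overweening", "lapses": 16},
]
print([card["question"] for card in find_leeches(deck)])
```

Note that the comparison is strict, so a card with exactly 15 lapses is not yet counted as a leech.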

But if you feel more comfortable with Anki, by all means go ahead :-)

Cheers,

Peter

querido

Jul 26, 2009, 10:32:16 PM

to mnemosyne-proj-users

I admit that most of the above is an overreaction to a problem I've
given myself: I've been fanatically absorbed in Chinese study for the
last seven months, and while I've made great progress, the rate can't
be sustained. I might consolidate for a while.

I have a big idea for you. I suggest you could skip to the last
paragraph first if you'd rather avoid my unpolished verbosity below.

About #1, above (This is about language, and especially relevant to my
scenario of language learned from a graduated series of textbooks in
which later lessons subsume earlier ones. I know your program is much
more general than this, and other people use it all sorts of ways.):

If I can show that all of the information on some subset of my cards,
all of which are at intervals above some minimum, is present in
composite form in some lesson or text I've studied, and if I can prove
that I possess it now as language (by passing a "review scheduling"
card that tests this whole chunk), then "graduating" from those cards
looks reasonable, to be replaced by this scheduled reading/listening
of the whole.

We know the principle of atomic-data flashcards. But what I'm saying
suggests a new theory of how information should be managed over
time... leading toward the big, hard to flashcardize qualities of real
language. Let's see: A subset of less-composite-data flashcards
*should* be condensed into a more-composite-data flashcard as soon as
come criteria are met. This would build toward "review assignment"
cards (in a separate category to avoid interfering with your schedule
of learning *new* things), like this: front "This month, read War and
Peace (in Russian of course)" back "Did you understand everything to
the standard that you demand of yourself?" At that point, you don't
need the 50,000(?) atomic cards that it would break down into. You
could declare yourself done, with a yearly reading. Corollary: the
more composite, the less the interval should be stretched, leading
asymptotically toward no-stretching, pure maintenance. Corollary: the
more composite, the more time should be allowed for the card to avoid
interfering with the normal reps of new cards. That means fewer of
these cards per unit time, ultimately requiring let's say a button to
indicate that you've started on the assignment, giving you the
permitted day, week etc. to complete it. These cards would be "in
progress", awaiting their grade.

From: *learning* atoms, To: *maintaining* chunks of real language.

Just as a software tool could chop up a book into atomic cards, a
software tool could monitor the learning process and re-condense,
letters into words, words into sentences, etc., as justified. (Chop up
the book recursively down to letters or characters, storing the
intermediate results in a database. Do the audio too!) Integrated into
the flashcard program and automated, total card number would
continually fold downward into fewer more complex cards with lower ef,
until your flashcard displays a link to your favorite bookstore to
fetch this month's assignment!

A practical, partial alternative that acknowledges these principles
and could be implemented now is this: Every time I correctly answer a
composite card, every atom present on that card would have *its own*
card's interval reset, from today, probably even incremented, because
I just saw it, and knew it. The presence of these cards would be
irrelevant then since their intervals should become astronomical! The
list of its atoms, compiled when the composite card is made, could be
stored like tags with the card. This would be huge, and is why
increasing card-complexity should be sought. There you go.
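
The "practical, partial alternative" in the last paragraph could be sketched in Python as below. The data layout (a dictionary of cards with `interval`, `next_review` and an `atoms` list) and the `growth` parameter are assumptions for illustration only.

```python
from datetime import date, timedelta


def credit_atoms(cards, composite_id, today, growth=1.0):
    """After a correct answer on a composite card, refresh its atoms.

    cards maps card id -> dict with "interval" (days), "next_review"
    (a date) and, for composite cards, "atoms" (a list of card ids).
    Each atom's interval restarts from `today`; growth > 1 also
    lengthens it, since the atom was just seen and known.
    """
    for atom_id in cards[composite_id].get("atoms", []):
        atom = cards[atom_id]
        atom["interval"] = max(1, round(atom["interval"] * growth))
        atom["next_review"] = today + timedelta(days=atom["interval"])


cards = {
    "sentence": {"interval": 10, "next_review": date(2009, 8, 1),
                 "atoms": ["w1", "w2"]},
    "w1": {"interval": 30, "next_review": date(2009, 7, 28)},
    "w2": {"interval": 5, "next_review": date(2009, 7, 27)},
}
credit_atoms(cards, "sentence", date(2009, 7, 27), growth=2.0)
print(cards["w1"]["next_review"], cards["w2"]["next_review"])
```

Storing the atom list on the composite card, like tags, is what lets one correct answer push all of its atoms further into the future at once.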

Oisín

Jul 26, 2009, 10:48:55 PM

to mnemosyne-...@googlegroups.com

2009/7/26 Gwern Branwen <gwe...@gmail.com>:

> Hm. The SuperMemo answer no doubt is that anything much beyond 5
> minutes represents a card which needs to be broken down and made
> easier, or studied better somehow.

This could certainly be true; it's been pointed out that separating
the phonetic and character learning might ease some of this, but I
can't bring myself to re-engineer over 2000 cards and add to the mess
:P

> I wasn't considering switching to Anki before, but between the web
> review stuff, and this timing feature, it seems tempting. As someone
> who used both simultaneously for quite a while, how do they stack up?

Both are excellent programs which certainly help during language
learning, and perhaps lots of other material if used correctly. They
share many strengths, but currently Anki's statistics and graphs are
very impressive. On the other hand, Mnemosyne's plugin architecture
looks to be really neat for 2.0, so to be honest, it's too close to
call!

>> it. Since you already have the knack of programming, I would suggest
>> that all programming languages are just tiny dialects that sit atop
>> your existing programming knowledge.
>
> By that argument, isn't this the best way of going about learning
> Scheme given that I already have the knack of programming/know
> functional Haskell programming?
>
> If the differences between them are dialectal, then that suggests
> that only the vocab and syntax differ substantially - and what's the
> killer app for SRS? Vocab...

That could work - as long as you don't get overwhelmed (e.g. when I
tried to convert all of my notes for compiler construction into cards
and ran out of time less than halfway through, by trying to memorise
_everything_).

> It can be difficult. I don't think I would be studying Scheme this way
> if I didn't have hundreds of practice problems and all the examples -
> it would just be expecting too much of myself. If I knew the right
> examples to create, I wouldn't need to study them...

Right, but many of those practice problems contained enough work to be
significantly time-sucking (e.g. my solution for 2.19 is an eyesore).
I hope you can break the information down into finer chunks than the
example questions.

Peter Bienstman

Jul 27, 2009, 1:24:56 AM

to mnemosyne-...@googlegroups.com

It's an interesting concept. It could be implemented by some sort of special
card type for foreign language texts. I've put it on my long term TODO list,
so that I don't forget about it :-)

Peter

Patrick Kenny

Jul 27, 2009, 1:33:13 AM

to mnemosyne-...@googlegroups.com

I suggest you take a look at the implementation of "incremental reading"
in SuperMemo. It's similar to what you outlined, with some improvements.

Cheers,
Patrick

querido wrote:

Gwern Branwen

Jul 27, 2009, 5:55:17 AM

to mnemosyne-...@googlegroups.com

This is an interesting idea, but I'm not sure about it. Aside from the
work of reconsolidating, is this efficient? eg. suppose we have 50k
cards, and half of them need to be reviewed in 6 months and the other
half in 16 months; then your proposal would schedule all of them at 12
months or whatever. Wouldn't this be wastefully soon for half the
cards and way too forgetfully late for the other half? It would seem
to encourage forgetting.

While we're on the topic of new approaches to learning languages,
here's one I found interesting, although I never could quite work out
how to incorporate it into Mnemosyne or SRS:
http://jtauber.com/blog/2008/02/10/a_new_kind_of_graded_reader/

The idea is that a student has a small core vocabulary of Greek verbs
& nouns. You scan some large corpus looking for sentences and
paragraphs which have as few words falling outside that vocabulary as
possible, and ideally just one unknown word, and you present all the
matching sentences for the student to study/translate/learn.* Then,
you re-scan the corpus, having updated the vocabulary with the new
word, and so on and so forth.
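A minimal sketch of that scan / learn / re-scan loop, as I read the description. This is not the blog author's code: the tokenizer, the function names, and the tie-breaking rule (teach the word that unlocks the most sentences first) are all assumptions made for illustration.

```python
import re

def words(sentence):
    """Lower-cased word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", sentence.lower())

def graded_reader(sentences, known):
    """Yield (new_word, matching_sentences) in a learnable order.

    Each round finds the sentences with exactly one word outside the
    known vocabulary, presents those for one chosen word, adds that
    word to the vocabulary, and scans again.
    """
    known = set(known)
    remaining = list(sentences)
    while True:
        candidates = {}  # unknown word -> sentences it alone blocks
        for s in remaining:
            unknown = set(words(s)) - known
            if len(unknown) == 1:
                candidates.setdefault(unknown.pop(), []).append(s)
        if not candidates:
            return  # nothing is exactly one word away any more
        # Assumption: prefer the word that unlocks the most sentences.
        word = max(candidates, key=lambda w: len(candidates[w]))
        yield word, candidates[word]
        known.add(word)
        remaining = [s for s in remaining if s not in candidates[word]]

corpus = ["the dog runs", "the dog sleeps", "the cat runs fast", "the cat sleeps"]
for word, examples in graded_reader(corpus, {"the", "dog"}):
    print(word, examples)
```

Sentences that are never exactly one word away are simply left over, which is where the partial-translation trick in the footnote would come in.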

It's a nice idea - intuitively I feel it's automating something that
good students are doing already eg. consider one blogger's 'sentence
mining' approach:
http://www.glowingfaceman.com/2008/12/sentence-mining.html

But I couldn't figure out the best way to marry it with SRS. I figured
that one viable approach might be to take a corpus, take a set of
foreign vocab which it is mandatory for the user to have, and then
generate the 'minimal' learning path. That is, it'd create thousands
of cards, each covering the next most rare word, and the user could
just work his way through them linearly.

(Actually, maybe this approach isn't as bad as I thought. I've been
generating large numbers of cards for memorizing poems, and it hasn't
worked out too bad as long as I didn't use the randomization plugin.
Hm. I should look into whether the guy's software could be repurposed
for this. A static set of cards could work well: imagine such a
generated card deck for someone learning French: she can choose from
one targeted at _In Search of Lost Time_, or she could pick a deck
targeted at Rene Descartes if her interests inclined that way.)

* There's some extra stuff about translating parts of sentences into
English to focus on a particular word, but I think this is extra - a
hack to get around the fact that a 'small' corpus like the New
Testament isn't often going to give you sentences which have *only*
one unknown word. By translating, you can take a sentence with
multiple unknown words and translate it into a sentence with only one
unknown word.

--
gwern

Gwern Branwen

Jul 27, 2009, 6:00:23 AM

to mnemosyne-...@googlegroups.com

On Sun, Jul 26, 2009 at 10:48 PM, Oisín<denpa...@gmail.com> wrote:
> Right, but many of those practice problems contained enough work to be
> significantly time-sucking (e.g. my solution for 2.19 is an eyesore).
> I hope you can break the information down into finer chunks than the
> example questions.

I think the idea behind 2.19 was that while the recursive solution
doesn't have to be too bad, there is no iterative solution to the
coin-change problem which isn't an eyesore.

But I guess I wasn't clear; I'm getting most of my questions from
http://icampustutor.csail.mit.edu/6.001-public/ - which is all about
bite-sized questions perfect for Mnemosyne. The SICP text isn't too
great on its own for that, assuming you don't care about number theory
and bits of trivia for testing primality. (As an aside, I believe
anyone doing SICP should use all the tests on that site, and also
watch the video. The experience is partial otherwise.)

--
gwern

Peter Bienstman

Jul 27, 2009, 6:40:50 AM

to mnemosyne-...@googlegroups.com

On Monday 27 July 2009 11:55:17 am Gwern Branwen wrote:

> While we're on the topic of new approaches to learning languages,
> here's one I found interesting, although I never could quite work out
> how to incorporate it into Mnemosyne or SRS:
> http://jtauber.com/blog/2008/02/10/a_new_kind_of_graded_reader/

Sounds very interesting for when the student is only interested in a
relatively small corpus, like in that example of old testament Greek. I'm not
so sure it's immediately applicable to living languages, where I guess you can
just rely on standard frequency lists.

Peter

Gwern Branwen

Jul 27, 2009, 6:55:07 AM

to mnemosyne-...@googlegroups.com

On Mon, Jul 27, 2009 at 6:40 AM, Peter Bienstman wrote:
> Sounds very interesting for when the student is only interested in a
> relatively small corpus, like in that example of old testament Greek. I'm not
> so sure it's immediately applicable to living languages, where I guess you can
> just rely on standard frequency lists.

One of his points seems to be that just going down the frequency list
loses you a lot: it's possible you'll need to go way down the list
before *any* sentences become understandable. So his algorithms will
try a mix of low and high frequency words, searching for whichever
group of, say, 10 words will lead to the most translated sentences.

This may not reflect the frequency count (imagine that there are 5
sentences which are the sole usage of some rare word Z which is ranked
#1000; the best solution might be to learn #1, #2, #3, and Z to get 5
sentences, while just knowing 1-4 will leave the learner adrift since
those 4 words are all used in sentences with rarer verbs and nouns).
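To make the parenthetical concrete, here is a toy version of that situation. The corpus is entirely made up: w1..w4 stand for the four most frequent words (ranks assumed to come from a much larger corpus), Z is the rare word, and r1..r3 are other rare words.

```python
def readable(sentences, known):
    """Count the sentences whose words are all in the known set."""
    return sum(1 for s in sentences if set(s.split()) <= known)

# Hypothetical toy corpus, not real data.
corpus = [
    "w1 Z", "w2 Z", "w3 Z", "w1 w2 Z", "w2 w3 Z",    # the five Z sentences
    "w1 w4 r1", "w2 w4 r2", "w3 w4 r3", "w4 w1 r1",  # w4 only occurs with rare words
]

by_frequency = {"w1", "w2", "w3", "w4"}    # straight down the frequency list
with_rare_word = {"w1", "w2", "w3", "Z"}   # swap #4 for the rare word Z

print(readable(corpus, by_frequency))    # 0
print(readable(corpus, with_rare_word))  # 5
```

Same number of words learned, but only the second set makes any sentence fully readable.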

Since this sounds very much like a NP-hard problem, his code uses
interesting search techniques like simulated annealing. Which might
imply that efficiency would be a concern on a large corpus like the
Proust I suggested, but he doesn't seem to've looked into its
scalability.

--
gwern

querido

Jul 30, 2009, 5:19:43 PM

to mnemosyne-proj-users

Of course, it has been thought of before... though I haven't seen it
in this context. What I've described is first of all a "lexical
browser" (such a thing already exists). The text is recursively parsed
down to its minimal tokens, with all intermediate results stored in a
tree-shaped database. The database could be browsed with something
like a filesystem browser. Clicking on any node lists its
dependencies; for example, clicking on a sentence-node would list its
words. Thus far, it is probably standard stuff.
But here is its application to flashcard programs: these lists of
dependencies would be *flashcards*, and, let's say, color coding could
indicate whether they satisfy some criteria. For example, once you
know (according to the criteria you set, like interval achieved) all
of the characters in a word, that word's node would change color, and
optionally, you could have it automatically add this word-card to your
schedule. Furthermore, you could have it automatically suspend the
cards for its dependencies that you've already learned. (I repeat: it
could be like a filesystem browser but instead of writing filenames
into a list, it writes flashcards into and out of your collection
according to criteria you've set.)
*That* would automate the "continually fold(ing) downward into fewer
more complex cards" of which I spoke above.
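A rough sketch of how that folding rule might look in code. Nothing here is Mnemosyne's API: the Node class, its field names, and the 30-day threshold are hypothetical, and "learned" is reduced to "interval has reached the threshold".

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    text: str
    interval_days: int = 0     # current review interval of this node's card
    active: bool = False       # is the card in the review schedule?
    children: List["Node"] = field(default_factory=list)

    def learned(self, threshold):
        return self.interval_days >= threshold

    def ready(self, threshold):
        """True when every dependency has reached the interval threshold."""
        return bool(self.children) and all(c.learned(threshold) for c in self.children)

def fold_upward(node, threshold=30):
    """Activate composite cards whose parts are learned; suspend those parts."""
    for child in node.children:
        fold_upward(child, threshold)
    if node.ready(threshold) and not node.active:
        node.active = True          # the word/sentence card enters the schedule
        for child in node.children:
            child.active = False    # its already-learned parts are suspended

wo = Node("我", interval_days=120, active=True)
ren = Node("人", interval_days=90, active=True)
sentence = Node("我是人", children=[wo, ren])
fold_upward(sentence)
print(sentence.active, wo.active, ren.active)  # True False False
```

The `ready` check is what would drive the color change in the browser; `fold_upward` is the optional automation on top of it.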

What I'm describing is a hierarchical recordkeeping system whose
skeleton probably already exists somewhere. But if its contents are a
lexical parsing, then making its records be *flashcards* suggests
itself.

querido

Jul 30, 2009, 5:57:22 PM

to mnemosyne-proj-users

I wasn't arguing that a flashcard system becomes less useful!
This was my claim (as I said without proof):

"maybe the single-fact, atomized-data style flashcard system becomes
non-ideal again at long intervals"

I was arguing that atomic data, "minimum-information" style cards
should be gradually replaced in prominence (including by pruning
them), in favor of cards that are more composite, more language like,
even to the point of scheduling "assignments", as soon as some
specified justification has been earned. This would correspond to the
gradual awakening of real language. It would also recognize something
not usually mentioned in the flashcard-system world, that (in the case
of languages) the body of material being learned can be viewed as
having a tree-structure, and that therefore something might be gained
by organizing the cards accordingly.

I see wǒ 我 and rén 人 a hundred times every day! I feel safe assuming
that as long as hundreds of sentence-cards (with audio by the way) are
present which contain these characters, I may prune them. (The Author,
above, used this word; I wasn't even considering it.) The task then is
to develop some reliable rule to govern this. Then, optionally
*automating* the action of the rule led me to the ideas I wanted to
post.

Oisin Mac Fhearai

Jul 30, 2009, 7:44:07 PM

to mnemosyne-...@googlegroups.com

On 30 Jul 2009, at 22:57, querido <tworoads...@gmail.com> wrote:

>
>
> I see wǒ 我 and rén 人 a hundred times every day! I feel safe assuming
> that as long as hundreds of sentence-cards (with audio by the way) are
> present which contain these characters, I may prune them. (The Author,
> above, used this word; I wasn't even considering it.) The task then is
> to develop some reliable rule to govern this. Then, optionally
> *automating* the action of the rule led me to the ideas I wanted to
> post.

In these cases, you would rate the card with the maximum score
repeatedly such that the intervals increase rapidly - this is almost
equivalent to automatically pruning them (e.g. in my deck, words like
人 and 是 were assigned intervals of a few years very quickly - they
are, for all practical purposes, pruned, by virtue of the rapidly
increasing long intervals and short answer time).

However the idea of zooming out to higher knowledge abstraction levels
as the 'atomic' cards become well-learned is very interesting,
although it certainly seems like a difficult and topic-specific task
that is beyond the scope of current SRS systems which require little
information from the user and usually ignore the card contents
completely.

Oisín

Bill Price

Jul 31, 2009, 9:53:42 AM

to mnemosyne-proj-users

I'm late to the party here, so this message isn't very relevant to the
current thread of discussion.

I just wanted to chime in to say that I support the idea of warning
the user once they reach a certain threshold of new cards learned in a
given day.

I long had a habit of trying to clear out ALL cards presented to me by
the GUI, whether they be scheduled repetitions or else new cards which
I had not yet memorized. In other words, if I had 50 cards scheduled
and then added a new set of 120 cards, I would force myself through to
do all 170 cards in a marathon session.

I crashed out recently. In May and June, I steadily fell behind on my
repetitions, lapsed on many cards, then went on vacation for a week
without bringing my laptop. I came back to a queue of around 500
scheduled items--many of which (probably about 15-20%) turned out to
have lapsed in my memory--on top of an already-large pile of lapsed
cards sitting in limbo which I had never taken the time to re-learn.

It's taken me about a month to pull things back under control. It was
only about a week ago that I finally re-memorized the last of my
lapsed cards and got everything back into regular repetition,
shrinking my daily queue to generally fewer than 40 cards. Until my
queue calms down more, I've almost completely stopped adding new cards
to my deck.

I realize now that my pace of learning was not sustainable, but that
because most days' review and learning sessions only took 20-30
minutes, I had no instinctive feeling that I was overreaching. And I
guess that's what this software is all about, anyway: using technology
to help us compensate for our poor intuitions about memorization and
learning. So implementing a gentle warning system to discourage users
from overextending themselves would be very worthwhile, I think.
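Something along these lines is what I have in mind; it is only a sketch, and the class name, the default limit of 20, and the wording of the message are all made up rather than anything Mnemosyne provides.

```python
from datetime import date

class NewCardMonitor:
    """Counts cards memorised for the first time each day; warns once at a limit."""

    def __init__(self, daily_limit=20):
        self.daily_limit = daily_limit  # hypothetical default, not a Mnemosyne setting
        self.day = None
        self.count = 0

    def record_new_card(self, today=None):
        """Register one newly memorised card; return a warning string or None."""
        today = today or date.today()
        if today != self.day:           # first new card of a new day: reset the count
            self.day, self.count = today, 0
        self.count += 1
        if self.count == self.daily_limit:
            return (f"You have learned {self.daily_limit} new cards today; "
                    "every new card adds to your future review load.")
        return None

monitor = NewCardMonitor(daily_limit=3)
for _ in range(4):
    print(monitor.record_new_card(date(2009, 7, 31)))
```

The warning fires exactly once per day, when the limit is reached, so it stays gentle instead of nagging on every card afterwards.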

--Bill

On Jul 30, 7:44 pm, Oisin Mac Fhearai <denpasho...@gmail.com> wrote:

querido

Aug 2, 2009, 6:46:54 PM

to mnemosyne-proj-users

In this following paragraph (1), I wrap up my discussion of a problem
fundamental to the flashcard-program world that is usually ignored,
and I prescribe a solution.

1. I want to clarify that when I said "I see wǒ 我 and rén 人 a hundred
times every day" I was referring also to my studies outside of
mnemosyne. Much of my last few posts is partly what I've come up with
after thinking about how this outside exposure should be or could be
accounted for in a flashcard program. Briefly: On the one hand, cards
are promoted without the justification the algorithm assumes. (Seeing
我 every day undercuts the _meaning_ of the interval and the increment
on a card that the algorithm thinks I haven't seen for 120 days! It
has made a false assumption!) But on the other hand, not promoting
can't be right either. My solution lies in ensuring that 我 is
represented on more-composite cards that are still being mastered, and
that are less likely to be seen literally every day, while the simple
card for 我 itself becomes a "don't care". Here is a perfect
illustration: I don't think I've opened volume 3 in my textbook series
in weeks. Now, if volume 3 is represented by nothing more than its
atoms, like 我, then all I have, within this flashcard system, to show
that I've mastered volume 3, are the intervals on these cards,
intervals that I argued above are unjustified, with respect to volume
3, *because I've probably been seeing these items daily in all
subsequent volumes*. But, if the card for 我 is pruned or is a "don't
care", while 我 is still represented on a card that shows, let's say,
the entire lesson in which 我 first appears, then mnemosyne *opens
volume 3 for me, to a lesson I haven't seen in weeks, on the schedule
I command*, I prove that I still know that whole lesson, and 我 is
validly along for the ride: my zealously guarded hoard of characters
is "all present and accounted for".

2. If there is a linkage such that promoting for example this card,
"我不是中国人.", also auto-promotes the simple card for 我, then yes as you
said "this is almost equivalent to automatically pruning them". Yes, it
would be unnecessary to actually remove it... very convenient.

3.

> However the idea of zooming out to higher knowledge abstraction levels
> as the 'atomic' cards become well-learned is very interesting,
> although it certainly seems like a difficult and topic-specific task
> that is beyond the scope of current SRS systems which require little
> information from the user and usually ignore the card contents
> completely.

The *automatic* parsing of a text? Yes, it would be topic specific. I
was exploring the extreme...

But it wouldn't be necessary that the flashcard program know the
subject on the cards; assuming that the subject can benefit from a
hierarchical breakdown (most subjects) what matters is only the
relationship between the cards, which would be reflected in their
locations in a tree. I *think* the program author said that v2 will
have hierarchical categories. I *think* this would serve as the tree.
[With one caveat: I think? it isn't technically exactly a tree if the
leaf nodes point back to more than one parent. Here's what I mean:
under a node labeled say "volume 3" (broken down optionally into)
"lesson 2" (broken down optionally into) line 5, might be "我不是中国人."
with one leaf being 我. While another line in another volume will also
have 我 as a leaf. Therefore, 我 is more than a simple leaf, though I
forget the terminology. It would be necessary that once 我 is declared
pruned (or is being auto-incremented) in one place, it should prune or
auto-increment itself everywhere, indicated maybe by changing color in
the category browser. I'm not an expert, but I believe the software
for managing relationships like this is routine, so the code is
probably in a library somewhere. At a higher level, this is just
database management, right?]
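The linkage in point 2 and the shared-leaf caveat in point 3 can be sketched together. This is an illustration only: the Card class, the doubling factor, and the second example sentence are assumptions, and the interval arithmetic is not Mnemosyne's scheduling algorithm. The point is that 我 is a single object referenced from several parents, so a change made through one parent is visible everywhere.

```python
class Card:
    def __init__(self, text, parts=()):
        self.text = text
        self.parts = list(parts)   # component cards; the same object may be shared
        self.interval_days = 1

    def promote(self, factor=2.0, _seen=None):
        """Lengthen this card's interval and, recursively, those of its parts."""
        seen = _seen if _seen is not None else set()
        if id(self) in seen:       # a part reachable by two paths is promoted once
            return
        seen.add(id(self))
        self.interval_days = round(self.interval_days * factor)
        for part in self.parts:
            part.promote(factor, seen)

wo = Card("我")
line_a = Card("我不是中国人.", [wo])
line_b = Card("我是学生.", [wo])      # hypothetical line from another volume
line_a.promote()
line_b.promote()
print(wo.interval_days)  # 4: promoted through both parents
```

Because the leaf is shared rather than copied, there is nothing extra to keep in sync; the `_seen` set only guards against promoting it twice when one promotion reaches it by two routes.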

In conclusion, let me bring this back to a practical level.
Most of my last few posts are arguing about issues that fall somewhere
between a flashcard program, and a lesson plan or study strategy
(divide and conquer while naturally observing dependencies...doesn't
this happen in just about every sphere?). The most important part was
my argument for increasing-complexity as warranted (such as it is for
me in Chinese, now). So far, this doesn't necessarily involve the
flashcard program, but I offered some non-expert ideas on how it could
be supported, if anyone accepts the principle.
The best news is that v2 (assuming I'm remembering right that it will
permit hierarchical categories) will already have a structure that
will support managing these matters *manually*, and incidentally
scrolling windows too. Perfect! I'm doing it manually now, following
my common sense with regard to how to study big subjects where
simplicity is folded into complexity. It won't surprise you that my
original interest in trees followed from my obsession with chess and
then chess programming. Math, chess, language, same thing: "One (tree)
to rule them all".

On Jul 30, 7:44 pm, Oisin Mac Fhearai <denpasho...@gmail.com> wrote:

Peter Bienstman

Aug 3, 2009, 2:57:03 AM

to mnemosyne-...@googlegroups.com

On Monday 03 August 2009 12:46:54 am querido wrote:
> I *think* the program author said that v2 will
> have hierarchical categories. I *think* this would serve as the tree.
> [With one caveat: I think? it isn't technically exactly a tree if the
> leaf nodes point back to more than one parent.

2.0 will support hierarchical categories (named tags). A card can be tagged
with more than 1 tag as well.

Peter

Gwern Branwen

Aug 5, 2009, 4:07:01 AM

to mnemosyne-...@googlegroups.com

On Mon, Jul 27, 2009 at 5:55 AM, Gwern Branwen<gwe...@gmail.com> wrote:
...

> But I couldn't figure out the best way to marry it with SRS. I figured
> that one viable approach might be to take a corpus, take a set of
> foreign vocab which it is mandatory for the user to have, and then
> generate the 'minimal' learning path. That is, it'd create thousands
> of cards, each covering the next most rare word, and the user could
> just work his way through them linearly.
>
> (Actually, maybe this approach isn't as bad as I thought. I've been
> generating large numbers of cards for memorizing poems, and it hasn't
> worked out too bad as long as I didn't use the randomization plugin.
> Hm. I should look into whether the guy's software could be repurposed
> for this. A static set of cards could work well: imagine such a
> generated card deck for someone learning French: she can choose from
> one targeted at _In Search of Lost Time_, or she could pick a deck
> targeted at Rene Descartes if her interests inclined that way.)
>
> * There's some extra stuff about translating parts of sentences into
> English to focus on a particular word, but I think this is extra - a
> hack to get around the fact that a 'small' corpus like the New
> Testament isn't often going to give you sentences which have *only*
> one unknown word. By translating, you can take a sentence with
> multiple unknown words and translate it into a sentence with only one
> unknown word.

So, to update. I discovered that the problem is indeed NP-hard or even
EXP-hard when I finished up the program and found that runtime on a corpus
of a few hundred words was going to be on the order of weeks; I
switched to a heuristic which is kind of frequency-based, and seems to
give reasonable results.

The results look kind of like this: given a random hardwired list of
English words, if one feeds in the text of Frank Herbert's _Dune Messiah_, one
gets this:

[02:15 AM] 0Mb$ cat /home/gwern/doc/herbert/fh-dune-messiah.txt | ./hcorpus 20
he
said
paul
his
she
her
not
him
had
for
at
alia
no
from
what
asked
they
there
have
stilgar

(Paul, Alia, and Stilgar are major characters, frequently mentioned.)

Eyeballing, these look like reasonable words to know. If we 'learn'
these 20 words (=putting them in the hardwired known list), then our
next batch of 20 words looks like

[02:18 AM] 0Mb$ cat /home/gwern/doc/herbert/fh-dune-messiah.txt | ./hcorpus 20
scytale
thought
know
we
do
could
will
your
must
chani
one
now
by
irulan
eyes
ghola
fremen
then
out
them

(Irulan, Chani, Scytale are minor characters; Fremen & ghola are
_Dune_ neologisms.)

These too look plausible.
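For anyone who wants to play with the idea, a purely frequency-based stand-in can be sketched in a few lines. To be clear, this is not the hcorpus code: the function name and tokenizer are assumptions, and it ranks by raw frequency only, whereas the heuristic above is only "kind of" frequency-based.

```python
import re
from collections import Counter

def next_words(text, known, n=20):
    """Return the n most frequent words in the text that are not yet known."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in known)
    return [word for word, _ in counts.most_common(n)]

sample = "he said he said he paul said he"
print(next_words(sample, set(), 2))    # ['he', 'said']
print(next_words(sample, {"he"}, 2))   # ['said', 'paul']
```

"Learning" a batch then just means adding the returned words to `known` and calling the function again.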

On the TODO list is
- read known words from a file
- after printing out the top nth word, print out sentences which are
now translatable by it
- efficiency hacks; the top 20 words on a single book takes ~6s, but
if we run on Frank Herbert's entire corpus (only 10x the size), memory
use blows up and I dunno how long it takes, which is obviously
unacceptable