Mirage keeps track of which tracks were scanned, and stores this info in
the banshee DB. When you click 'Rescan the music collection', it only
scans the tracks that are not already scanned. The percentage shown is
relative to the number of tracks to be scanned, not to all tracks.
It might be a good idea to show the progress as the global percentage of
tracks scanned in the library.
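To make the difference between the two percentages concrete, here is a minimal sketch in Python. The function name and its parameters are hypothetical and are not taken from Mirage's code; it only illustrates the arithmetic.

```python
def scan_progress(total_tracks, already_scanned, scanned_this_run):
    """Return (relative, overall) progress as percentages.

    relative: share of the tracks that still had to be scanned in this run.
    overall:  share of the whole library that has been scanned so far.
    """
    to_scan = total_tracks - already_scanned
    relative = 100.0 * scanned_this_run / to_scan if to_scan else 100.0
    overall = (100.0 * (already_scanned + scanned_this_run) / total_tracks
               if total_tracks else 100.0)
    return relative, overall


# Hypothetical numbers: 2500 tracks, 2000 scanned earlier, 100 scanned now.
relative, overall = scan_progress(2500, 2000, 100)
print(f"{relative:.0f}% of this run, {overall:.0f}% of the library")
```

With these made-up numbers the run itself is only a fifth done, while most of the library is already covered, which is why the current display can look discouraging.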
The "Reset..." menu item makes Mirage forget which tracks were scanned.
Resuming a scan would be possible, but we need to be careful: if
scanning a track crashes banshee (it shouldn't happen, but you never
know), we don't want to be stuck in a "crash-restart-crash" loop that
makes banshee unusable.
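One possible way to avoid such a loop is to write down which track is being analyzed before starting on it, and to skip that track on the next start if the marker is still there. The sketch below is only an illustration in Python: the class name, the JSON state file and its layout are all assumptions, not something Mirage does today.

```python
import json
import os
import tempfile


class ResumeGuard:
    """Remember which track was being analyzed so a crash is not repeated."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.state = {"in_progress": None, "skip": []}
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.state = json.load(f)
        # A leftover marker means the previous run died on that track.
        if self.state["in_progress"] is not None:
            if self.state["in_progress"] not in self.state["skip"]:
                self.state["skip"].append(self.state["in_progress"])
            self.state["in_progress"] = None
            self._save()

    def _save(self):
        with open(self.state_path, "w") as f:
            json.dump(self.state, f)

    def should_skip(self, track_id):
        return track_id in self.state["skip"]

    def begin(self, track_id):
        # Persist the marker before the risky work starts.
        self.state["in_progress"] = track_id
        self._save()

    def finish(self):
        # Clear the marker once the track was analyzed without crashing.
        self.state["in_progress"] = None
        self._save()


# Demo: simulate a crash while analyzing (hypothetical) track 42.
state_file = os.path.join(tempfile.mkdtemp(), "resume_state.json")
guard = ResumeGuard(state_file)
guard.begin(42)  # the process dies here, finish() is never called
restarted = ResumeGuard(state_file)
print(restarted.should_skip(42))
```

The cost is one small write per track, which should be negligible next to the analysis itself.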
Thank you for your input !
--
Bertrand Lorentz <bertrand...@gmail.com>
> http://flickr.com/photos/bl8/ <
To butt in here, possibly inappropriately: I have written a python
plugin (originally for Quod Libet, now cross player) that does
similarity lookups on last.fm, and can use tags to look up similar
songs. I would love to add acoustic similarity lookups as well, and I've
been looking at mirage for this.
Where this touches on your question: the plugin currently has the smarts
to not play the same song or tracks by the same artists for a
configurable duration, and subject to further restrictions or
relaxations, depending on what kind of advanced queries the player supports.
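As a rough illustration of the "not the same artist for a configurable duration" rule, here is a minimal Python sketch. It is not the actual autoqueue code; the class and method names are made up for the example.

```python
from datetime import datetime, timedelta


class RecentArtistFilter:
    """Reject artists that were played less than `block_for` ago."""

    def __init__(self, block_for=timedelta(hours=1)):
        self.block_for = block_for
        self.last_played = {}  # artist name -> time of last play

    def mark_played(self, artist, when):
        self.last_played[artist] = when

    def allowed(self, artist, now):
        last = self.last_played.get(artist)
        return last is None or now - last >= self.block_for


# Demo with made-up artist names and times.
artist_filter = RecentArtistFilter(block_for=timedelta(minutes=30))
start = datetime(2008, 9, 1, 12, 0)
artist_filter.mark_played("Artist A", start)
print(artist_filter.allowed("Artist A", start + timedelta(minutes=10)))
print(artist_filter.allowed("Artist B", start + timedelta(minutes=10)))
```

A candidate list from any similarity source (last.fm, tags, or acoustic distance) could be passed through such a filter before queueing.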
If you guys think it would be interesting, I would be happy to port my
plugin to banshee (Is there a python API? I might need some help
otherwise ;) and I would very much like to add the option of acoustic
similarity, so I'm wondering: how deeply is mirage tied to Banshee? I
understand that the similarity scores are stored in the banshee
database, but I would be happy to invest some time to make that
configurable, so that other players (and my plugin ;) could take
advantage of this great feature. I don't know if that is feasible, but I
see no reason in theory that it wouldn't be.
Anyhoo, the plugin's located here:
http://code.google.com/p/autoqueue/
Currently supported are Quod Libet (fully featured, fully configurable)
and Rhythmbox (some features missing, default configuration, but fully
working), and there are plans for supporting PyTone, iTunes, mpd and xmms.
Sorry if this is a bit OT but I'm looking to exchange ideas, and
hopefully even make our stuff work together without reinventing too many
wheels.
--
- eric casteleijn
http://thisfred.blogspot.com
Yes, I think we have a terminology confusion here. I think "scan" should
be used for "find which tracks should be analyzed", and "analyze" should
be used for "do the math that determines music similarity".
I'm also guilty of this confusion.
> As for the auto-resume... maybe, when exiting cleanly, mirage could
> flip some "exited cleanly" variable,
> and then when starting could check against it....
> By the way, if you make an auto resume feature, you could also think
> about making a low-priority scan setting, or is that impossible?
The thread that does the analysis is already set to the lowest priority
(ThreadPriority.Lowest).
> It's just that while scanning my 2500+ song collection, my computer
> becomes very unresponsive, especially on
> long songs (X JAPAN), or Led Zeppelin. On those 2, it completely
> freezes for about 5-10 minutes, and then continues on with the next
> song.
I hope you meant 5 or 10 seconds, because 5 or 10 minutes is way too
long to analyze a track. It usually takes less than 10 seconds for a
"normal" track.
Could you check your CPU usage while mirage is analyzing those tracks?
> But once you get past the scanning, the playlist generation is very
> nice, though, as was said in the discussion below, a bit confusing at
> first.
> BTW, would there be any way to check for duplicate songs? (Same songs
> from different albums)
This has already been suggested. I think it's a good idea, I'll try to
look into it.
By the way, for those who don't watch the website, I'd like to mention
that you can report bugs or suggest features on the Issue tracker :
http://code.google.com/p/banshee-unofficial-plugins/issues/list
Seems interesting. Sadly, python syntax confuses me ;)
> If you guys think it would be interesting, I would be happy to port my
> plugin to banshee (Is there a python API? I might need some help
> otherwise ;) and I would very much like to add the option of acoustic
> similarity, so I'm wondering: how deeply is mirage tied to Banshee? I
> understand that the similarity scores are stored in the banshee
> database, but I would be happy to invest some time to make that
> configurable, so that other players (and my plugin ;) could take
> advantage of this great feature. I don't know if that is feasible, but I
> see no reason in theory that it wouldn't be.
Banshee doesn't have a python API. Banshee extensions can be implemented
in .NET/Mono. Banshee also provides a dbus interface, and I think there
are plans to offer access to the banshee DB through dbus, but this would
be better discussed on the banshee mailing-list or on #banshee.
Only a part of Mirage is tied to banshee, see this thread for more
info :
http://groups.google.com/group/mirage-list/browse_thread/thread/43405ec6b315ef15/2c70d893f5c5f20d
The similarity data is stored in a separate sqlite database, along with
the TrackId of the track, to be able to associate it to the track in the
banshee DB.
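To illustrate how a shared TrackId can link the two databases, here is a small Python sketch using in-memory sqlite databases. The table and column names are invented for the example and are not the real Mirage or banshee schemas.

```python
import sqlite3

# Stand-in for the player's library database (hypothetical schema).
library = sqlite3.connect(":memory:")
library.execute(
    "CREATE TABLE tracks (track_id INTEGER PRIMARY KEY, title TEXT)")
library.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [(1, "Track A"), (2, "Track B")])

# Stand-in for the separate similarity database (hypothetical schema).
similarity = sqlite3.connect(":memory:")
similarity.execute(
    "CREATE TABLE analysis (track_id INTEGER PRIMARY KEY, scms BLOB)")
similarity.execute(
    "INSERT INTO analysis VALUES (?, ?)", (1, b"\x00\x01"))


def analyzed_titles(similarity_db, library_db):
    """Return the titles of all tracks that have similarity data."""
    ids = [row[0] for row in similarity_db.execute(
        "SELECT track_id FROM analysis ORDER BY track_id")]
    titles = []
    for track_id in ids:
        row = library_db.execute(
            "SELECT title FROM tracks WHERE track_id = ?",
            (track_id,)).fetchone()
        if row is not None:
            titles.append(row[0])
    return titles


print(analyzed_titles(similarity, library))
```

Because the only link is the track id, another player could in principle keep its own library database and reuse the similarity side unchanged.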
So I don't see anything that could prevent you from doing what you want.
Have fun,
Cool, that is very exciting news. I will notify this list as far as it
is relevant to mirage, and might ask some stupid questions here in the
process. ;)
We already had reports of high memory usage during the analysis and also
during playlist generation, so that might indeed be the problem.
I haven't tracked it down yet, so any help with that is most welcome !
> Now, I don't know how banshee stores its track data, but if you know
> the trackid, shouldn't you be able to find out the song name?
> And if you could, you could make Mirage let you set how often you want
> Tracks, Albums, Artists to be repeated... How diverse should the
> generated playlist be... and so forth.
> It would be nice to be able to configure some of Mirage's playlist-
> generating. =)
With the TrackId, you can get all the info banshee has about the track.
So what you're suggesting is definitely possible. But I wouldn't want
mirage to have too many configuration options.
"Patches Welcome !" ;)
Cheers,
Yes, the other parts are for the audio analysis and similarity
calculation.
> Oh, and another question that appeared recently:
> A few days ago, I managed to kill my Ubuntu install (i'm getting good
> at it :D ),
> and had to re-install. After doing that, I happily added all my
> favorite repos,
> and had about 1Gig of packages download and install. Banshee & Mirage
> were among them.
> But when I opened Banshee, it DIDN'T see the Mirage extension... And
> no matter how I fiddled with it,
> it just refused to see it. I currently have no idea on what happened.
> If it helps (don't know how), Banshee's list of Radio stations was
> also empty, with some gibberish styles listed... Everything else works
> perfectly. Any Ideas?
Did you do a clean install, or did you keep the contents of your home
directory, ~/.config/banshee-1 in particular?
It might be a packaging issue. I'll try to get the packager to look into
it.
Really? Must be because it's so close to C#, I guess. I found I could
read C# quite easily, and had to look up very little syntax when going
through mirage. In fact, I got quite a ways toward reimplementing the C#
parts in python over the weekend, mainly because I didn't see a good way
to interface python code with C# without going to something like
IronPython, but also just to get a better idea of what the code does.
Using ctypes for interfacing with libmirageaudio, and numpy for the
matrix calculations, I expect the execution speed not to be that much
worse, but that's for later.
Should I get a working version, would you be at all interested in having
it in mirage itself? I understand, of course, if it's not your first
concern, but it would make it easier to integrate mirage into players
that have a python plugin API.
If you get a working version, I think we'll have to spin off libmirageaudio
as a separate package. Both your python stuff and the mirage C# stuff
would then depend on it.
I don't know if libmirageaudio is ready for other uses than the one we
currently have, but feel free to try !
And keep us posted on your progress. Good luck !
After banging my head against it some more, I have an initial working
version. It takes about 8 seconds per song for the analysis, which on my
machine seems at least in the same neighborhood as the banshee plugin
operates.
The attached code is not polished at all, and the database layer and
playlist generation haven't been tested/implemented.
What works is everything up to the distance calculation, which I think
was the hardest part. Herein too lies the rub: I'm not really sure that
I got all the algorithms correct, because I had to get rid of some
nested loops that made things way too slow in python (8 minutes per
track instead of 8 seconds). Luckily the numpy/scipy extension does
almost everything you could want to do with a matrix/array, and it does
it at C speed.
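As an illustration of that kind of rewrite, the sketch below computes a covariance matrix once with nested Python loops and once with numpy array operations. It is a generic example, not the code from the port, and the function names are made up.

```python
import numpy as np


def covariance_loops(m):
    """Covariance of the rows of m, written with explicit nested loops."""
    rows, cols = m.shape
    means = [sum(m[i, k] for k in range(cols)) / cols for i in range(rows)]
    cov = np.zeros((rows, rows))
    for i in range(rows):
        for j in range(rows):
            total = 0.0
            for k in range(cols):
                total += (m[i, k] - means[i]) * (m[j, k] - means[j])
            cov[i, j] = total / (cols - 1)
    return cov


def covariance_numpy(m):
    """Same result, with the loops replaced by array operations."""
    centered = m - m.mean(axis=1, keepdims=True)
    return centered @ centered.T / (m.shape[1] - 1)


# Random stand-in for a small feature matrix (4 coefficients, 50 frames).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 50))
print(np.allclose(covariance_loops(features), covariance_numpy(features)))
```

Keeping the slow loop version around as a reference makes it possible to check each vectorized rewrite against it on small inputs.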
I don't think everything is as it should be, though: the distance
between the scms of a song and the scms of that same song is never 0,
as I would expect it to be, and it can even be negative. Other distances
have also shown negative values.
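For reference, if the scms distance is a symmetrized Kullback-Leibler divergence between two Gaussian models (an assumption here, not verified against the Mirage source), it must be 0 for identical models and can never be negative. A minimal numpy sketch of that textbook formula, with made-up names and data:

```python
import numpy as np


def symmetric_kl(mean_p, cov_p, mean_q, cov_q):
    """KL(p||q) + KL(q||p) for two multivariate Gaussians.

    The log-determinant terms of the two directions cancel, leaving
    only trace terms and a term for the difference of the means.
    """
    d = mean_p.shape[0]
    inv_p = np.linalg.inv(cov_p)
    inv_q = np.linalg.inv(cov_q)
    diff = mean_p - mean_q
    trace_term = np.trace(inv_q @ cov_p) + np.trace(inv_p @ cov_q) - 2 * d
    mean_term = diff @ (inv_p + inv_q) @ diff
    return 0.5 * (trace_term + mean_term)


# Two small made-up models (mean vector and covariance matrix).
mean_a = np.array([0.0, 1.0])
cov_a = np.array([[2.0, 0.3], [0.3, 1.0]])
mean_b = np.array([1.0, -1.0])
cov_b = np.array([[1.0, 0.0], [0.0, 3.0]])

print(symmetric_kl(mean_a, cov_a, mean_a, cov_a))
print(symmetric_kl(mean_a, cov_a, mean_b, cov_b))
```

A self-distance that is clearly non-zero, or any negative value, would then point to a bug (an indexing slip, a wrong transpose, or a covariance matrix that is not positive definite) rather than to a property of the music.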
On an insignificantly small test set, the values do seem to make some
sense though[*], but I want to make sure that that's not coincidence.
My question is: do any of the original developers have test data that I
can use to build some unit tests? Perhaps a little script with some very
small matrices that exercises the code, or a test mp3 and some of the
resulting data from that (if possible not just the end result, but also
the intermediate matrices generated in Matrix.multiply,
CovarianceMatrix, Vector, Mfcc etc.). I will gladly (try to) contribute
back unit tests in C# if you can help me out with some test data, so
that I know my script is doing the right thing. I am not very well
versed in matrix and vector mathematics, so I'm sure some of the things
I changed are just plain wrong, and also, with unit tests, I can try
some more optimization tricks without worrying about breaking 'correctness'.
[*]: I took 5 mp3s and oggs:
1. felix - don't you want me baby
2. joni mitchell - cactus tree
3. james taylor - hey mister that's me upon the juke box
4. john larner & slater_hogan - gettin' ready
5. ricardo rae - lead the way
where 2 and 3 are 70s folk, 1 is 90s house, and 4 and 5 are consecutive
tracks of a fabric mix cd. As expected, 4 and 5 had the smallest
distance (they are mixed together, so acoustically *very* close)
followed by 2 and 3. 1 is not really like any of the others, which was
reflected.
With the filter files from svn and the windowsize and sampling rate
doubled, things got a tiny bit slower, but the same relations more or
less showed up.
On 09/22/2008 11:34 PM eric casteleijn wrote:
> After banging my head against it some more, I have an initial working
> version. It takes about 8 seconds per song for the analysis, which on my
> machine seems at least in the same neighborhood as the banshee plugin
> operates.
>
> The attached code is not polished at all, and the database layer and
> playlist generation haven't been tested/implemented.
>
> What works is everything up to the distance calculation, which I think
> was the hardest part. Herein too lies the rub: I'm not really sure that
> I got all the algorithms correct, because I had to get rid of some
> nested loops that made things way too slow in python (8 minutes per
> track instead of 8 seconds). Luckily the numpy/scipy extension does
> almost everything you could want to do with a matrix/array, and it does
> it at c speed.
>
> I think that everything is not as it should be though: the distance
> between the scms of a song and the scms of the same song is never 0,
> which I would expect, and can even be negative. Other distances have
> also shown negative values.
>
> On an insignificantly small test set, the values do seem to make some
> sense though[*], but I want to make sure that that's not coincidence.
Finally got some time to work on this again. Turns out I *was* doing it
wrong: there was a stupid off-by-one error somewhere in my code. I've
fixed that, and now I'm getting the same values as the C# version does.
Yay! On to a database layer! (I think I will write mine to work directly
with the sqlite database that my plugin already has, but I will try to
keep it general enough so that people could use the code without my plugin.)
Anyone that is interested, the code lives here:
http://code.google.com/p/autoqueue/source/browse/trunk
It's in mirage.py and test_mirage.py. That last file contains some
budding unittests. I've tried to look into porting those to C# too, but
that ended up looking like more work than I was willing to spend right
now, without knowing if anyone is even interested. (Mainly because I
couldn't get the C# unit testing framework to run on my machine.) The
python unit tests could be adapted, and I would convert them if someone
else is interested, and has some time to test them.
I have copied the /res directory verbatim, and added Dominik and
Bertrand to the copyright files. (I think I'll leave out your email
addresses to prevent questions about my software ending up on your desk,
but let me know if you wish to have them in.) My project is GPL as well,
so I think that should be sufficient, but please let me know if you have
additional wishes/requirements.