Interesting. From the brief write-up, my guess is that it pegs the
melakarta (hence 72) association of any tune based on note
recognition. I doubt that it can do finer raaga recognition. I am not
sure whether/how it does accurate Sa placement. Also, what does it do
with a piece of music that does not fall within the bounds of the 72
raagas it can recognize? But even if all it can do is correctly
identify the notes in a random piece of music, that is quite
commendable.
On a related note, Dr. Sahasrabuddhe (husband of Veenatai, I am
forgetting his first name) was working on some algorithms to
"generate" raaga music using statistical methods and had even
published some papers on this. I had at one time toyed around with an
information theoretic characterization of raagas, but gave up without
it going anywhere.
C
Thanks for your informed response. Could you also tell me where I can
find the papers by Dr. Sahasrabuddhe?
Aside: As you (possibly) already know, I am also trying (along with my
developer friends) to devise a similar algorithm, and from the huge
amount of observational data we have to date, it seems that it is
possible to do something like that. At least a clear mathematical
formulation, though it may not be an analytical function, is not far
beyond reach.
Partha
Dr Hari Sahasrabuddhe
http://www.it.iitb.ac.in/~hvs/
Warm regards,
Abhay
I'm sure you must have had a look at this already, but in case it was
missed:
Dr. Parag Chordia from the Georgia Institute of Technology is working
on raga classification using pitch class distributions. The details
can be found at http://paragchordia.com/research/raag.html
If I'm not wrong, he is a member of this list.
Best regards,
Pranav
http://pranavsbrain.peshwe.com
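As an aside, the pitch-class-distribution idea mentioned above is easy to sketch. The following is a minimal illustration (not Dr. Chordia's actual code), assuming a pitch track has already been extracted and expressed in cents relative to a known Sa; the distribution then serves as a fingerprint to compare against per-raag templates.

```python
from collections import Counter

def pitch_class_distribution(cents, bins=12):
    """Normalised histogram of pitch classes from a pitch track given
    in cents relative to the tonic (Sa), folded into one octave."""
    counts = Counter(round(c / 100.0) % bins for c in cents)
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(bins)]
```

A classifier would compute this over a performance and pick the raag whose stored distribution is closest (e.g. by cross-entropy). Note that the distribution throws away note order, which is exactly why it cannot separate raags sharing the same scale.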
Abhay has pointed to a relevant site. (Thanks, Abhay)
> At least a clear mathematical formulation, though
> may not be an analytical function, is not far beyond reach.
I think this is a challenging problem. Pattern-based recognition
(similar to Dr. Chordia's approach) is more tractable. I will look out
for announcements of your progress on this forum. Good luck.
C
This is impressive! Kudos to Parag and team!
But:
Why is the tonic (Sa frequency) entered manually? Can't the system be
enhanced to identify the tonic and perform the recognition?
In the second example, a human recognizer could zero in on Darbari a
lot quicker based on the application of the komal gandhar. The
software does not attempt to recognize the flavor of the komal ga. It
simply assigns equal weightage to Darbari and Malkauns until it hears
Re, and then bingo! What would happen if you threw a Kaushi Kanada or
a Sampoorna Malkauns at it?
Based on the description, it is not clear whether it can distinguish
between two raags with the same notes (Bhoop-Deskar, Marwa-Puriya
etc.)
C
What sort of algorithm is it?
What is the input? An audio file, or some sort of notation?
Bhuvanesh
Neither. It's a virtual keyboard (the continuous fretless type) that
logs what you play on it and analyses it w.r.t. the given keynote Sa
to identify the characteristic phrases and suggest the name of the
mode -- exactly how the human process of cognition works. All in
dreams for now :)
An HMM (hidden Markov model) would be well suited to the problem. Do
you know what features or parameters form the model? How many
parameters are used? I think the signal processing required to derive
some of these parameters would be daunting. Another challenge would be
to provide it with a sufficient and appropriate training corpus, but
then that is the case with any training-oriented recognizer.
C
I guess the reason is that the same patterns would be produced by
apparently different modes. The same pattern (a phrase) is used by,
say, Malkauns and Bilawal, and one is obtainable from the other by
shifting the tonic from one position to another. So it is rather wise
to enter the tonic manually. However, that doesn't mean Gram/Kharaj
parivarttan, or a chromatic shift, would generate Malkauns from
Bilawal; I am talking about just one phrase here, a fraction of the
entire pattern. The characteristic jumps of the two ragas are
different only with respect to the given Sa; otherwise they are the
same. Both modes have characteristic phrases that contain a jump of
-9.
Err... I think it's time I published some of our observations.
Partha
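Partha's ambiguity point is easy to demonstrate: a phrase's jump pattern is unchanged by a tonic shift, so jumps alone cannot name the mode. A tiny sketch with hypothetical, illustrative pitch numbers (semitones; these are not claimed to be actual raga phrases):

```python
def jumps(pitches):
    """Successive jumps (in semitones) of a phrase. Note that this
    representation is independent of where the tonic sits."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

# Two hypothetical phrases with the same contour at different
# positions relative to Sa:
phrase_a = [0, 3, 5, 8]    # anchored on Sa
phrase_b = [4, 7, 9, 12]   # same contour, shifted by four semitones
```

Here `jumps(phrase_a)` and `jumps(phrase_b)` are both `[3, 2, 3]`, so a recognizer working from jumps alone cannot tell the two apart; only the positions relative to a given Sa disambiguate them, which is the argument for entering the tonic manually.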
Thank you, Sushama-ji. However, this presentation needs a speaker with
actual data. Could you explain the basic principle of how this model
handles the apparently unordered data? I would also like to know
whether an actual program has been developed from this; if so, I would
like to see a demo of it running. Is that possible?
Partha
Even after an appropriate set of recordings is selected, the problem
of pitch estimation remains. I don't yet know how to get rid of the
drone, which has harmonics and messes things up.
Bhuvanesh
Pitch estimation is indeed a difficult problem. Determination of Sa is
probably an even more difficult problem in terms of signal-processing
complexity (whether or not there is a tanpura playing in the
background). However, an HMM would use relative frequencies as
features. It is almost ironic that you have to extract the Sa so that
you can discard it!
C
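The "extract Sa only to discard it" irony amounts to a one-line normalisation: once absolute pitch is converted to cents relative to the tonic, the tonic's actual frequency drops out of all downstream features. A minimal sketch (hypothetical function, illustrating the point rather than any particular system):

```python
import math

def cents_relative_to_sa(f0_track_hz, sa_hz):
    """Map an absolute-pitch track (Hz) to cents relative to the tonic.
    This is the tonic-independent representation a recognizer would
    consume; after this conversion the absolute Sa frequency is no
    longer needed. Unvoiced frames (f0 <= 0) are dropped."""
    return [1200.0 * math.log2(f / sa_hz) for f in f0_track_hz if f > 0]
```

A performance at Sa = 240 Hz and the same performance transposed to Sa = 260 Hz yield identical cent tracks, which is precisely why the Sa must be found before it can be thrown away.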