Hi Harshi,
I think detect-sinusoids does a good job of detecting ringtones as well as DTMF tones. But, as Dan mentioned, its output needs a little post-processing and some heuristics.
I wrote a program that takes the output of detect-sinusoids and tries to detect frames that have ringtones. Please note that this has not been tested much and there are certainly better ways to do this. You should just use this as an example of how you could go about post-processing the output of detect-sinusoids.
Here's how you might do this:
1. Run detect-sinusoids on your wav files with a frame length of your choice. I found --frame-length=40 to work best.
2. Run detect-ringtones --min-sum=300 --max-high-var=5 --max-low-var=5 --window=4 ark:<output of detect-sinusoids> ark:rings.ark. Play around with the arguments until something works reasonably well.
The rings.ark file will contain a vector of per-frame labels, where 1 marks a frame with a ringtone and 0 marks anything else. You can use a program like select-voiced-frames to remove the frames with ringtones. Note that select-voiced-frames keeps the frames labeled 1 and removes the frames labeled 0, so you'll want to invert the labels first.
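If it helps, here is a rough sketch of the label inversion in Python. It assumes the labels have first been written out in Kaldi's text archive format, one "utt-id  [ 1 0 0 1 ]" line per utterance (for example with something like copy-vector ark:rings.ark ark,t:rings.txt). The function names and file names are made up for illustration, and I have not run this against real detect-ringtones output.

```python
def invert_labels(line):
    """Flip the 0/1 labels on one line of a text-format vector archive.

    Assumes the line looks like: "utt1  [ 1 0 0 1 ]".
    """
    utt_id, _, rest = line.strip().partition(" ")
    # Drop the brackets and split the remaining text into label strings.
    values = rest.replace("[", " ").replace("]", " ").split()
    # A ringtone frame (1) becomes 0 so that it is dropped later;
    # every other frame (0) becomes 1 so that it is kept.
    flipped = ["0" if float(v) >= 0.5 else "1" for v in values]
    return f"{utt_id}  [ {' '.join(flipped)} ]"


def invert_file(in_path, out_path):
    """Invert every utterance in a text archive (hypothetical file paths)."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if line.strip():
                fout.write(invert_labels(line) + "\n")
```

Calling invert_file("rings.txt", "keep.txt") would then give you a text archive in which 1 means "keep this frame", which is the convention select-voiced-frames expects.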
Best,
David