Hey Jaganadh,
So "maximum likelihood estimate" just means "the estimate that
maximizes the likelihood of the data we actually observed". In effect
you believe your data completely, even though you might be (probably
are) overfitting. It also means you're not doing any smoothing of your
distribution, so anything you never observed gets probability zero.
So what do you want to calculate? If I understand your variable names
right, it looks like you're doing something sensible: the number of
times the bigram appears divided by the number of times the first word
of the bigram appears.
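
In case it helps to see that calculation spelled out, here is a minimal
sketch in plain Python. This is not your code, just an illustration:
the function name bigram_mle and the toy sentence are made up, and it
assumes your text is already split into a list of tokens.

```python
from collections import Counter

def bigram_mle(tokens):
    """Unsmoothed estimate: count(w1, w2) / count(w1 as first word)."""
    bigrams = list(zip(tokens, tokens[1:]))
    bigram_counts = Counter(bigrams)
    # Count each word only where it starts a bigram, so the estimates
    # for a given first word sum to 1.
    first_counts = Counter(w1 for w1, _ in bigrams)
    return {bg: c / first_counts[bg[0]] for bg, c in bigram_counts.items()}

# Hypothetical toy data, purely for illustration.
tokens = "the cat sat on the mat".split()
probs = bigram_mle(tokens)

print(probs[("the", "cat")])             # 0.5: "the" starts 2 bigrams, 1 is "the cat"
print(probs.get(("cat", "mat"), 0.0))    # 0.0: never observed, and no smoothing
```

Note that the denominator only counts the first word where it actually
starts a bigram. If you divide by the raw unigram count instead, the
last token of the text is counted once too often.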
--
-- alexr