Some comments:
Short-time windowing is often interpreted in terms of its impact in the Fourier (frequency) domain; multiplying by a finite-length window in time is equivalent to convolving (i.e., blurring) the frequency response with the Fourier transform of the window - the magnitude responses shown in Omid's message.
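As a quick numerical check of that time/frequency duality, here is a minimal NumPy sketch. The test signal, the length N = 64, and the variable names are arbitrary choices for illustration, not anything from the thread. It verifies that the DFT of a windowed signal equals the circular convolution of the two DFTs, scaled by 1/N.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                   # arbitrary frame length for the demo
x = rng.standard_normal(N)               # stand-in for a waveform segment
w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N) / N)  # Hann (raised cosine)

X = np.fft.fft(x)
W = np.fft.fft(w)

# Left side: window in time, then transform.
lhs = np.fft.fft(x * w)

# Right side: circularly convolve the two spectra and scale by 1/N.
k = np.arange(N)
rhs = np.array([np.sum(X * W[(m - k) % N]) for m in range(N)]) / N

print("max abs difference:", np.max(np.abs(lhs - rhs)))
```

The difference should be at the level of floating-point rounding, which is the discrete version of "multiplying in time blurs the spectrum with the window's transform".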
This is consistent with a more intuitive motivation for tapered windows like the raised-cosine (Hann): smoothly tapering a waveform segment to zero at its edges avoids the discontinuity we'd otherwise likely get. Discontinuities in time correspond to spreading (blurring) spectral energy across the whole frequency range -- a/k/a "spectral splatter" -- which, if you listen to it, is heard as a click.
However, the smooth but finite-duration raised-cosine window does not have a monotonic magnitude spectrum; its broadened central "bump" (the main lobe) is surrounded by multiple secondary bumps - sidelobes - separated by frequencies where the magnitude is zero (notches). Sidelobes are a problem because they introduce local maxima in the windowed spectrum which are not centered on the frequency component that caused them.
Even the spectral splatter caused by a rectangular window (no tapering) has a sidelobe structure. The largest sidelobe, at around 3pi/L rad/samp away from the center (for an L-point window), is about 13 dB below the main lobe peak.
Hann-window tapering reduces this worst sidelobe peak to better than 31 dB below the main peak. This comes, however, at a cost - the main lobe itself is twice as wide (twice the blurring), so the first sidelobe is now around 5pi/L rad/samp.
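The rectangular and Hann figures quoted above can be checked numerically. The following is a rough sketch, assuming L = 256 and a heavily zero-padded FFT; the helper name `sidelobe_summary` is made up for this example. It finds the first null after the main lobe and reports the level and position of the highest sidelobe beyond it.

```python
import numpy as np

def sidelobe_summary(window, n_fft=1 << 16):
    """Return (peak sidelobe in dB re main lobe, its frequency in units of pi/L)."""
    L = len(window)
    mag = np.abs(np.fft.rfft(window, n_fft))          # zero-padded spectrum
    mag_db = 20 * np.log10(np.maximum(mag / mag.max(), 1e-12))
    # The main lobe falls monotonically from DC; the first bin where the
    # response starts rising again marks the first null.
    first_null = np.nonzero(np.diff(mag_db) > 0)[0][0]
    peak_bin = first_null + np.argmax(mag_db[first_null:])
    omega = 2 * np.pi * peak_bin / n_fft              # rad/sample
    return mag_db[peak_bin], omega * L / np.pi

L = 256                                               # arbitrary window length
n = np.arange(L)
rect = np.ones(L)
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / L)

rect_db, rect_pos = sidelobe_summary(rect)
hann_db, hann_pos = sidelobe_summary(hann)
print(f"rect: {rect_db:.1f} dB at {rect_pos:.2f} pi/L")
print(f"hann: {hann_db:.1f} dB at {hann_pos:.2f} pi/L")
```

The positions come out close to, though not exactly at, the 3pi/L and 5pi/L midpoints between nulls, since each sidelobe peak sits slightly off-center within its lobe.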
The Hamming window reintroduces a little bit of discontinuity (a "pedestal" below the raised-cosine) which manages to cancel some of the peak of the worst Hann sidelobe. As a result, the Hamming window has a worst-case sidelobe almost 43 dB below the mainlobe. It doesn't make the mainlobe any wider than with Hann, *but* it does introduce spectral splatter: whereas the Hann sidelobes continue to decrease as you get further away from the center frequency, the Hamming sidelobes die out much more slowly, as can be seen in Omid's figure.
So essentially it's a compromise between the size of the sidelobes very close to the main lobe (which Hamming makes more than 10 dB better than Hann) and the sidelobes far away (which Hamming makes much worse, maybe 50 dB worse in Omid's figure). This can be particularly damaging if you're trying to be sensitive to low-energy components in a signal with other, high-energy components which are far removed in frequency. Speech without pre-emphasis fits this, with the low-frequency voicing often 40 dB+ more intense than the high frequency.
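To put rough numbers on that near/far trade-off, here is a small sketch. It assumes L = 256, the usual 0.54/0.46 Hamming coefficients, and an arbitrary definition of "far" as the upper half of the band (omega >= pi/2); none of these choices come from Omid's figure, so the exact dB values will differ from it.

```python
import numpy as np

def response_db(window, n_fft=1 << 16):
    """Magnitude response in dB, normalized to 0 dB at the main lobe peak."""
    mag = np.abs(np.fft.rfft(window, n_fft))
    return 20 * np.log10(np.maximum(mag / mag.max(), 1e-12))

def near_peak_db(db):
    """Highest sidelobe level: the maximum beyond the first null."""
    first_null = np.nonzero(np.diff(db) > 0)[0][0]
    return db[first_null:].max()

L = 256                                   # arbitrary window length
n = np.arange(L)
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / (L - 1))
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / (L - 1))

hann_db = response_db(hann)
hamming_db = response_db(hamming)

# Frequency axis matching the rfft bins; "far" is an assumed cutoff.
omega = np.linspace(0, np.pi, len(hann_db))
far = omega >= np.pi / 2

hann_near, hamming_near = near_peak_db(hann_db), near_peak_db(hamming_db)
hann_far, hamming_far = hann_db[far].max(), hamming_db[far].max()

print(f"worst sidelobe:     Hann {hann_near:.1f} dB, Hamming {hamming_near:.1f} dB")
print(f"far-band sidelobes: Hann {hann_far:.1f} dB, Hamming {hamming_far:.1f} dB")
```

Hamming should win the first comparison by roughly 10 dB and lose the second by several tens of dB, which is the compromise described above.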
Dan's objection to the widely-used Hamming stems, I think, from the small discontinuity due to the pedestal, and the consequent spectral splatter. So the "Povey window" stays pretty close to the Hamming window in the time domain, except at the extremes, where it smoothly tapers to zero. However, this doesn't manage to preserve the first-sidelobe suppression of the original Hamming, which actually *relied* on the spectral splatter to cancel the lobe. Here's a plot of the detail of the magnitude responses for Rectangular, Hann, Hamming, and Povey windows, right around the mainlobe and early sidelobes:
You can see how Hamming (green) has reduced the first sidelobe (around \omega=0.010) compared to Hann (orange), but the later green sidelobes don't decay much.
Unfortunately, the Povey window (red) doesn't appear to give any advantage over Hann, except for a marginally narrower mainlobe (which surprised me). On the whole, I think a plain raised-cosine (Hann) is a better choice, although I'd be surprised if there were any meaningful difference between them on a downstream task.
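For anyone wanting to reproduce the comparison, here is a minimal sketch. It assumes the Povey window is a symmetric Hann raised to the power 0.85, which is my understanding of Kaldi's definition and should be checked against the Kaldi source; the 400-sample length and the helper name are likewise illustrative choices.

```python
import numpy as np

def peak_sidelobe_db(window, n_fft=1 << 16):
    """Highest sidelobe level in dB relative to the main lobe peak."""
    mag = np.abs(np.fft.rfft(window, n_fft))
    db = 20 * np.log10(np.maximum(mag / mag.max(), 1e-12))
    first_null = np.nonzero(np.diff(db) > 0)[0][0]   # end of the main lobe
    return db[first_null:].max()

L = 400                                   # e.g. 25 ms at 16 kHz (assumed)
n = np.arange(L)
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / (L - 1))
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / (L - 1))
povey = hann ** 0.85                      # assumed definition, see lead-in

# Time domain: Povey tracks Hamming except near the edges, where it
# goes to zero while Hamming keeps its 0.08 pedestal.
diff = np.abs(povey - hamming)
print("edge values (Hamming, Povey):", hamming[0], povey[0])
print("max |Povey - Hamming| in the central half:",
      diff[L // 4 : 3 * L // 4].max())

# Frequency domain: worst sidelobe for each window.
for name, win in [("Hann", hann), ("Hamming", hamming), ("Povey", povey)]:
    print(f"{name:8s} worst sidelobe: {peak_sidelobe_db(win):.1f} dB")
```

Under that assumed definition the Povey window is sin(pi n/(L-1)) raised to the power 1.7, i.e. a member of the cosine-power family sitting between the half-sine (power 1) and Hann (power 2) windows, which would be consistent with a slightly narrower mainlobe and no improvement in the first sidelobe.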
DAn.