Hi Davide,
If you are looking for ChIP-seq data on broad histone marks, the ENCODE project has released a lot of it <http://genome.ucsc.edu/ENCODE/downloads.html>. I am not sure whether these datasets are suitable for a method comparison. Note that ENCODE datasets have different embargo dates.
For transcription factors, people typically use motifs to judge whether the peaks are correct. For broad features, however, as far as I know there is no clear way to compare. You may need to do some kind of genomic feature association study. For example, some marks are well known to be associated with gene activation, others with gene repression, and so on.
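To make the association idea a bit more concrete, here is a rough Python sketch of one possible check: for each peak caller, count what fraction of its peaks fall within some distance of a transcription start site (TSS), then compare that fraction between callers. Everything in it is an assumption for illustration only: the function name `fraction_near_tss`, the 1 kb window, the BED-like `(chrom, start, end)` tuples, and the toy coordinates are all made up, and TSS proximity is just one of many features you could test.

```python
from bisect import bisect_left


def fraction_near_tss(peaks, tss_by_chrom, window=1000):
    """Return the fraction of peaks with a TSS within `window` bp.

    peaks        -- list of (chrom, start, end) tuples, BED-style half-open
    tss_by_chrom -- dict mapping chrom -> SORTED list of TSS positions
    window       -- hypothetical distance cutoff in bp (an assumption)
    """
    if not peaks:
        return 0.0
    hits = 0
    for chrom, start, end in peaks:
        sites = tss_by_chrom.get(chrom, [])
        # Index of the first TSS at or after the left edge of the
        # window-extended peak.
        i = bisect_left(sites, start - window)
        # That TSS counts only if it also lies before the extended right edge.
        if i < len(sites) and sites[i] < end + window:
            hits += 1
    return hits / len(peaks)


# Toy data, invented for illustration (not real annotation or real peaks).
tss = {"chr1": [1000, 50000], "chr2": [20000]}
peaks_a = [("chr1", 500, 1500), ("chr1", 48000, 49500), ("chr2", 100, 900)]

print(fraction_near_tss(peaks_a, tss))
```

If you run the same function on the peak sets from two callers for a mark expected to sit near active promoters, the caller with the higher fraction agrees better with that expectation. That is only indirect evidence, not proof that its peaks are more correct.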
HTH,
Tao