To add to that -- you can probably use the sum of the sequence lengths, or something like 0.8 * sum_lengths, as the genome size (G) without it making a big difference in peak calling. If I recall correctly, G is used to calculate the global lambda and also the maximum number of redundant reads allowed to pile up at one site.

The global lambda is proportional to number_reads / G, so a slightly smaller effective genome size gives a slightly bigger global lambda, and therefore a slightly more conservative set of peaks (or slightly narrower peak coordinates) -- but only where the global lambda would be chosen over the local lambdas.

As for the maximum number of duplicated tags at a site, any genome size between 0.75 * G and G will almost certainly give the same result, and if it doesn't, the difference will be only 1 read per site. And if you are using "--keep-dup 1" (I believe this is the default), genome size does not affect how many duplicates are kept at all, so it would only affect the global lambda.
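To make the proportionality concrete, here is a small sketch (not MACS2's actual internals -- just the number_reads / G relationship described above, with a hypothetical `global_lambda` helper and made-up read/fragment numbers) showing how shrinking the effective genome size by 20% inflates the global lambda by the same factor:

```python
# Hypothetical illustration of the scaling argument above, NOT MACS2 source code.
# Global lambda ~ expected background reads in a fragment-sized window,
# which is proportional to number_reads / genome_size.

def global_lambda(n_reads, fragment_size, genome_size):
    # Expected read count per window of `fragment_size` bp under a
    # uniform background model.
    return n_reads * fragment_size / genome_size

n_reads = 20_000_000          # assumed library size (for illustration)
frag = 200                    # assumed fragment size in bp
G = 3_000_000_000             # sum of sequence lengths
G_eff = 0.8 * G               # e.g. ~80% of the genome is mappable

lam_full = global_lambda(n_reads, frag, G)
lam_eff = global_lambda(n_reads, frag, G_eff)

# Using 0.8*G raises global lambda by exactly 1/0.8 = 1.25x,
# i.e. a slightly more conservative background wherever the global
# lambda wins over the local lambdas.
print(lam_full, lam_eff)
assert lam_eff > lam_full
```

Since local lambdas (estimated from windows around each candidate peak) don't involve G at all, this is the only place the choice of genome size nudges the significance calculation.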