not outputting all alignment blocks


Greg Owens

Apr 17, 2025, 1:06:43 PM
to MafFilter
Hi,
I've started using MafFilter and am testing it on a MAF file produced by cactus and cactus-hal2maf. It's an alignment of diploid genome assemblies, so most of my samples have two sequences representing the two haplotypes.

I have two questions/problems:
1) I tried a simple option file, where I'm just asking it to output a maf without any filtering. This is the option file text:

input.file=tmp.maf.gz  //Input maf file,  gzipped
input.file.compression=gzip
output.log=tmp.maffilter.log //Output log file
maf.filter= \
    Output(                                 \
        file=tmp.filtered.maf.gz,           \
        compression=gzip,                    \
        mask=no)

I started with a MAF file that has 100000 lines, but the filtered file has only 52205 lines. I expected the input and output to be identical (at least in the number of lines), because I'm not asking for any changes to the MAF file.
I've attached my starting MAF and the output MAF.
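
To narrow down where the missing lines go, it may help to compare the two files by line type rather than by total line count. The following is a minimal Python sketch, not part of MafFilter; it assumes the usual MAF convention that each line starts with a one-letter tag (`a` for a block header, `s` for a sequence, `i`/`e`/`q` for annotation lines) and that comment lines start with `#`. The function name `count_maf_lines` is made up for illustration.

```python
import gzip
import sys
from collections import Counter


def count_maf_lines(path):
    """Count the lines of a MAF file (gzipped or plain) by their leading tag."""
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                counts["blank"] += 1
            elif stripped.startswith("#"):
                counts["comment"] += 1
            else:
                # First whitespace-delimited token: a, s, i, e, q, ...
                counts[stripped.split(None, 1)[0]] += 1
    return counts


if __name__ == "__main__":
    # Example: python count_maf.py tmp.maf.gz tmp.filtered.maf.gz
    for path in sys.argv[1:]:
        print(path, dict(count_maf_lines(path)))
```

If the `a` and `s` counts match between the two files and only other line types differ, then no alignment blocks were dropped and the difference lies in annotation, comment, or blank lines.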

2) There is an option to remove blocks with duplicates. In my case, each sample should be present in 1 or 2 copies. Is there any way of specifying that each sample should appear once or twice, but not more than that?
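
To make the intended rule concrete, here is a small Python sketch of the per-block check. It is not MafFilter syntax and says nothing about what MafFilter supports. It assumes that the sample name is the part of the `s` line's source field before the first `.`; that naming is a guess and may not match how cactus labels the two haplotypes, so the hypothetical `sample_of` helper would need adjusting. Only the upper limit is checked here.

```python
from collections import Counter


def sample_of(src):
    # Assumption: "sample.contig" naming; the sample is the text before the first "."
    return src.split(".", 1)[0]


def block_within_limit(block_lines, max_copies=2, get_sample=sample_of):
    """Return True if no sample has more than max_copies 's' lines in the block."""
    counts = Counter()
    for line in block_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "s":
            counts[get_sample(fields[1])] += 1
    return all(n <= max_copies for n in counts.values())
```

A block with two sequences from one sample would pass with the default `max_copies=2`, while a block with three sequences from the same sample would be rejected.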
Thanks!
Greg Owens


tmp.filtered.maf.gz
tmp.maf.gz