Hi!
I am currently working on a self-BLAST visualization of a bacterial genome using Circos (v0.69-9). As someone who is new to both Circos and bioinformatics, I’m encountering significant difficulties and would greatly appreciate your advice.
After aligning my genome against itself with BLASTn, I see an excessive number of 100% identity matches, especially between neighboring regions. I have filtered the results by:
-evalue 1e-10,
-word_size (100–200),
and minimum alignment length (≥1000 bp),
but the Circos plot remains extremely dense, with thousands of overlapping or nearly identical links.
I also attempted post-processing with awk and pandas to eliminate duplicated start–end coordinates and reduce redundancy. However, even after aggressive filtering, the visual output is still cluttered, making it difficult to interpret any meaningful patterns. In some trials only the link lines remain, yet the inner space is still heavily filled, which suggests that link overlap itself is a core issue.
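For context, the kind of post-processing I mean looks roughly like the simplified sketch below. It is written in plain Python rather than pandas to keep it dependency-free, and it assumes the default 12-column tabular BLAST output (-outfmt 6). The function name `filter_self_blast` and the `min_len` parameter are only illustrative names, not part of BLAST or Circos.

```python
import csv
import io

def filter_self_blast(tsv_text, min_len=1000):
    """Reduce self-BLAST hits (assumed -outfmt 6 text) to Circos link lines."""
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qid, sid = row[0], row[1]
        length = int(row[3])
        qs, qe, ss, se = (int(x) for x in row[6:10])
        bits = float(row[11])
        # Normalise each interval to (id, low, high); note that this
        # discards strand orientation for reverse-strand hits.
        q = (qid, min(qs, qe), max(qs, qe))
        s = (sid, min(ss, se), max(ss, se))
        # Drop short alignments and the trivial hit of a region to itself.
        if length < min_len or q == s:
            continue
        # A->B and B->A are the same link: use an order-independent key
        # and keep only the highest-scoring record for each pair.
        key = tuple(sorted((q, s)))
        if key not in best or bits > best[key][0]:
            best[key] = (bits, q, s)
    # Circos link format: chr1 start1 end1 chr2 start2 end2
    return [f"{q[0]} {q[1]} {q[2]} {s[0]} {s[1]} {s[2]}"
            for _, q, s in best.values()]
```

Calling `filter_self_blast(open("self_blast.tsv").read())` (the file name is hypothetical) returns one link line per unique pair of regions. Even after this kind of reduction, the plot is still crowded.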
What I’d like to ask:
Is there a best practice for visualizing self-BLAST results with Circos that helps reduce clutter without compromising biological meaning?
Would it be acceptable to discard 100% identity matches or limit based on alignment length just for clarity?
Are there Circos-specific options like record_limit, z-depth, or other parameters that can help limit or prioritize link display?
Is it possible to configure Circos to automatically display only the top-scoring or longest alignment per region?
I would sincerely appreciate any advice, tips, or example configurations that might help me generate a clearer, more informative Circos plot.
Thank you so much for your time and for developing such a powerful tool.
Kind regards,
Çağla Güçdemir