Correct way to run MiniNJ

Baqiao Liu

Apr 11, 2021, 2:18:43 PM
to GeneRax
Dear Benoit,

I would like to run MiniNJ the same way it was run in the SpeciesRax preprint, but I cannot find the exact commands. One candidate I found is njrax MiniNJ <input_gfts> <mapping_file> <output_species_tree>, but running SpeciesRax with --species-tree MiniNJ --si-strategy SKIP also seems plausible. Which is the correct way to run MiniNJ?

Also, one problem I recall with NJst is missing data (when some pair of species never appears together in any of the GFTs). Can I run MiniNJ when the distance matrix has missing entries? Is the distance matrix completed in some way?
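
To make the missing-data case concrete, here is a minimal sketch (not part of GeneRax or MiniNJ) that lists the species pairs that never co-occur in any gene family tree, i.e. the entries that would be undefined in an NJst-style distance matrix. The input format, one set of species names per tree, and the function name missing_pairs are assumptions made for illustration only.

```python
from itertools import combinations


def missing_pairs(species_per_tree):
    """Return the sorted species pairs that never appear together in any tree.

    species_per_tree: iterable of sets, each holding the species covered by
    one gene family tree (hypothetical input format).
    """
    species_per_tree = [set(s) for s in species_per_tree]
    all_species = sorted(set().union(*species_per_tree))
    seen = set()
    for species in species_per_tree:
        # Record every pair that co-occurs in this tree, in sorted order.
        seen.update(combinations(sorted(species), 2))
    return [pair for pair in combinations(all_species, 2) if pair not in seen]


# Hypothetical example: three gene family trees over species A to D.
trees = [{"A", "B", "C"}, {"B", "C", "D"}, {"A", "C"}]
print(missing_pairs(trees))  # [('A', 'D')]
```

An empty result would mean that every pair of species shares at least one tree, so no distance is left undefined.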

Best,
Baqiao

Benoit Morel

Apr 11, 2021, 2:53:55 PM
to Baqiao Liu, GeneRax
Dear Baqiao,

Good questions. I just added a page in the wiki to document MiniNJ and to reply to your question: https://github.com/BenoitMorel/GeneRax/wiki/MiniNJ

You are right, MiniNJ can be sensitive to missing data. We are still trying to find an elegant way to account for missing data, but have not found a satisfactory solution yet. If you have any ideas for completing the distance matrix, I would be very happy to implement them :-)

Best,
Benoit

Baqiao Liu

Apr 11, 2021, 6:04:51 PM
to GeneRax
Dear Benoit,

Thanks for the quick reply again! The wiki page is very helpful. I am glad to be able to properly run MiniNJ.

I personally only know of the star variants of the distance-based algorithms (NJ*, BioNJ*, etc.), which work on incomplete matrices and might help with missing data, and I have the feeling that you have already considered them. I was just trying to figure out whether it makes sense to run MiniNJ on datasets prone to missing data. I guess the answer is that, for the time being, I should not use MiniNJ by itself when there is significant missing data.

Best,
Baqiao
