Reporting on number of trees read


Yan Wong

Oct 20, 2020, 5:13:08 AM
to DendroPy Users
I have a largish newick file (~20,000 trees, 1500 tips on each tree), and it's taking a fair while to read all these trees into a TreeList. Is there any way I can output a status bar for how long this is going to take (I tend to use the python tqdm library to report this sort of thing, but even printing out to the console would be fine)?

Jeet Sukumaran

Oct 20, 2020, 3:03:48 PM
to dendrop...@googlegroups.com
Hi Yan Wong,

What I do in these cases is to use a tree iterator as opposed to a tree
collection. This returns one tree at a time, as opposed to reading all
the trees and then returning, allowing me to maintain a current count.
In cases where trees can be processed singly, it's also more memory
efficient as each tree can be discarded after being processed or
statistics collected.

https://dendropy.readthedocs.io/en/v3.12.1/tutorial/trees.html#efficiently-iterating-over-trees-in-a-file
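
A minimal sketch of the iterator approach described above, added for illustration and not taken from the original reply. It assumes DendroPy 4's `Tree.yield_from_files` (the linked v3.12.1 tutorial documents an older interface), and the helper names `with_progress` and `iter_trees` and the file name `trees.nwk` are hypothetical.

```python
import sys


def with_progress(trees, every=100, out=sys.stderr):
    """Yield each item from `trees`, printing a running count every `every` items."""
    count = 0
    for count, tree in enumerate(trees, start=1):
        if count % every == 0:
            print(f"{count} trees read", file=out)
        yield tree
    print(f"done: {count} trees read", file=out)


def iter_trees(path, schema="newick"):
    """Return an iterator that parses one tree at a time from `path`.

    Assumes DendroPy 4.x; imported lazily so with_progress() works without it.
    """
    import dendropy

    # Share one taxon namespace so the tips of every tree map to the same taxa.
    taxa = dendropy.TaxonNamespace()
    return dendropy.Tree.yield_from_files(
        files=[path], schema=schema, taxon_namespace=taxa
    )


# Usage (hypothetical file name):
# for tree in with_progress(iter_trees("trees.nwk")):
#     ...  # process the tree, then let it be garbage collected
```

Because `iter_trees` returns an ordinary iterator, it should also be possible to wrap it in `tqdm.tqdm(iter_trees("trees.nwk"), total=20000)` if tqdm is installed; the `total` is only needed for a percentage bar and has to be supplied by hand, since the iterator does not know the number of trees in advance.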

-- jeet

--

----------------------------------------------------
Jeet Sukumaran
----------------------------------------------------
Assistant Professor
Biology Department
San Diego State University
----------------------------------------------------
