Hello Joe. Sorry for the long delay responding to your message.
I think I can say a couple of useful things. First, when the oncotree paper was published a couple of years back (https://pubmed.ncbi.nlm.nih.gov/33625877/) we standardized on a new domain name for oncotree: https://oncotree.info. For a while we kept the older domain name (oncotree.mskcc.org) active and running on the original server. There was a brief period recently when the old server (and domain name) stopped working, so you probably tried to follow the old links from discussions here during that period. We have now remapped web requests to oncotree.mskcc.org so that they go to the web service running at oncotree.info, but as a general rule we recommend that all users standardize on oncotree.info and stop using oncotree.mskcc.org. You can substitute oncotree.info wherever you might have used the other domain name.
---
Next, I can tell you that the deprecated API which produces the table-formatted text file should function correctly and provide accurate information about the oncotree version which is requested. So a request to this web address:
will produce an accurate table-formatted list of all oncotree nodes for the version "oncotree_latest_stable", or for any other version which might be requested. The reason this API is deprecated is that we ran into trouble previously when the oncotree itself grew and some nodes were nested at a greater depth than had previously been encountered. In the output of the link above you can see that there are seven "level" columns, because the maximum depth of the oncotree is currently 7. When we originally developed the API the depth was 5. So if the oncotree grows deeper in the future, additional columns may be needed in the table. Because the set of columns in the output is not fixed, it becomes difficult for scripts or other automations to work with the output of this (deprecated) API programmatically. Many external programs would likely break as the table grows (unless the programmers anticipated the possibility of additional columns appearing and proactively programmed for it). To avoid these future problems we deprecated this API and intended to remove it when a good opportunity arose.
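For anyone who still reads the table output, here is a rough Python sketch of a parser that does not hard-code the number of "level" columns. It assumes the file is tab-separated with a header row, and that the level columns can be recognized by a shared name prefix; the prefix "level_" and the function name split_level_columns are my own placeholders, so check them against the real header before relying on this.

```python
import csv
import io


def split_level_columns(table_text, level_prefix="level_"):
    """Split each row into (non-empty level values, other columns by name).

    Assumptions: tab-separated text with a header row, and level columns
    whose names start with level_prefix (a hypothetical default).
    """
    reader = csv.DictReader(
        io.StringIO(table_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    fieldnames = reader.fieldnames or []
    # Discover the level columns from the header instead of assuming 5 or 7.
    level_cols = [c for c in fieldnames if c.startswith(level_prefix)]
    rows = []
    for rec in reader:
        levels = [rec[c] for c in level_cols if rec[c]]  # drop empty padding
        other = {c: rec[c] for c in fieldnames if c not in level_cols}
        rows.append((levels, other))
    return level_cols, rows
```

A script written this way should keep working if an eighth level column appears, because nothing in it depends on the column count.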
All of the information needed to construct this table should be obtainable from the API at this address:
What would be helpful would be a script which takes the output of the tumorTypes API and performs this logic:
- define a number of associative maps from oncotree_code to attributes needed in the table {name, mainType, color, nci, umls, history}
- define an associative map from oncotree_code of the current node to the oncotree_code of the parent node
- define an integer holding the greatest depth (level) of any node seen so far, initialized to zero
- loop through each oncotree node in the list. For each:
  - update the greatest depth seen integer if a greater depth is present in the node
  - store needed values in the various attribute maps for this node
  - store the oncotree code for the parent in the parentage map
- define an oncotree-code-to-parentage-list map. This is done by iteratively following links in the parentage map. An example would be this entry for GBM: { "GBM" : ["BRAIN", "DIFG", "GB"] }
  - construct a parentage list for each oncotree code, using the parentage map
- construct a sorted ordering of the oncotree codes, based on a comparison function which prioritizes:
  - alphabetical ordering of the name of any parent, from "highest" (level 1) to lowest
  - alphabetical ordering of the name of the node itself
- output the table header
- for each oncotree code in the sorted ordering:
  - look up the parentage-list for the oncotree code
  - for each code in the parentage list output a field for the parent in format "<parent-name> (<parent-code>)"
  - output a number of empty fields equal to greatest-seen-depth - size(parentage list)
  - output the other stored attributes for this oncotree code {name, mainType, color, nci, umls, history}
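To make the steps above concrete, here is a rough, untested-against-production sketch in Python. It takes the already-parsed JSON list from the tumorTypes API as input. The JSON key names ("code", "parent", "level", "externalReferences" with "NCI" and "UMLS" entries), the "level_N" header names, and the function names are assumptions on my part, so adjust them to the actual response. It also assumes the root node has no parent and is left out of both the lineages and the output rows, which is what the GBM example implies. The sort key is one reading of the comparison described above: parent names first, then the node's own name.

```python
import csv
import sys

ATTRS = ["name", "mainType", "color", "nci", "umls", "history"]


def _field(value):
    """Render an attribute as one text field (lists are comma-joined)."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(v) for v in value)
    return "" if value is None else str(value)


def build_rows(nodes):
    """Return the table (header first) as a list of lists of strings."""
    attrs = {}      # oncotree_code -> attribute values
    parent_of = {}  # oncotree_code -> parent code (None for the root)
    max_depth = 0   # greatest level seen so far
    for node in nodes:
        code = node["code"]  # assumed key name
        max_depth = max(max_depth, node.get("level") or 0)
        refs = node.get("externalReferences") or {}  # assumed structure
        attrs[code] = {
            "name": node.get("name"),
            "mainType": node.get("mainType"),
            "color": node.get("color"),
            "nci": refs.get("NCI"),
            "umls": refs.get("UMLS"),
            "history": node.get("history"),
        }
        parent_of[code] = node.get("parent")

    def parentage(code):
        """Follow parent links upward; return ancestors from level 1 down."""
        chain, seen = [], {code}
        parent = parent_of.get(code)
        while parent in parent_of and parent not in seen:
            if parent_of[parent] is not None:  # leave out the root itself
                chain.append(parent)
            seen.add(parent)  # guards against a cycle in bad data
            parent = parent_of[parent]
        chain.reverse()
        return chain

    # The root (no parent) gets no row of its own in this sketch.
    lineage = {c: parentage(c) for c in attrs if parent_of[c] is not None}

    def sort_key(code):
        parent_names = [attrs[p]["name"] or "" for p in lineage[code]]
        return (parent_names, attrs[code]["name"] or "")

    rows = [[f"level_{i}" for i in range(1, max_depth + 1)] + ATTRS]
    for code in sorted(lineage, key=sort_key):
        parents = [f"{attrs[p]['name']} ({p})" for p in lineage[code]]
        padding = [""] * (max_depth - len(parents))
        rows.append(parents + padding + [_field(attrs[code][a]) for a in ATTRS])
    return rows


def write_table(nodes, out=sys.stdout):
    """Write the table as tab-separated text."""
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(build_rows(nodes))
```

Something like write_table(json.load(open("tumorTypes.json"))) would print the table, where the file name is just an example. Note that, following the steps literally, only parents go into the level columns, so a node's own code does not appear in its row; appending the node to its own lineage before formatting would be a small change if that is wanted.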
This is not too difficult a script to write. If we put together such a script, we will distribute it on our tools page, and maybe then we could actually remove the API which generates this table format of oncotree.
I hope this helps you. Some of the terminology in your message is a little different from the terminology we use, but I think the terms map to our concepts: OncotreeLineage would refer either to our History field or to the parental relationship between oncotree nodes, and OncotreePrimaryDisease would refer to what we call "mainType". If I am mistaken, feel free to clarify these terms. I think you also suggest adding the oncotree version to the text file as content. That would require defining a way to include meta-information in the file content while still keeping the file machine-readable. Perhaps the easiest solution would be to store the oncotree version in the filename rather than in the file contents.
- Rob
On Thursday, July 20, 2023 at 9:02:41 PM UTC-4: