general understanding question


Namaste Tenzin

Jan 20, 2020, 2:43:32 AM
to pyradiomics
Hi Joost,

I would like to ask a general understanding question. I am looking at the following notebook


How can I interpret the heat map and select the features that correlate with the outcome? Is there a simple function that outputs the features that are highly correlated with the outcome, which I can then use for classification on a test set?

Also, I already have a CSV file with feature names as columns and patients as rows. Is there code in that folder that I can use directly on the CSV file, rather than on the examples you have given there?

Thanks so much, 

best
Tenzin 

Joost van Griethuysen

Jan 21, 2020, 4:35:54 AM
to pyradiomics
The feature visualization does not relate features to outcome, as no outcome is defined. The heatmap you mention shows the correlation among the extracted features. You can use this for feature selection by clustering correlated features and then selecting 1 from each cluster.
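
One way to cluster correlated features and keep one per cluster can be sketched as follows. This is only an illustration, not code from the notebook: the function name `one_feature_per_cluster`, the average-linkage clustering, the `1 - |correlation|` distance and the 0.1 cut-off are all assumptions, and the table is assumed to have patients as rows, numeric features as columns, and no constant columns.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def one_feature_per_cluster(features: pd.DataFrame, max_distance: float = 0.1) -> list:
    """Cluster features on 1 - |correlation| and keep the first feature of each cluster."""
    corr = features.corr().abs()
    # Turn correlation into a distance: strongly correlated features are close together.
    dist = np.clip(1.0 - corr.to_numpy(), 0.0, None)
    dist = (dist + dist.T) / 2.0          # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(tree, t=max_distance, criterion="distance")
    # Hypothetical rule: keep the first feature (in column order) of every cluster.
    selected = {}
    for name, label in zip(corr.columns, labels):
        selected.setdefault(label, name)
    return list(selected.values())
```

With a CSV laid out as described in the question, the table could be loaded with `pd.read_csv("features.csv", index_col=0)` (the file name is hypothetical) and passed to the function. Which feature represents each cluster is a free choice; taking the first one is just the simplest option.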

Regards,

Joost

On Monday, January 20, 2020 at 08:43:32 UTC+1, Namaste Tenzin wrote:

j.v.gri...@nki.nl

Jan 28, 2020, 4:18:26 AM
to tkun...@gmail.com, pyrad...@googlegroups.com

No, that just gives the first couple of lines from the dataset.

To select features you have to define a measure to select uncorrelated features. An example would be to select one feature, then drop all features that are correlated with it at > 0.9, then select the next, etc. (you'd still need to figure out how to choose which feature to select first, and which to select next after you've removed the correlated ones).
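
A minimal sketch of such a greedy filter, under stated assumptions: features are visited in column order (which sidesteps, rather than answers, the question of which feature to pick first), correlation is the absolute Pearson correlation from pandas, and the name `drop_correlated` is made up for this example.

```python
import pandas as pd


def drop_correlated(features: pd.DataFrame, threshold: float = 0.9) -> list:
    """Greedily keep features whose |correlation| with every kept feature is <= threshold."""
    corr = features.corr().abs()
    kept = []
    for name in corr.columns:
        # Keep this feature only if it is not highly correlated with any feature kept so far.
        if all(corr.loc[name, other] <= threshold for other in kept):
            kept.append(name)
    return kept
```

Reordering the columns changes the result, so it may be worth sorting them first by some criterion of your own (for example a stability measure) before applying the filter.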

However, if your aim is feature selection, my advice would be to read up on several popular algorithms, both supervised and unsupervised. E.g. Orthogonal Principal Feature Selection, mRMR, LASSO, etc.
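
As an illustration of the supervised route, LASSO-based selection could look roughly like the sketch below. It assumes scikit-learn is available, that an outcome value exists for every patient, and that a fixed penalty `alpha=0.1` is acceptable; the function name `lasso_select` is hypothetical. In practice the penalty would normally be tuned by cross-validation, and for a binary outcome an L1-penalized logistic regression may be the more natural choice.

```python
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def lasso_select(features: pd.DataFrame, outcome, alpha: float = 0.1) -> list:
    """Return the names of features that receive a non-zero LASSO coefficient."""
    # Standardize first so the penalty treats all features on the same scale.
    scaled = StandardScaler().fit_transform(features)
    model = Lasso(alpha=alpha).fit(scaled, outcome)
    return [name for name, coef in zip(features.columns, model.coef_) if coef != 0.0]
```

Because the outcome is used here, the selection should be fitted on the training set only and then applied unchanged to the test set.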

I also always advise checking whether you can assess feature stability (e.g. against reader noise or acquisition noise) and using that to remove unstable features.

 

Regards,

 

Joost van Griethuysen

 

From: Namaste Tenzin [mailto:tkun...@gmail.com]
Sent: Wednesday, January 22, 2020 6:14
To: Joost van Griethuysen
Subject: Re: general understanding question

 

Thank you so much Joost for your answer, 

 

So in that notebook, are the feature names shown by code line 21 (d.head) the ones selected using the clustering technique? If so, can I then test those features in my classification?

 
