ordering of top features by SD


Shahab Mirshahvaladi

Sep 5, 2022, 7:41:25 AM
to Omics Playground
Hello
This might be stating the obvious, but somebody just asked me why the top features are sorted by SD and not by something else. Any thoughts?

BigOmics Analytics Team

Sep 5, 2022, 8:23:33 AM
to Omics Playground
Hi. For unsupervised methods, like the heatmap, you are not allowed to explicitly use any of the condition variables (sample information), because then it would no longer be "unsupervised". Sorting the genes by SD does not use any information about the groupings; the presumption is that genes with large variation (high SD) are mostly driven by some of these biological conditions. Compare this to differential expression testing, where you explicitly define groups and the top features are conditioned on those groupings: that is "supervised". Does that answer your question? Any other variable you want to use for prioritizing genes in unsupervised methods should likewise not use any sample information, otherwise it would be "cheating".
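
To make the idea concrete, here is a minimal sketch of SD-based ranking in plain Python. It is only an illustration and not the actual Omics Playground code; the function name `top_features_by_sd`, the gene names, and the toy values are all made up for the example. The point to notice is that the function receives only the expression values, never the sample groupings.

```python
from statistics import stdev


def top_features_by_sd(expression, n_top):
    """Return the n_top feature names with the largest SD across samples.

    `expression` maps a feature (gene) name to its list of values, one
    value per sample. No sample annotation is passed in, so the ranking
    cannot depend on the groupings and stays unsupervised.
    """
    sd = {gene: stdev(values) for gene, values in expression.items()}
    return sorted(sd, key=sd.get, reverse=True)[:n_top]


# Toy data (hypothetical gene names, four samples each).
expression = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly constant across samples
    "geneB": [1.0, 9.0, 1.2, 8.8],  # highly variable
    "geneC": [3.0, 4.0, 3.5, 4.5],  # moderately variable
}

top = top_features_by_sd(expression, n_top=2)
print(top)  # ['geneB', 'geneC']
```

A supervised ranking would instead take the group labels as an extra argument (for example, to compute a fold change or test statistic between groups), which is exactly what the unsupervised views avoid.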

Ivo