Hey everyone,
I am a computer science graduate with a background in machine learning, deep learning, and data science. I have started working on a research project on machine learning in cancer. I am trying to build ensemble or deep learning models that predict the expression of the genes most correlated with a given set of input genes.
The data type I am using is mRNASeq, but the problem is getting a large enough dataset.
The dataset I have collected so far has about seven thousand samples and more than twenty thousand genes. With fewer samples than genes, that is not good at all.
I am curious whether it is possible to get more than 100,000 (one lakh) samples of this data type.
I have been searching databases like GEO, cBioPortal, Firehose, etc., but I have not reached even ten thousand samples.
Is it simply not possible?
Any help will be appreciated, because I don't want to waste my time on something that is not possible.
HELP PLEASE!