Initial R analysis


Aanuoluwa Adekoya

Nov 13, 2022, 10:48:19 PM
to SAMSA bioinformatics group
Hi,

I get this error when I try to run run_DESeq_stats.R and Subsystems_DESeq_stats.R on my outputs:

"Error in `$<-.data.frame`(`*tmp*`, X2, value = numeric(0)) :
  replacement has 0 rows, data has 8"

Could you please provide help?

Thanks.

Sam Westreich

Nov 13, 2022, 10:53:37 PM
to Aanuoluwa Adekoya, SAMSA bioinformatics group
Hi Aanuoluwa,

Sure, I can help out!  I'll need a bit more information, though.  Could you give me:
  1. The names of the files in your working directory that you're feeding in (these will probably look like the files here: https://github.com/transcript/samsa2/tree/master/sample_files_paired-end/6_RefSeq_org_results)
  2. The first 5 or so lines of each file (on the command line, you can run "head -5 $filename" to print these in the terminal)

Your error suggests that one or more of the files in the working directory isn't being loaded into R properly.  The script parses each file and adds it to one merged table; it's throwing an error because one of the files seems to be empty or isn't being read correctly.
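
As a rough sketch of the check described above (not part of SAMSA2 itself), a small shell function could print the first 5 lines of every input file in a directory and flag any file that is empty. The function name check_inputs and the *.tsv glob are assumptions for illustration; adjust the pattern to match your own file names.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with SAMSA2.
# Prints the first 5 lines of each .tsv file in a directory
# and reports any file that has zero bytes.
# Usage: check_inputs DIR   (defaults to the current directory)
check_inputs() {
  local dir="${1:-.}"
  local f
  for f in "$dir"/*.tsv; do
    # If nothing matched, the glob stays literal; skip it.
    [ -e "$f" ] || continue
    if [ ! -s "$f" ]; then
      # -s is false for zero-byte files.
      echo "EMPTY: $f"
      continue
    fi
    echo "== $f =="
    head -5 "$f"
  done
}
```

Calling check_inputs on the directory you pass to the R script gives both pieces of information requested above (file names and first lines) in one pass. An empty file is only one possible cause of the error, so a clean result here does not rule out other read problems.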

Best,
Sam


--
Sam Westreich, PMP, PhD
Microbiome Scientist, DNAnexus

Adekoya Aanuoluwa

Nov 14, 2022, 10:39:17 AM
to Sam Westreich, SAMSA bioinformatics group
Thank you for your response, Sam.

I have attached screenshots of the file names and the first five lines of each file to this email.  One more question: do I need to have my raw counts file labeled as control and experimental too?

Thanks.




Aanuoluwa E. Adekoya
Doctoral Student,
Microbiology and Plant Biology Department,
University of Oklahoma.


"Yesterday is not ours to recover, but tomorrow is ours to win or lose" -  Lyndon B. Johnson.

experimental.PNG
control.PNG

Sam Westreich

Nov 15, 2022, 11:29:59 AM
to Adekoya Aanuoluwa, SAMSA bioinformatics group
Hi Adekoya,

Hmm, interesting.  

First, no, you don't need to label the raw_counts file as anything special; it's just loaded in by filename as you specify in the command string.

Second, can you give me the command that you're running that throws this error, and the error message?  Does it say which line is failing?

Third, do you know (approximately) when you grabbed the code from GitHub?  I'm just trying to figure out whether this is an earlier issue that has already been caught.

If I'm still not certain why this is failing for you, I might ask you to share a couple of the input files (maybe just 2 controls and 2 experimentals) and I can run the command myself and determine where things are getting stuck.

Best,
Sam

Adekoya Aanuoluwa

Nov 15, 2022, 11:45:47 AM
to Sam Westreich, SAMSA bioinformatics group
Thank you again, Sam.

I started working with SAMSA in September, so I did not copy the code any earlier than last month.

I have attached 4 files and the bash script I am using to reference the R scripts. Thanks! 


Regards,

Aanuoluwa E. Adekoya

control_SRR6833324_refou_organism.tsv
experimental_SRR10267759_refou_organism.tsv
experimental_SRR10267760_refou_organism.tsv
control_SRR6833323_refou_organism.tsv
deseqanalysis.sh

Sam Westreich

Nov 16, 2022, 1:54:38 PM
to Adekoya Aanuoluwa, SAMSA bioinformatics group
Hi Adekoya,

Okay, this is going to be a tricky bug to hunt down, since I'm not getting it on my end, at least when running these 4 files through the run_DESeq_stats.R script using the shell command you provided.

Could you also send me the raw_counts file?  I'm wondering if there's an issue reading that in; I'm able to read in the rest of the files with no problem.

Best,
Sam

Adekoya Aanuoluwa

Nov 16, 2022, 1:57:38 PM
to Sam Westreich, SAMSA bioinformatics group
Do I need to use the raw_counts file?

Maybe taking that out could help. 


Sam Westreich

Nov 16, 2022, 3:41:34 PM
to Adekoya Aanuoluwa, SAMSA bioinformatics group
Hi Adekoya,

No, you don't need that file; it's optional, and only used if you want to compare the fraction of annotated sequences against all reads (to add a "not annotated" grouping if you suspect a sample has a lot of dark matter).

As an example of the intended output, I'm attaching a DESeq2 results output from the 4 files you sent me.  If you try running the Rscript command again without the raw counts file and still get an error, please let me know - at that point, I might have to try messing around with different R versions.

Best, 
Sam
results.tab

Adekoya Aanuoluwa

Nov 16, 2022, 4:40:04 PM
to Sam Westreich, SAMSA bioinformatics group
Hi Sam,

Thank you again for your help. When I removed the raw counts file, the functional and subsystems analyses both worked, but the organism analysis came back with another, very long error. I have attached a screenshot of that to this email. Thanks.


org error.PNG