BETA-basic execution errors

Matthew Grieshop

May 5, 2015, 9:34:04 PM
to cist...@googlegroups.com
I am having difficulties executing BETA-basic. I have a bed file and beta-specific formatted file but I am receiving this error when I try to execute:

cp: target `/data4/CistromeAP/galaxy_database/files/000/960/dataset_960788.dat' is not a directory
cp: target `/data4/CistromeAP/galaxy_database/files/000/960/dataset_960789.dat' is not a directory
cp: target `/data4/CistromeAP/galaxy_database/files/000/

Any help would be much appreciated!

Thanks,
Matt

Jian Ma

May 6, 2015, 3:26:09 AM
to cist...@googlegroups.com
Hi Matthew,

What is the file format of your input BED and BSF files? Here are some examples of the file formats (http://cistrome.org/BETA/tutorial.html). Also, you may need to use tabs to separate the columns in the files.

Also, when you click the eye icon of the failed log dataset in your Cistrome AP history panel, you will see more detailed error messages.

--
You received this message because you are subscribed to the Google Groups "Cistrome" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cistrome+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
best,

Jian Ma,


Charlotte Svensson

May 6, 2015, 5:53:59 AM
to cist...@googlegroups.com
I have the exact same problem. I have even uploaded files that used to be analyzed successfully, and they are not working any more.

Matthew Grieshop

May 6, 2015, 10:15:49 PM
to cist...@googlegroups.com
Hello Jian,

The input bed file has 4 columns: the first 3 are chromosome, start, and end, and the fourth is score. The txt file is tab delimited and has 3 columns: gene name, regulatory status, and P-value, left to right. I see the bed file should have 3 or 5 columns.

Matt
 

--
You received this message because you are subscribed to a topic in the Google Groups "Cistrome" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cistrome/nwIJgZJzbns/unsubscribe.
To unsubscribe from this group and all its topics, send an email to cistrome+u...@googlegroups.com.

Matthew Grieshop

May 6, 2015, 10:46:05 PM
to cist...@googlegroups.com
Hello Jian, 

I converted the bed file to 3 columns as specified and am still getting an error. The detailed message that you pointed out is:

CRITICAL:root:The input bed file /data4/CistromeAP/galaxy_database/files/000/961/dataset_961747.dat has a wrong format!(3 column checking active)
[22:38:56] Argument List: 
[22:38:56] Name = NA
[22:38:56] Peak File = /data4/CistromeAP/galaxy_database/files/000/961/dataset_961747.dat
[22:38:56] Top Peaks Number = 10000
[22:38:56] Distance = 100000 bp
[22:38:56] Genome = hg19
[22:38:56] Expression File = /data4/CistromeAP/galaxy_database/files/000/960/dataset_960787.dat
[22:38:56] BETA specific Expression Type
[22:38:56] Number of differential expressed genes = 0.5
[22:38:56] Differential expressed gene FDR Threshold = 1.0
[22:38:56] Up/Down Prediction Cutoff = 1.000000
[22:38:56] Function prediction based on regulatory potential
[22:38:56] Wrong Format:			chr13	37858225	37858829	

[22:38:56] Right Format should look like:	chr1	567577	567578	MACS_peak_1	119.00
[22:38:56] Or the depreciate 3-column format like this:	chr1	567577	567578
[22:38:56] 42064	1.940559441	0.932137556
 is not the header of the expression file
[22:38:56] Checking the differential expression infomation...
[22:38:56] Take the first line with Differential Information as an example: 42064	1.940559441	0.932137556

[22:38:56] BETA cannot recognize the refseq gene ID, status value(logFC) or FDR. Please give the exact column numbers of the refseq, logFC, and fdr like: 1,2,7 for LIMMA; 2,10,13 for Cufdiff; and 1,2,3 for BETA specific format.
Should the txt file be comma delimited, or am I reading the last error wrong? Furthermore, the bed file has only 3 columns from my view, but it looks like there is an extra tab after column 3, which creates an empty column 4. Are these the issues you are also seeing?
Thanks so much for your help!
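
The "Wrong Format" line in the log above ends with a tab after the third column, which is consistent with the extra-tab suspicion. A minimal sketch of a standalone check follows. It is not BETA's own validation code: the function name `check_bed_line` is hypothetical, and the 3-or-5-column rule is taken only from the "Right Format" lines in the log.

```python
def check_bed_line(line):
    """Return a list of problems found in one BED line (empty if none).

    Assumption: a valid line has exactly 3 or 5 tab-separated columns,
    as suggested by the BETA log output, with integer start and end.
    """
    problems = []
    text = line.rstrip("\r\n")
    # A trailing tab or space produces an extra, empty column.
    if text != text.rstrip("\t "):
        problems.append("trailing whitespace or tab")
    fields = text.split("\t")
    if len(fields) not in (3, 5):
        problems.append(
            "expected 3 or 5 tab-separated columns, found %d" % len(fields)
        )
    if len(fields) >= 3 and not (fields[1].isdigit() and fields[2].isdigit()):
        problems.append("start and end must be integers")
    return problems


if __name__ == "__main__":
    # The line reported as wrong in the log, including its trailing tab.
    print(check_bed_line("chr13\t37858225\t37858829\t\n"))
    # The 3-column form the log describes as accepted.
    print(check_bed_line("chr1\t567577\t567578\n"))
```

Running this over every line of the peak file before uploading would point to the lines that carry a stray trailing tab.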

Matthew Grieshop

May 6, 2015, 11:05:15 PM
to cist...@googlegroups.com
This is as far as I have gotten. I had to list the BSF columns as 1,3,4 to get this. When I list 1,2,3, it says that it cannot recognize the refseq gene ID, status value or FDR.

[23:02:32] Argument List: 
[23:02:32] Name = NA
[23:02:32] Peak File = /data4/CistromeAP/galaxy_database/files/000/961/dataset_961799.dat
[23:02:32] Top Peaks Number = 10000
[23:02:32] Distance = 100000 bp
[23:02:32] Genome = hg19
[23:02:32] Expression File = /data4/CistromeAP/galaxy_database/files/000/960/dataset_960787.dat
[23:02:32] BETA specific Expression Type
[23:02:32] Number of differential expressed genes = 0.5
[23:02:32] Differential expressed gene FDR Threshold = 1.0
[23:02:32] Up/Down Prediction Cutoff = 1.000000
[23:02:32] Function prediction based on regulatory potential
[23:02:32] Check /data4/CistromeAP/galaxy_database/files/000/961/dataset_961799.dat successfully!
[23:02:32] 42064	1.940559441	0.932137556
 is not the header of the expression file
[23:02:32] Checking the differential expression infomation...
[23:02:32] Take the first line with Differential Information as an example: 42064	1.940559441	0.932137556

Traceback (most recent call last):
  File "/usr/local/bin/BETA", line 5, in <module>
    pkg_resources.run_script('BETA-Package==1.0.7', 'BETA')
  File "/usr/local/lib/python2.6/dist-packages/distribute-0.6.35-py2.6.egg/pkg_resources.py", line 505, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/local/lib/python2.6/dist-packages/distribute-0.6.35-py2.6.egg/pkg_resources.py", line 1245, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.6/dist-packages/BETA_Package-1.0.7-py2.6.egg/EGG-INFO/scripts/BETA", line 193, in <module>
    main()
  File "/usr/local/lib/python2.6/dist-packages/BETA_Package-1.0.7-py2.6.egg/EGG-INFO/scripts/BETA", line 183, in main
    basicrun(argparser)
  File "/usr/local/lib/python2.6/dist-packages/BETA_Package-1.0.7-py2.6.egg/BETA/runbeta.py", line 62, in basicrun
    expre_info = check.check_expr()
  File "/usr/local/lib/python2.6/dist-packages/BETA_Package-1.0.7-py2.6.egg/BETA/fileformat_check.py", line 110, in check_expr
    fdr = second_line[int(expreinfo[2])-1]
IndexError: list index out of range
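
The traceback ends at `second_line[int(expreinfo[2])-1]`. With the column list 1,3,4 and a three-column row, the last index is 4 - 1 = 3, which is past the end of a three-element list, so an IndexError is expected. The sketch below reproduces that arithmetic with a clearer message. It assumes that `expreinfo` is the comma-split column list and `second_line` is the tab-split row; that is inferred from the traceback and not checked against the BETA source. The function name `parse_columns` is hypothetical.

```python
def parse_columns(spec, row):
    """Map a 1-based column list such as "1,2,3" onto one tab-separated row.

    Returns (gene_id, status, fdr) as strings, or raises ValueError with
    a readable message instead of a bare IndexError.
    """
    fields = row.rstrip("\r\n").split("\t")
    indexes = [int(part) - 1 for part in spec.split(",")]
    if len(indexes) != 3:
        raise ValueError("expected three column numbers, e.g. 1,2,3")
    for idx in indexes:
        if idx < 0 or idx >= len(fields):
            raise ValueError(
                "column %d requested but the row has only %d columns"
                % (idx + 1, len(fields))
            )
    gene_id, status, fdr = (fields[i] for i in indexes)
    return gene_id, status, fdr


if __name__ == "__main__":
    row = "42064\t1.940559441\t0.932137556\n"  # example row from the log
    print(parse_columns("1,2,3", row))
    try:
        parse_columns("1,3,4", row)
    except ValueError as err:
        print("error:", err)
```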

Jian Ma

May 7, 2015, 2:25:11 AM
to cist...@googlegroups.com, 王苏
Hi Matthew,

BETA can only recognize RefSeq IDs as gene names. In your BSF file, you are using Entrez IDs.

Also, if you have any blank lines at the end of your file, you should remove them.
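
Removing blank lines at the end of a file can be scripted. This is a minimal sketch; the function name `drop_trailing_blank_lines` is hypothetical, and "blank" is assumed to mean a line containing only whitespace.

```python
def drop_trailing_blank_lines(lines):
    """Return the lines with whitespace-only lines removed from the end.

    Blank lines in the middle of the file are left untouched.
    """
    kept = list(lines)
    while kept and not kept[-1].strip():
        kept.pop()
    return kept


if __name__ == "__main__":
    sample = ["A1BG\t-0.02929\t0.714881\n", "A1CF\t0\t1\n", "\n", "  \n"]
    print(drop_trailing_blank_lines(sample))
```

To clean a real file, read it with `readlines()`, pass the list through the function, and write the result to a new file with `writelines()`.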

Jian Ma

May 7, 2015, 2:29:14 AM
to cist...@googlegroups.com
Hi Charlotte,

Could you share your history with me? If you click the re-run button for your previous datasets, they should still work.

Charlotte Svensson

May 7, 2015, 5:25:15 AM
to cist...@googlegroups.com
Attached is a picture of the history. I have put refseq id in col 1, expression data in column 2 and p-value in column 3. The MACS file should not cause the problem, since it worked for analysing other things.

Thanks for your help!
history.tif

Matthew Grieshop

May 7, 2015, 12:22:55 PM
to cist...@googlegroups.com
Thanks Jian,

How would you recommend converting my file to RefSeq? The BETA tutorial says it accepts official gene symbols as well.

Matt


Matthew Grieshop

May 7, 2015, 7:19:19 PM
to cist...@googlegroups.com
Jian,

You said I am using Entrez_ID. My list of genes has 10 that are Entrez_ID, while the rest look like this:

A1BG -0.02929 0.714881
A1CF 0 1
A2LD1 0 1
A2M 0 1
A2ML1 0.126561 0.76433
A4GALT -0.06215 0.68971
A4GNT 0 1
AAAS 0.189538 1
AACS 0.518948 0.402564
AADAC 0 1
AADACL2 0 1
AADACL3 0 1
AADACL4 0 1
AADAT 0.081503 1
AAED1 0 1
AAGAB 0.413229 0.161056
AAK1 0.1695 0.629126
AAMP 0.412351 0.494467
AANAT 0 1
AARS 0.280338 0.988991

I thought it was acceptable to use the Gene Symbol.
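
A quick way to find the handful of rows whose first column is a number rather than a symbol is sketched below. The names `classify_id` and `numeric_id_rows` are hypothetical. The RefSeq pattern covers only the common NM_/NR_/XM_/XR_ accession prefixes, and a purely numeric ID is only assumed to be an Entrez-style ID; the script cannot confirm what the number actually is.

```python
import re

# Common RefSeq transcript accessions, e.g. NM_000014 or NR_046018.2.
REFSEQ = re.compile(r"^[NX][MR]_\d+(\.\d+)?$")


def classify_id(value):
    """Label an identifier as 'refseq', 'numeric', or 'symbol'."""
    if REFSEQ.match(value):
        return "refseq"
    if value.isdigit():
        return "numeric"
    return "symbol"


def numeric_id_rows(lines):
    """Yield (line_number, id) for rows whose first column is numeric.

    Columns may be separated by tabs or spaces; blank lines are skipped.
    """
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if parts and classify_id(parts[0]) == "numeric":
            yield number, parts[0]


if __name__ == "__main__":
    rows = [
        "A1BG\t-0.02929\t0.714881\n",
        "42064\t1.940559441\t0.932137556\n",
    ]
    for number, gene_id in numeric_id_rows(rows):
        print("line %d has a numeric ID: %s" % (number, gene_id))
```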

Charlotte Svensson

May 8, 2015, 3:21:19 AM
to cist...@googlegroups.com

What does this mean?


Dataset generation errors

Dataset 564: H3 across HOT_g15_u501_top20

Tool execution did not generate any error messages. The tool produced the following additional output:
--title=H3 --x_label=HOT regions less than 500bp from TSS of top 20 genes --y_label=K-means --upstream=1000 --downstream=1000 --pf-res=10 --fontsize=2 --col=053061,2166AC,4393C3,92C5DE,D1E5F0,F7F7F7,FDDBC7,F4A582,D6604D,B2182B,67001F --pic_width=1600 --pic_height=1200 --dir
null device 
          1 

Charlotte Svensson

May 8, 2015, 3:22:30 AM
to cist...@googlegroups.com
And this?

Dataset generation errors

Dataset 3: 20081014_ahringer_AB9050_H3K36ME3_N2_L3_1LM_AB9050_H3K36me3_N2_L3_1LM_9515302.ma2c_MA2Cscore.wig

The tool did not create any additional job / error info.

Report this error to the Galaxy Team

Jian Ma

May 9, 2015, 2:08:21 AM
to cist...@googlegroups.com
Hi Matthew,

Yes. I just confirmed that BETA can also take gene symbols as input.

To convert a list of gene IDs, you can try the tool "Convert between RefSeq, Gene Symbols to Entrez IDs using Bioconductor" in Cistrome.

Jian Ma

May 9, 2015, 2:18:50 AM
to cist...@googlegroups.com
Hi Charlotte,

This is system error information. For details of the exact error from the tool you are running, please click the eye icon of the log dataset that the tool generated, for example dataset 628 in your attachment.


--
best,

Jian Ma,


Charlotte Svensson

Jun 2, 2015, 3:06:22 AM
to cist...@googlegroups.com
When I click the eye icon, the window is empty.

I have re-run it and it still does not work. I don't understand what dataset 5 means. I have not put any data with H3K36ME3_N2..... into the analysis!

Thanks for your help!

Dataset generation errors

Dataset 5: H3K36ME3_N2_L3_2LM MA2Cscore

Tool execution generated the following error message:
uploaded wig file, size: 33.4 Mb

Jian Ma

Jun 3, 2015, 5:05:34 AM
to cist...@googlegroups.com
Hi Charlotte,

The BETA tool cannot run your datasets. This can be caused by bugs in the tool, wrong parameter settings, or a wrong input file format. Could you share your history with me?

As for the error message it generated, please disregard it. Sometimes this message is not related to your datasets. I need to fix it later.



--
best,

Jian Ma,


Charlotte Svensson

Jun 30, 2015, 11:04:59 AM
to cist...@googlegroups.com
Sure! 

Can I get your e-mail address? Then I can attach the file.


Otherwise you can send me an e-mail to charlotte...@sund.ku.dk

Thanks for your help!

BR
Charlotte

Syed najeeb ashraf

Jul 1, 2015, 6:09:43 AM
to cist...@googlegroups.com
Hi, I am getting exactly the same error. I used the same file format as mentioned by Matt above. It's a bit awkward. I also tried the BETA tool offline but got an error.
Thanks
Najeeb

Syed najeeb ashraf

Jul 1, 2015, 6:10:33 AM
to cist...@googlegroups.com
Please let me know if anyone here has been able to find a solution.


On Wednesday, 6 May 2015 04:34:04 UTC+3, Matthew Grieshop wrote:

Jian Ma

Jul 2, 2015, 3:54:19 AM
to cist...@googlegroups.com
Hi Syed,

Do you get the error while running BETA outside CistromeAP? You may need to ask the author, who knows BETA much better: https://groups.google.com/forum/#!forum/cistromebeta

--
best,

Jian Ma,

