How to transform 'chromosome start end' to 'band' in batch using biomaRt?

9 views
Skip to first unread message

Yufang Li

unread,
May 28, 2019, 8:43:18 AM5/28/19
to biomart-users
Hi,
I want to transform  'chromosome start end' to  'band' in batch using biomaRt now. But I met some questions. Could you help me ?
I refer to the examples on the user manual. 

getBM(attributes = c('affy_hg_u133_plus_2','ensembl_gene_id'), filters = c('chromosome_name','start','end'), values = list(16,1100000,1250000), mart = ensembl)


But this example can not transform in  batch. There are 328  items needed to annotate.
My data structure:
    seqnames     start       end width strand no.cpgs    minfdr  Stouffer  maxbetafc  meanbetafc
1        16  54971700  54973047  1348      *       5  0.00e+00  0.00e+00  0.3311636  0.28623709
2        14  36992001  36993517  1517      *       4 2.93e-235 2.50e-278  0.3493025  0.32264970
3         6  31650760  31651158   399      *       7 8.20e-288 3.14e-256  0.2683500  0.23795246
4        17  79004850  79005386   537      *       3 7.45e-249 1.69e-202  0.3201577  0.29638743
5        10 134150489 134150690   202      *       4 1.88e-232 1.34e-197  0.2779639  0.26302729
6         3 169384410 169384648   239      *       3 1.50e-234 7.84e-186  0.3797378  0.30127753
7         1 151810485 151811364   880      *       6 2.08e-188 1.18e-164  0.2746188  0.23411387
8         7   5271516   5271655   140      *       2 2.54e-212 7.38e-163  0.3677478  0.34913817
9        16  54316049  54317413  1365      *       4 8.52e-184 2.45e-162  0.3459313  0.26682181
10        5  37834958  37836246  1289      *       5 1.35e-178 8.61e-155  0.2931416  0.21486234
11        7  36764019  36764140   122      *       2 9.41e-197 2.13e-148 -0.3248119 -0.28964118
12       12  16762789  16762831    43      *       2 2.69e-173 8.30e-137  0.3245738  0.31208577
13       16  68676557  68676806   250      *       2 4.21e-156 9.80e-128  0.2981041  0.28468263
14        1 247712237 247712239     3      *       2 1.69e-154 1.43e-126 -0.2896461 -0.28435472
15        5   2747922   2748369   448      *       2 7.63e-136 2.13e-113  0.2947875  0.28194606
16       11  86383696  86383761    66      *       2 3.24e-130 1.30e-109  0.2938009  0.26314046
17       16   4714794   4714815    22      *       2 2.37e-129 3.62e-109  0.2962017  0.28677182
18       19  58220295  58220818   524      *       8 3.68e-112 3.14e-108  0.1722510  0.15602547
19        1 147782116 147782452   337      *       2 7.91e-128 3.76e-108  0.2651204  0.26211464
20       19  13135662  13135808   147      *       2 1.46e-127 7.33e-108  0.2977374  0.27440837
21       10  77155376  77156421  1046      *       4  9.50e-92 3.01e-105  0.2826025  0.23205180
22       12  30976001  30976016    16      *       2 1.95e-122 2.96e-104  0.2653739  0.25972562
23       15  89951787  89953295  1509      *       4  1.25e-87 1.92e-102  0.1948644  0.17701206
24       12  58021295  58021917   623      *       4 1.00e-110 3.62e-100  0.2756699  0.20133960
25       22  30476206  30476452   247      *       3 4.68e-110  1.28e-98  0.2147699  0.18921464
26       12  54408664  54409599   936      *       7 5.71e-104  5.27e-98  0.1720937  0.14245105
27        6  10416007  10416508   502      *       4 2.52e-102  1.98e-95 -0.2120803 -0.17534331
28        7  96650096  96650192    97      *       3 7.07e-113  3.65e-95  0.2560395  0.21009693
29        7  96642096  96643508  1413      *       4 1.46e-105  3.09e-92  0.2656333  0.18161992
30        3 156324038 156324118    81      *       2 7.00e-106  4.97e-92  0.2429422  0.23746324
31       12  47219737  47219920   184      *       5  6.61e-95  1.09e-90  0.2079795  0.17331125
32        2 172953134 172953925   792      *       4 7.05e-115  2.85e-89  0.2600936  0.18543391
33        6  10420798  10422322  1525      *       5  1.04e-82  1.48e-87  0.2163537  0.16366814
34        6   7540834   7541066   233      *       3  2.00e-95  4.06e-87  0.2151257  0.21056723
35        2 175193377 175193551   175      *       2 4.88e-107  9.56e-87  0.2864994  0.26058300
36        1 226187852 226187876    25      *       2  4.13e-95  1.98e-83  0.2609258  0.24550333
37        5   2754753   2755540   788      *       2  4.75e-74  6.31e-82  0.3180281  0.27040688
38       15  89949617  89949776   160      *       2  3.59e-92  1.91e-81  0.2766399  0.24868424

 And I try to write my code.

DMRS_Band=getBM(attributes = c("chromosome_name", "start_position","end_position", "band"), filters = c('chromosome_name','start','end'), values = list(chromosome_name=Methy_5_DMRs_S1_S3_annotation$seqnames,start=Methy_5_DMRs_S1_S3_annotation$start,end=Methy_5_DMRs_S1_S3_annotation$end), mart = mart)

 
But the annotation results return 55512 records. 
I think the result should be return 328 records in theory. But there are 55512 records returned. So think there may be error in my code.
This follows an example.
    chromosome_name start_position end_position band 1 1 16873708 16873851 p36.13 2 1 12507246 12507397 p36.21 3 1 202070841 202070943 q32.1 4 1 45592722 45592814 p34.1 5 1 23370254 23370346 p36.12 6 1 12166943 12167038 p36.22 7 1 167998660 167998726 q24.2 8 1 32086949 32087007 p35.2 9 1 171101728 171101806 q24.3 10 1 28648600 28648733 p35.3 11 1 62078786 62078859 p31.3 12 1 220117853 220117962 q41 13 1 148334479 148334554 q21.2

I want to solve this problem. So  I write a loop to annotate one by one. But there are another error. We can not call biomaRt API for many times. Otherwise there will be error.
My code:
DMRs_Final<-data.frame(seqnames=c(NA), start=c(NA), end=c(NA), width=c(NA), strand=c(NA), no.cpgs=c(NA), minfdr=c(NA), Stouffer=c(NA), maxbetafc=c(NA), meanbetafc=c(NA), overlapping.promoters=c(NA),chromosome_name=c(NA), start_position=c(NA), end_position=c(NA), band=c(NA))

for(i in 1:nrow(Methy_5_DMRs_S1_S3_annotation)){ DMRs_original=Methy_5_DMRs_S1_S3_annotation[i,] DMRS_Band=getBM(attributes = c("chromosome_name", "start_position","end_position", "band"), filters = c('chromosome_name','start','end'), values = list(DMRs_original$seqnames,DMRs_original$start,DMRs_original$end), mart = mart) if(nrow(DMRS_Band)==0){ i=i+1 }else{ DMRs_merge=cbind(DMRs_original,DMRS_Band) DMRs_Final=rbind(DMRs_Final,DMRs_merge) } }

There follows the error.

Error in getBM(attributes = c("chromosome_name", "start_position", "end_position",  : 
  The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1. 
Please report this on the support site at http://support.bioconductor.org


So, could you tell me to how to transform  'chromosome start end' to  'band' in batch using biomaRt?
Thank you very much.
Reply all
Reply to author
Forward
0 new messages