Error in quantile.default(sds, sd.threshold)

dr.zh...@gmail.com

Mar 18, 2015, 5:36:39 AM
to methylkit_...@googlegroups.com
When I run clustering and PCA, I get the following error: Error in quantile.default(sds, sd.threshold): missing values and NaN's not allowed if 'na.rm' is FALSE. Can anybody help me?
Thank you

Altuna Akalin

Mar 18, 2015, 3:07:30 PM
to methylkit_...@googlegroups.com
It could be that there are NaN or NA values in your data set. If you are using the min.per.group argument of the unite() function just for PCA purposes, try not using it.
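
For reference, the error message itself comes from base R's quantile(), which refuses missing values unless na.rm = TRUE. Below is a minimal base-R sketch of that behaviour. The values in `sds` are made up for illustration and only mimic per-site standard deviations; they are not taken from methylKit.

```r
# Toy standard deviations, one per site; the NaN stands in for a site
# whose percent methylation could not be computed in some sample.
sds <- c(0.8, 2.1, NaN, 1.4)
sd.threshold <- 0.5

# quantile() stops on NA/NaN input unless na.rm = TRUE.
msg <- tryCatch(
  quantile(sds, sd.threshold),
  error = function(e) conditionMessage(e)
)
print(msg)

# With the missing value dropped, the call succeeds.
cutoff <- quantile(sds, sd.threshold, na.rm = TRUE)
print(cutoff)
```

So a single NaN or NA among the values passed to quantile() is enough to trigger the message.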

--
You received this message because you are subscribed to the Google Groups "methylkit_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to methylkit_discus...@googlegroups.com.
To post to this group, send email to methylkit_...@googlegroups.com.
Visit this group at http://groups.google.com/group/methylkit_discussion.
For more options, visit https://groups.google.com/d/optout.

dr.zh...@gmail.com

Mar 19, 2015, 3:11:54 AM
to methylkit_...@googlegroups.com
Hello Altuna,

Noted with many thanks.

I have tried your suggestion. However, the same errors still occurred when I calculated the correlation among samples and ran the clustering. I wonder whether NA or NaN values need to be removed or ignored when the data are merged. Which command would you suggest?

Zhuang

Altuna Akalin

Mar 19, 2015, 4:01:00 PM
to methylkit_...@googlegroups.com
NaNs could have been produced by data that is not in the right format. For example, you might have bases with 0 coverage, which shouldn't be in your dataset in the first place. It is hard to tell without looking at a sample of the data that reproduces the problem. You need to find out why and where the NaNs are generated. I would look at the output of the percMethylation() function, locate which CpGs in which samples produce NaNs, and look at the methylBase object for those CpGs to see if there is anything out of the ordinary.
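
The lookup described above could be sketched roughly as follows. This is a base-R illustration on a toy matrix: `perc` stands in for the output of percMethylation() on a united methylBase object, and all names and values here are assumptions made for the example.

```r
# Toy percent-methylation matrix: rows are CpGs, columns are samples.
# With methylKit this would come from percMethylation(meth), where `meth`
# is your united methylBase object (assumed name).
perc <- matrix(
  c(80, 75, 90,
    10, NaN, 15,
    55, 60, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cpg1", "cpg2", "cpg3"), c("s1", "s2", "s3"))
)

# is.na() is TRUE for both NA and NaN; arr.ind = TRUE returns row/column positions.
bad <- which(is.na(perc), arr.ind = TRUE)
print(bad)

# Row indices of the affected CpGs, to inspect in the methylBase object.
bad_rows <- sort(unique(bad[, "row"]))
print(bad_rows)

# A base with zero coverage gives 0/0, which is NaN in R.
print(100 * 0 / 0)
```

The row indices in `bad_rows` can then be used to look at the same rows of the methylBase object and check the coverage and count columns for those sites.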

claradomi...@gmail.com

May 21, 2020, 6:59:03 AM
to methylkit_discussion
Hi, I am getting an error similar to the one reported by dr.zh. In my case, I tiled the methylation data without using min.per.group. I then converted it to a GRanges object, used it with other software to extract the regions I want to analyse further, and am now trying to do PCA on those regions of interest, which gives the same error. I have made sure I do not have NAs by converting them to "0". Could it be that the PCA does not work at a resolution other than CpG? Or could it be the metadata of my methylBase object?

PCA<-PCASamples(unique_meth)
Error in quantile.default(sds, sd.threshold) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE


> str(unique_meth)
'data.frame': 7255 obs. of  241 variables:
Formal class 'methylBase' [package "methylKit"] with 13 slots
  ..@ .Data         :List of 241
  .. ..$ : chr  "1" "1" "1" "1" ...
  .. ..$ : int  1522001 2015001 2099001 2181001 2191001 2506001 2834001 3006001 4844001 5445001 ...
  .. ..$ : int  1522500 2015500 2099500 2181500 2191500 2506500 2834500 3006500 4844500 5445500 ...
  .. ..$ : chr  "*" "*" "*" "*" ...
  .. ..$ : int  474 496 315 389 405 389 444 271 297 351 ...
  .. ..$ : int  175 423 264 340 362 323 263 200 230 175 ...
  .. ..$ : int  299 73 51 49 43 66 181 71 67 176 ...
  .. ..$ : int  430 541 306 308 409 384 353 315 257 411 ...

###
  .. ..$ : int  157 336 333 262 391 213 247 261 223 168 ...
  .. .. [list output truncated]
  ..@ sample.ids    : chr 
  ..@ assembly      : chr 
  ..@ context       : chr 
  ..@ treatment     : num 
  ..@ coverage.index: num 
  ..@ numCs.index   : num 
  ..@ numTs.index   : num 
  ..@ destranded    : logi 
  ..@ resolution    : chr 
  ..@ names         : chr  "seqnames" "start" "end" "strand" ...
  ..@ row.names     : int  1 2 3 4 5 6 7 8 9 10 ...
  ..@ .S3Class      : chr "data.frame"


I would appreciate any idea on how to fix this problem. 
Thanks!

Altuna Akalin

May 21, 2020, 10:11:38 AM
to methylkit_...@googlegroups.com
I need a reproducible example.

--
Sent from mobile, excuse the brevity

Clara Domingo Sabugo

Jun 25, 2020, 8:13:54 AM
to methylkit_discussion
Hi Altuna,
Thanks for your reply.
Basically I want to get a methylBase object to perform PCA on selected genomic regions. 
So after filtering by coverage and normalising, I used unite without min.per.group and tileMethylCounts to get genomic regions, since my input methylation data is at CpG resolution (whole-genome data). Then I used getData to get a data frame so that I could subset my methylation data by chromosome, and converted it back to a methylBase object to use with either regionCounts or selectByOverlap.

tiles <- tileMethylCounts(filtered_obj_all_normalized,win.size=500,cov.bases = 10)
united_tiles_no_min=unite(tiles, destrand=FALSE)
united_tiles_no_min_df<- getData(united_tiles_no_min)
#Adding "chr" for creating a Granges object.
united_tiles_no_min_df$chr<- sub("^", "chr", united_tiles_no_min_df$chr)
#Filtering regions
selected_united_tiles_df<-filter(united_tiles_df, chr=="chr1")
#Annotation of my regions with Annotatr package
new_meth<- as(annotated_selected_united_tiles_df, "methylBase")
#PCA
PCA<-PCASamples(new_meth)
not working...
Error in quantile.default(sds, sd.threshold) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE

I need a compatible way to do PCA on specific genomic regions.
What is the best approach? I think converting the methylKit object to a data frame and then back to methylBase could somehow be corrupting the object.

I also tried selectByOverlap with the attached subset and got an error as well.
I have tried annotateWithGenicParts, but it does not work either; I get the message: could not find function "annotateWithGenicParts"

promoters_overlap<-selectByOverlap(methylation_subset, promoters)
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
2: In max(i) : no non-missing arguments to max; returning -Inf
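
The first warning says the two objects share no chromosome names. One common cause is that one object uses names like "1" and the other "chr1"; whether that applies to this data is an assumption and is not confirmed in the thread. A base-R sketch of the check, using made-up name vectors:

```r
# Toy chromosome names; in practice compare the chromosome column of the
# methylation object with the chromosome names of the regions you overlap against.
meth_chr   <- c("1", "1", "2", "X")
region_chr <- c("chr1", "chr2", "chrX")

# No shared names means no overlap can be found.
shared <- intersect(unique(meth_chr), unique(region_chr))
print(length(shared))

# One possible fix (assumption: the only difference is the "chr" prefix).
meth_chr_fixed <- paste0("chr", meth_chr)
shared_fixed <- intersect(unique(meth_chr_fixed), unique(region_chr))
print(shared_fixed)
```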


Any help with this would be much appreciated! I have tried many things but I may be missing something.

Thanks!
Clara
methylation_subset.R

Altuna Akalin

Jun 25, 2020, 10:59:54 AM
to methylkit_...@googlegroups.com
The problem is the following step:
new_meth<- as(annotated_selected_united_tiles_df, "methylBase")

When you construct the methylKit object like that, extra information such as the treatment vector is missing. You can subset methylKit objects with [] notation, as you would subset data frames. Have a look at the vignette and the manual.
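
A rough sketch of the [] approach, assuming the example data set shipped with methylKit can be loaded with data(methylKit) and provides `methylBase.obj`. The names `chrom`, `rows` and `sub_meth` are hypothetical and only used for illustration.

```r
# Sketch only: requires the Bioconductor package methylKit.
if (requireNamespace("methylKit", quietly = TRUE)) {
  library(methylKit)
  data(methylKit)  # example data from the package; provides methylBase.obj

  # Pick rows using the coordinate columns, then subset the object itself
  # instead of converting to a data frame and back.
  chrom <- getData(methylBase.obj)$chr[1]
  rows <- which(getData(methylBase.obj)$chr == chrom)
  sub_meth <- methylBase.obj[rows, ]

  # Sample IDs and treatment stay attached to the subset.
  print(sub_meth@sample.ids)
  print(sub_meth@treatment)

  PCASamples(sub_meth)
}
```

Because the subset is still a complete methylBase object, the sample.ids and treatment slots are filled, unlike the empty slots in the str() output posted earlier in the thread.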

Best,
Altuna

Clara Domingo Sabugo

Jun 27, 2020, 9:14:05 AM
to methylkit_...@googlegroups.com
Right, thanks! I suspected that the metadata was missing, as shown in my first message.
As an aside, I am using regionCounts to select regions of interest. Is there any way to do an inverse overlap, to get the regions that are not present in the given regions?

Thanks,
Clara


Altuna Akalin

Jun 27, 2020, 3:00:44 PM
to methylkit_...@googlegroups.com
If you have a logical vector with the overlap info, simply put ! in front of the vector to reverse the TRUE/FALSE values:

!logic.vec 
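
As a small base-R illustration of the negation (the region names and the contents of `logic.vec` are made-up stand-ins):

```r
# Toy data: region names and a logical vector marking which ones overlap
# the regions of interest.
regions   <- c("tile1", "tile2", "tile3", "tile4")
logic.vec <- c(TRUE, FALSE, TRUE, FALSE)

# Regions that overlap, and the inverse selection.
overlapping     <- regions[logic.vec]
non_overlapping <- regions[!logic.vec]
print(non_overlapping)
```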


Best
Altuna 
