How to remove the NaNs


kombo kai

Jan 16, 2012, 3:58:54 AM
to ggp...@googlegroups.com
Hi,
I have a problem removing the NaNs in my data. How can I get rid of the NaNs? My data is in CSV format (test.csv).
Regards,
kaikombo

Brandon Hurr

Jan 16, 2012, 4:12:13 AM
to kombo kai, ggp...@googlegroups.com
This isn't terribly ggplot related, but I'll play along. 

I found this example to remove rows of data that have NaN in them:

Generally speaking, ggplot2 will ignore NA values in the data set, but I'm unsure of its behavior with NaN. I would assume that it is the same.  If that is the case you shouldn't worry about having NA/NaN values since they will be ignored. 
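A quick way to check how base R itself classifies these values, as a minimal sketch: is.na() returns TRUE for NaN as well as for NA, so anything that filters on is.na() treats the two alike, while is.nan() picks out only NaN. The vector `vals` is made up for illustration.

```r
# How base R classifies missing-value markers.
# `vals` is a made-up example vector, not data from this thread.
vals <- c(1, NA, NaN, Inf)

na_flags  <- is.na(vals)      # TRUE for both NA and NaN
nan_flags <- is.nan(vals)     # TRUE only for NaN
fin_flags <- is.finite(vals)  # FALSE for NA, NaN, and Inf

print(data.frame(vals, na_flags, nan_flags, fin_flags))
```

Note that Inf and -Inf are neither NA nor NaN, so is.finite() is the check to use if those should be excluded as well.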

Are you having trouble with a plot where it is respecting NaN values in some way? Perhaps as a level of a factor? 

Please post a reproducible example with sample data if this is the case. 

Brandon

--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442
 
To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

Brandon Hurr

Jan 16, 2012, 9:37:01 AM
to kombo kai, ggplot2
Kaik,

It looks like your NaNs are coming from taking the log of a negative value. 
> log(0)
[1] -Inf
> log(1)
[1] 0
> log(-1)
[1] NaN
Warning message:
In log(-1) : NaNs produced

You didn't include a dataset with your graphing code so I can't say for certain, but I think it is removing the NaNs when it is plotting, which sounds like the behavior that you want. 
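If you would rather drop the offending rows yourself than rely on the plot removing them, one option is to filter before taking the log. This is only a sketch: the column names lndbh and Eh3 come from the code quoted below, but the values in `dat` are invented, since no data set was posted.

```r
# Made-up stand-in for the real data; only the column names
# (lndbh, Eh3) come from this thread.
dat <- data.frame(
  lndbh = c(1.2, 1.8, 2.1, 2.6, 3.0),
  Eh3   = c(4.5, -0.3, 7.1, 0, 9.8)
)

# Keep only rows where log(Eh3) is defined and finite (Eh3 > 0).
# subset() treats an NA condition as FALSE, so rows where Eh3 is
# NA or NaN are dropped as well.
dat_pos <- subset(dat, Eh3 > 0)
dat_pos$logEh3 <- log(dat_pos$Eh3)

nrow(dat_pos)
```

The filtered data frame can then be plotted without the "NaNs produced" warnings, for example with qplot(lndbh, logEh3, data = dat_pos). Whether dropping non-positive values is appropriate depends on what they mean in your data.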

Someone correct me if I'm wrong here. I don't often have log() scale data. 

Brandon

On Mon, Jan 16, 2012 at 10:14, kombo kai <kaik...@yahoo.com> wrote:
Thanks, Brandon, for easing my worries. This is my reproducible example:
 library(ggplot2)
   ## read the data frame from the file
   test.csv <- read.table("E:/xlf/Book2ex.csv", header = TRUE, sep = ",")
   ## check the data
   print(test.csv)
   ## get the row and column names of the data
   dimnames(test.csv)
   ## get the dimensions of the data (rows and columns)
   dim(test.csv)
   ## plot the curves using the qplot command
   qplot(lndbh, dbh.2, data = test.csv)
 
 
  qplot(log(Est.h4), log(Est.h2), data = test.csv)
  ## add the title and the x and y labels
  qplot(log(Est.h4), log(Est.h2), data = test.csv, main = "graph of logesth vs logmeasuredh", xlab = "logestimatedh", ylab = "logmeasuredh", colour = "color")

  qplot(lndbh, log(Eh3), data = test.csv, shape = cut, colour = col, main = "lndbh vs log(Eh3)")
  ## add smoothing to the curves using the geom argument
  options(error = recover)
  qplot(lndbh, log(Eh3), data = test.csv, shape = cut, colour = col, geom = c("smooth", "point"), main = "lndbh vs log(Eh3)")

  ## attempt to replace the NaNs from log(Eh3) with 0 before plotting
  test.csv$logEh3 <- log(test.csv$Eh3)
  test.csv$logEh3[is.na(test.csv$logEh3)] <- 0
  qplot(lndbh, logEh3, data = test.csv, shape = cut, colour = col, geom = c("smooth", "point"), main = "lndbh vs log(Eh3)")
 The messages I get:
1: In log(Eh3) : NaNs produced
2: In log(Eh3) : NaNs produced
3: In log(Eh3) : NaNs produced
4: Removed 5 rows containing missing values (stat_smooth).
5: Removed 5 rows containing missing values (geom_point).
Regards,
kaik

Thomas Parr

Jan 16, 2012, 11:21:53 AM
to ggplot2
Posting an example data structure would be helpful. As previous
posters have mentioned, many functions in R handle missing data by
dropping it automatically. In general, your first stop for this type
of question (data manipulation) should be the R help listserv. You
can easily find it by typing "R Cran" followed by keywords into
Google.
Below is a link discussing this very problem (I think their solution
is a little cumbersome):
https://stat.ethz.ch/pipermail/r-help/2003-October/040997.html (just
click the link to see the text and click "next message" to view
responses)

# Here are my suggestions:
#### create a matrix
X <- matrix(rpois(20, 1.5), nrow = 4)
# insert a NaN value
X[2, 2] <- NaN
X

#### Select all rows which do not contain a NaN value
# Least favorite
Y <- na.omit(X)

# Works but is a bit clunky
Y <- subset(X, complete.cases(X))

# I like using matrix indexing and I think this is the simplest way to
# do this...
Y <- X[complete.cases(X), ]

# mostly because if you want to switch to omitting columns you simply
# move the comma (and transpose, since complete.cases() checks rows).
Y <- X[, complete.cases(t(X))]
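
The same indexing idea carries over to a data frame, which is what read.table() returns for a CSV file. A small sketch under that assumption, with a made-up data frame `df` in place of the real file:

```r
# Made-up data frame standing in for the contents of a CSV file.
df <- data.frame(
  a = c(1, 2, NaN, 4),
  b = c(0.5, NA, 1.5, 2.5),
  c = c("w", "x", "y", "z")
)

# complete.cases() flags rows with no NA or NaN in any column.
df_rows <- df[complete.cases(df), ]

# Columns with no NA or NaN: count the missing values per column.
# drop = FALSE keeps the result a data frame even with one column.
df_cols <- df[, colSums(is.na(df)) == 0, drop = FALSE]

df_rows
df_cols
```

For columns, colSums(is.na(df)) == 0 avoids transposing a data frame with mixed column types.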

-Thomas