how to remove the NaNs

kombo kai

Jan 16, 2012, 3:58:54 AM

to ggp...@googlegroups.com

Hi

I have a problem removing the NaNs in my data. How can I get rid of the NaNs? My data is in a CSV file (test.csv).

rgds

kaikombo

Brandon Hurr

Jan 16, 2012, 4:12:13 AM

to kombo kai, ggp...@googlegroups.com

This isn't terribly ggplot related, but I'll play along.

I found this example to remove rows of data that have NaN in them:

http://stackoverflow.com/questions/5961839/remove-row-with-nan-value

Generally speaking, ggplot2 will ignore NA values in the data set, but I'm unsure of its behavior with NaN. I would assume that it is the same. If that is the case you shouldn't worry about having NA/NaN values since they will be ignored.
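On the NA versus NaN question, base R itself treats NaN as a kind of missing value: is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN. A small sketch with a made-up vector (the values are illustrative and not taken from test.csv):

```r
# Made-up vector containing ordinary values, an NA, and a NaN
x <- c(1.5, NA, NaN, 4)

is.na(x)    # flags both the NA and the NaN
is.nan(x)   # flags only the NaN

# Dropping everything that is.na() flags removes both kinds of missing value
x_clean <- x[!is.na(x)]
x_clean
```

So any code that filters on is.na() should treat the two the same way, which is consistent with the assumption above.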

Are you having trouble with a plot where it is respecting NaN values in some way? Perhaps as a level of a factor?

Please post a reproducible example with sample data if this is the case.

Brandon

--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442

To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

Brandon Hurr

Jan 16, 2012, 9:37:01 AM

to kombo kai, ggplot2

Kaik,

It looks like your NaNs are coming from taking the log of a negative value.

> log(0)
[1] -Inf
> log(1)
[1] 0
> log(-1)
[1] NaN
Warning message:
In log(-1) : NaNs produced

You didn't include a dataset with your graphing code so I can't say for certain, but I think it is removing the NaNs when it is plotting, which sounds like the behavior that you want.
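To check where the NaNs would come from before plotting, one option is to look for non-positive values in the column being logged. This is only a sketch: the vector below is made up and merely stands in for a column such as Eh3.

```r
# Made-up values standing in for a column such as Eh3
eh3 <- c(2.5, 0, -1.2, 7.1, -0.4)

# log() returns NaN for negative input and -Inf for zero
neg_idx  <- which(eh3 < 0)    # positions that would become NaN
zero_idx <- which(eh3 == 0)   # positions that would become -Inf

# Taking the log of only the strictly positive values raises no warning
log_eh3 <- log(eh3[eh3 > 0])
log_eh3
```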

Someone correct me if I'm wrong here. I don't often have log() scale data.

Brandon

On Mon, Jan 16, 2012 at 10:14, kombo kai <kaik...@yahoo.com> wrote:

Thanks, Brandon, for the reassurance. Here is my reproducible example:

library(ggplot2)
######## Reading the data frame from the file
test.csv <- read.table("E:/xlf/Book2ex.csv", header = TRUE, sep = ",")
### Checking the data
print(test.csv)
### Getting the dimension names (row and column names) of the data
dimnames(test.csv)
### Getting the dimensions of the data (rows and columns)
dim(test.csv)
### Plotting the curves using the qplot command
qplot(lndbh, dbh.2, data = test.csv)

qplot(log(Est.h4), log(Est.h2), data=test.csv)
## to put the heading and the x and y labels
qplot(log(Est.h4), log(Est.h2), data=test.csv, main="graph of logesth vs logmeasuredh", xlab="logestmatedh", ylab="logmeasuredh",colour = "color")

qplot(lndbh, log(Eh3), data = test.csv, shape = cut, colour = col, main = "lndbh vs log(Eh3)")
### Adding smoothing to the curves using the geom argument
options(error = recover)  # debugging aid; set on its own line, not as a qplot argument
qplot(lndbh, log(Eh3), data = test.csv, shape = cut, colour = col, geom = c("smooth", "point"), main = "lndbh vs log(Eh3)")

### Same plot, keeping only the rows where log(Eh3) is defined
qplot(lndbh, log(Eh3), data = subset(test.csv, Eh3 > 0), shape = cut, colour = col, geom = c("smooth", "point"), main = "lndbh vs log(Eh3)")

The warning messages I get:

1: In log(Eh3) : NaNs produced
2: In log(Eh3) : NaNs produced
3: In log(Eh3) : NaNs produced
4: Removed 5 rows containing missing values (stat_smooth).
5: Removed 5 rows containing missing values (geom_point).
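These messages are warnings, so the plots are still drawn with the affected rows left out. If the warnings themselves are unwanted, one approach is to filter the data frame before plotting. The sketch below uses a hypothetical stand-in for test.csv: the column names lndbh and Eh3 come from the post, but the values are invented.

```r
# Hypothetical stand-in for test.csv; column names from the post,
# values invented for illustration
test_df <- data.frame(
  lndbh = c(2.1, 2.4, 2.9, 3.3, 3.6),
  Eh3   = c(5.2, -0.8, 9.7, 0, 14.1)
)

# Keep only rows where log(Eh3) is defined and finite
plot_df <- subset(test_df, Eh3 > 0)

# Store the logged value as its own column so the plot call stays simple
plot_df$logEh3 <- log(plot_df$Eh3)

# With ggplot2 loaded, the plot could then be drawn as, for example:
# qplot(lndbh, logEh3, data = plot_df, geom = c("smooth", "point"))
```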

rgd

kaik

Thomas Parr

Jan 16, 2012, 11:21:53 AM

to ggplot2

Posting an example data structure would be helpful. As previous
posters have mentioned, many functions in R handle missing data by
dropping it automatically. In general, your first stop for this type
of question (data manipulation) should be the R-help mailing list; you
can easily search it by typing "R CRAN" and then key words into
Google.
Below is a link discussing this very problem (I think their solution
is a little cumbersome):
https://stat.ethz.ch/pipermail/r-help/2003-October/040997.html (just
click the link to see the text and click "next message" to view
responses)

# Here are my suggestions:
#### Create a matrix
X <- matrix(rpois(20, 1.5), nrow = 4)
# Insert a NaN value
X[2, 2] <- NaN
X

#### Select all rows which do not contain a NaN value
# Least favorite
Y <- na.omit(X)

# Works but is a bit clunky
Y <- subset(X, complete.cases(X))

# I like using matrix indexing and I think this is the simplest way to do this...
Y <- X[complete.cases(X), ]

# ...mostly because omitting columns instead only needs the index moved to the
# other side of the comma. complete.cases() tests rows, so transpose X first.
Y <- X[, complete.cases(t(X))]
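Note that complete.cases() returns one TRUE/FALSE per row, so dropping columns needs a per-column test. A minimal sketch of both directions, using a small fixed matrix (made up here so the result is reproducible) and colSums(is.na()) for the column test:

```r
# Small fixed matrix (filled column by column); NaN sits at row 2, column 2
X <- matrix(c(1, 2, 3, 4,
              5, NaN, 7, 8,
              9, 10, 11, 12), nrow = 4)

keep_rows <- complete.cases(X)        # one TRUE/FALSE per row
keep_cols <- colSums(is.na(X)) == 0   # one TRUE/FALSE per column

# drop = FALSE keeps the result a matrix even if one row or column remains
rows_only <- X[keep_rows, , drop = FALSE]   # removes row 2
cols_only <- X[, keep_cols, drop = FALSE]   # removes column 2
```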

-Thomas
