I am running into a perplexing problem with ggplot that occurs with both my real dataset and a simplified dummy dataset. I am trying to plot heterozygosity across a genome, with each chromosome in a separate panel.
For simplicity, here is my fake dataset (fakedata.txt):
chr1 123000 124000 2 0.00002 26 0.00026 indiv1
chr1 124000 125000 3 0.00003 12 0.00012 indiv1
chr1 125000 126000 1 0.00001 6 0.00006 indiv1
chr1 126000 126000 2 0.00002 14 0.00014 indiv1
chr2 123000 124000 6 0.00006 20 0.00020 indiv1
chr2 124000 125000 0 0.00000 12 0.00012 indiv1
chr1 123000 124000 2 0.00002 26 0.00026 indiv2
chr1 124000 125000 3 0.00003 12 0.00012 indiv2
chr1 125000 126000 1 0.00001 6 0.00006 indiv2
chr1 126000 126000 2 0.00002 14 0.00014 indiv2
chr2 123000 124000 6 0.00006 20 0.00020 indiv2
chr2 124000 125000 0 0.00000 12 0.00012 indiv2
My code to read in the data file looks like this:
hetshoms <- read.table("fakedata.txt", header = FALSE)
chrom <- hetshoms$V1
start.pos <- hetshoms$V2
end.pos <- hetshoms$V3
hets <- hetshoms$V4
het_stat <- hetshoms$V5
homs <- hetshoms$V6
hom_stat <- hetshoms$V7
indiv <- hetshoms$V8
HetRatio <- hets/(hets+homs)
If I run qplot, I get the expected output:
testplot <- qplot(start.pos, HetRatio, facets = chrom ~ ., ylim=c(0,1))
But if I try something analogous in ggplot:
testplot <- ggplot(hetshoms, aes(x=start.pos, y=HetRatio)) + geom_point()
testplot + facet_wrap(~chrom)
I receive the error:
"Error in layout_base(data, vars, drop = drop) :
At least one layer must contain all variables used for facetting"
Can anyone explain what I have done wrong, and how to fix it?
Many thanks in advance,
Loren
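
One likely explanation, offered as a sketch rather than a confirmed diagnosis: chrom, start.pos and HetRatio are free-standing vectors in the workspace, while the data frame passed to ggplot() only has the default columns V1 to V8. aes() can fall back to the calling environment for variables it cannot find in the data, but facet_wrap() looks for chrom in the layer data, which would produce this error. The example below assumes that naming the columns in read.table() (via col.names) and storing HetRatio as a column is acceptable. It inlines a few rows of the fake dataset instead of reading fakedata.txt so that it runs on its own.

```r
library(ggplot2)

# A few rows of the fake dataset, inlined so the sketch is self-contained
# (assumption: the real file has the same eight whitespace-separated columns).
fake <- "chr1 123000 124000 2 0.00002 26 0.00026 indiv1
chr1 124000 125000 3 0.00003 12 0.00012 indiv1
chr2 123000 124000 6 0.00006 20 0.00020 indiv1
chr2 124000 125000 0 0.00000 12 0.00012 indiv1"

# Name the columns at read time instead of copying them into separate vectors.
hetshoms <- read.table(
  text = fake, header = FALSE,
  col.names = c("chrom", "start.pos", "end.pos", "hets",
                "het_stat", "homs", "hom_stat", "indiv")
)

# Store the ratio as a column, so every plotted variable lives in the data frame.
hetshoms$HetRatio <- hetshoms$hets / (hetshoms$hets + hetshoms$homs)

# facet_wrap() can now find chrom in the layer data.
testplot <- ggplot(hetshoms, aes(x = start.pos, y = HetRatio)) +
  geom_point() +
  ylim(0, 1) +
  facet_wrap(~ chrom)
testplot
```

If the original unnamed columns need to stay as they are, facet_wrap(~ V1) should work for the same reason, since V1 is a column of hetshoms.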