GFF3/Bio::DB::SeqFeature::Store headache....

James Abbott

Dec 11, 2012, 6:00:12 AM
to he...@gmod.org
Hello,

I have what is probably an 'inability to see wood from trees' problem
creating a Bio::DB::SeqFeature::Store driven gbrowse instance from a
gff3 file. This is something I've done plenty of times before but for
some reason I just get 'landmark not found' errors when trying to access
any contigs via gbrowse (1.70) and can't for the life of me see what's
wrong. I'm aware this normally results from a gff format problem, so
I've validated the gff (which was fine).

The top of my gff looks like:

##gff-version 3
contig_000001 BluGen contig 1 1312 . . . ID=contig_000001;Name=contig_000001
contig_000002 BluGen contig 1 1067 . . . ID=contig_000002;Name=contig_000002
contig_000003 BluGen contig 1 2044 . . . ID=contig_000003;Name=contig_000003
contig_000004 BluGen contig 1 15746 . . . ID=contig_000004;Name=contig_000004

So ID and Name match for contigs. I've read that case sensitivity is an
issue, so I've tried both 'name' and 'Name', but it made no difference.

The genes/mRNA/CDS features in the gff look like:

contig_000586 BluGen gene 12441 13894 . - . ID=bgh00001;Name="Aquaporin 1";Ontology_term="GO:0006810","GO:0016020","GO:0005215"
contig_000586 BluGen mRNA 12441 12791 . - . ID=bgh00001m1;Parent=bgh00001;Name=bgh00001m1
contig_000586 BluGen mRNA 12845 13063 . - . ID=bgh00001m2;Parent=bgh00001;Name=bgh00001m2
contig_000586 BluGen mRNA 13113 13413 . - . ID=bgh00001m3;Parent=bgh00001;Name=bgh00001m3
contig_000586 BluGen mRNA 13460 13894 . - . ID=bgh00001m4;Parent=bgh00001;Name=bgh00001m4
contig_000586 BluGen CDS 12745 12791 . - 0 Parent=bgh00001m1
contig_000586 BluGen CDS 12845 13063 . - 0 Parent=bgh00001m2
contig_000586 BluGen CDS 13113 13413 . - 0 Parent=bgh00001m3
contig_000586 BluGen CDS 13460 13768 . - 0 Parent=bgh00001m4
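
One thing that may be worth checking in the gene line above is how its attribute column is tokenised. Under the GFF3 specification, attributes are separated by ';', tag and value by '=', multiple values by ',', and double quotes have no special meaning, so a parser following the spec literally keeps them as part of the value. The following is a minimal pure-Perl sketch of that splitting. It is not the BioPerl loader's actual code, and parse_gff3_attributes is a hypothetical helper name; it only illustrates what a literal reading of the line yields.

```perl
use strict;
use warnings;

# Hypothetical helper: split a GFF3 column-9 string into tag => [values].
# Applies the GFF3 rules literally: ';' separates attributes, '=' separates
# tag from value, ',' separates multiple values, and %XX escapes are decoded.
# Double quotes are NOT treated specially, so they stay in the value.
sub parse_gff3_attributes {
    my ($col9) = @_;
    my %attr;
    for my $pair (split /;/, $col9) {
        next unless length $pair;
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;          # skip malformed "tag" without "="
        my @values = split /,/, $value;
        s/%([0-9A-Fa-f]{2})/chr hex $1/ge for @values;
        $attr{$tag} = \@values;
    }
    return \%attr;
}

# Attribute column copied from the gene feature shown above.
my $attr = parse_gff3_attributes(
    'ID=bgh00001;Name="Aquaporin 1";Ontology_term="GO:0006810","GO:0016020","GO:0005215"'
);
for my $tag (sort keys %$attr) {
    print "$tag: ", join(' | ', @{ $attr->{$tag} }), "\n";
}
```

Read this way, the gene's Name value keeps its surrounding double quotes, so a search for the unquoted name would not necessarily match it. Whether the loader in use behaves the same way is an assumption to verify, not something this sketch establishes.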

Once the data is loaded, the following test script

> my @ids = $db->seq_ids();
> foreach my $id (@ids) {
> print $id, "\n";
> }
> my @types = $db->types();
> print "\nTypes: \n";
> foreach my $type (@types) {
> print $type, "\n";
> }
>
> my $contig = $db->fetch_sequence('contig_000001',1=>1000);
> print "\ncontig = $contig\n";

Produces the following:
==================================================================
<snip>
contig_015108
contig_015109
contig_015110
contig_015111

Types:
CDS:BluGen
contig:BluGen
gene:BluGen
mRNA:BluGen
tRNA:BluGen

contig =
TTGTATCAAGCAACTAAGTTTCACTTGGCCACATTACATGGGAGCTAGGAAGGAATGTGAGACGGCGAAGTAGAATTGCTAAGTGAGAGAGTCAGCTAGATGGCAAAAATGACGACTGGCAGTGGCGGAGCAATATGTCATTCTCACCAACAGAGTACGTACTGGATGAGCTAGGAGGATGTACAAATAGTCATTACCCGTAGTTGTGGTACTTCTCTCTTCATATAGTTTAATCTTTCTAAAAGTACACTACAACCAGCTTGGTTTTGTCACAGAATGAAACAGCGCTTATCAACAGCTTTCCACCCACAAACAGACGGTGCCACCGAAAGAATGAATGAGGAAATCTTAGCGTACCTACGAGCATTTATTACTTATACACAGTTTGACTGGAAAGATTTGCTTCTGTGCGCAATGCTGGATTTAAATAATAGAACATCAGCAGCGTTAGGAATGAGCCCATTTTTTGCTGAACATGGTTATCATGCAGAGCCAATTCAACAAATTGAATATAGCAGCACCCCATTAAGGCCAGAGAAGAATGCGCAAAAGTTTGTTGAAAGACTAAGAGAAGCAGAAGTGTTACGACGTGCGCTAGGGGTACTGGAGATCGCCGTCTGCCGAAGGTAATGTATTAATAACTGTTCAGATCATAGTTGCTAACGAAGGGTACTCAGATATAATCCAACTGGCGAGGTGCCAGGTGGTCAAAGGTCGGAGAACTACAGGATAGTCGAGACACGAAGTCAAGAAGTCGAGTTGCCGAGCCAGCGATTAAGGCTGATAGATCAAGGTAAACGCAAGTACAAATAATGTAAATACTAAATGAAAGACTGAGGATATTGGACTGTGAGGATCTAAAATTATGATATTATATAAGTGCTAGATTTCTGTCACAGGCTCATACCTGGTTATCATTAGGCTGCTGCCTAACAAATAAGGCTTGAACTCTTCCAACCATGTTTAACACCATACATGTCATGAACCCCTCCACAACTC
A
==================================================================

So the contig IDs are known, we have the expected range of types, and
sequences can be retrieved from the database to match the contig IDs.

Reference class is set to 'contig' in the conf file, while 'automatic
classes' is set to 'contig gene', however searching for 'contig_000001'
produces the 'landmark not found' error (as does searching for
contig:contig_000001 or contig:BluGen:contig_000001). Likewise searching
for genes by their ID/Name fails.
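
Since fetch_sequence only shows that the sequence itself is stored, a name-based lookup against the same handle might narrow things down, on the assumption that gbrowse resolves landmarks by feature name rather than by sequence ID. Below is a minimal sketch using get_features_by_name from Bio::DB::SeqFeature::Store; report_name_lookups is a hypothetical helper name, and $db is assumed to be the already-open handle from the test script above.

```perl
use strict;
use warnings;

# Hypothetical helper: report how many features the store returns for each
# name. $db is assumed to be an open Bio::DB::SeqFeature::Store handle.
sub report_name_lookups {
    my ($db, @names) = @_;
    my %count;
    for my $name (@names) {
        my @features = $db->get_features_by_name($name);
        $count{$name} = scalar @features;
        print "$name: $count{$name} feature(s) found by name\n";
        for my $f (@features) {
            # Show type and location of each hit.
            print "  ", $f->primary_tag, " ", $f->seq_id, ":",
                  $f->start, "..", $f->end, "\n";
        }
    }
    return \%count;
}

# Usage with the existing handle (names taken from the GFF above):
# report_name_lookups($db, 'contig_000001', 'bgh00001', 'bgh00001m1');
```

If a contig name returns zero features here while seq_ids() still lists it, that would point at the name index in the loaded database rather than at the gbrowse configuration.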

I've tried the obvious things like dropping the database to make sure
gbrowse is pointing at the right thing, but nothing jumps out at me.

Are there any indications there what I might be doing wrong?

Many thanks,
James
--
Dr. James Abbott
Lead Bioinformatician
Bioinformatics Support Service
Imperial College, London