[Genome] protein to dna blat / protein to genome blat alignment


Finney, Richard (NIH/NCI) [E]

Dec 10, 2007, 1:20:31 PM
to gen...@soe.ucsc.edu
Is there a way to get command-line blat to do protein to dna (protein to
genome) sequence alignment? I get the message "d and q must both be
either protein or dna" when running a command like this:

"blat -t=prot -q=dna chr1.fa p.fsa output.psl"

Normally, I'd take the error message at face value, but I'm a little
stumped because the hgBlat server does do protein to genome alignment.

I've probably annoyed your sysadmins before by setting up curl/wget
scripts to torture your servers to get this information but would really
like to do it locally via command line.

Thanks for any thoughts and help on this.



Galt Barber

Dec 10, 2007, 2:01:29 PM
to Finney, Richard (NIH/NCI) [E], gen...@soe.ucsc.edu

Only dna/rna to dna/rna queries can be done
in nucleotide space. All other combinations
of type for query and target really happen
in protein space by translating either
the query or the target or both into
protein space. This is true for blast as well as blat.
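
To make "translating into protein space" concrete, here is a minimal
Python sketch of six-frame translation, the operation that the dnax type
describes. It is only an illustration of the idea, not blat's actual
implementation; the function names are made up for this example, and it
assumes the standard genetic code and plain A/C/G/T input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate whole codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate a DNA string in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```

A protein query can then be compared against each of the six translated
frames of the target, which is why the search happens in protein space.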

We call the usually smaller thing you are searching for
the query (-q), and the big thing you are searching,
often the genome, the target (-t).

According to your description you have protein sequences
as your query and you wish to use blat to search the
target genome which is given as dna.

Therefore you should use

blat -q=prot -t=dnax chr1.fa p.fsa output.psl
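
If you want to drive this from a script rather than typing it by hand,
something along these lines might work. It is a rough Python sketch: the
helper names are hypothetical, it assumes a blat executable is on your
PATH, and it uses only the -q and -t options shown above with the
argument order from blat's usage line (database, query, output).

```python
import shutil
import subprocess


def blat_protein_vs_genome_cmd(database, query, output, blat="blat"):
    """Build the argument list for a protein query against a DNA target."""
    # -q=prot: the query file holds protein sequences.
    # -t=dnax: the target DNA is translated in six frames to protein.
    return [blat, "-q=prot", "-t=dnax", database, query, output]


def run_blat(database, query, output, blat="blat"):
    """Run blat and raise if it is missing or exits with an error."""
    cmd = blat_protein_vs_genome_cmd(database, query, output, blat)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # File names are the ones used in this thread.
    run_blat("chr1.fa", "p.fsa", "output.psl")
```

Passing the arguments as a list avoids shell quoting problems if your
file names contain spaces.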

If you run blat at the command line with nothing after it,
you will see all the options, including the ones we
are discussing:

prompt> blat
blat - Standalone BLAT v. 34 fast sequence search command line tool
usage:
   blat database query [-ooc=11.ooc] output.psl

[...]
options:
   -t=type     Database type. Type is one of:
                 dna - DNA sequence
                 prot - protein sequence
                 dnax - DNA sequence translated in six frames to protein
               The default is dna
   -q=type     Query type. Type is one of:
                 dna - DNA sequence
                 rna - RNA sequence
                 prot - protein sequence
                 dnax - DNA sequence translated in six frames to protein
                 rnax - DNA sequence translated in three frames to protein
               The default is dna
   -prot       Synonymous with -t=prot -q=prot

[...]

-Galt
> _______________________________________________
> Genome maillist - Gen...@soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>