To whom it may concern,
I am writing to ask a question about the shortest query size of BLAT. After reading the BLAT FAQ, I noticed the question "How do I configure BLAT for short sequences with maximum sensitivity?". The answer to this question points out that there is a formula for the shortest query size, "2 * stepSize + tileSize - 1". It also mentions a "minimum lucky size", which equals "stepSize + tileSize". To test the minimum query size in preparation for my own research, I used sequences ranging from 15bp to 30bp taken from the target bacterial genome. I then carried out the mapping with the command below:
"blat genome.fasta dna_fragment.txt -tileSize=10 -stepSize=5 -repMatch=1000000 -out=blast9 output5.txt"
I expected to get mapping information for all of these sequences, as they were equal to or longer than the "minimum lucky size", which should be 15 (i.e. tileSize + stepSize = 10 + 5 = 15), but I actually only got a mapping result for the 30bp query sequence.
I would like to know why the parameters I set did not seem to work, and whether I have understood the formula for the shortest query size and the "minimum lucky size" correctly.
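For reference, the arithmetic behind the two sizes quoted from the FAQ can be checked with a small shell sketch. The variable names shortest and lucky are only illustrative; the values of tileSize and stepSize are the ones from the command above.

```shell
# Values taken from the blat command above.
tileSize=10
stepSize=5

# Shortest query size from the FAQ formula: 2 * stepSize + tileSize - 1
shortest=$((2 * stepSize + tileSize - 1))

# "Minimum lucky size" from the FAQ: stepSize + tileSize
lucky=$((stepSize + tileSize))

echo "shortest=${shortest} lucky=${lucky}"
```

With these settings the formula gives 19 and the minimum lucky size is 15, so the 15bp to 30bp test range covers both values.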
Looking forward to your reply!
P.S. I have attached screenshots of my command line statement and the result.
Best regards,
Griffy Ge, a graduate student from China
Hi, Griffy.
Thank you for your patience while we looked into this.
One of our engineers shared that you are missing the minScore setting, which defaults to 30.
Also, you may want to use psl output, since the other output formats break things into individual exons with no chaining.
Our engineer advises the following:
-minScore=0 -minIdentity=0 -minMatch=1 -noSimpRepMask
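Combined with the original command, these options might look like the sketch below. It assumes the same input file names as in the question, replaces -out=blast9 with -out=psl as suggested above, and uses output.psl as a hypothetical output file name. The snippet only prints the assembled command so it can be checked before running.

```shell
# Input file names are those from the question; output.psl is a hypothetical name.
# The first three options are unchanged from the original command,
# the next four are the engineer's suggestions, and -out=psl replaces -out=blast9.
cmd="blat genome.fasta dna_fragment.txt -tileSize=10 -stepSize=5 -repMatch=1000000 -minScore=0 -minIdentity=0 -minMatch=1 -noSimpRepMask -out=psl output.psl"

# Print the command for inspection; paste it into the shell to run it.
echo "$cmd"
```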
I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on our public forum. If your question includes sensitive information, you may send it instead to genom...@soe.ucsc.edu.
Lou Nassar
UCSC Genomics Institute