Hi Ji,
Thank you for your question about the genePredToGtf utility. One of
our engineers notes that this is not a contradiction. The refFlat
and genePred table structures are presented at
http://genome.ucsc.edu/FAQ/FAQformat.html#format9. By default, the
tool pulls data from a table in a MySQL database. When pulling data
from a database, it asks for specific columns by name, which means
it doesn't matter what order these columns are in. This is why the
tool can work with a genePred, extended genePred, or refFlat
table.
If, however, the tool is pulling data from a file, it has to guess
which columns are the ones it wants. The tool assumes the file is in
the genePred format at that point. It doesn't work with a refFlat file
because the data are organized differently in these files. Note the
extra "geneName" column in the description of the refFlat format.
You may be able to remove this extra "geneName" column from the
refFlat file and then convert it to GTF using the genePredToGtf
tool.
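As a rough sketch of that workaround: assuming your refFlat file is
tab-separated and "geneName" is the first column (as in the format
description linked above), something like the following should drop
that column and leave the genePred columns. The function name
refflat_to_genepred and the file names are only placeholders, not part
of any UCSC tool.

```shell
# Hypothetical helper: print a refFlat file without its first column.
# Assumes tab-separated input with "geneName" in column 1, so the
# remaining columns (name, chrom, strand, ...) follow genePred order.
refflat_to_genepred() {
    cut -f2- "$1"
}

# Example usage (file names are placeholders):
#   refflat_to_genepred myRefFlatFile.txt > myGenePredFile.txt
```

It is worth checking the first few lines of the output against the
genePred description before converting.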
You can use genePredToGtf with a genePred file by specifying
"file" as your database. For example:
genePredToGtf file myGenePredFile.txt myOutput.gtf
I hope this is helpful. If you have any further questions, please
reply to gen...@soe.ucsc.edu. All messages sent to that address are
archived on a publicly-accessible Google Groups forum. If your
question includes sensitive data, you may send it instead to
genom...@soe.ucsc.edu.
Matthew Speir
UCSC Genome Bioinformatics Group