Additional Information About SOAPdenovo pregraph output file

Cole Lyman

Jan 16, 2017, 3:18:36 AM
to BGI-SOAP
I am interested in using SOAPdenovo2 to efficiently construct a de Bruijn graph. Has anyone had experience using SOAPdenovo2 in this way?

I would specifically like to know how the output files from pregraph relate to one another. The documentation states:

Output files from the command "pregraph"
a. *.kmerFreq
  Each row shows the number of Kmers with a frequency equals the row number. Note that those peaks of frequencies 
  which are the integral multiple of 63 are due to the data structure.
b. *.edge
  Each record gives the information of an edge in the pre-graph: length, Kmers on both ends, average kmer coverage,
  whether it's reverse-complementarily identical and the sequence.
c. *.markOnEdge & *.path
  These two files are for using reads to solve small repeats.
e. *.preArc
  Connections between edges which are established by the read paths.
f. *.vertex
  Kmers at the ends of edges.
g. *.preGraphBasic
  Some basic information about the pre-graph: number of vertex, K value, number of edges, maximum read length etc.
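
To make the question concrete, here is how I currently read the description of *.kmerFreq: one count per row, where the row number is the k-mer frequency. This is only a minimal sketch based on that one sentence of documentation. The function name parse_kmer_freq is my own, and the assumption that each row holds a single plain integer is unverified.

```python
from typing import Dict, Iterable


def parse_kmer_freq(lines: Iterable[str]) -> Dict[int, int]:
    """Map frequency (1-based row number) to the number of k-mers seen that often.

    Assumption (not confirmed by the documentation): every row of the
    *.kmerFreq file contains exactly one integer.
    """
    histogram: Dict[int, int] = {}
    for row_number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            # Skip blank rows; the row number still advances with the file line.
            continue
        histogram[row_number] = int(text)
    return histogram
```

With a real run, this would be called on the open file handle of the *.kmerFreq output, for example parse_kmer_freq(open(path)) where path is wherever pregraph wrote that file.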

Are there any further resources on how to use these files to construct a de Bruijn graph?

Any feedback or suggestions are welcome.

Thank you,
Cole 