Salmon - Is it possible to get output in bigWig/TDF format?

Erik Zhivkopljas

Oct 1, 2015, 6:05:48 PM

to Sailfish Users Group

Hello,

I have quantified my samples using Salmon in lightweight-alignment mode. Given that no real alignment is performed, can I get information about signal intensity to visualise my output in the IGV browser? Sorry if I'm missing something, but I didn't find any relevant topics in the documentation or group discussions.

Thank you,
Erik.

Alexander Predeus

Oct 10, 2015, 3:31:08 PM

to Sailfish Users Group

Yes, that would be a very useful feature.

Any comments from the authors?

Thank you in advance,

-- Alexander Predeus

Rob

Oct 13, 2015, 12:02:25 PM

to Sailfish Users Group

Hi Erik and Alexander,

This is not currently possible, though I agree it would be a nice feature. Essentially, the problem is the following. When it's performing inference, Salmon uses a streaming algorithm, and then reduces the mapping information present in the reads into a set of equivalence classes (sets of fragments that map to the same subset of transcripts). Thus, during the "offline" phase of inference (where estimates of abundance are iterated until convergence), detailed information about where a fragment originates on a transcript is no longer available.
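To illustrate why positions are lost, here is a toy sketch of the equivalence-class reduction described above. It is not Salmon's code; the function name, the input layout (a list of `(transcript_id, position)` pairs per fragment), and the example data are all assumptions made for illustration.

```python
from collections import Counter


def build_equivalence_classes(fragment_mappings):
    """Collapse per-fragment mappings into equivalence-class counts.

    fragment_mappings: iterable of lists of (transcript_id, position)
    pairs, one list per fragment (hypothetical layout).
    """
    classes = Counter()
    for mappings in fragment_mappings:
        # Keep only the set of transcripts; the positions are dropped here.
        label = frozenset(transcript for transcript, _pos in mappings)
        if label:
            classes[label] += 1
    return classes


# Two fragments map to {tA, tB}; one maps only to tB.
example_mappings = [
    [("tA", 10), ("tB", 250)],
    [("tA", 42), ("tB", 282)],
    [("tB", 5)],
]
eq_classes = build_equivalence_classes(example_mappings)
```

After this step only the class labels and their counts remain, so nothing downstream can say where on `tA` or `tB` a fragment landed.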

It would be possible to implement something like this, however, with one extra lightweight-mapping pass over the reads after inference has completed. The idea would be to evaluate the mapping positions in light of the inferred transcript abundances, and build a coverage profile of each transcript that reflects the inferred probability that a fragment came from each potential transcript. This would require, however, that the input file can be read a second time, so it wouldn't be usable with input supplied through the standard process substitution syntax, etc. I'll look into this more. What type of output format would be desired?

--Rob
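The second pass Rob describes could be sketched roughly as follows. This is only an illustration under simplifying assumptions, not an existing Salmon feature: each fragment is split among its candidate transcripts in proportion to the inferred abundance alone (ignoring effective length, fragment-length and bias terms), and the function name and input layout are hypothetical.

```python
def coverage_profiles(fragment_mappings, abundance, transcript_lengths):
    """Build a weighted per-base coverage profile for each transcript.

    fragment_mappings: iterable of lists of (transcript_id, start, length)
    abundance: dict transcript_id -> inferred abundance (any positive scale)
    transcript_lengths: dict transcript_id -> length in bases
    """
    coverage = {t: [0.0] * n for t, n in transcript_lengths.items()}
    for mappings in fragment_mappings:
        total = sum(abundance.get(t, 0.0) for t, _start, _len in mappings)
        if total <= 0.0:
            continue  # no candidate transcript has any inferred abundance
        for t, start, frag_len in mappings:
            # Assumed assignment probability: abundance share only.
            weight = abundance.get(t, 0.0) / total
            end = min(start + frag_len, transcript_lengths[t])
            for pos in range(max(start, 0), end):
                coverage[t][pos] += weight
    return coverage


example_lengths = {"tA": 10, "tB": 8}
example_abundance = {"tA": 3.0, "tB": 1.0}
example_fragments = [
    [("tA", 2, 4), ("tB", 0, 4)],  # ambiguous fragment
    [("tB", 5, 3)],                # maps uniquely to tB
]
profiles = coverage_profiles(example_fragments, example_abundance, example_lengths)
```

In this toy input the ambiguous fragment contributes 0.75 to each covered base of `tA` and 0.25 to each covered base of `tB`, while the uniquely mapped fragment contributes its full weight to `tB`.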

Warren McGee

Feb 25, 2016, 7:46:49 PM

to Sailfish Users Group

Hello Rob,

I apologize for reviving an old thread, but I only recently joined the group and this is my first post. I'm wondering if there has been progress on implementing this feature, or a feature to output a "pseudo-bam" file? For this, I think the output format that would make the most sense is the "bigWig" format (more info here); however, I don't know if there is a simple way to output that information directly, as it's a binary format (I only know of the "wigToBigWig" utility that UCSC produced). A format that would be easy to work with is the plain-text "wig(gle)" format, which "wigToBigWig" converts to binary bigWig (more info here).
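As a rough idea of how little is needed for the text format, here is a minimal sketch that serialises per-base values as a fixedStep wiggle track. The function name and the input dictionary are hypothetical, and the sequence names would be transcript IDs rather than chromosomes, so a viewer would need the transcriptome loaded as its reference.

```python
def to_fixedstep_wig(coverage, track_name="salmon_coverage"):
    """Serialise {sequence_name: [per-base values]} as fixedStep wiggle text."""
    lines = [f'track type=wiggle_0 name="{track_name}"']
    for name in sorted(coverage):
        # Wiggle coordinates are 1-based; one value per base with step=1.
        lines.append(f"fixedStep chrom={name} start=1 step=1")
        lines.extend(f"{value:.4g}" for value in coverage[name])
    return "\n".join(lines) + "\n"


wig_text = to_fixedstep_wig({"tA": [0.0, 0.75, 0.75]})
```

The resulting text could then be passed to `wigToBigWig` together with a chrom.sizes file listing each transcript and its length.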

Ideally, there would also be a way to combine normalized estimated counts from multiple biological replicates into one "condition" wiggle/BigWig file for comparison of conditions.
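One possible way to combine replicates is sketched below. The normalisation is an assumption (scaling each replicate to a per-million-fragments level before averaging); the post does not say which normalisation is intended, and the function name and inputs are hypothetical.

```python
def combine_replicates(replicates, library_sizes):
    """Average per-base coverage across replicates after library-size scaling.

    replicates: list of {transcript_id: [per-base coverage]}
    library_sizes: number of mapped fragments per replicate (same order)
    """
    combined = {}
    n = len(replicates)
    for coverage, size in zip(replicates, library_sizes):
        scale = 1e6 / size  # assumed per-million normalisation
        for transcript, values in coverage.items():
            acc = combined.setdefault(transcript, [0.0] * len(values))
            for i, value in enumerate(values):
                acc[i] += value * scale / n
    return combined


condition_profile = combine_replicates(
    [{"tA": [2.0, 4.0]}, {"tA": [1.0, 1.0]}],
    [2_000_000, 1_000_000],
)
```

The combined profile has the same shape as a single-replicate profile, so it could be written out as one wiggle track per condition.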

Let me know if you have additional questions.

Best,

Warren McGee
