Merging outputs from multiple samples

Giorgio Casaburi

Oct 27, 2017, 5:06:08 PM

to shortbred-users

Hi all,

I have a couple of questions I was hoping you could answer:

1) Is there a script to merge multiple output files from ShortBRED (like metaphlan or humann2 have)?

2) Once you merge the files, does the merged file need to be re-normalized, given the way ShortBRED generates the protein profile?

Thanks a lot in advance,

Giorgio

Jim Kaminski

Oct 28, 2017, 4:03:20 PM

to Giorgio Casaburi, shortbred-users

Hi Giorgio,

Thank you for using ShortBRED!

1) We currently do not have a script for merging ShortBRED output. Typically, I merge ShortBRED output in R. Python users may want to use pandas.

If there is demand for a script, I could add one to the utilities.
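For Python users, the pandas route mentioned above might look roughly like the sketch below. It is not part of ShortBRED. It assumes each per-sample file is tab-delimited with a family identifier column (called "Family" here, an assumption) alongside the "Count" column, and the function name `merge_shortbred` and the toy sample data are hypothetical.

```python
import io

import pandas as pd


def merge_shortbred(tables, family_col="Family", value_col="Count"):
    """Combine per-sample tables into one families-by-samples matrix.

    tables: dict mapping a sample name to a file path or file-like object.
    family_col / value_col: column names, assumed rather than guaranteed.
    """
    columns = {}
    for sample, source in tables.items():
        df = pd.read_csv(source, sep="\t")
        # Keep only the normalized value, indexed by protein family.
        columns[sample] = df.set_index(family_col)[value_col]
    # Outer join on family; a family absent from a sample is filled with 0.
    merged = pd.concat(columns, axis=1).fillna(0.0).sort_index()
    merged.index.name = family_col
    return merged


# Toy input standing in for two per-sample output files (made-up values).
header = "Family\tCount\tHits\tTotMarkerLength\n"
sample_a = io.StringIO(header + "famA\t1.5\t3\t120\nfamB\t0.0\t0\t90\n")
sample_b = io.StringIO(header + "famA\t2.0\t4\t120\nfamC\t0.5\t1\t60\n")

merged = merge_shortbred({"sampleA": sample_a, "sampleB": sample_b})
print(merged)
```

With real data, the dict would map sample names to file paths, and `merged.to_csv("merged.tsv", sep="\t")` would write the combined table.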

2) No, you do not need to renormalize the files. Assuming each of your output files corresponds to one particular metagenomic sample, the values in the "Count" column are already normalized. (If you would like more details on the normalization procedure, it's explained in the section "Profiling protein family metagenomic abundance with ShortBRED-Quantify" in the ShortBRED paper.)

Thank you,

Jim

--
You received this message because you are subscribed to the Google Groups "shortbred-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to shortbred-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Giorgio Casaburi

Oct 28, 2017, 4:12:27 PM

to Jim Kaminski, shortbred-users

Hi Jim,

Thank you so much for your answer. I thought you needed to re-normalize the RPKM values, just as humann2 recommends after merging tables from different samples. Maybe you guys used a different approach than they did to obtain the counts.

Giorgio

--

________________________________

Giorgio Casaburi, Ph.D.

Bioinformatics Scientist

Evolve Biosystems, Inc.

2121 Second Street

Suite B107

Davis, CA 95618

gcas...@evolvebiosystems.com

Office: 530-747-2018
Cell: 702-981-1253

Eric Franzosa

Oct 30, 2017, 10:00:16 AM

to Giorgio Casaburi, Jim Kaminski, shortbred-users

Hi Giorgio,

Speaking on behalf of the HUMAnN2 team: confirmed, HUMAnN2 outputs abundance in units of RPK (not RPKM), meaning that its initial output is _not_ normalized for sequencing depth. The reasons for this are 1) to allow users to pick their means of normalization (relative abundance, copies per million, etc.) and 2) to enable analyses that depend on more count-like abundance, such as strain profiling.
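To make the distinction concrete, here is a minimal sketch of sum-normalizing one sample's RPK values to copies per million. It is an illustration only, not HUMAnN2's own code. The function name `rpk_to_cpm` and the feature names and values are made up.

```python
def rpk_to_cpm(rpk_by_feature):
    """Scale one sample's RPK values so that they sum to one million."""
    total = sum(rpk_by_feature.values())
    if total == 0:
        # Nothing detected in this sample; avoid dividing by zero.
        return {feature: 0.0 for feature in rpk_by_feature}
    return {
        feature: value / total * 1e6
        for feature, value in rpk_by_feature.items()
    }


# Hypothetical RPK values for a single sample.
sample = {"geneA": 30.0, "geneB": 10.0}
cpm = rpk_to_cpm(sample)
print(cpm)
```

Dropping the `* 1e6` factor gives relative abundance instead. Either way the scaling is applied per sample, which is why depth-unnormalized RPK tables need this step before samples are compared.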

Thanks,

Eric

Giorgio Casaburi

Oct 30, 2017, 10:29:50 AM

to Eric Franzosa, Jim Kaminski, shortbred-users

Hi Eric,

1000 thanks for the clarification. It’s great to know the difference between the two outputs now.

Best wishes,

Giorgio

