Does anyone have any experience loading proteomics data?

Zachary Wright

Nov 6, 2018, 10:02:45 PM

to transmart-discuss

Does anyone have any experience loading proteomics data? Which loader did you use and what were the file formats? How do users run HDD proteomics analyses when there don't appear to be any UniProt IDs in the "Select a . . . " field in the HDD modal?

I haven't been able to find any documentation on loading it using the original ETL scripts (i.e. transmart-data) but I did find documentation for loading proteomics using transmart-batch:

https://github.com/thehyve/transmart-batch/blob/master/docs/data_formats/proteomics.md

And this article on the Axiomedix site, though I can't tell which loader they were using (Clarivate?):

https://transmart.support.axiomedix.com/hc/en-us/articles/360006057733-04-Proteomics-Data

Thanks!

-- Zach (University of Michigan)

Peter Rice

Nov 7, 2018, 9:48:46 AM

to transmar...@googlegroups.com, Zachary Wright

Hi Zach,

There are proteomics examples from Sanofi's test datasets from 1.2
(which became tranSMART 16.1) and also from the 17.1 testing.

You can find them in the Curated Datasets ready for loading with
transmart-data. The wiki page is
https://wiki.transmartfoundation.org/display/transmartwiki/Curated+Data+Repository
and scroll down to the Sanofi test data (or search for proteomics). You
can download the test datasets and have a look at how they were defined.

The UniProt IDs should be defined in the database (assuming you built it
with transmart-data) and in the annotation platforms for the proteomics
test datasets.

I am working on extending the Axiomedix articles on datatypes. This
would be a good one to start with. I always used the Kettle scripts
(i.e. load_msproteomics_studyname in transmart-data) as that is how
Sanofi loaded these sets.

We also have the original detailed test cases for Proteomics data from
Sanofi that explain how things are supposed to work. Again they are on
the wiki at
https://wiki.transmartfoundation.org/display/transmartwiki/Sanofi+RC2+Tests

Let me know if I can help some more...

Peter Rice
Axiomedix Inc

> --
> For more ways to get in contact with the tranSMART community visit
> https://wiki.transmartfoundation.org/display/transmartwiki/Getting+Support
> ---
> You received this message because you are subscribed to the Google
> Groups "transmart-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to transmart-disc...@googlegroups.com
> <mailto:transmart-disc...@googlegroups.com>.
> To post to this group, send email to transmar...@googlegroups.com
> <mailto:transmar...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/transmart-discuss/f168c5d5-8b8c-4654-881c-61fad2199983%40googlegroups.com
> <https://groups.google.com/d/msgid/transmart-discuss/f168c5d5-8b8c-4654-881c-61fad2199983%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

Zach Wright

Nov 7, 2018, 3:24:11 PM

to transmart-discuss

Thanks Peter. I'll use the Sanofi dataset as a model, try uploading our data, and see what happens. I'm not sure why we don't have those UniProt IDs in the database. Perhaps because we're using an old database build?

-- Zach

Debasish Karan

Nov 11, 2018, 11:46:18 PM

to transmar...@googlegroups.com

Hi Zach,

I used to load proteomics data with the tranSMART ETL for development and testing, though that was back in 2013-2014. Basically, the ETL pipelines and mapping files have to be put in place, then you configure the ETL and run it with parameters. The ETL scripts are available on GitHub (https://github.com/transmart/tranSMART-ETL/tree/master/Kettle/oracle/Kettle-ETL). After a successful load, the proteomics data appears against the study and the tree nodes appear in the tranSMART UI.

For the file formats, see https://github.com/thehyve/transmart-batch/blob/master/docs/data_formats/proteomics.md

The UniProt IDs are provided in the annotation data file and are not mandatory in the ETL design. However, the ETL can be customized if it does not meet your requirements.
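
Since the UniProt IDs come from the annotation file, it may be worth checking that file before loading. Below is a minimal Python sketch, not part of any tranSMART loader. It assumes a tab-separated annotation file with a header row and a column named `UNIPROT_ID`; that column name, the sample rows, and the function name `check_uniprot_column` are all assumptions made for illustration, so adjust them to whatever your loader actually expects. The regular expression is the accession pattern given in the UniProt help pages.

```python
import csv
import io
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def check_uniprot_column(handle, column="UNIPROT_ID"):
    """Count valid and missing accessions and collect malformed ones.

    `column` is an assumed header name, not something the loaders guarantee.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"no column named {column!r} in annotation file")

    valid, missing, malformed = 0, 0, []
    for row in reader:
        value = (row[column] or "").strip()
        if not value:
            missing += 1
        elif UNIPROT_ACCESSION.match(value):
            valid += 1
        else:
            malformed.append(value)
    return valid, missing, malformed


# Hypothetical sample rows, for illustration only.
SAMPLE = (
    "GPL_ID\tPROBESET_ID\tUNIPROT_ID\n"
    "GPL_X\tprobe_1\tP04637\n"
    "GPL_X\tprobe_2\t\n"
    "GPL_X\tprobe_3\tp53\n"
    "GPL_X\tprobe_4\tA0A024R161\n"
)

valid, missing, malformed = check_uniprot_column(io.StringIO(SAMPLE))
print(valid, missing, malformed)
```

To run it against a real file, pass an open file handle instead of the `io.StringIO` sample. A large `missing` count would be one possible explanation for an empty ID field in the UI, though that is a guess rather than something confirmed in this thread.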

Thanks,

Debasish Karan
