Edge: running the Qiime analysis: what is the metadata mapping file?


gmoore777

Nov 20, 2018, 3:57:36 PM
to edge-users
Hi,
I'm trying to run the Qiime analysis via the Edge user interface, and I don't know exactly what to put into the metadata mapping file so that it is correct.

What I have found refers to BarcodeSequence and LinkerPrimerSequence fields and to FASTA files, and it describes Qiime rather than Qiime2.

I don't have FASTA files, and I don't think my FASTQ files contain BarcodeSequence or LinkerPrimerSequence sequences, SampleIDs, or Descriptions.

I am familiar with the metadata file that I created for a Qiime2 command-line analysis at another facility. That file looked like this:

  # this is aManifestFile.csv, for use with qiime2
  sample-id,absolute-filepath,direction
  sample-1,/home/bee015guest/Qiime_Testing/SRR6684160_1.fastq,forward
  sample-1,/home/bee015guest/Qiime_Testing/SRR6684160_2.fastq,reverse


But I don't know what a correct metadata mapping file should look like here. Should mine be blank, or have just 2 fields filled in?

  #SampleID  BarcodeSequence  LinkerPrimerSequence  Description
  sample_1                                                                   Description_One

I'm confused.

Guy

Lo, Chien-Chi

Nov 20, 2018, 6:53:45 PM
to gmoore777, edge-users

Here is a description of the Qiime pipeline. EDGE implements Qiime v1.9.1 as a pipeline. We plan to update it to Qiime2 in the near future.

 

https://edge.readthedocs.io/en/latest/gui.html#run-qiime

 

Thanks,

Chienchi


Greenleaf

Nov 20, 2018, 7:39:09 PM
to Lo, Chien-Chi, edge-users, guy

Thank you, those instructions were helpful.

I have clicked "De-multiplexed Reads Dir" instead of "Paired Reads", and provided the directory of the 2 FASTQ files,

and created this as the metadata mapping file:

    #SampleID   Files   SampleType      Description
    Sample1     SRR6684160_1.fastq,SRR6684160_2.fastq   BeeGuts metabiome_of_mosquitos

I think the analysis got further, but now I get this error:

.../QiimeAnalysis$ cat ./errorLog.txt

Traceback (most recent call last):
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 966, in pick_subsampled_open_reference_otus
    close_logger_on_success=False)
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/lib/python2.7/site-packages/qiime/workflow/util.py", line 122, in call_commands_serially
    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Make the otu table
Command run was:
 make_otu_table.py -i /home/bee015guest/EDGE_output/12b33adca39b9ff78e7d0583e0dec093/QiimeAnalysis/otus/final_otu_map_mc2.txt -o /home/bee015guest/EDGE_output/12b33adca39b9ff78e7d0583e0dec093/QiimeAnalysis/otus/otu_table_mc2.biom
Command returned exit status: 1
Stdout:

Stderr
Traceback (most recent call last):
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/make_otu_table.py", line 119, in <module>
    main()
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/make_otu_table.py", line 115, in main
    write_biom_table(biom_otu_table, opts.output_biom_fp)
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/lib/python2.7/site-packages/qiime/util.py", line 569, in write_biom_table
    "Attempting to write an empty BIOM table to disk. "
qiime.util.EmptyBIOMTableError: Attempting to write an empty BIOM table to disk. QIIME doesn't support writing empty BIOM output files.


Error, CMD: pick_open_reference_otus.py -f -i /home/bee015guest/EDGE_output/12b33adca39b9ff78e7d0583e0dec093/QiimeAnalysis/slout/seqs.fna -o /home/bee015guest/EDGE_output/12b33adca39b9ff78e7d0583e0dec093/QiimeAnalysis/otus -s 0.01  -r /home/bee015guest/EdgeInstallDirectory/edge_dev/scripts/qiime_pipeline/data/Silva119_release/rep_set/97/Silva_119_rep_set97.fna   -p /home/bee015guest/EDGE_output/12b33adca39b9ff78e7d0583e0dec093/QiimeAnalysis/parameter.txt  --min_otu_size 2 -aO 2 died with ret 256 at /home/bee015guest/EdgeInstallDirectory/edge_dev/scripts/qiime_pipeline/qiime_pipeline.pl line 1189.

These are the sizes of the files in the .../QiimeAnalysis directory:

$ find . -type f | xargs ls -al
-rw-r--r-- 1 bee015guest bee015guest       155 Nov 21 00:29 ./aManifestFile.txt
-rw-r--r-- 1 bee015guest bee015guest       155 Nov 21 00:29 ./checkMappingFile/combined_mapping_corrected.txt
-rw-r--r-- 1 bee015guest bee015guest      1719 Nov 21 00:29 ./checkMappingFile/combined_mapping.html
-rw-r--r-- 1 bee015guest bee015guest        44 Nov 21 00:29 ./checkMappingFile/combined_mapping.log
-rw-r--r-- 1 bee015guest bee015guest     50732 Nov 21 00:29 ./checkMappingFile/overlib.js
-rw-r--r-- 1 bee015guest bee015guest       155 Nov 21 00:29 ./combined_mapping.txt
-rw-r--r-- 1 bee015guest bee015guest      2470 Nov 21 00:31 ./errorLog.txt
-rw-r--r-- 1 bee015guest bee015guest       253 Nov 21 00:30 ./fastq-join_joined/join.finished
-rw-r--r-- 1 bee015guest bee015guest      4138 Nov 21 00:30 ./fastq-join_joined/pair0/fastqjoin.join.fastq
-rw-r--r-- 1 bee015guest bee015guest  26959228 Nov 21 00:30 ./fastq-join_joined/pair0/fastqjoin.un1.fastq
-rw-r--r-- 1 bee015guest bee015guest  26959228 Nov 21 00:30 ./fastq-join_joined/pair0/fastqjoin.un2.fastq
-rw-r--r-- 1 bee015guest bee015guest       102 Nov 21 00:30 ./fastq-join_joined/pair0/file.txt
-rw-r--r-- 1 bee015guest bee015guest         0 Nov 21 00:31 ./otus/final_otu_map_mc2.txt
-rw-r--r-- 1 bee015guest bee015guest        72 Nov 21 00:31 ./otus/final_otu_map.txt
-rw-r--r-- 1 bee015guest bee015guest      7108 Nov 21 00:31 ./otus/log_20181121003006.txt
-rw-r--r-- 1 bee015guest bee015guest 251131321 Nov 21 00:31 ./otus/new_refseqs.fna
-rw-r--r-- 1 bee015guest bee015guest         0 Nov 21 00:31 ./otus/rep_set.fna
-rw-r--r-- 1 bee015guest bee015guest       935 Nov 21 00:31 ./otus/step1_otus/failures.fasta
-rw-r--r-- 1 bee015guest bee015guest       963 Nov 21 00:31 ./otus/step1_otus/POTU_SmRu_.0_clusters.uc
-rw-r--r-- 1 bee015guest bee015guest       963 Nov 21 00:31 ./otus/step1_otus/POTU_SmRu_.1_clusters.uc
-rw-r--r-- 1 bee015guest bee015guest        20 Nov 21 00:31 ./otus/step1_otus/seqs_failures.txt
-rw-r--r-- 1 bee015guest bee015guest      1662 Nov 21 00:31 ./otus/step1_otus/seqs_otus.log
-rw-r--r-- 1 bee015guest bee015guest         0 Nov 21 00:31 ./otus/step1_otus/seqs_otus.txt
-rw-r--r-- 1 bee015guest bee015guest         0 Nov 21 00:31 ./otus/step1_otus/step1_rep_set.fna
-rw-r--r-- 1 bee015guest bee015guest       938 Nov 21 00:31 ./otus/step4_otus/failures_clusters.uc
-rw-r--r-- 1 bee015guest bee015guest       579 Nov 21 00:31 ./otus/step4_otus/failures_otus.log
-rw-r--r-- 1 bee015guest bee015guest        72 Nov 21 00:31 ./otus/step4_otus/failures_otus.txt
-rw-r--r-- 1 bee015guest bee015guest       815 Nov 21 00:31 ./otus/step4_otus/step4_rep_set.fna
-rw-r--r-- 1 bee015guest bee015guest       715 Nov 21 00:29 ./parameter.txt
-rw-r--r-- 1 bee015guest bee015guest      3540 Nov 21 00:31 ./processLog.txt
-rw-r--r-- 1 bee015guest bee015guest       105 Nov 21 00:30 ./slout/histograms.txt
-rw-r--r-- 1 bee015guest bee015guest       935 Nov 21 00:30 ./slout/seqs.fna
-rw-r--r-- 1 bee015guest bee015guest         0 Nov 21 00:30 ./slout/split.finished
-rw-r--r-- 1 bee015guest bee015guest       550 Nov 21 00:30 ./slout/split_library_log.txt
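
A listing like the one above can also be filtered for zero-byte outputs, which tend to mark the step where the pipeline stopped producing data. The following is only a minimal sketch; `find_empty_files` is a hypothetical helper name, not part of EDGE or Qiime.

```python
import os


def find_empty_files(root):
    """Return sorted paths (relative to root) of zero-byte regular files."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip anything that is not a regular file (e.g. broken symlinks).
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(os.path.relpath(path, root))
    return sorted(empty)
```

Calling `find_empty_files(".")` from inside the QiimeAnalysis directory would list files such as the empty OTU map and representative set seen above.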

Sorry for all this information...

Lo, Chien-Chi

Nov 21, 2018, 1:37:25 PM
to Greenleaf, edge-users, guy

This failed because the paired-end reads do not overlap. The pipeline expects each pair to be joined by the fastq-join program and uses the joined reads for downstream analysis.
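
One way to check this on your own run is to compare how many records ended up in the joined file versus the unjoined one (fastqjoin.join.fastq and fastqjoin.un1.fastq in the listing above). This is a rough sketch, assuming uncompressed FASTQ with exactly 4 lines per record; the function names are hypothetical.

```python
def count_fastq_records(path):
    """Count records in an uncompressed FASTQ file (assumes 4 lines per record)."""
    with open(path) as handle:
        line_count = sum(1 for _ in handle)
    return line_count // 4


def join_rate(joined_path, unjoined_forward_path):
    """Fraction of read pairs that fastq-join managed to merge."""
    joined = count_fastq_records(joined_path)
    unjoined = count_fastq_records(unjoined_forward_path)
    total = joined + unjoined
    return joined / total if total else 0.0
```

A rate close to zero would be consistent with reads that do not overlap.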

 

You can try using only the forward reads as input in the metadata file.

    #SampleID   Files   SampleType      Description
    Sample1     SRR6684160_1.fastq    BeeGuts metabiome_of_mosquitos
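
If you already have a Qiime2-style manifest like the one earlier in this thread, a file in this layout could be generated from it rather than typed by hand. The sketch below is only an illustration: `manifest_to_mapping` is a hypothetical helper, and replacing non-alphanumeric characters in sample IDs with "." is an assumption about what Qiime 1 accepts that is worth confirming with validate_mapping_file.py.

```python
import csv
import io
import os
import re


def manifest_to_mapping(manifest_text, sample_type, description, forward_only=True):
    """Convert a Qiime2 manifest (CSV text) into a tab-delimited mapping file."""
    # Drop blank lines and '#' comment lines; strip indentation.
    lines = [ln.strip() for ln in manifest_text.splitlines()
             if ln.strip() and not ln.strip().startswith("#")]
    files_by_sample = {}
    for row in csv.DictReader(lines):
        if forward_only and row["direction"].strip() != "forward":
            continue
        # Assumption: sample IDs should contain only alphanumerics and '.'.
        sample_id = re.sub(r"[^A-Za-z0-9.]", ".", row["sample-id"].strip())
        file_name = os.path.basename(row["absolute-filepath"].strip())
        files_by_sample.setdefault(sample_id, []).append(file_name)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["#SampleID", "Files", "SampleType", "Description"])
    for sample_id, names in files_by_sample.items():
        writer.writerow([sample_id, ",".join(names), sample_type, description])
    return out.getvalue()
```

With `forward_only=True` each sample gets a single file in the Files column, which matches the forward-reads-only suggestion above.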

 

 

Chienchi

Elinor Pulcini

Nov 27, 2018, 11:35:02 AM
to edge-users

 

In Excel, make a 4-column table with the following column headers (see below).

I put in the sample number, use the same number for the description, and leave the middle columns blank. Then save as a test file, tab delimited.

 

#SampleID    BarcodeSequence    LinkerPrimerSequence    Description
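
If Excel is not handy, the same four-column tab-delimited text file could be written with a few lines of Python. This is just a sketch of the layout described above (sample number reused as the description, middle columns left blank); `write_mapping_file` is a hypothetical name.

```python
import csv

HEADER = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence", "Description"]


def write_mapping_file(path, sample_ids):
    """Write a tab-delimited mapping file with blank barcode and primer columns."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        for sample_id in sample_ids:
            # The sample ID doubles as the description; middle columns stay empty.
            writer.writerow([sample_id, "", "", sample_id])
```

For example, `write_mapping_file("mapping.txt", ["Sample1"])` would produce a header line plus one row.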

 

Elinor Pulcini

Nov 27, 2018, 11:54:14 AM
to edge-users
Guy...that should have said  text file not test file...sorry



Greenleaf

Nov 27, 2018, 8:44:09 PM
to Lo, Chien-Chi, edge-users, greenleafmou...@gmail.com

1.)

I tried what you suggested, clicking "Unpaired Reads", providing the Single-end FASTQ File,

/home/bee015guest/EdgeInstallDirectory/edge/edge_ui/EDGE_input/BeeGuts/SRR6684160_1.fastq

and the Metadata Mapping File of

/home/bee015guest/EdgeInstallDirectory/edge/edge_ui/EDGE_input/BeeGuts/aManifestFile.csv_1_files

with the contents of that file being:

#SampleID    Files    SampleType    Description
SampleNumber1    SRR6684160_1.fastq    BeeGuts    testing_metabiome_of_mosquitos

2.)

I get this in ./QiimeAnalysis/errorLog.txt:

Traceback (most recent call last):
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/split_libraries_fastq.py", line 365, in <module>
    main()
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/bin/split_libraries_fastq.py", line 272, in main
    has_barcodes=barcode_read_fp is not None)
  File "/home/bee015guest/EdgeInstallDirectory/edge_dev/thirdParty/Anaconda2/lib/python2.7/site-packages/qiime/split_libraries.py", line 310, in check_map
    'identify problems.')
ValueError: Errors were found with mapping file, please run validate_mapping_file.py to identify problems.
Error, CMD: split_libraries_fastq.py -o /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/slout -i /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/newFastq/SRR6684160_1.nobarcodes.fastq -b /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/newFastq/SRR6684160_1.barcodes.fastq -m /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/aManifestFile.txt --barcode_type 6 -q 19 -p 0.5 -n 1 --phred_offset 33 died with ret 256 at /home/bee015guest/EdgeInstallDirectory/edge_dev/scripts/qiime_pipeline/qiime_pipeline.pl line 1189.


3.)

But validate_mapping_file.py seemed to run fine, according to ./process_current.log:


###########################################################################
Qiime [2018 Nov 28 01:30:54]  Checking Mapping File
###########################################################################
Qiime CMD: validate_mapping_file.py -b -p -m /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/combined_mapping.txt -o /home/bee015guest/EDGE_output/ef0998a0d2546773900a28225ba9c0db/QiimeAnalysis/checkMappingFile
No errors or warnings were found in mapping file.

Qiime Running time: 00:00:01


4.) Top 8 lines of my FASTQ file in case something is funny with it:

$ head -8 SRR6684160_1.fastq

@SRR6684160.1 1 length=300
GTGCCAGCAGCCGCGGTAATACGTAGGGTGCAAGCGTTATCCGGATTTATTGGGCGTAAAGAGCTCGTAGGCGGTTCGTCGCGTCTGGTGTGAAAGTCCATCGCTTAACGGTGGATCGGCGCCGGGTACGGGCGGACTGGAGTGCGGTAGGGGAGACTGGAATTCCCGGTGTAACGGTGGAATGTGTCGATATCGGGACGAACACCGATGGCGAAGGCAGGTCTCTGGGCCTTCCCTGACGCTGTGGTGCGCACTCGTGCGGTGCGAACAGGCTTTGTACCCCCTGTTTTCCCTGTCTCC
+SRR6684160.1 1 length=300
CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGCGGGGGGGDFGGGGGGGGGGGGGGDFGGGGGGGGEGGC@<>C8*1;+<+<+A+2*;*:<8****15*1*;;7):**0)*)**10*)9<C7@+*007*00*046)0.04)))))).()))(.).**1(((-(()-3(((.4(-))*199(-(.)).129.).8AD<
@SRR6684160.2 2 length=300
GTGCCAGCCGCCGCGGTAATACGTAGGGTGCAAGCGTTATCCGGATTTATTGGGCGTAAAGAGCTCGTAGGCGGTTCGTCGCGTCTGGTGTGAAAGTCCATCGCTTAACGGTGGATCGGCGCCGGGTACGGGCGGACTGGAGTGCGGTAGGGGAGACTGGAATTCCCGGTGTAACGGTGGAATGTGTAGATATCGGGAAGAACACCGATGGCGAAGGCAGGTCTCTGGGCCGTCACTGACGCTGAGGCGCGAAATCGTGGGGAGCGACCAGGATTAGATACCCGAGTAGTCCCTGTCTCC
+SRR6684160.2 2 length=300
CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGDDGF52+<:C?E+97@9CC5****/*3;*;CC3>8*7)*55*9*7)1<CFF<**075>D*>A6*977?))0)))-,)).(.).**)(((-(()-(.(24<)226*9CD0((-,6<?:6<)4>444


5.) Is my installation flawed?

Should Qiime be easy to run, or does everyone fumble with choosing the correct options in conjunction with creating a correct Metadata Mapping file?

gmoore777

Lo, Chien-Chi

Nov 28, 2018, 11:23:16 AM
to Greenleaf, edge-users

My suggestion was to modify the metadata file only. I didn't suggest changing the read type to "Unpaired Reads". You should still click "De-multiplexed Reads Dir" since your data is already demultiplexed. The "Paired Reads" and "Unpaired Reads" options are for FASTQ files that have the barcode in the reads, or for cases where the user provides separate barcode FASTQ files in the parameters section.
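
As a rough rule of thumb drawn only from the examples in this thread (not from EDGE's own code), the header of the mapping file hints at which input option it belongs with: a Files column goes with "De-multiplexed Reads Dir", while a BarcodeSequence column goes with the barcoded read options. The helper names below are hypothetical.

```python
def read_header(mapping_text):
    """Return the column names from the '#SampleID' header line."""
    for line in mapping_text.splitlines():
        if line.startswith("#SampleID"):
            # Column names contain no spaces, so splitting on any whitespace is safe.
            return line.split()
    raise ValueError("no #SampleID header line found")


def suggest_input_mode(mapping_text):
    """Heuristic guess at the matching EDGE input option (assumption, see above)."""
    columns = read_header(mapping_text)
    if "Files" in columns:
        return "De-multiplexed Reads Dir"
    if "BarcodeSequence" in columns:
        return "Paired Reads or Unpaired Reads"
    return "unknown"
```

This only inspects column names; it does not validate the file the way validate_mapping_file.py does.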

 

If you read the Qiime tutorial, you will see how many different input scenarios there are: http://qiime.org/tutorials/processing_illumina_data.html

 

EDGE tries to make this easy, but it looks like it is still a bit complex for new users. Please re-read the documentation at https://edge.readthedocs.io/en/latest/gui.html#run-qiime. We would appreciate any feedback on making it clearer for users.

 

Thanks,

Chienchi

 


Greenleaf

Nov 28, 2018, 1:19:28 PM
to Lo, Chien-Chi, edge-users, greenleafmou...@gmail.com
That worked: "De-multiplexed Reads Dir" with a metadata mapping file listing only one of my two FASTQ files.

Thank you
