Adapters not detected in sequences!


MarineLanda

Mar 30, 2012, 5:17:15 AM
to Qiime Forum
Hi everyone,

I've just finished writing a paper using some sequences that I had
analyzed with Qiime. When I tried to submit the sequences to GenBank I
was told that the representative sequences of some OTUs were not
acceptable because they contained a fragment similar to a vector
plasmid...
It turns out that those sequences have a 30-base fragment at the end
that corresponds to the adapter used by the pyrosequencing
platform...

How is it possible that those sequences were not eliminated or
corrected during the different cleaning steps at the beginning of the
analyses? Is there a tool to do that and how could I have missed
that?

As a consequence of this I probably have to rerun all my analyses :
the clustering of the sequences into OTUs in particular, and the
taxonomy assignment as well (and probably other stuff that I'm not
thinking about yet)...

Thanks a lot for your help,

Cheers,

Marine

Tony Walters

Mar 30, 2012, 11:26:36 AM
to qiime...@googlegroups.com
Hello Marine,

That step is generally done during the initial demultiplexing with split_libraries.py, using the -z option (http://qiime.org/scripts/split_libraries.html) to find the reverse primer and truncate the sequence at that point. Sometimes only part of the reverse sequencing adapter is read through, so it is generally more reliable to find the reverse primer and truncate at that position. This also makes it consistent with the removal of the forward primer at the beginning of the read.

In the development version of QIIME, there is also a stand-alone script for finding the reverse primer, truncate_reverse_primer.py

You are correct that this is an important issue. We've been discussing whether we should make the reverse primer a required parameter for demultiplexing, to help people avoid this pitfall as 454 reads get longer, and there will definitely be more emphasis on it in the documentation in the next release.

Best regards,
Tony Walters

MarineLanda

Apr 3, 2012, 4:18:38 AM
to Qiime Forum
Hi Tony,

Thank you so much for this answer. Indeed, it would be great if this
parameter were not optional; it would prevent this kind of beginner's
mistake!


Best regards,
Marine

MarineLanda

Apr 4, 2012, 7:39:37 AM
to Qiime Forum
Hi again,


I'm having trouble running the script...

Here's what I did :

It appears the reverse primer is the 519R : GTNTTACNGCGGCKGCTG
I reversed/complemented it and got : CAGCMGCCGCNGTAANAC
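
The reverse-complement step above is easy to get wrong by hand when the primer contains degenerate (IUPAC) bases such as K or N. Here is a minimal Python sketch, independent of QIIME, that reproduces it. The helper name reverse_complement and the lookup table are illustrative and are not part of any QIIME script.

```python
# IUPAC nucleotide codes and their complements (degenerate codes included).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with IUPAC codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


primer_519r = "GTNTTACNGCGGCKGCTG"
print(reverse_complement(primer_519r))  # CAGCMGCCGCNGTAANAC
```

The result matches the ReversePrimer value used in the mapping file below.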

I created a mapping file like this :


#SampleID BarcodeSequence LinkerPrimerSequence ReversePrimer
CS5..1.T8 AGGACGCT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5..2.T8 AGGACTGT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5..3.T8 AGGAGATT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5..4.T8 AGGAGGAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5.1.T15 AGGAGTCT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5.2.T15 AGGATAGT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5.3.T15 AGGATCTT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5.4.T15 AGGATTAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS5.POLA AGGACCAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.1.T15 AGCTAGCT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.1.T7 AGGCACGT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.2.T15 AGCTATGT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.2.T7 AGGCAGTT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.3.T15 AGCTCATT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.3.T7 AGGCCAAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.4.T15 AGCTCGAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.4.T7 AGCTACAT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
CS6.POLA AGGCAACT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC
NTC AGCTCTCT GGCNNNCGGGTGAGTAA CAGCMGCCGCNGTAANAC

And when I run the script I get this error message :

marinelanda$ macqiime split_libraries.py -f GP4ZCY204.fasta -m mapfile_corrected_530.txt -t -M 1 -b 8 -z truncate_only --reverse_primer_mismatches 1 -d
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/split_libraries.py", line 275, in <module>
    main()
  File "/macqiime/QIIME/bin/split_libraries.py", line 272, in main
    added_demultiplex_field = opts.added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/split_libraries.py", line 1184, in preprocess
    added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/split_libraries.py", line 275, in check_map
    added_demultiplex_field=added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/check_id_map.py", line 1176, in process_id_map
    reverse_primers = get_reverse_primers(data, col_headers)
  File "/macqiime/lib/python2.7/site-packages/qiime/check_id_map.py", line 1016, in get_reverse_primers
    if row[rev_primer_index]=="ReversePrimer":
IndexError: index out of bounds


What happened here?


Thanks a lot for the help, I really appreciate it!

Marine

Antonio González Peña

Apr 4, 2012, 7:59:20 AM
to qiime...@googlegroups.com
Hi Marine,

Could you use check_id_map.py to validate your mapping file and try again?

Cheers

--
Antonio González Peña
Research Assistant, Knight Lab
University of Colorado at Boulder
https://chem.colorado.edu/knightgroup/

MarineLanda

Apr 4, 2012, 8:12:07 AM
to Qiime Forum
Hi Antonio,

I did that and got the same error message, which is weird cause I
first tested the mapping file and it was ok... I guess I did something
wrong in between


macqiime check_id_map.py -m mapfile_corrected_530.txt -o checking/
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/check_id_map.py", line 132, in <module>
    main()
  File "/macqiime/QIIME/bin/check_id_map.py", line 128, in main
    verbose, var_len_barcodes, disable_primer_check, added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/check_id_map.py", line 1266, in check_mapping_file
    var_len_barcodes, added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/check_id_map.py", line 1176, in process_id_map
    reverse_primers = get_reverse_primers(data, col_headers)
  File "/macqiime/lib/python2.7/site-packages/qiime/check_id_map.py", line 1016, in get_reverse_primers
    if row[rev_primer_index]=="ReversePrimer":
IndexError: index out of bounds

Thank you for your help!

Marine

Antonio González Peña

Apr 4, 2012, 8:38:59 AM
to qiime...@googlegroups.com
I think the problem is that the last column has to be the Description,
so my suggestion would be to follow these instructions and make sure
that you are using tabs between the columns and not spaces:
http://qiime.sourceforge.net/documentation/file_formats.html#metadata-mapping-files
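
Before re-running check_id_map.py, the two points above (tab separation and Description as the last column) can be checked quickly by hand. The sketch below is only a rough pre-check under the assumption that the header must start with #SampleID, BarcodeSequence, LinkerPrimerSequence and end with Description. The function name check_mapping_lines is hypothetical, and this is not a substitute for check_id_map.py.

```python
# Assumed required leading columns of a QIIME mapping file header.
REQUIRED_FIRST = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence"]


def check_mapping_lines(lines):
    """Return a list of problems found in mapping-file lines (header first)."""
    problems = []
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not rows:
        return ["file is empty"]
    header = rows[0].split("\t")
    # A header that does not split on tabs but contains spaces was
    # most likely typed with spaces as separators.
    if len(header) == 1 and " " in rows[0]:
        problems.append("header is not tab-delimited (spaces found)")
    if header[:3] != REQUIRED_FIRST:
        problems.append("first three columns must be " + ", ".join(REQUIRED_FIRST))
    if header[-1] != "Description":
        problems.append("last column must be Description")
    # Every data row needs the same number of tab-separated fields as the header.
    for number, row in enumerate(rows[1:], start=1):
        if row.startswith("#"):
            continue  # comment line
        n_fields = len(row.split("\t"))
        if n_fields != len(header):
            problems.append(
                "data row %d has %d fields, expected %d" % (number, n_fields, len(header))
            )
    return problems


# Usage: problems = check_mapping_lines(open("mapfile_corrected_530.txt"))
```

An empty list means none of these particular problems were found; other errors can still be present.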

MarineLanda

Apr 4, 2012, 9:13:44 AM
to Qiime Forum
Great, thanks, I corrected my mapping file and got a functional one.

Now the error for the split_libraries.py is the following :

marinelanda$ macqiime split_libraries.py -f GP4ZCY204.fasta -m mapfiletest.txt -t -M 1 -b 8 -z truncate_only --reverse_primer_mismatches 1 -d
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/split_libraries.py", line 275, in <module>
    main()
  File "/macqiime/QIIME/bin/split_libraries.py", line 272, in main
    added_demultiplex_field = opts.added_demultiplex_field)
  File "/macqiime/lib/python2.7/site-packages/qiime/split_libraries.py", line 1343, in preprocess
    reverse_primer_mismatches)
  File "/macqiime/lib/python2.7/site-packages/qiime/split_libraries.py", line 735, in check_seqs
    split_seq(curr_qual, barcode_len, primer_len)
  File "/macqiime/lib/python2.7/site-packages/qiime/split_libraries.py", line 334, in split_seq
    curr_barcode = curr_seq[0:barcode_len]
TypeError: 'NoneType' object is not subscriptable




Tony Walters

Apr 4, 2012, 11:08:38 AM
to qiime...@googlegroups.com
Hello Marine,

Can you try with this command (it looks like you are not sending a quality scores file):
macqiime split_libraries.py -f GP4ZCY204.fasta -m mapfiletest.txt -t -M 1 -b 8 -z truncate_only --reverse_primer_mismatches 1

Or alternatively, use your last command, but also specify the quality scores file with the -q parameter.

-Tony

MarineLanda

Apr 4, 2012, 11:50:54 AM
to Qiime Forum
Thank you so much to both of you, now the script works, I am so
happy!

Just out of curiosity, do you know if I could do the same kind of job
on the denoised sequences, so that I don't have to denoise the raw
data again? Could I just trim this reverse primer at the end of the
denoised sequences?

Thank you!

Marine

Tony Walters

Apr 4, 2012, 12:33:56 PM
to qiime...@googlegroups.com
Hello again Marine,

I've attached a script, library, and unit test from the development version of QIIME; together they make up the stand-alone version of the reverse primer truncation software (you would just need to pass it the denoised sequences and a mapping file containing the reverse primer).  You would want to add each file to the matching directory (scripts, qiime, or tests) of your current QIIME installation, and then you should be able to run truncate_reverse_primer.py on your sequences.

-Tony
reverse_primer_truncation.zip

MarineLanda

Apr 5, 2012, 5:49:23 AM
to Qiime Forum
Hi Tony,


Thank you so much this is exactly what I need!
I'm not sure I installed the files properly, though. I don't get any
error message, but nothing's happening. It's like the script is
recognized but the job isn't done...

Thank you so much for this much appreciated and useful help.

Marine

Tony Walters

Apr 5, 2012, 11:24:14 AM
to qiime...@googlegroups.com
Hello Marine,

Does the unit test run correctly (test_truncate_reverse_primers.py)?  What directories did you end up putting each of the files in?

-Tony

MarineLanda

Apr 5, 2012, 11:54:38 AM
to Qiime Forum
Nope, I tried the test and it doesn't work... I get a "command not
found", I guess I put it in the wrong directory.

The files are in macqiime/QIIME/bin, for the script, all the other
scripts were there so I figured... and in macqiime/QIIME/tests for the
test script... I did not find any scripts/ or qiime/ directory...

Sorry to be a bit lost about this... I installed Macqiime according to
the instructions, without going into the details of what it does (and
where it puts things)...




Tony Walters

Apr 8, 2012, 4:06:20 PM
to qiime...@googlegroups.com
(posting the solution on the QIIME forum for getting these files installed on macqiime in case anyone else runs into this situation)

First I needed to get the write/read access to macqiime, so I did this:

cd /

sudo chmod -R 775 macqiime/*.*

sudo chmod -R 775 macqiime/.


You will need the administrator password to run the last two commands.


Then copy all of the files to the proper locations

Full paths (note that this is installed off of your root directory):


The script file (from /scripts/) goes in:

/macqiime/QIIME/bin/


The unit test file (from /tests/) goes in:

/macqiime/QIIME/tests/


The library file (from /qiime/) goes in:

/macqiime/lib/python2.7/site-packages/qiime/


Then you should be able to do this:

macqiime truncate_reverse_primer.py -h

And get a response

Hope this helps,
Tony Walters

MarineLanda

May 22, 2012, 4:15:22 AM
to Qiime Forum
Hi again everyone,

With Tony's help I managed to clean my sequences... or so I thought!

I tried resubmitting my sequences, and again it appears that some
adapters remain in some of them!

If I understand the problem correctly, the
truncate_reverse_primer.py script removes the primer
sequence and everything following it, allowing for a given number of
mismatches... Is it possible that this script doesn't recognize
insertion/deletion events and therefore leaves some sequences
untrimmed?

Is there any way I can clean the sequences really efficiently? Or
maybe just trim off the 30 or so last nucleotides in all sequences
regardless of what those nucleotides are and just get it over with?

Thank you for your help,

Marine

Tony Walters

May 22, 2012, 1:46:41 PM
to qiime...@googlegroups.com
Hello Marine,

As the reverse primer truncation script uses a local alignment, it can handle indels (each one is treated as a mismatch).  The best answer would be to increase the number of allowed mismatches so that it is more sensitive to the primer sequence. If you go too high, though, you run the risk of it finding the primer in the wrong location, but this would probably only happen at a really high value.

-Tony
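
To make the effect of the mismatch threshold concrete, here is a simplified Python sketch. It is not the QIIME implementation: it uses an ungapped sliding-window comparison rather than a local alignment, so unlike truncate_reverse_primer.py it cannot absorb indels. The function names and the example read are made up for illustration; the primer is the one from the mapping file earlier in the thread.

```python
# Bases allowed by each IUPAC code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, window):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(1 for p, b in zip(primer, window) if b not in IUPAC[p])


def truncate_at_primer(read, primer, max_mismatches):
    """Cut the read at the best ungapped primer hit; return it unchanged if none."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(read) - len(primer) + 1):
        mm = count_mismatches(primer, read[pos:pos + len(primer)])
        if mm < best_mm:  # keep the earliest window with the fewest mismatches
            best_pos, best_mm = pos, mm
    return read if best_pos is None else read[:best_pos]


primer = "CAGCMGCCGCNGTAANAC"
# Hypothetical read: 14 bases of amplicon, the primer site, then adapter-like tail.
read = "ACGTACGTTTGACC" + "CAGCAGCCGCGGTAATAC" + "CTGAGACTGCCAAGGCACACAGGGGATAGG"
print(truncate_at_primer(read, primer, 0))  # ACGTACGTTTGACC
```

With a sequencing error inside the primer site, a threshold of 0 leaves the read untouched, while a threshold of 1 trims it. Raising the threshold much further makes it more likely that some unrelated window also qualifies.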

MarineLanda

May 22, 2012, 1:48:48 PM
to Qiime Forum
I'll try that, thank you Tony!

Marine


jespenshade

Jun 29, 2012, 5:02:47 PM
to qiime...@googlegroups.com
I've been following this thread and trying to use it to denoise my own sequences but I always get ValueError: too many values to unpack. What does this mean?

Thanks,

Jordan

Tony Walters

Jun 29, 2012, 5:08:31 PM
to qiime...@googlegroups.com
Hello Jordan,

Can you post the exact command you are using?  And how you have QIIME installed (native, macqiime, virtualbox, etc.), and also can you run print_qiime_config.py and post those results?

-Tony

jespenshade

Jul 2, 2012, 9:09:58 AM
to qiime...@googlegroups.com
Tony,
 
Haha, and then...?
 
The exact command was:
split_libraries.py -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -m Plate5Map.txt -t -M 1 -b 10 -l 350 -L 650 -z truncate_only --reverse_primer_mismatches 1

Qiime is installed via the updated 1.5.0 virtualbox.
 
print_qiime_config.py output:
System information
==================
         Platform: linux2
   Python version: 2.7.3 (default, Apr 20 2012, 23:04:22)  [GCC 4.6.3]
Python executable: /home/qiime/qiime_software/python-2.7.1-release/bin/python
Dependency versions
===================
                     PyCogent version: 1.5.1
                        NumPy version: 1.5.1
                   matplotlib version: 1.1.0
                  biom-format version: 0.9.3
                QIIME library version: 1.5.0
                 QIIME script version: 1.5.0
        PyNAST version (if installed): 1.1
RDP Classifier version (if installed): rdp_classifier-2.2.jar
QIIME config values
===================
                     blastmat_dir: /home/qiime/qiime_software/blast-2.2.22-release/data
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /home/qiime/qiime_software/core_set_aligned.fasta.imputed
                  cluster_jobs_fp: /home/qiime/qiime_software/qiime-1.5.0-release/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: /home/qiime/qiime_software/gg_otus-4feb2011-release/rep_set/gg_97_otus_4feb2011.fasta
                     torque_queue: friendlyq
              qiime_test_data_dir: None
   template_alignment_lanemask_fp: /home/qiime/qiime_software/lanemask_in_1s_and_0s
                    jobs_to_start: 1
                cloud_environment: False
                qiime_scripts_dir: /home/qiime/qiime_software/qiime-1.5.0-release/bin
            denoiser_min_per_core: 50
                      working_dir: /tmp/
                    python_exe_fp: /home/qiime/qiime_software/python-2.7.1-release/bin/python
                         temp_dir: /tmp/
                      blastall_fp: /home/qiime/qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep: 60
assign_taxonomy_id_to_taxonomy_fp: /home/qiime/qiime_software/gg_otus-4feb2011-release/taxonomies/greengenes_tax_rdp_train.txt
 
Thanks!
-Jo




Tony Walters

Jul 2, 2012, 12:34:48 PM
to qiime...@googlegroups.com
Hello Jo,

I'm able to run some test sequences locally (the tutorial data) using your commands, so the problem may be somewhere else (e.g. the mapping file or the sequences).  For the next step, can you run check_id_map.py on the mapping file and see if it returns any errors?  Also, can you post the original error that you saw (so I can look at the part of the code where the error is being raised)?

-Tony

jespenshade

Jul 2, 2012, 1:24:36 PM
to qiime...@googlegroups.com
Tony,

I'm sorry, I completely forgot what it was I couldn't do on Friday. My problem was with denoise_wrapper.py, not split_libraries.py. That's probably why you're more confused than I am.

So, starting over again, I ran the split libraries command like outlined above and that was fine. I used the sff tools on the .sff file to make it into a .sff.txt file and then tried to denoise it.

Exact command was denoise_wrapper.py -i 062712_plate5.sff.txt -f seqs.fna -o Denoised -n 1 --titanium

Output is:

Traceback (most recent call last):
  File "/home/qiime/qiime_software/qiime-1.5.0-release/bin/denoise_wrapper.py", line 159, in <module>
    main()
  File "/home/qiime/qiime_software/qiime-1.5.0-release/bin/denoise_wrapper.py", line 145, in main
    titanium=opts.titanium)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/denoise_wrapper.py", line 37, in fast_denoiser
    verbose=verbose, titanium=titanium)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/denoiser/flowgram_clustering.py", line 612, in denoise_seqs
    verbose=verbose, squeeze=squeeze, primer=primer)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/denoiser/preprocess.py", line 206, in preprocess
    primer=primer)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/denoiser/flowgram_filter.py", line 140, in truncate_flowgrams_in_SFF
    for f in flowgrams:
  File "/home/qiime/qiime_software/pycogent-1.5.1-release/lib/python2.7/site-packages/cogent/parse/flowgram_parser.py", line 130, in _sff_parser
    t = split_summary(s)
  File "/home/qiime/qiime_software/pycogent-1.5.1-release/lib/python2.7/site-packages/cogent/parse/flowgram_parser.py", line 96, in split_summary
    key, value = line.strip().split(':')

ValueError: too many values to unpack

Sorry for the confusion. I've only had two cups of coffee today...



Tony Walters

Jul 2, 2012, 2:17:43 PM
to qiime...@googlegroups.com
Hello again,

Could you run:
head 062712_plate5.sff.txt > example_flowgram_text.txt

and post the example_flowgram_text.txt data?

-Tony

jespenshade

Jul 2, 2012, 2:35:03 PM
to qiime...@googlegroups.com
Tony,

Common Header:
  Magic Number:  0x2E736666
  Version:       0001
  Index Offset:  480010496
  Index Length:  3024839
  # of Reads:    151206
  Header Length: 840
  Key Length:    4
  # of Flows:    800
  Flowgram Code: 1

- Jordan



Tony Walters

Jul 2, 2012, 3:04:47 PM
to qiime...@googlegroups.com
Hello again Jo,

I suspect that one or more lines in the .sff.txt file contain more than one colon (:) character, which would raise the ValueError you are seeing.  Can you look through some of the lines (probably the Run, Analysis, or Path section) to see if this is the case?

-Tony
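
The failing line in the traceback is `key, value = line.strip().split(':')`, which only works when a line contains exactly one colon. Rather than reading a large .sff.txt file by eye, a short Python sketch like the one below can list the lines that would break that unpacking. Both function names are hypothetical, and the second only illustrates a more tolerant split; it is not a patch to PyCogent.

```python
def find_multi_colon_lines(lines):
    """Return (line_number, text) for every line containing more than one colon."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if line.count(":") > 1
    ]


def split_summary_line(line):
    """Split 'key: value' on the first colon only, so colons in the value survive.

    Lines without any colon still raise ValueError, as in the original parser.
    """
    key, value = line.strip().split(":", 1)
    return key.strip(), value.strip()


# Usage (file name taken from the thread):
# with open("062712_plate5.sff.txt") as handle:
#     for number, text in find_multi_colon_lines(handle):
#         print(number, text)
```

If the scan reports nothing, the file may be damaged in some other way, and regenerating it is the simpler next step.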

jespenshade

Jul 2, 2012, 5:03:59 PM
to qiime...@googlegroups.com
Tony,

I can't find any places with multiple colons. It pretty much all reads like this:

>HQDHHHE01B3ZAT
  Run Prefix:   R_2012_06_27_15_33_07_
  Region #:     1
  XY Location:  0748_1251

  Run Name:       R_2012_06_27_15_33_07_JR08100313_Administrator_062712_16S_Stability
  Analysis Name:  D_2012_06_27_15_50_07_JR08100313_fullProcessingAmplicons
  Full Path:      /data/R_2012_06_27_15_33_07_JR08100313_Administrator_062712_16S_Stability/D_2012_06_27_15_50_07_JR08100313_fullProcessingAmplicons/

  Read Header Len:  32
  Name Length:      14
  # of Bases:       93
  Clip Qual Left:   5
  Clip Qual Right:  61
  Clip Adap Left:   0
  Clip Adap Right:  0

Jordan



Tony Walters

Jul 2, 2012, 5:10:22 PM
to qiime...@googlegroups.com
Hello Jo,

Could you try recreating the .sff.txt file using process_sff.py (with the -f parameter, see http://qiime.org/scripts/process_sff.html) and see if you get the same error?  I'm wondering if something went awry in the generation of that file.

-Tony

priesgo

Jul 9, 2012, 3:44:46 AM
to qiime...@googlegroups.com
Hi Tony and Jo,

I had the very same problem with one file, while everything worked fine with the other 7 files. I just recreated the problematic file with process_sff.py, as Tony said, and it worked.

In my opinion the cause could be within process_sff.py, but it could also be a corrupted file, as I have been moving the files between different machines.


Thanks!
Pablo.



jespenshade

Jul 16, 2012, 2:15:07 PM
to qiime...@googlegroups.com
Tony,

Yes, that was exactly it. Once I changed the .sff file into a .sff.txt inside my qiime VM, everything worked out. Thanks again!

- Jordan



MarineLanda

Mar 29, 2013, 7:16:42 AM
to qiime...@googlegroups.com
Hi again, 


I don't remember whether we discussed this at the time, but I'm wondering how the reverse primer truncation works...

If my primer is, let's say, CCGCNGCNGCTGGCAC

And some sequences in my data only have the first 5 or 6 nucleotides of it at the end, like this: CCGCNG

Will the script find it and remove it? Or can it only deal with sequences that have exactly the total number of nucleotides with or without mismatches, according to the parameters you chose? 

Thanks for your (always very appreciated) help. 

Marine

Tony Walters

Mar 29, 2013, 11:33:27 AM
to qiime...@googlegroups.com
Hello Marine,

Such a short match at the end would count as a lot of mismatches, so it wouldn't be removed.  The way to get around this would be to shorten the reverse primer (maybe to just the 3' half of it).  On the other hand, if a short region of the primer were left in the sequence, it probably would not make a huge difference to the downstream results, as this is still 16S (or whichever gene was targeted) sequence.  It's the sequence following the primer (barcode, linker, adapter) that causes the most problems, and you wouldn't have to worry about that in this case.

-Tony
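
For anyone who still wants to remove such partial primer ends, one option outside of QIIME is to look for the longest start of the primer that exactly ends the read. The Python sketch below does this. The function names and the min_len cutoff are assumptions for illustration, no mismatches are allowed, and a low min_len will also trim genuine 16S sequence that happens to look like the start of the primer.

```python
# Bases allowed by each IUPAC code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def trailing_primer_prefix(read, primer, min_len=5):
    """Length of the longest primer prefix that exactly ends the read, or 0."""
    for length in range(min(len(primer), len(read)), min_len - 1, -1):
        tail = read[-length:]
        if all(base in IUPAC[code] for code, base in zip(primer[:length], tail)):
            return length
    return 0


def trim_trailing_primer(read, primer, min_len=5):
    """Remove a partial primer from the end of the read, if one is found."""
    n = trailing_primer_prefix(read, primer, min_len)
    return read[:-n] if n else read


primer = "CCGCNGCNGCTGGCAC"
# Hypothetical read ending with the first six primer bases (N read as A).
print(trim_trailing_primer("ACGTTGCATTGACCGCAG", primer))  # ACGTTGCATTGA
```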



MarineLanda

Apr 2, 2013, 8:12:34 AM
to qiime...@googlegroups.com
Hi Tony, 

Thanks for your feedback. I agree, having a few primer nucleotides left is probably not very important. I noticed that when I use a short version of the primer, I start getting sequences that are cut in the middle. I guess I'd rather keep some primer at the end... 

Thanks again! 

Marine