Hi Nathaniel,
On Oct 22, 2012, at 10:44 AM, Nat G <ngg...@gmail.com> wrote:
> I've been trying to run Barista on my crux search output but unsuccessfully (I get an error with the command line below and argument not expected with variations of it).
>
> Since the crux output contains separate target and decoy files, I used the 'separate searches' option to specify the decoy file, as shown in the documentation. I added the decoy prefix as reverse since that was the option I specified in create-index. My command line looks like this:
>
> crux barista --overwrite T --decoy-prefix reverse --separate-searches [...]/search.decoy.txt --output-dir [...]/Barista/09252012-LC3_MS1 [...]/fasta/yeast_orf_trans_all_05-Jan-2010.fasta [...]/Ms2files/120912_yeast_lysate.ms2 [...]/search.target.txt
>
> What am I doing wrong?
>
> Should I not be specifying the decoy output of Crux? Also, should I be including the decoy database generated by Crux (_random.fasta) in the database argument?
Barista requires protein database entries for both the target and decoy sequences. These can be concatenated into a single FASTA file or stored in separate FASTA files. Since you are using 'crux search-for-matches', it's easiest to use separate FASTA files. To do this, however, you need to run 'crux create-index' to generate the FASTA file containing the decoys.
For example, using the sample files provided with Crux, demo.ms2 and small-yeast.fasta, you'd use the following steps. First, create a peptide index from the target protein database. This will speed up the search and generate a FASTA file containing the decoy protein sequences:
crux create-index --decoys reverse small-yeast.fasta yeast_index
The protein database for the targets is 'small-yeast.fasta'. The protein database for the decoys will be 'yeast_index/small-yeast-random.fasta'. Next, use the index you just generated to search for peptide-spectrum-matches:
crux search-for-matches --decoys reverse demo.ms2 yeast_index
You need to give Barista the names of two protein databases, one for the target and one for the decoys. This means that the protein database argument to Barista should be the name of a text file containing a list of the database file names. In this case you'd create a file named, say, 'db-list.txt', which would contain just the lines:
small-yeast.fasta
yeast_index/small-yeast-random.fasta
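If it helps, here is one way to create that list file from the shell. This is only a sketch: it assumes you are in the directory that holds small-yeast.fasta and yeast_index, and the existence check at the end is an optional extra of mine, not something Barista requires.

```shell
# Write the two database paths, one per line, into the list file.
printf '%s\n' \
  small-yeast.fasta \
  yeast_index/small-yeast-random.fasta > db-list.txt

# Optional sanity check: warn about any listed file that does not exist yet.
while IFS= read -r fasta; do
  [ -f "$fasta" ] || echo "warning: $fasta not found" >&2
done < db-list.txt
```

A warning here usually means create-index has not been run yet, or that the command is being run from a different directory.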
The command to run Barista is then:
crux barista --overwrite T \
--decoy-prefix reverse \
--separate-searches crux-output/search.decoy.txt \
db-list.txt demo.ms2 crux-output/search.target.txt
Let us know if this isn't clear, or if you run into further problems.
Charles