Hi Irene,
We usually keep our libraries separate: one folder for the heavy chains, one for kappa and one for lambda.
In each folder there should be three files:
V.fasta
J.fasta
D.fasta
This includes the kappa and lambda folders, so you will need to make a dummy D.fasta file for the light chain databases,
for example one containing just a single sequence like:
DUMMY.fasta
CCCCCC
The reason for this is that the IgBLAST module requires a D.fasta file or it will not proceed (even in the case of light chains that do not contain Ds).
No D assignments are output for the light chains, but the file must be present to stop the program from crashing.
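If you would rather script the dummy file than create it by hand, here is a minimal Python sketch. It assumes the dummy entry is written as an ordinary FASTA record; the function name write_dummy_d and the record name DUMMY are placeholders of mine, not anything IgBLAST or IgDiscover prescribes.

```python
from pathlib import Path


def write_dummy_d(folder):
    """Create <folder>/D.fasta holding one placeholder record.

    Assumption: a single FASTA record named DUMMY with the sequence
    CCCCCC is enough to satisfy the D.fasta requirement.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    path = folder / "D.fasta"
    path.write_text(">DUMMY\nCCCCCC\n")
    return path
```

You would call it once per light chain folder, for example write_dummy_d("IGK") and write_dummy_d("IGL"), using whatever folder names you have chosen.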
Anyway, it is important to get the database files into the correct structure for the program to work, which means trimming the IMGT header down to the allele name.
In other words, you want to change this:
>L10057|IGHV7-4-1*01|Homo sapiens|F|V-REGION|95..388|294 nt|1| | | | |294+0=294| | |
caggtgcagctggtgcaatctgggtctgagttgaagaagcctggggcctcagtgaaggtt
tcctgcaaggcttctggatacaccttcactagctatgctatgaattgggtgcgacaggcc
cctggacaagggcttgagtggatgggatggatcaacaccaacactgggaacccaacgtat
gcccagggcttcacaggacggtttgtcttctccttggacacctctgtcagcacggcatat
ctgcagatctgcagcctaaaggctgaggacactgccgtgtattactgtgcgaga
into this:
>IGHV7-4-1*01
caggtgcagctggtgcaatctgggtctgagttgaagaagcctggggcctcagtgaaggtt
tcctgcaaggcttctggatacaccttcactagctatgctatgaattgggtgcgacaggcc
cctggacaagggcttgagtggatgggatggatcaacaccaacactgggaacccaacgtat
gcccagggcttcacaggacggtttgtcttctccttggacacctctgtcagcacggcatat
ctgcagatctgcagcctaaaggctgaggacactgccgtgtattactgtgcgaga
There is a Perl program called edit_imgt_file.pl that can do this for each of your files downloaded from IMGT (download the IMGT sequences that do not have gaps!).
The program is available here:
ftp://ftp.ncbi.nih.gov/blast/executables/igblast/release/
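If Perl is not at hand, the header trimming shown above can also be sketched in a few lines of Python. This is not a port of edit_imgt_file.pl, which may do more than this; it only assumes that the allele name is the second "|"-separated field of the IMGT header, as in the example. The function names are my own.

```python
def simplify_imgt_header(line):
    """Reduce an IMGT FASTA header to '>' plus the allele name.

    Sequence lines, and headers without '|' fields, are returned unchanged.
    """
    if not line.startswith(">"):
        return line  # sequence line
    fields = line[1:].split("|")
    if len(fields) < 2:
        return line  # not an IMGT-style header, leave it alone
    return ">" + fields[1].strip()


def simplify_imgt_fasta(text):
    """Apply simplify_imgt_header to every line of a FASTA file's text."""
    return "\n".join(simplify_imgt_header(l) for l in text.splitlines()) + "\n"
```

Read each downloaded file, pass its text through simplify_imgt_fasta, and write the result to V.fasta, D.fasta or J.fasta in the matching folder.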
Do this for all the various database files (heavy chain Vs, Ds and Js, light chain Vs and Js) and place them in a single folder for each Ig type (one folder for IGH, with J.fasta, V.fasta and D.fasta files, one folder for IGK and one for IGL).
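Before starting a run, the folder layout described above can be checked with a small Python sketch like the one below. The folder names IGH, IGK and IGL are assumptions taken from the description; substitute your own names. The function name is hypothetical as well.

```python
from pathlib import Path

REQUIRED = ("V.fasta", "D.fasta", "J.fasta")


def missing_database_files(root, chains=("IGH", "IGK", "IGL")):
    """Return the expected database files that do not exist under root.

    Assumption: one sub-folder per chain type, each holding the three
    files in REQUIRED (a dummy D.fasta for the light chains).
    """
    root = Path(root)
    return [
        root / chain / name
        for chain in chains
        for name in REQUIRED
        if not (root / chain / name).is_file()
    ]
```

An empty list means every folder has its three files; otherwise the returned paths tell you what is still missing.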
Now you should be ready to use them in IgDiscover.
Regards,
Martin