Bowtie Indexes for GeCKO v1 and v2 libraries


Jon Mortison

Sep 29, 2014, 9:06:38 AM
to cri...@googlegroups.com
Would it be possible to make the built indexes for alignment of the GeCKO libraries available online? It seems like that would be easy and a nice convenience, rather than having new users rebuild their own indexes every time (which I just did).

-Jon 

Neville Sanjana

Sep 29, 2014, 1:18:09 PM
to Jon Mortison, cri...@googlegroups.com
Hi Jon... that's a good idea. We'll upload them to the website. You will still need to use the CSV file version for mapping from sgRNAs to genes.

Thanks for the suggestion!

- Neville
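
As a rough illustration of the mapping step mentioned above: once reads are aligned to the sgRNA index, the library CSV can serve as a lookup table from sgRNA ID to gene. The sketch below assumes the CSV has a header row and the column order gene_id,UID,seq (the column names UID and gene_id appear in the R script later in this thread, but the order is an assumption). The function name and file names are made up for illustration.

```shell
# Hypothetical helper: print "UID<TAB>gene_id" for every sgRNA in a GeCKO CSV.
# Assumes a header row and the column order gene_id,UID,seq (check your file).
sgrna_to_gene() {
    # tr strips Windows carriage returns; NR > 1 skips the header row
    tr -d '\r' < "$1" | awk -F',' 'NR > 1 && NF >= 3 { print $2 "\t" $1 }'
}

# Example (file names are placeholders):
#   sgrna_to_gene Mouse_GeCKOv2_Library_A_09Mar2015.csv > sgrna_to_gene.tsv
```

The resulting two-column table can then be joined against per-sgRNA read counts to summarise results per gene.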

Gita Reinitz

Sep 30, 2014, 8:40:31 AM
to cri...@googlegroups.com, jdmor...@gmail.com, nsan...@mit.edu
Hi all,

I was actually just reading up on the GeCKO library yesterday. I use CRISPR for pretty standard gene editing, and the occasional knock-in, but I don't understand how your library works. I read through the protocol on how to replicate and transfect the library, but how do I screen for gene deletions afterwards? How do I know which cell contains what? I would be very happy for an easy, high-level explanation of how this works and what some potential applications would be.

Thanks

Jon Mortison

Oct 1, 2014, 10:40:45 AM
to cri...@googlegroups.com, jdmor...@gmail.com, nsan...@mit.edu
Terrific. Thanks!

Hi Eoon

Dec 6, 2014, 7:16:55 PM
to cri...@googlegroups.com
Hi Neville, 

I just started to analyse my screen data. I am a bit nervous and hoping for a good result.

In my hands, only ~70% of the trimmed reads can be mapped to the library index. Is this normal in your experience?
I used single-end 50 bp HiSeq sequencing. I have to admit I am very new to bioinformatics tools, with Google as my teacher. In particular, I am not very confident about the library index I made with bowtie. Could you please provide your index files for the mouse GeCKO v2 libraries?

Thanks in advance!

Best, 

Hieoon

Neville Sanjana

Dec 9, 2014, 12:03:25 AM
to Hi Eoon, cri...@googlegroups.com
Hi Hieoon,

Yes, it is normal to see 70-80% of the trimmed reads mapping. Of course, this depends heavily on your alignment algorithm, but that is typically what we see with bowtie1 when allowing only a single mismatch.

- Neville
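
To check a mapping rate like the 70-80% quoted above, one option is to count aligned records in the SAM output. The sketch below assumes bowtie1 was run with SAM output (`-S`) and that unaligned reads are present in the file; `mapping_rate` is a hypothetical helper name, and the commented bowtie command is only one plausible way to allow a single mismatch (check `bowtie --help` for the argument order in your version).

```shell
# Hypothetical helper: percentage of reads in a SAM file that aligned.
# A read counts as unmapped when bit 0x4 of the FLAG field (column 2) is set.
mapping_rate() {
    awk -F'\t' '
        /^@/ { next }                          # skip SAM header lines
        { total++ }
        int($2 / 4) % 2 == 0 { mapped++ }      # bit 0x4 clear: read aligned
        END {
            if (total == 0) { print "no reads"; exit 1 }
            printf "%.1f\n", 100 * mapped / total
        }
    ' "$1"
}

# Example, assuming an index prefix and read file of your own:
#   bowtie -v 1 -S gecko_index trimmed_reads.fastq aligned.sam
#   mapping_rate aligned.sam
```

Counting from the SAM file rather than from the aligner's summary keeps the check independent of the bowtie version used.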

Hi Eoon

Dec 9, 2014, 1:00:08 PM
to cri...@googlegroups.com, hie...@gmail.com, nsan...@mit.edu
Great thanks, Neville! 

Hieoon

bahli...@gmail.com

Feb 8, 2015, 3:05:46 AM
to cri...@googlegroups.com
Are the indexes for alignment of the GeCKO libraries available online? I can't seem to find them. Many thanks

Neville Sanjana

Feb 8, 2015, 9:10:09 AM
to bahli...@gmail.com, cri...@googlegroups.com
Hello: We have CSV files for making the indexes yourself. Since there are a few different versions of Bowtie (that do not have mutually compatible index files), we recommend that you make it yourself, which guarantees compatibility with your version of Bowtie. You can do this using the bowtie-build command and it should take less than a minute.


Best,

- Neville
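
A minimal sketch of that step, assuming the library has already been converted from CSV to FASTA: `bowtie-build` takes a reference FASTA and an output prefix. The wrapper name, the file names, and the pre-flight checks are illustrative additions, not part of the instructions above.

```shell
# Hypothetical wrapper around bowtie-build with two basic sanity checks.
build_gecko_index() {
    fasta=$1
    prefix=$2
    if [ ! -s "$fasta" ]; then
        echo "error: FASTA file '$fasta' is missing or empty" >&2
        return 1
    fi
    if ! command -v bowtie-build >/dev/null 2>&1; then
        echo "error: bowtie-build not found on PATH" >&2
        return 1
    fi
    # Writes the index files using the given prefix
    bowtie-build "$fasta" "$prefix"
}

# Example (placeholder names):
#   build_gecko_index gecko_library.fasta gecko_index
```

For Bowtie 2 the analogous command is `bowtie2-build`; an index built by one is not expected to work with the other, which is the compatibility issue described above.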

bahli...@gmail.com

Feb 24, 2015, 12:51:30 AM
to cri...@googlegroups.com, bahli...@gmail.com, nsan...@mit.edu
Thank you, Neville, for the quick reply. I am optimizing the sequencing protocol for the LifeTech Proton sequencing platform. I am planning to amplify the gRNA sequences with a different set of primers that will result in an amplicon of < 200 bp, using only one PCR amplification step. The sequencing adaptors are ligated in a second step (no PCR amplification step is needed to add them). The primer sequences are F: CTTGGCTTTATATATCTTGTGGAAAGG and R: AAGACCTAGCTAGCGAATTCAAA. The reverse primer does overlap with the chimeric RNA backbone region. Do you foresee any problem with this approach? Many thanks for your advice and for the amazing work you are doing.

N. Jacques

Neville Sanjana

Feb 24, 2015, 7:28:22 AM
to bahli...@gmail.com, cri...@googlegroups.com
Hello: I think ligation should work although I haven’t done it myself for GeCKO readout, as I do the 2 rounds of PCR. As for checking the annealing regions of your primer sequences, could you let me know if you’re using libraries with lentiCRISPRv2 or with lentiGuide-Puro? Once I know the backbone vector, I would be happy to check the annealing sequences for your amplicon.

- Neville

Neville Sanjana

Feb 24, 2015, 11:22:11 AM
to Lab Bahlis, cri...@googlegroups.com
Those sequences look good to me and should amplify the sgRNA 



Neville Sanjana

Feb 24, 2015, 11:22:54 AM
to Lab Bahlis, cri...@googlegroups.com
Those primer annealing sequences look good to me and should amplify the sgRNA cassette in lentiCRISPRv2.

Best,

- Neville

On Feb 24, 2015, at 11:16 AM, Lab Bahlis <bahli...@gmail.com> wrote:

Hi Neville

I am using lentiCRISPRv2.

thanks again for your prompt reply

Daniel Gulbranson

Mar 16, 2015, 11:34:59 PM
to cri...@googlegroups.com, bahli...@gmail.com, nsan...@mit.edu
I am probably missing something obvious... but I am having trouble getting the build index to work on the CSV files. I've gone through the bowtie2 tutorial, and can build an index from their example dataset (so I think bowtie2 installation was successful). I'm thinking I need to convert the CSV files that you have uploaded on your website to a different format prior to building the index from them, but I'm not sure. I am thinking they should be in the form below (and I would specify
>HGLibA_00001,A1B1G
GTCGCTGAGCTCCGATTCGA
>HGLibA_00002,A1B1G
ACCTGTAGTTGCCGGCGTGC
Is this what I need to do? Sorry for the newbie question. Thanks for your help!


gingras....@gmail.com

Mar 23, 2015, 3:20:30 PM
to cri...@googlegroups.com, bahli...@gmail.com, nsan...@mit.edu
I ran into the same problem with the mouse library v2, and indeed I converted the CSV to a FASTA file.
It was with bowtie1, but it should work like a charm.
Sebastien

d d

Feb 11, 2017, 5:27:54 AM
to Genome Engineering using CRISPR/Cas Systems, bahli...@gmail.com, nsan...@mit.edu
Hi folks,
Can anyone explicitly show how they converted this weird csv/html/xml file to fasta? Step-by-step.
Because I don't have a clue!

Cheers,
Donncha.

d d

Feb 14, 2017, 5:13:50 AM
to Genome Engineering using CRISPR/Cas Systems, bahli...@gmail.com, nsan...@mit.edu
Hi all,
If this is useful to anyone I figured it out.

# I downloaded the two mouse libraries from
# https://www.dropbox.com/s/171wfhm74qepv5z/Mouse_GeCKOv2_Library_A_09Mar2015.csv?dl=0
# https://www.dropbox.com/s/mtqz1mk5aje12bl/Mouse_GeCKOv2_Library_B_09Mar2015.csv?dl=0
# I renamed them "a" and "b" for simplicity
# then in R I...
a <- read.csv("a")  # read file a in
head(a)             # check it
b <- read.csv("b")  # read file b in
head(b)             # check it
c <- rbind(a, b)    # stack the two tables into one called c
c$id <- paste(c$UID, c$gene_id, sep = "_")  # join the gRNA ID with the gene name, creating a new unique ID which might be useful later on in your mapping
d <- c[, c(4, 3)]   # reorder the table to unique ID and sequence (R still finds the function c() here even though c is now also a data frame)
write.table(d, "d.tab", sep = "\t", quote = FALSE, row.names = FALSE)  # write the table out to disk
# Use tab_to_fasta and tidy up. It can be found at http://sequenceconversion.bugaco.com/converter/biology/sequences/tab_to_fasta.php
# There are ways to make the FASTA in R (the Biostrings package can do this), but the web interface is easier!
# then on the unix command line
wc -l sample-1.fasta  # count the number of lines
sed -n 3,260420p sample-1.fasta > sample2-1.fasta  # drop the top two lines, which arise from tab_to_fasta and are not needed
mv sample2-1.fasta gecko_mouse_lib_a_b_2.fasta  # rename the file
bowtie-build gecko_mouse_lib_a_b_2.fasta gecko_mouse_lib_a_b_2  # then use bowtie to make the index
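
For anyone who prefers to stay on the command line, the CSV-to-FASTA step can also be sketched with awk instead of the web converter. This is a rough alternative, not the method used above: it assumes each CSV has a header row and the column order gene_id,UID,seq, and `csv_to_fasta` is a made-up name. Headers are written as UID_gene, following the IDs built in the R code.

```shell
# Hypothetical helper: turn one or more GeCKO library CSVs into FASTA on stdout.
# Assumes a header row and columns gene_id,UID,seq (an assumption; check your file).
csv_to_fasta() {
    for f in "$@"; do
        # Strip carriage returns, skip the header, emit ">UID_gene" then the sequence
        tr -d '\r' < "$f" |
            awk -F',' 'NR > 1 && $3 != "" { print ">" $2 "_" $1; print $3 }'
    done
}

# Example (placeholder file names):
#   csv_to_fasta a b > gecko_mouse_lib_a_b_2.fasta
```

Because the header row is skipped before any FASTA is written, no extra lines need to be trimmed afterwards, and the output can be passed straight to bowtie-build.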