Hi Nilesh,
    
    Hi,
      
      
      I am a third-year
        undergraduate student of computer science, pursuing my B.Tech
        degree at RCC Institute of Information Technology. I am
        proficient in Java, PHP and C#.
      
      
      Among the project
        ideas on the GSoC 2013 ideas page, the one that
        seemed most interesting to me is titled "Reactome
        Search". I want to work on it, and I think my experience will be
        of good use in this project.
    
    
    Thanks for your interest in the Reactome search project!
    
    We would like to use a Lucene-based search platform called EBeye to
    search our database.  It is currently being used for the databases
    at the European Bioinformatics Institute (EBI), but it ranks
    Reactome results poorly, because it does not use any
    domain-specific heuristics for sorting results.
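
    As a rough illustration of what a domain-specific sorting heuristic
    could look like, here is a minimal sketch in Python. It is only a
    sketch: the record types, their priorities and the fields of Hit are
    made up for illustration and are not taken from EBeye or from
    ResultsRanker.pm.

```python
# Hypothetical sketch: re-rank text-search hits using a domain-specific
# priority per record type, falling back to the engine's relevance score.
from dataclasses import dataclass

# Assumed priorities (lower sorts first); not Reactome's actual rules.
TYPE_PRIORITY = {"Pathway": 0, "Reaction": 1, "Protein": 2}
DEFAULT_PRIORITY = len(TYPE_PRIORITY)  # unknown types sort last


@dataclass
class Hit:
    name: str
    record_type: str
    score: float  # relevance score from the search engine, higher is better


def rerank(hits):
    """Sort by type priority first, then by descending relevance score."""
    return sorted(
        hits,
        key=lambda h: (TYPE_PRIORITY.get(h.record_type, DEFAULT_PRIORITY), -h.score),
    )


if __name__ == "__main__":
    hits = [
        Hit("TP53", "Protein", 9.1),
        Hit("Apoptosis", "Pathway", 4.2),
        Hit("Caspase activation", "Reaction", 6.5),
    ]
    for hit in rerank(hits):
        print(hit.record_type, hit.name, hit.score)
```

    The point of the sketch is that a plain relevance score would put
    the protein first, whereas a type-aware sort key lets the pathway
    come first regardless of its lower text score.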
    
    You can try EBeye on this page: 
    
    
http://www.ebi.ac.uk/s4/
    
    
    There is more detail about EBeye here: 
    
    
http://www.ebi.ac.uk/ebisearch/documentation.ebi
    
    
    A full research paper can be found here: 
    
    
http://bib.oxfordjournals.org/content/early/2010/02/11/bib.bbp065.full
    
    
    
      
      
      I am passionate about
        data mining, big data, search and recommendation engines,
        so this idea naturally appeals to me a lot. I have
        experience building search functionality into a live
        production site at the company where I am interning. I used
        Sphinx with MySQL and was responsible for all the database
        configuration, trigger and index creation, and full-text search
        configuration. I have thorough experience with Sphinx (a very
        capable full-text search engine with many matching and ranking
        algorithms and different fuzzy matching options) and am willing
        to dig deeper into Lucene or learn Solr if the need arises. I
        have a little experience with Lucene and its DefaultSimilarity
        class (TF-IDF scoring based on cosine similarity).
      
      
      I would like to
        download the Reactome source code and set it up on my local
        machine. However, I couldn't find any reference to a source code
        repository, other than a mention that it uses CVS. As suggested in 
http://wiki.reactome.org/index.php/Reactomes,
        I'm CC'ing David to help me out. It would be great if I could
        examine the code in the Perl CGI script (search2) and the code
        in 
GKB/modules/GKB/SearchUtils/ResultsRanker.pm
          to see how I can integrate it with a search platform like
          Lucene.
        
    
    You can download the Reactome source code bundle from:
    
    
http://www.reactome.org/download/current/GKB.tar.gz
    
    You will find our current sorting heuristics under: 
    
    GKB/modules/GKB/SearchUtils/ResultsRanker.pm 
    
    I hope this gives you something to get started on. Please let me
    know if you have any questions.
    
    Cheers, 
    
    David Croft.