Version 0.78


Felipe Albrecht

Jan 7, 2010, 4:55:17 AM
to geno...@googlegroups.com

Genoogle BETA 0.78
Date: 07/01/2010.


Main changes:
  RNA sequence support.
  Ignoring invalid sequences during the data bank formatting process.


User changes:

RNA FASTA sequences can now be used. To do so, set the type attribute of the split-databanks element to indicate that the FASTA input files contain RNA sequences:
   <genoogle:split-databanks name="rna_db" path="files/fasta" mask="1101100110011011" type="RNA" number-of-sub-databanks="1" sub-sequence-length="10" low-complexity-filter="5">
      <genoogle:databank name="rna_fasta" path="rna_fasta_file.rna" />
   </genoogle:split-databanks>
If "type" is not set, Genoogle assumes that the input contains DNA sequences. For now, RNA sequences can only be searched against RNA data banks, and DNA sequences against DNA data banks.

The other change affects the data bank formatting process. If a sequence contains invalid characters, it is ignored and the formatting process continues without it.
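
To illustrate the skipping behaviour, here is a minimal sketch in Python. It is not Genoogle's actual code: the alphabets (ACGT for DNA, ACGU for RNA) and the function names `read_fasta` and `valid_records` are assumptions made only for this example, and Genoogle may accept other symbols.

```python
import io

# Assumed alphabets for illustration; Genoogle's accepted symbols may differ.
ALPHABETS = {"DNA": set("ACGT"), "RNA": set("ACGU")}


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def valid_records(handle, seq_type="DNA"):
    """Yield records whose sequence uses only the assumed alphabet; skip the rest."""
    allowed = ALPHABETS[seq_type]
    for header, seq in read_fasta(handle):
        if seq and set(seq.upper()) <= allowed:
            yield header, seq
        # An invalid record is dropped here and iteration simply continues.


sample = io.StringIO(">ok\nACGU\nGGCA\n>bad\nACGX\n>ok2\nuuga\n")
kept = list(valid_records(sample, "RNA"))
print(kept)
```

In this sketch the record named "bad" is dropped because of the "X", and the two remaining records are still processed, which is the same idea as the formatting process continuing without the invalid sequence.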

That is all. Please use this group to send questions, suggestions, and bug reports.

Felipe Albrecht

Rodrigo Jardim

Jan 7, 2010, 10:33:15 AM
to geno...@googlegroups.com
Felipe,

OK, thank you!

I managed to generate the database and make it available according to your documentation (adding the database to genoogle.xml). However, when I tried to run a search, I got the error below:

2010-01-07 13:30:44,537 [http-8080-Processor20] INFO  bio.pih.genoogle.Genoogle - Starting Genoogle .
2010-01-07 13:30:44,570 [http-8080-Processor20] INFO  bio.pih.genoogle.io.SplittedSequenceDatabank - Loading internals databanks
2010-01-07 13:30:44,570 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Loading databank '/usr/local/apache-tomcat-5.5.27/webapps/genoogle/files/fasta/Aquifex aeolicus/Aquifex aeolicus_sub_0.dsdb'.
2010-01-07 13:30:44,571 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 1 sequences.
2010-01-07 13:30:44,571 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 1551335 bases.
2010-01-07 13:30:44,571 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 141030 sub-sequences bases aprox.
2010-01-07 13:30:44,571 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank loaded in 1ms with 1 sequences.
2010-01-07 13:30:44,571 [http-8080-Processor20] INFO  bio.pih.genoogle.index.MemoryInvertedIndex - Loading inverted index.
2010-01-07 13:30:45,138 [http-8080-Processor20] INFO  bio.pih.genoogle.index.MemoryInvertedIndex - Inverted index loaded in 567
2010-01-07 13:30:45,139 [http-8080-Processor20] INFO  bio.pih.genoogle.io.SplittedSequenceDatabank - Loaded 1 of 1 sub-databanks.
2010-01-07 13:30:45,139 [http-8080-Processor20] INFO  bio.pih.genoogle.io.SplittedSequenceDatabank - Databanks loaded in 569ms.
2010-01-07 13:30:45,139 [http-8080-Processor20] INFO  bio.pih.genoogle.io.SplittedSequenceDatabank - Loading internals databanks
2010-01-07 13:30:45,139 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Loading databank '/usr/local/apache-tomcat-5.5.27/webapps/genoogle/files/fasta/Biowebdb/Biowebdb_sub_0.dsdb'.
2010-01-07 13:30:45,339 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 223934 sequences.
2010-01-07 13:30:45,340 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 394729794 bases.
2010-01-07 13:30:45,340 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank with : 35884526 sub-sequences bases aprox.
2010-01-07 13:30:45,340 [http-8080-Processor20] INFO  bio.pih.genoogle.io.AbstractSequenceDataBank - Databank loaded in 201ms with 223934 sequences.
2010-01-07 13:30:45,340 [http-8080-Processor20] INFO  bio.pih.genoogle.index.MemoryInvertedIndex - Loading inverted index.
2010-01-07 13:30:46,827 [http-8080-Processor20] FATAL bio.pih.genoogle.Genoogle - Map failed
- Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
        at org.apache.jsp.query_jsp._jspService(query_jsp.java:103)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:331)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:329)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Thread.java:619)


Best regards,
--
Sincerely,
Rodrigo Jardim
jardim....@gmail.com

Felipe Albrecht

Jan 7, 2010, 12:29:41 PM
to geno...@googlegroups.com
Rodrigo,
I really have no idea what problem occurred.
The log shows that the error happens when the inverted index starts being read.
So, do the following: delete the directory "/usr/local/apache-tomcat-5.5.27/webapps/genoogle/files/fasta/Biowebdb"
and re-run format_db.sh.
Pay attention to the log messages from format_db.sh; a problem may be occurring there.