pre-release version 2.0.0-alpha-1


Carlos Peña

May 28, 2015, 7:44:23 AM
to voseq-disc...@googlegroups.com

VoSeq is being rewritten


I have come to the conclusion that a complete redesign of the inner workings of VoSeq will be quicker and more
efficient than trying to improve and fix the old PHP and MySQL code.

We are rewriting everything in Python using the Django framework. The advantage of Python is that we
can use the tools from Biopython for handling molecular data.

The advantage of Django is that the application can be built in a modular way, with isolated components. It also makes it
easy to write new plugins that extend VoSeq's functionality and can be incorporated with minimal effort.

I have also moved the database to PostgreSQL, which seems to have nicer features than MySQL. I am also trying to
include an indexing back-end so that the database can be queried efficiently and quickly.

This new version is usable, but I will be adding the missing features in the coming weeks. It is also trickier to install, as
it has more requirements (an Elasticsearch back-end and Python libraries).

The installation and configuration steps are detailed in VoSeq's readme file, https://github.com/carlosp420/VoSeq,
and any system administrator at your institution should be able to follow them, as the requirements are standard software.

However, the actual 2.0.0 release of VoSeq will include instructions for a quick and easy install using virtualization
tools such as Docker or Vagrant. These kinds of virtual environments have become very popular and allow installing a
variety of software and its requirements with a single command in a terminal.

Features

Most of the features in the last release, VoSeq 1.7.4, are ready in this pre-release version:

Re-written features

  • Browse page for vouchers recently modified/added to the database.
  • Blast new sequence tool for blasting any sequence against all or a subset of sequences kept in VoSeq.
  • View genes tool to quickly see what genes are currently in the database. New: there is a badge for each gene showing the number of voucher records that have sequences for that particular gene.
  • Create dataset tool to generate ready-to-run datasets for commonly used phylogenetic software such as TNT, PAUP, MrBayes, BEAST and RAxML. Note that it still needs the ability to generate datasets using the degenerated translations of Zwick et al. (2012).
  • Create voucher table tool that generates a publication-ready CSV table (importable in MS Excel) with information for each voucher record, such as Code, Genus, Species, Locality, whether sequences are present for each gene, etc.
  • Create gene table tool that generates a CSV table with statistics about particular DNA alignments (constructed from the vouchers and gene codes that you select), such as gene type, sequence length, dataset completion, percentage of variable sites, parsimony-informative sites, conserved sites, and the frequency of each nucleotide in your alignment.
  • Create GenBank FASTA file tool that generates a FASTA file with most of the required info, ready for submission to GenBank using their Sequin software.
  • Share data with GBIF tool that creates a data dump of the information from all vouchers as a CSV table (importable in MS Excel), ready to be used with GBIF’s IPT tool.
  • Advanced search tool for searching vouchers or sequences using a combination of data fields.
  • Batch modification of voucher data tool in the Administration interface.
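
To give an idea of the kind of statistics the gene table reports, here is a minimal sketch in plain Python. It is not VoSeq's actual code: the function name `alignment_stats` and the returned keys are made up for illustration, and it assumes the sequences are already aligned to the same length. Gaps and ambiguous characters are simply ignored, and a site counts as parsimony-informative when at least two different bases each occur in at least two sequences.

```python
from collections import Counter


def alignment_stats(sequences):
    """Summarise an alignment given as a list of equal-length DNA strings.

    Hypothetical helper for illustration only; not part of VoSeq.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")

    bases = "ACGT"
    variable = 0
    informative = 0
    totals = Counter()

    # Walk through the alignment column by column.
    for column in zip(*sequences):
        counts = Counter(b.upper() for b in column if b.upper() in bases)
        totals.update(counts)
        if len(counts) > 1:
            variable += 1
            # Informative: two or more bases, each seen in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1

    n_bases = sum(totals.values())
    frequencies = {b: (totals[b] / n_bases if n_bases else 0.0) for b in bases}

    return {
        "length": length,
        "variable_sites": variable,
        "parsimony_informative_sites": informative,
        "conserved_sites": length - variable,
        "percent_variable": 100.0 * variable / length if length else 0.0,
        "base_frequencies": frequencies,
    }


if __name__ == "__main__":
    example = ["ACGTAC", "ACGTTC", "ACCTTC", "ACCTAG"]
    for key, value in alignment_stats(example).items():
        print(key, value)
```

Columns that contain only gaps or ambiguous characters are counted as conserved in this sketch; a real tool may want to treat them separately.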

New features

  • Faster generation of datasets due to the use of more efficient algorithms.
  • Faster advanced searches due to the indexing of the database using the elasticsearch back-end.
  • General search tool in the navigation bar that accepts taxonomic keywords to search for voucher records.
  • Keyword suggestions tool for the general search. If users misspell a taxonomic name when they use this tool, they will be shown the possible correct spelling of the query with the message “Did you mean:”.
  • Pagination tool for searches, so that up to 20 results are shown per page along with links to next and previous pages.
  • Fine-grained permissions to upload and look at the sequences.
  • Login accounts. Only users with a working account will be able to look at or retrieve any DNA sequence. Users without an account will not be able to create datasets. However, they will still be able to look at voucher information, search for vouchers and sequences, and perform BLASTs.
  • Superuser account. Only the Superuser/Administrator will be able to create login accounts, change passwords and add user emails. Users can also be given specific permissions for most components of VoSeq, such as adding genes, gene sets, members, vouchers, primers, sequences, etc.
  • Users can be grouped for better management of permissions.
  • Batch deletion tools for vouchers and sequences.
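
Two of the items above, the “Did you mean:” suggestions and the 20-per-page pagination, can be illustrated with a small plain-Python sketch. This is only an illustration under stated assumptions, not how VoSeq does it: the suggestions here use the standard library's `difflib` as a stand-in for whatever the search back-end provides, the function names `suggest` and `paginate` and the 0.75 cutoff are hypothetical, and Django ships its own `Paginator` class that a real application would more likely use.

```python
import difflib


def suggest(query, known_names, max_suggestions=3, cutoff=0.75):
    """Return likely intended names for a misspelled query.

    Hypothetical stand-in that uses difflib instead of a search back-end.
    An exact (case-insensitive) match returns no suggestions.
    """
    by_lower = {name.lower(): name for name in known_names}
    cleaned = query.strip().lower()
    if cleaned in by_lower:
        return []
    matches = difflib.get_close_matches(
        cleaned, list(by_lower), n=max_suggestions, cutoff=cutoff
    )
    return [by_lower[m] for m in matches]


def paginate(results, page, per_page=20):
    """Return one page of results plus flags for previous/next links.

    Pages are numbered from 1; out-of-range pages give an empty list.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    end = start + per_page
    return {
        "results": results[start:end],
        "has_previous": page > 1,
        "has_next": end < len(results),
    }


if __name__ == "__main__":
    genera = ["Melitaea", "Danaus", "Heliconius"]
    print("Did you mean:", suggest("Melitea", genera))
    print(paginate(list(range(45)), page=3))
```
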
Download the pre-release version 2.0.0-alpha-1


cheers,


carlos