If you discover a project that looks like a good candidate for Debian Med, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Med mailing list.
The Bioperl project is a coordinated effort to collect computational methods routinely used in bioinformatics into a set of standard CPAN-style, well-documented, and freely available Perl modules. It is well-accepted throughout the community and used in many high-profile projects, e.g., Ensembl.
The CWL open standards are for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
The CWL reference implementation (cwltool) is intended to be feature complete and to provide comprehensive validation of CWL files, as well as other tools related to working with CWL descriptions.
AcePerl is an object-oriented Perl interface for the AceDB database. It provides functionality for connecting to remote AceDB databases, performing queries, fetching ACE objects, and updating databases. The programmer's API is compatible with the JADE Java API, and interoperable with the API used by BoulderIO.
AceDB is a genome database system developed since 1989 primarily by Jean Thierry-Mieg (CNRS, Montpellier) and Richard Durbin (Sanger Institute). It was originally developed for the C. elegans genome project, from which its name was derived (A C. elegans DataBase).
The BAM format is a binary format for storing sequence data. This is a lightweight C implementation of the read name collation code from the larger bambam C++ project to handle BAM file input and BAM file output.
BamTools facilitates research analysis and data management using BAM files. It copes with the enormous amount of data produced by current sequencing technologies that is typically stored in compressed, binary formats that are not easily handled by the text-based parsers commonly used in bioinformatics research.
Bio::ASN1::Sequence is basically a modified version of the high-performance Bio::ASN1::EntrezGene parser. It exists as a standalone module because it is more efficient to keep Sequence-specific code out of EntrezGene.pm.
Bio::Das::Lite is designed as a lightweight and more forgiving alternative to the client/retrieval/parsing components of Bio::Das. Bio::Das::Lite itself is not a drop-in replacement for Bio::Das but it can be subclassed to do so.
Bio::DB::BioFetch is a guaranteed best effort sequence entry fetching method. It goes to the Web-based dbfetch server located at the EBI to retrieve sequences in the EMBL or GenBank sequence repositories.
HTSlib is an implementation of a unified C library for accessing common file formats, such as SAM (Sequence Alignment/Map), CRAM and VCF (Variant Call Format), used for high-throughput sequencing data, and is the core library used by samtools and bcftools. HTSlib only depends on zlib. It is known to be compatible with gcc, g++ and clang.
HTSlib implements a generalized BAM (binary SAM) index, with file extension 'csi' (coordinate-sorted index). The HTSlib file reader first looks for the new index and then for the old if the new index is absent.
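The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the "new index first, old index second" rule, not HTSlib's actual code: the function name find_index is hypothetical, and the sketch assumes the index sits next to the data file as '<file>.csi' (new) or '<file>.bai' (old).

```python
import os


def find_index(bam_path):
    """Return the path of the first index found for bam_path, or None.

    Assumption: the index file is named by appending '.csi' (new index)
    or '.bai' (old index) to the data file name.
    """
    for suffix in (".csi", ".bai"):  # new index is tried before the old one
        candidate = bam_path + suffix
        if os.path.exists(candidate):
            return candidate
    return None
```

If both files exist, the sketch returns the '.csi' path; it falls back to '.bai' only when no '.csi' file is present.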
Provides a single place to set up some common methods for querying NCBI web databases. Bio::DB::NCBIHelper just centralizes the methods for constructing a URL for querying NCBI GenBank and NCBI GenPept and the common HTML stripping done in postprocess_data().
The Bio::DB::SeqFeature object is the default SeqFeature class stored in Bio::DB::SeqFeature databases. It implements both the Bio::DB::SeqFeature::NormalizedFeatureI and Bio::DB::SeqFeature::NormalizedTableFeatureI interfaces, which means that its subfeatures, if any, are stored in the database in a normalized fashion, and that the parent/child hierarchy of features and subfeatures is also stored in the database as a set of tuples. This provides efficiencies in both storage and retrieval speed.
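To make the idea of a normalized parent/child hierarchy concrete, here is a minimal Python sketch using an in-memory SQLite database. It is not the schema used by Bio::DB::SeqFeature: the table names (feature, parent2child), the column names, and the helper functions are all hypothetical and chosen only to show features stored once and their relationships stored as (parent, child) tuples.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one gene and two exon subfeatures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE feature (
            id INTEGER PRIMARY KEY, type TEXT, start INTEGER, stop INTEGER
        );
        -- Each row is one (parent, child) tuple of the hierarchy.
        CREATE TABLE parent2child (parent_id INTEGER, child_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?, ?, ?)",
        [(1, "gene", 100, 900), (2, "exon", 100, 300), (3, "exon", 600, 900)],
    )
    conn.executemany(
        "INSERT INTO parent2child VALUES (?, ?)", [(1, 2), (1, 3)]
    )
    return conn


def subfeatures(conn, parent_id):
    """Return the direct subfeatures of parent_id, ordered by start position."""
    rows = conn.execute(
        "SELECT f.id, f.type, f.start, f.stop "
        "FROM feature f JOIN parent2child p ON p.child_id = f.id "
        "WHERE p.parent_id = ? ORDER BY f.start",
        (parent_id,),
    )
    return rows.fetchall()
```

Because each feature is stored exactly once and the hierarchy lives in a separate table, fetching the children of a feature is a single join rather than a deserialization of a nested object.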
The Bioperl project is a coordinated effort to collect computational methods routinely used in bioinformatics into a set of standard CPAN-style, well-documented, and freely available Perl modules. This package provides a programmatic interface to NCBI's Entrez Programming Utilities, commonly referred to as E-utilities. Namely, it provides the Bio::DB::EUtilities and Bio::Tools::EUtilities Perl modules.
Entrez is a federated search engine at the National Center for Biotechnology Information (NCBI) for a large number of databases covering a variety of biomedical data, including nucleotide and protein sequences, gene records, three-dimensional molecular structures, and the biomedical literature. E-utilities are a set of eight server-side programs that provide a stable interface into the Entrez query and database system at NCBI.
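As an illustration of what a request to one of these server-side programs looks like, the following Python sketch builds a query URL for ESearch, one of the E-utilities. It does not use the Bio::DB::EUtilities API and sends nothing over the network; the helper name esearch_url and its default retmax value are hypothetical, while the base URL and the db, term and retmax parameters are those of the public E-utilities service.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for the given Entrez database and query term."""
    # urlencode percent-escapes characters such as '[' and ']' in the term.
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"
```

For example, esearch_url("nucleotide", "BRCA1[gene]", 5) yields a URL that asks ESearch for at most five matching record identifiers from the nucleotide database.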
The Bio::FeatureIO system can be thought of like biological file handles. They are attached to filehandles with smart formatting rules (e.g., GFF format, or BED format) and can either read or write feature objects (Bio::SeqFeature objects, or more correctly, Bio::FeatureHolderI implementing objects, of which Bio::SeqFeature is one such object). If you want to know what to do with a Bio::SeqFeatureI object, read Bio::SeqFeatureI.
The idea is that you request a stream object for a particular format. All the stream objects have a notion of an internal file that is read from or written to. A particular FeatureIO object instance is configured for either input or output. A specific example of a stream object is the Bio::FeatureIO::gff object.
The Bio::Graphics::Panel class provides drawing and formatting services for any object that implements the Bio::SeqFeatureI interface, including Ace::Sequence::Feature, Das::Segment::Feature and Bio::DB::Graphics objects. It can be used to draw sequence annotations, physical (contig) maps, protein domains, or any other type of map in which a set of discrete ranges needs to be laid out on the number line.
MAGE-TAB (MicroArray Gene Expression Tabular) format is a standard from the Microarray Gene Expression Data Society (MGED). This package contains Perl modules in the Bio::MAGE hierarchy to manipulate MIAME-compliant (Minimum Information About a Microarray Experiment) records of microarray ("DNA chips") experiments.
Bio::PrimerDesigner provides a low-level interface to the primer3 and epcr binary executables and supplies methods to return the results. In addition to accessing local installations of primer3 or e-PCR, it also offers the ability to access the primer3 binary via a remote server.
Bio::SamTools provides a Perl interface to the libbam library for indexed and unindexed SAM/BAM sequence alignment databases. It provides support for retrieving information on individual alignments, read pairs, and alignment coverage information across large regions. It also provides callback functionality for calling SNPs and performing other base-by-base functions. Most operations are compatible with the BioPerl Bio::SeqFeatureI interface, allowing BAM files to be used as a backend to the GBrowse genome browser application.
The Bio::SCF (Standard Chromatogram Format) module allows you to read and update (in a restricted way) SCF chromatographic sequence files. It is an interface to Roger Staden's io-lib. It has both tied hash and object-oriented interfaces. It provides the ability to read fields from SCF files and limited ability to modify them and write them back.
The Bio::Variation name space contains modules to store sequence variation information as differences between the reference sequence and changed sequences. Also included are classes to write out and recreate objects from EMBL-like flat files and XML. Lastly, there are simple classes to calculate values for sequence change objects.
BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It includes objects for manipulating sequences, file parsers, DAS client and server support, access to BioSQL and Ensembl databases, and powerful analysis and statistical routines including a dynamic programming toolkit.
BioJava is provided by a vibrant community which meets annually at the Bioinformatics Open Source Conference (BOSC) that traditionally accompanies the Intelligent Systems in Molecular Biology (ISMB) meeting. Much like BioPerl, using this library is valuable for everybody active in the field because of the many tricks of the trade one learns just by communicating on the mailing list.
BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It includes objects for manipulating sequences, file parsers, server support, access to BioSQL and Ensembl databases, and powerful analysis and statistical routines including a dynamic programming toolkit.
Bioparser is a C++ implementation of parsers for several bioinformatics formats. It consists of only one header file containing template parsers for FASTA, FASTQ, MHAP, PAF and SAM format. It also supports gzip-compressed files.
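For readers unfamiliar with what such a parser does, the following Python sketch reads FASTA records from a plain or gzip-compressed file. It is not Bioparser's C++ API and makes no claim about how Bioparser is implemented: the function name read_fasta is hypothetical, and the sketch assumes that compressed files can be recognised by a '.gz' file name suffix.

```python
import gzip


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file.

    Assumption: files whose name ends in '.gz' are gzip-compressed;
    everything else is read as plain text.
    """
    opener = gzip.open if path.endswith(".gz") else open
    name, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            else:
                chunks.append(line)  # sequence may span several lines
    if name is not None:
        yield name, "".join(chunks)
```

Writing the reader as a generator keeps memory use proportional to a single record, which matters for the large files typical of sequencing data.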
The CDK is a library of Java classes used in computational and information chemistry and in bioinformatics. It includes renderers, file IO, SMILES generation/parsing, maximal common substructure algorithms, fingerprinting and much, much more.
Chado is a relational database schema that underlies many GMOD installations. It is capable of representing many of the general classes of data frequently encountered in modern biology such as sequence, sequence comparisons, phenotypes, genotypes, ontologies, publications, and phylogeny. It has been designed to handle complex representations of biological knowledge and should be considered one of the most sophisticated relational schemas currently available in molecular biology. The price of this capability is that the new user must spend some time becoming familiar with its fundamentals.