Voila version 2.2 Release + Notes + Important information about upgrading

Paul Jewell

Jun 24, 2020, 6:49:04 PM
to majiq...@googlegroups.com

Hello everyone,


MAJIQ 2.2 is now available! There are many positive changes and efficiency improvements in the build process, as well as bug fixes, additional features, and library dependency upgrades.


Along with this new version comes a slight change in the way that updates are handled. The repository URL that you may install from, and the install instructions, are now slightly different. In order to obtain the new information, please go to https://majiq.biociphers.org/app_download/ (for academic users) or https://majiq.biociphers.org/commercial.php (for commercial users) and follow the instructions once more. 


Please also check out the new documentation available here and specifically the documentation about voila public server and passcode features on this page!


The patch notes for version 2.1 > 2.2 are as follows:

MAJIQ:

-Added 'incremental' mode, in order to add samples to an existing majiq build without reprocessing the existing samples already in the previous build

-Automatically detect read length per bam, removing the need to specify maximum read length in MAJIQ build configuration

-Updates to processing of annotated transcriptome to splicegraph
    -Fixed a memory leak in GFF3 parsing
    -Enable processing of gzipped GFF3 files
    -Fixed splicegraph inference for genes whose transcripts should indicate length-1 exons in the splicegraph
    -Enable length-0 "introns" between directly adjacent splicegraph exons
    -Fixed bug when processing alignments to contigs with names containing colons (e.g. HLA haplotypes)
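To illustrate the kind of problem colons in contig names can cause, here is a minimal sketch. It is not the MAJIQ code; the `contig:start-end` region format and the `parse_region` helper are assumptions made only for this example. Splitting on the last colon, rather than the first, keeps a name such as an HLA haplotype intact.

```python
from typing import Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Parse a hypothetical 'contig:start-end' string.

    The split happens on the LAST colon, so colons inside the contig
    name (e.g. HLA haplotype names) are preserved.
    """
    contig, sep, span = region.rpartition(":")
    if not sep or not contig:
        raise ValueError(f"expected 'contig:start-end', got {region!r}")
    start_text, dash, end_text = span.partition("-")
    if not dash:
        raise ValueError(f"expected 'start-end' after the last colon, got {span!r}")
    return contig, int(start_text), int(end_text)


# A naive region.split(":")[0] would return only "HLA-A*01" here.
print(parse_region("HLA-A*01:01:01:01:100-250"))
```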

-Fix bugs in coverage detection for junctions and introns
    -We now correctly handle intron coverage detection from split reads, fixing a bug where we would occasionally incorrectly count split reads to introns entirely skipped by the associated junction.
    -We updated how we define read positions for junctions and introns relative to an alignment rather than genomic coordinates. This fixes a bug with detecting coverage from reads with multiple splits, which would manifest itself as systematically underestimated coverage of junctions following short exons.
    -We now correctly search for all potential introns associated with an aligned read during intron coverage detection by using bisection over a properly partitioned auxiliary variable of the intron coordinates. This fixes a bug where a few potential introns would incorrectly be skipped because bisection was performed directly on the intron coordinates, which were not properly partitioned for bisection search.
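The bisection issue can be sketched as follows. This is an illustration under stated assumptions, not the MAJIQ implementation: the notes do not say what the auxiliary variable is, so the sketch assumes a running maximum of intron end coordinates, and `overlapping_introns` is a hypothetical helper. When introns are sorted by start, their ends need not be sorted, so bisecting on the ends directly can skip introns; the running maximum is non-decreasing and therefore safe to bisect on.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def overlapping_introns(introns, read_start, read_end):
    """Return introns overlapping [read_start, read_end] (closed coordinates).

    `introns` is a list of (start, end) tuples sorted by start.
    """
    starts = [start for start, _ in introns]
    # Assumed auxiliary variable: running maximum of the end coordinates.
    # Unlike the raw ends, it is non-decreasing, so bisection is valid.
    max_end_so_far = list(accumulate((end for _, end in introns), max))
    # Every intron before `first` ends before the read starts.
    first = bisect_left(max_end_so_far, read_start)
    # Every intron from `last` onward starts after the read ends.
    last = bisect_right(starts, read_end)
    # Candidates in between still need an explicit end check.
    return [(s, e) for s, e in introns[first:last] if e >= read_start]


introns = [(10, 100), (20, 30), (40, 50), (60, 70)]
ends = [end for _, end in introns]  # [100, 30, 50, 70]: not sorted

# Bisecting directly on the unsorted ends lands past the whole list,
# so the long intron (10, 100) would be skipped.
print(bisect_left(ends, 95))
print(overlapping_introns(introns, 95, 98))
```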

-Fix bugs in summarizing/bootstrapping over junction/intron coverage
    -Correctly detect read positions with outlier coverage (i.e. stacks)
        -Use leave-one-out mean as Poisson mean for Poisson p-values (previously incorrectly using leave-one-out sum)
        -Test all relevant positions (previously incorrectly skipped the next position after identifying a stack)
        -Correctly identify no stacks when all nonzero positions have exactly one read (previously used library would incorrectly calculate the CDF for Poisson distribution with mean=1, leading to these positions being all called as stacks)
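The three stack-detection fixes above can be sketched roughly as below. This is a simplified illustration, not the MAJIQ code: the function names, the `1e-7` threshold, and the choice to take the leave-one-out mean over nonzero positions only are assumptions. The p-value for each position is the Poisson upper tail P(X >= reads) with the leave-one-out mean as the Poisson mean.

```python
import math


def poisson_upper_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean), by direct summation of the pmf."""
    if mean <= 0:
        return 1.0 if k <= 0 else 0.0
    term = math.exp(-mean)  # pmf at 0
    below_k = 0.0
    for i in range(k):
        below_k += term
        term *= mean / (i + 1)  # pmf at i + 1
    return max(0.0, 1.0 - below_k)


def find_stacks(coverage, pvalue_threshold=1e-7):
    """Return indices of positions whose read count looks like a stack.

    `coverage` holds reads per position; `pvalue_threshold` is a
    hypothetical cutoff chosen for this example.
    """
    nonzero = [(pos, n) for pos, n in enumerate(coverage) if n > 0]
    if len(nonzero) < 2:
        return []
    total = sum(n for _, n in nonzero)
    stacks = []
    # Every nonzero position is tested; none is skipped after a hit.
    for pos, n in nonzero:
        # Leave-one-out MEAN (not sum) of the other nonzero positions.
        loo_mean = (total - n) / (len(nonzero) - 1)
        if poisson_upper_tail(n, loo_mean) < pvalue_threshold:
            stacks.append(pos)
    return stacks


print(find_stacks([1, 1, 0, 1, 1]))         # all ones: no stacks expected
print(find_stacks([2, 3, 2, 500, 3, 2]))    # position 3 stands out
```

With all nonzero positions at one read, the leave-one-out mean is 1 and the upper tail is 1 - exp(-1), about 0.63, so nothing is flagged.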

    -Fix underdispersed bootstrapped distribution of junction/intron coverage
        -Remove fixed parameter k for number of bootstrapped positions to sample, which underestimates variance for junctions/introns with fewer positions and overestimates variance for junctions/introns with more positions. Instead, sample one less than the number of nonzero positions for each junction/intron.
        -When the variance of this nonparametric procedure is less than the mean, perform parametric sampling using a Poisson distribution
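A rough sketch of the revised bootstrap is below. It is one reading of the notes, not the MAJIQ implementation: the scaling of the sampled mean to a total, the analytic variance check, and all names are assumptions. Under this reading, drawing n - 1 of the n nonzero positions with replacement and scaling the sampled mean by n gives a total whose variance is n times the unbiased sample variance; if that falls below the mean (the observed total), the sketch draws from a Poisson distribution instead.

```python
import math
import random
import statistics


def poisson_draw(rng, mean):
    """Knuth's multiplication method; only suitable for small means,
    which is all this sketch needs."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def bootstrap_total_coverage(nonzero_reads, n_boot, rng):
    """Bootstrap replicates of total coverage for one junction/intron."""
    n = len(nonzero_reads)
    if n < 2:
        raise ValueError("need at least two nonzero positions")
    total = sum(nonzero_reads)
    # Variance of n * mean(n - 1 draws with replacement) equals
    # n * (unbiased sample variance) of the observed positions.
    nonparametric_variance = n * statistics.variance(nonzero_reads)
    if nonparametric_variance < total:
        # Underdispersed relative to Poisson: sample parametrically.
        return [poisson_draw(rng, total) for _ in range(n_boot)]
    replicates = []
    for _ in range(n_boot):
        sample = rng.choices(nonzero_reads, k=n - 1)
        replicates.append(n * sum(sample) / (n - 1))
    return replicates


rng = random.Random(0)  # fixed seed so the example is repeatable
print(bootstrap_total_coverage([1, 1, 1, 1], 5, rng))    # Poisson branch
print(bootstrap_total_coverage([1, 1, 1, 50], 5, rng))   # nonparametric branch
```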

-Fix bugs in combining denovo coverage with annotated splicegraph
    -Denovo junctions must pass minimum number of positions (minpos) as well as reads (min-denovo) to be included in the splicegraph (previously only min-denovo, which was not our intention)
    -Fix detection of recursive exon extension going in the reverse direction. Denovo junctions that are sufficiently close to exon boundaries in the correct direction should lead to exon extension. Closeness to an existing exon was incorrectly applied when comparing increasing vs decreasing coordinates from the exon.
    -Introns are correctly split by denovo exons (previously introns would incorrectly only be kept on one side of the denovo exons)
    -Introns are correctly identified with their source and target exons (previously exons were sorted incorrectly by their string representation, which would fail for genes that crossed powers of 10 in genomic coordinates)
    -Introns always have start and end coordinates offset by 1 from source and target exons (non-overlapping with adjacent exons)
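The string-sorting problem mentioned above is easy to reproduce in isolation. The sketch below uses made-up exon coordinates and a hypothetical `start-end` string format; it only shows why ordering by string representation breaks once coordinates cross a power of 10.

```python
# Hypothetical exons for one gene spanning the 10,000 boundary.
exons = [(9950, 9990), (10020, 10100), (99, 120)]

# Sorting by a string representation compares character by character,
# so "10020-..." sorts before "9950-..." because "1" < "9".
by_string = sorted(exons, key=lambda exon: f"{exon[0]}-{exon[1]}")

# Sorting the integer tuples directly gives genomic order.
by_coordinate = sorted(exons)

print(by_string)      # [(10020, 10100), (99, 120), (9950, 9990)]
print(by_coordinate)  # [(99, 120), (9950, 9990), (10020, 10100)]
```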

-Updated simplifier behavior
    -min-experiments was set up as the reverse of what we intended
    -we now track experiments that are above simplifier threshold instead of below as criteria for or against simplification, so if enough experiments are above simplifier threshold, a junction/intron will be unsimplified
    -Added flag simplifier-min-experiments to majiq build to choose a different minimum number of experiments for simplifier filters

-Updates for LSV definitions
    -Fixed LSV descriptions used by VOILA for rendering LSV structure (previously would incorrectly order non-reference exons, leading to graphics that would not match splicegraph)
    -Added flag `--permissive-events` to majiq build to allow LSVs that are contained by (but not equivalent to) other events to be reported and later quantified

-Updates to underlying mathematical libraries and random number generation
    -Use Boost for efficient implementation of statistical distribution special functions (e.g. Poisson, Beta cumulative distribution functions)
    -Enable persistent random state when generating random numbers

-Other changes
    -Updated organization of majiq build --help to group related flags and provide more detailed explanation of each flag
    -Fixed saving of empirical deltapsi prior to voila file when using majiq deltapsi
    -Added flag `--dump-constitutive` to majiq build in order to report junctions that were structurally constitutive in the splicegraph
    -Updated majiq setup to ignore numpy deprecation warnings for now
    
VOILA:

-Bugfixes

-Added passcode option for public facing webserver usage / sharing.
    -Meant for securely sharing voila visualization results only with collaborators who have the appropriate link.
    
-Added choice of alternate webservers
    -debug server for hot reloading, quick change of input files
    -waitress mode for high performance single user use cases
    -gunicorn mode, with optional thread/worker count for multi user or server hosted use cases
    
-Various performance improvements / speedups
    -explicit sql indexes for splicegraph
    -multithreading and other improvements available for voila view indexing process
    
-More programmer friendly voila TSV headers
    -special symbols removed from headers and replaced with logical text-only alternatives
    -json parsable header added for easy access to metadata describing the corresponding voila run
    


Thanks for your help and we hope you enjoy the new majiq,


Biociphers Team.
