Algorithms Talks

7 views

Skip to first unread message

Philip Bille

unread,

Feb 17, 2012, 3:00:38 AM2/17/12

to algolog, ma...@jesperkristensen.dk, vil...@gmail.com

The talk by Morten Stockel has been moved 45 min. and another talk by
Jesper Kristensen has been added immediately afterwards. Both talks
are on papers that will appear at LATA 2012.
All are welcome. Please forward to other interested people.

When: Monday, Feb. 27, 14.00-14.30
Location: 322 / 205
Title: Fast and Cache-Oblivious Dynamic Programming with Local
Dependencies
Speaker: Morten Stockel

Abstract: String comparison such as sequence alignment, edit distance
computation, longest common subsequence computation, and approximate
string matching is a key task (and often computational bottleneck) in
large-scale textual information retrieval. For instance, algorithms
for sequence alignment are widely used in bioinformatics to compare
DNA and protein sequences. These problems can all be solved using
essentially the same dynamic programming scheme over a two-
dimensional
matrix, where each entry depends locally on at most 3 neighboring
entries. We present a simple, fast, and cache-oblivious algorithm for
this type of local dynamic programming suitable for comparing large-
scale strings. Our algorithm outperforms the previous state-of-the-
art
solutions. Surprisingly, our new simple algorithm is competitive with
a complicated, optimized, and tuned implementation of the best cache-
aware algorithm. Additionally, our new algorithm generalizes the best
known theoretical complexity trade-offs for the problem.

----------------------------------------

When: Monday, Feb. 27, 14.30-15.00
Location: 322 / 205
Title: Longest Common Extensions via Fingerprinting
Speaker: Jesper Kristensen
Abstract: The longest common extension (LCE) problem is to prepro-
cess a string in order to allow for a large number of LCE queries,
such that the queries are efficient. The LCE value, LCEs(i,j), is the
length of the longest common prefix of the pair of suffixes starting
at index i and j in the string s. The LCE problem can be solved in
linear space with constant query time and a preprocessing of sorting
complexity. There are two known approaches achieving these bounds,
which use nearest com- mon ancestors and range minimum queries,
respectively. However, in practice a much simpler approach with linear
query time, no extra space and no preprocessing achieves significantly
better average case perfor- mance. We show a new algorithm,
Fingerprintk, which for a parameter k, 1 ≤ k ≤ ⌈logn⌉, on a string of
length n and alphabet size σ, gives O(kn1/k ) query time using O(kn)
space and O(kn + sort(n, σ)) prepro- cessing time, where sort(n,σ) is
the time it takes to sort n numbers from σ. Though this solution is
asymptotically strictly worse than the asymptotically best previously
known algorithms, it outperforms them in practice in average case and
is almost as fast as the simple linear time algorithm. On worst case
input, this new algorithm is significantly faster in practice compared
to the simple linear time algorithm. We also look at cache performance
of the new algorithm, and we show that for k = 2, cache optimization
can improve practical query time.

Reply all

Reply to author

Forward

0 new messages