Knuth-Morris-Pratt is not relevant here. We want an alignment, which
allows for indels, and not a common substring. Hint on this problem,
as people seem to be struggling: the answer is discussed elsewhere in
the homework!
> 2. (b) In the linear space alignment, the original problem of size mn is
> reduced to two subproblems
> of sizes km/2 and (n-k)m/2. In a fast, parallel implementation of sequence
> alignment, it is desirable to have a balanced partitioning that breaks the
> original problem
> into sub-problems of equal sizes. Design a linear space alignment algorithm
> with
> balanced partitioning.
>
> Q : What should be the runtime?
O(NM), the same as for usual Hirschberg's
> 3. (a) The score of a local alignment is not normalized over the length of
> the matching
> region. As a result, a local alignment with score 100 and length 100 will be
> chosen over a
> local alignment with score 99 and length 10, although the latter one is
> probably more
> important biologically. To reflect the length of the local alignment in
> scoring, the score
> F(I,J) of local alignment involving substrings I and J may be adjusted by
> dividing F(I,J)
> by the total length of the aligned regions: F(I,J)/(|I|+|J|). The normalized
> local alignment
> problem is to find substrings I and J that maximize F(I,J)/(|I|+|J|) among
> all substrings I
> and J with |I|+|J| >= k, where k is a threshold for the minimum overall
> length of I and J.
> Devise an algorithm for solving the normalized local alignment problem.
>
> Q : What should be the target runtime?
As fast as you can make it. O(NM) is prefered.
> 3. (d) Bacterial DNA is often organized into circular molecules. Given two
> strings x and y
> of length n and m, respectively, there are n circular shifts of x, and m
> circular shifts of y;
> therefore, there are nm pairs of circular shifts.
> Build an efficient algorithm to find the best global alignment among the nm
> different
> pairs of circular shifts. The trivial O(n2m2) algorithm is not good enough.
> A cubic time
> solution (i.e. O(n^2m)) will only get partial credit.
>
> Q : I have no intuition on how to solve this problem. Is there any hints you
> could give on this problem?
Think about whether two "optimal" alignments between distinct start
and end points can ever intersect.