Feature Requests


Bruce

Mar 30, 2009, 2:34:15 PM
to dna-align
Please respond to this thread with feature requests.

Bruce Kirchoff

Apr 9, 2009, 11:57:39 AM
to dna-align
How DNA-Align treats "-" versus "~"

I decided to post these comments here so we have a record of the
discussion. First, here is a summary of what Chris and I have
discussed by email. My comments are preceded by ">". Chris's comments
have no prefix.

> Query gaps does not seem to work on at least certain large data sets.
> I cannot get it to work on the Rubiaceae data set that I have
> attached to this e-mail. Just open the data set and run query gaps.
> You will see the problem.

Right. The problem (issue?) here is that all gaps in this file are
"-" and not "~".

My understanding is that a "-" gap is a "hard" gap, and should not
allow anything to shift into it.

I took the file, changed every "-" to "~" and then query gaps reported
a bunch of possible shifts, although nothing that made a huge
difference in the overall result. But I guess this isn't surprising,
since I think this FST file has already undergone a major hand-fixing
to get it where it is.

If my understanding of "-" versus "~" is wrong, that's easily fixable.
In fact, I think I originally treated "-" and "~" identically, but
then for some reason I changed it.
-------------------
My new comments start here.

I have checked with Dave about this and your understanding of the
difference between "-" and "~" is correct. The feature that BioEdit
has that is lacking in DNA-align is a means to convert "-" to "~" and
vice versa. That is, in BioEdit you can select certain gaps and
convert them from one form to another. You may also be able to convert
all the gaps globally. This gives the user a way to lock certain gaps
which would then not be moved by any operations done within the
program.

I think that this would be a good feature to add to DNA-align.
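
For reference, here is a minimal sketch of what such a conversion could
look like. It is not DNA-Align or BioEdit code. It assumes each
sequence is a plain string, that "-" is the hard (locked) gap and "~"
is the soft (movable) gap as described above, and that a selection is a
half-open column range. The names convert_gaps and convert_all are
hypothetical.

```python
HARD_GAP = "-"  # locked gap: nothing should shift into it
SOFT_GAP = "~"  # movable gap


def convert_gaps(seq, start=0, end=None, to_hard=True):
    """Return a copy of seq with the gaps in columns [start, end) converted.

    to_hard=True turns "~" into "-" (locks the gaps);
    to_hard=False turns "-" into "~" (unlocks them).
    """
    if end is None:
        end = len(seq)
    src, dst = (SOFT_GAP, HARD_GAP) if to_hard else (HARD_GAP, SOFT_GAP)
    region = seq[start:end].replace(src, dst)
    return seq[:start] + region + seq[end:]


def convert_all(alignment, to_hard=True):
    """Convert every gap in every sequence (the global conversion)."""
    return {name: convert_gaps(seq, to_hard=to_hard)
            for name, seq in alignment.items()}
```

Converting a selected range covers the "lock certain gaps" case, and
convert_all covers the global case, which is what changing every "-"
to "~" in the Rubiaceae file by hand amounted to.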

Bruce Kirchoff

Apr 9, 2009, 12:31:17 PM
to dna-align
Here are a number of suggestions for features.

1. When you open the program, go to the File menu and select "Open,"
you are taken to "My Documents." It would be better if the program
remembered the last directory that you opened, and took you back
there.

2. The overview is currently not scalable. It was scalable in a
previous version. It would be good to have this feature back,
especially for large data sets like the Rubiaceae. Looking at these
data sets zoomed all the way out does not give the user as much
information as he or she would like to have about the data. For
instance, there is a misaligned section in the Rubiaceae data set.
Right now one can only look at that section of the data at two scales.
It would be good to have it visible at many different scales. The
program could switch from displaying the nucleotides/amino acids as
characters to displaying them as colored rectangles when the font size
of the letters reaches some predefined point, say six points.

3. Dave made the suggestion that the column values be changed to the
GLOCSA column scores. I would like to see the GLOCSA column scores
added to the program, but would like to leave the expected
heterogeneity calculations as they are. We can eventually remove one
or the other, but for now I think it would be good to have both. I'm
not sure what to do about the heat maps as we will need to select one
of the column scores to calculate the color values. Perhaps an options
menu could be added to the drop-down list, and the user could select
which column score to use for the heat maps. I have another
suggestion for the option menu, below.

4. I would like to have a way to change the weightings on the GLOCSA
scores. This could be another selection from the options menu. Input
boxes could be added, and populated with the default weighting. The
default scores should also be listed in the help files so that the
users can easily go back to the defaults.

5. The GLOCSA calculations get very slow with large data sets. It
might be good to have a way to turn them off. We could play with the
program the way it is a bit more before we make a final decision on
this.

6. I still have not figured out how to move a row. Could you try this,
and make sure it works? I think it would also be good to have a way to
move a series of rows at the same time, that is, to move a whole
contiguous block of rows to a different position.

7. One component of the GLOCSA score depends on the number of columns
added in the automatic alignment. Right now this part of the score is
not implemented. We could implement it by doing one of two things. (A)
We could make the assumption that the original data had no gaps, and
just count the number of columns. We need Dave's opinion on whether
this would work. Helga is now a member of the group, but I do not
think that she is receiving e-mail yet. If she is, she could weigh in
on this issue too. (B) The second option is to allow the user to
manually input the number of columns in the original file. This could
be another selection under the options menu. I am assuming that this
number would be easily obtainable by the user. It would be good to get
input from Dave or Helga on this.

8. The help files will need to be updated after these changes. They
are already a little bit out of date. For instance, they do not
mention the undo function. I now have a way to record five-minute
videos of the computer screen with voiceover. I can make some of these
recordings after we have all of the features implemented.
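
As a starting point for item 7, here is a rough sketch of how the two
options might fit together. It is only one possible reading of option
(A): it assumes the original, gap-free column count can be estimated as
the length of the longest sequence once "-" and "~" are stripped, which
is exactly the point that needs to be checked with Dave or Helga. The
function names are hypothetical, and the GLOCSA formula itself is not
reproduced here.

```python
GAP_CHARS = "-~"  # both gap symbols count as gaps here


def ungapped_length(seq):
    """Number of residues in seq, ignoring "-" and "~"."""
    return sum(1 for ch in seq if ch not in GAP_CHARS)


def columns_added(alignment, original_columns=None):
    """Estimate how many columns the alignment step added.

    alignment maps sequence names to aligned strings of equal length.
    original_columns is option (B): a value typed in by the user.
    If it is None, option (A) is used, under the ASSUMPTION that the
    original column count equals the longest ungapped sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    aligned_columns = lengths.pop()
    if original_columns is None:
        original_columns = max(ungapped_length(s) for s in alignment.values())
    return aligned_columns - original_columns
```

Letting a user-supplied value override the estimate keeps both options
available behind one call, so the options menu entry would only need to
pass the number through when the user has filled it in.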