Some C++ functions that may be useful?

Sergey Chernouhov

Apr 13, 2019, 6:19:17 AM
to Bio++ Development Forum
Dear Sirs,

Although I am not a professional programmer, I find bioinformatics a very interesting interdisciplinary field.

As I see it, Python is the "standard language" in this field.

But when I solved problems at rosalind.info, I used C++. As a result, a "lib of some functions" was born.

The lib contains 3 groups of functions. The first group is input-output functions (to read or write vectors, matrices, and graphs from or to a file with a single command, as in Python).

The second group is "Working with strings". It contains functions ranging from computing GC-content and Edit Distance to finding all mutated strings in a given one.
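For readers unfamiliar with these two tasks, here is a minimal sketch of what GC-content and Edit Distance functions could look like. It is not taken from the library; the names `gc_content` and `edit_distance` are hypothetical, and the edit distance is assumed to be the Levenshtein distance with unit costs.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: fraction of G and C characters in s.
// Returns 0.0 for an empty string.
double gc_content(const std::string& s) {
    if (s.empty()) return 0.0;
    std::size_t gc = 0;
    for (char c : s) {
        if (c == 'G' || c == 'C' || c == 'g' || c == 'c') ++gc;
    }
    return static_cast<double>(gc) / static_cast<double>(s.size());
}

// Hypothetical helper: Levenshtein distance (insert, delete, substitute,
// each with cost 1), computed with two rolling rows of the DP table.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t subst =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```

With these definitions, `gc_content("ATGC")` gives 0.5 and `edit_distance("kitten", "sitting")` gives 3.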

The third group is "Working with graphs". A data structure, the "Adjacency vector", is proposed. In the general case, vertices may be labeled with negative integers, and graphs may have multiple loops and edges.
Some functions, such as Eulerian cycle and path finding, topological sorting, etc., are implemented.
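The exact layout of the "Adjacency vector" is not described in this thread. As an illustration only, the sketch below assumes a graph stored as a flat list of (from, to) pairs, which allows negative vertex labels and repeated edges, and runs a topological sort on it using Kahn's algorithm. The names `Edge` and `topological_order` are hypothetical and are not the library's API.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <utility>
#include <vector>

// Assumed layout (hypothetical): one (from, to) pair per edge.
using Edge = std::pair<int, int>;

// Kahn's algorithm. Fills `order` with the vertices in topological order and
// returns true, or returns false if the graph has a cycle (loops included).
bool topological_order(const std::vector<Edge>& edges, std::vector<int>& order) {
    order.clear();
    std::map<int, std::vector<int>> out;  // successors of each vertex
    std::map<int, int> indeg;             // in-degree, counting multiple edges
    for (const Edge& e : edges) {
        out[e.first].push_back(e.second);
        out[e.second];                    // make sure the target is a key too
        indeg[e.first];                   // inserts 0 if not yet present
        ++indeg[e.second];
    }
    std::queue<int> ready;
    for (const auto& kv : indeg) {
        if (kv.second == 0) ready.push(kv.first);
    }
    while (!ready.empty()) {
        const int u = ready.front();
        ready.pop();
        order.push_back(u);
        for (int v : out[u]) {
            if (--indeg[v] == 0) ready.push(v);
        }
    }
    if (order.size() != indeg.size()) {   // some vertex never became ready
        order.clear();
        return false;
    }
    return true;
}
```

Keying the bookkeeping by `std::map` rather than by array index is what lets vertex labels be arbitrary (including negative) integers; a repeated edge simply contributes twice to the in-degree of its target.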


I understand that this lib lacks a great many features. For example, it cannot yet work with bioinformatics databases, and I cannot implement that on my own.

But might it be useful in developing Bio++?

Freely distributed source code and info are here:

Best regards, Chernouhov Sergey

Sergey Chernouhov

May 3, 2019, 2:54:58 PM
to Bio++ Development Forum

Added to GitHub: https://github.com/chernouhov/CBioInfCpp-0-

PS: I do declare that I do NOT clearly understand everything about GitHub, so for now I use it only as file hosting, since it is such a popular place.

Julien Y. Dutheil

May 9, 2019, 7:05:26 AM
to Bio++ Development Forum
Dear Sergey,

Thank you for your interest in Bio++. Some of your functions overlap with the ones in Bio++, and some might be a potential complement. The best way to contribute code to Bio++ is by making a "pull request" on GitHub. Please note that, for ease of maintenance, we usually only accept minor improvements and bug fixes; new developments that depend on Bio++ but are rather independent are better distributed separately. Finally, note that the bpp-raa library allows access to databases structured under the ACNUC system, which covers GenBank and EMBL.

All the best,

Julien.

Sergey Chernouhov

May 9, 2019, 8:44:28 PM
to Bio++ Development Forum
Hi.

As GitHub is something new to me, I do not clearly understand what a "pull request" is.

I am interested in collaboration in at least 2 ways: both developing tools and solving bioinformatics problems.


Sergey Chernouhov

Jun 27, 2019, 4:33:18 PM
to Bio++ Development Forum
23.06.2019 update:
- The group of functions "FindIn" has been updated.
- The functions PairVectorCout and PairVectorFout have been updated.
- The groups of functions "GraphCout" and "GraphFout" have been added. One may now "cout/fout" a graph given as an Adjacency vector to the screen or to a file line by line, one edge per line.
- The function "StrToCircular" has been added for finding the circular string of minimal length for the given one.
- The group of functions "MaxFlowGraph" has been added to help find the maximal flow, the paths of the maximal flow network, and the max-flow min-cut in a graph.
- A data structure "Adjacency map" (a modification of the "Adjacency vector" data structure for storing graphs) has been added. The Adjacency map allows quicker access to an edge's weight, but it can't work with multiple edges.
- The functions AdjVectorToAdjMap and AdjMapToAdjVector have been added for converting an Adjacency vector to an Adjacency map and back. Note that multiple edges will be joined together.
- The function TandemRepeatsFinding has been added. It is intended for finding tandem repeats in the given string, which may be useful for solving problems related to Microsatellite Instability etc.
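To illustrate the trade-off between the two graph structures mentioned above, here is a minimal sketch of such a conversion. It does not reproduce the library's AdjVectorToAdjMap or AdjMapToAdjVector: the `WeightedEdge` layout and the names `to_adj_map` and `to_edge_list` are hypothetical, and since the update note does not say how the weights of joined multiple edges are combined, this sketch assumes they are summed.

```cpp
#include <map>
#include <utility>
#include <vector>

// Assumed edge layout (hypothetical): one record per edge, duplicates allowed.
struct WeightedEdge {
    int from;
    int to;
    double weight;
};

// Map keyed by (from, to): O(log n) lookup of an edge's weight,
// but at most one entry per vertex pair.
using AdjMap = std::map<std::pair<int, int>, double>;

// Edge list -> map. ASSUMPTION: parallel edges are merged by summing weights.
AdjMap to_adj_map(const std::vector<WeightedEdge>& edges) {
    AdjMap m;
    for (const WeightedEdge& e : edges) {
        m[std::make_pair(e.from, e.to)] += e.weight;
    }
    return m;
}

// Map -> edge list, one record per (from, to) key, in key order.
std::vector<WeightedEdge> to_edge_list(const AdjMap& m) {
    std::vector<WeightedEdge> edges;
    for (const auto& kv : m) {
        WeightedEdge e;
        e.from = kv.first.first;
        e.to = kv.first.second;
        e.weight = kv.second;
        edges.push_back(e);
    }
    return edges;
}
```

Because the map holds one entry per vertex pair, converting to the map and back is lossy for graphs with multiple edges: the individual parallel edges cannot be recovered.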

Let's try together?