
Help biologist choose a new programming language


John Ladasky

Feb 6, 2003, 4:36:17 AM
Hi, folks,

After devoting several years to programming the most troublesome
computers of all, namely living cells, I am beginning to take an
interest in programming silicon again.

Far too much has changed since I last programmed a computer. It is
amazing how obsolete one's knowledge can become.

My personal programming background: like many, from about 1982 - 1987
I owned an Apple II. I got started with BASIC, and eventually
switched to 6502 assembler for greater speed. I was getting into the
guts of the machine, even doing crude operating system hacks. Those
were the days. Fresh from my undergraduate degree in 1990, I went to
work for a biotech company where my duties included some programming,
first in Turbo Pascal and later in Borland C (not C++). I was doing
data acquisition work, talking directly to hardware. When we switched
from DOS to the Windows 3.1 GUI, I had to program with the manuals
open on my lap, because of the hundreds of OS messages and function
calls -- but I managed. At home, I was tinkering with Laser C 2.0 on
an Atari ST 1040.

In 1993 I went to grad school, and essentially stopped programming.
Along came C++, and Java, and a host of other languages which may or
may not take hold. Operating systems changed again. Hardware became
so fast that, for many users, the performance gains obtained from
compiled languages were no longer important.

I bought a used copy of Borland C++ 4.5 around 1995, I think. By that
time I had retired the Atari and purchased a PC. I tried to do some
very simple programming, and the error messages issuing from the
compiler were absolutely incomprehensible. I put it aside so that my
advisor wouldn't kick me out of grad school.

What I would like to do at this point is some bioinformatics work,
data-mining GenBank. I am setting up a computer at home for this
project because, although it is biology research, it's tangential to
my current job. I have found both the BioJava and BioPerl web pages.
There are bioinformaticians who find merit in at least these two
languages...

I need the ability to read flat-format text files, seek out some key
words and sequence data, and analyze for patterns. Not too difficult,
right?

Well, I followed one friend's advice and investigated Java, perhaps a
little too quickly. I purchased Ivor Horton's _Beginning_Java_2_
book. It is reasonably well-written. But how many pages did I have
to read before I got through everything I needed to know, in order to
read and write files? Four hundred! I need to keep straight detailed
information about objects, inheritance, exceptions, buffers, and
streams, just to read data from a text file???

I haven't actually sat down to program in Java yet. But at first
glance, it would seem to be a step backwards even from the procedural
C programming that I was doing a decade ago. I was willing to accept
the complexity of the Windows GUI, and program with manuals open on my
lap. It is a lot harder for me to accept that I will need to do this
in order to process plain old text, perhaps without even any screen
output.

Here is what I think would make a good programming language for me
(but feel free to try to convince me that I should have other
priorities):

1) A low barrier to entry for performing simple tasks, such as
processing text files. This will allow me to accomplish the job I
want to do right now.

2) A language that doesn't force me to obsess about the details of
OOP.

3) I would like to return to graphical applications eventually.
Therefore the language should have a GUI library, either
Windows-specific or cross-platform.

4) Speed is nice, but secondary. When I consider the fact that my
Apple II was a 1.0 MHz machine with an 8-bit data bus, and my new
machine will be a hyper-threaded Pentium IV 2.0 GHz machine with a
32-bit (64-bit?) data bus, I'm willing to bet that even an Applesoft
BASIC interpreter would be fast enough.

Any suggestions? (I was kidding about BASIC.)

Thanks!

--
John J. Ladasky Jr., Ph.D.
Department of Biology
Johns Hopkins University
Baltimore MD 21218
USA
Earth

Nathan Haigh

Feb 6, 2003, 5:55:24 AM
"John Ladasky" <lad...@my-deja.com> wrote in message
news:c09b237b.03020...@posting.google.com...

I am a bioinformatics PhD student in my 2nd year. Before embarking on this
course I had no programming skills, so I took two programming courses
(Java and Perl). I was amazed at how much stuff you need to get through to
do a simple task such as opening a file, reading it a line at a time, parsing
each line and closing the file.

If you want a language that is easy to pick up and use, as well as having
very powerful pattern matching (regular expressions, or regex), then Perl is a
good bet. It is used widely in the bioinformatics field, and the
BioPerl modules are well developed and handle a lot of the common tasks
that you will come across. It also has a cross-platform graphical interface
module (Perl/Tk).
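
For a sense of scale, the open / read a line at a time / parse / close sequence mentioned above can be quite short in a scripting language. The following is only a minimal sketch, written in Python rather than Perl, and not taken from any course material; the function name `find_keyword_lines` and the sample text are made up for illustration.

```python
import io
import re


def find_keyword_lines(handle, pattern):
    """Return (line_number, line) pairs for the lines that match a regex."""
    regex = re.compile(pattern)
    hits = []
    # Iterating over a file handle yields one line at a time.
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if regex.search(line):
            hits.append((number, line))
    return hits


# Hypothetical sample data; a real run would pass open("some_file.txt").
sample = io.StringIO(
    "DEFINITION  example record\n"
    "KEYWORDS    kinase; membrane\n"
    "SOURCE      made-up organism\n"
)
hits = find_keyword_lines(sample, r"^KEYWORDS\s+")
print(hits)
```

With a real file, wrapping the call in `with open(path) as handle:` would also take care of closing the file.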

If you want some help setting it up/getting started, let me know.
Nathan


Jonathan G Campbell

Feb 6, 2003, 6:29:20 AM
John Ladasky wrote:
>
> Hi, folks,
>
> After devoting several years to programming the most troublesome
> computers of all, namely living cells, I am beginning to take an
> interest in programming silicon again.
>
[...]

> I need the ability to read flat-format text files, seek out some key
> words and sequence data, and analyze for patterns. Not too difficult,
> right?
>
> Well, I followed one friend's advice and investigated Java, perhaps a
> little too quickly. I purchased Ivor Horton's _Beginning_Java_2_
> book. It is reasonably well-written. But how many pages did I have
> to read before I got through everything I needed to know, in order to
> read and write files? Four hundred! I need to keep straight detailed
> information about objects, inheritance, exceptions, buffers, and
> streams, just to read data from a text file???
>
> I haven't actually sat down to program in Java yet.

That's the problem.

> But at first
> glance, it would seem to be a step backwards even from the procedural
> C programming that I was doing a decade ago.

[...]

>
> Here is what I think would make a good programming language for me
> (but feel free to try to convince me that I should have other
> priorities):
>
> 1) A low barrier to entry for performing simple tasks, such as
> processing text files. This will allow me to accomplish the job I
> want to do right now.
>
> 2) A language that doesn't force me to obsess about the details of
> OOP.
>
> 3) I would like to return to graphical applications eventually.
> Therefore the language should have a GUI library, either
> Windows-specific or cross-platform.
>
> 4) Speed is nice, but secondary.

Go with Java. You'll be at least twice as productive as with C++ or C.
Of course, if _all_ you are doing (and aiming to do) is ripping text
files apart and putting them together in another way, then the Perl
suggestion is apt.

Good luck,

Jon C.

--
Jonathan G Campbell BT48 7PG jg.ca...@ntlworld.com 028 7126 6125
http://homepage.ntlworld.com/jg.campbell/

gswork

Feb 6, 2003, 10:51:30 AM
lad...@my-deja.com (John Ladasky) wrote in message news:<c09b237b.03020...@posting.google.com>...

> Hi, folks,
>
> After devoting several years to programming the most troublesome
> computers of all, namely living cells, I am beginning to take an
> interest in programming silicon again.
>
> Far too much has changed since I last programmed a computer. It is
> amazing how obsolete one's knowledge can become.

Or not... your background, though without OOP, is pretty good: C,
assembly and BASIC. That's a nice mix of experience, and it leaves
plenty of options for today.

> When we switched
> from DOS to the Windows 3.1 GUI, I had to program with the manuals
> open on my lap, because of the hundreds of OS messages and function
> calls -- but I managed.

If you successfully wrote raw Win16 API code then all credit to you!

> What I would like to do at this point is some bioinformatics work,
> data-mining GenBank. I am setting up a computer at home for this
> project because, although it is biology research, it's tangential to
> my current job. I have found both the BioJava and BioPerl web pages.
> There are bioinformaticians who find merit in at least these two
> languages...
>
> I need the ability to read flat-format text files, seek out some key
> words and sequence data, and analyze for patterns. Not too difficult,
> right?

Many languages are well suited to that without the need to get
involved in manual string memory management: Perl notably, though
I don't like it all that much myself, and Pascal & BASIC too. Java
is suited to it as well, but....

> Well, I followed one friend's advice and investigated Java, perhaps a
> little too quickly. I purchased Ivor Horton's _Beginning_Java_2_
> book. It is reasonably well-written. But how many pages did I have
> to read before I got through everything I needed to know, in order to
> read and write files? Four hundred! I need to keep straight detailed
> information about objects, inheritance, exceptions, buffers, and
> streams, just to read data from a text file???

I read the previous edition (for Java 1.1) and thought it was
reasonable too, and noted the volume of pages before 'doing stuff',
but in fairness it is aimed at beginners. Actually writing file I/O
programs in Java isn't too involved, but it doesn't offer the simple
approach of the others (IMO).

I've done small text I/O utilities in BASIC and Pascal (Turbo Pascal,
Delphi & Freepascal); those are my choices for that kind of task,
especially Delphi because of its simple filestreams and the promise
of portability (to Linux). Freepascal might be worth a look if you're
interested. Perl advocates can show you some pretty compact scripts
that do pretty large amounts of work though, so check that too.

> Here is what I think would make a good programming language for me
> (but feel free to try to convince me that I should have other
> priorities):
>
> 1) A low barrier to entry for performing simple tasks, such as
> processing text files. This will allow me to accomplish the job I
> want to do right now.

BASIC, Pascal, Perl - examples for this stuff all over the net.

Various other scripting languages are probably equally good, and your
familiarity with C will reward you if you go that route.

> 2) A language that doesn't force me to obsess about the details of
> OOP.

BASIC, C, Pascal and plenty of others won't force you into OOP.

> 3) I would like to return to graphical applications eventually.
> Therefore the language should have a GUI library, either
> Windows-specific or cross-platform.

Delphi/Kylix has RAD. There are GUI kits for many languages though, and
if you were OK with Win16 you should be able to pick one up.

> 4) Speed is nice, but secondary. When I consider the fact that my
> Apple II was a 1.0 MHz machine with an 8-bit data bus, and my new
> machine will be a hyper-threaded Pentium IV 2.0 GHz machine with a
> 32-bit (64-bit?) data bus, I'm willing to bet that even an Applesoft
> BASIC interpreter would be fast enough.

Isn't that 3.0 GHz? (I remember reading about the new HT 3 GHz P4 in a
PC magazine.) They're all pretty fast nowadays!

Hope comp.programming gives you ideas. A good bet is to try out a few
languages and do some simple file I/O in each; the one that 'clicks'
with you the most would be good to pursue.

bd

Feb 6, 2003, 6:45:22 PM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Ladasky wrote:

> Hi, folks,
>
> After devoting several years to programming the most troublesome
> computers of all, namely living cells, I am beginning to take an
> interest in programming silicon again.
>
> Far too much has changed since I last programmed a computer. It is
> amazing how obsolete one's knowledge can become.
>
> My personal programming background:

<snip>


> I need the ability to read flat-format text files, seek out some key
> words and sequence data, and analyze for patterns. Not too difficult,
> right?

Nope. Would regexes work?

> Well, I followed one friend's advice and investigated Java, perhaps a
> little too quickly. I purchased Ivor Horton's _Beginning_Java_2_
> book. It is reasonably well-written. But how many pages did I have
> to read before I got through everything I needed to know, in order to
> read and write files? Four hundred! I need to keep straight detailed
> information about objects, inheritance, exceptions, buffers, and
> streams, just to read data from a text file???
>
> I haven't actually sat down to program in Java yet. But at first
> glance, it would seem to be a step backwards even from the procedural
> C programming that I was doing a decade ago. I was willing to accept
> the complexity of the Windows GUI, and program with manuals open on my
> lap. It is a lot harder for me to accept that I will need to do this
> in order to process plain old text, perhaps without even any screen
> output.
>
> Here is what I think would make a good programming language for me
> (but feel free to try to convince me that I should have other
> priorities):
>
> 1) A low barrier to entry for performing simple tasks, such as
> processing text files. This will allow me to accomplish the job I
> want to do right now.

Perl is great for text-editing.

> 2) A language that doesn't force me to obsess about the details of
> OOP.

Perl allows OOP, but does not require it.

> 3) I would like to return to graphical applications eventually.
> Therefore the language should have a GUI library, either
> Windows-specific or cross-platform.

Once again, Perl. GTK, Tk, Qt bindings at least.

> 4) Speed is nice, but secondary. When I consider the fact that my
> Apple II was a 1.0 MHz machine with an 8-bit data bus, and my new
> machine will be a hyper-threaded Pentium IV 2.0 GHz machine with a
> 32-bit (64-bit?) data bus, I'm willing to bet that even an Applesoft
> BASIC interpreter would be fast enough.

It's acceptably fast for me.

> Any suggestions? (I was kidding about BASIC.)

http://www.perl.com

- --
Replace spamtrap with bd to reply.
Freenet distribution not available
Seattle is so wet that people protect their property with watch-ducks.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+QvNTx533NjVSos4RAu7NAJ4qOXKcdafVh2hmdu2W+s+fLt5+2QCfX3yy
Zi5AY7iWatfANlpUMFYg+uY=
=tgj+
-----END PGP SIGNATURE-----

Pierre Asselin

Feb 6, 2003, 9:24:43 PM
In comp.programming John Ladasky <lad...@my-deja.com> wrote:

> [ ... ]


> I need the ability to read flat-format text files, seek out some key
> words and sequence data, and analyze for patterns. Not too difficult,
> right?

I'd say perl is the best darn language for this sort of thing. I'm not
sure how much fun you'll have learning perl. I started using it as
perl4, when it looked like awk + sed + all the Unix shells, thrown into
a blender. So it was easy for me.

Another possibility is Tcl. Tcl is a really, really weird language,
but it's also small. You can get over the syntax in one afternoon
and start writing code.

No matter what language you pick, you'll need to learn "regular
expressions", because that's how you will recognize the strings in your
files. Be prepared: regular expressions are a write-only language.
You write one, and seconds later you can't understand it. Luckily,
they usually work as intended.
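
One common way to soften the write-only problem is to spread a pattern over several lines and comment each piece. The following is a minimal sketch in Python (not Perl or Tcl), using the `re.VERBOSE` flag of the standard `re` module. The pattern, the group names and the sample line are invented for illustration and are not a parser for any real file format.

```python
import re

# re.VERBOSE lets whitespace and "#" comments appear inside the pattern.
RECORD_LINE = re.compile(
    r"""
    ^(?P<name>[A-Za-z]\w*)   # an identifier at the start of the line
    \s+                      # one or more whitespace characters
    (?P<length>\d+)          # a run of digits
    \s+bp\b                  # the literal unit bp as a whole word
    """,
    re.VERBOSE,
)

# Made-up input line.
match = RECORD_LINE.search("ABC123   1500 bp   DNA")
if match:
    print(match.group("name"), int(match.group("length")))
```

Named groups such as `name` and `length` make it a little easier to see, weeks later, what each part of the pattern was meant to capture.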

Dan Tex1

Feb 7, 2003, 2:20:11 AM
>Subject: Help biologist choose a new programming language
>From: lad...@my-deja.com (John Ladasky)
>Date: 2/6/03 1:36 AM Pacific Standard Time
>Message-id: <c09b237b.03020...@posting.google.com>

I doubt all of the tried-and-true computer scientists in here would agree with
me, but... if all you need to do is text input/output and manipulation,
Fortran offers incredibly easy-to-understand syntax with almost a "zero"
learning curve, yet it is quite excellent (and extremely fast) at
text manipulation.

Read a book or manual for just an hour ( maybe two ) and you'll be opening
files, closing files, parsing and manipulating text like a pro. And most text
manipulation that you perform won't even require that you memorize specific
functions of the language.

Yes, I do a fair deal of numerically oriented code (a large part of why I use
Fortran); however, I generally do quite a bit of text manipulation as well
within my codes. I've found that I can create Windows-style GUIs quite easily
with Fortran.

For a simple syntax example: let's say that x="abcdefghijk"

you can access the substring "cd" from within x simply
by referring to x(3:4).
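
The same substring idea exists in most languages, but the index conventions differ, which is an easy place to slip. A small comparison sketch in Python (used here only for contrast, it is not the language recommended in this post): a Fortran substring x(i:j) counts from 1 and includes both ends, while a Python slice counts from 0 and excludes the end. The helper `fortran_substring` is hypothetical.

```python
x = "abcdefghijk"

# Fortran positions 3..4 correspond to Python indices 2..3,
# written as the half-open slice [2:4].
cd = x[2:4]
# Fortran positions 3..5 cover three characters.
cde = x[2:5]
print(cd, cde)


def fortran_substring(text, first, last):
    """Hypothetical helper: mimic Fortran's text(first:last) indexing."""
    return text[first - 1:last]


print(fortran_substring(x, 3, 4))
```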

Dan :-)

lvi...@yahoo.com

Feb 7, 2003, 2:35:55 PM

According to John Ladasky <lad...@my-deja.com>:
:1) A low barrier to entry for performing simple tasks, such as
:processing text files. This will allow me to accomplish the job I
:want to do right now.
:
:2) A language that doesn't force me to obsess about the details of
:OOP.
:
:3) I would like to return to graphical applications eventually.
:Therefore the language should have a GUI library, either
:Windows-specific or cross-platform.
:
:4) Speed is nice, but secondary.

I came from a very similar background. I would recommend that you
take a look at http://www.tcl.tk/ . There are things like
<URL: http://wiki.tcl.tk/Biowish > and other bioinformatics projects
making use of it.


--
Tcl - The glue of a new generation. <URL: http://wiki.tcl.tk/ >
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
<URL: mailto:lvi...@yahoo.com > <URL: http://www.purl.org/NET/lvirden/ >

Catherine Letondal

Feb 10, 2003, 4:39:06 AM

I would say that all languages are fine. After all, they are so similar...
Maybe look at Python, though, since 1), 2), 3) and 4) are all met (especially 1).
Regarding 2), you are not required to write object-oriented code, but once you want to, you have
a nicely featured object-oriented language. And you have Biopython.
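
As an illustration of point 1), here is a minimal sketch that uses only plain functions and the standard library, with no classes required. The record is invented and only imitates the general layout of a GenBank flat file (an ORIGIN line, numbered sequence lines, and a closing //). `read_sequence` is a hypothetical helper written for this example, not part of Biopython.

```python
import io
from collections import Counter


def read_sequence(handle):
    """Collect the bases listed between an ORIGIN line and the closing //."""
    in_origin = False
    parts = []
    for line in handle:
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            # Drop the leading position number; keep the blocks of bases.
            parts.extend(line.split()[1:])
    return "".join(parts)


# Made-up record in the general shape of a GenBank flat file.
record = io.StringIO(
    "LOCUS       EXAMPLE1     20 bp    DNA\n"
    "ORIGIN\n"
    "        1 gatcctccat atacaacggt\n"
    "//\n"
)
sequence = read_sequence(record)
counts = Counter(sequence)
print(sequence, counts["a"])
```

For real GenBank records, the Biopython project mentioned above is the more robust route; this sketch only shows how little ceremony the basic file handling needs.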

http://www.python.org/
http://www.biopython.org/

We have a course for biologists knowing programming:
http://www.pasteur.fr/recherche/unites/sis/formation/python/

and for biologists wanting to learn:
http://www.pasteur.fr/formation/infobio/python/

>
>Thanks!
>
>--
>John J. Ladasky Jr., Ph.D.
>Department of Biology
>Johns Hopkins University
>Baltimore MD 21218
>USA
>Earth
>

--
Catherine Letondal -- Pasteur Institute Computing Center
