ANN: biostar-central

Istvan Albert

Mar 24, 2011, 5:02:18 PM
to biostar-central
Since we are getting more and more questions on the future of BioStar I
felt compelled to explore the possibility of writing our own version
of the code.

Now I know many people (including myself) frown on such undertakings
but in this case I honestly think this task is not as complicated as
many make it sound (and it certainly should not be as complicated as
the codebase of OSQA, which also happens to be Python code using
Django).

The problem just does not seem complicated enough to warrant turning
over the control to a company that wants to monetize it. We really owe
it to ourselves to see whether it would be realistic to have our own
codebase in the first place. After all we already work in one of the
most challenging computational fields.

So I sat down the last two days and spent some time putting together a
sketch of how this might work (as it turns out I spent about 85% of
this time with CSS trying to match what we currently have). Of course
this is not a fully functional site, although you can ask or answer a
question, and there is no login right now. Think of it as a
functional sketch or draft.

I just want to see where we would need to spend the effort and of
course I am looking for collaborators and contributors. I do think
that we have the option of taking Q&A way beyond what SE can offer and
this would allow us to explore and build a more innovative information
sharing platform.

Sources are on github:

https://github.com/ialbert/biostar-central

If you have python and django you can easily run a development server
(see the bottom of the page linked above) and it will automatically
load an older datadump into your server.

And here is a demo site running on this code:

http://biostar.bx.psu.edu/

LINDENBAUM pierre

Mar 25, 2011, 5:32:46 AM
to biostar...@googlegroups.com, Istvan Albert
Hi Istvan,

A short comment:

I just saw your git repository on GitHub and I saw the nice demo.

However, there is an SE dump of Biostar containing some private data (IPs, OpenIDs, emails, etc.): https://github.com/ialbert/biostar-central/tree/master/home/import. I don't think it's a good idea to have this kind of private data on GitHub.

Best regards,

Pierre

Giovanni Marco Dall'Olio

Mar 25, 2011, 5:34:10 AM
to biostar...@googlegroups.com, Istvan Albert
Sorry, but this time I won't be able to participate.
The first reason is that I don't have the time to do the coding. All my free time is consumed by following Biostar and by other spare-time projects of mine. I like Django and Python, but writing a project the size of Biostar is too much for the time I have.
Second, I honestly don't see much need for this. There are already other open-source implementations of the SE platform, for example osqa.net posted by David, and they all seem to be in an advanced state of development. Wouldn't it be easier to just adopt one of these and contribute to their development?
Finally, switching to the SE 2.0 platform is not a bad idea. I can see more advantages than disadvantages: for example, think of all the users that would come from Stack Overflow to Biostar, thanks to the link in the footer. Notice that switching to 2.0 and developing an alternative platform of our own are not incompatible options: we could try 2.0 to buy ourselves some time, and continue thinking about other solutions in the meantime.
--
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it

Istvan Albert

Mar 25, 2011, 9:01:14 AM
to biostar...@googlegroups.com, LINDENBAUM pierre
On Fri, Mar 25, 2011 at 5:32 AM, LINDENBAUM pierre
<plind...@gmail.com> wrote:

> openids, mails , etc...)
> https://github.com/ialbert/biostar-central/tree/master/home/import. I don't
> think it's a good idea to find this kind of private data on github...

I thought about it myself, but then figured that all this data is also
available from the main Biostar site, just click on any user.

There are so many webcrawlers out there designed to extract and
harvest information out of webpages that for anyone that needs this it
is easier to just run it on the main site instead of writing a custom
XML parser.


Istvan


--
Istvan Albert
Associate Professor, Bioinformatics
Pennsylvania State University
http://www.personal.psu.edu/iua1/

Istvan Albert

Mar 25, 2011, 9:34:29 AM
to biostar-central


On Mar 25, 5:34 am, "Giovanni Marco Dall'Olio" <dalloli...@gmail.com>
wrote:

> Wouldn't it be easier to just adopt one of these and contribute to their
> development?

I looked into this option very seriously. I came to the conclusion
that this implementation is overcomplicated perhaps to the point where
it is almost no different from being a black box. All you need to look
at are the object models:

http://svn.osqa.net/svnroot/osqa/trunk/forum/models/

So the answer is not as simple as just contributing to it.

> Finally, switching to the SE 2.0 platform was not a bad idea.
> our own are not two incompatible options: we could try the 2.0 and buy us
> some time, and continue thinking about other solutions in the meanwhile.

I partially agree, SE2 is a strong candidate and good option. At the
same time I think once we switch to an option we should stick with it.
It would be greatly distracting to have another decision surrounded by
another set of discussions all the while being unable to do what our
main job is: practicing bioinformatics.

What I am trying out today is an alternative, so that there are
options to choose from. If it does not work out it is fine, good thing
we have a great second option. Yet I am willing to put up a week of
development time (and I wish it was continuous) to see if it is a
realistic goal to have our own.

best,

Istvan

Bio X2Y

Mar 25, 2011, 10:17:22 AM
to biostar...@googlegroups.com
Hi Istvan,

I don't believe all this information is available from the site.
If you click on a user other than yourself, you cannot see "last login IP" or "Open ID" for example. Maybe this is different for moderators.

As such, I'd like to echo Pierre - this data should not be publicly available on github.

Thanks.

Simon

Mar 25, 2011, 10:20:57 AM
to biostar-central
Istvan
I would encourage you to read this:
http://blog.bitquabit.com/2009/07/01/one-which-i-call-out-hacker-news/
Which I think makes a number of valid points.
This will certainly be more work than you think, and without a solid
case for the improvements you can realise before you start, I really
don't see the point.

I think at this point, the best thing that can be done to decide the
future of BioStar is to..... ask BioStar.
Set up a community wiki question with 3 pre-populated answers (SE2,
OSS or roll-your-own). Most votes, wins.

Regards
Simon

Istvan Albert

Mar 25, 2011, 10:23:15 AM
to biostar...@googlegroups.com
On Fri, Mar 25, 2011 at 10:17 AM, Bio X2Y <bio...@gmail.com> wrote:

> I don't believe all this information is available from the site.
> If you click on a user other than yourself, you cannot see "last login IP"
> or "Open ID" for example. Maybe this is different for moderators.

Yes, my bad, this is only visible directly if you are an admin. Pierre
just pointed this out in a private email.

We'll get it anonymized, apologies.

Chris Miller

Mar 25, 2011, 10:42:06 AM
to biostar...@googlegroups.com
I'm inclined to agree with Simon (and the fantastic blog post that he
linked). While I do think control of data is a big issue, I don't
think you're going to find as much support as you might hope for
implementing a QA site from scratch when there are reasonable
alternatives. Sure, stackexchange is run by a company, and you assume
certain risks when you entrust your data (and ad placement) to them.
So far, though, they've proved to be reasonable and fairly in touch
with the needs of the communities that they're building. If that
changes, then we should have a discussion about jumping ship, but
right now I can't justify investing the time and energy into
reinventing the wheel.

I'm glad that you're exploring alternatives, but I have yet to see any
compelling evidence that they're going to be either superior or a good
use of our time.

-Chris

Istvan Albert

Mar 25, 2011, 11:00:49 AM
to biostar...@googlegroups.com, Simon
On Fri, Mar 25, 2011 at 10:20 AM, Simon <sjco...@gmail.com> wrote:

> I would encourage you to read this:
> http://blog.bitquabit.com/2009/07/01/one-which-i-call-out-hacker-news/
> Which I think makes a number of valid points.

> This will certainly be more work than you think, and without a solid
> case for the improvements you can realise before you start, I really
> don't see the point.

Yes, I read this, in fact I read this back when it was written. I
closely follow software development practices and I do know it is not
for the faint of heart; programmers are the eternal optimists. What is
important to note is whether the person that proposes the idea is
purely idealistic and naive or has some reason and the necessary
experience to back up their optimism.

> Set up a community wiki question with 3 pre-populated answers (SE2,
> OSS or roll-your-own). Most votes, wins.

One can only vote meaningfully if they can evaluate/act upon the
alternatives. Most people want to use Biostar not develop it.
There need to be actionable alternatives, SE2 in corner 1 vs your own
version in corner 2. Are these comparable at all, was it worth the
time to get to a certain point? Etc.

I am a big fan of working sites, so it is not going to be something
developed over many months in isolation, but something that we can see
everyday and evaluate. Just like the current version, it is a simple
draft with minimal features but it does work. That way it is easier to
see what kind of progress one is making.

Answer to Chris (email just came in):

> but I have yet to see any
> compelling evidence that they're going to be either superior or a good
> use of our time.

Yes. This needs to be established and that's why I am not directly
asking for help. I want to give it a shot; if other people want to
join in, that is great. But if not, there is nothing to feel bad about. I
want to see if the idea can make good progress within a week or two.
If not, I lost a bit of time and we all learned something.

best,

Istvan Albert

Mar 25, 2011, 11:37:57 AM
to biostar...@googlegroups.com
User anonymization is now done.

Many thanks to Pierre who provided an anonymizer XSLT transformation
and Aleksandr who showed me how to clean sensitive information from
github. This turned out to be a quick operation.
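
For readers who want to see what such an anonymization step might look like, here is a minimal sketch in Python rather than XSLT. It is not the transformation Pierre provided. The element name `row` and the attribute names `Email`, `OpenId` and `LastLoginIP` are assumptions made for illustration; the actual schema of the dump is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names; adjust to whatever the real dump uses.
SENSITIVE_ATTRS = ("Email", "OpenId", "LastLoginIP")


def anonymize_users(xml_text, sensitive=SENSITIVE_ATTRS):
    """Return the XML with the sensitive attributes removed from every <row>."""
    root = ET.fromstring(xml_text)
    for row in root.iter("row"):
        for name in sensitive:
            # pop() with a default is a no-op when the attribute is absent
            row.attrib.pop(name, None)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, only to exercise the function.
sample = """<users>
  <row Id="1" DisplayName="alice" Email="alice@example.org" LastLoginIP="10.0.0.1" />
  <row Id="2" DisplayName="bob" OpenId="https://example.org/bob" />
</users>"""

cleaned = anonymize_users(sample)
print(cleaned)
```

Note that cleaning the file in the working copy is only half the job: earlier commits still contain the original data, which is why the repository history had to be cleaned as well.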

David Quigley

Mar 27, 2011, 11:36:19 PM
to biostar-central
I don't know the actual quality of the osqa code base. Their
implementation may be suboptimal, but apparently they have a working
site to offer. As a former full-time software developer, I know the
first instinct any engineer has when confronted with a pile of someone
else's code is to throw it away and write everything myself from
scratch. I still do it, though I'm trying to break the habit. As you
no doubt know, one reason why other people's code often looks
complicated is that it solves a complicated problem. There will be a
lot of corner cases. Dealing with things like logins, getting security
correct, optimizing expensive queries, etc is a lot of work even with
django doing a lot of the heavy lifting.

The bitquabit blog posting has a lot of attitude but also a lot of
truth to it. More power to you if you have the time and energy to
write a de novo implementation. I suspect replicating the SE feature
set beyond simple posting mechanics would require more time than you
care to devote.

If we can split our bet by attempting to go through the existing SE
2.0 mechanism and developing a replacement at the same time, that's my
vote.

Best regards,
David

Michael Kuhn

Mar 28, 2011, 7:57:35 AM
to biostar-central
I also think that a custom BioStar implementation is not a good idea.
Getting a sort-of-running framework is easy, but dealing with all
aspects of UI, security, real users, spam protection etc. takes a lot
of time. In the end, your codebase might be as complicated as
OSQA's.

I would propose to migrate to SE 2.0, while keeping all the content
under CC-BY-SA. In the worst case, we can still take the knowledge
base and migrate elsewhere.

best, Michael

Istvan Albert

Mar 28, 2011, 9:06:45 AM
to biostar...@googlegroups.com
Hi Michael and David,

You both raise valid points and I will admit that if I were in your
position I would probably advise the same thing.

I want to assure you that I am a realist. Let me give this
"reimplementation" project two weeks, then I will ask everyone to take
a second look, but this time you can actually compare two existing
products rather than a real alternative vs. just an idea.

Bio X2Y

Mar 28, 2011, 12:09:09 PM
to biostar...@googlegroups.com
If we do want to go down the SE 2.0 route (even temporarily), do we have to go through the front door like everyone else (i.e. somebody suggesting a project proposal, and going through the define/commit/beta stages), or is a short-cut available by virtue of the fact that we already have a functioning SE 1.0 community? I hope our existing questions can be ported over?

Istvan Albert

Mar 28, 2011, 12:52:27 PM
to biostar...@googlegroups.com
On Mon, Mar 28, 2011 at 12:09 PM, Bio X2Y <bio...@gmail.com> wrote:
> If we do want to go down the SE 2.0 route (even temporarily), do we have to
> go through the front door like everyone else (i.e. somebody suggesting a
> project proposal, and going through the define/commit/beta stages), or is a
> short-cut available by virtue of the fact that we already have a functioning

Good point.

We would need some level of assistance from the company since we will
need to transfer all content to the new site and synchronize the users
so that content and history is preserved.

I have contact information for the SE 1 site and I will send them an
inquiry on this topic.

neilfws

Mar 29, 2011, 9:18:12 AM
to biostar...@googlegroups.com
I am somewhat agnostic as to platform. What matters most to me is the preservation of our existing questions, answers and users.

That said, my preferences are in the following order:
  1. Migration to SE2 - but only if the site can be migrated "as is" and we can bypass the beta/approval stage
  2. An existing open source alternative, such as Shapado
  3. A new custom software solution