version numbers


Martín Massera

Jul 11, 2008, 9:24:15 PM

to hou...@googlegroups.com

Hey guys, we need to come up with a version numbering scheme, because we are making changes and uploading version 1.0 every time :)

On Fri, Jul 11, 2008 at 10:03 PM, jhandl <jha...@gmail.com> wrote:

Nico, we found a bug in the way the crawler treated ISO-8859-1 encoded
pages.
We fixed it and a new version of Hounder is ready for download at
http://hounder.org/downloads/hounder-1.0-binary_installer.tgz
Once you download the new version, just run "ant jar" and copy
output/hounder-trunk.jar to the lib directory where Hounder is installed.
You will have to re-crawl to get the pages correctly encoded though.
Hope this fixes the problem.

--Jorge
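
For anyone curious how this kind of garbling can arise, here is a minimal Python sketch. It is not Hounder's code, only one plausible reconstruction: it assumes the page bytes were ISO-8859-1 but were decoded as UTF-8, and that the result was later displayed as ISO-8859-1. The variable names are made up for illustration.

```python
# Assumption: the page contained the word "extraído" encoded as ISO-8859-1.
original = "extraído"
raw = original.encode("iso-8859-1")

# Decoding those bytes as UTF-8 fails on the "í" byte; with errors="replace"
# the decoder substitutes U+FFFD (the replacement character).
misread = raw.decode("utf-8", errors="replace")

# If the UTF-8 bytes of that string are then shown as ISO-8859-1,
# U+FFFD turns into three separate characters.
displayed = misread.encode("utf-8").decode("iso-8859-1")

# Decoding with the page's real charset recovers the text.
correct = raw.decode("iso-8859-1")

print(repr(misread))
print(displayed)
print(correct)
```

Under these assumptions `displayed` matches the "extraï¿½do" text quoted further down in the thread, which is why pages that were already crawled have to be re-crawled rather than repaired afterwards.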

On Jul 11, 6:21 pm, Nico <nicolasbottar...@gmail.com> wrote:
> both variables are in en_US.UTF-8
>
> On Jul 11, 6:09 pm, jhandl <jha...@gmail.com> wrote:
>
> > Nico, make sure you have the LANG and LC_ALL environment variables set
> > to "en_US.UTF-8".
>
> > -- Jorge
>
> > On Jul 11, 6:01 pm, Nico <nicolasbottar...@gmail.com> wrote:
>
> > > > Obviously, the URL is: http://blogsearch.google.com/blogsearch?as_q=coca+cola+dasani&num=100...
>
> > > On Jul 11, 5:26 pm, jhandl <jha...@gmail.com> wrote:
>
> > > > Can you send me the url of the page?
>
> > > > On Jul 11, 5:13 pm, Nico <nicolasbottar...@gmail.com> wrote:
>
> > > > > > Thanks. I executed that command and obtained the text. Do you know why
> > > > > > there are encoding problems?
> > > > > I get things like: "producto extraï¿½do desde vertientes naturales"
>
> > > > > > Do I have to configure something?
>
> > > > > Thank you very much for your help!
>
> > > > > On Jul 11, 4:59 pm, jhandl <jha...@gmail.com> wrote:
>
> > > > > > Nico, the quick and dirty way is to extract the text from the index
> > > > > > using the idx script as follows:
>
> > > > > > cd indexer
> > > > > > idx list indexes/index 0
>
> > > > > > > The best way, though, is to write a crawler module to extract the
> > > > > > > parsed text to a file or database, or directly via RPC to any
> > > > > > > post-processing you might want to do with it.
>
> > > > > > Hope this helps.
>
> > > > > > -- Jorge
>
> > > > > > On Jul 11, 3:53 pm, Nico <nicolasbottar...@gmail.com> wrote:
>
> > > > > > > > Hi, is there a way to extract the text of the crawled pages
> > > > > > > > from the command line?
>
> > > > > > > Thanks
>
> > > > > > > nicolas Bottarini

jhandl

Jul 11, 2008, 11:44:49 PM

to hounder

Yes, I was thinking about this while the file was slowly uploading.

The first approach I thought about is the traditional one: we make two
versions available for download, the latest stable version and the
development version (which would be the one currently available). The
version number would then be a major.minor number (1.0) for the stable
version, plus an additional commit number for the development version
(1.0.23). Whenever the latest development version has been tested live
for some period of time and we all agree that it is solid enough, it
becomes the next stable version. At that point the number is
incremented, the development version is renamed to X.Y.0, and the
previous stable version is replaced.
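
To make the idea concrete, here is a rough Python sketch of that scheme. The names `Version`, `parse_version`, and `promote` are hypothetical, not anything that exists in Hounder, and the sketch assumes that "the number is incremented" means bumping the minor number, which is only one possible reading.

```python
from typing import NamedTuple, Optional, Tuple


class Version(NamedTuple):
    """X.Y for a stable release, X.Y.N for a development build."""
    major: int
    minor: int
    commit: Optional[int] = None  # None marks a stable release

    @property
    def is_stable(self) -> bool:
        return self.commit is None

    def __str__(self) -> str:
        base = f"{self.major}.{self.minor}"
        return base if self.commit is None else f"{base}.{self.commit}"


def parse_version(text: str) -> Version:
    """Parse "1.0" or "1.0.23" into a Version."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected X.Y or X.Y.N, got {text!r}")
    return Version(*parts)


def promote(dev: Version) -> Tuple[Version, Version]:
    """Return (new stable version, new development version).

    Assumption: promotion increments the minor number.
    """
    if dev.is_stable:
        raise ValueError("only a development version can be promoted")
    stable = Version(dev.major, dev.minor + 1)
    return stable, Version(stable.major, stable.minor, 0)
```

With this reading, promoting `parse_version("1.0.23")` gives a stable `1.1` and a new development line that starts at `1.1.0`.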

But since we already keep the latest development version in our SVN
repository, we could simply upload what we consider to be the stable
version and increment the version number each time. We could also add
the build number, which would be the current SVN revision (4752). Maybe
small fixes like today's could be uploaded as a separate "unstable"
version that would eventually replace the stable version once it is
reported stable.
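
As a sketch of the build-number idea, the revision could be taken from the output of the `svnversion` command, which prints strings such as `4752`, `4752M` (locally modified), or `4750:4752` (mixed revisions). The function names and the `-r` label format below are assumptions for illustration only.

```python
import re


def revision_from_svnversion(output: str) -> int:
    """Extract the highest revision number from `svnversion` output."""
    # Optional "low:" prefix, the revision, then optional M/S/P flags.
    match = re.fullmatch(r"(?:\d+:)?(\d+)[MSP]*", output.strip())
    if match is None:
        raise ValueError(f"no revision found in {output!r}")
    return int(match.group(1))


def build_label(version: str, revision: int) -> str:
    """Combine a version and an SVN revision (hypothetical format)."""
    return f"{version}-r{revision}"
```

For example, `build_label("1.1", revision_from_svnversion("4752"))` would produce `1.1-r4752`, so two uploads of the same version can still be told apart.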

Whatever scheme we choose, we need to automate it. Today, uploading a
new version is as easy as typing "publishDist.sh", but that script has
"1.0" hardcoded into it. That is why we are still at version 1.0,
although the latest 1.0 is far from the first 1.0! ;)
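
One possible way to get rid of the hardcoded number, sketched in Python: keep the version in a single file and derive the archive name from it. The `VERSION` file and both function names are hypothetical, and this says nothing about what publishDist.sh actually does; the file name pattern simply mirrors the download link mentioned earlier in the thread.

```python
from pathlib import Path


def read_version(version_file: Path) -> str:
    """Read the version string from a one-line text file."""
    version = version_file.read_text(encoding="utf-8").strip()
    if not version:
        raise ValueError(f"{version_file} is empty")
    return version


def dist_filename(version: str) -> str:
    """Build the archive name instead of hardcoding "1.0"."""
    return f"hounder-{version}-binary_installer.tgz"
```

A publish script could then call `dist_filename(read_version(Path("VERSION")))`, so a release only requires editing that one file.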
