Re: [gitorious] defaulting enable_repository_directory_sharding to false on existing sharded installations

Marius Mårnes Mathiesen

Sep 20, 2012, 4:00:48 AM
to gito...@googlegroups.com
On Thu, Sep 20, 2012 at 6:10 AM, Russell Jackson <r...@csub.edu> wrote:
> I recently merged upstream master into my local installation. Apparently the sharding default was changed to be disabled which was causing a rather confusing problem with push processing. I was getting ActiveMessagingAbort exceptions from the push processor on every push to new repositories created and only to new repositories. After enabling sharding again and recreating the repositories, the problem went away.
>
> The symptoms were identical to those in this posting: https://groups.google.com/forum/?fromgroups=#!topic/gitorious/c6jZdl89QCg
>
> I also found that all traces of this setting were removed from the example gitorious.yml config for some reason. I only found it by looking through the commit log.

Russell,
Did you restart your poller process?

Cheers,
- Marius

Russell Jackson

Sep 20, 2012, 5:05:43 AM
to gito...@googlegroups.com
About a thousand times ;-)

Thomas Kjeldahl Nilsson

Sep 20, 2012, 7:04:54 AM
to gito...@googlegroups.com
On 09/20/2012 06:10 AM, Russell Jackson wrote:
> I recently merged upstream master into my local installation.
> Apparently the sharding default was changed to be disabled which was
> causing a rather confusing problem
> with push processing. I was getting ActiveMessagingAbort exceptions
> from the push processor on every push to new repositories created and
> only to new repositories. After enabling sharding again and recreating
> the repositories, the problem went away.
>
> The symptoms were identical to those in this posting:
> https://groups.google.com/forum/?fromgroups=#!topic/gitorious/c6jZdl89QCg
> <https://groups.google.com/forum/?fromgroups=#%21topic/gitorious/c6jZdl89QCg>
>
> I also found that all traces of this setting were removed from the
> example gitorious.yml config for some reason. I only found it by
> looking through the commit log.

Hi Russell,

first off, I've ensured that the doc and example sharding settings are
back in the sample gitorious.yml file. They seem to have been overwritten
by a merge a while back. Thanks for the catch.

Second: having trouble reproducing your issue. I fired up a cleanroom
Gitorious VM, enabled sharding, created projects/repos, pushed code,
turned off sharding, pushed more code without issue. Also tried the
reverse: created project/repo while not sharded, pushed code, re-enabled
sharding, pushed again. None of these scenarios caused any exceptions in
either the /log/message_processing.log or /log/production.log.

Is your scenario different from what I tested above? What am I missing here?

--
best regards,
Thomas Kjeldahl Nilsson
http://gitorious.com

Russell Jackson

Sep 20, 2012, 6:55:17 PM
to gito...@googlegroups.com, tho...@gitorious.com
> Second: having trouble reproducing your issue. I fired up a cleanroom
> Gitorious VM, enabled sharding, created projects/repos, pushed code,
> turned off sharding, pushed more code without issue. Also tried the
> reverse: created project/repo while not sharded, pushed code, re-enabled
> sharding, pushed again. None of these scenarios caused any exceptions in
> either the /log/message_processing.log or /log/production.log.


Not quite.

It was only repositories newly created after sharding was turned off that had problems. Existing sharded repositories would continue to work normally with the setting either way. Likewise, the new repositories would continue to not work with the setting either way (which makes me suspect it has something to do with the actual filesystem storage location).


My installation is several years old, so it could just be that something is screwed up in my database and only gets triggered with sharding off.

Thomas Kjeldahl Nilsson

Sep 21, 2012, 1:34:24 PM
to Russell Jackson, gito...@googlegroups.com
Yeah, the big version jump in your installation might be a problem. One thing you could try is to simply create a brand spanking new installation and attempt to use the new snapshot/restore commands to migrate your old data (db and repositories) there.

http://blog.gitorious.org/2012/09/20/simple-backup-and-recovery-with-the-snapshot-command/

And if you haven't already done so: if your installation is really old (2009 or older) you may have to look into this wiki page:

https://gitorious.org/gitorious/pages/LegacyUpgrade

Russell Jackson

Sep 22, 2012, 12:52:45 AM
to gito...@googlegroups.com, Russell Jackson, tho...@gitorious.com
I think I tracked this down to how 'gitdir' is calculated in the post-receive hook.

irb(main):007:0> gitdir = '/git/gitorious/foo/bar.git'
=> "/git/gitorious/foo/bar.git"
irb(main):008:0> hashed_dir = gitdir.split('/')[-3,3].join('/').split('.').first
=> "gitorious/foo/bar"

irb(main):005:0> gitdir = '/git/gitorious/484/4ad/b02c777df03449dc0990cfe4aeface1de0.git'
=> "/git/gitorious/484/4ad/b02c777df03449dc0990cfe4aeface1de0.git"
irb(main):006:0> hashed_dir = gitdir.split('/')[-3,3].join('/').split('.').first
=> "484/4ad/b02c777df03449dc0990cfe4aeface1de0"

When the push processor attempts to look up the repository with find_by_hashed_path on the first example (non-hashed), it returns nil.

Russell Jackson

Sep 22, 2012, 1:00:10 AM
to gito...@googlegroups.com, Russell Jackson, tho...@gitorious.com
On Friday, September 21, 2012 9:52:45 PM UTC-7, Russell Jackson wrote:
> I think I tracked this down to how 'gitdir' is calculated in the post-receive hook.

Whoops. I meant hashed_path not gitdir.

Then... I looked at the line after and noticed that it gets modified by stripping 'repositories/' off from the path; however, that doesn't work in my case because my repository_base_path is '/git/gitorious'.
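
To make the failure concrete, here is a minimal sketch of the two steps as described in this thread. The split/join line is taken from the irb session above; the `sub` line that strips 'repositories/' is an assumption about what the hook does, since the exact line is not quoted here. `derive_hashed_dir` and the `/var/git/repositories` path are hypothetical names used only for illustration.

```ruby
# Sketch of the hashed_dir derivation discussed in this thread.
# ASSUMPTION: the real prefix-stripping line in the hook is not quoted
# here; a plain sub on "repositories/" stands in for it.
def derive_hashed_dir(gitdir)
  # Keep the last three path components and drop the ".git" suffix.
  hashed_dir = gitdir.split('/')[-3, 3].join('/').split('.').first
  # This only helps when the directory above the project is literally
  # named "repositories".
  hashed_dir.sub(/\Arepositories\//, '')
end

# Hypothetical base path ending in "repositories": the prefix is stripped.
puts derive_hashed_dir('/var/git/repositories/foo/bar.git')
# Base path '/git/gitorious', sharding off: "gitorious/" is left in place.
puts derive_hashed_dir('/git/gitorious/foo/bar.git')
# Base path '/git/gitorious', sharding on: three components happen to fit.
puts derive_hashed_dir('/git/gitorious/484/4ad/b02c777df03449dc0990cfe4aeface1de0.git')
```

The second call is the case that breaks: the result still carries the last component of the base path, so a lookup by hashed path cannot match. The sharded case only works because a sharded path is always exactly three components deep.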

Russell Jackson

Sep 24, 2012, 6:48:47 PM
to gito...@googlegroups.com, Russell Jackson, tho...@gitorious.com
Is there any reason why we can't look up the base path from gitorious.yml like so?

require 'pathname'
require 'yaml'

incpath = File.dirname(__FILE__)

hooks_realpath = Pathname.new(incpath).realpath
yaml_path = File.join(hooks_realpath, "..", "..", "config", "gitorious.yml")
gitorious_yaml = YAML.load_file(yaml_path)
base_path = gitorious_yaml[ENV["RAILS_ENV"]]["repository_base_path"]

gitdir = File.expand_path(File.join(incpath, ".."))
hashed_dir = gitdir.sub(/^#{base_path}\//, "")

puts hashed_dir
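
A slightly more defensive variant of the same idea, as a sketch only. It assumes gitorious.yml is keyed by environment name with a repository_base_path entry (as the snippet above implies) and that the hashed path is stored without the ".git" suffix (as the irb output earlier in the thread suggests). `base_path_from` and `hashed_dir_for` are hypothetical helper names, and the YAML is inlined here so the example runs on its own.

```ruby
require 'yaml'

# Hypothetical helper: read repository_base_path for one environment.
# ASSUMPTION: the config is a hash keyed by environment name.
def base_path_from(yaml_text, env)
  YAML.load(yaml_text).fetch(env).fetch('repository_base_path')
end

# Hypothetical helper: path of gitdir relative to base_path, without ".git".
# Returns nil when gitdir does not live under base_path.
def hashed_dir_for(gitdir, base_path)
  prefix = base_path.chomp('/') + '/'
  return nil unless gitdir.start_with?(prefix)
  gitdir[prefix.length..-1].chomp('/').chomp('.git')
end

sample_config = <<~YML
  production:
    repository_base_path: /git/gitorious
YML

base_path = base_path_from(sample_config, 'production')
puts hashed_dir_for('/git/gitorious/foo/bar.git', base_path)
puts hashed_dir_for('/git/gitorious/484/4ad/b02c777df03449dc0990cfe4aeface1de0.git', base_path)
```

Comparing the prefix as a plain string avoids having to escape regex metacharacters that may appear in the base path, and the same code handles sharded and unsharded layouts because it no longer depends on how many components the path has.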

Thomas Kjeldahl Nilsson

Sep 25, 2012, 2:56:17 AM
to Russell Jackson, gito...@googlegroups.com
Sure, looks viable enough...

Marius Mårnes Mathiesen

Sep 25, 2012, 3:33:53 AM
to gito...@googlegroups.com
On Tue, Sep 25, 2012 at 8:56 AM, Thomas Kjeldahl Nilsson <tho...@gitorious.com> wrote:
> On 09/25/2012 12:48 AM, Russell Jackson wrote:
>> Is there any reason why we can't look up the base path from gitorious.yml like so?
>>
>> require 'pathname'
>> require 'yaml'
>>
>> incpath = File.dirname(__FILE__)
>>
>> hooks_realpath = Pathname.new(incpath).realpath
>> yaml_path = File.join(hooks_realpath, "..", "..", "config", "gitorious.yml")
>> gitorious_yaml = YAML.load_file(yaml_path)
>> base_path = gitorious_yaml[ENV["RAILS_ENV"]]["repository_base_path"]
>>
>> gitdir = File.expand_path(File.join(incpath, ".."))
>> hashed_dir = gitdir.sub(/^#{base_path}\//, "")
>>
>> puts hashed_dir

I'm actually working on supporting several repository roots in Gitorious these days; gitorious.org currently runs with a single, several-terabyte file system holding all the repositories, and this makes running fsck take a really long time.

What I'm planning to do is to introduce a new database table (+Rails model) to define filesystems that should be used for storing repositories. By default your server will run off the single root defined in gitorious.yml, but by creating additional roots you will be able to keep different projects on different file systems (since Git repositories use hard links to save space, we really don't want to clone across different file systems, and cloning is only done within a project).

Anyway, to make this work, we'll need to have the repository itself resolve its complete path on disk by combining its project's root path (either defined in a RepositoryRoot instance or gitorious.yml) with its own hashed_path attribute. In master right now, this is done by RepositoryRoot.default_base_path, but I'm working on a feature branch (features/multiple_roots) where this will be an instance method on Repository. 
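
As a rough illustration of that paragraph, here is a plain-Ruby sketch (no Rails) of a repository resolving its full on-disk path from a root plus its own hashed_path. Only the names RepositoryRoot and hashed_path come from the discussion; the Struct layout, `full_repository_path`, `DEFAULT_BASE_PATH` and the `/mnt/disk2` path are assumptions made for illustration and are not the code on features/multiple_roots.

```ruby
# ASSUMPTION: stands in for repository_base_path from gitorious.yml.
DEFAULT_BASE_PATH = '/git/gitorious'

# Hypothetical stand-ins for the models mentioned in the discussion.
RepositoryRoot = Struct.new(:path)
Project = Struct.new(:repository_root)

Repository = Struct.new(:project, :hashed_path) do
  # Combine the project's root (or the default) with this repository's
  # hashed_path to get the complete path on disk.
  def full_repository_path
    root = project.repository_root
    base = root ? root.path : DEFAULT_BASE_PATH
    File.join(base, "#{hashed_path}.git")
  end
end

default_repo = Repository.new(Project.new(nil), '484/4ad/b02c777df03449dc0990cfe4aeface1de0')
moved_repo = Repository.new(Project.new(RepositoryRoot.new('/mnt/disk2')), 'foo/bar')

puts default_repo.full_repository_path
puts moved_repo.full_repository_path
```

Because the root hangs off the project rather than the repository, every repository in a project resolves to the same file system, which matches the constraint that cloning stays within a project.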

Cheers,
- Marius