ref-update hook: Synchronous hook timed out issue

2,298 views
Skip to first unread message

Anu

unread,
Mar 5, 2015, 9:19:26 PM3/5/15
to repo-d...@googlegroups.com
Hello,

I have installed "ref-update" hook at Gerrit server side to validate commit message & bug ID.
This is in addition to the "commit-msg" hook at client side just in case user by pass this client level hook.

But, I have seen frequent time out issues for this hook as quoted below:

..................
[2015-03-05 12:17:08,065] ERROR com.google.gerrit.common.ChangeHookRunner : Synchronous hook timed out /opt/gerrit/hooks/ref-update
[2015-03-05 12:35:52,086] ERROR com.google.gerrit.common.ChangeHookRunner : Synchronous hook timed out /opt/gerrit/hooks/ref-update
[2015-03-05 12:56:00,350] ERROR com.google.gerrit.common.ChangeHookRunner : Synchronous hook timed out /opt/gerrit/hooks/ref-update
[2015-03-05 12:57:27,753] ERROR com.google.gerrit.common.ChangeHookRunner : Synchronous hook timed out /opt/gerrit/hooks/ref-update
..................

Is there a way to overcome this since developer has to repeat sending the review request multiple time? 
Server has 32 GB of RAM and & 15 GB has been assigned to container.heapLimit. 

Below is the cache information:
------------------------------------------------

$ ssh <server> gerrit show-caches
Gerrit Code Review        2.8.5                     now    07:46:23   IST
                                                 uptime    1 days  9 hrs

  Name                          |Entries              |  AvgGet |Hit Ratio|
                                |   Mem   Disk   Space|         |Mem  Disk|
--------------------------------+---------------------+---------+---------+
  accounts                      |   135               |   7.6ms | 98%     |
  accounts_byemail              |    55               |   5.6ms | 96%     |
  accounts_byname               |   135               |         |100%     |
  adv_bases                     |                     |         |         |
  changes                       |                     |         |         |
  groups                        |    18               |  14.0ms | 71%     |
  groups_byinclude              |     5               |   4.4ms | 66%     |
  groups_byname                 |                     |         |         |
  groups_byuuid                 |    22               |   7.6ms | 85%     |
  groups_external               |     1               |  95.0ms | 66%     |
  groups_members                |    22               |   6.7ms | 99%     |
  ldap_group_existence          |     3               |   8.3ms | 40%     |
  ldap_groups                   |     4               |  73.9ms | 98%     |
  ldap_groups_byinclude         |                     |         |         |
  ldap_usernames                |     7               |   4.4ms | 16%     |
  permission_sort               |   114               |         | 99%     |
  plugin_resources              |     1               |         | 99%     |
  project_list                  |     1               |  39.1ms | 99%     |
  projects                      |    94               |  11.6s  | 99%     |
  sshkeys                       |    65               |   6.4ms | 99%     |
D diff                          |    79    825   8.72m|  13.4s  | 95% 100%|
D diff_intraline                |   173   2156   1.42m|   5.9ms | 65% 100%|
D git_tags                      |     1      1   1.09k|         |  0% 100%|
D web_sessions                  |    55   2797   1.10m|         | 98%  10%|

SSH:      2  users, oldest session started    1 days  9 hrs ago
Tasks:    3  total =    1 running +      0 ready +    2 sleeping
Mem:   7.73g total =   4.13g used +   3.60g free + 212.00k buffers
      13.33g max
         128 open files,       16 cpus available,      117 threads

$


Thanks, Anu

Anu

unread,
Mar 10, 2015, 7:35:07 PM3/10/15
to repo-d...@googlegroups.com
Hello,

Can anybody help here? 
This is becoming a major concern for the developers.

Thanks, Anu

Chris Harris

unread,
Mar 11, 2015, 8:55:20 AM3/11/15
to Anu, Repo and Gerrit Discussion
Anu,

What is your ref-update hook doing, it will be timed out if it doesn't return within the timeout. The default timeout is 30 seconds. You can increase the timeout by setting  hooks.syncHookTimeout, however, I would profile your ref-update hook and try to workout why its taking so long to complete.

Regards, 

Chris

--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Anu

unread,
Mar 12, 2015, 7:01:09 AM3/12/15
to repo-d...@googlegroups.com, anugr...@gmail.com
Hello Chris,

Thanks for your reply.

Timeout was changed to 45 earlier and service was restarted. But still problem exists....
   [hooks]
           syncHookTimeout = 45

Hook 'ref-update' does 2 tasks basically
1) Checking the commit message whether it has JIRA id in the given format or not
2) Check for few branches whether the provided JIRA ids are in authorized & approved list that is mentioned in a text file (stock list based check-in)
_________________________________________________
### ref-update --project <project name> --refname <refname> --uploader <uploader> --oldrev <sha1> --newrev <sha1>

PROJECT=$2
REFNAME=$4
UPLOADER=$6
OLDREV=$8
NEWREV="${10}"

BASE="<baseDir>"
RepoDir="$BASE/$PROJECT.git"

if ! [ -f "$RepoDir/HEAD" ]; then
   echo "No bare git repository"; exit 1
fi

### Check commit message marker for Code Review Push
if [[ `echo $REFNAME | grep "refs/for/"` ]]; then
   MSG=$(git --git-dir=$RepoDir show -s --format=%B $NEWREV)
   Jira_id=`echo $MSG | grep -v ^#|egrep -o "GITCM-[0-9]{5,}|SCM-[0-9]{2,}"`
   if [[ $Jira_id == ""  ]]; then
      echo -e "<<<===\nCommit message given by '$UPLOADER'\nfor '$PROJECT' repo does not contain JIRA ID===>>>"
      exit 1
   else
      if [[ `echo $Jira_id | wc -w` -gt 1 ]]; then
          echo -e "<<<===\nCommit message given by '$UPLOADER'\nfor '$PROJECT' contains more than one JIRA ID; exiting ..===>>>"
          exit 1
      fi
   fi

   br=$(echo $REFNAME|awk -F'/' '{print $NF}')

   ########### Stocklist based commit #############
   ### ! Please update branch names in below variable where there is a stocklist based check-in allowed !
   HF_Branches="<branch1> <branch2> <branch3>"
   Jira_listDir="<jira listed text file absolute path>"
   if [[ $(echo $br | egrep -w `echo $HF_Branches|sed "s/[ |,]/|/g"`) ]]; then
       Map_File="$Jira_listDir/jira_$br.txt"
       if [ ! -f $Map_File ]; then
           echo -e "[ERROR]\tJira approved List file not available/visible\n"
           exit 1
       fi
       if [ ! `cat $Map_File |grep -w "^$Jira_id$"` ]; then
           echo -e "\n[ERROR]\tJira id \"$Jira_id\" not in Approved List\n"
           exit 1
       fi
   fi
   ########### Stocklist based commit #############

fi
___________________________________________________


Thanks, Anu

Anu

unread,
Mar 12, 2015, 7:14:30 AM3/12/15
to repo-d...@googlegroups.com, anugr...@gmail.com
One note: The delay is basically in below highlighted line.
It takes lots of time there and finally times out.

    Counting objects: 40, done.
    Delta compression using up to 48 threads.
    Compressing objects: 100% (6/6), done.
    Writing objects: 100% (6/6), 1.45 KiB | 0 bytes/s, done.
    Total 6 (delta 5), reused 0 (delta 0)
    remote: Resolving deltas: 100% (5/5)
    ----->>>> remote: Processing changes: (/) <<<<------


Thanks, Anu

Chris Harris

unread,
Mar 12, 2015, 7:33:58 AM3/12/15
to Anu, Repo and Gerrit Discussion
Hi Anu, 

I would add some timing information to see which of those steps is taking the time.

Regards, 

Chris

Gustavo Leite de Mendonça Chaves

unread,
Mar 12, 2015, 9:57:16 AM3/12/15
to Anu, repo-discuss
2015-03-12 8:01 GMT-03:00 Anu <anugr...@gmail.com>:
Hook 'ref-update' does 2 tasks basically
1) Checking the commit message whether it has JIRA id in the given format or not
2) Check for few branches whether the provided JIRA ids are in authorized & approved list that is mentioned in a text file (stock list based check-in)

I can't see how your script could take 45 seconds to run, but it can be subtle or I could be dumb. I'd follow Chris's suggestion and insert some timing information in the script to focus on the spot taking the most time. Make sure to show the time at the beginning and at the end of the script because to problem may as well be outside of it.

Where I work we also use hooks to check for JIRA ids in the commit messages and lots of other things. But we use the patchset-created and the draft-published hooks instead of the ref-update. Have you considered them?

These other hooks run asynchronously, do not freeze the git-push commands while they run, and can (if needed) take any time (but ours run really quickly). We have them configured to vote on the patchset's Verification label, giving it +1 or -1 and also inserting a review comment.

<shameless plug>
We use the Git::Hooks framework for creating our hooks. It supports the above mentioned three Gerrit hooks and It has a plugin to check various things about the integration with JIRA.
</shameless plug>

--
Gustavo

Anu

unread,
Mar 17, 2015, 6:42:16 AM3/17/15
to repo-d...@googlegroups.com, anugr...@gmail.com
Hello,

Sorry for the delay in posting the needed logs. Please find below the details....


17-03-2015_14:32:00 --> Storing arguments -- START
17-03-2015_14:32:00 --> Storing arguments -- END
<repo> | refs/for/<branch> | <author> | 0000000000000000000000000000000000000000 | 4d1316d1adafdf253ff74e15efaf1bda6512154a
17-03-2015_14:32:00 --> Getting only commit message using %B format and validate -- START

===> So, here it terminates I believe since there is nothing written in the log after this line & user received the error. Line in the script of "ref-update" to capture the commit message is as below:

        MSG=$(git --git-dir=$RepoDir show -s --format=%B $NEWREV)

Once variable "MSG" stores the commit message, it then validates with the JIRA id. Looks like this does not succeed. 
Do you have any clue which could help us? Most of the developers are getting frustrated because of this delay.

Note: Our repositories are stored in NetApp storage and on NFS mount on Gerrit server.


Thanks, Anu

Anu

unread,
Mar 17, 2015, 6:44:09 AM3/17/15
to repo-d...@googlegroups.com, anugr...@gmail.com
We would like to reject sending the code review itself, if this condition is NOT met.
So, hooks "patchset-created" probably is not useful here since it happens after patch set creation. 

Anu

unread,
Mar 17, 2015, 7:02:15 AM3/17/15
to repo-d...@googlegroups.com, anugr...@gmail.com
One more update:
Below command is taking more than a minute!!! How to overcome this? 

[root@server]  time git --git-dir=$RepoDir show -s --format=%B $NEWREV
GITCM-109 changes during function handling

Change-Id: I2b3294237adbcb31edb7b080be347493758b4bda


real    1m16.784s
user    0m20.299s
sys     0m4.345s
[root@server] #

Gustavo Leite de Mendonça Chaves

unread,
Mar 17, 2015, 7:11:47 AM3/17/15
to Anu, repo-discuss
2015-03-17 8:02 GMT-03:00 Anu <anugr...@gmail.com>:
One more update:
Below command is taking more than a minute!!! How to overcome this? 

[root@server]  time git --git-dir=$RepoDir show -s --format=%B $NEWREV
GITCM-109 changes during function handling

Good question. But it seems that you found the culprit.

Use strace to see what's going on:

strace -tt -r -o/tmp/git.strace git --git-dir=$RepoDir show -s --format=%B $NEWREV

The /tmp/git.strace file will contain every syscall made by git prefixed by how many microseconds they took. Hopefully this will allow you to drill down further.

I'm guessing your problem is related to your repository being on NFS. Let's see what strace tells us.

--
Gustavo

Anu

unread,
Mar 17, 2015, 9:58:13 AM3/17/15
to repo-d...@googlegroups.com, anugr...@gmail.com

Yes, culprit has been identified because of the hint given by you. Thanks.

I see lots of line as below in the trace output file.
Out of 42588 lines, 41831 lines are with these similar lines ....


     0.005051 access("/nfsmount/scm_gitrepo/repo_1.git/objects/pack/pack-390f0c735df93072475f8174af6dd9a9c25c7737.keep", F_OK) = -1 ENOENT (No such
 file or directory)
     0.000090 stat("/nfsmount/scm_gitrepo/repo_1.git/objects/pack/pack-390f0c735df93072475f8174af6dd9a9c25c7737.pack", {st_mode=S_IFREG|0444, st_si
ze=111581022, ...}) = 0
     0.000098 access("/nfsmount/scm_gitrepo/repo_1.git/objects/pack/pack-04ba591a0be2a11b2b002a329008599cc8f5e4e0.keep", F_OK) = -1 ENOENT (No such
 file or directory)
     0.000063 stat("/nfsmount/scm_gitrepo/repo_1.git/objects/pack/pack-04ba591a0be2a11b2b002a329008599cc8f5e4e0.pack", {st_mode=S_IFREG|0444, st_si
ze=177, ...}) = 0


Do you need the complete log file as attachment to verify? 

Thanks, Anu

Gustavo Leite de Mendonça Chaves

unread,
Mar 17, 2015, 10:54:06 AM3/17/15
to Anu, repo-discuss
2015-03-17 10:58 GMT-03:00 Anu <anugr...@gmail.com>:

Yes, culprit has been identified because of the hint given by you. Thanks.

I see lots of line as below in the trace output file.
Out of 42588 lines, 41831 lines are with these similar lines ....

     0.005051 access("/nfsmount/scm_gitrepo/repo_1.git/objects/pack/pack-390f0c735df93072475f8174af6dd9a9c25c7737.keep", F_OK) = -1 ENOENT (No such
 file or directory)

9.000 lines like this, taking 0.005 seconds, can take 45 seconds. It seems that NFS is the problem.

You can compare it by cloning the repo to a local directory and executing the same command on the clone. See how much time each of these access/stat syscalls take on a local disk.

If NFS is your problem I don't know how to make it fast. I'd think about using local disks instead. 

--
Gustavo Leite de Mendonça Chaves
CPqD - Gerência de Tecnologia da Informação e Comunicação

Anu

unread,
Mar 18, 2015, 5:56:37 AM3/18/15
to repo-d...@googlegroups.com, anugr...@gmail.com
Hello,

I tried this in local file system and it at least takes 15-20 seconds on an average. 

[root@server]#  time strace -tt -r -o/tmp/git.strace.local.2 git --git-dir=$RepoDir show -s --format=%B $NEWREV
GITCM-109 changes during ack handling

Change-Id: I2b3294237adbcb31edb7b080be347493758b4bda

real    0m18.678s
user    0m14.981s
sys     0m3.947s
[root@server]#

Again, out of 43978 lines of trace output file, 42463 are with below lines. Do you still foresee any other issue?

     0.007783 access("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-1929cf5f30b18c96aa6505eb4ca942b02d815512.keep", F_OK) = -1 ENOENT (No such file or directory)
     0.000124 stat("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-1929cf5f30b18c96aa6505eb4ca942b02d815512.pack", {st_mode=S_IFREG|0444, st_size=228, ...}) = 0
     0.000161 access("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-c44cdfb054e20468386c9c4259aa820a6a80e298.keep", F_OK) = -1 ENOENT (No such file or directory)
     0.000119 stat("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-c44cdfb054e20468386c9c4259aa820a6a80e298.pack", {st_mode=S_IFREG|0444, st_size=223, ...}) = 0
     0.000169 access("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-faee9a3f992d4c47e5fff640c1a9050586ca06e0.keep", F_OK) = -1 ENOENT (No such file or directory)
     0.000119 stat("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-faee9a3f992d4c47e5fff640c1a9050586ca06e0.pack", {st_mode=S_IFREG|0444, st_size=215, ...}) = 0
     0.000148 access("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-94eb30ac490eb74cd16039dbc74c8227dc52ae05.keep", F_OK) = -1 ENOENT (No such file or directory)
     0.000109 stat("/var/fpwork/git_work/ims_cscf1.git/objects/pack/pack-94eb30ac490eb74cd16039dbc74c8227dc52ae05.pack", {st_mode=S_IFREG|0444, st_size=200, ...}) = 0



Thanks, Anu

chaitanya j

unread,
Jan 25, 2017, 4:34:28 AM1/25/17
to Repo and Gerrit Discussion, anugr...@gmail.com

Hi Anu,

I also faced the same issue multiple times. If you "git pull  origin", this issue will be solved.
Reply all
Reply to author
Forward
0 new messages