timeout error creating a new blog in zotonic

eigenfunction

May 6, 2012, 2:39:29 PM

to Zotonic users

Hi everybody, I get this error when creating a new blog:

20:34:54.019 [info] listening on {0,0,0,0}:2525 via tcp
20:34:56.573 [warning] Installing database "zotonic"@"10.0.2.2":5433 "zotonic"
20:34:58.151 [info] DEBUG: z_install_data:35 {testblog,"Install start."}
20:34:58.167 [info] DEBUG: z_install_data:48 "Inserting config keys"
20:34:58.186 [info] DEBUG: z_install_data:57 "Inserting modules"
20:34:58.433 [info] DEBUG: z_install_data:97 "Inserting categories"
20:34:58.998 [info] DEBUG: z_install_data:257 "Sorting the category hierarchy"
20:34:59.304 [info] DEBUG: z_install_data:176 "Inserting base resources (admin, etc.)"
20:34:59.362 [info] DEBUG: z_install_data:193 "Inserting username for the admin"
Terminating due to shutdown
20:34:59.676 [error] CRASH REPORT Process <0.141.0> with 0 neighbours crashed with reason: {timeout,{gen_server,call,[<0.142.0>,{start_child,testblog}]}}
20:34:59.747 [error] Supervisor zotonic_sup had child z_sites_manager started with z_sites_manager:start_link() at undefined exit with reason {timeout,{gen_server,call,[<0.142.0>,{start_child,testblog}]}} in context start_error
{"init terminating in do_boot",{{badmatch,{error,{shutdown,{zotonic_app,start,[normal,[]]}}}},[{zotonic,start,1,[{file,"src/zotonic.erl"},{line,45}]},{init,start_it,1,[]},{init,start_em,1,[]}]}}

I am using the latest version from GitHub.

Andreas Stenius

May 6, 2012, 3:04:59 PM

to zotoni...@googlegroups.com

This is a known issue that can arise when running a site for the first time.

Re-run it (a few times if needed). It should come up ok after
everything has been installed.

The problem occurs when the installation takes longer than the allowed
startup time. Since the installer is clever enough to fill in only what
is missing rather than start over from scratch, each run gets a bit
further, until the installation eventually completes.
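
For reference, the crash reason in the log above, {timeout,{gen_server,call,[Pid,Request]}}, has the shape of the exit that gen_server:call/2 raises when the callee does not reply within its default 5000 ms. Below is a minimal sketch that reproduces the same kind of exit. The module name slow_start_demo, the 100 ms timeout and the 500 ms sleep are made up for illustration and are not taken from the Zotonic source.

```erlang
-module(slow_start_demo).
-behaviour(gen_server).

-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start a server whose start_child request is slower than the caller's
%% timeout, and return the exit that the caller sees.
run() ->
    {ok, Pid} = gen_server:start(?MODULE, [], []),
    Result =
        try
            %% 100 ms stands in for the 5000 ms default of gen_server:call/2.
            %% With call/3 the timeout value is also part of the exit reason.
            gen_server:call(Pid, {start_child, testblog}, 100)
        catch
            exit:Reason -> {exited, Reason}
        end,
    exit(Pid, kill),
    Result.

init([]) ->
    {ok, nostate}.

handle_call({start_child, _Site}, _From, State) ->
    timer:sleep(500),   % hypothetical stand-in for a slow first-time install
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

slow_start_demo:run() returns {exited, {timeout, {gen_server, call, [Pid, {start_child, testblog}, 100]}}}. The call is abandoned by the caller, but the work on the server side is not rolled back, which fits the observation that re-running picks up where the previous attempt stopped.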

Cheers,
Andreas


Arjan Scherpenisse

May 7, 2012, 2:17:25 AM

to zotoni...@googlegroups.com

This actually comes up a lot. I'll look into whether more can
be done asynchronously at site startup.

Arjan

Marc Worrell

May 7, 2012, 4:02:11 AM

to zotoni...@googlegroups.com

Maybe we need to have a staged startup:

1. Check if we need to run the installer
2. Check if we need to upgrade
3. Start the system

The question is whether we can do that by just loading different sets of children into the site supervisor.
Otherwise we need a more complex mechanism where modules wait for a certain kind of startup signal.

Another option is to rework z_supervisor into a z_supervisor_staged.

There we could give each task a 'stage', and only continue loading the next stage of processes once all tasks from the previous stage have reported back.
A crash should restart the whole process (I think).
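
A rough sketch of the staged idea, to make it concrete. This is not z_supervisor code: the module name staged_start_demo and the task funs are hypothetical, and a real version would start supervised processes rather than run plain funs. Each stage is a list of tasks, and the next stage only begins once every task of the current stage has reported back.

```erlang
-module(staged_start_demo).
-export([run/1, demo/0]).

%% Stages is a list of {StageName, [Task]}; each Task is a zero-arity fun
%% returning ok | {error, Reason}. Tasks within one stage run concurrently.
run([]) ->
    ok;
run([{Stage, Tasks} | Rest]) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, safe(Task)} end),
                Ref
            end || Task <- Tasks],
    %% Wait until every task of this stage has reported back.
    Results = [receive {Ref, Result} -> Result end || Ref <- Refs],
    case [R || R <- Results, R =/= ok] of
        []       -> run(Rest);
        Failures -> {error, {Stage, Failures}}
    end.

%% Turn a crash in a task into an error result so the stage cannot hang.
safe(Task) ->
    try Task()
    catch Class:Reason -> {error, {Class, Reason}}
    end.

demo() ->
    Step = fun(Msg) -> fun() -> io:format("~s~n", [Msg]), ok end end,
    run([{install, [Step("check if the installer must run")]},
         {upgrade, [Step("check if an upgrade is needed")]},
         {start,   [Step("start db"), Step("start modules")]}]).
```

Here a failed stage simply stops the sequence and returns {error, {Stage, Failures}}. Whether that should instead restart the whole sequence is exactly the open question above.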

- Marc

M-MZ

May 7, 2012, 4:34:40 AM

to zotoni...@googlegroups.com

> Maybe we need to have a staged startup:
>
> 1. Check if we need to run the installer
> 2. Check if we need to upgrade
> 3. Start the system

That was what I was thinking too.

> The question is whether we can do that by just loading different sets of children into the site supervisor.
> Otherwise we need a more complex mechanism where modules wait for a certain kind of startup signal.

Dynamic children are a possibility. I was thinking of separate supervisors for each stage. For the installer you need to have the db running, and so on. That would also help with a problem I sometimes have: during a restart/shutdown the db is already gone and some processes can't save their stuff.

> Another option is to rework z_supervisor into a z_supervisor_staged.

Hmm, do we really need that?

> There we could give each task a 'stage', and only continue loading the next stage of processes once all tasks from the previous stage have reported back.
> A crash should restart the whole process (I think).

That depends on where the crash is, I guess. If you crash during the install stage there is not much point in trying to restart forever. If the system has already been started, the supervisor can just restart that part without running the installer and update checks. Or not?

Maas

Marc Worrell

May 7, 2012, 4:49:58 AM

to zotoni...@googlegroups.com

>> The question is whether we can do that by just loading different sets of children into the site supervisor.
>> Otherwise we need a more complex mechanism where modules wait for a certain kind of startup signal.
>
> Dynamic children are a possibility. I was thinking of separate supervisors for each stage. For the installer you need to have the db running, and so on. That would also help with a problem I sometimes have: during a restart/shutdown the db is already gone and some processes can't save their stuff.

So we need a process of site startup/teardown that is more like the *nix startup/shutdown?

I don't think the normal OTP supervisors support this.

But then, it is not a new problem; someone must have written something for this :)

>> Another option is to rework z_supervisor into a z_supervisor_staged.
>
> Hmm, do we really need that?

I'd prefer not :)

>> There we could give each task a 'stage', and only continue loading the next stage of processes once all tasks from the previous stage have reported back.
>> A crash should restart the whole process (I think).
>
> That depends on where the crash is, I guess. If you crash during the install stage there is not much point in trying to restart forever. If the system has already been started, the supervisor can just restart that part without running the installer and update checks. Or not?

Sometimes problems are really transient. Think of network hiccups, a hard disk that needs to spin up and gives timeouts, a database that is still booting, recovering or warming up, etc.
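
For transient failures like these, one common pattern is to retry a step a bounded number of times with a growing delay instead of giving up (or restarting forever). A minimal sketch, assuming the step is a fun that returns {ok, Value} or {error, Reason}. The module name retry_demo and the doubling delay are illustrative choices, not anything Zotonic does today.

```erlang
-module(retry_demo).
-export([retry/3]).

%% Call Fun up to Attempts times, sleeping DelayMs before the second try
%% and doubling the delay after every further failure.
%% Returns the first {ok, Value}, or the last {error, Reason}.
retry(Fun, Attempts, DelayMs) when is_function(Fun, 0), Attempts > 0 ->
    case Fun() of
        {ok, Value} ->
            {ok, Value};
        {error, _Reason} when Attempts > 1 ->
            timer:sleep(DelayMs),
            retry(Fun, Attempts - 1, DelayMs * 2);
        {error, Reason} ->
            {error, Reason}
    end.
```

For example, retry_demo:retry(fun connect_db/0, 5, 200), where connect_db/0 is a hypothetical function, would try five times with pauses of 200, 400, 800 and 1600 ms before returning the last error.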

- M

M-MZ

May 7, 2012, 4:59:05 AM

to zotoni...@googlegroups.com

M-MZ

May 7, 2012, 5:16:12 AM

to zotoni...@googlegroups.com

On Monday, May 7, 2012 10:59:05 AM UTC+2, M-MZ wrote:

On Monday, May 7, 2012 10:49:58 AM UTC+2, Marc Worrell wrote:
>>> The question is whether we can do that by just loading different sets of children into the site supervisor.
>>> Otherwise we need a more complex mechanism where modules wait for a certain kind of startup signal.
>>
>> Dynamic children are a possibility. I was thinking of separate supervisors for each stage. For the installer you need to have the db running, and so on. That would also help with a problem I sometimes have: during a restart/shutdown the db is already gone and some processes can't save their stuff.
>
> So we need a process of site startup/teardown that is more like the *nix startup/shutdown?
> I don't think the normal OTP supervisors support this.
>
> But then, it is not a new problem; someone must have written something for this :)

Right.

According to the OTP manual, children are terminated in reverse start order before the supervisor itself terminates. So if the db process is started in stage 1 (in the install_sup) and the system_sup processes in stage 3, the site processes are terminated before the db process is. That should leave them with a functional db.

Hmm, why doesn't it work like that right now?

> Sometimes problems are really transient. Think of network hiccups, a hard disk that needs to spin up and gives timeouts, a database that is still booting, recovering or warming up, etc.

Oh yeah, forgot that. Currently Zotonic sometimes has a hard time recovering from a Postgres restart caused by the dreaded OOM killer. Postgres can be busy with the transaction log for quite some time after that.

Maas

Marc Worrell

May 7, 2012, 5:38:52 AM

to zotoni...@googlegroups.com

On 7 May 2012, at 11:16, M-MZ wrote:

> Right.
>
> According to the OTP manual, children are terminated in reverse start order before the supervisor itself terminates. So if the db process is started in stage 1 (in the install_sup) and the system_sup processes in stage 3, the site processes are terminated before the db process is. That should leave them with a functional db.
>
> Hmm, why doesn't it work like that right now?

I guess because they are terminated, but not _requested_ to terminate.

So I think they are killed in the correct order, but they don't get a chance to clean up after themselves...
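
A small sketch of both points, under stated assumptions: a supervisor stops its children in reverse start order, and a child only gets to run cleanup code if it traps exits (for a gen_server, terminate/2 is only called on a supervisor shutdown when the process traps exits). The module name shutdown_order_demo and the db/site children are hypothetical stand-ins, not Zotonic's actual processes.

```erlang
-module(shutdown_order_demo).
-behaviour(supervisor).

-export([run/0]).
-export([init/1]).                          % supervisor callback
-export([start_worker/2, worker_init/3]).

%% Start a supervisor with children db and site (in that order), shut it
%% down, and return the order in which the children reported stopping.
run() ->
    Old = process_flag(trap_exit, true),
    {ok, Sup} = supervisor:start_link(?MODULE, self()),
    exit(Sup, shutdown),                    % we are its parent, so it stops
    receive {'EXIT', Sup, _} -> ok end,
    process_flag(trap_exit, Old),
    collect([]).

collect(Acc) ->
    receive
        {stopped, Name} -> collect([Name | Acc])
    after 0 ->
        lists:reverse(Acc)
    end.

init(Collector) ->
    SupFlags = #{strategy => one_for_one},
    Children = [child(db, Collector), child(site, Collector)],
    {ok, {SupFlags, Children}}.

child(Name, Collector) ->
    #{id => Name, start => {?MODULE, start_worker, [Name, Collector]}}.

start_worker(Name, Collector) ->
    proc_lib:start_link(?MODULE, worker_init, [self(), Name, Collector]).

worker_init(Parent, Name, Collector) ->
    %% Without trap_exit the shutdown signal would end this process at
    %% once and the cleanup branch below would never run.
    process_flag(trap_exit, true),
    proc_lib:init_ack(Parent, {ok, self()}),
    receive
        {'EXIT', Parent, Reason} ->
            Collector ! {stopped, Name},    % stand-in for real cleanup
            exit(Reason)
    end.
```

shutdown_order_demo:run() should return [site, db]: site was started last and is stopped first, while db is still alive.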

> Oh yeah, forgot that. Currently Zotonic sometimes has a hard time recovering from a Postgres restart caused by the dreaded OOM killer. Postgres can be busy with the transaction log for quite some time after that.

Indeed :)

As can any database; recovering from failure almost always involves moving lots of data around.

- M

M-MZ

May 7, 2012, 7:03:06 AM

to zotoni...@googlegroups.com

> I guess because they are terminated, but not _requested_ to terminate.
> So I think they are killed in the correct order, but they don't get a chance to clean up after themselves...

So far I've applied the ostrich algorithm. After a closer look I understand.

The processes are asked to terminate (terminate is called), but they sometimes can't clean up because z_trans_server is already gone, which is indeed started after z_module_manager.

Maas

M-MZ

May 7, 2012, 7:32:28 AM

to zotoni...@googlegroups.com

Looking at z_site_supervisor: why doesn't it properly terminate its children when it terminates?

My guess is that this is the reason for the random behavior I'm seeing.

Maas

eigenfunction

May 13, 2012, 11:41:21 AM

to Zotonic users

Since I opened this thread: in the end it was my own fault. I had a
pretty complex setup. I was working in a virtual machine running
Ubuntu that shared folders with its host machine running Windows, and
moving data in either direction is extremely slow. I moved my whole
development folder over to Ubuntu, and now creating a new blog and
starting an application is a breeze. I do have other issues, but for
me this case is closed.
Thanks for your help.

Marc Worrell

May 14, 2012, 3:02:46 AM

to zotoni...@googlegroups.com

Thanks for reporting back!

We will still look into these timeouts before 1.0.

Thank you,

Marc
