DNS problem with Scalr


cocoy

Dec 17, 2008, 2:03:51 AM
to scalr-discuss

I've configured Scalr on EC2 and installed BIND. When I launch a farm
with MySQL and app servers, it runs OK. My internal domain is tm.local,
with ns1.tm.local set as the nameserver.

An application was created as sample.tm.local, and the zone file was
created:

@ 14400 IN SOA server1.tm.local. root.server1.tm.local. (
2008121702 ; serial, todays date+todays
14400 ; refresh, seconds
7200 ; retry, seconds
86400 ; expire, seconds
300 ) ; minimum, seconds

sample.tm.local. 14400 IN NS server1.tm.local.
ext-app 20 IN A 10.252.66.147
int-app 20 IN A 10.252.66.147
@ 90 IN A 10.252.66.176
ext-mysql 20 IN A 10.252.66.176
ext-mysql-master 20 IN A 10.252.66.176
int-mysql 20 IN A 10.252.66.176
int-mysql-master 20 IN A 10.252.66.176


Unfortunately, I cannot dig these hosts from inside the Scalr server
(though ns1.tm.local resolves fine).
Can you advise what's wrong with the setup?

Thanks in advance,
Rodney


Father

Dec 17, 2008, 6:50:47 AM
to scalr-discuss
Hi,

You probably cannot dig the hosts from inside your Scalr server
because the record for your zone was not added to 'named.conf'.
I recently had the same problem. I suppose it's a bug.
You need to add the following to your 'named.conf':

zone "sample.tm.local" {
type master;
file "pri/sample.tm.local.db";
};

At least, that is exactly what I did. Maybe it has been fixed in a
newer version of Scalr.
I'm going to upgrade from RC2 to RC3, but not now... I don't have time :(
If you try that, please let us know the result.

Good luck,
Nick.
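
The manual workaround Nick describes can be scripted. The following is only a minimal sketch in Python, not Scalr's own code: the helper names (zone_stanza, has_zone, ensure_zone) are hypothetical, and it assumes a simple textual check of named.conf is good enough for a hand-maintained file.

```python
import re


def zone_stanza(zone: str, zone_file: str) -> str:
    """Render a master zone block in the same form as the stanza quoted above."""
    return f'zone "{zone}" {{\n    type master;\n    file "{zone_file}";\n}};\n'


def has_zone(named_conf_text: str, zone: str) -> bool:
    """Return True if a line of the text starts a declaration for this zone.

    This is a plain textual match, not a real named.conf parser.
    """
    pattern = r'^\s*zone\s+"' + re.escape(zone) + r'"'
    return re.search(pattern, named_conf_text, flags=re.MULTILINE) is not None


def ensure_zone(named_conf_text: str, zone: str, zone_file: str) -> str:
    """Append the zone stanza unless the zone is already declared."""
    if has_zone(named_conf_text, zone):
        return named_conf_text
    sep = "" if (not named_conf_text or named_conf_text.endswith("\n")) else "\n"
    return named_conf_text + sep + zone_stanza(zone, zone_file)


if __name__ == "__main__":
    # Illustrative named.conf content; the real file will differ.
    conf = 'options { directory "/var/bind"; };\n'
    print(ensure_zone(conf, "sample.tm.local", "pri/sample.tm.local.db"))
```

After editing the real file, BIND's named-checkconf can validate the syntax and rndc reload makes named pick up the new zone.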

Alex Kovalyov

Dec 17, 2008, 12:33:29 PM
to scalr-...@googlegroups.com
> Unfortunately I cannot dig these hosts inside scalr server(but
> ns1.tm.local is ok).
So dig sample.tm.local @localhost doesn't work, but dig sample.tm.local
@ns1.tm.local does?
Maybe BIND is not listening on localhost?

RodneyQ

Dec 17, 2008, 8:17:35 PM
to scalr-discuss
Thanks Nick and Alex.

@Nick:
I added the same lines you did, and it resolves now. I'm using RC3 :)

@Alex:
BIND is working properly and listening on the current host. Another
problem is that the internal IP written in the zone is the same as the
external IP (Elastic IP was checked when the instances were started).



RodneyQ

Dec 17, 2008, 10:08:48 PM
to scalr-discuss
Hi again,

The log shows that the IPs changed to Elastic IPs after I shut down
the farm, which had been running for a day. Something weird here :)

18-12-2008 02:58:19 INFO i-c67ac3af/trap-hostdown.sh 10.252.61.176 DOWN: Scalr notified me that 10.252.61.176 of role app (Custom role: app, I'm first: 0) is down
18-12-2008 02:58:17 WARN EventObserver IP changed for instance i-c67ac3af. New IP address: 174.129.252.245
18-12-2008 02:58:17 WARN EventObserver IP changed for instance i-d97ac3b0. New IP address: 174.129.249.16
17-12-2008 06:50:03 WARN EventObserver IP changed for instance i-c67ac3af. New IP address: 10.252.63.48
17-12-2008 06:50:03 WARN EventObserver IP changed for instance i-d97ac3b0. New IP address: 10.252.61.176
17-12-2008 06:47:36 INFO i-c67ac3af/mysql-init.sh Successfully uploaded MySQL data bundle to S3.
17-12-2008 06:47:30 INFO i-c67ac3af/mysql-init.sh Extracting MySQL data snapshot.

Rodney
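
To see at a glance which of the addresses in the log above are internal and which are public (Elastic) addresses, a short Python sketch using the standard ipaddress module may help. The classify helper is a hypothetical name introduced here for illustration; the addresses are copied from the log.

```python
import ipaddress


def classify(ip: str) -> str:
    """Label an address as 'private' (RFC 1918 and similar) or 'public'."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"


# Addresses taken from the EventObserver lines in the log above.
addresses = [
    "10.252.61.176",
    "10.252.63.48",
    "174.129.252.245",
    "174.129.249.16",
]

if __name__ == "__main__":
    for ip in addresses:
        print(ip, classify(ip))
```

Running the same check over the A records of a zone file would show whether int-* and ext-* names really point at different kinds of addresses.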

Alex Kovalyov

Dec 18, 2008, 3:05:26 AM
to scalr-...@googlegroups.com
IPs are currently assigned from the stack and do not stick to a
particular instance.
This will be resolved in the next update.

RodneyQ

Dec 18, 2008, 3:24:31 AM
to scalr-discuss
Hi Alex,

Cool, we'll look forward to that in the next update.
BTW, any idea why the zone is not written to named.conf, as Nick
also reported?

Cheers,
Rodney


Alex Kovalyov

Dec 18, 2008, 4:00:53 AM
to scalr-...@googlegroups.com
File permissions?
What does the log say?

RodneyQ

Dec 18, 2008, 4:14:09 AM
to scalr-discuss

@Alex: I see no errors or warnings in the log files.
It's probably something in the code that updates named.conf :)
I'll try updating to RC3, hope that gets rid of this error, and let
you know.
Thanks Alex.


@Nick: my bad, I've checked the version I'm using; it's still RC2, not
RC3.

Cheers,
Rodney


Alex Kovalyov

Dec 18, 2008, 4:19:45 AM
to scalr-discuss
The DNS management code has not changed in RC3.
Do you have the DNSZoneListUpdate cronjob set up at all?

RodneyQ

Dec 18, 2008, 4:28:04 AM
to scalr-discuss
Yes, all cronjobs are set properly. Scalr can create new zone files
inside the zone directory.
Only named.conf is not updated (i.e. no new zones are defined
inside this file).

Alex Kovalyov

Dec 18, 2008, 4:57:35 AM
to scalr-...@googlegroups.com
Does the path to named.conf match the one you entered in settings?
What does ls -al named.conf say?
Are you able to edit the file as the user that you entered in settings?

RodneyQ

Dec 18, 2008, 6:13:37 AM
to scalr-discuss

The default user in the Scalr settings is root, so it is supposed to
be able to read and append to this file.

-rw-r--r-- 1 bind bind 907 2008-08-27 18:42 named.conf


Alex Kovalyov

Dec 18, 2008, 7:22:23 AM
to scalr-...@googlegroups.com
Please open the System Log, find the transaction that starts with
"DNSZoneListUpdate", and paste it here.

RodneyQ

Dec 18, 2008, 7:54:40 AM
to scalr-discuss
DNSZoneListUpdate log:
-----------------
145895 INFO 2008-12-18 06:00:03 Starting DNSZoneListUpdate cronjob...
145896 DEBUG 2008-12-18 06:00:03 Process initialized.
145897 DEBUG 2008-12-18 06:00:03 Number of MaxChilds set to 5
145898 DEBUG 2008-12-18 06:00:03 Executing 'OnStartForking' routine
-------------------



DB log (this one always says 'OnStartForking' successfully executed,
unlike the DNS log above):
145941 INFO 2008-12-18 06:00:04 Starting MySQLMaintenance cronjob...
145942 DEBUG 2008-12-18 06:00:04 Process initialized.
145943 DEBUG 2008-12-18 06:00:05 Number of MaxChilds set to 5
145944 DEBUG 2008-12-18 06:00:05 Executing 'OnStartForking' routine
145945 DEBUG 2008-12-18 06:00:05 [FarmID: 12] Checking replication status
145946 DEBUG 2008-12-18 06:00:05 [FarmID: 12] There are no running slave hosts.
145947 DEBUG 2008-12-18 06:00:05 'OnStartForking' successfully executed.
145948 DEBUG 2008-12-18 06:00:05 ProcessObject::ThreadArgs is empty. Nothing to do.







Alex Kovalyov

Dec 18, 2008, 8:53:46 AM
to scalr-...@googlegroups.com



On 18.12.08 14:54, "RodneyQ" <imc...@gmail.com> wrote:

>
> DNSZoneListUpdate log:
> -----------------
> 145895 INFO 2008-12-18 06:00:03 Starting DNSZoneListUpdate cronjob...
> 145896 DEBUG 2008-12-18 06:00:03 Process initialized.
> 145897 DEBUG 2008-12-18 06:00:03 Number of MaxChilds set to 5
> 145898 DEBUG 2008-12-18 06:00:03 Executing 'OnStartForking' routine

* It's being interrupted here.
You should see "'OnStartForking' successfully executed" upon success.
PHP's error log is the place to look now.
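
The check Alex describes (a cronjob that logs "Executing 'OnStartForking' routine" but never logs the matching success line) can be automated when scanning many transactions. This is a minimal Python sketch under the assumption that the lines of one transaction have already been extracted and unwrapped; the function name on_start_forking_status is hypothetical, and the sample lines are copied from the logs in this thread.

```python
def on_start_forking_status(lines):
    """Classify one cronjob transaction by its OnStartForking messages.

    Returns 'not started', 'interrupted', or 'completed'.
    """
    started = any("Executing 'OnStartForking' routine" in line for line in lines)
    finished = any("'OnStartForking' successfully executed" in line for line in lines)
    if not started:
        return "not started"
    return "completed" if finished else "interrupted"


# Transactions copied from the thread, with wrapped lines joined.
dns_log = [
    "145895 INFO 2008-12-18 06:00:03 Starting DNSZoneListUpdate cronjob...",
    "145896 DEBUG 2008-12-18 06:00:03 Process initialized.",
    "145897 DEBUG 2008-12-18 06:00:03 Number of MaxChilds set to 5",
    "145898 DEBUG 2008-12-18 06:00:03 Executing 'OnStartForking' routine",
]
db_log = [
    "145941 INFO 2008-12-18 06:00:04 Starting MySQLMaintenance cronjob...",
    "145944 DEBUG 2008-12-18 06:00:05 Executing 'OnStartForking' routine",
    "145947 DEBUG 2008-12-18 06:00:05 'OnStartForking' successfully executed.",
]

if __name__ == "__main__":
    print("DNSZoneListUpdate:", on_start_forking_status(dns_log))
    print("MySQLMaintenance:", on_start_forking_status(db_log))
```

An "interrupted" result only says the routine never reported success; the cause still has to come from the PHP error log or from extra debug statements.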

RodneyQ

Dec 19, 2008, 1:46:39 AM
to scalr-discuss
Hi Alex,

It seems the problem is related to the RemoteBIND class called inside
class.DNSZoneListUpdateProcess.php.
I put a logging statement before the RemoteBIND call, and it shows up
in the system log.
After the RemoteBIND call, none of the following code is executed
(not even the call to log something).
No PHP errors show up in the PHP logs. :)

See the code below:

foreach ((array)$db->GetAll("SELECT * FROM nameservers WHERE isproxy='0'") as $ns)
{
    if ($ns["host"] != '')
    {
        $this->Logger->debug("currently here before calling RemoteBind class.");
        $nameservers[$ns["host"]] = new RemoteBIND(
            $ns["host"],
            $ns["port"],
            array("type" => "password", "login" => $ns["username"], "password" => $this->Crypto->Decrypt($ns["password"], $cpwd)),
            $ns["rndc_path"],
            $ns["namedconf_path"],
            $ns["named_path"],
            CONFIG::$NAMEDCONFTPL
        );
    }
    $this->Logger->debug("currently here AFTER calling RemoteBind class.");
}


See logs:

NO LUCK With DNSZoneListUpdate:

211086 INFO 2008-12-19 06:36:03 Starting DNSZoneListUpdate cronjob...
211087 DEBUG 2008-12-19 06:36:03 Process initialized.
211088 DEBUG 2008-12-19 06:36:03 Number of MaxChilds set to 5
211089 DEBUG 2008-12-19 06:36:03 Executing 'OnStartForking' routine
211090 DEBUG 2008-12-19 06:36:03 Value for NS host 'server1.tm.local'
211091 DEBUG 2008-12-19 06:36:03 BEFORE: RemoteBIND call


DNS MAINTENANCE goes ok.

211052 INFO 2008-12-19 06:36:03 Starting DNSMaintenance cronjob...
211053 DEBUG 2008-12-19 06:36:03 Process initialized.
211054 DEBUG 2008-12-19 06:36:03 Number of MaxChilds set to 5
211055 DEBUG 2008-12-19 06:36:03 Executing 'OnStartForking' routine
211056 INFO 2008-12-19 06:36:03 Fetching completed farms...
211057 INFO 2008-12-19 06:36:03 Found 1 farms.
211058 DEBUG 2008-12-19 06:36:03 'OnStartForking' successfully executed.
211059 DEBUG 2008-12-19 06:36:03 Begin add handler to signals...
211060 DEBUG 2008-12-19 06:36:03 Handle SIGCHLD = 1
211061 DEBUG 2008-12-19 06:36:03 Handle SIGTERM = 1
211062 DEBUG 2008-12-19 06:36:03 Handle SIGABRT = 1
211063 DEBUG 2008-12-19 06:36:03 Handle SIGUSR2 = 1
211064 DEBUG 2008-12-19 06:36:03 Executing ProcessObject::ForkThreads()
211065 INFO 2008-12-19 06:36:03 [FarmID: 12] Checking DNS zones
211066 INFO 2008-12-19 06:36:03 [FarmID: 12] Checking zomby records
211067 INFO 2008-12-19 06:36:03 [FarmID: 12] Checking for malformed NS records
211068 INFO 2008-12-19 06:36:03 [FarmID: 12] Checking for malformed A records
211069 INFO 2008-12-19 06:36:03 [FarmID: 12] DNS zones check complete
211070 DEBUG 2008-12-19 06:36:03 Child with PID# 4123 successfully forked
211092 DEBUG 2008-12-19 06:36:03 HandleSignals received signal 17
211093 DEBUG 2008-12-19 06:36:03 Application received signal 17 from child with PID# 4123 (Exit code: 0)
211094 DEBUG 2008-12-19 06:36:03 All childs exited. Executing OnEndForking routine
211095 DEBUG 2008-12-19 06:36:03 Main process complete. Exiting...

I wonder if it's because /etc/bind is a symlink to /var/lib/bind? Hmm...
I'm still trying to fix it and reading through the code.

Rodney




RodneyQ

Dec 23, 2008, 4:34:24 AM
to scalr-discuss
Fix:

If running on Ubuntu, check that /etc/php5/cli has the same php.ini
as /etc/php5/apache2,

since cron runs the jobs via the shell and not through Apache :)
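
One way to spot where the two php.ini files differ is to compare their settings directly. The Python sketch below is only an illustration: parse_ini and diff_settings are hypothetical helper names, the parser is deliberately naive (it ignores section headers and does not strip trailing inline comments), and the sample settings are invented, since the thread does not say which setting was responsible.

```python
def parse_ini(text):
    """Collect 'key = value' lines into a dict of key -> list of values.

    Repeated keys (for example several extension= lines) are kept in order.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith((";", "#", "[")):
            continue  # skip blanks, comments, and section headers
        key, sep, value = line.partition("=")
        if sep:
            settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def diff_settings(cli_text, apache_text):
    """Return {key: (cli_values, apache_values)} for settings that differ."""
    cli, apache = parse_ini(cli_text), parse_ini(apache_text)
    return {
        key: (cli.get(key), apache.get(key))
        for key in sorted(set(cli) | set(apache))
        if cli.get(key) != apache.get(key)
    }


if __name__ == "__main__":
    # Invented sample contents, for illustration only.
    cli_ini = "[PHP]\nmemory_limit = 32M\ndisplay_errors = Off\n"
    apache_ini = "[PHP]\nmemory_limit = 128M\ndisplay_errors = Off\n"
    for key, (cli_val, apache_val) in diff_settings(cli_ini, apache_ini).items():
        print(f"{key}: cli={cli_val} apache2={apache_val}")
```

On a real system the two texts would be read from the php.ini in each directory (assumed here to be /etc/php5/cli/php.ini and /etc/php5/apache2/php.ini).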