Help with "Clustering and Domain Setup Walkthrough"


Eric Hodges

Jun 23, 2021, 2:06:46 PM
to WildFly
I have successfully configured a WildFly 22 domain controller, three servers and three server groups. They are all running on the same machine right now. I need to move one of the servers to another machine.

I found this document, which looks promising:

https://docs.wildfly.org/22/High_Availability_Guide.html#Clustering_and_Domain_Setup_Walkthrough

When I follow those instructions, I get to the "Dry Run" section and my "slave" fails to connect to my "master". Such loaded terms.

The instructions for configuring the slave's host.xml's domain-controller are confusing. They show two different ways to configure it:

<domain-controller>
   <remote protocol="remote" host="10.211.55.7" port="9999" />
</domain-controller>

and

<domain-controller>
 <remote security-realm="ManagementRealm" >
   <discovery-options>
     <static-discovery name="master-native" protocol="remote"  host="10.211.55.7" port="9999" />
     <static-discovery name="master-https" protocol="https-remoting" host="10.211.55.7" port="9993" security-realm="ManagementRealm"/>
     <static-discovery name="master-http" protocol="http-remoting" host="10.211.55.7" port="9990" />
   </discovery-options>
 </remote>
</domain-controller>

The 2nd way fails to execute because the attribute "security-realm" isn't allowed on the "static-discovery" element.
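
For comparison, here is a sketch of the second form with the rejected attribute dropped from the "master-https" entry. It assumes the rest of the walkthrough's snippet is still valid for WildFly 22, which nothing in this thread confirms; the host and ports are the walkthrough's example values.

```xml
<domain-controller>
  <!-- security-realm stays on <remote>, where the walkthrough also sets it -->
  <remote security-realm="ManagementRealm">
    <discovery-options>
      <static-discovery name="master-native" protocol="remote" host="10.211.55.7" port="9999"/>
      <!-- security-realm removed from this entry: the parser rejects it on static-discovery -->
      <static-discovery name="master-https" protocol="https-remoting" host="10.211.55.7" port="9993"/>
      <static-discovery name="master-http" protocol="http-remoting" host="10.211.55.7" port="9990"/>
    </discovery-options>
  </remote>
</domain-controller>
```

An entry can only succeed if the "master" actually listens on the port it names.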

Any tips about diagnosing why my "slave" can't connect to my "master"? I can ping the "master"'s IP address from the "slave". I have firewalls disabled.

Is there another document that I should be looking at?

Eric Hodges

Jun 23, 2021, 2:18:11 PM
to WildFly
One more problem I ran into with that walkthrough:

In the section that configures the "security-realm" on the "slave", there's this reference to a key store:

<keystore path="server.keystore" relative-to="jboss.domain.config.dir" keystore-password="jbossas" alias="jboss" key-password="jbossas"/>

but my WildFly doesn't have that keystore, and there are no instructions in the walkthrough for creating it.

Eric Hodges

Jun 23, 2021, 4:17:14 PM
to WildFly
I can see from my network tools that nothing is listening on port 9999 on my "master" machine. Is there some switch that turns that on in WildFly?

Ken Wills

Jun 23, 2021, 4:28:47 PM
to Eric Hodges, WildFly
Hi Eric,

On Wed, Jun 23, 2021 at 3:17 PM Eric Hodges <eho...@usdataworks.com> wrote:
I can see from my network tools that nothing is listening on port 9999 on my "master" machine. Is there some switch that turns that on in WildFly?

The port you use in the secondary (slave) configuration should match the management interface port on the primary (master); the default is 9990. I suspect the documentation reflects an older default value. You can check the configured value in the primary's host.xml:

        <management-interfaces>
            <http-interface security-realm="ManagementRealm">
                <http-upgrade enabled="true"/>
                <socket interface="management" port="${jboss.management.http.port:9990}"/>
            </http-interface>
        </management-interfaces>
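
On the secondary side, the matching entry would point at that same port. This is a minimal sketch, not a verified configuration: it assumes a layout like the stock host-slave.xml, and the remote+http protocol value is an assumption based on the primary exposing an http-interface with http-upgrade enabled. The host value is the walkthrough's example address, and the entry name "primary" is arbitrary.

```xml
<domain-controller>
  <remote security-realm="ManagementRealm">
    <discovery-options>
      <!-- port matches the primary's http-interface (default 9990) -->
      <!-- protocol="remote+http" is an assumption; adjust to your setup -->
      <static-discovery name="primary" protocol="remote+http" host="10.211.55.7" port="9990"/>
    </discovery-options>
  </remote>
</domain-controller>
```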

Ken
 

Eric Hodges

Jun 23, 2021, 4:41:52 PM
to WildFly
Thanks, Ken.

I thought that might be the case. 

My primary's management interface is on 9990.

When I try port 9990 on the secondary's domain-controller's remote element, I get a connection timeout exception:

[Host Controller] 15:36:21,566 WARN  [org.jboss.as.host.controller] (Controller Boot Thread) WFLYHC0001: Could not connect to remote domain controller remote://###.###.###.###:9990: java.net.ConnectException: WFLYPRT0023: Could not connect to remote://###.###.###.###:9990. The connection timed out

The secondary tries that several times, then gives up like this:

[Host Controller] 15:36:52,149 WARN  [org.jboss.as.host.controller] (Controller Boot Thread) WFLYHC0147: No domain controller discovery options remain.
[Host Controller] 15:36:52,149 ERROR [org.jboss.as.host.controller] (Controller Boot Thread) WFLYHC0002: Could not connect to master. Error was: java.lang.IllegalStateException: WFLYHC0120: Tried all domain controller discovery option(s) but unable to connect
[Host Controller] 15:36:52,149 FATAL [org.jboss.as.host.controller] (Controller Boot Thread) WFLYHC0178: Aborting with exit code 99

I don't see anything in the primary's log when the secondary is trying to connect.

Ken Wills

Jun 23, 2021, 5:28:53 PM
to Eric Hodges, WildFly
I would check the following:

(1) On the primary, run netstat -anp | more and look for port 9990. Make sure it is bound to an external IP address (it sounds like this is already the case).
(2) On the secondary, while the primary is running but the secondary is not, try connecting to the primary with jboss-cli.sh:
$ jboss-cli.sh -c --controller=localhost:9990

If (2) succeeds, then something else is likely not quite right in the configs. If it fails, the most likely cause is a host firewall on the primary blocking incoming connections.

Ken
 

Eric Hodges

Jun 23, 2021, 5:38:17 PM
to Ken Wills, WildFly
Thank you!

(1) This is on Windows, but I can use Get-NetTCPConnection to see that 9990 is bound to an external IP address (the same one I can ping from the secondary server).

(2) This fails, but I don't understand the command. Shouldn't the --controller address be the primary's IP?

When I use the primary's IP, I get prompted for a user name and password.

$ ./jboss-cli.sh -c --controller=###.###.###.###:9990
Authenticating against security realm: ManagementRealm
Username: 
--

Eric Hodges

Sr. Product Engineer

Ken Wills

Jun 23, 2021, 5:45:16 PM
to Eric Hodges, WildFly


On Wed, Jun 23, 2021, 16:38 Eric Hodges <eho...@usdataworks.com> wrote:
Thank you!

(1) This is on Windows, but I can use Get-NetTCPConnection to see that 9990 is bound to an external IP address (the same one I can ping from the secondary server).

(2) This fails, but I don't understand the command. Shouldn't the --controller address be the primary's IP?

Yes, sorry I meant to change that before sending.


When I use the primary's IP, I get prompted for a user name and password.

$ ./jboss-cli.sh -c --controller=###.###.###.###:9990
Authenticating against security realm: ManagementRealm
Username: 

OK, that seems to work then. Is the secondary configured to authenticate, and can it authenticate successfully? I don't think an authentication problem would result in a connection timeout, but it is probably still worth checking, along with comparing the secondary's config to the examples in the docs below.

FWIW, you might find the docs here more useful:

Eric Hodges

Jun 24, 2021, 10:51:29 AM
to Ken Wills, WildFly
Thank you!

That document makes some things clearer.

In that doc, they add a "native" interface to the primary, listening on port 9999:

  <native-interface security-realm="ManagementRealm">
    <socket interface="management" port="${jboss.management.native.port:9999}"/>
  </native-interface>


The first doc I was using doesn't add that, and WildFly doesn't ship with that in the default configuration. It only has an "http-interface" listening on port 9990.

After adding the "native-interface" to "management-interfaces", I can see that process listening on port 9999 on my primary.
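
Put together, the primary's management-interfaces block would then look roughly like this. It is a sketch that simply combines the two fragments quoted in this thread; the order of the two elements is an assumption, and nothing else about the configuration is implied.

```xml
<management-interfaces>
    <!-- added by hand: native management interface on 9999 -->
    <native-interface security-realm="ManagementRealm">
        <socket interface="management" port="${jboss.management.native.port:9999}"/>
    </native-interface>
    <!-- shipped default: HTTP management interface on 9990 -->
    <http-interface security-realm="ManagementRealm">
        <http-upgrade enabled="true"/>
        <socket interface="management" port="${jboss.management.http.port:9990}"/>
    </http-interface>
</management-interfaces>
```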

While digging around in the web management console, trying to find a way to list the users, I saw that Configuration->Interfaces->Management didn't have its Inet Address set to my primary machine's IP address. So I set it via the web console and saw that it updated my primary's domain.xml file. Both of the documents I've been reading had me updating that in the host.xml file. Neither mentions changing it in the domain.xml file.

Once I changed that, the secondary was able to connect to the primary!
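
For anyone looking for the equivalent file edit, the change in domain.xml would presumably look something like the sketch below. This is an assumption about what the web console writes, not a copy of the actual file; the address shown is the walkthrough's example IP and stands in for the primary machine's real one.

```xml
<interfaces>
    <!-- assumption: the console adds an inet-address to the "management" interface -->
    <interface name="management">
        <inet-address value="10.211.55.7"/>
    </interface>
    <!-- other interface definitions left as shipped -->
    <interface name="public"/>
</interfaces>
```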

Thanks for your help. Hopefully the next person with this trouble can find my post.