OKD installation on RHV


Batur Orkun

Jul 13, 2021, 3:49:17 AM
to okd-wg
Unfortunately, my efforts have been going on for 5 months. I could not install any version of OKD except v4.5. I tried to upgrade from 4.5 but failed. I try to install OKD every 2 weeks, like a sport. :) This time I tried the latest nightly build of v4.8 (openshift-install-linux-4.8.0-0.okd-2021-07-12-031914.tar.gz). There are 3 master nodes but no worker nodes. Bootstrap has not finished yet.

[root@bastion ~]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          17h     Unable to apply 4.8.0-0.okd-2021-07-12-031914: an unknown error has occurred: MultipleErrors

I need to install a cluster on 3 bare-metal machines. Our company is a Red Hat partner. I assumed that RHV, being a Red Hat product, must be compatible with OCP, but I have had many problems from the beginning. Does anybody have experience with OKD on RHV / oVirt? Please help me.

Thanks

Josef Meier

Jul 13, 2021, 3:57:16 AM
to Batur Orkun, okd-wg
Hi,

Did you approve the CSRs?

oc get csr -o name | xargs oc adm certificate approve
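
A new node usually raises more than one CSR (a kubelet client request first, then a serving request), so the approval may need to be repeated. As a minimal sketch that is not part of the original reply, the helper below approves only CSRs that have no status yet, that is, pending ones. The function name `approve_pending_csrs` is hypothetical; the go-template filter follows the pattern shown in the OpenShift documentation.

```shell
# Hypothetical helper: approve only CSRs that are still pending.
approve_pending_csrs() {
  # Print the name of every CSR whose .status is empty (neither approved nor denied).
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' |
  while read -r name; do
    # Skip blank lines, then approve each pending request by name.
    if [ -n "$name" ]; then
      oc adm certificate approve "$name" </dev/null
    fi
  done
}
```

Calling `approve_pending_csrs` again a minute or two later should pick up any serving CSRs that appear once the client CSRs have been approved.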

Sent from my iPhone

On 13.07.2021 at 09:49, Batur Orkun <batur...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "okd-wg" group.
To unsubscribe from this group and stop receiving emails from it, send an email to okd-wg+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/okd-wg/1f1d3124-9d72-4165-8e56-5fda54879375n%40googlegroups.com.

Vadim Rutkovsky

Jul 13, 2021, 4:11:59 AM
to Batur Orkun, okd-wg
On Tue, Jul 13, 2021 at 9:49 AM Batur Orkun <batur...@gmail.com> wrote:
>
> There are 3 master nodes but no worker nodes. Bootstrap is not finished yet.

Please provide a log bundle collected after the installer has failed.


--
Cheers,
Vadim

Batur Orkun

Jul 13, 2021, 4:34:26 AM
to okd-wg
Hi,
Yes, you are right. I forgot this command :( I ran it just now, but I have only 3 CSRs:

[root@bastion ~]# oc  get csr
NAME                                       AGE   SIGNERNAME                                    REQUESTOR                                                                         CONDITION
csr-6psvr                                  17h   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         Approved,Issued
csr-slqwz                                  51m   kubernetes.io/kube-apiserver-client-kubelet   system:node:dev-rtxw4-master-0                                                    Approved,Issued
csr-z8d4j                                  17h   kubernetes.io/kubelet-serving                 system:node:dev-rtxw4-master-0                                                    Approved,Issued
system:openshift:openshift-authenticator   17h   kubernetes.io/kube-apiserver-client           system:serviceaccount:openshift-authentication-operator:authentication-operator   Approved,Issued

Batur Orkun

Jul 13, 2021, 6:55:05 AM
to okd-wg
Hi Vadim,

The "oc adm must-gather" command fails to run:

error running backup collection: Get "https://api.dev.my.com.tr:6443/api?timeout=32s": dial tcp 192.168.2.48:6443: connect: connection refused
error: gather did not start for pod must-gather-4sx96: Get "https://api.dev.my.com.tr:6443/api/v1/namespaces/openshift-must-gather-ppv86/pods/must-gather-4sx96": dial tcp 192.168.2.48:6443: connect: connection refused

Batur Orkun

Jul 13, 2021, 6:56:56 AM
to okd-wg
I ran the approve command above.
Now the installation may be continuing:

[root@bastion ~]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          20h     Working towards 4.8.0-0.okd-2021-07-12-031914: 454 of 702 done (64% complete)

But there are still no worker nodes or apps URL available.

Vadim Rutkovsky

Jul 13, 2021, 7:09:29 AM
to Batur Orkun, okd-wg
On Tue, Jul 13, 2021 at 12:56 PM Batur Orkun <batur...@gmail.com> wrote:
>
> hi vadim
>
> "oc adm must-gather" command is not runnable
>
> error running backup collection: Get "https://api.dev.my.com.tr:6443/api?timeout=32s": dial tcp 192.168.2.48:6443: connect: connection refused
> error: gather did not start for pod must-gather-4sx96: Get "https://api.dev.my.com.tr:6443/api/v1/namespaces/openshift-must-gather-ppv86/pods/must-gather-4sx96": dial tcp 192.168.2.48:6443: connect: connection refused

If `oc` can get pods and nodes, so should must-gather. Check the
kubeconfig you're using.




--
Cheers,
Vadim

Batur Orkun

Jul 13, 2021, 7:48:26 AM
to okd-wg
```
[root@bastion ~]# KUBECONFIG=/root/okd/install/auth/kubeconfig
[root@bastion ~]# oc login -u system:admin
error: couldn't get https://api.dev.my.com.tr:6443/.well-known/oauth-authorization-server: unexpected response status 404
[root@bastion ~]# oc  get nodes
NAME                 STATUS     ROLES    AGE   VERSION
dev-rtxw4-master-0   NotReady   master   20h   v1.21.1+f36aa36-1389
[root@bastion ~]# oc  get pods
No resources found in default namespace.
[root@bastion ~]#  oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          21h     Unable to apply 4.8.0-0.okd-2021-07-12-031914: an unknown error has occurred: MultipleErrors
[root@bastion ~]#
```
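
One detail in the transcript above is worth checking: a plain `KUBECONFIG=...` assignment only sets a shell variable, and child processes such as `oc` see it only if the variable is exported. It may already have been exported earlier in this session, which the transcript does not show. A minimal sketch of the difference, using `sh -c` as a stand-in for the child process:

```shell
# Start from a clean state so the variable is neither set nor exported.
unset KUBECONFIG

# Plain assignment: visible in this shell, but not passed to child processes.
KUBECONFIG=/root/okd/install/auth/kubeconfig
child_sees=$(sh -c 'echo "${KUBECONFIG:-unset}"')
echo "without export: $child_sees"

# After export, child processes (oc included) inherit the value.
export KUBECONFIG
child_sees_exported=$(sh -c 'echo "${KUBECONFIG:-unset}"')
echo "with export: $child_sees_exported"
```

If the first line prints `unset` on the bastion, `oc` was most likely reading a different kubeconfig (for example the default `~/.kube/config`) than the one generated by the installer.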

Vadim Rutkovsky

Jul 13, 2021, 8:16:21 AM
to Batur Orkun, okd-wg
On Tue, Jul 13, 2021 at 1:48 PM Batur Orkun <batur...@gmail.com> wrote:
>
> ```
> [root@bastion ~]# KUBECONFIG=/root/okd/install/auth/kubeconfig
> [root@bastion ~]# oc login -u system:admin
> error: couldn't get https://api.dev.my.com.tr:6443/.well-known/oauth-authorization-server: unexpected response status 404
> [root@bastion ~]# oc get nodes
> NAME STATUS ROLES AGE VERSION
> dev-rtxw4-master-0 NotReady master 20h v1.21.1+f36aa36-1389

Wait, so the install never completed? openshift-install should have
collected a log bundle. Could you attach or upload that?



--
Cheers,
Vadim

Batur Orkun

Jul 13, 2021, 8:31:12 AM
to okd-wg
Yes Vadim, as I said, bootstrap has not finished yet.

I ran "./openshift-install gather bootstrap --dir=install" and created a download link on Google Drive.

Vadim Rutkovsky

Jul 13, 2021, 9:03:47 AM
to Batur Orkun, okd-wg
On Tue, Jul 13, 2021 at 2:31 PM Batur Orkun <batur...@gmail.com> wrote:
>
> yes Vadim, i said Bootstrap is not finished yet.
>
> Run "./openshift-install gather bootstrap --dir=install" and created a download link from google drive
>
> https://drive.google.com/file/d/1KTQCsUUi3KlZx7GfPn53Uu3OkGl62Gif/view?usp=sharing

No data in dev-rtxw4-master-0:
```
Collecting info from dev-rtxw4-master-0
lost connection
Warning: Permanently added 'dev-rtxw4-master-0' (ED25519) to the list of known hosts.
core@dev-rtxw4-master-0: Permission denied (publickey,password).
```

And the node is up:
{
"address": "dev-rtxw4-master-0",
"type": "Hostname"
}

This is wrong: your nodes must have an FQDN as their address to function
properly. Check your DHCP settings.
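
As a rough sketch of how to spot this on a running cluster, the helpers below list each node's `Hostname` address and flag names that contain no dot. The function names `is_fqdn` and `check_node_hostnames` are hypothetical, and "contains a dot" is only a simple heuristic for "looks like an FQDN", not a full validity check.

```shell
# Hypothetical helper: succeed if the name has a dot that is not at its start or end.
is_fqdn() {
  case "$1" in
    .*|*.) return 1 ;;
    *.*)   return 0 ;;
    *)     return 1 ;;
  esac
}

# Hypothetical helper: print each node's Hostname address and flag short names.
check_node_hostnames() {
  oc get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="Hostname")].address}{"\n"}{end}' |
  while read -r host; do
    # Ignore empty lines in the jsonpath output.
    if [ -z "$host" ]; then
      continue
    fi
    if is_fqdn "$host"; then
      echo "ok    $host"
    else
      echo "short $host"
    fi
  done
}
```

With the node from this thread, `check_node_hostnames` would report `dev-rtxw4-master-0` as `short`.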



--
Cheers,
Vadim

Batur Orkun

Jul 13, 2021, 2:14:27 PM
to okd-wg
Hi Vadim, thanks for looking into this.
My bootstrap machine does not know dev-rtxw4-master-0. I thought it used an IP address.
The installation script created the hostname dynamically. I added only 2 DNS records (API and apps).
I do not understand how to solve this problem from my DHCP. Maybe the problem is that I entered the real base domain (mycompany.com) in the installation script.
I did not create a specific domain. "dev" is my cluster name, so I created "api.dev.mycompany.com" and "*.apps.dev.mycompany.com".


Vadim Rutkovsky

Jul 13, 2021, 2:42:49 PM
to Batur Orkun, okd-wg
On Tue, Jul 13, 2021 at 8:14 PM Batur Orkun <batur...@gmail.com> wrote:
>
> hi Vadim thanks for your inspection.
> My bootstrap device does not know dev-rtx 4-master-0. I thought it uses an IP address.
> Installation script created e hostname dynamically. I added only 2 DNS addresses. (Api and Apps)
> I do not understand how to solve this problem from my DHCP.

Each host should have an FQDN set as its hostname, so that other nodes
can resolve it by name.
There are multiple ways of setting the hostname when a node is provisioned.
If you already have DHCP, you can configure it to send the hostname along
with the response.
The alternative is DNS PTR records and/or custom static configuration
when the node is being provisioned.
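
For the PTR route, one way to check what reverse DNS currently returns for a node's IP is sketched below. It assumes `dig` is installed on the bastion; the helper name `ptr_name` is hypothetical, and the IP address you pass in is whatever your DHCP server handed to the node.

```shell
# Hypothetical helper: print the PTR name for an IP address, without the trailing dot.
ptr_name() {
  dig +short -x "$1" | sed 's/\.$//'
}
```

Running `ptr_name <node-ip>` should print a fully qualified name for each master; empty output means no PTR record exists for that address.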



--
Cheers,
Vadim
