Run-book


Lcs Basinger

Aug 4, 2024, 5:36:05 PM
to quipremanas
We're going to install DSS v10 (or 11) in a rather complex environment (LDAP, SSO, UIF, Cloudera 6.3.3, Kerberos, Sentry, Spark, ...). I wonder if there's some sort of run-book to help sequence the activities that need to be performed. Yes, the documentation has all the details, but we need a better understanding of the order in which the activities should be performed. Any help is appreciated.

I'm not sure there is a recommended run-book (or playbook) for the DSS installation, especially for an on-premises installation, where each organization has its own architecture and environment.


In my experience, I doubt that the order of the steps after an upgrade and before starting DSS matters. However, if you are installing everything from scratch on a new server, I would first check that all the packages related to LDAP, SSO, Spark, Hadoop, etc. are installed and up to date before running the installation script.


That said, I'd recommend getting in touch with DSS support first, in case they can give you recommendations for your particular setup. Also, a great tool for automating the process in the future is Ansible ( -ansible-modules).
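
As a rough illustration, the pre-installation package check suggested above could be expressed as a small Ansible play. This is only a sketch: the host group `dss_servers` and the package names in `required_packages` are illustrative placeholders, not a list taken from the Dataiku documentation, so adapt them to your distribution and stack.

```yaml
# Sketch only: fail early if expected OS packages are missing.
# "dss_servers" and the package names are hypothetical placeholders.
- name: Pre-flight check before running the DSS installer
  hosts: dss_servers
  vars:
    required_packages:
      - openldap-clients    # placeholder for LDAP tooling
      - krb5-workstation    # placeholder for Kerberos tooling
  tasks:
    - name: Collect the list of installed packages
      ansible.builtin.package_facts:
        manager: auto

    - name: Stop if a required package is not installed
      ansible.builtin.assert:
        that: item in ansible_facts.packages
        fail_msg: "Package {{ item }} is missing; install it before running the DSS installer"
      loop: "{{ required_packages }}"
```

Running a check like this first means a missing dependency shows up before the installation script is started rather than halfway through it.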


One extra comment, and it is only an opinion: installing a fully operational infrastructure is much simpler if your organization moves to the cloud. We ran a proof of concept last year, and Dataiku Fleet Manager (FM) is a remarkable tool for performing all of these operations automatically, saving the IT groups and administrators a lot of work.


Thanks for the comments. Yes, the sequence you indicated is probably a sensible one. Our customers are mainly banks, and they don't want to move to the cloud, at least in the short term, mainly because of legal restrictions (consider that AWS added an Italian region only a few months ago, and GCP's first such region will be inaugurated this June).


The role's pre-install tasks prepare the server environment (creating the dss service user, installing system packages, tuning sysctl, ...) before downloading and installing DSS and the DSS standalone Spark package.
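
To give an idea of what such pre-install tasks can look like, here is a minimal sketch. It is not the role's actual code: the user name `dataiku`, the package list, and the sysctl key and value are assumptions chosen for illustration, not Dataiku recommendations.

```yaml
# Sketch only: the kind of pre-install tasks described above.
# User name, packages, and sysctl values are hypothetical placeholders.
- name: Create the DSS service user
  become: true
  ansible.builtin.user:
    name: dataiku            # hypothetical service account name
    shell: /bin/bash
    create_home: true

- name: Install required system packages
  become: true
  ansible.builtin.package:
    name:
      - git                  # placeholder package list
      - unzip
    state: present

- name: Tune a kernel parameter
  become: true
  ansible.posix.sysctl:
    name: vm.max_map_count   # placeholder key, adjust to your needs
    value: "262144"          # placeholder value
    state: present
    reload: true
```

Keeping these steps in tasks of their own means they can be rerun safely, since each module only changes the system when the current state differs from the requested one.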


The role has been designed to be compatible with restricted network environments in which security devices might deny direct access to the Dataiku CDN. This feature might be required for your bank customers.


The -ansible-modules module is used by the role to configure LDAP settings in an enterprise environment (successfully tested with Active Directory). The role could probably be extended, following the same logic, to configure SSO settings.


I am working on something similar, but I have taken a different approach. For me, there isn't much value in configuring the DSS server, since that can be restored from a working backup. The biggest issue is getting a VM set up with the required OS-level packages and nginx configured as a reverse proxy. So I am working on a single command-line script that will create a GCP VM and install a DSS instance on it. It's already working; I now need to automate the HTTPS certificate creation and deployment, and then I will share it with the community.


- name: Disable SELinux
  become: true
  ansible.posix.selinux:
    state: disabled
  register: result
  # Treat the known, harmless messages as success; fail on anything else.
  failed_when: result.msg | default('ok', True) is not search('(^ok$|libselinux-python|(SELinux state changed))')
  tags: [setup]


The role can be used in a playbook in which a second role configures nginx as an SSL reverse proxy and obtains the SSL certificate from a provider of your choice. Some people might choose another web gateway to secure their DSS deployment (Citrix ADC, F5, ...).
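
A playbook combining the two roles might look roughly like the sketch below. The role names `dss_install` and `nginx_reverse_proxy` and the host group `dss_servers` are hypothetical placeholders, since the actual names are not given in this thread.

```yaml
# Sketch only: one play applying two roles in sequence.
# Role names and host group are hypothetical placeholders.
- name: Install DSS and put it behind an SSL reverse proxy
  hosts: dss_servers
  become: true
  roles:
    - role: dss_install            # prepares the server and installs DSS
    - role: nginx_reverse_proxy    # configures nginx and the SSL certificate
```

Roles listed under `roles:` run in the order given, so DSS is installed before the proxy is configured. If you use another gateway (Citrix ADC, F5, ...), the second role can simply be left out.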
