Upgrading Ansible Tower to Ansible Automation Platform

It's been quite a while since I last touched Ansible Tower, and I'm glad to report that the latest Ansible Automation Platform introduces several enhancements that make it a really attractive product.

A strategy to perform upgrades

The Ansible team at Red Hat has published a number of documents on how to perform the upgrade, as it changes some of the concepts traditionally used in Tower: namely, virtual environments are replaced by a container-based technology named Execution Environments.

The guide is available here:

Performing the upgrade

In this post, I'll describe what I did to upgrade an existing clustered Ansible Tower installation from 3.8.x to Ansible Automation Platform 2.2.x, enabling the new features provided by the product (Automation Hub) and the SaaS service provided by Red Hat at console.redhat.com.

Review source environment

In this step, you'll note how the source environment is configured at the infrastructure level (a quick collection sketch follows the list below), with things like:

  • Check how servers are currently configured, including:
    • Filesystems and sizes
    • Networks
    • Operating system tuning
    • Operating system hardening
  • Check your Ansible Tower installation:
    • Exact version
    • Database Schema status
    • Inventory file used for installation
  • Firewall rules to required resources, such as:
    • Internet proxies
    • SCMs (Git, etc)
    • Authentication (AD/LDAP)
    • CMDB / dynamic inventory sources
    • Red Hat Satellite
    • Other shared resources
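
A rough collection sketch for a typical clustered Tower node follows; the paths are illustrative (they assume a bundled installation with a local database), so adjust them to your environment:

# filesystem sizes (paths assume a bundled installation with a local database)
df -h /var/lib/awx /var/lib/pgsql

# snapshot of kernel tuning, useful to compare tuning/hardening later
sysctl -a > /tmp/sysctl-before-upgrade.txt

# exact Tower version and database schema status
cat /var/lib/awx/.tower_version
awx-manage showmigrations | grep -c '\[ \]'   # pending migrations, should be 0 on a healthy system

# keep a copy of the inventory used for the original installation (path is illustrative)
cp /root/ansible-tower-setup-bundle-3.8.6-2/inventory /tmp/inventory-before-upgrade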

Perform a dry-run migration

It is possible to perform a mock upgrade in a separate system, starting from an Ansible Tower backup of the "old" system, even if the old system is a clustered one.

This can be accomplished by performing a backup on the source Tower system, and a fresh Tower install + restore process in the test system.

root@tower-old ~/ansible-tower-setup-bundle-3.8.6-2 # ./setup.sh -b
(transfer backup to test system) 

Then you can create an inventory on the test system and run the installer as if it were a new system with a blank configuration, and then restore the database dump on it.

root@tower-test ~/ansible-automation-platform-setup-bundle-1.2.7-2 # ./setup.sh
root@tower-test ~/ansible-automation-platform-setup-bundle-1.2.7-2 # ./setup.sh -r -e 'restore_backup_file=/tmp/tower-backup.tar.gz'

In this step, you'll want to ensure your database schema is migrated successfully before moving on to the next upgrade hop (e.g., Tower 3.8.x to AAP 1.2.latest, then to AAP 2.1.latest, and finally to AAP 2.2.latest).

In my case, migrating from Tower 3.8.3 to AAP 1.2 (or Tower 3.8.latest) failed silently. The Ansible Tower update process (setup.sh) finished successfully, but the web page itself was showing a maintenance page.

This was solved by checking the database schema:

root@tower-test ~/ansible-automation-platform-setup-bundle-1.2.7-2 # awx-manage showmigrations | grep -v '\[X\]'
auth
 [ ] 0012_alter_user_first_name_max_length
conf
contenttypes
main
oauth2_provider
 [ ] 0002_auto_20190406_1805
 [ ] 0003_auto_20201211_1314
sessions
sites
social_django
 [ ] 0009_auto_20191118_0520
 [ ] 0010_uid_db_index
sso
taggit
 [ ] 0004_alter_taggeditem_content_type_alter_taggeditem_tag

Re-running setup.sh fixed the issue, and further updates could be done successfully.
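
Before each subsequent hop, it is worth re-checking that no migrations are left pending, using the same awx-manage command as above:

# should report no pending migrations before moving on to the next version
awx-manage showmigrations | grep '\[ \]' || echo "no pending migrations"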

After this snag was fixed, the upgrade to 2.1 and 2.2 went smoothly.

Post-upgrade tasks

Once your environment is upgraded to Ansible Automation Platform 2.2.x, you can also review the following settings:

Default Execution Environment

Virtual environments are deprecated in AAP 2.x, so you should move to Execution Environments (EEs) and most likely create your own EEs based on the supported ones shipped with AAP.

root@tower ~ # awx-manage list_custom_venvs 
· Discovered Virtual Environments:
/var/lib/awx/venv/myvenv
  • To export the contents of a (deprecated) virtual environment, run the following command while supplying the path as an argument: awx-manage export_custom_venv /path/to/venv

  • To view the connections a (deprecated) virtual environment had in the database, run the following command while supplying the path as an argument: awx-manage custom_venv_associations /path/to/venv

root@tower ~ # awx-manage custom_venv_associations  /var/lib/awx/venv/myvenv -q
inventory_sources: []
job_templates: []
organizations: []
projects: []
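
To rebuild one of those venvs as an EE, a minimal ansible-builder sketch could look like the one below. The base image path, tag and file names are illustrative; use the supported EE images shipped with your AAP release:

# dump the python requirements of the old venv (-q keeps the output clean; drop it if unsupported)
awx-manage export_custom_venv -q /var/lib/awx/venv/myvenv > requirements.txt

# minimal EE definition; the base image below is illustrative, point it to the supported EE of your AAP release
cat > execution-environment.yml <<'EOF'
version: 1
build_arg_defaults:
  EE_BASE_IMAGE: 'registry.redhat.io/ansible-automation-platform-22/ee-supported-rhel8:latest'
dependencies:
  python: requirements.txt
EOF

ansible-builder build --tag myorg/my-custom-ee:1.0 --file execution-environment.yml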

Integration with Automation Analytics

Red Hat provides Automation Analytics as part of the Ansible Automation Platform, and it can be enabled from the controller settings.

In case a proxy is required, you can configure it in the AAP Jobs settings menu, restart the services, and then immediately trigger a sync:

# automation-controller-service restart

# awx-manage gather_analytics --ship                                        
/tmp/48627e92-4cfd-4f8d-86f2-c180adcaef42-2022-06-11-000448+0000-0.tar.gz   
/tmp/48627e92-4cfd-4f8d-86f2-c180adcaef42-2022-06-11-000448+0000-1.tar.gz  
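
For reference, the proxy can also be pushed with the awx CLI instead of the UI. Treat the snippet below as an assumption-laden sketch: I'm assuming AWX_TASK_ENV (per the proxy-related KCS linked at the end of this post) is the right setting for your release, so double-check before applying:

# illustrative only: set HTTPS_PROXY for jobs/analytics gathering via the awx CLI
# (assumes AWX_TASK_ENV is the setting holding the proxy variables in your release)
awx settings modify AWX_TASK_ENV '{"HTTPS_PROXY": "http://proxy.example.org:3128"}'
awx settings list | grep -A 3 AWX_TASK_ENV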

Cleaning up instances

You might end up in a state of having leftover instances in your environment.

They can be purged in this way:

# awx-manage list_instances                                                
[controlplane capacity=178 policy=100%]
        localhost capacity=0 node_type=hybrid version=4.2.0
        aap.example.org capacity=178 node_type=hybrid version=4.2.0 heartbeat="2022-06-09 08:12:18"

[default capacity=178 policy=100%]
        localhost capacity=0 node_type=hybrid version=4.2.0
        aap.example.org capacity=178 node_type=hybrid version=4.2.0 heartbeat="2022-06-09 08:12:18"


# awx-manage remove_from_queue --hostname=localhost --queuename=controlplane

# awx-manage remove_from_queue --hostname=localhost --queuename=default

# awx-manage deprovision_instance --hostname localhost
Instance Removed
Successfully deprovisioned localhost
(changed: True)

Enabling the Private Automation Hub

Once your AAP control plane is up and running, you can add your Private Automation Hub by adding the new system to the installer inventory and re-running setup.sh.
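
As a rough sketch, the inventory additions look something like the following; hostnames and passwords are placeholders, and variable names can change between AAP releases, so compare against the sample inventory shipped in the setup bundle:

[automationhub]
hub.example.org

[all:vars]
automationhub_admin_password='<hub admin password>'
automationhub_pg_host='<database host>'
automationhub_pg_password='<database password>'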

Interesting links

Red Hat has put together a number of resources on this new Ansible Automation Platform, available here:

... and support notes

  • https://access.redhat.com/articles/6239891 - Ansible Automation Platform 2 Migration Strategy Considerations
  • https://access.redhat.com/articles/6185641 - AAP 2 Migration Considerations Checklist
  • https://access.redhat.com/articles/4098921 - What are the Recommended Upgrade Paths for Ansible Tower/Ansible Automation Platform?
  • https://access.redhat.com/solutions/6740441 - How Do I Perform Security Patching / OS Package Upgrades On Ansible Automation Platform Nodes Without Breaking Any Ansible Automation Platform Functionality?
  • https://access.redhat.com/solutions/6834291 - May I only update one of the components I want on Ansible Tower or Ansible Automation Controller?
  • https://access.redhat.com/solutions/4308791 - How Can I Bypass "noexec" Permission Issue On "/tmp" and "/var/tmp" During Ansible Tower and Ansible Automation Platform installation?
  • https://access.redhat.com/articles/6177982 - What’s new with Ansible Automation Platform 2.0: Developing with ansible-builder and Automation execution environments.
  • https://access.redhat.com/solutions/5115431 - How to configure Ansible Tower to use a proxy for Automation Analytics
  • https://access.redhat.com/solutions/5519041 - Why Is The Manual Data Uploading To Red Hat Automation Analytics Failing With Status 401 In Ansible Tower?
  • https://access.redhat.com/solutions/6446711 - How do I Replace All Execution Environments in Ansible Automation Platform using Private Images from Private Automation Hub?
  • https://access.redhat.com/solutions/6539431 - How Do I Install Ansible Automation Platform 2.0 in a Disconnected Environment from the Internet?
  • https://access.redhat.com/solutions/6635021 - How Do I Install Ansible Automation Platform 2.1 in a Disconnected Environment from the Internet in a Single Node?
  • https://access.redhat.com/solutions/6219021 - In Ansible Automation Controller, How Do I Set a Proxy Just for Ansible Galaxy And Not Globally?
  • https://access.redhat.com/solutions/3127941 - How do I Specify HTTP/HTTPS_PROXY using Ansible Tower?
  • https://access.redhat.com/solutions/4798321 - How to Activate Ansible Tower License with Red Hat Customer Credentials under a Proxy Environment? (edit /etc/supervisord.conf file)

Other interesting resources

Porting guides

  • https://docs.ansible.com/ansible/devel/porting_guides/porting_guides.html
  • https://docs.ansible.com/ansible/devel/porting_guides/porting_guide_2.10.html
  • https://docs.ansible.com/ansible/devel/porting_guides/porting_guide_3.html

Ansible lint

https://ansible-lint.readthedocs.io/en/latest/

AWX cli

https://github.com/ansible/awx/blob/devel/INSTALL.md#installing-the-awx-cli

Lifecycle

  • https://access.redhat.com/support/policy/update_policies/
  • https://access.redhat.com/support/policy/updates/ansible-automation-platform

... happy hacking!

RHV 4.4 SP1 released

Red Hat has released RHV 4.4 SP1, the latest version based on upstream oVirt 4.5.x series. Major changes include support for RHEL 8.6 hypervisors, and a new workflow to renew hypervisor certificates. Internal certificates changed validity from 5 years to 13 months during the 4.4 series, and this version rolls back these changes to allow a more convenient way of managing the platform.

Prior to performing an upgrade, the following documents are relevant:

Upgrading RHV-M to the latest version

First I enabled the right repositories for RHV 4.4, which now include some Ceph repositories:

subscription-manager repos \
    --disable='*' \
    --enable=rhel-8-for-x86_64-baseos-rpms \
    --enable=rhel-8-for-x86_64-appstream-rpms \
    --enable=rhv-4.4-manager-for-rhel-8-x86_64-rpms \
    --enable=fast-datapath-for-rhel-8-x86_64-rpms \
    --enable=jb-eap-7.4-for-rhel-8-x86_64-rpms \
    --enable=openstack-16.2-cinderlib-for-rhel-8-x86_64-rpms \
    --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

In my lab environment, I found the following snags while upgrading:

Unsupported package manager

Prior to launching engine-setup to upgrade the Manager, the yum and rpm packages need to be upgraded manually to work around an issue with the RHV-M installer (yum upgrade 'yum*' 'rpm*').

I was originally running RHV-M 4.4.5 based on RHEL 8.3, so quite an old release. Without those updated packages, the upgrade only progressed until it hit the following issue:

2022-05-27 09:20:35,463+0200 DEBUG otopi.context context._executeMethod:127 Stage setup METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.distro-rpm.packages.Plugin._setup
2022-05-27 09:20:35,465+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/util.py", line 305, in getPackageManager
    from otopi import minidnf
  File "/usr/lib/python3.6/site-packages/otopi/minidnf.py", line 25, in <module>
    import dnf.transaction_sr
ModuleNotFoundError: No module named 'dnf.transaction_sr'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/util.py", line 312, in getPackageManager
    from otopi import miniyum
  File "/usr/lib/python3.6/site-packages/otopi/miniyum.py", line 17, in <module>
    import rpmUtils.miscutils
ModuleNotFoundError: No module named 'rpmUtils'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine-common/distro-rpm/packages.py", line 293, in _setup
    osetuputil.getPackageManager(self.logger)
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/util.py", line 322, in getPackageManager
    'No supported package manager found in your system'
RuntimeError: No supported package manager found in your system
2022-05-27 09:20:35,467+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Environment setup': No supported package manager found in your system

The installation was automatically rolled back, so no harm done; simply updating the yum and rpm packages solved the issue.

Unable to upgrade database schema

Another issue I found was that the upgrade wasn't working because engine-setup was unable to refresh the database schema.

# view /var/log/ovirt-engine/setup/ovirt-engine-setup-20220527092805-eci7jy.log 
 255732 CONTEXT:  SQL statement "ALTER TABLE vdc_options ALTER COLUMN default_value SET NOT NULL"
 255733 PL/pgSQL function fn_db_change_column_null(character varying,character varying,boolean) line 10 at EXECUTE
 255734 FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/upgrade/pre_upgrade/0000_config.sql
 255735 
 255736 2022-05-27 09:36:22,230+0200 ERROR otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema schema._misc:530 schema.sh: FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/upgrade/pre_upgrade/0000_config.sql
 255737 2022-05-27 09:36:22,231+0200 DEBUG otopi.context context._executeMethod:145 method exception
 255738 Traceback (most recent call last):
 255739   File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
 255740     method['method']()
 255741   File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", line 532, in _misc
 255742     raise RuntimeError(_('Engine schema refresh failed'))
 255743 RuntimeError: Engine schema refresh failed
 255744 2022-05-27 09:36:22,232+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Misc configuration': Engine schema refresh failed

This is covered in Bugzilla 2077387#c4, and is easily fixed by updating the offending rows in the database:

root@rhevm ~ # /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select *  from vdc_options where default_value is null ;"
 option_id |          option_name          |                        option_value                         | version | default_value 
-----------+-------------------------------+-------------------------------------------------------------+---------+---------------
       472 | ConfigDir                     | /etc/ovirt-engine                                           | general | 
       473 | AdminDomain                   | internal                                                    | general | 
       474 | AllowDuplicateMacAddresses    | false                                                       | general | 
       475 | DefaultWorkgroup              | WORKGROUP                                                   | general | 
       476 | KeystoneAuthUrl               |                                                             | general | 
       477 | LicenseCertificateFingerPrint | 5f 38 41 89 b1 33 49 0c 24 13 6b b3 e5 ba 9e c7 fd 83 80 3b | general | 
       478 | MacPoolRanges                 | 00:1A:4A:16:01:51-00:1A:4A:16:01:e6                         | general | 
       479 | MaxMacsCountInPool            | 100000                                                      | general | 
       482 | VdsFenceOptions               |                                                             | general | 
       483 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.0     | 
       484 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.1     | 
       485 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.2     | 
       486 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.3     | 
       487 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.4     | 
       488 | GlusterTunedProfile           | rhs-high-throughput,rhs-virtualization                      | 3.5     | 
       462 | SupportBridgesReportByVDSM    | true                                                        | 3.1     | 
       716 | GlusterTunedProfile           | virtual-host,rhgs-sequential-io,rhgs-random-io              | 4.2     | 
(17 rows)

root@rhevm ~ # /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE vdc_options SET default_value=option_value WHERE default_value IS NULL AND option_value IS NOT NULL;"
UPDATE 17

root@rhevm ~ #  /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE vdc_options SET default_value='' WHERE default_value IS NULL AND option_value IS NULL;"
UPDATE 0

Finally, the engine-setup process finishes OK, and after running yum upgrade -y && systemctl restart ovirt-engine the web UI is available again.

Upgrading RHEL hypervisors

My hypervisors were also running RHEL 8.3, and some minor RPM problems were found. RHV-H installations (RHV Host) are not expected to hit such issues.

After enabling the repositories:

subscription-manager repos \
    --disable='*' \
    --enable=rhel-8-for-x86_64-baseos-rpms \
    --enable=rhel-8-for-x86_64-appstream-rpms \
    --enable=rhv-4-mgmt-agent-for-rhel-8-x86_64-rpms \
    --enable=fast-datapath-for-rhel-8-x86_64-rpms \
    --enable=advanced-virt-for-rhel-8-x86_64-rpms \
    --enable=openstack-16.2-cinderlib-for-rhel-8-x86_64-rpms \
    --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

When using the integrated Cluster Upgrade assistant in the WebUI, package resolution problems were found; they could be trivially fixed by removing the offending package (rpm -e network-scripts-openvswitch2.11).

Certificate validation

KCS 6865861 provides a detailed explanation of how the certificate renewal process currently works, and includes a nifty script (cert_date.sh) to check the overall certificate validity of both RHV-M and the hypervisors.

A sample run shows:

root@rhevm ~ # ./cert_date_0.sh 
This script will check certificate expiration dates

Checking RHV-M Certificates...
=================================================
  /etc/pki/ovirt-engine/ca.pem:                          Feb 27 07:27:16 2028 GMT
  /etc/pki/ovirt-engine/certs/apache.cer:                Jun 11 11:38:13 2023 GMT
  /etc/pki/ovirt-engine/certs/engine.cer:                Jun 11 11:38:12 2023 GMT
  /etc/pki/ovirt-engine/qemu-ca.pem                      Aug  5 19:07:11 2030 GMT
  /etc/pki/ovirt-engine/certs/websocket-proxy.cer        Jun 11 11:38:13 2023 GMT
  /etc/pki/ovirt-engine/certs/jboss.cer                  Jun 11 11:38:12 2023 GMT
  /etc/pki/ovirt-engine/certs/ovirt-provider-ovn         May 18 16:01:35 2023 GMT
  /etc/pki/ovirt-engine/certs/ovn-ndb.cer                May 18 16:01:35 2023 GMT
  /etc/pki/ovirt-engine/certs/ovn-sdb.cer                May 18 16:01:35 2023 GMT
  /etc/pki/ovirt-engine/certs/vmconsole-proxy-helper.cer Feb  3 07:28:10 2023 GMT
  /etc/pki/ovirt-engine/certs/vmconsole-proxy-host.cer   Feb  3 07:28:10 2023 GMT
  /etc/pki/ovirt-engine/certs/vmconsole-proxy-user.cer   Feb  3 07:28:10 2023 GMT


Checking Host Certificates...

Host: rhevh1
=================================================
  /etc/pki/vdsm/certs/vdsmcert.pem:              May 30 02:55:03 2027 GMT
  /etc/pki/vdsm/libvirt-spice/server-cert.pem:   May 30 02:55:03 2027 GMT
  /etc/pki/vdsm/libvirt-vnc/server-cert.pem:     May 30 02:55:03 2027 GMT
  /etc/pki/libvirt/clientcert.pem:               May 30 02:55:03 2027 GMT
  /etc/pki/vdsm/libvirt-migrate/server-cert.pem: May 30 02:55:04 2027 GMT


Host: rhevh2
=================================================
  /etc/pki/vdsm/certs/vdsmcert.pem:              May 30 03:19:59 2027 GMT
  /etc/pki/vdsm/libvirt-spice/server-cert.pem:   May 30 03:19:59 2027 GMT
  /etc/pki/vdsm/libvirt-vnc/server-cert.pem:     May 30 03:19:59 2027 GMT
  /etc/pki/libvirt/clientcert.pem:               May 30 03:19:59 2027 GMT
  /etc/pki/vdsm/libvirt-migrate/server-cert.pem: May 30 03:19:59 2027 GMT
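
If you just want to spot-check a single certificate without the script, a plain openssl call works too:

openssl x509 -noout -enddate -in /etc/pki/ovirt-engine/certs/apache.cer   # on the Manager
openssl x509 -noout -enddate -in /etc/pki/vdsm/certs/vdsmcert.pem         # on a hypervisor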

Wrap-up

All in all, just some minor snags during the upgrade that should be fixed in newer releases to provide a smoother experience.

Happy hacking!

Satellite 6.10 released

Red Hat Satellite version 6.10 has been released! This is a preparatory release for the upcoming Satellite 7.0, in which the major data migrations for the new software stack take place.

The official information is available at:

Preparing an update

The following steps need to be taken before upgrading to Satellite 6.10:

  • Ensure you are on the latest Satellite 6.9.z release (6.9.7). This is important, as the pulp2-to-pulp3 migration relies on having the latest packages from that release.

  • Ensure you have plenty of space in /var/lib/pulp/published. This is where the metadata of each content view is kept (namely, repository metadata). This information needs to be regenerated by pulp3, so at some point both versions of it exist at the same time. If you keep lots of content view versions, it is recommended to purge them prior to starting the process in order to save space (and to generally speed up Satellite operations).

  • You can review the pulp migration summary once you are on Satellite 6.9.7 with foreman-maintain content migration-stats:

# foreman-maintain content migration-stats
Running Retrieve Pulp 2 to Pulp 3 migration statistics
================================================================================
Retrieve Pulp 2 to Pulp 3 migration statistics: 
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
============Migration Summary================
Migrated/Total RPMs: 111437/111456
Migrated/Total errata: 41998/41998                                                                       
Migrated/Total repositories: 115/115               
Estimated migration time based on yum content: fewer than 5 minutes

Note: ensure there is sufficient storage space for /var/lib/pulp/published to double in size before starting the migration process.
Check the size of /var/lib/pulp/published with 'du -sh /var/lib/pulp/published/'

Note: ensure there is sufficient storage space for postgresql.
You will need additional space for your postgresql database.  The partition holding '/var/opt/rh/rh-postgresql12/lib/pgsql/data/'
   will need additional free space equivalent to the size of your Mongo db database (/var/lib/mongodb/).

In case of problems with missing or broken RPMs, they will be detected as well:

============Missing/Corrupted Content Summary================
WARNING: MISSING OR CORRUPTED CONTENT DETECTED
Corrupted or Missing Rpm: 19/111456
Corrupted or missing content has been detected, you can examine the list of content in /tmp/unmigratable_content-20211117-32242-1m0sghx and take action by either:
1. Performing a 'Verify Checksum' sync under Advanced Sync Options, let it complete, and re-running the migration
2. Deleting/disabling the affected repositories and running orphan cleanup (foreman-rake katello:delete_orphaned_content) and re-running the migration
3. Manually correcting files on the filesystem in /var/lib/pulp/content/ and re-running the migration
4. Mark currently corrupted or missing content as skipped (foreman-rake katello:approve_corrupted_migration_content).  This will skip migration of missing or corrupted content.

                                                                      [OK]
--------------------------------------------------------------------------------

In my test lab, I just ignored those errors as they were minor issues with some kernel packages.

It is also good to review the sizes of the current MongoDB and PostgreSQL databases. As MongoDB is finally removed, its data will be migrated to PostgreSQL, so that filesystem needs to have enough space.

# du -scm /var/lib/mongodb/
7196    /var/lib/mongodb/
7196    total

# du -scm /var/opt/rh/rh-postgresql12/lib/pgsql/data/
10205   /var/opt/rh/rh-postgresql12/lib/pgsql/data/
10205   total

Note that you might also need to remove the following legacy RPMs prior to upgrading to Satellite 6.10. My Satellite was installed in the 6.3 timeframe, and for some reason these packages had been lingering around since then. If they are present, the installer will complain about yum being unable to properly resolve dependencies.

yum erase tfm-rubygem-ethon tfm-rubygem-qpid_messaging tfm-rubygem-typhoeus tfm-rubygem-zest tfm-rubygem-fog-xenserver tfm-rubygem-pulp_docker_client tfm-rubygem-awesome_print tfm-rubygem-trollop

Upgrading the Satellite version

The upgrade process itself doesn't change much from earlier releases; it will just take more time to accommodate the data migration.

# time foreman-maintain upgrade run  --target-version=6.10 -y
Checking for new version of satellite-maintain...                                                  
Security: kernel-3.10.0-1160.45.1.el7.x86_64 is an installed security update                       
Security: kernel-3.10.0-1160.42.2.el7.x86_64 is the currently running version                      
Loaded plugins: foreman-protector, product-id, subscription-manager                                
Unable to upload Enabled Repositories Report                                                       
Nothing to update, can't find new version of satellite-maintain.                                   
Running preparation steps required to run the next scenarios                                       
================================================================================                   
Check whether system has any non Red Hat repositories (e.g.: EPEL) enabled:                        
| Checking repositories enabled on the systemUnable to upload Enabled Repositories Report          
| Checking repositories enabled on the system                         [OK]                         
--------------------------------------------------------------------------------                   


Running Checks before upgrading to Satellite 6.10                                                  
================================================================================                   
Warn about Puppet content removal prior to 6.10 upgrade:              [OK]                         
--------------------------------------------------------------------------------                   
Check for newer packages and optionally ask for confirmation if not found.:                        
Confirm that you are running the latest minor release of Satellite 6.9 (assuming yes)              
                                                                      [OK]                         
--------------------------------------------------------------------------------                   
Check for HTTPS proxies from the database:                            [OK]                         
--------------------------------------------------------------------------------                   
Clean old Kernel and initramfs files from tftp-boot:                  [OK]                         
--------------------------------------------------------------------------------                      
Check number of fact names in database:                               [OK]               
--------------------------------------------------------------------------------                         
Check for verifying syntax for ISP DHCP configurations:               [OK]                               
--------------------------------------------------------------------------------                         
Check whether all services are running:                               [OK]                               
--------------------------------------------------------------------------------                         
Check whether all services are running using the ping call:           [OK]                               
--------------------------------------------------------------------------------                         
Check for paused tasks:                                               [OK]                               
--------------------------------------------------------------------------------                         
Check to verify no empty CA cert requests exist:                      [OK]                               
--------------------------------------------------------------------------------                         
Check whether system is self-registered or not:                       [OK]                               
--------------------------------------------------------------------------------                         
Check to make sure root(/) partition has enough space:                [OK]                               
--------------------------------------------------------------------------------                         
Check to make sure /var/lib/candlepin has enough space:               [OK]                               
--------------------------------------------------------------------------------                         
Check to validate candlepin database:                                 [OK]                               
--------------------------------------------------------------------------------                         
Check for running tasks:                                              [OK]                               
--------------------------------------------------------------------------------                         
Check for old tasks in paused/stopped state:                          [OK]                               
--------------------------------------------------------------------------------                         
Check for pending tasks which are safe to delete:                     [OK]                               
--------------------------------------------------------------------------------                         
Check for tasks in planning state:                                    [OK]                 
--------------------------------------------------------------------------------                         
Check to verify if any hotfix installed on system:                                                       
- Checking for presence of hotfix(es). It may take some time to verify.                                  
                                                                      [OK]                               
--------------------------------------------------------------------------------                         
Check whether system has any non Red Hat repositories (e.g.: EPEL) enabled:                              
/ Checking repositories enabled on the systemUnable to upload Enabled Repositories Report                 
/ Checking repositories enabled on the system                         [OK]                               
--------------------------------------------------------------------------------                         
Check if TMOUT environment variable is set:                           [OK]                               
--------------------------------------------------------------------------------                         
Check if any upstream repositories are enabled on system:                                                
\ Checking for presence of upstream repositories                      [OK]                               
--------------------------------------------------------------------------------                         
Check for roles that have filters with multiple resources attached:   [OK]                               
--------------------------------------------------------------------------------                         
Check for duplicate permissions from database:                        [OK]                               
--------------------------------------------------------------------------------                         
Check if system has any non Red Hat RPMs installed (e.g.: Fedora):    [OK]                               
--------------------------------------------------------------------------------                         
Check whether reports have correct associations:                      [OK]                               
--------------------------------------------------------------------------------                         
Check to validate yum configuration before upgrade:                   [OK]                               
--------------------------------------------------------------------------------                         
Check if checkpoint_segments configuration exists on the system:      [OK]                               
--------------------------------------------------------------------------------                         
--------------------------------------------------------------------------------        
Validate availability of repositories:              
/ Validating availability of repositories for 6.10                    [OK]                               
--------------------------------------------------------------------------------                         


The pre-upgrade checks indicate that the system is ready for upgrade.                                    
It's recommended to perform a backup at this stage.                                                      
Confirm to continue with the modification part of the upgrade (assuming yes)                             
Running Procedures before migrating to Satellite 6.10                                                    
================================================================================                         
disable active sync plans:                          
\ Total 0 sync plans are now disabled.                                [OK]                               
--------------------------------------------------------------------------------                         
Add maintenance_mode chain to iptables:                               [OK]                               
--------------------------------------------------------------------------------                         
Stop applicable services:                           

Stopping the following service(s):                  
rh-mongodb34-mongod, rh-redis5-redis, postgresql, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, foreman.socket, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
\ All services stopped                                                [OK]                               
--------------------------------------------------------------------------------                         


Running preparation steps required to run the next scenarios                                             
================================================================================        
Check if tooling for package locking is installed:                    [OK]                               
--------------------------------------------------------------------------------                         


Running Migration scripts to Satellite 6.10                                                              
================================================================================                         
Enable applicable services:                         

Enabling the following service(s):                  
pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4                                                                              
| enabling pulpcore-resource-manager                                                                     
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-api.service to /etc/systemd/system/pulpcore-api.service.                                                                                 

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-content.service to /etc/systemd/system/pulpcore-content.service.                                                                         

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-resource-manager.service to /etc/systemd/system/pulpcore-resource-manager.service.
\ enabling pulpcore-worker@4                                                                             
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker@1.service to /etc/systemd/system/pulpcore-worker@.service.                                                                        
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker@2.service to /etc/systemd/system/pulpcore-worker@.service.                                                                        
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker@3.service to /etc/systemd/system/pulpcore-worker@.service.                                                                        
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker@4.service to /etc/systemd/system/pulpcore-worker@.service.                                                                        
| All services enabled                                                [OK]                               
--------------------------------------------------------------------------------                         

Start applicable services:

Starting the following service(s):
rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore
-worker@1.service, pulpcore-worker@2.service, pulpcore-worker@3.service, pulpcore-worker@4.service, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@w
orker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
\ All services started                                                [OK]
--------------------------------------------------------------------------------
Switch support for certain content from Pulp 2 to Pulp 3:
Performing final content migration before switching content           [OK]
Print pulp 2 removal instructions:
======================================================
Migration of content from Pulp 2 to Pulp3 is complete 

After verifying accessibility of content from clients, 
it is strongly recommend to run "foreman-maintain content remove-pulp2"
This will remove Pulp 2, MongoDB, and all pulp2 content in /var/lib/pulp/content/
======================================================                [OK]                                                                                                   
--------------------------------------------------------------------------------                                                                                             


--------------------------------------------------------------------------------                                                                                             
Upgrade finished.                                                                     

The whole upgrade process took about 2.5 hours for a Satellite system with the main RHEL7 and RHEL8 repositories and about 10 content view versions. Note that the migration time is severely affected by the amount of RAM and CPU and by storage performance.

Cleaning up

Once Satellite 6.10 has been fully migrated and verified, the old pulp2 content should be removed with the following commands:

# time foreman-maintain content remove-pulp2 ; time foreman-maintain upgrade run  --target-version=6.10.z -y  
Running Remove Pulp2 and mongodb packages and data
================================================================================
Remove pulp2: 

WARNING: All pulp2 packages will be removed with the following commands:

# rpm -e pulp-docker-plugins  pulp-ostree-plugins  pulp-puppet-plugins  pulp-puppet-tools  pulp-rpm-plugins  pulp-selinux  pulp-server  python-bson  python-mongoengine  python-nectar  python-pulp-common  python-pulp-docker-common  python-pulp-integrity  python-pulp-oid_validation  python-pulp-ostree-common  python-pulp-puppet-common  python-pulp-repoauth  python-pulp-rpm-common  python-pulp-streamer  python-pymongo  python-pymongo-gridfs  python2-amqp  python2-billiard  python2-celery  python2-django  python2-kombu  python2-solv  python2-vine  pulp-katello  pulp-maintenance  python3-pulp-2to3-migration
# yum remove rh-mongodb34-*
# yum remove squid mod_wsgi

All pulp2 data will be removed.

# rm -rf /var/lib/pulp/published
# rm -rf /var/lib/pulp/content
# rm -rf /var/lib/pulp/importers
# rm -rf /var/lib/pulp/uploads
# rm -rf /var/lib/mongodb/
# rm -rf /var/cache/pulp

Do you want to proceed?, [y(yes), q(quit)] y
- Removing pulp2 packages                        
- Removing mongo packages                                                       
| Removing additional packages                                                  
- Dropping migration tables                                                     
| Dropping migrations                                                           
\ Done deleting pulp2 data directories                                [OK]      
--------------------------------------------------------------------------------


real    2m46.147s
user    1m32.814s
sys     0m17.502s

Happy upgrading!

Upgrading to Fedora 34

Fedora 34 was released a few weeks ago, and I finally took some time to upgrade my work machine to the new release. Here are some tips I found interesting:

How to upgrade via command line

Upgrades can be applied via CLI with

sudo dnf upgrade -y && \
sudo dnf system-upgrade download --refresh --releasever=34 --nogpgcheck  --allowerasing -y && \
sudo dnf system-upgrade reboot -y

Note that this will take care of removing unneeded or conflicting RPMs. It can be a little too eager removing packages, so you can inspect afterwards what was done via dnf history list and dnf history info X.
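
For example (X stands for the transaction ID reported by the list command):

sudo dnf history list | head -n 10   # find the ID (X) of the system-upgrade transaction
sudo dnf history info X              # review exactly what was installed and removed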

Fixing the horizontal dock

I'm a fan of the old way the dock was handled (vertically, on the left). Moving the mouse to the top-left corner to activate the 'Activities' button, and then moving it down to the bottom to choose the application I want to launch, seems like a lot of mouse travel.

Fortunately there are extensions that fix that behaviour and revert to the old one.

This extension needs to be used in conjunction with Dash-to-dock, which is available from the linked repo or can be installed from RPM with dnf install gnome-shell-extension-dash-to-dock.
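
If installed from the RPM, the extension may still need to be enabled for your user; something along these lines should do it (the UUID below is the usual Dash-to-dock one, confirm it with gnome-extensions list):

gnome-extensions list | grep -i dash-to-dock          # confirm the extension UUID
gnome-extensions enable dash-to-dock@micxgx.gmail.com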

Ta-da!

... and that was it for me. Pretty uneventful upgrade as everything seems to work ok.

Happy hacking!

Notes on upgrading RHV 4.3 to RHV 4.4

Recently, Red Hat published the latest RHV 4.4 version. It introduces some major changes in the underlying operating system (a migration from RHEL7 to RHEL8 on both the hypervisors and the Engine / Self-Hosted Engine), and a bunch of new features.

There are extensive notes on how to perform the upgrade, especially for Self-Hosted Engine deployments.

I upgraded a small two-node lab environment, and besides the notes already mentioned in the docs above, I also found the following points relevant:

Before you start

  • Understand the NIC naming differences between RHEL7 and RHEL8.
    • Your hypervisor NICs will probably be renamed.
  • Jot down your hypervisors' NIC-to-MAC-address mappings prior to attempting an upgrade (see the one-liner after this list).
    • This will ease understanding what NIC is what after installing RHEL8.
  • When using shared storage (FC), consider unmapping it while you reinstall each host, or ensure your kickstart does NOT clear the shared disks.
    • Otherwise this might lead to data loss!!
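
A quick way to capture that NIC-to-MAC mapping (a simple sketch; the output path is arbitrary):

# capture the current NIC-to-MAC mapping before reinstalling, and keep the output somewhere safe
for nic in /sys/class/net/*; do
    [ -e "$nic/address" ] || continue        # skip non-device entries
    printf '%s %s\n' "$(basename "$nic")" "$(cat "$nic/address")"
done | tee /tmp/nic-mac-map.txt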

Prerequisites

  • One spare hypervisor, freshly installed with RHEL8/RHVH8 and NOT added to the Manager.
  • One additional LUN / NFS share for the new SHE 4.4 deployment.

    • The installer does not upgrade the old SHE in place, so a new LUN is required.
    • This eases the rollback, as the original SHE LUN is untouched.
  • Ensure the new hypervisor has all the configuration to access all required networks prior to starting the upgrade.

    • IP configuration for the ovirtmgmt network (obvious).
    • IP configuration for any NFS/iSCSI networks, if required.
    • Shared FC storage, if required.
    • This is critical as the restore process does not prompt to configure/fix network settings when deploying the upgraded manager.
  • Extra steps

    • Collect your RHV-M details:
      • IP address and netmask
      • FQDN
      • Mac-address if using DHCP.
      • Extra software and additional RPMs (eg: AD/IDM/ldap integration, etc)
      • Existing /etc/hosts details in case you use hosts instead of DNS (bad bad bad!!!).
      • Same for hypervisors!
    • Optionally: Mark your networks within the cluster as non-Required. This might be useful until BZ #1867198 is addressed.

Deploying and registering the hypervisors

The RHEL8/RHVH8 hosts can be deployed as usual with Foreman / Red Hat Satellite.

Ensure the hypervisors are registered and have access to the repositories as below:

RHEL8 Host repositories

POOLID=`subscription-manager list --available --matches "Red Hat Virtualization"  --pool-only | head -n 1`
subscription-manager attach --pool=$POOLID
subscription-manager repos \
    --disable='*' \
    --enable=rhel-8-for-x86_64-baseos-rpms \
    --enable=rhel-8-for-x86_64-appstream-rpms \
    --enable=rhv-4-mgmt-agent-for-rhel-8-x86_64-rpms \
    --enable=fast-datapath-for-rhel-8-x86_64-rpms \
    --enable=ansible-2.9-for-rhel-8-x86_64-rpms \
    --enable=advanced-virt-for-rhel-8-x86_64-rpms

yum module reset -y virt
yum module enable -y virt:8.2
systemctl enable --now firewalld
yum install -y rhevm-appliance ovirt-hosted-engine-setup

RHVH8 Host repositories

POOLID=$(subscription-manager list --available --matches "Red Hat Virtualization"  --pool-only | head -n 1)
subscription-manager attach --pool=$POOLID
subscription-manager repos \
    --disable='*' \
    --enable=rhvh-4-for-rhel-8-x86_64-rpms
systemctl enable --now firewalld
yum install -y rhevm-appliance

Powering off RHV 4.3 manager

  • Set the Manager in global maintenance mode.
  • OPTIONAL: Mark your networks within the cluster as non-Required. This might be useful until BZ #1867198 is addressed.
  • Stop the ovirt-engine service.
  • Backup the RHV 4.3 database and save it to a shared location (see the example below).
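
A sketch of that backup step with the standard engine-backup tool (file names are illustrative):

# on the RHV 4.3 Manager, with ovirt-engine stopped
engine-backup --mode=backup --scope=all \
    --file=engine-backup-rhevm-$(date +%Y%m%d_%H%M).tar.bz2 \
    --log=engine-backup-$(date +%Y%m%d_%H%M).log
# then copy the resulting file to shared storage / the new RHEL8 host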

Performing the RHV-M upgrade

  • Copy the database backup into the RHEL8 hypervisor.
  • Launch the restore process with hosted-engine --deploy --restore-from-file=backup.tar.bz2

The process has changed significantly in the last RHV releases, and it now performs the new SHE rollout or restore in two phases:

  • Phase 1: it tries to roll it out in the hypervisor local storage.

    • Gather FQDN, IP details of the Manager.
    • Gather other configuration.
  • Phase 2: migrate to shared storage.

    • If Phase 1 is successful, this takes care of gathering the shared storage details (LUN ID or NFS details).
    • Copy the bootstrap manager into the shared storage.
    • Configure the ovirt-ha-broker and ovirt-ha-agent in the hypervisor to monitor and ensure the SHE is started.

Phase 1 details

[root@rhevh2 rhev]# time  hosted-engine --deploy --restore-from-file=engine-backup-rhevm-20200807_1536.tar.bz2
[ INFO  ] Stage: Initializing
[ INFO  ] Stage: Environment setup
          During customization use CTRL-D to abort.
          Continuing will configure this host for serving as hypervisor and will create a local VM with a running engine.
          The provided engine backup file will be restored there,
          it's strongly recommended to run this tool on an host that wasn't part of the environment going to be restored.
          If a reference to this host is already contained in the backup file, it will be filtered out at restore time.
          The locally running engine will be used to configure a new storage domain and create a VM there.
          At the end the disk of the local VM will be moved to the shared storage.
          The old hosted-engine storage domain will be renamed, after checking that everything is correctly working you can manually remove it.
          Other hosted-engine hosts have to be reinstalled from the engine to update their hosted-engine configuration.
          Are you sure you want to continue? (Yes, No)[Yes]: yes
          It has been detected that this program is executed through an SSH connection without using tmux.
          Continuing with the installation may lead to broken installation if the network connection fails.
          It is highly recommended to abort the installation and run it inside a tmux session using command "tmux".
          Do you want to continue anyway? (Yes, No)[No]: yes
          Configuration files: 
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20200807155111-5blcva.log
          Version: otopi-1.9.2 (otopi-1.9.2-1.el8ev)
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup (late)
[ INFO  ] Stage: Environment customization

          --== STORAGE CONFIGURATION ==--


          --== HOST NETWORK CONFIGURATION ==--

          Please indicate the gateway IP address [10.48.0.100]: 
[ INFO  ] TASK [ovirt.hosted_engine_setup : Execute just a specific set of steps]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Force facts gathering]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Detecting interface on existing management bridge]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Get all active network interfaces]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Filter bonds with bad naming]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Generate output list]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Collect interface types]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Check for Team devices]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Get list of Team devices]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Filter unsupported interface types]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Failed if only teaming devices are availible]
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Validate selected bridge interface if management bridge does not exist]
[ INFO  ] skipping: [localhost]
         Please indicate a nic to set ovirtmgmt bridge on: (eth4.100, ens15.200) [ens15.200]: eth4.100
          Please specify which way the network connectivity should be checked (ping, dns, tcp, none) [dns]: 

          --== VM CONFIGURATION ==--

          Please enter the name of the datacenter where you want to deploy this hosted-engine host. Please note that if you are restoring a backup that contains info about other hosted-engine hosts,
          this value should exactly match the value used in the environment you are going to restore. [Default]: 
          Please enter the name of the cluster where you want to deploy this hosted-engine host. Please note that if you are restoring a backup that contains info about other hosted-engine hosts,
          this value should exactly match the value used in the environment you are going to restore. [Default]: 
          Renew engine CA on restore if needed? Please notice that if you choose Yes, all hosts will have to be later manually reinstalled from the engine. (Yes, No)[No]: 
          Pause the execution after adding this host to the engine?
          You will be able to iteratively connect to the restored engine in order to manually review and remediate its configuration before proceeding with the deployment:
          please ensure that all the datacenter hosts and storage domain are listed as up or in maintenance mode before proceeding.
          This is normally not required when restoring an up to date and coherent backup. (Yes, No)[No]: 
          If you want to deploy with a custom engine appliance image,
          please specify the path to the OVA archive you would like to use
          (leave it empty to skip, the setup will use rhvm-appliance rpm installing it if missing): 
          Please specify the number of virtual CPUs for the VM (Defaults to appliance OVF value): [4]: 
          Please specify the memory size of the VM in MB (Defaults to appliance OVF value): [16384]: 
[ INFO  ] Detecting host timezone.
          Please provide the FQDN you would like to use for the engine.
          Note: This will be the FQDN of the engine VM you are now going to launch,
          it should not point to the base host or to any other existing machine.
         Engine VM FQDN:  []: rhevm.example.org
          Please provide the domain name you would like to use for the engine appliance.
          Engine VM domain: [example.org]
          Enter root password that will be used for the engine appliance: 
          Confirm appliance root password: 
          Enter ssh public key for the root user that will be used for the engine appliance (leave it empty to skip): 
          Do you want to enable ssh access for the root user (yes, no, without-password) [yes]: 
          Do you want to apply a default OpenSCAP security profile (Yes, No) [No]: 
          You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:03:ec:35]: 
          How should the engine VM network be configured (DHCP, Static)[DHCP]? static
          Please enter the IP address to be used for the engine VM []: 10.48.0.4
[ INFO  ] The engine VM will be configured to use 10.48.0.4/24
          Please provide a comma-separated list (max 3) of IP addresses of domain name servers for the engine VM
          Engine VM DNS (leave it empty to skip) [10.48.0.100]: 
          Add lines for the appliance itself and for this host to /etc/hosts on the engine VM?
          Note: ensuring that this host could resolve the engine VM hostname is still up to you
          (Yes, No)[No] 

          --== HOSTED ENGINE CONFIGURATION ==--

          Please provide the name of the SMTP server through which we will send notifications [localhost]: 
          Please provide the TCP port number of the SMTP server [25]: 
          Please provide the email address from which notifications will be sent [root@localhost]: 
          Please provide a comma-separated list of email addresses which will get notifications [root@localhost]: 
          Enter engine admin password: 
          Confirm engine admin password: 
[ INFO  ] Stage: Setup validation
          Please provide the hostname of this host on the management network [rhevh2]: rhevh2.example.org
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stage: Misc configuration (early)
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Transaction commit
[ INFO  ] Stage: Closing up
[ INFO  ] Cleaning previous attempts
[ INFO  ] TASK [ovirt.hosted_engine_setup : Execute just a specific set of steps]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Force facts gathering]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Install oVirt Hosted Engine packages]
[ INFO  ] ok: [localhost]

[... snip ...]

The Manager is now deployed, and is made available via the hypervisor at a later stage:

[ INFO  ] TASK [ovirt.hosted_engine_setup : Adding new SSO_ALTERNATE_ENGINE_FQDNS line]
[ INFO  ] changed: [localhost -> rhevm.example.org]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Restart ovirt-engine service for changed OVF Update configuration and LibgfApi support]
[ INFO  ] changed: [localhost -> rhevm.example.org]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Mask cloud-init services to speed up future boot]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Wait for ovirt-engine service to start]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Open a port on firewalld]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Expose engine VM webui over a local port via ssh port forwarding]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Evaluate temporary bootstrap engine URL]
[ INFO  ] ok: [localhost]
[ INFO  ] The bootstrap engine is temporary accessible over https://rhevh2.example.org:6900/ovirt-engine/ 
[ INFO  ] TASK [ovirt.hosted_engine_setup : Detect VLAN ID]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Set Engine public key as authorized key without validating the TLS/SSL certificates]
[ INFO  ] changed: [localhost]
[...]
[ INFO  ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Always revoke the SSO token]
[ INFO  ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]

The bootstrap manager is available at https://hypervisor.example.org:6900/ovirt-engine/ while the installer adds the current host under Manager management. It waits for the host to reach the 'Up' state, which is why it is important to have all the storage and network prerequisites prepared and available.

And to finish up:

[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Destroy local storage-pool localvm7imrhb7u]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Undefine local storage-pool localvm7imrhb7u]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Destroy local storage-pool {{ local_vm_disk_path.split('/')[5] }}]
[ INFO  ] TASK [ovirt.hosted_engine_setup : Undefine local storage-pool {{ local_vm_disk_path.split('/')[5] }}]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20200807193709.conf'
[ INFO  ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Hosted Engine successfully deployed
[ INFO  ] Other hosted-engine hosts have to be reinstalled in order to update their storage configuration. From the engine, host by host, please set maintenance mode and then click on reinstall button ensuring you choose DEPLOY in hosted engine tab.
[ INFO  ] Please note that the engine VM ssh keys have changed. Please remove the engine VM entry in ssh known_hosts on your clients.

real    45m1,768s
user    18m4,639s
sys     1m9,271s

After finishing the upgrade it is also recommended to register the RHV-Manager virtual machine and upgrade to the latest RPMs available in the Red Hat CDN.

First, set the Hosted Engine in Global Maintenance mode so the HA agents do not restart the Manager VM while it is being updated.
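
A minimal example, run from one of the SHE hypervisors:

hosted-engine --set-maintenance --mode=global

Then, on the Manager VM, attach a Red Hat Virtualization subscription and enable the required repositories: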

POOLID=`subscription-manager list --available --matches "Red Hat Virtualization"  --pool-only | head -n 1`
subscription-manager attach --pool=$POOLID

subscription-manager repos \
    --disable='*' \
    --enable=rhel-8-for-x86_64-baseos-rpms \
    --enable=rhel-8-for-x86_64-appstream-rpms \
    --enable=rhv-4.4-manager-for-rhel-8-x86_64-rpms \
    --enable=fast-datapath-for-rhel-8-x86_64-rpms \
    --enable=ansible-2.9-for-rhel-8-x86_64-rpms \
    --enable=jb-eap-7.3-for-rhel-8-x86_64-rpms

yum module -y enable pki-deps
yum module -y enable postgresql:12
yum module reset -y virt
yum module enable -y virt:8.2
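
Before upgrading, you can double-check the module streams (enabled streams are flagged with [e] in the yum output):

yum module list virt postgresql pki-deps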

Performing the upgrade:

systemctl stop ovirt-engine
yum upgrade -y
engine-setup --accept-defaults 
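
Once engine-setup finishes successfully, take the environment out of Global Maintenance again from one of the SHE hypervisors:

hosted-engine --set-maintenance --mode=none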

Rolling back a failed upgrade

A rollback can be performed if the following applies:

  • The deployment or upgrade to the new RHV 4.4 Manager was not successful.
  • No new VMs have been created and no existing VMs have been altered (e.g. disks or NICs added). If a rollback occurs, those changes will be inconsistent with the old Manager database and potentially impossible to reconcile.

If so, the rollback can be performed by:

  • Powering off the new RHEL8/RHVH hypervisor and manager.
  • Powering the old Manager back on from the RHEL 7 hosts, which should still point to the old SHE LUN and storage (see the quick check below).
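
A quick way to confirm the Manager VM and SHE storage state from a hypervisor is the standard status command:

hosted-engine --vm-status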

Finalising the upgrade

At this point you should have a working Manager under the regular https://FQDN/ovirt-engine/ address. Don't forget to clear cookies and the browser cache, as stale data can lead to strange WebUI issues.

From here you can continue reinstalling your hypervisors. I'd suggest:

  • Starting with your SHE hypervisors. This ensures SHE high availability is restored as soon as possible.
  • Then the non-SHE hypervisors.
  • Then finalising the remaining tasks, such as upgrading Cluster and Data Center compatibility levels, rebooting the guest VMs, etc.

Happy hacking!

Provisioning RHV 4.1 hypervisors using Satellite 6.2

Overview

The RHV-H 4.1 installation documentation describes a method to provision RHV-H hypervisors using PXE and/or Satellite. This document covers all the steps required to achieve such a configuration in a repeatable manner.

Prerequisites

  • An RHV-H installation ISO file such as RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso downloaded onto the Satellite/Capsules.
  • rhviso2media.sh script (see here)

Creating the installation media in Satellite and Capsules

  • Deploy the RHV-H ISO in /var/lib/pulp/tmp
  • Run the rhviso2media.sh script to populate the installation media directories. It makes the following files available:
    • kernel and initrd files in tftp://host/ISOFILE/vmlinuz and tftp://host/ISOFILE/initrd
    • DVD installation media in /var/www/html/pub/ISOFILE directory
    • squashfs.img file in /var/www/html/pub/ISOFILE directory

Example:

# ./rhviso2media.sh RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso
Mounting /var/lib/pulp/tmp/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso in /mnt/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso ...
mount: /dev/loop0 is write-protected, mounting read-only
Copying ISO contents to /var/www/html/pub/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso ...
Extracting redhat-virtualization-host-image-update ...
./usr/share/redhat-virtualization-host/image
./usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.0-20170307.1.el7_3.squashfs.img
./usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.0-20170307.1.el7_3.squashfs.img.meta
1169874 blocks
OK
Copying squash.img to public directory . Available as http://sat62.lab.local/pub/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso/squashfs.img ...
Copying /mnt/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso/images/pxeboot/vmlinuz /mnt/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso/images/pxeboot/initrd.img to /var/lib/tftpboot/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso ...
OK
Unmounting /mnt/RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso
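
The script itself is linked above rather than reproduced here, but a minimal sketch of what it does, based purely on the output shown, could look like this (the Packages/ location of the image-update RPM inside the ISO is an assumption; adjust paths to your Satellite/Capsule):

#!/bin/bash
# Sketch of rhviso2media.sh: publish an RHV-H ISO as Satellite installation media.
set -euo pipefail

ISO="$1"                                    # e.g. RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso
SRC="/var/lib/pulp/tmp/${ISO}"
MNT="/mnt/${ISO}"
PUB="/var/www/html/pub/${ISO}"
TFTP="/var/lib/tftpboot/${ISO}"

mkdir -p "${MNT}" "${PUB}" "${TFTP}"

# Mount the ISO read-only and copy its full contents to the public web directory
mount -o loop,ro "${SRC}" "${MNT}"
cp -a "${MNT}/." "${PUB}/"

# Extract squashfs.img from the redhat-virtualization-host-image-update package
# NOTE: the Packages/ path inside the ISO is assumed, not taken from the original script.
RPM="$(ls "${MNT}"/Packages/redhat-virtualization-host-image-update-4* | head -n 1)"
WORKDIR="$(mktemp -d)"
( cd "${WORKDIR}" && rpm2cpio "${RPM}" | cpio -idm './usr/share/redhat-virtualization-host/image/*' )
cp "${WORKDIR}"/usr/share/redhat-virtualization-host/image/*.squashfs.img "${PUB}/squashfs.img"

# Publish kernel and initrd for PXE boot
cp "${MNT}/images/pxeboot/vmlinuz" "${MNT}/images/pxeboot/initrd.img" "${TFTP}/"

umount "${MNT}"
rm -rf "${WORKDIR}"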

Create Installation media

name: RHVH Installation media
path: http://fake.url

This media path is not actually used, as the kickstart will use a liveimg URL rather than a media URL; however, Satellite is stubborn and still requires one.
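
If you prefer the CLI, the same medium could be created with hammer; a sketch (organization and location options omitted):

hammer medium create \
    --name "RHVH Installation media" \
    --path "http://fake.url" \
    --os-family "Redhat"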

Create partition table

name: RHVH Kickstart default

<%#
kind: ptable
name: Kickstart default
%>
zerombr
clearpart --all --initlabel
autopart --type=thinp

Create pxelinux for RHV

Based on the Kickstart default PXELinux template:

```erb
<%#
kind: PXELinux
name: RHVH Kickstart default PXELinux
%>
#
# This file was deployed via '<%= @template_name %>' template
#
# Supported host/hostgroup parameters:
#
# blacklist = module1, module2
#   Blacklisted kernel modules
#
<%
options = []
if @host.params['blacklist']
    options << "modprobe.blacklist=" + @host.params['blacklist'].gsub(' ', '')
end
options = options.join(' ')
-%>

DEFAULT rhvh

LABEL rhvh
    KERNEL <%= @host.params["rhvh_image"] %>/vmlinuz <%= @kernel %>
    APPEND initrd=<%= @host.params["rhvh_image"] %>/initrd.img inst.stage2=http://<%= @host.hostgroup.subnet.tftp.name %>/pub/<%= @host.params["rhvh_image"] %>/ ks=<%= foreman_url('provision') %> intel_iommu=on ssh_pwauth=1 local_boot_trigger=<%= foreman_url("built") %> <%= options %>
    IPAPPEND 2
```

Create Kickstart for RHV

File it under Satellite Kickstart Default for RHVH.

Note that the @host.hostgroup.subnet.tftp.name variable is used to point to the Capsule associated with this host, rather than to the Satellite server itself.

<%#
kind: provision
name: Satellite Kickstart default
%>
<%
rhel_compatible = @host.operatingsystem.family == 'Redhat' && @host.operatingsystem.name != 'Fedora'
os_major = @host.operatingsystem.major.to_i
# safemode renderer does not support unary negation
pm_set = @host.puppetmaster.empty? ? false : true
puppet_enabled = pm_set || @host.params['force-puppet']
salt_enabled = @host.params['salt_master'] ? true : false
section_end = (rhel_compatible && os_major <= 5) ? '' : '%end'
%>
install
# not required # url --url=http://<%= @host.hostgroup.subnet.tftp.name %>/pub/<%= @host.params["rhvh_image"] %>
lang en_US.UTF-8
selinux --enforcing
keyboard es
skipx

<% subnet = @host.subnet -%>
<% if subnet.respond_to?(:dhcp_boot_mode?) -%>
<% dhcp = subnet.dhcp_boot_mode? && !@static -%>
<% else -%>
<% dhcp = !@static -%>
<% end -%>

network --bootproto <%= dhcp ? 'dhcp' : "static --ip=#{@host.ip} --netmask=#{subnet.mask} --gateway=#{subnet.gateway} --nameserver=#{[subnet.dns_primary, subnet.dns_secondary].select(&:present?).join(',')}" %> --hostname <%= @host %><%= os_major >= 6 ? " --device=#{@host.mac}" : '' -%>

rootpw --iscrypted <%= root_pass %>
firewall --<%= os_major >= 6 ? 'service=' : '' %>ssh
authconfig --useshadow --passalgo=sha256 --kickstart
timezone --utc <%= @host.params['time-zone'] || 'UTC' %>

<% if @host.operatingsystem.name == 'Fedora' and os_major <= 16 -%>
# Bootloader exception for Fedora 16:
bootloader --append="nofb quiet splash=quiet <%=ks_console%>" <%= grub_pass %>
part biosboot --fstype=biosboot --size=1
<% else -%>
bootloader --location=mbr --append="nofb quiet splash=quiet" <%= grub_pass %>
<% end -%>

<% if @dynamic -%>
%include /tmp/diskpart.cfg
<% else -%>
<%= @host.diskLayout %>
<% end -%>

text
reboot

liveimg --url=http://<%= foreman_server_fqdn %>/pub/<%= @host.params["rhvh_image"] %>/squashfs.img


%post --nochroot
exec < /dev/tty3 > /dev/tty3
#changing to VT 3 so that we can see whats going on....
/usr/bin/chvt 3
(
cp -va /etc/resolv.conf /mnt/sysimage/etc/resolv.conf
/usr/bin/chvt 1
) 2>&1 | tee /mnt/sysimage/root/install.postnochroot.log
<%= section_end -%>


%post
logger "Starting anaconda <%= @host %> postinstall"

nodectl init

exec < /dev/tty3 > /dev/tty3
#changing to VT 3 so that we can see whats going on....
/usr/bin/chvt 3
(
<% if subnet.respond_to?(:dhcp_boot_mode?) -%>
<%= snippet 'kickstart_networking_setup' %>
<% end -%>

#update local time
echo "updating system time"
/usr/sbin/ntpdate -sub <%= @host.params['ntp-server'] || '0.fedora.pool.ntp.org' %>
/usr/sbin/hwclock --systohc

<%= snippet "subscription_manager_registration" %>

<% if @host.info['parameters']['realm'] && @host.realm && @host.realm.realm_type == 'Red Hat Identity Management' -%>
<%= snippet "idm_register" %>
<% end -%>

# update all the base packages from the updates repository
#yum -t -y -e 0 update

<%= snippet('remote_execution_ssh_keys') %>

sync

<% if @provisioning_type == nil || @provisioning_type == 'host' -%>
# Inform the build system that we are done.
echo "Informing Foreman that we are built"
wget -q -O /dev/null --no-check-certificate <%= foreman_url %>
<% end -%>
) 2>&1 | tee /root/install.post.log
exit 0

<%= section_end -%>

Create new Operating system

  • Name: RHVH
  • Major Version: 7
  • Partition table: RHVH Kickstart default
  • Installation media: RHVH Installation media
  • Templates: "Kickstart default PXELinux for RHVH" and "Satellite kickstart default for RHVH"

Associate the previously-created provisioning templates with this OS.
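
As a sketch, the operating system entry could also be created with hammer (associating the partition table, installation media and templates can then be done from the web UI, or with further hammer os commands if your hammer version supports them):

hammer os create --name "RHVH" --major 7 --family "Redhat"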

Create a new hostgroup

Create a new hostgroup with a parameter called rhvh_image. This parameter is used by the provisioning templates to generate the installation media paths as required.

eg:

rhvh_image = RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso
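
The same parameter can be set from the CLI; a sketch with hammer (the hostgroup name is illustrative):

hammer hostgroup set-parameter \
    --hostgroup "RHVH hypervisors" \
    --name "rhvh_image" \
    --value "RHVH-4.0-20170308.0-RHVH-x86_64-dvd1.iso"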


Final thoughts

Future versions of Satellite might include better integration for RHV-H provisioning; however, the method described above can be used in the meantime.

Happy hacking!

RHV 4.2: Using rhv-log-collector-analyzer to assess your virtualization environment

RHV 4.2 includes a tool that lets you quickly analyze your RHV environment. It bases its analysis either on a logcollector report (sosreport and others) or on a live connection to your environment, and generates JSON or HTML output.


NOTE: This article is deprecated and is only left for historical reasons.

rhv-log-collector-analyzer now only supports live reporting, so use rhv-log-collector-analyzer --live to grab a snapshot of your deployment and verify its status.


You'll find it already installed in RHV 4.2, and gathering a report is as easy as:

# rhv-log-collector-analyzer --live
Generating reports:
===================
Generated analyzer_report.html

If you need to assess an existing logcollector report on a new system that never had a running RHV-Manager, things get a bit more complicated:

root@localhost # yum install -y ovirt-engine
root@localhost # su - postgres
postgres@localhost ~ # source scl_source enable rh-postgresql95
postgres@localhost ~ # cd /tmp
postgres@localhost /tmp # time rhv-log-collector-analyzer  /tmp/sosreport-LogCollector-20181106134555.tar.xz

Preparing environment:
======================
Temporary working directory is /tmp/tmp.do6qohRDhN
Unpacking postgres data. This can take up to several minutes.
sos-report extracted into: /tmp/tmp.do6qohRDhN/unpacked_sosreport
pgdump extracted into: /tmp/tmp.do6qohRDhN/pg_dump_dir
Welcome to unpackHostsSosReports script!
Extracting sosreport from hypervisor HYPERVISOR1 in /tmp/ovirt-log-collector-analyzer-hosts/HYPERVISOR1
Extracting sosreport from hypervisor HYPERVISOR2 in /tmp/ovirt-log-collector-analyzer-hosts/HYPERVISOR2
Extracting sosreport from hypervisor HYPERVISOR3 in /tmp/ovirt-log-collector-analyzer-hosts/HYPERVISOR3
Extracting sosreport from hypervisor HYPERVISOR4 in /tmp/ovirt-log-collector-analyzer-hosts/HYPERVISOR4

Creating a temporary database in /tmp/tmp.do6qohRDhN/postgresDb/pgdata. Log of initdb is in /tmp/tmp.do6qohRDhN/initdb.log

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
Importing the dump into a temporary database. Log of the restore process is in /tmp/tmp.do6qohRDhN/db-restore.log

Generating reports:
===================
Generated analyzer_report.html

Cleaning up:
============
Stopping temporary database
Removing temporary directory /tmp/tmp.do6qohRDhN

You'll find an analyzer_report.html file in your current working directory. It can be reviewed with a text-only browser such as lynx/links, or opened with a proper full-blown browser.
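
For example, on a text-only host:

yum install -y lynx     # or elinks, whichever is available in your enabled repositories
lynx analyzer_report.html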

Bonus track

Sometimes it can also be helpful to check the database dump that is included in the logcollector report. In order to do that, you can do something like:

Review pg_dump_dir in the log above: /tmp/tmp.do6qohRDhN/pg_dump_dir.

Initiate a new PostgreSQL instance as follows:

postgres@localhost $ source scl_source enable rh-postgresql95
postgres@localhost $ export PGDATA=/tmp/foo
postgres@localhost $ initdb -D ${PGDATA} 
postgres@localhost $ /opt/rh/rh-postgresql95/root/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t 30 &
postgres@localhost $ psql -c "create database testengine"
postgres@localhost $ psql -c "create schema testengine"
postgres@localhost $ psql testengine < /tmp/tmp.*/pg_dump_dir/restore.sql
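
From there you can run ad-hoc queries against the restored engine schema, for example to list the hypervisors known to the Manager (table and column names as found in the engine schema; treat it as a sketch):

postgres@localhost $ psql testengine -c "SELECT vds_name, host_name FROM vds_static;"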

Happy hacking!

Satellite 6: Upgrading to Satellite 6.3

We have a new and shiny Satellite 6.3.0 available as of now, so I just bit the bullet and upgraded my lab's Satellite.

The first thing to know is that you now have the foreman-maintain tool to do some pre-flight checks, as well as drive the upgrade. You'll need to enable its repository (included as part of the RHEL product):

subscription-manager repos --disable="*" \
    --enable rhel-7-server-rpms \
    --enable rhel-7-server-satellite-6.3-rpms \
    --enable rhel-server-rhscl-7-rpms \
    --enable rhel-7-server-satellite-maintenance-6-rpms

yum install -y rubygem-foreman_maintain
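
foreman-maintain groups its functionality into subcommands; to see which health checks are available before running them (subcommand names may vary slightly between versions):

foreman-maintain health list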

Check your Satellite health with:

# foreman-maintain health check
Running ForemanMaintain::Scenario::FilteredScenario
================================================================================
Check for verifying syntax for ISP DHCP configurations:               [FAIL]
undefined method `strip' for nil:NilClass
--------------------------------------------------------------------------------
Check for paused tasks:                                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running using hammer ping:             [OK]
--------------------------------------------------------------------------------
Scenario [ForemanMaintain::Scenario::FilteredScenario] failed.

The following steps ended up in failing state:

  [foreman-proxy-verify-dhcp-config-syntax]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="foreman-proxy-verify-dhcp-config-syntax"

And finally perform the upgrade with:

# foreman-maintain upgrade run --target-version 6.3 --whitelist="foreman-proxy-verify-dhcp-config-syntax,disk-performance,repositories-setup"

Running Checks before upgrading to Satellite 6.3
================================================================================
Skipping pre_upgrade_checks phase as it was already run before.
To enforce to run the phase, use `upgrade run --phase pre_upgrade_checks`

Scenario [Checks before upgrading to Satellite 6.3] failed.

The following steps ended up in failing state:

 [foreman-proxy-verify-dhcp-config-syntax]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="foreman-proxy-verify-dhcp-config-syntax"



Running Procedures before migrating to Satellite 6.3
================================================================================
Skipping pre_migrations phase as it was already run before.
To enforce to run the phase, use `upgrade run --phase pre_migrations`


Running Migration scripts to Satellite 6.3
================================================================================
Setup repositories: 
- Configuring repositories for 6.3                                    [FAIL]    
Failed executing subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-server-rhscl-7-rpms --enable=rhel-7-server-satellite-maintenance-6-rpms --enable=rhel-7-server-satellite-tools-6.3-rpms --enable=rhel-7-server-satellite-6.3-rpms, exit status 1:
Error: 'rhel-7-server-satellite-6.3-rpms' does not match a valid repository ID. Use "subscription-manager repos --list" to see valid repositories.
Repository 'rhel-7-server-rpms' is enabled for this system.
Repository 'rhel-7-server-satellite-maintenance-6-rpms' is enabled for this system.
Repository 'rhel-7-server-satellite-tools-6.3-rpms' is enabled for this system.
Repository 'rhel-server-rhscl-7-rpms' is enabled for this system.
-------------------------------------------------------------------------------
Update package(s) : 
  (yum stuff)

                                                                        [OK]
 --------------------------------------------------------------------------------
Procedures::Installer::Upgrade: 
Upgrading, to monitor the progress on all related services, please do:
  foreman-tail | tee upgrade-$(date +%Y-%m-%d-%H%M).log
Upgrade Step: stop_services...
Upgrade Step: start_databases...
Upgrade Step: update_http_conf...
Upgrade Step: migrate_pulp...
Upgrade Step: mark_qpid_cert_for_update...
Marking certificate /root/ssl-build/satmaster.rhci.local/satmaster.rhci.local-qpid-broker for update
Upgrade Step: migrate_candlepin...
Upgrade Step: migrate_foreman...
Upgrade Step: Running installer...
Installing             Done                                               [100%] [............................................]
  The full log is at /var/log/foreman-installer/satellite.log
Upgrade Step: restart_services...
Upgrade Step: db_seed...
Upgrade Step: correct_repositories (this may take a while) ...
Upgrade Step: correct_puppet_environments (this may take a while) ...
Upgrade Step: clean_backend_objects (this may take a while) ...
Upgrade Step: remove_unused_products (this may take a while) ...
Upgrade Step: create_host_subscription_associations (this may take a while) ...
Upgrade Step: reindex_docker_tags (this may take a while) ...
Upgrade Step: republish_file_repos (this may take a while) ...
Upgrade completed!
                                                     [OK]
--------------------------------------------------------------------------------


Running Procedures after migrating to Satellite 6.3
================================================================================
katello-service start: 
- No katello service to start                                         [OK]      
--------------------------------------------------------------------------------
Turn off maintenance mode:                                            [OK]
--------------------------------------------------------------------------------
re-enable sync plans: 
- Total 4 sync plans are now enabled.                                 [OK]      
--------------------------------------------------------------------------------

Running Checks after upgrading to Satellite 6.3
================================================================================
Check for verifying syntax for ISP DHCP configurations:               [FAIL]
undefined method `strip' for nil:NilClass
--------------------------------------------------------------------------------
Check for paused tasks:                                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running using hammer ping:             [OK]
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Upgrade finished.

Happy hacking!