Automating password rotation for Oracle databases

Password rotation is not the most exciting task in the world, and that's exactly why it's a perfect candidate for automation. Automating routine tasks like this are good for everyone - DBAs can work on something that's more exciting, companies save costs as less time is spent on changing the passwords, and there's no place for human error, either. At Pythian, we typically use Ansible for task automation, and I like it mainly because of its non-intrusive configuration (no agents need to be installed on the target servers), and its scalability (tasks are executed in parallel on the target servers). This post will briefly describe how I automated password rotation for oracle database users using Ansible.
Overview
This blog post is not an intro to what is Ansible and how to use it, but it's rather an example of how a simple task can be automated using Ansible in a way that's scalable, flexible and easily reusable, and also provides the ability for other tasks to pick up the new passwords from a secure password store.- Scalability - I'd like to take advantage of Ansible's ability of executing tasks on multiple servers at the same time. For example, in a large environments of tens or hundreds of machines, a solution that executes password change tasks serially would not be suitable. This would be an example of a "serial" task (it's not a real thing, but just an illustration that it "hardcodes" a few "attributes" (environment file, the username and the hostname), and creating a separate task for every user/database you'd want to change the password for would be required: [code language="python" gutter="false"] - hosts: ora-serv01 remote_user: oracle tasks: - name: change password for SYS shell: | . TEST1.env && \ sqlplus / as sysdba @change_pasword.sql SYS \ \"\" [/code]
- Flexible - I want to be able to adjust the list of users for which the passwords are changed, and the list of servers/databases that the user passwords are changed for in a simple way, that doesn't include changing the main task list.
- Reusable - this comes together with flexibility. The idea is that the playbook would be so generic, that it wouldn't require any changes when it's implemented in a completely separate environment (i.e. for another client of Pythian)
- Secure password store - the new passwords are to be generated by the automated password rotation tool, and a method of storing password securely is required so that the new passwords could be picked up by the DBAs, application owners or the next automated task that would reconfigure the application
The implementation
Prerequisites
I chose to do the implementation using Ansible 2.3, because it introduces the passwordstore lookup, which enables interaction with thepass
utility (read more about it in
Passwordstore.org).
pass
is very cool. It store passwords in gpg-encrypted files, and it can also be configured to automatically update the changes to a git repository, which relieves us of the headache of password distribution. The password can be retrieved from git on the servers that need the access to the new passwords. Ansible 2.3 runs on python 2.6, unfortunately, the passwordstore lookup requires Python 2.7, which can be an issue if the control host for Ansible runs on Oracle Linux 6 or RHEL 6, as they don't provide Python 2.7 in the official yum repositories. Still, there are ways of getting it done, and I'll write another blog post about it. So, what we'll need is:
- Ansible 2.3
- jmespath plugin on Ansible control host (
pip install jmespath
) - jinja2 plugin on Ansible control host (I had to update it using
pip install -U jinja2
in few cases) - Python 2.7 (or Python 3.5)
pass
utility
The Playbook
This is the whole list of files that are included in the playbook: [code language="bash" gutter="false"] ./chpwd.yml ./inventory/hosts ./inventory/orcl1-vagrant-private_key ./inventory/orcl2-vagrant-private_key ./roles/db_users/files/change_password.sql ./roles/db_users/files/exists_user.sql ./roles/db_users/defaults/main.yml ./roles/db_users/tasks/main.yml [/code] Let's take a quick look at all of them:- ./chpwd.yml - is the playbook and (in this case) it's extremely simple as I want to run the password change against all defined hosts: [code language="python" gutter="false"] $ cat ./chpwd.yml --- - name: password change automation hosts: all roles: - db_users [/code]
- ./inventory/hosts, ./inventory/orcl1-vagrant-private_key, ./inventory/orcl2-vagrant-private_key - these files define the hosts and the connectivity. In this case we have 2 hosts - orcl1 and orcl2, and we'll connect to vagrant user using the private keys. [code language="python" gutter="false"] $ cat ./inventory/hosts [orahosts] orcl1 ansible_host=127.0.0.1 ansible_port=2201 ansible_ssh_private_key_file=inventory/orcl1-vagrant-private_key ansible_user=vagrant orcl2 ansible_host=127.0.0.1 ansible_port=2202 ansible_ssh_private_key_file=inventory/orcl2-vagrant-private_key ansible_user=vagrant[/code]
- ./roles/db_users/files/change_password.sql - A sql script that I'll execute on the database to change the passwords. It takes 2 parameters the username and the password: [code language="python" gutter="false"] $ cat ./roles/db_users/files/change_password.sql set ver off pages 0 alter user &1 identified by "&2"; exit;[/code]
- ./roles/db_users/files/exists_user.sql - A sql script that allows verifying the existence of the users. It takes 1 argument - the username. It outputs "User exists." when the user is there, and "User {username} does not exist." - when it's not. [code language="python" gutter="false"] $ cat ./roles/db_users/files/exists_user.sql set ver off pages 0 select 'User exists.' from all_users where username=upper('&1') union all select 'User '||upper('&1')||' does not exist.' from (select upper('&1') from dual minus select username from all_users); exit;[/code]
- ./roles/db_users/defaults/main.yml - is the default file for the db_users role. I use this file to define the users for each host and database for which the passwords need to be changed: [code language="python" gutter="false"] $ cat ./roles/db_users/defaults/main.yml --- db_users: - name: TEST1 host: orcl1 env: ". ~/.bash_profile && . ~/TEST1.env > /dev/null" pwdstore: "orcl1/TEST1/" os_user: oracle become_os_user: yes users: - dbsnmp - system - name: TEST2 host: orcl2 env: ". ~/.bash_profile && . ~/TEST2.env > /dev/null" pwdstore: "orcl2/TEST2/" os_user: oracle become_os_user: yes users: - sys - system - ctxsys - name: TEST3 host: orcl2 env: ". ~/.bash_profile && . ~/TEST3.env > /dev/null" pwdstore: "orcl2/TEST3/" os_user: oracle become_os_user: yes users: - dbsnmp[/code] In this data structure, we define everything that's needed to be known to connect to the database and change the passwords. each entry to the list contains the following data:
- name - just a descriptive name of the entry in this list, normally it would be the name of the database that's described below.
- host - the host on which the database resides. It should match one of the hosts defined in ./inventory/hosts.
- env - how to set the correct environment to be able to connect to the DB (currently it requires sysdba connectivity).
- pwdstore - the path to the folder in the passwordstore where the new passwords will be stored.
- os_user and become_os_user - these are used in case sudo to another user on the target host is required. In a typical configuration, I connect to the target host using a dedicated user for ansible, and then sudo to the DB owner. if ansible connects to the DB onwer directly, then become_os_user should be set to "no".
- users - this is the list of all users for which the passwords need to be changed.
- ./roles/db_users/tasks/main.yml - this is the task file of the db_users role. The soul of the playbook and the main part that does the password change depending on the contents in the defaults file described above. Instead of pasting the whole at once, I'll break it up task by task, and will provide some comments about what's being done.
- populate host_db_users - This task simply filters the whole db_users data structure that's defined in the defaults file, and creates host_db_users fact with only the DBs that belong to the host the task is currently run on. Using the ansible "when" conditional would also be possible to filter the list, however in such case there's a lot of "skipped" entries displayed when the task is executed, so I prefer filtering the list before it's even passed to the Ansible task. [code language="python" gutter="false"] --- - name: populate host_db_users set_fact: host_db_users="[]" [/code]
- create directory for target on db hosts - for each unique combination of os_user and become_os_user on the target host, and "ansible" directly is created. A json_query is used here, to filter just the os_user and become_os_user attributes that are needed. It would also work with
with_items: ""
, but in that case, the outputs become cluttered as the attributes are displayed during the execution. [code language="python" gutter="false"] - name: create directory for target on db hosts file: path: "ansible" state: directory become_user: "{{item.os_user}}" become: "{{item.become_os_user}}" with_items: "[]" [/code] - copy sql scripts to db_hosts - the missing scripts are copied from Ansible control host to the target "ansible" directories. "with_nested" is the method to create a loop in Ansible. [code language="python" gutter="false"] - name: copy sql scripts to db_hosts copy: src="" dest=ansible/ mode=0644 become_user: "" become: "" with_nested: - "[]" - ['files/change_password.sql','files/exists_user.sql'] [/code]
- verify user existence - I'm using a shell module to execute the sql script after setting the environment. The outputs are collected in "exists_output" variable. This task will not fail and will not show as "changed" because of failed_when and changed_when settings of "false". [code language="python" gutter="false"] - name: verify user existence shell: | && \ sqlplus -S / as sysdba \ @ansible/exists_user.sql register: exists_output become_user: "" become: "" with_subelements: - "" - users failed_when: false changed_when: false [/code]
- User existence results - this task will fail when any of the users didn't exist, and will display which user it was. This is done in a separate task to produce cleaner output, and in case it's not wanted to fail if any of the users don't exist (continue to change passwords for the existing users), this task can simply be commented or the "failed_when: false" can be uncommented. [code language="python" gutter="false"] - name: User existence results fail: msg="{{item}}" with_items: "[]" #failed_when: false [/code]
- generate and change the user passwords - finally, this is the task that actually changes the passwords. The successful password change is detected by checking the output from the sqlscript, which should produce "User altered." The rather complex use of lookups is there for a reason: the passwordstore lookup can also generate passwords, but it's not possible to define the character classes that the new password should contain, however the "password" lookup allows defining these. Additionally, the 1st character is generated only containing "ascii_letters", as there are usually some applications that "don't like" passwords that start with numbers (this is why generating the 1st letter of the password is separated from the remaining 11 characters. And lastly, the "passwordstore" lookup is used with the "userpass=" parameter to pass and store the generated password into the passwordstore (and it also keeps the previous passwords). This part could use some improvement as in some cases different rules for the generated password complexity may be required. The password change outputs are recorded in "change_output" that's checked in the last task. [code language="python" gutter="false"] - name: generate and change the user passwords shell: | && \ sqlplus -S / as sysdba \ @ansible/change_password.sql \ \"\" register: change_output become_user: "" become: "" with_subelements: - "" - users failed_when: false changed_when: "'User altered.' in change_output.stdout" [/code]
- Password change errors - The "change_output" data are verified here, and failed password changes are reported. [code language="python" gutter="false"] # fail if the password change failed. - name: Password change errors fail: msg="{{item}}" with_items: "[]" [/code]
It really works!
Now, when you know how it's built - it's time to show how it works! Please pay attention to the following:- The password store is empty at first
- The whole password change playbook completes in 12 seconds
- The tasks on both hosts are executed in parallel (see the order of execution feedback for each task)
- The passwordstore contains the password entries after the playbook completes, and they can be retrieved by using the
pass
command