Simple MySQL Auditing

Jun 11, 2008 / By Robert Hamel

Tags:

Simple auditing, i.e., knowing what changed recently, can save you tons of time while troubleshooting.

I know that, in the ideal world:

  • Everything is supposed to be done through configuration management.
  • Everything is documented and all changes are tracked through a VCS.
  • Every DDL or set global is trapped via MySQL Proxy and logged.

But there are always ways to bypass the gatekeepers. Changes can go in unnoticed. An hour or so later, your database performance suddenly changes for the worse, and you get that phone call.

First you check if anything caused an actual error. You look around at a few log files and nothing shows up. The next thing you ask yourself is, did someone change anything in the last little while. Of course, everybody says no. After a few hours of digging, comparing schemas, diff-ing old and current config files, you actually find what has changed, put it back the way it was and everything is back to normal. You ask the question again, did anybody change the global variable from X to Y, and someone finally admits it. But they thought that it was not relevant since they did that change on the weekend and the system performance didn’t go down the toilet until Monday morning around 9:00am. Sound familiar?

We have all gone through this many times, and it doesn’t matter what process you have in place — something always slips through the cracks.

I came up with some simple stored procedure that will compare two data sets and keep track of the changes historically. It’s loosely based on slowly-changing dimension type 2 in the data warehouse world (google Ralph Kimball if you want to get all the gory details). This method tracks only changes, so you should be able to keep historical rows forever, unless you are constantly dropping and creating whatever you are tracking.

Say you want to keep track of my.cnf or global variables changes. This one of the simplest tables to track, the information_schema — it has only two columns. Others, such as tables or routines, would make the SQL to compare a little more involved.

First you need a table to track this with. I called mine historical_global_variable. It has the two columns from the original table plus three additional ones:

  1. row_effective_date: Time of change e.g. when the row was the current value
  2. row_end_date: Time of next change e.g. when the row was no longer the current value
  3. current_value_flag : 1 if its the current value 0 otherwise

There are three conditions that we might encounter:

  1. The first execution of our auditing process or a new variable, which can occur after an upgrade.Insert new value into the historical_global_variable setting the current_value_flag=1, row_effective_date=now and row_end_date to some date far in the future — say, 100 or 500 years.
  2. Variable change through a my.cnf change followed by a bounce, a set global variable command, or a new default value which can happen for an unset parameter after an upgrade
    1. Set the current_value_flag = 0; the row_end_date = now() for the current value of the variable
    2. Insert new value into the historical_global_variable setting the current_value_flag=1, row_effective_date=now, and row_end_date to some date far in the future — say, 100 or 500 years
  3. Deprecated variable which can occur after an upgradeSet the current_value_flag = 0; the row_end_date = now() for the current value of the variable

I did my tests in MySQL 5.1 so I was able to create a procedure and schedule it with events.

The table structure:

CREATE TABLE `historical_global_variable` (`variable_name` varchar(64) NOT NULL DEFAULT '',
`variable_value` varchar(20480) DEFAULT NULL,
`row_effective_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`row_end_date` datetime DEFAULT NULL,
`current_value_flag` smallint(6) DEFAULT NULL,
PRIMARY KEY (`variable_name`,`row_effective_date`)

) ;

The stored procedure:

DELIMITER $$DROP PROCEDURE IF EXISTS `audit`.`populate_historical_global_variable`$$
CREATE PROCEDURE `populate_historical_global_variable`()
    MODIFIES SQL DATA
    DETERMINISTIC
BEGIN
 declare done int default 0;
 declare c_variable_name varchar(64);
 declare p_variable_name varchar(64);
 declare c_variable_value varchar(20480);
 declare p_variable_value varchar(20480);
 declare v_now timestamp default now();
 declare c1 cursor for
   select t.variable_name,
          t.variable_value c_variable_value,
 	  h.variable_value p_variable_value,
          h.variable_name p_variable_name
   from temp_global_variables t
   left join
    (select * from historical_global_variable
     where current_value_flag=1) h
   on h.variable_name = t.variable_name
   union
   select t.variable_name,
          t.variable_value c_variable_value,
 	  h.variable_value p_variable_value,
          h.variable_name p_variable_name
   from (select * from historical_global_variable
         where current_value_flag=1) h
   left join temp_global_variables t
   on h.variable_name = t.variable_name;

 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
 delete from temp_global_variables;
 insert into temp_global_variables
   select * from information_schema.global_variables;

 open c1;
    repeat
 	fetch c1 into c_variable_name,
                      c_variable_value,
                      p_variable_value,
                      p_variable_name;
        if not done then

 	# New variable not seen yet

 	   if p_variable_name is null then
	      insert into historical_global_variable
                     (variable_name,
                      variable_value,
                      row_effective_date,
                      current_value_flag,
                      row_end_date)
 	      values (c_variable_name,
                      c_variable_value,
                      v_now,
                      1,
                      date_add(v_now, interval 500 year));

 	# Deprecated variable

 	   elseif c_variable_name is null then
		update historical_global_variable
                       set current_value_flag=0,
                       row_end_date = v_now
		where variable_name=p_variable_name
                and current_value_flag=1;

 	# Variable change

           elseif ifnull(p_variable_value,'a a a') <>
                  ifnull(c_variable_value,'a a a') then

 		update historical_global_variable
                       set current_value_flag=0,
                       row_end_date = v_now
                where variable_name=c_variable_name
                and current_value_flag=1;

 		insert into historical_global_variable
                     (variable_name,
                      variable_value,
                      row_effective_date,
                      current_value_flag,
                     row_end_date)
		values (c_variable_name,
                        c_variable_value,
                        v_now,
                        1,
                        date_add(v_now, interval 500 year));

 	   end if;

 	end if;

 	until done end repeat;

 close c1;

END$$

DELIMITER ;

I scheduled this to run daily, as there are not supposed to be many changes. You can do it hourly if you need to.

Now, how can you use it? There are typical queries that you can run, and these queries can be executed on anything audited.

Current values (what does it look like now):

select variable_name, variable_value from historical_global_variable where current_value_flag=1;

variable_name               variable_value

AUTOCOMMIT                  ON
AUTOMATIC_SP_PRIVILEGES     ON
AUTO_INCREMENT_INCREMENT    1
AUTO_INCREMENT_OFFSET       1
BACK_LOG                    50
BASEDIR                     /usr/local/mysql-5.1.24/
etc

This is a long list! It’s too bad that the information_schema.global_variables view does not have a default value indicator. The query can be used to regenerate your my.cnf if it gets mangled beyond repair.

Active values for a specific point in time(what did it look like for a specific timestamp):

select variable_name, variable_value from historical_global_variable where '2008-04-17' between row_effective_date and row_end_date;

Historical value changes for MAX_TMP_TABLES (Historical changes to a specific object ):

select row_effective_date, row_end_date, current_value_flag, variable_value from historical_global_variable
where variable_name='MAX_TMP_TABLES'
order by row_effective_date;

row_effective_date  row_end_date        current variable
                                        value   value
                                        flag
2008-04-16 17:21:38 2008-04-16 17:37:23 0       36
2008-04-16 17:38:04 2508-04-16 17:38:04 1       32

Changes in the last 2 days:

select row_effective_date, row_end_date, current_value_flag, variable_name, variable_value from historical_global_variable
where row_effective_date >= date_sub(now(), interval 2 day)
order by row_effective_date;

row_effective_date  row_end_date        current variable  variable
                                        value   name      value
                                        flag
2008-06-09 12:00:00 2008-06-10 12:00:00 0       TIMESTAMP 1213027200
2008-06-10 12:00:00 2008-06-10 15:46:22 0       TIMESTAMP 1213113600
2008-06-10 15:46:22 2508-06-10 15:46:22 1       TIMESTAMP 1213127182

Deprecated Parameters (what has been removed):

select row_effective_date, row_end_date, variable_name, variable_value from historical_global_variable h
where current_value_flag = 0 and row_effective_date = ( select max(max_h.row_effective_date) from  historical_global_variable max_h where max_h.variable_name = h.variable_name);

It returned no rows for me since I haven’t upgraded since the procedure was created but you get the picture.

New Parameters (what has been Added):

select row_effective_date, row_end_date, variable_name, variable_value from historical_global_variable h
where row_effective_date = ( select min(row_effective_date) from  historical_global_variable max_h where max_h.variable_name = h.variable_name)
and row_effective_date >= date_sub(now(), interval 2 day);

This is a rare occurrence, but it can be valid if you track both columns and tables.

I have attempted to do the same with the information_schema.routines table with the same level of success, but as I said earlier, the procedure is little more daunting to write, since you must compare every column to which you wish to track changes.

It’s simple, but once written and scheduled, it’s a time saver when you need to know what changed. The next step would be to write a script that can generate the stored procedures and then we can track any change.

Let me know if you think it was a worthwhile exercise, and if there is any other topic in this area you would like me to address.

One Response to “Simple MySQL Auditing”

  • Lukas says:

    I think these days auditing is best done via mysql proxy, then again I am not sure how to prevent/control direct access to the MySQL server in the most elegant manner.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>