The importance of Oracle database related kernel parameters (aio-max-nr) (with bonus track)
We tend to configure a server following Oracle best practices for the kernel parameters, but what once was a server of a couple instances, soon becomes a huge and heavy environment with four, six or ten instances running on it and more applications connected that use more and more connections. It’s important to review these parameters to ensure they still fit with our needs as systems evolve with time.
We will see now how even the simplest task can trigger a situation where a limit on a kernel parameter is hit and a server restart may be required.
This was a small task that involved clearing a schema in one database and re-populating it with an empty skeleton from a different one. Piece of cake, clean up the target schema, export the metadata of the source and import into the target. The end… or not.
Everything went well until the moment came when we had to export the metadata from the source database. Like any good modern DBA, our tool of choice for this is Data Pump so a nice parameter file is created, everything is in place, schema is very small, this should be quick…
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.FETCH_XML_OBJECTS [OBJECT_GRANT:"SOME OBJECT"] ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Interesting error message, let’s take a look into My Oracle Support a.k.a. MOS:
“ORA-27090 – Unable to Reserve Kernel Resources for Asynchronous Disk I/O (Doc ID 579108.1)”
The note says that the error message should appear in the alert log.
Is it there? Fun fact, no, there is no such error message in the database I am trying to export the metadata from, but it appears on the target database, the one which has just been cleared.
So the note says that the “aio-max-nr” kernel limit is too low. We haven’t seen this ORA-27090 before until the Data Pump was attempted, it may well have been the “straw that broke the camel’s back”.
There is also a nice bug “Bug 24425998 ORA-27090: ‘Unable to reserve kernel resources for asynchronous disk I/O’ Error in Alert Log” fixed in 18.1, my first one, but doesn’t seem to be the case that a query is triggering the issue.
So we have the problem with the Data Pump, the error in the alert log and a, supposedly, low value of “aio-max-nr”.
What is this “aio-max-nr” kernel parameter and why do I care?
According to Linux documentation:
aio-nr & aio-max-nr: aio-nr is the running total of the number of events specified on the io_setup system call for all currently active aio contexts. If aio-nr reaches aio-max-nr then io_setup will fail with EAGAIN. Note that raising aio-max-nr does not result in the pre-allocation or re-sizing of any kernel data structures.
OK but, what are these AIO and how to they matter to my Oracle database?
Basically, what matters is that AIOs are used by an Oracle database to avoid the overhead of asking the OS to do the IO and access directly to disk, without having to wait for the IO to finish (you get it: asynchronous). This is the default behaviour when ASM is in use so this parameter takes even more precedence in a RAC environment like the one we are talking here.
Let’s have a look at the current value in our system:
[oracle@racnode1 ~]$ sysctl -a | grep aio fs.aio-max-nr = 1048576 fs.aio-nr = 1048451
And from the documentation for a 12.1 installation:
aio-max-nr 1048576 Note: This value limits concurrent outstanding requests and should be set to avoid I/O subsystem I/O subsystem failures.
But wait, what with our current value? why is it not big enough?
Our current value matches the documentation, what’s wrong with it?
Back to the docs we notice the following:
The kernel parameter and shell limit values in this section are minimum values only. For production database systems, Oracle recommends that you tune these values to optimize the performance of the system. Refer to your operating system documentation for more information about tuning kernel parameters.
Aha! Oracle only provides minimum values that must be adjusted to the real system.
Back to MOS we find the following note: “What value should kernel parameter AIO-MAX-NR be set to? (Doc ID 2229798.1)”. This note includes a formula to calculate the correct value for the aio-max-nr kernel parameter. Due to licensing issues I can’t post the formula here, we should not post MOS content publicly, etc. so let’s just say that the correct value for our environment, with 8 instances + ASM + MGMTDB running on this node is 8196096, way bigger than the original one.
This is a value calculated on the fly, with all the instances running, lots of connections established and people working on the system, so it should be quite accurate.
Of course, when we are deploying a new system we don’t have everything in place, so I’d recommend to review the mentioned MOS note (2229798.1) and use your estimations to set a value high enough. This value is calculated according to the number of instances and the number of processes per instance so, the higher these figures, the higher the parameter.
Obviously there is caveat, we cannot simply set a value close to infinite because we are limited by the kernel memory. About this, we have an extreme case in one of Marc Fielding’s post here showing a very high value being set even higher, so there is plenty of room to set the parameter as big as we may need.
In any case, based on my experience and given the little information I’ve found on the Internet about this very same issue, the default recommended value should be enough for most of the systems running out there.
So, being a kernel parameter we have been lucky and applied the change during an already requested maintenance window that included a server reboot. Once the parameter got changed, we were able to export the metadata and complete the task. Well, actually we would, but we had done it already using good old EXP/IMP tool.
Finally the bonus track: the server load came down from 12 to around 5.5 after the reboot. Is this due to the change in the kernel parameter or just the reboot that cleared some hanging database sessions? (This is a development environment so lots of stuff is going on with little control beyond the infrastructure changes we do). We don’t have evidence for one thing or the other so we will leave it here.
So, have you encountered a similar situation? How common is the task of reviewing the system configuration to ensure it still fits our needs?