Linux hugepages for Oracle on Amazon EC2: Possible, but not convenient, easy or fully supported

Oct 4, 2013 / By Jeremiah Wilton


One of the optimizations available to us when running Oracle on Linux is huge page support. This feature of the Linux kernel enables processes to allocate memory pages of size 2M (instead of 4k). In addition, memory allocated using hugepages is pinned in physical memory. It cannot be swapped out.

It is now common practice to enable huge page support for Oracle databases with large SGAs (one rule of thumb is an SGA of 8G or more). Without this feature, the SGA can be, and often is, paged out. Paging out portions of the SGA can be disastrous for performance. A variety of load patterns perform particularly poorly without hugepages. Running with large numbers of processes, sudden increases in processes (connection storms), and highly concurrent access to diverse sets of SGA pages can all bring an Oracle system without hugepages to its knees.
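
As a rough sizing sketch (not part of the original test), the number of 2M pages needed to back an SGA can be worked out by ceiling division. The function name and the 8G figure are purely illustrative, and real sizing should account for every shared memory segment on the host, not just one SGA.

```python
def hugepages_needed(sga_bytes, hugepage_bytes=2 * 1024 * 1024):
    """Return how many huge pages are required to hold sga_bytes.

    hugepage_bytes defaults to 2M, the huge page size discussed in
    this article. Ceiling division rounds a partial page up.
    """
    if sga_bytes < 0 or hugepage_bytes <= 0:
        raise ValueError("sizes must be non-negative / positive")
    return -(-sga_bytes // hugepage_bytes)


if __name__ == "__main__":
    sga = 8 * 1024 ** 3  # hypothetical 8G SGA
    print("2M pages:", hugepages_needed(sga))
    print("4k pages:", hugepages_needed(sga, 4 * 1024))
```

The result is the kind of value one would write to /proc/sys/vm/nr_hugepages, usually with some headroom added.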

Considering how commonplace use of hugepages is with Oracle, it is surprising that on Amazon EC2, huge page support is not generally available, and that the systems designed expressly for running Oracle cannot use hugepages.

EC2 Virtualization Types

Under the covers, Amazon EC2 uses a hypervisor to (potentially) run many virtual machines on a given physical server. EC2 instances essentially come in two flavours of virtualization: paravirtualization (PVM) and hardware virtualization (HVM). The vast majority of EC2 AMIs use PVM, but for a variety of reasons, only EC2 instances using HVM can allocate hugepages.

Initially, I tried to enable hugepages with PVM. I was able to load a kernel with huge page support, and even enable hugepages and allocate a shared memory segment using hugepages. However, upon trying to attach to the shared memory segment, I received a kernel panic. For a simple test case, I used a C program included in the Linux kernel distribution that creates a small shared memory segment using hugepages and attaches to it: tools/testing/selftests/vm/hugepage-shm.c.


[root@ip-10-28-12-158 ~]# echo 512 >/proc/sys/vm/nr_hugepages
[root@ip-10-28-12-158 ~]# grep Huge /proc/meminfo
HugePages_Total: 512
HugePages_Free: 512
...
Hugepagesize: 2048 kB

[root@ip-10-28-12-158 ~]# strace ./hugepage-shm
execve("./hugepage-shm", ["./hugepage-shm"], [/* 22 vars */]) = 0

shmget(0x2, 268435456, IPC_CREAT|SHM_HUGETLB|0600) = 1146881
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469afd1000
write(1, "shmid: 0x118001\n", 16shmid: 0x118001) = 16
shmat(1146881, 0, 0)………

At this point, the program hangs, and the kernel panics.

However, when I try the same experiment on an identical EC2 instance running with HVM, the test completes successfully. After trying to use hugepages on a variety of operating systems and configurations on EC2, I conclude that for EC2, huge page support is only available on HVM instances.
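
The /proc/meminfo check from the session above can also be scripted. The following is a minimal sketch, assuming the field names shown in that output; the helper names `parse_hugepage_info` and `segment_fits` are hypothetical. It only reports whether a segment of a given size would fit in the free huge pages. It does not attach to anything, so it cannot detect the PVM panic described above.

```python
import re

# Values copied from the /proc/meminfo output shown in the session above.
SAMPLE_MEMINFO = """\
HugePages_Total: 512
HugePages_Free: 512
Hugepagesize: 2048 kB
"""


def parse_hugepage_info(meminfo_text):
    """Extract the huge page fields from /proc/meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"^(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


def segment_fits(info, segment_bytes):
    """Return True if segment_bytes fits in the currently free huge pages."""
    page_bytes = info["Hugepagesize"] * 1024  # the field is reported in kB
    free_bytes = info["HugePages_Free"] * page_bytes
    return segment_bytes <= free_bytes


if __name__ == "__main__":
    info = parse_hugepage_info(SAMPLE_MEMINFO)
    # 268435456 bytes is the size passed to shmget() in the strace output.
    print(info)
    print("256M segment fits:", segment_fits(info, 268435456))
```

On a live system, the text read from /proc/meminfo would be passed in place of the sample string.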

HVM is available on the following instance classes:

Class        ECUs*  Mem(G)  Price**  Oracle Licenses Required  Notes
m3.xlarge    13.0   15.0    0.50     1                         2nd Gen Standard
m3.2xlarge   26.0   30.0    1.00     2                         2nd Gen Standard
cc2.8xlarge  88.0   60.5    2.40     8                         Cluster Compute
cr1.8xlarge  88.0   244.0   3.50     8                         Memory-optimized cluster
hi1.4xlarge  35.0   60.5    3.10     4                         I/O optimized
hs1.8xlarge  35.0   117.0   4.60     4                         Storage optimized
cg1.4xlarge  33.5   22.5    2.10     4                         Cluster GPU

* Elastic Compute Units: “The equivalent CPU capacity of one 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor”
** US$ per hour on-demand in us-east-1 region

Fine, I’ll use EC2 instances with HVM. What’s the big deal?

Well, at OpenWorld in 2010, Oracle and Amazon announced that EC2 would allow customers to select Oracle VM as the underlying hypervisor for their Amazon EC2 instances. This made EC2 a platform completely supported by Oracle.

However, when Oracle and Amazon released the OVM-backed instances, they could not be run on any of the above HVM-capable instance classes. The EC2 API responds with:

Client.InvalidParameterCombination: Non-Windows instances with a virtualization type of 'hvm' are currently not supported for this instance type.

This means that if you want to use hugepages on EC2, you cannot use any of the Oracle VM-backed instances. What??? AWS and Oracle went to the trouble of making OVM available to support running Oracle software on EC2, but didn’t bother to enable it on any of the instance classes capable of hugepages, a feature that Oracle itself recommends?

We have all been over and over the question of whether to run Oracle on non-Oracle hypervisors like VMware and Xen. The policy from Oracle is something like “You can run Oracle on non-Oracle hypervisors, but if you call for support and we suspect the problem is with the hypervisor, we reserve the right to ask you to reproduce the problem on bare metal.”

Well, for Amazon Xen, the same policy applies. Essentially, we are left with a choice when running Oracle on Amazon EC2: either run efficiently with hugepages on a not-totally-supported hypervisor, or run inefficiently without hugepages on a fully supported Oracle VM-backed instance.

AWS and Oracle take note: As it stands, the OVM offering is not set up in a way that enables real enterprise use of EC2 by Oracle customers. This is a major oversight that needs to be addressed.
