Secure Resource Partitions (Partitioning Inside a Single Copy of HP-UX)
Resource partitioning has been integrated with the HP-UX kernel since version 9.0. Over the years, HP has gradually increased the functionality; today you can provide a remarkable level of isolation between applications running in a single copy of HP-UX. The current version provides both resource isolation, which has been there from the beginning of resource partitions, and security isolation, the newest addition. Figure 2-18 shows how the resource isolation capabilities allow multiple applications to run in a single copy of HP-UX while ensuring that each partition gets its share of resources.
Figure 2-18 Resource Partitioning in HP-UX
Within a single copy of HP-UX, you have the ability to create multiple partitions. To each partition you can:
- Allocate a CPU entitlement using whole-CPU granularity (processor sets) or sub-CPU granularity (fair share scheduler)
- Allocate a block of memory
- Allocate disk I/O bandwidth
- Assign a set of users and/or application processes that should run in the partition
- Create a security compartment around the processes that ensures that processes in other compartments can't communicate with or send signals to the processes in this Secure Resource Partition
One unique feature of HP's implementation of resource partitions is that inside the HP-UX kernel, we instantiate multiple copies of the memory management subsystem and multiple process schedulers. This ensures that if an application runs out of control and attempts to allocate excessive amounts of resources, the system will constrain that application. For example, when we allocate four CPUs and 8GB of memory to Partition 0 in Figure 2-18, if the application running in that partition attempts to allocate more than 8GB of memory, it will start to page, even if there is 32GB of memory on the system. Similarly, the processes running in that partition are scheduled on the four CPUs that are assigned to the partition. No processes from other partitions are allowed to run on those CPUs, and processes assigned to this partition are not allowed to run on the CPUs that are assigned to the other partitions. This guarantees that if a process running in any partition spins out of control, it can't impact the performance of any application running in any other partition.
A new feature of HP-UX is security containment. This is really the migration of functionality available in HP VirtualVault for many years into the standard HP-UX kernel. This is being done in a way that allows customers to choose which of the security features they want to be activated individually. The security-containment feature allows users to ensure that processes and applications running on HP-UX can be isolated from other processes and applications. Specifically, it is possible to erect a boundary around a group of processes that insulates those processes from IPC communication with the rest of the processes on the system. It is also possible to define access to file systems and network interfaces. This feature is being integrated with PRM to provide Secure Resource Partitions.
The resource controls available with Secure Resource Partitions include:
- CPU controls: You can allocate a CPU to a partition with sub-CPU granularity using the fair share scheduler (FSS) or with whole-CPU granularity using processor sets.
- Real memory: Shares of the physical memory on the system can be allocated to partitions.
- Disk I/O bandwidth: Shares of the bandwidth to any volume group can be allocated to each partition.
More details about what is possible and how these features are implemented are provided below.
A CPU can be allocated to Secure Resource Partitions with sub-CPU granularity or whole-CPU granularity. Both of these features are implemented inside the kernel. The sub-CPU granularity capability is implemented by the FSS.
The fair share scheduler is implemented as a second level of time-sharing on top of the standard HP-UX scheduler. The FSS allocates a CPU to each partition in large 10ms time ticks. When a particular partition gets access to a CPU, the process scheduler for that partition analyzes the process run queue for that partition and runs those processes using standard HP-UX process-scheduling algorithms.
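The two levels described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the HP-UX implementation: the function name, the credit-based way of picking the next partition, and the round-robin stand-in for the per-partition scheduler are all hypothetical, and every partition is assumed to always have at least one runnable process.

```python
from collections import Counter, deque

TICK_MS = 10  # the text describes 10 ms FSS time ticks


def simulate_fss(shares, run_queues, ticks):
    """Return a list of (partition, process), one entry per tick.

    Level 1 (assumed): each tick, every partition earns credit equal to its
    shares; the partition with the most credit gets the CPU and pays back
    the total number of shares.
    Level 2 (assumed): inside the chosen partition, processes simply take
    turns, standing in for the standard HP-UX scheduling algorithms.
    """
    credit = {name: 0 for name in shares}
    queues = {name: deque(procs) for name, procs in run_queues.items()}
    total = sum(shares.values())
    timeline = []
    for _ in range(ticks):
        for name, share in shares.items():
            credit[name] += share
        owner = max(credit, key=credit.get)
        credit[owner] -= total
        proc = queues[owner].popleft()  # next process in this partition
        queues[owner].append(proc)      # goes to the back of its queue
        timeline.append((owner, proc))
    return timeline


timeline = simulate_fss(
    shares={"A": 50, "B": 25, "C": 25},
    run_queues={"A": ["a1", "a2"], "B": ["b1"], "C": ["c1"]},
    ticks=8,
)
ticks_per_partition = Counter(owner for owner, _ in timeline)
```

Over eight ticks (80 ms of CPU time), partition A receives four ticks and B and C receive two each, which matches the 50/25/25 split of shares.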
CPU allocation via processor sets (PSETs) is quite different in that CPU resources are allocated to each of the partitions on whole CPU boundaries. What this means is that you assign a certain number of whole CPUs to each partition rather than a share of them. The scheduler in the partition will then schedule the processes that are running there only on the CPUs assigned to the partition. This is illustrated in Figure 2-19.
Figure 2-19 CPU Allocation via Processor Sets Assigns Whole CPUs to Each Partition
The configuration shown in Figure 2-19 shows the system split into three partitions. Two will run Oracle instances and the other partition runs the rest of the processing on the system. This means that the Oracle processes running in partition 1 will run on the two CPUs assigned to that partition. These processes will not run on any other CPUs in the system, nor will any processes from the other partitions run on these two CPUs.
Comparing FSS to PSETs is best done using an example. If you have an eight-CPU partition that you wish to assign to three workloads with 50% going to one workload and 25% going to each of the others, you have the option of setting up PSETs with the configuration illustrated in Figure 2-19 or setting up FSS groups with 50, 25, and 25 shares. The difference between the two is that the processes running in partition 1 will either get 100% of the CPU cycles on two CPUs or 25% of the cycles on all eight CPUs.
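The arithmetic behind this comparison can be written out as a short sketch. The function names and the returned fields are hypothetical and only illustrate the example in the text: the same entitlement is delivered either as all of a few CPUs or as a fraction of every CPU.

```python
def pset_view(cpus_assigned):
    """A PSET partition sees only its own CPUs and gets all of their cycles."""
    return {
        "cpus_visible": cpus_assigned,
        "fraction_of_each_cpu": 1.0,
        "cpu_equivalents": float(cpus_assigned),
    }


def fss_view(total_cpus, shares, total_shares):
    """An FSS group can run on every CPU but gets only its share of each."""
    fraction = shares / total_shares
    return {
        "cpus_visible": total_cpus,
        "fraction_of_each_cpu": fraction,
        "cpu_equivalents": total_cpus * fraction,
    }


# The 25% workload from the example on an eight-CPU system.
as_pset = pset_view(cpus_assigned=2)
as_fss = fss_view(total_cpus=8, shares=25, total_shares=100)
```

Both views work out to two CPUs' worth of cycles. The difference is how many CPUs the workload's processes can be running on at the same moment.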
In Figure 2-19, we see that each of the partitions in this configuration also has a block of memory assigned. This is optional, but it provides another level of isolation between the partitions. HP-UX 11i introduced a new memory-control technology called memory resource groups, or MRGs. This is implemented by providing a separate memory manager for each partition, all running in a single copy of the kernel. This provides a very strong level of isolation between the partitions. As an example, if PSET partition 1 above was allocated two CPUs and 4GB of memory, the memory manager for partition 1 will manage the memory allocated by the processes in that partition within the 4GB that was assigned. If those processes attempt to allocate more than 4GB, the memory manager will start to page out memory to make room, even though there may be 16GB of memory available in the partition.
The default behavior is to allow unused memory to be shared between the partitions. In other words, if the application in partition 1 is only using 2GB of its 4GB entitlement, then processes in the other partitions can "borrow" the available 2GB. However, as soon as processes in partition 1 start to allocate additional memory, the memory that was loaned out will be retrieved. There is an option on MRGs that allows you to "isolate" the memory in a partition. What that means is that the 4GB assigned to the partition will not be loaned out and the partition will not be allowed to borrow memory from any of the other partitions either.
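The borrowing and isolation rules can be modeled in a few lines. This is only a sketch of the policy as described here; the class, field, and function names are hypothetical and do not correspond to any PRM or kernel interface.

```python
from dataclasses import dataclass


@dataclass
class MemoryGroup:
    name: str
    entitlement_gb: float
    in_use_gb: float
    isolated: bool = False  # isolated groups neither lend nor borrow


def borrowable_gb(groups, borrower_name):
    """Memory the borrower could use beyond its own entitlement right now."""
    borrower = next(g for g in groups if g.name == borrower_name)
    if borrower.isolated:
        return 0.0
    return sum(
        max(g.entitlement_gb - g.in_use_gb, 0.0)
        for g in groups
        if g.name != borrower_name and not g.isolated
    )


# Partition 1 uses only 2GB of its 4GB entitlement; partition 2 is full.
shared = [MemoryGroup("partition1", 4, 2), MemoryGroup("partition2", 4, 4)]
loanable = borrowable_gb(shared, "partition2")

# The same situation with partition 1 marked as isolated.
walled = [MemoryGroup("partition1", 4, 2, isolated=True),
          MemoryGroup("partition2", 4, 4)]
loanable_when_isolated = borrowable_gb(walled, "partition2")
```

In the first case partition 2 can borrow the idle 2GB; in the second case nothing is available to it, because partition 1's memory is never loaned out.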
Disk I/O Controls
HP-UX supports disk I/O bandwidth controls for both LVM and VxVM volume groups. You set this up by assigning each partition a share of the bandwidth to each volume group. LVM and VxVM each call a routine provided by PRM that reshuffles the I/O queues to ensure that the bandwidth to the volume group is allocated in the ratios assigned. For example, if partition 1 has 50% of the bandwidth, the queue will be shuffled to ensure that every other I/O request comes from processes in that partition.
One thing to note here is that because this is implemented by shuffling the queue, the controls are active only when a queue is building, which happens when there is contention for I/O. This is probably what you want. It normally doesn't make sense to constrain the bandwidth available to one application when that bandwidth would go to waste if you did.
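A toy version of share-based queue shuffling is sketched below. It is an illustration only: the function name and the credit-based ordering are assumptions, not the routine that PRM provides to LVM and VxVM. It does show the property described above, namely that the reordering has an effect only when requests from more than one partition are waiting.

```python
from collections import deque


def shuffle_io_queue(pending, shares):
    """Reorder pending (partition, request) pairs in proportion to shares.

    Requests from the same partition keep their arrival order. Only
    partitions that currently have requests waiting compete for a slot.
    """
    per_partition = {}
    for partition, request in pending:
        per_partition.setdefault(partition, deque()).append(request)
    credit = {partition: 0 for partition in per_partition}
    ordered = []
    while per_partition:
        total = sum(shares[p] for p in per_partition)
        for p in per_partition:
            credit[p] += shares[p]
        chosen = max(per_partition, key=lambda p: credit[p])
        credit[chosen] -= total
        ordered.append((chosen, per_partition[chosen].popleft()))
        if not per_partition[chosen]:
            del per_partition[chosen]  # nothing left waiting for this partition
    return ordered


shares = {"p1": 50, "p2": 25, "p3": 25}
pending = [("p2", "r1"), ("p2", "r2"), ("p1", "r3"), ("p1", "r4"),
           ("p3", "r5"), ("p3", "r6"), ("p1", "r7"), ("p1", "r8")]
ordered = shuffle_io_queue(pending, shares)

# With a single partition in the queue there is no contention to arbitrate.
uncontended = shuffle_io_queue([("p2", "r1"), ("p2", "r2")], shares)
```

In the contended case, half of the slots in the reordered queue go to p1, in line with its 50% share; the uncontended queue comes back unchanged.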
The newest feature added to resource partitions is security containment. With the introduction of security containment in HP-UX 11i V2, we have integrated some of this functionality with resource partitions to create Secure Resource Partitions. There are three major features of the security containment product:
- Secure compartments
- Fine-grained privileges
- Role-based access control
These features have been available in secure versions of HP-UX and Linux but have now been integrated into the base HP-UX in a way that allows them to be optionally activated. Let's look at each of these in detail.
The purpose of compartments is to allow you to provide control of the interprocess communication (IPC), device, and file accesses from a group of processes. This is illustrated in Figure 2-20.
Figure 2-20 Security Compartments Isolate Groups of Processes from Each Other
The processes in each compartment can freely communicate with each other and can freely access files and directories assigned to the partition, but no access to processes or files in other compartments is permitted unless a rule has been defined that allows that specific access. Additionally, the network interfaces, including pseudo-interfaces, are assigned to a compartment. Communication over the network is restricted to the interfaces in the local compartment unless a rule is defined that allows access to an interface in another compartment.
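The default-deny behavior with explicit exceptions can be sketched as a simple lookup. The compartment names, the rule tuples, and the function name are hypothetical; this is not the compartment rule syntax used by HP-UX.

```python
def access_allowed(source, target, kind, rules):
    """Allow access inside a compartment; across compartments, require a rule.

    rules is a set of (source, target, kind) tuples, where kind is a label
    such as "ipc", "file", or "network".
    """
    if source == target:
        return True
    return (source, target, kind) in rules


# One explicit rule: the web compartment may open IPC to the db compartment.
rules = {("web", "db", "ipc")}

same_compartment = access_allowed("web", "web", "file", rules)
rule_granted = access_allowed("web", "db", "ipc", rules)
reverse_direction = access_allowed("db", "web", "ipc", rules)
other_kind = access_allowed("web", "db", "file", rules)
```

In this model a rule is directional and specific to one kind of access, so granting IPC from web to db opens neither the reverse direction nor file access.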
Traditional HP-UX provided very basic control of special privileges, such as overriding permission to access files. Generally speaking, the root user had all privileges and other users had none. With the introduction of security containment, the privileges can now be assigned at a very granular level. There are roughly 30 separate privileges that you can assign.
The combination of these fine-grained privileges and the role-based access control we discuss in the next section allows you to assign specific privileges to specific users when running specific commands. This provides the ability to implement very detailed security policies. Keep in mind, though, that the more security you wish to impose, the more time will be spent getting the configuration set up and tested.
Role-Based Access Controls (RBAC)
In many very secure environments, customers require the ability to cripple or remove the root user from the system. This ensures that if there is a successful break-in to the system and an intruder gains root access, he or she can do little or no damage. To provide this, HP has implemented role-based access control in the kernel. This is integrated with the fine-grained privileges so that it is possible to define a "user admin" role as someone who has the ability to create directories under /home and edit the /etc/passwd file. You can then assign one or more of your system administrators to the "user admin" role, and they will be able to create and modify user accounts, and nothing more, without having to know the root password.
This is implemented by defining a set of authorizations and a set of roles that have those authorizations against a specific set of objects. Another example would be giving a printer admin authorization to start or stop a particular print queue.
Implementing these using roles makes it much easier to maintain the controls over time. As users come and go, they can be removed from the list of users who have a particular role, but the role is still there and the other users are not impacted by that change. You can also add another object to be managed, like another print queue, and add it to the printer admin role and all the users with that role will automatically get that authorization; you will not have to add it to every user. A sample set of roles is shown in Figure 2-21.
Figure 2-21 A Simple Example of Roles Being Assigned Authorizations
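The maintenance benefit of roles can be seen in a minimal model. The class and method names below are hypothetical and are not the HP-UX RBAC commands or database format; the sketch only shows that authorizations attach to roles, so users inherit changes automatically.

```python
from collections import defaultdict


class RoleRegistry:
    """Toy RBAC model: users hold roles, and roles hold authorizations."""

    def __init__(self):
        self.role_auths = defaultdict(set)  # role -> {(operation, object)}
        self.user_roles = defaultdict(set)  # user -> {role}

    def grant(self, role, operation, obj):
        self.role_auths[role].add((operation, obj))

    def assign(self, user, role):
        self.user_roles[user].add(role)

    def unassign(self, user, role):
        self.user_roles[user].discard(role)

    def is_authorized(self, user, operation, obj):
        return any(
            (operation, obj) in self.role_auths.get(role, ())
            for role in self.user_roles.get(user, ())
        )


rbac = RoleRegistry()
rbac.grant("printer admin", "start", "queue1")
rbac.assign("alice", "printer admin")
rbac.assign("bob", "printer admin")

# Adding a new print queue to the role reaches every user who holds it.
rbac.grant("printer admin", "start", "queue2")
bob_can_start_queue2 = rbac.is_authorized("bob", "start", "queue2")

# Removing one user from the role does not affect the others.
rbac.unassign("alice", "printer admin")
alice_still_allowed = rbac.is_authorized("alice", "start", "queue1")
bob_still_allowed = rbac.is_authorized("bob", "start", "queue1")
```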
Secure Resource Partitions
An interesting perspective on Secure Resource Partitions is that they are really a set of technologies embedded in the HP-UX kernel. These include FSS and PSETs for CPU control, memory resource groups for memory controls, LVM and VxVM for disk I/O bandwidth control, and security containment for process communication isolation.
The product that makes it possible to define Secure Resource Partitions is Process Resource Manager (PRM). All of the other technologies allow you to control a group of processes running on an HP-UX instance. What PRM does is make it much easier for you to define the controls for any or all of them on the same set of processes. You do this by defining a group of users and/or processes, called a PRM group, and then assigning CPU, memory, disk I/O, and security entitlements for that group of processes. Figure 2-22 provides a slightly modified view of Figure 2-18, which includes the security isolation in addition to the resource controls.
Figure 2-22 A Graphical Representation of Resource Partitions with the Addition of Security Controls
This diagram illustrates the ability to control both resources and security containment with a single solution. One point to make about PRM is that it doesn't yet allow the configuration of all the features of the underlying technology. For example, PRM controls groups of processes, so it doesn't provide the ability to configure the role-based access control features of the security-containment technology. It does, however, allow you to define a compartment for the processes to run in and will also allow you to assign one or more network interfaces to each partition if you define the security features.
The default behavior of security compartments is that processes will be able to communicate with any process running in the same compartment but will not be able to communicate with any processes running in any other compartment. However, file access uses standard file system security by default. This is done to ensure that independent software vendor applications will be able to run in this environment without modifications and without requiring the user to configure potentially complex file-system security policies. If you are interested in tighter file-system security and are willing to configure it, there are facilities that allow you to do that. For network access, you can assign multiple pseudo-LAN interfaces (e.g., lan0, lan1, etc.) to a single physical network interface card. This gives you the ability to have more pseudo-interfaces and IP addresses than real interfaces. This is nice for security compartments and SRPs because you can create at least one pseudo-interface for each compartment, allowing each compartment to have its own set of IP addresses. The network interface code in the kernel has been modified to ensure that no two pseudo-interfaces can see each other's packets even if they are using the same physical interface card.
The security integration into PRM for Secure Resource Partitions uses the default compartment definitions, with the exception of network interface rules. Most modern applications require network access, so this was deemed a requirement. When using PRM to define an SRP, you have the ability to assign at least one pseudo-interface to each partition, along with the resource controls discussed earlier in this section.
User and Process Assignment
Because all the processes running in all the SRPs are running in the same copy of HP-UX, it is critical to ensure that users and processes get assigned to the correct partition as they come and go. In order to simplify this process across all the SRP technologies, PRM provides an application manager. This is a daemon that is configured to know what users and applications should be running in each of the defined SRPs.
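The kind of lookup such a daemon performs can be sketched as follows. The record layout, the function name, the precedence of application records over user records, and the fallback partition name are all assumptions made for illustration; they are not taken from the PRM configuration format.

```python
def choose_partition(user, executable, app_records, user_records,
                     default="OTHERS"):
    """Pick the partition a process belongs in.

    Assumed precedence: a record for the executable wins over a record for
    the user; anything unmatched lands in a default partition.
    """
    if executable in app_records:
        return app_records[executable]
    return user_records.get(user, default)


# Hypothetical configuration: paths and user names are examples only.
app_records = {"/opt/oracle/bin/oracle": "oracle1"}
user_records = {"dbadmin": "oracle1", "webuser": "web"}

by_application = choose_partition("webuser", "/opt/oracle/bin/oracle",
                                  app_records, user_records)
by_user = choose_partition("webuser", "/usr/bin/sh",
                           app_records, user_records)
unmatched = choose_partition("guest", "/usr/bin/sh",
                             app_records, user_records)
```

A daemon built on this idea would rerun the lookup as processes start and move any that are found in the wrong partition.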
Resource Partition Integration with HP-UX
Because resource partitioning and PRM were introduced in HP-UX in 1995, this technology is thoroughly integrated with the operating system. HP-UX functions and tools such as fork(), exec(), cron, at, login, ps, and GlancePlus are all integrated and will react appropriately if Secure Resource Partitions are configured. For example:
- Login will query the PRM configuration for user records and will start the user's shell in the correct partition based on that configuration.
- The ps command has two command-line options, -P and -R, which will either show the PRM partition each displayed process is in or show only the processes in a particular partition.
- GlancePlus will group the many statistics it collects for all the processes running in each partition. You can also use the GlancePlus user interface to move a process from one partition to another.
The result is that you get a product that has been enhanced many times over the years to provide a robust and complete solution.
More details on Secure Resource Partitions, including examples of how to configure them, will be provided in Chapter 11, "Secure Resource Partitions."