Oracle9i Administrator's Reference Release 1 (9.0.1) for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel and Sun SPARC Solaris Part Number A90347-02
This appendix contains tuning information for Oracle9i on AIX.
Memory contention occurs when processes require more memory than is available. To cope with the shortage, the system pages programs and data between memory and disks.
Excessive paging activity decreases performance substantially. This can become a problem with database files created on the Journaled File System (JFS). In this situation, a large number of SGA data buffers might also have analogous JFS buffers containing the most frequently referenced data. The behavior of the AIX file buffer cache manager can have a significant impact on performance. It can cause an I/O bottleneck, resulting in lower overall system throughput.
On AIX, tuning buffer-cache paging activity is possible, but it must be done carefully and infrequently. Use the vmstat command to monitor paging activity while you tune the following AIX system parameters: MINFREE, the size of the free list below which the VMM begins stealing pages to replenish it, and MAXFREE, the size of the free list at which page stealing stops.
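As a hedged illustration (the column names assume the standard AIX vmstat output), you can observe paging behavior before making any change by sampling statistics every five seconds:

$ vmstat 5

The fre column shows the current size of the free list, and sustained nonzero values in the pi and po columns indicate paging in and out of paging space.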
Log on as the root user and use the vmtune command to change these limits. The AIX vmtune command is operating-system version specific. If you run the vmtune command from one release on a different AIX release, the operating system might fail.
You can base the size of the MINFREE parameter on the working set size of the programs that require a quick response. You must have enough pages in the free list so that a program being loaded does not have to wait for the free list to be replenished. You can determine the size of the working set for a program by using the svmon -p command. The value of the MAXFREE parameter should be greater than the value of the MINFREE parameter by the value of the MAXPGAHEAD parameter or 8 pages, whichever is higher. For instance, if you determine that the MINFREE parameter value should be 128 pages and the MAXPGAHEAD parameter value is 16 pages, enter the following command to set the value of the MINFREE parameter to 128 pages and the value of the MAXFREE parameter to 144 pages:
# /usr/lpp/bos/samples/vmtune -f 128 -F 144
The purpose of the AIX file buffer cache is to reduce disk access frequency when the JFS file system is used. If this cache is too small, disk usage increases and potentially saturates one or more disks. If the cache is too large, memory is wasted.
See Also: For the implications of increasing the AIX file buffer cache, see "Controlling Buffer-Cache Paging Activity".
You can configure the AIX file buffer cache by adjusting the MINPERM and MAXPERM parameters. In general, if the buffer hit ratio is low (less than 90 percent), as determined by the sar -b command, increasing the MINPERM parameter value might help. If maintaining a high buffer hit ratio is not critical, decreasing the MINPERM parameter value increases the physical memory available. Refer to your AIX documentation for more information on increasing the size of the AIX file buffer cache.
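For example (a hedged sketch; the %rcache column is part of the standard sar -b output), sample buffer activity every 10 seconds for one minute and read the hit ratio from the %rcache column:

$ sar -b 10 6

A read cache hit ratio consistently below about 90 percent suggests that increasing the MINPERM parameter value might help.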
The performance gain cannot be quantified easily, because it depends on the degree of multiprogramming and the I/O characteristics of the workload.
AIX provides a mechanism for you to loosely control the ratio of page frames used for files versus those used for computational (working or program text) segments by adjusting the MINPERM and MAXPERM values according to the following guidelines:
Use the following algorithm to calculate the default values:
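The exact formulas are release specific. As a rough, hedged approximation based on the 20 percent and 80 percent defaults noted below, first determine real memory in KB:

$ lsattr -El sys0 -a realmem

Dividing the reported value by 4 gives the total number of 4 KB page frames; roughly 20 percent of that total corresponds to the default MINPERM value and roughly 80 percent to the default MAXPERM value.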
Use the following command to change the value of the MINPERM parameter to 5 percent of the total number of page frames, and the value of the MAXPERM parameter to 20 percent of the total number of page frames:
# vmtune -p 5 -P 20
The default values are 20 percent and 80 percent, respectively.
If the database files are on raw devices, you can set the MINPERM and MAXPERM parameters to low values, for example 5 percent and 20 percent, respectively. This is because the AIX file buffer cache is not used for raw devices. The memory might be better used for other purposes, such as the Oracle System Global Area.
Inadequate paging space usually causes the system to hang or suffer abnormally slow response times. On AIX, you can dynamically add paging space on raw disk partitions. The amount of paging space you should configure depends on the amount of physical memory present and the paging space requirements of your applications. Use the lsps command to monitor paging space use and the vmstat command to monitor system paging activities.
You should configure the amount of paging space to be two or three times the amount of physical memory for most workloads. Use a higher value if you plan to use Oracle development tools or Oracle Applications. Monitor the use of paging space and increase it as necessary. To increase the paging space, use the smit pgsp command. For systems with very large physical memory (more than 1 GB), one and a half times the physical memory might be enough paging space.
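As a hedged sketch of checking this guideline (both commands are standard AIX), compare the configured paging space with real memory:

$ lsps -s
$ lsattr -El sys0 -a realmem

The lsps -s command reports the total paging space and the percentage in use; lsattr reports real memory in KB.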
Constant and excessive paging indicates that the real memory is over-committed. In general, you should determine which system components can benefit most from additional memory and assign any spare memory where it provides the greatest benefit.
For instance, if you query Oracle dynamic performance tables and views and find that both the shared pool and database buffer cache require more memory, assigning the limited spare memory to the shared pool might be more beneficial than assigning it to the database block buffer cache.
The following AIX commands provide paging status and statistics:
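For example, the following standard AIX commands (a hedged selection) report paging status and statistics:

$ lsps -a
$ vmstat -s
$ svmon -G

The lsps -a command reports the utilization of each paging space, vmstat -s prints cumulative paging statistics, and svmon -G gives a global snapshot of memory and paging space use.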
You can configure the Oracle database block size for better I/O throughput. On AIX, you can set the value of the DB_BLOCK_SIZE initialization parameter to between 2 KB and 16 KB, with a default of 4 KB. If the Oracle database is JFS-based, then the block size should be a multiple of the JFS block size (4 KB on AIX). For databases on raw partitions, the Oracle database block size should be a multiple of the operating system physical block size (512 bytes on AIX).
Oracle Corporation recommends smaller Oracle database block sizes (2 KB or 4 KB) for online transaction processing (OLTP) or mixed workload environments and larger block sizes (8 KB or 16 KB) for decision support system (DSS) workload environments.
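A hedged example of an initialization parameter file entry for an OLTP database on JFS (the value is illustrative, and the block size cannot be changed after the database is created):

DB_BLOCK_SIZE = 4096

A DSS database on raw partitions might instead use 8192 or 16384, keeping the value a multiple of the 512-byte physical block size.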
You can improve the speed of archiving the database by dedicating larger buffers. However, over-configuring can degrade overall system performance. You can increase the LOG_ARCHIVE_BUFFER_SIZE initialization parameter to 128 KB. The default value for the LOG_ARCHIVE_BUFFER_SIZE parameter is 4 KB.
Tune this parameter carefully so that overall performance of normal database activity does not degrade drastically. Performance improvements in the archiving process of 0 to 20 percent are possible. Some users report even larger improvements.
The constraining factor in using multiple shared memory segments for the SGA is the amount of physical memory that current hardware architectures can support. You do not want the Oracle SGA paged out because of inadequate main memory. For a large number of simultaneous database users, larger SGAs are recommended.
The initialization parameters that have the greatest effect on the SGA size are DB_BLOCK_BUFFERS and SHARED_POOL_SIZE. How much you increase these parameters depends on the amount of memory you can allocate to the SGA. There is a performance overhead associated with large values for the DB_BLOCK_BUFFERS and SHARED_POOL_SIZE parameters, so lowering these values can reduce CPU usage. You can increase or decrease the size of the Oracle9i buffer cache and shared pool by modifying the DB_BLOCK_BUFFERS and SHARED_POOL_SIZE initialization parameters and restarting the instance.
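A hedged example of an initialization parameter file fragment (the values are illustrative only and must be sized to the memory available on your system):

DB_BLOCK_BUFFERS = 50000
SHARED_POOL_SIZE = 150000000

With a 4 KB database block size, this example allocates roughly 200 MB to the buffer cache and about 150 MB to the shared pool, which is specified in bytes.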
For high-speed data loading, such as using the SQL*Loader direct path option in addition to loading data in parallel, the CPU spends most of its time waiting for I/O to complete. By increasing the number of buffers, you can usually push the CPU usage harder, thereby increasing overall throughput.
The number of buffers (set by the SQL*Loader BUFFERS parameter) you choose depends on the amount of available memory and how hard you want to push CPU usage. Refer to Oracle9i Database Utilities for information on adjusting the file processing options string for the BUFFERS parameter.
The performance gains depend on CPU usage and the degree of parallelism that you use when loading data.
The BUFFER parameter for the Import utility should be set to a large value to optimize the performance of high-speed networks when they are used. For instance, if you use the IBM RS/6000 Scalable POWERparallel Systems (SP) switch, you should use a value of at least 1 MB.
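A hedged example (the user name, password, and file name are placeholders) of invoking the Import utility with a 1 MB buffer, specified in bytes:

$ imp userid=scott/tiger file=expdat.dmp full=y buffer=1048576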
Disk I/O contention can result from poor memory management (with subsequent paging and swapping), or poor distribution of tablespaces and files across disks. Spread the I/O load evenly across all disks.
The AIX Logical Volume Manager (LVM) can stripe data across multiple disks to reduce disk contention. The primary objective of striping is to achieve high performance when reading and writing large sequential files. Effective use of the striping features in the LVM allows you to spread I/O more evenly across disks, resulting in greater overall performance.
When you define a striped logical volume, you must specify the following items:
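As a hedged illustration only (the volume group, logical volume name, number of logical partitions, and disk names are placeholders; verify the mklv options for your AIX release), a logical volume striped across four disks with a 64 KB stripe unit might be created as follows:

# mklv -y oradata_lv -S 64K -u 4 datavg 32 hdisk1 hdisk2 hdisk3 hdisk4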
Use the vmtune command to adjust the MINPGAHEAD, MAXPGAHEAD, and MAXFREE parameters to achieve the highest sequential I/O throughput. See the AIX Performance Tuning Guide, Version 3.2 and 4 for the recommended settings.
Make sure that the I/O activity is evenly distributed across multiple disk drives by using AIX utilities such as filemon, sar, iostat, and other performance tools to identify any disks with high I/O activity.
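For example (a hedged sketch), sample disk activity every 5 seconds for 10 intervals and look for disks whose activity is disproportionately high:

$ iostat 5 10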
Performance gains from effective use of the LVM can vary greatly, depending on the LVM you use and the characteristics of the workload. For DSS workloads, you can see substantial improvement. For OLTP-type or mixed workloads, you can still expect significant performance gains.
Note the following considerations when you are deciding whether to use the JFS file system or raw partitions:
The degree of performance improvement largely depends on the I/O characteristics of the workload.
If you use the JFS file system, database files are easier to manage and maintain than if you use raw devices. However, because JFS currently supports only buffered reads and writes, every time data is transferred between the I/O subsystem and the Oracle buffer cache (SGA), the data is also copied into the AIX file buffer cache. This is the main drawback of using the JFS file system, and the penalty becomes greater as the I/O subsystem is more heavily exercised.
Also, on heavily loaded systems, contention for the AIX file buffer cache becomes the main bottleneck. One side effect is increased activity of the pagedaemon process, which flushes data from the AIX file buffer cache to disk.
Use raw devices or volumes for Oracle files only if your site has at least as many raw disk partitions as Oracle datafiles. If the raw disk partitions are already formatted, match datafile sizes to the partition sizes as closely as possible to avoid wasting space. In a heavily used database configuration, use raw devices.
To move from the JFS file system to raw devices without the need to manually reload all of the data, enter the following command as the root user:
# dd if=old_JFS_file of=new_raw_device seek=1 bs=4k
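In this command, seek=1 with bs=4k leaves the first 4 KB block of the raw device untouched; as noted in the section on backing up raw devices, that block is used by the AIX Logical Volume Manager.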
Ensure that you follow these guidelines:
You might experience a slight degradation in sequential read performance. If the application is performing many full table scans, you might experience a server performance degradation on these full table scans, because the AIX file buffer cache is no longer used to cache these tables.
In configurations where memory is scarce, you might need to resize the AIX file buffer cache.
In the AIX environment, Oracle Corporation encourages the use of symbolic links whenever possible, because the name of the raw device can change. For example, this change can occur when reconfiguring disk peripherals or moving the database to a new hardware configuration. Refer to the IBM documentation for more information.
On AIX, you can choose the JFS file system for some datafiles and raw partitions for other datafiles. Asynchronous I/O works with both raw partitions and the JFS file system. If you know the nature of the I/O activity on database objects beforehand, you can place the datafiles corresponding to specific objects on either the JFS file system or raw partitions, together with an LVM, if available.
The improvement of performance varies greatly depending on the nature of your workload and the disk or file system configuration.
Oracle9i takes full advantage of asynchronous I/O (AIO) provided by AIX, resulting in faster database access. AIO overlaps multiple I/O operations to improve I/O subsystem throughput. The advantage of AIO is realized only when data is well distributed among different disks.
Using the LVM and striping enhances the effectiveness of AIO. The LVM reduces disk contention by striping data across multiple disk spindles. Using AIO with LVM significantly improves RDBMS performance.
All versions of AIX 4 support AIO for database files created both on file system partitions and on raw devices. After issuing AIO requests, the application processes can continue running. For AIO requests on file systems (or on virtual shared disks (VSDs) or HSDs with AIX 4.2.1), a kernel server process (kproc) is in charge of each request from the time it is taken off the queue until it completes. The number of servers limits the number of AIO requests that can be executed concurrently in the system. On AIX 4.2.1, AIO on raw devices is kernelized, which means that AIO servers are not required to service the AIO requests. On AIX 4.3, AIO on all raw devices, including VSDs and HSDs (with APAR IX79690), is kernelized. Kernelized AIO provides better performance than AIO with the kernel server process.
Use one of the following commands to set the number of servers:
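A hedged sketch of two common ways to do this on AIX (the aio0 device name and the smit fast path can vary by release):

# smit aio
# chdev -l aio0 -a minservers=new_value -a maxservers=new_value

The chdev form changes the minimum and maximum number of servers directly; a reboot might be required before the new values take effect.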
Set the minimum value to the number of servers to be started at system boot. Set the maximum value to the number of servers that can be started in response to a large number of concurrent requests.
The default value for the minimum number of servers is 1. The default value for the maximum number of servers is 10. These values are usually too low to run the Oracle server if you are not using kernelized AIO. Oracle Corporation recommends that you set the following values:
Parameter | Value |
---|---|
MAXSERVERS | At least ten times the number of disks that AIO accesses simultaneously |
MINSERVERS | Half the value of the MAXSERVERS parameter |
I/O Slaves are specialized processes that perform only I/O. They can operate whether or not asynchronous I/O is available. They are allocated from shared memory buffers. I/O Slaves use a set of initialization parameters, listed in Table A-1, that allow a degree of control over the way they operate.
Parameter | Range of Values | Default Value |
---|---|---|
DISK_ASYNCH_IO | TRUE/FALSE | TRUE |
TAPE_ASYNCH_IO | TRUE/FALSE | TRUE |
BACKUP_TAPE_IO_SLAVES | TRUE/FALSE | FALSE |
DBWR_IO_SLAVES | 0 - 999 | 0 |
DB_WRITER_PROCESSES | 1 - 10 | 1 |
There are times when the use of asynchronous I/O is not desirable or not possible. The first two parameters in Table A-1, DISK_ASYNCH_IO and TAPE_ASYNCH_IO, allow asynchronous I/O to be switched off for disk and tape devices, respectively. Because the number of I/O slaves for each process type defaults to zero, no I/O slaves are deployed by default.
Set the DBWR_IO_SLAVES parameter to a value greater than 0 only if the DISK_ASYNCH_IO or TAPE_ASYNCH_IO parameter is set to FALSE; otherwise, the database writer process (DBWR) becomes a bottleneck. In this case, the optimal value on AIX for the DBWR_IO_SLAVES parameter is 4.
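A hedged example of the corresponding initialization parameter settings (use them only when asynchronous I/O must be disabled for disk devices):

DISK_ASYNCH_IO = FALSE
DBWR_IO_SLAVES = 4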
The DB_WRITER_PROCESSES parameter specifies the initial number of database writer processes for an instance. If you use the DBWR_IO_SLAVES parameter, only one database writer process is used, regardless of the setting of the DB_WRITER_PROCESSES parameter.
A large value for the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter usually yields better I/O throughput. On AIX, this parameter ranges from 1 to 512, but using a value higher than 16 usually does not provide additional performance gain.
Set this parameter so that its value when multiplied by the value of the DB_BLOCK_SIZE parameter produces a number that is larger than the LVM stripe size. Such a setting causes more disks to be used.
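For example (a hedged illustration; the stripe size is an assumption), with a 4 KB database block size and a 32 KB LVM stripe size, the following settings produce 64 KB multiblock reads, which span more than one stripe:

DB_BLOCK_SIZE = 4096
DB_FILE_MULTIBLOCK_READ_COUNT = 16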
RAID 5 enhances sequential read performance, but decreases overall write performance. Oracle Corporation recommends using RAID 5 only for workloads that are not write-intensive. Intensive writes on RAID 5 might result in a performance degradation compared to a non-RAID environment.
RAID 0 and 1 generally result in better performance, as they introduce striping and mirroring at the hardware level, which is more efficient than at the AIX or Oracle level. RAID 7 is capable of providing better small and large read and write performance than RAID 0 to 6.
The write behind feature enables the operating system to group write I/Os together up to the size of a partition. Doing this increases performance because the number of I/O operations is reduced. The file system divides each file into 16 KB partitions to increase write performance, limit the number of dirty pages in memory, and minimize disk fragmentation. The pages of a particular partition are not written to disk until the program writes the first byte of the next 16 KB partition. To set the size of the buffer for write behind to eight 16 KB partitions, enter the following command:
# vmtune -c 8
To disable write behind, enter the following command:
# vmtune -c 0
The VMM anticipates the need for pages of a sequential file. It observes the pattern in which a process accesses a file. When the process accesses two successive pages of the file, the VMM assumes that the program will continue to access the file sequentially, and schedules additional sequential reads of the file. These reads overlap the program processing and make data available to the program sooner. Two VMM thresholds, implemented as kernel parameters, determine the number of pages it reads ahead:
MINPGAHEAD: the number of pages read ahead when the VMM first detects the sequential access pattern
MAXPGAHEAD: the maximum number of pages that the VMM reads ahead in a sequential file
Set the MINPGAHEAD and MAXPGAHEAD parameters to appropriate values for your application. The default values are 2 and 8 respectively. Use the vmtune command to change these values. You can use higher values for the MAXPGAHEAD parameter in systems where the sequential performance of striped logical volumes is of paramount importance. To set the MINPGAHEAD parameter to 32 pages and the MAXPGAHEAD parameter to 64 pages, enter the following command:
# vmtune -r 32 -R 64
Set both the MINPGAHEAD and MAXPGAHEAD parameters to a power of two, for example: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, and so on.
Disk I/O pacing is an AIX mechanism that allows the system administrator to limit the number of pending I/O requests to a file. This prevents disk I/O intensive processes from saturating the CPU. Therefore, the response time of interactive and CPU-intensive processes does not deteriorate.
You can achieve disk I/O pacing by adjusting two system parameters: the high-water mark and the low-water mark. When a process writes to a file that already has the high-water mark number of pending I/O requests, the process is put to sleep. The process wakes up when the number of outstanding I/O requests falls below or equals the low-water mark.
You can use the smit command to change the high and low-water marks. Determine the water marks through trial-and-error. Use caution when setting the water marks because they affect performance. Tuning the high and low-water marks has less effect on disk I/O larger than 4 KB.
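As a hedged sketch (the maxpout and minpout attributes of the sys0 device are the usual place these values are stored on AIX; the numbers are illustrative), you can set the high-water mark to 33 and the low-water mark to 24 outstanding I/O requests:

# chdev -l sys0 -a maxpout=33 -a minpout=24

Setting both attributes to 0 disables I/O pacing.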
On AIX, you can, to some extent, control the placement of a logical volume on a disk. Placing logical volumes with high disk activity close to each other can reduce disk seek time, resulting in better overall performance.
Oracle9i Real Application Clusters running on the SP architecture uses VSDs or HSDs as the common storage that is accessible from all instances on different nodes. If an I/O request is to a VSD where the logical volume is local to the node, local I/O is performed. The I/O traffic to VSDs that are not local goes through network communication layers.
For better performance, it is important to minimize remote I/O as much as possible. Redo logs of each instance should be placed on the VSDs that are on local logical volumes. Each instance should have its own private rollback segments that are on VSDs mapped to local logical volumes if updates and insertions are intensive.
In each session, each user is allowed only one temporary tablespace. The temporary tablespaces should each contain at least one datafile local to each of the nodes.
Carefully design applications and databases (by partitioning applications and databases, for instance) to minimize remote I/O.
IBM Corporation recommends that you do not use VSD cache buffers (specify the nocache option) under normal circumstances, for the following reasons:
Use the statvsd command to check the performance of the VSD. If the statvsd command consistently shows requests queued waiting for buddy buffers, do not add more buddy buffers. Instead, increase the size of the switch send pool:
# /usr/lpp/ssp/css/chgcss -l css0 -a spoolsize=new_size_in_bytes
If you increase the send pool size, you should also increase the upper limit on mbuf memory (the thewall parameter):
# /etc/no -o thewall=new_size_in_kbytes
The thewall parameter specifies the maximum amount of memory that can be used for network buffers. To check the current sizes of the send and receive pools, enter the following command:
$ /usr/sbin/lsattr -El css0
The CPU is another system component for which processes might contend. Although the AIX kernel allocates CPU effectively most of the time, many processes compete for CPU cycles. If your system has more than one CPU (SMP), there might be different levels of contention on each CPU.
The default value for the runtime slice of the AIX RR dispatcher is ten milliseconds. Use the schedtune command to change the time slice. However, be careful when using this command. A longer time slice lowers the context-switch rate if the average voluntary switch rate of the applications is lower. As a result, fewer CPU cycles are spent on context switching, and system throughput should improve.
However, a longer runtime slice can degrade response time, especially on a uniprocessor system. The default runtime slice is usually acceptable for most applications. When the run queue is high and most of the applications and Oracle shadow processes are capable of running for a much longer duration, you might want to increase the time slice by entering the following command:
# /usr/samples/kernel/schedtune -t n
In the previous example, choosing a value for n of 0 results in a slice of 10 ms, choosing a value of 1 results in a slice of 20 ms, choosing a value of 2 results in a slice of 30 ms, and so on.
Binding certain processes to a processor can improve performance substantially on an SMP system. Processor binding is available and fully functional with AIX 4.x and higher.
Processor binding offers the following benefits:
Processor binding on AIX is not automatic. On a multiprocessor system, you must explicitly bind a process to a processor by using the bindprocessor command. Only the root user or the Oracle software owner can bind an Oracle process to a processor. The child processes inherit the processor binding.
Oracle Corporation recommends binding the various Oracle background processes (except the database writer process) to different processors and leaving one processor free to service the database writer process. This guarantees the database writer a processor on which to execute and at the same time allows the database writer process to migrate freely to the other processors if it becomes CPU bound.
The binding of a process to a processor is not exclusive. The processor is free to execute other processes.
When an Oracle client process connects to an Oracle server process using an Oracle Net Services listener, the server process can be easily bound to a processor by binding the listener process. All Oracle server processes that the listener subsequently spawns are bound to the same processor.
One way to do this is to start multiple listeners, each listening on its own port. You must customize the $ORACLE_HOME/network/admin/listener.ora file to have one set of lines for each listener, and launch multiple listeners on the server side. On the client side, you might want to customize the tnsnames.ora file so that clients, or even individual applications, connect to the different ports that the different listeners listen on. For example, to have two listeners, L1 and L2, that listen on ports 1521 and 1522 respectively, modify the listener.ora file as follows:
L1 =
  (ADDRESS_LIST =
    (ADDRESS= (PROTOCOL= TCP)(Host= nowhere)(Port= 1521))
  )
SID_LIST_L1 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME= /oracle)
      (SID_NAME = ordb)
    )
  )
L2 =
  (ADDRESS_LIST =
    (ADDRESS= (PROTOCOL= TCP)(Host= nowhere)(Port= 1522))
  )
SID_LIST_L2 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME= /oracle)
      (SID_NAME = ordb)
    )
  )
Start the two listeners:

$ lsnrctl start L1
$ lsnrctl start L2
Obtain the process IDs of the listener processes:

$ ps -ef | grep tnslsnr
Bind each listener process to a processor:

$ bindprocessor process_id_for_L1 processor_id
$ bindprocessor process_id_for_L2 processor_id
In the preceding example, the first processor has the ID 0, the second processor has the ID 1, and so on.
If you want to restrict the number of CPUs used by certain client applications, you can use this method to bind the listener so that the applications run only on that processor when it is available.
Processor binding is more difficult when the clients and the Oracle servers run on the same computer using the two-task pipe driver. You must determine the process ID for each server process and manually bind it to a processor. The administrative overhead is excessive and probably not worth the effort unless the servers have long process lives.
Processor binding of Oracle processes can have negative effects on the performance of other applications running on the same system. Careful tuning and monitoring is strongly recommended.
Oracle9i Real Application Clusters uses UDP for interprocess communications. You can tune UDP kernel settings to improve Oracle performance. You can modify kernel UDP buffering on AIX by changing the udp_sendspace and udp_recvspace parameters.
The value of the udp_recvspace parameter should be at least ten times the value of the udp_sendspace parameter, because UDP might not be able to deliver a packet to an application before another packet arrives.
To determine the suitability of the udp_recvspace parameter settings, enter the following command:
$ netstat -s | grep "socket buffer overflows"
If the number of overflows is not zero, increase the value of the udp_recvspace parameter.
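A hedged example of adjusting both parameters (the values are illustrative; keep udp_recvspace about ten times udp_sendspace, as recommended above):

# no -o udp_sendspace=65536
# no -o udp_recvspace=655360

The no command changes the values only until the next restart of the system, so add the commands to a system startup file to make them permanent.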
Use the dd command to back up raw devices. Do not overwrite the first 4 KB block of a raw device, which is used by the AIX Logical Volume Manager. It is faster to back up the device using a block size larger than 4 KB. A good block size for backing up to tape is 256 KB.
To back up the raw device to tape, enter a command similar to the following:
$ dd if=/dev/raw_device of=/dev/rmt0.1 bs=256k
To restore the raw device from tape, enter commands similar to the following:
$ dd if=/dev/rmt0.1 of=/dev/raw_device count=63 seek=1 skip=1 bs=4k
$ mt -f /dev/rmt0.1 bsf 1
$ dd if=/dev/rmt0.1 of=/dev/raw_device seek=1 skip=1 bs=256k
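This two-pass restore preserves the first 4 KB block of the raw device, which is used by the AIX Logical Volume Manager: the first dd command writes only device blocks 1 through 63, the mt command repositions the tape, and the second dd command resumes the restore at the 256 KB offset.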
Copyright © 2001 Oracle Corporation. All Rights Reserved.