This manual is a conceptual and tutorial guide for experienced users
responsible for optimizing performance on OpenVMS systems.
Revision/Update Information:
This manual supersedes OpenVMS Performance Management for OpenVMS Alpha
Version 7.1 and OpenVMS VAX Version 7.1.
Software Version:
OpenVMS Alpha Version 7.2
OpenVMS VAX Version 7.2
Compaq Computer Corporation Houston, Texas
January 1999
Compaq Computer Corporation makes no representations that the use of
its products in the manner described in this publication will not
infringe on existing or future patent rights, nor do the descriptions
contained in this publication imply the granting of licenses to make,
use, or sell equipment or software in accordance with the description.
Possession, use, or copying of the software described in this
publication is authorized only pursuant to a valid written license from
Compaq or an authorized sublicensor.
Compaq conducts its business in a manner that conserves the environment
and protects the safety and health of its employees, customers, and the
community.
The following are trademarks of Compaq Computer Corporation: Alpha,
Compaq, ACMS, Bookreader, CI, DECdirect, DECdtm, DECnet, DECwindows,
DIGITAL, HSC, MSCP, OpenVMS, VAX, VAXcluster, VMS, and the Compaq logo.
The following are third-party trademarks:
Motif is a registered trademark of Open Software Foundation, Inc.
All other trademarks and registered trademarks are the property of
their respective holders.
ZK6491
The OpenVMS documentation set is available on CD-ROM.
This document was prepared using VAX DOCUMENT, Version V3.2n.
This manual presents techniques for evaluating, analyzing, and
optimizing performance on a system running OpenVMS. Discussions address
such wide-ranging concerns as:
Understanding the relationship between work load and system capacity
Learning to use performance-analysis tools
Responding to complaints about performance degradation
Helping the site adopt programming practices that result in the
best system performance
Using the system features that distribute the work load for better
resource utilization
Knowing when to apply software corrections to system
behavior, that is, tuning the system to allocate resources more effectively
Evaluating the effectiveness of a tuning operation; knowing how to
recognize success and when to stop
Evaluating the need for hardware upgrades
The manual includes detailed procedures to help you evaluate resource
utilization on your system and to diagnose and overcome performance
problems resulting from memory limitations, I/O limitations, CPU
limitations, human error, or combinations of these. The procedures
feature sequential tests that use OpenVMS tools to generate performance
data; the accompanying text explains how to evaluate it.
Whenever an investigation uncovers a situation that could benefit from
adjusting system values, those adjustments are described in detail, and
hints are provided to clarify the interrelationships of certain groups
of values. When such adjustments are not the appropriate or available
action, other options are defined and discussed.
Decision-tree diagrams summarize the step-by-step descriptions in the
text. These diagrams should also serve as useful reference tools for
subsequent investigations of system performance.
This manual does not describe methods for capacity planning, nor does
it attempt to provide details about using OpenVMS RMS features
(hereafter referred to as RMS). Refer to the Guide to OpenVMS File Applications for that
information. Likewise, the manual does not discuss DECnet for OpenVMS
performance issues, because the DECnet-Plus for OpenVMS Network
Management manual provides that information.
Intended Audience
This manual addresses system managers and other experienced users
responsible for maintaining a consistently high level of system
performance, for diagnosing problems on a routine basis, and for taking
appropriate remedial action.
Document Structure
This manual is divided into 13 chapters and 4 appendixes, each covering
a related group of performance management topics as follows:
Chapter 1 provides a review of workload management concepts and
describes guidelines for evaluating user complaints about system
performance.
Chapter 2 lists postinstallation operations for enhancing
performance and discusses performance investigation and tuning
strategies.
Chapter 4 explains how to use utilities and tools to collect and
analyze data on your system's hardware and software resources. Included
are suggestions for reallocating certain resources should analysis
indicate such a need.
Chapter 5 outlines procedures for investigating performance
problems.
Chapter 6 describes how to evaluate system resource
responsiveness.
Chapter 7 describes how to evaluate the performance of the
memory resource and how to isolate specific memory resource limitations.
Chapter 8 describes how to evaluate the performance of the disk
I/O resource and how to isolate specific disk I/O resource limitations.
Chapter 9 describes how to evaluate the performance of the CPU
resource and how to isolate specific CPU resource limitations.
Chapter 10 provides general recommendations for improving
performance with available resources.
Chapter 11 provides specific recommendations for improving the
performance of the memory resource.
Chapter 12 provides specific recommendations for improving the
performance of the disk I/O resource.
Chapter 13 provides specific recommendations for improving the
performance of the CPU resource.
Appendix A lists the decision trees used in the various
performance evaluations described in this manual.
Appendix B summarizes the MONITOR data items you will find useful
in evaluating your system.
Appendix C provides an example of a MONITOR multifile summary
report.
Appendix D provides ODS-1 performance information.
Related Documents
For additional information on the topics covered in this manual, you
can refer to the following documents:
OpenVMS System Manager's Manual
Guide to OpenVMS File Applications
OpenVMS System Management Utilities Reference Manual
Guidelines for OpenVMS Cluster Configurations
OpenVMS Cluster Systems
For additional information on the Open Systems Software Group (OSSG)
products and services, access the OpenVMS World Wide Web address:
http://www.openvms.digital.com
Reader's Comments
Compaq welcomes your comments on this manual.
Print or edit the online form SYS$HELP:OPENVMSDOC_COMMENTS.TXT and send
us your comments by:
Use the following World Wide Web address to order additional
documentation:
http://www.openvms.digital.com:81/
If you need help deciding which documentation best meets your needs,
call 800-DIGITAL (800-344-4825).
Conventions
In this manual, every use of DECwindows and DECwindows Motif refers to
DECwindows Motif for OpenVMS software.
The following conventions are also used in this manual:
Ctrl/x
A sequence such as Ctrl/x indicates that you must hold down the key
labeled Ctrl while you press another key or a pointing device button.
PF1 x or GOLD x
A sequence such as PF1 x or GOLD x indicates that you must first press
and release the key labeled PF1 or GOLD and then press and release
another key or a pointing device button.
GOLD key sequences can also have a slash (/), dash (--), or
underscore (_) as a delimiter in EVE commands.
[Return]
In examples, a key name enclosed in a box indicates that you press a
key on the keyboard. (In text, a key name is not enclosed in a box.)
In the HTML version of this document, this convention appears as
brackets, rather than a box.
...
Horizontal ellipsis points in examples indicate one of the following
possibilities:
Additional optional arguments in a statement have been omitted.
The preceding item or items can be repeated one or more times.
Additional parameters, values, or other information can be entered.
.
.
.
Vertical ellipsis points indicate the omission of items from a code
example or command format; the items are omitted because they are not
important to the topic being discussed.
( )
In command format descriptions, parentheses indicate that you must
enclose the options in parentheses if you choose more than one.
[ ]
In command format descriptions, brackets indicate optional elements.
You can choose one, none, or all of the options. (Brackets are not
optional, however, in the syntax of a directory name in an OpenVMS file
specification or in the syntax of a substring specification in an
assignment statement.)
{ }
In command format descriptions, braces indicate required elements; you
must choose one of the options listed.
bold text
This text style represents the introduction of a new term or the name
of an argument, an attribute, or a reason.
italic text
Italic text indicates important information, complete titles of
manuals, or variables. Variables include information that varies in
system output (Internal error number), in command lines
(/PRODUCER=name), and in command parameters in text (where device-name
contains up to five alphanumeric characters).
UPPERCASE TEXT
Uppercase text indicates a command, the name of a routine, the name of
a file, or the abbreviation for a system privilege.
Monospace type
Monospace type indicates code examples and interactive screen displays.
In the C programming language, monospace type in text identifies the
following elements: keywords, the names of independently compiled
external functions and files, syntax summaries, and references to
variables or identifiers introduced in an example.
-
A hyphen at the end of a command format description, command line, or
code line indicates that the command or statement continues on the
following line.
numbers
All numbers in text are assumed to be decimal unless otherwise noted.
Nondecimal radixes (binary, octal, or hexadecimal) are explicitly
indicated.
Managing system performance involves being able to evaluate and
coordinate system resources and workload demands.
A system resource is a hardware or software component
or subsystem that is under the direct control of the operating system
and is responsible for data computation or storage. The following
subsystems are system resources:
CPU
Memory
Disk I/O
Network I/O
LAN I/O
Internet I/O
Cluster communication other than LAN (CI, FDDI, MC)
In addition to this manual, specific cluster information can be found
in Guidelines for OpenVMS Cluster Configurations and OpenVMS Cluster Systems.
Performance management means optimizing your hardware
and software resources for the current work load. This involves
performing the following tasks:
Acquiring a thorough knowledge of your work load and an
understanding of how that work load exercises the system's resources
Monitoring system behavior on a routine basis in order to determine
when and why a given resource is nearing capacity
Investigating reports of degraded performance from users
Planning for changes in the system work load or hardware
configuration and being prepared to make any necessary adjustments to
system values
Performing certain optional system management operations after
installation
To help you understand the scope and interrelationship of these issues,
this chapter deals with the following topics:
A review of workload management concepts
Guidelines for developing a performance management strategy
Because many different networking options are available, network I/O is
not formally covered in this manual. General performance concepts
discussed here apply to networking, and networking should be considered
within the scope of analyzing any system performance problem. You
should consult the documentation available for the specific products
that you have installed for specific guidelines concerning
configuration, monitoring, and diagnosis of a networking product.
Similarly, database products are extremely complex and perform much of
their own internal management. The settings of parameters external to
OpenVMS may have a profound effect on how efficiently OpenVMS is
used. Thus, reviewing the material specific to your server application
is a must if you are to understand and resolve a related performance
issue efficiently.
Even if you are familiar with basic concepts discussed in this section,
there are some details discussed that are specific to this process, so
please read the entire section.
Long-term measurement and observation of your system are key to
understanding how well it is working and are invaluable in identifying
potential performance problems before they become so serious that the
system grinds to a halt and your business suffers. Thus,
performance management should be a routine process of monitoring and
measuring your systems to assure good operation through deliberate
planning and resource management.
Waiting until a problem cripples a system before addressing system
performance is not performance management; it is crisis
management. Performance management involves systematically measuring
the system, gathering and analyzing the data, evaluating trends, and
archiving data to maintain a performance history. You will often
observe trends and thus be able to address performance issues before
they become serious and adversely affect your business operations.
Should an unforeseen problem occur, your historical data will likely
prove invaluable for pinpointing the cause and rapidly and efficiently
resolving the problem. Without past data from your formerly
well-running system, you may have no basis upon which to judge the
value of the metrics you can collect on your currently poorly running
system. Without historical data you are guessing; resolution will take
much longer and cost far more.
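As a minimal sketch of how archived measurements can serve as a baseline, the following Python fragment flags a current reading that lies far from the historical mean. The function name, the sample values, and the three-standard-deviation threshold are illustrative assumptions, not part of OpenVMS or of the procedures in this manual. The samples could be any metric you archive, such as CPU utilization percentages.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, threshold=3.0):
    """Return True if `current` lies more than `threshold` sample
    standard deviations from the mean of the historical samples.

    `history` is a sequence of past readings of one metric taken while
    the system was running well (hypothetical data, any unit).
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # All historical samples identical: any difference is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Hypothetical CPU utilization percentages from a well-running period.
history = [40, 42, 38, 41, 39]
print(deviates_from_baseline(history, 41))  # within the normal spread
print(deviates_from_baseline(history, 75))  # far outside the baseline
```

Without the `history` list there is nothing to compare the reading of 75 against, which is the situation the preceding paragraph warns about.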
Upgrades and Reconfigurations
Some systems are so heavily loaded that the added resource cost of new
software functionality can push them beyond the maximum load they were
intended to handle, resulting in unacceptable response times and
throughput. If your system already runs near its limit during peak
workload periods, take the steps necessary to avoid pushing it beyond
that limit at a time when you cannot afford degraded performance.
If your system is anything but a finely tuned, well-running machine,
use caution when considering any change. If
you have observed users complaining about slow response times, erratic
system behavior, unexplained system pauses, hangs, or crashes, your
system is already being pushed to, or beyond, its original designed
capacity. If this is the case, you need a performance audit to
determine your current workload and the resources necessary to
adequately support your current and possibly future workloads.
Implementing changes not specifically designed to increase such a
system's capacity or reduce its workload can degrade performance
further. Thus, investing in a performance audit will pay off by
delivering a more reliable, productive, and available system that
requires less maintenance.
Remember that many factors involved in upgrades and
reconfigurations are likely to contribute to increased resource
consumption. Keep in mind that the future workloads your system will
be asked to support may be unforeseeable because of changes in the
system, workload, and business.
Blind reconfiguration without measurement, analysis, modification, and
contingency plans can result in serious problems. Significant increases
in CPU, disk, memory, and LAN utilization demand serious consideration,
measurement, and planning for additional workload and upgrades.
A number of steps should be carried out to evaluate whether your
systems are viable candidates for proposed changes, identify
modifications that must be made, if any, and assure that changes,
planned for and implemented, deliver expected results. Those steps are
outlined here.
Characterizing CPU, disk, memory, and LAN utilization on the systems
under consideration before reconfiguration is as important as, if not
more important than, measuring system activity after installation.
Adding segments and redistributing load may be critical to successful
implementation with minimal impact. Without scientific measurement
before installation and modification, as well as after, you will not
acquire the data necessary to understand, plan for, and resolve
potential problems in the immediate as well as distant future.
There are no hard and fast rules other than this: measure, plan,
understand, test, and confirm. Take measurements for at least one
week before installing your network so that you understand how
system workloads vary.
Workloads follow business cycles, which often vary predictably
throughout the day, the week, the month, and the year. Planners should
take these variations into account; they may be affected by financial
and legal deadlines as well as by seasonal factors such as holidays and
other cyclic activity.
Seek to identify peak periods of heavy load
(relatively long periods of heavy load lasting approximately five or
more minutes). Understanding their frequency and the factors affecting
them is key to successful system planning and management.
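The following Python sketch shows one way to pick out such sustained peaks from a series of regularly sampled utilization values. It is an illustration under stated assumptions rather than a prescribed method: the function name, the 90 percent threshold, and the sample data are hypothetical, and only the five-minute minimum duration comes from the text above.

```python
def sustained_peaks(samples, interval_s, threshold, min_duration_s=300):
    """Return (start_index, end_index) pairs, inclusive, for each run of
    consecutive samples at or above `threshold` that lasts at least
    `min_duration_s` seconds (default: five minutes).

    `samples` are readings taken every `interval_s` seconds.
    """
    peaks = []
    start = None  # index where the current heavy-load run began
    for i, value in enumerate(samples):
        if value >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and (i - start) * interval_s >= min_duration_s:
                peaks.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and (len(samples) - start) * interval_s >= min_duration_s:
        peaks.append((start, len(samples) - 1))
    return peaks


# Hypothetical CPU utilization percentages sampled once a minute.
cpu = [50, 95, 96, 50, 91, 92, 93, 94, 95, 50, 99]
print(sustained_peaks(cpu, interval_s=60, threshold=90))
```

In this data only the five-sample run counts as a peak. The shorter bursts are ignored because they fall below the five-minute minimum.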
Peak Workloads and the Cyclic Nature of Workloads
You must first identify periods of activity during which you cannot
afford to have system performance degrade and then measure overall
system activity during these periods.
These periods will vary from system to system and from minute to
minute, hour to hour, day to day, week to week, and month to month.
Holidays and other such periods are often significant factors and
should be considered.
These periods depend upon the business cycles that the system is
supporting.
If the periods you have identified as critical cannot be measured at
this time, take measurements in the near term and use them as the basis
for estimating activity during those periods. Combine the data you
collect with other data, such as order rates from the previous calendar
month or year, and with projections or forecasts. Keep in mind
that factors other than CPU may become a bottleneck and slow the system
down. For example, a fixed number of assistants can process only so
many orders per hour, regardless of the number of callers waiting on
the phone for service.
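The arithmetic behind such an estimate can be sketched as follows. This Python fragment assumes, purely for illustration, that utilization scales linearly with the order rate. Real workloads often do not, so treat the result as a rough first approximation. The function name, parameter names, and figures are all hypothetical.

```python
def estimate_peak_utilization(measured_util, measured_rate,
                              forecast_rate, max_rate=None):
    """Estimate utilization during a critical period that cannot be
    measured directly.

    measured_util  utilization observed now (for example, percent CPU)
    measured_rate  business activity observed now (orders per hour)
    forecast_rate  activity expected in the critical period
    max_rate       optional ceiling from a non-CPU bottleneck, such as
                   the orders per hour a fixed staff can process

    ASSUMPTION: utilization is proportional to the activity rate.
    """
    if measured_rate <= 0:
        raise ValueError("measured_rate must be positive")
    # The system never sees more work than the bottleneck lets through.
    effective_rate = forecast_rate if max_rate is None else min(forecast_rate, max_rate)
    return measured_util * effective_rate / measured_rate


# Hypothetical figures: 30% CPU at 200 orders/hour, 500 orders/hour forecast.
print(estimate_peak_utilization(30.0, 200, 500))
# Same forecast, but staff can process at most 400 orders/hour.
print(estimate_peak_utilization(30.0, 200, 500, max_rate=400))
```

The second call illustrates the point of the example above: when the assistants are the limiting factor, the load reaching the system stops growing no matter how many callers are waiting.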
This manual describes several strategies and procedures for evaluating
performance, evaluating system resources, and diagnosing resource
limitations as shown in the following list: