IBM System Management Facilities

IBM System Management Facilities (SMF) is a component of IBM's z/OS for mainframe computers that provides a standardised method for writing records of activity to a file (or data set, to use the z/OS term). SMF instruments the baseline activities running on that operating system, including I/O, network activity, software usage, error conditions and processor utilization.

One of the most prominent components of z/OS that uses SMF is the IBM Resource Measurement Facility (RMF). RMF provides performance and usage instrumentation of resources such as processor, memory, disk, cache, workload, virtual storage, XCF and Coupling Facility. RMF is technically a priced (extra cost) feature of z/OS. BMC sells a competing alternative, CMF.

SMF forms the basis for many monitoring and automation utilities. Each SMF record has a numbered type (e.g. "SMF 120" or "SMF 89"), and installations have great control over how much or how little SMF data to collect. Records written by software other than IBM products generally have a record type of 128 or higher. Some record types have subtypes - for example Type 70 Subtype 1 records are written by RMF to record CPU activity.
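
The type and subtype numbering can be illustrated with a minimal Python sketch that reads the common header of a single record. The field offsets, the flag bit and the packed date format used here are assumptions based on the commonly documented standard header and should be checked against the IBM SMF manual; the function and constant names are hypothetical.

```python
import struct
from datetime import datetime, timedelta

# Assumed standard header layout (verify against the IBM SMF manual):
#   offset 0,  2 bytes: record length (part of the record descriptor word)
#   offset 2,  2 bytes: segment descriptor
#   offset 4,  1 byte : flag byte
#   offset 5,  1 byte : record type
#   offset 6,  4 bytes: time, in hundredths of a second since midnight
#   offset 10, 4 bytes: date, packed decimal 0cyydddF
#   offset 14, 4 bytes: system ID (EBCDIC)
#   offset 18, 4 bytes: subsystem ID (EBCDIC), only when subtypes are used
#   offset 22, 2 bytes: subtype, only when subtypes are used
HEADER = struct.Struct(">HHBBI4s4s")
SUBTYPE_PART = struct.Struct(">4sH")
SUBTYPES_VALID = 0x40  # assumed flag bit meaning "this record has a subtype"


def parse_packed_date(raw: bytes) -> datetime:
    """Decode a packed 0cyydddF date (c=0 means 19yy, c=1 means 20yy)."""
    digits = raw.hex()
    year = 1900 + 100 * int(digits[1]) + int(digits[2:4])
    day_of_year = int(digits[4:7])
    return datetime(year, 1, 1) + timedelta(days=day_of_year - 1)


def parse_header(record: bytes) -> dict:
    """Return the type, subtype (or None), system ID and timestamp of a record."""
    length, _seg, flags, rtype, time_cs, date_raw, sid = HEADER.unpack_from(record, 0)
    header = {
        "length": length,
        "type": rtype,
        "subtype": None,
        "system_id": sid.decode("cp037").strip(),
        "timestamp": parse_packed_date(date_raw) + timedelta(milliseconds=10 * time_cs),
    }
    # Only read the subtype when the flag says it is present and the bytes exist.
    if flags & SUBTYPES_VALID and len(record) >= HEADER.size + SUBTYPE_PART.size:
        ssi, subtype = SUBTYPE_PART.unpack_from(record, HEADER.size)
        header["subsystem_id"] = ssi.decode("cp037").strip()
        header["subtype"] = subtype
    return header
```

Under these assumptions, a record written by RMF to report CPU activity would parse with type 70 and subtype 1.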

SMF record types
Here is a list of the most common SMF record types:
 * RMF records are in the range 70 through to 79. RMF's records are generally supplemented - for serious performance analysis - by Type 30 (subtypes 2 and 3) address space records.
 * RACF type 80 records are written to record security events, e.g. password violations and denied resource access attempts. Top Secret, another security system, also writes type 80 records. ACF2 provides equivalent information in, by default, type 230 records, but this SMF record type can be changed at each installation.
 * SMF type 89 records indicate software product usage and are used to calculate reduced sub-capacity software pricing.
 * IBM Db2 writes type 100, 101 and 102 records, depending on specific Db2 subsystem options.
 * CICS writes type 110 records, depending on specific CICS options.
 * WebSphere MQ writes type 115 and 116 records, depending on specific WebSphere MQ subsystem options.
 * WebSphere Application Server for z/OS writes type 120 records. Version 7 introduced a new subtype to overcome shortcomings in the earlier subtype records. The new Version 7 type 120 subtype 9 record provides a unified request-based view with lower overhead.
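
The list above can be captured as a small lookup table, for example when labelling records in a report. This is a minimal Python sketch; the names `SMF_WRITERS` and `describe` are hypothetical, and the table covers only the record types mentioned in this article.

```python
# Record types named in the list above, mapped to the software that writes them.
SMF_WRITERS = {
    30: "Address space records",
    80: "RACF or Top Secret",
    89: "Software product usage",
    100: "Db2",
    101: "Db2",
    102: "Db2",
    110: "CICS",
    115: "WebSphere MQ",
    116: "WebSphere MQ",
    120: "WebSphere Application Server for z/OS",
    230: "ACF2 (default type, can be changed per installation)",
}


def describe(record_type: int) -> str:
    """Return a short label for an SMF record type number."""
    if 70 <= record_type <= 79:
        return "RMF"
    if record_type in SMF_WRITERS:
        return SMF_WRITERS[record_type]
    if record_type >= 128:
        # Types of 128 or higher are generally written by non-IBM software.
        return "Probably non-IBM software"
    return "Not listed here"
```

For instance, `describe(74)` returns `"RMF"` and `describe(200)` falls into the non-IBM range.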

Evolving records
The major record types, especially those created by RMF, continue to evolve at a rapid pace. Each release of z/OS brings new fields. Different processor families and Coupling Facility levels also change the data model.

SMF data recording
SMF can record data in two ways:
 * The standard and classical way: using buffers in the SMF address space, together with a set of preallocated data sets (VSAM data sets) to use when a buffer fills up. The standard name for the data sets is SYS1.MANx, where x is a numerical suffix (starting from 0).
 * The relatively new way: Using log streams. SMF utilizes System Logger to record collected data, which improves the writing rate and avoids buffer shortages. It has more flexibility, allowing the z/OS system to straightforwardly record to multiple log streams, and (using keywords on the dump program) allowing z/OS to read a set of SMF data once and write it many times.

Both methods can be defined at the same time, but only one is in use at any moment; the other remains available as a fallback.

This data is then periodically dumped to sequential files (for example, on tape) using the IFASMFDP SMF Dump Utility (or IFASMFDL when using log streams). IFASMFDP can also be used to split existing SMF sequential files and copy them to other files. The two dump programs produce the same output, so switching between them requires no changes to the downstream processing of SMF records, other than changing the JCL to call the other dump utility.
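
Once dumped, the records can be processed outside the dump utilities. The following minimal Python sketch walks a dumped file that has been transferred in binary form and counts or selects records by type, loosely mirroring the kind of selection IFASMFDP performs. It assumes each record still starts with its 4-byte record descriptor word (a 2-byte big-endian length that includes the descriptor itself), that the record type is the byte at offset 5, and that records are not split into segments. The function names are hypothetical.

```python
import struct
from collections import Counter
from typing import Iterable, Iterator


def iter_records(data: bytes) -> Iterator[bytes]:
    """Yield each length-prefixed record found in a binary SMF dump."""
    offset = 0
    while offset + 4 <= len(data):
        (length,) = struct.unpack_from(">H", data, offset)
        # A usable record must at least reach the record type byte.
        if length < 6 or offset + length > len(data):
            raise ValueError(f"bad record length {length} at offset {offset}")
        yield data[offset:offset + length]
        offset += length


def count_by_type(data: bytes) -> Counter:
    """Count records per SMF record type (byte 5 of each record)."""
    return Counter(record[5] for record in iter_records(data))


def select_types(data: bytes, wanted: Iterable[int]) -> bytes:
    """Return a new dump containing only records of the wanted types."""
    wanted_set = set(wanted)
    return b"".join(r for r in iter_records(data) if r[5] in wanted_set)
```

A typical use would be to read the file with `open(path, "rb").read()`, call `count_by_type` to see what it contains, and then write the result of `select_types(data, range(70, 80))` to a separate file holding only the RMF records.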

SMF data collection and analysis
SMF data can be collected through IBM Z Operational Log and Data Analytics and IBM Z Anomaly Analytics with Watson. IBM Z Operational Log and Data Analytics collects SMF data, transforms it into a consumable format and then sends the data to third-party enterprise analytics platforms like the Elastic Stack and Splunk, or to the included operational data analysis platform, for further analysis. IBM Z Anomaly Analytics with Watson collects SMF data from multiple IBM Z systems and subsystems, including IBM Db2 for z/OS, IBM CICS Transaction Server for z/OS and IBM MQ for z/OS. It uses historical IBM Z metric and log data to build a model of normal operational behavior, then compares real-time operational data with that model to detect anomalous behavior and alert IT operations.
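
The general idea of comparing live data with a model of normal behavior can be illustrated with a toy Python sketch. This is not the algorithm used by the IBM product, which is not described here; it is a plain z-score check against a historical mean and standard deviation, and the function names and the threshold of 3.0 are assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def build_baseline(history: Sequence[float]) -> Tuple[float, float]:
    """Summarise historical metric values as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # With no historical variation, any different value counts as unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold
```

For example, with a baseline built from CPU utilization samples gathered from SMF interval records, a new sample far outside the historical range would be flagged, while a sample close to the historical mean would not.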

IBM Z Operational Log and Data Analytics collects SMF data in the following three ways, and IBM Z Anomaly Analytics with Watson collects SMF data in the first two of the following ways:


 * In log stream mode with SMF in-memory buffer

When SMF is run in the log stream mode, the Common Data Provider in IBM Z Operational Log and Data Analytics and IBM Z Anomaly Analytics with Watson can be configured to collect SMF from the SMF in-memory buffer with the SMF real-time interface.


 * In data set recording mode

When SMF is run in the data set recording mode, the Common Data Provider in IBM Z Operational Log and Data Analytics and IBM Z Anomaly Analytics with Watson collects and streams SMF data via a set of SMF user exits.


 * In batch mode

The System Data Engine of the Common Data Provider in IBM Z Operational Log and Data Analytics can be run stand-alone in batch mode to read SMF data from a data set and then write it to a file. System Data Engine batch jobs can also be created to write SMF data to data sets and to send SMF data to the Data Streamer.

SMF data can be analyzed on the following analytics platforms:


 * Z Data Analytics platform, a component of IBM Z Operational Log and Data Analytics, which can help to visualize and search through a large volume of Z operational data in a single pane of glass. The dashboards and saved searches provide insights into the operational data and help with early problem detection and problem diagnosis.
 * Enterprise platforms such as Splunk, the Elastic Stack, Apache Kafka, or Humio that can receive and process operational data for analysis. Platforms like the Elastic Stack and Splunk do not include expert knowledge about z Systems and applications, but users can create or import their own analytics to run against the data.
 * IBM Z Anomaly Analytics with Watson, a product that uses both log-based and metric-based machine learning technology to provide anomaly detection.
 * IBM Db2 Analytics Accelerator for z/OS, a database application that provides query-based reporting.
 * IntelliMagic Vision for z/OS, from IntelliMagic. The platform can provide insights and recommended actions to the system owners, which are based on expert knowledge about z Systems and applications.