When configuring a Manager run in the TrueSight Capacity Optimization Gateway Server (formerly the BMC Performance Assurance for Unix console), there are certain settings that can be considered Best Practices based on our experience. This document lists each Manager option where there are suggestions for what value to use, or what value not to use, for that configuration option. It only covers options where Best Practices suggest a change from the default value, or where the Best Practices use of the parameter deserves additional explanation.
The latest Unix Console Manager Run Best Practices document can be found here:
ftp://ftp.bmc.com/pub/perform/gfc/mpp/Unix%20Console%20Manager%20Run%20Best%20Practices.pdf

Unix Console Manager Run Best Practices v 1.1

Manager Run Best Practices

Section I: Overview

When configuring a Manager run in the BMC Performance Assurance for Unix console there are certain settings that can be considered a Best Practice based upon our experience. This document will list each Manager option where there are suggestions for what value to use, or what value not to use, for that configuration option.

Section headings marked with an asterisk include information of particular importance or document recommended changes from the default Manager settings.

Recommended Directory Structure and File Conventions

In Manager it is important to protect all of the files associated with the nightly Manager runs and only modify those files when making changes to the nightly Manager run. Files such as the Manager Commands File (*.vcmds), Analyze Commands File (*.an), and Domain File (*.dmn) should be dedicated to the nightly Manager runs, and a separate set of files should be used for any ad hoc data processing.

To simplify management of these files, something similar to the following directory structure is recommended:

    ./files
    ./files/daily
    ./files/daily/domains
    ./files/daily/vcmds
    ./files/daily/workloads
    ./files/adhoc
    ./files/adhoc/domains
    ./files/adhoc/vcmds
    ./files/adhoc/workloads
    ./manager/
    ./manager/daily
    ./manager/daily/[Run Name]
    ./manager/daily/visarchive/[Run Name]

The idea behind this directory structure is:
Install the latest console patches for your console

Patches are especially important for a new console installation because a console problem may impact data collection, data transfer, or data processing for every node in your environment. It is best to install the latest console patch set available when you install your new BMC Performance Assurance (BPA) console. At a minimum, download the latest patch set and consult the README files to see what issues have been addressed by the currently available patch set.

For more information on how to obtain and install the latest patch set see KA000097159: Cumulative Hot Fixes for TrueSight Capacity Optimization (TSCO), TSCO Gateway Server, and TSCO Agent, and TSCO Perceiver.

Use the UDR Collection Manager Status Reports

The UDR Collection Manager (UCM) status reports are web based reports that provide information on the status of data collection and data transfer in your environment. These reports contain detailed and useful information regarding which nodes are successfully collecting and transferring data to the console and provide valuable insight into collection and transfer failures in your environment.

KA000108785: Using the BMC Performance Assurance UDR Collection Manager (UCM) Status Reports describes the UCM Status Reports and how they can be used.

Section II: Manager VCMDS settings

Manager => Main Window => Data tab

Data Source [ System Data | System and Application Data | Application Data ]
Default: System and Application Data
Recommended: System and Application Data

For a nightly Manager run, the Collect New Data option should be selected. That option is what allows a run to be scheduled for the future. Either the System Data or System and Application Data data source option can be selected since they are functionally equivalent. The old Application Data data type is related to deprecated data types that were collected by previous releases of the Perform product.
Currently, there are no Application Data data types in the product.

Domain File*
Default: <NONE>
Requirements/Recommendations:
* Domains are processed serially within a Manager run, not concurrently. This means that a Manager run with a large number of domains will frequently have a long elapsed processing time for all domains within the run, which makes it difficult to finish data processing within the nightly processing window.
* Before a Manager run will begin data processing, all data for all nodes within the run (across all domains) must be successfully transferred or the Transfer Duration must have elapsed. This means that a large number of domains within a single run will frequently cause a long data transfer period, since there is an increased chance that a single node will be unable to transfer, causing the whole run to delay processing until the Transfer Duration elapses.

Output Files Directory*
Default: <NONE>
Recommendation:
This directory structure makes the association between the Manager Output Directory and the Manager run very clear and is very helpful during Manager run troubleshooting and recovery.

Manager => Main Window => Schedule tab

Start Date*

For a future collection Manager run, the Start Date should generally be specified as the current day, or as the next day. For example, for a run being submitted on March 16th, 2009 at 10:00 AM, if the goal is to begin data collection immediately on the remote node and have data collected and processed for the remainder of the day, then the Start Date specified should be Mar 16 2009. If the goal is to have the Manager run pick up data collection starting tomorrow, the Start Date specified should be Mar 17 2009.

End Date*

For a daily Manager run the End Date specified should be some time in the distant future. It is relatively easy to stop an active Manager run before the specified End Date, and the consequences of a run reaching its End Date are sometimes severe.

A Manager run should never be scheduled to run until December 31st of the current year unless there are no plans to continue that run into the next year. January 1st is a holiday, and every year there is at least one Perform customer working on that holiday re-scheduling all of their Manager runs because they expired on December 31st. Any date many years in the future is acceptable.[4]

Daily Begin*

For a 24 hour daily data collection a Daily Begin of 00:00 is typically the most appropriate option. It is generally not a good idea to try to 'stagger' the daily begin of data collection within Manager by a few minutes (00:01, 00:02, and so on) since this will impact the interval times within Visualizer.

Some key points:
* The begin time for all intervals within a Visualizer database should be at the same time for all nodes within the database.
So, if Run A had a Daily Begin of 00:00 and Run B had a Daily Begin of 01:00 (each with a 60 minute Visualizer Interval), that would be acceptable since the beginning of each interval within the database would be the same across both runs. But if Run A had a Daily Begin of 00:00 and Run B had a Daily Begin of 00:05, that would be a problem because the interval start times of the two runs would be offset from each other.

Daily End*

For a 24 hour daily data collection a Daily End of 23:59 is typically the most appropriate option. It is possible to select a Daily End value of 00:00 for a daily run, but this does impact the name of the resulting Visualizer file.[5]

Manager => Main Window => Workload tab

Workload Definition*

By default, Manager uses the 'Default Workload' list which contains just the 'zzz' workload (so all process level work will be assigned to the zzz workload and transaction on the machine). This screen allows you to specify a custom workload characterization for your Manager run.

Additional information related to workload characterization can be found in the Creating and Refining Workloads section of the BPA product documentation. There are currently no specific best practices recommendations related to the use (or lack of use) of workloads within a Manager run.

Manager => Options menu => Advanced Features => Collect tab

Collect Options

When collection is completed, send email to:
Default: <Blank> (Don't send e-mail)
Recommended: <Blank> (Don't send e-mail)

In general this option should be left blank. It instructs each remote node to send an e-mail to the specified address upon completion of data collection. The UDR Collection Manager (UCM) Status Reports should be used for monitoring the success of data collection and transfer.

Run script before collection time (hours & minutes)
Default: 30 minutes
Recommended: 30 minutes

This option specifies how long before the Start Time of data collection to begin sending collection requests to the remote node.
In almost all cases the default of 30 minutes is sufficient for the collection requests to be successfully registered on the remote node before data collection is scheduled to begin. There is no reason to stagger this value for Manager runs since the UDR Collection Manager (UCM) processes that handle sending the data collection start requests will throttle the number of concurrent start requests.[6]

Intervals for collection and summarization

Collect system data every X seconds
Default: 10 seconds
Recommended: 10 seconds

This option specifies the default sample rate for data collection on the remote node. In most scenarios the default of 10 seconds is appropriate.

Collect application data every X seconds
Default: 30 seconds
Recommended: N/A (This setting is obsolete)

This option is not used by Manager since there are no longer any Application data types available for data collection in Manager.

Summarize data every X minutes
Default: 15 minutes
Recommended: 15 minutes

This option specifies the default summarization interval for the Manager run and determines the granularity of the UDR data. A 15 minute summarization interval means that the finest granularity of data that can be extracted from the raw UDR data is 15 minutes. Decreasing this value to allow the UDR data to be summarized more frequently may be appropriate in some environments, but it will have a significant impact on the size of the UDR data (the UDR data size will increase by roughly 40% each time the summarization interval is halved) and some impact on overall CPU consumption related to data collection on the remote nodes (since the Perform Agent will need to summarize the data more frequently). The selector only allows a summarization interval of 15, 30, 45, or 60 minutes to be specified, but the field can be edited directly in the GUI to specify another summarization interval (such as 1, 2, 5, or 10 minutes).
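As a rough illustration of the size trade-off described above, the following Python sketch estimates how large the UDR data would be relative to the 15 minute default. It assumes the "roughly 40% per halving" figure applies uniformly and extrapolates smoothly between halvings; the function and constant names are hypothetical and are not part of the product.

```python
import math

# "Roughly 40%" growth per halving, taken from the text above (an approximation).
GROWTH_PER_HALVING = 1.4
# Default summarization interval in minutes.
DEFAULT_INTERVAL_MIN = 15


def relative_udr_size(interval_min, baseline_min=DEFAULT_INTERVAL_MIN):
    """Estimate UDR data size relative to the baseline interval (1.0 = same size)."""
    if interval_min <= 0:
        raise ValueError("interval must be positive")
    # Number of times the baseline interval has been halved (may be fractional).
    halvings = math.log2(baseline_min / interval_min)
    return GROWTH_PER_HALVING ** halvings


if __name__ == "__main__":
    for minutes in (15, 10, 5, 1):
        ratio = relative_udr_size(minutes)
        print(f"{minutes:>2} minute interval: about {ratio:.2f}x the 15 minute size")
```

The result is only a planning estimate; actual UDR sizes depend on the metric groups collected and should be checked against real data in the Console Data Repository.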
Note that the summarization interval must be a factor of the Visualizer interval (so a 10 minute summarization interval with a 30 minute Visualizer interval would be fine, but it would not work with a 15 minute Visualizer interval).

Enable collector restart
Default: Selected
Recommended: Selected

Selecting this option will restart the UDR Collection Manager (UCM) processes which control data collection and data transfer if they terminate for some reason on the console (such as a machine reboot) and will allow UCM Status Reporting to be enabled. This is a good feature that should be enabled within Manager.

Collect restart interval X minutes
Default: 15 minutes
Recommended: 15 minutes (although other settings may be appropriate depending on the environment)

This option determines how frequently the '*.Collect query' script will be executed by pcron and, separately, how frequently the udrCollectMgr process will query the remote node. In Perform version 7.2.00 and later the only purpose of the '*.Collect query' script is to ensure that the udrCollectMgr process associated with the Manager run is running. The '*.Collect query' script does not query the remote node to check if data collection is running; that is handled by the udrCollectMgr command itself. In some environments decreasing the query rate to every 30, or even 60, minutes may be appropriate. This setting is most important after a machine reboot since it determines how quickly after a reboot the udrCollectMgr process will be restarted. If a machine is rebooted at 12:05 AM and the restart interval is 60 minutes, the udrCollectMgr process won't get restarted until 1:05 AM, which means that an hour of the Transfer window will have been lost (since the udrCollectMgr process is what handles the data transfer).
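The arithmetic in the reboot example above can be written out as a small Python sketch. It assumes, as the example does, that the restart happens one full restart interval after the reboot; the function name is hypothetical and the date is only a placeholder.

```python
from datetime import datetime, timedelta


def worst_case_restart(reboot_time, restart_interval_min):
    """Time by which udrCollectMgr would be restarted after a reboot,
    assuming the restart occurs one full restart interval later."""
    return reboot_time + timedelta(minutes=restart_interval_min)


if __name__ == "__main__":
    # 12:05 AM reboot, as in the example above (the date itself is arbitrary).
    reboot = datetime(2009, 3, 17, 0, 5)
    for interval in (15, 30, 60):
        restart = worst_case_restart(reboot, interval)
        print(f"restart interval {interval:>2} min: restarted by {restart:%H:%M}")
```

Comparing the restart time against the Transfer Duration shows how much of the transfer window would be left after a reboot for a given restart interval.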
If it is unlikely that the machine would be rebooted near the end of data collection or during the data transfer window, then it is less important that the 'restart interval' be set to a frequent value, but it is still important that the restart be enabled.

Enable status reporting
Default: Selected
Recommended: Selected

This feature enables the UDR Collection Manager (UCM) Status Reporting functionality, which creates and updates a web based status report related to data collection and data transfer for the nodes associated with the Manager run.

Manager => Options menu => Advanced Features => Analyze tab

Analyze Options

Run Analyze
Default: Selected
Recommended: Selected

The typical purpose for a nightly Manager run is to collect, transfer, and process data into a Visualizer file which is then populated into a Visualizer database. Analyze must be enabled in order for a Visualizer file to be created by a Manager run.

Generate Analyze System Reports
Default: Unselected
Recommended: Unselected

The Generate Analyze System Reports checkbox enables the creation of the Analyze reports for each interval that Analyze is executed. Analyze will be executed once per Visualizer Interval (default: 60 minutes) and would thus create separate text reports for each hour of the day. These reports should generally be disabled as part of the nightly Manager run for several reasons:
Override Repositories
Default: Unselected
Recommended: Unselected

This is a deprecated setting that should not be used.

Generate reports in XML format
Default: Unselected
Recommended: Unselected (Grayed out when Generate Analyze Reports is unselected)

CPU Performance Rating Basis
Default: SPEC CINT2000 Rate
Recommended: SPEC CINT2000 Rate

The default CPU Performance Rating Basis of SPECint2000 is the best choice because there are conversion factors available to convert all other rating systems to it. For environments that are exclusively running modern hardware, SPEC CINT2006 Rate would also be an acceptable selection here. KA 000031533: What is the best rating system to use as the 'CPU Performance Rating Basis' in Perform Manager? describes this in more detail.

Cut Disk Options

Cut Disk Function*
Default: Off
Recommended: On

The Cut Disk feature removes disks doing less than a specified threshold of I/Os per second or a specified percent utilization from the Analyze reports, Predict model, and Visualizer file created by a Manager run or within Analyze itself. There are many benefits to using Cut Disk including:
A good initial option is 'Cut Measurement' of 'I/O Rate in pg/sec' and 'Threshold' of '1.0'. This tells Analyze to not include disks doing less than 1.0 I/Os per second in the Visualizer file, Predict model file, or any reports that it creates.

Disk Type
Default: GENERIC_DISK
Recommended: GENERIC_DISK

Cut Measurement*
Default: Utilization in %
Recommended: I/O Rate in pg/sec

Disk Name
Default: zzzcutdisk
Recommended: zzzcutdisk

Threshold*
Default: 0.0
Recommended: 1.0

Export Data

Generate data for export
Default: Unselected
Recommended: Unselected

The Analyze Export Data setting creates a separate XML file per Visualizer interval that could be used to import Analyze data into another application. This feature is not frequently used. The metrics output to the Analyze XML export file are described in the Analyze Data Export schema file ($BEST1_HOME/bgs/pengine/adapter/Xml/XmlSchema/Analyze.xsd).

Output Directory
Default: Blank
Recommended: Blank

When the Export Data option is selected an output directory for the resulting XML file must be selected.

Manager => Options menu => Advanced Features => Predict tab

Run Predict
Default: Selected
Recommended: Selected

Running Predict is required to include Workload Response Time values in the output Visualizer files created by the Manager run. In general, when properly licensed for Predict, we recommend that Predict be turned on unless there is a reason to disable it in your environment. The most common reason to disable execution of Predict is to decrease the nightly Manager run processing window in environments where workload response time in Visualizer is less important than processing duration (or where just the 'default' zzz workload is being used).
Generate Predict reports
Default: Unselected
Recommended: Unselected

Perform Automatic Calibration*
Default: Selected
Recommended: Selected (for Unix machines) / Unselected (for Windows machines)

When processing data from Unix machines it is generally best to have the Perform Automatic Calibration option selected to best calibrate CPU Wait Time in Predict to the Analyze measured run queue on the system. But when processing data from Windows machines it is generally best to have the Perform Automatic Calibration option unselected, because the measured run queue is known to be overstated due to the way that Windows reports it.[7]

Generate reports in XML format
Default: Unselected (Grayed out when Generate Predict Reports is unselected)
Recommended: Unselected

Predict commands file name
Default: Blank
Recommended: Blank

A Predict Commands File is an advanced feature that allows a command file to be specified which modifies the model file created by Analyze after it has been evaluated but before the Visualizer file or any reports have been created from Predict. There is almost never any reason to specify a Predict commands file as part of a nightly Manager run.

Export Data

Generate data for export
Default: Unselected
Recommended: Unselected

The Predict Export Data setting creates a separate XML file per Visualizer interval that could be used to import Predict data into another application. This feature is not frequently used. The metrics output to the Predict XML export file are described in the Predict Data Export schema file ($BEST1_HOME/bgs/pengine/adapter/Xml/XmlSchema/Predict.xsd).

Output Directory
Default: Blank
Recommended: Blank

When the Export Data option is selected an output directory for the resulting XML file must be selected.
Manager => Options menu => Advanced Features => Visualizer tab

Visualizer interval (minutes)
Default: 60
Recommended: 60

The Visualizer Interval determines the summarization interval/granularity of the data that will be populated to the Visualizer database. The default value is to summarize the data to a 60 minute (1 hour) interval. This means that the graphs in Visualizer will be broken down into 24 data points per day. In most environments the default of 60 minutes is a good starting point, but in some environments (or for some subset of machines) it may be desirable to summarize the data more frequently to provide a higher granularity view of data in Visualizer and Perceiver.

The selection of the Visualizer Interval has a direct impact on the total CPU, I/O, and processing duration of your nightly Manager runs. A 30 minute interval will require twice as much CPU, I/O, and processing time to complete the nightly Manager run. A 15 minute interval will require 4 times as much CPU, I/O, and processing time. Selecting the appropriate Visualizer Interval is a balancing act between resource consumption and the number of data points per day available in Visualizer and Perceiver.

Common problems related to a low Visualizer Interval:
Create System Visualizer files in [Predict | Analyze | Both]
Default: Both
Recommended: Both

Visualizer File Type
Default: System and Windows extended metrics
Recommended: System and Windows extended metrics

The 'Include Windows extended metrics' option in Manager determines if any of the Windows specific metrics will be included in the output Visualizer file when processing Windows data. If the 'Include Windows extended metrics' checkbox is disabled, only the Visualizer tables that are shared between both Windows and Unix machines will be included in the Visualizer file.[9]

Summarize I/O Server for Visualizer*
Default: Unselected
Recommended: Selected

In some environments Analyze will create a large number of workload@hostA@hostB workloads that seem to have no utilization associated with them. These are I/O Server workloads, a modeling construct used to balance remote I/O that was seen to occur in the model. In general there is no reporting value to these, so they should generally be summarized out to leave only the primary work carrying workloads in the resulting Visualizer file.[10]

Enable Transaction and Workload Name alteration
Default: Unselected
Recommended: Unselected

This option is deprecated. In the past it was used to ensure that workload names were unique within the first 30 characters to avoid name space collision in Visualizer. But the supported workload name length has been increased to 50 characters in Visualizer, which has eliminated this problem, so this option should be left disabled.

Enable Oracle Populate*
Default: Unselected
Recommended: Selected (when using an Oracle database for Visualizer)

The Oracle Populate feature (also called 'Unix Populate' or 'mpopulate') allows the Visualizer file created by Manager to be automatically populated to an Oracle Visualizer database. This feature only supports population to an Oracle database, not population to other databases such as Microsoft SQLServer.
When using an Oracle database for Visualizer this is the easiest method available to populate the data into the database.[11]

Database User Name
Default: Blank
Recommended: N/A

The Oracle username to use to connect to the database.

Database Password
Default: Blank
Recommended: N/A

The password for the specified user.

Database Instance
Default: Blank
Recommended: N/A

The database service name (TNS name) for the database.

Following Populate [ None | Move vis Files to | Copy vis Files to | Delete vis Files ]*
Default: None
Recommended: Move vis Files to

When using Unix Populate, once the Visualizer file has been successfully populated it can either be left in the Manager Output Directory, moved or copied to another location, or deleted. The best option is to Move the Visualizer file to another directory. The reason is that the move will only happen after the file has been successfully populated (if the population fails, the Visualizer file will be left in the Manager Output Directory), which makes it very easy to tell which Visualizer file populates failed and which succeeded on a nightly basis.

Destination*
Default: Blank
Recommended: Site specific

This is the directory to move the Visualizer file to after successful population. An archive directory like ./manager/daily/visarchive/[Run Name] is a good option since it keeps the Visualizer files from the various Manager runs separate but all under the same base directory structure.

Manager => Options menu => Advanced Features => OSR tab

Run Operational Status Reporting*
Default: Selected
Recommended: Unselected

The Operational Status Reporting (OSR) feature creates HTML reports that contain information related to nightly processing. In general, OSR should be left disabled because the OSR reports have been superseded by the collect, transfer, and processing status reporting made available via General Manager.[12] There is a CPU, I/O, and run duration cost to running OSR as part of a nightly Manager run.
Some sites find that the OSR reports are beneficial to them and are worth that cost. Other sites, after using the OSR reports, determine that they can get the same basic information out of the [date]-[date].ProcessDay.out file and don't require the extra overhead of creating the HTML reports.

Output Override Directory (Optional)
Default: Blank
Recommended: N/A

By default the OSR reports will be written to the $BEST1_HOME/local/workarea/osr_output directory. This field allows the OSR reports for the Manager run to be redirected to an alternate location.

Manager => Options menu => Advanced Features => Other tab

Miscellaneous Options

Preprocess command
Default: Blank
Recommended: N/A

The Preprocess command is a script executed at the beginning of the run (before any data processing has been done). In the vast majority of Manager runs this field will be left blank and no script will be executed. The Manager run will wait until this script finishes executing before continuing. The Preprocess command is executed once before the first domain is processed for a run. It is not executed before each domain begins processing in turn.

Preprocess command arguments
Default: Blank
Recommended: N/A

The command line arguments to pass to the Preprocess command.

Postprocess command
Default: Blank
Recommended: N/A

The Postprocess command is a script executed at the end of the Manager run.

Postprocess command arguments
Default: Blank
Recommended: N/A

The command line arguments to pass to the Postprocess command.

Processing delay (days, hours, & minutes)*
Default: 0 days 00 hours 10 minutes
Recommended: Site specific

The Processing delay determines the minimum amount of time that Manager will wait before checking to see if all of the UDR data has been transferred. At that point it will either (a) begin data processing if all data has been transferred, or (b) wait until all data is transferred or the Transfer Duration elapses. A good starting value for the Processing Delay is 30 minutes or half the Transfer Duration.
Setting a lower Processing delay for a Manager run can give that run a priority in relation to beginning data processing, so it may be able to complete data processing earlier in the day (assuming that the data is able to transfer back to the console quickly).

Processing timezone
Default: Blank
Recommended: N/A

This field allows an alternate time zone to be specified to adjust the times in the output Visualizer file in relation to that time zone. This field is somewhat complex to use on the Unix console since it only adjusts the time applied to data processing in Analyze, not the time used for data collection on the remote node or the times when the script will be executed on the console. In general, the simplest configuration is to not specify a Processing timezone and just allow the data to be processed in relation to the time zone of the console.[13]

Use time stamp for output directory
Default: Selected
Recommended: Selected

By default this option is selected. This will cause a date/time stamp directory to be created in the Manager Output Directory whenever a Manager run is activated. As long as that Manager run is active it will continue to use the same date/time stamp directory (so the files for a Manager run activated on March 16th at 1:54 PM would be created in a Mar-16-2009.13.54 sub-directory under the 'Output Files Directory' specified on the Data tab of the main Manager window).
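To make the naming in the example above concrete, the following Python sketch builds the sub-directory path for a given activation time. The format string is inferred from the single example in the text (Mar-16-2009.13.54) and is an assumption, as are the function name and the run name RunA; Manager itself creates this directory, so the sketch is only useful for scripts that need to locate it.

```python
from datetime import datetime
from pathlib import Path

# Pattern inferred from the example "Mar-16-2009.13.54"; the exact format
# Manager uses is an assumption here.
STAMP_FORMAT = "%b-%d-%Y.%H.%M"


def run_output_dir(output_files_dir, activated_at):
    """Return the date/time stamp sub-directory for a run activated at the given time."""
    return Path(output_files_dir) / activated_at.strftime(STAMP_FORMAT)


if __name__ == "__main__":
    # Hypothetical run named RunA, activated on March 16th 2009 at 1:54 PM.
    activated = datetime(2009, 3, 16, 13, 54)
    print(run_output_dir("./manager/daily/RunA", activated))
```

Note that the month abbreviation produced by `%b` depends on the process locale, so the sketch assumes an English or C locale.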
This option should be selected since it is a required setting when using the $BEST1_HOME/bgs/scripts/migrateManagerRuns.pl script to migrate Manager runs from a previous console version to a new console version after an upgrade.[14]

Collect Results transfer duration

Maximum time allowed for data transfer (hours & minutes)*
Default: 1 hour 00 minutes
Recommended: Site specific

By default, a Manager run will wait 60 minutes for all data to be successfully transferred from the remote nodes before starting data processing and trying to process whatever data was successfully transferred to the console. In most small to medium sized environments a 60 minute transfer duration will be sufficient for all data to be transferred from the remote nodes. In a large environment or an environment with network stability problems a longer transfer duration may be necessary. The UDR Collection Manager (UCM) Status Reports[15] can be used to see if transfers are failing due to the specified transfer duration being insufficient.

Delete collect data on agent computer if transfer fails*
Default: Unselected
Recommended: Selected

This option can be used to instruct the remote node to automatically delete data that is still left on the remote node due to a failed transfer after some period of time. Typically it is good to enable this option to ensure that data that has failed to transfer is eventually deleted from the remote node once it is clear that a manual transfer isn't going to be done to recover that failed transfer. After a successful transfer of data from the remote node to the console, data is immediately deleted from the remote node, so this setting will have no effect on successful transfers.

Delete data after (days & hours)*
Default: 0 days 0 hours
Recommended: 14 days 0 hours (or Site specific)

This is the time the remote node should wait before automatically deleting the data that has failed to transfer to the console from the remote nodes.
The time period specified should be long enough that any desired manual recovery of the failed transfer will be completed, but short enough that the file system on the remote nodes will likely not fill up due to untransferred data. A good starting point for this value is 14 days.[16]

Manager => Options menu => Data Management => Compress tab

Compress data in current interval
Default: Unselected
Recommended: Unselected

The Compress data in current interval option should not be selected because Manager is designed to ensure that both the current day and previous day's data are uncompressed and available for Analyze during data processing. This means that the Compress data in current interval option doesn't result in a space savings because, by design, Manager requires the current day and the previous day's data to be uncompressed in the Console Data Repository.

Compress data older than X days*
Default: Unselected
Recommended: 7 days (or Site specific)

This feature will compress UDR data in the Console Data Repository that is older than the specified period. Compressing older UDR data will reduce data processing times for current Manager runs because it decreases the amount of data that Analyze must search during its data discovery.[17]

Compress Manager output files
Default: Unselected
Recommended: Unselected

This feature will compress files in the Manager Output Directory after the nightly Manager run. The overall space savings is generally not sufficient for this feature to be worthwhile to enable.

File extensions
Default: Grayed out when 'Compress Manager output files' is unselected
Recommended: N/A

Manager => Options menu => Data Management => Archive tab

Archive Data Files
Default: Unselected
Recommended: Unselected

The Archive tab in the Data Management menu provides access to deprecated functionality related to archiving directly to a tape drive. This functionality is deprecated and may be removed without notice from a future release of the Perform product.
This functionality should not be used.

Manager => Options menu => Data Management => Delete tab

Delete data older than X days*
Default: Unselected
Recommended: Site specific

This feature will automatically delete UDR data from the Console Data Repository that is older than the specified time. Many sites implement a separate custom data cleanup script that runs outside of the normal nightly processing window to delete or archive UDR data older than a specified date rather than using the automatic data deletion functionality provided by Manager (since a custom script can implement the cleanup far more efficiently and it can be done outside of the nightly processing window, which will eliminate contention between data processing and data cleanup).

Delete temporary files after run*
Default: Unselected
Recommended: Selected

During normal nightly Manager runs this option should be selected so temporary files created during data processing can be deleted. But there are some files that should be kept around for at least a few days rather than being deleted immediately after the run. The deletion of these files would then need to be managed outside of Manager by a custom script, but these files will be useful in determining what went wrong if there is a problem, or can be used to re-transfer or re-process data easily if there was a major transfer or processing failure.

File Extensions*
Default: All files
Recommended: Exempt some files from automatic deletion

The files to exclude from temporary file deletion are:
Be aware that the *.Manager script should not be deleted by a cleanup script. The *.Manager script is created at the beginning of the Manager run (when the run is initially submitted) and controls the run throughout its life. If the *.Manager script is deleted, the Manager run will terminate at that point.

Manager => PC Transfer menu

Automatic Transfer Mode
Default: Unselected
Recommended: Site specific

The PC Transfer menu is designed to allow the Visualizer file created by the nightly Manager run to be uploaded (via FTP if using the default ftpsend.template script) to the Visualizer PC where Automator will be running to populate the Visualizer file. This feature is necessary when the Visualizer database isn't Oracle (since Manager on Unix can only populate an Oracle database directly). This feature can also be modified to notify Automator running on the Visualizer PC that population has completed, so that a database summarization, maintenance, or reporting script can be executed.[18]

Transfer shell script
Default: Blank
Recommended: /usr/adm/best1_default/bgs/scripts/ftpsend.template

Host
Default: Blank
Recommended: N/A

User
Default: Blank
Recommended: N/A

Password
Default: Blank
Recommended: N/A

Directory
Default: Blank
Recommended: N/A

Global Marker File
Default: Unselected
Recommended: Selected (if Automatic Transfer Mode selected)

The Global Marker File is created and sent at the end of the Manager run (when all domains have finished processing). The marker file is used by the 'Wait for Marker' event within the Automator run.

Domain Marker File
Default: Unselected
Recommended: Site specific

The Domain Marker File is created and sent at the end of processing for each individual domain within a Manager run. The marker file is used by the 'Wait for Marker' event within the Automator run.
Manager => Advanced Scheduling menu => Days of Week tab

Select Days of Week
Default: All selected
Recommended: All selected

This dialog box allows certain days of the week to be excluded from the Manager run. The recommended option is to select all days of the week for data collection and data processing.

Manager => Advanced Scheduling menu => Exception Dates tab

Select Exception Dates
Default: No dates specified
Recommended: No dates specified

This dialog box allows certain days of the year to be excluded from data collection. For example, if you really don't want to know about machine performance on Christmas you could specify Dec 25th 2014 in this box. This feature is not frequently used within the product.

Section III: Configuration Files

The definition and default value for each option available in the Manager configuration files on the console is available within the configuration files themselves. Therefore, only options where a change is recommended or requiring further explanation will be described here.

$BEST1_HOME/local/setup/dataProcessing.cfg

MAX_CONCURRENT_DATA_PROCESSING
Default: 4 (Unix console) / 2 (Windows console)

RELEASE_LOCK_TIMEOUT
This setting controls how long before a data processing lock is automatically released in the event that a Manager run hasn't released its processing lock (due to the lock process being killed, or whatever reason). The default value of 6 hours is probably acceptable in most environments (since a Manager run failing to release a processing lock is relatively uncommon on Unix).

DATA_PROCESSING_LOG_FILE_MAX_SIZE
This setting controls the size of the $BEST1_HOME/local/manager/log/[hostname]-ManagerLock.log file. The default is generally appropriate for all environments.

DATA_PROCESSING_LOG_WARNING_LEVEL
This setting controls the volume of debug output written to the $BEST1_HOME/local/manager/log/[hostname]-ManagerLock.log file.
The default (Errors only) is generally appropriate for all environments.

$BEST1_HOME/local/setup/collectManager.cfg

COLREQ_DAYS_ADVANCE
This setting and the recommendations for it are described in detail in KA000030392: What is the UDR Collection Manager COLREQ_DAYS_ADVANCE feature? What is the benefit of this being enabled by default in Perform version 7.4.10 and later?

COLREQ_CONCURRENT_GLOBAL
The COLREQ_CONCURRENT_GLOBAL parameter allows Manager to throttle the number of concurrent collection requests being sent from the console to remote nodes. In almost all environments the default value of 10 will be sufficient to allow all collection requests to reach the remote nodes before the scheduled start of data collection. This is because the requests are sent to the remote nodes starting 30 minutes before the scheduled start time of data collection (by default, based upon the Run script before collection time setting in the Manager VCMDS file).

COLREQ_CONCURRENT_RUN
The COLREQ_CONCURRENT_RUN setting allows Manager to throttle the number of concurrent collection requests being sent from the console to remote nodes by any individual Manager run. In virtually all environments the default value of 10 will be sufficient to allow all collection requests to reach the remote nodes before the scheduled start of data collection.

COLREQ_FAILURE_RETRY_INTERVAL
The COLREQ_FAILURE_RETRY_INTERVAL setting controls how long to wait between collection retry attempts sent to the remote node after a failed request (for example, due to the remote node being down or inaccessible over the network). In virtually all environments the default value of 5 minutes is appropriate.

COLREQ_MIN_PERCENT_TIME
The COLREQ_MIN_PERCENT_TIME setting controls how long before the specified end of data collection Manager will stop making additional collection requests to the remote nodes. In almost all environments the default value of 10% is appropriate.
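The cutoff implied by COLREQ_MIN_PERCENT_TIME is simple arithmetic on the collection window. The following is a minimal sketch of that arithmetic, not UCM's actual implementation: the function names to_minutes and cutoff_time are hypothetical, and rounding the reserved time to the nearest minute is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: convert HH:MM to minutes since midnight.
to_minutes() {
    h=${1%%:*}
    m=${1##*:}
    # Strip one leading zero so values such as 08 are not read as octal.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# Hypothetical helper: cutoff_time START END PERCENT
# Prints the latest time (HH:MM) at which a collection request would still
# be attempted, assuming the reserved time is PERCENT of the collection
# window rounded to the nearest minute (an assumption, not documented).
cutoff_time() {
    start=$(to_minutes "$1")
    end=$(to_minutes "$2")
    pct=$3
    window=$(( end - start ))
    reserve=$(( (window * pct + 50) / 100 ))
    cutoff=$(( end - reserve ))
    printf '%02d:%02d\n' $(( cutoff / 60 )) $(( cutoff % 60 ))
}

cutoff_time 00:00 23:59 10   # prints 21:35
cutoff_time 00:00 14:00 10   # prints 12:36
```

A run submitted after the printed time would fall inside the final 10% of its collection window.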
For a 24 hour collection period (00:00 to 23:59) the default value of 10% would instruct UDR Collection Manager (UCM) to stop trying to start data collection for the current day at 9:35 PM (144 minutes before 23:59).

This setting most commonly results in unexpected behavior when testing a short Manager run. If a Manager run is scheduled at 13:00 (1 PM) to collect data from 00:00 to 14:00 (2 PM), it won't execute data collection on the remote node because less than 10% of the collection period (84 minutes) remained when the run was initiated.

TRANSFER_CONCURRENT_GLOBAL
This setting determines how many concurrent data transfer requests will be allowed on the console. The default value of 10 is workable for small to medium size environments, but it is a very conservative value and most consoles can easily support additional concurrent data transfers.

For a small to medium size environment a TRANSFER_CONCURRENT_GLOBAL value of '20' would probably be good. For a medium to large environment a TRANSFER_CONCURRENT_GLOBAL value of '50' would probably be good. When increasing the TRANSFER_CONCURRENT_GLOBAL parameter it's best to monitor the amount of time dedicated to data transfer before and after the change. If the parameter is increased too much it will cause contention between multiple data transfers, which will actually increase the total amount of time needed to transfer data and may also result in data transfer failures. KA000028597: Why does a very high setting for the UCM TRANSFER_CONCURRENT_GLOBAL parameter cause data transfers to fail? describes that problem in more detail.

TRANSFER_CONCURRENT_RUN
This setting determines how many concurrent data transfer requests will be allowed for an individual Manager run.
In general the default setting of 10 is appropriate for most environments.

Decreasing this parameter may be useful in environments where nodes in the same remote geographic location are contained within the same Manager run and there is a slow link between the console and each remote location. Decreasing the number of concurrent transfers per Manager run may spread the transfer work across the various slow links, resulting in better overall throughput. But that is not a common scenario.

TRANSFER_FAILURE_RETRY_INTERVAL
How long to wait after a failed transfer before a computer can be put back on the transfer retry list. The default value of 5 minutes is appropriate for most environments.

COLMGR_DATA_RETENTION
The number of days to keep all the UCM status and log files. The data management process (which is automatically scheduled in pcron by Manager) will automatically delete status and log files older than this retention period. The default value of 7 days is appropriate for most environments.

COLMGR_STATUS_INTERVAL_OVERRIDE
By default the collection status reports are updated on the console after each collector query sent to the remote node, based upon the Collector Restart interval. This option can be used to override that behavior and force the status reports to be updated at a fixed rate. The default option of 0 (disabled) is appropriate for most environments.

COLMGR_DETAILED_STATUS_LEVEL
This parameter controls whether the detailed status messages will be included in the UCM status reports. The default setting of 3 (detailed status messages for all collection requests) is appropriate for most environments.

FIREWALL_METRIC_GROUP_WARNING_STATUS
When a remote node is unable to collect some metric groups, that node will be reported in Warning state (yellow) in the UDR Collection Manager (UCM) Node Status reports.
In many cases analysis will reveal that the metric groups are missing because they are not available on the remote node. In that case you'll want to flag those groups as not being required for the node to be reported in OK status (green) in the UCM reports, which can be done using the udrCollectFilter command.

The udrCollectFilter command is only able to filter missing metric groups when the remote node has provided the console with the list of missing metric groups. This is done by the remote node contacting the managing node on port 6768 and providing the list of groups that it is unable to collect. If the remote node is unable to initiate a connection to the console on port 6768 (possibly due to a firewall between the remote node and the console), the list of groups not being collected will not be available on the console and udrCollectFilter will not be able to change the node status from Warning to OK. This parameter allows UDR Collection Manager (UCM) to be configured to report remote nodes in OK state (green) when a count of missing metric groups has been returned by the collection status query, but information regarding which groups could not be registered is not available on the console.

COLMGR_TRANSFER_DELAY
This setting specifies the transfer delay period, in minutes, that is added to the end collection time to determine the start time of the transfer. The default value of 5 minutes is appropriate for most environments.

TRANSFER_USE_FILE_LOCKS
This setting controls whether file locks are used during UDR file transfer. In a properly configured environment this setting can be left disabled (the default value). This setting can be used to protect against data corruption if a remote computer has been included in multiple Manager runs on the console with the same collection start time.
In that scenario both runs may attempt to transfer data from the remote node to the console concurrently, resulting in UDR data file corruption in the console repository for that computer.

$BEST1_HOME/local/setup/Populate.cfg

By default the $BEST1_HOME/local/setup/Populate.cfg file does not exist as part of a Perform installation. There is a $BEST1_HOME/bgs/setup/Populate.cfg.sample sample file that can be copied into the $BEST1_HOME/local/setup directory and renamed to allow the default values to be modified.

TIMEOUT
The TIMEOUT setting determines how long to wait for population of an individual Visualizer file to complete before timing out the populate. The default value of 3600 seconds (1 hour) is sufficient for most environments.

This setting most commonly needs to be increased when the Visualizer database is known to be offline for periods during the nightly populate and it is necessary to wait for the database to become available again to continue the population.

CONCURRENT_POPULATION
By default, the Unix console is configured to allow mpopulate to populate Visualizer files to the same Visualizer database concurrently. In Perform version 7.4.10 and later concurrent population is generally not a problem as long as row level locking has been enabled within the Visualizer database schema.

TRACE
The TRACE setting allows the population trace to be enabled for all mpopulate runs on the console. Enabling population tracing will have a considerable negative impact on the speed of population. The population trace can be enabled for an individual manual population by specifying the '-t' flag when running mpopulate.sh from the command line.

Appendix A: Sample Cleanup Script

When certain files have been excluded from the Delete temporary files after run list it becomes necessary to manage the deletion of those files through a custom script.
A simple way to do that would be via the following script:

    #!/bin/sh
    ######################################################################
    # Begin User Configuration Options                                   #
    ######################################################################
    #
    # The Manager Output Directory to search for files to delete
    #
    MANAGER_OUTPUT_DIRECTORY=/home2/best1data/manager/daily
    #
    # Delete files older than MAX_AGE days
    #
    MAX_AGE=7
    ######################################################################
    # End User Configuration Options                                     #
    ######################################################################
    PATH="/usr/bin:/bin:/usr/sbin:/sbin:/etc:/usr/ucb"; export PATH;
    # Remove the retained temporary files once they are older than MAX_AGE
    # days, listing each file that was removed.
    find "$MANAGER_OUTPUT_DIRECTORY" \( -name "*.Collect" \
        -o -name "*.ProcessDay" -o -name "*.XferData" \
        -o -name "*.Variables" -o -name "*.ProcessDay.out" \) \
        -mtime +${MAX_AGE} -exec rm {} \; -ls

This script could be executed via cron or run by Manager as a Post-Processing shell script.

Appendix B: Change Log

Version 0.8 (2009-04-17)
Version 0.9 (2009-05-01)
Version 1.0 (2009-05-04)
Version 1.0.1 (2009-07-15)
Version 1.1 (2014-06-04)
[3] The actual technical limit is that the length of the domain list should be less than 1024 characters. For example, if each domain specified is 80 characters in length then 12 domains can be specified.
[4] I had historically used December 21st 2012 but that date came and went. January 18th, 2038 might be a good new future date.
[5] One somewhat confusing behavior when you specify a Daily End of 00:00 is that the name of the Visualizer files will change. Before, when the interval end time was 23:59, the name of the Visualizer file for data from March 16th would be '316xDOMAIN.vis'. Once the end time is changed to 00:00, the name of the Visualizer file for data from March 16th will be '317aDOMAIN.vis'. The reason is that the name of the Visualizer file comes from the end of the last interval it contains. In this case the end of the interval is now 03/17 @ 00:00 instead of 03/16 @ 23:59, so the name of the file represents the beginning of the next day rather than the end of the previous day.
[6] The collectManager.cfg COLREQ_CONCURRENT_RUN parameter controls the maximum number of concurrent start requests per Manager run and the COLREQ_CONCURRENT_GLOBAL parameter controls the maximum number of concurrent start requests that can be outstanding from the console across all runs.
[7] The primary impact of queue length calibration is to adjust workload response time in an effort to calibrate the Predict model view of a machine's behavior to the view of what really happened from the measured data. More detailed information related to CPU queue length calibration is available in KA412713: How can I fix a CPU Queue Length calibration exception in Perform Predict?
[8] It is physically possible to process with a 1 minute Visualizer interval but it causes graphics artifacts in Visualizer and requires a code change to Manager. KA339826: How can the minimum Visualizer interval be reduced to 1 minute in Manager on Unix since by default Manager only allows a minimum Visualizer interval of 2 minutes? describes the process.
[13] More information on some of the complexities and considerations related to using the Processing timezone can be found in KA316306: How can I process data using its 'native timezone' through UNIX Manager?
[16] If there are Perform version 7.3.00 or earlier remote nodes in your environment, the Delete data after setting should not be set to greater than 27 days due to Defect 532523, which causes the Perform Agent to go into a loop when a data deletion period longer than that is specified. An agent patch to address that problem is available for Perform version 7.3.00 and the issue is fixed in Perform version 7.4.00 production and later.
[17] This is less of a problem with modern releases of BPA on Linux due to changes in (a) the way that Analyze finds the correct time period of UDR data during the nightly run and (b) the number of times it does a UDR data discovery. In modern releases of BPA, Analyze uses the UDR data Mmm-dd-yyyy.hh.mm time stamp directory as a hint for which directories to look in for yesterday's data (when running as part of automated nightly data processing). Also, the master Analyze process only does a single UDR data discovery, and the child Analyze processes can then read the data directly. In the past, when Analyze searched for UDR data it used just the 'hostname' directory and did not use the Mmm-dd-yyyy.hh.mm directory as a filter. Analyze would browse all of the UDR data under the hostname directory and check the time period that data covered. That meant that if the console data repository contained a large amount of data, processing would be slower because Analyze had to keep searching the full data repository to determine which UDR data files to open for the current interval being processed. In addition, a separate bgsanalyze process was used to process each interval in the day, so this search was repeated for every interval; it didn't just find the data once and then know to keep looking in that directory.
|