There are three phases to consider when recovering a Manager run:
(1) Re-initializing the Manager run to get scripts created for the current day's collection, transfer, and processing
(2) Executing the data transfer and processing scripts for the previous day, if necessary
(3) Recovering data transfer and processing for days where no processing scripts were created by Manager
Depending on the TrueSight Capacity Optimization (TSCO) Gateway Server version, different options are available for these recovery phases.
TSCO Gateway Server Phase 1 and Phase 2 Recovery
The 'ManagerComputerFailureRecovery.pl' script can be used to automatically recover today's Manager runs (to get the scripts created and the udrCollectMgr processes running) and execute a recovery of data processing for any runs which have a [date]-[date].XferData, [date]-[date].ProcessDay, and [date]-[date].Variables file created for them.
cd $BEST1_HOME/bgs/scripts # Where $BEST1_HOME represents the /usr/adm/best1_V.V.VV directory where V.V.VV is the TSCO Gateway Server version, such as 20.02.00
./ManagerComputerFailureRecovery.pl
TSCO Gateway Server version 10.7.01 and earlier Phase 1 (Run Re-initialization) Recovery
The first thing to do is execute all of the [date]-[date].Manager scripts to initialize the runs for today.
The following KA describes the easiest way to do that:
000032140: Recovery and debugging options for CO Gateway Server/BPA Unix console Manager run failures. (
https://bmcsites.force.com/casemgmt/sc_KnowledgeArticle?sfdcid=000032140)
That should result in udrCollectMgr processes running for all of your Manager runs and all of the necessary scripts being created for today ([today]-[today].XferData, .ProcessDay, .Variables).
TSCO Gateway Server version 10.5 and earlier Phase 2 (Previous Day Recovery) Recovery
Once you've initialized today's Manager run (udrCollectMgr processes are running and the required scripts to collect, transfer, and process the data have been created) the next step is recovery of data transfer and data processing for previous days.
(2.1) If the [date]-[date].XferData, [date]-[date].ProcessDay, and [date]-[date].Variables files exist in the Manager Output Directory for a day then the easiest way to recover the data transfer and data processing for that date is to manually execute the [date]-[date].XferData script with the '-r' recovery flag.
The following KA describes that process:
000028141: In BPA Linux console, how can I quickly recover nightly Manager data processing if the *.ProcessDay script didn't run to completion and thus runs aren't listed in General Manager? (
https://bmcsites.force.com/casemgmt/sc_KnowledgeArticle?sfdcid=000028141)
The idea of that KA is that you are just building a list of all of your [date]-[date].XferData scripts in a script so they can all be executed with the '-r' flag.
Another way to do that same thing is via the 'find' command:
cd /[Base Manager Run Directory]
find . -name "Mmm-dd-yyyy*.XferData" -print > xfer.sh
Where Mmm-dd-yyyy is the target date of the run. For example, to find March 16th's XferData scripts:
cd /[Base Manager Run Directory]
find . -name "Mar-16-2017*.XferData" -print > xfer.sh
For a more detailed example, say I have two Manager runs scheduled:
$BEST1_HOME/bgs/scripts/listManagerRuns.pl -p MANAGER_COMMANDS_FILE OUTPUT_DIRECTORY
#MANAGER_COMMANDS_FILE,OUTPUT_DIRECTORY,
/usr/adm/best1_workspace/automation/TSCO10700_lab.vcmds,/usr/adm/best1_workspace/results/Mar-06-2017.14.10_TSCO10700_lab,
/usr/adm/best1_workspace/automation/TSCO10700_mf_lab.vcmds,/usr/adm/best1_workspace/results/Apr-04-2017.16.20_TSCO10700_mf_lab,
In the above, my [Base Manager Run Directory] is "/usr/adm/best1_workspace/results" since both Manager runs are scheduled in a sub-directory of that.
cd /usr/adm/best1_workspace/results
find . -name "Aug-07-2017*.XferData" -print > xfer.sh
But then we need to edit that file and add "-r &" to the end of each line.
So, I start with a file like this:
./Apr-04-2017.16.20_TSCO10700_mf_lab/Aug-07-2017.00.00-Aug-07-2017.23.59.XferData
./Mar-06-2017.14.10_TSCO10700_lab/Aug-07-2017.00.00-Aug-07-2017.23.59.XferData
Then I need to edit it and add '-r &' at the end of each line to come up with this:
./Apr-04-2017.16.20_TSCO10700_mf_lab/Aug-07-2017.00.00-Aug-07-2017.23.59.XferData -r &
./Mar-06-2017.14.10_TSCO10700_lab/Aug-07-2017.00.00-Aug-07-2017.23.59.XferData -r &
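The find and edit steps above can be combined so the file does not need to be edited by hand. The following is a minimal sketch, not part of the product: build_xfer_script is a hypothetical helper name, and it assumes the XferData paths contain no spaces or shell metacharacters, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not a TSCO command): write a recovery script that
# lists every XferData script for one day, each followed by "-r &".
build_xfer_script() {
    base_dir=$1   # [Base Manager Run Directory]
    run_date=$2   # target date in Mmm-dd-yyyy form, e.g. Aug-07-2017
    out_file=$3   # file to generate, e.g. xfer.sh

    # Run find from inside base_dir so the paths stay relative ("./..."),
    # as in the manual procedure. In the sed replacement, "\&" is a
    # literal ampersand, so " -r &" is appended to the end of each line.
    ( cd "$base_dir" &&
      find . -name "${run_date}*.XferData" -print |
      sort |
      sed 's/$/ -r \&/' ) > "$out_file"
}
```

For example, running `cd /usr/adm/best1_workspace/results` and then `build_xfer_script . Aug-07-2017 xfer.sh` should generate a file equivalent to the edited one above. Because the paths are relative, review the file and run it from the same directory, for instance with `sh ./xfer.sh`.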
TSCO All Releases Phase 3 Recovery (Recovery of data processing when Manager scripts weren't created or don't exist for that day)
If the [date]-[date].XferData, .ProcessDay, or .Variables files don't exist for a particular date, that means the run was never registered on that day. This is by far the most complex recovery scenario, with little opportunity for automation.
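To see which runs fall into this scenario for a given day, the file check can be scripted. This is a minimal sketch under stated assumptions: each Manager Output Directory is an immediate sub-directory of the [Base Manager Run Directory], and the script file names begin with the date in Mmm-dd-yyyy form, as in the Phase 2 examples. The name list_unregistered_runs is hypothetical and is not a TSCO command.

```shell
#!/bin/sh
# Hypothetical helper (not a TSCO command): print each run directory under
# base_dir that is missing at least one of the three per-day files.
list_unregistered_runs() {
    base_dir=$1   # [Base Manager Run Directory]
    day=$2        # date in Mmm-dd-yyyy form, e.g. Aug-07-2017

    for run_dir in "$base_dir"/*/; do
        [ -d "$run_dir" ] || continue
        for ext in XferData ProcessDay Variables; do
            # ls exits non-zero when the pattern matches no file
            if ! ls "$run_dir$day"*."$ext" >/dev/null 2>&1; then
                echo "${run_dir%/}"   # strip the trailing slash
                break                 # one missing file is enough
            fi
        done
    done
}
```

For example, `list_unregistered_runs /usr/adm/best1_workspace/results Aug-07-2017` would print the run directories that need Phase 3 recovery for that day. Directories it does not print have all three files and are candidates for the '-r' recovery described in Phase 2.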
In general, Phase 3 recovery first requires the UDR data to be transferred back to the console via a script:
000031075: What is the best way to get data collected via the UCM "Collect Days in Advanced" feature transferred back to the TrueSight Capacity Optimization Gateway Server (BPA console) when Manager has failed to start data collection for that day? (
https://bmcsites.force.com/casemgmt/sc_KnowledgeArticle?sfdcid=000031075)
Ad hoc Manager runs must then be executed to re-process the data.
The following KA describes how to submit an ad hoc Manager run via the Gateway Manager UI:
000371220: How to re-process UDR files into a VIS file with TrueSight Capacity Optimization (TSCO) (
https://bmcsites.force.com/casemgmt/sc_KnowledgeArticle?sfdcid=000371220)
That process works well for a small number of Manager runs but requires a lot of effort when several Manager runs need to be recovered.
There really isn't a great way to recover the data processing when the *.ProcessDay scripts weren't created for a Manager run on a particular day. There are some manual steps available that can accelerate the creation and execution of the ad hoc Manager run for data processing recovery.
To manually define Manager runs from the command line for the ad hoc processing (once the data has been transferred back to the TSCO Gateway Server console) do the following:
(1) Make a copy of each of your nightly Manager *.VCMDS files in a new directory and rename each of them to indicate that it is an 'ad hoc' run (for example, add 'reprocess' to the VCMDS name).
(2) Edit each *.VCMDS file and change the following:
RUN_COLLECT to 'NO'
START_DATE to 'Aug-05-2023'
END_DATE to 'Aug-08-2023'
DELAY_RUN_BY to '0:00:05'
You can even do that to all the *.vcmds files in the directory at once with the following 4 'sed' commands:
sed -i 's/RUN_COLLECT.*/RUN_COLLECT NO/g' *.vcmds
sed -i 's/START_DATE.*/START_DATE Aug-05-2023/g' *.vcmds
sed -i 's/END_DATE.*/END_DATE Aug-08-2023/g' *.vcmds
sed -i 's/DELAY_RUN_BY.*/DELAY_RUN_BY 0:00:05/g' *.vcmds
That assumes the date range of the recovery is Aug 5, 2023 through Aug 8, 2023. Change the START_DATE and END_DATE to fit your recovery time period.
Just make sure you don't run that in a directory where your regular nightly Manager VCMDS files are (because that will really mess up your nightly runs).
(3) Then, once all your Manager VCMDS files have been updated you just need to execute each of them:
$BEST1_HOME/bgs/scripts/best1manager [VCMDS]
For example:
$BEST1_HOME/bgs/scripts/best1manager run1_reprocess.vcmds
That will register the run to reprocess the data from August 5th to August 8th. However, it will only reprocess data already transferred to the console, so you'll need to use KA 000031075 (referenced above) to get the data transferred to the console before creating the ad hoc recovery Manager runs.
Note that DELAY_RUN_BY is still set, so when you execute the script via the best1manager command it will still get registered in pcron and the execution will be delayed by 5 minutes. That allows you to get all of the runs registered and then throttled via the normal dataProcessing.cfg throttling, rather than running one at a time (as would happen if DELAY_RUN_BY were set to 0). After 5 minutes you should see the *.ProcessDay scripts running.
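Steps (1) through (3) can be sketched as two small shell functions. This is an illustration under stated assumptions rather than a supported tool: prepare_adhoc_vcmds, register_adhoc_runs, the '_reprocess' file suffix, and the BEST1MANAGER override variable are hypothetical names, and the VCMDS files are assumed to use the '.vcmds' extension and the keyword-value layout targeted by the 'sed' commands above. Rather than editing in place, it writes edited copies into a separate directory so the nightly files are never modified.

```shell
#!/bin/sh
# Hypothetical helper: copy each nightly *.vcmds file into dest_dir as
# <name>_reprocess.vcmds with the ad hoc settings applied.
prepare_adhoc_vcmds() {
    src_dir=$1      # directory holding the nightly VCMDS files
    dest_dir=$2     # new directory for the ad hoc copies
    start_date=$3   # e.g. Aug-05-2023
    end_date=$4     # e.g. Aug-08-2023

    mkdir -p "$dest_dir" || return 1
    src_real=$(cd "$src_dir" && pwd -P) || return 1
    dest_real=$(cd "$dest_dir" && pwd -P) || return 1
    # Guard: never write into the nightly directory itself
    if [ "$src_real" = "$dest_real" ]; then
        echo "destination must differ from the nightly directory" >&2
        return 1
    fi

    for f in "$src_dir"/*.vcmds; do
        [ -f "$f" ] || continue
        name=$(basename "$f" .vcmds)
        # Same substitutions as the manual 'sed' commands, written to a copy
        sed -e 's/RUN_COLLECT.*/RUN_COLLECT NO/' \
            -e "s/START_DATE.*/START_DATE $start_date/" \
            -e "s/END_DATE.*/END_DATE $end_date/" \
            -e 's/DELAY_RUN_BY.*/DELAY_RUN_BY 0:00:05/' \
            "$f" > "$dest_dir/${name}_reprocess.vcmds"
    done
}

# Hypothetical helper: register every prepared ad hoc run via best1manager.
# BEST1MANAGER is an assumed override so the loop can be dry-run with echo.
register_adhoc_runs() {
    dest_dir=$1
    manager=${BEST1MANAGER:-$BEST1_HOME/bgs/scripts/best1manager}
    for f in "$dest_dir"/*_reprocess.vcmds; do
        [ -f "$f" ] || continue
        "$manager" "$f" || echo "registration failed: $f" >&2
    done
}
```

A possible sequence is `prepare_adhoc_vcmds [nightly VCMDS directory] [new ad hoc directory] Aug-05-2023 Aug-08-2023`, then a dry run with `BEST1MANAGER=echo register_adhoc_runs [new ad hoc directory]` to list what would be submitted, then the same call without the override. As noted above, the UDR data must already have been transferred back to the console before the runs are registered.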