Archive Manager is very slow with 9.1. We can see it is taking 15 minutes to index a single incident. What are the things that can be done to troubleshoot why archiving incidents with 9.x takes such a long time? |
This knowledge article may contain information that does not apply to version 21.05 or later which runs in a container environment. Please refer to Article Number 000385088 for more information about troubleshooting BMC products in containers. Please keep in mind that the archive process itself was designed to be slow so that it did not impact overall system performance. Archiving with 9.x is single threaded as also designed. The archiving process itself does not consume a lot of resources on the system. The objective was to make sure that it does not impact production performance and as mentioned, this also means that the process will be slow as intended. We can't provide an expected number of incidents to be indexed or how quickly because system resources can be different, network speeds, etc., not to mention that every incident will be different as the number of related entries to the incident would vary. The archiving process looks to each incident to determine if it meets the criteria for the archive policy and then grabs those by chunk up to the default of 1000 or Max-Entries-Per-Query value using the lesser of the two values. However, when we retrieve the related entries using the defined associations these are not done by chunk. For this particular issue API, SQL, and Archive logging were captured and demonstrated the total time taken to archive is - 13:26:53.1600 - 13:26:48.2790 = 4second and 881 ms. We spent 4 seconds and 458ms on two SQL statements which means those two SQL statements are taking 92% of total time for archive process and this is not as intended. The log entries are as follows: <SQL > <TID: 0000000341> <RPC ID: 0000000000> <Queue: Archive > <Client-RPC: 1 > <USER: AR_ARCHIVER > <Overlay-Group: 1 > /* June 19 2017 13:26:48.4090 */ SELECT J1.C1 a0 FROM T2261 J0 INNER JOIN T2235 J1 ON (J0.C1 = J1.C450) WHERE ((J0.C1 = N'000000000038277') AND ((J0.C300411705 = N'HPD:Help Desk') OR (J0.C300411705 = N'CHG:Infrastructure Change') OR (J0.C300411705 = N'SRM:Request') OR (J0.C300411705 = N'WOI:WorkOrder')) AND (J1.C457 = N'SLM:Measurement')) ORDER BY a0 ASC <SQL > <TID: 0000000341> <RPC ID: 0000000000> <Queue: Archive > <Client-RPC: 1 > <USER: AR_ARCHIVER > <Overlay-Group: 1 > /* June 19 2017 13:26:52.5960 */ SELECT J1.C1 a0 FROM T2261 J0 INNER JOIN T2395 J1 ON (J0.C490009000 = J1.C400030500) WHERE ((J0.C1 = N'000000000038277') AND ((J0.C300411705 = N'HPD:Help Desk') OR (J0.C300411705 = N'CHG:Infrastructure Change') OR (J0.C300411705 = N'SRM:Request') OR (J0.C300411705 = N'WOI:WorkOrder'))) ORDER BY a0 ASC Based on the logging and internal testing, we discovered four missing indexes needed for archiving Incidents and their associated records. The following non-unique indexes need to be applied to both the regular form and the archive form. Forms Field RBE:Message / RBE:Message_Archive 304313600 RBE:Transaction / RBE:Transaction_Archive 304313600 SLM:AuditLog / SLM:AuditLog_Archive 450 SLM:RuleActionNotifier / SLM:RuleActionNotifier_Archive 400030500 The average length of time spent archiving records improved after adding these indexes. |