Method R Discussion GroupMethod R helps developers, DBAs, and decision-makers optimize Oracle-based software in every phase of the software life cycle. |
||
Discuss Method R products, techniques, and events. Ask new questions or post your own responses. |
Having just finished the OPINT class last week, I came to work ready to tune. I decided to present this as sort of a case study. In my company it almost takes an act of Congress to install any outside software, even if it is free like the Hotsos test harness. So it will probably be a couple of months before I can install the test harness on a test system. So I had to use the concepts but not the test harness in this case.
Our business complaint was general slowness of the system and various times. We had already tuned some business critical queries earlier, and I am now looking for queries that don't run terrible, but run often that could be improved. Using this query against v$sql, I identified the top ten queries running in production that accumulated the most time: SELECT * FROM (SELECT s.sql_text, ROUND((s.elapsed_time/100000),2) Total_Elapsed_Sec, case s.executions WHEN 0 THEN round(s.elapsed_time/1000000,2) ELSE Round((s.elapsed_time/s.executions)/1000000,2 ) END as Elapsed_Per_Exec, s.buffer_gets, s.disk_reads, case s.cpu_time WHEN 0 THEN 0 ELSE Round((s.cpu_time/1000000),2 ) END as CPU_Secs, s.loads, s.sorts, s.parse_calls, s.executions, s.fetches, s.rows_processed, s.FIRST_LOAD_TIME, s.module, s.child_number, s.hash_value, s.address, s.plan_hash_value, s.optimizer_cost, s.sharable_mem, s.action, s.invalidations, s.parsing_user_id FROM v$sql s ORDER BY s.elapsed_time desc) WHERE ROWNUM < 11 One of those queries is a regularly scheduled query that looks like this: SELECT * FROM maximo.wfassignment WHERE (processname = 'PRAPPR' AND escrole = 'AUTOACCEPT' AND assignstatus = 'ACTIVE') AND NOT EXISTS (SELECT ownerid FROM maximo.escrepeattrack WHERE ownerid = wfassignmentid AND objectname = 'WFASSIGNMENT' AND refpointid = 51 AND escalation = 'PRAUTOAPPR') ------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------| 0 | SELECT STATEMENT | | 94 | 16356 | 6583 | |* 1 | FILTER | | | | | |* 2 | TABLE ACCESS FULL | WFASSIGNMENT | 94 | 16356 | 6395 | |* 3 | INDEX UNIQUE SCAN | ESCREPEATTRACK_NDX | 1 | 27 | 2 | -----------------------------------------------------------Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "MAXIMO"."ESCREPEATTRACK" "ESCREPEATTRACK" WHERE "ESCREPEATTRACK"."OWNERID"=:B1 AND "ESCREPEATTRACK"."OBJECTNAME"='WFASSIGNMENT' AND "ESCREPEATTRACK"."REFPOINTID"=51 AND "ESCREPEATTRACK"."ESCALATION"='PRAUTOAPPR')) 2 - filter("SYS_ALIAS_1"."PROCESSNAME"='PRAPPR' AND "SYS_ALIAS_1"."ESCROLE"='AUTOACCEPT' AND "SYS_ALIAS_1"."ASSIGNSTATUS"='ACTIVE') 3 - access("ESCREPEATTRACK"."ESCALATION"='PRAUTOAPPR' AND "ESCREPEATTRACK"."REFPOINTID"=51 AND "ESCREPEATTRACK"."OBJECTNAME"='WFASSIGNMENT' AND "ESCREPEATTRACK"."OWNERID"=:B1) Note: cpu costing off (Oracle V9.2 Database) This query took about 67K LIO's. Since 05-14-2007 this query has run 1485 times averaging about 14 seconds per execution and returning some rows only about half the time. Total cumulative time processing during those 1485 executions was 5.6 hours with over 100 million LIO's of which 54 million were disk reads. Here is why I think the Oracle optimizer chooses a bad access path plan for this query. Looking at the WHERE clause of wfassignment, we see criteria for these three columns processname, escrole, assignstatus. I looked at the data distribution, which would have been easier with the test harness... SELECT COUNT(*) AS Count, processname FROM maximo.wfassignment GROUP BY processname ORDER BY COUNT(*) COUNT PROCESSNAME ------- ----------- 2 SELFREG 24 CONTAPPR 1090 INVPRCDISC 1869 INVFRTDISC 6048 MOCAPPR 12721 INVEXTDISC 36609 INVFNLAPPR 57193 NVQTYDISC 64415 POAPPR 78053 PRAPPR 808444 WOAPPR 1484139 INVAPPR SELECT COUNT(*) AS Count, escrole FROM maximo.wfassignment GROUP BY escrole ORDER BY COUNT(*) COUNT ESCROLE -------- ---------- 20620 SUPERVISOR 29180 WF_SUPVSR 330704 AUTOACCEPT 2170105 (NULL) SELECT COUNT(*) AS Count, assignstatus FROM maximo.wfassignment GROUP BY assignstatus ORDER BY COUNT(*) COUNT ASSIGNSTATUS ------- ------------------ 592 DEFAULT 1354 ESCALATED 5632 FORWARDED 23845 ACTIVE 750020 INACTIVE 1769168 COMPLETE So what this all means is that the data values are heavily skewed in all three tables. Processname is not very selective when you look for 'INVAPPR', but it is fairly selective when looking for 'PRAPPR' as we are in this query. In the escrole column, we are looking for 'AUTOACCEPT', which has 330K rows so it isn't very selective. Assignstatus is fairly selective if we are looking for 'ACTIVE', but not if we are looking for "COMPLETE'. Out of the box, Oracle assumes all data to be equally distributed. Therefore it doesn't know that in this query, it would be best too look at assignstatus first (only 24K rows to look for), then filter the other criteria against those 24K rows. Instead, it does a full table scan of wfassignment and filters them all. So, according to class, this is a job for histograms. When this type of statistics is gathered against the wfassignent processname, escrole, assignstatus columns, Oracle understands the data skew and uses it to determine the best access path. We just made the Oracle optimizer smarter about this table! I ran the commands for such a histogram in our test environment. After that, the optimizer did choose to look at assignstatus first, and since there was already an index available with assignstatus in it, the query ran 0.16 seconds instead of 14. It had only about 6K LIO's compared to 65K LIO's originally. PLAN_TABLE_OUTPUT ------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ------------------------- | 0 | SELECT STATEMENT | | 4 | 708 | 1288 | |* 1 | FILTER | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID| WFASSIGNMENT | 4 | 708 | 1280 | |* 3 | INDEX RANGE SCAN | WFASSIGN_NDX4 | 26134 | | 196 | |* 4 | INDEX UNIQUE SCAN | ESCREPEATTRACK_NDX | 1 | 27 | 2 | -----------------------------------------------------------Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "MAXIMO"."ESCREPEATTRACK""ESCREPEATTRACK" WHERE "ESCREPEATTRACK"."OWNERID"=:B1 AND "ESCREPEATTRACK"."OBJECTNAME"='WFASSIGNMENT' AND "ESCREPEATTRACK"."REFPOINTID"=51 AND "ESCREPEATTRACK"."ESCALATION"='PRAUTOAPPR')) 2 - filter("SYS_ALIAS_1"."PROCESSNAME"='PRAPPR' AND "SYS_ALIAS_1"."ESCROLE"='AUTOACCEPT') 3 - access("SYS_ALIAS_1"."ASSIGNSTATUS"='ACTIVE') 4 - access("ESCREPEATTRACK"."ESCALATION"='PRAUTOAPPR' AND "ESCREPEATTRACK"."REFPOINTID"=51 AND "ESCREPEATTRACK"."OBJECTNAME"='WFASSIGNMENT' AND "ESCREPEATTRACK"."OWNERID"=:B1) Note: cpu costing is off The advantage of using the histogram over an index hint is that other queries, including ad-hoc queries may well benefit when similar criteria is used. BEGIN DBMS_STATS.gather_table_stats( ownname => 'MAXIMO', tabname => 'WFASSIGNMENT', method_opt =>'FOR COLUMNS PROCESSNAME SIZE 12, ESCROLE SIZE 5, ASSIGNSTATUS SIZE 7'); END; Using these histograms, I would anticipate the next 1485 executions to consume only about 4 minutes of time and 9 million LIO's, a definite win!
Before = 336 minutes
After = 4 minutes Before = 100M LIOs After = 9M LIOs ...and done in a manner that's likely to have collateral benefits upon other parts of the application. Sounds like a pretty good day's work to me. Thanks VERY much, Charles, for sharing.
It's great to see how you were able to apply the knowledge gained from the class to a performance issue right away and with such great success!
|
Powered by FogBugz