您的当前位置:首页正文

Oracle数据库性能SQL优化案例

2024-02-01 来源:钮旅网
Oracle数据库性能SQL优化案例

一. 背景描述

总体感觉运行比较缓慢,主要消耗在数据库模块,其中数据库所在的主机资源紧张,CPU的IDLE很低,说明数据库急需优化。

二. 总体调优

1. COMMIT提交过频繁(已解决)

分析数据库运行一周以来的AWR报表,发现数据库存在日至切换频繁的情况,其中wait class 为commit的log file sync等待事件居然占了23%,一周时间内等待了3918701次近400万秒,很显然数据库应用存在单次提交过频繁,未有效的批量提交的情况。

通过如下查询,发现timeTask@itsm_ht (TNS V1-V3)模块有一个更新语句非常频繁,产生了200多万次提交,当时查看该SESSION而登录仅仅不过6小时而已。

效果:经过和后台开发人员沟通发现,这是后台程序的BUG,修正后,提交大幅度减少,数据库中COMMIT相关的log file sync等待得到极大的改善。

2. 库的统计信息收集未开启(已解决)

由于数据库总体运行缓慢,偶尔从同事的某些SQL的语句执行计划中发现驱动顺序明显错误得到启发,检查数据库的统计信息情况,发现居然返回了7431条,几乎占了bosswg和basedba用户的对象的全部!

接下来发现,原来ORACLE 的自动收集统计信息的功能被关闭了,具体如下:

开启自动收集exec dbms_scheduler.enable('GATHER_STATS_JOB');后,数据库统计信息得以正常收集

3. 手工收集统计信息含全局临时表(已解决)

全局临时表是不能被收集统计信息的,否则容易出大问题,影响执行计划,当前调和模块的全局临时表RN_IDENTIFICATION_BATCH被收集了统计信息,如下:

解决方法就是删除表的统计信息:

EXEC dbms_stats.delete_table_stats(ownname => 'BOSSWG',tabname => 'RN_IDENTIFICATION_BATCH') ; 当前已经解决(注:20121119完成这个回收全局临时表统计信息的改造)

4. 大量索引有并行属性(已解决)

在随后的一小时的AWR报表分析中,发现PX的等待也非常明显,这是由于并行度设置在表或索引属性中引发的一种常见等待事件,如下所示,在一小时的采样中居然有近1万秒的PX等待:

查看后发现索引居然有1334个设置有并行度属性,如下所示,略去大部分展现:

SQL> select t.owner, t.table_name, index_name, degree, status from dba_indexes t

where owner in ('BOSSWG', 'BASEDBA') and t.degree > '1';

OWNER TABLE_NAME INDEX_NAME DEGREE STATUS --------- ------------------------------ ------------------------------ ---------- ------------------------------------------------------------------- BOSSWG PERF_HOST_FILESYSTEM_HIS IDX_TEMP1 4 VALID BOSSWG PERF_WEBLOGIC_WEBMODULE_HIS IDX_TEMP2 4 VALID BOSSWG V3_REPLACE_CI_RELATION_LOG PK_V3_REPLACE_CI_RELATION_LOG 9 VALID BOSSWG V3_REPLACE_CI_LOG PK_V3_REPLACE_CI_LOG 9 VALID BOSSWG V3_REPLACE_CI_CLASS PK_V3_REPLACE_CI_CLASS 9 VALID BOSSWG IFACE_TODO_LIST PK_IFACE_TODO_LIST 9 VALID BOSSWG IFACE_TODO PK_IFACE_TODO 9 VALID BOSSWG IFACE_STAFF PK_IFACE_STAFF 9 VALID --以下略去1000多行

1334 rows selected.

效果:用如下方法,将这些并行取消后,数据库的PX等待事件从此消失了。 select 'alter index '|| t.owner||'.'||index_name || ' noparallel;' from dba_indexes t

where owner in ('BOSSWG', 'BASEDBA') and t.degree >'1';

5. 众多表记录需要瘦身(完成部分)

以下记录中PERF_HOST_FILESYSTEM和INP_DATA_PERF表都是同一版本的表,记录都达到几亿条。 此外AH_GATHER_ALLFLOW_RESULT这个表虽然只有2千万,但是很奇怪的代码是天天删除,怎么会有这么多?

SQL> SELECT COUNT(*) FROM PERF_HOST_FILESYSTEM; COUNT(*) ----------------- 231049804

SQL> SELECT COUNT(*) FROM INP_DATA_PERF; COUNT(*) ---------------- 332761103

SQL> SELECT COUNT(*) FROM AH_GATHER_ALLFLOW_RESULT AG; COUNT(*) ----------------- 17778694

优化思路:考虑大表的历史数据能清理就清理,采样频率能适当的降低,其中的INP_DATA_PERF表记录很大且索引不少,已经影响了如下SQL语句的入库速度,具体见SQL调优部分的SQL_ID=6vv2w2k5jan6d部分。

6. 存在未使用绑定变量问题(解决大部分)

在最糟糕的时段,安徽居然出现软解析仅70%比率的糟糕情况,说明系统存在代码大量硬解析的情况,主要在调和模块的部分代码,已经在pkp_cmdb_reconcile_engine程序的v2.0版本中更新了。

类似如下(以下脚本调用次数极为频繁,却未使用绑定变量): 原脚本: v_sql:='delete from '||rec.table_name|| ' where instance_id in (select instance_id from ci_base_element where '||i_condition|| ' and class_id='||v_class_id||')'; execute immediate v_sql; 修正为: v_sql:='delete from '||rec.table_name|| ' where instance_id in (select instance_id from ci_base_element where '||i_condition|| ' and class_id=:1)'; execute immediate v_sql using v_class_id; 原脚本: v_identification_sql := 'update ci_base_element set reconciliation_id=' || v_reconciliation_id || ' where INSTANCE_ID in (' || instencerec.instance_id1 || ',' || instencerec.instance_id2 || ') and reconciliation_id is null'; execute immediate v_identification_sql; 修正为: v_identification_sql := 'update ci_base_element set reconciliation_id=:1 where INSTANCE_ID in (:2,:3) and reconciliation_id is null'; execute immediate v_identification_sql using v_reconciliation_id,instencerec.instance_id1,instencerec.instance_id2; 三. SQL调优

当前数据库不少SQL需要优化改进,其中调和模块由于调用频繁且运行时间较长,优先级最高。

1. SQL_ID= d4hw7rpzdvmsd(调和)

紧急程度 SQL描述 紧急 该语句一周执行700次左右,平均每次执行2000秒左右,产生300多亿的逻辑读 其中CI_BASE_RELATIONSHIP记录217670条,CI_BASE_ELEMENT记录194475条。CI_CLASS_RELATION_TREE 记录只有10来条,RN_IDENTIFICATION_BATCH为全局临时表 SQL_ID= d4hw7rpzdvmsd select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_relationship where markasdeleted <> 1 and (source_instance_id in (select a.destination_instance_id from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id and b.is_auto_discover = 1 and a.source_instance_id in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6) and a.destination_instance_id not in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6)) or destination_instance_id in (select a.destination_instance_id from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id and b.is_auto_discover = 1 and a.source_instance_id in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch SQL语句 where source_dataset_id = 7)) and dataset_id = 6) and a.destination_instance_id not in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false' ) */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6))) 功能模块 开发分析 JDBC Thin Client 等价于 select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_relationship where markasdeleted <> 1 and source_instance_id in (select a.destination_instance_id from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id and b.is_auto_discover = 1 and a.source_instance_id in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6) and a.destination_instance_id not in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element 现场分析 where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6)) union select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_relationship where markasdeleted <> 1 and destination_instance_id in (select a.destination_instance_id from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id and b.is_auto_discover = 1 and a.source_instance_id in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6) and a.destination_instance_id not in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false' ) */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 7)) and dataset_id = 6)) 思路:由于语句太复杂加上全局临时表的影响,OR语句导致执行计划出现笛卡尔乘积等错误的方式,通过分析发现UNION 的改写基本能保持执行计划正常,建议暂时修正为UNION 的写法。 处理步骤 遗留问题 2. SQL_ID=gyx7jpkgv9mzr(调和)

紧急程度 SQL描述 紧急 该语句一周执行近8000次,平均每次执行1000多秒左右,产生200多亿的逻辑读,代码逻辑非常奇怪,需要开发人员确认。 其中NE_CTRL_MSG记录44836条,ci_base_element记录20万条左右。 SQL_ID= gyx7jpkgv9mzr select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ ds1.instance_id1, ds2.instance_id2, nvl(ds1.reconciliation_id, ds2.reconciliation_id) reconciliation_id from (select ci_base_element.instance_id instance_id1, ci_base_element.reconciliation_id, ci_base_element.class_id, CI_BASE_ELEMENT.Name ca1 from rn_identification_batch, ci_base_element where rn_identification_batch.instance_id = ci_base_element.instance_id and ci_base_element.class_id in (3, 1, 2, 4, 10…) and rn_identification_batch.source_dataset_id = :ds) ds1, (select ci_base_element.instance_id instance_id2, ci_base_element.reconciliation_id, ci_base_element.class_id, CI_BASE_ELEMENT.Name cb1 from rn_identification_batch, ci_base_element where rn_identification_batch.instance_id = ci_base_element.instance_id and ci_base_element.class_id in (3, 1, 2, 4, 10…) and rn_identification_batch.source_dataset_id = :ds) ds2 where ds1.ca1 = ds2.cb1 and ds1.class_id = ds2.class_id SQL语句 功能模块 开发分析 JDBC Thin Client 这里的主要问题在于,由于rn_identification_batch是全局临时表,ORACLE无法正确收集分析其统计信息,误以为是很小的记录,实际情况是该全剧临时表其实在执行时很大。 ds1和ds2两个结果集现在ORACLE是如此处理的,将ds1中的rn_identification_batch跳出ds1外,与ds2结果集的表的rn_identification_batch关联,实际情况这是大表驱动大表,NL连接中的大犯,因此在本例中, 我们可以把两个结果集固定住,不让在外部先关联,具体代码修正为如下: select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ ds1.instance_id1, ds2.instance_id2, nvl(ds1.reconciliation_id, ds2.reconciliation_id) reconciliation_id from (select ci_base_element.instance_id instance_id1, ci_base_element.reconciliation_id, ci_base_element.class_id, CI_BASE_ELEMENT.Name ca1, rownum from rn_identification_batch, ci_base_element where rn_identification_batch.instance_id = ci_base_element.instance_id and ci_base_element.class_id in (3, 1, 2, 4, 10 ) and rn_identification_batch.source_dataset_id = :ds) ds1, (select ci_base_element.instance_id instance_id2, ci_base_element.reconciliation_id, ci_base_element.class_id, CI_BASE_ELEMENT.Name cb1, rownum from rn_identification_batch, ci_base_element where rn_identification_batch.instance_id = ci_base_element.instance_id and ci_base_element.class_id in (3, 1, 2, 4, 10) and rn_identification_batch.source_dataset_id = :ds) ds2 where ds1.ca1 = ds2.cb1 and ds1.class_id = ds2.class_id 现场分析 处理步骤 遗留问题 3. SQL_ID=8qfr2ywm53byn

紧急程度 SQL描述 紧急 该语句一周执行34,215次,平均每次执行13秒左右,产生了5,978,706,027次逻辑读。 其中主要是涉及到管道表TABLE(PKP_AGENT_MGMT.GETKMSNEPARAMCFGS(:B1)) SQL_ID= 8qfr2ywm53byn SELECT DISTINCT KMNAME, TO_CHAR(NEID), TO_CHAR(KMCIID), PURPOSE, PARAMNAME, PARAMVALUE, PARAMTYPE FROM TABLE(PKP_AGENT_MGMT.GETKMSNEPARAMCFGS(:B1)) ORDER BY KMNAME, TO_CHAR(NEID), TO_CHAR(KMCIID), PURPOSE; SQL语句 功能模块 开发分析 itmd@itsm_ht (TNS V1-V3) 这个管道表涉及到的代码如下,重点观察涉及到的表 function getKMsNEParamCfgs(agentId ci_agent.instance_id%type) return KMParamCfgInfoArray pipelined is vLastKmProductId ci_km.itm_product_id%type := -1; vCfgRecord KMParamCfgInfo; vKMDependedArr KMDependedInfoArray := KMDependedInfoArray(); vKMDependedRec KMDependedInfo; vCounter Number(9) := 1; cursor c_depended IS select r.destination_product_id, r.source_product_id from itm_product_relation r, itm_product p where p.itm_product_id = r.source_product_id and p.state = '0SA' and p.type = KM_PRODUCT_TYPE and r.relation_type = DEPENDENCY_PRODUCT_RELA_TYPE and exists (select b.itm_product_id from ci_base_relationship a, ci_km b where a.source_instance_id = agentId and a.destination_instance_id = b.instance_id and a.dataset_id = MONITOR_DATA_SET_ID and a.class_id = KM_AGENT_RELA_CLSID and a.name = KM_AGENT_RELA_NAME and a.markasdeleted = 0 and b.itm_product_id = r.destination_product_id) 现场分析 order by r.destination_product_id; cursor c_task is select p.itm_product_name KMName, r_ne_km.destination_instance_id NEId, p.itm_product_id KMId, k.instance_id KMCIId from ci_base_relationship r_km_agent, ci_km k, ci_base_relationship r_ne_km, itm_product p where r_km_agent.source_instance_id = agentId and r_km_agent.destination_instance_id = k.instance_id and r_km_agent.dataset_id = MONITOR_DATA_SET_ID and r_km_agent.class_id = KM_AGENT_RELA_CLSID and r_km_agent.name = KM_AGENT_RELA_NAME and r_km_agent.markasdeleted = 0 and r_ne_km.source_instance_id = k.instance_id and r_ne_km.dataset_id = MONITOR_DATA_SET_ID and r_ne_km.class_id = NE_KM_RELA_CLSID and r_ne_km.name = NE_KM_RELA_NAME and r_ne_km.markasdeleted = 0 and k.itm_product_id = p.itm_product_id and p.state = '0SA' order by KMId, NEId, KMCIId; CURSOR c_kp(KMCIId in ci_km_parameters.instance_id%type) IS select kp.purpose, i.parameter_name, kp.parameter_type from ci_km_parameters i, ci_km k, km_in_parameters kp where i.instance_id = KMCIId and i.parameter_name = kp.parameter_name and kp.parameter_type = 'param' and kp.itm_product_id = k.itm_product_id and k.instance_id = i.instance_id; CURSOR c_op(KMCIId in ci_km.instance_id%type) IS select c.instance_id,o.parameter_name,o.kpi_id from ci_km c, km_out_parameters o where c.instance_id=KMCIId and o.itm_product_id=c.itm_product_id and o.kpi_type='alarm'; CURSOR c_tv(KMCIId in ci_km.instance_id%type, paramName in km_out_parameters.parameter_name%type) IS select tv.CONFIG_ITEM from threshold_view tv where tv.KM_INSTANCE_ID = KMCIId and tv.KM_PERF_PARAMETER = paramName group by tv.CONFIG_ITEM; vKMProductIdList varchar2(4000) := ''; vCur T_CURSOR; vThdCur T_CURSOR; vKPAttrRec KMParamAttrCfgInfo; vThdCfgName ci_base_element.name%type; begin if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line('print vKMDependedRec Info:'); end if; open c_depended; loop fetch c_depended into vKMDependedRec; exit when c_depended%notfound; vKMDependedArr.extend; vKMDependedArr(vCounter) := vKMDependedRec; vCounter := vCounter + 1; if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line(vKMDependedRec.DestProductId || ' -> ' || vKMDependedRec.SourceProductId); end if; end loop; close c_depended; for it in c_task loop if vLastKmProductId <> it.KMId then vLastKmProductId := it.KMId; vKMProductIdList := getDepenedKMRecursively(vKMDependedArr, it.KMId); if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line('KMName=' || it.KMName || ' KMId=' || it.KMId || ' DependedProductList=' || vKMProductIdList); end if; end if; if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line('print KMsNeParamCfgs Info:'); end if; open vCur for 'select distinct kp.purpose, kp.parameter_name, kp.param_sql, kp.parameter_type from km_in_parameters kp where kp.itm_product_id in (' || vKMProductIdList || ') and kp.parameter_type = ''attr'''; loop fetch vCur into vKPAttrRec; exit when vCur%notfound; vCfgRecord.KMName := it.KMName; vCfgRecord.NEId := it.NEId; vCfgRecord.KMCIId := it.KMCIId; vCfgRecord.Purpose := vKPAttrRec.Purpose; vCfgRecord.ParamName := vKPAttrRec.ParamName; vCfgRecord.ParamType := vKPAttrRec.ParamType; execute immediate vKPAttrRec.ParamSql into vCfgRecord.ParamValue using it.NEId; pipe row(vCfgRecord); /* if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line('##attr-- KMName=' || vCfgRecord.KMName || ' NEId=' || vCfgRecord.NEId || ' KMId=' || vCfgRecord.KMCIId || ' Purpose=' || vCfgRecord.Purpose || ' ParamName=' || vCfgRecord.ParamName || ' ParamValue=' || vCfgRecord.ParamValue || 'ParamType='||vCfgRecord.ParamType); end if; */ end loop; close vcur; for it_kp in c_kp(it.KMCIId) loop vCfgRecord.KMName := it.KMName; vCfgRecord.NEId := it.NEId; vCfgRecord.KMCIId := it.KMCIId; vCfgRecord.Purpose := it_kp.purpose; vCfgRecord.ParamName := it_kp.parameter_name; vCfgRecord.ParamValue := ''; vCfgRecord.ParamType := it_kp.parameter_type; pipe row(vCfgRecord); /* if OUTPUT_DEBUG_ON = 1 then dbms_output.put_line('##kp---- KMName=' || vCfgRecord.KMName || ' NEId=' || vCfgRecord.NEId || ' KMId=' || vCfgRecord.KMCIId || ' Purpose=' || vCfgRecord.Purpose || ' ParamName=' || vCfgRecord.ParamName || ' ParamValue=' || vCfgRecord.ParamValue || 'ParamType=' || vCfgRecord.ParamType); end if; */ end loop; -- add by wenghf for threshold for it_op in c_op(it.KMCIId) loop for it_tv in c_tv(it.kmciid,it_op.parameter_name) loop vCfgRecord.KMName := it.Kmname; vCfgRecord.NEId := it.neid; vCfgRecord.KMCIId := it.kmciid; vCfgRecord.Purpose := 'threshold'; vCfgRecord.ParamType := ''; if it_tv.config_item is null then vCfgRecord.ParamName := it_op.kpi_id ; dbms_output.put_line(it.kmciid || ' ' || it_op.parameter_name); vCfgRecord.ParamValue := pkp_agent_mgmt.getKmThresholds(it.kmciid,it_op.parameter_name); if vCfgRecord.ParamValue is not null then pipe row(vCfgRecord); end if; else vCfgRecord.ParamValue := pkp_agent_mgmt.getKmThresholds(it.kmciid,it_op.parameter_name,it_tv.config_item); Open vThdCur for 'select name from ci_base_element where instance_id in(' || it_tv.config_item || ')'; loop fetch vThdCur into vThdCfgName; exit when vThdCur%notfound; vCfgRecord.ParamName := it_op.kpi_id || vThdCfgName; if vCfgRecord.ParamValue is not null then pipe row(vCfgRecord); end if; end loop; close vThdCur; /** vCfgRecord.ParamName := || iit_op.kpi_idt_tv.config_item_name || '@@'; dbms_output.put_line(it.kmciid || ' ' || it_op.parameter_name); **/ end if; /** if vCfgRecord.ParamValue is not null then pipe row(vCfgRecord); end if; **/ end loop; /** vCfgRecord.KMName := it.Kmname; vCfgRecord.NEId := it.neid; vCfgRecord.KMCIId := it.kmciid; vCfgRecord.Purpose := 'threshold'; vCfgRecord.ParamName := it_op.kpi_id ; dbms_output.put_line(it.kmciid || ' ' || it_op.parameter_name); vCfgRecord.ParamValue := pkp_agent_mgmt.getKmThresholds(it.kmciid,it_op.parameter_name); vCfgRecord.ParamType := ''; if vCfgRecord.ParamValue is not null then pipe row(vCfgRecord); end if; **/ end loop; end loop; return; end; 思路:采用10046TRACE方式进行跟踪,知道该函数哪里开销最大,再分析,代入1000000012624,如: sqlplus bosswg/itsm56rffy set linesize 266 set timing on set pagesize 5000 alter session set events '10046 trace name context forever,level 12'; SELECT DISTINCT KMNAME, TO_CHAR(NEID), TO_CHAR(KMCIID), PURPOSE, PARAMNAME, PARAMVALUE, PARAMTYPE FROM TABLE(PKP_AGENT_MGMT.GETKMSNEPARAMCFGS(1000000012628)) ORDER BY KMNAME, TO_CHAR(NEID), TO_CHAR(KMCIID), PURPOSE; alter session set events '10046 trace name context off'; select d.value || '/' || LOWER (RTRIM(i.INSTANCE, CHR(0))) || '_ora_' || p.spid || '.trc' trace_file_name from (select p.spid from v$mystat m,v$session s, v$process p where m.statistic#=1 and s.sid=m.sid and p.addr=s.paddr) p, (select t.INSTANCE FROM v$thread t,v$parameter v WHERE v.name='thread' AND(v.VALUE=0 OR t.thread#=to_number(v.value))) i, (select value from v$parameter where name='user_dump_dest') d; tkprof /oracle/app/oracle/admin/itnm/udump/itnm_ora_9064.trc /oracle/app/oracle/product/10.2/ljb/20121115/ itnm_ora_9064.txt sys=no sort=prsela,exeela,fchela 结论:由于频率调整,当前速度已经在5秒左右可以完成,基本满足了需求。 处理步骤 遗留问题 4. SQL_ID=49mv8ygm101n4

紧急程度 SQL描述 紧急 该语句一周执行158次,平均每次执行400多秒,产生了130,740,847次逻辑读。 其中INFO_HIGH_5030355300007表记录有47788811条。 SQL_ID=49mv8ygm101n4 SELECT 1, DECODE(COUNT(*), NULL, 0, '', '0', COUNT(*)) FROM (SELECT COUNT(*) FROM INFO_HIGH_5030355300007 WHERE TO_CHAR(STATE_DATE, 'yyyymmddhh24') = TO_CHAR(SYSDATE - :B1 / 24, 'yyyymmddhh24') GROUP BY TO_CHAR(STATE_DATE, 'yyyymmddhh24')); SQL语句 功能模块 开发分析 timeTask@itsm_ht (TNS V1-V3) 这个语句需求比较特别,但是对列进行分析,导致无法用到STATE_DATE列索引(不过在STATE_DATE列本身也没有索引)。因为WHERE 条件是一小时内的记录,返回相对全表很少,所以用索引一定高效,因为暂时无法转化写法,新增函数索引来完成 原先执行计划为近5千万表全表扫描,调整后执行计划改变为如下索引读如下: 现场分析 处理步骤 遗留问题 create index idx_func_info_state on INFO_HIGH_5030355300007(TO_CHAR(STATE_DATE, 'yyyymmddhh24')); 然后发现执行计划走索引,速度从原来的400多秒缩减为0.01秒完成。 5. SQL_ID=24gkcz1k2fjf2

紧急程度 紧急 该语句一周执行158次,平均每次执行近300秒,产生了126,613,244次逻辑读。 其中INFO_HIGH_5030355300007表记录有47788811条。 此外该语句和SQL_ID=49mv8ygm101n4的语句非常类似,就是多了一个AND USER_NAME IS NULL的条件而已。 SQL描述 SQL_ID= 24gkcz1k2fjf2 SELECT 1, DECODE(COUNT(*), NULL, 0, '', '0', COUNT(*)) FROM (SELECT COUNT(*) FROM INFO_HIGH_5030355300007 WHERE TO_CHAR(STATE_DATE, 'yyyymmddhh24') = TO_CHAR(SYSDATE - :B1 / 24, 'yyyymmddhh24') AND USER_NAME IS NULL GROUP BY TO_CHAR(STATE_DATE, 'yyyymmddhh24')); SQL语句 功能模块 开发分析 现场分析 处理步骤 遗留问题 timeTask@itsm_ht (TNS V1-V3) 优化思路:和之前的SQL_ID=49mv8ygm101n4的SQL是一样的优化方式,由于函数索引已经建好,所以速度也大幅度提升了。 6. SQL_ID=2us77smt19xt8

紧急程度 SQL描述 一般 该语句一周执行30万次左右,平均每次执行0.8秒左右,产生17亿的逻辑读,代码逻辑非常奇怪,需要开发人员确认。 其中NE_CTRL_MSG记录44836条,ci_base_element记录194475条。 SQL_ID=2us77smt19xt8 SELECT TO_CHAR(A.NE_CTRL_MSG_ID), TO_CHAR(A.KPI_ID), TO_CHAR(A.NE_ID), TO_CHAR(C.CLASS_ID), TO_CHAR(A.AGENT_ID), A.KPI_VALUE, A.STATE, (SYSDATE - A.STATE_DATE) * 24 * 3600, decode(to_char(B.markasdeleted), '0', '0SA', 'OSX') FROM NE_CTRL_MSG A, ci_base_element B, ci_base_element C WHERE A.STATE in ('10', '20') AND A.AGENT_ID = B.INSTANCE_ID AND A.NE_ID = C.INSTANCE_ID and rownum <= 1000 ORDER BY A.STATE SQL语句 功能模块 开发分析 现场分析 处理步骤 遗留问题 Manager@itsm_ht (TNS V1-V3) 1. Order by 意义有点奇怪 2. Rownum<=1000再order by 给谁看? 7. SQL_ID=7znqsa9jc7c2v(调和)

紧急程度 SQL描述 紧急 该语句一周执行31,226次,平均每次执行1.3秒,产生了8,254,610,389次逻辑读。 其中CI_BASE_ELEMENT表记录20万条左右,RN_IDENTIFICATION_BATCH为全局临时表。 SQL_ID= 7znqsa9jc7c2v SELECT B.INSTANCE_ID, B.CLASS_ID FROM RN_IDENTIFICATION_BATCH A, CI_BASE_ELEMENT B WHERE A.INSTANCE_ID = B.INSTANCE_ID AND B.RECONCILIATION_ID = :B2 AND B.DATASET_ID = :B1 SQL语句 功能模块 开发分析 现场分析 处理步骤 遗留问题 JDBC Thin Client 这也和全局临时表被收集统计信息有关,也许此时全局临时表的记录非常庞大了,NL连接已经不合适,或者说NL连接中这个临时表驱动已经不合适,现在已经消除了全局临时表的收集,观察中。 8. SQL_ID=5k6mq0mc3hu47

紧急程度 SQL描述 一般 该语句一周执行229次,平均每次执行134秒,产生了32,030,052次逻辑读。 其中AH_GATHER_ALLFLOW_RESULT表近2千万条。 SQL_ID=5k6mq0mc3hu47 DELETE FROM AH_GATHER_ALLFLOW_RESULT AG WHERE TO_DATE(AG.CREATE_DATE, 'yyyy-mm-dd hh24') >= TRUNC(SYSDATE)-1; SQL语句 功能模块 开发分析 timeTask@itsm_ht (TNS V1-V3) 这个语句是每天删除当天的记录,当前记录还是2千万条,当天的记录一定很少,所以建议对列运算的写法修改了 ,建议将CREATE_DATE从VARCHAR2型改为DATA型,然后建索引,然后写成如下: DELETE FROM AH_GATHER_ALLFLOW_RESULT AG 现场分析 WHERE AG.CREATE_DATE >= SYSDATE ; 处理步骤 遗留问题 9. SQL_ID=6vv2w2k5jan6d

紧急程度 SQL描述 一般 该语句一周执行57,994次,平均每次执行18秒,产生了12,778,032,148次逻辑读(百亿)。 其中INP_DATA_PERF表记录有3亿多,且有众多索引。 SQL_ID=6vv2w2k5jan6d insert into INP_DATA_PERF (FILE_ID, NE_ID, NE_TYPE_ID, AUDIT_ID, AREA_CODE, SYSLOC, MSG_SOURCE, KPI_ID, KPI_NAME, KBP, KBP_NAME, MIN, AVG, MAX, MSG_TIME) values (:1, pkp_inp_convert.convert_inp_data_perf(:2, :3), :4, :5, :6, :7, 1, replace(:8, '-', ''), :9, :10, :11, :12, :13, :14, to_date(:15, 'YYYYMMDDHH24MI')) timeTask@itsm_ht (TNS V1-V3) SQL语句 功能模块 开发分析 现场分析 处理步骤 遗留问题

瘦身该表记录,可提升插入速度,目前已经从3亿缩减为1千万 10. SQL_ID=cg2bfwdvz8kf8(调和)

紧急程度 紧急 该语句SQL_ID= cg2bfwdvz8kf8一周执行共计约执行1万次,产生了3次不同的执行计划,不同的只执行计划执行时间差异巨大,之前有过每次执行只有0.3秒的情况,也有300秒的情况。 逻辑读共计产生了500多亿,不过主要都是由最糟糕的执行计划引发的,具体随后可见。 此外该查询还多次引发了ORA-01555的快照太久的相关错误而被中止 ORA-01555 caused by SQL statement below (SQL ID: cg2bfwdvz8kf8, Query Duration=3864 sec, SCN: 0x0bd5.d2ddbba2): SQL描述 SQL_ID= cg2bfwdvz8kf8 SELECT B.INSTANCE_ID, B.CLASS_ID, B.RECONCILIATION_ID FROM RN_IDENTIFICATION_BATCH A, CI_BASE_ELEMENT B WHERE A.INSTANCE_ID = B.INSTANCE_ID AND A.SOURCE_DATASET_ID = :B1 AND EXISTS (SELECT 1 FROM RN_IDENTIFICATION_BATCH C, CI_BASE_ELEMENT D WHERE C.INSTANCE_ID = D.INSTANCE_ID AND C.SOURCE_DATASET_ID = :B2 AND B.RECONCILIATION_ID = D.RECONCILIATION_ID); --当前平均执行300多秒的执行计划(最糟糕的这个执行计划) SQL语句 功能模块 JDBC Thin Client 开发分析 这里很显然是因为RN_IDENTIFIED_BATCH产生笛卡尔乘积引发的,这个表是全局临时表,一般来说记录在运行期间,会达到10万条左右,这个笛卡尔乘积是非常可怕的。 采用固定执行计划的方式,代码修正大致如下: SELECT /*+ OPT_PARAM('_optimizer_mjc_enabled','false') leading(A) use_hash(B) */ B.INSTANCE_ID, B.CLASS_ID, B.RECONCILIATION_ID FROM RN_IDENTIFICATION_BATCH A, CI_BASE_ELEMENT B WHERE A.INSTANCE_ID = B.INSTANCE_ID AND A.SOURCE_DATASET_ID = 1 AND EXISTS (SELECT /*+leading(C)*/ 1 FROM RN_IDENTIFICATION_BATCH C, CI_BASE_ELEMENT D WHERE C.INSTANCE_ID = D.INSTANCE_ID AND C.SOURCE_DATASET_ID = 1 AND B.RECONCILIATION_ID = D.RECONCILIATION_ID); 大致执行计划将会如下: 现场分析 不过这个修改收效不明显,后续采用将全局临时表的统计信息收回,继续观察中。 处理步骤 遗留问题 11. SQL_ID=gvx85f2snmv91(调和)

紧急程度 SQL描述 紧急 该语句是手动执行的,执行了2万多秒,产生了ORA-01555错误 SQL_ID= gvx85f2snmv91 select count(*) from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id and b.is_cascade_delete = 1 and a.source_instance_id in (select a.destination_instance_id from ci_base_relationship a, ci_class_relation_tree b where a.class_id = b.class_id SQL语句 and b.is_auto_discover = 1 and a.source_instance_id in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 6)) and dataset_id = 2) and a.destination_instance_id not in (select /*+ OPT_PARAM('_optimizer_mjc_enabled', 'false') */ instance_id from ci_base_element where reconciliation_id in (select reconciliation_id from ci_base_element where instance_id in (select instance_id from rn_identification_batch where source_dataset_id = 6)) and dataset_id = 2)) and rownum < 2; 功能模块 开发分析 现场分析 处理步骤 遗留问题 JDBC Thin Client 全局临时表统计信息收回后,再继续观察中 12. SQL_ID= 5aqgh8kjdxgc0(告警权限)

紧急程度 紧急 该语句SQL_ID= 5aqgh8kjdxgc0是手动执行的,产生了ORA-01555错误,执行时间1万秒。实际上和这个语句类似的语句有非常之多,比如还有如下7个,这7个语句除了SQL_ID=4qr2y85v36swk也在1万多秒外,其他当时的执行时间均在5千秒左右,在11月19日下午3点-5点之间,这些语句点击次数均在50次左右。 SQL描述 SQL_ID= 2d9gm8k2n4zhm SQL_ID= 4qr2y85v36swk SQL_ID=249mbgraf4d7d SQL_ID=5ca3t96b5mz37 SQL_ID=g626wakff4pjt SQL_ID=b0juzv48c6xcf SQL_ID=dpaqxqm7acbta 这些相似语句只是部分地方取值有不同而已。 至于这些告警权限语句的优化,之前早有定论了,具体见描述。 SQL_ID=5aqgh8kjdxgc0 select * from (select rownum sid, a.* from (select a.ne_alarm_list_id id, a.flow_id, /*流程ID*/ a.oprt_state oprtState, /*操作状 态*/ a.alarm_state alarmState, /*告警状态*/ a.alarm_Level alarmLevel, /*告警级别*/ a.BROKER_IP, /*Agent所连接的broker的ip地址*/ a.alarm_class alarmClass, /*告警类别*/ to_char(a.generate_time, 'yyyy-mm-dd') generatedate, /*产生日期*/ to_char(a.last_generate_time, 'yyyy-mm-dd') lastdate, /*最后产生日期*/ b1.list_label alarm_type, a.ne_name, a.KPI_NAME, a.kpi_value, a.ne_name DATASOURCE, '' || decode(a.alarm_Level, '1', '严重', '2', '重要', '3', '一般', '未知') || '' alarm_Level, to_char(a.generate_time, SQL语句 'yyyy-mm-dd hh24:mi:ss') generate_time, to_char(a.last_generate_time, 'yyyy-mm-dd hh24:mi:ss') last_generate_time, a.alarm_times, decode(a.OPRT_STATE, '20 ', to_char(a.CONFIRM_TIME, 'yyyy-mm-dd hh24:mi:ss'), '25', to_char(a.SUSPEND_TIME, 'yyyy-mm-dd hh24:mi:ss'), '30', to_char(a.CLEAR_TIME, 'yyyy-mm-dd hh24:mi:ss'), '40', to_char(a.DELETE_TIME, 'yyyy-mm-dd hh24:mi:ss')) executeTim, b2.list_label alarm_state, b4.list_label oprt_state, PKP_ALARM_RULE.getNotifyStaff(a.ne_alarm_list_id) sms /*自定义列项*/ from ne_alarm_list a, (select * fro m TP_DOMAIN_LISTVALUES where domain_code = 'DOMAIN_NE_ALARM_TYPE') b1, (select * from tp_domain_listvalues where domain_code = 'DOMAIN_ALARM_STATE') b2, (select * from tp_domain_listvalues where domain_code = 'DOMAIN_DR_ID_FLAG') b3, (select * from tp_domain_listvalues where domain_code = 'DOMAIN_ALARM_OPRT_STATE') b4, kpi_code_list d, manage_region e, ne_trans_alarm nta, CI_BASE_ELEMENT \"$CI_TAB\ (SELECT CLASS_ID FROM CI_CLASS_TREE CONNECT BY SUPER_CLASS_ID = PRIOR CLASS_ID START WITH CLASS_ID IN (SELECT CLASS_ID FROM CONFIG_PRIVILEGE_CLASS WHERE STATE = '0SA')) \"$SET_PRI_CI_CLASS\ (SELECT T.PRIMARY_ID, SUM(T.HAS_READ) HAS_READ FROM TREE_PRIVILEGE T WHERE T.TREE_CFG_NAME = 'NET_ELEMENT' AND T.ASSIGN_OBJECT IN (:1, :2, :3, :4, :5, :6, :7, :8) GROUP BY T.PRIMARY_ID) \"$PRI_VIEW\" where \"$CI_TAB\".class_id = \"$SET_PRI_CI_CLASS\".class_id(+) and \"$CI_TAB\".instance_id = a.CONFIG_NE_ID and nvl2(\"$SET_PRI_CI_CLASS\".class_id, a.CONFIG_NE_ID, a.NE_ID) = \"$PRI_VIEW\".PRIMARY_ID AND \"$PRI_VIEW\".HAS_READ > 0 and b1.list_value = a.alarm_Type and b2.list_value = a.alarm_state and b3.list_value = a.dr_id || '' and b4.list_value = a.oprt_state and a.kpi_id = d.kpi_id and nvl(a.ALARM_REGION_ORIGIN, '-1') = to_char(e.region_id(+)) and a.ne_alarm_list_id = nta.ne_alarm_list_id(+) and a.NE_TYPE_ID not in (select b.ci_class_id class_id from basic_tree_node a, ci_sys_view_tree b where a.tree_id = b.tree_id and a.public_tree_key = 'CI_BUSI_TREE_FOR_CI_VIEW' union select class_id from ci_class_tree a connect by prior a.class_id = a.super_class_id start with a.class_id = pkp_cmdb_util.get_ci_class_id('core.Business') union select class_id from ci_class_tree t where pkp_csk_manager.getServFlowTacheType(t.class_id) = 'BusinessService') and a.ALARM_STATE in (0) and 1 = :9 and a.CREATE_TIME >= to_date(:10, 'yyyy-MM-dd hh24:mi:ss') and a.CREATE_TIME <= to_date(:11, 'yyyy-MM-dd hh24:mi:ss') order by b2.sort_id, b4.sort_id, a.alarm_type, a.ne_id, nvl(a.last_send_time, a.create_Time) desc) a where rownum <= :12) b where b.sid >= :13 功能模块 手工点击 该模块的优化之前我已经建议过了,此前集团,浙江,新疆均发生过,并已经调整更新了。请开发人员将补丁更新一下即可。 主要问题在如下: (SELECT T.PRIMARY_ID, SUM(T.HAS_READ) HAS_READ FROM TREE_PRIVILEGE T WHERE T.TREE_CFG_NAME = 'NET_ELEMENT' AND T.ASSIGN_OBJECT IN (:1, :2, :3, :4, :5, :6, :7, :8) GROUP BY T.PRIMARY_ID) \"$PRI_VIEW\" 这个结果集变化为全局临时表后,执行计划方可正常。 此外这个PKP_ALARM_RULE.getNotifyStaff(a.ne_alarm_list_id) sms /*自定义列项*/ 需要去掉 此外SQL_ID= 2d9gm8k2n4zhm SQL_ID= 4qr2y85v36swk SQL_ID=249mbgraf4d7d SQL_ID=5ca3t96b5mz37 SQL_ID=g626wakff4pjt SQL_ID=b0juzv48c6xcf SQL_ID=dpaqxqm7acbta 和本处SQL_ID=5aqgh8kjdxgc0 居然是同一条语句, 差异只在于 and a.ALARM_STATE in (0) and 1 = :9 这些部分! 13. SQL_ID= b1z7kzp1p1hjd

紧急程度 SQL描述 紧急 该语句SQL_ID= b1z7kzp1p1hjd 一周时间执行200万次,平均每次执行1.6秒左右,一周下来合计产生400多亿次逻辑读。 该表DATA_MSG_FILELIST有约2千万条左右 SQL_ID=b1z7kzp1p1hjd SELECT DATA_MSG_FILELIST_ID, FILE_NAME, FILE_PATH, FILE_STATE, nvl(file_type, 1) FROM DATA_MSG_FILELIST WHERE FILE_TYPE=1 AND FILE_STATE=:FILE_STATE SQL语句 功能模块 开发分析 InfIndb@itnmapp (TNS V1-V3) 根据需求以及和开发人员郑勇的沟通,这个语句基本上确定下来就是 WHERE FILE_STATE=1 AND FILE_TYPE=1 的逻辑,取值都是1,并且都是这样联合查询的。 SQL> select count(*) from DATA_MSG_FILELIST WHERE FILE_STATe=1; COUNT(*) ---------- 22580 SQL> select count(*) from DATA_MSG_FILELIST WHERE FILE_TYPE=1; COUNT(*) ---------- 4548314 SQL> select count(*) from DATA_MSG_FILELIST WHERE FILE_STATE=1 AND FILE_TYPE=1; COUNT(*) ---------- 0 从以上统计来来看,显然是FILE_STATE和FILE_TYPE有一个联合索引是最高效,因为联合查询返回几乎每次都是0或者很小的值,当前语句用到的索引是FILE_STATE列的索引,虽然因为FILE_STATe=1的固定取值返回22580占2千万记录比率很小的一部分,效率尚可,时间平均仅0.16秒,但是显然是联合索引返回0条或者少数条来的高效! SQL> select t.table_name,, 2 t.partitioning_type, 3 t.partition_count 4 from user_part_tables t 5 where table_name in 现场分析 6 ('DATA_MSG_FILELIST'); TABLE_NAME PARTITI PAR -------------------------- ------- --- SQL> select t.table_name,t.index_name, t.column_name, t.column_position, t.DESCEND 2 from user_ind_columns t 3 where table_name in('DATA_MSG_FILELIST') 4 order by table_name,index_name, column_position; TABLE_NAME INDEX_NAME COLUMN_NAME C DESCEND -------------------- ------------------------------ ------------------------- - ------- DATA_MSG_FILELIST DATA_MSG_FILE_GENERATE_DATE FILE_GENERATE_DATE 1 ASC DATA_MSG_FILELIST DATA_MSG_TYPE FILE_TYPE 1 ASC DATA_MSG_FILELIST PK_DATA_MSG_FILELIST DATA_MSG_FILELIST_ID 1 ASC DATA_MSG_FILELIST PK_DATA_MSG_STATE FILE_STATE 1 ASC 此外该表只要保留最近三个月数据,却发现去年数据都有1千万左右 SQL> SELECT count(*) FROM DATA_MSG_FILELIST a where a.file_generate_date二.分析及建议

1. 时间条件写法导致查询多跨一个分区

2. 该有的索引没有

总表大小:34682156,网元(166889):51377,而你在这个ne_id列没有索引! 如果这索引加了,应该几秒就完成

因篇幅问题不能全部显示,请点此查看更多更全内容