Method R Discussion Group

Method R helps developers, DBAs, and decision-makers optimize Oracle-based software in every phase of the software life cycle.

Discuss Method R products, techniques, and events.

Ask new questions or post your own responses.

SQL*Net message to client may be within database call wait event

Hi All,

the tests that I ran suggest, that SQL*Net message to client may be both a between database call wait event (as stated in "Optimizing Oracle Performance", page 88) and a within database call wait event. This has implications for calculating R, though the impact is small, since this wait event completes within a few microseconds.

It seems that whenever an EXEC call returns a value such as when calling a function (see my previous posting with DBMS_PIPE) or DML RETURNING is used, SQL*Net message to client is a within database call wait event.

After an EXEC call without DML returning, SQL*Net message from client comes after EXEC in the trace, whereas with DML returning, it comes before the EXEC entry:

CREATE TABLE t (
    n NUMBER,
    c1 VARCHAR2(4000) DEFAULT rpad('X', 3999, 'X')
);

VARIABLE c1 VARCHAR2(4000)
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
INSERT INTO t(n) VALUES(1);
INSERT INTO t(n) VALUES(1) RETURNING c1 INTO :c1;

get this trace:
PARSING IN CURSOR #10 len=26 dep=0 uid=61 oct=2 lid=61 tim=219287143976 hv=3758199711 ad='6c4d5e64'
INSERT INTO t(n) VALUES(1)
END OF STMT
PARSE #10:c=0,e=1673,p=0,cr=1,cu=0,mis=1,r=0,dep=0,og=1,tim=219287143969
EXEC #10:c=0,e=14408,p=0,cr=1,cu=22,mis=0,r=1,dep=0,og=1,tim=219287158490
WAIT #10: nam='SQL*Net message to client' ela= 6 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=219287158605
WAIT #10: nam='SQL*Net message from client' ela= 1593 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=219287160261

=====================
PARSING IN CURSOR #7 len=48 dep=0 uid=61 oct=2 lid=61 tim=219287160635 hv=4030903414 ad='6c1ac860'
INSERT INTO t(n) VALUES(1) RETURNING c1 INTO :c1
END OF STMT
PARSE #7:c=0,e=218,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=219287160628
WAIT #7: nam='SQL*Net message to client' ela= 4 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=219287161396
WAIT #7: nam='SQL*Net more data to client' ela= 25 driver id=1111838976 #bytes=2088 p3=0 obj#=-1 tim=219287161477
WAIT #7: nam='SQL*Net more data to client' ela= 18 driver id=1111838976 #bytes=1967 p3=0 obj#=-1 tim=219287161538
EXEC #7:c=15625,e=825,p=0,cr=2,cu=6,mis=1,r=1,dep=0,og=1,tim=219287161573

Note the position of SQL*Net message from client. In the trace of the first INSERT, it was emitted after the EXEC of cursor 10. In the trace of INSERT ... RETURNING, it is emitted before the EXEC entry. This implies that  SQL*Net message from client is part of processing the EXEC #7 call and is thus a within database call wait event.

Kind regards,

Norbert

PS: book reviewers for "Secret ORACLE" are still welcome: http://groups.yahoo.com/group/oracle_book_review
Norbert Debes Send private email
Friday, November 30, 2007
 
 
You've correctly identified another problem with page 88 in the OOP book. SNMTC is actually always a within call event. The book incorrectly states it as a between call event. That list was comprised before we ran a test proving that SNMTC is a within call event. If you need more detail I can provide it (including the test definition).

But when you say that SNMFC is a within call event, you confused me when you said "part of processing the EXEC #7 call."

Did you mean "EXEC #10 preceding" or did you mean "PARSE #7 following"?

If you meant "EXEC #10 preceding", then that doesn't make sense. The EXEC #10 cannot complete until all of its within call events complete.

The reason that it looks like a within call event is the choice for the cursor slot number (i.e., 10 in this case) made by the RDBMS developers. It's a poor choice. Sometimes you'll notice that a SNMFC's cursor slot number is zero and other times it's non-zero. It just depends on what happens before and after the SNMFC.

Now, if you meant "PARSE #7 following", then that also doesn't make sense because the SNMFC's ela is 1593 but the following PARSE has an e of 218.
Jeff Holt Send private email
Thursday, December 6, 2007
 
 
Hello Jeff,

thanks for your reply and sorry for confusing you. I meant to write SQL*Net message to client all the way. Below is the correction (with line numbers this time):

1 PARSING IN CURSOR #7 len=48 dep=0 uid=61 oct=2 lid=61 tim=219287160635 hv=4030903414 ad='6c1ac860'
2 INSERT INTO t(n) VALUES(1) RETURNING c1 INTO :c1
3 END OF STMT
4 PARSE #7:c=0,e=218,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=219287160628
5 WAIT #7: nam='SQL*Net message to client' ela= 4 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=219287161396
6 WAIT #7: nam='SQL*Net more data to client' ela= 25 driver id=1111838976 #bytes=2088 p3=0 obj#=-1 tim=219287161477
7 WAIT #7: nam='SQL*Net more data to client' ela= 18 driver id=1111838976 #bytes=1967 p3=0 obj#=-1 tim=219287161538
8 EXEC #7:c=15625,e=825,p=0,cr=2,cu=6,mis=1,r=1,dep=0,og=1,tim=219287161573

Note the position of SQL*Net message to client. In the trace of the first INSERT, it was emitted after the EXEC of cursor 10. In the trace of INSERT ... RETURNING, it is emitted before the EXEC entry (WAIT #7, 2nd example, line 5). This implies that  SQL*Net message to client is part of processing the EXEC #7 call in line 8 and is thus a within database call wait event.

I am very curious to learn how you proved that SQL*Net message to client is always a within database call wait event.
The rule I remember from OOP is that database calls such as EXEC are emitted after all the within call wait events due to that same database call. In the first example WAIT #10 for SQL*Net message to client is emitted after EXEC #10. This breaks the aforementioned rule, if SQL*Net message to client is a within database call wait event.

If there is any (upcoming) publication or presentation that covers the same ground as chapter "Interpreting Extended SQL Trace Data", I'd appreciate it if you sent me a link or file.

Will there be a revised edition of the book in the near future?

Thank you and kind regards,
Norbert
Norbert Debes Send private email
Tuesday, December 11, 2007
 
 
I'll run some more tests but for now it's extremely difficult for me to accept that the EXEC didn't finish before the following SNMTC started. In other words, I find it difficult to believe that SNMTC is only sometimes a within call event.

Regarding a new revision of the book, I don't think there will be one because I recall him saying so. The only reason I don't know for sure is because we haven't talked about it for quite a while.
Jeff Holt Send private email
Thursday, December 13, 2007
 
 
Now for the defintion of the test "what is snmtc: a within call event or a between call event?"

The ability to easily tell is based upon the ability to force the write call (i.e., the one whose file descriptor is the socket between the server and the client) to last a predictable duration.

If you can force that write to last at least 5 seconds and assuming the write would normally take a few milliseconds, then you can probably easily see where the 5 seconds show up. Here's a 9i+ example:

WAIT #1: nam='SQL*Net message to client' ela= 4 ...
EXEC #7:c=100,e=400,...
In this example, the EXEC lasted 400us consuming 100us CPU and took 4us to send the response back to the client. It's actually incorrect (i.e., bug) that the rdbms says 4 but that's beside the point.

If we trace the system calls executed by the server process we'll see a write call corresponding to the WAIT event. If we modify the write call to make it last for a really long time, say, 5 seconds and run the test again, then we see something close to this:

WAIT #1: nam='SQL*Net message to client' ela= 5000000 ...
EXEC #7:c=100,e=5000000,...

So, there are two questions.

How do you modify the write system call? I'll have to ask Cary if I'm allowed to say anything further.

Can you exercise enough of the rdbms code to know that SNMTC is really always a within call event? Unfortunately, it's probably impossible to know all the contexts in which a SNMTC will execute for a given instance. Even 9i is extremely robust/complicated. And it is absolutely true changes to the rdbms (e.g., 10g, 11g) can affect SNMTC behavior and the contexts in which it executes. And so, the best that we can do is test the most common contexts and then ask for peer review.
Jeff Holt Send private email
Thursday, December 13, 2007
 
 
Hi Jeff,

thanks for sharing the test results. I had the same idea you used: make the write to the socket last long enough, to see where the time shows up. I guess a debugger would do. If not, use an OS where you can change the C library and link with a special C library.

So the rule, that within call waits of a database call are emitted before the database call itself, does not always hold.

I will modify my own profiler to reflect that SNMTC is a within call wait event.

Norbert
Norbert Debes Send private email
Friday, December 14, 2007
 
 
Hello Jeff,

as you may imagine, it was hard for me to put this issue to rest, before trying to find some evidence myself. I pursued the idea with the debugger and after some hours of trial and error, was able to come up with some evidence, that suggests SNMTC can be both a within and a between call wait event. The problem was finding the right function to set a breakpoint in, such that the SNMTC wait could be artifically elongated. Unfortunately the SQL*Net server trace did not contain the right function name.

Here's my testcase, that uses the gdb debugger on Linux (running in vmware, which should not matter):

1. Start SQL*Plus and get the OS pid:
$ sqlplus ndebes/secret@ten_rac.oradbpro.com
SQL*Plus: Release 10.2.0.1.0 - Production on Sat Dec 15 00:09:50 2007
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, Oracle Label Security and Data Mining options

SQL> select spid from v$session s,v$process p where s.paddr=p.addr and s.sid=userenv('sid');

SPID
------------
26078

2. On Linux, attach to the dedicated server process with the debugger gdb:
$ gdb $ORACLE_HOME/bin/oracle 26078
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".
...

3. Set a breakpoint in niosqn:
(gdb) break nioqsn
Breakpoint 1 at 0xb7cad16

4. In SQL*Plus, run my SQL script (download from  http://www.oradbpro.com/snmtc.sql ):
SQL> @snmtc

5. In gdb, repeatedly allow the program to continue with "c", after it has stopped at the breakpoint:
Breakpoint 1, 0x0b7cad16 in nioqsn ()
(gdb) c
Continuing.

Make sure you wait one or more seconds before entering "c". The longer you wait, the longer SQL*Net message to client gets.

A trace file with very long SQL*Net message to client wait events is created in USER_DUMP_DEST:

$ tail -f ten1_ora_26078.trc
Oracle process number: 13
Unix process pid: 26078, image: oracleTEN1@dbserver1.oradbpro.com

*** 2007-09-08 14:23:24.112
*** ACTION NAME:() 2007-09-08 14:23:24.109
*** MODULE NAME:(SQL*Plus) 2007-09-08 14:23:24.109
*** SERVICE NAME:(TEN.oradbpro.com) 2007-09-08 14:23:24.109
*** SESSION ID:(159.5375) 2007-09-08 14:23:24.109
WAIT #8: nam='SQL*Net message to client' ela= 18957967 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381058700221

Here's an excerpt of the trace file ten1_ora_26078.trc with line numbers:

 1: =====================
 2: PARSING IN CURSOR #3 len=26 dep=0 uid=42 oct=2 lid=42 tim=1161381171121312 hv=3758199711 ad='2df3d948'
 3: INSERT INTO t(n) VALUES(1)
 4: END OF STMT
 5: PARSE #3:c=43993,e=44639,p=0,cr=22,cu=0,mis=1,r=0,dep=0,og=1,tim=1161381171121285
 6: EXEC #3:c=3999,e=3969,p=0,cr=1,cu=23,mis=0,r=1,dep=0,og=1,tim=1161381171125312
 7: WAIT #3: nam='SQL*Net message to client' ela= 1154070 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381172279516
 8: WAIT #3: nam='SQL*Net message from client' ela= 23424 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381172303445
 9: WAIT #0: nam='SQL*Net message to client' ela= 1115166 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381173419214
10: WAIT #0: nam='SQL*Net message from client' ela= 48020 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381173467267
11: =====================
12: PARSING IN CURSOR #8 len=48 dep=0 uid=42 oct=2 lid=42 tim=1161381173469325 hv=4030903414 ad='2ded26e8'
13: INSERT INTO t(n) VALUES(1) RETURNING c1 INTO :c1
14: END OF STMT
15: PARSE #8:c=1999,e=1539,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1161381173469308
16: WAIT #8: nam='SQL*Net message to client' ela= 1321978 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381174794123
17: *** 2007-09-08 14:25:30.502
18: WAIT #8: nam='SQL*Net more data to client' ela= 7335842 driver id=1413697536 #bytes=2076 p3=0 obj#=-1 tim=1161381182130598
19: WAIT #8: nam='SQL*Net more data to client' ela= 684390 driver id=1413697536 #bytes=1967 p3=0 obj#=-1 tim=1161381182815633
20: EXEC #8:c=88986,e=9346126,p=0,cr=2,cu=6,mis=1,r=1,dep=0,og=1,tim=1161381182815962
21: WAIT #8: nam='SQL*Net message from client' ela= 35588 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381182852088
22: WAIT #0: nam='SQL*Net message to client' ela= 8041280 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381190894021
23: WAIT #0: nam='SQL*Net message from client' ela= 14700 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1161381190909307
24: =====================
25: PARSING IN CURSOR #9 len=60 dep=0 uid=42 oct=47 lid=42 tim=1161381190913515 hv=1164329770 ad='2ddf8228'
26: BEGIN :n:=dbms_pipe.receive_message('some_pipe', 10); END;
27: END OF STMT
28: PARSE #9:c=4000,e=3955,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1161381190913497
29: =====================

The SNMTC WAIT #0 in lines 7 and 9 is not rolled up into any database call.

The three SQL*Net related waits associated with cursor 8 in lines 16-19 are rolled up into the EXEC #8 in line 20.

The complete trace file is available at  http://www.oradbpro.com/ten1_ora_26078_long_snmtc_gdb.trc .
A resource profile of the trace file created with my ESQLTRCPROF is at  http://www.oradbpro.com/ten1_ora_26078_long_snmtc_gdb.prof

Please take a look at the evidence and let me know your conclusions. Thank you and have a nice weekend!

Norbert
Norbert Debes Send private email
Friday, December 14, 2007
 
 
The problem that I haven't made clear, so far, is that the appearance of PARSE, EXEC, and FETCH don't always equate to a client's call to the database.

Some time ago (I forget exactly when), server technologies made enhancements to the OCI that allows bundling of the PARSE and EXEC into a single message. In other words, the PARSE is deferred until the EXEC is called.

So the SNMTC on line 16 is being sent to the client as the first part of the response to the EXEC, which makes it a withiner even in this context.

You can see evidence of some unknown OCI function working on lines 22 and 23. A complete round trip with the OCI client was made on those two lines but there's no evidence to indicate the meaning of the round trip. Knowing SQL*Plus, it could be the effect of calling OCIHandleFree but further tests of instrumented OCI applications could probably say for sure if such an action is likely to produce this server-side behavior.

So, to summarize more accurately than I have before, the event SNMTC is a response to some type of message (i.e., call) made on behalf of the client OCI. It's extremely unfortunate that afaik there is no way to make the server process say when an OCI function (or underlying function) is called. Having that information would make it a whole lot easier to write a profiler.
Jeff Holt Send private email
Wednesday, December 19, 2007
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Discussion Groups Main Page

Powered by FogBugz