Log Buffer #366, A Carnival of the Vanities for DBAs

While the Oracle blogosphere is buzzing with the Collaborate news and views, SQL Server and MySQL bloggers are also getting upbeat about their respective fields and producing gems of blog posts. This Log Buffer Edition covers that all and more.

Oracle:

Run Virtual Machines with Oracle VM.

Last Call to Submit to the JavaOne Java EE Track.

UX – No Time To Kill.

Updated: Price Comparison for Big Data Appliance and Hadoop.

PLSQL_OPTIMIZE_LEVEL paraeter and optimizing of the PL/SQL code.

SQL Server:

SQL Server 2012 AlwaysOn Groups and FCIs.

How to recover a suspect msdb database in SQL Server.

Data Science Laboratory System – Distributed File Databases.

Stairway to T-SQL: Beyond The Basics Level 5: Storing Data in Temporary Tables.

The Girl with the Backup Tattoo.

MySQL:

pt-online-schema change and row based replication.

MariaDB Client Library for C 2.0 and Maria JDBC Client 1.1.7 released.

Help MariaDB gather some statistics!

Uninitialized data in the TokuDB recovery log.

MySQL at LOADays Conference in Antwerp, Belgium (Apr 5-6, 2014)

A Simple Way to Monitor Java in Linux

A quick and easy way to know what is it inside Java process that is using your CPU. Using just Linux command line tools and JDK supplied command line utilities.

Introduction

Here are a few things you need to know before starting. Following the links is not necessary, they are available for the reference.

  • there are different vendors of Java Virtual Machine. This post is about Oracle’s JVM which is called HotSpot. Linux x86-64 is considered as OS platform. Most of the things about HotSpot are applicable to other vendors too but with slight changes. OSes other than Linux may add some more complications
  • it’s called Virtual Machine, because it virtualizes runtime environment for a Java application. So to know where to look at you need to know a few things about how specific VM is organized. For a detailed overview of the HotSpot, please refer to this article
  • on Linux, a thread inside HotSpot VM is mapped directly to an OS level thread. Well, it may not be true on all OSes, but for modern Linux kernels this is true. So every thread on the OS level is a thread inside a Java application
  • there are generally two types of threads inside a HotSpot VM: native and application threads. Application threads are those that run some Java code, and usually this is what applications are using to run their code. Native threads run something which is not written in Java, usually it’s code in C/C++ and usually all these threads are special utility threads started by a VM itself.

Identifying Threads

Since a Java program may start many threads, each executing some program code, it is necessary to understand which threads are using CPUs. On Linux, top -H will show you CPU usage on a per-thread basis. Here is an example. First, a process which consumes CPU:

top - 16:32:29 up 10:29,  3 users,  load average: 1.08, 0.64, 0.56
Tasks: 172 total,   1 running, 171 sleeping,   0 stopped,   0 zombie
Cpu(s): 48.7%us, 51.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   1500048k total,  1476708k used,    23340k free,    62012k buffers
Swap:  4128764k total,    75356k used,  4053408k free,   919836k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7599 oracle    20   0 1151m  28m 9100 S 85.5  1.9   0:12.67 java
 2575 oracle    -2   0  709m  10m 8336 S 10.6  0.7  47:34.05 oracle
 2151 root      20   0  207m  44m 6420 S  1.7  3.0   0:27.18 Xorg

If we check the details of CPU usage for PID=7599 with “top -H -p 7599″, then we will see something like this:

top - 16:40:39 up 10:37,  3 users,  load average: 1.47, 1.25, 0.90
Tasks:  10 total,   1 running,   9 sleeping,   0 stopped,   0 zombie
Cpu(s): 49.3%us, 50.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   1500048k total,  1460468k used,    39580k free,    50140k buffers
Swap:  4128764k total,    76208k used,  4052556k free,   912644k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7601 oracle    20   0 1151m  28m 9100 R 85.9  1.9   7:19.98 java
 7602 oracle    20   0 1151m  28m 9100 S  1.0  1.9   0:02.95 java
 7599 oracle    20   0 1151m  28m 9100 S  0.0  1.9   0:00.01 java

So there is 1 execution thread inside a Java process, which is constantly on top and is utilizing around 85% of a single core.

Now the next thing to know is: what is this thread doing. To answer that question we need to know two things: thread stacks from a Java process and a way to map OS level thread to a Java thread. As I mentioned previously, there is one to one mapping between OS and Java level threads in HotSpot running on Linux.

To get a thread dump we need to use a JDK utility called jstack:

[oracle@oel6u4-2 test]$ jstack 7599
2014-02-28 16:57:23
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.12-b01 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f05a0001000 nid=0x1e66 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Low Memory Detector" daemon prio=10 tid=0x00007f05c4088000 nid=0x1db8 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f05c4085800 nid=0x1db7 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f05c4083000 nid=0x1db6 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f05c4081000 nid=0x1db5 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f05c4064800 nid=0x1db4 in Object.wait() [0x00007f05c0631000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000eb8a02e0> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
	- locked <0x00000000eb8a02e0> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00007f05c4062800 nid=0x1db3 in Object.wait() [0x00007f05c0732000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000eb8a0380> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:485)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
	- locked <0x00000000eb8a0380> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]
   java.lang.Thread.State: RUNNABLE
	at java.security.SecureRandom.getInstance(SecureRandom.java:254)
	at java.security.SecureRandom.getDefaultPRNG(SecureRandom.java:176)
	at java.security.SecureRandom.<init>(SecureRandom.java:133)
	at RandomUser.main(RandomUser.java:9)

"VM Thread" prio=10 tid=0x00007f05c405c000 nid=0x1db2 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f05c408b000 nid=0x1db9 waiting on condition

JNI global references: 975

To map OS level thread to a Java thread in a thread dump, we need to convert native thread ID from Linux to base 16, and search for “nid=$ID” in the stack trace. In our case thread ID is 7601 which is 0x1db1, and the Java thread had following stack trace at the time of running jstack:

"main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]
   java.lang.Thread.State: RUNNABLE
	at java.security.SecureRandom.getInstance(SecureRandom.java:254)
	at java.security.SecureRandom.getDefaultPRNG(SecureRandom.java:176)
	at java.security.SecureRandom.<init>(SecureRandom.java:133)
	at RandomUser.main(RandomUser.java:9)

A Way to Monitor

Here is a way to get a stack trace of a thread inside a Java process with command line tools (PID and TID are Process ID of Java process, and Thread ID of an interesting thread on the OS level):

[oracle@oel6u4-2 test]$ jstack $PID | awk '/ nid='"$(printf '%#x' $TID)"' /,/^$/'
"main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:220)
	at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:185)
	at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(NativePRNG.java:247)
	at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:261)
	- locked <0x00000000eb8a3370> (a java.lang.Object)
	at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
	at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
	at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
	- locked <0x00000000e43adc90> (a java.security.SecureRandom)
	at java.security.SecureRandom.next(SecureRandom.java:455)
	at java.util.Random.nextInt(Random.java:189)
	at RandomUser.main(RandomUser.java:9)

As you can see here, the thread is executing a main method of RandomUser class – at least at the time of taking a thread dump. If you would like to see how this changes over time, then a simple watch command may help to get an idea if this thread stack changes frequently or not:

watch -n .5 "jstack $PID | awk '/ nid='"$(printf '%#x' $TID)"' /,/^$/'"

Every 0.5s: jstack 7599 | awk '/ nid='0x1db1' /,/^$/'                                                                                Fri Mar 14 16:29:37 2014

"main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]
   java.lang.Thread.State: RUNNABLE
        at java.util.LinkedHashMap$LinkedHashIterator.<init>(LinkedHashMap.java:345)
        at java.util.LinkedHashMap$KeyIterator.<init>(LinkedHashMap.java:383)
        at java.util.LinkedHashMap$KeyIterator.<init>(LinkedHashMap.java:383)
        at java.util.LinkedHashMap.newKeyIterator(LinkedHashMap.java:396)
        at java.util.HashMap$KeySet.iterator(HashMap.java:874)
        at java.util.HashSet.iterator(HashSet.java:153)
        at java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1005)
        at java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1004)
        at java.security.SecureRandom.getPrngAlgorithm(SecureRandom.java:523)
        at java.security.SecureRandom.getDefaultPRNG(SecureRandom.java:165)
        at java.security.SecureRandom.<init>(SecureRandom.java:133)
        at RandomUser.main(RandomUser.java:9)

So this way you could see what the application thread is doing right now. Since it could be quite a lot of different type of work, the next reasonable step is to add aggregation.

[oracle@oel6u4-2 test]$ ./prof.sh 7599 7601
Sampling PID=7599 every 0.5 seconds for 10 samples
      6  "main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]    java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:220)
	at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:185)
	at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(NativePRNG.java:247)
	at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:261)
	- locked <address> (a java.lang.Object)
	at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
	at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
	at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
	- locked <address> (a java.security.SecureRandom)
	at java.security.SecureRandom.next(SecureRandom.java:455)
	at java.util.Random.nextInt(Random.java:189)
	at RandomUser.main(RandomUser.java:9)

      2  "main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]    java.lang.Thread.State: RUNNABLE
	at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:268)
	- locked <address> (a java.lang.Object)
	at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
	at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
	at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
	- locked <address> (a java.security.SecureRandom)
	at java.security.SecureRandom.next(SecureRandom.java:455)
	at java.util.Random.nextInt(Random.java:189)
	at RandomUser.main(RandomUser.java:9)

      1  "main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]    java.lang.Thread.State: RUNNABLE
	at java.util.Random.nextInt(Random.java:189)
	at RandomUser.main(RandomUser.java:9)

      1  "main" prio=10 tid=0x00007f05c4007000 nid=0x1db1 runnable [0x00007f05c82f4000]    java.lang.Thread.State: RUNNABLE
	at java.security.SecureRandom.next(SecureRandom.java:452)
	at java.util.Random.nextInt(Random.java:189)
	at RandomUser.main(RandomUser.java:9)

Here is what’s inside the prof.sh script:

#!/bin/bash

P_PID=$1
P_NID=$2

if [ "$P_SLEEP" == "" ]; then
  P_SLEEP=0.5
fi

if [ "$P_CNT" == "" ]; then
  P_CNT=10
fi

echo Sampling PID=$P_PID every $P_SLEEP seconds for $P_CNT samples

if [ "$P_NID" == "" ]; then
  CMD="awk '//'"
else
  CMD="awk '/ nid='"$(printf '%#x' $P_NID)"' /,/^$/'"
fi

for i in `seq $P_CNT`
do
  jstack $P_PID | eval $CMD
  sleep $P_SLEEP;
done |
  awk ' BEGIN { x = 0; s = "" }
    /nid=/ { x = 1; }
    // {if (x == 1) {s = s ", "gensub(/<\w*>/, "<address>", "g") } }
    /^$/ { if ( x == 1) { print s; x = 0; s = ""; } }' |
  sort | uniq -c | sort -n -r | head -10 |
  sed -e 's/$/\n/g' -e 's/\t/\n\t/g' -e 's/,//g'

The idea of the script is based on the method from poor man’s profiler adapted for HotSpot thread dumps. The script does the following things:

  • Takes $P_CNT thread dumps of the Java process ID passed as $1 (10 by default)
  • If a native thread ID has been supplied as $2, then searches for the thread stack of this thread in the thread dump
  • Concatenates each thread stack trace into a comma-separated string
  • Aggregates strings and sorts them by the number of occurrences
  • Prettifies the output: removes tabs, commas, and adds new lines back to the thread stack

Conclusion

With a few little things it is possible to understand quite a lot of things in almost any situation related to Java: you can find out the most frequent stack trace by sampling thread dumps.
With this knowledge it is then easy to understand why an issue happening. In my test case, the application instantly generates random numbers without a pause, and 1 thread is occupying 1 CPU core.

Cache anomaly

Just a quick heads-up for anyone who likes to play around with the Keep and Recycle caches.

In 11g Oracle introduced the option for serial direct path reads for tablescans on tables that was sufficiently large – which meant more than the small_table_threshold – provided the table wasn’t already sufficient well cached.  (The rules mean that the choice of mechanism can appear to be a little random in the production environment for tables that are near the threshold size – but if you try testing by doing “alter system flush buffer_cache” you find that you always get direct path reads in testing.)

I’ve just discovered a little oddity about this, though.  I have a table of about 50MB which is comfortably over the threshold for direct path reads. But if I create a KEEP cache (db_keep_cache_size) that is a little larger than the table and then assign the table to the KEEP cache (alter table xxx storage(buffer_pool keep)) then 11.2.0.4 stops doing direct path reads, and caches the table.

Now this doesn’t seem unreasonable – if you’ve assigned an object to the KEEP cache you probably want it (or once wanted it) to be kept in cache as much as possible; so using the KEEP cache if it’s defined and specified makes sense. The reason I mention this as an oddity, though, is that it doesn’t reproduce in 11.1.0.7.

I think I saw a bug note about this combination a few months ago- I was looking for something else at the time and, almost inevitably, I can’t find it when I want it – but I don’t remember whether it was the 11.1 or 11.2 behaviour that was deemed to be correct.

 Update

See comments 1 and 2 below.  I’ve written about this previously, and the caching bechaviour is the correct behaviour. The patch is in 11.2.0.2 and backports are available for 11.1.0.7 and 11.2.0.1. The patch ensures that the table will be read into the cache if it is smaller than the db_keep_cache_size.  (Although we might want to check – see Tanel’s notes – whether this is based on the high water mark recorded in the segment header or on the optimizer stats for the table; and I suppose it might be worth checking that the same feature applies to index fast full scans). From the MoS document:

With the bug fix applied, any object in the KEEP buffer pool, whose size is less than DB_KEEP_CACHE_SIZE, is considered as a small or medium sized object. This will cache the read blocks and avoid subsequent direct read for these objects.

 

 


Cache anomaly

Just a quick heads-up for anyone who likes to play around with the Keep and Recycle caches.

In 11g Oracle introduced the option for serial direct path reads for tablescans on tables that was sufficiently large – which meant more than the small_table_threshold – provided the table wasn’t already sufficient well cached.  (The rules mean that the choice of mechanism can appear to be a little random in the production environment for tables that are near the threshold size – but if you try testing by doing “alter system flush buffer_cache” you find that you always get direct path reads in testing.)

I’ve just discovered a little oddity about this, though.  I have a table of about 50MB which is comfortably over the threshold for direct path reads. But if I create a KEEP cache (db_keep_cache_size) that is a little larger than the table and then assign the table to the KEEP cache (alter table xxx storage(buffer_pool keep)) then 11.2.0.4 stops doing direct path reads, and caches the table.

Now this doesn’t seem unreasonable – if you’ve assigned an object to the KEEP cache you probably want it (or once wanted it) to be kept in cache as much as possible; so using the KEEP cache if it’s defined and specified makes sense. The reason I mention this as an oddity, though, is that it doesn’t reproduce in 11.1.0.7.

I think I saw a bug note about this combination a few months ago- I was looking for something else at the time and, almost inevitably, I can’t find it when I want it – but I don’t remember whether it was the 11.1 or 11.2 behaviour that was deemed to be correct.

 Update

See comments 1 and 2 below.  I’ve written about this previously, and the caching bechaviour is the correct behaviour. The patch is in 11.2.0.2 and backports are available for 11.1.0.7 and 11.2.0.1. The patch ensures that the table will be read into the cache if it is smaller than the db_keep_cache_size.  (Although we might want to check – see Tanel’s notes – whether this is based on the high water mark recorded in the segment header or on the optimizer stats for the table; and I suppose it might be worth checking that the same feature applies to index fast full scans). From the MoS document:

With the bug fix applied, any object in the KEEP buffer pool, whose size is less than DB_KEEP_CACHE_SIZE, is considered as a small or medium sized object. This will cache the read blocks and avoid subsequent direct read for these objects.

 

 


Easy – Oops.

A question came up on OTN today asking for suggestions on how to enforce uniqueness on a pair of columns only when the second column was not null. There’s an easy and obvious solution – but I decided to clone the OP’s example and check that I’d typed my definition up before posting it; and the result came as a bit of a surprise. Here’s a demo script (not using the OP’s table):


create table t1  
(  
	col1	int not null,
	col2	varchar2(1)
);  

create unique index t1_i1 on t1( 
--	case col2 when null then cast(null as int) else col1 end,
--	case when col2 is null then cast(null as int) else col1 end,
	case when col2 is not null then col1 end,
	col2
)
;

insert into t1 values(1,null);
insert into t1 values(1,null);
insert into t1 values(1,'x');
insert into t1 values(1,'y');
insert into t1 values(1,'y');

commit;

column ind1_is   format a5
column ind1_when format 9999

set null N/A

select
	case when col2 is null then cast (null as int) else col1 end	ind1_is,
	case col2 when null then cast (null as int)  else col1 end	ind1_when
from 
	t1
;

The strategy is simple, you create a unique function-based index with two columns; the first column of the index id defined to show the first column of the table if the second column of the table is not null, the second column of the index is simply the second column of the table. So if the second column of the table is null, both columns in the index are null and there is no entry in the index; but if the second column of the table is not null then the index copies both columns from the table and a uniqueness test applies.

Based on the requirement and definition you would expect the first 4 of my insert statements to succeed and the last one to fail. The index will then have two entries, corresponding to my 3rd and 4th insertions.

I’ve actually shown three ways to use the case statement to produce the first column of the index. The last version is the cleanest, but the first option is the one I first thought of – it’s virtually a literal translation the original requirement. The trouble is, with my first definition the index acquired an entry it should not have got, and the second insert raised a “duplicate key” error; the error didn’t appear when I switched the syntax of the case statement to the second version.

That’s why the closing query of the demo is there – when you run it the two values reported should be the same as each other for all four rows in the table – but they’re not. This is what I got on 11.2.0.4:


IND1_IS IND1_WHEN
------- ---------
N/A             1
N/A             1
      1         1
      1         1


I’m glad I did a quick test before suggesting my original answer.

Anyone who has other versions of Oracle available is welcome to repeat the test and report back which versions they finding working correctly (or not).

Update

It’s not a bug (see note 2 below from Jason Bucata), it’s expected behaviour.

 


Easy – Oops.

A question came up on OTN today asking for suggestions on how to enforce uniqueness on a pair of columns only when the second column was not null. There’s an easy and obvious solution – but I decided to clone the OP’s example and check that I’d typed my definition up before posting it; and the result came as a bit of a surprise. Here’s a demo script (not using the OP’s table):


create table t1  
(  
	col1	int not null,
	col2	varchar2(1)
);  

create unique index t1_i1 on t1( 
--	case col2 when null then cast(null as int) else col1 end,
--	case when col2 is null then cast(null as int) else col1 end,
	case when col2 is not null then col1 end,
	col2
)
;

insert into t1 values(1,null);
insert into t1 values(1,null);
insert into t1 values(1,'x');
insert into t1 values(1,'y');
insert into t1 values(1,'y');

commit;

column ind1_is   format a5
column ind1_when format 9999

set null N/A

select
	case when col2 is null then cast (null as int) else col1 end	ind1_is,
	case col2 when null then cast (null as int)  else col1 end	ind1_when
from 
	t1
;

The strategy is simple, you create a unique function-based index with two columns; the first column of the index id defined to show the first column of the table if the second column of the table is not null, the second column of the index is simply the second column of the table. So if the second column of the table is null, both columns in the index are null and there is no entry in the index; but if the second column of the table is not null then the index copies both columns from the table and a uniqueness test applies.

Based on the requirement and definition you would expect the first 4 of my insert statements to succeed and the last one to fail. The index will then have two entries, corresponding to my 3rd and 4th insertions.

I’ve actually shown three ways to use the case statement to produce the first column of the index. The last version is the cleanest, but the first option is the one I first thought of – it’s virtually a literal translation the original requirement. The trouble is, with my first definition the index acquired an entry it should not have got, and the second insert raised a “duplicate key” error; the error didn’t appear when I switched the syntax of the case statement to the second version.

That’s why the closing query of the demo is there – when you run it the two values reported should be the same as each other for all four rows in the table – but they’re not. This is what I got on 11.2.0.4:


IND1_IS IND1_WHEN
------- ---------
N/A             1
N/A             1
      1         1
      1         1


I’m glad I did a quick test before suggesting my original answer.

Anyone who has other versions of Oracle available is welcome to repeat the test and report back which versions they finding working correctly (or not).

Update

It’s not a bug (see note 2 below from Jason Bucata), it’s expected behaviour.

 


Tweaking

The following question came up on OTN recently:

Which one gives better performance? Could please explain.

1) nvl( my_column, ‘N’) <> ‘Y’

2) nvl( my_column, ‘N’) = ‘N’

It’s a question that can lead to a good 20 minute discussion – if you were in some sort of development environment and had a fairly free hand to do whatever you wanted.

The most direct answer is that you could expect the performance to be the same whichever option you chose – but the results might be different, of course, unless you had a constraint on my_column that ensured that it would hold only null, ‘Y’, or ‘N’.  (Reminder:  the constraint check (my_column in (‘Y’,’N’) will allow nulls in the column).

On the other hand, if you create a function-based index on nvl(my_column,’N’) then the second option would give the optimizer the opportunity to use the index – which could make it the more efficient option if a sufficiently small quantity of the data from a large table matched the predicate. Of course in this case you would need a histogram on the hidden column definition supporting the index so that the optimizer could detect the data skew.

But if a function-based index is a step in the right direction it’s worth thinking about how to pick the best function-based index.


create index t1_i1 on t1(case my_column when 'Y' then null else 'N' end);

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1');

select * from t1 where case my_column when 'Y' then null else 'N' end = 'N';

Provided you can change the queries that use the original predicate, you can create a function-based index that is the smallest possible index to assist your query, and you won’t need a histogram to allow the optimizer to get the correct cardinality since there will be just one distinct value in the index (and the underlying column).

It’s possible to go a little further – what if the typical query that started this discussion was really something like:


select  *
from    t1
where   nvl(my_column, 'N') = 'N'
and     other_column = {some value}

If the nvl() predicate always returned a small fraction of the data we might engineer an index to handle the requirement of the entire where clause:


create index t1_i2 on t1(case my_column when 'Y' then null else other_column end);
execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all hidden columns size 254');

In this case you might need a histogram on the hidden column definition supporting the index, which is why I’ve changed my method_opt. We’ve constructed an index that is the smallest possible index that will satisfy the query as precisely as possible while giving the optimizer the best chance of producing accurate cardinality estimates. [Update: until the appearance of Kirill's comment (#1) I forgot to point out that once again you have to change the original SQL from using the nvl() expression to using a case expression that matches the index.]

Footnote

It would be easy to fall into the trap of playing around with this type of tweaking every time you hit a performance problem. Bear in mind, though, that even indexes which are as small and accurately targeted as possible can introduce various types of overhead (contention, cost of gathering stats, instability of histogram collection); so always try to think through the potential costs as well as the benefits of such an approach - is it really worth the effort.

It’s also worth noting that things don’t always work the way you expect.

Finally, of course, you might choose to create “proper” virtual columns in 11g so that you can refer to them by name, rather than handling them indirectly as you have to when they are “undocumented” hidden columns generated by function-based indexes.

 


Tweaking

The following question came up on OTN recently:

Which one gives better performance? Could please explain.

1) nvl( my_column, ‘N’) <> ‘Y’

2) nvl( my_column, ‘N’) = ‘N’

It’s a question that can lead to a good 20 minute discussion – if you were in some sort of development environment and had a fairly free hand to do whatever you wanted.

The most direct answer is that you could expect the performance to be the same whichever option you chose – but the results might be different, of course, unless you had a constraint on my_column that ensured that it would hold only null, ‘Y’, or ‘N’.  (Reminder:  the constraint check (my_column in (‘Y’,’N’) will allow nulls in the column).

On the other hand, if you create a function-based index on nvl(my_column,’N’) then the second option would give the optimizer the opportunity to use the index – which could make it the more efficient option if a sufficiently small quantity of the data from a large table matched the predicate. Of course in this case you would need a histogram on the hidden column definition supporting the index so that the optimizer could detect the data skew.

But if a function-based index is a step in the right direction it’s worth thinking about how to pick the best function-based index.


create index t1_i1 on t1(case my_column when 'Y' then null else 'N' end);

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1');

select * from t1 where case my_column when 'Y' then null else 'N' end = 'N';

Provided you can change the queries that use the original predicate, you can create a function-based index that is the smallest possible index to assist your query, and you won’t need a histogram to allow the optimizer to get the correct cardinality since there will be just one distinct value in the index (and the underlying column).

It’s possible to go a little further – what if the typical query that started this discussion was really something like:


select  *
from    t1
where   nvl(my_column, 'N') = 'N'
and     other_column = {some value}

If the nvl() predicate always returned a small fraction of the data we might engineer an index to handle the requirement of the entire where clause:


create index t1_i2 on t1(case my_column when 'Y' then null else other_column end);
execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all hidden columns size 254');

In this case you might need a histogram on the hidden column definition supporting the index, which is why I’ve changed my method_opt. We’ve constructed an index that is the smallest possible index that will satisfy the query as precisely as possible while giving the optimizer the best chance of producing accurate cardinality estimates. [Update: until the appearance of Kirill's comment (#1) I forgot to point out that once again you have to change the original SQL from using the nvl() expression to using a case expression that matches the index.]

Footnote

It would be easy to fall into the trap of playing around with this type of tweaking every time you hit a performance problem. Bear in mind, though, that even indexes which are as small and accurately targeted as possible can introduce various types of overhead (contention, cost of gathering stats, instability of histogram collection); so always try to think through the potential costs as well as the benefits of such an approach - is it really worth the effort.

It’s also worth noting that things don’t always work the way you expect.

Finally, of course, you might choose to create “proper” virtual columns in 11g so that you can refer to them by name, rather than handling them indirectly as you have to when they are “undocumented” hidden columns generated by function-based indexes.

 


Indexing Foreign Keys (Helden)

A recent question on an internal forum asked whether an index on a Foreign Key (FK) constraint designed to avoid locking issues associated with deletes on the parent tables needs to precisely match the columns in the FK. Could the columns in the index potentially be a different order or be appended with additional columns ? The answer is […]

Oracle Midlands : Event #3 – Registration Open

Registration is now open for Oracle Midlands Event #3 on Tuesday 20th May..

As I mentioned in a previous postChristian Antognini will be the speaker for both the sessions this time. He’ll be covering “12c Adaptive Query Optimization” and “Row Chaining and Row Migration Internals”.

Red Gate Software have kindly offered to sponsor the event again, so registration is free!

I’ve already registered. :) Please make the effort to come along and support the event!

Cheers

Tim…


Oracle Midlands : Event #3 – Registration Open was first posted on March 31, 2014 at 6:04 pm.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.