Texas Memory Systems' Blog: A Holistic Look at the Oracle CPU Layer

The CPU layer is where all of the processing occurs within a database. Of course now with the introduction of extremely distributed processing as in the Exadata systems, this line is blurring because some of the processing is actually being done at the storage level. In the future this may even blur more as some companies are offering CPUs built into memory chips (8 CPUs on a single gigabyte memory chip).

Insufficient CPU Resources

I have heard it said that tuning usually pushes the log jam from one part of the information stream to another; I have found this to be true. Once we tune the code, fix the IO subsystem, and fix the memory issues we may have with our system, we usually end up with higher CPU utilization. Higher CPU utilization in itself is not a bad thing, unless it leads to over utilization or overloading of the system CPUs.

When examining an AWR or Statspack report, I look at the balance between Busy, Idle, and IO Wait on the CPUs in the system before making a recommendation as to how much an SSD will improve performance. For example, if the percent busy is 30, the percent idle 20, and the IO wait percent 50, then even if we reduce the IO wait to less than 5% the CPUs have only 20 percent idle time in which to use what is returned, so we can still only reclaim 20 percent of performance losses. To get the maximum bang for the performance buck from any IO subsystem improvement beyond that 20 percent, we would need to either tune to reduce CPU usage, add more CPUs, or replace the CPUs with ones that can sustain a higher level of operations. Figure 1 shows an Operating System Statistics section of an AWR report.
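The arithmetic in that example can be written as a small sketch. It assumes the simple model described above, where IO wait handed back to the CPUs is only useful up to the idle percentage available; the function name and parameters are hypothetical and not taken from any Oracle tool.

```python
def reclaimable_cpu_percent(idle_pct, io_wait_pct, io_wait_after_pct):
    """Estimate how much returned IO wait the CPUs can actually absorb.

    All arguments are percentages of total CPU time. Assumption: returned
    IO wait can only be used up to the idle percentage that is available.
    """
    returned = io_wait_pct - io_wait_after_pct
    if returned < 0:
        raise ValueError("io_wait_after_pct must not exceed io_wait_pct")
    usable = min(returned, idle_pct)
    return returned, usable


# The example from the text: 30% busy, 20% idle, 50% IO wait cut to 5%.
returned, usable = reclaimable_cpu_percent(
    idle_pct=20, io_wait_pct=50, io_wait_after_pct=5
)
print(f"IO wait returned: {returned}%, usable with current CPUs: {usable}%")
```

With these inputs the IO improvement hands back 45 percent, but only 20 percent can be used until CPU usage is tuned down or CPU capacity is added.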

Figure 1: Operating System Statistics

However, even Figure 1 may not tell the whole truth about CPU utilization or the number of CPU cycles that could be regained by reclaiming IO waits. You should always verify the IO wait time shown there against the User IO Wait time statistic in the Server Statistics area. In many cases that value will be much larger than the one reported from the Operating System statistics. In Figure 1, if a RamSan SSD reduced the value of the IO wait statistic by a factor of 9, it would return 98,000 ticks to the CPU for processing. Since the CPU has 20,819,000 idle ticks available to absorb them, this would at least double the performance.

However, if in returning IO wait time to the CPU you exceed the amount of idle time available, then you won’t get the full benefit from adding in a low latency IO solution such as a RamSan SSD.

Other sections of the report also show how heavily the CPU is utilized in your system. Figure 2 shows some of the additional statistics available.

Figure 2: CPU related areas of AWR

From Figure 2 we can also see that there is plenty of CPU available, with a 99.8% idle figure and only 0.2% being utilized by this instance. The instance spent 78.6% of that 0.2% doing actual work.

Review of the SQL area of the reports will also show which SQL is consuming the most CPU cycles. When tuning to decrease CPU utilization, pay the most attention to the SQL that consumes the most CPU cycles. Figure 3 shows an excerpt from the section of an AWR report that lists the SQL using the most CPU.

Figure 3: SQL using most CPU AWR Section

When tuning SQL, look at removing excessive sorts and use bind variables to eliminate recursive SQL. In PL/SQL, avoid excessive looping and don't do processing inside the loop that can be done once outside the loop. In the best case, post-10g, if the PL/SQL optimizer realizes that a bit of code is deterministic, such as a function applied to a constant, it will only be executed once. If you are doing anything more complex, however, it will add more cycles each time through the loop. In Figure 3 we can see that the first statement is actually a PL/SQL routine; you could start by seeing whether that PL/SQL has excessive looping.
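The point about loops can be illustrated with a small sketch. It is written in Python rather than PL/SQL, purely as an analogy, and the function names and rates are made up for the example: a deterministic call that gives the same answer on every pass is moved out of the loop so it runs once.

```python
def lookup_rate(region, counter):
    """Stand-in for an expensive, deterministic call (hypothetical)."""
    counter["calls"] += 1
    return {"A": 0.10, "B": 0.20}[region]


def totals_inside_loop(amounts, region, counter):
    result = []
    for amount in amounts:
        rate = lookup_rate(region, counter)  # same answer on every pass
        result.append(amount * (1 + rate))
    return result


def totals_hoisted(amounts, region, counter):
    rate = lookup_rate(region, counter)  # done once, before the loop
    return [amount * (1 + rate) for amount in amounts]


amounts = [100.0, 250.0, 40.0]
slow_counter = {"calls": 0}
fast_counter = {"calls": 0}
slow = totals_inside_loop(amounts, "A", slow_counter)
fast = totals_hoisted(amounts, "A", fast_counter)
print(slow == fast, slow_counter["calls"], fast_counter["calls"])
```

Both versions return the same totals, but the first makes one call per row while the second makes a single call no matter how many rows are processed.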

Memory Thrashing

With automatic memory management (AMM), if you have improper settings for your SGA_MAX_SIZE and SGA_TARGET (pre-11) and MEMORY_TARGET and MEMORY_MAX_TARGET (post-11), then a condition called thrashing can occur. Thrashing is when the AMM processes allocate and deallocate memory continuously. For example, if the DB cache size is too small, then the AMM will take memory from the other sections, usually the shared pool, to supplement the DB cache. This involves a lot of mutex and latching operations. It also removes possibly reusable SQL and PL/SQL from the SQL and PL/SQL caches.

Now, what happens when the reverse occurs and the shared pool needs more space? It takes the memory back from the cache, possibly causing IO activity, lots of latch operations, and other CPU-stealing actions. Then, once the shared pool gets the memory, it probably reloads SQL that it had to offload before and has to redo all of the recursive SQL needed to parse it, again adding CPU cycles from latches, mutexes, and parses of recursive SQL. Remember, it has been true since the shared pool was introduced that a shared pool miss is many times more expensive CPU-wise than a cache miss. Figure 4 shows an example of thrashing in the memory area of Oracle.

Figure 4: Example of Memory Thrashing

Notice how all of the actions in Figure 4 are DEFERRED. This is because they involve waits for latches, mutexes, recursive SQL, and a number of CPU-cycle-stealing operations that must occur before they can be completed.
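One rough way to spot this back-and-forth pattern is to count how often each memory component reverses direction between growing and shrinking. The sketch below assumes the resize operations have already been pulled into a simple list of (component, operation) pairs in time order; that record layout and the sample rows are hypothetical and are not the AWR report format or the contents of Figure 4.

```python
from collections import defaultdict


def count_reversals(resize_ops):
    """Count GROW/SHRINK direction changes per component.

    resize_ops: iterable of (component, operation) pairs in time order,
    where operation is "GROW" or "SHRINK" (assumed layout).
    """
    last = {}
    reversals = defaultdict(int)
    for component, operation in resize_ops:
        if component in last and last[component] != operation:
            reversals[component] += 1
        last[component] = operation
    return dict(reversals)


# Made-up sample: the cache and the shared pool trading memory repeatedly.
sample_ops = [
    ("DEFAULT buffer cache", "GROW"),
    ("shared pool", "SHRINK"),
    ("DEFAULT buffer cache", "SHRINK"),
    ("shared pool", "GROW"),
    ("DEFAULT buffer cache", "GROW"),
    ("shared pool", "SHRINK"),
]
print(count_reversals(sample_ops))
```

A component that reverses direction many times within a short reporting interval is a candidate for a higher floor value.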

The way to mitigate thrashing is first to have properly tuned code. Next, properly size the DB cache, shared pool, large pool, Java pool, and streams pool using the base parameters. Then provide proper wiggle room by setting the MAX parameters (SGA_MAX_SIZE, MEMORY_MAX_TARGET) 10-20% higher than the resulting TARGET settings (SGA_TARGET, MEMORY_TARGET). I see the most thrashing when the SGA or MEMORY MAX parameters are too small and the TARGET parameters are set equal to them with no base parameters set.

By setting the base parameters (DB_CACHE_SIZE, SHARED_POOL_SIZE, etc.) you establish a floor value below which the AMM cannot go for that parameter. Set them at a point where you know you get reasonable performance.
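The sizing rule can be sketched as a quick calculation. This is only an illustration under stated assumptions: the helper name, the megabyte figures, and the 15% headroom are made up for the example, and the right floors for a real system come from observing where it performs reasonably.

```python
def plan_memory_settings(floors_mb, target_mb, headroom=0.15):
    """Derive a MAX value from a TARGET and check that the floors fit.

    floors_mb: dict of base parameter name -> floor size in MB
    target_mb: the TARGET setting (SGA_TARGET or MEMORY_TARGET) in MB
    headroom:  fraction added on top of TARGET for the MAX setting
    """
    if not 0.10 <= headroom <= 0.20:
        raise ValueError("headroom should be between 0.10 and 0.20")
    floor_total = sum(floors_mb.values())
    if floor_total > target_mb:
        raise ValueError("floors exceed the TARGET setting")
    return {
        "target_mb": target_mb,
        "max_mb": round(target_mb * (1 + headroom)),
        "free_for_amm_mb": target_mb - floor_total,
    }


# Hypothetical floor values, in MB.
floors = {"DB_CACHE_SIZE": 4096, "SHARED_POOL_SIZE": 1024, "LARGE_POOL_SIZE": 256}
plan = plan_memory_settings(floors, target_mb=8192, headroom=0.15)
print(plan)
```

The "free_for_amm_mb" figure is the memory the automatic management can still move between components without pushing any of them below its floor.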

Proper SQL and PL/SQL tuning will do wonders to reduce CPU usage. Optimizing memory settings to eliminate thrashing will also help.

Monday, April 9, 2012
