Thursday, June 21, 2012

Can You Take a Hint?


Hints in Oracle have been around since version 8. Generally speaking, hints are like compiler directives: they tell the optimizer what path to take when it builds the execution plan for a SQL statement. However, Oracle will silently ignore a hint if it cannot be applied or if it is formatted incorrectly.
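
As a simple illustration, a hint rides inside a comment placed immediately after the SQL verb. The table and index names here are hypothetical:

SELECT /*+ INDEX(e emp_dept_idx) */ e.empno, e.ename
  FROM emp e
 WHERE e.deptno = 10;

If emp_dept_idx does not exist, or the hint is misspelled, the statement still runs; the hint is simply ignored.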

Most SQL tuning products will make use of hints if statement rearrangement doesn't solve the problem. In later versions of Oracle, outlines became available. In many applications the SQL is considered source code and the end user is, of course, not allowed to modify it; if you cannot place a hint into the code, hints cannot be used for those applications. Outlines allowed hints to be tied to a SQL statement's signature and applied at run time: an execution plan containing the hints was created and stored, then used whenever the Oracle kernel recognized SQL with the matching signature. Outlines were the first stage of execution plan stabilization (ensuring that for a specific SQL statement the execution plan stayed the same, even as the underlying table and index statistics changed).
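
A minimal sketch of a stored outline (the statement and outline name are hypothetical); creating one requires the CREATE ANY OUTLINE privilege, and stored outlines must be enabled before the kernel will use them:

CREATE OR REPLACE OUTLINE exam_outline
ON SELECT exam_pk, exam_name FROM exam WHERE exam_pk = 100;

ALTER SESSION SET use_stored_outlines = TRUE;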

Now, in 10g and beyond, Oracle provides not only hints and outlines but also a newer execution plan stabilization mechanism called profiles. Profiles alter the costs of various optimizer branches in order to arrive at a better plan for a specific piece of SQL code.
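
Profiles are normally generated by the SQL Tuning Advisor and then accepted by the DBA. A minimal sketch, assuming a tuning task named my_tuning_task has already been run against the problem statement:

BEGIN
  DBMS_SQLTUNE.ACCEPT_SQL_PROFILE(
    task_name => 'my_tuning_task',  -- hypothetical tuning task
    name      => 'exam_profile');   -- name given to the accepted profile
END;
/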

Generally speaking, adding hints was considered the last thing you did, after you had tried and failed to get the optimizer to produce a good plan through indexes, statement rewrites or initialization parameter adjustments. Why? First, hints create documentation issues if you don't record where they were placed in the application code and why. Hints can also result in excessively stable plans that won't change even when they need to, which is not a good situation to be in. Finally, hints are changed, dropped and deprecated with each release of Oracle, so with every patch or upgrade you need to re-verify each hinted statement to be sure Oracle didn't send along a little surprise.

In my opinion it is good practice to use hints only when no other method of SQL statement optimization can be applied. The same goes for outlines and profiles when you don't have access to change the source code.

Monday, June 18, 2012

RamSan the Green Solution


In a recent research project I compared the energy consumption of SSD based technology to IOPS-equivalent, disk based, fibre attached SAN systems. As you can well imagine, the SSD technology was much more space efficient, required less cooling and less electricity, and provided faster data access.

But just how much can be saved? In comparisons to state of the art disk based systems (sorry, I can’t mention the company we compared to) at 25K IOPS, 50K IOPS and 100K IOPS with redundancy, SSD based technology saved from a low of $27K per year at 2 terabytes of storage and 25K IOPS to a high of $120K per year at 100K IOPS and 6 terabytes of storage using basic electrical and cooling estimation methods. Using methods documented in an APC whitepaper the cost savings varied from $24K/yr to $72K/yr for the same range. The electrical cost utilized was 9.67 cents per kilowatt hour (average commercial rate across the USA for the last 12 months) and cooling costs were calculated at twice the electrical costs based on data from standard HVAC cost sources. It was also assumed that the disks were in their own enclosures separate from the servers while the SSD could be placed into the same racks as the servers. For rack space calculations it was assumed 34U of a 42U rack was available for the SSD and its required support equipment leaving 8U for the servers/blades.
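
To make the estimation method concrete, here is an illustrative calculation for a storage system drawing 1 kW (an assumed figure, not one from the study):

1 kW x 8,760 hr/yr = 8,760 kWh/yr
8,760 kWh/yr x $0.0967/kWh = roughly $847/yr for electricity
Cooling at twice the electrical cost = roughly $1,694/yr
Total = roughly $2,540 per year for each kilowatt of continuous draw

Every kilowatt a storage system does not draw is therefore worth about $2,500 a year under these assumptions, before rack space is even considered.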

Even figuring in the initial cost difference, the SSD technology paid for itself before the first year was over in all of the IOPS and terabyte ranges calculated. In fact, based on values from the Storage Performance Council website and the tpc.org website for a typically configured SAN from the manufacturer used in the study, even the purchase cost of the SSD was lower for most configurations in the 25K-100K IOPS range.

Obviously, from a green technology standpoint SSD technology (specifically the RamSan-810) provides directly measurable benefits. When the direct electrical, space and cooling cost savings are combined with the performance benefits, the decision to purchase SSD technology should be a no-brainer.

Friday, June 15, 2012

Best Practices for Utilizing SSDs in HA Configurations


Introduction

SSD technology (the use of Flash or DDR chips versus hard disks to store data) is the disruptive technology of this decade. By the end of 2020 virtually all data centers will place their most critical data on SSDs. Having no moving parts (other than switches and cooling fans), SSD technology promises to be more reliable and resilient than the disk drive technology it replaces. Disks, with their motors, armatures, head assemblies and other mechanical components, are prone to breakdown and are inherently less reliable than solid state devices. However, unless proper practices are followed, single points of failure can still cause shutdowns and possible data loss on SSD based systems. This paper provides best practices to mitigate the possibility of data loss and downtime due to a single point of failure in an SSD based SAN architecture.

The Problem

Given the ultra-reliable nature of SSD devices, many users are not paying attention to best practices for data center redundancy when the architecture utilizes SSDs. This paper is not meant to address single, form factor SSD installations, but rather the large scale (greater than 1-2 terabytes) installation of SSD systems in a rack mount configuration, such as is seen with the TMS RamSan-500, 620 and 630 systems and, to a lesser extent, with the RamSan-300/400 series. Note that these best practices also apply to other manufacturers that provide rack mount SSD based systems or enclosures.

SSD Reliability

Most properly designed SSD based rack mount systems are very reliable and resilient. They employ many technologies to ensure that a single point of failure at the card or device level is not fatal. For example, RamSan systems utilize RAID5 across the chips on a single component card, as well as ECC and ChipKill technologies that automatically correct chip or cell based errors or remap them to good sectors, much as hard disk technology has done for years. In addition, many SSDs offer hot spare cards, so that should a single flash card fail, another is remapped with its data. However, in any single component device there are usually single points of failure which must be addressed.

Best Practices

  1. The entire storage system substructure must be redundant
    1. Dual, multi-ported HBAs, FC or IB
    2. Use of auto-failover MPIO software in the host
    3. Use of multiple cross-connected FC or IB switches
    4. Multiple cross linked FC or IB cards in the SSD
  2. The SSD must provide dual power supplies with auto-failover and have hot swap capability
  3. The cooling fans in the SSD units must be redundant and hot swappable
  4. If full reliability and redundancy is required, and a dual backplane is not provided internally, then at least 2-SSD units must be utilized in an external-mirrored (RAID1) configuration
  5. The rack power must come from at least two independent sources, of which at least one must be an uninterruptible power supply (UPS).
  6. Any disk based systems used in concert with the SSD based systems must share the same level of redundancy and reliability.
  7. For disaster recovery, a complete copy of the data must be sent to an offsite facility that is not on the same power grid or network branch as the main data center. The offsite data center must provide at least the minimum accepted performance level needed to meet service level agreements.

Reasoning

If any part of the storage system is single threaded, then that part becomes a single point of failure, negating the protections built into the other sections. For example, while all SSDs are redundant and self-protecting at the card or component level, those cards or components usually plug into a single backplane, and that backplane is a single point of failure. Designing with dual backplanes would drive system cost higher and would require complete re-engineering of existing systems, so it is easier to purchase duplicate components and provide true redundancy at the component level.

If the system is fully redundant but the power supplies are fed from a single line source or not fed using a UPS then the power supply to the systems becomes a single point of failure.

When additional storage is provided by hard disk based systems, if they are not held to the same level of redundancy as the SSD portion of the system, then they become a single point of failure.

Disaster recoverability in the case of main data center loss requires that the systems at the offsite data center be at least minimally redundant, but the data must be fully redundant. To this end, technologies such as block level copy or a standby system such as Oracle's Data Guard should be used. To provide performance at the needed levels, an additional set of SSD assets should be provided at the offsite data center. Another possibility with Oracle11g is a geo-remote RAC installation using the preferred-read capability available in Oracle11g ASM.
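
As a minimal sketch of best practice 4 combined with the preferred-read idea (disk paths, diskgroup and failure group names are hypothetical), Oracle ASM can mirror two SSD units and let each site read from its local copy:

-- Mirror the two SSD units by placing each in its own failure group
CREATE DISKGROUP ssd_data NORMAL REDUNDANCY
  FAILGROUP site1 DISK '/dev/mapper/ramsan1_*'
  FAILGROUP site2 DISK '/dev/mapper/ramsan2_*';

-- On the site-1 ASM instance (Oracle11g), prefer reads from the local unit
ALTER SYSTEM SET asm_preferred_read_failure_groups = 'SSD_DATA.SITE1';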

Are TMS RamSans Fully Redundant?

All TMS RamSans (except the internal PCIe RamSan-70) can be purchased compliant with best practices 1-3. Due to the single backplane design of these RamSans, two units must be used in a mirrored configuration to assure complete redundancy and reliability (best practice 4). Best practices 5-7 depend on things external to the RamSans. The newly released RamSan-720 and 820 designs are fully internally redundant and can be used for HA as standalone appliances.

Summary

A system utilizing rack mount SSD technology can be made fully redundant by externally mirroring two or more SSD units. Unless the internally redundant RamSan-720 or RamSan-820 is used, a single SSD unit leaves a single point of failure in its backplane that could result in downtime and/or loss of data.

Monday, June 11, 2012

The Myth of Database Independence


Introduction

It happens more often than we care to think about. A new project starts, much time and energy is spent selecting the right database system, hardware is bought, specifications are developed, and then the bomb from on high goes off: "This must be database independent." Come again?

The Grail of Database Independence

Database independent applications hold great lure for management: the thought of developing a single application and then, at will, being able to swap out the database underneath it all, say, from expensive Oracle to cheap MySQL.

Another argument for moving the logic from the database tier to the application is that the company then has greater control over the application and isn't dependent on the database vendor. Isn't this rather like buying a Maserati but replacing the steering with ropes and pulleys and the brakes with a pull-board nailed to the side of the driver's door so you aren't dependent on the manufacturer? And while you are at it, replace that high-tech fuel injection system with a single barrel venturi carburetor so anyone can maintain it, jerk out the electronic ignition (we want points and plugs), and remove the computer and sensors since we really don't want to be bothered with them.

Unfortunately, like the quest for the Holy Grail of old, the quest for a truly database independent application is almost always doomed to fail unless one is pure of code and deed. Why do I say this? Because you must code to the least common denominators in order to achieve database independence.

What Are Least Common Denominators?

In math, the number being described here is really the greatest common divisor: the largest number that evenly divides a group of numbers (for {6, 27, 93} it is 3), though in everyday usage we call the idea the least common denominator (LCD). But what does this principle mean for database independence? Essentially the LCD for a group of databases is the largest set of features common to all of the database systems. For example, SQL Server, Oracle and MySQL all have mechanisms to populate a numeric key, but SQL Server and MySQL use an auto increment feature while Oracle uses a sequence generator, so there is no common mechanism among the three for populating a number based key. On the other hand, all three use ANSI standard SQL statements, so you could use SQL with all of them, as long as you limited yourself to pure ANSI standard SQL.
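
A small illustration of the key-population point (table and column names are hypothetical):

-- Oracle: the key comes from a separate sequence object
CREATE SEQUENCE exam_seq;
INSERT INTO exam (exam_pk, exam_name) VALUES (exam_seq.NEXTVAL, 'Midterm');

-- MySQL: the column populates itself (SQL Server uses IDENTITY instead)
CREATE TABLE exam (
  exam_pk   INT AUTO_INCREMENT PRIMARY KEY,
  exam_name VARCHAR(50)
);
INSERT INTO exam (exam_name) VALUES ('Midterm');

Neither form runs unchanged on all three databases, so a database independent application cannot rely on either one.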

So What is the Big Deal?

Database independence means several things:
  1. Move application logic into the middle or upper tiers
  2. Use no database specific feature
  3. Use no database specific language

Let’s look at these three items for a start and see why they may be issues in many environments.

Move Application Logic into the Middle Tier

PL/SQL, T-SQL and other database languages were developed to provide application-logic friendly coding at the lowest level: the database itself. Oracle, Microsoft and other database vendors have spent many dollars and years of effort making these languages as internally efficient as possible so they can provide the performance needed. Now we are told we must move the logic to the middle tier or into the application itself, mainly because coders are too busy (lazy) to learn PL/SQL, T-SQL or another database specific language. What does this mean exactly?

  1. No triggers – triggers work off of internal PL/SQL or T-SQL logic
  2. No replication – since it can’t be done independent of database code
  3. No referential integrity – again, not all databases support this logic
  4. No packages, procedures, Java in the database – Not database independent if code is run internally
  5. No advanced queuing since it is dependent on database specific routines
  6. No virtual private databases – again this is dependent on internal codes and procedures
  7. No types, objects, methods, materialized views, summaries, dimensions, etc. – not all databases support these constructs

All of the code that would be internalized or handled internally by the database in items 1-7 now must be written in external routines and moved to the middle tier or to the application itself. The coders must recreate the wheels of database development for each application.
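
As one hedged illustration of item 3 (table and constraint names are hypothetical), a single declarative statement in the database replaces a surprising amount of application code:

-- Declared once in the database, enforced for every application,
-- every code path and every concurrent session automatically
ALTER TABLE exam ADD CONSTRAINT exam_student_fk
  FOREIGN KEY (student_id) REFERENCES student (student_id);

Done in the middle tier instead, every insert and update path has to look up the parent row, handle the case where it disappears between the check and the insert, and repeat that logic consistently across the entire code base.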

Use no Database Specific Feature

For some other database systems this isn't such a big deal; Oracle's feature set, however, is robust, and you have to throw all of those features away if you buy into the database independent movement. What will you give up? Here is just a partial list:

  1. Bitmap Indexes
  2. Bitmap join indexes
  3. Function based indexes
  4. Advanced text based indexing
  5. Reverse key indexes
  6. Hash, list, composite partitions
  7. PL/SQL triggers, functions, procedures and packages
  8. Types, methods, etc.
  9. Materialized views, summaries, dimensions
  10. Java in the database
  11. XML in the database (at least XML parsing)
  12. Advanced security features (password checking, aging, logon restrictions, etc)
  13. Real Application Clusters
  14. Oracle standby features

Of course, sometimes the restriction isn't this strict and anything structural in nature may be allowed as long as it doesn't affect the code base. However, most of the items above, if not available, would require additional coding, so in practice it doesn't change the list much.

Use No Database Specific Language

This is the restriction that probably causes the most problems: using no database specific language. In Oracle the database specific language is PL/SQL, and in most cases the restriction also covers Oracle's extensions to ANSI standard SQL. PL/SQL is the heart of Oracle: the DBMS packages are written in PL/SQL, triggers are written in PL/SQL, and functions, procedures and methods are written in PL/SQL, so you would not be able to use any of them.

If the rule against using a database specific language or extension extends to triggers, referential integrity and other internals, then you end up throwing away many performance enhancements. Let's look at an example: PL/SQL developers will recognize the construct in Figure 1 as a key-populating trigger.

CREATE OR REPLACE TRIGGER exam_pkey
BEFORE INSERT ON exam FOR EACH ROW
BEGIN
  -- populate the primary key from the sequence on every insert
  SELECT exam_seq.NEXTVAL
    INTO :NEW.exam_pk
    FROM DUAL;
END;
/

Figure 1: Example Simple PL/SQL Trigger

Now, what would the pseudo-code for an external program to do the same thing as this simple trigger look like? Let’s assume we use a central sequence table that stores the owner, table name, column name and last value used.

1. Send Select to database to get ID for row containing our owner, table, column name and last value
2. Implicitly lock row
3. Add 1 to the last value
4. Insert row with new value into table
5. Update row in sequence table
6. Release lock on sequence table
Figure 2: Pseudo-code for External Sequence Routine
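
A minimal SQL sketch of what those steps look like from the client side (the SEQ_VALUES table and all names are hypothetical):

-- Round trip 1: read and lock the sequence row
SELECT last_value FROM seq_values
 WHERE table_name = 'EXAM' AND column_name = 'EXAM_PK'
   FOR UPDATE;

-- Round trip 2: insert the new row using last_value + 1 computed in the application
INSERT INTO exam (exam_pk, exam_name) VALUES (:new_value, 'Midterm');

-- Round trip 3: record the new high value
UPDATE seq_values SET last_value = :new_value
 WHERE table_name = 'EXAM' AND column_name = 'EXAM_PK';

-- Round trip 4: commit, which finally releases the lock on the sequence row
COMMIT;

Until that final commit, every other session that needs a key for EXAM queues up behind the lock on the SEQ_VALUES row.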

Most of these steps (the select, the lock, the insert, the update and the lock release) each require at least one round trip to the database. In addition, while the sequence table row is locked, no one else can add rows to the base table. So rather than sending a single insert to the database and letting it handle the nitty-gritty, we have to send a select, a lock, an insert, an update and a lock release, each requiring SQL*Net round trips. This pseudo-code is what we had to do back in early versions of Oracle before we had sequences and triggers! Database independent coding is actually a giant step backwards. We are talking at least ten network round trips to do what Oracle does in two, not to mention all the recursive SQL, since we really can't use bind variables the same way in each database.

Of course we also lose the efficiencies of cached sequences, the locking issues will slow down high transaction database applications and the additional network traffic will clog the networks.

This is just a simple example of what truly database independent coding entails. Imagine more complex PL/SQL functionality such as Streams, Advanced Queuing or DBMS_LOCK having to be recoded in C, C#, Java or some other language.

The Mystery

The real mystery here is why, if the application sponsors or developers really wanted database independence, they would choose Oracle, SQL Server, DB2 or Informix for their application in the first place. Of course, perhaps the database is the corporate standard and is being forced upon them.

Refusing to use the best tools to get the job done, especially when they don't cost you anything extra, is rather dense. Building a database independent application on top of Oracle is akin to ignoring a pile of precut, premeasured, prebuilt walls and roof sections and chopping down trees instead to build a home. The end point will still be a home (how livable depends on the skill of the builders), but starting with trees you will take longer, do many more cuts, and probably won't end up with as well built a house.

The Solution

It is almost a certainty that building a database independent application will take longer, be more complex and perform worse than one that uses the strengths of its underlying database. The number of times a major application has actually been moved within a company from one database to another, especially from a more capable to a less capable database, is relatively small, and the result is almost never as good as it seemed it would be.

A majority of the issues used to drive applications toward database independence are usually the result of:
  1. Buying too much database to begin with
  2. Hiring inexperienced developers
  3. Developers unwilling to learn database specific skills
  4. Being unable or unwilling to get needed skills
  5. Throne decisions

It is almost always cheaper to hire developers who understand your database (Oracle or something else) in depth, or to train your existing developers on its features and capabilities, than to reinvent the wheels already built into the database system you own. In most cases the internalized database feature will perform better than anything you develop to run externally, across the network, in a separate tier.

The Issues with Database Independent Applications

The major issues with database independent applications are:
  1. True portability is almost impossible
  2. It is nearly impossible to achieve acceptable scalability
  3. It is nearly impossible to achieve acceptable performance
  4. Maintenance
  5. Retention of skill sets and knowledge

Achieving True Portability

In order to achieve full portability of code, all database and application logic must be contained within the application itself and absolutely no database specific feature can be used. Face it, this just can't be done except for the most basic of applications. Even cross-database vendors have database specific routines that interface to the underlying databases, as well as database specific installation scripts for schemas and structures.

Scalability

As soon as you say "network round trip" you hurt scalability: the more round trips an application requires, the less it will scale. More calls to the database mean more locks, latches and recursive SQL, all of which reduce scalability. The more round trips you have in a transaction, the less the server and database determine your scalability; the dependence moves to the network, which now becomes the Achilles heel of your system.

Performance

When code is contained in the database, the database does the work internally, and the code is often further optimized as it is compiled by the internal compilers and interpreters; examples are the internalization and optimization of referential integrity checks in Oracle and the native compilation of PL/SQL in Oracle11g. Internal code is also cached after the first execution, having already been parsed and optimized, for subsequent executions. When code calls Oracle externally, Oracle has to do soft or hard parses depending on how well the code is written, and it cannot use its optimized internal routines for common functions such as sequences and referential integrity. The result is more latching, locking and recursive SQL, generating more load on the database server and more load on the application tier servers as well.

Even a perfectly written database independent application cannot outperform a perfectly written database specific application. Database independence is obtained at the cost of efficiency and performance.

Maintenance

Writing the complete application without taking advantage of the database's feature set means rewriting a lot of code that already exists. The code in a database independent application must be maintained by the group that wrote it. Unfortunately, much of today's code base is written using offshore resources, which means the in-house staff, if present at all, must maintain code they didn't even write! Worse yet, your application may depend on support once or twice removed (India is now offshoring to China).

When database specific features, languages and capabilities are utilized, the maintenance burden on your staff is greatly reduced: maintenance, bug fixes and other worries now fall on the database vendor rather than on your team.

Retention of Knowledge and Skill Sets

Even if the application is fully developed in-house, can you guarantee that five years down the road you will have the expertise to maintain it? Think about how much companies were paying COBOL developers to come out of retirement to fix Y2K problems. There will always be PL/SQL and Oracle experts; can you say as much for the wonderful new language du jour you might be developing in right now?

People move on, get promoted, get fired; skill sets atrophy or become obsolete. That great deal on developing your application offshore in Y+ might seem a bargain today, but what about five years down the road when Z# is the skill set on offer and there are no Y+ developers to be found? Take the long look ahead instead of just to the next quarter.

Summary

Database independence is a myth; it doesn't exist. Database independence will always mean poorer scalability and poorer performance than an application that takes advantage of the strengths of its underlying database. Independent code increases maintenance, decreases reliability and increases dependence on your development teams (in-house or offshore). Just say no to database independence!

Wednesday, June 6, 2012

I Come Not to Praise Disks, but to Bury Them


Disk based systems have gone from 5,000 RPM and 30 megabytes or less to 15K RPM and terabytes in size in the last 30 years. The first Winchester technology drive I came in contact with in the early 1980s had a 90 megabyte capacity (nine times the capacity of the 12 inch platters I was used to) and was rack mounted, since it weighed over 400 pounds! Now we have 3 terabyte drives in a 3-1/2 inch form factor. However, as information density increased, the bandwidth of information transfer at the hard drive level didn't keep up. Most modern disk drives can only accomplish two to three times the transfer rate of their early predecessors. Why is this?
The speed at which a disk drive can access information is limited by 3 things:

  1. The number of independent heads
  2. The speed of the disk actuator mechanism
  3. The speed of the disk's rotation

While most disks have multiple platters and multiple read/write heads, the heads are all mounted on a single actuator mechanism. Mounting the heads on a single actuator may increase the amount of data that can be read or written at the same time, but it does not increase the maximum random IOPS. Because of these physical limitations, most hard drives can only deliver 2-5 millisecond latency and 200-300 random IOPS.


Figure 1: HDD Armature and Disks

Modern SAN and NAS systems are capable of delivering hundreds of thousands of IOPS; however, you must provide enough disk spindles in the array to allow this. To get 100,000 IOPS, assuming 300 IOPS per disk, you would need at least 334 disk drives, and more if you want to serve that 100,000 IOPS to multiple servers and users. In one EMC test, 496 drives were needed to reach 100,000 IOPS. Of course, the latency would still be 2-5 milliseconds or greater. The only way to push latency down toward the 2 millisecond level is what is known as short-stroking.

Short-stroking means using only the outer, faster edges of the disk, usually less than 30% of the total disk capacity. Since only about a third of each drive is usable, those 496 disks for 100,000 IOPS at 5 milliseconds just became 1,488 or more to deliver 100,000 IOPS at 2 millisecond latency.

Disks have fallen dramatically in cost per gigabyte; however, their cost per IOPS has remained the same or risen. The major cost factors in disk construction are the motor/actuator and the technology needed to create the high density platters. This means that as disk technology ages, without major enhancements, prices will eventually stall at around ten times the cost of the raw materials needed to manufacture the drives.


Figure 2: Disk Cost Per GB RAW

So where does all this leave us? SSD technology is the new kid on the block (well, actually SSDs have been around since the first memory chips), and costs have fallen to the point where Flash based SSD storage is nearly on par with enterprise disk, with SSD cost per gigabyte falling below $40/GB. A recent presentation by an EMC representative quoted $17K/TB of storage.

Flash based SSD technology using SLC chips provides a reliable, permanent and relatively inexpensive storage alternative to traditional hard drives. Since an SSD doesn't require its own motor, actuator and spinning platters, prices will continue to fall as manufacturing technology and supply and demand allow. Current prices for a fully loaded 24 TB SSD using eMLC technology sit at around $12K/TB, less than for an enterprise level HDD based SAN. This is closing the gap between HDD and SSD usage modes, as shown in Figure 3.


Figure 3: Usage Mode Changes for HDD and SSD

In addition to purchase price, operational costs (electricity, cooling) for SSD technology are lower than for hard disk based technology. Combine that with the smaller footprint per usable capacity and you have a combination that sounds the death knell for disk at the high performance end of the storage spectrum. Now that SSDs are less expensive than enterprise level disks in dollars (or euros) per terabyte, and a single 1U chassis full of SSDs can replace over 1,000 disks, it doesn't take a genius to see that SSDs will be taking over storage.


Figure 4: Comparison of IOPS verses Latency

An SSD's full capacity can be used for storage; there is no need to "short-stroke" it for performance. This means that rather than buying 100 terabytes to get 33 usable terabytes, you simply buy 33 terabytes. SSDs also deliver this capacity with IOPS numbers of 200,000 or greater and latencies of less than 0.5 milliseconds. With the various SSD options available, the needed I/O characteristics for any system can be met, as shown in Figure 5.


Figure 5: Response and IOPS for Various SSD Solutions

These facts are leading to tiered storage solutions with high performance SSDs at the peak, tier zero, and disks at the bottom as archival devices or storage for non-critical data. Rest in peace, disk drives.