Wednesday, April 11, 2012

An Oracle ASM Primer

Oracle Automatic Storage Management (ASM) is a mature product capable of replacing high-cost disk management software for most Oracle applications. In this first post of a series on ASM, we will review the basics.

Automatic Storage Management (ASM) is an all-inclusive approach to storage management, performance, and availability. ASM is an excellent tool for managing mixed storage environments where both SSD and HDD technologies are in use. It is an extent-based volume manager, built into the Oracle stack, that balances file access across disks and disk subsystems to eliminate hotspots and increase storage efficiency. By striping extents across diskgroup members and mirroring them across failgroups, ASM delivers RAID-like performance and protection at the file level. ASM can also be used in conjunction with high-end disk arrays for mixed-storage file management.

ASM runs as a lightweight Oracle instance and offers the stability and reliability of a standard Oracle database. Multiple databases can be clients of a single ASM instance, allowing one pool of disks to be used efficiently by several databases. In addition, ASM can be used in a RAC environment, where ASM failover occurs nearly seamlessly between active nodes, permitting non-stop uptime for storage management.

ASM does not have to replace disk subsystems; it can complement existing arrangements in which specific storage is dedicated to specific files. ASM will not automatically separate log files, archived backups, and database files into distinct storage areas within a diskgroup. ASM templates tune the striping of each file type, but sequential I/O is not segregated from random I/O. It remains the architect's duty to assign storage to specific roles: sequential-access log files should traditionally be kept separate from the random I/O generated by database files. This is not a flaw in ASM but a continuation of the established practice of tiering storage by role and disk speed. ASM is most effective when each diskgroup is built from homogeneous disks.
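
As a minimal sketch of this separation, the Oracle-Managed Files parameters can direct each file type to its own diskgroup. The diskgroup names below (+DATA, +REDO, +FRA) are assumptions for illustration:

-- Sketch: route file types to role-specific diskgroups (names are hypothetical)
ALTER SYSTEM SET db_create_file_dest = '+DATA';
ALTER SYSTEM SET db_create_online_log_dest_1 = '+REDO';
ALTER SYSTEM SET db_recovery_file_dest_size = 500G;
ALTER SYSTEM SET db_recovery_file_dest = '+FRA';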

The use of high-speed SSDs highlights the benefits of ASM. When multiple devices are available, ASM's striping can increase the throughput of the storage linearly, reducing response time for the client. In situations where extremely low latency is needed, such as log writes, the overhead of mirroring with ASM is negligible and the performance promised by a single SSD is still delivered.



Figure 1: Tiered Storage Pyramid

ASM Internals

ASM provides its striping through file extents that default to 1 MB on conventional disk arrays and 4 MB within an Exadata cell. Because striping and mirroring are applied per file rather than per volume, many files can share a single ASM instance while keeping independent levels of redundancy. Log files and database files are often held at normal redundancy, while control files are often kept in triplicate for extra protection. In addition, the default ASM templates specify the striping granularity for each file type: control files and log files use fine-grained 128 KB stripes, while larger, more bandwidth-intensive files default to coarse 1 MB striping. Testing in Oracle's development labs showed 1 MB to be the size that best balances bandwidth, latency, and concurrency of access.
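
These defaults can be inspected directly. A query such as the following, run against the ASM instance, lists each template's redundancy and striping (output will vary by version):

SELECT name, redundancy, stripe
FROM v$asm_template
ORDER BY name;

-- CONTROLFILE typically reports FINE (128 KB) striping; DATAFILE reports COARSE.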

ASM utilizes disks in what may appear to be a very scattered layout. Extents are not laid out sequentially within disks or failgroups: the first extent may land on Disk 1, while the second lands on Disk 3 of the third failgroup. In addition, ASM places extents across the full length of each disk. This evens out use of the disk platters and prevents hotspots within the disks themselves. The only guarantee in extent placement is that the mirrored copy of an extent resides in a different failgroup. All disks are used as evenly as possible so that no single disk carries higher utilization than the others. This is demonstrated in Figure 2.
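
This distribution can be observed through the undocumented fixed table X$KFFXP, which maps file extents to disks. The column names below are as commonly reported for 10g/11g and, like the diskgroup and file numbers used, are assumptions rather than a documented interface:

SELECT disk_kffxp AS disk, COUNT(*) AS extents
FROM x$kffxp
WHERE group_kffxp = 1    -- diskgroup number (hypothetical)
AND number_kffxp = 256   -- ASM file number (hypothetical)
GROUP BY disk_kffxp
ORDER BY disk_kffxp;

An evenly striped file shows a near-identical extent count on every disk.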



Figure 2: ASM Internal Layout

In the layout of Figure 2, each disk contains primary extents, and the backup extents (B) are assigned to disks in a separate failgroup. It is evident that ASM takes full advantage of the disks included. Reads make efficient use of the disks as a fully striped volume, while write performance is equivalent to that of RAID 10, since each extent is written to one disk and mirrored to a disk in another failgroup. Through this behavior, the performance gain is much greater than that of a standard mirrored array while still providing the redundancy necessary for file protection. If failgroups are assigned appropriately, ASM will also continue to serve data through a controller failure.

Because ASM is a lightweight database, it is managed with a limited command set nearly equivalent to that of a standard Oracle database. Tasks such as assigning disks to diskgroups and failure groups can also be performed through GUI tools such as DBCA. The v$ dynamic views within ASM call the underlying system functions directly, eliminating a layer of administration. When the v$asm_disk view is queried with a standard SELECT statement, Oracle automatically initiates a rescan of the SCSI bus to detect any new devices attached to the server. The v$asm_disk_stat and v$asm_diskgroup_stat views return the current rows cached by the ASM instance without the inherent system calls of scanning for disks.
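
For routine monitoring, the *_stat views are therefore the cheaper choice. A minimal example, using columns from the standard view definitions:

SELECT name, path, total_mb, free_mb
FROM v$asm_disk_stat;       -- cached rows; no disk discovery triggered

SELECT name, total_mb, free_mb
FROM v$asm_diskgroup_stat;  -- the same idea at the diskgroup level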

Diskgroups and Failgroups

A diskgroup is a set of disks defined to work in tandem to provide a single volume. Much as a drive letter (X:\) provides access in Windows, a diskgroup exposes an ASM pseudo-filesystem through which its files are referenced. All files located on ASM are referenced in the following form:

+diskgroup_name/path/filename

where the + sign indicates ASM storage and is followed by the diskgroup name.

To navigate ASM-based storage through a filesystem-like interface, Oracle provides the ASMCMD executable. ASMCMD operates as an interactive shell in which you can list, delete, and alias ASM-based files and directories. The DBA may perform the same functions through a SQL terminal, but ASMCMD provides a more direct approach.
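
A brief session illustrates the idea; the diskgroup, database, and file names here are hypothetical:

ASMCMD> ls +DATA/ORCL/DATAFILE
ASMCMD> du +DATA
ASMCMD> mkalias +DATA/ORCL/DATAFILE/SYSTEM.256.784041237 +DATA/system01.dbf

Here ls lists the files in an ASM directory, du reports the space used beneath it, and mkalias creates a friendlier alias for a system-generated filename.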

Diskgroups can be created with three different levels of redundancy: Normal, High, and External. Many expensive disk arrays rely on internal mechanisms for data protection. The External level performs no duplication of data within the ASM diskgroup; it simply brings the volume under the ASM volume manager. This option is commonly used with high-end SANs, whose arrays typically hold cache substitutes for dropped disks or can allocate a replacement from a hot spare almost immediately. Catastrophic failures, however, can only be survived through mirroring.
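
Creating such a diskgroup is straightforward; the diskgroup name and device path below are assumptions for illustration:

CREATE DISKGROUP san_data EXTERNAL REDUNDANCY
DISK '/dev/mapper/san_lun01';  -- the array provides the protection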

Normal redundancy provides a 2x mirror (3x for files whose template requests it), while High redundancy specifies a 3x mirror. This allows ASM itself to provide the redundancy, rather than requiring the purchase of a SAN that guarantees no single point of failure. These mirror levels specify the number of copies each extent will have within a diskgroup, and ASM places the mirrors across the specified failgroups. A failgroup is a subset of a diskgroup that defines a point of failure; it provides consistency across controllers and paths in addition to disks. With Normal redundancy, each mirrored extent is guaranteed to reside in a failgroup separate from the primary extent. With this configuration the primary copy may fail, along with an entire failgroup, and the surviving failgroup will still contain data fully consistent with the original. A 3x mirror will attempt to mirror across three failgroups; if only two failgroups exist, the third copy is placed within one of them, on a separate disk where possible.
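
As a sketch, the following creates a Normal-redundancy diskgroup with one failgroup per controller; the diskgroup name, failgroup names, and device paths are assumptions:

CREATE DISKGROUP data NORMAL REDUNDANCY
FAILGROUP ctlr1 DISK '/dev/sdb1', '/dev/sdc1'
FAILGROUP ctlr2 DISK '/dev/sdd1', '/dev/sde1';

ASM then guarantees that the two copies of every extent land in different failgroups, so the loss of either controller still leaves a complete copy of the data.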

Enterprise SSDs offer protection against data corruption and power loss; uptime through maintenance, however, can only be upheld through mirroring. ASM copies each extent to a location in a different failgroup. For system-level mirroring, a failgroup would be defined as all drives attached to a single system, and Oracle recommends setting failgroups across controllers. The primary exception to this rule is multipathing: a multipathed device presents the same disk through multiple paths, increasing throughput and providing failover without requiring additional space for mirroring.
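
With multipathing, ASM should discover only the multipath pseudo-devices rather than the individual paths. On Linux with device-mapper, the discovery string might look like this (the path pattern is an assumption):

ALTER SYSTEM SET asm_diskstring = '/dev/mapper/asm*';  -- multipath devices only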

In my next blog I will discuss good practices for ASM when used with SSDs.
