Wednesday, July 21, 2010

Every Cloud Has a Silver Lining

By Mike Ault

One of the big buzz words today is cloud. Server Cloud, Memory Cloud, Storage Cloud, Public Cloud, Private Cloud, clouds ad nauseam; we hear of a new “cloud” implementation almost daily. But what exactly is a cloud in a computing context?

A cloud is a way to present a particular computer resource such that the resource appears to be infinite to the user. For example, company X launches a new website and expects to use 10 servers and 1 terabyte of storage with 100 Mb/s of bandwidth. Instead, they find they need 100 servers, 10 terabytes and 1,000 Mb/s due to the unprecedented demand for their cell phone antenna amplifier. In the not-so-long-ago days this could have been a disaster: by the time they had ordered new servers, added storage and obtained more bandwidth, weeks had passed and the demand was gone, killed off by the hurried release of the next generation of phone. Enter the era of the cloud: as the monitoring staff notices the huge leaps in access and resource requirements, they notify their cloud provider, and within a few minutes (not days or weeks) new servers, storage and bandwidth are magically added to their application, keeping things running smoothly with no apparent issues for the users. That is how the cloud concept is supposed to work. Unfortunately, the cloud rarely works that way for huge increases in need.

The challenge is that cloud providers have to be able to scale out and up to meet the needs of all their subscribers. This means being over-provisioned in all areas to allow for sudden peaks in demand. Recent papers show how these demand spikes can result in under-capacity at cloud providers, which in turn results in loss of clients, revenue and, of course, negative publicity. Other concerns include perceived security risks, with many potential users stating that they would never put their sensitive corporate data “in the cloud.”

All the issues and potential issues aside, one area that really causes problems is the provisioning of storage resources. Unlike CPU resources, which can be allocated and deallocated at will using virtual machine technology as loads change, the data stored by cloud users only grows, requiring larger and larger numbers of storage arrays. In addition to raw capacity, IOPS and latency must also be provisioned to meet required service level agreements (SLAs). Providers find they must deploy many times the number of disks actually needed for capacity in order to satisfy SLA requirements, leaving much of the raw storage volume unused.

One solution for the storage capacity versus SLA dilemma in the cloud space is to build a tiered, performance-based storage cloud for the users of the overall cloud space. Utilizing fast SSD storage in the uppermost tiers allows maximum use of resources, as SSDs are not sensitive to data placement and there is no need to short-stroke them to get low latency access. Clients with stringent SLA requirements are placed in the SSD portion of the cloud, while those with less strict requirements are relegated to standard disk-based storage. By removing the need for low latency response from the disks, the disks can be more fully utilized: rather than provisioning at only 20% of capacity per disk drive, they can now be provisioned at 60% or higher, cutting the number of disks required to roughly one third.

By using SSD technology for low-latency customers, greater overall storage efficiency is realized: SSDs can be used at 100% of their storage capacity, and by removing the need for low latency reads from the lower-tier disk assets, the disks can also be utilized at a much higher capacity. For example, if an application requires 1-2 ms latency to meet its response time requirements, you would need a read-caching SAN with disks short-stroked to 20% of capacity. This means buying, at a minimum, 5 times the number of drives otherwise needed. So a 7 TB database would require at least 35 TB of disk with no protection, or up to 70 disks depending on the type of RAID utilized. Alternatively, if the application data is hosted on a tier 0 SSD system such as a RamSan-630, which has 10 TB of storage, only one or two (for redundancy) SSDs are required, for a large reduction in server room footprint, energy and cooling requirements.
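To make the disk arithmetic above concrete, here is a small back-of-the-envelope sketch in Python. The 20% short-stroke figure comes from the example above; the 1 TB drive size and the mirroring factor of 2 are illustrative assumptions of mine, not vendor specifications.

# Rough sizing sketch: what a low-latency SLA costs in spindles.
# Assumptions (illustrative only): 1 TB drives, 20% usable per drive when
# short-stroked for low latency, and a RAID factor of 2 for mirroring.
def disks_needed(db_tb, drive_tb=1.0, usable_fraction=0.2, raid_factor=2):
    """Return (raw TB required, drive count) for a short-stroked disk tier."""
    raw_tb = db_tb / usable_fraction           # 7 TB / 0.2 = 35 TB raw
    drives = raw_tb * raid_factor / drive_tb   # mirroring doubles the spindle count
    return raw_tb, int(round(drives))

raw, drives = disks_needed(7.0)
print(f"7 TB database: {raw:.0f} TB raw, about {drives} drives")   # 35 TB, ~70 drives
# The same 7 TB on a 10 TB flash tier needs only one unit (two for redundancy),
# since SSDs can be filled to capacity without a latency penalty.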

In the server cloud space, SSDs can also make a huge difference. The largest use of resources for the cloud is the instantiation of the virtual machine spaces used to serve clients. In tests using a standard SAN, only 10-15 VMs could be instantiated simultaneously. When an SSD was substituted for the SAN, 40-50 VMs could be instantiated in the same time frame while placing a much lighter load on other resources. You can read more about this SSD implementation here: http://vknowledge.wordpress.com/2010/04/27/texas-memory-systems-ramsan-620/

Looks like the cloud’s silver lining might just be SSDs.

Wednesday, June 2, 2010

Calculating a True Storage Value Index

By Mike Ault, Oracle Guru, TMS, Inc.

I read with interest a new paper from Xiotech that puts forward a new performance metric called the “Storage Value Index”. The Storage Value Index takes into consideration several key characteristics of an IO subsystem to give an overall numerical grade representing the true value of your current, or a proposed, IO subsystem. The basic formula for the Storage Value Index is:

Storage Value Index = (TUC*IOPS*WY)/cost

Where:
TUC=Total usable capacity in Terabytes
IOPS=validated IOs per second (SPC-1 results for example)
WY=Warranty years (or years of paid maintenance if added to cost)
Cost=Cost of validated system
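
As a quick illustration of the formula, here is the calculation in Python; the system numbers below are hypothetical placeholders of mine, not values from the SPC results discussed next.

def storage_value_index(tuc_tb, iops, warranty_years, cost):
    """Storage Value Index = (TUC * IOPS * WY) / cost."""
    return (tuc_tb * iops * warranty_years) / cost

# Hypothetical example: 50 TB usable, 100,000 validated IOPS,
# 3 warranty years, $500,000 system cost.
print(storage_value_index(tuc_tb=50, iops=100_000, warranty_years=3, cost=500_000))  # 30.0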

I found this to be an interesting metric except for one problem: it only takes into consideration one side of the performance picture, IOPS. Why do I say this? Let’s look at some results from applying this metric to see where there may be issues. Table 1 uses values from the SPC website as its data source.


Table 1: Calculated Storage Value Indexes

From the Storage Value Index (SVI) alone we would conclude from the results in Table 1 that the Fujitsu DX8400 is the best IO subsystem because of its SVI of 35.4, followed by the Infortrend at 28.7, and so on. However, this SVI is not giving us the entire performance picture. Notice that the systems with the lowest latency are being penalized by the ability of higher-latency systems to add disks to increase IOPS.

In my work with Oracle tuning, both prior to and during my tenure with Texas Memory Systems, my primary indicator of IO subsystem problems has been read latency. Generally speaking, the higher the IO latency, the worse the system will perform. Notice that the SVI doesn’t include anything dealing with latency. From queuing theory we know that IOPS and latency are not dependent on each other: to increase IOPS we can just add queues. I can have hundreds of thousands of IOPS and still have latencies in the 10-20 millisecond range just by adding disk drives to a system. So it should be obvious that if we want to capture the true value of a system we must take into account the latency of that system at the measured IOPS value used in the SVI calculation. To this end I propose that a latency-adjusted SVI is a better measure, as follows:

Adjusted Storage Value Index = (TUC*IOPS*WY)/(cost*L)

Where:
TUC=Total usable capacity in Terabytes
IOPS=validated IOs per second (SPC-1 results for example)
WY=Warranty years (or years of paid maintenance if added to cost)
Cost=Cost of validated system
L=Latency at the measured IOPS level
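
To illustrate, here is a small Python sketch contrasting the two metrics for two invented systems: a large disk array that reaches its IOPS figure by adding many spindles (queues) at roughly 10 ms latency, and a smaller SSD system at sub-millisecond latency. The numbers are assumptions for illustration only, not SPC results; they simply show how dividing by latency can reorder the ranking.

def svi(tuc_tb, iops, wy, cost):
    return (tuc_tb * iops * wy) / cost

def adjusted_svi(tuc_tb, iops, wy, cost, latency_ms):
    # Same formula divided by latency measured at the validated IOPS level.
    return (tuc_tb * iops * wy) / (cost * latency_ms)

# Invented systems: the disk array buys IOPS with spindle count, not lower latency.
disk_array = dict(tuc_tb=100, iops=200_000, wy=3, cost=1_000_000, latency_ms=10.0)
ssd_system = dict(tuc_tb=10, iops=300_000, wy=3, cost=400_000, latency_ms=0.3)

for name, s in [("disk array", disk_array), ("SSD system", ssd_system)]:
    print(f"{name}: SVI={svi(s['tuc_tb'], s['iops'], s['wy'], s['cost']):.1f}, "
          f"adjusted SVI={adjusted_svi(**s):.1f}")
# The disk array wins on plain SVI (60.0 vs 22.5) but loses badly once
# latency is factored in (6.0 vs 75.0).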

Taking latency into account means our results are now adjusted by both throughput (IOPS) and response time (latency), giving a truer Storage Value Index. Table 2 shows the results with this adjustment.


Table 2: Adjusted Storage Value Index

As you can see, by taking into account the final performance metric, latency, the results now give a better understanding of the complete IO subsystem.

In addition, the actual projected operating costs (floor space, electricity and cooling) for the warranty period should be added to the cost figures to get the true monetary cost of the systems to be compared. Unfortunately that information is not provided or easily obtainable.
References:
Xiotech White Paper: “Strategies for Measuring and Optimizing the Value of Your Storage Investments”, May, 2010

http://www.storageperformance.org/results/benchmark_results_spc1/#spc1

Wednesday, May 26, 2010

Excuse me Sir, this Violin seems out of Tune

By Mike Ault

In a recent article on the Channel Register website, Chris Mellor relays Violin’s claims about its 3200 Flash Memory Array. Unfortunately, the release contains some claims that require examination, if not repudiation. Let’s look at the claims put forth by Violin:
  1. Integrated Flash RAID and a sustainable 10-fold performance advantage over leading competitors
  2. Scales from 500GB to 10TB
  3. Data latency less than 100 microseconds
  4. Working life of 10+ years with continuous writes
  5. First memory array to scale to more than 140TB in a rack with performance over 2 million IOPS
  6. Total cost lowered by more than 50 percent
  7. Has RAID protection unlike Oracle’s Exadata
  8. Violin is the first company to aggregate Flash as an enterprise storage solution, beyond just a cache strategy
Let’s examine each of these claims.
  1. Integrated Flash RAID and a “sustainable 10-fold performance advantage over leading competitors”
    Wrong: While Violin may offer an integrated Flash RAID, their only competitor in this market is Texas Memory Systems. Based on the proven 80 microsecond write times (per SPC Benchmark 1™) of the RamSan-620 products, the 10-fold performance claim is patently false. When compared to disks, however, it is true of virtually all Flash providers.
  2. Scales from 500GB to 10TB.
    Wrong: This doesn’t take into account the capacity that must be used for Flash management, wear leveling, RAID, etc. Their actual usable capacity at the top end of the range is only about 7.5 terabytes, compared to 10TB of actual usable capacity for the RamSan-630.
  3. Data latency less than 100 microseconds.
    Inaccurate: What is the reference point for this claim? Is it a read, write, or blended figure? If it is read, then what is the write latency? The RamSan-500 provides 15 microsecond read latency (from cache), and the RamSan-620/630 products provide 80 microsecond write latency and 250 microsecond read latency (nominal), with latency generally better than reported. Looking at the graphs of latency versus IOPS for the 3200 on the Violin site, its latency rapidly increases above the claimed 100 microseconds as IOPS increase.
  4. Working life of 10+ years with continuous writes.
    Inaccurate: Show me the numbers. Is this 365x24x7 at 220,000 IOPS with 100% writes? 80/20 read/write? As they used to say in math class, show your work. Just going by the numbers (I can send you a spreadsheet), the RamSan-630 with a full Flash load-out will last 27 years at 400K write IOPS; it will be on eBay before it wears out.
  5. First memory array to scale to more than 140TB in a rack with performance of over 2 million IOPS.
    Wrong: The RamSan-630, at 10TB usable capacity and 500,000 IOPS in a 3U form factor, provides 140TB of usable space and 7,000,000 IOPS in a single rack. It was announced in April 2010 (actually earlier than that, but that was the “official” date). The 3200 was announced in May 2010.
  6. Total cost lowered more than 50 percent.
    Inaccurate: More hand waving; break it down. For example, what are the support costs, the cost of the head needed to provide RAID, and the other costs? Total cost compared to what? The base cost of a RamSan-630 is $370K for a full-capacity 10TB usable system (about $36/GB usable), which is actually 13.5TB of raw Flash, or about $26/GB raw. At $200K for a Violin system with 7.5TB usable, the price is about $20/GB of raw storage but about $26/GB usable (the sketch after this list works through the per-gigabyte arithmetic). Essentially the price per usable gigabyte is a wash with no real benefit; it is smoke and mirrors and nowhere near the claimed 50 percent. At most it is a 38 percent difference, and what else are they not giving all the facts about?
  7. Has RAID protection unlike Oracle’s Exadata
    Wrong: Exadata uses ASM, which provides striping and mirroring (RAID 10). In fact, from the numbers reported for actual versus available capacity in an Exadata cell, Exadata appears to be using HIGH redundancy, which means three-way mirroring. So, wrong again: Exadata has RAID capability and is RAIDed.
  8. Violin is the first company to aggregate Flash as an enterprise solution, beyond just a cache strategy.
    Wrong: TMS, with the RamSan-500 (2TB), RamSan-620 (5TB) and RamSan-630 (10TB) products, provided the first aggregated Flash enterprise solution. The RamSan-500 was announced in September 2007, the RamSan-620 in April 2009 and the RamSan-630 in April 2010; the Violin 3200 followed in May 2010. Obviously Violin is fourth in line, not first.
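
Since claim 6 comes down to dollars per gigabyte, here is the arithmetic from that item as a small Python sketch. The prices and capacities are the ones quoted above; small differences from the figures in the text come from rounding and from decimal versus binary gigabytes.

def price_per_gb(price_usd, capacity_tb):
    """Dollars per gigabyte, using decimal terabytes (1 TB = 1,000 GB)."""
    return price_usd / (capacity_tb * 1000)

# Figures quoted in claim 6 above.
ramsan_630 = dict(price=370_000, usable_tb=10.0, raw_tb=13.5)
violin_3200 = dict(price=200_000, usable_tb=7.5, raw_tb=10.0)

for name, s in [("RamSan-630", ramsan_630), ("Violin 3200", violin_3200)]:
    print(f"{name}: ${price_per_gb(s['price'], s['usable_tb']):.0f}/GB usable, "
          f"${price_per_gb(s['price'], s['raw_tb']):.0f}/GB raw")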
With so many inaccuracies, can the actual specifications provided really be trusted?

Monday, May 3, 2010

Who's In Charge Here Anyways?

As DBAs we have all seen it; heck, we have probably done it. We call over to the server administrators for more space for our database files, and sometime later we get it. We have no idea how it is configured, where it is located or whether it will contend with existing file placements. All of the files we own are located in some magic land, let’s call it SAN Land, where everything is always load balanced, there are no hot spots and nothing ever contends with anything else. I think it is located right next to Lake Wobegon.

The SAN as a black-box technology has been both a boon and a bane to Oracle administrators. We know how things should be set up, but when we try to pass this information along to the SAN administrator we hear the usual replies about how we have to co-exist with the other users and it is just not possible to configure things just for us. Well, those days have ended.

How about space that doesn’t have to be configured with an eye toward contention from head movement or block placement? How about freedom from hot spots and all the other problems that plague disk-based technology? Even better, how about storage that can be locally managed? Impossible? Am I in a fantasy land somewhere?

Nope, not a fantasy land; welcome to the year 2010. How about 225 to 450 gigabytes of low latency storage that is locally controlled, doesn’t depend on disks and, better yet, can usually be purchased and installed with little pushback from system or LAN administrators? The RamSan-10 and RamSan-20 provide 225-450 gigabytes of high-speed, low-latency SLC flash memory based storage that plugs into a full-size PCIe slot in the server and looks like just another disk drive, but looks are deceiving.

As a “database accelerator” for a single-server database that hooks directly into the server and doesn’t require any Fibre Channel, NFS, iSCSI or SAS connection, PCIe storage bypasses many of the management headaches associated with standard SAN technology. Because the RamSans are not built on mechanical disks, 100% of the storage capacity can be utilized; there is no need to worry about short-stroking, striping or mirroring to get better performance. At a price of between $8K and $20K USD, these solutions also fall easily within the signature purchase authority of most department heads.

So shake off the fetters of the SAN world and step into the 21st century! Deliver 5 times the performance of standard SAN technologies to your database that you control locally.

Wednesday, April 21, 2010

Day 2 of Collaborate 2010

Well, survived day 2! The 1:30-2pm theater session was packed full; the only issue was that there was no microphone, so after having to project above the background noise from the exhibit hall I was a little hoarse, but that's ok. My second presentation was actually a RAC Tuning panel from 4-5pm, and it went well with lots of great questions and answers.

I attended two of the RAC tuning bootcamp presentations and it was interesting to see that RAC tuning really hasn't changed much from version 9 to version 11 as some of the same things I used to teach when I was consulting are still being used.

Today I have a 1-2pm RAC Expert Panel in Palm F and my Oracle Holistic Tuning presentation from 4-5pm in Palm F, which will also be a webcast.

Of course I will also be at the booth, number 1645. We are giving away a free PDF version of the Oracle Tuning using SSDs book if you stop by with a thumb drive, so come on by!

Hope to see you there!

Tuesday, April 20, 2010

Day 1 of Collaborate

Well, here it is, the beginning of day two of Collaborate '10. On day one I attended a couple of SSD and ASM presentations and helped set up and man the TMS booth. The presentations I attended didn't thrill me all that much, but they were interesting in the way they showed the level of understanding in the Oracle community about SSD and ASM usage.

One problem was that the first presentation, about SSDs and Oracle on DELL servers and storage, didn't give any hard numbers; instead the presenter (or perhaps DELL) decided to present normalized numbers. This means that the values are normalized to the worst value (in this case the hard SAS drive latency and IOPS) and the other values are reported as they relate to that number (as a percentage). As a possible buyer of the technology I would want hard numbers to compare rather than normalized numbers, but then maybe I am odd that way.

The second presentation dealt with ASM and load balancing. Unfortunately it didn't do too deep a dive and didn't cover much of anything that wasn't in the manual. Maybe I am jaded.

The booth setup went well and we had many people come by to talk performance and real numbers on the RamSan SSDs. We have the RamSan-630 here for demos; it is the latest of the TMS RamSan line, offering up to 10 terabytes of capacity with 80 microsecond writes and 250 microsecond reads (no, I didn't get that reversed!) and, with a full set of Fibre Channel cards, 500,000 IOPS.

Today I will be presenting in the mini-theater on "Oracle and SSDs" at 1pm. I will also be participating in the Oracle RAC Tuning Panel at 4pm, see you there! Other than that I will be either attending interesting presentations or at the booth.

Come by and see us! We are booth 1645.

Mike Ault

Tuesday, March 2, 2010

TMS San Jose Road Show

Well, after a rough start (getting there!), the San Jose TMS Road Show was a resounding success. Woody, Levi, Webex (one of our customers) and I addressed a packed room about TMS products and their use in resolving IO bottlenecks in computer architectures.

The audience responded well, with many intelligent and well-thought-out questions. With the wide range of expertise we brought to the show we were able to answer all the questions asked. These road shows are a great way to get your questions about using SSD technology answered!

Next week it is off to Chicago for the second road show. If you are in the Chicago area, please register (use the link in the blog title) and hopefully we will see you there!

Mike

(I tell about my adventure getting to the road show in my personal blog at http://mikerault.blogspot.com)