Friday, July 8, 2016

Best Practices When Making Decisions About a SAN Purchase

The following case study covers best practices for making decisions about a SAN purchase.

Aside from correctly sizing a SAN to fit the needs of the organization, there are also choices to make about maintenance, spares, and proactive monitoring.

Should we purchase spares, and how many? Your SAN vendor can recommend a quantity, but spares should be part of your strategy either way. As the example below shows, spares play an integral role in an overall strategy that, ideally, makes a failure an inconvenience to fix rather than an outage.

Should we purchase maintenance? Yes. The repair turnaround time and the scope of coverage are up to you and should reflect what makes sense for your organization. Access to firmware and software updates as they become available is just as important as coverage for hardware that stops functioning down the line. In the example below, the standard maintenance package included a 4-hour turnaround for parts and labor, meaning the replacement part is in your hands within 4 hours and either the vendor's tech or yours can install it.

Investing in proactive monitoring is also important, so that you get status updates when your SAN is no longer operating within specifications. For single-event incidents, this lets your organization act before the problem grows unnoticed into a larger issue or something catastrophic occurs.

In this example, the proactive alarms that came in were a general RAID error (fairly innocuous), followed by a drive failure, and then a reduction in spare drives.


The alarm in Figure 1 told us something had occurred on the RAID controller:
                         Figure 1.

The alarm shown in Figure 2 told us that we had a problem with one of our drives:
                               Figure 2.

The alarm shown in Figure 3 told us two things. First, a spare drive was now in use. Second, because the spare was in use, the drive failure was confirmed, and we needed to replace the drive identified in the alarm from Figure 2.
                                Figure 3.
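The reasoning behind these three alarms can be written down as a small triage rule. The sketch below is only an illustration of that logic, not the monitoring product's actual behavior: the alarm names, the `Alarm` type, the `assess` function, and the "slot 7" drive label are all hypothetical and would need to be mapped to whatever your SAN vendor's monitoring tool really emits.

```python
from dataclasses import dataclass
from enum import Enum


class AlarmKind(Enum):
    # Hypothetical names for the three alarm types described in this post.
    RAID_ERROR = "raid_error"        # general controller alarm (Figure 1)
    DRIVE_FAILURE = "drive_failure"  # a specific drive reported bad (Figure 2)
    SPARE_IN_USE = "spare_in_use"    # spare count reduced (Figure 3)


@dataclass(frozen=True)
class Alarm:
    kind: AlarmKind
    detail: str = ""  # e.g. which drive; format is an assumption


def assess(alarms):
    """Return a recommended action for the alarms seen on one array."""
    kinds = {a.kind for a in alarms}

    # A spare being consumed confirms the drive failure: act now.
    if AlarmKind.SPARE_IN_USE in kinds:
        failed = [a.detail for a in alarms if a.kind is AlarmKind.DRIVE_FAILURE]
        target = ", ".join(failed) if failed else "unidentified drive"
        return f"replace drive now: {target}"

    # A drive alarm alone is not yet confirmed by a spare taking over.
    if AlarmKind.DRIVE_FAILURE in kinds:
        return "drive failure reported: verify and watch for spare activation"

    # A general RAID error on its own is fairly innocuous but worth a look.
    if AlarmKind.RAID_ERROR in kinds:
        return "investigate RAID controller alarm"

    return "no action"
```

Fed the same sequence as in this case study, for example `assess([Alarm(AlarmKind.RAID_ERROR), Alarm(AlarmKind.DRIVE_FAILURE, "slot 7"), Alarm(AlarmKind.SPARE_IN_USE)])`, the rule returns the "replace drive now" action for the drive named in the drive-failure alarm.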

Manual validation from the SAN control GUI (shown below) confirmed what the monitoring alarms had reported.
                                 Figure 4.

                                 Figure 5.




