Bigger, Faster and “More Efficient” Doesn’t Always Mean Better

In today’s dynamic, ever-changing IT landscape there is a lot of emphasis on purchasing technologies that do more with less, increase performance, and make existing approaches more efficient. Clients are turning to their trusted advisors and asking them to sift through the stories, FUD and hype, in the hope that their solution providers will help them architect a strategy that uses the newest technologies to increase competitiveness, all while reducing total cost of ownership.

The single greatest advance in this area, at least in my opinion, is server virtualization, which has helped clients consolidate siloed resources and management structures while increasing performance and availability and massively reducing TCO.

Another area in which massive savings have been found is the de-duplication of data within an IT environment. This is a tactic employed to reduce the amount of data that resides in an environment, both on primary storage systems and in the backup stack, in an effort to reduce the strain on networks as well as the time and money spent on expensive disk technologies.

While both of these tools can deliver massive capex/opex savings when implemented the right way, they can also cause as many issues as they solve if they aren’t properly thought out and managed through their life cycle.

Was That VM Ever Really Needed??

When working with clients who have been virtualized for a few years and have moved on to standardizing the virtualization of every application that is supported in a virtualized state, the very ease of creating new services (VMs) can become an issue in itself. When heavily virtualized clients are asked about how their VMs are actually used and how efficient their virtualization has been, most just scratch their heads and rattle off how many physical servers they have versus virtual, as if that ratio were itself a measure of the efficiency of their environment.

Upon further discussion it usually becomes obvious that at least a third of their virtual machines serve no purpose beyond their own existence, or the smile they put on the face of the person who requested their unnecessary creation in the first place. For these virtual machines, wouldn’t it have been smarter to NOT spin them up to begin with? No one likes telling their staff “no”, but learning to do so can help clients squeeze far more out of the investments they have made in virtualization. Effective planning in this area will drive up utilization rates while pulling more value out of data center investments.
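If you want a starting point for that conversation, here is a minimal, hypothetical Python sketch. It assumes you have exported per-VM metrics to a CSV with illustrative column names like name, avg_cpu_percent, avg_net_kbps and days_since_last_login (these names and thresholds are not from any particular tool), and it simply flags VMs that look idle enough to question.

```python
import csv

# All thresholds are hypothetical -- tune them to your environment.
CPU_IDLE_PCT = 2.0     # average CPU below 2%
NET_IDLE_KBPS = 5.0    # average network traffic below 5 kbps
STALE_DAYS = 90        # no interactive login for 90+ days

def find_idle_vm_candidates(inventory_csv):
    """Return names of VMs whose exported metrics suggest they may be unused."""
    candidates = []
    with open(inventory_csv, newline="") as f:
        for row in csv.DictReader(f):
            cpu = float(row["avg_cpu_percent"])
            net = float(row["avg_net_kbps"])
            stale = int(row["days_since_last_login"])
            if cpu < CPU_IDLE_PCT and net < NET_IDLE_KBPS and stale > STALE_DAYS:
                candidates.append(row["name"])
    return candidates

if __name__ == "__main__":
    for name in find_idle_vm_candidates("vm_inventory.csv"):
        print("Review before renewal:", name)
```

A list like this doesn’t prove a VM is unneeded, but it gives you something concrete to take back to the requester before the next hardware refresh.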

Where Did My De-Dupe Go??

When architecting primary storage solutions there is a huge focus on providing a solution that uses technologies like thin provisioning and de-duplication to help lengthen disk purchase cycles and drive every dollar of utilization out of a system before an upgrade is needed.

While thin provisioning has its own pitfalls and considerations before implementation (which I’ve discussed in a previous post), primary storage de-duplication can cause unexpected capacity nightmares down the road if it isn’t implemented in the right fashion and tracked properly over time.

Data de-duplication is the practice of using algorithms to look for duplicate blocks of data and, rather than writing the duplicate blocks to disk, simply placing a pointer to where the original block can be found. This allows clients in “traditional” application environments to see a massive reduction in the amount of disk they require to support an environment.
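To make the pointer idea concrete, here is a minimal sketch of block-level de-duplication in Python. It is a toy model, not how any particular array implements it: data is split into fixed-size blocks, each block is hashed, only the first copy of each block is stored, and every later copy becomes a pointer (a hash reference) back to the original.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed-size blocks; real arrays may use variable-length chunks

def deduplicate(data: bytes):
    """Split data into blocks, keep one copy of each unique block,
    and represent the volume as a list of pointers (block hashes)."""
    store = {}     # hash -> block bytes, written to disk only once
    pointers = []  # the "volume" is just a list of references
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block   # new block: actually store it
        pointers.append(digest)     # duplicate block: only a pointer is kept
    return pointers, store

# 200 blocks of highly repetitive data dedupe down to 2 unique blocks.
data = b"A" * BLOCK_SIZE * 100 + b"B" * BLOCK_SIZE * 100
pointers, store = deduplicate(data)
print(len(pointers), "logical blocks,", len(store), "unique blocks stored")
```

The “volume” your applications see is the full list of pointers; the disk only holds the unique blocks.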

While this can provide a lot of capacity savings and make existing disk last longer than arrays that aren’t de-duplicating primary data, there are some major concerns that need to be kept top of mind with this approach. Many of you are expecting a vitriolic diatribe about how primary storage de-duplication affects performance negatively, blah blah blah. Well, that’s not what today’s focus is on, at least not for me.

The major problem the de-duplication of primary storage causes is in a storage admin’s understanding of what data dependencies exist and what capacity can be “rescued” in an emergency by simply deleting data. Many storage admins will delete data in a pinch to free up space on their SAN/NAS rather than waiting to expand the drive pool, but if a volume has been de-duplicated, there is a major concern to keep in mind when it is erased: if what you are erasing looks like data but is really just pointers to where the original data lives, you may think you are deleting a 2 TB LUN when in fact you are deleting 56 KB of pointers and nothing more.
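Extending the toy model above, here is a hedged sketch of why that happens: deleting a de-duplicated volume only frees the physical blocks that no other volume still references, so the logical size you delete and the physical space you reclaim can be wildly different.

```python
import hashlib

BLOCK_SIZE = 4096

def deduplicate(data: bytes):
    """Same toy model as above: a volume is just a list of block hashes."""
    store, pointers = {}, []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        pointers.append(digest)
    return pointers, store

def bytes_reclaimed_by_deleting(volume, other_volumes, store):
    """Only blocks that no surviving volume still references are actually freed."""
    still_referenced = {h for vol in other_volumes for h in vol}
    return sum(len(store[h]) for h in set(volume) - still_referenced)

# Two volumes holding identical content share every physical block...
data = b"X" * BLOCK_SIZE * 1000
vol_a, store = deduplicate(data)
vol_b, more = deduplicate(data)
store.update(more)

logical_bytes = len(vol_a) * BLOCK_SIZE
freed_bytes = bytes_reclaimed_by_deleting(vol_a, [vol_b], store)
print("Deleting vol_a removes", logical_bytes, "logical bytes")
print("but reclaims only", freed_bytes, "physical bytes")
```

In this example deleting one volume reclaims zero physical bytes, because its twin still points at every block. The “size” of what you delete tells you almost nothing about the space you actually get back, which is exactly the trap a storage admin falls into under pressure.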

If the above issue isn’t addressed in the planning phase and then monitored closely on a regular basis, it ends at best with a storage administrator pulling out their hair trying to plan storage array growth, and at worst with the rampant loss of data because data dependencies weren’t being closely monitored.

Enough of my “Mr. Grinch” impression for today… I’ll make sure to post something more fun and happy next time around. Thanks for stopping by!


About Adam Wolfson

Adam Wolfson is now an HP Technical Architect here at Softchoice and holds more than 8 enterprise technical architecture certifications from HP.