Upcoming Events and Activities Update V2010.1

The end-of-year Christmas and New Year's holiday season has come and gone, which of course means that 2009 is a wrap, along with the travel from being out and about.

In addition to getting some time to relax a bit (playing Wii resort, snow plowing, cooking etc.), I have also been catching up on developing some new content including articles, blogs (some yet to be posted), tips and podcasts, along with some custom research advisory projects.

Check out some recent tips, articles, videos and podcasts here along with perspectives and comments on industry news here.

2009 events and activities saw visits to cities including San Jose, Tucson, Cancun Mexico, Dallas, Tampa, Miami, Los Angeles, San Jose, Las Vegas, Milwaukee, Atlanta, St. Louis, Birmingham, Cincinnati, Santa Ana, Minneapolis, Boston, Dallas, Boston, Chicago, Parsippany, Raleigh, Providence, Kansas City, Denver, Chicago, Orlando, Chicago, Philadelphia, Toronto, Richmond, Columbus, Princeton, Seattle, Portland, Dallas, San Francisco, Minneapolis, Toronto, Chicago, New York, Milwaukee, Atlanta, Boston, Cleveland and Detroit among others.

This time of the year also means that the 2010 events and activities, including in-person keynotes and presentations (also known as out and about), are getting underway. While the 2010 schedule of events is still being finalized, some initial events are on the calendar, my bags are about to be packed and tickets are in hand, not to mention finalizing the presentation and discussion content.

In addition to some non-public events, including keynote presentations at some vendors' annual sales (kick-off) meetings, the following are some of the events currently on the calendar. Click on the links below to learn more about the venues.

February 3, 2010 Green Data Center Conference, San Diego, CA
January 21, 2010 Dinner Event keynote Speaker Dynamic IT Infrastructure, Beverly Hills, CA
January 21, 2010 Morning keynote Speaker The Green and Virtual Data Center, San Diego, CA
January 19, 2010 Dinner Event keynote Speaker Dynamic IT Infrastructure, Miami, FL

Watch for updates to the events calendar, and I look forward to seeing you all while I'm out and about during 2010.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Recent tips, videos, articles and more update V2010.1

Realizing that some prefer blogs to webs to twitter to other venues, here are some recent links to articles, tips, videos, webcasts and other content that have appeared in different venues since August 2009.

  • i365 Guest Interview: Experts Corner: Q&A with Greg Schulz December 2009
  • SearchCIO Midmarket: Remote-location disaster recovery risks and solutions December 2009
  • BizTech Magazine: High Availability: A Delicate Balancing Act November 2009
  • ESJ: What Comprises a Green, Efficient and Effective Virtual Data Center? November 2009
  • SearchSMBStorage: Determining what server to use for SMB November 2009
  • SearchStorage: Performance metrics: Evaluating your data storage efficiency October 2009
  • SearchStorage: Optimizing capacity and performance to reduce data footprint October 2009
  • SearchSMBStorage: How often should I conduct a disaster recovery (DR) test? October 2009
  • SearchStorage: Addressing storage performance bottlenecks in storage September 2009
  • SearchStorage AU: Is tape the right backup medium for smaller businesses? August 2009
  • ITworld: The new green data center: From energy avoidance to energy efficiency August 2009
  • Video and podcasts include:
    December 2009 Video: Green Storage: Metrics and measurement for management insight
    Discussion between Greg Schulz and Mark Lewis of TechTarget about the importance of metrics and measurement to gauge productivity and efficiency for Green IT and enabling virtual information factories. Click here to watch the video.

    December 2009 Podcast: iSCSI SANs can be a good fit for SMB storage
    Discussion between Greg Schulz and Andrew Burton of TechTarget about iSCSI and other related technologies for SMB storage. Click here to listen to the podcast.

    December 2009 Podcast: RAID Data Protection Discussion
    Discussion between Greg Schulz and Andrew Burton of TechTarget about RAID data protection, techniques and technologies. Click here to listen to the podcast.

    December 2009 Podcast: Green IT, Efficiency and Productivity Discussion
    Discussion between Greg Schulz and Jon Flower of Adaptec about Green IT, energy efficiency, intelligent power management (IPM), also known as MAID 2.0, and other forms of optimization techniques including SSD. Click here to listen to the podcast sponsored by Adaptec.

    November 2009 Podcast: Reducing your data footprint impact
    Even though many enterprise data storage environments are coping with tightened budgets and reduced spending, overall net storage capacity is increasing. In this interview, Greg Schulz, founder and senior analyst at StorageIO Group, discusses how storage managers can reduce their data footprint. Schulz touches on the importance of managing your data footprint on both online and offline storage, as well as the various tools for doing so, including data archiving, thin provisioning and data deduplication. Click here to listen to the podcast.

    October 2009 Podcast: Enterprise data storage technologies rise from the dead
    In this interview, Greg Schulz, founder and senior analyst of the Storage I/O group, classifies popular technologies such as solid-state drives (SSDs), RAID and Fibre Channel (FC) as “zombie” technologies. Why? These are already set to become part of standard storage infrastructures, says Schulz, and are too old to be considered fresh. But while some consider these technologies to be stale, users should expect to see them in their everyday lives. Click here to listen to the podcast.

    Check out the Tips, Tools and White Papers, and News pages for additional commentary, coverage and related content or events.

    The other Green Storage: Efficiency and Optimization

    Some believe that green storage is specifically designed to reduce power and cooling costs.

    The reality is that there are many ways to reduce environmental impact while enhancing the economics of data storage besides simply boosting utilization.

    These include optimizing data storage capacity as well as boosting performance to increase productivity per watt of energy used when work needs to be done.

    Some approaches require new hardware or software, while others can be accomplished with changes to management, including reconfiguration that leverages insight and awareness of resource needs.

    Here are some related links:

    The Other Green: Storage Efficiency and Optimization (Videocast)

    Energy efficient technology sales depend on the pitch

    Performance metrics: Evaluating your data storage efficiency

    How to reduce your Data Footprint impact (Podcast)

    Optimizing enterprise data storage capacity and performance to reduce your data footprint

    Optimize Data Storage for Performance and Capacity Efficiency

    This post builds on a recent article I did that can be read here.

    Even with tough economic times, there is no such thing as a data recession! That makes it important to optimize data storage efficiency for both performance and capacity, without impacting availability, as a cost-effective way to do more with what you have.

    What this means is that even though budgets are tight or have been cut resulting in reduced spending, overall net storage capacity is up year over year by double digits if not higher in some environments.

    Consequently, there is continued focus on stretching available IT and storage related resources or footprints further while eliminating barriers or constraints. IT footprint constraints can be physical in a cabinet or rack as well as floorspace, power or cooling thresholds and budget among others.

    Constraints can be due to lack of performance (bandwidth, IOPS or transactions), poor response time or lack of availability for some environments. Yet for other environments, constraints can be lack of capacity, limited primary or standby power or cooling constraints. Other constraints include budget, staffing or lack of infrastructure resource management (IRM) tools and time for routine tasks.

    Look before you leap
    Before jumping into an optimization effort, gain insight if you do not already have it as to where the bottlenecks exist, along with the cause and effect of moving or reconfiguring storage resources. For example, boosting capacity use to more fully use storage resources can result in a performance issue or data center bottlenecks for other environments.

    An alternative scenario is that in the quest to boost performance, storage is seen as being under-utilized, yet when capacity use is increased, lo and behold, response time deteriorates. The result can be a vicious cycle, hence the need to address the issue rather than simply move the problem, by using tools to gain insight into resource usage, covering both space and activity or performance.

    Gaining insight means looking at capacity use along with performance and availability activity and how they use power, cooling and floor-space. Consequently, an important step is gaining insight into and knowledge of how your resources are being used to deliver various levels of service.

    Tools include storage or system resource management (SRM) tools that report on storage space capacity usage, performance and availability with some tools now adding energy usage metrics along with storage or system resource analysis (SRA) tools.

    Cooling Off
    Power and cooling are commonly talked about as constraints, either from a cost standpoint, or availability of primary or secondary (e.g. standby) energy and cooling capacity to support growth. Electricity is essential for powering IT equipment including storage enabling devices to do their specific tasks of storing data, moving data, processing data or a combination of these attributes.

    Thus, power gets consumed, some work or effort to move and store data takes place, and the byproduct is heat that needs to be removed. In a typical IT data center, cooling on average can account for about 50% of energy used, with some sites using less.

    With cooling being a large consumer of electricity, a small percentage change to how cooling consumes energy can yield large results. Addressing cooling energy consumption can help with budget or cost issues, or free up cooling capacity to support installation of extra storage or other IT equipment.

    Keep in mind that effective cooling relies on removing heat from as close to the source as possible to avoid over-cooling, which requires more energy. If you have not done so, have a facilities review or assessment performed, which can range from a quick walk around to a more in-depth review and thermal airflow analysis. Techniques for removing heat close to the source include intelligent, precision or smart cooling, also known by other marketing names.

    Powering Up, or, Powering Down
    Speaking of energy or power, in addition to addressing cooling, there are a couple of ways of addressing power consumption by storage equipment (Figure 1). The most commonly discussed approach to efficiency is energy avoidance, which involves powering down storage when it is not in use, as with first-generation MAID, at the cost of performance.

    For off-line storage, tape and other removable media give low-cost capacity per watt with low to no energy needed when not in use. Second generation (e.g. MAID 2.0) solutions with intelligent power management (IPM) capabilities have become more prevalent enabling performance or energy savings on a more granular or selective basis often as a standard feature in common storage systems.

    Figure 1:  How various RAID levels and configuration impact or benefit footprint constraints

    Another approach to energy efficiency, seen in Figure 1, is doing more work for active applications per watt of energy to boost productivity. This can be done by using the same amount of energy while doing more work, or by doing the same amount of work with less energy.

    For example, instead of using larger capacity disks to improve capacity per watt metrics, active or performance-sensitive storage should be looked at on an activity basis such as IOPS, transactions, videos, emails or throughput per watt. Hence, a fast disk drive doing work can be more energy-efficient in terms of productivity than a higher capacity, slower disk drive for active workloads, whereas for idle or inactive storage the inverse should hold true.

    Going forward, the trend already being seen with some servers and storage systems is to do more work while using less energy. Thus a larger ratio of useful work (for active or non-idle storage) to energy consumed yields a better efficiency rating, or take the inverse if you prefer smaller numbers.
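To make the activity-per-watt versus capacity-per-watt distinction concrete, here is a minimal sketch. The drive figures are made-up illustrative values, not measurements or vendor specifications, and the names `Drive`, `iops_per_watt` and `gb_per_watt` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical drive profile; all numbers are illustrative assumptions."""
    name: str
    capacity_gb: float
    iops: float
    watts: float


def iops_per_watt(drive: Drive) -> float:
    """Activity metric: useful work (IOPS) per watt, suited to active storage."""
    return drive.iops / drive.watts


def gb_per_watt(drive: Drive) -> float:
    """Capacity metric: gigabytes stored per watt, suited to idle storage."""
    return drive.capacity_gb / drive.watts


# Assumed example values only.
fast = Drive("fast 15K RPM", capacity_gb=146, iops=180, watts=15)
big = Drive("high-capacity 7200 RPM", capacity_gb=1000, iops=80, watts=12)

for d in (fast, big):
    print(f"{d.name}: {iops_per_watt(d):.1f} IOPS/W, {gb_per_watt(d):.1f} GB/W")
```

With these assumed numbers the fast drive wins on IOPS per watt and the high-capacity drive wins on GB per watt, which is why the metric should match whether the storage is active or idle.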

    Reducing Data Footprint Impact
    Data footprint impact reduction tools or techniques for both on-line as well as off-line storage include archiving, data management, compression, deduplication, space-saving snapshots, thin provisioning along with different RAID levels among other approaches. From a storage access standpoint, you can also include bandwidth optimization, data replication optimization, protocol optimizers along with other network technologies including WAFS/WAAS/WADM to help improve efficiency of data movement or access.

    Thin provisioning for capacity-centric environments can be used to achieve a higher effective storage use level by essentially overbooking storage, similar to how airlines oversell seats on a flight. If you have good historical information and insight into how storage capacity is used and over-allocated, thin provisioning enables improved effective storage use for some applications.

    However, with thin provisioning, avoid introducing performance bottlenecks by leveraging solutions that work closely with tools that provide historical trending information (capacity and performance).
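The overbooking idea behind thin provisioning can be sketched with a little arithmetic. This is only an illustration: the capacities are assumed values, and `overbooking_ratio` and `physical_utilization` are hypothetical helper names rather than part of any product.

```python
def overbooking_ratio(allocated_tb: float, physical_tb: float) -> float:
    """Logical capacity promised to applications per TB of physical capacity."""
    return allocated_tb / physical_tb


def physical_utilization(used_tb: float, physical_tb: float) -> float:
    """Fraction of physical capacity that actually holds data."""
    return used_tb / physical_tb


# Assumed example: 100 TB installed, 250 TB allocated, 70 TB actually written.
physical, allocated, used = 100.0, 250.0, 70.0

print(f"overbooking: {overbooking_ratio(allocated, physical):.1f}x")
print(f"physical use: {physical_utilization(used, physical):.0%}")
print(f"headroom before the pool is full: {physical - used:.0f} TB")
```

Tracking the headroom figure over time is where the historical trending information comes in: the overbooking is only safe while actual use stays comfortably below installed capacity.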

    For a technology that some have tried to declare dead in order to prop up other new or emerging solutions, RAID remains relevant given its widespread deployment and transparent reliance in organizations of all sizes. RAID also plays a role in storage performance, availability, capacity and energy constraints, as well as being a relief tool.

    The trick is to align the applicable RAID configuration to the task at hand meeting specific performance, availability, capacity or energy along with economic requirements. For some environments a one size fits all approach may be used while others may configure storage using different RAID levels along with number of drives in RAID sets to meet specific requirements.


    Figure 2:  How various RAID levels and configuration impact or benefit footprint constraints

    Figure 2 shows a summary and tradeoffs of various RAID levels. In addition to the RAID level, the number of disks in a RAID set can also have an impact on performance or capacity. For example, creating a larger RAID 5 or RAID 6 group spreads out the parity overhead, however there is a tradeoff. Tradeoffs can be performance bottlenecks on writes or during drive rebuilds, along with potential exposure to drive failures.
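How group size spreads out parity overhead can be shown with a short sketch. It covers only the capacity side of the tradeoff (not write or rebuild performance), treats a single RAID set with equal-size drives, and `usable_fraction` is a hypothetical helper name.

```python
def usable_fraction(level: str, drives: int) -> float:
    """Return the fraction of raw capacity available for data in one RAID set."""
    if level == "RAID 10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return 0.5  # every block is mirrored once

    # (minimum drives, drives' worth of capacity spent on parity)
    rules = {"RAID 0": (2, 0), "RAID 5": (3, 1), "RAID 6": (4, 2)}
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, parity = rules[level]
    if drives < minimum:
        raise ValueError(f"{level} needs at least {minimum} drives")
    return (drives - parity) / drives


for n in (4, 8, 16):
    row = ", ".join(
        f"{lvl}: {usable_fraction(lvl, n):.0%}"
        for lvl in ("RAID 5", "RAID 6", "RAID 10")
    )
    print(f"{n} drives -> {row}")
```

Larger RAID 5 or RAID 6 groups keep more of the raw capacity for data, while RAID 10 stays at half regardless of group size; the price of the larger group is longer rebuilds and more drives exposed to a failure.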

    All of this comes back to a balancing act to align to your specific needs as some will go with a RAID 10 stripe and mirror to avoid risks, even going so far as to do triple mirroring along with replication. On the other hand, some will go with RAID 5 or RAID 6 to meet cost or availability requirements, or, some I have talked with even run RAID 0 for data and applications that need the raw speed, yet can be restored rapidly from some other medium.

    Let's bring it all together with an example
    Figure 3 shows a generic example of a before and after optimization for a mixed workload environment, granted you can increase or decrease the applicable capacity and performance to meet your specific needs. In figure 3, the storage configuration consists of one storage system setup for high performance (left) and another for high-capacity secondary (right), disk to disk backup and other near-line needs, again, you can scale the approach up or down to your specific need.

    For the performance side (left), 192 x 146GB 15K RPM (28TB raw) disks provide good performance, however with low capacity use. This translates into a low capacity per watt value, however with reasonable IOPS per watt and some performance hot spots.

    On the capacity-centric side (right), there are 192 x 1TB disks (192TB raw) with good space utilization, however with some performance hot spots or bottlenecks and constrained growth, not to mention low IOPS per watt with reasonable capacity per watt. In the before scenario, the joint power draw (both arrays) is about 15 kW (15,000 watts), which translates to about $16,000 in annual energy costs (cooling excluded), assuming an energy cost of 12 cents per kWh.
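The annual energy figure can be checked with a quick calculation. This sketch assumes the two arrays draw a constant 15,000 watts around the clock for a 365-day year; `annual_energy_cost` is a hypothetical helper name.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost(power_kw: float, dollars_per_kwh: float) -> float:
    """Cost of running a constant load for one year (cooling excluded)."""
    return power_kw * HOURS_PER_YEAR * dollars_per_kwh


cost = annual_energy_cost(power_kw=15, dollars_per_kwh=0.12)
print(f"${cost:,.0f} per year")
```

That works out to 131,400 kWh a year, or a little under $16,000 at 12 cents per kWh, in line with the estimate in the example.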

    Note, your specific performance, availability, capacity and energy mileage will vary based on particular vendor solution, configuration along with your application characteristics.


    Figure 3: Baseline before and after storage optimization (raw hardware) example

    Building on the example in Figure 3, a combination of techniques along with technologies yields a net performance, capacity and perhaps feature functionality (depends on specific solution) increase. In addition, floor-space, power, cooling and associated footprints are also reduced. For example, the resulting solution shown (middle) comprises 4 x 250GB flash SSD devices, along with 32 x 450GB 15.5K RPM and 124 x 2TB 7200RPM disks, enabling a 53TB (raw) capacity increase along with a performance boost.

    The previous example is based on raw or baseline capacity metrics, meaning that further optimization techniques should yield improved benefits. These examples should also help to address the question or myth that it costs more to power storage than to buy it, to which the answer should be: it depends.

    If you can buy the above solution for, say, under $50,000 (the cost to power it) or under $100,000 (the cost to power and cool it) for three years, which would also be a good acquisition, then the myth that powering is more expensive than buying holds true. However, if a solution as described above costs more, then the story changes, along with other variables including energy costs for your particular location, reinforcing the notion that your mileage will vary.
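The three-year thresholds can be reproduced with a small sketch. It assumes the roughly $16,000 annual energy cost from the example and, for the power-and-cool case, that cooling accounts for about 50% of total energy as mentioned earlier in this post. The function names and the sample purchase prices are hypothetical.

```python
def energy_cost(annual_it_cost: float, years: int, cooling_share: float = 0.0) -> float:
    """Total energy cost over a number of years.

    cooling_share is the fraction of total energy that goes to cooling;
    0.0 means the cost of powering the storage alone.
    """
    if not 0.0 <= cooling_share < 1.0:
        raise ValueError("cooling_share must be in [0, 1)")
    return annual_it_cost * years / (1.0 - cooling_share)


def powering_costs_more(purchase_price: float, energy_total: float) -> bool:
    """True when the energy bill over the period exceeds the purchase price."""
    return energy_total > purchase_price


power_only = energy_cost(16_000, years=3)
power_and_cool = energy_cost(16_000, years=3, cooling_share=0.5)
print(f"power only: ${power_only:,.0f}, power and cool: ${power_and_cool:,.0f}")

# Assumed purchase prices, purely for illustration.
for price in (40_000, 150_000):
    print(price, powering_costs_more(price, power_and_cool))
```

The two totals come out at $48,000 and $96,000, which is where the roughly $50,000 and $100,000 thresholds come from; whether the myth holds then depends entirely on what the solution actually costs to acquire.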

    Another tip is that more is not always better.

    That is, more disks, ports, processors, controllers or cache do not always equate to better performance. Performance is the sum of how those and other pieces work together in a demonstrable way, ideally measured with your specific application workload rather than taken from a product data sheet.

    Additional general tips include:

    • Align the applicable tool, technique or technology to task at hand
    • Look to optimize for both performance and capacity, active and idle storage
    • Consolidated applications and servers need fast servers
    • Fast servers need fast I/O and storage devices to avoid bottlenecks
    • For active storage use an activity per watt metric such as IOPS or transactions per watt
    • For in-active or idle storage, a capacity per watt per footprint metric would apply
    • Gain insight and control of how storage resources are used to meet service requirements

    It should go without saying, however sometimes what is understood needs to be restated.

    In the quest to become more efficient and optimized, avoid introducing performance, quality of service or availability issues by moving problems.

    Likewise, look beyond storage space capacity also considering performance as applicable to become efficient.

    Finally, it is all relative in that what might be applicable to one environment or application need may not apply to another.

    Storage Efficiency and Optimization – The Other Green

    For those of you in the New York City area, I will be presenting The Other Green, Storage Efficiency and Optimization live in person at the Storage Decisions conference on September 23, 2009.

    Throw out the "green" buzzword, and you’re still left with the task of saving or maximizing use of space, power, and cooling while stretching available IT dollars to support growth and business sustainability. For some environments the solution may be consolidation, while others need to maintain quality of service, response time, performance and availability, necessitating faster, energy-efficient technologies to achieve optimization objectives.

    To address these and other related issues, you can turn to the cloud, virtualization, intelligent power management, data footprint reduction and data management, not to mention various types of tiered storage and performance optimization techniques. The session will look at various techniques and strategies to optimize on-line active or primary as well as near-line or secondary storage environments during tough economic times, as well as to position for future growth. After all, there is no such thing as a data recession!

    Topics, technologies and techniques that will be discussed include among others:

    • Energy efficiency (strategic) vs. energy avoidance (tactical), what's different between them
    • Optimization and the need for speed vs. the need for capacity, finding the right balance
    • Metrics & measurements for management insight, what the industry is doing (or not doing)
    • Tiered storage and tiered access including SSD, FC, SAS, tape, clouds and more
    • Data footprint reduction (archive, compress, dedupe) and thin provision among others
    • Best practices, financial incentives and what you can do today

    This is a free event for IT professionals, however I hear space is limited. Learn more and register here.

    For those interested in broader IT data center and infrastructure optimization, check out the on-going seminar series The Infrastructure Optimization and Planning Best Practices (V2.009) – Doing more with less without sacrificing storage, system or network capabilities Seminar series continues September 22, 2009 with a stop in Chicago. This is also a free Seminar, register and learn more here or here.

    Back to School and Dedupe School

    Summer is over here in the northern hemisphere and it's back to school time.

    This coming week I will be the substitute teacher filling in for my friend Mr. Backup in Minneapolis and Toronto for TechTarget's Dedupe School. If you are in either city and have not yet signed up, check out the link here to learn more.

    Hope to see you this week, or next week at Infrastructure Optimization in Chicago or Storage Decisions in NYC, where I will also be presenting (or teaching, if you prefer), as well as listening to and learning from the attendees about what's on their minds.

    Stay current on other upcoming activities on our events page, as well as see what's new or in the news here.

    Upcoming Out and About Events

    Following up on previous Out and About updates (here and here) of where I have been, here’s where I’m going to be over the next couple of weeks.

    On September 15th and 16th 2009, I will be the keynote speaker along with doing a deep dive discussion around data deduplication in Minneapolis, MN and Toronto ON. Free Seminar, register and learn more here.

    The Infrastructure Optimization and Planning Best Practices (V2.009) – Doing more with less without sacrificing storage, system or network capabilities Seminar series continues September 22, 2009 with a stop in Chicago. Free Seminar, register and learn more here.

    On September 23, 2009 I will be in New York City at Storage Decisions conference participating in the Ask the Experts during the expo session as well as presenting The Other Green — Storage Efficiency and Optimization.

    Throw out the "green" buzzword, and you’re still left with the task of saving or maximizing use of space, power, and cooling while stretching available IT dollars to support growth and business sustainability. For some environments the solution may be consolidation, while others need to maintain quality of service, response time, performance and availability, necessitating faster, energy-efficient technologies to achieve optimization objectives. To address these and other related issues, you can turn to the cloud, virtualization, intelligent power management, data footprint reduction and data management, not to mention various types of tiered storage and performance optimization techniques. The session will look at various techniques and strategies to optimize on-line active or primary as well as near-line or secondary storage environments during tough economic times, as well as to position for future growth. After all, there is no such thing as a data recession!

    Topics, technologies and techniques that will be discussed include among others:

    • Energy efficiency (strategic) vs. energy avoidance (tactical)
    • Optimization and the need for speed vs. the need for capacity
    • Metrics and measurements for management insight
    • Tiered storage and tiered access including SSD, FC, SAS and clouds
    • Data footprint reduction (archive, compress, dedupe) and thin provision
    • Best practices, financial incentives and what you can do today

    Free event, learn more and register here.

    Check out the events page for other upcoming events and hope to see you this fall while I'm out and about.

    Clarifying Clustered Storage Confusion

    Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

    Recently in the IT and data storage industry, there has been a flurry of merger and acquisition (M&A) (here and here), new product enhancement or announcement activity around clustered storage. For example, HP buying clustered file system vendor IBRIX, complementing its previous acquisitions of another clustered file system vendor (PolyServe) a few years ago and of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONstor, not to mention Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, along with other vendor acquisitions or announcements involving storage and clustering.

    Where the confusion comes into play is the term cluster, which means many things to different people, and even more so when clustered storage is combined with NAS or file-based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

    What this means is that a NFS or CIFS file system may only be active on one node at a time, however in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables a NFS or CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes for example supporting a popular website or file serving application, or, it may be for parallel reads or writes to a large sequential file.
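The difference between failover-only clustered NAS and a clustered file system can be illustrated with a toy model. This is a conceptual sketch only, not how any particular product works; the `NasCluster` class and the node names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NasCluster:
    """Toy model of which NAS nodes can serve a given file system."""
    nodes: list[str]
    clustered_fs: bool  # True: file system is active on all healthy nodes
    failed: set[str] = field(default_factory=set)

    def serving_nodes(self) -> list[str]:
        healthy = [n for n in self.nodes if n not in self.failed]
        if self.clustered_fs:
            return healthy      # concurrent access from every healthy node
        return healthy[:1]      # failover only: one node owns the file system


failover = NasCluster(["nas-a", "nas-b"], clustered_fs=False)
print(failover.serving_nodes())   # one active node
failover.failed.add("nas-a")
print(failover.serving_nodes())   # file system has shifted to the survivor

clustered = NasCluster(["nas-a", "nas-b"], clustered_fs=True)
print(clustered.serving_nodes())  # both nodes active concurrently
```

In the failover case adding nodes improves availability but not concurrent throughput for that file system, whereas in the clustered file system case every healthy node can serve it at once.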

    Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

    There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, enabling growth in a modular fashion by adding performance and intelligence capabilities along with capacity.

    For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

    Applications that lend themselves to clustered and bulk storage solutions include:

    • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
    • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
    • Users’ home directories and online file storage for documents and multimedia
    • Web-based managed service providers for online data storage, backup, and restore
    • Rich media data delivery, hosting, and social networking Internet sites
    • Media and entertainment creation, including animation rendering and post processing
    • High-performance databases such as Oracle with NFS direct I/O
    • Financial services and telecommunications, transportation, logistics, and manufacturing
    • Project-oriented development, simulation, and energy exploration
    • Low-cost, high-performance caching for transient and look-up or reference data
    • Real-time performance including fraud detection and electronic surveillance
    • Life sciences, chemical research, and computer-aided design

    Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

    The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

    Figure 1 – Generic clustered storage model (Courtesy "The Green and Virtual Data Center" (CRC))

    More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.
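
    To make the bottleneck point concrete, here is a deliberately simple Python sketch. The function name, its parameters and the numbers are all hypothetical, chosen only for illustration; real systems have many more interacting limits. It assumes every I/O request first needs a meta-data look-up, so a single meta-data node caps what the whole cluster can deliver.

```python
def cluster_iops(nodes, per_node_iops, metadata_iops_limit=None):
    """Estimate aggregate IOPS for a cluster (illustrative model only).

    nodes               -- number of nodes serving I/O
    per_node_iops       -- IOPS each node could deliver on its own
    metadata_iops_limit -- look-ups per second a single meta-data node
                           can handle, or None if meta-data is distributed
    """
    raw = nodes * per_node_iops
    if metadata_iops_limit is None:
        # No funnel: capacity scales with the number of nodes.
        return raw
    # Every request passes through one node, so that node is the ceiling.
    return min(raw, metadata_iops_limit)


for n in (4, 8, 16):
    print(n, cluster_iops(n, 10_000), cluster_iops(n, 10_000, 50_000))
```

    In this toy model, going from 8 to 16 nodes doubles the hardware but adds nothing once the meta-data funnel is saturated, which is why how the resources are deployed matters more than how many there are.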

    Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

    Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

    Additional considerations for clustered NAS storage solutions include the following.

    • Can memory, processors, and I/O devices be varied to meet application needs?
    • Is there support for large file systems supporting many small files as well as large files?
    • What is the performance for small random IOPS and bandwidth for large sequential I/O?
    • How is performance enabled across different applications in the same cluster instance?
    • Are I/O requests, including meta-data look-up, funneled through a single node?
    • How does a solution scale as the number of nodes and storage devices is increased?
    • How disruptive and time-consuming is adding new or replacing existing storage?
    • Is proprietary hardware needed, or can industry-standard servers and storage be used?
    • What data management features, including load balancing and data protection, exist?
    • What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
    • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

    As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

    To avoid performance surprises, be leery of performance claims based solely on speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed and how the storage management software enables those resources to avoid bottlenecks are more important. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion.

    Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

    "The Many faces of NAS – Which is appropriate for you?"

    Article: Clarifying Storage Cluster Confusion
    Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
    Video Interview: How to Scale Data Storage Systems with Clustering
    Guidelines for controlling clustering
    The benefits of clustered storage

    Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    March and Mileage Mania Wrap-up

    Today’s flight to Santa Ana (SNA), Orange County, California, for an 18 hour visit marks my 3rd trip to the left coast in the past four weeks, which started out with a trip to Los Angeles. The purpose of today’s trip is to deliver a talk on Business Continuance (BC) and Disaster Recovery (DR) topics for virtual server and storage environments along with related data transformation themes, part of a series of on-going events.

    Planned flight path from MSP to SNA, note upper Midwest snow storms (courtesy of Northwest Airlines, now part of Delta)

    This is a short trip to southern California in that I have to be back in Minneapolis for a Wednesday afternoon meeting followed by keynoting at an IT Infrastructure Optimization Seminar in downtown Minneapolis Thursday morning. Right after the Thursday morning session, it’s off to the other coast for some Friday morning and early afternoon sessions in the Boston area, the results of which I hope to be able to share with you in a not so distant future posting.

    Where has March gone? It’s been a busy and fun month out on the road with in-person seminars, vendor and user group events in Minneapolis, Los Angeles, Las Vegas, Milwaukee, Atlanta, St. Louis, Birmingham, Minneapolis for the CMG user group, Cincinnati and Orange County, not to mention some other meetings and consulting engagements elsewhere, including participating in a couple of webcasts and virtual conferences/seminars while on the road. Coverage and discussion around my new book "The Green and Virtual Data Center" (CRC) continues to expand, read here to see what’s being said.

    What has made the month fun, in addition to traveling around the country, is the interaction with the hundreds of IT professionals from organizations of all sizes, hearing what they are encountering, what their challenges are, what they are thinking, and in general what’s on their minds.

    Some of the common themes include:

  • There’s no such thing as a data recession, however the result is doing more with less, or, with what you have
  • Confusion abounds around green hype including carbon footprints vs. core IT and business issues
  • There is life beyond consolidation for server and storage virtualization to enable business agility
  • Security and encryption remain popular topics, as does heterogeneous and affordable key management
  • End to end IT resource management for virtual environments is needed that is scalable and affordable
  • Performance and quality of service cannot be sacrificed in the quest to drive up storage utilization
  • Clouds, SSD (FLASH), Dedupe, FCoE and Thin Provisioning among others are on the watch list
  • Tape continues to be used, complementing disks in tiered storage environments along with VTLs
  • Dedupe continues to be deployed and we are just seeing the very tip of the iceberg of opportunity
  • Software licensing cost savings or reallocation should be a next step focus for virtual environments
  • Now, for a bit of irony and humor, overheard was a server sales person talking to a storage sales person comparing notes on how they are missing their forecasts as their customers are buying fewer servers and storage now that they are consolidating with virtualization, or using disk dedupe to eliminate disk drives. Doh!!!

    Now if those sales people can get their marketing folks to get them the play book for virtualization for business agility, improving performance and enabling business growth in an optimized, transformed environment, they might be able to talk a different story with their customers for new opportunities…

    What’s on deck for April? More of the same, however also watch and listen for some additional web based content including interviews, quotes and perspectives on industry happenings, articles, tips and columns, reports, blogs, videos, podcasts, webcasts and twitter activity, as well as appearances at events in Boston, Chicago, New Jersey and Providence among other venues.

    To all of those who came out to the various events in March, thank you very much. I look forward to future follow-up conversations as well as seeing you at some of the upcoming events.

    Cheers gs

    On The Road Again: An Update

    A while back, I posted about a busy upcoming spring schedule of activity and events, and then a few weeks ago posted an update, so this can be considered the latest "On The Road Again" update. While the economy continues to be in rough condition, with job reductions or layoffs continuing, reductions in hours, and employees being asked to take time off without pay or to take sabbaticals, not to mention the race to get the economic stimulus bill passed, for many people business and life go on.

    Airport parking lots have plenty of cars in them, and airplanes, while not always full, are not empty (granted there has been some fleet optimization, aka aligning capacity to the best suited tier of aircraft, and other consolidation or capacity improvements). Many organizations are cutting back on travel and entertainment (T&E) spending, either to watch the top and bottom line, or to avoid being perceived or seen on the news as having employees going on junkets when they may in fact be going to conferences, seminars, conventions or other educational and related events to boost skills and seek out ways to improve business productivity.

    One of the reasons that I have a busy travel schedule in addition to my normal analyst and consulting activities is that many events and seminars are being scheduled close to, or in, the cities where IT professionals are located who might otherwise have T&E restrictions or other constraints on traveling to industry events, some of which are or will be impacted by recent economic and business conditions.

    Last week I was invited to attend and speak at the FujiFilm Executive Seminar. No private jets were used or seen; travel was via scheduled air carriers (coach air-fare). FujiFilm has a nice program for those interested in or involved with tape, whether for disk to tape backup, disk to disk to tape, long term archive, bulk storage or other scenarios involving the continued use and changing roles of tape as a green data storage medium for in-active or off-line data. Check out the FujiFilm TapePower Center portal.

    This past week I was in the big "D", that’s Dallas Texas to do another TechTarget Dinner event around the theme of BC/DR, Virtualization and IT optimization. The session was well attended by a diverse audience of IT professionals from around the DFW metroplex. Common themes included discussions about business and economic activity as well as the need to keep business and IT running even when budgets are being stretched further and further. Technology conversations included server and storage virtualization, tiered storage including SSD, fast FC and SAS disk drives, lower performance high capacity "fat" disk drives as well as tape not to mention tiered data protection, tiered servers and other related items.

    The Green Gap continues to manifest itself in that when asked, most people do not have Green IT initiatives, however, when asked they do have power, cooling, floor-space, environmental (PCFE) or business economic sustainability concerns, aka, the rest of the Green story.

    While some attendees have started to use some new technologies including dedupe, most I find are still using a combination of disk and tape, with some considering dedupe for the future for certain applications. Other technologies and trends being watched, though with concerns as to their stability and viability for enterprise use, include FLASH based SSD, cloud computing and thin provisioning among others. Common themes I hear from IT professionals are that these are technologies and tools to keep an eye on, or use on a selective basis, and are essentially tiered resources to have in a tool box of technologies to apply to different tasks to meet various service requirements. Hopefully the Cowboys can put a fraction of the amount of energy and interest into improving their environment that the Dallas area IT folks are applying to theirs, especially given the strained IT budgets vs. the budget that the Cowboys have to work with for their player personnel.

    I always find it interesting when talking to groups of IT professionals which tend to be enterprise, SME and SMB hearing what they are doing and looking at or considering which often is in stark contrast to some of the survey results on technology adoption trends one commonly reads or hears about. Hummm, nuff said, what say you?

    Hope to see you at one of the many upcoming events perhaps coming to a venue near you.

    Cheers gs

    Did someone forget to tell Dell that Tape is dead?

    Did someone forget to send a memo to Dell that magnetic tape is dead, or were they perhaps pre-occupied with other activities? Maybe nobody at Dell read the “virtual” or “fictional” memo that tape is dead?

    Ok, enough with the cynicism and joking around, tape is not dead (See recent Computerworld and Dell story) and Dell is one of several vendors including IBM who still find time to talk about tape as part of a solution to different customer and environment needs.

    Sure, tape might be in or heading into its golden years, or what can also be called the plateau of productivity (for customers) or profitability (for some vendors). Tape does not get the marketing dollars and media coverage, as it’s been around as a technology for a long time and there are cooler and niftier (techno term) things to discuss including disk based backup and data protection, CDP, VTLs, de-dupe debates, clusters, grids and clouds, FCoE vs. iSCSI, NAS, SAS, virtualization, OSD and pretty much anything except tape.

    However, the reality is that many organizations, particularly larger organizations, still use and rely on tape based data protection for backup/BC/DR as well as archive for compliance and non-compliance data retention or data preservation activities, in some cases complementing and co-existing with disk based solutions.

    Disk to disk (D2D) based backups and data protection certainly continue to gain adoption and deployments in both large and small environments, however, the shift to disk based data protection, or, clinging to tape with a death grip does not have to be, nor should it be an all or nothing value proposition, that is, they can and do co-exist for different uses and purposes leveraging the various economics and benefits of the technologies to address various tasks and requirements.

    New and emerging technologies certainly need to be discussed, dissected, developed and deployed as they are the future for maintaining and sustaining business growth via IT service delivery in an economical and reliable fashion; that is, apply the technologies that make economic and business sense at a given point in time to minimize risk while maximizing useful benefits to your business.

    Ok, nuff said.

    Cheers gs

    Links to Upcoming and Recent Webcasts and Videocasts

    Here are links to several recent and upcoming Webcasts and video casts covering a wide range of topics. Some of these free Webcasts and video casts may require registration.

    Industry Trends & Perspectives – Data Protection for Virtual Server Environments

    Next Generation Data Centers Today: What’s New with Storage and Networking

    Hot Storage Trends for 2008

    Expanding your Channel Business with Performance and Capacity Planning

    Top Ten I/O Strategies for the Green and Virtual Data Center

    Cheers
    Greg Schulz – StorageIO

    SMB capacity planning; Focusing on energy conservation

    Here’s a link to a new tip I wrote that is posted over at SearchSMBStorage on Capacity Planning and energy conservation.

    Here are some added links to other recent tips I wrote and posted at SearchSMBStorage:

    Improve your storage energy efficiency

    Data protection for virtual server environments

    Data footprint reduction for SMBs

    Is clustered NAS for SMBs?

    Ok, nuff said.

    Cheers gs

    Missing Dedupe Debate Detail!

    The de-dupe vendors like to debate details of their solutions, ranging from compression or de-dupe ratios, to hashing and caching algorithms, to processor vs. disk vs. memory, to in-band vs. out-of-band, pre or post processing among other items. At times the de-dupe debates can get more lively than a political debate or even the legendary storage virtualization debates of yesteryear.

    However, one item that an IT professional recently mentioned, and that is not being addressed or talked about during the de-dupe debates, is how IT customers will get around vendor lock-in. Never mind the usual lock-in debates of whose back-end storage or disk drives are used, whose server a de-dupe appliance's software runs on, and so forth.

    The real concern is how data in the future will be recoverable from a de-dupe solution, similar to how data can be recovered from tape today. Granted, this is an apples to oranges comparison at best. The only real similarity is that a backup or archive solution sends a data stream in a tar-ball, a backup or archive save set, or perhaps in a file format to the tape or de-dupe appliance. Then, the VTL or de-dupe appliance software puts the data into yet another format.

    Granted, not all tape media can be interchanged between different tape drives given format and generations, and of course the proper backup or archive application is needed to un-pack the data for use. Probably a more applicable apples to oranges comparison would be how IT personnel will get data back from a VTL (non de-duping) disk based storage system compared to getting data back from a VTL or de-dupe appliance.

    Today and for the foreseeable future the answer is simple: if your pain point is severe and you need the benefits of de-dupe, then the de-dupe software and appliance is your point of vendor lock-in. If vendor lock-in is a main concern, take your time, do your homework and due diligence for solutions that reduce lock-in or at least give a reasonable strategy for data access in the future.

    Welcome to the world of virtualized data and virtualized data protection. Here’s the golden rule for de-dupe: as with virtualization, whoever controls the software and management meta data controls the vendor lock-in. Good, bad or indifferent, that’s the harsh reality.
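
    To see why the meta data is the point of control, consider this minimal Python sketch of a hash-based de-dupe store. It is a toy under stated assumptions, not how any particular vendor's product works: the class name, the fixed 4-byte chunk size and the choice of SHA-256 are all made up for illustration. Unique chunks are stored once, and a per-backup "recipe" of chunk digests is the only record of how to put the original stream back together.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, purely for illustration


class DedupeStore:
    """Toy fixed-size-chunk de-dupe store (hypothetical, for illustration)."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes, each unique chunk kept once
        self.recipes = {}  # backup name -> ordered list of chunk digests

    def put(self, name, data):
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # skip chunks already held
            recipe.append(digest)
        self.recipes[name] = recipe

    def get(self, name):
        # Rebuilding the stream requires the recipe (the meta data).
        return b"".join(self.chunks[d] for d in self.recipes[name])


store = DedupeStore()
store.put("backup1", b"ABCDABCDABCDWXYZ")
print(len(store.chunks), "unique chunks stored for 4 chunks written")
print(store.get("backup1"))
```

    Without the recipes and the knowledge of how chunks were cut and hashed, the chunk pool is just an unordered pile of fragments, so whoever owns that format and software effectively owns access to the data.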

    For the record, I like de-dupe technology in general as part of an overall data footprint reduction strategy combined with archiving and real-time compression for on-line and off-line data. I see a very bright future for it moving forward. I also see many of the heavy thinking and heavy lifting issues to support large-scale deployments and processing getting addressed over time allowing de-dupe to move from mid markets to large-scale mainstream adoption.

    Now, back to your regularly scheduled de-dupe debate drama!

    Cheers
    gs