Which Enterprise HDD for Content Applications Different File Size Impact


Updated 1/23/2018

Which enterprise HDD to use with a content server platform: different file size impact.

Insight for effective server storage I/O decision making
Server StorageIO Lab Review

Which enterprise HDD to use for content servers

This is the fifth in a multi-part series (read part four here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is on the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDDs). This post looks at large and small file I/O processing.

File Performance Activity

Tip: Content solutions use files in various ways. Use the following to gain perspective on how various HDDs handle workloads similar to your specific needs.

Two separate file processing workloads were run (12), one with a relatively small number of large files, and another with a large number of small files. For the large file processing (table-3), 5 GByte sized files were created and then accessed via 128 KByte (128KB) sized I/O over a 10 hour period with 90% reads using 64 threads (workers). The large file workload simulates what might be seen with higher definition video, image or other content streaming.

(Note 12) File processing workloads were run using Vdbench 5.04 and file anchors, with sample script configurations below. Instead of Vdbench you could also use other tools such as sysbench or fio, among others.

VdbenchFSBigTest.txt
# Sample script for big files testing
fsd=fsd1,anchor=H:,depth=1,width=5,files=20,size=5G
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=128k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSBigTest.txt -m 16 -o Results_FSbig_H_060615

VdbenchFSSmallTest.txt
# Sample script for small files testing
fsd=fsd1,anchor=H:,depth=1,width=64,files=25600,size=16k
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=1k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSSmallTest.txt -m 16 -o Results_FSsmall_H_060615
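As a quick sanity check on the two scripts above, the following minimal Python sketch estimates how many files and how much data each fsd definition creates. It assumes that files are placed only in the lowest-level directories, so the directory count is width raised to the power of depth; the function name fsd_footprint is made up for this illustration and is not part of Vdbench.

```python
# Sketch: estimate file count and data footprint from the fsd parameters.
# Assumption: files live only in the lowest-level directories, so the
# number of directories holding files is width ** depth.

KIB = 1024
GIB = 1024 ** 3


def fsd_footprint(depth, width, files, size_bytes):
    """Return (directories, total_files, total_bytes) for one fsd definition."""
    directories = width ** depth
    total_files = directories * files
    return directories, total_files, total_files * size_bytes


# Values copied from the two sample scripts
big = fsd_footprint(depth=1, width=5, files=20, size_bytes=5 * GIB)
small = fsd_footprint(depth=1, width=64, files=25600, size_bytes=16 * KIB)

for name, (dirs, count, nbytes) in (("big", big), ("small", small)):
    print(f"{name}: {dirs} directories, {count:,} files, {nbytes / GIB:,.1f} GiB")
```

Under that assumption the big file test works out to 100 files (about 500 GiB) and the small file test to 1,638,400 files (about 25 GiB) per device.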

The 10% writes are intended to reflect some update activity for new content or other changes to content. Note that 128 KBytes per second translates to roughly 1 Mbps, while 128 MBytes per second is roughly 1 Gbps of streaming content such as higher definition video. However 4K video (not optimized) would require a higher speed as well as result in larger file sizes. Table-3 shows the performance during the large file access period, including average read and write rates with response times, CPU usage and bandwidth (MBps).

| Configuration | Avg. File Read Rate | Avg. Read Resp. Time (Sec.) | Avg. File Write Rate | Avg. Write Resp. Time (Sec.) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 580.7 | 107.9 | 64.5 | 19.7 | 52.2 | 35.5 | 72.6 | 8.1 |
| ENT 10K R1 | 455.4 | 135.5 | 50.6 | 44.6 | 34.0 | 22.7 | 56.9 | 6.3 |
| ENT CAP R1 | 285.5 | 221.9 | 31.8 | 19.0 | 43.9 | 28.3 | 37.7 | 4.0 |
| ENT 10K R10 | 690.9 | 87.21 | 76.8 | 48.6 | 35.0 | 21.8 | 86.4 | 9.6 |

Table-3 Performance summary for large file access operations (90% read)

Table-3 shows that for two-drive RAID 1, the Enterprise 15K drives deliver the fastest performance; however, a RAID 10 with four 10K HDDs with enhanced cache features provides a good price, performance and space capacity option. Software RAID was used in this workload test.

Figure-4 shows the relative performance of various HDD options handling large files. Keep in mind that for the response time line lower is better, while for the activity rate higher is better.

Figure-4 Large file processing 90% read, 10% write rate and response time

In figure-4 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with the number of file operations per second (reads solid blue column bar, writes green column bar). As a reminder, lower response times and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K HDDs).

Results in figure-4 above and table-4 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-4 below shows an analysis of average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-4 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example, RAID 10 using four 10K drives provides good performance and protection along with large usable space; however, that also comes at a higher price.

| Configuration | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost / Per GB Capacity | Cost / Per GB Usable (Protected) Cap. | Drive Cost (Multiple Drives) | Protection Overhead (Space Capacity for RAID) | Cost per usable GB per RPS | Avg. File Read Resp. (Sec.) |
|---|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 580.7 | $1.02 | $2.05 | $0.99 | $0.99 | $1,190 | 100% | $2.1 | 107.9 |
| ENT 10K R1 | 455.5 | $1.92 | $3.84 | $0.49 | $0.49 | $1,750 | 100% | $3.8 | 135.5 |
| ENT CAP R1 | 285.5 | $1.40 | $2.80 | $0.20 | $0.20 | $798 | 100% | $2.8 | 271.9 |
| ENT 10K R10 | 690.9 | $1.27 | $5.07 | $0.49 | $0.97 | $3,500 | 100% | $5.1 | 87.2 |

Table-4 Performance, capacity and cost analysis for big file processing
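The cost per RPS columns in Table-4 can be reproduced with simple division. Below is a minimal Python sketch that assumes the single drive price is the multi-drive cost divided evenly by the number of drives (two for RAID 1, four for RAID 10); the dictionary layout and the function name cost_per_rps are illustrative only.

```python
# Sketch: reproduce the "cost per RPS" columns of Table-4.
# Assumption: single drive price = total drive cost / number of drives.

configs = {
    "ENT 15K R1": {"drives": 2, "total_cost": 1190.0, "rps": 580.7},
    "ENT 10K R1": {"drives": 2, "total_cost": 1750.0, "rps": 455.5},
    "ENT CAP R1": {"drives": 2, "total_cost": 798.0, "rps": 285.5},
    "ENT 10K R10": {"drives": 4, "total_cost": 3500.0, "rps": 690.9},
}


def cost_per_rps(total_cost, rps, drives=1):
    """Dollars per file read per second, optionally for one drive of the set."""
    return total_cost / drives / rps


results = {}
for name, c in configs.items():
    single = cost_per_rps(c["total_cost"], c["rps"], c["drives"])
    multi = cost_per_rps(c["total_cost"], c["rps"])
    results[name] = (single, multi)
    print(f"{name}: single drive ${single:.2f}/RPS, multi-drive ${multi:.2f}/RPS")
```

The same calculation can be repeated with your own drive prices and measured read rates to compare options for a given workload.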

Small File Size Processing

To simulate a general file sharing environment, or content streaming with many smaller objects, 1,638,464 16KB sized files were created on each device being tested (table-5). These files were spread across 64 directories (25,600 files each) and accessed via 64 threads (workers) doing 90% reads with a 1KB I/O size over a ten hour time frame. As with the large file test and database activity, all workloads were run at the same time (e.g. test devices were concurrently busy).

| Configuration | Avg. File Read Rate | Avg. Read Resp. Time (Sec.) | Avg. File Write Rate | Avg. Write Resp. Time (Sec.) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 3,415.7 | 1.5 | 379.4 | 132.2 | 24.9 | 19.5 | 3.3 | 0.4 |
| ENT 10K R1 | 2,203.4 | 2.9 | 244.7 | 172.8 | 24.7 | 19.3 | 2.2 | 0.2 |
| ENT CAP R1 | 1,063.1 | 12.7 | 118.1 | 303.3 | 24.6 | 19.2 | 1.1 | 0.1 |
| ENT 10K R10 | 4,590.5 | 0.7 | 509.9 | 101.7 | 27.7 | 22.1 | 4.5 | 0.5 |

Table-5 Performance summary for small sized (16KB) file access operations (90% read)

Figure-5 shows the relative performance of various HDD options handling small files. Keep in mind that for the response time line lower is better, while for the activity rate higher is better.

Figure-5 Small file processing 90% read, 10% write rate and response time

In figure-5 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with the number of file operations per second (reads solid blue column bar, writes green column bar). As a reminder, lower response times and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K RPM), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K RPM HDDs) that has higher performance and capacity along with higher costs (table-6).

Results in figure-5 above and table-6 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-6 below shows an analysis of average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-6 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example, RAID 10 using four 10K drives provides good performance and protection along with large usable space; however, that also comes at a higher price.

| Configuration | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost / Per GB Capacity | Cost / Per GB Usable (Protected) Cap. | Drive Cost (Multiple Drives) | Protection Overhead (Space Capacity for RAID) | Cost per usable GB per RPS | Avg. File Read Resp. (Sec.) |
|---|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 3,415.7 | $0.17 | $0.35 | $0.99 | $0.99 | $1,190 | 100% | $0.35 | 1.51 |
| ENT 10K R1 | 2,203.4 | $0.40 | $0.79 | $0.49 | $0.49 | $1,750 | 100% | $0.79 | 2.90 |
| ENT CAP R1 | 1,063.1 | $0.38 | $0.75 | $0.20 | $0.20 | $798 | 100% | $0.75 | 12.70 |
| ENT 10K R10 | 4,590.5 | $0.19 | $0.76 | $0.49 | $0.97 | $3,500 | 100% | $0.76 | 0.70 |

Table-6 Performance, capacity and cost analysis for small file processing

Looking at the small file processing analysis in table-5 shows that the 15K HDDs on an apples to apples basis (e.g. same RAID level and number of drives) provide the best performance. However, when also factoring in space capacity, different RAID levels or other protection schemes along with cost, there are other considerations. The Enterprise Capacity 2TB HDDs, for example, have a low cost per capacity but do not have the performance of the other options, assuming your applications need more performance.

Thus the right HDD for one application may not be the best one for a different scenario, and multiple metrics as shown in table-5 need to be included in an informed storage decision making process.

Where To Learn More

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

File processing is a common content application task, with some files being small, others large or mixed, along with both reads and writes. Even if your content environment is using object storage, chances are that unless it is a new application or a gateway exists, you may be using NAS or file based access. Thus, if your applications are doing file based processing, it is important to either run your own applications or use tools that simulate as closely as possible what your environment is doing.

Continue reading part six in this multi-part series here where the focus is on general I/O including 8KB and 128KB sized IOPs along with associated metrics.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Which Enterprise HDD for Content Applications General I/O Performance


Updated 1/23/2018

Which enterprise HDD to use with a content server platform: general I/O performance.

This is the sixth in a multi-part series (read part five here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is on the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDDs). This post looks at general I/O performance including 8KB and 128KB I/O sizes.

General I/O Performance

In addition to running database and file (large and small) processing workloads, Vdbench was also used to run basic small (8KB) and large (128KB) sized I/O operations. These consisted of random and sequential reads as well as writes, with the results shown below. In addition to Vdbench, other tools that could be used include Microsoft Diskspd, fio, iorate and iometer among many others.

These workloads used Vdbench configured (13) to do direct I/O to a Windows file system mounted device using as much of the available disk space as possible. All workloads used 16 threads and were run concurrently, similar to the database and file processing tests.

(Note 13) Sample vdbench configuration for general I/O, note different settings were used for various tests

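Table-7 shows workload results for 8KB random I/O with 75% reads and 25% reads (i.e. 75% writes), including IOPs, bandwidth and response time.

| Configuration | Read mix | I/O Rate (IOPs) | MB/sec | Resp. Time (Sec.) |
|---|---|---|---|---|
| ENT 15K RAID1 | 75% Read | 597.11 | 4.7 | 25.9 |
| ENT 15K RAID1 | 25% Read | 559.26 | 4.4 | 27.6 |
| ENT 10K RAID1 | 75% Read | 514 | 4.0 | 30.2 |
| ENT 10K RAID1 | 25% Read | 475 | 3.7 | 32.7 |
| ENT CAP RAID1 | 75% Read | 285 | 2.2 | 55.5 |
| ENT CAP RAID1 | 25% Read | 293 | 2.3 | 53.7 |
| ENT 10K R10 (4 Drives) | 75% Read | 979 | 7.7 | 16.3 |
| ENT 10K R10 (4 Drives) | 25% Read | 984 | 7.7 | 16.3 |
| ECAP SW RAID (5 Drives) | 75% Read | 491 | 3.8 | 32.6 |
| ECAP SW RAID (5 Drives) | 25% Read | 644 | 5.0 | 24.8 |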

Table-7 8KB sized random IOPs workload results
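One way to cross-check the response times in Table-7 against the I/O rates is Little's Law (average outstanding I/Os = throughput x response time). The sketch below is my own illustration, not part of the lab report, and assumes each of the 16 threads keeps exactly one I/O in flight. Under that assumption the implied latencies land close to the reported values if those are read as milliseconds, which suggests the column may be in milliseconds rather than seconds.

```python
# Sketch: Little's Law check of Table-7 (75% read column).
# Assumption: 16 threads, each with one outstanding I/O at a time.

THREADS = 16


def implied_response_ms(outstanding_ios, iops):
    """Average response time in milliseconds implied by Little's Law."""
    return outstanding_ios / iops * 1000.0


# (configuration, IOPs, response time as reported in Table-7)
rows = [
    ("ENT 15K RAID1", 597.11, 25.9),
    ("ENT 10K RAID1", 514, 30.2),
    ("ENT CAP RAID1", 285, 55.5),
    ("ENT 10K R10", 979, 16.3),
    ("ECAP SW RAID", 491, 32.6),
]

for name, iops, reported in rows:
    implied = implied_response_ms(THREADS, iops)
    print(f"{name}: implied {implied:.1f} ms, reported {reported}")
```

The same check works for your own test results: if implied and measured latency differ widely, the queue depth or thread count is probably not what you think it is.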

Figure-6 shows small (8KB) random I/O (75% read and 25% read) across different HDD configurations. Performance including activity rates (e.g. IOPs), bandwidth and response time for mixed reads / writes are shown. Note how response time increases with the Enterprise Capacity configurations vs. other performance optimized drives.

Figure-6 8KB random reads and write showing IOP activity, bandwidth and response time

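Table-8 below shows workload results for 8KB sized I/Os, 100% sequential, with 75% reads and 25% reads (i.e. 75% writes), including IOPs, MB/sec and response time in seconds.

| Configuration | Read mix | I/O Rate (IOPs) | MB/sec | Resp. Time (Sec.) |
|---|---|---|---|---|
| ENT 15K RAID1 | 75% Read | 3,778 | 29.5 | 2.2 |
| ENT 15K RAID1 | 25% Read | 3,414 | 26.7 | 3.1 |
| ENT 10K RAID1 | 75% Read | 3,761 | 29.4 | 2.3 |
| ENT 10K RAID1 | 25% Read | 3,986 | 31.1 | 2.4 |
| ENT CAP RAID1 | 75% Read | 3,379 | 26.4 | 2.7 |
| ENT CAP RAID1 | 25% Read | 1,274 | 10.0 | 10.9 |
| ENT 10K R10 (4 Drives) | 75% Read | 11,840 | 92.5 | 1.3 |
| ENT 10K R10 (4 Drives) | 25% Read | 8,368 | 65.4 | 1.9 |
| ECAP SW RAID (5 Drives) | 75% Read | 2,891 | 22.6 | 5.5 |
| ECAP SW RAID (5 Drives) | 25% Read | 1,146 | 9.0 | 14.0 |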

Table-8 8KB sized sequential workload results

Figure-7 shows small 8KB sequential mixed reads and writes (75% read and 25% read). While the Enterprise Capacity 2TB HDD has a large amount of space capacity, its performance in RAID 1 is slower than other similarly configured drives.

Figure-7 8KB sequential 75% reads and 25% reads showing bandwidth activity

Table-9 shows workload results for 100% sequential, 100% read and 100% write 128KB sized I/Os including IOPs, bandwidth and response time.

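| Configuration | Operation | I/O Rate (IOPs) | MB/sec | Resp. Time (Sec.) |
|---|---|---|---|---|
| ENT 15K RAID1 | Read | 1,798 | 224.7 | 8.9 |
| ENT 15K RAID1 | Write | 1,771 | 221.3 | 9.0 |
| ENT 10K RAID1 | Read | 1,716 | 214.5 | 9.3 |
| ENT 10K RAID1 | Write | 1,688 | 210.9 | 9.5 |
| ENT CAP RAID1 | Read | 921 | 115.2 | 17.4 |
| ENT CAP RAID1 | Write | 912 | 114.0 | 17.5 |
| ENT 10K R10 (4 Drives) | Read | 3,552 | 444.0 | 4.5 |
| ENT 10K R10 (4 Drives) | Write | 3,486 | 435.8 | 4.6 |
| ECAP SW RAID (5 Drives) | Read | 780 | 97.4 | 19.3 |
| ECAP SW RAID (5 Drives) | Write | 721 | 90.1 | 20.2 |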

Table-9 128KB sized sequential workload results

Figure-8 shows sequential or streaming operations with larger I/O request sizes (128KB, 100% read and 100% write) that would be found with large content applications. Figure-8 highlights the relationship between lower response time and increased IOPs as well as bandwidth.

Figure-8 128KB sequential reads and write showing IOP activity, bandwidth and response time
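The relationship between IOPs and bandwidth in Tables 7 through 9 is simply the I/O rate multiplied by the I/O size. A minimal Python sketch follows, assuming 1 MB = 1,024 KB (which matches the reported figures); the function name bandwidth_mb_per_sec is illustrative.

```python
# Sketch: bandwidth implied by an I/O rate and I/O size.
# Assumption: 1 MB = 1,024 KB, which matches the MB/sec values in the tables.


def bandwidth_mb_per_sec(iops, io_size_kb):
    """Convert an I/O rate and I/O size in KB into MB per second."""
    return iops * io_size_kb / 1024.0


# (description, IOPs, I/O size in KB, MB/sec as reported)
samples = [
    ("ENT 15K RAID1, 128KB sequential read", 1798, 128, 224.7),
    ("ENT 10K R10, 128KB sequential read", 3552, 128, 444.0),
    ("ENT 10K R10, 8KB sequential 75% read", 11840, 8, 92.5),
    ("ENT 15K RAID1, 8KB random 75% read", 597.11, 8, 4.7),
]

for label, iops, size_kb, reported in samples:
    computed = bandwidth_mb_per_sec(iops, size_kb)
    print(f"{label}: computed {computed:.1f} MB/sec, reported {reported}")
```

This is why a drive can post a high IOPs number with small I/Os yet move far less data per second than the same drive doing fewer, larger I/Os.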


What This All Means

Some content applications are doing small random I/Os for database, key value stores or repositories as well as metadata processing, while others are doing large sequential I/O. 128KB sized I/O may be large for your environment; on the other hand, with an increasing number of applications, file systems and software defined storage management tools among others, 1 to 10MB or even larger I/O sizes are becoming common. The key is selecting I/O sizes, read/write and random/sequential mixes, along with I/O or queue depths that align with your environment.

Continue reading part seven, the final post in this multi-part series, here where the focus is on how HDDs continue to evolve, including performance beyond traditional RPM based expectations, along with a wrap up.

Ok, nuff said, for now.

Gs


HDDs evolve for Content Application servers


Updated 1/23/2018

Enterprise HDDs evolve for content server platform


This is the seventh and final post in this multi-part series (read part six here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is on the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDDs). This post compares how HDDs continue to evolve over various generations, boosting performance as well as capacity and reliability. It also looks at how there is more to HDD performance than the traditional focus on Revolutions Per Minute (RPM) as a speed indicator.

Comparing Different Enterprise 10K And 15K HDD Generations

There is more to HDD performance than the RPM speed of the device. RPM plays an important role; however, there are other things that impact HDD performance. A common myth is that HDD performance has not improved over the past several years with each successive generation. Table-10 shows a sampling of various generations of enterprise 10K and 15K HDDs (14), including different form factors, and how their performance continues to improve.

Figure-9 10K and 15K HDD performance improvements

Figure-9 shows how performance continues to improve with 10K and 15K HDDs with each new generation, including those with enhanced cache features. With improvements in cache software within the drives, along with enhanced persistent non-volatile memory (NVM) and incremental mechanical drive improvements, both read and write performance continue to be enhanced.

Figure-9 puts into perspective the continued performance enhancements of HDDs by comparing various enterprise 10K and 15K devices. The workload is the same TPC-C test used earlier in a similar configuration (14), with no RAID. Figure-9 shows 100 simulated users accessing a database on each of the different drives, all running concurrently. The older 15K 3.5” Cheetah and 2.5” Savio drives had a capacity of 146GB, so they used a database scale factor of 1500 (134GB). All other drives used a scale factor of 3000 (276GB). Figure-9 also highlights the improvements in both TPS performance and response time with new HDDs, including those with the performance enhanced cache feature.

The TPC-C activity used Benchmark Factory with a similar setup and configuration to those used earlier, including a multi-socket, multi-core Windows 2012 R2 server supporting a Microsoft SQL Server 2012 database, with a database for each drive type.

| Drive | TPS (TPC-C) 1 user | TPS 20 users | TPS 50 users | TPS 100 users | Resp. Time (Sec.) 1 user | Resp. 20 users | Resp. 50 users | Resp. 100 users |
|---|---|---|---|---|---|---|---|---|
| ENT 10K V3 2.5" | 14.8 | 50.9 | 30.3 | 39.9 | 0.0 | 0.4 | 1.6 | 1.7 |
| ENT (Cheetah) 15K 3.5" | 14.6 | 51.3 | 27.1 | 39.3 | 0.0 | 0.3 | 1.8 | 2.1 |
| ENT 10K 2.5" (with cache) | 19.2 | 146.3 | 72.6 | 71.0 | 0.0 | 0.1 | 0.7 | 0.0 |
| ENT (Savio) 15K 2.5" | 15.8 | 59.1 | 40.2 | 53.6 | 0.0 | 0.3 | 1.2 | 1.2 |
| ENT 15K V4 2.5" | 19.7 | 119.8 | 75.3 | 69.2 | 0.0 | 0.1 | 0.6 | 1.0 |
| ENT 15K (enhanced cache) 2.5" | 20.1 | 184.1 | 113.7 | 122.1 | 0.0 | 0.1 | 0.4 | 0.2 |

Table-10 Continued Enterprise 10K and 15K HDD performance improvements

(Note 14) 10K and 15K generational comparisons were run on a separate comparable server to what was used for other test workloads. Workload configuration settings were the same as other database workloads including using Microsoft SQL Server 2012 on a Windows 2012 R2 system with Benchmark Factory driving the workload. Database memory size was reduced, however, to only 8GB vs. the 16GB used in other tests.
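To put the generational differences in Table-10 into a single number, the sketch below computes each drive's TPS at 100 users relative to the older 3.5" 15K Cheetah. The choice of baseline and the names used are mine for illustration; keep in mind that the two oldest drives ran a smaller database scale factor, so treat the ratios as indicative rather than exact.

```python
# Sketch: relative TPS at 100 users vs. the older 3.5" 15K drive (Table-10).
# Caveat: the two 146GB drives used scale factor 1500, the others 3000.

tps_at_100_users = {
    'ENT 10K V3 2.5"': 39.9,
    'ENT (Cheetah) 15K 3.5"': 39.3,
    'ENT 10K 2.5" (with cache)': 71.0,
    'ENT (Savio) 15K 2.5"': 53.6,
    'ENT 15K V4 2.5"': 69.2,
    'ENT 15K (enhanced cache) 2.5"': 122.1,
}

BASELINE = 'ENT (Cheetah) 15K 3.5"'  # baseline picked for this illustration

speedup = {
    drive: tps / tps_at_100_users[BASELINE]
    for drive, tps in tps_at_100_users.items()
}

for drive, ratio in sorted(speedup.items(), key=lambda item: item[1]):
    print(f"{drive}: {ratio:.2f}x")
```

Note that the 10K drive with cache comes out ahead of both older 15K drives, which is the point about looking beyond RPM.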


What This All Means

A little bit of flash in the right place with applicable algorithms goes a long way, an example being the Seagate Enterprise HDDs with enhanced cache feature. Likewise, HDDs are very much alive, complementing SSDs and vice versa. For high-performance content application workloads, flash SSD solutions including NVMe, 12Gbps SAS and 6Gbps SATA devices are cost-effective solutions. HDDs continue to be cost-effective data storage devices both for capacity and for environments that do not need the performance of flash SSD.

For some environments, using a combination of flash and HDDs complementing each other along with cache software can be a cost-effective solution. The previous workload examples provide insight for making cost-effective, informed storage decisions.

Evaluate today’s HDDs on their effective performance running workloads as similar as possible to your own, or actually try them out with your applications. Today there is more to HDD performance than just RPM speed, particularly with the Seagate Enterprise Performance 10K and 15K HDDs with enhanced caching feature.

The Enterprise Performance 10K with enhanced cache feature provides a good balance of capacity and performance while being cost-effective. If you are using older 3.5” 15K or even previous generation 2.5” 15K RPM and “non-performance enhanced” HDDs, take a look at how the newer generation HDDs perform, looking beyond the RPM of the device.

Fast content applications need fast content and flexible content solution platforms such as those from Servers Direct and HDDs from Seagate. Key to a successful content application deployment is having the flexibility to hardware define and software define the platform to meet your needs. Just as there are many different types of content applications along with diverse environments, content solution platforms need to be flexible, scalable and robust, not to mention cost-effective.

Ok, nuff said, for now.

Gs


As the platters spin, HDD’s for cloud, virtual and traditional storage environments


Updated 1/23/2018

As the platters spin is a follow-up to a recent series of posts on Hard Disk Drives (HDDs) along with some posts about how many IOPS HDDs can do.

HDD and storage trends and directions include, among others:

HDDs will continue to be declared dead into the next decade, just as they have been for over a decade; meanwhile they are being enhanced and continue to be used in evolving roles.


SSD will continue to coexist with HDD, either as separate devices or converged as HHDDs. When, where and how they are used will also continue to evolve. High I/O (IOPS) or low latency activity will continue to move to some form of NAND flash SSD (with PCM around the corner), while storage capacity, including some of which has been on tape, stays on disk. Instead of more HDD capacity in a server, it moves to a SAN or NAS or to a cloud or service provider. This includes backup/restore, BC, DR, archive and online reference, or what some call active archives.

The need for storage spindle speed and more

The need for faster revolutions per minute (RPM) performance of drives (e.g. platter spin speed) is being replaced by SSD and more robust smaller form factor (SFF) drives. For example, some of today’s 2.5” SFF 10,000 RPM (e.g. 10K) SAS HDDs can do as well or better than their larger 3.5” 15K predecessors for both IOPS and bandwidth. This is also an example where the RPM speed of a drive may not be the only determinant of performance as it has been in the past.


Performance comparison of four different drive types

The need for storage space capacity and areal density

In terms of storage enhancements, watch for the appearance of Shingled Magnetic Recording (SMR) enabled HDDs to help further boost the space capacity in the same footprint. Using SMR, HDD manufacturers can put more bits (e.g. areal density) into the same physical space on a platter.


Traditional vs. SMR to increase storage areal density capacity

The generic idea with SMR is to increase areal density (how many bits can be safely stored per square inch) of data placed on spinning disk platter media. In the above image on the left is a representative example of how traditional magnetic disk media lays down tracks next to each other. With traditional magnetic recording approaches, the tracks are placed as close together as possible for the write heads to safely write data.

With new recording formats such as SMR along with improvements to read/write heads, the tracks can be more closely grouped together in an overlapping way. This overlapping way (used in a generic sense) is like how the shingles on a roof overlap, hence Shingled Magnetic Recording. Other magnetic recording or storage enhancements in the works include Heat Assisted Magnetic Recording (HAMR) and helium-filled drives. Thus, there is still plenty of room for bits and bytes growth in HDDs well into the next decade to co-exist with and complement SSDs.

DIF and AF (Advanced Format), or software defining the drives

Another evolving storage feature that ties into HDDs is Data Integrity Field (DIF), which has a couple of different types. Depending on which type of DIF (0, 1, 2, and 3) is used, there can be added data integrity checks from the application to the storage medium or drive beyond normal functionality. Here is something to keep in mind: as there are different types or levels of DIF, when somebody says they support or need DIF, ask them which type or level as well as why.

Are you familiar with Advanced Format (AF)? If not, you should be. Traditionally, outside of special formats for some operating systems or controllers, the standard open system data storage block, page or sector has been 512 bytes. This has served well in the past; however, with the advent of TByte and larger sized drives, a new mechanism is needed. The need is to support both larger average data allocation sizes from operating systems and storage systems, as well as to cut the overhead of managing all the small sectors. Operating systems and file systems have added new partitioning features such as GUID Partition Table (GPT) to support 1TB and larger SSDs, HDDs and storage system LUNs.

These enhancements are enabling larger devices to be used in place of traditional Master Boot Record (MBR) or other operating system partition and allocation schemes. The next step, however, is to teach operating systems, file systems, and hypervisors along with their associated tools or drives how to work with 4,096 byte or 4 Kbyte sectors. The advantage will be to cut the overhead of tracking all of those smaller sectors or file system extents and clusters. Today many HDDs support AF; however, by default they may have 512-byte emulation mode enabled due to lack of operating system or other support.
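To make the sector overhead point concrete, here is a small Python sketch of the arithmetic. It is my own illustration: the 4TB example drive uses decimal terabytes, and the MBR limit shown assumes the classic 32-bit sector addresses with 512-byte sectors.

```python
# Sketch: how many sectors a drive exposes at 512 bytes vs. 4,096 bytes,
# and the addressing limit of classic MBR partitioning.
# Assumptions: decimal TB for drive capacity; MBR uses 32-bit sector addresses.

SECTOR_512 = 512
SECTOR_4K = 4096
TB = 10 ** 12


def sector_count(capacity_bytes, sector_bytes):
    """Number of whole sectors needed to address the given capacity."""
    return capacity_bytes // sector_bytes


capacity = 4 * TB
sectors_512 = sector_count(capacity, SECTOR_512)
sectors_4k = sector_count(capacity, SECTOR_4K)

# Largest capacity addressable with 32-bit sector numbers and 512-byte sectors
MBR_LIMIT_BYTES = 2 ** 32 * SECTOR_512

print(f"4TB drive, 512-byte sectors: {sectors_512:,}")
print(f"4TB drive, 4K sectors:       {sectors_4k:,}")
print(f"Reduction in sectors to track: {sectors_512 // sectors_4k}x")
print(f"MBR limit with 512-byte sectors: {MBR_LIMIT_BYTES / 2 ** 40:.0f} TiB")
```

Moving from 512-byte to 4K sectors cuts the number of sectors to track by a factor of eight for the same capacity.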

Intelligent Power Management, moving beyond drive spin down

Intelligent Power Management (IPM) is a collection of techniques that can be applied to vary the amount of energy consumed by a drive, controller or processor to do its work. In the case of an HDD these include slowing the spin rate of the platters; however, keep in mind that mass in motion tends to stay in motion. This means that HDDs, once up and spinning, do not need as much relative power, as they function like a flywheel. Where their power draw comes in is during reads and writes, in part due to the movement of the read/write heads, but also for running the processors and electronics that control the device. Another big power consumer is when drives spin up; thus if they can be kept moving, although at a lower rate, along with disabling energy used by read/write heads and their electronics, you can see a drop in power consumption. Btw, a current generation 3.5” 4TB 6Gbps SATA HDD consumes about 6-7 watts of power while in active use, or less when in idle mode. Likewise a current generation high performance 2.5” 1.2TB HDD consumes about 4.8 watts of energy, a far cry from the 12-16 plus watts of energy some use as HDD FUD.

Hybrid Hard Disk Drives (HHDD) and Solid State Hybrid Drives (SSHD)

Hybrid HDDs (HHDDs), also known as Solid State Hybrid Drives (SSHD), have been around for a while, and if you have read my earlier posts you know that I have been a user and fan of them for several years. However, one of the drawbacks of HHDDs has been the lack of write acceleration with some models (e.g. they only optimize for reads). Current and emerging HHDDs are appearing with a mix of NAND flash SLC (used in earlier versions), MLC and eMLC along with DRAM while enabling write optimization. There are also more drive options available as HHDDs from different manufacturers for both desktop and enterprise class scenarios.

The challenge with HHDDs is that many vendors either do not understand how they fit and complement their tiering or storage management software tools, or simply do not see the value proposition. I have had vendors and others tell me that HHDDs don’t make sense as they are too simple: how can they be a fit without requiring tiering software, controllers, SSDs and HDDs to be viable?

I also see a trend similar to when the desktop high-capacity SATA drives appeared for enterprise-class storage systems in the early 2000s. Some of the same people did not see where or how a desktop class product or technology could ever be used in an enterprise solution.

Hmm, hey wait a minute, I seem to recall similar thinking when SCSI drives appeared in the early 90s. Funny how some things do not change, déjà vu anybody?

Does that mean HHDD’s will be used everywhere?

Not necessarily, however, there will be places where they make sense, others where either an HDD or SSD will be more practical.

Networking with your server and storage

Drive native interfaces near-term will remain as 6Gbs (going to 12Gbs) SAS and SATA with some FC (you might still find a parallel SCSI drive out there). Likewise, with bridges or interface cards, those drives may appear as USB or something else.

What about SCSI over PCIe, will that catch on as a drive interface? Tough to say, however I am sure we can find some people who will gladly try to convince you of that. FC based drives operating at 4Gbs FC (4GFC) are still being used in some environments, however most activity is shifting over to SAS and SATA. SAS and SATA are switching over from 3Gbs to 6Gbs, with 12Gbs SAS on the roadmaps.
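
For a sense of what those line rates mean per drive port, here is a small Python sketch that converts a nominal link speed into an upper bound on payload bandwidth. It assumes the 8b/10b encoding used by 3Gbs and 6Gbs SAS and SATA as well as 12Gbs SAS, and it ignores protocol and command overhead, so treat the results as ceilings and not as what a drive will sustain. The function name `usable_mb_per_sec` is mine, for illustration only.

```python
# Upper bound on payload bandwidth for a serial drive interface.
# Assumption: 8b/10b line encoding (8 data bits carried in every 10 line
# bits); framing and protocol overhead are ignored.


def usable_mb_per_sec(line_rate_gbps: float,
                      data_bits: int = 8,
                      line_bits: int = 10) -> float:
    """Convert a nominal line rate in Gbit/s to payload MBytes/s."""
    payload_bits_per_sec = line_rate_gbps * 1e9 * data_bits / line_bits
    return payload_bits_per_sec / 8 / 1e6  # bits -> bytes -> MBytes


if __name__ == "__main__":
    for rate in (3, 6, 12):
        mb = usable_mb_per_sec(rate)
        print(f"{rate}Gbs link: up to about {mb:.0f} MBytes/sec per port")
```

Keep in mind that a single HDD rarely saturates even the slower links on sustained transfers, so the interface speed matters most for SSDs, hybrids, cache hits and shared links behind expanders.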

So which drive is best for you?

That depends; do you need bandwidth or IOPS, low latency or high capacity, a small low profile thin form factor or feature functions? Do you need a hybrid, an all SSD, or a self-encrypting device (SED) also known as Instant Secure Erase (ISE)? These are among your various options.

Why the storage diversity?

Simple, some are legacy, soon to be replaced and disposed of, while others are newer. I also have a collection, so to speak, that gets used for various testing, research, learning and trying things out. Click here and here to read about some of the ways I use various drives in my VMware environment, including creating Raw Device Mapped (RDM) local SAS and SATA devices.

Other capabilities and functionality existing in or being added to HDD’s include RAID and data copy assist, secure erase, self-encryption and vibration dampening, among other abilities for supporting dense data environments.

Where To Learn More

Additional learning experiences, along with common questions (and answers) as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What This All Means

Do not judge a drive only by its interface, space capacity, cost or RPM alone. Look under the cover a bit to see what is inside in terms of functionality, performance, and reliability among other options to fit your needs. After all, in the data center or information factory not everything is the same.

From a marketing and fun-to-talk-about new technology perspective, HDD’s might be dead for some. The reality is that they are very much alive in physical, virtual and cloud environments, granted their role is changing.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.