Part 4 – Which HDD for Content Applications – Database Workloads

Updated 1/23/2018
Which enterprise HDD to use with a content server platform for database workloads

Insight for effective server storage I/O decision making
Server StorageIO Lab Review

Which enterprise HDD to use for content servers

This is the fourth in a multi-part series (read part three here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is looking at the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDD’s). In this post the focus expands to database application workloads that were run to test various HDD’s.

Database Reads/Writes

Transaction Processing Performance Council (TPC) TPC-C like workloads were run against the SUT from the STI. These workloads simulated transactional, content management, meta-data and key-value processing. Microsoft SQL Server 2012 was configured and used with databases (each 470GB, e.g. scale 6000) created, and the workload was generated by virtual users via Dell Benchmark Factory (running on the STI with Windows 2012 R2).

A single SQL Server database instance (8) was used on the SUT, however unique databases were created for each HDD set being tested. Both the main database file (.mdf) and the log file (.ldf) were placed on the same drive set being tested; keep in mind the constraints mentioned above. As time was a constraint, database workloads were run concurrently (9) with each other except for the Enterprise 10K RAID 1 and RAID 10. One workload was run with two 10K HDD’s in a RAID 1 configuration, then another workload was run with a four drive RAID 10. In a production environment, ideally the .mdf and .ldf would be placed on separate HDD’s and SSDs.

To improve cache buffering the SQL Server database instance memory could be increased from 16GB to a larger number that would yield higher TPS numbers. Keep in mind the objective was not to see how fast I could make the databases run, rather how the different drives handled the workload.

(Note 8) The SQL Server Tempdb was placed on a separate NVMe flash SSD. Also, the database instance memory size was set to 16GB, which was shared by all databases and virtual users accessing it.

(Note 9) Each user step was run for 90 minutes with a 30 minute warm-up preamble to measure steady-state operation.
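To illustrate what measuring steady-state operation can look like, here is a minimal Python sketch that averages only the readings taken after a warm-up window. The function name, the sample readings and the 30 second reporting interval are assumptions made up for illustration; they are not taken from the Benchmark Factory setup used in the lab.

```python
from statistics import mean


def steady_state_mean(samples, interval_sec, warmup_sec):
    """Average the readings recorded after the warm-up window has elapsed.

    samples      -- per-interval TPS readings, oldest first (hypothetical data)
    interval_sec -- seconds between readings (assumed reporting interval)
    warmup_sec   -- length of the warm-up preamble to discard
    """
    skip = warmup_sec // interval_sec  # number of readings inside the warm-up
    steady = samples[skip:]
    if not steady:
        raise ValueError("no samples left after discarding the warm-up window")
    return mean(steady)


# Hypothetical readings: a ramp-up followed by a stable plateau.
readings = [50.0, 120.0, 200.0, 360.0, 362.0, 361.0, 363.0]
print(steady_state_mean(readings, interval_sec=30, warmup_sec=90))
```

Discarding the ramp-up keeps cache warming and connection start-up from pulling the reported average down.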

| Config | Users | TPC-C Like TPS | Single Drive Cost per TPS | Drive Cost per TPS | Single Drive Cost per GB Raw Cap. | Cost per GB Usable (Protected) Cap. | Drive Cost (Multiple Drives) | Protection Space Overhead | Cost per Usable GB per TPS | Resp. Time (Sec.) |
|---|---|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 1 | 23.9 | $24.94 | $49.89 | $0.99 | $0.99 | $1,190 | 100% | $49.89 | 0.01 |
| ENT 10K R1 | 1 | 23.4 | $37.38 | $74.77 | $0.49 | $0.49 | $1,750 | 100% | $74.77 | 0.01 |
| ENT CAP R1 | 1 | 16.4 | $24.26 | $48.52 | $0.20 | $0.20 | $798 | 100% | $48.52 | 0.03 |
| ENT 10K R10 | 1 | 23.2 | $37.70 | $150.78 | $0.49 | $0.97 | $3,500 | 100% | $150.78 | 0.07 |
| ENT CAP SWR5 | 1 | 17.0 | $23.45 | $117.24 | $0.20 | $0.25 | $1,995 | 20% | $117.24 | 0.02 |
| ENT 15K R1 | 20 | 362.3 | $1.64 | $3.28 | $0.99 | $0.99 | $1,190 | 100% | $3.28 | 0.02 |
| ENT 10K R1 | 20 | 339.3 | $2.58 | $5.16 | $0.49 | $0.49 | $1,750 | 100% | $5.16 | 0.01 |
| ENT CAP R1 | 20 | 213.4 | $1.87 | $3.74 | $0.20 | $0.20 | $798 | 100% | $3.74 | 0.06 |
| ENT 10K R10 | 20 | 389.0 | $2.25 | $9.00 | $0.49 | $0.97 | $3,500 | 100% | $9.00 | 0.02 |
| ENT CAP SWR5 | 20 | 216.8 | $1.84 | $9.20 | $0.20 | $0.25 | $1,995 | 20% | $9.20 | 0.06 |
| ENT 15K R1 | 50 | 417.3 | $1.43 | $2.85 | $0.99 | $0.99 | $1,190 | 100% | $2.85 | 0.08 |
| ENT 10K R1 | 50 | 385.8 | $2.27 | $4.54 | $0.49 | $0.49 | $1,750 | 100% | $4.54 | 0.09 |
| ENT CAP R1 | 50 | 103.5 | $3.85 | $7.71 | $0.20 | $0.20 | $798 | 100% | $7.71 | 0.45 |
| ENT 10K R10 | 50 | 778.3 | $1.12 | $4.50 | $0.49 | $0.97 | $3,500 | 100% | $4.50 | 0.03 |
| ENT CAP SWR5 | 50 | 109.3 | $3.65 | $18.26 | $0.20 | $0.25 | $1,995 | 20% | $18.26 | 0.42 |
| ENT 15K R1 | 100 | 190.7 | $3.12 | $6.24 | $0.99 | $0.99 | $1,190 | 100% | $6.24 | 0.49 |
| ENT 10K R1 | 100 | 175.9 | $4.98 | $9.95 | $0.49 | $0.49 | $1,750 | 100% | $9.95 | 0.53 |
| ENT CAP R1 | 100 | 59.1 | $6.76 | $13.51 | $0.20 | $0.20 | $798 | 100% | $13.51 | 1.66 |
| ENT 10K R10 | 100 | 560.6 | $1.56 | $6.24 | $0.49 | $0.97 | $3,500 | 100% | $6.24 | 0.14 |
| ENT CAP SWR5 | 100 | 62.2 | $6.42 | $32.10 | $0.20 | $0.25 | $1,995 | 20% | $32.10 | 1.57 |

Table-2 TPC-C workload results various number of users across different drive configurations
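The cost per TPS columns can be approximated with a few lines of Python. This is a minimal sketch that assumes the single drive cost is the multiple drive cost divided by the number of drives in the set (two for RAID 1, four for RAID 10, five for the software volume); that split is my reading of Table-2 rather than something the report spells out, and the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DriveSet:
    name: str
    total_cost: float  # cost of all drives in the set, in dollars (Table-2)
    drives: int        # drive count, assumed from the RAID level
    tps: float         # measured TPC-C like TPS (Table-2)

    def multi_drive_cost_per_tps(self) -> float:
        return self.total_cost / self.tps

    def single_drive_cost_per_tps(self) -> float:
        return (self.total_cost / self.drives) / self.tps


# Figures for the 20 user step, copied from Table-2.
sets_20_users = [
    DriveSet("ENT 15K R1", 1190.0, 2, 362.3),
    DriveSet("ENT 10K R1", 1750.0, 2, 339.3),
    DriveSet("ENT CAP R1", 798.0, 2, 213.4),
    DriveSet("ENT 10K R10", 3500.0, 4, 389.0),
    DriveSet("ENT CAP SWR5", 1995.0, 5, 216.8),
]

for s in sets_20_users:
    print(f"{s.name:13s} single ${s.single_drive_cost_per_tps():6.2f}"
          f"  multi ${s.multi_drive_cost_per_tps():6.2f}")
```

The same two divisions apply at any user count, which is why the cost per TPS figures move so much between the 1 user and 50 user steps even though the drive prices stay fixed.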

Figure-2 shows TPC-C TPS (red dashed line) workload scaling over various numbers of users (1, 20, 50, and 100) with peak TPS per drive shown. Also shown is the used space capacity (in green), with total raw storage capacity in blue cross hatch. Looking at the multiple metrics in context shows that the 600GB Enterprise 15K HDD with performance enhanced cache is a premium option as an alternative to, or a complement for, flash SSD solutions.

Figure-2 472GB Database TPS scaling along with cost per TPS and storage space used

In figure-2, the 1.8TB Enterprise 10K HDD with performance enhanced cache while not as fast as the 15K, provides a good balance of performance, space capacity and cost effectiveness. A good use for the 10K drives is where some amount of performance is needed as well as a large amount of storage space for less frequently accessed content.

A low cost, low performance option would be the 2TB Enterprise Capacity HDD’s that have a good cost per capacity, however they lack the performance of the 15K and 10K drives with enhanced performance cache. A four drive RAID 10 along with a five drive software volume (Microsoft Windows) are also shown. For an apples to apples comparison, look at costs vs. capacity including the number of drives needed for a given level of performance.

Figure-3 is a variation of figure-2 showing TPC-C TPS (blue bar) and response time (red-dashed line) scaling across 1, 20, 50 and 100 users. Once again the Enterprise 15K with enhanced performance cache feature enabled has good performance in an apples to apples RAID 1 comparison.

Note that the best performance was with the four drive RAID 10 using 10K HDD’s. Given its popularity, a four drive RAID 10 configuration with the 10K drives was used. Not surprisingly, the four 10K drives performed better than the RAID 1 15Ks. Also note that using five drives in a software spanned volume provides a large amount of storage capacity and good performance, however with a larger drive footprint.

Figure-3 472GB Database TPS scaling along with response time (latency)

From a cost per space capacity perspective, the Enterprise Capacity drives have a good cost per GB. A hybrid solution for environments that do not need ultra-high performance would be to pair a small amount of flash SSD (10) (drives or PCIe cards), as well as the 10K and 15K performance enhanced drives, with the Enterprise Capacity HDD (11) along with cache or tiering software.

(Note 10) Refer to Seagate 1200 12 Gbps Enterprise SAS SSD StorageIO lab review

(Note 11) Refer to Enterprise SSHD and Flash SSD Part of an Enterprise Tiered Storage Strategy

Where To Learn More

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

If your environment is using applications that rely on databases, then test resources such as servers, storage and devices using tools that represent your environment. This means moving up the software and technology stack from basic storage I/O benchmark or workload generator tools such as Iometer among others, and instead using either your own application, or tools that can replay or generate various workloads that represent your environment.

Continue reading part five in this multi-part series here where the focus shifts to large and small file I/O processing workloads.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Which Enterprise HDD for Content Applications Different File Size Impact

Updated 1/23/2018

Which enterprise HDD to use with a content server platform different file size impact.

This is the fifth in a multi-part series (read part four here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is looking at the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDD’s). In this post the focus is on large and small file I/O processing.

File Performance Activity

Tip: Content solutions use files in various ways. Use the following to gain perspective on how various HDD’s handle workloads similar to your specific needs.

Two separate file processing workloads were run (12), one with a relatively small number of large files, and another with a large number of small files. For the large file processing (table-3), 5 GByte sized files were created and then accessed via 128 Kbyte (128KB) sized I/O over a 10 hour period with 90% read using 64 threads (workers). The large file workload simulates what might be seen with higher definition video, image or other content streaming.

(Note 12) File processing workloads were run using Vdbench 5.04 and file anchors with sample script configuration below. Instead of vdbench you could also use other tools such as sysbench or fio among others.

VdbenchFSBigTest.txt
# Sample script for big files testing
fsd=fsd1,anchor=H:,depth=1,width=5,files=20,size=5G
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=128k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSBigTest.txt -m 16 -o Results_FSbig_H_060615

VdbenchFSSmallTest.txt
# Sample script for small files testing
fsd=fsd1,anchor=H:,depth=1,width=64,files=25600,size=16k
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=1k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSSmallTest.txt -m 16 -o Results_FSsmall_H_060615
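To get a feel for how much data each script lays down, here is a small Python sketch that works out the file count and footprint from the fsd parameters above. It assumes that files= applies to each lowest-level directory, that the number of those directories is width raised to the power of depth, and that the k and G suffixes are binary multiples; the helper names are made up for illustration.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Turn a size such as '16k' or '5G' into bytes (binary multiples assumed)."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def fsd_footprint(depth, width, files, size):
    """Return (leaf_directories, total_files, total_bytes) for one fsd line."""
    leaf_dirs = width ** depth       # directories at the lowest level
    total_files = leaf_dirs * files  # files= is taken as per leaf directory
    return leaf_dirs, total_files, total_files * parse_size(size)


big = fsd_footprint(depth=1, width=5, files=20, size="5G")
small = fsd_footprint(depth=1, width=64, files=25600, size="16k")

for label, (dirs, count, nbytes) in (("big", big), ("small", small)):
    print(f"{label}: {dirs} directories, {count:,} files, "
          f"{nbytes / 1024 ** 3:.0f} GiB")
```

Under those assumptions the small file script works out to 1,638,400 files, which is 64 fewer than the count quoted for the small file test. One possible explanation is that the 64 directories are counted as well, but the report does not say.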

The 10% writes are intended to reflect some update activity for new content or other changes to content. Note that 128MB per second translates to roughly 1 Gbps of streaming content such as higher definition video. However 4K video (not optimized) would require a higher speed as well as resulting in larger file sizes. Table-3 shows the performance during the large file access period showing average read / write rates and response time, bandwidth (MBps), average open and close rates with response time.
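As a quick sanity check on streaming rates, the conversion between bytes per second and bits per second can be sketched as follows. The function names are made up for illustration and decimal units (1 MB = 1,000,000 bytes) are assumed, so treat the results as approximate.

```python
def mb_per_sec_to_gbps(mb_per_sec):
    """Convert decimal megabytes per second to gigabits per second."""
    bits_per_sec = mb_per_sec * 1_000_000 * 8
    return bits_per_sec / 1_000_000_000


def kb_per_sec_to_mbps(kb_per_sec):
    """Convert decimal kilobytes per second to megabits per second."""
    bits_per_sec = kb_per_sec * 1_000 * 8
    return bits_per_sec / 1_000_000


print(f"128 MB/sec is about {mb_per_sec_to_gbps(128):.3f} Gbps")
print(f"128 KB/sec is about {kb_per_sec_to_mbps(128):.3f} Mbps")
```

In other words, it takes on the order of 128 MB per second, not 128 KB per second, to fill a 1 Gbps stream.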

| Config | Avg. File Read Rate | Avg. Read Resp. Time (ms) | Avg. File Write Rate | Avg. Write Resp. Time (ms) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 580.7 | 107.9 | 64.5 | 19.7 | 52.2 | 35.5 | 72.6 | 8.1 |
| ENT 10K R1 | 455.4 | 135.5 | 50.6 | 44.6 | 34.0 | 22.7 | 56.9 | 6.3 |
| ENT CAP R1 | 285.5 | 221.9 | 31.8 | 19.0 | 43.9 | 28.3 | 37.7 | 4.0 |
| ENT 10K R10 | 690.9 | 87.21 | 76.8 | 48.6 | 35.0 | 21.8 | 86.4 | 9.6 |

Table-3 Performance summary for large file access operations (90% read)

Table-3 shows that for two-drive RAID 1, the Enterprise 15K drives have the fastest performance, however a RAID 10 with four 10K HDD’s with enhanced cache features provides a good price, performance and space capacity option. Software RAID was not used in this workload test.

Figure-4 shows the relative performance of various HDD options handling large files. Keep in mind that for the response line lower is better, while for the activity rate higher is better.

Figure-4 Large file processing 90% read, 10% write rate and response time

In figure-4 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with number of file read operations per second (reads solid blue column bar, writes green column bar). Reminder that lower response time, and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K HDD’s).

Results in figure-4 above and table-4 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-4 below shows an analysis looking at average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-4 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example RAID 10 using four 10K drives provides good performance and protection along with large usable space, however that also comes at a budget cost (e.g. price).

| Config | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost per GB Capacity | Cost per GB Usable (Protected) Cap. | Drive Cost (Multiple Drives) | Protection Overhead (Space Capacity for RAID) | Cost per Usable GB per RPS | Avg. File Read Resp. (ms) |
|---|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 580.7 | $1.02 | $2.05 | $0.99 | $0.99 | $1,190 | 100% | $2.1 | 107.9 |
| ENT 10K R1 | 455.5 | $1.92 | $3.84 | $0.49 | $0.49 | $1,750 | 100% | $3.8 | 135.5 |
| ENT CAP R1 | 285.5 | $1.40 | $2.80 | $0.20 | $0.20 | $798 | 100% | $2.8 | 221.9 |
| ENT 10K R10 | 690.9 | $1.27 | $5.07 | $0.49 | $0.97 | $3,500 | 100% | $5.1 | 87.2 |

Table-4 Performance, capacity and cost analysis for big file processing

Small File Size Processing

To simulate a general file sharing environment, or content streaming with many smaller objects, 1,638,464 16KB sized files were created on each device being tested (table-5). These files were spread across 64 directories (25,600 files each) and accessed via 64 threads (workers) doing 90% reads with a 1KB I/O size over a ten hour time frame. Like the large file test, and database activity, all workloads were run at the same time (e.g. test devices were concurrently busy).

| Config | Avg. File Read Rate | Avg. Read Resp. Time (ms) | Avg. File Write Rate | Avg. Write Resp. Time (ms) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 3,415.7 | 1.5 | 379.4 | 132.2 | 24.9 | 19.5 | 3.3 | 0.4 |
| ENT 10K R1 | 2,203.4 | 2.9 | 244.7 | 172.8 | 24.7 | 19.3 | 2.2 | 0.2 |
| ENT CAP R1 | 1,063.1 | 12.7 | 118.1 | 303.3 | 24.6 | 19.2 | 1.1 | 0.1 |
| ENT 10K R10 | 4,590.5 | 0.7 | 509.9 | 101.7 | 27.7 | 22.1 | 4.5 | 0.5 |

Table-5 Performance summary for small sized (16KB) file access operations (90% read)

Figure-5 shows the relative performance of various HDD options handling small files. Keep in mind that for the response line lower is better, while for the activity rate higher is better.

Figure-5 Small file processing 90% read, 10% write rate and response time

In figure-5 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with number of file read operations per second (reads solid blue column bar, writes green column bar). Reminder that lower response time, and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K RPM), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K RPM HDD’s) that has higher performance and capacity along with costs (table-6).

Results in figure-5 above and table-6 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-6 below shows an analysis looking at average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-6 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example RAID 10 using four 10K drives provides good performance and protection along with large usable space, however that also comes at a budget cost (e.g. price).

| Config | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost per GB Capacity | Cost per GB Usable (Protected) Cap. | Drive Cost (Multiple Drives) | Protection Overhead (Space Capacity for RAID) | Cost per Usable GB per RPS | Avg. File Read Resp. (ms) |
|---|---|---|---|---|---|---|---|---|---|
| ENT 15K R1 | 3,415.7 | $0.17 | $0.35 | $0.99 | $0.99 | $1,190 | 100% | $0.35 | 1.51 |
| ENT 10K R1 | 2,203.4 | $0.40 | $0.79 | $0.49 | $0.49 | $1,750 | 100% | $0.79 | 2.90 |
| ENT CAP R1 | 1,063.1 | $0.38 | $0.75 | $0.20 | $0.20 | $798 | 100% | $0.75 | 12.70 |
| ENT 10K R10 | 4,590.5 | $0.19 | $0.76 | $0.49 | $0.97 | $3,500 | 100% | $0.76 | 0.70 |

Table-6 Performance, capacity and cost analysis for small file processing

Looking at the small file processing analysis in table-6 shows that the 15K HDD’s on an apples to apples basis (e.g. same RAID level and number of drives) provide the best performance. However when also factoring in space capacity, performance, different RAID levels or other protection schemes along with cost, there are other considerations. On the other hand the Enterprise Capacity 2TB HDD’s have a low cost per capacity, however they do not have the performance of other options, assuming your applications need more performance.

Thus the right HDD for one application may not be the best one for a different scenario, and multiple metrics as shown in table-6 need to be included in an informed storage decision making process.

What This All Means

File processing is a common content application task, with some files being small, others large or mixed, as well as reads and writes. Even if your content environment is using object storage, chances are that unless it is a new application or a gateway exists, you may be using NAS or file based access. Thus, if your applications are doing file based processing, it is important to either run your own applications or use tools that can simulate as closely as possible what your environment is doing.

Continue reading part six in this multi-part series here where the focus is around general I/O including 8KB and 128KB sized IOPs along with associated metrics.

Ok, nuff said, for now.

Gs

Which Enterprise HDD for Content Applications General I/O Performance

Updated 1/23/2018

Which enterprise HDD to use with a content server platform general I/O performance

This is the sixth in a multi-part series (read part five here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is looking at the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDD’s). In this post the focus is around general I/O performance including 8KB and 128KB IOP sizes.

General I/O Performance

In addition to running database and file (large and small) processing workloads, Vdbench was also used to collect basic small (8KB) and large (128KB) sized I/O operations. This consisted of random and sequential reads as well as writes with the results shown below. In addition to using vdbench, other tools that could be used include Microsoft Diskspd, fio, iorate and iometer among many others.

These workloads used Vdbench configured (13) to do direct I/O to a Windows file system mounted device using as much of the available disk space as possible. All workloads used 16 threads and were run concurrently similar to database and file processing tests.

(Note 13) Sample vdbench configuration for general I/O, note different settings were used for various tests

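Table-7 shows workload results for 8KB random IOPs with 75% reads and 25% reads including IOPs, bandwidth and response time.

| Config | Workload | I/O Rate (IOPs) | MB/sec | Resp. Time (ms) |
|---|---|---|---|---|
| ENT 15K RAID1 | 75% Read | 597.11 | 4.7 | 25.9 |
| ENT 15K RAID1 | 25% Read | 559.26 | 4.4 | 27.6 |
| ENT 10K RAID1 | 75% Read | 514 | 4.0 | 30.2 |
| ENT 10K RAID1 | 25% Read | 475 | 3.7 | 32.7 |
| ENT CAP RAID1 | 75% Read | 285 | 2.2 | 55.5 |
| ENT CAP RAID1 | 25% Read | 293 | 2.3 | 53.7 |
| ENT 10K R10 (4 Drives) | 75% Read | 979 | 7.7 | 16.3 |
| ENT 10K R10 (4 Drives) | 25% Read | 984 | 7.7 | 16.3 |
| ECAP SW RAID (5 Drives) | 75% Read | 491 | 3.8 | 32.6 |
| ECAP SW RAID (5 Drives) | 25% Read | 644 | 5.0 | 24.8 |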

Table-7 8KB sized random IOPs workload results
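The bandwidth column follows from the I/O rate and the I/O size, which can be sketched in Python as below. The function name is made up for illustration, and treating 1 MB as 1,024 KB is an assumption on my part; it is the convention that lines up most closely with the reported figures, give or take rounding of the I/O rates.

```python
def iops_to_mb_per_sec(iops, io_size_kb):
    """Bandwidth in MB/sec, treating 1 MB as 1,024 KB (an assumption)."""
    return iops * io_size_kb / 1024


# I/O rates for the 75% read, 8KB random case, copied from Table-7.
table7_iops_75_read = {
    "ENT 15K RAID1": 597.11,
    "ENT 10K RAID1": 514,
    "ENT CAP RAID1": 285,
    "ENT 10K R10": 979,
    "ECAP SW RAID": 491,
}

for config, iops in table7_iops_75_read.items():
    print(f"{config:14s} {iops_to_mb_per_sec(iops, 8):5.1f} MB/sec")
```

The same function covers the larger I/O sizes by passing 128 instead of 8 as the I/O size.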

Figure-6 shows small (8KB) random I/O (75% read and 25% read) across different HDD configurations. Performance including activity rates (e.g. IOPs), bandwidth and response time for mixed reads / writes are shown. Note how response time increases with the Enterprise Capacity configurations vs. other performance optimized drives.

Figure-6 8KB random reads and write showing IOP activity, bandwidth and response time

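Table-8 below shows workload results for 8KB sized I/Os, 100% sequential with 75% reads and 25% reads, including IOPs, MB/sec and response time in milliseconds.

| Config | Workload | I/O Rate (IOPs) | MB/sec | Resp. Time (ms) |
|---|---|---|---|---|
| ENT 15K RAID1 | 75% Read | 3,778 | 29.5 | 2.2 |
| ENT 15K RAID1 | 25% Read | 3,414 | 26.7 | 3.1 |
| ENT 10K RAID1 | 75% Read | 3,761 | 29.4 | 2.3 |
| ENT 10K RAID1 | 25% Read | 3,986 | 31.1 | 2.4 |
| ENT CAP RAID1 | 75% Read | 3,379 | 26.4 | 2.7 |
| ENT CAP RAID1 | 25% Read | 1,274 | 10.0 | 10.9 |
| ENT 10K R10 (4 Drives) | 75% Read | 11,840 | 92.5 | 1.3 |
| ENT 10K R10 (4 Drives) | 25% Read | 8,368 | 65.4 | 1.9 |
| ECAP SW RAID (5 Drives) | 75% Read | 2,891 | 22.6 | 5.5 |
| ECAP SW RAID (5 Drives) | 25% Read | 1,146 | 9.0 | 14.0 |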

Table-8 8KB sized sequential workload results

Figure-7 shows small 8KB sequential mixed reads and writes (75% read and 25% read). While the Enterprise Capacity 2TB HDD has a large amount of space capacity, its performance in a RAID 1 vs. other similarly configured drives is slower.

Figure-7 8KB sequential 75% reads and 25% reads showing bandwidth activity

Table-9 shows workload results for 100% sequential, 100% read and 100% write 128KB sized I/Os including IOPs, bandwidth and response time.

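| Config | Workload | I/O Rate (IOPs) | MB/sec | Resp. Time (ms) |
|---|---|---|---|---|
| ENT 15K RAID1 | Read | 1,798 | 224.7 | 8.9 |
| ENT 15K RAID1 | Write | 1,771 | 221.3 | 9.0 |
| ENT 10K RAID1 | Read | 1,716 | 214.5 | 9.3 |
| ENT 10K RAID1 | Write | 1,688 | 210.9 | 9.5 |
| ENT CAP RAID1 | Read | 921 | 115.2 | 17.4 |
| ENT CAP RAID1 | Write | 912 | 114.0 | 17.5 |
| ENT 10K R10 (4 Drives) | Read | 3,552 | 444.0 | 4.5 |
| ENT 10K R10 (4 Drives) | Write | 3,486 | 435.8 | 4.6 |
| ECAP SW RAID (5 Drives) | Read | 780 | 97.4 | 19.3 |
| ECAP SW RAID (5 Drives) | Write | 721 | 90.1 | 20.2 |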

Table-9 128KB sized sequential workload results

Figure-8 shows sequential or streaming operations of larger I/O (100% read and 100% write) requests sizes (128KB) that would be found with large content applications. Figure-8 highlights the relationship between lower response time and increased IOPs as well as bandwidth.

Figure-8 128KB sequential reads and write showing IOP activity, bandwidth and response time
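The relationship between response time and IOPs can be sketched with Little's Law: with a fixed number of outstanding I/Os, average response time is the outstanding I/O count divided by the I/O rate. The Python below assumes each of the 16 threads keeps one I/O outstanding and treats the Table-9 response times as milliseconds; both are assumptions on my part that the numbers themselves appear to support, and the function name is made up for illustration.

```python
def expected_response_ms(outstanding_ios, iops):
    """Little's Law: average response time (ms) for a given concurrency and rate."""
    return outstanding_ios / iops * 1000.0


THREADS = 16  # thread count used for the general I/O workloads

# (I/O rate, reported response time) for 128KB sequential reads, from Table-9.
table9_reads = {
    "ENT 15K RAID1": (1798, 8.9),
    "ENT 10K RAID1": (1716, 9.3),
    "ENT CAP RAID1": (921, 17.4),
    "ENT 10K R10": (3552, 4.5),
    "ECAP SW RAID": (780, 19.3),
}

for config, (iops, reported) in table9_reads.items():
    estimate = expected_response_ms(THREADS, iops)
    print(f"{config:14s} estimated {estimate:5.1f}  reported {reported:5.1f}")
```

Doubling the I/O rate at the same thread count halves the response time, which is the pattern the four drive RAID 10 shows against the two drive RAID 1 sets.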

What This All Means

Some content applications are doing small random I/Os for database, key value stores or repositories as well as meta data processing while others are doing large sequential I/O. 128KB sized I/O may be large for your environment, on the other hand, with an increasing number of applications, file systems, software defined storage management tools among others, 1 to 10MB or even larger I/O sizes are becoming common. Key is selecting I/O sizes and read write as well as random sequential along with I/O or queue depths that align with your environment.
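One way to keep track of which combinations to test is to enumerate them up front. The Python sketch below builds a simple test matrix from I/O size, read percentage, access pattern and thread count. The specific values and field names are examples chosen for illustration, not the exact settings used in the lab; substitute the ones that align with your environment.

```python
from itertools import product

# Example values only; replace with sizes and mixes that match your workload.
io_sizes_kb = [8, 128]
read_percentages = [100, 75, 25, 0]
access_patterns = ["random", "sequential"]
thread_counts = [16]


def build_matrix():
    """Return one dictionary per combination of the parameters above."""
    return [
        {"io_size_kb": size, "read_pct": read, "pattern": pattern, "threads": threads}
        for size, read, pattern, threads in product(
            io_sizes_kb, read_percentages, access_patterns, thread_counts
        )
    ]


matrix = build_matrix()
print(f"{len(matrix)} test cases")
for case in matrix[:4]:
    print(case)
```

Each entry can then be turned into a parameter file or command line for whichever workload tool you use.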

Continue reading part seven, the final post in this multi-part series, here where the focus is around how HDD’s continue to evolve including performance beyond traditional RPM based expectations along with a wrap up.

Ok, nuff said, for now.

Gs

HDDs evolve for Content Application servers

Updated 1/23/2018

Enterprise HDDs evolve for content server platform

This is the seventh and final post in this multi-part series (read part six here) based on a white paper hands-on lab report I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is looking at the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDD’s). The focus of this post is comparing how HDD’s continue to evolve over various generations, boosting performance as well as capacity and reliability. This also looks at how there is more to HDD performance than the traditional focus on Revolutions Per Minute (RPM) as a speed indicator.

Comparing Different Enterprise 10K And 15K HDD Generations

There is more to HDD performance than RPM speed of the device. RPM plays an important role, however there are other things that impact HDD performance. A common myth is that HDD’s have not improved on performance over the past several years with each successive generation. Table-10 shows a sampling of various generations of enterprise 10K and 15K HDD’s (14) including different form factors and how their performance continues to improve.

Figure-9 10K and 15K HDD performance improvements

Figure-9 shows how performance continues to improve with 10K and 15K HDD’s with each new generation including those with enhanced cache features. The result is that with improvements in cache software within the drives, along with enhanced persistent non-volatile memory (NVM) and incremental mechanical drive improvements, both read and write performance continues to be enhanced.

Figure-9 puts into perspective the continued performance enhancements of HDD’s comparing various enterprise 10K and 15K devices. The workload is the same TPC-C test used earlier in a similar configuration (14) (with no RAID). 100 simulated users are shown in figure-9 accessing a database on each of the different drives, all running concurrently. The older 15K 3.5” Cheetah and 2.5” Savvio used had a capacity of 146GB, which used a database scale factor of 1500 or 134GB. All other drives used a scale factor of 3000 or 276GB. Figure-9 also highlights the improvements in both TPS performance as well as lower response time with new HDD’s including those with the performance enhanced cache feature.

The workloads run are the same as the TPC-C ones shown earlier, however these drives were not configured with any RAID. The TPC-C activity used Benchmark Factory with a similar setup and configuration to those used earlier, including a multi-socket, multi-core Windows 2012 R2 server supporting a Microsoft SQL Server 2012 database with a database for each drive type.

| Drive | Metric | 1 User | 20 Users | 50 Users | 100 Users |
|---|---|---|---|---|---|
| ENT 10K V3 2.5" | TPS (TPC-C) | 14.8 | 50.9 | 30.3 | 39.9 |
| ENT 10K V3 2.5" | Resp. Time (Sec.) | 0.0 | 0.4 | 1.6 | 1.7 |
| ENT (Cheetah) 15K 3.5" | TPS (TPC-C) | 14.6 | 51.3 | 27.1 | 39.3 |
| ENT (Cheetah) 15K 3.5" | Resp. Time (Sec.) | 0.0 | 0.3 | 1.8 | 2.1 |
| ENT 10K 2.5" (with cache) | TPS (TPC-C) | 19.2 | 146.3 | 72.6 | 71.0 |
| ENT 10K 2.5" (with cache) | Resp. Time (Sec.) | 0.0 | 0.1 | 0.7 | 0.0 |
| ENT (Savvio) 15K 2.5" | TPS (TPC-C) | 15.8 | 59.1 | 40.2 | 53.6 |
| ENT (Savvio) 15K 2.5" | Resp. Time (Sec.) | 0.0 | 0.3 | 1.2 | 1.2 |
| ENT 15K V4 2.5" | TPS (TPC-C) | 19.7 | 119.8 | 75.3 | 69.2 |
| ENT 15K V4 2.5" | Resp. Time (Sec.) | 0.0 | 0.1 | 0.6 | 1.0 |
| ENT 15K (enhanced cache) 2.5" | TPS (TPC-C) | 20.1 | 184.1 | 113.7 | 122.1 |
| ENT 15K (enhanced cache) 2.5" | Resp. Time (Sec.) | 0.0 | 0.1 | 0.4 | 0.2 |

Table-10 Continued Enterprise 10K and 15K HDD performance improvements
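To put the generational differences into a single number, the TPS figures can be expressed relative to the oldest drive. The Python sketch below uses the 100 user TPS values from Table-10 with the 3.5" 15K Cheetah as the baseline; the choice of baseline and the helper name are mine. Keep in mind the two older 146GB drives ran a smaller database (scale factor 1500 vs. 3000), so treat the ratios as indicative rather than exact.

```python
# TPS at 100 users, copied from Table-10.
tps_100_users = {
    "ENT (Cheetah) 15K 3.5in": 39.3,
    "ENT 10K V3 2.5in": 39.9,
    "ENT (Savvio) 15K 2.5in": 53.6,
    "ENT 15K V4 2.5in": 69.2,
    "ENT 10K 2.5in (with cache)": 71.0,
    "ENT 15K (enhanced cache) 2.5in": 122.1,
}


def speedup(tps, baseline_tps):
    """Return how many times faster `tps` is than `baseline_tps`."""
    return tps / baseline_tps


baseline = tps_100_users["ENT (Cheetah) 15K 3.5in"]
for drive, tps in tps_100_users.items():
    print(f"{drive:32s} {tps:6.1f} TPS  {speedup(tps, baseline):4.2f}x")
```

Note how the 10K drive with cache lands ahead of the non-cache 15K V4 at this user count, which is the point about looking beyond RPM.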

(Note 14) 10K and 15K generational comparisons were run on a separate comparable server to what was used for other test workloads. Workload configuration settings were the same as other database workloads including using Microsoft SQL Server 2012 on a Windows 2012 R2 system with Benchmark Factory driving the workload. Database memory size was reduced however to only 8GB vs. 16GB used in other tests.

What This All Means

A little bit of flash in the right place with applicable algorithms goes a long way, an example being the Seagate Enterprise HDD’s with enhanced cache feature. Likewise, HDD’s are very much alive complementing SSD and vice versa. For high-performance content application workloads flash SSD solutions including NVMe, 12Gbps SAS and 6Gbps SATA devices are cost effective solutions. HDD’s continue to be cost-effective data storage devices for both capacity, as well as environments that do not need the performance of flash SSD.

For some environments using a combination of flash and HDD’s complementing each other along with cache software can be a cost-effective solution. The previous workload examples provide insight for making cost-effective informed storage decisions.

Evaluate today’s HDD’s on their effective performance running workloads as close as similar to your own, or, actually try them out with your applications. Today there is more to HDD performance than just RPM speed, particular with the Seagate Enterprise Performance 10K and 15K HDD’s with enhanced caching feature.

However, the Enterprise Performance 10K with the enhanced cache feature provides a good balance of capacity and performance while being cost-effective. If you are using older 3.5” 15K or even previous generation 2.5” 15K RPM and “non-performance enhanced” HDD’s, take a look at how the newer generation HDD’s perform, looking beyond the RPM of the device.

Fast content applications need fast content and flexible content solution platforms such as those from Servers Direct and HDD’s from Seagate. Key to a successful content application deployment is having the flexibility to hardware define and software define the platform to meet your needs. Just as there are many different types of content applications along with diverse environments, content solution platforms need to be flexible, scalable and robust, not to mention cost effective.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Big Files Lots of Little File Processing Benchmarking with Vdbench

Updated 2/10/2018

Need to test a server, storage I/O networking, hardware, software, services, cloud, virtual, physical or other environment that is either doing some form of file processing, or that you simply want to have some extra workload running in the background for whatever reason? An option is File Processing Benchmarking with Vdbench.

Getting Started


Here’s a quick and relatively easy way to do it with Vdbench (free from Oracle). Granted, there are other tools, both free and for fee, that can do similar things; however, we will leave those for another day and post. Here’s the con to this approach: there is no GUI like what you have available with some other tools. Here’s the pro to this approach: it’s free, flexible and limited only by your creativity, amount of storage space, server memory and I/O capacity.

If you need a background on Vdbench and benchmarking, check out the series of related posts here (e.g. www.storageio.com/performance).

Get and Install the Vdbench Bits and Bytes


If you do not already have Vdbench installed, get a copy from the Oracle or Source Forge site (now points to Oracle here).

Vdbench is free. You simply sign up, accept the free license, select the version (it is a single, common distribution for all OS) and download the bits as well as the documentation.

Installation, particularly on Windows, is really easy: basically follow the instructions in the documentation by copying the contents of the download folder to a specified directory, set up any environment variables, and make sure that you have Java installed.

Here is a hint and tip for Windows Servers, if you get an error message about counters, open a command prompt with Administrator rights, and type the command:

$ lodctr /r


The above command rebuilds (resets) the Windows performance counter settings. Note, however, that it will also overwrite existing counter settings, so only use it if you have to.

Likewise *nix install is also easy, copy the files, make sure to copy the applicable *nix shell script (they are in the download folder), and verify Java is installed and working.

You can run vdbench -t (Windows) or ./vdbench -t (*nix) to verify that it is working.

Vdbench File Processing

There are many options with Vdbench as it has a very robust command and scripting language, including the ability to set up for loops among other things. We are only going to scratch the surface here using its file processing capabilities. Likewise, Vdbench can run from a single server accessing multiple storage systems or file systems, as well as from multiple servers to a single file system. For simplicity, we will stick with the basics in the following examples to exercise a local file system. The number of files and the file size are limited by server memory and storage space.

You can specify the number and depth of directories to put files into for processing. One of the parameters is the anchor point for the file processing; in the following examples S:\SIOTEMP\FS1 is used as the anchor point. Other parameters include the I/O size, percent reads, number of threads, run time and sample interval, as well as the output folder name for the result files. Note that unlike some tools, Vdbench does not create a single file of results, but rather a folder with several files including summary, totals, parameters, histograms and CSV among others.


Simple Vdbench File Processing Commands

For flexibility and ease of use I put the following three Vdbench commands into a simple text file that is then called with parameters on the command line.
fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Simple Vdbench script

# SIO_vdbench_filesystest.txt
#
# Example Vdbench script for file processing
#
# fanchor = file system place where directories and files will be created
# dirwid = how wide should the directories be (e.g. how many directories wide)
# numfiles = how many files per directory
# filesize = size in k, m, g e.g. 16k = 16KBytes
# fxfersize = file I/O transfer size in kbytes
# thrds = how many threads or workers
# etime = how long to run in minutes (m) or hours (h)
# itime = interval sample time e.g. 30 seconds
# dirdep = how deep the directory tree
# filrdpct = percent of reads e.g. 90 = 90 percent reads
# -p processnumber = optional specify a process number, only needed if running multiple vdbenchs at same time, number should be unique
# -o output folder name, which can describe what is being done along with some config info
#
# Sample command line shown for Windows, for *nix add ./
#
# The real Vdbench script with command line parameters indicated by !=
#

fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Big Files Processing Script


With the above script file defined, for Big Files I specify a command line such as the following.
$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTemp\FS1 dirwid=1 numfiles=60 filesize=5G fxfersize=128k thrds=64 etime=10h itime=30 numdir=1 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_5Gx60_BigFiles_64TH_STX1200_020116
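
If you script these runs, it can help to assemble the command line from a set of named values instead of editing one long string by hand. The following is a minimal Python sketch of that idea and is not part of Vdbench itself. The helper name build_vdbench_command and its arguments are made up for illustration, and it only uses the options shown above (-f, -p, -o and the name=value substitutions). The numdir value is left out because the script file above has no !numdir placeholder.

```python
# Hypothetical helper (not part of Vdbench): assemble the command line used
# in this post from named values. Only options shown in the post are used.

def build_vdbench_command(script, params, process_number=None, output=None,
                          executable="vdbench"):
    """Return the command as a list of arguments, e.g. for subprocess.run()."""
    cmd = [executable, "-f", script]
    # Each name=value pair replaces the matching !name in the script file.
    cmd += [f"{name}={value}" for name, value in params.items()]
    if process_number is not None:
        cmd += ["-p", str(process_number)]
    if output is not None:
        cmd += ["-o", output]
    return cmd


# Values copied from the Big Files command line above.
big_files = {
    "fanchor": r"S:\SIOTemp\FS1",
    "dirwid": 1,
    "numfiles": 60,
    "filesize": "5G",
    "fxfersize": "128k",
    "thrds": 64,
    "etime": "10h",
    "itime": 30,
    "dirdep": 1,
    "filrdpct": 90,
}

cmd = build_vdbench_command(
    "SIO_vdbench_filesystest.txt",
    big_files,
    process_number=5576,
    output="SIOWS2012R220_NOFUZE_5Gx60_BigFiles_64TH_STX1200_020116",
)
print(" ".join(cmd))
```

Keeping the values in a dictionary makes it easier to vary one item at a time (for example thrds or filrdpct) between runs while leaving the rest alone.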

Big Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for Big File processing showing totals.


Run totals

21:09:36.001 Starting RD=format_for_rd1

Feb 01, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
21:23:34.101 avg_2-28 2848.2 2.70 8.8 8.32 0.0 0.0 0.00 2848.2 2.70 0.00 356.0 356.02 131071 0.0 0.00 0.0 0.00 0.1 109176 0.1 0.55 0.1 2006 0.0 0.00

21:23:35.009 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

07:23:35.000 avg_2-1200 4939.5 1.62 18.5 17.3 90.0 4445.8 1.79 493.7 0.07 555.7 61.72 617.44 131071 0.0 0.00 0.0 0.00 0.0 0.00 0.1 0.03 0.1 2.95 0.0 0.00


Lots of Little Files Processing Script


For lots of little files, the following is used.


$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTEMP\FS1 dirwid=64 numfiles=25600 filesize=16k fxfersize=1k thrds=64 etime=10h itime=30 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_SmallFiles_64TH_STX1200_020116
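
Before starting a long run it is worth estimating how many files and how much space a given set of parameters will create. The following Python sketch does that arithmetic for the two command lines above. It assumes files are created only in the lowest-level directories (so the number of leaf directories is the width raised to the power of the depth) and that k, m and g are binary units; check both assumptions against the Vdbench documentation for your version. The function names are hypothetical.

```python
# Rough footprint estimate for a file system anchor.
# Assumptions (verify for your Vdbench version): files are created only in
# the lowest-level directories, and k/m/g mean 1024-based units.

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '16k' or '5G' into bytes."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def footprint(dirwid, dirdep, numfiles, filesize):
    """Return (total_files, total_bytes) for one anchor."""
    leaf_dirs = dirwid ** dirdep
    total_files = leaf_dirs * numfiles
    return total_files, total_files * parse_size(filesize)


runs = {
    "big files": footprint(dirwid=1, dirdep=1, numfiles=60, filesize="5G"),
    "lots of little files": footprint(dirwid=64, dirdep=1, numfiles=25600, filesize="16k"),
}
for name, (files, size_bytes) in runs.items():
    print(f"{name}: {files:,} files, {size_bytes / 1024 ** 3:.0f} GiB")
```

With the values above this works out to 60 files and 300 GiB for the big files run, and 1,638,400 files and 25 GiB for the lots of little files run.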

Lots of Little Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for lots of little files processing showing totals.
Run totals

09:17:38.001 Starting RD=format_for_rd1

Feb 02, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
09:19:48.016 avg_2-5 10138 0.14 75.7 64.6 0.0 0.0 0.00 10138 0.14 0.00 158.4 158.42 16384 0.0 0.00 0.0 0.00 10138 0.65 10138 0.43 10138 0.05 0.0 0.00

09:19:49.000 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

19:19:49.001 avg_2-1200 113049 0.41 67.0 55.0 90.0 101747 0.19 11302 2.42 99.36 11.04 110.40 1023 0.0 0.00 0.0 0.00 0.0 0.00 7065 0.85 7065 1.60 0.0 0.00
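
As a sanity check when reading the totals, the reported MB/sec should roughly equal the operation rate multiplied by the average transfer size, and the read rate divided by the total rate should match the requested read percentage. The following Python sketch checks this for the two rd1 summary lines shown in this post. The function names are illustrative, and it assumes a MB here means 1,048,576 bytes, which is what makes the numbers line up.

```python
# Cross-check the rd1 totals: rate x transfer size vs. reported MB/sec,
# and read rate vs. the requested 90 percent reads.
# Assumption: 1 MB = 1,048,576 bytes in these reports.

def mb_per_sec(ops_per_sec, xfer_bytes):
    """Bandwidth implied by an operation rate and average transfer size."""
    return ops_per_sec * xfer_bytes / (1024 * 1024)


def read_percent(read_rate, total_rate):
    """Share of operations that were reads, as a percentage."""
    return 100.0 * read_rate / total_rate


# Values copied from the avg_2-1200 lines of the two runs.
totals = {
    "big files": {"rate": 4939.5, "read_rate": 4445.8, "xfer": 131071, "reported_mb": 617.44},
    "little files": {"rate": 113049, "read_rate": 101747, "xfer": 1023, "reported_mb": 110.40},
}

for name, t in totals.items():
    computed = mb_per_sec(t["rate"], t["xfer"])
    reads = read_percent(t["read_rate"], t["rate"])
    print(f"{name}: {computed:.2f} MB/sec computed vs. {t['reported_mb']:.2f} reported, "
          f"{reads:.1f}% reads")
```

Small differences are expected because the report rounds the average transfer size. If the computed and reported values are far apart, that is a hint the wrong columns are being read.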


Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

The above examples can easily be modified to do different things, particularly if you read the Vdbench documentation on how to set up multi-host, multi-storage system and multiple job streams to do different types of processing. This means you can benchmark a storage system, server or converged and hyper-converged platform, or simply put a workload on it as part of other testing. There are even options for handling data footprint reduction such as compression and dedupe.

Ok, nuff said, for now.

Gs

Server StorageIO January 2016 Update Newsletter

Volume 16, Issue I – beginning of Year (BoY) Edition

Hello and welcome to the January 2016 Server StorageIO update newsletter.

Is it just me, or did January disappear in a flash like data stored in non-persistent volatile DRAM memory when the power is turned off? It seems like just the other day that it was the first day of the new year and now we are about to welcome in February. Needless to say, like many of you I have been busy with various projects, many of which are behind the scenes, some of which will start appearing publicly sooner while others later.

In terms of what I have been working on, it includes the usual performance, availability, capacity and economics (e.g. PACE) topics related to servers, storage, I/O networks, hardware, software, cloud, virtual and containers. This includes NVM as well as NVMe based SSD’s, HDD’s, cache and tiering technologies, as well as data protection among other things with Hyper-V, VMware as well as various cloud services.

Enjoy this edition of the Server StorageIO update newsletter and watch for new tips, articles, StorageIO lab report reviews, blog posts, videos and podcasts along with in the news commentary appearing soon.

Cheers GS

In This Issue

  • Feature Topic
  • Industry Trends News
  • Commentary in the news
  • Tips and Articles
  • StorageIOblog posts
  • Videos and Podcasts
  • Events and Webinars
  • Recommended Reading List
  • Industry Activity Trends
  • Server StorageIO Lab reports
  • New and Old Vendor Update
  • Resources and Links

    Feature Topic – Microsoft Nano, Server 2016 TP4 and VMware

    This month’s feature topic is virtual servers and software defined storage, including those from VMware and Microsoft. Back in November I mentioned the Windows Server 2016 Technical Preview 4 (e.g. TP4) along with Storage Spaces Direct and Nano. As a reminder, you can download your free trial copy of Windows Server 2016 TP4 from this Microsoft site here.

    Three good Microsoft Blog posts about storage spaces to check out include:

    • Storage Spaces Direct in Technical Preview 4 (here)
    • Hardware options for evaluating Storage Spaces Direct in Technical Preview 4 (here)
    • Storage Spaces Direct – Under the hood with the Software Storage Bus (here)

    As for Microsoft Nano, for those not familiar, it’s not a new tablet or mobile device; instead, it is a very lightweight, streamlined version of Windows Server 2016. How streamlined? Much more so than the earlier Windows Server versions that simply disabled the GUI and desktop interfaces. Nano is smaller from a memory and disk storage space perspective, meaning it uses less RAM, boots faster and has fewer moving parts (e.g. software modules) to break (or need patching).

    Specifically, Nano removes 32 bit support and anything related to the desktop and GUI interfaces, as well as removing the console interface. That’s right, no console or virtual console to log into, WoW is gone, and access is via PowerShell or Windows Management Instrumentation (WMI) tools from remote systems. How small is it? I have a Nano instance built on a VHDX that is under a GB in size; granted, it’s only for testing. The goal of Nano is to have a very lightweight, streamlined version of Windows Server that can run hundreds (or more) VMs in a small memory footprint, not to mention support lots of containers. Nano is part of Windows Server 2016 TP4; learn more about Nano here in this Microsoft post, including how to get started using it.

    Speaking of VMware, if you have not received an invite yet to their Digital Enterprise February 6, 2016 announcement event, click here to register.

    StorageIOblog Posts

    Recent and popular Server StorageIOblog posts include:

    View other recent as well as past blog posts here

    Server Storage I/O Industry Activity Trends (Cloud, Virtual, Physical)

    Some new Products Technology Services Announcements (PTSA) include:

    • EMC announced Elastic Cloud Storage (ECS) V2.2. A main theme of V2.2 is that, besides being the 3rd generation of EMC object storage (dating back to Centera, then Atmos), ECS is also where the functionality of Centera, Atmos and others converges. ECS provides object storage access along with HDFS (Hadoop and Hortonworks certified) and traditional NFS file access.

      Object storage access includes Amazon S3, OpenStack Swift, ATMOS and CAS (Centera). In addition to the access, added Centera functionality for regulatory compliance has been folded into the ECS software stack. For example, ECS is now compatible with SEC Rule 17a-4(f) and CFTC Regulation 1.31(b)-(c), protecting data from being overwritten or erased for a specified retention period. Other enhancements besides scalability, resiliency and ease of use include metadata and search capabilities. You can download and try ECS for non-production workloads with no capacity or functionality limitations from EMC here.

    View other recent news and industry trends here

    StorageIO Commentary in the news

    Recent Server StorageIO commentary and industry trends perspectives about news, activities, tips and announcements. In case you missed them from last month:

    • TheFibreChannel.com: Industry Analyst Interview: Greg Schulz, StorageIO
    • EnterpriseStorageForum: Comments Handling Virtual Storage Challenges
    • PowerMore (Dell): Q&A: When to implement ultra-dense storage

    View more Server, Storage and I/O hardware as well as software trends comments here

    Vendors you may not have heard of

    Various vendors (and service providers) you may not know or have heard about recently.

    • Datrium – DVX and NetShelf server software defined flash storage and converged infrastructure
    • DataDynamics – StorageX is the software solution for enabling intelligent data migration, including from NetApp OnTap 7 to Clustered OnTap, as well as to and from EMC among other NAS file serving solutions.
    • Paxata – Little and Big Data management solutions

    Check out more vendors you may know, have heard of, or that are perhaps new on the Server StorageIO Industry Links page here (over 1,000 entries and growing).

    StorageIO Tips and Articles

    Recent Server StorageIO articles appearing in different venues include:

    • InfoStor:  Data Protection Gaps, Some Good, Some Not So Good

    And in case you missed them from last month

    • IronMountain:  5 Noteworthy Data Privacy Trends From 2015
    • Virtual Blocks (VMware Blogs):  Part III EVO:RAIL – When And Where To Use It?
    • InfoStor:  Object Storage Is In Your Future
    • InfoStor:  Water, Data and Storage Analogy

    Check out these resources and links on technology, techniques, trends as well as tools. View more tips and articles here

    StorageIO Videos and Podcasts

    StorageIO podcasts are also available via and at StorageIO.tv

    StorageIO Webinars and Industry Events

    EMCworld (Las Vegas) May 2-4, 2016

    Interop (Las Vegas) May 4-6 2016

    NAB (Las Vegas) April 19-20, 2016

    TBA – March 31, 2016

    Redmond Magazine Gridstore (How to Migrate from VMware to Hyper-V) February 25, 2016 Webinar (11AM PT)

    TBA – February 23, 2016

    Redmond Magazine and Dell Foglight – Manage and Solve Virtualization Performance Issues Like a Pro (Webinar 9AM PT) – January 19, 2016

    See more webinars and other activities on the Server StorageIO Events page here.

    From StorageIO Labs

    Research, Reviews and Reports

    Quick Look: What’s the Best Enterprise HDD for a Content Server?

    Insight for Effective Server Storage I/O decision-making
    This StorageIO® Industry Trends Perspectives Solution Brief and Lab Review (compliments of Seagate and Servers Direct) looks at the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate (www.seagate.com) Enterprise Hard Disk Drive (HDDs).

    I was given the opportunity to do some hands-on testing with a 2U content solution platform to see how various Seagate Enterprise 2.5” HDDs handle different application workloads. This includes Seagate’s Enterprise Performance HDDs with the enhanced caching feature.

    Read more in this Server StorageIO industry Trends Perspective white paper and lab review.

    Looking for NVM including SSD information? Visit the Server StorageIO www.thessdplace.com and www.thenvmeplace.com micro sites. View other StorageIO lab review and test drive reports here.

    Server StorageIO Recommended Reading List

    The following is various recommended reading including books, blogs and videos. If you have not done so recently, also check out the Intel Recommended Reading List (here) where you will also find a couple of mine as well as books from others. For this month’s recommended reading, it’s a blog site. If you have not visited Duncan Epping’s (@DuncanYB) Yellow-Bricks site, you should, particularly if you are interested in virtualization, high availability and related topical themes.

    Granted, Duncan, being a member of the VMware CTO office, covers a lot of VMware related themes; however, being the author of several books, he also covers non-VMware related topics. Duncan recently did a really good and simple post about rebuilding a failed disk in a VMware VSAN vs. in a legacy RAID or erasure code based storage solution.

    One of the things that struck me as being important with what Duncan wrote about is avoiding apples to oranges comparisons. What I mean by this is that it is easy to compare traditional parity based or mirror type solutions that chunk or shard data on a KByte basis spread over disks vs. data that is chunked or sharded on a GByte (or larger) basis over multiple servers and their disks. Anyway, check out Duncan’s site and recent post by clicking here.

    Server StorageIO Industry Resources and Links

    Check out these useful links and pages:

    storageio.com/links
    objectstoragecenter.com
    storageioblog.com/data-protection-diaries-main/
    storageperformance.us
    thenvmeplace.com
    thessdplace.com
    storageio.com/performance
    storageio.com/raid
    storageio.com/ssd

    Ok, nuff said

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Server StorageIO December 2015 Update Newsletter


    Volume 15, Issue XII – End of Year (EOY) Edition

    Hello and welcome to this December 2015 Server StorageIO update newsletter.

    Season’s Greetings and Happy New Year.

    Winter has arrived here in the northern hemisphere and it is also the last day of 2015 (e.g. End Of Year or EOY). For some this means relaxing and having fun after a busy year; for others, it’s the last day of the most important quarter of the most important year ever, particularly if you are involved in sales or spending.

    This is also that time of year where predictions for 2016 will start streaming out as well as reflections looking back at 2015 appear (more on these in January). Another EOY activity is planning for 2016 as well as getting items ready for roll-out or launch in the new year. Overall 2015 has been a very good year with many things in the works both public facing, as well as several behind the scenes some of which will start to appear throughout 2016.

    Enjoy this abbreviated edition of the Server StorageIO update newsletter and watch for new tips, articles, predictions, StorageIO lab report reviews, blog posts, videos and podcasts along with in the news commentary appearing soon.

    Thank you for enabling a successful 2015 and wishing you all a prosperous new year in 2016.

    Cheers GS

    In This Issue

  • Tips and Articles
  • Events and Webinars
  • Resources and Links

    StorageIO Tips and Articles

    Recent Server StorageIO articles appearing in different venues include:

    • IronMountain:  5 Noteworthy Data Privacy Trends From 2015
    • Virtual Blocks (VMware Blogs):  Part III EVO:RAIL – When And Where To Use It?
    • InfoStor:  Object Storage Is In Your Future
    • InfoStor:  Water, Data and Storage Analogy

    Check out these resources and links on technology, techniques, trends as well as tools. View more tips and articles here

    StorageIO Webinars and Industry Events

    EMCworld (Las Vegas) May 2-4, 2016

    Interop (Las Vegas) May 4-6 2016

    NAB (Las Vegas) April 19-20, 2016

    Redmond Magazine Gridstore (How to Migrate from VMware to Hyper-V) February 25, 2016 Webinar (11AM PT)

    See more webinars and other activities on the Server StorageIO Events page here.

    Server StorageIO Industry Resources and Links

    Check out these useful links and pages:

    storageio.com/links
    objectstoragecenter.com
    storageioblog.com/data-protection-diaries-main/
    storageperformance.us
    thenvmeplace.com
    thessdplace.com
    storageio.com/raid
    storageio.com/ssd

    Ok, nuff said

    Cheers
    Gs

    How to test your HDD SSD or all flash array (AFA) storage fundamentals

    How to test your HDD SSD AFA Hybrid or cloud storage

    Updated 2/14/2018

    Over at BizTech Magazine I have a new article, 4 Ways to Performance Test Your New HDD or SSD, that provides a quick guide to verifying or learning what speed your new storage device is capable of.

    An out-take from the article used by BizTech as a "tease" is:

    These four steps will help you evaluate new storage drives. And … psst … we included the metrics that matter.

    Building off the basics, server storage I/O benchmark fundamentals

    The four basic steps in the article are:

    • Plan what and how you are going to test (what’s applicable for you)
    • Decide on a benchmarking tool (learn about various tools here)
    • Test the test (find bugs, errors before a long running test)
    • Focus on metrics that matter (what’s important for your environment)

    Server Storage I/O performance

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    To some the above (read the full article here) may seem like common sense tips and things everybody should know. On the other hand, there are many people who are new to servers, storage, I/O networking, hardware, software, cloud and virtual environments along with various applications, not to mention different tools.

    Thus the above is a refresher for some (e.g. déjà vu) while for others it might be new and revolutionary or simply helpful. Interested in HDD’s, SSD’s as well as other server storage I/O performance along with benchmarking tools, techniques and trends? Check out the collection of links here (Server and Storage I/O Benchmarking and Performance Resources).

    Ok, nuff said, for now.

    Gs

    I/O, I/O how well do you know good bad ugly server storage I/O iops?

    How well do you know good bad ugly I/O iops?

    Updated 2/10/2018

    There are many different types of server storage I/O iops associated with various environments, applications and workloads. Some I/O activity is measured in iops, other activity in transactions per second (TPS), files or messages per time (hour, minute, second), gets, puts or other operations. The best IO is one you do not have to do.

    What about all the cloud, virtual, software defined and legacy based applications that still need to do I/O?

    If no IO operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible with the best locality of reference.

    Also keep in mind that aggregation (e.g. consolidation) can cause aggravation (server storage I/O performance bottlenecks).

    Example of aggregation (consolidation) causing aggravation (server storage i/o blender bottlenecks)

    And the third best?

    It’s the one that can be done in the least time, or at the least cost or effect to the requesting application, which means moving further down the memory and storage stack.

    Leveraging flash SSD and cache technologies to find and fix server storage I/O bottlenecks

    On the other hand, any IOP, regardless of whether it is for block, file or object storage, that involves some context is better than one without, particularly when it involves metrics that matter (here, here and here [webinar]).

    Server Storage I/O optimization and effectiveness

    The problem with IO’s is that they are basic operations to get data into and out of a computer or processor, so there’s no way to avoid all of them, unless you have a very large budget. Even if you have a large budget that can afford an all flash SSD solution, you may still meet bottlenecks or other barriers.

    IO’s require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IO’s cannot be eliminated, their impact can be greatly reduced or optimized by, among other techniques, doing fewer of them via caching and by grouping reads or writes (pre-fetch, write-behind).
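
    To make the "best IO is one you do not have to do" idea concrete, here is a minimal Python sketch of a read cache placed in front of a slower backend. It is an illustration only, not how any particular product implements caching: the ReadCache class, the slow_read function and the block numbers are all made up, and a simple least recently used (LRU) eviction policy is assumed.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache in front of a slower backend (illustrative only)."""

    def __init__(self, backend_read, capacity):
        self.backend_read = backend_read
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            # Cache hit: no backend IO is needed.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Cache miss: go to the backend, then keep the block for next time.
        self.misses += 1
        data = self.backend_read(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


backend_ios = []  # records every IO that actually reaches the backend


def slow_read(block_id):
    """Stand-in for a real device read."""
    backend_ios.append(block_id)
    return f"data-{block_id}"


cache = ReadCache(slow_read, capacity=2)
for block in [1, 2, 1, 1, 3, 2]:
    cache.read(block)

print(f"requests=6 hits={cache.hits} misses={cache.misses} backend IOs={len(backend_ios)}")
```

    In this small run two of the six requests are served from the cache, so only four IOs reach the backend. How much a cache helps depends on how often the same data is referenced again, which is the locality of reference discussed below.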

    Think of it this way: Instead of going on multiple errands, sometimes you can group multiple destinations together making for a shorter, more efficient trip. However, that optimization may also mean your drive will take longer. So, sometimes it makes sense to go on a couple of quick, short, low-latency trips instead of one larger one that takes half a day even as it accomplishes many tasks. Of course, how far you have to go on those trips (i.e., their locality) makes a difference about how many you can do in a given amount of time.
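
    The errand analogy can be put into numbers with a small Python sketch. It assumes a very simple cost model in which every trip (batch) has a fixed setup cost plus a cost per item, and the 5 ms and 1 ms values are hypothetical, picked only to show the shape of the trade-off. The function name batch_stats is made up for this example.

```python
def batch_stats(n_requests, batch_size, setup_ms, per_item_ms):
    """Return (total_ms, ms_per_batch) for requests sent in batches.

    setup_ms is the fixed cost paid once per batch (per trip); per_item_ms is
    the cost of each request inside a batch.
    """
    batches = -(-n_requests // batch_size)  # ceiling division
    total_ms = batches * setup_ms + n_requests * per_item_ms
    batch_ms = setup_ms + min(batch_size, n_requests) * per_item_ms
    return total_ms, batch_ms


# Hypothetical costs: 5 ms fixed overhead per trip, 1 ms per request.
for size in (1, 4, 8):
    total, per_batch = batch_stats(8, size, setup_ms=5, per_item_ms=1)
    print(f"batch size {size}: {total} ms in total, {per_batch} ms per batch")
```

    Under these assumed costs, grouping all eight requests cuts the total from 48 ms to 13 ms, but each trip now takes 13 ms instead of 6 ms. That is the same trade-off as the errands: less total work, longer individual trips.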

    Locality of reference (or proximity)

    What is locality of reference?

    This refers to how close (i.e., its place) data exists to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, ready to be acted on immediately. This would be followed by levels 1, 2, and 3 (L1, L2, and L3) onboard caches, followed by main memory, or DRAM. After that comes solid-state memory, typically NAND flash, either on PCIe cards or accessible on a direct attached storage (DAS), SAN, or NAS device.
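
    One way to see why locality matters is to compute an average access time across tiers, weighting each tier by the share of accesses it satisfies. The following Python sketch does that. The tier names follow the text above, but the fractions and latencies are round, hypothetical numbers chosen for illustration and are not measurements of any device.

```python
def effective_latency_ns(tiers):
    """Weighted average latency.

    tiers is a list of (name, fraction_of_accesses, latency_ns) tuples whose
    fractions must add up to 1.0.
    """
    total_fraction = sum(fraction for _, fraction, _ in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions must add up to 1.0")
    return sum(fraction * latency for _, fraction, latency in tiers)


# Hypothetical example values, NOT measured: 95% of accesses served from
# DRAM at 100 ns and 5% from PCIe flash at 100,000 ns.
example = [
    ("DRAM", 0.95, 100),
    ("PCIe flash", 0.05, 100_000),
]
print(f"{effective_latency_ns(example):,.0f} ns average")
```

    With these assumed values, the small share of accesses that go to the slower tier accounts for most of the average, which is why moving even a little more of the working set closer to the processor can pay off.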

    Even though a PCIe NAND flash card is close to the processor, there still remains the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with meta or control information to further optimize and improve locality of reference. In other words, this information is used to help with cache hits, cache use, and cache effectiveness vs. simply boosting cache use.

    SSD to the rescue?

    What can you do to cut the impact of IO’s?

    There are many steps one can take, starting with establishing baseline performance and availability metrics.

    The metrics that matter include IOP’s, latency, bandwidth, and availability. Then, leverage metrics to gain insight into your application’s performance.
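
    These metrics are related, so a baseline for one can be sanity-checked against the others. The sketch below uses two standard relationships: bandwidth is I/O rate times I/O size, and Little's Law ties average latency to outstanding I/Os and I/O rate. The sample values are made up for illustration.

```python
def bandwidth_mb_per_s(iops, io_size_bytes):
    """Bandwidth in decimal MB/s implied by an I/O rate and I/O size."""
    return iops * io_size_bytes / 1_000_000


def average_latency_ms(outstanding_ios, iops):
    """Little's Law: average latency = outstanding I/Os / I/O rate."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return outstanding_ios / iops * 1000.0


# Hypothetical baseline: 10,000 IOPS of 8 KB I/Os with 32 I/Os outstanding.
print(f"{bandwidth_mb_per_s(10_000, 8192):.2f} MB/s")
print(f"{average_latency_ms(32, 10_000):.2f} ms average latency")
```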

    Understand that IO’s are a fact of applications doing work (storing, retrieving, managing data) no matter whether systems are virtual, physical, or running up in the cloud. But it’s important to understand just what a bad IO is, along with its impact on performance. Try to identify those that are bad, and then find and fix the problem, either with software, application, or database changes. Perhaps you need to throw more software caching tools, hypervisors, or hardware at the problem. Hardware may include faster processors with more DRAM and faster internal busses.

    Leveraging local PCIe flash SSD cards for caching or as targets is another option.

    You may want to use storage systems or appliances that rely on intelligent caching and storage optimization capabilities to help with performance, availability, and capacity.

    Where to gain insight into your server storage I/O environment

    There are many tools that can be used to gain insight into your server storage I/O environment across cloud, virtual, software defined and legacy as well as from different layers (e.g. applications, database, file systems, operating systems, hypervisors, server, storage, I/O networking). Many applications along with databases have either built-in or optional tools from their provider, third-party, or via other sources that can give information about work activity being done. Likewise there are tools to dig down deeper into the data infrastructure to see what is happening at the various layers as shown in the following figures.

    application storage I/O performance
    Gaining application and operating system level performance insight via different tools

    windows and linux storage I/O performance
    Insight and awareness via operating system tools on Windows and Linux

    In the above example, Spotlight on Windows (SoW), which you can download for free from Dell here, is shown along with Ubuntu utilities. You could also use other tools to look at server storage I/O performance, including Windows Perfmon among others.

    vmware server storage I/O
    Hypervisor performance using VMware ESXi / vsphere built-in tools

    vmware server storage I/O performance
    Using Visual ESXtop to dig deeper into virtual server storage I/O performance

    vmware server storage i/o cache
    Gaining insight into virtual server storage I/O cache performance

    Wrap up and summary

    There are many approaches to address (e.g. find and fix) vs. simply move or mask data center and server storage I/O bottlenecks. Having insight into and awareness of how your environment and applications behave is important for knowing where to focus resources. Also keep in mind that a bit of flash SSD or DRAM cache in the applicable place can go a long way, while a lot of cache will also cost you cash. Even if you can’t eliminate I/Os, look for ways to decrease their impact on your applications and systems.

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    Keep in mind: SSD including flash and DRAM among others are in your future, the question is where, when, with what, how much and whose technology or packaging.

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    Revisiting RAID data protection remains relevant resource links

    Revisiting RAID data protection remains relevant and resources

    Storage I/O trends

    Updated 2/10/2018

    RAID data protection remains relevant, including erasure codes (EC) and local reconstruction codes (LRC) among other technologies. If RAID were really not relevant anymore (e.g. actually dead), why do some people spend so much time trying to convince others that it is dead, or to use a different RAID level, enhanced RAID, or beyond-RAID and related advanced approaches?

    When you hear RAID, what comes to mind?

    A legacy monolithic storage system that supports narrow 4, 5 or 6 drive wide stripe sets, or a modern system supporting dozens of drives in a RAID group with different options?

    RAID means many things, likewise there are different implementations (hardware, software, systems, adapters, operating systems) with various functionality, some better than others.

    For example, which of the items in the following figure come to mind, or perhaps are new to your RAID vocabulary?

    RAID questions

    There are many variations of RAID storage, some for the enterprise, some for SMB, SOHO or consumer. Some have better performance than others; some perform poorly, for example by causing extra writes, which leads to the perception that all parity-based RAID does extra writes (some actually do write gathering and optimization).

    Some hardware and software implementations use write-back cache (WBC) that is mirrored or battery-backed (BBU), and can group writes together in memory (cache) to do full-stripe writes. The result can be fewer back-end writes compared to other systems. Hence, not all RAID implementations in either hardware or software are the same. Likewise, just because a RAID definition shows a particular theoretical implementation approach does not mean all vendors have implemented it in that way.

    RAID is not a replacement for backup; rather, it is part of an overall approach to providing data availability and accessibility.
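
    To see why write gathering matters, here is a rough count of back-end I/Os for a single-parity (RAID 5 style) group. It is a simplified textbook model, not any vendor's implementation: an ungathered small write is assumed to cost four I/Os (read old data, read old parity, write new data, write new parity), while a gathered full stripe costs one write per data drive plus one parity write.

```python
def raid5_backend_ios(chunks_written, data_drives, gather_full_stripes):
    """Estimate back-end I/Os for writing chunks to a single-parity group."""
    if chunks_written < 0 or data_drives < 2:
        raise ValueError("need chunks_written >= 0 and at least 2 data drives")
    if not gather_full_stripes:
        # Every chunk pays the read-modify-write penalty.
        return chunks_written * 4
    # Gathered writes: whole stripes need no reads, just data plus parity writes.
    full_stripes, leftover = divmod(chunks_written, data_drives)
    # Simplification: leftover chunks fall back to read-modify-write.
    return full_stripes * (data_drives + 1) + leftover * 4


# Hypothetical 4+1 group receiving eight chunk-sized writes.
print("ungathered:", raid5_backend_ios(8, 4, False), "back-end I/Os")
print("gathered:  ", raid5_backend_ios(8, 4, True), "back-end I/Os")
```

    Under these assumptions the same eight writes cost 32 back-end I/Os ungathered and 10 when gathered into two full stripes, which is the gap behind the "parity RAID always does extra writes" perception.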

    data protection and durability

    What’s the best RAID level? The one that meets YOUR needs

    There are different RAID levels and implementations (hardware, software, controller, storage system, operating system, adapter among others) for various environments (enterprise, SME, SMB, SOHO, consumer) supporting primary, secondary, tertiary (backup/data protection, archiving).

    RAID comparison
    General RAID comparisons

    Thus one size or approach does not fit all solutions; likewise RAID rules of thumb or guides need context. Context means that a RAID rule or guide for consumer, SOHO or SMB might be different for enterprise and vice versa, not to mention the type of storage system, number of drives, drive type and capacity among other factors.

    RAID comparison
    General basic RAID comparisons

    Thus the best RAID level is the one that meets your specific needs in your environment. What is best for one environment and application may be different from what is applicable to your needs.

    Key points and RAID considerations include:

    · Not all RAID implementations are the same, some are very much alive and evolving while others are in need of a rest or rewrite. So it is not the technology or techniques that are often the problem, rather how it is implemented and then deployed.

    · It may not be RAID that is dead, rather the solution that uses it. Hence, if you think a particular storage system, appliance, product or software is old and dead along with its RAID implementation, then just say that the product or vendor’s solution is dead.

    · RAID can be implemented in hardware controllers, adapters or storage systems and appliances as well as via software and those have different features, capabilities or constraints.

    · Long or slow drive rebuilds are a reality with larger disk drives and parity-based approaches; however, you have options on how to balance performance, availability, capacity, and economics.

    · RAID can be single, dual or multiple parity or mirroring-based.

    · Erasure and other coding schemes leverage parity schemes and guess what umbrella parity schemes fall under.

    · RAID may not be cool, sexy or a fun topic and technology to talk about, however many trendy tools, solutions and services actually use some form or variation of RAID as part of their basic building blocks. This is an example of using new and old things in new ways to help each other do more without increasing complexity.

    ·  Even if you are not a fan of RAID and think it is old and dead, at least take a few minutes to learn more about what it is that you do not like to update your dead FUD.
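
    The rebuild point in the list above comes down to simple arithmetic: the time to rebuild a drive is at least its capacity divided by the sustained rebuild rate. The sketch below assumes a constant rate and a full-capacity rebuild; the rates shown are hypothetical, and real rebuilds vary with workload, priority settings and implementation.

```python
def rebuild_hours(capacity_tb, rebuild_mb_per_s):
    """Lower-bound rebuild time for a full drive at a constant rebuild rate."""
    if capacity_tb <= 0 or rebuild_mb_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    capacity_mb = capacity_tb * 1_000_000  # decimal TB to MB
    return capacity_mb / rebuild_mb_per_s / 3600


# Hypothetical 4 TB drive at two assumed rebuild rates.
for rate in (50, 100):
    print(f"4 TB at {rate} MB/s: about {rebuild_hours(4, rate):.1f} hours")
```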

    Wait, Isn’t RAID dead?

    There is some dead marketing that paints a broad picture that RAID is dead to prop up something new, which in some cases may be a derivative variation of parity RAID.

    data dispersal
    Data dispersal and durability

    RAID rebuild improving
    RAID continues to evolve with rapid rebuilds for some systems

    OTOH, there are some specific products, technologies and implementations that may be end of life or actually dead. Likewise, what might be dead, dying or simply not in vogue are specific RAID implementations or packaging. Certainly there is a lot of buzz around object storage, cloud storage, forward error correction (FEC) and erasure coding, including messages of how they cut RAID. The catch is that some object storage solutions are overlaid on top of lower-level file systems that do things such as RAID 6, granted they are out of sight, out of mind.

    RAID comparison
    General RAID parity and erasure code/FEC comparisons

    Then there are advanced parity protection schemes, including FEC and erasure codes. While they are not your traditional RAID levels, they have similar characteristics, including chunking or sharding data and spreading it out over multiple devices with multiple parity (or derivatives of parity) protection.
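
    One way to compare mirroring, parity RAID and erasure codes is to describe each as some number of data fragments plus some number of protection fragments. The sketch below does that under the assumption that any combination of failures up to the protection count can be tolerated (true of Reed-Solomon style codes; schemes such as LRC trade some of that for faster repair). The layouts listed are generic examples, not a specific product's configuration.

```python
def protection_profile(data_fragments, parity_fragments):
    """Capacity and failure-tolerance summary for a data + parity layout."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and non-negative parity")
    total = data_fragments + parity_fragments
    return {
        "total_devices": total,
        "usable_fraction": data_fragments / total,
        "overhead_ratio": total / data_fragments,
        "tolerated_failures": parity_fragments,
    }


# Generic example layouts (data, parity); assumed for illustration.
layouts = {"mirror 1+1": (1, 1), "single parity 4+1": (4, 1),
           "dual parity 4+2": (4, 2), "erasure code 10+4": (10, 4)}
for name, (k, m) in layouts.items():
    p = protection_profile(k, m)
    print(f"{name}: {p['usable_fraction']:.0%} usable, survives {p['tolerated_failures']} failures")
```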

    Bottom line is that for some environments, different RAID levels may be more applicable and alive than for others.

    Via BizTech – How to Turn Storage Networks into Better Performers

    • Maintain Situational Awareness
    • Design for Performance and Availability
    • Determine Networked Server and Storage Patterns
    • Make Use of Applicable Technologies and Techniques

    If RAID is alive, what to do with it?

    If you are new to RAID, learn more about the past, present and future, keeping context in mind. Keeping context in mind means that there are different RAID levels and implementations for various environments. Not all RAID 0, 1, 1/0, 10, 2, 3, 4, 5, 6 or other variations (past, present and emerging) are the same for consumer vs. SOHO vs. SMB vs. SME vs. Enterprise, nor are the usage cases. Some need performance for reads, others for writes, some for high capacity with low performance using hardware or software. RAID rules of thumb are ok and useful, however keep them in context to what you are doing as well as using.

    What to do next?

    Take some time to learn and ask questions, including what to use when, where, why and how, as well as whether an approach or recommendation is applicable to your needs. Check out the following links to read some extra perspectives about RAID and keep in mind that what might apply to enterprise may not be relevant for consumer or SMB and vice versa.

    Some advise needed on SSD’s and Raid (Via Spiceworks)
    RAID 5 URE Rebuild Means The Sky Is Falling (Via BenchmarkReview)
    Double drive failures in a RAID-10 configuration (Via SearchStorage)
    Industry Trends and Perspectives: RAID Rebuild Rates (Via StorageIOblog)
    RAID, IOPS and IO observations (Via StorageIOBlog)
    RAID Relevance Revisited (Via StorageIOBlog)
    HDDs Are Still Spinning (Rust Never Sleeps) (Via InfoStor)
    When and Where to Use NAND Flash SSD for Virtual Servers (Via TheVirtualizationPractice)
    What’s the best way to learn about RAID storage? (Via Spiceworks)
    Design considerations for the host local FVP architecture (Via Frank Denneman)
    Some basic RAID fundamentals and definitions (Via SearchStorage)
    Can RAID extend nand flash SSD life? (Via StorageIOBlog)
    I/O Performance Issues and Impacts on Time-Sensitive Applications (Via CMG)
    The original RAID white paper (PDF) by Katz, Gibson, Patterson et al, which while over 20 years old provides a basis, foundation and some history
    Storage Interview Series (Via Infortrend)
    Different RAID methods (Via RAID Recovery Guide)
    A good RAID tutorial (Via TheGeekStuff)
    Basics of RAID explained (Via ZDNet)
    RAID and IOPs (Via VMware Communities)

    What This All Means

    What is my favorite or preferred RAID level?

    That depends. For some things it’s RAID 1, for others RAID 10, for others RAID 4, 5, 6 or DP, and yet other situations could be a fit for RAID 0 or erasure codes and FEC. Instead of being focused on just one or two RAID levels as the solution for different problems, I prefer to look at the environment (consumer, SOHO, small or large SMB, SME, enterprise), type of usage (primary or secondary or data protection), performance characteristics, reads, writes, type and number of drives among other factors. What might be a fit for one environment would not be a fit for others, thus my preferred RAID level, along with where it is implemented, is the one that meets the given situation. However, also keep in mind tying RAID into an overall data protection strategy; remember, RAID is not a replacement for backup.

    What this all means

    Like other technologies that have been declared dead for years or decades, aka the Zombie technologies (e.g. dead yet still alive), RAID continues to be used while the technology evolves. There are specific products, implementations or even RAID levels that have faded away, or are declining in some environments, yet alive in others. RAID and its variations are still alive, however how it is used or deployed in conjunction with other technologies is also evolving.

    Ok, nuff said, for now.

    Gs

    CompTIA needs input for their Storage+ certification, can you help?

    The CompTIA folks are looking for some comments and feedback from those who are involved with data storage in various ways as part of planning for their upcoming enhancements to the Storage+ certification testing.

    As a point of disclosure, I am a member of the CompTIA Storage+ certification advisory committee (CAC), however I don’t get paid or receive any other type of remuneration for contributing my time to give them feedback and guidance, other than a thank-you or an Atta boy for giving back and playing it forward to help others in the IT community, similar to what my predecessors did.

    I have been asked to pass this along to others (e.g. you or whoever forwards it on to you).

    Please take a few moments and feel free to share with others this link here to the survey for CompTIA Storage+.

    What they are looking for is to validate the exam blueprint generated from a recent Job Task Analysis (JTA) process.

    In other words, does the certification exam show real-world relevance to what you and your associates may be doing with data storage?

    This is opposed to being aligned with those whose job it is to create test questions and who may not understand what it is you, the IT pro involved with storage, do or do not do.

    If you have ever taken a certification exam and scratched your head or wondered why some questions that seem to lack real-world relevance were included, while ones based on practical on-the-job experience were missing, here’s your chance to give feedback.

    Note that you will not be rewarded with an Amex or Amazon gift card, Starbucks or Dunkin Donuts certificates, free software download or some other incentive to play and win, however if you take the survey let me know and I will be sure to tweet you an Atta boy or Atta girl! However they are giving away a free T-shirt to every tenth survey taker.

    Btw, if you really need something for free, send me a note (I’m not that difficult to find) as I have some free copies of Resilient Storage Networks (RSN): Designing Flexible Scalable Data Infrastructures (Elsevier); you simply pay shipping and handling. RSN can be used to help prepare you for various storage testing as well as other day-to-day activities.

    CompTIA is looking for survey takers who have some hands-on experience or are involved with data storage (e.g. if you can spell SAN, NAS, Disk or SSD and work with them hands-on then you are a candidate ;).

    Welcome to the CompTIA Storage+ Certification Job Task Analysis (JTA) Survey

  • Your input will help CompTIA evaluate which test objectives are most important to include in the CompTIA Storage+ Certification Exam
  • Your responses are completely confidential.
  • The results will only be viewed in the aggregate.
  • Here is what (and whom) CompTIA is looking for feedback from:

  • Has at least 12 to 18 months of experience with storage-related technologies.
  • Makes recommendations and decisions regarding storage configuration.
  • Facilitates data security and data integrity.
  • Supports a multiplatform and multiprotocol storage environment with little assistance.
  • Has basic knowledge of cloud technologies and object storage concepts.
  • As a small token of CompTIA appreciation for your participation, they will provide an official CompTIA T-shirt to every tenth (1 of every 10) person who completes this survey. Go here for official rules.

    Click here to complete the CompTIA Storage+ survey

    Contact CompTIA with any survey issues, research@comptia.org

    What say you, take a few minutes like I did and give some feedback, you will not be on the hook for anything, and if you do get spammed by the CompTIA folks, let me know and I in turn will spam them back for spamming you as well as me.

    Ok, nuff said

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Is there an information or data recession? Are you using less storage? (With Polls)

    StorageIO industry trends

    Is there an information recession where you are creating, processing, moving or saving less data?

    Are you using less data storage than in the past either locally online, offline or remote including via clouds?

    IMHO there is no such thing as a data or information recession, granted storage is being used more effectively by some, while economic pressures or competition enable your budgets to be stretched further. Likewise people and data are living longer and getting larger.

    In conversations with IT professionals, particularly the real customers (e.g. not vendors, VAR’s, analysts, blogalysts, consultants or media), I routinely hear from people that they continue to have the need to store more information, however their data storage usage and acquisition patterns are changing. For some this means using what they have more effectively, leveraging data footprint reduction (DFR), which includes archiving, compression, dedupe, thin provisioning, and changing how and when data is protected. This also means using different types of storage from flash SSD to HDD to SSHD to tape summit resources as well as cloud in different ways spanning block, file and object storage local and remote.

    A common question that comes up, particularly around vendor earnings announcement times, is whether the data storage industry is in decline, with some vendors experiencing poor results.

    Look beyond vendor revenue metrics

    As background reading, you might want to check out this post here (IT and storage economics 101, supply and demand), which candidly should be common sense.

    If all you looked at were a vendor’s revenue or margin numbers as an indicator of how well the data storage industry (including traditional, legacy as well as cloud) is doing, you would not be getting the whole picture.

    What needs to be factored into the picture is how much storage is being shipped (from components such as drives to systems and appliances) as well as delivered by service providers.

    Looking at storage systems vendors from a revenue earnings perspective, you would get mixed indicators depending on who you include, not to mention how those vendors report the breakdown of revenues by product, or number of units shipped. For example, looking at public vendors EMC, HDS, HP, IBM, NetApp, Nimble and Oracle (among others) as well as the private ones (if you can see the data) such as Dell, Pure, Simplivity, Solidfire and Tintri results in different analysis. Some are doing better than others on revenues and margins, however try to get clarity on number of units or systems shipped (for actual revenue vs. loaners (planting seeds for future revenue or trials) or demos).

    Then look at the service providers such as AWS, CenturyLink, Google, HP, IBM, Microsoft, Rackspace or Verizon (among others). You should see growth, however clarity about how much they are actually generating on revenues plus margin for storage specifically vs. broad general buckets can be tricky.

    Now look at the component suppliers such as Seagate and Western Digital (WD) for HDDs and SSHDs, who also provide flash SSD drives and other technology. Also look at the other flash component suppliers such as Avago/LSI (whose flash business is being bought by Seagate), FusionIO, SanDisk, Samsung, Micron and Intel among others (this does not include the systems vendors who OEM those or other products to build systems or appliances). These and other component suppliers can give another indicator as to the health of the industry both from revenue and margin, as well as footprint (e.g. how many devices are being shipped). For example, the legacy and startup storage systems and appliance vendors may have soft or lower revenue numbers, however are they shipping the same or less product? Likewise the cloud or service providers may be showing more revenues and product being acquired, however at what margin?

    What this all means?

    Growing amounts of information?

    Look at revenue numbers in the proper context as well as in the bigger picture.

    If the same number of component devices (e.g. processors, HDD, SSD, SSHD, memory, etc) are being shipped or more, that is an indicator of continued or increased demand. Likewise if there is more competition and options for IT organizations there will be price competition between vendors as well as service providers.

    All of this means that while IT organizations’ budgets stay stretched, their available dollars or euros should be able to buy (or rent) them more storage space capacity.

    Likewise using various data and storage management techniques including DFR, the available space capacity can be stretched further.
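
    The budget-stretching argument can be put as simple arithmetic. The sketch below uses invented budget, price and reduction figures purely for illustration; the function name and the way DFR is reduced to a single ratio are assumptions made for the example.

```python
def effective_capacity_tb(budget, price_per_tb, dfr_ratio=1.0):
    """Data that fits within a budget, given a price per TB and a DFR ratio.

    dfr_ratio is the assumed data footprint reduction, e.g. 2.0 means each
    physical TB holds 2 TB of data after compression, dedupe and so on.
    """
    if budget < 0 or price_per_tb <= 0 or dfr_ratio <= 0:
        raise ValueError("budget must be >= 0; price and ratio must be positive")
    raw_tb = budget / price_per_tb
    return raw_tb * dfr_ratio


# Hypothetical flat budget as the price per TB falls and DFR is applied.
budget = 100_000
print(effective_capacity_tb(budget, 500))        # last year's price, no DFR
print(effective_capacity_tb(budget, 250))        # price competition halves cost
print(effective_capacity_tb(budget, 250, 2.0))   # plus an assumed 2:1 DFR
```

    With these assumed numbers a vendor's revenue from this buyer stays flat at the same budget while the capacity delivered quadruples, which is why revenue alone is a poor indicator of storage demand.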

    So this then begs the question of if the management of storage is important, why are we not hearing vendors talking about software defined storage management vs. chasing each other to out software define storage each other?

    Ah, that’s for a different post ;).

    So what say you?

    Are you using less storage?

    Do you have less data being created?

    Are you using storage and your available budget more effectively?

    Please take a few minutes and cast your vote (and see the results).

    Sorry I have no Amex or Amazon gift cards or other things to offer you as a giveaway for participating as nobody is secretly sponsoring this poll or post, it’s simply sharing and conveying information for you and others to see and gain insight from.

    Do you think that there is an information or data recession?

    How about are you using or buying more storage, could there be a data storage recession?

    Some more reading links

    IT and storage economics 101, supply and demand
    Green IT deferral blamed on economic recession might be result of green gap
    Industry trend: People plus data are aging and living longer
    Is There a Data and I/O Activity Recession?
    Supporting IT growth demand during economic uncertain times
    The Human Face of Big Data, a Book Review
    Garbage data in, garbage information out, big data or big garbage?
    Little data, big data and very big data (VBD) or big BS?

    Ok, nuff said (for now)

    Cheers gs

    March 2014 StorageIO Update Newsletter : Cisco Cloud, VMware VSAN and More

    Industry Trends Perspectives: Cisco Cloud and VMware VSAN

    Welcome to the March 2014 edition of the StorageIO Update (newsletter) containing trends and perspectives on cloud, virtualization and data infrastructure topics. Technically it is now spring here in North America, and to say that we have had abnormally cold weather would be an understatement. However it is March with April just around the corner, meaning plenty to do including several upcoming events (see below).

    Clouds and Cisco

    Some recent industry activity has included Cisco announcing its Cloud intentions (e.g. more than simply selling servers and networking hardware). So far the Cisco Cloud move appears to be more about hybrid and partner ecosystem including channels vs. going toe to toe with Amazon Web Services (AWS). Cisco appears to be playing the hybrid theme of being a technology supplier as well as provider or partner. Thus, it looks like for the near term the Cisco cloud target is not as much AWS as the likes of an IBM (who recently added Softlayer) or an HP.

    Greg Schulz Storage I/O
    Greg Schulz on break

    This will also be interesting to watch where along with how other Cisco partners such as EMC, Microsoft, NetApp, VCE and VMware participate. Keep in mind that some of these and other Cisco partners also have their own public, private and hybrid cloud initiatives, services along with being a supplier to each other.

    VMware VSAN Software Defined Storage

    Another industry activity involving server, storage, I/O networking hardware, software and virtualization (aka software defined) was the general availability (GA) announcement by VMware of Virtual SAN (VSAN). VMware VSAN went into public beta shortly after the VMworld 2013 timeframe, when many of us downloaded, installed and did various types of testing with it.

    For those not familiar with VSAN, it is added licensed software functionality for VMware that creates a cluster to host Virtual Machines (VMs) along with its own shared resilient storage solution (e.g. Software Defined Storage). VSAN works by using dedicated PCIe, SAS or SATA direct attached storage (DAS) devices that are local to the VMware host server (physical machine or PM). The VMware host PMs support DAS Hard Disk Drives (HDD), Solid State Devices (SSD) including PCIe cards, drives or DIMMs, along with Solid State Hybrid Drives (SSHD). This local DAS storage is served and shared among the nodes (up to 32 hosts or PMs) per VSAN cluster, balancing performance, availability (and resiliency) along with space capacity to host VM objects. Note that VM objects include VMDKs (e.g. virtual disks) and are not to be confused with the other type of object storage or access such as CDMI/SWIFT/S3/HTTP/REST.

    VMs (and those managing them) see in the VSAN cluster datastores that are familiar from other VMware implementations, including storage policies and other tools. Here is a link to a great piece by Patrick Schulz, a data infrastructure systems engineer in Germany (no relation, at least not that I know of yet), where he shares his experiences with VSAN implementation.

    storage I/O vsan
    Generic VSAN example

    Instead of using an external iSCSI, Fibre Channel (FC) or FC over Ethernet (FCoE) shared SAN or NAS storage system / appliance to create the storage repository, local DAS is leveraged in groups spread across the hosts in the VSAN cluster (up to 32 nodes). VSAN requires a percentage of SSD for each storage group on the host cluster nodes, part of which is used for caching data that is persistently stored on HDD-based media.

    VSAN software is licensed by the number of active sockets (not the cores) in the host servers (PM) that are in the cluster or by number of VDI users (guest VMs). For example, if there are four servers, two with one socket and two with dual sockets, there would be six socket licenses. MSRP license cost per processor socket is $2,495 USD, which also assumes core VMware licenses already exist. There is also a per guest VM license of $50 per VDI instance, as well as other optional license models and bundles with different features or upgrades.

    What is different with VSAN vs. other VMware clusters is that the storage is only accessible to VMs that are in the VSAN cluster (unless a VM exports and serves it to others via NFS, iSCSI, etc., which is a different conversation for another day). Another difference is that today VSAN leverages storage inside of servers or direct attached as opposed to using iSCSI, FC, FCoE SAN or NAS storage systems.
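
    The licensing example above works out as follows. This sketch only uses the MSRP figures quoted in the text; it ignores bundles, discounts and the underlying VMware licenses, and the function names are made up for the example.

```python
SOCKET_LICENSE_USD = 2495   # MSRP per active processor socket, as quoted above
VDI_LICENSE_USD = 50        # per guest VM (VDI instance), as quoted above


def socket_license_cost(sockets_per_host):
    """Total socket-based license cost for a list of per-host socket counts."""
    return sum(sockets_per_host) * SOCKET_LICENSE_USD


def vdi_license_cost(vdi_instances):
    """Total cost when licensing by VDI instance instead."""
    return vdi_instances * VDI_LICENSE_USD


# Four servers: two with one socket and two with dual sockets (six sockets).
hosts = [1, 1, 2, 2]
print(sum(hosts), "socket licenses cost", socket_license_cost(hosts), "USD")
# Hypothetical alternative: 100 VDI instances licensed per guest VM.
print("100 VDI instances cost", vdi_license_cost(100), "USD")
```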

    Btw, the current maximum LUN, volume or target storage device size is 4TB so if you were thinking of taking a SAS attached storage system and creating a bunch of small LUNs, you might want to review that from a cost perspective, or at least for today.

    There is much more to VSAN including how it works, what it can and cannot do, who it is for and who should not use it for different apps; however, IMHO, besides lower-end, SMB, workgroup, departmental and VMware-centric environments, the number one scenario today is VDI, along with where converged solutions such as those from Nutanix, Simplivity and Tintri among others are playing.

    Watch for more StorageIO posts, commentary, perspectives, presentations, webinars, tips and events on information and data infrastructure topics, themes and trends. Data Infrastructure topics include among others cloud, virtual, legacy server, storage I/O networking, data protection, hardware and software.

    Check out our backup, restore, BC, DR and archiving page (under the resources section on StorageIO.com) for extra content.

    StorageIO Industry Trends and Perspectives
    Industry trends tips, commentary, articles and blog posts
    What is being seen, heard and talked about while out and about

    The following is a synopsis of some StorageIOblog posts, articles and comments in different venues on various industry trends and perspectives about clouds, virtualization, data and storage infrastructure, among related themes.

    StorageIO in the news
    Recent StorageIO comments and perspectives in the news

    SearchSolidStateStorage: Comments on automated storage tiering and flash
    EnterpriseStorageForum: Comments on Cloud-Storage Mergers and Acquisitions
    SearchDataBackup: Comments on near-CDP nudging true CDP from landscape
    EnterpriseStorageForum: Comments on Ways to Avoid Cloud Storage Pricing Surprises
    SearchDataBackup: Q&A: Snapshot, replication ‘great approach’ for data protection
    SearchDataBackup: Comments on LTFS-enabled products

    StorageIO tips and articles
    Recent StorageIO tips and articles in various venues

    InformationSecurityBuzz: Dark Territories – Do You Know Where Your Information Is?
    InformationSecurityBuzz: Rings Of Security For Data Protection Or For Appearance?
    SearchSolidStateStorage: Q&A on automated storage tiering and flash
    SpiceWorks: My copies were corrupted: The 3-2-1 data protection rule

    StorageIOblog posts
    Recent StorageIOblog posts and perspectives

  • Missing MH370 reminds us, do you know where your digital assets are?
  • Old School, New School, Current and Back to School (includes a poll)
  • USENIX FAST (File and Storage Technologies) 2014 Proceedings
  • Spring 2014 StorageIO Events and Activities Update
  • Review – iVMcontrol iPhone VMware management, iTool or iToy?
  • February 2014 Server StorageIO Update Newsletter
  • Remember to check out our objectstoragecenter.com page where you will find a growing collection of information and links on cloud and object storage themes, technologies and trends from various sources.

    Server and StorageIO seminars, conferences, web casts, events, activities
    StorageIO activities (out and about)

    Seminars, symposium, conferences, webinars
    Live in person and recorded recent and upcoming events

    The StorageIO calendar continues to evolve, here are some recent and upcoming activities.


    June 12, 2014: The Many Facets of Virtual Storage and Software Defined Storage Virtualization (Webinar, 9AM PT)
    June 11, 2014: The Changing Face and Landscape of Enterprise Storage (Webinar, 9AM PT)
    May 16, 2014: What you need to know about virtualization (Demystifying Virtualization) (Nijkerk Holland)
    May 15, 2014: Data Infrastructure Industry Trends: What’s New and Trending (Nijkerk Holland)
    May 14, 2014: To be announced (Nijkerk Holland)
    May 13, 2014: Data Movement and Migration: Storage Decision Making Considerations (Nijkerk Holland)
    May 12, 2014: Rethinking Business Resiliency: From Disaster Recovery to Business Continuance (Nijkerk Holland)
    May 5-7, 2014: EMC World (Las Vegas)
    April 22-23, 2014: SNIA DSI Event, presenting The “Cloud” Hybrid Home Run, Life beyond the Hype (Santa Clara CA)
    April 16, 2014: Open Source and Cloud Storage – Enabling business, or a technology enabler? (Webinar, 9AM PT)
    April 9, 2014: Storage Decision Making for Fast, Big and Very Big Data Environments (Webinar, 9AM PT)
    April 8, 2014: NAB National Association Broadcasters (e.g. Very Big Fast data Event) (Las Vegas)
    March 27, 2014: Keynote: The 2017 Datacenter – PREPARING FOR THE 2017 DATACENTER SESSIONS (Edina, 8:00AM)

    Click here to view other upcoming along with earlier event activities. Watch for more 2014 events to be added soon to the StorageIO events calendar page. Topics include data protection modernization (backup/restore, HA, BC, DR, archive), data footprint reduction (archive, compression, dedupe), storage optimization, SSD, object storage, server and storage virtualization, big data, little data, cloud and object storage, performance and management trends among others.

    Vendors, VAR’s and event organizers, give us a call or send an email to discuss having us involved in your upcoming pod cast, web cast, virtual seminar, conference or other events.

    Thank you to the current StorageIOblog.com site sponsor advertisers

    Druva (End Point Data Protection)
    Unitrends (Enterprise backup solution and management tools)
    Veeam (VMware and Hyper-V virtual server backup and data protection tools).

    Contact StorageIO to learn about sponsorship and other partnership opportunities.

    Click here to view earlier StorageIO Update newsletters (HTML and PDF versions). Subscribe to this newsletter (and pass it along). View archives of past StorageIO Update newsletters as well as download PDF versions at: www.storageio.com/newsletter

    Ok, nuff said (for now)

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Dell Inspiron 660 i660, Virtual Server Diamond in the rough?

    During the 2013 post-Thanksgiving Black Friday shopping day, I did some on-line buying including a Dell Inspiron 660 i660 (5629BK) to be used as a physical machine (PM) or VMware host (among other things).

    Now technically I know, this is a workstation or desktop and thus not what some would consider a server, however as another PM to add to my VMware environment (or be used as a bare metal platform), it is a good companion to my other systems.

    Via Dell.com Dell 660 i660

    Taking a step back, needs vs. wants

    Initially my plan for this other system was to go with a larger, more expensive model with as many DDR3 DIMM (memory) and PCIe x4/x8/x16 expansion slots as possible. Some of my other criteria were PCIe Gen 3, latest Intel processor generation with VT (Virtualization Technology) and Extended Page Tables (EPT) for server virtualization support without breaking my budget. Heck, I would love a Dell VRTX or some similar types of servers from the likes of Cisco, HP, IBM, Lenovo, Supermicro among many others. On the other hand, I really don’t need one of those types of systems yet, unless of course somebody wants to send some to play with (excuse me, test drive, try-out).

    Hence needs are what I must have or need, while wants are those things that would be, well, nice to have.

    Server shopping and selection

    In the course of shopping around and looking at alternatives, I had previously talked with Robert Novak (aka @gallifreyan), and he reminded me to think outside the box a bit, literally. Check out Robert's blog (aka rsts11, a great blog name btw for those of us who used to work with RSTS, RSX and others) including a post he did shortly after I had a conversation with him. If you read his post and continue through this one, you should be able to connect the dots.

    While I still have a need and plans for another server with more PCIe and DDR3 (maybe wait for DDR4? ;) ) slots, I found a Dell Inspiron 660.

    Candidly normally I would have skipped over this type or class of system, however what caught my eye was that while limited to only two DDR3 DIMM slots and a single PCIe x16 slot, there were three extra x1 slots which while not as robust, certainly gave me some options if I need to use those for older, slower things. Likewise leveraging higher density DIMM’s, the system is already now at 16GB RAM waiting for larger DIMM’s if needed.

    VMware view of Inspiron 660

    The Dell Inspiron 660-i660 I found had a price of a little over $550 (delivered) with an Intel i5-3330 processor (quad-core, quad thread 3GHz clock), PCIe Gen 3, one PCIe x16 and three PCIe x1 slots, 8GB DRAM (since reallocated), GbE port and built-in WiFi, Windows 8 (since P2V and moved into the VMware environment), keyboard and mouse, plus a 1TB 6Gb SATA drive. At that price I could afford two, maybe three or four of these in place of a larger system (at least for now). While for some things I have a need for a single larger server, there are other things where having multiple smaller ones with enough processing performance, VT and EPT support comes in handy (if not required for some virtual servers).

    Among the enhancements that I made once the initial setup of the Windows system was complete were doing a clone and P2V of that image, and then redeploying the 1TB SATA drive to join others in the storage pool. Thus the 1TB SATA HDD has been replaced with (for now) a 500GB Momentus XT HHDD, which by the time you read this could already have changed to something else.

    Another enhancement was bumping up the memory from 8GB to 16GB, and then adding a StarTech enclosure (see below) for more internal SAS / SATA storage (it supports both 2.5" SAS and SATA HDD’s as well as SSD’s). In addition to the on-board SATA drive port plus one being used for the CD/DVD, there are two more ports for attaching to the StarTech or other large 3.5" drives that live in the drive bay. Depending on what I’m using this system for, it has different types of adapters for external expansion or networking, some of which have already included 6Gbps and 12Gbps SAS HBA’s.

    What about adding more GbE ports?

    As this is not a general purpose larger system with many PCIe expansion slots, that is one of the downsides you get for this cost. However depending on your needs, you have some options. For example I have some Intel PCIe x1 GbE cards to give extra networking connectivity if or when needed. Note however that as these are PCIe x1 slots they are PCIe Gen 1, so from a performance perspective exercise caution when mixing these with other newer, faster cards when performance matters (more on this in the future).
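
    As a rough way to size up what a given slot can carry, the following Python sketch compares theoretical per-direction PCIe bandwidth with the line rate of a GbE port. The per-lane transfer rates and encodings are the published PCIe Gen 1 to Gen 3 figures; the function names are illustrative, and protocol overhead beyond line encoding is ignored, so real-world numbers will be lower.

```python
# (transfer rate in GT/s per lane, encoding efficiency) per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def pcie_mb_per_s(generation, lanes=1):
    """Theoretical one-direction bandwidth in MB/s for a PCIe link."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    bits_per_s = gt_per_s * 1e9 * efficiency * lanes
    return bits_per_s / 8 / 1e6


def gbe_mb_per_s(ports=1):
    """Raw line rate in MB/s for one or more 1 Gb Ethernet ports."""
    return ports * 1e9 / 8 / 1e6


for gen, lanes in [(1, 1), (3, 1), (3, 16)]:
    print(f"PCIe Gen {gen} x{lanes}: {pcie_mb_per_s(gen, lanes):,.0f} MB/s per direction")
print(f"Single GbE port: {gbe_mb_per_s():,.0f} MB/s")
```

    Under these assumptions a Gen 1 x1 slot (about 250 MB/s) leaves headroom for a single GbE port (125 MB/s), while a 6Gbps or 12Gbps SAS adapter would be better placed in the x16 slot.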

    Via Amazon.com Intel (Gigabit CT PCI-E Network Adapter EXPI9301CTBLK)

    One of the caveats to be aware of if you are going to be using VMware vSphere/ESXi is that the Realtek GbE NIC on the Dell Inspiron 660-i660 may not play well, however there are workarounds. Check out some of the workarounds over at the Kendrick Coleman (@KendrickColeman) and Erik Bussink (@ErikBussink) sites, both of which were very helpful, and I can report that the Realtek GbE is working fine with VMware ESXi 5.5a.

    Need some extra SAS and SATA internal expansion slots for HDD and SSD’s?

    The StarTech 4 x 2.5″ SAS and SATA internal enclosure supports various speed SSD and HDD’s depending on what you connect the back-end connector port to. On the back of the enclosure chassis there is a connector that is a pass-thru to the SAS drive interface that also accepts SATA drives. This StarTech enclosure fits nicely into an empty 5.25″ CD/DVD expansion bay; you then attach the individual drive bays to your internal motherboard SAS or SATA ports, or to those on another adapter.

    Via Amazon.com StarTech 4 x 2.5" SAS and SATA internal enclosure

    So far I have used these enclosures attached to various adapters at different speeds as well as with HDD, HHDD, SSHD and SSD’s at various SAS/SATA interface speeds up to 12Gbps. Note that unlike some other enclosures that have a SAS or SATA expander, the drive bays in the StarTech are pass-thru, hence are not regulated by an expander chip and its speed. These StarTech enclosures are priced around $60-90 USD and are good for internal storage expansion (hmm, need to build your own NAS or VSAN or storage server appliance? ;) ).

    Via Amazon Molex power connector

    Note that you will also need to get a Molex power connector to go from the back of the drive enclosure to an available power port, such as one for an expansion DVD/CD drive. You can find these at a Radio Shack, Fry’s or many other venues for a couple of dollars. Double check your specific system and cable connector leads to verify what you will need.

    How is it working and performing

    So far so good. In addition to using it for some initial calibration and validation activities, the 660 is performing very well and there is no buyer's remorse. Ok, sure, I would like more PCIe Gen 3 x4/x8/x16 slots or an extra on-board Ethernet port, however all the other benefits have outweighed those drawbacks.

    Speaking of which, if you think a SSD (or other fast storage device) is fast on a 6Gbps SAS or PCIe Gen 2 interface for physical or virtual servers, wait until you experience those IOPs or latencies at 12Gbps SAS and PCIe Gen 3 with a faster current generation Intel processor, just saying ;)…

    Server and Storage I/O IOPS and VMware

    In the above chart a Windows 7 64 bit system (VM configured with 14GB DRAM) on VMware vSphere V5.5.1 is shown running on different hardware configurations. The Windows system is running Futuremark PCMark 7 Pro (v1.0.4). From left to right, the first result is the Windows VM on the Dell Inspiron 660 with 16GB physical DRAM using a SSHD (Solid State Hybrid Drive). Second from the left shows results running on a Dell T310 with an Intel X3470 processor, also on a SSHD. Middle is the workload on the Dell 660 running on a HHDD, second from right is the workload on the Dell T310 also on a HHDD, while on the right is the same workload on an HP DCS5800 with an Intel E8400. The workload results show a composite score plus system storage, simulated user productivity, lightweight processing, and compute intensive tasks.

    Futuremark PCMark

    Don’t forget about the KVM (Keyboard Video Mouse)

    Mention KVM to many people in and around the server, storage and virtualization world and they think KVM as in the hypervisor, however to others it means Keyboard, Video and Mouse, aka the other KVM. As part of my recent and ongoing upgrades, it was also time to upgrade from the older smaller KVM’s to a larger, easier to use model. The benefit is support for growth while also being easier to work with. Having done some research on various options that also varied in price, I settled on the StarTech shown below.

    Via Amazon.com StarTech 8 Port 1U USB KVM Switch

    What’s cool about the above 8 port StarTech KVM switch is that it comes with 8 cables (there are 8 ports) that on one end look like a regular VGA monitor screen cable connector. However on the other end that attaches to your computer, there is the standard VGA connection that attaches to your video out, and a short USB tail cable that attaches to an available USB port for keyboard and mouse. Needless to say it helps to cut down on the cable clutter while coming in around $38.00 USD per server port being managed, or about a dollar a month over a little over three years.
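
    For those who like to check the math, here is a small Python sketch of the per-port figure. The $38 per port and 8 ports come from the paragraph above; the implied total switch price is derived from those two numbers rather than quoted from a price list, and the function name is illustrative.

```python
COST_PER_PORT_USD = 38.00  # approximate per-port price from the text
PORTS = 8                  # the switch has 8 ports


def amortization(cost_usd, usd_per_month=1.00):
    """Return (years, remaining months) needed to spread cost_usd at usd_per_month."""
    months = round(cost_usd / usd_per_month)
    return divmod(months, 12)


years, months = amortization(COST_PER_PORT_USD)
print(f"Per port: ${COST_PER_PORT_USD:.2f} is about $1/month for {years} years {months} months")
print(f"Implied switch price: about ${COST_PER_PORT_USD * PORTS:.2f}")
```

    At a dollar a month, $38 per port spreads over 38 months, which is the "little over three years" mentioned above.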

    Word of caution on makes and models

    Be advised that there are various makes and models of the Dell Inspiron available that differ in the processor generation and thus feature set included. Pay attention to which make or model you are looking at as the prices can vary, hence double-check the processor make and model and then visit the Intel site to see if it is what you are expecting. For example I double checked that the processor for the different models I looked at was the i5-3330 (view Intel specifications for that processor here).
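
    Besides checking the Intel specifications, once a system is in hand and booted into a Linux environment you can also look for the VT and EPT feature flags directly. The Python sketch below is one way to do that, assuming a Linux /proc/cpuinfo (it does not apply to the ESXi console); the function name and the shortened sample text are hypothetical and for illustration only.

```python
import os


def virtualization_support(cpuinfo_text):
    """Report VT-x and EPT support from the text of a Linux /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Older kernels list "ept" under "flags"; newer ones under "vmx flags".
        if key.strip() in ("flags", "vmx flags"):
            flags.update(value.split())
    return {"vt_x": "vmx" in flags, "ept": "ept" in flags}


# Hypothetical, heavily shortened sample for illustration only.
SAMPLE = (
    "model name\t: Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz\n"
    "flags\t\t: fpu vme vmx ept sse4_2\n"
)
print(virtualization_support(SAMPLE))

# On a real Linux system, check the local processor as well.
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        print(virtualization_support(f.read()))
```

    If either value comes back False, double-check the BIOS settings as well as the processor model before assuming the feature is missing.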

    Summary

    Thanks to Robert Novak (aka @gallifreyan) for taking some time providing useful tips and ideas to help think outside the box for this, as well as some future enhancements to my server and StorageIO lab environment.

    Consequently while the Dell Inspiron 660-i660 was not the server that I wanted, it has turned out to be the system that I need now and hence IMHO a diamond in the rough, if you get the right make and model.

    Ok, nuff said

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO and UnlimitedIO All Rights Reserved