Filesystem Benchmarking with PostMark from Network Appliance

Filesystem Testing

In addition to the low-level raw device benchmarking that I've been doing with vinum, I also found PostMark, a benchmark from Network Appliance.

While not perfect (it does not fork, so it is completely single-threaded, and it does not require that all metadata be written synchronously), it does seem to be at least a plausible first-pass simulation of an Internet mail or USENET news system, in that it creates lots of relatively small files (between 0.5 KB and 10 KB in size), performs operations on them, and then deletes them in rapid succession. Note that there is now an official "port" of PostMark in FreeBSD 4.x-CURRENT, under /usr/ports/benchmarks/postmark/.
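For reference, a PostMark run is driven by a short command script read on standard input. The following is a minimal sketch of the first workload in the tables below (1,000 files, 50,000 transactions), assuming the standard PostMark 1.5 command set as installed by the port; the size bounds are in bytes:

```
set number 1000
set transactions 50000
set size 500 10000
run
quit
```

Feeding this to the binary (e.g. `postmark < workload.cfg`) should print the transaction rate and read/write throughput figures of the kind reported in the tables below.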

The goal here is to get multiple people to submit data to be included in these tables, and I have not personally performed all the testing shown here. The company or personal name of the author is enclosed in square brackets at the end of the description of each system configuration that was tested, and for the original results from Network Appliance, I used the tag "[NetApp]".


Their original white paper is at http://www.netapp.com/tech_library/3022.html, but I'll replicate the test results here to make the comparisons easier. PostMark was configured in four different ways:

 * 1,000 initial files and 50,000 transactions (Table 1)
 * 20,000 initial files and 50,000 transactions (Table 2)
 * 20,000 initial files and 100,000 transactions (Table 3)
 * 20,000 initial files in 100 subdirectories and 100,000 transactions (Table 4)

All other PostMark parameters were left constant at the default values.

The NFS benchmark was performed on a Sun Microsystems Ultra 1/170 with 256 Mbytes of RAM running the Solaris 2.5 operating system. The configurations tested were local UFS, TMPFS, DiskSuite (ODS) stripes in RAID-0 and RAID-5, and NFS mounts of NetApp F330 and F630 filers.

NOTE:

The Dell PowerEdge 1300 server running Linux was in production (as an anonymous FTP server) at the time it was tested, but it was extremely lightly loaded, so this is unlikely to have had much effect.

1000/50000      Transactions/sec  Read (KB/s)  Write (KB/s)
UFS                           36       115.67       118.27
TMPFS                       2000      4880         7330
ODS/R0                        63       199.73       204.22
ODS/R5                        23        74.13        75.79
NFS/F330                     139       441.71       451.64
NFS/F630                     253       799.91       817.89
PPro+SOFT                     94       296.10       302.75
P3/Linux                     271       850.70       869.82
P3+DPT                      1851      6092.8       6236.16
2xP3+UFS                      54       171.77       175.64
2xP3+SOFT                    458      1495.04      1525.76
2xP3+MFS                    1724      5488.64      5611.52
U5+UFS                        29        93.04        95.14
U5+Log                       107       336.36       343.92
U5+TMPFS                    2000      6328.32      6471.68
PBG3/HFS+                     82       268.32       280.35
PBG3/MFS                     485      1566.72      1638.40
PBG4/HFS+                    492       332.02       346.92
PBG4/MFS                     980      3174.40      3317.76
Compaq+UFS                    72       235.40       245.96
Compaq+Soft                  149       489.66       511.62

Table 1: PostMark Results for Unix and NFS (1,000 initial files and 50,000 transactions)


20000/50000     Transactions/sec  Read (KB/s)  Write (KB/s)
UFS                           15        29.93        54.22
TMPFS                        438       663.64      1530
ODS/R0                        29        56.60       102.54
ODS/R5                        14        27.05        49.00
NFS/F330                      76       177.68       321.88
NFS/F630                     176       383.41       694.58
PPro+SOFT                     20        49.24        89.20
P3/Linux                      62       142.98       259.02
P3+DPT                        98       237.29       429.87
2xP3+UFS                      35        67.30       121.91
2xP3+SOFT                    142       318.81       577.55
2xP3+MFS                     228       504.14       913.29
U5+UFS                        16        32.34        58.58
U5+Log                        16        37.40        67.75
U5+TMPFS                     416      1126.4       2048
PBG3/HFS+                     29        82.00       154.75
PBG3/MFS                     349       737.57      1392.64
PBG4/HFS+                     36        99.14       187.09
PBG4/MFS                     666      1495.04      2816.00
Compaq+UFS                    15        38.95        73.51
Compaq+Soft                   23        66.57       125.63

Table 2: PostMark Results for Unix and NFS (20,000 initial files and 50,000 transactions)


20000/100000    Transactions/sec  Read (KB/s)  Write (KB/s)
TMPFS                        335       613.03      1160
ODS/R0                        30        73.19       101.17
ODS/R5                        14        35.05        48.46
NFS/F330                      74       204.72       282.98
NFS/F630                     169       446.69       617.45
PPro+SOFT                     26        73.52       101.62
P3/Linux                      56       153.28       211.88
P3+DPT                        83       232.66       321.60
2xP3+UFS                      35        85.86       118.69
2xP3+SOFT                    139       329.79       524.98
2xP3+MFS                     228       606.47       838.31
U5+UFS                        16        41.01        56.69
U5+Log                        19        49.86        68.92
U5+TMPFS                     564      1679.36      2324.48
PBG3/HFS+                     31        90.20       129.65
PBG3/MFS                     310       819.89      1177.60
PBG4/HFS+                     33        94.42       135.72
PBG4/MFS                     613      1628.16      2334.72
Compaq+UFS                    15        42.68        61.35
Compaq+Soft                   22        69.72       100.21

Table 3: PostMark Results for Unix and NFS (20,000 initial files and 100,000 transactions)


20000/100000/100 dir  Transactions/sec  Read (KB/s)  Write (KB/s)
P3/Linux                            57       151.17       208.96
2xP3+UFS                            25        60.92        84.21
2xP3+SOFT                           86       237.10       327.75
2xP3+MFS                          1333      3471.36      4802.56
U5+UFS                              18        44.79        61.92
U5+Log                              15        39.13        54.09
U5+TMPFS                           564      1669.12      2314.24
PBG3/HFS+                           24        65.43        93.79
PBG3/MFS                           277       708.74      1015.93
PBG4/HFS+                           28        77.21       110.68
PBG4/MFS                           574      1484.80      2129.92
Compaq+UFS                          14        38.81        55.63
Compaq+Soft                         13        40.19        57.61

Table 4: PostMark Results for Unix and NFS (20,000 initial files in 100 subdirectories and 100,000 transactions)


The thing that impresses me most about these tests is that the first one primarily shows cache effects under FreeBSD with softupdates and Linux with async writes, yet overall the ancient (and slow) Pentium Pro beat the Sun UltraSPARC (which had twice as much RAM) in every test! The Linux machine in the first test also shows major cache effects from the DPT SmartRAID V controller (which continues to have a somewhat diminished effect throughout the tests), and its performance drops quite precipitously in the larger tests, where FreeBSD with softupdates tends to do measurably better. (Linux had the advantage of running on a single CPU and avoiding SMP overhead; FreeBSD had the advantage of being totally unloaded.)
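Both observations can be read straight off the tables above; as a quick sanity check in Python (transactions-per-second figures copied from Tables 1 through 3):

```python
# P3+DPT transactions/sec from Tables 1-3: the controller's cache dominates
# the small working set, then falls off sharply as the file set grows.
dpt_tps = {"1000/50000": 1851, "20000/50000": 98, "20000/100000": 83}
falloff = dpt_tps["1000/50000"] / dpt_tps["20000/100000"]
print(f"P3+DPT falloff, smallest to largest test: {falloff:.1f}x")  # 22.3x

# PPro 200 + softupdates vs. Ultra 5 + UFS (transactions/sec, Tables 1-3):
# the old Pentium Pro wins in every test it appears in.
ppro_soft = [94, 20, 26]
u5_ufs = [29, 16, 16]
print(all(p > u for p, u in zip(ppro_soft, u5_ufs)))  # True
```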


Unfortunately, the Sun Ultra 5 tests with UFS Logging were set up in the most pathological configuration possible -- the metadata log device was mirrored across two different stripes on the same disk, both of which were on the same disk as the data device. Yet, except for the surprising results with subdirectories, UFS Logging still outperformed UFS at least a little, and for small file sets outperformed it by quite a lot -- a factor of approximately 3.6 (107 vs. 29 transactions per second in Table 1). This is a percentage speedup close to what is visible with FreeBSD and softupdates!

However, the results that NetApp got with their Ultra 1/170 are rather surprising. It makes you wonder exactly what the configuration of that machine was, and how they managed to get results (for the first test) that are faster than what I could manage with an Ultra 5, more memory, and a newer version of the OS (with all unnecessary programs disabled). Maybe this was with an extremely fast SCSI disk?


These tests highlight the importance of using a log-structured or journaling filesystem, "softupdates", or some other method of safely delaying, grouping, and avoiding synchronous metadata writes to disk, in order to improve performance. Note that the Linux-standard practice of mounting volumes "async" is very dangerous; while it may result in high performance, you can get equally high performance with higher reliability using FreeBSD with softupdates.
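For concreteness, here is how the two approaches discussed above are typically enabled (the device and mount-point names are hypothetical; on FreeBSD 4.x, softupdates is toggled per filesystem with tunefs while the filesystem is unmounted):

```
# FreeBSD: enable softupdates on an unmounted filesystem, then mount it
tunefs -n enable /dev/da0s1e
mount /dev/da0s1e /news

# Linux: async is the (dangerous) default for ext2; it can be made explicit
mount -o async /dev/sda1 /news
```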

Interestingly, comparing the respective results of the third and fourth tests, FreeBSD MFS performance appears to be highly dependent on the number of files in the single directory being accessed and updated (as one would expect), while Sun Solaris TMPFS performance appears to depend solely on the total number and size of files on the filesystem, regardless of how many subdirectories they might be stored in.
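The contrast is stark when the ratios are computed from the two tables (transactions-per-second figures copied from Tables 3 and 4, which use the same total file count and transaction count but differ in directory layout):

```python
# TPS with all 20,000 files in one directory (Table 3) vs. spread across
# 100 subdirectories (Table 4).
mfs = {"one_dir": 228, "subdirs": 1333}    # FreeBSD MFS (2xP3+MFS column)
tmpfs = {"one_dir": 564, "subdirs": 564}   # Solaris TMPFS (U5+TMPFS column)

mfs_gain = mfs["subdirs"] / mfs["one_dir"]
tmpfs_gain = tmpfs["subdirs"] / tmpfs["one_dir"]
print(f"MFS speedup with subdirectories:   {mfs_gain:.1f}x")   # 5.8x
print(f"TMPFS speedup with subdirectories: {tmpfs_gain:.1f}x")  # 1.0x
```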


Yes, I know that the NetApp F330 and F630 are not the current top-of-the-line models (that would be the F760), but then none of the systems I'm comparing them against are the latest top-of-the-line, either. In fact, most of the systems I'm comparing with have designs that are themselves about as old as the F330 and F630 models, so this is still a reasonably valid comparison. The only thing that's newer about the systems I'm comparing with is the OS (and resulting network libraries, etc...), and NetApp has had just as much opportunity to improve their OS as everyone else has.


Once I'm further along with my vinum testing, I'll also add some PostMark testing with it on the PPro 200 machine with softupdates, to see just how far that machine can be pushed without adding any expensive hardware (such as a DPT SmartRAID V controller, although I also hope to test that on this machine).