fioによるbcacheシステムのベンチマーク, あまりbcacheに期待しないほうが良い
- ファイルシステム: Btrfs, マウントオプションは/dev/bcache0 on / type btrfs (rw,noatime,compress=zstd,ssd,space_cache,subvolid=5,subvol=/)
- SSD: Samsung SM961
- HDD: TOSHIBA DT01ACA3
bcacheが正常に効いているのか確認するため, キャッシュデバイス無効有効両方でベンチマークを取ることにします.
本当はbcacheを使わない純粋なSSDでのベンチマークもするべきなのでしょうが, 普段使いのPCなので環境が構築するのが大変面倒なのでやれません.
一応公称値としてはSM961はシーケンシャル:読み出し3,100MB/s、書き込み1,400MB/sらしいです.
hdparm -Tt
を使った簡易的なSSD, HDD, bcacheデバイスのベンチマークも載せておきます.
2018-01-18T02:00:11 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/nvme0n1p5
/dev/nvme0n1p5:
Timing cached reads: 18616 MB in 2.00 seconds = 9316.29 MB/sec
Timing buffered disk reads: 7206 MB in 3.00 seconds = 2401.50 MB/sec
2018-01-18T02:02:41 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/sda2
/dev/sda2:
Timing cached reads: 19080 MB in 2.00 seconds = 9549.13 MB/sec
Timing buffered disk reads: 558 MB in 3.00 seconds = 185.75 MB/sec
2018-01-18T02:02:59 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/bcache0
/dev/bcache0:
Timing cached reads: 18700 MB in 2.00 seconds = 9358.02 MB/sec
Timing buffered disk reads: 606 MB in 3.00 seconds = 201.96 MB/sec
Linuxのディスクベンチマークはfioが支配的かつ高機能のようなので, これを使ってベンチマークを取ることにしました.
fioはジョブファイルを元にベンチマークを動かします. 私は以下のようなジョブファイルを書きました.
[global]
directory=/home/ncaq/Downloads/
size=100MB
[Read-Seq]
rw=read
stonewall
[Read-Random]
rw=randread
stonewall
[Write-Seq]
rw=write
stonewall
[Write-Random]
rw=randwrite
stonewall
本来はメモリ容量の32GBをはみ出す量のファイルサイズを指定するのが望ましいのでしょうが, 実際やってみた場合15分以上かかることがわかったので, 今回は簡易的なベンチマークということで軽量に済ますことにします.
キャッシュが有効になったベンチマークテストという都合上, 2回実行します. 2回実行するのはbcacheが有効になった時だけで良いと思うのですが, 公平にするためにbcache無効時も2回実行します.
fioベンチマークが作るファイルはそれぞれのテスト開始時に消去します.
としてbcacheからキャッシュデバイスを開放します.
2018-01-17T21:23:06 ncaq@strawberry/pts/0(0) ~
% sudo zsh -c 'echo a2be69aa-d1f5-4b93-91e3-2ea54ecec596 > /sys/block/bcache0/bcache/detach'
2018-01-17T21:23:10 ncaq@strawberry/pts/0(0) ~
% sudo zsh -c 'echo 1 /sys/fs/bcache/85f27d01-75f9-4f1e-9497-7d35f0f5c9ab/stop'
1 /sys/fs/bcache/85f27d01-75f9-4f1e-9497-7d35f0f5c9ab/stop
再起動します.
fioを実行します.
1回目
Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: Laying out IO file (1 file / 100MiB)
Read-Random: Laying out IO file (1 file / 100MiB)
Write-Seq: Laying out IO file (1 file / 100MiB)
Write-Random: Laying out IO file (1 file / 100MiB)
Read-Seq: (groupid=0, jobs=1): err= 0: pid=5644: Thu Jan 18 00:13:57 2018
read: IOPS=44.4k, BW=174MiB/s (182MB/s)(100MiB/576msec)
clat (nsec): min=329, max=27802k, avg=22173.03, stdev=364819.28
lat (nsec): min=360, max=27802k, avg=22210.91, stdev=364819.69
clat percentiles (nsec):
| 1.00th=[ 342], 5.00th=[ 350], 10.00th=[ 350],
| 20.00th=[ 362], 30.00th=[ 362], 40.00th=[ 370],
| 50.00th=[ 422], 60.00th=[ 502], 70.00th=[ 1976],
| 80.00th=[ 2544], 90.00th=[ 3568], 95.00th=[ 3696],
| 99.00th=[ 4256], 99.50th=[ 5664], 99.90th=[5275648],
| 99.95th=[5341184], 99.99th=[9502720]
lat (nsec) : 500=59.75%, 750=2.77%, 1000=0.03%
lat (usec) : 2=8.78%, 4=24.54%, 10=3.73%, 20=0.01%, 250=0.01%
lat (msec) : 4=0.01%, 10=0.39%, 50=0.01%
cpu : usr=0.00%, sys=6.61%, ctx=9494, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5645: Thu Jan 18 00:13:57 2018
read: IOPS=340, BW=1360KiB/s (1393kB/s)(100MiB/75277msec)
clat (usec): min=41, max=68349, avg=2938.93, stdev=3326.52
lat (usec): min=41, max=68349, avg=2939.06, stdev=3326.53
clat percentiles (usec):
| 1.00th=[ 46], 5.00th=[ 50], 10.00th=[ 52], 20.00th=[ 55],
| 30.00th=[ 57], 40.00th=[ 61], 50.00th=[ 1778], 60.00th=[ 3359],
| 70.00th=[ 4883], 80.00th=[ 6456], 90.00th=[ 8094], 95.00th=[ 8848],
| 99.00th=[ 9503], 99.50th=[ 9634], 99.90th=[10421], 99.95th=[16057],
| 99.99th=[60556]
bw ( KiB/s): min= 76, max= 4434, per=77.43%, avg=1053.10, stdev=462.58, samples=150
iops : min= 19, max= 1108, avg=262.89, stdev=115.63, samples=150
lat (usec) : 50=5.95%, 100=40.18%, 250=0.62%, 500=0.06%, 750=0.05%
lat (usec) : 1000=0.05%
lat (msec) : 2=4.47%, 4=12.87%, 10=35.56%, 20=0.16%, 50=0.02%
lat (msec) : 100=0.01%
cpu : usr=0.09%, sys=0.29%, ctx=25637, majf=0, minf=7
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5652: Thu Jan 18 00:13:57 2018
write: IOPS=512k, BW=2000MiB/s (2097MB/s)(100MiB/50msec)
clat (nsec): min=1319, max=18839, avg=1591.17, stdev=256.83
lat (nsec): min=1360, max=19259, avg=1637.72, stdev=258.66
clat percentiles (nsec):
| 1.00th=[ 1368], 5.00th=[ 1432], 10.00th=[ 1448], 20.00th=[ 1464],
| 30.00th=[ 1480], 40.00th=[ 1496], 50.00th=[ 1496], 60.00th=[ 1528],
| 70.00th=[ 1624], 80.00th=[ 1752], 90.00th=[ 1832], 95.00th=[ 1928],
| 99.00th=[ 2288], 99.50th=[ 2512], 99.90th=[ 4080], 99.95th=[ 4512],
| 99.99th=[ 8096]
lat (usec) : 2=96.18%, 4=3.71%, 10=0.11%, 20=0.01%
cpu : usr=32.65%, sys=63.27%, ctx=2, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5653: Thu Jan 18 00:13:57 2018
write: IOPS=434k, BW=1695MiB/s (1777MB/s)(100MiB/59msec)
clat (nsec): min=1440, max=20439, avg=1870.60, stdev=286.28
lat (nsec): min=1490, max=20679, avg=1915.74, stdev=286.88
clat percentiles (nsec):
| 1.00th=[ 1544], 5.00th=[ 1592], 10.00th=[ 1624], 20.00th=[ 1688],
| 30.00th=[ 1736], 40.00th=[ 1784], 50.00th=[ 1832], 60.00th=[ 1864],
| 70.00th=[ 1928], 80.00th=[ 2008], 90.00th=[ 2160], 95.00th=[ 2256],
| 99.00th=[ 2672], 99.50th=[ 2896], 99.90th=[ 4640], 99.95th=[ 5344],
| 99.99th=[ 6240]
lat (usec) : 2=79.58%, 4=20.19%, 10=0.22%, 50=0.01%
cpu : usr=18.97%, sys=79.31%, ctx=1, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=174MiB/s (182MB/s), 174MiB/s-174MiB/s (182MB/s-182MB/s), io=100MiB (105MB), run=576-576msec
Run status group 1 (all jobs):
READ: bw=1360KiB/s (1393kB/s), 1360KiB/s-1360KiB/s (1393kB/s-1393kB/s), io=100MiB (105MB), run=75277-75277msec
Run status group 2 (all jobs):
WRITE: bw=2000MiB/s (2097MB/s), 2000MiB/s-2000MiB/s (2097MB/s-2097MB/s), io=100MiB (105MB), run=50-50msec
Run status group 3 (all jobs):
WRITE: bw=1695MiB/s (1777MB/s), 1695MiB/s-1695MiB/s (1777MB/s-1777MB/s), io=100MiB (105MB), run=59-59msec
2回目
Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: (groupid=0, jobs=1): err= 0: pid=5689: Thu Jan 18 00:15:46 2018
read: IOPS=44.0k, BW=176MiB/s (184MB/s)(100MiB/569msec)
clat (nsec): min=320, max=22764k, avg=21709.63, stdev=337119.57
lat (nsec): min=350, max=22764k, avg=21747.85, stdev=337120.98
clat percentiles (nsec):
| 1.00th=[ 350], 5.00th=[ 370], 10.00th=[ 390],
| 20.00th=[ 524], 30.00th=[ 1752], 40.00th=[ 1784],
| 50.00th=[ 1816], 60.00th=[ 1864], 70.00th=[ 2192],
| 80.00th=[ 2512], 90.00th=[ 2928], 95.00th=[ 3344],
| 99.00th=[ 4016], 99.50th=[ 6048], 99.90th=[5079040],
| 99.95th=[5210112], 99.99th=[8454144]
lat (nsec) : 500=19.20%, 750=1.74%, 1000=0.06%
lat (usec) : 2=46.78%, 4=31.15%, 10=0.66%, 20=0.01%, 750=0.01%
lat (msec) : 4=0.01%, 10=0.39%, 20=0.01%, 50=0.01%
cpu : usr=1.41%, sys=7.57%, ctx=20129, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5690: Thu Jan 18 00:15:46 2018
read: IOPS=333, BW=1334KiB/s (1366kB/s)(100MiB/76758msec)
clat (usec): min=49, max=74409, avg=2995.94, stdev=3265.48
lat (usec): min=49, max=74410, avg=2996.14, stdev=3265.49
clat percentiles (usec):
| 1.00th=[ 56], 5.00th=[ 64], 10.00th=[ 71], 20.00th=[ 86],
| 30.00th=[ 221], 40.00th=[ 255], 50.00th=[ 1827], 60.00th=[ 3392],
| 70.00th=[ 4883], 80.00th=[ 6521], 90.00th=[ 8094], 95.00th=[ 8848],
| 99.00th=[ 9503], 99.50th=[ 9634], 99.90th=[10159], 99.95th=[10421],
| 99.99th=[42730]
bw ( KiB/s): min= 752, max= 4468, per=83.07%, avg=1108.12, stdev=358.36, samples=153
iops : min= 188, max= 1117, avg=276.62, stdev=89.61, samples=153
lat (usec) : 50=0.04%, 100=23.15%, 250=16.02%, 500=7.59%, 750=0.07%
lat (usec) : 1000=0.04%
lat (msec) : 2=4.26%, 4=12.98%, 10=35.73%, 20=0.11%, 50=0.01%
lat (msec) : 100=0.01%
cpu : usr=0.21%, sys=0.38%, ctx=25713, majf=0, minf=7
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5691: Thu Jan 18 00:15:46 2018
write: IOPS=474k, BW=1852MiB/s (1942MB/s)(100MiB/54msec)
clat (nsec): min=1340, max=57846, avg=1554.51, stdev=420.67
lat (nsec): min=1390, max=57886, avg=1597.23, stdev=420.87
clat percentiles (nsec):
| 1.00th=[ 1384], 5.00th=[ 1400], 10.00th=[ 1416], 20.00th=[ 1432],
| 30.00th=[ 1448], 40.00th=[ 1448], 50.00th=[ 1464], 60.00th=[ 1480],
| 70.00th=[ 1544], 80.00th=[ 1688], 90.00th=[ 1784], 95.00th=[ 1928],
| 99.00th=[ 2256], 99.50th=[ 2448], 99.90th=[ 3952], 99.95th=[ 4192],
| 99.99th=[ 8768]
lat (usec) : 2=96.34%, 4=3.57%, 10=0.08%, 20=0.01%, 100=0.01%
cpu : usr=11.32%, sys=84.91%, ctx=1, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5692: Thu Jan 18 00:15:46 2018
write: IOPS=376k, BW=1471MiB/s (1542MB/s)(100MiB/68msec)
clat (nsec): min=1479, max=72186, avg=1987.54, stdev=538.63
lat (nsec): min=1529, max=72236, avg=2034.62, stdev=541.44
clat percentiles (nsec):
| 1.00th=[ 1608], 5.00th=[ 1688], 10.00th=[ 1736], 20.00th=[ 1784],
| 30.00th=[ 1832], 40.00th=[ 1864], 50.00th=[ 1912], 60.00th=[ 1976],
| 70.00th=[ 2040], 80.00th=[ 2160], 90.00th=[ 2288], 95.00th=[ 2480],
| 99.00th=[ 3024], 99.50th=[ 3344], 99.90th=[ 5024], 99.95th=[ 5664],
| 99.99th=[ 9664]
lat (usec) : 2=64.19%, 4=35.55%, 10=0.25%, 20=0.01%, 100=0.01%
cpu : usr=20.90%, sys=77.61%, ctx=2, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=176MiB/s (184MB/s), 176MiB/s-176MiB/s (184MB/s-184MB/s), io=100MiB (105MB), run=569-569msec
Run status group 1 (all jobs):
READ: bw=1334KiB/s (1366kB/s), 1334KiB/s-1334KiB/s (1366kB/s-1366kB/s), io=100MiB (105MB), run=76758-76758msec
Run status group 2 (all jobs):
WRITE: bw=1852MiB/s (1942MB/s), 1852MiB/s-1852MiB/s (1942MB/s-1942MB/s), io=100MiB (105MB), run=54-54msec
Run status group 3 (all jobs):
WRITE: bw=1471MiB/s (1542MB/s), 1471MiB/s-1471MiB/s (1542MB/s-1542MB/s), io=100MiB (105MB), run=68-68msec
bcacheを有効化し直します.
sudo zsh -c 'echo 85f27d01-75f9-4f1e-9497-7d35f0f5c9ab > /sys/block/bcache0/bcache/attach'
ちなみに, 有効にしてからの初回起動はキャッシュヒット率が4%と極めて低く, 動作は遅かったままでした.
1回目
Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: Laying out IO file (1 file / 100MiB)
Read-Random: Laying out IO file (1 file / 100MiB)
Write-Seq: Laying out IO file (1 file / 100MiB)
Write-Random: Laying out IO file (1 file / 100MiB)
Read-Seq: (groupid=0, jobs=1): err= 0: pid=5919: Thu Jan 18 00:25:22 2018
read: IOPS=48.4k, BW=189MiB/s (198MB/s)(100MiB/529msec)
clat (nsec): min=330, max=9648.3k, avg=20331.72, stdev=308917.80
lat (nsec): min=369, max=9648.3k, avg=20369.58, stdev=308917.99
clat percentiles (nsec):
| 1.00th=[ 342], 5.00th=[ 342], 10.00th=[ 342],
| 20.00th=[ 350], 30.00th=[ 362], 40.00th=[ 370],
| 50.00th=[ 422], 60.00th=[ 1880], 70.00th=[ 1992],
| 80.00th=[ 2192], 90.00th=[ 3024], 95.00th=[ 3664],
| 99.00th=[ 4256], 99.50th=[ 6112], 99.90th=[5210112],
| 99.95th=[5341184], 99.99th=[5406720]
lat (nsec) : 500=54.89%, 750=1.60%, 1000=0.02%
lat (usec) : 2=13.96%, 4=26.95%, 10=2.17%, 20=0.01%, 250=0.01%
lat (usec) : 500=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.38%
cpu : usr=2.46%, sys=4.73%, ctx=11034, majf=0, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5920: Thu Jan 18 00:25:22 2018
read: IOPS=327, BW=1309KiB/s (1341kB/s)(100MiB/78220msec)
clat (usec): min=50, max=27403, avg=3053.88, stdev=3402.73
lat (usec): min=50, max=27403, avg=3054.01, stdev=3402.74
clat percentiles (usec):
| 1.00th=[ 58], 5.00th=[ 64], 10.00th=[ 68], 20.00th=[ 72],
| 30.00th=[ 76], 40.00th=[ 83], 50.00th=[ 1909], 60.00th=[ 3523],
| 70.00th=[ 5014], 80.00th=[ 6652], 90.00th=[ 8225], 95.00th=[ 8979],
| 99.00th=[ 9765], 99.50th=[10159], 99.90th=[20579], 99.95th=[24511],
| 99.99th=[26870]
bw ( KiB/s): min= 63, max= 4027, per=79.33%, avg=1038.46, stdev=436.57, samples=156
iops : min= 15, max= 1006, avg=259.27, stdev=109.12, samples=156
lat (usec) : 100=45.19%, 250=1.37%, 500=0.16%, 750=0.03%, 1000=0.06%
lat (msec) : 2=3.74%, 4=12.61%, 10=36.24%, 20=0.49%, 50=0.12%
cpu : usr=0.13%, sys=0.48%, ctx=28693, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5930: Thu Jan 18 00:25:22 2018
write: IOPS=522k, BW=2041MiB/s (2140MB/s)(100MiB/49msec)
clat (nsec): min=1329, max=21979, avg=1578.04, stdev=283.14
lat (nsec): min=1369, max=22409, avg=1624.35, stdev=284.34
clat percentiles (nsec):
| 1.00th=[ 1352], 5.00th=[ 1416], 10.00th=[ 1432], 20.00th=[ 1448],
| 30.00th=[ 1464], 40.00th=[ 1480], 50.00th=[ 1496], 60.00th=[ 1512],
| 70.00th=[ 1576], 80.00th=[ 1704], 90.00th=[ 1816], 95.00th=[ 1976],
| 99.00th=[ 2320], 99.50th=[ 2576], 99.90th=[ 4384], 99.95th=[ 5216],
| 99.99th=[ 8096]
lat (usec) : 2=95.39%, 4=4.48%, 10=0.12%, 20=0.01%, 50=0.01%
cpu : usr=16.67%, sys=81.25%, ctx=4, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5931: Thu Jan 18 00:25:22 2018
write: IOPS=449k, BW=1754MiB/s (1840MB/s)(100MiB/57msec)
clat (nsec): min=1430, max=19619, avg=1828.22, stdev=289.68
lat (nsec): min=1470, max=19819, avg=1877.15, stdev=292.75
clat percentiles (nsec):
| 1.00th=[ 1528], 5.00th=[ 1576], 10.00th=[ 1608], 20.00th=[ 1656],
| 30.00th=[ 1688], 40.00th=[ 1736], 50.00th=[ 1768], 60.00th=[ 1800],
| 70.00th=[ 1864], 80.00th=[ 1976], 90.00th=[ 2128], 95.00th=[ 2256],
| 99.00th=[ 2640], 99.50th=[ 2928], 99.90th=[ 4768], 99.95th=[ 5408],
| 99.99th=[ 6240]
lat (usec) : 2=81.76%, 4=18.05%, 10=0.19%, 20=0.01%
cpu : usr=21.43%, sys=76.79%, ctx=1, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=189MiB/s (198MB/s), 189MiB/s-189MiB/s (198MB/s-198MB/s), io=100MiB (105MB), run=529-529msec
Run status group 1 (all jobs):
READ: bw=1309KiB/s (1341kB/s), 1309KiB/s-1309KiB/s (1341kB/s-1341kB/s), io=100MiB (105MB), run=78220-78220msec
Run status group 2 (all jobs):
WRITE: bw=2041MiB/s (2140MB/s), 2041MiB/s-2041MiB/s (2140MB/s-2140MB/s), io=100MiB (105MB), run=49-49msec
Run status group 3 (all jobs):
WRITE: bw=1754MiB/s (1840MB/s), 1754MiB/s-1754MiB/s (1840MB/s-1840MB/s), io=100MiB (105MB), run=57-57msec
2回目
Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: (groupid=0, jobs=1): err= 0: pid=5967: Thu Jan 18 00:25:31 2018
read: IOPS=48.4k, BW=189MiB/s (198MB/s)(100MiB/529msec)
clat (nsec): min=300, max=10336k, avg=20169.27, stdev=300075.95
lat (nsec): min=330, max=10336k, avg=20205.15, stdev=300076.09
clat percentiles (nsec):
| 1.00th=[ 322], 5.00th=[ 362], 10.00th=[ 370],
| 20.00th=[ 390], 30.00th=[ 470], 40.00th=[ 1624],
| 50.00th=[ 1752], 60.00th=[ 2448], 70.00th=[ 2576],
| 80.00th=[ 3280], 90.00th=[ 3376], 95.00th=[ 3856],
| 99.00th=[ 4448], 99.50th=[ 6048], 99.90th=[5079040],
| 99.95th=[5210112], 99.99th=[5275648]
lat (nsec) : 500=31.52%, 750=5.82%, 1000=0.45%
lat (usec) : 2=16.03%, 4=43.84%, 10=1.94%, 20=0.01%, 250=0.01%
lat (usec) : 750=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.37%, 20=0.01%
cpu : usr=0.57%, sys=9.09%, ctx=15834, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5968: Thu Jan 18 00:25:31 2018
read: IOPS=38.2k, BW=149MiB/s (156MB/s)(100MiB/671msec)
clat (usec): min=17, max=8498, avg=25.49, stdev=93.02
lat (usec): min=17, max=8498, avg=25.54, stdev=93.02
clat percentiles (usec):
| 1.00th=[ 18], 5.00th=[ 18], 10.00th=[ 19], 20.00th=[ 19],
| 30.00th=[ 19], 40.00th=[ 19], 50.00th=[ 19], 60.00th=[ 20],
| 70.00th=[ 20], 80.00th=[ 20], 90.00th=[ 23], 95.00th=[ 64],
| 99.00th=[ 104], 99.50th=[ 115], 99.90th=[ 149], 99.95th=[ 161],
| 99.99th=[ 6587]
bw ( KiB/s): min=152182, max=152182, per=99.72%, avg=152182.00, stdev= 0.00, samples=1
iops : min=38045, max=38045, avg=38045.00, stdev= 0.00, samples=1
lat (usec) : 20=81.70%, 50=8.73%, 100=8.52%, 250=1.02%
lat (msec) : 2=0.01%, 10=0.02%
cpu : usr=0.15%, sys=25.52%, ctx=25602, majf=0, minf=7
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5969: Thu Jan 18 00:25:31 2018
write: IOPS=69.8k, BW=272MiB/s (286MB/s)(100MiB/367msec)
clat (nsec): min=1220, max=310554k, avg=13629.31, stdev=1940951.49
lat (nsec): min=1269, max=310555k, avg=13673.71, stdev=1940956.03
clat percentiles (nsec):
| 1.00th=[ 1256], 5.00th=[ 1288], 10.00th=[ 1288], 20.00th=[ 1304],
| 30.00th=[ 1336], 40.00th=[ 1368], 50.00th=[ 1432], 60.00th=[ 1448],
| 70.00th=[ 1496], 80.00th=[ 1656], 90.00th=[ 1896], 95.00th=[ 2064],
| 99.00th=[ 2192], 99.50th=[ 2320], 99.90th=[ 3696], 99.95th=[ 4128],
| 99.99th=[ 7328]
lat (usec) : 2=92.11%, 4=7.82%, 10=0.06%
lat (msec) : 500=0.01%
cpu : usr=1.91%, sys=12.84%, ctx=6, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5970: Thu Jan 18 00:25:31 2018
write: IOPS=63.4k, BW=248MiB/s (260MB/s)(100MiB/404msec)
clat (nsec): min=1330, max=330047k, avg=14920.07, stdev=2062778.23
lat (nsec): min=1380, max=330047k, avg=14965.43, stdev=2062780.01
clat percentiles (nsec):
| 1.00th=[ 1464], 5.00th=[ 1560], 10.00th=[ 1656], 20.00th=[ 1752],
| 30.00th=[ 1816], 40.00th=[ 1880], 50.00th=[ 1944], 60.00th=[ 2008],
| 70.00th=[ 2096], 80.00th=[ 2224], 90.00th=[ 2416], 95.00th=[ 2576],
| 99.00th=[ 5280], 99.50th=[ 6176], 99.90th=[ 6688], 99.95th=[ 7008],
| 99.99th=[ 9280]
lat (usec) : 2=59.45%, 4=39.20%, 10=1.35%, 20=0.01%
lat (msec) : 500=0.01%
cpu : usr=3.23%, sys=14.39%, ctx=6, majf=0, minf=9
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=189MiB/s (198MB/s), 189MiB/s-189MiB/s (198MB/s-198MB/s), io=100MiB (105MB), run=529-529msec
Run status group 1 (all jobs):
READ: bw=149MiB/s (156MB/s), 149MiB/s-149MiB/s (156MB/s-156MB/s), io=100MiB (105MB), run=671-671msec
Run status group 2 (all jobs):
WRITE: bw=272MiB/s (286MB/s), 272MiB/s-272MiB/s (286MB/s-286MB/s), io=100MiB (105MB), run=367-367msec
Run status group 3 (all jobs):
WRITE: bw=248MiB/s (260MB/s), 248MiB/s-248MiB/s (260MB/s-260MB/s), io=100MiB (105MB), run=404-404msec
それぞれのbwをテーブルにまとめました.
case | Read-Seq | Read-Random | Write-Seq | Write-Random |
---|---|---|---|---|
disable-1 | 174MiB | 1360KiB | 2000MiB | 1695MiB |
disable-2 | 176MiB | 1334KiB | 1852MiB | 1471MiB |
enable-1 | 189MiB | 1309KiB | 2041MiB | 1754MiB |
enable-2 | 189MiB | 149MiB | 272MiB | 248MiB |
enable-2のみRead-Randomの単位がKiBではなくMiBになっているのが注目するべきポイントですね. 他のテストはRead-Randomが異様に遅いため, これのせいでかなり時間がかかっていましたが, enable-2はキャッシュが効いているのか即座に終わるようになりました.
しかしWriteテストは逆に性能が落ち込んでいます. Read-Randomが単位レベルで高速化されてるので, それを無視するレベルで早くなっているのですが.
謎に思ったので数回fioを実行してみたのですが, Readのテスト値は安定しているのですが, Writeのテスト値は10倍レベルで変動して, 全く信用できないことがわかりました.
既存のテストファイルを削除してからfioを実行し直してみるとRead-Randomの値はKiBになってしまいますが, Writeの値が1961MiB, 1754MiBと再び高速化しました.
Btrfsの透過圧縮が関係しているのでしょうか? ファイルの内容を乱数か何かで決めたとすると, 内容によって値が変わってくるということでしょうか.
disable-2でもWriteの値が大幅低下していたらならば, Btrfsが上書きに異様に弱いという推定が出来たのですが, これは一体どういうことでしょうか. bcacheのwritebackの性能があまり良くないということなのでしょうか.
Testing The BCache SSD Cache For HDDs On Linux 4.8 - Phoronix を見てもそこまで極端にWriteに弱いという結果は出ていないので謎です.
fioがあまり信用できないので他のツールを探したのですが, Btrfsの圧縮機能に対して実際の値を出してくれるベンチマークツールはありませんでした.
実感としてbcacheを無効化するとアプリケーションの起動速度が目に見えて遅くなるので, bcacheの実際の効果はあるのでしょうけれど, fioベンチマークではあまり振るわない結果となりました.
実アプリケーションに対してそこそこ機能するとしても, やはりSSDを直接使うのには遠く及ばないパフォーマンスのようですね. ファイルを置く場所に悩まなくなるのはとても良いことなのですが, 何もデメリットのない夢の解決策というわけではないようです.