Null Block Driver
The Rust null block driver rnull is an effort to implement a drop-in replacement for null_blk in Rust.
A null block driver is a good opportunity to evaluate Rust bindings for the block layer. It is a small and simple driver and thus should be simple to reason about. Further, the null block driver is not usually deployed in production environments. Thus, it should be fairly straightforward to review, and any potential issues are not going to bring down any production workloads.
Being small and simple, the null block driver is a good place to introduce the Linux kernel storage community to Rust. This will help prepare the community for future Rust projects and facilitate a better maintenance process for these projects.
Statistics from the commit log of the C null_blk driver (before move) show that the C null block driver has had a significant amount of memory safety related problems in the past. 41% of fixes merged for the C null block driver are fixes for memory safety issues. This makes the null block driver a good candidate for rewriting in Rust.
The driver is implemented entirely in safe Rust, with all unsafe code fully contained in the abstractions that wrap the C APIs.
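The layering described above can be illustrated with a minimal, self-contained sketch. It is not code from rnull or from the kernel crate: `RawCounter`, `raw_counter_inc`, `Counter` and `driver_logic` are hypothetical names standing in for a C object, a C API, its Rust abstraction and the driver code that uses it.

```rust
/// Hypothetical stand-in for a C struct owned by a C subsystem.
#[repr(C)]
pub struct RawCounter {
    value: u64,
}

/// Hypothetical stand-in for a C function. The caller must pass a valid,
/// exclusively accessible pointer, which is why the function is `unsafe`.
unsafe fn raw_counter_inc(ptr: *mut RawCounter) -> u64 {
    // SAFETY: The caller guarantees that `ptr` is valid and not aliased.
    unsafe {
        (*ptr).value += 1;
        (*ptr).value
    }
}

/// Safe abstraction: owns the raw object and upholds the pointer invariants.
pub struct Counter {
    raw: Box<RawCounter>,
}

impl Counter {
    pub fn new() -> Self {
        Self { raw: Box::new(RawCounter { value: 0 }) }
    }

    /// Safe method. The only `unsafe` block lives inside the abstraction.
    pub fn inc(&mut self) -> u64 {
        let ptr: *mut RawCounter = &mut *self.raw;
        // SAFETY: `ptr` comes from a live `Box`, and `&mut self` gives
        // exclusive access for the duration of the call.
        unsafe { raw_counter_inc(ptr) }
    }
}

/// "Driver" code: written against the safe API only, no `unsafe` needed.
pub fn driver_logic() -> u64 {
    let mut counter = Counter::new();
    counter.inc();
    counter.inc()
}
```

With this split, reviewing the driver side does not require reasoning about pointer validity; that burden is concentrated in the `SAFETY` comments of the abstraction.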
Please note that the performance measurements on this page might be misleading due to the results not being normally distributed. This analysis has more details. We observe that the issue is resolved for v6.14-rc5, but we are monitoring the situation going forward.
Features
Implemented features:
- blk-mq support
- Direct completion
- SoftIRQ completion
- Timer completion
- Read and write requests
- Optional memory backing
- Bio-based submission
- NUMA support
- Block size configuration
- Multiple devices
- Dynamic device creation/destruction
- Queue count configuration
Features available in the C null_blk driver that are currently not implemented in this work:
- Queue depth configuration
- Discard operation support
- Cache emulation
- Bandwidth throttling
- Per node hctx
- IO scheduler configuration
- Blocking submission mode
- Shared tags configuration (for >1 device)
- Zoned storage support
- Bad block simulation
- Poll queues
Resources
6.14-rc5 Rebase (rnull-v6.14-rc5)
Changes from rnull-v6.13:
- Change reference counting scheme for Request.
- Move rnull driver to separate directory.
- Rename RawWriter to BufferWriter and move it.
- Enable configuration of rnull via configfs.
  - Enable dynamic creation/destruction of devices via configfs.
- Use Owned for Rust-managed Page objects.
- Change segment iterator to prevent concurrent mutable access to pages.
- Use GFP_NOIO flag for backing rnull pages.
- Add user_per_node_hctx rnull config option.
- Add NUMA home node rnull config option.
- Add submit queue count rnull config option.
- Fix a bug where unwritten bytes were not zeroed on read.
- Properly handle IO requests that are not equal in size to one block.
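The last two fixes in this list concern the read and write paths of a sparsely populated memory backing. The following is a minimal user-space sketch of that behavior, not the rnull implementation: `SparseStore`, the `HashMap` index and the 4096-byte `BLOCK_SIZE` are assumptions made for illustration (the driver itself backs data with kernel pages).

```rust
use std::collections::HashMap;

/// Assumed block size for this sketch.
const BLOCK_SIZE: usize = 4096;

/// Hypothetical sparse backing store: blocks are allocated on first write.
pub struct SparseStore {
    blocks: HashMap<u64, Box<[u8; BLOCK_SIZE]>>,
}

impl SparseStore {
    pub fn new() -> Self {
        Self { blocks: HashMap::new() }
    }

    /// Write `data` at byte `offset`. The request may start mid-block and
    /// span several blocks.
    pub fn write(&mut self, offset: u64, data: &[u8]) {
        let mut pos = offset;
        let mut remaining = data;
        while !remaining.is_empty() {
            let index = pos / BLOCK_SIZE as u64;
            let start = (pos % BLOCK_SIZE as u64) as usize;
            let len = remaining.len().min(BLOCK_SIZE - start);
            // Newly allocated blocks start out zeroed.
            let block = self
                .blocks
                .entry(index)
                .or_insert_with(|| Box::new([0u8; BLOCK_SIZE]));
            block[start..start + len].copy_from_slice(&remaining[..len]);
            remaining = &remaining[len..];
            pos += len as u64;
        }
    }

    /// Read into `buf` from byte `offset`. Ranges that were never written
    /// are filled with zeroes instead of being left untouched.
    pub fn read(&self, offset: u64, buf: &mut [u8]) {
        let mut pos = offset;
        let mut done = 0;
        while done < buf.len() {
            let index = pos / BLOCK_SIZE as u64;
            let start = (pos % BLOCK_SIZE as u64) as usize;
            let len = (buf.len() - done).min(BLOCK_SIZE - start);
            let dst = &mut buf[done..done + len];
            match self.blocks.get(&index) {
                Some(block) => dst.copy_from_slice(&block[start..start + len]),
                None => dst.fill(0),
            }
            done += len;
            pos += len as u64;
        }
    }
}
```

Splitting each request at block boundaries is what lets the sketch handle requests smaller or larger than one block, and the `None` arm is where stale buffer contents would leak through if the zero fill were missing.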
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.11
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
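One way to compute the plotted quantity and its interval is sketched below. The exact procedure behind the plots is not described on this page, so the details are assumptions: the sketch uses the Welch standard error for the difference of means, takes the t critical value `t_crit` as a parameter looked up by the caller (roughly 1.99 for a two-sided 95% interval with two groups of 40 samples), and scales the interval by `mean_iops_c` as if that mean were exact. The function name `relative_difference` is hypothetical.

```rust
/// Sample mean.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Unbiased sample variance (divides by n - 1).
fn variance(xs: &[f64]) -> f64 {
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0)
}

/// Returns (lower, point, upper) for (mean_iops_r - mean_iops_c) / mean_iops_c.
///
/// `t_crit` is the t-distribution critical value for the desired confidence
/// level and degrees of freedom; it is supplied by the caller (assumption).
pub fn relative_difference(iops_r: &[f64], iops_c: &[f64], t_crit: f64) -> (f64, f64, f64) {
    let mean_r = mean(iops_r);
    let mean_c = mean(iops_c);
    // Welch standard error of the difference of means.
    let se = (variance(iops_r) / iops_r.len() as f64
        + variance(iops_c) / iops_c.len() as f64)
        .sqrt();
    let diff = mean_r - mean_c;
    (
        (diff - t_crit * se) / mean_c,
        diff / mean_c,
        (diff + t_crit * se) / mean_c,
    )
}
```

If the returned interval contains zero, the measured difference between the Rust and C drivers for that configuration is not distinguishable from noise at the chosen confidence level.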
6.13 Rebase (rnull-v6.13)
Changes from rnull-v6.12:
- None
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.12 Rebase (rnull-v6.12)
Changes from rnull-v6.12-rc2:
- None
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.12-rc2 Rebase (rnull-v6.12-rc2)
Changes from rnull-v6.11:
- Make QueueData references pinned.
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.11 Rebase (rnull-v6.11)
Changes from rnull-v6.10:
- None.
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.11-rc2 Rebase (rnull-v6.11-rc2)
Changes from rnull-v6.10:
- Base abstractions merged upstream 🥳
- Use atomic queue limits C API for setting queue limits.
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.10 Rebase (rnull-v6.10)
Changes from rnull-v6.10-rc3:
- None
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.10-rc3 Rebase (rnull-v6.10-rc3)
Changes from rnull-v6.9:
- Add ForeignBorrowed.
- Move GenDisk to a builder pattern instead of typestate pattern.
- Move block size validation from driver to abstractions.
- Pin NullBlkModule.
- Refactor Request::try_set_end.
- Rewrite atomic functions in terms of core library functions.
- Fix a bug in timer completions where an offset was not calculated correctly.
- Refactor TagSet initialization in terms of core::mem::zeroed() instead of Opaque::try_ffi_init
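For readers unfamiliar with the builder pattern mentioned in this list, here is a minimal user-space sketch. It does not reproduce the kernel's GenDisk abstraction: `DemoDisk`, `DemoDiskBuilder`, their methods and the accepted block size range (a power of two between 512 and 4096) are all assumptions made for illustration. It also shows how validation can live in the abstraction rather than in the driver.

```rust
/// Hypothetical disk description produced by the builder.
#[derive(Debug)]
pub struct DemoDisk {
    pub name: String,
    pub logical_block_size: u32,
    pub capacity_sectors: u64,
}

/// Hypothetical builder: collects and validates options before the disk exists.
pub struct DemoDiskBuilder {
    logical_block_size: u32,
    capacity_sectors: u64,
}

impl DemoDiskBuilder {
    /// Start with default options.
    pub fn new() -> Self {
        Self { logical_block_size: 512, capacity_sectors: 0 }
    }

    /// Validation happens here, so callers cannot build an invalid disk.
    /// The accepted range is an assumption of this sketch.
    pub fn logical_block_size(mut self, size: u32) -> Result<Self, &'static str> {
        if !size.is_power_of_two() || !(512..=4096).contains(&size) {
            return Err("block size must be a power of two in 512..=4096");
        }
        self.logical_block_size = size;
        Ok(self)
    }

    /// Set the capacity in sectors.
    pub fn capacity_sectors(mut self, sectors: u64) -> Self {
        self.capacity_sectors = sectors;
        self
    }

    /// Consume the builder and produce the configured disk.
    pub fn build(self, name: &str) -> DemoDisk {
        DemoDisk {
            name: name.to_string(),
            logical_block_size: self.logical_block_size,
            capacity_sectors: self.capacity_sectors,
        }
    }
}
```

A call chain such as `DemoDiskBuilder::new().logical_block_size(4096)?.capacity_sectors(8).build("demo0")` then reads as a single configuration step, and an invalid block size is rejected before any disk is created.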
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.9 Rebase (rnull-v6.9)
Changes from rnull-v6.8:
- Do not rely on C refcounting of Request
- Use ARef to track Request lifetime
- Use Page instead of Folio to track memory for memory backed mode
- Use typestate pattern to track state of GenDisk
- Panic when requests cannot be completed
- Remove associated type RequestDataInit and use return position impl trait instead
- Call Request::start implicitly
- Split helper function C file
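The typestate pattern named in this list encodes an object's state in its type, so that operations that are invalid in a given state fail to compile. A minimal user-space sketch follows; `StatefulDisk`, `Initialized`, `Added` and the method names are hypothetical and do not mirror the kernel's GenDisk API.

```rust
use std::marker::PhantomData;

/// Marker type: the disk has been set up but not yet registered.
pub struct Initialized;
/// Marker type: the disk has been registered and can be used.
pub struct Added;

/// Hypothetical disk whose state is tracked by the `State` type parameter.
pub struct StatefulDisk<State> {
    name: String,
    _state: PhantomData<State>,
}

impl StatefulDisk<Initialized> {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), _state: PhantomData }
    }

    /// Consumes the initialized disk and returns it in the `Added` state,
    /// so the old state can no longer be used by mistake.
    pub fn add(self) -> StatefulDisk<Added> {
        StatefulDisk { name: self.name, _state: PhantomData }
    }
}

impl StatefulDisk<Added> {
    /// Only available once the disk has been added.
    pub fn describe(&self) -> String {
        format!("{} is registered", self.name)
    }
}
```

Calling `describe` on a `StatefulDisk<Initialized>` is a compile error in this sketch, which is the property the pattern is used for.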
Performance
Setup
- AMD Ryzen 5 7600
- 32 GB 4800 MT/s DDR5 on one channel
- 1x Samsung 990 Pro 1TB (PCIe 4.0 x4 16 GT/S)
- NixOS 24.05
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 5 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.8 Rebase (rnull-v6.8)
Changes from rnull-v6.8-rc6:
- Slight refactoring of patch order
Performance
Setup
- 12th Gen Intel(R) Core(TM) i5-12600
- 32 GB DRAM
- Debian Bullseye userspace
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 5 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.8-rc6 Rebase (rnull-v6.8-rc6)
Changes from rnull-6.8:
- Change lock alignment mechanics
- Apply reference counting to Request
- Drop some inline directives
Performance
Setup
- 12th Gen Intel(R) Core(TM) i5-12600
- 32 GB DRAM
- Debian Bullseye userspace
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 5 samples for each configuration
- Difference of means modeled with t-distribution
- P95 confidence intervals
6.7 Rebase (rnull-6.7)
Changes from null_blk-6.6:
- Move to Folio for memory backing instead of Page
- Move to XArray for memory backing instead of RadixTree
Performance
Setup
- 12th Gen Intel(R) Core(TM) i5-12600
- 32 GB DRAM
- Debian Bullseye userspace
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples
- Difference of means modeled with t-distribution
- P95 confidence intervals
Performance September 2023 (null_blk-6.6)
Setup
- 12th Gen Intel(R) Core(TM) i5-12600
- 32 GB DRAM
- 1x INTEL MEMPEK1W016GA (PCIe 3.0 x2)
- Debian Bullseye userspace
Results
- Plot shows (mean_iops_r - mean_iops_c) / mean_iops_c
- 40 samples
- Difference of means modeled with t-distribution
- P95 confidence intervals
Performance September 2023
Setup
- 12th Gen Intel(R) Core(TM) i5-12600
- 32 GB DRAM
- 1x INTEL MEMPEK1W016GA (PCIe 3.0 x2)
- Debian Bullseye userspace
Results
In most cases there is less than 2% difference between the Rust and C drivers.
Contact
Please contact Andreas Hindborg through Zulip.