Rust Port of Linux's RamFS File System

1. Introduction

RamFS is a RAM-based file system in Linux. It describes itself as a simple file system for learning the minimal implementation needed to create a new Linux file system (link).

During the Fall 2021 semester of Advanced Linux Kernel Programming with Dr. Changwoo Min at Virginia Tech, Connor Shugg and I (Chase Minor) ported it from kernel C to kernel Rust to learn the process of porting something internal to the kernel. We offer our source and what we learned here for others to use, include, or learn from.

The main contribution of our work is the port of the RamFS file system itself. However, we also added various other things to the kernel that could benefit other Rust for Linux developers. We focus on these additions below, as RamFS itself should be fairly self-descriptive. We will try to stay away from miscellaneous code changes; however, if there is interest, we can take some time to explain those as well.

Our source code can be found at https://github.com/acminor/linux.

2. RamFS Port

RamFS has been mostly ported to Rust. The only things left to port are dependent on macros (fs_initcall), functions/types not exported using rust/kernel/bindings_helper.h (struct fs_context_operations, etc.), and inline function wrappers (dget). What is left can be found in fs/ramfs_rust/inode.c. Other than this, we also did not port file-nommu.c. Furthermore, we did not change anything related to include/linux/ramfs.h.

2.1. Process

In general, our process was to port individual parts of RamFS logic incrementally. We accomplished this by adding cbindgen to the Makefile.build rules to generate header files from Rust source code. This was to allow us to reference Rust code from C in an automated fashion. In this way, we could port a function that has dependencies in kernel C to Rust. We would include our generated headers in the C file and compile both the C source and Rust source and link them together.
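As a minimal sketch of the kind of function this workflow exports, the block below defines a C-callable Rust function. The name ramfs_rs_mask_mode and its body are hypothetical and chosen only for illustration; they are not taken from our port. For a function of this shape, cbindgen would emit a prototype roughly like `uint32_t ramfs_rs_mask_mode(uint32_t requested);` into the generated header, which the remaining C code can then include.

```rust
/// Hypothetical example of a function ported to Rust and exported to C.
/// `#[no_mangle]` keeps the symbol name stable and `extern "C"` selects the
/// C calling convention, so C code can call it through a generated header.
#[no_mangle]
pub extern "C" fn ramfs_rs_mask_mode(requested: u32) -> u32 {
    // Keep only the permission bits (the low 12 bits, octal 7777).
    requested & 0o7777
}
```

The C side then calls the function like any other kernel function, and the linker resolves it against the Rust object file.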

3. Cbindgen Issues

Cbindgen, in general, is meant to work with Cargo projects. This becomes an issue for Rust for Linux, which does not use Cargo. We spent some time trying to generate the relevant information for cbindgen from the kernel build system, with no luck. Instead, we currently rely on the lack of namespace support in cbindgen (link). Using this, we can create an internal module with "metatype" information on whether an exported type is a struct, enum, or union. This can be seen below. Because Rust properly ignores this code while cbindgen does not, this accomplishes our goal and allows cbindgen to export C-style types with the correct "metatype" prefix.

#[allow(unused)]
#[rustfmt::skip]
mod __anon__ {
    struct user_namespace;
    ...
    struct fs_parameter_spec;
}

4. Sequence Files

In the process of making our code more Rust-like, we noticed that ramfs_show_options used seq_printf (link). Currently, to our knowledge, Rust for Linux does not have the functionality to handle this. However, due to the work of Gary Guo (nbdd0121), Rust for Linux does have support for printing Rust-style formatting strings with the "%pA" format specifier (link). This is used by the pr_info! family of macros. Taking inspiration from this code, we created a similar macro for sequence file printing (seq_printf!). Special care had to be taken to ensure that the macro's internal unsafe block does not extend over the sequence file expression itself. Whether unsafe assumptions still leak into the format arguments needs to be investigated, and I believe more work will be needed here. See my comments here. You can see an example of using seq_printf! below.

if mode != RAMFS_DEFAULT_MODE {
  seq_printf!(unsafe { m.as_mut().unwrap() }, ",mode={:o}", mode);
}
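To illustrate the shape of such a macro, here is a user-space sketch, not our kernel implementation. It assumes a hypothetical DemoSeqFile type that collects output in a String, and it writes through core::fmt::Write instead of calling the C seq_printf with "%pA". The point it shows is that the sequence file expression is evaluated and bound before anything else happens, so a real macro can keep its own unsafe block separate from the caller's expression.

```rust
use core::fmt::{self, Write};

/// Hypothetical user-space stand-in for the kernel's `struct seq_file`.
pub struct DemoSeqFile {
    pub buf: String,
}

impl Write for DemoSeqFile {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.buf.push_str(s);
        Ok(())
    }
}

/// Sketch of a `seq_printf!`-style macro that takes Rust format strings.
macro_rules! seq_printf {
    ($m:expr, $($arg:tt)*) => {{
        // Evaluate the caller's expression first, outside of any unsafe
        // block that a kernel version of this macro would need internally.
        let sink: &mut DemoSeqFile = $m;
        let _ = core::fmt::Write::write_fmt(sink, format_args!($($arg)*));
    }};
}

/// Mirrors the mode check in the example above; 0755 is the kernel's
/// RAMFS_DEFAULT_MODE value.
pub fn show_options(m: &mut DemoSeqFile, mode: u32) {
    const RAMFS_DEFAULT_MODE: u32 = 0o755;
    if mode != RAMFS_DEFAULT_MODE {
        seq_printf!(m, ",mode={:o}", mode);
    }
}
```

Calling show_options with a non-default mode appends the option string to the buffer, while the default mode leaves it untouched.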

5. Compile-time Default C-style Structs

In Rust, static data has to be available at compile time. This can result in having to use libraries such as lazy_static. As Rust for Linux does not have lazy_static, we originally specified each of the otherwise unspecified fields in a Rust structure by hand. This is because C automatically sets these fields to zero when they are left out, while Rust does not allow fields to be omitted.

It would be more Rust-like to implement Default for our various structures and expand this into the static data using the ".." expansion syntax. However, Default is not a compile-time expression. Thus, it cannot be used for static data.

It might be tempting to use something like alloc::alloc_zeroed. This is valid as we can assume all C-style structs are valid if zero-initialized (this is how C interprets things). However, this function is also not usable at compile time. We believed we had hit a wall until we discovered that both transmuting data and creating fixed-size arrays can be done at compile time.

With this information, we implemented a macro called c_default_struct! for generating C-style default zeroed structs. This currently has to be implemented as a macro. We attempted to make this a Rust function; however, as of our last attempt, it appears that ongoing work on const generics prevents this. As for the implementation, it simply creates a zero-filled fixed-size byte array whose length is core::mem::size_of of the target type and uses core::mem::transmute to convert it to the final type. An example of using this macro can be seen below. This macro can be found here.

static ramfs_ops: super_operations = super_operations {
  statfs: Some(simple_statfs),
  drop_inode: Some(generic_delete_inode),
  show_options: Some(ramfs_show_options),
  ..c_default_struct!(super_operations)
};
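The following is a minimal sketch of how such a macro can be written, based on the description above; it is not necessarily identical to the macro in our tree. The demo_operations struct and demo_open function are hypothetical stand-ins for a bindgen-generated type and a real callback. The sketch assumes that an all-zero bit pattern is a valid value of the target type, which holds here because a zeroed Option of a function pointer is None.

```rust
/// Hypothetical stand-in for a bindgen-generated C operations table.
#[repr(C)]
#[allow(non_camel_case_types)]
pub struct demo_operations {
    pub open: Option<extern "C" fn(i32) -> i32>,
    pub release: Option<extern "C" fn(i32) -> i32>,
    pub flags: u32,
}

/// Sketch of a `c_default_struct!`-style macro: build a zeroed byte array
/// of the struct's size and transmute it into the struct. Both steps are
/// usable in a static initializer.
macro_rules! c_default_struct {
    ($t:ty) => {
        // SAFETY (assumption): all-zero bytes are a valid value of `$t`,
        // as is the case for C-style structs of integers and nullable
        // function pointers.
        unsafe {
            core::mem::transmute::<[u8; core::mem::size_of::<$t>()], $t>(
                [0u8; core::mem::size_of::<$t>()],
            )
        }
    };
}

/// Hypothetical callback used only to fill one field.
extern "C" fn demo_open(fd: i32) -> i32 {
    fd + 1
}

/// Only `open` is set explicitly; every other field comes from the zeroed
/// default, mirroring how C treats omitted initializers.
pub static DEMO_OPS: demo_operations = demo_operations {
    open: Some(demo_open),
    ..c_default_struct!(demo_operations)
};
```

Because transmute checks at compile time that both sizes match, this works for a concrete type named in the macro invocation, which is one reason the macro form is convenient here.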

6. Kbuild Information

We added options under "File systems" for "Rust Filesystems" where we have an option to replace RamFS with the Rust RamFS version.

7. Build Instructions

Follow the normal build guide. Cbindgen should be installed at version 0.20.0. Ensure that in menuconfig, you enable replacing RamFS with the Rust version. See the above information on Kbuild.

8. Future Work

There is still much that could be done to build on this port.

  1. It would be prudent (if RamFS Rust was upstreamed) to address the proper visibility of the various functions in inode_rs.rs. They should correspond to the original C version (removing pub when the original C version was marked as static).
  2. RamFS was updated during our porting process, and we have yet to include the updated code.
  3. Rust interfaces for structs such as super_operations would be nice. One potential option for this is a Trait style interface where the different functions could be optionally implemented on a type. This would need to be cast-able or binary equivalent to the C struct.
  4. Anonymous structs should be properly handled. By default, bindgen will give generated names to anonymous structs and unions. This could become an issue if the struct is reordered, and it generally makes comprehending code difficult. One possible solution to this is to conditionally define a macro function to give names to the anonymous members when parsed by bindgen but not when compiled normally. The issue with this is that Rust code would cause C code to be affected by these markings.

Example of anonymous struct naming.

S_IFREG => {
  inode.i_op = unsafe { &ramfs_file_inode_operations };
  inode.__bindgen_anon_3.i_fop = unsafe { &ramfs_file_operations };
}

Example of conditional naming of anonymous structs in C.

#ifdef RUST_BINDGEN
#define BINDGEN_NAME(NAME) NAME
#else
#define BINDGEN_NAME(NAME)
#endif

struct inode {
  union {
    const struct file_operations  *i_fop; /* former ->i_op->default_file_ops */
    void (*free_inode)(struct inode *);
  } BINDGEN_NAME(fop_union);
};
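For the trait-style interface suggested in item 3 above, one possible shape is sketched below. Everything in it is hypothetical: demo_super_operations is a simplified stand-in for the C struct (with u64 values in place of pointer arguments), and the SuperOps trait, its HAS_* constants, and OpsTable are illustrative names, not an existing Rust for Linux API. The idea is that a type opts in to individual operations, and a C-layout table is generated at compile time with None for everything it does not implement.

```rust
use core::marker::PhantomData;

/// Hypothetical, simplified stand-in for the C `struct super_operations`.
#[repr(C)]
#[allow(non_camel_case_types)]
pub struct demo_super_operations {
    pub statfs: Option<extern "C" fn(u64) -> i32>,
    pub drop_inode: Option<extern "C" fn(u64) -> i32>,
}

/// Operations a file system may optionally implement. A `HAS_*` constant
/// records whether the matching method is provided.
pub trait SuperOps {
    const HAS_STATFS: bool = false;
    const HAS_DROP_INODE: bool = false;

    fn statfs(_sb: u64) -> i32 {
        -1
    }
    fn drop_inode(_inode: u64) -> i32 {
        -1
    }
}

/// Builds a C-compatible table for any `T: SuperOps`.
pub struct OpsTable<T: SuperOps>(PhantomData<T>);

impl<T: SuperOps> OpsTable<T> {
    // C-ABI thunks that forward to the trait methods.
    extern "C" fn statfs_thunk(sb: u64) -> i32 {
        T::statfs(sb)
    }
    extern "C" fn drop_inode_thunk(inode: u64) -> i32 {
        T::drop_inode(inode)
    }

    /// Unimplemented operations stay `None`, which is NULL on the C side.
    pub const TABLE: demo_super_operations = demo_super_operations {
        statfs: if T::HAS_STATFS { Some(Self::statfs_thunk) } else { None },
        drop_inode: if T::HAS_DROP_INODE { Some(Self::drop_inode_thunk) } else { None },
    };
}

/// Example file system type that implements only `statfs`.
pub struct DemoFs;

impl SuperOps for DemoFs {
    const HAS_STATFS: bool = true;

    fn statfs(_sb: u64) -> i32 {
        0
    }
}
```

Since the table type is repr(C), a value such as OpsTable::<DemoFs>::TABLE has the same layout as its C counterpart, which addresses the requirement that the interface be binary equivalent to the C struct.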

9. Miscellaneous

Our tests for the RamFS Rust file system can be found here.

Our project paper with more information can be found here.

  • Note, the build instructions in this paper may be out of date.

Date: 2022-01-20 Thu

Author: A.C. Minor
