This article also has a Chinese version.
This article discusses my thoughts, designs, and implementations on creating a reliable context-passing component in Rust. My project, certain-map, has been open-sourced (it was first released over a year ago, with more improvements made recently, which I will discuss later in this article). Feel free to use it!
Project URL: https://github.com/ihciah/certain-map
What problem does it solve:
- When passing context across components, it can leverage the compiler to ensure the existence of fields (i.e., when a component has a read dependency on a field in the Context, the preceding component must have written to that field, otherwise it will not compile).
- The context required by generic components can be defined as a generic parameter and constrained, which makes the component implementation more generic and not coupled to a specific type of Context.
Note: Although the project name seems to suggest a map implementation, it is actually a struct generated using procedural macros. The reason it is named certain-map is that it was originally designed to replace TypeMap and ensure the existence of fields.
Service Abstraction
If you are already familiar with this, you may skip this section.
In Rust, thanks to the type system, it is easy to build layered abstractions, break down complex processes into independent components, and combine them when needed. A typical example is the tower Service (I have also defined an async-oriented Service and its auxiliary tools: service-async).
A typical Service is defined as follows:
1 | trait Service<Request> { |
Using the general-purpose gateway framework I am currently developing as an example, L7 capabilities may be built on top of L5 capabilities, which in turn are based on L4 implementations. To ensure that the logic is fully decoupled and pluggable, using the Service abstraction allows us to implement the following Services:
L4Svc: Service<(SocketAddr, T)>
TLSSvc: Service<T> where T: AsyncRead + AsyncWrite
H1Dispatcher: Service<T> where T: AsyncRead + AsyncWrite
H1Svc: Service<http::Request<Bytes>, Response = http::Response<Bytes>>
When building components, L4Svc acts as the outermost layer, receiving peer addresses and connections and passing these connections to the next layer. The H1Dispatcher, on the other hand, needs to implement a loop that continuously parses HTTP requests and passes them to H1Svc for processing, finally writing the Response back.
Thus, we can have code similar to the following:
1 | struct L4Svc<T> { |
Isn’t it very straightforward? We can build complex logic by composing different Services, which can be implemented by different people. We just need to clearly define what kind of Request each Service accepts and constrain its inner to implement a certain type of Service.
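The layering described above can be sketched as follows. This is a simplified illustration, not the article's original listing: the Service trait here is synchronous to keep the example free of an async runtime, and EchoLen and handle are hypothetical names introduced only for the demo.

```rust
use std::net::SocketAddr;

/// Simplified, synchronous stand-in for an async `Service` trait.
trait Service<Request> {
    type Response;
    type Error;
    fn call(&self, req: Request) -> Result<Self::Response, Self::Error>;
}

/// Outermost layer: receives the peer address and the connection,
/// then hands the connection to the inner service.
struct L4Svc<T> {
    inner: T,
}

impl<T, C> Service<(SocketAddr, C)> for L4Svc<T>
where
    T: Service<C>,
{
    type Response = T::Response;
    type Error = T::Error;

    fn call(&self, (addr, conn): (SocketAddr, C)) -> Result<Self::Response, Self::Error> {
        // A real gateway might log or filter by `addr` here.
        let _ = addr;
        self.inner.call(conn)
    }
}

/// Hypothetical innermost layer: the "connection" is modelled as a byte buffer.
struct EchoLen;

impl Service<Vec<u8>> for EchoLen {
    type Response = usize;
    type Error = std::convert::Infallible;

    fn call(&self, conn: Vec<u8>) -> Result<usize, Self::Error> {
        Ok(conn.len())
    }
}

/// Compose the two layers and run one request through them.
fn handle(addr: SocketAddr, conn: Vec<u8>) -> usize {
    let svc = L4Svc { inner: EchoLen };
    svc.call((addr, conn)).unwrap()
}

fn main() {
    let addr: SocketAddr = "127.0.0.1:8080".parse().unwrap();
    println!("{}", handle(addr, b"hello".to_vec()));
}
```

Note that L4Svc only states what its inner service must accept; any type implementing Service for the connection type can be plugged in.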
Context Passing
As can be seen from the previous examples, the essence of a Service is the abstraction of asynchronous functions, where the type system constrains the inputs and outputs of these functions together. Information generated by an outer Service can be passed on to an inner Service, and from the inner Service to the next layer.
This explicit passing makes the type transformations between Requests and Responses visible, reflecting the logic and constraints of each Service (for example, H1Dispatcher receives impl Read+Write and constrains its inner svc to handle http::Request<Bytes>, so it clearly focuses on HTTP protocol encoding and decoding internally).
However, sometimes this explicit passing can lead to code redundancy. For instance, if each layer needs access to the requester’s IP address (the SocketAddr in the previous example), then this data must be passed at every layer. If there is a lot of information that needs to be passed across layers, this explicit passing can become cumbersome and make components less general and more coupled.
One solution to this problem is to collect all such information into a single structure and pass this structure through each layer. This structure could be a predefined struct or a type map. This way, each layer can read any information it needs and add information that subsequent Services might need.
Two Methods of Context Storage
Struct-based Context
1 | struct MyContext { |
MyContext is defined by the user and contains all necessary information for passing. Here, the “user” refers to the developer who has implemented a part of the Service, and they may use Services provided by other developers.
At this point, all Service implementations need to receive the concrete type MyContext and pass it on to the next layer of Service. Clearly, Services implemented in this way lack reusability, as each user’s Context is a different type.
TypeMap-based Context
TypeMap is not a special structure; fundamentally, it is a HashMap where the key is a type id and the value is a trait object. It can store data of any different type.
1 | impl<T, R> Service<(TypeMap, SocketAddr, R)> for L4Svc<T> |
This method addresses the problem of the struct-based context: TypeMap is general enough that every Service can depend on it directly without being tied to a user-defined type.
The hyper library uses this form to pass certain internally used context information, with Extensions in http::Request and http::Response functioning as a TypeMap.
We can define new types to wrap fields, in order to avoid key conflicts:
1 | struct PeerAddr(pub SocketAddr); |
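To make the idea concrete, here is a minimal type map sketch built only on the standard library. It is not the implementation used by any particular crate; the TypeMap struct and the LocalAddr newtype are hypothetical and exist only for illustration.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::net::SocketAddr;

/// Minimal type map: one value per type, each boxed on the heap.
#[derive(Default)]
struct TypeMap {
    inner: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMap {
    fn insert<T: Any>(&mut self, value: T) {
        self.inner.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// The lookup is dynamic, so presence can only be reported as an `Option`.
    fn get<T: Any>(&self) -> Option<&T> {
        self.inner
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

// Newtypes keep two `SocketAddr` values from colliding on the same key.
#[derive(Debug)]
struct PeerAddr(pub SocketAddr);
#[derive(Debug)]
struct LocalAddr(pub SocketAddr);

fn main() {
    let mut map = TypeMap::default();
    map.insert(PeerAddr("10.0.0.1:1234".parse().unwrap()));

    // Present because an outer layer inserted it.
    println!("{:?}", map.get::<PeerAddr>());
    // Nothing forces anyone to insert `LocalAddr`; the miss only shows up at runtime.
    println!("{:?}", map.get::<LocalAddr>());
}
```

Because get returns an Option, every reader has to either handle the missing case or unwrap and risk a panic.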
Flaws of TypeMap
From the analysis above, we can see that TypeMap is a very versatile solution; however, it has several shortcomings:
- Inability to Guarantee the Presence of Values: When retrieving values, there is no guarantee that the value will exist, which can lead to panic or necessitate additional error handling logic.
- Heap Allocation Overhead: Each value is stored as a boxed trait object, so every insertion allocates on the heap.
Among these issues, the first one is particularly severe. I participated in the development of an internal RPC framework at ByteDance, which has now been open-sourced as Volo. In its early stages, it experienced several unexpected panics, which were due to the absence of values in the TypeMap. Outer layer Services, due to oversight, forgot to insert certain context values in some branches, while inner layer Services strongly depended on and assumed the presence of these values, leading to panic.
Reliable Context Passing
Let’s reflect on why TypeMap has these defects. The reason lies in its nature as a general-purpose map that is read and written at runtime, which makes it incapable of fully leveraging compile-time checks.
To introduce compile-time checks to solve this issue, one idea is to use different types to describe states of existence, and conditionally implement certain traits for them. When the existence of fields changes, we would need to change the type accordingly.
We need to address three questions:
- How to design Traits to constrain the presence of fields and manipulate them?
- How to define types that change as the presence of fields changes?
- How to optionally implement Traits for different states of field presence?
Designing Traits to Manipulate Fields
A similar design is discussed in my article: Rust HTTP Framework Design - Taking Axum 0.6 as an Example
The Rust standard library provides abstractions like AsRef and AsMut. Linkerd has designed the Param trait to describe similar abstractions, and Axum has designed the reverse trait FromRef.
We can further develop this idea by providing more operational Traits:
I have published a package to define these traits: param; its source code can be found here.
1 | pub trait Param<T> { |
When we need to constrain the presence of a specific field, we can declare CX: Param<PeerAddr> to enforce this constraint.
When it comes to operations such as inserting or deleting fields, which change the field’s existence, we need to obtain ownership of the current type and return a new type. This new type is defined as the associated type of that trait.
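The following sketch shows how such traits might look and how a generic component can state its needs through a bound. The trait definitions are hand-written approximations for illustration and are not guaranteed to match the param crate exactly; FixedContext and describe_peer are hypothetical.

```rust
use std::net::SocketAddr;

/// Read a field by value (usually a clone).
pub trait Param<T> {
    fn param(&self) -> T;
}

/// Borrow a field that is guaranteed to exist.
pub trait ParamRef<T> {
    fn param_ref(&self) -> &T;
}

/// Insert a field. The context may change type, so it is consumed
/// and the new type is returned as an associated type.
pub trait ParamSet<T> {
    type Transformed;
    fn param_set(self, item: T) -> Self::Transformed;
}

#[derive(Clone, Debug, PartialEq)]
pub struct PeerAddr(pub SocketAddr);

/// A generic component only states what it needs from the context.
fn describe_peer<CX: ParamRef<PeerAddr>>(cx: &CX) -> String {
    format!("peer = {}", cx.param_ref().0)
}

/// Hypothetical hand-written context used only to exercise the bound.
struct FixedContext {
    peer: PeerAddr,
}

impl ParamRef<PeerAddr> for FixedContext {
    fn param_ref(&self) -> &PeerAddr {
        &self.peer
    }
}

fn main() {
    let cx = FixedContext {
        peer: PeerAddr("192.0.2.1:443".parse().unwrap()),
    };
    println!("{}", describe_peer(&cx));
}
```

describe_peer never names FixedContext, so it works with any context type that can prove the field exists.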
Defining Types and Changing Types During Operations
Struct is clearly the best storage method as it is highly efficient. Since the number and types of fields depend on user definitions, we need to generate code based on user specifications. A procedural macro is our best choice for this.
Before developing a procedural macro, we need to manually write out a target structure to validate the feasibility of this design and use it as a template. The implementation of the procedural macro is not complicated, so here I will just show the target code.
Consider the following user-defined structure:
1 | struct MyContext { |
We can generate the following structure:
1 | struct MyContext<T1, T2> { |
We can define two helper structures to be included in our library:
1 | pub struct Occupied<T>(T); |
These two structures can be used to fill the positions of T1 and T2.
For example, initially, their types expand to:
1 | struct MyContext { |
After writing the peer addr, the type becomes:
1 | struct MyContext { |
During write operations (which refer to operations that change the existence of fields), we need to take ownership of the current type and reassemble the fields into the new type. For example:
1 | impl<T1, T2> ParamSet<PeerAddr> for MyContext<T1, T2> { |
Conditionally Implementing Traits for Different States
With the design previously discussed, we can now conditionally implement traits for different states. For example:
1 | impl<T2> ParamRef<PeerAddr> for MyContext<Occupied<PeerAddr>, T2> { |
MyContext<Vacant, T> does not implement ParamRef<PeerAddr>.
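Putting the pieces of this section together, a hand-written version of the target structure might look like the sketch below. It is an approximation of what a macro could generate, not the actual output of certain-map; the field names and the peer_port helper are assumptions made for the example.

```rust
use std::mem::size_of;
use std::net::SocketAddr;

pub struct Occupied<T>(pub T);
pub struct Vacant;

pub struct PeerAddr(pub SocketAddr);
pub struct UserName(pub String);

pub trait ParamRef<T> {
    fn param_ref(&self) -> &T;
}
pub trait ParamSet<T> {
    type Transformed;
    fn param_set(self, item: T) -> Self::Transformed;
}

/// Each type parameter is either `Vacant` or `Occupied<Field>`.
pub struct MyContext<T1, T2> {
    peer_addr: T1,
    user_name: T2,
}

impl MyContext<Vacant, Vacant> {
    pub fn new() -> Self {
        MyContext { peer_addr: Vacant, user_name: Vacant }
    }
}

// Writing is allowed in any state and always yields the "occupied" type.
impl<T1, T2> ParamSet<PeerAddr> for MyContext<T1, T2> {
    type Transformed = MyContext<Occupied<PeerAddr>, T2>;

    fn param_set(self, item: PeerAddr) -> Self::Transformed {
        // Take the struct apart and reassemble it as the new type.
        MyContext { peer_addr: Occupied(item), user_name: self.user_name }
    }
}

// Reading is implemented only when the slot is known to be occupied.
impl<T2> ParamRef<PeerAddr> for MyContext<Occupied<PeerAddr>, T2> {
    fn param_ref(&self) -> &PeerAddr {
        &self.peer_addr.0
    }
}

fn peer_port<CX: ParamRef<PeerAddr>>(cx: &CX) -> u16 {
    cx.param_ref().0.port()
}

fn main() {
    let cx = MyContext::<Vacant, Vacant>::new();
    // peer_port(&cx); // would not compile: `ParamRef<PeerAddr>` is not implemented yet
    assert_eq!(size_of::<MyContext<Vacant, Vacant>>(), 0);

    let cx = cx.param_set(PeerAddr("10.0.0.1:8080".parse().unwrap()));
    println!("{}", peer_port(&cx));
}
```

The empty context occupies no memory because both slots are zero-sized, and the commented-out call shows the kind of mistake the compiler now rejects.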
Furthermore, we can define two traits, Available and MaybeAvailable, to represent the operability of fields, which can reduce the complexity of the generated code:
1 | pub trait Available { |
Available indicates that a field definitely exists (implemented only for Occupied<T>), primarily involving read methods, take methods, etc.; MaybeAvailable indicates that a field might exist (thus it can be implemented for both Occupied<T> and Vacant), including overwrite write methods, remove methods, etc. To reduce the possibility of misuse, we can also introduce sealed traits to restrict the implementation of traits.
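One possible shape for these two traits, including the sealed-trait restriction, is sketched below. The method names (get_ref, take, maybe_ref) and the name_len helper are hypothetical; the real definitions in certain-map may differ.

```rust
mod sealed {
    /// Private supertrait: code outside this crate cannot implement it,
    /// and therefore cannot implement `Available` or `MaybeAvailable`.
    pub trait Sealed {}
}

pub struct Occupied<T>(pub T);
pub struct Vacant;

impl<T> sealed::Sealed for Occupied<T> {}
impl sealed::Sealed for Vacant {}

/// Implemented only for `Occupied<T>`: the value is definitely there.
pub trait Available<T>: sealed::Sealed {
    fn get_ref(&self) -> &T;
    fn take(self) -> T;
}

/// Implemented for both slot states: the value might be there.
pub trait MaybeAvailable<T>: sealed::Sealed {
    fn maybe_ref(&self) -> Option<&T>;
}

impl<T> Available<T> for Occupied<T> {
    fn get_ref(&self) -> &T {
        &self.0
    }
    fn take(self) -> T {
        self.0
    }
}

impl<T> MaybeAvailable<T> for Occupied<T> {
    fn maybe_ref(&self) -> Option<&T> {
        Some(&self.0)
    }
}

impl<T> MaybeAvailable<T> for Vacant {
    fn maybe_ref(&self) -> Option<&T> {
        None
    }
}

/// One generic helper covers every slot state that can be read optionally.
fn name_len<S: MaybeAvailable<String>>(slot: &S) -> usize {
    slot.maybe_ref().map_or(0, |s| s.len())
}

fn main() {
    println!("{}", name_len(&Occupied(String::from("alice"))));
    println!("{}", name_len(&Vacant));
}
```

With traits like these, generated code can bound a slot parameter by Available or MaybeAvailable instead of emitting a separate impl for every combination of states.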
Implementation Results
So far, we have addressed the three questions raised above: we defined the operational traits, generated the target structures, and implemented the operations that change the existence of fields. Field existence is now checked at compile time, which avoids panics.
Below is an example from certain-map (link):
1 | use certain_map::{certain_map, Param, ParamRef, ParamRemove, ParamSet, ParamTake}; |
If we attempt to read UserName before inserting it, or after deleting it, the compiler will throw an error, preventing the code from compiling.
In the given example, the type of the context structure undergoes the following changes:
- Initially, MyCertainMap is an empty struct without any fields, corresponding to MyCertainMap<Vacant, Vacant>, and its size is 0.
- After inserting UserName, MyCertainMap changes to MyCertainMap<Occupied<UserName>, Vacant>, and its size becomes equivalent to that of UserName.
- After taking UserName, MyCertainMap reverts to MyCertainMap<Vacant, Vacant>, and its size returns to 0.
- After inserting UserAge, MyCertainMap changes to MyCertainMap<Vacant, Occupied<UserAge>>, and its size becomes equivalent to that of UserAge.
With this design, the problem is solved: if the code compiles, every field that a component reads is guaranteed to exist. Furthermore, by introducing the param series of traits, we can use Context as a generic parameter when defining Service, decoupling its concrete type and making the components more versatile.
More Efficient Context Passing
I integrated certain-map (v0.2) into MonoLake, which is a generic gateway framework based on Monoio that I developed at ByteDance. It has not yet been open-sourced but is expected to be in the near future.
During stress testing and performance profiling, I observed some related memory copying overhead. There are two sources of this overhead:
- As the field existence changes, the struct type and size also change, necessitating the splitting of fields and recombination into the new type, which leads to a certain amount of stack copying.
- When an outer layer Service passes the Context to an inner layer, there can also be some copying overhead (whether this overhead actually exists depends on whether the async generator has optimized for this).
To address this issue, I proposed a new design: separate storage from state. The storage part pre-allocates space for all fields, and a composite structure made up of references to the state and storage is passed. Thus, when the state changes, only the state is modified, and the storage does not move, thereby avoiding the aforementioned stack copying overhead.
This design has been implemented in version 0.3 of certain-map.
Storage Structure and State Structure
The storage structure is a pre-allocated struct that is unaware of the state (i.e., the presence of fields). For example, for a user-defined structure like:
1 | struct MyContext { |
The actual storage structure (referred to below as the Store structure) is:
1 | pub struct MyContextStorage { |
The state structure, on the other hand, is a zero-sized struct, with generic parameters indicating the presence of corresponding fields in the storage structure (referred to below as the State structure):
1 | pub struct MyContextState<T1, T2> { |
The final structure passed is a composite structure (referred to below as the Handler structure; this structure is essentially equivalent to a single reference, thus very low in passing cost):
1 | pub struct MyContextHandler<'a, T1, T2> { |
At the trait level, we can still use the preceding design, where we need to implement traits like Param for MyContextHandler; the Available/MaybeAvailable design can also be continued (though function signatures need modification).
We also need some new traits to provide capabilities like generating the Handler structure:
1 | pub trait Handler { |
Finally, we need to implement the Drop method for the Handler structure to ensure that the fields within the storage structure are properly dropped.
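The separation of storage, state, and handler can be sketched with a single field as follows. This is a reduced illustration rather than the code certain-map 0.3 generates: the SlotState trait, its INIT constant, and the set_user_name / user_name methods are assumptions made to keep the example short.

```rust
use std::marker::PhantomData;
use std::mem::{ManuallyDrop, MaybeUninit};

pub struct UserName(pub String);

/// State markers; `INIT` tells the handler whether the slot holds a live value.
pub struct Occupied;
pub struct Vacant;
pub trait SlotState {
    const INIT: bool;
}
impl SlotState for Occupied {
    const INIT: bool = true;
}
impl SlotState for Vacant {
    const INIT: bool = false;
}

/// Pre-allocated storage that knows nothing about which slots are initialised.
pub struct Store {
    user_name: MaybeUninit<UserName>,
}

impl Store {
    pub fn new() -> Self {
        Store { user_name: MaybeUninit::uninit() }
    }

    /// A fresh handler always starts with the slot vacant.
    pub fn handler(&mut self) -> Handler<'_, Vacant> {
        Handler { store: self, state: PhantomData }
    }
}

/// Pointer-sized: a reference to the storage plus a zero-sized state marker.
pub struct Handler<'a, S: SlotState> {
    store: &'a mut Store,
    state: PhantomData<S>,
}

impl<'a, S: SlotState> Handler<'a, S> {
    /// Write the field in place; only the state type changes, the storage never moves.
    pub fn set_user_name(self, value: UserName) -> Handler<'a, Occupied> {
        // Suppress our own destructor so the reference can move into the new handler.
        let this = ManuallyDrop::new(self);
        // SAFETY: `this` is never touched again, so the `&mut` is moved out exactly once.
        let store: &'a mut Store = unsafe { std::ptr::read(&this.store) };
        if S::INIT {
            // SAFETY: an occupied state guarantees the slot holds a live value.
            unsafe { store.user_name.assume_init_drop() };
        }
        store.user_name.write(value);
        Handler { store, state: PhantomData }
    }
}

impl<'a> Handler<'a, Occupied> {
    pub fn user_name(&self) -> &UserName {
        // SAFETY: an occupied state guarantees the slot was initialised.
        unsafe { self.store.user_name.assume_init_ref() }
    }
}

impl<'a, S: SlotState> Drop for Handler<'a, S> {
    fn drop(&mut self) {
        if S::INIT {
            // SAFETY: an occupied state guarantees the slot holds a live value.
            unsafe { self.store.user_name.assume_init_drop() };
        }
    }
}

fn main() {
    let mut store = Store::new();
    let cx = store.handler();
    // cx.user_name(); // would not compile: the slot is still `Vacant`
    let cx = cx.set_user_name(UserName("alice".to_string()));
    println!("{}", cx.user_name().0);
}
```

Because the storage itself never tracks initialisation, the Drop implementation on the handler is what releases any value still held when the handler goes out of scope.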
Advantages and Issues
Compared to previous designs, this design offers the following advantages:
- The storage structure is pre-allocated on the stack, which allows for quick access while avoiding stack copy overhead (compared to the TypeMap approach, it also saves on heap expenses), and is more friendly to CPU caches.
- The passed structure is a reference, so the passing cost is very low. This avoids unnecessary stack copy overhead.
- When the state changes, only the state needs to be modified, while the storage remains unmoved, thus avoiding stack copying overhead.
However, this design also introduces new problems:
- Users need to be aware of the Store structure and the Handler structure.
- Lifetime management becomes more complex (the problems and solutions encountered will be discussed later).
- Cloning requires a special implementation.
The first issue is not a major problem; users only need to create the storage and generate its Handler, after which they can use the Handler just like they previously used the Context structure.
We will now proceed to discuss the last two issues in more detail.
Lifetime and Constraint Definitions
Since service abstraction typically involves nested calling, in theory, placing the Store structure on the stack of the outer layer service and passing its Handler to the inner layer service should pose no issues.
However, in practice, because service implementations are not bound to a concrete Context type, trait-based operations need to be mindful of lifetimes.
In this example, we implemented a few simple services and combined them:
1 | struct Add1<T>(T); |
To verify its correctness:
1 | let svc = Add1(Mul2(Identical)); |
When invoking, it is necessary to create the Store structure, and after generating the Handler, pass it as part of the Request using CX.
However, I am not satisfied with just this; the automatic injection of Context should itself be implemented as a Service!
1 | struct CXSvc<CXStore, T> { |
HRTB is your friend!
In this case, the Handler structure needs to specify the lifetime, while there is no lifetime defined on the structure and the Service itself. At this point, it is necessary to use HRTB (Higher-Ranked Trait Bounds) to constrain the inner service (type T) to accept Handlers with any lifetime.
Does everything look normal? Not quite!
cannot return value referencing local variable `store`
returns a value referencing data owned by the current function
The reason is that the compiler cannot prove that the returned Response or Error does not contain a reference to the store. If it does contain such a reference, since the store is created inside the function, its reference will become invalid after the function ends, and therefore cannot be returned.
To solve this problem, we need to employ some tricks:
1 | impl<CXStore, T, R, RESP, ERR> Service<R> for CXSvc<CXStore, T> |
This implementation allows for correct compilation. But why is that?
By making the Response and Error generic parameters, we decouple their types from the lifetime of the Handler. This means that for any lifetime of the Handler, the same Response and Error are returned, allowing the compiler to prove that the Response and Error do not contain references to the store.
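The trick can be reproduced in a small, synchronous setting. The sketch below is an illustration under simplifying assumptions: the Service trait is synchronous, and Store, Handler, and CountingAdd are hypothetical stand-ins for the generated Store/Handler structures and an inner service.

```rust
/// Simplified, synchronous stand-in for an async `Service` trait.
trait Service<Request> {
    type Response;
    type Error;
    fn call(&self, req: Request) -> Result<Self::Response, Self::Error>;
}

/// Hypothetical context store and the handler that borrows it.
#[derive(Default)]
struct Store {
    hits: u32,
}
struct Handler<'a> {
    store: &'a mut Store,
}

/// Inner service: accepts a handler of any lifetime next to the request.
struct CountingAdd;

impl<'a> Service<(u32, Handler<'a>)> for CountingAdd {
    type Response = u32;
    type Error = std::convert::Infallible;

    fn call(&self, (n, cx): (u32, Handler<'a>)) -> Result<u32, Self::Error> {
        cx.store.hits += 1;
        Ok(n + cx.store.hits)
    }
}

/// Creates the store locally and injects its handler into every request.
struct CXSvc<T> {
    inner: T,
}

impl<T, R, RESP, ERR> Service<R> for CXSvc<T>
where
    // `RESP` and `ERR` are fixed once, outside the `for<'a>`,
    // so neither of them can mention the handler's lifetime.
    T: for<'a> Service<(R, Handler<'a>), Response = RESP, Error = ERR>,
{
    type Response = RESP;
    type Error = ERR;

    fn call(&self, req: R) -> Result<RESP, ERR> {
        let mut store = Store::default();
        let cx = Handler { store: &mut store };
        // The result cannot borrow from `store`, so returning it is accepted.
        self.inner.call((req, cx))
    }
}

fn main() {
    let svc = CXSvc { inner: CountingAdd };
    println!("{:?}", svc.call(41u32));
}
```

The key point is that RESP and ERR are ordinary type parameters of the impl, so the same two types must work for every lifetime the handler might have.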
Now, we can verify that the current implementation meets the expectations:
1 | let svc = CXSvc::<MyCertainMap, _>::new(svc); |
Implementing Fork
In version 0.2, we could directly derive Clone to implement cloning. However, in version 0.3, with storage and state separated, we need to implement a special trait to convey the semantics of Clone, which we refer to as Fork.
I have divided the Clone semantics into the following two behaviors: one is using the Handler to copy the Store and State (the Handler must be used, otherwise the existence of fields in State is unknown), and the other is constructing the Handler using the Store and State:
1 | pub trait Fork { |
Building on the previous example, we can implement a new Service to test Fork. This Service will constrain Req: Copy (actual calls use numbers) and Resp: Add<Output = Resp> (actual calls also use numbers), and implement it by invoking the inner service twice with the Req, and then adding the results:
1 | struct DupSvc<T>(T); |
Just writing this Svc definition might seem to compile, but will it perform as expected? No, it will not. During invocation, the compiler will complain about unsatisfied constraints!
We need to carefully understand how to properly use HRTB. We need to modify the code as follows:
1 | impl<T, R, CXIn, CXStore, CXState, Resp, Err> Service<(R, CXIn)> for DupSvc<T> |
The difference lies here:
1 | // fail |
In the erroneous case, we treat HDR as a generic parameter. T: Service<(R, HDR)> effectively poses an existential constraint (intersection), but for parameters, we should use HRTB to constrain their lifetimes, that is, to constrain over all possible Hdr<'a> (union).
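The same distinction can be shown with plain closures instead of Services. This is a reduced analogy, not the DupSvc code itself: Hdr, run_generic, and run_hrtb are hypothetical names.

```rust
/// Hypothetical handler that borrows some context for the lifetime `'a`.
struct Hdr<'a>(&'a mut u32);

// fail: `H` is one fixed type chosen by the caller, so it can never be
// a handler that borrows a local created inside the function.
//
// fn run_generic<F, H>(f: F) -> u32
// where
//     F: Fn(H) -> u32,
// {
//     let mut local = 1;
//     f(Hdr(&mut local)) // rejected: `Hdr<'_>` is not the caller's `H`
// }

// pass: `F` must accept `Hdr<'a>` for every lifetime `'a`, including the
// short one that only exists inside this function body.
fn run_hrtb<F>(f: F) -> u32
where
    F: for<'a> Fn(Hdr<'a>) -> u32,
{
    let mut local = 1;
    f(Hdr(&mut local))
}

fn main() {
    let out = run_hrtb(|hdr| {
        *hdr.0 += 1;
        *hdr.0
    });
    println!("{}", out);
}
```

In the generic version the caller picks H before the function runs, whereas the higher-ranked bound lets the function body pick the lifetime after it has created the value to borrow from.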
After correctly using HRTB, we can compile and execute this Svc correctly:
1 | let svc = CXSvc::<MyCertainMap, _>::new(DupSvc(Add1(Mul2(Identical)))); |
Summary
In this article, I introduced how to design a reliable context passing scheme and provided two implementations along with their design rationale.
In version 0.3 of certain-map, I separated storage and state by passing them through a Handler structure, avoiding stack copy overhead and enhancing performance. However, this increased the complexity of lifetime management, so I presented the correct way to use HRTB to solve this issue.
Finally, I welcome everyone to use this component in your projects and invite suggestions and improvements.