GhostCell

GhostCell 是另一个内部可变性的实现。相比 RefCell,它实现的是编译期的检查。

问题

Rust 中面临所谓 AXM 的问题。A 指的是 Aliasing,M 指的是 Mutability。AXM 表示 A Xor M,也就是这两个只能保证其一。

这让 Rust 容易实现树状结构,而难以实现图状结构(比如双向链表、图等)。其原因是图状结构中有环,环意味着 internal sharing。

对此,Rust 一般提供两种做法:

  1. raw 指针
    具有下面的特点:
    • Unsafe,也就是 Rust 的类型系统无法进行约束
    • 没有 AXM 限制
  2. Interior Mutability
    • 非线程安全的有 Cell 和 RefCell,它们允许通过一个不可变借用 &T 去修改内部的对象
    • 线程安全的有 Mutex 和 RwLock

比如双向链表就可以写成

1
2
3
4
5
6
7
struct Node<T> {
data: T,
prev: Option<NodeRef<T>>,
next: Option<NodeRef<T>>,
}

type NodeRef<T>= &RwLock<Node<T>>;

作者就说这样的话一个 Node 会有一个 RwLock。如果我们希望链表可以被多个线程同时修改,这是有用的,但如果只是希望在整个链表上实现 AXM,就有点 overkill 了。

当然,我觉得始终可以去把整个链表外面包一层来解决 AXM 的问题的吧。

介绍

作者宣称 GhostCell 有如下的特性:

  1. 支持 internal sharing
  2. zero cost abstraction
  3. safe,通过 Coq 证明
  4. flexible,是 thread-safe 的,对 type T 没有要求

原理

作者说 Rust 将 permission 和 data 绑定了。所以这导致了 AXM 在一个很细的粒度上,比如链表中的每个节点。

因此,GhostCell 将 permission 和 data 分开来。让一个 permission 可以和一整个data 关联,比如整个链表。

这个想法称为 Brand Types,来自于 ST monad 的启发。

GhostCell 表示一个大结构里面的某个小元素。这个大结构有一个 brand 为 'id
GhostToken 是一个 Permission,可以用来访问任意的标有 'id 的 GhostCell。

以链表为例,进行如下的修改即可使用 GhostCell。

使用

如下所示,创建一个 vec,其中元素都为 cell 的引用。我修改 vec[n / 2] 的值,然后再读取 cell,会发现 cell 的值也被修改了。这里的逻辑没问题,但是 GhostToken 能够避免 borrow checker 的检查报错。

1
2
3
4
5
6
7
8
9
10
11
12
use ghost_cell::{GhostToken, GhostCell};

fn demo(n: usize) {
let value = GhostToken::new(|mut token| {
let cell = GhostCell::new(42);
let vec: Vec<_> = (0..n).map(|_| &cell).collect();
*vec[n / 2].borrow_mut(&mut token) = 33;
*cell.borrow(&token)
});

assert_eq!(value, 33);
}

实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
type InvariantLifetime<'brand> = PhantomData<fn(&'brand ()) -> &'brand ()>;

/// A `GhostToken<'x>` is _the_ key to access the content of any `&GhostCell<'x, _>` sharing the same brand.
///
/// Each `GhostToken<'x>` is created alongside a unique brand (its lifetime), and each `GhostCell<'x, T>` is associated
/// to one, and only one, `GhostToken` at a time via this brand. The entire set of `GhostCell<'x, T>` associated to a
/// given `GhostToken<'x>` creates a pool of cells all being accessible solely through the one token they are associated
/// to.
///
/// The pool of `GhostCell` associated to a token need not be homogeneous, each may own a value of a different type.
pub struct GhostToken<'brand> {
_marker: InvariantLifetime<'brand>,
}

impl<'brand> GhostToken<'brand> {
/// Creates a fresh token to which `GhostCell`s can be tied to later.
///
/// Due to the use of a lifetime, the `GhostCell`s tied to a given token can only live within the confines of the
/// invocation of the `fun` closure.

#[allow(clippy::new_ret_no_self)]
pub fn new<R, F>(fun: F) -> R
where
for<'new_brand> F: FnOnce(GhostToken<'new_brand>) -> R,
{
let token = Self {
_marker: InvariantLifetime::default(),
};
fun(token)
}
}

impl<'brand, T: ?Sized> GhostCell<'brand, T> {
pub fn borrow<'a>(&'a self, _: &'a GhostToken<'brand>) -> &'a T {
// Safety:
// - The cell is borrowed immutably by this call, it therefore cannot already be borrowed mutably.
// - The token is borrowed immutably by this call, it therefore cannot be already borrowed mutably.
// - `self.value` therefore cannot be already borrowed mutably, as doing so requires calling either:
// - `borrow_mut`, which would borrow the token mutably.
// - `get_mut`, which would borrow the cell mutably.
unsafe { &*self.value.get() }
}

pub fn borrow_mut<'a>(&'a self, _: &'a mut GhostToken<'brand>) -> &'a mut T {
// Safety:
// - The cell is borrowed immutably by this call, it therefore cannot already be borrowed mutably.
// - The token is borrowed mutably by this call, it therefore cannot be already borrowed.
// - `self.value` therefore cannot already be borrowed, as doing so requires calling either:
// - `borrow` or `borrow_mut`, which would borrow the token.
// - `get_mut`, which would borrow the cell mutably.
unsafe { &mut *self.value.get() }
}

pub fn from_mut(t: &mut T) -> &mut Self {
// Safety:
// - `t` is mutably borrowed for the duration.
// - `GhostCell<'_, T>` has the same in-memory representation as `T`.
unsafe { mem::transmute(t) }
}
}

Reference

  1. https://www.bilibili.com/video/BV1HP4y1s762/