Understanding Smart Pointers in Rust: A Comprehensive Guide

Ebina Perelyn Okoh
8 min read · Jun 5, 2023



Introduction

Before we look at smart pointers, let’s establish what a pointer is. In programming, a pointer is a piece of data that holds the location of another piece of data, much like your home address tells people where you live. Smart pointers also point to the location of a piece of data, but they come with additional capabilities, such as allowing a value to have multiple owners, interior mutability, and more.

P.S. A reference in Rust, written with &, can also be regarded as a pointer, since it holds the address of the data it borrows.

In this article, we’ll look at common smart pointers in Rust and how we can use them. Some of the common smart pointers in Rust include Box, Rc, Arc, RefCell, and Mutex.

Box

The Box smart pointer is used to allocate data on the heap. With Box, you can place an i32, which would normally live on the stack, on the heap instead. This is helpful when you have large data that you don’t want to store on the stack, since the stack has a limited size.

Here’s how you use the Box smart pointer:

// Our i32 will be allocated on the heap instead of the stack 
let heap_allocated_i32 = Box::new(1);

Here’s how you can store large data with Box:

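struct LargeData {
    data: [i32; 1000000], // An array with 1 million elements
}

fn main() {
    // The Box owns the array, which lives on the heap.
    // Note: the value may still be built on the stack first and then moved,
    // particularly in unoptimized builds.
    let boxed_data = Box::new(LargeData {
        data: [0; 1000000], // Initialize the array with zeros
    });
    println!("First element: {}", boxed_data.data[0]);
}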

Normally, a value of type [i32; 1000000] would be stored on the stack. At 4 bytes per element that is about 4 MB, which takes up a large share of a typical thread’s stack and can even overflow it, so we store it on the heap instead.
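To make the difference concrete, here is a small sketch that compares the size of the value itself with the size of the Box that points to it. It reuses the LargeData struct from above; the exact pointer size depends on the target, so treat the 8 bytes mentioned in the comment as an assumption for 64-bit platforms.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct LargeData {
    data: [i32; 1000000],
}

fn main() {
    // The array itself: 1,000,000 elements of 4 bytes each.
    println!("LargeData:      {} bytes", size_of::<LargeData>());
    // The Box is only a pointer (8 bytes on a 64-bit target).
    println!("Box<LargeData>: {} bytes", size_of::<Box<LargeData>>());
}
```

Only the small Box value has to live on the stack; the 4 MB array sits behind it on the heap.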

Another reason to use a Box is to create recursive data structures like binary trees and linked lists. A recursive type refers to itself, so if it contained itself directly the compiler could not compute its size at compile time. Box solves this by storing a fixed-size pointer in the parent value, while the data it points to resides on the heap. This lets us define recursive structures whose total size is not known in advance.

Here’s a simple implementation of a binary tree with Box:

#[derive(Debug)]
struct TreeNode<T> {
    value: T,
    left: Option<Box<TreeNode<T>>>,
    right: Option<Box<TreeNode<T>>>,
}

impl<T> TreeNode<T> {
    fn new(value: T) -> TreeNode<T> {
        TreeNode {
            value,
            left: None,
            right: None,
        }
    }
}

fn main() {
    let left_leaf = TreeNode::new("left leaf");
    let right_leaf = TreeNode::new("right leaf");

    let root = TreeNode {
        value: "root",
        left: Some(Box::new(left_leaf)),
        right: Some(Box::new(right_leaf)),
    };

    println!("{:#?}", root);
}
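Since linked lists were mentioned as well, here is a minimal sketch of one built with Box. The List enum and the sum function are names made up for this illustration; they are not part of the standard library.

```rust
// A singly linked list: each node owns the rest of the list through a Box.
#[derive(Debug)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walk the list recursively and add up the values.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("{:?}", list);
    println!("sum = {}", sum(&list));
}
```

Without the Box, Cons would have to contain a whole List inline, and the compiler would reject the type because its size would be infinite.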

Another important difference between the Box smart pointer and a regular pointer is that a Box is an owning pointer: when the Box is dropped, the T it contains is dropped too, and its heap memory is freed.
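We can watch this happen with a small sketch. Tracked is a hypothetical helper type for this example that bumps a counter when it is dropped; the Cell is only there so the counter can be updated through a shared reference.

```rust
use std::cell::Cell;

// Hypothetical helper type: increments a shared counter when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns how many Tracked values were dropped once the Box left scope.
fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let _boxed = Box::new(Tracked { drops: &drops });
        // The Box is still alive here, so nothing has been dropped yet.
        assert_eq!(drops.get(), 0);
    } // `_boxed` goes out of scope: the Box drops the Tracked it owns.
    drops.get()
}

fn main() {
    println!("values dropped: {}", count_drops());
}
```

We never call drop ourselves; the Box cleaning up after itself is what runs the Drop implementation of the value inside it.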

Rc (Reference Counting)

Rust’s ownership rules say that every value has exactly one owner, but with the Rc smart pointer we can relax that rule. As the name implies, the reference counting (Rc) smart pointer keeps count of how many owners share the data it wraps, and the data is deallocated when that count drops to zero.

Here’s an example:

// This is how we bring Rc into scope
use std::rc::Rc;

#[derive(Debug)]
struct Person {
    name: String,
    age: u32,
}

fn main() {
    let person1 = Rc::new(Person {
        name: "Alice".to_string(),
        age: 25,
    });

    // Clone the Rc pointer to create additional references
    let person2 = Rc::clone(&person1);
    let person3 = Rc::clone(&person1);

    println!("Name: {}, Age: {}", person1.name, person1.age);
    println!("Name: {}, Age: {}", person2.name, person2.age);
    println!("Name: {}, Age: {}", person3.name, person3.age);
    println!("Reference Count: {}", Rc::strong_count(&person1)); // Output: "Reference Count: 3"
}

It is worth noting that Rc::clone does not clone the data it wraps. Instead, it creates another Rc that points to the same data on the heap and increments the reference count.
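To see the count go up and back down, here is a short sketch. The observe_counts function is a made-up name for this example; Rc::strong_count and Rc::ptr_eq are the standard library calls doing the work.

```rust
use std::rc::Rc;

// Returns the strong count before, during, and after a clone is alive.
fn observe_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let before = Rc::strong_count(&first);

    let during = {
        let second = Rc::clone(&first);
        // Both handles point to the same heap allocation; the String was not copied.
        assert!(Rc::ptr_eq(&first, &second));
        Rc::strong_count(&first)
    }; // `second` is dropped here, which decrements the count.

    let after = Rc::strong_count(&first);
    (before, during, after)
}

fn main() {
    let (before, during, after) = observe_counts();
    println!("before: {before}, during: {during}, after: {after}");
    // Output: "before: 1, during: 2, after: 1"
}
```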

Arc (Atomic Reference Counting)

The Arc smart pointer is just like the Rc smart pointer with one bonus: it is thread-safe. This means that Arc lets us give multiple variables ownership of a piece of data and access it from multiple threads. Let’s try our previous Rc example with multiple threads and see what happens:

use std::thread;
use std::rc::Rc;

struct Person {
    name: String,
    age: u32,
}

fn main() {
    let person = Rc::new(Person {
        name: "Alice".to_string(),
        age: 25,
    });

    let person_clone1 = Rc::clone(&person);
    let person_clone2 = Rc::clone(&person);

    let thread1 = thread::spawn(move || {
        println!("Thread 1: Name={}, Age={}", person_clone1.name, person_clone1.age);
        // Simulate some work being done in thread 1
        thread::sleep(std::time::Duration::from_millis(1000));
    });

    let thread2 = thread::spawn(move || {
        println!("Thread 2: Name={}, Age={}", person_clone2.name, person_clone2.age);
        // Simulate some work being done in thread 2
        thread::sleep(std::time::Duration::from_millis(1000));
    });

    thread1.join().unwrap();
    thread2.join().unwrap();

    println!("Reference Count: {}", Rc::strong_count(&person));
}

When we run this code, we’ll end up with this error:

error[E0277]: `Rc<Person>` cannot be sent between threads safely
--> src/main.rs:18:33

We get this error because Rc is not thread-safe: it updates its reference count without atomic operations, so Rust does not allow it to be sent to another thread.

If we wanted to be able to share data among multiple threads with the Arc smart pointer, here’s how we would do it:

use std::sync::Arc;
use std::thread;
use std::time::Duration;

struct Person {
    name: String,
    age: u32,
}

fn main() {
    let person = Arc::new(Person {
        name: "Alice".to_string(),
        age: 25,
    });

    let person_clone1 = Arc::clone(&person);
    let person_clone2 = Arc::clone(&person);

    let thread1 = thread::spawn(move || {
        println!("Thread 1: Name={}, Age={}", person_clone1.name, person_clone1.age);
        // Simulate some work being done in thread 1
        thread::sleep(Duration::from_millis(1000));
    });

    let thread2 = thread::spawn(move || {
        println!("Thread 2: Name={}, Age={}", person_clone2.name, person_clone2.age);
        // Simulate some work being done in thread 2
        thread::sleep(Duration::from_millis(1000));
    });

    thread1.join().unwrap();
    thread2.join().unwrap();

    // Prints 1: both clones were moved into the threads and dropped when
    // those threads finished, so `person` is the only owner left
    println!("Reference Count: {}", Arc::strong_count(&person));
}

Although Arc is the right choice for multi-threaded code, its atomic reference counting makes it slightly slower than Rc, so prefer Rc when the data stays on a single thread.

RefCell (Reference Cell)

We saw how Rc and Arc let us relax the ownership rules by giving a value multiple owners, in single-threaded and multi-threaded code respectively. With the RefCell smart pointer, we can bend the borrowing rules by mutating data through an immutable reference. This pattern is referred to as interior mutability in Rust.

One of the borrowing rules in Rust is that you cannot take a mutable reference to an immutable value, so when we try something like this:

fn main(){
    let a : i32 = 14;
    *&mut a += 1;

    println!("{}", a);
}

We would get an error like this:

error[E0596]: cannot borrow `a` as mutable, as it is not declared as mutable
 --> src/main.rs:3:3
  |
3 |     *&mut a += 1;
  |      ^^^^^^ cannot borrow as mutable
  |
help: consider changing this to be mutable
  |
2 |     let mut a : i32 = 14;
  |         +++

Of course, we could just do what the compiler says and make a mutable by writing let mut a: i32, but what if we can’t? Then we would use the RefCell smart pointer like so:

use std::cell::RefCell;

fn main() {
    let a: RefCell<i32> = RefCell::new(14);
    *a.borrow_mut() += 1;

    println!("{}", *a.borrow());
}

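Note: You can think of the .borrow() and .borrow_mut() methods as & and &mut respectively for the RefCell smart pointer. The difference is that RefCell enforces the borrowing rules at runtime instead of at compile time, so breaking them (for example, calling .borrow_mut() while another borrow is still active) causes a panic.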
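Here is a small sketch of that runtime check in action. It uses try_borrow_mut, the non-panicking sibling of borrow_mut in the standard library, so we can see a borrow being refused without crashing the program. The can_mutate_while_borrowed function is a name invented for this example.

```rust
use std::cell::RefCell;

// Reports whether a mutable borrow succeeds while an immutable one is alive.
fn can_mutate_while_borrowed(cell: &RefCell<i32>) -> bool {
    let _reader = cell.borrow();
    // `_reader` is still active, so RefCell refuses the mutable borrow.
    let allowed = cell.try_borrow_mut().is_ok();
    allowed
}

fn main() {
    let cell = RefCell::new(14);

    println!("mutable borrow allowed? {}", can_mutate_while_borrowed(&cell));

    // No borrows are active anymore, so mutation works as usual.
    *cell.borrow_mut() += 1;
    println!("value is now {}", cell.borrow());
}
```

With borrow_mut in place of try_borrow_mut, the refused borrow would panic instead of returning an error.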

Let’s look at a simple real-life scenario where we would need RefCell and the interior mutability pattern. Imagine you want to implement a trait for a data type, and one of the trait’s methods takes an immutable reference, &self, but you need to mutate the value inside the method. Rust’s borrowing rules would normally prevent this, but thankfully we have RefCell in our toolkit.

For example, say this is the trait we wanted to implement for our data type:

trait Counter {
    fn increment(&self);
    fn get(&self) -> i32;
}

Here’s our data type called Count:

use std::cell::RefCell;

struct Count {
    value: RefCell<i32>,
}

impl Count {
    fn new() -> Self {
        Count {
            value: RefCell::new(0),
        }
    }
}

Here’s how we would implement the method for our Count type:

impl Counter for Count {
    fn increment(&self) {
        // borrow the inner value mutably, even though `self` is immutable
        let mut value = self.value.borrow_mut();
        // then mutate the value through the mutable borrow
        *value += 1;
    }

    fn get(&self) -> i32 {
        *self.value.borrow()
    }
}

Our entire code including a main function for testing should look like this:

use std::cell::RefCell;

trait Counter {
    fn increment(&self);
    fn get(&self) -> i32;
}

struct Count {
    value: RefCell<i32>,
}

impl Count {
    fn new() -> Self {
        Count {
            value: RefCell::new(0),
        }
    }
}

impl Counter for Count {
    fn increment(&self) {
        // borrow the inner value mutably, even though `self` is immutable
        let mut value = self.value.borrow_mut();
        // then mutate the value through the mutable borrow
        *value += 1;
    }

    fn get(&self) -> i32 {
        *self.value.borrow()
    }
}

fn main() {
    let count = Count::new();
    count.increment();
    count.increment();
    println!("Count: {}", count.get()); // Output will be "Count: 2"
}

Mutex (Mutual Exclusion)

The Mutex smart pointer is helpful when we want to mutate shared data from multiple threads safely. As the full name “mutual exclusion” implies, a thread locks the value while mutating it, and the lock is held until the guard returned by lock() goes out of scope. While one thread holds the lock, other threads cannot access the value. Let’s look at an example:

use std::sync::Mutex;

fn main() {
    // wrap an integer in a Mutex
    let value = Mutex::new(0);

    // lock `value` and keep the guard in this variable
    let mut value_changer = value.lock().unwrap();

    // dereference the guard, then increment the wrapped integer by 1
    *value_changer += 1;

    println!("{}", value_changer); // Output: 1
}

In this example, we wrap an integer in a Mutex and assign it to a variable called value. We then lock the Mutex, bind the resulting guard to value_changer, and increment the integer through it on the next line. While value_changer holds the lock, nothing else can mutate or even read the data. Take for example:

use std::sync::Mutex;

fn main() {
    // same code
    let value = Mutex::new(0);
    let mut value_changer = value.lock().unwrap();
    *value_changer += 1;

    // Look here!
    println!("{:?}", &value); // Mutex { data: <locked>, poisoned: false, .. }
}

When we try to output value we get Mutex { data: <locked>, poisoned: false, .. }. Notice how the data says locked; that’s due to the lock() method we called on value earlier when creating value_changer. To “unlock” value, we need to wait for value_changer to go out of scope or drop it explicitly with std::mem::drop:

use std::sync::Mutex;

fn main() {
    // same code
    let value = Mutex::new(0);
    let mut value_changer = value.lock().unwrap();
    *value_changer += 1;

    // This is the same as a variable going out of scope
    std::mem::drop(value_changer);

    // Look here!
    println!("{:?}", value); // Mutex { data: 1, poisoned: false, .. }
}

Notice how once value_changer is dropped, the Mutex is unlocked and value can be read again. The same thing happens when a guard goes out of scope at the end of a thread: while a thread holds the lock, only that thread can access and mutate the value. We do this to protect data that is used in multiple threads, to prevent race conditions and make sure our concurrent program is thread-safe.
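The examples so far ran on a single thread, so here is a sketch of a Mutex actually being shared between threads. It combines Mutex with Arc from earlier so that every thread can own a handle to the same counter. The parallel_count function and its parameters are made up for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` workers that each add 1 to a shared counter `per_thread` times.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        // Each thread gets its own Arc handle to the same Mutex.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // The temporary guard is dropped at the end of this statement,
                // which unlocks the Mutex for the other threads.
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total: {}", parallel_count(4, 1000)); // Output: "total: 4000"
}
```

Because every increment happens while the lock is held, no updates are lost, and the total is the same on every run.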

Conclusion

In this article, we have looked at smart pointers in Rust, what they are, and how we can use them to our advantage. We have looked at some common smart pointers in Rust, including Box, Rc, Arc, RefCell, and Mutex. We have also seen how we can use smart pointers to allocate data directly to the heap, create recursive data structures like binary trees and linked lists, mess around with ownership rules, and implement the interior mutability pattern in Rust. Finally, we have looked at how we can use the Mutex smart pointer to protect data when it is being used in multiple threads and prevent race conditions to make our concurrent programs thread-safe.
