Today I started doing something new. Over the next couple of weeks, I will try to implement a bunch of algorithms. And today I'm kicking it off with the simplest algorithm that I could think of: bubble sort.
The Why
I've been thinking about my programming skills and in the last 1.5 years working as a freelancer I rarely had the opportunity to learn something new on the purely technical side of things. Of course, I learned a ton when it comes to managing projects, my own time, and especially mental health, but I originally studied computer science because I just love nerding and learning about cool algorithms and elegant solutions to hard problems. When developing web applications, much of my daily time is spent arranging HTML/CSS and pushing JSON to the server and back out.
So to get my daily fix of nerdage, that's what I'm gonna do: Implement one algorithm a day.
The how
For now, I've decided to go with rust to practice wrangling with the borrow checker. Especially later when it comes to tree and graph data structures.
The algorithms I'm gonna go for are the list of sorting algorithms on Wikipedia.
Bubble Sort
This is mostly just the warm-up to get things started. On Wikipedia, there is a pseudocode example that can be translated one to one.
Here is the final implementation I came up with:
pub fn bubble_sort<T: PartialOrd>(elements: &mut [T]) {
if elements.len() <= 1 {
return;
}
let mut swapped = true;
let mut n = elements.len() - 1;
while swapped {
swapped = false;
for i in 0..n {
if elements[i] > elements[i + 1] {
// Split the slice mutably to ensure that the borrow checker swaps the elements accordingly.
let (left, right) = elements.split_at_mut(i + 1);
std::mem::swap(&mut left[i], &mut right[0]);
swapped = true;
}
}
n -= 1;
}
}
It's pretty straightforward. If there are 1 or 0 elements, we return to avoid a usize underflow later. Afterward, we swap elements to the back until the list is sorted.
It is clear that the list is sorted when we didn't perform any swaps. Simple stuff.
Testing
One of the many beautiful language features is the integration of unit tests directly alongside the code. Let's start with the trivial test cases:
#[cfg(test)]
mod tests {
use crate::sorting::test_helpers::{is_stably_sorted, random_comparable_list};
use super::bubble_sort;
// the following tests are just to check for crashes or endless loops
#[test]
fn it_sorts_empty_slices() {
let mut data: Vec<i32> = Vec::new();
bubble_sort(data.as_mut_slice());
}
#[test]
fn it_sorts_one_element_slices() {
let mut data = [1];
bubble_sort(&mut data);
}
#[test]
fn it_sorts_element_slices_of_same_elements() {
let mut data = [1, 1, 1, 1, 1, 1, 1];
bubble_sort(&mut data);
}
// ...
}
These are very simple test cases to cover the case of the list being empty, and containing just one element. Even though they seem trivial, my original implementation did have a bug:
pub fn bubble_sort<T: PartialOrd>(elements: &mut [T]) {
if elements.is_empty() {
return;
}
// ...
When the list had exactly one element, there was an underflow at the end of the loop:
let mut swapped = true;
let mut n = elements.len() - 1;
while swapped {
swapped = false;
for i in 0..n {
if elements[i] > elements[i + 1] {
let (left, right) = elements.split_at_mut(i + 1);
std::mem::swap(&mut left[i], &mut right[0]);
swapped = true;
}
}
n -= 1; // this underflows because n is already zero when list length is 1
}
In C++ this bug would have been "harmless" in this case because n would not have been used afterward in any case. But since rust checks for underflows when compiling in debug mode, it nonetheless panicked. A very reasonable behavior if you ask me.
More complex cases
Since bubble sort is a stable search algorithm, I wanted to make sure that it sorts stably. To do so, I added a new module with some test helpers:
// sorting/test_helpers.rs
pub struct StableSortComparableItem {
value: i64,
initial_index: usize,
}
impl PartialEq for StableSortComparableItem {
fn eq(&self, other: &Self) -> bool {
self.value == other.value
}
}
impl PartialOrd for StableSortComparableItem {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
match self.value.partial_cmp(&other.value) {
Some(core::cmp::Ordering::Equal) => return Some(core::cmp::Ordering::Equal),
ord => return ord,
}
}
}
/// Generates a random list of StableSortComparableItems with the correct initial index
pub fn random_comparable_list(len: usize, min: i64, max: i64) -> Vec<StableSortComparableItem> {
let mut result = Vec::new();
result.reserve(len);
let mut generator = rand::thread_rng();
for i in 0..len {
result.push(StableSortComparableItem {
value: generator.gen_range(min..=max),
initial_index: i,
});
}
result
}
While sorting, only the contained integer value is compared, while the index in which the item was initially created is preserved.
Later, after the whole thing is sorted we check the array and assert, that if two items contain the same value, the initial index of the left one is less than the initial index of the right element.
The test case is pretty simple:
#[test]
fn it_sorts_element_slices_of_different_elements() {
// generate 25 elements between 0 and 10 inclusive to make sure that at least some of them are the same
let mut data = random_comparable_list(25, 0, 10);
bubble_sort(&mut data);
assert!(is_stably_sorted(&data));
}
Conclusion
Even though bubble sort is incredibly simple, when it comes to algorithms it is easy to introduce bugs at certain edge cases. Another thing is that I had to resort to splitting the slice into two to swap the elements because of rust's borrow checker.
I found the following unsafe workaround in rusts standard library for the UnsafeCell type, but I'm honestly not sure if there even is a performance benefit and if I can reasonably assume that it's safe for all cases.
let left = &elements[i] as *const T as *mut T;
let right = &elements[i + 1] as *const T as *mut T;
unsafe {
std::mem::swap(&mut *left, &mut *right);
}
Something to find out in the future I guess. For tomorrow I'll start to work down the list of sorting algorithms on Wikipedia.
Feel free to have a look at the repository containing the algorithms here.