+++
title = "Handmade Rust Part 3: Containers"
[taxonomies]
tags = ["Rust", "Programming"]
+++
The most commonly used kinds of containers are arrays and maps. Pretty much any other container type can be built using those two, so that's what we'll build today!
The Rust standard library has `Vec`, which is a dynamic array, and `HashMap` and `BTreeMap`, which are two kinds of maps: the first relies on the key being hashable, the second on the key being sortable.
The hash map pays the cost of hashing the key, but lookup is then constant time (in theory at least), whereas the ordered map only compares keys, but lookup is logarithmic time. I chose to implement a hash map because it's the one I find myself using most of the time, but ordered maps are interesting as well.
Of course, just like for `Unq`, we won't be making simple replacements; instead we'll build the most minimal containers necessary for now and add features later as needed, but we will make them allocator-aware.
As always, full source code is available at [github.com/stevenlr/HandmadeRust](https://github.com/stevenlr/HandmadeRust).
<!-- more -->
`RawArray`: the dumbest container
=================
All containers will somehow need to allocate aggregates of data. We'll be making the `RawArray` type for allocating contiguous regions of memory, resizing them, and then deallocating the memory. This container will not be taking care of initializing memory, calling destructors, or anything like that.
```rust
pub struct RawArray<T, A: Allocator>
{
    pub ptr: *mut T,
    pub capacity: usize,
    alloc: A,
    _phantom: PhantomData<T>,
}
```
Let's analyze this. First we have two type parameters. `T` is the type contained in the array. `A` is an allocator, as defined in the first part of this series. `ptr` is the pointer to the memory region containing our data. `capacity` is the size of that region. Finally, `_phantom` tells the compiler that this struct behaves as if it owns values of type `T`.
When creating a `RawArray`, we simply set the pointer to null and the capacity to 0, unless `T` is zero-sized, in which case we set the capacity to `usize::MAX`. That way the array will never grow, because it never needs to.
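A minimal sketch of what that constructor could look like (the method name and exact shape are assumptions, not the code from the repository):

```rust
use core::{marker::PhantomData, mem::size_of, ptr};

impl<T, A: Allocator> RawArray<T, A>
{
    pub fn new(alloc: A) -> Self
    {
        // A zero-sized T never needs storage, so pretend the capacity is already maximal.
        let capacity = if size_of::<T>() == 0 { usize::MAX } else { 0 };

        Self
        {
            ptr: ptr::null_mut(),
            capacity,
            alloc,
            _phantom: PhantomData,
        }
    }
}
```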
We also define a `reserve` method that takes in a new capacity. If the current capacity is larger than the new capacity, we don't do anything. Otherwise we allocate a new memory region of the desired size, copy the old data over using `ptr::copy_from`, and deallocate the old region. Simple! Note that I don't know how this container behaves with the `Pin` type, and frankly I don't care for now.
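Here is a rough sketch of `reserve`. The `alloc`/`dealloc` calls and the use of `core::alloc::Layout` are assumptions about the part-1 `Allocator` trait, so the real code in the repository differs in the details:

```rust
use core::alloc::Layout;

impl<T, A: Allocator> RawArray<T, A>
{
    pub fn reserve(&mut self, new_capacity: usize)
    {
        // Also covers zero-sized T, whose capacity is already usize::MAX.
        if new_capacity <= self.capacity
        {
            return;
        }

        unsafe
        {
            // Assumed allocator API: alloc takes a Layout and returns a raw *mut u8.
            let new_layout = Layout::array::<T>(new_capacity).unwrap();
            let new_ptr = self.alloc.alloc(new_layout) as *mut T;

            if !self.ptr.is_null()
            {
                // Move the old elements bit-for-bit, then free the old region.
                new_ptr.copy_from(self.ptr, self.capacity);

                let old_layout = Layout::array::<T>(self.capacity).unwrap();
                self.alloc.dealloc(self.ptr as *mut u8, old_layout);
            }

            self.ptr = new_ptr;
            self.capacity = new_capacity;
        }
    }
}
```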
Finally, when dropped, the raw array simply deallocates its memory if its pointer is not null.
`Array`: the less dumb container
==================
Now that we can allocate raw arrays, we can define the proper `Array` type that can be pushed to and popped from.
```rust
pub struct Array<T, A: Allocator>
{
    size: usize,
    buffer: RawArray<T, A>,
}
```
As you can see, `Array` simply wraps a raw array and keeps track of the number of elements actually contained. Unlike `RawArray`, `Array` guarantees that memory is always properly initialized for the first `size` elements of the memory region, and it will drop those elements when needed.
First let's define the `push` operation. It needs to grow the container when its max capacity is reached, and then write the new value to the end of the used region.
```rust
pub fn push(&mut self, value: T)
{
    if self.size == self.buffer.capacity
    {
        self.grow_auto();
    }

    unsafe
    {
        self.buffer.ptr.offset(self.size as isize).write(value);
    }

    self.size += 1;
}
```
`grow_auto` is a simple method that doubles the current capacity (or sets it to 1 if the array was empty) and calls `reserve` on the raw array. Note that when writing the value, we don't need `write_unaligned`, because we made sure that `RawArray` properly aligns its memory region.
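For reference, a minimal sketch of what `grow_auto` might look like (the exact growth policy in the repository may differ):

```rust
fn grow_auto(&mut self)
{
    // Double the capacity, starting from 1 for an empty array.
    let new_capacity = if self.buffer.capacity == 0
    {
        1
    }
    else
    {
        self.buffer.capacity * 2
    };

    self.buffer.reserve(new_capacity);
}
```

`pop` below is the mirror operation of `push`: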
```rust
pub fn pop(&mut self) -> Option<T>
{
    if self.size == 0
    {
        None
    }
    else
    {
        let value = unsafe
        {
            self.buffer.ptr.offset((self.size - 1) as isize).read()
        };

        self.size -= 1;
        Some(value)
    }
}
```
Nothing very exciting here. Unlike C++, where you have to check that the container is non-empty, read the last element, and then `pop_back` to get the same effect, our `pop` both shrinks the array and returns the removed element. We can express all of that in one call simply because we have the `Option` type!
```rust
pub fn clear(&mut self)
{
    if needs_drop::<T>()
    {
        unsafe
        {
            for i in 0..self.size
            {
                drop_in_place(self.buffer.ptr.offset(i as isize));
            }
        }
    }

    self.size = 0;
}
```
When clearing, we need to drop the elements contained in the array. We do this by calling `drop_in_place`, which runs the destructor directly on the element without requiring a copy. This is of course `unsafe`. `Drop` for the array can then be implemented by simply calling `clear`.
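That `Drop` implementation is short enough to sketch here (the allocation itself is freed by `RawArray`'s own `Drop`):

```rust
impl<T, A: Allocator> Drop for Array<T, A>
{
    fn drop(&mut self)
    {
        // Drop the elements; RawArray then takes care of freeing the memory.
        self.clear();
    }
}
```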
Finally, let's talk about `resize`. There are two cases here:
- When resizing to a smaller size, we need to drop all elements above the new size.
- When resizing to a bigger size, we may need to reserve up to the new size, and then write a default value to each new slot. For this, `T` needs to implement `Clone`. However, we don't really want our array to only be usable with cloneable types. Instead we'll provide a resize method that takes a closure generating new `T`s. When `T` is cloneable, that closure can simply be `|| value.clone()`. We could also provide a `resize_default` method that only takes the new size, for when `T` implements `Default`.
```rust
pub fn resize_with<F>(&mut self, new_size: usize, f: F)
    where F: Fn() -> T
{
    if new_size < self.size && needs_drop::<T>()
    {
        for i in new_size..self.size
        {
            unsafe
            {
                drop_in_place(self.buffer.ptr.offset(i as isize));
            }
        }
    }
    else if new_size > self.size
    {
        if new_size > self.buffer.capacity
        {
            self.reserve(new_size);
        }

        for i in self.size..new_size
        {
            unsafe
            {
                self.buffer.ptr.offset(i as isize).write(f());
            }
        }
    }

    self.size = new_size;
}

pub fn resize(&mut self, new_size: usize, value: T)
    where T: Clone
{
    self.resize_with(new_size, || value.clone());
}
```
Now, that's all fine, but we'd like our array to behave like any other contiguous region of memory: we want to index into it (possibly even with ranges), iterate over it, and basically be able to do anything a slice can. There is a very simple way of doing this.
```rust
impl<T, A: Allocator> Deref for Array<T, A>
{
    type Target = [T];

    #[inline]
    fn deref(&self) -> &Self::Target
    {
        unsafe
        {
            slice::from_raw_parts(self.buffer.ptr, self.size)
        }
    }
}
```
Also implement `DerefMut` and voilà! Our array can behave just like a slice, and that was free!
Actually, we may want to implement `Index` directly so we don't pay for creating a slice every time we index. Careful examination of the generated assembly will tell whether that's needed.
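If we ever do, it could look something like this sketch (whether it actually beats going through the slice is exactly what the assembly would have to show):

```rust
use core::ops::Index;

impl<T, A: Allocator> Index<usize> for Array<T, A>
{
    type Output = T;

    #[inline]
    fn index(&self, index: usize) -> &T
    {
        assert!(index < self.size);

        unsafe
        {
            &*self.buffer.ptr.offset(index as isize)
        }
    }
}
```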
`HashMap`s need to `Hash`
========================
Before implementing the hash map, we need to somehow be able to hash some data. The Rust core library defines the `Hash` trait. This trait indicates that a type can be hashed using a `Hasher`. No hasher is provided by default in the core library so let's make one! I chose to implement [SipHash 2-4](https://en.wikipedia.org/wiki/SipHash) because that's what the standard library uses, and it looked simple enough.
I won't bother you with the implementation, just know that the `Hasher` trait requires two methods:
- `write` to run a few rounds of hashing on a slice of bytes
- `finish` to actually finish hashing and return the hash.
It is also a good idea for the hasher to implement `Default` so we can use `BuildHasherDefault`, which implements `BuildHasher`: basically a hasher factory.
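As a sketch, the shape of all this is roughly the following; the actual SipHash state and rounds live in the repository and are omitted here:

```rust
use core::hash::{BuildHasherDefault, Hasher};

#[derive(Default)]
pub struct SipHash
{
    // v0..v3, pending bytes, total length... (omitted)
}

impl Hasher for SipHash
{
    fn write(&mut self, bytes: &[u8])
    {
        // Absorb `bytes` into the state, running compression rounds per 8-byte block.
        unimplemented!()
    }

    fn finish(&self) -> u64
    {
        // Run the finalization rounds and return the 64-bit hash.
        unimplemented!()
    }
}

// The "hasher factory" the hash map will use.
pub type DefaultBuildHasher = BuildHasherDefault<SipHash>;
```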
`HashMap`, finally
====================
Now we have everything we need to build our hash map. I won't go into the details of the implementation, just highlight a few interesting points.
I chose to implement something that resembles [Google's Swiss tables](https://www.youtube.com/watch?v=ncHmEUmJZf4). The basic idea is that a first array stores 7 bits of the hash of each element, with an 8th bit used to tell whether the slot is occupied. When looking up a slot, this is used to quickly find a suitable one without having to iterate through the main array that contains the keys and values. Yay for cache locality. Note that with some SIMD magic, this can be made even faster. In my implementation, this 8-bit hint is called a `ShortBucket`.
```txt
1xxx xxxx = occupied, xxx xxxx = first 7 bits of the key hash.
0000 0001 = free
0000 0010 = deleted, required for linear probing
```
In my implementation, the main array is defined as containing `Option<Bucket>`. `Bucket` itself contains the hash of the key (so we don't need to re-hash it), the key, and the value. This is not really efficient, but I didn't want to have to deal with uninitialized values, and `Option` takes care of that. Note that I could have used `RawArray` instead, but I'd have had more boilerplate code to write. I'll probably revise all of this one day, but for now it works well enough.
```rust
struct Bucket<K, V>
where
    K: Sized + Eq + Hash,
    V: Sized,
{
    hash: u64,
    key: K,
    value: V,
}

struct HashMapInner<K, V, A>
where
    K: Sized + Eq + Hash,
    V: Sized,
    A: Allocator + Clone,
{
    shortb: Array<ShortBucket, A>,
    buckets: Array<Option<Bucket<K, V>>, A>,
}

pub struct HashMap<K, V, A>
where
    K: Sized + Eq + Hash,
    V: Sized,
    A: Allocator + Clone,
{
    alloc: A,
    hash: BuildHasherDefault<SipHash>,
    inner: HashMapInner<K, V, A>,
}
```
The reason for having `HashMapInner` is for resizing: the simplest way of doing that is to create a new `HashMapInner` with the new capacity, then iterate over all elements and swap them into the new container, and finally replace the old `HashMapInner` with the new one.
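A sketch of that swap-based resize; `with_capacity` and `insert_bucket` are assumed helpers on `HashMapInner`, not necessarily the names used in the repository:

```rust
fn grow(&mut self, new_capacity: usize)
{
    let mut new_inner = HashMapInner::with_capacity(new_capacity, self.alloc.clone());

    for slot in self.inner.buckets.iter_mut()
    {
        if let Some(bucket) = slot.take()
        {
            // The bucket already stores its hash, so no re-hashing is needed.
            new_inner.insert_bucket(bucket);
        }
    }

    self.inner = new_inner;
}
```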
For finding a suitable slot, we define the `find_bucket` method. For a given key, it returns either the matching slot or the first slot the key can be inserted into (a simplified sketch of the probing loop follows the list below).
- It first hashes the key to find the start position for linear probing, then iterates over the short buckets array.
- If the first 7 bits match the short bucket, we check the full hash in the buckets array. If that matches too we check the keys for equality. If that matches too, we found the correct bucket, we're done.
- If we find a deleted slot, we mark it as acceptable for insertion if none was marked as such before and then continue.
- If we find a free slot, linear probing is done. If we had an acceptable bucket for insertion, we return it, otherwise we return the current one.
- If we get back to the starting index, we're done and we return the acceptable slot for insertion (which may be none).
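To make that loop concrete, here is a standalone, simplified sketch. It operates on a bare `&[u8]` control array instead of the actual `ShortBucket` type, assumes a non-empty table, and the `probe` function, the `CTRL_*` constants, the choice of hash bits, and the `full_match` closure are all illustrative, not the code from the repository:

```rust
// Control byte values following the layout shown earlier.
const CTRL_FREE: u8 = 0b0000_0001;
const CTRL_DELETED: u8 = 0b0000_0010;

/// Returns Ok(index) for a bucket holding an equal key, Err(Some(index)) for a
/// slot the key could be inserted into, and Err(None) when neither exists.
/// `full_match` stands in for the full-hash and key comparison in the main array.
fn probe(ctrl: &[u8], hash: u64, full_match: impl Fn(usize) -> bool)
    -> Result<usize, Option<usize>>
{
    let capacity = ctrl.len();
    let start = (hash as usize) % capacity;
    let short = 0b1000_0000u8 | (hash >> 57) as u8; // occupied bit + 7 bits of the hash
    let mut insertable = None;

    for i in 0..capacity
    {
        let index = (start + i) % capacity;

        match ctrl[index]
        {
            // A free slot ends linear probing: insert into the first deleted
            // slot we saw, or into this one.
            CTRL_FREE => return Err(Some(insertable.unwrap_or(index))),

            // Remember the first deleted slot as an insertion candidate.
            CTRL_DELETED => insertable = insertable.or(Some(index)),

            // The 7-bit hint matches: confirm with the full hash and the key.
            c if c == short && full_match(index) => return Ok(index),

            _ => {}
        }
    }

    // We wrapped around without finding the key or a free slot.
    Err(insertable)
}
```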
This method alone is enough for lookup, insertion, deletion, etc.
When we want to count the elements in the map, we just need to iterate over the short buckets and count the ones that are occupied.
Currently, resizing only happens when we run out of space. Ideally it should be based on load factor.
There are many things this hash map doesn't do, such as iterating over its elements, but for now that's enough...
`HashSet` is free
=========================
Since `Array` supports zero-sized types, ...
```rust
pub struct HashSet<T, A>
where
    T: Sized + Eq + Hash,
    A: Allocator + Clone,
{
    map: HashMap<T, (), A>,
}
```
... done! (Well you still need to implement a few methods but you know...)
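A couple of those forwarding methods could look like this, assuming the map exposes an `insert` that returns the previously stored value and a `contains_key`-style lookup; the actual method names in the repository may differ:

```rust
impl<T, A> HashSet<T, A>
where
    T: Sized + Eq + Hash,
    A: Allocator + Clone,
{
    pub fn insert(&mut self, value: T) -> bool
    {
        // Returns true if the value was not already present.
        self.map.insert(value, ()).is_none()
    }

    pub fn contains(&self, value: &T) -> bool
    {
        self.map.contains_key(value)
    }
}
```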
Dynamic `String`s
=======================
Dynamic strings just wrap an `Array<u8, A>`. We implement a method for creating them from a `&str`.
```rust
pub fn from_str(s: &str, alloc: A) -> Self
{
    let slice = s.as_bytes();
    let mut buf = Array::new(alloc);
    buf.resize(slice.len(), 0);

    unsafe
    {
        copy_nonoverlapping(s.as_ptr(), buf.as_mut_ptr(), slice.len());
    }

    Self { buf }
}
```
Just like `Array` can be coerced to a slice, we would like this string to coerce to a `str`, so we can do anything with it that we can do with a `str`. We implement `Deref` and `DerefMut`, and an `as_str` method.
```rust
impl<A: Allocator> String<A>
{
    #[inline]
    pub fn as_str(&self) -> &str
    {
        self
    }
}

impl<A: Allocator> Deref for String<A>
{
    type Target = str;

    #[inline]
    fn deref(&self) -> &str
    {
        unsafe
        {
            str::from_utf8_unchecked(&self.buf)
        }
    }
}
```
Now we can very easily implement useful traits such as `Display`, `Hash` or `PartialEq`:
```rust
impl<A: Allocator> hash::Hash for String<A>
{
    fn hash<H: hash::Hasher>(&self, h: &mut H)
    {
        hash::Hash::hash(self.as_str(), h);
    }
}
```
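For example, `PartialEq` and `Display` can be forwarded to `str` in the same way (a sketch):

```rust
use core::fmt;

impl<A: Allocator> PartialEq for String<A>
{
    fn eq(&self, other: &Self) -> bool
    {
        self.as_str() == other.as_str()
    }
}

impl<A: Allocator> fmt::Display for String<A>
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result
    {
        fmt::Display::fmt(self.as_str(), f)
    }
}
```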
And that's it for this container, for now. We'll probably want to be able to concatenate and do other mutating operations on it eventually, but the basic structure is here.
Wrap up
===============
The containers we have implemented so far should be enough for most of what we need. Of course they lack some very useful features at the moment, but we'll get back to them as needed. I wouldn't suggest anybody use these in production: I'm no Rust expert, so there are probably many bugs remaining. I guess time will tell. Next time we'll take a look at initializing Vulkan. Until then, you can contact me on [Twitter](https://twitter.com/steven_lr).