MESYETI's Website
<- HomeDay 10 - Implementing vectors in Callisto
My December Adventure log is back, after 5 days of nothing. I started implementing vectors (dynamic arrays) in Callisto. It went horribly wrong as many Callisto bugs started to show themselves. It took a while, but eventually they were all fixed. Now to talk about vectors.
Up until now, Callisto has been about as low level and as difficult to use as C. C doesn't make using arrays on the heap that easy, and neither did Callisto. In C, this usually makes you end up just using fixed size arrays where you can, which is good as they are faster than dynamic arrays. However, if you really need a resizable array, you might get lazy and make a slow implementation without things like array capacity (which means less realloc calls - I'll explain later). Not that this matters anyway, since Callisto is pretty slow (it's not too bad though!)
How vectors work
Vectors are just arrays with a variable size, so they need to be allocated on the heap. Re-allocating every time you add something to this array can be slow - so it's good to use something called capacity. This means pre-allocating some big (not too big) amount of data in case stuff needs to be added. So, whenever something is added, instead of re-allocating, simply increase the length value and write to the already allocated memory. If there is no more capacity left (the vector length is equal to the vector capacity), then re-allocate. This means re-allocation happens much less often.
With this in mind, I can start off my vector implementation like this:
struct Vector : Array
usize capacity
end
Array is a built-in structure in Callisto. It has a definition like this:
struct Array
usize length
usize memberSize
addr elements
end
Something I like about object orientated languages is that they usually have constructors or destructors for their classes. Things are easier when you don't have to be worrying about freeing things when they go out of scope. That's why I added them to Callisto. Here, I'm going to use them to initialise the values in the Vector structure.
implement Vector init
let ptr Vector vec
-> vec
0 -> vec.length
0 -> vec.memberSize
0 -> vec.elements
0 -> vec.capacity
end
implement Vector deinit
let ptr Vector vec
-> vec
if vec.elements 0 = not then
vec.elements free
end
end
Setting the values in the vector is important because of the destructor. If the elements value is not initialised - then it has some garbage value in it. An empty array should be NULL. This is important for the destructor because freeing a garbage address will cause a fatal runtime error.
However, this initialisation is not enough. The structure doesn't yet know
about the type it will hold. The user will give this information to the structure
with the init_vec
function:
func init_vec usize typeSize ptr Vector vec begin
typeSize -> vec.memberSize
1 -> vec.capacity
0 -> vec.length
vec alloc_vec_capacity
end
It also initialises the vector's capacity to 1. The alloc_vec_capacity
function simply calls realloc with the vector's element address and the vector's
capacity.
Now that we have everything needed for the Vector type itself - next a way to add data to the vector is needed. I will add a function to push values to the array:
func man vec_push cell value ptr Vector vec begin
let ptr Vector vec
-> vec
if vec.length 1 + vec.capacity > then
vec double_vec_capacity
end
vec.length 1 + -> vec.length
vec.length 1 - vec a!
end
The double_vec_capacity
function just doubles the vector's capacity
and re-allocates. I'm not sure if doubling the capacity is always a good idea
(you could run out of memory if you have an enormous amount of elements),
so at some point I'll check other implementations to see what they do,
but this should be fine for now.
Now to make a function that does the opposite, pops from the top of the vector.
func error vec_pop ptr Vector vec begin
if vec.length 0 = then
"vec_pop: Vector empty" throw
end
vec.length 1 - -> vec.length
if vec.length vec.capacity 2 / < then
vec halve_vec_capacity
end
end
halve_vec_capacity
does the opposite of double_vec_capacity
.
It halves the vector's capacity and re-allocates.
2 more functions to go until I would consider this vector implementation
usable. Remove and insert. vec_remove
will remove an element
at any index from the vector. First, all elements after the element to be
removed should be moved to where the element is. So, removing index 1
from [1, 2, 3] will result in [1, 3]. The length is still the same, so just
subtract 1 from it.
func error vec_remove cell index ptr Vector vec begin
if index vec.length >= then
c"vec_remove: Index out of bounds" throw
end
if index vec.length 1 - = then
vec vec_pop
return
end
index vec a^ # dest
index 1 + vec a^ # src
vec.length index - vec.memberSize * # n
copy_mem
vec.length 1 - -> vec.length
if vec.length vec.capacity 2 / < then
vec halve_vec_capacity
end
end
Insert is the opposite of this. It takes 2 parameters. First is the index to insert at, and second there's the value. The first thing to do is to move all elements from index to the end to index + 1. This leaves a space for the inserted element. After that, you simply write the inserted element there and increment the length.
func error vec_insert cell value cell index ptr Vector vec begin
if index vec.length > then
c"vec_insert: Index out of bounds" throw
end
if index vec.length = then
value vec vec_push
return
end
if vec.length 1 + vec.capacity > then
vec double_vec_capacity
end
index 1 + vec.memberSize * vec.elements + # dest
index vec a^ # src
vec.length index - vec.memberSize * # n
copy_mem
vec.length 1 + -> vec.length
value index vec a!
end
That's the end of my vector implementation. This makes Callisto much easier to use as vector and similar structures that will come in the future will make manual memory management not as much of an issue.
Links