Stale slices

As we saw in a previous post (Arrays vs Slices), a slice in Go is just a header pointing to a backing array. In this post, we return to slices and focus on some pitfalls of the append function.

Let's start with a very simple example declaring 2 slices: one containing all the letters of the name 'MASSIMILIANO' and one pointing to the same backing array, but without the last letter:

package main

import "fmt"

func main() {
    charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
    fmt.Println("[1]", "charSlice1", string(charSlice1))

    charSlice2 := charSlice1[:len(charSlice1)-1]

    fmt.Println("[2]", "charSlice1", string(charSlice1))
    fmt.Println("[2]", "charSlice2", string(charSlice2))
}

---- OUT ----
[1] charSlice1 MASSIMILIANO
[2] charSlice1 MASSIMILIANO
[2] charSlice2 MASSIMILIAN

Up to now, no surprises: everything is as expected. To demonstrate that the 2 slices point to the same backing array, let's change the 'S' chars in charSlice2 to 'X' and look at the values of charSlice1 and charSlice2:

package main

import "fmt"

func main() {
    charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
    fmt.Println("[1]", "charSlice1", string(charSlice1))

    charSlice2 := charSlice1[:len(charSlice1)-1]

    fmt.Println("[2]", "charSlice1", string(charSlice1))
    fmt.Println("[2]", "charSlice2", string(charSlice2))

    charSlice2[2] = 'X'
    charSlice2[3] = 'X'

    fmt.Println("[3]", "charSlice1", string(charSlice1))
    fmt.Println("[3]", "charSlice2", string(charSlice2))
}

---- OUT ----
[1] charSlice1 MASSIMILIANO
[2] charSlice1 MASSIMILIANO
[2] charSlice2 MASSIMILIAN
[3] charSlice1 MAXXIMILIANO
[3] charSlice2 MAXXIMILIAN

Again, everything is as expected: both charSlice1 and charSlice2 have been changed. Now let's try to append an 'o' char to charSlice2. Before reading on, try to think about what will happen and why.

Here is the new code:

package main

import "fmt"

func main() {
    charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
    fmt.Println("[1]", "charSlice1", string(charSlice1))

    charSlice2 := charSlice1[:len(charSlice1)-1]

    fmt.Println("[2]", "charSlice1", string(charSlice1))
    fmt.Println("[2]", "charSlice2", string(charSlice2))

    charSlice2[2] = 'X'
    charSlice2[3] = 'X'

    fmt.Println("[3]", "charSlice1", string(charSlice1))
    fmt.Println("[3]", "charSlice2", string(charSlice2))

    charSlice2 = append(charSlice2, 'o')

    fmt.Println("[4]", "charSlice1", string(charSlice1))
    fmt.Println("[4]", "charSlice2", string(charSlice2))
}

---- OUT ----
[1] charSlice1 MASSIMILIANO
[2] charSlice1 MASSIMILIANO
[2] charSlice2 MASSIMILIAN
[3] charSlice1 MAXXIMILIANO
[3] charSlice2 MAXXIMILIAN
[4] charSlice1 MAXXIMILIANo
[4] charSlice2 MAXXIMILIANo

Is this the output you were expecting? The append added an 'o' to charSlice2 but at the same time changed the last character of charSlice1! Why?

We will give all the answers, but before that let's try another example. This time we will add the chars for ' Z.' to charSlice2. Again, try to figure out what will happen and why before reading on, then check if you figured it out correctly.

Here is the new code:

package main

import "fmt"

func main() {
    charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
    fmt.Println("[1]", "charSlice1", string(charSlice1))

    charSlice2 := charSlice1[:len(charSlice1)-1]

    fmt.Println("[2]", "charSlice1", string(charSlice1))
    fmt.Println("[2]", "charSlice2", string(charSlice2))

    charSlice2[2] = 'X'
    charSlice2[3] = 'X'

    fmt.Println("[3]", "charSlice1", string(charSlice1))
    fmt.Println("[3]", "charSlice2", string(charSlice2))

    charSlice2 = append(charSlice2, 'o')

    fmt.Println("[4]", "charSlice1", string(charSlice1))
    fmt.Println("[4]", "charSlice2", string(charSlice2))

    charSlice2 = append(charSlice2, ' ', 'Z', '.')
    fmt.Println("[5]", "charSlice1", string(charSlice1))
    fmt.Println("[5]", "charSlice2", string(charSlice2))

}

---- OUT ----
[1] charSlice1 MASSIMILIANO
[2] charSlice1 MASSIMILIANO
[2] charSlice2 MASSIMILIAN
[3] charSlice1 MAXXIMILIANO
[3] charSlice2 MAXXIMILIAN
[4] charSlice1 MAXXIMILIANo
[4] charSlice2 MAXXIMILIANo
[5] charSlice1 MAXXIMILIANo
[5] charSlice2 MAXXIMILIANo Z.

Wait... WHAT? This time it appended the chars to charSlice2 but didn't do anything to charSlice1. What is happening?

I will answer, I swear. But first, one last experiment. Let's change the 'X' chars back to 'S' in charSlice2. Again, before reading on, try to figure out what is going to happen to charSlice1.

Here is the new code:

package main

import "fmt"

func main() {
    charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}   // (1)
    fmt.Println("[1]", "charSlice1", string(charSlice1))

    charSlice2 := charSlice1[:len(charSlice1)-1]                                       // (2)

    fmt.Println("[2]", "charSlice1", string(charSlice1))
    fmt.Println("[2]", "charSlice2", string(charSlice2))

    charSlice2[2] = 'X'                                                                // (3)
    charSlice2[3] = 'X'

    fmt.Println("[3]", "charSlice1", string(charSlice1))
    fmt.Println("[3]", "charSlice2", string(charSlice2))

    charSlice2 = append(charSlice2, 'o')                                               // (4)

    fmt.Println("[4]", "charSlice1", string(charSlice1))
    fmt.Println("[4]", "charSlice2", string(charSlice2))

    charSlice2 = append(charSlice2, ' ', 'Z', '.')                                     // (5)
    fmt.Println("[5]", "charSlice1", string(charSlice1))
    fmt.Println("[5]", "charSlice2", string(charSlice2))

    charSlice2[2] = 'S'                                                                // (6)
    charSlice2[3] = 'S'
    fmt.Println("[6]", "charSlice1", string(charSlice1)) 
    fmt.Println("[6]", "charSlice2", string(charSlice2))

}

---- OUT ----
[1] charSlice1 MASSIMILIANO
[2] charSlice1 MASSIMILIANO
[2] charSlice2 MASSIMILIAN
[3] charSlice1 MAXXIMILIANO
[3] charSlice2 MAXXIMILIAN
[4] charSlice1 MAXXIMILIANo
[4] charSlice2 MAXXIMILIANo
[5] charSlice1 MAXXIMILIANo
[5] charSlice2 MAXXIMILIANo Z.
[6] charSlice1 MAXXIMILIANo
[6] charSlice2 MASSIMILIANo Z.

What?? Now changing charSlice2 didn't change charSlice1 anymore!!!!! Why?

ANSWERS

Ok, it's time to get some answers. The issue is that append is often thought of as a const operation that always allocates a new array and never changes the old one. That's true only when the appended elements don't fit in the slice's current capacity; when they do fit, append writes them directly into the existing backing array.
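To make that rule concrete, here is a minimal sketch. The helper willReallocate is hypothetical (it is not part of the standard library or of the code above); it only restates the capacity check in terms of len and cap, assuming the behaviour described in the Go specification for append.

```go
package main

import "fmt"

// willReallocate is a hypothetical helper (an assumption for illustration):
// it reports whether appending n elements to s would exceed the capacity of
// s, which is the case in which append has to allocate a new backing array.
func willReallocate(s []byte, n int) bool {
	return len(s)+n > cap(s)
}

func main() {
	full := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
	short := full[:len(full)-1] // len 11, cap 12

	// One extra byte still fits in the shared backing array.
	fmt.Println(willReallocate(short, 1)) // false

	// Two extra bytes do not fit, so append would allocate a new array.
	fmt.Println(willReallocate(short, 2)) // true
}
```

When the helper returns false, append writes into the array that other slices may still be looking at, which is exactly the situation to watch out for.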

Let's explain what happens in the code:

  • in (1) we declare a 12-byte slice (charSlice1). That will result in allocating a 12-byte array and returning a slice pointing to it. Len and Capacity of the slice will be 12.
  • in (2) we create an 11-byte slice (charSlice2) pointing to the same backing array as charSlice1; no new array is allocated. Len of charSlice2 will be 11, while Capacity of charSlice2 will be 12 (the backing array is 12 bytes long).
  • in (3) we change the chars at positions 2 and 3 to 'X'. Both slices receive the change because they are pointing to the same backing array. Everything is as expected.
  • in (4) we append an 'o' to charSlice2. Here the first strange event happens: the last character of charSlice1 changes to 'o'. What is happening here is that charSlice2 has a length of 11, but a capacity of 12. For that reason, appending one char doesn't require allocating a new array: append will just put an 'o' at the 12th position of the backing array and change the charSlice2 length to 12. Since both charSlice1 and charSlice2 are pointing to the same backing array and since charSlice1 already had a length of 12, the last byte of charSlice1 will get changed!
  • in (5) we append 3 more characters to charSlice2. This time they don't fit into the backing array, thus append will allocate a new one and copy the existing elements into it. Now charSlice1 and charSlice2 have diverged: charSlice1 still points to the old backing array, while charSlice2 points to the new one. For that reason, this time we see the change only in charSlice2!
  • in (6) we change the chars at positions 2 and 3 of charSlice2 again. As we explained in the previous point, charSlice1 and charSlice2 are now pointing to 2 different backing arrays, so the change won't affect charSlice1.
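
The steps above can be followed by printing len and cap next to the contents. This is only a sketch: describe is a hypothetical helper added for illustration, and the capacity after step (5) is chosen by the runtime, so the comments only state what is guaranteed (at least 15).

```go
package main

import "fmt"

// describe is a hypothetical helper (not part of the original program):
// it formats a slice's name, length, capacity and contents.
func describe(name string, s []byte) string {
	return fmt.Sprintf("%s len=%d cap=%d %q", name, len(s), cap(s), s)
}

func main() {
	charSlice1 := []byte{'M', 'A', 'S', 'S', 'I', 'M', 'I', 'L', 'I', 'A', 'N', 'O'}
	charSlice2 := charSlice1[:len(charSlice1)-1]
	fmt.Println(describe("charSlice1", charSlice1)) // len=12 cap=12
	fmt.Println(describe("charSlice2", charSlice2)) // len=11 cap=12

	// Step (4): one byte fits in the spare capacity, so the shared array is reused.
	charSlice2 = append(charSlice2, 'o')
	fmt.Println(describe("charSlice1", charSlice1)) // last byte is now 'o'
	fmt.Println(describe("charSlice2", charSlice2)) // len=12 cap=12

	// Step (5): three more bytes do not fit, so append allocates a new array.
	// The new capacity is picked by the runtime; it is only known to be >= 15.
	charSlice2 = append(charSlice2, ' ', 'Z', '.')
	fmt.Println(describe("charSlice1", charSlice1)) // unchanged, len=12 cap=12
	fmt.Println(describe("charSlice2", charSlice2)) // len=15

	// Step (6): the slices no longer share an array, so only charSlice2 changes.
	charSlice2[2], charSlice2[3] = 'S', 'S'
	fmt.Println(describe("charSlice1", charSlice1))
	fmt.Println(describe("charSlice2", charSlice2))
}
```

Watching cap rather than just the contents makes it easier to predict when the next append will stop affecting the other slice.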