Optimizing struct size can improve both memory usage and application performance. Let's look at the following example:
package main

import (
	"fmt"
	"unsafe"
)

type Contact struct {
	enabled bool
	name    string
	surname string
	isSpam  bool
	age     int
}

func main() {
	var c Contact
	fmt.Printf("Size of Contact.enabled: %d\n", unsafe.Sizeof(c.enabled))
	fmt.Printf("Size of Contact.name: %d\n", unsafe.Sizeof(c.name))
	fmt.Printf("Size of Contact.surname: %d\n", unsafe.Sizeof(c.surname))
	fmt.Printf("Size of Contact.isSpam: %d\n", unsafe.Sizeof(c.isSpam))
	fmt.Printf("Size of Contact.age: %d\n\n", unsafe.Sizeof(c.age))
	fmt.Printf("Size of Contact: %d\n", unsafe.Sizeof(c))
}
If we run it, the output will be:
Size of Contact.enabled: 1
Size of Contact.name: 16
Size of Contact.surname: 16
Size of Contact.isSpam: 1
Size of Contact.age: 8
Size of Contact: 56
What is happening? The sum of the field sizes is 1 + 16 + 16 + 1 + 8 = 42 bytes, so why is the size of the struct 56?
We will answer that question, but first let's look at another example:
package main

import (
	"fmt"
	"unsafe"
)

type Contact struct {
	enabled bool
	name    string
	surname string
	isSpam  bool
	age     int
}

type Contact1 struct {
	name    string
	surname string
	age     int
	enabled bool
	isSpam  bool
}

func main() {
	var c Contact
	var c1 Contact1
	fmt.Printf("Size of Contact.enabled: %d\n", unsafe.Sizeof(c.enabled))
	fmt.Printf("Size of Contact.name: %d\n", unsafe.Sizeof(c.name))
	fmt.Printf("Size of Contact.surname: %d\n", unsafe.Sizeof(c.surname))
	fmt.Printf("Size of Contact.isSpam: %d\n", unsafe.Sizeof(c.isSpam))
	fmt.Printf("Size of Contact.age: %d\n\n", unsafe.Sizeof(c.age))
	fmt.Printf("Size of Contact: %d\n", unsafe.Sizeof(c))
	fmt.Printf("Size of Contact1: %d\n", unsafe.Sizeof(c1))
}
If you run it, the output will be:
Size of Contact.enabled: 1
Size of Contact.name: 16
Size of Contact.surname: 16
Size of Contact.isSpam: 1
Size of Contact.age: 8
Size of Contact: 56
Size of Contact1: 48
Contact and Contact1 have exactly the same fields, yet Contact1 is smaller than Contact. What is happening?
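On 64-bit platforms, the Go compiler places each field at an offset that is a multiple of the field's alignment: 8 bytes for int and string, 1 byte for bool. A convenient way to picture this is as a sequence of 8-byte blocks: when you define a field, the compiler tries to fit it into the current block. If it doesn't fit, the rest of the block is wasted as padding and the field starts in a new 8-byte block. The total struct size is also rounded up to a multiple of 8 bytes here, because that is the largest field alignment.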
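The 8-byte boundary comes from the alignment of the largest fields; a smaller type such as bool has a smaller alignment, which is why several of them can share one block. The sketch below uses unsafe.Alignof to print those requirements. It assumes a 64-bit platform such as amd64 or arm64, and the helper name alignments is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Contact struct {
	enabled bool
	name    string
	surname string
	isSpam  bool
	age     int
}

// alignments returns the alignment requirement, in bytes, of a bool field,
// a string field, an int field, and the Contact struct as a whole.
// (Hypothetical helper, not part of the original example.)
func alignments() (boolAlign, stringAlign, intAlign, structAlign uintptr) {
	var c Contact
	return unsafe.Alignof(c.enabled),
		unsafe.Alignof(c.name),
		unsafe.Alignof(c.age),
		unsafe.Alignof(c)
}

func main() {
	b, s, i, st := alignments()
	fmt.Printf("Alignment of bool: %d\n", b)
	fmt.Printf("Alignment of string: %d\n", s)
	fmt.Printf("Alignment of int: %d\n", i)
	fmt.Printf("Alignment of Contact: %d\n", st)
}
```

On a 64-bit platform this should report 1 for bool and 8 for the other three; on a 32-bit platform the values would differ.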
Let's look at Contact:
type Contact struct {
	enabled bool   // 1 byte
	name    string // 16 bytes
	surname string // 16 bytes
	isSpam  bool   // 1 byte
	age     int    // 8 bytes
}
The compiler will accommodate it in the following way (X is a wasted byte):
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | cumulative size (bytes) |
---|---|---|---|---|---|---|---|---|
enabled | X | X | X | X | X | X | X | 8 |
name | name | name | name | name | name | name | name | 16 |
name | name | name | name | name | name | name | name | 24 |
surname | surname | surname | surname | surname | surname | surname | surname | 32 |
surname | surname | surname | surname | surname | surname | surname | surname | 40 |
isSpam | X | X | X | X | X | X | X | 48 |
age | age | age | age | age | age | age | age | 56 |
As you can see, enabled and isSpam each waste 7 bytes, because the field that follows needs 8-byte alignment and cannot start in the remaining 7 bytes.
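The layout in the table can be checked with unsafe.Offsetof, which reports where each field starts inside the struct. This is a minimal sketch assuming a 64-bit platform; the helper name offsets is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Contact struct {
	enabled bool
	name    string
	surname string
	isSpam  bool
	age     int
}

// offsets returns the byte offset of each Contact field, in declaration order.
// (Hypothetical helper, not part of the original example.)
func offsets() [5]uintptr {
	var c Contact
	return [5]uintptr{
		unsafe.Offsetof(c.enabled),
		unsafe.Offsetof(c.name),
		unsafe.Offsetof(c.surname),
		unsafe.Offsetof(c.isSpam),
		unsafe.Offsetof(c.age),
	}
}

func main() {
	names := [5]string{"enabled", "name", "surname", "isSpam", "age"}
	for i, off := range offsets() {
		fmt.Printf("Offset of Contact.%s: %d\n", names[i], off)
	}
}
```

On a 64-bit platform the offsets should be 0, 8, 24, 40 and 48: name starts at byte 8 rather than byte 1, and age starts at byte 48 rather than byte 41, which accounts for the two runs of 7 padding bytes.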
Let's look at Contact1:
type Contact1 struct {
	name    string // 16 bytes
	surname string // 16 bytes
	age     int    // 8 bytes
	enabled bool   // 1 byte
	isSpam  bool   // 1 byte
}
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | cumulative size (bytes) |
---|---|---|---|---|---|---|---|---|
name | name | name | name | name | name | name | name | 8 |
name | name | name | name | name | name | name | name | 16 |
surname | surname | surname | surname | surname | surname | surname | surname | 24 |
surname | surname | surname | surname | surname | surname | surname | surname | 32 |
age | age | age | age | age | age | age | age | 40 |
enabled | isSpam | X | X | X | X | X | X | 48 |
This time we are wasting only 6 bytes and the struct size is just 48 bytes!
We started this post by saying that optimizing a struct can improve both memory usage and performance. We explained how memory is affected, but we didn't say anything about performance.
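As a rough mental model, a 64-bit CPU moves data in 64-bit (8-byte) words, so every time it has to copy our struct it needs about 7 word-sized moves for Contact (56 bytes / 8 bytes = 7), but only 6 for Contact1 (48 bytes / 8 bytes = 6). Real hardware is more complicated than one word per cycle, so treat these numbers as intuition rather than a measurement: the dependable gain is that smaller structs mean less data to copy and more values per cache line.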