Allocating dynamic memory is something everyone tries to avoid because it degrades performance. Sometimes you can reorganize your logic to remove all the heap escapes from your code, but in some cases such an optimization is impossible. When that happens, you can try another approach: preallocate a sufficient amount of memory and request it on demand from a memory manager.
Standard Solution: sync.Pool
The standard library already has a primitive for saving pieces of dynamic memory allocated earlier during the application's execution and requesting them back later: sync.Pool. Unfortunately, the behaviour of that synchronization primitive is not consistent. Sometimes it does the job; other times it decides it shouldn't, because the garbage collector is free to drop pooled objects at any moment. I needed something that worked for sure, so I created my own pool.
Linked List and Memory Allocation Amortization
A linked list is a good container for a memory pool because its removal and insertion operations run in constant O(1) time. If I based the solution on an array list, inserting a new element would take O(n) in the worst case because the existing contents have to be copied to a new array when the capacity is insufficient.
Go has an out-of-the-box linked list implementation in its standard library — the container/list package. But it has two downsides. The first is that it doesn't use generics and thus accepts interface{} as an argument, which leads to an unnecessary memory allocation. The second is that it allocates nodes dynamically and discards them after the corresponding element is deleted, instead of retaining them in case they are needed later. This leads to even more unnecessary memory allocations that could be avoided.
That’s why I came up with my linked list type:
package list

// AmortizedList represents a doubly linked list
// that retains removed nodes for later reuse.
// The zero value for AmortizedList is an empty list ready to use.
type AmortizedList[T any] struct {
	root   Element[T]
	len    int
	memory *List[T]
}

// Init initializes or clears list l.
func (l *AmortizedList[T]) Init() *AmortizedList[T] {
	l.root.next = &l.root
	l.root.prev = &l.root
	l.len = 0
	l.memory = New[T]()

	return l
}

// NewAmortized returns an initialized list.
func NewAmortized[T any]() *AmortizedList[T] {
	return new(AmortizedList[T]).Init()
}

// Len returns the number of elements of list l.
// The complexity is O(1).
func (l *AmortizedList[T]) Len() int { return l.len }

// Front returns the first element of list l or nil if the list is empty.
func (l *AmortizedList[T]) Front() *Element[T] {
	if l.len == 0 {
		return nil
	}

	return l.root.next
}

// Back returns the last element of list l or nil if the list is empty.
func (l *AmortizedList[T]) Back() *Element[T] {
	if l.len == 0 {
		return nil
	}

	return l.root.prev
}

// lazyInit lazily initializes a zero AmortizedList value.
func (l *AmortizedList[T]) lazyInit() {
	if l.root.next == nil {
		l.Init()
	}
}

// insert inserts e after at, increments l.len, and returns e.
func (l *AmortizedList[T]) insert(e, at *Element[T]) *Element[T] {
	e.prev = at
	e.next = at.next
	e.prev.next = e
	e.next.prev = e
	e.alist = l
	l.len++

	return e
}

// insertValue is a convenience wrapper for
// insert(&Element{Value: v}, at) that reuses a node
// from the memory list if one is available.
func (l *AmortizedList[T]) insertValue(v T, at *Element[T]) *Element[T] {
	lastSlot := l.memory.Back()

	if lastSlot == nil {
		return l.insert(&Element[T]{Value: v}, at)
	}

	l.memory.Remove(lastSlot)
	lastSlot.Value = v

	return l.insert(lastSlot, at)
}

// remove removes e from its list, decrements l.len,
// and places the node into the memory list for reuse.
func (l *AmortizedList[T]) remove(e *Element[T]) {
	e.prev.next = e.next
	e.next.prev = e.prev
	e.next = nil  // avoid memory leaks
	e.prev = nil  // avoid memory leaks
	e.alist = nil // the node no longer belongs to the amortized list
	l.len--

	l.memory.insert(e, l.memory.root.prev)
}

// move moves e to next to at.
func (l *AmortizedList[T]) move(e, at *Element[T]) {
	if e == at {
		return
	}

	e.prev.next = e.next
	e.next.prev = e.prev

	e.prev = at
	e.next = at.next
	e.prev.next = e
	e.next.prev = e
}

// Remove removes e from l if e is an element of list l.
// It returns the element value e.Value.
// The element must not be nil.
func (l *AmortizedList[T]) Remove(e *Element[T]) T {
	if e.alist == l {
		// if e.alist == l, l must have been initialized when e was inserted
		// in l or l == nil (e is a zero Element) and l.remove will crash
		l.remove(e)
	}

	return e.Value
}

// PushFront inserts a new element e with value v at the
// front of list l and returns e.
func (l *AmortizedList[T]) PushFront(v T) *Element[T] {
	l.lazyInit()
	return l.insertValue(v, &l.root)
}

// PushBack inserts a new element e with value v at the
// back of list l and returns e.
func (l *AmortizedList[T]) PushBack(v T) *Element[T] {
	l.lazyInit()
	return l.insertValue(v, l.root.prev)
}
As you can see, it's based on the standard library's implementation (originally written at Google); the full source code is here. I added a new field, memory, to store previously produced nodes in case they are needed later. When a new value is added to the amortized list, a node for it is taken from the memory list; if the memory list is empty, a new node is created. When a value leaves the amortized list, the node that contained it goes to the memory list for reuse.
The Memory Manager
All the elements placed in the memory pool must correspond to this interface:
package mempool

// Erasable is an object
// whose data can be completely
// nullified in order to reuse it.
type Erasable interface {
	// Erase resets all the fields
	// of the object to defaults
	// (not deeply).
	Erase() error
}
The Erase method is called when an object is added to the pool. You also need to specify a constructor that creates new objects in case the pool turns out to be empty when a Get request is made.
package mempool

import (
	"sync"

	ll "github.com/zergon321/ll"
)

// Pool holds dynamically
// allocated objects that
// can be reused throughout
// the runtime.
type Pool[T Erasable] struct {
	objects     *ll.List[T]
	mut         *sync.Mutex
	constructor func() T
}

// NewPool returns a new pool
// for a certain object type.
func NewPool[T Erasable](
	constructor func() T,
	options ...PoolOption[T],
) (*Pool[T], error) {
	var pool Pool[T]
	var params poolParams

	// Apply all the options.
	for i := 0; i < len(options); i++ {
		option := options[i]
		err := option(&pool, &params)

		if err != nil {
			return nil, err
		}
	}

	// Fill the pool.
	pool.constructor = constructor
	pool.objects = ll.New[T]()

	for i := 0; i < params.initLen; i++ {
		pool.objects.PushBack(pool.constructor())
	}

	return &pool, nil
}
We need a mutex because the pool could be called from multiple goroutines. There is also an option to preallocate a certain number of objects, i.e., to fill the pool preemptively. These options are defined as follows:
package mempool

import "sync"

// PoolOption changes something
// in the newly created pool.
type PoolOption[T Erasable] func(pool *Pool[T], params *poolParams) error

// poolParams holds the
// parameters to be used
// for pool initialization.
type poolParams struct {
	initLen int // initLen is the initial length of the pool object list.
}

// PoolOptionInitialLength creates an option
// to set the initial length of the object pool.
func PoolOptionInitialLength[T Erasable](length int) PoolOption[T] {
	return func(pool *Pool[T], params *poolParams) error {
		if length < 0 {
			return &ErrorNegativeLength{
				length: length,
			}
		}

		params.initLen = length

		return nil
	}
}

// PoolOptionConcurrent enables a newly
// created pool to handle concurrent gets and puts
// from many different goroutines at the same time.
func PoolOptionConcurrent[T Erasable]() PoolOption[T] {
	return func(pool *Pool[T], params *poolParams) error {
		pool.mut = &sync.Mutex{}
		return nil
	}
}
Using the memory pool is easy: call Get any time you want to create a new object and Put when you don't need the object anymore and can dispose of it. Yeah, we kind of bring the C workflow of working with dynamic memory to Go. We request memory and then free it.
// Get extracts an empty object
// from the pool.
func (pool *Pool[T]) Get() T {
	if pool.mut != nil {
		pool.mut.Lock()
		defer pool.mut.Unlock()
	}

	// The length check must happen under the lock;
	// otherwise two goroutines could race for the last object.
	if pool.objects.Len() <= 0 {
		return pool.constructor()
	}

	object := pool.objects.Remove(pool.objects.Back())

	return object
}
// Put puts the object to the pool
// and erases its fields so it can
// be reused.
func (pool *Pool[T]) Put(object T) error {
	if pool.mut != nil {
		pool.mut.Lock()
		defer pool.mut.Unlock()
	}

	err := object.Erase()

	if err != nil {
		return err
	}

	pool.objects.PushBack(object)

	return nil
}
Quite straightforward, isn’t it? Now let’s do some benchmarks.
Measuring Performance
First, let’s define our custom data type to work with:
type ByteArray []byte

func (array ByteArray) Erase() error {
	for i := 0; i < len(array); i++ {
		array[i] = 0
	}

	return nil
}
Now we need to see how much time it takes to populate a sync.Pool and extract values from it:
func BenchmarkSyncPool(b *testing.B) {
	pool := &sync.Pool{
		New: func() any { return make([]byte, 64) },
	}

	for i := 0; i < b.N; i++ {
		pool.Put(make([]byte, 64))
	}

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		pool.Get()
	}
}

func BenchmarkSyncPoolFill(b *testing.B) {
	pool := &sync.Pool{
		New: func() any { return make([]byte, 64) },
	}

	for i := 0; i < b.N; i++ {
		pool.Put(make([]byte, 64))
	}
}
Then the same for our custom memory pool:
func BenchmarkMempool(b *testing.B) {
	pool, _ := mempool.NewPool[ByteArray](
		func() ByteArray { return make([]byte, 64) })

	for i := 0; i < b.N; i++ {
		pool.Put(make([]byte, 64))
	}

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		pool.Get()
	}
}

func BenchmarkMempoolFill(b *testing.B) {
	pool, _ := mempool.NewPool[ByteArray](
		func() ByteArray { return make([]byte, 64) })

	for i := 0; i < b.N; i++ {
		pool.Put(make([]byte, 64))
	}
}
In addition, we need to test how dynamic memory allocations are amortized on a second fill (i.e., when objects have been extracted from the pool and are then put back).
func BenchmarkMempoolRefill(b *testing.B) {
	pool, _ := mempool.NewPool[ByteArray](
		func() ByteArray { return make([]byte, 64) })

	for i := 0; i < b.N; i++ {
		pool.Put(make([]byte, 64))
	}

	data := make([]ByteArray, 0, b.N)

	for i := 0; i < b.N; i++ {
		data = append(data, pool.Get())
	}

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		pool.Put(data[i])
	}
}
Time to see the magnificent results:
goos: linux
goarch: amd64
pkg: github.com/zergon321/mempool
cpu: AMD Ryzen 5 5600H with Radeon Graphics
BenchmarkSyncPool-12 13307784 187.5 ns/op 87 B/op 1 allocs/op
BenchmarkSyncPoolFill-12 18251826 66.80 ns/op 117 B/op 2 allocs/op
BenchmarkMempool-12 95308918 30.18 ns/op 0 B/op 0 allocs/op
BenchmarkMempoolFill-12 13297759 106.2 ns/op 128 B/op 2 allocs/op
BenchmarkMempoolRefill-12 42304838 28.13 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/zergon321/mempool 29.185s
As you can see, sync.Pool chose not to work: it still allocates on every Get because the garbage collector is free to discard pooled objects, whereas our custom memory pool does the job. Moreover, it doesn't allocate any memory if we don't fill it beyond capacity (thanks to the amortized linked list behind the scenes). So if we allocate enough memory in advance, there will be no allocations onward.
Here is the package source code.
Creating a Zero-Allocation Environment in Go was originally published in Better Programming on Medium, where people are continuing the conversation by highlighting and responding to this story.