Using single-table design principles and AWS SDK for Go to create efficient and maintainable code to work with AWS DynamoDB
Recently, I maintained Go code that handles various DynamoDB operations. The code was full of manually defined ExpressionAttributeValues and ExpressionAttributeNames maps. It looked cumbersome. After some quick research, I discovered that the AWS SDK for Go provides features that enable writing cleaner code.
In this article, I want to share how I work with DynamoDB using the AWS SDK for Go. The code snippets below are parts of a sample project I created to complement the article.
DynamoDB — A Shift From the Relational Data Model to a Single-Table Design Concept
Before jumping to Go code, I want to discuss DynamoDB table design. As a developer experienced in designing relational databases, I was unsure how to build data models for DynamoDB. At first, I applied the same principles of normalisation when planning models for DynamoDB. As a result, I had several normalised tables without the support of JOIN.
DynamoDB delivers single-digit millisecond performance at any scale. But retrieving data still requires a network request, and network I/O is usually one of an application's performance bottlenecks. Issuing multiple requests to a database in a waterfall fashion only degrades performance. To fully utilise DynamoDB's potential, the number of requests to the database must be reduced, ideally to one. It means a DynamoDB application should work with as few tables as possible. This is the motivation behind the single-table design concept.
The idea is to flatten application data. Sometimes, data items are accompanied by metadata items. The shift to a single-table design was challenging, partly because of the lack of suitable terminology. In DynamoDB, the core components are tables, items, and attributes. However, a DynamoDB table does not equal a relational database table. It is closer to a view in a relational database in which multiple tables are joined.
MongoDB terminology is better. As its documentation says, MongoDB stores data records as documents gathered in collections, and a database stores one or more collections of documents. Using the words document and collection removes the cognitive association with a table in a relational database.
DynamoDB table keys
It is hard to imagine an efficient database architecture without keys and indexes. Here, I would like to focus only on primary, partition, and sort keys: the minimum components required to build a table and execute DynamoDB operations.
The primary key must be specified when creating a table. It uniquely identifies each item in the table. The primary key is either the partition key alone or the combination of the partition key and the sort key.
DynamoDB passes the partition key's value to an internal hash function. The output of the hash function determines the partition in which the item is stored. Items with the same partition key are stored together.
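To make the hashing step concrete, here is a minimal sketch. DynamoDB's hash function is internal, so FNV-1a is used purely as a stand-in, and the partition count and the key values such as `INVOICE#1001` are assumptions made up for the illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a partition key value to one of n partitions.
// FNV-1a is only a stand-in: DynamoDB's real hash function is internal.
func partitionFor(partitionKey string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(partitionKey)) // Write on a hash.Hash never returns an error
	return h.Sum32() % n
}

func main() {
	// Hypothetical key values; the first two are identical on purpose.
	keys := []string{"INVOICE#1001", "INVOICE#1001", "INVOICE#1002"}
	for _, k := range keys {
		fmt.Printf("%s -> partition %d\n", k, partitionFor(k, 4))
	}
}
```

The point of the sketch is only that equal partition key values always land in the same partition, which is why items that share a partition key can be read together.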
Here are a few things you need to know about a table to follow this discussion:
- what type of primary key is defined in a table
- what is the partition key
- what is the sort key
These keys play a central role in building queries.
DynamoDB API — Data Access Methods
Read methods
- BatchGetItem — retrieves up to 100 items from one or more tables.
- GetItem — retrieves a single item from a table with the given primary key.
- Query — retrieves all items that have a specific partition key.
- Scan — retrieves all items in the specified table or index.
- TransactGetItems — atomically retrieves multiple items from one or more tables.
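Because BatchGetItem accepts at most 100 items per call, a longer list of keys has to be split before it is sent. The sketch below shows one way to do that with plain strings standing in for keys; the helper name `chunkKeys` and the key values are hypothetical.

```go
package main

import "fmt"

// maxBatchGetItems is the per-request item limit for BatchGetItem.
const maxBatchGetItems = 100

// chunkKeys splits keys into slices of at most size elements.
// It returns nil when size is not positive or keys is empty.
func chunkKeys(keys []string, size int) [][]string {
	var batches [][]string
	for size > 0 && len(keys) > 0 {
		n := size
		if len(keys) < n {
			n = len(keys)
		}
		batches = append(batches, keys[:n])
		keys = keys[n:]
	}
	return batches
}

func main() {
	keys := make([]string, 250)
	for i := range keys {
		keys[i] = fmt.Sprintf("INVOICE#%d", i) // hypothetical key values
	}
	for i, b := range chunkKeys(keys, maxBatchGetItems) {
		fmt.Printf("batch %d: %d keys\n", i, len(b))
	}
}
```

Each resulting batch would then become one BatchGetItem call.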
Note: You can use a filter in Scan and Query operations to reduce the number of items returned to the client. The filter is applied after the data is read from DynamoDB, so items that do not satisfy the filter condition are read but not returned to the client.
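The order of operations in that note can be illustrated with an in-memory stand-in for a table. This is not how DynamoDB is implemented; the `item` type, the attribute names, and the status values are assumptions for the sketch.

```go
package main

import "fmt"

type item struct {
	PK, SK, Status string
}

// query mimics Query with a filter: it reads every item in the partition,
// then drops the ones the filter rejects. It returns the matches together
// with the number of items that were read.
func query(table []item, pk string, filter func(item) bool) (matched []item, read int) {
	for _, it := range table {
		if it.PK != pk {
			continue // the key condition limits what is read
		}
		read++
		if filter(it) { // the filter only limits what is returned
			matched = append(matched, it)
		}
	}
	return matched, read
}

func main() {
	table := []item{
		{"INVOICE#1", "ITEM#1", "shipped"},
		{"INVOICE#1", "ITEM#2", "pending"},
		{"INVOICE#1", "ITEM#3", "shipped"},
		{"INVOICE#2", "ITEM#1", "shipped"},
	}
	shipped := func(it item) bool { return it.Status == "shipped" }
	matched, read := query(table, "INVOICE#1", shipped)
	fmt.Printf("read %d items, returned %d\n", read, len(matched))
}
```

The number of items read depends only on the key condition, so a filter saves bandwidth to the client but not the read work itself.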
Write methods
- BatchWriteItem — puts or deletes multiple items in one or more tables.
- DeleteItem — deletes a single record in a table by primary key.
- PutItem — creates a new item or replaces an existing item with a new one.
- TransactWriteItems — synchronous write operation on items from one or more tables (no two actions can target the same record).
- UpdateItem — edits an existing item’s attributes or adds a new item to the table if it does not already exist.
Put vs Update
There is no difference when an item does not exist. Both methods create a new item. When an existing item is found, Put replaces it with the new one, and Update alters the item’s attributes.
From DynamoDB API to SDK
DynamoDB is one of many services provided by AWS. Every service has an API, a set of methods for calling the service, exposed to clients via HTTP endpoints. So, what’s the AWS SDK? It is a set of types and functions for building and running HTTP requests to AWS services.
The AWS SDK is available in multiple programming languages. Its functionality is defined by the APIs of the AWS services.
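To show what the SDK hides, here is a rough sketch of the HTTP request behind a GetItem call, built with the standard library only. It is deliberately incomplete: request signing and credentials are omitted, so DynamoDB would reject it. The table name, region, and key attribute names are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newGetItemRequest builds (but does not send) a raw GetItem request.
// Signing is omitted, so DynamoDB would reject this request as is.
func newGetItemRequest(region, body string) (*http.Request, error) {
	url := fmt.Sprintf("https://dynamodb.%s.amazonaws.com/", region)
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-amz-json-1.0")
	// The target header selects the API method to call.
	req.Header.Set("X-Amz-Target", "DynamoDB_20120810.GetItem")
	return req, nil
}

func main() {
	// Hypothetical table and key attribute names.
	body := `{"TableName":"invoices","Key":{"PK":{"S":"INVOICE#1"},"SK":{"S":"HEADER"}}}`
	req, err := newGetItemRequest("eu-west-1", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("X-Amz-Target"))
}
```

The SDK wraps all of this, plus signing, retries, and response parsing, behind typed inputs and outputs.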
Expression Package
The AWS SDK for Go provides methods to read and write data in DynamoDB. The structures that describe method inputs contain filters, conditions, and expression attribute maps (names and values maps). Here’s an example that shows how to build the QueryInput.
This code works, but it has drawbacks. Building ExpressionAttributeValues for KeyConditionExpression manually is a labour-intensive and error-prone process. It also exposes details of how the query is processed internally.
The expression package provides types and functions to create expression strings (to describe filters and conditions) and attribute maps. The following code uses a declarative way to build QueryInput without exposing implementation details.
The main component of the package is Builder. It provides methods to build the Expression structure. The getter methods of the structure return the formatted DynamoDB expressions, ExpressionAttributeNames and ExpressionAttributeValues maps.
Builder uses four concrete implementations:
- ConditionBuilder — builds FilterExpression and ConditionExpression
- KeyConditionBuilder — builds KeyConditionExpression
- ProjectionBuilder — builds ProjectionExpression
- UpdateBuilder — builds UpdateExpression
Each of these builders can be involved in building the Expression structure using corresponding methods of Builder: WithCondition, WithFilter, WithKeyCondition, WithProjection, WithUpdate.
FilterExpression supports all the same functions and formats as ConditionExpression. Therefore, ConditionBuilder represents both types of expressions. As a result, WithCondition and WithFilter accept an instance of ConditionBuilder.
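To make the benefit of a builder concrete, here is a toy model of the idea, written with the standard library only. It is not the expression package and does not reproduce its API or its internals: the type `toyBuilder`, its methods, and the placeholder format are all invented for this illustration. It only shows how generating placeholders removes the hand-written names and values maps.

```go
package main

import (
	"fmt"
	"strings"
)

// toyBuilder collects equality conditions and generates the placeholders
// that would otherwise be typed by hand. It is NOT the SDK's Builder.
type toyBuilder struct {
	parts  []string
	names  map[string]string // placeholder -> attribute name
	values map[string]string // placeholder -> value (strings only here)
}

func newToyBuilder() *toyBuilder {
	return &toyBuilder{names: map[string]string{}, values: map[string]string{}}
}

// equal adds "attribute = value" using freshly numbered placeholders.
func (b *toyBuilder) equal(attribute, value string) *toyBuilder {
	n := fmt.Sprintf("#%d", len(b.names))
	v := fmt.Sprintf(":%d", len(b.values))
	b.names[n] = attribute
	b.values[v] = value
	b.parts = append(b.parts, n+" = "+v)
	return b
}

// build joins all collected conditions with AND.
func (b *toyBuilder) build() string {
	return strings.Join(b.parts, " AND ")
}

func main() {
	// Hypothetical attribute names and values.
	b := newToyBuilder().equal("PK", "INVOICE#1").equal("SK", "HEADER")
	fmt.Println(b.build())
	fmt.Println(b.names)
	fmt.Println(b.values)
}
```

The caller states what to compare, and the expression string and both maps stay in sync automatically, which is the property that makes the real builders less error-prone than manual maps.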
Expressions and builders usage
The following table shows what expressions and builders are used in different DynamoDB operations.
BatchGetItem requires a RequestItems map, where the key is a table name and the value is a definition of the items to get from that table. The expressions and builders listed in the table are used in that definition.
TransactGetItems accepts a list of Get items. Each Get item is described using expressions and a corresponding builder.
TransactWriteItems accepts a list of items, each of which is a ConditionCheck, Delete, Put, or Update.
PutItem and Put require an instance of the Item structure that must contain at least a primary key. Condition expressions are optional for the Put operation.
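A common use of an optional condition on Put is to refuse to overwrite an existing item, for example with DynamoDB's `attribute_not_exists` function. The sketch below models that behaviour with a Go map instead of a real table; the function and error names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errConditionFailed = errors.New("conditional check failed")

type attrs map[string]string

// putIfAbsent mimics a put guarded by a condition such as
// attribute_not_exists(PK): it refuses to overwrite an existing item.
func putIfAbsent(table map[string]attrs, key string, item attrs) error {
	if _, exists := table[key]; exists {
		return errConditionFailed
	}
	table[key] = item
	return nil
}

func main() {
	table := map[string]attrs{}
	fmt.Println(putIfAbsent(table, "INVOICE#1", attrs{"status": "draft"})) // first write succeeds
	fmt.Println(putIfAbsent(table, "INVOICE#1", attrs{"status": "paid"}))  // second write is rejected
	fmt.Println(table["INVOICE#1"]["status"])                              // the original item is kept
}
```

Without the condition, the second call would silently replace the first item, as described in the Put vs Update section.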
As you can see, the expression package meets most of the requirements to prepare input for the DynamoDB operations. But there’s a missing feature: an expression for a primary key. When defining an operation input that requires a primary key, the key has to be built manually (at least at the time of writing). Here’s a list of operations that require a primary key:
- GetItem
- DeleteItem
- UpdateItem
- BatchGetItem, BatchWriteItem, TransactGetItems, TransactWriteItems — every request item must define a primary key
Examples
For this article, I created a DynamoDB table that stores invoice information. Every invoice consists of one header and zero-to-many line items. I’m using one partition key for all items related to the same invoice. It guarantees that the invoice data is stored together in the same partition, which reduces operation latency. I’m using a sort key to differentiate between the invoice header and line items.
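One possible key layout for this design is sketched below. The actual key format used in the sample project is not shown in this article, so the `INVOICE#`, `HEADER`, and `ITEM#` conventions here are assumptions, as are the type and function names.

```go
package main

import (
	"fmt"
	"sort"
)

type record struct {
	PK string // partition key, shared by everything in one invoice
	SK string // sort key, tells the header and line items apart
}

// headerKey returns the key of the invoice header.
func headerKey(invoiceID string) record {
	return record{PK: "INVOICE#" + invoiceID, SK: "HEADER"}
}

// lineItemKey returns the key of the n-th line item. The number is
// zero-padded so that string ordering matches numeric ordering.
func lineItemKey(invoiceID string, n int) record {
	return record{PK: "INVOICE#" + invoiceID, SK: fmt.Sprintf("ITEM#%04d", n)}
}

func main() {
	records := []record{lineItemKey("1001", 2), headerKey("1001"), lineItemKey("1001", 1)}
	// Order the records by sort key, as items within one partition would be.
	sort.Slice(records, func(i, j int) bool { return records[i].SK < records[j].SK })
	for _, r := range records {
		fmt.Println(r.PK, r.SK)
	}
}
```

With a layout like this, one Query on the partition key returns the header and all line items of an invoice in a single request.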
Create record
The snippet above consists of the following parts:
- convert invoice item structure to DynamoDB attributes map (L3)
- define PutItemInput (L8)
- execute put item operation (L13)
Create multiple records in one transaction
PutItem is suitable when you need to write only one item. But in some cases, multiple records must be written in one transaction. For example, it is better to store an invoice and all its line items in one transaction.
The snippet above consists of the following parts:
- convert invoice structure to DynamoDB attributes map (L3)
- initiate transaction entries slice and add the invoice to it (L8–9)
- convert invoice item structure to DynamoDB attributes map (L15)
- add the item to the transaction entries slice (L20)
- define and validate transaction input (L26–27)
- execute the transaction (L31)
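One rule worth checking before a transaction is sent is the one mentioned earlier: no two actions may target the same item. The following sketch is an extra client-side check written for illustration only; it is not part of the sample project or the SDK, and the `key` type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// key identifies an item by its partition and sort key values.
type key struct{ PK, SK string }

var errDuplicateKey = errors.New("two actions target the same item")

// checkUnique reports an error if any key appears more than once.
func checkUnique(keys []key) error {
	seen := make(map[key]bool, len(keys))
	for _, k := range keys {
		if seen[k] {
			return fmt.Errorf("%w: %s / %s", errDuplicateKey, k.PK, k.SK)
		}
		seen[k] = true
	}
	return nil
}

func main() {
	entries := []key{
		{"INVOICE#1", "HEADER"},
		{"INVOICE#1", "ITEM#0001"},
		{"INVOICE#1", "ITEM#0001"}, // duplicate on purpose
	}
	if err := checkUnique(entries); err != nil {
		fmt.Println("transaction rejected locally:", err)
	}
}
```

Catching the duplicate locally avoids a round trip that DynamoDB would reject anyway.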
Update record
The snippet above consists of the following parts:
- construct item’s primary key (L5–9)
- define an expression for update (L14–17)
- use the expression to build UpdateItemInput (L22)
- execute update item operation (L30)
Update multiple records in one transaction
The snippet above consists of the following parts:
- define an expression for update (L9–12)
- initiate transaction entries slice (L17)
- construct item’s primary key (L19–23)
- use the expression and item’s primary key to build an Update entry (L28)
- add Update entry to transaction slice (L35)
- define and validate transaction input (L38–39)
- execute the transaction (L43)
Get record
The snippet above consists of the following parts:
- construct item’s primary key (L3–7)
- prepare and build expression (L12–13)
- use the expression to define GetItemInput (L18)
- get the item (L25)
- unmarshal the result into the product structure (L30)
In this example, an item is retrieved by the primary key. A projection expression defines the operation output so it matches the Product structure.
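The effect of a projection can be modelled in a few lines of plain Go. This is a conceptual sketch with made-up attribute names, not SDK code: the real projection is applied by DynamoDB before the item is returned.

```go
package main

import "fmt"

// project returns a copy of item that keeps only the named attributes,
// which is roughly what a projection expression asks DynamoDB to do.
func project(item map[string]string, names ...string) map[string]string {
	out := make(map[string]string, len(names))
	for _, n := range names {
		if v, ok := item[n]; ok {
			out[n] = v
		}
	}
	return out
}

func main() {
	// Hypothetical stored item with more attributes than the caller needs.
	stored := map[string]string{
		"PK":    "INVOICE#1",
		"SK":    "ITEM#0001",
		"name":  "Widget",
		"price": "9.99",
		"notes": "internal only",
	}
	fmt.Println(project(stored, "name", "price"))
}
```

Listing only the attributes that the target structure has fields for keeps the response small and makes the mapping to the structure explicit.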
Get records (using Query)
The snippet above consists of the following parts:
- prepare and build an expression containing a filter and key conditions (L5–12)
- use the expression to define QueryInput (L20)
- execute query (L28)
- unmarshal query result to items slice (L37)
In this example, all items belonging to one invoice are read from the table first, then filtered by status, and the matching items are returned to the client.
Get records (using Scan)
The snippet above consists of the following parts:
- prepare and build filter expression (L3–8)
- use filter expression when defining ScanInput (L13)
- execute scan (L20)
- unmarshal scan result to items slice (L29)
In this example, all items are read from the table and then filtered by status.
Delete record
The snippet above consists of the following parts:
- construct item’s primary key (L3–7)
- define DeleteItemInput (L12)
- call delete item operation (L17)
That’s all for today. Thanks for reading. The sample service is available in the go-dynamodb repository.
References
- NoSQL Design for DynamoDB
- The What, Why, and When of Single-Table Design with DynamoDB
- From relational DB to single DynamoDB table: a step-by-step exploration
- Simulating Amazon DynamoDB unique constraints using transactions
- Amazon DynamoDB Examples Using the AWS SDK for Go
- go-dynamodb — sample service source code on GitHub
DynamoDB, Expressions, and Go was originally published in Better Programming on Medium.