DynamoDB
Basic Concepts
- Table: A collection of data.
- Item: A single record in a table.
- Attribute: A single piece of data in an item.
- Primary Key: A unique identifier for an item in a table.
Note: DynamoDB has several reserved words that cannot be used as attribute names. See the full list here.
Primary Key
You can have simple or composite primary keys.
- Simple Primary Key: Consists of a single attribute (the partition key).
- Composite Primary Key: Consists of two attributes (the partition key and the sort key).
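As a rough illustration of the two key shapes, the sketch below builds the `KeySchema` list in the form a CreateTable request uses, where `HASH` marks the partition key and `RANGE` marks the sort key. The helper name `key_schema` and the attribute names are hypothetical, and no AWS call is made.

```python
def key_schema(partition_key, sort_key=None):
    """Build a CreateTable-style KeySchema list (hypothetical helper)."""
    # The partition key is always present and uses KeyType HASH.
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # A composite key adds a second attribute with KeyType RANGE (the sort key).
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


simple = key_schema("user_id")                   # simple primary key
composite = key_schema("user_id", "order_date")  # composite primary key
```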
Query vs. Scan
- Query: Retrieves items by primary key. The partition key must be matched exactly.
  - Allows comparison operators on the sort key
  - Fast
- Scan: Reads the entire table
  - Slow, looks at all items
  - Filters are applied after the scan to refine the results
  - Not efficient (or practical) for large tables
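The difference can be sketched with a toy in-memory model. This is not the DynamoDB API: the `table` layout, the `query` and `scan` functions, and the sample items are all made up for illustration. It only shows that a query touches one partition, while a scan reads everything and filters afterwards.

```python
from collections import defaultdict

# Toy in-memory model (not the DynamoDB API): items grouped by partition key.
table = defaultdict(list)
for item in [
    {"user_id": "u1", "order_date": "2024-01-05", "total": 30},
    {"user_id": "u1", "order_date": "2024-03-12", "total": 75},
    {"user_id": "u2", "order_date": "2024-02-20", "total": 75},
]:
    table[item["user_id"]].append(item)


def query(partition_key, sort_key_from):
    """Look at one partition only and compare on the sort key.

    Returns (matches, items_examined).
    """
    partition = table.get(partition_key, [])
    matches = [i for i in partition if i["order_date"] >= sort_key_from]
    return matches, len(partition)


def scan(predicate):
    """Read every item in the table, then apply the filter.

    Returns (matches, items_examined).
    """
    everything = [i for items in table.values() for i in items]
    return [i for i in everything if predicate(i)], len(everything)


q_matches, q_examined = query("u1", "2024-02-01")
s_matches, s_examined = scan(lambda i: i["total"] == 75)
```

In this toy model the query examines only the two items in partition `u1`, while the scan examines all three items even though the filter keeps only two of them.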
Consistency
- Eventually Consistent Read (default): Reads may not reflect the latest write.
- Strongly Consistent Read: Reads reflect the latest write.
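In the API, the choice is made per request through the `ConsistentRead` flag on read operations such as GetItem and Query. Below is a minimal sketch of the request parameters. The helper `get_item_params`, the table name, and the key are hypothetical, and the dict is only built here, not sent.

```python
def get_item_params(table_name, key, strongly_consistent=False):
    """Build GetItem-style request parameters (hypothetical helper)."""
    return {
        "TableName": table_name,
        "Key": key,
        # False (the default) requests an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


params = get_item_params(
    "Orders",
    {"user_id": {"S": "u1"}},
    strongly_consistent=True,
)
```

With boto3's low-level client, a dict like this would typically be passed as `client.get_item(**params)`.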
Throughput
- Read Capacity Units (RCUs): Units of provisioned read throughput, measured per second.
- Write Capacity Units (WCUs): Units of provisioned write throughput, measured per second.
Per the AWS DynamoDB documentation:
RCUs
A single RCU represents one strongly consistent read per second, or two eventually consistent reads per second, for items up to 4 KB in size.
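Math (from Nick Garner's course):
$$\text{RCUs} = c \times r \times \left\lceil \frac{s}{4} \right\rceil$$
Where $s$ is the item size in KB (rounded up to the next multiple of 4 KB), $r$ is the number of reads per second, and $c$ is 1 for strongly consistent reads or 0.5 for eventually consistent reads.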
Therefore, to compute the RCUs needed for a given read rate and item size, we can use the following Python function:
    import math

    def compute_rcus(reads_per_sec, item_size_kb, consistent=True):
        # Reads are billed in 4 KB blocks; eventually consistent reads cost half.
        blocks = math.ceil(item_size_kb / 4)
        return math.ceil(reads_per_sec * blocks * (1 if consistent else 0.5))
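As a worked example, assume 80 reads per second of 8 KB items (figures chosen purely for illustration). Each strongly consistent read then needs two 4 KB units:

```latex
\begin{aligned}
8\ \text{KB} / 4\ \text{KB} &= 2\ \text{RCUs per strongly consistent read} \\
80 \times 2 &= 160\ \text{RCUs (strongly consistent)} \\
160 \times 0.5 &= 80\ \text{RCUs (eventually consistent)}
\end{aligned}
```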
WCUs
Again, per AWS's DynamoDB documentation:
One WCU can perform one standard write request per second for items up to 1 KB in size.
Math:
$$\text{WCUs} = w \times \lceil s \rceil$$
Where $s$ is the item size in KB (rounded up to the next whole KB) and $w$ is the number of writes per second.
Therefore, to compute the WCUs needed for a given write rate and item size, we can use the following Python function:
    import math

    def compute_wcus(writes_per_sec, item_size_kb):
        # Writes are billed in 1 KB blocks, so round the item size up.
        return writes_per_sec * math.ceil(item_size_kb)
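As a worked example, assume 100 standard writes per second of 3 KB items (figures chosen purely for illustration):

```latex
\begin{aligned}
3\ \text{KB} / 1\ \text{KB} &= 3\ \text{WCUs per write} \\
100 \times 3 &= 300\ \text{WCUs}
\end{aligned}
```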
Scaling
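DynamoDB Auto Scaling is enabled by default for provisioned-capacity tables created in the AWS Management Console. Tables created through the CLI or an SDK need it configured explicitly.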
Secondary Indexes
- Local Secondary Index (LSI): An index that has the same partition key as the table, but a different sort key.
- Global Secondary Index (GSI): An index with a partition key and sort key that can be different from those on the table.
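To make the difference concrete, the sketch below builds two index entries in the general shape a CreateTable request uses for `LocalSecondaryIndexes` and `GlobalSecondaryIndexes`. The helper `index_definition`, the index names, and the attribute names are hypothetical. The base table is assumed to use `user_id` as partition key and `order_date` as sort key, and settings such as provisioned throughput for the GSI are left out.

```python
def index_definition(name, partition_key, sort_key=None):
    """Build an index entry in a CreateTable-like shape (hypothetical helper)."""
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return {
        "IndexName": name,
        "KeySchema": schema,
        # Project every table attribute into the index, for simplicity.
        "Projection": {"ProjectionType": "ALL"},
    }


# Assumed table keys: user_id (partition) and order_date (sort).
# LSI: same partition key as the table, different sort key.
lsi = index_definition("by_total", "user_id", "total")
# GSI: partition key (and optionally sort key) may differ from the table's.
gsi = index_definition("by_status", "status", "order_date")
```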
Local Secondary Index
TODO
Global Secondary Index
TODO
Useful Resources
- DynamoDBGuide.com
- DynamoDB Best Practices
- Rick Houlihan's talk on Advanced DynamoDB Design Patterns (2018)
- Rick Houlihan's talk on Advanced DynamoDB Design Patterns (2021)
Additional Reading
- DynamoDB Paper - Also available from Amazon or AllThingsDistributed