Amazon DynamoDB


Developer Guide

API Version 2012-08-10


Amazon DynamoDB: Developer Guide

Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon.


What Is Amazon DynamoDB?

Welcome to the Amazon DynamoDB Developer Guide.

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. DynamoDB lets you offload the administrative burdens of operating and scaling a distributed database so that you don't have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. DynamoDB also offers encryption at rest, which eliminates the operational burden and complexity involved in protecting sensitive data. For more information, see DynamoDB Encryption at Rest (p. 858).

With DynamoDB, you can create database tables that can store and retrieve any amount of data and serve any level of request traffic. You can scale up or scale down your tables' throughput capacity without downtime or performance degradation. You can use the AWS Management Console to monitor resource utilization and performance metrics.

DynamoDB provides on-demand backup capability. It allows you to create full backups of your tables for long-term retention and archival for regulatory compliance needs. For more information, see Using On-Demand Backup and Restore for DynamoDB (p. 682).

You can create on-demand backups and enable point-in-time recovery for your Amazon DynamoDB tables. Point-in-time recovery helps protect your tables from accidental write or delete operations. With point-in-time recovery, you can restore a table to any point in time during the last 35 days. For more information, see Point-in-Time Recovery: How It Works (p. 701).

DynamoDB allows you to delete expired items from tables automatically to help you reduce storage usage and the cost of storing data that is no longer relevant. For more information, see Expiring Items By Using DynamoDB Time to Live (TTL) (p. 436).

High Availability and Durability

DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability. You can use global tables to keep DynamoDB tables in sync across AWS Regions. For more information, see Global Tables: Multi-Region Replication with DynamoDB (p. 364).

Getting Started with DynamoDB

We recommend that you begin by reading the following sections:

• Amazon DynamoDB: How It Works (p. 2)—To learn essential DynamoDB concepts.

• Setting Up DynamoDB (p. 50)—To learn how to set up DynamoDB (the downloadable version or the web service).

• Accessing DynamoDB (p. 59)—To learn how to access DynamoDB using the console, AWS CLI, or API.


To get started quickly with DynamoDB, see Getting Started with DynamoDB and AWS SDKs (p. 84).

To learn more about application development, see the following:

• Programming with DynamoDB and the AWS SDKs (p. 213)

• Working with Tables, Items, Queries, Scans, and Indexes (p. 338)

To quickly find recommendations for maximizing performance and minimizing throughput costs, see Best Practices for Designing and Architecting with DynamoDB (p. 961). To learn how to tag DynamoDB resources, see Adding Tags and Labels to Resources (p. 387).

For best practices, how-to guides, and tools, see Amazon DynamoDB resources.

You can use AWS Database Migration Service (AWS DMS) to migrate data from a relational database or MongoDB to a DynamoDB table. For more information, see the AWS Database Migration Service User Guide.

To learn how to use MongoDB as a migration source, see Using MongoDB as a Source for AWS Database Migration Service. To learn how to use DynamoDB as a migration target, see Using an Amazon DynamoDB Database as a Target for AWS Database Migration Service.

Amazon DynamoDB: How It Works

The following sections provide an overview of Amazon DynamoDB service components and how they interact.

After you read this introduction, try working through the Creating Tables and Loading Data for Code Examples in DynamoDB (p. 328) section, which walks you through the process of creating sample tables, uploading data, and performing some basic database operations.

For language-specific tutorials with sample code, see Getting Started with DynamoDB and AWS SDKs (p. 84).

Topics

• Core Components of Amazon DynamoDB (p. 2)

• DynamoDB API (p. 10)

• Naming Rules and Data Types (p. 13)

• Read Consistency (p. 17)

• Read/Write Capacity Mode (p. 17)

• Table Classes (p. 21)

• Partitions and Data Distribution (p. 22)

Core Components of Amazon DynamoDB

In DynamoDB, tables, items, and attributes are the core components that you work with. A table is a collection of items, and each item is a collection of attributes. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. You can use DynamoDB Streams to capture data modification events in DynamoDB tables.

There are limits in DynamoDB. For more information, see Service, Account, and Table Quotas in Amazon DynamoDB (p. 1030).


Topics

• Tables, Items, and Attributes (p. 3)

• Primary Key (p. 6)

• Secondary Indexes (p. 7)

• DynamoDB Streams (p. 9)

Tables, Items, and Attributes

The following are the basic DynamoDB components:

• Tables – Similar to other database systems, DynamoDB stores data in tables. A table is a collection of data. For example, see the example table called People that you could use to store personal contact information about friends, family, or anyone else of interest. You could also have a Cars table to store information about vehicles that people drive.

• Items – Each table contains zero or more items. An item is a group of attributes that is uniquely identifiable among all of the other items. In a People table, each item represents a person. For a Cars table, each item represents one vehicle. Items in DynamoDB are similar in many ways to rows, records, or tuples in other database systems. In DynamoDB, there is no limit to the number of items you can store in a table.

• Attributes – Each item is composed of one or more attributes. An attribute is a fundamental data element, something that does not need to be broken down any further. For example, an item in a People table contains attributes called PersonID, LastName, FirstName, and so on. For a Department table, an item might have attributes such as DepartmentID, Name, Manager, and so on. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.

The following diagram shows a table named People with some example items and attributes.


Note the following about the People table:

• Each item in the table has a unique identifier, or primary key, that distinguishes the item from all of the others in the table. In the People table, the primary key consists of one attribute (PersonID).

• Other than the primary key, the People table is schemaless, which means that neither the attributes nor their data types need to be defined beforehand. Each item can have its own distinct attributes.

• Most of the attributes are scalar, which means that they can have only one value. Strings and numbers are common examples of scalars.


• Some of the items have a nested attribute (Address). DynamoDB supports nested attributes up to 32 levels deep.

The following is another example table named Music that you could use to keep track of your music collection.

Note the following about the Music table:

• The primary key for Music consists of two attributes (Artist and SongTitle). Each item in the table must have these two attributes. The combination of Artist and SongTitle distinguishes each item in the table from all of the others.


• Other than the primary key, the Music table is schemaless, which means that neither the attributes nor their data types need to be defined beforehand. Each item can have its own distinct attributes.

• One of the items has a nested attribute (PromotionInfo), which contains other nested attributes. DynamoDB supports nested attributes up to 32 levels deep.

For more information, see Working with Tables and Data in DynamoDB (p. 338).

Primary Key

When you create a table, in addition to the table name, you must specify the primary key of the table. The primary key uniquely identifies each item in the table, so that no two items can have the same key.

DynamoDB supports two different kinds of primary keys:

• Partition key – A simple primary key, composed of one attribute known as the partition key.

DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored.

In a table that has only a partition key, no two items can have the same partition key value.

The People table described in Tables, Items, and Attributes (p. 3) is an example of a table with a simple primary key (PersonID). You can access any item in the People table directly by providing the PersonID value for that item.

• Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key.

DynamoDB uses the partition key value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored. All items with the same partition key value are stored together, in sorted order by sort key value.

In a table that has a partition key and a sort key, it's possible for multiple items to have the same partition key value. However, those items must have different sort key values.

The Music table described in Tables, Items, and Attributes (p. 3) is an example of a table with a composite primary key (Artist and SongTitle). You can access any item in the Music table directly, if you provide the Artist and SongTitle values for that item.

A composite primary key gives you additional flexibility when querying data. For example, if you provide only the value for Artist, DynamoDB retrieves all of the songs by that artist. To retrieve only a subset of songs by a particular artist, you can provide a value for Artist along with a range of values for SongTitle.

Note

The partition key of an item is also known as its hash attribute. The term hash attribute derives from the use of an internal hash function in DynamoDB that evenly distributes data items across partitions, based on their partition key values.

The sort key of an item is also known as its range attribute. The term range attribute derives from the way DynamoDB stores items with the same partition key physically close together, in sorted order by the sort key value.

Each primary key attribute must be a scalar (meaning that it can hold only a single value). The only data types allowed for primary key attributes are string, number, or binary. There are no such restrictions for other, non-key attributes.
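The relationship between partition keys, sort keys, and partitions described above can be pictured with a small in-memory model. This is only a sketch: the real hash function and partition count are internal to DynamoDB, so the SHA-256 hash, the `NUM_PARTITIONS` constant, and the sample song values below are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count


def partition_for(partition_key: str) -> int:
    """Map a partition key value to a partition number (illustrative hash only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def store(items):
    """Group items by partition, keeping each partition ordered by (Artist, SongTitle)."""
    partitions = defaultdict(list)
    for item in items:
        partitions[partition_for(item["Artist"])].append(item)
    for stored in partitions.values():
        # Items that share a partition key end up adjacent, sorted by sort key.
        stored.sort(key=lambda i: (i["Artist"], i["SongTitle"]))
    return partitions


# Hypothetical sample items for the Music table.
songs = [
    {"Artist": "No One You Know", "SongTitle": "My Dog Spot"},
    {"Artist": "No One You Know", "SongTitle": "Howdy"},
    {"Artist": "The Acme Band", "SongTitle": "Still in Love"},
]
layout = store(songs)
```

Both songs by the same artist land in the same partition and are stored in sort key order, which is what makes a range condition on SongTitle cheap to evaluate.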


Secondary Indexes

You can create one or more secondary indexes on a table. A secondary index lets you query the data in the table using an alternate key, in addition to queries against the primary key. DynamoDB doesn't require that you use indexes, but they give your applications more flexibility when querying your data.

After you create a secondary index on a table, you can read data from the index in much the same way as you do from the table.

DynamoDB supports two kinds of indexes:

• Global secondary index – An index with a partition key and sort key that can be different from those on the table.

• Local secondary index – An index that has the same partition key as the table, but a different sort key.

Each table in DynamoDB has a quota of 20 global secondary indexes (default quota) and 5 local secondary indexes.

In the example Music table shown previously, you can query data items by Artist (partition key) or by Artist and SongTitle (partition key and sort key). What if you also wanted to query the data by Genre and AlbumTitle? To do this, you could create an index on Genre and AlbumTitle, and then query the index in much the same way as you'd query the Music table.

The following diagram shows the example Music table, with a new index called GenreAlbumTitle. In the index, Genre is the partition key and AlbumTitle is the sort key.


Note the following about the GenreAlbumTitle index:

• Every index belongs to a table, which is called the base table for the index. In the preceding example, Music is the base table for the GenreAlbumTitle index.

• DynamoDB maintains indexes automatically. When you add, update, or delete an item in the base table, DynamoDB adds, updates, or deletes the corresponding item in any indexes that belong to that table.

• When you create an index, you specify which attributes will be copied, or projected, from the base table to the index. At a minimum, DynamoDB projects the key attributes from the base table into the index. This is the case with GenreAlbumTitle, where only the key attributes from the Music table are projected into the index.


You can query the GenreAlbumTitle index to find all albums of a particular genre (for example, all Rock albums). You can also query the index to find all albums within a particular genre that have certain album titles (for example, all Country albums with titles that start with the letter H).

For more information, see Improving Data Access with Secondary Indexes (p. 552).
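As a rough illustration of the two index queries just described, the following sketch builds the parameters for a low-level Query request against the GenreAlbumTitle index. The helper name `genre_query` and the placeholder names `:genre` and `:prefix` are assumptions; the resulting dictionary is meant to be handed to whichever SDK you use (for example, as keyword arguments to a low-level client's query call).

```python
def genre_query(genre, title_prefix=None):
    """Build Query parameters for the GenreAlbumTitle index on the Music table."""
    params = {
        "TableName": "Music",
        "IndexName": "GenreAlbumTitle",
        # The partition key of the index (Genre) must be matched exactly.
        "KeyConditionExpression": "Genre = :genre",
        "ExpressionAttributeValues": {":genre": {"S": genre}},
    }
    if title_prefix is not None:
        # Optional condition on the index sort key (AlbumTitle).
        params["KeyConditionExpression"] += " AND begins_with(AlbumTitle, :prefix)"
        params["ExpressionAttributeValues"][":prefix"] = {"S": title_prefix}
    return params


all_rock = genre_query("Rock")
country_h = genre_query("Country", title_prefix="H")
```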

DynamoDB Streams

DynamoDB Streams is an optional feature that captures data modification events in DynamoDB tables. The data about these events appear in the stream in near-real time, and in the order that the events occurred.

Each event is represented by a stream record. If you enable a stream on a table, DynamoDB Streams writes a stream record whenever one of the following events occurs:

• A new item is added to the table: The stream captures an image of the entire item, including all of its attributes.

• An item is updated: The stream captures the "before" and "after" image of any attributes that were modified in the item.

• An item is deleted from the table: The stream captures an image of the entire item before it was deleted.

Each stream record also contains the name of the table, the event timestamp, and other metadata.

Stream records have a lifetime of 24 hours; after that, they are automatically removed from the stream.

You can use DynamoDB Streams together with AWS Lambda to create a trigger, which is code that runs automatically whenever an event of interest appears in a stream. For example, consider a Customers table that contains customer information for a company. Suppose that you want to send a "welcome" email to each new customer. You could enable a stream on that table, and then associate the stream with a Lambda function. The Lambda function would run whenever a new stream record appears, but only process new items added to the Customers table. For any item that has an EmailAddress attribute, the Lambda function would invoke Amazon Simple Email Service (Amazon SES) to send an email to that address.


Note

In this example, the last customer, Craig Roe, will not receive an email because he doesn't have an EmailAddress.
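The filtering logic of the welcome-email trigger might look like the following sketch of a Lambda handler. It assumes the stream is configured so that records include the new item image, and it only prints the addresses it would contact; the actual Amazon SES call is left out. The function name `new_customer_emails` is hypothetical.

```python
def new_customer_emails(event):
    """Return the email addresses of newly inserted items that have an EmailAddress."""
    addresses = []
    for record in event.get("Records", []):
        # Only newly added items are of interest; skip updates and deletes.
        if record.get("eventName") != "INSERT":
            continue
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        email = new_image.get("EmailAddress", {}).get("S")
        if email:  # items without an EmailAddress are skipped
            addresses.append(email)
    return addresses


def lambda_handler(event, context):
    for address in new_customer_emails(event):
        # Placeholder: a real trigger would call Amazon SES here.
        print(f"would send welcome email to {address}")
```

Keeping the record filtering in a separate function makes it easy to test with a hand-built event before wiring the handler to a real stream.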

In addition to triggers, DynamoDB Streams enables powerful solutions such as data replication within and across AWS Regions, materialized views of data in DynamoDB tables, data analysis using Kinesis materialized views, and much more.

For more information, see Change Data Capture for DynamoDB Streams (p. 654).

DynamoDB API

To work with Amazon DynamoDB, your application must use a few simple API operations. The following is a summary of these operations, organized by category.

Topics

• Control Plane (p. 10)

• Data Plane (p. 11)

• DynamoDB Streams (p. 12)

• Transactions (p. 12)

Control Plane

Control plane operations let you create and manage DynamoDB tables. They also let you work with indexes, streams, and other objects that are dependent on tables.

• CreateTable – Creates a new table. Optionally, you can create one or more secondary indexes, and enable DynamoDB Streams for the table.


• DescribeTable – Returns information about a table, such as its primary key schema, throughput settings, and index information.

• ListTables – Returns the names of all of your tables in a list.

• UpdateTable – Modifies the settings of a table or its indexes, creates or removes indexes on a table, or modifies DynamoDB Streams settings for a table.

• DeleteTable – Removes a table and all of its dependent objects from DynamoDB.

Data Plane

Data plane operations let you perform create, read, update, and delete (also called CRUD) actions on data in a table. Some of the data plane operations also let you read data from a secondary index.

You can use PartiQL - A SQL-Compatible Query Language for Amazon DynamoDB (p. 523) to perform these CRUD operations, or you can use DynamoDB's classic CRUD APIs, which separate each operation into a distinct API call.

PartiQL - A SQL-Compatible Query Language

• ExecuteStatement – Reads multiple items from a table. You can also write or update a single item in a table. When writing or updating a single item, you must specify the primary key attributes.

• BatchExecuteStatement – Writes, updates, or reads multiple items from a table. This is more efficient than ExecuteStatement because your application only needs a single network round trip to write or read the items.

Classic APIs

Creating Data

• PutItem – Writes a single item to a table. You must specify the primary key attributes, but you don't have to specify other attributes.

• BatchWriteItem – Writes up to 25 items to a table. This is more efficient than calling PutItem multiple times because your application only needs a single network round trip to write the items. You can also use BatchWriteItem for deleting multiple items from one or more tables.

Reading Data

• GetItem – Retrieves a single item from a table. You must specify the primary key for the item that you want. You can retrieve the entire item, or just a subset of its attributes.

• BatchGetItem – Retrieves up to 100 items from one or more tables. This is more efficient than calling GetItem multiple times because your application only needs a single network round trip to read the items.

• Query – Retrieves all items that have a specific partition key. You must specify the partition key value. You can retrieve entire items, or just a subset of their attributes. Optionally, you can apply a condition to the sort key values so that you only retrieve a subset of the data that has the same partition key. You can use this operation on a table, provided that the table has both a partition key and a sort key. You can also use this operation on an index, provided that the index has both a partition key and a sort key.

• Scan – Retrieves all items in the specified table or index. You can retrieve entire items, or just a subset of their attributes. Optionally, you can apply a filtering condition to return only the values that you are interested in and discard the rest.


Updating Data

• UpdateItem – Modifies one or more attributes in an item. You must specify the primary key for the item that you want to modify. You can add new attributes and modify or remove existing attributes. You can also perform conditional updates, so that the update is only successful when a user-defined condition is met. Optionally, you can implement an atomic counter, which increments or decrements a numeric attribute without interfering with other write requests.

Deleting Data

• DeleteItem – Deletes a single item from a table. You must specify the primary key for the item that you want to delete.

• BatchWriteItem – Deletes up to 25 items from one or more tables. This is more efficient than calling DeleteItem multiple times because your application only needs a single network round trip to delete the items. You can also use BatchWriteItem for adding multiple items to one or more tables.
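Because BatchWriteItem accepts at most 25 items per call, applications that load larger collections typically split them into batches first. The following is a minimal sketch of that splitting step; the helper name `batch_write_requests` is hypothetical, and each item is assumed to already be in DynamoDB's typed attribute format.

```python
def batch_write_requests(table_name, items, batch_size=25):
    """Split items into BatchWriteItem request bodies of at most 25 puts each."""
    if not 1 <= batch_size <= 25:
        raise ValueError("batch_size must be between 1 and 25")
    requests = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        requests.append({
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        })
    return requests


# 60 hypothetical People items become three requests (25 + 25 + 10 puts).
people = [{"PersonID": {"N": str(n)}} for n in range(60)]
requests = batch_write_requests("People", people)
```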

DynamoDB Streams

DynamoDB Streams operations let you enable or disable a stream on a table, and allow access to the data modification records contained in a stream.

• ListStreams – Returns a list of all your streams, or just the stream for a specific table.

• DescribeStream – Returns information about a stream, such as its Amazon Resource Name (ARN) and where your application can begin reading the first few stream records.

• GetShardIterator – Returns a shard iterator, which is a data structure that your application uses to retrieve the records from the stream.

• GetRecords – Retrieves one or more stream records, using a given shard iterator.

Transactions

Transactions provide atomicity, consistency, isolation, and durability (ACID), enabling you to maintain data correctness in your applications more easily.

You can use PartiQL - A SQL-Compatible Query Language for Amazon DynamoDB (p. 523) to perform transactional operations, or you can use DynamoDB's classic CRUD APIs, which separate each operation into a distinct API call.

PartiQL - A SQL-Compatible Query Language

• ExecuteTransaction – A batch operation that allows CRUD operations on multiple items both within and across tables with a guaranteed all-or-nothing result.

Classic APIs

• TransactWriteItems – A batch operation that allows Put, Update, and Delete operations on multiple items both within and across tables with a guaranteed all-or-nothing result.

• TransactGetItems – A batch operation that allows Get operations to retrieve multiple items from one or more tables.


Naming Rules and Data Types

This section describes the Amazon DynamoDB naming rules and the various data types that DynamoDB supports. There are limits that apply to data types. For more information, see Data Types (p. 1034).

Topics

• Naming Rules (p. 13)

• Data Types (p. 13)

Naming Rules

Tables, attributes, and other objects in DynamoDB must have names. Names should be meaningful and concise—for example, names such as Products, Books, and Authors are self-explanatory.

The following are the naming rules for DynamoDB:

• All names must be encoded using UTF-8, and are case-sensitive.

• Table names and index names must be between 3 and 255 characters long, and can contain only the following characters:

• a-z

• A-Z

• 0-9

• _ (underscore)

• - (dash)

• . (dot)

• Attribute names must be at least one character long, but no greater than 64 KB long.

The following are the exceptions. These attribute names must be no greater than 255 characters long:

• Secondary index partition key names.

• Secondary index sort key names.

• The names of any user-specified projected attributes (applicable only to local secondary indexes).
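The naming rules above translate fairly directly into validation code. The following sketch checks table or index names and attribute names; the function names are hypothetical, and the 64 KB limit is interpreted here as 65,536 bytes of UTF-8, which is an assumption.

```python
import re

# Allowed characters and length for table and index names.
_TABLE_NAME = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def is_valid_table_name(name: str) -> bool:
    """Check a table or index name against the character and length rules."""
    return _TABLE_NAME.fullmatch(name) is not None


def is_valid_attribute_name(name: str, index_key_or_projected: bool = False) -> bool:
    """Check an attribute name; index keys and projected attributes use the stricter limit."""
    if len(name) < 1:
        return False
    if index_key_or_projected:
        return len(name) <= 255
    # Assumption: "64 KB" is measured in bytes of the UTF-8 encoding.
    return len(name.encode("utf-8")) <= 64 * 1024
```

For example, `is_valid_table_name("Music")` is accepted, while a name that is too short or contains a space is rejected.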

Reserved Words and Special Characters

DynamoDB has a list of reserved words and special characters. For a complete list of reserved words in DynamoDB, see Reserved Words in DynamoDB (p. 1097). Also, the following characters have special meaning in DynamoDB: # (hash) and : (colon).

Although DynamoDB allows you to use these reserved words and special characters for names, we recommend that you avoid doing so because you have to define placeholder variables whenever you use these names in an expression. For more information, see Expression Attribute Names in DynamoDB (p. 416).

Data Types

DynamoDB supports many different data types for attributes within a table. They can be categorized as follows:

• Scalar Types – A scalar type can represent exactly one value. The scalar types are number, string, binary, Boolean, and null.

• Document Types – A document type can represent a complex structure with nested attributes, such as you would find in a JSON document. The document types are list and map.


• Set Types – A set type can represent multiple scalar values. The set types are string set, number set, and binary set.

When you create a table or a secondary index, you must specify the names and data types of each primary key attribute (partition key and sort key). Furthermore, each primary key attribute must be defined as type string, number, or binary.

DynamoDB is a NoSQL database and is schemaless. This means that, other than the primary key attributes, you don't have to define any attributes or data types when you create tables. By comparison, relational databases require you to define the names and data types of each column when you create a table.
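To make the contrast with relational schemas concrete, the following sketch shows what a CreateTable request for the Music table could contain: only the two key attributes are declared. The helper name is hypothetical, and the `BillingMode` value is an assumption chosen so that the request needs no throughput settings.

```python
def create_music_table_params():
    """Build CreateTable parameters that declare only the primary key attributes."""
    return {
        "TableName": "Music",
        # Only key attributes are defined; all other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "Artist", "AttributeType": "S"},
            {"AttributeName": "SongTitle", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "Artist", "KeyType": "HASH"},      # partition key
            {"AttributeName": "SongTitle", "KeyType": "RANGE"},  # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


params = create_music_table_params()
```

Attributes such as AlbumTitle or Genre never appear in the table definition; they are simply included in the items that use them.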

The following are descriptions of each data type, along with examples in JSON format.

Scalar Types

The scalar types are number, string, binary, Boolean, and null.

Number

Numbers can be positive, negative, or zero. Numbers can have up to 38 digits of precision. Exceeding this results in an exception.

• Positive range: 1E-130 to 9.9999999999999999999999999999999999999E+125

• Negative range: -9.9999999999999999999999999999999999999E+125 to -1E-130

In DynamoDB, numbers are represented as variable length. Leading and trailing zeroes are trimmed.

All numbers are sent across the network to DynamoDB as strings, to maximize compatibility across languages and libraries. However, DynamoDB treats them as number type attributes for mathematical operations.

Note

If number precision is important, you should pass numbers to DynamoDB using strings that you convert from the number type.

You can use the number data type to represent a date or a timestamp. One way to do this is by using epoch time—the number of seconds since 00:00:00 UTC on 1 January 1970. For example, the epoch time 1437136300 represents 12:31:40 PM UTC on 17 July 2015.

For more information, see http://en.wikipedia.org/wiki/Unix_time.
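The two points above, sending numbers as strings and storing timestamps as epoch seconds, can be sketched in a few lines. The helper names are hypothetical; `Decimal` is used here as one way to avoid binary floating-point rounding before the value is converted to a string.

```python
from datetime import datetime, timezone
from decimal import Decimal


def number_attribute(value: Decimal) -> dict:
    """Represent a number as a DynamoDB number attribute (sent as a string)."""
    return {"N": str(value)}


def epoch_to_utc(seconds: int) -> datetime:
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


price = number_attribute(Decimal("0.1") + Decimal("0.2"))  # exact decimal arithmetic
moment = epoch_to_utc(1437136300)  # 12:31:40 UTC on 17 July 2015
```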

String

Strings are Unicode with UTF-8 binary encoding. The minimum length of a string can be zero, if the attribute is not used as a key for an index or table, and is constrained by the maximum DynamoDB item size limit of 400 KB.

The following additional constraints apply to primary key attributes that are defined as type string:

• For a simple primary key, the maximum length of the first attribute value (the partition key) is 2048 bytes.

• For a composite primary key, the maximum length of the second attribute value (the sort key) is 1024 bytes.

DynamoDB collates and compares strings using the bytes of the underlying UTF-8 string encoding. For example, "a" (0x61) is greater than "A" (0x41), and "¿" (0xC2BF) is greater than "z" (0x7A).
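The byte-wise ordering described above differs from dictionary order, and it can be reproduced locally by sorting on the UTF-8 encoding of each string. The helper name `utf8_sort` is hypothetical.

```python
def utf8_sort(strings):
    """Sort strings by the bytes of their UTF-8 encoding."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


ordered = utf8_sort(["z", "a", "¿", "A"])
# Uppercase letters sort before lowercase ones, and multi-byte
# characters such as "¿" (0xC2 0xBF) sort after all ASCII characters.
```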


You can use the string data type to represent a date or a timestamp. One way to do this is by using ISO 8601 strings, as shown in these examples:

• 2016-02-15

• 2015-12-21T17:42:34Z

• 20150311T122706Z

For more information, see http://en.wikipedia.org/wiki/ISO_8601.
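One way to produce such strings is sketched below. The helper name `to_iso8601` is hypothetical, and the input is assumed to be a timezone-aware datetime. Using a single fixed-width UTC format has the side benefit that the resulting strings sort in chronological order.

```python
from datetime import datetime, timezone


def to_iso8601(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = to_iso8601(datetime(2015, 12, 21, 17, 42, 34, tzinfo=timezone.utc))
```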

Binary

Binary type attributes can store any binary data, such as compressed text, encrypted data, or images.

Whenever DynamoDB compares binary values, it treats each byte of the binary data as unsigned.

The length of a binary attribute can be zero, if the attribute is not used as a key for an index or table, and is constrained by the maximum DynamoDB item size limit of 400 KB.

If you define a primary key attribute as a binary type attribute, the following additional constraints apply:

• For a simple primary key, the maximum length of the first attribute value (the partition key) is 2048 bytes.

• For a composite primary key, the maximum length of the second attribute value (the sort key) is 1024 bytes.

Your applications must encode binary values in base64-encoded format before sending them to DynamoDB. Upon receipt of these values, DynamoDB decodes the data into an unsigned byte array and uses the length of that array as the length of the binary attribute.

The following example is a binary attribute, using base64-encoded text.

dGhpcyB0ZXh0IGlzIGJhc2U2NC1lbmNvZGVk
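Decoding the example value shows the distinction between the encoded text and the bytes that count toward the attribute's length. This is a small sketch using only the standard library.

```python
import base64

encoded = "dGhpcyB0ZXh0IGlzIGJhc2U2NC1lbmNvZGVk"

# The decoded bytes are what DynamoDB stores and measures.
raw = base64.b64decode(encoded)
stored_length = len(raw)  # shorter than the 36-character encoded form

# Encoding the raw bytes again yields the value to send in a request.
round_trip = base64.b64encode(raw).decode("ascii")
```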

Boolean

A Boolean type attribute can store either true or false.

Null

Null represents an attribute with an unknown or undefined state.

Document Types

The document types are list and map. These data types can be nested within each other, to represent complex data structures up to 32 levels deep.

There is no limit on the number of values in a list or a map, as long as the item containing the values fits within the DynamoDB item size limit (400 KB).

An attribute value can be an empty string or empty binary value if the attribute is not used for a table or index key. An attribute value cannot be an empty set (string set, number set, or binary set); however, empty lists and maps are allowed. Empty string and binary values are allowed within lists and maps. For more information, see Attributes (p. 1035).

List

A list type attribute can store an ordered collection of values. Lists are enclosed in square brackets:

[ ... ]


A list is similar to a JSON array. There are no restrictions on the data types that can be stored in a list element, and the elements in a list do not have to be of the same type.

The following example shows a list that contains two strings and a number.

FavoriteThings: ["Cookies", "Coffee", 3.14159]

Note

DynamoDB lets you work with individual elements within lists, even if those elements are deeply nested. For more information, see Using Expressions in DynamoDB (p. 412).

Map

A map type attribute can store an unordered collection of name-value pairs. Maps are enclosed in curly braces: { ... }

A map is similar to a JSON object. There are no restrictions on the data types that can be stored in a map element, and the elements in a map do not have to be of the same type.

Maps are ideal for storing JSON documents in DynamoDB. The following example shows a map that contains a string, a number, and a nested list that contains another map.

{
    Day: "Monday",
    UnreadEmails: 42,
    ItemsOnMyDesk: [
        "Coffee Cup",
        "Telephone",
        {
            Pens: { Quantity : 3},
            Pencils: { Quantity : 2},
            Erasers: { Quantity : 1}
        }
    ]
}

Note

DynamoDB lets you work with individual elements within maps, even if those elements are deeply nested. For more information, see Using Expressions in DynamoDB (p. 412).
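When a nested structure like the map above is sent through the low-level API, every value carries a type descriptor (for example, S for string, N for number, L for list, and M for map). The following recursive converter is a minimal sketch of that mapping for plain Python values; the function name `to_attribute_value` is hypothetical, and sets and binary values are deliberately left out.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Convert a plain Python value into DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


desk = to_attribute_value({
    "Day": "Monday",
    "UnreadEmails": 42,
    "ItemsOnMyDesk": ["Coffee Cup", "Telephone", {"Pens": {"Quantity": 3}}],
})
```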

Sets

DynamoDB supports types that represent sets of number, string, or binary values. All the elements within a set must be of the same type. For example, an attribute of type Number Set can only contain numbers; String Set can only contain strings; and so on.

There is no limit on the number of values in a set, as long as the item containing the values fits within the DynamoDB item size limit (400 KB).

Each value within a set must be unique. The order of the values within a set is not preserved. Therefore, your applications must not rely on any particular order of elements within the set. DynamoDB does not support empty sets; however, empty string and binary values are allowed within a set.

The following example shows a string set, a number set, and a binary set:

["Black", "Green", "Red"]

[42.2, -19, 7.5, 3.14]

["U3Vubnk=", "UmFpbnk=", "U25vd3k="]
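The set rules above (one element type, unique values, no empty sets) can be enforced before a request is built. The following sketch converts a Python collection into a typed set attribute; the function name `to_set_attribute` is hypothetical, and the output is sorted only to make the result deterministic, not because DynamoDB preserves any order.

```python
import base64
from decimal import Decimal


def to_set_attribute(values):
    """Convert a collection of str, Decimal, or bytes values into a typed set."""
    unique = set(values)  # duplicates collapse; sets hold unique values
    if not unique:
        raise ValueError("DynamoDB does not support empty sets")
    kinds = {type(v) for v in unique}
    if len(kinds) != 1:
        raise TypeError("all elements of a set must be of the same type")
    kind = kinds.pop()
    if kind is str:
        return {"SS": sorted(unique)}
    if kind is Decimal:
        return {"NS": [str(v) for v in sorted(unique)]}
    if kind is bytes:
        # Binary values are base64-encoded for transport.
        return {"BS": sorted(base64.b64encode(v).decode("ascii") for v in unique)}
    raise TypeError(f"unsupported element type: {kind.__name__}")


colors = to_set_attribute(["Black", "Green", "Red", "Red"])
weather = to_set_attribute([b"Sunny", b"Rainy", b"Snowy"])
```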


Read Consistency

Amazon DynamoDB is available in multiple AWS Regions around the world. Each Region is independent and isolated from other AWS Regions. For example, if you have a table called People in the us-east-2 Region and another table named People in the us-west-2 Region, these are considered two entirely separate tables. For a list of all the AWS Regions in which DynamoDB is available, see AWS Regions and Endpoints in the Amazon Web Services General Reference.

Every AWS Region consists of multiple distinct locations called Availability Zones. Each Availability Zone is isolated from failures in other Availability Zones, and provides inexpensive, low-latency network connectivity to other Availability Zones in the same Region. This allows rapid replication of your data among multiple Availability Zones in a Region.

When your application writes data to a DynamoDB table and receives an HTTP 200 response (OK), the write has occurred and is durable. The data is eventually consistent across all storage locations, usually within one second or less.

DynamoDB supports eventually consistent and strongly consistent reads.

Eventually Consistent Reads

When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.

Strongly Consistent Reads

When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful. However, this consistency comes with some disadvantages:

• A strongly consistent read might not be available if there is a network delay or outage. In this case, DynamoDB may return a server error (HTTP 500).

• Strongly consistent reads may have higher latency than eventually consistent reads.

• Strongly consistent reads are not supported on global secondary indexes.

• Strongly consistent reads use more throughput capacity than eventually consistent reads. For details, see Read/Write Capacity Mode (p. 17).

Note

DynamoDB uses eventually consistent reads, unless you specify otherwise. Read operations (such as GetItem, Query, and Scan) provide a ConsistentRead parameter. If you set this parameter to true, DynamoDB uses strongly consistent reads during the operation.
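As a sketch of how the ConsistentRead parameter is set, the following builds the arguments for a GetItem request. The helper name build_get_item_request and the key attribute PersonID are hypothetical; TableName, Key, and ConsistentRead are the GetItem parameter names.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build keyword arguments for a GetItem call (hypothetical helper)."""
    return {
        "TableName": table_name,
        "Key": key,
        # False (the default) requests an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


request = build_get_item_request(
    "People",
    {"PersonID": {"N": "101"}},  # hypothetical partition key attribute
    strongly_consistent=True,
)

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("dynamodb").get_item(**request)
```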

Read/Write Capacity Mode

Amazon DynamoDB has two read/write capacity modes for processing reads and writes on your tables:

• On-demand

• Provisioned (default, free-tier eligible)

The read/write capacity mode controls how you are charged for read and write throughput and how you manage capacity. You can set the read/write capacity mode when creating a table or you can change it later.

Secondary indexes inherit the read/write capacity mode from the base table. For more information, see Considerations When Changing Read/Write Capacity Mode (p. 344).

Topics

• On-Demand Mode (p. 18)

• Provisioned Mode (p. 19)

On-Demand Mode

Amazon DynamoDB on-demand is a flexible billing option capable of serving thousands of requests per second without capacity planning. DynamoDB on-demand offers pay-per-request pricing for read and write requests so that you pay only for what you use.

When you choose on-demand mode, DynamoDB instantly accommodates your workloads as they ramp up or down to any previously reached traffic level. If a workload’s traffic level hits a new peak, DynamoDB adapts rapidly to accommodate the workload. Tables that use on-demand mode deliver the same single-digit millisecond latency, service-level agreement (SLA) commitment, and security that DynamoDB already offers. You can choose on-demand for both new and existing tables and you can continue using the existing DynamoDB APIs without changing code.

On-demand mode is a good option if any of the following are true:

• You create new tables with unknown workloads.

• You have unpredictable application traffic.

• You prefer the ease of paying for only what you use.

The request rate is limited only by the DynamoDB default table throughput quotas, which can be raised upon request. For more information, see Throughput Default Quotas (p. 1031).

To get started with on-demand, you can create or update a table to use on-demand mode. For more information, see Basic Operations on DynamoDB Tables (p. 338).
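For illustration, the following sketch builds the request arguments for creating a table in on-demand mode and for switching an existing table to on-demand mode. The helper names and the table and key names are hypothetical; BillingMode with the value PAY_PER_REQUEST is the CreateTable and UpdateTable setting for on-demand mode.

```python
def create_on_demand_table_request(table_name, partition_key):
    """Arguments for CreateTable with on-demand billing (hypothetical helper)."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": partition_key, "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",  # no ProvisionedThroughput is specified
    }


def switch_to_on_demand_request(table_name):
    """Arguments for UpdateTable that change an existing table to on-demand mode."""
    return {"TableName": table_name, "BillingMode": "PAY_PER_REQUEST"}


create_request = create_on_demand_table_request("Pets", "AnimalType")
update_request = switch_to_on_demand_request("Pets")

# With boto3 installed and credentials configured, these could be sent as:
#   client = boto3.client("dynamodb")
#   client.create_table(**create_request)
#   client.update_table(**update_request)
```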

You can switch between read/write capacity modes once every 24 hours. For issues you should consider when switching read/write capacity modes, see Considerations When Changing Read/Write Capacity Mode (p. 344).

Topics

• Read Request Units and Write Request Units (p. 18)

• Peak Traffic and Scaling Properties (p. 19)

• Initial Throughput for On-Demand Capacity Mode (p. 19)

• Table Behavior while Switching Read/Write Capacity Mode (p. 19)

Read Request Units and Write Request Units

For on-demand mode tables, you don't need to specify how much read and write throughput you expect your application to perform. DynamoDB charges you for the reads and writes that your application performs on your tables in terms of read request units and write request units.

• One read request unit represents one strongly consistent read request, or two eventually consistent read requests, for an item up to 4 KB in size. Two read request units represent one transactional read for items up to 4 KB. If you need to read an item that is larger than 4 KB, DynamoDB needs additional read request units. The total number of read request units required depends on the item size, and whether you want an eventually consistent or strongly consistent read. For example, if your item size is 8 KB, you require 2 read request units to sustain one strongly consistent read, 1 read request unit if you choose eventually consistent reads, or 4 read request units for a transactional read request.

Note

To learn more about DynamoDB read consistency models, see Read Consistency (p. 17).

• One write request unit represents one write for an item up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB needs to consume additional write request units. Transactional write requests require 2 write request units to perform one write for items up to 1 KB. The total number of write request units required depends on the item size. For example, if your item size is 2 KB, you require 2 write request units to sustain one write request or 4 write request units for a transactional write request.
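The request unit arithmetic above can be sketched as two small functions. The function names and the consistency labels are hypothetical; the 4 KB and 1 KB increments and the multipliers come from the description above.

```python
import math

# Multipliers relative to one strongly consistent read of up to 4 KB.
READ_FACTOR = {"eventual": 0.5, "strong": 1, "transactional": 2}


def read_request_units(item_size_kb, consistency="eventual"):
    """Read request units for one read of an item of the given size."""
    return math.ceil(item_size_kb / 4) * READ_FACTOR[consistency]


def write_request_units(item_size_kb, transactional=False):
    """Write request units for one write of an item of the given size."""
    return math.ceil(item_size_kb) * (2 if transactional else 1)
```

For an 8 KB item, read_request_units(8, "strong") gives 2 and read_request_units(8, "transactional") gives 4, matching the figures in the text. For a 2 KB item, write_request_units(2) gives 2.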

For a list of AWS Regions where DynamoDB on-demand is available, see Amazon DynamoDB Pricing.

Peak Traffic and Scaling Properties

DynamoDB tables using on-demand capacity mode automatically adapt to your application’s traffic volume. On-demand capacity mode instantly accommodates up to double the previous peak traffic on a table. For example, if your application’s traffic pattern varies between 25,000 and 50,000 strongly consistent reads per second where 50,000 reads per second is the previous traffic peak, on-demand capacity mode instantly accommodates sustained traffic of up to 100,000 reads per second. If your application sustains traffic of 100,000 reads per second, that peak becomes your new previous peak, enabling subsequent traffic to reach up to 200,000 reads per second.

If you need more than double your previous peak on a table, DynamoDB automatically allocates more capacity as your traffic volume increases to help ensure that your workload does not experience throttling. However, throttling can occur if you exceed double your previous peak within 30 minutes.

For example, if your application’s traffic pattern varies between 25,000 and 50,000 strongly consistent reads per second where 50,000 reads per second is the previously reached traffic peak, DynamoDB recommends spacing your traffic growth over at least 30 minutes before driving more than 100,000 reads per second.
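The doubling rule can be sketched as follows. This is a simplified reading of the text above, not a description of the service internals, and the function names are hypothetical.

```python
def instant_capacity(previous_peak):
    """Traffic level that on-demand mode accommodates instantly."""
    return 2 * previous_peak


def new_previous_peak(previous_peak, sustained_traffic):
    """A sustained traffic level above the old peak becomes the new previous peak."""
    return max(previous_peak, sustained_traffic)


def may_throttle(previous_peak, requested, minutes_since_peak):
    """True if the request rate exceeds double the previous peak within 30 minutes."""
    return requested > instant_capacity(previous_peak) and minutes_since_peak < 30


peak = 50_000                              # reads per second
limit = instant_capacity(peak)             # level served without delay
peak = new_previous_peak(peak, 100_000)    # the table sustains 100,000 reads per second
next_limit = instant_capacity(peak)
```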

Initial Throughput for On-Demand Capacity Mode

If you recently switched an existing table to on-demand capacity mode for the first time, or if you created a new table with on-demand capacity mode enabled, the table has the following previous peak settings, even though the table has not served traffic previously using on-demand capacity mode:

• Newly created table with on-demand capacity mode: The previous peak is 2,000 write request units or 6,000 read request units. You can drive up to double the previous peak immediately, which enables newly created on-demand tables to serve up to 4,000 write request units or 12,000 read request units, or any linear combination of the two.

• Existing table switched to on-demand capacity mode: The previous peak is half the maximum write capacity units and read capacity units provisioned since the table was created, or the settings for a newly created table with on-demand capacity mode, whichever is higher. In other words, your table will deliver at least as much throughput as it did prior to switching to on-demand capacity mode.
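The two rules above can be sketched as a single function. The sketch assumes that "whichever is higher" is applied to reads and writes independently, which the text does not state explicitly; the function name and the sample provisioned values are hypothetical.

```python
NEW_TABLE_PEAK = {"write": 2_000, "read": 6_000}


def initial_previous_peak(max_provisioned_wcu=0, max_provisioned_rcu=0):
    """Previous peak assigned to a table when it first uses on-demand mode.

    Call with no arguments for a newly created on-demand table.
    Assumption: the comparison is made separately for reads and writes.
    """
    return {
        "write": max(NEW_TABLE_PEAK["write"], max_provisioned_wcu / 2),
        "read": max(NEW_TABLE_PEAK["read"], max_provisioned_rcu / 2),
    }


new_table = initial_previous_peak()
# Hypothetical table that was once provisioned with 10,000 WCUs and 4,000 RCUs.
switched = initial_previous_peak(max_provisioned_wcu=10_000, max_provisioned_rcu=4_000)

# Double the previous peak can be driven immediately.
new_table_immediate = {name: 2 * units for name, units in new_table.items()}
```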

Table Behavior while Switching Read/Write Capacity Mode

When you switch a table from provisioned capacity mode to on-demand capacity mode, DynamoDB makes several changes to the structure of your table and partitions. This process can take several minutes. During the switching period, your table delivers throughput that is consistent with the previously provisioned write capacity unit and read capacity unit amounts. When switching from on-demand capacity mode back to provisioned capacity mode, your table delivers throughput consistent with the previous peak reached when the table was set to on-demand capacity mode.

Provisioned Mode

If you choose provisioned mode, you specify the number of reads and writes per second that you require for your application. You can use auto scaling to adjust your table’s provisioned capacity automatically in response to traffic changes. This helps you govern your DynamoDB use to stay at or below a defined request rate in order to obtain cost predictability.

Provisioned mode is a good option if any of the following are true:

• You have predictable application traffic.

• You run applications whose traffic is consistent or ramps gradually.

• You can forecast capacity requirements to control costs.

Read Capacity Units and Write Capacity Units

For provisioned mode tables, you specify throughput capacity in terms of read capacity units (RCUs) and write capacity units (WCUs):

• One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. Transactional read requests require two read capacity units to perform one read per second for items up to 4 KB. If you need to read an item that is larger than 4 KB, DynamoDB must consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read. For example, if your item size is 8 KB, you require 2 read capacity units to sustain one strongly consistent read per second, 1 read capacity unit if you choose eventually consistent reads, or 4 read capacity units for a transactional read request. For more information, see Capacity Unit Consumption for Reads (p. 346).

Note

To learn more about DynamoDB read consistency models, see Read Consistency (p. 17).

• One write capacity unit represents one write per second for an item up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB must consume additional write capacity units. Transactional write requests require 2 write capacity units to perform one write per second for items up to 1 KB. The total number of write capacity units required depends on the item size. For example, if your item size is 2 KB, you require 2 write capacity units to sustain one write request per second or 4 write capacity units for a transactional write request. For more information, see Capacity Unit Consumption for Writes (p. 347).

Important

When calling DescribeTable on an on-demand table, read capacity units and write capacity units are set to 0.

If your application reads or writes larger items (up to the DynamoDB maximum item size of 400 KB), it will consume more capacity units.

For example, suppose that you create a provisioned table with 6 read capacity units and 6 write capacity units. With these settings, your application could do the following:

• Perform strongly consistent reads of up to 24 KB per second (4 KB × 6 read capacity units).

• Perform eventually consistent reads of up to 48 KB per second (twice as much read throughput).

• Perform transactional read requests of up to 12 KB per second.

• Write up to 6 KB per second (1 KB × 6 write capacity units).

• Perform transactional write requests of up to 3 KB per second.
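The figures in this list follow directly from the capacity unit definitions. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def provisioned_throughput_kb_per_second(rcus, wcus):
    """Maximum data rates, in KB per second, for a provisioned table."""
    return {
        "strongly_consistent_read": 4 * rcus,        # 4 KB per RCU
        "eventually_consistent_read": 2 * 4 * rcus,  # twice the read throughput
        "transactional_read": 4 * rcus / 2,          # 2 RCUs per 4 KB read
        "write": 1 * wcus,                           # 1 KB per WCU
        "transactional_write": 1 * wcus / 2,         # 2 WCUs per 1 KB write
    }


rates = provisioned_throughput_kb_per_second(rcus=6, wcus=6)
```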

For more information, see Managing Settings on DynamoDB Provisioned Capacity Tables (p. 345).

Provisioned throughput is the maximum amount of capacity that an application can consume from a table or index. If your application exceeds your provisioned throughput capacity on a table or index, it is subject to request throttling.

Throttling prevents your application from consuming too many capacity units. When a request is throttled, it fails with an HTTP 400 code (Bad Request) and a ProvisionedThroughputExceededException. The AWS SDKs have built-in support for retrying throttled requests (see Error Retries and Exponential Backoff (p. 227)), so you do not need to write this logic yourself.
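Although the SDKs handle retries for you, it can help to see the general idea behind exponential backoff. The following sketch computes a retry delay schedule with optional random jitter. It illustrates the technique only and is not the algorithm of any particular SDK; the function name and the default values are assumptions.

```python
import random


def backoff_delays(max_retries, base_seconds=0.05, cap_seconds=20.0,
                   jitter=True, rng=random.random):
    """Return the wait time, in seconds, before each retry attempt."""
    delays = []
    for attempt in range(max_retries):
        # The ceiling doubles on each attempt until it reaches the cap.
        ceiling = min(cap_seconds, base_seconds * 2 ** attempt)
        # With jitter, wait a random fraction of the ceiling to spread out retries.
        delays.append(ceiling * rng() if jitter else ceiling)
    return delays


schedule = backoff_delays(4, base_seconds=1.0, cap_seconds=5.0, jitter=False)
```

A caller would sleep for each delay in turn and retry the throttled request, giving up once the schedule is exhausted.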

You can use the AWS Management Console to monitor your provisioned and actual throughput, and to modify your throughput settings if necessary.

DynamoDB Auto Scaling

DynamoDB auto scaling actively manages throughput capacity for tables and global secondary indexes.

With auto scaling, you define a range (upper and lower limits) for read and write capacity units. You also define a target utilization percentage within that range. DynamoDB auto scaling seeks to maintain your target utilization, even as your application workload increases or decreases.

With DynamoDB auto scaling, a table or a global secondary index can increase its provisioned read and write capacity to handle sudden increases in traffic, without request throttling. When the workload decreases, DynamoDB auto scaling can decrease the throughput so that you don't pay for unused provisioned capacity.
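The relationship between consumed capacity, target utilization, and the configured range can be sketched with a simplified model. This is an assumption-laden illustration, not the actual scaling algorithm: it picks the provisioned capacity at which the current consumption would equal the target utilization, then clamps the result to the configured limits. The function and parameter names are hypothetical.

```python
def target_capacity(consumed_units, target_percent, min_units, max_units):
    """Provisioned capacity at which consumption equals the target utilization.

    Simplified model: uses integer ceiling division and clamps to the range.
    """
    desired = -(-consumed_units * 100 // target_percent)  # ceiling division
    return max(min_units, min(max_units, desired))


# A table consuming 350 units with a 70 percent target and a range of 5 to 1,000 units.
scaled = target_capacity(350, 70, min_units=5, max_units=1000)
```

When consumption rises, the desired capacity rises with it until it reaches the upper limit. When consumption falls, the desired capacity falls until it reaches the lower limit.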

Note

If you use the AWS Management Console to create a table or a global secondary index, DynamoDB auto scaling is enabled by default.

You can manage auto scaling settings at any time by using the console, the AWS CLI, or one of the AWS SDKs.

For more information, see Managing Throughput Capacity Automatically with DynamoDB Auto Scaling (p. 350).

Reserved Capacity

As a DynamoDB customer, you can purchase reserved capacity in advance for tables that use the DynamoDB Standard table class, as described at Amazon DynamoDB Pricing. With reserved capacity, you pay a one-time upfront fee and commit to a minimum provisioned usage level over a period of time.

Your reserved capacity is billed at the hourly reserved capacity rate. By reserving your read and write capacity units ahead of time, you realize significant cost savings on your provisioned capacity costs. Any capacity that you provision in excess of your reserved capacity is billed at standard provisioned capacity rates.

Note

Reserved capacity is not available for replicated write capacity units. Reserved capacity is also not available for tables using the DynamoDB Standard-IA table class or on-demand capacity mode.

To manage reserved capacity, go to the DynamoDB console and choose Reserved Capacity.

Note

You can prevent users from viewing or purchasing reserved capacity, while still allowing them to access the rest of the console. For more information, see "Grant Permissions to Prevent Purchasing of Reserved Capacity Offerings" in Identity and Access Management in Amazon DynamoDB (p. 867).

Table Classes

DynamoDB offers two table classes designed to help you optimize for cost. The DynamoDB Standard table class is the default, and is recommended for the vast majority of workloads. The DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA) table class is optimized for tables where storage is the dominant cost. For example, tables that store infrequently accessed data, such as application logs, old social media posts, e-commerce order history, and past gaming achievements, are good candidates for the Standard-IA table class. See Amazon DynamoDB Pricing for pricing details.

Every DynamoDB table is associated with a table class (DynamoDB Standard by default). Each table class offers different pricing for data storage as well as for read and write requests. You can select the most cost-effective table class for your table based on its storage and throughput usage patterns.

The choice of a table class is not permanent—you can change this setting using the AWS Management Console, AWS CLI, or AWS SDK. DynamoDB also supports managing your table class using AWS CloudFormation for single-Region tables (tables that are not global tables). To learn more about selecting your table class, see Considerations When Choosing a Table Class (p. 344).
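As a sketch of changing the table class through an SDK, the following builds the arguments for an UpdateTable request. The helper name and the table name are hypothetical; TableClass with the values STANDARD and STANDARD_INFREQUENT_ACCESS is the UpdateTable setting for the two classes.

```python
def table_class_update_request(table_name, infrequent_access):
    """Arguments for UpdateTable that set the table class (hypothetical helper)."""
    table_class = "STANDARD_INFREQUENT_ACCESS" if infrequent_access else "STANDARD"
    return {"TableName": table_name, "TableClass": table_class}


to_standard_ia = table_class_update_request("ApplicationLogs", infrequent_access=True)
back_to_standard = table_class_update_request("ApplicationLogs", infrequent_access=False)

# With boto3 installed and credentials configured, a request could be sent as:
#   boto3.client("dynamodb").update_table(**to_standard_ia)
```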

Partitions and Data Distribution

Amazon DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid state drives (SSDs) and automatically replicated across multiple Availability Zones within an AWS Region. Partition management is handled entirely by DynamoDB—you never have to manage partitions yourself.

When you create a table, the initial status of the table is CREATING. During this phase, DynamoDB allocates sufficient partitions to the table so that it can handle your provisioned throughput requirements. You can begin writing and reading table data after the table status changes to ACTIVE.

DynamoDB allocates additional partitions to a table in the following situations:

• If you increase the table's provisioned throughput settings beyond what the existing partitions can support.

• If an existing partition fills to capacity and more storage space is required.

Partition management occurs automatically in the background and is transparent to your applications. Your table remains available throughout and fully supports your provisioned throughput requirements.

For more details, see Partition Key Design (p. 964).

Global secondary indexes in DynamoDB are also composed of partitions. The data in a global secondary index is stored separately from the data in its base table, but index partitions behave in much the same way as table partitions.

Data Distribution: Partition Key

If your table has a simple primary key (partition key only), DynamoDB stores and retrieves each item based on its partition key value.

To write an item to the table, DynamoDB uses the value of the partition key as input to an internal hash function. The output value from the hash function determines the partition in which the item will be stored.

To read an item from the table, you must specify the partition key value for the item. DynamoDB uses this value as input to its hash function, yielding the partition in which the item can be found.

The following diagram shows a table named Pets, which spans multiple partitions. The table's primary key is AnimalType (only this key attribute is shown). DynamoDB uses its hash function to determine where to store a new item, in this case based on the hash value of the string Dog. Note that the items are not stored in sorted order. Each item's location is determined by the hash value of its partition key.
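The idea of hash-based placement can be sketched in a few lines. DynamoDB's hash function is internal and not documented here, so this sketch substitutes MD5 purely as a stand-in; the function name and the partition count of 3 are assumptions for illustration.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key value to a partition number (illustrative stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partition_count


# The same key value always maps to the same partition, for writes and for reads.
placements = {animal: partition_for(animal, 3) for animal in ["Dog", "Cat", "Bird", "Fish"]}
```

Because placement depends only on the hash value, items with adjacent key values are not stored next to each other, which is why the items do not appear in sorted order.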
