Step 1: Enable Change Streams

To perform a minimal downtime migration, AWS DMS requires access to the cluster’s change streams.

Amazon DocumentDB change streams provide a time-ordered sequence of update events that occur within your cluster’s collections and databases. Reading from the change stream enables AWS DMS to perform change data capture (CDC) and apply incremental updates to the target Amazon DocumentDB cluster.

To enable change streams for all collections on a specific database, authenticate to your Amazon DocumentDB cluster using the mongo shell and execute the following commands:

db.adminCommand({modifyChangeStreams: 1, database: "db_name", collection: "", enable: true});
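If you would rather issue the same command from application code, the following Python sketch builds the equivalent command document. The helper name `build_modify_change_streams_command` is illustrative, and sending the result through a driver call such as PyMongo's `client.admin.command(command)` is an assumption about your environment, not a step in this walkthrough.

```python
def build_modify_change_streams_command(database, collection="", enable=True):
    """Build the modifyChangeStreams command document shown above.

    An empty collection name targets every collection in the database.
    """
    # The command name has to be the first key, so insertion order matters.
    return {
        "modifyChangeStreams": 1,
        "database": database,
        "collection": collection,
        "enable": enable,
    }


command = build_modify_change_streams_command("db_name")
print(command)
```

Building the document separately from the driver call makes it easy to log or review the exact command before it runs against the cluster.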

Step 2: Modify the Change Streams Retention Duration

Next, modify the change stream retention period based on how long you would like to retain change events in the change stream. For example, if you expect your AWS DMS migration from Amazon DocumentDB v3.6 to v4.0 to take 12 hours, you should set the change stream retention to a value greater than 12 hours. The default retention period for your Amazon DocumentDB cluster is three hours.

You can modify the change stream log retention duration for your Amazon DocumentDB cluster to be between one hour and seven days using the AWS Management Console or the AWS CLI. For more details, refer to Modifying the Change Stream Log Retention Duration.
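As a rough illustration of the sizing rule above, the following Python sketch pads an expected migration time and clamps it to the one-hour to seven-day range. The 1.5 safety factor and the choice to express the result in seconds are assumptions made for this example; choose a margin that suits your own migration.

```python
import math

MIN_RETENTION_SECONDS = 1 * 60 * 60        # one hour
MAX_RETENTION_SECONDS = 7 * 24 * 60 * 60   # seven days


def retention_seconds(expected_migration_hours, safety_factor=1.5):
    """Suggest a retention period longer than the expected migration time."""
    # Round the padded duration up to a whole number of hours.
    padded_hours = math.ceil(expected_migration_hours * safety_factor)
    seconds = padded_hours * 3600
    # Keep the suggestion inside the supported range.
    return max(MIN_RETENTION_SECONDS, min(seconds, MAX_RETENTION_SECONDS))


print(retention_seconds(12))  # 12 hours padded to 18 hours -> 64800
```

If the padded value hits the seven-day ceiling, the migration itself may outlast the retention window, so treat a clamped result as a warning sign rather than a safe setting.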

Step 3: Migrate Your Indexes

Create the same indexes on your Amazon DocumentDB 4.0 cluster that you have on your Amazon DocumentDB 3.6 cluster. Although AWS DMS handles the migration of data, it does not migrate indexes.

To migrate the indexes, use the Amazon DocumentDB Index Tool to export indexes from the Amazon DocumentDB 3.6 cluster. You can get the tool by creating a clone of the Amazon DocumentDB tools GitHub repo and following the instructions in README.md. You can run the tool from an Amazon EC2 instance or an AWS Cloud9 environment running in the same Amazon VPC as your Amazon DocumentDB cluster.

The following code dumps indexes from your Amazon DocumentDB v3.6 cluster:

python migrationtools/documentdb_index_tool.py --dump-indexes --dir ~/index.js/ \
    --host docdb-36-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017 \
    --tls --tls-ca-file ~/rds-ca-2019-root.pem \
    --username user --password <password>

2020-02-11 21:51:23,245: Successfully authenticated to database: admin

2020-02-11 21:46:50,432: Successfully connected to instance docdb-36-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017

2020-02-11 21:46:50,432: Retrieving indexes from server...

2020-02-11 21:46:50,440: Completed writing index metadata to local folder: /home/ec2-user/index.js/

Once your indexes are successfully exported, restore those indexes in your Amazon DocumentDB 4.0 cluster. To restore the indexes that you exported in the preceding step, use the Amazon DocumentDB Index Tool. The following command restores the indexes in your Amazon DocumentDB 4.0 cluster from the specified directory.

python migrationtools/documentdb_index_tool.py --restore-indexes --dir ~/index.js/ \
    --host docdb-40-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017 \
    --tls --tls-ca-file ~/rds-ca-2019-root.pem \
    --username user --password <password>

2020-02-11 21:51:23,245: Successfully authenticated to database: admin

2020-02-11 21:51:23,245: Successfully connected to instance docdb-40-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017

2020-02-11 21:51:23,264: testdb.coll: added index: _id

To confirm that you restored the indexes correctly, connect to your Amazon DocumentDB 4.0 cluster with the mongo shell and list the indexes for a given collection. See the following code:

mongo --ssl \
    --host docdb-40-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017 \
    --sslCAFile rds-ca-2019-root.pem \
    --username documentdb --password documentdb

db.coll.getIndexes()
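To check more than one collection, you could compare the index lists from both clusters programmatically. The Python sketch below assumes you have already collected the `getIndexes()` output from each cluster as lists of dictionaries; the helper names and the sample data are hypothetical.

```python
def index_signature(index):
    """Reduce an index definition to the parts that identify it."""
    # Key order matters for compound indexes, so keep it as an ordered tuple.
    key_spec = tuple(index["key"].items())
    return (key_spec, bool(index.get("unique", False)))


def missing_indexes(source_indexes, target_indexes):
    """Return the names of source indexes that have no match on the target."""
    restored = {index_signature(ix) for ix in target_indexes}
    return [ix["name"] for ix in source_indexes
            if index_signature(ix) not in restored]


# Hypothetical getIndexes() output from the 3.6 and 4.0 clusters.
source = [
    {"name": "_id_", "key": {"_id": 1}},
    {"name": "email_1", "key": {"email": 1}, "unique": True},
]
target = [{"name": "_id_", "key": {"_id": 1}}]

print(missing_indexes(source, target))  # ['email_1']
```

The comparison looks only at the key specification and the unique flag; other index options, such as TTL settings, would need to be added to the signature if they matter for your workload.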

Step 4: Create an AWS DMS Replication Instance

An AWS DMS replication instance connects to and reads data from your source (in this case an Amazon DocumentDB 3.6 cluster) and writes it to your target (an Amazon DocumentDB 4.0 cluster). The AWS DMS replication instance can perform both bulk load and CDC operations. Most of this processing happens in memory. However, large operations might require some buffering on disk. Cached transactions and log files are also written to disk. Once the data is migrated, the replication instance also streams any change events to make sure the source and target are in sync.

To create an AWS DMS replication instance:

1. Open the AWS DMS console.

2. In the navigation pane, choose Replication instances.

3. Choose Create replication instance and enter the following information:

• For Name, enter a name of your choice. For example, docdb36todocdb40.

• For Description, enter a description of your choice. For example, Amazon DocumentDB 3.6 to Amazon DocumentDB 4.0 replication instance.

• For Instance class, choose the size based on your needs.

• For Engine version, choose 3.4.1.

• For Amazon VPC, choose the Amazon VPC that houses your Amazon DocumentDB 3.6 and 4.0 clusters.

• For Allocated storage (GiB), use the default of 50 GiB. If you have a high write throughput workload, increase this value to match your workload.

• For Multi-AZ, choose Yes if you need high availability and failover support.

• For Publicly accessible, enable this option.

4. Choose Create replication instance.
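The console choices above can also be captured as a parameter set for scripting. The Python sketch below only builds a dictionary; the parameter names follow the AWS SDK `create_replication_instance` operation as far as I know them, the instance class and subnet group values are placeholders, and mapping the console's Amazon VPC field to a replication subnet group is an assumption.

```python
def replication_instance_params(name, instance_class, subnet_group_id,
                                allocated_storage_gib=50, multi_az=False):
    """Collect the settings chosen in the console steps above."""
    return {
        "ReplicationInstanceIdentifier": name,
        "ReplicationInstanceClass": instance_class,
        # Assumption: the subnet group selects the VPC that houses both clusters.
        "ReplicationSubnetGroupIdentifier": subnet_group_id,
        "AllocatedStorage": allocated_storage_gib,
        "EngineVersion": "3.4.1",
        "MultiAZ": multi_az,
        "PubliclyAccessible": True,
    }


params = replication_instance_params(
    name="docdb36todocdb40",
    instance_class="dms.t3.medium",        # placeholder size
    subnet_group_id="my-docdb-subnets",    # hypothetical subnet group
)
print(params)
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the DMS client's `create_replication_instance` call.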

Step 5: Create an AWS DMS Source Endpoint

The source endpoint is used for the Amazon DocumentDB 3.6 cluster you are looking to upgrade to 4.0.

To create a source endpoint:

1. Open the AWS DMS console.

2. In the navigation pane, choose Endpoints.

3. Choose Create endpoint and enter the following information:

• For Endpoint type, choose Source.

• For Endpoint identifier, enter a name that's easy to remember, for example docdb-source.

• For Source engine, choose docdb.

• For Server name, enter the DNS name of your Amazon DocumentDB v3.6 cluster.

• For Port, enter the port number of your Amazon DocumentDB v3.6 cluster.

• For SSL mode, choose verify-full.

• For CA certificate, choose Add new CA certificate. Download the new CA certificate bundle used to create TLS connections. For Certificate identifier, enter rds-combined-ca-bundle. For Import certificate file, choose Choose file and navigate to the .pem file that you previously downloaded. Select and open the file. Choose Import certificate, then choose rds-combined-ca-bundle from the Choose a certificate drop-down.

• For Username, enter the master username of your Amazon DocumentDB v3.6 cluster.

• For Password, enter the master password of your Amazon DocumentDB v3.6 cluster.

• For Database name, enter the database name you are looking to upgrade.

4. Test your connection to verify it was successfully set up.

5. Choose Create Endpoint.

Note: AWS DMS can only migrate one database at a time.

Step 6: Create an AWS DMS Target Endpoint

The target endpoint is for your Amazon DocumentDB 4.0 cluster.

To create a target endpoint:

1. Open the AWS DMS console.

2. In the navigation pane, choose Endpoints.

3. Choose Create endpoint and enter the following information:

• For Endpoint type, choose Target.

• For Endpoint identifier, enter a name that's easy to remember, for example docdb-target.

• For Target engine, choose docdb.

• For Server name, enter the DNS name of your Amazon DocumentDB v4.0 cluster.

• For Port, enter the port number of your Amazon DocumentDB v4.0 cluster.

• For SSL mode, choose verify-full.

• For CA certificate, choose the existing rds-combined-ca-bundle certificate from the Choose a certificate drop down.

• For Username, enter the master username of your Amazon DocumentDB v4.0 cluster.

• For Password, enter the master password of your Amazon DocumentDB v4.0 cluster.

• For Database name, enter the same database name you used to set up your source endpoint.

4. Test your connection to verify it was successfully set up.

5. Choose Create Endpoint.
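The source and target endpoints differ only in their type and cluster details, so a single helper can describe both. The Python sketch below builds the settings as a dictionary; the parameter names follow the AWS SDK `create_endpoint` operation as far as I know them, and the certificate ARN, credentials, and database name are placeholders.

```python
def docdb_endpoint_params(endpoint_type, identifier, server_name, port,
                          username, password, database_name, certificate_arn):
    """Describe an Amazon DocumentDB endpoint as configured in the steps above."""
    if endpoint_type not in ("source", "target"):
        raise ValueError("endpoint_type must be 'source' or 'target'")
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": endpoint_type,
        "EngineName": "docdb",
        "ServerName": server_name,
        "Port": port,
        "SslMode": "verify-full",
        "CertificateArn": certificate_arn,  # the imported rds-combined-ca-bundle
        "Username": username,
        "Password": password,
        "DatabaseName": database_name,
    }


# Placeholder values; both endpoints reuse the same certificate and database name.
cert = "arn:aws:dms:us-west-2:123456789012:cert:EXAMPLE"
source = docdb_endpoint_params(
    "source", "docdb-source",
    "docdb-36-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com",
    27017, "user", "<password>", "db_name", cert)
target = docdb_endpoint_params(
    "target", "docdb-target",
    "docdb-40-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com",
    27017, "user", "<password>", "db_name", cert)
print(source["EndpointType"], target["EndpointType"])
```

Generating both definitions from one function helps keep the database name and certificate identical on each side, which the console steps otherwise rely on you to do by hand.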

Step 7: Create and run a migration task

An AWS DMS task binds the replication instance with your source and target endpoints. When you create a migration task, you specify the source endpoint, target endpoint, replication instance, and any desired migration settings. An AWS DMS task can be created with one of three migration types: migrate existing data, migrate existing data and replicate ongoing changes, or replicate data changes only.

Because the purpose of this walkthrough is to upgrade an Amazon DocumentDB 3.6 cluster to Amazon DocumentDB 4.0 with minimal downtime, the steps use the option to migrate existing data and replicate ongoing changes. With this option, AWS DMS captures changes while migrating your existing data. AWS DMS continues to capture and apply changes even after the bulk data has been loaded. Eventually the source and target databases will be in sync, allowing for a minimal downtime migration.

Below are the steps to create a migration task for a minimal downtime migration:

1. Open the AWS DMS console.

2. In the navigation pane, choose Tasks.

3. Choose Create task and enter the following information:

• For Task name, enter a name that's easy to remember, for example my-dms-upgrade-task.

• For Replication instance, choose the replication instance that you created in Step 4: Create an AWS DMS Replication Instance.

• For Source endpoint, choose the source endpoint that you created in Step 5: Create an AWS DMS Source Endpoint.

• For Target endpoint, choose the target endpoint that you created in Step 6: Create an AWS DMS Target Endpoint.

• For Migration type, choose Migrate existing data and replicate ongoing changes.

4. In the Task Settings section, enable CloudWatch logs.

5. In the Table mappings section, keep everything at its default setting. This ensures that all collections from your database are migrated.

6. For Migration task startup configuration, choose Automatically on create. This will start the migration task automatically once you create it.

7. Choose Create task.
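For scripted setups, the task settings above might be expressed as follows. This Python sketch maps the three console migration types to the values I understand the DMS API to use, and builds a selection rule that includes every collection. The ARNs are placeholders, and the assumption that the console's default table mapping corresponds to an include-everything rule is mine.

```python
import json

# Console label -> API value (as understood for the DMS MigrationType field).
MIGRATION_TYPES = {
    "Migrate existing data": "full-load",
    "Migrate existing data and replicate ongoing changes": "full-load-and-cdc",
    "Replicate data changes only": "cdc",
}


def include_all_collections_mapping():
    """Return a table mapping that selects every collection in the database."""
    rules = {
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            # For document databases, schema is the database and table the collection.
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }
    return json.dumps(rules)


def migration_task_params(task_name, instance_arn, source_arn, target_arn):
    """Collect the task settings used in this walkthrough."""
    return {
        "ReplicationTaskIdentifier": task_name,
        "ReplicationInstanceArn": instance_arn,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "MigrationType": MIGRATION_TYPES[
            "Migrate existing data and replicate ongoing changes"],
        "TableMappings": include_all_collections_mapping(),
    }


task = migration_task_params(
    "my-dms-upgrade-task",
    "arn:aws:dms:us-west-2:123456789012:rep:EXAMPLE",            # placeholder
    "arn:aws:dms:us-west-2:123456789012:endpoint:SOURCEEXAMPLE",  # placeholder
    "arn:aws:dms:us-west-2:123456789012:endpoint:TARGETEXAMPLE",  # placeholder
)
print(task["MigrationType"])  # full-load-and-cdc
```

The table mapping is passed as a JSON string rather than a nested dictionary, which is why the helper serializes it before returning.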

AWS DMS now begins migrating data from your Amazon DocumentDB 3.6 cluster to your Amazon DocumentDB 4.0 cluster. The task status should change from Starting to Running. You can monitor the progress by choosing Tasks in the AWS DMS console. After several minutes or hours (depending on the size of your migration), the status should change from Running to Load complete, replication ongoing. This means that AWS DMS has completed a full load migration of your Amazon DocumentDB 3.6 cluster to an Amazon DocumentDB 4.0 cluster and is now replicating change events.

Eventually your source and target will be in sync. You can verify whether they are in sync by running a count() operation on your collections to verify all change events have migrated.
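One way to organize the count() check is to gather the per-collection counts from each cluster and compare them in one pass. The Python sketch below assumes you have already collected those counts into dictionaries keyed by collection name; the helper name and sample figures are hypothetical.

```python
def out_of_sync_collections(source_counts, target_counts):
    """Return {collection: (source_count, target_count)} where counts differ."""
    names = sorted(set(source_counts) | set(target_counts))
    differences = {}
    for name in names:
        # A collection missing on one side is treated as having zero documents.
        src = source_counts.get(name, 0)
        dst = target_counts.get(name, 0)
        if src != dst:
            differences[name] = (src, dst)
    return differences


# Hypothetical counts gathered from the 3.6 and 4.0 clusters.
source_counts = {"orders": 1200, "users": 350}
target_counts = {"orders": 1200, "users": 348}

print(out_of_sync_collections(source_counts, target_counts))  # {'users': (350, 348)}
```

Matching counts show that inserts and deletes have been applied, but they do not prove that in-place updates have caught up, so consider spot-checking recently modified documents as well.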

Step 8: Changing the application endpoint to the Amazon DocumentDB 4.0 cluster

After the full load is complete and the CDC process is replicating continuously, you are ready to change your application’s database connection endpoint from your Amazon DocumentDB 3.6 cluster to your Amazon DocumentDB 4.0 cluster.
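If your application stores its connection string in configuration, switching clusters amounts to replacing the host portion. The Python sketch below shows one way to do that with the standard library; the helper name is hypothetical, and it assumes a single-host `mongodb://` URI whose password is percent-encoded.

```python
from urllib.parse import urlsplit, urlunsplit


def switch_cluster_host(uri, new_host):
    """Return the connection URI with its host replaced, keeping all else."""
    parts = urlsplit(uri)
    # Split "user:password@host:port" into credentials and host/port.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    _, _, port = hostport.partition(":")
    new_hostport = f"{new_host}:{port}" if port else new_host
    netloc = f"{userinfo}@{new_hostport}" if userinfo else new_hostport
    return urlunsplit(parts._replace(netloc=netloc))


old_uri = ("mongodb://user:secret@"
           "docdb-36-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017/"
           "?tls=true")
new_uri = switch_cluster_host(
    old_uri, "docdb-40-xx.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com")
print(new_uri)
```

Keeping the credentials, port, and query options untouched limits the change to the one value that differs between the two clusters.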

Migration Tools

To migrate to Amazon DocumentDB, the two primary tools that most customers use are the AWS Database Migration Service (AWS DMS) and command line utilities like mongodump and mongorestore.

As a best practice, and for either of these options, we recommend that you create indexes in Amazon DocumentDB before beginning your migration, because doing so can reduce the overall migration time. To do this, you can use the Amazon DocumentDB Index Tool.

AWS Database Migration Service

AWS Database Migration Service (AWS DMS) is a cloud service that makes it easy to migrate relational databases and non-relational databases to Amazon DocumentDB. You can use AWS DMS to migrate your data to Amazon DocumentDB from databases hosted on-premises or on EC2. With AWS DMS, you can perform one-time migrations, or you can replicate ongoing changes to keep sources and targets in sync.

To help with the cost of migrations, you can use AWS DMS free for six months per instance when migrating to Amazon DocumentDB. For more information, see Free DMS.

For more information on using AWS DMS to migrate to Amazon DocumentDB, please see:

• Using MongoDB as a Source for AWS DMS

• Using Amazon DocumentDB as a Target for AWS Database Migration Service

• Walkthrough: Migrating from MongoDB to Amazon DocumentDB

Command Line Utilities

Common utilities for migrating data to and from Amazon DocumentDB include mongodump, mongorestore, mongoexport, and mongoimport. Typically, mongodump and mongorestore are the most efficient utilities because they dump and restore data from your databases in a binary format. This is generally the most performant option and yields a smaller data size than logical exports. mongoexport and mongoimport are useful if you want to export and import data in a human-readable logical format such as JSON or CSV, but they are generally slower than mongodump and mongorestore and yield a larger data size.

The Migration Approaches (p. 134) section below will discuss when it is best to use AWS DMS and command line utilities based on your use case and requirements.

Discovery

For each of your MongoDB deployments, you should identify and record two sets of data: Architecture Details and Operational Characteristics. This information will help you choose the appropriate migration approach and cluster sizing.

Architecture Details

Name

Choose a unique name for tracking this deployment.

 

Version

Record the version of MongoDB that your deployment is running. To find the version, connect to a replica set member with the mongo shell and run the db.version() operation.

 

Type

Record whether your deployment is a standalone mongod instance, a replica set, or a sharded cluster.

 

Members

Record the hostnames, addresses, and ports of each cluster, replica set, or standalone member.

 

For a clustered deployment, you can find shard members by connecting to a mongo host with the mongo shell and running the sh.status() operation.

 

For a replica set, you can obtain the members by connecting to a replica set member with the mongo shell and running the rs.status() operation.

 

Oplog sizes

For replica sets or sharded clusters, record the size of the oplog for each replica set member. To find a member’s oplog size, connect to the replica set member with the mongo shell and run the rs.printReplicationInfo() operation.

 

Replica set member priorities

For replica sets or sharded clusters, record the priority for each replica set member. To find the replica set member priorities, connect to a replica set member with the mongo shell and run the rs.conf() operation. The priority is shown as the value of the priority key.

 

TLS/SSL usage

Record whether Transport Layer Security (TLS)/Secure Sockets Layer (SSL) is used on each node for encryption in transit.
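One lightweight way to keep these details consistent across deployments is a small record type. The Python sketch below is only a suggestion; the class name, field names, and sample values are hypothetical, and it mirrors the items listed above.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentRecord:
    """Architecture details for one MongoDB deployment."""
    name: str                      # unique name for tracking the deployment
    version: str                   # output of db.version()
    deployment_type: str           # "standalone", "replica set", or "sharded cluster"
    members: list = field(default_factory=list)            # "host:port" strings
    oplog_sizes_mb: dict = field(default_factory=dict)     # member -> oplog size
    member_priorities: dict = field(default_factory=dict)  # member -> priority
    tls_enabled: dict = field(default_factory=dict)        # member -> True/False


# Hypothetical two-member replica set.
record = DeploymentRecord(
    name="orders-prod",
    version="4.0.28",
    deployment_type="replica set",
    members=["db1.example.internal:27017", "db2.example.internal:27017"],
    member_priorities={"db1.example.internal:27017": 2,
                       "db2.example.internal:27017": 1},
)
print(record.name, len(record.members))
```

Collecting every deployment in the same shape makes it easier to compare them later when choosing a migration approach and cluster size.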

Operational Characteristics

Database statistics

For each database, record the following information:

• Name

• Data size

• Collection count  

To find the database statistics, connect to your database with the mongo shell and run the command db.runCommand({dbstats: 1}).

 

Collection statistics

For each collection, record the following information:

• Data size

• Index count

• Whether the collection is capped  

Index statistics

For each collection, record the following index information:

• Namespace

• ID

• Size

• Keys

• TTL

• Sparse

• Background  

To find the index information, connect to your database with the mongo shell and run the command db.collection.getIndexes().

 

Opcounters

This information helps you understand your current MongoDB workload patterns (read-heavy, write-heavy, or balanced). It also provides guidance on your initial Amazon DocumentDB instance selection.

 

The following are the key pieces of information to collect over the monitoring period (in counts/sec):

• Queries

• Inserts

• Updates

• Deletes  

You can obtain this information by graphing the output of the db.serverStatus() command over time. You can also use the mongostat tool to obtain instantaneous values for these statistics. However, with this option you run the risk of planning your migration on usage periods other than your peak load.
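Because the opcounters reported by db.serverStatus() are cumulative, per-second rates come from the difference between two samples. The Python sketch below shows that calculation along with a rough workload classification; the 2x threshold used to call a workload read-heavy or write-heavy is an arbitrary assumption for illustration, and the sample numbers are made up.

```python
OPERATIONS = ("query", "insert", "update", "delete")


def opcounter_rates(earlier, later, elapsed_seconds):
    """Convert two cumulative opcounters samples into operations per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {op: (later[op] - earlier[op]) / elapsed_seconds for op in OPERATIONS}


def classify_workload(rates, ratio=2.0):
    """Label the workload by comparing read and write rates."""
    reads = rates["query"]
    writes = rates["insert"] + rates["update"] + rates["delete"]
    if reads > ratio * writes:
        return "read-heavy"
    if writes > ratio * reads:
        return "write-heavy"
    return "balanced"


# Two hypothetical samples taken 60 seconds apart.
earlier = {"query": 1000, "insert": 200, "update": 100, "delete": 50}
later = {"query": 7000, "insert": 500, "update": 400, "delete": 50}

rates = opcounter_rates(earlier, later, 60)
print(rates["query"], classify_workload(rates))  # 100.0 read-heavy
```

The counters start over when a server restarts, so discard any interval in which a later sample is smaller than the earlier one.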

 

Network statistics

This information helps you understand your current MongoDB workload patterns (read-heavy, write-heavy, or balanced). It also provides guidance on your initial Amazon DocumentDB instance selection.

 

The following are the key pieces of information to collect over the monitoring period (in counts/sec):

• Connections

 

You can get this information by graphing the output of the db.serverStatus() command over time. You can also use the mongostat tool to obtain instantaneous values for these statistics. However, with this option you run the risk of planning your migration on usage periods other than your peak load.

Planning: Amazon DocumentDB Cluster Requirements

Successful migration requires that you carefully consider both your Amazon DocumentDB cluster’s configuration and how applications will access your cluster. Consider each of the following dimensions when determining your cluster requirements:

Availability

Amazon DocumentDB provides high availability through the deployment of replica instances, which can be promoted to a primary instance in a process known as failover. By deploying replica instances to different Availability Zones, you can achieve higher levels of availability.

 

The following table provides guidelines for Amazon DocumentDB deployment configurations to meet specific availability goals.

 

Availability Goal    Total Instances    Replicas    Availability Zones
99%                  1                  0           1
99.9%                2                  1           2
99.99%               3                  2           3
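The guideline table can be encoded as a small lookup if you want to drive cluster sizing from a target availability figure. The Python sketch below picks the smallest listed configuration that meets a goal; the function and field names are hypothetical, and the values come directly from the table above.

```python
# (availability percent, deployment configuration), in ascending order.
GUIDELINES = [
    (99.0, {"total_instances": 1, "replicas": 0, "availability_zones": 1}),
    (99.9, {"total_instances": 2, "replicas": 1, "availability_zones": 2}),
    (99.99, {"total_instances": 3, "replicas": 2, "availability_zones": 3}),
]


def deployment_for(goal_percent):
    """Return the smallest listed configuration that meets the goal."""
    for availability, config in GUIDELINES:
        if goal_percent <= availability:
            return config
    raise ValueError("goal exceeds the highest availability in the guideline table")


print(deployment_for(99.5))
```

A goal that falls between two rows, such as 99.5%, resolves to the next row up, because the smaller configuration would not meet it.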

 

Overall system reliability must consider all components, not just the database. For best practices and recommendations for meeting overall system reliability needs, see the AWS Well-Architected Reliability Pillar Whitepaper.

 

Performance

Amazon DocumentDB instances allow you to read from and write to your cluster’s storage volume.

Cluster instances come in a number of types, with varying amounts of memory and vCPU, which affect your cluster’s read and write performance. Using the information you gathered in the discovery phase, choose an instance type that can support your workload performance requirements. For a list of supported instance types, see Managing Instance Classes (p. 299).

 

When choosing an instance type for your Amazon DocumentDB cluster, consider the following aspects of your workload's performance requirements:

vCPUs: Architectures that require higher connection counts might benefit from instances with more vCPUs.

Memory: When possible, keeping your working dataset in memory provides maximum performance. A starting guideline is to reserve a third of your instance’s memory for the Amazon DocumentDB engine, leaving two-thirds for your working dataset.

Connections: The minimum optimal connection count is eight connections per Amazon DocumentDB instance vCPU. Although the Amazon DocumentDB instance connection limit is much higher, performance benefits of additional connections decline above eight connections per vCPU.

Network: Workloads with a large number of clients or connections should consider the aggregate network performance required for inserted and retrieved data. Bulk operations can make more efficient use of network resources.

Insert Performance: Single document inserts are generally the slowest way to insert data into Amazon DocumentDB. Bulk insert operations can be dramatically faster than single inserts.

Read Performance: Reads from working memory are always faster than reads returned from the storage volume. Therefore, optimizing your instance memory size to retain your working set in memory is ideal.
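The memory and connection guidelines above reduce to simple arithmetic. The Python sketch below applies them to a candidate instance size; the function name is hypothetical, and the 4 vCPU, 32 GiB figures are sample inputs rather than a recommendation.

```python
def sizing_guidelines(vcpus, memory_gib):
    """Apply the starting guidelines for connections and memory."""
    return {
        # Eight connections per vCPU.
        "optimal_connections": 8 * vcpus,
        # One third of memory reserved for the engine, two thirds for data.
        "engine_reserved_gib": memory_gib / 3,
        "working_set_gib": memory_gib * 2 / 3,
    }


estimate = sizing_guidelines(vcpus=4, memory_gib=32)
print(estimate["optimal_connections"])           # 32
print(round(estimate["working_set_gib"], 1))     # 21.3
```

Comparing the working set figure with the data and index sizes recorded during discovery gives a first indication of whether an instance type is large enough.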

 

In addition to serving reads from your primary instance, Amazon DocumentDB clusters are automatically configured as replica sets. You can then route read-only queries to read replicas by setting read preference in your MongoDB driver. You can scale read traffic by adding replicas, reducing the overall load on the primary instance.

 

It is possible to deploy Amazon DocumentDB replicas of different instance types in the same cluster.

An example use case might be to stand up a replica with a larger instance type to serve temporary analytics traffic. If you deploy a mixed set of instance types, be sure to configure the failover priority for each instance. This helps ensure that a failover event always promotes a replica of sufficient size to
