Start a task execution - AWS DataSync

• Step 5: Use the CLI to monitor your task execution (p. 51)

• API ﬁlters for ListTasks and ListLocations (p. 52)

For information about supported AWS Regions and endpoints, see DataSync AWS Regions and endpoints.

For information about DataSync Amazon Resource Name (ARN) values, see DataSync Amazon Resource Names.

Step 1: Create an agent

To access your self-managed storage, you ﬁrst deploy and activate an AWS DataSync agent. The activation process associates your agent with your AWS account. An agent isn't required when transferring between AWS storage services within the same AWS account. To set up a data transfer between two AWS services, see Step 2: Create locations (p. 39).

A DataSync agent can transfer data through public service endpoints, Federal Information Processing Standard (FIPS) endpoints, and Amazon VPC endpoints. For more information, see Creating and activating a DataSync agent (p. 55).

Note

When you configure your agent to use Amazon VPC endpoints, the data transferred between your agent and the DataSync service doesn't cross the public internet and doesn't require public IP addresses. For end-to-end instructions for this configuration, see Using AWS DataSync in a virtual private cloud (p. 56).

To create an agent to read from a Network File System (NFS), Server Message Block (SMB), Hadoop Distributed File System (HDFS), or self-managed object storage source location

1. Download the current DataSync .ova image or launch the current DataSync Amazon Machine Image (AMI) based on Amazon EC2 from the AWS DataSync console. For information about how to get the .ova image or EC2 AMI, see Create an agent (p. 21). For information about hardware requirements and recommended EC2 instance types, see Virtual machine requirements (p. 11).

Important

If you are deploying your agent on Amazon EC2, deploy the agent such that it doesn't require network traﬃc between Availability Zones (to avoid charges for such traﬃc).

• To access your Amazon EFS or FSx for Windows File Server ﬁle system, deploy the agent in an Availability Zone that has a mount target to your ﬁle system.

• For self-managed ﬁle systems, deploy the agent in the Availability Zone where your ﬁle system resides.

To learn more about data transfer prices for all AWS Regions, see Amazon EC2 On-Demand pricing.

2. Make sure that you satisfy the network connectivity requirements for the agent. For information about network requirements, see Network requirements (p. 11).

3. Deploy the .ova image in your hypervisor, power on the VM, and note the agent-ip-address. Make sure that you can reach the agent on port 80. You can use the following command to check.

nc -vz agent-ip-address 80

Note

The .ova default credentials are login admin, password password. You can change the password on the VM local console. You don't need to log in to the VM for basic DataSync functionality. Login is required mainly for troubleshooting, network-specific settings, and so on.

You log in to the agent VM local console using your VM's hypervisor client. For information about how to use the VM local console, see Working with your DataSync agent's local console (p. 63).

4. Send an HTTP/1.1 GET request to the agent to get the activation key. You can do this by using standard Unix tools:

• To activate an agent using a public service endpoint, use the following command.

curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=aws-region&no_redirect"

• To activate an agent using a virtual private cloud (VPC) endpoint, use the IP address of the VPC endpoint. Use the following command.

curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=aws-region&privateLinkEndpoint=IP address of VPC endpoint&endpointType=PRIVATE_LINK&no_redirect"


To find the correct IP address, open the Amazon VPC console at https://console.aws.amazon.com/vpc/ and choose Endpoints from the navigation pane on the left. Choose the DataSync endpoint, and check the Subnets list to find the private IP address that corresponds to the subnet that you chose for your VPC endpoint setup.

For more information about VPC endpoint conﬁguration, see step 5 in Conﬁguring DataSync to use private IP addresses for data transfer (p. 56).

• To activate an agent using a Federal Information Processing Standard (FIPS) endpoint, specify endpointType=FIPS. Also, the activationRegion value must be set to an AWS Region within the United States. To activate a FIPS endpoint, use the following command.

curl "http://agent-IP-address/?gatewayType=SYNC&activationRegion=US-based-aws-region&endpointType=FIPS&no_redirect"

This command returns an activation key similar to the one following.

F0EFT-7FPPR-GG7MC-3I9R3-27DOH

5. After you have the activation key, do one of the following:

• To activate your agent using a public endpoint or FIPS endpoint, use the following command.

aws datasync create-agent \
    --agent-name agent's name \
    --activation-key obtained activation key

• To activate your agent using a VPC endpoint, use the following command.

aws datasync create-agent \
    --agent-name agent's name \
    --vpc-endpoint-id vpc endpoint id \
    --subnet-arns subnet arns \
    --security-group-arns security group arns \
    --activation-key obtained activation key

In this command, use the following arguments:

• vpc endpoint id – The AWS endpoint that the agent connects to. To ﬁnd the endpoint ID, open the Amazon VPC console at https://console.aws.amazon.com/vpc/, and choose Endpoints from the navigation pane on the left. Copy the Endpoint ID value of the DataSync endpoint. For more information about VPC endpoint conﬁguration, see step 5 in Conﬁguring DataSync to use private IP addresses for data transfer (p. 56).

• security group arns – The Amazon Resource Name (ARN) of the security group to use for the task's endpoint. This is the security group that you created in step 3 of Configuring DataSync to use private IP addresses for data transfer (p. 56).

• subnet arns – The ARN of the subnet where the task endpoints for the agent are created. This is the subnet that you chose in step 1 of Configuring DataSync to use private IP addresses for data transfer (p. 56).

These commands return the ARN of the agent that you just activated. The ARN is similar to the one following.

{
    "AgentArn": "arn:aws:datasync:us-east-1:111222333444:agent/agent-0b0addbeef44baca3"
}

Note

After you choose a service endpoint, you can't change it later.

After you activate the agent, it closes port 80 and the port is no longer accessible. If you can't connect to the agent after you have activated it, verify that the activation was successful by using the following command:

aws datasync list-agents

Note

Make sure that you are using the same AWS credentials throughout the whole process. Don't switch between multiple terminals where you are authenticated with different AWS credentials.
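Steps 3 through 5 and the follow-up check can be sketched as a handful of shell helpers. This is a minimal sketch, not an official DataSync script: the function names are hypothetical, the retry counts are arbitrary, and the activation key pattern is an assumption inferred from the sample key shown above (five groups of five uppercase letters or digits).

```shell
# Hypothetical helpers for steps 3-5; none of these names come from DataSync.

# Step 3: poll port 80 until the agent answers or the attempts run out.
wait_for_agent_port() {
    local host="$1" attempts="${2:-10}" delay="${3:-5}" i=1
    while [ "$i" -le "$attempts" ]; do
        # -z probes the port without sending data; -w 3 limits each probe to 3 seconds.
        if nc -z -w 3 "$host" 80 >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Step 4: build the activation URL for a PUBLIC, FIPS, or PRIVATE_LINK endpoint.
activation_url() {
    local agent_ip="$1" region="$2" endpoint_type="${3:-PUBLIC}" vpc_endpoint_ip="${4:-}"
    local url="http://${agent_ip}/?gatewayType=SYNC&activationRegion=${region}"
    case "$endpoint_type" in
        PUBLIC)       ;;
        FIPS)         url="${url}&endpointType=FIPS" ;;
        PRIVATE_LINK) url="${url}&privateLinkEndpoint=${vpc_endpoint_ip}&endpointType=PRIVATE_LINK" ;;
        *)            echo "unknown endpoint type: ${endpoint_type}" >&2; return 1 ;;
    esac
    printf '%s&no_redirect\n' "$url"
}

# Assumption: a valid key has the same shape as the sample key in step 4.
looks_like_activation_key() {
    printf '%s\n' "$1" | grep -Eq '^[A-Z0-9]{5}(-[A-Z0-9]{5}){4}$'
}

# After step 5: look up the status that list-agents reports for one agent ARN.
agent_status() {
    aws datasync list-agents \
        --query "Agents[?AgentArn=='$1'].Status" \
        --output text
}

# Usage (replace the placeholders with your own values):
#   wait_for_agent_port agent-ip-address || exit 1
#   key=$(curl -s "$(activation_url agent-ip-address aws-region)")
#   looks_like_activation_key "$key" || { echo "unexpected response: $key" >&2; exit 1; }
#   arn=$(aws datasync create-agent --agent-name my-agent \
#         --activation-key "$key" --query AgentArn --output text)
#   agent_status "$arn"
```

Running `aws sts get-caller-identity` before and after activation is one way to confirm that the same credentials were used throughout.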

Step 2: Create locations

Each DataSync task is made up of a pair of locations between which data is transferred. The source location deﬁnes the storage system or service that you want to read data from. The destination location deﬁnes the storage system or service that you want to write data to.

For a list of all DataSync supported source and destination endpoints, see Working with locations (p. 72).

Topics

• Create an NFS location (p. 39)

• Create an SMB location (p. 40)

• Create an HDFS location (p. 41)

• Create an object storage location (p. 42)

• Create an Amazon EFS location (p. 42)

• Create an FSx for Windows File Server location (p. 44)

• Create an Amazon FSx for Lustre location (p. 44)

• Create an Amazon S3 location (p. 45)

Create an NFS location

Use the following procedure to create an NFS location using the AWS CLI. An NFS location deﬁnes a ﬁle system on an NFS server that can be read from or written to. You can also create an NFS location using the AWS Management Console. For more information, see Creating a location for NFS (p. 73).

Note

If you are using an NFS location on an AWS Snowcone device, see NFS server on AWS Snowcone and AWS Snowball Edge (p. 75) for more information about transferring data to or from that device.

To create an NFS location using the CLI

• Use the following command to create an NFS source location.

aws datasync create-location-nfs \
    --server-hostname server-address \
    --on-prem-config AgentArns=agent-arns \
    --subdirectory nfs-export-path

For the preceding command, the following applies:

• The path that you provide for the --subdirectory parameter should be a path that's exported by the NFS server, or a subdirectory of that path. Other NFS clients in your network should be able to mount this path. To see all the paths exported by your NFS server, run the command showmount -e nfs-server-address from an NFS client with access to your server. You can specify any directory that appears in the results, and any subdirectory of that directory.

• To transfer all the data in the folder that you specified, DataSync needs permissions to read all the data. To give DataSync permissions, you can do one of two things. You can configure the NFS export with no_root_squash. Or, for all the files that you want DataSync to access, you can make sure that the permissions allow read access for all users. Doing either enables the agent to read the files. For the agent to access directories, you must additionally give all users execute access.

• Make sure that the NFS export is accessible without Kerberos authentication.
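As an illustration of the second permissions option above (read access for all users, plus execute access on directories), the following sketch applies both with a single recursive chmod. The helper name grant_world_read is hypothetical. Run it on the NFS server itself, and only where making the data world-readable is acceptable.

```shell
# Hypothetical helper: open an exported tree for reading by all users.
grant_world_read() {
    # a+r: read permission for everyone.
    # Capital X: execute permission for everyone, but only on directories
    # (and on files that are already executable by someone).
    chmod -R a+rX "$1"
}

# Usage (the path is a placeholder for your own export):
#   grant_world_read /path/exported/by/nfs
```

The capital X keeps regular data files non-executable while still letting the agent traverse directories.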

DataSync automatically chooses the NFS version that it uses to read from an NFS location. To specify an NFS version, use the optional Version parameter in the NfsMountOptions (p. 289) API operation.

This command returns the Amazon Resource Name (ARN) of the NFS location, similar to the ARN shown following.

{
    "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0f01451b140b2af49"
}

To make sure that the directory can be mounted, you can connect to any computer that has the same network conﬁguration as your agent and run the following command.

mount -t nfs -o nfsvers=<nfs-server-version> <nfs-server-address>:<nfs-export-path> <test-folder>

The following is an example of the command.

mount -t nfs -o nfsvers=3 198.51.100.123:/path_for_sync_to_read_from /temp_folder_to_test_mount_on_local_machine

Create an SMB location

Use the following procedure to create an SMB location using the AWS CLI. An SMB location deﬁnes a ﬁle system on an SMB server that can be read from or written to. You can also create an SMB location using the console. For more information, see Creating a location for SMB (p. 75).

To create an SMB location using the CLI

• Use the following command to create an SMB source location.

aws datasync create-location-smb \
    --server-hostname smb-server-address \
    --user user-name \
    --domain domain-of-the-smb-server \
    --password user's-password \
    --agent-arns agent-arns \
    --subdirectory smb-export-path

The path that you provide for the --subdirectory parameter should be a path that's exported by the SMB server, or a subdirectory. Specify the path using forward slashes, for example /path/to/folder. Other SMB clients in your network should be able to access this path.

DataSync automatically chooses the SMB version that it uses to read from an SMB location. To specify an SMB version, use the optional Version parameter in the SmbMountOptions (p. 300) API operation.

This command returns the Amazon Resource Name (ARN) of the SMB location, similar to the ARN shown following.

{
    "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0f01451b140b2af49"
}
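Because the --subdirectory value must use forward slashes, a Windows-style share path needs converting before you pass it to the CLI. The following is a small sketch of that conversion. The helper name to_forward_slashes is hypothetical.

```shell
# Hypothetical helper: rewrite backslashes in an SMB path as forward slashes.
to_forward_slashes() {
    # printf '%s' prints the argument literally, so backslashes survive until tr.
    printf '%s\n' "$1" | tr '\\' '/'
}

# Usage:
#   aws datasync create-location-smb \
#       --subdirectory "$(to_forward_slashes '\path\to\folder')" \
#       --server-hostname smb-server-address \
#       --user user-name \
#       --password "$SMB_PASSWORD" \
#       --agent-arns agent-arns
# SMB_PASSWORD is an assumed environment variable holding the user's password.
```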

Create an HDFS location

Use the following procedure to create a Hadoop Distributed File System (HDFS) location using the AWS CLI. An HDFS location deﬁnes a ﬁle system on a Hadoop cluster that can be read from or written to.

You can also create an HDFS location using the AWS Management Console. For more information, see Creating a location for HDFS (p. 77).

To create an HDFS location using the AWS CLI

• Use the following command to create an HDFS location. In the following example, replace each user input placeholder with your own information.

aws datasync create-location-hdfs \
    --name-nodes '[{"Hostname":"host1", "Port": 8020}]' \
    --authentication-type "SIMPLE|KERBEROS" \
    --agent-arns arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890example \
    --subdirectory "/path/to/my/data"

The following parameters are required in the create-location-hdfs command:

• name-nodes – Speciﬁes the hostname or IP address of the NameNode in the Hadoop cluster and the TCP port that the NameNode is listening on.

• authentication-type – The type of authentication to use when connecting to the Hadoop cluster. Specify SIMPLE or KERBEROS.

• agent-arns – The Amazon Resource Names (ARNs) of the agents to use for the HDFS location.

If you use SIMPLE authentication, use the --simple-user parameter to specify the user name of the user. If you use KERBEROS authentication, use the --kerberos-principal, --kerberos-keytab, and --kerberos-krb5-conf parameters.

The preceding command returns the location ARN, similar to the following:

{
    "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890example"
}
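The --name-nodes value is a JSON list, which is easy to misquote in a shell. The following sketch builds that list from host:port pairs. The helper name name_nodes_json is hypothetical, and the sketch assumes host names that need no JSON escaping.

```shell
# Hypothetical helper: turn "host:port" arguments into the --name-nodes JSON list.
name_nodes_json() {
    local out="" pair host port
    for pair in "$@"; do
        host="${pair%:*}"     # everything before the last colon
        port="${pair##*:}"    # everything after the last colon
        # ${out:+,} adds a comma only when an earlier entry is already present.
        out="${out}${out:+,}{\"Hostname\":\"${host}\",\"Port\":${port}}"
    done
    printf '[%s]\n' "$out"
}

# Usage with SIMPLE authentication (hdfs-user is a placeholder user name):
#   aws datasync create-location-hdfs \
#       --name-nodes "$(name_nodes_json host1:8020 host2:8020)" \
#       --authentication-type SIMPLE \
#       --simple-user hdfs-user \
#       --agent-arns arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890example \
#       --subdirectory "/path/to/my/data"
```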

Create an object storage location

Use the following procedure to create a self-managed object storage location using the AWS CLI. An object storage location is the endpoint for an Amazon S3 API compatible object storage server. An object storage location deﬁnes an object storage server that can be read from or written to. You can also create an object storage location using the AWS Management Console.

For more information about object storage locations, including compatibility requirements, see Creating a location for object storage (p. 78).

To create a self-managed object storage location using the CLI

• Use the following command to create a self-managed object storage location.

aws datasync create-location-object-storage \
    --server-hostname object-storage-server.example.com \
    --bucket-name myBucket \
    --agent-arns arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890deadfb

The following parameters are required in the create-location-object-storage command.

• server-hostname: The DNS name or IP address of the self-managed object storage server.

• bucket-name: The name that identiﬁes the bucket on the self-managed object storage server at the location.

• agent-arns: The ARNs of the agents to use for the self-managed object storage location.

If your object storage requires a user name and password to authenticate, use the --access-key and --secret-key parameters to provide the user name and password, respectively.

The preceding command returns a location ARN similar to the following.

{
    "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890deadfb"
}
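When the object storage server requires credentials, the optional --access-key and --secret-key parameters can be added only when they are needed. The following is a sketch under stated assumptions: the function name and the OBJECT_STORAGE_ACCESS_KEY and OBJECT_STORAGE_SECRET_KEY environment variables are hypothetical, chosen here so that the credentials aren't typed into the command itself.

```shell
# Hypothetical wrapper: create an object storage location and print its ARN.
# $1: server hostname, $2: bucket name, $3: agent ARN
create_object_storage_location() {
    # Rebuild the positional parameters as the CLI argument list.
    set -- --server-hostname "$1" --bucket-name "$2" --agent-arns "$3"
    # Add credentials only if the (assumed) environment variable is set.
    if [ -n "${OBJECT_STORAGE_ACCESS_KEY:-}" ]; then
        set -- "$@" --access-key "$OBJECT_STORAGE_ACCESS_KEY" \
                    --secret-key "${OBJECT_STORAGE_SECRET_KEY:-}"
    fi
    aws datasync create-location-object-storage "$@" \
        --query LocationArn --output text
}

# Usage:
#   create_object_storage_location object-storage-server.example.com myBucket \
#       arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890deadfb
```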

Create an Amazon EFS location

Use the following procedure to create an Amazon EFS location using the AWS CLI. An EFS location is the endpoint for an Amazon EFS ﬁle system, which deﬁnes an EFS ﬁle system that can be read from or written to. You can also create an EFS location using the console. For more information, see Creating a location for Amazon EFS (p. 79).

To create an Amazon EFS location using the CLI

1. If you don't have an Amazon EFS ﬁle system, create one. For information about how to create an EFS ﬁle system, see Getting started with Amazon Elastic File System in the Amazon Elastic File System User Guide.

2. Identify a subnet that has at least one mount target for that ﬁle system. You can see all the mount targets and the subnets associated with an EFS ﬁle system by using the describe-mount-targets command.

aws efs describe-mount-targets \
    --region aws-region \
    --file-system-id file-system-id

Note

The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

This command returns information about the target similar to the information shown following.

{

3. Specify an Amazon EC2 security group that can be used to access the mount target. You can run the following command to ﬁnd out the security group of the mount target.

aws efs describe-mount-target-security-groups \
    --region aws-region \
    --mount-target-id mount-target-id

The security group that you provide must be able to communicate with the security group on the mount target in the subnet speciﬁed.

The relationship between security group M on the mount target and security group S, which you provide for DataSync to use at this stage, is as follows:

• Security group M, which you associate with the mount target, must allow inbound access for the TCP protocol on the NFS port (2049) from security group S.

You can enable an inbound connection either by its IP address (CIDR range) or its security group.

• Security group S, which you provide to DataSync to access Amazon EFS, should have a rule that enables outbound connections to the NFS port on one of the file system's mount targets.

You can enable outbound connections either by IP address (CIDR range) or security group.

For information about security groups and mount targets, see Security groups for Amazon EC2 instances and mount targets in the Amazon Elastic File System User Guide.

4. Create the EFS location. To create the EFS location, you need the ARNs for your Amazon EC2 subnet, EC2 security group, and an EFS ﬁle system. Because the DataSync API accepts fully qualiﬁed ARNs, you can construct these ARNs. For information about how to construct ARNs for diﬀerent services, see Amazon Resource Names (ARNs) in the AWS General Reference.

Use the following command to create an EFS location.

aws datasync create-location-efs \
    --subdirectory /path/to/your/subdirectory \
    --efs-filesystem-arn 'arn:aws:elasticfilesystem:region:account-id:file-system/filesystem-id' \
    --ec2-config SecurityGroupArns='arn:aws:ec2:region:account-id:security-group/security-group-id',SubnetArn='arn:aws:ec2:region:account-id:subnet/subnet-id'


Note

The AWS Region that you specify is the one where your target S3 bucket or EFS file system is located.

The command returns a location ARN similar to the one shown following.

{
    "LocationArn": "arn:aws:datasync:us-west-2:111222333444:location/loc-07db7abfc326c50fb"
}
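The ARNs that step 4 asks you to construct follow the patterns shown in the create-location-efs command. The following sketch assembles them from a Region, an account ID, and a resource ID. The helper names are hypothetical, the resource IDs in the usage example are made up, and the sketch assumes the standard aws partition.

```shell
# Hypothetical helpers: build the ARNs used by create-location-efs.
# Arguments for each: $1 = AWS Region, $2 = account ID, $3 = resource ID.
# Assumption: the partition is "aws" (other partitions use a different prefix).
efs_filesystem_arn() {
    printf 'arn:aws:elasticfilesystem:%s:%s:file-system/%s\n' "$1" "$2" "$3"
}

security_group_arn() {
    printf 'arn:aws:ec2:%s:%s:security-group/%s\n' "$1" "$2" "$3"
}

subnet_arn() {
    printf 'arn:aws:ec2:%s:%s:subnet/%s\n' "$1" "$2" "$3"
}

# Usage (all IDs below are placeholders):
#   region=us-west-2 account=111222333444
#   aws datasync create-location-efs \
#       --subdirectory /path/to/your/subdirectory \
#       --efs-filesystem-arn "$(efs_filesystem_arn "$region" "$account" fs-0123456789abcdef0)" \
#       --ec2-config "SecurityGroupArns=$(security_group_arn "$region" "$account" sg-0123456789abcdef0),SubnetArn=$(subnet_arn "$region" "$account" subnet-0123456789abcdef0)"
```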

Create an FSx for Windows File Server location

Use the following procedure to create an FSx for Windows File Server location using the AWS CLI. An Amazon FSx location is the endpoint for an FSx for Windows File Server. This endpoint deﬁnes the Amazon FSx ﬁle share that you can read from or write to.

You can also create an Amazon FSx location using the console. For more information, see Creating a location for FSx for Windows File Server (p. 81).

To create an FSx for Windows File Server location using the AWS CLI

• Use the following command to create an Amazon FSx location.

aws datasync create-location-fsx-windows \

    --fsx-filesystem-arn arn:aws:fsx:region:account-id:file-system/filesystem-id \
    --security-group-arns arn:aws:ec2:region:account-id:security-group/group-id \
    --user smb-user \
    --password password

In the create-location-fsx-windows command, specify the following:

• fsx-filesystem-arn – The fully qualiﬁed Amazon Resource Name (ARN) of the ﬁle system that you want to read from or write to.

The DataSync API accepts fully qualiﬁed ARNs, and you can construct these ARNs. For information
