Using a custom container for analysis
This section includes information about how to build a Docker container using a Jupyter notebook. There is a security risk if you re-use notebooks built by third parties: included containers can execute arbitrary code with your user permissions. In addition, the HTML generated by the notebook can be displayed in the AWS IoT Analytics console, providing a potential attack vector on the computer displaying the HTML.
Make sure you trust the author of any third-party notebook before using it.
You can create your own custom container and run it with the AWS IoT Analytics service. To do so, you set up a Docker image and upload it to Amazon ECR, then set up a dataset to run a container action. This section gives an example of the process using Octave.
This tutorial assumes that you have:
• Octave installed on your local computer
• A Docker account set up on your local computer
• An AWS account with Amazon ECR and AWS IoT Analytics access
Step 1: Set up a Docker image
There are three main files you need for this tutorial. Their names and contents are here:
• Dockerfile – The initial setup for Docker's containerization process.
FROM ubuntu:16.04
# Get required set of software
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN apt-get install -y octave
RUN apt-get install -y python3-pip
# Get boto3 for S3 and other libraries
RUN pip3 install --upgrade pip
• run-octave.py – Parses JSON from AWS IoT Analytics, runs the Octave script, and uploads artifacts to Amazon S3.
import json
import os
from urllib.parse import urlparse

import boto3

# Parse the JSON from IoT Analytics
with open('/opt/ml/input/data/iotanalytics/params') as params_file:
    params = json.load(params_file)
variables = params['Variables']
# Pull input data from S3...
s3 = boto3.resource('s3')
s3.Bucket(input_s3_bucket).download_file(input_s3_key, local_input_filename)
# Run Octave Script
os.system("octave moment {} {} {}".format(local_input_filename, local_output_filename, order))
# Upload the artifacts to S3
output_s3_url = urlparse(output_s3_uri)
output_s3_bucket = output_s3_url.netloc
output_s3_key = output_s3_url.path[1:]
s3.Object(output_s3_bucket, output_s3_key).put(Body=open(local_output_filename, 'rb'), ACL='bucket-owner-full-control')
• moment – A simple Octave script which calculates the moment based on an input or output file and a specified order.
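The parsing steps in run-octave.py can be tried locally without AWS access. The following is a minimal sketch, not the tutorial's own file: the sample parameters document, the bucket name, and the outputUri variable name are hypothetical stand-ins (only inputDataS3Key and order appear in this tutorial), and the helper split_s3_uri is introduced here for illustration.

```python
import json
from urllib.parse import urlparse

# Hypothetical params document. In the container, AWS IoT Analytics writes the
# real one to /opt/ml/input/data/iotanalytics/params.
SAMPLE_PARAMS = """
{
  "Variables": {
    "order": "3",
    "inputDataS3Key": "input.txt",
    "outputUri": "s3://example-output-bucket/results/output.mat"
  }
}
"""


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key), as run-octave.py does."""
    parsed = urlparse(uri)
    # netloc holds the bucket; path starts with "/", so drop the first character.
    return parsed.netloc, parsed.path[1:]


variables = json.loads(SAMPLE_PARAMS)["Variables"]
bucket, key = split_s3_uri(variables["outputUri"])
print(bucket, key)
```

Keeping the URI handling in a small function makes it easy to check before building the image.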
1. Download the contents of each file. Create a new directory and place all the files in it and then cd to that directory.
2. Run the following command.
docker build -t octave-moment .
3. You should see a new image in your Docker repository. Verify it by running the following command.
docker image ls | grep octave-moment
Step 2: Upload the Docker image to an Amazon ECR repository
1. Create a repository in Amazon ECR.
aws ecr create-repository --repository-name octave-moment
2. Get the login to your Docker environment.
aws ecr get-login
3. Copy the output and run it. The output should look something like the following.
docker login -u AWS -p password -e none https://your-aws-account-id.dkr.ecr.region.amazonaws.com
4. Tag the image you created with the Amazon ECR repository tag.
docker tag your-image-id your-aws-account-id.dkr.ecr.region.amazonaws.com/octave-moment
5. Push the image to Amazon ECR.
docker push your-aws-account-id.dkr.ecr.region.amazonaws.com/octave-moment
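The repository address used by the tag and push commands follows a fixed pattern, so it can be assembled in code instead of typed by hand. The following is a sketch under assumptions: the helper names ecr_image_uri and create_repository are hypothetical, and the boto3 call is shown as one possible alternative to the create-repository CLI command, not as part of this tutorial.

```python
def ecr_image_uri(account_id, region, repository):
    """Build the Amazon ECR address used by 'docker tag' and 'docker push'."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}".format(account_id, region, repository)


def create_repository(repository, region):
    """Create the repository with boto3 and return its URI (requires AWS credentials)."""
    import boto3  # imported lazily so ecr_image_uri works without boto3 installed

    ecr = boto3.client("ecr", region_name=region)
    response = ecr.create_repository(repositoryName=repository)
    return response["repository"]["repositoryUri"]


# Example account ID and Region; replace them with your own values.
print(ecr_image_uri("123456789012", "us-east-1", "octave-moment"))
```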
Step 3: Upload your sample data to an Amazon S3 bucket
1. Download the following to a file named input.txt.
0.857549 -0.987565 -0.467288 -0.252233 -2.298007
0.030077 -1.243324 -0.692745 0.563276 0.772901
-0.508862 -0.404303 -1.363477 -1.812281 -0.296744
-0.203897 0.746533 0.048276 0.075284 0.125395
0.829358 1.246402 -1.310275 -2.737117 0.024629
1.206120 0.895101 1.075549 1.897416 1.383577
2. Create an Amazon S3 bucket called octave-sample-data-your-aws-account-id.
3. Upload the file input.txt to the Amazon S3 bucket you just created. You should now have a bucket named octave-sample-data-your-aws-account-id that contains the input.txt file.
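Steps 2 and 3 above can also be scripted. This is a minimal sketch assuming boto3 is installed and AWS credentials are configured; the function names sample_bucket_name and upload_input are hypothetical, and the Region handling reflects the Amazon S3 rule that buckets outside us-east-1 need an explicit location constraint.

```python
def sample_bucket_name(account_id):
    """Return the bucket name this tutorial uses for the sample data."""
    return "octave-sample-data-{}".format(account_id)


def upload_input(account_id, region, path="input.txt"):
    """Create the sample-data bucket and upload input.txt to it."""
    import boto3  # imported lazily so sample_bucket_name works without boto3

    s3 = boto3.client("s3", region_name=region)
    bucket = sample_bucket_name(account_id)
    if region == "us-east-1":
        # us-east-1 does not accept a LocationConstraint.
        s3.create_bucket(Bucket=bucket)
    else:
        s3.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    s3.upload_file(path, bucket, "input.txt")
    return bucket


# Example account ID; replace it with your own.
print(sample_bucket_name("123456789012"))
```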
Step 4: Create a container execution role
1. Copy the following to a file named role1.json. Replace your-aws-account-id with your AWS account ID and aws-region with the AWS region of your AWS resources.
Note: This example includes a global condition context key to protect against the confused deputy security problem. For more information, see the section called “Cross-service confused deputy prevention” (p. 104).
{ "Version": "2012-10-17", "Statement": [
"aws:SourceArn": "arn:aws:iotanalytics:aws-region:your-aws-account-id:dataset/DOC-EXAMPLE-DATASET"
} } ]
}
2. Create a role that gives access permissions to SageMaker and AWS IoT Analytics, using the file role1.json that you downloaded.
aws iam create-role --role-name container-execution-role --assume-role-policy-document file://role1.json
3. Download the following to a file named policy1.json and replace your-account-id with your account ID (see the second ARN under Statement:Resource).
{ "Version": "2012-10-17",
"arn:aws:s3:::octave-sample-data-your-account-id/*"
},
"Action": [
"ecr:GetAuthorizationToken", "ecr:GetDownloadUrlForLayer", "ecr:BatchGetImage",
"ecr:BatchCheckLayerAvailability", "logs:CreateLogGroup",
"logs:CreateLogStream", "logs:DescribeLogStreams", "logs:GetLogEvents",
4. Create an IAM policy, using the policy1.json file you just downloaded.
aws iam create-policy --policy-name ContainerExecutionPolicy --policy-document file://policy1.json
5. Attach the policy to the role.
aws iam attach-role-policy --role-name container-execution-role --policy-arn arn:aws:iam::your-account-id:policy/ContainerExecutionPolicy
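The role creation in this step can be sketched in code as well. The listing of role1.json above is incomplete, so the trust policy built here is an assumed shape, not a copy of that file: the two service principals (sagemaker.amazonaws.com and iotanalytics.amazonaws.com), the sts:AssumeRole action, the aws:SourceAccount condition, and the ArnLike operator are assumptions; only the aws:SourceArn value comes from this tutorial. The function names are hypothetical.

```python
import json


def build_trust_policy(account_id, region, dataset="DOC-EXAMPLE-DATASET"):
    """Build an assumed trust policy with confused-deputy condition keys."""
    source_arn = "arn:aws:iotanalytics:{}:{}:dataset/{}".format(region, account_id, dataset)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: both services need to assume the role.
                "Principal": {
                    "Service": ["sagemaker.amazonaws.com", "iotanalytics.amazonaws.com"]
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {"aws:SourceArn": source_arn},
                },
            }
        ],
    }


def create_execution_role(account_id, region, policy_arn):
    """Create the role and attach the policy (requires AWS credentials)."""
    import boto3  # imported lazily so build_trust_policy works without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName="container-execution-role",
        AssumeRolePolicyDocument=json.dumps(build_trust_policy(account_id, region)),
    )
    iam.attach_role_policy(RoleName="container-execution-role", PolicyArn=policy_arn)


# Example account ID and Region; replace them with your own values.
print(json.dumps(build_trust_policy("123456789012", "us-east-1"), indent=2))
```

Generating the document from the account ID and Region avoids leaving a placeholder unreplaced in the JSON file.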
Step 5: Create a dataset with a container action
1. Download the following to a file named cli-input.json and replace all instances of your-account-id and region with the appropriate values.
{
"datasetName": "octave_dataset", "actions": [
{
"actionName": "octave", "containerAction": {
"image": "your-account-id.dkr.ecr.region.amazonaws.com/octave-moment", "executionRoleArn": "arn:aws:iam::your-account-id:role/container-execution-role",
"stringValue": "octave-sample-data-your-account-id"
}, {
"name": "inputDataS3Key", "stringValue": "input.txt"
}, {
"name": "order", "stringValue": "3"
} ] } } ] }
2. Create a dataset using the file cli-input.json you just downloaded and edited.
aws iotanalytics create-dataset --cli-input-json file://cli-input.json
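Replacing the placeholders in cli-input.json by hand is easy to get wrong, so the substitution can be scripted. The following is a sketch only: the function name fill_placeholders is hypothetical, the template shown contains just the two fields from the listing above that carry placeholders, and the region placeholder is replaced only inside the Amazon ECR hostname so that no other text is touched.

```python
import json

# Two fields from cli-input.json that contain placeholders.
TEMPLATE = """
{
  "image": "your-account-id.dkr.ecr.region.amazonaws.com/octave-moment",
  "executionRoleArn": "arn:aws:iam::your-account-id:role/container-execution-role"
}
"""


def fill_placeholders(template, account_id, region):
    """Substitute the account ID everywhere and the Region in the ECR hostname."""
    filled = template.replace("your-account-id", account_id)
    return filled.replace(".dkr.ecr.region.", ".dkr.ecr.{}.".format(region))


# Example account ID and Region; replace them with your own values.
filled = fill_placeholders(TEMPLATE, "123456789012", "us-east-1")
# Parsing the result confirms that the edited text is still valid JSON.
print(json.loads(filled)["image"])
```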
Step 6: Invoke dataset content generation
1. Run the following command.
aws iotanalytics create-dataset-content --dataset-name octave_dataset
Step 7: Get dataset content
1. Run the following command.
aws iotanalytics get-dataset-content --dataset-name octave_dataset --version-id \$LATEST
2. You might need to wait several minutes until the DatasetContentState is SUCCEEDED.
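Because the content can take several minutes to become available, the check can be wrapped in a polling loop. This is a minimal sketch, assuming boto3 and configured credentials for the AWS call; the function names, the 15-second interval, and the attempt limit are hypothetical choices, and the dataset name is the datasetName value from cli-input.json. The call may raise an error if no content version exists yet, which this sketch does not handle.

```python
import time


def wait_for_content(fetch_state, interval_seconds=15, max_attempts=40, sleep=time.sleep):
    """Poll fetch_state() until it returns SUCCEEDED; raise on FAILED or timeout."""
    for _ in range(max_attempts):
        state = fetch_state()
        if state == "SUCCEEDED":
            return state
        if state == "FAILED":
            raise RuntimeError("dataset content generation failed")
        sleep(interval_seconds)
    raise TimeoutError("dataset content was not ready in time")


def fetch_state_from_aws(dataset_name="octave_dataset"):
    """Return the state of the latest dataset content (requires AWS credentials)."""
    import boto3  # imported lazily so wait_for_content works without boto3

    client = boto3.client("iotanalytics")
    response = client.get_dataset_content(datasetName=dataset_name, versionId="$LATEST")
    return response["status"]["state"]
```

To use it against your account, call wait_for_content(fetch_state_from_aws). Passing the fetch function as a parameter keeps the loop testable without network access.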
Step 8: Print the output on Octave
1. Use the Octave shell to print the output from the container by running the following command.
bash> octave
octave> load output.mat
octave> disp(M)
-0.016393 -0.098061 0.380311 -0.564377 -1.318744
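As a cross-check that needs neither Octave nor AWS, the same numbers can be recomputed locally. The sketch below rests on two assumptions that the tutorial does not state: that input.txt is laid out as six rows of five values, and that the moment script computes the central moment of each column (the p-th power of the deviations from the column mean, averaged over the rows). Under those assumptions, the values for order 3 should agree with the output above to about six decimal places.

```python
# Sample data from Step 3, assumed to be six rows of five columns.
DATA = """
0.857549 -0.987565 -0.467288 -0.252233 -2.298007
0.030077 -1.243324 -0.692745 0.563276 0.772901
-0.508862 -0.404303 -1.363477 -1.812281 -0.296744
-0.203897 0.746533 0.048276 0.075284 0.125395
0.829358 1.246402 -1.310275 -2.737117 0.024629
1.206120 0.895101 1.075549 1.897416 1.383577
"""


def central_moments(rows, order):
    """Return the central moment of the given order for each column."""
    result = []
    for column in zip(*rows):  # transpose rows into columns
        mean = sum(column) / len(column)
        result.append(sum((value - mean) ** order for value in column) / len(column))
    return result


rows = [[float(x) for x in line.split()] for line in DATA.strip().splitlines()]
moments = central_moments(rows, 3)
print(" ".join("{:.6f}".format(m) for m in moments))
```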
Visualizing (console)