AWS IoT Analytics
User Guide
AWS IoT Analytics: User Guide
Copyright © Amazon Web Services, Inc. and/or its affiliates. All rights reserved.
Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon.
Table of Contents
What is AWS IoT Analytics?
How to use AWS IoT Analytics
Key features
AWS IoT Analytics components and concepts
Access AWS IoT Analytics
Use cases
Getting started (console)
Sign in to the AWS IoT Analytics console
Create a channel
Create a data store
Create a pipeline
Create a dataset
Send message data with AWS IoT
Check the progress of AWS IoT messages
Access query results
Explore your data
Notebook templates
Getting started
Creating a channel
Creating a data store
Amazon S3 policies
File formats
Custom partitions
Creating a pipeline
Ingesting data to AWS IoT Analytics
Using the AWS IoT message broker
Using the BatchPutMessage API
Monitoring the ingested data
Creating a dataset
Querying data
Accessing the queried data
Exploring AWS IoT Analytics data
Amazon S3
AWS IoT Events
Amazon QuickSight
Jupyter Notebook
Keeping multiple versions of datasets
Message payload syntax
Working with AWS IoT SiteWise data
Create a dataset
Access dataset contents
Tutorial: Query AWS IoT SiteWise data
Pipeline activities
Channel activity
Datastore activity
AWS Lambda activity
Lambda function example 1
Lambda function example 2
AddAttributes activity
RemoveAttributes activity
SelectAttributes activity
Filter activity
DeviceRegistryEnrich activity
DeviceShadowEnrich activity
Math activity
Math activity operators and functions
RunPipelineActivity
Reprocessing channel messages
Parameters
Reprocessing channel messages (console)
Reprocessing channel messages (API)
Canceling channel reprocessing activities
Automating your workflow
Use cases
Using a Docker container
Custom Docker container input/output variables
Permissions
CreateDataset (Java and AWS CLI)
Example 1 -- creating a SQL dataset (java)
Example 2 -- creating a SQL dataset with a delta window (java)
Example 3 -- creating a container dataset with its own schedule trigger (java)
Example 4 -- creating a container dataset with a SQL dataset as a trigger (java)
Example 5 -- creating a SQL dataset (CLI)
Example 6 -- creating a SQL dataset with a delta window (CLI)
Containerizing a notebook
Enable containerization of notebook instances not created via AWS IoT Analytics console
Update your notebook containerization extension
Create a containerized image
Using a custom container
Visualizing data
Visualizing (console)
Visualizing (QuickSight)
Tagging
Tag basics
Using tags with IAM policies
Tag restrictions
SQL expressions
Supported SQL functionality
Supported data types
Supported functions
Troubleshoot common issues
Security
AWS Identity and Access Management
Audience
Authenticating with identities
Managing access
Working with IAM
Cross-service confused deputy prevention
IAM policy examples
Troubleshooting identity and access
Logging and monitoring
Automated monitoring tools
Manual monitoring tools
Monitoring with CloudWatch Logs
Monitoring with CloudWatch Events
Logging API calls with CloudTrail
Compliance validation
Resilience
Infrastructure security
Quotas
Commands
AWS IoT Analytics actions
AWS IoT Analytics data
Troubleshooting
How do I know if my messages are getting into AWS IoT Analytics?
Why is my pipeline losing messages? How do I fix it?
Why is there no data in my data store?
Why does my dataset just show __dt?
How do I code an event driven by the dataset completion?
How do I correctly configure my notebook instance to use AWS IoT Analytics?
Why can't I create notebooks in an instance?
Why aren't I seeing my datasets in Amazon QuickSight?
Why am I not seeing the containerize button on my existing Jupyter Notebook?
Why is my containerization plugin installation failing?
Why is my containerization plugin throwing an error?
Why don't I see my variables during the containerization?
What variables can I add to my container as an input?
How do I set my container output as an input for subsequent analysis?
Why is my container dataset failing?
Document history
Earlier updates
What is AWS IoT Analytics?
AWS IoT Analytics automates the steps required to analyze data from IoT devices. AWS IoT Analytics filters, transforms, and enriches IoT data before storing it in a time-series data store for analysis. You can set up the service to collect only the data you need from your devices, apply mathematical transforms to process the data, and enrich the data with device-specific metadata such as device type and location before storing it. Then, you can analyze your data by running queries using the built-in SQL query engine, or perform more complex analytics and machine learning inference. AWS IoT Analytics enables advanced data exploration through integration with Jupyter Notebook. AWS IoT Analytics also enables data visualization through integration with Amazon QuickSight. Amazon QuickSight is available only in specific AWS Regions.
Traditional analytics and business intelligence tools are designed to process structured data. Raw IoT data often comes from devices that record less structured data (such as temperature, motion, or sound). As a result, the data from these devices can have significant gaps, corrupted messages, and false readings that must be cleaned up before analysis can occur. Also, IoT data is often only meaningful in the context of other data from external sources. AWS IoT Analytics lets you address these issues and collect large amounts of device data, process messages, and store them. You can then query the data and analyze it.
AWS IoT Analytics includes pre-built models for common IoT use cases so that you can answer questions like which devices are about to fail or which customers are at risk of abandoning their wearable devices.
How to use AWS IoT Analytics
The following graphic shows an overview of how you can use AWS IoT Analytics.
Key features
Collect
• Integrated with AWS IoT Core—AWS IoT Analytics is fully integrated with AWS IoT Core so it can receive messages from connected devices as they stream in.
• Use a batch API to add data from any source—AWS IoT Analytics can receive data from any source through HTTP. That means that any device or service that is connected to the internet can send data to AWS IoT Analytics. For more information, see BatchPutMessage in the AWS IoT Analytics API Reference.
• Collect only the data you want to store and analyze—You can use the AWS IoT Analytics console to configure AWS IoT Analytics to receive messages from devices through MQTT topic filters in various formats and frequencies. AWS IoT Analytics validates that the data is within specific parameters you define and creates channels. Then, the service routes the channels to appropriate pipelines for message processing, transformation, and enrichment.
Process
• Cleanse and filter—AWS IoT Analytics lets you define AWS Lambda functions that are triggered when AWS IoT Analytics detects missing data, so you can run code to estimate and fill gaps. You can also define maximum and minimum filters and percentile thresholds to remove outliers in your data.
• Transform—AWS IoT Analytics can transform messages using mathematical or conditional logic that you define, so that you can perform common calculations such as converting Celsius to Fahrenheit.
• Enrich—AWS IoT Analytics can enrich data with external data sources such as a weather forecast, and then route the data to the AWS IoT Analytics data store.
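The filter and transform steps described above can be pictured with a small, service-independent sketch. This is only a conceptual illustration in plain Python, not how AWS IoT Analytics implements pipeline activities; the function names, the temperature attribute, and the -40 to 85 degree bounds are all assumptions made for the example.

```python
from typing import Optional


def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard Celsius to Fahrenheit conversion."""
    return celsius * 9.0 / 5.0 + 32.0


def process_message(message: dict,
                    min_c: float = -40.0,
                    max_c: float = 85.0) -> Optional[dict]:
    """Filter, then transform, one message (hypothetical helper).

    Returns None when the temperature reading is missing or outside the
    assumed [min_c, max_c] range; otherwise returns a copy of the message
    with an added temperature_f attribute.
    """
    temp_c = message.get("temperature")
    if temp_c is None or not (min_c <= temp_c <= max_c):
        return None  # dropped by the minimum/maximum filter
    transformed = dict(message)  # leave the caller's message unchanged
    transformed["temperature_f"] = celsius_to_fahrenheit(temp_c)
    return transformed
```

In the service itself, the equivalent logic would be expressed as pipeline activities (for example, a filter activity followed by a math activity) rather than as application code.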
Store
• Time-series data store—AWS IoT Analytics stores the device data in an optimized time-series data store for faster retrieval and analysis. You can also manage access permissions, implement data retention policies and export your data to external access points.
• Store processed and raw data—AWS IoT Analytics stores the processed data and also automatically stores the raw ingested data so you can process it at a later time.
Analyze
• Run ad-hoc SQL queries—AWS IoT Analytics provides a SQL query engine so you can run ad-hoc queries and get results quickly. The service enables you to use standard SQL queries to extract data from the data store to answer questions like the average distance traveled for a fleet of connected vehicles or how many doors in a smart building are locked after 7 PM. These queries can be reused even if connected devices, fleet size, and analytic requirements change.
• Time-series analysis—AWS IoT Analytics supports time-series analysis so you can analyze the performance of devices over time and understand how and where they are being used, continuously monitor device data to predict maintenance issues, and monitor sensors to predict and react to environmental conditions.
• Hosted notebooks for sophisticated analytics and machine learning—AWS IoT Analytics includes support for hosted notebooks in Jupyter Notebook for statistical analysis and machine learning.
The service includes a set of notebook templates that contain AWS-authored machine learning models and visualizations. You can use the templates to get started with IoT use cases related to device failure profiling, forecasting events such as low usage that might signal the customer will abandon the product, or segmenting devices by customer usage levels (for example heavy users, weekend users) or device health. After you author a notebook, you can containerize and execute it on a schedule that you specify. For more information, see Automating your workflow.
• Prediction—You can do statistical classification through a method called logistic regression. You can also use Long Short-Term Memory (LSTM), which is a powerful neural network technique for predicting the output or state of a process that varies over time. The pre-built notebook templates also support the K-means clustering algorithm for device segmentation, which clusters your devices into cohorts of like devices. These templates are typically used to profile device health and device state such as HVAC units in a chocolate factory or wear and tear of blades on a wind turbine. Again, these notebook templates can be containerized and executed on a schedule.
Build and visualize
• Amazon QuickSight integration—AWS IoT Analytics provides a connector to Amazon QuickSight so that you can visualize your data sets in a QuickSight dashboard.
• Console integration—You can also visualize the results of your ad-hoc analysis in the embedded Jupyter Notebook in the AWS IoT Analytics console.
AWS IoT Analytics components and concepts
Channel
A channel collects data from an MQTT topic and archives the raw, unprocessed messages before publishing the data to a pipeline. You can also send messages to a channel directly using the BatchPutMessage API. The unprocessed messages are stored in an Amazon Simple Storage Service (Amazon S3) bucket that you or AWS IoT Analytics manage.
Pipeline
A pipeline consumes messages from a channel and enables you to process the messages before storing them in a data store. The processing steps, called activities (Pipeline activities), perform transformations on your messages such as removing, renaming or adding message attributes, filtering messages based on attribute values, invoking your Lambda functions on messages for advanced processing or performing mathematical transformations to normalize device data.
Data store
Pipelines store their processed messages in a data store. A data store is not a database, but it is a scalable and queryable repository of your messages. You can have multiple data stores for messages coming from different devices or locations, or filtered by message attributes depending on your pipeline configuration and requirements. As with unprocessed channel messages, a data store's processed messages are stored in an Amazon S3 bucket that you or AWS IoT Analytics manage.
Data set
You retrieve data from a data store by creating a data set. AWS IoT Analytics enables you to create a SQL data set or a container data set.
After you have a data set, you can explore and gain insights into your data through integration using Amazon QuickSight. You can also perform more advanced analytical functions through integration with Jupyter Notebook. Jupyter Notebook provides powerful data science tools that can perform machine learning and a range of statistical analyses. For more information, see Notebook templates.
You can send data set contents to an Amazon S3 bucket, enabling integration with your existing data lakes or access from in-house applications and visualization tools. You can also send data set contents as an input to AWS IoT Events, a service which enables you to monitor devices or processes for failures or changes in operation, and to trigger additional actions when such events occur.
SQL data set
A SQL data set is similar to a materialized view from a SQL database. You can create a SQL data set by applying a SQL action. SQL data sets can be generated automatically on a recurring schedule by specifying a trigger.
Container data set
A container data set enables you to automatically run your analysis tools and generate results. For more information, see Automating your workflow. It brings together a SQL data set as input, a Docker container with your analysis tools and needed library files, input and output variables, and an optional schedule trigger. The input and output variables tell the executable image where to get the data and store the results. The trigger can run your analysis when a SQL data set finishes creating its content or according to a time schedule expression. A container data set automatically runs, generates and then saves the results of the analysis tools.
Trigger
You can automatically create a data set by specifying a trigger. The trigger can be a time interval (for example, create this data set every two hours) or when another data set's content has been created (for example, create this data set when myOtherDataset finishes creating its content). Or, you can generate data set content manually by using the CreateDatasetContent API.
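As a rough sketch of the three options above, the following Python helpers build the two trigger shapes accepted in the triggers list of a CreateDataset request and show a manual refresh through the AWS SDK for Python (boto3). The helper names are hypothetical, the cron expression is one assumed way to express "every two hours", and the manual call assumes that boto3 is installed and that credentials and a Region are configured.

```python
def schedule_trigger(expression: str) -> dict:
    """Trigger that creates data set content on a time schedule."""
    return {"schedule": {"expression": expression}}


def dataset_trigger(dataset_name: str) -> dict:
    """Trigger that fires when another data set finishes creating content."""
    return {"dataset": {"name": dataset_name}}


def create_content_now(dataset_name: str) -> str:
    """Manually generate data set content; returns the new content version ID."""
    import boto3  # imported here so the helpers above work without the SDK

    client = boto3.client("iotanalytics")
    response = client.create_dataset_content(datasetName=dataset_name)
    return response["versionId"]


# Example trigger lists (values are illustrative only).
every_two_hours = [schedule_trigger("cron(0 0/2 * * ? *)")]
after_other_dataset = [dataset_trigger("myOtherDataset")]
```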
Docker container
You can create your own Docker container to package your analysis tools or use options that SageMaker provides. For more information, see Docker container. You can store a container in an Amazon ECR registry that you specify so it is available to install on your desired platform. Docker containers are capable of running your custom analytical code prepared with Matlab, Octave, Wise.io, SPSS, R, Fortran, Python, Scala, Java, C++, and so on. For more information, see Containerizing a notebook.
Delta windows
Delta windows are a series of user-defined, non-overlapping and contiguous time intervals. Delta windows enable you to create the data set content with, and perform analysis on, new data that has arrived in the data store since the last analysis. You create a delta window by setting the deltaTime in the filters portion of a queryAction of a data set. For more information, see the CreateDataset API. Usually, you'll want to create the data set content automatically by also setting up a time interval trigger (triggers:schedule:expression). This lets you filter messages that have arrived during a specific time window, so the data contained in messages from previous time windows doesn't get counted twice. For more information, see Example 6 -- creating a SQL dataset with a Delta window (CLI).
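To make the shape of such a request concrete, here is a minimal sketch in Python that assembles the arguments for a CreateDataset call with a deltaTime filter and a schedule trigger. The function name, the action name, the -180 second offset, the five-minute cron schedule, and the assumption that each message carries a Unix timestamp in an attribute named time are all illustrative choices, not values taken from this guide.

```python
def delta_window_dataset_request(dataset_name: str,
                                 datastore_name: str,
                                 offset_seconds: int = -180,
                                 time_expression: str = "from_unixtime(time)",
                                 schedule: str = "cron(0/5 * * * ? *)") -> dict:
    """Build keyword arguments for CreateDataset with a delta window.

    offset_seconds allows for late-arriving messages; time_expression
    tells the service how to read each message's timestamp (assumed
    here to be a Unix epoch value in an attribute called "time").
    """
    return {
        "datasetName": dataset_name,
        "actions": [
            {
                "actionName": "delta_query",  # hypothetical action name
                "queryAction": {
                    "sqlQuery": f"SELECT * FROM {datastore_name}",
                    "filters": [
                        {
                            "deltaTime": {
                                "offsetSeconds": offset_seconds,
                                "timeExpression": time_expression,
                            }
                        }
                    ],
                },
            }
        ],
        # The schedule trigger re-runs the query for each new window.
        "triggers": [{"schedule": {"expression": schedule}}],
    }


request = delta_window_dataset_request("my_delta_dataset", "my_data_store")
# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("iotanalytics").create_dataset(**request)
```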
Access AWS IoT Analytics
As part of AWS IoT, AWS IoT Analytics provides the following interfaces to enable your devices to generate data and your applications to interact with the data they generate:
AWS Command Line Interface (AWS CLI)
Run commands for AWS IoT Analytics on Windows, macOS, and Linux. These commands enable you to create and manage things, certificates, rules, and policies. To get started, see the AWS Command Line Interface User Guide. For more information about the commands for AWS IoT, see iot in the AWS Command Line Interface Reference.
Important
Use the aws iotanalytics command to interact with AWS IoT Analytics. Use the aws iot command to interact with other parts of the IoT system.
AWS IoT API
Build your IoT applications using HTTP or HTTPS requests. These API actions enable you to create and manage things, certificates, rules, and policies. For more information, see Actions in the AWS IoT API Reference.
AWS SDKs
Build your AWS IoT Analytics applications using language-specific APIs. These SDKs wrap the HTTP and HTTPS API and enable you to program in any of the supported languages. For more information, see AWS SDKs and tools.
AWS IoT Device SDKs
Build applications that run on your devices that send messages to AWS IoT Analytics. For more information, see AWS IoT SDKs.
AWS IoT Analytics Console
You can build the components to visualize the results in the AWS IoT Analytics console.
Use cases
Predictive maintenance
AWS IoT Analytics provides templates to build predictive maintenance models and apply them to your devices. For example, you can use AWS IoT Analytics to predict when heating and cooling systems are likely to fail on connected cargo vehicles so the vehicles can be rerouted to prevent shipment damage. Or, an auto manufacturer can detect which of its customers have worn brake pads and alert them to seek maintenance for their vehicles.
Proactive replenishing of supplies
AWS IoT Analytics lets you build IoT applications that can monitor inventories in real time. For example, a food and drink company can analyze data from food vending machines and proactively reorder merchandise whenever the supply is running low.
Process efficiency scoring
With AWS IoT Analytics, you can build IoT applications that constantly monitor the efficiency of different processes and take action to improve the process. For example, a mining company can increase the efficiency of its ore trucks by maximizing the load for each trip. With AWS IoT Analytics, the company can identify the most efficient load for a location or truck over time, compare any deviations from the target load in real time, and better plan loading guidelines to improve efficiency.
Smart agriculture
AWS IoT Analytics can enrich IoT device data with contextual metadata using AWS IoT registry data or public data sources so that your analysis factors in time, location, temperature, altitude, and other environmental conditions. With that analysis, you can write models that output recommended actions for your devices to take in the field. For example, to determine when to water, irrigation systems might enrich humidity sensor data with data on rainfall, enabling more efficient water usage.
Getting started with AWS IoT Analytics (console)
Use this tutorial to create the AWS IoT Analytics resources (also known as components) that you need to discover useful insights about your IoT device data.
Notes
• If you enter uppercase characters in the following tutorial, AWS IoT Analytics automatically changes them to lowercase.
• The AWS IoT Analytics console has a one-click getting started feature to create a channel, pipeline, data store, and dataset. You can find this feature when you sign in to the AWS IoT Analytics console.
• This tutorial walks you through each step to create your AWS IoT Analytics resources.
Follow the instructions below to create an AWS IoT Analytics channel, pipeline, data store, and dataset.
The tutorial also shows you how to use the AWS IoT Core console to send messages that will be ingested into AWS IoT Analytics.
Topics
• Sign in to the AWS IoT Analytics console
• Create a channel
• Create a data store
• Create a pipeline
• Create a dataset
• Send message data with AWS IoT
• Check the progress of AWS IoT messages
• Access query results
• Explore your data
• Notebook templates
Sign in to the AWS IoT Analytics console
To get started, you must have an AWS account. If you already have an AWS account, navigate to the AWS IoT Analytics console at https://console.aws.amazon.com/iotanalytics/.
If you don't have an AWS account, follow these steps to create one.
To create an AWS account
1. Open https://portal.aws.amazon.com/billing/signup.
2. Follow the online instructions.
Part of the sign-up procedure involves receiving a phone call and entering a verification code on the phone keypad.
3. Sign in to the AWS Management Console and navigate to the AWS IoT Analytics console at https://console.aws.amazon.com/iotanalytics/.
Create a channel
A channel collects and archives raw, unprocessed, and unstructured IoT device data. Follow these steps to create your channel.
To create a channel
1. In the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), in the Prepare your data with AWS IoT Analytics section, choose View channels.
Tip
You can also choose Channels from the navigation pane.
2. On the Channels page, choose Create channel.
3. On the Specify channel details page, enter the details about your channel.
a. Enter a channel name that is unique and that you can easily identify.
b. (Optional) For Tags, add one or more custom tags (key-value pairs) to your channel. Tags can help you identify the resources that you create for AWS IoT Analytics.
c. Choose Next.
4. AWS IoT Analytics stores your raw, unprocessed IoT device data in an Amazon Simple Storage Service (Amazon S3) bucket. You can choose your own Amazon S3 bucket, which you can access and manage, or AWS IoT Analytics can manage the Amazon S3 bucket for you.
a. In this tutorial, for Storage type, choose Service managed storage.
b. For Choose how long to store your raw data, choose Indefinitely.
c. Choose Next.
5. On the Configure source page, enter information for AWS IoT Analytics to collect message data from AWS IoT Core.
a. Enter an AWS IoT Core topic filter, for example, update/environment/dht1. Later in this tutorial, you will use this topic filter to send message data to your channel.
b. Choose an IAM role or create a new role.
c. Choose Next.
6. Review your choices and then choose Create channel.
7. Verify that your new channel appears on the Channels page.
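For readers who prefer to script this step, the following is a minimal sketch of the equivalent request using the AWS SDK for Python (boto3). It assumes boto3 is installed and that credentials and a Region are configured; the helper name and the channel name my_channel are hypothetical. Note that the CreateChannel operation itself takes no topic filter, so connecting an AWS IoT Core topic to the channel (which the console sets up in step 5) would need to be handled separately and is not shown here.

```python
def channel_request(channel_name: str) -> dict:
    """Build keyword arguments for CreateChannel that mirror the tutorial:
    service managed storage and raw data kept indefinitely."""
    return {
        "channelName": channel_name,
        "channelStorage": {"serviceManagedS3": {}},
        "retentionPeriod": {"unlimited": True},
    }


def create_channel(channel_name: str) -> dict:
    """Send the request (requires boto3, credentials, and a Region)."""
    import boto3  # imported here so channel_request works without the SDK

    client = boto3.client("iotanalytics")
    return client.create_channel(**channel_request(channel_name))


request = channel_request("my_channel")  # hypothetical channel name
```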
Create a data store
A data store receives and stores your message data. A data store isn't a database. Instead, a data store is a scalable and queryable repository in an Amazon S3 bucket. You can use multiple data stores for messages from different devices or locations. Or, you can filter message data depending on your pipeline configuration and requirements.
Follow these steps to create a data store.
To create a data store
1. In the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), in the Prepare your data with AWS IoT Analytics section, choose View data stores.
2. On the Data stores page, choose Create data store.
3. On the Specify data store details page, enter basic information about your data store.
a. For Data store ID, enter a unique data store ID. You can't change this ID after you create it.
b. (Optional) For Tags, choose Add new tag to add one or more custom tags (key-value pairs) to your data store. Tags can help you identify the resources that you create for AWS IoT Analytics.
c. Choose Next.
4. On the Configure storage type page, specify how to store your data.
a. For Storage type, choose Service managed storage.
b. For Configure how long you want to keep your processed data, choose Indefinitely.
c. Choose Next.
5. AWS IoT Analytics data stores support JSON and Parquet file formats. For your data store data format, choose JSON or Parquet. For more information about the file types that AWS IoT Analytics supports, see File formats.
Choose Next.
6. (Optional) AWS IoT Analytics supports custom partitions in your data store so you can query on pruned data to improve latency. For more information about supported custom partitions, see Custom partitions.
Choose Next.
7. Review your choices and then choose Create data store.
8. Verify that your new data store appears on the Data stores page.
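The same choices can be expressed as a CreateDatastore request. The sketch below uses the AWS SDK for Python (boto3) and assumes service managed storage, indefinite retention, and the JSON file format; the helper name and the data store name my_data_store are hypothetical, and the Parquet option (which also needs a schema definition) is deliberately left out.

```python
def datastore_request(datastore_name: str) -> dict:
    """Build keyword arguments for CreateDatastore that mirror the tutorial."""
    return {
        "datastoreName": datastore_name,
        "datastoreStorage": {"serviceManagedS3": {}},
        "retentionPeriod": {"unlimited": True},
        # JSON format; Parquet would use "parquetConfiguration" instead.
        "fileFormatConfiguration": {"jsonConfiguration": {}},
    }


def create_datastore(datastore_name: str) -> dict:
    """Send the request (requires boto3, credentials, and a Region)."""
    import boto3  # imported here so datastore_request works without the SDK

    client = boto3.client("iotanalytics")
    return client.create_datastore(**datastore_request(datastore_name))


request = datastore_request("my_data_store")  # hypothetical data store name
```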
Create a pipeline
You must create a pipeline to connect a channel to a data store. A basic pipeline only specifies the channel that collects the data and identifies the data store to which the messages are sent. For more information, see Pipeline activities.
For this tutorial, you create a pipeline that only connects a channel to a data store. Later, you can add pipeline activities to process this data.
Follow these steps to create a pipeline.
To create a pipeline
1. In the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), in the Prepare your data with AWS IoT Analytics section, choose View pipelines.
Tip
You can also choose Pipelines from the navigation pane.
2. On the Pipelines page, choose Create pipeline.
3. Enter the details about your pipeline.
a. In Setup pipeline ID and sources, enter a pipeline name.
b. Choose your pipeline's source, which is an AWS IoT Analytics channel that your pipeline will read messages from.
c. Specify your pipeline's output, which is the data store where your processed message data is stored.
d. (Optional) For Tags, add one or more custom tags (key-value pairs) to your pipeline.
e. On the Infer message attributes page, enter an attribute name and an example value, choose a data type from the list, and then choose Add attribute.
f. Repeat the previous step for as many attributes as you need, and then choose Next.
g. You won't add any pipeline activities right now. On the Enrich, transform, and filter messages page, choose Next.
4. Review your choices and then choose Create pipeline.
5. Verify that your new pipeline appears on the Pipelines page.
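A basic pipeline like the one in this procedure consists of just two activities: one that reads from the channel and one that writes to the data store. The following sketch builds such a CreatePipeline request for the AWS SDK for Python (boto3). The helper name, the activity names read_channel and write_datastore, and the resource names are hypothetical.

```python
def basic_pipeline_request(pipeline_name: str,
                           channel_name: str,
                           datastore_name: str) -> dict:
    """Build keyword arguments for CreatePipeline with no processing
    activities: the channel activity hands messages straight to the
    datastore activity through its "next" field."""
    return {
        "pipelineName": pipeline_name,
        "pipelineActivities": [
            {
                "channel": {
                    "name": "read_channel",
                    "channelName": channel_name,
                    "next": "write_datastore",
                }
            },
            {
                "datastore": {
                    "name": "write_datastore",
                    "datastoreName": datastore_name,
                }
            },
        ],
    }


request = basic_pipeline_request("my_pipeline", "my_channel", "my_data_store")
# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("iotanalytics").create_pipeline(**request)
```

Additional activities, when you add them later, would sit between these two and be chained through their own next fields.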
Note
You created AWS IoT Analytics resources so that they can do the following:
• Collect raw, unprocessed IoT device message data with a channel.
• Store your IoT device message data in a data store.
• Clean, filter, transform, and enrich your data with a pipeline.
Next, you will create an AWS IoT Analytics SQL dataset to discover useful insights about your IoT device.
Create a dataset
Note
A dataset is typically a collection of data that might or might not be organized in tabular form. In contrast, AWS IoT Analytics creates your dataset by applying a SQL query to data in your data store.
You now have a channel that routes raw message data to a pipeline that stores data in a data store where it can be queried. To query the data, you create a dataset. A dataset contains SQL statements and expressions that you use to query the data store along with an optional schedule that repeats the query at a day and time that you specify. You can use expressions similar to Amazon CloudWatch schedule expressions to create the optional schedules.
To create a dataset
1. In the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), in the left navigation pane, choose Datasets.
2. On the Create dataset page, choose Create SQL.
3. On the Specify dataset details page, specify the details of your dataset.
a. Enter a name for your dataset.
b. For Data store source, choose the unique ID that identifies the data store that you created earlier.
c. (Optional) For Tags, add one or more custom tags (key-value pairs) to your dataset.
4. Use SQL expressions to query your data and answer analytical questions. The results of your query are stored in this dataset.
a. In the Author query field, enter a SQL query that uses a wildcard to show up to five rows of data.
SELECT * FROM my_data_store LIMIT 5
For more information about supported SQL functionality in AWS IoT Analytics, see SQL expressions in AWS IoT Analytics.
b. You can choose Test query to validate that your input is correct and display the results in a table following the query.
Note
• At this point in the tutorial, your data store might be empty. Running a SQL query on an empty data store won't return results, so you might see only __dt.
• Athena limits the maximum number of running queries, so limit your SQL query to a reasonable size so that it doesn't run for an extended period.
We suggest using a LIMIT clause in your query during testing. After the test succeeds, you can remove this clause.
5. (Optional) When you create dataset contents using data from a specified time frame, some data might not arrive in time for processing. To allow for a delay, you can specify an offset, or delta. For more information, see Getting late data notifications through Amazon CloudWatch Events.
You won't configure a data selection filter at this point. On the Configure data selection filter page, choose Next.
6. (Optional) You can schedule this query to run regularly to refresh the dataset. Dataset schedules can be created and edited at any time.
You won't schedule a recurring run of the query at this point, so on the Set query schedule page choose Next.
7. AWS IoT Analytics will create versions of this dataset content and store your analytics results for the specified period. We recommend 90 days; however, you can set a custom retention policy.
You can also limit the number of stored versions of your dataset content.
For this tutorial, you can keep the default dataset retention period of Indefinitely and keep Versioning disabled. On the Configure the results of your analytics page, choose Next.
8. (Optional) You can configure the delivery rules of your dataset results to a specific destination, such as AWS IoT Events.
You won't deliver your results elsewhere in this tutorial, so on the Configure dataset content delivery rules page, choose Next.
9. Review your choices and then choose Create dataset.
10. Verify that your new dataset appears on the Datasets page.
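The dataset from this procedure can also be described as a CreateDataset request. The sketch below, for the AWS SDK for Python (boto3), reuses the tutorial's query with its LIMIT clause and keeps results indefinitely with no schedule. The helper name, the action name sample_rows, and the dataset name are hypothetical; my_data_store is the data store name used in the example query above.

```python
def sql_dataset_request(dataset_name: str,
                        datastore_name: str,
                        limit: int = 5) -> dict:
    """Build keyword arguments for CreateDataset with a single SQL action."""
    return {
        "datasetName": dataset_name,
        "actions": [
            {
                "actionName": "sample_rows",  # hypothetical action name
                "queryAction": {
                    # LIMIT keeps test queries small, as suggested above.
                    "sqlQuery": f"SELECT * FROM {datastore_name} LIMIT {limit}",
                },
            }
        ],
        "retentionPeriod": {"unlimited": True},
    }


request = sql_dataset_request("my_dataset", "my_data_store")
# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("iotanalytics").create_dataset(**request)
```

Because no trigger is included, content for this dataset would only be generated when you request it, for example with the CreateDatasetContent operation.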
Send message data with AWS IoT
If you have a channel that routes data to a pipeline, which stores data in a data store where it can be queried, then you're ready to send IoT device data into AWS IoT Analytics. You can send data into AWS IoT Analytics by using the following options:
• Use the AWS IoT message broker.
• Use the AWS IoT Analytics BatchPutMessage API operation.
In the following steps, you send message data from the AWS IoT message broker in the AWS IoT Core console so that AWS IoT Analytics can ingest this data.
Note
When you create topic names for your messages, note the following:
• Topic names are not case sensitive. Fields named example and EXAMPLE in the same payload are considered duplicates.
• Topic names can't begin with the $ character. Topics that begin with $ are reserved topics and can only be used by AWS IoT.
• Don't include personally identifiable information in your topic names because this information can appear in unencrypted communications and reports.
• AWS IoT Core can't send messages between AWS accounts or AWS Regions.
To send message data with AWS IoT
1. Sign in to the AWS IoT console.
2. In the navigation pane, choose Test, and then choose MQTT test client.
3. On the MQTT test client page, choose Publish to a topic.
4. For Topic name, enter a name that will match the topic filter that you entered when you created a channel. This example uses update/environment/dht1.
5. For Message payload, enter the following JSON contents.
{ "thingid": "dht1", "temperature": 26, "humidity": 29,
"datetime": "2018-01-26T07:06:01"
}
6. (Optional) Choose Add Configuration for additional message protocol options.
7. Choose Publish.
This publishes a message that is captured by your channel. Your pipeline then routes the message to your data store.
Check the progress of AWS IoT messages
You can check that messages are being ingested into your channel by following these steps.
To check the progress of AWS IoT messages
1. Sign in to the AWS IoT Analytics console at https://console.aws.amazon.com/iotanalytics/.
2. In the navigation pane, choose Channels, and then choose the channel name that you created earlier.
3. On the Channel's details page, scroll down to the Monitoring section, and then adjust the displayed time frame (1h 3h 12h 1d 3d 1w). Choose a value such as 1w to view data for the last week.
You can use a similar feature to monitor for pipeline activity runtime and errors on the Pipeline's details page. In this tutorial, you haven't specified activities as part of the pipeline, so you shouldn't see any runtime errors.
To monitor pipeline activity
1. In the navigation pane, choose Pipelines, and then choose the name of the pipeline that you created earlier.
2. On the Pipeline's details page, scroll down to the Monitoring section, and then adjust the displayed time frame by choosing one of the time frame indicators (1h 3h 12h 1d 3d 1w).
Access query results
The dataset content is a file containing the result of your query, in CSV format.
1. In the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), in the left navigation pane, choose Datasets.
2. On the Datasets page, choose the name of the dataset that you created previously.
3. On the dataset information page, in the upper-right corner, choose Run now.
4. To check if the dataset is ready, look under the dataset for a message similar to You’ve successfully started the query for your dataset. The Dataset content tab contains the query results and displays Succeeded.
5. To preview the results of your successful query, on the Dataset contents tab, select the query name.
To view or save the CSV file that contains the query results, choose Download.
Note
AWS IoT Analytics can embed the HTML portion of a Jupyter Notebook on the Dataset contents page. For more information, see Visualizing AWS IoT Analytics data with the console (p. 88).
Explore your data
You have several options for storing, analyzing, and visualizing your data.
Amazon Simple Storage Service
You can send dataset contents to an Amazon S3 bucket, enabling integration with your existing data lakes or access from in-house applications and visualization tools. See the field contentDeliveryRules::destination::s3DestinationConfiguration in the CreateDataset operation.
AWS IoT Events
You can send dataset contents as an input to AWS IoT Events, a service that enables you to monitor devices or processes for failures or changes in operation, and to initiate additional actions when such events occur.
To do this, create a dataset using the CreateDataset operation and specify an AWS IoT Events input in the field contentDeliveryRules::destination::iotEventsDestinationConfiguration::inputName. You must also specify the roleArn of a role that grants AWS IoT Analytics permission to run iotevents:BatchPutMessage.
Whenever the dataset contents are created, AWS IoT Analytics sends each dataset content entry as a message to the specified AWS IoT Events input. For example, suppose your dataset contains the following content.
"what","who","dt"
"overflow","sensor01","2019-09-16 09:04:00.000"
"overflow","sensor02","2019-09-16 09:07:00.000"
"underflow","sensor01","2019-09-16 11:09:00.000"
...
Then AWS IoT Analytics sends messages that contain fields like the following.
{ "what": "overflow", "who": "sensor01", "dt": "2019-09-16 09:04:00.000" }
{ "what": "overflow", "who": "sensor02", "dt": "2019-09-16 09:07:00.000" }
You will want to create an AWS IoT Events input that recognizes the fields you are interested in (one or more of what, who, dt) and to create an AWS IoT Events detector model that uses these input fields in events to trigger actions or set internal variables.
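The row-to-message mapping described above can be previewed locally before you configure a delivery rule. The following is a minimal sketch in Python that uses only the standard library; the function name dataset_rows_to_messages is hypothetical, and the code only illustrates the shape of the messages. It is not how the service itself performs the delivery.

```python
import csv
import io
import json


def dataset_rows_to_messages(csv_text):
    """Convert CSV dataset content into one JSON message string per row."""
    # DictReader uses the first CSV line ("what","who","dt") as field names.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]


# Sample content copied from the example above.
sample = (
    '"what","who","dt"\n'
    '"overflow","sensor01","2019-09-16 09:04:00.000"\n'
    '"overflow","sensor02","2019-09-16 09:07:00.000"\n'
)

for message in dataset_rows_to_messages(sample):
    print(message)
```

Each printed line carries the what, who, and dt fields of one dataset row, which are the fields that an AWS IoT Events input would need to recognize.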
Jupyter Notebook
Jupyter Notebook is an open source solution for using scripting languages to run ad-hoc data exploration and advanced analyses. You can dive deep and apply more complex analyses and use machine learning methods, such as k-means clustering and regression models for prediction, on your IoT device data.
AWS IoT Analytics uses Amazon SageMaker notebook instances to host its Jupyter Notebooks.
Before you create a notebook instance, you must create a relationship between AWS IoT Analytics and Amazon SageMaker:
1. Navigate to the SageMaker console and create a notebook instance:
a. Fill in the details, and then choose Create a new role. Make a note of the role ARN.
b. Create a notebook instance.
2. Go to the IAM console and modify the SageMaker role:
a. Open the role. It should have one managed policy.
b. Choose Add inline policy, and then for Service, choose iotAnalytics. Choose Select actions, and then enter GetDatasetContent in the search box and choose it. Choose Review Policy.
c. Review the policy for accuracy, enter a name, and then choose Create policy.
This gives the newly created role permission to read a dataset from AWS IoT Analytics.
1. Return to the AWS IoT Analytics console (https://console.aws.amazon.com/iotanalytics/), and in the left navigation pane, choose Notebooks. On the Notebooks page, choose Create notebook.
2. On the Select a template page, choose IoTA blank template.
3. On the Set up notebook page, enter a name for your notebook. In Select dataset source, choose the dataset you created earlier. In Select a notebook instance, choose the notebook instance you created in SageMaker.
4. After you review your choices, choose Create Notebook.
5. On the Notebooks page, your notebook instance will open in the Amazon SageMaker console.
Notebook templates
The AWS IoT Analytics notebook templates contain AWS authored machine learning models and visualizations to help you get started with AWS IoT Analytics use cases. You can use these notebook templates to learn more or reuse them to fit your IoT device data and deliver immediate value.
You can find the following notebook templates in the AWS IoT Analytics console:
• Detecting contextual anomalies – Application of contextual anomaly detection in measured wind speed with a Poisson Exponentially Weighted Moving Average (PEWMA) model.
• Solar panel output forecasting – Application of piecewise, seasonal, and linear time series models to predict the output of solar panels.
• Predictive maintenance on jet engines – Application of multivariate Long Short-Term Memory (LSTM) neural networks and logistic regression to predict jet engine failure.
• Smart home customer segmentation – Application of k-means clustering and Principal Component Analysis (PCA) to detect different customer segments in smart home usage data.
• Smart city congestion forecasting – Application of LSTM to predict the utilization rates for city highways.
• Smart city air quality forecasting – Application of LSTM to predict particulate pollution in city centers.
Getting started with AWS IoT Analytics
This section discusses the basic commands you use to collect, store, process, and query your device data using AWS IoT Analytics. The examples shown here use the AWS Command Line Interface (AWS CLI). For more information on the AWS CLI, see the AWS Command Line Interface User Guide. For more information about the CLI commands available for AWS IoT, see iot in the AWS Command Line Interface Reference.
Important
Use the aws iotanalytics command to interact with AWS IoT Analytics using the AWS CLI.
Use the aws iot command to interact with other parts of the IoT system using the AWS CLI.
Note
As you enter the names of AWS IoT Analytics entities (channel, dataset, data store, and pipeline) in the examples that follow, be aware that the system automatically changes any uppercase letters to lowercase. Entity names must start with a lowercase letter and contain only lowercase letters, underscores, and digits.
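If you generate entity names in a script, you can apply the naming rules from the preceding note before you call the AWS CLI. The following is a minimal sketch in Python; the helper normalize_entity_name is hypothetical, and the pattern encodes only the rules stated in the note. Any additional limits that the service enforces (for example, on length) are not modeled.

```python
import re

# Pattern taken from the note above: a lowercase letter first, followed by
# lowercase letters, digits, or underscores.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def normalize_entity_name(name):
    """Lowercase a name as the note describes, then check the allowed characters."""
    lowered = name.lower()
    if not _NAME_PATTERN.fullmatch(lowered):
        raise ValueError(f"invalid entity name: {name!r}")
    return lowered


print(normalize_entity_name("MyChannel"))  # prints mychannel
```

A name such as my-channel or 1channel raises ValueError, which lets a script fail early rather than at creation time.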
Creating a channel
A channel collects and archives raw, unprocessed message data before publishing this data to a pipeline.
Incoming messages are sent to a channel, so the first step is to create a channel for your data.
aws iotanalytics create-channel --channel-name mychannel
If you want AWS IoT messages to be ingested into AWS IoT Analytics, you can create an AWS IoT Rules Engine rule to send the messages to this channel. This is shown later in Ingesting data to AWS IoT Analytics (p. 22). Another way to get the data into a channel is to use the AWS IoT Analytics command BatchPutMessage.
To list the channels you have already created:
aws iotanalytics list-channels
To get more information about a channel:
aws iotanalytics describe-channel --channel-name mychannel
Unprocessed channel messages are stored in an Amazon S3 bucket managed by AWS IoT Analytics, or in one managed by you. Use the channelStorage parameter to specify which. The default is a service-managed Amazon S3 bucket. If you choose to have channel messages stored in an Amazon S3 bucket that you manage, you must grant AWS IoT Analytics permission to perform these actions on your Amazon S3 bucket on your behalf: s3:GetBucketLocation (verify bucket location), s3:PutObject (store), s3:GetObject (read), and s3:ListBucket (reprocessing).
Example
{
  "Version": "2012-10-17",
  "Id": "MyPolicyID",
  "Statement": [
    {
      "Sid": "MyStatementSid",
      "Effect": "Allow",
      "Principal": {
        "Service": "iotanalytics.amazonaws.com"
      },
      "Action": [
        "s3:GetObject",
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-iot-analytics-bucket",
        "arn:aws:s3:::my-iot-analytics-bucket/*"
      ]
    }
  ]
}
If you make changes in the options or permissions of your customer-managed channel storage, you might need to reprocess channel data to ensure that previously ingested data is included in dataset contents. See Reprocessing channel data.
Creating a data store
A data store receives and stores your messages. It is not a database but a scalable and queryable repository of your messages. You can create multiple data stores to store messages that come from different devices or locations, or you can use a single data store to receive all of your AWS IoT messages.
aws iotanalytics create-datastore --datastore-name mydatastore
To list the data stores you have already created:
aws iotanalytics list-datastores
To get more information about a data store:
aws iotanalytics describe-datastore --datastore-name mydatastore
Amazon S3 policies for AWS IoT Analytics resources
You can store processed data store messages in an Amazon S3 bucket managed by AWS IoT Analytics or in one that you manage. When you create a data store, select the Amazon S3 bucket you want by using the datastoreStorage API parameter. The default is a service-managed Amazon S3 bucket.
If you choose to have data store messages stored in an Amazon S3 bucket that you manage, you must grant AWS IoT Analytics permission to perform these actions on your Amazon S3 bucket for you:
• s3:GetBucketLocation
• s3:PutObject
• s3:DeleteObject
If you use the data store as a source for an SQL query dataset, set up an Amazon S3 bucket policy that grants AWS IoT Analytics permission to invoke Amazon Athena queries on the contents of your bucket.
Note
We recommend that you specify aws:SourceArn in your bucket policy to help prevent the confused deputy security problem. This restricts access by allowing only those requests that come from a specified account. For more information about the confused deputy problem, see the section called “Cross-service confused deputy prevention” (p. 104).
The following is an example of a bucket policy that grants these required permissions.
{
  "Version": "2012-10-17",
  "Id": "MyPolicyID",
  "Statement": [
    {
      "Sid": "MyStatementSid",
      "Effect": "Allow",
      "Principal": {
        "Service": "iotanalytics.amazonaws.com"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
      ],
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": [
            "arn:aws:iotanalytics:us-east-1:123456789012:dataset/DOC-EXAMPLE-DATASET",
            "arn:aws:iotanalytics:us-east-1:123456789012:datastore/DOC-EXAMPLE-DATASTORE"
          ]
        }
      }
    }
  ]
}
For more information, see Cross-account access in the Amazon Athena User Guide.
Note
If you update the options or permissions of your customer managed data store, you might need to reprocess channel data to ensure that any previously ingested data is included in dataset contents. For more information, see Reprocessing channel data.
File formats
AWS IoT Analytics data stores currently support JSON and Parquet file formats. The default file format is JSON.
• JSON (JavaScript Object Notation) - A text format that supports name-value pairs and ordered lists of values.
• Apache Parquet - A columnar storage format used to efficiently store and query large volumes of data.
To configure the file format of the AWS IoT Analytics data store, you can use the FileFormatConfiguration object when you create the data store.
fileFormatConfiguration
  Contains the configuration information of file formats. AWS IoT Analytics data stores support JSON and Parquet.
  The default file format is JSON. You can specify only one format. You can't change the file format after you create the data store.
  jsonConfiguration
    Contains the configuration information of the JSON format.
  parquetConfiguration
    Contains the configuration information of the Parquet format.
    schemaDefinition
      Information needed to define a schema.
      columns
        Specifies one or more columns that store your data.
        Each schema can have up to 100 columns. Each column can have up to 100 nested types.
        name
          The name of the column.
          Length constraints: 1-255 characters.
        type
          The type of data. For more information about the supported data types, see Common data types in the AWS Glue Developer Guide.
          Length constraints: 1-131072 characters.
AWS IoT Analytics supports all data types listed on the Data Types in Amazon Athena page, except for DECIMAL(precision, scale) - precision.
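To make the nesting of these fields concrete, the following minimal sketch in Python assembles a fileFormatConfiguration value for a Parquet data store and checks the limits listed above. The helper parquet_file_format is hypothetical, and the column names and types (string, int) are illustrative assumptions based on the sample payloads in this guide. Confirm the type names that you use against the supported data types.

```python
import json

MAX_COLUMNS = 100          # each schema can have up to 100 columns
MAX_NAME_LENGTH = 255      # column name: 1-255 characters
MAX_TYPE_LENGTH = 131072   # column type: 1-131072 characters


def parquet_file_format(columns):
    """Build a fileFormatConfiguration dict from (name, type) pairs."""
    columns = list(columns)
    if not 1 <= len(columns) <= MAX_COLUMNS:
        raise ValueError("a schema needs between 1 and 100 columns")
    for name, type_name in columns:
        if not 1 <= len(name) <= MAX_NAME_LENGTH:
            raise ValueError(f"column name length out of range: {name!r}")
        if not 1 <= len(type_name) <= MAX_TYPE_LENGTH:
            raise ValueError(f"column type length out of range for {name!r}")
    return {
        "parquetConfiguration": {
            "schemaDefinition": {
                "columns": [{"name": n, "type": t} for n, t in columns]
            }
        }
    }


# Illustrative columns; the names and types are assumptions.
config = parquet_file_format([
    ("device_id", "string"),
    ("temperature", "int"),
    ("humidity", "int"),
    ("datetime", "string"),
])
print(json.dumps(config, indent=2))
```

The printed JSON shows how columns sits inside schemaDefinition, which in turn sits inside parquetConfiguration. The limit on nested types per column is not checked in this sketch.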
Create a data store (console)
The following procedure shows you how to create a data store that saves data in Parquet format.
To create a data store
1. Sign in to the AWS IoT Analytics console at https://console.aws.amazon.com/iotanalytics/.
2. In the navigation pane, choose Data stores.
3. On the Data stores page, choose Create data store.
4. On the Specify data store details page, enter basic information about your data store.
a. For Data store ID, enter a unique data store ID. You can't change this ID after you create it.
b. (Optional) For Tags, choose Add new tag to add one or more custom tags (key-value pairs) to your data store. Tags can help you identify resources that you create for AWS IoT Analytics.
c. Choose Next.
5. On the Configure storage type page, specify how to store your data.
a. For Storage type, choose Service managed storage.
b. For Configure how long you want to keep your processed data, choose Indefinitely.
c. Choose Next.
6. On the Configure data format page, define the structure and format of your data records.
a. For Classification, choose Parquet. You can't change this format after you create the data store.
b. For Inference source, choose JSON string for your data store.
c. For String, enter your schema in JSON format, such as the following example.
{
"device_id": "0001", "temperature": 26, "humidity": 29,
"datetime": "2018-01-26T07:06:01"
}
d. Choose Infer schema.
e. Under Configure Parquet schema, confirm that the format matches your JSON example. If the format doesn't match, update the Parquet schema manually.
• If you want your schema to show more columns, choose Add new column, enter a column name, and then choose the data type.
Note
By default, you can have 100 columns for your schema. For more information, see AWS IoT Analytics quotas.
• You can change the data type for an existing column. For more information about the supported data types, see Common data types in the AWS Glue Developer Guide.
Note
After you create your data store, you can't change the data type for an existing column.
• To remove an existing column, choose Remove column.
f. Choose Next.
7. (Optional) AWS IoT Analytics supports custom partitions in your data store so you can query on pruned data to improve latency. For more information about supported custom partitions, see Custom partitions (p. 19).
Choose Next.
8. On the Review and create page, review your choices, and then choose Create data store.
Important
You can't change the data store ID, file format, or the data type for a column after you create the data store.
9. Verify that your new data store appears on the Data stores page.
Custom partitions
AWS IoT Analytics supports data partitioning so you can organize the data in your data store. When you use data partitioning to organize data, you can query on pruned data. This decreases the amount of data scanned per query and improves latency.
You can partition your data according to message data attributes or attributes added through pipeline activities.
To get started, enable data partitioning in a data store. Specify one or more data partition dimensions and connect your partitioned data store to an AWS IoT Analytics pipeline. Then, write queries that leverage the WHERE clause to optimize performance.
Create a data store (console)
The following procedure shows you how to create a data store with a custom partition.
To create a data store
1. Sign in to the AWS IoT Analytics console.
2. In the navigation pane, choose Data stores.
3. On the Data stores page, choose Create data store.
4. On the Specify data store details page, enter basic information about your data store.
a. For Data store ID, enter a unique data store ID. You can't change this ID after you create it.
b. (Optional) For Tags, choose Add new tag to add one or more custom tags (key-value pairs) to your data store. Tags can help you identify resources that you create for AWS IoT Analytics.
c. Choose Next.
5. On the Configure storage type page, specify how to store your data.
a. For Storage type, choose Service managed storage.
b. For Configure how long you want to keep your processed data, choose Indefinitely.
c. Choose Next.
6. On the Configure data format page, define the structure and format of your data records.
a. For your data store data format Classification, choose JSON or Parquet. For more information about AWS IoT Analytics supported file types, see File formats (p. 17).
Note
You can't change this format after you create the data store.
b. Choose Next.
7. Create custom partitions for this data store.
a. For Add data partitions, select Enable.
b. For Data partition source, specify basic information about the source of your partition.
Choose Sample source, and select the AWS IoT Analytics channel that collects messages for this data store.
c. For Message sample attributes, select the message attributes you want to use to partition your data store. Then, add your selections as attribute partition dimensions or timestamp partition dimensions under Actions.
Note
You can add only one timestamp partition to your data store.
d. For Custom data store partition dimensions, define basic information about your partition dimensions. Each message sample attribute that you selected in the previous step becomes a dimension of your partition. Customize each dimension with these options:
• Partition type - Specify if this partition dimension is an Attribute or a Timestamp partition type.
• Attribute name and Dimension name - By default, AWS IoT Analytics will use the name of the message sample attribute you selected as an identifier for your attribute partition dimension. Edit the attribute name to customize the name of your partition dimension. You can use the dimension name in the WHERE clause to optimize query performance.
• The name of any partition attribute dimension is prefixed with __partition_.
• For timestamp partition types, AWS IoT Analytics creates the following four dimensions with names __year, __month, __day, __hour.
• Ordering - Rearrange your partition dimensions to improve the latency for your queries.
For Timestamp format, specify the format of your timestamp partition by matching the ingested timestamp from your message data. You can choose one of the format options that AWS IoT Analytics lists, or specify one that matches the format of your data. Learn more about specifying date time formatters.
To add a new dimension that isn’t a message attribute, choose Add new partitions.
e. Choose Next.
8. On the Review and create page, review your choices, and then choose Create data store.
Important
• You can't change the data store ID after you create the data store.
• To edit existing partitions, you must create another data store and reprocess the data through a pipeline.
9. Verify that your new data store appears on the Data stores page.
Creating a pipeline
A pipeline consumes messages from a channel and enables you to process and filter the messages before storing them in a data store. To connect a channel to a data store, you create a pipeline. The simplest possible pipeline contains no activities other than specifying the channel that collects the data and identifying the data store to which the messages are sent. For information about more complicated pipelines, see Pipeline activities.
When starting out, we recommend that you create a pipeline that does nothing other than connect a channel to a data store. Then, after you verify that raw data flows to the data store, you can introduce additional pipeline activities to process this data.
Run the following command to create a pipeline.
aws iotanalytics create-pipeline --cli-input-json file://mypipeline.json
The mypipeline.json file contains the following content.
{
  "pipelineName": "mypipeline",
  "pipelineActivities": [
    {
      "channel": {
        "name": "mychannelactivity",
        "channelName": "mychannel",
        "next": "mystoreactivity"
      }
    },
    {
      "datastore": {
        "name": "mystoreactivity",
        "datastoreName": "mydatastore"
      }
    }
  ]
}
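In the preceding pipeline definition, the channel activity points to the datastore activity through its next field. As pipelines grow, a mistyped next value is an easy mistake to make. The following minimal sketch in Python follows the next references and reports the order of activities. The function activity_order is hypothetical, and it assumes that each entry in pipelineActivities has exactly one key that names the activity type, as in mypipeline.json.

```python
def activity_order(pipeline):
    """Follow "next" references from the channel activity to the end.

    Returns the activity names in processing order, or raises ValueError
    if the chain is broken.
    """
    by_name = {}
    for activity in pipeline["pipelineActivities"]:
        # Assumption: one key per entry ("channel", "datastore", ...).
        (kind, config), = activity.items()
        by_name[config["name"]] = (kind, config)

    channels = [cfg for kind, cfg in by_name.values() if kind == "channel"]
    if len(channels) != 1:
        raise ValueError("expected exactly one channel activity")

    order = []
    current = channels[0]
    while current is not None:
        if current["name"] in order:
            raise ValueError("activities form a loop")
        order.append(current["name"])
        next_name = current.get("next")
        if next_name is None:
            current = None
        elif next_name not in by_name:
            raise ValueError(f"unknown next activity: {next_name}")
        else:
            current = by_name[next_name][1]

    if by_name[order[-1]][0] != "datastore":
        raise ValueError("chain does not end in a datastore activity")
    return order


# The same definition as mypipeline.json.
pipeline = {
    "pipelineName": "mypipeline",
    "pipelineActivities": [
        {"channel": {"name": "mychannelactivity", "channelName": "mychannel",
                     "next": "mystoreactivity"}},
        {"datastore": {"name": "mystoreactivity", "datastoreName": "mydatastore"}},
    ],
}
print(activity_order(pipeline))
```

This is a local consistency check only; the service performs its own validation when you run create-pipeline.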
Run the following command to list your existing pipelines.
aws iotanalytics list-pipelines
Run the following command to view the configuration of an individual pipeline.
aws iotanalytics describe-pipeline --pipeline-name mypipeline
Ingesting data to AWS IoT Analytics
If you have a channel that routes data to a pipeline that stores data in a data store where it can be queried, then you're ready to send message data into AWS IoT Analytics. Here we show two methods of getting data into AWS IoT Analytics. You can send a message using the AWS IoT message broker or use the AWS IoT Analytics BatchPutMessage API.
Topics
• Using the AWS IoT message broker (p. 22)
• Using the BatchPutMessage API (p. 25)
Using the AWS IoT message broker
To use the AWS IoT message broker, you create a rule using the AWS IoT rules engine. The rule routes messages with a specific topic into AWS IoT Analytics. But first, this rule requires you to create a role which grants the required permissions.
Creating an IAM role
To have AWS IoT messages routed into an AWS IoT Analytics channel, you set up a rule. But first, you must create an IAM role that grants that rule permission to send message data to an AWS IoT Analytics channel.
Run the following command to create the role.
aws iam create-role --role-name myAnalyticsRole --assume-role-policy-document file://arpd.json
The contents of the arpd.json file should look like the following.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "iot.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Then, attach a policy document to the role.
aws iam put-role-policy --role-name myAnalyticsRole --policy-name myAnalyticsPolicy --policy-document file://pd.json
The contents of the pd.json file should look like the following.
{
"Version": "2012-10-17", "Statement": [
{
"Effect": "Allow",
"Action": "iotanalytics:BatchPutMessage", "Resource": [
"arn:aws:iotanalytics:us-west-2:your-account-number:channel/mychannel"
] } ] }
Creating an AWS IoT rule
Create an AWS IoT rule that sends messages to your channel.
aws iot create-topic-rule --rule-name analyticsTestRule --topic-rule-payload file://rule.json
The contents of the rule.json file should look like the following.
{ "sql": "SELECT * FROM 'iot/test'", "ruleDisabled": false,
"awsIotSqlVersion": "2016-03-23", "actions": [ {
"iotAnalytics": {
"channelName": "mychannel",
"roleArn": "arn:aws:iam::your-account-number:role/myAnalyticsRole"
} } ] }
Replace iot/test with the MQTT topic of the messages that should be routed. Replace the channel name and the role with the ones you created in the previous sections.
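Both pd.json and rule.json contain the your-account-number placeholder, and the channel name and role name must agree across the two files. The following minimal sketch in Python generates both documents from one set of values so that they stay consistent. The function names are hypothetical, and the account number 123456789012 is a placeholder.

```python
import json


def channel_arn(region, account_id, channel_name):
    """ARN format used in pd.json for an AWS IoT Analytics channel."""
    return f"arn:aws:iotanalytics:{region}:{account_id}:channel/{channel_name}"


def role_arn(account_id, role_name):
    """ARN format used in rule.json for the IAM role."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def build_policy_document(region, account_id, channel_name):
    """Contents of pd.json: allow BatchPutMessage on one channel."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "iotanalytics:BatchPutMessage",
            "Resource": [channel_arn(region, account_id, channel_name)],
        }],
    }


def build_topic_rule(topic, channel_name, account_id, role_name):
    """Contents of rule.json: route one MQTT topic to the channel."""
    return {
        "sql": f"SELECT * FROM '{topic}'",
        "ruleDisabled": False,
        "awsIotSqlVersion": "2016-03-23",
        "actions": [{
            "iotAnalytics": {
                "channelName": channel_name,
                "roleArn": role_arn(account_id, role_name),
            }
        }],
    }


ACCOUNT_ID = "123456789012"  # placeholder account number
print(json.dumps(build_policy_document("us-west-2", ACCOUNT_ID, "mychannel"), indent=2))
print(json.dumps(build_topic_rule("iot/test", "mychannel", ACCOUNT_ID, "myAnalyticsRole"), indent=2))
```

You can save each printed document to the corresponding file and pass it to the put-role-policy and create-topic-rule commands shown earlier.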
Sending MQTT messages to AWS IoT Analytics
After you have joined a rule to a channel, a channel to a pipeline, and a pipeline to a data store, any data matching the rule now flows through AWS IoT Analytics to the data store ready to be queried. To test this, you can use the AWS IoT console to send a message.
Note
The field names of message payloads (data) that you send to AWS IoT Analytics:
• Must contain only alphanumeric characters and underscores (_); no other special characters are allowed.
• Must begin with an alphabetic character or single underscore (_).
• Cannot contain hyphens (-).
• In regular expression terms: "^[A-Za-z_]([A-Za-z0-9]*|[A-Za-z0-9][A-Za-z0-9_]*)$".
• Cannot be longer than 255 characters.
• Are case-insensitive. Fields named foo and FOO in the same payload are considered duplicates.
For example, {"temp_01": 29} or {"_temp_01": 29} are valid, but {"temp-01": 29}, {"01_temp": 29} or {"__temp_01": 29} are invalid in message payloads.
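You can check payload field names against these rules before you publish. The following minimal sketch in Python applies the regular expression from the preceding note, the 255-character limit, and the case-insensitive duplicate rule. The function check_field_names is hypothetical, and it inspects only the top-level fields of a payload.

```python
import re

# Regular expression from the note above (anchors are implied by fullmatch).
FIELD_NAME = re.compile(r"[A-Za-z_]([A-Za-z0-9]*|[A-Za-z0-9][A-Za-z0-9_]*)")
MAX_FIELD_NAME_LENGTH = 255


def check_field_names(payload):
    """Return a list of problems found in the payload's top-level field names."""
    problems = []
    seen = set()
    for name in payload:
        if len(name) > MAX_FIELD_NAME_LENGTH or not FIELD_NAME.fullmatch(name):
            problems.append(f"invalid name: {name}")
        # Field names are case-insensitive, so foo and FOO collide.
        if name.lower() in seen:
            problems.append(f"duplicate (case-insensitive): {name}")
        seen.add(name.lower())
    return problems


for candidate in ({"temp_01": 29}, {"_temp_01": 29}, {"temp-01": 29},
                  {"01_temp": 29}, {"__temp_01": 29}, {"foo": 1, "FOO": 2}):
    print(candidate, check_field_names(candidate))
```

An empty list means that the payload's field names satisfy the rules listed in the note.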
1. In the AWS IoT console, in the left navigation pane, choose Test.
2. On the MQTT client page, in the Publish section, in Specify a topic, type iot/test. In the message payload section, verify the following JSON contents are present, or type them if not.
{
"message": "Hello from the IoT console"
}
3. Choose Publish to topic.
This publishes a message that is routed to the data store you created earlier.
Using the BatchPutMessage API
Another way to get message data into AWS IoT Analytics is to use the BatchPutMessage API command.
This method does not require that you set up an AWS IoT rule to route messages with a specific topic to your channel. But it does require that the device which sends its data/messages to the channel is capable of running software created with the AWS SDK or is capable of using the AWS CLI to call BatchPutMessage.
1. Create a file messages.json that contains the messages to be sent (in this example only one message is sent).
[ { "messageId": "message01", "payload": "{ \"message\": \"Hello from the CLI\" }" } ]
2. Run the batch-put-message command.
aws iotanalytics batch-put-message --channel-name mychannel --messages file://messages.json
If there are no errors, you see the following output.
{
"batchPutMessageErrorEntries": []
}
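Because each payload in messages.json is itself a JSON document stored as a string, escaping it by hand is error-prone. The following minimal sketch in Python builds the contents of messages.json from ordinary dictionaries and checks the command output for errors. The helper names build_messages and has_errors and the messageId numbering scheme are assumptions made for illustration.

```python
import json


def build_messages(payloads, id_prefix="message"):
    """Wrap each payload dict in the messageId/payload structure."""
    return [
        # json.dumps produces the escaped string form of the payload.
        {"messageId": f"{id_prefix}{number:02d}", "payload": json.dumps(payload)}
        for number, payload in enumerate(payloads, start=1)
    ]


def has_errors(response):
    """True if the batch-put-message output lists any error entries."""
    return bool(response.get("batchPutMessageErrorEntries"))


messages = build_messages([{"message": "Hello from the CLI"}])
print(json.dumps(messages, indent=2))

# Output of a successful call, as shown above.
print(has_errors({"batchPutMessageErrorEntries": []}))  # prints False
```

Save the printed list as messages.json, and then run the batch-put-message command as shown in step 2.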
Monitoring the ingested data
You can check that the messages you sent are being ingested into your channel by using the AWS IoT Analytics console.
1. In the AWS IoT Analytics console, in the left navigation pane, choose Prepare and (if necessary) choose Channel, then choose the name of the channel you created earlier.
2. On the channel detail page, scroll down to the Monitoring section. Adjust the displayed time frame as necessary by choosing one of the time frame indicators (1h 3h 12h 1d 3d 1w). You should see a graph line indicating the number of messages ingested into this channel during the specified time frame.
A similar monitoring capability exists for checking pipeline activity executions. You can monitor activity execution errors on the pipeline's detail page. If you haven't specified activities as part of your pipeline, then 0 execution errors should be displayed.
1. In the AWS IoT Analytics console, in the left navigation pane, choose Prepare and then choose Pipelines, then choose the name of a pipeline you created earlier.
2. On the pipeline detail page, scroll down to the Monitoring section. Adjust the displayed time frame as necessary by choosing one of the time frame indicators (1h 3h 12h 1d 3d 1w). You should see a graph line indicating the number of pipeline activity execution errors during the specified time frame.
Creating a dataset
You retrieve data from a data store by creating a SQL dataset or a container dataset. AWS IoT Analytics can query the data to answer analytical questions. Although a data store is not a database, you use SQL expressions to query the data and produce results that are stored in a dataset.
Topics
• Querying data (p. 28)
• Accessing the queried data (p. 28)
Querying data
To query the data, you create a dataset. A dataset contains the SQL that you use to query the data store along with an optional schedule that repeats the query at a day and time you choose. You create the optional schedules using expressions similar to Amazon CloudWatch schedule expressions.
Run the following command to create a dataset.
aws iotanalytics create-dataset --cli-input-json file://mydataset.json
Where the mydataset.json file contains the following content.
{ "datasetName": "mydataset", "actions": [
{
"actionName":"myaction", "queryAction": {
"sqlQuery": "select * from mydatastore"
} } ] }
Run the following command to create the dataset content by executing the query.
aws iotanalytics create-dataset-content --dataset-name mydataset
Wait a few minutes for the dataset content to be created before you continue.
Accessing the queried data
The result of the query is your dataset content, stored as a file, in CSV format. The file is made available to you through Amazon S3. The following example shows how you can check that your results are ready and download the file.
Run the following get-dataset-content command.
aws iotanalytics get-dataset-content --dataset-name mydataset
If your dataset contains any data, then the output from get-dataset-content has "state": "SUCCEEDED" in the status field, as in the following example.
{
  "timestamp": 1508189965.746,
  "entries": [
    {
      "entryName": "someEntry",
      "dataURI": "https://aws-iot-analytics-datasets-f7253800-859a-472c-aa33-e23998b31261.s3.amazonaws.com/results/f881f855-c873-49ce-abd9-b50e9611b71f.csv?X-Amz-"
    }
  ],
  "status": {
    "state": "SUCCEEDED",
    "reason": "A useful comment."
  }
}
dataURI is a signed URL to the output results. It is valid for a short period of time (a few hours).
Depending on your workflow, you might want to always call get-dataset-content before you access the content because calling this command generates a new signed URL.
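A script that consumes dataset content can inspect the get-dataset-content output before it attempts a download. The following minimal sketch in Python returns the signed URLs only when the state is SUCCEEDED. The function ready_data_uris is hypothetical, and the sample dictionary uses a shortened placeholder URL in place of a real signed URL.

```python
def ready_data_uris(content):
    """Return the dataURI of each entry if the content state is SUCCEEDED."""
    if content.get("status", {}).get("state") != "SUCCEEDED":
        return []
    return [entry["dataURI"] for entry in content.get("entries", [])]


# Parsed output of get-dataset-content; the URL is a placeholder.
sample_output = {
    "timestamp": 1508189965.746,
    "entries": [
        {"entryName": "someEntry", "dataURI": "https://example.com/results.csv"}
    ],
    "status": {"state": "SUCCEEDED", "reason": "A useful comment."},
}

print(ready_data_uris(sample_output))
```

An empty list means that the content isn't ready yet. Because the signed URL expires, run get-dataset-content again to obtain a fresh URL rather than storing an old one.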
Exploring AWS IoT Analytics data
You have several options for storing, analyzing and visualizing your AWS IoT Analytics data.
Topics on this page:
• Amazon S3 (p. 29)
• AWS IoT Events (p. 29)
• Amazon QuickSight (p. 30)
• Jupyter Notebook (p. 30)
Amazon S3
You can send dataset contents to an Amazon Simple Storage Service (Amazon S3) bucket, enabling integration with your existing data lakes or access from in-house applications and visualization tools.
See the field contentDeliveryRules::destination::s3DestinationConfiguration in CreateDataset.
AWS IoT Events
You can send dataset contents as an input to AWS IoT Events, a service which enables you to monitor devices or processes for failures or changes in operation, and to trigger additional actions when such events occur.
To do this, create a dataset using CreateDataset and specify an AWS IoT Events input in the field contentDeliveryRules::destination::iotEventsDestinationConfiguration::inputName. You must also specify the roleArn of a role that grants AWS IoT Analytics permission to execute iotevents:BatchPutMessage. Whenever the dataset's contents are created, AWS IoT Analytics sends each dataset content entry as a message to the specified AWS IoT Events input. For example, if your dataset contains:
"what","who","dt"
"overflow","sensor01","2019-09-16 09:04:00.000"
"overflow","sensor02","2019-09-16 09:07:00.000"
"underflow","sensor01","2019-09-16 11:09:00.000"
...
then AWS IoT Analytics will send messages containing fields like this:
{ "what": "overflow", "who": "sensor01", "dt": "2019-09-16 09:04:00.000" }
{ "what": "overflow", "who": "sensor02", "dt": "2019-09-16 09:07:00.000" }