KinesisStreamSourceConfiguration
The stream and role Amazon Resource Names (ARNs) for a Kinesis data stream used as the source for a delivery stream.
Contents
KinesisStreamARN
The ARN of the source Kinesis data stream. For more information, see Amazon Kinesis Data Streams ARN Format.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 512.
Pattern: arn:.*
Required: Yes
RoleARN
The ARN of the role that provides access to the source Kinesis data stream. For more information, see AWS Identity and Access Management (IAM) ARN Format.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 512.
Pattern: arn:.*
Required: Yes
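The constraints listed for the two fields can be checked before a request is sent. The following is a minimal sketch rather than part of any AWS SDK: the helper name validate_source_config and the example ARNs (account ID, stream name, role name) are hypothetical placeholders.

```python
import re

ARN_PATTERN = re.compile(r"arn:.*")  # Pattern listed for both fields
MIN_LEN, MAX_LEN = 1, 512            # Length constraints listed for both fields


def validate_source_config(config):
    """Return a list of constraint violations (empty if none were found)."""
    problems = []
    for field in ("KinesisStreamARN", "RoleARN"):
        value = config.get(field)
        if value is None:
            problems.append(f"{field} is required")
        elif not isinstance(value, str):
            problems.append(f"{field} must be a string")
        elif not MIN_LEN <= len(value) <= MAX_LEN:
            problems.append(f"{field} length must be {MIN_LEN}-{MAX_LEN}")
        elif not ARN_PATTERN.fullmatch(value):
            problems.append(f"{field} must match arn:.*")
    return problems


# Placeholder ARNs for illustration only (hypothetical account, stream, and role).
example_config = {
    "KinesisStreamARN": "arn:aws:kinesis:us-east-1:111122223333:stream/example-stream",
    "RoleARN": "arn:aws:iam::111122223333:role/example-firehose-role",
}

print(validate_source_config(example_config))
```

The check only mirrors the documented type, length, and pattern constraints; it does not confirm that the stream or role exists or that the role grants access to the stream.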
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following:
• AWS SDK for C++
• AWS SDK for Go
• AWS SDK for Java V2
• AWS SDK for Ruby V3
KinesisStreamSourceDescription
Details about a Kinesis data stream used as the source for a Kinesis Data Firehose delivery stream.
Contents
DeliveryStartTimestamp
Kinesis Data Firehose starts retrieving records from the Kinesis data stream starting with this timestamp.
Type: Timestamp
Required: No
KinesisStreamARN
The Amazon Resource Name (ARN) of the source Kinesis data stream. For more information, see Amazon Kinesis Data Streams ARN Format.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 512.
Pattern: arn:.*
Required: No
RoleARN
The ARN of the role used by the source Kinesis data stream. For more information, see AWS Identity and Access Management (IAM) ARN Format.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 512.
Pattern: arn:.*
Required: No
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following:
• AWS SDK for C++
• AWS SDK for Go
• AWS SDK for Java V2
• AWS SDK for Ruby V3
KMSEncryptionConfig
Describes an encryption key for a destination in Amazon S3.
Contents
AWSKMSKeyARN
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 512.
Pattern: arn:.*
Required: Yes
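The same-Region requirement can be checked client-side by reading the Region field out of the key ARN. This is a rough sketch under stated assumptions: the helper names arn_region and key_matches_bucket_region are hypothetical, the key ARN is a placeholder, and the bucket's Region is passed in separately because an S3 bucket ARN does not carry a Region.

```python
def arn_region(arn):
    """Return the region field of an ARN.

    Assumes the general layout arn:partition:service:region:account-id:resource.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a well-formed ARN: {arn!r}")
    return parts[3]


def key_matches_bucket_region(kms_key_arn, bucket_region):
    """True when the KMS key ARN names the same Region as the bucket."""
    return arn_region(kms_key_arn) == bucket_region


# Placeholder key ARN for illustration only (hypothetical account and key ID).
example_key_arn = (
    "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
)

print(key_matches_bucket_region(example_key_arn, "us-west-2"))
```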
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following:
• AWS SDK for C++
• AWS SDK for Go
• AWS SDK for Java V2
• AWS SDK for Ruby V3
OpenXJsonSerDe
The OpenX SerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the native Hive / HCatalog JsonSerDe.
Contents
CaseInsensitive
When set to true, which is the default, Kinesis Data Firehose converts JSON keys to lowercase before deserializing them.
Type: Boolean
Required: No
ColumnToJsonKeyMappings
Maps column names to JSON keys that aren't identical to the column names. This is useful when the JSON contains keys that are Hive keywords. For example, timestamp is a Hive keyword. If you have a JSON key named timestamp, set this parameter to {"ts": "timestamp"} to map this key to a column named ts.
Type: String to string map
Key Length Constraints: Minimum length of 1. Maximum length of 1024.
Key Pattern: ^\S+$
Value Length Constraints: Minimum length of 1. Maximum length of 1024.
Value Pattern: ^(?!\s*$).+
Required: No
ConvertDotsInJsonKeysToUnderscores
When set to true, specifies that the names of the keys include dots and that you want Kinesis Data Firehose to replace them with underscores. This is useful because Apache Hive does not allow dots in column names. For example, if the JSON contains a key whose name is "a.b", you can define the column name to be "a_b" when using this option.
The default is false.
Type: Boolean
Required: No
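To make the combined effect of the three options concrete, the following sketch imitates the documented behavior on a single JSON record. It is an illustration only, not Kinesis Data Firehose's implementation: the function json_keys_to_columns is hypothetical, it handles top-level keys only, and the order in which the options are applied (lowercase, then dots, then the mapping) is an assumption.

```python
def json_keys_to_columns(record, case_insensitive=True,
                         convert_dots=False, column_to_json_key=None):
    """Rename the top-level keys of a JSON record to column names."""
    # ColumnToJsonKeyMappings maps column name -> JSON key, so invert it
    # to look up the column for a given JSON key.
    json_key_to_column = {
        json_key: column
        for column, json_key in (column_to_json_key or {}).items()
    }
    row = {}
    for key, value in record.items():
        if case_insensitive:       # CaseInsensitive (default true)
            key = key.lower()
        if convert_dots:           # ConvertDotsInJsonKeysToUnderscores (default false)
            key = key.replace(".", "_")
        # Assumed order: the mapping is consulted after the other two options.
        key = json_key_to_column.get(key, key)
        row[key] = value
    return row


record = {"Timestamp": "2020-01-01T00:00:00Z", "user.id": 7}
row = json_keys_to_columns(
    record,
    convert_dots=True,
    column_to_json_key={"ts": "timestamp"},
)
print(row)
```

With these settings the key Timestamp ends up in column ts and user.id ends up in column user_id, matching the examples given for the individual parameters.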
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following:
• AWS SDK for C++
• AWS SDK for Go
• AWS SDK for Java V2
• AWS SDK for Ruby V3
OrcSerDe
A serializer to use for converting data to the ORC format before storing it in Amazon S3. For more information, see Apache ORC.
Contents
BlockSizeBytes
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB.
Kinesis Data Firehose uses this value for padding calculations.
Type: Integer
Valid Range: Minimum value of 67108864.
Required: No
BloomFilterColumns
The column names for which you want Kinesis Data Firehose to create Bloom filters. The default is null.
Type: Array of strings
Length Constraints: Minimum length of 1. Maximum length of 1024.
Pattern: ^\S+$
Required: No
BloomFilterFalsePositiveProbability
The Bloom filter false positive probability (FPP). The lower the FPP, the bigger the Bloom filter. The default value is 0.05, the minimum is 0, and the maximum is 1.
Type: Double
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
Compression
The compression codec to use over data blocks. The default is SNAPPY.
Type: String
Valid Values: NONE | ZLIB | SNAPPY
Required: No
DictionaryKeyThreshold
The threshold for dictionary encoding, expressed as a fraction of the total number of non-null rows. Dictionary encoding is turned off when the number of distinct keys in a dictionary exceeds this fraction of the non-null row count. To always use dictionary encoding, set this threshold to 1.
Type: Double
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
EnablePadding
Set this to true to indicate that you want stripes to be padded to the HDFS block boundaries. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is false.
Type: Boolean
Required: No
FormatVersion
The version of the file to write. The possible values are V0_11 and V0_12. The default is V0_12.
Type: String
Valid Values: V0_11 | V0_12
Required: No
PaddingTolerance
A number between 0 and 1 that defines the tolerance for block padding as a decimal fraction of stripe size. The default value is 0.05, which means 5 percent of stripe size.
For the default values of 64 MiB ORC stripes and 256 MiB HDFS blocks, the default block padding tolerance of 5 percent reserves a maximum of 3.2 MiB for padding within the 256 MiB block. In such a case, if the available size within the block is more than 3.2 MiB, a new, smaller stripe is inserted to fit within that space. This ensures that no stripe crosses block boundaries and causes remote reads within a node-local task.
Kinesis Data Firehose ignores this parameter when EnablePadding is false.
Type: Double
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
RowIndexStride
The number of rows between index entries. The default is 10,000 and the minimum is 1,000.
Type: Integer
Valid Range: Minimum value of 1000.
Required: No
StripeSizeBytes
The number of bytes in each stripe. The default is 64 MiB and the minimum is 8 MiB.
Type: Integer
Valid Range: Minimum value of 8388608.
Required: No
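The 3.2 MiB figure quoted for PaddingTolerance, and the minimum values listed for the integer parameters, can be reproduced with a few lines of arithmetic. This is a minimal sketch: the helper names max_padding_bytes and below_minimum are hypothetical and are not part of any AWS SDK.

```python
MIB = 1024 * 1024

# Minimum values listed above for the integer parameters.
ORC_MINIMUMS = {
    "BlockSizeBytes": 64 * MIB,   # 67108864
    "StripeSizeBytes": 8 * MIB,   # 8388608
    "RowIndexStride": 1000,
}


def max_padding_bytes(stripe_size_bytes=64 * MIB, padding_tolerance=0.05):
    """Padding budget: the tolerance is a fraction of the stripe size."""
    if not 0 <= padding_tolerance <= 1:
        raise ValueError("PaddingTolerance must be between 0 and 1")
    return stripe_size_bytes * padding_tolerance


def below_minimum(orc_ser_de):
    """Return the names of supplied parameters that fall below their minimum."""
    return sorted(
        name
        for name, minimum in ORC_MINIMUMS.items()
        if name in orc_ser_de and orc_ser_de[name] < minimum
    )


# Defaults: 64 MiB stripes and 5 percent tolerance give roughly 3.2 MiB.
print(round(max_padding_bytes() / MIB, 2))
print(below_minimum({"BlockSizeBytes": 32 * MIB, "RowIndexStride": 10000}))
```

Because the tolerance is a fraction of the stripe size, changing StripeSizeBytes changes the padding budget, whereas changing BlockSizeBytes does not.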
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following:
• AWS SDK for C++
• AWS SDK for Go
• AWS SDK for Java V2
• AWS SDK for Ruby V3