- AWS research workshops
- What is Kinesis
- Amazon Kinesis is a managed, scalable, cloud-based service for processing large volumes of streaming data in real time. It is designed for real-time applications and lets developers ingest any amount of data from several sources, scaling up and down as needed. Applications that process the stream can run on EC2 instances.
- Source: Tutorialspoint
- ETL
- ETL stands for "extract, transform, load": the process of extracting business data from source systems, transforming it into a consistent format, and loading it into a target such as a data warehouse.
- Data stream -> set of shards -> sequence of data records
- Shards
- A data stream is made up of one or more shards
- Read
- Up to 5 read transactions per second per shard
- Up to a maximum total data read rate of 2 MB per second per shard
- GetRecords
- Call GetRecords in a loop. Use GetShardIterator to get the shard iterator to specify in the first GetRecords call. GetRecords returns a new shard iterator in NextShardIterator. Specify the shard iterator returned in NextShardIterator in subsequent calls to GetRecords.
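The GetRecords loop described above can be sketched roughly as follows. This is a minimal illustration, not a production consumer: `read_shard`, `max_polls`, and `poll_interval` are hypothetical names, the `client` argument is assumed to be a boto3 Kinesis client (`boto3.client("kinesis")`), and starting from `TRIM_HORIZON` is an assumption.

```python
import time


def read_shard(client, stream_name, shard_id, max_polls=10, poll_interval=0.2):
    """Yield records from one shard by following NextShardIterator."""
    # GetShardIterator supplies the iterator for the first GetRecords call.
    response = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # assumption: start at the oldest record
    )
    iterator = response["ShardIterator"]

    for _ in range(max_polls):
        if iterator is None:  # no further iterator: the shard is closed and fully read
            break
        response = client.get_records(ShardIterator=iterator, Limit=1000)
        yield from response["Records"]
        # Each response carries the iterator to use in the next call.
        iterator = response.get("NextShardIterator")
        # Pause between calls to stay under 5 reads per second per shard.
        time.sleep(poll_interval)
```

Something like `for record in read_shard(client, "example-stream", "shardId-000000000000"): ...` would then walk the shard; the stream and shard names here are placeholders.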
- Write
- Up to 1,000 records per second per shard for writes
- Up to a maximum total data write rate of 1 MB per second per shard
- GetShardIterator
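The per-shard read and write limits listed above suggest a quick way to estimate how many shards a stream needs. The helper below is a back-of-the-envelope sketch; `shards_needed` and its parameters are hypothetical names, and it ignores the 5 reads per second limit and any headroom for traffic spikes.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # max write throughput per shard, MB/s
WRITE_RECORDS_PER_SHARD = 1000  # max records written per shard per second
READ_MB_PER_SHARD = 2.0         # max read throughput per shard, MB/s


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )
```

For example, 3.5 MB/s of writes, 2,500 records per second, and 7 MB/s of reads need 4, 3, and 4 shards respectively, so the estimate is 4.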
- Data Records
- Sequence Number
- Partition Key
- Data blob up to 1 MB
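The partition key decides which shard a record lands on: Kinesis takes the MD5 hash of the key as a 128-bit integer and picks the shard whose hash key range contains it. The sketch below illustrates the idea under the assumption that the shards split the hash space evenly; a real stream reports each shard's actual range, so `shard_index` is only an approximation and both function names are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer


def hash_key(partition_key: str) -> int:
    """128-bit integer derived from the MD5 hash of the partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Index of the owning shard, assuming an even split of the hash space."""
    return hash_key(partition_key) * shard_count // HASH_SPACE
```

Records with the same partition key always map to the same shard, which is what preserves their relative order.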
- The data retention period defaults to 24 hours and can be increased to up to 365 days
- A producer, such as Twilio, puts records into the data stream
- A consumer gets records from the data stream and processes them
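On the producer side, a write could look roughly like the sketch below. It assumes `client` is a boto3 Kinesis client, that events are JSON-serializable dicts, and that a `user_id` field makes a sensible partition key; `publish_event` and `key_field` are hypothetical names.

```python
import json

# 1 MB data blob limit from the notes, taken here as 1,048,576 bytes (assumption).
MAX_BLOB_BYTES = 1024 * 1024


def publish_event(client, stream_name, event, key_field="user_id"):
    """Serialize one event and put it on the stream; return (shard id, sequence number)."""
    data = json.dumps(event).encode("utf-8")
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError("record exceeds the 1 MB data blob limit")
    response = client.put_record(
        StreamName=stream_name,
        Data=data,
        PartitionKey=str(event[key_field]),  # same key -> same shard
    )
    return response["ShardId"], response["SequenceNumber"]
```

The returned sequence number is the one assigned by the stream, matching the Sequence Number field of the data record.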
- Using AWS Lambda with Amazon Kinesis
- How to retry using info in DLQ using SNS
- 6 common pitfalls AWS lambda with Kinesis trigger
- AWS kinesis code sample
- Batch size: The number of records to send to the function in each batch, up to 10,000.
- The default value of Limit for GetRecords is 10,000 records, which is also the maximum. See the GetRecords API reference.
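With a Kinesis trigger, Lambda hands the function a batch of records whose payloads are base64-encoded. A minimal handler sketch, assuming the producers send JSON (an assumption, as are the names `decode_records` and `handler`):

```python
import base64
import json


def decode_records(event):
    """Decode every Kinesis record in a Lambda event batch."""
    decoded = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        payload = base64.b64decode(kinesis["data"])  # data arrives base64-encoded
        decoded.append({
            "partition_key": kinesis["partitionKey"],
            "sequence_number": kinesis["sequenceNumber"],
            "body": json.loads(payload),  # assumption: producers send JSON
        })
    return decoded


def handler(event, context):
    # An unhandled exception here makes Lambda retry the whole batch.
    for item in decode_records(event):
        print(item["sequence_number"], item["body"])
```

Keeping the decoding in its own function makes it easy to test without deploying anything.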
- Example data in DLQ
{
  "requestContext": {
    "requestId": "f01fef26-d010-4253-b997-d7dacb59452b",
    "functionArn": "arn:aws:lambda:us-west-2:999:function:example-datastream",
    "condition": "RetryAttemptsExhausted",
    "approximateInvokeCount": 6
  },
  "responseContext": {
    "statusCode": 200,
    "executedVersion": "$LATEST",
    "functionError": "Unhandled"
  },
  "version": "1.0",
  "timestamp": "2021-10-29T15:33:51.712Z",
  "DDBStreamBatchInfo": {
    "shardId": "shardId-00000001635520907907-798533d1",
    "startSequenceNumber": "4178381200000000045390851482",
    "endSequenceNumber": "4178381200000000045390851482",
    "approximateArrivalOfFirstRecord": "2021-10-29T15:32:47Z",
    "approximateArrivalOfLastRecord": "2021-10-29T15:32:47Z",
    "batchSize": 1,
    "streamArn": "arn:aws:dynamodb:us-west-2:999:table/example-table/stream/2020-01-06T22:39:48.687"
  }
}
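The DLQ message holds only metadata about the failed batch, not the records themselves, so a retry has to re-read them from the stream. The sketch below pulls out the fields needed for that. The sample above comes from a DynamoDB stream (`DDBStreamBatchInfo`); for a Kinesis source the same block is named `KinesisBatchInfo`. `failed_batch` and `iterator_request` are hypothetical helpers, and the iterator arguments apply to the Kinesis GetShardIterator call only.

```python
import json


def failed_batch(message_body: str) -> dict:
    """Extract the failed batch's location from a DLQ message body."""
    message = json.loads(message_body)
    # Kinesis sources report KinesisBatchInfo; DynamoDB streams report DDBStreamBatchInfo.
    info = message.get("KinesisBatchInfo") or message.get("DDBStreamBatchInfo")
    if info is None:
        raise ValueError("message has no batch info")
    return {
        "stream_arn": info["streamArn"],
        "shard_id": info["shardId"],
        "start": info["startSequenceNumber"],
        "end": info["endSequenceNumber"],
        "batch_size": info["batchSize"],
        "reason": message["requestContext"]["condition"],
    }


def iterator_request(batch: dict) -> dict:
    """Arguments for a Kinesis GetShardIterator call starting at the first failed record."""
    return {
        "ShardId": batch["shard_id"],
        "ShardIteratorType": "AT_SEQUENCE_NUMBER",
        "StartingSequenceNumber": batch["start"],
    }
```

A retry job would add the stream name to these arguments, call GetRecords with `batch_size` as the limit, and stop once it passes the `end` sequence number. Records that have aged out of the retention period can no longer be re-read.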