Dynamodb ttl expiration is not minute precise

Finally figured out why some of my expired items fail to delete when expected. (bold mine)

DynamoDB typically deletes expired items within 48 hours of expiration. The exact duration within which an item truly gets deleted after expiration is specific to the nature of the workload and the size of the table. Items that have expired and not been deleted will still show up in reads, queries, and scans. These items can still be updated and successful updates to change or remove the expiration attribute will be honored.

Bottom line: you can't rely on DynamoDB's ttl column based expiration to do minute-precise expiration of things in your serverless app. Luckily for my purposes, I don't need minute precision. I do however need hour precision, so I'm planning to write a little scheduled lambda that kicks off every sixty minutes and reaps whatever records have expired but not deleted yet, and delete them manually.


Purchase my Serverless book at https://leanpub.com/serverless

Limitations of Dynamodb KeyConditionExpression

Dynamodb has a query system with a particularly hard learning curve (in my opinion.) Even though I've been using it for years now, I still run into some sticky problems like this one today when I tried to use an IN operator in a query expression. Should be fine, right?

    const key = {
      IndexName: 'by_label_and_status',
      KeyConditionExpression:
        'labelId = :labelId and submissionStatus in (:sent, :opened, :listened)',
      ExpressionAttributeValues: {
        ':labelId': event.pathParameters.labelId,
        ':sent': 'sent',
        ':opened': 'opened',
        ':listened': 'listened',
      },
    };
    const result = await dynamodb.query('submissions', key);

The code above actually fails. Dynamodb is kind enough to report Invalid operator used in KeyConditionExpression: IN, but finding the reason why took me a bit of digging.

It turns out that indeed if you go to the guide for Dynamodb expression syntax, you'll see that the allowed comparators for KeyConditionExpression are limited.

Valid comparisons for the sort key condition are as follows:

sortKeyName = :sortkeyval - true if the sort key value is equal to :sortkeyval.

sortKeyName < :sortkeyval - true if the sort key value is less than :sortkeyval.

sortKeyName <= :sortkeyval - true if the sort key value is less than or equal to :sortkeyval.

sortKeyName > :sortkeyval - true if the sort key value is greater than :sortkeyval.

sortKeyName >= :sortkeyval - true if the sort key value is greater than or equal to :sortkeyval.

sortKeyName BETWEEN :sortkeyval1 AND :sortkeyval2 - true if the sort key value is greater than or equal to :sortkeyval1, and less than or equal to :sortkeyval2.

begins_with ( sortKeyName, :sortkeyval ) - true if the sort key value begins with a particular operand. (You cannot use this function with a sort key that is of type Number.) Note that the function name begins_with is case-sensitive.

Use the ExpressionAttributeValues parameter to replace tokens such as :partitionval and :sortval with actual values at runtime.

Since the system I'm working on is brand new, I was able to get the code example working by changing the values of our statuses. Instead of having completely disparate statuses for sent, clicked, and opened, I changed them to be sent, sent_clicked, and sent_opened. They are semantically correct, since clicked and opened states are also sent. That change allowed us to use begins_with instead of the prohibited IN operator.

    const key = {
      IndexName: 'by_label_and_status',
      KeyConditionExpression:
        'labelId = :labelId and begins_with(submissionStatus, :submissionStatus)',
      ExpressionAttributeValues: {
        ':labelId': event.pathParameters.labelId,
        ':submissionStatus': 'sent',
      },
    };

Purchase my Serverless book at https://leanpub.com/serverless