The first-in-first-out (FIFO) queue is the type of AWS SQS queue that guarantees message order and provides exactly-once processing of messages. That sounds great, but there are some other important features to understand to avoid unexpected queue behaviour. In this article you’ll discover the 3 most important caveats with SQS FIFO queues.

1) If a message fails to be processed, it may block other messages

When you send a message to a FIFO queue, you must provide a message group id. This is a way to group messages, so that messages within that group are always received in order.

Message group id

Imagine we have several messages on the queue with the same message group id.

SQS FIFO queue with 3 messages

Message 1 is at the front of the queue. If it’s received by a consumer, but for whatever reason fails to process and isn’t deleted, then no other messages with the same message group id can be received.

To demonstrate this, we’ll run a shell script which uses the AWS CLI to:

  1. create a new SQS FIFO queue
  2. send 3 messages to the queue, all with the same message group id
  3. attempt to receive a message from the queue
  4. attempt to receive a message from the queue again
  5. delete the queue to clean up

#!/bin/bash
set -e

QUEUE_URL=$(aws sqs create-queue --queue-name test.fifo --attributes FifoQueue=true,ContentBasedDeduplication=true --query QueueUrl --output text)
echo "Created SQS FIFO queue $QUEUE_URL"

echo "Sending 3 messages to queue"
aws sqs send-message --message-body "Message 1" --message-group-id 1 --queue-url "$QUEUE_URL" > /dev/null
aws sqs send-message --message-body "Message 2" --message-group-id 1 --queue-url "$QUEUE_URL" > /dev/null
aws sqs send-message --message-body "Message 3" --message-group-id 1 --queue-url "$QUEUE_URL" > /dev/null

echo "Receiving messages attempt 1"
aws sqs receive-message --queue-url "$QUEUE_URL"
echo "Receiving messages attempt 2"
aws sqs receive-message --queue-url "$QUEUE_URL"

echo "Finished. Deleting queue."
aws sqs delete-queue --queue-url "$QUEUE_URL"

If you execute the script you’ll see this output.

Created SQS FIFO queue https://eu-west-1.queue.amazonaws.com/299404798587/test.fifo
Sending 3 messages to queue
Receiving messages attempt 1
{
    "Messages": [
        {
            "MessageId": "8ee46fb4-7aae-444b-8ce1-0138c7620aeb",
            "ReceiptHandle": "AQEBAbg4vEIbQ+OC/o7R/R5CmZMHv5cKcfzn7LCiKnm+u1p2jabE+Z9mm0b0/wpf+H3Qnh7BE/FtjErqifBMudHIFzpZhHcEy7wxvXuK1AhzpPhimnbIdM/BSmWLyOKADw+xmbngIoSNAlzeqHTIEuxOt9+5ULxki/JW6ar9SBnur6CHNGvzg34c4hblXKdBnlf34QWs/NS1rE8SZ6ErHTEvFBugz0aY7GaIQlLzaevZyXTnLkajLnU+9GZ+i/fcx0+qx+hGJQ9Hm4Ko66xTDGqdTg==",
            "MD5OfBody": "68390233272823b7adf13a1db79b2cd7",
            "Body": "Message 1"
        }
    ]
}
Receiving messages attempt 2
Finished. Deleting queue.

You can clearly see that the first receive message attempt returns Message 1, but the second attempt doesn’t return anything. Message 1 was received but never deleted, so Message 2 and Message 3 are held back.

FIFO queue tip 1

When sending messages to the queue, choose the message group id carefully.

Messages that have the same message group id will be returned in order. While this is intended behaviour for a FIFO queue, remember that only once a message has been removed from the queue will the next message with the same message group id be returned.

Messages can be removed from the queue in these ways:

  1. Deleted using the SQS delete message API

  2. Deleted automatically once the message retention period has expired

  3. Moved automatically to a dead-letter queue after the configured maximum receives
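
The first of these is the one your consumer controls. As a rough sketch, assuming single-line message bodies and a hypothetical process_message function standing in for your own processing logic, a consumer that only deletes a message after processing succeeds might look like this:

```shell
#!/bin/bash
# Hypothetical placeholder for your own processing logic.
# Return non-zero to signal that processing failed.
process_message() {
  echo "Processing: $1"
}

# Receive one message, process it, and delete it only if processing succeeded.
# Usage: consume_one <queue-url>
consume_one() {
  local queue_url="$1" response receipt_handle body

  # Ask for the receipt handle and body as one tab-separated line.
  # Assumption: message bodies are a single line of text.
  response=$(aws sqs receive-message --queue-url "$queue_url" \
    --query 'Messages[0].[ReceiptHandle,Body]' --output text)

  # Nothing to receive: treat empty output or "None" as "no message".
  if [ -z "$response" ] || [ "$response" = "None" ]; then
    echo "No message available"
    return 0
  fi

  receipt_handle=$(printf '%s\n' "$response" | cut -f1)
  body=$(printf '%s\n' "$response" | cut -f2-)

  if process_message "$body"; then
    # Deleting is what unblocks the next message in the same message group.
    aws sqs delete-message --queue-url "$queue_url" --receipt-handle "$receipt_handle"
    echo "Deleted message"
  else
    # Not deleted: the message stays at the front of its message group and
    # is returned again once the visibility timeout expires.
    echo "Processing failed, message left on the queue"
    return 1
  fi
}
```

You would call it with consume_one "$QUEUE_URL", for example in a loop. If process_message keeps failing, the same message keeps coming back, which is where the other two removal routes come in.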

2) If you don’t set the visibility timeout correctly, your message may be re-processed

In caveat 1 we saw that when we make multiple receive message calls on an SQS FIFO queue, only the first one returns a result. That was the case because all the messages had the same message group id, and SQS was maintaining message order.

What if we want to receive the same message again to retry processing that may have failed? That’s where the visibility timeout comes in. It configures how long a message stays hidden from other consumers after it has been received. Once that time has passed without the message being deleted, it can be received again. The default visibility timeout is 30 seconds.

The following diagram shows how this works using a simple FIFO queue with the default visibility timeout. The queue has a single message.

Visibility timeout

  1. an initial receive message request returns the message

  2. another receive message request within the visibility timeout returns no messages

  3. after waiting for the visibility timeout to expire, another receive message request returns the message again

Long-running queue message processing

Now imagine a scenario where the processing of a queue message takes a long time, even hours. For example, you could be developing a system for an online media site, where each queue message is a video that needs transcoding into many different formats.

In this case, if you leave the visibility timeout at the default, the message becomes visible again after 30 seconds, and another consumer can pick it up while the first is still working on it. That could use a lot of compute resources unnecessarily, and have other undesirable effects.

FIFO queue tip 2

When creating your FIFO queue, configure the visibility timeout based on the time it takes to process each queue message.

If you have long-running queue message processing, configure the visibility timeout to be greater than the maximum duration of this processing. The maximum value you can choose is 12 hours.
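
As a rough sketch of that rule, the functions below turn a worst-case processing time into a visibility timeout and apply it to an existing queue. The 60 second safety margin and the function names are assumptions made for illustration, so pick values that suit your own workload.

```shell
#!/bin/bash
# Maximum visibility timeout: 12 hours, expressed in seconds.
MAX_VISIBILITY_TIMEOUT=43200

# Print a visibility timeout in seconds for a worst-case processing time,
# adding a safety margin (assumed default: 60 seconds) and capping the
# result at the 12 hour maximum.
# Usage: visibility_timeout_for <max-processing-seconds> [margin-seconds]
visibility_timeout_for() {
  local max_processing="$1" margin="${2:-60}" timeout
  timeout=$((max_processing + margin))
  if [ "$timeout" -gt "$MAX_VISIBILITY_TIMEOUT" ]; then
    timeout=$MAX_VISIBILITY_TIMEOUT
  fi
  echo "$timeout"
}

# Apply the calculated timeout to an existing queue.
# Usage: set_visibility_timeout <queue-url> <max-processing-seconds>
set_visibility_timeout() {
  local queue_url="$1" timeout
  timeout=$(visibility_timeout_for "$2")
  aws sqs set-queue-attributes --queue-url "$queue_url" \
    --attributes "VisibilityTimeout=$timeout"
  echo "Visibility timeout set to $timeout seconds"
}
```

For example, set_visibility_timeout "$QUEUE_URL" 7200 would configure the queue for processing that takes up to two hours.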

If 12 hours isn’t enough, consider creating a dead-letter queue and setting the maximum receives of your queue to 1. That way, your queue message will be processed at most once.
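
A minimal sketch of that configuration is shown below, assuming you have already created a second queue to act as the dead-letter queue. Note that the dead-letter queue of a FIFO queue must itself be a FIFO queue. The function names are made up for this example.

```shell
#!/bin/bash
# Build the JSON passed to --attributes for a redrive policy.
# The RedrivePolicy value is itself a JSON document encoded as a string,
# hence the escaped inner quotes.
# Usage: redrive_attributes <dead-letter-queue-arn> <max-receive-count>
redrive_attributes() {
  printf '{"RedrivePolicy":"{\\"deadLetterTargetArn\\":\\"%s\\",\\"maxReceiveCount\\":\\"%s\\"}"}' "$1" "$2"
}

# Look up the ARN of the dead-letter queue and attach it to the source queue.
# Usage: attach_dead_letter_queue <queue-url> <dead-letter-queue-url> <max-receive-count>
attach_dead_letter_queue() {
  local queue_url="$1" dlq_url="$2" max_receives="$3" dlq_arn
  dlq_arn=$(aws sqs get-queue-attributes --queue-url "$dlq_url" \
    --attribute-names QueueArn --query Attributes.QueueArn --output text)
  aws sqs set-queue-attributes --queue-url "$queue_url" \
    --attributes "$(redrive_attributes "$dlq_arn" "$max_receives")"
}
```

Calling attach_dead_letter_queue "$QUEUE_URL" "$DLQ_URL" 1 sets the maximum receives to 1, as described above.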

3) You can have a maximum of 20,000 inflight messages

A FIFO SQS queue has an important limit compared to standard SQS queues. The number of inflight messages is limited to 20,000.

“A message is considered to be in flight after it is received from a queue by a consumer, but not yet deleted from the queue.”

AWS’s definition of an in flight message

That means that if you’ve got consumers that are currently processing 20,000 messages, then the next receive message request you make won’t return anything. And that’s the case even if you have messages with a different message group id to those already in flight.

The other less obvious implication of this limit is that even if you have only one message inflight, any other messages with the same message group id count towards the inflight limit. To illustrate this, we’re going to create a scenario where we have a FIFO queue with:

  • 20,000 messages with the same message group id

  • 1 message with another message group id

FIFO maximum in flight messages

Try this example out yourself by executing this script.

#!/bin/bash
set -e
printQueueSize() {
  QUEUE_SIZE=$(aws sqs get-queue-attributes --attribute-names ApproximateNumberOfMessages --queue-url "$1" --query Attributes.ApproximateNumberOfMessages --output text)
  echo "$2 size = $QUEUE_SIZE"
}
sendOneThousandMessages() {
  for i in {0..99}; do
    ENTRIES=""
    for p in {0..9}; do
      MESSAGE_INDEX=$((10 * i + $2 + p + 1))
      ENTRIES="${ENTRIES}Id=\"$p\",MessageBody=\"Message$MESSAGE_INDEX\",MessageGroupId=\"1\" "
    done
    aws sqs send-message-batch --entries $ENTRIES --queue-url "$1" >/dev/null
  done
}
QUEUE_URL=$(aws sqs create-queue --queue-name test.fifo --attributes FifoQueue=true,ContentBasedDeduplication=true --query QueueUrl --output text)
echo "Created SQS FIFO queue $QUEUE_URL"
echo "Sending 20000 messages to queue with message group id 1"
# send 1,000 messages up front
sendOneThousandMessages "$QUEUE_URL" 0
# send the rest of the messages in parallel
for i in {2..20}; do
  OFFSET=$(((i - 1) * 1000))
  sendOneThousandMessages "$QUEUE_URL" "$OFFSET" &
done
wait
echo "Sending 1 message to queue with message group id 2"
aws sqs send-message --message-body "Message20001" --message-group-id 2 --queue-url "$QUEUE_URL" >/dev/null
printQueueSize "$QUEUE_URL" "Queue"
echo "Receive message attempt 1"
aws sqs receive-message --queue-url "$QUEUE_URL"
echo "Receive message attempt 2"
aws sqs receive-message --queue-url "$QUEUE_URL"
echo "Waiting 30 seconds"
sleep 30
echo "Delete message"
RECEIPT_HANDLE=$(aws sqs receive-message --queue-url "$QUEUE_URL" --query 'Messages[0].ReceiptHandle' --output text)
aws sqs delete-message --receipt-handle "$RECEIPT_HANDLE" --queue-url "$QUEUE_URL"
echo "Receive message attempt 3"
aws sqs receive-message --queue-url "$QUEUE_URL"
echo "Receive message attempt 4"
aws sqs receive-message --queue-url "$QUEUE_URL"
echo "Finished. Deleting queue."
aws sqs delete-queue --queue-url "$QUEUE_URL"

It takes a few minutes to add the 20,001 queue messages, but eventually produces the following output.

Created SQS FIFO queue https://eu-west-1.queue.amazonaws.com/299404798587/test.fifo
Sending 20000 messages to queue with message group id 1
Sending 1 message to queue with message group id 2
Queue size = 20001
Receive message attempt 1
{
    "Messages": [
        {
            "MessageId": "94a81449-947d-417e-8daa-b6c13c6c5733",
            "ReceiptHandle": "AQEBDox6zRkiaq9hLSE3xZkQBueQtCrYSCJDswBU72qvKvjp0N1e0RHsOIBLs/qDC3Lga6bvglk/+5Eqr9jYBlqF59XaH+4bMtCN59zIFR8HjteZOxihONdjK3Z+RqynmpOmYbVZ/d5QdEZuwB10Q0weHaPbHU0aoO7tZAvFmjPQOdDzk75qLeT2tTqHGhQeP+Dcdv3GD6n8DpfUDfGZg7ZuNu0H7CR6lk1pQ8pJpPVSc3OTopRis0P2YqbGOtuK3L74XDmNq/E2TxdpO7ssJRQSEw==",
            "MD5OfBody": "86461527178884b100f650fd417ed936",
            "Body": "Message1"
        }
    ]
}
Receive message attempt 2
Waiting 30 seconds
Delete message
Receive message attempt 3
{
    "Messages": [
        {
            "MessageId": "11ab7d68-6248-4d58-a669-49fdce4ad521",
            "ReceiptHandle": "AQEBhuk4U3TXag9yUewfZ3cXScnVFg9PGZUz6RP0u+PPloSpA8Lg0OtMy5h/0OZTpjb1+hhjCrX1nXspLd5E1NCDxZmfgDEYcImKvIEj1imb0OlaZM2GVXKx3mryo/UDw0jCFQM99wgt2mi3GSLKvEz3YoVr0H56tVD8igV4CNbwfwLMlcoDz61iu+DBupjGtDpacUTbLPAKvuV62/V0dxOgSqRqWLuAvYtPoAyoPlGD+4Wk7Vi4Nol96Wel/P2UXiPipG23KeYSklxeZNjcViRNvA==",
            "MD5OfBody": "f209385ac93e63d9efed6fedc158b16a",
            "Body": "Message2"
        }
    ]
}
Receive message attempt 4
{
    "Messages": [
        {
            "MessageId": "8edd2281-677a-422d-aeda-f91d2a5b1d95",
            "ReceiptHandle": "AQEBTvOso1s2q3IdYmL0La6OldUqCba4PYXYCrG9rkdzOzb8CGdaqpNMReGeIWaAK1HeF9Ew5N91NRG6EuE76GTMuTT9YEZ/TAMid4im6lEJ6k/f+A15UwMoEzMxGXcfFyt6J+bhQbj9f53ZEuLbWp1r3NVLuSsqRRXn9JroHi0FLWI13Hy2xY0EvVKBzATxrBCEiB2Oef39yzXNLLz6ZS6ZXIK8MDXpC9U76O39DdzEuf7gSVvmcxEgcTPAKMrEb9Frok0pNrBUPW/hz3kd2csNrw==",
            "MD5OfBody": "3a7c1cfdb03a4c621424b334c7507621",
            "Body": "Message20001"
        }
    ]
}
Finished. Deleting queue.

Let’s go through this output step-by-step:

  1. we create a FIFO queue with size 20,001 (20,000 messages with one message group id, 1 message with another). Each message has a unique numbered message body of the format Messagexxx.
  2. we receive a single message from the queue, Message1
  3. the next receive message attempt doesn’t return any messages, even though there is another message on the queue with a different message group id
  4. after waiting 30 seconds for the visibility timeout to expire, we receive Message1 again and delete it, bringing the queue size to 20,000. The queue is now within the maximum inflight message limit.
  5. we make two more receive message requests, and get two messages:
    • a message with the first message group id, Message2
    • a message with the second message group id, Message20001

FIFO queue tip 3

A FIFO queue has a maximum inflight message limit of 20,000. This could cause an issue if:

  • you have many consumers - eventually your consumers won’t be able to receive any more messages

  • you have many messages with the same message group id - in this case you may be blocked from receiving messages with a different message group id

Consider the implications of the inflight message limit when designing your FIFO queue. You might want to:

  1. keep the number of messages with the same message group id low
  2. implement a dead-letter queue so that messages that fail processing are quickly moved out of the main queue
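
It can also help to keep an eye on how close a queue is to the limit. The sketch below reads the queue’s ApproximateNumberOfMessagesNotVisible attribute, which counts messages that have been received but not yet deleted, and warns when it passes a threshold. The 80% default threshold and the function names are assumptions for illustration. It only covers the first case above, so for the second case you may also want to watch ApproximateNumberOfMessages for a growing backlog.

```shell
#!/bin/bash
# In-flight limit for FIFO queues discussed in this article.
INFLIGHT_LIMIT=20000

# Print the approximate number of in-flight messages for a queue.
# Usage: inflight_count <queue-url>
inflight_count() {
  aws sqs get-queue-attributes --queue-url "$1" \
    --attribute-names ApproximateNumberOfMessagesNotVisible \
    --query Attributes.ApproximateNumberOfMessagesNotVisible --output text
}

# Warn (and return non-zero) when the in-flight count reaches a percentage
# of the limit. Assumed default threshold: 80 percent.
# Usage: check_inflight <queue-url> [threshold-percent]
check_inflight() {
  local count threshold_percent="${2:-80}" threshold
  count=$(inflight_count "$1")
  threshold=$((INFLIGHT_LIMIT * threshold_percent / 100))
  if [ "$count" -ge "$threshold" ]; then
    echo "WARNING: $count in-flight messages (threshold $threshold)"
    return 1
  fi
  echo "OK: $count in-flight messages (threshold $threshold)"
}
```

You could run check_inflight "$QUEUE_URL" on a schedule, or use the same attribute as the basis for an alarm in your monitoring tool of choice.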

Final thoughts

You’ll understand now that there’s a bit more to think about with the SQS FIFO queue than first appears. Bearing the above in mind, you should be able to design a solution to meet your requirements.

If you’re still stuck, feel free to contact me and I’ll try my best to help.