
I have a project where there's an external API which implements throttling. Roughly speaking, I'm allowed to perform N requests per minute. I also have a message queue (Apache Kafka) whose consumers consume API requests: a consumer receives an API URL to call, calls it, then produces another message which is processed by another consumer (some business logic which updates internal databases).

I'm wondering if it's possible to enforce the N-requests restriction using the message queue alone. As far as I understand, it's impossible, or at least contrary to the main purpose of message queues.

So far I've come up with the following alternative:

  1. Business logic that wants some API calls performed pushes them onto a queue data structure (one that supports enqueue/push and dequeue/pop operations; Redis lists, for example, offer such semantics).

  2. A job runs every minute, pops up to N API calls from the queue, and hands them to a worker that performs the calls concurrently and produces messages with the results, which are then processed by the downstream consumers. If any call needs to be retried, it's pushed back onto the queue.

This way it's easy to enforce N requests per minute, but it introduces additional complexity beyond the message queue itself. (A rough sketch of step 2 follows.)
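For concreteness, a minimal Python sketch of step 2, assuming the business logic RPUSHes URLs onto a Redis list called api_calls and results go to a Kafka topic called api-results (all names and the limit value are illustrative):

    import json
    import time
    from concurrent.futures import ThreadPoolExecutor

    import redis
    import requests
    from kafka import KafkaProducer

    N = 60  # allowed requests per minute (illustrative)

    r = redis.Redis()
    producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode())

    def call_api(url):
        resp = requests.get(url, timeout=10)
        if resp.status_code == 429 or resp.status_code >= 500:
            r.rpush("api_calls", url)  # transient failure: re-queue for a later batch
        else:
            producer.send("api-results", {"url": url, "status": resp.status_code})

    while True:
        started = time.monotonic()
        urls = []
        for _ in range(N):  # pop at most N calls pushed by the business logic
            raw = r.lpop("api_calls")
            if raw is None:
                break
            urls.append(raw.decode())
        with ThreadPoolExecutor(max_workers=10) as pool:
            list(pool.map(call_api, urls))  # run the batch concurrently, wait for all
        time.sleep(max(0.0, 60 - (time.monotonic() - started)))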

Is my solution good architecture?

  • Why not have the job run continuously and pop a message off the queue every N seconds or milliseconds? Do you really need concurrent API calls? That might be the complex part.
    – Greg Burghardt, Jun 6, 2021 at 19:25
  • @GregBurghardt It's an option, but it still uses a queue data structure, right? It's beneficial to make the calls concurrently, though, in order to speed up processing.
    – Yos, Jun 6, 2021 at 19:32

1 Answer


This is actually a common problem when implementing highly available systems. You want to process as much as you can at any given moment, but not too much. The challenge is that you need some sort of coordination between the consumers, while a lot of the advantage of messaging comes from the consumers being independent.

The main issue with the proposed solution is that it introduces a single point of failure: if your manager job fails, your workers sit idle. You could introduce more managers, but then the managers need coordinating and you end up basically back where you started; it becomes a "turtles all the way down" type of problem.

It might be out of scope here, but another thing that seems off is that messaging platforms are designed to allow many consumers with different goals. Controlling the rate of requests by throttling the reads from Kafka therefore looks like solving the problem in the wrong place. I'm not saying it won't work; it's that I tend to prefer direct solutions over indirect ones. By that I mean: if possible, throttle the requests themselves, not the things that generate the requests.

If you buy that, here's what I would suggest: find some sort of solution that provides a distributed semaphore over the requests in flight. I recall there was an open-source Google library (Java) for this, but I don't know what the best option is in 2021. This raises a bunch of somewhat thorny issues related to the CAP theorem, but the good news is that a lot of smart people have worked on the problem and published open-source solutions.
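To make that concrete, here is one simple way to get a shared limit without a manager process: a fixed-window counter in Redis. This is my own minimal sketch, not the Google library; the key name, window, and limit are illustrative. Because every consumer increments the same key atomically, the limit holds across all of them:

    import time

    import redis

    r = redis.Redis()

    def try_acquire(limit, window_seconds=60):
        """Atomically claim one of `limit` request slots in the current window."""
        window = int(time.time()) // window_seconds
        key = f"api-rate:{window}"
        pipe = r.pipeline()
        pipe.incr(key)                        # atomic across all consumers
        pipe.expire(key, window_seconds * 2)  # old windows clean themselves up
        count, _ = pipe.execute()
        return count <= limit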

Essentially, you read from the queue and execute a request only when the semaphore grants you a slot. The messaging piece is fine as it is, because it is a pull model: you only pull more messages when you are able to send the requests. The only refinement I can think of beyond that naïve solution is to allow reserving permits in batches to reduce chattiness, as sketched below.
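Putting the two pieces together, each Kafka consumer asks the limiter for a slot before handling the next request; a sketch reusing the try_acquire helper above (topic names are again illustrative):

    import json
    import time

    import requests
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer("api-requests",
                             value_deserializer=lambda v: json.loads(v.decode()))
    producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode())

    for message in consumer:
        # Block until the shared limiter grants a slot. Because this is a
        # pull model, unread messages simply wait in the topic meanwhile.
        while not try_acquire(limit=60):  # try_acquire from the sketch above
            time.sleep(1)
        resp = requests.get(message.value["url"], timeout=10)
        producer.send("api-results", {"url": message.value["url"],
                                      "status": resp.status_code})

Batch reservation would amount to replacing INCR with INCRBY and claiming several slots per round trip, at the cost of occasionally wasting permits at the end of a window.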
