Laravel job rate limit middleware without Redis (using cache locks)
Recently I’ve been working on a web scraper for an online store, which I implemented using Laravel’s jobs.
When running a scraper, it’s vital to make sure we’re not making too many calls to the target website. On the other hand, we don’t want to make too few calls, because that slows the scraper down. This is where rate limiting comes in. In other words, it’s about finding the best ratio of calls per time period for the scraper.
Laravel has a great infrastructure for running background jobs. It also supports job middleware, which can be used for various types of tasks. My interest was in rate limiting middleware. Laravel has a built-in rate limiter for jobs. You can read more about it in the Laravel queue documentation.
My goal was to fire a scraper job every 5 seconds.
In the Laravel documentation there is an example for exactly this case, but it relies on Redis.
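For reference, the Redis-based example in the Laravel 8 job middleware documentation looks roughly like the sketch below. The throttle key 'key' is a placeholder, and the class name and namespace are assumptions for illustration.

```php
<?php

namespace App\Jobs\Middleware;

use Illuminate\Support\Facades\Redis;

class RateLimited
{
    /**
     * Let one job through every 5 seconds, using Redis to hold the state.
     */
    public function handle($job, $next)
    {
        Redis::throttle('key')
            ->block(0)->allow(1)->every(5)
            ->then(function () use ($job, $next) {
                // Lock obtained: run the job.
                $next($job);
            }, function () use ($job) {
                // Could not obtain the lock: put the job back on the queue
                // and try again in 5 seconds.
                $job->release(5);
            });
    }
}
```

This does what I wanted, but it only works with a Redis server available to the application.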
Since I didn’t want to install Redis just for rate limiting, I looked for a simpler option. Back in the documentation, there is another example, based on the built-in RateLimiter.
It looked promising, but that way you can only set the number of job runs per minute (or per hour). I gave it a shot and set it to 12 runs a minute with Limit::perMinute(12). Unfortunately, it didn’t do what I expected: every minute it runs 12 jobs back to back without any delay, and then waits until the next minute starts.
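The attempt described above can be sketched as follows. The limiter name 'scraper' is hypothetical, and registering the limiter in AppServiceProvider is an assumption based on where the documentation suggests defining rate limiters.

```php
<?php

namespace App\Providers;

use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Support\Facades\RateLimiter;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // 'scraper' is a hypothetical limiter name.
        // 12 runs per minute averages one run every 5 seconds,
        // but nothing spaces the runs out within the minute.
        RateLimiter::for('scraper', function ($job) {
            return Limit::perMinute(12);
        });
    }
}
```

A job opts in by returning [new \Illuminate\Queue\Middleware\RateLimited('scraper')] from its middleware method. The limit only caps the count per window, which is why the 12 runs bunch up at the start of each minute.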
How cache locks helped me
After that I went back to the Laravel documentation to find a way of implementing my own rate limiter without installing additional software.
Clearly, I needed a stateful component that keeps information about job runs and is accessible from different queue workers.
The easiest way to store a value in Laravel is the built-in Cache component. On top of the cache infrastructure, the Laravel framework provides an atomic lock system. (Find out more about cache locks here https://laravel.com/docs/8.x/cache#atomic-locks).
The idea of using atomic locks for rate limiting is pretty simple. A job tries to acquire a lock that expires after a fixed period of time. If it gets the lock, it runs. If the lock is still held, the job goes back to the queue, so the next job can run only after that time has passed.
In the example below I used 5 seconds, but this time period can be made configurable through the middleware constructor. Additionally, you should give the jobs a group name (see $jobGroup in the example). It’s used to rate limit only the jobs in the current group, so jobs from other groups won’t be affected. It’s totally fine to use one group for all jobs, which I did in the example.
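A minimal sketch of such a middleware is shown here. The class name, namespace, lock key prefix, and constructor defaults are assumptions for illustration. Unlike the hard-coded 5 seconds described above, this version already takes the period through the constructor.

```php
<?php

namespace App\Jobs\Middleware;

use Illuminate\Support\Facades\Cache;

class RateLimited
{
    /** @var string Jobs sharing this name share one rate limit. */
    private $jobGroup;

    /** @var int Minimum number of seconds between two job runs. */
    private $seconds;

    public function __construct(string $jobGroup = 'default', int $seconds = 5)
    {
        $this->jobGroup = $jobGroup;
        $this->seconds = $seconds;
    }

    public function handle($job, $next)
    {
        // The lock expires on its own after $seconds. It is deliberately
        // never released, so no other job in the group can run until then.
        $lock = Cache::lock('rate-limit:' . $this->jobGroup, $this->seconds);

        if ($lock->get()) {
            // Lock acquired: this job is allowed to run now.
            $next($job);

            return;
        }

        // Lock is still held: put the job back on the queue and retry later.
        $job->release($this->seconds);
    }
}
```

Two caveats, as far as I can tell from the documentation. Atomic locks need a cache driver that supports them (memcached, redis, dynamodb, database, file, or array), and all queue workers have to talk to the same cache store. Also, releasing a job still counts as an attempt, so jobs that wait a long time may need a generous tries value or a retryUntil method.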
The final step is to connect the middleware we created with the job we’re going to run. That’s pretty easy to do.
Notice the middleware method, which returns an array containing the middleware instance.
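Here is a sketch of a job wired up this way. The job name ScrapePage, its $url property, the 'scraper' group name, and the one-hour retry window are all hypothetical. The middleware class is assumed to live in App\Jobs\Middleware and to accept a group name and a period in seconds.

```php
<?php

namespace App\Jobs;

use App\Jobs\Middleware\RateLimited;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ScrapePage implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** @var string Page to scrape (hypothetical payload). */
    public $url;

    public function __construct(string $url)
    {
        $this->url = $url;
    }

    /**
     * Middleware the job passes through before handle() is called.
     */
    public function middleware()
    {
        return [new RateLimited('scraper', 5)];
    }

    /**
     * Released jobs count as attempts, so limit retries by time instead.
     */
    public function retryUntil()
    {
        return now()->addHour();
    }

    public function handle()
    {
        // Scraping logic for $this->url goes here.
    }
}
```

Dispatching works as usual, for example ScrapePage::dispatch($url). The InteractsWithQueue trait is what gives the job the release method that the middleware calls.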
I hope this helps you run your Laravel jobs more effectively.
Feel free to leave a comment on this post if you have any issues or suggestions for the code I’ve shared.