Over at FutureProofs we are in the process of converting one of our existing PDF-intensive operations into a process that is triggered by a RabbitMQ consumer.
RabbitMQ, for the uninitiated, is, put simply, a queueing system. You push “messages” onto it and then build consumers to take those messages off again.
Why would you do this? Why not have a cron job run these? Or simply trigger a process and let the user wait until it finishes? Well… these work (shamefully, we’ve even tried them), but what happens when you’ve got a really slow process (like converting a 1000-page PDF into images)? Users expect to see updates quickly, not after a 30-second delay.
Fundamentally, a RabbitMQ message body is just a string of bytes; at FP we send JSON strings. We generally reference DB ids, and as the consumer runs it extracts the required data from the database and then performs an action. We have consumers that log data, convert PDFs, send emails, etc.
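To make the "send an id, look up the rest later" idea concrete, here is a minimal sketch in plain Python. It does not talk to RabbitMQ at all: `FAKE_DB`, `build_message`, `handle_message` and the `pdf_id` field are all made-up names for illustration, and the dictionary simply stands in for a real database.

```python
import json

# Hypothetical stand-in for a database table, keyed by row id.
FAKE_DB = {42: {"path": "/uploads/report.pdf", "pages": 1000}}


def build_message(pdf_id: int) -> bytes:
    """Producer side: send only the id, not the PDF itself."""
    return json.dumps({"pdf_id": pdf_id}).encode("utf-8")


def handle_message(body: bytes, db: dict) -> str:
    """Consumer side: decode the JSON, then fetch the real data by id."""
    payload = json.loads(body.decode("utf-8"))
    record = db[payload["pdf_id"]]
    return f"converting {record['path']} ({record['pages']} pages)"


message = build_message(42)
result = handle_message(message, FAKE_DB)
print(message)
print(result)
```

Keeping the message this small means the queue never has to carry the PDF itself, and the consumer always reads the latest state from the database when it actually runs.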
These queues can be persistent (meaning that should the server reset, they’re not all deleted) or they can act like streams: jobs are passed in, and if something is listening each job is processed; otherwise it disappears into the aether.
In addition to persistence, when a consumer is unable to complete a task (through choice or because of a fatal error), the job can be automatically added back onto the queue for later processing.
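The requeue behaviour is easier to see in a toy version. The sketch below is an in-process imitation using a `deque`, not RabbitMQ itself; `run_consumer`, `flaky_handler` and the job names are invented for the example, and the `max_attempts` cap is my own assumption (it is there so a permanently broken job cannot loop forever).

```python
from collections import deque


def run_consumer(jobs, handler, max_attempts=3):
    """Work through the queue; put failed jobs back until max_attempts is hit."""
    done, dropped = [], []
    while jobs:
        job, attempts = jobs.popleft()
        try:
            handler(job)
            done.append(job)
        except Exception:
            if attempts + 1 < max_attempts:
                # Requeue at the back so other jobs are not blocked.
                jobs.append((job, attempts + 1))
            else:
                dropped.append(job)
    return done, dropped


calls = {}


def flaky_handler(job):
    """Hypothetical handler: pdf-2 fails once, pdf-3 fails every time."""
    calls[job] = calls.get(job, 0) + 1
    if job == "pdf-2" and calls[job] < 2:
        raise RuntimeError("temporary failure")
    if job == "pdf-3":
        raise RuntimeError("always fails")


# Each entry is (job, attempts so far).
jobs = deque([("pdf-1", 0), ("pdf-2", 0), ("pdf-3", 0)])
done, dropped = run_consumer(jobs, flaky_handler)
print(done, dropped)
```

With a real broker the consumer would typically signal the failure by negatively acknowledging the message and asking for a requeue, rather than appending to a list by hand.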
Finally, when you have a lot of jobs that need processing, you can spin up multiple consumers, each one taking jobs off the same queue. This makes queues extremely scalable.
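As a rough illustration of several consumers sharing one queue, here is a sketch using threads and Python's standard `queue.Queue` in place of RabbitMQ. The `worker` function, the consumer names and the choice of three consumers and twenty jobs are all arbitrary assumptions for the example.

```python
import queue
import threading


def worker(name, jobs, results, lock):
    """Keep taking jobs off the shared queue until it is empty."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        with lock:
            # Record which consumer handled which job.
            results.append((name, job))
        jobs.task_done()


jobs = queue.Queue()
for page in range(20):
    jobs.put(page)

results = []
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(f"consumer-{i}", jobs, results, lock))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

processed = sorted(job for _, job in results)
print(len(results), "jobs handled")
```

Each job is handed to exactly one consumer, so adding more consumers spreads the work without any of them needing to know about the others. Which consumer picks up which job will vary from run to run.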
In later posts I hope to go into some practical examples; for now, I’d encourage anyone interested to look at the official examples.