I’ve been using Mixpanel for two weeks now.
I’ve followed their documentation, and we were sending millions of requests to their servers until last night, when one of their servers went down. Our requests started failing, which caused things to clog up on our end. At first we thought we had been sending too many requests and that Mixpanel had temporarily blocked us (fortunately that wasn’t the case, but it led us to point 2, so keep reading).
This made me understand two things:
Because their service was down, our Sidekiq queue kept growing and the jobs kept failing. By default Sidekiq retries failed jobs, which prevented our crucial jobs from running.
If you expect to send a massive number of events, then the default Mixpanel consumer is not for you (I’ll explain why in a second).
What you can do to prevent the first problem from happening again is to add a safety check. For example, here I only send another Mixpanel event if the queue size is below 100k; if it isn’t, it’s better to drop the event than to crash the application.
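A minimal sketch of that safety check might look like this. The names here (`track_event`, the worker call in the comment) are hypothetical, and in a real Sidekiq app the queue size would come from something like `Sidekiq::Queue.new("default").size`:

```ruby
# Threshold from the post: stop sending once the queue hits 100k jobs.
QUEUE_SIZE_LIMIT = 100_000

# queue_size is passed in here to keep the sketch self-contained;
# in a real app you'd read it from Sidekiq itself.
def track_event(name, properties, queue_size)
  # Dropping an analytics event is cheaper than clogging the app.
  return false if queue_size >= QUEUE_SIZE_LIMIT

  # In a real app: TrackEventWorker.perform_async(name, properties)
  true
end
```

The point is that the analytics call has an escape hatch: when the queue backs up, events are silently dropped instead of piling on more failing jobs.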
The second point is more interesting: the Ruby Mixpanel library has another consumer, called BufferedConsumer, which holds up to 50 events in memory and then sends them all together in a single request, which drastically cuts down the number of requests you make.
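The buffering idea can be sketched in plain Ruby. This toy class is a stand-in, not the gem’s actual implementation (that would be `Mixpanel::BufferedConsumer` from the mixpanel-ruby gem); it just shows why batching 50 events per request reduces traffic:

```ruby
# Toy stand-in for a buffered consumer: collect events in memory,
# flush them as one batch once the buffer is full.
class SimpleBufferedConsumer
  MAX_BUFFER = 50 # the post mentions 50 events per request

  def initialize(&sender)
    @buffer = []
    @sender = sender # in real life this would make one HTTP request per batch
  end

  def send!(event)
    @buffer << event
    flush if @buffer.size >= MAX_BUFFER
  end

  # Call this at shutdown so the last partial batch isn't lost.
  def flush
    return if @buffer.empty?
    @sender.call(@buffer)
    @buffer.clear
  end
end
```

With this pattern, sending 120 events costs 3 requests instead of 120. One thing to remember with the real BufferedConsumer is the same caveat as the sketch: flush before the process exits, or the last partial buffer never gets sent.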
I think it’s wise to adopt this philosophy: if you’re using a service you can’t control, isolate it as much as possible so it can’t cause side effects in your app. In this case it’s OK if their service goes down, because our app doesn’t depend on them for anything other than sending events, and it’s OK if we miss some events. What is not OK is our service getting clogged so that we can’t provide a good service to our users.
Thank you for reading.