Rails find_each/ find_in_batches pitfalls
Every rails-developer knows about method find_each or find_in_batches (if you check the implementation for the first one you find that it uses the second one under the hood) from ActiveRecord. Even more, it is a good practice to use these methods when you need to iterate throw a big amount of records in DB.
If you don't familiar with it you can check this https://apidock.com/rails/ActiveRecord/Batches/find_each.
Looks like, we always can/must use find_each
instead of each
when we need to iterate throw some bunch of records. We also thought this way. But one of our applications struggled with DB: load for DB was around 95%, it was obvious that DB is a bottleneck.
I'd started to analyze slow-query logs in MySQL and reports in NewRelic. It was quite simple to find the place in the code which produce this SQL query:
Survey.select(:id, :offer_uuid).active.where(provider: provider).where.not(survey_id: current_surveys).find_in_batches(batch_size: 10) do |survey_batch| DeactivateSurveysService.perform(survey_batch, provider)
end
Because of using where.not
MySQL should fetch all records in order to sort them by id (this sorting used in find_each/find_in_batches
) and only after this takes 10 records - it can very slow if a scope of the records is big how we had.
CONCLUSION
It is good practice using find_each
/find_in_batches
in order to not consume a lot of memory but need to be aware of what resulting SQL is.