Huge File Uploads On Amazon S3 – Rails With Dropzone on Unicorn + Nginx + Sidekiq
What is it …
Almost every second web application needs to let users upload large files. With high-bandwidth connections widely available and most applications hosted in the cloud, users expect to be able to upload large amounts of data to a web application.
While working on a web application in the banking domain, we needed exactly this feature, with many users likely to upload huge files in the same time slot of the day! Each user may upload between 500 MB and 2 GB of files, and this is critical because the number of end users of this application has been growing over the last few months!
.. and how did we approach it?
Granted, our Rails application does not have a big code-base or feature set yet, nor a huge user-base. But the number of users is now growing each month at twice the pace it was six months ago! That means we could have almost three times the users within the next 2 months, and if this feature is not implemented correctly, performance will suffer.
We started testing the application with growing upload sizes, each time uploading 500–900 MB of files in a single go, where each file could be at most 100 MB. That worked fine, but when we went above 1 GB, or when the network was slower, uploads started failing!
- We identified that the upload mechanism had been kept very basic for the first iteration, and the requirements have since changed. So we had to change it to work under all conditions. We use Dropzone (for drag-and-drop support and AJAX-based file uploads), Amazon S3 (for storing the files in the cloud), nginx as the web server and unicorn as the application server.
- Nginx : Every request to the server first goes to nginx, which forwards it to unicorn, which in turn routes it to the relevant controller. We found many references and reported issues about large file uploads with Rails, and based on those we fixed the nginx configuration by adding the nginx upload module. This requires compiling nginx from source, changing the service configuration and restarting the server. We also tweaked the nginx configuration so that nginx handles client timeouts and the file upload itself, and forwards only the non-file parameters to Rails. This reduced the response time, because Rails no longer needs to handle the file.
- Amazon S3 data upload : We identified that for a huge file, Rails takes a long time to push the data to S3, and meanwhile the request may expire because of the timeout between nginx and unicorn. The unicorn timeout is commonly set to 30s (the default is 60s), so if the push takes longer than that, the request times out and nginx responds to the browser with an error. How to fix this? We found carrierwave_backgrounder, which lets the file upload be processed in a background job. We use sidekiq to run this background job, which takes the long-running upload out of the synchronous request cycle, so the response is faster.
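The hand-off described in the bullets above can be sketched in plain Ruby. This is only an illustration of the pattern, not our production code: `FakeStore` stands in for S3, `UploadQueue` (a thread plus an in-memory queue) stands in for sidekiq, and `handle_upload` with its `"file_path"` and `"file_name"` parameters is a hypothetical controller action that receives the temp file location from the web server instead of the file bytes.

```ruby
# Hypothetical stand-in for the S3 client: it only records what would be uploaded.
class FakeStore
  attr_reader :objects

  def initialize
    @objects = {}
  end

  def put(key, path)
    @objects[key] = File.binread(path)
  end
end

# Stand-in for a sidekiq queue: an in-memory queue drained by one worker thread.
class UploadQueue
  def initialize(store)
    @store = store
    @jobs = Queue.new
    @worker = Thread.new { drain }
  end

  # Called inside the request cycle; returns immediately without touching the file.
  def enqueue(key, path)
    @jobs << [key, path]
    :queued
  end

  # Ask the worker to stop after the jobs already queued, then wait for it.
  def shutdown
    @jobs << :stop
    @worker.join
  end

  private

  def drain
    loop do
      job = @jobs.pop
      break if job == :stop

      key, path = job
      @store.put(key, path)                   # the slow part, now outside the request
      File.delete(path) if File.exist?(path)  # clean up the temp file left by the web server
    end
  end
end

# Hypothetical controller action body. `params` mimics what the web server
# forwards: where the file was stored and its original name (assumed field names).
def handle_upload(params, queue)
  path = params.fetch("file_path")
  name = params.fetch("file_name")
  raise ArgumentError, "missing upload: #{path}" unless File.file?(path)

  queue.enqueue("uploads/#{name}", path)
end
```

The point of the design is that `handle_upload` does a constant amount of work regardless of file size, so the request finishes well inside the app server timeout while the worker pushes the data to storage at its own pace.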
And it's done, as expected! Now we can upload 1.5–2 GB of files in total at once, even on a cloud server with 4 GB of RAM! End users see no difference from before: they can still watch the upload progress, and the server never complains about timeouts!
- Always send data in chunks to the server, to reduce the response time.
- Configure the web server to handle large file uploads; it does this faster than most app servers.
- Use background processing on the app server when pushing larger uploads to cloud storage such as S3, Dropbox etc.
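To make the first point concrete, here is a minimal sketch of how a file can be divided into chunks. The helper `chunk_ranges` and the 8 MB chunk size are assumptions chosen for illustration; they are not values from our setup.

```ruby
# Split a file of `total_bytes` into [offset, length] pairs,
# each at most `chunk_bytes` long. Hypothetical helper for illustration.
def chunk_ranges(total_bytes, chunk_bytes)
  raise ArgumentError, "chunk_bytes must be positive" unless chunk_bytes.positive?
  raise ArgumentError, "total_bytes must not be negative" if total_bytes.negative?

  (0...total_bytes).step(chunk_bytes).map do |offset|
    [offset, [chunk_bytes, total_bytes - offset].min]
  end
end

MB = 1024 * 1024

# A 100 MB file in 8 MB chunks: 12 full chunks plus one final 4 MB chunk.
ranges = chunk_ranges(100 * MB, 8 * MB)
puts ranges.length         # 13
puts ranges.last[1] / MB   # 4
```

Each chunk is a short request of bounded size, so a slow connection delays only the chunk in flight rather than putting the whole upload at risk of a timeout.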
At BoTree Technologies, we build enterprise applications with our RoR team of 25+ engineers.