Loading Timeout Requests and MongoDB Pagination on Docker

Mukmin Pitoyo

Oct 26, 2023, 11:17:11 PM
to Cloud Carbon Footprint
Hi CCF Team,

We are currently running CCF on an EC2 instance via Docker Compose, connected to MongoDB for persistent data storage/caching.
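For reference, the relevant part of our Compose file looks roughly like this (the image tags, service names, and connection string below are illustrative placeholders; `CACHE_MODE` and `MONGODB_URI` are the caching variables described in the CCF docs):

```yaml
services:
  api:
    image: cloudcarbonfootprint/api:latest   # illustrative tag
    environment:
      - CACHE_MODE=MONGODB                   # persist estimates in MongoDB
      - MONGODB_URI=mongodb://mongo:27017    # placeholder connection string
    depends_on:
      - mongo
  mongo:
    image: mongo:6
```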

However, the CUR data set we are trying to load is quite large, and we are having trouble getting CCF to read the data from Athena and cache it in MongoDB; we keep hitting 504 Gateway Timeout errors:

[Screenshot: ccf error 504.PNG]

We then have to manually reload the page so CCF attempts the Athena read again, and after several attempts the Docker API container crashes or times out:

[Screenshot: ccf docker.PNG]

This then results in a 502 Bad Gateway error on the front page:
[Screenshot: ccf error 502.PNG]

We have tried upgrading the EC2 instance to a larger/faster instance type, but we are still facing similar issues. Is there a way to overcome this gateway timeout so our data can load successfully?
Currently, we are truncating the data and feeding it to CCF month by month, since our CUR report data goes all the way back to 2021. After multiple tries it eventually loads, but is there a better way to do this?

Also, is it possible to increase the MongoDB pagination limit through Docker Compose? I understand it is currently limited to 50,000 pages per API call, but since we have almost a million, it takes CCF a long time to fetch the data from MongoDB. I have tried adding this line to the Compose file:

[Screenshot: Screenshot 2023-10-27 111326.jpg]

But so far it still sticks to the default 50,000 pagination limit. Is there another way to configure this limit through the Docker Compose file?

Thank you for your time and hope to hear from the team soon!

Best Regards,
Mukmin

Cloud Carbon Footprint

Nov 9, 2023, 12:33:43 PM
to Cloud Carbon Footprint
Hi Mukmin,

Thank you for bringing this up! To feed the data from Athena into MongoDB without hitting the JavaScript memory error, there are a few things to note. It would be best to use the `seed-cache-file` script to feed the data. That way it doesn't need to work within the client-side timeout limit and can run as a background job. You may still find that you need to manually chunk the date ranges while seeding the db, just to avoid further scalability issues. Here is more documentation on the CLI script to seed the cache file.
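As a rough illustration, that manual chunking could be driven by a loop over month boundaries. This is only a sketch using GNU `date`; the actual `seed-cache-file` invocation is left as a comment, since the script's exact interface (it prompts for the date range interactively on recent versions) depends on your CCF release:

```shell
#!/bin/sh
# Walk month-sized windows from the start of the CUR data to the present,
# seeding one chunk at a time so no single run overwhelms the API.
start="2021-01-01"
end="2023-10-01"
current="$start"
while [ "$(date -d "$current" +%s)" -lt "$(date -d "$end" +%s)" ]; do
  next="$(date -d "$current +1 month" +%Y-%m-%d)"
  echo "Seeding $current to $next"
  # e.g. yarn --cwd packages/cli seed-cache-file
  #      (supply $current / $next when the script prompts for dates)
  current="$next"
done
```

Running each chunk as a background job (e.g. under `nohup`) also avoids tying the seed to a browser session at all.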

It is also important to note that the client-side UI that CCF offers may not be built to handle such large amounts of data. For this reason, we would suggest looking into other business intelligence / data visualization tools, such as AWS QuickSight, to integrate CCF data with. Happy to talk more about this process if you'd like.

Regarding the pagination limit, we have documentation located here that explains why we chose the limit and why it may not be a good idea to extend it beyond that. Currently, the variable cannot be configured through Docker, but this is certainly something we will consider.

Thanks,
The CCF team at Thoughtworks