Hello Cameron,
Thanks a lot for your detailed answer; I really appreciate your help with this. While waiting for feedback, I went ahead and reduced the number of representative days to 4, following this procedure:
1. Remove H49 to H96 in the "time_of_day" table
2. Remove H49 to H96 in the "CapacityFactorTech" table
3. Remove H49 to H96 in the "SegFrac" table and rescale the remaining values in H1 to H48 so they still sum to 1
4. Remove H49 to H96 in the "DemandSpecificDistribution" table and rescale the remaining values in H1 to H48 so they still sum to 1
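The rescaling in steps 3 and 4 can be sketched roughly as follows. This is only an illustration under assumptions, not the exact procedure applied to the database: the function name `renormalize`, the season labels "S1" and "S2", and the toy weights are all hypothetical, and the fractions are assumed to be keyed by (season, time_of_day). Dividing every remaining value by the sum of the remaining values is one simple way to keep their relative proportions unchanged.

```python
def renormalize(fractions, keep_tods):
    """Keep only entries whose time-of-day label is in keep_tods,
    then rescale the remaining values so they sum to 1."""
    kept = {key: value for key, value in fractions.items() if key[1] in keep_tods}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no positive fractions left to renormalize")
    # Dividing by the remaining total preserves the ratios between kept entries.
    return {key: value / total for key, value in kept.items()}


# Toy data (hypothetical): 2 seasons x 96 time-of-day slices with uneven weights.
seasons = ["S1", "S2"]
tods = [f"H{i}" for i in range(1, 97)]
raw = {(s, t): float(i) for s in seasons for i, t in enumerate(tods, start=1)}
raw_total = sum(raw.values())
seg_frac = {key: value / raw_total for key, value in raw.items()}

# Drop H49..H96 and rescale what is left.
keep = {f"H{i}" for i in range(1, 49)}
reduced = renormalize(seg_frac, keep)
```

For DemandSpecificDistribution, the same function would presumably be applied separately to each region and demand, since each of those distributions has to sum to 1 on its own.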
Note that I took care in steps 3 and 4 to keep the same distribution as in the 8-day database. The total simulation time was 34 hrs and the results were consistent. I attach a figure of the hourly electricity generation by source for the year 2020 for the 4 representative days.
For the SegFrac, DemandSpecificDistribution, and CapacityFactorTech data, I did not take the data from the older US_9R_4D database as you recommended, because that database uses 4 seasons with one representative day per season. In contrast, the newer US_9R_8D uses only 2 seasons with 4 representative days in each. Reusing the data could therefore create more implementation challenges. Moreover, since US_9R_8D is more recent, it may include more technologies than US_9R_4D, so the 4D database could be missing a lot of CapacityFactorTech data.
Thanks for pointing out that using fewer representative days can bias the results. I am aware of that. However, my research is not focused on developing very realistic future scenarios but on understanding the interlinkages between different parameters and how they differ from a base case, so I believe it is a limitation I can accept.
Thanks a lot for all the other recommendations for decreasing the size of the model. I might pursue the 2nd, 4th, or 5th recommendation, since I am mostly interested in modeling long-term scenarios with an hourly resolution in the representative days (which rules out your 1st and 3rd recommendations). I will let you know if I manage to decrease the computation time!
I have one last question: based on your experience, what is the best way to speed up the solution of TEMOA's optimization problems? I am currently using the Gurobi solver, and my virtual machine is fairly powerful (128 GB of RAM, 16 virtual processors, and a CPU speed of 2.59 GHz). However, as I mentioned before, this last simulation took more than 30 hrs to complete. I strongly suspect that the bottleneck is something other than RAM and CPU, perhaps the memory bandwidth? Your expertise would be very much appreciated!
Thanks again!