Hi Wing,
Is it really necessary for you to load this corpus file? All of the training/validation/test sets derived from the corpus are available as separate downloads on the website, and they are easily manageable on a personal machine. These should be enough to reproduce any experimental results.
Loading that corpus file into memory all at once is not possible with the 8-16 GB of RAM a typical personal laptop/desktop has. So if you really need to work with it, you will have to either (1) get access to a workstation/server with more memory, (2) process the data in chunks as Kyle suggested, or (3) use an out-of-core library like Dask, which works through the file in partitions and can spill to local storage instead of holding everything in main memory.
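In case it helps, here is a minimal sketch of option (2) using only the Python standard library. It assumes the corpus is a plain UTF-8 text file with one record per line, which may not match the actual format. The function names (iter_chunks, count_lines) and the file name in the usage note are placeholders of mine, not anything from the corpus tooling.

```python
from itertools import islice
from typing import Iterator, List


def iter_chunks(path: str, lines_per_chunk: int = 100_000) -> Iterator[List[str]]:
    """Yield successive lists of at most `lines_per_chunk` lines from a text file.

    Assumption: the file is line-oriented UTF-8 text.
    """
    with open(path, "r", encoding="utf-8") as handle:
        while True:
            # islice pulls only the next batch of lines from the file iterator,
            # so at most one chunk is held in memory at a time.
            chunk = list(islice(handle, lines_per_chunk))
            if not chunk:
                break
            yield chunk


def count_lines(path: str, lines_per_chunk: int = 100_000) -> int:
    """Example aggregate: count lines while holding only one chunk in memory."""
    total = 0
    for chunk in iter_chunks(path, lines_per_chunk):
        total += len(chunk)
    return total
```

You would call it as count_lines("corpus.txt") with your own path, and replace the counting with whatever per-chunk processing you need. The chunk size is a trade-off: larger chunks mean fewer iterations but more memory per step.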
Cheers,
Ralph