If you don't care about the content type or other metadata associated with each URL, you should be able to generate a list from the index files pretty easily. It looks like the index for the latest crawl isn't world-readable (anyone know why?), but the previous one is.
A crude, brute-force way of doing this would be to run the following (or something equivalent) for each of the 300 index files (400 MB each):
$ aws s3 cp --no-sign-request s3://aws-publicdatasets/common-crawl/cc-index/collections/CC-MAIN-2015-11/indexes/cdx-00000.gz .
$ zcat cdx-00000.gz | cut -f 4 -d '"' | gzip > cdx-00000-urls.gz
If you've got a good internet connection, you could use wget or curl and just pipe everything together, saving the 120 GB of temp space.
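Untested, but something along these lines ought to do it in one pass. I'm assuming the bucket is also reachable over plain HTTPS at the usual S3 endpoint (the aws-publicdatasets.s3.amazonaws.com URL below is that guess) and that the index files run cdx-00000 through cdx-00299:

$ for i in $(seq -f "%05g" 0 299); do \
    curl -sf "https://aws-publicdatasets.s3.amazonaws.com/common-crawl/cc-index/collections/CC-MAIN-2015-11/indexes/cdx-$i.gz" \
    | zcat | cut -f 4 -d '"' | gzip >> cc-2015-11-urls.gz; \
  done

Appending works because concatenated gzip members are still a valid gzip stream; write one output file per index instead if you want to parallelize. If you'd rather stay with the AWS CLI, "aws s3 cp --no-sign-request s3://... -" streams an object to stdout and can stand in for the curl call.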
If you want to filter the list of URLs down to just the HTML pages, or apply any other criterion that needs the metadata, you'll need to process all the WAT files. That's more like 10 TB instead of 120 GB, so you'll definitely want to do this using Amazon AWS.
You'll need to parse the included JSON to get the URI and any other metadata that you are interested in.
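I haven't run this over a whole crawl, but as a sketch of what that looks like with shell tools plus jq: in the WAT files I've poked at, each metadata record's payload is a single-line JSON object, the page URL sits under Envelope > WARC-Header-Metadata > WARC-Target-URI, and the response Content-Type under Envelope > Payload-Metadata > HTTP-Response-Metadata > Headers. WAT_PATH below is just a placeholder for one of the per-file paths listed in the crawl's wat.paths.gz; check the key names against a real record before relying on this.

$ curl -s "https://aws-publicdatasets.s3.amazonaws.com/$WAT_PATH" | zcat | grep '^{' \
    | jq -r 'select((.Envelope."Payload-Metadata"."HTTP-Response-Metadata".Headers."Content-Type" // "")
                    | startswith("text/html"))
             | .Envelope."WARC-Header-Metadata"."WARC-Target-URI"' \
    | gzip > html-urls.gz

If the key paths don't match what's in the current crawl, piping a single record through "head -1 | jq ." will show you the actual structure.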
Alternatively, if you can make do with the 2014 crawl, you can download a 20 GB list of the URLs from the WebDataCommons:
If you need the freshest data before the WebDataCommons folks re-run their page graph analysis (they seem to be focusing on microdata now), you could re-use their extraction software:
Hope that gives you some starting points to work with.
Tom