That was generated using the old Mahout MapReduce recommenders, which had pluggable similarity metrics. I ran it on a very large e-commerce dataset from a real ecom site, covering 6 months of sales. We did cross-validation with an 80% training set and a 20% held-out probe/test set; the test set was the most recent 20% of sales. We then measured MAP@k for several values of k. A decline in MAP@k as k increases means the items are ranked in the right order, and the higher the MAP@k, the better the precision of the recommendations.
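For reference, the MAP@k metric described above can be computed like this. This is a minimal sketch of the standard definition, not the Mahout implementation; function names are my own:

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@k for one user: average of precision@i at each rank i
    (1-based) where a relevant item appears in the top k."""
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i  # precision at this cutoff
    # Normalize by the max number of hits achievable in the top k.
    return score / min(len(relevant), k)

def map_at_k(all_recommended, all_relevant, k):
    """MAP@k: mean of AP@k over all users in the test set."""
    aps = [average_precision_at_k(recs, rel, k)
           for recs, rel in zip(all_recommended, all_relevant)]
    return sum(aps) / len(aps) if aps else 0.0
```

Here `recommended` is the ranked list the recommender produced and `relevant` is the set of held-out purchases for that user; measuring this at several k values gives the curve described above.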
Comparing cross-validation scores between different algorithms is highly suspect, so this used an identical algorithm throughout, though not one I'd use today.