How do you split the MovieLens dataset?


Wang Dawei

Jan 30, 2017, 2:28:57 AM
to MyMediaLite

Hi,

Hope all is well.

I'm comparing the results of my algorithm with the ones on your website.

Which script do you use to split the MovieLens 100k dataset? Your website (http://mymedialite.net/examples/datasets.html) says you split ml-100k with --random-seed=1. Is that done with the crossvalidation.pl script you provide here? https://github.com/zenogantner/MyMediaLite/tree/master/scripts

If so, what random seed do you use when you split ml-1m? It isn't specified on your website. Are you using the same crossvalidation.pl to split both ml-100k and ml-1m with random seed 1? Or are you using the pre-split files that come with ml-100k?

Thank you in advance.

Best,
D

Zeno Gantner

Feb 1, 2017, 4:23:40 AM
to mymed...@googlegroups.com
Hello, 

The crossvalidation.pl script was not used for those experiments. 

Just pass --random-seed=1 and --crossvalidation=5 to the command-line program.

Cheers,
    Z. 

--
You received this message because you are subscribed to the Google Groups "MyMediaLite" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mymedialite+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Wang Dawei

Feb 1, 2017, 9:03:24 AM
to MyMediaLite
Thank you so much for your reply.

I was looking at your GitHub repository earlier and couldn't find the program folder. I have now downloaded your newest version and found it. Which source code do you use to generate the cross-validation files? Can those files be exported to text files, like the ones produced by the script that comes with the ml-100k dataset? I need to run my tests on the same 5-fold test files to compare with the results posted on your website.



Zeno Gantner

Feb 1, 2017, 5:39:06 PM
to mymed...@googlegroups.com
With some C# programming this is possible.
First you need to set
MyMediaLite.Random.Seed = 1;

You can use MyMediaLite.IO.RatingData to load the data and MyMediaLite.Data.RatingCrossValidationSplit to do the split.

Then iterating over the data and writing it to a file should be trivial.

This should give you exactly the same split.
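
For readers who only want to see the general shape of the export step, here is a minimal sketch in Python rather than C#. It is an illustration only: it uses Python's own random number generator, not MyMediaLite's, so it will not reproduce the folds behind the website results. The function name `write_folds` and the output file names are hypothetical, and the input is assumed to be a plain text file with one rating per line, as in ml-100k's u.data.

```python
import random
from pathlib import Path


def write_folds(ratings_path, out_dir, num_folds=5, seed=1):
    """Shuffle rating lines with a fixed seed and write k train/test file pairs.

    Illustration only: this is NOT MyMediaLite's splitting code and does not
    produce the same folds as MyMediaLite with the same seed.
    """
    text = Path(ratings_path).read_text()
    lines = [ln for ln in text.splitlines() if ln.strip()]

    # A dedicated generator keeps the shuffle reproducible for a given seed.
    rng = random.Random(seed)
    indices = list(range(len(lines)))
    rng.shuffle(indices)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for fold in range(num_folds):
        # Every num_folds-th shuffled index forms this fold's test set.
        test_idx = set(indices[fold::num_folds])
        train = [lines[i] for i in range(len(lines)) if i not in test_idx]
        test = [lines[i] for i in range(len(lines)) if i in test_idx]

        # Hypothetical file names, one train/test pair per fold.
        train_file = out / f"fold{fold + 1}.train.txt"
        test_file = out / f"fold{fold + 1}.test.txt"
        train_file.write_text("\n".join(train) + "\n")
        test_file.write_text("\n".join(test) + "\n")
        written.append((train_file, test_file))
    return written
```

Calling `write_folds("u.data", "folds")` would write five train/test pairs into a `folds` directory. To compare against the published numbers, the files would still need to come from the C# route described above.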


Cheers,
   Z.

 


