reuters dataset

Shiyuan Gu

Jan 4, 2017, 6:18:52 PM
to Keras-users
Hi,
   I am following the example for Reuters newswire topic classification, which uses the dataset http://s3.amazonaws.com/text-datasets/reuters.pkl. In the dataset, the Y values (the topics) are already numerical indices from 1 to 46. Is there anywhere I can find the topic names those indices map to? Thank you!

François Chollet

Jan 4, 2017, 6:27:55 PM
to Shiyuan Gu, Keras-users
Yes, here: https://s3.amazonaws.com/text-datasets/reuters_word_index.pkl

However, note that it may be safer to download the original text data and vectorize it yourself.
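
For illustration only, here is a minimal sketch of what vectorizing the text yourself could look like, so that the topic names are kept alongside the numeric labels. The toy `samples` list and the helper names `build_indices` and `vectorize` are hypothetical, not part of Keras or of the Reuters files; real data would first have to be parsed from the original Reuters text.

```python
from collections import Counter

def build_indices(samples):
    """Build word and topic lookups from a list of (text, topic_name) pairs."""
    word_counts = Counter()
    for text, _ in samples:
        word_counts.update(text.lower().split())
    # Rank words by descending frequency, ties broken alphabetically.
    # Index 0 is left unused so it can serve as padding later.
    ranked = sorted(word_counts, key=lambda w: (-word_counts[w], w))
    word_index = {w: i for i, w in enumerate(ranked, start=1)}
    # Topic names are sorted so the label indices are reproducible.
    topic_names = sorted({topic for _, topic in samples})
    topic_index = {name: i for i, name in enumerate(topic_names)}
    return word_index, topic_index

def vectorize(samples, word_index, topic_index):
    """Turn each text into a list of word indices and each topic into a label."""
    X = [[word_index[w] for w in text.lower().split()] for text, _ in samples]
    y = [topic_index[topic] for _, topic in samples]
    return X, y

# Hypothetical toy data standing in for parsed Reuters articles.
samples = [
    ("wheat exports rise", "grain"),
    ("oil prices rise", "crude"),
    ("wheat harvest falls", "grain"),
]

word_index, topic_index = build_indices(samples)
X, y = vectorize(samples, word_index, topic_index)

# Inverting topic_index recovers the topic name for any numeric label.
index_to_topic = {i: name for name, i in topic_index.items()}
print(y, [index_to_topic[i] for i in y])
```

Because `topic_index` is built at the same time as the labels, the mapping from index back to topic name is never lost.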

--
You received this message because you are subscribed to the Google Groups "Keras-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keras-users+unsubscribe@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/keras-users/5b902f4d-211b-4d12-b24c-0739a1293ffc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Shiyuan Gu

Jan 4, 2017, 11:19:21 PM
to Keras-users, gshy...@gmail.com
Thanks. It looks like reuters_word_index.pkl is the mapping for the words (the X variables), not for the topics (the y variables).
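
A rough sketch of why a word index cannot answer the topic question: it maps words to integers, so inverting it decodes the X sequences back into text but says nothing about the y labels. The small dictionary below is a made-up stand-in for the unpickled reuters_word_index.pkl, whose exact contents are assumed here, and `decode_sequence` is a hypothetical helper. Any index offset applied when the dataset was built is not handled.

```python
def decode_sequence(sequence, word_index, unknown="?"):
    """Map a list of word indices back to a space-separated string."""
    # Invert the word -> index mapping into index -> word.
    index_to_word = {i: w for w, i in word_index.items()}
    return " ".join(index_to_word.get(i, unknown) for i in sequence)

# Hypothetical stand-in for the dictionary loaded from the pickle file.
word_index = {"the": 1, "of": 2, "wheat": 3, "price": 4}

# Index 99 is not in the toy mapping, so it decodes to the placeholder.
decoded = decode_sequence([1, 4, 2, 3, 99], word_index)
print(decoded)
```

The keys of such a dictionary are vocabulary words, so the topic names would have to come from a separate mapping.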

