I have trained a word2vec model on a movie dataset whose training data consists of columns such as star cast, director name and similar features. The text is not free-flowing (it is comma-separated). As a result, the similarity and score functions don't produce satisfactory results, because the generated embeddings are not good enough.
1. Is word2vec the right approach for a problem like this, with a large number of proper nouns and no free-flowing text?
2. If yes, which parameters should be tuned when training on proper nouns?
[['WORDSWORTH', 'HOUSE', '21', 'FAIRFAX', 'ROAD', 'BIRMINGHAM', '', 'GBR'], ['THE', 'HOLLIES', '2', 'FRIESTON', 'ROAD', 'GRANTHAM', 'CAYTHORPE', 'NG32', 'GBR']]
When I use the score function, I get the following results:
model.score([str("WORDSWORTH HOUSE 21 FAIRFAX ROAD").split()])[0]
-30.27762
model.score([str("WORDSWORTH GBR").split()])[0]
-19.615669
What I am trying to achieve is that the score in the first case should be better than in the second, since the first context is present in the training data and the second is not. Is word2vec the appropriate way to go for such a use case? Thanks
Hi Ishaan,
I personally think that distributional models (like word2vec) will not
be of much use in this case. Their power comes from exactly what you are
missing in your dataset: typical word co-occurrences.
You will probably be much better off simply using your 'columns' as
features in classification, clustering or whatever task you have.
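To illustrate the "columns as features" suggestion, here is a minimal sketch that compares records by the overlap of their column=value features (Jaccard similarity). The column names, the helper names record_features and jaccard, and the way the sample addresses are split into columns are all assumptions made for illustration; none of them come from the thread, and a real setup would more likely feed such features into a classifier or clustering algorithm.

```python
def record_features(record):
    """Turn a dict of column -> value into a set of 'column=value' features.

    Empty values are skipped so that missing columns do not count as a match.
    """
    return {f"{col}={val}" for col, val in record.items() if val}


def jaccard(a, b):
    """Jaccard similarity between the feature sets of two records."""
    fa, fb = record_features(a), record_features(b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


# Hypothetical records; the column names are assumed, not taken from the thread.
r1 = {"house": "WORDSWORTH HOUSE", "street": "21 FAIRFAX ROAD",
      "city": "BIRMINGHAM", "country": "GBR"}
r2 = {"house": "THE HOLLIES", "street": "2 FRIESTON ROAD",
      "city": "GRANTHAM", "country": "GBR"}
r3 = dict(r1)  # an exact copy of the first record

print(jaccard(r1, r3))  # identical records share every feature
print(jaccard(r1, r2))  # these two share only the country feature
```

Because each value is tied to its column, a match on 'country=GBR' alone contributes little, while a record that agrees on several columns scores much higher.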
On 03/23/2017 12:26 PM, Ishaan Arora wrote:
> I have trained my word2vec model on a movie dataset with star cast,
> director name and other similar features/columns in the training data
> set. The text is not free flowing (it is comma separated). As a result,
> the SIMILARITY function and SCORE functions don’t produce satisfactory
> results as embedding generated are not up to the mark
>
> 1. Is word2vec the right approach for such a problem with more large
> number of proper nouns and no free flowing text?
>
> 2. If yes, which parameters to tune for training with proper nouns?
>
> --
> You received this message because you are subscribed to the Google
> Groups "gensim" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to gensim+unsubscribe@googlegroups.com
> <mailto:gensim+unsubscribe@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.
--
Solve et coagula!
Andrey
--
You received this message because you are subscribed to a topic in the Google Groups "gensim" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/gensim/DwsDbeYoaD4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gensim+unsubscribe@googlegroups.com.
('org:SERVICES', 'org:LIMITED')
('org:SERVICES', 'city:KINGSTON_HULL')
('org:SERVICES', 'state:ENGLAND')
('org:SERVICES', 'pc:EY1')
('org:SERVICES', 'country:GBR')
('org:LIMITED', 'city:KINGSTON_HULL')
('org:LIMITED', 'state:ENGLAND')
('org:LIMITED', 'pc:EY1')
('org:LIMITED', 'country:GBR')
('city:KINGSTON_HULL', 'state:ENGLAND')
('city:KINGSTON_HULL', 'pc:EY1')
('city:KINGSTON_HULL', 'country:GBR')
('state:ENGLAND', 'pc:EY1')
('state:ENGLAND', 'country:GBR')
('pc:EY1', 'country:GBR')
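The pairs listed above are exactly the unordered 2-combinations of six field-tagged tokens. As a minimal sketch, assuming the record has already been tokenized and each token prefixed with its field name, such a list could be generated like this. The function name field_pairs and the variable names are illustrative, and this is not necessarily how the pairs in the thread were produced.

```python
from itertools import combinations


def field_pairs(tagged_tokens):
    """Return every unordered pair of tokens from one record, in input order."""
    return list(combinations(tagged_tokens, 2))


# Tokens taken from the pairs shown in the thread, each prefixed with its field.
record = [
    "org:SERVICES",
    "org:LIMITED",
    "city:KINGSTON_HULL",
    "state:ENGLAND",
    "pc:EY1",
    "country:GBR",
]

pairs = field_pairs(record)
for pair in pairs:
    print(pair)
```

Six tokens give 6 * 5 / 2 = 15 pairs. Because every token is paired with every other token in the record, the result does not depend on the order of the columns, which may suit comma-separated records better than a sliding context window.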