Sentiment Classification with H2O


Julian Hillebrand

Jun 17, 2015, 10:44:55 AM
to h2os...@googlegroups.com
Is there a (good) example of using h2o for sentiment analysis? 


Best regards
Julian

al...@0xdata.com

Jun 18, 2015, 7:31:45 PM
to h2os...@googlegroups.com
Hi Julian!

Thanks for your note. I do a lot of NLP work using Spark and H2O and have yet to come across an example for sentiment analysis.

However, we do have some scripts available that:
a) Use a TF-IDF weighting scheme for classifying text messages
https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/mlconf_2015_hamSpam.script.scala

b) Use a Word2Vec skip-gram model + GBM for classifying job titles
https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/craigslistJobTitles.scala

Thanks!
Alex

sunilchi...@gmail.com

Sep 7, 2015, 2:36:50 AM
to H2O Open Source Scalable Machine Learning - h2ostream, jul.hil...@gmail.com
On Wednesday, June 17, 2015 at 8:14:55 PM UTC+5:30, Julian Hillebrand wrote:
> Is there a (good) example of using h2o for sentiment analysis? Best regards, Julian

@Alex,

I am unable to access the link you pointed to:

https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/craigslistJobTitles.scala

Did this move elsewhere?


Michal Malohlava

Sep 7, 2015, 1:44:58 PM
to h2os...@googlegroups.com
It is here:
https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/craigslistJobTitles.script.scala

Michal

sunilchi...@gmail.com

Sep 8, 2015, 2:08:30 AM
to H2O Open Source Scalable Machine Learning - h2ostream, mic...@h2oai.com

Thanks Michal, just need additional help here -


val title_vectors = words.map(x => new DenseVector(
  divArray(x.map(m => wordToVector(m, model).toArray).
    reduceLeft(sumArray), x.length)).asInstanceOf[Vector])

I see the piece of code above right after the word2vec model is created and the synonyms are verified. What is this piece of code really doing? Sorry for the naive question, but I am new to Scala.

Michal Malohlava

Sep 8, 2015, 4:09:07 PM
to sunilchi...@gmail.com, H2O Open Source Scalable Machine Learning - h2ostream
On 9/7/15 11:08 PM, sunilchi...@gmail.com wrote:
> On Monday, September 7, 2015 at 11:14:58 PM UTC+5:30, Michal Malohlava wrote:
>> It is here:
>> https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/craigslistJobTitles.script.scala
>>
>> Michal
> Thanks Michal, just need additional help here -
>
>
> val title_vectors = words.map(x => new DenseVector(
> divArray(x.map(m => wordToVector(m, model).toArray).
> reduceLeft(sumArray),x.length)).asInstanceOf[Vector])
For each row, which is represented by an array of tokens (["development", "job"]), it applies the
existing word2vec model to each word (x.map(m => wordToVector(m, model).toArray)), sums the
resulting vectors element-wise (reduceLeft(sumArray)), and divides the sum by the number of
tokens (divArray(..., x.length)). The result is a single averaged vector representing the row.

["development", "job"] => [ [0, 1, 0.3, ...], [0.89, 0.8, 0, ...] ] => [0.445, 0.9, 0.15, ...]

Does it make sense?
michal
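
The averaging step described above can be tried without Spark or H2O. The following is a minimal sketch, not the code from the Sparkling Water script: `TitleVectors`, `toyModel`, and `titleVector` are hypothetical names, the "model" is a plain lookup table standing in for a trained word2vec model, and `sumArray` and `divArray` are re-implemented here on the assumption that they do an element-wise sum and a division by a scalar.

```scala
object TitleVectors {
  // Hypothetical stand-in for a trained word2vec model: word -> vector.
  val toyModel: Map[String, Array[Double]] = Map(
    "development" -> Array(0.0, 1.0, 0.3),
    "job"         -> Array(0.89, 0.8, 0.0)
  )

  // Element-wise sum of two equally sized vectors (does not mutate its inputs).
  def sumArray(a: Array[Double], b: Array[Double]): Array[Double] =
    a.zip(b).map { case (x, y) => x + y }

  // Divide every component of a vector by a scalar.
  def divArray(a: Array[Double], divisor: Double): Array[Double] =
    a.map(_ / divisor)

  // Average the vectors of all tokens that the model knows.
  // Returns None when no token is known, so reduceLeft never runs on an empty collection.
  def titleVector(tokens: Seq[String],
                  model: Map[String, Array[Double]]): Option[Array[Double]] = {
    val vectors = tokens.flatMap(model.get)
    if (vectors.isEmpty) None
    else Some(divArray(vectors.reduceLeft(sumArray), vectors.length))
  }

  def main(args: Array[String]): Unit = {
    val v = titleVector(Seq("development", "job"), toyModel)
    println(v.map(_.mkString("[", ", ", "]")).getOrElse("no known tokens"))
  }
}
```

One design difference to keep in mind: this sketch skips unknown tokens and divides by the number of known ones, whereas the snippet quoted in the thread divides by `x.length`, the full token count of the row.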