I tried out Word2Vec with Apache Spark, using the first Harry Potter book as the corpus.
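For readers who want to try something similar, here is a minimal sketch of what such an experiment could look like with PySpark's `pyspark.ml.feature.Word2Vec`. It is not the code behind this post: the function names, the corpus path, the whitespace tokenization, and the `vectorSize` and `minCount` values are all assumptions chosen for illustration. Splitting on whitespace alone would leave punctuation attached to words, which may be why tokens like `"and` appear in the results.

```python
def to_sentences(text):
    """Split raw text into token lists, one per non-empty line.

    Tokenization is plain whitespace splitting (an assumption), so
    punctuation stays attached to words, e.g. '"and'.
    """
    return [line.split() for line in text.splitlines() if line.strip()]


def top_similar(corpus_path, word, k=3):
    """Train Word2Vec on a text file and return the k words closest to `word`.

    Hypothetical helper: requires pyspark, imported lazily so that
    to_sentences() can be used without Spark installed.
    """
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import Word2Vec

    spark = SparkSession.builder.appName("hp-word2vec").getOrCreate()

    with open(corpus_path, encoding="utf-8") as f:
        sentences = to_sentences(f.read())

    # One row per sentence; the single column holds an array of tokens.
    df = spark.createDataFrame([(s,) for s in sentences], ["tokens"])

    # vectorSize and minCount are illustrative values, not tuned ones.
    word2vec = Word2Vec(vectorSize=100, minCount=5,
                        inputCol="tokens", outputCol="vector")
    model = word2vec.fit(df)

    # findSynonyms returns a DataFrame with "word" and "similarity" columns.
    rows = model.findSynonyms(word, k).collect()
    spark.stop()
    return [(row["word"], row["similarity"]) for row in rows]
```

Calling `top_similar("harry_potter_1.txt", "Ron")` (with a hypothetical file name) would return a list of `(word, similarity)` pairs. Exact scores will vary with the corpus text, tokenization, and parameters.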
Some interesting results:
Similarities for "Ron"
Hermione 0.8892348408699036
watch 0.8258942365646362
"and 0.7972607016563416
Similarities for "Hermione"
Ron 0.9096277952194214
"and 0.8301450610160828
Hooch 0.829563319683075
Ron and Hermione end up getting married in the last book.
Similarities for "Voldemort"
Quirrell's 0.9311719536781311
laughing 0.9307642579078674
Similarities for "Harry"
Hermione 0.7319877743721008
George 0.7252205014228821
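Assuming the scores above come from Spark's `findSynonyms`, they are cosine similarities between word vectors: values close to 1 mean the two vectors point in nearly the same direction. The sketch below shows the calculation in plain Python. The tiny two-dimensional vectors are made up for illustration and are not taken from the trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, k=3):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up toy vectors, only to show the ranking mechanics.
toy_vectors = {
    "Ron": [1.0, 0.0],
    "Hermione": [0.9, 0.1],
    "Voldemort": [0.0, 1.0],
}
ranking = most_similar("Ron", toy_vectors, k=2)
```

With these toy vectors, "Hermione" ranks above "Voldemort" for "Ron" because its vector points in almost the same direction.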
Harry Potter Corpus
API Reference
So what is word2vec? It is a shallow neural network model. In short, it tries to predict a word from its surrounding context words, or, vice versa, the surrounding context words from a given word.
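These two prediction directions are commonly called CBOW (context predicts word) and skip-gram (word predicts context). As a rough sketch of what the training examples look like in each case, the snippet below builds them from a token list. The function names and the window size are assumptions for illustration, and real implementations add details such as subsampling and negative sampling that are omitted here.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs: the center word predicts each neighbor."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Build (context_words, center) examples: the neighbors predict the center."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, center))
    return examples


sentence = ["Ron", "and", "Hermione"]
pairs = skipgram_pairs(sentence, window=1)
examples = cbow_examples(sentence, window=1)
```

Words that keep appearing in the same contexts, such as "Ron" and "Hermione", end up in many overlapping training examples, which is why their learned vectors tend to be close.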