I have been reading your paper with great interest. I am currently trying to reproduce your results, but I am running into some reproducibility issues.
First I checked the SPINE_word2vec.text embeddings from https://drive.google.com/drive/folders/1ksVcWDADmnp0Cl5kezjHqTg3Jnh8q031; they do indeed seem to give proper results:
0 1 2 3 4
0 cellular answered browser letters transmitter
1 subscriber answering app messages radios
2 verizon reply downloads email amplifier
3 broadband answer iphone letter antenna
4 subscribers replies download mail handheld
5 telecom queries tablet correspondence sensors
6 phone respond downloaded mailing sensor
7 telecommunications answers mobile message infrared
8 phones responds phones spam gps
9 dial fielded desktop sms phones
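For reference, the tables above were produced with a check along these lines (a minimal sketch, assuming a word2vec-style text file; the `dims` and `k` values shown are illustrative):

```python
def top_words_per_dimension(vec_path, dims, k=10):
    """Print the k words with the largest value in each given dimension.

    Assumes a word2vec-style text file: one word per line, followed by
    its vector components, all whitespace-separated.
    """
    words, vectors = [], []
    with open(vec_path) as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 3:  # skip a possible "vocab_size dim" header line
                continue
            words.append(parts[0])
            vectors.append([float(x) for x in parts[1:]])
    for d in dims:
        # indices of the k words with the highest activation in dimension d
        top = sorted(range(len(words)), key=lambda i: -vectors[i][d])[:k]
        print(d, [words[i] for i in top])

# e.g. top_words_per_dimension("SPINE_word2vec.text", dims=range(5))
```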
Subsequently, I trained my own embeddings using the following parameters:
python3 main.py \
    --input ../../word2vec_original_15k_300d_train.txt \
    --num_epochs 4000 \
    --denoising \
    --noise 0.20 \
    --sparsity 0.85 \
    --output ./test.vec \
    --hdim 1000
After running this script for 4000 epochs, the same check gives the following result:
0 1 2 3 4
0 beetles stabbed oversees sealing deco
1 wetlands cried bowler horizontally imprint
2 rainforest befriended ed sequel trajectory
3 meadow chased challenged kane flu
4 organisms danced fired challenged stockholm
5 fungus avenge vowel fired fired
6 rainfall starred en vowel vowel
7 hymns distraught domain en en
8 lily unbeaten el domain domain
9 larvae medalist unloaded el el
Am I missing something?
Thank you in advance for your response!