This blog post serves as my final submission for Google Summer of Code 2016.
My task was to implement a Pythonic Dynamic Topic Model for gensim and RaRe Technologies. I was very ably guided by my mentors Lev and Radim, and I am happy that my code has been merged into gensim via PR #739. You can follow the tutorial to understand the theory behind Dynamic Topic Models and use the code here: notebook tutorial link.
My collaboration with Gensim has not been limited to this PR: I also enjoyed fixing bugs and adding features throughout the summer (all my PRs can be viewed here).
There is still a lot of scope for taking this project further, particularly by reducing memory usage, improving speed and expanding the documentation. Adding a Document Influence Model mode and making the model distributed are other aims.
I have been addressing this via a new PR, #831. It will attempt to use a fixed memory footprint, make the code more ‘pythonic’ through style changes, and improve speed by further vectorising the code. I will also update the tutorial notebook to explain how you can further play around with your Dynamic Topic Model. This should be up soon, and will only make the already merged code better. 🙂
Participating in GSoC has been a great and humbling learning experience. I not only learned many new skills and coding practices, but also realised that there is so much more to learn and do in the fields of open source Machine Learning and Data Science. I hope to keep contributing to Gensim and the open source community!