Gensim
1–30 of 3738
Welcome to the mailing list of Gensim, topic modelling for humans. Please read the FAQ before asking. Supporting Gensim helps us support you: https://github.com/sponsors/piskvorky
Alexey Shkarupin, …, Megan Rogers | 3 | Mar 7
Loading fasttext model from S3
Hi there I am also having this issue when trying to load a model from S3 with the same error. The
Joseph Emmens, Gordon Mohr | 3 | Feb 15
Serialized author-topic-model incorrect document count
That's exactly right, that's my fault for copying the example on the gensim atm documents
santosh.b...@gmail.com, Andrey Kutuzov | 3 | Feb 8
How to incorporate timestamp into word embeddings?
Thank you so much, Andrey. I will peruse the articles you shared - beginning with your co-authored
squaaad yang | 12/11/23
How to implement NMF dynamic topic modeling using gensim?
Hi everyone, Recently, I have been doing a dynamic topic modeling project using NMF. I thought the
Jeff Winchell, Gordon Mohr | 8 | 11/24/23
Streamed Restartable Iterables - Details (for corpus streaming)
I suspect if you were to test your conjecture that "Using a file system vs the fastest dbms is
Sargentini, Thierry; Gordon Mohr | 2 | 11/24/23
Python 3.9
AFAIK, Gensim builds & passes its test suite on Python-[3.8, 3.9, 3.10, 3.11] for [Ubuntu, MacOS,
Andy Weasley, Gordon Mohr | 8 | 11/23/23
Errors When Installing Version 3.8.3
Thanks for the update, but keep in mind that given the rather blatant failure-to-do-what-was-intended
Nathan Cassee | 9/29/23
[Request] Experiment on Technical Debt prioritization
We're an international team of academic Software Engineering researchers investigating technical
Danilo Tomasoni, Gordon Mohr | 3 | 8/28/23
Hard limit on vocab size?
Glad it's sorted. If you *did* want to cap the number of words loaded, you can supply a `limit`
Jaden Rodriguez, Gordon Mohr | 2 | 8/23/23
Fix Proposals and Troubles with Source
Without more details, unsure what specific source errors you're having. A general guide to
Felix Goldberg, Gordon Mohr | 2 | 8/22/23
Noob question - how to train a doc2vec model using a built-in corpus?
The Gensim project source code (https://github.com/RaRe-Technologies/gensim/) contains in its `docs/
Jonathan Peters | 8/1/23
Negative log_perplexity
Hello, I created an LDA model from control data and I am trying to calculate the perplexity of my
Jeff Winchell, Gordon Mohr | 2 | 7/19/23
Need tokenizer/preprocessor for popular pretrained embeddings models
I made a feature-request item in our issue-tracker for this – https://github.com/RaRe-Technologies/
pradeep t, Gordon Mohr | 5 | 7/7/23
Add custom words to GoogleNews-vectors-negative300.bin pretrained model
Thank you so much for the updates On Fri, Jul 7, 2023 at 2:52 AM Gordon Mohr <goj...@gmail.com
Danilo Tomasoni, Gordon Mohr | 5 | 7/4/23
Load of FastText binary format with mmap='r'
thank you very much!! it works! On Friday, June 30, 2023 at 19:59:27 UTC+2 Gordon Mohr
Thanos Tasakos, Gordon Mohr | 5 | 6/30/23
Gensim KeyedVector load from s3
What a legend! I also needed to monkey-patch the numpyio module, to use smart_open instead of open,
pradeep t, Gordon Mohr | 2 | 6/29/23
Pretrained model for doc2vec
I don't know of any I'd recommend, & that work with recent Gensim versions. (When I'
Laura, Gordon Mohr | 2 | 6/27/23
Doc2vec with small corpus
That approach seems within the realm of reason - but ultimately whether it's better for your
Peter Mayhew, Gordon Mohr | 11 | 6/13/23
Saving Wikidump corpus into Memory map
Note that even training the exact same corpus twice won't result in the *same* vectors.
jeff yang, Gordon Mohr | 4 | 5/31/23
Is there anyway to adjust the weight of the node?
I'm not really sure why one would want to "reduce the density around a node". Do you
TRIXIA MAY BELGA | 5/29/23
LDA topics for Clustering
My goal is to cluster the resulting LDA topics to reduce dimensionality. However I am not sure what
Yan Xu, Gordon Mohr | 4 | 5/18/23
Add the similarity threshold to gensim.models.keyedvectors.KeyedVectors.most_similar
That's a good point, given the extra memory required to return the list-of-(word, score) tuples.
Fred R, Gordon Mohr | 2 | 5/9/23
How to get context words in gensim word2vec models
Can you clarify with a bit more detail what you mean by "context words"? I ask because once
nicolas valderrama, Gordon Mohr | 3 | 4/25/23
"Lazily" add documents to TfIdf
Oh, we didn't know this was possible. I'm glad I asked here before making any change. Thanks a
Gabriel L, …, Gordon Mohr | 12 | 4/20/23
Implementation of Correlated Topic Model
I can understand why you might prefer techniques that exist over those that are purely imaginary,
Danilo Tomasoni, Gordon Mohr | 16 | 4/12/23
Very different performances if streaming data or reading data from disk
On Wednesday, April 12, 2023 at 5:36:04 AM UTC-7 danilot.l...@gmail.com wrote: Performance in my
Tedo Vrbanec, …, Benedict Holland | 8 | 3/20/23
Doc2Vec loss function
For me, dynamic stopping is what I am looking for. As for the reviewer, I am not sure. :) Dana
Oliver Gordon, Gordon Mohr | 2 | 3/15/23
GPL being violated
Note: Gensim is licensed under the "Lesser" GPL (aka "LGPL" https://www.gnu.org/
Tedo Vrbanec, Gordon Mohr | 2 | 3/9/23
GloVe native support in Gensim?
Other than the current ability to load GloVe vectors, GloVe-style training hasn't been planned (
日出間健本社総合企画部 | 3/3/23
Why does increasing the number of topics increase perplexity?
I'm trying to calculate the perplexity using the LDA model log_perplexity. The official