Gensim
1–30 of 3742
Welcome to the mailing list of Gensim, topic modelling for humans. Please read the FAQ before asking. Supporting Gensim helps us support you: https://github.com/sponsors/piskvorky
Roy Becker, Gordon Mohr | 2 | Apr 10
noob question: gensim.downloader.load keeps getting stuck
Gensim is just using a basic `urllib` HTTP request here. If you consistently get this same failure –
Daniel, Gordon Mohr | 2 | Apr 10
cannot import name 'triu' from 'scipy.linalg'
Scipy recently removed these functions after a fairly brief (less than 1 year) 'deprecation'
Ferhat Arslan, Gordon Mohr | 3 | Apr 10
Unexpected performance decrease when shared objects are locally compiled
Thanks for the update & confirmation of a possible workaround. If by chance you're on a Linux
Alexey Shkarupin, … Gordon Mohr | 4 | Apr 1
Loading fasttext model from S3
I believe the docs are wrong to suggest that smart_open's S3 support is enough for this operation
Tomáš Holler, Gordon Mohr | 2 | Mar 28
Loading WikiCorpus
I haven't tested this, but have you tried specifying a no-op `tokenizer_func`, fitting the
Joseph Emmens, Gordon Mohr | 3 | Feb 15
Serialized author-topic-model incorrect document count
That's exactly right, that's my fault for copying the example on the gensim atm documents
santosh.b...@gmail.com, Andrey Kutuzov | 3 | Feb 8
How to incorporate timestamp into word embeddings?
Thank you so much, Andrey. I will peruse the articles you shared - beginning with your co-authored
squaaad yang | 12/11/23
How to implement NMF dynamic topic modeling using gensim?
Hi everyone, Recently, I have been doing a dynamic topic modeling project using NMF. I thought the
Jeff Winchell, Gordon Mohr | 8 | 11/24/23
Streamed Restartable Iterables - Details (for corpus streaming)
I suspect if you were to test your conjecture that "Using a file system vs the fastest dbms is
Sargentini, Thierry; Gordon Mohr | 2 | 11/24/23
Python 3.9
AFAIK, Gensim builds & passes its test suite on Python-[3.8, 3.9, 3.10, 3.11] for [Ubuntu, MacOS,
Andy Weasley, Gordon Mohr | 8 | 11/23/23
Errors When Installing Version 3.8.3
Thanks for the update, but keep in mind that given the rather blatant failure-to-do-what-was-intended
Nathan Cassee | 9/29/23
[Request] Experiment on Technical Debt prioritization
We're an international team of academic Software Engineering researchers investigating technical
Danilo Tomasoni, Gordon Mohr | 3 | 8/28/23
Hard limit on vocab size?
Glad it's sorted. If you *did* want to cap the number of words loaded, you can supply a `limit`
Jaden Rodriguez, Gordon Mohr | 2 | 8/23/23
Fix Proposals and Troubles with Source
Without more details, unsure what specific source errors you're having. A general guide to
Felix Goldberg, Gordon Mohr | 2 | 8/22/23
Noob question - how to train a doc2vec model using a built-in corpus?
The Gensim project source code (https://github.com/RaRe-Technologies/gensim/) contains in its `docs/
Jonathan Peters | 8/1/23
Negative log_perplexity
Hello, I created an LDA model from control data and I am trying to calculate the perplexity of my
Jeff Winchell, Gordon Mohr | 2 | 7/19/23
Need tokenizer/preprocessor for popular pretrained embeddings models
I made a feature-request item in our issue-tracker for this – https://github.com/RaRe-Technologies/
pradeep t, Gordon Mohr | 5 | 7/7/23
Add custom words to GoogleNews-vectors-negative300.bin pretrained model
Thank you so much for the updates On Fri, Jul 7, 2023 at 2:52 AM Gordon Mohr <goj...@gmail.com
Danilo Tomasoni, Gordon Mohr | 5 | 7/4/23
Load of FastText binary format with mmap='r'
thank you very much!! it works! On Friday, June 30, 2023 at 19:59:27 UTC+2 Gordon Mohr
Thanos Tasakos, Gordon Mohr | 5 | 6/30/23
Gensim KeyedVector load from s3
What a legend! I needed to also monkey-patch the numpyio module, to use smart_open instead of open,
pradeep t, Gordon Mohr | 2 | 6/29/23
Pretrained model for doc2vec
I don't know of any I'd recommend, & that work with recent Gensim versions. (When I'
Laura, Gordon Mohr | 2 | 6/27/23
Doc2vec with small corpus
That approach seems within the realm of reason - but ultimately whether it's better for your
Peter Mayhew, Gordon Mohr | 11 | 6/13/23
Saving Wikidump corpus into Memory map
Note that even training the exact same corpus twice won't result in the *same* vectors.
jeff yang, Gordon Mohr | 4 | 5/31/23
Is there anyway to adjust the weight of the node?
I'm not really sure why one would want to "reduce the density around a node". Do you
TRIXIA MAY BELGA | 5/29/23
LDA topics for Clustering
My goal is to cluster the resulting LDA topics to reduce dimensionality. However I am not sure what
Yan Xu, Gordon Mohr | 4 | 5/18/23
Add the similarity threshold to gensim.models.keyedvectors.KeyedVectors.most_similar
That's a good point, given the extra memory required to return the list-of-(word, score) tuples.
Fred R, Gordon Mohr | 2 | 5/9/23
How to get context words in gensim word2vec models
Can you clarify with a bit more detail what you mean by "context words"? I ask because once
nicolas valderrama, Gordon Mohr | 3 | 4/25/23
"Lazily" add documents to TfIdf
Oh we didn't know this was possible. I'm glad I asked here before doing any change. Thanks a
Gabriel L, … Gordon Mohr | 12 | 4/20/23
Implementation of Correlated Topic Model
I can understand why you might prefer techniques that exist over those that are purely imaginary,
Danilo Tomasoni, Gordon Mohr | 16 | 4/12/23
Very different performances if streaming data or reading data from disk
On Wednesday, April 12, 2023 at 5:36:04 AM UTC-7 danilot.l...@gmail.com wrote: Performance in my