Highest scored 'fasttext' questions

25 votes

2 answers

15k views

Difference between Fasttext .vec and .bin file

I recently downloaded the fastText pretrained model for English and got two files: wiki.en.vec and wiki.en.bin. What is the difference between the two files?

Bhushan Pant

1,590

asked Nov 5, 2017 at 6:03

21 votes

3 answers

21k views

fastText embeddings sentence vectors?

I wanted to understand the way fastText vectors for sentences are created. According to this issue 309, the vectors for sentences are obtained by averaging the vectors for words. In order to confirm ...

ryuzakinho

1,919

asked Jan 14, 2019 at 12:01
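The averaging mentioned in this excerpt can be sketched with plain Python lists. The toy vectors and the helper name `average_vectors` are assumptions made up for illustration and are not part of the fastText API; the library itself may also normalize each word vector before averaging, which this sketch does not do.

```python
def average_vectors(vectors):
    """Return the element-wise mean of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Toy 3-dimensional "word vectors" (invented for this example).
toy_vectors = {
    "hello": [1.0, 0.0, 2.0],
    "world": [3.0, 2.0, 0.0],
}

sentence = "hello world"
sentence_vector = average_vectors([toy_vectors[w] for w in sentence.split()])
print(sentence_vector)
```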

15 votes

6 answers

28k views

How to find similar words with FastText?

I am playing around with FastText, https://pypi.python.org/pypi/fasttext, which is quite similar to Word2Vec. Since it seems to be a pretty new library with not too many built-in functions yet, I was ...

Isbister

956

asked Feb 13, 2017 at 14:33

14 votes

1 answer

12k views

How does the Gensim Fasttext pre-trained model get vectors for out-of-vocabulary words?

I am using gensim to load a pre-trained fastText model. I downloaded the English Wikipedia trained model from the fastText website. Here is the code I wrote to load the pre-trained model: from gensim....

Baktaawar

7,580

asked Jun 13, 2018 at 2:33

13 votes

2 answers

14k views

FastText using pre-trained word vector for text classification

I am working on a text classification problem, that is, given some text, I need to assign to it certain given labels. I have tried using fast-text library by Facebook, which has two utilities of ...

JarvisIA

153

asked Dec 7, 2017 at 10:28

12 votes

3 answers

9k views

How to save fasttext model in vec format?

I trained my unsupervised model using fasttext.train_unsupervised() function in python. I want to save it as vec file since I will use this file for pretrainedVectors parameter in fasttext....

esin ildiz

121

asked Oct 11, 2019 at 8:46

12 votes

2 answers

17k views

ModuleNotFoundError: No module named 'fasttext'

I have tried installing fasttext through conda using two channels: conda install -c conda-forge fasttext and conda install -c conda-forge/label/cf201901 fasttext as per (https://anaconda.org/conda-...

Ashwin Geet D'Sa

7,458

asked Oct 1, 2019 at 14:18

10 votes

3 answers

8k views

Continue training a FastText model

I have downloaded a .bin FastText model, and I use it with gensim as follows: model = FastText.load_fasttext_format("cc.fr.300.bin") I would like to continue the training of the model to adapt it to ...

ted

14.9k

asked Aug 29, 2018 at 14:47

10 votes

1 answer

5k views

Use Tensorflow and pre-trained FastText to get embeddings of unseen words

I am using a pre-trained fastText model (https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md). I use Gensim to load the fastText model. It can output a vector for any words,...

Munichong

4,041

asked Oct 30, 2017 at 19:08

9 votes

3 answers

13k views

How to vectorize whole text using fasttext?

To get vector of a word, I can use: model["word"] but if I want to get the vector of a sentence, I need to either sum vectors of all words or get average of all vectors. Does FastText provide a ...

Andrey

663

asked Apr 17, 2017 at 16:06

9 votes

5 answers

6k views

FastText - Cannot load model.bin due to C++ extension failed to allocate the memory

I'm trying to use the FastText Python API https://pypi.python.org/pypi/fasttext Although, from what I've read, this API can't load the newer .bin model files at https://github.com/facebookresearch/...

Filipe Aleixo

4,292

asked Aug 28, 2017 at 16:13

9 votes

3 answers

3k views

FastText quantize documentation incorrect?

I'm unable to run FastText quantization as shown in the documentation. Specifically, as shown at the bottom of the cheat sheet page: https://fasttext.cc/docs/en/cheatsheet.html When I attempt to ...

J-Dinerstein

91

asked Sep 20, 2018 at 15:30

9 votes

0 answers

1k views

Gensim FastText compute Training Loss

I am training a fastText model using gensim.models.fasttext. However, I can't seem to find a method to compute the loss of the iteration for logging purposes. If I look at gensim.models.word2vec, it ...

Hardian Lawi

616

asked Jun 1, 2018 at 12:13

8 votes

4 answers

21k views

Unable to install fastText for python on windows.

I am unable to install fasttext for Python on Windows. I followed the methods mentioned in this issue. When I enter python setup.py install, I get the following error: error: command 'C:\\Program ...

humble

2,208

asked Aug 4, 2018 at 6:13

8 votes

1 answer

7k views

Can't suppress fasttext warning: 'load_model' does not return [...]

I'm struggling to suppress a specific warning related to fasttext. The warning is Warning : 'load_model' does not return WordVectorModel or SupervisedModel any more, but a 'FastText' object which is ...

Ian

3,968

asked Feb 24, 2021 at 14:56

8 votes

2 answers

4k views

Fasttext algorithm use only word and subword? or sentences too?

I read the paper and also searched for a good example of the learning method (or, more precisely, the learning procedure). For word2vec, suppose there is corpus sentence I go to school with lunch ...

Isaac Sim

581

asked Apr 13, 2018 at 7:22

8 votes

1 answer

5k views

gensim - fasttext - Why `load_facebook_vectors` doesn't work?

I've tried to load pre-trained FastText vectors from fastext - wiki word vectors. My code is below, and it works well. from gensim.models import FastText model = FastText.load_fasttext_format('./...

frhyme

1,036

asked May 28, 2020 at 7:27

8 votes

3 answers

4k views

Use tf-idf with FastText vectors

I'm interested in using tf-idf with the FastText library, but have not found a logical way to handle the ngrams. I have already used tf-idf with SpaCy vectors, for which I have found several examples like these ...

Luis Ramon Ramirez Rodriguez

10.6k

asked Sep 23, 2019 at 20:28

7 votes

3 answers

20k views

ERROR: Could not build wheels for fasttext, which is required to install pyproject.toml-based projects

I'm trying to install fasttext using pip install fasttext in python 3.11.4 but I'm running into trouble when building wheels. The error reads as follows: error: command 'C:\\Program Files (x86)\\...

Parseval

903

asked Oct 11, 2023 at 13:23

7 votes

1 answer

8k views

SPACY - Confusion about word vectors and tok2vec

it would be really helpful for me if you would help me understand some underlying concepts about Spacy. I understand some spacy models have some predefined static vectors, for example, for the Spanish ...

BaldML

153

asked Oct 7, 2020 at 23:18

7 votes

2 answers

3k views

How to handle unbalanced label data using FastText?

In FastText, I have unbalanced labels. What is the best way to handle it?

Gil Lev

121

asked Jun 10, 2018 at 8:02

6 votes

1 answer

3k views

Unable to recreate Gensim docs for training FastText. TypeError: Either one of corpus_file or corpus_iterable value must be provided

I am trying to make my own Fasttext embeddings so I went to official Gensim documentation and implemented this exact code below with exact 4.0 version. from gensim.models import FastText from gensim....

Deshwal

4,292

asked May 17, 2021 at 16:13

6 votes

1 answer

4k views

Proper way to add new vectors for OOV words

I'm using some domain-specific language which has a lot of OOV words as well as some typos. I have noticed Spacy will just assign an all-zero vector for these OOV words, so I'm wondering what's the ...

BaldML

153

asked Jul 28, 2020 at 23:28

6 votes

1 answer

39k views

Process finished with exit code -1073740791 (0xC0000409) pycharm error

I am trying to use fastText with PyCharm. Whenever I run below code: import fastText model=fastText.train_unsupervised("data_parsed.txt") model.save_model("model") The process exits with this error:...

user9857589

61

asked May 28, 2018 at 8:33

6 votes

2 answers

3k views

Error when loading FastText's french pre-trained model with gensim

I am trying to use the FastText's french pre-trained binary model (downloaded from the official FastText's github page). I need the .bin model and not the .vec word-vectors so as to approximate ...

Clara-sininen

201

asked Jul 23, 2018 at 14:43

6 votes

2 answers

8k views

Latest Pre-trained Multilingual Word Embedding

Are there any latest pre-trained multilingual word embeddings (multiple languages are jointly mapped to a same vector space)? I have looked at the following but they don't fit my needs: FastText / ...

MachineLearner

423

asked Jun 15, 2020 at 9:13

6 votes

2 answers

6k views

fasttext cannot load training txt file

I am trying to train a fasttext classifier in windows using fasttext python package. I have a utf8 file with lines like __label__type1 sample sentence 1 __label__type2 sample sentence 2 ...

tahsintahsin

1,054

asked Jun 18, 2018 at 9:37
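The `__label__` line format shown in this excerpt can be produced with a small helper. This is a minimal sketch: the function name `to_fasttext_line` and the sample data are hypothetical, and it only builds the strings; writing them to a UTF-8 file and training are left out.

```python
def to_fasttext_line(label, text):
    """Format one training example as '__label__<label> <text>' on a single line."""
    # Collapse any internal line breaks or repeated spaces into single spaces.
    single_line = " ".join(text.split())
    return f"__label__{label} {single_line}"

# Sample (label, text) pairs, invented for this example.
examples = [("type1", "sample sentence 1"), ("type2", "sample sentence 2")]
lines = [to_fasttext_line(label, text) for label, text in examples]
print("\n".join(lines))
```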

6 votes

2 answers

2k views

How to save fasttext model in binary and text formats?

The documentation is a bit unclear how to save the fasttext model to disk - how do you specify a path in the argument, I tried doing so and it failed with an error Example in documentation >>&...

erotavlas

4,575

asked Aug 30, 2019 at 16:57

5 votes

2 answers

10k views

How to use pre-trained word vectors in FastText?

I've just started to use FastText. I'm doing a cross-validation of a small dataset, using the .csv file of my dataset as input. To process the dataset I'm using these parameters: model = fasttext....

Pelide

528

asked Nov 5, 2020 at 23:25

5 votes

1 answer

2k views

What is the difference between syntactic analogy and semantic analogy?

At 15:10 of this video about fastText it mentions syntactic analogy and semantic analogy. But I am not sure what the difference is between them. Could anybody help explain the difference with ...

user1424739

14.1k

asked Jan 20, 2018 at 12:58

5 votes

1 answer

4k views

fasttext error: predict processes one line at a time (remove '\n')

I have a dataframe column that contains text. I want to use a fasttext model to make predictions from it. I can achieve this by passing an array of text to fasttext model. import fasttext d = {'id':[1, 2, ...

Osca

1,744

asked Jan 21, 2021 at 3:55
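The error in this title suggests that each input string must be a single line. A minimal sketch of one way to prepare the texts follows; the helper name `to_single_line` and the sample strings are assumptions, and no fastText call is made here.

```python
def to_single_line(text):
    """Replace line breaks with spaces so the text occupies one line."""
    return " ".join(text.splitlines()).strip()

# Sample inputs, invented for this example.
texts = ["first line\nsecond line", "already one line\n"]
cleaned = [to_single_line(t) for t in texts]
print(cleaned)
# A loaded model (assumed, not shown) could then be called on each cleaned string.
```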

5 votes

1 answer

2k views

FastText 0.9.2 - why is recall 'nan'?

I trained a supervised model in FastText using the Python interface and I'm getting weird results for precision and recall. First, I trained a model: model = fasttext.train_supervised("train.txt&...

abstrakkt

116

asked May 14, 2020 at 0:21

5 votes

2 answers

6k views

How to export a fasttext model created by gensim, to a binary file?

I'm trying to export the fasttext model created by gensim to a binary file. But the docs are unclear about how to achieve this. What I've done so far: model.wv.save_word2vec_format('model.bin') But ...

Farhood ET

1,573

asked Nov 15, 2019 at 12:01

5 votes

2 answers

4k views

precision and recall in fastText?

I implement the fastText for text classification, link https://github.com/facebookresearch/fastText/blob/master/tutorials/supervised-learning.md I was wondering what's the precision@1, or P@5 means? I ...

HAO CHEN

1,349

asked Sep 9, 2017 at 10:54
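One common reading of precision@k and recall@k for multi-label classification can be sketched as follows. This is an illustration under assumptions, not a statement of how fastText computes its reported numbers: the function name and the toy data are hypothetical.

```python
def precision_recall_at_k(predictions, gold, k):
    """predictions: ranked label lists per example; gold: sets of true labels."""
    correct = 0    # predicted labels in the top k that are true labels
    predicted = 0  # total number of labels predicted (at most k per example)
    relevant = 0   # total number of true labels
    for ranked, truth in zip(predictions, gold):
        top_k = ranked[:k]
        correct += sum(1 for label in top_k if label in truth)
        predicted += len(top_k)
        relevant += len(truth)
    return correct / predicted, correct / relevant

# Toy ranked predictions and gold label sets (invented for this example).
predictions = [["a", "b", "c"], ["b", "a", "c"]]
gold = [{"a"}, {"a", "c"}]

p1, r1 = precision_recall_at_k(predictions, gold, k=1)
p2, r2 = precision_recall_at_k(predictions, gold, k=2)
print(p1, r1, p2, r2)
```

Raising k here leaves precision unchanged and raises recall, because more true labels are covered while more labels are predicted.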

5 votes

1 answer

723 views

Gensim fasttext cannot get latest training loss

Problem description It seems that the get_latest_training_loss function in fasttext returns only 0. Both gensim 4.1.0 and 4.0.0 do not work. from gensim.models.callbacks import CallbackAny2Vec from ...

Jinhua Wang

1,779

asked Sep 10, 2021 at 4:02

5 votes

1 answer

7k views

How to get nearest neighbours in fasttext for unsupervised learning models (cbow, skipgram)?

The examples (related to word representations) on fasttext official web site (fasttext.cc) suggest that it is possible to calculate the nearest neighbors on vectors derived with cbow (or skip-gram ...

IsidoraG

51

asked Sep 12, 2019 at 9:38

5 votes

1 answer

806 views

How can I maintain a temporary dictionary in a pyspark application?

I want to use a pretrained embedding model (fasttext) in a pyspark application. If I broadcast the file (.bin), the following exception is thrown: Traceback (most recent call last): cPickle....

bib

1,060

asked Jan 28, 2019 at 5:16

5 votes

0 answers

706 views

Incorporate fasttext vectors in tf.keras embedding layer?

Fasttext could handle OOV easily, i.e., it could be assumed that emb = fasttext_model(raw_input) always holds. However, I am not sure how I could build this layer into tf.keras embedding. I couldn't ...

Mr.cysl

1,684

asked Oct 5, 2020 at 8:03

4 votes

1 answer

4k views

Handling C++ arrays in Cython (with numpy and pytorch)

I am trying to use cython to wrap a C++ library (fastText, if its relevant). The C++ library classes load a very large array from disk. My wrapper instantiates a class from the C++ library to load the ...

Bob

1,472

asked Jun 6, 2017 at 17:57

4 votes

1 answer

1k views

What is the difference between args wordNgrams, minn and maxn in fastText supervised learning?

I'm a little confused after reading Bag of tricks for efficient text classification. What is the difference between args wordNgrams, minn and maxn? For example, a text classification task and Glove ...

Rajan Raju

123

asked Jul 10, 2020 at 6:18
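The distinction asked about here can be illustrated with two small helpers: character n-grams (controlled by minn and maxn) are taken inside a single word wrapped in `<` and `>`, while word n-grams (wordNgrams) are sequences of neighboring tokens. The helper names are hypothetical, and fastText itself hashes such n-grams into buckets rather than keeping them as strings.

```python
def char_ngrams(word, minn, maxn):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams

def word_ngrams(tokens, n):
    """Contiguous word n-grams of every length from 1 up to n."""
    grams = []
    for size in range(1, n + 1):
        for start in range(len(tokens) - size + 1):
            grams.append(" ".join(tokens[start:start + size]))
    return grams

print(char_ngrams("where", 3, 3))
print(word_ngrams(["I", "like", "tea"], 2))
```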

4 votes

1 answer

2k views

How to Find Top N Similar Words in a Dictionary of Words / Things?

I have a list of str that I want to map against. The words could be "metal" or "st. patrick". The goal is to map a new string against this list and find Top N Similar items. For ...

Ian Yu

67

asked Apr 18, 2021 at 9:56

4 votes

1 answer

3k views

loading of fasttext pre trained german word embedding's .vec file throwing out of memory error

I am using gensim to load the fasttext's pre-trained word embedding de_model = KeyedVectors.load_word2vec_format('wiki.de\wiki.de.vec') But this gives me a memory error. Is there any way I can load ...

shasvat desai

439

asked Jun 18, 2018 at 13:08

4 votes

2 answers

2k views

Difference between Gensim's FastText and Facebook's FastText

I came upon the realization that there exists the original implementation of FastText here by which you can use fasttext.train_unsupervised in order to generate word vectors (see this link as an ...

Perl Del Rey

1,128

asked Aug 18, 2021 at 14:21

4 votes

2 answers

1k views

Why FastText is not handling finding multi-word phrases?

FastText pre-trained model works great for finding similar words: from pyfasttext import FastText model = FastText('cc.en.300.bin') model.nearest_neighbors('dog', k=2000) [('dogs', 0.8463464975357056)...

dzieciou

4,654

asked Jan 5, 2021 at 9:13

4 votes

1 answer

6k views

Reducing size of Facebook's fastText

I am building a machine learning model which will process documents and extract some key information from it. For this, I need to use word embedding for OCRed output. I have several different options ...

ironhide012

85

asked Nov 19, 2019 at 9:10

4 votes

1 answer

5k views

Value of alpha in gensim word-embedding (Word2Vec and FastText) models?

I just want to know the effect of the value of alpha in gensim word2vec and fasttext word-embedding models? I know that alpha is the initial learning rate and its default value is 0.075 from Radim ...

M S

944

asked Dec 17, 2018 at 12:36

4 votes

1 answer

3k views

Converting Fasttext vector to word

I am having trouble converting a FastText vector back to a word. Here is my Python code: from gensim.models import KeyedVectors en_model = KeyedVectors.load_word2vec_format('wiki.en/wiki.en.vec'...

DMM

41

asked Nov 7, 2018 at 3:51

4 votes

1 answer

1k views

Language names of Languages supported by Fasttext

I am trying to find out the names of languages supported by Fasttext's LID tool, given these language codes listed here: af als am an ar arz as ast av az azb ba bar bcl be bg bh bn bo bpy br bs bxr ca ...

Abhinav Sukumar Rao

41

asked Nov 10, 2022 at 9:31

4 votes

1 answer

8k views

Is it possible to fine tune FastText models

I'm working on a project for text similarity using FastText, the basic example I have found to train a model is: from gensim.models import FastText model = FastText(tokens, size=100, window=3, ...

Luis Ramon Ramirez Rodriguez

10.6k

asked Sep 5, 2019 at 5:17

4 votes

1 answer

5k views

Install fasttext on Windows 10 with anaconda

I am trying to install fasttext in anaconda with Windows 10 using the command: pip install fasttext as explained here: https://pypi.org/project/fasttext/ The error messages are: ValueError: Unknown ...

Nicolas

809

asked Jul 6, 2018 at 0:27
