477 questions
2
votes
2
answers
297
views
Can't install Python fasttext on Windows 11 – string_view not found despite Visual Studio 2022 and C++17
I'm hitting a wall trying to install the Python package fasttext on Windows 11, and I'm getting C++ build errors despite setting everything up correctly (or so I thought). Here's the full context of ...
0
votes
0
answers
51
views
Why won't vectors derived from the pre-trained fasttext Japanese wiki model align properly with English vectors?
I'm trying to align English word vectors taken from the word2vec model trained on Google news with Japanese language word vectors taken from two different models: the fasttext model pre-trained on ...
1
vote
0
answers
226
views
Accessing FastText Library Directly from PHP Without Using Python
I want to access and use the FastText library directly from my PHP code without using Python or creating a web service (e.g., using Flask) to wrap the FastText functionality and call it from PHP. Is ...
1
vote
0
answers
709
views
Error installing fasttext: Could not build wheels for fasttext, which is required to install pyproject.toml-based projects
I was trying to install fasttext using
pip install fasttext
But this gives me an error like this:
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed ...
1
vote
1
answer
160
views
Fasttext pre-trained model is not producing OOV word vectors when using gensim downloader
I am having A LOT of trouble when trying to use all the fasttext libraries (in Jupyter with Anaconda3 on Windows 11) that I have found so far but this question is mainly about gensim's implementation. ...
-2
votes
1
answer
72
views
How to find the accuracy of a FastText model in text classification?
In machine learning, most models come with a way to compute accuracy, but I can't find one for the FastText model. Please help.
1
vote
1
answer
182
views
FastText language_identification in R returns too many arguments - how to match to texts?
FastText language_identification returns multiple predictions per original text, and also fails to indicate which belong to which original document.
There are differing numbers of predictions per ...
0
votes
1
answer
81
views
Cannot load fine-tuned fasttext wiki model after retraining and saving
I am fine-tuning a fastText wiki model as follows. This works fine. After fine-tuning, I save the retrained model.
from gensim.models import fasttext
model = fasttext.load_facebook_model(datapath("...
1
vote
1
answer
999
views
How to work around an installation issue when installing the fastText library on Windows?
I am new to this field and am experimenting with different models in the NLP domain. When I try to install the fastText library from the command prompt, it shows an error:
pip install wheel
pip install ...
0
votes
1
answer
632
views
fasttext installation in Anaconda
I'm using Anaconda3-2024.02-1-Windows-x86_64 (yes, Windows 64) on Windows 11, with Python 3.11.7. I was trying to build an offline translator and I forgot what led to what... Eventually I was trying ...
0
votes
1
answer
35
views
How to separate items in dataset in python?
I scraped reviews from a website, and the pros and cons are separate from each other. I scraped them as a list because it looks like the best solution for not having the same review with user, date ...
-1
votes
1
answer
455
views
Unhandled exception. System.DllNotFoundException: Unable to load shared library 'fasttext' or one of its dependencies
I want to use the FastText.NetWrapper library in my project. I imported the library and wrote the sample code, but after running it, I get the following error:
Unhandled exception. System....
-1
votes
1
answer
167
views
Pre-training or using the existing model of FastText?
I am planning to create a classification model. Instead of using traditional models, I decided to use a new technique of creating word embeddings, clustering them using k-means, then use the mean of ...
0
votes
1
answer
137
views
Can fasttext classify on character level?
I am using a fasttext model to predict labels for text.
Usually fasttext can classify text on word level such as:
model = fasttext.train_supervised(input="training_fasttextFormat.csv", lr=0.1, ...
1
vote
2
answers
3k
views
ModuleNotFoundError: No module named 'pybind11'
I tried to install fasttext by the following command:
python -m pip install -U fasttext
but it didn't work well.
I faced the error below:
Collecting fasttext
Using cached fasttext-0.9.2.tar.gz (68 ...
0
votes
0
answers
90
views
Makefile error for fastText on windows - make (e=2)
I'm trying to use fastText on my Windows computer, and when I try to build the project using the make command (I installed Make v3.81), I get this error:
c++ -pthread -std=c++11 -march=native -O3 -...
7
votes
3
answers
20k
views
ERROR: Could not build wheels for fasttext, which is required to install pyproject.toml-based projects
I'm trying to install fasttext using pip install fasttext in python 3.11.4 but I'm running into trouble when building wheels. The error reads as follows:
error: command 'C:\\Program Files (x86)\\...
0
votes
1
answer
92
views
CompressFastText pqkmeans does not install
I would like to use the compress-fasttext library:
https://github.com/avidale/compress-fasttext
I use this command to install:
pip install compress-fasttext[full]
I get this pqkmeans error:
running ...
1
vote
0
answers
5k
views
Faiss how to access data in a database, after indexing and retrieving the indexes?
Good afternoon,
I am facing a challenge related to retrieving sentences based on questions asked. To solve this problem, I chose to use the FAISS model. The process I followed involved coding the ...
0
votes
1
answer
524
views
How to get a progress bar for gensim.models.FastText.train()?
I have the following code to train a FastText embedding model.
embed_model = FastText(vector_size=meta_hyper['vector_size'],
window=meta_hyper['window'],
...
0
votes
0
answers
254
views
Issues going from UTF-8 to Windows Latin1 and back
I am working within a rather complicated system where data is entered by a respondent and stored in an xml file that uses UTF-8 encoding. That data is then uploaded to an Oracle database that uses ...
1
vote
0
answers
93
views
Fasttext train function, when called from C code, reads all words, but the number of words is incorrect
I am currently developing a fastText Go binding, and I am stuck on a problem while training on the same file
from cgo code:
go test -run ^TestTrain$ -v ...
1
vote
1
answer
142
views
Why do I get inconsistent results between Fasttext, Longformer, and Doc2vec?
I am using a Doc2Vec model to calculate cosine similarity between observations in a dataset of website text. I want to be sure that my measure is roughly consistent if I instead use Fasttext (trained ...
0
votes
2
answers
664
views
Customize word embeddings to your own vocabulary
I have a vocabulary related to restaurant stuff in Spanish and I am using predefined word embeddings in Spanish with FastText and Bert, however, I see that there are a lot of out-of-vocabulary (oov) ...
0
votes
2
answers
177
views
Loading .bin pretrain fasttext models from Azure Blob Storage into Azure Synapse Notebook
Does anyone know how to load a pretrained .bin fasttext model into an Azure Synapse notebook using the fasttext.load_model function? My .bin file is in an Azure Blob Storage account.
I try to load using the ...
0
votes
1
answer
56
views
R system call failing due to arithmetic operator
I am trying to mimic the results of a fasttext command line operation in R using the system function. When I run the prompt in the command line I get the desired results as below:
F:\>fasttext ...
0
votes
1
answer
412
views
Regularization in Fasttext
I am currently training a FastText classifier and I'm facing the issue of overfitting.
model = fasttext.train_supervised( f'train{runID}.txt', lr=1.0, epoch=10, wordNgrams=2, dim=300, thread=2, ...
0
votes
0
answers
192
views
Removing words from FastText Model / Converting a .vec file to a .bin file (vec2bin)
I am working with FastText on a language (Tamil) and a task where I don't expect to encounter, and simply don't care about, characters/words from other languages. I have both the text (.vec) and binary (....
-1
votes
1
answer
170
views
Should I Pass Word2Vec and FastText Vectors Separately or Concatenate Them for Deep Learning Model in Smart Contract Vulnerability Detection?
I have been working with word embeddings lately, and I have a question. Consider vulnerability detection in smart contracts. The input is smart contract files labeled with 0 or 1 stating ...
0
votes
1
answer
210
views
What is the best way to save fastText word vectors in a dataframe as numeric values?
How to save fastText word vectors in dataframe better in order to use them for further calculations?
Hello everyone!
I have a question about fastText word vectors, namely, I'd like to know, how to ...
4
votes
1
answer
2k
views
Sentence embeddings for extremely short texts (1-3 words/sentence)
I have some extremely short texts which come from bank transactions (80% of the dataset has fewer than 3 words), and I want to classify them into ~90.000 classes (supplier).
Since the text ...
0
votes
1
answer
153
views
Does hashing in Fasttext lead to different ngrams sharing the same embedding?
As per Section 3.2 in the original paper on Fasttext, the authors state:
In order to bound the memory requirements of our model, we use a
hashing function that maps n-grams to integers in 1 to K
...
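The quoted passage can be illustrated with a minimal sketch. It assumes a generic 32-bit FNV-1a hash and character n-grams of length 3 to 6 on a word padded with `<` and `>`; the helper names (`fnv1a_32`, `ngram_bucket`, `char_ngrams`, `collisions`) are hypothetical and not part of the fastText API, and the library's real hashing details may differ. The point is only that once there are more distinct n-grams than buckets, some n-grams must map to the same integer and therefore share one embedding row.

```python
from typing import Dict, List


def fnv1a_32(data: bytes) -> int:
    """Generic 32-bit FNV-1a hash (an assumption, used only for illustration)."""
    h = 2166136261  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h


def ngram_bucket(ngram: str, k: int) -> int:
    """Map an n-gram to an integer in 1..K, as in the quoted passage."""
    return 1 + fnv1a_32(ngram.encode("utf-8")) % k


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> List[str]:
    """Character n-grams of the word padded with boundary symbols < and >."""
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def collisions(ngrams: List[str], k: int) -> Dict[int, List[str]]:
    """Return buckets that hold more than one distinct n-gram."""
    buckets: Dict[int, List[str]] = {}
    for gram in set(ngrams):
        buckets.setdefault(ngram_bucket(gram, k), []).append(gram)
    return {b: sorted(gs) for b, gs in buckets.items() if len(gs) > 1}


if __name__ == "__main__":
    grams = char_ngrams("where")
    # A deliberately tiny K forces collisions; a real model uses a far larger K.
    for bucket, shared in sorted(collisions(grams, k=8).items()):
        print(bucket, shared)
```

With a tiny K such as 8, the n-grams of a single word already outnumber the buckets, so shared buckets are guaranteed. With a large K, collisions become rarer but are still possible whenever the corpus contains more distinct n-grams than K.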
2
votes
2
answers
523
views
Convert vector to a word with Fasttext
I made a model with a dataset with Fasttext and I can convert every word to a vector.
But now I want to convert a vector to its unique word.
For example, I have this vector that is for ("the"...
0
votes
2
answers
309
views
Could not broadcast input array from shape (300,) into shape when mapping FastText word vectors with word in the dataset
I am working on creating an NLP model for my final year project. I am currently using a Tensorflow Keras LSTM model to train the model. I have found an online guide for this as my dataset is in Sinhala ...
0
votes
1
answer
2k
views
Load a fasttext model and use the GPU
I want to use the cc.fa.300.bin FastText model on Colab, and I want to make it use the Colab GPU. This is my code:
model_path = '/content/model/fasttext/cc.fa.300.bin'
device = torch.device('cuda' if torch....
1
vote
1
answer
99
views
NLP how to get vectors of phrases/documents
I want to know how to generate vectors using NLP, if I'm not mistaken it should be done by sum or average of all words. However, it is not clear to me how the following sentences will generate ...
1
vote
1
answer
487
views
Awk Script for fastText vectors - Error: "unexpected character 0xe2" when there's no such character
I ran the following Awk script to get fastText vectors on my Ubuntu 22.04.2 LTS (Jammy Jellyfish).
However, I always get the same error code: awk: lines 5 and 13: unexpected character 0xe2
The Awk ...
0
votes
1
answer
2k
views
Loading a pretrained fastText model with Gensim
I'm trying to load a pretrained German fastText model (source: https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.de.300.bin.gz) with Gensim. My intention is to fine-tune it using my own dataset....
0
votes
0
answers
78
views
How can you get the word out of a vector using fasttext?
I'm using fasttext for word embedding as part of a transformer network. I supposed that if there is a function to get the vector of a word (get_word_vector()), there is a function to get the word out of ...
0
votes
1
answer
93
views
FastText difference between saving directly and saving after build_vocab and training with epochs
I've been building a FastText model and saving it as follows:
model = FastText(#some parameters)
model.build_vocab(corpus_strings)
model.train(corpus_iterable=corpus_strings, total_examples=len(...
0
votes
1
answer
297
views
Meaning of get_input_matrix and get_output_matrix in FastText Python API
I was playing with the FastText API and found two methods, namely get_input_matrix and get_output_matrix. I am wondering what their purposes are. Is there any chance that those 2 are the weight ...
0
votes
2
answers
962
views
How does gensim calculate sentence embeddings when using a pretrained fasttext model?
According to this answer, sentence similarity for FastText is calculated in one of two ways (depending on whether the embeddings are created supervised or unsupervised)
The mean of the normalized word ...
0
votes
1
answer
606
views
Evaluate FastText embeddings
I want to evaluate my FastText model (trained on my own corpus).
For semantic meaning I understand that we can use a dataset containing several pairs of two words which have been scored by humans, and ...
0
votes
0
answers
161
views
FastText Unsupervised Training only detects small number of Chinese words
I'm using FastText to train the model in unsupervised mode (the data.train file is a text file that contains 50,000 lines/1.7 million Chinese characters):
import fasttext
model = fasttext.train_unsupervised(...
0
votes
1
answer
201
views
crawl-300d-2M-subword.zip corrupted or cannot be downloaded
I am trying to use this fasttext model crawl-300d-2M-subword.zip from the official page on my Windows machine, but the download fails in the last few KB.
I managed to successfully download the zip ...
1
vote
0
answers
86
views
Cannot run a python package in amazon sagemaker, but can run it in amazon terminal?
I am trying to use https://github.com/amzn/pecos in amazon sagemaker. I follow the readme of pecos to install it using amazon terminal. Then one can see the package is installed from terminal:
...
0
votes
0
answers
672
views
Problem in installing fasttext in M1 Mac - python 3.8
Tried the following command for installing fasttext
conda install -c "conda-forge/label/cf202003" fasttext
Collecting package metadata (current_repodata.json): done
Solving environment: ...
0
votes
1
answer
445
views
FastText Error! ValueError: (file-name) cannot be opened for training
I have installed the fasttext module in Python and loaded the model ['cc.en.300.bin'].
I already formatted the data frame as fasttext expects and then generated the files
train.to_csv(" ...
0
votes
1
answer
134
views
Error loading vectors in fasttext/torchtext
I get the following error:
urllib.error.HTTPError: HTTP Error 403: Forbidden
When running this:
class FastText(Vectors):
url_base = 'https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.{}....
0
votes
1
answer
1k
views
Encountered a problem while installing [ FastText ] library on MacOS
I have been trying to install the "FastText" library on macOS but I keep encountering a Runtime error.
System - MacOS: 13.0.1 (22A400)
Python Version: 3.10
IDE: Pycharm
I tried installing it ...