I want to use the fastText sentence vector as an input feature.

vector = model.get_sentence_vector('Original Sentence')

I am attempting binary classification of sentences with an MLP, which I will train on the fixed-size feature vector produced by the code above. Is this a plausible approach?

1 Answer

You can take the mean of the word embeddings: tokenize the sentence, look up the embedding for each word, and compute the average. This gives you a fixed-size NumPy array that you can use as input to whatever classifier you want. Depending on the classification task, it might be useful to remove function words first.
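The averaging described above can be sketched roughly as follows. This is only an illustration: the `EMBEDDINGS` table is a toy stand-in for a trained model, `FUNCTION_WORDS` is a made-up stop list, and `average_embedding` is a hypothetical helper name, not part of fastText or Gensim. Whitespace tokenization is also an assumption; a real tokenizer would likely do better.

```python
import numpy as np

# Toy lookup table standing in for a trained embedding model (made-up values).
EMBEDDINGS = {
    "the": np.array([0.1, 0.0, 0.2]),
    "movie": np.array([0.9, 0.3, 0.1]),
    "was": np.array([0.0, 0.1, 0.1]),
    "great": np.array([0.5, 0.7, 0.9]),
}

# Illustrative list of function words to drop before averaging (an assumption).
FUNCTION_WORDS = {"the", "was"}


def average_embedding(sentence, embeddings, skip=frozenset()):
    """Average the embeddings of all known, non-skipped tokens in a sentence."""
    tokens = sentence.lower().split()  # naive whitespace tokenization
    vectors = [embeddings[t] for t in tokens if t in embeddings and t not in skip]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        # No usable tokens: fall back to a zero vector of the right size.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


features = average_embedding("The movie was great", EMBEDDINGS, skip=FUNCTION_WORDS)
```

Every sentence maps to a vector of the same length regardless of how many words it contains, which is what a fixed-input classifier such as an MLP needs.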

Gensim has a richer Python API than FastText itself. If you just want to quickly train a classifier, the best option is using the command line interface of FastText.

2 Comments

But can I use get_sentence_vector? I believe it also returns an average of the word vectors in the sentence.
Yes, it averages the word vectors, with one additional trick: each word vector is divided by its L2 norm before averaging.
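To make the L2-norm step concrete, here is a small sketch of a normalized average. It reflects my understanding of what get_sentence_vector does for unsupervised models and is not copied from the fastText source; `normalized_average` is a hypothetical name, and the input vectors are made up.

```python
import numpy as np


def normalized_average(word_vectors):
    """Scale each vector to unit L2 norm, then average the scaled vectors.

    Assumes a non-empty list of equal-length vectors. Zero vectors are
    skipped because they cannot be normalized.
    """
    total = np.zeros_like(word_vectors[0], dtype=float)
    count = 0
    for vec in word_vectors:
        norm = np.linalg.norm(vec)
        if norm > 0:
            total += vec / norm
            count += 1
    return total / count if count else total


# Made-up word vectors; the last one is all zeros and gets skipped.
vecs = [np.array([3.0, 4.0]), np.array([0.0, 2.0]), np.array([0.0, 0.0])]
sentence_vec = normalized_average(vecs)
```

Normalizing first means every word contributes a direction of equal weight, so words with unusually large vectors do not dominate the sentence vector.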
