
I am trying to calculate the word embeddings using fasttext for the following sentence.

a = 'We are pencil in the hands'

I don't have any pretrained model, so how do I go about it?

1 Answer

You need a table of trained embeddings.

You can download pre-trained embeddings from the FastText website and use the code they provide for loading the embeddings. You don't even need to install FastText for that:

import io

def load_vectors(fname):
    # fastText .vec text format: a header line "n d" (vocabulary size
    # and dimension), then one line per word: the word followed by d floats.
    fin = io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore')
    n, d = map(int, fin.readline().split())
    data = {}
    for line in fin:
        tokens = line.rstrip().split(' ')
        # Store a list, not a lazy map object, so the vector can be
        # read more than once (map() is exhausted after one pass).
        data[tokens[0]] = [float(x) for x in tokens[1:]]
    fin.close()
    return data

Then you just look each word up in the dictionary to get its vector.
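For example, here is a minimal sketch of that lookup. It writes a tiny toy `.vec` file (`toy.vec` and its 4-dimensional vectors are made up for illustration; real fastText files such as `wiki-news-300d-1M.vec` have 300 dimensions) and then embeds the words of your sentence, skipping any word missing from the table:

```python
import io

def load_vectors(fname):
    # Parse the fastText .vec format: header "n d", then one word per line.
    fin = io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore')
    n, d = map(int, fin.readline().split())
    data = {}
    for line in fin:
        tokens = line.rstrip().split(' ')
        data[tokens[0]] = [float(x) for x in tokens[1:]]
    fin.close()
    return data

# Toy file so the example is self-contained; real files are much larger.
sample = ("3 4\n"
          "we 0.1 0.2 0.3 0.4\n"
          "are 0.5 0.6 0.7 0.8\n"
          "pencil 0.9 1.0 1.1 1.2\n")
with io.open('toy.vec', 'w', encoding='utf-8') as f:
    f.write(sample)

vectors = load_vectors('toy.vec')
a = 'We are pencil in the hands'
# Lower-case the tokens and skip out-of-vocabulary words.
embedded = {w: vectors[w] for w in a.lower().split() if w in vectors}
```

Note that a plain dictionary lookup cannot handle out-of-vocabulary words; for that you need the full fastText model (the `.bin` file), which composes vectors from character n-grams.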

Alternatively, you can train fastText yourself on your own text data by following a tutorial. A reasonable minimum dataset for training word embeddings is hundreds of thousands of words.
