-1 votes
1 answer
64 views

I’m working on a research problem where I want to reconstruct or paraphrase sentences starting from synthetic embeddings. The embeddings are global (mean-pooled), not token-level, so they lose ...
melissa mattos
0 votes
0 answers
33 views

How can we use a hardware encoder and decoder to convert RGB frames to H.264? In my AOSP code the HW encoder/decoder is enabled: <MediaCodec name="OMX.qcom.video.encoder.avc" type="...
Rajeev
  • 167
0 votes
0 answers
33 views

I am building an encoder-decoder sequence-to-sequence model using LSTM on a time series. I am following some tutorials and I am confused as to why we need to send the previous step prediction as an ...
Manoj Agrawal
0 votes
1 answer
71 views

Problem: I have created my encoder-decoder model to forecast time series. The model trains well, but I struggle with an error in the inference model and I don't know how to troubleshoot it: WARNING:...
Art
  • 11
1 vote
0 answers
66 views

Hello guys, I'm trying to understand how to make a forward pass on an encoder-decoder model with a custom dataset. I have created a BucketIterator to see what the tensor.shape looks like for a batch of ...
Zaharie Andrei
1 vote
1 answer
324 views

I don't understand what my problem is. It should work, if only because it's the standard autoencoder from the TensorFlow documentation. This is the error: line 64, in call decoded = self.decoder(...
razzzz
  • 11
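For reference, a minimal sketch of roughly what the basic autoencoder from the TensorFlow documentation looks like (the latent dimension and 28x28 input size are assumptions taken from that tutorial):

import tensorflow as tf
from tensorflow.keras import layers, Model

class Autoencoder(Model):
    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder compresses the flattened input down to latent_dim features
        self.encoder = tf.keras.Sequential([
            layers.Flatten(),
            layers.Dense(latent_dim, activation='relu'),
        ])
        # Decoder reconstructs the original 28x28 image
        self.decoder = tf.keras.Sequential([
            layers.Dense(784, activation='sigmoid'),
            layers.Reshape((28, 28)),
        ])

    def call(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

model = Autoencoder(latent_dim=64)
model.compile(optimizer='adam', loss='mse')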
0 votes
1 answer
176 views

I am working on a relation extraction problem using the T5 encoder-decoder model with the prefix 'summary'. I have fine-tuned the model but I am confused about the evaluation metrics to evaluate my ...
Mudasser Afzal
0 votes
1 answer
234 views

I am trying to figure out what would be a good architecture for a neural network that takes projections (2D images) from different angles and creates a volume consisting of 2D slices (CT-like). So for ...
daniel
  • 15
1 vote
0 answers
313 views

I'm building a basic transformer model from scratch in PyTorch (with simple tokenization and no masking). I'm using 'The Mysterious Island' by Jules Verne as the training set, so download it from ...
matsuo_basho
  • 3,080
1 vote
1 answer
46 views

I am trying to create a custom neural network that has 2 encoders and one decoder. The row encoder takes in input of size e.g. 30x40, and the column encoder is supposed to take the same data in ...
Chandana Deshmukh
2 votes
0 answers
589 views

I'm currently using models like RoBERTa, CodeBERT, etc. for "code author identification/code detection" (you can imagine it like a facial recognition task). I know they are encoder architectures....
sastaengineer
1 vote
1 answer
494 views

In Google Colab I have loaded a BERT model using the Hugging Face transformers library and then fine-tuned it using Seq2SeqTrainer. I then saved this model to my Google Drive using model....
thewaterbuffalo
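Saving and reloading a model fine-tuned with Seq2SeqTrainer typically follows the save_pretrained/from_pretrained pattern; a sketch, assuming the trained model is a BERT-based EncoderDecoderModel and that `model`/`tokenizer` are the fine-tuned objects (the Drive path is a placeholder):

from transformers import AutoTokenizer, EncoderDecoderModel

save_dir = "/content/drive/MyDrive/bert2bert-finetuned"   # hypothetical Drive path

# Save the fine-tuned weights/config and the tokenizer together
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

# Reload later with the same class that was used for training
model = EncoderDecoderModel.from_pretrained(save_dir)
tokenizer = AutoTokenizer.from_pretrained(save_dir)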
1 vote
0 answers
231 views

I'm fine-tuning the TFT5ForConditionalGeneration model ("t5-small"). Before calling model.fit() and saving the model, I set output_score=True as well as the relevant parameters in the ...
ayalaall
  • 117
3 votes
1 answer
4k views

I have a question regarding T5 and BART. They seem very similar from a bird's-eye view, but I want to know precisely what the differences between them are. As far as I know they are both ...
Now.Zero
  • 1,459
1 vote
2 answers
172 views

I built a TransformerEncoder model, and it changes the output's sequence length if I use "with torch.no_grad()" during evaluation mode. My model details: class TransEnc(nn.Module): ...
Leon
  • 113
1 vote
0 answers
36 views

I want to generate a sentence from a machine translation model with constrained decoding that requires a custom BeamScorer. Is there a way to replace the standard BeamSearchScorer while using the ...
Jindřich
  • 11.3k
1 vote
0 answers
202 views

I am trying to build a video-to-text model using a Huggingface VisionEncoderDecoderModel. For the encoder, I'm using VideoMAE. Because the sequence length for videos is long, I want to use the decoder ...
jeg
  • 61
3 votes
0 answers
608 views

I am currently trying to get the hang of the EncoderDecoder model for seq2seq tasks from pretrained decoder models. I am a bit confused about how to create an encoder-decoder from two pretrained BERT ...
Georg B
  • 311
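A minimal sketch of warm-starting a seq2seq model from two pretrained BERT checkpoints with the Hugging Face EncoderDecoderModel helper (the checkpoint name and token-id settings are illustrative):

from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Encoder and decoder are both initialised from pretrained BERT weights;
# the decoder is switched to causal mode and gets cross-attention layers added.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Generation needs to know which tokens start and pad sequences
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id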
2 votes
0 answers
682 views

I made an export of the Helsinki model using Python optimum and I am trying to run the model with only the ONNX environment and implement beam search from scratch, because I have to later port this to ...
klsmgföl
0 votes
1 answer
326 views

For decoding online and local mp4 files, I am using this code snippet to create a source bin in DeepStream Python code (similar to the deepstream_test_3 example). This is a uridecodebin-based source, and ...
Nawin K Sharma
1 vote
0 answers
121 views

I have two layers with different layer sizes (hidden states). How can I perform encoder-decoder-style attention on these layers if the layer sizes are different? Because I will take a dot product, how? ...
Josef Souza
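One common workaround is to project one side into the other's dimensionality before taking the dot product; a rough sketch in PyTorch (all dimensions here are made up):

import torch
import torch.nn as nn
import torch.nn.functional as F

enc_dim, dec_dim = 256, 128            # hypothetical mismatched hidden sizes
proj = nn.Linear(dec_dim, enc_dim)     # map decoder states into the encoder space

enc_states = torch.randn(4, 20, enc_dim)    # (batch, src_len, enc_dim)
dec_state = torch.randn(4, dec_dim)         # (batch, dec_dim), one decoding step

query = proj(dec_state).unsqueeze(1)                      # (batch, 1, enc_dim)
scores = torch.bmm(query, enc_states.transpose(1, 2))     # (batch, 1, src_len)
weights = F.softmax(scores, dim=-1)                       # attention weights
context = torch.bmm(weights, enc_states).squeeze(1)       # (batch, enc_dim) context vector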
3 votes
0 answers
496 views

I'm trying to use a Transformer encoder as part of my model, something like this: self.trans = torch.nn.TransformerEncoder(torch.nn.TransformerEncoderLayer( d_model=18, nhead=6, ...
Antonis Karvelas
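For context, a self-contained version of that setup might look like the sketch below (the sequence length, batch size, and layer count are assumptions; only d_model=18 and nhead=6 come from the excerpt):

import torch
import torch.nn as nn

enc_layer = nn.TransformerEncoderLayer(d_model=18, nhead=6, batch_first=True)
trans = nn.TransformerEncoder(enc_layer, num_layers=2)

x = torch.randn(4, 50, 18)   # (batch, seq_len, d_model) because batch_first=True
out = trans(x)               # output keeps the input shape: (4, 50, 18)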
1 vote
1 answer
338 views

Relevant code: from transformers import ( AdamW, MT5ForConditionalGeneration, AutoTokenizer, get_linear_schedule_with_warmup ) tokenizer = AutoTokenizer.from_pretrained('google/byt5-...
iamabhaykmr
  • 2,071
0 votes
2 answers
2k views

When trying to use EarlyStopping for Seq2SeqTrainer, e.g. patience was set to 1 and threshold 1.0: training_args = Seq2SeqTrainingArguments( output_dir='./', num_train_epochs=3, ...
alvas
  • 123k
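For reference, early stopping in the Trainer API is wired through EarlyStoppingCallback together with periodic evaluation and load_best_model_at_end; a sketch under those assumptions (argument names as in recent transformers releases):

from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir='./',
    num_train_epochs=3,
    evaluation_strategy='epoch',     # early stopping needs periodic evaluation
    save_strategy='epoch',
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
    metric_for_best_model='eval_loss',
    greater_is_better=False,
)

# Passed to Seq2SeqTrainer(..., callbacks=[stopper]) alongside the model and datasets
stopper = EarlyStoppingCallback(early_stopping_patience=1, early_stopping_threshold=1.0)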
0 votes
1 answer
2k views

I am generating some summaries using a fine-tuned BART model, and I've noticed something strange. If I feed the labels to the model, it will always generate summaries of the same length as the label, ...
andrea
  • 930
0 votes
1 answer
104 views

My understanding is that in the encoder-decoder LSTM, the decoder's first state is the same as the encoder's final state (both hidden and cell states). But I don't see that written explicitly in the code ...
Travelling Salesman
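In the Keras functional API this hand-off is usually written explicitly via initial_state; a minimal sketch (layer sizes and feature counts are assumptions):

from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

units, n_features = 64, 1

# Encoder: return_state=True exposes the final hidden and cell states
encoder_inputs = Input(shape=(None, n_features))
_, state_h, state_c = LSTM(units, return_state=True)(encoder_inputs)

# Decoder: its first state is set to the encoder's final state
decoder_inputs = Input(shape=(None, n_features))
decoder_outputs = LSTM(units, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c]
)
outputs = Dense(n_features)(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], outputs)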
0 votes
0 answers
92 views

Below is typical code for creating an encoder-decoder LSTM with a single layer. The source is here. You can assume a single feature and a many-to-one architecture. model.add(LSTM(units, input_shape=(n_input, ...
Travelling Salesman
3 votes
1 answer
12k views

When I try to run my code for the Donut DocVQA model, I get the following error: """Test""" from donut import DonutModel from PIL import Image import torch model = ...
Armen Melkonyan
1 vote
0 answers
451 views

Referring to the tutorial (https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/stallion.html) provided by PyTorch Forecasting about their implementation of the Temporal Fusion Transformer, I'm trying ...
Francesco Mangia
2 votes
0 answers
1k views

I'm writing my thesis about attention mechanisms. In the paragraph in which I explain the decoder of the transformer I wrote this: The first sub-layer is called masked self-attention, in which the ...
CarlaDP
  • 21
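A concrete way to state what "masked" means here: each position may attend only to itself and earlier positions, enforced by adding a look-ahead mask of minus infinity to the attention scores before the softmax. A small sketch (the sequence length is arbitrary):

import torch

seq_len = 5
# Upper-triangular look-ahead mask: -inf above the diagonal blocks attention to future tokens
mask = torch.triu(torch.full((seq_len, seq_len), float('-inf')), diagonal=1)
print(mask)
# In PyTorch this is the attn_mask / tgt_mask that gets added to the scores
# inside scaled dot-product attention.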
0 votes
1 answer
247 views

I checked the source code but am still struggling to find the difference between tf.keras.layers.LSTM(512) and tf.keras.layers.LSTMCell(512). In many articles on encoder-decoders, at the encoder LSTM(512) ...
Jay ra1
  • 91
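The practical difference: LSTM consumes a whole sequence and loops over time internally, while LSTMCell computes a single time step and has to be wrapped (e.g. in tf.keras.layers.RNN) or looped over manually. A small sketch (shapes are illustrative):

import tensorflow as tf

x = tf.random.normal((2, 10, 32))          # (batch, timesteps, features)

# LSTM: runs over all 10 timesteps itself
out_seq = tf.keras.layers.LSTM(512)(x)     # shape (2, 512)

# LSTMCell: one step of computation; RNN() supplies the loop over time
cell = tf.keras.layers.LSTMCell(512)
out_cell = tf.keras.layers.RNN(cell)(x)    # shape (2, 512)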
1 vote
0 answers
141 views

So I am trying to train my text summarization model on a Colab TPU, as training it on the Colab CPU is very slow, but I am getting a 'No gradients provided for any variable' error. This error does not appear ...
Peter Austin
1 vote
2 answers
205 views

I am trying to train an encoder-decoder model with multispectral images that have 9 channels, but the code I am running downloads pretrained resnet101 weights, which were trained on 3-channel images.
user3449214
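A common workaround is to keep the pretrained weights and replace only the first convolution, copying the RGB filters into three of the nine input channels; a sketch assuming torchvision's resnet101 (the channel-initialisation scheme is one choice among several):

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet101(pretrained=True)       # pretrained weights expect 3-channel input
old_conv = model.conv1

# New stem convolution that accepts 9 channels instead of 3
model.conv1 = nn.Conv2d(9, 64, kernel_size=7, stride=2, padding=3, bias=False)

with torch.no_grad():
    # Reuse the pretrained RGB filters for the first 3 channels,
    # and initialise the remaining 6 channels with the per-filter mean
    model.conv1.weight[:, :3] = old_conv.weight
    model.conv1.weight[:, 3:] = old_conv.weight.mean(1, keepdim=True).repeat(1, 6, 1, 1)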
2 votes
1 answer
1k views

I have an encoder-decoder LSTM model that learns to predict 12 months of data in advance, while looking back 12 months. If it helps at all, my dataset has around 10 years in total (120 months). I keep 8 ...
Travelling Salesman
6 votes
1 answer
6k views

I want to build a classification model that needs only the encoder part of language models. I have tried BERT, RoBERTa, and XLNet, and so far I have been successful. I now want to test the encoder part ...
ls_grep
  • 61
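Hugging Face exposes the encoder of T5 on its own as T5EncoderModel; a rough classification sketch (the checkpoint, pooling strategy, and two-class head are assumptions):

import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small")
classifier = torch.nn.Linear(encoder.config.d_model, 2)   # hypothetical 2-class head

inputs = tokenizer("an example sentence to classify", return_tensors="pt")
hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, d_model)
pooled = hidden.mean(dim=1)                       # simple mean pooling over tokens
logits = classifier(pooled)                       # (1, 2) class scores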
1 vote
0 answers
113 views

I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my error. I used the encoder-decoder model and I still have to build ...
Akshat Mittu
1 vote
0 answers
113 views

I am working on a sequence-to-sequence encoder-decoder model with a bidirectional GRU for the task of grammar error detection and correction for Arabic. I want to calculate the F0.5 score for my ...
Moodhi
  • 45
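Once edit-level true positives, false positives, and false negatives are counted, F0.5 is simply the precision-weighted F-measure; a small sketch with made-up counts:

def f_beta(precision, recall, beta=0.5):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta=0.5 weights precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical edit counts: correct corrections, spurious corrections, missed corrections
tp, fp, fn = 40, 10, 20
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(round(f_beta(precision, recall, beta=0.5), 3))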
0 votes
0 answers
107 views

I have a JSON file that has dates and integers in it, all of which show up as strings. I used a JSON decoder to get the strings parsed into integers, but I am having issues getting the date ...
Alphanum3ric
0 votes
0 answers
183 views

I am building a seq2seq model for grammar correction, and I want to calculate the F-score for my model, but I don't know how. Can you help me please? In the evaluation loop I am only calculating the loss, ...
Latifa Alabdullatif
0 votes
1 answer
1k views

I used T5Tokenizer to tokenize a sentence and then T5EncoderModel to encode it. Finally, I used the PyTorch nn.TransformerDecoder to decode it. The target vector is a torch.tensor [y1, y2] where y1 and y2 have ...
rana
  • 111
0 votes
0 answers
468 views

I am using BertConfig() to create an encoder-decoder model in the following way: encoder = BertConfig() decoder = BertConfig() config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder, ...
Ritu Gahir
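Continuing that pattern, a complete sketch might look like the following (note that the decoder side is expected to run in decoder mode with cross-attention, which the helper also takes care of; the model ends up randomly initialised, not pretrained):

from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

encoder_config = BertConfig()
decoder_config = BertConfig(is_decoder=True, add_cross_attention=True)

config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
model = EncoderDecoderModel(config=config)   # fresh weights following the combined config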
0 votes
1 answer
422 views

For example, with input shape=[1,64,12,60,33], when I use nn.Conv3d(in_channels=128, out_channels=64, kernel_size=(3, 3, 3), stride=2, padding=1) the output shape =[1,64,6,30,17]; after that I want to let ...
Truman
  • 1
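To go back up to the original spatial size after that stride-2 Conv3d, one option is a mirrored nn.ConvTranspose3d with matching stride and padding plus an output_padding to recover the odd dimensions; a sketch (the excerpt's input shape and in_channels disagree, so this assumes 128 input channels to match the Conv3d definition):

import torch
import torch.nn as nn

x = torch.randn(1, 128, 12, 60, 33)                       # assumed input matching in_channels=128
down = nn.Conv3d(128, 64, kernel_size=(3, 3, 3), stride=2, padding=1)
y = down(x)                                               # -> (1, 64, 6, 30, 17)

# Mirror of the downsampling; output_padding restores the odd spatial sizes
up = nn.ConvTranspose3d(64, 128, kernel_size=(3, 3, 3), stride=2,
                        padding=1, output_padding=(1, 1, 0))
z = up(y)                                                 # -> (1, 128, 12, 60, 33)
print(y.shape, z.shape)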
0 votes
1 answer
227 views

I wonder whether there are any attempts to predict word embedding vectors as targets in neural network architectures (like Transformers, sequence-to-sequence models, or simple RNNs), using for example ...
Jochen
  • 1
0 votes
0 answers
171 views

I'm currently training a seq2seq encoder-decoder network powered by LSTM in TensorFlow 2.x. The main problem right now is that the loss approaches NaN and the predictions returned are all NaN as well. I ...
YoYo
  • 1
1 vote
0 answers
421 views

I am trying to create an encoder-decoder RNN that adds sequence_lengths as an input to the model, to tell the model to ignore padding (essentially masking). The problem is that when I do this, I get a ...
jda5
  • 1,454
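The framework isn't visible in the excerpt; in PyTorch one common way to make an RNN respect per-example lengths is pack_padded_sequence (Keras has a Masking layer / mask_zero for the same purpose). A small sketch with illustrative shapes:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

batch = torch.randn(3, 7, 16)            # (batch, max_len, features), zero-padded
lengths = torch.tensor([7, 5, 2])        # true sequence lengths per example

rnn = nn.LSTM(16, 32, batch_first=True)
packed = pack_padded_sequence(batch, lengths, batch_first=True, enforce_sorted=False)
packed_out, (h, c) = rnn(packed)         # padded time steps are skipped entirely
out, _ = pad_packed_sequence(packed_out, batch_first=True)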
0 votes
0 answers
291 views

I trained an autoencoder with 16 latent dimensions on an input of a 21x18 matrix and it worked well. However, the latent space is not in my control, as in I cannot have each node represent a property I ...
Aditya Vartak
0 votes
1 answer
83 views

I am new to the machine learning domain. I have a 1D signal in the first column, and its corresponding frequency, mean_amplitude, and time are saved in the second column of a file: These are the input-...
manas
  • 501
0 votes
1 answer
316 views

I have a problem making a prediction using a pre-trained model that contains an encoder and decoder for handwritten text recognition. What I did is the following: checkpoint = torch.load("Model/...
Imen Bouzidi
1 vote
0 answers
46 views

I trained an E2E speech recognition model using a Conformer encoder and a Transformer decoder with hybrid CTC/attention, but it does not recognize even the training data correctly. I trained about 20 ...
Lightning
1 vote
0 answers
443 views

I'm trying to implement an attention-LSTM-based encoder-decoder model for multi-class classification. The dataset is non-linguistic in nature. Characteristics of my dataset: x_train.shape = (930,5) ...
Sukhmani Kaur Thethi