157 questions
-1
votes
1
answer
64
views
How to reconstruct sentences from mean-pooled embeddings (embedding inversion) [closed]
I’m working on a research problem where I want to reconstruct or paraphrase sentences starting from synthetic embeddings.
The embeddings are global (mean-pooled), not token-level, so they lose ...
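For context, "mean-pooled" here means an attention-mask-aware average of token embeddings; a minimal sketch of how such an embedding is typically produced (assuming a Hugging Face-style encoder; the checkpoint name is illustrative):

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer(["an example sentence"], return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)

    # Zero out padding positions, then average over the sequence dimension;
    # token order is lost here, which is what makes inversion hard.
    mask = inputs["attention_mask"].unsqueeze(-1).float()     # (batch, seq_len, 1)
    sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1)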
0
votes
0
answers
33
views
OMX usage in native code to encode and decode frames into H.264 on Android
How can we use the hardware encoder and decoder to convert RGB frames to H.264?
In my AOSP code the hardware encoder/decoder is enabled:
<MediaCodec name="OMX.qcom.video.encoder.avc" type="...
0
votes
0
answers
33
views
Encoder-decoder sequence-to-sequence time series forecasting
I am building an encoder-decoder sequence-to-sequence model using LSTM on a time series. I am following some tutorials and I am confused as to why we need to send the previous step prediction as an ...
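The pattern the tutorials use is teacher forcing: during training the decoder receives the ground-truth previous value, while at inference no ground truth exists, so its own prediction is fed back. A minimal PyTorch sketch of the inference loop (all names and sizes are assumptions):

    import torch

    decoder = torch.nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
    head = torch.nn.Linear(32, 1)

    h = torch.zeros(1, 1, 32)          # stand-in for the encoder's final hidden state
    c = torch.zeros(1, 1, 32)          # stand-in for the encoder's final cell state
    step_input = torch.zeros(1, 1, 1)  # start value / last observed point

    predictions = []
    for _ in range(12):                # forecast horizon
        out, (h, c) = decoder(step_input, (h, c))
        pred = head(out)               # (1, 1, 1)
        predictions.append(pred)
        step_input = pred              # feed the prediction back as the next input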
0
votes
1
answer
71
views
What is the source of this error in a time series inference model?
Problem: I have created my encoder-decoder model to forecast time series. The model trains well, but I struggle with an error in the inference model and I don't know how to troubleshoot it:
WARNING:...
1
vote
0
answers
66
views
BucketIterator in PyTorch is putting the batch_size on torch.shape[1]
Hello guys, I'm trying to understand how to make a forward pass on an encoder-decoder model with a custom dataset.
I have created a BucketIterator to see what tensor.shape looks like for a batch of ...
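For reference, the legacy torchtext Field is time-major by default, which is why the batch size lands on shape[1]; a minimal sketch of the two usual fixes (assuming the torchtext.legacy API):

    from torchtext.legacy.data import Field

    # Option 1: ask the Field for batch-major tensors up front.
    SRC = Field(tokenize=str.split, batch_first=True)

    # Option 2: keep the default and transpose inside forward():
    # src = batch.src.transpose(0, 1)  # (seq_len, batch) -> (batch, seq_len)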
1
vote
1
answer
324
views
autoencoder.fit doesn't work because of a ValueError
I don't understand what my problem is. It should work, if only because it's the standard autoencoder from the TensorFlow documentation.
This is the error:
line 64, in call
decoded = self.decoder(...
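For comparison, a runnable version of the subclassed autoencoder from the TensorFlow tutorial the question refers to (the 28x28 input and latent_dim=64 are assumptions):

    import tensorflow as tf

    class Autoencoder(tf.keras.Model):
        def __init__(self, latent_dim):
            super().__init__()
            self.encoder = tf.keras.Sequential([
                tf.keras.layers.Flatten(),
                tf.keras.layers.Dense(latent_dim, activation="relu"),
            ])
            self.decoder = tf.keras.Sequential([
                tf.keras.layers.Dense(784, activation="sigmoid"),
                tf.keras.layers.Reshape((28, 28)),
            ])

        def call(self, x):
            encoded = self.encoder(x)
            return self.decoder(encoded)

    autoencoder = Autoencoder(latent_dim=64)
    autoencoder.compile(optimizer="adam", loss="mse")
    # autoencoder.fit(x_train, x_train, epochs=10, validation_data=(x_test, x_test))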
0
votes
1
answer
176
views
Evaluation of entity relation extraction using an encoder-decoder?
I am working on a relation extraction problem using the T5 encoder-decoder model with the prefix 'summary'. I have fine-tuned the model but I am confused about the evaluation metrics to evaluate my ...
0
votes
1
answer
234
views
Encoder-decoder neural network architecture with different input and output sizes
I am trying to figure out what would be a good architecture for a neural network that takes projections (2D images) from different angles and creates a volume consisting of 2D slices (CT-like).
So for ...
1
vote
0
answers
313
views
Generate a prediction sequence with a transformer model built from scratch
I'm building a basic transformer model from scratch in PyTorch (with simple tokenization and no masking). I'm using 'The Mysterious Island' by Jules Verne as the training set, so I download it from ...
1
vote
1
answer
46
views
Train a row encoder and a column encoder in TensorFlow
I am trying to create a custom neural network that has 2 encoders and one decoder. The row encoder takes in input of size e.g. 30x40, and the column encoder is supposed to take the same data in ...
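One way to wire this up is to feed the column encoder a transposed view of the same input and merge the two codes before decoding; a minimal Keras sketch (layer sizes are assumptions):

    import tensorflow as tf

    inp = tf.keras.Input(shape=(30, 40))
    row_code = tf.keras.layers.LSTM(64)(inp)          # reads 30 rows of width 40
    cols = tf.keras.layers.Permute((2, 1))(inp)       # (40, 30): same data, transposed
    col_code = tf.keras.layers.LSTM(64)(cols)         # reads 40 columns of height 30
    merged = tf.keras.layers.Concatenate()([row_code, col_code])
    decoded = tf.keras.layers.Dense(30 * 40)(merged)  # one decoder for both codes
    out = tf.keras.layers.Reshape((30, 40))(decoded)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")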
2
votes
0
answers
589
views
Decoder-only architecture to generate embedding vectors
I'm currently using models like RoBERTa, CodeBERT, etc. for "code author identification / code detection" (you can imagine it like a facial recognition task). I know they are encoder architectures....
1
vote
1
answer
494
views
from_pretrained not loading custom fine-tuned model correctly: "encoder weights were not tied to the decoder"
In Google Colab I loaded a BERT model using the Hugging Face transformers library and then fine-tuned it using Seq2SeqTrainer. I then saved this model to my Google Drive using model....
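For reference, the save/reload round trip looks like this, and the "encoder weights were not tied to the decoder" message is informational unless tie_encoder_decoder was enabled in the config (checkpoint names and the Drive path are assumptions):

    from transformers import EncoderDecoderModel

    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )
    model.save_pretrained("/content/drive/MyDrive/my_model")
    reloaded = EncoderDecoderModel.from_pretrained("/content/drive/MyDrive/my_model")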
1
vote
0
answers
231
views
TFT5ForConditionalGeneration generate returns empty output_scores
I'm fine-tuning the TFT5ForConditionalGeneration model ("t5-small"). Before calling model.fit() and saving the model, I set output_scores=True as well as the relevant parameters in the ...
3
votes
1
answer
4k
views
What are the differences between T5 and BART?
I have a question regarding T5 and BART.
They seem very similar from a bird's-eye view, but I want to know precisely how they differ. As far as I know they are both ...
1
vote
2
answers
172
views
with torch.no_grad() Changes Sequence Length During Evaluation Mode
I built a TransformerEncoder model, and it changes the output's sequence length if I use "with torch.no_grad()" during evaluation mode.
My model details:
class TransEnc(nn.Module):
...
1
vote
0
answers
36
views
Huggingface Translate Pipe with custom BeamScorer
I want to generate a sentence from a machine translation model with constrained decoding that requires a custom BeamScorer. Is there a way to replace the standard BeamSearchScorer while using the ...
1
vote
0
answers
202
views
Output of extracted Huggingface decoder does not have attribute logits
I am trying to build a video-to-text model using a Huggingface VisionEncoderDecoderModel. For the encoder, I'm using VideoMAE. Because the sequence length for videos is long, I want to use the decoder ...
3
votes
0
answers
608
views
EncoderDecoder model training/prediction with two different tokenizers
I am currently trying to get the hang of the EncoderDecoder model for seq2seq tasks built from pretrained models.
I am a bit confused about how to create an encoder-decoder from two pretrained BERT ...
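A minimal sketch of the two-tokenizer setup: the source tokenizer prepares the encoder inputs, the target tokenizer prepares the labels, and the decoder-side special tokens are registered on the config (checkpoint names are assumptions):

    from transformers import AutoTokenizer, EncoderDecoderModel

    src_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    tgt_tok = AutoTokenizer.from_pretrained("bert-base-german-cased")

    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-german-cased"
    )
    model.config.decoder_start_token_id = tgt_tok.cls_token_id
    model.config.pad_token_id = tgt_tok.pad_token_id

    enc = src_tok("a source sentence", return_tensors="pt")
    labels = tgt_tok("ein Zielsatz", return_tensors="pt").input_ids
    loss = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask,
                 labels=labels).loss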
2
votes
0
answers
682
views
How can I execute the decoder of an ONNX export of a Seq2Seq model?
I made an export of the Helsinki model using Python optimum and I am trying to run the model with only the ONNX environment and implement beam search from scratch, because I have to later port this to ...
0
votes
1
answer
326
views
How to get Bitrate from uridecodebin Deepstream source
For decoding online and local mp4 files, I am using this code snippet to create a source bin in Deepstream Python code (similar to the deepstream_test_3 example). This is a uridecodebin-based source, and ...
1
vote
0
answers
121
views
Attention mechanism with different layer sizes?
I have two layers with different layer sizes (hidden states). How can I perform encoder-decoder-style attention on these layers if the layer sizes are different? Because I will take a dot product, how?
...
3
votes
0
answers
496
views
torch.nn.Transformer huge memory impact
I'm trying to use a TransformerEncoder as part of my model, something like this:
self.trans = torch.nn.TransformerEncoder(torch.nn.TransformerEncoderLayer(
d_model=18, nhead=6, ...
1
vote
1
answer
338
views
ValueError: bytes must be in range(0, 256) while decoding input tensor using transformer AutoTokenizer (MT5ForConditionalGeneration model)
Relevant code:
from transformers import (
AdamW,
MT5ForConditionalGeneration,
AutoTokenizer,
get_linear_schedule_with_warmup
)
tokenizer = AutoTokenizer.from_pretrained('google/byt5-...
0
votes
2
answers
2k
views
Why did the Seq2SeqTrainer not stop when the EarlyStoppingCallback criterion was met?
When trying to use EarlyStopping for Seq2SeqTrainer, e.g. patience was set to 1 and threshold 1.0:
training_args = Seq2SeqTrainingArguments(
output_dir='./',
num_train_epochs=3,
...
0
votes
1
answer
2k
views
Why is the transformer decoder always generating output of the same length as the gold labels?
I am generating some summaries using a fine-tuned BART model, and I've noticed something strange. If I feed the labels to the model, it will always generate summaries of the same length as the labels, ...
0
votes
1
answer
104
views
How does Keras initialize the decoder's first state in an encoder-decoder LSTM?
My understanding is that in an encoder-decoder LSTM, the decoder's first state is the same as the encoder's final state (both hidden and cell states). But I don't see that written explicitly in the code ...
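In hand-written Keras seq2seq code the wiring is explicit: return_state=True exposes the encoder's final hidden and cell states, and the decoder's initial_state is set to them. A minimal sketch (feature and unit sizes are assumptions):

    import tensorflow as tf

    enc_in = tf.keras.Input(shape=(None, 16))
    _, state_h, state_c = tf.keras.layers.LSTM(64, return_state=True)(enc_in)

    dec_in = tf.keras.Input(shape=(None, 16))
    dec_out = tf.keras.layers.LSTM(64, return_sequences=True)(
        dec_in, initial_state=[state_h, state_c]   # hand the encoder states over
    )
    model = tf.keras.Model([enc_in, dec_in], dec_out)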
0
votes
0
answers
92
views
How is the encoder connected to the decoder in an encoder-decoder LSTM?
Below is typical code for creating an encoder-decoder LSTM with a single layer. The source is here. You can assume a single feature and a many-to-one architecture.
model.add(LSTM(units, input_shape=(n_input, ...
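The full pattern the snippet comes from looks like this: the encoder and decoder are connected only through the encoder's final output vector, which RepeatVector copies once per output timestep (the concrete values of units, n_input, and n_out are assumptions):

    import tensorflow as tf
    from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

    units, n_input, n_out = 64, 12, 1
    model = tf.keras.Sequential([
        LSTM(units, input_shape=(n_input, 1)),  # encoder: emits its last hidden output
        RepeatVector(n_out),                    # repeat that vector n_out times
        LSTM(units, return_sequences=True),     # decoder
        TimeDistributed(Dense(1)),
    ])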
3
votes
1
answer
12k
views
ValueError: The following `model_kwargs` are not used by the model: ['encoder_outputs'] (note: typos in the generate arguments will also show up
When I try to run my code for the Donut for DocVQA model, I get the following error:
"""Test"""
from donut import DonutModel
from PIL import Image
import torch
model = ...
1
vote
0
answers
451
views
PyTorch Forecasting - Temporal Fusion Transformer calculate_prediction_actual_by_variable() plots empty
Referring to the tutorial (https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/stallion.html) provided by PyTorch Forecasting about their implementation of the Temporal Fusion Transformer, I'm trying ...
2
votes
0
answers
1k
views
Masked self-attention in the transformer's decoder
I'm writing my thesis about attention mechanisms. In the paragraph in which I explain the decoder of the transformer I wrote this:
The first sub-layer is called masked self-attention, in which the ...
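A worked example of that mask: position i may attend only to positions <= i, enforced by adding -inf above the diagonal before the softmax (the random scores are a stand-in for real query-key products):

    import torch

    seq_len = 5
    mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
    scores = torch.randn(seq_len, seq_len)          # raw attention scores (stand-in)
    weights = torch.softmax(scores + mask, dim=-1)  # future positions get weight 0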
0
votes
1
answer
247
views
Difference between LSTM(512) and LSTMCell(512)
I checked the source code but am still struggling to find the difference between tf.keras.layers.LSTM(512) and tf.keras.layers.LSTMCell(512).
In many articles on encoder-decoders, at the encoder LSTM(512) ...
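In short, LSTM consumes a whole sequence in one call, while LSTMCell computes a single timestep and leaves the loop over time to you; a minimal sketch (shapes are assumptions):

    import tensorflow as tf

    x = tf.random.normal([2, 10, 8])              # (batch, timesteps, features)

    out = tf.keras.layers.LSTM(512)(x)            # (2, 512): processes all 10 steps

    cell = tf.keras.layers.LSTMCell(512)
    states = [tf.zeros([2, 512]), tf.zeros([2, 512])]
    for t in range(10):                           # manual loop over timesteps
        out_t, states = cell(x[:, t, :], states)  # one step at a time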
1
vote
0
answers
141
views
No gradients provided for any variable, TensorFlow error on TPU
So I am trying to train my text summarization model on a Colab TPU, as training it on the Colab CPU is very slow, but I am getting a "No gradients provided for any variable" error; this error does not appear ...
1
vote
2
answers
205
views
a mismatch between the current graph and the graph
I am trying to train an encoder-decoder model with multispectral images having 9 channels, but the code that I am running downloads pretrained ResNet-101 weights which were trained on 3-channel images.
...
2
votes
1
answer
1k
views
How does my LSTM model know about testing data and simply cheat from previous values/patterns?
I have an encoder-decoder LSTM model that learns to predict 12 months of data in advance while looking back 12 months. If it helps at all, my dataset has around 10 years in total (120 months). I keep 8 ...
6
votes
1
answer
6k
views
Using only the encoder part of the T5 model
I want to build a classification model that needs only the encoder part of language models. I have tried BERT, RoBERTa, and XLNet, and so far I have been successful.
I now want to test the encoder part ...
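For reference, transformers exposes the encoder on its own as T5EncoderModel; a minimal classification sketch (the mean-pooling strategy and the 2-class head are assumptions):

    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    encoder = T5EncoderModel.from_pretrained("t5-small")
    head = torch.nn.Linear(encoder.config.d_model, 2)  # hypothetical 2-class head

    inputs = tokenizer("a sentence to classify", return_tensors="pt")
    hidden = encoder(**inputs).last_hidden_state       # (1, seq_len, d_model)
    logits = head(hidden.mean(dim=1))                  # mean-pool, then classify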
1
vote
0
answers
113
views
ValueError: Shapes (None, 16) and (None, 16, 16) are incompatible (LSTMs)
I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my error. I used the encoder-decoder model and I still have to build ...
1
vote
0
answers
113
views
Calculate F-score for GEC
I am working on a sequence-to-sequence encoder-decoder model with a bidirectional GRU for the task of grammar error detection and correction for Arabic. I want to calculate the F0.5 score for my ...
0
votes
0
answers
107
views
JSON isinstance decoder is only decoding the strings to numbers, but not dates
I have a JSON file that has dates and integers in it, all of which are showing as strings. I used a JSON decoder to get the strings parsed into integers, but I am having issues getting the date ...
0
votes
0
answers
183
views
How to calculate the F-score for a seq2seq model for grammar correction?
I am building a seq2seq model for grammar correction, and I want to calculate the F-score for my model, but I don't know how. Can you help me, please?
In the evaluation loop I am only calculating the loss, ...
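Once true positive, false positive, and false negative edit counts are available, the F-score is a one-liner; counting TP/FP/FN is left to an edit-alignment tool such as ERRANT and is not shown. A sketch (the example counts are made up):

    def f_beta(tp, fp, fn, beta=0.5):
        # beta < 1 weights precision more heavily than recall.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision == 0.0 and recall == 0.0:
            return 0.0
        b2 = beta ** 2
        return (1 + b2) * precision * recall / (b2 * precision + recall)

    print(f_beta(tp=40, fp=10, fn=20))  # ~0.769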
0
votes
1
answer
1k
views
T5Tokenizer and T5EncoderModel are used to encode sentences, then nn.TransformerDecoder to decode to a 2-label tensor. It gives an error
I used T5Tokenizer to tokenize a sentence, then T5EncoderModel to encode it. Finally, I used the PyTorch nn.TransformerDecoder to decode it. The target vector is a torch.tensor [y1, y2] where y1 and y2 have ...
0
votes
0
answers
468
views
BertConfig: num attention heads
I am using BertConfig() to create an encoder-decoder model in the following way:
encoder = BertConfig()
decoder = BertConfig()
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder, ...
0
votes
1
answer
422
views
When using conv and deconv, the output shape does not match (the input image's width is odd)
For example, the input shape = [1,64,12,60,33].
When I use
nn.Conv3d(in_channels=128, out_channels=64, kernel_size=(3, 3, 3), stride=2, padding=1)
the output shape = [1,64,6,30,17].
After that I want to let ...
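The usual fix is output_padding on the transposed convolution. With kernel 3, stride 2, padding 1, the forward pass computes out = floor((in - 1) / 2) + 1, so 12 -> 6, 60 -> 30, 33 -> 17; the inverse computes out = (in - 1) * 2 + 1 + output_padding, so the even dimensions need output_padding=1 and the odd one needs 0. A sketch (channel counts are assumptions):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 64, 6, 30, 17)
    deconv = nn.ConvTranspose3d(64, 128, kernel_size=3, stride=2,
                                padding=1, output_padding=(1, 1, 0))
    print(deconv(x).shape)  # torch.Size([1, 128, 12, 60, 33])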
0
votes
1
answer
227
views
Predicting word vectors instead of words (Natural Language Processing)
I wonder whether there have been any attempts to predict word embedding vectors as targets in neural network architectures (like Transformers, sequence-to-sequence models, or simple RNNs), using for example ...
0
votes
0
answers
171
views
Loss value + output of encoder-decoder LSTM network return NaN - TensorFlow 2.x
I'm currently training a seq2seq encoder-decoder network powered by LSTMs in TensorFlow 2.x. The main problem right now is that the loss approaches NaN and the predictions returned are all NaN as well. I ...
1
vote
0
answers
421
views
TensorFlow -- Invalid argument: assertion failed: [Condition x == y did not hold element-wise:]
I am trying to create an encoder-decoder RNN that adds sequence_lengths as an input to the model, to tell the model to ignore padding (essentially masking). The problem is that when I do this, I get a ...
0
votes
0
answers
291
views
Training autoencoder latent space with custom properties
I trained an autoencoder with 16 latent dimensions on an input of 21x18 matrices and it worked well. However, the latent space is not under my control, as in I cannot have each node represent a property I ...
0
votes
1
answer
83
views
Building an autoencoder network for parameter prediction
I am new to the machine learning domain. I have a 1D signal in the first column, and its corresponding frequency, mean_amplitude, and time are saved in the second column of a file. These are the input-...
0
votes
1
answer
316
views
Prediction for a pretrained model on handwritten text (images) - PyTorch
I have a problem making a prediction using a pre-trained model that contains an encoder and decoder for handwritten text recognition.
What I did is the following:
checkpoint = torch.load("Model/...
1
vote
0
answers
46
views
Trained E2E speech recognition model does not recognize even training data correctly
I trained an E2E speech recognition model using a Conformer encoder and a Transformer decoder with hybrid CTC/attention, but it does not recognize even the training data correctly.
I trained for about 20 ...
1
vote
0
answers
443
views
Dimensional error with a non-linguistic dataset as input to an LSTM-based encoder-decoder model using attention
I'm trying to implement an attention-based LSTM encoder-decoder model for multi-class classification. The dataset is non-linguistic in nature.
Characteristics of my dataset:
x_train.shape = (930,5)
...