Category "deep-learning"

Pretraining a language model on a small custom corpus

I was curious if it is possible to use transfer learning in text generation, and re-train/pre-train it on a specific kind of text. For example, having a pre

Variational AutoEncoder - TypeError

I am trying to implement a VAE for MNIST using convolutional layers using TensorFlow-2.6 and Python-3.9. The code I have is: # Specify latent space dimensions-

Derivatives from a class instance in TF1

I am using the Physics Informed Neural Networks (PINNs) methodology to solve non-linear PDEs in high dimension. Specifically, I am using this class https://git

logits and labels must be broadcastable error in Tensorflow RNN

I am new to Tensorflow and deep learning. I am trying to see how the loss decreases over 10 epochs in my RNN model that I created to read a dataset from kaggle w

How to clean garbage from CUDA in Pytorch?

I trained my neural nets and realized that even after torch.cuda.empty_cache() and gc.collect() my CUDA device memory is still full. In Colab Notebooks we can see t

Correct Implementation of Dice Loss in Tensorflow / Keras

I've been trying to experiment with Region Based: Dice Loss, but there are so many variations on the internet that I could not find tw
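
For reference, one common formulation of the soft Dice loss can be sketched in plain Python. This is a minimal sketch, not the question's code and not a Keras API: the function names and the `smooth` term (a frequently used guard against division by zero) are assumptions made for illustration, and other variants exist.

```python
def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Soft Dice coefficient for two flat sequences of values in [0, 1].

    `smooth` is an assumed smoothing constant; it keeps the ratio defined
    when both the mask and the prediction are empty.
    """
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    total = sum(y_true) + sum(y_pred)
    return (2.0 * intersection + smooth) / (total + smooth)


def dice_loss(y_true, y_pred, smooth=1.0):
    """Dice loss is one minus the Dice coefficient."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)


# A perfect prediction gives zero loss; a disjoint one gives a high loss.
perfect = dice_loss([1, 0, 1], [1, 0, 1])
disjoint = dice_loss([1, 1, 0], [0, 0, 1])
```

In a framework implementation the same sums would typically be taken over flattened tensors, per class or per sample depending on the variant.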

How to get the location of all text present in an image using OpenCV?

I have this image that contains text (numbers and letters) in it. I want to get the location of all the text and numbers present in this image. Also I want to

LSTM is Showing very low accuracy and large loss

I am applying LSTM on a dataset that has 53699 entries for the training set and 23014 entries for the test set. The shape of the input training set is (53699,4)

Why does the VQ-VAE require two-stage training?

According to the paper, VQ-VAE goes through two-stage training: first train the encoder and the vector quantization, and then train an auto-regressive model

Random cropping data augmentation convolutional neural networks

I am training a convolutional neural network, but have a relatively small dataset. So I am implementing techniques to augment it. Now this is the first time I a

How to understand masked multi-head attention in transformer

I'm currently studying the code of the transformer, but I cannot understand the masked multi-head attention of the decoder. The paper says that it is to prevent you from seeing the
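
The masking idea can be illustrated with a small plain-Python sketch. It assumes the usual reading of decoder masking, namely that position i may only attend to positions up to and including i; the helper names `causal_mask` and `masked_softmax` are hypothetical and not taken from any transformer codebase.

```python
import math


def causal_mask(n):
    """Lower-triangular mask: entry [i][j] is True when j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax where masked-out scores are set to -inf first."""
    weights = []
    for row, mask_row in zip(scores, mask):
        masked = [s if keep else -math.inf for s, keep in zip(row, mask_row)]
        peak = max(masked)  # finite, because the diagonal is always kept
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# With equal raw scores, each position spreads its attention evenly over
# itself and earlier positions, and gives zero weight to later ones.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
weights = masked_softmax(scores, causal_mask(3))
```

Setting the blocked scores to negative infinity before the softmax is what makes their attention weights exactly zero.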

Data augmentation in test/validation set?

It is common practice to augment data (add samples programmatically, such as random crops, etc. in the case of a dataset consisting of images) on both training

Why do we need to call zero_grad() in PyTorch?

Why does zero_grad() need to be called during training? | zero_grad(self) | Sets gradients of all model parameters to zero.
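
The reason comes down to accumulation: in PyTorch, each call to backward() adds the new gradients to whatever is already stored in .grad rather than overwriting it. The toy sketch below mimics that behaviour in plain Python; `ToyParam` is a hypothetical stand-in written for illustration, not a PyTorch class.

```python
class ToyParam:
    """A single scalar weight whose gradient accumulates across calls."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0

    def backward(self, x, target):
        # Gradient of (w * x - target) ** 2 with respect to w,
        # added to the stored gradient instead of replacing it.
        self.grad += 2.0 * x * (self.value * x - target)

    def zero_grad(self):
        # Reset the stored gradient before the next step.
        self.grad = 0.0


w = ToyParam(1.0)
w.backward(2.0, 0.0)
first = w.grad            # gradient from one pass
w.backward(2.0, 0.0)
accumulated = w.grad      # two passes without a reset: doubled
w.zero_grad()
w.backward(2.0, 0.0)
after_reset = w.grad      # back to the single-pass gradient
```

Without the reset, every optimizer step would use the sum of all gradients seen so far instead of the gradient of the current batch.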

validation and train metrics very low values (images and masks generator)

I have images (X_train) and masks (y_train). I want to train a U-Net network. I am currently using the IoU metric and the validation IoU is very low and constant

How is profit calculated in gym environment?

So I'm using the gym stocks environment to train a model using A2C policy but I want to understand how the profit is calculated by the model, in the documentati

Can I use Layer Normalization with CNN?

I see that Layer Normalization is a more modern normalization method than Batch Normalization, and it is very simple to code in TensorFlow. But I think the layer

How to get the coordinates of the bounding box in YOLO object detection?

I need to get the bounding box coordinates generated in the above image using YOLO object detection.

ran into TensorFlow ValueError during my TensorFlow Course with Udemy

I have been trying to fix the error during my TensorFlow course (Section 3: Neural network Regression with TensorFlow) with Udemy. import tensorflow as tf impo

error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1 on Colab

I cloned this repo with !git clone https://github.com/lbin/DCNv2.git and tried to build it on Google Colab, but got this error