Category "reinforcement-learning"

Deep Reinforcement Learning - CartPole Problem

I tried to implement the simplest Deep Q-Learning algorithm. I think I've implemented it correctly, and I know that Deep Q-Learning struggles with divergence, but

How can I deal with a reinforcement learning problem when the episode length is infinite?

I am trying to create a Custom PyEnvironment for making an agent learn the optimum hour to send the notification to the users, based on the rewards received by

RL + optimization: how to do it better?

I am learning about how to optimize using reinforcement learning. I have chosen the problem of maximum matching in a bipartite graph as I can easily compute the

Parallelizing Monte Carlo Tree Search

I have a Monte Carlo Tree Search implementation that I need to optimize. So I thought about parallelizing the rollout phase. How to do that? (Is there a code ex

Reinforcement Learning applications in computer vision?

As I continued to study computer vision, I felt that RL (reinforcement learning) was used relatively less frequently in computer vision tasks, compared to the i

Tensorboard stops updating in Google Colab during learning with stable baselines

I am using PPO from Stable Baselines in Google Colab with TensorBoard activated to track the training progress, but after around 100-200K timesteps TensorBoard stops

Which OpenAI gym environment should be used to solve the shortest route problem?

I am trying to find the shortest route between two nodes using reinforcement learning. I am not sure what environment to use. I have found this particular envir

How to check the actions available in OpenAI gym environment?

When using OpenAI gym, after importing the library with import gym, the action space can be checked with env.action_space. But this gives only the size of the a

PyTorch Model Training: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

After training a PyTorch model on a GPU for several hours, the program fails with the error RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR Trainin

LSTM based policy in stable baselines3 model

I am trying to make a PPO model using the stable-baselines3 library. I want to use a policy network with an LSTM layer in it. However, I can't find such a possi

How is profit calculated in gym environment?

So I'm using the gym stocks environment to train a model using an A2C policy, but I want to understand how the profit is calculated by the model. In the documentati

Why does ep_rew_mean decrease over time?

In order to learn about reinforcement learning for optimization I have written some code to try to find the maximum cardinality matching in a graph. Not only d

AnyLogic and Alpyne library: Any issue with the Resource block?

I am trying to develop a simulation model in which actions are performed by an intelligent agent, through Reinforcement Learning, namely using the Alpyne librar

Error while defining observation space in gym custom environment

I am working on a reinforcement learning algorithm. I am very new to this and trying to get a hold of things. Player1Env looks upon a 7x6 Connect4 playing grid. I am ini

How to check out actions available in OpenAI gym environment?

It seems like the list of actions for OpenAI Gym environments is not available to check out, even in the documentation. For example, let's say you want to play

PyTorch RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This code is built up as follows: My robot takes a picture, some tf computer vision model calculates where in the picture the target object starts. This informa

Deep Q Learning **WITHOUT** OpenAI Gym

Does anyone have or know of any tutorials / courses that teach Q-learning without the use of OpenAI Gym? I'm trying to make a convolutional Q-learning model an

installing box2d gym ai

I want to train DQN on the CarRacing environment, but when I try to import it using the command below, I get an error. env = gym.make('CarRacing-v0').unwrapped Attr

A2C does not converge as the loss explodes

I'm experimenting with the Advantage Actor-Critic algorithm, and the loss explodes exponentially, like: iteration actor_loss critic_loss 17 -0.072878 0.003239 78 -25

Tensorflow and Multiprocessing: Passing Sessions

I have recently been working on a project that uses a neural network for virtual robot control. I used tensorflow to code it up and it runs smoothly. So far, I