Found 1492 results, showing the newest relevant preprints.

Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning

Off-policy reinforcement learning algorithms promise to be applicable in settings where only a fixed data-set (batch) of environment interactions is available and no new experience can be acquired.
2 days ago
10/10 relevant
arXiv

An anatomical substrate of credit assignment in reinforcement learning

Here we show, by combining high-throughput volume electron microscopy 2 and automated connectomic analysis3-5, that the synaptic architecture of songbird basal ganglia supports local credit assignment using a variant of the node perturbation algorithm proposed in a model of songbird reinforcement learning6,7.
3 days ago
8/10 relevant
bioRxiv

Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

The choice of the control frequency of a system has a relevant impact on the ability of reinforcement learning algorithms to learn a highly performing policy.
4 days ago
10/10 relevant
arXiv

First Order Optimization in Policy Space for Constrained Deep Reinforcement Learning

In reinforcement learning, an agent attempts to learn high-performing behaviors through interacting with the environment; such behaviors are often quantified in the form of a reward function.
5 days ago
10/10 relevant
arXiv

Reward Design for Driver Repositioning Using Multi-Agent Reinforcement Learning

This paper aims to model the multi-driver repositioning task through a mean field multi-agent reinforcement learning (MARL) approach.
5 days ago
9/10 relevant
arXiv

Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning

As our first contribution, we use this approach for goal-conditioned reinforcement learning and show that it is both efficient and does not suffer from hindsight bias in stochastic domains.
6 days ago
10/10 relevant
arXiv

Fast Reinforcement Learning for Anti-jamming Communications

This letter presents a fast reinforcement learning algorithm for anti-jamming communications which chooses the previous action with probability $\tau$ and applies $\epsilon$-greedy with probability $(1-\tau)$.
8 days ago
10/10 relevant
arXiv

Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription

ESP is further extended in this paper to sequential decision-making tasks, which makes it possible to evaluate the framework in reinforcement learning (RL) benchmarks.
8 days ago
10/10 relevant
arXiv

A Framework for End-to-End Learning on Semantic Tree-Structured Data

We evaluate our approach on several UCI benchmark datasets, including ablation and data-efficiency studies, and on a toy reinforcement learning task.
8 days ago
4/10 relevant
arXiv

Provably Convergent Policy Gradient Methods for Model-Agnostic Meta-Reinforcement Learning

We consider Model-Agnostic Meta-Learning (MAML) methods for Reinforcement Learning (RL) problems where the goal is to find a policy (using data from several tasks represented by Markov Decision Processes (MDPs)) that can be updated by one step of stochastic policy gradient for the realized MDP.
9 days ago
7/10 relevant
arXiv