A Comprehensive Mathematical and Applied Survey of Reinforcement Learning Algorithms

Under review, 2026

Authors

Azim Akhtarshenas, Seyyed Hossein Mostafavi Tehrani, Ramin Toosi, Tohid Alizadeh, Mario Rico-Ibañez

Abstract

In the field of Artificial Intelligence (AI), Reinforcement Learning (RL) has become a central paradigm for addressing sequential decision-making problems. Despite its success, practical deployment remains challenged by sample inefficiency, instability, and scalability limitations in complex environments. While many existing surveys focus on specific RL subdomains, they often provide a fragmented perspective on the broader algorithmic landscape. This paper presents a comprehensive survey of RL algorithms, encompassing model-free value-based and policy-based methods, actor-critic architectures, model-based RL, Deep Reinforcement Learning (DRL), Multi-Agent Reinforcement Learning (MARL), and Multi-Agent Deep Reinforcement Learning (MADRL) frameworks. We begin with a concise mathematical formulation of RL grounded in Markov Decision Processes and fundamental value functions. Major algorithm families are systematically examined in terms of their core principles, mechanisms, strengths, limitations, and representative applications. To support practical implementation, algorithms are further categorized based on environment characteristics, including discrete and continuous action spaces as well as finite and infinite state spaces, and linked to widely used RL software libraries. The survey also reviews contemporary alignment strategies and hybrid approaches integrating RL with supervised, semi-supervised, unsupervised, and self-supervised learning. Finally, key open challenges and research directions are outlined, particularly regarding safety, interpretability, sample efficiency, and computational efficiency.
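As a pointer to the formulation the abstract mentions, the standard MDP setup can be sketched as follows (standard textbook notation, not taken verbatim from the paper): an MDP is a tuple $(\mathcal{S}, \mathcal{A}, P, R, \gamma)$, and the value functions satisfy the Bellman equations

```latex
% State-value function under policy \pi
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \,\middle|\, s_0 = s\right]

% Bellman expectation equation
V^{\pi}(s) = \sum_{a \in \mathcal{A}} \pi(a \mid s) \left[ R(s, a) + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\, V^{\pi}(s') \right]

% Bellman optimality equation, which value-based methods approximate
V^{*}(s) = \max_{a \in \mathcal{A}} \left[ R(s, a) + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\, V^{*}(s') \right]
```

Here $\gamma \in [0, 1)$ is the discount factor, $P(s' \mid s, a)$ the transition kernel, and $R$ the reward function; the value-based, policy-based, and actor-critic families surveyed differ mainly in which of these quantities they estimate or parameterize directly.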