Machine Learning
Often, in the study of mathematics we encounter problems which ask of us optimal behavior in a given environment. These questions are especially prevalent in the setting of games, where given a particular environment, the player (agent) must determine what the optimal move is to secure victory. In the setting of a single player game, this problem can reduce to the that of pathfinding, where the player is asked to find a (or the optimal) route between a starting and target configuration of the game space.
Many techniques exist to resolve pathfinding problems, but none adequately address the complexity of tasks such as multi-agent pathfinding. It is known that optimal MAPF is NP-hard, and further that for single-agent pathfinding, no better optimal solution exists outside of Djikstra's algorithm. Thus, it is desirable to investigate solutions which are approximately optimal through machine learning. That is, we seek to develop methods which produce "good enough" solutions, through methods which become better and better over time, as they approach optimality.
One such method we examine is that of reinforcement learning. Therein, an agent within an environment must devise an optimal policy which takes as input a state and returns an action. Over time, the policy approaches the optimal solution, given the underlying environment. We apply reinforcement learning towards the purpose of solving cubical sliding puzzles, an environment where dimension and ruleset can quickly cause optimal solutions to scale out of control. The cubical puzzle has the additional snag that the underlying space of possible moves changes given the moves made, due to the k-rule. Thus, existing complete methods for solving pathfinding problems, (that is, methods which are certain to solve problems which can be solved, but perhaps not optimally) all break within the environment of the cubical sliding puzzle. We provide methods using evolutionary algorithms and reinforcement learning to provide approximately optimal solutions to the cubical sliding puzzle, then measure their performance against each other and the optimal A* pathfinding method. We find that both methods solve the cubical puzzle where A* cannot, and do so in very reasonable time, with solutions approaching optimality.
- Approximately optimal search on a higher-dimensional sliding puzzle. Nono SC Merleau, Miguel O'Malley, Sayan Mukherjee, Erika Roldan. 2024.