Bachelor and Master Theses

To apply for this thesis, please contact the thesis supervisor(s).
Title: Model Checked Reinforcement Learning for Multi-Agent Planning
Subject: Software engineering, Computer science, Robotics
Level: Advanced
Description:

Overall Description

Agents are systems that can move, operate, and collaborate with others (e.g., humans or machines) without human intervention. Self-driving cars, drones, robots, and autonomous construction equipment are real-world examples of agents.

Agents are often designed to accomplish missions in a group, such as robots delivering goods in a warehouse or wheel loaders digging and transporting stones in a quarry. To accomplish their missions, they need to collaborate and cooperate with each other. For example, in Fig. 1, a small group of robots (the blue ones) is assigned to a specific region of an area (e.g., a warehouse) and is responsible for collecting goods within that region and transporting them to a port that connects the region to the next one. The other group of robots (the yellow ones) is responsible for collecting the goods from the port and carrying them to the destination.

Fig 1. An example of multi-agent systems

Multi-agent planning (MAP) is about planning the agents' actions in an optimal manner so that they accomplish their missions efficiently. In the example shown in Fig. 1, MAP means planning the robots' movements and the order in which they pick up goods at different points so that all the goods are delivered to the destination as quickly as possible.

Path planning is a typical MAP problem, for which many algorithms exist, such as Dijkstra's algorithm and A*. Reinforcement learning, a branch of machine learning, is an increasingly accepted approach to planning. It uses simulation and learning algorithms, e.g., Q-learning, to train the agents in an environment such that they learn how to behave in order to accomplish their missions. The resulting mission plan can be a table of state-action pairs or a neural network that takes states as input and produces actions as output.
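To make the training loop concrete, below is a minimal sketch of tabular Q-learning for a single agent on a toy grid world. The grid size, rewards, and hyperparameters are illustrative assumptions for this example only and are not taken from MALTA or MCRL.

    import random

    # Illustrative assumptions: a 4x4 grid, the agent starts at (0, 0) and must
    # reach the goal at (3, 3); none of these values come from MALTA or MCRL.
    SIZE = 4
    GOAL = (3, 3)
    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def step(state, action):
        """Apply an action, clamping to the grid, and return (next_state, reward, done)."""
        x, y = state
        dx, dy = ACTIONS[action]
        nx = max(0, min(SIZE - 1, x + dx))
        ny = max(0, min(SIZE - 1, y + dy))
        if (nx, ny) == GOAL:
            return (nx, ny), 10.0, True   # reward for reaching the goal
        return (nx, ny), -1.0, False      # small cost per step encourages short paths

    # Q-table: one value per (state, action) pair.
    Q = {((x, y), a): 0.0 for x in range(SIZE) for y in range(SIZE) for a in range(4)}
    alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

    for episode in range(2000):
        state, done = (0, 0), False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                action = random.randrange(4)
            else:
                action = max(range(4), key=lambda a: Q[(state, a)])
            next_state, reward, done = step(state, action)
            # Standard Q-learning update rule.
            best_next = max(Q[(next_state, a)] for a in range(4))
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state

    # The learned plan can be read off as a state-action table (the greedy policy).
    policy = {(x, y): max(range(4), key=lambda a: Q[((x, y), a)])
              for x in range(SIZE) for y in range(SIZE)}
    print(policy[(0, 0)])

In MAP, the same idea is applied to the joint behavior of several agents, and the Q-table can be replaced by a neural network when the state space becomes too large to enumerate.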

Safety is another important factor that one needs to consider in MAP. Freedom from collisions is one typical safety requirement. Never getting stuck in a starvation situation is another example, meaning that robots should never end up waiting for each other such that a deadlock occurs. Guaranteeing that a MAP solution satisfies its safety requirements is a difficult problem, but it is as important as planning itself, because agents are often safety-critical: a minor error may cause catastrophic consequences. Formal methods, e.g., model checking, are mathematically based techniques that provide a way of proving that the target system satisfies its requirements and, in that sense, is free of bugs.
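In UPPAAL, such requirements are typically written as safety queries; for instance, a query of the form A[] not deadlock states that no reachable state of the model is deadlocked. As a much simpler, non-exhaustive illustration of what collision freedom means for concrete plans (this is a plain trace check, not model checking), consider the following sketch with made-up plan traces:

    # A simplified illustration (not model checking): given the planned paths of two
    # robots as synchronized sequences of grid cells, check that they never occupy
    # the same cell at the same time step. The plan traces below are made up.
    def collision_free(path_a, path_b):
        """Return True if the two synchronized paths never share a cell at the same step."""
        for step, (cell_a, cell_b) in enumerate(zip(path_a, path_b)):
            if cell_a == cell_b:
                print(f"Collision at step {step}: both robots in cell {cell_a}")
                return False
        return True

    # Hypothetical plans: robot A moves right along row 0, robot B moves up column 2.
    plan_a = [(0, 0), (0, 1), (0, 2), (0, 3)]
    plan_b = [(2, 2), (1, 2), (0, 2), (0, 3)]
    print(collision_free(plan_a, plan_b))  # False: both robots are in (0, 2) at step 2

A model checker such as UPPAAL goes much further: it explores all reachable states of a formal model of the agents and the environment, rather than a single recorded trace.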

Content of the thesis

In this thesis, you will work on a platform for MAP named MALTA (https://github.com/rgu01/MALTA). As depicted in Fig. 2, the platform has a front end for visualizing the environment and the mission plans for the users, a middleware for calculating paths and generating models, and a back end for task scheduling.

 


Fig 2. The structure of MALTA

Recently, we have proposed a novel method named MCRL that combines Q-learning as the planning algorithm for MAP with a state-of-the-art model checker, UPPAAL5, as the verifier of the resulting plans. In this thesis, you will continue the research on this method and implement your approach in MALTA. The tasks are as follows:

  • Read the latest literature on MAP and understand the frontier of this field.
  • Train neural-network-based plans (possible training tools include PyTorch, TensorFlow, and OpenNN); a sketch of such a network is shown after this list.
  • Verify the resulting plans by using formal verifiers, such as UPPAAL5 and Marabou. 
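
As a hypothetical starting point for the second and third tasks, the sketch below defines a small feed-forward policy network in PyTorch that maps a state vector to one score per action and exports it to ONNX, a format that neural-network verifiers such as Marabou can load. The state dimension, layer sizes, action set, and file name are placeholder assumptions, not MALTA's or MCRL's actual interface.

    import torch
    import torch.nn as nn

    # Hypothetical sizes: a 4-dimensional state (e.g., the grid positions of two
    # robots) and 4 discrete actions; these are placeholders, not MALTA's format.
    STATE_DIM, NUM_ACTIONS = 4, 4

    # A small feed-forward policy network: state in, one score per action out.
    policy = nn.Sequential(
        nn.Linear(STATE_DIM, 32),
        nn.ReLU(),
        nn.Linear(32, NUM_ACTIONS),
    )

    state = torch.tensor([[0.0, 0.0, 3.0, 3.0]])   # example state vector (batch of 1)
    action = policy(state).argmax(dim=1).item()     # pick the highest-scoring action
    print(action)

    # One possible hand-off to a verifier: export the (trained) network to ONNX,
    # a format that neural-network verifiers such as Marabou can load.
    torch.onnx.export(policy, state, "policy.onnx")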

Contact

Rong Gu (rong.gu@mdu.se)

 

Start date:
End date:
Prerequisites:
  • Experience in programming, especially in C/C++, Java, and C#.
  • Experience in modeling.
  • Knowledge of neural networks.
IDT supervisors: Rong Gu
Examiner: Cristina Seceleanu
Comments:
Company contact: