Title: | Explaining and exploring models with LLMs |
Subject: | Software engineering |
Level: | Advanced |
Description: |
The complexity of modern software systems makes it increasingly difficult to manage development in a code-centric way. Model-based software engineering (MBSE) proposes shifting the focus towards model-centric approaches: models provide abstractions (i.e. simplified views of the system) that are easier to work with and that improve communication, especially in their graphical form. Moreover, because models are machine-readable, they enable the automation of development-related tasks, such as property analysis and code generation. Despite the advantages of adopting abstractions in MBSE, the essential complexity of a system cannot be removed; that is, complex problems remain complex to deal with [1]. As a consequence, industrial models tend to grow in size and complexity, often reaching thousands of elements, hierarchies, nesting layers, etc. All of this makes model contents considerably harder to understand, especially for stakeholders not directly involved in creating them.
The advent of LLMs has brought many interesting automation and support features to software engineering in general and software development in particular. For example, there exist summarization solutions that, given source code as input, can generate descriptions of the code at different levels of detail, e.g. a single line of code, a function block, a module, etc.
This thesis project aims to investigate the capabilities of LLMs to take models as input and generate a summary or an explanation, or to support concept exploration. In particular, the thesis should explore:
- what input format should the models be provided in?
- which prompts are adequate to obtain proper results?
- can LLMs handle these tasks out of the box, or is fine-tuning required?
The thesis is expected to explore the current state of the art in LLM-based modelling support and to locate an appropriate dataset to be used for validation purposes. In this respect, the thesis might be limited to particular modelling languages and their subsets.
[1] Brooks, "No Silver Bullet Essence and Accidents of Software Engineering," in Computer, vol. 20, no. 4, pp. 10-19, April 1987, doi: 10.1109/MC.1987.1663532. |
Start date: | 2025-01-01 |
End date: | 2025-06-08 |
Prerequisites: |
- knowledge of software modelling is required;
- knowledge of a programming language is required (preferably Java, Python, or similar);
- basic knowledge of LLMs is required;
- attendance of a modelling course is considered a plus. |
IDT supervisors: | Antonio Cicchetti |
Examiner: | |
Comments: |
This thesis can be done in a pair |
Company contact: |
This thesis is done in collaboration with Riccardo Rubei (riccardo.rubei@mdu.se)