Title: | Category Theory for Mechanistic Interpretability of Neural Networks |
Subject: | Computer science |
Level: | Advanced |
Description: |
Deep neural networks, especially transformers, are the backbone of current large language models (LLMs) and computer vision models. However, the decision-making processes of these models remain largely opaque. Mechanistic interpretability aims to address this opacity by reverse-engineering neural networks at the sub-circuit level, identifying how individual modules, neurons, or attention heads implement specific computational functions. Category theory is a branch of mathematics centered on abstraction, structure, and composition, and it offers a rigorous language for modeling hierarchical, compositional, and modular systems. This thesis aims to develop a category-theoretic framework for the mechanistic interpretability of neural networks (a minimal code sketch of this compositional view is given after the listing below). |
Start date: | |
End date: | |
Prerequisites: | |
IDT supervisors: | Shaibal Barua |
Examiner: | |
Comments: |
- Good knowledge of linear algebra, discrete mathematics, basic logic, and algebraic thinking
- Knowledge of neural network foundations
- Python and PyTorch |
Company contact: |
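
The sketch below is not part of the proposal; it is a minimal Python/PyTorch illustration of the compositional view the description refers to: layers are treated as morphisms between objects (vector spaces identified by their dimension), stacking layers is morphism composition, and nn.Identity plays the role of the identity morphism. The names Morphism, compose, and identity are illustrative assumptions, not an established API.

```python
# Illustrative sketch: neural-network layers as morphisms in a category.
# Objects are vector spaces, identified here by their dimension; a layer
# is a morphism dom -> cod, and stacking layers is composition.

import torch
import torch.nn as nn


class Morphism:
    """A neural-network layer viewed as a morphism dom -> cod."""

    def __init__(self, module: nn.Module, dom: int, cod: int):
        self.module, self.dom, self.cod = module, dom, cod

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return self.module(x)


def compose(g: Morphism, f: Morphism) -> Morphism:
    """Categorical composition g ∘ f, defined only when cod(f) == dom(g)."""
    assert f.cod == g.dom, "morphisms are not composable"
    # nn.Sequential applies f.module first, then g.module: exactly g ∘ f.
    return Morphism(nn.Sequential(f.module, g.module), f.dom, g.cod)


def identity(dim: int) -> Morphism:
    """Identity morphism on the object of dimension `dim`."""
    return Morphism(nn.Identity(), dim, dim)


if __name__ == "__main__":
    f = Morphism(nn.Linear(4, 8), dom=4, cod=8)
    g = Morphism(nn.Linear(8, 2), dom=8, cod=2)

    x = torch.randn(1, 4)
    h = compose(g, f)          # g ∘ f : 4 -> 2
    print(h(x).shape)          # torch.Size([1, 2])

    # Identity law: composing with the identity changes nothing.
    assert torch.allclose(compose(f, identity(4))(x), f(x))
```

The point of the sketch is that composability becomes a checked property (the assert on matching dimensions): a circuit-level interpretability analysis can then aim to track how interpretable functions compose along the same morphisms, which is the kind of structure the proposed framework would make precise.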