Bachelor and Master Theses

To apply for this thesis, please contact the thesis supervisor(s).
Title: Category Theory for Mechanistic Interpretability of Neural Networks
Subject: Computer science
Level: Advanced
Description:

Deep neural networks, especially transformers, are the backbone of current large language models (LLMs) and computer vision models. However, the decision-making processes of these models remain largely opaque. Mechanistic interpretability aims to address this opacity by reverse-engineering neural networks at the sub-circuit level, identifying how individual modules, neurons, or attention heads implement specific computational functions. Category theory is a branch of mathematics centered on abstraction, structure, and composition, and it offers a rigorous language for modeling hierarchical, compositional, and modular systems.

This thesis aims to develop a category-theoretic framework for mechanistic interpretability of neural networks.
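
As a rough illustration of the compositional view, the sketch below treats network components as typed arrows between vector spaces and checks that they compose associatively and respect identities. It is only a toy under stated assumptions, not the framework the thesis is expected to produce: the names `Morphism`, `then`, `identity`, `linear`, and `relu` are hypothetical, and plain Python lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass(frozen=True)
class Morphism:
    """A network component viewed as an arrow R^dom -> R^cod (hypothetical helper)."""
    dom: int
    cod: int
    fn: Callable[[Vector], Vector]

    def __call__(self, x: Vector) -> Vector:
        if len(x) != self.dom:
            raise ValueError(f"expected input of length {self.dom}, got {len(x)}")
        return self.fn(x)

    def then(self, other: "Morphism") -> "Morphism":
        """Sequential composition: apply self first, then other."""
        if self.cod != other.dom:
            raise TypeError(f"cannot compose: codomain {self.cod} != domain {other.dom}")
        return Morphism(self.dom, other.cod, lambda x: other(self(x)))


def identity(n: int) -> Morphism:
    """Identity arrow on R^n."""
    return Morphism(n, n, lambda x: list(x))


def linear(weights: List[List[float]]) -> Morphism:
    """Linear map given as a matrix with cod rows and dom columns."""
    cod, dom = len(weights), len(weights[0])
    return Morphism(
        dom, cod,
        lambda x: [sum(w * xi for w, xi in zip(row, x)) for row in weights],
    )


def relu(n: int) -> Morphism:
    """Elementwise ReLU on R^n."""
    return Morphism(n, n, lambda x: [max(0.0, xi) for xi in x])


# Assumed toy network: R^2 -> R^3 -> R^3 -> R^1
layer1 = linear([[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]])
act = relu(3)
layer2 = linear([[1.0, 1.0, 1.0]])

# Two bracketings of the same composite; associativity says they agree.
left = layer1.then(act).then(layer2)
right = layer1.then(act.then(layer2))

x = [1.0, 2.0]
print(left(x), right(x))  # both print [4.5]
```

Composing arrows whose dimensions do not match (for example `layer2.then(layer1)`) raises a `TypeError`, which mirrors the rule that morphisms compose only when codomain and domain agree.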

Start date:
End date:
Prerequisites:
IDT supervisors: Shaibal Barua
Examiner:
Comments:

- Good knowledge of linear algebra, discrete mathematics, basic logic, and algebraic thinking
- Knowledge of neural network foundations
- Python and PyTorch

Company contact: