Bachelor and Master Theses

To apply for this thesis project, please contact the thesis supervisor(s).
Title: Edge-deployable tiny machine learning models for time series data
Subject: Computer science, Applied Artificial Intelligence
Level: Advanced
Description:

Time series analysis plays a crucial role in various applications, including human activity recognition, process and quality control, and systems monitoring [1].

This analysis encompasses tasks such as time series classification [2], extrinsic regression [3], and forecasting [4]. 

Time series classification involves predicting categorical class labels for data consisting of ordered sets of real-valued attributes. In contrast, time series extrinsic regression aims to predict numerical values.

Time series forecasting leverages historical and current data to predict future values over time. 
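
To make these three tasks concrete, the short sketch below shows typical input/output shapes. The array sizes and the NumPy setup are illustrative assumptions, not part of the proposal.

    import numpy as np

    # Hypothetical batch of multivariate time series: 8 series, 3 channels, 128 steps.
    X = np.random.randn(8, 3, 128)

    # Classification: each series maps to one categorical label (e.g. 4 activity classes).
    y_cls = np.random.randint(0, 4, size=8)

    # Extrinsic regression: each series maps to one real-valued target.
    y_reg = np.random.randn(8)

    # Forecasting: split each series into an observed history and an H-step future.
    H = 24
    history, future = X[..., :-H], X[..., -H:]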

Performing well on these tasks requires a reliable and accurate model.

Machine learning models, particularly Recurrent Neural Networks (RNNs) [5], Long Short-Term Memory (LSTM) networks [6], and Transformers [7], have demonstrated exceptional capabilities in modelling long-range dependencies and interactions within sequential data, making them well suited for time series analysis.

Transformers can capture long-range dependencies and richer contextual information, giving them greater learning capacity than RNNs and LSTMs.

Despite this strong performance, however, transformer models typically cannot be deployed on resource-constrained edge devices due to their large number of parameters.

With the increasing number of IoT devices producing time series data, it has become imperative to scale down state-of-the-art deep learning (DL) models so that they fit on edge devices.

It is therefore important to create a framework that can efficiently shrink DL models without significantly degrading their performance.

 

Many model compression techniques have been proposed to accelerate deep model inference and reduce model size while maintaining accuracy. The most commonly used include quantization [8], model weight pruning [9], neural architecture search [10], and knowledge distillation (KD) [11].

Model weight pruning removes connections from the model, eliminating redundant parameters while preserving performance. Quantization reduces the bit-width of model parameters (e.g., from 32-bit floating point to 8-bit integers), shrinking the model and speeding up inference with little loss in accuracy.
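
As an illustration, the sketch below applies both techniques in PyTorch: unstructured magnitude pruning followed by post-training dynamic quantization. The small LSTM classifier, the 50% sparsity level, and the int8 target are assumptions chosen for the example, not a prescribed design.

    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # Hypothetical small model standing in for the network to be compressed.
    class TinyClassifier(nn.Module):
        def __init__(self, n_channels=3, n_classes=4):
            super().__init__()
            self.lstm = nn.LSTM(n_channels, 32, batch_first=True)
            self.head = nn.Linear(32, n_classes)

        def forward(self, x):                  # x: (batch, time, channels)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])       # classify from the last hidden state

    model = TinyClassifier()

    # Pruning: zero out the 50% smallest-magnitude weights of the output layer.
    prune.l1_unstructured(model.head, name="weight", amount=0.5)
    prune.remove(model.head, "weight")         # make the sparsity permanent

    # Quantization: int8 weights for Linear and LSTM layers, applied post-training.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear, nn.LSTM}, dtype=torch.qint8
    )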

Knowledge distillation transfers knowledge from a large, pre-trained teacher model to a smaller student model by training the student to match the teacher's outputs, maintaining performance at a fraction of the size. Neural Architecture Search (NAS) methods design compact models by constraining the search space to small architectures and optimizing them to preserve performance.
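
For KD in particular, the soft-target loss of Hinton et al. [11] fits in a few lines. The sketch below is one common formulation; the temperature T and mixing weight alpha are tunable hyperparameters, not fixed choices.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        # Soft-target term: KL divergence between temperature-softened distributions.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)                            # T^2 rescaling as in [11]
        # Hard-target term: ordinary cross-entropy against the true labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard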

The primary goal of this proposal is to explore and implement an efficient framework for scaling down DL models for time series analysis, specifically tailored for deployment on IoT edge devices. We aim to investigate the most suitable model size reduction methods from the aforementioned options—knowledge distillation, NAS, model pruning, and quantization.

Our methodology will involve a systematic investigation of these methods, benchmarking their performance on various time series analysis tasks, and evaluating their suitability for IoT edge devices. We will conduct experiments on real-world time series datasets to quantify the trade-offs between model size reduction and predictive accuracy.
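
As a starting point for that evaluation, two hypothetical helpers like the ones below could quantify both sides of the trade-off: serialized model size and test-set accuracy. The names and details are illustrative, not a fixed part of the methodology.

    import os
    import torch

    def model_size_mb(model, path="tmp_model.pt"):
        # Serialized state_dict size on disk: a simple proxy for deployment footprint.
        torch.save(model.state_dict(), path)
        size = os.path.getsize(path) / 1e6
        os.remove(path)
        return size

    def accuracy(model, loader, device="cpu"):
        # Fraction of correctly classified examples in an (x, y) evaluation loader.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in loader:
                pred = model(x.to(device)).argmax(dim=-1)
                correct += (pred == y.to(device)).sum().item()
                total += y.numel()
        return correct / total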

 

[1] Foumani, Navid Mohammadi, et al. "Deep learning for time series classification and extrinsic regression: A current survey." arXiv preprint arXiv:2302.02515 (2023).

[2] Ismail Fawaz, Hassan, et al. "Deep learning for time series classification: a review." Data mining and knowledge discovery 33.4 (2019): 917-963.

[3] Tan, Chang Wei, et al. "Time series extrinsic regression: Predicting numeric values from time series data." Data Mining and Knowledge Discovery 35 (2021): 1032-1060.

[4] Zhang, Yunhao, and Junchi Yan. "Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting." The Eleventh International Conference on Learning Representations. 2023.

[5] Rangapuram, Syama Sundar, et al. "Deep state space models for time series forecasting." Advances in neural information processing systems 31 (2018).

[6] Lai, Guokun, et al. "Modeling long- and short-term temporal patterns with deep neural networks." The 41st international ACM SIGIR conference on research & development in information retrieval. 2018.

[7] Zhou, Tian, et al. "Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting." International Conference on Machine Learning. PMLR, 2022.

[8] Kim, Dahyun, Kunal Pratap Singh, and Jonghyun Choi. "Learning architectures for binary networks." European conference on computer vision. Cham: Springer International Publishing, 2020.

[9] Diao, Enmao, et al. "Pruning deep neural networks from a sparsity perspective." arXiv preprint arXiv:2302.05601 (2023).

[10] Lin, Ji, et al. "Mcunet: Tiny deep learning on IoT devices." Advances in Neural Information Processing Systems 33 (2020): 11711-11722.

[11] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).

 

 

Start date:
End date:
Prerequisites:
  1. Ability to read scientific literature.
  2. Knowledge of Python.
  3. Familiarity with machine learning.
IDT supervisors: Hamid Mousavi, Masoud Daneshtalab
Examiner: Masoud Daneshtalab
Comments:
Company contact: