Title: A systematic review on theoretical aspects of k-nearest neighbour algorithm
Subject: Computer science
Level: Basic/Advanced
Description: One of the simplest and oldest machine learning algorithms is Nearest neighbour or commonly known as k-nearest neighbour (k-NN) algorithm. k-NN is useful for a range of applications for example as a part of predictive analytics tools, as case retrieval method in case-based reasoning system, uses in pattern recognition, uses in ensemble method for matrix completion, time series classification, etc. In theoretical machine learning, K-NN is the universally consistent classifier; many theoretical and statistical properties have been proven over the years that makes it one of the powerful tools to solve machine learning problems in practice. However, the understanding of this method is still evolving and the performance of this method under various settings and different ranges of application domains have been giving insights to it and still requires to explore e.g., multiclass classification.

This thesis work will be based on systematic review of theoretical aspects of k-NN algorithm. Understanding of statistical and mathematical framework of statistical learning theory for k-NN and future direction and use of k-NN under various settings e.g., multiclass classification are required. This should also cover theoretical insights into generalization and practical tradeoffs.

1. Beyer K., Goldstein J., Ramakrishnan R., Shaft U. (1999) When Is “Nearest Neighbor” Meaningful?. In: Beeri C., Buneman P. (eds) Database Theory — ICDT’99. ICDT 1999. Lecture Notes in Computer Science, vol 1540. Springer, Berlin, Heidelberg
2. HINNEBURG, Alexander, Charu C. AGGARWAL, Daniel A. KEIM, 2000. What is the nearest neighbor in high dimensional spaces?. 26th Internat. Conference on Very Large Databases. Cairo, Egypt, 2000. In: Proc. of the 26th Internat. Conference on Very Large Databases, Cairo, Egypt, 2000, pp. 506-515
3. Devroye, L., Györfi, L., & Lugosi, G. (2013). A probabilistic theory of pattern recognition (Vol. 31). Springer Science & Business Media.
4. Shalev-Shwartz, Shai; Ben-David, Shai. Chapter 19 in Understanding machine learning : from theory to algorithms. New York, NY : Cambridge University Press , 2014 ISBN: 978-1-107-05713-5
Start date:
End date:
Prerequisites: 1. Probability theory and Set theory
2. Enthusiastic to understand statistical learning theory aspects of machine learning
IDT supervisors: Shaibal Barua
Examiner: Shahina Begum