PhD Student: Ocean Wu (Current). MS Data Science: Johnson Zhou (2021). Postdoc: Thomas Lacombe (2019).
Many applications deal with data streams. A data stream is a continuous sequence of data instances, often arriving at a high rate. The underlying data distribution of a stream may change over time, which degrades the predictive ability of machine learning models trained on earlier data. This phenomenon is known as concept drift.
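To make the effect of concept drift concrete, here is a minimal sketch in plain Python. It is a toy illustration, not part of scikit-ika: the synthetic stream, the fixed classifier, and the names `make_stream`, `static_model`, and `windowed_accuracy` are all assumptions made for this example. The labelling rule flips halfway through the stream, and a model fitted to the first concept is scored over consecutive windows.

```python
import random
from collections import deque


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the labelling rule flips at index `drift_at`."""
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        y = int(x > 0.5) if i < drift_at else int(x <= 0.5)
        yield x, y


def static_model(x):
    """A classifier fitted to the first concept and never updated."""
    return int(x > 0.5)


def windowed_accuracy(stream, model, window=100):
    """Return the model's accuracy over each consecutive window of instances."""
    recent = deque(maxlen=window)
    accuracies = []
    for i, (x, y) in enumerate(stream, start=1):
        recent.append(model(x) == y)
        if i % window == 0:
            accuracies.append(sum(recent) / window)
    return accuracies


accuracies = windowed_accuracy(make_stream(1000, drift_at=500), static_model)
print(accuracies)  # five windows at 1.0, then five windows at 0.0
```

The abrupt fall in windowed accuracy is the signal a drift detector looks for; in real streams the change is usually noisier and may be gradual.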
Moreover, previously seen concepts often recur in real-world data streams, a phenomenon known as recurrent concept drift. If a concept reappears (for example, a particular weather pattern), a classifier learned for it earlier can be reused, which improves the performance of the learning algorithm.
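The reuse idea can be sketched with a small pool of stored classifiers. This is a simplified illustration under stated assumptions, not the scikit-ika API or the method used in the project: the `Stump` model, the thresholds, and the `run` loop are hypothetical. When the active model's accuracy on a sliding window falls below a threshold, the learner collects a fresh window, checks whether a stored model already fits it, and trains a new model only if none does.

```python
import random
from collections import deque


class Stump:
    """Toy classifier: predicts 1 on one side of 0.5; `polarity` picks the side."""

    def __init__(self, polarity):
        self.polarity = polarity

    def predict(self, x):
        return int(x > 0.5) if self.polarity else int(x <= 0.5)


def accuracy(model, window):
    return sum(model.predict(x) == y for x, y in window) / len(window)


def train(window):
    """Fit a new stump by choosing the polarity that best matches the window."""
    return max((Stump(True), Stump(False)), key=lambda m: accuracy(m, window))


def make_stream(seed=0):
    """Concept A for 300 instances, concept B for 300, then concept A again."""
    rng = random.Random(seed)
    for i in range(900):
        x = rng.random()
        concept_a = i < 300 or i >= 600
        y = int(x > 0.5) if concept_a else int(x <= 0.5)
        yield x, y


def run(stream, window_size=50, drift_threshold=0.5, reuse_threshold=0.9):
    pool, active, log = [], None, []
    window = deque(maxlen=window_size)
    for i, (x, y) in enumerate(stream):
        window.append((x, y))
        if len(window) < window_size:
            continue  # wait until a full window is available
        if active is not None:
            if accuracy(active, window) >= drift_threshold:
                continue  # active model still fits recent data
            # Drift detected: drop the mixed window and gather fresh data.
            active = None
            window.clear()
            continue
        # No active model: try stored models before training a new one.
        best = max(pool, key=lambda m: accuracy(m, window), default=None)
        if best is not None and accuracy(best, window) >= reuse_threshold:
            active = best
            log.append((i, "reused"))
        else:
            active = train(window)
            pool.append(active)
            log.append((i, "trained"))
    return log, pool


log, pool = run(make_stream())
print(log, len(pool))
```

On this synthetic stream the learner trains one model per distinct concept and reuses the first model when concept A returns, so the pool ends with two models rather than three. A practical system would also keep predicting while it gathers the fresh window and would use a statistically grounded drift detector instead of a fixed accuracy threshold.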
Scikit-ika is an open-source implementation of methods for handling recurrent concept drifts. It continuously models evolving data streams and provides accurate predictions in real time, using probabilistic networks and meta-information to proactively predict changes in the data stream. As stated in the initial proposal, the code developed for this project is available on GitHub and released as part of an open-source Python library: https://scikit-ika.github.io/.
This project is funded by ONRG Global.
Supervisors: Assoc Prof Yun Sing Koh, Prof Gillian Dobbie