It is well known in the machine learning community that model performance degrades over time as new data is introduced, with models getting worse at solving both new and old tasks. The same holds for security applications of machine learning, but less is known about how, when, and why this happens. In the particular case of malware classification, the rapid growth in the number of malicious files pushes the community to research machine learning models that use data more effectively. Inspired by the way humans learn, Continual Learning algorithms try to learn new tasks incrementally from data that keeps changing, without forgetting the tasks learned in the past. In this talk we will try to address the following questions: Is Continual Learning a suitable approach for security-related datasets and problems? Can Continual Learning methods learn new malware families effectively without forgetting old ones? What are the improvements in terms of speed and storage?
Maria Rigaki is a PhD student in the Department of Computer Science at the Czech Technical University (CTU), working on applications of machine learning and AI in cybersecurity. She has many years’ experience as a software developer and systems architect.