Q&A with speaker Maria Rigaki and her team from Czech Technical University
Maria and her team talk about malware classification as a continual learning problem.
Maria Rigaki, a PhD student in the Department of Computer Science at the Czech Technical University (CTU), will be presenting at CyberSec & AI Prague this October. Maria’s talk will focus on research carried out with her colleagues Elnaz Babayeva (graduate in AI), and Sebastian Garcia (assistant professor at CTU).
What’s the talk you’ll be giving at CyberSec & AI Prague?
Our talk is entitled ‘Forget Me Not: Malware Classification as a Continual Learning Problem’.
Why is malware classification important?
Malware classification is important for an antivirus company because it allows malware researchers and analysts to find malware samples that belong to the same family and exhibit the same behavior. There are several approaches to the problem, but the rapid growth in the number of malicious files forces the community to research machine learning models that use data more effectively. One problem with machine learning models is that they degrade over time as new data is introduced, which makes it difficult to solve both new and old tasks effectively. This is also true for security applications of machine learning, but there is less knowledge of how, when, and why this happens.
Can you give us a taste of the research you’ll be presenting?
Inspired by the way humans learn, continual learning algorithms try to incrementally learn new tasks from data sets that are constantly changing, without forgetting the tasks learned in the past.
Our work asks whether continual learning is a suitable approach for security-related dataset problems, and whether these methods allow models to learn new malware families effectively without forgetting the old ones. We will also cover recent improvements in continual learning in terms of speed and storage.
With a new dataset, a broad comparison of models, and a newly proposed model, our work explores which machine learning approaches have the most potential for continually learning new malware families.
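To make the idea of "learning new families without forgetting the old ones" concrete, here is a minimal sketch of one common continual learning technique: rehearsal with a small replay buffer of exemplars per class. This is an illustrative toy (a nearest-centroid classifier over stored exemplars), not the model or dataset from the talk, and all names in it are hypothetical.

```python
class ReplayClassifier:
    """Toy nearest-centroid classifier with a per-family replay buffer.

    When a new malware family arrives, only a few exemplars of each old
    family are kept, so old families can still be classified later
    (i.e., they are not "forgotten") without storing all past data.
    """

    def __init__(self, buffer_per_family=5):
        self.buffer_per_family = buffer_per_family
        self.exemplars = {}  # family name -> list of feature vectors

    def learn_family(self, family, samples):
        # Store only a small exemplar set per family (the replay buffer).
        self.exemplars[family] = samples[: self.buffer_per_family]

    def _centroid(self, vectors):
        n = len(vectors)
        return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

    def predict(self, x):
        # Nearest centroid over all families seen so far, old and new.
        best, best_dist = None, float("inf")
        for family, samples in self.exemplars.items():
            c = self._centroid(samples)
            d = sum((a - b) ** 2 for a, b in zip(x, c))
            if d < best_dist:
                best, best_dist = family, d
        return best


# Hypothetical usage: learn two families sequentially; the first one
# is still recognized after the second is learned.
clf = ReplayClassifier()
clf.learn_family("familyA", [[0.0, 0.1], [0.1, 0.0]])
clf.learn_family("familyB", [[9.9, 10.0], [10.1, 10.2]])
```

Real continual learning research replaces the centroid rule with a neural network and studies exactly the trade-offs the talk mentions: how large the buffer must be (storage) and how much replay slows training (speed).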
As a student at CTU, can you shed light on an academic’s point of view regarding the fields of cybersecurity and AI?
Cybersecurity and AI are major areas of interest right now. There’s a lot of ‘hype’ surrounding AI, with a lot of students pursuing AI specialization. Companies have a sharper focus on AI engineering, and a lot of research happens within these companies. So, we are seeing a huge growth in career opportunities as well.
As a field of interest, how would you say cybersecurity and AI are spreading throughout Europe?
We’ve seen a growth in cybercrime activities within the past ten years, which is why there has also been a steady growth in cybersecurity as a response. We can see this growth in companies, in the type of research being carried out, and in the changing of EU laws and regulations.
In general, the use of machine learning and AI techniques in security has started to grow in the past three to five years, especially in the US. Europe, however, has been catching up rapidly in recent years.
Any hopes for the future of machine learning, AI, and cybersecurity?
We hear about many companies suffering from data leaks and breaches. We are hoping to see more uses of privacy-preserving algorithms in the future. This will allow us to use machine learning without direct access to our personal data, for example by using localized models or even encrypted data.
What about with AI in particular?
I know many people have concerns about the future of AI, but I really think AI has the capacity to benefit society and will make life easier for us. We will be able to shift our focus towards more difficult and creative tasks, rather than be tied to mundane work which can be automated instead.
Any concerns for our future?
With machine learning algorithms, the human element is lacking. People don’t understand that algorithms can have built-in bias. They assume that, since algorithms are based on data, they represent some sort of mathematical truth. But algorithms can reflect the bias of those that programmed them. I fear people may put excessive trust in these algorithms, allowing for decisions which could have a very negative impact on society and the lives of individuals.
Maria Rigaki is a PhD Student in the Department of Computer Science at the Czech Technical University (CTU), working on applications for machine learning and AI in cybersecurity. She has many years’ experience as a software developer and systems architect.
Elnaz Babayeva is a recent graduate majoring in AI from CTU and is currently working as a machine learning researcher in the AI & Security group at Avast. In her spare time, she enjoys teaching kids about STEM subjects and programming with wITches.cz and volunteers with the Women in Tech Fund.
Sebastian Garcia is director of the Stratosphere Laboratory, assistant professor at CTU, and co-founder of the Women in Tech Fund. He loves network security and machine learning for malware detection.