\documentclass[conference]{IEEEconf}
%\input epsf
\usepackage{graphicx}
\usepackage{multirow}
\usepackage[toc,acronym,abbreviations,nonumberlist,nogroupskip]{glossaries-extra}

\renewcommand\thesection{\arabic{section}} % arabic numerals for the sections
\renewcommand\thesubsectiondis{\thesection.\arabic{subsection}.}% arabic numerals for the subsections
\renewcommand\thesubsubsectiondis{\thesubsectiondis.\arabic{subsubsection}.}% arabic numerals for the subsubsections

\newcommand\agd[1]{{\color{red}$\bigstar$}\footnote{agd: #1}}
\newcommand\SF[1]{{\color{blue}$\bigstar$}\footnote{sf: #1}}
\newcommand{\cn}{{\color{purple}[citation needed]}}
\newcommand{\pv}{{\color{orange}[passive voice]}}
\newcommand{\wv}{{\color{orange}[weak verb]}}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor IEEEconf}
\begin{document}
\input{acronyms}
\title{\textbf{\Large MAD: One-Shot Machine Activity Detector for Physics-Based Cyber Security\\}}

\author{Arthur Grisel-Davy$^{1,*}$, Sebastian Fischmeister$^{2}$\\
\normalsize $^{1}$University of Waterloo, Ontario, Canada\\
\normalsize agriseld@uwaterloo.ca, sfishme@uwaterloo.ca\\
\normalsize *corresponding author
}

%+++++++++++++++++++++++++++++++++++++++++++

% use only for invited papers
%\specialpapernotice{(Invited Paper)}

% make the title area
\maketitle
\begin{abstract}
Side-channel analysis offers several advantages over traditional machine monitoring methods.
Low intrusiveness, independence from the host, data reliability, and resistance to bypass are compelling arguments for using involuntary emissions as input for security policies.
However, side-channel information often comes as unlabeled time series that represent a proxy variable of the activity.
Defining and enforcing high-level security policies requires extracting the state or activity of the system from these time series.
This paper presents \gls{mad}, a novel one-shot time series classifier designed and evaluated specifically for side-channel analysis.
\gls{mad} outperforms traditional state detection solutions in terms of accuracy and, just as importantly, Levenshtein distance of the state sequence.
\end{abstract}
\IEEEoverridecommandlockouts
\vspace{1.5ex}
\begin{keywords}
\itshape side-channel analysis; intrusion detection; time series; one-shot classification; activity detection
\end{keywords}
% no keywords

% For peer review papers, you can put extra information on the cover
% page as needed:
% \begin{center} \bfseries EDICS Category: 3-BBND \end{center}
%
% for peerreview papers, inserts a page break and creates the second title.
% Will be ignored for other modes.
\IEEEpeerreviewmaketitle

\section{Introduction}
\glspl{ids} leverage different types of data to detect intrusions.
On one side, most solutions use labeled and actionable data, often provided by the system under protection.
In the software world, this data can be the resource usage \cite{1702202}, program source code \cite{9491765}, or network traffic \cite{10.1145/2940343.2940348} leveraged by an \gls{hids} or \gls{nids}.
In the machine monitoring world, the input data can be the shape of a gear \cite{wang2015measurement} or the throughput of a pump \cite{gupta2021novel}.
On the other side, some methods consider only information that the system did not intentionally provide.
The system emits these by-products of its activity through physical media called side channels.
Common side-channel information for an embedded system includes power consumption \cite{yang2016power} and electromagnetic fields \cite{chawla2021machine}.
For a production machine, common side-channel information includes vibrations \cite{zhang2019numerical} and the chemical composition of fluids \cite{4393062}.

Side-channel information offers compelling advantages over agent-collected information.
First, the information is difficult to forge.
Because the monitored system is not involved in the data retrieval process, an attacker who has compromised the system cannot easily send forged information.
For example, if an attacker performs any computation on the system, as is the case in most attacks, that computation unavoidably affects several side channels.
Second, the side-channel information retrieval process is often non-intrusive and non-disruptive for the monitored system.
Measuring the power consumption of a computer or the vibrations of a machine requires neither the cooperation nor the modification of the system \cite{10.1145/2976749.2978353}.
This host-independence property is crucial for safety-critical or high-availability applications, as the failure of either system, monitored or monitoring, does not affect the other.
These two properties, reliable data and host-independence, set physics-based monitoring solutions apart with distinct advantages and use cases.

However, using side-channel data introduces new challenges.
One obstacle to overcome when designing a physics-based solution is the interpretation of the data.
Because the data collection consists of measuring a physical phenomenon, the input data is often a discrete time series.
The values in these time series are not directly actionable.
In some cases, a threshold value is enough to assess the integrity of the system, and comparing each value of the time series to that threshold is sufficient \cite{jelali2013statistical}.
However, whenever a simple threshold is not a reliable decision factor, a more advanced analysis of the time series is required to make it actionable.
The state of a machine is often represented by a specific pattern.
This pattern could be, for example, a succession of specific amplitudes or a frequency/average pair for periodic processes.
A simple threshold method cannot reliably detect these patterns.
Identifying the occurrence and position of these patterns makes the data actionable and enables higher-level security and monitoring policies, i.e., policies that work at a higher level of abstraction \cite{tongaonkar2007inferring}.
For example, a computer starting in the middle of the night or rebooting multiple times in a row should raise an alert for a possible intrusion or malfunction.

Rule-based \glspl{ids} using side-channel information require an accurate and practical pattern detection solution.
Many data-mining algorithms assume that training data is cheap, meaning that acquiring large labeled datasets is achievable without major expense.
Unfortunately, collecting labeled data requires following a dedicated procedure and induces downtime for the machine, which can be expensive.
Collecting many training samples during normal operations of the machine is more time-consuming, as the machine's activity cannot be controlled.
A single sample of each pattern to be detected in the time series is a more convenient data requirement.
Collecting such a sample is possible during normal operations of the machine, immediately after the installation of the measurement equipment.

In this paper, we present \gls{mad}, a distance-based, one-shot pattern detection method for time series.
\gls{mad} focuses on detecting pre-defined states from only one training sample per class.
This approach enables the analysis of side-channel information in contexts where the collection of large datasets is impractical.
A context selection algorithm lies at the core of \gls{mad} and yields a stable classification of individual samples, which is important for the robustness of high-level security rules.
In experiments, \gls{mad} outperforms other approaches in accuracy and Levenshtein distance on various simulated, lab-captured, and public time series datasets.
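
For reference, the Levenshtein distance mentioned above can be written with its standard recurrence.
The notation is ours and only illustrative: $a = (a_1, \dots, a_m)$ is the expected state sequence, $b = (b_1, \dots, b_n)$ is the detected one, and $d_{i,j}$ is the distance between the first $i$ states of $a$ and the first $j$ states of $b$.
We also assume here that consecutive identical labels are merged into a single state before the comparison.
With the base cases $d_{i,0} = i$ and $d_{0,j} = j$, the recurrence for $i, j \geq 1$ reads
\begin{equation}
d_{i,j} = \min\left( d_{i-1,j} + 1,\; d_{i,j-1} + 1,\; d_{i-1,j-1} + c_{i,j} \right),
\end{equation}
where $c_{i,j} = 0$ if $a_i = b_j$ and $c_{i,j} = 1$ otherwise, and the distance between the two sequences is $d_{m,n}$.
As a small example, an expected sequence $(A, B, A)$ and a detected sequence $(A, B, C, B, A)$ are at distance $2$ (two insertions), even if the spurious states $C$ and $B$ last only a few samples and barely affect the per-sample accuracy.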

We present the related work on physics-based security and time series pattern detection in Section~\ref{sec:related}.
We then introduce the formal and practical definitions of our solution in Sections~\ref{sec:statement} and~\ref{sec:solution}.
Finally, we present the datasets considered in Section~\ref{sec:dataset} and the results in Section~\ref{sec:results}, and conclude with a discussion of the solution in Section~\ref{sec:discussion}.

\section{Related Work}\label{sec:related}
Side-channel analysis focuses on extracting information from involuntary emissions of a system.
This topic traces back to the seminal work of Paul C. Kocher, who introduced timing side-channel analysis to extract secrets from several cryptographic systems \cite{kocher1996timing}.
This work led to the new field of side-channel analysis \cite{randolph2020power}.
However, the potential of leveraging side-channel information for defense and security purposes remains mostly untapped.
Information leaked through involuntary emissions over different channels provides insights into the activities of a machine.
Acoustic emissions \cite{belikovetsky2018digital}, heat pattern signatures \cite{al2016forensics}, or power consumption \cite{10.1145/3571288, gatlin2019detecting, CHOU2014400} can, among other side channels, reveal information about a machine's activity.
Side-channel information collection generally results in time series objects to analyze.

A variety of methods exist for analyzing time series.
Signature-based solutions compare a specific extract of the data to known-good references to assess the integrity of the host \cite{9934955, 9061783}.
This signature comparison enables the verification of expected, specific sections and requires that the sections of interest can be extracted and synchronized.
Another solution for detecting intrusions is the definition of security policies.
Security policies are sets of rules that describe wanted or unwanted behavior.
These rules are built on input data accessible to the \gls{ids}, such as user activity \cite{ilgun1995state} or network traffic \cite{5563714, kumar2020integrated}.
However, a rule can only be applied to input data that meets its requirements.
This constraint illustrates the gap between side-channel analysis methods and rule-based intrusion detection methods.
To apply security policies to side-channel information, it is necessary to first label the data.

The problem of identifying pre-defined patterns in unlabeled time series is referenced under various names in the literature.
The terms \textit{activity segmentation} and \textit{activity detection} are the most relevant for the problem we are interested in.
State-of-the-art methods in this domain focus on human activities and leverage various sensors such as smartphones \cite{wannenburg2016physical}, cameras \cite{bodor2003vision}, or wearable sensors \cite{uddin2018activity}.
These methods rely on large labeled datasets to train classification models and detect activities \cite{micucci2017unimib}.
For real-life applications, access to large labeled datasets may not be possible.
Another approach, more general than activity detection, uses \gls{cpd}.
\gls{cpd} is a sub-topic of time series analysis that focuses on detecting abrupt changes in a time series \cite{truong2020selective}.
Many works assume that these change points are representative of state transitions of the observed system.
However, \gls{cpd} is only the first step in state detection, as classification of the detected segments remains necessary \cite{aminikhanghahi2017survey}.
Moreover, not all state transitions trigger abrupt changes in time series statistics, and some states include abrupt changes.
Overall, \gls{cpd} only fits a specific type of problem with stable states and abrupt transitions.
Neural networks rose in popularity for time series analysis with \glspl{rnn}.
Large \glspl{cnn} can perform pattern extraction in long time series, for example in the context of \gls{nilm} \cite{8598355}.
\gls{nilm} focuses on the problem of signal disaggregation.
In this problem, the signal is an aggregate of multiple signals, each with its own patterns \cite{angelis2022nilm}.
This problem shares many terms and core techniques with this paper, but the nature of the input data makes \gls{nilm} a distinct area of research.

Classification with only one example of each class is called one-shot classification, or few-shot classification when a few examples are available.
This topic focuses on the classification of pre-extracted time series with few training samples, often using multi-level neural networks \cite{10.1145/3371158.3371162, 9647357}.
However, in the context of side-channel analysis, a time series contains many patterns that are not extracted beforehand.
Moreover, neural-based approaches lack interpretability, which can cause issues in the case of unforeseen time series patterns.
Simpler approaches with novelty detection capabilities are required when the output serves as input for rule-based processing.

Finally, Duin et~al.\ investigate the problem of distance-based few-shot classification \cite{duin1997experiments}.
They present an approach based on the similarity between new objects and a dissimilarity matrix between items of the training set.
The similarities are evaluated with nearest-neighbor rules or \glspl{svm}.
Their approach bears some interesting similarities with the one presented in this paper.
However, they evaluate their work on the recognition of handwritten numerals, which is far from the use case we are interested in.
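
To make the distance-based setting concrete, the simplest rule of this family is the nearest-neighbor rule with one reference per class.
The formulation below is a generic sketch, neither the exact method of \cite{duin1997experiments} nor the one introduced in this paper, and the symbols $x$, $p_k$, $K$, and $\delta$ are illustrative notation.
Given one training sample $p_k$ for each class $k \in \{1, \dots, K\}$ and a dissimilarity measure $\delta$, a new object $x$ is described by its dissimilarities $\delta(x, p_1), \dots, \delta(x, p_K)$ to the training samples and receives the label of the closest one:
\begin{equation}
\hat{y}(x) = \arg\min_{1 \leq k \leq K} \delta(x, p_k).
\end{equation}
For instance, with $K = 2$, $\delta(x, p_1) = 0.8$, and $\delta(x, p_2) = 0.3$, the rule assigns $x$ to class $2$.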
\end{document}