fix clem comments

Arthur Grisel-Davy 2023-06-18 18:12:20 -04:00
parent c06d1a7760
commit 3b2acada9c
2 changed files with 40 additions and 22 deletions


@ -4,7 +4,7 @@
%
\documentclass[runningheads]{llncs}
%
\usepackage[T1]{fontenc}
%\usepackage[T1]{fontenc}
% T1 fonts will be used to generate the final print and online PDFs,
% so please use T1 fonts in your manuscript whenever possible.
% Other font encodings may result in incorrect characters.
@ -62,6 +62,8 @@
% FAKES/ANONYMOUS
%
\author{
\phantom{
\begin{minipage}{\textwidth}
Anon. Anonymous\and
Anon. Anonymous\and
Anon. Anonymous\and
@ -71,11 +73,13 @@ Anon. Anonymous\and
Anon. Anonymous\and
Anon. Anonymous\and
Anon. Anonymous
\end{minipage}
}
\authorrunning{Anon. et al.}
}
\authorrunning{ }
\institute{University of Anonymous, Nowhere. \\
anon@anonymous.nw}
\institute{ ~\\
}
%
\maketitle % typeset the header of the contribution
%
@ -85,8 +89,8 @@ anon@anonymous.nw}
In the case of a compromised device, the detection capability of its \gls{hids} becomes untrustworthy.
In this context, embedded systems such as network equipment remain vulnerable to firmware and hardware tampering, as well as log manipulation.
Side-channel emissions provide an independent and extrinsic source of information at the about the system, purely based on the physical by-product of its activities.
Leveraging side-channel information, we propose a physics-based \gls{ids} as an aditional layer of protection for embedded systems.
Side-channel emissions provide an independent and extrinsic source of information about the system, purely based on the physical by-product of its activities.
Leveraging side-channel information, we propose a physics-based \gls{ids} as an additional layer of protection for embedded systems.
The physics-based \gls{ids} uses machine-learning-based power analysis to monitor and assess the behaviour and integrity of network equipment.
The \gls{ids} successfully detects three different classes of attacks on an HP Procurve Network Switch 5406zl: (i)~firmware manipulation with \numprint[\%]{99} accuracy, (ii)~brute-force SSH login attempts with \numprint[\%]{98} accuracy, and (iii)~hardware tampering with \numprint[\%]{100} accuracy.
@ -97,7 +101,6 @@ anon@anonymous.nw}
\end{abstract}
%
%
%
\glsresetall % reset all acronyms to be expanded on first use.
\section{Introduction}
@ -114,29 +117,29 @@ Although \glspl{hids} and \glspl{nids} offer intrusion detection capabilities, t
The literature shows promising work in improving the state-of-the-art in security by analyzing side-channel emissions from embedded systems.
Systems generate side-channel emissions, which usually reflect their activity in the form of power consumption \cite{kocher1999differential, brier2004correlation, Moreno2018}, electromagnetic waves \cite{khan2019malware, sehatbakhsh2019remote}, acoustic emissions \cite{genkin2014rsa, liuacoustic}, etc.
Side-channel based \glspl{ids} analyze side-channel emissions and can complement state-of-art \glspl{ids}, as shown in this paper.
Side-channel-based \glspl{ids} analyze side-channel emissions and can complement state-of-the-art \glspl{ids}, as shown in this paper.
The \gls{ids} uses \gls{dsp} and \gls{ml} to detect anomalies or recognize patterns of previously detected intrusions.
Thus, using this \gls{ids} would improve the security of the embedded system by detecting attacks that regular \glspl{ids} fail to identify.
\subsection{Contributions}
This paper proposes a side-channel-based \gls{ids} that can complement existing \glspl{ids} and improve security for embedded systems.
The side-channel based \gls{ids} can potentially protect any embedded system treated a black box and detect a range of attacks against it.
This paper proposes a side-channel-based \gls{ids} --- also called physics-based \gls{ids} --- that can complement existing \glspl{ids} and improve security for embedded systems.
The side-channel-based \gls{ids} can potentially protect any embedded system treated as a black box and detect a range of attacks against it.
Our \gls{ids} is deployed on an HP Procurve 5406zl network switch as a black box.
The experiments in the paper illustrate the \gls{ids} capabilities of detecting firmware manipulation and hardware tampering attacks against the switch and defending against log entry forging through log verification.
The side-channel based \gls{ids} achieves near-perfect accuracy scores despite using simple \gls{dsp} methods and \gls{ml} algorithms. The algorithms analyze \gls{ac} and \gls{dc} power consumption of the network switch to detect these attacks.
The side-channel-based \gls{ids} achieves near-perfect accuracy scores despite using simple \gls{dsp} methods and \gls{ml} algorithms. The algorithms analyze \gls{ac} and \gls{dc} power consumption of the network switch to detect these attacks.
%The experiments use a relatively small dataset that contains roughly \numprint{1000} power traces.
\subsection{Paper Organization}
The paper is organized as follows:
Section~\ref{sec:Overview} provides an overview of the motivation for the experiments and threat model.
Section~\ref{Related Work} describe other side-channel-based approaches for runtime monitoring and integrity assessment.
Section~\ref{Related Work} describes other side-channel-based approaches for runtime monitoring and integrity assessment.
Section~\ref{Firmware} describes experiments related to firmware manipulation,
Section~\ref{RunTime} describes log verification and auditing,
and Section~\ref{Hardware} describes hardware tampering.
The paper concludes in Sections~\ref{Discussion} and ~\ref{Conclusion}.
The paper concludes in Sections~\ref{Discussion} and~\ref{Conclusion}.
\section{Overview}
\label{sec:Overview}
@ -177,7 +180,7 @@ This independence is also beneficial in case of a malfunction of the \gls{ids},
\end{tabularx}
\caption{Attack scenarios that side-channel based \gls{ids} can detect.}
\caption{Attack scenarios that side-channel-based \gls{ids} can detect.}
\label{tab:example}
\end{table}
@ -213,7 +216,7 @@ In the context of \gls{ids} for network equipment, we considered power consumpti
After initial tests, power consumption proved to provide the most information about the system state relative to the practicality of measurement.
In our setup, the power consumption of the device is measured in two different ways: measurement at the \gls{ac} line (between the device's \gls{psu} and the power outlet); and measurement at the \gls{dc} power (from the \gls{psu} to the motherboard of the device).
For both \gls{ac} and \gls{dc}, a power measurment box is placed in series with the main power cable.
For both \gls{ac} and \gls{dc}, a power measurement device is placed in series with the main power cable.
The measurement device records the voltage drop generated by the current flowing through a shunt resistor.
It samples this voltage at one megasample per second (1~MSPS).
During every operation of the device, the different instructions influence the overall power consumption \cite{727070} and will be detectable in either \gls{ac} or \gls{dc} power consumption.
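As a brief illustration of the shunt-based measurement, the relation between the sampled voltage drop and the power consumption can be written as follows. The symbols $V_{\mathrm{sh}}$, $R_{\mathrm{sh}}$, and $V_{\mathrm{sup}}$ are introduced here for illustration only and are assumptions, since the setup description does not name them.
\begin{equation}
I(t) = \frac{V_{\mathrm{sh}}(t)}{R_{\mathrm{sh}}}, \qquad P(t) = V_{\mathrm{sup}}(t)\, I(t),
\end{equation}
where $V_{\mathrm{sh}}(t)$ is the voltage drop sampled across the shunt resistor of resistance $R_{\mathrm{sh}}$, $I(t)$ is the current drawn by the monitored equipment, and $V_{\mathrm{sup}}(t)$ is the supply voltage on the measured line.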
@ -223,7 +226,7 @@ However, its \gls{snr} is lower compared to the \gls{dc} measurement because the
\section{Related Work}
\label{Related Work}
The idea of side-channel based \gls{ids} traces back to the seminal work in side-channel analysis by Paul C. Kocher.
The idea of side-channel-based \gls{ids} traces back to the seminal work in side-channel analysis by Paul C. Kocher.
He introduced Differential Power Analysis to find secret keys used by cryptographic protocols in tamper-resistant devices~\cite{kocher1999differential}.
This led to a field of research focusing on side-channel analysis that has been growing since. Power analysis is the most common and widely studied side-channel analysis technique~\cite{brier2004correlation,mangard2008power}. %new citations%
Cagalj et al.~\cite{vcagalj2014timing} show a successful passive side-channel timing attack on U.S. patent Mod 10 method and Hopper-Blum (HB) protocol.
@ -255,6 +258,11 @@ They use HDBSCAN clustering method to identify anomalous behaviour exhibited by
Yilmaz et al.~\cite{yilmaz2019detecting} implement K-Nearest Neighbors clustering methods along with PCA dimensionality reduction method to model EM emanations from a phone with the different operational status of front/rear camera.
Using the ML methods, they can determine the state of cellphone cameras.
Some work also investigated the possibility of forging power consumption for defense purposes.
Raghavendra et al.~\cite{Pradyumna_Pothukuchi_2021} proposed a simple control method to mask the power consumption pattern of any application.
However, this kind of method does not enable masking into an arbitrary pattern as it is meant for obfuscation, not impersonation.
Thus, an attacker could not leverage this method to make one activity (malware) impersonate another (legitimate activity) from a power consumption point of view.
%The work that this paper proposes builds on top of the aforementioned works.
%An HP network switch, treated as a black box, generates side-channel leaks in the form of its power consumption.
%The experiments treat this power consumption as an output of the system when the inputs are certain attacks/stimuli that triggers the switch.
@ -271,8 +279,8 @@ Starting from the pre-installed version K.15.06.008, we performed upgrades to th
\subsubsection{Feature Engineering}\label{FE-Firmware}
With the HP Procurve Switch 5406zl taking around 120 seconds to complete its boot-up sequence, this experiment family produces the largest datasets of this case study.
Therefore, several preprocessing steps were applied to reduce the size of the datasets and remove noise.
With the HP Procurve Switch 5406zl taking around 120 seconds to complete its boot-up sequence, this experiment family produces the largest dataset of this case study.
Therefore, several preprocessing steps were applied to reduce the size of the dataset and remove noise.
A combination of downsampling and a sliding median filter yields the best results at a minimal size per training set.
Given a power trace with a length of \numprint{120e6} datapoints, downsampling with a factor of \numprint{1e6} results in a sample size of 120 and provides an overall accuracy of \numprint[\%]{99} for this experiment.
This process enables training accurate machine-learning models (see Table~\ref{tab:fw-results}) with fewer than \numprint{1000} training samples, each consisting of 120 datapoints (see Figure~\ref{fig:firmwares-samples}).
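The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the sliding median filter is applied before decimation, and the function names, the window size, and the toy trace are all hypothetical.

```python
from statistics import median


def sliding_median(trace, window):
    """Smooth a trace with a centered sliding median.

    Near the edges the window is truncated to the available samples.
    """
    half = window // 2
    return [
        median(trace[max(0, i - half): i + half + 1])
        for i in range(len(trace))
    ]


def downsample(trace, factor):
    """Keep every `factor`-th sample of the trace (plain decimation)."""
    return trace[::factor]


def preprocess(trace, window, factor):
    """Assumed order: median-filter first, then decimate."""
    return downsample(sliding_median(trace, window), factor)


if __name__ == "__main__":
    # Toy trace with a single-sample spike (hypothetical data).
    raw = [1.0, 1.0, 9.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
    features = preprocess(raw, window=3, factor=4)
    print(features)
```

With the factor of \numprint{1e6} stated above, a trace of \numprint{120e6} datapoints would reduce to 120 values in the same way. A real trace of that length would call for a vectorized implementation rather than this pure-Python loop.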
@ -292,7 +300,7 @@ Figure~\ref{fig:firmwares} illustrates the captured data for two different firmw
\caption{Median-filtered power traces of boot-up sequences for two different firmware versions (ten captures each).}
\label{fig:firmwares-samples}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\begin{subfigure}{0.45\textwidth}
\centering
\includegraphics[width=\linewidth]{images/psd.pdf}
\caption{PSD of power traces of boot-up sequences for two different firmware versions (two traces for each version).}
@ -551,7 +559,7 @@ The \gls{ac} periods do present different patterns depending on the number of mo
The \gls{svm} model was able to identify the number of modules installed with an accuracy of \numprint[\%]{99}.
Results from Table~\ref{tab:hardware-results} show that \gls{dc} data yields the best results.
These high accuracy and recall performances are the result of the non-overlapping grouping of the averages \gls{dc} consummations.
These high accuracy and recall scores result from the non-overlapping grouping of the average \gls{dc} consumption values.
The results presented are produced with a stratified 10-fold cross-validation setup.
\begin{table}[ht]
@ -579,7 +587,7 @@ This section highlights important aspects of this study.
The data used for training the models did not include traffic and were collected in a laboratory environment.
Because the production equipment is used by actual users, it is impossible to perform attacks that would disrupt connection quality or lower the security of the device.
However, complementary experiments were conducted to verify whether traffic would have a significant influence on the results of the experiment.
For Experiment Family I (Section~\ref{Firmware}), the traffic can not influence the results as there is no traffic possible during the boot-up sequence and the experiment uses only the boot-up sequences to perform the classification.
For Experiment Family I (Section~\ref{Firmware}), the traffic cannot influence the results as there is no traffic possible during the boot-up sequence and the experiment uses only the boot-up sequences to perform the classification.
For Experiment Families II (Section~\ref{RunTime}) and III (Section~\ref{Hardware}), we captured data containing real traffic (captures on the identical production switch) and simulated traffic (connections between multiple pairs of machines at around 1~Gbps in the laboratory environment).
Traffic data does not show any significant influence on \gls{dc} or \gls{ac} in either the time or the frequency domain.
From these results, it is possible to conclude that traffic should not affect the results from the presented experiments.
@ -597,7 +605,7 @@ The lightweight nature of the models allows for fast online run-time monitoring
\label{Conclusion}
This paper introduces a physics-based \gls{ids} that offers a novel and complementary type of runtime monitoring and integrity assessment for network equipment.
The proposed \gls{ids} leverages side-channel information generated by the system at the physical level and infer the system's state and activities to detect attacks.
The proposed \gls{ids} leverages side-channel information generated by the system at the physical level and infers the system's state and activities to detect attacks.
This paper presents an evaluation of the performance against hardware tampering, firmware manipulation, and log tampering.
The results show that the methods used achieve near-perfect accuracy on all experiments with only a small training set.
Overall, the introduced techniques provide a glimpse of a general concept that is extensible to other real-time and embedded systems.