
\chapter{Introduction to Physics-Based Security}

\section{Context}

%\gls{scs} are those whose failure to perform their task could result in significant safety risks for humans or the environment.
%These systems are present in many aspects of our daily life, from transportation (ABS, airbags, traffic lights) and energy production (nuclear control systems) to medicine (ventilation systems, radiation machines) and many others.
%\gls{scs} are now more and more computer-based to enable features such as remote control or lower maintenance cost.
%These systems are also increasingly connected to the internet to allow for offsite monitoring or data collection.
%This digitalization of \gls{scs} also brings the undesirable aspects of connected computers.
%The more connection and interaction types are available to a computer system, the greater the risk of an attacker using one of these connections.
%This sum of potential attack points, called attack surface, should typically be as small as possible, especially for \gls{scs} that require high reliability and availability.
%Increasing the capabilities and connectivity of \gls{scs} enables large-scale attacks that would have been infeasible before.
%For example, if all the water treatment plants in Canada are equipped with a data collection mechanism exposed to the internet for centralized analysis, then an attacker could leverage this mechanism to take over all these systems and put the whole country at risk.


A wide variety of solutions are available to protect embedded systems.
No solution can claim to protect against all possible attacks, and multiple layers of prevention, detection, and mitigation mechanisms are often required to protect a system as best as possible.
Each solution presents different domains of application, requirements, and capabilities that are important to understand to reduce the attack surface.


Among all these solutions, \glspl{ids} aim at detecting security policy violations or suspicious activities from or among computers.
The detection always starts with collecting and analyzing data related to the activity of the machine to protect --- also called the target.
If the \gls{ids} only considers local resources (e.g., CPU load, RAM data, disk read/write speed), then it is called a \gls{hids}.
\glspl{hids} have access to relevant local data, but they require either installing software on the target (for collection only or for local analysis) or adding dedicated components that communicate with the target.
This requirement is a weakness for two main reasons.
First, the host machine may be compromised, allowing the attacker to bypass the detection by feeding forged data to the \gls{ids}, shutting it down, or forging the detection result.
Second, the operation of the \gls{hids} may interfere with the critical operation of the system (for example, if the \gls{hids} misbehaves and blocks other operations).
For these reasons, \glspl{hids} may be challenging to implement on a wide range of embedded systems and lack the reliability of an external solution.

Another main class of \gls{ids} takes a different approach that addresses some of these issues.
\glspl{nids} consider the communication between machines in a network to detect intrusions.
This solution does not require installing individual software on each machine and can detect network-level intrusions.
However, \glspl{nids} present their own drawbacks.
First, machine-specific attacks can remain undetected as only network information is accessible.
Second, they require the installation of dedicated equipment to collect network traffic.
Finally, modern traffic encryption practices will limit \glspl{nids} to sender-receiver pattern analysis unless traffic flows unencrypted, which can raise privacy issues.

The current \gls{ids} scene appears to present a tradeoff between the granularity of detection and isolation from the protected machine.
What about the case of protecting a machine against a local intrusion without the possibility of installing additional software?
How can an \gls{ids} protect a machine against attackers bypassing the secure boot verification and booting a completely different \gls{os}?
Following the discovery of a vulnerability on a \gls{scs}, how can the detection mechanism evolve without requiring the recertification of the whole system?
These use cases can seem niche, but they represent a reality for many purpose-built embedded systems with minimal \gls{os}.
Systems like network switches, \glspl{rtu}, and \glspl{wap} rarely allow additional software installation and yet perform critical tasks.
In these cases, neither local resources nor network information can be leveraged for local attack detection.
Moreover, any industry that relies on \glspl{scs} has strict regulations (e.g., DO-178C for aerospace systems in Canada, ISO 26262 for automotive systems, ISO 16142 for medical devices) that guarantee the safety of every piece of equipment.
Modifying an existing system to add intrusion detection capabilities is expensive as it requires the revalidation of the whole system.

A third, under-exploited source of information about the activity of embedded systems is side-channels.
Side-channels are the physical emissions that a machine unintentionally generates.
For example, the sound of a fan, the temperature of a CPU, or the power consumption of a \gls{psu} are common side-channels.

\begin{figure}[H]
    \centering
    \includegraphics[width=0.95\linewidth]{images/side_channel.pdf}
    \caption{Main side-channels of a typical embedded system.}
    \label{fig:side_channel}
\end{figure}

Even though the historical use case of side-channel analysis is to attack a system, the core idea is to retrieve information correlated with the system's activity.
With enough knowledge of the normal behaviour of a machine, an algorithm should be able to assess its correct behaviour from only side-channel information.
This idea is called physics-based security and is the core principle of this research work.
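
As a minimal illustration of this principle, and not the method developed in this proposal, consider a side-channel trace $x = (x_1, \dots, x_n)$ captured while the machine runs, together with a reference trace $\mu = (\mu_1, \dots, \mu_n)$ and per-sample standard deviations $\sigma_i > 0$, both estimated from captures of known-good behaviour.
All symbols, including the decision threshold $\tau$, are notation assumed for this sketch only.
A simple detector could compute the normalized deviation score
\begin{equation}
    d(x) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu_i}{\sigma_i} \right)^{2}
\end{equation}
and report an anomaly when $d(x) > \tau$.
If every sample of a trace deviates from $\mu_i$ by about $\sigma_i$, each term is close to $1$ and so is $d(x)$, whereas a trace that deviates by $3\sigma_i$ on every sample yields $d(x) = 9$.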

\section{Proposal Organization}
This proposal is organized as follows: Section~\ref{sec:related-work} presents an overview of the related work, Chapter~\ref{chap:pastwork} presents the preliminary work conducted so far, Chapter~\ref{chap:futurwork} presents the main problems I want to address during my research, and finally Chapter~\ref{chap:timetable} lays out a proposed timeline for the completion of the planned work.


\section{Related Work}\label{sec:related-work}
The idea of side-channel-based analysis traces back to the seminal work by Paul C. Kocher.
He introduced \gls{dpa} to find secret keys used by cryptographic protocols in tamper-resistant devices~\cite{kocher1999differential}.
This led to a field of research focusing on side-channel analysis that has grown ever since.
A wide variety of side-channels have since been leveraged to recover information from a system, such as power consumption~\cite{brier2004correlation,mangard2008power}, electromagnetic fields~\cite{sayakkara2019survey}, acoustic emanations~\cite{7479068, halevi2015keyboard}, thermal dissipation~\cite{9727162} or, on the non-physical side, caches~\cite{page2003defending}.


Among them, power consumption is the most common and widely studied side-channel because of its numerous advantages.
Power consumption leaks information about the activity of an embedded system with little inertia (unlike temperature, it can convey high-frequency information), is easy to measure reliably with low-cost equipment (unlike electromagnetic fields or sound), and is guaranteed to be present in any system.
This combination of properties allows for fine-grained detection of a system's activity, even at the instruction level.
%Quisquater et al.~\cite{quisquater2002automatic} present an approach to identify instructions with the use of self-organizing maps, power analysis and analysis of electromagnetic traces.\agd{this citation comes out of nowhere}
%Eisenbarth et al.~\cite{eisenbarth2010building} propose a methodology for recovering the instruction flow of microcontrollers using its power consumption.\agd{this citation comes out of nowhere}


Even though the information-gathering capability of side-channel analysis enables powerful attacks, it also enables defensive capabilities.
Zhai et al.~\cite{zhai2015method} propose a self-organizing maps approach that uses features extracted from an embedded processor to detect abnormal behaviour in embedded devices.
Different teams at Georgia Tech leveraged power and electromagnetic backscattering~\cite{8701559, jorgensen2022efficient} to detect hardware trojans and counterfeit integrated circuits.
Due to its non-intrusive and architecture-agnostic nature, power fingerprinting has a wide range of applications, from energy production systems~\cite{6378346} and software-defined radio compliance assessment~\cite{5379826} to application activity on mobile devices~\cite{8057232}.
The literature shows promising work in assessing integrity through cache monitoring~\cite{7163050} and power monitoring~\cite{10.1145/2976749.2978299}.
Works by Moreno et al. offer two building blocks for this work.
In~\cite{moreno2013non}, the team proposes a solution for non-intrusive debugging and program tracing using side-channel analysis.
In this work, they use the power consumption of a given embedded system to identify the code block the embedded system was executing at the time.
The team builds on their previous technique and presents a new one~\cite{Moreno2018} using the power consumption of embedded systems for non-intrusive online run-time monitoring through anomaly detection.
They use a signals and systems analysis approach to identify anomalies using the power consumption of a system and showcase this by identifying buffer overflow attacks on their system.
Msgna et al.~\cite{msgna2014verifying} propose a technique for using the instruction-level power consumption of a system to verify the integrity of the software components of a system with no prior knowledge of the software code.
In~\cite{kur2009improving}, Kur et al. perform power analysis of smart cards based on the JavaCard platform to help identify vulnerable operations and obtain bytecode instruction information, and they propose a framework to replace vulnerable operations with safe alternatives.

The non-intrusive and difficult-to-forge nature of side-channel information makes it an ideal input for \glspl{ids}.
Van Aubel et al.~\cite{van2018side} proposed using electromagnetic information to protect \glspl{ics} by detecting changes in software flow.
Xun et al.~\cite{10016748} use the voltage signal of a vehicle CAN bus to detect anomalies without extensive documentation from the manufacturer.
For a different kind of embedded system, Liang et al. propose a framework that leverages side-channel information in additive manufacturing, where traditional \glspl{ids} would fail.

In more recent literature, there is a trend towards using \gls{ml} for side-channel analysis to enhance the security of systems.
Michele Giovanni Calvi~\cite{calvi2019runtime} offers a solution for run-time monitoring of an entire cyber-physical system treated as a black box.
They collect data from a self-driving car during operations such as steering and acceleration.
Using this data, they train a Long Short-Term Memory~\cite{hochreiter1997long} deep learning model and use it to verify the safety of the vehicle.
Zhengbing et al.~\cite{4488501} suggest the use of forensic techniques for profiling user behaviour to detect intrusions and propose an intelligent lightweight \gls{ids}.
Hanilçi et al.~\cite{hanilci2011recognition} use recorded speech from a cell phone to ascertain the cell phone brand and model by applying vector quantization and \gls{svm} models to the \gls{mfcc} of the audio.
In~\cite{khan2019malware}, Khan et al. propose a technique to identify malware in critical embedded and cyber-physical systems using \gls{em} side-channel signals.
Their technique uses deep learning on EM emanations to model the behaviour of an uncompromised system.
The system flags an activity as anomalous when the emanations differ from the normal ones used to train the neural network.
Sehatbakhsh et al.~\cite{sehatbakhsh2019remote} also use EM emanations and detect malware code injection into a known application without any prior knowledge of the malware signature.
They use the HDBSCAN clustering method to identify anomalous behaviour exhibited by the malicious code.
Yilmaz et al.~\cite{yilmaz2019detecting} apply K-Nearest Neighbors classification along with PCA dimensionality reduction to model the EM emanations of a phone under the different operational states of its front and rear cameras.
Using these ML methods, they can determine the state of the cellphone cameras.

A mechanical equivalent of physics-based security is \gls{mcm}, which aims at monitoring the evolution of key parameters of a machine for health assessment.
This topic is not restricted to detecting attackers' activity and can inform about the machine's health over time to enable timely maintenance.
Different techniques are deployed based on the machine type and the specific metrics of interest.
Machining equipment is often monitored with side-channel measurements such as vibration~\cite{PENG2004199,4084702,HOU2021107451}, sound~\cite{sound_mcm}, temperature~\cite{22438}, or chemical analysis~\cite{tavner1987condition}.
These techniques focus on mechanical machines with high-reliability requirements and leverage side-channel information to reduce intrusiveness.

On a larger scale, power consumption information for a whole house (or even a whole building) provides information about the activity of each appliance.
Monitoring or prediction applications can leverage this information without the need for a measurement system on each endpoint.
This idea of non-intrusive load monitoring was first proposed by Hart in 1992~\cite{hart1992nonintrusive}.
The interest in this problem and the challenges it poses have yielded various solutions, such as \gls{cnn}~\cite{moradzadeh2021practical}, soft computing~\cite{puente2020non}, or Gaussian model fitting on electromagnetic signatures~\cite{10.1145/1864349.1864375}.
The concepts of signal disambiguation and individual consumption retrieval are transposable from a house composed of appliances to an embedded system composed of devices.