diff --git a/PhD/research_proposal/bibliography.bib b/PhD/research_proposal/bibliography.bib
index 318c5f8..6540958 100644
--- a/PhD/research_proposal/bibliography.bib
+++ b/PhD/research_proposal/bibliography.bib
@@ -2003,22 +2003,12 @@ series = {UbiComp '10}
 note = {Accessed: 2023-03-15}
 }
-@INPROCEEDINGS{8057232,
-
+@inproceedings{8057232,
 author={Chen, Yimin and Jin, Xiaocong and Sun, Jingchao and Zhang, Rui and Zhang, Yanchao},
-
 booktitle={IEEE INFOCOM 2017 - IEEE Conference on Computer Communications},
-
 title={POWERFUL: Mobile app fingerprinting via power analysis},
-
 year={2017},
-
-
 volume={},
-
-
 number={},
-
 pages={1-9},
-
 doi={10.1109/INFOCOM.2017.8057232}
 }
diff --git a/PhD/research_proposal/frontpages.tex b/PhD/research_proposal/frontpages.tex
index 9a9bb73..447667f 100644
--- a/PhD/research_proposal/frontpages.tex
+++ b/PhD/research_proposal/frontpages.tex
@@ -146,10 +146,10 @@ This proposal describes the exploratory work already achieved in the domain of p
 % L I S T O F T A B L E S
 % ---------------------------
-\addcontentsline{toc}{chapter}{List of Tables}
-\listoftables
-\cleardoublepage
-\phantomsection % allows hyperref to link to the correct page
+%\addcontentsline{toc}{chapter}{List of Tables}
+%\listoftables
+%\cleardoublepage
+%\phantomsection % allows hyperref to link to the correct page
 % L I S T O F A B B R E V I A T I O N S
 % ---------------------------
diff --git a/PhD/research_proposal/introduction.tex b/PhD/research_proposal/introduction.tex
index 8a33ef1..3752582 100644
--- a/PhD/research_proposal/introduction.tex
+++ b/PhD/research_proposal/introduction.tex
@@ -15,16 +15,16 @@ A wide variety range of solutions are available to protect computer systems in g
 Among them, \gls{ids} aim at detecting security policies violations or suspicious activities from or among computers.
 Collection and analysis of data related to the machines activity often enable the detection.
 If the \gls{ids} only consideres local ressources (e.g. CPU load, RAM data, disks read/write speed), then it is called \gls{hids}.
-\gls{hids} have access to relevant local data\cn but they require to install a software on the machine (either for collection only or for local analysis).
+\gls{hids} have access to relevant local data but they require to install a software on the machine (either for collection only or for local analysis).
 This represent a potential flaw for multiple reasons.
 First, the host machine may not be trusted and can be compromised, allowing the attacker to deploy stealth attacks \cite{10.1145/586110.586145}.
-Second, an \gls{hids} can lack the broader vision required to detect intrusions distributed over a network of machines\cn.
+Second, an \gls{hids} can lack the broader vision required to detect intrusions distributed over a network of machines.
 Finally, the operation of the \gls{hids} may interfer with the critical operation of the system (for example if the \gls{hids} missbehave and block other operations).
 For these reasons, \gls{hids} may be difficult to implement on a wide range of embedded systems.
 The other main class of \gls{ids} aims at solving these issues.
 \gls{nids} \cite{vigna1999netstat, bivens2002network} consider the communication between machines in a network to detect intrusions.
-This solution does not require installing individual software on each machines and can detect network-level intrusions \cn.
+This solution does not require installing individual software on each machines and can detect network-level intrusions.
 However, \gls{nids} present their own concerns.
 First, machine-specific attacks can remain undetected as only network information are accessible.
 Then, they require the installation of dedicated equipment to collect network traffic.
@@ -41,7 +41,7 @@ Modifying an existing system to add intrusion detection capabilities is expensiv
 A third, under-exploited, source of information for embedded systems activity are the side-channels.
 The side-channels are all the physical emissions that a machine involuntarely generates.
-For example, the sound of a fan, the temperature of a CPU, or the power consumption of a \gls{psu} are common side-channels \cn.
+For example, the sound of a fan, the temperature of a CPU, or the power consumption of a \gls{psu} are common side-channels.
 \begin{figure}[H]
 \centering
@@ -68,13 +68,14 @@ A wide variety of side-channels have since been leveraged to recover information
 Among them, power consumption is the most common and widely studied side-channel because of its numerous advantages.
 Power consumption leaks information about the activity of an embedded system with a low inertia --- i.e., it can transmit high frequency information contrary to thermal ---, is easy to measure with low-cost equipment at specific points in a machine --- contrary to electromagnetic fields or sound --- and is guaranteed to be present in any system.
 This combination of properties allow for a granular detection of a system activity, even at the instruction level.
-Quisquater et al.~\cite{quisquater2002automatic} present an approach to identify instructions with the use of self-organizing maps, power analysis and analysis of electromagnetic traces.\agd{this citation comes out of nowhere}
-Eisenbarth et al.~\cite{eisenbarth2010building} propose a methodology for recovering the instruction flow of microcontrollers using its power consumption.\agd{this citation comes out of nowhere}
+%Quisquater et al.~\cite{quisquater2002automatic} present an approach to identify instructions with the use of self-organizing maps, power analysis and analysis of electromagnetic traces.\agd{this citation comes out of nowhere}
+%Eisenbarth et al.~\cite{eisenbarth2010building} propose a methodology for recovering the instruction flow of microcontrollers using its power consumption.\agd{this citation comes out of nowhere}
+
 Eventhough the information portential of side-channel analysis enable powerfull attacks, it also enables defensive capabilities.
 Zhai et al.~\cite{zhai2015method} propose a self-organizing maps approach that uses features extracted from an embedded processor to detect abnormal behavior in embedded devices.
 Different teams at Georgia Tech University leveraged power and electromagnetic backscattering \cite{8701559, jorgensen2022efficient} to detect hardware trojans and counterfeit integrated circuit.
-Due to its non-intrusive and architectur-agnostic nature, power fingerprinting has a wide range of applications from energy production systems \cite{6378346}, Software Defined Radio compliance assesments \cite{5379826}, or applications activity on mobile devices \ref{8057232}.
+Due to its non-intrusive and architectur-agnostic nature, power fingerprinting has a wide range of applications from energy production systems \cite{6378346}, Software Defined Radio compliance assesments \cite{5379826}, or applications activity on mobile devices \cite{8057232}.
 Literature shows promising work in assessing integrity through cache monitoring~\cite{7163050} and power monitoring~\cite{10.1145/2976749.2978299}.
 Works by Moreno et al. offer two building blocks for this work.
 In~\cite{moreno2013non}, the team proposes a solution for non-intrusive debugging and program tracing using side-channel analysis.
diff --git a/PhD/research_proposal/pastwork.tex b/PhD/research_proposal/pastwork.tex
index 417fb9b..65b92ad 100644
--- a/PhD/research_proposal/pastwork.tex
+++ b/PhD/research_proposal/pastwork.tex
@@ -140,7 +140,12 @@ The results were satisfactory and illustrated the possibility of detecting a fir
 The standard Molex cable supplying power the a SATA hard drive is composed of 3 voltage levels: 3V, 5V and 12V.
 After some tests, it appears that the 5V cables --- grouped on the same shunt resistor --- carried the most information about the drive activity.
 The shunt resistor generated the voltage drop on the 5V cables of the hard drive.
-\agd{find back results and add them here}
+
+Unfortunately, the report containing the results was misplaced, and the results are not easily reproducible.
+The overall conclusion of this experiment was that the power consumption, captured at the \gls{psu} level, contains enough information to distinguish between different versions of the firmware on a hard drive.
+In this specific setup, it was also possible to distinguish between two hard drives with identical manufacturer, model number and capacity.
+These results were encouraging but too superficial and not rigorous enough to be conclusive.
+Further experiments on the accuracy, robustness and versatility of this method are required to assess the potential of this technology properly.
 \newpage
 \section{Boot Process Verifier}\label{sec:bpv}
@@ -319,7 +324,7 @@ The pattern $\chi$ is the unknown pattern assigned to the samples in $t$ that do
 \end{figure}
 The core of the algorithm is a \gls{knn} classification.
-This algorithm is a proven and robust way of labelling new samples based on their relative similarity to the training samples (\cn ?).
+This algorithm is a proven and robust way of labelling new samples based on their relative similarity to the training samples.
 Although this is a good algorithm for many classification problems, its application to time series for state detection is not trivial for multiple reasons.
 First, a single time serie can contain multiple different states, making it a multi-label classification problem.
 Second, extracting windows to perform classification introduces parameters --- window size, window placement around sample to classify, number of sample to classify per window, stride --- potentially difficult to tune or justify.
@@ -336,11 +341,10 @@ The results for this method are sub-optimal for two reasons.
 First, if the stride between each window is too large, crucial patterns can be overlooked in the trace.
 If it is too small, the detection accuracy can suffer as the state of each sample is evaluated multiple time --- due to windows overlap.
 Second, the whole window will be assigned one label, which causes the edges of the states to be inaccurate --- especially when states patterns share similarities.
-\agd{find a theoretical setup where the middle-sample knn is worse than dsd. Consider cases of a bootup with a description that include some OFF portion.}
 The \gls{dsd} uses a better metric for evaluating the distance between a sample and each state.
 For each sample and for each state, every window of the length of the state containing the sample is considered.
-The first window contains the sample at the last position, and the last window contains the sample at the first position (see Figure~\ref{windows_dsd}).
+The first window contains the sample at the last position, and the last window contains the sample at the first position (see Figure~\ref{fig:windows_dsd}).
 \begin{figure}