\chapter{Planned Work}\label{chap:futurwork}
The preliminary work serves as the foundation for the planned work.
The thesis will focus on the state detection problem under various input data and detection requirements.
Detecting the state of a system constitutes a stepping stone in the construction of specialized tools for physics-based security.
As illustrated by the \gls{sds} and \gls{bpv}, the detection of specific attacks often relies on the ability to pre-process the time series to find sections of interest.
In this sense, solving the state detection problem enables a deeper investigation of power consumption by making the data actionable.
Different machines and measurement designs lead to different problems to solve and different detection capabilities.
This chapter describes the problems to study, with their problem statements, motivations, and expected results.

The problems are classified according to the input data and the measured machines that make up the power trace.
A single sensor only measures the power flowing through one cable.
It is possible to combine sensors to measure multiple related consumptions, for example, the consumptions of different components in the same machine.
In this case, the problem is called \textit{multi-measure} and the resulting input data is a multivariate trace.
It is also possible to place the sensor on a power cable that provides power to multiple machines.
In this case, the problem is called \textit{multi-source} and the resulting input data is an aggregate of multiple traces.
The line between machines and components is fine and blurry, as the description of a machine often fits individual components.
In this thesis, a component is a system that expects instructions from a central unit, while a machine runs its own software.
For example, at a macroscopic scale, a graphics card does not take the initiative to run any software and expects instructions from the rest of the \gls{pc}.

\section{Single-Source, Single-Measure}
The \gls{dsd}, an example of a Single-Source Single-Measure problem, shows promising results in an experimental setup.
To date, the experiments have focused on the detection of simple global states.
The global states are usually \textit{OFF}, \textit{ON}, \textit{BOOT}, and \textit{HIGH LOAD}.
Depending on the machine, other states like \textit{FIRMWARE FLASH}, \textit{SLEEP}, or a specific activity mode can also be detected.
The experiments focus on the deployment to general-purpose computers, network switches, and \gls{wap}/routers.

In the next months, the goal for the \gls{dsd} is to evaluate the performance of the runtime state detection in broader and more exhaustive contexts.
The current accuracy and edit distance results (see Figure~\ref{fig:dsd_acc}) illustrate the capabilities of the \gls{dsd} for the detection of well-defined states, i.e., states associated with a striking variation of average power consumption.
However, to provide a useful and reliable runtime labeling of a machine's activity, the \gls{dsd} must achieve similar results with a more diverse selection of states.
The work on the \gls{dsd} is the foundation for the planned development of more specific applications of the same principle of physics-based monitoring.

\begin{figure}
\centering
\includegraphics[width=\textwidth]{images/dsd_acc}
\caption{Current results of the DSD algorithm on several datasets.}
\label{fig:dsd_acc}
\end{figure}

\section{Single-Source, Multi-Measure}
The global power consumption of a machine does not fully describe its activity.
In an embedded system, the power consumption can be attributed to different components, each with its specific activity.
For the simplest systems performing one specific task, the activities of the components are often correlated with each other.
If the system is in Mode \textit{A}, then each component is in Mode \textit{A}, and the global power consumption displays the Mode \textit{A} pattern.
For more complex systems, different components can be in different modes to accommodate the multi-tasking nature of the global activity.
In this case, if the first component is in Mode \textit{A} but the second is in Mode \textit{B}, this indicates a different global activity than if both are in the same mode.
For example, if the boot sequence of a general-purpose computer shows significant \gls{cpu} activity but no \gls{hdd} activity, it could indicate a failure to boot or an attacker booting the system from external storage.
Access to each component's individual power consumption opens the way to a more granular understanding of the machine's activity.
However, the multivariate aspect of the captured data requires an evolution of the detection techniques.

\subsection{Problem Statement}
Differentiating between components to better understand the activity of a machine is a valuable capability, but it comes with a new problem.

\begin{problem-statement}[Single-Source Multi-Measure]
Given a discretized, multivariate time series $ts$ and a set of $m$ patterns with $n$ components each, $P=\{\{\chi\},P_1=\{P_{1,1},\dots, P_{1,n}\},\dots,$
$P_m=\{P_{m,1},\dots, P_{m,n}\}\}$, identify an injective mapping $m_{SSMM}:\mathbb{N}\longrightarrow P$ such that every sample $ts[i]$
maps to exactly one set of pattern components in $P$ with the condition that the sample matches an occurrence of the set of patterns in $ts$.
\end{problem-statement}

The time series $ts$ is a discretized, multivariate, real-valued time series.
It is composed of $n$ dimensions, with the $k^{th}$ dimension referred to as $ts_k$.
Each sample $ts[i]$ is a vector of $n$ components representing the value of each dimension of $ts$ at a point in time.
The items of the set $P$ are sets of patterns $P_j$ with $j\in[1,m]$.
Each set $P_j$ gathers the components of one global pattern.
In other words, each component $P_{j,k}$ represents the pattern $j$ along the $k^{th}$ dimension of $ts$.
Thus, the number of components of each pattern must be equal to the number of dimensions of $ts$.
Figure~\ref{fig:notation} illustrates the $ts$ and $P$ objects.

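
As a minimal sketch of this notation, assume a machine measured along $n=2$ dimensions (for instance its \gls{cpu} and \gls{hdd} consumptions) with $m=2$ known patterns; the sample indices $a<b$ below are hypothetical and only serve the illustration.
\[
ts[i] = \big(ts_1[i],\; ts_2[i]\big),
\qquad
P = \big\{\{\chi\},\; P_1=\{P_{1,1},P_{1,2}\},\; P_2=\{P_{2,1},P_{2,2}\}\big\}
\]
A possible labeling would then be
\[
m_{SSMM}(i) = \left\{
\begin{array}{ll}
P_1 & \mathrm{if}\ i < a,\\
\{\chi\} & \mathrm{if}\ a \le i < b,\\
P_2 & \mathrm{if}\ i \ge b,
\end{array}
\right.
\]
meaning that, before sample $a$, $ts_1$ matches an occurrence of $P_{1,1}$ while $ts_2$ matches an occurrence of $P_{1,2}$, that the samples between $a$ and $b$ match no known pattern, and that both dimensions match the components of $P_2$ from sample $b$ onward.
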
\begin{figure}
\centering
\includegraphics[width=0.9\textwidth]{images/ssmm_illustration.pdf}
\caption{Notations for the multivariate time series and the patterns set.}
\label{fig:notation}
\end{figure}

\subsection{Applications}
The multi-measure setup presents two potential benefits that will be investigated in this thesis.
First, correlated information could allow for a more robust detection mechanism.
If all components of a machine display behaviours associated with the same global activity, the detection confidence could improve compared with using the global consumption only.
Second, multiple measures could enable a more granular activity detection.
With the power consumption of multiple components available, every combination of component activities can be associated with a different global activity.
These changes would allow for detecting potentially anomalous combinations of states and for a better understanding of the machine's behaviour.

\sfm{Because we address embedded systems, somewhere discuss the problem of actuators distorting the power trace (e.g., fans, motors, etc). You can link that to the MSSM problem.}

The typical application of this technology would concern general-purpose computers or medium-complexity systems with multiple internal components.
These machines are typically difficult to profile with global consumption as each component influences the measure in a different way.
With global consumption alone, the detection of the activity is restricted to general states like \textit{ON}, \textit{OFF}, \textit{SLEEP}, or \textit{HIGH LOAD}.
While this information is still valuable, it does not enable in-depth monitoring of the machine.

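
The granularity argument above can be made concrete with a simple count.
Assume, for illustration only, that the $k^{th}$ measured component exhibits $s_k$ distinguishable states; the symbols $s_k$ are introduced here and are not part of the problem statement.
The number of component state combinations is then at most
\[
\prod_{k=1}^{n} s_k .
\]
With a \gls{cpu} and an \gls{hdd} that each have two states (idle and active), this gives $2 \times 2 = 4$ combinations.
The combination of an active \gls{cpu} with an idle \gls{hdd} during boot is the potentially anomalous case discussed at the beginning of this section, and a single global label such as \textit{BOOT} may not distinguish it from a normal boot.
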
\section{Multi-Source Single-Measure}
Where the Single-Source Multi-Measure problem looks \textit{in} a machine to get more insight, the Multi-Source Single-Measure problem looks \textit{out} and considers multiple devices at once.
In a context where measuring the consumption of individual machines is not possible, the problem of disambiguation arises.
Signal disambiguation is the ability to identify the source of each component signal from a single aggregated signal.
This is a difficult problem as the different sources can affect each other, sometimes in a non-linear way.
Figure~\ref{fig:mssm_illustration} illustrates the aggregation of multiple consumption sources in a single measurement.

\begin{figure}
\centering
\includegraphics[width=\textwidth]{images/mssm_illustration}
\caption{Illustration of the MSSM setup.}
\label{fig:mssm_illustration}
\end{figure}
\agd{add a map of the problems and what is planned. Some visual representation of the SSSM, SSMM, MSSM and MSMM problems}

\subsection{Problem Statement}

\begin{problem-statement}[Multi-Source Single-Measure]
Given a discretized aggregated time series $ts_a = ts_1 \oplus ts_2 \oplus \dots \oplus ts_k$ and a set of patterns $P=P_1\times\dots\times P_k$, identify an injective mapping $m_{MSSM}:\mathbb{N}\longrightarrow P$ such that every sample $ts_a[i]$ maps to a pattern set in $P$ with the condition that the sample matches an occurrence of the pattern in $ts_a$.
\end{problem-statement}

The time series $ts_a$ is a discretized, univariate, real-valued time series.
The set of patterns $P$ is the Cartesian product of the sets of patterns $P_i$ of each source.
Thus, each element of $P$ is a tuple of $k$ patterns, each associated with one source.
Each set $P_i$ contains any number of patterns and the unknown pattern $\chi$.
The unknown pattern is not added separately to the set $P$, as the tuple composed only of $\chi$ is already present and bears the same meaning.
The operator $\oplus$ is the aggregation function, generally the summation or capped summation.
%In some applications, the associativity of the $\oplus$ operator can be discarded as the aggregation is performed at the physical level, instantly across all sources $ts_i$.

The MSSM problem can be expressed as a combination of $k$ SSSM problems with a different input time series.
Because the input is an aggregated time series, the patterns describing an activity may not appear in the input as they do in isolation.
These patterns may be distorted by the aggregation with a pattern from another source.
The main hurdle when developing a solution for the MSSM problem will be to correctly identify the distorted patterns given access to all possible distortion sources (the other patterns).

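
A small worked example, built on assumptions stated here and not taken from the experiments, may help to picture both the pattern set and the effect of aggregation.
Assume $k=2$ sources with $P_1=\{\chi, A\}$ and $P_2=\{\chi, B\}$, where $A$ and $B$ are hypothetical patterns.
The Cartesian product then contains four elements:
\[
P = P_1 \times P_2 = \{(\chi,\chi),\; (\chi,B),\; (A,\chi),\; (A,B)\}
\]
Assume further that $\oplus$ is a capped summation, interpreted here as a sum clipped at a hypothetical cap $c$:
\[
ts_a[i] = \min\Big(c,\; \sum_{j=1}^{k} ts_j[i]\Big)
\]
With the illustrative values $ts_1=(2,2,5)$, $ts_2=(1,4,4)$, and $c=8$, the plain sum and the capped aggregate are
\[
ts_1 + ts_2 = (3,6,9), \qquad ts_a = (3,6,8).
\]
Neither source appears in $ts_a$ with its isolated shape: each one is offset by the other, and the last sample is additionally clipped by the cap.
This is the kind of distortion that a solution to the MSSM problem would have to account for.
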
\subsection{Applications}
A successful Multi-Source Single-Measure monitoring system would find its best application in an industrial setting.
Any industry that relies on many simple embedded systems to reliably perform a task can benefit from a monitoring system that is minimally disruptive to install.
For example, an assembly line can comprise hundreds of conveyor belt drivers, robotic arms, or quality assessment points.
Each type of system is simple in its design and task.
However, adding a dedicated power measurement device to each individual system can significantly increase cost, maintenance, and points of failure.
Capturing the power consumption of these machines at a single point could minimize the implementation footprint while maintaining a reliable physics-based monitoring solution.

\section{Multi-Source Multi-Measure}
The MSMM problem is a combination of the previous ones, for which a clear application is difficult to imagine.
In an MSMM context, multiple capture systems would each measure an aggregate power consumption to form a multivariate time series.
Each dimension of this time series would incorporate the consumption of one or more individual components.
As long as the capture architecture (i.e., which machine is monitored by which capture system) is known, the analysis is a combination of the methods previously presented.
In the case where the capture architecture is unknown, the problem is out of scope for this thesis.

\section{Conclusion}
\agd{to be filled}