diff --git a/DSD/qrs/acronyms.tex b/DSD/qrs/acronyms.tex
index 4777e8d..26dff80 100644
--- a/DSD/qrs/acronyms.tex
+++ b/DSD/qrs/acronyms.tex
@@ -1,6 +1,8 @@
 \newabbreviation{tas}{TAS}{Temporal Action Segmentation}
+\newabbreviation{apt}{APT}{Advanced Persistent Threat}
 \newabbreviation{dsd}{DSD}{Device State Detector}
 \newabbreviation{cpd}{CPD}{Change Point Detection}
+\newabbreviation{stl}{STL}{Signal Temporal Logic}
 \newabbreviation{hids}{HIDS}{Host-Based Intrusion Detection Software}
 \newabbreviation{nids}{NIDS}{Network-Based Intrusion Detection Software}
 \newabbreviation{1nn}{1-NN}{1-Nearest Neighbor}
@@ -11,4 +13,5 @@
 \newabbreviation{mlp}{MLP}{Multi Layer Perceptron}
 \newabbreviation{mad}{MAD}{Machine Activity Detector}
 \newabbreviation{ids}{IDS}{Intrusion Detection Systems}
-\newabbreviation{nilm}{NILM}{Nonintrusive Load Monitoring}
\ No newline at end of file
+\newabbreviation{nilm}{NILM}{Nonintrusive Load Monitoring}
+\newabbreviation{it}{IT}{Information Technology}
diff --git a/DSD/qrs/biblio.bib b/DSD/qrs/biblio.bib
index a72f351..80668b0 100644
--- a/DSD/qrs/biblio.bib
+++ b/DSD/qrs/biblio.bib
@@ -609,3 +609,8 @@ series = {CoDS COMAD 2020}
 publisher={Elsevier}
 }
 
+@misc{sleep_states,
+title={Sleep States Description},
+url={https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/system-sleeping-states},
+year={2023},
+}
diff --git a/DSD/qrs/images/2w_experiment.svg b/DSD/qrs/images/2w_experiment.svg
index cb16e54..115e9b4 100644
--- a/DSD/qrs/images/2w_experiment.svg
+++ b/DSD/qrs/images/2w_experiment.svg
@@ -2,12 +2,12 @@
+ transform="translate(-7.2606705,-63.577428)"> Work Hours - Sleep Sleep - Sleep 4 + 0 + Compressed + 4 + 1 + 2 + 3
diff --git a/DSD/qrs/main.tex b/DSD/qrs/main.tex
index 9679c08..c793a48 100644
--- a/DSD/qrs/main.tex
+++ b/DSD/qrs/main.tex
@@ -27,7 +27,7 @@
 \newcommand{\wv}{{\color{orange}[weak verb]}}
 % correct bad hyphenation here
-\hyphenation{op-tical net-works semi-conduc-tor IEEEconf}
+\hyphenation{op-tical net-works semi-conduc-tor IEEEconf hyper-parameter}
 \begin{document}
 \input{acronyms}
 \title{\textbf{\Large MAD: One-Shot Machine Activity Detector for Physics-Based Cyber Security\\}}
@@ -575,8 +575,73 @@ With both performances metrics combined, \gls{mad} outperforms the other methods
 \end{figure*}
-
 \section{Case Study 2: Attack Scenarios}
+The second case study focuses on a realistic production scenario.
+The goal of this study is to illustrate how \gls{mad} enables the application of high-abstraction-level rules by converting the low-level power consumption signal into a labeled and actionable sequence of states.
+
+
+\subsection{Overview}
+This second case study aims to illustrate the performance of the \gls{mad} detector on more realistic data.
+To this end, a machine was set up to perform tasks following a typical office work schedule that includes work hours, sleep hours, and maintenance hours.
+The scenario comprises four phases:
+
+\begin{itemize}
+
+    \item Night Sleep: During the night and until the worker begins the day, the machine is asleep in the S3 sleep state~\cite{sleep_states}. Any state other than sleep is considered anomalous during this time.
+    \item Work Hours: During work hours, few restrictions apply to the activity. Only a sustained (more than 30~s) high load is considered anomalous.
+    \item Evening Sleep: After work hours, the machine goes to sleep again for a few hours.
+    \item Maintenance: During the night, the machine wakes up as part of an automated maintenance schedule. During maintenance, updates are fetched and a reboot is performed.
+\end{itemize}
+
+\begin{figure}
+\centering
+\includegraphics[width=0.49\textwidth]{images/2w_experiment.pdf}
+\caption{Overview of the scenario and rules for the second case study.}
+\label{fig:2w_experiment}
+\end{figure}
+
+To reduce the experimentation and processing time, the daily scenario is compressed into 4 hours, allowing 6 runs per day and a processing time of only $\approx 4$~min per run.
+Note that this compression of the experiment time does not influence the results (the patterns themselves are kept uncompressed) and is done only for convenience and better confidence in the results.
+Figure~\ref{fig:2w_experiment} illustrates the experiment scenario with both the real and compressed time.
+
+The data capture follows the same setup as presented in the first case study.
+A power measurement device is placed in series with the main power cable of the machine (a NUC micro-PC).
+The measurement device captures the power consumption at 10 kilosamples per second.
+The pre-processing step downsamples the trace to 20 samples per second using a median filter.
+This step greatly reduces the measurement noise and the processing time, and increases the consistency of the results.
+The final sampling rate of 20 samples per second was selected empirically; at this rate, a typical pattern to detect (around 5 seconds long) spans about 100 samples.
+
+For each compressed day of the experiment (a 4-hour segment, hereafter referred to as a day), \gls{mad} performs state detection and returns a label vector.
+This label vector associates a label with each sample of the power trace following the mapping: $-1$ is UNKNOWN, 0 is SLEEP, 1 is IDLE, 2 is HIGH, and 3 is REBOOT.
+
+Many rules can be imagined to describe the expected and unwanted behavior of a machine.
+System administrators can define highly specific rules to detect specific attacks or to match the typical activities of their infrastructure.
+We selected four rules (see Table~\ref{tab:rules}) that are representative of common threats to the \gls{it} infrastructure of companies or administrations.
+These rules are not exhaustive and are merely an example of the potential of converting power consumption traces to actionable data.
+The rules are formally defined using the \gls{stl} syntax, which is designed for describing patterns of variables with temporal components.\cn
+
+\begin{table*}
+    \centering
+    \begin{tabular}{p{0.03\textwidth} | p{0.20\textwidth} | p{0.47\textwidth} | p{0.20\textwidth}}
+    \toprule
+    Rule & Description & STL Formula & Threat\\
+    \midrule
+    1 & ``SLEEP'' state only & $R_1 := \square_{[0,1h]\cup [2h40,3h20]}(SLEEP=1)$ & Machine takeover, Botnet, Rogue Employee\\
+    2 & Exactly one occurrence of ``REBOOT'' & $R_2 := \lozenge(REBOOT_{[t]}=1) \cup (\neg \square_{[,2h40]}(REBOOT=1))$ & \gls{apt}, Backdoors\\
+    3 & No ``HIGH'' state for more than 30~s & $R_3 := \square (HIGH_{[t_0]}=1 \rightarrow \lozenge_{[t_0,t_0+30s]}(HIGH_{[t]}=0))$ & Cryptomining Malware, Ransomware, Botnet\\
+    4 & No ``REBOOT'' occurrence & $R_4 := \neg \square_{[1h,2h40]}(REBOOT_{[t]}=1)$ & Malware Installation\\
+    \bottomrule
+    \end{tabular}
+    \caption{Security rules evaluated in the second case study, with their \gls{stl} formulas and the threats they target.}
+    \label{tab:rules}
+\end{table*}
+\agd{add MITRE references for each threat}
+\agd{fix stl formulas to use labels and not state names}
+
+
+
+\subsection{Dataset}
+
+\subsection{Results}
 \section{Discussion}\label{sec:discussion}
 In this section we highlight specific aspects of the proposed solution.
@@ -619,6 +684,8 @@ Although there are more operations to perform to evaluate all possible windows a
 Over all the datasets considered, the time for \gls{mad} was, on average, 14\% higher than the time for the \gls{1nn}.
 \gls{mad} is also slower than \gls{svm} and faster than \gls{mlp}, but comparison to other methods is less relevant as computation time is highly sensitive to implementation, and no optimization was attempted.
 Finally, because \gls{mad} is distance-based and window-based, parallelization is naturally applicable and can significantly reduce the processing time.
+\agd{add subsections or bold titles to the discussion topics, add discussion about why a simple threshold does not work}
+
 \section{Conclusion}
 We present \gls{mad}, a novel solution to enable high-level security policy enforcement from side channel information.