add zenodo ref

Arthur Grisel-Davy 2023-07-31 12:57:34 -04:00
parent 6c413308a1
commit 87f17c2564
2 changed files with 52 additions and 12 deletions

@@ -473,6 +473,12 @@ title={Evaluation Dataset for the Machine State Detector, https://zenodo.org/rec
 year={2023},
 }
+@misc{name_hidden_for_peer_review_2023_8192914,
+  title = {{160 Hours of Labeled Power Consumption Dataset of
+  Computer, https://doi.org/10.5281/zenodo.8192914}},
+  year = {2023},
+}
 @article{gupta2021novel,
 title={A novel failure mode effect and criticality analysis (FMECA) using fuzzy rule-based method: A case study of industrial centrifugal pump},
 author={Gupta, Gajanand and Ghasemian, Hamed and Janvekar, Ayub Ahmed},
@@ -667,3 +673,41 @@ howpublished = {https://github.com/fchollet/keras},
 pages={2825--2830},
 year={2011}
 }
+@inproceedings{1253591,
+  author={Saputra, H. and Vijaykrishnan, N. and Kandemir, M. and Irwin, M.J. and Brooks, R. and Kim, S. and Zhang, W.},
+  booktitle={2003 Design, Automation and Test in Europe Conference and Exhibition},
+  title={Masking the energy behavior of DES encryption [smart cards]},
+  year={2003},
+  pages={84--89},
+  doi={10.1109/DATE.2003.1253591}
+}
+@article{6918465,
+  author={Khedkar, Ganesh and Kudithipudi, Dhireesha and Rose, Garrett S.},
+  journal={IEEE Transactions on Nanotechnology},
+  title={Power Profile Obfuscation Using Nanoscale Memristive Devices to Counter DPA Attacks},
+  year={2015},
+  volume={14},
+  number={1},
+  pages={26--35},
+  doi={10.1109/TNANO.2014.2362416}
+}

@@ -343,7 +343,6 @@ Algorithm~\ref{alg:code} presents the implementation's pseudo-code.
 \subsection{Analysis}
 \textbf{Time-Efficiency:}
-\agd{Better time efficiency analysis and comparison with the efficiency of \gls{1nn}}
 The time efficiency of the algorithm is expressed as a function of the number of normalized distance computations and the number of comparison operations.
 Each part of the algorithm has its own time-efficiency expression, with Algorithm~\ref{alg:code} showing each of the three parts.
 The first part, dedicated to the threshold computation, is polynomial in the number of patterns and linear in the length of each pattern.
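The complexity claim can be made concrete. The following cost model is a sketch, not taken from the paper: it assumes the threshold computation takes one normalized distance per unordered pair of patterns, and the symbols $P$ (number of patterns) and $L$ (samples per pattern) are ours:

    % Hypothetical cost model for the threshold-computation part.
    % P = number of patterns, L = samples per pattern (names assumed).
    C_{\text{thresh}} = \binom{P}{2} \cdot L \in O(P^{2} L)

Under that assumption, "polynomial in the number of patterns" is quadratic, and the linear factor $L$ comes from scanning each pattern once per distance.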
@@ -401,7 +400,6 @@ Figure~\ref{fig:areas} illustrates the areas of capture around the patterns for d
 \caption{2D visualization of the areas of capture around each pattern as $\alpha$ changes. When $\alpha \ggg 2$, the areas of capture tend to equal those of a classic \gls{1nn}.}
 \label{fig:areas}
 \end{figure}
-\agd{Increase font size}
 The influence of the $\alpha$ coefficient on the classification is monotonic and predictable.
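Read literally, the caption describes a thresholded nearest-pattern rule whose areas of capture scale with $\alpha$. A minimal Python sketch of that reading follows; the function name, the per-pattern thresholds, and the normalized Euclidean distance are our assumptions, not the paper's definitions:

    import numpy as np

    STATE_UNKNOWN = -1  # mapping used later in this diff

    def classify(window, patterns, thresholds, alpha):
        """Assign `window` to the nearest pattern if it lies inside that
        pattern's area of capture (alpha * threshold); otherwise UNKNOWN."""
        # Normalized distance: Euclidean distance over pattern length (assumed).
        dists = [np.linalg.norm(window - p) / len(p) for p in patterns]
        best = int(np.argmin(dists))
        if dists[best] <= alpha * thresholds[best]:
            return best
        return STATE_UNKNOWN

As $\alpha$ grows, every window eventually falls inside the nearest pattern's area of capture, so the rule degenerates into a classic \gls{1nn}, which is the limit behaviour the caption describes.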
@@ -487,7 +485,6 @@ It is important to notice that zero represents the best Levenshtein distance and
 \subsection{Dataset}\label{sec:dataset}
-\agd{include more datasets from REFIT. One per house would be perfect but simply more is already good. Add in annexe why other are rejected.}
 We evaluate the performance of \gls{mad} against eight time series.
 One is a simulated signal composed of sine waves of varying frequency and average.
 Four were captured in a lab environment on consumer-available machines (two NUC PCs and two wireless routers).
@@ -496,6 +493,7 @@ Table~\ref{tab:dataset} presents the time series and their characteristics.
 \begin{table}
 \centering
+\caption{Characteristics of the machines in the evaluation dataset.}
 \begin{tabular}{lcc}
 Name & Length & Number of states\\
 \toprule
@@ -508,7 +506,6 @@ Table~\ref{tab:dataset} presents the time series and their characteristics.
 REFIT-H4A1 & 100000 & 142\\
 \bottomrule
 \end{tabular}
-\caption{Characteristics of the machines in the evaluation dataset.}
 \label{tab:dataset}
 \end{table}
@@ -546,7 +543,6 @@ The no-consumption sections are not challenging --- i.e., all detectors perform
 For this reason, we removed large sections of inactivity between active segments to make the time series more challenging without tampering with the order of detector performances.
 \subsection{Alternative Methods}
-\agd{explain how the svm and mlp are trained.}
 We implemented three alternative methods to compare with \gls{mad}.
 The alternative methods are chosen to be well-established and of comparable complexity.
 The methods are: a \gls{1nn} detector, an \gls{svm} classifier, and an \gls{mlp} classifier.
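For orientation, here is one conventional way such baselines are trained; this is a hedged sketch assuming scikit-learn (which this bibliography cites) and fixed-length labeled windows, with illustrative hyperparameters and placeholder data rather than the paper's actual setup:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    # Placeholder training data: windows of 50 power samples, 4 state labels.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(100, 50))
    y_train = rng.integers(0, 4, size=100)

    one_nn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)  # 1NN baseline
    svm = SVC(kernel="rbf").fit(X_train, y_train)                       # SVM baseline
    mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X_train, y_train)

Each model then labels unseen windows with `predict`, which makes the three baselines directly comparable to \gls{mad}'s per-window state labels.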
@@ -623,8 +619,9 @@ The final sampling rate of 20 samples per second was selected empirically to be
 For each compressed day of the experiment (four-hour segments, hereafter referred to as days), the \gls{mad} performs state detection and returns a label vector.
 This label vector associates a label to each sample of the power trace following the mapping: $-1$ is UNKNOWN, 0 is SLEEP, 1 is IDLE, 2 is HIGH, and 3 is REBOOT.
 The training dataset comprises one sample per state, captured during the run of a benchmark script that interactively places the machine in each state to detect.
-\agd{make dataset available}
 The script on the machine generates logs that serve as ground truth to verify the results of rule checking.
+The traces and ground truth for each day of the experiment are available online \cite{name_hidden_for_peer_review_2023_8192914}.
+Please note that day 1 was removed due to a scheduling issue that affected the scenario.
 Figure~\ref{fig:preds} presents an illustration of the results.
 The main graph line in the middle is the power consumption over time.
 The line's colour represents the machine's predicted state based on the power consumption pattern.
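Since the label mapping above is fixed, decoding a predicted label vector is a one-liner; the dictionary and helper names below are ours:

    # State mapping quoted from the text above; the constant names are ours.
    STATE_NAMES = {-1: "UNKNOWN", 0: "SLEEP", 1: "IDLE", 2: "HIGH", 3: "REBOOT"}

    def decode(label_vector):
        """Turn a per-sample label vector into human-readable state names."""
        return [STATE_NAMES[int(label)] for label in label_vector]

    print(decode([0, 1, 2, 3, -1]))  # ['SLEEP', 'IDLE', 'HIGH', 'REBOOT', 'UNKNOWN']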
@@ -670,7 +667,7 @@ The performance measure represents the ability of the complete pipeline (\gls{ma
 The main metrics are the micro and macro $F_1$ scores of the rule violation detection.
 The macro-$F_1$ score is defined as the arithmetic mean over individual $F_1$ scores for a more robust evaluation of the global performance, as described in \cite{opitz2021macro}.
 Table~\ref{tab:rules-results} presents the performance for the detection of each rule.
-The performances are perfect for this scenario without any false positive or false negative over XX\agd{updates} runs.
+The performances are perfect for this scenario without any false positives or false negatives over 40 runs.
 The perfect detection of more complex patterns like REBOOT illustrates the need for a system capable of matching arbitrary states.
 Flat lines at varying average levels represent many common states from embedded systems.
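The two scores differ only in how per-class results are pooled: micro-$F_1$ counts true/false positives globally, while macro-$F_1$ (in the arithmetic-mean sense of \cite{opitz2021macro}) averages per-class $F_1$ without weighting. A short sketch assuming scikit-learn, with toy labels rather than the paper's data:

    from sklearn.metrics import f1_score

    # Toy rule-violation labels (illustrative only, not the paper's data).
    y_true = [0, 1, 1, 2, 3, 3]
    y_pred = [0, 1, 1, 2, 3, 3]

    micro = f1_score(y_true, y_pred, average="micro")  # pooled over all classes
    macro = f1_score(y_true, y_pred, average="macro")  # unweighted mean of per-class F1
    print(micro, macro)  # 1.0 1.0 for a perfect run, matching the table below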
@@ -686,16 +683,15 @@ This illustrates that \gls{mad} balances the tradeoff between simple, explainabl
 \begin{tabular}{lccc}
 Rule & Violation Ratio & Micro-$F_1$ & Macro-$F_1$\\
 \toprule
-Night Sleep & 0.273 & 1.0 & \multirow{4}*{1.0} \\
-Work Hours & 0.227 & 1.0 & \\
-Reboot & 0.445 & 1.0 & \\
-No Long High & 0.773 & 1.0 & \\
+Night Sleep & 0.33 & 1.0 & \multirow{4}*{1.0} \\
+Work Hours & 0.3 & 1.0 & \\
+Reboot & 0.48 & 1.0 & \\
+No Long High & 0.75 & 1.0 & \\
 \bottomrule
 \end{tabular}
 \label{tab:rules-results}
 \end{table}
 \section{Discussion}\label{sec:discussion}
 In this section, we highlight specific aspects of the proposed solution.