remove all ??, \agd, \cn

Arthur Grisel-Davy 2023-09-29 13:38:06 -04:00
parent 2ea0650c00
commit e1e9b0183e
4 changed files with 21 additions and 26 deletions


@ -140,7 +140,12 @@ The results were satisfactory and illustrated the possibility of detecting a fir
The standard Molex cable supplying power to a SATA hard drive is composed of 3 voltage levels: 3V, 5V and 12V.
After some tests, it appeared that the 5V cables --- grouped on the same shunt resistor --- carried the most information about the drive activity.
The shunt resistor generated the voltage drop on the 5V cables of the hard drive.
\agd{find back results and add them here}
Unfortunately, the report containing the results was misplaced, and the results are not easily reproducible.
The overall conclusion of this experiment was that the power consumption, captured at the \gls{psu} level, contains enough information to distinguish between different versions of the firmware on a hard drive.
In this specific setup, it was also possible to distinguish between two hard drives with identical manufacturer, model number and capacity.
These results were encouraging but too superficial and not rigorous enough to be conclusive.
Further experiments on the accuracy, robustness and versatility of this method are required to assess the potential of this technology properly.
\newpage
\section{Boot Process Verifier}\label{sec:bpv}
@ -319,7 +324,7 @@ The pattern $\chi$ is the unknown pattern assigned to the samples in $t$ that do
\end{figure}
The core of the algorithm is a \gls{knn} classification.
This algorithm is a proven and robust way of labelling new samples based on their relative similarity to the training samples (\cn ?).
This algorithm is a proven and robust way of labelling new samples based on their relative similarity to the training samples.
Although this is a good algorithm for many classification problems, its application to time series for state detection is not trivial for multiple reasons.
First, a single time series can contain multiple different states, making it a multi-label classification problem.
Second, extracting windows to perform classification introduces parameters --- window size, window placement around the sample to classify, number of samples to classify per window, stride --- that are potentially difficult to tune or justify.
@ -336,11 +341,10 @@ The results for this method are sub-optimal for two reasons.
First, if the stride between each window is too large, crucial patterns can be overlooked in the trace.
If it is too small, the detection accuracy can suffer as the state of each sample is evaluated multiple times --- due to window overlap.
Second, the whole window will be assigned one label, which causes the edges of the states to be inaccurate --- especially when state patterns share similarities.
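To illustrate the stride trade-off, assume windows of $w$ samples extracted every $s$ samples (both symbols are introduced here for illustration only).
When $s$ divides $w$, every sample away from the ends of the trace belongs to exactly $w/s$ windows and therefore receives $w/s$ possibly conflicting labels.
When $s > w$, the $s - w$ samples between two consecutive windows belong to no window at all.
For example, $w = 100$ and $s = 25$ evaluates each interior sample four times, whereas $w = 100$ and $s = 150$ skips $50$ samples after every window.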
\agd{find a theoretical setup where the middle-sample knn is worse than dsd. Consider cases of a bootup with a description that include some OFF portion.}
The \gls{dsd} uses a better metric for evaluating the distance between a sample and each state.
For each sample and for each state, every window of the length of the state containing the sample is considered.
The first window contains the sample at the last position, and the last window contains the sample at the first position (see Figure~\ref{windows_dsd}).
The first window contains the sample at the last position, and the last window contains the sample at the first position (see Figure~\ref{fig:windows_dsd}).
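As a sketch of this window enumeration, assume the trace is written $t = (t_1, \dots, t_T)$ and the state pattern has length $n$; this indexing is introduced here for illustration only.
The windows containing the sample $t_i$ are then
\[
  w_j = (t_j, t_{j+1}, \dots, t_{j+n-1}), \qquad j = i-n+1, \dots, i,
\]
restricted to $1 \le j \le T-n+1$ so that each window stays inside the trace.
The sample sits at the last position of $w_{i-n+1}$ and at the first position of $w_i$, so at most $n$ windows are considered per sample and per state.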
\begin{figure}