fix typo, removed footnote

This commit is contained in:
Arthur Grisel-Davy 2023-06-26 11:24:32 -04:00
parent e934d77848
commit a7e6210f45


@@ -142,7 +142,7 @@ Our solution addresses this trade-off by leveraging side-channel information.
== Contributions
This paper presents a novel solution for firmware verification using side-channel analysis.
Building on the assumption that every security mechanism operating on a host is vulnerable to being bypassed, we propose using the device's power consumption signature during the firmware execution to assess its integrity.
Because of the intrinsic properties of side-channel information, the *integrity* evaluation does not involve any communication with the host and is based on data that is difficult to forge.
Because of the intrinsic properties of side-channel information, the integrity evaluation does not involve any communication with the host and is based on data that is difficult to forge.
A distance-based outlier detector that uses power traces of a nominal boot-up sequence can learn the expected pattern and detect any variation in a new boot-up sequence.
This novel solution can detect various attacks centred around manipulating firmware.
In addition to its detection versatility, it is easily retrofittable to almost any embedded system with a @DC input and a consistent boot sequence.
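A minimal sketch of such a distance-based detector is given below, assuming traces are aligned and resampled to a common length; the Euclidean distance to a mean template and the quantile threshold are illustrative choices, not necessarily the implementation evaluated in this paper.

```python
# Minimal sketch of a distance-based outlier detector for boot-up power traces.
# Assumes traces are already aligned and resampled to a common length; the
# template, distance, and threshold rule are illustrative, not the paper's exact method.
import numpy as np

def fit_detector(nominal_traces, quantile=0.99):
    """Learn the expected boot-up pattern from nominal power traces."""
    traces = np.asarray(nominal_traces)           # shape: (n_traces, n_samples)
    template = traces.mean(axis=0)                # expected consumption pattern
    distances = np.linalg.norm(traces - template, axis=1)
    threshold = np.quantile(distances, quantile)  # largest distance still considered nominal
    return template, threshold

def is_anomalous(trace, template, threshold):
    """Flag a new boot-up trace whose distance to the template exceeds the threshold."""
    return np.linalg.norm(np.asarray(trace) - template) > threshold
```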
@@ -355,7 +355,7 @@ The abnormal boot sequences are composed of sequences where an operator went int
== Results
The models are manually tuned to obtain 100% accuracy in the classification of nominal and abnormal boot sequences.
Obtaining 100% accuracy illustrates that there is a clear separation between nominal and abnormal boot sequences for this type of attack.
#agd[could not redo the results as the data for BIOS boot are missing]
//#agd[could not redo the results as the data for BIOS boot are missing]
Although this test case represents an unrealistic situation (mainly because the anomalous samples are accessible), it is still a valuable first evaluation of the @BPV.
This test case serves as a proof of concept and indicates the potential of power consumption analysis for detecting firmware-level attacks.
@@ -490,12 +490,6 @@ The experiment scenarios are:
)
== Results
The experiment procedure consists of starting the drone flight controller multiple times while capturing its power consumption.
Each scenario is repeated between 40 and 100 times.
The experiment procedure automatically captures boot-up traces for better reproducibility (see @sds for more details).
@drone-results presents the results of the detection.
Both Original and Compiled represent nominal firmware versions.
#figure(
tablex(
@@ -511,12 +505,21 @@ Both Original and Compiled represent nominal firmware versions.
caption: [Results of the intrusion detection on the drone.]
)<drone-results>
The experiment procedure consists of starting the drone flight controller multiple times while capturing its power consumption.
Each scenario is repeated between 40 and 100 times.
The experiment procedure automatically captures boot-up traces for better reproducibility (see @sds for more details).
@drone-results presents the results of the detection.
Both Original and Compiled represent nominal firmware versions.
Each scenario introduces disturbances in the boot-up sequence power consumption.
The model correctly identifies the anomalous firmware.
One interesting scenario is the Battery Module Bug, which is mostly detected as nominal.
This result is expected, as the bug affects the firmware's operation after the boot-up sequence.
Hence, the power consumption in the first second of activity remains nominal.
#agd[Should the result of the battery module bug remain, or is it confusing to present scenarios where the BPV expectedly fails?]
//#agd[Should the result of the battery module bug remain, or is it confusing to present scenarios where the BPV expectedly fails?]
It is interesting to note that the differences in power consumption patterns among the different firmware versions are visible immediately after the initial power spike.
This suggests that future work could achieve an even lower time-to-decision, likely as low as 200 ms depending on the anomaly.
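As a rough sketch of how this shorter time-to-decision could be explored, the detector can be re-applied to only the first window of each trace; the sampling rate, the 200 ms window, and the simple Euclidean threshold below are illustrative assumptions rather than values from the experiments.

```python
# Sketch of probing a shorter time-to-decision: re-score a trace using only the
# first window of the boot-up sequence. The sampling rate, window length, and
# simple Euclidean distance rule are assumptions for illustration only.
import numpy as np

def anomalous_within_window(trace, nominal_traces, window_ms=200,
                            sample_rate_hz=10_000, threshold_quantile=0.99):
    """Return True if the trace already deviates from the nominal template
    within the first `window_ms` milliseconds."""
    n = int(sample_rate_hz * window_ms / 1000)
    nominal = np.asarray(nominal_traces)[:, :n]   # truncate nominal traces to the window
    template = nominal.mean(axis=0)               # expected early-boot pattern
    distances = np.linalg.norm(nominal - template, axis=1)
    threshold = np.quantile(distances, threshold_quantile)
    return np.linalg.norm(np.asarray(trace)[:n] - template) > threshold
```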
@@ -604,12 +607,10 @@ Second, the range parameter for the y-shift affects the results in the same way.
The values for these parameters are chosen as part of the domain knowledge extraction, and they affect the transferability of the model (see @aim-conclusion).
The performance is evaluated on the same dataset as for the initial @BPV evaluation (see~@exp-network).
The performance metric is the F1 score.
//The performance metric is the F1 score.
The final performance measure is the average F1 score (and its standard deviation) over 30 independent runs.
Each run selects five random nominal traces as the seed for dataset generation.
The generated dataset is composed of 100 training traces and 100 evaluation traces.
The results are presented in @tab-aim.
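The sketch below illustrates this evaluation protocol; `generate_dataset` and `train_model` are hypothetical stand-ins for the paper's dataset generation (for example, random y-shifts applied to the seed traces) and model-training steps.

```python
# Sketch of the evaluation protocol: average F1 score (and standard deviation)
# over 30 independent runs, each seeded with five random nominal traces used to
# generate 100 training and 100 evaluation traces. `generate_dataset` and
# `train_model` are hypothetical stand-ins, not the paper's exact procedure.
import numpy as np
from sklearn.metrics import f1_score

def evaluate(nominal_traces, generate_dataset, train_model,
             n_runs=30, n_seeds=5, seed=0):
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_runs):
        # five random nominal traces as the seed for dataset generation
        picks = rng.choice(len(nominal_traces), size=n_seeds, replace=False)
        seeds = [nominal_traces[i] for i in picks]
        (train_x, train_y), (eval_x, eval_y) = generate_dataset(seeds, n_train=100, n_eval=100)
        model = train_model(train_x, train_y)
        scores.append(f1_score(eval_y, model.predict(eval_x)))
    return float(np.mean(scores)), float(np.std(scores))
```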
#figure(