diff --git a/DSD/qrs/main.tex b/DSD/qrs/main.tex
index a41cb2c..8ea86f1 100644
--- a/DSD/qrs/main.tex
+++ b/DSD/qrs/main.tex
@@ -3,12 +3,22 @@
 %\input epsf
 \usepackage{graphicx}
 \usepackage{multirow}
+\usepackage{xcolor}
+\usepackage{booktabs}
+\usepackage{tabularx}
+\usepackage{algpseudocodex}
+\usepackage{algorithm}
+\usepackage{amsfonts}
+\usepackage{amssymb}
+\usepackage{amsthm}
 \usepackage[toc,acronym,abbreviations,nonumberlist,nogroupskip]{glossaries-extra}
 
 \renewcommand\thesection{\arabic{section}} % arabic numerals for the sections
 \renewcommand\thesubsectiondis{\thesection.\arabic{subsection}.}% arabic numerals for the subsections
 \renewcommand\thesubsubsectiondis{\thesubsectiondis.\arabic{subsubsection}.}% arabic numerals for the subsubsections
 
+\newtheorem{problem-statement}{Problem Statement}
+
 \newcommand\agd[1]{{\color{red}$\bigstar$}\footnote{agd: #1}}
 \newcommand\SF[1]{{\color{blue}$\bigstar$}\footnote{sf: #1}}
 \newcommand{\cn}{{\color{purple}[citation needed]}}
@@ -157,4 +167,461 @@
 The similarities are evaluated with Nearest-Neighbor rules or \gls{svm}.
 Their approach bears some interesting similarities with the one presented in this paper.
 However, they evaluate their work on the recognition of handwritten numerals, which is far from the use case we are interested in.
 
+\section{Problem Statement}\label{sec:statement}
+%\gls{mad} focuses on detecting the state of a time series at any point in time.
+We consider the problem as a multi-class, mono-label classification problem \cite{aly2005survey} for every sample in a time series.
+The problem is multi-class because multiple states can occur in one time series, and therefore any sample is assigned one of multiple states.
+The problem is mono-label because only one state is assigned to each sample.
+The classification is a mapping from the sample space to the state space.
+
+\begin{problem-statement}[\gls{mad}]
+Given a discretized time series $t$ and a set of patterns $P=\{P_1,\dots, P_n\}$, identify a mapping $m:\mathbb{N}\longrightarrow P\cup \{\lambda\}$ such that every sample $t[i]$
+maps to a pattern in $P\cup \{\lambda\}$ with the condition that the sample matches an occurrence of its assigned pattern in $t$.
+\end{problem-statement}
+
+The time series $t: \mathbb{N} \longrightarrow \mathbb{R}$ is finite, discretized, mono-variate, and real-valued.
+The patterns (also called training samples) $P_j \in P$ are of the same type as $t$.
+Each pattern $P_j$ can have an arbitrary length, denoted $N_j$.
+A sample $t[i]$ \textit{matches} a pattern $P_j \in P$ if there exists a substring of $t$ of length $N_j$ that includes the sample and whose similarity measure to $P_j$ is below a pre-defined threshold.
+The pattern $\lambda$ is the \textit{unknown} pattern assigned to the samples in $t$ that do not match any of the patterns in $P$.
+
+\begin{figure}
+\centering
+\includegraphics[width=0.45\textwidth]{images/overview.pdf}
+\caption{Illustration of the sample distance from one sample to each training example in a 2D space.}
+\label{fig:overview}
+\end{figure}
+
+\section{Proposed Solution: MAD}\label{sec:solution}
+\gls{mad}'s core idea separates it from traditional sliding-window algorithms.
+In \gls{mad}, the window around the sample to classify adapts dynamically to select the optimal context.
+This principle influences the design of the detector and requires the definition of new distance metrics.
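+
+%Toy running example added for illustration only; the values below are hypothetical.
+As a running example, consider the toy series $t=(0,0,5,5,0,0)$ and a single pattern $P_1=(5,5)$, so $N_1=2$.
+With a suitably small threshold, only $t[3]$ and $t[4]$ lie in a length-$2$ substring that matches $P_1$, so the mapping of Section~\ref{sec:statement} is
+\[
+  m(i) = \left\{
+  \begin{array}{ll}
+    P_1 & \mbox{if } i \in \{3,4\},\\
+    \lambda & \mbox{otherwise.}
+  \end{array}\right.
+\]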
+Because pattern lengths may differ, our approach requires distance metrics that are robust to length variations.
+%For the following explanation, the pattern set $P$ refers to the provided patterns only $\{P\setminus \lambda\}$ --- unless specified otherwise.
+We first define the fundamental distance metric as the normalized Euclidean distance between two time series $a$ and $b$ of the same length $N_a=N_b$:
+\begin{equation}
+ nd(a,b) = \dfrac{EuclideanDist(a,b)}{N_a}
+\end{equation}
+
+Using this normalized distance $nd$, we define the distance from a sample $t[i]$ to a pattern $P_j \in P$.
+This is the sample distance $sd$, defined as
+\begin{equation}\label{eq:sd}
+ sd(i,P_j) = \min_{k\in [i-N_j+1,\,i]} nd(t[k:k+N_j-1],P_j)
+\end{equation}
+
+%with $P_j$ the training sample corresponding to the state $j$, and $t$ the complete time series.
+Computing the distance $sd(i,P_j)$ requires three steps: (1) select every substring of $t$ of length $N_j$ that contains the sample $t[i]$, (2) evaluate the normalized distance of each such substring to the pattern $P_j$, and (3) take $sd(i,P_j)$ as the smallest of these distances.
+For simplicity, Equation~\ref{eq:sd} omits the border conditions for the range of $k$.
+When the sample position $i$ is less than $N_j$ or greater than $N_t-N_j$, where $N_t$ denotes the length of $t$, the range adapts to consider only valid substrings.
+
+Our approach uses a threshold-based method to decide which label to assign to a sample.
+For each sample in $t$, the algorithm compares the distance $sd(i,P_j)$ to the threshold $T_j$.
+The sample receives the label $j$ associated with the pattern $P_j$ that results in the smallest distance $sd(i,P_j)$, provided that $sd(i,P_j) < T_j$.
+If $N_l
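+
+%Illustrative pseudocode added for clarity; it only restates the rule above (border conditions omitted, label and name chosen here).
+Algorithm~\ref{alg:mad-sketch} summarizes the labeling rule defined in this section.
+\begin{algorithm}
+\caption{Sketch of the \gls{mad} labeling rule.}\label{alg:mad-sketch}
+\begin{algorithmic}
+\For{each sample position $i$ in $t$}
+    \For{each pattern $P_j \in P$}
+        \State $sd(i,P_j) \gets$ minimum of $nd$ over all length-$N_j$ substrings of $t$ containing $t[i]$
+    \EndFor
+    \State $j^* \gets \arg\min_{j} sd(i,P_j)$
+    \If{$sd(i,P_{j^*}) < T_{j^*}$}
+        \State $m(i) \gets P_{j^*}$ \Comment{sample matches the closest pattern}
+    \Else
+        \State $m(i) \gets \lambda$ \Comment{no pattern matches: label as unknown}
+    \EndIf
+\EndFor
+\end{algorithmic}
+\end{algorithm}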