Efficient labeling technique and interpretable deep neural network for the classification of seizures using continuous electroencephalograms
This thesis focuses on the classification of seizures, together with finding efficient and scalable ways to obtain high-quality datasets for training deep neural networks. It was motivated by the need to automate the classification of seizure patterns. Roughly 30% of critically ill patients in ICUs suffer seizures or related patterns of harmful electrical activity of the brain. Although seizures damage the brain, most seizures in ICU patients occur without any overt clinical signs and are thus detectable only by continuous electroencephalography (cEEG). cEEGs are recordings of brain activity, often lasting several hours. Manually labeling all the recordings to detect such patterns is infeasible, which makes the problem a strong candidate for automatic classifiers. Deep neural networks are particularly promising, as they already perform well on a wide range of other tasks. However, the key to obtaining robust classifiers is an efficient label acquisition process. Data labeling is often challenging and subject to high levels of label noise. Noise can arise even when classification targets are well defined, for example if the instances to be labeled are more difficult than the prototypes used to define the class. This leads to disagreements among experts and leaves room for misinterpretation of the concepts. Therefore, although cEEG monitoring yields large volumes of data, labeling cost and difficulty make it hard to build a classifier. While experts agree on the labels of clear-cut examples of cEEG patterns, labeling much real-world cEEG data can be extremely challenging. As a result, a large number of sequences might be mislabeled, which makes it difficult to train accurate deep learning models.
This work explores ways to efficiently scale labeling efforts in an environment where manual annotation is error-prone due to the complexity of the task, concurrently with the design of an interpretable model suitable for medical use. One of the results is a method for human and machine co-learning, in which experts become more consistent in the labeling task, improving the quality of the dataset, while the model becomes better at assigning inputs to the correct seizure category. This method is called HAMLET: a novel Human And Machine co-LEarning Technique. Using this system, it is possible to obtain a dataset suitable for training deep learning models on challenging tasks, such as the classification of seizures based on continuous EEG recordings. The core of the system integrates the constraint that some sample points cannot be reliably labeled even by human experts. In brief, during training, HAMLET is allowed to challenge the decision of human experts regarding the labels of certain difficult cases.
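
The idea of a model "challenging" expert labels can be illustrated with a minimal sketch. This is not the HAMLET algorithm itself, whose details are not given in this abstract; it only assumes that the model outputs per-class probabilities and that a sample is worth sending back to experts when the model confidently disagrees with the assigned label. The function name `flag_for_review`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def flag_for_review(
    probs: Sequence[Sequence[float]],
    expert_labels: Sequence[int],
    threshold: float = 0.9,
) -> List[int]:
    """Return indices of samples where the model confidently disagrees
    with the expert label. The confidence threshold is an assumed,
    illustrative rule, not the criterion used in the thesis."""
    flagged = []
    for i, (p, y) in enumerate(zip(probs, expert_labels)):
        # Index of the class with the highest predicted probability.
        predicted = max(range(len(p)), key=lambda k: p[k])
        if predicted != y and p[predicted] >= threshold:
            flagged.append(i)
    return flagged


# Toy example: four segments, three hypothetical pattern classes.
probs = [
    [0.95, 0.03, 0.02],  # model agrees with the expert
    [0.05, 0.92, 0.03],  # model confidently disagrees
    [0.40, 0.35, 0.25],  # model disagrees, but with low confidence
    [0.02, 0.01, 0.97],  # model agrees with the expert
]
expert_labels = [0, 0, 1, 2]

print(flag_for_review(probs, expert_labels))  # [1]
```

In a co-learning loop of this kind, the flagged samples would be returned to the experts for another look, and the model would then be retrained on the revised labels. Low-confidence disagreements are left alone here, which is one simple way to reflect the constraint that some samples cannot be reliably labeled by anyone.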