DEPARTMENT OF COMMUNICATION AND LEARNING IN SCIENCE 

CHALMERS UNIVERSITY OF TECHNOLOGY 

Gothenburg, Sweden 2020 

www.chalmers.se 

Assessing Cognitive Workload 
Between Different Tasks 
Using EEG to develop and examine a method to measure 
variation of cognitive workload between different levels of 
difficulty 
Master’s thesis in Biomedical Engineering¹ and Learning and Leadership²

FANNY APELGREN¹ & IDA PETTERSSON²

 

 
Assessing Cognitive Workload Between Different Tasks: Using EEG to develop and 
examine a method to measure variation of cognitive workload between different levels of 
difficulty 

Fanny Apelgren and Ida Pettersson 

 
© FANNY APELGREN AND IDA PETTERSSON, 2020 

 
Supervisor: Eva Lendaro, Department of Electrical Engineering, Chalmers University of 
Technology 

Supervisor: Sheila Galt, Department of Communication and Learning in Science, Chalmers 
University of Technology 

Examiner: Samuel Bengmark, Department of Mathematical Sciences, Chalmers University 
of Technology 

 
Master’s Thesis 2020 

Communication and Learning in Science 

Chalmers University of Technology 

SE-412 96 Gothenburg 

Telephone +46 31 772 1000 

 
Cover: The picture is taken from our own EEG data 

 

 
Abstract 
Assessing cognitive workload is an important tool, for example when evaluating different 
techniques for improving prostheses. Here, we have developed a method to compare how 
cognitive workload differs depending on whether a prosthesis has sensory feedback or not. 
We used electroencephalography (EEG) and performed a pilot study on ten intact limb 
subjects. An easy and a hard level were constructed by changing the weight of a force 
sensitive cube that was to be lifted back and forth over a barrier while the subject counted 
sounds in an auditory oddball task. A third level consisted of only the oddball task. The 
difference in difficulty between the levels was verified by measuring performance and 
perceived effort. On a group level, these measurements all indicated that the no task 
condition was the easiest and the hard task condition the most demanding. Measurements of 
the number of lifts for different repetitions of the easy and hard conditions also showed 
signs of a learning effect during the performance of the easy task. Cognitive workload was 
measured using the event-related potential (ERP) technique and frequency bands. The 
results showed that the ERP component P3 was the only one that could indicate a 
significant difference between all three levels. A compiled measurement (consisting of the 
sum of ERP components N1, P2, P3, and LPP) and the alpha frequency bands (low-, high-, 
and broadband alpha) also showed a significant effect between some of the conditions. 

 
Keywords: Cognitive workload, Mental load, Learning, Electroencephalogram, EEG, 
Event-Related Potential, ERP, P3, Grasping task, Oddball Task, Dual-Task Paradigm 

  

 
Acknowledgements 
The writing of this thesis has, like most other projects, been a rocky road. We have had 
moments on top where everything has gone our way and we felt like real scientists ready for 
our big breakthrough, but we have also stumbled around in dark valleys, unsure of what to 
do next. Nevertheless, we have finally made it, and here is our finished work! But we would 
never have made it without the support we have gotten from the people around us. 

We are very grateful to the Biomechatronics and Neurorehabilitation lab under Max Ortiz 
Catalan for the chance to do this, and for lending us both their time and equipment. Thank 
you everyone for meeting us and sharing your knowledge and experience, and for changing 
your plans to let us perform our measurements at the lab. We would especially like to thank 
Shahrzad Damercheli, who patiently helped us during all our measurements and training. 
And, of course, a huge thank you to our supervisor Eva Lendaro, who guided us along the 
way, believed in us, and taught us how to think like scientists. 

We are also very grateful for our never-ending source of support, feedback, ideas and hot 
chocolate (at least this would have been never-ending if hot chocolate could have been 
sent online): Sheila Galt. Thank you for everything!   

We also had some other meetings to help us with our literature review and statistical 
analysis. So, thank you to Yommine Hjalmarsson and Serik Sagitov for sharing your 
expertise. 

This project would also not have been possible without all the wonderful people who 
volunteered to spend hours with us while we poked them in the head with gel-filled syringes 
and asked them to move a cube hundreds of times. Thank you to everyone who somehow 
took part in our experiment, helped us practice, contributed your ideas in our mini-pilot 
study or offered to help, but were stopped by the arrival of the Covid-19 virus. 

We are also very thankful for our friends and family who have supported us through this 
work, during panic and euphoria. Everyone who has taken time to listen when we tried to 
explain our work, or helped us when it was time to think about something else by sharing a 
hug, a lunch, or an ice cream. 

Lastly, we would like to thank the two people who have been most important for this work: 
each other! As mentioned, this work has taken us over high mountains and through dark 
valleys and deep pits, with many treacherous rocks to stumble on. Sometimes, working 
together has been challenging, but it was by doing it together that we got out of those 
pits. We would never have reached as far alone. And, if we had not done it 
together, we would have no one to share the amazing views from the top with! 

           



 
Table of contents 

Abstract ................................................................................................................................ iii 

Keywords: ......................................................................................................................... iii 

Acknowledgements .............................................................................................................. iv 

Abbreviations ...................................................................................................................... viii 

List of Figures ....................................................................................................................... ix 

List of Tables ......................................................................................................................... x 

1 Introduction .....................................................................................................................1 

1.1 Background ..............................................................................................................1 

1.2 Brief description of this work .....................................................................................2 

1.3 Aims and limitations ..................................................................................................3 

1.4 Research questions ..................................................................................................4 

1.5 Contribution ..............................................................................................................4 

1.6 Thesis outline ...........................................................................................................4 

2 Theory ............................................................................................................................6 

2.1 Cognitive workload ...................................................................................................6 

2.2 Electroencephalogram (EEG) ...................................................................................7 

2.2.1 Referencing ........................................................................................................8 

2.2.2 Electrode positioning ..........................................................................................8 

2.2.3 Artifacts and Noise .............................................................................................9 

2.3 Event-Related potentials ..........................................................................................9 

2.3.1 Basic concept ................................................................................................... 10 

2.3.2 ERP Components ............................................................................................ 10 

2.3.2.1 N1 .............................................................................................................. 12 

2.3.2.2 N2 .............................................................................................................. 12 

2.3.2.3 P2 .............................................................................................................. 12 

2.3.2.4 P3 .............................................................................................................. 13 

2.3.2.5 LPP............................................................................................................ 14 

2.3.3 Frequency bands ............................................................................................. 15 

2.3.3.1 Delta .......................................................................................................... 16 

2.3.3.2 Theta ......................................................................................................... 16 

2.3.3.3 Alpha ......................................................................................................... 16 

2.3.3.4 Beta ........................................................................................................... 17 

2.3.3.5 Gamma ...................................................................................................... 17 



 
2.3.3.6 Theta/Alpha ............................................................................................... 17 

2.3.4 Differences between individuals ....................................................................... 18 

3 Method ......................................................................................................................... 19 

3.1 Grasping task ......................................................................................................... 19 

3.2 Dual-task paradigms and oddball tasks .................................................................. 20 

3.2.1 The choice of stimuli ........................................................................................ 21 

3.2.2 Secondary task: Counting, reacting or ignoring? ............................................. 22 

3.2.3 Stimuli timing .................................................................................................... 22 

3.3 Experimental procedure ......................................................................................... 23 

3.3.1 Participants ...................................................................................................... 23 

3.3.2 Tasks ............................................................................................................... 24 

3.3.3 Self-assessment using NASA-RTLX ................................................................ 24 

3.3.4 Procedure ........................................................................................................ 25 

3.4 Signal processing ................................................................................................... 27 

3.4.1 Offline referencing ............................................................................................ 27 

3.4.2 Amplification ..................................................................................................... 28 

3.4.3 Filtering ............................................................................................................ 28 

3.4.4 Epoching .......................................................................................................... 28 

3.4.5 Artifact management ........................................................................................ 29 

3.4.5.1 Artifact detection and rejection .................................................................. 29 

3.4.5.2 Artifact correction ...................................................................................... 31 

3.4.5.3 The choice of whether to reject or correct artifacts .................................... 33 

3.4.6 Averaging ......................................................................................................... 34 

3.5 Measurements to indicate cognitive workload ....................................................... 35 

3.5.1 Measuring amplitudes of averaged ERPs ........................................................ 35 

3.5.1.1 Choosing a method to measure amplitude ................................................ 35 

3.5.1.2 The choice of which ERP components to examine ................................... 36 

3.5.1.3 The choice of latency window and electrode sites ..................................... 36 

3.5.1.4 A compiled measurement for ERP ............................................................ 37 

3.5.2 Measuring frequency bands ............................................................................. 37 

3.5.2.1 The choice of which frequency bands to examine ..................................... 37 

3.5.2.2 Choosing a method ................................................................................... 38 

3.5.2.3 The choice of electrode sites ..................................................................... 38 

3.6 Statistical analysis .................................................................................................. 38 

3.7 Design of force sensitive cube ................................................................................ 39 



 
3.7.1 Design requirements ........................................................................................ 39 

3.7.2 Design process ................................................................................................ 40 

4 Results ......................................................................................................................... 43 

4.1 Notes on the execution of the experiment .............................................................. 43 

4.2 Performance ........................................................................................................... 44 

4.3 Perceived effort ...................................................................................................... 46 

4.4 ERP components .................................................................................................... 47 

4.5 Frequency bands .................................................................................................... 52 

5 Discussion .................................................................................................................... 56 

5.1 Experiment execution ............................................................................................. 56 

5.2 Performance ........................................................................................................... 57 

5.2.1 Comparison of conditions ................................................................................ 57 

5.2.2 Assessment of a learning process within each condition ................................. 58 

5.3 Perceived difficulty of the different conditions ......................................................... 58 

5.4 ERP components .................................................................................................... 58 

5.5 Frequency bands .................................................................................................... 59 

5.6 Differences between individuals ............................................................................. 60 

6 Conclusions .................................................................................................................. 62 

7 Future work .................................................................................................................. 64 

7.1 Experiment with anesthesia .................................................................................... 64 

7.2 Possible improvements of the cube ........................................................................ 64 

7.3 Examination of the usefulness of EEG to evaluate learning processes .................. 64 

Appendix A: Experiment protocol ...........................................................................................I 

Appendix B: Earlier versions of the force sensitive cube ...................................................... V 

Appendix C: Informed consent .......................................................................................... VIII 

Appendix D: Artifact management details .......................................................................... XIV 

Appendix E: Two-way ANOVA analysis............................................................................. XVI 

Appendix F: Electrode clusters .......................................................................................... XIX 

  

 
Abbreviations 
ABBREVIATION   MEANING                            SHORT EXPLANATION

EEG            Electroencephalogram               A technique to record electrical signals arising from brain activity
EMG            Electromyography                   A technique to record electrical signals arising from skeletal muscle activity
EOG            Electrooculography                 A technique to record electrical signals arising from eye activity
ERP            Event-Related Potential            Small changes in recorded EEG as response to an event
HEOG           Horizontal Electrooculography      Electrodes placed at the side of each eye to measure horizontal eye movements
ICA            Independent Component Analysis     A technique to separate different components of a signal
NASA-RTLX      NASA Raw Task Load Index           A simplified version of a self-assessment questionnaire developed by NASA
VEOG           Vertical Electrooculography        Electrodes placed under and above one eye to measure vertical eye movements and blinks
PSD            Power Spectral Density             A signal’s power content versus frequency

 

 
List of Figures 
 

Figure 1. The brain and its four different lobes ......................................................................7 

Figure 2. Electrode positioning according to the normal and extended 10-20 system ...........9 

Figure 3. An ERP waveform where each peak is named after the convention used in this 
thesis ................................................................................................................................... 11 

Figure 4. A sketch of the latency ranges ERP components N1, N2, and P2 ....................... 13 

Figure 5. A sketch of the latency ranges ERP components P3 and LPP ............................. 15 

Figure 6. The experimental setup for the grasping task, with the force sensitive cube ........ 20 

Figure 7. Person fitted with EEG cap with 128 electrodes ................................................... 24 

Figure 8. EEG signals with eye blinks and horizontal eye artifacts ...................................... 30 

Figure 9. Scalp topography showing blinks and horizontal eye movements ........................ 33 

Figure 10. Final version of the force sensitive cube ............................................................. 40 

Figure 11. Forces measured by the cube ............................................................................ 42 

Figure 12. The performance in the grasping task as measured as number of lifts per minute
 ............................................................................................................................................ 44 

Figure 13. The success rate as a function of the number of lifts per minute ........................ 45 

Figure 14. The performance of the subjects as measured by the accuracy of the oddball 
task ...................................................................................................................................... 46 

Figure 15. Task load index from the self-assessment questionnaire NASA-RTLX .............. 47 

Figure 16. The ERP waveform averaged over all subjects, conditions and channels .......... 48 

Figure 17. Scalp distribution showing the amplitudes of the ERP components ................... 49 

Figure 18. The grand average over all subjects as shown in the different clusters .............. 50 

Figure 19. The amplitude of each ERP component and condition in the small clusters ...... 51 

Figure 20. The amplitude of each ERP component and condition in the large clusters ....... 52 

Figure 21. Scalp distributions of the spectral power density for frequency bands ............... 53 

Figure 22. The absolute spectral amplitude of each frequency band and condition in the 
small clusters  ...................................................................................................................... 54 

Figure 23. The absolute spectral power amplitude of each frequency band and condition in 
the large clusters ................................................................................................................. 55 

Figure 24. Version 1 of the force sensitive cube ................................................................... V 

Figure 25. Version 2 of the force sensitive cube .................................................................. VI 

Figure 26. Curve showing the calibration of the force sensitive cube ................................ VIII 

 

 
List of Tables 
Table 1. EEG frequency bands ............................................................................................ 16 

Table 2. An illustration of the relationship between different levels here and in related 
studies ................................................................................................................................. 19 

Table 3. Variables measured by the cube ........................................................................... 42 

Table 4. Summary of the data that was gathered for each subject, excluding EEG data .... 43 

Table 5. Components that were removed following ICA analysis ...................................... XIV 

Table 6. The number of epochs that were accepted after artifact rejection ........................ XV 

Table 7. Description of the construction of each of the small and large electrode clusters for 
measuring the ERP components. ...................................................................................... XIX 

Table 8. Description of the construction of each of the small and large electrode clusters 
chosen for measuring the frequency bands ....................................................................... XIX 



 
1 Introduction 
Here we introduce this thesis work by presenting the background for why this research is 
needed. After that we give a short description of the method of this work together with our 
aims and limitations. We also present our research questions and how we contribute to the 
research field. Lastly, we give the reader a guide to the structure of this report. 
 

1.1 Background 

The loss of a limb implies a major change in almost any lifestyle, and many dedicated 
scientists, engineers, doctors, therapists and others around the world work to make 
this change as manageable as possible. One possibility is to use a prosthesis as a substitute 
for the lost limb. Most often these prostheses are strapped on with a socket, and electrodes 
are attached to the remaining part of the limb [1]. The socket often chafes the skin and 
causes considerable discomfort for the person wearing it [2]. This kind of prosthesis is also 
unreliable, and patients therefore often choose not to use it [1]. 

In 1990 the world’s first osseointegrated prosthesis was implanted in Gothenburg, 
Sweden [3]. Osseointegration means that the prosthesis is anchored in the bone of the 
amputee using a titanium rod. Apart from being a more secure and comfortable way of 
attaching the prosthesis than the conventional socket, this solution also opens up the 
possibility of connecting it to the muscles and nerves inside the remaining part of the limb. 
In 2019, 29 years after the first osseointegrated prosthesis, a Swedish man became the first 
in the world to receive an osseointegrated hand prosthesis with a neuromuscular interface 
[4], a so-called e-OPRA [1]. This medical and technical achievement has been made 
possible by the collaboration between Chalmers Biomechatronics and Neurorehabilitation 
Laboratory (BNL), Centre for Advanced Reconstruction of Extremities at Sahlgrenska 
University Hospital and the company Integrum AB as part of their project “Natural Control 
of Artificial Limb Through an Osseointegrated Implant” [5]. By using the neuromuscular 
interface, the electrodes can pick up the signals from the muscles and nerves in the 
remaining part of the limb. That way, when the amputee executes the movement associated 
with moving the hand, the prosthetic hand moves [6]. By introducing sensory feedback in 
the prosthesis, which registers when pressure is applied to the surface of the prosthetic 
hand, the nerves in the limb are stimulated. This way signals can also be sent from the hand 
to the brain, and the brain can react to the stimuli given by the sensory feedback. 

Before the implementation of the neuromuscular interface, prosthesis users had to rely only 
on visual feedback and could not feel how hard they pressed an object or even whether they 
touched it at all [6]. With the addition of sensory feedback, which gives the user a 
significantly better experience [4], Chalmers BNL hopes to further improve the 
quality of life for people with amputated limbs or motor impairments. 

Adding sensory feedback to a prosthesis intuitively seems to facilitate performing different 
tasks, for example picking up a fragile object such as an egg. However, this needs to 
be investigated formally. One amputee who has received a prosthesis with sensory 
feedback has tried lifting a fragile object and broke or dropped the object less 
frequently with sensory feedback than when that feature was disconnected [7]. It is also 
possible that sensory feedback increases performance but that the effort increases as well. 
For this reason, there is a need for a quantitative and objective measurement of 
the mental effort, or cognitive workload, of performing a task, such as lifting an egg, with 
and without sensory feedback. Such a method could also be used to evaluate different 
stimulation paradigms, i.e. different ways to stimulate the nerves. 

The first problem that arises when designing a method to measure cognitive workload is 
that there are currently only four people in Sweden with an implanted e-OPRA system 
[8]. This makes the sample size insufficient for reaching meaningful conclusions. Therefore, 
the conditions of lifting a fragile object with an e-OPRA prosthesis need to be replicated 
with intact limb subjects as a complement. One way of doing this is to measure the 
cognitive workload for lifting the fragile object with the hand and compare this to when the 
sensory feedback is removed by using anesthesia on the hand and digits. 

The second problem is that research that involves a physical intervention needs to be 
approved by the Swedish Ethical Review Authority [9]. However, this application normally 
takes 60 days to be approved [10], which makes this approach unsuitable for the time limit 
of this project. 

When measuring cognitive workload, there are several methods to choose from. 
These include pupil size measurements (e.g. [11]–[13]), heart rate variability (e.g. 
[14]–[16]) and breathing frequency (e.g. [16]). In this work, we use two of the most 
common techniques to measure cognitive workload: electroencephalography (EEG) using 
event-related potentials (ERP, e.g. [17]–[19]) and a self-assessment tool called NASA-RTLX 
(a task load index developed by the National Aeronautics and Space Administration 
[20], e.g. [15], [21], [22]). We adapt a method based on these techniques and test whether 
it can be used to assess cognitive workload in this kind of task. 
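
As an illustration of how a NASA-RTLX score is typically obtained, the sketch below computes the raw task load index as the unweighted mean of the six NASA-TLX subscale ratings. The subscale keys, the 0 to 100 rating range and the example ratings are assumptions made for illustration only; the scoring actually used in this work is described in Section 3.3.3.

```python
# Hypothetical key names for the six NASA-TLX subscales.
SUBSCALES = (
    "mental_demand", "physical_demand", "temporal_demand",
    "performance", "effort", "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six subscale ratings (assumed 0-100)."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Made-up ratings for one participant and one condition.
example_ratings = {
    "mental_demand": 70, "physical_demand": 40, "temporal_demand": 55,
    "performance": 30, "effort": 65, "frustration": 40,
}
print(raw_tlx(example_ratings))
```

Using the plain mean, rather than the weighted score of the full NASA-TLX, is what distinguishes the raw variant and removes the need for the pairwise weighting step.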

In 2019 a small study was carried out as part of Linn Berntsson’s master’s thesis at BNL. 
Her method was designed for testing amputees. It was run with one amputee and compared 
three different conditions: no task, with sensory feedback and without sensory feedback. In 
the last two, the subject was instructed to lift a force sensitive cube back and forth over 
a small barrier. The first two conditions were also run with two intact limb subjects. The 
cognitive workload was evaluated using a combination of ERP measurements and the 
NASA-RTLX self-assessment tool. The results showed promise, but the method was tested 
with too few subjects to draw any firm conclusions [23]. 

1.2 Brief description of this work 

This report is a master’s thesis at Chalmers University of Technology, where both 
writers, Fanny Apelgren and Ida Pettersson, have studied Engineering Physics. We then 
moved on to master’s programmes in Biomedical Engineering and Learning and Leadership, 
respectively. The thesis was written at the Department of Communication and Learning in 
Science (CLS) and the project was executed at Chalmers Biomechatronics and 
Neurorehabilitation Laboratory (BNL) at the Department of Electrical Engineering, under 
Associate Professor Max Ortiz Catalán. The project has been supervised by Eva 
Lendaro (BNL) and Sheila Galt (CLS). 

In this study, we measured event-related potentials (ERP) using EEG equipment for three 
different conditions, each recorded in three blocks. The participants performed a lifting task 
by moving a force sensitive cube back and forth over a small barrier while performing a 
secondary task of listening to and counting sounds, known as an oddball task. The cube lit 
up when it was pressed too hard, and the weight of the cube could be changed to vary the 
difficulty of the task between easy and hard. The participants were told that if the cube lit 
up, it had been pressed too hard and “broke”, and that they should try to move the cube as 
many times as possible without “breaking” it. The third condition consisted of only the 
secondary task, i.e. counting sounds. This is called the no task condition. The EEG data 
were studied to examine whether differences in event-related potential components and 
frequency bands could be seen. 
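
To make the idea of an ERP measurement concrete, the following minimal sketch averages a set of stimulus-locked epochs sample by sample and then takes the mean amplitude inside a latency window. It is an illustration under stated assumptions and not the processing pipeline of this thesis: the function names, the toy data and the 300 to 400 ms window are hypothetical, and the amplitude measure and latency windows actually used are discussed in Section 3.5.1.

```python
from statistics import fmean


def average_epochs(epochs):
    """Average equally long epochs sample by sample to obtain an ERP."""
    if not epochs:
        raise ValueError("need at least one epoch")
    length = len(epochs[0])
    if any(len(epoch) != length for epoch in epochs):
        raise ValueError("all epochs must have the same length")
    return [fmean(samples) for samples in zip(*epochs)]


def mean_amplitude(erp, times_ms, window_ms):
    """Mean ERP amplitude inside a latency window (bounds inclusive)."""
    start, stop = window_ms
    values = [v for v, t in zip(erp, times_ms) if start <= t <= stop]
    if not values:
        raise ValueError("latency window contains no samples")
    return fmean(values)


# Toy data: three epochs sampled every 100 ms after the stimulus (microvolts).
times_ms = [0, 100, 200, 300, 400, 500]
epochs = [
    [0.0, 1.0, 2.0, 6.0, 4.0, 1.0],
    [0.0, 2.0, 1.0, 8.0, 6.0, 0.0],
    [0.0, 0.0, 3.0, 7.0, 5.0, 2.0],
]
erp = average_epochs(epochs)
# Hypothetical window for a late positive component such as P3.
p3_amplitude = mean_amplitude(erp, times_ms, (300, 400))
print(erp, p3_amplitude)
```

Averaging over many epochs is what makes the small event-related response visible, since activity that is not time-locked to the stimulus tends to cancel out.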

During each condition, the number of times the cube was lifted over the barrier and the 
number of times it was “broken” were counted. The participants also reported the 
number of sounds they had counted in each block and filled out a self-assessment form, 
the NASA-RTLX [20], to report the effort of each condition. The number of lifts and 
“breaks” per minute, together with the difference between the number of presented and 
counted sounds, was studied as an indication of the subject’s performance. The result of 
the NASA-RTLX was used to measure perceived effort, and the EEG data served as a 
quantitative measurement of the cognitive workload. 
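
The performance measures described above amount to simple arithmetic, which can be sketched as follows. The record layout, the field names and the example numbers are hypothetical and only serve to show how lifts per minute, “breaks” per minute and the counting error could be derived for one block.

```python
from dataclasses import dataclass


@dataclass
class BlockRecord:
    """One recorded block; all field names are hypothetical."""
    duration_s: float      # length of the block in seconds
    lifts: int             # times the cube was moved over the barrier
    breaks: int            # times the cube lit up ("broke")
    sounds_presented: int  # sounds the participant was asked to count
    sounds_counted: int    # number reported by the participant


def performance(block):
    """Return per-minute rates and the counting error for one block."""
    if block.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    minutes = block.duration_s / 60.0
    return {
        "lifts_per_min": block.lifts / minutes,
        "breaks_per_min": block.breaks / minutes,
        # Signed error: a negative value means sounds were missed.
        "count_error": block.sounds_counted - block.sounds_presented,
    }


# Made-up numbers for a three-minute block.
example = BlockRecord(duration_s=180.0, lifts=45, breaks=6,
                      sounds_presented=30, sounds_counted=28)
print(performance(example))
```

Expressing lifts and “breaks” as rates per minute makes blocks of different duration comparable.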

The different types of data and the experimental procedure used in this study were 
gathered from previous work on cognitive workload and recommendations from 
experienced scientists in the field. The force sensitive cube was designed with a few 
similar models as inspiration but was adapted to the criteria of this study. It has also 
been designed with the possibility of further development in mind, to enable use in 
future studies. 
 

1.3 Aims and limitations 

The present study aims to develop and examine a method that can be used to measure the 
difference in cognitive workload with and without sensory feedback. The method by Linn 
Berntsson [23] has served as an inspiration, but we have mainly looked at other studies 
that have been tested with more subjects to develop our own improved methodology that is 
also adapted for intact limb subjects. Since anesthesia cannot be used without ethical 
approval, we will test the method with other conditions to simulate the difference between 
having and not having sensory feedback. Therefore, different levels of difficulty will be used as a 
substitute. The aim of this is to investigate how the variation in cognitive workload between 
the different levels of difficulty can be measured. We have also looked for signs of a 
learning process and the method has been tested using ten intact limb subjects. If the 
method can detect differences between different levels of difficulty, it could also be 
expected to be able to measure the difference between with and without sensory feedback, 
since these conditions are also believed to be different in difficulty. Therefore, the goal is for 
this study to serve as a pilot test in preparation for a future study where this methodology, 
or an adaptation of it, will be used to investigate the difference in cognitive workload of 
performing a lifting task with and without sensory feedback, by using anesthesia. 

With the limited timeframe of this work, we have done our best to process and analyze all 
the EEG data. However, there remain other ways to examine the data that have been 
recorded. These will be discussed further in the section about future work. Among other 
things, we will not examine the EEG data or the results from NASA-RTLX for different parts 
of each condition. Signs of a possible learning process within the conditions will only be 
examined by looking at the factors measuring performance. 
  


 
1.4 Research questions 

1) Will the subjects experience the expected difference in difficulty between the different 
conditions, on group and/or individual level, as indicated by… 

a) …the perceived effort, given by the scores on the NASA-RTLX? 

b) …the performance, given by number of lifts, success rate and accuracy of the 
oddball task? 

2) Can the proposed method be used to measure differences in cognitive workload, on 
group and/or individual level as indicated by… 

a) …event-related potential (ERP) components? And if so, which components? 

b) …frequency bands? And if so, which frequency bands? 

3) Which latency windows (for ERP components only) and electrode sites should be used 
to examine the differences in cognitive workload with ERP components and frequency 
bands? 

4) Can a learning effect be observed during each condition by comparing the performance 
for each of the three blocks? 

 
1.5 Contribution 

The aim of this study is part of a larger goal, that ultimately comes down to improving the 
quality of life of amputees. By preparing for the future study with anaesthesia, this work is a 
step to provide quantitative evidence that adding sensory feedback in artificial limbs does 
lower the cognitive workload. This knowledge in turn will provide an incentive for the further 
development of prostheses. In addition to providing a method and a pilot study for this 
future study, this method might, as mentioned, also be used to evaluate different aspects of 
the prosthesis design, for example different stimulation paradigms. 

This work has also contributed to the total knowledge about ERP experiments and 
uncovered several conflicting opinions about the best procedures in the field. We have also 
noted both the lack and the importance of motivation and reasoning that explain why certain 
methods were chosen. Besides the results of this thesis, the collected data could also be 
analysed further, and more aspects could be examined using for example ANOVA 
statistical analysis, which seems to be the most common procedure in ERP studies (e.g. 
[17], [18], [24]). 

Studies using this or similar methods might also examine the learning process of receiving 
and learning to use a prosthesis. Even though the neuromuscular interface and the sensory 
feedback can be shown to decrease the cognitive workload, learning to live with a 
prosthesis will still demand practice and learning new strategies. The study of this progress 
can be an important step in the development of both prosthesis technology and the 
strategies used to teach someone to use a prosthesis.   
 

1.6 Thesis outline 

In the following chapters the thesis work will be described in further detail, starting with the 
theoretical background that lays the foundation of the work. After that follows the methods 
section where different parts of the method, from experiment procedures to data processing 
and analyses, are discussed. We describe possible approaches, discuss how they have 
been used in other studies and present what we have chosen to do and why. The results for 
the performance, perceived effort and EEG data are presented and thereafter discussed. 
Lastly, there is a conclusion and ideas for future work. 
 


 
2 Theory 
We will start by introducing the main concept of this work: cognitive workload. We will also 
give a background of the measurement methods: electroencephalogram (EEG) and event-
related potentials (ERP). The latter will also be discussed further in the methods section 
(section 3). Here we will give a brief introduction of the technique together with how ERP 
components and frequency bands can be used to assess cognitive workload. We will then 
conclude the section by discussing how and why EEG measurements can differ between 
different individuals. 
 

2.1 Cognitive workload 

We are all aware that some tasks demand more cognitive resources than others. Most 
people have no trouble walking and talking at the same time, but when you are asked to 
solve some equations it might be harder to keep up an interesting conversation. This comes 
back to attention and cognitive workload (also known as mental workload or cognitive load).  

There are different definitions of these rather familiar concepts, and the relationship 
between attention and cognitive workload is also a matter of discussion. Rietschel et al. [19] 
state that “attention refers to the directed allocation of cognitive resources”. Similarly, 
Kantowitz [25] argues that cognitive workload is a subset of attention. Magill [26] elaborates 
on this statement by saying that “attention refers to several characteristics associated with 
perceptual, cognitive, and motor activities” and that “a related view extends the notion of 
attention to the amount of cognitive effort we put into performing activities”. In this work 
there is no need to keep these interlaced concepts apart, so attention and cognitive 
workload will both be used in reference to the cognitive resources demanded of a person to 
perform a certain activity or task. 

To get back to the question of why we can perform some tasks simultaneously but not 
others, we need to introduce what is known as attentional reserve, or attention capacity. 
This theory states that we have a certain amount of attention, or cognitive workload, and 
that this can be split to do several things. Each task demands some of the attention from 
our reserve and leaves the rest. In the example above, walking does not demand a lot of 
cognitive workload and leaves some attention that you can use, for example, for talking. 
Meanwhile, solving equations might not leave enough spare attention in the reserve for 
conversation, and perhaps walking and talking at the same time does not allow you to follow a 
map to find your way in a new place. 

That means that cognitive workload has an inverse relationship to the remaining resources 
of attentional reserve [27]. When the cognitive workload increases for a task, for example if 
you try to solve increasingly complex equations, the resources left for other tasks decrease.  

Workload and attention seem to be closely related to performance and learning. Kantowitz 
[25] suggests a model where too low or too high workload leads to lower performance and 
this view is supported by Winnie et al. [28], who say that efficient learning happens at the 
optimal level of cognitive workload. As a further link to learning, Magill [4] suggests that a 
new task takes a lot of cognitive effort in the beginning, but that learning takes place and 
thereby the attentional demands decrease with practice. This is known as the practice 
effect and means that learning of a task can be indicated in different ways: by a 
decrease of cognitive workload together with a stable level of performance, by an increase 
of performance with a stable level of cognitive workload, or by the combination of increased 
performance and decreased workload. 

Something else that needs to be considered when looking at attention and cognitive 
workload is how it is balanced by the demands of the task at hand. If the challenge of the 
task is too low one will experience boredom, and frustration will emerge if the challenge is 
too much compared to the skill level. The area in between these two outer limits, where skill 
and challenge are perfectly matched, is usually called flow. This is the feeling that can 
make you keep up a task, for example a video game, for a long time. If you are bored or 
frustrated because the game is too easy or too hard you are likely to stop playing. It is 
therefore believed that both these conditions decrease the attention given to the task [29]. 
 

2.2 Electroencephalogram (EEG) 

One commonly used way to measure cognitive workload is the 
electroencephalogram (EEG). EEG is a clinical tool that measures the electrical activity of 
the cerebral cortex with electrodes attached to the human scalp. The cerebral cortex is the 
outermost layer of the cerebrum, which is the largest part of the brain and is divided into 
a left and a right hemisphere. Each hemisphere is in turn divided into four lobes, the frontal, 
temporal, parietal and occipital lobes, which are associated with different functions of the 
human body [30]. The brain and its different regions can be seen in Figure 1. 

 
Figure 1. The brain and its four different lobes: frontal, parietal, occipital and temporal. 

The electrodes that are used to measure the electrical activity of the brain usually consist of 
a metal disk or pellet. They can be attached to the head with stickers, but since the number 
of electrodes used for a measurement normally is more than 16, they are usually attached 
to a cap that is much easier to fit to the subject’s head. The electrodes pick up 
electrical activity from the brain in the form of electrical potentials, i.e. the potential for 
current to flow from one electrode to a ground electrode. Since the recorded signals are in 
the range of 0 to 100 µV they typically need to be amplified by a factor of 1000-100000 
before they are further processed [31]. 

Electrodes are mainly characterized along two dimensions: they can be either wet or dry, 
and either passive or active [32]. Wet electrodes are generally Ag/AgCl electrodes, and one 
needs to put a conductive gel between the scalp and the electrode to get a good and stable 
electrical connection. This also helps lower the impedance of the electrode-scalp 
connection. A lower impedance induces less noise, which is important for the quality of 
the measurement [31]. Dry electrodes instead consist of a single metal, often stainless 
steel, that acts as a conductor, and the electrodes are put directly on the scalp. The difference 
between passive and active electrodes is that the active electrodes include a 
pre-amplification module after the conductive material. The signal can thereby be amplified 
before additional noise is introduced when the signal travels from the electrode to the 
system that measures the signal. This increases the signal-to-noise ratio. For passive 
electrodes there is no pre-amplification, which means that noise arising as the signals travel 
from the electrode to the measuring system will be amplified as much as the EEG signals. 
The different types of electrodes can be combined, for example one can use active, wet 
electrodes [32]. 
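
The effect of pre-amplification on the signal-to-noise ratio can be illustrated with a deliberately simplified sketch. It assumes a single noise source that is added along the cable, ideal noiseless amplifiers and amplitude ratios rather than power ratios. The function names and the numbers are chosen for illustration only.

```python
def snr_passive(signal_uv: float, cable_noise_uv: float) -> float:
    """Passive electrode: cable noise is added before any amplification,
    so signal and noise are later amplified by the same gain."""
    # The gain cancels in the ratio: (gain * signal) / (gain * noise)
    return signal_uv / cable_noise_uv


def snr_active(signal_uv: float, cable_noise_uv: float, pre_gain: float) -> float:
    """Active electrode: the signal is amplified at the electrode first,
    and the cable noise is added to the already amplified signal."""
    return (pre_gain * signal_uv) / cable_noise_uv


# Hypothetical values: 50 µV EEG signal, 10 µV cable noise, pre-gain of 10
print(snr_passive(50.0, 10.0))       # 5.0
print(snr_active(50.0, 10.0, 10.0))  # 50.0
```

In this simplified model the signal-to-noise ratio improves by exactly the pre-amplification gain, since the cable noise is the only quantity that is not amplified at the electrode.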
 

2.2.1 Referencing 

For an EEG amplifier to work, the ground electrode must be connected to a ground 
circuit. This ground circuit is typically connected to other parts 
of the amplifier, which means that electrical noise is introduced at the site of the ground 
electrode. This means that there is noise present in the signal from the ground electrode 
that is not present in the signal from the other electrodes. To get rid of this noise, EEG 
recording systems use differential amplifiers. With the differential amplifier a reference 
electrode is used together with the operating electrode and the ground electrode to cancel 
out the noise. The differential amplifier records the potential between the operating 
electrode (O) and the ground electrode (G), as well as the potential between the reference 
electrode (R) and G. The amplifier then outputs the difference between these potentials, 
(O - G) - (R - G) = O - R, and since the noise from the ground circuit is the same for both 
O - G and R - G, any noise generated at G will be eliminated in O - R. In other words, to get a 
single channel of EEG all three electrodes (operating, reference and ground) are needed. [31] 
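
The cancellation (O - G) - (R - G) = O - R can be checked numerically with a small sketch. The potentials and the ground noise below are randomly generated illustration values, and the function name is an assumption.

```python
import random


def differential_output(operating, reference, ground):
    """Return (O - G) - (R - G) sample by sample, as a differential amplifier would."""
    o_minus_g = [o - g for o, g in zip(operating, ground)]
    r_minus_g = [r - g for r, g in zip(reference, ground)]
    return [og - rg for og, rg in zip(o_minus_g, r_minus_g)]


random.seed(0)  # fixed seed so the sketch is reproducible
n_samples = 5
operating = [random.uniform(-50, 50) for _ in range(n_samples)]  # potential at O (µV)
reference = [random.uniform(-50, 50) for _ in range(n_samples)]  # potential at R (µV)
ground = [random.gauss(0, 20) for _ in range(n_samples)]         # noisy potential at G (µV)

output = differential_output(operating, reference, ground)
expected = [o - r for o, r in zip(operating, reference)]

# The noisy ground term drops out, up to floating-point rounding
print(all(abs(a - b) < 1e-9 for a, b in zip(output, expected)))
```

The ground noise is deliberately made large compared to the signals to show that its size does not matter, as long as it is identical in both measured potentials.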
 

2.2.2 Electrode positioning 

To get useful data that are comparable to other studies and possible to analyse, it is 
important to position the electrodes correctly on the head. The most commonly used 
system to define the position of the electrodes is the 10-20 system [31]. Originally, this 
system used 21 electrodes, where two of them were placed on the earlobes and the rest 
were placed according to measurements of specific landmarks on the scalp. The landmarks 
used are the nasion (just above the nose, between the eyes), the inion (the indent in the back 
of the head) and the left and right pre-auricular points (right in front of each ear), see Figure 
2. An equator through the nasion, inion and the left and right pre-auricular points, together 
with a line between the nasion and inion and a line between the left and right pre-auricular 
points, defines the measurements used to place the electrodes. The equator and the lines 
are then divided into sections with the first mark at 10 % and the following marks at 20 % 
intervals, resulting in the electrode positioning in Figure 2a. An extended 10-20 system with 
128 electrodes can also be used, with marks at every 10 %, see Figure 2b [33]. Here we 
can also see that the electrodes are marked with letters and numbers. This is a way to 
indicate the location of the electrode. The letter gives the scalp region (F: frontal, T: 
temporal, C: central, P: parietal, O: occipital). The numbers indicate the distance from the 
center, where larger numbers are further from the central line. Even numbers are used for 
the right hemisphere and odd numbers for the left. The letter “z” stands for the number zero 
and is used instead of the number “0” to avoid confusion with the letter “O”. 
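
The labelling rules above can be summarised in a short sketch. It only covers labels made of one region letter followed by “z” or a number (such as Fz, C3 or P4); labels with more than one letter, which occur in the extended system, are outside the scope of this illustration. The function name is an assumption.

```python
REGIONS = {"F": "frontal", "T": "temporal", "C": "central",
           "P": "parietal", "O": "occipital"}


def describe_electrode(label: str) -> tuple[str, str]:
    """Return (scalp region, side) for a simple 10-20 label such as 'Fz' or 'C3'."""
    letter, suffix = label[0].upper(), label[1:]
    if letter not in REGIONS or not (suffix.lower() == "z" or suffix.isdigit()):
        raise ValueError(f"label not covered by this sketch: {label!r}")
    if suffix.lower() == "z":
        side = "midline"          # "z" stands for zero, i.e. on the central line
    elif int(suffix) % 2 == 0:
        side = "right"            # even numbers: right hemisphere
    else:
        side = "left"             # odd numbers: left hemisphere
    return REGIONS[letter], side


for name in ("Fz", "C3", "P4", "O1"):
    print(name, describe_electrode(name))
```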

Figure 2. Electrode positioning according to the normal and extended 10-20 system. The letters stand for scalp region (F: 
frontal, T: temporal, C: central, P: parietal, O: occipital). The numbers represent the distance from the center, with even 
numbers to the right and odd numbers to the left. “z” stands for zero and is used instead of the number to avoid confusion 
with the letter “O”. 
 

a) 10-20 system 

 
b) Extended 10-20 system 

2.2.3 Artifacts and Noise 

The electrodes do not only detect signals from activity in the brain, but also pick up other, 
non-neural, signals. Every time the subject moves, clenches their jaw, frowns, moves their 
eyes, blinks or does something similar, this gives an electrical signal that can be picked up 
by the electrodes. Electrical signals from muscle activity are called electromyographic (EMG) 
signals. External sources, such as electrical equipment, can also emit electrical signals that 
can be picked up by the EEG electrodes. Another source of disturbance is the equipment 
itself, for example if the connection between an electrode and the scalp is unstable. 

All of these non-EEG signals are called artifacts [31]. Some of the muscle movements, 
especially those of the eyes since they are located close to the electrodes, can cause large 
disturbances of the recorded signal. Others, like electrical equipment or some muscle 
activations, are smaller and more regular. Both of these kinds of artifacts need to be 
handled to be able to see the subtle changes of the small, often below 100 µV, neural 
activity. How this can be done is discussed in section 3.4.5. 
 

2.3 Event-Related potentials 

Here we will briefly introduce the event-related potential (ERP) technique, which is the 
cornerstone of the method of this thesis. The different aspects of this technique will be 
discussed in further detail in section 3. We will also present the concept of ERP 
components and frequency bands as a way to measure cognitive workload. Lastly, we will 
discuss different reasons for why measurements of cognitive workload can differ between 
different individuals.  
 


 
2.3.1 Basic concept  

A common technique when measuring EEG is to use event-related potentials (ERPs). This 
is a way to single out certain activities in the brain. Raw EEG data is often hard to use, 
since it is a mix of all the neural activities in the brain. Even if you are told to focus on a 
certain task, your mind easily wanders. The ERP technique was used to study workload as 
early as 1977 by Wickens et al. [34] and builds upon the idea that a certain stimulus, or 
event, can trigger a specific brain activity. 

The ERPs are measured by presenting some kind of stimulus, for example sounds or flashes 
of light, repeatedly while measuring EEG. Each stimulus is time-locked to the EEG data and 
marked by a line at the appropriate time. Later, a short section of EEG data, a so-called 
epoch, is extracted around every stimulus. The epoch begins a short time before the 
stimulus and ends a certain time after the stimulus. It is common to use 100-200 
milliseconds pre-stimulus and 800-1000 milliseconds post-stimulus. The idea is that noise 
that is unrelated to the stimulus will cancel out when many epochs are averaged together, 
leaving the EEG signals that are related to the stimuli. 
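
The principle of epoching and averaging can be sketched on synthetic data. Everything in the example is an assumption made for illustration: the sampling rate, the number of stimuli, the noise level and the shape of the stimulus-locked response. It is not the processing pipeline used in this work.

```python
import math
import random


def extract_epochs(signal, event_samples, pre, post):
    """Cut one epoch per event, from `pre` samples before to `post` samples after onset."""
    return [signal[s - pre: s + post] for s in event_samples
            if s - pre >= 0 and s + post <= len(signal)]


def average_epochs(epochs):
    """Point-by-point mean across epochs, giving the averaged ERP waveform."""
    n = len(epochs)
    return [sum(values) / n for values in zip(*epochs)]


random.seed(1)
fs = 100                    # assumed sampling rate in Hz
pre, post = 20, 80          # 200 ms pre-stimulus, 800 ms post-stimulus
n_events = 200
event_samples = [100 + 150 * k for k in range(n_events)]

# Synthetic recording: Gaussian noise (standard deviation 10 µV) ...
signal = [random.gauss(0.0, 10.0) for _ in range(event_samples[-1] + 200)]
# ... plus a 5 µV stimulus-locked bump centred 300 ms after each stimulus
for s in event_samples:
    for i in range(post):
        t = i / fs          # seconds after stimulus onset
        signal[s + i] += 5.0 * math.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))

erp = average_epochs(extract_epochs(signal, event_samples, pre, post))
peak_index = max(range(len(erp)), key=lambda i: erp[i])
print((peak_index - pre) / fs)   # latency of the largest value, in seconds
```

In a single epoch the 5 µV response is hidden in noise that is twice as large. Averaging N epochs reduces unrelated noise by roughly the square root of N, so with 200 epochs the response stands out clearly in the average.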

The book “An Introduction to the Event-Related Potential Technique” by Steven J. Luck [31] 
is a commonly used reference in this work. This, together with articles using the ERP 
technique, has helped us make all of the decisions involved in conducting an ERP 
experiment. 
 

2.3.2 ERP Components 

Luck [31, p. 68] gives the following definition of ERP components: 

“An ERP component can be operationally defined as a set of voltage changes that are 
consistent with a single neural generator site and that systematically vary in amplitude 
across conditions, time, individuals, and so forth. That is, an ERP component is a source of 
systematic and reliable variability in an ERP data set.” 

These voltage changes can then be picked up by the EEG electrodes, with different weights 
depending on the relative location of the source and each electrode.  

Here it is important to note the difference between ERP components and ERP peaks. The 
peaks in the ERP also show voltage changes, but these changes do not necessarily 
reflect changes in a given component. For example, if the voltage of a positive peak is 
reduced it might reflect a reduction of an underlying positive component, but it might also 
reflect an increase of a negative component at the same latency, i.e. at the same time 
relative to the stimulus. There are some techniques to extract the components from the data, a 
common one being independent component analysis (ICA), which will be used and discussed 
in this work (see section 3.4.5.2). However, none of these methods can be completely 
trusted, and they should be used with caution [31]. 

So, a single peak can never be assumed to represent a single component. Nevertheless, 
one can look at many different electrodes and study the latency of a peak. Since the time 
it takes a signal to travel from the source to each electrode is practically the same 
regardless of the distance, one component will appear at the same latency at different electrodes. 

To avoid having to investigate all electrode sites (since they can be many), and still be able 
to draw conclusions from an ERP waveform, Luck recommends using the components that 
have been shown to be useful in earlier studies, either from other similar experiments or, if you 
are first in your field, from other fields [31]. 

As a way to facilitate discussions about ERP and comparisons between studies there is a 
conventional method for naming the different peaks of an averaged ERP waveform. These 
names start with a letter, either P or N, to denote whether a peak is positive or negative. 
After that follows a number, describing one of two things. In the first convention the number 
describes the ordinal position of the specific peak, i.e. the first positive peak would be called 
P1 and the third negative peak N3. This convention is depicted in Figure 3. However, this 
plot uses the old convention of plotting ERP waveforms with the negative axis directed 
upwards. In this work we will use the same naming convention but with the positive axis 
upwards, as is common in most modern ERP studies [31]. The other possible way is to 
name the peak according to latency (i.e. the time after stimulus onset), so that a positive 
peak occurring around 300 ms after the stimulus onset would be called P300. In some 
cases, peaks are also named to describe their function or location, such as the error related 
negativity (when the subject discovers that he or she did something wrong) or late positive 
potential. [31] 

 
Figure 3. An example of an ERP waveform where each peak is named after the convention used in this thesis. 
The letter (P or N) stands for positive or negative (although note that negative is upwards in this plot) and the number 
stands for the peak’s ordinal position. 

As mentioned, each component can be referred to either by the ordinal position of 
the peak (e.g. N1) or by the latency (e.g. P200). The latter describes the latency at which the 
component is usually found, but this varies between different experiments and therefore this 
notation can be confusing. Luckily, the latency is often about 100 times the ordinal position, 
so that P1~P100, N2~N200 and so on [31]. However, some old conventions linger and P3 
is still often referred to as P300 because it was first found about 300 ms post-stimulus, even 
though it more commonly arises later than that [31]. The latency also tells us something 
about the stage of the stimulus processing by the brain: earlier 
components arise from perceptual processing in the brain while later components reflect 
later stages of the reaction, including evaluation of the stimulus [17]. Here we describe 
the components that were most common in our literature study and that have been shown 
to be an indication of cognitive workload. We will use the naming convention based 
on ordinal position. 
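
The two naming conventions and the rule of thumb that the latency is about 100 times the ordinal position can be summarised in a short sketch. The function name and the threshold used to tell an ordinal position from a latency (numbers below 10 are treated as ordinal positions) are assumptions made for this illustration.

```python
import re


def parse_component(name: str) -> dict:
    """Split a peak name such as 'P3' or 'N200' into polarity and number."""
    match = re.fullmatch(r"([PN])(\d+)", name.strip().upper())
    if match is None:
        raise ValueError(f"not a P/N peak name: {name!r}")
    polarity = "positive" if match.group(1) == "P" else "negative"
    number = int(match.group(2))
    # Assumption of this sketch: numbers below 10 are ordinal positions,
    # larger numbers are latencies in milliseconds.
    if number < 10:
        return {"polarity": polarity, "ordinal": number,
                "approx_latency_ms": 100 * number}
    return {"polarity": polarity, "latency_ms": number,
            "approx_ordinal": round(number / 100)}


print(parse_component("P3"))    # ordinal convention
print(parse_component("N200"))  # latency convention
```

The approximate values are only a rule of thumb, since the actual latency of a component varies between experiments, as noted above for P3.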
 


 
2.3.2.1 N1 

The N1 component is specific to auditory stimuli and is characterized as one of the initial 
components in an auditory ERP, called long-latency auditory ERP components. It has been 
suggested that the N1 component signals the detection of acoustic change in the 
environment. The single-peak N1 component is evoked by short transient stimuli or by 
onsets of noise and has been shown to consist of three temporally overlapping 
constituents. The dominant contribution to the N1 component is most prominent at the 
fronto-central electrodes. [35] 

The N1 component has been linked to cognitive workload in several studies [18], [36]–[39]. 
One of the examined studies showed that N1 varied between some of the levels but not 
all [24], and two failed to show significance for N1 [17], [19]. In these studies, N1 was found 
between 75 and 180 ms post-stimulus, and the studies that were successful in linking N1 
to cognitive workload seem to have found it in the later part of that interval, see 
Figure 4. 
 

2.3.2.2 N2 

The N2 component is known for containing several different subcomponents: N2a, N2b, 
N2c. However, the basic N2 component (that will be discussed here) is said to be elicited 
by a repetitive, nontarget stimulus and it gets a larger amplitude if the stimulus is novel (not 
repeated). The latency of the N2 component depends on whether the stimulus is 
task-relevant or task-irrelevant, with later latency if the stimulus is task-relevant 
(the difference between task-relevant and task-irrelevant stimuli is discussed further in 
section 3.2). Also, if the stimulus is auditory a larger effect is seen at the central sites and if 
the stimulus is visual the effect shifts to be larger at the posterior sites instead. [31] 

The N2 component has been examined in two of the studies that we have looked at [36], 
[37] and both showed that it successfully assesses the cognitive workload. They found N2 in 
the interval 200 to 400 ms post-stimulus, see Figure 4. 
  

2.3.2.3 P2 

The P2 component is most prominent at the frontal and central scalp sites and is typically 
larger for stimuli containing simple, infrequent target features. At posterior sites, the P2 
component often interferes with N1, N2 and P3 and therefore it is hard to distinguish at 
posterior sites. [31] 

Two of the studies in our literature study could show a significant correlation between P2 
and the different levels [17], [18], but three others could not verify this correlation [19], [24], [37]. 
P2 was found between 166 and 270 ms post-stimulus, see Figure 4. 



 
Figure 4. A sketch of the latency ranges where the different ERP components have been found according to our literature 
study. Each line is marked with the reference for the study it is taken from. Green lines indicate that there has been a 
significant difference between the different levels of difficulty. Orange means that a difference could only be seen between 
some of the tested levels, but not all. Grey lines mean that no significant differences could be shown. Studies that have 
not specified the latency range are marked with “?”. 

2.3.2.4 P3 

The P3 component is the most examined ERP component when it comes to cognitive 
workload [40], and that shows in our literature study. There are also several other 
components that are closely related to P3, and sometimes hard to differentiate from it. The 
ones examined in the studies we have read are novelty P3, P3a, P3b and early and late 
P3a. 

The P3 component is typically evoked by rare task-relevant events and it is said to reflect 
an updating of the context information, which is often assumed to be an update of the working 
memory. There is also clear evidence that the amplitude of the P3 component can be 
influenced by the amount of attention allocated to a stimulus, which has been most clearly 
observed in dual-task experiments where the subject is to perform two tasks at the same 
time. The latency of the P3 component changes over the scalp and is shorter over the 
frontal areas and longer over the parietal areas. It also differs between individuals 
depending on how rapidly the subject can allocate their attentional resources, such that the 
latency is shorter for subjects with higher mental speed. [35] 

As mentioned above, the P3 component can be divided into several subcomponents: 
mainly the P3a, P3b and novelty P3 components. These subcomponents are typically 
elicited by different task conditions and can be recognized by their different topographic 
distributions. The P3a subcomponent has a centro-parietal maximum amplitude distribution 
and is elicited by rare tones presented in a series of frequent tones without a task. If novel 
distractors (such as a dog barking) are used in a sequence of frequent tones a 
fronto-central P3 potential is elicited, which is called the novelty P3. The P3b subcomponent is 
elicited by task-relevant stimuli and has a parietal maximum amplitude distribution. Often, 
P3b and the classic P300 (P3) are said to be the same component. It has also been found that the 
novelty P3 differs from the classic P300 component and that the P3a and novelty P3 are 
most likely variants of the same ERP that varies in scalp topography depending on 
attentional and task demands. [35] 

Many studies have linked the P3 component and its relatives to cognitive workload [17], 
[18], [34], [37], [39]–[42]. One study failed to show significance for P3 [29] and one could 
only show significance between some of the levels [36]. In the successful cases, P3 has 
been found in the interval between 270 and 517 ms post-stimulus, and more commonly in 
the earlier part of that interval, see Figure 5. 

When looking at the related novelty P3, it has also been proven successful in assessing 
cognitive workload by several studies [19], [24], [43]. One has only shown a change of 
amplitude between some of the levels examined [44]. The novelty P3 component has been 
observed between 250 and 332 ms post-stimulus, see Figure 5. 

Lastly, some studies have linked P3a to cognitive workload [38], [45], where one split the 
component into early and late P3a. Another study saw no correlation between the different 
levels and either P3a or P3b [43]. The components were found somewhere in the range 
between 210 and 405 ms post-stimulus, see Figure 5. 
 

2.3.2.5 LPP  

The late positive potential (LPP) is commonly identified as a midline centro-parietal ERP 
with a strong connection to emotional stimuli such as pleasant and unpleasant pictures. It 
becomes evident at 300 ms, and can therefore be mistaken for the P3 component, but the 
LPP component often continues for latencies up to 2000 ms, even though it is maximal in 
the latency range of 300-1000 ms. LPP has also been shown to indicate reaction time to a 
stimulus, in that the LPP amplitude increases when the reaction time increases. [35] 

The LPP component has been shown to be an indicator of cognitive workload [17], [18], 
[21]. The findings have been within the interval 400 to 610 ms post stimulus, but two of 
these three studies found LPP close to the end of this interval, see Figure 5. 



 
Figure 5. As in the previous figure, this is a sketch of the latencies where each component has been found, and each study 
is marked with its reference number. This figure uses the same color coding as the previous one (green: significance, 
orange: partial significance and grey: no significance), but here darker colors are used to indicate different versions of P3. 
This is also indicated by letters where “a” is P3a, “b” is P3b, “N” is Novelty P3, “ea” is early P3a and “la” is late P3a. A dot 
indicates that no interval was given, only the latency of the peak. As before, “?” denotes studies where the latency has not 
been specified. 

2.3.3 Frequency bands 

The measured EEG signals often have an oscillatory, repetitive behaviour and therefore the 
collective electrical activity of the cerebral cortex is often called a rhythm. The EEG rhythms 
differ between individuals and depend on things like the mental state of the subject, for 
example whether they are awake or sleeping. Since the electrical activity arises from the 
activation of neurons in the brain, the rhythms can have different frequencies depending on 
how synchronous the activated neurons are. The frequency range of the rhythms is 
approximately 0.5 to 30-40 Hz and is often divided into five frequency bands: 
Delta, Theta, Alpha, Beta and Gamma [30]. The Alpha band is also sometimes subdivided 
into Low- and High-Alpha. The ranges of each band differ slightly between different studies. 
However, the differences in how the frequency bands are defined are relatively small 
(around 1 Hz). So, in this work we will discuss previous findings about a certain frequency 
band, such as Alpha, without regard to the fact that the studies have used 
slightly different definitions of Alpha. We have decided to use the same ranges as 
Rietschel et al. [46], which are presented in Table 1. Now we will present each of these 
frequency bands and their connection to cognitive workload, as shown by other studies that 
are part of the literature study of this work. We also describe the quotient Theta/Alpha. 
 


 
Table 1. EEG frequency bands [46]. 

EEG FREQUENCY BAND    RANGE
DELTA RHYTHM          <3 Hz
THETA RHYTHM          3-8 Hz
ALPHA RHYTHM          8-13 Hz
    LOW-ALPHA         8-10 Hz
    HIGH-ALPHA        10-13 Hz
BETA RHYTHM           13-30 Hz
GAMMA RHYTHM          >30 Hz
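
To make the band definitions concrete, the ranges of Table 1 can be written down as a small lookup. This is only an illustrative sketch: the names `BANDS`, `ALPHA_SUB_BANDS` and `band_of` are hypothetical, and treating each range as half-open (lower edge included, upper edge excluded) is an assumption, since the table does not say to which band a boundary frequency belongs.

```python
# Band edges in Hz, copied from Table 1 (Rietschel et al. [46]).
# Assumption of this sketch: each range is half-open, [low, high).
BANDS = {
    "Delta": (0.0, 3.0),
    "Theta": (3.0, 8.0),
    "Alpha": (8.0, 13.0),
    "Beta": (13.0, 30.0),
    "Gamma": (30.0, float("inf")),
}

# Optional subdivision of the Alpha band, also from Table 1.
ALPHA_SUB_BANDS = {
    "Low-Alpha": (8.0, 10.0),
    "High-Alpha": (10.0, 13.0),
}


def band_of(frequency_hz, bands=BANDS):
    """Return the name of the band that contains frequency_hz."""
    for name, (low, high) in bands.items():
        if low <= frequency_hz < high:
            return name
    raise ValueError(f"no band contains {frequency_hz} Hz")
```

With these definitions, `band_of(10.5)` gives "Alpha" and `band_of(10.5, ALPHA_SUB_BANDS)` gives "High-Alpha".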
 

2.3.3.1 Delta 

The Delta rhythm has a large amplitude and is mostly present during deep sleep. In healthy 
adults it is normally not observed in the awake state, where its presence is instead indicative of 
cerebral damage or brain disease [30]. It has also been shown that Delta rhythms are 
involved in motivational processes such as the necessity to satisfy basic biological 
needs [47]. 

Our literature study has shown that there seems to be no significant correlation between the 
Delta frequency band and cognitive workload. We found two studies that measured Delta in 
tasks of varying difficulty, but neither saw any significant results [46], [48]. 
 

2.3.3.2 Theta 

The Theta rhythm mostly occurs during drowsiness and certain stages of sleep [31], but it 
has also been shown to correlate with a variety of behavioural, cognitive and emotional 
variables. The main domains seem to be memory and emotional regulation, but there are 
also indications that Theta activity occurs when performance of a learned task is increasing 
most rapidly and that it declines as tasks become familiar [47]. Especially at frontal scalp 
sites, Theta activity can be facilitated by emotions, focused concentration and mental 
tasks [49], meaning that it is expected to increase with increasing workload. 

Theta is, together with Alpha (described below), the frequency band that has been shown to 
relate most to cognitive workload [40]. Several studies have shown that Theta can show the 
difference in cognitive workload between different levels [11], [21], [40], [50]. However, our 
literature study has also shown that several studies have failed to show this correlation [14], 
[29], [45], [46], [48] and a few studies have seen statistical significance for Theta between 
some levels, but not between all [24], [44]. This can for example mean that there is a 
difference between the easy condition compared to the medium and hard, but that no 
difference can be seen between the two latter conditions. 
 

2.3.3.3 Alpha 

The Alpha rhythm occurs during wakefulness over the posterior regions of the head and 
normally has a higher amplitude over the occipital areas. It is typically characterized by 



 
rounded or sinusoidal waveforms. The amplitude varies between individuals, and also from 
time to time in a given individual, but is normally below 50 µV in adults. The rhythm is commonly 
blocked or attenuated by attention and mental effort, especially visual attention, and is 
most prominent when the eyes are closed [51]. The amplitude of the Alpha frequency band 
is therefore expected to decrease with increasing workload. 

As mentioned, Alpha and Theta have been shown to indicate cognitive workload [40]. Alpha 
is also the most studied frequency band in the literature that we have studied for this work, 
sometimes split up into the sub-bands Low- and High-Alpha. Several studies have seen a 
significant difference in Alpha between different levels of difficulty [11], [14], [21], [29], [40]. 
One study has, however, failed to show a significant effect [50]. When comparing 
High- and Low-Alpha, the upper frequency range often seems to yield significance [24], 
[44], [46], [48], while the lower range often only shows a difference between some of the 
levels [24], [44].  
 

2.3.3.4 Beta 

Activity recognized as Beta rhythm is mainly found over the frontal and central regions of 
the head and is present in almost every healthy adult. The amplitude does normally not 
exceed 30 µV and it can be blocked by motor activity and tactile stimulation [51]. Beta 
activity normally increases with drowsiness and light sleep and also with mental activation 
[49]. 

In our literature study, there has been little evidence of a correlation between Beta and 
cognitive workload. Most studies that have examined Beta have not been able to show a 
significant effect [14], [29], [46], [48], [50] while one has seen a difference only between 
some of the conditions [24]. 
 

2.3.3.5 Gamma 

The Gamma rhythm consists of high-frequency oscillations and is said to be related to a 
state of active information processing [30]. Induced Gamma activity has been reported 
during sensory, cognitive and motor processing and may be related to sensory binding as 
well as sensorimotor integration [51].  

One of the studies that we have read has seen evidence of a significant difference 
between different levels for the Gamma frequency band [46]. One study has seen effects 
between some of the conditions but not all [24]. However, two studies have failed to 
show a correlation between Gamma and cognitive workload [50], [52]. 
 

2.3.3.6 Theta/Alpha

Besides the frequency bands, the quotient Theta/Alpha is also commonly used when 
assessing cognitive workload. There are several ways of calculating this ratio, often by 
using either frontal or parietal (see Figure 1) electrodes when measuring Alpha and Theta. 
Frontal Theta/parietal Alpha [44] and frontal Theta/frontal Alpha [24] have both been used 
to indicate cognitive workload. Another study performed by Gentili et al. [45] showed that 
the Theta/Alpha ratio could be calculated from electrodes in the same area and still show 
significantly higher values for a higher level of difficulty.  
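
As an illustration of how such a quotient can be formed, the sketch below computes a Theta/Alpha ratio from two signals, for example one frontal and one parietal channel. It is a simplified example under stated assumptions and not the analysis pipeline of this work: band power is taken as the sum of squared DFT magnitudes, the band limits are those of Table 1, and the function names `band_power` and `theta_alpha_ratio` as well as the synthetic test signals are hypothetical.

```python
import cmath
import math


def band_power(signal, fs, low_hz, high_hz):
    """Sum of squared DFT magnitudes for bins with low_hz <= f < high_hz.

    A plain DFT is used to keep the sketch dependency-free; it is slow
    and only suitable for short example signals.
    """
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if low_hz <= freq < high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power += abs(coeff) ** 2
    return power


def theta_alpha_ratio(theta_signal, alpha_signal, fs):
    """Theta power of one signal divided by Alpha power of another."""
    theta = band_power(theta_signal, fs, 3.0, 8.0)    # Theta: 3-8 Hz
    alpha = band_power(alpha_signal, fs, 8.0, 13.0)   # Alpha: 8-13 Hz
    return theta / alpha


# Synthetic two-second example: a 6 Hz (Theta) and a 10 Hz (Alpha) sine.
fs = 64
n = 128
frontal = [2.0 * math.sin(2 * math.pi * 6 * t / fs)
           + 0.5 * math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]
parietal = [1.0 * math.sin(2 * math.pi * 6 * t / fs)
            + 1.0 * math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]

frontal_parietal = theta_alpha_ratio(frontal, parietal, fs)  # frontal Theta / parietal Alpha
frontal_frontal = theta_alpha_ratio(frontal, frontal, fs)    # frontal Theta / frontal Alpha
```

The same function covers both variants mentioned above: passing two different channels gives frontal Theta/parietal Alpha, while passing the same channel twice gives a ratio from electrodes in the same area.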
 


 
2.3.4 Differences between individuals 

As mentioned, cognitive workload is here defined as the cognitive resources demanded by 
a person to perform a certain activity or task. This means that the cognitive workload is not 
only correlated to the difficulty of the task, but also to the abilities of the individual. When 
the task demands are close to exceeding a person’s ability, the workload is high, and the 
limits to boredom and frustration depend on both the task and the individual skill level.  

Apart from this, ERP measurements also vary between individuals. Differences between 
different subjects can reflect biological differences such as skull thickness or cortical folding 
patterns [31]. Other factors that can affect the ERPs when measuring cognitive workload 
are age, lack of sleep, time-of-day, time since the last meal, time of year and geographic 
location (mainly because of difference in daylight), exercise (mainly affects older people), 
and the intake of common drugs such as caffeine, nicotine and alcohol [53].  

  

 
3 Method 
Here we present how we have constructed our method, by describing general theory for the 
different parts and discussing the choices made in other studies. 
  

3.1 Grasping task 

As mentioned in section 1.1, this work is a pilot study in preparation for measuring cognitive 
workload on intact limb subjects performing a grasping task with and without sensory 
feedback, where the latter condition will be achieved by using anesthesia on their hands and 
digits. Further, the conditions of this study are meant to mimic prosthetic hands with and 
without sensory feedback. A schematic illustration of the connection 
between the easy and hard conditions of the different studies can be found in Table 2. 
 

Table 2. A schematic illustration of how the different levels are meant to be represented in our study and the future studies 
with anesthesia and prosthetic hands, respectively.  

             OUR STUDY       STUDY WITH ANESTHESIA    PROSTHETIC STUDY
EASY TASK    Lighter cube    Without anesthesia       With sensory feedback
HARD TASK    Heavier cube    With anesthesia          Without sensory feedback

 
So, the main task to be examined in this work is a grasping task. This is performed by lifting 
a force sensitive cube (described further in section 3.7) back and forth over a barrier as many 
times as possible, without pressing it too hard, i.e. breaking it. If the cube is pressed too 
hard, this is indicated by a red LED bar lighting up. The weight of the cube can be increased 
by adding extra weights to it, in order to make it harder to lift without pressing it too 
hard. In that manner there are two different difficulties for the grasping task: easy and hard. 
These are meant to represent the conditions with and without sensory feedback 
that will be used in the future study with anaesthesia that this work is in preparation for. 
Pictures of the experimental setup and the force sensitive cube can be seen in Figure 6. 
The force sensitive cube and its design process are further described in section 3.7. The 
grasping task is comparable to the modified Box and Blocks test, i.e. the Virtual Eggs Test 
developed by Clemente et al. [54]. 



 
Figure 6. The experimental setup for the grasping task, with a closeup of the force sensitive cube.  
 

a) The setup for the grasping task, with two boards 
separated with a barrier. The cube was to be lifted back 
and forth over the barrier. 

 
b) The force sensitive cube, with force sensors, 
LED bar and weights. The weights could be 
removed to reduce the difficulty of the grasping 
task. 

3.2 Dual-task paradigms and oddball tasks 

Studies using the ERP (event related potential) technique are commonly performed by 
measuring ERPs of a secondary task that is performed simultaneously with a primary task 
of interest. This design is needed when it is not possible to directly assess the workload of 
the primary task, for example if there are no clear stimuli. The subjects are to primarily 
perform the primary task as well as possible and use remaining cognitive resources for the 
second task, doing it as well as possible under the circumstances. The secondary task in 
an ERP study can be for example to see flashes of light while performing a primary task of 
solving equations. Using ERP, the brain potentials related to the stimuli are measured. The 
brain’s responses to the secondary task stimuli are expected to decrease as the 
difficulty of the primary task increases, which shows as a decrease in amplitude of the 
different ERP components presented in section 2.3.2. The ERP technique thereby uses the 
inverse relationship between cognitive workload and attentional reserve, mentioned in 
section 2.1. This use of two simultaneous tasks, a primary and a secondary task, to 
measure the cognitive workload of the primary task is called a dual-task paradigm. In some 
studies, the subjects are required to react to the stimuli in some way, for example by 
pressing a button or by silently counting, while other studies tell the subjects to ignore the 
stimuli.  

A common dual-task paradigm is what is called an oddball task. Here, the stimuli contain 
common non-targets and rare targets, differentiated by for example pitch or colour. The 
common non-targets usually represent 80 % of the stimuli. The ERPs are measured 
around the rare targets, since several ERP components are larger for a stimulus from a rare 
category than for one from a common category. The stimuli are usually either visual, auditory or somatosensory. 

The dual-task paradigm is widely used, but also questioned. One argument is that adding a 
secondary task will affect the performance of the first task, and thereby change the variable 
under investigation [18]. To deal with this problem, it is often recommended to use 
task-irrelevant stimuli, i.e. stimuli that the subject should ignore [25]. However, Castellar et al. 
[55] examined this and could not find evidence that the primary task, in this case a game, 



 
was affected by the secondary task of reacting to target sounds as fast as possible by 
pressing a button.   

When applying a dual-task paradigm, there are many different factors to consider. These 
include deciding if the subjects should ignore or react to the stimuli, what type of stimuli to 
use and the timing of the stimuli. We will continue by discussing these options.  
 

3.2.1 The choice of stimuli 

As mentioned, stimuli can be either visual, auditory or somatosensory. Since the primary 
task of this work (lifting a force sensitive cube) involves using visual and sensory feedback, 
we have chosen to use auditory stimuli for the secondary oddball task, so that the 
secondary task interferes with the primary task as little as possible. 

When using auditory stimuli, an approach that has become common is the novelty oddball 
task, which includes novel, complex sounds (e.g. [18], [19], [39], [43], [45], [55]). This means 
a collection of complex sounds (e.g. a dog barking or a car honking) that are not repeated 
within each subject. When comparing different kinds of auditory stimuli, Dyke et al. [38] 
showed that complex sounds were better for measuring cognitive workload than simple 
sounds (e.g. a tone of a certain frequency). They could, however, not see any difference 
depending on whether the sounds were repeated or not.  

In novelty oddball studies, it is common to use 80 % common, simple sounds (e.g. a low 
pitch tone), 10 % rare, simple sounds (e.g. a high pitch tone) and 10 % novel, complex 
sounds (e.g. a person coughing or a mosquito buzzing) (e.g. [39], [43], [55]). The ERPs are 
usually measured around the novel, complex sounds, since this is a better way to elicit ERP 
components [38]. These sounds are most often made task-irrelevant, either by having the 
subjects react to the rare, simple sounds by pressing a button or counting them (e.g. [39], 
[55]), or by asking the subjects to ignore all sounds and focus only on the primary task (e.g. 
[14], [18], [19], [45]). This means that the novel, complex sounds are used for the ERP 
measurement but are not relevant for any of the tasks. However, according to a study by 
Debener et al. [43], task irrelevance is not necessary when applying the novelty oddball 
task. They also found that the novelty P3 was actually larger for task-relevant sounds. That 
is to say, it was more effective to let the subjects count the novel sounds, which were also 
used for the ERPs, than to let them count the rare sounds and leave the novel sounds 
task-irrelevant. This is evidence against the common view, but all other studies that we have 
looked at, both before and after Debener’s discovery, still use task-irrelevant stimuli when 
measuring ERPs. 

For this study, we apply the novelty oddball task using 80 % common, simple sounds, 10 % 
rare, simple sounds and 10 % novel, complex sounds, as described above. Henceforth, 
these sounds will be referred to as common, rare and novel, respectively. We chose to 
use 500 Hz tones as common sounds and 1500 Hz tones as rare sounds, since these 
represented the broadest range of frequencies that was deemed comfortable 
for the subjects to listen to. The novel, complex sounds were randomly chosen from 93 
different audio clips and were only played once during each condition. The ERPs were 
measured using the novel sounds, as recommended above [38], and the subjects were 
asked to count the rare sounds. This choice goes against what was shown by 
Debener et al. [43], since we decided to rather use the common approach of using 
task-irrelevant stimuli to measure ERPs. This will make it easier to compare the results of this 
study to others. 
   


 
3.2.2 Secondary task: Counting, reacting or ignoring? 

If the subjects are instructed to count sounds, this can also be used as an indication of 
cognitive workload. Since a harder task should decrease the attention available for 
counting, more errors should be made with a harder task than an easy one. However, Luck 
[31] raises an issue with this method. Since errors can arise from missing a target, from 
mistaking a non-target for a target or from losing count, it is impossible to tell if a correct 
number means that no error was made. For example, a combination of the first two errors 
can result in a correct number of counted targets. For this reason, the alternative of 
pressing a button in reaction to a target can be superior, since that allows both misses and 
false presses to be considered. A problem with this approach is that it requires a movement 
that will result in artifacts. 
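
The issue raised by Luck can be shown with a small worked example. The numbers below are made up for illustration, and the function names `reported_count` and `count_error` are hypothetical; the sketch also assumes that the subject never loses count.

```python
def reported_count(true_targets, misses, false_alarms):
    """Number a subject would report, assuming they never lose count."""
    return true_targets - misses + false_alarms


def count_error(true_targets, reported):
    """Deviation between the reported and the true number of targets."""
    return abs(reported - true_targets)


# Hypothetical block with 24 rare sounds: two are missed and two common
# sounds are mistaken for rare ones.
reported = reported_count(true_targets=24, misses=2, false_alarms=2)
error = count_error(24, reported)
```

Here the reported number equals the true number of targets, so the count error is zero even though four individual errors were made. The count error is therefore only a lower bound on the number of mistakes.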

We decided to ask our subjects to count the rare, simple sounds of the oddball task. This 
choice was made to avoid subjects getting bored of the task, since the primary task is very 
repetitive. This is because boredom might affect the willingness to focus one’s attention on a 
task, as discussed in section 2.1.  
   

3.2.3 Stimuli timing  

If the interstimulus interval, i.e. the time between two stimuli, is too short there is a risk that 
the different epochs will overlap, which means that potentials resulting from one sound 
might affect the next one. On the one hand, the interval should be as short as possible to 
maximize the number of epochs to draw data from. On the other hand, the ERP 
components are larger the longer the interstimulus interval is, and if the stimuli are played 
too often that might be tiring for the subject. Also, if the interval is too long, so-called 
stimulus-preceding negativity can occur, which means that the subject is anticipating a 
sound. This can be confusing when analysing the results. Luck recommends an interstimulus 
interval of around 1000 ms with a temporal jitter of at least ±100 ms, since varying the 
interval also prevents stimulus-preceding negativity. This means that the interstimulus 
interval could vary randomly between 900 and 1100 ms, according to Luck. Also, since the 
epochs around each novel sound will later be averaged together (for more details, see 
section 3.4.6), varying the interstimulus interval also helps to prevent regular noise, such as 
alpha waves, from showing in the averaged ERP waveforms. When it comes to the duration of 
stimuli, Luck recommends 50-100 ms for simple sounds and 300-400 ms for novel sounds, with 
5-20 ms rise and fall time [31]. 

It seems like most of the earlier studies that have applied the novelty oddball task (e.g. [13],  
[16]–[18], [23], [37], [42], [44], [54]) have, however, used a longer time for the duration of 
the simple sounds. They have all used the same sounds, originally from [56], and to 
facilitate comparing our result to theirs we have decided to use the same source for our 
sounds. 

Therefore, we have played the sounds in random order with an interstimulus interval 
varying between 960 and 1360 ms, as this follows Luck’s recommendation and has been used by for 
example Debener [43] and Castellar Núñez [55]. The novel sounds are from the work of 
Fabiani et al. [56] and their duration is between 159 and 399 ms (mean 335.43 ms). The 
pure tones from the same source were 336 ms long, and as mentioned we chose to use 
500 Hz for the frequent sounds and 1500 Hz for the rare. Rise and fall times are 10 ms for 
the pure tones, but vary for the novel sounds, depending on their properties.  
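
The stimulus design described in this section can be summarised in a short sketch. It is not the algorithm used in the experiment, only a minimal illustration built from the stated parameters (80 % common, 10 % rare and 10 % novel sounds, 93 novel clips that are not repeated within a condition, and an interstimulus interval between 960 and 1360 ms). The function name `make_stimulus_sequence`, the clip labels and the choice of a uniform distribution for the interval are assumptions.

```python
import random


def make_stimulus_sequence(n_stimuli=600, n_novel_clips=93, seed=None):
    """Return a list of (label, interstimulus_interval_ms) tuples.

    Assumptions of this sketch: the interval is drawn uniformly from
    960-1360 ms, and novel clips are identified by an index 0..92.
    """
    rng = random.Random(seed)
    n_rare = n_stimuli // 10           # 10 % rare, simple sounds (1500 Hz)
    n_novel = n_stimuli // 10          # 10 % novel, complex sounds
    n_common = n_stimuli - n_rare - n_novel  # remaining 80 % (500 Hz)

    # Draw novel clips without replacement so none repeats in a condition.
    novel_clips = rng.sample(range(n_novel_clips), n_novel)

    labels = (["common"] * n_common
              + ["rare"] * n_rare
              + [f"novel_{clip:02d}" for clip in novel_clips])
    rng.shuffle(labels)

    return [(label, rng.uniform(960.0, 1360.0)) for label in labels]


sequence = make_stimulus_sequence(n_stimuli=600, seed=1)
```

Drawing the novel clips with `sample` guarantees that no clip is played twice within one condition, and it fails with an error if more novel sounds are requested than there are clips available.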
 


 
3.3 Experimental procedure 

This thesis mainly resulted in a developed method to measure cognitive workload, that 
consists of the different parts described above: a dual-task paradigm consisting of an 
oddball task and a grasping task, where the cognitive workload is evaluated through EEG 
measurements and the self-assessment questionnaire NASA-RTLX. To test if the method 
could be used to measure cognitive workload, we performed a pilot study including 10 
subjects. 
 

3.3.1 Participants 

A good average ERP waveform can be obtained either by using long trials or many trials. 
However, the long preparation time for each subject (about one hour with the subjects for 
our experiment) makes it unrealistic to examine many subjects. Normally each study uses 
about 10-20 subjects for ERP measurements [31]. We have measured ERPs for 10 
subjects.  

The 10 participants (six females and four males) were students at Chalmers University of 
Technology aged 24 to 28, with mean age 25.5 and standard deviation 1.43. All had 
normal or corrected-to-normal vision and hearing. The subjects’ handedness was evaluated 
through the Waterloo Handedness Questionnaire. This is made up of a series of questions 
about which hand one would use for performing certain tasks. The options were left always, left 
usually, both equally often, right usually and right always. The score is then computed by 
assigning the options the values -2, -1, 0, 1, and 2, respectively, and summing them. The score 
ranges from -72 to +72, where the extremes indicate a strong preference for either the left (-72) or the 
right (+72) hand. According to this, all subjects were right-handed, with scores in 
the range 41 to 56, with mean score 49.2 and standard deviation 4.38. The subjects also 
read and signed an informed consent form, which can be found in Appendix C, before the 
experiment. 
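
The scoring rule can be written out as follows. This is a minimal sketch with hypothetical names (`RESPONSE_VALUES`, `handedness_score`); the number of items, 36, is an assumption inferred from the stated score range of ±72 and the maximum value of 2 per item.

```python
# Values assigned to the five response options, as described above.
RESPONSE_VALUES = {
    "left always": -2,
    "left usually": -1,
    "both equally often": 0,
    "right usually": 1,
    "right always": 2,
}


def handedness_score(responses):
    """Sum the values of all responses; positive means right-handed."""
    return sum(RESPONSE_VALUES[response] for response in responses)


# Assumed questionnaire length: 36 items, implied by the +/-72 range.
N_ITEMS = 36
strongly_right = handedness_score(["right always"] * N_ITEMS)
mixed = handedness_score(["right always"] * 20 + ["right usually"] * 10
                         + ["both equally often"] * 6)
```

A subject answering "right always" to every item reaches the maximum of 72, while the mixed example above gives 50, which lies within the range of scores observed in our group.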

To measure EEG we used an EEG system environment from g.Tec Medical Engineering, 
including g.HIamp multi-channel biosignal amplifier for 144 channels, g.GAMMA EEG cap 
with 128 g.SCARABEO active Ag-AgCl electrodes and g.TRIGbox trigger pulse box. The 
software used to collect the EEG data was g.RECORDER, a biosignal recording system 
from g.Tec. EEG was recorded at 2400 Hz from 128 electrodes according to the extended 
10-20 system, which is described in section 2.2.2 and can be seen in Figure 2. Included in 
these 128 electrodes are four eye electrodes (EOG), two earlobe electrodes (one on each earlobe) 
and 122 scalp electrodes. As ground electrode we used the AFz electrode (between Fp and 
F in Figure 2). No online reference was used, instead the data was referenced offline. For a 
picture of a subject fitted with the cap and electrodes, see Figure 7. 

The trigger pulse box was used to time-lock the audio stimuli from the oddball task 
described above in section 3.2, which was played for the subjects through in-ear 
headphones. Headphones were used so that the subjects would hear the sounds equally in both ears; 
with speakers there is a risk that the sound is 
heard differently in the two ears. For localizing and digitizing the exact individual position of the 
electrodes in 3D for all subjects we used the Polaris Krios system from Northern Digital Inc.  
 


 
Figure 7. Person fitted with EEG cap with 128 electrodes. You can also see the EOG electrodes around the eyes. 
 

3.3.2 Tasks 

The developed method consists of three different conditions: no task, easy task and hard 
task, using combinations of the grasping task and the oddball task, described in section 3.1 
and 3.2, respectively. No task means that the subject performs only the auditory oddball 
task while focusing their gaze at a plus sign on a computer screen. For the easy and hard 
task conditions the subject was to perform the grasping task at the same time as the 
auditory oddball task. The easy and hard conditions represent a dual-task paradigm, 
described in section 3.2, where the grasping task is the primary task and the oddball task is 
the secondary task. The easy and hard conditions are, as mentioned in section 3.1, meant 
to replicate the conditions of with and without sensory feedback that will be used in the 
future study with anaesthesia. The no task condition is to add another level of cognitive 
workload that can be used as a baseline to examine if we can measure the differences in 
cognitive workload between different levels.  
 

3.3.3 Self-assessment using NASA-RTLX 

To get an indication of how much workload the subjects themselves thought they put into 
each task, we used a self-assessment questionnaire, or task load index, developed by the 
National Aeronautics and Space Administration called NASA-TLX [57]. The NASA-TLX is 
commonly used to assess perceived effort (e.g. [14], [21], [39], [44], [45]). The 
questionnaire consists of six subscales which represent the variables: mental, physical, and 
temporal demands, frustration, effort, and performance. Each subscale is a twenty-step 
scale from 0 to 100 and the subjects were asked to put a cross on the step of each 
subscale that best represented their effort on each task. For each task, the values for all 
subscales were added together and divided by six to get the averaged NASA Raw Task 
Load Index (NASA-RTLX). This index is more commonly used in many studies because it is 
simpler to apply compared to the NASA-TLX which also includes an additional weighting 
process to weight the different subscales against each other [20].  
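
The computation of the raw index is simple enough to state as code. The sketch below follows the description above (six subscales, each rated from 0 to 100, summed and divided by six); the name `nasa_rtlx`, the subscale keys and the example ratings are hypothetical.

```python
SUBSCALES = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "frustration",
    "effort",
    "performance",
)


def nasa_rtlx(ratings):
    """Unweighted mean of the six subscale ratings (each 0-100)."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Made-up ratings for one subject and one condition.
example_ratings = {
    "mental demand": 60,
    "physical demand": 40,
    "temporal demand": 30,
    "frustration": 20,
    "effort": 50,
    "performance": 10,
}
example_index = nasa_rtlx(example_ratings)
```

For the example ratings the sum is 210, which gives an index of 35. The full NASA-TLX would instead multiply each subscale by a weight obtained from pairwise comparisons before averaging.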

We used the NASA-RTLX for each task and subject to get an indication of whether the 
perceived workload differed between the tasks. This will be used as an indicator to see if 



 
the subjects experienced the expected difference between the different conditions. Subjects 
were also told to mark the different blocks in each condition with 1, 2, or 3 if they 
experienced a difference in effort between them, or with an X if they 
estimated the same value for all blocks. In the end, most of the subjects did not experience 
a difference between the blocks, so only conditions were examined. For subjects who 
marked a difference, we have used the mean value.  
 

3.3.4 Procedure 

Before the experiment, the subject was fitted with the EEG head cap and a connection was 
made between each electrode and the scalp using a conductive gel. Then the EOG 
electrodes were attached around the eyes using adhesive labels, and the reference 
electrodes, used for offline referencing, were clipped to the earlobes. The impedance of 
the connections was kept below 50 kΩ and was also checked regularly between the 
measurements. Lastly the electrode positions were scanned. The participant also filled out 
the informed consent form, a photo agreement and the Waterloo Handedness Questionnaire. 
They were asked to use their dominant hand for the grasping task.  

During the experiment the subject was seated in a chair with an adjustable table in front of 
them. They were given in-ear headphones through which the sounds for the auditory oddball task 
were played. Before the experiment started the subjects were informed about the tasks they 
were going to perform and were given the opportunity to ask questions about the procedure. We 
emphasised that the cube should be lifted as many times as possible without breaking it, 
and that the grasping task was the main task. The subjects also got to listen to one sound 
(or more if requested) of each type: frequent, rare, novel and start/stop sound, so that they 
knew what to listen for. The start/stop sound, consisting of three consecutive tones, was 
used to notify the participant when to start and when to stop the task. We also 
adjusted the audio volume to fit the subject’s preference. This could give rise to some differences 
between the individual results, since the intensity of a sound affects the amplitude of the 
reaction, i.e. the ERP components [31]. However, keeping the volume constant would have 
meant that the subjects would experience different volumes because of differences in 
hearing, which would also give rise to individual differences. If the sounds had been hard 
to hear or painfully loud, this would have contributed to exhausting the subjects faster. 
Therefore, the subjects got to adjust the volume to the level they found most comfortable.  

The subject was also instructed not to blink excessively, frown, clench their jaws or 
keep unnecessary tension in any other muscles. This is to avoid artifacts and will be 
discussed further in section 3.4.5.   

The time needed for each condition depends on how many epochs are needed to get a 
satisfactory ERP waveform. This in turn depends on what you are looking for in the data 
and how much noise there is, but Luck [31] recommends 10-50 epochs for larger 
components, such as P3, and 100-500 for smaller ones, such as P1. This is because more 
measurements increase the signal-to-noise ratio and thereby make it possible to study 
smaller components. Since stimulus duration, interstimulus interval and the percentage of 
novel sounds are already set (see section 3.2.3), the time for each condition depends on 
the number of epochs we choose to measure. We have chosen to use an algorithm that 
plays 600-720 stimuli in total. With novel sounds being 10 % of the sounds, that gives us  



 
60-72 novel sounds per condition. That way we also have some margin if some of the 
epochs need to be rejected because of blinks or other artifacts, at least for large 
components. To keep the novelty of the novel sounds they were not repeated during a 
condition, which means that each was played a maximum of three times for each subject, 
with at least five minutes between the repetitions. Some of the sounds were not repeated at 
all. The algorithm randomizes the order of the played sounds as well as the interstimulus 
time, such that frequent, rare and novel sounds are mixed and played with different 
interstimulus times between each other. 

Maximizing the number of epochs needs to be weighed against making the blocks too long. Long 
blocks will exhaust the subjects and might affect the number of subjects willing to participate. But more 
importantly, they will affect the subject’s ability to stay focused on the task, and longer times 
might therefore do more harm than good. 

It is also important to insert enough time for rest between the measurements. This helps to 
keep the subjects alert and focused on the task. It can also reduce blinking and muscle 
artifacts during the measurement, since the breaks give the subject time to blink and 
stretch. For this reason, each condition was divided into three blocks of about four minutes 
with at least a one-minute break between them. Between conditions there was also a 
break of about five minutes, or more depending on what the subject wanted. By letting 
subjects perform the same task for three hours, or until they were exhausted, Trejo et al. 
[58] have shown that fatigue will affect the measurement by increasing the amplitude of 
both the alpha and theta frequency bands and the P2 component. However, the same 
study showed that N1 and P3 were not significantly affected by the time on the task.  
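
The relation between the number of stimuli, the number of novel epochs and the block length can be checked with a short calculation. The sketch below is an estimate only: it assumes that the interstimulus interval is measured from onset to onset and is on average the midpoint of 960 and 1360 ms, and the function name `condition_plan` is hypothetical.

```python
def condition_plan(n_stimuli, isi_min_ms=960.0, isi_max_ms=1360.0,
                   novel_percent=10, n_blocks=3):
    """Estimate novel epochs and duration for one condition.

    Assumption: intervals are onset-to-onset with a mean equal to the
    midpoint of the jitter range.
    """
    mean_isi_s = (isi_min_ms + isi_max_ms) / 2 / 1000
    total_minutes = n_stimuli * mean_isi_s / 60
    return {
        "novel_epochs": n_stimuli * novel_percent // 100,
        "total_minutes": total_minutes,
        "minutes_per_block": total_minutes / n_blocks,
    }


shortest = condition_plan(600)
longest = condition_plan(720)
```

Under these assumptions, 600 stimuli give 60 novel epochs in roughly 11.6 minutes and 720 stimuli give 72 novel epochs in roughly 13.9 minutes, which is consistent with three blocks of about four minutes per condition.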

This means that each subject completed nine blocks in total, three for each condition. At the
start and end of each block the special start/stop sound was played. After each block the
subjects reported the number of rare sounds they had counted. During the longer break
after each condition the subjects also filled in the NASA-RTLX questionnaire. All subjects
started with the no task condition, and moved on to easy task and hard task, in that order.
With such a fixed order, the level of arousal will tend to vary between the conditions. To
avoid this, Luck [31] recommends varying conditions unpredictably within each trial block.
However, in the future study that this work is in preparation for, it will not be possible to
switch back and forth between the conditions, since the conditions in that case will be with
and without anaesthesia, respectively. It would be possible to use a random order between
the different conditions, for example by inviting the subjects for two separate days, but we
decided against this since the same order makes it easier to compare the different subjects’
learning processes.
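
The session structure described above can be written out as a simple schedule. The sketch below is only an illustration: the durations are the nominal values from the text (the block and long-break lengths were approximate and the short break was a lower bound), and all names in the code are hypothetical.

```python
# Fixed condition order used for all subjects.
CONDITIONS = ["no task", "easy task", "hard task"]
BLOCKS_PER_CONDITION = 3

# Nominal durations in minutes (assumed exact here for illustration).
BLOCK_MIN = 4.0        # "about four minutes"
SHORT_BREAK_MIN = 1.0  # "at least one minute" between blocks
LONG_BREAK_MIN = 5.0   # "about five minutes" between conditions


def build_schedule():
    """Return the session as a list of (label, minutes) tuples."""
    schedule = []
    for c_idx, condition in enumerate(CONDITIONS):
        for block in range(1, BLOCKS_PER_CONDITION + 1):
            schedule.append((f"{condition}, block {block}", BLOCK_MIN))
            if block < BLOCKS_PER_CONDITION:
                schedule.append(("short break", SHORT_BREAK_MIN))
        # The questionnaire is filled in after every condition, but a timed
        # long break is only needed when another condition follows.
        if c_idx < len(CONDITIONS) - 1:
            schedule.append(("long break", LONG_BREAK_MIN))
    return schedule


schedule = build_schedule()
n_blocks = sum(1 for label, _ in schedule if "block" in label)
total_minutes = sum(minutes for _, minutes in schedule)
```

With these nominal values the recording part of a session is at least about 52 minutes, not counting preparation or any extra rest the subject asks for.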

When the subjects performed the grasping task, the number of times they lifted the cube
over the barrier was counted in order to get an indication of how well they accomplished the
task during the different levels of difficulty. This count was then divided by the block
duration to compute the number of lifts per minute, taking into account the fact that the
total duration of the blocks varied slightly. Since the task was to lift the cube without
breaking it, we also counted the number of times they broke the cube (pressed it so hard
that it lit up). A success rate was then computed by subtracting the number of times the
cube was broken from the total number of lifts. These measurements of performance will,
together with the NASA-RTLX, be used to verify the differences between the different
conditions. They will also be examined for each block and compared to look for learning
effects. It is expected that if learning takes place, performance will increase between the
blocks.
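
As a minimal sketch of the two performance measures described above, the following code computes them for one block. The class, field and function names are hypothetical, and the numbers are made up for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class BlockResult:
    """Counts observed in one block (hypothetical field names)."""
    lifts: int         # times the cube was lifted over the barrier
    breaks: int        # times the cube was pressed too hard and lit up
    duration_s: float  # actual duration of the block in seconds


def lifts_per_minute(block: BlockResult) -> float:
    """Number of lifts normalised by the actual block duration."""
    if block.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return block.lifts / (block.duration_s / 60.0)


def successful_lifts(block: BlockResult) -> int:
    """Total lifts minus the lifts in which the cube was broken."""
    return block.lifts - block.breaks


# Made-up example: three blocks of one condition for one subject.
blocks = [
    BlockResult(lifts=40, breaks=6, duration_s=240.0),
    BlockResult(lifts=44, breaks=4, duration_s=250.0),
    BlockResult(lifts=47, breaks=3, duration_s=235.0),
]
rates = [lifts_per_minute(b) for b in blocks]
successes = [successful_lifts(b) for b in blocks]
```

Comparing `rates` and `successes` from the first to the last block of a condition gives a simple indication of a possible learning effect.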

The experimental procedure for this work can be found in Appendix A, with more information
about preparation, execution and the work needed after each experiment. At the end of the
experiment the subjects also participated in another study. However, the procedures and
results of that study are not discussed in this work, and since it was performed at the end
it should not affect the results of this work.

 
3.4 Signal processing 

Before analysing the EEG data, it needs to be processed to increase the signal-to-noise ratio
and obtain clean averaged curves to measure ERP components and frequency bands. This
section describes common steps for signal processing and motivates our choices for this
work.

The signal processing has been done using EEGLAB [59], which is a freely available 
MATLAB toolbox, and the plugin ERPLAB [60]. These are specifically designed to analyse 
EEG and ERP data.   

3.4.1 Offline referencing 

Even if the EEG equipment uses a reference site during the measurements, as discussed
in section 2.2.1, this site needs to be specified and sometimes changed offline before
analysing the data. This is called offline referencing or, if the reference site is changed,
re-referencing. Since there are no sites on the head or the body that are electrically neutral
in terms of neural activity, there is no perfect reference site. This means that the ERP
measured at an active electrode reflects the EEG at both the active electrode site and the
reference site. Therefore, it is important to choose the reference site with caution, so that
it does not cancel out important information in the data. For example, a reference site near
the site of interest is not a good choice. Reference sites that pick up a lot of noise should
also be avoided, so as not to add extra noise to the data. Which reference site is best
depends on the application. [31]
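
The arithmetic behind re-referencing can be sketched as follows: the new reference signal is formed, here as the average of one or more chosen channels, and subtracted sample by sample from every channel. This is only an illustration of the principle and not the EEGLAB implementation; the function name, the channel labels and the sample values are assumptions made for the example.

```python
def rereference(data, ref_channels):
    """Re-reference EEG data to the average of the given channels.

    data: dict mapping channel name -> list of samples (equal lengths).
    ref_channels: names of the channels that form the new reference.
    Returns a new dict; the input is left unchanged.
    """
    if not ref_channels:
        raise ValueError("at least one reference channel is required")
    n_samples = len(next(iter(data.values())))
    # New reference signal: mean of the reference channels at each sample.
    reference = [
        sum(data[ch][t] for ch in ref_channels) / len(ref_channels)
        for t in range(n_samples)
    ]
    return {
        ch: [x - r for x, r in zip(samples, reference)]
        for ch, samples in data.items()
    }


# Made-up data: one channel of interest and two reference channels.
raw = {
    "Pz": [4.0, 6.0],
    "M1": [1.0, 2.0],
    "M2": [3.0, 2.0],
}
rereferenced = rereference(raw, ["M1", "M2"])
```

Because the same reference signal is subtracted from every channel, any activity picked up at the reference site appears, with inverted sign, in all re-referenced channels. This is why a noisy reference site, or one close to the site of interest, is a poor choice.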

Common reference sites used are one or both of the earlobes (e.g. [17]–[19], [21], [37], 
[39], [44], [46], [61]) or one or both of the mastoids (the bones directly behind the ears, e.g.  
[36], [40], [41], [55], [