The Gothenburg riots in virtual reality
A reconstruction of a historical event
Master's thesis in Computer Science and Engineering

KALLE ARESCHOUG AND ROMINA ASADI

Department of Computer Science and Engineering
CHALMERS UNIVERSITY OF TECHNOLOGY
UNIVERSITY OF GOTHENBURG
Gothenburg, Sweden 2023

© KALLE ARESCHOUG AND ROMINA ASADI, 2023.

Supervisor: Thommy Eriksson, Computer Science and Engineering
Examiner: Mikael Wiberg, Computer Science and Engineering

Master's Thesis 2023
Department of Computer Science and Engineering
Chalmers University of Technology and University of Gothenburg
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Cover: Excerpt of Vasaplatsen from Gothenburg's digital twin imported to Unity.

Typeset in LaTeX
Gothenburg, Sweden 2023

Abstract

The shooting at Vasaplatsen during the Gothenburg riots in 2001 has become known for several reasons, the main one being the misrepresentations in the prosecutor's evidence that was used against the protesters. This master's thesis aims to create a proof of concept in the form of a reconstruction of the Vasaplatsen shooting and allow users to experience it in virtual reality (VR). VR today allows us to participate in unique events without physically being there. It also makes it possible to experience what has already happened.
The thesis has included several steps in determining which parts of the shooting can be reconstructed to make it resemble the real event, and whether this contributes to more knowledge about the shooting. We have also looked at what is important when reconstructing a historical event and whether this type of reconstruction could be used in crime investigations. The final VR reconstruction is 2 minutes and 10 seconds long, showing the shootings at Vasaplatsen. It is based on an existing reconstruction by the Swedish director Göran du Rées, who created a documentary using collected images and videos from the event. In our VR reconstruction, users can relive the moments before and during the shooting of Hannes Westberg by the police, as if they were there. The thesis has focused on presence and immersion, and on understanding important aspects of recreating a real event in VR. The report discusses these based on the literature and our experiences developing the VR experience. Furthermore, to create a basis for future work, user tests were carried out at the end of the project to gather suggestions for improvement.

Keywords: Virtual reality, Reconstruction, Historical event, User experience, Interaction design, Unity, Gothenburg riots

Acknowledgements

We want to thank our supervisor Thommy Eriksson, who contributed to the project idea and helped through all phases of the work. Furthermore, we want to show our gratitude to the AR & VR company OutHere for providing valuable advice and tips about technical solutions and how we could make the work more efficient. Lastly, we also want to thank our opponents Rebecka Hanson and Linnéa Adielsson for giving feedback and suggestions for improvement at the end of the project. This was greatly appreciated.

Kalle Areschoug & Romina Asadi, Gothenburg, May 2023

Glossary

1. Reconstruction - To rebuild an artefact or events.
2. Göran du Rées - Swedish director and lecturer whose video reconstruction we based our work on.
3.
Forensic visualisation - Animations and simulations created to answer critical questions, for example in crime investigations.
4. Virtual reality - A computer-generated simulation created to immerse users in a virtual 3D world.
5. Fidelity - How faithful the virtual experience is to the original event or object.
6. Immersion - How well reality is shut out, with the help of VR software and hardware, while in the experience.
7. Presence - Being in a place that one knows is virtual and fictitious, but reacting to what is seen and heard as if it were real.
8. Locomotion - How one moves in VR.
9. Marker-based motion tracking - Tracking human motion through sensors placed on the body.
10. Markerless motion tracking - Tracking human motion with computer vision techniques.
11. Cultural heritage preservation - The act of keeping and conserving cultural manifestations.
12. Digital twin - 3D model of a physical object, for example a city.
13. Mixamo - Online library with premade 3D characters and animations.
14. High poly - Refers to the number of polygons in a 3D model; the higher the number, the denser the mesh.
15. Prefab - Base game object in Unity that one can create instances of.
16. 3D-sound - Sound sources placed at different locations in the VR world.

Contents

1 Introduction
1.1 Purpose and research questions
1.2 Delimitations
1.3 Related work
1.3.1 Dramatisation
1.3.2 3D-modelling and animation
1.3.3 Virtual reality reconstructions
1.3.3.1 Nefertari: Journey to Eternity
1.3.3.2 The Night Cafe: A VR Tribute to Vincent van Gogh
1.3.3.3 Anne Frank House VR
2 Background
2.1 Events during the riots
2.1.1 The barricading of the Hvitfeldtska gymnasium
2.1.2 The demonstrations at Götaplatsen
2.1.3 The march towards The Swedish Exhibition & Congress Center
2.1.4 The demonstrations at Järntorget
2.1.5 The storming of Schillerska gymnasium
3 Theory
3.1 Virtual reality
3.1.1 Fidelity
3.1.2 Immersion
3.1.3 Presence
3.1.4 Movement in virtual reality
3.1.5 Selection in virtual reality
3.1.6 User interface in virtual reality
3.1.6.1 Placement
3.2 Motion tracking
3.2.1 Marker-based tracking
3.2.2 Markerless tracking
3.3 Managing uncertainties
3.4 Cultural heritage preservation
3.5 Copyright
3.6 Images as credible evidence
3.7 Theoretical framework
3.7.1 Content-Oriented Model of User Experience
3.7.2 Reversal theory and designing for a rich experience
3.7.3 Protective frame
4 Methodology
4.1 Planning
4.2 Resource gathering
4.3 Hardware and software
4.3.1 Unity
4.3.2 Blender
4.3.3 Oculus Quest 2
4.4 Modelling
4.4.1 Chronological timeline
4.4.2 Storyboard
4.4.3 Digital twin
4.5 User testing
4.5.1 Think-aloud
4.5.2 Interviews
4.5.3 Affinity diagramming
5 Execution and Process
5.1 Planning
5.2 Prestudy
5.3 Data gathering and analysis
5.4 Motion tracking of key persons
5.5 Learning Unity
5.6 Create the 3D model
5.7 Prepare for virtual reality
5.8 Creating the virtual reality experience
5.8.1 Unity timeline
5.8.2 Visualising uncertainties
5.8.3 Sound and Light
5.8.4 Interface
5.8.5 Optimise performance
5.9 Visit to Hagabion
5.10 User testing and compilation
5.10.1 Execution of user tests
5.10.2 Compilation of user tests
6 Results
6.1 The reconstruction
6.2 Manipulating time
6.3 Results from the user tests
6.3.1 Other use cases
6.3.2 Improvements for visualising uncertainties
6.3.3 Improve presence and immersion
6.3.4 Information during the experience
6.3.5 Audio in VR experiences
7 Discussion
7.1 Motion tracking
7.2 Creating an immersive experience
7.2.1 Adding a crowd
7.2.2 Rich experience and protective frame
7.2.2.1 The scandalous
7.2.2.2 Detachment frame
7.3 Generalisability
7.4 Ethical and societal aspects
7.4.1 Anonymisation
7.4.2 Bias and interpretation
7.4.3 Long-term consequences
7.4.4 Preserving cultural heritage
7.4.5 Intended target group
7.5 Future work
7.5.1 Increasing detail level
7.5.2 Visualising uncertainties
7.5.3 Providing more information
7.5.4 Implementing perspectives
7.5.5 Guidelines for reconstructions in VR
7.5.6 Preserve and learn from history
8 Conclusion
A Appendix A
B Appendix B
C Appendix C
D Appendix D
E Appendix E
F Appendix F

1 Introduction

The "Gothenburg riots", or in Swedish "Göteborgskravallerna", took place on 14-16 June 2001 during an EU summit [1]. The riots began as peaceful demonstrations in different areas of Gothenburg, but in some cases, for example at Vasaplatsen, they escalated into violence between the protesters and the police. This resulted in the police firing at the demonstrators at Vasaplatsen [2, 3, 1, 4]. One person who was injured was Hannes Westberg, who was later convicted of violent rioting [2].

In retrospect, it has been shown that numerous factors contributed to the chaos during the event [4]. The demonstrators reacted strongly to the barricading of the Hvitfeldtska gymnasium, which will be discussed further on. It has also been revealed, among other things, that the police's internal communication did not work, and that they had to rely on the media's reporting [4].

During the many court cases that followed, one of the pieces of evidence used was a reconstruction of the riots created by the prosecutor [2]. It was assembled from clips filmed during the event by the public and reporters. In this reconstruction, the shooting of Hannes was one of the things that were misrepresented, which received a lot of attention.
The prosecutor used the reported media material and created a biased image of the event in favour of the police. Despite this, the reconstruction, together with other material from the police, was used during several trials [2].

Apart from being interesting from a crime perspective, the Vasaplatsen shooting is also fascinating from a historical and cultural point of view. There are numerous reasons for this, for example that it is an event still talked about and that it coincided with the first visit of a sitting American president to Sweden [5]. In addition, it was the first time since the shootings at Ådalen in 1931 that the police fired at demonstrators in Sweden [6].

Du Rées [2] explains that the Gothenburg riots were one of the most filmed events at the time. Still, due to the lack of long uninterrupted single-take sequences and the numerous media sources (video recordings in standard definition (SD) resolution), a credible retelling has proved difficult. Both du Rées [7] and Granström [4] agree that these were the main factors behind the misrepresentations in the prosecution's clip. They made it difficult to determine whether the sequences were in chronological order and whether they were from the same event and time. As a result, questions have been raised regarding the credibility of the evidence and how it could have affected the verdicts [2]. This is in line with Chisum et al. [8], who argue that creating reconstructions is a difficult task: it is easy to lay out the different events sequentially but much harder to start reading between the lines and understand the whole picture [8].

As a result, Göran du Rées created a new reconstruction of the event. Du Rées analysed the prosecution's film and created a more authentic version in his documentary "Skotten på Vasaplatsen" [7, 2]. Du Rées uses video clips from six different reliable media sources.
These clips were synchronised and documented the event from various locations throughout its course [7, 2]. Among other things, it was noticed that in reality eight shots were heard, while in the prosecutor's clip there were only six. It was also noticed that the sound of specific media sequences in the prosecutor's clip had been manipulated to make the event seem more dramatic [7].

Reconstruction is one of the steps in crime investigation [9, 8, 10]. It is a common practice for gaining knowledge and evidence about what happened during a crime, how and when it happened, and who was involved [9, 8, 10]. In the examples above, film clips have been used, but according to Flor et al. and Ma et al. [10, 11], it can also be done in other ways. For example, one can create a 3D model of the crime scene in modelling software and then make it possible to experience it in virtual reality (VR) [10, 11]. By building the crime scene in VR, the user can navigate the virtual world and observe the crime scene from different viewpoints and perspectives [11].

1.1 Purpose and research questions

Du Rées [2] mentions that uncertainties remain surrounding the incident and the legal processes that followed. Therefore, to contribute to a more objective and immersive representation of the event, this thesis aims to apply interaction design by creating a new reconstruction of the event and combining it with VR technology. This will let users explore the events themselves and help them form their own understanding of what happened that day, which in turn goes hand in hand with the historical and cultural perspective. To accomplish this, the following research questions will be addressed:

1. RQ1: Which aspects of the Vasaplatsen shootings in 2001 can be reconstructed using VR technology?
(a) Sub RQ1: What is the most efficient and reliable way of implementing motion tracking of people in VR from a real-life event?
(b) Sub RQ2: Which factors are relevant to consider when reconstructing a real event in VR?
(c) Sub RQ3: What aspects are important to create high immersion and presence in virtual reality?
2. RQ2: Can a VR reconstruction of a crime scene contribute to an increased understanding of the event?

1.2 Delimitations

The project has the following delimitations:
• The Gothenburg riots consisted of several events, but this report will only focus on the shooting at Vasaplatsen.
• The final product will be based on the documentary by du Rées, and no further search will be made for other video material from the event.
• Our focus will be to create a proof of concept that can act as a basis for further work.

1.3 Related work

When creating reconstructions, it is essential to have correct data, especially if the reconstruction is used in crime investigations [12]. In addition, there are two important factors to consider, the first being the level of accuracy. Many parameters must be considered to make a reconstruction realistic: not only does the event itself need to be recreated precisely, but details before the event also need to be investigated [12]. The second factor is that the reconstruction cannot stand on its own in a forensic investigation. There must be complementary evidence, especially since pictures and videos are more likely to impact a jury [12].

Reconstructions can be done in different ways, not only in crime investigations but also for educational and cultural purposes. Some of the many possible variants are presented below.

1.3.1 Dramatisation

On July 22, 2011, Norway was hit by two terrorist attacks by the right-wing terrorist Anders Behring Breivik [13, 14]. The first attack occurred in the Oslo government district, where a bomb was detonated. The second occurred on Utøya, where the assailant ruthlessly murdered 69 people, mainly young adults and children [13, 14].
In total, 77 individuals were killed in the two attacks, and 319 were injured [14]. Six years after the attacks, work began on the documentary "Reconstructing Utøya", which is based on reconstructing and dramatising four survivors' memories of the attack [14]. The scenes shown in the documentary depict the course of events up until friends and family of the survivors are shot, and how the survivors themselves escaped. It is worth mentioning that the documentary contains modifications; for example, towards the end a school disco takes place to give the documentary a positive twist. This disco never occurred in reality because of the attacks [14].

1.3.2 3D-modelling and animation

3D animations have been discussed and used in crime investigations to better understand cases. Reconstructions of car and motorcycle accidents have been created to give juries in court cases a picture of the accidents [12]. One terrorist attack partially reconstructed through 3D modelling and animation is the truck attack by the IS terrorist Rakhmat Akilov on April 7, 2017, on Drottninggatan in Stockholm [15, 16]. The animated 3D reconstruction showing the truck's 1,063-meter-long path was made by the Swedish National Forensic Center (NFC) and is the first of its kind to be used in a Swedish legal process [17, 16]. In the reconstruction only objects, vehicles and buildings are visible, and the truck drives the same route and at the same speed as the police described in their preliminary investigation [17, 18].

The animated film aims to create an overview of the course of events, facilitate the legal processes and provide educational material [17, 16]. By creating a 3D model, investigators can return to the digital version of the crime scene for further investigations. The model can also, together with material such as images and videos, form a credible basis in legal proceedings.
It also provides the opportunity to experience the event by entering the virtual environment [16].

1.3.3 Virtual reality reconstructions

There have been many reconstructions of historical buildings in VR [19]; some examples are presented in this section.

1.3.3.1 Nefertari: Journey to Eternity

One of the more successful reconstructions is "Nefertari: Journey to Eternity" [20]. The VR experience was developed by Experius VR and released in 2018. It is a reconstruction of the tomb of Queen Nefertari, where the user can explore the tomb in VR and see how it looked 3000 years ago. The user can interact with different hieroglyphs and hear historically accurate facts about the hieroglyphs, Queen Nefertari, and the ancient Egyptian gods and goddesses [20]. The VR experience was created by digitally reconstructing the tomb with the help of 360-degree cameras and scanning [20]. One of the most substantial advantages of this VR experience is the detailed and meticulous recreation of the tomb and the hieroglyphs. Since the VR experience is based on a small room, the developers had to be creative when filling the room with interactable artefacts and objects.

1.3.3.2 The Night Cafe: A VR Tribute to Vincent van Gogh

"The Night Cafe: A VR Tribute to Vincent van Gogh" explores virtual reality and culture creatively [19]. It was developed by Borrowed Light Studios [21] and released in 2016. The VR experience allows the user to explore Vincent van Gogh's painting "The Night Cafe". Unlike other cultural VR experiences, The Night Cafe lets the user enter the painting and explore the picture from within. This is an interesting way of exploring VR as a medium: users can not only observe the painting in 2D but also travel into its colourful world. There are some problems with this kind of VR recreation, however.
One of them is the creative freedom the developers take when creating an experience like this one. The developers studied the painting and other impressionist paintings [21], but they still had to add new material to the painting and its world. This dilemma opens the door to the question we struggle with and discuss in this thesis: how far from the objective truth can a developer stray when recreating an object that exists in real life?

1.3.3.3 Anne Frank House VR

On Steam, there are many art museum and recreation applications that allow the user to explore different cultures [19]. One example is "Anne Frank House VR", where the user can walk through the house Anne Frank lived in, interact with items, and be told about her life. It utilises virtual reality to preserve the story of Anne Frank's life while making it available to everyone [22].

2 Background

Even though the shooting at Vasaplatsen is perhaps the most infamous event from the Gothenburg riots, it was not the only event during the demonstrations [23, 24]. The Gothenburg riots were one of many large demonstrations in the European Union (EU) during the early 2000s. There had already been riots associated with demonstrations in protest against the EU in Nice, Seattle, and Prague [25]. The Swedish government and the city of Gothenburg wanted to avoid the conflict and escalation that had occurred in those cities. They therefore began a dialogue with the organisers of the different demonstration factions a year before the EU summit. Up until the summit in Gothenburg, many believed that all the demonstrations would be peaceful [25]. However, the outcome was mixed: some demonstrations were successful, while others, like the one at Vasaplatsen, escalated into violence.

The demonstrations had mainly two causes. The primary one was the dissatisfaction with the EU that was widespread at the time.
People feared losing their voice and influence over decisions and politics if too much decision-making moved to the EU. The secondary reason was that the then-sitting American president, George W. Bush, was visiting the summit. This was the first time a sitting American president had visited Sweden, and there was also widespread discontent in Sweden with the USA in general and with the president in particular [25, 23, 24]. These factors led to many thousands of people gathering in Gothenburg on 14-16 June 2001.

2.1 Events during the riots

Our thesis focuses on the shooting at Vasaplatsen, but five more noteworthy events took place during the Gothenburg riots that influenced the outcome and are relevant to be familiar with. These are:
1. The barricading of the Hvitfeldtska gymnasium.
2. The demonstrations at Götaplatsen.
3. The march towards The Swedish Exhibition & Congress Center.
4. The demonstrations at Järntorget.
5. The storming of Schillerska gymnasium.

2.1.1 The barricading of the Hvitfeldtska gymnasium

The first event speculated to have raised tension between the demonstrators and the police happened at the Hvitfeldtska gymnasium [26]. The school was one of sixteen that accommodated demonstrators who did not live in Gothenburg and needed a place to stay. Around 500 people, mainly young adults, stayed at Hvitfeldtska. On the morning of the 14th of June, the police barricaded the school with containers, prohibiting anyone from leaving the area. Despite this, some demonstrators tried to leave; in return, they had to turn over their belongings to the police and were searched. The demonstrators who decided to stay at the school were kept there for roughly 24 hours. The police later stormed the school, arrested everyone still left, and kept them overnight [26, 23, 24].
2.1.2 The demonstrations at Götaplatsen

At the same time as the demonstrators were kept under lockdown at Hvitfeldtska, a demonstration against President Bush was taking place at Götaplatsen [23]. Around 12 000 people joined the demonstration in protest against the USA and President Bush. During this demonstration there was no altercation between the protesters and the police, and everything proceeded peacefully and as planned [24, 23]. Two more demonstrations similar to this one took place, each including over 10 000 people; they were also successful and peaceful [26].

2.1.3 The march towards The Swedish Exhibition & Congress Center

The third major event happened on the morning of the 15th. Unauthorised demonstrations were held at Götaplatsen, with plans to march towards The Swedish Exhibition & Congress Center, where the EU summit was taking place [26]. The demonstrators moved close to the Exhibition & Congress Center, where the police had formed a defensive line. When the demonstrators got close to the police line, violence arose and the two sides clashed. The police pushed the demonstrators back towards Götaplatsen, where a group of protesters broke free from the rest of the demonstration. The group contained individuals from the faction "Svarta Blocket", known by the police for using violence [26, 25]. The faction moved down Avenyn and wreaked havoc on stores, outdoor seating, and shop windows, causing damage worth several million Swedish kronor [23].

2.1.4 The demonstrations at Järntorget

The fourth event that received attention afterwards was the demonstration at Järntorget. On the evening of the 16th of June, a couple of hundred people gathered for a peaceful, unauthorised demonstration against the police brutality of the previous days [23, 25, 24]. The demonstrators stayed at Järntorget to show solidarity with the people taken into custody. The police quickly arrived and established a ring around the demonstration.
After a while, the protesters were allowed to leave on the condition that they be searched and checked for Swedish citizenship [23]. Roughly half of the protesters left, while the other half stayed. The demonstrators remained peaceful and showed no aggression. After another couple of hours, the police commander on site grew tired of keeping the peaceful demonstrators on lockdown. Without authorisation from his superiors, he released the rest of the demonstrators without repercussions [25].

2.1.5 The storming of Schillerska gymnasium

Like Hvitfeldtska, Schillerska was a school where demonstrators lived and slept during the event. During the evening of the 16th of June, the police and the national task force stormed Schillerska gymnasium [23, 26]. The police had heard rumours about a German terrorist staying at Schillerska [26]. The task force and the police detained around 80 people, who were forced down onto the ground. The people had to stay on the ground, wearing little clothing, for 45-90 minutes [23]. There are accounts of the police using racist and degrading language towards the demonstrators. Just as at Hvitfeldtska, the police left Schillerska empty-handed, as they did not find any German terrorist [26, 23]. Many demonstrators reported being shocked and traumatised during and after the event [23].

3 Theory

This chapter explains the theory applicable to this master's thesis. It covers virtual reality and how one can create an immersive and rich experience for the user, as well as factors to consider when using images as evidence.

3.1 Virtual reality

According to Kardong-Edgren et al. [27], there is no unifying definition of virtual reality for academic and educational purposes.
However, Merriam-Webster [28] proposes the following general definition: “an artificial environment which is experienced through sensory stimuli (such as sights and sounds) provided by a computer and in which one’s actions partially determine what happens in the environment”. What is certain, however, is that virtual reality continues to develop and is now in its second wave of innovation, where new technologies and methods are being developed [27]. Examples are updated hardware such as new headsets, or so-called head-mounted displays (HMDs) [29], which will be used in this thesis.

Several factors make a successful VR experience: fidelity, immersion and presence [27]. Examples of situations where these factors come into play are therapy and training [30]. Patients can be exposed to their fear or phobia, and the military can train in urban areas without physically being there. Each factor is discussed more thoroughly below.

3.1.1 Fidelity

Fidelity in the context of virtual reality can be explained as how faithful the virtual experience is to the original event or object [31, 32]. It can be split into multiple categories, such as display, physical and psychological fidelity [32, 33]. Display fidelity refers to the computational optimisation of the experience, display size and resolution [33]. Physical fidelity refers to how well the virtual objects replicate the real objects [32]. Psychological fidelity is how well the experience evokes psychological effects in the user, such as stress and fear.

3.1.2 Immersion

Immersion in virtual reality is about the level of sensory fidelity that a VR system provides, or in simpler words: how well reality is shut out while in the experience, how many of the human senses are involved, and with what quality [30, 32, 34]. Immersion is objective and measurable, and two systems can have different levels of immersion [30, 32, 35].
It is a VR system's hardware and software that determine the level of immersion, such as display size, display resolution and field of view (FOV) [30, 32, 34]. Immersion in virtual reality consists of different parts that must co-exist to optimise the result [32].

3.1.3 Presence

The third building block of virtual reality, presence, is about being in a place that one knows is virtual and fictitious, but reacting to what one sees, hears and does as if it were real [36, 30, 35, 34]. In contrast to immersion, presence is subjective and therefore hard to measure [37, 34]. Factors affecting presence include graphics, level of realism, sound, and range of movement [36, 34]. One can therefore conclude that increased psychological fidelity and immersion can lead to increased presence [27, 34]. An example of when the user experiences presence is when having an avatar, or seeing one's hands in the virtual world and being able to interact with the virtual world through them [34].

3.1.4 Movement in virtual reality

When designing for VR, one important aspect is how the user moves in the experience, so-called locomotion [38, 39]. There are many ways to implement locomotion in a VR experience; a recent comparative study found at least 22 established methods of locomotion in VR [39]. Some of the more common ones are using a joystick, pointing and teleporting, moving in place, and flying. The decision of which method to use affects the user experience and can be difficult when designing a new VR experience. Locomotion can help make the experience more fun and easier to understand, and prolong the time the user can stay in the experience without feeling motion sickness [40].

A common side effect of using VR is motion sickness [41]. Users experience motion sickness because the brain cannot comprehend that the body is still in the room while, in the headset, it appears to be moving. The brain thus receives conflicting signals, which is what causes nausea [42, 41].
Some methods of locomotion have been shown to decrease or delay the feeling of motion sickness [43]. One of these methods is pointing and teleporting, a target-based motion where the user decides on the destination and the system handles the movement [43, 38]. In addition, the feeling of teleporting is reported as futuristic and resembling science fiction, which is appreciated by users [43]. Pointing and teleporting is best used when the VR experience has a vast area to explore and the user does not need to manoeuvre around obstacles [40]. Using a joystick as locomotion is better suited to a more high-paced VR experience where decisions must be made quickly.

3.1.5 Selection in virtual reality

Being able to interact with objects is an important part of VR and has a significant effect on the result [38]. According to Bowman et al. [38], the most common way is to have virtual hands that move with the hand trackers one holds. The difficulty, however, is that objects far away cannot be reached. For this reason, ray interactors have been developed to act as an extension of the user's hand [38, 44]. In combination with scripting, that is, short pieces of code, rays and hands can be used to interact with the virtual world [45], see Figure 3.1.

Figure 3.1: Hands and rays in Unity

3.1.6 User interface in virtual reality

Designing a user interface (UI) for 3D platforms such as VR differs from designing a UI for 2D platforms such as desktops and mobile phones [46]. UIs for 2D platforms have gone through many iterations, and paradigms have been established. This is not the case for UI in VR. Some recommendations and principles from classical UI design apply to VR, but there are big differences when designing in 3D compared to 2D [46]. Depending on what type of VR experience is being developed, the UI has to be customised for the experience. Doerner et al. [46] propose two types of UIs: natural UI and magical UI.
A natural UI is more suited to realistic VR experiences, where the developers aim to resemble the real world as much as possible. A natural UI means that the user can only interact with objects close to them, just like in the real world. A magical UI, in turn, means that all limits are off: the users can, for instance, extend their arms beyond realistic proportions to interact with an object.

3.1.6.1 Placement

There are two ways of displaying the UI: screen space and world space [47]. In the first case, interface elements are placed directly in the user's field of vision, while in the second case, interface elements are placed in the 3D world. World space UI is preferred since the elements do not block the user's field of vision [47]. Furthermore, according to Bowman et al. [38], one of the main problems in the 3D world is that actions become more difficult as one gets more degrees of freedom. Selecting something in a menu that floats in the 3D world is more challenging than having a menu on a surface.

3.2 Motion tracking

Motion tracking of humans involves capturing and digitising real people's body and facial movements. These digitised versions can then be used within, for example, the film and gaming industries [48]. There are many different ways to motion track, but they are often divided into two main groups: marker-based and markerless [49].

3.2.1 Marker-based tracking

Marker-based tracking means that sensors are attached to the human body and, with the help of advanced cameras and software, the movements are documented [49, 48]. This type of tracking gives a reliable and detailed result, but the equipment makes it complex and expensive [48].

3.2.2 Markerless tracking

Markerless tracking does not depend on sensors but instead uses computer vision techniques, namely depth cameras and computer algorithms, to document movements [49, 50, 48, 51].
The depth cameras combine colour and depth to perceive movement in a 3D environment, while the algorithms use deep learning and cameras to identify key parts of the human body, such as joints, to document the movements [48, 51]. Markerless tracking is more accessible because there is no need for expensive equipment [48]. However, it is less reliable, as one cannot reach the same level of detail as with marker-based tracking [50, 48].

3.3 Managing uncertainties

During this thesis, there will be cases in the reconstruction where we need to deal with uncertainties. This may apply to uncertainties regarding the characters' positioning, appearance and actions. For this reason, it is relevant for us to look more closely at how these uncertainties can be communicated to the user.

A recommended way is to work with so-called visual cues, such as markings in the image, to communicate metadata [52, 53]. Metadata, in this case, refers to the level of uncertainty. Some examples are flickering, grayscaling, colours, sizing and blurring, which can be applied to one object, several objects or the whole scene [52, 54, 53]. A higher degree of uncertainty can be visualised with faster flickering, a stronger gray- or colour-scale shift, a more significant size change, or more blurring [52, 54]. However, the methods have disadvantages; for example, flickering can result in visual fatigue and be difficult to interpret [55, 52]. Colour and grayscale, in turn, can also be challenging to interpret, and they depend on the starting state of the object [52]. For example, it is difficult to desaturate an object that is already in grayscale from the start. It is also complicated to predict how users will experience an applied colour filter [52].

Furthermore, two general difficulties when visualising uncertainties are distraction and precision [52]. Distraction is a matter of whether one wants the uncertainty to attract the users' attention or not.
Is the aim to make the user focus on and notice the uncertainty, or to look past it? Precision, in turn, is about which parts of the object are uncertain and how many. An example brought up by Westin and Eriksson [52] is a railing they had to reconstruct in their project aimed at creating a visualisation of the Sanctuary of Hercules Victor in Tivoli. The position of the railing was certain, but not the material. In that case, by marking the object as uncertain, there was no way to communicate to the viewer which part of it was uncertain.

3.4 Cultural heritage preservation

One area of VR that has exploded in terms of applications since the hardware became more accessible is cultural heritage preservation [19]. Cultural heritage preservation is the act of keeping and conserving cultural manifestations. The aim is to preserve the object's physical and cultural characteristics to ensure that it will outlive us and that its value does not dwindle [56]. VR offers new opportunities to preserve historical events and cultural traditions by reconstructing them digitally. This opens up the possibility of preserving and exploring culture that would otherwise be inaccessible due to distance and cost.

3.5 Copyright

The proof of concept created in this thesis will be based on material from other people's work, namely video sequences from the documentary by du Rées [2], which in turn is based on clips from various media sources. The resources that will be used come from either documentaries or private persons, and thus it is essential to consider copyright and intellectual property. Two laws that may be relevant are the right of use (nyttjanderätten) and the right of citation (citationsrätten). The former means the law protects original creations, and the creator is the owner (SFS 1960:729) [57].
The latter means that everyone is allowed to cite from the creations, as long as it is done in good faith and to the extent motivated by the purpose (SFS 1960:729, ch. 2, 22 §). There is also a need for good practice (in Swedish "god sed") when working with copyrighted material. This means the creator and source should be referenced in context with the creation. Further, the reference must be made in a way that does not change the meaning of the creation (SFS 1960:729, ch. 2, 22 §).

During this thesis, we will recreate a new and unique variant of a historical event where parts are built on another person's "creation". Our assessment is that this should fall within the framework of the right of citation, the right of use and good faith. However, there are still uncertainties, which is why this needs to be investigated further. According to Journalistförbundet [58], joint copyright can be created if permission is obtained from the original creator, allowing the material to be used and modified. In short, a solution could be contacting Göran du Rées or Uppdrag Granskning for permission to use their clips.

3.6 Images as credible evidence

It has become increasingly common for images and video to be used in forensic investigations [59]. Images and videos are data-rich sources for understanding our world [60]. When used as evidence, it is important that they are of good quality and demonstrate their purpose in order to appear truthful. To avoid bias when interpreting images and videos, it is, among other things, important that the clips are presented with context so that we can be sure they have not been misused [60, 61]. Context can be conveyed through, for example, supplementary text, narrative or other images. However, according to Granot [61], the viewer's subjective opinion can still create a conscious or unconscious bias when video or images are presented as evidence during trials.
This is because they evoke stronger emotions than other types of evidence. There is still no good solution to this today.

In his documentary, du Rées based the creation of his reconstruction on six different steps [7, 2]. These are presented below, each with a brief description together with reinforcement from other sources:

1. Original materials and equipment must be verified - When film and sound sequences are edited and combined, it must be possible to return to the original material and source.

2. The creators' intentions must be clearly stated - When we watch a clip, it is important to understand the sender's goal and purpose. What are they trying to accomplish with the photo and video they took? This is reinforced by Granot et al. and Mathison [60, 61], who also believe it is easy to modify and manipulate images to achieve a specific feeling or redirect attention.

3. Space and location must be clearly described - To enable us to locate ourselves when watching a clip, camera angles and location must be clearly described. Mathison [60] mentions that this is important, again for context, as the camera angle shows what the sender felt was worth documenting at the time.

4. Time must be objectified - The easiest way to manipulate time in videos is to make a clip longer or shorter. In this way, time is manipulated without the viewer noticing. To counteract this, du Rées argues that one should, to the extent possible, use single-sequence shots from the event to objectify time.

5. Use black boxes at each clip - In cases where it is not possible to use single-sequence clips, black boxes should be used to indicate clip changes.

6. Call in experts when the image is to be used as evidence - People in the legal process often have no knowledge of film and images. Due to this, it is vital to call in experts to confirm the film's credibility. Granot et al. believe it may also be relevant to train jurors and lawyers in this area.
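The six criteria above lend themselves to being captured as structured metadata per clip. The following sketch is our own illustration of that idea, not part of du Rées' method; all field names and example values are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class ClipRecord:
    # Hypothetical per-clip metadata record mirroring du Rées' criteria
    source: str            # owner / original material (criterion 1)
    intent: str            # stated purpose of the footage (criterion 2)
    camera_position: str   # where and how it was filmed (criterion 3)
    start_s: float         # unedited start timestamp, seconds (criterion 4)
    end_s: float           # unedited end timestamp, seconds (criterion 4)
    single_sequence: bool  # True if the clip is a single uncut shot

def needs_black_box(clip: ClipRecord) -> bool:
    """Criterion 5: a clip that is not a single uncut sequence
    should be separated from its neighbours by a black box."""
    return not clip.single_sequence
```

A composite clip (`single_sequence=False`) would thus be flagged for a black-box transition in the edit, while a verified single-sequence shot would not.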
3.7 Theoretical framework

The following part will discuss the theoretical framework, starting by explaining the basic relationship between the user and a product and then delving deeper into rich experiences and how to achieve them.

3.7.1 Content-Oriented Model of User Experience

The content-oriented model of user experience breaks design into three levels: why, what and how [62]. All three are needed to create an enjoyable user experience. Everything in design starts with the "why": the needs and envisioned experience of the design. The "what" is the decision of instrumentality and function. The "how" is how the interactions should look and feel [62]. According to Hassenzahl [62], these points must be included in the design to achieve enjoyable products with a good user experience. During the project, we will use the content-oriented model of user experience to the best of our ability to create an enjoyable VR experience.

3.7.2 Reversal theory and designing for a rich experience

Rich experiences are an important design principle to utilise if the designer wants the user to become invested in their product. Humans are dynamic and complex creatures, and a mistake often made by designers is to develop a static product [63]. Reversal theory allows designers to take a more holistic approach to product development. Fokkinga and Desmet [63] propose six design opportunities when designing for rich experiences, where parapathic emotions are essential in two of them. Parapathic emotions in reversal theory mean that every negative emotion has a corresponding positive emotion, and vice versa [63]. When designing for rich experiences, the product can be better if the designers can evoke both of the corresponding parapathic emotions. One study [64] described numerous stages where a person experiences parapathic emotions, of which the one described as "The scandalous" is relevant to this thesis.
The scandalous experience occurs when a person experiences something morally ambiguous and unjust (negative emotion) and becomes fascinated by the event (positive emotion) [64]. During our experience, the user will see and feel the violence between the protesters and police, which will hopefully evoke ambiguous feelings within the user.

3.7.3 Protective frame

Protective frames are a mental defence mechanism humans utilise to distance themselves emotionally from unpleasant situations [65]. These frames are psychological constructs, which means that it is not about whether a person is physically safe but whether she believes she is safe. One of the protective frames, the detachment frame, occurs within a person when they are experiencing something they find scary but know is not in their direct vicinity. The protective frame has been used for centuries by authors and filmmakers [65]. It lets a person observe an unpleasant occurrence without being part of the event. In our thesis, it will work as a psychological defence mechanism that protects the user from the periodically intense experience.

4 Methodology

During this thesis, different methods will be used. These are explained in detail in this chapter.

4.1 Planning

We will use a Gantt chart for the work process and time management. It is a valuable and efficient tool for planning and providing an overview of project steps and their duration [66, 67]. Following Wadsworth [68], the chart will also be revised after the methods have been decided, to ensure that it is relevant and covers all steps in the process.

4.2 Resource gathering

Even though the prestudy and the data gathering are two different process steps, the same methods will be applied: a simplified literature review and the method "Written records, accounts and diaries" [68]. This is to provide general knowledge about the selected area.
In the former case, books and scientific articles will be collected and summarised to learn from previous research and find related work [68]. In the latter case, historical records such as news articles and other files and documentation from the event will be gathered. According to Wadsworth [68], this data can be indirect evidence of the event. Either way, it is important to be aware that the sources can be biased [68].

4.3 Hardware and software

Two software programs will be used to create the proof of concept: Unity and Blender. The concept will then be experienced in VR with the help of the headset Oculus Quest 2. These are explained in detail below.

4.3.1 Unity

Since its release in 2005, Unity has become one of the most widely used game engines available [69, 70]. Unity's primary focus is to provide various tools for developers and supply an easy-to-use engine regardless of skill level. Unity is at the forefront of providing developers with tools and packages for VR development. It has packages for almost every hardware on the market, and the engine is constantly updated to keep up with new technology [69]. A competitor to Unity is Unreal Engine, which is in the same price range and has a similar learning curve [70]. The choice between these two was not obvious to us. However, early in the project, we came across well-made tutorials for Unity in which VR experiences were developed. We felt this would speed up the software learning process, and this became the deciding factor in choosing Unity.

4.3.2 Blender

Blender is a free and open-source 3D design platform used for the entire 3D modelling process: modelling, rigging, animation, simulation, rendering, compositing and motion tracking [71]. The software can be used by individuals who want to play around with the program, as well as professional studios working in 3D environments [71]. Similar software on the market are Maya and SketchUp.
Our choice of which one to use was made in discussion with our supervisor, who mentioned that Blender is easier to learn and a better fit for our small-scale project. Furthermore, if we needed help with Blender during the project, our supervisor had knowledge of it. This simplified the process of receiving help, unlike with other software, where we would have needed to rely on online sources.

4.3.3 Oculus Quest 2

This thesis will use a head-mounted display (HMD), which is a fully immersive system [35]. Based on what was presented in sections 3.1.1 Fidelity, 3.1.2 Immersion and 3.1.3 Presence, this means that the user relies on both visuals and sound to get the full experience. The choice of headset was based on close dialogue with our supervisor and our own preferences. We were torn between the Oculus Quest 2 and the HTC Vive Pro. Although the Vive had better specifications, it was more inconvenient to use on the go due to its so-called base stations, which track the user's movements in the room while the user wears the headset and holds the controllers. For this reason, the Oculus Quest 2 was chosen, tethered with a link cable to a computer to provide better rendering quality.

4.4 Modelling

When working with design, it is essential to communicate ideas clearly in order to move forward to the next step in the process [72, 73]. One way of doing this is by creating prototypes. By showcasing prototypes to others, evaluating, refining and improving the concepts becomes easier. During this project, prototypes will be created using the software Blender and Unity mentioned previously. These prototypes will include a 3D model of Vasaplatsen in Gothenburg, a VR user interface, and a VR reconstruction of the event.

4.4.1 Chronological timeline

Timelines are used to display how events unfold in chronological order [74].
Depending on the field of interest, timelines can be used to display aeons of time or a couple of minutes. Either way, the purpose is the same: to show when events occur and highlight key events. This makes it easier for the person looking at the timeline to understand what was important for the event. We used a chronological timeline to keep track of how key persons moved during the event and what actions they took.

4.4.2 Storyboard

To create an overview of the Gothenburg riots, we will create a storyboard with the key scenes we want to include. Following Krause [75], this will be supplemented with visualisations in the form of sketches and short captions. Our storyboard contained five pictures depicting how we wanted to shape the experience.

4.4.3 Digital twin

The 3D model of Vasaplatsen in the reconstruction will be based on the existing digital twin of Gothenburg [76]. A digital twin is a 3D model of a physical object, for example a city or a process [77], in this case the city of Gothenburg [76]. The twin was created by the city planning office to facilitate the planning of future projects, understand what consequences these projects may have, and create a basis for making the city more efficient [77, 78, 76].

4.5 User testing

An essential part of a project is to receive user input regarding the concept in order to discover future improvements and existing problems [72, 79]. In this project, user testing is also important for gaining insight into how well users think the reconstruction reflects reality and how it affects them emotionally. The aim will be to gather three to five intended users of the product [72, 79]. Through a qualitative approach, the goal will be to gather insights regarding the general experience of using the product [79]. The method applied will be think-aloud, where the users will be prompted to share their thoughts and feelings while using the product [80, 79].
The results from the think-aloud method will be sorted and analysed with the help of affinity diagrams [81].

4.5.1 Think-aloud

To help evaluate the VR experience, the think-aloud method will be used [80]. It is used to evaluate prototypes, where the participants are asked to speak out loud when trying a product. The method is used to understand what is working or not, rather than why it is working or not. Think-aloud can help developers understand the user's feelings and thoughts about the product. The method is suitable when the product, or part of the product, being tested is narrow [80]. This is the reason we are using it, in addition to it being an adequate evaluation method, since our VR experience is relatively simple. We will note what the users do and say during the sessions.

4.5.2 Interviews

To supplement the think-aloud testing, semi-structured interviews will be conducted. Interviews in research can be conducted in a structured, semi-structured or unstructured format [80]. For the user testing, we will use the semi-structured format. Semi-structured interviews are based on questions that are asked to all of the participants. Depending on how the participants answer the questions, the interview can morph from a more clinical conversation to a more conversational and everyday-like one, depending on what the interviewee finds interesting. The test leader has predetermined questions they can use if the conversation trails off the subject too much [80].

4.5.3 Affinity diagramming

Affinity diagramming is a method for sorting qualitative data [81]. The people taking part in the affinity diagramming write down pieces of known information about the product on sticky notes. The notes are placed on a whiteboard when all the information has been written down. The participants then sort the notes into groups and look for patterns or similarities.
After an affinity diagramming session, the tester should have insight into what they are developing and be able to create a plan for moving forward [81]. In our thesis, we will use affinity diagrams to structure the data collected from the think-aloud method. The insights will be used to provide notes on what is possible in future work on the project.

5 Execution and Process

Since this thesis is more technical, we have not used a traditional design process with steps such as user data collection or a phase dedicated to ideation. Instead, the project has been based on the clear vision for the thesis: to create a reconstruction of a historical event in virtual reality. However, the process has not been straightforward and linear. There have been many ambiguities regarding the application's design, such as motion tracking, locomotion and the manipulation of time. The focus has further been on the actual design of the final concept and what level of fidelity it should have. This has consisted of several process steps that will be presented below.

5.1 Planning

Based on previous experience and the fact that the project would contain several approaches entirely new to us, we decided to use a Gantt chart for planning the work. This chart was revised several times during the project, and the final version can be seen in Appendix B. When creating the Gantt chart, we kept in mind that some parts could take longer to execute than expected. This primarily concerned the steps "6. Create 3D model" and "7. Implement VR in Unity". In both of these steps, we would use the software Blender and Unity, which we had no prior experience of using. In addition, it was in step 7 that many of the project's ambiguities would arise, for example how the user should move in the experience, how the animations should be implemented and how the 3D model should be combined with the VR experience.
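As a minimal illustration of how a Gantt chart encodes steps and their durations, the sketch below renders one chart row as text. It is a hypothetical example of the technique only; the task names are taken from the steps above, but the weeks and durations are invented and do not reflect the actual chart in Appendix B:

```python
def gantt_row(name: str, start_week: int, weeks: int, total: int = 20) -> str:
    """Render one Gantt-chart row as text: '#' marks active weeks,
    '.' marks inactive ones, over a project of `total` weeks."""
    bar = "".join("#" if start_week <= w < start_week + weeks else "."
                  for w in range(total))
    return f"{name:<28}{bar}"

# Hypothetical example rows; the real schedule was revised several times
chart = "\n".join([
    gantt_row("6. Create 3D model", 6, 4),
    gantt_row("7. Implement VR in Unity", 8, 8),
])
```

Printing `chart` gives two aligned rows whose bars make overlaps between steps (here, weeks 8 and 9) visible at a glance.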
5.2 Prestudy

The process started with a preliminary study where we looked for information about virtual reality, the Gothenburg riots, and the design and use of reconstructions today. This gave us knowledge regarding physical and digital reconstructions and how virtual reality can be used. We also gained a deeper understanding of the Gothenburg riots, their impact on society and the process of achieving justice.

5.3 Data gathering and analysis

Collecting enough source material was necessary to design the reconstruction in virtual reality. This included, among other things, video footage, pictures from the demonstration, and maps of the place of action. In this step, the documentary "Skotten på Vasaplatsen" by Göran du Rées and its clips from Uppdrag Granskning were of great help [2]. The documentary facilitated the categorisation of material, the setting and synchronising of time stamps, and the understanding of the positioning of the various cameras filming the event. The camera positioning can be seen in Figure 5.1. The cameras that were most important to us were those in positions 1-2, 4, 6, 8-9 and the unknown position, see Figure 5.1.

Figure 5.1: A map of the camera positions used in du Rées' documentary

We also created a storyboard with illustrations to get an overview of important key scenes and moments from the event, see Figure 5.2. However, the storyboard turned out to be more rewarding in theory than in practice, because du Rées' documentary functioned to some extent as a storyboard in itself.

Figure 5.2: Storyboard of the Gothenburg riots

5.4 Motion tracking of key persons

The riots at Vasaplatsen escalated quickly, and many people were involved. However, due to limitations in time and resources, not all individuals could be modelled or motion tracked. This resulted in us selecting the key individuals and groups that we felt were most relevant to the event.
The choice was also based on who was easiest to identify in the film clips, as there were many clips and the quality was often low. The people we chose to motion track were Hannes Westberg and the two largest police groups: the group that has stones thrown at it at the beginning of du Rées' reconstruction (police group 1) and the reinforcement group seen a few moments before Hannes was shot (police group 2). The documentary by Göran du Rées also proved relevant when we were to track people's movements.

At first, we decided that markerless motion tracking using deep learning would be the best option for us, because there was not enough time or resources to recreate the riot movements in real life and perform marker-based motion tracking. However, a prerequisite for markerless motion tracking is good-quality film sequences, so that the deep learning model can track people. This was not the case with the clips we had access to: the key people we were interested in were not always in frame, and the general quality of the clips was not always optimal. We did not do any testing, but based on these theoretical findings, we made an educated estimation that manual motion tracking would be the most suitable for us. This involved watching the documentary, identifying relevant people in the various clips throughout the event and noting their positions and actions. At first, we tried to do this by placing them visually on a map, see Figure 5.3.
Arrows indicated which way they were facing, and different colours indicated actions and positioning:

• Green dots meant that we were sure of their positions and actions
• Yellow dots meant that we were not sure of their positions and actions
• Smaller light grey dots indicated that stones had been thrown
• Larger darker grey dots indicated that shots were fired
• Red dots indicated that a shot had hit someone

Figure 5.3: Positioning of police groups 1 and 2 and Hannes based on du Rées' documentary

However, we soon realised this was not detailed enough to base a reconstruction on. There needed to be more clarity regarding which time stamps applied, what actions the people took and in what order everything happened. Due to this, we restarted the process and instead used the chronological timeline, see Appendix C. There we wrote down actions and positioning together with time stamps. This method also made it easier to understand exactly during which sequences our persons of interest were visible and not. Empty boxes mean the person or group has not yet appeared in frame for the first time. A hyphen "-" means that the person or group is not visible but has been in frame before. A yellow marking implies uncertainty about positioning and/or action.

Furthermore, in some cases during the video sequences, it was possible for us to make educated guesses. These guesses were based on what was happening around the characters and their actions before and after. We wanted the scene to flow effortlessly and without characters seemingly teleporting, since we thought this would lower the immersion and presence. Because of this, there are sequences in the VR experience where characters are visible and moving, even though we are not completely sure what the characters did at those moments in the real-life event. These have been marked with "guess" in the script.
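The timeline entries can be thought of as keyframes: known (time, position) pairs per character, each with a certainty flag, between which positions are interpolated so that characters never appear to jump. The following is a minimal sketch of that idea, written in Python for illustration; the coordinates are hypothetical and not taken from the actual timeline in Appendix C:

```python
from dataclasses import dataclass

@dataclass
class Keyframe:
    t: float       # seconds into the reconstruction
    x: float       # map coordinates on Vasaplatsen
    z: float
    certain: bool  # False corresponds to a yellow marking / "guess"

def position_at(t: float, kfs: list[Keyframe]) -> tuple[float, float, bool]:
    """Linearly interpolate a character's position at time t.
    The result is only marked certain when both surrounding
    keyframes are certain."""
    kfs = sorted(kfs, key=lambda k: k.t)
    if t <= kfs[0].t:
        k = kfs[0]
        return k.x, k.z, k.certain
    if t >= kfs[-1].t:
        k = kfs[-1]
        return k.x, k.z, k.certain
    for a, b in zip(kfs, kfs[1:]):
        if a.t <= t <= b.t:
            f = (t - a.t) / (b.t - a.t)
            return (a.x + f * (b.x - a.x),
                    a.z + f * (b.z - a.z),
                    a.certain and b.certain)
```

A span between a certain and an uncertain keyframe is thus treated as a guess, mirroring the "guess" markings in our script, while the character still moves smoothly through it.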
5.5 Learning Unity

After our prestudy and collection of source material were finished, the idea was to start the 3D modelling of Vasaplatsen. However, due to various blockages and bottlenecks, such as postponed meetings and not having access to the right VR equipment or a computer with adequate performance, we had to find other things to work on. This turned out to be a blessing in disguise, as we got time to learn Unity thoroughly, since it had a central role in our work. We focused on learning the program's interface, how to code in the program, and approaches when developing games and VR experiences. To apply what we learned and make a first project draft, we created a very simple model resembling Vasaplatsen, see Figure 5.4. By making this model, we learned about scale in Unity, how to download assets, and how to work with so-called prefabs. According to [82], prefabs facilitate and streamline the work: one creates a base game object and then uses instances of it to fill the game environment. An example in our case was the trees on Vasaplatsen. Instead of modelling each tree individually, we could make one base model and use copies of it. In this way, we could, for example, change the colour of all instances by only changing it on the base object.

5.6 Create the 3D model

At first, our idea was to model Vasaplatsen from scratch in Blender. This turned out to be more time-consuming than we initially thought, which resulted in us contacting Gothenburg's city planning office (SBK) on the recommendation of our supervisor. Through them, we gained access to the digital twin of Gothenburg, see Figure 5.5, and aerial photos, which made the work significantly more efficient, even though it was only a primitive model. For our project, the model only needed to be supplemented with house facades and city objects, such as benches and traffic signs, since the buildings already had the correct dimensions and the green areas and roads were laid out.
It is worth noting that Vasaplatsen in this model could potentially differ from Vasaplatsen during the riots, since it is a model of how Vasaplatsen looks today and not how it looked in 2001. We believe, however, that Vasaplatsen is such a well-preserved area that the differences are not decisive for the reconstruction.

Figure 5.4: First prototype of Vasaplatsen in Unity

Figure 5.5: Excerpt of Vasaplatsen from Gothenburg's digital twin

House facades were not made for all buildings due to time constraints. The facades that were made were created by our supervisor with the help of photographs of the facades on Vasaplatsen. These images were modified in Photoshop, where objects such as trees and cars were removed to create an image of a flat house facade, see Figure 5.6. The finished image was then projected onto the digital twin. Further, regarding city objects, a package containing basic city objects that could be placed in our 3D environment was downloaded in Unity.

5.7 Prepare for virtual reality

The Unity introduction tutorial described in section 5.5 Learning Unity recommended using Unity's pre-made VR package as a starting point when creating VR experiences. This package included many of the parts that we would otherwise have had to create ourselves. For example, there were scripts for different types of locomotion, finished hand models and ray interactors. This made the work significantly more efficient. In addition, we learned later in the process that this is standard practice when developing with Unity: Unity offers templates with recommended pre-settings for the type of project that is going to be developed.

Figure 5.6: House facade from Vasaplatsen

Unity also provided a design document during the tutorial, which they recommended filling in before creating VR projects.
With the help of the document, we determined, among other things, how the user can interact during the experience and how we could optimise the experience's performance. This also facilitated the work, as we got a clear picture of what we would develop and which videos from the Unity tutorial could be relevant. The document can be seen in Appendix A.

5.8 Creating the virtual reality experience

Once the tutorial was finished and the 3D model of Vasaplatsen was in place, it was imported into Unity so we could start creating the VR experience. On recommendation from the AR & VR agency OutHere, we decided to use pre-made 3D animations and characters from the free online library Mixamo [83]. Through the library, an animation can be applied to a desired character and then downloaded and used in, for example, Unity. It is possible to use one of their pre-made characters or upload a customised one. The principle is the same either way: all characters can be combined with a large number of available animations, but only one animation at a time can be applied to a character.

The Mixamo process was iterative. We started by finding a character that resembled Hannes and animations similar to his movements in the du Rées reconstruction. These were then downloaded one by one. This process was repeated for police group 1 (stone throwing) and police group 2 (the reinforcement). For the groups, the same character was used together with various animations; in Unity, multiple instances of this character were then used to create the groups.

One of the difficulties in this phase was determining the speed at which the characters moved. We had to guess the values by looking at how long it took them to move from point A to point B in the du Rées documentary. Additionally, the scale of the characters placed in the digital twin was challenging to determine and based on guesswork.
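The speed estimation described above boils down to distance over time: read off two positions in the model, time the movement in the documentary, and divide. A minimal Python sketch of the idea (the example values are hypothetical):

```python
import math

def estimated_speed(point_a, point_b, seconds):
    """Estimate a character's movement speed (m/s) from how long the
    documentary shows them moving between two known (x, z) points."""
    dx = point_b[0] - point_a[0]
    dz = point_b[1] - point_a[1]
    return math.hypot(dx, dz) / seconds

# Hypothetical reading: a character covers 12 m in 8 s of footage.
speed = estimated_speed((0.0, 0.0), (12.0, 0.0), 8.0)  # 1.5 m/s
```

Values obtained this way remain rough estimates, since both the positions in the model and the timing in the footage carry uncertainty.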
The third difficulty was that we sometimes could not find animations and characters that resembled reality closely enough. In those cases, we had to modify them or settle for slightly different ones. Examples are the characters' clothes, equipment, and some of the movements. One such case is shown in Figure 5.7, where the clothes Hannes is wearing in the reconstruction, and the hand he is pointing with, differ from reality.

Figure 5.7: Comparison between Hannes in reality and the reconstruction

An attempt was made to modify clothes and add equipment such as weapons and batons in Blender, but this turned out to be significantly more advanced than expected. It was not only about modelling the clothes and equipment but also about making them follow the characters' movements. Due to time constraints, complexity, and difficulty level, we chose to keep the original Mixamo characters as they were.

5.8.1 Unity timeline

We used the Unity timeline to apply the animations in the VR world. The timeline allows developers to create cinematic content by adding audio sequences, working with several animations simultaneously, placing them out in time and modifying game objects [84]. Worth noting is that this timeline is different from the one previously mentioned in section 4.4.1 Chronological timeline.

With the animations downloaded from Mixamo, each character had its own customised animation track, see Figure 5.8. As mentioned previously, we did manual motion tracking by watching the documentary by du Rées, noting down positions and actions, and then implementing them in Unity. Even though this script was detailed, we still needed to switch between the documentary and Unity during the process. Second by second, we made sure that the animations in Unity resembled what was shown in the documentary. In total, ten characters were motion tracked based on their exact behaviour in the documentary and added to the VR experience. The process was iterative and needed a high level of precision.
It took approximately four weeks to achieve a reasonably good result.

Figure 5.8: Animation tracks in Unity timeline

5.8.2 Visualising uncertainties

In those cases where the characters' positioning and movement could not be seen or documented, we applied a visual effect to the characters. Based on the literature presented in section 3.3 Managing uncertainties, we tried different visual cues to communicate the uncertainties to the users. We discussed whether a coloured or blurry filter could represent the uncertainties, or whether the characters could become more transparent or desaturated. We also tried different types of so-called particle systems, such as "flying" squares around the character and a cloud variant. According to Unity [85], a particle system can be used on numerous occasions to apply a visual effect. The transparency, cloud and flying-squares effects can be seen in Figure 5.9.

There were advantages and disadvantages with all of the effects we tested. The majority of the drawbacks were that the techniques attracted too much attention or resembled something else. For example, the "flying squares" made it look like the characters were on fire, which was not what we sought.

Figure 5.9: Different ways of visualising uncertainties

The technique we finally settled on was transparency, as we felt it was attention-grabbing enough without becoming too distracting. Furthermore, since this reconstruction is stripped down regarding the number of characters, the main focus will be on Hannes and the police groups. We therefore believe that subtle differences, such as changes in transparency, are noticeable enough to the user.

5.8.3 Sound and Light

The AR & VR agency OutHere told us that one of the most important aspects of developing VR is the audio in the experience.
At first, we planned to use the audio from the du Rées documentary [2], which would give the experience the most authentic reconstruction sound-wise. Because of copyright, which has been discussed previously, we decided that the simplest way to solve this problem was to obtain permission from Göran du Rées himself. We contacted him and got permission to use his material as we wished.

Shortly after, we started to explore the audio functions that Unity offers. One such function is the ability to create 3D sound, which can enhance the VR experience [86]. This means that we could create different sound sources, place them at locations in the experience corresponding to their real-life counterparts, and thereby increase the immersion. In connection with this, however, we discovered that the audio from the documentary would not be suitable for creating 3D audio in the experience. It was a single sound file, meaning we could not isolate the sound of important events in the documentary, such as single gunshots. The audio was too flat, and sound from other sources blended in. As a result, we instead decided to use sound from the audio library Epidemic Sound [87]. It is an extensive library with royalty-free audio that had everything we strived to include in the VR experience. The audio we used from Epidemic Sound consisted of isolated audio files and worked well in the experience. Since the audio files were separated, we could create 3D sounds from them and make the VR experience more immersive.

Moving on to light in the experience, Unity provides natural light when starting up a new project. Developers can then manipulate the light in the scene depending on how they want it to look. The shootings and demonstrations on Vasaplatsen took place in the evening, when the sun was on its way down over the horizon. To provide a high level of immersion and presence, we aimed to simulate the same lighting in our recreation as in the source material.
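Returning to the 3D sound mentioned above: the practical effect of placing separate sound sources in space is that each source's perceived volume falls off with its distance to the listener. The Python sketch below illustrates the general idea with a simple inverse-distance rolloff; it is a conceptual sketch, not Unity's actual audio code.

```python
def inverse_distance_gain(distance, min_distance=1.0):
    """Conceptual distance rolloff: full volume inside min_distance,
    then the gain falls off as 1/distance, so distant sources fade."""
    if distance <= min_distance:
        return 1.0
    return min_distance / distance

# A gunshot source 10 m from the listener plays at roughly a tenth of
# full gain, while one right next to the listener plays at full volume.
```

Isolated audio files are what make this possible: each event (a gunshot, a thrown stone) can be given its own source and its own rolloff, which a single mixed-down track cannot provide.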
5.8.4 Interface

The idea throughout the project has been to have an introductory scene with background information about the event for the user before they see the actual reconstruction. However, working with a lot of text in VR proved challenging during the development; long texts quickly became uninteresting and tiring to read. Furthermore, text in VR is no longer read statically, but can be affected by movements and rendering functionality [88].

In communication with the company OutHere, who confirmed the difficulties with text in VR, we concluded that a simple solution would be a voice-over. We could either record it ourselves or use an AI voice generator that uses text-to-speech. We discussed both approaches but concluded that the most flexible option would be to use artificial intelligence. We tested several websites that generate AI voices but finally landed on Play.ht [89], which we felt gave the most natural voice and did not have the typical robotic pitch. This process was iterative: we started by writing a short script and then generating the voice-over. The script was modified several times, and so was the voice-over. Play.ht focuses on creating natural-sounding results, making each voice sample unique. For this reason, certain words could be emphasised differently, and in some cases we were unsatisfied and had to generate a new sample.

5.8.5 Optimise performance

We followed Meta's recommendations regarding what constitutes good performance for a VR experience on an Oculus Quest 2 [90]. We aimed for:

• 72 frames per second
• 13.9 milliseconds per frame
• 750k-1.0m triangles per frame
• 200-300 draw calls per frame

When developing the experience, we knew it would be a smaller project and would not need too much computational power. There are no high-poly objects and no mechanics that demand much of the computer.
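The first two targets in the list are two views of the same budget: at 72 frames per second, each frame must be produced in about 13.9 milliseconds. A quick Python check of the arithmetic:

```python
def frame_budget_ms(target_fps):
    """Milliseconds available to render one frame at a target frame rate."""
    return 1000.0 / target_fps

budget = frame_budget_ms(72)  # ~13.9 ms, matching the guideline
```

Any frame that takes longer than this budget to render causes the headset to drop below the target frame rate, which is one of the main causes of motion sickness in VR.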
We also ran the experience with a Link cable connected to a computer and therefore estimated that performance would never really be a problem. The computer we used for developing the VR experience had an Nvidia RTX 2060 graphics card, which is recognised as a powerful graphics card. When we were done with the VR experience, we met our targets for frames per second and milliseconds per frame. We generally had fewer triangles per frame than what we wrote in the design document. This was because the models we used were low in triangle count, and we did not add more objects to the experience. The same goes for draw calls per frame: we had far fewer draw calls per frame than our goal. This is again due to our VR experience not demanding much of the computer and headset. The performance we got out of our product was enough to satisfy our goal.

5.9 Visit to Hagabion

When we emailed du Rées regarding copyright, he also mentioned that he would be coming to Hagabion in Gothenburg to show his documentary "Skotten på Vasaplatsen", which has been of great help during this thesis. The same evening, a new documentary by another director would also be shown: "På Hvitfeldtska bodde vi". In this case, the director was one of the protesters who were detained at Hvitfeldtska by the police, see section 2.1.1 The barricading of the Hvitfeldska gymnasium. After the documentaries were shown, the audience was allowed to ask questions, and it emerged that many in the audience had taken part in the demonstrations. During the discussions, there were also students who attend Hvitfeldtska today and who, despite attending the school, had never heard of its barricading.

During the discussions and in the documentary "På Hvitfeldtska bodde vi", it emerged that those involved felt hopeless that their perspective of the events was not taken seriously.
Many felt that what the media had reported was biased or one-sided, and that people would not believe them if they told their side. For example, after du Rées' documentary, someone mentioned that she had heard 13 shots, in contrast to the prosecutor's six and du Rées' eight. It also emerged during the same discussion that much of the film evidence had disappeared, as the police had neither the storage capacity nor the knowledge to save it, which could explain the differences in numbers.

Before the visit, we had worked clinically and scientifically with the events and riots. The visit gave us a new perspective on, and understanding of, what happened those days in 2001. People in Gothenburg still have strong emotions towards the event, and many are still emotionally hurt. The visit brought humanity into our thesis and our understanding of it. It confirmed that the events are still relevant, even though more than 20 years have passed.

5.10 User testing and compilation

During the work, we discussed user tests several times and when they would be appropriate. We felt that carrying out user tests relatively early in the project, for example with a first draft of the 3D model and some simple character animations, would not give us much. Therefore, it was important for us to create an experience with sufficiently high fidelity, immersion and presence before presenting it to users. The aim was to get feedback based on the users' more concrete feelings and thoughts regarding the VR experience and its content. As a result, developing the VR experience took most of the project's time. For this reason, we decided that the purpose of the user tests was to create a basis for future work, should someone else continue the project. In other words, none of the feedback was implemented in the VR experience presented in this thesis. It is instead compiled and presented in section 6.3 Results from the user tests.
5.10.1 Execution of user tests

The participants were recruited through convenience sampling and were persons close to High and Low in Kuggen at Chalmers Lindholmen. We gathered five participants, all students of the Interaction Design and Technologies (IXD) programme at Chalmers University of Technology and versed in UX design. All of the participants received verbal information about how we would conduct the tests and how their answers would be used. Through this, we also received their informed consent.

None of the tests were recorded. However, we used Microsoft Teams' transcription tool to make the analysis of the responses easier and more time-efficient. Furthermore, we decided beforehand that one of us would ask the questions and the other would take notes of the participants' answers.

The tests were conducted individually, starting with a brief introduction about our project, how to navigate in the VR world, and an explanation of the buttons on the controllers. Before the participants could start the VR experience, we also asked them about their previous experience with VR. The participants then put on the VR headset, and we checked the size and fit. During the VR experience itself, we encouraged them to describe out loud what they saw and felt instead of us asking continuous questions. After the experience, we asked questions about where they think this type of concept can be used, their development suggestions, and whether they understood all the parts they saw. All questions before, during and after testing the VR experience can be found in Appendix D.

5.10.2 Compilation of user tests

After all the tests were finished, we went through the auto-generated transcriptions, all of which had many disjointed parts. As mentioned earlier, the goal of the tests was to create a basis for future work, and therefore we did not see the need to have direct quotes from the users.
Instead, we wanted to extract some suggestions for future work. For this reason, to make the transcriptions reasonably easy to understand, we went through each transcription, removed unnecessary filler words, and rewrote disjointed sentences. We then reviewed each revised transcription and copied relevant parts to sticky notes. When all parts were extracted, we began to group the various notes, see Appendix E. These groups then contributed to concluding what the participants considered opportunities for future work. This is explained in section 6.3 Results from the user tests.

Worth mentioning is that during the tests it emerged that all participants previously had little to no experience with VR. This is a factor that could have affected their input, since their focus most likely was on the VR technology itself. That, in turn, may have limited the amount of in-depth analysis and suggestions we received. On the other hand, this is a proof of concept where there is still a need for more details, and for that reason the possibility for an in-depth analysis was perhaps already limited.

6 Results

Our final concept is a VR reconstruction of the Vasaplatsen shooting during the Gothenburg riots in 2001. During this reconstruction, the user can see Vasaplatsen and the sequence of events from the stone throwing at the police officers to the shooting of Hannes Westberg. The reconstructed experience is a total of 2 minutes and 10 seconds long. A link to a part of the reconstruction can be seen in Appendix F.

To avoid the user being thrown straight into the experience and seeing the chaos unfold, a text box greets them with background information about the event, see Figure 6.1. The information is also presented verbally by an AI voice-over, to avoid information being missed due to the difficulties with long texts in virtual reality. In this scene, the user can only rotate their head and hands to look around and interact with the interface.
Furthermore, the text box, and all other buttons, sliders and controls the user will be met by, are set to world space. This means that they are placed in the virtual world, where users can interact with them.

Figure 6.1: The introduction scene

When the users feel ready, they can press the button "Go to Vasaplatsen" to be moved to the reconstruction scene. Here the user is loaded into Vasaplatsen and sees another button that says "Start experience", see Figure 6.2. The reason for the additional button is to allow the user to learn the controls and navigation before the reconstruction sequence begins. This gives them the best possible chance of not missing any information or event. We also wanted the button to be one of the first things the user sees, and therefore it is not placed on their wrist.

Figure 6.2: The second button on the reconstruction scene

So-called rays are used to navigate around the world, see Figure 6.3. These are red lines that beam out of the virtual hands. By clicking the buttons on the physical controllers, users can activate the pointing-and-teleportation locomotion and get from point A to point B. The user also receives haptic feedback in the form of vibrations in the controllers when navigating through the scene.

Figure 6.3: Rays used by the user for locomotion and interactions

Constraints have been set up to prevent users from entering buildings or going outside the world. These constraints are not visible in the experience, but they are highlighted in red in Figure 6.4. Users can press the "Start experience" button to start the animations when they feel comfortable with the settings and controls.

Figure 6.4: Overview of the constraints

6.1 The reconstruction

Hannes and the police groups are embodied by so-called non-player characters (NPCs), see Figures 6.5 - 6.6.
These have been designed to make the same or similar movements as their real counterparts, as far as these can be discerned in du Rées' documentary. Hannes and police group 1 (stone throwing) are visible throughout the whole experience, while police group 2 (the reinforcement) is visible from 01:24. Furthermore, during the experience there are sequences where characters are semi-transparent. This is to visualise the uncertainty regarding their actions, positions or both.

Figure 6.5: Hannes and police group 1

The reconstruction starts with the user facing Vasagrillen and seeing Hannes and police group 1, see Figure 6.7. Until police group 2 comes into the picture, there is stone throwing and chasing from both sides. When group 2 becomes visible, they focus on reinforcing group 1, resulting in them forming a larger group. At the same time, Hannes is shot and begins to limp towards Vasagrillen, see Figure 6.8. In this context, it is important to remember that Hannes was not alone and had many protesters around him. However, this is not visible in the reconstruction, making it look like Hannes is alone against the police groups.

Figure 6.6: Police group 2

Figure 6.7: Start scene

Figure 6.8: End scene

6.2 Manipulating time

While the user sees the reconstruction unfold, they can manipulate the time through a square "tablet" on their arm, see Figure 6.9. The tablet is attached to the user's left arm, and they can interact with it with their right hand. On the tablet, there is a short functionality instruction, the number of seconds of the reconstruction that has passed, and a slider to rewind and fast-forward the time. There is no delay between the user's interaction with the slider and updates in the reconstruction sequence.

Figure 6.9: Time manipulation tablet

6.3 Results from the user tests

The five user tests we gathered, see section 5.10.1 Execution of user tests, provided numerous possibilities for future work.
These are presented below.

6.3.1 Other use cases

Reconstructions similar to this one, and VR reconstructions in general, can be used for tourism and educational purposes. For example, VR reconstructions can be made available at museums and exhibitions. Through them, one can reenact historical events, participate in them, and learn more. VR reconstructions can also be suitable for guided tours of places one does not have the opportunity to visit for various reasons, such as time, money or inaccessibility.

6.3.2 Improvements for visualising uncertainties

When visualising uncertainties, participants felt that there should be different visualisation techniques. Specifically for the reconstruction made in this thesis, it needed to become clearer what was uncertain and what was not. The users said they often failed to notice the semi-transparent effect on the characters during the different time sequences. In addition, it emerged that being able to turn the uncertainty visualisation on and off would be appreciated, to get the optimal experience.

6.3.3 Improve presence and immersion

Precisely for this reconstruction, it emerged that there is a need for more people and crowds to create a higher level of presence and immersion. This would also increase with the help of other details, such as the implementation of weapons and stones. It also emerged that most of the participants could identify the place as Vasaplatsen without us mentioning it. However, according to some participants, it needed to be shown more clearly, for example through signs that say "Vasaplatsen". One suggestion was to have these signs at the bus and tram stops.

6.3.4 Information during the experience

The users appreciated the information in the text box before the experience started, but it appeared that much of the information was quickly forgotten.
For this reason, it would be appreciated to have information continuously throughout the experience, for example by interacting with the characters and getting information from them while the event unfolds.

6.3.5 Audio in VR experiences

The sound proved efficient for giving context and increasing immersion and presence. The participants felt that, in this reconstruction, the sound made it clear that a lot was happening simultaneously: many people were present, shots were fired, and stones were thrown. However, there was a desire to distinguish important sounds and to communicate to the user that their focus should be drawn to the sound source. This was because one of the participants missed the final shot that hit Hannes and had to replay the sequence.

7 Discussion

The content-oriented model of user experience, mentioned in section 3.7.1 Content-Oriented Model of User Experience, and the keywords why, what and how have been used to try to keep the project on a linear track towards an enjoyable user experience. The hardest part during the development was the "why": not because we did not know what to include to make the recreation an exciting VR experience, but because we did not know how much we would manage to complete during this thesis. From the start, we knew that we wanted to make it possible to experience the Gothenburg riots in VR, recreate Vasaplatsen, and include some motion tracking. In the end, we managed to include these as well as audio, light, uncertainty visualisation, background information, time manipulation and street objects. For the "what", we wanted to create a VR experience where the user, to some extent, gets to experience the Gothenburg riots and the shootings at Vasaplatsen. This meant that a digital reconstruction needed to be included. For the "how", we wanted the locomotion in the VR experience to be effortless, which was thoroughly researched.
The choice of locomotion, namely pointing and teleporting, was mainly based on the fact that we did not want users to be affected by motion sickness. We also included a time manipulator, which allows the user to change the time within the event they are in. The "why" ended up being the fact that one can experience something that has already happened and go back to it multiple times. Since the user is experiencing a historical event, we did not want them to interfere with the events in the VR experience. The user is therefore able to do two things in the VR experience: (1) move around in the world and (2) manipulate the time. For the "how", we mentioned that the project's aim was a proof of concept. This meant that the look and feel of the experience would have some rough edges and be in need of further work once the thesis is done. Our goal was, however, to have a VR experience that resembles the real event enough that the user gets an understanding of how the event unfolded back in 2001.

7.1 Motion tracking

A central difficulty during this thesis has been the motion tracking of relevant people. Although the event was at the time one of the most filmed, there is a lack of clips with sufficient resolution to be able to apply artificial intelligence or machine learning for motion tracking. Also, as noted, the film clips from the event are from different angles and not long enough for the relevant key people to always be in frame. Apart from this, there were also moments when Hannes and the police blended into the rest of the crowd. For this reason, it was of great importance for us to be transparent and to visualise this uncertainty, to achieve a credible result and counteract bias. Some methods can perform markerless motion tracking with unsynchronised video footage [50].
These methods have other ways of capturing the people they want to motion track, for example through audio from multiple cameras or by automatically reconstructing background geometry [50]. They are used when the person in the frame is also the focus of the video clip [50]. The video footage we had access to does not have these attributes. The biggest issue is the low resolution of the footage. The other is the chaotic scenes we are depicting, with clusters of people moving in front of each other and in and out of the frame. The audio comes from one channel throughout all of the footage, making it hard to distinguish which camera the audio comes from. All these reasons, and the fact that we had no prior knowledge of motion tracking, made us choose the manual method described in section 5.4 Motion tracking of key persons.

The work has been based on the documentary by du Rées and clips from Uppdrag Granskning. Among other things, the documentary was used to note down important timestamps and write a script for the VR experience. This simplified the process, but may also have been a source of error, as time synchronisation errors were discovered in some parts of the documentary. For example, if Hannes disappeared from the frame in clip A while we were analysing him, we had to rely on clip B, where he was visible. Between these clips, sync errors could occur, meaning Hannes could appear a few seconds later than he originally would have done in the second clip. This may have resulted in Hannes being placed in the wrong place at a certain time in our VR experience. The same applies to other objects and people, for example whether stones are thrown at the right time or whether the police teams enter the scene at the right time. This error could be minimised or counteracted in some cases when a third camera was available and provided a clip C.
This clip acted as a bridge between clips A and B, and thus Hannes was shown continuously across three different camera angles and clips. It is worth noting that the synchronisation error has not had any serious consequences for the work, as it only amounted to a few seconds, but it has nevertheless affected the final VR experience and its various time sequences.

The combination of multiple cameras can also be used to further reduce the uncertainty in our reconstruction. By projecting the video material from the du Rées documentary, and specifying the camera positions from the documentary, onto the project in Unity, see Figure 5.1, one can motion track the key people more precisely. In the same way as above, one can see on camera A where the key person is, and then compare this with cameras B and C. Through this, one can triangulate the positioning and motion tracking and create a more reliable result in Unity.

Another aspect to remember is that the police officers were much easier to distinguish from the crowds than Hannes, due to their uniforms. However, they were difficult to distinguish from each other, which in this work has led to us only identifying the police groups that were the largest (group 2, the reinforcement) or that had a central role in the incident (group 1, stone throwing). The difficulty in identifying individual police officers, as well as the lack of clips where they are always visible, has also meant that we can only speculate about which police officer(s) fired the shots, including the one that hit Hannes.
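The triangulation idea sketched above amounts to intersecting bearing rays on the ground plane: each camera contributes a position and a direction towards the key person, and the point where two rays cross is the estimated position. The following Python illustration is a hedged sketch of that geometry; the camera positions and bearings are hypothetical.

```python
import math

def triangulate(cam_a, bearing_a, cam_b, bearing_b):
    """Intersect two bearing rays (angles in radians on the ground
    plane) to estimate a person's position from two camera views."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(bearing_a), math.sin(bearing_a)
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)
    # 2D cross products: solve cam_a + t*dA = cam_b + s*dB for t
    denom = dax * dby - day * dbx
    if abs(denom) < 1e-9:
        return None  # rays are (near-)parallel: no reliable fix
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return (ax + t * dax, ay + t * day)

# Two cameras 10 m apart, both sighting the same person at 45 degrees
# from their respective positions; the rays cross at (5, 5).
pos = triangulate((0.0, 0.0), math.pi / 4, (10.0, 0.0), 3 * math.pi / 4)
```

A third camera, as in the clip C scenario above, would over-determine the position and allow the disagreement between the pairwise intersections to be used as a measure of the remaining uncertainty.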