Data-driven inference approach for
integration between shared micro-mobility
and public transit with empirical
analysis

Master’s thesis in Infrastructure and Environmental Engineering

HAO LI
LOUIS INKUMSAH

DEPARTMENT OF ARCHITECTURE AND CIVIL ENGINEERING

CHALMERS UNIVERSITY OF TECHNOLOGY

Gothenburg, Sweden 2024

www.chalmers.se




© HAO LI, 2024.
© LOUIS INKUMSAH, 2024.

Supervisor: Ruo Jia, Department of Architecture and Civil Engineering
Examiner: Kun Gao, Department of Architecture and Civil Engineering

Master’s Thesis 2024
Department of Architecture and Civil Engineering
Division of Urban Mobility
Urban Mobility Systems Group
Chalmers University of Technology
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Cover: E-scooters parked in front of the architecture and civil engineering faculty
building, with one hanging from a tree branch.

Gothenburg, Sweden 2024

Abstract
E-scooters are here to stay, with the market projected to grow by about 10% annually
until 2030. The industry is seen as a means of promoting environmental and
socio-economic sustainability. Because e-scooters are a more flexible form of
transport, integrating them with public transit for first- and last-mile trips is
currently the most widely promoted use. However, very few policies govern them, and
very little research exists to fully understand the impact of integrating e-scooters
with public transport. Our research uses machine learning and a k-prototype technique
to analyse the usage patterns, seasonal effects and effects on points of interest
(POI) of first- and last-mile trips within the city of Gothenburg. We found that the
closer an e-scooter was to a stop, the more its use for integration was encouraged,
especially in winter, when integrated trips declined by about 62% compared with 70%
for non-integrated trips. This indicates a stronger desire for integrated trips in
winter than in summer. We also found that 80% of the city's e-scooter trips
substituted for public transit and 20% complemented it, with usage most common on
Wednesdays and Thursdays between 12:00 and 14:00 or between 14:00 and 16:00. The
highest counts of integrated trips were found in the city centre, at locations with
multi-modal transport and dense activity, including commercial, recreational and
other areas, whereas the highest integration rates mostly occurred in less dense
suburban areas with less efficient transport. One location, "Stenpiren", dominated in
both integrated trip count and integration rate. Finally, we found that the weather
affects the number of trips but not the perception of usage, as the integration
patterns were similar.

Keywords: E-scooter, Bus Public Transport, First- and last-mile, Micro-mobility,
Seasons, Point of interest, temporal, space, Integration, Gothenburg.

Acknowledgements
I am glad to have had a colorful experience in Sweden. I did not expect to come so
far, and travelling this road with friends has taken me further. Without people’s
help, I might already have had a different life. I want to express my deepest thanks
to Chalmers, which provides all kinds of possibilities.
Then I am so grateful to the examiner, Kun Gao, and the supervisor, Ruo Jia.
They gave us the topic and offered a platform for us to shape different careers in
the transportation field. Also, Louis has been my reliable and hardworking partner
throughout the thesis work, composing and guiding the thesis.
Additionally, I want to thank my friends from everywhere, who show the different
colors of the world. They have turned my Swedish life into a unique one. Their
company is my biggest comfort. Especially when I was depressed and dazed, their
words and tips gave me the greatest power to make the path come true.
Thanks to everyone who showed their warmth; I hope that warmth can spread for
eternity.

Hao Li, Gothenburg, June 2024

I am, first of all, grateful to God. I extend my deepest gratitude to the Swedish
Institute (SI) for their invaluable support and contribution to my education. Special
thanks to the teachers and staff of Chalmers University of Technology, particularly
the Department of Architecture and Civil Engineering, for their unwavering support
and guidance.
I am immensely grateful to my examiner, Kun Gao, and my supervisor, Ruo Jia, for
their insightful feedback and encouragement. I also appreciate my thesis partner,
Hao Li, for his exceptional work in managing our data.
Above all, I thank my wife for her unwavering support and being my number one
cheerleader, my mother for her thoughtfulness, and my siblings for their persistent
reminders to complete my thesis.
Thank you all for making this journey possible.

Louis Inkumsah, Gothenburg, June 2024

List of Acronyms

Below is the list of acronyms used throughout this thesis, in alphabetical order:

CBD Central Business District
CO2 Carbon Dioxide
DBS Dockless Bike-Sharing
DB-SCAN Density-Based Spatial Clustering of Applications with Noise
EPSS Electrically Powered Standing Scooter
E-scooter Electric Scooter
GIS Geographic Information System
GPS Global Positioning System
GTFS General Transit Feed Specification
ID Identification
kg Kilograms
kJ Kilojoules
km Kilometres
km/h Kilometres per hour
LiDAR Light Detection and Ranging
MARTA Metropolitan Atlanta Rapid Transit Authority
OD Origin-Destination
PCA Principal Component Analysis
POI Point of Interest
SDG Sustainable Development Goals
SEK Swedish Kronor
SKF Svenska Kullagerfabriken
sqm Square metres
UITP International Association of Public Transport
USA United States of America
USD United States Dollars
% Percent

Contents

List of Acronyms
List of Figures
List of Tables

1 Introduction
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.3.1 Research Question
1.4 Limitations

2 Literature Review
2.1 Micro-mobility as a Service (MaaS)
2.2 Micro-mobility Safety and Policies
2.3 E-Scooter and Sustainability
2.3.1 Environmental Impacts
2.3.2 Socio-Economic Impacts
2.4 Micro-mobility User Behaviour
2.5 Integration between Shared Micro-Mobility and Public Transit
2.6 Related Research Methods Explored for Integration
2.6.1 Integration Analysis Processes
2.6.2 Data collection and categorisation
2.6.3 Data filtration and wrangling Methods
2.6.4 Buffer Ranges
2.6.5 Scenario Analysis
2.6.6 Data Analysis Methods
2.6.7 Methods of Machine Learning
2.6.8 Cluster Analysis and Data Visualisation
2.7 Micro-mobility in Gothenburg
2.8 Summary

3 Methodology
3.1 Research Design
3.2 Study Area
3.3 Data Collection
3.4 Data Sample
3.5 Data Wrangling
3.5.1 Bus Transit and Stops
3.5.2 E-scooter
3.5.3 POI
3.5.4 Buffering
3.6 Integration
3.7 K-Prototype Analysis
3.7.1 Algorithm Steps
3.7.2 Standardisation of Results
3.8 POI Analysis

4 Results
4.1 Transfer Distance and Possible Integration
4.2 Seasons and Temporal Analysis
4.3 Cluster Integration Analysis in Comparison with Seasons
4.3.1 Clusters Analysis for Summer
4.3.2 Clusters Analysis for Winter
4.4 Seasonal Geo-spatial Integration Analysis
4.5 POI Results

5 Discussion
5.1 Access/Egress and Distance analysis
5.2 Seasons and Temporal Analysis
5.3 Geo-Spatial with Cluster Analysis
5.4 POI with Cluster Analysis

6 Conclusion
6.1 Research Question 1
6.2 Research Question 2
6.3 Research Question 3
6.4 Future Research Recommendations

Bibliography

A Appendix A
A.1 Land Use POIs
A.2 Building Use POIs (Additional)


List of Figures

3.1 Illustration of Research Design
3.2 Public Transit Map of Gothenburg in 2022, source: Jens Svanfelt, 2018
3.3 Illustration of Data Wrangling Process
3.4 Bus GTFS filtering
3.5 Illustration of Buffer zones and Analysis considerations

4.1 2-Dimensional histogram of trip distances to nearest bus stop
4.2 Total number of trips
4.3 Total daytime trips
4.4 Bar chart of Integrated and Non-integrated Trips
4.5 Bar chart of Count of e-scooter trips per Cluster
4.6 Cluster Analysis Heatmap for Summer
4.7 Cluster Analysis Heatmap for Winter
4.8 Geo-spatial map of Summer’s Count of Integrated Trips
4.9 Geo-spatial map of Winter’s Count of Integrated Trips
4.10 Geo-spatial map of Summer’s Trip Integration Rate
4.11 Geo-spatial map of Winter’s Trip Integration Rate
4.12 Geo-spatial map of Total Count of Integrated Trips
4.13 Geo-Spatial map of Total Trip Integration Rate
4.14 Heatmap of POI for Summer
4.15 Heatmap of POI for Winter


List of Tables

2.1 Summary of Integration Analysis Processes
2.2 Summary of Data Collection Efforts
2.3 Summary of Data filtration and wrangling Methods
2.4 Summary of Buffer Ranges
2.5 Summary of Scenario Analysis
2.6 Summary of Data Analysis Methods
2.7 Methods for Machine Learning
2.8 Summary of Cluster Analysis and Data Visualisation
2.9 Specifications and requirements for different vehicle types

3.1 Data Overview
3.2 Input of mixed data Xi for Analysis

A.1 POI parameters for land use in Sweden
A.2 POI parameters for Building use in Sweden


1
Introduction

Transportation serves as an essential element of infrastructure and a foundation
for the structure of modern society. The sector involves the organised movement
of people and goods across different geographical locations via a variety of modes
(Kacher & Singh, 2021). The significance of transport extends beyond its immediate
practicality in facilitating human mobility and trade. It plays an important role in
promoting cultural connections and unity, which are essential for global peace and
collaboration.

The industry, however, faces significant challenges. The Intergovernmental Panel on
Climate Change (IPCC, 2022) demonstrated a significant influence of transportation
on climate change. For 2019, it reported that transportation alone accounted for
approximately one-quarter of worldwide carbon dioxide (CO2) emissions, with the road
transport sector responsible for approximately three-quarters of these emissions.
The predominance of road transport in these emissions reflects the world’s
dependence on vehicles powered by fossil fuels, with continuous economic advancement
and urbanisation leading to a surge in personal vehicle ownership. As a result, road
transport has become a primary target for combating climate change.

1.1 Background

Over time, developed cities have had an increased demand for innovative modes
of travel (Kamargianni et al., 2016), resulting in the creation of shared passenger
transport. This has resulted in a more expansive notion known as shared mobility,
which refers to a wide range of transport services used by several people, either
one after another or at the same time. The idea encompasses a range of transporta-
tion options, including public transport, micro-mobility (such as bikes and scooters),
automobile-based modes like car-sharing and trips on demand, as well as commuter-
based modes (Kersten Heineke et al., 2023). In service-based terminology, they are
commonly labelled as ride-sharing, car-sharing, bike-sharing, and scooter-sharing
and viewed as a subset of the "sharing economy", a continuously evolving concept
that eliminates focus on individual ownership and runs parallel with technological
developments (Shaheen et al., 2015).

The advent of technology, especially the widespread adoption of GPS-enabled
smartphones, revolutionised urban mobility. The ability of individuals to access
real-time transit information has spurred the development of applications that help
optimise transport mode selection (Pakarinen et al., 2023). As a result, shared
micro-mobility services gained traction almost a decade ago with the introduction of
bike-sharing programmes in various cities (Bloom et al., 2021). The vehicles were
presented as a transformational and convenient urban transportation alternative since
they are portable and can be ridden on streets and bicycle lanes, allowing riders to
choose faster or shorter routes (Ma et al., 2022). Particularly in areas where public
transit routes do not adequately cover all neighbourhoods or align with users’
schedules, commuters used them for first- and last-mile trips to complement the
rather rigid public transit (Zuo et al., 2020). This fusion of technology has
progressively encouraged commuters to shift away from conventional private vehicles
and combine trips with biking and public transport.

The idea that a single mode of transportation can satisfy all of a commuter’s
demands is becoming less and less tenable. Multi-modal transport, in which several
forms of transportation are integrated into a single system, has drawn much attention
in recent years for its potential to solve urban problems (UITP, 2023). A concept
such as "Park and Ride" allows commuters to park their vehicles at predetermined
locations and continue their trip with public transportation (Meek et al., 2008).
Although this strategy was meant to encourage public transport use and lessen the
number of personal vehicles entering crowded urban areas, its results were varied.
The proximity of the transit station or stop to the commuter’s point of origin also
affects the use of flexible transportation options (Yang et al., 2024). Commuters are
more inclined to use various transportation options, including walking, to reach
their destination when public transportation is more readily available and reasonably
close. Interestingly, the city of Gothenburg, through Västtrafik, is carrying out a
similar programme supported by "commuter parking" for vehicles and bicycles
(Västtrafik, n.d.).

The e-scooter market was valued at about USD 37 billion in 2023 and is predicted
to reach approximately USD 78 billion by 2030. The same forecast puts the industry’s
potential growth at a little under 10% per year until 2030, an indicator that demand
for this technology is increasing and that more people are investing in it (Grand
View Research, n.d.). Over an almost decade-long growth period, the contributions of
the micro-mobility industry cluster in Gothenburg (the second-largest city in Sweden)
have been remarkable since 2018, with growth in tourism activities mirrored by growth
in micro-mobility services (Business Region Göteborg AB, 2022). The relationship
between micro-mobility usage and tourism suggests that the two not only support each
other’s growth but also reveal a gradual acceptance of micro-mobility services as an
integral part of urban transportation.

Gothenburg’s "The Bicycle City" project is a noteworthy investment indicator. The
city has committed SEK 473 million to enhance and expand cycling infrastructure,
which includes extending the existing 37,000 sqm lane and adding 3 km of lanes
in 2022 (City of Gothenburg, 2022). This commitment shows the Gothenburg city
authority’s strategic approach to enhancing micro-mobility and e-scooter ridership as
it makes specific infrastructure provisions for e-scooters. Such investments are a
clear indication of a global desire to continuously develop and invest in the industry.

1.2 Problem Statement

There is widespread concern about the safety, environmental impact, and service
cost of e-scooters, although users generally have a more favourable view of them.
Despite this, there is still a strong consensus on the need for stricter regulations
and better support infrastructure (Wallgren et al., 2023). Understanding commuters’
behavioural patterns and use of shared e-scooters is essential in shaping such
regulations and infrastructure, especially to assess whether e-scooters complement
traditional public transit systems or are used independently. In some cities, there
is increasing recognition of the potential of shared mobility as a sustainable
transport mode, with advocacy for its inclusion in urban strategies and for more
appropriate infrastructure and regulatory support (Laa & Emberger, 2020).

Generally, micro-mobility services are considered important in attaining environ-
mental sustainability as they help reduce congestion and potentially greenhouse gas
emissions (Kjærup et al., 2021). With that, it is important to evaluate their influ-
ence on urban transportation (König et al., 2022; Krier et al., 2021), especially since
there are no clearly defined laws to govern them now. An evaluation of their impacts
on integration with public transport and first- and last-mile solutions is necessary
to improve their overall efficacy (Cai & Liang, 2021; Saberi et al., 2018).

1.3 Objectives
The objective of this research is to develop data-driven algorithms using naturalistic
data to analyse the integration patterns between shared micro-mobility and public
transit in Gothenburg. The study aims to achieve the following objectives:

• Examine the present user patterns in the integration of Gothenburg’s public
transport system with shared micro-mobility services, with a particular em-
phasis on e-scooters.

• Analyse the patterns of bus and e-scooter integration between key seasons of
the year.

• Assess the variations in user behaviour and preferences when integrating public
transport and e-scooter services in locations with varied levels of activity.

1.3.1 Research Question

To align with the study’s objectives, the following questions were developed to help
in better understanding and shaping the research’s focus:

1. How does the mixed use of shared micro-mobility with Gothenburg’s public
transport system impact connectivity and user choices?

2. How do seasonal and temporal variations affect the coordination between bus
arrivals and e-scooter departures in areas with different integration levels?

3. How does the coordination of bus and e-scooter services affect public transport
use in areas with different levels of activity?

1.4 Limitations
• The GTFS data was limited to buses only. We, however, observed stops interacting
with other transport modes, which may or may not be a main contributor to the
integration observed.

• The access and egress radius of 50 m for integrated trips and the 200 m buffer
for POI were selected based on assumption.

• Hardware capacity limitations, specifically in memory and processing power,
had a significant impact on the execution times of Python computations.
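
To make the second limitation concrete, the sketch below shows how a fixed radius turns a trip endpoint into a candidate access/egress point. It is a minimal illustration and not the thesis pipeline: the coordinates, the helper names `haversine_m` and `within_radius`, and the use of a great-circle distance are all assumptions made for the example. Only the 50 m and 200 m values come from the text.

```python
import math

ACCESS_EGRESS_RADIUS_M = 50.0   # radius for integrated trips, as stated above
POI_BUFFER_M = 200.0            # POI buffer, as stated above
EARTH_RADIUS_M = 6_371_000.0    # mean Earth radius (assumed spherical Earth)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(point, stops, radius_m=ACCESS_EGRESS_RADIUS_M):
    """Return True if any stop lies within radius_m of the given point."""
    return any(haversine_m(*point, *stop) <= radius_m for stop in stops)


# Hypothetical coordinates, for illustration only.
stops = [(57.7000, 11.9700)]
near_end = (57.7003, 11.9700)   # roughly 33 m north of the stop
far_end = (57.7020, 11.9700)    # roughly 222 m north of the stop

print(within_radius(near_end, stops))               # inside the 50 m radius
print(within_radius(far_end, stops))                # outside the 50 m radius
print(within_radius(far_end, stops, POI_BUFFER_M))  # also outside 200 m
```

Because the classification is a hard threshold, an endpoint at 49 m and one at 51 m fall into different groups, which is why the choice of radius matters for the reported integration counts.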

2
Literature Review

With a rapidly growing interest in emission-free transportation, scholars and pol-
icymakers are beginning to pay much attention to shared micro-mobility services
and how to maximise their benefits in development. While there has been ample
research on bike-sharing users and integration patterns, few studies have covered the
field of shared e-scooters. As such, this chapter focuses on reviewing works that are
related to micro-mobility as a service, the usage and integration patterns of shared
micro-mobility, and related research methods used to assess them.

2.1 Micro-mobility as a Service (MaaS)

The introduction of rental e-scooters in several cities across the United States in
2017 was an important milestone in the evolution of urban transportation (Cittadini
et al., 2022). They were easily accepted into urban life, as they made it much easier
for people to travel short distances quickly and improved their ability to connect
with larger public transportation systems when needed. Naturally, they were important
in transforming the landscape and providing a practical alternative to traditional
transport methods (Shaheen et al., 2021), especially considering that most of them
are electrically powered.

The industry on its own has proven to be highly efficient and flexible, particularly in
facilitating short trips to and from public transport hubs. Often referred to as
first-mile or last-mile solutions, these trips function as connections supplementing
journeys between different modes of transit (Kapuku et al., 2021; Wang & Shen, 2022).
Although the intention is for e-scooters to be used in this manner, actual usage does
not always match it, as commuters do not always have fixed or pre-planned travel
patterns.

Shared micro-mobility involves the collective use of low-speed, lightweight vehicles.
Commuters share the vehicles or access them temporarily instead of owning them
outright (Shaheen et al., 2021). A more specific definition of micro-mobility
considers them a class of vehicles limited to a maximum weight of 350 kg and a top
speed of 45 km/h. This restriction keeps the kinetic energy of these vehicles to just
27 kJ, which is considerably lower than that of a standard car travelling at high
speeds, thereby reducing both energy consumption and the potential for harm (Haji et
al., 2023). When defining micro-mobility, multiple factors are considered, including
vehicle speed, weight, intended use, and size. Typically, the definition is rooted in
the vehicles’ characteristics and the urban infrastructure that best suits them, and
in their predominant use for short-distance travel (Zarif et al., 2019). In the UK,
the e-scooter is formally known as the Electrically Powered Standing Scooter (EPSS)
and is described as a vehicle that resembles a traditional scooter but is designed
for a single rider in a standing position. What sets the EPSS apart is its electric
powertrain and the placement of propulsion controls on a handlebar, which enhance
manoeuvrability and ease of use. Other single-rider vehicles, such as segways,
electric skateboards, and self-balancing vehicles, are clearly classified and cannot
be used on roads. This means that regulators can focus on safety and on making sure
that e-scooters work with existing infrastructure (RoSPA, 2020).
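
The 27 kJ figure quoted above can be checked from the two limits in the definition using the standard kinetic energy formula E = 1/2 m v^2. The short sketch below does the arithmetic. The car used for comparison (1,500 kg at 100 km/h) is an assumed, illustrative value and does not come from the cited source.

```python
MAX_MASS_KG = 350.0     # maximum vehicle weight in the definition above
MAX_SPEED_KMH = 45.0    # top speed in the definition above


def kinetic_energy_kj(mass_kg: float, speed_kmh: float) -> float:
    """Kinetic energy in kilojoules, E = 1/2 * m * v^2, with v in m/s."""
    speed_ms = speed_kmh / 3.6            # convert km/h to m/s
    return 0.5 * mass_kg * speed_ms ** 2 / 1000.0


limit_kj = kinetic_energy_kj(MAX_MASS_KG, MAX_SPEED_KMH)
print(f"Micro-mobility limit: {limit_kj:.1f} kJ")   # 27.3 kJ

# Assumed comparison vehicle, for illustration only.
car_kj = kinetic_energy_kj(1500.0, 100.0)
print(f"Illustrative car: {car_kj:.0f} kJ")
```

At 45 km/h the speed is 12.5 m/s, so the limit works out to 0.5 × 350 × 12.5² ≈ 27.3 kJ, consistent with the rounded value in the text.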

2.2 Micro-mobility Safety and Policies

Urbanisation is a continuous process that has resulted in a significant increase in
population density as people move to cities for improved economic prospects and a
better quality of life. This phenomenon generally attracts a greater percentage of
highly educated people. The increased adoption of e-scooters as a preferred mode
of transportation has been linked to this trend (Heumann et al., 2021; Jiao & Bai,
2020), but their widespread acceptance is directly attributed to the convenience
and accessibility they offer. This acceptance, however, has not been without its
challenges.

A notable consequence of the increased acceptance is the rise in safety-related
incidents. Blomberg et al. (2019) wrote about the worrying trend of accidents
involving riders and pedestrians, with the number of life-threatening incidents
surprisingly higher among pedestrians, leading to a rise in public health concerns
and calls for urgent measures to address them. The behaviour of e-scooter users
reveals several concerning patterns: riders frequently ignored traffic signals and
tended to increase their speed when approaching red lights, thereby exacerbating
accident risks (Guo et al., 2014), and some rode at speeds of up to 27 km/h,
occasionally exceeding this on downhill paths (Mayhew & Bergin, 2019). To mitigate
these effects, regulatory bodies imposed restrictions, some of which focused on
speed, location, and time. One example is setting speed limits to 20 km/h during the
day and reducing them to 15 km/h at night (Pakarinen et al., 2023). Apart from the
aforementioned characteristics, James et al. (2019) identified other key contributors
to injuries, including improper parking that obstructs pedestrian traffic and leads
to trips and falls, use during the night, and riding at excessive speeds. These are
particularly problematic as they not only increase the risk of accidents but also
contribute to urban clutter and pose a direct hazard to pedestrians (Knapp et al.,
2014).

In addition to regulatory measures, technological solutions have been explored to
enhance safety. One such innovation is "geo-fencing", implemented directly on the
service provider’s maps to restrict e-scooters from operating in certain areas or
exceeding specific speeds within designated zones. According to Sharp (2019),
geo-fencing has been perceived by some as more effective than traditional regulatory
approaches. Geo-fencing is not perfect, however: technical problems such as GPS
multi-path interference can produce incorrect positions unless supplementary
technologies like LiDAR or cameras are used (Miura & Kamijo, 2015). Incorporating all
these technologies increases the cost of service, which is often passed on to the
users in the form of higher fees.
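
As a rough illustration of the geo-fencing idea described above, the sketch below tests whether a reported position lies inside a zone polygon and returns the speed cap that would apply. It is a simplified sketch under stated assumptions, not any operator's implementation: the zone geometry, the 6 km/h zone cap, the local planar coordinates in metres, and the function names are hypothetical. The 20 km/h default mirrors the daytime limit cited earlier in this section.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def allowed_speed_kmh(x, y, zones, default_kmh=20.0):
    """Lowest cap among the zones containing the point, else the default."""
    caps = [cap for polygon, cap in zones if point_in_polygon(x, y, polygon)]
    return min(caps + [default_kmh])


# Hypothetical slow zone: a 100 m x 100 m square with a 6 km/h cap.
slow_zone = ([(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)], 6.0)
zones = [slow_zone]

print(allowed_speed_kmh(50.0, 50.0, zones))    # inside the zone
print(allowed_speed_kmh(150.0, 50.0, zones))   # outside the zone
```

The positioning problem mentioned above shows up directly in such a test: a position error of a few metres near the zone boundary is enough to flip the result, which is one reason supplementary sensing is discussed.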

As part of a larger effort to address safety concerns, stricter regulatory and
enforcement measures are required. Some noteworthy proposed strategies include
implementing visible identification tags on shared e-scooters (Datava et al., 2022).
The idea would allow bystanders and authorities to easily identify and report
improper behaviour, with the possibility of imposing direct fines on e-scooter
companies (Lin et al., 2023). This is conceived as a way to incentivise companies to
educate their users and manage their fleets more responsibly.

Another aspect influencing e-scooter safety is the purpose of their use. The recre-
ational use of e-scooters often leads to a higher chance of injury as compared to
commuting on foot (Coelho et al., 2021). This distinction underscores the variability
in rider behaviour and risk exposure based on how e-scooters are used, suggesting that
safety measures need to be tailored to suit different use cases.

Salas-Niño (2022) pointed out the challenges many urban areas face in crafting ef-
fective guidelines that not only promote safety but also integrate e-scooters into the
broader transport ecosystem in a sustainable manner. The difficulty lies in balancing
unpredictable usage with public safety and urban design, necessitating a dynamic
approach to policy-making that can adapt to the evolving nature of the industry.

Sweden, similar to many developed nations, places a high priority on safety across
various facets of public life, particularly in the realm of transportation. The country
is well-known for its dedication to road safety, which takes a proactive approach that
always prioritises the well-being of all road users. The "Vision Zero" programme is
a game-changer among Sweden’s safety initiatives. It represents a paradigm shift
in traffic safety philosophy, based on the belief that no loss of life or serious in-
jury is acceptable, as opposed to traditional safety programmes, which frequently
accept some level of risk. The programme contends that the road system should
be inherently safe and that drivers should not bear sole responsibility for accidents
and injuries. Instead, multiple stakeholders, including vehicle manufacturers, infras-
tructure planners, and policymakers, collaborate to ensure safety (Belin et al., 1997).

While some cities have implemented regulations to address these safety concerns,
the effectiveness of such measures remains questionable. Enforcing speed limits and
mandating helmet use for e-scooter riders are positive steps, but without strict en-
forcement and penalties for non-compliance, these regulations may not effectively
mitigate the safety risks associated with e-scooter usage (Kleinertz et al., 2023).

We cannot ignore the safety risks that the increased use of e-scooters poses. The
potential for conflicts, collisions, and injuries in urban environments should prompt
a re-evaluation of the widespread use of e-scooters in transportation networks. The
drawbacks of e-scooters for public safety may outweigh their advantages if there
aren’t extensive and effective safety measures in place.

2.3 E-Scooter and Sustainability

Transportation is a key component of urban development. It supports economic
growth by facilitating trade and access to locations. Still, it must be managed to
balance ecological impacts, such as emissions and land use, with social and economic
benefits. Effective transport systems reduce travel time, enhance safety, and promote
affordability, while poor planning can worsen social inequalities (City of Gothenburg,
2018). Drawing on the United Nations’ Agenda 2030, Trafikverket (2018) highlighted
the interconnected nature of safety and sustainability in transportation. The agenda
underlines the role of sustainable transportation in building a sustainable society
and sets out a framework of specific targets that aim to transform transportation
systems worldwide. Specifically, sub-target 3.6 aimed to halve the number of road
traffic fatalities and serious injuries by 2020, a goal that underscored the urgency of
addressing road safety issues. Additionally, sub-target 11.2 set forth the objective
to ensure a safe, affordable, accessible, and sustainable transportation system for all
by 2030, reflecting a holistic view of transportation that integrates environmental,
social, and economic dimensions.

Even though Agenda 2030 sets clear goals, research by Trane et al. (2023) has
shown that the Sustainable Development Goals (SDGs) have mostly been focused
on the environmental aspects, leaving the social aspects out of the picture. This
gap is significant as social aspects such as equity, accessibility, and safety are inte-
gral to the holistic achievement of sustainability. Environmental sustainability in
transportation often covers topics like reducing greenhouse gas emissions and pro-
moting energy-efficient vehicles. In contrast, the social dimensions involve ensuring
that transportation systems are designed to be inclusive, catering to the needs of all
segments of society.

Recognising this imbalance, our research intends to adopt a more integrated ap-
proach by delving into both the environmental and social dimensions of the SDGs
related to transportation. By examining literature that encompasses both of these
aspects, we aim to provide a more comprehensive understanding of what truly con-
stitutes sustainable transportation.

2.3.1 Environmental Impacts

The ease of use and environmental advantages of e-scooters are driving
their rising urban popularity (Glenn et al., 2020). In busy cities, where commuters
routinely grapple with traffic jams, high parking fees, and limited parking spaces,
e-scooters emerge as a viable alternative. They allow users to weave through con-
gested streets and reach their destinations swiftly and effectively. In contrast to
traditional vehicles such as cars and taxis, e-scooters offer a nimble and adaptable
means for covering short urban distances. They enable riders to easily navigate
through traffic, sidestepping jam-packed roads and thus facilitating quicker arrivals
at various destinations. This level of convenience renders e-scooters an attractive
option for daily commutes, errand running, or urban exploration.

Throughout their life-cycle, e-scooters also have significant environmental impacts,
encompassing the energy consumed during their production, transport, and the
processes of charging and recharging, up to their disposal. These environmental
impacts are mitigated when the energy sources used are eco-friendly (Neves et al.,
2024). The e-scooter itself has no tailpipe emissions during operation, putting it in
an environmentally friendly category. According to research, combining shared
mobility services with public transportation can lead to fewer people owning cars and
lower overall transportation costs. It can also lead to more people using public tran-
sit, which can encourage more environmentally friendly transportation habits (Koo
& Choo, 2022; Stiglic et al., 2018; Yan, Zhao, Han, Van Hentenryck, & Dillahunt,
2019; Zhang & Zhang, 2018). While sharing physical items typically yields more
substantial environmental benefits than sharing spaces or transportation services,
the latter can sometimes be detrimental to the environment. Contrary to previous
assumptions, peer-to-peer sharing does not always result in greater environmental
benefits than centralised ownership models (Meshulam et al., 2024). An easy way to
consider such statements is to use ride-hailing vs. public transport as an example.
When an app is used to book a trip, it is presumed to be a more environmentally
friendly alternative than using a privately owned car. However, the inefficiency
of travelling long distances to pick up passengers in the case of public bus
transport contributes to increased emissions and sometimes traffic congestion. On
the other hand, using a personal vehicle is a fuel-efficient choice, and the trips are
planned efficiently considering this flexibility.

Adding to that, it has been found that the current shared e-scooter systems emit
more CO2 per km than the transportation modes they replace, primarily due to
their brief lifespans. Nonetheless, with ongoing investments aimed at innovation
within the industry, there is optimism that future developments will enhance their
environmental performance (Moreau et al., 2020).

2.3.2 Socio-Economic Impacts

E-scooter riders are mostly younger and more educated, especially in cities with
lots of tech companies or universities (Gkartzonikas & Dimitriou, 2023). They like
e-scooters because they offer quick, affordable mobility and are environmentally
friendly.

E-scooters are great for navigating dense urban areas because of their compactness
and agility, which makes them easy to use in tight and crowded spaces. They are
also a popular choice for urban residents who intend to cut down on commute time.
A survey by Lee et al. (2021) found that people heading to educational institutions
often used e-scooters for their first- or last-mile trips. Higher-income individuals
also tend to use them more, especially when public transport isn’t as efficient.

Several factors influence e-scooter use, such as employment status, household size,
and the perceived benefits of e-scooters (Gkartzonikas & Dimitriou, 2023). These
factors shape how often people use e-scooters in urban areas. As cities get more
crowded, more people are recognising the practicality and benefits of e-scooters for
reliable, quick, and eco-friendly transportation.

2.4 Micro-mobility User Behaviour

Research on mobility patterns reveals that adults walk at an average speed of ap-
proximately 4.8 km per hour, translating to about 1.6 km every 20 minutes (Hullett
& Bubnis, 2020). The typical distance a person is willing to walk is an average of
1.25 km. Surprisingly, the walking distance, elevation changes, and satisfaction lev-
els had negligible effects on it (Manaugh & El-Geneidy, 2013). Dockless e-scooters
have been the subject of several studies, notably in the United States, where they
are primarily used for short trips. Jiao & Bai (2020) found that these trips
averaged 1.24 km in distance and about 7.55 minutes in duration, with
noticeable differences between weekday and weekend usage patterns. On weekdays,
peak usage times range from 1 pm to 5 pm, starting earlier at 11 am, whereas
weekends see consistent use throughout the afternoon. In a comparable study conducted
by Noland (2019), the observations varied with a larger dataset, with an average
travel distance of 2.14 km and a duration of 15.59 minutes, indicating slightly longer
journeys. The highest level of e-scooter usage was observed on Saturdays between
12 pm and 3 pm. In another study on dockless e-scooters in four European cities,
the patterns found were distinct, suggesting that the average travel time ranged
from 10.2 to 13.8 minutes, with a distance range of 1.96 km to 3.02 km. The peak
usage was during the afternoons on Fridays and Saturdays (Foissaud et al., 2022).

Comparative studies between American and European users suggest that Ameri-
cans tend to use e-scooters for relatively shorter and quicker trips, which reflects
differences in commuting patterns, underscoring the need for studies to be con-
ducted based on considerations of their geographic locations (Bozzi & Aguilera,
2021). Walking and riding e-scooters both reflect typical usage of under 3 km.
Given that the typical walking willingness is capped at around 1.25 km and the av-
erage e-scooter trip tends to be around or above this range, e-scooters seem to offer
a convenient alternative for distances that are slightly out of comfortable walking
range. This may specifically attract users who are seeking to save time or minimise
physical effort. Both the USA and Europe show a noticeable pattern of heightened
e-scooter utilisation on weekends, suggesting that e-scooters are preferred for
recreational and non-work-related journeys. Scores of previous studies have focused on
bike-sharing, the reason being that the shared e-scooter service market is relatively
novel. Nevertheless, findings indicate that the usage patterns of the two services
are not far apart, which makes it reasonable to apply research on bike-sharing
to shared e-scooter services.

The e-scooter is used for various trip purposes. In a study based in the USA, it was
found that midday and evening weekday usage was confined to commercial and insti-
tutional areas like the CBD and universities, whereas the trip times on weekends are
more dispersed (Tokey et al., 2022). Dockless bike-sharing (DBS) and e-scooters are
mostly used for short trips on weekday afternoons, reflecting similar trends in other
countries and indicating they mainly serve non-commuting needs. Differences be-
tween these services are evident: dockless bike-sharing peaks notably in the morning,
particularly around university areas, suggesting its role in commuting to and from
educational institutions. E-scooter usage was found to be primarily concentrated in
city centres and around public transit hubs, and it connects commuters to public
transportation or their final destination (Chicco & Diana, 2022). Another study by
Orvin & Fatmi (2021) found that individuals living in neighbourhoods with varied
amenities and extensive cycling infrastructure frequently engage in bike-sharing for
recreational purposes. Higher economic status and possession of a private vehicle
typically decrease the possibilities of utilising shared transportation modes. Criti-
cal elements that encouraged a positive adoption of sharing were mostly based on
convenience and access to public transport, residential density, and the availability
of cycle paths.

Weather conditions, such as rain or extreme temperatures, significantly influence the
adoption and usage of shared micro-mobility services (Bi et al., 2021). The number
of e-scooter users varies with the weather, with harsh conditions such as snow
causing a reduction in the number of riders (Abouelela et al., 2023).

2.5 Integration between Shared Micro-Mobility and Public Transit

Understanding user behaviours and perceptions is critical for integrating e-scooters
into urban transportation systems effectively. This understanding, along with steps
to lower the risks involved, makes it easier to add the services to current transport
systems (Useche et al., 2022). In developed regions, public transit systems typically
showcase high reliability, with schedules prominently displayed at stops. However,
seasoned transit users often possess deeper insights than what is merely displayed,
which highlights the diversity in commuters and types of information needs among
them (Larsen & Sunde, 2008). E-scooters can both compete with and complement
existing transport modes, depending on the specific use case and location (Yan,
Zhao, Han, Hentenryck, & Dillahunt, 2019).

The current focus on shared micro-mobility extensively explores its dynamics and
interplay with conventional public transport systems. A pivotal aspect of this in-
volves determining whether micro-mobility services complement or replace existing
public transport. Using a regression model, Radzimski & Dzięcielski (2021) looked
into this relationship and found that shared micro-mobility services are much more
appealing and useful when public transport is reliable and easy to get to. These
services are particularly preferred for short to medium trips within urban settings,
especially in contexts where public transit is frequent and dependable. The inclina-
tion towards micro-mobility in densely populated urban areas often correlates with
its ability to conveniently and efficiently fill the gaps left by traditional transit routes.

In such environments, bike-sharing may substitute public transport due to its conve-
nience and speed, indicating why more micro-mobility solutions tend to be integrated
when public transit is used more efficiently (Kong et al., 2020). In less dense areas,
bike-sharing acts more as a complement to public transport by bridging the first-
and last-mile gaps, aiding commuters in reaching transit stations that are otherwise
too distant to walk to (Martin & Shaheen, 2014). It was concluded that people
frequently prefer shared bikes to other modes of transportation when travelling to
and from transit stations in areas with few public transportation options. The intro-
duction of bike-sharing services influenced transportation choices significantly, with
more than 44% of commuters modifying their mode of transport and between 27%
and 40% supplementing their trips with public transit as a result of these services
(Shaheen et al., 2021). Integration patterns between e-scooters, e-bicycles, and bicy-
cles are very similar (van Kuijk et al., 2022). The suggestion is that the integration
patterns observed are interchangeable. With that, bike-sharing services showed that
their patrons were much more likely to integrate their trips with rail than with a bus
(Yan, Zhao, Han, Hentenryck, & Dillahunt, 2019). We infer that the same patterns
are associated with e-scooters.

The versatility of micro-mobility trips in urban transport is becoming increasingly
recognised. Vinagre Díaz et al. (2023) systematically categorised these trips into
four distinct functions: serving as a complementary choice that coexists with
traditional transport modes without displacing them, acting as an auxiliary tool
enhancing first-mile connectivity, acting as an auxiliary tool enhancing last-mile
connectivity, or completely substituting for public transport when necessary. The
concepts of Modal Substitution, Modal Integration, and Modal Complementing describe
the various interactions between bike sharing
or micro-mobility services and public transit. Modal Substitution occurs when a
bike-sharing trip directly replaces a public transit journey, often when the bike
route closely aligns with transit schedules and starts and ends near transit hubs
with minimal need for transfers. Modal Integration involves using bike-sharing in
conjunction with public transit, typically covering the journey to a station at the
beginning (first mile) or from a station at the end (last mile) of travel, requiring
proximity to transit stations and timely connections. Modal Complementing hap-
pens when bike-sharing addresses coverage deficiencies in areas poorly served by
public transit, particularly when the bike trip’s start or end points are remote from
transit stations or when the transit routes necessitate multiple transfers (Koo &
Choo, 2022). Distinctively, Vinagre Díaz et al. (2023) separated the roles of first-
mile and last-mile connectivity into unique functions, unlike Kong et al. (2020),
who viewed any bike-sharing activity facilitating access to or from nearby transit
stations as a single integrated role. Vinagre Díaz et al. (2023) further detailed a
complementary role where micro-mobility coexists harmoniously with other modes
of transportation without replacing them, contrasting sharply with Modal
Complementing, which focuses on filling service gaps in areas underserved by public transit
rather than merely coexisting.

2.6 Related Research Methods Explored for Integration

To find a commonality with our research parameters, we note that similar studies
evaluated public transit coverage using threshold distances of 400 m for subways and
buses and 800 m for railways (El-Geneidy et al., 2014; Jin et al., 2019). Because
these thresholds indicate a parametric gap between rail and bus transit, we avoid
direct comparisons to rail transport in this review and consider only the process.
E-scooters are relatively new compared to other micro-mobility modes, including bike
sharing. Bike-sharing and e-scooters are both used for short trips, although their
usage patterns differ in some respects (McKenzie, 2019; Zhu et al., 2020). We
therefore adapt some of the processes used in the analysis of bike-sharing services.
Reading through most of the previous research, it is easy to notice that some
processes and methods stand out.

Distance, which is used in both buffering and clustering, is a critical metric for
analysing datasets to derive meaningful conclusions and identify patterns (Sharma &
Seal, 2020). A review of studies revealed that Euclidean distance is frequently used
as a metric. However, the debate over which method is better, the Euclidean or the
Haversine, is still ongoing, as each has advantages and constraints based on
its specific application (Maria et al., 2020). The Euclidean method computes the
straight-line distance, similar to the hypotenuse of a right-angled triangle, and is
appropriate for small distances where the Earth’s curvature can be disregarded.
Although this method is simple and direct, it fails to take into account practical
obstacles such as buildings or roads. The Haversine formula is frequently used in
navigation to determine the arc distance between two points on a spherical Earth
and provides precise measurements for longer distances (Ghilissen, 2020). Another
approach, the Manhattan distance method, computes the sum of the absolute
differences in coordinates, which is particularly advantageous in urban settings
characterised by a grid-like layout (Ghilissen, 2020). It also offers a direct and
efficient approach to computation, which renders it highly valuable in fields such
as data clustering.
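
As a minimal sketch of how the three distance measures discussed above differ, the following Python snippet computes each of them for a pair of points. The function names, the mean Earth radius constant, the equirectangular projection used to obtain planar coordinates, and the two sample coordinates are our own illustrative assumptions and are not taken from the cited studies.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Arc distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project lat/lon to planar metres around a reference point (small areas only)."""
    x = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def euclidean_m(p, q):
    """Straight-line distance between two planar points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def manhattan_m(p, q):
    """Sum of absolute coordinate differences between two planar points."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])


# Two hypothetical points in central Gothenburg (latitude, longitude in degrees).
a, b = (57.7089, 11.9746), (57.6898, 11.9770)
pa = to_local_xy(*a, *a)
pb = to_local_xy(*b, *a)
print(haversine_m(*a, *b), euclidean_m(pa, pb), manhattan_m(pa, pb))
```

For points a few kilometres apart, the Haversine and projected Euclidean distances agree closely, while the Manhattan distance is never smaller than the Euclidean one.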

Vinagre Díaz et al. (2023) conducted a notable study that was comparable to ours.
Their research developed an unsupervised method to assess how e-scooters comple-
ment or substitute public transportation networks in Rome, Italy. They looked at
how close the starting and ending points of e-scooter trips were to the nearest train
or subway stations, using GTFS data to explore the relationship with public transit.

Unlike traditional methods, their approach didn’t start with any assumptions about
the distance from transit stations. Instead, it let distance emerge as a parameter
in the clustering process, taking into account factors like urban design and user
behaviour. They first filtered the data to ensure quality and remove the outliers
caused by the tracking devices, as emphasised earlier in Section 2.2. For this, they
calculated the travel distances and retained for the subsequent analysis only trips
with a distance from 0.1 km to 20 km, a duration between 30 seconds and
125 minutes, and an average speed not exceeding 25 km/h. It was reassuring
to see how this was empirically supported. The approach avoids the
limitations inherent in setting arbitrary distance thresholds by allowing intrinsic city
parameters, such as user behaviour and the layout of the public transport network,
to dictate the relationships between e-scooter trips and public transit. Their study
used a natural approach, giving a clearer view of how e-scooters are used and their
effect on the transport network. Instead of fixed metrics, they grouped trips based
on patterns in the data. This method created clusters by examining the similarities
among trips using various distance measures and the K-means clustering algorithm.
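
The filtering thresholds reported above can be expressed as a simple plausibility check. The snippet below is only a sketch under our own assumptions: the `Trip` record, the function name `is_plausible`, and the choice to treat all limits as inclusive are hypothetical and do not reproduce the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    distance_km: float  # travelled distance in kilometres
    duration_s: float   # trip duration in seconds


def is_plausible(trip, min_km=0.1, max_km=20.0,
                 min_s=30.0, max_s=125 * 60.0, max_kmh=25.0):
    """Return True if the trip passes the distance, duration and speed limits."""
    if not (min_km <= trip.distance_km <= max_km):
        return False
    if not (min_s <= trip.duration_s <= max_s):
        return False
    if trip.duration_s <= 0:  # guard against division by zero if min_s is lowered
        return False
    avg_kmh = trip.distance_km / (trip.duration_s / 3600.0)
    return avg_kmh <= max_kmh


# Hypothetical trips: only the first one satisfies all three criteria.
trips = [Trip(1.5, 420), Trip(0.05, 60), Trip(5.0, 300), Trip(2.0, 8000)]
kept = [t for t in trips if is_plausible(t)]
print(len(kept))
```

Keeping the limits as keyword arguments makes it straightforward to test other thresholds from the literature with the same function.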

The K-means algorithm analyses the distances from the start and end points of each
e-scooter trip to the nearest station (Capó et al., 2018). The steps included setting
initial cluster centres, assigning trips to the nearest centres, recalculating the
centres, and repeating until the clusters stabilised.
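
The iterative steps listed above can be sketched in a few lines of Python. This is a generic illustration of the standard k-means (Lloyd) procedure under our own assumptions, not the implementation used by Vinagre Díaz et al. (2023): the feature pair (start-to-station and end-to-station distance in metres), the sample trips, and the initial centres are hypothetical, and plain Euclidean distance is assumed.

```python
import math


def kmeans(points, centres, max_iter=100):
    """Cluster 2-D points, starting from caller-supplied initial centres."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each trip goes to its nearest centre.
        labels = [
            min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its assigned trips.
        new_centres = []
        for j, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centres.append(old)  # an empty cluster keeps its centre
        if new_centres == centres:  # the clusters have stabilised
            break
        centres = new_centres
    return labels, centres


# Hypothetical trips: (start distance to station, end distance to station) in metres.
trips = [(50, 60), (80, 40), (60, 900), (90, 1100), (1000, 950), (1200, 1100)]
labels, centres = kmeans(trips, centres=[(0, 0), (0, 1000), (1000, 1000)])
print(labels, centres)
```

In this toy example the three groups correspond to trips with both ends near a station, trips with only the start near a station, and trips with both ends far from any station.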

For analysing POI data, the method of Espinoza et al. (2019) was a good fit to
adopt. Using single point coordinates for large locations in Atlanta, such as the
stadium, was problematic because the point location did not match well with the
e-scooter’s parked location. To fix this, buffers were added around key places such
as the Georgia Aquarium (90 m), Mercedes-Benz Dome (140 m), MARTA stations (20 m),
and neighbourhoods (140 m). Additionally, a ring of 16 buffers with a 290 m radius
was added around neighbourhoods. In general, the buffers were not fixed but varied
with the size and purpose of each location.

After grouping the POIs, duplicate entries were addressed: if searches such as
"restaurant" and "bar" returned the same place, only the primary type matching
the query was kept. POIs with vague addresses were excluded. When different POIs
shared the same location, they were combined into a single "multiple" POI.

With clean trip and POI data, the closest POI to each trip’s start and end was
found using a k-dimensional tree. All locations were converted to a Cartesian plane
for accuracy. Trips were then linked to their nearest POIs, and distances between
them were recorded to help infer trip purposes. If the same POI appeared for both
the start and end, a new search was done for the start point, as it was inferred that
riders have more control over their last-mile trip than their first-mile trip (Espinoza
et al., 2019).
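
The nearest-POI lookup described above can be illustrated with a small two-dimensional k-d tree. The sketch below is our own minimal implementation for illustration, assuming the POIs have already been projected to planar coordinates in metres; the function names and the sample POIs are hypothetical, and the additional rule of repeating the search for the start point is not covered.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a 2-D k-d tree from (x, y, label) tuples."""
    if not points:
        return None
    axis = depth % 2  # alternate between splitting on x and on y
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    d = math.dist(point[:2], target)
    if best is None or d < best[0]:
        best = (d, point)
    diff = target[axis] - point[axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    if abs(diff) < best[0]:  # the other side of the split may hold a closer point
        best = nearest(far, target, best)
    return best


# Hypothetical POIs in projected metres and one trip end point.
pois = [(0.0, 0.0, "station"), (120.0, 40.0, "cafe"),
        (300.0, 310.0, "university"), (-200.0, 150.0, "park")]
tree = build_kdtree(pois)
distance, poi = nearest(tree, (110.0, 60.0))
print(poi[2], distance)
```

The recorded distance can then be stored alongside the matched POI, in line with the idea of using it to help infer trip purposes.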

Other authors have also used a variety of methods that yield reliable results in
addition to these elaborate methods, examining e-scooter and public transport
integration and their interactions with POIs. Their summaries are presented in
Tables 2.1 to 2.8.

2.6.1 Integration Analysis Processes

Zuniga-Garcia et al. (2022) investigated the role of e-scooters in bridging the
last-mile transportation gap in Austin, Texas, using a two-stage methodological
framework. Kong et al. (2020) examined the integration of bike-sharing with public
transportation by classifying trips based on their proximity to public transit
stations. Cao et al. (2021) investigated the feasibility of e-scooters as a
substitute for conventional transport by conducting a Stated Preference Survey and
employing mixed logit models.


| Author(s) | Study Focus | Details |
|---|---|---|
| Zuniga-Garcia et al. (2022) | Bridging last-mile transportation gap | Two-stage methodological framework |
| Kong et al. (2020) | Bike-sharing integration with public transportation | Scenario and trip duration analysis |
| Cao et al. (2021) | E-scooter as a substitute for conventional transport | Stated Preference Survey and mixed logit models |

Table 2.1: Summary of Integration Analysis Processes

2.6.2 Data Collection and Categorisation

In their study, Ma et al. (2021) collected data on e-scooter user guidelines from
156 cities in the United States. They identified sixteen important characteristics
and used different methods of categorisation to emphasise differences in policies.
Similarly, Karimpour et al. (2023) collected two years’ worth of e-scooter Origin-
Destination (OD) trip data from Louisville, USA, and created maps of the service
areas within specific traffic analysis zones. Zuniga-Garcia et al. (2022) compiled
data from approximately 1.7 million e-scooter trips and 9 million public transport
trips in Austin, Texas, to examine the incorporation of e-scooters into the
pre-existing public transport system. Hawa et al. (2021) also monitored the
geographical positions of e-scooters in Washington, D.C., for six consecutive days
by dividing the city into 1,671 grid cells, each measuring 0.07 square miles. In all
of these studies, there was an evident effort to gather a large dataset, reflecting
the need to obtain results as close to reality as possible.

Table 2.2: Summary of Data Collection Efforts

| Reference | Location | Details |
|---|---|---|
| (Ma et al., 2021) | USA | E-Scooter user guidelines from 156 cities. |
| (Karimpour et al., 2023) | Louisville, USA | Two years’ worth of e-scooter OD trip data. |
| (Zuniga-Garcia et al., 2022) | Austin, Texas | 1.7 million e-scooter trips and 9 million public transport trips. |
| (Hawa et al., 2021) | Washington, D.C. | Monitored e-scooters in 1,671 grid cells over six days. |


2.6.3 Data Filtration and Wrangling Methods

Ziedan et al. (2021) excluded trips that had missing data, durations shorter than 60
seconds or longer than 120 minutes, distances greater than ten miles, or average
speeds higher than 24 km/h. Cao et al. (2021) excluded e-scooter trips that lasted
less than 1 minute or more than 900 minutes. Hawa et al. (2021) initially considered
excluding fishnet cells in areas where locking is not permitted. Nevertheless, they
ultimately opted to incorporate them into their analysis because of the significant
prevalence of e-scooter trips there.

| Author(s) | Criteria | Details |
|---|---|---|
| Ziedan et al. (2021) | Trip duration, distance, average speed | Excluded trips with missing data, durations shorter than 60 seconds or longer than 120 minutes, distances over 10 miles, or speeds over 24 km/h |
| Cao et al. (2021) | Trip duration | Excluded trips less than 1 minute or more than 900 minutes |
| Hawa et al. (2021) | Fishnets | Considered excluding non-permitted areas but included them due to high prevalence |

Table 2.3: Summary of Data Filtration and Wrangling Methods

2.6.4 Buffer Ranges

Cao et al. (2021) eliminated trips that were less than 1 minute or more than 900
minutes in duration, indicating that they focused on time-based rather than
distance-based boundaries. Kong et al. (2020) used clear buffer zones of 100 m and
400 m around transit stations to divide trips into different scenarios. Section 2.6.5
goes into more detail about these scenarios. These zones clearly showed the locations
where bike-sharing trips connect to public transport stops. Hawa et al. (2021)
employed fine-scale geographical grids, also known as fishnets, with a cell size of
0.07 square miles. These grids were used to establish buffer ranges within each grid
cell.

| Author(s) | Buffer Criteria | Details |
|---|---|---|
| Zuniga-Garcia et al. (2022) | Spatial analysis near transit stations | Integrated e-scooters with transit systems |
| Cao et al. (2021) | Trip duration | Focused on time-based rather than distance-based boundaries |
| Kong et al. (2020) | Buffer thresholds | Established 100 and 400 m from transit stations |
| Hawa et al. (2021) | Geographical grids | Used 0.07 square mile fishnets |

Table 2.4: Summary of Buffer Ranges


2.6.5 Scenario Analysis

Kong et al. (2020) classified bike-sharing trips into four specific scenarios in order
to examine the ways in which these trips interacted with public transportation.
Scenario 1 involved first- or last-mile trips within a range of 100 m to 400 m from
public transport stations, which are deemed adequately served by public transport
but not close enough for convenient transfers, hence potentially replacing the use
of public transport. Scenario 2 encompassed trips that started within 100 m of a
station and ended between 100 m and 400 m from one, or vice versa. These journeys
were regarded as instances of either transit-and-bike or bike-and-transit integration,
where riders would likely utilise bike-sharing to minimise or eliminate transfers in
public transportation or to access the nearest public transit stations. Scenario 3
encompassed trips that started within a 100 m radius and ended beyond a 400 m
distance, or vice versa. These trips held the potential for modal integration. Scenario
4 involved trips that started more than 400 m away and ended between 100 m and
400 m away, or vice versa. One end of these trips was therefore more than 400 m
away from public transport, while the other was moderately close to it. These trips
were assumed to complement public transit rather than replace or integrate with it.

| Scenario | Criteria | Details |
|---|---|---|
| 1 | 100 to 400 m from stations | Potentially replacing the use of public transport |
| 2 | Within 100 m and ending 100–400 m, or vice versa | Transit-and-bike or bike-and-transit integration |
| 3 | Within 100 m to beyond 400 m, or vice versa | Facilitated seamless integration with public transportation |
| 4 | More than 400 m to 100–400 m, or vice versa | Complemented public transit rather than replaced or integrated |

Table 2.5: Summary of Scenario Analysis
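
To make the scenario rules concrete, the sketch below assigns a trip to one of the four scenarios from the distances of its two ends to the nearest station. It reflects the rules as summarised in this section rather than the original implementation of Kong et al. (2020). The function names, the treatment of the 100 m and 400 m boundaries as inclusive upper limits, and the reading of Scenario 1 as both ends lying in the 100 m to 400 m band are our assumptions. Trips that match none of the four descriptions return None.

```python
def distance_band(d_m, near_m=100.0, far_m=400.0):
    """Map a distance to the nearest station onto one of three bands."""
    if d_m <= near_m:
        return "near"  # within 100 m
    if d_m <= far_m:
        return "mid"   # between 100 m and 400 m
    return "far"       # beyond 400 m


# A set of bands ignores order, which covers the "or vice versa" cases.
SCENARIOS = {
    frozenset(["mid"]): 1,          # both ends in the 100-400 m band
    frozenset(["near", "mid"]): 2,  # one end within 100 m, the other 100-400 m
    frozenset(["near", "far"]): 3,  # one end within 100 m, the other beyond 400 m
    frozenset(["far", "mid"]): 4,   # one end beyond 400 m, the other 100-400 m
}


def classify_trip(start_d_m, end_d_m):
    """Return the scenario number (1 to 4), or None if no rule applies."""
    bands = frozenset([distance_band(start_d_m), distance_band(end_d_m)])
    return SCENARIOS.get(bands)


# Hypothetical start and end distances in metres.
for start, end in [(250, 300), (50, 250), (50, 900), (900, 250), (50, 60)]:
    print(start, end, classify_trip(start, end))
```

Representing each trip by an unordered set of bands keeps the rule table short, since every "or vice versa" case maps to the same key.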

2.6.6 Data Analysis Methods

Ma et al. (2021) utilised Chi-squared analysis to investigate the associations among
various policy components. In addition, Principal Component Analysis (PCA) was
employed to condense the dataset into its principal components. Karimpour et al.
(2023) employed PCA to decrease the number of variables used for analysis. Cao et
al. (2021) utilised mixed logit models to analyse data collected from stated preference
surveys on usage patterns and user preferences, to investigate ways in which
e-scooters were utilised for different trip purposes, such as minimising transit
transfers and facilitating access to public transit stations. Hawa et al. (2021)
employed multilevel mixed effects linear regression models to assess the influence
of different covariates on the distribution of e-scooters. Ziedan et al. (2021)
employed fixed-effects regression models to investigate the associations between
e-scooter usage and bus ridership. The aim was to ascertain whether e-scooters
functioned as substitutes or complements to bus transportation. Some of the more
advanced statistical models used were the negative binomial count model and the
zero-inflated negative binomial count model (Zuniga-Garcia et al., 2022).

| Author(s) | Method | Details |
|---|---|---|
| Ma et al. (2021) | Chi-squared analysis, PCA | Investigated policy components, condensed dataset, categorisation and testing |
| Karimpour et al. (2023) | PCA | Reduced variables for analysis |
| Cao et al. (2021) | Mixed logit models | Analysed usage patterns and user preferences |
| Hawa et al. (2021) | Multilevel mixed effects linear regression | Assessed influence of covariates on e-scooter distribution |
| Zuniga-Garcia et al. (2022) | Negative Binomial Count Model, Zero-Inflated Negative Binomial Count Model | Analysed integration between e-scooter and public transit |
| Ziedan et al. (2021) | Fixed-effects regression | Investigated associations between e-scooter usage and bus ridership |

Table 2.6: Summary of Data Analysis Methods

2.6.7 Methods of Machine Learning

The study by Zuniga-Garcia et al. (2022) used gradient boosting regression to find
and separate possible confounding factors that might affect the patterns of e-scooter
and public transit use. These factors included weather, urban design, and personal
preferences. Karimpour et al. (2023) utilised a random forest regression model in
their integration analysis.

| Author(s) | Method | Details |
|---|---|---|
| Zuniga-Garcia et al. (2022) | Gradient boosting regression | Identified confounding factors affecting usage patterns |
| Karimpour et al. (2023) | Random forest regression | Investigated factors influencing e-scooter ridership and accessibility |

Table 2.7: Methods for Machine Learning

2.6.8 Cluster Analysis and Data Visualisation
(Ma et al., 2021) employed k-means clustering to detect groups of cities that share
similar e-scooter policies and then represented the data using spiral bubble plots.

(Karimpour et al., 2023) also utilised agglomerative hierarchical clustering to inves-
tigate the first and last mile effects of e-scooter trips, and their results were presented
in a dendrogram visualisation. (Hawa et al., 2021) used precise geographical fishnet
grids to measure and analyse the distribution of e-scooters in different areas.

| Author(s) | Method | Details |
|---|---|---|
| Ma et al. (2021) | K-means clustering | Detected groups of cities with similar policies, represented with spiral bubble plots |
| Karimpour et al. (2023) | Agglomerative hierarchical clustering | Investigated first and last mile effects, presented results in a dendrogram |
| Hawa et al. (2021) | Geographical fishnet grids | Measured and analysed e-scooter distribution |

Table 2.8: Summary of Cluster Analysis and Data Visualisation

2.7 Micro-mobility in Gothenburg

Shared micro-mobility services have become an important part of urban transporta-
tion in many cities, Gothenburg included. As the number of people seeking environ-
mentally friendly and flexible transport options continues to grow, the service plays
a role in meeting this demand. Table 2.9, sourced from (Trafikverket, 2018), illustrates
the micro-mobility vehicle types in Sweden and their requirements. In
Gothenburg, the regulatory framework for e-scooters is the same as that of bicycles,
since the Swedish Transport Agency considers both to be in the same category. This
categorisation facilitates the integration of e-scooters into the current road network
system (Göteborgs Stad, n.d.).

In Gothenburg, e-scooter trips typically lasted up to approximately 7 minutes and
covered a distance of about 1.8 km. Usage diminishes significantly on weekends,
particularly in the evening, and spikes on weekdays, particularly in the afternoon.
The geographic distribution within the city is uneven, with a markedly higher
concentration of trips in central Gothenburg than in the city’s more peripheral
areas (Peci et al., 2022). These usage patterns are very similar to those observed
for Europe, as depicted by (Foissaud et al., 2022) in Section 2.5, in all aspects
except the usage days, which in Europe were on weekends.

The development of e-scooter regulations and infrastructure in Gothenburg reflects
the effort to accommodate them; however, there is a discrepancy between the
objective of decreasing car usage and the continued allocation of resources towards
road capacity expansion (Bi et al., 2021).

Table 2.9: Specifications and requirements for different vehicle types

(Wang et al., 2021) identified inefficiencies in the utility of e-scooters, where 40%
of the energy stored was not used properly. They suggested the adoption of com-
prehensive strategies that integrated e-scooters with public transport systems to
increase patronage and, consequently, improve energy efficiency. Malin Månsson, a
traffic planner in Gothenburg, stated in a 2020 publication by Trafik Göteborg, that
the city deals with approximately 1,000 instances of e-scooters being illegally parked
daily. The large quantity of e-scooters poses logistical challenges in its management
even with the implementation of fees on companies for improper usage (Trafik Göte-
borg, 2020). The GeoSence project investigated the use of geo-fencing as a means to
enhance the regulation (Malmström & Tunmarker, 2023). To ride e-scooters in the
city, a person must be a minimum of 18 years old and must ride and park in specific
designated areas. Additionally, there are restrictions on the use of e-scooters during
nighttime hours on weekends (VOI Technology AB, n.d.).

2.8 Summary
To sum up, while several scholarly articles have explored the integration of micro-
mobility options, such as e-scooters, with public transportation systems, definitive
conclusions regarding this relationship remain elusive. The interaction between e-
scooters and public transport is likely to vary based on a multitude of context-specific
factors, including urban scale, the density and coverage of the public transport net-
work, user demographics, climatic conditions, the occurrence of special events, and
the proximity to POIs. For example, in urban areas where public transport faces
significant congestion, e-scooters might attract a segment of public transport users,
serving as an alternative mode of transit. Conversely, in different city contexts,
e-scooters might be adopted as a complementary mode, enhancing multi-modal
transportation systems. While several studies have focused on the geographical
correlation between public transport stop locations and e-scooter usage, there is a
notable gap in research concerning the impact of the characteristics of public trans-
port services on e-scooter utilisation.



3 Methodology

The chapter provides an overview of the methodological processes and perspectives
that support the credibility of the study. It includes important aspects such as
research design, study area, data collection, data sample, and data cleaning. The
data cleaning section emphasises the process of preparing data, which involves tech-
niques for organising and validating bus transit data, managing e-scooter data, and
implementing buffering procedures. The exploratory analysis section provides a
comprehensive overview of the initial steps involved in examining data, particularly
the methods used to prepare the input data for thorough analysis. This chapter
also explains how to use K-prototype clustering for both categorical and numerical
data, as well as how to look at points of interest (POI) to understand patterns and
relationships in space.

3.1 Research Design

Our goal was to use Python to create machine learning algorithms to analyse the
patterns between bus transportation GTFS and e-scooter usage data, as well as their
specific connections to different Points of Interest (POIs). Our methodological
approach followed these steps in sequence, as elaborated in Figure 3.1: a literature
review, conceptualisation, building a method for data collection, data processing,
analysis, verification, and conclusion.

Using the methodology of (Vinagre Díaz et al., 2023) as a reference guide, we
eliminated the outliers and applied clustering techniques to further analyse the data.
Clustering allowed us to categorise e-scooter trips according to inherent patterns
found in the data. (Maria et al., 2020) emphasised the importance of the haversine
formula for calculating the distances between points from their longitude and
latitude (Equation (3.1)). The formula gives the length of the arc connecting two
sets of geographical coordinates. We used it to calculate all distances.

Figure 3.1: Illustration of Research Design

$$d = R \times c \tag{3.1}$$

where:
• $d$ is the distance between the two points,
• $R$ is the radius of the Earth (assumed to be approximately 6371 km),
• $c$ is the central angle between the two points, calculated as:

$$c = 2 \times \operatorname{atan2}\left(\sqrt{a}, \sqrt{1 - a}\right) \tag{3.2}$$

where:
• $a$ is calculated as:

$$a = \sin^2\left(\frac{\Delta\mathrm{lat}}{2}\right) + \cos(\mathrm{lat}_1) \times \cos(\mathrm{lat}_2) \times \sin^2\left(\frac{\Delta\mathrm{lon}}{2}\right) \tag{3.3}$$

where:
• $\Delta\mathrm{lat}$ is the difference in latitude between the two points, $\mathrm{lat}_2 - \mathrm{lat}_1$,
• $\Delta\mathrm{lon}$ is the difference in longitude between the two points, $\mathrm{lon}_2 - \mathrm{lon}_1$.
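
As an illustration, Equations (3.1) to (3.3) can be sketched in Python as follows. This is a minimal sketch and not the code used in the study; the function name `haversine_km` and the choice of degrees as input units are assumptions made for the example.

```python
from math import atan2, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # value of R quoted in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    # Equation (3.3)
    a = sin(dlat / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlon / 2) ** 2
    # Equation (3.2)
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    # Equation (3.1)
    return EARTH_RADIUS_KM * c


# One degree of longitude along the equator is roughly 111.19 km.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))
```

The trigonometric functions expect radians, so the coordinates are converted before use.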

As a secondary aspect of our research, we also conducted an empirical analysis, which
involved a thorough review of existing literature relevant to our study to bolster our
findings and provide a solid academic foundation for our observations and
conclusions.

3.2 Study Area

Gothenburg is a city in the southwestern part of Sweden, with a land area of over
440 km², a population of 596,841, and an annual growth rate of about 1.2% as of
the end of 2022, the year of the data used in this study (Statistiska centralbyrån,
n.d.). (Göteborgs Stad, n.d.) reported that the city’s cycle network spans 813 km.
The year 2021 marked a significant period of growth for micro-mobility in
Gothenburg, particularly in e-scooter services, with the addition of three new
service providers. By the peak of the summer, the number of e-scooters had
increased by 60% from the previous year, 2020. E-scooter trips more than doubled
over the year, with close to 1,000,000 rides recorded in July alone (Trafikkontoret,
2022). Four e-scooter companies currently operate in the city, namely VOI, Tier,
Ryde, and Bolt, with Ryde being the new entrant (Göteborgs Stad, n.d.-b).
Figure 3.2 shows the public transport layout of Gothenburg. Västtrafik manages the
city’s public bus system, which as of 2022 comprised 121 bus lines and 8,907 bus
stops within the city.

3.3 Data Collection

The reference data utilised in our study was meticulously and qualitatively sourced
through secondary channels, primarily the Chalmers Library database utilising Sco-
pus, and supplemented by other open-source academic platforms such as Google
and Google Scholar. This initial stage of gathering information was critical, setting
the stage for the subsequent, crucial step of rigorously evaluating the sources to
ascertain their reliability and relevance to our specific research topic. This evalua-
tive process entailed a detailed analysis of several key aspects: the credibility of the
authors involved, the publication dates of the materials, and the overall academic
competence of the sources.

The diverse array of reference sources provided a comprehensive foundation for the
study, enabling an in-depth examination of the subject matter from multiple
academic perspectives. Such a rigorous approach to sourcing and referencing
bolstered the scholarly credibility of our research and enriched the depth and
breadth of the discussions within the study. As (Penders, 2018) emphasised,
thorough and critical sourcing is imperative in enhancing the academic rigour and
integrity of research outcomes.

Figure 3.2: Public Transit Map of Gothenburg in 2022, source: Jens Svanfelt, 2018

3.4 Data Sample

The study extensively utilised POIs, bus transit GTFS, and e-scooter spatial and
temporal data, all of which were obtained from secondary sources. The e-scooter
trip datasets covered Gothenburg only, whereas the POI and bus transit GTFS data
covered the entirety of Sweden. The bus transit GTFS and e-scooter space-time
datasets comprised quantitative data gathered in June, July, November, and
December of 2022. The GTFS data was sourced from the Trafiklab website and
included detailed records for Västtrafik bus trips. The VOI e-scooter trips were
obtained from the Chalmers Urban Mobility Division, and GIS layers of POIs in
Gothenburg were obtained from OpenStreetMap.

In terms of volume, the study processed a substantial amount of data: 349,854
records of bus trips, 313,464 records of e-scooter trips, and 133,999 POI
locations. This dataset, summarised in Table 3.1, served as the foundation for the
analysis of the selected months in 2022.

The datasets for e-scooter usage included detailed information about individual
trips, each clearly identified and recorded with distance, time, and location data,
as well as battery usage statistics. These datasets contained detailed records about
each trip, such as the start and end times of trips, GPS coordinates at the start and
end points, and initial and final battery levels, along with measures such as distance
travelled, usage time, trip duration, and average speed.

The datasets for bus transport comprised several key files that together gave a
complete overview of the transit network. The "agency" file listed all the transit
authorities in Sweden, providing a broad view of the operational scope. The "stops"
file described the transit network and marked the locations where commuters
interact with bus stops. The "routes" file described the public service routes
available through the transportation system. The "trips" file detailed each journey’s
attributes, plotting routes through a series of stops over a set time frame. The
"stop times" file recorded the scheduled arrival and departure times at each stop.
Finally, the "calendar_dates.txt" file contained the operational schedule.

The POI datasets consisted of GIS layers in shapefile format. They covered seven
main categories of land and building use in the city: Residential, Commercial,
Recreational, Educational, Public, Medical, and Others. A more detailed breakdown
of the categories can be found in Appendix A.

Table 3.1: Data Overview

| Platform | VOI | Västtrafik | Gothenburg |
|---|---|---|---|
| Service Type | E-Scooter | Bus Transit | |
| Total number of trips | 313,464 | 349,854 | |
| Location type | Origin to destination | Bus Stops | Point of Interest |
| Total number of locations | 626,928 | 8,907 | 133,999 |
| Key Attributes | Space and Time | Space and Time | Space |

3.5 Data Wrangling
Given the precise demands of the study, a thorough data cleaning process was crucial
to guarantee the accuracy and pertinence of the data in alignment with the research
objectives. Figure 3.3 shows a visual path of the processes used.

Figure 3.3: Illustration of Data Wrangling Process

3.5.1 Bus Transit and Stops

The GTFS data obtained covered all of Sweden. To isolate the data relevant to
Gothenburg from the broader national dataset, geographic filters were applied
using the location data embedded within the GTFS, and the dataset was filtered to
include only local bus trips and bus stop locations by using the agency code
specific to Gothenburg’s Västtrafik (agency code 279).

To ensure accuracy and standardise units across datasets, corrections were made to
erroneous entries. Errors were detected in the time data for the bus trips, with some
trips recorded as lasting more than 24 hours. These values were adjusted to a
standard date-time format; for example, ’30 hours’ was converted to ’plus 1 day and
6 hours’.
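
The adjustment described above can be sketched with Python's standard library. This is an illustrative sketch only: the function name `normalise_gtfs_time` and the 'HH:MM:SS' input format are assumptions, and the service date in the example is made up.

```python
from datetime import date, datetime, timedelta


def normalise_gtfs_time(service_date: date, hhmmss: str) -> datetime:
    """Turn an 'HH:MM:SS' string whose hour may be 24 or more into a datetime."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    # timedelta carries any hours beyond 24 over into whole days automatically.
    offset = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    midnight = datetime.combine(service_date, datetime.min.time())
    return midnight + offset


# '30:00:00' on 1 June 2022 becomes 06:00 on 2 June 2022 (plus 1 day and 6 hours).
print(normalise_gtfs_time(date(2022, 6, 1), "30:00:00"))
```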

The intention here was to identify each bus stop and its connected trips. To ensure
that the data was well represented and yielded an accurate date, time, and
location, we followed the steps depicted in Figure 3.4. First, the routes were
filtered using the Västtrafik agency ID mentioned above to obtain the route IDs
within Gothenburg. The data within this file has no spatial data but only a unique
identifier code that links it to the route location. Next, the route IDs were used to
filter the unique trip IDs, and the trip IDs were used to obtain the stop IDs, which
link directly to the latitude and longitude of each bus stop. Finally, the trip IDs
and route IDs were used to filter the stopping times of each trip at each bus stop.
After these processes, the resulting datasets were: bus stops (location); bus trips
(date and time); and e-scooter trips (origin location with date and time, and
destination location with date and time).

Figure 3.4: Bus GTFS filtering
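
The filtering chain in Figure 3.4 can be sketched as follows. The column names (`agency_id`, `route_id`, `trip_id`, `stop_id`) follow the GTFS specification, while the function name `filter_gtfs` and the tiny in-memory tables are made up for illustration; this is not the code used in the study.

```python
def filter_gtfs(routes, trips, stop_times, stops, agency_id):
    """Follow the chain agency -> routes -> trips -> stop times -> stops."""
    route_ids = {r["route_id"] for r in routes if r["agency_id"] == agency_id}
    trip_ids = {t["trip_id"] for t in trips if t["route_id"] in route_ids}
    kept_times = [s for s in stop_times if s["trip_id"] in trip_ids]
    stop_ids = {s["stop_id"] for s in kept_times}
    kept_stops = [s for s in stops if s["stop_id"] in stop_ids]
    return kept_times, kept_stops


# Made-up rows; real GTFS text files could be read with csv.DictReader.
routes = [{"route_id": "R1", "agency_id": "279"}, {"route_id": "R2", "agency_id": "100"}]
trips = [{"trip_id": "T1", "route_id": "R1"}, {"trip_id": "T2", "route_id": "R2"}]
stop_times = [
    {"trip_id": "T1", "stop_id": "S1", "arrival_time": "08:00:00"},
    {"trip_id": "T2", "stop_id": "S2", "arrival_time": "08:05:00"},
]
stops = [
    {"stop_id": "S1", "stop_lat": "57.7089", "stop_lon": "11.9746"},
    {"stop_id": "S2", "stop_lat": "59.3293", "stop_lon": "18.0686"},
]

kept_times, kept_stops = filter_gtfs(routes, trips, stop_times, stops, agency_id="279")
print(kept_times)
print(kept_stops)
```

Sets are used for the intermediate ID collections so that each membership test stays cheap on large files.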

3.5.2 E-scooter

Through outlier removal, we ensured that the dataset was reliable. This
preprocessing step helped filter the dataset to ensure valid and actionable results
(Ziedan et al., 2021). The process involved identifying and excluding records that
met specific criteria for outliers. In this case, the criteria were travel distances of
exactly 100 m and travel times between 1 second and 60 minutes. The time and
distance criteria were intended to prevent the inclusion of short trips that were
presumed to be joyrides. (Vinagre Díaz et al., 2023) conducted the same outlier
removal process, considering a minimum time range of 30 to 60 seconds and a
maximum of 120 to 125 minutes as the outlier ranges.

3.5.3 POI

The GIS layers of the POI data, which were received as shapefiles, were converted
into CSV files using ArcGIS so that they could be read easily in Python. Since the
data covered all of Sweden, we first set boundaries similar to the process in
Section 3.5.1.

3.5.4 Buffering

During our analysis, we constructed buffer ranges, as shown in Figure 3.5, to
support our investigation. Unlike (Kong et al., 2020), who performed a scenario
analysis, we adopted buffer ranges based on the research considerations of (Li et
al., 2024). We used a 50 m radius buffer around the origin or destination of
e-scooter trips as the most suitable distance for access and egress near a bus stop.
The 50 m buffer ensures that scooter users can easily access and leave scooters
without walking excessively, in line with findings that short walking distances
improve the usability and attractiveness of shared mobility services.

In addition to the 50 m buffer, we implemented another buffer to exclude trips that
both originated and ended within a 500 m range. This was done with a function
that defines the range around the e-scooter as latitude ±0.0045 degrees and
longitude ±0.0055 degrees. The values assume that 1 degree of latitude is
approximately 111 km (thus, 0.0045 degrees is about 500 m) and adjust for
longitude according to the cosine of the latitude (since degrees of longitude get
closer together as one moves away from the equator). This exclusion criterion was
based on the premise that short trips within a 500 m range might not provide
significant insights into the typical integration use patterns of e-scooters, possibly
representing anomalies or misuse. Research on first- and last-mile distances
supports the strategy by emphasising the significance of excluding excessively short
trips to concentrate on more representative travel patterns (Li et al., 2024).
Similarly, (Kong et al., 2020) applied buffering, considering buffer zone radii
between 100 m and 400 m, as explained in Section 2.6.
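
A function of the kind described above might look like the following sketch. The name `is_short_trip` and the coordinates in the example are hypothetical; only the ±0.0045 and ±0.0055 degree ranges come from the text.

```python
def is_short_trip(o_lat, o_lon, d_lat, d_lon, lat_range=0.0045, lon_range=0.0055):
    """Return True when the destination lies inside the box around the origin."""
    return abs(d_lat - o_lat) <= lat_range and abs(d_lon - o_lon) <= lon_range


# Destination about 0.002 degrees away in both directions: inside the box.
print(is_short_trip(57.7000, 11.9700, 57.7020, 11.9720))  # True
# Destination 0.02 degrees further north: outside the box.
print(is_short_trip(57.7000, 11.9700, 57.7200, 11.9700))  # False
```

Trips for which the function returns True would be the ones excluded from the integration analysis.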

Figure 3.5: Illustration of Buffer zones and Analysis considerations

We also considered trips that had their first or last mile beyond 500 m from the
origin or destination in our analysis. Including these trips gives a bigger picture of
how far people use e-scooters, which was important for figuring out if they could
be a good alternative for medium-range city travel (Radzimski & Dzięcielski, 2021).
E-scooter trips that started or ended within 10 minutes of a bus’s arrival at a bus
stop were also included in our analysis.
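
The 10-minute criterion can be illustrated with a small counting function. This sketch assumes a symmetric window (arrivals up to 10 minutes before or after the e-scooter trip time); the function name `count_buses_within` and the timestamps are made up for the example.

```python
from datetime import datetime, timedelta


def count_buses_within(trip_time, bus_arrivals, window_minutes=10):
    """Count bus arrivals no more than window_minutes before or after trip_time."""
    window = timedelta(minutes=window_minutes)
    return sum(1 for arrival in bus_arrivals if abs(arrival - trip_time) <= window)


trip_start = datetime(2022, 6, 1, 8, 0)
arrivals = [
    datetime(2022, 6, 1, 7, 52),  # 8 minutes before: counted
    datetime(2022, 6, 1, 8, 10),  # exactly 10 minutes after: counted
    datetime(2022, 6, 1, 8, 25),  # 25 minutes after: not counted
]
print(count_buses_within(trip_start, arrivals))  # 2
```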

3.6 Integration

In the process of analysing the temporal and spatial data derived from e-scooters
and public transportation usage, we employed a structured parametric breakdown to
enhance our understanding and management of the data. This approach was crucial
for dissecting the dynamics of commuter movement patterns, focusing particularly
on the hours most indicative of peak commuting times, which include regular work-
ing hours and other high-traffic periods.

We utilised a systematic parametric breakdown during the analysis to improve our
comprehension of the temporal aspects. To streamline this analysis, we initially set
up a time buffer that specifically included the hours when commuter activity was
at its peak, starting at 6:00 AM and ending just before midnight at 23:59:59. This
range was selected to encompass the complete range of daily commuter movements,
including both the busy morning periods and the late evening returns. We divided
this range into nine equal intervals, each lasting two hours, and numbered them
from 1 to 9. The remaining portion of the 24-hour day was consolidated and
labelled as 0.
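
The time grouping can be expressed compactly. The following is a minimal sketch with a hypothetical function name `time_group`; it assigns 0 to times before 06:00 and 1 to 9 to the two-hour slots that follow.

```python
from datetime import time


def time_group(t: time) -> int:
    """Return 1 to 9 for the two-hour slots from 06:00 to 23:59:59, otherwise 0."""
    if t.hour < 6:
        return 0
    return (t.hour - 6) // 2 + 1


print(time_group(time(5, 59)))       # 0
print(time_group(time(6, 0)))        # 1
print(time_group(time(23, 59, 59)))  # 9
```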

Our analysis was based on the standard workweek observed in Sweden, taking into
account the cultural and regional variations that greatly influence commuting
patterns. We classified the days into three distinct groups to represent the overall
work culture more accurately. Monday to Thursday were considered complete
working days, representing a typical workweek, and were labelled as 1. Friday was
regarded as a half day, acknowledging the widespread custom of shorter working
hours, and was labelled as 2. The weekend days, Saturday and Sunday, when trips
were usually unrelated to work, were labelled as 0.
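
The day grouping can be sketched in the same way. The function name `day_group` is hypothetical; the labels follow the description above.

```python
from datetime import date


def day_group(d: date) -> int:
    """Return 1 for Monday to Thursday, 2 for Friday, and 0 for the weekend."""
    weekday = d.weekday()  # Monday is 0, Sunday is 6
    if weekday <= 3:
        return 1
    if weekday == 4:
        return 2
    return 0


print(day_group(date(2022, 6, 1)))  # a Wednesday, so 1
print(day_group(date(2022, 6, 3)))  # a Friday, so 2
print(day_group(date(2022, 6, 4)))  # a Saturday, so 0
```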

The segmentation was crucial in identifying the variability in commuter behaviour,
which frequently changes based on the day of the week. By grouping the days in this
way, we can conduct a more detailed analysis that accurately captures the movement
of commuters, taking into account both regular workdays and the unique patterns
of weekends and half days. Our methodological framework is in accordance with the
findings of (Bozzi & Aguilera, 2021), who highlight the significance of taking into
account geographical biases in transportation research. By tailoring our analysis
to the particular cultural and work practices of Sweden, we recognised that these
patterns may differ considerably in other nations.

3.7 K-Prototype Analysis

To comprehensively analyse the datasets, our research aimed to evaluate various
clustering methods to determine their suitability for managing diverse data types.
This assessment revealed that while some algorithms offer specific benefits, they
may not be appropriate for the complexities inherent in mixed data types.

When deciding on suitable methods, it is important to consider the nature of the
data, desired cluster shapes, computational limits, and the specific requirements
of the intended application domain, since each algorithm offers unique advantages
tailored to different analytical challenges and data environments. DBSCAN is
effective at detecting clusters in spatial datasets, making it a great tool for
analysing geographic data. Its ability to handle noise and outliers efficiently is
essential when addressing spatial irregularities (Ester et al., 1996). An alternative
approach, hierarchical clustering, is renowned for its capacity to produce intricate
dendrogram structures, enabling a more profound understanding of data
relationships. The method is highly adaptable in analysing both numerical and
categorical data (Nielsen, 2016).

| Input | Description |
|---|---|
| o stops within 50m | Count of bus stops within 50 m of the e-scooter origin |
| d stops within 50m | Count of bus stops within 50 m of the e-scooter destination |
| new d to o near stop distances | Origin of e-scooter to destination of bus |
| new o to d near stop distances | Origin of bus to destination of e-scooter |
| o count | Count of bus trips at origin within 10 mins |
| d count | Count of bus trips at destination within 10 mins |
| Days (0,1,2) | Days of the week: Monday to Thursday = 1, Friday = 2, weekend = 0 |
| Time groups (0 to 9) | Time between 6:00 and 23:59:59 divided into 2-hour intervals = 1 to 9, other times = 0 |

Table 3.2: Input of mixed data Xi for Analysis

To conduct a thorough analysis of the datasets, we assessed these different cluster-
ing techniques to ascertain their appropriateness for handling a wide range of data
types. The assessment showed that although certain algorithms provided distinct
advantages, they may not be suitable for the complexity of mixed data types.

Ultimately, we opted for the K-prototypes algorithm due to its practical utility
in real-world scenarios where data often consists of mixed-type objects. The
K-prototypes algorithm extends the usual clustering method with a distance
measure that combines squared Euclidean distance for numerical attributes and a
simple matching dissimilarity measure for categorical attributes. A key feature of
this algorithm is the weighting parameter γ, which balances the impact of numerical
and categorical distances during the clustering process. The algorithm updates
cluster centroids by calculating the mean of the feature values for numerical data
and the mode for categorical data. It assigns data points to the closest cluster
based on the total distance measure and keeps updating the centres until
convergence is reached (Hernández et al., 2023).
Consider a collection of $n$ objects represented as

$$X = \{X_1, X_2, X_3, \ldots, X_n\},$$

where each object $X_i$ is composed of a set of attributes

$$X_i = \{X_{i1}, X_{i2}, X_{i3}, \ldots, X_{im}\}.$$

Within this framework, the attributes are further categorised into two distinct types:
$m_n$ numerical attributes and $m_c$ categorical attributes, making the total number of
attributes for each object

$$m = m_n + m_c.$$

The primary objective of clustering in this context is to organise these $n$ objects into
$k$ distinct and non-overlapping clusters, denoted as

$$C = \{C_1, C_2, C_3, \ldots, C_k\}.$$

The algorithm proceeds by initialising $k$ centroids, each representing a cluster. These
centroids are iteratively refined through the algorithm’s execution: in each iteration,
every object $X_i$ is assigned to the cluster whose centroid is nearest to it, according
to the combined distance measure across the attributes. Post-assignment, the
centroids are recalculated to reflect the mean (for numerical attributes) and mode (for
categorical attributes) of the objects now contained within each cluster. The process
is repeated until a stopping criterion is met, typically when the centroids stabilise
with minimal or no change, indicating that the clusters have become sufficiently
distinct according to the dataset’s inherent structure.

To combine numerical and categorical data, the K-Prototype algorithm employs a
combined distance measure:

$$D(X_i, C_k) = \sum_{j \in \mathrm{Num}} (x_{ij} - c_{kj})^2 + \gamma \sum_{j \in \mathrm{Cat}} \delta(x_{ij}, c_{kj}) \tag{3.4}$$

where:
• $X_i$ is the i-th data point.
• $C_k$ is the k-th cluster centroid.
• Num and Cat are the sets of numerical and categorical features, respectively.
• $x_{ij}$ and $c_{kj}$ are the values of the j-th feature for the i-th data point and the
  k-th cluster centroid.
• $\delta(x_{ij}, c_{kj})$ is an indicator function that equals 1 if $x_{ij} \neq c_{kj}$ and 0 otherwise.
• $\gamma$ is a weighting parameter to balance the numerical and categorical distances.

For numerical features, the cluster centroid is the mean of the feature values of all
points in the cluster:

$$c_{kj} = \frac{1}{|C_k|} \sum_{X_i \in C_k} x_{ij}$$

For categorical features, the cluster centroid is the mode of the feature values of all
points in the cluster:

$$c_{kj} = \operatorname{mode}\{x_{ij} : X_i \in C_k\}$$
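
To make Equation (3.4) concrete, the combined distance can be sketched as a short Python function. The function name `combined_distance`, the split of each object into a numerical list and a categorical list, and the example values are assumptions made for illustration.

```python
def combined_distance(point_num, point_cat, centroid_num, centroid_cat, gamma):
    """Squared Euclidean distance plus gamma times the number of category mismatches."""
    numerical = sum((x - c) ** 2 for x, c in zip(point_num, centroid_num))
    categorical = sum(1 for x, c in zip(point_cat, centroid_cat) if x != c)
    return numerical + gamma * categorical


# Numerical part: (1 - 0)^2 + (2 - 0)^2 = 5; one categorical mismatch weighted by 0.5.
print(combined_distance([1.0, 2.0], ["1", "3"], [0.0, 0.0], ["1", "4"], gamma=0.5))  # 5.5
```

A larger value of `gamma` makes a categorical mismatch count for more relative to the numerical differences.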

3.7.1 Algorithm Steps

1. Initialization:
   • Randomly select K initial cluster centroids, where K is the number of
     desired clusters.
   • Each centroid consists of both numerical and categorical components.
2. Assignment Step:
   • Assign each data point $X_i$ to the nearest cluster centroid $C_k$ based on the
     combined distance measure $D(X_i, C_k)$.
3. Update Step:
   • Update the cluster centroids $C_k$ by calculating the mean of the numerical
     features and the mode of the categorical features for all points assigned
     to each cluster.
4. Convergence Check:
   • Repeat the assignment and update steps until the cluster assignments no
     longer change or until a specified number of iterations is reached.
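
The four steps above can be sketched in plain Python as follows. This is a minimal illustration and not the implementation used in the study: the function names, the default `gamma`, the fixed random seed, and the toy data are all assumptions, and empty clusters simply keep their previous centroid.

```python
import random
from collections import Counter


def distance(num, cat, c_num, c_cat, gamma):
    """Combined distance: squared Euclidean part plus weighted mismatches."""
    return (sum((x - c) ** 2 for x, c in zip(num, c_num))
            + gamma * sum(x != c for x, c in zip(cat, c_cat)))


def k_prototypes(nums, cats, k, gamma=1.0, max_iter=100, seed=0):
    rng = random.Random(seed)
    # 1. Initialisation: pick k distinct data points as starting centroids.
    start = rng.sample(range(len(nums)), k)
    centroids = [(list(nums[i]), list(cats[i])) for i in start]
    labels = None
    for _ in range(max_iter):
        # 2. Assignment: nearest centroid under the combined distance.
        new_labels = [
            min(range(k),
                key=lambda j: distance(n, c, centroids[j][0], centroids[j][1], gamma))
            for n, c in zip(nums, cats)
        ]
        # 4. Convergence check: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # 3. Update: mean for numerical features, mode for categorical ones.
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            c_num = [sum(col) / len(members)
                     for col in zip(*(nums[i] for i in members))]
            c_cat = [Counter(col).most_common(1)[0][0]
                     for col in zip(*(cats[i] for i in members))]
            centroids[j] = (c_num, c_cat)
    return labels, centroids


# Toy data: two obvious groups, one numerical and one categorical feature each.
nums = [[0.0], [0.1], [5.0], [5.1]]
cats = [["a"], ["a"], ["b"], ["b"]]
labels, centroids = k_prototypes(nums, cats, k=2)
print(labels)
```

Because the starting centroids are drawn at random, the cluster numbers themselves can differ between seeds even when the grouping is the same.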

3.7.2 Standardisation of Results

The outputs are similar to the parameters set for the input in Table 3.2. The scores
have been made more comparable across different types of variables by standardising
them.
Specifically, the data is standardised using the following formula:

$$z = \frac{x - \mu}{\sigma}$$

where:
• $x$ is the original data.
• $\mu$ is the mean of the data.
• $\sigma$ is the standard deviation of the data.
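
The standardisation can be sketched with Python's `statistics` module. The text does not state whether the population or the sample standard deviation was used, so the population version (`pstdev`) here is an assumption, as is the function name `standardise`.

```python
from statistics import mean, pstdev


def standardise(values):
    """Return the z-score of every value in the list."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(x - mu) / sigma for x in values]


print(standardise([2.0, 4.0, 6.0]))
```

After standardisation the values have a mean of 0 and a standard deviation of 1, which is what makes variables measured in different units comparable.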

3.8 POI Analysis
The section aimed to assess how Points of Interest (POIs) affected integrated e-
scooter trips, specifically how they affected first- and last-mile trips. First, we
gathered information about POIs and divided them into seven different categories:
commercial, residential, recreational, educational, public, medical, and others. To
enable a more detailed and nuanced examination, we subsequently expanded this
framework to 14 POI groups, allowing us to analyse each type of POI at both
the start and end points of e-scooter journeys. Following the categorization, we
implemented a clustering method to precisely identify and count points of interest
within a 200 m radius of both e-scooter trip origins and destinations.
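
Counting POIs by category around a trip end point might look like the sketch below. It reuses the haversine formula from Equation (3.1) in metres; the function names, the `o`/`d` prefix convention for origin and destination groups, and the sample coordinates are hypothetical.

```python
from collections import Counter
from math import atan2, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres between two points given in degrees."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return radius_m * 2 * atan2(sqrt(a), sqrt(1 - a))


def count_pois(lat, lon, pois, prefix, radius_m=200):
    """Count POIs per category within radius_m of the point (lat, lon)."""
    counts = Counter()
    for poi_lat, poi_lon, category in pois:
        if haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m:
            counts[f"{prefix}_{category}"] += 1
    return counts


# Made-up POIs roughly 56 m, 111 m, and 1.1 km north of the trip origin.
pois = [
    (57.7005, 11.9700, "commercial"),
    (57.7010, 11.9700, "residential"),
    (57.7100, 11.9700, "medical"),
]
print(count_pois(57.7000, 11.9700, pois, prefix="o"))
```

Calling the function once with the origin and once with the destination of a trip, using the prefixes `o` and `d`, yields the two sets of seven category counts, 14 groups in total.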

4 Results

This chapter presents the findings of our study. The results are organised into sev-
eral key areas: transfer distance and possible integration, seasonal and time-of-day
variations, trip density, cluster results, summer cluster results, winter cluster results,
comparison of centroids, and points of interest (POI) analysis.

The first part covers the transfer distances and the potential for integrating
e-scooters and bus services. The next part covers how e-scooter usage varies with
seasons and times of the day, together with the results of trip density, highlighting
the most frequent routes and areas of activity. We then present the clustering
results, with separate findings for summer and winter. A comparison of cluster
centroids is included to illustrate spatial distribution patterns. Finally, the POI
results identify significant locations that play a crucial role in the integration of
these transport modes.

4.1 Transfer Distance and Possible Integration
The 2D histogram in Figure 4.1 represents the counts of origin and destination
points of e-scooter trips against the access and egress distances to the nearest bus
stations, plotted on the x- and y-axes, respectively. It uses a colour gradient to
represent trip density, with brighter colours indicating a higher concentration of
trips. Note that these distances do not represent the first- and last-mile trips but
instead the distances walked by commuters to access or egress an e-scooter.

The highest concentration of trips starting and ending near bus stations is clustered
within the first 500 m and starts to diminish after 1 km. The bright colours in the
bottom left corner of the histogram indicate that the majority of trips occur within
the first 50 m to 100 m of the bus stations. Notably, there are few to no trips
beyond the 1 km distance.

Figure 4.1: 2-Dimensional histogram of trip distances to nearest bus stop

4.2 Seasons and Temporal Analysis

The frequency of e-scooter trips, both overall in Figure 4.2 and within the specific
time window of 6:00 to 23:59:59 in Figure 4.3, demonstrates seasonal and daily usage
patterns when comparing usage during the summer (months of June and July) with
the winter months (November and December).

In June, there are 70,782 trips in total, and in July, the peak month, there are
107,761 trips, giving a combined total of 178,543 trips over both months. Most
summer trips occur in the daytime: 65,924 in June and 99,101 in July, totalling
165,025 daytime trips. Nighttime trips are fewer but still notable, with 4,858 in
June and 8,660 in July, summing to 13,518 summer night trips.

In the winter months, the use of e-scooters is much lower. November shows 43,210
trips, and December, the lowest month, records 14,261 trips, resulting in a total of
57,471 trips during the winter months. Daytime trips also decreased during this
season, with 41,365 in November and 13,349 in December, totaling 54,714 daytime
trips in winter. Night trips reduce significantly, with 1,845 in November and 912 in
December, adding up to 2,757 winter night trips.
Comparing these seasons, total trips drop by 68% from summer to winter, daytime
trips drop by 67%, and nighttime trips fall by 80%.
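
The seasonal percentages follow directly from the totals reported above. The short sketch below reproduces the arithmetic; the function and variable names are illustrative.

```python
def percent_drop(summer: int, winter: int) -> float:
    """Relative decrease from the summer count to the winter count, in percent."""
    return 100.0 * (summer - winter) / summer

# Totals as reported in the text (June + July versus November + December).
summer = {"total": 178_543, "day": 165_025, "night": 13_518}
winter = {"total": 57_471, "day": 54_714, "night": 2_757}

# Rounded to whole percentages, as in the text.
drops = {k: round(percent_drop(summer[k], winter[k])) for k in summer}
```

For example, the total drop is (178,543 - 57,471) / 178,543, which is approximately 0.678 and rounds to 68%.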


Figure 4.2: Total number of trips

Figure 4.3: Total daytime trips

4.3 Cluster Integration Analysis in Comparison with Seasons

Figure 4.5 shows the counts of integrated and non-integrated e-scooter trips per
cluster for summer and winter. Non-integrated trips from clusters 4, 5, and 6, which
make up about 80% of the total trips, outnumber the integrated trips from clusters
0, 1, 2, and 3 during both seasons.

Among the integrated trips, those starting from public transport locations (clusters
0 and 2, associated with first-mile trips) account for about 10% of the total trips,
and those ending at public transport locations (clusters 1 and 3, associated with
last-mile trips) also account for about 10%. The two groups are thus evenly
distributed, indicating a balanced use of e-scooters for both first- and last-mile
trips.

(a) For Summer

(b) For Winter

Figure 4.4: Bar chart of Integrated and Non-integrated Trips


(a) For Summer

(b) For Winter

Figure 4.5: Bar chart of Count of e-scooter trips per Cluster

4.3.1 Clusters Analysis for Summer

From Figure 4.6, we observe that Clusters 0 and 2 show significant values for the
first mile. In Cluster 2, the count of bus trips associated with first-mile trips
that started within 10 minutes after a bus arrived is higher than in Cluster 0, with
counts of 18.04 and 4.63, respectively. Both clusters have an equal, positive count
of bus stops at their first-mile trip. The same does not hold at the destination,
where the number of buses arriving within 10 minutes and the count of bus stops
are both less than zero. This pattern is also reflected in their first- and
last-mile distances, with Cluster 2 at an average distance of about 1.21 km and
Cluster 0 at about 1.03 km.
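
A feature of this kind, the number of buses arriving shortly before a trip starts, could be computed as in the sketch below. It is a minimal illustration and not the thesis implementation: it assumes a sorted list of bus arrival times at the stop nearest the trip origin, uses the 10-minute window from the text, and all names and timestamps are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

def buses_before_trip(arrivals, trip_start, window=timedelta(minutes=10)):
    """Count bus arrivals in [trip_start - window, trip_start].

    `arrivals` must be sorted in ascending order.
    """
    lo = bisect_left(arrivals, trip_start - window)
    hi = bisect_right(arrivals, trip_start)
    return hi - lo

# Hypothetical bus arrivals at one stop (illustrative only, not thesis data).
arrivals = [
    datetime(2023, 6, 1, 14, 0),
    datetime(2023, 6, 1, 14, 5),
    datetime(2023, 6, 1, 14, 12),
    datetime(2023, 6, 1, 14, 30),
]
n_buses = buses_before_trip(arrivals, datetime(2023, 6, 1, 14, 14))
```

Averaging this count over all trips in a cluster would give a per-cluster value comparable in spirit to those discussed above. The mirrored last-mile feature would count departures in the window after the trip ends.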

The integration peak observed in Clusters 0 and 2 occurs mid-week, on Thursdays,
a workday. The common activity time is early afternoon, from 14:00 to 16:00, for
both clusters.

Clusters 1 and 3, conversely, show significant activity related to the destination,
where Cluster 3 has a higher count of bus trips departing the bus stops at the
destination point than Cluster 1, with counts of 17.86 and 4.37, respectively. As
with Clusters 0 and 2, the average count of bus stops at the last mile is 1 for
both, and their first- and last-mile distances are about 1.03 km and 1.25 km for
Clusters 1 and 3, respectively. The common activity time is early afternoon, from
14:00 to 16:00, for both clusters and occurs on Thursdays.

Clusters 4, 5, and 6 show lower results across their metrics. In particular, because
their counts of bus stops within 50 m at both the first- and last-mile are less than
zero, we rule out the possibility of integration. The first- and last-mile distances
for clusters 4, 5, and 6 are 0.63 km, 1.54 km, and 2.98 km, respectively. The
day of usage for clusters 4, 5, and 6 is Thursday, and the times of usage are in the
early afternoon, also between 14:00 and 16:00.

Figure 4.6: Cluster Analysis Heatmap for Summer


4.3.2 Clusters Analysis for Winter

The results in Figure 4.7 show patterns similar to those in summer. Clusters 0 and
2 have high activity at the e-scooter starting points, while Clusters 1 and 3 have
high activity at the destinations. Clusters 4, 5, and 6 again show close to no
bus stop integration activity.

Cluster 2 has a significantly higher count of bus trips within 10 minutes of the
first-mile trip than Cluster 0, with counts of 25.09 and 5.76, respectively. Both
clusters have a high count of integrated