Delivery times and Regionalization
Publicerad
Författare
Typ
Examensarbete för masterexamen
Master's Thesis
Master's Thesis
Modellbyggare
Tidskriftstitel
ISSN
Volymtitel
Utgivare
Sammanfattning
This project aimed to find a solution to partition geographical maps into regions
based on historical lead time data of delivered parcels. The guidelines of the partitioning
were to keep lead time distributions of areas within one region as similar
as possible, while keeping distributions of different lead times as different as possible.
The partitioning algorithm would also preferably execute the task in reasonable
time. Two algorithms were implemented and tested. One was an adaptation of Ant
Colony Optimization, that proved to strongly respect the specified criteria, but at
the cost of very high computation time. The second algorithm goes by the name
REDCAP for spatial clustering, and this one was more thoroughly explored. It
works by building a minimum spanning tree between all atomic level areas as nodes,
clustering them together in a bottom-up approach while considering up to all nodes
in every currently merged cluster along the way. The algorithm comes with some
options to be specified: How to compute the dissimilarity between two clusters
(single-linkage, complete-linkage, average-linkage), and which nodes of the clusters
to consider (first-order constraint, full-order constraint). The data, represented as
histogram vectors with each bin representing days passed since pickup, all split up
by pickup day of the week, also had to be compared using some measure of dissimilarity.
The dissimilarity measures explored were L1-distance, L2-distance, earth
mover distance, and Jaccard index. This allowed for many different possible partitionings
despite the same principle. To compare the result, consensus clustering was
applied to measure how consistently the same algorithm partitions the same nodes
of a data set together when making small adjustments to the data itself. It allowed
to determine the quality of each option combination, and its ability to select how
many regions to suggest. For a couple of data sets, single-linkage and earth-mover
distance failed to give out a clear result and were removed as options. First-order
constraint had a habit of cutting out single node size regions, and were therefore
also removed. The combinations of the other options will all run on any new data
set, and they each got a vote for the final partitioning. Some visual plots confirm
that this approach seems to give reasonable results. Since deliveries tend to take
longer to islands and remote locations, the algorithm tends to perform most of its
partitioning there, and group country mainlands into one single region. Deliveries in
Sweden showed to be an exception to this pattern, due to a lot of remoteness in the
mainland, yielding regions of more equal area size. These regions were compared
to those of the provided estimations of lead time, which turned out to match well
overall, with some minor differences.
Beskrivning
Ämne/nyckelord
Regionalization, Spatial clustering, Graph theory, REDCAP, ACO, Consensus clustering.