Mission Management for Fuel Cell Electric Trucks – a Bilevel Approach Optimise power split, speed and refuelling in the presence of uncertain external factors Master’s thesis in Systems, Control and Mechatronics ILYA KUANGALIYEV OSCAR MARK DEPARTMENT OF ELECTRICAL ENGINEERING CHALMERS UNIVERSITY OF TECHNOLOGY Gothenburg, Sweden 2024 www.chalmers.se www.chalmers.se Master’s thesis 2024 Mission Management for Fuel Cell Electric Trucks - a Bilevel Approach Optimise power split, speed and refuelling in the presence of uncertain external factors ILYA KUANGALIYEV OSCAR MARK Department of Electrical Engineering Division of Systems and control Chalmers University of Technology Gothenburg, Sweden 2024 Mission Management for Fuel Cell Electric Trucks - a Bilevel Approach Optimise power split, speed and refuelling in the presence of uncertain external factors ILYA KUANGALIYEV, OSCAR MARK © ILYA KUANGALIYEV, OSCAR MARK 2024. Supervisors: Saurabh Suman, Olof Lindgärde, Volvo Group Technology Examiner: Nikolce Murgovski, Department of Electrical Engineering Master’s Thesis 2024 Department of Electrical Engineering Chalmers University of Technology SE-412 96 Gothenburg Telephone +46 31 772 1000 Cover: Illustration of mission management for fuel cell electric trucks Typeset in LATEX, template by Kyriaki Antoniadou-Plytaria Printed by Chalmers Reproservice Gothenburg, Sweden 2024 iv Mission Management for Fuel Cell Electric Trucks - a Bilevel Approach Optimise power split, speed and refuelling in the presence of uncertainty ILYA KUANGALIYEV, OSCAR MARK Department of Electrical Engineering Chalmers University of Technology Abstract Fuel cell electric vehicles (FCEVs) have a crucial role in the transition to fossil- free travel and transport. In order to fully leverage their potential, proper mission planning of such vehicles should be conducted. This thesis aims to solve the mixed- integer problem of mission management with a hierarchical bilevel approach. The first layer solves a simplified mixed-integer problem of choosing hydrogen stations to refuel at and the amount to refuel with the help of stochastic dynamic programming, allowing us to account for uncertain external factors to obtain a more robust solution. The second layer becomes a smooth problem that can be solved with more complex and comprehensive dynamics. In addition, the benefits of using non-equidistant stage sampling were evaluated. To evaluate the robustness and optimality of the proposed approach 3 tests were conducted. The simulations were conducted with a 45-ton truck using 3 routes: Zagreb–Paris (1559 km), Gothenburg–Kiruna (1793 km), and Gothenburg–Rødby (461 km). Overall, according to the results, the bilevel approach can yield solutions comparable in optimality to the ones that were obtained with different strategies. In some cases, the proposed approach produced better results when it comes to robustness in the presence of disturbances, but the results were not conclusive enough to answer either of the research questions with a high level of confidence. Even though the non-equidistant stage sampling did not show any substantial increase in simulation accuracy, it was shown that stage and DP stage resolutions can be reduced without significant losses in optimality, making the computation time almost 3 times smaller. Keywords: Fuel cell electric trucks, Mission Management, Stochastic optimal con- trol, Mixed-integer nonlinear program. v Acknowledgements We would like to extend our heartfelt gratitude to our supervisors, Saurabh Suman and Olof Lindgärde. Their support and advice throughout the project has been invaluable. By giving us a good understanding of the background and inspiring us with a lot of great ideas for implementation they helped the project to advance without a single hitch. We also want to thank our examiner, Nikolce Murgovski, for being so dedicated to helping us with every issue we had and sharing all his experi- ence in optimal control. Finally, we want to thank our peer, Isac Borghed, who was doing his thesis on a similar topic together with us at Volvo Group Technology, for great and effective collaboration and partnership. Ilya Kuangaliyev, Gothenburg, June 2024 Oscar Mark, Gothenburg, June 2024 vii List of Acronyms Below is the list of acronyms that have been used throughout this thesis listed in alphabetical order: DC Direct current GPS Global positioning system ICE Internal combustion engine IVP Initial value problem BEV Battery electric vehicle EM Electric machine ESS Energy storage system HRS Hydrogen refuelling station MSE Mean squared error MPC Model predictive control MINLP Mixed-integer nonlinear program NLP Nonlinear program NMPC Nonlinear model predictive control FC Fuel cell FCEV Fuel cell electric vehicle HEV Hybrid electric vehicle PEM Predictive energy management PEMFC Proton-exchange membrane fuel cell RPM Revolutions per minute DP Dynamic programming SDP Stochastic dynamic programming SoC State of charge ix Nomenclature Below is the nomenclature of indices, sets, parameters, and variables that have been used throughout this thesis. Indices i State index n Stage number in the NLP k Stage number in the MINLP Sets X State constraint set Xf Terminal state constraint set U Control input constraint set Kstation Set of stages in MINLP that have a hydrogen station Parameters x0 Initial condition xtarget Target terminal state value m Vehicle mass mb Battery mass Af Vehicle effective frontal surface area Cd Vehicle aerodynamic drag coefficient Crr Vehicle rolling resistance coefficient ktm Gear ratio of the transmission kamb Coefficient of heat exchange between the battery’s case and air xi g Free fall acceleration ρair Air density ηem Electric machine efficiency ηtm Transmission efficiency ηth Battery thermal system efficiency Pfc,max Fuel cell maximum power output Pfc,min Fuel cell minimum power output Ṗfc,max Fuel cell maximum power output rate Ṗfc,min Fuel cell minimum power output rate FT,max Maximum traction force Fbr,max Maximum friction brake force H̄2,min Minimum normalised hydrogen level H̄2,max Maximum normalised hydrogen level SoCmin Minimum SoC level SoCmax Maximum SoC level Cb Battery electric capacity CH2 Hydrogen tank capacity cb Battery specific heat capacity N Number of stages in the NLP K Number of stages used in the MINLP M Number of quantised state values used in SDP R Tyre effective radius trf Full tank refuelling time tq Station queue time vmax Maximum vehicle speed Tb,min Minimum battery temperature Tb,max Maximum battery temperature λ Discount factor ∆s Spatial step ∆H Hydrogen’s lower heating value κ Risk tolerance (Maximum risk of failure) Variables xii x State vector u Control input vector y Outputs vector t Time tK Total travel time in NLP tN Total travel time in MINLP v Vehicle longitudinal velocity vrel Vehicle speed relative to the air s Vehicle position along the path q Battery charge SoC Battery state of charge Ib Battery current H2 Hydrogen level in the vehicle tank H2,rf Hydrogen refuelled at a hydrogen station H̄2 Normalised hydrogen level in the vehicle tank H̄2,rf Normalised hydrogen refuelled at a hydrogen station Tb Battery temperature Pem Electric machine power demand Pfc Fuel cell power output Pb Battery power output Pbr Friction brake power Pth Battery thermal system power Ebr Energy lost from friction breaking Qact Battery active heat generation rate Qpass Battery passive heat generation rate Qamb Battery dissipated heat rate FT Traction force Ffc Traction force from the fuel cell Fb Traction force from the battery Fbr Friction brake force Frr Rolling resistance force Fair Aerodynamic drag force Fg Gravitational force due to road slope Fz Normal reaction force of the road xiii sH2 Slack variable for H̄2 bound relaxation sSoC Slack variable for SoC bound relaxation st Slack variable for t bound relaxation τem Electric machine torque ωem Electric machine angular velocity δπ Binary variable used for feasibility check in the MINLP δst Binary variable identifying a hydrogen station’s presence ωt Cost of time in the NLP ωH2 Cost of hydrogen in the NLP ωs Cost of slack variables in the NLP ωbr Cost of friction breaking in the NLP εe Cost of deviating from desired state targets εbr Cost of friction braking in the MINLP εfuel Cost of fuel (hydrogen) in the MINLP εfail Cost of mission failure in the MINLP εstop Cost of making a stop to refuel in the MINLP εtr Cost of breaching the fuel cell’s transient’s maximum allowed level in the MINLP εllim,xi Cost of breaching a state’s lower bound in the MINLP εulim,xi Cost of breaching a state’s upper bound in the MINLP εC Cost of mission completion in the MINLP µPem Power demand’s mean σPem Power demand’s standard deviation γ Vehicle’s gear Functions p(s) Road parameters α(s) Road grade φ(s) Vehicle heading angle θ(s) Wind heading angle Tamb(s) Ambient temperature w(s) Wind speed β(·) Wind effective angle of attack xiv vlim(s) Road speed limit ∆vtr(s) Traffic speed reduction Paux(Tamb) Auxiliary power demand Rb(SoC, Tb) Battery internal resistance Pem,loss(Pem) Electric machine loss Pb,max(SoC, Tb) Battery maximum power output Pb,min(SoC, Tb) Battery minimum power output Uoc(SoC, Tb) Battery open circuit voltage H2,con(Pfc) Hydrogen consumption rate J(·) Objective function to minimise in the nonlinear problem Pb,loss(·) Battery heat loss Pfc,heat(Pfc) Fuel cell heat loss Pr(·) Probability function f(·) Probability density function of the uncertainty f(·) Continuous state dynamics g(·) Nonlinear inequality constraints kem(γ) Gear ratio of the motor L(·) Cost function used in the MINLP L(·) Cost function used in the MINLP L(·)dr Cost function of driving used in the MINLP L(·)lim Cost function of leaving the state constraint set used in the MINLP L(·)tr Cost function of breaching the maximum allowed level of the fuel cell’s transient used in the MINLP L(·)rf Cost function of refuelling used in the MINLP T (·) Terminal cost function used in the MINLP Q(·) Bellman action-value function V (·) Bellman value function π Optimal policy from dynamic programming δDirac(s) Dirac delta function ηfc(Pfc) Fuel cell efficiency xv xvi Contents List of Acronyms ix Nomenclature xi List of Figures xix List of Algorithms xxi List of Tables xxiii 1 Introduction 1 1.1 Previous work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Main research questions . . . . . . . . . . . . . . . . . . . . . . . . . 3 2 Theory 5 2.1 Vehicle modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.1.1 Longitudinal dynamics . . . . . . . . . . . . . . . . . . . . . . 5 2.1.2 Powertrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Optimal control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.2.1 Stochastic optimal control . . . . . . . . . . . . . . . . . . . . 15 2.2.2 Mixed-integer optimisation . . . . . . . . . . . . . . . . . . . . 16 2.3 Dynamic programming . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3.1 Stochastic dynamic programming . . . . . . . . . . . . . . . . 17 3 Methods 19 3.1 Solution strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.2 Discretisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.2.1 Space sampling . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.2.2 Data downsampling . . . . . . . . . . . . . . . . . . . . . . . . 21 3.3 Problem formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.3.1 Top layer – stochastic dynamic programming . . . . . . . . . . 22 3.3.2 Bottom layer – non-linear programming . . . . . . . . . . . . 27 3.4 Simulation setups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.4.1 Test 1 – Robustness analysis . . . . . . . . . . . . . . . . . . . 29 3.4.2 Test 2 – Iterative method . . . . . . . . . . . . . . . . . . . . 31 3.4.3 Test 3 – Resolution analysis . . . . . . . . . . . . . . . . . . . 32 xvii Contents 4 Results 33 4.1 Downsampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.2 Test 1 – Robustness analysis . . . . . . . . . . . . . . . . . . . . . . . 35 4.3 Test 2 – Iterative method . . . . . . . . . . . . . . . . . . . . . . . . 39 4.4 Test 3 – Resolution analysis . . . . . . . . . . . . . . . . . . . . . . . 42 5 Discussion 47 5.1 Downsampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 5.2 Robustness Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 5.3 Iterative method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.4 Resolution analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.5 Dynamic programming . . . . . . . . . . . . . . . . . . . . . . . . . . 49 6 Conclusion 51 Bibliography 53 A Algorithms used in solution and evaluation I A.1 Speed preprocessing algorithm . . . . . . . . . . . . . . . . . . . . . . I A.2 Heuristic algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . II B Nonlinear problem formulation III xviii List of Figures 2.1 Illustration of wind’s relative speed. . . . . . . . . . . . . . . . . . . . 7 2.2 Transmission efficiency maps as a function of motor torque and RPM and at optimal gear. . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.3 Illustration of PEMFC’s principle of operation. [27], CC0 . . . . . . . 9 2.4 Hydrogen power and fuel cell efficiency, as functions of fuel cell power. 10 2.5 Normalised charge and discharge power bounds, as functions of bat- tery temperature and state of charge. . . . . . . . . . . . . . . . . . . 12 2.6 (a): Normalised open circuit voltage and internal resistance, as func- tions of battery temperature and SoC. . . . . . . . . . . . . . . . . . 13 2.7 Schemes representing the cases of the vehicle’s power flow. . . . . . . 14 3.1 Block diagram illustrating the hierarchical approach. . . . . . . . . . 20 3.2 Comparison of interpolation methods in discontinuous region. . . . . 25 3.3 Illustration of stochastic variables’ sampling. . . . . . . . . . . . . . . 26 3.4 Illustration of normal approximation of the stochastic power demand. Probability density function of total power demand (v = 70 km/h, α = 0, v′ = 0). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.1 Comparison of equidistant and non-equidistant sampling for two dif- ferent routes, with original resolutions of N = 15000 and N = 6000, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.2 exponential fit of MSE data for route one. . . . . . . . . . . . . . . . 34 4.3 H̄2 and SoC trajectories together with the power split for the Gothenburg– Kiruna route with equidistant sampling with N = K = 1600. Sub- figures (a)–(c) correspond to different numbers of quantised state values in SDP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 4.4 H̄2 and SoC trajectories together with the power split for the Gothenburg– Kiruna route with non-equidistant sampling with N = K = 1600. Subfigures (a)–(c) correspond to different numbers of quantised state values in SDP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 xix List of Figures xx List of Algorithms 1 Speed preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . I 2 Heuristic station selection . . . . . . . . . . . . . . . . . . . . . . . . II xxi List of Algorithms xxii List of Tables 3.1 Station positions and hydrogen prices for the Zagreb–Paris route. . . 30 3.2 Station positions and hydrogen prices for the Gothenburg–Kiruna route. 31 3.3 Initial and terminal state constraints, stage quantisation and SDP regeneration decision for simulations 1-12. . . . . . . . . . . . . . . . 32 4.1 Simulation results with different methods for the route Zagreb–Paris with no uncertainty applied. . . . . . . . . . . . . . . . . . . . . . . . 35 4.2 Simulation results with different methods for the route Zagreb–Paris, with uncertainty applied after the station selection. . . . . . . . . . . 36 4.3 Simulation results with different methods for the route Gothenburg– Kiruna with no uncertainty applied. . . . . . . . . . . . . . . . . . . . 37 4.4 Simulation results with different methods for the route Gothenburg– Kiruna, with uncertainty applied after the station selection. . . . . . 38 4.5 Results from simulations for route one, Zagreb–Paris, using the same seed for every simulation. . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.6 Results from simulations for route two, Gothenburg–Kiruna, using the same seed for every simulation. . . . . . . . . . . . . . . . . . . . 40 4.7 Results from simulations for route three, Gothenburg–Rødby, using the same seed for every simulation. . . . . . . . . . . . . . . . . . . . 40 4.8 Results from simulations for route one, Zagreb–Paris, using a different seed for every simulation. . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.9 Results from simulations for route two, Gothenburg–Kiruna, using a different seed for every simulation. . . . . . . . . . . . . . . . . . . . . 41 4.10 Results from simulations for route three, Gothenburg–Rødby, using a different seed for every simulation. . . . . . . . . . . . . . . . . . . 41 4.12 Simulation results with different SDP state quantisation and stage resolution with non-equidistant stage sampling for the route Gothenburg– Kiruna. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.11 Simulation results with different SDP state quantisation and stage resolution with equidistant stage sampling for the route Gothenburg– Kiruna. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 xxiii List of Tables xxiv 1 Introduction Humanity has set ambitious goals for reducing worldwide carbon emissions to fight climate change. As transport is responsible for about a quarter of the EU’s total CO2 emissions, 27% of which come from heavy-duty trucks [1], the market actors can not ignore this challenge. From a lifecycle perspective, most of the emissions occur during the use phase of vehicles. Therefore, one of the ways that companies can reduce emissions is to manufacture and encourage the use of vehicles running on more sustainable sources of energy. Today plenty of resources are spent on developing alternatives to the internal com- bustion engine. An example of such is fuel cell electric vehicles. One challenge in bringing these vehicles to the market is the lack of infrastructure required to charge or refuel. To ensure the adoption of these new technologies in the commercial sector additional services need to be provided that plan routes and suggest where to stop to refuel, as such decisions might have unintuitive optimal solutions. A common way of approaching the mission management task is viewing it as an op- timisation problem and solving it using nonlinear model predictive control (NMPC) [2]. Some of the previous work in this field make it evident that mission man- agement for fuel cell electric vehicles using NMPC is not a trivial task for many reasons. Firstly, problems arise from the specific behaviour of the fuel cell, which makes it necessary to use an additional short-term energy storage system (ESS), in other words, a battery. The ESS lets the vehicle brake regeneratively, increasing en- ergy efficiency, and provides an additional power source during acceleration and hill climbing, reducing power fluctuations for the fuel cell, which increases its lifetime [3]. This, however, in turn leads to some complications. To fully utilise the ESS the battery state of charge (SoC) must be used optimally. For instance, the battery SoC should be sufficiently high before a large hill climb, and nearly empty before a steep descent, in order to efficiently assist the fuel cell and recuperate otherwise lost energy. Having two energy sources also complicates the optimisation task as it increases the number of states in the vehicle, where one state tracks the level of hydrogen in the tank and another tracks the battery’s charge. This in turn means the controller needs to know the road topography ahead of time and plan accordingly. Another complication arises from the fact that the task contains some binary deci- sion variables, such as whether to stop at a hydrogen station to fuel or not. This makes the optimisation problem mixed-integer which is computationally demanding to approach using only NMPC as then it has to be solved for many combinations of 1 1. Introduction the binary variables. In this project, a two-layer hierarchical approach is developed, where the discrete decision variables are determined at the top level using stochastic dynamic programming (SDP) leaving the residual problem to be solved at the lower level as a smooth nonlinear program (NLP). 1.1 Previous work One of the main works that this project is based on is that by Sundberg and Bragde [4]. In their thesis, the same problem is approached only from the NMPC perspec- tive, where some smoothening techniques are used to omit the integer part of the objective function. Also, some complexity-reducing assumptions such as simplified thermal dynamics of the battery were made in that work which will be challenged in this work. Another example of using optimal control applied to the modelling of FCEVs is [5] where a supervisory control is conducted to ensure an optimal power split between the battery and the fuel cell. In [6] one can find how power split optimisation is conducted with model predictive control (MPC). The area of vehicle routing and planning has been studied for decades. Optimal ve- hicle routing problems consider single or multiple vehicles travelling along a graph, potentially loading/unloading goods or passengers at certain points and minimis- ing a cost often associated with fuel, time, or more recently, environmental impact [7]. For an overview, see the book by Toth and Vigo [8]. For the single vehicle mission, eliminating the routing problem and instead considering a fixed route de- creases model complexity, allowing instead for more complex vehicle modelling, e.g., hybrid vehicles where a powertrain split can be optimised [9]. Early work consider- ing hybrid electric vehicles (HEVs), by Rajagopalan and Washington [10] and Lin et al. [11], utilise GPS technology to optimise the power split between the inter- nal combustion engine (ICE) and the electric machine (EM) for a short horizon based on topography and traffic data. In papers by Kolmanovsky et al. [12] and by Johannesson et al. [13], certain parameters are modelled as stochastic. In Sund- berg and Bragde’s thesis, and in the paper by Johannesson et al. the problem is discretised spatially, as space-dependent dynamics become easier to formulate than time-dependent dynamics and maintain problem convexity. In her master thesis, Katsagyri [14] samples the road at non-equidistant points to capture more detail in the road characteristics and in doing so improves accuracy. When it comes to thermal modelling of the battery, Hamednia et al. [15] discusses optimal (battery) thermal management for battery electric vehicles (BEVs) from the mission planning perspective in their paper. In it, they also separate driving and charging dynamics into a hybrid dynamical system, where the charging dynamics are modelled in a time domain, whereas the driving dynamics are modelled in a spatial domain. Another example is a paper by Murashko et al. [16], in which they describe the modelling thermal management system for the lithium-ion battery pack in hybrid electric vehicles (HEVs). Different hierarchical approaches have also been incorporated quite extensively in 2 1. Introduction past works. One example is a paper by Johannesson et al. [17], in which the solution is based on a 3-layer structure to approach the mixed-integer problem, where each layer operates on different timescales with different update frequencies, prediction horizons, and model abstractions. Another example of such an approach is described in a paper by Ganesan et al. [18], where the lower layer is intended to solve the mixed-integer problem of gear selection using receding horizon MPC with different numerical relaxations. Methods accounting for stochastic factors have also been used in quite a few works, one example being a paper by Wang et al. [19], where optimal energy schedul- ing for a hybrid fuel cell/PV/battery system in the presence of uncertain factors is done based on stochastic dynamic programming. In their paper, Ritter et al. [20] incorporate stochastic model predictive control for the energy management of HEVs. 1.2 Limitations In order to reduce the complexity of the problem, no other vehicle dynamics than the longitudinal ones are considered in this work, which means that only the forces that influence a vehicle’s movement in the direction of travel are accounted for. For the same reason, no fluid dynamics of hydrogen during refuelling is considered in the work. According to EU laws, truck drivers must take a 45-minute long break after 41 2 hours of driving at the latest, and must not drive for more than 9 hours a day, including breaks, except for twice a week where it may be extended to 10 hours [21]. This factor significantly complicates the solution of the mission management problem, therefore an optimisation problem considering these constraints will be left for future work. 1.3 Main research questions Through implementing a hierarchical approach for handling the mixed-integer prob- lem, the authors hope to answer the following research questions: • Can splitting the mixed-integer problem into two layers, solving the integer decisions in the higher layer, thus making the second layer a smooth non-linear problem, yield a similar optimal solution as the MINLP problem? • Further, can modelling certain dynamics as stochastic processes improve the robustness of the refuelling station selection? • Can non-equidistant stage sampling help reduce the computational heaviness of the problem without significant losses in accuracy and optimality? 3 1. Introduction 4 2 Theory This chapter details how the vehicle and its powertrain are modelled, detailing the background to help understand the equations used in modelling. 2.1 Vehicle modelling As previously stated only the longitudinal dynamics are considered as the truck is driving, the problem can thus be considered 2-dimensional, one being the spatial dimension and the other being the temporal dimension. 2.1.1 Longitudinal dynamics The equation of motion of the truck in the longitudinal direction can be written as mv̇ = FT − Frr − Fair − Fg − Fbr, (2.1) where, on the left side, m is the mass of the truck and v̇ is the time derivative of the truck’s speed. On the right-hand side, FT is the traction effort generated by the EM, Frr is the sum of all rolling resistance forces acting on each tyre, Fair is the aerodynamic drag force, and Fg is the gradient force due to the road slope. The equation of motion is missing longitudinal slip losses, which appear when the wheel rotation speed differs from the corresponding vehicle speed. Modelling slip losses require tracking wheel rotation speed, which significantly increases the complexity of the model, and as these losses are mostly relevant during acceleration and decel- eration modelling them provides little increase in accuracy, especially in the mission management context. A more detailed description of traction effort FT is done in section 2.1.2 whereas the other forces are discussed here. Rolling resistance Rolling resistance is the resistive force that arises as a wheel is rolling on the surface of the road. The deformation of the tyre during rolling is both elastic and viscous, so- called viscoelastic, due to the properties of rubber material. In the elastic component of the deformation, no permanent (plastic) deformations occur, which is, however, not the case for the viscous component as this deformation can not be fully recovered after unloading. During viscous deformation, friction between the molecules causes a conversion of the mechanical energy transferred to the tyre into heat, hence leading 5 2. Theory to energy losses [22]. Mathematically these losses are expressed as a rolling resistance force that can be written as Frr = CrrFz, (2.2) where Crr is the rolling resistance coefficient of the tyres and Fz is the sum of all the normal reaction forces of the road acting on each tyre. The rolling resistance coefficient Crr is a measure of the force needed for a vehicle to maintain a constant speed with zero road grade and zero air resistance per unit weight. The rolling resistance coefficient is usually found empirically and is approximately equal to 0.0035 for a 40-tonne Volvo FH truck operating in the summer [23]. External factors such as rain, snow and the road surface can have significant effects on the rolling resistance coefficient, increasing it by upwards of 50% [24][25]. For a truck driving on a road, the normal force is Fz = mg cos α, where α is the road grade. The rolling resistance force can then be rewritten as Frr = Crrmg cos α. (2.3) Aerodynamic drag Aerodynamic drag forces, Fair arise as an object moves through a fluid. The equation is given by Fair = 1 2ρairCdAfv2, (2.4) where ρair is the density of air, v the vehicle’s longitudinal speed, and CdAf , often combined as one parameter denoted drag area, is the product of the drag coefficient and the effective frontal area of the vehicle respectively. Considering the dynamics of wind speed and heading the equation can be expanded into Fair = 1 2ρairCdAf (β)v2 rel, (2.5) where the parameter CdAf is now a function of the effective angle of attack β and the vehicle’s speed relative to the air around it, vrel, both of which are given by the equations β = arctan vrel,y vrel,x − φ− π (2.6) vrel = √ v2 rel,x + v2 rel,y, (2.7a) vrel,x = w cos θ − v cos φ, (2.7b) vrel,y = w sin θ − v sin φ. (2.7c) The parameter w is the wind speed, φ is the vehicle’s heading angle and θ is the wind heading angle, both relative to a set coordinate frame. An illustration of all those speeds can be seen in Figure 2.1. The value of CdAf (β) has been determined empirically for a range of common angles β. 6 2. Theory Figure 2.1: Illustration of wind’s relative speed. Gradient force The part of the gravitational force acting in the direction of the vehicle’s heading is given by Fg = mg sin α, (2.8) where a positive slope angle α, signifying an increase in altitude ahead, results in a resistive gradient force and a negative slope angle results in an additive gradient force. Braking force Braking of the vehicle is done with the help of either friction brakes or regenerative brakes, however, in the context of this work, all the variables that have br as an index, will refer to the friction brakes only. More about the regenerative brake force will be discussed further in this chapter. 2.1.2 Powertrain The vehicle considered in this thesis is a heavy-duty fuel-cell electric truck. A fuel cell converts hydrogen gas to electric energy which, assisted by a lithium-ion battery, drives an electric machine. The battery can be charged by the fuel cell or by regenerative braking but cannot be charged by plug-in. Electric machine The EM converts electric energy into tractive effort, driving the vehicle in the desired direction. The conversion from motor torque τem to traction force FT depends on the gear ratio kem of the selected gear γ, the fixed gear ratio of the transmission ktm, the transmission efficiency ηtm and the effective tyre radius R as such FT = ηtmkem(γ)ktmτem R . (2.9) The efficiency ηtm depends on τem and the motor RPM. An efficiency map can be seen in Figure 2.2a. In turn, the RPM depends on the vehicle speed and the gear selection. The field of optimising gear selection has been well studied [18]. 7 2. Theory Inherently it is an integer problem which brings complexity to the NLP layer. It is subsequently assumed that a low-level controller performs gear selection to achieve close to optimal efficiency. By doing so, the efficiency mapping can be made directly from vehicle speed to traction effort, as in Figure 2.2b. 0.60.6 0.6 0.6 0.80.8 0 .8 0 .8 0.8 0.8 0.860.86 0 .8 6 0 .8 6 0.86 0.86 0.90.9 0 .9 0 .9 0.9 0.9 0.91 0 .9 1 0.91 0.91 0.92 0 .9 2 0.92 0.92 0.9 3 0.93 0 .9 3 0.93 0 .9 3 0.93 0.9 4 0 .9 4 0 500 1000 1500 2000 2500 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 0.6 0.65 0.7 0.75 0.8 0.85 0.9 (a) As a function of motor torque and RPM. (b) At optimal gear. Figure 2.2: Transmission efficiency maps as a function of motor torque and RPM and at optimal gear. The direct relationship between τem and motor power Pem is given by the angular speed of the wheels as τem = ηemPem ωem , (2.10) which, under the assumption that there is zero slip between the wheels and the road, is a function of the vehicle speed as such τem = ηemPemR kemktm(γ)v . (2.11) Putting this into equation (2.9) shows the relation between motor input power Pem from the DC-bus, vehicle speed v, and traction force FT to be FT = ηtmηem Pem v . (2.12) For more convenience when solving the problem using numerical methods, the effi- ciency can be represented by an EM loss term FT = Pem − Pem,loss v , (2.13) where Pem,loss = (1 − ηtmηem)Pem and Pem,loss > 0 for both positive and negative motor powers. 8 2. Theory Fuel cell A fuel cell is an electrochemical system that is intended to convert the chemical energy of a fuel (hydrogen in this case) and an oxidising agent (oxygen in this case) into electricity through a pair of redox reactions. Polymer electrolyte membrane fuel cells (PEMFC) utilise a membrane that is permeable for protons, but not for the reactant gases and electrons [26]. Thanks to that property, this kind of fuel cell is also known as a proton-exchange membrane fuel cell. As shown in Figure 2.3, hydrogen enters the fuel cell at the anode and splits into protons and electrons. The electrical chemical reaction that occurs at the anode is H2 −−→ 2 H+ + 2 e−. (2.14) The protons pass through the membrane while the electrons flow through an external circuit, producing an electric current. After that, the oxygen entering the cathode is combined with the resulting protons and electrons to produce water (either gas or liquid) and heat. The reaction on the cathode’s side is thus O2 + 4 H+ + 4 e− −−→ 2 H2O. (2.15) Figure 2.3: Illustration of PEMFC’s principle of operation. [27], CC0 The entire reaction that occurs in the PEMFC is 2 H2 + O2 −−→ 2 H2O. (2.16) In the context of the problem, knowing how a fuel cell’s hydrogen consumption and efficiency vary depending on its power is important. According to [28], hydrogen 9 2. Theory consumption can be approximated as a quadratic function of the fuel cell’s electrical power Pfc as follows H2,con(Pfc) = −(a2P 2 fc + a1Pfc + a0), (2.17) where a0, a1, a2 ≥ 0. Hydrogen power can then be written as PH2(Pfc) = H2,con(Pfc)∆H = −(a2P 2 fc + a1Pfc + a0)∆H, (2.18) where ∆H is hydrogen’s lower heating value. The efficiency of a fuel cell represents the share of chemical (hydrogen) energy that is converted into electrical energy. The heat that is emitted during the chemical reaction is considered to be a loss and most of it is removed by the cooling system. Thus, fuel cell efficiency can be written as ηfc(Pfc) = Pfc |PH2(Pfc)| = Pfc (a2P 2 fc + a1Pfc + a0)∆H . (2.19) The fuel cell hydrogen power and efficiency curves can be seen in Figures 2.4a and 2.4b, respectively. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 -1 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0 (a) Hydrogen power. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (b) Fuel cell efficiency. Figure 2.4: Hydrogen power and fuel cell efficiency, as functions of fuel cell power. Accounting for the hydrogen consumption by the fuel cell and the amount refuelled at hydrogen stations, the differential equation of the hydrogen level in the tank can be written as follows Ḣ2 =  H2,con, H2,rf > 0 H2,con + δst CH2 trf , H2,rf = 0 , (2.20) which after substitution becomes Ḣ2 =  −(a2P 2 fc + a1Pfc + a0), H2,rf > 0 −(a2P 2 fc + a1Pfc + a0) + δst CH2 trf , H2,rf = 0 , (2.21) 10 2. Theory where H2,rf is the amount of hydrogen refuelled at a hydrogen station, CH2 is the capacity of the vehicle’s hydrogen tank, trf is the time needed to refuel the hydrogen tank to the fullest, δst is a binary variable identifying the presence of a hydrogen station nearby at the current time. The refuelling time of FCEVs is usually below 15 minutes [29], which makes it reasonable to neglect the time dynamics of refuelling and consider the refuelling rate constant. It is important to note that operation with low fuel cell power Pfc should be avoided as it leads to degradation of the catalyst in the fuel cell [30]. At the same time, operating at high power levels is also detrimental to the fuel cell as it may incur fuel starvation, which degrades both the catalyst and gas diffusion layers in the fuel cell through carbon oxidation [31]. Constraining the fuel cell to work within a desired operating region can thus lengthen its lifespan. Since the fuel cell must operate in a very narrow temperature range around 70 ◦C, the heat generated during its operation must be systematically removed [32]. The heat generated by the fuel cell system can be approximated as Pfc,heat(Pfc) = c2P 2 fc + c1Pfc + c0 (2.22) where c0, c1, c2 ≥ 0. Battery The battery model is comprised of lithium-titanium-oxide (LTO) cells. These offer a faster charge/discharge cycle and a longer life cycle than other lithium-based batteries at the cost of lower terminal voltage [33]. Connecting several of these cells in parallel yields a battery pack with a capacity Cb. The normalised battery charge, the so-called SoC is defined as the relation between the current battery charge q and its capacity SoC = q Cb . (2.23) The SoC’s dynamics can then be described as ˙SoC = Ib Cb , (2.24) where Ib is the battery current. As the power of any circuit equals voltage multiplied by the current in it, the relation between the normalised battery discharge rate ˙SoC and the output power of the battery Pb is given by ˙SoC = −Pb CbUoc(SoC, Tb) , (2.25) where the battery open circuit voltage Uoc is a function of SoC and the bat- tery’s internal temperature Tb that is shown in Figure 2.6a, and Pb is the bat- tery power that lies within an interval that is determined by SoC and Tb: Pb ∈ [Pb,min(SoC, Tb), Pb,max(SoC, Tb)]. Pb is positive when the battery is discharging, and negative when charging. The upper bounds on the battery’s discharge and 11 2. Theory charge powers as functions of battery temperature and SoC are shown in Figures 2.5a and 2.5b, respectively. (a) Discharge power bound. (b) Charge power bound. Figure 2.5: Normalised charge and discharge power bounds, as functions of battery temperature and state of charge. Battery temperature can be modelled with the help of the heat balance equation Ṫb = Qpass + Qact + Qamb cbmb (2.26) where cb and mb are the battery’s specific heat capacity and total mass, respectively, Qpass is the rate of the heat generated by passive heat sources affecting the battery’s temperature, Qact is the active heat rate supplied to or removed from the battery by external auxiliary components that are intended to control battery’s temperature, and Qamb is the rate of the heat dissipated into the ambient air. It is worth noting that the incorporated model assumes a uniform distribution of temperature along the battery, which substantially reduces the complexity of the problem. Passive heat ultimately consists of the heat losses generated due to internal re- sistance in the battery. These losses can be modelled according to Joule’s law as Qpass = Rb(SoC, Tb)I2 b = Rb(SoC, Tb) P 2 b U2 oc(SoC, Tb) , (2.27) where Rb is the battery’s internal resistance which significantly decreases at higher battery temperatures as one can see in Figure 2.6b. 12 2. Theory (a) Open circuit voltage. (b) Internal resistance. Figure 2.6: (a): Normalised open circuit voltage and internal resistance, as func- tions of battery temperature and SoC. Active heat generation rate can be defined as Qact = ηthPth, (2.28) where Pth is the power of the battery thermal system and ηth is its efficiency. Pth is positive when heat is supplied to and negative when heat is removed from the battery. Heat dissipated into the ambient air can be modelled using Fourier’s law of heat conduction Qamb = kamb(Tamb − Tb), (2.29) where Tamb is the temperature of the ambient air and kamb > 0 is the coefficient of heat exchange between the battery’s case and air, which normally is dependent on the vehicle’s speed but in this work will be chosen as constant for simplicity. The battery modelling is based on [15] and [34], so for more detailed derivations check out those. Power balance equations The fuel cell and battery supply power to the DC Bus, which then passes it to the EM, battery and fuel cell thermal systems and other auxiliary loads. The battery can be charged either through regenerative braking or by the fuel cell. The battery cannot be charged and discharged at the same time. There are 4 different cases of power flow in the vehicle: (a) Both the battery and the fuel cell provide power to the EM and for the aux- iliaries and thermal management needs as in Figure 2.7a. (b) The vehicle is braking regeneratively but all the power goes to the auxiliaries and thermal management needs as in Figure 2.7b. This case is not realistic as the fuel cell’s minimum operating power is always larger than the auxiliaries and thermal management needs. 13 2. Theory (c) The battery is charging from the fuel cell as in Figure 2.7c. (d) The battery is charging from the regenerative brakes as in Figure 2.7d. (a) Both the battery and the fuel cell act as power sources. (b) All the power goes to the auxiliaries and thermal management. (c) Battery charging from the fuel cell. (d) Battery charging from the regenerative brakes. Figure 2.7: Schemes representing the cases of the vehicle’s power flow. The power balance equation can be written as Pem = Pfc + Pb − Pfc,heat − |Pth| −Qpass − Paux, (2.30) where Paux is the auxiliary power mainly used for air conditioning in the cabin. Combining this with equations (2.13) and (2.30) gives an equation for traction effort FT = 1 v (Pfc + Pb − Pfc,heat − |Pth| −Qpass − Paux − Pem,loss). (2.31) 2.2 Optimal control A common way of describing a system with nonlinear dynamics in continuous time is by using a system of differential equations of the form ẋ(t) = f(t, x(t), u(t)), (2.32) where t is time, x is the state vector and u is the (control) input vector. The state vector contains, at minimum, all the information necessary to unambiguously describe the current state of the system. The input vector represents an external impact coming from an actuator that is intended to affect the state of the system. 14 2. Theory Optimal control aims to find an optimal control input trajectory u∗(t) and a corre- sponding optimal state trajectory x∗(t) that minimises objective, or also called cost, function J . Usually, the actuator is limited by some region of operation, control constraint set U ∈ Rm, which the control trajectory must fulfil. The same applies to the state vector which must lie within a constraint set X ∈ Rn. In addition, if the system is analysed in a set time interval [t0, tN ], there will also appear an initial state x(t0) = x0 and a terminal set Xf which x(tN) must belong to. Those control trajectories, which meet the control and state constraints and lead the state trajectory to the target set Xf are called feasible. It is worth mentioning that the control constraint set U largely depends on the initial state x0, but also that it is generally not fixed as it may vary with time t and the state value x. The same goes for the state constraints X as they can also be time-dependent. All the afore- mentioned constraints can be expressed as general inequality constraints of the form g(t, x(t), u(t)) ≤ 0. So finally we can mathematically formulate the continuous-time optimal control problem as minimizex, u J(t, x(t), u(t)) (2.33a) subject to ẋ(t) = f(t, x(t), u(t)), (2.33b) x(t0) = x0, (2.33c) g(t, x(t), u(t)) ≤ 0, (2.33d) Optimal control problems are typically solved numerically with the help of compu- tational software, therefore the dynamics described in (2.32) must be discretised, which can be done by solving the initial value problem (IVP) described in (2.34) us- ing integration methods such as Euler or other higher order Runge-Kutta methods. After discretisation, the continuous objective function, dynamics and constraints need to be evaluated at N + 1 time instances t = t0, t1, ..., tN only. x(tn) = xn, ẋ = f(t, x(t), u(t)), for t ∈ [tn, tn+1] (2.34) It is assumed in this work that u is a piece-wise constant vector, that is u(t) = u(tn), ∀ t ∈ [tn, tn+1). (2.35) 2.2.1 Stochastic optimal control Commonly, system dynamics are modelled to contain parameters representing some sort of uncertainty. These can stem from external factors or internal mismatches in the modelled and actual dynamics. In the presence of disturbance with uncertain behaviour, here denoted w, the aim is to find the control input trajectory u∗(t) that minimises the expected value of the cost function. 15 2. Theory minimizex, u E [J(t, x(t), u(t))] (2.36a) subject to ẋ(t) = f(t, x(t), u(t), w(t)), (2.36b) x(0) = x0, (2.36c) g(t, x(t), u(t)) ≤ 0, (2.36d) w(t) ∼ fw(w(t)), (2.36e) where fw is the probability density function (pdf) of the uncertainty. 2.2.2 Mixed-integer optimisation If the problem described in equation (2.36) has one or several variables that are restricted to be an integer, for instance, ui ∈ Ui ⊂ Z, the cost function becomes non-smooth and non-differentiable, forcing one to solve a separate optimal control problem for many combinations of the discrete variables, which is, of course, very computationally demanding. However, dynamic programming is one of the algo- rithms that can handle such problems without any increase in the computational burden and more about it is written in the next subsection. 2.3 Dynamic programming Invented by Richard Bellman, dynamic programming (DP) has been used as an optimisation method since the 1950s and, naturally, has also found application in optimal control [35, 36]. As with optimal control, its purpose is to find a feasible optimal control trajectory u∗(t) for any initial state x0 over the time interval [t0, tN ]. As there are infinite time instances in the considered time interval, discretising the system is necessary, as mentioned in section 2.2. By discretising dynamical system for K+1 time instances t in [t0, tK ] the problem can be considered to be split into K subproblems, where for problem k = 0, . . . , K − 1 the initial state x0 = xk is constrained by X(tk), likewise, for the terminal state xN = xk+1 ∈ X(tk+1), and the optimal input u∗ k minimises the cost function J(t, x, u) over the interval [tk, tk+1]. This requires that the cost function J can also be split, such that over a time interval [tk, tk+1], J only depends on the input u and state x trajectories within that interval. In more abstract terms, the cost at time tk can be expressed as the cost of getting from xk to xk+1 and the cost associated with getting from xk+1 to xK . If it is possible splitting the problem in such a way the necessary conditions for DP are fulfilled, and an action-value equation can be introduced, defined by the Bellman equation Q(t, x, u) = L(t, x, u) + V (t+, x+), (2.37) where the first term L is commonly referred to as the stage cost, that is the cost associated with taking the action u at time t in state x and V is the optimal cost of 16 2. Theory the remaining subproblems, meaning V is the optimal solution to the action-value function Q at the next time instance t+ and the resulting state values x+ from taking control action u, expressed as V (t, x) = min u Q(t, x, u). (2.38) The control input u that minimises the action-value function at time t in state x is referred to as the optimal policy, given by π(t, x) = argmin u Q(t, x, u), (2.39) which, when combined with equation (2.38) gives the Bellman equation for the value function as V (t, x) = L(t, x, π(t, x)) + V (t+, x+), (2.40) expressing the value in state x at time t. The purpose of DP is to consecutively determine the value function and the pol- icy for all states and for each time step to then save this policy as a look-up table allowing one to determine the optimal input trajectory u∗(t) for any given initial value x(t0) by iteratively looking up the policy π(x) and integrating the dynamics to attain x+. Noting how the action-value function depends on the value for future actions but is independent from actions take at previous time instances one can conclude that the policy must be determined recursively, starting at tK and step- ping backwards in time, commonly known as backward induction. The backward induction is initialized by defining the value at the final time instance tK as V (tK , x) = T (x), (2.41) where T is common notation for the terminal cost function. Thus, the cost function J has been split according to J(t, x(t), u(t)) = K−1∑ i=1 L(ti, x(ti), u(ti)) + T (x(tK)). (2.42) 2.3.1 Stochastic dynamic programming In the presence of external stochastic factors in the system, the model dynamics become uncertain such that the continuous system dynamics become ẋ(t) = f(t, x(t), u(t), w(t)), (2.43) thus making x+ and therefore V (t+, x+) stochastic variables in the value function in equation (2.40), which require additional measures to be taken in the approach described in the previous section. In order to still fulfill the criteria for DP the un- certainty w of the system must have the Markov property, meaning the uncertainty of the system at a certain time instance can only depend on the state x and input u at that time instance. This makes the optimal control problem a Markov decision process, and using DP to solve it is referred to as stochastic dynamic programming 17 2. Theory (SDP)[37]. Since the value V (x) is now stochastic, its values in the Bellman equations (2.37) and (2.40) are represented by the expected values Q(t, x, u) = L(t, x, u) + E [ V (t+, x+) | t, x, u ] . (2.44a) V (t, x) = L(t, x, π(x)) + E [ V (t+, x+) | t, x, π ] , (2.44b) To calculate the expected value it is convenient to introduce a probability density function that would describe the conditional probability that the system’s next state value will be x+ given that the system currently is at state x, the action u is taken and the stochastic uncertainty w affects the system, at time instance t. Written mathematically, that would be fx+| t,x,u(x+), which together with the definition of mathematical expected value can be used to write the Bellman equations for the stochastic case as Q(t, x, u) = L(t, x, u) + ∫ ∞ -∞ V (t+, x+)fx+| t,x,u(x+)dx+, (2.45a) V (t, x) = L(t, x, π(x)) + ∫ ∞ -∞ V (t+, x+)fx+| t,x,u(x+)dx+. (2.45b) 18 3 Methods This chapter details how the hierarchical strategy was set up, how the route was sampled, how cost functions were modelled and how the strategy evaluation was set up. 3.1 Solution strategy Before going to concrete mathematical formulations of the problems to solve, it is worth describing the general strategy that was used to solve the stated problem of mission management. As mentioned before, the problem contains binary variables corresponding to deci- sions to stop at hydrogen stations along the route. These decision variables make the objective function discontinuous and, therefore non-differentiable and unapproach- able as an NLP. Thus it becomes a mixed-integer nonlinear program (MINLP) that has a combinatorial nature and is very computationally expensive to solve as the solution lies in solving a separate NLP problem for many combinations of the binary variables as stated in 2.2.2. In this context, the work in this thesis seeks to investigate the benefits and draw- backs of an approach with the problem decomposed into two parts. The first part, the so-called top layer, was to solve the MINLP problem with a method that can handle binary variables, in this case SDP. Since dynamic programming explores ev- ery possible combination of states and inputs its computational complexity increases exponentially with the number of states and inputs, as well as stochastic variables. Due to this the mathematical model used in this layer had to undergo some sim- plifications to account for that. One of the main simplifications was the exclusion of the thermal management of the battery from the scope of the optimisation, thus making battery temperature a fixed parameter for the entire route. As illustrated in Figure 3.1, the layer takes mission and vehicle parameters as input and outputs the optimal station selection, eliminating this decision from the second part, the so-called bottom layer. Since the binary decisions are taken the problem is no longer mixed-integer and a global NLP optimisation can be conducted with more accurate and comprehen- sive dynamics, more inputs and more states. This layer outputs the final speed, battery temperature and power split trajectories. Unlike the top layer, the bottom 19 3. Methods layer does not account for any uncertain parameters. Figure 3.1: Block diagram illustrating the hierarchical approach. It is essential to note that in order to reduce the number of states, speed was not optimised in the top layer and a predefined base speed profile was calculated instead. Therefore, a feedback loop with an optimised speed profile from the bottom layer to the top layer was implemented as an option. This feedback is illustrated with a dashed line in Figure 3.1. 3.2 Discretisation As mentioned in section 2.2, the problem needs to be discretised to be solved nu- merically by a computer. The problem is split into a number of stages, K for the top layer, and N for the bottom layer, for which an optimisation problem is solved separately. The route thus has to be divided into K and N road segments respec- tively, over which the dynamics are integrated in the solving process. This section describes how the problem is discretised and how the data is sampled to reach the desired number of data points. 3.2.1 Space sampling Since the positions of hydrogen stations are fixed along the route, it poses a more convenient solution to solve the problem in the spatial domain, meaning that all the time derivatives written in section 2.1 have to be rewritten as spatial ones. For the rest of this paper, the notation x′ will represent the space derivative of an arbitrary variable x, i.e. x′ = dx/ds, where s is the vehicle’s position along the route. The transition from the time derivative ẋ to the space domain is done using the equation x′ = dx ds = dx dt dt ds = ẋ v . (3.1) The differential equations (2.1), (2.21), (2.25) and (2.26) thus become mv′ = 1 v (FT − Frr − Fair − Fg − Fbr), (3.2a) H′ 2 = −1 v (a2P 2 fc + a1Pfc + a0) + δstδDirac(s)H2,rf , (3.2b) 20 3. Methods SoC ′ = −Pb vCbUoc(SoC, Tb) , (3.2c) T ′ b = Qpass + Qact + Qamb vcbmb . (3.2d) δDirac(s) in (3.2b) is the Dirac delta function needed to demonstrate that refuelling happens instantly in the space domain. Spatial sampling will be used for both layers in the control hierarchy. The Bellman equations for stochastic dynamic program- ming, given in (2.44), will also have to be rewritten in the space domain Q(s, x, u) = L(s, x, u) + E [ V (s+, x+) | s, x, u ] . (3.3a) V (s, x) = L(s, x, π(x)) + E [ V (s+, x+) | s, x, π ] , (3.3b) 3.2.2 Data downsampling Originally, road data parameters such as position, altitude, speed limits and weather are provided at a much higher resolution than the desired number of stages in the solver. Hence, the data must be downsampled to K and N data points, a process that can have a significant effect on the simulation’s accuracy. Unless the original number of data points is evenly divisible by the desired data points the data can not be downsampled into segments of equal distance, so-called equidistantly, with- out some sort of mathematical interpolation. The process of interpolating and then downsampling data introduces data points that were originally not part of the data set which risks decreasing the accuracy of the data. Additionally, different road segments affect the dynamics differently. A long flat road segment requires only two data points to be accurately described, while a hilly segment of the same length requires more data points to be described with equal accuracy. The parameter with the largest effect on dynamics is altitude. By downsampling the road data using an algorithm that preserves the highest detail in the altitude profile, while not introducing interpolated data points, the simula- tion should in theory become more accurate with the same number of data points N . Such an algorithm would be computationally intensive, however, an algorithm aim- ing to achieve close to optimal results can have a much lower computation time at the trade-off of optimality. One created by the authors of [4] was used to downsam- ple data non-equidistantly to preserve the detail in altitude with the intention of decreasing computation time. In essence, the algorithm splits the data into windows and eliminates redundant data points. A way to evaluate the detail retention is to upsample to the original resolution using first-order hold and calculate the mean squared error (MSE). Another proposed metric is the cumulative error in energy consumption when simulating a truck driving the downsampled routes compared to the original route. A comparison of equidistant and non-equidistant downsampling was done for two routes. For the first route, of length 1793 km, the data was downsampled to 21 3. Methods N = 100, . . . , 1000 stages, while for the second route, of length 461 km, the data was downsampled to N = 100, . . . , 500 stages. To calculate the energy consumption a speed profile was generated by generating a noisy speed profile around 70 km/h and integrating equation (3.2a) using explicit Euler. Additional parameters such as speed limits and station positions were considered, but ultimately no changes were made to the algorithm. This means that currently speed limit changes and hydrogen stations are positioned at the longitudinally clos- est data point post-downsampling. 3.3 Problem formulations In this section the models and cost functions of the two layers are detailed. 3.3.1 Top layer – stochastic dynamic programming The purpose of the top layer is to select stations from the available ones at which to stop and fuel during the mission and decide how much hydrogen should be refuelled. Mathematical model States and inputs of the system will thus be x(s) = [ SoC H̄2 ] , u(s) = [ Pfc H̄2,rf ] , (3.4) where H̄2 = H2/CH2 is the normalised hydrogen level in the tank and H̄2,rf = H2,rf/CH2 is the normalised amount of hydrogen refuelled at a hydrogen station. Note that if H̄2,rf = 0, the vehicle should pass the station without refuelling, which implies that this input doubles as a binary variable. As vehicle speed is neither a state nor an input it must be deterministic, or mod- elled as a stochastic variable. Preprocessing of vehicle and road data, such as speed limits, power output limits, road grade, desired arrival time and maximum risk of failure is done to produce a predefined base speed profile. The algorithm for this can be found in Appendix A.1. The dynamics used to find x+ in the SPD algorithm in equation (3.3) start with the power input to the EM, given by combining equations (3.2a) and (2.12) Pem = ηtmηemv (v′mv + Frr + Fg + Fair) . (3.5) As road parameters are known, as well as a predefined speed trajectory, the input power to the EM can be determined and is henceforth referred to as the power de- mand, as it governs how much power needs to be output by the DC bus to meet the demand from the machine to fullfil the differential equations. One can notice that 22 3. Methods Pem depends on several external factors, such as road grade, traffic speed, wind di- rection and speed and rolling resistance which all display a random behaviour in real life, or are impossible to predict flawlessly hours in advance. This makes modelling Pem as a single stochastic variable a suitable choice, to represent the uncertainty while reducing the computational complexity. In short, a number of vehicle and mission-specific parameters determine the shape of the probability distribution of Pem, which differs for each spatial step. A more detailed description of Pem mod- elling is done later in this section. Assuming constant battery temperature, the vehicle’s power balance (??) can be simplified to Pfc + Pb = Pfc,heat + Paux + Pb,loss + Pem + Pem,loss, (3.6) where, as battery temperature is outside of the optimisation scope in this layer, battery losses are modelled as a quadratic fit independent of the battery temperature Pb,loss = (b0 − b1SoC)P 2 b , b0, b1 ≥ 0. (3.7) This makes equation (3.6) quadratic in Pb and can for a given Pem and a selected Pfc be solved to find Pb. If, at a certain spatial step, no solution exists, or the solution exceeds the state of charge-dependent power output limit Pb,max(SoC) the power demand can not be met for the given Pfc and it is not considered for the policy π. If the solution exceeds the regenerative charging power limit Pb,min(SoC), or the power needed to recharge the battery fully, given from inverting (3.9), another term is added to the power balance, namely the braking power Pbr, to allow for a solution to be found, given the braking power is minimised. Equation (3.6) then becomes Pfc + Pb − Pbr = Pfc,heat + Paux + Pb,loss + Pem + Pem,loss. (3.8) It should be noted that friction brakes Pbr are applied independently of the EM and do not induce additional losses in the EM, thus this representation is not perfectly accurate. Finally, after Pb has been calculated, the next state values x+ can be found by integrating differential equations x′ = [ SoC ′ H̄′ 2 ] =  − Pb vCbUoc(SoC) − 1 vCH2 ( a2P 2 fc + a1Pfc + a0 ) + δstδDiracH̄2,rf  . (3.9) Objective function The dynamics in section 3.3.1 return three outputs: the battery power consumption Pb, the braking power required Pbr, and a binary variable δπ indicating whether the input should be considered for the policy or not, determined by its feasibility, as mentioned above. y(s) =  Pb Pbr δπ  (3.10) 23 3. Methods Additionally, a separate policy is determined for each of the control inputs, as π(s, x(s)) = [ P ∗ fc H̄∗ 2,rf ] . (3.11) The terminal cost function is defined as T (x(sN)) = x⊤ e εexe + εC , xe = x(sN)− xtarget, (3.12) where εe = [ εe,SoC 0 0 εe,H2 ] (3.13) are the costs of deviating from the desired SoC and H2 targets and εC is the cost of mission completion. The cost at any other spatial point is the sum of the cost of driving and refuelling, as long as the risk of failure is below or equal our risk tolerance κ, as such L(s) = Ldr(s, x(s), u(s)) + Lrf (s, x(s), u(s)), Pr (δπ(s) | u(s)) ≤ κ ∞, Pr (δπ(s) | u(s)) > κ , (3.14) The cost function of driving is in turn the sum of three costs Ldr(s, x(s), u(s)) = Llim(x(s)) + Ltr(x(s), u(s)) + Lbr(y(s)), (3.15) where the cost of leaving the state constraint set is Llim(x(s)) = Llim,SoC(x(s)) + Llim,H2(x(s)), (3.16) Llim,xi (x(s)) =  εllim,xi xi,min − xi(s) xi,min , xi(s) < xi,min εulim,xi xi(s)− xi,max xi,max , xi(s) > xi,max εfail, xi(s) /∈ Xi = 0, otherwise , (3.17) where εulim,xi and εllim,xi are the costs of breaching a state’s upper and lower bounds {xi,min, xi,max} ⊆ Xi, respectively, and εfail is the cost of failing the mission by exiting the feasible state space Xi, the cost of breaching the maximum allowed level of the fuel cell’s transient is Ltr(x(s), u(s)) = εtr, |P ∗ fc(s+, x+(s))− Pfc(s)| > P ′ fc,max∆s(s) 0, otherwise , (3.18) where εtr is the cost of exceeding the fuel cell’s transient’s maximum allowed level, and the cost of using friction brakes Lbr(y(s)) = εbr Pbr(s)∆s(s) Fbr,maxv(s) , (3.19) 24 3. Methods where εbr is the cost of friction braking. The cost function of refuelling is calculated as follows Lr(s, x(s), u(s)) = δst(s) ( εfuel(s)H̄2,rf (s)CH2 + εstop(s) ) H̄2,rf (s) > 0 0, H̄2,rf (s) = 0 , (3.20) where εfuel and εstop are the costs of fuel (hydrogen) and stopping, respectively. Since the output y(s) is stochastic, making L(s) stochastic, both terms in the action- value function in equation (2.44a) are stochastic and the action-value function is calculated as Q(s, x(s), u(s)) = E [ L(s, x(s), u(s)) + V (s+, x+(s)) | s, x(s), u(s) ] . (3.21) The value function and optimal policy for each input is saved at every stage as V [k, x, u] = min u Q(s[k], x, u) (3.22a) P ∗ fc[k, x] = argmin Pfc Q(s[k], x, u), k = 1, . . . , K (3.22b) H̄∗ 2,rf [k, x] = argmin H̄2,rf Q(s[k], x, u), δst(s[k]) = 1 (3.22c) As these are only saved for a discrete set of stages, states and inputs linear inter- polation is done to calculate the value and policy at any point in space and for any state and input values with the exception of refuelling decisions H̄2,rf . As this input has an inherent binary interpretation its value is often discontinuous with regard to state values. Hence, it is important to identify discontinuities and interpolate using a different method around these. An example of this is shown in Figure 3.2 where an estimation of the discontinuity is illustrated and the difference between linear and nearest interpolation is shown. An intuitive explanation of this is that the truck will either pass the station or stop and refuel enough to finish the mission without stopping again. Refuelling less than the policy causes the truck to need another stop later on, which is not cost-effective. Figure 3.2: Comparison of interpolation methods in discontinuous region. 25 3. Methods Stochastic variables Implementing dynamic programming means quantising states, inputs and distur- bances in a range of interest as xi = x1 i , . . . , x Mxi i ∈ Xi where Mxi represents the number of discrete values at which to determine a policy for state xi. Likewise, Mui denotes the number of inputs to consider for the policy of input ui, and Mwi the number of samples from the random distribution of wi to perturb the system with. The number of calculations to be made for each spatial sample can thus be calculated by ∏ j Mj. Because of this, the number of states and inputs is already kept to a minimum; likewise, this is desired for the number of disturbances. Hence, it is of interest to approximate the disturbances with a single variable and associated probability density function. The variables that were modelled as stochastic were wind speed w, wind angle θ, rolling resistance coefficient Crr due to road conditions, mass m, and speed re- duction due to traffic ∆vtr. As all of these eventually affect the power demand Pem it was chosen as the variable which the single disturbance should affect. The vehicle dynamics are not linear in all stochastic variables, nor are all of the variables modelled as Gaussian distributions, hence it is not possible to implicitly derive a density function for a desired combined variable. However, by sampling each stochastic variable as shown in Figure 3.3 and calculating the power demand for the sampled values, akin to a Monte Carlo method, it is possible to determine a combined probability distribution from the resulting data points. The distribution could then be approximated using a normal distribution as shown in Figure 3.4, allowing it to be represented mathematically using only two parameters, a mean and a variance. 0 10 20 30 0 0.02 0.04 0.06 0.08 0.1 39.5 40 40.5 0 0.5 1 1.5 10 -3 0.0035 0.004 0.0045 0.5 0.6 0.7 0.8 0.9 1 0 10 20 30 40 50 0 0.02 0.04 0.06 0.08 probability sampled points Figure 3.3: Illustration of stochastic variables’ sampling. 26 3. Methods 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.002 0.004 0.006 0.008 0.01 0.012 simulation normal fit base demand Figure 3.4: Illustration of normal approximation of the stochastic power demand. Probability density function of total power demand (v = 70 km/h, α = 0, v′ = 0). As some stochastic variables were modelled as a distribution around a given base value, i.e. mass was normally distributed around a known approximate vehicle mass m and resulting speed was the product of the base speed v and the speed reduction from traffic, the resulting power demand’s mean µPem and standard deviation σPem become a function of some base values, as well as some spatially varying parameters such as road grade and acceleration, which have a large effect on the power demand. µPEM (s) = µPEM (Crr0, m, v(s), v′(s), α(s)) (3.23a) σPEM (s) = σPEM (Crr0, m, v(s), v′(s), α(s)) (3.23b) A joint probability distribution for vehicle speed given power demand was also de- termined, with covariance 1, as the speed is used in the integration step to calculate the total power consumption, see equation (3.9). Consequently, the distribution of energy consumption varies less than that of power demand, as outliers in power demand correlate with a decreased speed, and a change in speed entails an implicit change in the step size in the Euler integration. 3.3.2 Bottom layer – non-linear programming Mathematical model After refuelling decisions have been made at the upper level and the problem is no longer mixed-integer, an NLP with more accurate and comprehensive dynamics can be formulated. The states and inputs of the system in discrete distance space thus 27 3. Methods become x[n] =  SoC H̄2 t v2 Tb Ffc[n− 1]  , u[n] =  sH2 sSoC st Ffc Fb Fbr Pth  , (3.24) where sH2 , sSoC, st are slack variables that are used for bound relaxation for H2, SoC, t, respectively. Using distance sampling substantially increases the complexity of the state dynam- ics, especially for the vehicle’s speed. This complexity can be partly eliminated by using v2 instead of v as a state representing the vehicle’s speed and that is thanks to the following property dv2 ds = 2v dv ds = 2v dv dt dt ds = 2v̇. (3.25) In order to make the equation of speed dynamics linear in control inputs, some power control inputs have been replaced by their force counterparts. The relation between the two can be described as Ffc = Pfc v , (3.26a) Fb = Pb v , (3.26b) Fbr = Pbr v . (3.26c) Using the mathematical relations written before and accounting for some transfor- mation made specifically for this layer, the state dynamics become SoC ′ = −Fb CbUoc(SoC, Tb) , (3.27a) H̄′ 2 = − 1√ v2CH2 ( a2F 2 fcv 2 + a1Ffc √ v2 + a0 ) + δstδDiracH̄2,rf , (3.27b) t′ = 1√ v2 + δstδDirac · (trf + tq), (3.27c) (v2)′ = 2 m ( Ffc + Fb − Fbr− − Qpass + PEM,loss + |Pth|+ Pfc,heat + Paux√ v2 − Frr − Fair − Fg ) , (3.27d) T ′ b = Qpass + Qact + Qamb√ v2cbmb . (3.27e) 28 3. Methods tq in equation (3.27c) is queing time of a hydrogen station. The complete discretised mathematical formulation of the problem can be found in Appendix B. It is solved numerically with the help of Embotech’s FORCESPRO solver [38] which utilises a Newton method-based approach by the name of nonlinear interior point (IP). More information about the solver can be found in [4] and [39]. The problem is solved to minimise the objective function described below. Objective function The objective function for the bottom layer is defined as follows J(x, u) = ωtt[N ]+ + N∑ n=1 (ωs(sH2 [n] + sSoC [n] + st[n]) + ωH2δst[n]H2,rf [n] + ωbrFbr[n]∆s[n]) , (3.28) where time cost ωt is used to minimise the time spent en route, hydrogen cost ωH2 is used to minimise the refuelled amount of hydrogen, slack variables costs ωs is used to minimise their use and enforce constraint fulfilment and cost of friction braking ωbr is used to minimise the use of friction brakes. 3.4 Simulation setups Reiterating the research questions asked: • Can splitting the mixed-integer problem into two layers, solving the integer decisions in the higher layer, thus making the second layer a smooth non-linear problem, yield a similar optimal solution as the smoothened MINLP problem? • Further, can modelling certain dynamics as stochastic processes improve the robustness of the refuelling station selection? Aiming to answer these questions a number of simulation tests were set up, each quantifying the claims above or additional metrics that were deemed interesting. The test cases aim to answer the following questions: Test 1 Does the bilevel approach produce results of equal or lower total cost, and if so, how consistently? In the case of disturbances, does the stochastic strategy find a feasible solution more consistently than the deterministic strategy? Test 2 Are there any possible benefits to be had when iterating the two layers until convergence? Test 3 What is the tradeoff between computational time and optimality or robust- ness? Can sampling the route non-equidistantly assist the strategy in any of the above questions? 3.4.1 Test 1 – Robustness analysis The first set of simulations was done to evaluate the optimality and robustness of station selection using the two-layer approach. The following strategies were compared: 29 3. Methods SDPκ Use a bilevel approach with stochastic dynamic programming selecting sta- tions, with a maximum failure risk of κ. DP Use a bilevel approach with deterministic dynamic programming selecting sta- tions. Sigm Use a sigmoid activation function, similar to the one in [4], to smoothen the MINLP problem allowing stations to be selected directly in one layer. Heur Use a bilevel approach with a heuristic algorithm based on estimated power consumption selecting stations. The algorithm can be found in Appendix A.2. The simulations were conducted on two routes: one from Zagreb, Croatia to Paris, France (1559 km) and another from Gothenburg, Sweden to Kiruna, Sweden (1793 km). In Tables 3.1 and 3.2, the parameters of the hydrogen stations that lie along the chosen routes are listed. Using the different strategies listed above as the top layer, selecting stations and refuelling amount, the evaluation was then based on the total cost J , given from running the top layer results in the bottom layer (NLP). The same simulation was then done but with a constant disturbance applied to the rolling resistance to simulate wet road conditions due to precipitation. The purpose of this was to see whether the station selection and refuelling amount were sufficient in an uncertain scenario. The parameters used for the top and bottom layer were K = 600, N = 800, Mx1 = Mx2 = 31, Mu = 21, Mw = 13 and with initial and terminal conditions x1[0] = x2[0] = 0.90 and x1[N ] = 0.30, x2[N ] = 0.15. Table 3.1: Station positions and hydrogen prices for the Zagreb–Paris route. Station Position [km] Price [SEK/kg] Station Position [km] Price [SEK/kg] 1 93.64 60.3 17 587.18 67.0 2 96.56 83.0 18 602.79 62.9 3 109.24 59.7 19 748.12 74.5 4 141.43 84.7 20 821.28 72.8 5 156.06 69.6 21 824.20 73.5 6 223.36 73.9 22 877.85 61.8 7 249.70 79.3 23 922.72 62.5 8 273.11 70.6 24 977.34 61.4 9 310.17 62.9 25 1056.34 64.0 10 323.83 59.1 26 1084.62 57.5 11 342.36 55.3 27 1122.66 58.3 12 376.50 78.3 28 1152.91 86.2 13 474.04 67.9 29 1164.61 86.1 14 503.30 58.0 30 1362.61 59.1 15 549.14 83.3 31 1392.85 53.4 16 565.72 62.1 32 1426.01 59.7 30 3. Methods Table 3.2: Station positions and hydrogen prices for the Gothenburg–Kiruna route. Station Position [km] Price [SEK/kg] Station Position [km] Price [SEK/kg] 1 6.73 67.1 19 667.27 59.4 2 13.46 77.7 20 670.63 80.5 3 48.22 52.5 21 712.12 86.4 4 50.47 63.1 22 745.77 63.5 5 70.65 57.6 23 753.62 76.7 6 172.70 55.7 24 802.96 83.2 7 176.07 59.0 25 855.67 83.8 8 235.51 64.6 26 905.01 55.5 9 259.06 66.4 27 907.26 53.9 10 288.21 71.4 28 908.38 58.4 11 327.46 67.2 29 1025.01 83.2 12 344.29 76.5 30 1160.71 55.9 13 373.44 59.7 31 1166.31 67.2 14 382.42 83.2 32 1167.43 86.0 15 513.63 53.5 33 1304.25 71.2 16 527.08 67.1 34 1503.87 76.7 17 529.33 72.1 35 1790.96 63.5 18 617.92 57.4 36 1792.09 76.5 3.4.2 Test 2 – Iterative method The premise of this test is that on the one hand, the optimal station selection is dependent on the speed travelled, as that determines the efficiency of the energy consumption and the total trip time, while on the other hand, the optimal speed profile is dependent on which stations were selected, i.e. when the truck needs to lower the speed to have sufficient energy until the next station is reached. This two-way dependency is one reason why the hierarchical strategy can not be con- sidered globally optimal, as it optimises one variable at a time. One strategy to mitigate this drawback of the bilevel approach is to iterate the two layers until the solution potentially converges to a solution upon which no further improvements can be made. This is no guarantee that the solution will be globally optimal, but rather improving the solution marginally. The idea is thus to run the mission optimiser, save the speed profile and rerun the mission optimiser using this optimal speed profile to potentially get a new opti- mal station selection, for which a new optimal speed profile is found, as illustrated in Figure 3.1. When doing so, a question arises: Should the SDP policy be regenerated using the new speed profile as well, or just the forward simulation? Generating the SDP policy makes up a majority of the computation time of the mission optimiser, incentivising reusing the policy generated initially. On the contrary, the policy is determined for a certain speed profile, meaning if this changes significantly the pol- icy is no longer optimal and might not give desirable results. 31 3. Methods The simulations run in this test were done for three different routes, with two sets of parameter values, specifically the stage resolution for the two layers, K and N , and trying both approaches of regenerating the SDP policy for each iteration and utilising the initial SDP policy. This comes out to 12 simulations in total, for which the parameters are detailed in Table 3.3. All simulations use non-equidistant sam- pling, and a risk tolerance of κ = 0.05. As conditions for convergence, the stations selected should be the same for three consecutive iterations and the total cost J should not increase or decrease by more than 1% from the previous iteration, with a maximum of 10 iterations allowed. Table 3.3: Initial and terminal state constraints, stage quantisation and SDP regeneration decision for simulations 1-12. simulation x1[0] x2[0] x1[N ] x2[N ] K N reg. SDP Zagreb – Paris 1 0.90 0.90 0.30 0.15 250 500 no 2 0.90 0.90 0.30 0.15 250 500 yes 3 0.90 0.90 0.30 0.15 500 1000 no 4 0.90 0.90 0.30 0.15 500 1000 yes Gothenburg – Kiruna 5 0.90 0.90 0.30 0.15 250 500 no 6 0.90 0.90 0.30 0.15 250 500 yes 7 0.90 0.90 0.30 0.15 500 1000 no 8 0.90 0.90 0.30 0.15 500 1000 yes Gothenburg – Rødby 9 0.90 0.90 0.75 0.75 250 500 no 10 0.90 0.90 0.75 0.75 250 500 yes 11 0.90 0.90 0.75 0.75 500 1000 no 12 0.90 0.90 0.75 0.75 500 1000 yes 3.4.3 Test 3 – Resolution analysis Another set of simulations was be aimed at investigating how different resolutions of the state vector quantisation in the MINLP M and stage quantisation in both problems K, N can affect the optimality of the overall solution. The purpose of this test is to determine whether the two-layer solution can perform competitively at a similar computation time as the other solutions. The simulations were conducted based on a route from Gothenburg, Sweden to Kiruna, Sweden (1793 km). All combinations of the following parameters were tested: • Mx1 = Mx2 = {11, 21, 31}, • N = K = {400, 800, 1200, 1600}, and the parameters total cost J , the total amount of refuelled hydrogen, the selected stations, the travel time tN , and the computation time will be evaluated. Initial and terminal conditions were x1[0] = x2[0] = 0.90, x1[N ] = x2[N ] = 0.2. One set of simulations were run using equidistant sampling, and one using non-equidistant sampling. 32 4 Results 4.1 Downsampling The results for the comparison between the equidistant and non-equidistant down- sampling of the road data is shown in Figure 4.1. An exponential fit was made to the MSE data for route one, illustrated in Figure 4.2. This shows that for any N over 175, a 10% increase in N is required to achieve the same MSE for equidistant as when using non-equidistant. Looking at total energy consumption error the non- equidistant sampling performs better on average than equidistant for route one and worse on average for route two. The total error in energy consumption is notably less than 0.3% for as low as 100 stages using both methods. (a) MSE for route one. (b) Energy consumption for route one. (c) MSE for route two. (d) Energy consumption for route two. Figure 4.1: Comparison of equidistant and non-equidistant sampling for two dif- ferent routes, with original resolutions of N = 15000 and N = 6000, respectively. 33 4. Results Figure 4.2: exponential fit of MSE data for route one. 34 4. Results 4.2 Test 1 – Robustness analysis The results of the simulations for the two chosen routes with and without uncertainty added can be seen in Tables 4.1 - 4.4 below. A dagger (†) signifies a failure of convergence of the solver in the bottom layer, determined either from the solver’s output message or by observing the results. Failure to converge can be a consequence of the station selection being infeasible, but such deductions should be made with caution. Due to the non-linearity of the problem tiny changes in the parameters can cause the solver to not converge to a solution. Table 4.1: Simulation results with different methods for the route Zagreb–Paris with no uncertainty applied. method J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] variable H2 cost, equidistant sampling SDPκ=0.05 394664 1301 18 31 0.7463 1598 68.61 SDPκ=0.50 394592 1301 18 31 0.7471 1598 55.65 DP † † 22 28 0.4473 † 56.67 sigm 559505 1864 16 0.3857 1599 80.76 heur 394739 1302 20 31 0.7437 1598 20.09 constant H2 cost, equidistant sampling SDPκ=0.05 393973 1301 18 31 0.7379 1598 54.01 SDPκ=0.50 394612 1302 18 31 0.7518 1598 55.75 DP 385738 1283 17 0.4479 4488 56.74 sigm 536466 1859 16 0.3931 1599 40.01 heur 395381 1302 20 31 0.7437 1598 19.65 constant H2 cost, non-equidistant sampling SDPκ=0.05 394281 1302 17 24 0.8058 1598 67.09 SDPκ=0.50 400229 1322 17 28 30 0.7861 1598 68.73 DP † † 18 0.4445 † 56.53 sigm 504915 2307 16 0.3687 1599 39.92 heur 394106 1302 20 31 0.7437 1598 19.67 variable H2 cost, non-equidistant sampling SDPκ=0.05 397766 1307 19 32 0.7341 1598 67.41 SDPκ=0.50 402050 1308 19 32 0.7372 1598 67.97 DP † † 23 0.4209 † 56.93 sigm 486618 2133 17 26 0.3973 1599 40.93 heur 398233 1309 21 32 0.7279 1598 38.13 35 4. Results Table 4.2: Simulation results with different methods for the route Zagreb–Paris, with uncertainty applied after the station selection. method J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] variable H2 cost, equidistant sampling SDPκ=0.05 701179 2015 17 28 0.9448 1602 44.59 SDPκ=0.50 707140 2018 17 28 0.9439 1602 45.88 DP 516429 1710 20 21 26 0.5479 1598 34.03 sigm 509478 1689 11 16 0.4200 1598 123.04 heur 682114 1976 18 28 0.9357 1602 15.44 constant H2 cost, equidistant sampling SDPκ=0.05 724601 2063 14 28 0.9538 1603 68.86 SDPκ=0.50 708528 2019 13 25 0.9554 1602 70.32 DP 504097 1671 20 0.5819 1598 33.65 sigm 513491 1689 11 16 0.4198 1598 82.87 heur 681484 1976 18 28 0.9357 1602 16.21 constant H2 cost, non-equidistant sampling SDPκ=0.05 † † 15 26 0.9568 † 72.05 SDPκ=0.50 † † 15 25 0.9486 † 42.91 DP 510064 1691 18 21 0.5914 1598 32.60 sigm 509803 1690 11 16 0.4208 1598 81.09 heur 681523 1976 18 28 0.9357 1602 15.43 variable H2 cost, non-equidistant sampling SDPκ=0.05 † † 18 29 0.9359 † 70.91 SDPκ=0.50 † † 18 29 0.9417 † 67.98 DP 581831 1718 21 27 29 0.5762 1601 55.19 sigm 563614 1708 17 27 0.5559 1600 76.90 heur † † 19 29 0.9186 † 38.60 In Tables 4.1 and 4.2 the deterministic DP strategy performs the best, cost-wise, while failing to converge three out of eight times. The heuristics algorithm performs on par with the SDP strategy for the deterministic simulations and outperform it for the uncertain simulations. The sigmoid strategy is the only one to converge every time, performing the worst in the deterministic set, but the best out of all the strategies in the uncertain set. Perhaps this is a consequence of the uncertainty being part of the bottom layer optimisation and not simulated in an additional plant-controller model, indicating that the method of evaluation is flawed. 36 4. Results Table 4.3: Simulation results with different methods for the route Gothenburg– Kiruna with no uncertainty applied. method J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] constant H2 cost, equidistant sampling SDPκ=0.05 397956 1294 18 0.7042 16002 67.24 SDPκ=0.50 397826 1293 18 0.7016 16002 69.00 DP 396182 1292 23 0.4725 16001 58.51 sigm 481240 1383 17 0.4297 160020 18.80 heur 409092 1330 20, 31 0.7424 16005 44.11 variable H2 cost, equidistant sampling SDPκ=0.05 415933 1352 18, 28 0.8137 16009 52.03 SDPκ=0.50 403478 1314 19, 20 0.7285 16003 73.81 DP 396013 1292 19 0.4948 16001 64.27 sigm 481137 1383 20, 31 0.4305 160020 16.60 heur 408950 1330 20, 31 0.7424 16005 45.21 variable H2 cost, non-equidistant sampling SDPκ=0.05 397340 1294 19 0.7286 16003 65.88 SDPκ=0.50 397701 1294 18 0.6979 16002 65.56 DP 395816 1292 17 0.4704 16001 60.35 sigm 866507 1394 17 0.4534 3982674 16.57 heur 409507 1330 20, 21 0.7424 16005 50.17 constant H2 cost, non-equidistant sampling SDPκ=0.05 934964 1300 19 0.7057 5365848 70.13 SDPκ=0.50 648533 1302 19 0.7071 2496247 68.39 DP 647081 1301 24 0.4848 2496246 58.76 sigm 866605 1393 17 0.4551 3980611 18.01 heur 655704 1326 21, 32 0.7337 2496249 44.22 37 4. Results Table 4.4: Simulation results with different methods for the route Gothenburg– Kiruna, with uncertainty applied after the station selection. method J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] variable H2 cost, equidistant sampling SDPκ=0.05 505343 1648 4 23 30 31 2.0834 1598 52.52 SDPκ=0.50 505400 1649 4 23 30 31 2.0830 1598 15.00 DP † † 4 26 31 1.7397 † 28.22 sigm † † 9 14 15 22 29 31 1.8598 † 51.05 heur 498882 1628 5 23 30 31 2.0250 1598 2.90 constant H2 cost, equidistant sampling SDPκ=0.05 503405 1648 4 16 29 31 2.0948 1598 7.40 SDPκ=0.50 503238 1649 4 19 30 31 2.1028 1598 13.09 DP 556520 2468 3 24 30 1.7538 693787385 26.10 sigm † † 14 15 22 29 31 1.7359 † 56.88 heur 499019 1628 5 23 30 31 2.0250 1598 53.96 constant H2 cost, non-equidistant sampling SDPκ=0.05 505391 1649 4 22 30 31 2.0857 1598 66.82 SDPκ=0.50 504057 1649 1 18 29 31 2.0996 1598 70.11 DP 569126 1808 4 26 31 1.7445 1595 89.24 sigm † † 14 15 22 29 31 1.7520 † 61.44 heur 498183 1628 5 23 30 31 2.0250 1598 55.24 variable H2 cost, non-equidistant sampling SDPκ=0.05 510085 1664 5 25 34 36 2.0994 1598 100.08 SDPκ=0.50 516100 1684 5 25 32 33 36 2.0997 1598 110.06 DP † † 5 29 36 1.7496 † 76.08 sigm † † 1 3 4 15 18 25 27 28 33 36 2.0541 † 48.36 heur 503535 1643 7 26 34 36 2.0250 1598 46.46 For Tables 4.3 and 4.4 the SDP strategy is outperformed by the DP strategy in the deterministic case, while performing better in the uncertain case, albeit being outperformed by the heuristics algorithm there. The sigmoid strategy has the worst performance, failing to find a solution in the uncertain case. 38 4. Results 4.3 Test 2 – Iterative method Presented below are the results from running the simulations in test 2. The first two simulations for each route used K = 250 and N = 500 while the latter used K = 500 and N = 1000. Even-numbered simulations, on the right-hand side, regenerated a new SDP policy for each iteration while odd-numbered simulations, on the left- hand side, reused the initial policy. The first set of 12 simulations, seen in Tables 4.5-4.7, were run using the seed rng(5) for the random number generator, which determines how the hydrogen prices vary between stations. The second set, seen in Tables 4.8-4.10, used a new seed for each simulation resulting in prices varying between simulations run with and without regenerating the SDP policy. Note that for route two, Gothenburg-Kiruna, the number of stations available decreases with the lower stage resolution, as nearby stations are combined, resulting in the indices not being comparable between stage resolutions and the randomly generated costs differing. Table 4.5: Results from simulations for route one, Zagreb–Paris, using the same seed for every simulation. simulation 1 iter. J tN [min] stations ∑ H̄2,rf 1 403718 1299 18 0.7084 2 423368 1295 18 0.7138 3 406036 1301 18 0.7139 4 398754 1295 18 0.7141 5 406004 1301 18 0.7138 6 467642 1300 18 0.7141 7 484385 1304 18 0.7139 8 403719 1298 18 0.7182 9 468267 1295 18 0.7138 10 476583 1300 18 0.7137 simulation 2 iter. J tN [min] stations ∑ H̄2,rf 1 403718 1299 18 0.7084 2 † † † † simulation 3 iter. J tN [min] stations ∑ H̄2,rf 1 413230 1326 19 32 0.7168 2 417976 1331 18 27 0.8146 3 409419 1341 18 27 29 0.8122 4 405712 1323 18 27 0.8089 5 403655 1322 18 27 0.8089 6 403320 1321 18 27 0.8101 simulation 4 iter. J tN [min] stations ∑ H̄2,rf 1 413230 1326 19 32 0.7168 2 † † † † 39 4. Results Table 4.6: Results from simulations for route two, Gothenburg–Kiruna, using the same seed for every simulation. simulation 5 iter. J tN [min] stations ∑ H̄2,rf 1 1.1 · 107 2630 16 28 0.9520 2 745434 2461 15 26 1.2761 3 † † † † simulation 6 iter. J tN [min] stations ∑ H̄2,rf 1 1.1 · 107 2630 16 28 0.9520 2 711221 2318 15 26 1.2674 3 † † † † simulation 7 iter. J tN [min] stations ∑ H̄2,rf 1 490930 1619 18 32 0.9892 2 492082 1619 18 32 1.0158 3 492070 1618 18 32 1.0160 simulation 8 iter. J tN [min] stations ∑ H̄2,rf 1 490930 1619 18 32 0.9892 2 492052 1617 18 32 1.0166 3 492312 1618 18 32 1.0163 Table 4.7: Results from simulations for route three, Gothenburg–Rødby, using the same seed for every simulation. simulation 9 iter. J tN [min] stations ∑ H̄2,rf 1 116464 383 9 0.3193 2 120941 397 9 0.3319 3 118311 389 9 0.3270 4 119430 392 9 0.3272 simulation 10 iter. J tN [min] stations ∑ H̄2,rf 1 116464 383 9 0.3193 2 699926 1597 9 0.5409 3 † † † † simulation 11 iter. J tN [min] stations ∑ H̄2,rf 1 122220 400 9 0.3289 2 136394 448 9 0.3384 3 142515 468 9 0.3440 4 150348 490 9 0.3497 5 165948 537 9 0.3630 6 201265 612 9 0.3922 7 † † † † simulation 12 iter. J tN [min] stations ∑ H̄2,rf 1 122220 400 9 0.3289 2 † 1758 12 0.6063 3 † † † † Table 4.8: Results from simulations for route one, Zagreb–Paris, using a different seed for every simulation. simulation 1 iter. J tN [min] stations ∑ H̄2,rf 1 437415 1299 18 0.6918 2 421984 1358 17 20 24 30 0.7145 3 410299 1319 17 20 0.7524 4 410465 1319 17 20 0.7522 simulation 2 iter. J tN [min] stations ∑ H̄2,rf 1 405938 1315 16 30 0.7131 2 † † † † simulation 3 iter. J tN [min] stations ∑ H̄2,rf 1 409438 1322 19 32 0.7163 2 † 2298 16 26 0.8237 3 406646 1326 16 26 0.8721 4 404271 1324 16 26 0.8218 simulation 4 iter. J tN [min] stations ∑ H̄2,rf 1 432181 1336 19 32 0.7284 2 † † † † 40 4. Results Table 4.9: Results from simulations for route two, Gothenburg–Kiruna, using a different seed for every simulation. simulation 5 iter. J tN [min] stations ∑ H̄2,rf 1 487589 1611 17 26 0.9702 2 487641 1612 17 26 0.9771 3 406646 1326 16 26 0.8721 4 404271 1324 16 26 0.8218 simulation 6 iter. J tN [min] stations ∑ H̄2,rf 1 487653 1611 17 27 0.9616 2 487499 1611 17 27 0.9688 3 406646 1326 16 26 0.8721 4 404271 1324 16 26 0.8218 simulation 7 iter. J tN [min] stations ∑ H̄2,rf 1 496783 1639 21 25 33 0.9736 2 497745 1638 21 25 33 0.9691 3 406646 1326 16 26 0.8721 4 404271 1324 16 26 0.8218 simulation 8 iter. J tN [min] stations ∑ H̄2,rf 1 496804 1639 23 32 33 0.9698 2 491992 1619 23 32 0.9886 3 406646 1326 16 26 0.8721 4 404271 1324 16 26 0.8218 Table 4.10: Results from simulations for route three, Gothenburg–Rødby, using a different seed for every simulation. simulation 9 iter. J tN [min] stations ∑ H̄2,rf 1 114569 377 10 0.2819 2 123673 407 10 11 0.3288 3 122332 402 10 11 0.3252 4 122295 402 10 11 0.3233 simulation 10 iter. J tN [min] stations ∑ H̄2,rf 1 115177 378 12 0.3176 2 4438936 1651 12 0.5764 3 † † † † simulation 11 iter. J tN [min] stations ∑ H̄2,rf 1 121791 398 10 0.3284 2 126756 415 10 0.3332 3 124980 409 10 0.3322 4 122295 402 10 11 0.3233 simulation 12 iter. J tN [min] stations ∑ H̄2,rf 1 118808 389 12 0.3262 2 433304 1255 12 0.5202 3 † † † † From the results, there is a general pattern of simulations where the SDP policy is regenerated ending up infeasible, as illustrated by the † symbols. The cause of this is presumed to be the speed profile determined by the deterministic bottom layer optimization not being robust enough for a SDP policy to be found at the risk tolerance of κ = 0.05 used. A possible way to mitigate this is to use the algorithm to preprocess the speed profile given by the bottom layer, as done initially. Simula- tions 5 and 6 in Table 4.9 seem to suggest that the stage quantization K = 250 and N = 500 is insufficient to consistently find an optimal solution. Notably, numerous simulations converges to a lower trip cost, as seen in simula- tion 3 in Table 4.5, simulations 1 and 3 in Table 4.8 and all simulations in Table 4.9. In total, 8 out of 24 simulations resulted in a lower cost after iterating. In sim- ulation 3 in Table 4.5 and simulation 5-8 in Table 4.9 the station selection changes which brings a significant improvement, as seen in the decrease in cost. For other simulations, the trip cost instead diverges or stays stagnant. Simulation 1 in Table 41 4. Results 4.5 illustrates the inconsistent behaviour of the NLP solver, as the only parameter that changes between attempts at finding the optimal solution is the initial guess and the refuel amount ∑ H̄2,rf which differs by at most 1.4%, with total trip cost J differing by as much as 19.5% between solutions. 4.4 Test 3 – Resolution analysis This section contains the results of the test that was conducted based on the de- scription in subsection 3.4.3. In Figure 4.3 and Table 4.11, one can find test results obtained with equidistant stage sampling, and in Figure 4.4 and Table 4.11 the ones obtained with non-equidistant stage sampling. Results presented in Tables 4.11 and 4.12 demonstrate that solving with Mx1 = Mx2 = 11 yielded a solution with two more refuelling stops than with Mx1 = Mx2 = 21 and Mx1 = Mx2 = 31, which in turn resulted in a higher travel time and total cost. Similar results were seen when using non-equidistant sampling. However, there was no difference in the stations selected and no significant difference in the amount of refuelled hydrogen with 21 and 31 state values in SDP, whereas the computation time differed by between 30 and 50 seconds depending on the stage resolution. Sta- tions 15 and 30 were selected for refuelling which, according to Table 3.2, have the lowest prices in their respective parts of the route. This further suggests that the SDP strategy reduces the total cost of operation as desired. When it comes to stage sampling N and K, as shown in Table 4.11, using a lower stage resolution consistently yielded lower energy losses which turned out to be al- most linear to the number of stages N , meaning that it most probably was by caused scaling in cost function and specific behaviour of the solver rather than resolution of the stage discretisation. Thus, the difference in cost and brake energy, and implicitly accuracy, is small relative to the difference in computation time, which is approx- imately halved when going from N = K = 1600 to N = K = 800. Worth noting is that in one of the simulations, a resolution of N = K = 400 was insufficient for the solver to converge. This suggests a trade-off between reliability and accuracy is necessary to achieve the desired results, as ultimately the mission optimiser needs to be reliable in providing a solution. The results when running the same simulations using non-equidistant stage sam- pling, which are demonstrated in Table 4.12, suggest that the difference in perfor- mance is minimal. Losses from friction braking are identical to the losses when using equidistant sampling. Noticeably, the simulation which was infeasible in Table 4.11 was solvable when using non-equidistant sampling. However, no conclusive assump- tions can be made based on this as the sample size is too low and only one route and vehicle is considered. Although the stations selected were the same, the trip time was slightly longer for the non-equidistant simulations, differing by an average of 1.2% which is reflected in the trip cost as well. This can be explained by an in- 42 4. Results (a) Simulation results for Mx1 = Mx2 = 31 (b) Simulation results for Mx1 = Mx2 = 21 (c) Simulation results for Mx1 = Mx2 = 11 Figure 4.3: H̄2 and SoC trajectories together with the power split for the Gothenburg–Kiruna route with equidistant sampling with N = K = 1600. Sub- figures (a)–(c) correspond to different numbers of quantised state values in SDP. 43 4. Results Table 4.12: Simulation results with different SDP state quantisation and stage resolution with non-equidistant stage sampling for the route Gothenburg–Kiruna. Mx1 = Mx2 N = K J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] 31 1600 493493 1620 15, 30 1.056 3198 231.01 1200 497093 1622 1.042 2398 183.8 800 492097 1618 1.030 1598 121.76 400 495038 1634 0.991 798 66.6 21 1600 493511 1619 15, 30 1.058 3198 186.5 1200 497282 1623 1.045 2398 138.3 800 491902 1617 1.038 1598 90.6 400 495222 1634 1.018 798 61.3 11 1600 506863 1662 15, 27, 30, 34 1.159 3198 147.8 1200 510501 1665 1.139 2398 125.9 800 504525 1658 1.123 1598 80.1 400 507578 1674 1.060 798 42.33 creased accuracy of the solution, although such claims would have to be investigated further. Table 4.11: Simulation results with different SDP state quantisation and stage resolution with equidistant stage sampling for the route Gothenburg–Kiruna. Mx1 = Mx2 N = K J tN [min] Selected stations ∑ H̄2,rf Ebr [J ] tcomp [sec] 31 1600 488376 1605 15, 30 1.048 3198 236.9 1200 487542 1605 1.038 2398 159.8 800 485240 1598 1.022 1598 95.2 400 485467 1604 1.011 798 56.3 21 1600 488236 1605 15, 30 1.047 3198 172.6 1200 487559 1605 1.043 2398 134.3 800 485305 1598 1.033 1598 77.56 400 † † 1.029 † † 11 1600 501133 1646 15, 27, 30, 34 1.148 3198 120.5 1200 500238 1645 1.131 2398 91.0 800 497800 1639 1.103 1598 63.0 400 498178 1645 1.096 798 39.6 44 4. Results (a) Simulation results for Mx1 = Mx2 = 31. (b) Simulation results for Mx1 = Mx2 = 21. (c) Simulation results for Mx1 = Mx2 = 11. Figure 4.4: H̄2 and SoC trajectories together with the power split