Physics-Informed Neural Networks for Charge Dynamics in Air

Master's thesis in Complex Adaptive Systems

ÁRNI KONRÁÐSSON

Department of Electrical Engineering
Division of High Voltage Engineering
CHALMERS UNIVERSITY OF TECHNOLOGY
Gothenburg, Sweden 2022

© Árni Konráðsson, 2022.

Supervisors: Olof Hjortstam, Hitachi Energy
Examiner: Yuri Serdyuk, Department of Electrical Engineering

Master's Thesis 2022
Department of Electrical Engineering
Division of High Voltage Engineering
Chalmers University of Technology
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Typeset in LaTeX, template by Kyriaki Antoniadou-Plytaria
Printed by Chalmers Reproservice
Gothenburg, Sweden 2022

Abstract

Physics-Informed Neural Networks (PINNs) are neural networks trained to respect the laws of physics expressed as Partial Differential Equations (PDEs). PINNs are a new tool aimed at solving difficult problems, such as systems arising from PDEs, complementing or replacing existing traditional methods. This thesis focuses on applying PINNs to the drift-diffusion equation and the Poisson equation, fundamental PDEs that describe the physics of electrical discharges in air and other gases under high-voltage stress. The aim of the work is to predict the electric field in an air gap between two electrodes and to study how ionic charges drifting in the gap modify the electrostatic field distribution. In the first case, each equation is solved by a PINN independently. In the second case, the equations are coupled to account for the effect of the space charge on the electric field. The results obtained in both cases are compared with respective analytical solutions and/or solutions obtained from the Finite Element Method (FEM) using COMSOL Multiphysics software. Details of the implementation of PINNs as well as encountered limitations are discussed. In particular, best practices when setting up problems and choosing hyperparameters are highlighted. The obtained results indicate that PINNs can, in some respects, outperform FEM in accuracy, although at the cost of significantly longer processing time. Further research is needed to confirm the feasibility of PINNs for electrical gas discharges, taking into account more complex geometries and physics. Additionally, the usability of PINNs for solving the inverse problem (i.e., extracting physical parameters defining the PDE from its solution) needs to be analyzed further.

Keywords: Physics-Informed Neural Network, Artificial Neural Network, Partial Differential Equation, Discharge physics, Air discharges, DeepXDE

Acknowledgements

This thesis was written under the supervision of Hitachi ABB Power Grids, a world-leading supplier of grid infrastructure. In particular, it was carried out in collaboration with the research center located in Västerås, Sweden. My gratitude goes out to my family for having the patience to support me through the project and for providing me with immense moral support.
I would also like to give special thanks to my supervisors and examiner, Olof Hjortstam, Christian Häger and Yuriy Serdyuk, for providing helpful insights and discussions that have shaped my work.

Árni Konráðsson, Gothenburg, June 2022

Contents

1 Introduction
  1.1 Background
  1.2 Purpose
  1.3 Limitations
  1.4 Related Work
2 Background
  2.1 Deep Neural Networks
  2.2 Physics-Informed Neural Network
    2.2.1 Overview
    2.2.2 DeepXDE Framework
  2.3 Finite Element Method
3 Methods
  3.1 Problem Description
    3.1.1 Laplace and Poisson's Equations
    3.1.2 Drift-Diffusion Equation
    3.1.3 Mono Polar Charge Transport Coupled with Poisson's Equation
  3.2 Hyperparameters
    3.2.1 Activation Function tanh and Scaling
    3.2.2 Collocation Points
    3.2.3 Optimizers and Learning Rate
    3.2.4 Loss Weights and Hard Boundary Conditions
    3.2.5 Loss Function
4 Results
  4.1 Coaxial Laplace Equation
  4.2 Drift-Diffusion Equation
  4.3 Mono Polar Charge Transport Coupled with Poisson's Equation
  4.4 Summary of Best Practices
5 Discussion
  5.1 Future Work
    5.1.1 Improve Current Work
    5.1.2 Use PINNs for Inverse Problem
6 Conclusion
Bibliography
1 Introduction

1.1 Background

Due to the worldwide need for a strong reduction in greenhouse gases to suppress ongoing climate change, the electrical energy system is undergoing great changes. An important part of the development of the high voltage equipment needed for this challenge is to predict the insulation performance of the air around such equipment.

Physical processes in air discharges are normally modeled using hydrodynamic approaches [1]. One such model is the drift-diffusion equation for charge carriers coupled with Poisson's equation:

\[
\begin{aligned}
\frac{\partial n_e}{\partial t} + \nabla \cdot (n_e W_e - D_e \nabla n_e) &= R_e \\
\frac{\partial n_p}{\partial t} + \nabla \cdot (n_p W_p - D_p \nabla n_p) &= R_p \\
\frac{\partial n_n}{\partial t} + \nabla \cdot (n_n W_n - D_n \nabla n_n) &= R_n
\end{aligned}
\tag{1.1}
\]

Here, the subscripts e, p and n indicate electrons, positive ions, and negative ions, and D stands for the diffusion coefficient. The first term represents the rate of change of the number density n over time. The second and third terms are the divergence of the flux due to advection and diffusion of particles, respectively. The source terms of Equation 1.1 incorporate the rates of the processes to be considered in the model:

\[
\begin{aligned}
R_e &= \alpha n_e |W_e| - \eta n_e |W_e| - \beta_{ep} n_e n_p + R_0 + R_{ph} \\
R_p &= \alpha n_e |W_e| - \beta_{ep} n_e n_p - \beta_{pn} n_p n_n + R_0 + R_{ph} \\
R_n &= \eta n_e |W_e| - \beta_{pn} n_p n_n \\
R_p - R_e - R_n &= 0
\end{aligned}
\tag{1.2}
\]

where β_ep and β_pn are the electron-ion and ion-ion recombination coefficients, respectively, R_0 and R_ph are the rates of background and photo ionization, respectively, α is the first ionization coefficient and η is the attachment coefficient.
Each of these source terms has a strong non-linear field dependence, since most of the parameters in the drift-diffusion equations depend on the electric field E. We therefore couple them with Poisson's equation:

\[
\begin{aligned}
\nabla \cdot (-\varepsilon_0 \varepsilon_r \nabla V) &= e(n_p - n_e - n_n) \\
E &= -\nabla V
\end{aligned}
\tag{1.3}
\]

Here, ε_0 is the permittivity of vacuum, ε_r is the relative permittivity, V is the electric potential, and e is the elementary charge. This is a complex transport and charge generation model with strongly non-linear behavior. Current solution methods such as the Finite Element Method (FEM) have their shortcomings: the drift-diffusion equation is numerically unstable, and fast-moving charges require high resolution on the computational grid and small time steps, which results in time- and memory-consuming calculations. The time scales of the problem are also vastly different: parts of the discharge physics take place on short time scales, while other parts, such as the drift of ions, take much longer. Georgi Karman simplifies Equations 1.1-1.3 to one dimension in his thesis work [2]; we refer to his work for the choices of parameters in Chapter 3.

In recent years, researchers in machine learning have developed a method known as physics-informed neural networks (PINNs). This approach was first proposed by Raissi et al. [4][5]. PINNs are neural networks trained to solve supervised learning tasks while respecting physical laws described by general nonlinear partial differential equations (PDEs). There are two types of PINN problems: the data-driven solution of PDEs and the data-driven discovery of PDEs. In the data-driven solution, also called the forward problem, neural networks are used to solve a given PDE; in the data-driven discovery of PDEs, also called the inverse problem, neural networks are used to find suitable parameters for PDEs. Although the use of experimental data and PINNs to find PDEs is a promising area of study, we focus solely on the forward problem and leave the inverse problem for future studies.

1.2 Purpose

This thesis investigates charge transport in an air gap between two electrodes using PINNs. The PDEs used in the study are Poisson's equation and the drift-diffusion equation, and the long-term goal is to be able to predict electric breakdown in air insulation. The Poisson equation is solved analytically and used for comparison, whereas the drift-diffusion equation is compared to a solution found using FEM. The research questions that this work aims to answer are as follows.

1. How useful are PINNs for problems arising in discharge physics?
2. What are the best practices when using PINNs to solve PDEs?
3. What are the limitations and advantages of using PINNs compared to traditional numerical solvers?

1.3 Limitations

Using PINNs to solve Equations 1.1-1.3 in full would be ideal, but their complexity is too high for the current thesis. We therefore accept some limitations. First, we only consider a simple geometry: a one-dimensional interval in both coaxial and Cartesian coordinates. Second, we use a simplified physical system that does not account for ion generation, recombination, and other processes. Third, the PDE parameters we use are found through approximations, simplifications, and experimental results, so the PDEs will never be completely accurate. Typical parameters for air discharge are taken from Karman [2].

1.4 Related Work

The work done in this study is largely based on the work by Raissi et al. [4][5].
The first paper [4] focuses on showing the promise of using PINNs to solve PDEs in both continuous and discrete time. The second paper [5] focuses on finding the parameters of a PDE that best describe observed data. The equations studied are the Burgers equation, the Schrödinger equation, the Navier-Stokes equations, the Korteweg-de Vries (KdV) equation, and the Allen-Cahn equation.

Since the publication of the articles by Raissi et al., a number of studies have explored applications of PINNs: in particular, the 1D Poisson and diffusion equations in mechanics [15], the wave, KdV, and KdV-Burgers equations in 1D [13], and the swing equation for both the forward and inverse problems [14]. Other applications include the 1D heat transfer equation [16], coupled PDEs [18], and many more. Using PINNs to replace traditional solvers is also being studied: Markidis [12] studies the Poisson equation and how the choice of hyperparameters affects training time and convergence. Many insightful lessons can be taken from his paper.

2 Background

2.1 Deep Neural Networks

A deep neural network is an artificial neural network with multiple layers between the input and output layers. Many different neural networks have been developed over the past decades, including convolutional neural networks, recurrent neural networks (RNNs), residual neural networks (ResNets), and feed-forward neural networks (FFNNs). We consider FFNNs in this study; using other networks could be of interest in further research.

Let us now give a short description of a deep neural network. Let $\mathcal{N}^L(x)$ be an $L$-layered neural network from $\mathbb{R}^d$ to $\mathbb{R}^D$, where $d$ is the input dimension and $D$ the output dimension. Layer $\ell$ has $N_\ell$ neurons, so $N_0 = d$ and $N_L = D$. Denote the weight matrix and bias vector in layer $\ell$ by $W^\ell \in \mathbb{R}^{N_\ell \times N_{\ell-1}}$ and $b^\ell \in \mathbb{R}^{N_\ell}$, respectively. Given a nonlinear activation function $\sigma$ applied elementwise, we can define an FFNN recursively:

\[
\begin{aligned}
\text{input layer:} \quad & \mathcal{N}^0(x) = x \in \mathbb{R}^d \\
\text{hidden layers:} \quad & \mathcal{N}^\ell(x) = \sigma\!\left(W^\ell \mathcal{N}^{\ell-1}(x) + b^\ell\right) \in \mathbb{R}^{N_\ell}, \quad 1 \le \ell \le L-1 \\
\text{output layer:} \quad & \mathcal{N}^L(x) = W^L \mathcal{N}^{L-1}(x) + b^L \in \mathbb{R}^D
\end{aligned}
\]

A schematic graph of a typical FFNN can be seen in Figure 2.1, where x is the spatial input, t is the time input, and û is the output of the net.

Figure 2.1: Schematic graph of a simple feed-forward neural network

PINNs require computation of derivatives of the network outputs with respect to the inputs. This can be done in four different ways [11]: hand-coded analytical derivatives, numerical approximation (e.g., finite differences), symbolic differentiation (used in Mathematica, Maple, and the like), and automatic differentiation (AD). Deep learning uses a specialized form of AD called backpropagation to evaluate derivatives. A deep neural network represents a compositional function, so AD can apply the chain rule repeatedly to compute the derivatives. AD boils down to two steps: first a forward pass to calculate the values of all variables, and then a backward pass to compute the derivatives. A simple example of how AD works can be found in [10].

2.2 Physics-Informed Neural Network

PINNs are introduced in this section, starting with an overview and background. The second part of the section focuses on DeepXDE, a framework for implementing PINNs.

2.2.1 Overview

Deep neural networks need a large amount of training data to obtain accurate results.
However, when analyzing complex physical systems, the cost of acquiring data is often prohibitive, which forces us to draw conclusions and make decisions with only partial information. PINNs do not need large amounts of data; instead, they make use of prior knowledge such as physical laws, empirically validated results, or other domain expertise. This prior knowledge acts as a regularization agent that constrains the space of admissible solutions to a more manageable size, so less training data is needed.

PINNs can be used for two types of problems: the forward and the inverse problem. The forward problem is the so-called data-driven solution of PDEs, focusing on finding solutions to a predetermined PDE. The inverse problem is the so-called data-driven discovery of PDEs, focusing on finding parameters of PDEs that fit given data. The work in [4][5] is further split into continuous-time and discrete-time models. This study only considers continuous-time forward problems and leaves inverse problems for future research.

The general idea for both the data-driven solution and the data-driven discovery of PDEs is as follows. Consider a parameterized nonlinear PDE of the general form

\[
u_t + \mathcal{N}[u; \lambda] = 0, \quad x \in \Omega, \; t \in [0, T]
\tag{2.1}
\]

where u(t, x) denotes the latent (hidden) solution, N[·; λ] is a nonlinear operator parameterized by a vector λ, and Ω is a subset of R^d. We use the notation u_t, u_x, and u_xx for u differentiated once by time, once by the x-coordinate, and twice by the x-coordinate, respectively. When dealing with a forward problem, we assume that all λ are known and constant. We define f(t, x) as the left-hand side of Equation 2.1:

\[
f := u_t + \mathcal{N}[u].
\tag{2.2}
\]

An approximation of u(t, x) is found with the use of a deep neural network. This approximation, together with Equation 2.2, results in a PINN f(t, x). The shared parameters between the networks u(t, x) and f(t, x) are learned by minimizing a loss function given by the sum of the mean squared losses from network u and network f:

\[
\mathrm{MSE} = \mathrm{MSE}_u + \mathrm{MSE}_f
\tag{2.3}
\]

where

\[
\mathrm{MSE}_u = \frac{1}{N_u} \sum_{i=1}^{N_u} \left( u(t_u^i, x_u^i) - u^i \right)^2
\quad \text{and} \quad
\mathrm{MSE}_f = \frac{1}{N_f} \sum_{i=1}^{N_f} \left( f(t_f^i, x_f^i) \right)^2.
\tag{2.4}
\]

Here $\{t_u^i, x_u^i, u^i\}_{i=1}^{N_u}$ denote the initial and boundary training data on u(t, x), and $\{t_f^i, x_f^i\}_{i=1}^{N_f}$ are the collocation points for f(t, x). The loss MSE_u corresponds to the boundary and initial conditions, while MSE_f enforces the PDE structure at a finite set of so-called collocation points chosen in the domain of interest.

Figure 2.2 shows a general PINN schematic. The neural network inputs are points in space (x) and time (t), including points at the initial and boundary conditions. The net outputs an approximation (û) of the PDE solution u, which is then used to calculate the gradients appearing in the PDE. The residuals are provided by the function f.

Figure 2.2: Schematic representation of a PINN

The inverse problem uses an idea similar to the forward problem, keeping λ from Equation 2.1 as a variable parameter; λ then becomes a trainable parameter of the PINN f(t, x).

2.2.2 DeepXDE Framework

There are a number of frameworks for creating machine learning models, among the most popular being TensorFlow, created by Google. TensorFlow has many built-in modules and is useful for deep learning, but it is not specifically built for PINNs, so every function, class, or module would need to be created from scratch. That is where DeepXDE comes in. DeepXDE was designed by Lu et al. [10] and is built on TensorFlow.
It serves both as an educational tool and as a research tool for solving problems in computational science and engineering. It can solve forward problems given initial and boundary conditions, as well as inverse problems given specific data. It was designed so that the code written is short and comprehensible, resembling the mathematical formulation. Solving PDEs with DeepXDE is done in nine steps (a code sketch illustrating these steps is given at the start of Chapter 3):

1. Specify the computational domain.
2. Specify the PDE.
3. Specify the boundary/initial conditions.
4. Use a built-in function to create training and testing points for the system defined by steps 1, 2, and 3.
5. Construct the neural network.
6. Define a model.
7. Set hyperparameters and compile the model.
8. Train the network.
9. Predict the PDE solution.

Looking at each step more closely, we first consider the geometry. DeepXDE has many built-in geometries, but we only use an interval, combined with a time interval when we have time-dependent PDEs. When specifying the PDE, we use TensorFlow syntax. DeepXDE was developed on TensorFlow 1 and, even though it has the option to use other backends such as TensorFlow 2, we found that it worked best with TensorFlow 1. DeepXDE has four common boundary conditions: Dirichlet, Neumann, Robin, and periodic. In this study we only have Dirichlet boundary conditions. The fourth step combines the geometry, PDE, and boundary/initial conditions, creating either time-dependent or time-independent training data. We do this with a one-line command in which we specify how many training points we want in the domain and on the boundaries, as well as which distribution should be used to sample training points from the domain. If we have an analytical solution to the PDE, we can also provide that function and add a metric to the training that evaluates the PINN against it. Defining a model is a simple one-line command combining the data from step 4 and the network from step 5. We then choose hyperparameters such as the optimizer and learning rate (see Section 3.2), and compile and train the model. After training, we have a model capable of predicting the desired PDE solution.

2.3 Finite Element Method

Numerical methods are an established way of solving PDEs. The performance of such methods is crucial when analytical solutions are hard or impossible to find. The finite element method is a popular numerical method, especially when dealing with complex geometry. To solve a problem, FEM divides a large system into smaller, simpler parts called finite elements. This is done by discretizing the space dimensions through the construction of a mesh of the object, which has a finite number of points. The method approximates the unknown function over the domain. The simple equations that model these finite elements are then assembled into a larger system that models the entire problem. FEM then approximates a solution by minimizing an associated error function [9]. FEM will be used for validation of the equations solved by PINNs in the cases where no analytical solution exists.

3 Methods

This chapter introduces the methods used. First, in Section 3.1 we provide a detailed description of the problems we will examine. Then, in Section 3.2, we go over the hyperparameters to consider.
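To make the nine DeepXDE steps of Section 2.2.2 concrete, below is a minimal sketch of the workflow for a toy problem: the 1D Poisson equation -u'' = 1 on (0, 1) with zero Dirichlet boundaries. The toy problem and the exact API names are our additions for illustration, not part of the thesis cases; the names follow the DeepXDE 1.x documentation and may differ between versions (e.g., `iterations` was called `epochs` in older releases).

```python
import numpy as np
import deepxde as dde

# Step 1: computational domain, the interval (0, 1)
geom = dde.geometry.Interval(0, 1)

# Step 2: PDE residual of -u'' = 1, written as -u'' - 1 = 0
def pde(x, u):
    du_xx = dde.grad.hessian(u, x)
    return -du_xx - 1.0

# Step 3: Dirichlet condition u = 0 on both boundary points
bc = dde.icbc.DirichletBC(geom, lambda x: 0, lambda x, on_boundary: on_boundary)

# Step 4: training and testing points for the domain and boundary
data = dde.data.PDE(geom, pde, bc, num_domain=30, num_boundary=2, num_test=100)

# Step 5: FFNN with two hidden tanh layers of 10 nodes
net = dde.nn.FNN([1, 10, 10, 1], "tanh", "Glorot uniform")

# Steps 6-7: define the model, set optimizer and learning rate, compile
model = dde.Model(data, net)
model.compile("adam", lr=1e-3)

# Step 8: train with Adam, then fine-tune with L-BFGS
model.train(iterations=5000)
model.compile("L-BFGS")
model.train()

# Step 9: predict the PDE solution on new points
x = np.linspace(0, 1, 101)[:, None]
u_pred = model.predict(x)
```

Note how the call sequence mirrors the nine steps one to one: geometry, PDE residual, boundary condition, data, network, model, compile, train, predict.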
3.1 Problem Description

The target of the present work is to investigate some specific features of 1D formulations of the equations presented in 1.1-1.3: namely, the coupling of the Poisson equation and the drift-diffusion equation for a case with only one charge species and no source terms R. First, we look at each equation separately. Then we combine them and show how predictions of the electric field can be made.

3.1.1 Laplace and Poisson's Equations

The general Poisson's equation is given by

\[
\nabla^2 U = -\frac{\rho}{\varepsilon}
\tag{3.1}
\]

where U is the electric potential, ρ is the space-charge distribution, and ε is the permittivity. If no space charge is present in the system, ρ is zero and Equation 3.1 becomes Laplace's equation. The solutions to both the Laplace and Poisson equations in 1D Cartesian coordinates are quite simple, provided the charge distribution is not too complex. A more complex but still simple geometry relevant to the study of high voltage applications is the coaxial geometry, described by coaxial coordinates. When looking at Laplace's equation in one-dimensional coaxial coordinates, we model the electric potential as a function of the radius according to

\[
\frac{d^2 U}{dr^2} + \frac{1}{r}\frac{dU}{dr} = 0
\tag{3.2}
\]

Looking at Figure 3.1, we have a coaxial electrode setup with an inner electrode of radius R0 at a fixed potential, and an outer electrode, a grounded cage, of radius R1. We call R0 the inner radius and R1 the outer radius. For simplicity, we model Equation 3.2 with the boundary conditions

\[
U(r = R_0) = 1\,\mathrm{V}, \qquad U(r = R_1) = 0\,\mathrm{V}.
\tag{3.3}
\]

Figure 3.1: Coaxial electrode setup

To study geometries relevant for air discharge experiments [2], the dimensions R0 = 1 mm and R1 = 0.5 m are of interest. With these boundary conditions, Equation 3.2 has the analytical solution

\[
U(r) = \frac{\ln r - \ln R_1}{\ln R_0 - \ln R_1}
\tag{3.4}
\]

against which we can compare the PINN model. From a physical point of view, the electric field is an important parameter that can be calculated if the potential distribution is known. The relation between the field and the potential in coaxial coordinates is

\[
E = -\frac{dU}{dr}
\tag{3.5}
\]

and the analytical solution for the field is

\[
E = -\frac{1}{r}\,\frac{1}{\ln R_0 - \ln R_1}.
\tag{3.6}
\]

Figure 3.2 shows the schematic representation of the PINN used to solve the one-dimensional coaxial Laplace equation.

Figure 3.2: Schematic representation of PINN for 1D Laplace equation

3.1.2 Drift-Diffusion Equation

The next problem we look at is mono polar charge transport in an external electric field in one dimension. A suitable PDE for this is the drift-diffusion equation found in Karman [2]; we only consider one charge species and leave out the source term:

\[
\frac{\partial n}{\partial t} = -w \frac{\partial n}{\partial x} + D_{\mathrm{iff}} \frac{\partial^2 n}{\partial x^2}
\tag{3.7}
\]

Table 3.1 shows the descriptions and values used for the parameters of Equation 3.7; all parameters are taken from Karman [2]. We now have two electrodes separated by a one meter air gap, with a constant external electric field over the gap. We want to model what happens when we inject n_inj ions at the left boundary, given a constant density n_0 before injection. This can be modeled using the boundary and initial conditions

\[
n(t, x = 0) = n_{inj}, \qquad n(t = 0, x) = n_0.
\tag{3.8}
\]

Table 3.1: Parameters and relevant descriptions of the drift-diffusion equation for ionic drift in air [2].
n       ion density (ions/m³)
w       speed of charges (m/s), w = μ_n E_ext = 200
D_iff   diffusion coefficient (m²/s), D_iff = μ_n k_b T / q = 5.05e−6
μ_n     mobility of positive ions (m²/(V·s)), μ_n = 2e−4
E_ext   external electric field (V/m), E_ext = 1e6
k_b     Boltzmann's constant (J/K), k_b = 1.38e−23
T       temperature (K), T = 293
q       elementary electric charge (C), q = 1.602e−19

Figure 3.3 shows the schematic representation of the PINN used to solve the one-dimensional time-dependent drift-diffusion equation.

Figure 3.3: Schematic of the PINN network for the drift-diffusion equation

3.1.3 Mono Polar Charge Transport Coupled with Poisson's Equation

Now we want to see how the electric field distribution in the air gap changes over time when ionic charges drift in the electric field. For this we couple Equation 3.7 with Poisson's equation in one-dimensional Cartesian coordinates:

\[
\begin{aligned}
\frac{\partial n}{\partial t} &= -w \frac{\partial n}{\partial x} + D_{\mathrm{iff}} \frac{\partial^2 n}{\partial x^2} \\
\frac{\partial^2 U}{\partial x^2} &= -\frac{q \, n}{\varepsilon_0}
\end{aligned}
\tag{3.9}
\]

Here the charge distribution n feeds into Poisson's equation, and the electric field can be calculated as E = −∂U/∂x. At one electrode we have a 1 MV electric potential, and at the other zero. We start with a low injection of charges, a charge density of 1 ion/m³ at the injecting boundary; at such a low charge density, the equations are essentially uncoupled. We then increase the charge density, aiming for a solution of a model with an injection of 5e13 ions/m³. Figure 3.4 shows a schematic of the PINN for Equation 3.9.

Figure 3.4: Schematic of PINN network for mono polar charge transport coupled with Poisson equation

3.2 Hyperparameters

When creating PINNs, there are numerous so-called hyperparameters that need to be considered. Table 3.2 lists them and gives a brief description.

Table 3.2: Description of hyperparameters of PINNs

Activation function       Adds non-linearity to the neural network; tanh and sigmoid are the most used in PINNs
Scaling                   Rescaling of the PDE parameters and the inputs to the network
Network architecture      Number of hidden layers and number of nodes in each layer
Collocation points        Points at which the PDE residual function is evaluated
Optimizer                 Algorithm used to optimize the weights of the neural network; most commonly L-BFGS or Adam
Learning rate             Determines how large a step to take at each iteration while moving towards the minimum of the loss function
Loss weights              How much each loss term should contribute to the overall loss
Hard boundary conditions  Function applied to the output of the neural network that ensures that one or more boundary conditions are met
Loss function             Quantifies the difference between the expected outcome and the outcome from the model

3.2.1 Activation Function tanh and Scaling

Choosing the activation function is an important step for performance: different activation functions can give considerably different accuracy and convergence. In fact, it has been shown theoretically [6] and confirmed experimentally [12] that non-smooth activation functions such as ReLU (Rectified Linear Unit) are not consistently convergent, so it is recommended to use smooth functions such as the hyperbolic tangent, tanh, or the sigmoid function. In this work, we choose tanh as the activation function. Tanh is defined by

\[
\tanh(x) = \frac{2}{1 + e^{-2x}} - 1
\tag{3.10}
\]

and squeezes values into the range between -1 and 1. Because of this, it is important to rescale the input to the network and to scale the parameters of the PDE so that the solution also lies in this range. If we do not rescale, training can become difficult, as the net output is of order 1 while the output of the PDE could be of order 10 or even higher.

When selecting which activation function to use, we tried training with every built-in function in DeepXDE. It soon became clear that non-smooth functions did not converge as fast as the others, and among the smooth functions, tanh was always either the fastest to converge or the most accurate. We therefore decided that tanh was a good choice.

3.2.2 Collocation Points

Collocation points are the input points in the integration domain making up the training data set. In practice, a large number of collocation points results in larger values of the loss function as well as longer training time. The distribution of the collocation points can also impact the loss. The three most used distributions are uniform, pseudo-random, and Sobol. In a uniform distribution, the points are uniformly spaced over the simulation domain. In a pseudo-random distribution, the points are sampled randomly from the simulation domain. In the Sobol distribution, the points come from the Sobol low-discrepancy sequence [20]. Sobol is the default distribution in DeepXDE. Selecting a good distribution is more relevant for higher dimensions and larger domains; with a small one-dimensional domain, the number of collocation points matters more. After trying different distributions, we decided to use Sobol for the Laplace equation and pseudo-random for the time-dependent equations.

3.2.3 Optimizers and Learning Rate

The most widely used optimizer for training neural networks is the Adam optimizer [21]. Adam is a variant of the stochastic gradient descent method that uses estimates of the first- and second-order moments of the gradients, i.e., the mean and the uncentered variance. The moments are inserted into the update formula for the network parameters. The main advantages of Adam are that it works well with noisy or sparse data and converges quickly.

Another optimizer used in the present thesis is the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimizer [22], a quasi-Newton method that uses limited memory to approximate the BFGS algorithm. The L-BFGS algorithm uses an estimate of the inverse of the Hessian matrix to minimize the second-order Taylor expansion of the loss function. Since the Hessian involves second-order partial derivatives, while the Adam optimizer uses only first-order derivatives, the L-BFGS optimizer converges to more accurate results.

3.2.4 Loss Weights and Hard Boundary Conditions

The loss we have is a combination of two or more losses,

\[
\mathrm{MSE} = \mathrm{MSE}_u + \mathrm{MSE}_f
\tag{3.11}
\]

where MSE_u is the loss from the boundary and initial conditions and MSE_f is the loss from the PDEs. Say we have two boundary conditions, A and B, and two PDEs, C and D. Then the loss becomes

\[
\mathrm{MSE} = \mathrm{MSE}_A + \mathrm{MSE}_B + \mathrm{MSE}_C + \mathrm{MSE}_D.
\tag{3.12}
\]

Now, if one of the losses is much larger than the others, training becomes problematic. S. Wang et al. [7] detail this problem and suggest an algorithm to deal with it, but for simplicity this study does not implement it. Instead, we assign a weight to each loss and manually choose values so that the losses have similar magnitudes. Another approach that mitigates this problem, while also simplifying the training, is to use the hard boundary constraints introduced by Lu et al. [8].
Contrary to the normal way of enforcing boundary conditions via loss terms, hard constraints impose the boundary conditions exactly by modifying the network architecture. Only Dirichlet and periodic BCs were considered by Lu, and this study only considers Dirichlet BCs. The Dirichlet boundary condition for the solution u is given by

\[
u(x) = g(x), \quad x \in \Gamma_D
\tag{3.13}
\]

where Γ_D ⊂ ∂Ω is a subset of the boundary. To make the approximate solution û(x; θ_u) satisfy this BC, the solution is constructed as

\[
\hat{u}(x; \theta_u) = g(x) + \ell(x)\,\mathcal{N}(x; \theta_u).
\tag{3.14}
\]

Here, N(x; θ_u) is the network output and ℓ is a function satisfying

\[
\begin{cases}
\ell(x) = 0, & x \in \Gamma_D \\
\ell(x) > 0, & x \in \Omega \setminus \Gamma_D
\end{cases}
\tag{3.15}
\]

If Γ_D is a simple geometry, we can choose ℓ(x) analytically. For example, if Γ_D = {a, b}, i.e., the boundary of an interval from a to b, we can choose ℓ(x) as (x − a)(b − x) or (1 − e^{a−x})(1 − e^{x−b}). For more complex domains it becomes difficult to find ℓ(x) analytically; it is possible to approximate ℓ(x) with a spline function, but this is outside the scope of this study. If we impose hard constraints on one of the boundaries, the corresponding loss term is identically zero, which reduces the complexity of the overall loss minimization and improves training time and accuracy. A schematic of the network architecture modification can be seen in Figure 3.5.

Figure 3.5: Schematic of network modification for applying hard constraints

3.2.5 Loss Function

The loss function in a neural network quantifies the difference between the expected outcome and the outcome produced by the model. From the loss function, we derive the gradients that are used to update the weights; the average over all losses constitutes the cost. In this work, we used only the mean squared error (MSE). Other possible loss functions include the mean absolute error, logarithmic MSE, mean L2 relative error, softmax cross-entropy, and more.

4 Results

This chapter presents the results of the various models created.

4.1 Coaxial Laplace Equation

The first model we consider is the one-dimensional coaxial Laplace equation. As stated in Section 3.1.1, a relevant target for the inner and outer radii is R0 = 0.001 m and R1 = 0.5 m. The solution of the Laplace equation for these radii is highly non-linear; we can make it more linear by using a larger inner radius. The theory of neural networks as function approximators states that any function can be approximated by a neural network given sufficiently many nodes. However, larger neural networks take longer to train, and the same holds for larger training sets. Starting with a small network and few collocation points is therefore beneficial for keeping the training time short. It was found that Equation 3.2 was much easier to solve for larger values of R0 than for smaller ones. To handle this, the following approach was used to solve the equation for as small an R0 as possible:

1. Set an initial R0.
2. Add hard constraints.
3. Solve for as low an R0 as possible with the current setup.
4. Change hyperparameters: collocation points, network architecture, optimizer and learning rate, activation function.
5. Solve for the current R0; if successful, go to step 3, else go to step 4.

It is worth mentioning that all parameters and the inner and outer radii are of order one, so no scaling is required here. Starting with an inner radius of 0.1 m, we set up a small net with two hidden layers of 10 nodes each and a tanh activation function, and sample 30 points on the interval and one on each boundary; a code sketch of this setup is shown below.
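The following is a reconstruction for illustration rather than the exact thesis code. The hard Dirichlet constraints of Section 3.2.4 are applied through DeepXDE's output transform; the particular choices g(r) = (R1 − r)/(R1 − R0) and ℓ(r) = (r − R0)(R1 − r) are one valid option satisfying Equations 3.14-3.15, and the API names follow DeepXDE 1.x.

```python
import deepxde as dde

R0, R1 = 0.1, 0.5  # inner and outer radii (m)

geom = dde.geometry.Interval(R0, R1)

# PDE residual of the coaxial Laplace equation (Eq. 3.2): U'' + U'/r = 0
def pde(r, U):
    dU_r = dde.grad.jacobian(U, r)
    dU_rr = dde.grad.hessian(U, r)
    return dU_rr + dU_r / r

# Both boundaries are imposed as hard constraints, so no soft BC terms are passed
data = dde.data.PDE(geom, pde, [], num_domain=30, num_boundary=2)

# Small net: two hidden layers of 10 tanh nodes
net = dde.nn.FNN([1, 10, 10, 1], "tanh", "Glorot uniform")

# Hard Dirichlet constraints (Eq. 3.14): U_hat(r) = g(r) + l(r) * N(r),
# with g(R0) = 1 V, g(R1) = 0 V, and l(r) vanishing on both boundaries
net.apply_output_transform(
    lambda r, N: (R1 - r) / (R1 - R0) + (r - R0) * (R1 - r) * N
)

model = dde.Model(data, net)
model.compile("adam", lr=1e-3)
model.train(iterations=1500)  # Adam stage, cf. Table 4.1
model.compile("L-BFGS")
model.train()                 # L-BFGS fine-tuning
```

With both boundaries hard, the loss reduces to the PDE residual alone, which is consistent with the shorter training times for hard boundary conditions reported in Table 4.1.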
Table 4.1 shows how many iterations and how long training takes with either hard or soft boundary conditions. It is clear that using hard boundary conditions speeds up the training considerably, especially when using the Adam optimizer.

Table 4.1: Time and number of iterations needed by different boundary conditions and optimizers until the results have an L2 error of less than 1e−2, for R0 = 0.1 m.

Boundary condition | Optimizer      | Iterations | Time   | L2 error
Hard               | Adam, lr 0.001 | 1500       | 6 s    | 8e−3
Soft               | Adam, lr 0.001 | 6000-12000 | 9-30 s | 5-8e−3
Hard               | L-BFGS         | 150        | 7 s    | 2.3e−5
Soft               | L-BFGS         | 314        | 10 s   | 1.8e−4

After many iterations of changing hyperparameters, we were able to find a reliable solution for an inner radius of 0.005 m. One setup that gives good results every time is a net with 3 hidden layers of 40 nodes and a tanh activation function, with 128 points sampled on the interval, two on the boundaries, and hard boundary conditions. The model is trained for 5000 iterations using Adam with a 0.001 learning rate, and then with the L-BFGS optimizer. One realization of this setup can be seen in Figure 4.1, and the corresponding electric field prediction in Figure 4.2. When the inner radius is below 0.005 m we encounter problems, mainly because the training loss does not consistently converge to a good solution. Chapter 5 discusses possible reasons for this and possible solutions.

Figure 4.1: Comparison of the model prediction of the potential and the true values for the coaxial Laplace equation

Figure 4.2: Comparison of the model prediction of the electric field and the true values for the coaxial Laplace equation

4.2 Drift-Diffusion Equation

We mentioned in Section 3.1.2 that a realistic value for the drift speed w is 200 m/s and for the diffusion coefficient D_iff is 5.05e−6 m²/s. As with the Laplace equation, it is beneficial to look at a simplified case first, so we choose to ignore diffusion, i.e., set D_iff to zero, and set the drift speed to 1 m/s. The omission of diffusion is not an unreasonable simplification, because the diffusion is small relative to the system. We found a good setup using a net with 8 layers of 20 neurons, with 500 points in the domain, 200 on the boundaries, and 150 at the initial time. To see how well the PINN can model charge transport, we used FEM through Comsol [19] with the implementations found in Karman [2]. We compare the solution we get from the PINN to the solution found using Comsol with 200 mesh points, both with and without numerical stabilization. Figures 4.3 and 4.4 show a comparison of the PINN model with the Comsol simulations for 11 different times; each drop from 1 to 0 is a prediction for a different time.

Figure 4.3: Comparison of the drift model with the solution found using Comsol without numerical stabilization

Figure 4.4: Comparison of the drift model with the solution found using Comsol with numerical stabilization

Table 4.2: Test loss, training time, and optimizer specifications for the PINNs shown in Figures 4.3-4.7.

Figure      | Adam iterations | Adam lr | L-BFGS iterations | Time  | Loss
4.3 and 4.4 | -               | -       | 356               | 37 s  | 4.2e−8
4.5         | -               | -       | 1210              | 269 s | 2.9e−6
4.6         | -               | -       | 911               | 11 s  | 3.2e−3
4.7         | 2000            | 0.01    | 521               | 30 s  | 7.5e−7

Now we move on to the more realistic case with w = 200 m/s and D_iff = 5.05e−6 m²/s. The first thing to consider is scaling, since w is not of order one.
After dividing Equation 3.7 by w, all parameters are of order one:

\[
\frac{1}{w}\frac{\partial n}{\partial t} + \frac{\partial n}{\partial x} - \frac{D_{\mathrm{iff}}}{w}\frac{\partial^2 n}{\partial x^2} = 0
\tag{4.1}
\]

It is also worth noting that applying hard constraints in this case does not improve training and convergence as consistently as for the Laplace equation. The reason could be the discontinuity at the left boundary. Using the same model setup as in the previous example, we can get good results (Figure 4.5, 3.5-5 min), but we can also get results that are not as good (Figure 4.6, 0-3 min). This randomness comes from how the values in the weight matrix are initialized. We can achieve faster training by first training with Adam (as recommended by Markidis [12]), but the results obtained with this setup are still subject to the randomness of the initialization, and it is difficult to find a configuration that consistently produces good results. Figure 4.7 shows the best result found for Equation 4.1; it took 30 s with the same setup as before, except that the model is trained for 2000 iterations with Adam at learning rate 0.01 before training with L-BFGS.

Figure 4.5: Good result from a drift-diffusion PINN model compared with the solution found using Comsol with numerical stabilization on a 50 mesh point grid

Figure 4.6: Not so good result from a drift-diffusion PINN model compared with the solution found using Comsol with numerical stabilization on a 50 mesh point grid

Figure 4.7: Best result from a drift-diffusion PINN model compared with the solution found using Comsol with numerical stabilization on a 50 mesh point grid

4.3 Mono Polar Charge Transport Coupled with Poisson's Equation

Next, we couple the two previous equations, Poisson's and the drift-diffusion equation, together, using the knowledge gained from studying them separately: it works well to use hard boundary conditions for Poisson's equation but not for the drift-diffusion equation, we should use scaling when parameters are not of order one, and it is beneficial to run Adam for a few thousand iterations before using L-BFGS.

As mentioned in Section 3.1.3, we first look at the case with a low injection of charges. Using the setup that gave the best result for the drift-diffusion equation, we get good results; see Figure 4.8. Table 4.3 shows how long training takes and how low a training loss we can achieve.

Table 4.3: Test loss, training time, and optimizer specifications for the PINNs shown in Figures 4.8-4.9.

Figure | Adam iterations | Adam lr                | L-BFGS iterations | Time  | Loss
4.8    | 2000            | 0.01                   | 850               | 96 s  | 4.4e−6
4.9    | 4000            | 1e−2, 1e−3, 5e−4, 1e−4 | 10482             | 312 s | 4.6e−4

When increasing the concentration of injected charges, we run into problems. As stated before, it is better to have parameters and boundary conditions of order one, and when scaling down the injection for the drift-diffusion equation we have to re-scale it back up where it enters the Poisson equation, so that its effect on the electric field is measurable. Computed with the same network setup as before, we get decent results in 2 minutes. We can improve the results by increasing the number of layers and running more iterations with Adam, but the improvement in the electric field prediction is not substantial while the increase in computation time is high. Figure 4.9 shows one realization with an injection of 1e9 ions/m³ (the highest we were able to solve for); the training time, optimizer specification, and loss achieved can be seen in Table 4.3. Figure 4.9 shows that we can detect changes in the electric field, although they are small.
When increasing the injection further, the loss becomes large and L-BFGS gets stuck in local minima, never getting close to the global minimum.

Figure 4.8: Coupled PDE PINN model with low injection compared with the solution found using Comsol with numerical stabilization on a 50 mesh point grid

Figure 4.9: Coupled PDE PINN model with the highest injection solved, 1e9 ions/m³

4.4 Summary of Best Practices

For all equations:

• Start with a simple problem.
• Use a smooth activation function; in our case tanh was used.
• Scale parameters such that they are of order one.
• Use hard boundaries or loss weights to ensure the losses have similar magnitudes.
• Start training with Adam and then use the L-BFGS optimizer.

In the Laplace setup, we get better results by increasing the number of neurons in each layer than by increasing the number of layers. However, in the time-dependent cases, the drift-diffusion and coupled equations, we get better results using deeper nets with fewer nodes in each layer. The network used for each problem can be seen in Table 4.4.

Table 4.4: Number of layers and neurons used for the best results for each problem.

Problem           | Layers | Neurons
Laplace (best)    | 3      | 40
Drift-diffusion   | 8      | 20
Coupled equations | 8      | 20

5 Discussion

Having gone through one approach to solving coupled PDEs with a PINN model, it is time to reflect on the lessons we can learn. The coaxial Laplace equation shows that even for a simple example with an analytical solution, the PINN can run into problems if the solution is highly nonlinear. When the inner radius is smaller, the non-linearity of the solution increases, which could be the reason for the difficulties we have when setting R0 lower than 0.005 m. Possible solutions that have not been explored include changing the network architecture to another type of network, such as a ResNet. A ResNet reuses the input at every layer, and since the input appears in the PDE, that could be beneficial. Another possible solution could be to write Equation 3.2 differently: when r is small, 1/r is large, and having large values in the PDE can prove troublesome in training, as we saw for the coupled equations. It could therefore be beneficial to multiply Equation 3.2 by r.

The drift-diffusion equation was more of a success: it shows that it is possible to get more accurate results from a PINN model than from FEM. There are still improvements to be made if PINNs are to be used. The PINN is currently much slower than FEM; simulations that take a few seconds in Comsol take up to half a minute with the PINN. Then there is the problem of consistency. When the solution space gets more complex, as when we introduce diffusion and increase the drift speed, the initial weight matrix starts to have a bigger effect on whether we converge to a good solution or not. To help mitigate this problem, it might be useful to use transfer learning, i.e., train a net on a similar but simpler problem and then use it as the initial net for the more complex problem.

When we moved on to the coupled equations, it was beneficial to have already studied the equations separately. That allowed us to find out what worked well for each equation and implement it for the coupled system. When all the parameters of the coupled PDEs were of order one, the PINN was able to find good solutions. However, when some parameter becomes large, training becomes slow and inefficient. Possible solutions are to re-scale the whole system, or possibly to apply a logarithmic transform to the PDEs, making everything much smaller; a sketch of one such rescaling follows.
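As an illustration of the re-scaling idea, below is one possible nondimensionalization of Equation 3.9. The reference scales (gap length L, drift speed w, injected density n_inj, applied potential U_0) are our choice for this sketch and are not taken from the thesis. Substituting x̃ = x/L, t̃ = tw/L, ñ = n/n_inj and Ũ = U/U_0 gives

\[
\frac{\partial \tilde{n}}{\partial \tilde{t}}
  = -\frac{\partial \tilde{n}}{\partial \tilde{x}}
    + \frac{D_{\mathrm{iff}}}{wL}\,\frac{\partial^2 \tilde{n}}{\partial \tilde{x}^2},
\qquad
\frac{\partial^2 \tilde{U}}{\partial \tilde{x}^2}
  = -\frac{q\, n_{\mathrm{inj}} L^2}{\varepsilon_0 U_0}\,\tilde{n}.
\]

With L = 1 m, w = 200 m/s and U_0 = 1 MV, the dimensionless coupling constant q n_inj L² / (ε_0 U_0) evaluates to about 1.8e−5 for the solved injection of 1e9 ions/m³ (consistent with the equations being essentially uncoupled there) and to about 0.9 for the target 5e13 ions/m³, so after this rescaling all terms in the coupled system are at most of order one.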
5.1 Future Work

We have only just begun answering the question of how useful PINNs are for problems arising in discharge physics; a lot more research is needed to determine that. In this section we suggest how our work could be built upon.

5.1.1 Improve Current Work

We think the next step would be to find a working solution to the mono polar charge transport equation coupled with Poisson's equation for injections up to 5e13 ions/m³. As mentioned earlier in the report, several possible strategies for improved convergence have been suggested, such as using ResNets, transfer learning, RNNs, more advanced weighting of the loss terms, transformer networks, or another loss function.¹ After that, more complexity could be added to the equations by introducing ion generation, ion-ion recombination, electrons, and possibly more. Having done that, we think it would be a good idea to take the Poisson equation and solve it in 2D and 3D with increasingly complex geometry. To better understand the advantages of PINNs over FEM with respect to the propagation of steep charge density gradients, a systematic comparison between FEM and PINNs is needed. In such a comparison, the PINN collocation points and the FEM mesh and time steps need to be chosen carefully to make the comparison fair.

¹ Suggested by thesis opponents Eric Jergéus and Leo Karlson Oinonen.

5.1.2 Use PINNs for Inverse Problem

When finding the parameters of Equations 1.1 and 1.2, many assumptions and simplifications are made, resulting in only approximations of the real parameters, so there is a need for methods that extract these physical parameters from experimental data. One such method is the data-driven discovery of PDEs using PINNs. The problem of data-driven discovery of partial differential equations poses the following question: given a small set of scattered and potentially noisy observations of the hidden state of a system, what are the parameters that best describe the observed data? This method has been shown to give good results.

6 Conclusion

The results show that although PINNs are promising, it is still necessary to develop how to use them effectively. They can be useful, as shown by the good results for the drift-diffusion equation. From the three case studies we learn that:

• The Laplace equation in coaxial coordinates can be solved if the non-uniformity of the solution is not too large.
• The drift-diffusion equation can be solved with a higher accuracy than with FEM; however, the solution time is much longer.
• The coupled Poisson and drift-diffusion equations can be successfully solved by a PINN if the equations are weakly coupled. Further development is needed to solve these equations when they are strongly coupled.

It is good practice to start with a simple problem and then gradually increase the complexity. One needs to know every hyperparameter and how changing it affects training time and convergence. The current limitations of PINNs that we encountered are that highly non-linear equations result in slow convergence or non-convergent models, and that for complex systems, model accuracy, training time, and convergence tend to be inconsistent, varying greatly with each initialization of the weight matrix. We found no clear-cut advantages of using PINNs over traditional numerical solvers. Further research into the uses of PINNs for discharge physics is needed.

Bibliography

[1] S. Singh, "Computational framework for studying charge transport in high-voltage gas-insulated systems", Doctoral Thesis, 2017
[2] G.
Karman, "Simulation and Analysis of Corona Currents in Large Scale Coaxial Geometry due to Triangular Voltages", Master's Thesis, 2013
[3] H. Lee and I. S. Kang, "Neural algorithm for solving differential equations", Journal of Computational Physics 91.1 (1990), pp. 110-131
[4] M. Raissi, P. Perdikaris and G. E. Karniadakis, "Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations" (2017), http://arxiv.org/abs/1711.10561
[5] M. Raissi, P. Perdikaris and G. E. Karniadakis, "Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations" (2017), http://arxiv.org/abs/1711.10566
[6] S. Mishra and R. Molinaro, "Estimates on the generalization error of Physics Informed Neural Networks (PINNs) for approximating PDEs" (2020), https://arxiv.org/abs/2006.16144
[7] S. Wang, X. Yu and P. Perdikaris, "When and why PINNs fail to train: A neural tangent kernel perspective" (2020), https://arxiv.org/abs/2007.14527
[8] L. Lu, R. Pestourie, W. Yao, Z. Wang, F. Verdugo and S. G. Johnson, "Physics-informed neural networks with hard constraints for inverse design" (2021), https://arxiv.org/abs/2102.04626
[9] D. L. Logan, A First Course in the Finite Element Method, Cengage Learning, 2011. ISBN 978-0495668251
[10] L. Lu, X. Meng, Z. Mao and G. E. Karniadakis, "DeepXDE: A Deep Learning Library for Solving Differential Equations" (2021), https://doi.org/10.1137/19M1274067
[11] A. G. Baydin, B. A. Pearlmutter and A. A. Radul, "Automatic differentiation in machine learning: a survey" (2015), http://arxiv.org/abs/1502.05767
[12] S. Markidis, "The Old and the New: Can Physics-Informed Deep-Learning Replace Traditional Linear Solvers?" (2021), https://arxiv.org/abs/2103.09655
[13] Y. Guo, X. Cao, B. Liu and M. Gao, "Solving Partial Differential Equations Using Deep Learning and Physical Constraints" (2020), https://www.mdpi.com/2076-3417/10/17/5917
[14] G. S. Misyris, A. Venzke and S. Chatzivasileiadis, "Physics-Informed Neural Networks for Power Systems" (2019), https://arxiv.org/abs/1911.03737
[15] Q. Zhang, Y. Chen and Z. Yang, "Data driven solutions and discoveries in mechanics using physics informed neural network" (2020), http://cs229.stanford.edu/proj2020spr/report/ZhangChenYang.pdf
[16] N. Zobeiry and K. D. Humfeld, "A physics-informed machine learning approach for solving heat transfer equation in advanced manufacturing and engineering applications" (2021), https://www.sciencedirect.com/science/article/pii/S0952197621000798
[17] F. Bragone, "Physics-informed machine learning in power transformer dynamic thermal modelling", Master's Thesis, 2021
[18] M. Barreau, J. Liu and K. H. Johansson, "Learning-based State Reconstruction for a Scalar Hyperbolic PDE under noisy Lagrangian Sensing" (2021), https://proceedings.mlr.press/v144/barreau21a.html
[19] COMSOL Multiphysics software, https://comsol.com
[20] Sobol sequence, https://en.wikipedia.org/wiki/Sobol_sequence
[21] D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization" (2014), https://arxiv.org/abs/1412.6980
[22] D. C. Liu and J.
Nocedal, "On the limited memory BFGS method for large scale optimization", Mathematical Programming 45, 503-528 (1989), https://doi.org/10.1007/BF01589116

DEPARTMENT OF ELECTRICAL ENGINEERING
CHALMERS UNIVERSITY OF TECHNOLOGY
Gothenburg, Sweden
www.chalmers.se