From Noise to Pattern: Inverse Design of FSS Using Variational Autoencoder
Generative Machine Learning for the Inverse Design of FSS

Master's thesis in Complex Adaptive Systems

FRANCISCO BOUDAGH

Department of Physics
Chalmers University of Technology
Gothenburg, Sweden 2025
www.chalmers.se

© FRANCISCO BOUDAGH, 2025.

Supervisor: Ahmed Gouda, Department of Antenna System and Technology, Ericsson AB
Supervisor: Giovanni Volpe, Department of Physics, University of Gothenburg
Examiner: Giovanni Volpe, Department of Physics, University of Gothenburg

Master's Thesis 2025
Department of Physics
Chalmers University of Technology
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Cover: Conceptual visualization of the high-dimensional loss landscape in FSS design, where every point corresponds to a unique FSS pattern. The arrows trace gradient descent paths, guiding the way to lower loss and optimal solutions.

Typeset in LaTeX, template by Kyriaki Antoniadou-Plytaria
Gothenburg, Sweden 2025

Abstract

Frequency Selective Surfaces (FSSs) are critical for modern electromagnetic filtering, but their design traditionally relies on costly trial-and-error simulations. We present a generative machine learning framework for the inverse design of FSS unit cell patterns.
By utilizing a conditional variational autoencoder (cVAE), the proposed method directly maps desired electromagnetic scattering parameters (S-parameters) to FSS patterns, thereby circumventing the traditional energy- and time-expensive trial-and-error approach inherent in FSS design. A dataset of 10,000 simulated (pattern, S-parameter) samples was generated using Ansys HFSS over the 2 to 8 GHz frequency range to train both a surrogate neural network, which accurately predicts the S-parameters from a given pattern, and a cVAE-based generator that synthesizes novel pattern designs conditioned on target frequency responses.

The integrated framework employs a gradient-based optimization strategy in the latent space to minimize the deviation between the predicted and desired S-parameter responses, with particular emphasis on preserving the resonant frequency. Benchmarking on the test dataset demonstrates that the surrogate model achieves mean absolute errors from 0.5 dB at 2 GHz to 1.9 dB at 8 GHz, while the optimization loop refines designs to yield deviations as low as 0.0-0.2 GHz at the resonant frequency for half of the samples. These results underscore the promising potential of generative machine learning for rapid FSS inverse design.

Keywords: Machine learning, variational autoencoder, artificial neural networks, inverse problem, optimization, frequency selective surfaces, metamaterials.

Acknowledgments

This thesis is the product of numerous conversations, interactions, and shared experiences. I am grateful to everyone who inspired me and encouraged authenticity throughout this journey, shaping both this work and my personal perspective. Special thanks go to my supervisor, Ahmed, and my examiner, Giovanni, for their valuable guidance and insightful support. To all who have supported me throughout my academic and personal journey, your influence is woven into every page of this thesis.
Francisco Boudagh, Gothenburg, June 2025

List of Acronyms

Below is the list of acronyms used throughout this thesis, in alphabetical order:

Adam    Adaptive Moment Estimation
ANN     Artificial Neural Network
CNN     Convolutional Neural Network
cVAE    Conditional Variational Autoencoder
EM      Electromagnetic
FCNN    Fully Connected Neural Network
FSS     Frequency Selective Surface
HFSS    High-Frequency Structure Simulator
IRS     Intelligent Reflecting Surface
MAE     Mean Absolute Error
ML      Machine Learning
MLP     Multilayer Perceptron
MSE     Mean Squared Error
PEC     Perfect Electric Conductor
S-parameters    Scattering Parameters

Nomenclature

This nomenclature lists the parameters and variables used in the Methods, Results, and Conclusion chapters, excluding those exclusive to the Theory chapter.

Scattering Parameters
|S11|   Magnitude of reflected waves (in dB)
|S21|   Magnitude of transmitted waves (in dB)
S       S-parameters vector (concatenation of |S11| and |S21|)
Ŝ       Predicted S-parameters vector (obtained from the surrogate model)
St      Target S-parameters vector
□v      Index at the valley of St (resonant frequency index)

Pattern Representation and Latent Space
𝒫       Pattern space
P       Pattern (binary image)
P*      Optimal pattern derived from z*
(P, S)  A pair consisting of a pattern and its corresponding S-parameters
Z       Latent space of the variational autoencoder
z       A sample from the latent space of the variational autoencoder
z*      Optimal latent space sample, obtained through optimization

Neural Network Models and Loss Functions
NS      Surrogate neural network model
NG      Generator neural network model
Lopt    Loss function for the optimization process
LS      Loss function for training the surrogate neural network
LG      Loss function for training the generator neural network (VAE loss)

Electromagnetic Parameters
εr      Dielectric constant
tan δ   Dissipation factor
f       Frequency
λ       Wavelength
∆f      Frequency deviation
∆Mag.   Magnitude deviation

Contents

List of Acronyms
Nomenclature
List of Figures

1 Introduction
  1.1 Background
  1.2 Problem Statement and Thesis Scope
  1.3 Applications of FSS
  1.4 Related Work

2 Theory
  2.1 Electromagnetic Plane Waves
  2.2 Material Models: Metals and Dielectrics
    2.2.1 Metals
    2.2.2 Dielectrics
  2.3 Periodic Structures and Modal Analysis
  2.4 Scattering Parameters
  2.5 Artificial Neural Networks Fundamentals
    2.5.1 Backpropagation
  2.6 Advanced Neural Network Concepts
    2.6.1 Convolutional Neural Networks
    2.6.2 Residual Connections
    2.6.3 Attention Mechanisms
  2.7 Conditional Variational Autoencoder
  2.8 Theoretical Insights and Constraints

3 Methods
  3.1 Overall Workflow
  3.2 Dataset Creation
    3.2.1 Pattern Generation
    3.2.2 Simulation
  3.3 Surrogate Model
    3.3.1 Dataset Size Analysis
  3.4 Generator Model: cVAE
  3.5 Optimization

4 Results
  4.1 Surrogate Model
    4.1.1 Benchmarking
    4.1.2 Show Cases
    4.1.3 Dataset Size Analysis
  4.2 Generator Model
    4.2.1 Show Cases
  4.3 Optimization

5 Conclusion
  5.1 Summary of Research Findings
  5.2 Recommendations for Future Work
    5.2.1 Improving the Performance
    5.2.2 Inclusion of Incident Angles and Polarization
    5.2.3 Integration of Multi-Layered FSS
    5.2.4 Material and Dimension Parameterization
    5.2.5 Alternative Pattern Representation Techniques

Bibliography

A Appendix
  A.1 Target, predicted, and simulated S-curves for benchmarking samples
  A.2 Optimized unit cell pattern for benchmarking samples

List of Figures

1.1 Example of an FSS with a unit cell that has a simple metallic crosshair-like pattern.
1.2 Schematic representation of an object (golden circle) without an FSS enclosure (left) and with an FSS enclosure (right). The FSS (teal color) is designed to cloak the object by altering the field distribution.
In the left panel, the object scatters the incident wave, whereas in the right panel, the FSS manipulates the wavefronts to guide the electromagnetic waves around the object.
1.3 Two common applications of FSS. A: A stealth radome used in military vehicles. B: An intelligent reflecting surface (IRS) that directs the signal from the base station to the receiver, bypassing obstructions.
2.1 A schematic of an FSS array illustrating an incident wave Einc (green), reflected wave Eref (red), and transmitted wave Etrans (blue). The metallic pattern (gold) is repeated on a dielectric substrate (light blue). On the right, the simulated scattering parameters' magnitudes are shown as |S11| (red) and |S21| (blue) in dB versus a frequency range.
2.2 A schematic representation of a feedforward neural network with two input neurons, two hidden layers, and two output neurons. The first hidden layer (x(1)) consists of three neurons, while the second hidden layer (x(2)) has two neurons. The connections between layers are weighted by w(ℓ)_{j←i}, where ℓ indicates the layer index. Example weights, such as w(0)_{32}, w(1)_{23}, and w(2)_{11}, are labeled to illustrate layer connections.
2.3 Illustration of convolution and transposed convolution. The input map (left-most) is convolved with a kernel to produce a feature map. The feature map is then transposed convolved with a kernel to produce an upsampled feature map (right-most).
2.4 Schematic representation of a residual block. The input is added via a skip connection to the output of a series of layers.
2.5 Illustration of the cVAE architecture. The encoder maps the pattern P and condition S to the latent distribution parameters, from which z is sampled via the reparameterization trick. The decoder then reconstructs P conditioned on z and S.
3.1 Visual schematic of the overall optimization workflow. The diagram shows how the generator NG and surrogate NS are connected within the optimization loop to produce the optimal pattern P*.
3.2 Illustration of full unit cell pattern creation through double mirroring.
3.3 Some of the handmade initial patterns.
3.4 Different algorithms used to expand the initial 100 patterns to 10,000.
3.5 Overview of Ansys HFSS setup. A: A vacuum network with only a dielectric plate in the middle. B: A quadrant of a pattern loaded from a file. C: The quadrant in B has been mirror-duplicated around the y and x axes to form the complete pattern.
3.6 Neural network architecture of the surrogate model NS, with exemplified input P and output Ŝ.
3.7 Neural network architecture of the generator model NG (cVAE), with exemplified input St and output P̂.
4.1 Results from hyperparameter fine-tuning of the surrogate model. Train and validation loss over 50 epochs for the best and worst hyperparameter configurations.
4.2 Train and validation loss of the surrogate model using the optimal hyperparameters. The marker x indicates the minimum of the validation loss curve, i.e., the epoch at which the final model is saved, epoch 63.
4.3 Count versus absolute error (in dB) distribution, illustrating that most indices exhibit very low errors, with a sharp decline in count as absolute error increases.
4.4 Mean absolute error over the test samples across the frequency interval, showing generally low errors with a positive trend of increased errors at higher frequencies.
4.5 Three different samples from the test data showing prediction (blue) and the ground truth (green).
4.6 Validation loss L(min)_{S,val} on a validation set of 2,000 samples versus training data size ranging from 1,000 to 8,000.
4.7 Results from hyperparameter fine-tuning of the generator model (cVAE). Train and validation loss over 70 epochs for the best and worst hyperparameter configurations.
4.8 Train and validation loss of the generator model using the optimal hyperparameters. The marker x indicates the minimum of the validation loss curve, i.e., the epoch at which the final model is saved (epoch 107).
4.9 A set of 6 arbitrary outputs from the generator model, illustrating its ability to generate novel FSS unit cell patterns.
4.10 Sample 5 - target, predicted, and simulated S-curves
4.11 Sample 6 - target, predicted, and simulated S-curves
4.12 Sample 7 - target, predicted, and simulated S-curves
4.13 Sample 8 - target, predicted, and simulated S-curves
A.1 Sample 1 - S-curves
A.2 Sample 2 - S-curves
A.3 Sample 3 - S-curves
A.4 Sample 4 - S-curves
A.5 Sample 5 - S-curves
A.6 Sample 6 - S-curves
A.7 Sample 7 - S-curves
A.8 Sample 8 - S-curves
A.9 Sample 9 - S-curves
A.10 Sample 10 - S-curves
A.11 Sample 1 - optimized pattern
A.12 Sample 2 - optimized pattern
A.13 Sample 3 - optimized pattern
A.14 Sample 4 - optimized pattern
A.15 Sample 5 - optimized pattern
A.16 Sample 6 - optimized pattern
A.17 Sample 7 - optimized pattern
A.18 Sample 8 - optimized pattern
A.19 Sample 9 - optimized pattern
A.20 Sample 10 - optimized pattern

1 Introduction

1.1 Background

A Frequency Selective Surface (FSS) is an ultra-thin periodic surface designed to interact with incident electromagnetic plane waves by partly or fully transmitting, reflecting, or changing their polarization at different frequencies [1]. The smallest component of such a surface is called a unit cell, which is repeated along both axes, in theory infinitely and in practice many times depending on the specific application [1]. The common design consists of a dielectric substrate, serving as a complementary base, with a metal printed on it. The electromagnetic (EM) behavior of this surface depends, among other factors, on the pattern formed by the dielectric and metal [2]. Figure 1.1 depicts a simple FSS with a metallic crosshair-like pattern.

Figure 1.1: Example of an FSS with a unit cell that has a simple metallic crosshair-like pattern.

When designing an FSS, the pattern of a single unit cell is primarily considered.
Other properties that affect the electromagnetic behavior include the specific material choice (i.e., which dielectric and which metal), the number of FSS layers, and the geometric dimensions. When analyzing a specific design of an FSS, the scattering parameters (S-parameters) are typically used to quantify the EM behavior. In older literature, these are simply referred to as reflection and transmission coefficients. These parameters are complex numbers that describe the portion of the wave being transmitted or reflected at different frequencies. Thus, when designing the pattern, one aims to achieve the desired S-parameters in simulation software [1, 3].

The search for what is known today as an FSS began in the early 1950s by the organization now known as the Air Force Research Laboratory, operating under the United States Air Force. The main goal at that time was to minimize the radar signature of aircraft by using radar absorbing materials (RAM) [1], now referred to as stealth materials [4]. Today, FSSs are used in many applications, including radar and stealth technology, satellite communication systems, wireless communication, medical applications, and optical and infrared applications [2]; specific applications will be presented later in this chapter.

1.2 Problem Statement and Thesis Scope

The current design process for FSSs typically begins with an experienced engineer or antenna designer defining the desired S-parameters, which may also be provided by the customer. Next, an initial metal-dielectric pattern is proposed and simulated. The resulting EM behavior is analyzed and compared with the target behavior. If the simulated and target behaviors do not match, the geometric dimensions of the pattern are adjusted to better match the desired EM response. For example, if the proposed pattern is ring-shaped, the geometric dimensions would include the inner and outer radii.
If these modifications are insufficient, an outer loop is initiated in which an entirely different pattern is selected that may better achieve the intended response [5]. The mapping between EM behavior and the FSS pattern is nonlinear and many-to-one; that is, multiple distinct patterns can produce the same EM response. In other words, the problem lacks a closed-form solution and is non-injective [6]. Consequently, the design process is trial-and-error-based, time-consuming, and energy-intensive due to the large number of required simulations. Furthermore, it relies heavily on the designer's experience and creativity [7].

To tackle this time-consuming, energy-intensive, and tedious design process, and to potentially discover new patterns for desired behavior, this thesis develops a generative machine learning (ML) model that, given a desired electromagnetic behavior, produces an FSS pattern that obeys it. The idea is to use a conditional variational autoencoder model as the pattern generator. There are many benefits to, and a clear need for, such a tool. Provided the model generates reasonable or promising designs, it will save enormous amounts of time, and consequently energy, and will also lower the expertise threshold so that even a junior designer can design FSSs. Furthermore, such a probabilistic model is, unlike human experience and imagination, not limited by convention and can come up with completely new and novel patterns.

There are other parameters besides the pattern that could affect the S-parameters, such as the dielectric substrate's physical properties, thickness, and the dimensions of the FSS itself, including height and width. All these parameters will be fixed. While it is possible to design multi-layer FSSs to achieve more complicated EM behavior, to further limit the project's scope, only single-layer FSSs will be used. The dielectric substrate will be fixed.
Lastly, the metal type will also be fixed to a perfect electric conductor (PEC). In conclusion, everything except the unit cell pattern will be fixed.

1.3 Applications of FSS

Reducing the radar cross-section of antenna-equipped devices is one of the most common applications of FSS. This technique is widely used in the military to cloak stealth missiles or vehicles. For example, in a stealth platform such as the Franco-British Storm Shadow missile or the US F-35 aircraft, an antenna in the nose is covered by a stealth radome consisting of FSS. This radome allows radar waves to pass through at a specific operating frequency (in band) while blocking waves at other frequencies (out of band) [2] (see Figure 1.3A).

A notable application of FSS worth highlighting is advanced cloaking, a technique that manipulates the electromagnetic field in order to warp the incoming plane wave in a controlled manner and thereby cloak an object (see Figure 1.2). In fact, with this technique, one could design an FSS that warps the wave entirely around an object, making it invisible at a specific frequency [8, 9].

Figure 1.2: Schematic representation of an object (golden circle) without an FSS enclosure (left) and with an FSS enclosure (right). The FSS (teal color) is designed to cloak the object by altering the field distribution. In the left panel, the object scatters the incident wave, whereas in the right panel, the FSS manipulates the wavefronts to guide the electromagnetic waves around the object.

Figure 1.3: Two common applications of FSS. A: A stealth radome used in military vehicles. B: An intelligent reflecting surface (IRS) that directs the signal from the base station to the receiver, bypassing obstructions.

FSSs are also used in electromagnetic interference shielding. They work as enclosures for electronic systems, such as medical devices or other sensitive equipment, to block interference at specific frequencies.
Studies have also shown that it is physically possible to fabricate FSSs on flexible substrates, which makes them practical for portable devices and even for integration into automotive interiors [10].

The evolution of wireless communication and advancements in 5G and 6G have increased the need for advanced EM filtering methods. In modern wireless communication systems, intelligent reflecting surfaces (IRS) are employed. These surfaces are composed of FSS elements that control the propagation of EM waves. They can reflect, transmit, and steer (beamform) waves in real time in a controlled manner. This is done to improve coverage, enhance link reliability, and increase data rates [11] (see Figure 1.3B).

Another application is smart sensing for structural health and environmental monitoring. In this approach, FSSs are fabricated on flexible or textile substrates and integrated into structures or wearable devices. When a structure undergoes strain or deformation, the dimensions of the FSS elements change slightly, which in turn shifts the resonance frequency of the FSS. By monitoring these changes, one can detect mechanical stress or environmental changes in real time. This method is low-profile, non-invasive, cost-effective, and suited for integration into the Internet of Things [2].

1.4 Related Work

There have been many research articles using machine learning for FSS design. However, most of them consist of researchers coming up with a pattern and then using ML algorithms to optimize the pattern dimensions, that is, the geometric parameters of that specific pattern. While this approach reduces the time required for designing an FSS, it still does not fully harness the power of generative ML to create novel designs given a desired frequency response.

For example, in the study [12], the authors proposed an initial predefined pattern that was sunflower-inspired.
They then trained an ML model (a decision tree algorithm) using various geometric parameters of the pattern and their corresponding frequency responses. Once the model was trained, it was used to predict the optimal geometric parameters given a desired frequency response. A similar study, focusing on a different frequency band and application, used another predefined pattern, a plus-shaped design with a circle in the center [13]. In that work, a simple fully connected neural network (FCNN) was trained to learn the mapping between the geometric parameters of the predefined pattern and the frequency response, and later used to predict the optimal geometric parameters given a desired behavior.

There have been many similar studies (for example, [14, 15, 16]) that follow this approach; what differentiates them is mainly the initial predefined pattern, frequency band, and specific application of the FSS. These approaches, which start with a predefined pattern and then train an ML model to learn the mapping between the geometric parameters and the EM response, streamline the design process by reducing the number of simulation iterations through the interpolating power of the neural network. However, this approach is only quasi-automated and still relies heavily on the designer's experience and creativity, for example, in designing the initial pattern, which is commonly quite complex.

In recent years, there have been a few research articles addressing the fully automated inverse design of FSS through generative ML. For example, a study published in 2023 [17] developed an FCNN to predict the pattern (a binary image encoding metal and dielectric substrate) given the desired S-parameters.
However, as mentioned, the generative component consisted mainly of an FCNN, which is not inherently a generative model and therefore did not harness the power of recent generative ML algorithms such as generative adversarial networks (GANs), variational autoencoders (VAEs), or diffusion models. Furthermore, that study used binary images of size 30×30 pixels and only 3,000 samples of patterns with their corresponding simulated S-parameters in the training data. In a similar study that used 52×52-pixel binary images to represent the pattern, a VAE was used as a generator [18].

While this thesis shares similarities with the aforementioned studies, it differs by utilizing a larger search space and dataset, as well as by adopting one of the more recent generative machine learning architectures: a conditional variational autoencoder (cVAE). Standard VAEs are known to sometimes produce blurry or lower-quality images because of their inherent design. In contrast, the conditional variant (cVAE) leverages auxiliary information, in this case the desired S-parameters, to guide the generation process more precisely. This not only leads to improved image quality and increased fidelity with respect to the target characteristics but also ensures a more stable training process. Moreover, while GANs can be unstable and require careful fine-tuning to avoid issues such as mode collapse, the cVAE framework offers a robust and straightforward alternative without reliance on adversarial training mechanisms.

2 Theory

This chapter presents the mathematical framework for electromagnetic wave propagation in periodic structures. We derive plane-wave solutions from Maxwell's equations, introduce material properties for metals and dielectrics, formulate the Floquet modal expansion for periodic structures, and define the scattering parameters S11 and S21.
In addition, we present an overview of artificial neural networks and advanced machine learning techniques relevant to our inverse design approach. This includes the architecture and training of feedforward networks, convolutional layers, and a conditional variational autoencoder, which will work together to map the desired electromagnetic responses to optimal FSS patterns.

2.1 Electromagnetic Plane Waves

In a homogeneous, isotropic, source-free medium, Maxwell's curl equations in phasor form are

\[ \nabla \times \mathbf{E}(\mathbf{r}) = -j\omega\mu\,\mathbf{H}(\mathbf{r}), \]  (2.1)
\[ \nabla \times \mathbf{H}(\mathbf{r}) = j\omega\varepsilon\,\mathbf{E}(\mathbf{r}), \]  (2.2)

where ω is the angular frequency, μ = μ0μr is the permeability, and ε = ε0εr is the permittivity. A plane-wave solution is given by

\[ \mathbf{E}(\mathbf{r}) = \Re\{\mathbf{E}_0\, e^{-j\mathbf{k}\cdot\mathbf{r}}\}, \qquad \mathbf{H}(\mathbf{r}) = \Re\{\mathbf{H}_0\, e^{-j\mathbf{k}\cdot\mathbf{r}}\}, \]  (2.3)

with the position vector r = (x, y, z), constant amplitude vectors E0 and H0, and wave vector k. The dispersion relation reads

\[ k \equiv |\mathbf{k}| = \omega\sqrt{\mu\varepsilon} = \frac{\omega}{c}\sqrt{\varepsilon_r \mu_r}, \]  (2.4)

with c = 1/√(μ0 ε0) [19, 20].

2.2 Material Models: Metals and Dielectrics

2.2.1 Metals

In FSS applications, metallic elements are often modeled as Perfect Electric Conductors (PECs), satisfying

\[ \mathbf{E}_{\parallel}\big|_{\text{metal}} = 0. \]  (2.5)

For real metals with finite conductivity σ (S/m), fields penetrate up to the skin depth

\[ \delta = \sqrt{\frac{2}{\omega\mu\sigma}}, \]  (2.6)

where μ = μ0μr. If the metal thickness t_metal satisfies t_metal ≫ δ, the PEC approximation holds [1].

2.2.2 Dielectrics

Dielectric substrates are characterized by a relative permittivity εr and a loss tangent tan δ. Their response is modeled by the complex permittivity

\[ \varepsilon = \varepsilon_0\,\varepsilon_r\,(1 - j\tan\delta), \]  (2.7)

which directly affects the resonance frequency and bandwidth of the FSS [19].

2.3 Periodic Structures and Modal Analysis

An FSS is realized as a two-dimensional periodic array of metallic elements on a dielectric substrate. The lattice translation is defined as

\[ \mathbf{R} = m\,p_x\,\hat{\mathbf{x}} + n\,p_y\,\hat{\mathbf{y}}, \qquad m, n \in \mathbb{Z}, \]  (2.8)

and Bloch's theorem requires that

\[ \mathbf{E}(\mathbf{r} + \mathbf{R}) = \mathbf{E}(\mathbf{r})\, e^{j\mathbf{k}_B \cdot \mathbf{R}}, \]  (2.9)

with k_B being the Bloch wave vector.
Consequently, the total field can be expanded into Floquet modes:

\[ \mathbf{E}(\mathbf{r}) = \sum_{m,n} \mathbf{E}_{m,n} \exp\!\left\{ j \left[ \mathbf{k}_0 + \frac{2\pi m}{p_x}\,\hat{\mathbf{x}} + \frac{2\pi n}{p_y}\,\hat{\mathbf{y}} \right] \cdot \mathbf{r} \right\}, \]  (2.10)

where k0 is the incident wave vector [21].

2.4 Scattering Parameters

When an incident plane wave with complex amplitude Einc impinges on an FSS unit cell, it produces a reflected wave of amplitude Eref and a transmitted wave of amplitude Etrans; see Figure 2.1 for an illustration. The scattering parameters are defined as

\[ S_{11} = \frac{E_{\text{ref}}}{E_{\text{inc}}}, \qquad S_{21} = \frac{E_{\text{trans}}}{E_{\text{inc}}}, \]  (2.11)

which are complex and thus encode both amplitude and phase information. Their linear magnitudes, |S11| and |S21|, represent the amplitude ratios; squaring these values gives the reflected and transmitted power ratios, respectively. An ideal band-stop FSS is designed such that |S11|² ≈ 1 and |S21|² ≈ 0 over the stop band, while a band-pass FSS requires |S11|² ≈ 0 and |S21|² ≈ 1. In a lossless network, energy conservation demands that

\[ |S_{11}|^2 + |S_{21}|^2 = 1, \]  (2.12)

while in practice, material losses ensure that |S11|² + |S21|² < 1 [20, 22].

Figure 2.1: A schematic of an FSS array illustrating an incident wave Einc (green), reflected wave Eref (red), and transmitted wave Etrans (blue). The metallic pattern (gold) is repeated on a dielectric substrate (light blue). On the right, the simulated scattering parameters' magnitudes are shown as |S11| (red) and |S21| (blue) in dB versus a frequency range.

2.5 Artificial Neural Networks Fundamentals

Artificial neurons, in the context of artificial neural networks (ANNs), are computer-simulated biological neurons. The explanation of ANNs in this chapter targets the classical feedforward neural network (and not other types, such as in reservoir computing). The purpose of ANNs is to be taught, usually through large amounts of data, to accomplish a certain task, typically one that is not easily programmable in an explicit manner, by learning a mapping from inputs to outputs.
Each neuron has a state x, a value derived from the neurons in the previous layer through weighted connections, much like how biological neurons communicate via electrical and chemical signals [23]. The feedforward information propagation in an ANN is defined as follows. Let x^(0) be the input layer and O the output; see Figure 2.2. Denote the number of hidden layers (i.e., the layers between the input and output layers) by N. Then the output is computed by

O = x^(N+1) = g^(N+1)( W^(N+1) x^(N) − θ^(N+1) ),  (2.13)

and the state recursion for the hidden layers is given by

x^(ℓ) = g^(ℓ)( W^(ℓ) x^(ℓ−1) − θ^(ℓ) ),  ℓ = 1, 2, . . . , N.  (2.14)

Here, W^(ℓ) represents the weights connecting layer ℓ−1 to layer ℓ, and θ^(ℓ) represents the biases (or thresholds). Note that the linear operations within the parentheses alone would only allow the network to model linear relationships; hence, a nonlinear activation function g is applied to introduce nonlinearity [24]. Some commonly used activation functions include:

Sigmoid: σ(z) = 1 / (1 + e^(−z)),
Tanh: tanh(z) = (e^z − e^(−z)) / (e^z + e^(−z)),
ReLU: ReLU(z) = max(0, z),
SiLU: SiLU(z) = z / (1 + e^(−z)) = z σ(z),
Softmax: softmax(z_i) = e^(z_i) / Σ_j e^(z_j),  i = 1, . . . , K.

As mentioned earlier, the network is trained by allowing it to predict an output (via Equations (2.13) and (2.14)), comparing this prediction with the true value through a loss function, and then updating the learnable parameters accordingly [24].

Figure 2.2: A schematic representation of a feedforward neural network with two input neurons, two hidden layers, and two output neurons. The first hidden layer (x^(1)) consists of three neurons, while the second hidden layer (x^(2)) has two neurons. The connections between layers are weighted by w^(ℓ)_{j←i}, where ℓ indicates the layer index. Example weights, such as w^(0)_{32}, w^(1)_{23}, and w^(2)_{11}, are labeled to illustrate layer connections.
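Equations (2.13)–(2.14) translate almost directly into code. A minimal NumPy sketch of the forward pass for the network of Figure 2.2 (2 inputs, hidden layers of 3 and 2 neurons, 2 outputs), with randomly initialized weights purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, thetas, g=sigmoid):
    """Propagate input x through the layers: x_l = g(W_l @ x_{l-1} - theta_l)."""
    for W, theta in zip(weights, thetas):
        x = g(W @ x - theta)
    return x

rng = np.random.default_rng(0)
sizes = [2, 3, 2, 2]  # input, two hidden layers, output (as in Figure 2.2)
weights = [rng.standard_normal((n_out, n_in)) for n_in, n_out in zip(sizes, sizes[1:])]
thetas = [np.zeros(n_out) for n_out in sizes[1:]]

O = forward(np.array([0.5, -1.0]), weights, thetas)
print(O.shape)  # (2,)
```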
Common loss functions include:

Mean Squared Error: L_MSE = (1/n) Σ_{i=1}^n (y_i − ŷ_i)²,
Binary Cross-Entropy: L_BCE = −(1/n) Σ_{i=1}^n [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ],
Categorical Cross-Entropy: L_CCE = −(1/n) Σ_{i=1}^n Σ_{k=1}^K y_{i,k} log ŷ_{i,k}.

Here, ŷ_i denotes the network's prediction and y_i the ground-truth value for the i-th sample in a dataset of n samples. The choice of loss function is problem-specific; for example, MSE may be suitable for a regression task, while the cross-entropy functions are appropriate for classification tasks [24]. Given the loss value L, the network's learnable parameters, namely the weights W and biases θ, are updated according to the gradient descent principle through the algorithm known as backpropagation. In essence, backpropagation computes the gradient of the loss function with respect to each parameter by propagating the error backward from the output layer to the input layer.

2.5.1 Backpropagation

To understand backpropagation, we begin by defining the local field (or net input) of the neurons. For the output layer (layer N + 1), the local field is

B^(N+1) = W^(N+1) x^(N) − θ^(N+1).

The error at the output is then expressed as

∆^(N+1) = (t − O) ⊙ g′^(N+1)( B^(N+1) ),

where t denotes the target output, g′^(N+1) is the derivative of the activation function at the output layer, and ⊙ indicates the element-wise Hadamard product. Note that since the derivative of the activation function is required, the activation function must be differentiable [24]. For the hidden layers (ℓ = N, N−1, N−2, . . . , 1), the error is computed recursively by applying the chain rule:

∆^(ℓ) = ( (W^(ℓ+1))^T ∆^(ℓ+1) ) ⊙ g′^(ℓ)( B^(ℓ) ),

with B^(ℓ) = W^(ℓ) x^(ℓ−1) − θ^(ℓ). This backward flow of error signals is the origin of the term backpropagation. Once the errors ∆^(ℓ) are determined for each layer, the parameters are updated via gradient descent.
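The error recursion above can be sketched concretely. A minimal NumPy implementation of backpropagation for a single hidden layer with sigmoid activations, trained on the XOR toy task; the network sizes, learning rate, and epoch count are illustrative choices, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy task and a 2-3-1 network; all values are illustrative.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])  # XOR targets
W1, th1 = rng.standard_normal((3, 2)), np.zeros(3)
W2, th2 = rng.standard_normal((1, 3)), np.zeros(1)
eta = 1.0

def loss():
    H = sig(X @ W1.T - th1)
    O = sig(H @ W2.T - th2)
    return np.mean((T - O) ** 2)

before = loss()
for _ in range(500):
    for x, t in zip(X, T):
        b1 = W1 @ x - th1; h = sig(b1)
        b2 = W2 @ h - th2; o = sig(b2)
        d2 = (t - o) * o * (1 - o)       # output error, with g'(B) = o(1 - o)
        d1 = (W2.T @ d2) * h * (1 - h)   # hidden error via the chain rule
        W2 += eta * np.outer(d2, h); th2 += -eta * d2
        W1 += eta * np.outer(d1, x); th1 += -eta * d1
after = loss()
print(after < before)  # training reduces the loss
```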
The weight update for layer ℓ is given by

δW^(ℓ) = η ∆^(ℓ) (x^(ℓ−1))^T,

and the corresponding bias update is

δθ^(ℓ) = −η ∆^(ℓ),

where η > 0 is the learning rate. Thus, the updated weights and biases are obtained as [24]:

W^(ℓ) ← W^(ℓ) + δW^(ℓ),  θ^(ℓ) ← θ^(ℓ) + δθ^(ℓ).

For batch training, where the weight and bias updates are averaged over all p training patterns, the updates become

δW^(ℓ) = (η/p) Σ_{µ=1}^p ∆^(ℓ)(µ) (x^(ℓ−1)(µ))^T,
δθ^(ℓ) = −(η/p) Σ_{µ=1}^p ∆^(ℓ)(µ).

Alternatively, in stochastic gradient descent, the updates are applied after processing each individual pattern µ:

δW^(ℓ) = η ∆^(ℓ)(µ) (x^(ℓ−1)(µ))^T,  δθ^(ℓ) = −η ∆^(ℓ)(µ).

It is important to note that the choice of hyperparameters, such as the activation function, learning rate, number of neurons, and number of hidden layers, is largely an experimental, experience-based fine-tuning decision, while the choice of the loss function, the objective, depends heavily on the specific task. The gradient descent and stochastic gradient descent formulations presented above are typically used in the literature to illustrate the basic process of computing gradients and updating network parameters. In practice, modern machine learning applications generally employ more sophisticated optimizers that incorporate momentum and adaptive learning rates, such as Adaptive Moment Estimation (Adam), a method for stochastic optimization [25].

2.6 Advanced Neural Network Concepts

Having covered the fundamentals of neural networks (feedforward propagation, learning via backpropagation, activation functions, and loss functions), we now introduce several advanced concepts and common layers that are needed for this work. In particular, we focus on convolutional and transposed convolutional layers, as well as residual connections and attention mechanisms.

2.6.1 Convolutional Neural Networks

Convolutional neural networks (CNNs) are designed to process data with a grid-like topology, such as images.
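Before the formal definition, the core operation of a convolutional layer can be sketched directly: a small kernel slides over the input, and each output pixel is the weighted sum of the overlapped patch. A minimal NumPy implementation (technically a cross-correlation, the indexing convention used by most ML frameworks):

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation: out(i, j) = sum_{m,n} I(i+m, j+n) K(m, n)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An edge-detecting kernel responds only at the boundary of a metal region.
img = np.zeros((6, 6)); img[:, 3:] = 1.0  # toy binary pattern: right half metal
kx = np.array([[-1.0, 1.0]])              # horizontal difference kernel
print(conv2d(img, kx)[0])                 # nonzero only at the 0 -> 1 edge
```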
In a convolutional layer, a convolution (denoted by ⊛) is performed between an input image I and a learnable filter (kernel) K to produce a feature map [24]. The kernel size is a hyperparameter, and the kernel weights are learnable parameters. Mathematically, the convolution operation is defined as

(I ⊛ K)(i, j) = Σ_m Σ_n I(i + m, j + n) K(m, n).

This operation acts as a feature extractor. Transposed convolutional layers reverse this process to upsample feature maps. An illustration of convolution and transposed convolution is provided in Figure 2.3.

Figure 2.3: Illustration of convolution and transposed convolution. The input map (left-most) is convolved with a kernel to produce a feature map. The feature map is then transposed-convolved with a kernel to produce an upsampled feature map (right-most).

2.6.2 Residual Connections

Residual connections help train neural networks by allowing gradients to flow more directly [24]. In a residual block, the input is added to the output of a series of layers. This can be written as

y = F(x) + x,

where F(x) represents the layers in the network (typically convolutions, activation functions, etc.) applied to x. This simple mechanism helps reduce the vanishing gradient problem [24]. A schematic of a residual block is shown in Figure 2.4.

Figure 2.4: Schematic representation of a residual block. The input is added via a skip connection to the output of a series of layers.

2.6.3 Attention Mechanisms

Attention mechanisms allow neural networks to focus on the most relevant parts of the input data. In scaled dot-product attention, given queries Q, keys K, and values V, the attention operation is defined as [26]

Attention(Q, K, V) = softmax( QK^T / √d_k ) V,

where d_k is the dimension of the key vectors. Multi-head attention extends this idea by computing multiple attention operations in parallel. This can be written as

MultiHead(Q, K, V) = Concat(head_1, . . .
, head_h) W^O,

with each head defined by

head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V).

In cross-attention, the queries come from one sequence while the keys and values come from another, allowing the model to relate information across different inputs.

2.7 Conditional Variational Autoencoder

A variational autoencoder (VAE) is a generative neural network that learns a probabilistic latent representation of the input pattern P [27]. In this context, the encoder is a distribution qϕ(z | P) that maps each pattern P to a latent variable z, and the decoder is a conditional distribution pθ(P | z) that attempts to reconstruct P given z. Typically, a prior p(z) (often chosen to be a standard multivariate Gaussian) is imposed on the latent space Z to regularize the learned representations. To train such a model, one minimizes the negative of the evidence lower bound (ELBO). Denoting this generator loss by L_G, the objective can be written as

L_G(θ, ϕ; P) = −E_{qϕ(z|P)}[ log pθ(P | z) ] + D_KL( qϕ(z | P) ∥ p(z) ),  (2.15)

where θ and ϕ denote the learnable parameters of the decoder and encoder neural networks, respectively, and D_KL(·∥·) is the Kullback-Leibler divergence, a measure of discrepancy between two probability distributions. The first term in (2.15) penalizes reconstruction errors, while the second term ensures that the approximate posterior qϕ(z | P) remains close to the prior p(z). A conditional variational autoencoder (cVAE) extends this idea by incorporating additional information S (for example, class labels or other attributes) into both the encoder and decoder [28]. In our case, S will represent the simulated S-parameters that characterize the electromagnetic behavior of P. Accordingly, the encoder and decoder become qϕ(z | P, S) and pθ(P | z, S), respectively. The generator loss becomes

L_G(θ, ϕ; P, S) = −E_{qϕ(z|P,S)}[ log pθ(P | z, S) ] + D_KL( qϕ(z | P, S) ∥ p(z | S) ).
(2.16)

In many practical scenarios, p(z | S) is kept as the same standard Gaussian used for p(z), thus simplifying the prior to be independent of S. To enable gradient-based optimization despite the sampling of z from the encoder, the reparameterization trick is employed. That is, rather than directly sampling z ∼ qϕ(z | P, S), the encoder produces a mean µ(P, S) and standard deviation σ(P, S), and the latent variable is obtained via

z = µ(P, S) + σ(P, S) ⊙ ϵ,  ϵ ∼ N(0, I).

This reformulation makes the sampling operation differentiable with respect to µ and σ, allowing backpropagation to update the encoder parameters ϕ. Figure 2.5 schematically illustrates a cVAE architecture, showing how the condition S is introduced in both the encoder and the decoder. Through this conditioning, the model can learn a structured latent space that reflects the desired attributes encoded in S, thereby enabling the generation of patterns P from the pattern space 𝒫 under specified conditions.

Figure 2.5: Illustration of the cVAE architecture. The encoder maps the pattern P and condition S to the latent distribution parameters, from which z is sampled via the reparameterization trick. The decoder then reconstructs P conditioned on z and S.

2.8 Theoretical Insights and Constraints

The inverse problem of predicting a pattern P ∈ 𝒫 from desired S-parameters S ∈ 𝒮 (i.e., the magnitudes |S11(f)| and |S21(f)| over a frequency interval) is inherently nonlinear and non-injective. That is, while the forward mapping F: 𝒫 → 𝒮 assigns a unique S to each P, its inverse F⁻¹: 𝒮 → 𝒫 is one-to-many; many distinct patterns can yield the same S. Hence, when a generative model produces a candidate pattern P̂, it cannot be directly compared to a unique "true" P; instead, the quality of P̂ is assessed by evaluating the corresponding (simulated) S-parameters against the desired ones.
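The reparameterization trick and the Gaussian KL term of Section 2.7 are short to write down explicitly. A minimal NumPy sketch, assuming a diagonal-Gaussian encoder output and a standard-normal prior:

```python
import numpy as np

rng = np.random.default_rng(42)

def reparameterize(mu, sigma):
    """z = mu + sigma * eps, eps ~ N(0, I): sampling made differentiable in (mu, sigma)."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

def kl_to_standard_normal(mu, sigma):
    """Closed-form D_KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * np.sum(1.0 + np.log(sigma ** 2) - mu ** 2 - sigma ** 2)

mu, sigma = np.zeros(4), np.ones(4)
z = reparameterize(mu, sigma)
print(z.shape)  # the KL term vanishes when the posterior equals the prior
```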
3 Methods

In this chapter, we introduce the mathematical notation and general workflow, followed by detailed descriptions of the individual components of the project: data generation, the surrogate model, the generator model, and the optimization process.

3.1 Overall Workflow

Let St ∈ 𝒮 denote the target S-parameter magnitudes (i.e., |S11(f)| or |S21(f)| over a frequency interval). The cVAE (generator) N_G: 𝒮 × 𝒵 → 𝒫 maps the target St and a random noise vector z ∈ 𝒵 to a candidate pattern:

P = N_G(St, z).

The surrogate model N_S: 𝒫 → 𝒮 then predicts the S-parameters corresponding to P:

Ŝ = N_S(P) = N_S( N_G(St, z) ).

We define a loss function for the optimization process, L_opt(St, Ŝ), which will be specified in detail later. The objective is to obtain an optimal pattern P* by minimizing L_opt. In practice, since the generator N_G and surrogate N_S are fixed at this stage, we achieve this by finding the optimal noise vector z* such that

z* = arg min_{z∈𝒵} L_opt( St, N_S(N_G(St, z)) ) = arg min_{z∈𝒵} L_opt( St, Ŝ ),

and then obtaining the optimal pattern as P* = N_G(St, z*). We update the latent vector z via gradient descent:

z_{t+1} = z_t − η ∇_z L_opt( St, N_S(N_G(St, z_t)) ),

with a learning rate η > 0. The overall latent optimization process is summarized in Figure 3.1.

Figure 3.1: Visual schematic of the overall optimization workflow. The diagram shows how the generator N_G and surrogate N_S are connected within the optimization loop to produce the optimal pattern P*.

Throughout the project, only the reflection coefficient is considered. The rationale is that the structures involved are nearly lossless, so a desired behavior expressed in transmission can be converted to reflection through Equation (2.12). This is also why only one curve is displayed for St and Ŝ in Figure 3.1.
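The latent-space update above can be sketched end to end with toy stand-ins for N_G and N_S; here both are simple differentiable maps with hypothetical dimensions, purely for illustration, and the gradient with respect to z is estimated numerically (the thesis uses automatic differentiation instead):

```python
import numpy as np

rng = np.random.default_rng(0)
DZ, DP, DS = 8, 16, 5                     # hypothetical latent, pattern, S-curve sizes
G = rng.standard_normal((DP, DZ)) * 0.3   # stand-in generator N_G (condition omitted)
S = rng.standard_normal((DS, DP)) * 0.3   # stand-in surrogate N_S

def predict(z):
    return S @ np.tanh(G @ z)

S_t = predict(rng.standard_normal(DZ))    # a reachable target curve

def loss(z):
    return np.mean((S_t - predict(z)) ** 2)

def num_grad(f, z, h=1e-5):
    g = np.zeros_like(z)
    for i in range(z.size):
        e = np.zeros_like(z); e[i] = h
        g[i] = (f(z + e) - f(z - e)) / (2 * h)
    return g

z, eta = rng.standard_normal(DZ), 0.1
start = loss(z)
for _ in range(200):
    z = z - eta * num_grad(loss, z)       # z_{t+1} = z_t - eta * grad_z L_opt
print(loss(z) < start)  # the latent vector moves toward the target response
```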
In the following sections, these steps are described: (1) the generation of the dataset {(P, S)}, (2) the training of N_S to predict S-parameters from a pattern, (3) the training of N_G to generate patterns conditioned on S-parameters, and (4) how these networks are combined in an optimization loop to produce the optimal pattern P*.

3.2 Dataset Creation

In order to train our machine learning models, we need a dataset of reasonable FSS unit cell patterns and their corresponding frequency responses, (P, S). Due to time limitations, we could only generate and simulate 10,000 patterns.

3.2.1 Pattern Generation

To represent a pattern as an image, we encode metal as 1 and dielectric as 0, so P is a binary image. To balance computational resources, efficiency, and practicality, the image size is chosen to be 128×128 pixels. However, the search space is only a binary 64×64 image, because only the first quadrant needs to be decided: the full 128×128 pattern is created by mirroring this quadrant around the x and y axes (double mirroring, see Figure 3.2). The complete 128×128 pattern is what is simulated later to obtain the S-parameters, while only the first quadrant is used in the machine learning models, since one quadrant uniquely maps to the full pattern. This reduces the image size by a factor of four. Therefore, when the generative ML model predicts a pattern, it essentially predicts the first quadrant, which is then mirrored to obtain the complete unit cell pattern. The rationale behind this approach is to create a relatively small search space for the ML models (64×64 pixels, corresponding to a search space of 2^(64×64)), while using a larger image for simulation to achieve a higher resolution and a less rasterized image, enabling smoother modeling of round shapes.
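The double mirroring is a one-liner per axis in NumPy. A minimal sketch, assuming the stored quadrant is the top-left one:

```python
import numpy as np

def mirror_quadrant(q):
    """Expand a binary HxW quadrant into the full 2Hx2W unit cell by double mirroring."""
    top = np.concatenate([q, np.fliplr(q)], axis=1)       # mirror around the y axis
    return np.concatenate([top, np.flipud(top)], axis=0)  # mirror around the x axis

quadrant = (np.random.default_rng(0).random((64, 64)) > 0.5).astype(np.uint8)
full = mirror_quadrant(quadrant)
print(full.shape)  # (128, 128)
```

By construction the result is symmetric under both reflections, so the quadrant indeed uniquely determines the full pattern.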
Figure 3.2: Illustration of full unit cell pattern creation through double mirroring.

Generating 10,000 unique patterns manually is not feasible. To address this, we drew 100 unique, realistic, research-inspired patterns and then applied various algorithms to mutate them into 10,000 patterns. These 100 initial patterns were inspired by several research articles and books to mimic common designs used in FSS. An arbitrary sample of these patterns is shown in Figure 3.3.

Figure 3.3: Some of the handmade initial patterns.

With the initial 100 patterns in hand, we developed the following algorithms to expand the dataset: operations on single patterns (inversion, 90° rotation, and morphological operations such as dilation, erosion, closing, and opening) and operations on pairs of patterns (OR, AND, XOR). The idea behind the inversion and 90° rotation operations is that if a pattern is realistic, its inverted and rotated versions are likely to be realistic as well. Morphological operations were used to teach the ML model how small changes in patterns (for example, a cross with different thicknesses) affect the S-parameters. The binary operations helped increase diversity by creating entirely new patterns. The outcomes of these different algorithms are visualized in Figure 3.4.

Figure 3.4: Different algorithms used to expand the initial 100 patterns to 10,000.

Inversion followed by rotation was applied to all initial patterns. Then, morphological operations were applied to randomly selected images, and binary operations were applied to randomly selected pairs of images. Additionally, to maximize diversity, a random sequence of operations (e.g., dilation followed by opening applied to an already XORed image) was performed. These random mutations continued until we had 10,000 unique patterns, with duplicates removed. The result is 10,000 binary images representing different unit cell patterns.
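The single-pattern and pairwise operations map directly onto NumPy array operations. A minimal sketch of a few of them; dilation is emulated here with wrap-around shifted ORs rather than a proper morphology library, which is a simplification of the thesis pipeline:

```python
import numpy as np

def invert(p):   return 1 - p
def rotate90(p): return np.rot90(p)
def xor(p, q):   return np.logical_xor(p, q).astype(np.uint8)

def dilate(p):
    """Binary dilation with a 3x3 square structuring element, via shifted ORs.

    Note: np.roll wraps around the image edges; a real pipeline would pad instead.
    """
    out = p.astype(bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= np.roll(np.roll(p.astype(bool), dy, axis=0), dx, axis=1)
    return out.astype(np.uint8)

rng = np.random.default_rng(0)
p = (rng.random((64, 64)) > 0.7).astype(np.uint8)
q = rotate90(invert(p))        # chained single-pattern mutations
child = xor(dilate(p), q)      # pairwise mutation yields an entirely new pattern
print(child.shape, child.dtype)
```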
3.2.2 Simulation

With the complete set of patterns ready, we simulated each one to obtain the corresponding S-parameters. For this, we used Ansys Electronics 2024 R2.1, a high-frequency structure simulator (HFSS), over the frequency interval [2, 8] GHz in steps of 0.1 GHz, which covers widely used bands (the S- and C-bands [29]). First, we created a fixed setup: a vacuum network with a dielectric plate in the middle (see Figure 3.5A). The ports were set as Floquet ports to idealize the infinite duplication of a unit cell pattern, and excitations were applied to the walls. The dielectric used was Rogers RT/duroid 5880, a common industry material with a low dielectric constant (εr = 2.2) and a low dissipation factor (tan δ = 0.0009). This is the plate on which the metal (PEC) is printed. The dielectric thickness was set to 1.575 mm, a standard value [30], and the unit cell dimensions were 15×15 mm. This size was chosen based on the rule of thumb of keeping the side length smaller than half the wavelength at the highest frequency [31] (for 8 GHz, λ/2 = c/2f ≈ 18.75 mm). Given the plate dimensions of 15×15 mm and the image resolution of 128×128 pixels, each pixel corresponds to approximately 15/128 ≈ 0.12 mm.

Figure 3.5: Overview of the Ansys HFSS setup. A: A vacuum network with only a dielectric plate in the middle. B: A quadrant of a pattern loaded from a file. C: The quadrant in B mirror-duplicated around the y and x axes to form the complete pattern.

To automate the process, a script was written using Ansys HFSS's automation framework. The script reads a pattern, recreates its first quadrant in the program (see Figure 3.5B), duplicates and mirrors it around the y and x axes (see Figure 3.5C), runs the simulation, saves the |S11(f)| and |S21(f)| data (in steps of 0.1 GHz) to a CSV file, clears the scene, and repeats the process for all patterns.
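The sizing rule of thumb is easy to verify numerically; a small sketch of the half-wavelength bound and the resulting pixel pitch:

```python
C0 = 299_792_458.0  # speed of light in vacuum (m/s)

f_max = 8e9                       # highest simulated frequency (Hz)
half_wavelength_mm = C0 / (2 * f_max) * 1e3
pixel_pitch_mm = 15.0 / 128       # 15 mm unit cell across 128 pixels

print(f"lambda/2 at 8 GHz ~ {half_wavelength_mm:.2f} mm")  # above the 15 mm cell side
print(f"pixel pitch ~ {pixel_pitch_mm:.3f} mm")
```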
The simulation time varied with the complexity of each pattern, averaging about 4 minutes per pattern. Running simulations on two machines in parallel, the entire process took roughly 2 weeks of simulation time. As a result, we obtained a dataset of 10,000 (P, S) samples.

3.3 Surrogate Model

The surrogate model predicts the S-parameters for a given pattern P, denoted N_S(P) = Ŝ. We train this model using the complete dataset of 10,000 samples, with a train/validation/test split of 0.8/0.1/0.1. The patterns are binary, hence no normalization is needed; the S-parameters, however, are normalized. Training and validation use the Huber loss

L_S = { 0.5 (S − Ŝ)² / δ,  if |S − Ŝ| < δ;  |S − Ŝ| − 0.5 δ,  otherwise },  (3.1)

where δ = 0.2. The rationale behind the choice of δ and the loss function is to penalize larger errors via an L1 term, since the peaks in the S-parameters are the most important features to predict correctly. The test data are evaluated with the mean absolute error (MAE), |S_i − Ŝ_i| averaged over all samples and frequency indices, as it is more interpretable. For the test data, several additional benchmarks are performed, such as the distribution of counts versus absolute error and the absolute error versus frequency.
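Equation (3.1) matches the smooth-L1 form of the Huber loss (quadratic scaled by 1/δ inside the threshold, linear outside). A minimal NumPy sketch, with δ = 0.2 as in the thesis:

```python
import numpy as np

def huber(s_true, s_pred, delta=0.2):
    """Elementwise Huber/smooth-L1 loss of Eq. (3.1), averaged over all entries."""
    err = np.abs(s_true - s_pred)
    quad = 0.5 * err ** 2 / delta   # used where |error| < delta
    lin = err - 0.5 * delta         # used otherwise
    return np.mean(np.where(err < delta, quad, lin))

# The two branches join continuously at |error| = delta (both give 0.5 * delta).
print(huber(np.array([0.0]), np.array([0.1])))  # quadratic branch: 0.5*0.01/0.2 = 0.025
print(huber(np.array([0.0]), np.array([1.0])))  # linear branch: 1.0 - 0.1 = 0.9
```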
The validation data are used both to save the model with the lowest validation loss over the training epochs and to fine-tune the learning rate, batch size, weight decay, and architectural hyperparameters (dropout rate and number of filters in the convolutional residual blocks) over the following search space:

• Learning rates: {0.00005, 0.0005, 0.0001}
• Weight decays: {0.00005, 0.0005, 0.0001}
• Batch sizes: {16, 32, 64}
• Dropout rates: {0, 0.2, 0.4, 0.6}
• Filter configurations: with baseline filters (32, 64, 128), the search space is defined as {(32α, 64α, 128α) | α ∈ {0.5, 1.0, 2.0}}

Since the total number of hyperparameter combinations is 324, only 150 randomly selected combinations are used for fine-tuning, trained for 50 epochs per configuration, as evaluating all combinations would require a very long time. The Adam optimizer is used. The final training with the optimal hyperparameters runs for 200 epochs; as mentioned earlier, the best model over these epochs is saved as the final model. The architecture of this model is a straightforward convolutional-MLP: the input is a single-channel (binary pattern) 64×64 px image, hence the convolutional layers at the beginning, and the prediction consists of the S-values, hence the MLP at the end. The architecture also includes residual connections to mitigate the risk of vanishing gradients. The architecture is illustrated in Figure 3.6.

Figure 3.6: Neural network architecture of the surrogate model N_S, with exemplified input P and output Ŝ.

3.3.1 Dataset Size Analysis

We have a total of 10,000 samples. Of these, 2,000 samples are reserved for validation and remain fixed throughout all experiments. The remaining 8,000 samples are used for training, where we experiment with different training set sizes: 1,000, 2,000, 3,000, . . . , and 8,000 samples.
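Sampling 150 of the 324 grid combinations can be sketched with the standard library alone; `train_and_validate` would be the (here omitted) training routine, and all names are illustrative:

```python
import itertools
import random

SEARCH_SPACE = {
    "lr": [0.00005, 0.0005, 0.0001],
    "weight_decay": [0.00005, 0.0005, 0.0001],
    "batch_size": [16, 32, 64],
    "dropout": [0, 0.2, 0.4, 0.6],
    "filter_scale": [0.5, 1.0, 2.0],
}

# Full Cartesian grid: 3 * 3 * 3 * 4 * 3 = 324 configurations.
grid = [dict(zip(SEARCH_SPACE, combo))
        for combo in itertools.product(*SEARCH_SPACE.values())]

random.seed(0)
trials = random.sample(grid, k=150)  # random subset, one 50-epoch run each
print(len(grid), len(trials))
```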
For each training set size, the model is trained for 100 epochs using fixed hyperparameters. The performance measure, denoted L^(min)_{S,val}, is defined as the minimum validation loss observed over the 100 epochs. To ensure the robustness of the results, the entire experiment is repeated 5 times with different random selections of the training samples, and the final reported performance for each training set size is obtained by averaging the results from these 5 runs. This analysis provides insight into how the model's performance scales with the size of the training dataset and helps determine whether our dataset of 10,000 samples is sufficient.

3.4 Generator Model: cVAE

The goal of the generator model is to generate a pattern P, given the target scattering parameters St as a condition and a random sample z from the latent space. Ideally, the generated pattern actually yields the conditioned frequency response when simulated. To verify that it does, the pattern is later fed into the surrogate model for evaluation; this process is described in the next section. The generator model, N_G(St, z) = P, works as a component in the optimization loop defined later, where its main task is to provide P given St and the latent vector, the parameter that is optimized in order to yield the optimal pattern P*. This model is trained using the complete dataset of 10,000 samples with a train/validation split of 0.8/0.2. As in the data preprocessing for the surrogate model, the patterns are not normalized, since they are already binary, while the S-parameters are normalized.
Training and validation use the VAE loss

L_G = − Σ_{i=1}^n [ P_i log P̂_i + (1 − P_i) log(1 − P̂_i) ]  −  (1/2) Σ_{j=1}^{D_L} [ 1 + log σ_j² − µ_j² − σ_j² ],

where the first term is the binary cross-entropy and the second the KL divergence. Here, P_i denotes the i-th pixel of the ground-truth pattern P ∈ {0, 1}^n, P̂_i is the predicted probability of that pixel being 1, and µ_j, σ_j are the parameters of the latent distribution for the j-th dimension of the latent space, which has total dimension D_L (a hyperparameter). The validation data are used both to save the model with the lowest validation loss over the training epochs and to fine-tune the learning rate, batch size, latent dimension, and number of heads in the multi-head self-attention over the following search space:

• Learning rates: {0.00001, 0.0001, 0.0005}
• Batch sizes: {32, 64}
• Latent dimensions: {256, 512, 1024}
• Numbers of heads: {2, 4, 6}

The total number of hyperparameter combinations is 54, and all combinations are evaluated during fine-tuning, with 70 epochs per hyperparameter configuration. The Adam optimizer is used. The final training with the optimal hyperparameters runs for 200 epochs. The architecture of this model is a conditional variational autoencoder. The encoder consists of four convolutional layers of kernel size 4×4, each followed by the ReLU activation function, and includes a multi-head attention (MHA) module at the end. The decoder is the reverse of the encoder, consisting of transposed convolutional layers, except that it does not contain an MHA, and its final activation function is a Sigmoid, bounding the pixel value predictions to [0, 1]. Directly after prediction, a binarization is applied to the image; this is not illustrated in the figure below and is used only during inference (not during training). The architecture is illustrated in Figure 3.7.
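The generator loss is the pixelwise binary cross-entropy plus the Gaussian KL term. A minimal NumPy sketch of this loss for one sample, using the notation above and a toy 16-pixel "pattern":

```python
import numpy as np

def cvae_loss(P, P_hat, mu, sigma, eps=1e-12):
    """Binary cross-entropy over pixels plus KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    P_hat = np.clip(P_hat, eps, 1 - eps)  # guard the logarithms
    bce = -np.sum(P * np.log(P_hat) + (1 - P) * np.log(1 - P_hat))
    kl = -0.5 * np.sum(1 + np.log(sigma ** 2) - mu ** 2 - sigma ** 2)
    return bce + kl

rng = np.random.default_rng(0)
P = (rng.random(16) > 0.5).astype(float)  # toy ground-truth binary pattern
loss_perfect = cvae_loss(P, P, np.zeros(4), np.ones(4))
loss_noisy = cvae_loss(P, np.full(16, 0.5), np.zeros(4), np.ones(4))
print(loss_perfect < loss_noisy)  # a confident, correct reconstruction scores lower
```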
When the cVAE is trained, only the decoder is extracted and used in the optimization loop in the next step.

Figure 3.7: Neural network architecture of the generator model N_G (cVAE), with exemplified input St and output P̂.

3.5 Optimization

The optimization process is the final step in our workflow, aiming to generate an optimal pattern P* whose frequency response closely matches the desired target St. The workflow is illustrated in Figure 3.1. To achieve this, we iteratively adjust the latent variable z used by the generator model N_G until the surrogate model N_S predicts an S-parameter response Ŝ that is close enough to St. Our objective is formulated as minimizing the loss function L_opt(St, Ŝ), defined as the MSE between St and Ŝ plus a ReLU penalty term applied at the frequency index corresponding to the valley (resonant frequency) of the target response (denoted v):

L_opt(St, Ŝ) = (1/N) Σ_{i=1}^N (S_{t,i} − Ŝ_i)² + max(0, Ŝ_v − S_{t,v}),

where N is the number of frequency points. The loss L_opt is designed to capture both the overall deviation between the target and predicted S-parameters (through the MSE term) and the accuracy at the valley (resonant) frequency v (via the ReLU penalty term). Since both the generator and surrogate models are fixed during the optimization, the process is performed solely over the latent variable z. In other words, we search for

z* = arg min_{z∈𝒵} L_opt( St, N_S(N_G(St, z)) ),

and then obtain the optimal pattern via P* = N_G(St, z*). The optimization is carried out using gradient descent with the Adam optimizer. Starting from an initial random latent vector z_0, we update the latent variable iteratively:

z_{t+1} = z_t − η ∇_z L_opt( St, N_S(N_G(St, z_t)) ),

where η > 0 is the learning rate. The use of the Adam optimizer ensures robust convergence in the high-dimensional latent space [25].
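The objective L_opt can be written compactly. A minimal NumPy sketch, taking the valley index as the minimum of the target curve and using toy five-point curves for illustration:

```python
import numpy as np

def l_opt(s_target, s_pred):
    """MSE over all frequency points plus a ReLU penalty at the target's valley index."""
    v = int(np.argmin(s_target))                  # valley (resonant-frequency) index
    mse = np.mean((s_target - s_pred) ** 2)
    penalty = max(0.0, s_pred[v] - s_target[v])   # penalizes a too-shallow valley
    return mse + penalty

s_t = np.array([0.9, 0.7, 0.1, 0.7, 0.9])      # toy target with a valley at index 2
shallow = np.array([0.9, 0.7, 0.5, 0.7, 0.9])  # matches everywhere but the valley depth
exact = s_t.copy()                             # perfect match
print(l_opt(s_t, exact), l_opt(s_t, shallow))  # 0.0 for the perfect match
```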
When the optimization is complete and an optimal pattern P* is obtained, it is modeled and simulated in Ansys HFSS to obtain the true S, which is compared with the target St. This benchmarking is performed on 10 different St, shown as blue curves in Appendix A.1.

4 Results

4.1 Surrogate Model

The hyperparameter search yielded the following optimal hyperparameters: learning rate = 5·10⁻⁴, weight decay = 5·10⁻⁵, batch size = 32, dropout = 0.4, and filter scale = 2. The train and validation losses for the optimal and worst hyperparameter configurations are shown in Figure 4.1, highlighting the need for hyperparameter fine-tuning. The worst hyperparameter configuration was: learning rate = 10⁻⁴, weight decay = 5·10⁻⁴, batch size = 16, dropout = 0.6, and filter scale = 1.

Figure 4.1: Results from hyperparameter fine-tuning of the surrogate model. Train and validation loss over 50 epochs for the best and worst hyperparameter configurations.

Using the optimal hyperparameters, the surrogate model was trained over 200 epochs, and the model with the lowest validation loss over these epochs was saved as the final model. As seen in Figure 4.2, the final model corresponds to epoch 63. The train and validation losses, depicted in Figure 4.2, show a steady decrease that later stagnates.

Figure 4.2: Train and validation loss of the surrogate model using the optimal hyperparameters. The marker x indicates the minimum of the validation loss curve, i.e., the epoch at which the final model is saved (epoch 63).

4.1.1 Benchmarking

The trained model was evaluated using all 1,000 test samples (10% of the total dataset of 10,000). Note that these data were never exposed to the model during training and are therefore equivalent to completely new patterns. In Figure 4.3, the count of frequency indices versus the absolute error is plotted as a histogram.
The counts represent numbers of frequency indices; since every sample covers the frequency range from 2 to 8 GHz and the data were simulated in steps of 0.1 GHz, there are 61 indices per sample. The distribution exhibits a very steep negative-exponential shape, which is desirable, as it shows that most indices have low errors. However, it should also be noted that the absolute-error axis extends up to 40 dB, which is a very large error; only a few indices exhibit such high errors. In Figure 4.4, the mean absolute error (in dB) over the frequency interval from 2 to 8 GHz is plotted; here, the mean refers to the average error at a specific index across all test samples. The curve shows overall low average errors, ranging from 0.5 to 1.9 dB, with a clear trend of higher frequencies (>5 GHz) exhibiting larger errors than lower frequencies (<5 GHz).

Figure 4.3: Count versus absolute error (in dB), illustrating that most indices exhibit very low errors, with a sharp decline in count as the absolute error increases.

Figure 4.4: Mean absolute error over the test samples across the frequency interval, showing generally low errors with a trend of increasing error at higher frequencies.

4.1.2 Show Cases

Figure 4.5 displays three arbitrarily selected samples from the test data, allowing a visual inspection of the surrogate model predictions alongside the ground truth. The samples on the left and in the middle exhibit very different characteristics, yet the predictions are nearly perfect. In the sample on the right, the prediction is accurate at the beginning and end, but the valley is missed.

Figure 4.5: Three different samples from the test data showing the prediction (blue) and the ground truth (green).

4.1.3 Dataset Size Analysis

The dataset size analysis using the performance metric L^(min)_{S,val} yielded the results shown in Figure 4.6.
Figure 4.6: Validation loss L^(min)_{S,val} on a validation set of 2,000 samples versus training data size, ranging from 1,000 to 8,000 samples.

Figure 4.6 clearly indicates that the surrogate model improves with increased training data. The performance metric shows a rapid initial decline that gradually levels off; however, it has not completely plateaued at the largest training size, suggesting that further increases in the dataset would yield noticeable improvements. The significant reduction in the standard deviation as the data increase implies that repeated runs are more likely to yield similar results, indicating higher reliability of the trained model.

4.2 Generator Model

The hyperparameter search yielded the following optimal hyperparameters: learning rate = 10⁻⁴, batch size = 32, latent dimension = 512, and number of heads = 8. The train and validation losses for the optimal and worst hyperparameter configurations are shown in Figure 4.7, highlighting the need for hyperparameter fine-tuning. The worst hyperparameter configuration was: learning rate = 10⁻⁵, batch size = 64, latent dimension = 1024, and number of heads = 8.

Figure 4.7: Results from hyperparameter fine-tuning of the generator model (cVAE). Train and validation loss over 70 epochs for the best and worst hyperparameter configurations.

Using the optimal hyperparameters, the generator model was trained over 200 epochs, and the model with the lowest validation loss over these epochs was saved as the final model. As seen in Figure 4.8, the final model corresponds to epoch 107. The train and validation losses, depicted in Figure 4.8, show a steady decrease that later plateaus.

Figure 4.8: Train and validation loss of the generator model using the optimal hyperparameters. The marker x indicates the minimum of the validation loss curve, i.e., the epoch at which the final model is saved (epoch 107).
4.2.1 Show Cases

Using the trained cVAE, inference was performed with a random latent vector and arbitrarily chosen conditions from the simulated data. This demonstration shows that the model can generate novel, unique patterns beyond those contained in the dataset. A set of outputs is presented in Figure 4.9. Note that this is neither a benchmark nor an optimal design for any desired electromagnetic behavior, but rather a showcase of the outputs.

Figure 4.9: A set of six arbitrary outputs from the generator model, illustrating its ability to generate novel FSS unit cell patterns.

4.3 Optimization

The optimization was performed using 10 different target S-curves, St (the blue curves in Appendix A.1). The results are presented in Appendix A.1, where three S-curves are shown for each sample: the target S-curve (blue), the predicted S-curve for the optimized pattern (orange), and the true S-curve (green) obtained by simulating the optimized pattern in Ansys HFSS. Appendix A.2 displays the 10 optimized patterns. Note that the optimization outputs are binary and represent only the top-left quadrant of a complete pattern. In Appendix A.2, the quadrant patterns have been manually colored in gold and blue to represent the metal and dielectric surfaces, respectively, and then expanded to illustrate the complete pattern.

From Appendix A.1, we clearly see that the optimization performs well on samples 1, 3, 5, 6, and 9. On samples 2, 4, 7, 8, and 10, the model performs poorly; however, a valley is still observed in the simulated S-curve, although it is shifted by up to 1.3 GHz from the desired valley. In Figures 4.10 and 4.11 we show two of the good samples, sample 5 with (∆f = 0.1 GHz, ∆Mag. = 0.36 dB) and sample 6 with (∆f = 0.2 GHz, ∆Mag. = 6.4 dB). In Figures 4.12 and 4.13 we show two of the poor samples, sample 7 with (∆f = 0.5 GHz, ∆Mag. = 4.6 dB) and sample 8 with (∆f = 0.4 GHz, ∆Mag. = 3.0 dB).
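The latent-space optimization loop itself can be sketched as follows. The frozen cVAE decoder and surrogate network are replaced here by random linear maps, purely illustrative stand-ins, so that the gradient of the squared S-curve deviation has a closed form and the loop is self-contained; the learning rate and step count are likewise illustrative:

```python
import numpy as np

# Sketch of gradient-based latent-space optimization. The real framework
# backpropagates through the frozen cVAE decoder and the frozen surrogate;
# here both are stand-in linear maps so the example runs on its own.
rng = np.random.default_rng(1)
latent_dim, n_pix, n_freq = 512, 128 * 128, 61        # 61 points on 2-8 GHz

D = rng.normal(scale=0.01, size=(n_pix, latent_dim))  # "decoder": z -> pattern
G = rng.normal(scale=0.01, size=(n_freq, n_pix))      # "surrogate": pattern -> S
M = G @ D                                             # frozen pipeline composed

s_target = M @ rng.normal(size=latent_dim)  # a reachable target S-curve S_t
z = rng.normal(size=latent_dim)             # random initial latent vector
lr = 1.0

init_loss = float(np.sum((M @ z - s_target) ** 2))
for _ in range(200):
    residual = M @ z - s_target             # deviation from the target response
    z -= lr * 2 * M.T @ residual            # step along -grad of ||S(z) - S_t||^2
final_loss = float(np.sum((M @ z - s_target) ** 2))
```

The optimized latent vector is then decoded, binarized, and mirrored into a full pattern before simulation, as described above.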
These values are measured between St and S at their respective minima.

Figure 4.10: Sample 5 - target, predicted, and simulated S-curves.

Figure 4.11: Sample 6 - target, predicted, and simulated S-curves.

Figure 4.12: Sample 7 - target, predicted, and simulated S-curves.

Figure 4.13: Sample 8 - target, predicted, and simulated S-curves.

5 Conclusion

In this project, an end-to-end framework for the inverse design of single-layer FSS has been developed. The framework consists of a pattern generator (the decoder part of the cVAE), a surrogate model, and gradient-based latent-space optimization. Although it is not yet a complete product, it can already be used in early design phases to inspire designers with candidate patterns. We have demonstrated a proof of concept confirming that generative ML is a promising path toward solving the inverse problem of FSS design. The framework significantly reduces time and computational resources by replacing a trial-and-error workflow, and, broadly speaking, it lowers the entry barrier for engineers working on FSS design. Furthermore, the entire methodology is transferable to other inverse design problems, such as designing metasurfaces in mechanics.

5.1 Summary of Research Findings

This thesis demonstrates that generative machine learning can effectively perform the inverse design of FSS unit cell patterns. A dataset of 10,000 simulated (P, S) samples enabled robust training, and the surrogate model achieved mean absolute errors between 0.5 and 1.9 dB. The framework yielded good results for half of the benchmarking samples and poorer results for the other half, demonstrating its potential. By optimizing in the latent space, the method produces designs whose S-parameter curves closely match the desired responses, achieving resonant frequency deviations as low as 0.0-0.2 GHz in half of the benchmarking cases. As shown in Figure 4.6, increasing the dataset size further improves the surrogate model's performance.
5.2 Recommendations for Future Work

5.2.1 Improving the Performance

The first step forward for this application is to improve the model's performance so that it provides patterns whose EM behavior is closer to the desired one. This can be achieved by simulating more relevant data (patterns that have at least one clear valley in the S-curve); see Figure 4.6. The dataset in this thesis consisted of only 10,000 samples (with only 8,000 used for training), which is a very small dataset by machine learning standards. Another measure is to experiment with other neural network architectures, especially for the surrogate model; candidates include well-known architectures such as Xception, VGG16, VGG19, and ResNet50.

5.2.2 Inclusion of Incident Angles and Polarization

In this work, only the boresight condition (0 degrees incident angle) was analyzed in the simulation of S-parameters. Extending the study to other incident angles, such as 60 degrees, would allow a more comprehensive understanding of the FSS behavior. Additionally, the second polarization could be included, as the current study only examines the first. Such a study will require the creation and analysis of a more extensive dataset to accurately capture these variations.

5.2.3 Integration of Multi-Layered FSS

This research has been limited to single-layer FSS in order to manage the complexity of the design process. Future work should consider the extension to multi-layered FSS designs, which offer the potential to achieve more complex frequency responses through coupling effects. Incorporating multi-layered FSS will require a much more extensive dataset.

5.2.4 Material and Dimension Parameterization

In this study, the material choice and the thickness of the dielectric substrate were kept fixed to streamline the design process. A promising direction for future research is to incorporate several common materials, both metals (e.g.
copper) and dielectrics (e.g. foam), along with the dielectric thickness, as adjustable parameters. Allowing the network to select the optimal combination of these parameters could lead to significant performance improvements, though this approach will demand a substantial increase in simulated data to cover the expanded search space.

5.2.5 Alternative Pattern Representation Techniques

The current approach uses a rasterized image (128×128 pixels) to represent FSS patterns, which may limit the resolution and the fidelity of geometrical details. Future studies could explore higher-resolution methods or alternative representation techniques, such as vector-based formats (e.g., SVG), to mitigate issues associated with rasterization. Such enhancements are expected to provide smoother and more precise pattern reconstructions.
A Appendix

A.1 Target, predicted, and simulated S-curves for benchmarking samples

Figure A.1: Sample 1 - S-curves
Figure A.2: Sample 2 - S-curves
Figure A.3: Sample 3 - S-curves
Figure A.4: Sample 4 - S-curves
Figure A.5: Sample 5 - S-curves
Figure A.6: Sample 6 - S-curves
Figure A.7: Sample 7 - S-curves
Figure A.8: Sample 8 - S-curves
Figure A.9: Sample 9 - S-curves
Figure A.10: Sample 10 - S-curves

A.2 Optimized unit cell patterns for benchmarking samples

Figure A.11: Sample 1 - optimized pattern
Figure A.12: Sample 2 - optimized pattern
Figure A.13: Sample 3 - optimized pattern
Figure A.14: Sample 4 - optimized pattern
Figure A.15: Sample 5 - optimized pattern
Figure A.16: Sample 6 - optimized pattern
Figure A.17: Sample 7 - optimized pattern
Figure A.18: Sample 8 - optimized pattern
Figure A.19: Sample 9 - optimized pattern
Figure A.20: Sample 10 - optimized pattern