DF

Delta Updates for Embedded Systems
An Implementation of Firmware Patching for Zephyr RTOS

Bachelor’s thesis in Computer Science and Engineering

LINNÉA LINDH

Department of Computer Science and Engineering
CHALMERS UNIVERSITY OF TECHNOLOGY / UNIVERSITY OF GOTHENBURG
Gothenburg, Sweden 2021


DEGREE PROJECT REPORT

Delta Updates for Embedded Systems
An Implementation of Firmware Patching for Zephyr RTOS

LINNÉA LINDH

DF

Department of Computer Science and Engineering
Chalmers University of Technology

University of gothenburg
Gothenburg, Sweden 2021


Delta Updates for Embedded Systems
An Implementation of Firmware Patching for Zephyr RTOS
LINNÉA LINDH

© LINNÉA LINDH, 2021.

Supervisors: Peter Malmberg, Endian Technologies AB; Jan Jonsson,
Department of Computer Science and Engineering
Examiner: Jonas Duregård, Department of Computer Science and Engineering

Department of Computer Science and Engineering
Chalmers University of Technology / Univerity of Gothenburg
SE-412 96 Gothenburg, Sweden
Telephone +46 31 772 1000

The Author grants to Chalmers University of Technology and University of Gothenburg
the non-exclusive right to publish the Work electronically and in a noncommercial purpose
make it accessible on the Internet.

The Author warrants that he/she is the author to the Work, and warrants that the Work
does not contain text, pictures or other material that violates copyright law. The Author
shall, when transferring the rights of the Work to a third party (for example a publisher or
a company), acknowledge the third party about this agreement. If the Author has signed
a copyright agreement with a third party regarding the Work, the Author warrants hereby
that he/she has obtained any necessary permission from this third party to let Chalmers
University of Technology and University of Gothenburg store the Work electronically and
make it accessible on the Internet.

Department of Computer Science and Engineering
Gothenburg, Sweden 2021

iv


Abstract
Many resource-constrained Internet of Things units are used in a manner which makes
them inaccessible via cable and at the same time unable to receive large amounts of
data using radio transmissions. Upgrading firmware is hence difficult or in some cases
impossible. Integrating support for delta updates in the unit is a potential solution to this
problem, as it significantly reduces the payload during an upgrade scenario. However, as
of now there is no open-source solution for delta upgrades in embedded systems. This
thesis aims to show that such a solution is achievable and that it causes a significant
payload reduction. This proof of concept is realized through the development of a Zephyr
application with delta upgrade support. Using this application firmware upgrades are
performed using only 2.6 percent of the amount of external data needed for standard
upgrades.

Keywords: Firmware Upgrades, Zephyr, Resource-constrained Systems, Patching, Delta
Encoding, Delta Updates, Data Differencing, Open-source.

v


Acknowledgements
This thesis would have been inconceivable without the sterling help I received from the
developers at Endian Technologies. Not only in the form of stimulating discussions and
competent feedback, but through presenting the ideas behind this work in the first place.
It has been tremendous fun, and I have learnt a great deal, but most importantly, I feel
like I got the opportunity to create something useful.

Thank you.

Linnéa Lindh, Gothenburg, February 2021

vii


Contents

Glossary xi

1 Introduction 1
1.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Problem Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2.1 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Theory 5
2.1 Differential Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.1.1 Differencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1.1 BSDiff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.1.2 Patching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 System Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2.1 Limited Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.2 Flash Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.3 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.1 Patch Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Source image location . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.3 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 Method 11
3.1 Algorithm Design Choices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.1.1 Selecting an Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1.2 Data Locations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1.3 Patch Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.2 Algorithm descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.1 Detools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.2 Heat-shrink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.3 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.1 Detools Porting Strategy . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.2 Demonstrator Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.3 Main Application Setup . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3.4 Testing Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.4 Development Environment and Tools . . . . . . . . . . . . . . . . . . . . . . 15

ix


Contents

3.4.1 Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4.2 Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

4 Results 19
4.1 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.2 Main Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Initialization Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.4 Callback Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.5 Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

5 Conclusion 23
5.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

5.1.1 Application Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.1.2 Test Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.1.3 Ethical and Environmental Perspectives . . . . . . . . . . . . . . . . 24

5.2 Implications and Consequences . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.3 Topics of Further Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

5.3.1 Effects of Compiler Optimizations . . . . . . . . . . . . . . . . . . . 25
5.3.2 Encryption Management . . . . . . . . . . . . . . . . . . . . . . . . . 25

5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Bibliography 27

x


Contents

Glossary
Term Definition
Binary Data represented using the binary number system. In

this project often referring to executable code.
Binary Differencing Differencing using two binaries. This is in contrast to

source code differencing, or other forms of differenc-
ing comparing text files, which is commonly used in
version control software.

Booting The process that starts the device operating system.
Block In C, a block refers to an area in the source code de-

limited by curly braces.
Callback Function A function which is passed as an argument to another

function, which the other function calls/executes at a
given time.

Delta Data describing the differences between the contents
of two pieces of data - the source and the target.

Data Differencing The process of generating a delta. Also known as dif-
ferencing.

Delta Updates Updates which create new firmware locally by patch-
ing old firmware.

Device Tree A data structure describing the hardware components
of a system. Allows for kernel usage and management.

Distribution server The server used for distributing the data needed for
firmware upgrades. For delta updates this data is the
delta.

Firmware Computer software used for low-level control of device
hardware.

Git A version management system for repositories. Ex-
amples of online repositories using Git include GitLab
and GitHub.

Image Signing A tool used for confirming that firmware has not been
corrupted or altered since it was signed. Uses crypto-
graphic hash to validate identity and integrity.

Internet of Things Items with embedded electronics and an internet con-
nection. Often abbreviated IoT.

Makefile A file mapping words to short scripts. Used for build
automation.

Metadata Data containing information about other data.
Partition A division of the secondary storage of a unit. In this

project, a labeled sector between two memory offsets
in the flash.

Patch File A file containing a delta.
Patching The process of creating the target using the source and

the delta.
Repository A storage location for data. Primarily source code.
Resource-constrained System An embedded system with limited hardware resources

available for completing jobs. For example, limited
memory.

xi


Contents

Term Definition
RTOS Real-time Operating System.
Source The old data that one wants to upgrade from. In this

project, the old firmware.
Space Complexity How much memory an algorithm requires.
Struct In C, a composite data type that stores a list of vari-

ables.
Target The new data that one wants to upgrade to. In this

project, the new firmware.
Target Device/Unit The device running the firmware we want to upgrade.
Time Complexity The amount of time it takes to run an algorithm.

hej

xii


1
Introduction

Many resource-constrained Internet of Things (IoT) units are used in a manner which
makes them inconvenient to access via physical cable. In these cases, firmware upgrades
are preferably transferred using Firmware Over-the-air, FOTA. FOTA technology makes
it possible to deliver firmware using radio technology[1]. However, using radio technology
comes with the drawback of it being rather expensive. Units receiving large amounts
of data via radio may accumulate high data costs and fast battery drain. The latter of
which is a consequence of the — relative to other IoT components — high energy needs
of communication devices[1]. Hence, many IoT units seldom or never perform firmware
upgrades.
A potential way to get around this problem is using so-called delta updates. These reduce
the data redundancy of distributing the upgraded system in its entirety, by replacing it
with a patch file containing a delta. Deltas contain information about the differences
between two pieces of data, the source and the target. This information can together
with the source data be used to create an exact copy of the target data. Delta updates
function by using the old firmware image as source data, and the desired firmware as target
data. Managing deltas does, however, introduce new complexity to the firmware upgrade
process.
On the distribution side, the server has to keep track of which version of the firmware
every target unit is currently running and be able to create deltas using this information.
On the target units side, the devices system has to be able to build bootable firmware
images using the data contained in these deltas. There are several open-source solutions
for version control and creating deltas, or differencing as this process is called, but none
freely available for applying deltas, or patching, in resource-constrained systems.
Developers at Endian Technologies AB saw a need to integrate delta updates into some
of their resource-constrained solutions, which gave rise to the objectives of this bachelor’s
thesis. Endian is a small engineering firm located in Mölndal. The firm specializes in
industrial IoT and manages everything from designing hardware, to developing firmware,
to creating cloud services. Endian has developed online solutions for embedded systems
for almost 15 years, with a broad spectrum of applications, such as the transport sector,
medical technology, and consumer electronics. The company specializes in open-source
solutions for embedded systems, which heavily influence the tools and systems used in
this theses as well.

1.1 Problem Statement

With the prospect of allowing developers to create products that previously have not
been viable, we hope to contribute to the development of a free implementation of delta-
updates for embedded systems. We aim to do this by (i) demonstrating that it is possible
to perform delta updates in a resource-constrained system and (ii) showing that this causes

1


1. Introduction

a meaningful reduction of data transmitted during a firmware upgrade.
More concretely, this will be achieved through the completion of the following objectives:

1. Choose a suitable algorithm for patch creation and application.
2. Tweak the patch application algorithm in order to make it compatible with the

hardware and the operating system (Zephyr).
3. Create a demonstrator which can be used to run the solution and test its validity.
4. Document the reduction of amounts of data transferred.
5. Suggest improvements to the solution and topics of further research.

Note that the aim of the project is to realize an idea in order to demonstrate its feasibility.
It is not to create a finished product or a generalized solution. Thus allowing for the
following demarcations to be made:

• the implementation will be a Zephyr application,
• the board will be the nRF52840 Development Kit (DK),
• data will be transferred via the boards USB interface,
• image encryption will be turned off.

Though creating a generalized solution would be out of the scope of this work, the nature
of embedded systems makes the problems present in one system likely to be common in
another. Ergo, the results of the report will in all likelihood be useful for implementations
of delta updates in many embedded systems, despite the demarcations made.

1.2 Problem Domain

In order to theorize about potential solutions, some prerequisite knowledge about the
target system is required. In this case the target system is a combination of the operating
system and the card on which it functions; Zephyr real-time operating system (RTOS)
and the nRF52840 DK.

1.2.1 Software

The Zephyr project is an open-source collaboration project hosted by Linux Foundation[2].
Zephyr RTOS is optimized for resource-constrained devices and is available across multiple
architectures, which makes it a good fit for this project.
Updates in Zephyr are done via the Device Firmware Upgrade subsystem[3]. It provides
the necessary frameworks to upgrade an application at run time. The subsystem consists
of two modules: the boot loader and image management code.
A boot loader is a piece of software used for starting the operating system, or main
application, of a unit[4]. It also enables device firmware upgrades, which makes it a focal
point of this thesis. Zephyr uses a boot loader called MCUboot.
For MCUboot to work a set of fixed flash partitions must be defined, so that the boot
loader knows where in the memory it ought to look to find certain data. The device
tree labels for these partitions are: boot_partition, the partition containing the the
MCUboot image; slot0_partition, the primary partition, containing the Zephyr image;
slot1_partition, the secondary partition, which stores firmware upgrade images and
therefore must be the same size as slot0_partition; and lastly, the scratch partition,
which is used as temporary storage while swapping the contents of slot0_partition and
slot1_partition[5].
The standard upgrade scenario is started by MCUboot[6] reading from the secondary
partition and noticing that it contains an image marked as "pending". This triggers the
process of physically swapping the contents of the two partitions with each other, and then

2


1. Introduction

trying to boot from the upgrade image located on the secondary partition. If the upgrade
does not boot, the boot loader swaps the images again and runs the old firmware, and if
it boots, the device naturally runs the new firmware.
By not discarding the old firmware, MCUBoot is able to test upgrades before committing
to running them. This is a process which makes upgrading firmware a lot safer. Moreover,
it also leads the device having the old firmware stored until something new is written over
it. Which could hypothetically be utilized for patching.
Before booting the new firmware, the MCUBoot performs an integrity check by reading
some data placed at the end of the image - the image trailer. The trailer contains a
cryptographic hash signature used for authentication and metadata about the image used
for ensuring compatibility. MCUboot supports several options for signing an image, but
in this work we will use a set of RSA keys.

1.2.2 Hardware

The system will run on the Nordic Semiconductor development kit nRF52840[7]. Key
features are:

• an on-board SEGGER J-Link debugger
• support for Bluetooth, NFC, Thread and Zigbee
• user programmable LEDs and buttons
• 1 MB flash memory divided into 256 pages of 4kB each
• NVMC - Non-volatile memory controller
• 256 kB RAM on chip
• 64 MB external flash memory
• USB interface
• 32-bit, 64MHz Arm® Cortex™-M4F

More details on the significance of these properties will be presented in the theory chapter
(chapter 2). A more detailed view of the device’s memory layout can be found in figure 1.1.
Zephyr has defined standard partitions for the nRF52840’s flash memory, the addresses
and labels of which can be found in table 1.1.

boot_partition 0x00000000 - 0x0000C000
slot0_partition 0x0000C000 - 0x00073000
slot1_partition 0x00073000 - 0x000DA000
scratch_partition 0x000DA000 - 0x000F8000
storage_partition 0x000F8000 - 0x000FFFFF

Table 1.1: Partition addresses[8].

1.3 Thesis Outline
The remainder of this report will be organised as follows. Chapter 2 introduces the the-
ory, which focuses on presenting the theoretical foundations needed to understand the
subsequent chapters. This includes the theory surrounding the subject of patches, and
the theory surrounding embedded systems, and the the potential compatibility issues that
may arise from combining these. Chapter 3 documents the methods used to create and
test the implementation of the patching algorithm. Chapter 4 focuses on documenting the
implementation created and the results of the tests. Chapter 5 evaluates the results and
reassess the report as a whole in reference to the aims set up in section 1.1.

3


1. Introduction

RAM3
AHB slave

RAM2
AHB slave

RAM1
AHB slave

RAM0
AHB slave

RAM7
AHB slave

RAM6
AHB slave

RAM5
AHB slave

RAM4
AHB slave

AHB multilayer interconnect

A
H

B
 

sl
av

e

Page 0

Page 1

Page 2

Page 3..254

Page 255

0x0000 0000

0x0000 2000

0x0000 3000

0x000F F000

Flash
ICODE/DCODE

A
H

B
 

sl
av

e

N
V

M
C

ICODE

DCODE

Peripheral

EasyDMA

D
M

A
 b

us

Peripheral

EasyDMA

D
M

A
 b

us

CPU

ARM Cortex-M4

S
ys

te
m

 b
us

IC
O

D
E

D
C

O
D

E

AHB2APB

AHB

APB

0x0000 1000

I-
C

ac
he

0x2000 0000

0x2000 1000

0x2000 2000

0x2000 3000

0x2000 4000

0x2000 5000

0x2000 6000

0x2000 7000
Section 0

Section 1

Section 0

Section 1

Section 0

Section 1

Section 0

Section 1

0x2000 8000

0x2000 9000

0x2000 A000

0x2000 B000

0x2000 C000

0x2000 D000

0x2000 E000

0x2000 F000
Section 0

Section 1

Section 0

Section 1

Section 0

Section 1

Section 0

Section 1

Data RAM
System

0x0080 0000

0x0080 1000

0x0080 2000

0x0080 3000

0x0080 4000

0x0080 5000

0x0080 6000

0x0080 7000

0x0080 8000

0x0080 9000

0x0080 A000

0x0080 B000

0x0080 C000

0x0080 D000

0x0080 E000

0x0080 F000

Code RAM
ICODE / DCODE

Section 0

Section 1

Section 2

Section 3

Section 4

Section 5

0x2001 00000x0081 0000

0x2001 80000x0081 8000

0x2002 00000x0082 0000

0x2002 80000x0082 8000

0x2003 00000x0083 0000

0x2003 80000x0083 8000

RAM8
AHB slave

Figure 1.1: nRF52840 memory layout[7].

4


2
Theory

As mentioned in the introduction, the subject of this report is a combination of two topics:
delta updates and embedded systems. Naturally, this chapter will focus on these, and what
potential compatibility issues may arise from combining them. First, this aim is directed
at the subject known as differential compression: what it is, how it is done, and which
algorithms are used for this purpose. Second, the quirks of the device, or in particular
its flash memory, will be detailed. The descriptions will not be covering the full topics,
but will focus on the information needed to understand the design choices made in the
following sections. The chapter will end on a concrete description of the compatibility
issues and their potential solutions.

2.1 Differential Compression
Differential Compression is the process of creating and applying deltas. As the name
implies, differential compression has a close relationship with data compression. In fact,
data compression can be seen as a special case of differential compression in which the
source data is empty[14]. As indeed, differencing with empty source data would be nothing
more than condensing the target image into a more compact form, which is by definition
data compression.
The data record containing the delta is in the context of software upgrades often referred to
as a patch. The ideal patch is minimal in space complexity and has high usability, where the
specifics of what usability entails varies depending on the domain of the problem. Usability
is however, often to some degree tied to a time complexity. This leads to a space-time
complexity trade-off, as lower time complexity leads to higher space complexity and vice
versa. This implies that finding an algorithm for differential compression includes finding
a global minimum, where the patches created are sufficiently small but time complexity
is not consuming important computational resources from either the server creating the
patch or the device applying it.

2.1.1 Differencing

Data compression is a process used to encode information using fewer bits than the original
representation. It comes in the forms of lossy and lossless compression. Lossless data
compression means that the compressed data will contain the same information as the
uncompressed version, which enables the compression to be reversed. For delta updates,
this is obviously needed, as the point of the process is to recreate the target. Differencing
algorithms are ergo implicitly lossless data compression algorithms.
The theoretical foundations of lossless compression are found in algorithmic information
theory, which concerns itself with measuring the irreducible information content of a string.
In relation to binary differencing the string is the difference between the contents of two
firmware binaries, or what could be called "the uncompressed patch".

5


2. Theory

Most differencing algorithms use a combination of data compression and differential com-
pression, where the patches are post-compressed using regular data compression.
Delta encoding algorithms fall into two classes depending on what type of differencing
algorithm is used[16]. The first of these is the copy/insert class, which typically use string
matching to locate matching blocks of code. The second is the insert/delete class, which
makes use of the longest common sub-sequence or shortest edit distance algorithms to
create a sort of ”edit script” that modifies the source file into the target.
In general, insert/delete algorithms are inferior to copy/insert algorithms in terms of size
complexity, but a lot easier for humans to read. Because of this, the former is used
for version control in Git for example[18], and the latter is used for binary differencing.
Consequently, we will only be concerned with algorithms in the copy/insert class in this
thesis.
The term "copy/insert" comes from the instruction stored in the patches the algorithm
class creates. "Copy" refers to instructions regarding which source blocks ought to be kept
and at which offset they should start. "Insert" refers to instructions regarding where the
new byte stored in the patch should be inserted.

2.1.1.1 BSDiff

BSDiff is a tool for building patches based on binary files, created by Colin Percival[10].
A major advantage of the BSDiff algorithm is that it offers a solution to the pointer
problem[10], which is a commonly occurring problem when creating binary patches. The
pointer problem arises from the nature of the executable file. Small, one-line, source code
adjustments to the code or data changes the relative positions of blocks, which makes all
pointers jumping over the modified regions outdated. A problem which, if not dealt with,
creates unnecessarily large patches.
BSDiff exploits two facts to solve the pointer problem. Firstly, that the changes outside
the modified region generally will be rather sparse, and mostly concern the least significant
one or two byte. Secondly, that data and code usually moves around in blocks, which leads
to a large number of nearby pointers having to be adjusted by the same amount. This
information can be used to deduce that the byte-wise differences between old binary and
the new are highly compressible.
The algorithm creates binary patches as follows. The algorithm starts by applying suffix
sorting to the source file, specifically qsufsort by Larsson and Sadakane[15]. Suffix sorting
is an algorithm which creates a suffix array, a lexicographically sorted array containing all
suffixes of a string. A simple example of what this entails can be constructed using the
string S = ”delta”. The string is indexed:

i 0 1 2 3 4
S[i] d e l t a (2.1)

and would given ascending order yield the following suffixes:

i 4 0 1 2 3
Suffix a delta elta lta ta (2.2)

where i is the starting index of the suffix. The suffix array, A, is an array containing i:

i 0 1 2 3 4
A[i] 4 0 1 2 3 (2.3)

This array can be used to find the largest possible regions in the new file which either
approximately or exactly match with regions in the old file. The regions that are only

6


2. Theory

off by a smaller amount are saved in a separate file containing the byte-wise difference
between the two files, which should be very repetitive and easily compressed. Copy and
insert instructions are saved in another file, and should also have high redundancy. Lastly
the non-matching data, which approximately should correspond to the modified code, is
saved in an ’extra’ file.
These output files can be considerably compressed using the Burrous-Wheeler transform,
also known as block-sorting compression or BWT[15]. BWT uses permutation of the
order of the characters in a string to concentrate character recurrences, which can be
abbreviated (for instance “EEEEE” could be written as ”5E”). Together these steps make
BSDiff remarkably good at creating small patches.
Differencing using the BSDiff algorithm has a time complexity of O((n+m) logn) and a
space complexity of max(17n, 9n+m) +O(1), where n is the size of the old file and m is
the size of the new file[10]. Patching using BSDiff has a time complexity of O(n+m) and
a space complexity of n+m+O(1).

2.1.2 Patching

In the same manner that the creation of a patch is related to data compression, the
application of a patch is related to decompression. Patching consists of data compression
and applying the instructions copy and insert. Exactly how this is done depends on
the target system and general preferences. In this work, we will only look at sequential
application and in-place application.
Sequential patching uses two memory regions, one containing the source and one containing
the target[12]. It also uses patches that are divided into repeating patterns, or sequences.
Each sequence consists of the parts: diff, extra, and adjustment, which together make up
the instructions for a small section of the target image. Sequential patching is hence a
loop which repeats the same three steps until the patch is fully applied.
In-place patching is slightly more memory constrained, as it only uses one memory region
for both source and target. It uses a sort of rotation, where the source image is moved
upwards in the memory to make room for the target image. Similarly to sequential patch-
ing, this algorithm also applies patches incrementally. However, it does also require that
the source image can be moved during run-time.

2.2 System Limitations
A resource constrained system has several notable features that sets it apart from the
conventional system most differential compression algorithms have been developed for.
This implies that a large part of what signifies a good solution has to do with its portability.
So, in order to choose a good algorithm, it ought to be determined which aspects of the
system are likely to be incompatible with regular patching algorithms.

2.2.1 Limited Memory

As mentioned in 1.2.1 the unit’s internal memory is divided into 4 partitions, where the
primary partition contains the Zephyr application with the ability to decipher a down-
loaded patch. When performing a delta update a reasonable approach would be to build
the new firmware in the secondary partition and then proceed with the normal update
maneuver. However, this leaves no room for the patch. In a worst case scenario both
the current and the new Zephyr application image take up the full primary or secondary
partition, in which case the patch itself would either have to be stored somewhere else or

7


2. Theory

progressively discarded in order to make room for the new image being built in the second
partition.

2.2.2 Flash Memory

One of the most notable properties of the system is that it uses flash memory. Indeed,
memory management is quite the quintessential part of the delta encoding process. Flash
memory has several limitations to low-level access that memory types assumed in patching
algorithms do not have.
Erasing and writing to the memory is done using the Non-Volatile Memory Controller
(NVMC)[7]. The NVMC has several features of interest. To start with, the operations of
erasing or the writing to the flash will halt the CPU if it is currently running code located
on it. Second, the NVMC is only able to set bits to ’0’ when writing. To achieve bits
set to ’1’ a section of the memory has to be erased, creating a ’clean slate’. Additionally,
there are a few features which vary between systems, and are characteristics of the target
device.
Writing to the flash can be done down to the 32-bit word. This means that only word
aligned writes are allowed. Byte or half-word-aligned writes will result in a fault. Each
writing cycle takes 41 µs. To put this number into perspective it would take about 10
seconds to program the entire memory. The precision when erasing a page is significantly
smaller. The smallest memory area that can be erased is called a page. As mentioned in
section 1.2.2 the flash is divided into 256 pages of 4 kB each (one thousand words per page
that is). Erasing a page takes 85 ms, which when put into perspective means that erasing
the entire memory page by page would take around 22 seconds (though just erasing the
entire memory normally, that is not page by page, would only take about 169 ms).
Furthermore, the write cycles per erase cycle are also limited. A 32 bit word can only be
written 2 times before a page erase must be performed, because of something called flash
wear. This means that even if two write cycles are used to flip nothing more than two bits
it is still not possible to write anything more to that area before erasing the page. Exactly
what flash wear is and how to avoid it is a whole subject by itself, and we will not dive
any deeper into it in this thesis than just asserting that flash memories have limitations
on how many times they can be reprogrammed before they start to lose their integrity.
To clarify why this matters, let us imagine a situation where a user wants to replace old
firmware with new firmware that is identical but for one word, which has a bit flipped
from ’0’ to ’1’. This could potentially mean that 4 kB would have to be erased and then
reprogrammed with almost identical data, instead of a simple one word write. Even more
devastating, it would likely be impossible to keep track of when this is needed, making
it the default case. Ergo, every page that is to have parts of the target image on it
would likely have to be erased before anything can be written. This indicates that an
implementation would need some sort of erase page management.

2.3 Solutions

Given the information in the preceding sections a series of different options for solutions
can be imagined. These options concern the placement of data in the memory, which
because of the device’s constrained memory and the general nature of the flash memory
has to be somewhat optimally selected. They also involve picking the algorithm used for
compression, which has to be compatible with the resources provided by Zephyr. This
section will describe what these options are and what picking them entails.

8


2. Theory

2.3.1 Patch Location

There are two appealing options for where the patch could be located. Either reserving a
whole new partition for patches (see table 2.2), or storing it on the second partition (see
table 2.1). An advantage of creating a new partition is that the patch and the target image
will not ever risk obstructing each other. However, a major disadvantage is that this par-
tition will further restrict the already very restricted memory of the device, or necessitate
the use of an external memory. Putting the patch at the end of the second partition has
the advantage being more memory efficient, but has the drawback of potentially getting
very complex when the target or source image is close to the secondary partition in size.
Unfortunately, this is rather likely to be the case in a memory restricted devices.

boot Lower addresses
source → slot0
patch → slot1

scratch Higher addresses

Table 2.1: Patch located on the primary partition.

boot Lower addresses
source → slot0

slot1
scratch

patch → patch Higher addresses

Table 2.2: Patch located on a patch partition.

2.3.2 Source image location

At first glance it might seems as only one option exists when it comes to the location of
the source image - the primary partition. This is not necessarily the case (see table 2.3),
as the secondary image actually also contains a version of the firmware - the version of the
firmware that was replaced during the last firmware upgrade. If a patch was to be created
based on the firmware located on the second partition this would likely mean a slightly
larger patch, but also that nothing would have to be copied from the primary partition,
which would both lower the time complexity of the solution and the flash wear.
It would also be possible to combine this with the previous section by having the source,
the patch, and the target all located on the secondary partition. This would enable use of
the in-place patching algorithm.

boot boot
v1 → slot0 Upgrade slot0 ← v2
v2 → slot1 −→ slot1 ← v3

scratch scratch

Table 2.3: Two versions of firmware swapping places during a firmware upgrade. A
delta upgrading the system to v3 can be created based on either v2 or v1.

9


2. Theory

2.3.3 Compression

A suitable compression algorithm has to live up to two criteria: firstly, it has to be imple-
mented into the differencing tool used to create patches; and secondly its corresponding
decompression algorithm has to be implemented into our Zephyr application. At this
moment Zephyr does not have any support for decompression[17], which means that the
easiest solution would be to port a decompression algorithm corresponding to the com-
pression options available for the delta encoding tool.

10


3
Method

The methods used to develop the product can be divided into: design, procedure, and
environment. Design pertains to the choices made regarding the solutions presented in
the previous chapter. It also pertains to the documentation of the algorithms that where
chosen. Procedure refers to all steps performed in order to generate the results, both in
terms of components needed for the solution and in terms of tests needed to validate it.
Development environment includes all tools used outside the target system to test and
deploy solutions. Together these make up a foundation which we hope to gives reliability
to the results presented in the following section.

3.1 Algorithm Design Choices
The design choices made regarding the solution are: choice of algorithm, data locations,
and patch metadata. Note that the project is a proof of concept, and thus does not have
to be optimal in any manner. This makes it likely that improvements could be made to
each of these.

3.1.1 Selecting an Algorithm

The algorithm chosen is an implementation of BSDiff created by Erik Moqvist called
Detools [12]. Detools is a delta encoding tool primarily based on Python, but with a C
based patching implementation created for embedded solutions. Detools has six different
types of compression implemented: BZ2, LZ4, LZMA, Zstandard, heat-shrink and CRLE.
However, only one of these are included in the embedded solution - heatshrink.
The solution is highly portable to Zephyr, with two exceptions. The first of these being
that the solutions assumes that the target device has file system support. Which zephyr
as a standard has not. However, data manipulations (read/write/erase operations) in
Detools are performed using call back functions. The functions could easily be replaced
by functions manipulating flash directly. They could also easily be developed to manage
erasing pages, which would solve the problems stemming from the nature of the flash
memory. The second of these is that the solution uses the size of the patch as input. The
sizes of data records are difficult to know without a file system or meta data, but this can
easily be solved by adding metadata to the patches.
For these reasons we selected Detools patching in combination with heatshrink decom-
pression.

3.1.2 Data Locations

The data locations where heavily influenced by the introduction of Detools. The algorithm
uses three separate memory locations, which means the target, patch, and source all have
their own partitions. Although a more complex solution would likely require less space,

11


3. Method

the significant cut to development time and the reduced risk of bugs made this appear the
most sensible choice.
This solution uses the devices storage partition (see table 1.1) as a patch partition, limiting
the size of the patch to 24 576 byte. The source image is naturally located on the primary
partition, making the patch based on the currently running version of the firmware.
In summary, all data locations and their partitioned areas are thus as displayed in table
3.1.

boot 0x00000000 - 0x0000C000
source → slot0 0x0000C000 - 0x00073000
target → slot1 0x00073000 - 0x000DA000

scratch 0x000DA000 - 0x000F8000
patch → patch 0x000F8000 - 0x000FFFFF

Table 3.1: Data locations in the implementation.

3.1.3 Patch Metadata

The patch created by Detools does, as most patch files, normally just contains instructions
for which part of the source image binary needs to be changed and how. However, as
mentioned above, there is a need for some extra information. This information is included
in the form of metadata at the start of the patch. This metadata is composed of the size
of the patch and a small message signifying that the patch is new. The message exists so
that the device may edit once it has implemented the patch, and in this way keep track
of what is currently located in the patch partition.
The metadata consists of a 24 byte long string organized as displayed in table 3.2. The
first 8 byte say "NEWPATCH", which is the message that tells the device to upgrade. The last
16 byte are made up of the size, encoded as a string representing a padded hexadecimal
number. In practice this string could be, for example "NEWPATCH0x000000000008c4".

Content ”NEWPATCH” Size Patch contents
Offset 0x0 0x8 0x18

Table 3.2: Patch contents and their offsets.

3.1.4 Summary

The selected algorithm is the Detools BSDiff implementation with heat-shrink compres-
sion. The source image is located in the primary partition, the target image is written to
the secondary partition, and the patch is located on a patch partition. The patch has a
24 byte header which contains the word "NEWPATCH" and the patch size.

3.2 Algorithm descriptions

This project utilizes code written in two other projects, Detools and heat-shrink, to a large
degree. The goal of the following section is to provide a short but sufficient explanation
of all the external code used, so that the reader may more easily understand the solution
as a whole.

12


3. Method

3.2.1 Detools

The easiest way to understand the Detools algorithm is by understanding the composition
of the patch. The patch created by the Detools differencing algorithm is a sequential patch,
and thus has a repeating layout (see table 3.3). Consequently, the patching algorithm
consists of the looping of several stages. These are: reading a chunk (512 byte), getting
the size of the diff, applying the diff, getting the size of the extra, applying the extra, and
adjusting the position of the buffer reading from the source image. In this context the
"diff" is data created through additions of the patch data and the source image data, and
the "extra" is data created simply by reading patch data.
Algorithm 1 provides an abstract description of the Detools algorithm, meaning that
some parts of it, such as initiation of structs and error management, has been omitted
for the sake of readability. If the reader wishes to dive deeper into the details of the
Detools algorithm - the full source code exists on the code repository[12], along with some
documentation.

header diff 1 extra 1 adj. 1 diff 2 extra 2 adj. 2 ...

Table 3.3: Patch layout without metadata (table 3.2) added. The first part of the
header is uncompressed and the remainder of the patch is compressed.

3.2.2 Heat-shrink

In the original implementation of BSDiff, the compression algorithm used is suffix sorting,
as explained in section 2.1.1.1. However, the Detools implementation of BSDiff does not
make use of suffix sorting at all, but rather a completely different algorithm.
The compression library used by Detools is called "Heat-shrink" and was created by a
developer at Atomic Object LLC. The project’s source code is located on the company
GitHub page[19]. It features both static and dynamic memory use, low memory usage,
and bounded CPU usage, which is perfect in a resource constrained environment. The
algorithm itself is based on the Lempel-Ziv-Storer-Szymanski (LZSS) algorithm for lossless
data compression. LZSS is a dictionary coding technique, which means it replaces a string
with a reference to a dictionary location of the same string. The idea is to record the
locations of previously unseen sub-strings. Sub-strings which, once seen in the text again
again, are replaced by a pointer to the first occurrence of this sub-string[20].
Table 3.4 shows an example of how LZSS compression changes a text. The example is taken
from an article by A. Ozsoy and M. Swany [21] published in the 2011 IEEE International
Conference on Cluster Computing.

0 I meant what I said 0 adj. I meant what I said
20 and I said what I meant 20 and(12,7)(7,8)(2,5)
44 30
45 From there to here 31 From there to (51,4)
64 from here to there 47 f(46,4)(51,8)(50,5)
83 I said what i meant 55 (24,19)

Table 3.4: LZSS encoding of a small piece of text. The original number of characters is
102, which is encoded into only 56 characters.

13


3. Method

3.3 Procedure

In order to create the desired application we needed a ported version of Detools, an idea
for a demonstrator used to test the validity of the solution, an idea for a main application
design, and a strategy for which tests are to be performed. This section will be the
foundation of the results, as it details the strategy behind them.

3.3.1 Detools Porting Strategy

The initializing function of the Detools patching algorithm takes six arguments. Which
are:

1. a callback to a function used from the source image into a buffer,
2. a callback to a function used to "jump" forward to an offset in the source image.
3. a callback to a function used for reading from the patch into a buffer,
4. an integer containing the size of the patch,
5. a callback to a function used for writing from a buffer into the target image,
6. a pointer to a struct containing all the resources needed by the functions above,

for example file pointers, or in our case a pointer to the flash device and some
information about the different partitions.

Given these arguments the following had to be implemented:
• A function which reads the patch header, determines if the patch is new, sets it to

"not new", and returns the size of the patch.
• A struct which contains the necessary resources the four functions will need in order

to access the flash and read or write to the correct places.
• A function which internalizes this struct.
• four functions which handles interactions with the device’s flash memory.
• A main function which starts the patching process, reboots the device if it was

successful, and manages any errors that may spawn if it is not.
The development of these callback functions were assisted by the use of code from the
Zephyr sample application "Flash Shell", an application which provided examples of func-
tions used for flash manipulation.

3.3.2 Demonstrator Setup

The "demonstrator" refers to the setup surrounding the implementation that is used to
demonstrate its functionality. It is a collection of sample applications that together makes
up the functionalities included in the main application that are not a part of the delta
encoding process, but rather the testing of it.
The first sample application integrated into the solution, that also served as the base for
the whole project, was an application called "Blinky" that turned one of the LED lights
on the device, LED 0, on or off every second. The program was a good starting point as
it was rather short and therefore easy to understand, but also communicated information
about the software outwards. It could also easily be modified into blinking LED 1 on the
board, creating a new application that was visibly distinguishable from the original. This
characteristic could be used to demonstrate that a delta upgrade had been successfully
implemented, as the source image currently running could be blinking LED 0 and the
target image could be blinking LED 1.
Second, we wanted to integrate some user input functionality into the software. The
purpose of this was to control when the patch was applied, which would make it easier to
supervise the events relating to an upgrade scenario. The easiest way to do this was using

14


3. Method

the device’s buttons. Hence, we utilized the sample program "Button". Button executes
a function every time a button on the device is pressed. This function could be modified
so that pressing the button led to a flag being set to one, indicating to the main loop,
running the LED-blinking, that it was time to stop blinking and look for a patch.
The resulting demonstrator was a application blinked LED 0 or LED 1, and when button
0 was pressed started the patch application process. This was the foundation of the main
application loop, see algorithm 2, which could be used to launch the implementation of
Detools (in the algorithm referred to as "checkForPatch").

3.3.3 Main Application Setup

The finished main application needed functionality concerning the upgrade process added
to the demonstrator setup mentioned in the previous section. This functionality was:

• Reading a patch and marking it as read.
• Initialing the data structure.
• Starting patch application.
• Marking the new firmware as an upgrade and rebooting the system.
• Managing potential errors.

In combination with the main loop from the demonstrator and the Detools implementation,
these functions would complete the full solution for the product.

3.3.4 Testing Strategy

Testing the validity of the solution is done in two ways. First, using the demonstrator,
meaning seeing which LED is blinking. Second, by dumping the contents of the primary
partition to a binary file after an upgrade and preforming a byte-wise comparison to the
target image.
Reduction of payload can simply be shown through looking at the ratio between target
image and patch.

3.4 Development Environment and Tools

This section details the environment and tools used to develop the application. This
includes some standard software which will be named but not explained, as their func-
tionality is assumed to be obvious. These are:

• IDE: Visual Studios Code 1.52.1
• Operating System: Ubuntu 20.04.1 LTS (x64)
• C standard: 11
• Compiler: GCC 9.3.0
• Python version: Python 3.8.5
• Board OS: Zephyr 2.4.0
• SDK: Zephyr SDK 0.11.3

In the section below the less known tools used will be detailed. The last section provide
some documentation of the python scrips created for automation.

3.4.1 Tools

The Zephyr Project includes a meta tool named West. West manages building, signing,
flashing and debugging Zephyr applications. It also has several repository management

15


3. Method

tools. In this project West is used for keeping the Zephyr modules updated, as well as
building Zephyr applications and signing them.
Building and flashing the boot-loader binary is done using ninja, which is a small build
system with similar functionality as the Make tool. Ninja is also what west per default
uses in the background for its flashing, building, and debugging operations.
Flashing the firmware image and the patch is done using pyOCD, which is a python based
tool used for programming and debugging ARM Cortex-M microcontrollers.
Connecting to the device’s console is done with J-Link Real-Time transfer, JLinkRTT-
Logger, from Segger.
The script used for dumping flash contents into files utilizes the pynrfjprog API. pynrfjprog
is a python wrapper around the nrfjprog dynamic link libraries, which allows for the
development of python scripts programming and debugging a nRF device.

3.4.2 Scripts

To automate the setting up of the demonstrator, a make file was created. The make
file manages building the boot loader and the application, signing applications, creating
patches, and flashing these images to the target device. It also manages connecting to
the devices shell, dumping the contents of the flash into files, and installing the necessary
dependencies.
Most of these functions are supported using the tools mentioned in section 3.4.1, but a
few of them rely on scripts written in Python. These functions are: keeping track of which
version of the firmware the device is currently running, adding metadata to the patches,
and dumping the contents of certain partitions into files.
The version tracking system is for the sake of simplicity incredibly rudimentary and only
keeps track of two versions of the firmware: the currently running one and the most
recently built version. It works by running a script before flashing a firmware image and
after flashing a patch. The script asks the user if they wish to update the source image
by replacing it with the contents of the target image, and does so if the user presses ’y’.
This is a far from perfect system, but it serves the purpose of this report by providing an
easy system for storing the files needed to create a patch.
The script adding metadata appends metadata to the start of the patch file in accordance
with section 3.1.3.
The script for dumping flash contents are, as the name suggests, used for saving flash con-
tents on binary files. The make file has three commands for this: dump-slot0, dump-slot1,
and dump-flash (which dumps both 0 and 1). All three of these use a python script which
takes a start offset, a slot size, and a target file as input and outputs the slot into the
target file. The purpose of this is to make it easier to troubleshoot if a patch for some
reason is not successfully applied. For example, the UNIX command cmp may be used to
get a byte by byte comparison between the desired firmware image with the one created
by the patching algorithm on slot one.

16


3. Method

Algorithm 1 Detools
while patch_offset < patch_size do
chunk_size = min(patch_size - patch_offset, 512)
chunk = read_chunk_from_patch(chunk_size)
while chunk_available(chunk) do
if STATE == init then
target_size = read_patch_header(); STATE = diff_size

end if
if STATE == diff_size then
chunk_size = unpack_diff_size(chunk); STATE = diff_data

end if
if STATE == diff_data then
to_size = min(128,chunk_size)
if chunk_size == 0 then
STATE = extra_size

end if
to[to_size] = decompress_patch_data(to,chunk)
from[to_size] = read_from_source(to_size)
for i = 0 ; i < to_size ; i++ do
to[i] = to[i] + from[i]

end for
write_to_target(to)
chunk_size -= to_size
target_offset += to_size

end if
if STATE == extra_size then
chunk_size = unpack_extra_size(chunk); STATE = extra_data

end if
if STATE == extra_data then
to_size = min(128,chunk_size)
if chunk_size == 0 then
STATE = adjustment

end if
to = decompress_patch_data(to,chunk)
write_to_target(to)
chunk_size -= to_size
target_offset += to_size

end if
if STATE == adjustment then
source_offset = unpack_offset(chunk)
move_source_reader_buffer(source_offset)
if target_size == target_offset then
STATE = done

else
STATE = diff_size

end if
end if
patch_offset += chunk_size

end while
end while

17


3. Method

Algorithm 2 Main loop
loop

turn led on/off
sleep for 1 second
if buttonPressed then

return ← checkForPatch(flash)
buttonPressed ← FALSE
if return then
print errorMessage(return)

end if
end if

end loop

18


4
Results

This chapter will contain the full documentation of the patching solution that was devel-
oped, as well as the tests that were performed in order to determine how successful this
solution was at reaching the aims of this work. The body of work surrounding the imple-
mentation mainly consists of solutions for porting the Detools algorithm. It also consists
of the structure around it which manages device specific things, for example triggering an
update and restarting the system. The following section details what this entails.

4.1 Data structure
In the original implementation of Detools a struct containing FILE pointers existed. These
file pointers where the only data records needed by the callback functions (see section 3.3.1)
to read and write data. Obviously this struct needed to be replaced by a new one in our
implementation, as ours do not use files, but rather, offsets in the flash memory. Thus in
its place a struct with the following data fields was created:

• *device: a pointer which points to the device used to access the flash memory,
• patch_current: the current position in the patch partition,
• patch_end: the end of the patch partition,
• from_current: the current position in the primary partition,
• from_end: the end of the primary partition,
• to_current: the current position in the secondary partition,
• to_end: the end of the secondary partition,
• write_buf: how much has been written to the secondary partition since a page was

erased.
The fields keeping track of "current" positions might not be intuitive, and their use stems
from the original implementation reading data from files using I/O streams. When new
data from a data stream is read it will be starting from the last unread byte, meaning
there is no need to keep track of positions. Consequently, using the original struct, the
callback functions do not have any information about offsets. This is naturally needed
in order for the functions to know where the relevant data starts. In the same manner
the ends of the relevant memory areas are not marked with a "end of file" or similar, but
instead one needs to keep track of this information manually.
The write_buf data field is used to keep track of when a full page has been written to
the secondary partition, and a new page needs to be cleared in order to make room for
more data.

4.2 Main Function
The main function serves to do five things: read the patch header, initialize the data
structure, start the Detools patching process, inform the MCUBoot that slot 1 contains

19


4. Results

a firmware upgrade, and restart the system. Algorithm 3 shows how this is done in more
detail.

Algorithm 3 Main patching function
patchSize ← readPatchHeader(flash)
if patchSize < 0 then
return patchSize

else if patchSize > 0 then
init(flash)
detoolsApplyPatch(func1, func2, func3, patchSize, func4, flash)
requestUpgrade()
rebootSystem()

end if
return 0

4.3 Initialization Functions
The first initialization function is the function reading the patch header, in algorithm
3 referred to as readPatchHeader. This function reads the 24 first byte of the patch
partition, figures out if the first 8 byte say NEWPATCH or not (if not returns 0), and what
size the last 16 byte say the patch is. It then writes the word FFFFFFFF over the place
where NEWPATCH used to be. Lastly it returns the size of the patch, or an error code if
something went wrong.
Initialization functions also refer to the two functions which run when the function referred
to as init in algorithm 3 is called. These functions erase the first page of the second
partition and initialize all the values in the flash memory struct.

4.4 Callback Functions
There are four callback functions in total. Their uses are: reading from the source image,
reading from the patch, writing to the target destination, and "jumping" to another offset
in the source image. The two functions used for reading are rather similar and will therefore
be explained by the same algorithm (algorithm 4). The two other functions for writing
and seeking ("jumping", as it were) are explained by algorithms 5 and 6.
All of these four functions start with a casting of the contents of a struct pointer. In our
implementation this struct is the one detailed in section 4.1, which among other things
contains a pointer to the device used to manipulate the flash memory and data relating
to buffer positions. In the original implementation of Detools these functions cast to a
FILE-pointer.
The read functions are rather simple. They use the flash device to read at some offset into
a buffer, increase the offset by the buffer size, check if the buffer is outside the partition
bounds, and return an error or a message saying the operation succeeded.
The write function does all the same things that the read functions do, but instead of
reading into a buffer, it writes from one. However, there are a few additional features which
make the write function slightly more complicated - it also contains some management of
erasing data and write protections.
Write protection management consists of removing the write protection when writing to
the flash, and then applying it again after the write. Erasing page management consists of

20


4. Results

keeping track of how many byte have been written to the target area since the last erase,
and erasing a new page every time the amount of byte becomes larger than the size of a
page. This way we keep flash wear down, as we only have to erase enough pages to make
room for the new image.
The seek function is likely the most simple function as it only adds an offset to the data
field keeping track of the current offset used to read from the source image.

Algorithm 4 Read function
flash ← (flashStruct) *structPointer
read(flash.device,flash.from_current,buf,bufSize)
flash.from_current ← flash.from_current + bufSize
if flash.from_current >= flash.from_end then
return ERROR

end if
return OK

Algorithm 5 Write function
flash ← (flashStruct) *structPointer
flash.writeBuf ← flash.writeBuf + bufSize
if flash.writeBuf >= PAGESIZE then
erasePage(flash,flash.to_current + bufSize)
flash.writeBuf ← 0

end if
setWriteProtection(flash.device,FALSE)
write(flash.device,flash.to_current,buf,bufSize)
setWriteProtection(flash.device,TRUE)
flash.to_current ← flash.to_current + bufSize
if flash.to_current >= flash.to_end then
return ERROR

end if
return OK

Algorithm 6 Seek function
flash ← (flashStruct) *structPointer
flash.from_current = flash.from_current + offset
if flash.from_current >= flash.from_end then
return ERROR

end if
return OK

4.5 Tests

The aims of this thesis are to (i) demonstrate that it is possible to perform delta updates
on resource constrained embedded systems and (ii) that these updates cause a meaningful
reduction of data transmitted during a firmware upgrade. To test that these aims have
been reached we need to verify that the application created can perform a delta update,

21


4. Results

Source
Info

Target
Info

Target
Size

Patch
Size

P/T
Ratio

Notes

LED 1 LED 0 31.27 kB 845 B 2.6 %
LED 0 LED 1 31.27 kB 845 B 2.6 %
LED 1 LED 1 31.27 kB 515 B 1.6 % Same signature.
LED 1 LED 1 31.27 kB 800 B 2.5 % New signature.
LED 1 and
print

LED 0 and
no print

30.47 kB 1.4 kB 4.6 %

LED 0 and
no print

LED 1 and
print

31.27 kB 2.0 kB 6.4 %

Table 4.1: Test results. Where: Source/Target Info refers to information about the
image; Target Size refers to the size of the target image; Patch Size refers to the size of
the patch; and P/T ratio refers to the size ratio between patch and target image. The

unit for size is byte.

and that the patch used in this process is significantly smaller than the target image. All
the test results can be found in compact form in table 4.1.
The first of these aims was acknowledged by the successful upgrade from an application
blinking LED 1 on the board to an application blinking LED 0. It was also possible to
flash a new patch "upgrading" the image back to its original state. The validity of this
was further confirmed through dumping the upgraded image on the primary partition of
the board and comparing it to the target binary used to create the patch.
The second of these aims was acknowledged by noting that the patch created was 845
byte, while that target image was 31 270 byte. This meant that in this case only 2.6
percent of the amount of data needed without delta updates had to be transferred with
delta updates. The target and the source file differ by 291 byte, 286 of which are changes
to the image trailer.
Additionally, a few extra tests were performed using other types of upgrade images.
The first of these tests was performed using an upgrade image that was based on identical
source code, but had been built and signed on a different day. The signed images differed
by 256 byte, all located in the image trailer. The generated patch was 802 byte and had
a 2.5 percent patch/target ratio.
The second test was performed using a firmware upgrade with error printouts disabled and
the LED switched. This yielded a large difference to the binary (over half the image, due
to shifts resulting from missing blocks of code) and a generated patch of about 1439 byte
byte. The target image was in this case 30 470 byte, which means that the patch/target
ratio was 4.6 percent. The patch could be applied without problem.

22


5
Conclusion

This chapter reflects on the methods and results presented in the previous chapters by
returning to the aims presented in the problem statement. The analysis is intended to
point out both how the thesis results may be utilized, but also in which way the method
might be improved in order to generate even more utility. We also intend to describe what
this utility can add to the industry at large. Lastly we suggest some topics which might
be suitable for future work.

5.1 Discussion
This sections contains the full analysis of the methods and results in the report. This is
divided into application design and tests. Where the first discusses the choice of algorithm
and the second the outcome of the tests used to show its validity.

5.1.1 Application Design

This project was made considerably easier through the usage of the Detools package. Not
only did the use of the package contribute to reducing the amount of time needed to
develop an algorithm by ourselves, it also saved a lot of time otherwise needed for testing
the validity of it. Though it is possible that another solution would have offered more
control, the package proved to leave little to wish for in terms of portability. Hence, it
seems rather unnecessary to reinvent the wheel.
The choices regarding data locations were selected for reduced development time rather
than constraining memory usage. This was a consequence of the limited scope of the
project, and might have limited the amount of systems for which the results are applicable.
On the other hand, a robust system is one of the most essential parts of a unit that
cannot receive firmware upgrades easily. Adding extra complexity to the solutions would
have increased the risk for bugs, and thereby perhaps deemed the results in this report
applicable for even fewer systems.

5.1.2 Test Results

The tests performed in section 4.5 acknowledged that the aims of the report had been
fulfilled. A Zephyr application that was able to perform delta updates had been created,
and it had been used to reduce the amount of data needed to be transferred. Though,
there is still a lot of room for there to edge cases where the solution does not function
as intended. In order for our solution to be ready for deployment, this would have to be
investigated.
The solution developed has two natural limits to patch sizes. These are: when the patch
gets larger than the patch partition (24 576 byte) and when the patch is larger than the
target image. Though, it is indeed possible that there are other limits that we have yet to

23


5. Conclusion

discover. This could, for example, be at some arbitrary point where the time consumed
by the device to apply the patch becomes too large, or at a point where the patch simply
consumes too much memory. It is possible that there exists a better size for the patch
partition. To discover these things, some further investigations would have to be made.
Notable is also the fact that there seems to exist a minimal patch size at around 800 byte.
Given that the target and the source do not share the same signature, a patch cannot
be any smaller than that. This seems to stem from the use of cryptographic hash, which
generates a different signature every time. The signature has high information density,
and can therefore not be compressed much. The patch header and information about the
location and size of the signature also adds to patch size.

5.1.3 Ethical and Environmental Perspectives

After careful consideration, we have decided that there are no ethical aspects to consider
in this project. However, delta updates do have an environmental impact, as they decrease
battery drain.
Energy consumption takes up 60 per cent of total global greenhouse gas emissions, accord-
ing to the United Nations[22]. Additionally, battery production is associated with several
human rights abuses and large scale environmental destruction[23].
While IoT devices likely do not contribute the lions share to any of these issues, an
increasing amount of IoT devices are currently entering the market, which means that ways
to use less energy for data communication in IoT devices might become a progressively
more important issue.

5.2 Implications and Consequences

As mentioned in section 1.1 one of the prospects of this project is to contribute to the
development of a free implementation of delta updates in embedded systems. The current
iteration of the created application is far from ready for deployment, and does not have
much practical use. However it serves as a base for further development.
One such development would be to add support for radio transmissions. As of now, the
solution requires a cable for information transfer, which does not solve the problem delta
updates solve, namely to allow for cheap updates without the need for a cable. Zephyr has
several sample applications which demonstrate information transfer via Bluetooth, which
would likely be very easy to integrate into our application. Another possible solution could
be to transfer data via SMS, which is useful for devices positioned at remote locations.
The long term goal with this type of addition, in combination with some additional testing
of the current application, would be to submit the solution to the Zephyr project in the
hopes of it becoming a sample application. A sample application is a small application
which is integrated into the Zephyr repository, making it easily accessible. Having a full,
freely available, implementation of delta updates via SMS would significantly lower the
threshold for creating some IoT units. Units that had previously been unfeasible due to
the cost that would have to be sunken into their development.

5.3 Topics of Further Work

Working on this report generated quite a few questions that were out of the scope of this
thesis to answer. Two of them are left to the reader as suggestions for future work.

24


5. Conclusion

5.3.1 Effects of Compiler Optimizations

Like many other embedded systems Zephyr is built using the C programming language.
When one compiles code written in C, it is common to let the compiler optimize the
code to some degree, to achieve a faster or smaller application. Without optimization
all statements are independent. That means that if one stops the program on a break
point and changes the program counter to jump to another statement in the function -
one would get exactly the results one expected based on the source code[9]. In the context
of delta encoding this means that if the source code of a firmware upgrade differs by only
a few statements the binary will only differ on the instructions representing these specific
statements, but no more than that. However, the higher the degree of optimization gets,
the more this ceases to be the case. Indeed, the optimization might make it so that even
small changes to the source code result in changes to the binary executable so large that
it becomes pointless to generate a delta.
Like many other resource constrained systems, Zephyr does as a standard use size op-
timizations rather than performance optimizations. This might make the optimization
problem a bit less of a problem than in the default case. Whether this is the case or not
is highly interesting for the field.

5.3.2 Encryption Management

To provide confidentiality of image data while in transport to the unit MCUBoot supports
using encrypted images[11]. This means that an encrypted firmware upgrade may be
downloaded to the secondary partition to be decrypted when moved to the primary slot.
It also means the image located in the primary partition will be re-encrypted once moved
to the secondary partition. The purpose of this is to secure confidentiality of the image
while not residing on the device.
While it is probably possible to obfuscate the delta via encryption and then decrypt it on
the device, doing that was outside the scope of this thesis. However, IoT applications are
often used in contexts where cyber security is important, and thus there would probably
be great utility to finding such a solution.

5.4 Summary
In summary a proof of concept was successfully created and it was able to significantly
reduce the payload needed for firmware upgrades. This solution is far from ready for
deployment, but could likely with the addition of radio transfer support have several use
cases. An aspiration is that the solution could be expanded upon to create a sample
application that can be submitted to the Zephyr project.

25


5. Conclusion

26


Bibliography

[1] F. Firouzi, K. Chakrabarty, S. Nassif, Intellegent Internet of Things: From De-
vice to Fog and Cloud, Charm, Switzerland: Springer, 2020. [Online]. Available:
https://link-springer-com.proxy.lib.chalmers.se/book/10.1007%2F978-3-
030-30367-9#about, Accessed on: 2020-10-08.

[2] Zephyr Project, ”About the Zephyr Project”, 2020. [Online]. Available: https://
www.zephyrproject.org/learn-about/, Accessed on: 2020-10-05.

[3] Zephyr Project, ”Device Firmware Upgrade”, 2020. [Online]. Available:
https://docs.zephyrproject.org/latest/guides/device_mgmt/dfu.html,
Accessed on: 2020-10-05.

[4] Wikipedia, ”Booting”, 2020. [Online]. Available: https://en.wikipedia.org/wiki/
Booting#Boot-loader, Accessed on: 2020-10-06.

[5] Zephyr Project, ”Legacy Devicetree macro”, 2020. [Online]. Available:
https://docs.zephyrproject.org/2.3.0/guides/dts/legacy-macros.html#
mcuboot-partitions, Accessed on: 2020-10-05.

[6] ”MCUboot”, GitHub Documentation, 2020. [Online]. Available: https://github
.com/JuulLabs-OSS/mcuboot/blob/master/docs/design.md,
Accessed on: 2020-10-05.

[7] nRF52840 DK: Development kit, Trondheim, Norway, Nordic Semiconductios, 2020.
[Online]. Available: https://www.
nordicsemi.com/Software-and-Tools/Development-Kits/nRF52840-DK,
Accessed on: 2020-10-06.

[8] ”Zephyr Project”, 2020, GitHub Repository. [Online]. Available: https://github
.com/zephyrproject-rtos, Accessed on: 2020-11-16.

[9] R. Stallman, ”Options That Control Optimization” in Using The Gnu Compiler Col-
lection: A Gnu Manual For GCC Version 4.3.3, CreateSpace, 100 Enterprise Way,
Suite A200, Scotts Valley, CA, USA, 2009. [Online]. Available: https://gcc.gnu
.org/onlinedocs/gcc/Optimize-Options.html, Accessed on: 2020-10-08.

[10] C. Percival, ”Naive differences of executable code”, Aug. 2003. [Online]. Available:
http://www.daemonology.net/bsdiff/, Accessed on: 2020-10-21.

[11] ”Encrypted Images”, MCUboot, GitHub Documentation, 2020. [Online]. Avail-
able: https://juullabs-oss.github.io/mcuboot/encrypted_images.html, Ac-
cessed on: 2020-10-21.

[12] ”Detools”, GitHub Repository, 2020. [Online]. Available: https://github.com/
eerimoq/detools, Accessed on: 2020-10-14.

[13] J.W. Hunt, M.D. McIlroy, ”An Algorithm for Differential File Comparison”, Bell Lab-
oratories, Murray Hill, USA, Department of Electrical Engineering, Stanford Univer-
sity, Stanford, USA, 1975. [Online]. Available: https://www.cs.dartmouth.edu/
doug/diff.pdf, Accessed on: 2020-10-30.

27


Bibliography

[14] D. Korn, J. MacDonald, J. Mogul, K. Vo, ”The VCDIFF Generic Differencing and
Compression Data Format”, RFC 3284, Internet Engineering Task Force, 2002. [On-
line]. Available: https://tools.ietf.org/html/rfc3284, Accessed on: 2020-10-30.

[15] J. Larsson, K. Sadakane, ”Faster Suffix Sorting”, , vol. 387, nr. 3, ss. 258-272, Nov.
2007 doi: doi:10.1016/j.tcs.2007.07.017. [Online]. Available:
http://www.larsson.dogma.net/ssrev-tr.pdf, Accessed on: 2020-11-03.

[16] J.P. MacDonald, ”File System Support for Delta Compression”, master thesis, De-
partment of Electrical Engineering and Computer Sciences, University of California
at Berkeley, Berkeley, USA, 2000. [Online]. Available: http://pop.xmailserver.
net/xdfs.pdf, Accessed on: 2020-11-04.

[17] ”Data compression module integration”, GitHub Thread, 2020. [Online]. Available:
https://github.com/zephyrproject-rtos/zephyr/issues/26648, Accessed on:
2020-11-08.

[18] Y.S. Nugrogo, H. Hata, K. Matsumoto, ”How different are different diff algorithms
in Git?”, Empirical Software Engineering, vol. 25, nr. 1, pp. 790-823, 2020, doi:
10.1007/s10664-019-09772-z. [Online]. Available: https://link.springer
.com/article/10.1007/s10664-019-09772-z, Accessed on: 2020-11-04.

[19] ”Heatshrink”, GitHub Repository, 2015. [Online]. Available: https://github.com/
atomicobject/heatshrink, Accessed on: 2021-02-12.

[20] Du, Ke-Lin Swamy, M. N. S, ”Wireless Communication Systems - From RF Subsys-
tems to 4G Enabling Technologies - 14.1 Basic Definitions”, 2010. [Online]. Available:
https://app.knovel.com/hotlink/pdf/id:kt009BKH15/wireless-communication
/basic-definitions, Accessed on: 2021-02-12.

[21] A. Ozsoy and M. Swany, ”CULZSS: LZSS Lossless Data Compression on
CUDA”, 2011. IEEE International Conference on Cluster Computing, Austin,
USA, 2011, pp. 403-411. doi: 10.1109/CLUSTER.2011.52. [Online]. Available:
https://ieeexplore.ieee.org/document/6061071, Accessed on: 2021-02-12.

[22] ”Ensure access to affordable, reliable, sustainable and modern energy”,
2020. United Nations Sustainable Development Goals. [Online]. Available:
https://www.un.org/sustainabledevelopment/energy/, Accessed on: 2021-05-
17.

[23] ”UN highlights urgent need to tackle impact of likely electric car bat-
tery production boom”, 2020. United Nations News [Online]. Available:
https://news.un.org/en/story/2020/06/1067272, Accessed on: 2021-05-17.

28


	Glossary
	Introduction
	Problem Statement
	Problem Domain
	Software
	Hardware

	Thesis Outline

	Theory
	Differential Compression
	Differencing
	BSDiff

	Patching

	System Limitations
	Limited Memory
	Flash Memory

	Solutions
	Patch Location
	Source image location
	Compression


	Method
	Algorithm Design Choices
	Selecting an Algorithm
	Data Locations
	Patch Metadata
	Summary

	Algorithm descriptions
	Detools
	Heat-shrink

	Procedure
	Detools Porting Strategy
	Demonstrator Setup
	Main Application Setup
	Testing Strategy

	Development Environment and Tools
	Tools
	Scripts


	Results
	Data structure
	Main Function
	Initialization Functions
	Callback Functions
	Tests

	Conclusion
	Discussion
	Application Design
	Test Results
	Ethical and Environmental Perspectives

	Implications and Consequences
	Topics of Further Work
	Effects of Compiler Optimizations
	Encryption Management

	Summary

	Bibliography