Integrating Generative AI Tools into Software Development

A case study overseeing the adoption of generative AI tools in the software
development industry

Master’s Thesis in Computer Science and Engineering

TIM NIEMEIJER

Department of Computer Science and Engineering,
CHALMERS UNIVERSITY OF TECHNOLOGY AND UNIVERSITY OF GOTHENBURG
Gothenburg, Sweden 2025
www.chalmers.se



© TIM NIEMEIJER, 2025.

Supervisor: Aris Alissandrakis, Computer Science and Engineering
Examiner: Miroslaw Staron

Master’s Thesis 2025
Department of Computer Science and Engineering,
Software Engineering and Technology
Chalmers University of Technology and University of Gothenburg
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Typeset in LaTeX, template by Kyriaki Antoniadou-Plytaria
Gothenburg, Sweden 2025



Abstract
Generative AI (GenAI) and large language models (LLMs) have become popular
in recent years for their versatility, usefulness, and ease of use. Now, these GenAI
tools are being adopted for more professional use cases, such as software
development. The case study presented in this thesis took place from March to
August 2025 at a large international IT company as it tested and adopted Google’s
GenAI tool Gemini across the organization, with a special focus on the software
development department. This case study examines different aspects of the integration
and adoption of GenAI tools in a larger organization, as well as how developers view
and trust these tools in a software development context. The results showed that
both the perceived usefulness and perceived ease of use of generative AI tools for soft-
ware development are high, and that the vast majority of developers perceive their
productivity, development velocity, and problem-solving speed to all increase over a
four-month period when using GenAI tools. As a product of this study, a concep-
tual model was developed based on collected data and adoption theory to show how
different factors influence each other in the selection and adoption of GenAI tools.
Tool selection from an organizational perspective showed that cost was of secondary
importance to compliance, productivity, and integration into existing workflows.

Key words and phrases: Generative AI, Large Language Models, Technology
adaptation, Software development industry, Case Study.



Contents

1 Introduction
1.1 Motivation and Context
1.2 Problem Statement
1.3 Research Questions

2 Literature Review
2.1 Generative AI in software development
2.1.1 Accelerating development and reducing effort
2.1.2 Effectiveness depends on human oversight and prompt engineering
2.1.3 Most value as supportive collaborators
2.2 Collaboration
2.2.1 Traditional Collaboration
2.2.2 Human-AI Collaboration
2.3 Trust and Interaction
2.4 Managing Change
2.5 Organizational Adoption

3 Methodology
3.1 Context and rationale
3.2 Literature selection
3.3 Surveys
3.4 Interviews
3.4.1 Thematic Analysis

4 Results
4.1 Survey results
4.1.1 Initial survey
4.1.2 Closing survey results
4.2 Interviews results
4.2.1 Interview with management
4.2.2 Interviews with developers
4.2.2.1 Tester Interview
4.2.2.2 Full-stack/Backend Developer Interview

5 Discussion

6 Conclusion

7 Use of AI

A Appendix 1 - initial survey data

B Appendix 2 - closing survey data

C Appendix 3 - management interview
C.1 Interview guide/questions
C.2 Thematic analysis

D Appendix 4 - Developer interview guide


List of Figures

4.1 Initial survey responses for the most time-consuming tasks during development
4.2 Comparison of initial and closing survey responses
4.3 Comparison of initial and closing survey responses regarding GenAI tool perception
4.4 Closing survey responses
4.5 Conceptual model of GenAI adoption


List of Tables

2.1 Release dates of the top three AI tools for developers
4.1 Mapping of the questions from the two surveys


1 Introduction

This chapter explains the motivation and context of the thesis. It also defines the
research objectives and research questions to make the goals of the thesis clear.

1.1 Motivation and Context
In recent years, the number and popularity of AI tools have increased dramatically.
Today, few people remain unfamiliar with tools such as ChatGPT (OpenAI, 2025),
GitHub Copilot (GitHub, 2025), or Google’s Gemini (Google, 2025). These
AI technologies have evolved significantly, from simple web-based chatbots to
sophisticated systems now being tested or directly integrated into various industries,
including automotive (Soegoto et al., 2019), medicine (Rajpurkar et al., 2022),
agriculture (Liu, 2020), and now also software development (Rajbhoj et al., 2024;
Choudhuri et al., 2024a; Li et al., 2024).

The emergence of GenAI tools marks a shift not just in what software can do, but
in how software is built. Tools like GitHub Copilot and Gemini are being actively
embedded into software development workflows, influencing how developers write,
review, and reason about code. Barke et al. (2023) observed that developers use
AI coding assistants in two major ways: to accelerate routine tasks and to explore
alternative implementation strategies—highlighting both productivity support and
creative augmentation. In contrast, Vaithilingam et al. (2022) and Pearce et al.
(2022) found that developers’ expectations of GenAI tools often diverge from their
actual experiences, citing concerns such as over-reliance, diminished code
comprehension, and the need for continuous oversight. These findings reflect a shift toward
GenAI as a collaborator in the development process—still earning developers’ trust
and learning to meet their expectations.

While GenAI tools promise efficiency and creative support, they also introduce new
complexities into the development process. Developers must navigate uncertainty
around the correctness, explainability, and intent of AI-generated code. Moreover,
the collaborative dynamic between developer and AI is not yet well understood:
issues of trust, accountability, and appropriate reliance continue to surface as key
factors shaping effective adoption. These challenges highlight the need for deeper
investigation into how developers integrate, trust, and adapt to GenAI tools within
real-world workflows.


Understanding how developers adopt and interact with AI tools also requires in-
sight from established theoretical frameworks. The Technology Acceptance Model
(TAM) proposed by Davis (1989) and recent Human-AI Interaction research from
HCI literature such as Yang et al. (2020) provide essential foundations for analyzing
user perceptions, acceptance, and trust toward emerging AI technologies. Addition-
ally, issues of transparency and bias in AI decision-making highlighted by Binns
et al. (2018), Eiband et al. (2019), and Jacovi et al. (2021) underscore the critical
importance of trust as a pivotal factor influencing the effectiveness of human-AI
collaboration.

This thesis investigates the integration of GenAI tools into the software development
industry, focusing on a real-world case study of a large software development
organization adopting Google’s Gemini. The study addresses the complexities of managing
technological change, understanding trust in AI-assisted development, and navigat-
ing the evolving dynamics of human-AI collaboration. The observation period of the
study spanned from mid-March to mid-August 2025. This research provides insights
into how developer interaction with GenAI tools influences developers’ perceptions,
trust, creativity, autonomy, and collaboration patterns, as well as how organizations
determine their main selection criteria and roll-out strategies when adopting
GenAI tools.

This study not only contributes to the academic understanding of GenAI adoption
but also offers practical guidance for software engineering practitioners and organi-
zations facing similar integration and adoption challenges. The following sections
outline the main research objective and present the research questions that guide
the study.

1.2 Problem Statement
In today’s software development industry, the following problems have been identi-
fied:
Decision problem (management lens): Organizations lack evidence-based
insight into the selection and roll-out of GenAI coding tools that balance expected
productivity gains with legal/IP risks, IDE/infrastructure integration, cost, and
developer trust.
Practice gap (developer lens): There is no clear, empirical picture of how devel-
opers use GenAI (which tasks, when, and how), how they perceive it (tool, assistant,
peer, etc.), or how this shapes productivity, autonomy, and confidence.
Context gap (organization): There is limited evidence on developer perceptions
of GenAI—ease of use, usefulness, and trust—in low power-distance organizations,
which is needed as a baseline to judge adoption pathways and outcomes in such
contexts.

The goal of this study is to provide an integrated, evidence-based view that
connects management drivers and governance with actual developer practices and
perceptions, offering a theoretical approach with a practical example to guide the
selection and adoption of GenAI tools in software development.

1.3 Research Questions
To address the problems highlighted in Section 1.2, the following research
question has been formulated:

How do developers perceive and integrate GenAI into their workflows,
what adoption-and-trust patterns emerge, and which organizational
factors drive tool selection and adoption in large software companies?

To facilitate answering this broad question, it is divided into the following sub-
questions:

RQ1.1 How do developers perceive GenAI’s role in their development workflow — as
a tool, assistant, peer, competitor, or something else?

RQ1.2 How do developers perceive the usefulness and reliability of GenAI tools in
supporting their coding tasks?

RQ1.3 In what ways are developers integrating GenAI into their daily workflow, and
for which types of tasks is it most commonly used?

RQ1.4 What factors influence developers’ trust in and willingness to rely on GenAI-
generated code?

RQ1.5 How do developers perceive GenAI tools as affecting their productivity, au-
tonomy, and confidence in solving coding problems?

RQ1.6 What are the main drivers of the adoption of GenAI tools in software development,
from a management perspective, in large software development organizations?



2 Literature Review

This chapter reviews the prior literature relevant to the case study and to other
companies or organizations looking to adopt or integrate GenAI tools for software
development.

2.1 Generative AI in software development
Ever since the large GenAI tools were released (see Table 2.1), people have been using and
applying them to almost everything (starting a business, acting as personal trainers that
generate workout schedules and training programs, writing song lyrics, etc.). Since
2021, when OpenAI and GitHub released GitHub Copilot, coding and software
engineering practices have been included in the list of jobs where GenAI tools are
available to aid and assist. In the past three years, since GitHub Copilot came
out, ChatGPT and later Google’s Gemini have also released new versions that can
aid in coding and software development, and these three tools are among the most
used for software development (Olavsrud, 2025). Since these tools were released,
claims have continuously been made that they improve productivity,
creativity, and more (Section 2.1.1). However, the tools have barely been around
long enough to be properly integrated into industry. Even though
scientific and empirical evidence exists for the use of these tools in the software
development industry, the amount of available research on the topic is
still limited, and major companies may still want more proof
before adopting them, although other underlying reasons may also exist.

GenAI Tool        Released
GitHub Copilot    2021, October (GitHub, 2021)
ChatGPT           2022, November (Southern, 2024)
Google Gemini     2023, December (Google, 2023)

Table 2.1: The release dates for three of the biggest AI tools for developers
(Olavsrud, 2025)

2.1.1 Accelerating development and reducing effort
Generative AI tools have the potential to reduce skill barriers and learning curves,
thereby accelerating software development substantially (Rajbhoj et al., 2024). LLMs
are especially beneficial in the early stages of software development projects, and by
generating structural application code, like service layer code and user interfaces,
they can provide a boost in initial application development (Rajbhoj et al., 2024;
Rasnayaka et al., 2024). Furthermore, in a controlled experiment, developers using a
GenAI tool completed a programming task 55.8% faster than the control group, with a
reported 95% confidence interval of 21% to 89% (Peng et al., 2023). This is also
supported by Rajbhoj et al. (2024), who recorded a productivity increase of
roughly 70%. While speed improvements are widely reported, Barke et al. (2023)
note that developers also use GenAI tools for exploring alternative implementations
and debugging—activities that may not always reduce time but do reduce cognitive
load and development effort. However, not all studies report clear-cut productivity
gains. Vaithilingam et al. (2022) observed a gap between developers’ expectations
and actual experience, citing usability challenges and the need for careful validation
of AI-generated code, which reduces the productivity gain.

2.1.2 Effectiveness depends on human oversight and prompt engineering

The quality, correctness, and effectiveness of the response from the GenAI tool are
dependent on the quality and precision of the prompt given to the tool (Rajbhoj
et al., 2024; Pinto et al., 2024). Effective prompting and effective validation of the
generated code both require the involvement of a subject matter expert (Rajbhoj
et al., 2024). In the current state, people still need to supervise GenAI and validate
its outputs to ensure the tool produces a suitable solution and to limit the
propagation of possible errors or unwanted functionality (Rajbhoj et al., 2024; Kalliamvakou, 2024;
Vaithilingam et al., 2022). Barke et al. (2023) found that developers often rely on
iterative prompting and interpretation to shape useful responses.

2.1.3 Most value as supportive collaborators
AI tools and agents, by lowering the cognitive burden on developers, allow devel-
opers to focus more on so-called system thinking while prompting, supervising, and
validating what the tools are producing (Kalliamvakou, 2024). It has also been
shown that LLMs in their current state are mainly useful as assistants or tools, and
they have not yet reached a level where they could replace human developers (Coello
et al., 2024). Vaithilingam et al. (2022) found that developers often went in with
high expectations of autonomy from the AI but quickly learned that human input
and supervision remain essential, a finding supported by Barke et al. (2023), who argue
that developers treat Copilot more like an interactive collaborator than a replacement.


2.2 Collaboration
In this section, different types of collaboration are discussed. The main focus
is on 1) traditional collaboration (also referred to as human-human collaboration),
since this is the standard collaboration type for software development;
and 2) human-AI collaboration, since this is the shift seen in the
industry over the past decade, and more recently following the release of GenAI for
software development.

2.2.1 Traditional Collaboration
Collaboration is commonly defined, for example by the Oxford dictionary, as "the
act of working with another person or group of people to create or produce
something" (Dagli, 2023; Press, 2025), which in software development has traditionally
meant working either with other people or with rubber ducks¹. Besides working together to create
or produce something, this can also be extended to reaching a common goal of some
sort. For software development specifically, this could be something as simple as
getting help in producing a line of code, or something as complex as successfully
running an application live in the cloud as a team or company. As discussed by
Apicella and Silk (2019), cooperation and collaboration among humans have
been around for thousands of years and are one of the main reasons for the success
and survival of the human race. Collaboration is an important quality in current
human society, increasing motivation, task attention, and task persistence (Carr and
Walton, 2014), and it helps improve the future of society through the benefits of
cognitively diverse collaboration (Ultimo, 2024).

With the rapid development of computer hardware, computing, and AI on the rise,
we are starting to see trends of asking and collaborating with AI agents instead of
colleagues. This is what is paving the way for Human-AI collaboration.

2.2.2 Human-AI Collaboration
Human-AI collaboration is trending now with all the development and progress
of GenAI tools such as the previously mentioned ChatGPT, Gemini, and Copilot.
That being said, the ideas and theories behind human-AI collaboration date back
to at least the 1960s. J.C.R. Licklider was one of the first people to discuss the
possible symbiosis between man and machine in his article Man-Computer Sym-
biosis (Licklider, 1960). Already in 1960 he spoke about the future development
of cooperation and collaboration of people and computers, and reflected that “The
basic dissimilarity between human languages and computer languages may be the
most serious obstacle to true symbiosis.” There have been discussions at IT forums
and conferences, and among experts, about whether AI would take over the job of
software development. More recent discussions and research, however, point towards
collaboration between people (in this case software developers) and GenAI (Licklider,
1960; Wang et al., 2020). In terms of guidelines for human-AI interaction, and the
design of AI from that perspective, Amershi et al. (2019) conducted a study that
resulted in a thorough literature review on how to design AI from a human-computer
interaction perspective, as well as 18 guidelines for how AI should interact with
people and vice versa.

¹ Rubber ducking / rubber duck debugging: a technique where a software developer explains
their code, line by line, to an object (most commonly a rubber duck).

2.3 Trust and Interaction
Generally speaking, the level of trust between people differs from the level of trust
between people and computers or AI. System quality, output quality, and functional
value greatly influence developers’ trust in AI tools (Choudhuri et al., 2024b). However,
GenAI for software development has not been around long enough to have gained the
trust of developers. This is why developers feel that they have to supervise and guide
the tools (Kalliamvakou, 2024), and why the effectiveness and quality of what these
GenAI tools produce are still considered too dependent on the precision and
appropriateness of the prompt formulation (Pinto et al., 2024; Choudhuri et al., 2024b).
Although rumors about AI taking software developers’ jobs in the future were (and
still are) circulating in the industry (Horowitch, 2025; Times, 2025; Abril and O’Donovan,
2025), most research seems to point towards AI becoming more of a collaborator
or assistant (Kalliamvakou, 2024; Coello et al., 2024; Licklider, 1960; Pinto et al.,
2024). While Licklider (1960) assumed that people would remain in full control
with AI in a supporting role, Wang et al. (2020) mention the possibility of AI
assuming a more dynamic and participatory role. Furthermore, Eiband et al. (2019)
emphasize that transparency in AI-generated decisions is key to building trust,
reinforcing the role of human judgment in validating GenAI outputs.

2.4 Managing Change
When discussing change in an organization or in its way of working, two similar
yet distinct terms are relevant: change management and management of change.
Change management focuses on the issues and difficulties involved in altering working practices,
addressing challenges such as preparing for change, implementing it, and managing
resistance. In contrast, management of change refers to a more systematic approach
for evaluating change, including activities like risk assessment.
For change management and implementing change in organizations, one of the first
influential works was the article by Kotter and Schlesinger (1979) on choosing
strategies for change, which defined a 6-step model for dealing with resistance to
change. Bridges (1991) later published a book on managing transitions that defined
3 stages of transitioning, namely 1) ending, losing, and letting go, 2) the neutral
zone, and 3) the new beginning. These three stages became a basis for
organizational change and leadership development. Later on, Kotter revisited the
earlier models, combining his 6-step model with, for example, Bridges’ 3
stages, and presented an 8-step change model in 1996. The 8-step model has
since become a widely used standard that has been further updated over the years (Kotter,
2012). His approach combines planning, implementing, highlighting internally, and
anchoring changes through the 8 steps defined in his works. All three of these pieces
of literature are still relevant in today’s industries and form a large part of the basis
for the subject of change management, which is why they are recommended by
organizations such as MIT Human Resources (2025).

In contrast to change management, management of change is a more risk-focused
approach that is used mainly prior to implementation to make sure that no new
hazards are introduced and to ensure safety and the mitigation of high risks (Faulk and
da Fonseca, 2022; Mullins, 2024).

2.5 Organizational Adoption
At the organizational level, the Technology–Organization–Environment (TOE) frame-
work explains adoption through three areas: the technology itself, the organization,
and the external environment (Oliveira and Martins, 2011). Prior work also shows
that top-management support, business alignment, readiness (skills, budget, infras-
tructure), governance/risk (e.g., legal/intellectual property (IP)), and external pres-
sures (vendors, competition, regulation) strongly affect whether interest turns into
actual use (Ali et al., 2022). Culture can further shape the path and speed of
adoption: Lee et al. (2013) show that in low-power-distance, more individualist, and
lower-uncertainty-avoidance settings, adoption relies more on early autonomous in-
novation than on imitation, favoring pilot champions and decentralized experimen-
tation.



3 Methodology

This study employs a mixed-methods approach, combining case-study methodology
guided by the framework presented by Runeson and Höst (2009) with surveys and
interviews to conduct an empirical investigation. With a single-case design and
literal replication logic, this study aims to provide insight into this real-world case of
an organization adopting a GenAI tool for software development. The case-study
design is further strengthened by utilizing methods and theories by Robson and
McCartan (2016), Kotter (2012), and Yin (2018).

Kotter (2012) argues that a true mixed-method design both increases overall understanding
and mitigates the inherent weaknesses of single-method studies: quantitative
data offers breadth, while qualitative data provides the necessary depth into context
and meaning. Runeson and Höst (2009) further recommend employing multiple
data-collection procedures to enhance construct validity¹ and provide complementary
perspectives on software-engineering phenomena. In addition, Yin (2018)
shows that triangulating multiple sources of evidence deepens insight into complex,
contextualized events and strengthens internal validity² through converging lines of
inquiry. Therefore, this mixed-method approach uses quantitative measures (sur-
veys) for a broad view of adoption patterns, and qualitative techniques (interviews,
observations) for contextual insight, enabling a triangulated, empirically robust in-
vestigation of a real-world GenAI test rollout (Robson and McCartan, 2016; Runeson
and Höst, 2009; Yin, 2018).

Yin (2018) further reflects upon reliability, the ability of another researcher
to reproduce the procedures and findings, and external validity, the extent to which
the findings can be transferred or generalized to other settings or populations. In this
regard, Yin distinguishes between single-case and multiple-case designs: the former provides
detailed, in-depth insights into a particular phenomenon, while the latter enables
analytic generalization through cross-case comparison. Replication logic for case
studies can accordingly be divided into two categories: literal and theoretical
replication. Literal replication is when replicating the case study would be expected to
produce similar results, demonstrating the reliability of findings by repeating the
study under the same or similar conditions. Theoretical replication is when
additional cases are selected specifically to produce different results, to show how
different contexts and variations lead to different outcomes. By combining the choice
of case count with replication logic, researchers can confirm core propositions and explore
boundary conditions, striking a balance between depth and breadth in empirical
studies.

¹ Construct validity: "measurement accuracy"; concerns the fit between theory and measurement.
² Internal validity: "causal inference"; whether observed changes can be attributed to the
studied intervention rather than to surrounding factors.

This study draws on two complementary models. The first is the Technology Acceptance
Model (TAM) by Davis (1989), which explains how perceived usefulness and perceived ease
of use drive adoption intentions. Survey items and interview probes are mapped
to these constructs, ensuring alignment with a well-validated theory of technology
uptake. The second is the human–AI trust framework by Jacovi et al. (2021), which offers
a multidimensional view of trust in AI systems, encompassing predictability, reliability, and
transparency. Incorporating these dimensions into the instruments guides both
the phrasing of questions and the interpretation of developers’ responses through an
AI-specific perception and trust lens.
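
The mapping of survey items to TAM constructs described above can be illustrated with a minimal sketch. The item identifiers (q1 to q4), the 1-5 Likert scale, and the example responses are hypothetical and are not taken from the instruments or data of this study; the sketch only shows one possible way to aggregate item answers into construct scores.

```python
from statistics import mean

# Hypothetical mapping of survey item IDs to TAM constructs
# (assumption for illustration; not the actual survey items of the study).
TAM_ITEMS = {
    "perceived_usefulness": ["q1", "q2"],
    "perceived_ease_of_use": ["q3", "q4"],
}


def construct_scores(response):
    """Average one respondent's Likert answers (assumed 1-5) per TAM construct."""
    return {
        construct: mean(response[item] for item in items)
        for construct, items in TAM_ITEMS.items()
    }


def sample_scores(responses):
    """Average the per-respondent construct scores across all respondents."""
    per_respondent = [construct_scores(r) for r in responses]
    return {
        construct: mean(scores[construct] for scores in per_respondent)
        for construct in TAM_ITEMS
    }


# Made-up example responses, for illustration only.
responses = [
    {"q1": 4, "q2": 5, "q3": 3, "q4": 4},
    {"q1": 5, "q2": 4, "q3": 4, "q4": 5},
]
print(sample_scores(responses))
```

Averaging per respondent first keeps each respondent's weight equal even if constructs are later measured with different numbers of items.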

3.1 Context and rationale
This study is a single-case study of a large Sweden-based technology
company, referred to as "the company", which is a leader within its industry. The
company employs over 3,000 people, including more than 400 in-house software de-
velopers. It builds and maintains large-scale, high-availability applications used by
millions of end-users daily, both for its own brands and through partnerships with
other businesses. Given the scale and complexity of its systems, any technological
or process change within the development organization is likely to have a significant
impact.
The case was selected based on a critical-case rationale from Yin (2018), with the or-
ganization representing a strategically important example of GenAI adoption in the
software industry. At the time of the study, the company had just begun integrat-
ing GenAI tools in its software development processes, offering a unique opportunity
to observe the early stages of organizational adoption. This timing coincides with a
period in which many companies face the strategic decision of whether
to adopt GenAI technologies or risk being outpaced by competitors. The case is
thus particularly well-suited to generate insights relevant to both practitioners and
researchers.
Several characteristics make the organization a compelling subject for this research.
Its size and maturity imply that the selection and integration of GenAI must be
supported by comprehensive preparations—not only technical, but also legal, ethical,
and procedural. This distinguishes the case from early-stage startups or informal
trials, positioning it as a high-validity example of enterprise-level adoption.
The development teams within the company vary in size and specialization, with
a mix of backend-focused groups (primarily using Java), frontend teams (working
mainly with React), full-stack developers, infrastructure-focused teams, and
others. As part of the GenAI rollout, a browser-based AI assistant was made available
to all developers, enabling widespread, low-barrier experimentation. In parallel,
more deeply integrated GenAI tools, embedded directly in integrated development
environments (IDEs), were deployed selectively to a testing group of 40 developers
tasked with piloting and evaluating these capabilities.
This study adopts Yin’s longitudinal single-case design, investigating how develop-
ers in a large, mature software organization perceive and engage with generative AI
tools over time. The case is both critical—as the organization represents a strate-
gically important and high-validity example of enterprise-level AI adoption — and
revelatory, providing timely access to early integration stages. Developer attitudes
were captured through two surveys: an initial baseline survey conducted during the
early rollout phase and a follow-up survey after ∼4 months of use. The first survey
focused mostly on the developers’ views on GenAI for coding, and the second one, in
addition to views and engagement, was guided by the TAM, focusing on perceived
usefulness and ease of use. This design aligns with Yin’s guidance on longitudi-
nal case studies, as it enables a clearer view of how developer perceptions change
over time during the adoption process. Studying these changes across multiple time
points also helps strengthen the internal validity of the findings.
This study used a mixed-methods approach combining quantitative surveys and
qualitative interviews to capture both broad trends and in-depth insights into the
company’s adoption of GenAI tools in software development.

3.2 Literature selection
A mixed approach was used for finding relevant literature. First, established academic
databases, such as IEEE Xplore, arXiv, and the ACM Digital Library, were queried
with keywords. Titles and abstracts were then
scanned to assess relevance, after which selected articles were skimmed to evaluate their
methodology, use of theory, and alignment with the thesis topic. Literature that
was frequently cited or published in reputable journals was prioritized due to its credibility.
Reference lists from key papers were investigated to uncover additional sources
that may not have appeared in initial database searches, a technique known as snowballing.

In addition to traditional searching using keywords in conventional databases, gen-
erative AI tools like ChatGPT and Gemini were used as supplementary resource
finders to broaden the search and identify additional relevant literature.
When using AI to search for literature, a focused and iterative prompting strategy
was used to enhance the relevance of literature suggestions. The search process typ-
ically began with broad thematic queries like “academic papers on AI in software
engineering” or “literature on generative AI for coding in the industry” to get suggestions
of articles, books, and related sources. Each prompt would
generally produce between 5 and 10 suggestions, of which on average 1 to 3 would
appear relevant. Seemingly relevant suggestions were then further
investigated with more targeted prompts, querying summaries, key findings,
alignment with other papers, or citations for the selected literature.
If a piece of literature still seemed relevant after this iterative prompting, it was
looked up in the respective database (IEEE Xplore, arXiv, etc.) and
skimmed to determine whether the information provided by the AI was accurate
and whether it suited this study. Known researchers or article titles were occasionally
referenced to guide the AI more effectively, an example being "please find 5 pieces
of academic literature similar to Jacovi et al. (2021)". To ensure academic credi-
bility, all AI-suggested sources that appeared relevant (and reflected what AI had
summarized about them) were manually verified in their respective databases for
authenticity, publication status, and where they were published. Priority was given
to peer-reviewed articles and works published by recognized academic publishers
or conferences. Sources with high citation counts or referenced in other scientific
publications were favored to strengthen the reliability of the literature base.

With this mixed approach of prompting AI and searching for keywords in databases,
more than 200 pieces of literature were suggested or found. Most of the AI suggestions
were immediately filtered out as irrelevant to this study; for
example, many of the suggested studies concerned how to develop AI.

3.3 Surveys
Surveys are this study’s main source of quantitative data. Yin (2018) identifies ques-
tionnaires as a primary evidence source in case studies and recommends using them
alongside interviews for triangulation. Two rounds of surveys were conducted with
all approximately 200 developers invited to answer via relevant company-wide Slack
channels. Participation was voluntary and anonymous; 50 developers responded to
the first (initial) survey and 40 to the second (closing) survey. Of those in the sec-
ond survey, 52.5% reported having taken the initial survey as well, but individual
changes could not be tracked over time due to the fully anonymous design. The
surveys were sent out to all developers, meaning respondents potentially spanned
all teams (over 20) and all levels of seniority, closely matching the company’s overall
demographics.
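As a quick arithmetic check on the participation figures above, the following sketch computes approximate response rates and the number of returning respondents. It assumes an invited population of exactly 200 (the text only says approximately 200), so the rates are rough; the constant and function names are illustrative.

```python
# Participation figures taken from the text. INVITED is an assumption,
# since the thesis only states "approximately 200" developers.
INVITED = 200
INITIAL_RESPONDENTS = 50
CLOSING_RESPONDENTS = 40
RETURNING_SHARE = 0.525  # share of closing respondents who also took the initial survey


def response_rate(respondents: int, invited: int) -> float:
    """Return the response rate as a percentage of invited developers."""
    return 100 * respondents / invited


initial_rate = response_rate(INITIAL_RESPONDENTS, INVITED)
closing_rate = response_rate(CLOSING_RESPONDENTS, INVITED)
returning = round(RETURNING_SHARE * CLOSING_RESPONDENTS)

print(f"Initial survey: {initial_rate:.1f}% response rate")
print(f"Closing survey: {closing_rate:.1f}% response rate")
print(f"Returning respondents in closing survey: {returning}")
```

Because the surveys were anonymous, the returning respondents can only be counted in aggregate; their individual answers cannot be paired across the two rounds.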
Survey items were adapted from established TAM and trust-in-AI scales, pilot-tested
with 3 developers, and judged clear and reliable based on the pilot participants’
feedback on the survey. The initial survey assessed prior GenAI experience, trust in
AI-generated outputs, and expectations of its role in software development. Although
not explicitly labeled as TAM constructs, several items directly measured perceived
usefulness and perceived ease of use.
To enable direct comparison between surveys, four questions were repeated in the
second survey. Three of these were slightly rephrased to fit updated wording and
format, while preserving their underlying constructs. This approach provided a
limited basis for cross-sectional comparison of key attitudes.

3.4 Interviews
Interviews are this study’s main source of qualitative data. Yin (2018) identifies
interviews as a primary evidence source in case studies and recommends open-ended
“how” questions to elicit rich, process-oriented narratives without putting the inter-
viewee on the defensive. Runeson and Höst (2009) complement this by describing
three levels of structure—unstructured, semi-structured, and fully structured—and
suggest following specific patterns for the order of the questions (funnel, pyramid,
hourglass) to balance comparability with exploration.
The interviews serve to complement and triangulate the findings in combination with
the surveys, offering a richer understanding of the organization’s decision-making
processes and the contextual factors influencing adoption. This aligns with the
recommendations of Yin (2018) and Robson and McCartan (2016) to use interviews as a
core source of qualitative evidence in case study research and as a way to enhance
internal validity through a mixed-method approach.

To gain deeper insight into the company’s strategic rationale and governance ap-
proach to GenAI adoption, one semi-structured interview was conducted with 2
people from middle management, responsible for evaluating and implementing AI
tools across the development organization. That interview was conducted remotely,
recorded with consent, and transcribed by Gemini for analysis. Only one interview
was conducted with management, involving the two primary stakeholders respon-
sible for AI tool adoption and selection. Including both decision-makers in the
same session allowed them to discuss, support, and correct each other in real time.
This joint format targeted the core voices in the adoption process and helped mit-
igate individual bias by enabling each stakeholder to validate or refine the other’s
responses. The interview with management covered areas such as tool evaluation
criteria, organizational goals, compliance concerns, perceived gains in productivity,
training practices, and anticipated future roles of AI in the software lifecycle. These
topics were shaped both by exploratory aims and by the TAM, particularly focus-
ing on perceived usefulness and ease of use—core constructs for understanding the
developer-facing tool adoption (Davis, 1989). Additionally, questions related to gov-
ernance, legal risk, and trust in AI-generated code were informed by the human–AI
trust framework proposed by Jacovi et al. (2021), which emphasizes transparency,
reliability, and predictability in AI systems.

To broaden perspectives beyond management, a senior tester and a senior full-
stack/backend developer were interviewed. They were selected on the basis of having
responded to at least the closing survey as well as volunteering for the interview. No
incentives were offered. All interviewees received information regarding the topics
of the interview beforehand and gave informed consent to recording, transcription,
and anonymized quotation. Consistent with Runeson and Höst (2009) and Yin
(2018), confidentiality was assured through pseudonyms and removal of identifying
details, and participants could withdraw at any time. For all conducted interviews,
participants were offered member checks to verify factual accuracy and anonymity;
interpretive judgments remained the researcher’s responsibility (Robson and Mc-
Cartan, 2016).
Following the guidance of Runeson and Höst (2009), a semi-structured format was used
in all interviews to ensure consistency across key topics while allowing flexibility to
explore emergent themes. No specific question-order pattern (e.g., funnel, pyramid,
hourglass) was enforced; instead, the draft interview outline and questions were
sent to the interviewees in advance to read through and raise questions. They
reviewed the questions for clarity, provided feedback on wording and sequence, and
suggested minor adjustments to ensure correct understanding from their perspective.
Incorporating their input before the actual session helped balance structure and
depth, reduced misunderstandings, and allowed the interviewees to prepare necessary
information before the interview. This preparation also left more room during the
session to respond to and capture unanticipated insights.

3.4.1 Thematic Analysis
To systematically analyze the qualitative data obtained from the management in-
terview (transcript), a step-by-step thematic analysis process, proposed by Naeem
et al. (2020), was employed. This approach provides a clear pathway from raw text
to conceptual model, ensuring both rigor and transparency.

Following these stages, the thematic analysis for the management interview looked
as follows:

1. Data Familiarization: The full interview transcript was read multiple times
to immerse the researchers in the data and note initial impressions.

2. Generating Initial Codes: Relevant segments of text — phrases or sen-
tences reflecting distinct ideas about AI adoption — were systematically coded
and a total of 48 relevant codes were defined (see end of Appendix C), an ex-
ample being: “usability of it is rather important.” Codes were added until code
saturation was reached.

3. Searching for Themes: The 48 codes were reviewed and redundant codes
were merged, yielding 18 refined codes that fed into the
themes. The 18 refined codes were then categorized into 9 candidate themes
based on conceptual similarity, yielding preliminary themes such as strategic
alignment, training events, and perceived usefulness and ease-of-use.

4. Reviewing Themes: Through an iterative process, the 9 candidate themes
were reviewed against the data and the entire transcript to ensure coherence
and distinctiveness. Overlapping or under-supported themes were merged or
discarded. Through 3 iterations, the 9 themes were reduced to 6 and then to 4,
which are presented in Section 4.2.1. An example of a theme that was discarded
is training events.

5. Defining and Naming Themes: Remaining themes were clearly defined
and named, with concise descriptions of their core meaning and boundaries in
relation to our research questions.

6. Developing a Conceptual Model: Finally, the finalized themes were organized
into a conceptual model (Figure 4.5) illustrating how factors such as
developer experience, legal constraints, productivity motives, and governance
considerations interact to shape generative AI adoption decisions.

By following the step-by-step process proposed by Naeem et al. (2020), we ensured
that the analysis remained tightly grounded in the data while building a coherent
conceptual model that connects empirical findings to theory. Moreover, adhering
to a transparent, reproducible methodology allows future researchers to replicate or
extend this study, thereby validating or challenging its conclusions. Furthermore,
interpretation of themes and the construction of the conceptual model draw on three
firm-level perspectives: the Technology–Organization–Environment (TOE) lens to
structure the context (Oliveira and Martins, 2011); the Ali et al. (2022) lens to
label key organizational drivers (management support, alignment, readiness, gover-
nance/risk, external pressures); and the Lee et al. (2013) lens to consider culture as
a moderator of adoption pathways. These lenses offer a clear scaffold for situating
the qualitative themes and linking them to the research questions, while avoiding
any claims about results at this stage. Combined with the step-by-step thematic
analysis process, they also make it straightforward to couple the findings to
relevant literature.



4
Results

This chapter presents and analyzes the data gathered in the study.
This includes the interview data, a conceptual model based on the theories highlighted
in earlier chapters and on data from the management interview, as well as a
representation of the survey data, including a comparison of the overlapping themes
and questions.

4.1 Survey results
The study included two surveys: one at the release of the GenAI tool in the
organization (the initial survey) and one after ∼4 months of tool availability and
usage (the closing survey). Both surveys were sent out through internal communication
channels to the developers throughout the company.

4.1.1 Initial survey
The initial survey, which aimed to capture developers’ experience with, trust in, and
views on GenAI, was sent out through internal channels in the company in April,
roughly 2 weeks after access to Gemini had been rolled out in the Integrated
Development Environment (IDE) (pilot group only) and in the browser (entire
organization).

[Bar chart. Votes per task: Planning 25, WritingNewCode 23, Debugging 20, Refactoring 14, Testing 14, Documenting 2. Horizontal axis: responses (n=50, max 2 votes per respondent), scale 0 to 30.]

Figure 4.1: Responses for most time-consuming tasks during development, initial
survey (n=50, max 2 votes per respondent).

The initial survey had 50 respondents. 80% of the respondents
were male, 14% female, and the remaining 6% preferred not to answer. Locations
of the respondents were divided between Sweden (68%), Greece (12%), India (6%),
Uruguay (6%), Canada (4%), and the remaining 2% did not answer.
The respondents were asked what tasks they typically spend the most time on during
development to gauge where GenAI could possibly have a big impact on time saving.
Each respondent could select a maximum of two of the options. It was possible to
add answers; however, none of the participants chose to do so. The results are
shown in Figure 4.1: developers perceive planning as the most time-consuming
phase of the development process (25 votes), followed by writing new code and
debugging in second and third place.
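Because each respondent could cast up to two votes, the counts in Figure 4.1 are best read as shares of respondents rather than shares of votes. The sketch below illustrates this with the counts reported in the figure; the assignment of the two 14-vote bars to Refactoring and Testing is interchangeable, and the function name is illustrative.

```python
# Vote counts as reported in Figure 4.1 (initial survey, n=50,
# each respondent could select at most two tasks).
RESPONDENTS = 50
MAX_VOTES_PER_RESPONDENT = 2
votes = {
    "Planning": 25,
    "WritingNewCode": 23,
    "Debugging": 20,
    "Refactoring": 14,
    "Testing": 14,
    "Documenting": 2,
}


def share_of_respondents(counts, respondents):
    """Percentage of respondents who selected each option."""
    return {task: 100 * n / respondents for task, n in counts.items()}


total_votes = sum(votes.values())
# Sanity check: the vote total cannot exceed respondents * max votes.
assert total_votes <= RESPONDENTS * MAX_VOTES_PER_RESPONDENT

shares = share_of_respondents(votes, RESPONDENTS)
for task, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{task}: {pct:.0f}% of respondents")
```

With multi-select questions the shares can add up to more than 100%, so they should not be interpreted as a partition of the respondents.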
The developers were then asked about the frequency of their collaboration with their
coworkers and questions about AI’s impact, how much they trust it for general ques-
tions, and how much they trust the code it generates (Figure 4.2). How frequently
developers collaborated with their coworkers on their tasks ranged from never (1) to
multiple times for every task (5). Based on the median being 3.5, respondents have
a tendency to, more often than not, collaborate with their peers. When asked how
much they trust the correctness of a coworker’s help or code suggestions, ranging
from not trusting it (1) to fully trusting the correctness (5), the respondents seemed
to have a lot of faith in their peers’ skills with a median of 4.

Figure 4.2: Likert scale question response distribution. Black boxes represent
responses from the initial survey (Section 4.1.1), red from the closing survey (Section
4.1.2). Statistical comparisons between initial and closing survey responses’ median
values were done using Wilcoxon rank sum tests.

Looking at the highest and lowest boxes and medians from the initial survey, devel-
opers seem to have high trust in their coworkers’ suggestions and solutions (median
of 4), while they do not seem to be worried about being replaced by AI (1: not
worried, 5: worry often) with a median of 2 (Figure 4.2).
Asking the developers how much they had used GenAI for coding before the company
started its pilot and adoption phase showed that 54% of the developers used
GenAI for coding and code generation on at least a monthly basis, of which 38%
used it on a weekly basis or more often.

Figure 4.3: Developers’ perception of GenAI tools. Black = % of initial
survey respondents (n=50), gray = % of closing survey respondents (n=40)

The developers were also asked how they view GenAI for software development.
This was a multiple-answer question, and developers could choose one or more of the
existing answers or add another answer themselves. From the black bars in Figure
4.3 we can see that 88% of the developers saw GenAI in software development as
a tool (44 votes) at the time of the initial survey. What is interesting is that the
second most selected option, chosen by 74% of respondents (37 votes), was assistant,
giving GenAI a more human connotation.

For replicability and transparency, the data from the initial survey can be found in
Appendix A.

4.1.2 Closing survey results
The second and last survey was done ∼4 months after the approval and release of
Gemini in the organization. With this survey, the intention was to have a few questions
to compare changes over the time period, as well as to assess the developers’
perception of GenAI and how it had affected their workflow over the 4-month
period.

To enable meaningful comparison across the two surveys, a subset of questions from
the initial survey was retained in the closing survey, with some rephrased for im-
proved clarity and alignment with theoretical constructs and the new questions.
These revisions were made to better reflect the terminology and dimensions found
in relevant literature—particularly the Technology Acceptance Model (Davis, 1989)
and trust in AI frameworks (Jacovi et al., 2021)—while maintaining comparability
with the original phrasing. The consistent use of Likert-scale items across both sur-
veys allows for analysis of changes in developer perceptions of usefulness, trust, ease
of use, and intended future use of GenAI tools over time.

Table 4.1 lists the 4 questions that were carried over to the second
survey, 3 of them slightly rephrased for better alignment with the phrasing of the
newly added questions. The responses for the three mapped questions in the
closing survey that use the Likert scale are represented in Figure 4.2 as the red boxes
to aid visual comparison between the two surveys.

Initial survey | Closing survey
On a scale 1-5, how frequently would you say that you collaborate with your co-workers, ask them questions or discuss with them? | I collaborate frequently with my colleagues, multiple times per task.
On a scale 1-5, how much do/would you trust the correctness of a co-workers help or code suggestions? | I trust the correctness of code suggestions provided by my co-workers.
On a scale 1-5, how much do/would you trust the correctness of AI generated code? | I trust the correctness of code generated by generative AI tools.
How do you currently view genAI for software development? (select as many as appropriate) | How do you currently view genAI for software development? (select as many as appropriate)

Table 4.1: Mapping of questions which are present in both surveys, some of
which have been rephrased.

When a GenAI tool is adopted, a logical assumption would be that people
collaborate less with coworkers, since the tool can replace
some of that interaction. However, the study observed
that people actually collaborated more with their coworkers (comparison
tested using a Wilcoxon rank-sum test, p = .03). Furthermore, a small increase was
observed in the trust in AI-generated code; however, the increase is not statistically
significant.
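To illustrate how such a comparison between two independent groups of Likert responses can be computed, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test. The thesis does not state which software or variant was used, so the normal approximation with tie correction (and no continuity correction) is an assumption, the function name is hypothetical, and the sample data are invented for illustration only.

```python
from collections import Counter
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Ties receive average ranks and the variance is tie-corrected, which
    matters for Likert data. No continuity correction is applied.
    Returns (z, p_value).
    """
    pooled = sorted(list(x) + list(y))
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w = sum(rank_of[v] for v in x)  # rank sum of the first sample
    mean_w = n1 * (n + 1) / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w <= 0:  # all observations identical: no evidence of a shift
        return 0.0, 1.0
    z = (w - mean_w) / sqrt(var_w)
    return z, erfc(abs(z) / sqrt(2))


# Hypothetical 1-5 Likert responses, NOT the thesis data.
initial = [3, 4, 2, 3, 5, 3, 4, 2, 3, 4]
closing = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
z, p = rank_sum_test(initial, closing)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The rank-sum test treats the two surveys as independent samples, which fits the anonymous design: a paired test would require linking each respondent's answers across both rounds.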
All the other questions from the closing survey that used the Likert scale are
displayed in Figure 4.4. The developers’ experience of using
the GenAI tools was generally positive, with the lowest first quartile (Q1) being 2.5
("predictability of GenAI tool behavior" and "confidence in using GenAI tools for
production coding"). The median is ≥ 3 for all the Likert-scale
questions in the closing survey. Given these numbers, it can tentatively be concluded
that the developers perceive the GenAI tools as useful and
as relatively easy to integrate and use in their daily workflow.

Figure 4.4: Likert scale question response distribution, closing survey (n=40).

The three questions regarding improvement of development velocity, faster problem
solving, and contribution to overall productivity all have medians of 4 and upper
quartiles at or near 5, underlining that the developers perceive that GenAI
tools speed them up and boost their overall output. The highest scoring question
of the closing survey was the response to the question of whether they would like to
continue to use GenAI tools going forward. The question was met with a median
of five, and a Q1 of four, highlighting the enthusiasm for the use of the tools and
indicating that once trialed, most developers wish to keep GenAI tools as part of
their workflow. Predictability of GenAI behavior and confidence in using GenAI
for code that will go into the production environment are the only two questions
where neutral or slightly positive answers, and the widest spread of answers, are
observed. The spread of responses to the predictability question suggests that
unpredictable outputs or solutions produced by GenAI tools are a worry or
frustration for developers. The spread of responses regarding confidence in GenAI
tools’ outputs that will go into production highlights similar concerns about
trust in these tools, as well as a possible need for
further training or governance safeguards.
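The medians and quartiles discussed above can be computed from raw Likert responses as in the sketch below. The thesis does not state which quartile convention was used, so the inclusive (linear interpolation) method is an assumption; the function name is illustrative and the example responses are invented, not the survey data.

```python
from statistics import median, quantiles


def likert_summary(responses):
    """Return (Q1, median, Q3) for a list of 1-5 Likert responses."""
    # method="inclusive" interpolates between observed values, which is
    # how non-integer quartiles such as 2.5 can arise on integer scales.
    q1, _, q3 = quantiles(responses, n=4, method="inclusive")
    return q1, median(responses), q3


# Hypothetical responses, NOT the thesis data.
example = [2, 2, 3, 3, 3, 4, 4, 5]
q1, med, q3 = likert_summary(example)
print(f"Q1 = {q1}, median = {med}, Q3 = {q3}")
```

On small samples, different quartile conventions can shift Q1 or Q3 by a fraction of a scale point, so the convention should be reported alongside the values.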

4.2 Interviews results
The study included two types of interviews: one interview with two people from middle
management and two interviews with developers who had used the GenAI tool in
their workflow during the case study.

4.2.1 Interview with management
A semi-structured interview was conducted with a Systems Architect/Team Lead
and a Platform Manager at the company, which lasted one hour. This provided rich
insights into the organizational considerations, selection criteria, and initial experi-
ences associated with adopting GenAI tools for software development. Both tech-
nical and managerial perspectives were explored, addressing critical aspects such
as usability and developer experience, compliance and legal constraints, strategic
alignment, informal adoption processes, and anticipated long-term effects on team
structure and productivity. Although the company is still in an exploratory phase,
without formalized Key Performance Indicators (KPIs) or structured training programs,
the interview revealed patterns of consideration aligning with established
theoretical frameworks, notably TAM (Davis, 1989) and Trust in AI frameworks
(Jacovi et al., 2021). The following 4 points represent the themes
identified and named through the thematic analysis of the management interview.
The questions for the interview, as well as the 48 initial codes found through
the thematic analysis, can be found in Appendix C.

Management Perspective: When asked about higher management’s motivation
for adopting GenAI tools, the manager stated ".. overall I think management is
pushing to adopt new tooling for many reasons. First and foremost is to boost pro-
ductivity of the personnel.", highlighting how perceived usefulness, in the shape of
expected productivity benefits, is a key driver for technology acceptance and change.
Furthermore, the manager also later in the interview stated that having a GenAI
tool for coding could potentially help make the company more attractive for future
employees, thereby adding another potential benefit from a management and hiring
perspective.

Selection Criteria: When asked early in the interview about selection and evaluation
criteria, both the architect and the manager contributed. The architect commented
that the general usability of the tool is important, pointing out how the perceived
usefulness and perceived ease-of-use of the tools are of high importance from a code
production perspective since the developer experience (DX) is central. Adding onto
what the architect stated, the manager specified "we also had to include a legal as-
pect, more particular the intellectual property of the company needs to be protected.",
which relates more to company-specific selection criteria in terms of legal and
compliance constraints.

Adoption Strategy and Alignment: When asked whether they had looked at
other AI tools and why the company had selected Gemini for this first bigger test-
adoption, the manager commented that they had selected Gemini as a holistic AI
tool that could be enabled in other products and for other departments as well, not
only in development tooling, highlighting how Gemini specifically was selected with
a broader strategic alignment in mind.
Another strategic choice the company had made was not to use a formal change
management approach. When asked if any specific change management models or
styles had been used, the manager answered ".. the short answer would be no. it’s
more of exploratory slash collaborative approach of evaluating those tools... I don’t
think we have it [change management processes] at all in this company.". This shows
that the company does not use a formal, top-down change management process. In-
stead, it relies on a collaborative and exploratory approach to adopting new tools.
This reflects a culture of low power distance (Hofstede, 2001), where decisions are
made together rather than imposed by management. It is also consistent with Lee et
al. (2013), who state that in low-power-distance, more individualist, and
lower-uncertainty-avoidance settings, adoption relies more on early autonomous
innovation than on imitation, favoring pilot champions and decentralized
experimentation. Furthermore,
in terms of strategy, the manager highlighted shortly afterwards that no formal training
had been provided. Besides an internal hackathon¹ where the use of
AI tools was encouraged, the company has only prioritized informal knowledge sharing².

Measures and Governance: Another factor of GenAI tool adoption is the impact
it might have on velocity, quality, etc. When asked whether and how the company
measures things like AI-assisted development velocity and code quality, the manager answered "we based our
evaluation in collaboration and discussions based on verbal feedback from the partic-
ipants on the evaluation", highlighting how the developers’ opinions are collected in
a qualitative manner. The manager states that they have no intention of making
a distinction between human-written and AI-generated code. This could either be
due to the level of trust in these tools, or it could be related to the difficulties in
making the distinction between the two when all code is delivered in the developer’s
name and no trace is left of what part was AI-generated if/when the code is refac-
tored. "We have no intention of, let’s say, introducing some kind of distinction on
this code has been produced by humans, this code has been produced by AI tooling."
The architect also states "We haven’t used them [GenAI tools] long enough to actually
be able to establish proper KPIs.", highlighting the challenges of evaluating new
technologies in a rapidly evolving context. Furthermore, the manager and architect
acknowledged a lack of clear governance guidelines,
largely due to the inherent uncertainty and rapidly evolving nature of GenAI
tools. This volatility makes it difficult to establish long-term generalizations and
stable measurement methods. Furthermore, as the architect highlighted, differenti-
ating between human-written and AI-generated code is inherently difficult because
all code is committed by developers who ultimately assume responsibility for its
quality. Thus, current organizational practices do not distinguish between human-
authored and AI-generated contributions, reinforcing the complexity in governing
the use of these tools².

These findings highlight broader strategic and practical considerations that go be-
yond immediate usability and adoption metrics, aligning with concerns discussed in
the literature on Trust in AI frameworks (Jacovi et al., 2021).

¹ A collaborative event where programmers and other parts of an organization can come together
to work intensively on projects or ideas.

² The company later stated that it plans to roll out formal training programs
and updated policies and guidelines for the use of AI soon, to address operational needs and regulatory
developments.

Following Naeem et al. (2020), the four themes were organized into a conceptual
model, shown in Figure 4.5, together with additional factors drawn from, for example,
the TOE framework.

[Diagram. Boxes: Management Perspective (productivity, talent, urgency); External factors (regulation, pressure, competition, isomorphism); Selection Criteria (TAM, legal/I.P., integration); Technical aspects (task fit, domain complexity, tool maturity); Adoption Strategy & Alignment (ecosystem fit, change mgmt, holisticity); Organizational aspects (skills, training, cost, culture, risk tolerance); Measures & Governance (human-in-loop, policies, monitoring); Adoption & Use Patterns (integration, usage, productivity). Arrow labels: feedback, evidence, trust.]

Figure 4.5: Conceptual model of organizational GenAI adoption, grounded in
themes found through thematic analysis of the management interview and adoption
theories from Section 2.5. Dashed boxes act as moderators: affecting the strength
and direction of the relationship (arrow) between two variables. Dashed arrows
signify an upstream impact creating a loop. Captions highlighted in bold are the
themes from the thematic analysis of the management interview.

Management Perspective directs both Selection Criteria and Adoption Strategy &
Alignment while being influenced by External factors such as changing industry
standards or regulations. Selection Criteria and Adoption Strategy & Alignment
influence each other bidirectionally: strategic choices (e.g., a holistic workspace
rollout) shape acceptance criteria, while legal/IP and IDE-integration constraints
steer strategy. Selection Criteria then flows into Measures & Governance, which in
turn leads to Adoption & Use Patterns, the final step, which describes individual
developer adoption. Measures & Governance provides feedback to refine Selection
Criteria and supplies evidence upward to inform the Management Perspective,
creating an iterative feedback loop. The level of trust that developers have in
GenAI, based on their interaction through Adoption & Use Patterns, in turn
influences Measures & Governance, because trust regulates how "in-the-loop" humans
are and will be; this creates a second loop. Two moderators, Technical aspects and
Organizational aspects, condition the strength and direction of the relationship
between Selection Criteria and Adoption Strategy & Alignment, ultimately affecting
the success of the adoption. The model links TAM drivers with trust/predictability
under governance (human-in-the-loop) (Davis, 1989; Jacovi et al., 2021) and with TOE
and adoption (Oliveira and Martins, 2011; Lee et al., 2013), and it follows
case-study logic for analytical generalization (Yin, 2018; Runeson and Höst, 2009).
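
One way to make the relationships described above concrete is to encode them as
a small directed graph. The following Python sketch is an illustration only and
not part of the method used in this thesis: the edge list is transcribed from the
description of the model, while the helper names (build_graph, reachable_from,
nodes_in_loops) are hypothetical. It checks which elements of the model lie on a
feedback loop.

```python
# Directed relationships as described in the text for Figure 4.5.
# The third item is the arrow caption where the text names one.
EDGES = [
    ("External factors", "Management Perspective", None),
    ("Management Perspective", "Selection Criteria", None),
    ("Management Perspective", "Adoption Strategy & Alignment", None),
    ("Selection Criteria", "Adoption Strategy & Alignment", None),
    ("Adoption Strategy & Alignment", "Selection Criteria", None),
    ("Selection Criteria", "Measures & Governance", None),
    ("Measures & Governance", "Adoption & Use Patterns", None),
    ("Measures & Governance", "Selection Criteria", "feedback"),
    ("Measures & Governance", "Management Perspective", "evidence"),
    ("Adoption & Use Patterns", "Measures & Governance", "trust"),
]

# Moderators condition a relationship instead of being a step in the flow.
MODERATORS = {
    ("Selection Criteria", "Adoption Strategy & Alignment"): [
        "Technical aspects",
        "Organizational aspects",
    ],
}


def build_graph(edges):
    """Return an adjacency mapping {node: set of direct successors}."""
    graph = {}
    for source, target, _label in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure every node is a key
    return graph


def reachable_from(graph, start):
    """Return every node reachable from start via one or more arrows."""
    seen = set()
    stack = list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def nodes_in_loops(graph):
    """Return the nodes that can reach themselves, i.e. lie on a loop."""
    return {node for node in graph if node in reachable_from(graph, node)}


graph = build_graph(EDGES)
loops = nodes_in_loops(graph)
```

Under this encoding, every element except External factors lies on at least one
loop, which is consistent with the iterative character of the model.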

Summary: Overall, the interview underscores that the company’s adoption of
GenAI tools is primarily motivated by enhancing productivity and development
experience, strongly aligning with constructs such as perceived ease-of-use and per-
ceived usefulness. At this stage, adoption remains exploratory, informal, and reliant
on qualitative feedback, with continued strategic considerations regarding gover-
nance, legal compliance, and company-wide tool accessibility and usability. Moving
forward, introducing formal training programs, clear governance policies, and mea-
surable KPIs could provide a structured pathway to sustainable integration and
deeper organizational acceptance of GenAI solutions.

4.2.2 Interviews with developers
After the closing survey was completed, two 30-minute semi-structured interviews
were conducted with employees to complement and enrich the survey findings by
probing four theory-grounded dimensions: First, the developers’ mental models and
relational metaphors for GenAI (e.g., “assistant,” “rubber duck,” “tool”) were ex-
plored to uncover emotional framing and a sense of agency in line with human–AI
interaction frameworks (Amershi et al., 2019; Wang et al., 2020). Second, questions
on task integration (e.g., “Which phases or tasks do you habitually use Gemini for?”)
were posed to elicit concrete examples of how GenAI embeds into development,
maintenance, debugging, scaffolding, and bulk edits, adding contextual richness to
quantitative measures of use, integration, and its limitations (Peng et al., 2023;
Rasnayaka et al., 2024). Third, trust and willingness to rely on GenAI were examined
by asking what conditions increase or decrease confidence in AI-generated code,
reflecting the emphasis on predictability and transparency in trust-in-AI frame-
works (Jacovi et al., 2021; Eiband et al., 2019). Finally, perceived productivity,
autonomy, and cognitive impact (e.g., “Has GenAI changed how you solve problems
or how much mental effort you invest?”) were assessed to engage TAM’s core
constructs (perceived usefulness and ease of use) and the cognitive-load
considerations highlighted by Davis (1989) and Naeem et al. (2020). Together, these
interviews provide a rich, theory-grounded qualitative layer that both triangulates
and deepens the statistical trends observed in the surveys.

4.2.2.1 Tester Interview

An interview was conducted with a senior software tester who has incorporated
Gemini into daily workflows for both test-case generation and broader development
support. Initially treated as an experimental side tool, Gemini is now relied upon
regularly to accelerate repetitive tasks—while the tester retains critical oversight
and maintains full control over all AI-generated outputs.
Externalized Thinking Aid: The tester compared GenAI to a “rubber duck”
early in the interview, explaining: “I just throw things at it and see what kind of
analysis or tips and tricks it comes up with.” This metaphor echoes Pinto et al.
(2024) on AI as a contextual partner that reduces cognitive friction.
Test-Case Scaffolding: When the tester prompts GenAI to create new tests, it
generates several for each case. The tester reviews all of them, keeping some and
discarding others. This aligns with Li et al. (2024) and Rasnayaka et al. (2024)
on AI’s strength in high-volume, boilerplate tasks.
Calibrated Trust: Trust in GenAI is carefully managed. When the tester was
asked what makes him trust or distrust GenAI, he responded that it depends on
how much data it already has, and that one needs to keep track of when it goes
off the rails. This underlines the importance he places on reliability and
predictability as prerequisites for human trust in AI systems (Jacovi et al., 2021).
Human-in-the-Loop Verification: The tester insists on full comprehension
before acceptance. When asked whether he had ever used AI-generated code
without fully understanding it, he responded: “so far, no [I haven’t used code
without understanding it]... I have understood the code it generated so far”,
exemplifying the continuous validation loop advocated by Amershi et al. (2019).
Productivity Gains: At the end of the interview, the tester was asked whether he
feels that GenAI has freed up time for more creative or challenging tasks. He
responded that while GenAI “has freed up time... I still need to think about the
solution.” The tester thus confirms that GenAI accelerates mechanical work without
relinquishing decision responsibility, supporting Peng et al. (2023) on measurable
productivity improvements alongside sustained developer oversight. His answer also
hints that time saved on one task may simply have been shifted to another
existing task.

4.2.2.2 Full-stack/Backend Developer Interview

The second interview involved a senior full-stack/backend developer who now uses
Gemini in both front-end (React/Next.js) and back-end (Java/Spring) workflows.
Having transitioned from pure back-end to full-stack six months ago, they moved
from cautious experimentation to daily reliance—especially for refactoring and code
scaffolding—while retaining critical oversight.
The developer has integrated Gemini into most of their workflows, invoking it ha-
bitually for clarification, UI enhancement, and boilerplate cleanup. They value its
ability to generate “very elegant” generic solutions that drastically reduce
manual effort and accelerate bug-fixing, yet they test and review every suggestion to
avoid over-commenting, logic errors, or misinterpretation. Overall, they view Gem-
ini as a strong productivity and learning aid—particularly for mastering front-end
frameworks—while maintaining healthy skepticism in complex or novel scenarios
and ensuring full comprehension before merging AI-generated code.
Refactoring & Boilerplate: When asked which kinds of development tasks he
found GenAI most helpful for, the full-stack developer responded that it had been
helpful with refactoring work, and that “Gemini was able to find generic solutions
which were very elegant and which worked well with our code.” This aligns with Li
et al. (2024) and Rasnayaka et al. (2024) on AI’s effectiveness at boilerplate
generation and refactoring. Unlike the tester, who saw only a one-off test failure,
this developer reports consistent success, suggesting that outcome variability
depends on prompt quality and task/context complexity.
Collaborative Dialogue: Where the tester likened GenAI to a “rubber duck,” this
developer described it as a “dialogue”: “I feel I’m collaborating... it’s a dialogue: I
upload code, get code back, test it, and improve continuously”, mirroring Pinto et al.
(2024) on AI as a contextual partner. The “dialogue” metaphor positions GenAI as
an active, iterative collaborator rather than a static tool.
Human-in-the-Loop Verification: When asked when and how much he trusts the
code produced by GenAI, the developer responded that he does so “[when] I have
thoroughly tested the code and I see that it works”, exemplifying the continuous
validation loop advocated by Amershi et al. (2019). Even when domain expertise is
lacking, the developer relies on testing to maintain control.
Trust Through Expertise: The developer’s view of his own trust in GenAI was
notably self-reflective. Asked about specific experiences that increased or
decreased his trust in GenAI over time, he responded that “I tested it quite a lot and
there were some issues in the beginning... but then I learned better prompting... and
the code quality was better as well.” This reflects the emphasis of Jacovi et al.
(2021) that trust grows with both system reliability and user proficiency, as
improved prompting skills yield more predictable outputs.
Productivity Gains: At the end of the interview, the developer was asked whether
he felt that using GenAI for coding freed up time for more creative or challenging
tasks. He replied that debugging time had dropped significantly, and that issues
that could take him up to half a day to fix now sometimes take only 30 minutes.
According to the developer, this in turn frees up a lot of time for more challenging
tasks, which supports Peng et al. (2023) on measurable productivity improvements,
although a developer may notice occasional passivity until their prompting skills
evolve.

Result summary: Based on the insight gained from this study, along with
statements made by various researchers and writers, the following statement by
Licklider (1960) resonates strongly with the results: “In the anticipated
symbiotic partnership, men will set the goals, formulate the hypotheses, determine
the criteria, and perform the evaluations. Computing machines will do the
routinizable work...”. The human-AI collaboration patterns found through data
collection and interpretation underline that people have not handed control over
to AI, and it remains unclear whether they ever will. Developers today remain
critical and cautious of AI-generated code, and through triangulation of surveys,
interviews, and observations at the company, we see that the collaboration has not
matured enough to tell how this new relationship might grow or change in the
future.
Trust and integration success will depend on how well AI can be integrated into
existing workflows and processes, and on how GenAI is developed and shaped to
become more trustworthy. When AI transparency gives developers the impression
that a GenAI tool grasps both the context of the prompt and the task it has been
given, a higher level of trust can be obtained. In cases where it is unclear whether
the tool has understood the assignment, distrust and skepticism will be higher.



5
Discussion

This chapter discusses and reflects upon constructs such as reliability,
transparency, and validity. It highlights potential biases and weaknesses, as well
as points to consider if this study were to be repeated or replicated in the future.

Construct Validity: Survey questions were adapted from TAM (perceived
usefulness/ease of use) and trust-in-AI work, and they were pilot-tested with three
developers for understandability. The interview guide followed the same process.
The two survey waves were combined with three interviews, and a clear “chain of
evidence” was kept in the methodology, results, and appendices (instruments,
coding notes, quotes) for triangulation and traceability (Yin, 2018; Runeson and
Höst, 2009; Robson and McCartan, 2016). Remaining risks are typical biases
(self-report bias, proximity bias, confirmation bias, etc.) and limited scale
validation beyond pilot feedback. These risks were reduced by checking survey
patterns against interview themes.

Internal Validity: Causation is not claimed. Changes over four months could come
from other stimuli (tool updates, team priorities, season/holidays), not only GenAI
tools. Because surveys were anonymous, people could not be matched between the
initial and closing survey (21 said they answered both), and three repeated questions
were slightly rephrased — both of these being potential threats. To limit this, the
same Likert scale was kept, each wave was analyzed on its own, and all conclusions
were cross-checked with interview data. In future work, pseudonymous IDs would al-
low anonymous yet within-person comparisons (Yin, 2018; Runeson and Höst, 2009).

External Validity: These results are meant to be generalized analytically to
similar settings, not statistically to everyone. The setting here is a larger
software company adopting GenAI under legal/compliance rules, with full-scale
workspace integration (Google Workspace). The single-site design and the specific
tools (Gemini, IDE assistants) limit generalization. To help readers judge
transferability, the
context has been described in as much detail as possible and the procedure has
been documented so others can repeat or extend this study (Yin, 2018; Robson and
McCartan, 2016; Runeson and Höst, 2009).

Transparency and Reliability: To strengthen reliability and transparency, this
study documents a clear chain of evidence (Yin, 2018): data-collection instruments,
the coding protocol, and analysis decisions are provided throughout the thesis and
linked to quotations from the interviews and data from the surveys (which are also
provided in the appendices). The end-to-end process for data gathering,
transformation, and interpretation is described in detail to allow replication.
Furthermore, this thesis will be critically analyzed and exposed to independent
scrutiny during the presentation and defense of the results (Runeson and Höst, 2009;
Robson and McCartan, 2016). Because of confidentiality constraints, interviewees
have been de-identified using generic names like Tester and Architect, which are
used for references and quotations. Altogether, these materials enable readers to
verify how conclusions were reached and allow other researchers to reuse the
protocol in comparable contexts, aiming for analytic reproducibility rather than
identical outcomes.
Further thoughts and improvements: One finding was that collaboration scores
were higher in the closing survey than in the initial survey. This change may partly
reflect that nearly half of closing-survey respondents did not participate in the
first survey. Alternatively, the timing of the closing survey, which was conducted
during the summer holiday, may have forced developers to rely more on colleagues
outside their usual networks. Asking unfamiliar people for help or guidance could
have made them more aware of how much they depend on and collaborate with
others. The slight rephrasing of the question could also have affected this result.
Secondly, in terms of reliability and validity, the survey respondents were people
who had applied for, or willingly decided to use, GenAI tools in their development
work. For the IDE extension tool, one had to apply to get a license, and the
browser version of Gemini was available to anyone willing to keep the tab open
and use it. It is therefore likely that most people who chose to use the tools, and
who in turn answered the surveys and interviews, are enthusiastic about learning
and experimenting with these tools in their workflow. This self-selection is a
potential source of bias.
Because of the anonymous format of the surveys, paired analyses of individual
respondents were not possible. The inability to link responses across surveys is a
limitation, as it prevents precise measurement of attitude changes over time. If this
study were to be run again, the surveys should use pseudonymous identifiers to
enable paired analyses while preserving respondent confidentiality.
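
A minimal sketch of how such pseudonymous identifiers could work is given below
in Python. It is an assumption for illustration, not a procedure used in this
study: each respondent enters the same self-chosen code in both surveys, and a
keyed hash (HMAC-SHA256 from the Python standard library) turns it into an ID that
can be matched across waves without storing the code itself. The function names,
the example codes, the secret, and the Likert scores are all hypothetical.

```python
import hashlib
import hmac


def pseudonymous_id(self_chosen_code: str, study_secret: bytes) -> str:
    """Derive a stable ID that does not expose the respondent's code."""
    # Normalize so that "Blue Fox 42" and "bluefox42" map to the same ID.
    normalized = "".join(self_chosen_code.split()).lower()
    digest = hmac.new(study_secret, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]  # shortened for readability


def pair_responses(initial: dict, closing: dict) -> dict:
    """Keep only respondents present in both waves: {id: (initial, closing)}."""
    shared = initial.keys() & closing.keys()
    return {pid: (initial[pid], closing[pid]) for pid in shared}


# Hypothetical example: Likert scores (1-5) for one repeated question.
secret = b"hypothetical-study-secret"
initial_wave = {
    pseudonymous_id("Blue Fox 42", secret): 3,
    pseudonymous_id("red-owl-7", secret): 4,
}
closing_wave = {
    pseudonymous_id("bluefox42", secret): 4,
    pseudonymous_id("green-elk-9", secret): 5,
}

paired = pair_responses(initial_wave, closing_wave)
# Within-person change for each respondent who answered both waves.
changes = [after - before for before, after in paired.values()]
```

In this sketch only one respondent appears in both waves, so only one
within-person change can be computed. Short, guessable codes could still be
recovered by anyone holding the secret, so the secret would need to be kept away
from the people analyzing the responses.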

Ethical Considerations: Following some of the guidelines of Runeson and Höst
(2009), all participants provided informed consent; the company and individuals
are anonymized, the surveys were completely voluntary and anonymous, and re-
spondents of the interviews were offered the opportunity to check the transcripts to
verify factual accuracy and anonymity. Interpretation and comparison of the data
remained the responsibility of the researchers.
All data (survey results, interview recordings, transcripts) were stored in Google
Drive, with access restricted to selected people and edit access granted only to
the researchers.

Result reflections: Comparing the initial and closing surveys (Figure 4.3), the
most notable changes over time concern the perceptions of the tool as a
StackOverflow replacement and as a Competitor. Although neither change is
statistically significant (both p-values well above .05), they remain noteworthy and
could merit further investigation in future research.
In addition, several respondents used the “other, please specify” option; each
free-text answer was given by only one respondent. In the initial survey, these
were: “As an overly
confident assistant who also makes mistakes” and “Search Engine replacement.” In
the closing survey, they were: “StackOverflow supplement,” “Inspiration prompt,”
“Enabler – makes scaffolding go a lot faster,” and “Bulk code writer.” Because each
option was selected by only one participant (2% in the initial survey, 2.5% in the
closing survey), they were not included in the main analysis.
This study and its conclusions are based on research performed at an organization
with a Scandinavian company culture and structure. The results might therefore
not generalize or apply to an organization with a vastly different culture,
structure, or general mindset.



6
Conclusion

Circling back to the main research question in Section 1.3 (how developers perceive
and integrate GenAI into their workflows, what adoption-and-trust patterns emerge,
and which managerial factors drive tool selection in large software organizations),
we can now draw conclusions based on the results presented in Chapter 4 and the
critical reflection in Chapter 5.

RQ1.1 - GenAI’s role in development workflow: Most developers use GenAI
as either a tool (combined ∼ 88%), an assistant (combined ∼ 72%), or a Stack-
Overflow replacement (combined ∼ 65%) in their development workflow (survey
data, Figure 4.3). Based on the survey results, only a minority consider it a co-
worker or teammate, underscoring that, at this stage, GenAI is still largely seen as
augmentative rather than peer-like. In the qualitative data, GenAI also surfaced
as a “rubber duck”: an externalized thinking aid that developers throw problems
at to gain new perspectives. This metaphor was used by the tester and mirrors
Pinto et al. (2024), who describe AI assistants serving as contextual partners that
reduce cognitive friction. Unlike the tester’s “rubber duck” metaphor, the full-stack
developer (Section 4.2.2.2) described a two-way “dialogue,” indicating a more
interactive, partnership-like relationship once users gain prompting proficiency,
further supporting the contextual partner model of Pinto et al. (2024).

RQ1.2 – Usefulness and Reliability: From the interviews, GenAI is perceived
as highly useful for offloading repetitive work like scaffolding bulk test cases or code
templates, thereby boosting productivity, which further supports the claims of
Peng et al. (2023), Li et al. (2024), and Rasnayaka et al. (2024). Reliability and
trust are treated as dynamic constructs: outputs are checked for domain relevance
and consistency, and reliability and trust are continuously assessed based on the
quality and fit of the output. Every suggestion is reviewed to ensure full under-
standing before integration, reflecting the prerequisites of reliability, predictability,
and transparency described by Jacovi et al. (2021) and Amershi et al. (2019). The
full-stack developer’s experience of initial failures followed by improved outputs via
refined prompts illustrates how user expertise with prompting directly enhances
perceived reliability — reinforcing the trust prerequisites of predictability and user
competence in the framework presented by Jacovi et al. (2021).

RQ1.3 - Integration into daily workflow: Based on the interviews and survey
data, GenAI has become integral to developers’ routines, especially for automating
high-volume, boilerplate tasks such as test-case generation, CI/CD¹ pipeline
scripting, SQL²/reporting script assistance, and bulk search-and-replace. Many
participants describe GenAI as a StackOverflow replacement, preferring in-context
prompts for code snippets or explanations rather than navigating external forums.
That the developer habitually turns to GenAI when “enhancing the product” (rather
than starting every task by prompting it) highlights a just-in-time integration
pattern, where GenAI serves as on-demand support rather than a default. This
dynamic assistance supports Pinto et al. (2024), who characterize AI assistants as
contextual partners that step in precisely when developers need cognitive offloading
or help.

RQ1.4 – Influences on trust and willingness to rely: Survey data (Figure
4.2) shows that developers consistently trust their colleagues’ code suggestions more
than those generated by GenAI. However, we do see an increase in the trust in
correctness of AI-generated code between the initial and closing surveys (although
not statistically significant). Furthermore, by the closing survey (Figure 4.4), most
developers report that GenAI behavior feels sufficiently predictable and express con-
fidence in using AI-generated code in production environments. Trust and reliance
clearly vary by task and recent tool experience: routine or well-scoped tasks (e.g.,
test scaffolding and boilerplating) lead to greater confidence, whereas open-ended
or complex tasks still instill greater caution. These findings underscore the claims
of Jacovi et al. (2021) and Amershi et al. (2019) that reliability and predictability
form the foundation of human trust in AI, and that trust must be continually
calibrated as humans work with AI tools.

RQ1.5 - Productivity, autonomy and confidence: Based on findings from the
closing survey highlighted in Figure 4.4, developers reported that they felt their
productivity, development velocity, and problem-solving speed had all improved
while using GenAI. The closing survey also showed that most developers find GenAI
easy to integrate into their workflows. Together, this indicates both high perceived
usefulness and high perceived ease of use, as GenAI was easy to integrate into
existing processes and also accelerated routine tasks. The tester confirmed some of
these effects in practice, describing sustained confidence in and ownership over
AI-generated outputs, and thereby maintained autonomy. The tester also specified
that GenAI offloads work without taking away developers’ sense of control.

RQ1.6 – Drivers from the organizational perspective: Based on the interview
with the platform manager and systems architect, four key drivers underlie their
GenAI adoption strategy. First, increasing developer productivity is paramount:
GenAI is positioned as a strategic tool for faster delivery and reduced manual effort.
Second, protecting intellectual property and ensuring legal compliance shaped
their tool choice, reflecting the trust-in-AI prerequisite of clear usage terms
(Jacovi et al., 2021). Third, strategic alignment with the broader Google ecosystem
drove the selection of Gemini as a “holistic AI tool” usable across development as
well as other departments of the organization. Finally, GenAI tooling serves as a
talent-attraction and retention measure, offering modern capabilities that appeal
to prospective hires. Cost considerations were secondary to compliance,
productivity, and seamless integration into existing workflows.

¹ CI/CD is short for Continuous Integration / Continuous Deployment/Delivery.
² SQL (Structured Query Language) is a language for querying databases.

In conclusion, in an organization with a Swedish structure and culture, and after
GenAI has been used for development over an extended period of time, the vast
majority of developers view GenAI primarily as a productivity-boosting assistant,
used wherever boilerplating or cognitive offloading is needed. High perceived ease
of use and developer enthusiasm drove fast and smooth integration into both testing
and coding workflows. In this low-power-distance, more individualist, and
lower-uncertainty-avoidance setting, the adoption relied on early autonomous
innovation, pilot champions, and decentralized experimentation, and it seemed to be
successful based on the perceptions of the developers. Trust in AI outputs is
situational: clear, well-prompted interactions built confidence, while extended
and iterative improvement loops that failed to meet requirements significantly
decreased both confidence and trust. Across interviews, developers adhered to a
strict human-in-the-loop collaboration pattern, checking and testing every AI
suggestion before peer review and merging into main code bases. From management’s
perspective, cost was secondary to perceived usefulness, ease of integration, and
alignment with existing workflows, underscoring that strategic fit, legal risk
management, and developer perception, not price, govern organizational GenAI
adoption.



7
Use of AI

For the making of this thesis, both ChatGPT and Gemini have been used for the
following:

Searching for literature can be daunting and very time-consuming. During the
initial part of searching for literature, once one relevant article was found, ChatGPT
was used to quickly find related and similar articles. Articles were then critically
skimmed and reviewed before selecting the most relevant for the purpose of this
thesis.
Grammar and word choice are always a point of discussion when writing a scientific
paper. ChatGPT was used as a tool to highlight potential issues with word choices,
find more suitable synonyms, and double-check grammar. ChatGPT was prompted
with questions similar to: ‘for the following sentence, please highlight any
grammatical issues or words that could be removed or replaced to make the sentence
more clear: "..." ’. Suggestions were then taken into consideration and applied where
they were deemed an improvement. At times the same was done for shorter
paragraphs, prompting for example ‘Please give suggestions as to if there is a way
to make the following paragraph shorter and clearer: "..." ’. Similarly, suggestions
were then taken into consideration and applied where deemed an improvement.
Human-AI collaboration has been a subject of this thesis, so it was fitting to use
AI as a collaborative and reflective partner to help reduce bias and improve
clarity, since the thesis was written by one individual. Similar to grammar and word
choices, an example prompt could be: ‘for the following conclusion, please analyze
and highlight any potential biases, correlations or inverse correlations i might have
missed, or potential ways this could be misinterpreted: "..."’. The tool would
normally respond with a few suggestions, which could then be considered. This
allowed for additional reflection, double-checking of results, and less biased
thinking during the thesis work.



Bibliography

Abril, D. and O’Donovan, C. (2025). Bosses want you to know AI is coming for your
job. The Washington Post. Accessed June 25, 2025. Top U.S. CEOs including
Amazon, IBM, Salesforce, and JPMorgan warn that AI will significantly transform
or eliminate software-developer jobs.

Ali, O., Murray, P. A., Muhammed, S., Dwivedi, Y. K., and Rashiti, S. (2022).
Evaluating organizational level IT innovation adoption factors among global firms.
Journal of Innovation & Knowledge, 7(3):100213.

Amershi, S., Weld, D. S., Vorvoreanu, M., Fourney, A., Nushi, B., Collisson, P., Suh,
J., Iqbal, S., Bennett, P. N., Inkpen, K., Teevan, J., Kikin-Gil, R., and Horvitz,
E. (2019). Guidelines for human-AI interaction. In Proceedings of the 2019 CHI
Conference on Human Factors in Computing Systems, CHI ’19, New York, NY,
USA. Association for Computing Machinery.

Apicella, C. L. and Silk, J. B. (2019). The evolution of human cooperation. Current
Biology, 29(11):R447–R450. Accessed: 2025-04-30.

Barke, S., James, M. B., and Polikarpova, N. (2023). Grounded copilot: How
programmers interact with code-generating models. Proceedings of the ACM on
Programming Languages, 7(OOPSLA1):Article 78, 1–27.

Binns, R., Kleek, M. V., Veale, M., Lyngs, U., Zhao, J., and Shadbolt, N. (2018).
‘It’s reducing a human being to a percentage’: Perceptions of justice in
algorithmic decisions. In Proceedings of the 2018 CHI Conference on Human Factors in
Computing Systems (CHI ’18), pages 1–14. ACM. Accessed on May 16, 2025.

Bridges, W. (1991). Managing Transitions: Making the Most of Change. Addison-
Wesley, Reading, MA.

Carr, P. B. and Walton, G. M. (2014). Cues of working together fuel intrinsic
motivation. Journal of Experimental Social Psychology, 53:169–184. Accessed:
2025-05-05.

Choudhuri, R., Trinkenreich, B., Pandita, R., Kalliamvakou, E., Steinmacher, I.,
Gerosa, M., Sanchez, C., and Sarma, A. (2024a). What guides our choices?
modeling developers’ trust and behavioral intentions towards GenAI. arXiv preprint
arXiv:2409.04099.

Choudhuri, R., Trinkenreich, B., Pandita, R., Kalliamvakou, E., Steinmacher, I.,
Gerosa, M., Sanchez, C., and Sarma, A. (2024b). What guides our choices?
modeling developers’ trust and behavioral intentions towards GenAI. In Proceedings
of the 47th IEEE/ACM International Conference on Software Engineering (ICSE),
Ottawa, Canada. IEEE/ACM.

Coello, C. E. A., Alimam, M. N., and Kouatly, R. (2024). Effectiveness of ChatGPT
in coding: A comparative analysis of popular large language models. Digital,
4(1):114–125.

Dagli, K. (2023). Collaboration vs. teamwork: Key differences and when to use each.
https://www.togetherplatform.com/blog/collaboration-vs-teamwork-key-differences-and-when-to-use-each.
Accessed: 2025-04-30.

Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance
of information technology. MIS Quarterly, 13(3):319–340. Accessed on May 16,
2025.

Eiband, M., Schneider, H., Bilandzic, M., Fazekas-Con, J., Haug, M., and
Hussmann, H. (2019). Transparency and trust in AI: How humans perceive AI’s
decision-making. In Proceedings of the 2019 CHI Conference on Human Factors
in Computing Systems, Glasgow, Scotland, UK. ACM.

Faulk, N. and da Fonseca, C. C. (2022). Moc 101—fundamentals for effective change
management. Process Safety Progress, 41(3):492–502.

GitHub (2021). Introducing GitHub Copilot: Your AI pair programmer. Accessed:
2025-04-10.

GitHub (2025). GitHub Copilot: Your AI pair programmer. Accessed: 2025-03-28.

Google (2023). Introducing Gemini: Our largest and most capable AI model.
Accessed: 2025-04-10.

Google (2025). Gemini. Accessed: 2025-03-28.

Hofstede, G. (2001). Culture’s Consequences: Comparing Values, Behaviors, Insti-
tutions and Organizations Across Nations. Sage Publications, Thousand Oaks,
CA, 2nd edition.

Horowitch, R. (2025). The computer-science bubble is bursting. The Atlantic. Ac-
cessed June 25, 2025. Discusses how AI is contributing to declines in CS enrollment
and entry-level developer hiring.

Jacovi, A., Marasovic, A., Miller, T., and Goldberg, Y. (2021). Formalizing trust
in artificial intelligence: Prerequisites, causes and goals of human trust in AI.
In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and
Transparency (FAccT ’21), pages 624–635. Accessed on May 16, 2025.

Kalliamvakou, E. (2024). A developer’s second brain: Reducing complexity
through partnership with AI.
https://github.blog/news-insights/research/a-developers-second-brain-reducing-complexity-through-partnership-with-ai/.
Accessed: 2025-04-30.

Kotter, J. P. (2012). Leading Change: With a New Preface by the Author. Harvard
Business Review Press, Boston, MA.

Kotter, J. P. and Schlesinger, L. A. (1979). Choosing strategies for change. Harvard
Business Review, 57(2):106–114.

Lee, S.-G., Trimi, S., and Kim, C. (2013). The impact of cultural differences on
technology adoption. Journal of World Business, 48(1):20–29.

Li, Z. S., Arony, N. N., Awon, A. M., Damian, D., and Xu, B. (2024). AI tool use and
adoption in software development by individuals and organizations: a grounded
theory study. arXiv preprint arXiv:2406.17325.

Licklider, J. C. R. (1960). Man-computer symbiosis. IRE Transactions on Human
Factors in Electronics, HFE-1(1):4–11.

Liu, S. Y. (2020). Artificial intelligence (AI) in agriculture. IT Professional,
22(3):14–15. Accessed: 2025-04-10.

MIT Human Resources (2025). Recommended resources: Managing change.
https://hr.mit.edu/learning-topics/change/resources. Accessed: 2025-04-30.

Mullins, A. (2024). What is the management of change (MOC) process?
https://coastapp.com/blog/management-change-moc/. Accessed: 2025-04-30.

Naeem, M., Besharat, F., Bashir, M., and Arif, M. (2020). A step-by-step process of
thematic analysis to develop a conceptual model in qualitative research. Quality
& Quantity, 54(3):1335–1352.

Olavsrud, T. (2025). 10 most used GenAI tools in the enterprise. Accessed: 2025-04-26.

Oliveira, T. and Martins, M. R. (2011). Literature review of information technology
adoption models at firm level. Electronic Journal of Information Systems Evaluation
(ISSN 1566-6379), 14.

OpenAI (2025). ChatGPT. Accessed: 2025-03-28.

Pearce, H., Ahmad, B., Tan, B., Dolan-Gavitt, B., and Karri, R. (2022). Asleep
at the keyboard? Assessing the security of GitHub Copilot’s code contributions.
Communications of the ACM.

Peng, S., Kalliamvakou, E., Cihon, P., and Demirer, M. (2023). The impact
of AI on developer productivity: Evidence from GitHub Copilot. arXiv preprint
arXiv:2302.06590. Accessed: 2025-04-28.

Pinto, G., de Souza, C., Rocha, T., Steinmacher, I., de Souza, A., and Monteiro,
E. (2024). Developer experiences with a contextualized AI coding assistant:
Usability, expectations, and outcomes. In Proceedings of the 2nd Conference on AI
Engineering (CAIN), Lisbon, Portugal. Association for Computing Machinery.
Accessed: 2025-04-28.

Press, O. U. (2025). Collaboration.
https://www.oxfordlearnersdictionaries.com/us/definition/english/collaboration.
Accessed: 2025-04-30.

Rajbhoj, A., Somase, A., Kulkarni, P., and Kulkarni, V. (2024). Accelerating
software development using generative AI: ChatGPT case study. In Proceedings of the
17th Innovations in Software Engineering Conference, pages 1–11.

Rajpurkar, P., Chen, E., Banerjee, O., and Topol, E. J. (2022). AI in health and
medicine. Nature Medicine, 28(1):31–38. Accessed: 2025-04-10.

Rasnayaka, S., Wang, G., Shar