FAUXperience Framework
Designing For Critical Conscious AI Use In Higher Education

Master's Thesis in Computer Science and Engineering

JOSÉ BENER DE SOUSA NUNES

Department of Computer Science and Engineering
Chalmers University of Technology
University of Gothenburg
Gothenburg, Sweden 2024

© JOSÉ BENER DE SOUSA NUNES, 2024.

Supervisor: Sara Ljungblad, Computer Science and Engineering
Examiner: Staffan Björk, Computer Science and Engineering

Master's Thesis 2024
Department of Computer Science and Engineering
Chalmers University of Technology and University of Gothenburg
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Cover: The original hand image was produced using generative AI before the author's final editing. DALL-E 3 prompt: "High fidelity, black woman's hand, holding a diploma."

Typeset in LaTeX
Gothenburg, Sweden 2024

Abstract

Integrating generative artificial intelligence (AI) into education is a complex process that requires the development of clear principles for its ethical and responsible use. Despite numerous potential benefits, such as personalized learning and task optimization, this technology poses ethical concerns regarding biases, misinformation, and the difficulty of differentiating human-created texts from AI-generated content.
To address these concerns, we created the "FAUXperience Framework" to offer guidelines for the ethical use of AI in education by fostering a critical consciousness about this technology's potential benefits and risks. The framework results from user research with stakeholders such as teachers, students, and other interested parties. The study focused on collecting qualitative and quantitative data on how stakeholders experience using generative AI tools in their academic activities and how universities handle the issues related to the misuse of AI. The term "FAUXperience" combines "faux," which refers to artificiality, and "experience," to indicate the use of AI to artificially enhance the educational process in addition to traditional teaching methods. By fostering critical consciousness about AI, the framework aims to promote the benefits of learning associated with AI while drawing stakeholders' attention to its impact on education. In conclusion, the "FAUXperience Framework" encourages teachers and students to be critically conscious actors in AI-powered education.

Keywords: AI, artificial intelligence, FAUXperience Framework, education, critical consciousness, large language models, ChatGPT, critically conscious, ethics.

Acknowledgements

I want to thank my supervisor, Sara Ljungblad, for her guidance throughout my thesis and my entire journey at Chalmers. Her support has been essential in numerous projects. I also want to thank everyone who contributed to this thesis by participating in interviews or responding to surveys. Their feedback and insights were essential to the development of this thesis project. Lastly, I want to acknowledge my family and friends for their constant encouragement.

José Bener de Sousa Nunes, Gothenburg, 2024-06-19

"I believe the cost of getting to know AI – really getting to know AI – is at least three sleepless nights. After a few hours of using generative AI systems, [...]
it dawns on you that you are interacting with something new, something alien, and that things are about to change. You stay up, equal parts excited and nervous, wondering: What will my job be like? What job will my kids be able to do? Is this thing thinking? [...] You realize the world has changed in fundamental ways and that nobody can really tell you what the future will look like."
Ethan Mollick, Co-Intelligence: Living and Working with AI.

Contents

List of Figures
List of Tables
1 Introduction
1.1 Research Questions
1.2 Stakeholders
1.3 Expected Results and Impact on Stakeholders
1.4 Scope and Delimitation
1.5 Ethical Considerations
1.6 Use of Generative AI
2 Background
2.1 The Convergence of Society and Technology
2.2 The Integration of AI Into Education
2.3 Ethics, Challenges, and Dangers of AI
3 Theory
3.1 The Key Concepts and Terminology of AI
3.2 The Four Principles of Co-Intelligence With AI
3.3 Human-Computer Interaction
3.4 Education for Critical Consciousness
3.5 Simulation-based Education
4 Methodology
4.1 Design-oriented Research in HCI
4.2 Research Phase (Discover)
4.3 Synthesis Phase (Define)
4.4 Ideation Phase (Develop)
4.5 Implementation Phase (Deliver)
4.6 Overview of the Planned Research Methodology
5 Process and Execution
5.1 Research
5.2 Define
5.3 Develop
5.4 Deliver
6 Results
6.1 Experiences and Perceptions of AI in HE
6.2 The Limitations of AI Detection Tools
6.3 The FAUXperience Framework
7 Discussion
7.1 Future Work
8 Conclusion
Bibliography

List of Figures

2.1 Student Achievement Distribution.
3.1 Overview of LLM training process.
3.2 Visualization of the Jagged Frontier.
3.3 Centaur and Cyborg on the Jagged Frontier.
3.4 Color Scales.
4.1 Overview of the planned research methodology.
5.1 Key Stakeholders in the higher education system.
5.2 Distribution of respondents by university and program.
5.3 Crossplag, Undetectable, and Sapling Scoring Systems.
5.4 Scribbr, Plagiarism Detector, and GPTZero Scoring Systems.
5.5 Scispace, ZeroGPT, and Content Detector Scoring Systems.
5.6 QuillBot AI Scoring System.
6.1 Students' Perceptions of AI Integration and Guidelines in HE.
6.2 AI-Powered Tools Used by Students.
6.3 Purpose of AI Usage.
6.4 Concerns About Implementing AI in Studies.
6.5 Usage of AI-Content Detection Tools.
6.6 Ethical Considerations in Using AI Detectors.
6.7 Satisfaction with Accuracy of AI Detectors.
6.8 Teachers' Perceptions of AI Integration in HE.
6.9 Integration of AI in Teaching Methods and Courses.
6.10 Motivations for Integrating AI into Teaching.
6.11 Concerns Regarding the Use of AI in Education.
6.12 Concerns About AI in Education.
6.13 Impact of AI on Student Engagement.
6.14 Discussions About AI Risks.
6.15 Challenges in Integrating AI in Courses.
6.16 Communication of AI Usage Policies to Students.
6.17 Guidelines for Bachelor's Students.
6.18 Guidelines for Master's Students.

List of Tables

2.1 Ranking of Occupations Based on AI Impact Potential.
3.1 Classification of Tasks for AI Interaction.
3.2 Examples of Different Perplexity Levels.
3.3 Essential Principles of Social Constructivism.
3.4 Components of a Role-Play Simulation Prompt.
4.1 The four main types of interviews.
4.2 Types of Conceptual Constructs.
4.3 Types of Evaluation.
5.1 List of Interviewees.
5.2 Distribution of Teacher Survey Participants by University and Country.
5.3 Distribution of Student Survey Participants by University and Country.
5.4 Selected Detection Tools.
6.1 AI Detection Tool Scores on Human-Written Texts.
6.2 Detection Rates for 100% AI-Generated Texts.
6.3 Detection Rates for Texts with Mixed AI and Human Content.
6.4 Comparison with Self-Reported Accuracy.
6.5 Total False Positives and False Negatives.

1 Introduction

"Good Afternoon,

When your paper was uploaded on 10/06/23, it was checked through Turnitin. The program returned a positive response for AI. I also checked your paper through a third-party app utilized by the Criminal Justice Department for verification. This app confirmed the Turnitin AI response. Using AI is cheating and not your work. Therefore, you will receive a grade of zero for your paper. Any further violations will be sent to the Student Academic Integrity Committee.

Sincerely,
Robert Ellison"

In October 2023, this email was sent to Marley Stevens, a student at the University of North Georgia [1]. Marley claimed that she used only Grammarly to proofread and correct punctuation and grammar errors in her essay and did not use any content generated by AI chatbots.
Despite her defense and the evidence she presented, the Student Academic Integrity Committee placed her on academic probation for six months. According to the head of Grammarly for Education [1], her essay was mistakenly flagged as AI-generated due to the University's defective AI identification software.

This case illustrates how products with deficient interaction design can negatively impact users and how generative AI tools might affect higher education (HE). Universities worldwide are struggling to adapt to this new paradigm in education [2][3][4] as a result of students' increased use of AI-powered tools and those tools becoming more sophisticated at producing texts that seem written by a human.

In the age of AI-powered tools, the needs and expectations of HE students may differ significantly from those of students in conventional educational systems. Generative AI chatbots, such as ChatGPT, have gained popularity among students due to their ability to generate essays, answer test questions, and even write computer code [5] [6]. The ethical issues surrounding AI use, including privacy concerns, the possibility of AI-generated hallucinations, and the perpetuation of racial and gender biases [7] [8] [9] [10], have been intensified in this new educational reality [11]. A survey conducted in January 2023 reported that more than one-third of college students in the United States were using ChatGPT for academic purposes [12]. The increasing use of AI in education makes it challenging for universities to distinguish between texts written by humans and those produced by machines [5].

As remarkable as today's AI tools are, there are benefits and drawbacks to using them for all stakeholders in the educational system. To utilize them responsibly, we must continue instilling a critical-thinking mindset in students. Even if AI can potentially transform education, failure and struggle are necessary processes that a button click cannot substitute.
We are at a pivotal point where AI will transform education, but classrooms will be more necessary than ever [13] [14].

This emphasis on critical consciousness [15] is especially relevant to HE as modern technologies have fundamentally impacted education and challenged our conventional methods of instruction and learning [16]. The existence of remote learning platforms alongside traditional in-person education requires HE institutions to reevaluate their current curricula and rethink their educational philosophies and approaches [17].

Given these circumstances, many companies have launched generative AI tools since 2022 [18] [19]. As a result, other companies have created tools to identify AI-generated content [20]. Companies are already launching tools to bypass AI detection. This never-ending loop results in companies fabricating issues to offer solutions, perpetuating an ongoing cycle of technological dependency [21] [22]. This dependence raises significant concerns about the reliability and functionality of these tools [23].

As AI-powered tools are continuously developed and integrated into education, we must ensure that the design process of these tools is conducted ethically and transparently with the involvement of all stakeholders [24]. Similar to the problems faced by students such as Marley Stevens, faulty detection tools fail to satisfy user needs and cause social tensions between users and digital systems. Preventing this friction is one of the primary objectives of interface design research [20]. Understanding users' needs and context when interacting with a system, service, or product is necessary to build products that satisfy those needs [25].

We must understand students' and teachers' changing needs and perspectives in an increasingly AI-driven educational scenario.
Students will inevitably employ AI for academic activities as it becomes more widely available, creating new challenges for teachers and HE institutions [14]. They will use AI as a study partner, collaborator, or assistant and seek explanations for tasks that may seem outdated because of AI capabilities. Students will expect to achieve higher productivity levels and understand how AI will impact their careers [14]. Therefore, educational institutions need to develop strategies to address these emerging challenges.

A critically conscious approach to learning becomes particularly relevant in the present educational landscape as artificial intelligence continues to change how we learn and transfer knowledge. According to Paulo Freire [15], the unique contribution of a teacher to the emergence of a new society would have to be the formation of a critical education that helps to form critical attitudes. Due to historical processes, people emerged with a naive consciousness, making them easy prey to irrationality. Only an education that facilitates the transition from naive to critical transitivity, increasing students' capacity to recognize the issues of their time, could prepare people to handle the emotional impact of societal transitions [26].

Problem Statement: Generative AI chatbots, such as ChatGPT, Gemini, and Claude, have gained popularity for their ability to generate human-like text [6], raising concerns about their impact on traditional educational practices [5]. Since its launch in November 2022, ChatGPT has sparked polarizing opinions, particularly for its use in writing essays, answering test questions, and coding, which disrupts the conventional teacher-student dynamic and introduces uncertainties about academic integrity [5]. With reports indicating that many students use ChatGPT and other AI-powered tools for schoolwork, we need to understand what tools they use and how they use them [12].
In this new scenario, universities now face the challenge of distinguishing human-written content from AI-generated text. In response, many companies have introduced AI-detection tools to detect AI-based plagiarism, but concerns about their functionality and reliability grow [23]. As generative AI chatbots rapidly evolve, we must create strategies to ensure their critically conscious use to maintain academic integrity [24].

1.1 Research Questions

In this thesis, we aim to explore the integration of AI into educational settings. Our primary focus will be to investigate the experiences of students and teachers using AI-powered tools and identify the challenges universities face in differentiating between human-created and AI-generated content. Additionally, we will propose strategies for the ethical use of AI in education. This research will address these areas by examining the following questions in higher education:

• RQ1: How do teachers and students experience and perceive the usage of Generative AI chatbots?
• RQ2: What limitations prevent the adoption of AI detection tools?
• RQ3: What strategies can be used to promote a critically conscious use of Generative AI chatbots?

1.2 Stakeholders

This study focuses on users of AI-powered tools in HE, including students, teachers, and other stakeholders involved in this context.

Students

Our main stakeholders are students in HE, particularly those studying interaction design, computer science, and software engineering at Chalmers University of Technology and the University of Gothenburg (GU). We chose these fields because they are expected to significantly impact AI in the near future [27]. Since students are often motivated by job prospects [27], Chalmers and GU are expected to prepare students in these fields with the necessary skills to become the next generation of interaction designers, computer scientists, and software engineers.
These individuals will be responsible for designing, conceptualizing, critiquing, and programming the next generation of AI tools that will shape our society. We particularly want to hear from students in these fields about their views, concerns, and needs.

Teachers

The other stakeholders are educators who teach interaction design, computer science, and software engineering at Chalmers University of Technology and GU. These teachers are already affected by the new education paradigm and are navigating the integration of AI into their teaching. It is particularly relevant to hear their perspectives on this integration and how they adapt to this rapidly changing landscape [28]. Teachers from other universities and countries responded to our survey. Still, the main stakeholders remain the teachers employed at Chalmers and GU.

AI Engineers

Given their expertise and direct involvement in shaping AI technology, AI engineers can provide insights into the complexities of AI technology and its current implications. Their perspective also offers valuable insights into how their work impacts users and society [25].

Chalmers University of Technology

Chalmers University and its Computer Science and Engineering department are integral stakeholders in this thesis. Chalmers is responsible for providing the requirements for this thesis and for judging whether it follows the expected academic quality standards. The university has provided an academic examiner and a supervisor who play fundamental roles in completing this process and contributing to scientific research.

Author

This thesis represents the final effort toward earning an M.Sc. in Interaction Design and Technologies. It is the final step toward acquiring the theoretical and practical knowledge necessary to earn this academic title.
1.3 Expected Results and Impact on Stakeholders

This thesis explores the opportunities and challenges of integrating AI in education by analyzing the user experiences of AI-powered tools in HE contexts and aiming to expand the conversation on the interplay of AI, interaction design, and ethics [29]. The research investigates the implications of using AI-powered tools, focusing on usability and interaction patterns in educational settings. It assesses current AI applications, identifies ethical concerns, and proposes improvements prioritizing user well-being, privacy, and equity. The anticipated outcome of this research is the proposal of a comprehensive framework guiding ethical and critical user interactions with AI-powered tools.

The thesis aims to benefit various stakeholders in HE by promoting the ethical and critical use of AI-powered tools: For students, the framework addresses their concerns and needs. It gives them the strategies to assess and ethically use AI in their academic activities. Teachers can learn how students use AI tools and the ethical measures they implement to maintain academic integrity. This understanding can help teachers develop strategies to foster critical consciousness among students. AI engineers can benefit from understanding how their tools affect users' well-being, privacy, and equity, enabling them to advocate for responsible AI innovation. Lastly, for Chalmers University of Technology and the University of Gothenburg, the thesis offers a comprehensive overview of the academic experiences of students and teachers with AI-powered tools. This overview can inspire the university to improve its educational standards, contribute to innovative research, improve student engagement, ensure ethical AI integration, and provide professional development for teachers.
Overall, this thesis aims to promote a more ethical, critical, and responsible integration of AI in HE by aligning the results with the needs and perspectives of these stakeholders.

1.4 Scope and Delimitation

This project examines how students and teachers use generative AI chatbots in learning and teaching activities. We are particularly interested in analyzing these AI tools' usability, user experience, and interaction design within the HE context. However, we will not cover the impact on other creative-based educational programs adapted to this new context.

Our goal is to evaluate the integration of AI tools in HE, focusing on the ethical dimensions of students' and teachers' reflections on their interactions with these tools. Additionally, we aim to identify the main tools used, how they are used, and what motivates their selection. Based on this research, we will propose improvements in using these AI tools to fulfill user needs better. By analyzing how stakeholders interact with AI-powered tools, we want to determine whether the tools respond well to users' behaviors [17]. Furthermore, we will critically evaluate AI's ability to meet the needs of both students and teachers within an ethical framework.

This study focuses more on the accuracy of AI-powered tools than on the level of AI reasoning; given that these tools are still in the early stages of development despite their widespread use, our attention is primarily drawn to their potential to produce reliable and accurate information. A large body of research has focused on developing automated fact-checking systems [30]. However, there are many obstacles to solving the credibility issues with these systems. Therefore, before we focus on the level of AI argumentation, we must address the ability of AI-powered tools to provide trustworthy and dependable information, especially in education, which is our goal.
1.5 Ethical Considerations

When conducting scientific research, we must adhere to research ethics to maintain the credibility and validity of scientific findings. These ethics provide the values, standards, and protocols that govern scientific research [31].

An essential part of research ethics is how researchers interact with participants and the impact of the research on society at large [32]. This master's thesis follows the principles of fairness, diversity of perspectives, and honesty in reporting findings using ethical research principles. All ethical requirements are carefully followed during data collection to ensure participants' well-being and comply with the General Data Protection Regulation (GDPR) [33].

Students and teachers were actively involved in the research process, and explicit consent procedures were in place. Participants were informed about the project's purpose and scope and their right to withdraw consent for participation and data collection at any time [31]. Additionally, the background section examines some of the ethical aspects of AI in education.

To build trust and encourage open sharing of experiences, we must establish positive and respectful relationships with participants during interviews. We must also take strict measures to protect participants' confidentiality in accordance with the GDPR [33]. All participant data remains anonymous unless we receive explicit consent for identification. Furthermore, all collected data is securely stored and will be deleted after the specified retention period stated in the consent form [34].

Surveys are anonymous to ensure participants' autonomy and encourage honest responses. We recorded interviews with participants' consent and assured them in advance that the personal data we collected was only for research purposes. We securely and confidentially store the recorded data and adhere to data protection regulations [34].
1.6 Use of Generative AI

Throughout the writing process, the author used OpenAI's ChatGPT-4, ChatGPT-4o, and Grammarly to refine the writing and receive guidance on academic writing standards, the text's readability, formatting references, grammar and spelling checks, and overall coherence. The suggestions provided were based on the author's original content. After receiving the recommendations, the author carefully and critically assessed each one and made grammatical adjustments where necessary.

2 Background

Following the rise of the Internet, which triggered the Third Industrial Revolution, the world is undergoing a new revolution led by AI, known as Industry 4.0 [35]. This process is characterized by incorporating human behavior into machines and systems, reshaping the world through computing, and marking a true Intelligence Revolution [36]. Economically influential nations and the world's largest corporations are actively working to dominate every aspect of society with substantial investments in AI innovation [35] to gain a competitive advantage over other countries and commercial rivals. This intense competition is causing a profound impact across all areas of society, including healthcare, business, and education [36] [37]. Given these circumstances, this section will explore the factors driving the evolution of education within the AI Era [38][39][40].

2.1 The Convergence of Society and Technology

General-purpose technologies (GPT) are technologies whose applications extend to diverse and generic purposes. A concept introduced by Timothy F. Bresnahan and Manuel Trajtenberg in 1992, GPTs are characterized by their widespread use, potential for technical improvement, and ability to enhance overall economic productivity. Despite differing definitions, historical examples of GPTs include the steam engine, electricity, the combustion engine, and computers [36].

Using the criteria set by Bresnahan and Trajtenberg, AI meets all three [36].
Its widespread applications prove its pervasiveness, and its rapid advancements show its potential for improvement. Although AI's impact on productivity is still emerging, several forecasts predict significant productivity increases over the decade ahead [36].

Despite AI meeting the GPT criteria, by defining AI as a System Technology and focusing on its systemic nature, we can emphasize its qualitative changes, complex integration into various societal sectors, and multifaceted impact [36] [41]. This systemic view reveals how the social context significantly influences AI's development, including the norms and values of the developers and companies responsible for it. For instance, AI used in hiring, loan approval, or criminal proceedings may perpetuate biases related to gender, ethnicity, or age [41]. Recognizing AI's systemic nature highlights the importance of ethics in its development and use. It empowers policymakers, businesses, civil society, and AI developers to use AI ethically and responsibly, thereby promoting positive outcomes [41].

2.2 The Integration of AI Into Education

The development of contemporary educational theories and methodologies has made it possible to enhance instruction-based learning and give students more effective learning experiences [42]. According to some researchers [43] [44], teachers can employ AI-powered tools to provide students with individualized learning opportunities in addition to the conventional passive learning strategy. In content delivery, some experts even speculate that AI may potentially replace instructors [13].

In contrast to these perspectives, Ko [13] argues that learning technologies are not value-neutral by examining the impact of Large Language Models (LLMs) on student performance and refutes the comparison between calculators and AI in producing minimal influence on education.
Ko believes that AI could potentially harm student learning and accelerate the deterioration of public education. Leveraging established knowledge of educational dynamics and learning theories, Ko anticipates ambiguous future repercussions and advocates for a thorough evaluation of the consequences of LLMs in educational settings.

We can gain a more comprehensive understanding of integrating AI into education by taking into account Ko's [13] perspective that technologies are not value-neutral along with the notion that AI can provide students with innovative learning experiences [43] [44]. Considering this technology's possible benefits and drawbacks, we can analyze how it can drastically change the educational system by creating new difficulties for educational institutions in maintaining academic integrity. According to Mollick, we are on the verge of an era in which AI will revolutionize our education by reshaping the learning experience. The only question is whether we can guide this change in a way that fulfills the ideals of expanding opportunities for everyone and nurturing human potential [14].

AI as a Tutor

For millennia, we have known that the most effective method of learning is individual tutoring, where tutors teach students in a one-on-one format following the students' own learning pace [45]. If a student is struggling with a concept, the tutor can adjust the pace of instruction until the student understands it. In the same way, the instructor can quickly advance or deepen a topic if the student demonstrates a keen interest in it and mastery of it [45] [46].

However, considering the global population's growth, one-on-one tutoring is not feasible and is prohibitively expensive for mass public education.
Due to these reasons, when the public education process began to develop in the 18th century, the educational system divided students into groups of about thirty per classroom and created standardized teaching and assessment procedures, such as collective lectures and regular test assessments [45]. Despite its numerous shortcomings, the modern mass public education system has significantly contributed to increasing literacy rates globally, and its core principles have remained the same [14]. However, Benjamin Bloom’s work in the 1980s demonstrated that one-on-one tutoring significantly outperforms traditional classroom teaching. In his 1984 paper The 2 Sigma Problem (2.1), Bloom reported that tutored students scored better than 98% of those taught in a classroom setting. This phenomenon, named the two sigma problem, presented an ongoing challenge: achieving similar results with group instruction, given that personalized tutoring is often impractical on a large scale [14]. Figure 2.1: "Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class (the average tutored student was above 98% of the students in the control class)" [47]. Created by the author, adapted from [47]. With the advancement of AI, one potential solution is the flipped classroom, the pedagogical model in which online lectures and digital materials are provided to students before class so they can apply them in the classroom through collaborative activities and discussions [48] [14]. However, the success of flipped classrooms is mixed, largely due to the lack of quality resources and teacher time. AI presents itself as a promising partner, not a replacement, in overcoming these challenges.
AI systems can help generate customized active learning experiences by offering personalized instruction tailored to each student’s needs and adjusting content based on performance. This allows students to engage more effectively with the material at home, coming to class better prepared for hands-on activities and discussions. Teachers can then focus on meaningful interactions and use AI insights to identify and support students’ specific needs [14]. Despite the ongoing challenge of achieving Bloom’s two-sigma effect through group instruction, incorporating models such as flipped classrooms shows great potential for improving learning experiences. Traditional schooling will continue offering unique opportunities for collaborative problem-solving and socializing. Classrooms will continue to provide value, even with AI tutors, but these tutors will transform the field of education, enhancing the traditional learning experience [14]. AI as a Teacher Every GPT has impacted jobs throughout history. In the case of AI, studies show that almost all professions overlap with AI capabilities. But unlike past automation revolutions targeting repetitive and dangerous tasks, AI now overlaps most with highly compensated, creative, and educated roles [14]. College professors make up 8 of the top 10 jobs overlapping with AI [49].

Table 2.1: Ranking of Occupations Based on AI Impact Potential [49].
1. Telemarketers
2. English Language and Literature Teachers, Postsecondary
3. Foreign Language and Literature Teachers, Postsecondary
4. History Teachers, Postsecondary
5. Law Teachers, Postsecondary
6. Philosophy and Religion Teachers, Postsecondary
7. Sociology Teachers, Postsecondary
8. Political Science Teachers, Postsecondary
9. Criminal Justice and Law Enforcement Teachers, Postsecondary
10. Sociologists

While AI may overlap with many jobs, it does not necessarily mean these roles will be replaced.
Jobs are composed of many tasks, and AI can start performing some of them, particularly repetitive and tedious ones, without eliminating these roles. For instance, teachers could assign administrative tasks to AI while still maintaining their roles [14]. Tasks form the basis of jobs, and a teacher’s role involves a range of responsibilities: teaching, researching, writing papers, doing administrative paperwork, writing recommendation letters, applying for grants, and more. The occupation title "teacher" encapsulates many tasks. As AI automates administrative tasks, teachers could focus on work that demands human qualities like creativity and critical thinking [14]. AI could even eventually deliver lectures. However, the environment in which a job operates plays an important role in this task division. In the case of teachers, factors like institutional traditions and student acceptance influence the integration of AI into the university. Thus, understanding AI’s role requires looking at both task-level impacts and the systemic context within which jobs exist [14]. AI as a Leveler A study [50] on the impact of generative AI on the writing tasks of college-educated workers reported that those using ChatGPT completed their tasks faster and with higher quality, reducing the productivity gap among workers. Individuals with the lowest abilities benefited most, as the tool helped them improve their performance and leveled the overall productivity distribution. ChatGPT made the work process more efficient by enhancing productivity, reducing worker performance disparities, and turning AI into a performance leveler. When it comes to productivity’s effects on education, AI has the potential to help struggling students improve. For instance, AI can assist students who struggle with writing in creating well-written essays. In activities requiring creativity, AI can help students generate ideas.
While it’s difficult to predict the exact effects on productivity in any specific context, educational institutions can influence the outcomes of AI integration, whether positive or negative, through strategic AI implementation [14]. By using AI-powered tools in academic settings, educational systems can focus on teaching students to be central participants in the "generative loop" by encouraging them to apply their own expertise to problems instead of just relying on AI for solutions [14]. This approach can help maintain the human element in education, preventing skill-based education from becoming less valuable and avoiding the distortion of our education system into prompt-based learning, where automatically generated prompts are reported to perform better than expert-human prompts [51]. 2.3 Ethics, Challenges, and Dangers of AI AI ethics refers to a set of values, principles, and techniques that apply widely accepted standards of right and wrong to guide moral behavior in developing and utilizing AI technologies. Given its significant potential for profound ethical consequences, this technology can both enhance and disrupt human lives [52]. The line separating the consequences that enhance human lives from those that disrupt them is conditioned by the biggest risk AI presents: there is no particular reason that AI should share or accept our standards of right and wrong, or our view of ethics and morality. At this stage, AI is guided by the values imbued by the companies leading its commercial development [14]. The reality is that the organizations working on AI tools, even when they try to create less biased, more accurate, and more helpful technologies, end up influencing AI and can introduce new types of bias.
Given that North American enterprises are behind the biggest AI models on the market [53], many AI systems appear to align with a liberal, Western, pro-capitalist worldview, as the AI is designed to avoid making statements that could cause controversy for its creators, who are generally liberal, Western capitalists [14]. Guidelines: The Genie is Out of the Bottle The approach to decision-making in different educational systems varies, with some being centralized and others localized. However, as AI-powered tools become increasingly complex, it also becomes challenging for teachers to make well-informed decisions about students using them. As these tools become more commonly used, concerns about data privacy, security, bias, transparency, and accountability will continue to grow. Maintaining the benefits of AI tools will become more difficult in the face of this evolving and complex educational system [54]. In this context, guidelines and official recommendations play an important role in harnessing the potential of AI while mitigating its risks. These guidelines can serve as a compass, directing all educational stakeholders toward safe, fair, and effective AI utilization. They are valuable resources that enhance the understanding of AI and protect those most affected by these new technologies [54]. While developing such guidelines, institutions will establish criteria to determine which assignments allow AI assistance and which do not. School assignments must be revised; in-school writing assignments, non-internet-enabled computers, and written exams might be options to ensure students learn basic writing skills [14]. When we consider how AI-powered tools affect education, we can compare the situation to when calculators became popular in North American schools in the mid-1970s. Many teachers wanted to use calculators in their classrooms because they saw they could make students more interested in learning.
However, even though calculators became popular in the mid-1970s, they only became part of the North American school curriculum in the mid-1990s. AI won’t replace the need to learn how to write and think critically [14]. However, because AI technology is advancing so fast, the policymakers responsible for education should take less time than those who decided to include calculators in the school curriculum to prevent risks and dangers. After all, "the genie is out of the bottle" [14]. 3 Theory This chapter provides an overview of the project’s theoretical foundation, emphasizing relevant theories grouped around three primary topics: understanding artificial intelligence, the intersections between AI and interaction design, and modern perspectives on education from a critical consciousness perspective. The key concepts and terminology of AI theory are examined in the section on AI Concepts. Investigating the intersections between AI and interaction design addresses the mechanics underlying AI-generated text recognition, covering the fundamental ideas of large language models. Lastly, Paulo Freire and other educational theorists are used to analyze the social dynamics of learning and recognize learner diversity as a strength. 3.1 The Key Concepts and Terminology of AI Transformer Architecture. In 2017, Google researchers published a paper that introduced a novel neural network architecture for language understanding called the Transformer [55] [14]. This architecture significantly influenced the AI community and served as the basis for most large language models (LLMs) today [14]. For comparison, early text generators did not rely on contextual comprehension but used simple rule-based word selection [14].
The Transformer architecture solves this issue by employing an "attention mechanism," which allows the AI model to assess the importance of different words or phrases in a text block [56]. Large Language Models and Tokenization. Building on the Transformer architecture, new forms of AI, called Large Language Models (LLMs), analyze a text and predict the next word or part of a word [56] [14]. This process relies on tokenization, where a text is divided into smaller units, referred to as tokens, before being fed into the model. For instance, the tokenization of hypertension produces the following: "hy," "per," and "tension" [56]. ChatGPT is essentially a very sophisticated autocomplete [14]. Once you give it some preliminary text, it keeps writing text based on what statistical analysis predicts will probably be the next token in the sequence [14]. Pre-Training. The process of prompting LLMs to generate responses happens in several stages [57] [56]. The initial phase, known as pre-training, consists of feeding the model a wide variety of data from various online sources such as websites, books, and digital documents so it learns to comprehend and generate human-like texts. Humans do not supervise this process; the AI independently analyzes textual sources to identify patterns, structures, and content in human language [57] [56]. Weights. LLMs can create a model that emulates human-written texts by using numerical values, known as weights. These weights tell the AI how likely it is for different words or word components to appear together or in a specific order. For instance, in ChatGPT’s first version, 175 billion weights were linked to produce human-like text [14]. The LLM Training Process. The training process starts with a large number of weights without any helpful information about word relationships. After that, the model is trained using a large volume of textual input [56].
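As a toy illustration of the tokenization step described above, the sketch below greedily splits a word into the longest known subword tokens. The fixed vocabulary is purely hypothetical; real tokenizers (e.g., the byte-pair-encoding tokenizers used by GPT-style models) learn their vocabularies from data and are considerably more sophisticated.

```python
def tokenize(text, vocab):
    """Greedily split `text` into the longest subword tokens found in `vocab`."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to a single char
            i += 1
    return tokens

# Hypothetical subword vocabulary, chosen to reproduce the example in the text.
vocab = {"hy", "per", "tension", "ten", "sion"}
print(tokenize("hypertension", vocab))  # → ['hy', 'per', 'tension']
```

The model never sees raw text; it sees only sequences of such tokens, which is why the same word can be split differently depending on the vocabulary the tokenizer was trained with.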
Throughout the training process, the model adjusts its weights based on the input data, learning which word combinations tend to occur and which do not. It then uses its current weights to predict words in a sequence. Each iteration compares the produced text with the original text, looking for discrepancies [57]. The model then modifies the weights to enhance its predictions, continuously refining word connections to generate contextually appropriate text [14]. Figure 3.1: LLMs "learn" at every training phase using increasingly focused inputs. The LLM training process begins with pre-training, where the model learns from unlabeled and proprietary data without human supervision. Later, more specific datasets and human input are added during fine-tuning. Following this, individuals with specialized knowledge use prompting techniques to modify the LLM into an augmented model to perform specialized tasks. Created by the author, adapted from [56]. Fine-tuning. Public domain books, research articles, and other free online resources are typical examples of LLM training datasets. Due to the variety of the datasets, the AI also learns the biases, errors, and falsehoods in the text samples during pre-training [14]. At this stage, no guardrails are in place to prevent AI from disseminating harmful content. Therefore, LLMs undergo an improvement process called fine-tuning to mitigate this issue [57]. This process involves further training a pre-trained model on specific datasets, such as medical records for a healthcare application or customer service logs for a contact center support system [57] [56].
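The idea that weights encode how likely tokens are to follow one another can be sketched at a miniature scale. The toy "model" below simply counts word bigrams in a tiny made-up corpus and uses those counts as its weights; a real LLM instead learns billions of weights by gradient descent, but the intuition of deriving next-token predictions from numbers fitted to training text is the same.

```python
from collections import defaultdict

# Hypothetical miniature corpus; any text would do.
corpus = "the student reads the book and the student writes the essay".split()

# "Training": derive weights (here, simple bigram counts) from the data.
weights = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    weights[prev][nxt] += 1

def predict_next(word):
    """Predict the most likely next word given the learned weights."""
    followers = weights[word]
    return max(followers, key=followers.get) if followers else None

print(predict_next("the"))  # → 'student' ("student" follows "the" twice)
```

Sampling from such learned probabilities, token by token, is essentially what the "sophisticated autocomplete" description of ChatGPT refers to, only at an enormously larger scale.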
Additionally, this fine-tuning can be enhanced further by humans scoring the quality of multiple model outputs, a process known as reinforcement learning from human feedback (RLHF), which employs human workers evaluating AI responses according to different standards, such as removing violent or sexual content from the results or screening them for correctness [14]. 3.2 The Four Principles of Co-Intelligence With AI As artificial intelligence becomes more widespread, we must establish comprehensive guidelines for interacting with this technology. These guidelines should be based on general principles rather than tailored to specific versions of the technology, especially considering the rapid advancements in LLMs [14]. When we apply these principles, we learn about AI’s strengths and understand when not to utilize it, as the principles function as guardrails for a future in which human and artificial intelligence are integrated [14]. Principle One: Always Invite AI To The Table The first principle is based on an experimental, task-based approach. Since AI is a general-purpose technology, there isn’t one single manual explaining its benefits and limitations [14]. This lack of clarity makes it difficult to determine which tasks are best performed by these tools and which are not. By embracing this experimental approach, users can intimately identify AI tools’ nuances, limitations, and abilities to assist in their specific tasks and scout out their weaknesses [14]. "As we grow more familiar with LLMs, we can harness their strengths more effectively and preemptively recognize potential threats [...], equipping ourselves for a future that demands the seamless integration of human and artificial intelligence" [14]. Mollick suggests that to understand AI’s impact, we need to comprehend how human interaction with AI changes and the capabilities and limitations of AI in performing tasks [14]. To explain this task-oriented perspective, Dell’Acqua et al.
[58] introduced the metaphor of AI as a Jagged Frontier. The Jagged Frontier In the Jagged Frontier metaphor, AI is represented by a fortress wall, with tasks inside being easy to handle and those outside being more challenging [3.2]. However, the wall is invisible, making it difficult to determine which tasks fall into each category. Basic math might be outside the wall and challenging for AI, while idea generation might be inside and easy for AI [14]. We must experiment with different tasks, dedicate time, and gain experience to understand AI’s capabilities, because this experimentation is what gives shape and contours to the Jagged Frontier [14]. For instance, in educational contexts, students should apply AI in various academic tasks to understand and map out which tasks are best suited for them and which ones they can outsource to machines. Once they know the wall’s shape and contours, they can make informed decisions about utilizing AI, benefiting from its strengths, and addressing its limitations. At the task level, we must identify the tasks humans should carry out exclusively, because AI, while unsuitable for tasks that involve ethical considerations, is capable of handling many others. By understanding the role of AI at the task level, we can assign tasks based on the level of humanness they require. Figure 3.2: The fortress wall represents AI’s abilities. Tasks closer to the center are easier for AI to handle, while those farther away are more challenging. However, the wall is invisible, so some tasks that seem equally difficult may be on opposite sides of the wall. Created by the author, based on [58] [14]. Centaurs and Cyborgs The Centaur and Cyborg concepts aim to combine the strengths of human intelligence and machine capabilities by demonstrating various ways to integrate AI into the workplace [14].
If we apply this abstraction to education, centaur tasks can involve summarizing technical papers, where the AI’s ability to summarize complements the author’s deeper understanding of a topic. Cyborg tasks may include using AI for writing assistance, guiding the author through challenging writing sections by providing valuable insights when simulating human feedback. This combination of human intelligence and AI support illustrates how Centaur and Cyborg [3.3]1 approaches can improve productivity and creativity in a mutually beneficial way [14].

Table 3.1: Classification of Tasks for AI Interaction [14].
Just Me Tasks: Tasks where AI is not useful or should remain human for personal or ethical reasons. Example: Expressing human values.
Delegated Tasks: Tasks assigned to AI and checked by humans; often tedious or repetitive, saving human time. Example: Scheduling appointments.
Automated Tasks: Tasks left entirely to AI without human oversight; reliable and scalable by AI. Example: Spam filtering.

1Image produced using generative AI. DALL-E 3 Prompt: High fidelity, head-shot of a half-centaur, half-black cyborg on a white background, front view.

Centaur panel: Human task example: decide on the statistical approaches for analysis. AI task example: the AI produces graphs. This strategic division of labor leverages the unique abilities of both humans and AI, maximizing efficiency and effectiveness. Centaurs work by clearly dividing tasks between humans and AI, akin to the separation of the human torso and horse body in the mythical centaur.
Cyborg panel: Human task example: an author asks AI for style variations and text improvements. AI task example: the AI provides stylistic options and refines clunky paragraphs. This intertwined collaboration allows for continuous, dynamic interaction between human creativity and AI assistance. A blend of human and machine efforts.
Cyborgs work in tandem with AI, frequently switching back and forth between human and machine inputs. Figure 3.3: The Centaur and Cyborg are two approaches to co-intelligence that integrate the work of person and machine. Centaur work involves a clear separation of work between humans and machines and a strategic distribution of labor according to each entity’s strengths. In contrast, cyborg work deeply integrates human and machine elements, blending human efforts with AI. Created by the author, based on [58] [14]. Principle Two: Be The Human In The Loop The "human in the loop" principle refers to incorporating human judgment when operating automated systems. Overall, this principle fosters a sense of responsibility and accountability. We maintain human control over the technology, ensuring that AI-driven solutions align with human values, ethical standards, and social norms [14]. While AI training methods now involve human judgment, the increasing automation and delegation of tasks to AI require a focus on maintaining human values in the decision-making process. LLMs, as text prediction machines, cannot differentiate between true and false information, so while they may excel at generating plausible answers, their answers often contain subtle errors, which can pose potential risks [14] [59]. Over time, as we learn more about AI-powered tools and comprehend their complex functioning, our confidence in their ability to perform without explicit guidance decreases. However, by recognizing their limitations, these tools can become effective assisting resources in a variety of disciplines [59]. Therefore, when interacting with generative chatbots, it’s helpful to think of AI as an entity trying to optimize several functions. One of the most important functions it’s programmed to optimize is satisfying the user by providing an answer the user will like [14] [60]. This optimization may lead AI to prioritize user satisfaction over information accuracy.
For instance, current generative chatbots may make up something if they haven’t "learned" it, because prioritizing "making the user happy" takes precedence over "being accurate" [14]. The generation of these incorrect answers is known as "hallucination." While newer, larger LLMs exhibit significantly fewer cases of hallucination compared to older models, hallucination is still a weakness of this technology [60] [14]. Principle Three: Treat AI Like A Person Anthropomorphism is the tendency to attribute human-like traits to non-human entities. Both behavioral psychology and evolutionary biology have extensively studied the causes and implications of this tendency, demonstrating its significance in human history [61]. Therefore, it is no surprise that we are inclined to anthropomorphize artificial intelligence, mainly due to the widespread use of chat-based interactions. We often describe these intricate algorithms and computations as "understanding," "learning," and even "feeling," which fosters a sense of familiarity and relatability but also causes potential confusion [14], which can be exploited by malicious individuals who manipulate users by gaining their trust in the system [61]. Therefore, treating AI like a person and telling it what kind of person it is refers to creating a distinct and clear AI persona as a strategy to improve Human-AI interaction. This process involves defining the AI’s identity, its role in addressing specific issues, and the context in which it operates [14]. Many AI models produce generic outputs by default, but by providing context and constraints, we can tailor the tone and direction to serve a specific purpose. With this collaborative editing process and continuous guidance, we can use AI as a cooperative co-intelligence [14].
Principle Four: Assume This Is the Worst AI You Will Ever Use Given the speed at which generative AI is developing, very soon, tasks that we thought were intrinsically human will be able to be done by AI. Even if LLMs are software engineering products, we should view AI’s limits as temporary. Traditional software is dependable, predictable, and, when developed correctly, produces consistent results [14]. On the other hand, AI is unpredictable, unreliable, and capable of hallucinations. AI doesn’t act like software; it acts like a human being. This mindset, which aligns with the "treat it like a person" principle, can significantly improve our comprehension of how and when to use AI in a practical, if not technical, sense [14]. 3.3 Human-Computer Interaction The field of Human-Computer Interaction (HCI) investigates the complex interplay between interactive systems and humans. Through its study, design, and evaluation, HCI seeks to enhance these systems’ usability, effectiveness, and overall user experience [62]. Interaction occurs when a person uses computing technology to perform a task, utilizing their senses and responses to monitor and control devices, machines, or systems enabled by computing technology [62]. By drawing on the knowledge and techniques of diverse scientific disciplines such as computer science, psychology, and ergonomics, HCI continually adapts its research methodologies to cater to users’ changing needs and capabilities. As a result, many methodological advancements have emerged to address evolving trends in the field [62]. Interaction Modalities Human-computer interactions rely on input and output devices. Input devices enable users to transmit data and instructions to a computer (e.g., keyboard, mouse, microphone), while output devices enable the computer to communicate information back to the user (e.g., monitor, speaker, headphones) [63].
Interaction modalities refer to the exchange of information between humans and computers. This exchange involves input and output devices, information channels, and sensory modalities [63]. To make human-computer interaction more natural, we can combine many modalities, such as drawing, writing, speaking, and gesturing. The human senses, which include hearing, smell, sight, touch, and taste, are all related to sensory modalities essential for interaction. These sensory modalities are connected to distinct information channels; for example, sight is mainly associated with the visual channel, whereas hearing is associated with the auditory channel [63]. Conversational Interfaces Under this premise, chatbots like ChatGPT, Gemini, and Claude are designed as conversational interfaces to emulate human speech. Through their interaction modalities, these interfaces enable machines to give the impression of being human to other humans [64]. Different modalities can affect how additional or different information can be conveyed and make interactions more natural and immersive [65]. Conversational interfaces can make sophisticated human-computer interactions easier by allowing users to express themselves naturally and directly, such as by typing or speaking [65]. Natural Language Processing (NLP) is one technology that facilitates this level of personalization. NLP allows users to engage using their own language and preferred style rather than being restricted to a limited set of pre-defined interaction methods [65]. Natural Language Processing NLP is a popular artificial intelligence application that automates the reading, analyzing, and generating of human language [36]. From an interaction design standpoint, NLP enables machines to emulate coherent conversations with users through natural language-based prompts. The primary goal of the NLP field is to develop algorithms that can comprehend human language and perform interpreting tasks.
These algorithms can distinguish between letters and words, label text elements, and analyze text direction to infer meaning [36]. Deep learning has emerged as a popular approach to designing AI-powered interactions that can understand human language, with chatbots being a prominent example. Chatbots are automated chat systems that interact with users through prompts, analyze questions, and select appropriate responses or follow-up questions using decision trees [8]. Chatbots are extensively valuable for education, aiding students in language translation, answering questions, and explaining topics to facilitate learning. However, concerns have been raised regarding the future of education due to potential threats posed by AI, including issues with assignment integrity and online exams, dependence on generative AI tools, challenges in evaluating ChatGPT-generated content, and potential impacts on critical thinking and problem-solving skills [8]. The Mechanisms Behind AI-Generated Text Detection For over two decades, universities have used text-matching software to detect plagiarism. The recent development of generative AI tools has motivated the creation of technical solutions to distinguish between human-written and AI-generated texts [66]. With the popularization of generative AI chatbots and the dissemination of AI-generated texts in many areas, the software industry has responded by introducing dozens of tools for AI-generated text detection. However, we must understand the mechanisms behind AI-generated text detection to judge whether these tools can distinguish between human-written and machine-generated content [66]. Various techniques for analyzing a text, such as readability, linguistic analysis, frequency counting, perplexity-based filtering, and diversity and vocabulary richness, are used to determine whether the text is human-written or AI-generated [67].
The detection tools apply these features to discover patterns and information in the text that are invisible to the human eye. In general terms, LLMs are systems trained to predict the likelihood of a specific character, word, or string (called a token) in a particular context [66]. The AI detector similarly tries to indicate whether the source material results from predicting a specific character, word, or string in the same context. If the answer is yes, it flags the text as probably AI-generated. This detection process examines two key metrics: perplexity and burstiness. The lower the values of these variables, the greater the likelihood that AI has generated the text. Perplexity. Perplexity is a metric that reveals the degree of unpredictability in a text. Lower perplexity values indicate that the text was likely generated by AI language models, which tend to make the text more coherent and easier to read but more predictable. For instance, human-written texts tend to have higher perplexity due to more creative language choices and typos [68] [69]. Language models function by anticipating the next word that would fit naturally in a sentence. For instance, for the sentence fragment "I couldn’t get to sleep last...", there are many possible continuations, as outlined in the table below [68].

Table 3.2: Examples of Different Perplexity Levels [68].
"I couldn’t get to sleep last night." Low perplexity: probably the most likely continuation.
"I couldn’t get to sleep last time I drank coffee in the evening." Low to medium perplexity: less likely, but it makes grammatical and logical sense.
"I couldn’t get to sleep last summer on many nights because of how hot it was at that time." Medium perplexity: the sentence is coherent but quite unusually structured and long-winded.
"I couldn’t get to sleep last pleased to meet you." High perplexity: grammatically incorrect and illogical.

Burstiness. Burstiness refers to the patterns observed in word choice and vocabulary size.
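The perplexity metric described above has a standard formulation: the exponential of the negative average log-probability the model assigned to the observed tokens. The sketch below computes it from hypothetical token probabilities (the numbers are illustrative, not taken from any real detector); a text whose tokens the model found likely yields low perplexity, while a "surprising" text yields high perplexity.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p(token_i))).

    token_probs: the probabilities a language model assigned to each
    token it actually observed in the text."""
    n = len(token_probs)
    avg_log_prob = sum(math.log(p) for p in token_probs) / n
    return math.exp(-avg_log_prob)

predictable = [0.9, 0.8, 0.9, 0.85]   # model found every token likely
surprising  = [0.9, 0.05, 0.6, 0.01]  # model was often "surprised"

print(perplexity(predictable))  # low value: text reads as predictable
print(perplexity(surprising))   # high value: text reads as unpredictable
```

This is why AI-generated text, which is sampled from the model's own high-probability continuations, tends to score low, while human writing, with its odd word choices and typos, scores higher.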
This metric indicates that AI-generated text displays a higher occurrence of clusters of similar words or phrases within shorter text segments. In contrast, human-written texts demonstrate a wider variety in their word selections, showing a more extensive range of vocabulary [70].

Data Visualization and Color Scales

AI-detecting tools employ various data visualizations to illustrate human-generated and AI-generated content levels. "There are three fundamental use cases for color in data visualizations" [71]: to distinguish groups of data from each other, to represent data values, and to highlight key information; the colors and how we use them vary in these scenarios [71]. An extensive theory underlies the use of color to distinguish data, and we will provide an overview of the main color scales commonly used to visualize different types of data [71]. This will enhance the clarity of the analysis of AI detection tools in the Process and Execution chapter.

Qualitative Color Scale. We use a qualitative color scale to distinguish items or groups with no inherent order, such as distinct countries on a map. In this scale, we select colors that look different from each other and are visually equivalent. The colors should not give the idea of order, as in a color gradient; such colors would create an apparent order among the colored elements, which, by definition, have no order [71].

Sequential Color Scale. We can use a sequential color scale to represent quantitative data values such as income, temperature, and speed [71]. In this scale, we use a color sequence to represent which values are larger or smaller than others and the distance between two specific values [71].
We should be able to perceive this color scale as varying uniformly across its entire range; to accomplish this uniformity, "we can use a single hue (e.g., from dark blue to light blue) or multiple hues (e.g., from dark red to light yellow) arranged in color gradients" [71].

Accent Color Scale. Color can also help highlight specific parts of the data. When we emphasize categories or values in a dataset, we can help the reader understand the information by emphasizing the important figure elements. One technique to achieve this effect is to color the element we wish to highlight in a hue or set of colors that contrast with the rest of the figure. In this type of color scale, we can find both a set of subdued colors and a matching set of stronger, darker, and more saturated colors [71].

Figure 3.4: An example of a qualitative color scale at the top includes a set of distinct colors. In the middle, we have an example of a sequential color scale, which is a monochromatic scale that progresses from dark to light blue. Lastly, at the bottom, we have an example of an accent color scale with four gray base colors paired with three accent colors. Created by the author, based on [71].

Information Visualization for Color-Vision Deficiency

When it comes to sequential color scales that represent data values [71], we need to consider three critical conditions: the colors need to clearly indicate which data values are larger or smaller than others, the differences between colors need to visualize the corresponding differences between data values, and we need to design with color-vision accessibility in mind [71]. When designing and selecting colors, we must always be mindful of the possibility that a good proportion of readers may have some form of color-vision deficiency (i.e., are
colorblind), which means that some readers might have difficulty distinguishing certain types of colors, such as red and green (red-green color-vision deficiency) or blue and green (blue-yellow color-vision deficiency) [71]. "The technical terms for these deficiencies are deuteranomaly/deuteranopia and protanomaly/protanopia for the red-green variant (where people have difficulty perceiving either green or red, respectively) and tritanomaly/tritanopia for the blue-yellow variant (where people have difficulty perceiving blue)" [71].

3.4 Education for Critical Consciousness

The concept of critical consciousness (CC), or conscientização in Portuguese [72], was developed by the Brazilian philosopher Paulo Freire as an approach to teaching literacy skills to peasants in rural Brazil and as a tool to help them gain a critical consciousness of their social reality. This concept emphasizes the importance of learning to read not only the written word but also the world around us. By doing so, marginalized groups can recognize and analyze systems of inequality and commit to taking action against these systems [73] [74] [75]. Through practical initiatives in Brazil, Freire reported how marginalized communities, as they actively engaged in critical consciousness, developed a more nuanced and complex understanding of societal structures. This transformative process, characterized by a dynamic interplay between reflection and action, is the essence of the theory of critical consciousness [75] [15].

Critical consciousness implies "learning to perceive social, political, and economic contradictions, and to take action against the oppressive elements of reality" [76]. For example, in the context of higher education, among students with limited economic resources, those with higher levels of critical consciousness are more likely to recognize the inequality in access to educational resources.
This recognition enables them to take action, such as joining a student group, attending a school board meeting to address the issue, or using generative AI chatbots to create their own private tutor. On the other hand, students with lower levels of critical consciousness may fail to recognize such inequalities, feel powerless to address them, or avoid acknowledging the problem [75]. In this project, we explore the practical application of critical consciousness in fostering students’ awareness of AI technology. This framework helps students identify how their conscious use of technology can enhance reflection, motivation, and agency, and it encourages them to take action. It also aims to foster a shared sense of values among students, prioritizing the human aspect of using AI.

The Social Dynamics of Learning

Learning theories have evolved to prioritize the social dimension of learning alongside individual learning. While individual learning was previously the main focus, contemporary learning theorists have emphasized the importance of collaborative work and discussion in the learning process. There is now a recognition that independence and social interaction are both essential for effective learning [42][77]. Among the learning theories that investigate learning as a social phenomenon, social constructivism, sociocultural, and activity theories can be traced back to psychology, particularly to Vygotsky’s work on the influence of the social world on an individual’s development [42]. To be didactic, we can summarize the main principles of these theories into four key points: observation and application lead to effective learning; learning occurs within communities; quality assessment is collaborative; and transferable knowledge is context-based [42].

Table 3.3: Essential Principles of Social Constructivism [42].
- Observation and application lead to effective learning: A crucial element in learning is observing individuals tackling real-world issues, as learning is a social phenomenon within communities.
- Learning occurs within communities: Knowledge and learning are present in interactions among individuals and their surroundings within communities of practice.
- Quality assessment is collaborative: Assessing quality involves adhering to group standards rather than individual criteria. Unlike traditional educational approaches prioritizing "assembly-line learning" with fixed roles, which lack meaningful context and rely on extrinsic rewards to motivate learners, contemporary pedagogical models, such as the flipped classroom model [78], promote collaborative participation, enabling learners to interact with instructional content at their own pace while also encouraging practical, creative, and active learning activities.
- Context shapes transferable knowledge: Context heavily influences learning, determining what and when we learn and impacting our ability to apply knowledge to new situations.

These principles urge us to investigate the impact of AI on education. Despite some positive influence of digital technology on education [79], including efficiency in lesson planning and provision of immediate feedback, negative views persist in this new paradigm. For instance, students’ mastery of essential skills has declined, teacher-student relationships have been distorted, and students have become increasingly isolated in a virtual world [28]. Therefore, exploring how AI can avoid further exacerbating these consequences is vital. In light of these principles, exploring the potential impact of AI integration on the social aspects of learning and students’ educational growth is essential. Additionally, we must consider how AI may impact the relationship between humans and machines, particularly regarding learning communities and feedback mechanisms.
These inquiries demand thorough examination to fully uncover their intricacies, which is impossible within the boundaries established by this study’s scope. Nonetheless, they will inform the research design within the scope of our project. Ultimately, the principles of social constructivism highlight the impactful changes this technological scenario has brought to learning communities, leading us to inquire how AI will further shape human-machine dynamics.

Recognizing Learner Diversity as Strengths

Education has historically been a privilege reserved for socially and economically stable families, perpetuating their dominance and homogenizing student backgrounds and experiences [80]. Additionally, there was a widely held belief in a singular, correct approach to teaching and learning, with little consideration for diversity and inclusion due to the generational uniformity among students [42]. However, as access to education has gradually widened, education has evolved into a more multicultural environment. This change has sparked a reassessment of how student differences are perceived, challenging the deficit model prevalent in the classical education system, which treated deviations from the norm as deficits [42]. Embracing a constructivist approach to education has led to a paradigm shift, fostering the idea that students construct their own meanings throughout the learning journey [42]. This idea was later expanded upon by cross-cultural studies, which propose a shift in how education perceives diverse backgrounds and prior experiences [81][82]. Rather than viewing them as barriers, these are valuable assets enriching the learning experience [42]. In a democratic educational setting, students come from various backgrounds and possess diverse intelligence, interests, ethnicity, race, culture, and gender [42].
As we explore the complexities of student diversity, questions arise about how AI, a technology designed to emulate human knowledge, can address differences in intelligence, interests, ethnicity, and race [42]. We must also consider how education can adapt when the data collected may not reflect the realities of a racially unaware society. With so many variables at play, it can be difficult for teachers to meet each student’s unique needs. We must investigate how we can leverage AI to support educators in adapting teaching to diverse student populations. However, AI’s unregulated nature in many contexts raises concerns about its potential to exacerbate or mitigate existing educational inequalities.

3.5 Simulation-based Education

Simulation-based education enables students to practice their skills by simulating real-world scenarios in an immersive environment [83]. Immersion occurs when students fully engage in a task or setting as if it were real [84]. Simulation is a technique that can be used in various disciplines to replace and improve real-world experiences with guided, often immersive, activities that replicate significant aspects of the real world [83] [84]. In AI-based simulations, AI mediates between the user and the simulated scenario. One of the primary benefits of these simulations is their adaptability to specific learning goals, which improves personalized learning experiences [84]. Teachers might assign simulation exercises to students, such as role-playing [84], which allows students to be more willing to take risks they wouldn’t usually take and to step into unfamiliar roles [84]. It also allows students to learn a topic through a narrative, where they can test their knowledge and practice making critical-thinking decisions without the risks of real-life contexts. To create efficient AI-based simulations, Mollick [84] developed prompting instructions to ensure engagement between educational stakeholders and AI.
By following these components, teachers and students can create prompts that improve learning experiences by providing well-defined scenarios and clear objectives. Originally, Mollick categorized simulation-based education into two types: role-play simulation and goal-play simulation [84]. In role-play simulations, students take on roles that are different from their real-life identities, while in goal-play simulations, students maintain their own identity and use their knowledge and skills to guide others, such as simulated characters [84]. For the purposes of this project and its limited scope, we have chosen to use a combination of both simulation types, which we refer to as role-play simulation for simplicity’s sake.

AI Role-Play Simulation

In AI role-play simulations, students can assume roles that differ from their real-life identities [84]. These simulations encourage students to step outside their comfort zones and experiment with different perspectives. Stepping into a role also allows students to experiment with other ways of engaging with a topic and new ways of solving problems [84]. This immersive experience allows them to gain valuable insight into their strengths and areas for improvement in a particular subject.

Table 3.4: Components of a Role-Play Simulation Prompt [84].
- A Dual Role for the AI: The AI plays the AI Mediator, creating the scenario and giving students directions. The AI also plays a character within the scenario.
- Scenario Choices: The AI Mentor offers students a choice among scenarios (e.g., the persona’s personality) that pique their interest.
- Narrative Set-Up: The AI sets the stage for the scenario, avoiding excessive complexity.
- Scenario Initiation: The AI clearly marks the beginning of the simulation by signaling to students that they are now in a scenario.
- Guidance on Goal and Techniques: The AI Mentor may step into the scenario to remind students of their goals or give hints but does not interfere during the scenario.
- End of Scenario and Advice: The AI Mentor steps back onto the scene to reinforce key elements of the topic and identify additional considerations.

Given the importance of raising awareness among educational stakeholders about AI-related resources, students must be mindful of the information they disclose and of the tasks teachers delegate to AI [52] [85] [86]. Therefore, by cultivating a critically conscious education as a guiding principle, all educational stakeholders must acknowledge that delegating tasks to AI will have unforeseen repercussions on the academic area [5]. Thus, when incorporating AI-based simulations into education, it is crucial to carefully evaluate their advantages and potential risks [13].

4 Methodology

This section outlines how the methods used in this study follow the Double Diamond Design Process model, developed by the British Design Council in 2005 [25][87][88]. This methodology focuses on the goals and behaviors of users by using a user-centered approach and investigating the usage and target domain. Design requirements are defined and translated into a high-level interaction framework, which connects research and design [25]. Therefore, this section details the activities and methods (see Figure 4.1) employed in each project phase to evaluate how students and educators perceive and utilize AI-powered tools in the context of higher education.

4.1 Design-oriented Research in HCI

Design-oriented research is a methodology that focuses on understanding user behavior and experience rather than focusing on a design artifact [89]. Through this type of research, we can explore new knowledge within a specific field by examining how research artifacts are used in real-world contexts or during the product development process. This means that the main contribution of this methodology is the knowledge that results from the project [89].
With the knowledge gained from the research process, researchers and designers can improve the design process itself or facilitate the creation of improved products. The research project serves as the design effort’s primary stakeholder or client in this methodology [89].

4.2 Research Phase (Discover)

During the discovery phase, the researcher gains a comprehensive understanding of the problem space. The researcher typically achieves this understanding using research methods such as ethnographic observation and contextual interviews to gather qualitative data about potential or current users [25]. The insights gathered during this stage help designers identify user needs and shape the direction of the design project [25]. It also involves assessing existing research and market solutions to better understand the users and domain under investigation. Additionally, the researcher conducts individual interviews with stakeholders, subject matter experts (SMEs), and technology experts as needed for the specific domain. By employing these methods, designers can obtain critical information to help them make informed decisions during the design process [90]. During the discovery phase, we usually rely on qualitative research to gain a more in-depth understanding of human behavior. This approach enables us to explore the intricacies and nuances of studying people’s behavior, providing us with valuable insights into the "what," "how," and "why" of human rationale. Using this approach, we can design solutions that effectively address users’ needs, which is essential for creating a user-centered design [90]. Qualitative methods are beneficial when identifying individuals’ behaviors, attitudes, and aptitudes toward a design solution and its related domain. They help us trace design decisions to their origins, uncovering underlying user preferences and motivations [90]. However, we must acknowledge the limitations of qualitative methods in this study.
While qualitative methods provide valuable information, they must be used in conjunction with complementary quantitative techniques, such as surveys, to fully assess a solution’s viability [90]. These techniques provide quantitative data that complements the qualitative findings, offering a more comprehensive understanding of user behavior and preferences.

Literature Review. A literature review is a comprehensive analysis of existing research within a particular field that includes various sources, such as research studies, user surveys, technical specifications, white papers, related research, usability studies, technical journal articles, and web searches related to the topic. The primary objective of a literature review is multifaceted, as it serves as a foundation for developing research questions, provides additional knowledge within the field, and enables researchers to cross-reference collected user data [90].

Interviews. Interviews are conversations that aim to achieve specific objectives. Each type of interview has a unique purpose and serves a distinct function. "The four primary types of interviews are open-ended or unstructured, structured, semi-structured, and group interviews" [25]. These types are categorized based on the degree of control the interviewer exercises to direct the conversation using predetermined questions [25].

Stakeholder interviews. To understand the social and technical contexts surrounding a prospective or existing design solution, it is important to consider the viewpoints and input of stakeholders. These stakeholders may include individuals, groups, or entities with a vested interest or involvement in the decision-making and operational aspects of a domain, business, organization, or project [90]. Conducting one-on-one interviews with each stakeholder is often the most effective approach to ensure we adequately capture individual opinions.
These interviews usually last approximately an hour, and follow-up sessions may be necessary, particularly if a stakeholder emerges as a valuable source of insights [90]. Through this exploration, researchers can identify gaps in existing literature, recognize technical constraints within the field of study, and uncover potential research opportunities [90]. This understanding of research opportunities within the domain can significantly influence the design of the research approach.

Table 4.1: The four main types of interviews [25].
- Unstructured: Researchers use unstructured interviews to explore a topic in depth. Questions are open-ended, allowing participants to respond freely. The interviewer should have a plan to cover all necessary topics. Unstructured interviews can offer deep insights into the topic.
- Structured: Structured interviews involve asking predetermined questions of each participant. The interviewer maintains standardization by using the same questions for all participants. Questions are typically short, clearly worded, and mostly closed-ended, requiring answers from a predetermined set of alternatives. Structured interviews are suitable when study objectives are well-defined.
- Semi-structured: Semi-structured interviews blend closed and open-ended questions. The interviewer uses a general script to cover all key topics, starting with planned questions and asking follow-ups as needed to gather relevant information.
- Focus Groups: One form of group interview is the focus group. It involves a structured gathering of 3 to 10 participants led by a trained facilitator to discuss various topics. Focus groups provide multiple viewpoints on shared issues and are useful for exploring diverse perspectives. A preset agenda guides the discussion, but the facilitator ensures flexibility and encourages participation.

Subject Matter Expert (SME) interviews. SMEs possess extensive knowledge and expertise in a particular area, enabling them to provide valuable insights into domains, products, markets, or processes. They are experts in their field and have a profound understanding of the domain in its current state. However, it is essential to remember that their expertise can sometimes result in a biased perspective, as their knowledge and experience may make them overlook the needs of most users and focus primarily on advanced aspects or features [90].

User interviews. Individuals who interact with a product or service with a specific goal are called users. When referring to a product or existing service, it is essential to consider potential and current users who can offer insights into their experience with the current version. Conducting user interviews can provide valuable information, such as how the product fits into users’ daily routines or professional workflows, their area of expertise, current tasks, objectives, motivations, mental models, and any obstacles or sources of frustration they may encounter while using the product [90].

Surveys. Collecting demographic information and opinions from a large group of people can be accomplished through surveys. Surveys are similar to interviews and can have either closed or open-ended questions. An electronic message is typically sent to prospective participants, instructing them to access an online survey. The main difference between surveys and structured interviews is the motivation of the respondent to answer the questions. A survey is appropriate if the respondent is motivated enough to complete it without anyone else’s presence; however, a structured interview format is better if the respondent requires some persuasion to answer the questions [25].

Note taking.
Effective documentation methods are essential for recording and organizing information, findings, and conclusions related to the research process, especially from interviews [91]. In the context of ethnographic studies, note-taking is a valuable technique for obtaining diverse data [91]. It allows researchers to capture comprehensive details about social settings and situations [92] while being minimally intrusive and posing minimal risk to data confidentiality compared to other methods [92].

Diary Studies. Diary studies are another valuable research tool for gathering and documenting participant insights, in this case over an extended period. Participants provide researchers with information by documenting their thoughts, emotions, and actions [25]. While traditionally done on paper, technology now enables digital entries through photos and audio recordings [93]. At the start of data collection, researchers request that participants keep a diary of their activities, including what they did, when they did it, and their reactions. Diaries are particularly useful when the researcher is not in the same geographical location as the participants, the activity is private, or the research deals with participants’ emotions or motivation [25]. One of the most significant advantages of diary studies is that they offer a low-cost and time-efficient data collection method. They also require minimal equipment or expertise and are suitable for long-term studies. However, diary studies depend heavily on participants’ motivation, which may require incentives and a streamlined process. Additionally, participants may recall events with exaggerated details or forget crucial information. One possible solution is to supplement diary entries with multimedia data, such as photographs, audio, or video recordings [25].

Kanban Board. In addition to the previously mentioned data collection methods, meticulous activity planning is crucial to any research plan.
This planning involves implementing a methodology that maximizes productivity and efficiency [91], which requires a thorough analysis and understanding of the research’s design requirements and objectives. Factors such as resource availability and time allocation for each phase must be considered, along with strategies for achieving desired outcomes within a specified timeframe [94]. Therefore, this methodology provides a structure for activity planning in the research plan. The Kanban Board is an effective management tool for monitoring the research and design process [91]. This approach uses a three-column table to provide a clear overview of the stages: to-do, doing, and done [95]. This structured organization facilitates prioritizing tasks based on specific project needs [91]. As the designer or researcher initiates a task, its corresponding note is moved to the next column until it is completed [95].

4.3 Synthesis Phase (Define)

The Synthesis Phase is a critical step of the design process. Based on user research, we may develop personas that represent users’ goals and needs during this phase [87]. We may also use techniques such as affinity diagramming to evaluate data and discover themes [25] and empathy mapping to gain insight into customer emotions and behaviors [25] [87]. Additionally, we may use case studies to identify specific difficulties and scenarios that help stakeholders relate to user needs and context [25], as well as design audits to improve existing designs [96]. Thematic analysis can also help uncover patterns in qualitative data [97]. These techniques enable us to synthesize data, comprehend user behaviors, and direct design toward effective, user-centered solutions.

Qualitative Personas. Creating personas involves conducting in-context interviews, fieldwork, and ethnographic techniques to collect information from stakeholders, SMEs, and users.
Personas offer a framework for understanding and communicating user behaviors and motivations as descriptive models. The efficacy of personas in clarifying and directing design activities depends on the quality of data obtained from user interviews. Identifying manageable and coherent behavior patterns across all contexts is essential to creating effective personas. It is important to note that similar behaviors exhibited by two users regarding one product may not necessarily translate to similar behaviors concerning a different product [90].

Persona Prompting. Starting from the premise that personas are descriptive models, generative AI tools enable the "materialization" of these models through persona prompting [98]. This method involves creating persona-grounded chatbots to generate personalized conversations. With this technique, you can determine the personas’ personality, tone, and objectives. Persona prompting leverages what Mollick calls co-intelligence, applying AI technology to augment human thinking [14]. This method is only feasible due to the capabilities of recently released large language models, such as ChatGPT-4 [98]. De Paoli [99] explains that persona prompting works as a middle ground between traditional qualitative personas, created entirely from qualitative data and related analysis, and data-driven personas, which reuse a pool of existing data, such as analytics or surveys, and are often produced with algorithmic support.

Case Study. A case study is a methodological approach to generating an in-depth understanding of an issue or phenomenon within a specific system [100]. This method is widely accepted in qualitative research within the social sciences, and it involves conducting in-depth investigations into individuals, groups, or events to gain insight into real-life phenomena. Case studies may include gathering data from multiple sources, such as interviews, observations, or documents [100].
In summary, the primary goal of case study research is to achieve a detailed and nuanced understanding of the subject and potentially generate new theories or insights.

Scenarios. A scenario is an informal narrative description depicting human activities or tasks within a story, facilitating the investigation and debate of settings, needs, and requirements [25]. We can apply scenarios as a tool to enable stakeholders to relate to an issue, understand the context in which tasks occur, and fully participate in development. Scenarios can portray current behavior and be used to depict behavior involving possible new technology. They also serve as an effective technique for presenting user goals [25]. This approach stresses human activity over technology interaction by focusing on understanding why people do things the way they do and what they want to achieve in the process. We can also use scenarios to explain futuristic situations that envision a future context, including new technologies and a different worldview [25].

Design Audit. A design audit involves a comprehensive and systematic evaluation of existing designs to assess their overall quality [96]. The main goal is to identify shortcomings in the current design and propose enhancements to improve it. This evaluation relies on comparative benchmarking to establish differences between current and desired performance and to provide information that designers can use to develop action plans for design improvements that satisfy user needs [96]. The outcome of an audit should be a clear overview of strengths and areas needing improvement, leading to planned improvement actions that can be monitored for progress.

Affinity Diagramming. A practical approach to analyzing data, recognizing patterns, and building a coherent narrative is to use an affinity diagram. This method organizes individual concepts and observations into a hierarchical arrangement emphasizing shared themes and structures.
No predetermined categories exist; notes are grouped based on their similarities. The process of constructing an affinity diagram is gradual: the team begins with one note and then identifies additional connected notes [25].

Empathy Mapping. The Empathy Map (EM) is a customer-centric method developed by Scott Mathews to aid in designing business models. Unlike traditional methods that focus exclusively on demographic characteristics, the EM investigates the customer’s environment, behavior, aspirations, and concerns, aiming to create empathy for a specific individual. By adopting a user-centered approach, stakeholders can view the world through the customer’s eyes and better understand how design changes can significantly impact their experience. The EM consists of six key areas: (a) See, (b) Say and Do, (c) Think and Feel, (d) Hear, (e) Pain, and (f) Gain, each with a set of guiding questions to capture the customer’s perspective effectively [101].

Thematic Analysis. Thematic analysis (TA) is a method for discovering, analyzing, and understanding patterns in a qualitative dataset [97]. It involves data coding techniques to identify themes that represent the ultimate analytic purpose. TA provides tools for organizing, interrogating, and interpreting data,