Learning human actions on-demand based on graph theory
dc.contributor.author | Zhang, Jing | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för elektroteknik | sv |
dc.contributor.examiner | Ramirez-Amaro, Karinne | |
dc.date.accessioned | 2024-06-14T14:19:56Z | |
dc.date.available | 2024-06-14T14:19:56Z | |
dc.date.issued | 2024 | |
dc.date.submitted | ||
dc.description.abstract | Collaborative robots (Cobots) are designed to work side by side with humans, sharing space and skills to achieve common goals. However, as human tasks become increasingly complex, Cobots must adapt to unfamiliar tasks. Traditional machine learning methods, while offering potential solutions, tend to focus on learning low-level physical activities. This lack of interpretability makes it difficult for humans and robots to understand and predict each other’s behavior, hindering effective collaboration. In addition, machine learning methods rely heavily on human demonstrations, limiting the robot’s ability to generalize to new scenarios. In this work, each task (e.g., putting a spoon in the drawer) can be segmented into an interpretable activity sequence (e.g., Open, Grasp, Drop, etc.) based on human activities in real time. We propose a method that can automatically construct different sequences for different tasks from a single human demonstration. Given that human demonstrations can vary and may include mistakes, the method reconstructs the most representative activity sequence from multiple demonstrations, so that the robot can understand and predict human activities and the method extends to unseen scenarios. We use a semantic reasoning method to transform low-level data into high-level concepts understandable by humans. Decision trees are trained to capture specific activity characteristics defined by predicates. For example, when a human grasps a spoon, data about velocity and spatial relations are translated into predicates such as inHand(spoon), which serve as input to the decision tree; our method then infers this movement as “Grasp”, allowing real-time prediction and segmentation of human activities into activity sequences, even in new experiments. Activities are parameterized using ontology knowledge, enabling robots to adapt to various objects and tasks. If an object, e.g., a bottle, is not in the ontology, we use Large Language Models (LLMs) to categorize the object according to the predefined ontology. For instance, both “spoon” and “bottle” belong to the category “Objects,” making the activity “Grasp” identical in a high-level context. Since humans may perform the same task in different ways, a de Bruijn graph and sequence assembly algorithms streamline these sequences by eliminating redundant activities and representing repetitive patterns, and then reconstruct the most representative activity sequence by finding a path that traverses each edge of the graph. This approach enhances the ability of Cobots to understand and predict human activities, thereby improving their collaboration with humans in dynamic environments. | |
dc.identifier.coursecode | EENX30 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/307867 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Keywords: Human-Robot Collaboration, Ontology, Online Semantic Segmentation, De Bruijn Graph, LLM, Robot Learning | |
dc.title | Learning human actions on-demand based on graph theory | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Systems, control and mechatronics (MPSYS), MSc |
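
As a companion to the abstract above, the following is a minimal, illustrative sketch of the de Bruijn graph step it describes: activity sequences from several demonstrations are cut into k-mers, duplicate k-mers are dropped (eliminating redundant activities), and a representative sequence is reconstructed by finding a path that traverses each edge of the graph once (Hierholzer's algorithm for an Eulerian path). The function names, the choice k = 2, and the toy demonstrations are assumptions for illustration, not the implementation from the thesis.

```python
from collections import Counter, defaultdict

def build_de_bruijn(demonstrations, k=2):
    """Nodes are (k-1)-length activity windows; each unique k-length window adds one edge."""
    edges = defaultdict(list)          # node -> list of successor nodes
    in_deg, out_deg = Counter(), Counter()
    seen = set()                       # drop duplicate k-mers across demonstrations
    for demo in demonstrations:
        for i in range(len(demo) - k + 1):
            kmer = tuple(demo[i:i + k])
            if kmer in seen:
                continue
            seen.add(kmer)
            u, v = kmer[:-1], kmer[1:]
            edges[u].append(v)
            out_deg[u] += 1
            in_deg[v] += 1
    return edges, in_deg, out_deg

def eulerian_path(edges, in_deg, out_deg):
    """Hierholzer's algorithm: walk every edge exactly once (assumes such a path exists)."""
    start = next((n for n in edges if out_deg[n] - in_deg[n] == 1), next(iter(edges)))
    remaining = {u: list(vs) for u, vs in edges.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]

# Toy demonstrations of the same task; the second one repeats "Move".
demos = [
    ["Open", "Grasp", "Move", "Drop", "Close"],
    ["Open", "Grasp", "Move", "Move", "Drop", "Close"],
]
edges, in_deg, out_deg = build_de_bruijn(demos, k=2)
nodes = eulerian_path(edges, in_deg, out_deg)
# Collapse the node path back into a single activity sequence.
sequence = [nodes[0][0]] + [n[-1] for n in nodes[1:]]
print(sequence)   # ['Open', 'Grasp', 'Move', 'Move', 'Drop', 'Close']
```

In the pipeline described by the abstract, the activity labels making up each demonstration sequence would come from the decision trees operating on predicates such as inHand(spoon).
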
Download
Original bundle
- Name: Learning_human_actions_on_demand_based_on_graph_theory.pdf
- Size: 7.09 MB
- Format: Adobe Portable Document Format
- Description:
License bundle
- Name: license.txt
- Size: 2.35 KB
- Format: Item-specific license agreed to upon submission
- Description: