Text-driven Motion Synthesis and Interaction Generation using Masked Deconstructed Diffusion and Multi-task Scene-aware Models

Abstract

This thesis introduces a new generative AI approach that addresses three long-standing hurdles in human motion generation: accuracy, speed, and reliable alignment with user-written text. From a simple sentence, the system quickly produces natural, high-quality 3D movements that can be retargeted to digital characters for animation, virtual reality (VR), and games. An experiment demonstrates the system's practical value in VR, where the generated motions enhance immersion and responsiveness. Building on this, the thesis explores a second, scene-aware model that works with large language models to understand both the instruction and the surrounding scene. It can break down long requests into smaller steps and generate motions that interact with objects, for example, walking to a chair and then sitting down. Together, these contributions point to more intuitive, text-driven tools for creating lifelike character animation.