Stefano Braghetto

2025

Stylizing animation from realtime sparse data

This project builds on work I did in collaboration with researchers from the University of Edinburgh and Abertay University, where we explored how to bring motion style into real-time character animation. The original system Codebook predicted full-body motion from sparse VR headset input. I adapted it by retraining the model with our own dataset and modifying the architecture so it could learn a style embedding.

That embedding lets the AI blend between natural walking and more stylized movements—like high-knee strides—while keeping the core motion realistic. In the demo, you can see the character shift in and out of that exaggerated style as I toggle the style vector in real time.

AI Animation with AMP in Isaac Lab

This is part of a larger, ongoing project where I’m building an AI that can learn to play football in a fully physics-based environment.

There are already some impressive results out there—especially from Google—but I wanted to tackle the challenge myself, starting from the fundamentals. What you’re seeing here is inference running in Isaac Lab with thousands of agents walking simultaneously. I used reinforcement learning (PPO) combined with adversarial imitation learning ((AMP, which builds on ideas from GAIL) ), to train agents to mimic real animations in a physically realistic way. If the character stops applying force, it collapses—so learning a stable walking policy is non-trivial.

To feed AMP, I built a custom pipeline (mostly in Blender and Python) that retargets any humanoid animation to the Isaac character and extracts key motion features to generate training data.

The long-term goal is to scale this system to learn a wide range of actions—from simple locomotion to complex team behaviors—all learned directly in simulation with realistic physics.

Adaptive Real-Time TTS (F5-TTS)

In game development, generative TTS forces a difficult trade-off: high iteration counts yield better quality but cause lag, while low counts sound robotic.

I solved this by developing an Adaptive Stepper algorithm. It treats the inference steps ($N$) as a function of the client's audio buffer status.

The server calculates the maximum inference steps ($N$) possible while guaranteeing (with 99% confidence) that audio arrives before the buffer runs out. This allows the game to start instantly with lower latency and seamlessly upgrade to high-quality audio as the buffer fills up.

2024

Real Time Motion Prediction from VR Headset

This is one of the projects I’m most proud of, because I put a huge amount of effort into solving it. It might be the most difficult problem I’ve ever tackled in my life.

What you’re watching is a project that aims to predict a user’s motion while they are holding the VR headset. The code belongs to Retinize Limited, for whom I developed this work.

What makes it especially interesting is that, since we wanted to commercialize it, we started by creating our own dataset using motion capture.

I’m very happy with these results—though they’ve actually improved even further since this video was recorded in February 2024.

2023

AI Implementation for Face Animation

This is an implementation of the SelfTalk project (link) in Unity, with several modifications made to improve real-time performance (RTF).

SelfTalk enables facial animation to be generated from voice alone. In this version, instead of predicting vertex delta motion—as done in the original SelfTalk paper—we generate FLAME blendshapes directly from the audio.

2019

From Scratch to Strategy: My First Poker-Playing AI

This was my first serious AI project back in 2019—an agent that learns to play poker entirely on its own.

Phase 1 – Self-Play:
I trained an agent using Deep Q-Learning (DQN) to play poker against copies of itself. Through self-play, it gradually learned the rules and basic strategies of the game.

Phase 2 – Real Player Data:
I built a custom annotation system that parsed online poker logs and labeled player behavior, including revealed hands when available. This allowed me to train a model to estimate hand strength and predict decision quality.

Phase 3 – Combined Training:
Finally, I combined both datasets—real and synthetic—and trained a DQN agent that could generalize and adapt to human-like behavior.

⚡ The final AI achieved ~60% win rate against low- and mid-stakes opponents.

I still keep the code archived privately and haven’t released it publicly—mainly to avoid any potential issues with platforms like PokerStars.

Note: Apologize for the video which is in Spanish, as this was an early project recorded back in 2019.

About Me