Dancely

Responsibilities: Pose Landmark Integration, Feedback System Design, API Integration
Pose Comparison with LLM-interpreted feedback for movement-based disciplines

Project Description

I architected the API integration and synchronization for Dancely in a team of four during the AI ATL Hackathon. Dancely is a video comparison platform designed to improve learning in movement-based disciplines such as dance or martial arts. Users provide two videos: one of the instructor and one of the learner. The system estimates key body landmarks using Google’s Pose Landmark Detection library, compares the two performances through Dynamic Time Warping, and provides actionable feedback that highlights errors and suggests improvements.

The platform outputs:

  • Side-by-side videos with annotated body position points.
  • A synchronized timeline linking key moments.
  • Feedback cards highlighting areas for improvement based on pose deviations.

This page covers the system I developed to synchronize the two performances and isolate errors, which improves the consistency of the feedback.

Dynamic Time Warping
- How it works and why it's used

Dynamic Time Warping (DTW) is a technique used in time series analysis to measure the similarity between two sequences that may vary in speed or length. It is often used in speech recognition, online signature recognition, and finance. DTW finds the alignment between the two sequences that minimizes the total distance between matched elements, and it allows one element to be matched with several elements of the other sequence. Unlike a static frame-by-frame comparison, it therefore accounts for variations in speed and sequence length, which makes it especially useful in our application.
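
To make the idea concrete, here is a minimal sketch of the textbook DTW recurrence on two 1-D sequences. It is an illustration only, not the code we wrote at the hackathon; the function name `dtw_distance` and the sample data are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: match, repeat an element of b, repeat an element of a
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]

# Hypothetical data: the same motion, with the learner moving at half speed.
instructor = [0, 1, 2, 3, 2, 1, 0]
learner = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]

# Naive comparison of frame k against frame k (only the overlapping frames).
frame_by_frame = sum(abs(x - y) for x, y in zip(instructor, learner))

print(frame_by_frame)                     # 8
print(dtw_distance(instructor, learner))  # 0.0
```

The frame-by-frame sum penalizes the learner for being slower, while DTW reports a distance of zero because the shapes of the two motions are identical.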

Applying DTW to Compare Poses

To analyze movements, each video is first processed to identify key body points such as shoulders, elbows, and knees (pose landmarks). These points are tracked over time, creating a set of sequences that represents how each action unfolds throughout the video. DTW is then used to compare the two sequences, allowing the system to measure similarity even if the learner performs actions at a different speed than the instructor. For example, if one person raises their arm more quickly than the other, DTW accounts for this timing difference so the comparison remains fair, and the timing offset does not propagate as an error into the rest of the comparison.
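
The sketch below shows how this could look for pose sequences, assuming each frame is a list of `(x, y)` landmark coordinates (real pose landmarks typically also carry depth and a confidence value). The names `frame_distance` and `dtw_align`, the mean-Euclidean frame distance, and the sample data are assumptions for illustration rather than the project's actual implementation.

```python
import math

def frame_distance(frame_a, frame_b):
    """Mean Euclidean distance between corresponding landmarks of two frames."""
    return sum(math.dist(p, q) for p, q in zip(frame_a, frame_b)) / len(frame_a)

def dtw_align(seq_a, seq_b):
    """Return (total cost, alignment path) where the path is a list of (i, j) frame pairs."""
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Walk back from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path

# Hypothetical frames with two landmarks each: a fixed shoulder and a rising wrist.
shoulder = (0.5, 0.4)
instructor = [[shoulder, (0.6, y)] for y in (0.8, 0.5, 0.2)]
learner = [[shoulder, (0.6, y)] for y in (0.8, 0.8, 0.5, 0.5, 0.2)]  # slower arm raise

total, path = dtw_align(instructor, learner)
print(total)  # 0.0
print(path)   # [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4)]
```

Each instructor frame is matched with the two learner frames that show the same arm position, so the slower arm raise is not counted as a mistake.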

The following is an example of the system correctly highlighting an error:

Image showcasing the system correctly highlighting a mistake
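
One way such a highlight could be derived from an alignment is sketched here: walk the aligned frame pairs and flag any landmark whose position differs by more than a threshold. This is a hedged illustration that assumes an alignment path is already available; `flag_deviations`, the landmark names, the `0.1` threshold, and the sample frames are all hypothetical.

```python
import math

# Hypothetical subset of landmarks, in the order they appear in each frame.
LANDMARK_NAMES = ["left_shoulder", "left_elbow", "left_wrist"]

def flag_deviations(instructor, learner, path, threshold=0.1):
    """List every aligned (frame pair, landmark) whose distance exceeds the threshold."""
    flagged = []
    for i, j in path:
        for name, p, q in zip(LANDMARK_NAMES, instructor[i], learner[j]):
            deviation = math.dist(p, q)
            if deviation > threshold:
                flagged.append({
                    "instructor_frame": i,
                    "learner_frame": j,
                    "landmark": name,
                    "deviation": round(deviation, 3),
                })
    return flagged

# Hypothetical data in normalized image coordinates (assumed range 0..1).
instructor = [
    [(0.5, 0.4), (0.55, 0.6), (0.6, 0.8)],
    [(0.5, 0.4), (0.55, 0.3), (0.6, 0.2)],
]
learner = [
    [(0.5, 0.4), (0.55, 0.6), (0.6, 0.8)],
    [(0.5, 0.4), (0.55, 0.6), (0.6, 0.8)],
    [(0.5, 0.4), (0.55, 0.3), (0.6, 0.4)],  # wrist ends up too low
]
path = [(0, 0), (0, 1), (1, 2)]  # assumed output of a DTW alignment

flagged = flag_deviations(instructor, learner, path)
print(flagged)
```

Only the learner's final wrist position is flagged here, because the earlier frames match the instructor once the timing difference has been aligned away.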

Improving LLM reliability & consistency

Simply uploading videos to an LLM produces unreliable feedback, because the model fails to find major deviations consistently. DTW solves this by identifying the exact frames where deviations occur, and the LLM is asked to give feedback only on those frames. This creates a clear separation of concerns: DTW handles error detection, which keeps it consistent, while the LLM only interprets the detected errors and writes the feedback.

Future improvements

  • Pose estimation models return confidence scores for keypoints. Incorporating these into DTW weighting can improve accuracy.
  • Not all keypoints contribute equally to an action. For example, torso movement is more critical than leg movement in punching or certain dance moves. Removing unnecessary keypoints could also improve accuracy.
  • DTW is currently applied to the time-series data of all body points combined; running it on separate sequences per body point could help focus the LLM’s feedback more effectively.
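
As a rough sketch of the first two ideas above, the frame distance used inside DTW could weight each keypoint by the detector's confidence and by a per-keypoint importance factor. None of this is implemented in Dancely; `weighted_frame_distance`, the `(x, y, confidence)` frame layout, and the weights are assumptions for illustration.

```python
import math

def weighted_frame_distance(frame_a, frame_b, keypoint_weights=None):
    """Weighted mean landmark distance.

    Each frame is a list of (x, y, confidence) tuples. A keypoint's weight is the
    lower of its two confidences, optionally scaled by a per-keypoint importance.
    """
    total = 0.0
    weight_sum = 0.0
    for k, ((xa, ya, ca), (xb, yb, cb)) in enumerate(zip(frame_a, frame_b)):
        w = min(ca, cb)
        if keypoint_weights is not None:
            w *= keypoint_weights[k]
        total += w * math.dist((xa, ya), (xb, yb))
        weight_sum += w
    # If every keypoint has zero weight there is nothing reliable to compare.
    return total / weight_sum if weight_sum > 0 else 0.0

# Hypothetical frames: keypoint 0 is a torso point, keypoint 1 is an ankle.
frame_a = [(0.5, 0.5, 1.0), (0.2, 0.8, 1.0)]
frame_b = [(0.5, 0.5, 1.0), (0.9, 0.8, 1.0)]
occluded_b = [(0.5, 0.5, 1.0), (0.9, 0.8, 0.0)]  # ankle detected with zero confidence

plain = weighted_frame_distance(frame_a, frame_b)
ignoring_occluded = weighted_frame_distance(frame_a, occluded_b)
torso_only = weighted_frame_distance(frame_a, frame_b, keypoint_weights=[1.0, 0.0])
print(plain, ignoring_occluded, torso_only)
```

The same function could be applied to single-keypoint sequences to run DTW separately per body point, which corresponds to the third idea.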