FaceMusic (also presented as FacePlay Trigger) is a lightweight Python tool by Roman Slack that maps facial expressions to media playback, turning a webcam into an interactive controller. Using MediaPipe for real-time facial landmark detection, it recognizes gestures such as eyebrow raises and winks and uses them to trigger YouTube video playback or local audio files.
The tool runs as a single-script application. Launching it opens a webcam window with a green facial mesh overlay; raising your eyebrows opens a configured YouTube video in the browser, and a wink plays a local audio file through pygame. Expressions and their associated actions are configured through a JSON config file, where users can also set media paths, detection confidence, and an expression threshold to tune sensitivity.
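To make the config-driven mapping concrete, here is a minimal sketch of how a JSON config could be turned into expression-to-action callbacks. The key names (`detection_confidence`, `expression_threshold`, `expressions`, `action`, `target`), the placeholder URL and audio path, and the helper `build_dispatch` are all illustrative assumptions, not the project's actual schema or code.

```python
import json
import webbrowser

# Hypothetical config layout; key names and values are assumptions for illustration.
EXAMPLE_CONFIG = """
{
  "detection_confidence": 0.5,
  "expression_threshold": 0.3,
  "expressions": {
    "eyebrow_raise": {"action": "open_url",
                      "target": "https://www.youtube.com/watch?v=VIDEO_ID"},
    "wink": {"action": "play_audio", "target": "sounds/example.mp3"}
  }
}
"""


def play_audio(path):
    """Play a local audio file with pygame (imported lazily so the rest runs without it)."""
    import pygame

    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()


def build_dispatch(config, handlers=None):
    """Return {expression_name: zero-argument callable} from a parsed config."""
    handlers = handlers or {"open_url": webbrowser.open, "play_audio": play_audio}
    dispatch = {}
    for name, spec in config["expressions"].items():
        handler = handlers[spec["action"]]
        # Bind handler and target as default arguments to avoid late binding in the loop.
        dispatch[name] = lambda h=handler, t=spec["target"]: h(t)
    return dispatch


config = json.loads(EXAMPLE_CONFIG)
```

With this shape, a detection loop would only need to call `build_dispatch(config)["wink"]()` when a wink is recognized; passing a custom `handlers` dict makes the mapping easy to exercise without a browser or audio device.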
FaceMusic is notable for its minimal design: it needs no complex setup and depends only on OpenCV, MediaPipe, NumPy, and pygame. It enforces a cooldown period between triggers, which keeps a single held expression from firing repeatedly, and it includes practical guidance on lighting and camera positioning. The tool was tested on Ubuntu 24.04 and serves as an approachable example of gesture-driven media control.
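The cooldown between triggers can be sketched as a small gate object. This is one possible approach under stated assumptions: the class name `Cooldown`, the two-second value in the usage note, and the injectable `clock` parameter are hypothetical and not taken from the project.

```python
import time


class Cooldown:
    """Allow an action at most once per `seconds` interval."""

    def __init__(self, seconds, clock=time.monotonic):
        self.seconds = seconds
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self._last = None   # time of the last accepted trigger

    def ready(self):
        """Return True and start a new cooldown if enough time has passed."""
        now = self.clock()
        if self._last is None or now - self._last >= self.seconds:
            self._last = now
            return True
        return False
```

In a frame loop, a check such as `if wink_detected and gate.ready(): ...` with `gate = Cooldown(2.0)` would fire once and then ignore further detections until the interval elapses. `time.monotonic` is used rather than `time.time` so that system clock adjustments do not affect the interval.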
Key Features
- Real-time facial expression detection powered by MediaPipe
- Eyebrow raise detection that triggers YouTube video playback
- Wink detection that plays a local audio file
- Configurable expressions and actions via a JSON config file
- Single-script implementation requiring no complex setup
- Adjustable detection confidence and expression threshold with a trigger cooldown
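As a rough illustration of how an expression threshold might be applied to facial landmarks, the sketch below scores an eyebrow raise and a wink from normalized `(x, y)` points. The landmark choices, function names, and threshold values are assumptions made for this example; the project's actual detection logic and MediaPipe landmark indices are not described here.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eyebrow_raise_score(brow, eye_top, face_top, face_bottom):
    """Brow-to-eye distance normalized by face height, so the score is scale-invariant."""
    return dist(brow, eye_top) / dist(face_top, face_bottom)


def eye_openness(upper_lid, lower_lid, inner_corner, outer_corner):
    """Lid gap divided by eye width; small values indicate a closed eye."""
    return dist(upper_lid, lower_lid) / dist(inner_corner, outer_corner)


def is_wink(left_open, right_open, closed_below=0.15, open_above=0.25):
    """A wink is one eye closed while the other stays open (a blink closes both)."""
    left_wink = left_open < closed_below and right_open > open_above
    right_wink = right_open < closed_below and left_open > open_above
    return left_wink or right_wink


def is_eyebrow_raise(score, threshold):
    """Compare the normalized score against a configurable expression threshold."""
    return score > threshold
```

Normalizing by face height or eye width keeps the scores comparable when the user moves closer to or farther from the camera, which is one reason a single configurable threshold can work across positions.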
Tech Stack
- Python
- OpenCV
- MediaPipe
- NumPy
- pygame
Designed and built by Roman Slack, Lead AI Platform Engineer. See more of Roman Slack's work on the projects page or get in touch via the contact page.