We introduce the Multimodal Dyadic Behavior (MMDB) dataset, a unique collection of multimodal (video, audio, and physiological) recordings of the social and communicative behavior of toddlers. The MMDB contains 160 sessions of 3-5 minute semi-structured play interaction between a trained adult examiner and a child between the ages of 15 and 30 months. Our play protocol is designed to elicit social attention, back-and-forth interaction, and non-verbal communication from the child. These behaviors reflect key socio-communicative milestones that are implicated in autism spectrum disorders. The MMDB dataset supports a novel problem domain for activity recognition: decoding dyadic social interactions between adults and children in a developmental context.
|
|
Our overall goal is to facilitate the development of novel computational methods for measuring and analyzing the behavior of children and adults during face-to-face social interactions. We have explored the automatic analysis of three aspects of the dataset:
- Parsing of sessions into stages and substages
- Detection of discrete behaviors (gaze shifts, smiling, and play gestures)
- Prediction of engagement ratings at the stage and session level
|
|
|
We have collected 160 sessions of 5-minute interaction from 121 children. All recorded signals are synchronized. The recording setup includes:
- 2 frontal-view Basler cameras (1920x1080 at 60 FPS)
- An overhead-view Kinect (RGB-D) camera
- 8 side-view and 3 overhead-view AXIS cameras (640x480 at 30 FPS)
- An omnidirectional and a cardioid microphone, both ceiling mounted
- 2 wireless lapel microphones, worn by both the child and the adult
- 4 Affectiva Q-sensors for electrodermal activity and accelerometry, worn by both the adult and the child
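
Because the cameras listed above run at different frame rates (60 FPS and 30 FPS), working with the synchronized video usually involves mapping a frame in one stream to the closest frame in another. The following is a minimal sketch of that mapping, not the dataset's own tooling. It assumes that all streams share a common start time and run at a constant nominal frame rate; the function names are hypothetical.

```python
import math

def frame_to_time(frame_index, fps):
    """Timestamp in seconds of a frame, assuming the stream starts at t = 0
    and runs at a constant nominal frame rate (an assumption)."""
    return frame_index / fps

def nearest_frame(time_s, fps):
    """Index of the frame closest in time to time_s in a stream running at fps."""
    return math.floor(time_s * fps + 0.5)

# Map frames of a 60 FPS frontal camera to the closest frames of a 30 FPS side camera.
side_frame_a = nearest_frame(frame_to_time(120, 60), 30)
side_frame_b = nearest_frame(frame_to_time(90, 60), 30)
```

In practice, recorded per-frame timestamps, where available, are preferable to nominal frame rates, since real capture hardware can drop frames or drift.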
|
|
|
The MMDB dataset contains fine-grained annotations of behaviors, including:
- Ratings of engagement and responsiveness at the substage level
- Frame-level, continuous annotation of relevant child behaviors (attention shifts, facial expressions, gestures, and vocalizations)
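
Frame-level annotations like those described above are often easier to analyze once contiguous frames with the same label are collapsed into segments. The sketch below illustrates one way to do this. The label strings and the list-of-labels layout are hypothetical, chosen only for illustration; the actual annotation file format is not specified here.

```python
from itertools import groupby

def frames_to_segments(labels):
    """Collapse per-frame labels into (label, start_frame, end_frame) runs.

    end_frame is exclusive. Frames labeled None (no behavior) are skipped.
    """
    segments = []
    frame = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        if label is not None:
            segments.append((label, frame, frame + length))
        frame += length
    return segments

# Hypothetical per-frame labels for five consecutive video frames.
example = [None, "smile", "smile", None, "gaze_shift"]
segments = frames_to_segments(example)
```

Dividing the frame indices by the camera's frame rate converts each segment to start and end times in seconds.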