597 Details of Lectures and Notes

Introduction: "What's DIVA?" - [Sharma]

Introduction to human-computer interfaces, modalities, and issues in multimodality; case study: iMAP system architecture and issues.
References: [1][2]

Notes:
After the general introduction and discussion of class mechanics, the class will be divided into four groups. The purpose is to ease the selection of papers and projects and to help balance the coverage of the course material. The groups are D, I, V and A (DIVA). The V-group will deal with the role of vision in HCI, the A-group will deal with the role of audio/speech, the D-group will be concerned with the interpretation of multimodal dialog, and the I-group will be concerned with the representation of information (both for the purpose of providing context and for helping with the multimodal "output"). The class should be divided roughly equally among the four groups, and the projects should reflect that distribution (although it will often be hard to keep the demarcation clear, especially between D and I).


Video (V-group)

Using video - hand, face & body localization, tracking and gesture recognition, face recognition, emotions, gaze estimation.
References: [][][]

Notes:


Audio (A-group)

Using audio - capture (microphone basics, microphone arrays), speech recognition (using commercial software), syntactic analysis.
References: [][][]

Notes:
 


Dialog (D-group)

Multimodal interpretation and dialog, issues in modality integration.
References: [][][]

Notes:
 
 


Information (I-group)

Information representation (esp. geo-centered) for display, multimodal information display case studies and applications.
References: [][][]

Notes: