
Ferret Meeting Browser - User Guide






Ferret lets you navigate multimedia meeting recordings interactively and quickly find and play back segments of interest within them. If this is your first time here, please make sure your browser is configured correctly.
  • (1) Select a corpus / meeting:

    This page lists all the meetings currently available in a particular corpus. Click a meeting link to go to the data selection page for that meeting.

  • (2) Select data sources:

    This intermediate page lets you choose the particular streams you want to browse. Select the streams of interest and press the browse button.

    Available annotation streams are:
    • agenda-GT.xml represents the ground truth meeting action sequence. See an example. See the ICASSP publication for more details.
    • agenda-ML.xml represents the automatic output of the meeting action sequence.
    • interest-GT.xml represents the ground truth level of interest. See an example. Contact Iain McCowan for more details.
    • agendaExt-ML.xml represents the automatic output of the meeting action sequence using a new, extended set of actions that includes disagreements.
    • interest-ML.xml represents the automatic output of the level of interest.
    • speakerSegAll-GT.xml represents the ground truth of all speaker turns. See an example.
    • SpeakerSegChanX-GT.xml represents the ground truth of speaker turns for a single speaker. See an example.
    • speakerSegAll-ML.xml represents the automatic speaker turn detection for all speakers. More info.
    • SpeakerSegChanX-ML.xml represents the automatic speaker turn detection for a single speaker.

    Available transcripts are:
    • Transcript.html represents a manual transcript. See an example.
    • ASR-Brno.html represents the automatic speech recognition output from Brno University. See an example. More info.
    • FAword.html represents the word forced alignment. See an example. Clicking on a word plays the associated audio track starting from that word.
    • FAphn.html represents the phoneme forced alignment. See an example.
    • anvil.html represents annotations made by Muenchen university using the ANVIL tool. See an example.
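
    The file names above follow a visible convention: a -GT suffix marks ground truth and a -ML suffix marks automatic output. As a rough illustration only, the Python sketch below groups annotation file names by that suffix. The function names (classify_stream, group_streams) are hypothetical and are not part of Ferret; the sketch assumes nothing about the contents of the files.

```python
from collections import defaultdict

# Suffix meanings as described in this guide:
# -GT is ground truth, -ML is automatic output.
SUFFIX_KIND = {"-GT": "ground truth", "-ML": "automatic"}


def classify_stream(filename):
    """Return (base_name, kind) for a file name such as 'agenda-GT.xml'.

    Hypothetical helper for illustration; not part of Ferret.
    """
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    for suffix, kind in SUFFIX_KIND.items():
        if stem.endswith(suffix):
            return stem[: -len(suffix)], kind
    return stem, "unknown"


def group_streams(filenames):
    """Group file names by base name so both versions sit side by side."""
    groups = defaultdict(dict)
    for name in filenames:
        base, kind = classify_stream(name)
        groups[base][kind] = name
    return dict(groups)


streams = ["agenda-GT.xml", "agenda-ML.xml", "interest-GT.xml", "speakerSegAll-ML.xml"]
grouped = group_streams(streams)
```

    Grouping this way makes it easy to see which analyses have both a ground truth and an automatic version available for comparison.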

  • (3) Browse with Ferret:

    Pressing the browse button on the previous page brings you to the main Ferret page, which is Ferret's web-based graphical user interface.

    (a) The top part of the interface shows the videos, which are played back in synchrony.
    (b) The left part contains basic VCR-style controls for RealPlayer and zoom in/out controls.

    Clicking the add button opens a pop-up window for adding an XML annotation file to the graphical part in the middle of the screen.
    (c) Data stream vectors are stacked. Each column corresponds to the results of a particular analysis, e.g. speaker localization.
    • Clicking or dragging the cursor in the SVG object moves the cursor to that point and starts playback from there in RealPlayer. The SVG cursor position is updated while the video is playing.
    • Clicking on a rectangle plays the video starting at that point.
    (d) The HTML transcript shows the meeting dialogue. Clicking on a labeled interval in the timeline scrolls the transcript column to the corresponding words and positions the audio and video in the media player at that part of the recording. You can also search the transcript for words and phrases of interest, see who was speaking at the time, and click to see and hear the associated speech.
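
    The navigation behaviour described in (c) and (d) can be pictured with a small Python sketch. It is not Ferret's implementation: the Segment class, the function names and the sample data are all made up for illustration, and it assumes only that each labeled interval has a start time, an end time, a speaker and some text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical labeled interval; not Ferret's actual data format."""
    start: float   # seconds from the start of the recording
    end: float
    speaker: str
    text: str


def position_to_time(offset_px, timeline_px, duration_s):
    """Map a click offset along the timeline axis to a playback time in seconds."""
    if timeline_px <= 0:
        raise ValueError("timeline_px must be positive")
    fraction = min(max(offset_px / timeline_px, 0.0), 1.0)  # clamp to the timeline
    return fraction * duration_s


def segment_at(segments, t):
    """Return the segment that contains time t, or None if t falls in a gap."""
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg
    return None


def search_transcript(segments, phrase):
    """Return the segments whose text contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [seg for seg in segments if needle in seg.text.lower()]


# Made-up sample data for illustration only.
segments = [
    Segment(0.0, 12.5, "speaker 1", "Let us start with the agenda."),
    Segment(12.5, 30.0, "speaker 2", "I disagree with the second point."),
]

t = position_to_time(150, 600, 60.0)   # click a quarter of the way along a 60 s timeline
hit = segment_at(segments, t)          # interval under the cursor, if any
matches = search_transcript(segments, "disagree")
```

    In this sketch, the start time of a matching segment is the point a player would seek to, and its speaker field answers the question of who was speaking at the time.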




  • Interactive Multimodal Information Management -- (IM)2, part of the Swiss National Centre of Competence in Research (NCCR).
  • EU project -- Multi-Modal Meeting Manager -- M4
  • EU project -- Augmented Multi-Party Interaction -- AMI