Fakhri Karray

Selected Projects:




Speech Transcription:

The goal of this project is to implement an intelligent spoken document retrieval system that searches an audio database for files that match a given query. The audio (and video) files are collected from different domains and various archives.

Given a speech query, the proposed system should extract the important keywords from the query and search for the audio files that contain them. As shown in Figure 1, the proposed system is composed of two stages. The first stage is a speech decoder engine that transcribes all the given audio files into machine-understandable sequences. The second stage is a search and semantic engine that finds the sequences that match the given query.


Figure 1. The high-level architecture of the proposed approach
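The second stage can be illustrated with a minimal sketch. It assumes the spoken query and the audio files have already been transcribed to text by the first stage, and it replaces the search and semantic engine with a simple keyword count. The stop-word list, the function names, and the file names are hypothetical and are not taken from the project.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "for", "to",
              "and", "is", "are", "about", "what", "find", "me"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_keywords(query_text):
    """Keep the query tokens that are not stop words."""
    return [t for t in tokenize(query_text) if t not in STOP_WORDS]


def search(transcripts, query_text):
    """Rank file names by how often the query keywords occur in each transcript.

    `transcripts` maps a file name to the text produced by the decoder stage.
    Files with no keyword occurrences are left out of the result.
    """
    keywords = extract_keywords(query_text)
    scores = {}
    for name, text in transcripts.items():
        counts = Counter(tokenize(text))
        score = sum(counts[k] for k in keywords)
        if score > 0:
            scores[name] = score
    # Highest score first; ties are broken alphabetically by file name.
    return sorted(scores, key=lambda n: (-scores[n], n))
```

For example, with transcripts for three hypothetical files, `search(transcripts, "find the stock market")` returns only the files whose transcripts mention "stock" or "market", ordered by the number of matches.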

In the proposed system, the speech decoder engine converts a given speech utterance into an equivalent sequence of words. We used the Hidden Markov Model Toolkit (HTK) and Julius to implement our speech decoder. Furthermore, we employed the Wall Street Journal (WSJ) database to train the recognition system and validate its performance. The structure of the speech decoder engine is illustrated in Figure 2.



Figure 2. The structure of the speech decoder engine

As can be seen in Figure 2, the speech decoder engine is composed of two parallel systems. The first system is a word recognizer based on Julius. It converts any given utterance into its underlying sequence of words. The other system is a phoneme recognizer based on HTK. It converts the given utterance into a sequence of phonemes. This sequence is then processed using a fuzzy matching algorithm, and its most important keywords are extracted. Finally, the outputs of these parallel systems are combined in a fusion module.
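The phoneme path and the fusion step can be sketched as follows. The source does not say which fuzzy matching algorithm or fusion rule is used, so this sketch makes two assumptions: fuzzy matching is approximated by edit distance between a keyword's pronunciation and fixed-length windows of the recognized phoneme sequence, and fusion is a plain set union. The lexicon, the phoneme symbols, and the `max_errors` parameter are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def spot_keywords(phonemes, lexicon, max_errors=1):
    """Return the lexicon words whose pronunciation approximately occurs
    somewhere in the recognized phoneme sequence.

    Each window has the same length as the pronunciation, so the match
    tolerates substitutions best; this is a simplification.
    """
    found = set()
    for word, pron in lexicon.items():
        n = len(pron)
        for start in range(max(1, len(phonemes) - n + 1)):
            window = phonemes[start:start + n]
            if edit_distance(window, pron) <= max_errors:
                found.add(word)
                break
    return found


def fuse(word_hypothesis, phoneme_keywords):
    """Combine the two recognizer outputs; here simply a set union."""
    return set(word_hypothesis) | set(phoneme_keywords)
```

A union favours recall: a keyword missed by the word recognizer can still be recovered from the phoneme path. A fusion module that weights the two recognizers by confidence would be an alternative design.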



Discover:

Surveillance systems collect data and information about different aspects of objects (used in the sense of actors) and events in a given Volume Of Interest (VOI) and fuse them to provide a complete picture of the situation of interest. Most surveillance systems operate in an open-loop mode, i.e., they try to maximize the information gathered from the environment regardless of its value to the task/mission. Although these systems are able to handle a large number of objects and events, few can accurately and robustly discriminate the critical ones. A low Discrimination Power (DP) can result in lost opportunities for action, poor reliability, and, in some cases, inaccurate actions with undesirable effects such as collateral damage.

This project proposes a closed-loop approach to surveillance that optimally combines the sensing and the data/information fusion processes using a Pervasive Multi-Modal Surveillance System. The proposed approach aims at maximizing DP over critical objects/events by selectively collecting and fusing information based on high-level information about the task/mission. The results of this project will advance the state of knowledge in this discipline by providing enabling technologies that allow surveillance systems to adapt to their context of operations.

The project involves the collaboration of researchers from academic and industrial institutions, all experts in their fields, who are pooling their talents to target this important research area. The developed methods and technology are of strategic importance to our industrial partners, who specialize in real-time surveillance systems and intelligent decision support systems. The project fits very well into their plans for product development, growth and marketing.

As illustrated in Fig. 1, this project is divided into the following eight tasks:

  • Task 1: System Architecture & Testbed
  • Task 2: Scenario Development
  • Task 3: Context-aware Cooperative Patterns
  • Task 4: Distributed Sensor Management
  • Task 5: Adaptive Cooperative Object Detection
  • Task 6: Adaptive Cooperative Object Tracking
  • Task 7: Contextual Information
  • Task 8: Test and Evaluation
Fig. 1 Project Tasks

The specific research tasks of the project are:

  • Design and development of an open and flexible service-oriented architecture (SOA) that will allow implementation and evaluation of the proposed solutions.
  • Investigation and development of solutions for fusion of data from distributed and heterogeneous sources. Specific consideration will be given to the detection and resolution of conflicts. This includes fusion network consistency verification, error correlation avoidance, data incest prevention between multiple nodes, and data association across the system.
  • Extraction of key high-level features that characterize the sensed scene. These features will provide the contextual information for Sensor Management and Adaptive Data Fusion algorithms.
  • Investigation and development of strategies for automatically balancing the sensing effort to increase the system Discrimination Power. Solutions for problems such as sensor placement, mode control, allocation, and coordination will be investigated.
  • Investigation and development of Adaptive Data Fusion solutions in order to increase Discrimination Power over critical objects and events.
  • Investigation of cooperation patterns and adaptive behaviours that can support, and sometimes result in an emergent fashion from, context-aware distributed surveillance.
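The second research task above mentions data incest prevention, i.e., avoiding the double counting of the same measurement when it reaches a fusion node through more than one path. The following is a minimal sketch of one simple way to do this, not the project's method: each measurement carries a unique identifier, and a node fuses a measurement only the first time it sees that identifier. It assumes scalar, independent measurements with known variances, fused by inverse-variance weighting. The class name and the identifiers are hypothetical.

```python
class FusionNode:
    """Fuses scalar measurements while ignoring ones it has already used."""

    def __init__(self):
        self.seen_ids = set()     # identifiers of measurements already fused
        self.weight_sum = 0.0     # running sum of 1 / variance
        self.weighted_sum = 0.0   # running sum of value / variance

    def incorporate(self, measurements):
        """Fuse (measurement_id, value, variance) tuples.

        Returns the number of measurements that were new to this node.
        """
        added = 0
        for measurement_id, value, variance in measurements:
            if measurement_id in self.seen_ids:
                continue  # already counted via another path: skip it
            self.seen_ids.add(measurement_id)
            self.weight_sum += 1.0 / variance
            self.weighted_sum += value / variance
            added += 1
        return added

    def estimate(self):
        """Return the fused (value, variance); requires at least one measurement."""
        return self.weighted_sum / self.weight_sum, 1.0 / self.weight_sum
```

Without the identifier check, a measurement relayed by two neighbours would be weighted twice and the fused variance would be reported as smaller than it really is.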


MUSES SECRET:

The MUltimodal SurvEillance System for SECurity-RElaTed Applications (MUSES_SECRET) project is an ORF-RE project that aims to develop and commercialize new multimodal (video and infrared, voice and sound, RFID and perimeter intrusion) intelligent sensor surveillance technologies for the timely identification of human intent and threat assessment in high security-risk dynamic environments involving human crowds, such as school campuses, shopping centers and airports. The main objectives of the MUSES_SECRET project are:

  • Development of an intelligent distributed multimodal sensor network, including video and infrared cameras, sound and voice detectors and other wireless sensors, for real-time detection and identification of potentially threatening behaviour of humans acting individually or in groups;
  • Development of real-time computer vision and signal processing algorithms for tracking and recognition of relevant human body-language patterns such as gait, hand gestures, facial expressions, and voice emotions;
  • Development of a context-aware system for the real-time evaluation of threatening behavioural patterns of human subjects identified as potential security risks; and
  • Development of a synthetic environment and human-computer interfaces for the human users who are the final threat assessors and decision makers in specific security surveillance situations.

For more information about the project themes and tasks, visit the MUSES_SECRET Wiki.



LORNET:

The main goal of LORNET is to build new knowledge in computer science and cognitive science to help design and develop the architectures, the tools and the methods to maximise the usability, the efficiency and the usefulness of a network of learning objects repositories (LOR) for education, training and knowledge management.

Six objectives were established to reach this main goal:

  1. To design and implement interoperability architectures and metadata protocols to help establish the structure and the operations of a LOR network.
  2. To develop clustering methods and editors to construct and operate complex resources from simple ones, integrating learning objects, features and actors involved in a distributed learning or knowledge management system.
  3. To represent objects at different levels of granularity and abstraction, using ontologies and multi-agent techniques to enable content repurposing and adaptive assistance to users.
  4. To adapt and expand knowledge extraction and resource mining techniques and tools to fully exploit the contents of learning objects repositories and describe them for interoperability and reusability purposes.
  5. To define metadata and design protocols, as well as search and delivery tools, for advanced multimedia and virtual reality objects, providing quality service on the networks for their development, integration and use in repositories.
  6. To develop an integrated, accessible and flexible operating system that supports knowledge management activities for multiple users. This sophisticated solution includes a wide array of tools, applications, functionalities and resources to develop and support learning object repositories networks.


A Theme Structure for Knowledge Evolution

Each objective corresponds to a research area, or theme, of the proposed program, where researchers will build new knowledge and technical innovation through targeted research and development.

  • Theme 1 - Interoperability and Metadata Protocols
  • Theme 2 - Modeling, Clustering and Coordinating Learning Objects
  • Theme 3 - Active and Adaptive Learning Objects
  • Theme 4 - Knowledge Extraction and Learning Object Mining
  • Theme 5 - Creation, Search and Delivery of Advanced Multimedia Learning Objects
  • Theme 6 - Telelearning Operations Systems (TELOS)
LORNET Project Themes (Courtesy of LORNET Research Network)

Theme 4 - Knowledge Extraction and Learning Object Mining

The PAMI research group will be working on Theme 4 of the LORNET project under the leadership of Dr. Mohamed Kamel.

The efficient use of learning objects repositories requires the development of relevant tools to locate, extract and disseminate knowledge embedded in learning object repositories. These tools will help provide the appropriate contexts and structures.

Moreover, they will facilitate interactions and favour efficient delivery, navigation and retrieval. Theme 4 aims to explore the possibilities of applying dynamic pattern discovery and resource-mining techniques to learning object repositories and related information sources, such as the usage history and user information, in order to identify hidden patterns.

This theme addresses problems such as the representation and the extraction of learning object repository contents, be they phrases, semantics, graphics or metadata. It also addresses the organization and clustering techniques used to extract common knowledge and classify the elements of a collection of learning objects. Finally, it deals with cases where classification and clustering approaches cannot be applied to knowledge discovery (e.g., training units that have become unavailable or contain an insufficient number of samples) and other approaches, such as reinforcement learning agents, are required. The outcome generated by this theme is the cornerstone of more sophisticated knowledge management tools.



DIVA:

The DIVA NSERC Strategic Network targets the development and integration of communication systems, vehicular technologies, and applications for enabling nationwide deployment of vehicular networks (VANETs) and intelligent transportation systems (ITS).

The main goal of this network project is to develop an innovative large-scale communication architecture and wireless network technologies that make Canada's transportation systems in urban, rural and roadway environments more efficient and safer, while offering revolutionary services. The vision for this proposed Network is to enable distributed, robust, secure and fault-tolerant Intelligent Vehicular Networks (IVT) that reduce roadway fatalities, fossil-fuel consumption, greenhouse gas emissions and traffic congestion, while providing drivers and passengers with driving comfort applications such as location-aware services, multimedia streaming, local news, tourist information, animal collision damage avoidance and alert messages along highways and city streets. Innovative applications and services will be developed to assess the feasibility and penetration of IVT in the Canadian market.

The DIVA Strategic Network will bring together leading companies and recognized Canadian academic scientists and engineers from the fields of vehicular networks, mobile ad hoc networks, distributed and mobile systems, wireless sensor networks, data mining, network security and privacy, service-oriented architectures, and heterogeneous wireless networks. In addition, by working with industry and governmental partners, the outputs of DIVA and the highly qualified personnel (HQP) developed in association with the proposed research program will strengthen Canada's capability to apply ITS to real-world situations.



Cognitive Assistive Robotics:

Progress in medicine and health care over the past decades has increased life expectancy but has led to an aging society, a good portion of which consists of elderly and disabled individuals who are in dire need of direct physical assistance and day-to-day care. This has caused the costs of the services and care provided to them to skyrocket. In this proposal we explore an alternative that could reduce some of these costs and alleviate the burden on health care providers: a class of intelligent machines termed cognitive assistive robotic systems, whose main task is to support an elderly or disabled individual and provide for basic needs. To interact naturally with humans, these systems need cognitive features that allow them to recognize human intentions to a certain extent, to predict human behavior under certain circumstances and to respond adequately to some basic requests made by the human user. The robotic system should also be able to store knowledge and to learn useful skills from humans.

Developing an assistive robotic system in which the intelligence is not entirely hand-coded requires extensive use of existing advanced robotics and mechatronics technologies, together with new types of algorithms and human-machine interaction systems that allow seamless and almost natural interaction with humans, where perception, reasoning, and action planning capabilities are key factors. To deal with these challenges and to fulfill the goal of designing a cognitive assistive robotic system, the proposed research integrates, within a flexible framework, a number of modules involving machine attention, natural man-machine interaction, behavioral and incremental learning, and action planning. These are all fields pertinent to the applicant's expertise. The proposal tackles a number of issues in the maturing and evolving field of cognitive assistive robotics. It should lead to excellent opportunities for the advancement of knowledge in this field and for the training of highly qualified personnel, and could also lead to commercially viable alternatives in the field of assistive service robotics, whether for hospital or home use.