Two Research Proposals of Dr. Ahmad Salman Accepted by the National ICT R&D Fund

23rd June, 2017

Name: Object Detection and Categorization for Blind Using Deep Neural Learning
Funding Body: National ICT R&D Fund
Funding Amount: PKR 7.9 Million
One Line Description: To design and develop a real-time obstacle avoidance and object recognition system for blind people, using deep learning and scene analysis of video sequences.


The main challenge in this project is to devise a machine learning algorithm capable of recognizing objects in video and images with high accuracy in a highly variable environment. This is achievable if the system learns to recognize an object of interest viewed from an arbitrary angle and position, i.e., learns to capture a unique and abstract representation of the object. For example, a cell phone or a cup placed on a table should be recognized regardless of its size, shape, and orientation.
Deep neural architectures have only recently gained a prominent place among machine learning algorithms. Marked by their ability to extract characteristic, invariant features of an object, they are among the state of the art in various applications related to image or object recognition and categorization.
Our aim is to design a dedicated deep neural architecture and deploy it in a portable, wearable device that uses small video cameras. The end product should report the type of each object and its distance from the user. This can be realized by combining conventional image processing with the trained system.

Name: Content-based Indexing and Retrieval of Videos
Funding Body: National ICT R&D Fund
Funding Amount: PKR 10.3 Million
Collaborators: Bahria University Islamabad and SEECS, NUST
One Line Description: To design and develop a real-time system to segment and index video streams using audio and visual queries.


The tremendous increase in the amount of multimedia data in general, and of video databases in particular, has increased the need for effective indexing and retrieval mechanisms. In most cases, video retrieval is based on user-assigned tags and not on the actual content of the video. The proposed research aims to develop a content-based video indexing and retrieval system. The caption text appearing in videos will be used as the primary index, while the audio content of these videos will serve as the secondary index.
The text module will extract occurrences of textual content in a video independently of its script. Once the text is extracted, it will be fed to a script recognition module that identifies the script of the text in question, so that subsequent processing is carried out by the module for the respective script or language. Indexing can be implemented either through recognition of the text or through a word-spotting technique. The former requires a video OCR, while the latter needs clustering of 'similar' shapes (words) into classes and subsequent matching of query words using shape matching algorithms. Either of these solutions could be developed to target a defined vocabulary of keywords.
The audio indexing module will identify occurrences of the vocabulary keywords in the audio stream of the video. Once the videos are indexed, a user may provide a query keyword and retrieve all the frames, from all the videos, that contain textual and/or spoken occurrences of that word.
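As an illustration only, the retrieval step described above can be sketched as an inverted index that maps each vocabulary keyword to the frames where it was seen in caption text or heard in the audio stream. The names `KeywordIndex` and `Hit`, the video identifiers, and the frame numbers are hypothetical and do not come from the proposal. The sketch also assumes that the video OCR or word-spotting module and the audio keyword detector already exist upstream and simply report a keyword, a video, and a frame.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One occurrence of a keyword in a video (hypothetical record layout)."""
    video_id: str
    frame: int
    source: str  # "text" for caption text (primary), "audio" for speech (secondary)


class KeywordIndex:
    """Inverted index from a fixed keyword vocabulary to video frames."""

    def __init__(self, vocabulary):
        # Keywords are matched case-insensitively in this sketch.
        self.vocabulary = {word.lower() for word in vocabulary}
        self._postings = defaultdict(set)

    def add(self, keyword, video_id, frame, source):
        """Record a detection; detections outside the vocabulary are ignored."""
        keyword = keyword.lower()
        if keyword not in self.vocabulary:
            return False
        self._postings[keyword].add(Hit(video_id, frame, source))
        return True

    def query(self, keyword, sources=("text", "audio")):
        """Return all hits for a keyword, ordered by video and then frame."""
        hits = self._postings.get(keyword.lower(), set())
        return sorted(hit for hit in hits if hit.source in sources)


# Example usage with made-up detections.
index = KeywordIndex(["election", "budget"])
index.add("election", "news_01", 120, "text")
index.add("Election", "news_01", 450, "audio")
index.add("weather", "news_02", 10, "text")  # not in the vocabulary, so ignored
for hit in index.query("election"):
    print(hit.video_id, hit.frame, hit.source)
```

Keeping the source of each hit lets a query be restricted to caption text alone, to audio alone, or to both, which mirrors the primary and secondary indexes described above.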