Context-sensitive Content Extraction and Scene Understanding

Award Information

Agency: Department of Defense

Branch: Navy

Contract: N00014-08-C-0639

Agency Tracking Number: N071-085-0199

Amount: $729,423.00

Phase: Phase II

Program: SBIR

Solicitation Topic Code: N07-085

Solicitation Number: 2007.1

Timeline

Solicitation Year: 2007

Award Year: 2008

Award Start Date (Proposal Award Date): 2008-09-30

Award End Date (Contract End Date): 2010-09-29

Small Business Information

ObjectVideo

11600 Sunrise Valley Drive Suite # 290

Reston, VA 20191

United States

DUNS: 038732173

HUBZone Owned: No

Woman Owned: No

Socially and Economically Disadvantaged: No

Principal Investigator

Name: Mun Wai Lee
Title: Principal Investigator
Phone: (703) 654-9300
Email: mlee@objectvideo.com

Business Contact

Name: Paul Brewer
Title: Founder, VP New Technolog
Phone: (703) 654-9314
Email: pbrewer@objectvideo.com

Research Institution

N/A

Abstract

Automatic visual content extraction and scene understanding is a critical enabling technology for video surveillance, security, and forensic analysis applications. The task involves identifying objects in the scene, describing their interrelations, and detecting events of interest. The project addresses this need by developing algorithms to extract syntactic, semantic, and conceptual information from visual data. We adopted the modeling and conceptualization framework of a stochastic attribute image grammar. In this framework, a visual vocabulary is defined from pixels, primitives, parts, objects, and scenes. The image grammar provides a principled mechanism to list the visual elements and objects present in the scene and to describe their relations. A combined bottom-up and top-down strategy is used for inference to provide a description of the scene and its constituent elements. A text generation system then converts the semantic information into a text report. The Phase I study demonstrated the feasibility of this approach. In Phase II, we plan to extend the technology to handle more complex scenes and achieve the following objectives: (1) classification of more than 20 types of scene elements; (2) data fusion with data from multiple cameras and other modalities; (3) complex event detection; and (4) enhanced text report generation and forensic analysis.
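The hierarchy and the text generation step described in the abstract can be illustrated with a small sketch. This is not the awardee's implementation: the `Node`, `Relation`, `list_objects`, and `describe` names, the sentence templates, and the example scene are all hypothetical, and the inference step that would build the hierarchy from pixels is omitted. It only shows how a tree over the vocabulary levels, together with a set of relations, could be turned into a short report.

```python
from dataclasses import dataclass, field
from typing import List

# Vocabulary levels named in the abstract, from coarse to fine.
LEVELS = ("scene", "object", "part", "primitive", "pixel")


@dataclass
class Node:
    """One element of a hypothetical parse tree over the visual vocabulary."""
    label: str
    level: str
    children: List["Node"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")


@dataclass
class Relation:
    """A hypothetical relation between two labelled objects."""
    subject: str
    predicate: str
    target: str


def list_objects(node: Node) -> List[str]:
    """Collect labels of all object-level nodes, depth first."""
    found = [node.label] if node.level == "object" else []
    for child in node.children:
        found.extend(list_objects(child))
    return found


def describe(scene: Node, relations: List[Relation]) -> str:
    """Render the tree and relations as a template-based text report."""
    objects = ", ".join(list_objects(scene))
    sentences = [f"The {scene.label} scene contains: {objects}."]
    for r in relations:
        sentences.append(f"The {r.subject} is {r.predicate} the {r.target}.")
    return " ".join(sentences)


# Hand-built example; a real system would infer this structure from video.
scene = Node("street", "scene", children=[
    Node("car", "object", children=[Node("wheel", "part")]),
    Node("person", "object"),
])
relations = [Relation("person", "next to", "car")]
report = describe(scene, relations)
print(report)
```

Keeping the relations separate from the tree is one possible design choice: the tree captures composition (a scene contains objects, an object contains parts), while relations link elements that sit in different branches.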

* Information listed above is at the time of submission. *
