Representation and Inference for Developing Deep Language Engines (RIDDLE)
Title: Principal Scientist
Phone: (617) 491-3474
Email: apfeffer@cra.com
Title: Contracts Manager
Phone: (617) 491-3474
Email: mfelix@cra.com
Contact: Dina Caplinger
Address:
Phone: (972) 883-2312
Type: Nonprofit College or University
ABSTRACT: Intelligence analysts need to process large amounts of text information to form an understanding of a topic of interest. The sheer amount of information can be overwhelming, so automated text analytics that assist with filtering, information extraction, and document understanding can be highly beneficial. Deep natural language processing (NLP) applications require both structural knowledge of language and background knowledge of the domain. Statistical relational learning representations support reasoning about knowledge-rich domains under uncertainty, but joint inference in NLP applications is challenging because it involves thousands of variables and millions of features. Charles River Analytics proposes to develop Representation and Inference for Developing Deep Language Engines (RIDDLE), which investigates advanced joint inference algorithms for NLP and the representational issues that are intimately tied to inference. In particular, we will develop three novel classes of inference algorithms, including both lifted and non-lifted algorithms, as well as structured representations of knowledge to support inference using probabilistic programming. We will perform a cross-cutting evaluation of representations and inference algorithms on a range of NLP tasks.
BENEFIT: RIDDLE will benefit intelligence analysts by enabling them to filter and extract meaning from large numbers of text documents, thereby supporting more timely and effective intelligence. RIDDLE will also benefit commercial applications of NLP systems, such as text analytics of medical databases. The algorithms developed under this effort will also extend to our commercial Figaro™ probabilistic modeling tool, enabling it to be applied to larger and richer domains.
* Information listed above is at the time of submission. *