Panesar, Kulvinder (2018) HOW CAN ONE EVALUATE A CONVERSATIONAL SOFTWARE AGENT FRAMEWORK? In: 7th International Conference on Meaning and Knowledge Representation, 4-6 July 2018, ITB (Institute of Technology Blanchardstown), Dublin. (Unpublished)

This paper presents a critical evaluation framework for a linguistically orientated conversational software agent (CSA) (Panesar, 2017). The CSA prototype investigates the integration, intersection and interface of language, knowledge, and speech act constructions (SAC) based on a grammatical object (Nolan, 2014), together with the sub-model of belief, desire and intention (BDI) (Rao and Georgeff, 1995) and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation so as to provide realistic dialogue that supports human-to-computer communication.
This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG) (Van Valin Jr, 2005); (2) an agent cognitive model with two inner models: (a) a knowledge representation model employing conceptual graphs serialised to Resource Description Framework (RDF), and (b) a planning model underpinned by BDI concepts (Wooldridge, 2013), intentionality (Searle, 1983) and rational interaction (Cohen and Levesque, 1990); and (3) a dialogue model employing common ground (Stalnaker, 2002).
The evaluation of this Java-based prototype and its phase models takes a multi-approach driven by grammatical testing (English language utterances), software engineering and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models and their inner models. This multi-approach encompasses checking performance both at the internal processing stages of each model and in post-implementation assessments of the goals of RRG, along with RRG-specific tests.
The empirical evaluations indicate that the CSA is a proof-of-concept, demonstrating RRG’s fitness for purpose in describing and explaining phenomena, in language processing and knowledge, and in computational adequacy. In contrast, the evaluations identify the complexity of the lower-level computational mappings of natural language (NL), from agent to ontology, which exhibit semantic gaps; these are further addressed by a lexical bridging consideration (Panesar, 2017).

Item Type: Conference or Workshop Item (Paper)
Status: Unpublished
Subjects: A General Works > AS Academies and learned societies (General)
School/Department: School of Art, Design & Computer Science
URI: http://ray.yorksj.ac.uk/id/eprint/3408