
HOW CAN ONE EVALUATE A CONVERSATIONAL SOFTWARE AGENT FRAMEWORK?

Panesar, Kulvinder ORCID: https://orcid.org/0000-0002-4523-7218 (2018) HOW CAN ONE EVALUATE A CONVERSATIONAL SOFTWARE AGENT FRAMEWORK? In: 7th International Conference on Meaning and Knowledge Representation, 4-6 July 2018, ITB (Institute of Technology Blanchardstown), Dublin.

Text (Conference Presentation of a Conference Paper)
KMR2018 - Evaluating Paper FINAL4718 PPt Presentation.pdf - Presentation

Abstract

This paper presents a critical evaluation framework for a linguistically orientated conversational software agent (CSA) (Panesar, 2017). The CSA prototype investigates the integration, intersection and interface of language, knowledge, speech act constructions (SAC) based on a grammatical object (Nolan, 2014), a belief, desire and intention (BDI) sub-model (Rao and Georgeff, 1995), and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation so as to provide realistic dialogue that supports human-to-computer communication.
The prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG) (Van Valin Jr, 2005); (2) an agent cognitive model with two inner models: (a) a knowledge representation model employing conceptual graphs serialised to the Resource Description Framework (RDF), and (b) a planning model underpinned by BDI concepts (Wooldridge, 2013), intentionality (Searle, 1983) and rational interaction (Cohen and Levesque, 1990); and (3) a dialogue model employing common ground (Stalnaker, 2002).
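To make the phase-model architecture concrete, the following is a minimal, illustrative sketch in plain Java (the prototype's implementation language). All class, record and method names are assumptions introduced here for illustration and are not taken from the paper or its code.

// Minimal sketch (not the author's implementation): one hypothetical way the three
// phase models could be wired together. All names below are illustrative assumptions.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CsaPipelineSketch {

    // Phase 1: linguistic model -- a stand-in for an RRG-style parse of an utterance.
    record LogicalStructure(String predicate, List<String> arguments, String speechAct) {}

    // Phase 2a: knowledge representation -- conceptual-graph content held as
    // subject-predicate-object triples, the shape that would be serialised to RDF.
    record Triple(String subject, String predicate, String object) {}

    // Phase 2b: planning model -- a simple container for BDI mental state.
    static class BdiState {
        final List<Triple> beliefs = new ArrayList<>();
        final List<String> desires = new ArrayList<>();
        final List<String> intentions = new ArrayList<>();
    }

    // Phase 3: dialogue model -- common ground as propositions both parties accept.
    static class CommonGround {
        final Map<String, Boolean> accepted = new LinkedHashMap<>();
        void ground(String proposition) { accepted.put(proposition, true); }
    }

    public static void main(String[] args) {
        // Phase 1: pretend the parser produced a logical structure for the
        // made-up utterance "Book a meeting with Anna".
        LogicalStructure ls = new LogicalStructure(
                "book", List.of("meeting", "Anna"), "DIRECTIVE");

        // Phase 2: the agent cognitive model turns the parse into beliefs and an intention.
        BdiState agent = new BdiState();
        agent.beliefs.add(new Triple("user", "requests", ls.predicate()));
        agent.desires.add("satisfy-user-request");
        agent.intentions.add("plan:" + ls.predicate() + "(" + String.join(",", ls.arguments()) + ")");

        // Phase 3: the dialogue model adds the interpreted request to the common ground.
        CommonGround cg = new CommonGround();
        cg.ground("user wants: " + ls.predicate() + " " + ls.arguments());

        System.out.println("Beliefs:       " + agent.beliefs);
        System.out.println("Intentions:    " + agent.intentions);
        System.out.println("Common ground: " + cg.accepted.keySet());
    }
}

Running the sketch simply prints the toy agent's beliefs, intentions and common-ground propositions for a single made-up utterance, mirroring at a very small scale the flow from linguistic parse to cognitive model to dialogue model.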
The evaluation approach for this Java-based prototype and its phase models is multi-faceted, driven by grammatical testing (English-language utterances), software engineering practice and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models and their inner models. This multi-faceted approach encompasses checking performance both at the internal processing stages of each model and in post-implementation assessments of the goals of RRG, together with RRG-specific tests.
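As an illustration of criteria grouped per phase model, the sketch below (plain Java, with entirely hypothetical criterion names and checks) evaluates a single made-up English utterance against a handful of per-phase predicates; the actual framework tests internal processing stages and post-implementation goals rather than raw strings.

// Illustrative sketch only: one way evaluation criteria could be grouped per phase
// model and checked against a test utterance. Criterion names are hypothetical.
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class EvaluationCriteriaSketch {

    public static void main(String[] args) {
        // Criteria grouped by phase model; each is tested against the raw utterance
        // here purely for demonstration purposes.
        Map<String, List<Predicate<String>>> criteriaByPhase = Map.of(
                "Linguistic model (RRG)", List.of(
                        u -> !u.isBlank(),                      // utterance accepted by the parser
                        u -> u.split("\\s+").length >= 2),      // minimal constituent structure
                "Agent cognitive model", List.of(
                        u -> u.toLowerCase().contains("book")), // maps to a known SAC/plan
                "Dialogue model", List.of(
                        u -> true));                            // placeholder: response grounded

        String utterance = "Book a meeting with Anna";          // made-up English test utterance

        criteriaByPhase.forEach((phase, criteria) -> {
            long passed = criteria.stream().filter(c -> c.test(utterance)).count();
            System.out.printf("%-28s %d/%d criteria passed%n", phase, passed, criteria.size());
        });
    }
}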
The empirical evaluations demonstrate that the CSA is a viable proof of concept, and that RRG is fit for purpose in describing and explaining linguistic phenomena, in language processing and knowledge, and in meeting computational adequacy. By contrast, the evaluations identify the complexity of the lower-level computational mappings from natural language, through the agent, to the ontology, where semantic gaps arise; these are further addressed by a lexical bridging consideration (Panesar, 2017).

Item Type: Conference or Workshop Item (Paper)
Status: Published
Subjects: A General Works > AS Academies and learned societies (General)
School/Department: School of Science, Technology and Health
URI: https://ray.yorksj.ac.uk/id/eprint/3402
