Response to FIPA Call For Proposals
France Télécom - CNET
D. Sadek, P. Bretier, F. Panaget
France Télécom
CNET - DIH/RCP
Technopole Anticipa - 2, Avenue Pierre Marzin
22307 Lannion Cedex - France
E-mail: sadek@lannion.cnet.fr
Tel. : +33 2 96 05 31 31
Fax : +33 2 96 05 35 30
Submission for standardisation of components of
France Télécom's ARTIMIS technology
ARCOL agent communication language and
MCP, CAP and SAP agent's cooperativeness protocols
1. Introduction
2. What does ARTIMIS technology consist of and why does it fit
FIPA requirements?
3. Components of ARTIMIS technology submitted to standardisation
3.1. Interagent Communication Language: ARCOL
3.1.1. Framework for ARCOL semantics: the SL language
3.1.1.1. Bases of the SL formalism
3.1.1.2. The Components of a Communicative Act Model
3.1.1.3. ARCOL's primitive communicative acts
3.1.2. Flexibility of the ARCOL language
3.2. Comparison with KQML
3.3. A few words about KIF compared to SCL
3.4. Agent Cooperativeness Protocols: MCP, CAP, SAP
3.4.1. Minimal Cooperativeness Protocol (MCP): A Basic
Interaction Protocol
3.4.2. The Corrective-Answers Protocols (CAP)
3.4.3. The Suggestive-Answers Protocol (SAP)
4. Why the components submitted to standardisation and ARTIMIS
technology are relevant to FIPA?
In this proposal, two components of ARTIMIS, an effective
generic intelligent agent technology developed by France Télécom,
are submitted for standardisation:
- an interagent communication language, ARCOL (ARtimis COmmunication Language), and
- three protocols for agents' cooperative reactions: MCP
(Minimal Cooperation Protocol), CAP (Corrective-Answers Protocol),
and SAP (Suggestive-Answers Protocol).
The whole ARTIMIS technology, in particular the submitted language and protocols, is available.
ARTIMIS is an agent technology which provides a generic
framework to instantiate intelligent dialoguing agents, which can interact
with human users as well as with other software agents. When
instantiated in a human-agent interaction context, ARTIMIS-like
agents can engage in mixed-initiative cooperative interaction in
natural language with human users. The resulting systems are able
to display advanced functionalities, such as negotiating the
user's requests, producing cooperative answers (which involve
relevant, possibly not explicitly requested, information),
performing complex tasks, etc.
Primarily designed to support advanced interactive services
and to offer user-friendly cooperative intelligent interfaces to
information bases, the ARTIMIS intelligent agent technology is, de
facto, strongly related to end-user applications.
Roughly speaking, the ARTIMIS software consists of three main
components [Sadek et al 96]: a rational unit (which constitutes
the heart of the technology), and two front-end components for
natural language processing: understanding and generation.
The rational unit is the decision kernel of the agent. It
endows the agent with the capability to reason about knowledge
and actions. It computes cooperative rational reactions,
producing motivated plans of actions, such as plans of speech acts. In
this framework, communicative acts are modelled as regular rational
actions, thus enabling the agent to handle interaction. The
communication protocols are dynamic and flexible: there are no
predetermined interagent interaction patterns.
The two natural language components [Sadek et al 96, Panaget
96] are essential for using the technology in a context of interaction
with humans. They bridge the gap between the communication
language (which, in this case, is natural language) and the
internal semantic knowledge representation in terms of communicative
acts whose semantic contents are expressed in a powerful language: a
first-order modal language.
Without the two natural language components, the rational
unit is an intelligent communicating agent (in a context of human-agent
interaction, the human user is viewed as a particular agent; no
assumption is made about the interlocutor's type). Therefore, the
rational unit can be used as a regular communicating agent in a
multi-agent system.
The ARTIMIS model is a formal theory of rational interaction
expressed in a homogeneous logical framework [Sadek 91a]. The
basic concepts of this framework are mental attitudes and
actions. The theory provides, in particular, a model for the relationships
between all the mental attitudes of an agent, principles of
rational behaviour, principles of cooperative behaviour, and
fine-grained models of communicative acts (and actions in
general). The set of primitive communicative acts is small, but
the expressive power of the underlying language makes it possible
to handle complex interactions, because the language allows for the
complex combinations (e.g., sequences, alternatives) required by
the negotiation and coordination protocols used by the agents,
and for highly expressive message contents.
ARTIMIS involves a homogeneous set of generic logical
properties, which embodies the innate potential of the system. This potential
is independent of its specific use in a given application domain.
The underlying technology is an inference engine [Bretier 95,
Bretier & Sadek 96] which faithfully executes the theory. The
ARTIMIS technology offers several advantages. Firstly, the
specification of ARTIMIS, i.e., the theory of rational
interaction, is semantically well-defined and unambiguous. Secondly,
it can be guaranteed that ARTIMIS follows its specifications
soundly and completely (e.g., keeps its commitments).
Thirdly, ARTIMIS can be easily maintained, adapted, and/or
customised.
So far, to our knowledge, there are no intelligent agent
technologies or products comparable to ARTIMIS.
ARTIMIS is implemented in Quintus Prolog on a Sun Ultra 1
under Solaris 2.5. It is a stand-alone piece of software integrated
in a speech-telephony-computer platform (i.e., speech
recognition software, speech synthesis software and an ISDN
board/software). Currently, ARTIMIS is demonstrated on a lab
version of a real application, AGS, the directory of added-value
voice-activated services hosted by France Telecom (Audiotel services)
[Sadek et al 95, 96].
ARTIMIS technology has been developed to support advanced
services, and mainly Telecom (telephone, Internet, etc.) services
requiring cooperative interaction with human users. There is an
obvious and considerable commercial interest in ARTIMIS-like
technology, notably for specifically designed
"high-quality" user-friendly interactive services and for
intelligent-service development tools.
ARTIMIS fits FIPA requirements for three main reasons.
Firstly, the kernel of ARTIMIS, a rational unit, allows, in particular, dynamic
and flexible communication protocols. Secondly, ARTIMIS is based
on a semantically well-defined theory of communication and
cooperation. And thirdly, since ARTIMIS is based on a rational
unit, it allows services to be combined efficiently, without the
plugging-in of a new service requiring an explicit specification of
its relations with the other services.
An interagent communication language, ARCOL, and three cooperative-reaction protocols, MCP, CAP, and SAP, are submitted for standardisation.
An agent communication language makes it possible to express messages.
Traditionally, speech act theory is used as the basis for the definition
of messages. A message is specified as a communicative act type, e.g.,
a speech act type (also called a communicative function), applied
to a semantic content (also called propositional content). The
semantic contents of communicative acts may be of three different
types, depending on the type of the communicative acts:
propositions, individuals, and communicative acts. It is worth
noting that the expressiveness of an interagent communication
language strongly depends on the expressiveness of the language
for semantic contents.
In our view, a communication language has to specify a set of
primitives from which any message can be built up. This view applies
both to the sub-language for the communicative functions and to
the language for semantic contents. For example, as regards the
former, one has to define, on the one hand, basic communicative act
types and, on the other hand, operators to combine these acts in
order to build complex messages.
As specified by FIPA requirements, our definition of the ARCOL interagent communication language is twofold:
(1) the definition of communicative acts (or basic message types), and
(2) the definition of the language, SCL, for the semantic
contents.
ARCOL involves the following primitive communicative acts
[Sadek 91b]:
act: < i, INFORM(j, p) >
meaning:
agent i informs agent j that proposition p is true,
performance conditions:
i believes that p,
i believes that j does not believe that p,
i has the intention that j comes to believe that p.
act: < i, REQUEST (j, a) >
meaning:
agent i requests agent j to perform action a (e.g., a communicative act),
performance conditions:
i believes that (as far as he is concerned) a is "performable",
i believes that j does not have the intention to do a,
i has the intention that j performs a.
act: < i, CONFIRM (j, p) >
meaning:
agent i confirms to agent j that proposition p is true,
performance conditions:
i believes that p,
i believes that j is uncertain about p,
i has the intention that j comes to believe that p.
act: < i, DISCONFIRM (j, p) >
meaning:
agent i disconfirms to agent j that p is true (i.e., informs j that p is false),
performance conditions:
i believes that p is false,
i believes that j is uncertain about p or believes that p,
i has the intention that j comes to believe that p is false.
act: < i, INFORMREF (j, ιx d(x)) >
meaning:
agent i informs agent j of the value of the referent x denoted
by the description d(x),
performance conditions:
i knows (or believes that he knows) the referent x of d(x),
i believes that j does not know the referent x of d(x),
i has the intention that j comes to know the referent x of d(x).
where p is a proposition of SCL, a an action
expression (e.g., a communicative act), and ιx d(x) a term (denoting an object) described
by the description d(x), i.e.,
a proposition of SCL with a free variable x.
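To fix ideas, the following Python fragment gives a minimal, purely illustrative encoding of ARCOL messages as data structures; the class and field names are ours and are not part of the submitted language definition. It shows a primitive communicative act applied to a semantic content, and the sequence (';') and alternative ('|') operators used to combine acts into complex messages.

from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Act:
    sender: str          # agent i
    name: str            # "INFORM", "REQUEST", "CONFIRM", "DISCONFIRM", "INFORMREF"
    receiver: str        # agent j
    content: object      # an SCL proposition, an action expression, or a description

@dataclass(frozen=True)
class Seq:               # a1 ; a2  (sequence)
    first: "Expr"
    second: "Expr"

@dataclass(frozen=True)
class Alt:               # a1 | a2  (nondeterministic choice)
    left: "Expr"
    right: "Expr"

Expr = Union[Act, Seq, Alt]

# "Is p?" rendered as a REQUEST whose content is the alternative between two INFORMs:
p = "flight_from_paris_arrived"
yn_question = Act("i", "REQUEST", "j",
                  Alt(Act("j", "INFORM", "i", p),
                      Act("j", "INFORM", "i", ("not", p))))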
The semantics of the primitive communicative acts has to be
expressed in a language: SL,
(1) containing SCL, since the semantic contents appear in the semantic definitions of the communicative acts (preconditions and effects), and
(2) allowing for the expression of mental attitudes such as intentions and beliefs.
The expressiveness of ARCOL depends on the expressiveness of
SCL. For instance, if agents are intended to be able to inform
about their own beliefs or intentions, SCL must allow for the
expression of such mental attitudes.
In our view, SCL and SL have to be unified. Thus, the whole
expressiveness of SL will be available for the expression of semantic
contents.
SL being the language for ARCOL semantics, it has to be precisely defined. In this section, we present the SL language definition and the semantics of the ARCOL primitive communicative acts.
Propositions are expressed in a logic of mental attitudes and actions, formalised in a first-order modal language with identity, similar in many respects to that of Cohen & Levesque (1990b). (See [Sadek 91a] for details of this logic.) Let us briefly sketch the part of the formalism used in this proposal. In the following, p, p1, ... are taken to be closed formulas (denoting propositions) or formula schemas, and i and j are schematic variables which denote agents. ⊨ φ means that φ is valid. The mental model of an agent is based on the representation of three primitive attitudes: belief, uncertainty and choice (or, to some extent, goal). They are respectively formalised by the modal operators B, U and C. Formulas such as Bi p, Ui p and Ci p can be read, respectively, "i (implicitly) believes (that) p", "i thinks that p is more likely than its contrary" and "i desires that p currently holds". The logical model for the operator B is a KD45 possible-worlds Kripke semantics (see, e.g., [Halpern & Moses 85]) with the fixed-domain principle (see, e.g., [Garson 84]).
To enable reasoning about action, the universe of discourse involves, in addition to individual objects and agents, sequences of events. A sequence may be formed by a single event, and this event may also be the void event. The language involves terms (in particular a variable e) ranging over the set of event sequences. To talk about complex "plans", events (or actions) can be combined to form action expressions, such as sequences a1;a2 or nondeterministic choices a1|a2. Action expressions will be noted a. The operators Feasible, Done and Agent are introduced to enable reasoning about actions. Formulas Feasible(a,p), Done(a,p) and Agent(i,a) respectively mean that a can take place and, if it does, p will be true afterwards; that a has just taken place and p was true beforehand; and that i denotes the only agent of the events appearing in the action expression a.
From belief, choice and events, the concept of persistent goal is defined. An agent i has p as a persistent goal if i has p as a goal and is self-committed toward this goal until i comes to believe that the goal is achieved or to believe that it is unachievable. Intention is defined as a persistent goal imposing the agent to act. Formulas such as PGi p and Ii p are intended to mean that "i has p as a persistent goal" and "i has the intention to bring about p", respectively. The definition of I entails that intention generates a planning process.
A fundamental property of the proposed logic is that the modelled agents are perfectly in coherence with their own mental attitudes. Formally, the following schema is valid:
φ ⇔ Bi φ
where φ is governed by a modal operator formalizing a mental attitude of agent i.
Below, the following abbreviations are used:
Feasible(a) ≡ Feasible(a, True)
Done(a) ≡ Done(a, True)
Possible(φ) ≡ (∃e) Feasible(e, φ)
Bifi φ ≡ Bi φ ∨ Bi ¬φ
Brefi ιx d(x) ≡ (∃y) Bi (ιx d(x) = y)
Uifi φ ≡ Ui φ ∨ Ui ¬φ
Urefi ιx d(x) ≡ (∃y) Ui (ιx d(x) = y)
ABn,i,j φ ≡ Bi Bj Bi ... φ
In the fifth and seventh abbreviations, ι is the operator for definite description; ιx d(x) is read "the (x which is) d". In the last one, which introduces the concept of alternate beliefs, n is a positive integer representing the number of B operators alternated between i and j.
The components of a Communicative Act (CA) model that are involved in a planning process characterize both the reason for which the act is "selected" and the conditions that have to be satisfied for the act to be planned. For a given act, the former is referred to as the rational effect (RE) (sometimes also referred to as the perlocutionary effect), and the latter are the feasibility preconditions (FPs) (or qualifications).
To give an agent the capability of planning an act whenever the agent intends to achieve its RE, the following property is proposed:
Property 1: Let ak be an act such that:
(∃x) Bi (ak = x),
p is the RE of ak, and
¬Ci ¬Possible(Done(ak));
then the following formula is valid: Ii p ⇒ Ii Done(a1 | ... | an)
where a1, ..., an are all the acts of type ak.
This property says that, for an agent, the intention to achieve a given goal generates the intention that one of the acts known to the agent be done, where these acts are such that their RE corresponds to the agent's goal and the agent has no reason for not doing them.
The set of feasibility preconditions for a CA can be split into two subsets: the ability preconditions and the context-relevance preconditions. The ability preconditions characterize the intrinsic ability of an agent to perform a given CA. For instance, to sincerely Assert some proposition p, an agent has to believe p. The context-relevance preconditions characterize the relevance of the act to the context in which it is performed. For instance, an agent can be intrinsically able to make a promise while believing that the promised action is not needed by the addressee. The context-relevance preconditions may correspond to the Gricean quantity and relation maxims.
The following property imposes on an agent, whenever the agent "selects" an act (in virtue of Property 1), to seek the satisfiability of its FPs:
Property 2: Ii Done(a) ⇒ Bi Feasible(a) ∨ Ii Bi Feasible(a)
An agent cannot intend to perform (the illocutionary component
of) a communicative act for a different reason from the act's RE.
The following property formalizes this idea:
Property 3: Ii Done(a) ⇒ Ii RE(a), where RE(a) is the RE of act a.
Consider now the opposite aspect: the consumption of CAs. When an agent observes a CA, he has to come to believe that the agent performing the act has the intention (to make public his intention) to achieve the act's RE. This kind of act effect is called the intentional effect. The following property captures this consideration:
Property 4: Bi((Done(a) ∧ Agent(j, a)) ⇒ Ij RE(a))
There are FPs that persist after an act. For the particular case of CAs, this is the case for all the FPs which do not refer to time. Then, when an agent observes a CA, he has to come to believe that the persistent FPs hold:
Property 5: Bi(Done(a) ⇒ FP(a)), where FP(a) here denotes the persistent FPs of a.
Hereafter, a CA model will be presented as follows:
< i, Act (j, φ) >
FP: φ1
RE: φ2
where i is the agent of the act, j the addressee, Act the name of the act, φ its semantic content (or propositional content) (SC), φ1 its FPs and φ2 its RE.
The assertive Inform:
One of the most interesting assertives, regarding the core of mental attitudes it encapsulates (or that we want it to encapsulate), is the act of Informing. An agent i is able to Inform an agent j that some property p is true only if he/she believes p (i.e., only if Bi p). This act is considered to be context-relevant only if i does not think that j already believes p (i.e., only if ¬Bi Bj p).
Given the core of mental attitudes just highlighted for the act Inform, we propose a first model for this act as follows; this model will be adapted to the other acts we introduce later, with the aim of keeping the acts' qualifications mutually exclusive (when the acts have the same effect):
< i, INFORM (j, φ) >
FP: Bi φ ∧ ¬Bi Bj φ
RE: Bj φ
The Directive Request:
We propose the following model for the directive Request, a being a schematic variable for which can be substituted any action expression, FP(a) being the feasibility precondition of a, and FP(a) [i\j] being the FPs of a concerning the mental attitudes of agent i:
< i, REQUEST (j, a) >
FP: FP(a)[i\j] ∧ Bi Agent(j, a) ∧ ¬Bi PGj Done(a)
RE: Done(a)
The Confirmation:
First of all, let us mention that the rational effect of the act Confirm is identical to that of most of the assertives, i.e., the addressee comes to believe the SC of the act. An agent i is able to Confirm a property p to an agent j only if i believes p (i.e., Bi p). This is the sincerity condition an assertive imposes on its agent. The act Confirm is context-relevant only if i believes that j is uncertain about p (i.e., Bi Uj p). In addition, the analysis we have performed in order to determine the necessary qualifications for an agent to be justified in claiming the legitimacy of an act Inform remains valid for the case of the act Confirm. These qualifications are identical to those of an act Inform for the part concerning the ability, but they are different for the part concerning the context relevance. Indeed, an act Confirm is irrelevant if its agent believes that the addressee is not uncertain of the property intended to be Confirmed.
In virtue of this analysis, for the act Confirm we propose the following model:
< i, CONFIRM (j, φ) >
FP: Bi φ ∧ Bi Uj φ
RE: Bj φ
The act Confirm has a negative "counterpart": the act Disconfirm. The characterization of this act is similar to that of the act Confirm and leads us to provide the following model:
< i, DISCONFIRM (j, φ) >
FP: Bi ¬φ ∧ Bi (Uj φ ∨ Bj φ)
RE: Bj ¬φ
Redefining the Act Inform: Mutual Exclusiveness Between Acts:
The qualifications of the act Inform have to be reconsidered according to the models of the acts Confirm and Disconfirm such that the context relevance be total (i.e., such that the preconditions of the three act models be mutually exclusive). To do that, it is sufficient to formalize the following property (and its "complementary"): to be justified to Inform, an agent must not be justified either to Confirm (and hence to believe that the addressee is uncertain of the SC of the act) or to Disconfirm (and hence to believe that the addressee is uncertain of, or believes the contrary of the SC of the act). The updated model for the act Inform is the following one:
< i, INFORM (j, φ) >
FP: Bi φ ∧ ¬Bi (Bifj φ ∨ Uifj φ)
RE: Bj φ
Hence, in a given context, among several acts which
(potentially) achieve some goal, there is at most one act, the preconditions
of which are satisfied. This means that the agent can never be
faced with two illocutionary acts leading to the same situation,
both of them being context-relevant.
The Closed-Question Case:
In terms of Illocutionary Acts (IAs), exactly what a speaker i is Requesting when uttering a sentence such as "Is p?" to a hearer j is that j perform the act "Inform i that p" or the act "Inform i that ¬p". We know the model of both of these acts: < j, INFORM (i, φ) >. In addition, we know the relation "or" set between these two acts: it is the relation that allows the building of action expressions representing a nondeterministic choice between several (sequences of) events.
In fact, as mentioned above, the semantic content of a directive refers to an action expression; so, this can be a disjunction between two or more acts. Hence, by using the utterance "Is p?", what an agent i Requests an agent j to do is the following action expression:
< j, INFORM (i, p) > | < j, INFORM (i, ¬p) >
Now, it seems clear that the SC of a directive realized by a yn-question can be seen as an action expression characterizing an indefinite choice between two IAs Inform. In fact, it can also be shown that the binary character of this relation is only a special case: in general, any number of IAs Inform can be handled. In this case, the addressee of the directive is allowed to "choose" one from among several acts. This is not only a theoretical generalization: it accounts for some ordinary linguistic behaviour traditionally called Alternatives question. An example of an utterance realizing this type of act is "Would you like to travel in first, second or third class ?". In this case, the SC of the Request realized by this utterance is the following action expression:
< j, INFORM (i, p1) > | < j, INFORM (i, p2) > | < j, INFORM (i, p3) >
where p1, p2 and p3 are intended to mean respectively that j wants to travel in first class, in second class, or in third class.
Now, we have to provide the plan-oriented model for this type of action expression. In fact, it would be interesting to have a model which is not specific to action expressions characterizing the nondeterministic choice between IAs of type Inform, but a more general model where the actions referred to in the disjunctive relation remain unspecified. In other words, we would like to describe the preconditions and effects of the expression a1 | a2 | ... | an, where a1, a2, ..., an are any action expressions. It is worth mentioning here that we want to characterize this action expression when it can be planned as a disjunctive macro-act. We are not attempting to characterize the nondeterministic choice between acts which are planned separately. In both cases, we get a branching plan, but in the first case the plan is branching in an a priori way, while in the second case it is branching in an a posteriori way.
An agent will plan a macro-act of nondeterministic choice when she intends to achieve the RE of one of the acts compounding the choice, no matter which one it is. To do that, one of the feasibility preconditions of the acts must be satisfied, no matter which one it is. Therefore, we propose the following model for a disjunctive macro-act:
a1 | a2 | ... | an
FP: FP(a1) ∨ FP(a2) ∨ ... ∨ FP(an)
RE: RE(a1) ∨ RE(a2) ∨ ... ∨ RE(an)
where FP(ak) and RE(ak) represent respectively the FPs and the RE of the action expression ak.
We will hereafter use the abbreviation < i, INFORMIF (j, φ) > to refer to the following disjunctive macro-act model, which gathers together the act informing that φ and the act informing that ¬φ:
< i, INFORM (j, φ) > | < i, INFORM (j, ¬φ) >
FP: Bifi φ ∧ ¬Bi (Bifj φ ∨ Uifj φ)
RE: Bifj φ
In the same way, we can derive the disjunctive macro-act model which gathers together the acts Confirm and Disconfirm. We will hereafter use the abbreviation < i, CONFDISCONF (j, φ) > to refer to the following model:
< i, CONFIRM (j, φ) > | < i, DISCONFIRM (j, φ) >
FP: Bifi φ ∧ Bi Uj φ
RE: Bifj φ
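As an illustration (with a toy encoding of our own, not a normative representation), the following Python sketch applies the disjunctive macro-act rule above, FP(a1|...|an) = FP(a1) v ... v FP(an) and RE(a1|...|an) = RE(a1) v ... v RE(an), to derive the INFORMIF model from the two underlying acts Inform.

from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class ActModel:
    name: str
    fp: object          # feasibility preconditions, as a nested-tuple formula
    re: object          # rational effect, as a nested-tuple formula

def Or(*formulas):
    return reduce(lambda f, g: ("or", f, g), formulas)

def choice(name, *acts):
    """Plan-oriented model of the nondeterministic choice a1 | ... | an."""
    return ActModel(name, fp=Or(*(a.fp for a in acts)), re=Or(*(a.re for a in acts)))

phi = "phi"
ctx = ("not", ("B", "i", ("or", ("Bif", "j", phi), ("Uif", "j", phi))))   # context relevance
inform_pos = ActModel("<i,INFORM(j,phi)>",
                      fp=("and", ("B", "i", phi), ctx), re=("B", "j", phi))
inform_neg = ActModel("<i,INFORM(j,not phi)>",
                      fp=("and", ("B", "i", ("not", phi)), ctx), re=("B", "j", ("not", phi)))
informif = choice("<i,INFORMIF(j,phi)>", inform_pos, inform_neg)
print(informif.fp)   # ('or', FP of the first INFORM, FP of the second INFORM)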
The Strict-Yn-Question Act Model:
Starting from the act models < j, INFORMIF (i, φ) > and < i, REQUEST (j, a) >, we build up the strict-yn-question act (and not plan, as seen below) model. Unlike a Confirm/Disconfirm-question, which will be addressed below, a strict-yn-question requires its agent not to have any "knowledge" about the proposition whose truth value is asked for. To get the intended model, a transformation must be applied to the FPs of the act < j, INFORMIF (i, φ) >. In accordance with this proposition, we get the following model for a strict-yn-question act:
< i, REQUEST (j, < j, INFORMIF (i, φ) >) >
FP: ¬Bifi φ ∧ ¬Uifi φ ∧ ¬Bi PGj Done(< j, INFORMIF (i, φ) >)
RE: Done(< j, INFORM (i, φ) > | < j, INFORM (i, ¬φ) >)
The Confirm/Disconfirm-question Act Model:
In the same way, we can derive the following Confirm/Disconfirm-question act model:
< i, REQUEST (j, < j, CONFDISCONF (i, φ) >) >
FP: Ui φ ∧ ¬Bi PGj Done(< j, CONFDISCONF (i, φ) >)
RE: Done(< j, CONFIRM (i, φ) > | < j, DISCONFIRM (i, φ) >)
The Open-Question Case
An open question is a question which is not closed, i.e., which does not suggest a choice and, in particular, which does not require a yn-answer. A particular case of open questions is that of questions which require a referring expression as an answer. They are generally termed wh-questions. The "wh" refers to interrogative pronouns such as "what", "who", "where", or "when". Nevertheless, this must not be taken literally, since the utterance "How did you travel ?" can also be considered as a wh-question.
Now, we are interested in setting up a formal plan-oriented model for the wh-questions. In our opinion, from the addressee point of view, this type of question can be seen as a closed question where the suggested choice is not made explicit because it is "too wide". Indeed, a question such as "What's your destination ?" can be restated as "What's your destination: Paris, Rome,... ?".
When asking a wh-question, an agent j intends to acquire from the addressee i an identifying referring expression (IRE) [Sadek 90] for (generally) a definite description. So, the agent j intends to make his interlocutor i perform an IA of the following form:
< i, INFORM (j, ιx d(x) = r) >
where r is an IRE (e.g., a standard name or a definite description) and ιx d(x) is a definite description. Hence, the SC of the directive performed by a wh-question is a disjunctive macro-act compounded with acts of the form of the one above. Here is the model of such a macro-act:
< i, INFORM (j, ιx d(x) = r1) > | ... | < i, INFORM (j, ιx d(x) = rk) >
where the rk are IREs. As in the case of closed questions, the plan-oriented generic model we have proposed for a disjunctive macro-act can be instantiated here to account for the macro-act above. Note that the following equivalence is valid:
(Bi ιx d(x) = r1 ∨ Bi ιx d(x) = r2 ∨ ...) ⇔ (∃y) Bi (ιx d(x) = y)
Hence, we propose the following model, which we will refer to as < i, INFORMREF (j, ιx d(x)) >:
< i, INFORMREF (j, ιx d(x)) >
FP: Brefi ιx d(x) ∧ ¬Bi Refj ιx d(x)
RE: Brefj ιx d(x)
where Refj ιx d(x), Brefj ιx d(x) and Urefj ιx d(x) are the three following abbreviation schemas:
Refj ιx d(x) ≡ Brefj ιx d(x) ∨ Urefj ιx d(x)
Brefj ιx d(x) ≡ (∃y) Bj (ιx d(x) = y)
Urefj ιx d(x) ≡ (∃y) Uj (ιx d(x) = y)
Now, provided the act models < j, INFORMREF (i, ιx d(x)) > and < i, REQUEST (j, a) >, the wh-question act model can be built in the same way as the yn-question act models. Applying the transformation to the FPs of the act schema < j, INFORMREF (i, ιx d(x)) > and by virtue of proposition 3, we get the following model:
< i, REQUEST (j, < j, INFORMREF (i, ιx d(x)) >) >
FP: ¬Refi ιx d(x) ∧ ¬Bi PGj Done(< j, INFORMREF (i, ιx d(x)) >)
RE: Done(< j, INFORM (i, ιx d(x) = r1) > | ... | < j, INFORM (i, ιx d(x) = rk) >)
Building Interagent Communication Plans:
The properties of rational behaviour we stated when we defined the RE and the FPs of a CA specify an algorithm of CA planning. This algorithm builds a plan through the inference of the causal chain of intentions resulting from the application of properties 1 and 2.
With this method, it can be shown that what are usually called "dialogue acts", and the models postulated for them, are in fact complex plans of interaction, and so can be derived from elementary acts using the principles of rational behaviour. Let us see an example of how this is done.
The interaction plan "hidden" behind a question act can be more or less complex, depending on the agent's mental state when the plan is generated.
Let us call a direct question a question underlain by a plan limited to the reaction strictly legitimized by the question. Suppose that (the base of) i's mental state is:
Bi Bifj φ,  Ii Bifi φ
By virtue of property 1, the intention that the act < j, INFORMIF (i, φ) > be done is generated. Then, according to property 2, follows the intention to bring about the feasibility of this act. So, the problem is to know whether the following belief can be derived at this moment (from i's mental state):
Bi(Bifj φ ∧ ¬Bj(Bifi φ ∨ Uifi φ))
This is the case with i's mental state. By virtue of properties 1 and 2, the intention that the act < i, REQUEST (j, < j, INFORMIF (i, φ) >) > be done, and then the intention to achieve its feasibility, are inferred. The following belief is derivable:
Bi(¬Bifi φ ∧ ¬Uifi φ)
Now, no intention can be inferred. This terminates the
planning process. The performance of a direct strict-yn-question plan can
be started for instance by uttering a sentence such as "Has
the flight from Paris arrived ?"
Given the FPs and the RE of the plan above, the following model for a direct strict-yn-question plan can be derived:
< i, YNQUESTION (j, φ) >
FP: Bi Bifj φ ∧ ¬Bifi φ ∧ ¬Uifi φ ∧ Bi ¬Bj(Bifi φ ∨ Uifi φ)
RE: Bifi φ
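For illustration purposes only, the following Python sketch shows a toy backward-chaining planner in the spirit of properties 1 and 2; the act models, formula labels and the entailment test are drastically simplified placeholders of ours, whereas ARTIMIS implements these principles with a full first-order modal inference engine.

from dataclasses import dataclass

@dataclass(frozen=True)
class ActModel:
    name: str
    fp: frozenset        # feasibility preconditions, as a set of atomic labels
    re: str              # rational effect, as an atomic label

def plan(goal, mental_state, act_models, depth=3):
    """Return the chain of acts whose performance is intended, innermost goal first."""
    if depth == 0:
        return []
    for act in act_models:
        if act.re == goal:                       # property 1: select an act achieving the goal
            if act.fp <= mental_state:           # property 2: FPs believed to hold, stop here
                return [act.name]
            # otherwise, intend to bring about the act's performance, i.e. plan an
            # act whose RE is that this act gets done (typically a REQUEST)
            return [act.name] + plan("done(" + act.name + ")", mental_state, act_models, depth - 1)
    return []

# Direct strict-yn-question example (labels are informal stand-ins for SL formulas):
models = [
    ActModel("j INFORMIF i phi",
             frozenset({"Bifj phi", "not Bj(Bifi phi or Uifi phi)"}), "Bifi phi"),
    ActModel("i REQUEST j informif",
             frozenset({"not Bifi phi", "not Uifi phi"}), "done(j INFORMIF i phi)"),
]
state = frozenset({"Bi Bifj phi", "not Bifi phi", "not Uifi phi"})
print(plan("Bifi phi", state, models))
# -> ['j INFORMIF i phi', 'i REQUEST j informif']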
When unified with SL, SCL (the language for semantic contents)
becomes a rich language, but it may turn out to be difficult to use
as such (nevertheless, it is worth recalling that ARTIMIS, the
technology required to use it, is available now). However, as
mentioned above, the expressiveness of ARCOL is conditioned by that of SCL.
Suppose that one wants to simplify SCL, for example by
restricting it to a first-order predicate logic language. Let us
call this simplified version SCL1. In this case, ARCOL is
accordingly restricted to a language ARCOL1. Yet, we would like
to enable the agents to communicate their mental attitudes
(beliefs, intentions, etc.). One solution is to augment
the set of ARCOL1's communicative acts with complex acts (namely,
macro-acts) which intrinsically integrate the relevant mental
attitudes into their semantics. For example, the act of agent i
informing agent j that it (i.e., i) has a given intention has the
following semantics in ARCOL:
< i, INFORM (j, Ii p) >
FP: Ii p ∧ ¬Bi Bj Ii p ∧ ¬Bi Bj ¬Ii p
RE: Bj Ii p
This act can be defined in ARCOL1 as follows:
< i, INFORM-I (j, p') >
FP: Ii p' ∧ ¬Bi Bj Ii p' ∧ ¬Bi Bj ¬Ii p'
RE: Bj Ii p'
Of course, this new ARCOL1 act < i, INFORM-I (j, p') > is only an
abbreviated form of the ARCOL act < i, INFORM (j, Ii p) >, since
propositions p and p' are expressed in SCL and SCL1, respectively.
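The following fragment (again a toy encoding of ours) makes the abbreviation explicit: the ARCOL1 act INFORM-I simply expands into an ARCOL act INFORM whose content is wrapped in the intention operator.

def expand_inform_i(i, j, p):
    """<i, INFORM-I(j, p)>  expands to  <i, INFORM(j, I_i p)>"""
    return (i, "INFORM", j, ("I", i, p))

print(expand_inform_i("i", "j", "book_room(d)"))
# -> ('i', 'INFORM', 'j', ('I', 'i', 'book_room(d)'))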
In other respects, the major downside of this solution is that
it requires the definition of an additional communicative act each time
the semantic content cannot be expressed in the first-order
predicate language.
In any case, if agents are required to express communicative acts
which refer to mental attitudes, the corresponding technology must
enable the manipulation of such notions. Consequently, if agents
have these notions explicitly represented in their own kernels,
there is no a priori reason not to include them in the agent
communication language.
Among the interagent communication languages that have been
proposed recently, one seems to emerge: KQML [KQML93, LF94, FLM96]. KQML
has been designed to facilitate high-level cooperation and
interoperation among artificial agents. Agents may range from
simple programs and databases to more sophisticated
knowledge-based systems. In KQML, agents communicate by passing
so-called "performatives" to each other. KQML provides
an open-ended set of "performatives", whose meaning is
independent of the propositional content languages (e.g.,
first-order logic, SQL, Prolog, etc.). Examples include
ACHIEVE, ASK-ALL, ASK-IF, BROKER, DELETE-ALL, DENY, STREAM-ALL, TELL, UNTELL,
UNACHIEVE. Moreover, KQML provides a set of conversation policies
that express which "performatives" start a
conversation and which "performatives" are to be used
at any given point of a conversation. These policies induce a set of
interagent conversation patterns using the communication actions.
Several general difficulties with the KQML specification have been identified. The major one is that the meaning of the defined "performatives" is rather unclear. "Performatives" are given in English glosses, which often are vague and ambiguous (a first attempt to clarify the language has been made [LF94], but much work remains). For example, the "performative" DENY is defined as follows [KQML93, p. 12]:
DENY
:content <performative>
:language KQML
:ontology <word>
:in-reply-to <expression>
:sender <word>
:receiver <word>
"Performatives" of this type indicate that the
meaning of the embedded <performative> is not true of the
sender. A deny of a deny cancels out.
As noted by Cohen & Levesque [CL95], if an agent denies a tell,
does that mean that the agent did not tell earlier, or that it does
not believe what is being said now? Cohen & Levesque highlighted
another confusion: the DENY act says that what agents deny is a
performative, and that it is no longer true of the speaker. This
implies that "performatives" do in fact have truth values,
and are not actions after all. And, given that other
"performatives", such as UNACHIEVE, UNREGISTER and UNTELL, are
defined in terms of DENY, this is not a small problem.
Another example of vagueness appears with the
"performative" DELETE-ALL which is defined as follows
[KQML93, p. 17]:
DELETE-ALL
:content <expression>
...
"Performatives" of this type indicate that sender
want the receiver to remove all sentences that match
<expression> in its virtual knowledge base.
First of all, if the notion of belief (or knowledge) is
meaningful for an agent, it is not clear what removing propositions means
exactly. Does it mean believing the opposite propositions?
Becoming ignorant about the referred propositions? Secondly, how
can the "ALL" be circumscribed? If all an agent's knowledge,
including its reasoning rules, is "coded" in the same
knowledge base, and the agent receives the
"performative" DELETE-ALL with a content that matches
everything, the agent should, a priori, remove its entire
knowledge base. Thirdly, a matching operation must be defined for
each propositional content language. For example, if the language
is Prolog, we can suppose that "matching" means
"unification". But what does "matching" mean
when the language is first-order logic? Therefore, the
meaning of the DELETE-ALL act depends on the meaning of the
matching operation. Thus, any attempt to semantically define KQML's
performatives first requires a complete and clear
semantics for the language of communicative act contents. This also
applies to the EVALUATE "performative" and its
variants such as ASK-IF, ASK-ALL, and STREAM-ALL:
EVALUATE
:content <expression>
...
"Performatives" of this type indicate that the
sender would like the recipient to simplify <expression>,
and reply with this result. (Simplification is a language specific
concept, but it should subsume "believed equal".)
[KQML93, p. 15]
We consider that simplification is more than a
language-specific concept: it is a domain-specific operation.
For example, assuming that the propositional content language is
an arithmetic language, the simplification of an expression such
as "1x2x4" can be "8", "2x2x2"
(prime numbers), or any other relevant expression.
In fact, only two types of speech acts are hidden in the
English glosses of KQML's "performatives": directives and
assertives. In particular, although KQML is extensible, an
important class of speech acts seems to be missing: the
commissives, such as promising, accepting a proposal, agreeing to
perform a requested action, etc. And it is hard to see how agents
can robustly communicate without these actions or a way to express
commissive attitudes, such as intentions (see, e.g., the ARCOL1
INFORM-I communicative act).
Is it relevant for a high-level communication language like KQML
to define "performatives" such as STANDBY, READY and NEXT,
which are useful for dealing with the low-level problem of
stream buffering? In the same way, KQML defines several families of
"performatives" that differ according to the format of
the expected replies. For example, STREAM-ALL is like ASK-ALL
except that, rather than replying with a "performative"
containing a collection of solutions, the responder sends a series
of "performatives", each of them containing a member of
that collection.
ARCOL has none of the problems or weaknesses mentioned above.
Moreover, it can provide the same expressive power as KQML (and
much more), starting from a smaller set of semantically
well-defined primitive communicative acts. If needed, more
complex and/or specific (or customised) acts can easily be defined
with the same precise semantics.
For example, the two KQML "performatives" TELL
(which indicates that its content is in the sender's virtual
knowledge base) and INSERT (which indicates that its content
must be added to the receiver's virtual knowledge base) are
respectively expressed in ARCOL, using the same communicative act, as follows:
< i, INFORM (j, Bi p) >
and
< i, INFORM (j, Ii Bj p) >
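For illustration, and assuming the same toy tuple encoding of ARCOL acts as in the sketches above (with "B" and "I" standing for the belief and intention operators), these two mappings can be written as follows; this is a sketch of ours, not a full KQML translator.

def tell_to_arcol(i, j, p):
    # TELL: proposition p is in the sender's virtual knowledge base
    return (i, "INFORM", j, ("B", i, p))

def insert_to_arcol(i, j, p):
    # INSERT: the sender intends the receiver to come to believe p
    return (i, "INFORM", j, ("I", i, ("B", j, p)))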
KIF [Genesereth & Fikes 92] is a knowledge representation language which may be compared to SCL. Unlike SCL, which is not related to any programming language, KIF is strongly LISP-oriented: whatever the interpretation language mechanism is, it must handle, at least, the quotation and the comma, and potentially the whole LISP language. Moreover, unlike ARTIMIS's automated inference unit, the existing inference engines for KIF (such as EPILOG and PROLOGIC) are rather weak regarding the intended richness of the language: firstly, apparently, deduction is limited to Horn clauses. Secondly, it is not clear how is inference drawn into the scope of the quotation operator, and especially into the scope of a belief operator. More generally, it is not clear at all how specific belief logics, particularly those relevant in the context of agent interaction , as KD45, or other BDI logics, can be handled
The strong overlap between human-agent interaction and
agent-agent interaction argues for designing a common technology for
these two types of interaction.
Concerning the interaction protocols, the same argument can be
used. Agent-agent protocols are implemented through message
exchanges (i.e., speech acts or communicative acts). In
the case of human-agent interaction (and more specifically natural
language interaction), the protocols are subject to a large variability
and do not correspond to a pre-specified script. At each step of the
interaction, agents and humans react (ideally, in a rational way)
opportunistically according to the messages they receive. In the
case of agent-agent interaction, the protocols can (and, why
not, should) also have the same flexibility.
The interaction protocols we propose are not relative to the interaction structure (e.g., an interaction grammar or automaton, as it is the case for the Contract Net protocol, for example), but rather to the agent's cooperative behaviour. In that sense, these protocols can be viewed as the basis for interagent negotiation.
The interagent interaction protocols commit the agents
to basic behaviour principles, some of which directly follow
from the semantics of the communication language. As far as we
are concerned, given the ARCOL communication language defined in
section 3.1 (and considering that the agents follow its
semantics), a minimal cooperativeness is necessary and sufficient
for interaction to take place. Conversely, a minimal
cooperativeness is required for communication to be possible. For
example, suppose that agent A1 asks agent A2 if proposition p
is true; if both agents respect the semantics of the communication
language, A2 knows that A1 intends to know if p is true.
But, without a minimum of cooperativeness, A2 is in no way
constrained to react to A1's request.
Informally, the MCP protocol states that agents must not only
react when addressed but, more than that, must adopt the interlocutor's
intention whenever they recognise it (provided there is no objection to
adopting it). Formally, this principle is expressed as follows
[Sadek 91a]:
Bi Ij φ ∧ ¬Ii ¬φ ⇒ Ii φ
Actually, this protocol has a wide scope: it may produce cooperative behaviours which are much more complex than merely answering questions, such as making an agent forward a request to a competent agent if it cannot answer the request by itself (cf. brokering).
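A toy illustration of this intention-adoption reading of MCP is sketched below; the representation of intentions and the objection test are ours and deliberately simplistic.

def mcp_adopt(own_intentions, recognised_intention):
    """Return the updated set of intentions after applying the MCP principle."""
    objection = ("not", recognised_intention) in own_intentions
    if not objection:
        return own_intentions | {recognised_intention}
    return own_intentions

i_intentions = {("not", "disclose(price_list)")}
print(mcp_adopt(i_intentions, "Bj(room_available(d))"))   # adopted
print(mcp_adopt(i_intentions, "disclose(price_list)"))    # objection: not adopted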
More complex cooperativeness protocols can also be
standardised.
Let us take an example with a Personal Travel Assistant (PTA).
A user wants to reserve a hotel room. She asks her PTA to find a
double room for her, with a bath-tub, a satellite-TV, for less
than 300 FF, for date d, and if not possible, to accept,
in priority, that the satellite-TV be replaced with regular-TV or
with no TV at all, or that the bath-tub be replaced with a shower or
with a shared shower. Finally, the user asks the PTA to contact
her if only rooms with higher prices are available.
The PTA contacts a hotel reservation agent and asks for a
double room with a bath-tub, a satellite-TV, for less than 300 FF, for
date d. Suppose that the rooms available on date d
are as follows:
beds     | bathroom      | TV     | price (FF)
1 double | bath-tub      | reg-TV | 450
1 double | shower        | sat-TV | 450
1 double | shared shower | -      | 350
2 single | shared shower | reg-TV | 350
1 single | shared shower | -      | 250
Without any cooperativeness protocol, the hotel reservation
agent would answer "0" (i.e., no room of the
requested type is available), and the PTA could infer no further
information from this answer; it would be led to ask further
questions (up to 9, i.e., 3 TV options (sat-TV, reg-TV, no TV)
x 3 bathroom options (bath-tub, shower, shared shower)),
to which the hotel reservation agent may answer "0"
again.
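This combinatorial blow-up can be made concrete with the following Python sketch (the data encoding and field names are ours); it replays the room table above and checks that each of the nine relaxed queries still yields an empty answer.

rooms = [
    {"beds": "1 double", "bathroom": "bath-tub",      "tv": "reg-TV", "price": 450},
    {"beds": "1 double", "bathroom": "shower",        "tv": "sat-TV", "price": 450},
    {"beds": "1 double", "bathroom": "shared shower", "tv": None,     "price": 350},
    {"beds": "2 single", "bathroom": "shared shower", "tv": "reg-TV", "price": 350},
    {"beds": "1 single", "bathroom": "shared shower", "tv": None,     "price": 250},
]

def query(tv, bathroom, max_price=300, beds="1 double"):
    return [r for r in rooms
            if r["beds"] == beds and r["bathroom"] == bathroom
            and r["tv"] == tv and r["price"] < max_price]

for tv in ("sat-TV", "reg-TV", None):
    for bathroom in ("bath-tub", "shower", "shared shower"):
        assert query(tv, bathroom) == []          # each of the 9 answers is "0"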
But, with a cooperativeness protocol, by providing some
additional relevant information, the hotel reservation agent can enable
the PTA to use its inferential capabilities. For example, if the
hotel reservation agent has declared that it uses a Corrective-Answers
Protocol (CAP) and produces the following answer: "There
are no double rooms either for less than 300 FF or with both a
bath-tub and a satellite-TV", the PTA can infer, on the one
hand, all the answers to the new questions just mentioned and,
on the other hand, that there are double rooms for more than 300
FF, double rooms with a bath-tub, and double rooms with a satellite-TV.
Such protocols are very useful for the efficiency of
interaction: not only do they reduce the number of messages
exchanged in an interaction, but they also enable agents to optimally
exploit the messages they receive, depending on their inferential abilities. However,
since these protocols cannot necessarily be made available to
all agents, they can be considered as complementary protocols
(unlike MCP, which is intended to be a basic one) to which agents
may or may not subscribe. In this way, if an agent declares that
it behaves according to these complementary protocols, its
interlocutors are entitled to infer certain types of relevant
information from its responses.
Of course, for an agent, declaring that it uses a cooperativeness
protocol such as CAP means having certain inferential capabilities
and committing to a certain behaviour.
The second cooperativeness protocol we propose for
standardisation is the CAP protocol. First of all, we propose to standardise
the corrective-answers production principle. Formally,
this principle can be expressed as follows [Sadek 91a]:
Bi(¬φ ∧ Bj φ ∧ Comp(i, φ)) ⇒ Ii Bj ¬φ
It says that an agent i will act to correct a statement φ believed
by another agent j whenever i thinks that it is
competent about the truth of φ and believes the opposite statement
¬φ.
By itself, this principle is not sufficient to support a CAP
protocol. Indeed, depending on their inferential abilities, the
so-called corrective agents (i.e., those using the corrective-answers
production principle) will not detect the same sub-parts of a request
that have to be corrected. Since the CAP protocol is not intended to be
used by all agents, a minimum of (reasonable and relevant)
inferential capabilities is required to be able to use it.
Thus, the corrective-answers production principle is augmented
with some minimal inferential capabilities. First of all, it seems necessary
that a corrective agent be able to infer that the persistent
qualifications of the communicative act it has just observed
hold. This condition is in complete conformity with the communication
language.
Secondly, some inferences about the object descriptions
mentioned in the request also seem necessary: for instance, provided
that a primitive request can be viewed, at a certain level of
analysis, as being about an object described by a conjunction of constraints
or object descriptions, the addressee must be able to infer that
the requesting agent believes that the described object exists
and that all the objects described by the sub-conjunctions of the
initial conjunction (namely, 2^n object descriptions if
the request involves n constraints) also exist. For example, if
the request is about rooms with a double bed and a shower, the objects
described by the sub-conjunctions are rooms, rooms with a double
bed, rooms with a shower, and rooms with a double bed and a shower.
Moreover, the corrective answer must contain all the
problematical "minimal objects", i.e., the
minimal sub-conjunctions which cause the failure of the initial request.
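A minimal sketch of this inference, reusing the PTA example (the constraint encoding and the data are ours), enumerates the 2^n sub-conjunctions of the request and keeps the minimal failing ones; note that it recovers exactly the two sub-conjunctions mentioned in the corrective answer quoted above.

from itertools import combinations

rooms = [
    {"beds": "double", "bathroom": "bath-tub",      "tv": "reg-TV", "price": 450},
    {"beds": "double", "bathroom": "shower",        "tv": "sat-TV", "price": 450},
    {"beds": "double", "bathroom": "shared shower", "tv": None,     "price": 350},
]

request = {                                  # "double room, bath-tub, satellite-TV, < 300 FF"
    "double":   lambda r: r["beds"] == "double",
    "bath-tub": lambda r: r["bathroom"] == "bath-tub",
    "sat-TV":   lambda r: r["tv"] == "sat-TV",
    "<300FF":   lambda r: r["price"] < 300,
}

def satisfiable(names):
    return any(all(request[n](r) for n in names) for r in rooms)

failing = [set(c) for k in range(1, len(request) + 1)
           for c in combinations(request, k) if not satisfiable(c)]
minimal = [s for s in failing if not any(t < s for t in failing)]
print(minimal)   # -> [{'<300FF'}, {'bath-tub', 'sat-TV'}] (set ordering may vary)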
To summarise, the CAP protocol submitted to standardisation involves:
- the correction principle (i.e., the commitment for an agent to be corrective and, consequently, to act in the case of opposite viewpoints) ;
- the mechanisms for inferring sub-conjunctions, in particular inferring all the problematical minimal sub-conjunctions.
One can go further concerning the production of cooperative
answers. In the previous example, the hotel reservation agent can
use the SAP protocol (in addition to the CAP protocol) to produce
the following answer to the first question of the PTA: "There
is no double room either for less than 300 FF or with both a bath-tub and
satellite-TV. But there are two double rooms for 450 FF each: one
with a bath-tub and regular TV, and the other with a shower and satellite-TV."
This answer allows the PTA to infer (in addition to the
information given explicitly) that there is no double room
either with a bath-tub or with satellite-TV for less than 450
FF.
The SAP relies on a suggestive-answers production principle.
This principle consists in informing the requesting agent of other
possibilities when there is no strict answer to the request as it is
formulated. A standard way to determine these possibilities relies
on the standard determination of the corrective answers and on
the existence of universal metrics, i.e., metrics over dimensions
such as time, space, money, etc. The algorithm is as follows:
having explored the set of objects which are described by the
sub-conjunctions, the requested agent can identify a maximal
(with respect to the number of constraints appearing in the initial
request) non-problematical subset. A suggestive answer then consists
in informing the requesting agent about (the existence of) a (set
of) particular object(s), obtained by substituting, on a dimension having
a universal metric, a new constraint (the closest one with
respect to the metric) for the initial one.
In the previous example, the set of non-problematical objects
which has a maximal description can be the double rooms with a
bath-tub or with satellite-TV, and the problematical constraint
having a universal metric is the price. Thus, the hotel reservation
agent can suggest a double room with a bath-tub for 450 FF and a
double room with satellite-TV for 450 FF.
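The corresponding suggestion step can be sketched as follows (again with our own toy encoding): the non-price constraints of a maximal non-problematical description are kept, and the price constraint is moved to the closest value, along this universal metric, for which a room actually exists.

rooms = [
    {"beds": "double", "bathroom": "bath-tub",      "tv": "reg-TV", "price": 450},
    {"beds": "double", "bathroom": "shower",        "tv": "sat-TV", "price": 450},
    {"beds": "double", "bathroom": "shared shower", "tv": None,     "price": 350},
]

def suggest(kept_constraints, price_limit):
    """Relax the price limit to the closest value for which some room satisfies
    the kept (non-price) constraints, and return the rooms to suggest."""
    candidates = [r for r in rooms if all(c(r) for c in kept_constraints)]
    if not candidates or min(r["price"] for r in candidates) < price_limit:
        return []                  # nothing to suggest, or a strict answer already exists
    best = min(r["price"] for r in candidates)
    return [r for r in candidates if r["price"] == best]

# maximal non-problematical descriptions: "double + bath-tub" and "double + sat-TV"
print(suggest([lambda r: r["beds"] == "double", lambda r: r["bathroom"] == "bath-tub"], 300))
print(suggest([lambda r: r["beds"] == "double", lambda r: r["tv"] == "sat-TV"], 300))
# -> one room at 450 FF in each case, matching the suggestive answer above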
As with the corrective-answers production principle and the
subsequent CAP protocol, not all agents are expected to adopt
the suggestive-answers production principle or the SAP protocol.
To summarise, the SAP protocol submitted to standardisation involves:
- the suggestive-answers production principle (i.e., the commitment for an agent to be suggestive when possible) ;
- the mechanisms for the inference of maximal non-problematical sub-conjunctions and for constraint substitution over universal metrics.
ARCOL is a semantically well-defined, highly expressive and
adaptable interagent communication language, supported by an
available intelligent agent technology (ARTIMIS). The expressive
power of ARCOL (and SCL) allows an accurate specification of the
user's personal preferences (or profile). ARCOL (and the underlying
ARTIMIS technology) does not induce any interagent conversational
patterns. Thus, an agent using ARCOL can display very flexible
interaction behaviour (in particular, in the context of
human-agent interaction).
The protocols proposed are clearly specified. They provide a
very appropriate framework for efficient cooperative negotiation.
Indeed, they not only enable agents (and human users) to infer
relevant information from the answers they receive, but they also
allow the number of exchanges required between agents to be
greatly reduced.
ARTIMIS interacts in ARCOL and follows the proposed
cooperativeness protocols. For instance, ARTIMIS is able
to produce cooperative answers (corrective, suggestive, and other
types of cooperative answers). Moreover, since ARTIMIS is based
on an explicit first-order modal reasoning process, its
communicative and cooperative behaviour is specified
declaratively, in terms of logical properties, which are
implemented as such. Changing its behaviour only requires modifying
this set of logical properties. No translation from logic to
another language is required.
Since ARTIMIS involves a reasoning unit and is thus able to draw
inferences, it can genuinely combine services without the
service provider (or designer) having to specify beforehand all
the relevant combinations. In that sense, ARTIMIS fits the plug-and-play
architecture that is required for the FIPA target applications.
Finally, the reasoning abilities of ARTIMIS are useful for dealing
with the notion of "semantically-related-to", which is particularly relevant
to the determination of cooperative answers and to the
filtering and retrieval of relevant information (particularly
required for applications such as Personal Assistant, Personal
Travel Assistant, and Audio-Visual Entertainment and
Broadcasting). Such abilities are all the more important when
the application has a wide semantic domain (as, for example,
Audio-Visual Entertainment and Broadcasting).
[Bretier 95] Bretier P. La communication orale coopérative : contribution à la modélisation logique et à la mise en oeuvre d'un agent rationnel dialoguant. Thèse de Doctorat Informatique, Université de Paris XIII, 1995.
[Bretier & Sadek 96] Bretier P. & Sadek M.D. A rational agent as the kernel of a cooperative spoken dialogue system: Implementing a logical theory of interaction. Proceedings of the ECAI'96 workshop on Agent Theories, Architectures, and Languages (ATAL), Budapest, Hungry, 1996. (A paraître).
[Cohen & Levesque 90] Cohen P.R. & Levesque H.J. Intention is choice with commitment. Artificial Intelligence, 42(2-3):213--262, 1990.
[Cohen & Levesque 95] Cohen P.R. & Levesque H.J. Communicative actions for artificial agents; Proceedings of the First International Conference on Multi-agent Systems (ICMAS'95), San Francisco, CA, 1995.
[Finin et al 96] Finin T., Labrou Y. & Mayfield J. KQML as an agent communication language. In Bradshaw J., ed., Software Agents, MIT Press, Cambridge, MA, 1996.
[Genesereth & Fikes 92] Genesereth M.R. & Fikes R.E. Knowledge interchange format. Technical report Logic-92-1, CS Department, Stanford University, 1992.
[Guyomard & Siroux 89] Guyomard M. & Siroux J. Suggestive and corrective answers : A single mechanism. In Taylor M.M., Néel F., & Bouwhuis D.G., editors, The structure of multimodal dialogue, North-Holland, 1989.
[Garson 84] Garson, G.W. Quantification in modal logic. In Gabbay, D., & Guenthner, F., eds., Handbook of Philosophical Logic, Volume II: Extensions of Classical Logic. D. Reidel Publishing Company: 249-307, 1984.
[Halpern & Moses 85] Halpern, J.Y., & Moses Y. A guide to the modal logics of knowledge and belief: a preliminary draft. In: Proceedings IJCAI-85, Los Angeles, CA. 1985.
[Kaplan 82] Kaplan S.J. Cooperative responses from a portable natural language query system. Artificial Intelligence, 19:165-187, 1982.
[KQML93] External Interfaces Working Group, Specification of the KQML agent-communication language, 1993.
[Labrou & Finin 94] Labrou Y. & Finin T., A semantic approach for KQML - A general purpose communication language for software agents, Proc. of 3rd International Conference on Information Knowledge Managment, November 1994.
[Panaget 96] Panaget F. D'un système générique de génération d'énoncés en contexte de dialogue oral à la formalisation logique des capacités linguistiques d'un agent rationnel dialoguant. Thèse de Doctorat Informatique, Université de Rennes I, 1996.
[Sadek 90] Sadek M.D., Logical task modelling for Man-machine dialogue. Proceedings of AAAI'90: 970-975, Boston, MA, 1990.
[Sadek 91a] Sadek M.D. Attitudes mentales et interaction rationnelle: vers une théorie formelle de la communication. Thèse de Doctorat Informatique, Université de Rennes I, France, 1991.
[Sadek 91b] Sadek M.D. Dialogue acts are rational plans. Proceedings of the ESCA/ETRW Workshop on the structure of multimodal dialogue, pages 1-29, Maratea, Italy, 1991.
[Sadek 92] Sadek M.D. A study in the logic of intention. Proceedings of the 3rd Conference on Principles of Knowledge Representation and Reasoning (KR'92), pages 462-473, Cambridge, MA, 1992.
[Sadek et al 95] Sadek M.D., Bretier P., Cadoret V., Cozannet A., Dupont P., Ferrieux A., & Panaget F. A cooperative spoken dialogue system based on a rational agent model: A first implementation on the AGS application. Proceedings of the ESCA/ETR Workshop on Spoken Dialogue Systems : Theories and Applications, Vigso, Denmark, 1995.
[Sadek et al 96a] Sadek M.D., Ferrieux A., Cozannet A., Bretier P., Panaget F., & Simonin J. Effective human-computer cooperative spoken dialogue: The AGS demonstrator. Proceedings of ICSLP'96, Philadelphia, 1996.
[Searle 69] Searle J.R. Speech Acts,
Cambridge University Press, 1969.