Mobile Multimodal Maps: A Natural Interface to Distributed Services
Adam Cheyer, Douglas Moran (SRI)
Gowang-lo Lee, Sangkyu Park (ETRI)
Objectives and Motivation
With the growing number of on-line information sources and services,
the human user faces increasing difficulty in managing the access and
integration of this information. Automated agents can provide an approach for
reducing this complexity: agents would be responsible for finding
new resources, for knowing how to interact with them on behalf of the user, and
for correctly combining a specific service or piece of information
with other resources.
However, collections of automated agents will be useful only if the user has
a natural way of interacting with them -- otherwise, managing the agents
themselves becomes as complicated as dealing with the services they command.
A user should not have to know what agents are available, where they are
located, or what knowledge schema
the agents use. Rather, he or she should be able to express requests
simply by stating objectives or by giving recommendations.
The architecture used to implement an
agent system must know how to map the user's
model of the world (and ways of interacting with it) into the
agents' model.
Application Proposal
When people communicate with each other, they often combine speech with
physical gestures. For instance, a professor standing at a blackboard
interacting with a classroom of students will most likely
make use of speaking, drawing, writing, circling,
underlining, and pointing. Interacting with a network of distributed
agents should be this easy!
In our proposed application, a user interacts with a map-based application,
running on a lightweight PDA or laptop, to access remote services
and information provided by a dynamic distributed community of agents.
Agent services may include:
- Retrieval of information from semi-structured (e.g. HTML) or structured
sources (e.g. company databases)
- Reasoning about actions to perform (planning agents)
- Tailoring responses to the user's preferences (learning agents)
- Keeping watch for events of interest to the user (monitoring)
- Simulation services: allowing the user to experiment in a
hypothetical world before carrying out actions in the real world
A map interface has been chosen because maps are familiar and graphical,
and are applicable to a number of domains, such as:
- Travel Planning
- Multi-Robot Control
- Disaster Response
- Shop Floor Simulations
- Battlefield Simulations
- many more...
Our recommendation would be for FIPA to select one application domain on
which to focus, perhaps travel planning.
Agent Framework Requirements
We have chosen to highlight several of FIPA's
agent requirements
as particularly pertinent to our application proposal:
- "Allow for the use of natural modalities (e.g. natural language) when
communicating with rational agents"
- "Through the use of high level declarative expressions,
the agent interaction language will facilitate manipulation of messages,
e.g.: reasoning about content, translation to/from natural language"
- "...should be able to handle dynamic addition/deletion/role-changing
of agents"
- "Allow for light-weight agents, able to run on low-memory, low
performance devices"
In our experience, one viable approach to addressing these four requirements
is for the architecture to provide a "Facilitator" agent that
is responsible for managing the execution of complex logical queries. These
queries, expressed in a declarative language, can be generated from
natural language or multimodal requests, as well as being produced by
more standard GUI techniques. Individual agents remain simple and lightweight
since the Facilitator and support framework hide the complexities involved in
resolving queries using a dynamic distributed community of agents.
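The Facilitator idea described above can be illustrated with a minimal sketch in Python. This is not the authors' implementation or any existing framework's API: the class name Facilitator, its methods (register, unregister, solve, solve_all), the (predicate, argument) goal format, and the agent names are all hypothetical and chosen for illustration only. The sketch assumes each agent declares which predicates it can solve, and it treats a complex query as a list of independent goals, with no variable binding shared between them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumed goal format: a (predicate, argument) pair, e.g. ("hotel_near", "airport").
Goal = Tuple[str, str]
# Assumed agent capability: a function from an argument to a list of answers.
Solver = Callable[[str], List[str]]


@dataclass
class Facilitator:
    """Hypothetical facilitator that routes goals to the agents that declared them."""

    # predicate name -> {agent name -> solver function}
    registry: Dict[str, Dict[str, Solver]] = field(default_factory=dict)

    def register(self, agent: str, predicate: str, solver: Solver) -> None:
        """Dynamic addition: an agent announces a predicate it can solve."""
        self.registry.setdefault(predicate, {})[agent] = solver

    def unregister(self, agent: str) -> None:
        """Dynamic deletion: remove every capability the agent declared."""
        for solvers in self.registry.values():
            solvers.pop(agent, None)

    def solve(self, goal: Goal) -> List[str]:
        """Ask every agent registered for the goal's predicate and merge the answers."""
        predicate, argument = goal
        results: List[str] = []
        for solver in self.registry.get(predicate, {}).values():
            results.extend(solver(argument))
        return results

    def solve_all(self, goals: List[Goal]) -> Dict[Goal, List[str]]:
        """Resolve each goal independently (a simplification of a logical query)."""
        return {goal: self.solve(goal) for goal in goals}


# Usage: two hypothetical agents answer the same predicate; one later leaves.
facilitator = Facilitator()
facilitator.register("hotel_db", "hotel_near", lambda place: [f"Hotel A near {place}"])
facilitator.register("web_agent", "hotel_near", lambda place: [f"Hotel B near {place}"])

with_both_agents = facilitator.solve(("hotel_near", "airport"))
facilitator.unregister("web_agent")
after_removal = facilitator.solve(("hotel_near", "airport"))
```

In this sketch the requesting side (for example, a map interface on a PDA) only builds a goal and hands it to the facilitator; it never needs to know which agents exist or where they run, which is what keeps the individual agents lightweight.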
Contact: Adam Cheyer,
SRI International
Version: September 30, 1996