
Comments on Prolegomena to any future artificial moral agent – Allen, Varner and Zinser (2000) March 12, 2011

Posted by Sean Welsh in ethics.

Journal of Experimental and Theoretical Artificial Intelligence, 12, 2000, pp 251-261.


As artificial intelligence moves ever closer to the goal of producing fully autonomous agents, the question of how to design and implement an artificial moral agent (AMA) becomes increasingly pressing. Robots possessing autonomous capacities to do things that are useful to humans will also have the capacity to do things that are harmful to humans and other sentient beings. Theoretical challenges to developing artificial moral agents result both from controversies among ethicists about moral theory itself and from computational limits to the implementation of such theories. In this paper the ethical disputes are surveyed, the possibility of a “Moral Turing Test” is considered and the computational difficulties accompanying the different types of approach are assessed. Human-like performance, which is prone to include immoral actions, may not be acceptable in machines, but moral perfection may be computationally unattainable. The risks posed by autonomous machines ignorantly or deliberately harming people and other sentient beings are great. The development of machines with enough intelligence to assess the effects of their actions on sentient beings and act accordingly may ultimately be the most important task faced by the designers of artificially intelligent automata.

Comments on the paper

The authors begin by briefly stating the case for machine ethics.

Allen et al write:

Robots possessing autonomous capabilities to do things that are useful to humans will also have the capability to do things that are harmful to humans and other sentient beings. How to curb these capabilities for harm is a topic that is beginning to move from the realm of science fiction to the realm of real-world engineering problems. As Picard (1997) puts it: ‘The greater the freedom of the machine, the more it will need moral standards’.

Clearly we need machine ethics. Machine ethics can be defined as the ethics that can be implemented in machines.

They continue:

Attempts to build an artificial moral agent (AMA) are stymied by two areas of deep disagreement in ethical theory. One is at the level of moral principle: ethicists disagree deeply about what standards moral agents ought to follow.

They are not kidding. Here is a brief selection of ethical theories you might like to think about implementing to control the behaviour of a robot.

  • Divine command theory
  • Moral relativism
  • Natural law theory
  • Positivism
  • Utilitarianism
  • Other forms of consequentialism
  • Kantianism
  • Other forms of deontology
  • Moral pluralism
  • Virtue theory
  • Moral particularism
  • Emotivism
  • Egoism
  • Marxism
  • Buddhism
  • Asimov’s Three Laws

It gets worse. There are many variations of the Divine command theory. There are various factions of the Jewish schools, various factions of the Christian schools, various factions of the Muslim schools. There are numerous variations of utilitarianism, Marxism and Buddhism and indeed all the above.

No ethical theory has the epistemological status of, say, Newton’s laws. We can all agree on the scope of application of Newton’s laws and their degree of accuracy for projects on this planet. The same cannot be said of even the most popular moral theories in the list above.

Allen et al outline a second problem. This one is more conceptual; in fact it is ontological.

Apart from the question of what standards a moral agent ought to follow, what does it mean to be a moral agent?

Speaking bluntly, I think the question of ‘being’ should be de-scoped from early implementations of machine ethics. I don’t think it is seriously worth pursuing yet.

I take the view that the focus of machine ethics should be on the discovery and development of ethical decision procedures for machines that can realistically be built in the short to medium term.

The idea of digital being is interesting, indeed fascinating, but in the short term we will be more concerned with the acceptable and efficient behaviour of robots built for specific purposes (e.g. mining, transport, eldercare).

Put this way, machine ethics must learn to crawl before it aspires to govern (or save) the planet. But that is not to deny that there will be a common thread between the ethics involved in crawling and the ethics involved in planetary government.

Allen et al point out that the requirements of your moral theory will impact your technical implementation. (I think a detailed description of the data required and the decision procedure involved in some of the above theories would be a very worthwhile undertaking.)

In Kant and Mill, for example, as Allen et al point out, there are very different moral principles tied to very different conceptions of what a good moral agent is.

In Mill to be good the robot merely has to act so that it tends to promote happiness.

For Kant to be good the robot must go through certain specific cognitive processes. (Only a good will is good.)

I think the challenge for virtue theory is, at first glance, as high as the Kantian challenge. Exactly how do I go about programming beneficence? What does it mean to assert that good action follows from good character?

Then again, if you have a particular concrete task in mind for your robot, this question is a lot easier to answer in a specific situation than in general.

A taxi robot is good if it takes the shortest practicable route, does not crash, does not get stuck in traffic and uses the minimum amount of energy. A mining robot is good if it extracts ore from the ground and conveys it to a ship or processing plant.
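For illustration, the taxi-robot criteria just listed can be written down as a simple pass/fail check. This is only a sketch: the field names, the 10% route tolerance and the energy bound are my own assumed thresholds, not anything from the paper.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    km_driven: float      # route actually driven
    km_shortest: float    # shortest practicable route
    crashed: bool
    minutes_stuck: float  # time spent stuck in traffic
    energy_kwh: float     # energy consumed

def is_good_trip(t: Trip) -> bool:
    """A taxi robot is 'good' on this trip if it met all four criteria:
    no crash, near-shortest route, not stuck in traffic, frugal with energy."""
    return (not t.crashed
            and t.km_driven <= 1.1 * t.km_shortest   # assumed 10% tolerance
            and t.minutes_stuck < 5.0                # assumed threshold
            and t.energy_kwh <= 0.2 * t.km_driven)   # assumed efficiency bound
```

So long as ‘good’ can be operationalized like this per domain, evaluating it is ordinary engineering.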

What more do we need to worry about?

So long as the robots stay ‘on mission’ I see little serious difficulty in their programming from an ethical perspective, though obviously situations will crop up in such domains that require some ethical decision-making, especially as the machines’ capacity for autonomous action increases.

Allen et al go on to discuss some high-level problems with consequentialist and deontological approaches to machine ethics.

There are various problems with the consequentialist approach. There is the computational ‘black hole’ problem. When do you stop evaluating possible consequences?

I am not too concerned about this. In practical terms the decisions of a taxi robot or mining robot will be relatively simple to compute in terms of future consequences. Thinking about abstract generalities will lead you to processing black holes but having a concrete task in mind will mitigate these processing issues.
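Computationally, ‘having a concrete task in mind’ amounts to capping the lookahead: stop evaluating consequences at a fixed depth. Here is a minimal sketch of depth-limited expected-utility evaluation; the `actions`, `successors` and `utility` callables are hypothetical placeholders for a concrete domain model.

```python
def expected_utility(state, actions, successors, utility, depth):
    """Value a state as its immediate utility plus the value of the best
    available action, looking ahead at most `depth` steps. The depth cap
    is what closes the consequentialist 'black hole'."""
    if depth == 0 or not actions(state):
        return utility(state)
    best_action_value = max(
        sum(p * expected_utility(nxt, actions, successors, utility, depth - 1)
            for nxt, p in successors(state, a))   # (next_state, probability)
        for a in actions(state)
    )
    return utility(state) + best_action_value
```

For a taxi or mining robot the branching factor and useful depth are both small, which is exactly why I expect the computation to stay tractable.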

Similar problems exist for deontology. The brute force required to implement the Categorical Imperative appears daunting. Allen et al write:

To determine whether or not a particular action satisfies the categorical imperative, it is necessary for the AMA to recognize the goal of its own action, as well as assess the effects of all other (including human) moral agents’ trying to achieve the same goal by acting on the same maxim. This would require an AMA to be programmed with a robust conception of its own and others’ psychology in order to be able to formulate their reasons for actions, and the capacity for modelling the population-level effects of acting on its maxim, a task that is likely to be several orders of magnitude more complex than weather forecasting (although it is quite possibly a task for which we humans have been equipped by evolution).

Likewise the Golden Rule has its issues.

Similarly, to implement the golden rule, an AMA must have the ability to characterize its own preferences under various hypothetical scenarios involving the effects of others’ actions upon itself. Further, even if it does not require the ability to empathize with others, it must at least have the ability to compute the affective consequences of its actions on others in order to determine whether or not its action is something that it would choose to have others do to itself. And it must do all this while taking into account differences in individual psychology which result in different preferences for different kinds of treatment.
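The procedure the authors describe here can be stated as a predicate: predict the action’s affective effect on the other party, then ask whether the agent would accept that effect on itself. A minimal sketch, in which the predictor and the preference model are hypothetical callables and the ‘differences in individual psychology’ are compressed into them:

```python
def golden_rule_permits(action, affect_on_other, acceptable_to_self):
    """Permit `action` only if the agent would choose to have the same
    thing done to itself: predict the affective consequence for the other
    party, then test that effect against the agent's own preferences."""
    effect = affect_on_other(action)   # e.g. +1 helped, -1 harmed
    return acceptable_to_self(effect)  # would I accept this done to me?
```

The hard part, of course, is `affect_on_other`: that is where the paper’s ‘weather forecasting’ complexity lives.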

Virtue approaches at first glance have their problems too.

The basic idea underlying virtue ethics is that character is primary over deeds because good character produces good deeds.

But honestly, how could an open-source robot lie? Such a machine could be built (and in my view should be built) to be ethically transparent. Such a robot would be literally shameless: its internal states could be precisely logged, and its algorithms would be open to inspection.

I am not overly worried about the idea of a moral Turing Test for robots. I would rather that the algorithms controlling their behaviour were open source and that all their decisions were logged in real time to enable review of their function.
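The logging side of this is straightforward to sketch. Assuming a JSON-lines log (the record fields here are my own choice, not any standard), every decision becomes a reviewable record:

```python
import json
import time

def log_decision(logfile, inputs, options, chosen, rationale):
    """Append one machine-readable record per decision: what the robot
    sensed, what it could have done, what it did, and why. With open-source
    algorithms plus a log like this, there is nothing for the robot to hide."""
    record = {
        "timestamp": time.time(),
        "inputs": inputs,
        "options": options,
        "chosen": chosen,
        "rationale": rationale,
    }
    logfile.write(json.dumps(record) + "\n")
```

An append-only log of this shape is what I mean by decisions being ‘logged in real time to enable review’.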

I am wary of bottom-up approaches to machine ethics. Most of the ‘nightmare scenarios’ involving imminent human extinction at the hands of our digital creations are based on the premise of machine learning. The machine becomes superior to us and decides to enslave us (as in The Matrix), to destroy us (as in Terminator), or to impose martial law for our own good (as in the film version of I, Robot).

The idea that a future superintelligence might just want to play a game (as in WarGames) I find far more persuasive.

But at this stage, notions such as machines becoming self-aware and developing malevolence are as remote as building consciousness.

However, projects such as Cog are moving in that direction. Shame is part of moral education in humans; emotions provide moral knowledge, and shame in particular lets you know an action is morally wrong. Yet Deep Blue’s lack of passion makes it a more reliable chess player.

Allen et al conclude as follows:

This is an exciting area where much work, both theoretical and computational, remains to be done. Top-down theoretical approaches are ‘safer’ in that they promise to provide an idealistic standard to govern the actions of AMAs. But there is no consensus about the right moral theory, and the computational complexity involved in implementing any one of these standards may make the approach infeasible. Virtue theory, a top-down modelling approach, suffers from the same kind of problems. Bottom-up modelling approaches initially seem computationally more tractable. But this kind of modelling inherently produces agents that are liable to make mistakes. Also as more sophisticated moral behaviour is required, it may not be possible to avoid the need for more explicit representations of moral theory. It is possible that a hybrid approach might be able to combine the best of each, but the problem of how to mesh different approaches requires further analysis.

I agree it’s exciting. The obvious approach is de-scoping. Let’s define the ethical requirements for specific machines in specific domains, not for machines ‘in general’. These will be well documented in human-readable manuals about procedures, safety, best practice and so on. Mining robots should follow the same procedures (morally) as human miners. Taxi robots should obey the same rules as human cabbies.

A hybrid approach can compensate for the lack of agreement in ethical theory. For example, software could take a set of inputs and see whether a proposed action passes the test of the Categorical Imperative and also produces happiness. If we can tick Kant’s boxes and Mill’s boxes in a few milliseconds, this might be more productive than worrying about which box is the right box to tick. And then we can see whether the machine’s own learnings agree as well.

Allen et al state:

We think that the ultimate objective of building an AMA should be to build a morally praiseworthy agent.

This is fine as an ultimate goal but in the short term it is, I think, so difficult as to inhibit progress.

They sum up:

Systems that lack the capability for knowledge of the effects of their actions cannot be morally praised or blamed for effects of their actions (although they may be ‘blamed’ in the same sense that faulty toasters are blamed for the fires that they cause.) Deep Blue is blameless in this way, knowing nothing of the consequences of beating the world champion, and therefore not morally responsible if its play provokes a psychological crisis in its human opponent. Essential to building a morally praiseworthy agent is the task of giving it enough intelligence to assess the effects of its actions upon sentient beings, and to use those assessments to make appropriate choices. The capacity for making such assessments is critical if sophisticated autonomous agents are to be prevented from ignorantly harming people and other sentient beings. It may be the most important task faced by developers of artificially intelligent automata.

I see the design of a morally praiseworthy agent as more problematic. This is akin to building a conscious, knowing thing, and is far more difficult than simply defining the rules well-behaved robots should follow in their functional domains.

Thus I argue for limited scope in machine ethics. Or at least a clear understanding of the feature set being asked for.

As a postscript, I like the term Artificial Moral Agent (AMA). I think it is usefully contrasted with the term Human Moral Agent (HMA).


