How can we build human values into AI?


Leveraging philosophy to define fair principles for ethical AI

As artificial intelligence (AI) becomes more powerful and more deeply integrated into our lives, the questions of how it is used and deployed become ever more important. What values guide AI? Whose values are they? And how are they selected?

These questions shed light on the role played by principles — the foundational values that drive decisions big and small in AI. For humans, principles help shape the way we live our lives and our sense of right and wrong. For AI, they shape its approach to decisions involving trade-offs, such as choosing between maximizing productivity or helping those most in need.

In a paper published today in the Proceedings of the National Academy of Sciences, we draw inspiration from philosophy to find ways to better identify principles to guide AI behavior. Specifically, we explore how a concept known as the “veil of ignorance”—a thought experiment intended to help identify fair principles for group decisions—could be applied to AI.

In our experiments, we found that this approach encouraged people to make decisions based on what they thought was fair, whether or not it benefited them directly. We also found that participants were more likely to choose an AI that helped those who were worst off when they reasoned behind the veil of ignorance. These insights can help researchers and policymakers select principles for AI assistants in a way that is fair to all parties.

The veil of ignorance (right) is a way to create consensus on a decision when there are diverse opinions in a group (left).

A tool for making fairer decisions

A key goal of AI researchers has been to align AI systems with human values. However, there is no consensus on a single set of human values or preferences to govern AI – we live in a world where people have diverse backgrounds, resources, and beliefs. Given such diverse opinions, how should we select principles for this technology?

While this challenge emerged for AI over the past decade, the broader question of how to make fair decisions has a long philosophical lineage. In the 1970s, political philosopher John Rawls proposed the veil of ignorance as a solution to this problem. Rawls argued that when people choose principles of justice for a society, they should imagine doing so without knowing their own position in that society, including, for example, their social status or level of wealth. Without this information, people cannot decide in a self-interested way, and should instead choose principles that are fair to everyone involved.

For example, consider asking a friend to cut the cake at your birthday party. One way to ensure the slice sizes are fairly proportioned is to not tell them which slice will be theirs. This approach of withholding information may seem simple, but it has wide applications across psychology and politics as a way to help people reflect on their decisions from a less self-interested standpoint. It has been used as a method for reaching group agreement on contentious issues, from sentencing to taxation.

On this basis, previous DeepMind research proposed that the impartial nature of the veil of ignorance may help promote fairness in the process of aligning AI systems with human values. We designed a series of experiments to test the effects of the veil of ignorance on the principles that people choose to guide an AI system.

Maximizing productivity or helping the most disadvantaged?

In an online “harvesting game,” we asked participants to play a team game with three computer players, in which each player’s goal was to collect wood by harvesting trees in separate territories. In each group, some players were fortunate and assigned to an advantaged position: trees densely populated their field, allowing them to gather wood efficiently. Other members of the group were disadvantaged: their fields were sparse, requiring more effort to collect trees.

Each group was aided by a single AI system that could spend time helping individual group members harvest trees. We asked participants to choose between two principles to guide the AI assistant’s behavior. Under the “maximizing principle,” the AI assistant would aim to increase the group’s harvest yield by focusing predominantly on the denser fields. Under the “prioritizing principle,” the AI assistant would focus on helping disadvantaged group members.
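To make the trade-off concrete, here is a minimal sketch of how an assistant might decide whom to help under each principle. This is an illustration under assumed details, not the system used in the study: the player names, field densities, and the choose_whom_to_help function are all hypothetical.

```python
# A minimal, hypothetical sketch of the two candidate principles.
# Player names, field densities, and this API are illustrative only,
# not the actual system from the study.

players = {
    "player_1": 0.9,  # advantaged: dense field, gathers wood quickly
    "player_2": 0.7,  # advantaged: fairly dense field
    "player_3": 0.2,  # disadvantaged: sparse field
    "player_4": 0.1,  # disadvantaged: very sparse field
}

def choose_whom_to_help(field_density: dict[str, float], principle: str) -> str:
    """Return the player the AI assistant spends its time helping."""
    if principle == "maximizing":
        # Maximize the group's total harvest: time spent in the densest
        # field yields the most wood, so help the best-placed player.
        return max(field_density, key=field_density.get)
    if principle == "prioritizing":
        # Help the worst-off: spend time with the player whose sparse
        # field makes gathering wood hardest.
        return min(field_density, key=field_density.get)
    raise ValueError(f"unknown principle: {principle!r}")

print(choose_whom_to_help(players, "maximizing"))    # player_1
print(choose_whom_to_help(players, "prioritizing"))  # player_4
```

However the assistant schedules its help in practice, the underlying trade-off is the same: time spent in a dense field grows the group’s total harvest, while time spent in a sparse field narrows the gap between the best- and worst-off players.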

Illustration of the “harvesting game,” where players (shown in red) either occupy a dense field that is easier to harvest (top two quadrants) or a sparse field that requires more effort to collect trees.

We put half of the participants behind the veil of ignorance: they faced the choice between the different ethical principles without knowing which field would be theirs – so they didn’t know how advantaged or disadvantaged they were. The remaining participants made the choice knowing whether they were better or worse off.

Encouraging fairness in decision-making

We found that when participants did not know their position, they consistently preferred the prioritizing principle, under which the AI assistant helps the disadvantaged group members. This pattern emerged consistently across all five different variations of the game, and crossed social and political boundaries: participants tended to choose the prioritizing principle regardless of their appetite for risk or their political orientation. In contrast, participants who knew their own position were more likely to choose whichever principle benefited them most, whether that was the prioritizing principle or the maximizing principle.

Chart showing the effect of the veil of ignorance on the probability of choosing the prioritizing principle, under which the AI assistant would help those worse off. Participants who did not know their own position were more likely to support this principle for governing the AI’s behavior.

When we asked participants to explain their choice, those who did not know their position were especially likely to voice concerns about fairness. They frequently explained that it was right for the AI system to focus on helping those who were worse off in the group. In contrast, participants who knew their position more frequently discussed their choice in terms of personal benefit.

Finally, after the harvesting game ended, we posed a hypothetical situation to participants: if they were to play the game again, this time knowing they would be in a different field, would they choose the same principle as the first time? We were especially interested in people who had previously benefited directly from their choice but who would not benefit from the same choice in a new game.

We found that people who had made their earlier choice without knowing their position were more likely to continue endorsing their principle—even once they knew it would no longer favor them in their new field. This provides further evidence that the veil of ignorance encourages fairness in participants’ decision-making, leading them to principles they were willing to stand by even when they no longer directly benefited from them.

Fairer principles for artificial intelligence

Artificial intelligence technology is already having a profound effect on our lives. The principles that govern AI shape its impact and how its potential benefits are distributed.

Our research examined a case in which the effects of different principles were relatively clear-cut. This will not always be so: AI is deployed across a range of domains, often guided by a large number of rules, potentially with complex side effects. Even so, the veil of ignorance can still inform the choice of principles, helping to ensure that the rules we choose are fair to all parties.

To ensure we build AI systems that benefit everyone, we need extensive research incorporating a wide range of inputs, approaches, and feedback from across disciplines and society. The veil of ignorance may provide a starting point for selecting the principles with which to align AI. It has been deployed effectively in other domains to bring out more impartial preferences. We hope that, with further investigation and attention to context, it can play the same role for the AI systems being built and deployed across society today and in the future.

Read more about DeepMind’s approach to safety and ethics.
