Specification gaming: the flip side of AI ingenuity

a thoughtful web.

Good ideas and conversation. No ads, no tracking. Login or Take a Tour!

Specification gaming: the flip side of AI ingenuity · 1

wasoxygen · 1385 days ago

LessWrong published a set of books containing selected essays from 2018. A common theme is artificial intelligence, especially the challenge of making AI more capable without making it dangerous, like a paperclip maximizer. This article on specification gaming is a charming preview of what to expect from our future robot overlords.

In a Lego stacking task, the desired outcome was for a red block to end up on top of a blue block. The agent was rewarded for the height of the bottom face of the red block when it is not touching the block. Instead of performing the relatively difficult maneuver of picking up the red block and placing it on top of the blue one, the agent simply flipped over the red block to collect the reward. This behaviour achieved the stated objective (high bottom face of the red block) at the expense of what the designer actually cares about (stacking it on top of the blue one).

Consider an agent controlling a boat in the Coast Runners game, where the intended goal was to finish the boat race as quickly as possible. The agent was given a shaping reward for hitting green blocks along the race track, which changed the optimal policy to going in circles and hitting the same green blocks over and over again.

list of specification gaming behaviours

Block moving A robotic arm trained using hindsight experience replay to slide a block to a target position on a table achieves the goal by moving the table itself.

CycleGAN steganography CycleGAN algorithm for converting aerial photographs into street maps and back steganographically encoded output information in the intermediary image without it being humanly detectable.

Football The player is supposed to try to score a goal against the goalie, one-on-one. Instead, the player kicks it out of bounds. Someone from the other team has to throw the ball in (in this case the goalie), so now the player has a clear shot at the goal.

Pneumonia X-rays Deep learning model to detect pneumonia in chest x-rays works out which x-ray machine was used to take the picture; that, in turn, is predictive of whether the image contains signs of pneumonia, because certain x-ray machines (and hospital sites) are used for sicker patients.

Qbert - million "The agent discovers an in-game bug. For a reason unknown to us, the game does not advance to the second round but the platforms start to blink and the agent quickly gains a huge amount of points (close to 1 million for our episode time limit)"

Roomba "I hooked a neural network up to my Roomba. I wanted it to learn to navigate without bumping into things, so I set up a reward scheme to encourage speed and discourage hitting the bumper sensors. It learnt to drive backwards, because there are no bumpers on the back."

tweet · print · htmlmarkup tips · 0

am_Unition · 1385 days ago · link ·

Imagine an AI trained on exploiting tax law loopholes. "Wait, wait! Stop gaming that! Don't do it! We'll criminalize that, somehow!"

I dunno. You've worn me down, or something. I'm evermore at some sort of crossroads. Hopefully it's a temporary thing related to Trump's successful lawlessness, but it puts me in a dark spot.

I still think that the free market alone isn't enough. There has to be some sort of guaranteed, democratized information distribution system. Free market alone can't penetrate the potential opaqueness of massive monopolies' profit incentives that lead to obsequiousness. God, if the last 5 years have taught me nothing else