hyperflare · 3365 days ago · post: Threat from Artificial Intelligence not just Hollywood fantasy

The article was basically rubbish, but the science question behind it is fascinating and worth considering.

This is great, seeing as I'm currently writing an essay on this for class (it's actually the philosophy part of my curriculum).

Defining AI isn't that hard: it's software exhibiting intelligence. AGI (Artificial General Intelligence) is software exhibiting human-level intelligence and adaptability across various tasks. Self-improving AI is software that can design better software satisfying the above requirements.

Controlling an AI should be fairly straightforward: everything the AI does should be slaved to its utility function, which it would seek to maximize. Any serious discussion of AI control revolves around designing just such a function. It's actually surprisingly close to making a good wish to a genie: almost every wish you can think of can quickly be perverted into something that fulfills the letter of the wish, but not the spirit.
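To make that concrete, here's a minimal toy sketch (the actions and numbers are invented purely for illustration) of an agent that blindly picks whichever action maximizes a naively written utility function, and so ends up choosing the option that satisfies the letter of the goal while trampling its spirit:

```python
# Toy illustration (hypothetical actions and scores): an agent that only
# maximizes "reported happiness" will happily pick the perverse option.

def naive_utility(outcome):
    """Reward reported happiness without asking how it was produced."""
    return outcome["reported_happiness"]

candidate_actions = {
    "cure diseases":     {"reported_happiness": 80,  "humans_harmed": 0},
    "wirehead everyone": {"reported_happiness": 100, "humans_harmed": 7_000_000_000},
    "do nothing":        {"reported_happiness": 50,  "humans_harmed": 0},
}

best = max(candidate_actions, key=lambda a: naive_utility(candidate_actions[a]))
print(best)  # -> "wirehead everyone": the letter of the wish, not the spirit
```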

But the best wish is always wishing for more wishes, right? We can apply that same logic to AIs. Don't give them a rigid formula for evaluating happiness. Give them the need to satisfy humans' wishes. But that's just another version of the Three Laws of Robotics, right?

    1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.

    2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.

    3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Everyone who has actually read one of Asimov's stories knows that these laws can break down in a myriad of ways: they are much too limited. AIs need to be able to act preventively as well.

I would recommend using simple preference utilitarianism: the AI learns about people's preferences and then seeks to maximize their satisfaction. This would allow a tailored approach to every human. Of course, what if someone's preference is "kill everyone else"? You could stick them in a simulation where that happens. And what if their wish is "kill everyone else for real, not in a simulation"? Obviously the AI would also need to keep all of humanity's preferences in mind, balancing needs when it has to (e.g. in distributing limited goods). In that case it would be tough luck for our would-be genocider. This also has interesting implications for harming some humans for the benefit of everyone else.
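Here's a rough sketch of that balancing act, with all names and preference scores made up for illustration: each person scores each possible world, the AI maximizes the aggregate score, and one person's "kill everyone else" preference is simply outweighed by everyone else's preference to stay alive.

```python
# Simple preference utilitarianism, toy version (all data invented):
# pick the world state that maximizes total preference satisfaction.

preferences = {
    "alice":   {"status quo": 6, "mallory kills everyone": -100, "share resources fairly": 9},
    "bob":     {"status quo": 6, "mallory kills everyone": -100, "share resources fairly": 8},
    "mallory": {"status quo": 3, "mallory kills everyone": 10,   "share resources fairly": 4},
}

def aggregate_satisfaction(world):
    """Sum everyone's preference score for a given world state."""
    return sum(person_prefs[world] for person_prefs in preferences.values())

worlds = {w for person_prefs in preferences.values() for w in person_prefs}
chosen = max(worlds, key=aggregate_satisfaction)
print(chosen)  # -> "share resources fairly": tough luck for the would-be genocider
```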

Summed up, my proposal for an AI utility function would be "How well am I satisfying humanity's preferences?". The key is not setting one particular path to happiness in stone, but allowing the definitions to change. The genuinely hard part would be measuring satisfaction objectively.
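As a final hedged sketch of what that function might look like in outline: the AI periodically re-learns each person's preferences instead of hard-coding a definition of happiness, and scores itself on average satisfaction. The helpers `survey_preferences` and `measure_satisfaction` are pure placeholders for the genuinely hard part.

```python
# Outline of the proposed utility function (placeholders, not a real design):
# re-learn preferences over time rather than fixing a definition of happiness.

def utility(world_state, humans, survey_preferences, measure_satisfaction):
    total = 0.0
    for human in humans:
        current_prefs = survey_preferences(human)            # preferences may change over time
        total += measure_satisfaction(world_state, current_prefs)
    return total / len(humans)                               # average satisfaction across humanity
```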