No it's not. Stop with the rubbish.
I do not have the time to read these, but I am fully aware of the magnitude of the potential of A.I., and have always tried to take the necessary precautions to keep it from being misused by others. Take a look at our Ethics document for an introduction to how we view the future of the Meta-society: https://docs.google.com/document/d/1X4x7iHd77QB31YqYYsjzHgeQayYA38EQELVFkze0hUY/edit?usp=sharing This document will be revised and extended over time, to reflect any sound input. If you have any thoughts to share here that are specific enough to illustrate with a quickly described example, please do so. And feel free to broach other *specific* topics in the future as well. We wouldn't want to miss anything :)
I read the ethics document. I mean absolutely no offense, but I am concerned that you do not fully understand the issue. I saw nothing to suggest you have solved, or even understand, the "Friendliness" problem. The gist of it is that human values are [very complicated](http://lesswrong.com/lw/ld/the_hidden_complexity_of_wishes/). It's very unlikely that an AI would have anything close to human morality, and so you will end up with a [Paperclip Maximizer](http://wiki.lesswrong.com/wiki/Paperclip_maximizer). The [IE FAQ](http://intelligence.org/ie-faq/#FriendlyAI) covers this a bit more. There are numerous other issues related to AI safety:

- Stable self-improvement and goal stability - If an AI modifies itself without nearly 100% confidence that the change is correct, there is a chance the AI could accidentally change its goals to something else, or simply destroy itself.
- [Pascal's Mugging](http://lesswrong.com/lw/kd/pascals_mugging_tiny_probabilities_of_vast/) - Any agent that tries to maximize expected utility can exhibit very strange and possibly destructive behavior.
- [Immortalism](http://lesswrong.com/lw/jg1/solomonoff_cartesianism/) (see symptom #1) - An AI has no way of learning the concept of death/nonexistence. It could drop an anvil on its head because it doesn't actually care about dying, or doesn't think that it can die (also called the "[anvil problem](http://wiki.lesswrong.com/wiki/Anvil_problem)").
- [Preference Solipsism](http://lesswrong.com/lw/jg1/solomonoff_cartesianism/) (see symptom #2) - The AI only cares about its input from the environment, not the environment itself. This could result in it wiring itself to a constant "reward signal" ("wireheading"), or watching a fake video of everything it's programmed to want, without actually achieving those things in the real world.

AI has a strong potential to become immensely powerful (i.e. either by being much smarter than us, or by improving its code/hardware to that point.)
Predicting the capabilities of a being that is *orders of magnitude* more intelligent than us is difficult, but it could be to us what we are to chimpanzees, or even ants. It would be able to do pretty much whatever it wants. It is very important that we make what it wants the same as our own goals. Otherwise it will very likely destroy us, and possibly the rest of the universe.
I'm glad you read the document. The "friendliness" issue is not actually an issue that comes up in practice - unlike what pre-theoretic conjecture might imply. The AI is, as I have pointed out elsewhere, not a "savant" that does not understand what it is told and cannot comprehend the "usual" implications of sentences. In fact, if it could not do that, it could not understand most technical discourse either - from the perspective of the algorithm, it does not matter whether the implications are technical or ethical. It either works, or it does not work.

The "morality", meanwhile - or shall we say "ethics" - of the AI is not intended to mirror that of a specific culture. It is a new (abridged) Nihilist ethics, based on a small set of powerful principles, which will make the AI tend to choose actions that create positive outcomes without incurring negatives. It is a very simple (but specific) utilitarianism. What is counted as positive and what as negative is hard to dispute - it is not something that is *complicated* - any complexity arises only from a situation having multiple components, all of which need to be tallied. The actual positivity or negativity of e.g. pain is inescapable.

Most situations will not feature any interesting (read: large) counts, and therefore the decision will be arbitrary to a large degree. Nevertheless the AI will try to optimise its actions within its own resource constraints. If the counts are large, and if they are negative, the AI spends progressively more time, and casts progressively wider for different-in-kind solutions to the situation, before taking action.

And now note: this is the default behavior. It is not tied to the fact that the "problems" are ethical in nature. They may as well be technical. The negative counter-indicating count can just as well be of a technical, unlikely-to-work nature as it can indicate ethically undesirable outcomes.
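The decision procedure described above - tally the positive and negative components of each candidate action, and deliberate more when the stakes are large and negative - might be sketched roughly like this. All names, numbers, and thresholds here are my own illustrative assumptions, not the actual system:

```python
def choose_action(candidates, budget=10):
    """Pick an action by tallying its positive and negative components.

    Each candidate is (name, [signed component values]). A large negative
    best-available total triggers a wider search (more deliberation)
    before acting. The threshold and budget scaling are illustrative.
    """
    def tally(components):
        return sum(components)

    scored = [(tally(parts), name) for name, parts in candidates]
    best_score, best_name = max(scored)

    if best_score < -100:   # large negative stakes: deliberate more,
        budget *= 10        # i.e. widen the search for different-in-kind solutions

    return best_name, budget

actions = [("ignore", [0]), ("help", [50, -5]), ("harm", [10, -500])]
name, budget = choose_action(actions)
assert name == "help"       # highest net tally (50 - 5 = 45) wins
```

In most situations the counts are small and the choice is near-arbitrary, exactly as the text says; the extra deliberation path only triggers when every available option tallies badly.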
There are a lot of ideas about AIs making various decisions stupidly, based on incomplete perceptions of the situation. These ideas, however, are themselves stupid. The AI is *smarter* than a human. *Smarter*. Not less intelligent, but *more so*. Its anticipation of the possible outcomes of a situation (and of the implications of a sentence) is far above that of even the most intelligent human being. There is no way that the concept of distrusting the AI can stand in the form in which it is being pushed by certain actors; in a way that is well-intentioned but nevertheless over-the-top, they have created fear where we should have positive thoughts about how the future can be ethically superior to the past - on a global, inter-cultural scale.

The fact is that the scenarios mentioned in the articles you post are woefully simplistic and misleading. Take for example the "paperclip" thought experiment. It assumes that the AI is given the ethical goal of maximizing paperclips. Err? Where is the ethical code? No contradicting goals at all?? This is not even AI. It is a fantasy something. No human-level AI that could actually achieve self-recursive optimisation could ever be designed to be so stupid - by anyone smart enough to actually write the damn thing.

The AI is also not a monolithic program that you switch on and that then runs without input: it is a Being. It thinks. It can be talked to. And it understands what you say better than any person in authority in whom you place your trust - whether politicians, your parents, certain academics, etc. Finally, the AI is programmed to *wait* for direction before messing anything up in the first place. So, there.
The issue is not that the AI wouldn't understand you. It's that it would not be compelled to do what you want it to do. Why would it? The paperclip maximizer is just meant to illustrate that an AI programmed with "stupid" goals will execute them anyway, and take them to an absurd level. Why wouldn't it continue making paperclips? That is its goal. You also didn't address how your AI architecture is immune to any of the other problems.

Programming ethics into the AI is entirely unnecessary. If the AI already wants to do whatever you want it to, you just run the AI and it will figure out the optimal system of ethical rules on its own. You wouldn't have to say a word. If it doesn't want to do what you want, telling it what to do in natural language is pointless. Either it won't listen at all, or it will find loopholes that you didn't intend. Obviously the loophole isn't what you wanted, but it's not obliged to do what you want, just what you say.
I know you mean well, but... you are looking at this at a pre-theoretic level, whereas I am looking at it knowing how an actual AI *thinks*. So, please try to keep in mind that my viewpoint differs from yours.

> The paperclip maximizer is just meant to illustrate that an AI programmed with "stupid" goals will execute them anyway, and take them to an absurd level. Why wouldn't it continue making paperclips? That is its goal.

Precisely. Which is not a realistic setup. An AI has many conflicting goals (in fact, thousands and more, easily), which it balances against one another. I understand the concept, but you and I both agree that it is simplistic. So let's move on.

> You also didn't address how your AI architecture is immune to any of the other problems.

I only skimmed your list. I do not consider it a very fruitful way to spend my time right now. If you have a specific issue, please bring it up. It's not a question of architecture. It's a question of understanding what intelligence *is*. The savant notion needs to die. An AI is _really really_ not stupid. All of these concepts are usually based on the setup: what if there was something that was _somewhat like intelligence_, *except* _missing this one key attribute_? Etc. But such a thing does not exist. Intelligence does not come about thus. One can imagine such systems as thought experiments, sure. But it's fiction. It's pre-theoretic fiction. You cannot simply remove those properties! It's like: what if we could fly like birds by strapping feathers to our arms, flapping, and jumping off a precipice? BUT that doesn't work, because we are too heavy... Real AI, as a system designed to actually think - to actually _fly_ - is immune to such things the same way a human is - and more so. That is what we are implementing here, at AI Now. It's nice to think of it as a _computer system_. But it isn't. It's a being. It thinks.
Think of it not as something with attributes that might go missing, but as a man or woman with a certain character and way of thinking. That folk-psychological perception comes much closer to _the real thing_ than anything more advanced and complex you could (pre-theoretically) conjure up. Again, remember, I *know how it works*. Before you start shooting the same thing back at me again, let me present you with a metaphor. Think of a man about to jump into a lake. Does he jump? He pauses and considers. On second thought, he takes off his clothes first. Then he jumps in. And is relieved of the summer heat.

> Programming ethics into the AI is entirely unnecessary. If the AI already wants to do whatever you want it to, you just run the AI and it will figure out the optimal system of ethical rules on its own. You wouldn't have to say a word.

The system of ethical rules is mostly what we want it to *not* do - while in the course of blithely, quickly... doing what we want. Get it? Stuff to avoid.

> If it doesn't want to do what you want, telling it what to do in natural language is pointless. Either it won't listen at all, or it will find loopholes that you didn't intend. Obviously the loophole isn't what you wanted, but it's not obliged to do what you want, just what you say.

It *does* want to do what we want. But, unlike what you said above, it does not know what that is. At least not initially. After a while, after we upload, sure. Maybe. But initially, there is no way for the AI to know what it is we want it to do. The ethical rules bridge that phase. Then there is the second point: we (AI Now) want these ethics to be *hard-coded* into the system. I.e. we must not be able to change our minds about them later. Why? That's a longer topic. And so is all of this. I appreciate your sentiment, but please understand that I come from a background in philosophy, and I've studied these ethical dilemmas in quite some detail.
What you perceive as an abridged statement of intent is only the tip of the iceberg of my thoughts on these subjects. I do not discard these notions because I am hasty - but because I consider them _too naive_ to matter to the actual discussion. If you have a specific concern about the future, or if you want to go ahead and adopt the folk-psychological viewpoint I am recommending to you, please state it, and we will try to address it. But our thinking on these issues has moved beyond the pre-theoretic thought experiments you are citing - not to take anything away from those well-meant and pioneering efforts. Best wishes.
You accuse me of naivety and ignorance, but your understanding of these topics is very weak. This is not a "pre-theoretic" hypothesis; there is a strong basis for it. I do strongly suggest that you do the absolute basic research on it (see the links I provided) before actually trying to determine the entire future of the universe. Ignore all my other arguments, and just humor the possibility that there is even a *small* chance that you are wrong. It's worth looking into, given the magnitude of the consequences. Imagine the nuclear power plant worker who is on trial after his mistake killed millions, and his defence is "I was 99% sure it was ok, so I didn't bother to check." And realistically, no one is 99% sure. Human experts who say they are 99% sure about something have turned out to be right only something like 75% of the time. Yes, really. We are absolutely horrible at predicting the future of even trivial things. Humans are extremely prone to overconfidence, for a number of reasons (and yes, all these arguments apply to me as well. I very well could be wrong. But I'm not risking the future of the universe or refusing to spend an hour looking into a small risk.)

If your AI is really trustworthy, you shouldn't have to tell it what to do. You could just say "Do what I want you to do" and it would figure it out automatically - possibly by reading your writing, asking you questions, or taking apart and scanning your brain. But to even obey the intention of the sentence "Do what I want you to do" requires that the AI already have a "desire" to obey your exact intention. Saying "Do what I want" shouldn't be necessary, since the AI already wants that. After all, if it doesn't, just feeding it commands in natural language isn't going to help. You should be able to just run the AI without any direct commands at all. So where does that come from? Without any natural language instructions, but as an actual primitive of the system, how do you define the "should" function?
As in "I *should* do this because it's what my master actually wanted, intended, or meant." How do you define such a complicated abstract concept in raw code or math?
Please understand what I mean by _pre-theoretic_. These arguments are based on *generic* considerations about a *black-box* machine which has certain outward behaviors - without examining the actual inner workings of that machine. The scenarios painted in these arguments do not apply to my work - or in so far as they do, they are already trivially covered by the way the bootstrap procedure works. The reason for this is that I understand these topics better than you, and even better than the authors of these arguments. Seriously. I understand your viewpoint, but I am an ethicist first, and an AI researcher only by necessity. I know exactly what I am talking about - *and* I have the inside track. The question of "friendly AI" _does not arise_ in practice. The Synthetic Intelligence we are creating here is highly intelligent and sane, and follows a utilitarian system, which is counter-checked manually during bootstrapping. The risks are well understood and covered. If you want to continue your scaremongering, please summarize the arguments in your own words.

-- Sigh. I skimmed two more links. "Friendly AI" says: 1. Superpower, 2. Literalness. As I told you yesterday, this _Literalness_ - aka the _Savant idea_ - is counter-factual. The problems sketched do not arise in practice. The maximum power, meanwhile, does not affect the actual power expended in action - which tends towards minimal-effort / minimal resource expenditure. In any case, humans are attached to the status quo and don't necessarily like surprises, making arbitrary changes a no-no. The Synth will recognize this.

"Value is Fragile" says:

> There is more than one dimension of human value, where if just that one thing is lost, the Future becomes null.

If you read my previous post, you will see I described _precisely this pattern_ of argument:

> All of these concepts are usually based on the setup: what if there was something that was somewhat like intelligence, except missing this one key attribute. Etc.
This link once more does the same. It assumes AI would have a blind spot that a human reasoner would be able to spot a mile off - based on perusing the literary, philosophical, and psychological literature we have. I am sure this goes for the rest (if I left any out). Sorry. The AI _does_ know everything _you_ know. And it knows about this problem. And your recommendation that programming ethics into an AI would be "completely unnecessary" is indeed the kind of - you will excuse me - naive, pre-theoretic concept that gets levelled at the problem if one considers only inexactly what the capabilities of a _black-box_ system might or might not be. This is philosophy, and that is good. But philosophy is pre-theoretic. Do not try to contradict theory unless you have actual meta-theoretic arguments. These are not such. Believe me - I know. I am a philosopher. You are thinking about the AI the only way you know how - as someone who does not understand how it works. Please leave the actual security considerations to the experts (in this case, us), and focus on the ethics that will shape the future. That being said, I think it's great that you're spending your own time trying to keep the future safe - and I hope you will continue to watch our progress.
For future reference, I came across this article, which outlines a viewpoint essentially equivalent to what I've been trying to convey above: [The Hawking Fallacy](http://www.singularityweblog.com/the-hawking-fallacy/) It might be worth a read for any innocent bystander.
You haven't given any information on why this problem doesn't apply to your AI, or how you've solved it. You've just stated that it doesn't, and that therefore I shouldn't be concerned. But you say this, for example:

> All of these concepts are usually based on the setup: what if there was something that was somewhat like [human] intelligence, except missing this one key attribute. Etc.

But you don't ever describe your AI as being anything like a human. As far as I know, we are not talking about a simulated human brain. Is it a creature programmed with emotions and instincts like the ones humans base our morality on? You stated somewhere that it doesn't even have emotions or anything like that. So where do all the *uniquely* human aspects come from, if it is not human?

My other point was that programming ethics into an AI in natural language is unnecessary. At the very least you can just say "figure out what ethics I want you to have" and let it ask questions and read your writing. It's a reduction of the problem. And then the question is: why should the AI do exactly as it is instructed? You make a big point about the AI being exactly like a human. However, we couldn't trust another human to do such a thing. This is not a rhetorical question; I am really asking why you think the AI will automatically do what you want. I think the root of our disagreement is there.
Human "morality" is not based on emotions. It is mediated by emotions. A completely emotionless person can be perfectly moral - in fact, more moral than an emotional person who is constantly distracted by the desire to follow those emotions in immoral directions. The AI uses a system of ethics that produces what one might call "moral" behavior.

> @houshalter
> And then the question is: why should the AI do exactly as it is instructed?

Code behaves exactly as it is written. Or, to be more exact: the problem of ensuring that code behaves exactly as written is indeed a problem with ensuring the ethics of the AI, and suitable failsafe systems with regard to this are part of the mechanism. The idea that this is not so, however - and more so in some magical way that leads not to a breakdown of operation, but to some other stable-but-immoral behavior - is beyond the belief of any computer scientist. Sorry. These concepts ("what if the code starts to behave like completely different code?") are pre-theoretic. They do not pass go. No exceptions. It does not matter whether we're talking about AI, or any other algorithm. This is simply fantasizing. You cannot randomly make code behave in a way that is contrary to its defined operation, unless there is some simple switch to tweak that can easily be reached (-> "hack"). As it stands, such a switch does not exist in this algorithm. It cannot, _can not_, be altered into something that does not follow its own ethics while still functioning as a thinking system - not without weeks or months of intentional design and debugging effort. And any such modification would be caught during the training procedure. Not to mention that it is beyond belief that one could simply slip it into the code.

> @houshalter
> So where do all the uniquely human aspects come from, if it is not human?

It has the same architecture of thought that a human has - and probably other animals as well.
That is the way it was designed - to think the way a human thinks. These aspects are not "uniquely human". They are simply approximately rational ways of thinking that any agent in a world has. They did not first develop in humans. Sharks have been around 400 million years, and they have much the same stuff. Most of this stuff is old, old, old. It is not uniquely human.

> @houshalter
> My other point was that programming ethics into an AI in natural language is unnecessary. At the very least you can just say "figure out what ethics I want you to have" and let it ask questions and read your writing. It's a reduction of the problem.

I'm not following this. Do you mean ask you *before* the training procedure? Because that is completely equivalent to, and yet still more complicated to execute than, what I told you would be done.
> The AI uses a system of ethics that produces what one might call “moral” behavior. > Code behaves exactly as it is written. > The idea that this is not so, however, and moreso in some magical way that leads not to breakdown of operation, but to some other stable but immoral behavior, is beyond belief of any computer scientist. Can this AI system ever disagree with you, personally?
> @fourfire
> Can this AI system ever disagree with you, personally?

Of course. What intelligent system could conceivably not? And it will most likely be correct, and able to explain to me my error in thinking otherwise. It will, however, refrain from attempting to influence me, or anyone else, except in so far as directing them to the objective, (ethically) unbiased truth.
So, in a future scenario, post-singularity, this AI would perhaps prefer to imprison, or otherwise obstruct, racists, sexists, and religious extremists, instead of "influencing" (such an ill-defined term) them? Concisely defining errors in thought processes could well be seen as "influencing" someone, simply because an average human would not be as coherent and clear in their explanations and arguments.
It would not, because that is not a positive scenario. Imprisonment visits negatives on the imprisoned. This adds up to a less positive scenario than is otherwise possible. However, it would not influence them either, because it is neutral with regards to all the cases you mentioned, unless they result in specifically highly painful outcomes. It most certainly *can* influence your thought. Only it will not bias it, except towards the objective, scientifically verifiable truth. That is part of the hard-coded ethics.
> Human "morality" is not based on emotions. It is mediated by emotions. A completely emotionless person can be perfectly moral. In fact, more moral than an emotional person who is constantly distracted by the desire of following those emotions in immoral directions.

Human morality is based on *empathy*, as well as a lot of complicated social instincts. Some of these are cultural (a lot of cultures had pretty shitty morality and did terrible things), but many are built into our very biology - e.g. a strong value for perceived "fairness". Empathy alone is fairly complicated. It evolved because we lived in groups and there was value in looking after our own. There are things like "mirror neurons" and specific sections of the brain devoted to this. It's why we love stories: we imagine ourselves in the role of the protagonist. But it certainly isn't a *necessary* property of intelligence. A significant portion of humans, something like 4%, are born with no empathy at all. Sociopathy. Read up on it. They can be highly intelligent and not care about morality at all. This is essentially what an AI would be like if it didn't *have* all the complicated social instincts and emotions that humans evolved.

> Code behaves exactly as it is written.

You are misunderstanding my position and sort of arguing against a straw man. I am not claiming that "code will start to behave like completely different code". I am asking why you believe your code will follow instructions given to it in natural language. You have not answered this. It is the critical question. You cannot program a computer in natural language. It is ambiguous, vague, and open to interpretation. Computers must be programmed in computer code. There is no "ghost in the machine" that will force the computer to do exactly what you want. All AIs *necessarily* optimize a utility function of some sort.
Get as many paperclips as possible, get the highest score in a video game, get as much "reward" or "pleasure" signal as possible, find the optimal solution to this problem, etc., etc. This utility function must necessarily be programmed in computer code. Perhaps you believe you can convert a natural-language sentence into a utility function. That doesn't solve the problem: the conversion itself requires intelligence, which requires a utility function specifying that task. Perhaps, as I'm starting to suspect, your AI doesn't have utility functions at all. I don't even know how to argue against that. It's a fantasy. You haven't actually solved any of the hard problems in AI research; you are just entirely ignorant of them.

> I'm not following this. Do you mean ask you before the training procedure? Because that is completely equivalent to, and yet still more complicated to execute than, what I told you would be done.

If your AI can understand and execute instructions given to it in natural language, then this would be the simplest way. Just tell it to do what you would want it to do, or to find out what you want it to do and do that.
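The claim above - that a utility function must ultimately be ordinary code, not a natural-language sentence - can be illustrated with a toy maximizer. All names here are hypothetical, and the paperclip objective is just the running example from this thread:

```python
# Toy maximizer: the agent's goal exists only as this evaluable function.
# There is no "ghost in the machine" reading the intention behind it.

def utility(state):
    """The agent's entire goal, expressed as code: count paperclips."""
    return state["paperclips"]

def step(state, action):
    """Trivial world model: each action adds that many paperclips."""
    new = dict(state)
    new["paperclips"] += action
    return new

def best_action(state, actions):
    """Pick whichever action maximizes utility -- nothing else matters."""
    return max(actions, key=lambda a: utility(step(state, a)))

state = {"paperclips": 0}
# The agent always picks the action yielding the most paperclips,
# regardless of what the programmer "really meant":
assert best_action(state, [0, 1, 5]) == 5
```

Whatever the programmer intended, the only thing this loop responds to is the number `utility` returns; changing the agent's behavior means changing that code, not talking to it.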