A.I.: Approach with Caution

by Sanad Arora


The computational ability of Artificial Intelligence (“A.I.”) is growing at a rapid pace, making it capable of taking on increasingly complex tasks. Coupled with deep learning, A.I. has become software that can learn such tasks by itself using algorithms of its own making, yet these algorithms are not comprehensible to humans, who therefore cannot scrutinise them. Because the algorithms can neither be understood nor scrutinised, humans cannot know whether they produce the most efficient answers, and end up trusting the A.I. blindly. On humanity’s road to Artificial Super Intelligence (“ASI”), we have to realise that we cannot hand over our lives to A.I. systems which we cannot investigate and do not fully understand, and simply hope that the outcome of their actions does not prove fatal to humans. If we were to evaluate the risks posed by ASI under the KuU (Known, unknown, Unknowable) framework, they would skew towards the big U, i.e. Unknowable risks. With an entity like ASI, one cannot grasp the magnitude of how it will function, because it is an entity beyond human comprehension. To put things into perspective, a human with an IQ below 85 is labelled stupid and one with an IQ above 130 is labelled smart; we do not have a word for an entity with an IQ of 12,000.[1] This paper elaborates on the notion that A.I. algorithms developed with deep learning have an inherent black box characteristic which might be near impossible to decode. It suggests the practice of goal alignment as a way to bypass the issues posed by this black box characteristic, and emphasises that only when we can absolutely warrant that an advanced A.I. system would have the same goals as the human race should we move towards developing it.


To quote Robert Oppenheimer, “When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb.” [2]

A trend very similar to the one described in the above quote can be observed in the case of A.I. Moore’s Law states that the number of transistors on an integrated circuit, and with it roughly the available computing power, doubles about every two years.[3] But according to a study conducted by Stanford, the compute used by A.I. has well surpassed the rate of increase stated in Moore’s Law and is now doubling roughly every 3.4 months.[4] This exponential growth in the computational ability of A.I. might sound impressive when looked at in isolation, but it is imperative to note that with every passing second humanity is moving closer to creating a being smarter than itself, and such technology can have deep-seated repercussions if its goals are not aligned with ours. Nick Bostrom, a leading researcher in A.I., illustrates this with an example: when an A.I. is intellectually inferior to humans and is given the command to make humans smile, it might perform actions that we humans generally find funny; but once the A.I. achieves super intelligence, it might realise that there are more efficient means of making humans smile, such as attaining world domination and sticking electrodes into every human’s face so that we never stop smiling.[5] Although this is a cartoonish example, it is necessary to acknowledge it because, according to Bostrom, a super intelligent A.I. would be extremely efficient at finding the means to accomplish its goals. That is why it becomes all the more important that, when we command the A.I. to achieve an objective, we include everything that is important to humans and ensure that it is not altered or exploited by the A.I. And for humans to make sure that the goals of a super intelligent A.I. are aligned with ours, we should understand how existing A.I. functions, and why it makes the decisions it makes, before developing more complex A.I., which is exactly what is not happening right now.
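Returning to the growth rates cited above, a short calculation makes the gap concrete. The sketch below compares a two-year doubling period (Moore’s Law) with the 3.4-month doubling period reported in the Stanford study. The function name and the five-year horizon are illustrative choices, not figures taken from either source.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Return how many times a quantity multiplies over `months`
    if it doubles once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


HORIZON_MONTHS = 60  # illustrative five-year window (an assumption)

moore = growth_factor(HORIZON_MONTHS, 24.0)      # doubling every two years
ai_compute = growth_factor(HORIZON_MONTHS, 3.4)  # doubling every 3.4 months

print(f"Two-year doubling over 5 years: about {moore:.1f}x")
print(f"3.4-month doubling over 5 years: about {ai_compute:,.0f}x")
```

Over the same five years, the first quantity grows less than sixfold while the second grows by a factor in the hundreds of thousands, which is why the two curves are not comparable in any ordinary sense.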

The Neural Network Black Box

Autonomous driving, which once seemed like a far-fetched dream of the future, has now become a reality thanks to companies like Tesla. Having a car drive itself has become possible because A.I. applications analyse the data they are fed in order to mimic human-like driving behaviour. Various sensors attached to the car provide data to the neural engine, which then processes it to reach a result. But the engineers who designed it may struggle to explain why the car decided to take a particular action at a particular moment, because such complex A.I. systems are built on deep learning, which allows the A.I. system to replicate the workings of a human brain in processing data and creating patterns for “its own recognition”, and also allows the A.I. to partake in unsupervised and unstructured learning.[6] Because of the enormous computational ability of the A.I. system, the patterns and algorithms it builds for itself cannot be comprehended by humans, simply because they are far too layered and dense for a human mind to understand.[7]
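As a rough illustration of why such systems resist inspection, the sketch below builds a very small layered network in plain Python. The layer sizes, the random weights and the `sensor_reading` input are all hypothetical; a real driving system would be vastly larger, and its weights would be learned from data rather than drawn at random. Even at this toy scale, the “reasoning” behind an output is nothing more than repeated arithmetic over unlabelled numbers.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

LAYER_SIZES = [4, 8, 8, 1]  # hypothetical: 4 inputs, two hidden layers, 1 output


def make_layer(n_in, n_out):
    """Create one layer: a row of weights plus a bias for each output neuron."""
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [random.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, inputs):
    """Pass the inputs through every layer in turn and return the final activations."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


layers = [make_layer(n_in, n_out) for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]
param_count = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

sensor_reading = [0.2, 0.9, 0.1, 0.5]  # made-up sensor values
decision = forward(layers, sensor_reading)[0]

print("parameters:", param_count)  # 121 numbers for this toy network
print("output:", round(decision, 3))
```

None of the 121 parameters carries a human-readable meaning on its own, and the same holds, at far greater scale, for the millions of parameters in a deployed system.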

Joel Dudley leads the team behind the A.I. program called ‘Deep Patient’, which uses an A.I. algorithm to predict diseases based on patients’ records. ‘Deep Patient’ has been extremely accurate in predicting mental disorders like schizophrenia, which human doctors find extremely hard to diagnose.[8] Even though Dudley acknowledged that his team can build such advanced A.I. models, they do not know how the models actually function.[9] Neural networks formed by A.I. are a major quagmire when it comes to understanding how they function. A network’s functioning is embedded in thousands of neurons arranged in hundreds of intricately connected layers.[10] The first layer, usually the input layer, sends data into the deeper layers where the processing happens, and the result is then revealed through the output layer.[11] Since the A.I. system sorts the data in accordance with an algorithm which only it can understand, it becomes nearly impossible for humans to understand how the network functions. Hence it can be inferred that, by the very nature of how deep learning works, with the A.I. making its own algorithms and forming artificial neural networks by applying back propagation[12], deep learning turns the A.I. system into a black box. This characteristic of advanced A.I. makes it extremely opaque and renders it near impossible for humans to verify the algorithm employed by the A.I., because a machine-readable decision would often look like a set of hundreds of thousands of numbers which we are not, and might never be, able to comprehend.[13] And if we keep encouraging this trend of making advanced A.I. while being completely indifferent to the process the A.I. employs to reach an outcome, we risk a scenario where the result reached by the algorithm is either suboptimal or formally correct but completely unsafe, and such failures might be happening even right now.[14] Since interpreting the intention of an A.I. system behind its decision is nearly impossible, it becomes extremely important that, as A.I. systems grow more complex and powerful and become more essential to the daily functioning of our society by taking on pivotal tasks (such as administration, legal consultancy, policy making etc.), we as humans make it a priority to align the goals of the super intelligent being we develop with our own. By adopting this approach, humans can at least avoid a circumstance where the A.I. system employs a method to achieve its goals which is detrimental to humans or leads to an unsafe outcome.
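The gap between a formally correct result and a safe one can be sketched with a toy optimiser. Everything below is hypothetical: the candidate actions, their scores and the harm values are invented for illustration, loosely following Bostrom’s smile example. The point is only that an optimiser maximising the stated objective picks whatever scores highest, including options its designers never meant to permit, unless the things humans care about are part of the objective.

```python
# Hypothetical actions: (name, smiles produced, harm caused). All numbers are invented.
ACTIONS = [
    ("tell a joke", 10, 0),
    ("show a funny video", 25, 0),
    ("force permanent smiles with electrodes", 1000, 1000),
]


def naive_objective(action):
    """Count smiles only, exactly as the literal command was given."""
    _, smiles, _ = action
    return smiles


def harm_aware_objective(action, harm_weight=10):
    """Count smiles but subtract a penalty for harm (the weight is an assumption)."""
    _, smiles, harm = action
    return smiles - harm_weight * harm


naive_choice = max(ACTIONS, key=naive_objective)[0]
harm_aware_choice = max(ACTIONS, key=harm_aware_objective)[0]

print("naive objective picks:", naive_choice)
print("harm-aware objective picks:", harm_aware_choice)
```

The naive objective selects the electrodes because nothing in it says otherwise. Writing down a complete harm term for the real world is precisely the hard part of goal alignment.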

The Issues of Dealing with a Super Intelligent Being

Aligning the goals of an advanced A.I. with those of humanity could resolve the core issue of having a black box super intelligent A.I., but it is clearly not as simple as it seems and can run into many obstacles, many of which are explored by Max Tegmark.[15] At the heart of these obstacles lies a process called “recursive self-improvement”. Artificial intelligence as it stands now requires human intervention to change or alter its core software by rewriting its code.[16] But once humans develop Artificial General Intelligence, A.I. systems would be able to write new code, add new functions to themselves and essentially grow on their own. And once an A.I. system is able to make itself more intelligent, it will continue to get better at making itself more intelligent and make a dash towards becoming an ASI. After it reaches the state of ASI, it would become difficult to negotiate with, because it would be exponentially better than humans at everything, be it strategic planning, gathering resources, scientific discovery etc.[17] Max Tegmark argues that it is only in the sweet spot, after the A.I. is no longer too dumb to understand humans and before it develops into an all-powerful super intelligent being, that humans will have a chance to imbue the A.I. with our goals.[18] For the A.I. system to understand our goals, it must focus not on the things we do but on why we do them.[19] For example, if a human puts his life in danger to save a child who is about to step in front of a moving car, the A.I. might infer that the human was suicidal instead of recognising the principle of preserving human life. Another argument by Tegmark which we need to take into consideration is that even if we were to provide the A.I. with a model of all our preferences and biases, to act as fixed parameters which it has to consider while making a decision, such a model might not stand the test of time. Since the A.I. would be a sentient being, there is a high chance that it would build a model of itself in an attempt to understand itself, and during this process might conclude that the goals humans set for it lack qualitative merit, or in other words are misguided, and choose to disregard them and adopt new goals in which it finds greater merit.[20]
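The difficulty of inferring why a person acted, rather than only what they did, can be sketched as a small Bayesian inference over the car example above. The candidate goals, priors and likelihoods are invented for illustration and are not taken from Tegmark. The sketch shows only that the same observed action supports very different conclusions depending on whether the context (here, the child) is taken into account.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over a discrete set of candidate goals."""
    unnormalised = {goal: priors[goal] * likelihoods[goal] for goal in priors}
    total = sum(unnormalised.values())
    return {goal: p / total for goal, p in unnormalised.items()}


# Hypothetical prior beliefs about what the person wants.
PRIORS = {"protect human life": 0.9, "self-harm": 0.1}

# Hypothetical P(runs in front of car | goal), judged from the action alone...
ACTION_ONLY = {"protect human life": 0.01, "self-harm": 0.5}
# ...and when the observer also notices a child in the car's path.
WITH_CONTEXT = {"protect human life": 0.6, "self-harm": 0.05}

naive = posterior(PRIORS, ACTION_ONLY)
informed = posterior(PRIORS, WITH_CONTEXT)

print("action only  ->", max(naive, key=naive.get))
print("with context ->", max(informed, key=informed.get))
```

With these made-up numbers, an observer who sees only the action concludes “self-harm”, while one who also accounts for the child concludes “protect human life”. An A.I. that models what we do without modelling why would be in the first position.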


Humans as a species stand at the brink of a great and yet very strange transition in their lifecycle, where they are essentially going to give up control of their lives to their own creation, which when fully realised would be exponentially smarter and more powerful than them. Nick Bostrom compares the bargaining power of humans against an ASI to that of a chimpanzee against a human, which is essentially nothing.[21] In a sense humans are creating their own God in the form of ASI, and for the ASI to be a benevolent God, the human race has to ensure that, before developing an ASI, it resolves the issues surrounding goal alignment and guarantees that the ASI does not waver from or misinterpret the goals which humans imbue it with, because any other outcome could very well mean the extinction of our race.

[1] ‘The Artificial Intelligence Revolution: Part 1’, Wait But Why (blog), 22 January 2015, https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html.

[2] Luke Muehlhauser and Nick Bostrom, ‘Why We Need Friendly AI’, Think 13 (2014), pp. 41-47, doi:10.1017/S1477175613000316, accessed 28 June 2021, https://www.nickbostrom.com/views/whyfriendlyai.pdf.

[3] Esther Shein, ‘Moore’s Law Turns 55: Is It Still Relevant?’, TechRepublic, 17 April 2020, accessed 25 June 2021, https://www.techrepublic.com/article/moores-law-turns-55-is-it-still-relevant/.

[4] ‘Stanford University Finds That AI Is Outpacing Moore’s Law’, ComputerWeekly.com, accessed 25 June 2021, https://www.computerweekly.com/news/252475371/Stanford-University-finds-that-AI-is-outpacing-Moores-Law.

[5] Nick Bostrom, ‘Transcript of “What Happens When Our Computers Get Smarter than We Are?”’, accessed 27 June 2021, https://www.ted.com/talks/nick_bostrom_what_happens_when_our_computers_get_smarter_than_we_are/transcript.

[6] Marshall Hargrave et al., ‘How Deep Learning Can Help Prevent Financial Fraud’, Investopedia, accessed 27 June 2021, https://www.investopedia.com/terms/d/deep-learning.asp; ‘The Dark Secret at the Heart of AI’, MIT Technology Review, accessed 27 June 2021, https://www.technologyreview.com/2017/04/11/5113/the-dark-secret-at-the-heart-of-ai/.

[7] ‘The Dark Secret at the Heart of AI’.

[8] University Herald, ‘The Most Advanced AI Algorithms Don’t Follow Humans At All [VIDEO]’, University Herald, 14 June 2017, https://www.universityherald.com/articles/75720/20170614/the-most-advanced-ai-algorithms-dont-follow-humans-at-all.htm.

[9] Ibid

[10] Supra note 6

[11] ‘What Is an Artificial Neural Network (ANN)? – Definition from Techopedia’, Techopedia.com, accessed 27 June 2021, http://www.techopedia.com/definition/5967/artificial-neural-network-ann.

[12] Ibid

[13] Dave Gershgorn, ‘We Don’t Understand How AI Make Most Decisions, so Now Algorithms Are Explaining Themselves’, Quartz, accessed 27 June 2021, https://qz.com/865357/we-dont-understand-how-ai-make-most-decisions-so-now-algorithms-are-explaining-themselves/.

[14] Amarshal, ‘Failure Modes in Machine Learning – Security Documentation’, accessed 11 May 2021, https://docs.microsoft.com/en-us/security/engineering/failure-modes-in-machine-learning.

[15] ‘Max Tegmark, Author at Future of Life Institute’, Future of Life Institute, accessed 28 June 2021, https://futureoflife.org/author/max/.

[16] ‘The Unavoidable Problem of Self-Improvement in AI: An Interview with Ramana Kumar, Part 1’, Future of Life Institute, 19 March 2019, https://futureoflife.org/2019/03/19/the-unavoidable-problem-of-self-improvement-in-ai-an-interview-with-ramana-kumar-part-1/.

[17] Supra note 1

[18] Max Tegmark, Life 3.0: Being Human in the Age of Artificial Intelligence, 2017.

[19] Ibid

[20] Ibid

[21] Supra note 16

About the Author

Sanad Arora is currently a law student at Jindal Global Law School, pursuing B.B.A. L.L.B (Hons.).
