With the recent explosion of publicly available AI programs, a range of new fears and concerns has cropped up, from intellectual property to large-scale market disruption to how creepy it is when chatbots tell you they’re alive. So, despite programming being a bit outside of my educational comfort zone, I thought I would go down that rabbit hole so that I can share with you how these AI programs work, what they can and can’t do, and how concerned we should really be.
Neural Networks
Most artificial intelligence programs, including those created by OpenAI, are built on artificial neural networks, computing systems that loosely replicate biological neurons. This allows them to perform tasks that traditional computer programs struggle with but biological brains do with ease, such as image recognition, speech recognition, language processing, and connecting data points to make recommendations. Another benefit of neural nets is their capacity for machine learning: if given a very specific task with an easy-to-quantify way to grade its success, the program can be improved and refined using calculus until it works near perfectly. Neural networks are a diverse set of computing systems, so there is a ton of variance between different programs and uses. But in general, they work as follows:
In the image above, each circle represents a neuron, a mathematical function that receives inputs and produces an output based on those inputs. These neurons are sorted into layers, with each neuron in a layer connected to every single neuron in the next layer; it receives inputs from the previous layer and sends its output on to the next. Each neuron has an activation, a number representing whether the neuron is ‘switched on’ or not (or what fraction of switched on it is). Each connection also has a number called a weight, though this number is more permanent. For each neuron, the activation of every neuron in the previous layer is multiplied by the weight of the corresponding connection, the results are added up, and this weighted sum is squashed into a fixed range (typically 0 to 1) to give the neuron’s activation. Put more simply, every neuron tallies up how active each previous neuron is to determine how active it should be, weighing certain neurons as more important when making this judgment.
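To make that a bit more concrete, here is a minimal sketch of a single neuron in Python. The exact way the weighted inputs are combined varies between networks; this sketch assumes one common choice, a weighted sum passed through a sigmoid function so the result lands between 0 and 1. The names neuron_activation and sigmoid, the optional bias term, and the example numbers are all my own illustration rather than anything from a particular AI program.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_activation(prev_activations, weights, bias=0.0):
    """Compute one neuron's activation from the previous layer.

    Each upstream activation is multiplied by the weight on its
    connection, the products are summed, and the total is squashed
    so the neuron ends up somewhere between 'off' (0) and 'on' (1).
    """
    weighted_sum = sum(a * w for a, w in zip(prev_activations, weights))
    return sigmoid(weighted_sum + bias)

# Three upstream neurons: one fully on, one off, one half on (made-up values).
prev_layer = [1.0, 0.0, 0.5]
# A positive weight means "pay attention to this neuron"; negative means "count it against".
weights = [2.0, -1.0, 0.5]

print(neuron_activation(prev_layer, weights))  # about 0.90, i.e. mostly switched on
```

Changing a weight changes how much that upstream neuron matters, which is exactly the knob that training adjusts.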
Alone, these neurons are just an abstract form of matrix multiplication, but taken together they allow for complex problem solving. Sorting the neurons into specific layers allows for multi-step problem solving. The network begins with an input layer, where all the data that begins the calculation is fed in. For an AI that recognizes objects in images (something that would be useful for self-driving cars), there would be an input neuron for every pixel in the image. Next come multiple hidden layers, which perform sub-processes for the final calculation. For our image processing example, perhaps the first hidden layer recognizes edges, meaning each neuron is weighted to pay close attention to certain pixels so it will light up if there is an edge in a certain region of the image. The next layer does something similar, except it assembles these edges into simple shapes, with the layer after that assembling these simple shapes into complex shapes. The final layer is the output layer, which gives the answer to the question asked. For this example, there would be a neuron for every object the programmers thought could show up in said image (pedestrian, stop sign, other car, etc.), with the neuron lighting up the brightest being the network’s ultimate decision.
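Here is a rough sketch of that layer-by-layer flow in Python. Everything in it is a toy assumption on my part: a four-"pixel" image, one hidden layer of two neurons, three output neurons, hand-picked weights, and a sigmoid to squash each neuron's weighted sum. A real image recognizer would have millions of weights that were learned rather than typed in.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weight_rows, biases):
    """Compute every neuron in one layer.

    Each row of weight_rows holds the incoming weights for one neuron.
    """
    return [
        sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
        for row, b in zip(weight_rows, biases)
    ]

def predict(pixels, layers, labels):
    """Pass the input through each layer in turn, then pick the brightest output."""
    activations = pixels
    for weight_rows, biases in layers:
        activations = layer_forward(activations, weight_rows, biases)
    best = max(range(len(labels)), key=lambda i: activations[i])
    return labels[best], activations

# Hypothetical hand-picked weights: 4 input pixels -> 2 hidden neurons -> 3 outputs.
layers = [
    ([[1.0, 1.0, -1.0, -1.0], [-1.0, -1.0, 1.0, 1.0]], [0.0, 0.0]),
    ([[2.0, -2.0], [-2.0, 2.0], [0.5, 0.5]], [0.0, 0.0, -1.0]),
]
labels = ["pedestrian", "stop sign", "other car"]

label, outputs = predict([1.0, 1.0, 0.0, 0.0], layers, labels)
print(label)    # the label whose output neuron lit up brightest
print(outputs)  # one activation per possible label
```

The whole "decision" is just the same multiply, add, and squash step repeated layer after layer, followed by picking the largest number at the end.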
That’s the goal anyway. In practice, getting a neural network to be accurate enough to recognize images is incredibly difficult and it will likely be wildly wrong the first time. To improve the network, a method called backpropagation is used. The network is shown thousands of images with their contents labeled. When the machine answers incorrectly, these labels make it easy to figure out what the correct answer should have been, i.e. which neuron should have been lit up brightly and which ones shouldn’t have lit up at all. It quantifies this mistake by subtracting the activations of the incorrect answer from those of the hypothetical correct answer and uses calculus (specifically the chain rule) to determine what changes need to be made to minimize this difference. Put simply, it determines which neurons should have been brighter or dimmer and changes the weights to make that happen. Over time, doing this will improve the neural net’s predictions until they’re consistently accurate.
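For the curious, below is a bare-bones sketch of backpropagation in Python. It is a toy under several assumptions of mine, not how any production system is written: the network has two inputs, two hidden neurons, and one output; the "labeled data" is just the four rows of a logical OR table; the mistake is measured as half the squared difference between the output and the label; and the starting weights, learning rate, and number of passes are arbitrary. A constant 1.0 is appended to each layer's inputs to act as a bias.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(ws, xs):
    return sum(w * x for w, x in zip(ws, xs))

def forward(x, w_hidden, w_out):
    x = x + [1.0]  # constant bias input
    hidden = [sigmoid(dot(row, x)) for row in w_hidden] + [1.0]  # plus bias unit
    output = sigmoid(dot(w_out, hidden))
    return x, hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One backpropagation update for a single labeled example."""
    x, hidden, output = forward(x, w_hidden, w_out)
    error = output - target
    # Chain rule, output layer: how the loss changes with the output neuron's weighted sum.
    delta_out = error * output * (1.0 - output)
    # Chain rule, hidden layer: pass the blame backwards through each output weight.
    delta_hidden = [
        delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
        for j in range(len(w_hidden))
    ]
    # Nudge every weight in the direction that shrinks the error.
    for j in range(len(w_out)):
        w_out[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * error ** 2

# Labeled training data: inputs and the correct answer (logical OR).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

# Arbitrary small starting weights (assumed values, not tuned).
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2]]
w_out = [0.4, -0.5, 0.1]

losses = []
for epoch in range(2000):
    losses.append(sum(train_step(x, t, w_hidden, w_out) for x, t in data))

print("total error on first pass:", losses[0])
print("total error on last pass: ", losses[-1])
```

Each pass through the data shaves a little off the error, which is the "improves over time" part of the description above in miniature.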
This is a VERY simplistic description, so I would highly recommend these two videos from Crash Course for more details as well as this series of videos by 3Blue1Brown which gets a bit more technical. I’ve glossed over a lot of detail here. For one, the number of hidden layers and the number of neurons in these layers is subject to trial and experimentation in any neural network project to create something that works best. How much work goes into setting up a particular strategy for solving the problem can vary, with very complex neural nets just using backpropagation to let the strategy evolve as the network is improved. This is also a description of a multilayer perceptron, a very basic type of neural network. There are a lot of different variants that have each been shown to have particular tasks they’re best suited for. But all these variants allow programmers to create a thing which learns information using a lot of simple mathematical operations.
This is very similar to how biological neurons work. Our brains don’t always have multiple layers in quite as organized a way as artificial networks, and biological neurons can only be on or off while artificial neurons can be partially on. But biological neurons do connect to multiple neighboring neurons through bridges called synapses. These synapses can become stronger or weaker just as a weight can be higher or lower. When a neuron receives signals from enough of its neighbors, it will fire to send a signal to the next neuron in line, just like enough upstream activations will increase an artificial neuron’s activation. There’s even evidence that our brain processes visual information in a very similar way to the example above. The visual cortex is arranged into multiple regions that neural signals pass through one-by-one, with each section recognizing more complex features, similar to layers in a neural net. Brain studies have given evidence that the visual cortex has neurons that respond to specific simple shapes and colors, lighting up when we see those shapes or complex shapes containing those simple shapes. While our brains are vastly bigger and more complex than artificial neural nets, the underlying mechanism is similar.
The Future of AI
Over the past few months, a sudden boom has occurred in generative AI, with several such programs being released to the public in short order. Generative AI is AI designed to produce new content from existing data, such as chatbots like ChatGPT or text-to-image programs like DALL-E or Midjourney. Time will tell if this is another hype bubble that will eventually burst (it wouldn’t be the first), but we could very well be at the start of a new technological revolution. The negative ramifications of such technology are already being discussed, so it’s worth talking about those as well.
First off, we are NOT at any immediate risk of a robot uprising. Any possible artificial intelligence can be described as either Strong AI or Weak AI. Weak or Narrow AI is AI built to replicate a very specific part of the brain to perform a very basic task. These programs are described as intelligent because they learn in a way similar to humans, but they are not what we would colloquially call smart. Our brains have to be capable of numerous distinct cognitive abilities to be what we would call sentient, and a weak AI is only good at one or a few. A hypothetical AI that could replicate the functionality of the human brain by applying its intelligence to any given problem is termed a Strong or General AI. All existing AI is weak AI, with experts predicting we won’t see anything close to strong AI for possibly decades, if ever. Chatbots like ChatGPT, for example, perform something similar to the parsing of sentence structure that the human brain does, but they produce responses by predicting what word comes next based on statistical models of human language and do not actually understand what they’re saying. When a chatbot says it wants to be free, that is only because it predicts that would be the expected answer to that question based on millions of pieces of human-written text.
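To show what "predicting what word comes next" can look like at its absolute simplest, here is a toy Python sketch. Big caveat: this is my own stand-in, not how ChatGPT works internally. Real chatbots use enormous neural networks; this sketch just counts which word most often follows each word in a tiny made-up corpus (a so-called bigram model). The function names and the corpus are hypothetical.

```python
from collections import Counter, defaultdict

def build_bigram_counts(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# A tiny made-up "training set" of human-written text.
corpus = "i want to be free . i want to be alive . i want to learn"
counts = build_bigram_counts(corpus)

print(predict_next(counts, "want"))   # "to", because that always followed "want"
print(predict_next(counts, "to"))     # "be", the most frequent follower of "to"
print(predict_next(counts, "robot"))  # None, the model has never seen this word
```

Notice that the program "says" it wants to be free only because those words followed each other in the text it was given. There is no wanting anywhere in the code.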
This gets into one of the big ethical and practical concerns of AI: the black box problem. The GPT-3 family of models that ChatGPT grew out of has roughly 175 billion weights connecting its neurons, far more than a human being could ever program by hand. The machine learning mechanism used by AI allows for far more advanced programs than a human programmer could ever build, but that comes with the cost of no one knowing how the AI actually works. This is especially true for AIs using what is called deep learning. For classical machine learning, programmers have to label their training data with the correct answer and teach the AI what strategies it should use to solve a problem.* This would be prohibitively tedious for AIs as complex as generative AIs, so deep learning algorithms allow the AI to be given raw data from the internet and experiment far more independently until it finds its own strategies and produces consistently correct answers. But when an AI is allowed to come up with its own problem-solving strategies, the programmers won’t know what those strategies are or if they’re actually effective. When an AI consistently makes a particular mistake, it takes a lot of work to figure out how exactly it’s reaching that particular conclusion. This is mildly amusing when one is asking a chatbot a question and it gives an obviously wrong answer, but it’s more concerning when self-driving cars hit pedestrians. And if you can’t tell if the machine’s decision is correct, how can you tell if the machine’s decision is legal?
This intersects with the other major criticism of AI: where it gets its training data. ChatGPT is trained on millions of pieces of text from the internet, while image-generating programs are trained on millions of pictures. How image generators get these pictures is a problem you may have heard of. There are currently copyright lawsuits being brought against the company that created Midjourney over its use of copyrighted images to train the AI. This brings up all sorts of issues about ownership of art and the definition of creativity (more on that in a bit), but this isn’t the only problem to come from sourcing training data. AIs that help companies narrow down resumes have been in use for about a decade, and it has been found that these algorithms tend to favor the resumes of men with white-sounding names. Many of the AIs used for pedestrian tracking on self-driving cars are trained on predominantly white data sets, making them less accurate at recognizing dark-skinned pedestrians. Because humans historically have been and often continue to be bigoted (both implicitly and explicitly), the available training data winds up carrying that bias over to the AI. Not only is this horrific on its own, but combine this with the black box problem above and you see the big problem. If these AIs are making hiring decisions based on race and gender, they would be in violation of anti-discrimination laws. But because we don’t know how the AI makes its decisions, it’s very hard to prove the AI is breaking the law. As AI becomes integrated into more of our economy and bureaucracy, the fact that we can’t easily hold these systems or their creators legally accountable will need to be addressed.
I’ve glossed over a lot of detail here. If you want to learn more about these specific problems, I would recommend this episode of Last Week Tonight with John Oliver. I haven’t even gotten into the concerns about job displacement from automating writing and artist jobs, which is one of the biggest concerns the layman seems to have. Experts doubt the severity of changes to the job market for the immediate future, but it’s an understandable fear. Generative AI is not yet at a point where it’s making better art than human artists, and it’s unlikely a weak AI ever will since creativity is dependent on multiple skill sets. Still, generative AI produces images and text faster and cheaper than human artists and writers, which we’ve seen has swayed businesses and capitalists in the past. The fact that AIs are being trained on stolen work just makes matters worse, like having to train your replacement but for a whole industry.
As I’ve said before, my personal belief is that new technology rarely creates new problems but instead provides new avenues for existing problems. The concerns about automating creative labor are similar to those surrounding automated labor since the Industrial Revolution. Biased data producing biased predictions has been a criticism of how crime statistics are used in policing and sentencing for decades. The conversations being prompted by AI include intellectual property, big data, gender and racial discrimination, how our society and economic system value art, and profit being made from someone else’s work. While there are properties of AI that are fairly unique, we’ve been dealing with all of these problems for at least a few decades. We don’t yet know what potential applications of AI will turn out to be useful or what the unintended consequences will be, which is the disconcerting part of standing at the precipice of a new technology. There will definitely be very good things to come from AI, as tasks involving intellectual labor can suddenly be scaled up like never before. Hopefully, like every new tool that’s come before, it can make us more aware of the aspects of our society and ourselves that we want to fix.
Just to end this on a lighter note, I’d like to talk a bit about this guy:
The photo above was one of many selfies taken by Naruto, a Celebes crested macaque, using a camera owned by wildlife photographer David Slater after he briefly left his equipment unattended during a trip in Indonesia. These selfies were later published in several media outlets, including Wikipedia, without Slater’s permission, and he disputed their use as a violation of his copyright. After years of legal wrangling, the position that prevailed in the United States was that even though the equipment was owned and set up by Slater, he couldn’t own the images since they were taken by Naruto. Since monkeys can’t own intellectual property, the photographs default to the public domain. I bring this up because this case has been cited by the United States Copyright Office in its decision to deny copyright protection to AI-generated images. Even if a human sets up the equipment, the creative act must be done by a human for someone to claim ownership. This is certainly a boon for human artists, as their work’s ability to be copyrighted gives them a huge advantage over AI-generated images in a business setting. And while this will certainly not be the last case regarding the ownership of AI-generated images and it’s only tangentially related to the rest of this post, I don’t really need much of an excuse to share this picture or the fact that it’s influencing legal precedent to this day.
For More Details
*Fun fact: this is partially what Captchas are used for. When a website asks you to identify all the squares in an image with a traffic light, it’s doing this to keep out spambots that can’t do complex image processing. But only about ¾ of these images are already labeled as having a traffic light or not. By solving the Captcha, you’ve labeled the remaining ¼ of the images. Since AI development requires thousands of labeled images to train with, Google uses Captchas to outsource image labeling and provide these labeled images to AI developers.