The future of AI and tech policy
The ‘A’ in my CATS narrative stands for algorithmic or automated decision-making, a large swath of which is powered by a different ‘A’: Artificial Intelligence. This concept sits at the center of a range of present and future internet and technology policy fights, for two reasons: first, incredibly complex technologies produce outputs that sometimes lead to outcomes we as a society wish had been different; and second, government policymakers who don’t really understand the technology seek to address these undesired outcomes through approaches that don’t appreciate or protect the good outcomes, or the process that allowed them to come about. Sometimes these interventions are made with the best of intentions, and solutions can be worked out collaboratively; at other times, though, they’re powered by fear, overreaction, and a misguided instinct to assert control over these “algorithms”.
Automated decision-making powers significant swaths of our society and our economy today. And as remarkable as they are already, we’ve barely scratched the surface of how we humans can build and harness such technology to serve our interests, because artificial intelligence (or “AI”) is still but an echo of a shadow of an outline of what we human beings can do with our minds. But whether we’ll ever reach that future, or will instead curtail its potential in its infancy (either through efforts to mitigate the negative effects or out of sheer fear), remains an open question.
Before I get into that future, I want to talk a little bit about this term “algorithm”, which, as I noted in my CATS post, is too often used by policymakers as a synonym for “magic”. The concept of an algorithm is relatively new in the public policy world, but it has been an established part of the computer science lexicon for as long as computer science has been a field. An algorithm is, basically, a set of instructions for performing a task. That makes it sound very simple. Of course, sometimes it’s a really long, complex set of instructions, one with other sets embedded within it: algorithms within algorithms. Standing alone, an algorithm is just a concept; but in the hands of a programmer, it can be turned into functional computer code.
Think of it like a recipe, but for a computer program. Baking is the act of turning a recipe and ingredients into a delicious chocolate chip cookie; programming is the act of turning an algorithm and (probably) some data into a software system. Somebody with proper coding chops can turn a great algorithm into great software. A bad algorithm can’t be made into great software, just as a bad recipe, faithfully followed, will always make a bad cookie. And a bad programmer can take the best algorithm in the world and turn it into terrible, broken code, just as a novice cook attempting one of Julia Child’s masterpieces will often fail.
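To make the recipe analogy concrete, here’s a toy illustration of my own (the algorithm and code aren’t drawn from any real system): a plain-English algorithm, “walk through the list once and remember the largest value you’ve seen so far”, written out as a short Python function. The algorithm is the idea; the code is one programmer’s rendering of it.

```python
# A plain-English algorithm: "walk through the list once, remembering the
# largest value seen so far." The function below is one possible rendering
# of that algorithm in code.

def largest(numbers):
    """Return the largest value in a non-empty list of numbers."""
    best = numbers[0]
    for n in numbers[1:]:
        if n > best:   # compare each value against the best seen so far
            best = n
    return best

print(largest([3, 41, 7, 12]))  # -> 41
```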
Now, back to artificial intelligence. AI systems make decisions based on the input they receive, processed through various weighting criteria; they develop those criteria from pre-supplied “training data”, and may continue to “learn” over time by taking as further input the outcomes of their decisions (and the effects of those outcomes in the external environment). In other words, AI systems maintain an evolving sense of “state” and use it to guide future decisions.
One simple example is from the realm of natural language processing, a method called Bayesian analysis. The central “algorithm” behind Bayesian analysis goes something like this: if you see two words together that are very frequently followed by a third, you can predict with some reasonable probability that when you see those two words in the future, they will again be followed by that third word. Make that your prediction engine, feed it a ton of natural language training data, and suddenly you have the core of something that can crudely approximate Amazon’s Echo or Google Home. (To the philosophers and lawyers in the audience: Yes, this is basically code that conflates correlation and causation.)
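For the technically curious, here’s a minimal sketch, in Python, of the kind of frequency-based next-word prediction described above. It’s a toy of my own devising with a made-up training corpus, not how Echo or Home actually work, but it shows the core idea: count what usually follows a pair of words, then predict the most common follower.

```python
from collections import Counter, defaultdict

# Minimal sketch of the prediction idea described above: count how often
# each third word follows a given pair of words in the training text,
# then predict the most frequent follower for that pair.

def train(text):
    counts = defaultdict(Counter)
    words = text.lower().split()
    for a, b, c in zip(words, words[1:], words[2:]):
        counts[(a, b)][c] += 1
    return counts

def predict(counts, a, b):
    followers = counts.get((a.lower(), b.lower()))
    return followers.most_common(1)[0][0] if followers else None

# Toy training data; a real system would ingest vastly more text.
corpus = "turn on the lights please turn on the music turn on the lights"
model = train(corpus)
print(predict(model, "turn", "on"))  # -> "the"
print(predict(model, "on", "the"))   # -> "lights" (seen more often than "music")
```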
This approach (along with its more advanced cousins, such as neural networks) offers an enormous amount of utility in any environment where predicting a human action with higher probability has commercial value… which means a lot of environments. But Skynet it isn’t. Facebook’s head of AI research, Yann LeCun, explains why in The Verge: even impressive-seeming feats, like self-driving cars and beating the world’s best Go players, come from systems trained for specific purposes on incredible amounts of purpose-specific training data.
In the human world we talk about ‘transferable skills’ that you can develop in one position and put to use in another. This is the exact opposite. Training a computer to win at Go isn’t transferable at all, not to a single new task. It’s powered by a technique, reinforcement learning, that can be used in other contexts, but it requires retraining for every new purpose it’s put to. LeCun’s most poignant example is self-driving cars: a simulated car would need to crash into a tree 40,000 times before learning that it shouldn’t do that.
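To illustrate what “trained for a specific purpose” means in practice, here’s a toy reinforcement learning sketch of my own (a five-cell track, not a car or a Go board, and no connection to any real system). The agent learns a table of values keyed to this one task’s states and actions; that table says nothing whatsoever about any other task, which is exactly the transferability problem.

```python
import random

# Toy reinforcement-learning sketch: an agent on a 1-D track of 5 cells
# learns, by trial and error, to walk right toward a reward at the end.
# The learned Q-table is specific to THIS track and THIS goal; it says
# nothing about Go, driving, or even a differently shaped track.

N_STATES, ACTIONS = 5, [-1, +1]            # move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for _ in range(500):
    state = 0
    while state != N_STATES - 1:           # episode ends at the goal cell
        if random.random() < epsilon:
            action = random.choice(ACTIONS)                     # explore
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit
        nxt = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = nxt

# After training, the learned policy at each cell is simply "go right" (+1).
print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)])
```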
Furthermore, we see anecdotes from time to time of the fallibility of even purpose-built AI systems: brilliant image recognition programs that can positively identify school buses in pictures more than 90% of the time, yet also label random yellow and black polka dots as buses, mistakes my two-year-old daughter is far too advanced to make.
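As a (deliberately silly) caricature of that failure mode, consider the toy “detector” below. It’s nothing like a real image recognition system, which learns its features from millions of labeled images; it’s just my own illustration of how a system keyed to superficial patterns, here the mix of school-bus yellow and black, will happily “see” a bus in a sheet of polka dots.

```python
# Deliberately crude caricature (not a real classifier): a "bus detector"
# that only looks at how much of the image is school-bus yellow and black.
# Anything with the right color mix scores as a bus, which is roughly the
# failure mode described above: matching superficial patterns, not objects.

YELLOW, BLACK, GREY = (255, 200, 0), (0, 0, 0), (128, 128, 128)

def looks_like_bus(pixels):
    yellow = sum(p == YELLOW for p in pixels) / len(pixels)
    black = sum(p == BLACK for p in pixels) / len(pixels)
    return yellow > 0.5 and black > 0.1

school_bus_photo = [YELLOW] * 70 + [BLACK] * 20 + [GREY] * 10
polka_dot_pattern = [YELLOW] * 80 + [BLACK] * 20

print(looks_like_bus(school_bus_photo))   # True
print(looks_like_bus(polka_dot_pattern))  # True -- same colors, no bus
```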
Regardless of its shortcomings when compared to human intelligence, AI as it stands today is already incredible in its power for automating laborious processes and for advancing users’ interests in new, unprecedented ways. And that’s why it’s being embedded in more and more parts of our economy and our society, from law enforcement checks and security screenings at airports, to the Facebook updates we see and don’t see in our feeds.
But it’s not all roses. Machines — like humans — make mistakes. And we’re not very forgiving when they do, because of the importance and value we place in them, and because they’re not “only human.”
These mistakes take all sorts of forms. The most potent are those that produce bias and discrimination in outright violation of established law. Think of the local city government that buys an off-the-shelf automated decision-making system, feeds in decades of data on bail decisions in criminal cases, and then uses it to make “objective” decisions on new cases, only to generate outcomes that echo the bias and discrimination of the human predecessors who created the input data set. More subtly, machine learning systems powering the display of social media updates and news articles, built and trained to highlight popular content, can end up spreading and implicitly validating demonstrably false information, because it turns out that such content is often popular.
It’s important to realize that these “mistakes” are unlikely to be errors in the underlying mathematics or in the (often fairly straightforward!) algorithms behind the code. Sometimes they can be programming errors, I suppose, and once in a blue moon a truly bad algorithm. But almost always, they’re most fairly attributed to the training data fed into the system.
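Here’s a toy sketch of that bail example, with entirely made-up numbers, to show where the problem lives. The “model” below just learns the most common historical decision per neighborhood and repeats it; the arithmetic is impeccable, and the bias it reproduces comes entirely from the training data.

```python
from collections import Counter, defaultdict

# Toy sketch of the bail example above, with entirely made-up data.
# The "model" simply learns, per neighborhood, the most common historical
# decision and repeats it. The math is trivially correct; the bias lives
# in the training data, and the model reproduces it faithfully.

historical_decisions = (
    [("north_side", "release")] * 80 + [("north_side", "detain")] * 20 +
    [("south_side", "release")] * 30 + [("south_side", "detain")] * 70
)

counts = defaultdict(Counter)
for neighborhood, decision in historical_decisions:
    counts[neighborhood][decision] += 1

def recommend(neighborhood):
    return counts[neighborhood].most_common(1)[0][0]

print(recommend("north_side"))  # -> "release"
print(recommend("south_side"))  # -> "detain" -- the old pattern, now "objective"
```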
Many government policymakers either don’t understand this distinction or don’t care. They’re looking for somebody to hold accountable for the societal harms produced by decisions that a human-powered process might have been able to make differently. And they’re looking for interventions to help reduce the frequency and severity of such problems in the future.
All of that is squarely within their remit. But some of them are going about it in a technically unsophisticated and problematic way. They hear and use the word “algorithm” as if it were synonymous with “magic”. Not understanding how complex AI systems work, and focused myopically on the many problems that can arise, they suspect evil intentions. They fear that which they do not understand or know how to control.
The only true cure for this is greater technical literacy. Historically, many policymakers approached the word “computer” the same way: as a powerful thing they didn’t quite understand and believed they needed to control to manage its harmful effects. By and large, they’re past that stage now, and hopefully they’ll get to the same place with “algorithms” in short order.
Frankly, the tech industry isn’t always helping its case here. At first, when public and policymaker challenges to automated decision-making arose, too many voices got defensive and reacted by pointing to the infallible algorithm: clearly, the math and the code are right, so the problem must be someone else’s! Of course, when policymakers brusquely push for broad transparency into proprietary, highly valuable, and closely guarded business secrets, resistance is understandable, especially where sharing details of the algorithms themselves won’t actually shine any light on the source of the visible problems.
I won’t get into a deep policy analysis here of the relative merits of a policy push for “algorithmic accountability” vs transparency — I’ll leave that to Mozilla’s more official channels, like this blog post on our recent filing with the UK government. There’s a lot to unpack and develop in this dimension, norms and laws and precedents that we can work to establish today that will guide regulatory interventions now and well into the future.
The next question is what that future will look like. Which means I can’t resist the “elephant in the room” question any longer: how smart are artificial intelligence systems going to become, and on what time horizon? Should we be worried today about future super-smart robots, and start getting a handle on ethics in that context, as Elon Musk would suggest? Or is that just scaremongering, as Musk’s high-profile verbal sparring partner on the subject, Mark Zuckerberg, would have it? Will we ever get to the point where AI systems can realistically simulate general human intelligence?
I certainly don’t know. But I do have an old personal theory on the subject. (For context, before I decided to go to law school — many years ago now! — I was thinking about this question a lot, and even started talking to people at a research lab about a possible postdoc to work on it.) My hypothesis is that the unique function performed by our brains is best described as our power of performing and comparing abstractions — it’s a sort of data structure that has no effective programmable implementation in computers today (as far as I know, at least), and it’s one that allows for incredibly powerful, lossy information pattern matching in ways we’re only crudely simulating today with our Turing machines. So, basically I sympathize with the recent New York Times opinion piece that general purpose AI seems “stuck” today. But I think more clever approaches are out there, yet to be discovered.
Meanwhile, industry interests, academics, public interest organizations, and others are building out the policy landscape around the present and future technology of automated decision-making. Major technology companies came together to form the Partnership on AI “to benefit people and society”, as its tagline goes. The current iteration of the NetGain partnership (in which Mozilla is a partner) is focused on the “quantified society” and the problems arising from it. Google’s DeepMind team just stood up a new ethics research group. These aren’t isolated incidents, but examples of a broader trend of interest in this space.
One of the most important things we can do now, together, is focus on literacy and understanding of these technologies. We need to work to mitigate the problems we are indeed seeing today, without being so afraid of its future potential that we suppress research and development in the tech itself. That’s not an easy road to travel, but we can get there.