Look, we all know that algorithms are biased. What matters is how they are biased. What data helped train the algorithm? Is that data representative? What weights and preferences did the data scientist ascribe to different features when designing the algorithm? As an end user, it’s often impossible to know.
But when we blame the algorithms, more often than not we’re abdicating our basic responsibility to articulate and pursue a specific policy goal. Machine learning and software are not neutral arbiters of a policy. They reflect the goals and desires of the people collecting the data and designing a model. Which means that those implementing an algorithm can’t just shrug and say a decision was data-driven; they have to match the results up against a set policy goal.
So how should a society do that?
A study released Aug. 24 by the Ada Lovelace Institute, AI Now Institute, and the Open Government Partnership tackles this exact question. Its authors analyzed 40 different policies built around algorithms, ranging from Europe’s General Data Protection Regulation (GDPR) to an Idaho bill that forced transparency on state courts using software to assess a defendant’s pre-trial risk of not showing up in court.
The study lays out the eight most common mechanisms for algorithmic accountability used in various laws and governmental frameworks. It also makes six recommendations for governments to follow when trying to build accountability programs in the future. And it calls for more research into how useful accountability systems actually are, as well as for more empirical examples of testing various algorithms.
Because this is an academic effort, the researchers start with a definition of what they call an “algorithmic system.” They define it as “a system that uses automated reasoning to aid or replace a decision-making process that would otherwise be performed by humans.”
The bulk of the paper examines how governments, mostly in Europe and North America, have tried to hold algorithmic systems accountable. Most of these mechanisms are familiar to us. The first two — writing down principles and guidelines about the use of AI, and prohibitions and moratoriums around AI — are popular, but aren’t super effective when it comes to creating long-term policies.
The first is too vague and lacks teeth, while the second doesn’t always lead to lasting regulation and simply puts a pause on the process of deploying AI. The facial recognition bans common in liberal cities in the U.S. are a good example of the second method of AI accountability.
The third way governments can hold algorithms accountable is through public transparency, which includes providing information about the algorithmic system to the public. It’s also less than effective, mostly because it, too, is vague and lacks regulatory teeth.
The study focuses on so-called algorithmic registries that some European cities have created, which share data on every city policy decision affected by algorithms. A challenge with this type of transparency is how accessible and intelligible the information is to citizens. For example, some laws require details about the algorithm’s decision points, while others simply require the code to be published. Additionally, some laws ask only that the relevant information be made accessible, as opposed to broadcast publicly. That’s the difference between keeping information in a basement filing cabinet and issuing a compelling public service announcement.
The fourth mechanism for accountability is impact assessments. These ask a public agency to study how an algorithm affects the population and to evaluate whether that impact meets policy goals. But rather than assessing the performance of the algorithm after it has been deployed, many of the assessments are based on forecasted impacts. Moreover, the affected communities may not be consulted when it comes to assessing the impact of an algorithm, and the agency assessing the algorithm might have a blind spot when it comes to a particular harm. Finally, in many cases, the results of these assessments are used for self-regulation and may not be made public at all.
Related to impact assessments is the fifth mechanism, which comprises formal audits and regulatory inspection. With formal audits, a third party looks at the algorithmic system either from a technical perspective (making sure all elements work and are producing a consistent outcome) or a regulatory perspective, which ensures the algorithm is technically accurate and also meets some stated policy standard. To do these well, the third party needs both access to all elements of the algorithmic system as well as a way to make the results public and get the affected agency to take action.
Which leads us to the sixth mechanism for keeping AI accountable: the independent advisory board! Governments use these boards to audit and measure the outcomes or quality of algorithmic systems. But they should ensure the board has a diversity of opinion and take into account how the board communicates and enforces any issues it finds.
Given that when we talk about algorithmic systems we’re essentially putting software in charge of a human’s life, some governments include a mechanism for appeal in case the algorithm gets something wrong. The seventh mechanism, as the report dubs it, gives citizens the ability to understand how an algorithm made the choice it did, as well as the opportunity for “human intervention” in the decision-making process.
The eighth mechanism for algorithmic accountability sets out procurement guidelines, less as a way to fight corruption and more to ensure that when researchers or third parties need access to the algorithm for accountability purposes they can access the code and understand how the data was gathered and weighted.
The paper notes that Amsterdam has added some standard clauses when it buys any algorithmic system. These clauses include “conditions for transparency, including the right of government auditors or agencies to examine the underlying data and models; conditions for the vendor to assess algorithmic systems for bias, and risk management strategies to be complied with by the vendor.”
With these eight mechanisms identified, the paper’s authors argue that they are not enough on their own. They recommend greater community engagement and outreach before adopting algorithmic systems so governments can see how they will affect a diverse population. Governments should also attach binding legal frameworks to any AI audits, they add, so the results of those audits are made public and flawed AIs are changed.
I also believe that governments need to have clear and well-defined policy goals associated with algorithmic systems instead of abdicating their responsibility to make a decision by assuming AI will be neutral. Most people assume data is neutral, when it very much isn’t. Rather, many of these algorithms will clearly expose what we value as opposed to what we say we value, and that disconnect is what drives our outrage.
Figuring out how to fairly deploy and regulate algorithmic systems won’t be easy. It will, however, force us to directly spell out what we value in our code. Computers take clear, consistent input and output exactly what someone programs them to deliver. Humans, by contrast, can take that same clear, consistent input and, if it doesn’t fit their worldview, rationalize their way to the decision they wanted to make anyway.
I am so happy you are bringing this topic forward. My comment isn’t nearly as intellectual as your post or the studies you reference, but it is a basic issue nearly everyone misses. My view is that the government must regulate the IoT platform – as it does with the analogous US wireless and telecommunications networks. Related to AI risks, it really doesn’t matter what requirements are imposed if the data traverses across an unregulated platform. We all know the AI superiority China possesses – and virtually every OEM is trying to sell AI capabilities built into their IoT devices (especially cameras). These capabilities are likely informed, updated, and managed through internet/cloud connections back to foreign platforms. Your opinions on AI policy are spot on. An unaddressed and fundamental problem is the lack of policy looking at the stack that governs the flow of the data.
David Abigt says
As a developer and end user of AI systems, I see two basic issues:
First, we tend to get it out the door as quickly as possible with as little work as possible (whether through personality or delivery pressures), which means training sets and params tend to be kept to a minimum.
Second, most systems seem to get built as static models (no feedback / continued training). The first would not be that big of a deal if end users could just let the system know it got something wrong and have that adjust the training. Granted, that could skew the system to think more like the person it is designed to replace, but in such cases a review process should help sort that out.
For an example of what I mean, look at the DeepStack AI many use for IDing objects with security cameras. It does an OK job with the models it comes with, and you can train up add-in models (if you can sort through the lengthy and complex process), but that is still guessing at which pics are good to train with. If the system were built to look at the marked-up pics and result data with the correct ID for feedback, it could get better at a very fast rate.
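The static-model vs. feedback-loop distinction the commenter describes can be sketched in a few lines. This is a toy illustration, not DeepStack’s actual API: the `FeedbackClassifier` class, its methods, and the feature labels are all hypothetical, standing in for any system that folds user corrections back into its training data instead of shipping as a frozen model.

```python
from collections import Counter, defaultdict


class FeedbackClassifier:
    """Toy online classifier: predicts the most common label it has seen
    for a feature signature, and updates immediately when an end user
    corrects it. A static model would skip the correct() step entirely."""

    def __init__(self):
        # feature signature -> counts of labels users have confirmed
        self.counts = defaultdict(Counter)

    def predict(self, features):
        sig = tuple(sorted(features))
        if not self.counts[sig]:
            return "unknown"
        # Majority vote over all corrections seen so far.
        return self.counts[sig].most_common(1)[0][0]

    def correct(self, features, true_label):
        # End-user feedback: fold the corrected label back into training.
        self.counts[tuple(sorted(features))][true_label] += 1


clf = FeedbackClassifier()
clf.correct({"four_legs", "fur"}, "dog")  # user fixes a missed detection
clf.correct({"four_legs", "fur"}, "dog")
clf.correct({"four_legs", "fur"}, "cat")  # one conflicting correction
print(clf.predict({"fur", "four_legs"}))  # majority label wins: dog
```

The point of the sketch is the `correct()` path: each user mark-up becomes new training signal right away, which is exactly the loop a static, ship-and-forget model lacks. The review process the commenter mentions would sit in front of `correct()`, vetting feedback before it skews the model.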