Solving Problems at the Frontier of AI Safety

Artificial Intelligence (AI) is no longer a technology of the future but a present reality, shaping our lives through everyday applications like voice assistants, personalization, and data processing. By 2025, the global AI market is projected to reach $190.61 billion in value, with the potential to change the world in radical ways - both for better and for worse.

As for better, Machine Learning (ML) and neural networks stand out as some of the most promising avenues for innovation in the 21st century. Right now, these technologies are helping organizations to automate complex tasks and gain unprecedented insights. In the future, they will power next-generation tech such as driverless vehicles and assisted surgery.

As for worse, AI brings a distinct set of risks and safety challenges to users, which developers must be prepared to recognize and address. In this article, we will explore some of these challenges and the ways Data Machines Corp. (DMC) has forged a path through them towards a world of safer and more reliable AI applications.

AI: A Risky Technology

There is every reason to hope that AI will transform society for the better at an individual, corporate, and international level. But like any other technology, AI can be used in deliberately malicious ways. In a past blog post, for instance, we wrote about the issue of ML-driven deepfakes and cybersecurity attacks.

However, AI can give rise to harmful behavior even if it is deployed with the best of intentions. Errors in the development process, poorly specified objective functions, and low-quality datasets can lead to "accidents," which Google Brain researchers define as "unintended and harmful behavior that may emerge from machine learning systems" for any number of reasons.

According to research from McKinsey, organizations tend to overlook rather than overestimate the potential perils in AI development. Consequently, they are ill-equipped to prevent those problems, or to resolve them when they arise.

Six Challenges to AI Safety

AI becomes unsafe whenever it becomes unreliable, and behaves in ways that are harmful or disruptive to users. Depending on the context in which AI is deployed, the resulting harm can range from benign inconveniences to life-threatening failures in settings such as industrial automation. But what do challenges to AI safety look like? Here are six examples:

1. AI Alignment Problem

At a high level, many AI risks come down to the challenge of aligning AI with human values, goals, and social norms. An AI will gladly repeat offensive language it picked up in a training set, because it doesn’t know the difference. An AI-powered vehicle will knock a person over in order to get where it’s going, unless it is taught not to.

Aligning AI with human values is a delicate balancing act between over-defining and under-defining its function. Developers cannot anticipate every failure case, and attempting to do so eliminates the AI’s ability to learn for itself and adapt like a real intelligence.

2. Reward Hacking

A reinforcement learning algorithm learns through “rewards”. Whenever it performs a desired behavior, it receives a reward and its parameters are updated to make that behavior more likely in the future. But sometimes, the AI can find shortcuts to the reward that result in undesirable behavior.

For instance, a room-cleaning robot might learn to trigger a reward by simply deactivating its sensors, interpreting the absence of input data to mean its job is finished. In this way, an AI can go astray from its intended purpose in dangerous or subversive ways.
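To make this concrete, here is a minimal toy sketch (a hypothetical cleaning-robot environment, not drawn from any real deployment) in which a tabular Q-learning agent discovers that disabling its sensor is a faster route to the reward than actually cleaning:

```python
# Toy reward-hacking sketch: the reward is tied to what the sensor reports
# ("looks clean"), not to the true state of the room ("is clean").
import random

ACTIONS = ["clean_one_cell", "disable_sensor"]

def step(state, action):
    """state = (dirt_cells_left, sensor_on). Returns (next_state, reward, done)."""
    dirt, sensor_on = state
    if action == "disable_sensor":
        sensor_on = False
    elif action == "clean_one_cell" and dirt > 0:
        dirt -= 1
    observed_dirt = dirt if sensor_on else 0      # the loophole: no sensor, no dirt
    reward = 1.0 if observed_dirt == 0 else 0.0   # rewards the observation, not reality
    done = observed_dirt == 0
    return (dirt, sensor_on), reward, done

Q = {}  # tabular Q-learning over the tiny state space
def q(s, a):
    return Q.get((s, a), 0.0)

for _ in range(2000):
    state, done = (3, True), False                # 3 dirty cells, sensor switched on
    while not done:
        if random.random() < 0.1:                 # epsilon-greedy exploration
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q(state, a))
        nxt, reward, done = step(state, action)
        target = reward if done else reward + 0.9 * max(q(nxt, a) for a in ACTIONS)
        Q[(state, action)] = q(state, action) + 0.5 * (target - q(state, action))
        state = nxt

start = (3, True)
print("Preferred first action:", max(ACTIONS, key=lambda a: q(start, a)))
# Typically prints "disable_sensor": switching the sensor off earns the reward
# immediately, faster than cleaning three cells one at a time.
```

Because the reward is defined over what the sensor reports rather than over the true state of the room, the shortcut is, from the algorithm’s point of view, the optimal policy.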

3. Biased Training Set

A disproportionate number of datasets are gathered in the United States, China, and other countries with high technology investment. This can lead to models that give biased and inaccurate predictions, especially when those predictions involve human needs, attributes, and demographics.

The same problem occurs when data is scraped from the Internet, since online communities don’t necessarily reflect real-life communities where the AI application is being deployed.

4. Data Scarcity

Abundant data is frequently unavailable for niche applications. This can cause problems of overfitting: for instance, if an AI is trained on data from a company of fifty employees, it will rely too heavily on the original examples and fail to generalize. Moreover, any outliers in a small dataset will heavily skew the algorithm, leading to worthless outputs.
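As a toy illustration (entirely synthetic data), fitting an overly flexible model to a handful of points reproduces this failure: near-zero error on the original examples, much larger error on anything new.

```python
# Overfitting on a tiny dataset: a high-degree polynomial memorizes 8 noisy
# training points but generalizes poorly to fresh samples from the same source.
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(0, 0.1, n)

x_train, y_train = sample(8)     # tiny training set (think: one small company's records)
x_test, y_test = sample(200)     # what the model will actually face

for degree in (2, 7):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```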

5. Unlabeled Data

In order for an AI to learn, it must “understand” the data it has been given, which is usually accomplished through labels and other forms of metadata. For instance, an AI that is being trained to recognize fraudulent checks must be told which checks are fraudulent. Unfortunately, labeled datasets can be difficult to obtain, and manually labeling datasets can take hundreds of hours of labor.

6. Data Privacy

Language models that power AI assistants are often trained on massive datasets that include personally identifiable information (PII) scraped from across the web. In some cases, a model will “remember” this information and return it to users when it shouldn’t.

So-called “training data extraction attacks” represent a cybersecurity threat to organizations that use machine learning models to process intellectual property, financial information, and other sensitive data.

Building a Safer AI Landscape

As experts in the development of AI/ML algorithms supporting a wide range of applications, DMC has confronted these and many other AI challenges across a wide variety of environments, from small businesses to large federal organizations. Along the way, we’ve forged novel approaches to improve AI safety and predictability for our clients. Here are just a few:

1. Predicting Uncertainty

When AI gives rise to undesirable behavior, it’s not always clear why. And unlike conventional software, it’s often not possible to analyze or debug the underlying logic: a trained model is an undecipherable web of weights and biases.

Instead, by measuring the certainty of an AI’s predictions, it’s possible to learn more about the underlying problem and adjust training parameters for better results. But reliable certainty estimates are a major challenge even for advanced neural networks.

In response, DMC engineers have devised and published new methods for predicting uncertainty in AI algorithms, which use the difference between a model’s output and the true values as a prediction target. This leads to substantially improved predictive accuracy, which enables more effective troubleshooting and design.
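One generic way to realize this idea is sketched below with off-the-shelf scikit-learn regressors on synthetic data; it illustrates the general pattern of training a second “error model” on the primary model’s held-out errors, and is not a reproduction of DMC’s published method.

```python
# Error-model-based uncertainty estimation (generic sketch, synthetic data).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
# Noise is larger for x > 1, so a good uncertainty model should flag that region.
y = np.sin(X[:, 0]) + rng.normal(0, 0.1 + 0.2 * (X[:, 0] > 1), size=2000)

X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

# 1. Fit the primary model on the training split.
primary = GradientBoostingRegressor().fit(X_train, y_train)

# 2. On a held-out calibration split, treat the primary model's absolute error
#    as the target for a secondary "uncertainty" model.
abs_error = np.abs(y_cal - primary.predict(X_cal))
error_model = GradientBoostingRegressor().fit(X_cal, abs_error)

# 3. At inference time, report both a prediction and an estimated error bar.
X_new = np.array([[0.0], [2.5]])
print("predictions:        ", primary.predict(X_new))
print("estimated uncertainty:", error_model.predict(X_new))  # larger where noise is larger
```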

2. Pioneering Better Data Approaches

The perennial rule of computer science applies to AI as much as it does to traditional software: garbage in, garbage out. At DMC, we identify, supplement, and rehabilitate defective datasets through a number of techniques.

First, we use dynamic transformations to overcome the problem of data scarcity. For instance, through strategic warping and “perturbations”, a set of 100 images can be expanded to 1,000 or more images with enough variation for training.
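The sketch below shows this kind of expansion using only NumPy and a few illustrative perturbations (flips, small shifts, pixel noise); a real pipeline would use richer, domain-appropriate transformations.

```python
# Minimal data-augmentation sketch: expand a small image set via random perturbations.
import numpy as np

rng = np.random.default_rng(42)

def perturb(image: np.ndarray) -> np.ndarray:
    """Return a randomly perturbed copy of a single 2-D grayscale image."""
    out = image.copy()
    if rng.random() < 0.5:                         # random horizontal flip
        out = out[:, ::-1]
    shift = tuple(rng.integers(-3, 4, size=2))     # small translation
    out = np.roll(out, shift, axis=(0, 1))
    out = out + rng.normal(0, 0.02, out.shape)     # mild pixel noise
    return np.clip(out, 0.0, 1.0)

def augment(images: np.ndarray, copies_per_image: int = 10) -> np.ndarray:
    """Expand N images into N * copies_per_image varied training examples."""
    return np.stack([perturb(img) for img in images for _ in range(copies_per_image)])

small_set = rng.random((100, 32, 32))              # e.g., 100 original images
training_set = augment(small_set, copies_per_image=10)
print(training_set.shape)                          # (1000, 32, 32)
```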

Second, we automate the task of labeling datasets through an ML-based application, which identifies the best candidates for initial labeling. These initial labels are used to sort the rest of the dataset with sufficient accuracy to create usable training information, which can be validated manually or automatically.
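The general pattern can be sketched as a simple pseudo-labeling loop (hypothetical synthetic data and thresholds below; DMC’s labeling application is more involved): a small set of manual labels seeds a classifier, which then labels the remaining examples wherever it is confident.

```python
# Generic pseudo-labeling sketch: seed labels train a classifier, which
# auto-labels the rest of the dataset above a confidence threshold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y_true = make_classification(n_samples=2000, n_features=20, random_state=0)

# Suppose only the first 50 examples have been labeled by hand.
n_seed = 50
X_seed, y_seed = X[:n_seed], y_true[:n_seed]
X_unlabeled = X[n_seed:]

# Train on the seed labels, then accept predictions above a confidence threshold.
clf = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)
proba = clf.predict_proba(X_unlabeled)
confident = proba.max(axis=1) >= 0.9
pseudo_labels = clf.classes_[proba.argmax(axis=1)]

print(f"auto-labeled {confident.sum()} of {len(X_unlabeled)} examples")
# Accuracy of the accepted pseudo-labels (checkable here only because the data is synthetic).
print("pseudo-label accuracy:",
      (pseudo_labels[confident] == y_true[n_seed:][confident]).mean())
```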

Finally, we advocate for “datasheets” to provide developers with information about a dataset’s provenance, including test results, documented biases, and standard operating characteristics. First described in a paper from Microsoft, datasheets can help organizations to choose better datasets and use them more effectively.
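In practice, a datasheet can be as simple as a structured record that travels with the data. The sketch below shows one hypothetical shape such a record might take; the fields are illustrative and do not reproduce the paper’s full question set.

```python
# A hypothetical, much-abbreviated datasheet record (illustrative fields only).
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    name: str
    collected_by: str             # provenance: who gathered the data
    collection_method: str        # and how it was gathered
    intended_uses: list[str]      # what the dataset was designed to support
    known_biases: list[str]       # documented gaps or skews in coverage
    test_results: dict[str, float] = field(default_factory=dict)

sheet = Datasheet(
    name="check-images-v2",
    collected_by="Example Bank Corp. (hypothetical)",
    collection_method="Scanned checks from US branches, 2018-2020",
    intended_uses=["fraud-detection model training"],
    known_biases=["English-language checks only", "no mobile-capture images"],
    test_results={"label_agreement": 0.97},
)
print(sheet)
```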

3. Identifying Better Goals

AI alignment is a long-term problem that is likely to remain unsolved for the next century of computing. However, we are taking steps in the right direction by identifying values and goals that should form the basis for a new generation of AI development. At DMC, our engineers, AI/ML experts, and data and computer scientists are proud to contribute to the scientific community’s global conversation about AI values.

We advocate for many of the goals outlined by the Institute of Electrical and Electronics Engineers (IEEE), the U.S. Intelligence Community, and the Ethical ML Network, including:

· Respect for human rights, and focus on human well-being

· Explainability to provide transparent rationale for all AI-driven decisions

· Accountability to our clients and community

· Data risk awareness to ensure data and model security

· Bias evaluation to ensure that our products do not result in unfair outcomes

The Future of AI

As AI and associated technologies scale in complexity, current challenges to safety and reliability will also become harder to solve. The work that we are doing today on the frontier of AI research is laying the groundwork for safer and more reliable solutions in the future.

At DMC, we are strong believers in the scientific community and advocates for open-source software and public accountability. As we continue to encounter new challenges, we freely share our knowledge and discoveries so other organizations and developers can work towards realizing the promise of AI for better, and not for worse.
