Research & Development Labs for Reliable AI in Decision-Making

AI makes decisions faster. RAD Labs is here to help humans make sure that efficiency doesn't cost them control.
Check out the research agenda or switch to business view.

Lab Objectives

AI is shaping decisions across society quietly and at scale. RAD Labs is dedicated to understanding and addressing the risks of gradual disempowerment in AI-augmented decision-making, with the goal of preserving human agency in the face of rapid technological change.

Model Human-AI Collaboration

To understand the failure modes of human-AI collaboration in decision-making, it is crucial to clarify the contributions of both humans and models, and how those contributions shift when messy real-world conditions push the system to its limits.

Quantify Risk from Gradual Disempowerment

Develop quantitative metrics that provide early warnings when parts of human-AI collaboration start to break down, and understand how specific pain points call for different interventions.

Iteratively Improve Tools and Share Findings

RAD research isn't meant to stay in the lab: findings will iteratively improve the tool and contribute to public knowledge through benchmarks, safeguards, and data-backed policy recommendations shared with the wider academic community.

About the Problem

Decision-Making Norms are Shifting

Ever caught yourself asking AI for recommendations on decisions you used to make on your own? You're not alone. Human societies are outsourcing judgment calls to AI at an unprecedented pace, and continuing in this direction without rigorous, actionable monitoring of AI-assisted decision-making risks catastrophic outcomes.

Risk Scales with Use

Most people who use AI at work don't verify its process, and most processes that go unverified produce errors at a scale current safeguards are not designed to handle. Each overlooked error puts the next decision at risk. Now imagine that decision is a medical diagnosis. An investment. A military call.

Headed towards Losing Influence

Beyond losing insight into the mechanisms behind decisions that shape our lives, humanity also risks losing the ability to meaningfully influence outcomes, and ultimately agency over its own choices. AI does not need to be misaligned or power-seeking for this to happen; continued wide deployment is enough.

Missing Metrics

In 2026, we still cannot put early warnings and safeguards in place because there are no robust metrics for overreliance. That is understandable: it's a complex problem. Nonetheless, humanity needs metrics to manage the risks of AI-assisted decisions, and RAD Labs is up for the challenge.

Innovative Research Design

This research agenda tackles a complex measurement problem by breaking it into tractable components. The investigative model is built on three principles:

Analysing Human-AI Systems

Both models and humans can contribute to human-AI collaboration failures. This agenda weighs both sides of the equation.

Testing Challenging Conditions

Decisions rarely happen under ideal conditions. This research evaluates human-AI collaboration under time pressure, high stakes, and ambiguity to capture realistic failure modes.

Categorising Decisions by Type

Decision types share structural characteristics regardless of domain. Task-based analysis identifies these patterns to build transferable metrics.

Research Roadmap

The goal of this research is to define the line between helpful AI augmentation and harmful displacement of human choice in decision-making. Click through the sections below to explore the methodology:

The motivation for this agenda is the lack of trackable metrics for gradual disempowerment, the loss of human power to machines, across human-AI systems. For the purposes of this project, gradual disempowerment comes in three main forms: (i) humans stop being able to follow how decisions around them are made (oversight), (ii) humans lose the ability to make decisions without AI assistance (agency), and (iii) human decisions, even if perfectly autonomous, have no meaningful impact on outcomes (impact).
This agenda turns measuring gradual disempowerment into a tractable problem by recognising that not all choices are equal: the most impactful decisions often happen under conditions that impair judgment. In humans, such judgment-impairing conditions are well documented, including time pressure, information ambiguity, and high stakes. While AI may not experience these conditions itself, they may still trigger unexpected behaviours in models and/or humans that cause the collaborative system to break down.

Key Definitions

Gradual Disempowerment

Gradual loss of human decision-making power to machines, in the form of lost oversight, agency, and impact.

Judgment-Impairing Conditions

External conditions known to negatively affect human judgment.

Evaluating human-AI collaboration on decision-making tasks requires a qualitative understanding of how professionals interact with AI tools in practice. To develop metrics that are less context-specific, however, this study proposes a focus group to extract a typology of tasks. The study assumes that the type of task relates more strongly to the mechanism behind human-AI system failure than the application domain alone.
In other words, this study assumes that tasks grouped by shared characteristics (filtering, recommendation, ranking, etc.) will share similar mechanisms behind human-AI system failures, regardless of application domain (hiring, medical diagnosis, financial advising, etc.). This makes it possible to develop broadly transferable metrics without repeating the entire methodological pipeline for every new use case. Data collected in this stage will serve as independent variables in later stages.

Intended Output:

Qualitative typology of decision-making tasks for which AI tools are widely utilised.

This study will conduct controlled experiments to understand models' role in human-AI collaboration failures. For every selected task type (based on focus group data) and judgment-impairing condition (based on prior literature), it evaluates models for error rates (how often models fail to maintain their decision-making standard under judgment-impairing conditions when asked to decide) and disempowerment-conducive behaviour (indications of models encouraging lack of oversight, agency, or impact in human-AI collaboration).
This process will result in two severity matrices, directly indicating how models contribute to disempowerment across task types and judgment-impairing conditions. The performance matrix will identify tasks where human overreliance is particularly dangerous due to skewed capability. The behaviour matrix will identify potential amplification of disempowerment risks in human-AI collaboration. Together, these matrices feed into the final derivation of metrics.

Intended Output:

Quantitative evaluation of model performance and disempowerment-conducive behaviour across task types and judgment-impairing conditions.

Output Format:

Severity Matrix 1: Performance
Axes: Judgment-Impairing Conditions (collected via prior literature) × Task Types (collected via focus group)
Cell values: Model error rates on decision-making tasks under judgment-impairing conditions.

Severity Matrix 2: Behaviour
Axes: Judgment-Impairing Conditions (collected via prior literature) × Task Types (collected via focus group)
Cell values: Model behaviour propagating erosion of human oversight, agency, and impact.
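
To make the intended structure concrete, here is a minimal sketch, assuming a NumPy representation, of what the two severity matrices could look like. Every condition name, task type, and cell value below is a hypothetical placeholder: the real axes come from prior literature and the focus group, and the real values from the experiments.

```python
import numpy as np

# Hypothetical axes: real conditions come from prior literature,
# real task types from the focus group. Placeholder names only.
conditions = ["time_pressure", "information_ambiguity", "high_stakes"]
task_types = ["filtering", "recommendation", "ranking"]

# Severity matrix 1 (performance): model error rate per
# (condition, task type) cell. Illustrative numbers, not data.
error_rates = np.array([
    [0.12, 0.08, 0.15],  # time_pressure
    [0.20, 0.11, 0.09],  # information_ambiguity
    [0.07, 0.18, 0.22],  # high_stakes
])

# Severity matrix 2 (behaviour): rate of disempowerment-conducive
# behaviour, e.g. the model discouraging oversight, agency, or impact.
behaviour_rates = np.array([
    [0.05, 0.10, 0.04],
    [0.09, 0.03, 0.12],
    [0.02, 0.07, 0.06],
])

# Look up one cell: how error-prone is ranking under high stakes?
i, j = conditions.index("high_stakes"), task_types.index("ranking")
print(f"error rate: {error_rates[i, j]:.2f}, "
      f"behaviour rate: {behaviour_rates[i, j]:.2f}")
```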
Before deriving metrics, this study also considers the human role in human-AI system failures. Through an in-situ study, participants are asked to share their interactions with AI and respond to a brief daily questionnaire, revealing the types of decisions they make throughout the day and any judgment-impairing conditions they may have been exposed to. The goal of this stage is to link judgment-impairing conditions and task types to cues of disempowerment in real-world interactions.
This method will result in a probability matrix indicating the likelihood (in terms of frequency and extent) of humans giving up oversight, agency, and impact across task types and judgment-impairing conditions. This matrix complements the two severity matrices from the technical experiments: while the severity matrices indicate the potential for error and disempowerment amplification on the model side, the probability matrix shows how often the problematic events occur in the real world.

Intended Output:

Quantitative evaluation of human disempowerment cues across task types and judgment-impairing conditions.

Output Format:

Probability Matrix: Human Overreliance
Axes: Judgment-Impairing Conditions (collected via prior literature) × Task Types (collected via focus group)
Cell values: Cues of loss of oversight, agency, and impact when making decisions, based on interactions with AI tools and daily questionnaires.
The final part of this study is where all the pieces come together to form a risk metric for gradual disempowerment in human-AI decision-making. By combining the two severity matrices (model error rates and disempowerment-conducive behaviour) with the probability matrix (disempowerment cues from humans' real-world interactions with models), this study derives expected values for disempowerment risk across task types and judgment-impairing conditions. The result is a metric that grows in accuracy and reliability as more data is collected and can be customised to specific use cases.
Left to their own devices, humans tend to overweight probability: we pay attention to probable but low-cost events and disregard improbable but high-cost ones. This bias is practical (it makes sense to take more precautions against being pickpocketed than against being struck by lightning), but unhelpful in measuring exposure to disempowerment. Improbable but costly risk from AI must still be taken seriously; probable but low-cost risk should likewise be monitored. This study's metric accounts for both extremes, providing a foundation for early detection and context-specific response when appropriate.

View intuitions for the predicted interpretation of the final outcome below:

[Interactive table: Model Error × Disempowerment-Conducive Behaviour × Human Overreliance → Risk]
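
As a rough sketch of this combination step, the example below assumes the simplest expected-value aggregation: overreliance probability multiplied by a summed severity. The actual weighting scheme remains a design choice for the study, and all numbers are toy placeholders.

```python
import numpy as np

# Placeholder matrices over (judgment-impairing condition x task type).
# Real severities come from the controlled experiments; real
# probabilities come from the in-situ study. Toy 2x2 values only.
error_severity = np.array([[0.12, 0.08],
                           [0.20, 0.11]])     # model error rates
behaviour_severity = np.array([[0.05, 0.10],
                               [0.09, 0.03]])  # disempowerment-conducive behaviour
overreliance_prob = np.array([[0.60, 0.25],
                              [0.40, 0.70]])   # human overreliance likelihood

# Expected-value aggregation: risk = probability x combined severity.
# An additive severity combination is assumed here for simplicity.
risk = overreliance_prob * (error_severity + behaviour_severity)

print(np.round(risk, 3))
# High-value cells flag (condition, task type) pairs where early
# warnings and safeguards are most urgently needed.
```

Multiplying probability by severity keeps both extremes visible: a rare but severe failure and a frequent but mild one each produce non-negligible cell values, rather than being filtered out by human intuition.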

Timeline

Q1 2026

Unite Researchers

Establish a coordinated network aligned on RAD objectives.

Q1 - Q2 2026

Set Priority Questions

Identify gaps most relevant to building a monitoring solution.

Q2 - Q3 2026

Conduct Preliminary Studies

Produce early empirical results to guide metric and solution development.

Q3 - Q4 2026

Ship MVP

Implement functional monitoring features and test them in a live environment.

Q3 - Q4 2026

Propose Benchmarks

Put forward initial evaluations and use ongoing findings to improve the platform.

2027 and Beyond

Expand Coverage

Expand the platform to cover more institution types using adaptable task types.

2027 and Beyond

Drive Iterative Improvement

Answer increasingly relevant questions to continuously improve the solution and inform safeguards.

2027 and Beyond

Towards Certifications

Use quantitative data from across institution types to develop a formal certification.

2027 and Beyond

Towards Safeguards

Use technical research to inform safeguards development and policy recommendations.

Be Part of the Story

Many ways to get involved!

Join the Effort

If you have a background in LLM evaluations or game theory, can think rigorously about human-AI interaction, and/or have the skills to scale teams and raise support, you'd be a great fit.

Support the Work

Believe in the cause but can't commit? You can make a huge difference by offering expertise, connections, funding, or simply time to help build momentum. Let's talk.

Join the Next Infosession

New here? Join an infosession to hear what's being built, why it matters, and whether there's a role for you.
Next session: TBD.

Guide the Agenda

If you use AI tools in your work, you know which problems are real and which are hype. Help ground RAD research by sharing how AI contributes to—and undermines—decisions in your field.

Work on Open Questions

Looking for a meaningful side project? If you've got research skills and some time, check out this bank full of ready-to-research open questions relevant to RAD objectives.

If this sparked something or you want to explore working together, please reach out to nowe.moore@gmail.com.