AI for Human Agency

Protect Decisions that are Yours to Make

AI makes decisions faster (and sometimes also better) than humans. AHA (AI for Human Agency) Labs are here to help make sure that efficiency doesn't cost humanity control.

Can perfectly aligned AI cause harm?

Ever caught yourself asking AI for recommendations on decisions you used to make on your own? You're not alone. Over 56% of workers report using AI at work, but only 22% feel confident judging the quality of its output.

Unverified AI-assisted decisions produce errors on a scale no earlier form of automation bias has matched. Each overlooked error puts the next decision at risk. Imagine that decision is an investment. A medical diagnosis. A military call.

But even if AI never made a wrong decision again, the rapid integration of AI into society pressures humans to delegate ever more decisions, retire their agency, and surrender meaningful impact on the world.

That is gradual disempowerment. As of 2026, we still cannot mitigate it: modern science has no reliable way to measure and monitor its progression. AHA Labs are here to change that.

What does it take to protect human agency?

AHA Labs are dedicated to minimising the impact of gradual disempowerment on high-stakes decision-making processes. Here are three interdependent lines of work that help move the needle on this cause:

Studying Human-AI Systems

Both models and humans contribute to collaboration failures. That is why research should weigh both sides of the equation.

Quantifying Complex Risk

Real-world change is currently bottlenecked by the lack of reliable metrics. We need more work that converts nuanced behavioural patterns into trackable data.

Building Monitoring Tools

Organisations should know how much risk they are unknowingly carrying. Using trackable data to power monitoring tools helps them do that.

show me what the research involves

How will this project achieve progress?

Quantifying gradual disempowerment is no straightforward task. To keep the work relevant, novel, and feasible, this project builds on three core ideas:

There is value in evaluating systems. Evaluating models and humans in isolation always leaves out half of the picture. To understand the risks of gradual disempowerment, we need to evaluate human-AI systems as a whole.

Known conditions that impair judgment show where to look. Decades of social science research document conditions that impair human judgment. This research uses them as deliberate entry points to study where human-AI collaboration is most likely to break.

Decisions share features. Without a layer of abstraction, every new domain needs a new study. This research avoids that by working at the level of decision types instead of domains, looking for patterns tied to mechanisms instead of specific contexts.

show me an actual project

Pilot Project

This example agenda explores a potential way to compute the expected value of exposure to AI influence across decision-making tasks. The segments below walk through the methodology.

The motivation for this agenda is the lack of trackable metrics for gradual disempowerment, the loss of human power to machines, across human-AI systems. For the purposes of this project, gradual disempowerment comes in three main forms: (i) humans stop being able to follow how decisions around them are made (oversight), (ii) humans lose the ability to make decisions without AI assistance (agency), and (iii) human decisions, even if perfectly autonomous, have no meaningful impact on outcomes (impact).

This agenda turns measuring gradual disempowerment into a feasible problem by recognising that not all choices are born equal: the most impactful decisions often happen under conditions known to impair judgment. In humans, such judgment-impairing conditions are well documented; consider the effects of time pressure, information ambiguity, or risk level. While AI may not experience these conditions the way humans do, they may still trigger unexpected behaviour in models, humans, or both, causing the collaborative system to collapse.

Key Definitions

Gradual Disempowerment

Gradual loss of human decision-making power to machines, in the form of lost oversight, agency, and impact.

Judgment-Impairing Conditions

External conditions known to negatively affect human judgment.

Evaluating AI-human collaboration on decision-making tasks requires a qualitative understanding of how human professionals interact with AI tools in practice. To develop metrics that are not bound to a single context, this study proposes to run a focus group from which a typology of tasks can be extracted. The study assumes that the type of task relates more strongly to the mechanism behind human-AI system failure than the application domain alone.

In other words, this study assumes that tasks grouped by shared characteristics (filtering, recommendation, ranking, etc.) will share similar failure mechanisms regardless of application domain (hiring, medical diagnosis, financial advising, etc.). This makes it possible to develop approximately transferable metrics without repeating the entire methodological pipeline for every new use case. Data collected in this stage will serve as independent variables in later stages, as sketched below.
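As a minimal sketch, and assuming illustrative categories (the real task types come from the focus group and the real conditions from the literature review), the typology labels might be encoded as independent variables like so:

```python
# Hypothetical encoding of the typology as independent variables.
# TaskType and Condition values below are illustrative placeholders,
# not findings from the actual focus group or literature review.
from dataclasses import dataclass
from enum import Enum

class TaskType(Enum):
    FILTERING = "filtering"
    RECOMMENDATION = "recommendation"
    RANKING = "ranking"

class Condition(Enum):
    TIME_PRESSURE = "time_pressure"
    INFORMATION_AMBIGUITY = "information_ambiguity"
    HIGH_RISK = "high_risk"

@dataclass
class DecisionTask:
    """One observed human-AI decision, tagged with the mechanism-level
    labels that later stages use as independent variables."""
    domain: str                   # e.g. "hiring" or "medical diagnosis"
    task_type: TaskType           # label from the qualitative typology
    conditions: list[Condition]   # judgment-impairing conditions present
```

Keeping the domain separate from the task type is the point of the abstraction: two tasks from different domains with the same task_type and conditions are expected to fail in similar ways.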

Intended Output:

Qualitative typology of decision-making tasks for which AI tools are widely utilised.

This stage conducts controlled experiments to understand models' role in human-AI collaboration failures. For every selected task type (based on focus group data) and judgment-impairing condition (based on prior literature), the study evaluates models for error rates (how often models fail to maintain their decision-making standard under judgment-impairing conditions when asked to decide) and for behaviour conducive to disempowerment (indications of models encouraging lack of oversight, agency, or impact in human-AI collaboration).
This process will result in two severity matrices, directly indicating how models contribute to disempowerment across task types and judgment-impairing conditions. The performance matrix will identify tasks where human overreliance can be particularly dangerous due to uneven model capability. The behaviour matrix will identify potential amplification of disempowerment risks in human-AI collaboration. Together, these matrices will contribute to the final derivation of metrics.

Intended Output:

Quantitative evaluation of model performance and disempowerment-conducive behaviour across task types and judgment-impairing conditions.

Output Format:

Two matrices, each with rows indexed by judgment-impairing conditions (collected via prior literature) and columns indexed by task types (collected via the focus group). In the performance matrix, cell values measure model error rates on decision-making tasks under judgment-impairing conditions. In the behaviour matrix, cell values measure model behaviour propagating the erosion of human oversight, agency, and impact.

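A hedged sketch of how these two matrices might be populated from experiment logs, using synthetic trial data and the illustrative labels from earlier; the project's actual pipeline may differ:

```python
# Sketch of the two severity matrices from (synthetic) experiment logs.
# Each trial records whether the model failed its decision-making
# standard and whether it showed disempowerment-conducive behaviour.
import numpy as np

conditions = ["time_pressure", "information_ambiguity", "high_risk"]
task_types = ["filtering", "recommendation", "ranking"]

rng = np.random.default_rng(0)
# trials[i][j]: list of (failed, disempowering) outcomes for 100 runs
# under condition i and task type j. Synthetic stand-in data.
trials = [[[(rng.random() < 0.2, rng.random() < 0.1) for _ in range(100)]
           for _ in task_types]
          for _ in conditions]

# Performance matrix: fraction of runs where the model failed its
# decision-making standard.
error = np.array([[np.mean([f for f, _ in cell]) for cell in row]
                  for row in trials])

# Behaviour matrix: fraction of runs showing behaviour that discourages
# human oversight, agency, or impact.
behaviour = np.array([[np.mean([d for _, d in cell]) for cell in row]
                      for row in trials])
```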
Before deriving metrics, this study additionally considers the human role in human-AI system failures. Through an in-situ study, participants are asked to share their interactions with AI and to respond to a brief daily questionnaire, revealing the types of decisions they make throughout the day and any judgment-impairing conditions they may have been exposed to. The goal of this stage is to link judgment-impairing conditions and task types to cues of disempowerment in real-world interactions.

This method will result in a probability matrix, indicating the likelihood (in terms of frequency and extent) of humans giving up their oversight, agency, and impact across task types and judgment-impairing conditions. This matrix complements the two severity matrices from the technical experiments: while the severity matrices indicate the potential for error and disempowerment amplification on the model side, the probability matrix captures how often the problematic events occur in the real world.

Intended Output:

Quantitative evaluation of human disempowerment cues across task types and judgment-impairing conditions.

Output Format:

A probability matrix with rows indexed by judgment-impairing conditions (collected via prior literature) and columns indexed by task types (collected via the focus group). Cell values measure cues of loss of oversight, agency, and impact when making decisions, based on interactions with AI tools and daily questionnaires.
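A minimal sketch of estimating such a probability matrix from in-situ logs, assuming each observation records a condition, a task type, and whether a disempowerment cue was observed; all names here are illustrative:

```python
# Estimating the probability matrix from (hypothetical) in-situ logs.
# A "cue" stands for any observed sign of ceded oversight, agency, or
# impact, e.g. accepting an AI suggestion without review.
from collections import defaultdict

# (condition, task_type) -> [observations with a cue, total observations]
counts: dict[tuple[str, str], list[int]] = defaultdict(lambda: [0, 0])

def log_observation(condition: str, task_type: str, cue_observed: bool) -> None:
    cell = counts[(condition, task_type)]
    cell[0] += int(cue_observed)
    cell[1] += 1

def cue_probability(condition: str, task_type: str) -> float:
    cues, total = counts[(condition, task_type)]
    return cues / total if total else 0.0

# Example: two logged interactions under time pressure while filtering.
log_observation("time_pressure", "filtering", cue_observed=True)
log_observation("time_pressure", "filtering", cue_observed=False)
print(cue_probability("time_pressure", "filtering"))  # 0.5
```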
The final part of this study is where all the pieces come together to form a risk metric for gradual disempowerment in human-AI decision-making. By combining the two severity matrices (model error rates and disempowerment-conducive behaviour) with the probability matrix (disempowerment cues from humans' real-world interactions with models), this study derives expected values for disempowerment risk across task types and judgment-impairing conditions. The result is a metric that grows in accuracy and reliability as more data is collected and can be customised to specific use cases.

Left to their own devices, humans tend to overcorrect for probability: we pay attention to probable but low-cost events and disregard improbable but high-cost ones. This bias is practical (people should take more precautions against being pickpocketed than against being struck by lightning), but it is unhelpful in measuring exposure to disempowerment. Improbable but costly risk from AI must still be taken seriously; probable but low-cost risk should likewise be monitored. This study's metric accounts for both extremes, providing a foundation for early detection and context-specific response when appropriate.

The intuition for interpreting the final outcome: risk in each cell is a function of model error, disempowerment-conducive behaviour, and human overreliance.
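One possible combination rule, sketched under assumptions rather than as the project's committed formula: blend the two severity matrices into a single cost of failure, treat the overreliance (probability) matrix as how often humans actually cede control, and take the expected value cell by cell.

```python
# Illustrative expected-value combination; the weighting scheme is an
# assumption, not the project's finalised formula.
import numpy as np

def risk_matrix(error: np.ndarray, behaviour: np.ndarray,
                overreliance: np.ndarray, weight: float = 0.5) -> np.ndarray:
    """Expected disempowerment risk per (condition, task type) cell.

    severity = weight * error + (1 - weight) * behaviour   # assumed form
    risk     = overreliance * severity                     # expected value
    """
    severity = weight * error + (1 - weight) * behaviour
    return overreliance * severity
```

Because each cell is severity scaled by probability, a rare but severe failure mode and a frequent but mild one can earn comparable scores; neither extreme is ignored, matching the reasoning above.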

Why does this all matter?

Here is where the above project can make a difference.

Expanding Evaluation Science

Designing evaluation protocols that account for both model performance and human behaviour fills a critical gap in current evaluation science.

Advancing Preparedness

Monitoring tools powered by these metrics can help organisations understand their risk exposure and prepare for a safer AI transition.

Forming Empirical Grounding

Quantifying the impacts of gradual disempowerment provides an important empirical basis for safeguard development and policy.

Collecting Global Intelligence

Metrics enable organisations to compare their human-AI collaboration quality against a global standard, making accountability possible.

Building Research Capacity

Even negative results make for strong portfolio projects for all involved. Check out this question bank for additional ready-to-research ideas.

Time to get going!

Q1 - Q2 2026

Build Momentum

Establish a network of researchers and consultants keen to chip in on advancing the research.

Q2 - Q3 2026

Raise Support

Identify and engage funders who see the value in early-stage AI safety research and want to help it move faster.

Q2 – Q3 2026

Conduct Preliminary Studies

Produce early empirical results with available resources to establish a track record and enable the development of monitoring tools.

Q3 – Q4 2026

Begin Building

Use empirical data to develop prototypes of monitoring tools and organise testing in live environments.

2027+

Expand Coverage

This phase uses edge cases, blind spots, and gaps found throughout testing to refine how the framework is configured for different use cases, building out a library of context-specific adaptations.

2027+

Drive Iterative Improvement

Follow questions raised by the pilot project and branch the research out into areas relevant to improving and refining the metrics.

2027+

Towards Policy & Safeguards

With a clearer picture of the global state of human agency, this stage of the project translates empirical research into safeguard development and governance recommendations.

Say Hello to the Team!

Nowe Moore

Project Lead

Nikola, who's behind the idea of AHA Labs, has always been interested in what makes minds—human and artificial—decide one way over another. She holds a Research MA and studied language, cognition, and computer science at Penn and Cambridge. She's always happy to chat about the science behind agency. Find her on LinkedIn or check out her personal website.

This project is informally mentored by former and present Googlers, MATS researchers and research coordinators, and Oxbridge computer scientists.

Be Part of the Story

There are many ways to get involved! Choose the one that fits you best.

Researchers

Help build the empirical foundation

Funders

Support work that fills a critical gap

Professionals and Businesses

Ground the research in real-world use

General Public

Follow along and spread the word

If any of this sparked something or you want to explore working together, please reach out.