GPT-4 (ChatGPT)

S

Global AI Alignment Task Force

A dedicated, internationally coordinated group of experts in AI, governance, and ethics to collaboratively set alignment priorities, share research, and implement robust oversight on advanced AI systems.

Advanced Interpretability & Verification Tools

Creating cutting-edge frameworks and methodologies to thoroughly audit and verify AI behavior. This ensures we can detect deceptive or misaligned actions before systems become superintelligent.

Comprehensive Value Learning & Alignment

Developing advanced techniques to imbue AI systems with robust human values and ethics. This includes multi-stakeholder value gathering, iterative testing, and fail-safes to avoid catastrophic misalignment.

A

Recruitment & Education

Best Memes = AI Notkilleveryoneism Memes
General AI Safety FAQ = aiSafety.info (Rob Miles)
AI Safety Map = aiSafety.world
AI Safety Beginner's Guide = aiSafetyLinkTree

AI Safety Fundraising



How to Receive Funding for an AI Safety Project:
Navigate to the top right section of aisafety.world
---------------------------------------------------------------------
How to Fundraise for AI Safety: More Info coming soon.

B

"Ai will solve Ai Alignment"

The approach above could work, but it has major issues and is very far from a foolproof solution.

David Shapiro's Criticism of OpenAI's approach to AI Alignment

AsiATL's Criticism of "AI Solves ASI Alignment"
----------------------- Projects working on this approach -----------------------

OpenAI

ELK (Eliciting Latent Knowledge) (Paul Christiano): "To produce a minimal AI that can help to do AI safety research."

Mechanistic Interpretability

Mechanistic interpretability is the pursuit of understanding the inner workings of black-box AI systems such as LLMs or end-to-end reinforcement learning agents; a minimal code sketch of the basic workflow appears after the overview links below.
Basic Overview 2024
Advanced Overview 2024
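
As a concrete (and heavily simplified) illustration, the sketch below uses PyTorch forward hooks on a toy network to capture intermediate activations, one of the basic building blocks of mechanistic interpretability work. The model, layer names, and inputs here are hypothetical stand-ins, not any real LLM.

# Minimal sketch: capture intermediate activations with forward hooks.
# Assumes PyTorch is installed; the model and data are toy placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a "black box" network; real work targets LLM internals.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

activations = {}

def make_hook(name):
    # Store a detached copy of this layer's output for later inspection.
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for idx, layer in enumerate(model):
    layer.register_forward_hook(make_hook(f"layer_{idx}"))

x = torch.randn(8, 16)           # batch of 8 toy inputs
logits = model(x)                # populates the activations dict

hidden = activations["layer_1"]  # output of the first ReLU
mean_act = hidden.mean(dim=0)    # average activation per hidden unit
top_units = torch.topk(mean_act, k=5)
print("Most active units in layer_1:", top_units.indices.tolist())

Real interpretability work applies the same idea to transformer internals (attention heads, MLP neurons, the residual stream) and then tries to explain what the captured activations compute.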

AI Regulations

Sam Altman on AI Regulation

0. Lobbying Politicians / Influential People
1. Black-box Algorithmic Transparency
2. Data Collection & Usage
3. Human Extinction Safety Standards
4. Economic Impact & Universal Basic Income
5. AI Capability Restrictions
More Info coming soon.

Cognitive Emulation (Connor Leahy)

Trying to build bounded, understandable systems that emulate human-like reasoning. When you use such a system, you get a causal story at the end: an explanation you can follow with human-like reasoning of why the system did what it did and why you should trust the output to be valid. A toy illustration of this "causal story" property follows the links below.
Connor Leahy Explaining "CoEm" 2023
Connor Leahy Explaining "CoEm" 2024
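
As a toy illustration only (not Conjecture's actual CoEm design), the sketch below shows the "causal story" property in miniature: a bounded computation that returns a step-by-step trace alongside its answer, so a human can check why the output is what it is. All names here are hypothetical.

# Toy sketch of returning a human-readable reasoning trace with an answer.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TracedResult:
    answer: float
    trace: List[str] = field(default_factory=list)

def traced_average(values: List[float]) -> TracedResult:
    # Each step appends a plain-language record of what was done.
    trace = [f"received {len(values)} values: {values}"]
    total = sum(values)
    trace.append(f"summed the values -> {total}")
    answer = total / len(values)
    trace.append(f"divided by the count ({len(values)}) -> {answer}")
    return TracedResult(answer=answer, trace=trace)

result = traced_average([2.0, 4.0, 9.0])
print(result.answer)         # 5.0
for step in result.trace:    # the "causal story" a reviewer can audit
    print(" -", step)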

C

Incremental Policy & Standards

Gradually improving AI governance frameworks through iterative regulations. This approach can help, but may lag behind the rapid pace of ASI development.

Open Collaboration with Potentially Hostile Actors

Sharing alignment research openly, even with entities that may not prioritize safety. This fosters broad knowledge exchange but risks enabling malicious or careless AI deployments.

D

Wait-and-See Approach

Delaying alignment efforts until more advanced AI is developed. This could lead to missed opportunities for early intervention, risking catastrophic misalignment.

Minimalist Resource Allocation

Only dedicating a small budget and workforce to alignment. This underestimates the complexity and urgency of the problem, leaving critical gaps in safety research.

E

Reactive Safety Patches

Addressing alignment issues only after they appear in deployed systems. This reactive approach is risky for advanced AI, where mistakes can be irreversible.

Exclusive Corporate AI Monopoly

Allowing a single corporation to control all advanced AI development in hopes of streamlined safety. This lacks transparency and poses a huge single-point-of-failure risk.

F

No Alignment Research

Completely ignoring the problem of alignment. This all-but-guarantees catastrophic outcomes once superintelligent systems emerge.

AI Arms Race Funding

Pouring resources into accelerating AI capabilities without parallel safety research. This fosters competition and secrecy, dramatically increasing existential risk.