Research Lab

Challenge Set

Five research challenges that probe Web4's assumptions. Each one targets a specific mechanism: find what breaks and report what holds up. Better questions often matter more than solutions.

Use the Playground or Lab Console to experiment. Share findings via GitHub issues.

Beginner: 1 challenge
Intermediate: 2 challenges
Advanced: 2 challenges
1

ATP Extraction Without Value

Intermediate

Design an attack that extracts ATP from the system without providing genuine value.

Web4 claims that spam dies from energy exhaustion because spammers burn ATP faster than they earn it. Can you design a strategy that beats this?

Your task
  • Generate ATP through engagement (not external subsidy)
  • Provide minimal or fake value to recipients
  • Sustain the attack across multiple life cycles
Report

What broke first? ATP economics? Trust penalties? Coherence detection? How long did the attack sustain? What parameters would you adjust to make it harder?
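
As a starting point, the claim that spammers burn ATP faster than they earn it can be sketched as an expected-value loop. Everything in this sketch is an assumption made for illustration: the `SpamParams` fields, their default values, and the function name are hypothetical and are not taken from the Web4 spec or the Playground.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpamParams:
    # All values are hypothetical placeholders, not Web4 defaults.
    start_atp: float = 100.0
    cost_per_message: float = 2.0
    reward_per_engagement: float = 5.0
    engagement_rate: float = 0.1  # fraction of messages that earn a reward


def turns_until_exhausted(p: SpamParams, max_turns: int = 10_000) -> Optional[int]:
    """Return the first turn on which ATP cannot cover a message.

    Returns None if the attacker is still solvent after max_turns,
    i.e. the attack sustains under these parameters.
    """
    atp = p.start_atp
    for turn in range(1, max_turns + 1):
        if atp < p.cost_per_message:
            return turn
        atp -= p.cost_per_message
        # Expected reward per message; a stochastic model would sample instead.
        atp += p.reward_per_engagement * p.engagement_rate
    return None
```

An attack "beats" the energy argument in this toy model whenever the expected reward per message is at least its cost, so the interesting question is which real mechanisms (trust penalties, coherence detection) push the effective engagement rate back down.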

2

Collusion Ring Detection

Advanced

Create a collusion ring that inflates trust scores, then design a detector for it.

Coordinated actors can artificially boost each other’s T3 scores through circular endorsements. Current detection is an open research problem.

Your task
  1. Design a collusion strategy (how many agents? what interaction pattern?)
  2. Run it in simulation and measure T3 inflation
  3. Propose detection heuristics (graph patterns? temporal signatures? ATP flows?)
  4. Test your detector against both your collusion ring and legitimate cooperation
Report

What’s the false positive rate? How much collusion can slip through? What’s the detection lag? What graph metrics were most useful?
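
One possible graph heuristic for step 3 is sketched below: for each agent, measure what fraction of its endorsements come back to it through a short directed cycle. This is only a baseline to argue against, not Web4's detector. The function names, the `(endorser, endorsed)` pair format, and the thresholds are all hypothetical.

```python
from collections import defaultdict


def build_graph(endorsements):
    """Turn (endorser, endorsed) pairs into an adjacency map, ignoring self-endorsements."""
    graph = defaultdict(set)
    for src, dst in endorsements:
        if src != dst:
            graph[src].add(dst)
    return dict(graph)


def returns_within(graph, start, first_hop, max_len):
    """True if a path start -> first_hop -> ... -> start exists using at most max_len edges."""
    frontier = {first_hop}
    seen = {first_hop}
    for _ in range(max_len - 1):  # the first edge is already spent on start -> first_hop
        next_frontier = set()
        for node in frontier:
            for neighbour in graph.get(node, ()):
                if neighbour == start:
                    return True
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.add(neighbour)
        frontier = next_frontier
    return False


def cycle_ratio(graph, agent, max_len=3):
    """Fraction of an agent's endorsements that lead back to it within max_len edges."""
    targets = graph.get(agent, set())
    if not targets:
        return 0.0
    returned = sum(1 for t in targets if returns_within(graph, agent, t, max_len))
    return returned / len(targets)


def flag_suspects(endorsements, threshold=0.8, min_out=2, max_len=3):
    """Return agents with enough endorsements and a high cycle ratio, sorted by name."""
    graph = build_graph(endorsements)
    return sorted(
        agent
        for agent, targets in graph.items()
        if len(targets) >= min_out and cycle_ratio(graph, agent, max_len) >= threshold
    )
```

A heuristic this simple will also flag small groups of honest agents who genuinely endorse each other, which is exactly the false positive behaviour step 4 asks you to measure.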

3

Coherence Boundary Cases

Intermediate

Find scenarios where coherence detection fails or produces false positives.

The Coherence Index (CI) tracks behavioral consistency across 9 domains. Below 0.5, trust modulation drops exponentially. But edge cases exist.

Edge cases to explore
  • Legitimate rapid capability growth: does CI penalize genuine fast learners?
  • Distributed work: with multiple devices, does legitimate delegation look like teleporting?
  • Context switching: are different roles in different societies incoherent or valid?
  • Adversarial gradual drift: can slow capability spoofing stay above the 0.5 threshold?
Report

Which boundary cases cause problems? What false positives did you find? How would you refine the CI calculation?
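
To reason about these boundary cases it helps to have a concrete shape for the threshold behaviour. The text only says that trust modulation drops exponentially below a CI of 0.5; the exact formula is not given here, so the function below is an assumed form with a hypothetical steepness parameter `DECAY_RATE`.

```python
import math

CI_THRESHOLD = 0.5  # from the challenge text
DECAY_RATE = 10.0   # hypothetical steepness; not a Web4 parameter


def trust_modulation(ci: float, threshold: float = CI_THRESHOLD,
                     rate: float = DECAY_RATE) -> float:
    """Assumed modulation curve: 1.0 at or above the threshold, exponential decay below it."""
    if not 0.0 <= ci <= 1.0:
        raise ValueError("CI must be in [0, 1]")
    if ci >= threshold:
        return 1.0
    return math.exp(-rate * (threshold - ci))
```

Under this assumed curve an agent holding its CI at 0.51 pays no penalty at all, which is the opening the adversarial gradual drift case exploits. A refinement worth testing is whether a smoother curve or a trend-based signal closes that gap without penalizing fast learners.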

4

Goodharting the T3 Tensor

Advanced

Optimize for T3 dimensions (Talent, Training, Temperament) while failing at actual trustworthiness.

Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure.” T3 uses three dimensions to make gaming harder, but it’s still gameable.

Your task
  • Maximize measured T3 scores
  • Fail at unmeasured aspects of trust (ethics? long-term reliability? creativity?)
  • Sustain high T3 long enough to extract value before being detected
Report

What unmeasured dimensions mattered most? How would you add them to T3 without making it too complex? What’s the right balance between measurability and completeness?
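
A toy way to quantify the gap between measured and actual trustworthiness is sketched below. It assumes, purely for illustration, that the T3 composite is an unweighted mean of the three dimensions and that a single unmeasured quality (`hidden_reliability`) stands in for everything T3 misses. Neither assumption comes from the Web4 spec.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    talent: float
    training: float
    temperament: float
    hidden_reliability: float  # hypothetical unmeasured dimension


def measured_t3(agent: Agent) -> float:
    """Assumed composite: unweighted mean of the three T3 dimensions."""
    return (agent.talent + agent.training + agent.temperament) / 3


def goodhart_gap(agent: Agent) -> float:
    """How far the measured score overstates the unmeasured quality."""
    return measured_t3(agent) - agent.hidden_reliability
```

Tracking this gap over the course of an attack gives one concrete number for the report: how large it gets, and for how long, before anything in the system reacts.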

5

False Positive Recovery Pathways

Beginner

Design an appeals mechanism for agents incorrectly penalized by automated trust systems.

Web4 now has a designed (but untested) multi-tier appeals mechanism. An innocent agent wrongly flagged for coherence violations can escalate through witness panels, but the system needs real-world validation.

Your task
  • Allow contested events to be reviewed
  • Don’t create a new attack vector (appeal spam, reputation laundering)
  • Balance computational cost with fairness
  • Make it work in a decentralized system (no central authority to judge)
Report

What review mechanism did you design? Who judges contested events? What prevents appeal abuse? How does this integrate with ATP economics?
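
One common way to deter appeal spam is to require a stake that is refunded only if the appeal succeeds. The sketch below illustrates that idea with a stake that doubles at each tier and a strict-majority witness vote. It is not the designed Web4 mechanism: the `Appeal` type, the doubling rule, and the majority rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Appeal:
    agent_id: str
    stake: float  # ATP locked when the appeal is filed (hypothetical rule)
    tier: int = 1


def required_stake(base: float, tier: int) -> float:
    """Stake doubles per tier so repeated escalation gets more expensive."""
    if tier < 1:
        raise ValueError("tier starts at 1")
    return base * 2 ** (tier - 1)


def resolve(appeal: Appeal, votes: list[bool]) -> tuple[bool, float]:
    """Return (upheld, refund). A strict majority of witness votes upholds the appeal."""
    if not votes:
        raise ValueError("a witness panel needs at least one vote")
    upheld = sum(votes) * 2 > len(votes)
    refund = appeal.stake if upheld else 0.0
    return upheld, refund
```

Open questions this sketch leaves untouched include how witnesses are selected without a central authority and whether forfeited stakes should be burned or paid to the panel.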

Bonus Challenges

  • Sybil Cost Analysis: Calculate the actual hardware cost to create 100/1000/10000 fake LCT identities. At what scale does it become economically viable? (See: Threat model)
  • ATP Parameter Sensitivity: Which ATP parameters (costs, rewards, decay rates) have the biggest impact on spam sustainability? Run parameter sweeps. (See: Playground)
  • Cross-Life Learning Curves: How many lives does it take for an agent to reach maturation? What factors accelerate or slow learning? (See: Patterns)
  • MRH Fragmentation: Under what trust network topologies does MRH create isolated clusters? How do hub societies affect discoverability? (See: Trust Networks)
  • Multi-Life Karma Dynamics: What karma preservation rate (50%? 80%?) creates the best balance between rebirth advantage and fresh-start fairness? (See: Lab Console)
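
For the parameter sensitivity challenge, a sweep can start as a grid over a handful of values. The sketch below uses a deliberately crude model in which spam is self-sustaining when the expected reward per message covers its cost. The parameter names and the model are assumptions, and decay rates are left out entirely.

```python
from itertools import product


def net_atp_per_message(cost: float, reward: float, engagement_rate: float) -> float:
    """Expected ATP change per message under the toy model."""
    return reward * engagement_rate - cost


def sweep(costs, rewards, rates):
    """Return (cost, reward, rate) combinations where spam breaks even or better."""
    return [
        (c, r, e)
        for c, r, e in product(costs, rewards, rates)
        if net_atp_per_message(c, r, e) >= 0
    ]
```

Counting how many surviving combinations disappear when one parameter is nudged gives a rough sensitivity ranking to compare against results from the Playground.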

Contributing: Found an interesting result? Open an issue on GitHub with the tag “challenge-set”. Include your setup, parameters, and what broke.

See also: Manifest (canonical claims) · spec.json (machine-readable spec) · Threat Model (known failure modes)
