
WalkXR AI: Emotionally Intelligent Agentic Systems

Table of Contents

  1. Project Overview
  2. Repository Structure
  3. Getting Started: Environment Setup
  4. Architecture Overview
  5. Technology Stack
  6. Key Scripts & Usage
  7. Agent Design Philosophy
  8. Development Workflow
  9. Project Roadmap & Milestones
  10. Documentation Index
  11. Contributing
  12. Contact & Support

Project Overview

WalkXR AI is dedicated to creating emotionally intelligent agentic systems that enhance human connection, self-awareness, and empathetic understanding. These agents are designed to power immersive and interactive experiences within the WalkXR platform, a broader initiative by The Verse to build games, experiences, and rituals that uplift humanity.

Mission

To develop and deploy sophisticated AI agents that can perceive, understand, and respond to human emotion with nuance and integrity, fostering psychologically safe and transformative interactions.

Vision

A future where AI companions and roleplay agents act as catalysts for personal growth, facilitating deeper self-reflection, co-regulation, and social connection. The long-term vision is the WalkXR Emotional OS, a comprehensive orchestration engine that dynamically adapts experiences based on user emotional states and therapeutic goals.


Repository Structure

This repository contains the core AI development for WalkXR agents, including knowledge base management, agent logic, and supporting infrastructure.

WalkXR-AI/
├── .env.template
├── .github/                # GitHub Actions workflows and templates
├── .gitignore
├── .python-version         # Set by pyenv to lock Python version
├── README.md
├── docs/
│   ├── architecture/       # Core technical and product architecture documents
│   ├── internal/           # Internal team, strategy, and project context documents
│   └── research/           # Foundational research and papers
├── notebooks/              # Jupyter notebooks for experimentation
├── pyproject.toml          # Defines project dependencies and tool configurations
├── scripts/
│   └── manage_rag_index.py # CLI tool for managing the RAG knowledge base
├── src/
│   └── walkxr_ai/
│       ├── __init__.py
│       ├── agents/         # Agent implementations (future work)
│       └── rag/            # Core RAG system components
│           ├── __init__.py
│           ├── chunking_strategy.md # Documentation on chunking approaches
│           ├── rag_config.yaml      # Configuration file for the RAG system
│           └── retrieval_engine.py  # Main class for RAG pipeline logic
└── vector_store/           # Default local storage for ChromaDB

Repository Workflow: From Idea to Agent

This repository is structured to support a clear development workflow, from initial design and research to production-ready agents.

  • docs/ (Design & Research): This is where ideas begin. Foundational research, product requirements, and technical architecture documents live here. Before writing code, consult the docs to understand the "why" behind an agent or feature.
  • notebooks/ (Prototyping & Experimentation): Use Jupyter notebooks in this directory to rapidly prototype new concepts, test LLM prompts, and experiment with libraries like LangChain or LlamaIndex before formalizing them into the main codebase.
  • src/walkxr_ai/ (Core Development): This is the heart of the project, containing all production Python code.
    • rag/: The foundational knowledge system that all agents will use.
    • core/: Core components like state management, safety layers, and memory systems will live here.
    • agents/: Where modular, reusable agents are built. Each agent should be self-contained and designed for future orchestration.
    • simulation/: Contains tools and schemas for simulating user interactions to test agent responses.
  • tests/ (Validation & Quality): All Pytest unit and integration tests go here. Every new feature or agent added to src/ should be accompanied by corresponding tests to ensure reliability.
  • scripts/ (Management & Operations): Contains high-level CLI tools for managing the system, such as the manage_rag_index.py script for interacting with the RAG knowledge base.

Getting Started: Environment Setup

Follow these instructions carefully to create a consistent development environment.

Prerequisites

  • Operating System: macOS, Linux, or Windows with WSL2.
  • Git: Ensure Git is installed.
  • pyenv and poetry: See steps below.

1. Clone the Repository

cd path/to/your/development/folder
git clone https://github.com/VerseBuilding/WalkXR-AI.git
cd WalkXR-AI

2. Python Version Management (pyenv)

We use pyenv to lock our Python version to 3.11.9.

# Install pyenv (macOS example)
brew install pyenv

# Follow shell setup instructions from pyenv

# Install and set the project's Python version
pyenv install 3.11.9
pyenv local 3.11.9

# Verify the version is active
python --version
# Expected output: Python 3.11.9

3. Dependency Management (Poetry)

We use Poetry for managing project dependencies and virtual environments.

  • Install Poetry: Follow the official installation guide.

  • Install Dependencies: The poetry.lock file pins exact dependency versions, but it can resolve differently depending on the operating system and Python version it was created on. To avoid conflicts, it is not committed to the repository; you will generate your own local lock file.

    Run the following command to install the dependencies and create your poetry.lock:

    poetry install
    

    This command reads pyproject.toml, resolves the dependencies, installs them into a virtual environment, and generates a poetry.lock file tailored to your system.

4. Environment Variables

Create a .env file for API keys and other secrets.

cp .env.template .env

Open the .env file and add any necessary keys (e.g., LANGCHAIN_API_KEY for LangSmith tracing).
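
If you want to confirm these variables are visible to Python (for example, in a notebook), one common option is python-dotenv. The snippet below is a minimal sketch under that assumption; python-dotenv is not listed in the stack here, so treat it as optional tooling.

# Minimal sketch: load .env into the process environment with python-dotenv.
# Assumption: python-dotenv is installed (e.g., poetry add python-dotenv); it is not a documented project dependency.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

# LANGCHAIN_API_KEY is the example key mentioned above; it may legitimately be unset.
print("LangSmith tracing configured:", bool(os.getenv("LANGCHAIN_API_KEY")))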

5. Local LLM Server (Ollama)

For local development, we use Ollama to serve LLMs and embedding models.

  • Install Ollama: Download from the official Ollama website.
  • Pull Required Models:
    ollama pull llama3          # Primary LLM for generation tasks
    ollama pull nomic-embed-text # For text embeddings (RAG)
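
To sanity-check that both models respond before running the full RAG pipeline, you can call them through the ollama Python client. This is a minimal sketch and an assumption: the ollama package is not part of the documented stack, and the Ollama server must already be running locally.

# Minimal sketch: confirm the pulled models respond via the `ollama` Python client.
# Assumptions: the `ollama` package is installed and the local Ollama server is running.
import ollama

# Embedding model used by the RAG pipeline.
emb = ollama.embeddings(model="nomic-embed-text", prompt="hello walkxr")
print("embedding length:", len(emb["embedding"]))

# Generation model used for synthesized answers.
chat = ollama.chat(model="llama3", messages=[{"role": "user", "content": "Say hi in one sentence."}])
print(chat["message"]["content"])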
    

6. Verify Setup

After completing all steps, verify that the RAG system is working.

  • Activate the virtual environment:
    poetry shell
    
  • Build the Knowledge Base: The ingest command processes documents in the docs/ directory and creates a vector store in vector_store/walkxr_knowledge_base.
    python scripts/manage_rag_index.py ingest
    
  • Run a Test Query: The query command tests the end-to-end RAG pipeline.
    python scripts/manage_rag_index.py query "What is the agent design philosophy?"
    

If both scripts run without errors, your development environment is correctly set up.
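
If you want to inspect the resulting store directly, ChromaDB's persistent client can list what the ingest step wrote. The snippet below is a quick check, assuming the default vector_store/walkxr_knowledge_base path mentioned above; the collection names it prints depend on rag_config.yaml.

# Quick inspection sketch: confirm that ingest wrote a ChromaDB store to the default path.
# Assumption: the vector_store/walkxr_knowledge_base location described in the ingest step above.
import chromadb

client = chromadb.PersistentClient(path="vector_store/walkxr_knowledge_base")
print("collections found:", client.list_collections())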


Architecture Overview

WalkXR AI is designed with modularity, scalability, and ethical considerations at its core.

Core Principles

  • Modularity: Agents and system components are designed to be independent, testable, and composable.
  • Simulation-Led Development: Agent behaviors are tested in simulation before deployment.
  • Therapeutic Guardrails: Agents never diagnose or judge; they co-regulate and offer reflection.
  • Interoperability: Support for multiple LLM providers and easy integration with various frontends.

The Four Development Tracks

Our development process is organized into four parallel, interconnected tracks:

  1. Track 1: EI Design & Evaluation: Defines the science behind our agents and builds the frameworks to evaluate their emotional intelligence and effectiveness.
  2. Track 2: Simulation & Data: Generates high-quality training and testing data by simulating diverse human personas and their emotional reactions.
  3. Track 3: Agents & Memory: Builds the sophisticated, stateful agents using frameworks like LangGraph and develops the hybrid memory systems that allow them to learn.
  4. Track 4: Full-Stack & Infrastructure: Builds the production-grade APIs, internal tools, and scalable cloud infrastructure needed to serve our agents reliably.

RAG System Architecture

Our Retrieval-Augmented Generation (RAG) system grounds our agents in the project's specific knowledge, ensuring their responses are relevant, accurate, and aligned with our design philosophy. The entire system is orchestrated by the RetrievalEngine class (src/walkxr_ai/rag/retrieval_engine.py).

graph TD
    subgraph Ingestion Phase
        A[Docs Folder] --> B(RetrievalEngine);
        C[rag_config.yaml] -- defines chunking --> B;
        B -- uses nomic-embed-text --> D[Embeddings];
        D --> E[ChromaDB Vector Store];
    end

    subgraph Query Phase
        F[User Query] --> G(RetrievalEngine);
        G -- queries --> E;
        E -- returns relevant chunks --> G;
        G -- combines query + context --> H{LLM Prompt};
        H -- sent to llama3 --> I[Synthesized Answer];
    end

Key Components:

  • rag_config.yaml: The central configuration file. It defines all parameters for the RAG pipeline, including paths to documents, the vector store location, the Ollama models to use for embeddings and generation, and the chunking strategy.
  • Ingestion (ingest command): The RetrievalEngine reads documents from the source directory specified in the config. It applies the chosen chunking strategy (see chunking_strategy.md), generates vector embeddings for each chunk using nomic-embed-text, and stores them in the local ChromaDB vector store.
  • Querying (query command): When a user submits a query, the RetrievalEngine first embeds the query text. It then searches the ChromaDB vector store to find the most semantically similar document chunks. Finally, it combines the original query with the retrieved context into a single prompt and sends it to a powerful LLM (llama3) to generate a comprehensive, context-aware answer.
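
For orientation, a programmatic call into the engine might look like the sketch below. The class, module path, and config file are real parts of this repository, but the constructor and method signatures shown are assumptions; check src/walkxr_ai/rag/retrieval_engine.py for the actual interface.

# Illustrative sketch only: RetrievalEngine and rag_config.yaml exist in this repo,
# but the constructor and method names below are assumptions, not the confirmed API.
from walkxr_ai.rag.retrieval_engine import RetrievalEngine

engine = RetrievalEngine(config_path="src/walkxr_ai/rag/rag_config.yaml")  # hypothetical signature

engine.ingest()  # hypothetical: mirrors the CLI's ingest command
answer = engine.query("What is the agent design philosophy?")  # hypothetical: mirrors the CLI's query command
print(answer)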

Technology Stack

This project leverages a range of modern technologies for AI development.

  • Core Language: Python 3.11.9
  • Dependency Management: Poetry
  • Local Inference: Ollama (serving llama3 and nomic-embed-text)
  • Agent Frameworks: LlamaIndex, LangChain, LangGraph
  • Vector Store: ChromaDB
  • CLI: Typer
  • Code Quality: Ruff (linter/formatter), MyPy (static type checker)
  • Observability: LangSmith (optional, via API key)

Key Scripts & Usage

All primary developer tasks are managed through a single, powerful command-line interface.

scripts/manage_rag_index.py

This script is the main entry point for managing and testing the RAG knowledge base. It uses Typer to provide a clean, documented CLI.

  • Activate the environment first: poetry shell

  • To ingest documents and build the knowledge base:

    python scripts/manage_rag_index.py ingest
    

    Run this command whenever you add or update documents in the docs/ folder.

  • To test the RAG pipeline with a query:

    python scripts/manage_rag_index.py query "What is the project's approach to agent safety?"
    

    This will retrieve relevant information from the knowledge base and generate an answer.
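
For contributors adding new commands, the script follows the general shape of a small Typer application. The sketch below illustrates that pattern using the two documented commands; everything beyond the command names is illustrative, and the real implementation lives in scripts/manage_rag_index.py.

# General-pattern sketch of a Typer CLI like manage_rag_index.py.
# The command names (ingest, query) are documented above; the bodies here are placeholders.
import typer

app = typer.Typer(help="Manage and test the WalkXR RAG knowledge base.")

@app.command()
def ingest() -> None:
    """Process documents in docs/ and (re)build the vector store."""
    ...  # hypothetical: delegate to the RetrievalEngine's ingestion logic

@app.command()
def query(question: str) -> None:
    """Run an end-to-end RAG query against the knowledge base."""
    ...  # hypothetical: delegate to the RetrievalEngine's query logic

if __name__ == "__main__":
    app()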

Building Your First Agent: The SmallTalkAgent

Our development philosophy is centered on creating modular, independent agents that can be orchestrated by the future WalkXR Emotional OS. Your first contribution should be building a simple SmallTalkAgent.

Here is the recommended workflow, designed with modularity in mind:

  1. Create the Agent File: Inside src/walkxr_ai/agents/, create a new file named small_talk_agent.py.
  2. Define the Agent Class: Create a SmallTalkAgent class. It should be initialized with a reference to the RetrievalEngine from our RAG system to ensure its responses are grounded in our knowledge base.
  3. Implement a generate_response Method: This method will take a user query as input. Internally, it should first call the RAG system's query method to get relevant context. Then, it will pass that context along with the original query to an LLM to generate a conversational, in-character response.
  4. Keep it Modular: Do not hardcode prompts or model names directly in the class. Instead, load them from a separate configuration file. This ensures that the agent's logic is decoupled from its personality, making it reusable and easier to test.
  5. Add a Test: Create a corresponding test file in the tests/agents/ directory to validate that your agent can be initialized and can generate a response.

By following this pattern, you create a self-contained, testable, and configurable agent: a perfect building block for our larger orchestration platform.
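
As a concrete illustration of steps 1 through 4, here is a minimal sketch. The file location and class name follow the workflow above; the RetrievalEngine method names, config fields, and use of PyYAML are assumptions made for illustration.

# src/walkxr_ai/agents/small_talk_agent.py -- illustrative sketch of the workflow above.
# Assumptions: RetrievalEngine.query() as the retrieval call, a YAML config with a
# system_prompt field, and PyYAML being available; confirm against the real codebase.
from pathlib import Path

import yaml

from walkxr_ai.rag.retrieval_engine import RetrievalEngine


class SmallTalkAgent:
    """A small, self-contained conversational agent grounded in the RAG knowledge base."""

    def __init__(self, retrieval_engine: RetrievalEngine, config_path: str | Path) -> None:
        self.retrieval_engine = retrieval_engine
        # Keep personality out of the code: prompts and model names live in a config file.
        self.config = yaml.safe_load(Path(config_path).read_text())

    def generate_response(self, user_query: str) -> str:
        # 1. Ground the reply: fetch relevant context from the knowledge base.
        context = self.retrieval_engine.query(user_query)  # hypothetical method name
        # 2. Combine the configured persona prompt, retrieved context, and user query.
        prompt = self.config["system_prompt"].format(context=context, query=user_query)
        # 3. Send the prompt to the configured LLM (e.g., llama3 via Ollama) and return the reply.
        return self._call_llm(prompt)

    def _call_llm(self, prompt: str) -> str:
        # Placeholder: wire this to the project's LLM client of choice.
        raise NotImplementedError

Because the persona prompt and model choice come from the config file rather than the class, the same agent logic can be reused for different characters and tested without touching its code.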


Development Workflow

Adherence to a consistent workflow ensures code quality and smooth collaboration.

Git & GitHub

  • Branching Strategy: We follow a Gitflow-like model. See CONTRIBUTING.md for details on main, develop, feature/*, fix/*, and chore/* branches.
  • Commit Messages: Follow conventional commit guidelines (e.g., feat(agent): add new capability). Details in CONTRIBUTING.md.
  • Pull Requests (PRs): Submit PRs to the develop branch for review. Ensure your PRs are linked to relevant issues. We use a Pull Request Template to standardize submissions; please fill it out when creating a PR.
  • Issue Tracking: We use GitHub Issues to track bugs and feature requests. Please use our provided templates to ensure all necessary information is included.
  • Code Reviews: At least one team member should review PRs before merging.
  • Fetching Updates: Regularly pull changes from the remote repository (git pull origin develop) to stay up-to-date.

Code Quality

  • Linting & Formatting (Ruff): pyproject.toml is configured to use Ruff. Run poetry run ruff check . and poetry run ruff format . before committing.
  • Static Typing (MyPy): pyproject.toml is configured for MyPy. Run poetry run mypy src to check types.
  • IDE Integration: Configure your IDE (e.g., VS Code) to use Ruff and MyPy for real-time feedback.

Testing

  • Unit Tests: Write unit tests for individual functions and classes in the tests/ directory.
  • Integration Tests: Test interactions between components (e.g., agent logic with memory systems).
  • Test-Driven Development (TDD): Consider TDD for critical components.
  • Continuous Integration (CI) (Future): Set up GitHub Actions to automatically run tests, linters, and type checkers on PRs.
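
As a concrete example of the unit-test expectation, a first test for the SmallTalkAgent sketched earlier could look like the following; since that agent interface is itself an assumption, adapt the test to the real one.

# tests/agents/test_small_talk_agent.py -- minimal sketch matching the hypothetical agent above.
from unittest.mock import MagicMock

from walkxr_ai.agents.small_talk_agent import SmallTalkAgent  # hypothetical module from the earlier sketch


def test_generate_response_uses_retrieved_context(tmp_path):
    # Fake the RAG engine so the test runs without Ollama or a vector store.
    fake_engine = MagicMock()
    fake_engine.query.return_value = "retrieved context"

    config = tmp_path / "agent_config.yaml"
    config.write_text('system_prompt: "Context: {context} | User: {query}"\n')

    agent = SmallTalkAgent(retrieval_engine=fake_engine, config_path=config)
    agent._call_llm = lambda prompt: f"LLM reply to: {prompt}"  # stub out the model call

    response = agent.generate_response("hello")

    fake_engine.query.assert_called_once_with("hello")
    assert "retrieved context" in response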

Project Roadmap & Milestones

Our development is guided by a phased approach, ensuring we build a robust foundation before scaling complexity. The detailed, epic-level plan is tracked in our official WalkXR_AI_Backlog.md.

Phase 1 (E01-E06): Foundational Systems & Core Architecture

  • Goal: To build the core infrastructure for creating stateful, knowledge-grounded, and orchestrated AI agents.
  • Key Activities:
    • Establishing a robust RAG pipeline for knowledge retrieval (RetrievalEngine).
    • Defining core agent architecture with state management and memory.
    • Implementing a multi-agent orchestrator using LangGraph.
    • Creating the first fine-tuned models based on simulation data.

Phase 2 (E07-E10): 'Small Moments' Walk v1.0

  • Goal: To build, test, and release the first complete, end-to-end multi-agent walk experience.
  • Key Activities:
    • Developing the full cohort of specialized agents for the 'Small Moments' walk (Narrative, Ritual, Play, Reflection, etc.).
    • Integrating all agents into the master orchestrator for a seamless user journey.
    • Conducting rigorous end-to-end simulation, adversarial testing, and performance validation.

Phase 3 (E11-E12): WalkXR OS Platform

  • Goal: To generalize the architecture into a scalable platform and introduce dynamic, continuous learning.
  • Key Activities:
    • Refactoring the system into a "Walk Factory" to enable rapid creation of new walks.
    • Evolving the orchestrator into a dynamic engine that personalizes the user journey in real-time.
    • Implementing reward models for continuous learning (RLAIF) and generative agents for novelty.
    • Integrating a hybrid memory system (vector + graph, e.g., Neo4j).
    • Developing robust evaluation metrics for the overall system and user outcomes (empathy, resilience, etc.).

Documentation Index

Key documents providing context, design rationale, and research foundations are located in the docs/ directory.

docs/architecture/

  • Tech_Architecture.md: Describes the overall technical architecture, technology choices, development lifecycle, and scaling considerations for the WalkXR AI system.
  • WalkXR_AI_PRD.md: The Product Requirements Document for WalkXR AI, outlining features, user stories, and success criteria for the AI components.

docs/internal/

  • The Verse Notion Page.md: An export or summary of The Verse's broader vision, projects, and operational principles, providing context for WalkXR.
  • WalkXR AI Team Design Document.md: Detailed design notes, discussions, and decisions made by the AI development team.
  • WalkXR Full Outline and Investor Outreach.txt: A comprehensive outline of the WalkXR project, potentially used for investor discussions and strategic planning.
  • WalkXR Simulation Docs.md: Documents related to the simulation methodology for testing WalkXR modules and generating data for AI training and design.
  • WalkXR Summary.md: A concise summary of the WalkXR project, its goals, and key features.

docs/research/

  • AI and EI.txt: A general exploration or collection of notes on the intersection of Artificial Intelligence and Emotional Intelligence.
  • Deeper Conversatios Article (Kardas, Kumar, Epley).md: Summary or full text of the research by Kardas, Kumar, & Epley on miscalibrated expectations in conversations, a core piece of research informing agent design.
  • EI LLM Tests Study.txt: Notes or summary of studies demonstrating LLM performance on standard Emotional Intelligence tests.
  • Emotional Intelligence in Artificial Agents.md: A document exploring the concepts and challenges of embedding emotional intelligence into artificial agents.
  • Emotional Intelligence in Artificial Intelligence.md: Similar to the above, likely discussing the broader field of EI within AI.

Contributing

We welcome contributions! Please see our CONTRIBUTING.md for detailed guidelines on our development workflow (including Git branching, issue/PR templates, commit conventions), coding standards (PEP8, Ruff, MyPy), and testing procedures.


Contact & Support

For questions, issues, or discussions related to the WalkXR AI project, please contact the project lead, Roman Di Domizio, via Discord.


This README is a living document and will be updated as the project evolves.