# WalkXR AI: Emotionally Intelligent Agentic Systems

## Table of Contents

1. [Project Overview](#project-overview)
   * [Mission](#mission)
   * [Vision](#vision)
2. [Repository Structure](#repository-structure)
3. [Getting Started: Environment Setup](#getting-started-environment-setup)
   * [Prerequisites](#prerequisites)
   * [1. Clone the Repository](#1-clone-the-repository)
   * [2. Python Version Management (`pyenv`)](#2-python-version-management-pyenv)
   * [3. Dependency Management (`Poetry`)](#3-dependency-management-poetry)
   * [4. Environment Variables](#4-environment-variables)
   * [5. Local LLM Server (`Ollama`)](#5-local-llm-server-ollama)
   * [6. Verify Setup](#6-verify-setup)
4. [Architecture Overview](#architecture-overview)
   * [Core Principles](#core-principles)
   * [The Four Development Tracks](#the-four-development-tracks)
   * [RAG System Architecture](#rag-system-architecture)
5. [Technology Stack](#technology-stack)
6. [Key Scripts & Usage](#key-scripts--usage)
   * [`scripts/manage_rag_index.py`](#scriptsmanage_rag_indexpy)
7. [Agent Design Philosophy](#agent-design-philosophy)
8. [Development Workflow](#development-workflow)
9. [Project Roadmap & Milestones](#project-roadmap--milestones)
10. [Documentation Index](#documentation-index)
11. [Contributing](#contributing)
12. [Contact & Support](#contact--support)

---

## Project Overview

WalkXR AI is dedicated to creating emotionally intelligent agentic systems that enhance human connection, self-awareness, and empathetic understanding. These agents are designed to power immersive and interactive experiences within the WalkXR platform, a broader initiative by The Verse to build games, experiences, and rituals that uplift humanity.

### Mission

To develop and deploy sophisticated AI agents that can perceive, understand, and respond to human emotion with nuance and integrity, fostering psychologically safe and transformative interactions.

### Vision

A future where AI companions and roleplay agents act as catalysts for personal growth, facilitating deeper self-reflection, co-regulation, and social connection. The long-term vision is the WalkXR Emotional OS, a comprehensive orchestration engine that dynamically adapts experiences based on user emotional states and therapeutic goals.

---

## Repository Structure

This repository contains the core AI development for WalkXR agents, including knowledge base management, agent logic, and supporting infrastructure.

```
WalkXR-AI/
├── .env.template
├── .github/                # GitHub action workflows and templates
├── .gitignore
├── .python-version         # Set by pyenv to lock Python version
├── README.md
├── docs/
│   ├── architecture/       # Core technical and product architecture documents
│   └── research/           # Foundational research and papers
├── notebooks/              # Jupyter notebooks for experimentation
├── pyproject.toml          # Defines project dependencies and tool configurations
├── scripts/
│   └── manage_rag_index.py # CLI tool for managing the RAG knowledge base
├── src/
│   └── walkxr_ai/
│       ├── __init__.py
│       ├── agents/         # Agent implementations (future work)
│       └── rag/            # Core RAG system components
│           ├── __init__.py
│           ├── chunking_strategy.md  # Documentation on chunking approaches
│           ├── rag_config.yaml       # Configuration file for the RAG system
│           └── retrieval_engine.py   # Main class for RAG pipeline logic
└── vector_store/           # Default local storage for ChromaDB
```

**Repository Workflow: From Idea to Agent**

This repository is structured to support a clear development workflow, from initial design and research to production-ready agents.
* **`docs/` (Design & Research):** This is where ideas begin. Foundational research, product requirements, and technical architecture documents live here. Before writing code, consult the docs to understand the "why" behind an agent or feature.
* **`notebooks/` (Prototyping & Experimentation):** Use Jupyter notebooks in this directory to rapidly prototype new concepts, test LLM prompts, and experiment with libraries like LangChain or LlamaIndex before formalizing them into the main codebase.
* **`src/walkxr_ai/` (Core Development):** This is the heart of the project, containing all production Python code.
  * **`rag/`**: The foundational knowledge system that all agents will use.
  * **`core/`**: Core components like state management, safety layers, and memory systems will live here.
  * **`agents/`**: Where modular, reusable agents are built. Each agent should be self-contained and designed for future orchestration.
  * **`simulation/`**: Contains tools and schemas for simulating user interactions to test agent responses.
* **`tests/` (Validation & Quality):** All Pytest unit and integration tests go here. Every new feature or agent added to `src/` should be accompanied by corresponding tests to ensure reliability.
* **`scripts/` (Management & Operations):** Contains high-level CLI tools for managing the system, such as the `manage_rag_index.py` script for interacting with the RAG knowledge base.

---

## Getting Started: Environment Setup

Follow these instructions carefully to create a consistent development environment.

### Prerequisites

* **Operating System:** macOS, Linux, or Windows with WSL2.
* **Git:** Ensure Git is installed.
* **`pyenv` and `poetry`**: See steps below.

### 1. Clone the Repository

```bash
cd path/to/your/development/folder
git clone https://github.com/VerseBuilding/WalkXR-AI.git
cd WalkXR-AI
```

### 2. Python Version Management (`pyenv`)

We use `pyenv` to lock our Python version to `3.11.9`.

```bash
# Install pyenv (macOS example)
brew install pyenv
# Follow shell setup instructions from pyenv

# Install and set the project's Python version
pyenv install 3.11.9
pyenv local 3.11.9

# Verify the version is active
python --version
# Expected output: Python 3.11.9
```

### 3. Dependency Management (`Poetry`)

We use [Poetry](https://python-poetry.org/) for managing project dependencies and virtual environments.

* **Install Poetry:** Follow the [official installation guide](https://python-poetry.org/docs/#installation).
* **Install Dependencies:** The `poetry.lock` file, which ensures exact dependency versions, is specific to the operating system and Python version it was created on. To avoid conflicts, **it is not committed to the repository**; you will generate your own local lock file. Run the following command to install the dependencies and create your `poetry.lock`:

```bash
poetry install
```

This command reads `pyproject.toml`, resolves the dependencies, installs them into a virtual environment, and generates a `poetry.lock` file tailored to your system.

### 4. Environment Variables

Create a `.env` file for API keys and other secrets.

```bash
cp .env.template .env
```

Open the `.env` file and add any necessary keys (e.g., `LANGCHAIN_API_KEY` for LangSmith tracing).

### 5. Local LLM Server (`Ollama`)

For local development, we use Ollama to serve LLMs and embedding models.

* **Install Ollama:** Download from the [official Ollama website](https://ollama.com/).
* **Pull Required Models:**

```bash
ollama pull llama3            # Primary LLM for generation tasks
ollama pull nomic-embed-text  # For text embeddings (RAG)
```

### 6. Verify Setup

After completing all steps, verify that the RAG system is working.

* **Activate the virtual environment:**

```bash
poetry shell
```

* **1. Build the Knowledge Base:** The `ingest` command processes documents in the `docs/` directory and creates a vector store in `vector_store/walkxr_knowledge_base`.

```bash
python scripts/manage_rag_index.py ingest
```

* **2. Run a Test Query:** The `query` command tests the end-to-end RAG pipeline.

```bash
python scripts/manage_rag_index.py query "What is the agent design philosophy?"
```

If both commands run without errors, your development environment is correctly set up.

---

## Architecture Overview

WalkXR AI is designed with modularity, scalability, and ethical considerations at its core.

### Core Principles

* **Modularity:** Agents and system components are designed to be independent, testable, and composable.
* **Simulation-Led Development:** Agent behaviors are tested in simulation before deployment.
* **Therapeutic Guardrails:** Agents never diagnose or judge; they co-regulate and offer reflection.
* **Interoperability:** Support for multiple LLM providers and easy integration with various frontends.

### The Four Development Tracks

Our development process is organized into four parallel, interconnected tracks:

1. **Track 1: EI Design & Evaluation**: Defines the science behind our agents and builds the frameworks to evaluate their emotional intelligence and effectiveness.
2. **Track 2: Simulation & Data**: Generates high-quality training and testing data by simulating diverse human personas and their emotional reactions.
3. **Track 3: Agents & Memory**: Builds sophisticated, stateful agents using frameworks like LangGraph and develops the hybrid memory systems that allow them to learn.
4. **Track 4: Full-Stack & Infrastructure**: Builds the production-grade APIs, internal tools, and scalable cloud infrastructure needed to serve our agents reliably.

### RAG System Architecture

Our Retrieval-Augmented Generation (RAG) system grounds our agents in the project's specific knowledge, ensuring their responses are relevant, accurate, and aligned with our design philosophy. The entire system is orchestrated by the `RetrievalEngine` class (`src/walkxr_ai/rag/retrieval_engine.py`).

```mermaid
graph TD
    subgraph Ingestion Phase
        A[Docs Folder] --> B(RetrievalEngine);
        C[rag_config.yaml] -- defines chunking --> B;
        B -- uses nomic-embed-text --> D[Embeddings];
        D --> E[ChromaDB Vector Store];
    end
    subgraph Query Phase
        F[User Query] --> G(RetrievalEngine);
        G -- queries --> E;
        E -- returns relevant chunks --> G;
        G -- combines query + context --> H{LLM Prompt};
        H -- sent to llama3 --> I[Synthesized Answer];
    end
```

**Key Components:**

* **`rag_config.yaml`**: The central configuration file. It defines all parameters for the RAG pipeline, including paths to documents, the vector store location, the Ollama models to use for embeddings and generation, and the chunking strategy.
* **Ingestion (`ingest` command)**: The `RetrievalEngine` reads documents from the source directory specified in the config. It applies the chosen chunking strategy (see `chunking_strategy.md`), generates vector embeddings for each chunk using `nomic-embed-text`, and stores them in the local `ChromaDB` vector store.
* **Querying (`query` command)**: When a user submits a query, the `RetrievalEngine` first embeds the query text. It then searches the `ChromaDB` vector store to find the most semantically similar document chunks. Finally, it combines the original query with the retrieved context into a single prompt and sends it to a powerful LLM (`llama3`) to generate a comprehensive, context-aware answer.
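To make the flow above concrete, here is a minimal sketch of driving the `RetrievalEngine` directly from Python. It assumes the engine is constructed from `rag_config.yaml` and exposes `ingest` and `query` methods matching the description above; the real constructor and method signatures may differ, and `scripts/manage_rag_index.py` remains the supported entry point.

```python
# Illustrative sketch only: the constructor argument and method signatures shown
# here are assumptions; check retrieval_engine.py and rag_config.yaml for the
# actual API.
from walkxr_ai.rag.retrieval_engine import RetrievalEngine

# The engine reads document paths, the vector store location, Ollama model
# names, and the chunking strategy from the RAG config file.
engine = RetrievalEngine(config_path="src/walkxr_ai/rag/rag_config.yaml")

# Ingestion phase: chunk the docs, embed each chunk with nomic-embed-text,
# and persist the vectors to the local ChromaDB store.
engine.ingest()

# Query phase: embed the question, retrieve the most similar chunks, and have
# llama3 synthesize an answer grounded in that retrieved context.
answer = engine.query("What is the agent design philosophy?")
print(answer)
```

Because all tunable parameters live in `rag_config.yaml`, swapping models or changing the chunking behavior should not require touching calling code like this.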
---

## Technology Stack

This project leverages a range of modern technologies for AI development.

* **Core Language:** Python 3.11.9
* **Dependency Management:** Poetry
* **Local Inference:** Ollama (serving `llama3` and `nomic-embed-text`)
* **Agent Frameworks:** LlamaIndex, LangChain, LangGraph
* **Vector Store:** ChromaDB
* **CLI:** Typer
* **Code Quality:** Ruff (linter/formatter), MyPy (static type checker)
* **Observability:** LangSmith (optional, via API key)

---

## Key Scripts & Usage

All primary developer tasks are managed through a single, powerful command-line interface.

### `scripts/manage_rag_index.py`

This script is the main entry point for managing and testing the RAG knowledge base. It uses `Typer` to provide a clean, documented CLI.

* **Activate the environment first:** `poetry shell`
* **To ingest documents and build the knowledge base:**

```bash
python scripts/manage_rag_index.py ingest
```

Run this command whenever you add or update documents in the `docs/` folder.

* **To test the RAG pipeline with a query:**

```bash
python scripts/manage_rag_index.py query "What is the project's approach to agent safety?"
```

This will retrieve relevant information from the knowledge base and generate an answer.

### `scripts/test_query.py`

```bash
poetry run python scripts/test_query.py
```

Use this script to verify that the RAG system is functioning correctly after setup or changes to the ingestion process or models.

### Building Your First Agent: The `SmallTalkAgent`

Our development philosophy is centered on creating modular, independent agents that can be orchestrated by the future WalkXR Emotional OS. Your first contribution should be building a simple `SmallTalkAgent`. Here is the recommended workflow, designed with modularity in mind (a code sketch follows the list):

1. **Create the Agent File:** Inside `src/walkxr_ai/agents/`, create a new file named `small_talk_agent.py`.
2. **Define the Agent Class:** Create a `SmallTalkAgent` class. It should be initialized with a reference to the `RetrievalEngine` from our RAG system to ensure its responses are grounded in our knowledge base.
3. **Implement a `generate_response` Method:** This method will take a user query as input. Internally, it should first call the RAG system's `query` method to get relevant context. Then, it will pass that context along with the original query to an LLM to generate a conversational, in-character response.
4. **Keep it Modular:** Do not hardcode prompts or model names directly in the class. Instead, load them from a separate configuration file. This ensures that the agent's logic is decoupled from its personality, making it reusable and easier to test.
5. **Add a Test:** Create a corresponding test file in the `tests/agents/` directory to validate that your agent can be initialized and can generate a response.

By following this pattern, you create a self-contained, testable, and configurable agent—a perfect building block for our larger orchestration platform.
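For orientation, here is one possible shape for `small_talk_agent.py`, written as a hedged sketch rather than a reference implementation: the class and method names follow the five steps above, while the YAML config format (via PyYAML), the `RetrievalEngine.query` signature, and the placeholder LLM call are assumptions to refine as you build.

```python
# Hypothetical sketch of src/walkxr_ai/agents/small_talk_agent.py.
# Names follow the workflow above; the real implementation may differ
# (e.g., in how the LLM client is created and injected).
from pathlib import Path

import yaml  # PyYAML, assumed to be available for config loading

from walkxr_ai.rag.retrieval_engine import RetrievalEngine


class SmallTalkAgent:
    """A minimal, knowledge-grounded conversational agent."""

    def __init__(self, retrieval_engine: RetrievalEngine, config_path: str | Path):
        # Hold a reference to the shared RAG engine rather than building one,
        # keeping the agent composable for future orchestration.
        self.retrieval_engine = retrieval_engine
        # Prompts and model names live in a config file, not in code, so the
        # agent's personality stays decoupled from its logic.
        self.config = yaml.safe_load(Path(config_path).read_text())

    def generate_response(self, user_query: str) -> str:
        # 1. Ground the reply: ask the RAG system for relevant context.
        context = self.retrieval_engine.query(user_query)
        # 2. Combine the configured persona prompt, the context, and the query.
        prompt = (
            f"{self.config['system_prompt']}\n\n"
            f"Context:\n{context}\n\n"
            f"User: {user_query}"
        )
        # 3. Send the prompt to the configured LLM (e.g., llama3 via Ollama)
        #    and return its reply.
        return self._call_llm(prompt)

    def _call_llm(self, prompt: str) -> str:
        # Placeholder for the model call (e.g., LangChain's ChatOllama or the
        # ollama Python client), selected via self.config["model"].
        raise NotImplementedError
```

Keeping the model call behind a small private helper also makes the agent easy to stub out in tests, which step 5 asks for.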
---

## Development Workflow

Adherence to a consistent workflow ensures code quality and smooth collaboration.

### Git & GitHub

- **Branching Strategy:** We follow a Gitflow-like model. See [CONTRIBUTING.md](CONTRIBUTING.md) for details on `main`, `develop`, `feature/*`, `fix/*`, and `chore/*` branches.
- **Commit Messages:** Follow conventional commit guidelines (e.g., `feat(agent): add new capability`). Details in [CONTRIBUTING.md](CONTRIBUTING.md).
- **Pull Requests (PRs):** Submit PRs to the `develop` branch for review. Ensure your PRs are linked to relevant issues. We use a [**Pull Request Template**](./.github/PULL_REQUEST_TEMPLATE.md) to standardize submissions; please fill it out when creating a PR.
- **Issue Tracking:** We use GitHub Issues to track bugs and feature requests. Please use our provided templates to ensure all necessary information is included:
  - [**Bug Report Template**](./.github/ISSUE_TEMPLATE/bug_report.md)
  - [**Feature Request Template**](./.github/ISSUE_TEMPLATE/feature_request.md)
- **Code Reviews:** At least one team member should review PRs before merging.
- **Fetching Updates:** Regularly pull changes from the remote repository (`git pull origin develop`) to stay up-to-date.

### Code Quality

* **Linting & Formatting (Ruff):** `pyproject.toml` is configured to use Ruff. Run `poetry run ruff check .` and `poetry run ruff format .` before committing.
* **Static Typing (MyPy):** `pyproject.toml` is configured for MyPy. Run `poetry run mypy src` to check types.
* **IDE Integration:** Configure your IDE (e.g., VS Code) to use Ruff and MyPy for real-time feedback.

### Testing

* **Unit Tests:** Write unit tests for individual functions and classes in the `tests/` directory (see the sketch after this list).
* **Integration Tests:** Test interactions between components (e.g., agent logic with memory systems).
* **Test-Driven Development (TDD):** Consider TDD for critical components.
* **Continuous Integration (CI) (Future):** Set up GitHub Actions to automatically run tests, linters, and type checkers on PRs.
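As one concrete instance of the unit-test pattern above, a first test for the `SmallTalkAgent` sketched earlier could stub the RAG engine and the model call so it runs offline; the module path, config keys, and fixture names here are assumptions that should track the real implementation.

```python
# Hypothetical tests/agents/test_small_talk_agent.py.
# Uses stubs in place of the real RetrievalEngine and LLM so the test runs
# without Ollama or a built vector store.
import pytest

from walkxr_ai.agents.small_talk_agent import SmallTalkAgent


class StubRetrievalEngine:
    """Returns canned context instead of querying ChromaDB."""

    def query(self, text: str) -> str:
        return "WalkXR agents co-regulate and offer reflection."


@pytest.fixture
def agent(tmp_path):
    # Write a minimal agent config so no real config file is required.
    config = tmp_path / "small_talk_config.yaml"
    config.write_text("system_prompt: You are a friendly WalkXR companion.\nmodel: llama3\n")
    return SmallTalkAgent(retrieval_engine=StubRetrievalEngine(), config_path=config)


def test_generate_response_returns_text(agent, monkeypatch):
    # Stub the LLM call so the test stays deterministic and offline.
    monkeypatch.setattr(agent, "_call_llm", lambda prompt: "Hello! How is your day going?")
    response = agent.generate_response("Hi there!")
    assert isinstance(response, str) and response.strip()
```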
---

## Project Roadmap & Milestones

Our development is guided by a phased approach, ensuring we build a robust foundation before scaling complexity. The detailed, epic-level plan is tracked in our official `WalkXR_AI_Backlog.md`.

### Phase 1 (E01-E06): Foundational Systems & Core Architecture

* **Goal**: To build the core infrastructure for creating stateful, knowledge-grounded, and orchestrated AI agents.
* **Key Activities**:
  * Establishing a robust RAG pipeline for knowledge retrieval (`RetrievalEngine`).
  * Defining core agent architecture with state management and memory.
  * Implementing a multi-agent orchestrator using LangGraph.
  * Creating the first fine-tuned models based on simulation data.

### Phase 2 (E07-E10): 'Small Moments' Walk v1.0

* **Goal**: To build, test, and release the first complete, end-to-end multi-agent walk experience.
* **Key Activities**:
  * Developing the full cohort of specialized agents for the 'Small Moments' walk (Narrative, Ritual, Play, Reflection, etc.).
  * Integrating all agents into the master orchestrator for a seamless user journey.
  * Conducting rigorous end-to-end simulation, adversarial testing, and performance validation.

### Phase 3 (E11-E12): WalkXR OS Platform

* **Goal**: To generalize the architecture into a scalable platform and introduce dynamic, continuous learning.
* **Key Activities**:
  * Refactoring the system into a "Walk Factory" to enable rapid creation of new walks.
  * Evolving the orchestrator into a dynamic engine that personalizes the user journey in real time.
  * Implementing reward models for continuous learning (RLAIF) and generative agents for novelty.
  * Integrating a hybrid memory system (vector + graph, e.g., Neo4j).
  * Developing robust evaluation metrics for the overall system and user outcomes (empathy, resilience, etc.).

---

## Documentation Index

Key documents providing context, design rationale, and research foundations are located in the `docs/` directory.

### `docs/architecture/`

* **`Tech_Architecture.md`**: Describes the overall technical architecture, technology choices, development lifecycle, and scaling considerations for the WalkXR AI system.
* **`WalkXR_AI_PRD.md`**: The Product Requirements Document for WalkXR AI, outlining features, user stories, and success criteria for the AI components.

### `docs/internal/`

* **`The Verse Notion Page.md`**: An export or summary of The Verse's broader vision, projects, and operational principles, providing context for WalkXR.
* **`WalkXR AI Team Design Document.md`**: Detailed design notes, discussions, and decisions made by the AI development team.
* **`WalkXR Full Outline and Investor Outreach.txt`**: A comprehensive outline of the WalkXR project, potentially used for investor discussions and strategic planning.
* **`WalkXR Simulation Docs.md`**: Documents related to the simulation methodology for testing WalkXR modules and generating data for AI training and design.
* **`WalkXR Summary.md`**: A concise summary of the WalkXR project, its goals, and key features.

### `docs/research/`

* **`AI and EI.txt`**: A general exploration or collection of notes on the intersection of Artificial Intelligence and Emotional Intelligence.
* **`Deeper Conversatios Article (Kardas, Kumar, Epley).md`**: Summary or full text of the research by Kardas, Kumar, & Epley on miscalibrated expectations in conversations, a core piece of research informing agent design.
* **`EI LLM Tests Study.txt`**: Notes or summary of studies demonstrating LLM performance on standard Emotional Intelligence tests.
* **`Emotional Intelligence in Artificial Agents.md`**: A document exploring the concepts and challenges of embedding emotional intelligence into artificial agents.
* **`Emotional Intelligence in Artificial Intelligence.md`**: Similar to the above, likely discussing the broader field of EI within AI.

---

## Contributing

We welcome contributions! Please see our [**CONTRIBUTING.md**](CONTRIBUTING.md) for detailed guidelines on our development workflow (including Git branching, issue/PR templates, commit conventions), coding standards (PEP8, Ruff, MyPy), and testing procedures.

---

## Contact & Support

For questions, issues, or discussions related to the WalkXR AI project, please contact the project lead, Roman Di Domizio, via Discord.

---

*This README is a living document and will be updated as the project evolves.*