mirror of https://github.com/saymrwulf/alpha-arena.git synced 2026-05-14 20:37:51 +00:00

oho 435e09f732 docs: update TESTING.md with new test files and integration tests

- Add test_integration.py, test_debate.py, test_signals.py, test_llm_providers.py to structure
- Add Integration Tests section documenting multi-agent testing
- Update test count to 328+

2026-01-13 14:27:45 +01:00

9.1 KiB


Alpha Arena - Testing Guide

Complete guide to the test suite and testing practices.

I want to...	Go to
Run all tests	Quick Reference
Run specific test suites	Test Categories
Understand test fixtures	Test Fixtures
Write new tests	Writing New Tests
Mock external services	Mocking External Services
Fix test issues	Troubleshooting Tests

Quick Reference

# Run all tests (two ways)
./alpha test
./scripts/test.sh

# Run specific suites
./scripts/test.sh unit       # Unit tests only
./scripts/test.sh api        # API tests only
./scripts/test.sh e2e        # End-to-end tests
./scripts/test.sh fast       # Exclude slow tests
./scripts/test.sh coverage   # With coverage report

Test Structure

tests/
├── conftest.py          # Shared fixtures and configuration
├── test_api.py          # API endpoint tests
├── test_e2e.py          # End-to-end functional tests
├── test_integration.py  # Multi-agent integration tests (NEW)
├── test_debate.py       # Debate system tests (NEW)
├── test_signals.py      # Signal aggregation tests (NEW)
├── test_llm_providers.py  # LLM provider tests (NEW)
├── test_risk.py         # Risk management unit tests
├── test_pnl.py          # PnL accounting unit tests
├── test_indicators.py   # Technical indicator tests
├── test_memory.py       # Memory system tests
├── test_backtest.py     # Backtesting tests
└── test_core_types.py   # Core type tests

Test count: 328+ tests

Test Categories

Unit Tests

Test individual functions and classes in isolation.

./scripts/test.sh unit

What's tested:

Risk validation logic (test_risk.py)
PnL calculations (test_pnl.py)
Technical indicators (test_indicators.py)
Core data types (test_core_types.py)

Example:

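from decimal import Decimal


def test_position_pnl():
    """Test position-level PnL calculation."""
    position = Position(
        size=Decimal("100"),
        avg_entry_price=Decimal("0.50"),
        current_price=Decimal("0.60"),
    )
    # (current_price - avg_entry_price) * size = (0.60 - 0.50) * 100 = $10
    # Built from Decimal strings so the expected value has no float rounding error
    expected_pnl = (Decimal("0.60") - Decimal("0.50")) * Decimal("100")
    assert position.unrealized_pnl == expected_pnl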
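The arithmetic this test relies on can be sketched with a stand-in class. The Position dataclass below is hypothetical (the project's real class is not shown in this guide); it only assumes that unrealized_pnl is the price move per unit multiplied by the position size, computed in Decimal.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Position:
    """Hypothetical stand-in for the project's Position type."""

    size: Decimal
    avg_entry_price: Decimal
    current_price: Decimal

    @property
    def unrealized_pnl(self) -> Decimal:
        # Assumed formula: price move per unit times number of units held.
        return (self.current_price - self.avg_entry_price) * self.size


position = Position(
    size=Decimal("100"),
    avg_entry_price=Decimal("0.50"),
    current_price=Decimal("0.60"),
)
pnl = position.unrealized_pnl
```

Constructing every value from a Decimal string keeps the comparison exact; mixing in float literals would reintroduce binary rounding error.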

API Tests

Test all REST API endpoints for correct behavior.

./scripts/test.sh api

What's tested:

Health endpoints (/api/system/health)
Trading control (/api/trading/start, /api/trading/stop)
Position management (/api/positions)
Market browsing (/api/markets)
Configuration (/api/config)
All page rendering (/, /trading, etc.)
Error handling (404, 422, etc.)

Example:

@pytest.mark.asyncio
async def test_health_check(async_client):
    """GET /api/system/health should return healthy status."""
    response = await async_client.get("/api/system/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

Integration Tests

Test multi-component integration with mocked LLM providers.

pytest -m "integration"

What's tested:

Full agent coordination (Research → Risk → Debate → Execution)
Multi-agent debate with multiple personas
Confidence calibration tracking
Signal aggregation from multiple sources
Event calendar impact on trading
End-to-end trading cycle with mocked broker
Risk rejection scenarios
LLM failure handling
Performance metrics tracking

Key fixture: MockLLMProvider - Returns structured JSON responses matching the real LLMResponse interface.
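A minimal sketch of what such a mock might look like. Both classes here are assumptions for illustration: the field names on LLMResponse, the complete method, and the JSON payload keys are hypothetical and may differ from the project's real interface.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class LLMResponse:
    """Hypothetical stand-in for the real LLMResponse interface."""

    content: str  # raw text returned by the model (JSON in this sketch)
    model: str = "mock"
    tokens_used: int = 0


class MockLLMProvider:
    """Returns canned, structured JSON instead of calling a real LLM API."""

    def __init__(self, payload: dict):
        self.payload = payload
        self.calls: list[str] = []  # prompts seen, for assertions in tests

    async def complete(self, prompt: str) -> LLMResponse:
        self.calls.append(prompt)
        return LLMResponse(content=json.dumps(self.payload))


async def demo() -> dict:
    provider = MockLLMProvider({"action": "BUY", "confidence": 0.7})
    response = await provider.complete("Analyze market")
    # Callers parse the JSON the same way they would parse a real response.
    return json.loads(response.content)


decision = asyncio.run(demo())
```

Because the mock returns the same response type as a real provider would, the agents under test need no special-casing for test runs.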

End-to-End Tests

Test complete user workflows via API.

./scripts/test.sh e2e

What's tested:

Trading workflow: start → check → stop
Kill switch emergency stop
Position management workflow
Market browsing and filtering
Wallet analysis workflow
Configuration updates
Navigation between pages
Error recovery

Example:

@pytest.mark.asyncio
async def test_simulation_trading_cycle(async_client):
    """Complete workflow: Check status -> Start -> Stop."""
    # Step 1: Check initial status
    response = await async_client.get("/api/trading/status")
    assert response.status_code == 200

    # Step 2: Start trading
    response = await async_client.post(
        "/api/trading/start",
        json={"mode": "simulation"}
    )
    assert response.status_code == 200

    # Step 3: Stop trading
    response = await async_client.post("/api/trading/stop")
    assert response.status_code == 200

Test Fixtures

Common fixtures defined in conftest.py:

Fixture	Description
`temp_dir`	Temporary directory for test artifacts
`test_config`	Test configuration (no real API keys)
`risk_config`	Risk configuration for tests
`sample_position`	Example position data
`sample_signal`	Example trade signal
`mock_broker`	Mock broker (no real trades)
`mock_llm`	Mock LLM client (no real API calls)
`async_client`	HTTP client for API testing
`metrics_logger`	Logger with temporary storage

Running Tests

All Tests

./scripts/test.sh

Verbose Output

./scripts/test.sh -v

Stop on First Failure

./scripts/test.sh -x

With Coverage

./scripts/test.sh coverage

# View HTML report
open htmlcov/index.html

Specific Test File

source .venv/bin/activate
pytest tests/test_risk.py -v

Specific Test Function

source .venv/bin/activate
pytest tests/test_risk.py::TestRiskChecks::test_valid_signal_passes -v

Test Markers

Tests can be filtered by markers:

# Exclude slow tests
pytest -m "not slow"

# Run only API tests
pytest -m "api"

# Run only E2E tests
pytest -m "e2e"

# Run only integration tests
pytest -m "integration"
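Custom markers should be registered, otherwise pytest emits an unknown-mark warning for each one. One way to do this is the pytest_configure hook in conftest.py, sketched below; whether this project registers its markers this way or in a config file is not stated here, and the marker descriptions are assumptions.

```python
# conftest.py (sketch): register the custom markers used in this guide.
MARKERS = {
    "slow": "tests that take noticeably longer to run",
    "api": "REST API endpoint tests",
    "e2e": "end-to-end workflow tests",
    "integration": "multi-agent integration tests",
}


def pytest_configure(config):
    """pytest hook: called once at startup with the active config object."""
    for name, description in MARKERS.items():
        # Each line has the form "name: description", as in the ini "markers" option.
        config.addinivalue_line("markers", f"{name}: {description}")
```

Once registered, the markers also appear in the output of pytest --markers.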

Writing New Tests

Unit Test Template

"""Tests for [module name]."""

import pytest
from decimal import Decimal

from src.module import MyClass


class TestMyClass:
    """Test MyClass functionality."""

    def test_basic_operation(self):
        """Should perform basic operation correctly."""
        obj = MyClass()
        result = obj.operation()
        assert result == expected_value

    @pytest.mark.asyncio
    async def test_async_operation(self):
        """Should handle async operations."""
        obj = MyClass()
        result = await obj.async_operation()
        assert result is not None

API Test Template

"""API tests for [endpoint group]."""

import pytest

pytestmark = pytest.mark.api


class TestMyEndpoint:
    """Test /api/my-endpoint."""

    @pytest.mark.asyncio
    async def test_get_success(self, async_client):
        """GET should return success."""
        response = await async_client.get("/api/my-endpoint")
        assert response.status_code == 200
        data = response.json()
        assert "expected_field" in data

    @pytest.mark.asyncio
    async def test_post_validation(self, async_client):
        """POST with invalid data should return 422."""
        response = await async_client.post(
            "/api/my-endpoint",
            json={"invalid": "data"}
        )
        assert response.status_code == 422

E2E Test Template

"""E2E tests for [workflow]."""

import pytest

pytestmark = [pytest.mark.e2e, pytest.mark.slow]


class TestMyWorkflow:
    """Test complete [workflow name] workflow."""

    @pytest.mark.asyncio
    async def test_complete_workflow(self, async_client):
        """
        Complete workflow: Step 1 -> Step 2 -> Step 3.
        """
        # Step 1
        response = await async_client.get("/api/step1")
        assert response.status_code == 200

        # Step 2
        response = await async_client.post("/api/step2", json={...})
        assert response.status_code == 200

        # Step 3
        response = await async_client.get("/api/step3")
        assert response.status_code == 200

Mocking External Services

Mock Broker

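from decimal import Decimal

import pytest


@pytest.mark.asyncio
async def test_with_mock_broker(mock_broker, sample_position):
    """Test using mock broker."""
    # Configure the mock's state before exercising the code under test
    mock_broker.balance = Decimal("500")
    mock_broker.positions = [sample_position]

    # Your test code
    balance = await mock_broker.get_balance()
    assert balance == Decimal("500")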
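For orientation, a mock broker of this kind might look roughly like the sketch below. It is an assumption, not the project's implementation: only balance, positions, and get_balance appear in this guide, while get_positions, place_order, and the default balance are hypothetical additions.

```python
import asyncio
from decimal import Decimal


class MockBroker:
    """Hypothetical in-memory broker: holds state, never places real trades."""

    def __init__(self, balance: Decimal = Decimal("1000")):
        self.balance = balance
        self.positions: list = []
        self.orders: list[dict] = []  # record of orders "placed" during a test

    async def get_balance(self) -> Decimal:
        return self.balance

    async def get_positions(self) -> list:
        return list(self.positions)

    async def place_order(self, market_id: str, side: str, size: Decimal) -> dict:
        # Record the order instead of sending it anywhere.
        order = {"market_id": market_id, "side": side, "size": size}
        self.orders.append(order)
        return order


async def demo() -> Decimal:
    broker = MockBroker()
    broker.balance = Decimal("500")  # tests can override state directly
    return await broker.get_balance()


balance = asyncio.run(demo())
```

Keeping the methods async mirrors how a real broker client would be awaited, so the code under test runs unchanged against the mock.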

Mock LLM

import pytest


@pytest.mark.asyncio
async def test_with_mock_llm(mock_llm):
    """Test using mock LLM."""
    # Queue the canned response the mock should return
    mock_llm.responses = ["Buy YES at 0.55"]

    # Your test code
    response = await mock_llm.complete("Analyze market")
    assert "Buy" in response
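Where a dedicated fixture is not available, the standard library's unittest.mock.AsyncMock can play the same role. The sketch below is an alternative technique, not the project's mock_llm fixture; it reuses the complete method name from the example above, and the second response is made up for illustration.

```python
import asyncio
from unittest.mock import AsyncMock

# Stand-in LLM client; attribute access creates awaitable child mocks.
mock_llm = AsyncMock()
# A list side_effect returns one item per call, in order.
mock_llm.complete.side_effect = ["Buy YES at 0.55", "Hold"]


async def demo() -> list[str]:
    first = await mock_llm.complete("Analyze market")
    second = await mock_llm.complete("Re-check market")
    return [first, second]


answers = asyncio.run(demo())
# AsyncMock records awaits, so a test can verify the prompts that were sent.
mock_llm.complete.assert_awaited_with("Re-check market")
```

This trades the readability of a purpose-built mock for zero setup, which suits one-off tests.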

Continuous Integration

Tests run automatically on:

Every push to main branch
Every pull request
Nightly scheduled runs

Required Checks

All tests pass
Coverage above 70%
No security vulnerabilities

Troubleshooting Tests

"Module not found"

# Ensure venv is activated
source .venv/bin/activate

"Async test timeout"

# Increase timeout for slow tests (the timeout marker is provided by the pytest-timeout plugin)
@pytest.mark.asyncio
@pytest.mark.timeout(60)
async def test_slow_operation():
    ...
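A test-level timeout can be complemented by bounding a single await with asyncio.wait_for, so that one hung call fails quickly and points at the culprit. In this sketch slow_operation and run_with_deadline are hypothetical names, and the delays are arbitrary.

```python
import asyncio


async def slow_operation(delay: float) -> str:
    """Hypothetical coroutine standing in for a slow call under test."""
    await asyncio.sleep(delay)
    return "done"


async def run_with_deadline(delay: float, timeout: float) -> str:
    try:
        # wait_for cancels the inner coroutine if the deadline passes.
        return await asyncio.wait_for(slow_operation(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"


fast = asyncio.run(run_with_deadline(delay=0.01, timeout=1.0))
hung = asyncio.run(run_with_deadline(delay=1.0, timeout=0.05))
```

In a real test you would usually let the TimeoutError propagate so the failure is reported, rather than converting it to a string as this demonstration does.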

"Database locked"

# When running tests in parallel with pytest-xdist, give each test its own
# temporary directory (e.g. via the temp_dir fixture) so workers never share a database file
pytest -n auto
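The isolation idea can be sketched as follows, assuming the locked database is a SQLite file (this guide does not say which database is used). The function name, table, and file name are hypothetical; the point is that each caller gets a database file in a directory nobody else uses.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_isolated(worker_name: str) -> int:
    """Create a private database in a fresh temp directory and use it."""
    with tempfile.TemporaryDirectory(prefix=f"{worker_name}-") as tmp:
        db_path = Path(tmp) / "test.db"  # hypothetical file name
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, note TEXT)")
            conn.execute("INSERT INTO trades (note) VALUES (?)", (worker_name,))
            conn.commit()
            (count,) = conn.execute("SELECT COUNT(*) FROM trades").fetchone()
            return count
        finally:
            conn.close()  # close before the directory is removed


# Two simulated workers: each sees only its own row.
counts = [run_isolated(name) for name in ("gw0", "gw1")]
```

Inside pytest, the built-in tmp_path fixture gives the same per-test isolation without managing the directory by hand.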

"Fixture not found"

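Ensure conftest.py is in the tests directory and defines (or imports) the fixture you are requesting, with the name spelled exactly as in the test's argument list. pytest discovers conftest.py automatically, so test files do not need to import it.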
