Overview
Phoenix is the machine learning system that powers both retrieval (finding relevant candidates from millions of posts) and ranking (scoring and ordering candidates by predicted engagement). The system uses transformer-based architectures adapted from the Grok-1 open source release by xAI, with custom input embeddings and attention masking designed specifically for recommendation systems.

The code is representative of the production model, with the exception of specific scaling optimizations.
Two-Stage Architecture
Phoenix operates in two distinct stages:

Retrieval
Two-Tower Model: Narrows millions of posts to hundreds using approximate nearest neighbor search.
- User Tower: Encodes user + engagement history
- Candidate Tower: Encodes all posts
- Similarity: Dot product for top-K selection
Ranking
Transformer with Candidate Isolation: Scores retrieved candidates using the full transformer.
- Input: User context + candidate posts
- Attention: Candidates isolated from each other
- Output: Probabilities for multiple engagement types
Retrieval: Two-Tower Model
The retrieval stage efficiently finds relevant candidates from a massive corpus.

Architecture
The User Tower encodes user features and engagement history:
phoenix/recsys_retrieval_model.py
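The contents of that file are not reproduced here; the following is a minimal numpy sketch of what a user tower can look like. The class name, layer shapes, and mean-pooling of history are illustrative assumptions, not the actual implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Unit-norm rows so that dot product equals cosine similarity.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

class UserTower:
    """Illustrative sketch: encodes user features + engagement history
    into a single L2-normalized embedding."""
    def __init__(self, vocab_size=10_000, dim=64, seed=0):
        rng = np.random.default_rng(seed)
        self.embed = rng.normal(scale=0.02, size=(vocab_size, dim))
        self.proj = rng.normal(scale=0.02, size=(2 * dim, dim))

    def __call__(self, user_ids, history_ids):
        user_vec = self.embed[user_ids]                  # [B, D]
        hist_vec = self.embed[history_ids].mean(axis=1)  # [B, D], mean-pooled history
        x = np.concatenate([user_vec, hist_vec], axis=-1) @ self.proj
        return l2_normalize(x)                           # ready for cosine top-K
```

The candidate tower follows the same pattern over post features, so both sides land in the same embedding space.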
Similarity Search
Once both towers produce normalized embeddings:
- Index building: All posts are encoded offline into an [N, D] matrix
- Query encoding: The user tower produces a [B, D] embedding at request time
- Top-K retrieval: Dot product similarity → select top candidates
Because both representations are L2-normalized, dot product equals cosine similarity, enabling efficient approximate nearest neighbor search with libraries like FAISS or ScaNN.
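The equivalence is easy to see with a brute-force sketch (exact search shown here for clarity; FAISS or ScaNN would replace the `argsort` with an approximate index):

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline: encode all posts into an [N, D] matrix, L2-normalized.
index = rng.normal(size=(1000, 64))
index /= np.linalg.norm(index, axis=1, keepdims=True)

# Request time: user tower produces a [B, D] query, also L2-normalized.
query = rng.normal(size=(1, 64))
query /= np.linalg.norm(query, axis=1, keepdims=True)

# Because both sides are unit-norm, the dot product IS the cosine similarity.
scores = query @ index.T               # [B, N]
top_k = np.argsort(-scores[0])[:10]    # exact top-K; ANN libraries approximate this
```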
Ranking: Transformer with Candidate Isolation
The ranking model scores the retrieved candidates using a full transformer architecture with a critical design choice: candidates cannot attend to each other.

Model Architecture
phoenix/recsys_model.py
Candidate Isolation Mask
The attention mask ensures candidates only attend to user/history tokens, never to each other:

Multi-Action Prediction
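A sketch of how such a mask can be built (the function name and the context-then-candidates sequence layout are assumptions for illustration):

```python
import numpy as np

def candidate_isolation_mask(n_ctx, n_cand):
    """Boolean allow-mask (True = attention permitted) for a sequence laid out
    as [user/history tokens..., candidate tokens...]. Every token may attend
    to the user/history context; each candidate may additionally attend to
    itself, but never to another candidate."""
    size = n_ctx + n_cand
    mask = np.zeros((size, size), dtype=bool)
    mask[:, :n_ctx] = True                          # all rows see user/history
    idx = np.arange(n_ctx, size)
    mask[idx, idx] = True                           # candidates see themselves
    return mask                                     # candidate-to-candidate stays False
```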
The model predicts probabilities for multiple engagement types simultaneously:

Hash-Based Embeddings
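A minimal sketch of a multi-action head: the action list and weight shapes below are illustrative, not the production set. The key point is independent sigmoids per action rather than a softmax, since a user can like, reply to, and repost the same post.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative action set; includes negative signals alongside positive ones.
ACTIONS = ["like", "reply", "repost", "block", "mute", "report"]

def multi_action_probs(candidate_states, head_weights):
    """candidate_states: [C, D] final hidden state per candidate.
    head_weights: [D, A], one logit column per action type.
    Independent sigmoids: actions are not mutually exclusive."""
    logits = candidate_states @ head_weights   # [C, A]
    return sigmoid(logits)
```

Downstream, these per-action probabilities can be combined with positive and negative weights into a single ranking score.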
Both retrieval and ranking use multiple hash functions for embedding lookup:

Multiple hash functions provide better representation capacity and collision resistance compared to a single hash table.
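A sketch of the idea, with hypothetical names and a simple multiplicative hash standing in for whatever the production code uses: each ID is looked up through several independent hash functions and the results are summed, so two IDs that collide in one table almost never collide in all of them.

```python
import numpy as np

class MultiHashEmbedding:
    """Illustrative multi-hash embedding lookup. Each hash function maps an
    arbitrary ID into a fixed bucket table; summing across hashes gives a
    near-unique representation without an explicit per-ID embedding row."""
    def __init__(self, num_buckets=1 << 16, dim=64, num_hashes=3, seed=0):
        rng = np.random.default_rng(seed)
        self.tables = rng.normal(scale=0.02, size=(num_hashes, num_buckets, dim))
        self.salts = rng.integers(1, 2**31 - 1, size=num_hashes)
        self.num_buckets = num_buckets

    def __call__(self, ids):
        ids = np.asarray(ids, dtype=np.uint64)
        out = 0
        for table, salt in zip(self.tables, self.salts):
            # Simple multiplicative hash; uint64 arithmetic wraps on overflow.
            bucket = (ids * np.uint64(salt)) % np.uint64(self.num_buckets)
            out = out + table[bucket.astype(np.int64)]
        return out   # [B, D]
```

Memory is bounded by `num_hashes * num_buckets * dim` regardless of how many distinct IDs exist, which is what makes this attractive for large entity spaces.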
Integration with Home Mixer
Phoenix Source (Retrieval)
home-mixer/sources/phoenix_source.rs
Phoenix Scorer (Ranking)
home-mixer/scorers/phoenix_scorer.rs
Running the Code
The repository includes example code demonstrating both retrieval and ranking.

Key Design Decisions
Why Candidate Isolation?
Prevents the score for a candidate from depending on which other candidates are in the batch. This ensures:
- Consistent scores across different batches
- Ability to cache predictions
- Simpler debugging and analysis
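The batch-independence claim can be checked directly with a toy single-head attention (a minimal sketch under assumed shapes, not the production model): a candidate's representation is bitwise identical whether or not other candidates share the batch.

```python
import numpy as np

def masked_attention(x, mask):
    """Single-head self-attention with a boolean allow-mask (True = attend)."""
    d = x.shape[-1]
    scores = (x @ x.T) / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)          # disallowed pairs -> ~zero weight
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
ctx = rng.normal(size=(4, 8))                      # user/history tokens
a, b = rng.normal(size=(8,)), rng.normal(size=(8,))

def run(cands):
    n_ctx, n = len(ctx), len(ctx) + len(cands)
    mask = np.zeros((n, n), dtype=bool)
    mask[:, :n_ctx] = True                         # everyone sees the context
    mask[np.arange(n_ctx, n), np.arange(n_ctx, n)] = True  # candidates see themselves
    x = np.vstack([ctx] + [c[None] for c in cands])
    return masked_attention(x, mask)

out_ab = run([a, b])
out_a = run([a])
# Candidate a's representation is identical with or without b in the batch.
assert np.allclose(out_ab[4], out_a[4])
```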
Why Hash-Based Embeddings?
Multiple hash functions provide:
- Better representation capacity than single lookup
- Collision resistance for large entity spaces
- Memory efficiency compared to explicit embedding tables
Why Multi-Action Prediction?
Rather than predicting a single “relevance” score, the model predicts probabilities for many actions:
- Captures nuanced user preferences
- Enables flexible weighting strategies
- Incorporates negative signals (block, mute, report)
Why Two-Stage (Retrieval + Ranking)?
- Retrieval: Fast, approximate search over millions of items
- Ranking: Expensive, precise scoring for hundreds of items
- This separation enables scaling to large corpora while maintaining quality
Performance
Typical Latencies
- Retrieval (Two-Tower): ~20-50ms for top-1000 from millions
- Ranking (Transformer): ~50-100ms for scoring 500 candidates
- Total Phoenix latency: ~70-150ms
Related Components
Home Mixer
Orchestration layer that uses Phoenix for candidate sourcing and scoring
Thunder
Provides in-network candidates to complement Phoenix’s out-of-network retrieval
Candidate Pipeline
Framework that integrates Phoenix into the overall recommendation flow