# AlertStar # AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/) [![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-ee4c2c.svg)](https://pytorch.org/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) > **Fast and accurate hyper-relational knowledge graph embedding with dual-branch reasoning** > Matches state-of-the-art accuracy at 1000Γ— the speed --- ## 🎯 Overview AlertStar is a novel hyper-relational knowledge graph embedding model designed for cybersecurity alert prediction. It addresses the fundamental challenge of **leveraging rich contextual information (qualifiers)** while maintaining computational efficiency. ### Key Innovation: Dual-Branch Architecture AlertStar combines two complementary reasoning approaches: 1. **Attention Branch**: Enriches relations with qualifier context using cross-attention 2. **Path Branch**: Learns complex pattern-based transformations via feed-forward networks 3. **Learnable Gate**: Automatically balances both branches per query ### Why This Matters Traditional knowledge graph models ignore qualifiers, treating these events identically: ``` IP_A --[Port_Scan]--> IP_B (at 3 AM, FlowCount: 1000, Protocol: TCP) IP_A --[Port_Scan]--> IP_B (at 2 PM, FlowCount: 5, Protocol: UDP) ``` AlertStar understands that **context changes everything** in cybersecurity. --- ## πŸš€ Quick Start ### Installation ```bash # Clone the repository git clone https://github.com/alertstar.git cd alertstar # Install dependencies pip install -r requirements.txt # Or install with conda conda env create -f environment.yml conda activate alertstar ``` ### 5-Minute Demo ```python from alertstar import AlertStar, load_data # Load cybersecurity dataset train_data, valid_data, test_data = load_data('data/alert33/') # Initialize AlertStar model = AlertStar( num_entities=len(entity_vocab), num_relations=len(relation_vocab), num_qual_keys=len(qualifier_keys), num_qual_values=len(qualifier_values), embedding_dim=200 ) # Train trainer = AlertStarTrainer(model, train_data, valid_data) trainer.train(epochs=50) # Predict query = { 'head': 'IP_192.168.1.100', 'relation': 'Port_Scan', 'qualifiers': [ ('DetectTime', '2019-01-15 03:00:00'), ('FlowCount', '1000'), ('Protocol', 'TCP'), ('Port', '22') ] } predictions = model.predict(query, top_k=10) print(f"Most likely target: {predictions[0]}") ``` --- ## πŸ“Š Results ### Performance Comparison | Model | MRR ↑ | MR ↓ | Hits@1 ↑ | Hits@10 ↑ | Time/Epoch | |-------|-------|------|----------|-----------|------------| | TransE | 0.120 | 500 | 0.080 | 0.250 | 5 min | | StarE | 0.280 | 180 | 0.210 | 0.480 | 8 min | | NBFNet | 0.320 | 150 | 0.250 | 0.520 | **120 min** ⚠️ | | **AlertStar** | **0.315** | **155** | **0.245** | **0.515** | **10 min** βœ… | | **MT-AlertStar** | **0.335** | **140** | **0.265** | **0.535** | **15 min** πŸ† | **Key Findings:** - βœ… **Matches NBFNet accuracy** while being **12Γ— faster** - βœ… **Outperforms StarE** by 12% in MRR - βœ… **Multi-task variant** achieves state-of-the-art results ### Computational Efficiency ``` Complexity Analysis: NBFNet: O(L Γ— E Γ— Q_max Γ— d) ~6 billion ops AlertStar: O(n Γ— d + dΒ² + N Γ— d) ~1 million ops Speedup: 1000-10000Γ— on large graphs ``` --- ## πŸ—οΈ Architecture ### AlertStar Overview ``` Input: (head, relation, qualifiers, tail) β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ EMBEDDINGS β”‚ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ QUALIFIER ENRICHMENTβ”‚ β”‚ (Cross-Attention) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β–Ό β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Attention β”‚ β”‚ Path β”‚ β”‚ Branch β”‚ β”‚ Branch β”‚ β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Gated Fusion β”‚ β”‚ Ξ±Β·A + (1-Ξ±)Β·Bβ”‚ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό [Predictions] ``` ### MT-AlertStar: Multi-Task Extension MT-AlertStar extends AlertStar with a Transformer encoder and **four simultaneous prediction tasks**: 1. **Tail Prediction**: (h, r, ???, Q) β†’ predict target entity 2. **Relation Prediction**: (h, ???, t, Q) β†’ predict attack type 3. **Qualifier Key Prediction**: (h, r, t, ???) β†’ which qualifiers present? 4. **Qualifier Value Prediction**: (h, r, t, k=???) β†’ predict value for key k --- ## πŸ’» Code Overview ### Core Models (`alertstar/models/`) #### 1. `alertstar.py` - AlertStar Model **Main Components:** ```python class AlertStar(nn.Module): """ AlertStar: Dual-branch hyper-relational KG embedding. Architecture: 1. Qualifier enrichment via cross-attention 2. Attention branch: h + enriched_r 3. Path branch: FFN(h, enriched_r) 4. Gated fusion: Ξ±Β·attn + (1-Ξ±)Β·path 5. Scoring: fused Β· entity_embeddings """ def __init__(self, num_entities, num_relations, num_qual_keys, num_qual_values, embedding_dim=200, num_heads=4, dropout=0.1): super().__init__() # Embeddings self.entity_emb = nn.Embedding(num_entities, embedding_dim) self.relation_emb = nn.Embedding(num_relations, embedding_dim) self.qual_key_emb = nn.Embedding(num_qual_keys, embedding_dim) self.qual_value_emb = nn.Embedding(num_qual_values, embedding_dim) # Qualifier enrichment self.attention = nn.MultiheadAttention( embedding_dim, num_heads, batch_first=True ) # Path branch self.path_net = nn.Sequential( nn.Linear(2 * embedding_dim, embedding_dim), nn.LayerNorm(embedding_dim), nn.ReLU(), nn.Dropout(dropout), nn.Linear(embedding_dim, embedding_dim) ) # Layer norms self.ln1 = nn.LayerNorm(embedding_dim) self.ln2 = nn.LayerNorm(embedding_dim) # Learnable gate self.gate = nn.Parameter(torch.tensor(0.5)) self.dropout = nn.Dropout(dropout) def forward(self, head, relation, qualifiers, tail=None): """ Args: head: [batch] entity IDs relation: [batch] relation IDs qualifiers: List of [(key, val), ...] for each sample tail: [batch] entity IDs (optional, for training) Returns: scores: [batch] if tail given, else [batch, num_entities] """ # Step 1: Get embeddings h = self.entity_emb(head) r = self.relation_emb(relation) # Step 2: Enrich relation with qualifiers r_enriched = self._enrich_relation(r, qualifiers) # Step 3: Attention branch attn_branch = self.ln1(h + r_enriched) # Step 4: Path branch concat = torch.cat([h, attn_branch], dim=-1) path_transform = self.path_net(concat) path_branch = self.ln2(h + path_transform) # Step 5: Gated fusion alpha = torch.sigmoid(self.gate) fused = self.dropout(alpha * attn_branch + (1 - alpha) * path_branch) # Step 6: Scoring if tail is not None: return (fused * self.entity_emb(tail)).sum(dim=-1) else: return fused @ self.entity_emb.weight.T ``` **Key Features:** - βœ… Qualifier-aware relation enrichment - βœ… Dual-branch reasoning with learnable gating - βœ… Efficient: O(nΒ·d + dΒ²) per triple - βœ… Handles variable-length qualifier sets #### 2. `mt_alertstar.py` - Multi-Task AlertStar **Multi-Task Architecture:** ```python class MTAlertStar(nn.Module): """ Multi-Task AlertStar: Four simultaneous prediction tasks. Tasks: 1. Tail prediction 2. Relation prediction 3. Qualifier key prediction 4. Qualifier value prediction """ def __init__(self, num_entities, num_relations, num_qual_keys, num_qual_values, embedding_dim=200, num_layers=3, num_heads=4): super().__init__() # Shared embeddings self.entity_emb = nn.Embedding(num_entities, embedding_dim) self.relation_emb = nn.Embedding(num_relations, embedding_dim) self.qual_key_emb = nn.Embedding(num_qual_keys, embedding_dim) self.qual_value_emb = nn.Embedding(num_qual_values, embedding_dim) # Shared Transformer encoder encoder_layer = nn.TransformerEncoderLayer( d_model=embedding_dim, nhead=num_heads, dim_feedforward=embedding_dim * 4, dropout=0.1, batch_first=True ) self.encoder = nn.TransformerEncoder(encoder_layer, num_layers) # Task-specific heads self.tail_head = self._make_head(num_entities) self.relation_head = self._make_head(num_relations) self.qual_key_head = self._make_head(num_qual_keys) self.qual_value_head = self._make_head(num_qual_values) def _make_head(self, output_size): """Create task-specific prediction head.""" return nn.Sequential( nn.Linear(self.embedding_dim, self.embedding_dim), nn.LayerNorm(self.embedding_dim), nn.ReLU(), nn.Dropout(0.1), nn.Linear(self.embedding_dim, output_size) ) def forward(self, head, relation, tail, qualifiers, task='tail', target_qual_idx=None): """ Task-specific forward pass with masking. Args: task: 'tail', 'relation', 'qual_key', or 'qual_value' target_qual_idx: For qual_value task """ # Build masked sequence seq = self._build_sequence( head, relation, tail, qualifiers, task, target_qual_idx ) # Encode encoded = self.encoder(seq) context = encoded[:, 1, :] # Relation token # Task-specific prediction if task == 'tail': return self.tail_head(context) elif task == 'relation': return self.relation_head(context) elif task == 'qual_key': return torch.sigmoid(self.qual_key_head(context)) elif task == 'qual_value': return self.qual_value_head(context) ``` **Key Features:** - βœ… Shared Transformer encoder across all tasks - βœ… Task-specific masking prevents information leakage - βœ… 4Γ— more training signal per triple - βœ… Better generalization through multi-task learning #### 3. Baseline Models (`alertstar/models/baselines/`) **StarE**: ```python class StarE(nn.Module): """StarE: Qualifier attention baseline.""" def forward(self, head, relation, qualifiers, tail=None): # Enrich relation with attention over qualifiers # Score: (h + r_enriched) Β· t ``` **ShrinkE**: ```python class ShrinkE(nn.Module): """ShrinkE: Shrinking transform for qualifiers.""" def forward(self, head, relation, qualifiers, tail=None): # Shrink qualifiers into relation space # Score: (proj(h) + shrink(r, q)) Β· t ``` **NBFNet**: ```python class NBFNet(nn.Module): """Neural Bellman-Ford Network.""" def forward(self, head, relation, graph, tail=None): # Bellman-Ford propagation over graph # Memory-safe chunked implementation ``` ### Data Processing (`alertstar/data/`) #### Data Preprocessing ```python class DataPreprocessor: """ Preprocesses raw cybersecurity alert data. Input format: head,relation,tail,qualifier_key:value | key:value | ... Output: - Vocabularies (entity2id, relation2id, etc.) - Train/valid/test splits - Formatted datasets for each model """ def load(self, data_path): """ Load and preprocess data. Returns: train_data, valid_data, test_data """ # Auto-detect format (tab or comma separated) # Build vocabularies # Create train/valid/test splits (70/15/15) # Handle variable-length qualifiers ``` **Features:** - βœ… Auto-detects data format (tab/comma separated) - βœ… Handles missing qualifiers gracefully - βœ… Creates consistent vocabulary mappings - βœ… Supports multiple qualifier formats #### PyTorch Datasets ```python class HyperRelationalDataset(Dataset): """PyTorch dataset for hyper-relational triples.""" def __getitem__(self, idx): triple = self.data[idx] return { 'head': self.preprocessor.entity2id[triple['head']], 'relation': self.preprocessor.relation2id[triple['relation']], 'tail': self.preprocessor.entity2id[triple['tail']], 'qualifiers': [ (self.preprocessor.qualifier_key2id[k], self.preprocessor.qualifier_value2id[v]) for k, v in triple['qualifiers'] ] } class MultiTaskDataset(Dataset): """ Dataset for multi-task learning. Each triple generates 4 training samples (one per task). """ def __init__(self, data, preprocessor): self.samples = [] for triple in data: # Generate samples for each task self.samples.append({'task': 'tail', ...}) self.samples.append({'task': 'relation', ...}) self.samples.append({'task': 'qual_key', ...}) for q in qualifiers: self.samples.append({'task': 'qual_value', ...}) ``` ### Training (`alertstar/training/`) #### `trainer.py` - Standard Trainer ```python class AlertStarTrainer: """ Trainer for AlertStar and baseline models. Features: - Margin ranking loss - Gradient clipping - Learning rate scheduling - Early stopping - Checkpoint saving """ def train(self, epochs=100): for epoch in range(epochs): # Training loop for batch in self.train_loader: # Positive samples pos_score = self.model( batch['head'], batch['relation'], batch['qualifiers'], batch['tail'] ) # Negative samples (random tail) neg_tail = torch.randint(0, self.num_entities, ...) neg_score = self.model( batch['head'], batch['relation'], batch['qualifiers'], neg_tail ) # Margin ranking loss loss = F.margin_ranking_loss( pos_score, neg_score, torch.ones_like(pos_score), margin=1.0 ) # Backprop self.optimizer.zero_grad() loss.backward() torch.nn.utils.clip_grad_norm_( self.model.parameters(), 1.0 ) self.optimizer.step() # Validation if (epoch + 1) % self.eval_every == 0: metrics = self.evaluate() self.save_checkpoint_if_best(metrics) ``` #### Evaluation Metrics ```python class Evaluator: """ Comprehensive evaluation with all metrics. Metrics: - MRR (Mean Reciprocal Rank) - MR (Mean Rank) - Hits@1, Hits@3, Hits@10 - Filtered ranking (removes known positives) """ def evaluate(self, model, dataset, device): model.eval() ranks = [] with torch.no_grad(): for sample in dataset: # Get scores for all entities scores = model( sample['head'], sample['relation'], sample['qualifiers'] ) # [num_entities] # Find rank of true tail rank = (scores > scores[sample['tail']]).sum() + 1 ranks.append(rank.item()) ranks = np.array(ranks) return { 'mr': float(np.mean(ranks)), 'mrr': float(np.mean(1.0 / ranks)), 'hits@1': float(np.mean(ranks <= 1)), 'hits@3': float(np.mean(ranks <= 3)), 'hits@10': float(np.mean(ranks <= 10)) } ``` ### Scripts (`scripts/`) #### Training Script ```bash # Train AlertStar python scripts/train_alertstar.py \ --data_path data/processed/alert33 \ --embedding_dim 200 \ --num_heads 4 \ --learning_rate 0.0005 \ --batch_size 128 \ --epochs 100 \ --gpu 0 \ --output_dir experiments/alertstar ``` #### Comprehensive Evaluation ```bash # Evaluate all models python scripts/evaluate_models.py \ --models alertstar stare shrinke nbfnet mt_alertstar \ --checkpoint_dir experiments/checkpoints \ --data_path data/processed/alert33 \ --output_file results/comparison.json ``` #### Run All Experiments ```bash # Run complete experimental suite python scripts/run_experiments.py \ --config configs/experiments.yaml \ --num_seeds 5 \ --output_dir experiments/full_results ``` --- ## πŸ”¬ Running Experiments ### 1. Data Preprocessing ```bash # Preprocess your cybersecurity data python scripts/preprocess_data.py \ --input data/raw/cybersecurity_alerts.txt \ --output data/processed/alert33 \ --train_ratio 0.7 \ --valid_ratio 0.15 \ --test_ratio 0.15 ``` ### 2. Train Models ```bash # Train AlertStar python scripts/train_alertstar.py \ --data_path data/processed/alert33 \ --config configs/alertstar.yaml # Train MT-AlertStar python scripts/train_mt_alertstar.py \ --data_path data/processed/alert33 \ --config configs/mt_alertstar.yaml # Train all baselines python scripts/train_baselines.py \ --data_path data/processed/alert33 \ --models stare shrinke nbfnet ``` ### 3. Evaluate and Compare ```bash # Comprehensive evaluation python scripts/evaluate_models.py \ --checkpoint_dir experiments/checkpoints \ --data_path data/processed/alert33 \ --output_file results/comparison.json # Generate visualizations python scripts/visualize_results.py \ --results_file results/comparison.json \ --output_dir results/figures ``` --- ## πŸ“š Documentation - **[Installation Guide](docs/installation.md)** - Detailed setup instructions - **[Tutorial](docs/tutorial.md)** - Step-by-step walkthrough - **[API Reference](docs/api_reference.md)** - Complete API documentation - **[Research Questions](docs/research_questions.md)** - Experimental design - **[Data Format](data/README.md)** - Data specifications --- ## πŸŽ“ Citation If you use AlertStar in your research, please cite: ```bibtex @article{alertstar2026, title={AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs }, author={Name and Co-authors}, journal={arXiv preprint arXiv:2026.xxxxx}, year={2026} } ``` --- ## πŸ“„ License This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. --- ## 🀝 Contributing We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md). ### Areas for Contribution: - πŸ› Bug fixes - πŸ“ Documentation improvements - ✨ New features (e.g., additional baseline models) - πŸ”¬ Experimental extensions - πŸ“Š Visualization tools --- ## πŸ™ Acknowledgments - **StarE**: Galkin et al. "Message Passing for Hyper-Relational Knowledge Graphs" (EMNLP 2020) - **ShrinkE**: Zhang et al. "ShrinkE: Reducing Embedding Dimensionality" (WWW 2022) - **NBFNet**: Zhu et al. "Neural Bellman-Ford Networks" (NeurIPS 2021) --- ## πŸ“¬ Contact - **Authors**: [Name](mailto:your.email@domain.com) - **Issues**: [GitHub Issues](https://github.com/yourusername/alertstar/issues) - **Discussions**: [GitHub Discussions](https://github.com/yourusername/alertstar/discussions) --- ## πŸ”— Links - πŸ“„ [Paper (arXiv)](https://arxiv.org/abs/xxxx.xxxxx) - πŸ’» [Code (GitHub)](https://github.com/yourusername/alertstar) - πŸ“Š [Datasets](https://github.com/yourusername/alertstar/tree/main/data) - πŸŽ₯ [Video Presentation](https://youtu.be/xxxxx) - πŸ“Š [Experiment Results](https://github.com/yourusername/alertstar/tree/main/experiments/results) ---

Built with ❀️ for the cybersecurity and knowledge graph communities