Infrastructure Overview
This document describes our cluster's core infrastructure components, which are managed through ArgoCD following GitOps principles.
Directory Structure
The infrastructure directory is organized as follows:
infrastructure/
├── application-set.yaml # ArgoCD ApplicationSet for automated deployment
├── project.yaml # ArgoCD Project definition
├── kustomization.yaml # Main kustomization file
├── auth/ # Authentication (Authentik)
├── controllers/ # Core controllers
│ ├── argocd/ # GitOps controller
│ ├── cert-manager/ # Certificate management
│ ├── external-secrets/ # Secrets management
│ └── longhorn/ # Storage controller
├── crds/ # Custom Resource Definitions
├── database/ # Database services
├── deployment/ # Deployment controllers
├── monitoring/ # Monitoring stack
└── network/ # Networking (Cilium, CoreDNS)
Core Components
Authentication (auth/)
- Authentik: Primary authentication provider
- SSO for cluster services
- OAuth2 proxy integration
- User management
Controllers (controllers/)
- ArgoCD
  - GitOps workflow management
  - Progressive delivery
  - Application synchronization
- cert-manager
  - Certificate lifecycle management
  - Let's Encrypt integration
  - Internal PKI
- external-secrets
  - Secrets management with Bitwarden
  - Secure key distribution
  - Secret rotation
- Longhorn
  - Distributed storage
  - Volume replication
  - Backup management
Networking (network/)
- Cilium
  - CNI provider
  - Network policies
  - Load balancing
  - Gateway API implementation
- CoreDNS
  - Cluster DNS
  - Service discovery
  - Custom DNS entries
GitOps Workflow
Application Deployment
Deployment Flow:
1. CRDs (Wave -1)
2. Core Infrastructure (Wave 0)
3. Controllers (Wave 1)
4. Storage (Wave 2)
5. Networking (Wave 3)
6. Authentication (Wave 4)
7. Applications (Wave 5+)
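The ordering above can be sketched in a few lines of Python. This is only an illustration: the wave numbers come from the list above, while the group names and both helper functions are hypothetical and do not exist in the repository. The annotation key `argocd.argoproj.io/sync-wave` is the one Argo CD reads; how it is actually applied to each component's manifests is not shown in this document.

```python
# Sync-wave numbers copied from the deployment flow above.
# The group names are illustrative labels, not identifiers from the repository.
SYNC_WAVES = {
    "applications": 5,
    "crds": -1,
    "networking": 3,
    "core-infrastructure": 0,
    "authentication": 4,
    "controllers": 1,
    "storage": 2,
}

# Annotation key Argo CD reads to decide a resource's sync wave.
SYNC_WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def deployment_order(waves):
    """Return group names sorted by ascending wave (lowest wave syncs first)."""
    return sorted(waves, key=lambda name: waves[name])


def wave_annotation(wave):
    """Build the metadata.annotations entry for a wave; the value must be a string."""
    return {SYNC_WAVE_ANNOTATION: str(wave)}


if __name__ == "__main__":
    for group in deployment_order(SYNC_WAVES):
        print(group, wave_annotation(SYNC_WAVES[group]))
```

Lower waves sync first, so a negative wave for CRDs makes the custom resource types available before anything that depends on them.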
Version Control
- All changes through Git
- Pull request workflow
- Automated validation
- Deployment tracking
Security Model
RBAC Configuration
Permissions:
  infrastructure:
    - cluster-admin scope
    - restricted to infrastructure namespace
  applications:
    - namespace-scoped
    - limited to specific resources
Network Security
- Zero-trust network model
- Explicit network policies
- TLS everywhere
- Gateway API for ingress
Maintenance Procedures
Component Updates
- Update in Git repository
- ArgoCD auto-sync
- Progressive rollout
- Validation checks
Troubleshooting Guide
When issues arise:
- Check ArgoCD sync status:
    kubectl get applications -n argocd
- Verify resources:
    kubectl get events -n <namespace>
    kubectl describe <resource> -n <namespace>
- Review logs:
    kubectl logs -n <namespace> <pod> -f
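When many applications are deployed, the first check is easier to read if it is filtered down to the applications that need attention. The sketch below assumes the sync-status command is run with `-o json` and that each item carries the `status.sync.status` and `status.health.status` fields of an Argo CD Application. The function name `find_unhealthy` and the sample data are hypothetical and exist only for illustration.

```python
import json


def find_unhealthy(applications_json):
    """Return (name, sync_status, health_status) for apps that need attention."""
    problems = []
    for app in json.loads(applications_json).get("items", []):
        status = app.get("status", {})
        # Missing fields are treated as "Unknown" so they are reported, not hidden.
        sync = status.get("sync", {}).get("status", "Unknown")
        health = status.get("health", {}).get("status", "Unknown")
        if sync != "Synced" or health != "Healthy":
            problems.append((app["metadata"]["name"], sync, health))
    return problems


# Hand-written sample with the same shape as the kubectl JSON output;
# the statuses shown here are made up for illustration.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "cert-manager"},
         "status": {"sync": {"status": "Synced"}, "health": {"status": "Healthy"}}},
        {"metadata": {"name": "longhorn"},
         "status": {"sync": {"status": "OutOfSync"}, "health": {"status": "Degraded"}}},
        {"metadata": {"name": "cilium"}},
    ]
})

if __name__ == "__main__":
    for name, sync, health in find_unhealthy(SAMPLE):
        print(f"{name}: sync={sync} health={health}")
```

In real use, `SAMPLE` would be replaced by the captured output of `kubectl get applications -n argocd -o json`.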
Best Practices
- GitOps Principles
  - Everything in Git
  - Declarative configurations
  - Automated reconciliation
- Security
  - Least privilege access
  - Regular certificate rotation
  - Secure secret management
- High Availability
  - Component redundancy
  - Data replication
  - Failure domain isolation
Monitoring & Alerting
Key Metrics
- Controller health
- Resource utilization
- Certificate expiration
- Storage capacity
Alert Rules
Priorities:
  critical:
    - Controller failures
    - Certificate expiration < 7 days
    - Storage capacity > 85%
  warning:
    - High resource usage
    - Sync delays
    - Storage capacity > 75%
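The numeric thresholds above can be written as a small classification sketch. It only illustrates how the two storage tiers and the certificate rule relate to each other. The function names are hypothetical, and the rule format used by the monitoring stack is not shown in this document.

```python
def storage_severity(used_percent):
    """Map storage usage (percent of capacity) to an alert priority, or None."""
    # Check the stricter threshold first so 85%+ is never reported as a warning.
    if used_percent > 85:
        return "critical"
    if used_percent > 75:
        return "warning"
    return None


def certificate_severity(days_until_expiry):
    """Certificates with fewer than 7 days left are critical; otherwise no alert."""
    return "critical" if days_until_expiry < 7 else None


if __name__ == "__main__":
    for usage in (70, 80, 90):
        print(f"storage {usage}% ->", storage_severity(usage))
    for days in (30, 3):
        print(f"certificate {days}d ->", certificate_severity(days))
```

Both storage comparisons are strict, matching the `>` in the rules above, so usage of exactly 85% is still a warning and exactly 75% raises no alert.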
Future Enhancements
- Enhanced metric collection
- Automated disaster recovery
- Cross-cluster failover
- Advanced policy enforcement