Job Description
Are you ready to define the technological landscape of 2026?
Nexus Future Systems is at the forefront of next-generation AI development. We are seeking a visionary Lead AI Infrastructure Architect to design the scalable, resilient, and ethical frameworks that will power our AI solutions for the next decade. You won't just be maintaining systems; you will be architecting the infrastructure for Artificial General Intelligence (AGI) readiness.
In this pivotal role, you will bridge the gap between cutting-edge machine learning research and robust, production-grade engineering. You will lead a team of elite engineers in building high-performance systems capable of processing petabytes of data in real time.
Why join us?
- Work on mission-critical AI projects that define industry standards.
- Competitive compensation and equity packages.
- Flexible remote-first culture with premium office amenities in San Francisco.
- Access to state-of-the-art hardware and research facilities.
Responsibilities
- Architect Scalable AI Systems: Design and implement distributed computing architectures optimized for Large Language Models (LLMs) and next-gen neural networks.
- Infrastructure Strategy: Lead the roadmap for cloud-native infrastructure, ensuring high availability and fault tolerance for mission-critical AI workloads.
- Ethical AI Governance: Establish frameworks for AI safety, bias mitigation, and data privacy in alignment with global regulatory standards for 2026.
- Performance Optimization: Drive continuous optimization of inference latency and resource utilization across our global data centers.
- Technical Leadership: Mentor senior engineers and define best practices for code quality, CI/CD pipelines, and system reliability.
- Collaboration: Partner with product and research teams to translate complex AI capabilities into user-centric infrastructure solutions.
Qualifications
- Experience: 7+ years in software engineering, including at least 3 years in a senior or lead role within AI/ML infrastructure.
- Technical Stack: Deep expertise in Python, C++, and Rust. Proficiency in containerization (Docker) and orchestration (Kubernetes).
- Cloud Mastery: Extensive experience with at least one major cloud provider (AWS, GCP, or Azure) and with serverless architectures.
- AI/ML Knowledge: Strong understanding of deep learning frameworks (TensorFlow, PyTorch) and model deployment strategies.
- Education: Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field.
- Problem Solving: Exceptional ability to troubleshoot complex system bottlenecks and scale systems under extreme load.