Lead AI/ML Engineer

Litespace

Litespace is a cutting-edge AI recruitment firm that partners with leading and Fortune 500 companies across North America, rapidly connecting them with elite AI talent. We don’t just recruit; we drive innovation by placing exceptional engineers into roles that redefine the future of technology.

By applying to this role, you’ll be joining an ecosystem where excellence meets disruption.

Hybrid/In-Person (San Francisco Bay Area)

Engineering

Senior

Position Overview

In today’s era of digital transformation, top organizations are revolutionizing how they leverage AI to solve complex challenges. In this Lead AI/ML Engineering role, you will be the driving force behind the development and optimization of high-performance, scalable ML infrastructure. This is not just another engineering position; it’s a pivotal role where you will:

Innovate at Scale: Collaborate with cloud architects, ML researchers, and infrastructure experts to pioneer advanced ML systems.
Drive Business Outcomes: Engineer robust systems that ensure seamless, low-latency model deployment and directly contribute to our clients’ digital success.
Lead and Inspire: Provide technical leadership and mentorship, championing best practices and fostering a culture of excellence in AI engineering.

Key Responsibilities

Architect & Optimize ML Pipelines: Design and refine ML pipelines for scalable inference and model deployment on cloud-based GPU infrastructure (e.g., AWS, GCP, Azure).
Develop High-Throughput Systems: Build and maintain serving pipelines for AI models that deliver low-latency, high-performance execution.
Enhance Model Serving: Collaborate with ML research teams to integrate techniques such as tensor parallelism, quantization, distillation, and caching.
Implement Monitoring Tools: Design automated monitoring and profiling systems to track performance, detect regressions, and optimize resource utilization.
Optimize Cloud Resources: Strategically allocate and orchestrate GPU resources across diverse cloud ML workloads.
Integrate Load Testing: Deploy scalable load testing frameworks to ensure system reliability under high-traffic conditions.
Facilitate Production Transitions: Partner with cross-functional teams to move models from experimental phases to production-ready, cloud-native deployments.
Establish Best Practices: Develop and promote standard methodologies for scalable, fault-tolerant ML architectures across multi-region environments.

Required Qualifications

5+ years of proven success in building high-performance ML infrastructure and scalable AI systems.
Degree (Undergraduate, MS, or PhD) in Computer Science, Machine Learning, or a related field.
Proficiency in one or more programming languages (e.g., Python, C++, Java, JavaScript/TypeScript, Go, Rust, Ruby, C#).
Extensive experience with AI/ML frameworks such as TensorFlow, PyTorch, or similar.
Demonstrated expertise in deploying large-scale ML models in cloud environments (e.g., AWS GPU instances, Kubernetes, Ray).
Deep understanding of cloud-native architectures, autoscaling strategies, and fault-tolerant machine learning systems.
Proven skills in GPU orchestration, CUDA, and accelerated inference techniques, with hands-on experience using profiling tools (e.g., Nsight, PyTorch Profiler, perf).
Ability to thrive in a fast-paced, startup-like environment and collaborate effectively with diverse, cross-functional teams.

Nice to Have

Experience leading technical teams or mentoring junior engineers.
Advanced expertise in model conversion and optimization frameworks (e.g., ONNX, TensorRT) and AOT compilation techniques.
Prior experience transitioning research models into production-scale applications.

Location

Hybrid/In-Person (San Francisco Bay Area)

Compensation

Our compensation reflects the labor market across various U.S. regions. The pay range for this position is $240,000 – $470,000 per year, with performance-based incentives and potential long-term equity awards. Detailed salary and benefits information will be shared during the hiring process.

How to Apply

If you’re passionate about AI/ML engineering and ready to advance your career in a dynamic, innovative environment, we encourage you to apply. Please submit your resume and any relevant project samples through our application portal.

Embark on your journey with Litespace—where your technical talent meets endless opportunities for growth and innovation.