About Atla
Atla is committed to building safe, beneficial AI systems that will have a massive positive impact on the future of humanity. We are a London-based start-up training frontier AI evaluation models. Become part of our growing world-class team. We are backed by Y Combinator, Creandum, and the founders of Reddit, Cruise, Rappi, Instacart and more.
Role
As Atla’s Machine Learning Engineer, you’ll spearhead our post-training and inference frameworks for large language models.
- Design and optimise a scalable post-training framework to efficiently manage training runs while controlling costs.
- Implement scalable data pipelines, optimise models for performance and accuracy, and ensure they are production-ready.
- Engineer robust, high-performance inference platforms to ensure our products' reliability, throughput and speed at scale.
- Collaborate with researchers and engineers to accelerate the iteration of new research ideas and improve training workflows.
- Build and grow our engineering organisation, setting a high bar of excellence that propels Atla forward.
Please note that this role is in-person (we can sponsor visas and offer international relocation support as a UK AI Futures partner!)
Qualifications
Evidence of exceptional ML engineering ability:
- Proven expertise in software or ML engineering, focusing on building robust, scalable systems.
- Experience with orchestration systems like Slurm or Ray, along with MLOps tools such as Kubernetes, Vertex AI, or SageMaker.
- Skilled in creating and managing multi-instance clusters for data-parallel and model-parallel training on GPUs/TPUs, preferably using DeepSpeed or PyTorch FSDP.
- Proficient in serving large machine learning models at scale, including quantization, distributed computing, and frameworks like vLLM or Ray Serve.
- Strong understanding of techniques and tools like paged attention, gradient checkpointing, and DeepSpeed, with the ability to implement and optimise them at scale.
Nice to have
- Experience at a leading AI company (Mistral, Anthropic, OpenAI, xAI, Hugging Face, Cohere, Stability AI, etc.)
- Publications in top AI conferences.
- Interested in and thoughtful about the impacts of AI technology.
About you
You are going to thrive at Atla with the following mindset:
- Collaborative and team-oriented, with strong communication skills.
- Comfortable with the uncertainty and fast pace of a hyper-growth startup.
- Willing to learn continuously and adapt in a dynamic environment.
- Unpretentious and hard-working; you look for the best ideas wherever they come from.
Compensation
- £100K to £300K
- Significant stake in equity as one of the first joiners
- Pension plan
- Medical, dental, and vision benefits