AI Platform Engineer, Madrid
Company
Airbus
Province
Madrid
City
Madrid
Contract Type
Full-Time
Description
AI Platform Engineer
Job Description:
Airbus Defence and Space is looking for an AI Platform Engineer (MLOps) to build and scale high-performance computing infrastructure in high-security environments.
TADMC provides full engineering and software support for CLAEX's Operational Flight Programs (OFP) and in-service aircraft fleets.
The selected candidate will join the Architecture Integration team (TADMC2) at CLAEX, at Torrejón de Ardoz Air Base. Their primary mission will be to design, operate, and evolve the Kubernetes platform dedicated to Artificial Intelligence, ensuring a stable, automated, and scalable working environment built on cutting-edge GPU infrastructure.
We are looking for a specialist in Platform Engineering / MLOps who thrives on the challenges of critical infrastructure. We are not just looking for someone to maintain systems, but for a professional capable of building and automating complex computing environments in maximum-security scenarios (air-gapped or offline environments).
Key Responsibilities
- AI Platform Management: Design and administer Kubernetes environments optimized for Artificial Intelligence workloads.
- Infrastructure Automation: Implement Infrastructure as Code (IaC) methodologies and continuous deployment models (GitOps) to ensure system reproducibility.
- High-Security Environment Operations: Manage air-gapped (isolated) infrastructures, ensuring local repository management, updates, and security without reliance on the public cloud.
- Computing Resource Optimization: Administer and prepare high-performance nodes with GPU acceleration for inference and training tasks.
- Storage and Network Architecture: Configure and maintain persistent storage systems and segmented networks to ensure data integrity and speed.
- Observability and Continuity: Implement advanced monitoring systems to ensure cluster health, GPU performance, and proactive incident detection.
- System Security: Apply hardening policies and access control to protect critical infrastructure.
Requirements
- Solid experience (3+ years) in DevOps, SRE, or Platform Engineering roles.
- Degree in Computer Engineering, Telecommunications Engineering, Software Engineering, or Mathematics.
- Proven experience working with container orchestration (Kubernetes).
- Experience managing critical infrastructure or isolated environments (air-gapped/offline).
- Advanced proficiency in Python.
- Advanced proficiency in Linux operating systems and network administration.
- Experience in deployment automation (Ansible, Terraform, or similar tools).
- Ability to work with modern deployment methodologies (GitOps).
- B2 level in English.
Preferred Qualifications
- Knowledge of military avionics and embedded/real-time software is desirable, since the LLM training and inference work is aimed at supporting military avionics development and most tasks will relate to military avionics.
- Previous experience managing infrastructure for Artificial Intelligence (NVIDIA/CUDA driver management).
- Knowledge of distributed storage solutions and private image registry management.
- Official Kubernetes certifications (CKA/CKS).
- Interest in working on defense projects and cutting-edge technology within high-security environments.
WHICH BENEFITS WILL YOU HAVE AS AN AIRBUS EMPLOYEE?
At Airbus we are focused on our employees and their welfare. Take a look at some of our social benefits:
- Vacation days plus additional days off throughout the year.
- Attractive salary.
- Hybrid working model where possible, promoting work-life balance.
- Collective transport service at some sites.
- Benefits such as health insurance, employee stock options, retirement plan, or study grants.
- On-site facilities (among others): free canteen, kindergarten, medical office.
- Possibility to collaborate in different social and corporate social responsibility initiatives.
- Excellent upskilling opportunities and great development prospects in a multicultural environment.
- Special rates on products and benefits.
This job requires an awareness of any potential compliance risks and a commitment to act with integrity, as the foundation for the Company's success, reputation and sustainable growth.
Company:
Airbus Defence and Space SAU
Employment Type:
Permanent
Experience Level:
Professional
Job Family:
Software Engineering
At Airbus, we support you to work, connect and collaborate more easily and flexibly. Wherever possible, we foster flexible working arrangements to stimulate innovative thinking.
Kubernetes, Python, Linux, Ansible, Terraform, GitOps