Infrastructure Engineer for AI/ML

GET (Global Engineering Technologies) is a Belgrade-based Serbian company founded in 2007. GET is recognized on the international market as a reliable ISV and IT outsourcing company. We specialize in software development and IT services and are one of the fastest-growing IT companies in the SEE region, with Atlassian, SAP, Microsoft and Cisco partner status.

We are innovators, engineers, designers and project managers who build long-term strategic and innovation partnerships with clients in the automotive, banking, logistics and pharma/biotech industries.

GET has more than 280 software engineers, business analysts, business domain experts and corporate staff.

We are GET. And we grow together as a team.

Click to GET Inside:

www.global-engineering-technologies.com/get-inside

Due to our continuous growth, we are looking for:

Infrastructure Engineer for AI/ML

Responsibilities

On-Premise Infrastructure Management:

● Manage storage and networks associated with AI/ML activities.
● Ensure the availability and performance of on-premise resources.
● Administer and maintain our GPU compute clusters (NVIDIA).

Cloud Infrastructure Management (GCP):

● Design, deploy, and manage cloud architectures on Google Cloud Platform (GCP) for AI/ML projects.
● Master and optimize the use of relevant GCP services (Compute Engine, Kubernetes Engine (GKE), Cloud Storage, Networking, etc.).
● Administer and operate the Vertex AI platform for model training, deployment, and management.

Orchestration and Automation:

● Implement and promote Infrastructure as Code (IaC) practices (e.g., Terraform, Ansible).
● Develop, maintain, and improve orchestration tools (particularly in Python) for managing and automating training jobs across different infrastructures (on-premise and cloud).

Support and Collaboration:

● Work closely with Data Science and Machine Learning teams to understand their infrastructure needs and provide them with suitable environments.
● Provide technical support on infrastructure aspects related to the entire AI stack (from development to production).
● Ensure system security, patching, and access control.
● Diagnose and resolve incidents related to AI/ML infrastructures.

Optimization and Technology Monitoring:

● Monitor the performance, costs, and security of the infrastructures.
● Propose and implement optimizations.
● Implement disaster recovery and high availability strategies.
● Actively monitor technological advancements in hardware and software infrastructure solutions (on-premise and cloud), MLOps tools, and best practices in the field.

About you

● Proven experience (at least 3 to 5 years) in IT infrastructure management, a significant part of it in High-Performance Computing (HPC) or AI/ML environments.
● Excellent expertise in Linux environments.
● Solid experience in managing compute clusters, ideally with NVIDIA GPUs.
● In-depth knowledge of the Google Cloud Platform (GCP) ecosystem, including infrastructure services and the Vertex AI platform.
● Advanced proficiency in Python development, particularly for automation, scripting, and orchestration.
● Good knowledge of containerization technologies (Docker, Kubernetes).
● Strong understanding of the end-to-end AI stack and MLOps principles.
● Experience with Infrastructure as Code (IaC) tools.
● Proficient in written and spoken English.

It would be a plus to have

● Knowledge of monitoring tools such as Prometheus, Grafana, and tools for GPU-specific metrics (e.g., DCGM, nvidia-smi integration).
● Specific experience in infrastructure for Computer Vision workloads (e.g., managing pipelines for image/video processing with GPUs).
● Knowledge of other cloud platforms (AWS).

GET benefits you

● Hybrid work model
● Private health insurance
● Choose one additional benefit: FitPass, event tickets or private pension insurance
● Learning and growth program
● Access to Udemy, Pluralsight and GET Library
● Language classes (English & German)
● Team-building activities and other team events
● Various shopping discounts
● Weekly sport activities (football, basketball, volleyball)
● Gaming room (billiards, table football, darts & board games)
● Gifts/vouchers for special occasions (newborns, International Women's Day, Christmas gifts for kids)
● Possibility to travel abroad

We are pleased that you have taken the first step toward finding out about a career at Global Engineering Technologies. To ensure that your application is handled as professionally as possible, GET accepts online applications only.

Please note that only shortlisted candidates will be contacted.

Copyright © 2025. Global Engineering Technologies. All rights Reserved.