Friedrich-Alexander-Universität Erlangen-Nürnberg

Full-time

Erlangen

AI Inference Platform Engineer (m/f/d)
Your workplace
The Erlangen Regional Computing Center (RRZE) is FAU’s internal IT service provider and provides the IT infrastructure and basic IT services. As part of the regional concept, it also supports other colleges and universities in the surrounding area.

The High Performance Computing (HPC@FAU) group works closely with the National Supercomputing Center (NHR@FAU) and the Leibniz Supercomputing Center in Garching near Munich.

Our benefits

Regular promotion to the next level and increase in salary pursuant to the collective bargaining agreement for the public service of the German Länder (TV-L) or remuneration pursuant to the Bavarian Public Servants Remuneration Act (BayBesG) plus an additional annual bonus
30 days of annual leave based on a five-day working week, plus additional days off on December 24 and 31
Occupational pension scheme and asset accumulation savings scheme
Excellent support during the academic qualification phase
Thorough onboarding process with a dedicated team
Subsidized food and drinks in our student restaurants
Place of work within comfortable walking distance of public transport
Family-friendly environment with childcare options, also during school holidays
Flexible working hours
A wide range of training courses and opportunities for professional development
Active health management

Your Tasks
Your Role and Responsibilities:

Designing, implementing and maintaining an AI inference platform based on predominantly open-source components including a web-based user interface and API, all within a friendly and open work environment in a highly motivated, international team
Conceptualizing and implementing infrastructure components to create a RAG-capable inference environment
Advising and supporting pilot project partners at selected universities that use this AI service infrastructure on data quality, data preparation and workflow design, helping to move prototypes into production and relying on your friendly personality and communication skills
Designing and implementing tenant separation concepts for access, data and compute, integrating with federated single sign-on (SSO) institutional identity management systems
Implementing resource management mechanisms to ensure fair and efficient resource allocation and to allow for usage accounting and cost attribution

Your Profile
Required/Minimum Qualifications

PhD or Master’s degree in computer or data science, or other areas of scientific computing

Other Requirements

Proficiency in working in data center environments (incl. Linux, CLI, Git, GitLab)
Extensive knowledge and experience in developing and maintaining platform environments for AI inference workflows, using for example:
web servers / load balancers (e.g. Nginx), databases (e.g. MariaDB, SQLite, Redis)
containers/OS-level virtualization (e.g. Docker) and container orchestration (e.g. Kubernetes), as well as HPC schedulers (e.g. Slurm)
monitoring tools for metric collection (e.g. Prometheus) and visualization (e.g. Grafana)
Python and JavaScript programming languages for development of frontend components (e.g. Open WebUI)
model gateways (e.g. LiteLLM) and inference engines (e.g. vLLM, Triton, SGLang) as well as underlying GPU-based technologies (e.g. torch, ray)
Knowledge of various types of AI models (e.g. LLMs, vision-language models, …), model guardrails and retrieval-augmented generation (RAG)
Willingness to keep up with current developments and to learn new technologies in the field of AI
Knowledge and experience in software deployment and software lifecycle management (ideally based on principles of continuous integration/continuous deployment, CI/CD)
Basic knowledge and practical skills in software design and engineering
Basic knowledge and practical skills in IT and cyber security for software and software platform development
English and German presentation and writing skills

Supplementary description

Application deadline

2026-06-07

Title

AI Inference Platform Engineer (m/f/d)

Job start date

2026-08-01

Payment

E13

Working time

full time

Weekly working hours

40.1

Temporary contract

yes

Reason for temporary contract

Project

Further general conditions

Work location

Martensstraße 1
91058 Erlangen

Contact

Thomas Zeiser
Tel.: +49 9131 85-28737
thomas.zeiser@fau.de

Please apply via our online platform instead of sending applications by post or e-mail. Applications sent to us by post will not be returned.

With regard to the personal data collected during the application process, please refer to our information pursuant to Art. 13 and 14 of the General Data Protection Regulation (GDPR) at .

FAU is a modern, cosmopolitan and family-friendly employer. We welcome your application irrespective of your age, gender, cultural and social background, religion, ideological beliefs, disability or sexual identity. Applicants with a disability or who are considered to be equivalent to people with a disability will be given preference provided their aptitude, performance and capability are essentially the same. We are happy to offer part-time positions, provided a job-sharing arrangement means that the tasks in the area are fully covered.

If you wish, you may invite a person responsible for ensuring equal rights to accompany you to the job interview without incurring any disadvantages.