AI Systems Engineer
(3+ years) Onsite

Minimum qualifications:

  • Bachelor's degree in Computer Science, Engineering, Artificial Intelligence, or a related field, or equivalent practical experience.
  • 3+ years of backend, platform, infrastructure, AI engineering, or distributed systems experience.
  • Experience designing and operating production services, APIs, databases, queues, or workflow systems.
  • Comfortable reviewing architecture, debugging production issues, writing tests, and documenting system behavior.
  • Interest in building AI infrastructure that is reliable, observable, secure, and scalable.

Job Description:

As an AI Systems Engineer at Aiotrix, you will work on the core technical systems that make AI products dependable. This role focuses on runtime behavior, state, memory, tool execution, traces, evaluations, security, and reliability.

You will help build the infrastructure layer behind ART and Aiotrix AI systems, where AI workflows must run predictably, recover from failures, and remain observable in production.


Responsibilities:

  • Design AI runtime components for tool execution, memory, state management, queues, retries, and long-running workflows.
  • Build observability systems for AI workflows, including traces, logs, tool calls, latency, cost, outputs, and failures.
  • Develop evaluation and regression systems to measure quality, reliability, and behavior across AI workflows.
  • Implement secure execution boundaries, permission checks, rate limits, and safety controls.
  • Work with vector databases, retrieval systems, event streams, caches, and backend services.
  • Optimize performance, reliability, scalability, and operational cost of AI infrastructure.
  • Collaborate with product and AI engineers to expose system capabilities through simple developer and user workflows.

Preferred qualifications:

  • Strong backend engineering experience with Python, Go, TypeScript, Node.js, or similar technologies.
  • Experience with distributed systems concepts such as queues, retries, idempotency, concurrency, state, and observability.
  • Hands-on exposure to LLM APIs, RAG, vector databases, tool calling, or AI workflow orchestration.
  • Experience with tracing, monitoring, logging, metrics, testing, and debugging production systems.
  • Understanding of security, permissions, authentication, and safe execution patterns.
  • Ability to reason deeply about reliability, failure modes, and platform behavior.

If you are interested, please fill in the details below and apply.

Be sure to include an updated resume (DOC, DOCX, or PDF; 2MB).
