Staff AI Engineer (Orchestration)

Heidi Health

Software Engineering, Data Science

Melbourne, VIC, Australia

Posted on Mar 16, 2026

Location

Melbourne

Employment Type

Full time

Location Type

Hybrid

Department

Engineering

Who We Are

Healthcare needs a better rhythm: one that keeps care continuous and deeply human. Heidi is building an AI Care Partner that works alongside clinicians to make that possible.

We’re a team of doctors, engineers, designers, researchers, and creatives building tools that help clinicians stay focused on what matters most: their patients.

In just 18 months, Heidi has given back more than 18 million hours to healthcare professionals — supporting 73 million patient visits in 116 countries. Today, more than two million patient visits each week are powered by Heidi worldwide.

Backed by nearly $100 million in funding, we’re growing in the US, UK, Canada, and Europe, partnering with leading health systems including the NHS, Beth Israel Lahey Health, and Monash Health.

The Role

You will operate as a Staff+ AI scientist/engineer on the Orchestration team. You will own the design and delivery of a clinician‑grade retrieval and question‑answering stack across data ingestion, indexing, ranking, grounding, and safe deployment.

You will set technical direction, establish quality bars, and lead cross‑functional execution with engineering, product, and clinical experts.

You will move between research and production, turning prototypes into reliable services with clear SLAs, traceable outputs, and unit/acceptance metrics that matter in clinical contexts.

What you’ll do:

Define the end‑to‑end architecture for literature and guideline ingestion, normalization, metadata extraction, de‑duplication, and versioning.
Build hybrid search and retrieval: lexical + vector + re‑ranking, with tight latency budgets and cost controls.
Design grounding and answer synthesis that cite sources, preserve provenance, and expose confidence and abstention.
Lead model work across prompting, fine‑tuning, distillation, and tool use to improve faithfulness, coverage, and utility.
Stand up gold‑standard evaluation: offline IR metrics (nDCG, MAP, recall), factuality/faithfulness audits, and human review with adjudication.
Run online experiments at scale. Define guardrails, KPIs, and ship A/Bs to measure impact on clinician workflows.
Productionize services with observability, tracing, canaries, rollbacks, and incident playbooks.
Set data governance for medical content: access control, PHI handling, audit logs, and retention policies.
Partner with clinicians to define intents, schemas, and acceptance criteria. Convert ambiguous questions into testable specs.
Coach engineers and scientists. Raise the technical bar through design docs, reviews, and reusable components.

What we will look for:

Staff‑level track record shipping search, NLP, or LLM systems that serve real users at scale.
Mastery of Python and SQL. Strong software engineering fundamentals, testing strategy, and API/service design.
Depth in modern IR/NLP: embeddings, ANN indexes, re‑rankers, retrieval‑augmented generation, and prompt/program synthesis.
Experience building data pipelines: parsing PDFs/HTML, OCR when needed, metadata extraction, and content hashing/versioning.
Familiarity with PyTorch, plus distributed training/inference patterns.
MLOps and reliability: containers, Kubernetes, feature/model registries, experiment tracking, monitoring, and alerting.
Evidence of rigorous evaluation design: offline metrics, human‑in‑the‑loop judging, power analysis for online tests.
Clear thinking on safety: hallucination controls, calibration, abstention, red‑teaming, and privacy/security by design.
Ability to lead cross‑functional initiatives and make crisp decisions with incomplete information.

Bonus:

Search relevance expertise for long‑form, citation‑heavy domains.
Knowledge of biomedical ontologies and standards (e.g., SNOMED CT, UMLS, ICD, RxNorm, FHIR).
Prior work with literature and guideline corpora, de‑duplication, and document lineage tracking.
Experience with hybrid retrieval stacks (e.g., BM25 + ANN) and learned re‑rankers.
Familiarity with clinical evaluation methods, EBM hierarchies, and annotation workflows.
Strong cost‑performance tuning for LLM inference, caching, and batching in production.

The way we work

1. Build to Last

We design for safety and reliability so clinicians, patients, and our teams can trust what we build every day.

2. Own Your Practice

Ideas rise on merit, not title, and everyone shares responsibility for the standards we set together.

3. Move Fast, Stay Steady

We move quickly but never at the cost of trust. Progress only matters if people can depend on what we make.

4. Make Others Better

Honest feedback, steady support, and shared growth keep our teams improving together.

Why you will flourish with us

Flexible hybrid working environment, with 3 days in the office.
A generous personal development budget of $500 per annum.
Learn from some of the best engineers and creatives as part of a diverse team.
Become an owner, with shares (equity) in the company: if Heidi wins, we all win.
The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups.
The opportunity to fast-track your startup career if you make an impact quickly.

Heidi is dedicated to creating an equitable, inclusive, and supportive work environment that brings people together from diverse backgrounds, experiences, and perspectives. Our strength is in our differences.

We're proud to be an equal opportunity employer and welcome all applicants as we're committed to promoting a culture of opportunity for all.
