Work with us

We're a small team doing real work with enterprise clients. Fully remote, no fluff.

Open positions

AI Developer in Test

Fully Remote · Full-time · Immediate start
Open

We're looking for experienced AI Developers in Test to work on validating LLM-based applications, multi-agent systems, and RAG pipelines. You'll make sure AI outputs are safe, reliable, and actually useful for end users.

AI System Validation & Testing

  • Design and implement testing frameworks for LLM-based applications, multi-agent systems and RAG pipelines
  • Evaluate AI outputs for accuracy, factual correctness, contextual relevance, bias, safety and compliance
  • Implement governance workflows, risk scoring and compliance reporting
  • Set up continuous monitoring for production AI systems with automated alerts for anomalies

Testing Infrastructure & Automation

  • Validate LLM APIs across multiple providers using Postman, REST Assured, pytest
  • Integrate AI testing suites into CI/CD pipelines for regression testing, benchmarking and deployment gates
  • Manage version control for prompts and test cases using Git, plus MLOps tools like MLflow or Weights & Biases

Data Engineering & Evaluation

  • Work with JSON, CSV, Parquet and JSONL data to build evaluation datasets from real-world scenarios
  • Design automated metrics and human-in-the-loop workflows including inter-annotator agreement
  • Create and maintain domain-specific evaluation benchmarks and ground truth datasets

Leadership & Client Work

  • Build and lead a testing team, and be the go-to expert on AI quality
  • Own client relationships — retention and growth
  • Share your knowledge within the B-Sure Digital consulting community

What you need

  • Hands-on experience with GenAI apps — prompt engineering, API integrations, output workflows
  • Familiarity with RAG, vector databases, embedding strategies and multi-agent systems
  • Understanding of GenAI challenges: hallucinations, prompt sensitivity, output variability
  • Solid testing background in enterprise environments
  • Proficiency with API testing frameworks, CI/CD and Git
  • Data handling skills and knowledge of AI evaluation methodologies
  • Comfortable working in ambiguous environments — this is a new field, you'll help define it
  • Strong communication skills — you'll talk to both engineers and stakeholders

Nice to have

  • Experience using LLMs (ChatGPT, Claude, etc.) for test generation
  • Knowledge of computer vision for visual testing
  • Experience with AWS, Azure, or GCP
  • ISTQB or similar certifications

What you get

  • Competitive salary
  • Fully remote — work from wherever
  • Budget for training and conferences
  • Access to the latest AI tools and platforms
  • Diverse international client projects

Don't see your role?

We're always open to hearing from good engineers. Drop us a message and we'll talk.

Get in Touch