You’ll want to know about the department that our role is in…
Calrom is a dynamic, growing SaaS company, developing advanced online pricing, booking, management, & ticketing engines for the travel industry.
With a proven record of delivering digital solutions to complex business challenges, our engineers develop innovative software for airlines such as British Airways, Iberia, Qantas and other internationally recognised carriers. All our solutions are delivered on a SaaS basis, hosted within our own virtualised server infrastructure in datacentres in the UK and overseas. We provide robust and scalable services to customers in countries around the world.
As our Head of Operations, you’ll lead the organisation in ensuring our business-critical solutions are kept operational and performing at all times. You’ll be passionate about ensuring our solutions are fast, highly available, scalable, and free of single points of failure.
The position requires the flexibility to take a holistic view of how software moves from development, through validation environments, and smoothly into production using CD tooling. You’ll lead our SRE, Infrastructure (Back Office and Customer-facing), and Service and Support teams, providing strategic guidance and oversight. Alongside a leadership mindset and business acumen, you will retain the opportunity to delve deeply into technical details as and when needed.
The successful candidate will have:
- Experience leading SRE, Production Management or Software Engineering teams.
- The ability to influence and initiate SRE practices across development, operations and product groups to align technology service and solution delivery.
- The ability to define and report progress on strategic initiatives and project-level tasks to all stakeholders, including senior executives and clients, communicating effectively with each audience.
- A record of driving accountability for quality across the organisation, with well-defined processes, metrics and goals that ensure service level agreements are met.
- Experience driving capacity planning, performance analysis, instrumentation and other non-functional system requirements.
- Experience operating and implementing distributed & highly concurrent service-based architectures, including microservices, containerized services, and/or serverless architectures.
- The ability to manage the availability, latency, scalability and efficiency of our solutions, with a focus on fault-tolerant approaches.
- Proven experience in implementing advanced observability and alerting at scale.
- An excellent understanding of managing a production incident, through to Root Cause Analysis (RCA) or post-mortem and implementation of the RCA outcomes.
The following expertise is also desirable:
- Hands-on experience with container management and orchestration (using tools such as Rancher and Kubernetes).
- Proven experience in maintaining scalability and resiliency of a complex environment.
- ITIL certification.
- Experience of the Microsoft Azure platform for DevOps and monitoring.
- Experience of GitOps, Flux and Flagger for progressive delivery.