Job Title : Incident & Service Reliability Manager
Position type : Full time
Place of work : Bangkok, Sathorn district
Salary : Negotiable
Working conditions : Working conditions are normal for an office environment.
Department/Function : IT Service Delivery & CMC
Reporting to Title : Head of IT Service Delivery & CMC
The company:
BRED IT (Thailand) Ltd. is a wholly owned subsidiary of the French bank BRED Banque Populaire, based in Paris.
BRED IT was established in 2008 to become an IT hub and deliver IT operations and support for BRED Group commercial banks in Southeast Asia and the Pacific.
Today, it supports Banque Franco Lao in Laos, BRED Bank Cambodia, BRED Bank Vanuatu, BRED Bank Solomon Islands, BRED Bank Fiji and Banque pour le commerce et l'industrie Mer Rouge (BCIMR) in Djibouti (Africa).
BRED IT provides end-to-end infrastructure and application management around Core Banking, Internet Banking and E-Payments.
BRED IT has also operated an offshore development center (specialized in COBOL & Java) for the Paris headquarters since 2011.
We are a unique company, thanks to our identity and our history: we place our expertise at the service of BRED Group and develop our activities with an entrepreneurial structure. Putting BRED Group's best interests first allows us to deliver tailor-made, high value-added solutions.
Role Purpose:
As Major Incident & Service Reliability Manager, you are the central point of leadership during critical incidents and a key driver of service performance and availability across BRED IT's international perimeter.
You will:
- Lead major incidents end-to-end, from impact assessment to resolution and communication.
- Coordinate cross-functional technical teams, ensuring efficient investigations and sustainable fixes.
- Drive continuous improvement, strengthening processes, monitoring, and overall service reliability.
- Act as a clear, trusted communicator, providing timely, structured updates to management and international business stakeholders.
Main Responsibilities:
1. Incident Management
- Lead and coordinate the response to major incidents across all supported entities.
- Clarify business impact and criticality with all relevant parties (business, IT, vendors).
- Define and organize workstreams to structure the incident resolution (roles, tasks, timelines).
- Ensure complete and accurate incident records (incident details, impact, timeline, actions).
- Produce and coordinate Root Cause Analysis (RCA) and follow-up actions with relevant teams.
- Clarify ticket ownership when responsibilities are unclear between teams.
- Summarize key facts in the Incident Report (IR) and pre-fill required fields.
- Measure and monitor ticket quality and timeliness.
- Take ownership of communication during major incidents, providing concise, regular, and transparent updates to all stakeholders (management, banks, internal IT teams).
- Work closely with the Control and Monitoring Center (24/7) to continuously improve incident response and communication.
- Define, maintain, and publish KPIs on incident response, resolution times, and reporting quality.
- Report on major incident management in monthly reports and quality committees with the banks.
2. Problem Management
- Ensure that, for every major incident, corresponding problems are raised to address root causes.
- Follow up on problems with the relevant teams until permanent fixes are implemented.
- Track and report to management on problem creation, status, and resolution.
- Promote a no-recurrence mindset, focusing on structural improvements rather than workarounds.
3. Monitoring & Observability Management
- Ensure that all critical services (infrastructure and applications) are properly monitored.
- In collaboration with business and technical teams, design and improve alerts (e.g. Zabbix, Splunk) to:
  - Detect incidents early.
  - Provide meaningful information (business impact, procedures, escalation).
- Maintain and improve monitoring processes and procedures, ensuring alignment with best practices.
- Act as a key stakeholder in shaping the overall observability strategy (metrics, logs, alerts, dashboards).
4. Governance, Controls & Reporting
- Respond to controls and audits related to CMC activities (internal, external, regulatory).
- Produce clear, data-driven reports on incident and problem management for:
  - Internal IT management
  - BRED SA and international banks
- Contribute to quality committees, service reviews, and continuous improvement workshops.
5. Business Continuity & DR (Disaster Recovery)
- Contribute to the planning, organization, coordination, and reporting of DR test activities.
- Provide feedback from incidents and problems to improve DR scenarios, plans, and procedures.
Candidate profile:
You are a hands-on leader with strong IT operations experience, excellent coordination and communication skills, and a passion for service reliability.
Core Skills & Competencies
- Excellent incident leadership: can manage complex, high-pressure IT incidents with calm and structure.
- Strong problem-solving and analytical skills; able to quickly understand technical issues and business impact.
- Excellent coordination skills for multi-team technical investigations.
- Strong judgment and decision-making, with a clear sense of priority and urgency.
- High level of initiative, ownership, and reliability; proactive in preventing issues, not only reacting to them.
- Ability to work autonomously and take decisions within the scope of the role.
- Solid understanding of IT governance and operations (ITIL or similar frameworks).
- Strong technical acumen: good understanding of IT systems and operations and willingness to continuously learn new technologies.
- Good business acumen: understands the impact of incidents on costs, customer experience, and production targets.
- Fast learner, self-driven, highly motivated, with a strong can-do attitude.
- Excellent organizational skills, rigor, and attention to detail.
- Proactive, responsive, and disciplined, with a strong sense of service.
- Demonstrated team spirit and ability to build strong relationships with technical and business teams.
- Proven leadership skills, especially in cross-functional and international contexts.
Nice to have / Optional skills :
- Knowledge of network concepts.
- Experience with virtualization technologies.
- Familiarity with Linux operating systems.
- Exposure to container platforms (e.g. Docker, Kubernetes, OpenShift).
- Understanding of Microsoft technologies (Windows Server, Active Directory, etc.).
Education
- Minimum Bachelor's Degree in Computer Science/Engineering or equivalent experience.
Language skills
- English: full professional proficiency (able to work with BRED SA and BRED international banks).