On-call policies are organizational guidelines and procedures that define how employees are designated, scheduled, and compensated for being available to work outside of their regular working hours.

These policies are particularly relevant in industries and roles where continuous operations, rapid issue resolution, or emergency response are essential.

Providing on-call support can be stressful at times, especially at fast-scaling companies whose services and customer base are constantly evolving.

To support this, having robust on-call compensation packages and a well-defined on-call policy in place is essential. This ensures that your on-call staff remains motivated while undertaking an undoubtedly challenging role.

There are several models adopted by companies to compensate for on-call work. We spoke to a number of companies and some of our customers and compiled some of the most popular on-call comp models that you can explore within your organization.

But before that, let's understand why an on-call policy is important.

Understanding On-Call Policies

On-call policies outline the expectations, responsibilities, and compensation structures associated with employees being on-call, ensuring a systematic and effective approach to addressing critical issues or emergencies that may arise beyond the standard workday.

These policies help strike a balance between meeting business needs, maintaining service levels, and respecting employees' well-being by providing clear frameworks for on-call rotations, communication protocols, and compensation strategies.

Key Components to Consider When Creating On-Call Policies

Developing effective on-call policies is crucial to ensure that on-call teams operate smoothly, respond promptly to incidents, and maintain a healthy work-life balance for team members.

Here are key components to consider when creating on-call policies:

Rotation Schedules:

Establish a fair and predictable on-call rotation schedule. Ensure that team members have advance notice of their on-call shifts, and distribute the workload evenly to prevent burnout.

Defined On-Call Hours:

Clearly outline the hours during which on-call responsibilities apply. Specify start and end times for on-call shifts, and ensure that team members understand when they are expected to be available.

Clear Roles and Responsibilities:

Clearly define the roles and responsibilities of on-call team members. Specify who is on-call, what their duties entail, and how escalations will be handled.

Communication Protocols:

Establish communication protocols for notifying on-call staff about incidents. Specify the channels to be used (e.g., phone, messaging apps) and the information that should be included in incident notifications.

Workload Management:

Implement measures to manage workload and prevent on-call fatigue. Consider factors such as the frequency of on-call shifts, the intensity of the workload, and opportunities for downtime between shifts.

Regular Review and Updates:

Periodically review on-call policies to ensure they remain effective and relevant. Update policies based on feedback, changes in technology, or organizational shifts.
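
The rotation and workload points above can be made concrete with a small script. The following is a minimal sketch, assuming weekly shifts and a simple round-robin order; the team names, the `build_rotation` and `shifts_per_person` helpers, and the start date are all hypothetical and not taken from any particular scheduling tool.

```python
from datetime import date, timedelta


def build_rotation(engineers, start, weeks):
    """Assign one primary and one secondary per weekly shift, round-robin."""
    if len(engineers) < 2:
        raise ValueError("need at least two engineers for primary and secondary")
    schedule = []
    for week in range(weeks):
        primary = engineers[week % len(engineers)]
        # The secondary is next in line, so they become primary the following week.
        secondary = engineers[(week + 1) % len(engineers)]
        shift_start = start + timedelta(weeks=week)
        schedule.append({
            "start": shift_start,
            "end": shift_start + timedelta(days=7),
            "primary": primary,
            "secondary": secondary,
        })
    return schedule


def shifts_per_person(schedule):
    """Count primary shifts per engineer to check that the load is even."""
    counts = {}
    for shift in schedule:
        counts[shift["primary"]] = counts.get(shift["primary"], 0) + 1
    return counts


team = ["asha", "ben", "chen", "dana"]  # hypothetical team members
rotation = build_rotation(team, date(2024, 1, 1), weeks=8)
for shift in rotation:
    print(shift["start"], shift["primary"], shift["secondary"])
print(shifts_per_person(rotation))  # {'asha': 2, 'ben': 2, 'chen': 2, 'dana': 2}
```

Publishing the generated schedule well ahead of time gives people the advance notice described above, and the per-person count is a quick check that nobody is carrying more than their share.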

What is On-Call Pay and How Does It Work?

On-call pay is a form of compensation provided to employees who are required to be available to work beyond their regular working hours, typically during evenings, weekends, or holidays.

This compensation acknowledges the inconvenience and dedication of employees who must be ready to respond promptly to work-related issues during on-call periods.

On-call pay typically combines several of the following components:

Base Compensation:

Employees receive their regular salary or hourly pay for their standard working hours.

On-Call Premium:

On-call pay is often structured as an additional premium, offering extra compensation on top of the employee's regular rate.

Availability Stipend:

Some employers provide a fixed stipend simply for being available during on-call hours, regardless of whether the employee is called in to work.

Call-Back Pay:

If an employee is called to work during on-call hours, they may receive additional compensation known as call-back pay.

Minimum Hours:

Employers may guarantee a minimum number of hours of pay, even if the employee is not ultimately called in, as compensation for their readiness.

Overtime Considerations:

Hours worked while on-call may count toward overtime pay if they push the employee's total beyond standard working hours.
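
To show how these components can combine, here is a minimal sketch of a per-shift calculation. Everything in it is an assumption for illustration: the `OnCallPayPolicy` fields, the sample rates, and the choice to apply the premium multiplier to call-back hours and to guarantee the minimum hours once per shift. Real policies and local labor law may differ.

```python
from dataclasses import dataclass


@dataclass
class OnCallPayPolicy:
    hourly_rate: float          # regular hourly pay
    standby_stipend: float      # flat amount per shift for being available
    callback_multiplier: float  # premium applied to paid call-back hours
    minimum_paid_hours: float   # hours guaranteed per shift, even with no call-backs


def shift_pay(policy, callback_hours):
    """Extra pay for one on-call shift, given the hours of each call-back."""
    worked = sum(callback_hours)
    # Minimum-hours guarantee: pay for at least this many hours per shift.
    paid_hours = max(worked, policy.minimum_paid_hours)
    callback_pay = paid_hours * policy.hourly_rate * policy.callback_multiplier
    return policy.standby_stipend + callback_pay


# Hypothetical numbers, not a recommendation.
policy = OnCallPayPolicy(
    hourly_rate=50.0,
    standby_stipend=100.0,
    callback_multiplier=1.5,
    minimum_paid_hours=2.0,
)

print(shift_pay(policy, []))          # quiet shift: 250.0
print(shift_pay(policy, [0.5, 3.0]))  # two call-backs: 362.5
```

Base salary is left out on purpose, since it is paid regardless of on-call status.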

Factors Influencing Average On-Call Pay:

Geographic Location:

The cost of living and economic conditions in a specific location affect on-call pay rates.

Job Role and Seniority:

The level of responsibility and seniority often affects on-call pay rates, with higher-ranking positions generally receiving higher compensation.

Industry Standards:

Some industries have established norms for on-call pay, influencing what employers are willing to offer to remain competitive.

Company Policy:

Company policies and financial considerations influence how on-call pay is structured and implemented.

Legal Requirements:

Employment laws and regulations may set minimum standards for on-call pay in certain jurisdictions.

Frequency and Intensity:

The frequency and intensity of on-call duties can influence the overall pay structure, with more demanding on-call schedules often warranting higher compensation.

On-Call Pay Rates and Structures:

To help you understand on-call pay rates and structures, let's look at a few models that organizations commonly use.

Time-off comp

Provide an extra half-day off for each week someone is on primary on-call, and generally lower the “productive work” expectation for the person on call. Companies should cultivate a good management culture by making sure people take some extra time off after particularly rough shifts.

Of course, time off must be accounted for during the project planning phase (so that a day off does not simply mean less time to get your work done).

Monetary comp

Weekly or daily compensation for primary and secondary on-call engineers. Comp can be a flat amount for the entire week (irrespective of the number of incidents) or extra pay for actual incident work.

If you can map out your on-call schedules a year or a few quarters in advance, on-call comp can also be folded into annual compensation, with a fixed number of weeks, days, or hours stipulated in the employment contract.

Hybrid comp

Providing both monetary compensation (weekly or daily) and time-off compensation.

At companies with unlimited PTO, people often prefer the monetary component over the time-off component.

Let's take Google’s on-call comp structure for example:

For any hour outside of 08:00-18:00 your local time, where you are on-call:

- If your response SLA is 30 (or 60) minutes or less, you get 1/3 time-in-lieu

- If your response SLA is 5 minutes or less, you get 2/3 time-in-lieu

- Time-in-lieu can be used for vacation, or it can be paid out to you at the end of the quarter.

- There is a hard cap on accrual of 80 hours per quarter.

So, if you're on a team with a strict response time, it would not be uncommon to accrue up to 8 additional weeks of vacation or pay per year (80 hours × 4 quarters = 320 hours, or eight 40-hour weeks).
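
The accrual rules above can be worked through in a few lines. This is a sketch under stated assumptions, not a description of any company's actual payroll logic: the function names are hypothetical, SLAs looser than 60 minutes are assumed to accrue nothing, and every hour outside 08:00-18:00 is counted literally, which gives 14 eligible hours per calendar day.

```python
from fractions import Fraction

QUARTERLY_CAP_HOURS = 80  # hard cap described above
OFF_HOURS_PER_DAY = 14    # hours outside 08:00-18:00 in a 24-hour day


def accrual_rate(response_sla_minutes):
    """Time-in-lieu accrued per eligible on-call hour."""
    if response_sla_minutes <= 5:
        return Fraction(2, 3)
    if response_sla_minutes <= 60:
        return Fraction(1, 3)
    return Fraction(0)  # assumption: looser SLAs accrue nothing


def quarterly_time_in_lieu(eligible_hours, response_sla_minutes):
    """Hours of time-in-lieu earned in a quarter, after applying the cap."""
    accrued = eligible_hours * accrual_rate(response_sla_minutes)
    return min(accrued, QUARTERLY_CAP_HOURS)


one_week = 7 * OFF_HOURS_PER_DAY  # 98 eligible hours in a full week on-call

print(round(float(quarterly_time_in_lieu(one_week, 5)), 1))   # 65.3
print(round(float(quarterly_time_in_lieu(one_week, 30)), 1))  # 32.7
print(quarterly_time_in_lieu(2 * one_week, 5))                # 80 (capped)
print(4 * QUARTERLY_CAP_HOURS / 40)                           # 8.0 forty-hour weeks per year
```

With a 5-minute SLA, a second full week on-call in the same quarter already hits the cap, which is where the figure of 8 weeks per year comes from.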

On-Call Support Models for Different Scenarios

Choosing the right on-call support model depends on factors like your organization's size, service complexity, and volume of incidents. Here are some common models with their strengths and weaknesses for different scenarios:

1. Centralized Ops Team:

  • Strengths: Ideal for smaller organizations with limited resources, provides central expertise, and simplifies escalation paths.
  • Weaknesses: May struggle with diverse services or high volume, potential bottlenecks in expertise.
  • Scenarios: Suitable for startups, smaller companies with core services, or those with predictable incident volume.

2. Service/Dev Teams On-Call:

  • Strengths: Teams have deep knowledge of their specific services, faster resolution times, and increased ownership.
  • Weaknesses: Requires mature teams and efficient knowledge sharing, may not be feasible for 24/7 coverage.
  • Scenarios: Ideal for large organizations with diverse services, mature development teams, and manageable incident volume.

3. Dedicated Teams:

  • Strengths: High level of expertise and ownership for specific products, proactive monitoring and incident prevention.
  • Weaknesses: Can be resource-intensive for smaller organizations, potential siloing and duplication of effort.
  • Scenarios: Suitable for large organizations with critical, complex products demanding constant attention and proactive maintenance.

4. Hybrid Models:

  • Strengths: Combines centralized expertise with team ownership, offers flexibility based on service needs and resource availability.
  • Weaknesses: Requires careful planning and coordination, potential for confusion with overlapping responsibilities.
  • Scenarios: Ideal for organizations with diverse services and varying complexity, looking for a balance between centralized and distributed models.

Now that we have covered on-call pay structures and support models, let's look at the role software engineers play in on-call.

Role of Software Engineers in On-Call


Software engineers play a vital role in on-call management, contributing in several capacities:

Diagnosis Detectives: Identify software issues using their code knowledge.

Decision-Makers: Prioritize and solve critical issues swiftly to minimize downtime.

Communication Responsibilities: Collaborate with teams, keeping everyone informed.

Continuous Improvement: Learn from incidents, automate tasks, and enhance system stability.

Problem-Solving: Apply their knowledge and expertise to tackle new challenges.

On-Call Practices at Leading Tech Companies

Google:

  • Culture of Ownership: The "single pager" model enables engineers to take ownership of specific areas, promoting accountability and in-depth product knowledge.
  • Blameless Post-Mortems: Prioritize learning over blaming, and encourage open communication and continuous improvement.
  • Disaster Recovery Testing (DiRT): Deliberately introducing failures tests the resilience of a system and helps teams prepare for the unexpected. (Chaos Monkey itself is a Netflix tool.)
  • Global On-Call Pods: Deploying distributed pods globally ensures around-the-clock coverage.

Amazon:

  • Sustainable Scheduling: Fair rotation schemes and overlapping shifts prevent burnout and maintain consistent support.
  • Advanced Monitoring: Proactive detection and identification of potential issues before they escalate.
  • Single Pager Model: Similar to Google, engineers carry one pager for their area of expertise, promoting responsibility and ownership.
  • GameDays: Like Google, Amazon deliberately injects failures to stress-test its systems.

Netflix:

  • Incident Commander Model: A designated leader takes charge during major incidents, streamlining communication and decision-making.
  • Automated Playbooks: Scripted responses for common issues ensure swift and consistent resolution, freeing up engineers for complex problems.
  • Slack-First Communication: Relying on an open platform fosters collaboration and transparency across teams.
  • Netflix Engineering Culture: Strong emphasis on ownership, communication, and proactive problem-solving permeates the on-call practices.

Conclusion

Compensation in itself might not help with “feeling ownership to solve problems” directly. It does, however, surface the cost to the company in a way that is easily explained to management, and the payouts make it easier to spread the on-call load around.

On-call is a heavy mental tax because of the vast number of services we support; your brain is fried after that week. Most companies realize that the mental and physical strain of on-call should be compensated.

Having said that, irrespective of your comp model, it is important that you have regular sessions with your on-call staff and keep track of their physical and mental health, and most importantly, aggressively tune your on-call alerts to minimize non-critical pages during non-business hours.

Heaven forbid someone YOLO-mutes a spurious monitor at some ungodly hour.
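
The alert-tuning advice above can be sketched as a simple routing rule. This is only an illustration: the severity labels, the 08:00-18:00 business window, and the `route_alert` function are assumptions, not the behavior of any specific alerting product.

```python
from datetime import time

BUSINESS_START = time(8, 0)   # assumed business hours, local time
BUSINESS_END = time(18, 0)


def route_alert(severity, local_time, is_weekend=False):
    """Decide whether an alert pages someone, notifies a channel, or waits."""
    in_business_hours = (not is_weekend) and BUSINESS_START <= local_time < BUSINESS_END
    if severity == "critical":
        return "page"    # critical issues always page the on-call engineer
    if in_business_hours:
        return "notify"  # non-critical alerts go to the team channel during the day
    return "queue"       # otherwise hold until the next business day


print(route_alert("critical", time(3, 0)))   # page
print(route_alert("warning", time(3, 0)))    # queue
print(route_alert("warning", time(10, 30)))  # notify
```

Reviewing which alerts ended up in the queue each week is one way to spot monitors that should be downgraded or removed instead of muted at 3 a.m.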

Why are on-call policies crucial for businesses?

On-call policies are crucial for ensuring uninterrupted operations and prompt issue resolution, contributing to enhanced customer satisfaction and supporting the well-being of on-call staff in a 24/7 business environment.

What are the tips for creating an effective IT on-call policy?

To create an effective IT on-call policy, clearly define roles and responsibilities during on-call periods, and establish a fair rotation schedule to prevent burnout among team members. Regularly review and update the policy based on feedback and evolving business needs for optimal performance.

What factors influence different on-call pay rates?

Various factors influence on-call pay rates, including industry standards, the level of expertise required, the frequency and urgency of on-call duties, and the overall compensation structure of the organization. Additionally, regional cost of living and market demand for specific skills can also play a role in determining on-call pay rates.

What are the industry standards for compensating on-call employees?

Industry standards for compensating on-call employees vary across sectors and regions. Generally, they are influenced by factors such as the nature of the job, the level of expertise required, and the prevailing market conditions.

How can organizations define procedures and responsibilities for on-call teams?

Organizations can define on-call procedures and responsibilities by creating clear documentation, conducting regular training, establishing fair rotation schedules, defining efficient communication channels, outlining escalation paths, and conducting periodic reviews for updates.

What are some examples of on-call support models?

  1. Primary On-Call: One designated person is responsible for handling issues during a specific time, with others available as backups.
  2. Follow-the-Sun: Support is handed off between global teams based on the time of day, ensuring 24/7 coverage.
  3. Shared On-Call: Responsibility is distributed among team members, each taking turns in an on-call rotation.
  4. Tiered On-Call: Different levels of on-call support exist, with more experienced personnel being called in for complex issues.
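
Two of these models, follow-the-sun and tiered on-call, lend themselves to a short sketch. The regional windows, tier names, and escalation thresholds below are hypothetical values chosen for illustration; a real setup would take them from the team's own policy.

```python
from datetime import datetime, timezone

# Hypothetical regional coverage windows as UTC hours: [start, end)
FOLLOW_THE_SUN = [
    (0, 8, "APAC"),
    (8, 16, "EMEA"),
    (16, 24, "AMER"),
]

# Hypothetical tiers: minutes a page may stay unacknowledged before reaching each tier
ESCALATION_TIERS = [
    (0, "tier-1 primary"),
    (15, "tier-2 secondary"),
    (30, "tier-3 senior engineer"),
]


def region_on_call(now_utc):
    """Follow-the-sun: pick the region whose window contains the current UTC hour."""
    for start, end, region in FOLLOW_THE_SUN:
        if start <= now_utc.hour < end:
            return region
    raise ValueError("coverage gap in FOLLOW_THE_SUN windows")


def current_tier(minutes_unacknowledged):
    """Tiered on-call: the highest tier whose threshold has been reached."""
    tier = ESCALATION_TIERS[0][1]
    for threshold, name in ESCALATION_TIERS:
        if minutes_unacknowledged >= threshold:
            tier = name
    return tier


incident_time = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
print(region_on_call(incident_time))  # EMEA
print(current_tier(0))                # tier-1 primary
print(current_tier(20))               # tier-2 secondary
```

Keeping the windows and thresholds in plain data structures makes them easy to review alongside the written policy.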

What are the primary responsibilities of software engineers during on-call shifts?

During on-call shifts, software engineers diagnose and prioritize issues, communicate with teams, continuously improve processes, and adapt to challenges for system stability.