Understanding Watchdog Timers in PLCs


A Watchdog Timer (WDT) is a critical component of embedded systems, particularly Programmable Logic Controllers (PLCs), that protects system stability by monitoring operational integrity. This article covers the purpose, operation, and types of watchdog timers, with a focus on PLCs and Real-Time Operating Systems (RTOS), and answers common questions.

What is a Watchdog Timer?

A watchdog timer is an electronic timer that monitors system processes to confirm they are running properly. If the watchdog detects a malfunction, such as a software error, an endless loop, or a hardware failure, it initiates corrective steps such as a system reboot. This mechanism allows the system to return to a safe state after a failure and preserves operational integrity.

What is a PLC watchdog timer?

In PLCs, watchdog timers are most commonly used to supervise the program scan cycle. They monitor how long the CPU takes to read and execute the control logic. If the scan cycle exceeds a specified time, the watchdog timer flags an error and suspends the PLC’s functions until corrective action is taken.

A watchdog timer can be added to PLC- or computer-controlled systems to improve safety and reliability. This separate, independent timer is triggered at regular intervals by the control system. Under normal operation, the PLC continuously retriggers the timer. If the timer is not retriggered within a preset time, it indicates a malfunction, and the timer initiates corrective actions, such as shutting down equipment or switching to a safe state.
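The retrigger-or-timeout behaviour described above can be sketched in a few lines of Python. This is a minimal simulation, not PLC code: the class name Watchdog, the helper first_trip_time, and the millisecond clock passed in by the caller are all assumptions made for illustration.

```python
class Watchdog:
    """Timeout watchdog driven by a clock value (in ms) supplied by the caller."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_kick_ms = 0
        self.tripped = False

    def kick(self, now_ms):
        """Retrigger the timer. The control system calls this while healthy."""
        if not self.tripped:
            self.last_kick_ms = now_ms

    def check(self, now_ms):
        """Return True once the preset time has passed without a retrigger."""
        if now_ms - self.last_kick_ms > self.timeout_ms:
            self.tripped = True  # latched until the system is restarted
        return self.tripped


def first_trip_time(kick_times_ms, timeout_ms, end_ms):
    """Step a 1 ms clock and return when the watchdog trips, or None."""
    wd = Watchdog(timeout_ms)
    kicks = set(kick_times_ms)
    for now in range(end_ms + 1):
        if wd.check(now):
            return now  # here a real system would switch to its safe state
        if now in kicks:
            wd.kick(now)
    return None
```

With retriggers every 100 ms and a 150 ms timeout, first_trip_time never reports a trip. If the retriggers stop after 200 ms, it returns 351, the first millisecond at which more than 150 ms have passed since the last retrigger.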

What is the purpose of a watchdog timer?

Functions of Watchdog Timers in PLCs

System Monitoring

The watchdog timer tracks the duration of the program’s scan cycle. Any delay or deviation from the expected timing suggests a potential system fault.

Error Detection

The watchdog timer identifies hardware faults, code errors, and infinite loops that delay or disrupt execution. It flags these issues so that corrective action can be taken.

Automatic Recovery

When a fault is detected, the watchdog timer resets the CPU. The restart clears the hung state, so the system can recover and resume normal operation if the fault was transient.

Safety and Reliability

The watchdog timer helps the system operate safely and remain reliable over time by detecting system hangs and persistent errors and responding to them quickly.

Cycle Time and Watchdog Timers in PLCs

Scan Cycle in PLCs

A scan cycle is the time the CPU takes to execute the entire program, from processing the first network to the last.
The duration of the cycle is monitored and compared against a predefined limit to ensure efficient operation.

Cycle Time Monitoring

For instance, in Siemens PLCs, tools like TIA Portal provide diagnostics to monitor the cycle time.
If the cycle time exceeds the maximum allowable threshold (e.g., 6000 ms), the system identifies this as a fault.

Role of Watchdog Timers

The watchdog timer runs in the background, resetting its count after the successful completion of each scan cycle.
If an error occurs (e.g., due to a delay or fault) and the watchdog timer is not reset, it generates a timeout error.

Corrective Actions

When a timeout error is triggered, the watchdog timer initiates corrective actions, such as resetting the CPU or alerting the system, to restore normal operation.
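The scan-cycle supervision described in this section can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the function name run_scans and the list of scan durations are hypothetical, and the limit is passed in as a plain number rather than read from any real PLC configuration.

```python
def run_scans(scan_durations_ms, max_cycle_ms):
    """Process scans in order until one exceeds the cycle-time limit.

    Returns (completed_scans, fault_duration_ms). fault_duration_ms is None
    when every scan finished within the limit.
    """
    completed = 0
    for duration in scan_durations_ms:
        if duration > max_cycle_ms:
            # The watchdog was not reset in time: report a timeout error
            # and stop, as a PLC would before taking corrective action.
            return completed, duration
        completed += 1  # scan finished in time, so the watchdog count is reset
    return completed, None
```

Calling run_scans([20, 35, 7000], 6000) completes two scans and reports the 7000 ms scan as the fault, using the 6000 ms threshold mentioned above as the limit.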

Purpose and Role of Watchdog Timers in RTOS

Watchdog timers in Real-Time Operating Systems (RTOS) are crucial to system stability because they ensure that tasks run within their prescribed bounds. Key features include:

In RTOS systems, many processes execute concurrently, necessitating tools to monitor task execution.
Tasks frequently have strict timing constraints to ensure system responsiveness and correctness, which the watchdog timer helps enforce.

Functions of Watchdog Timers in RTOS

Monitors tasks to ensure they complete execution within their designated time frames.
Prevents task overruns or missed deadlines by taking corrective actions when anomalies occur.

Detects failures such as infinite loops, deadlocks, and unexpected task behavior.
Automatically resets the system when critical failures are detected to restore normal operations.

Balancing Fault Detection and Stability by Watchdog Timers in RTOS

Configure the watchdog timeout to accommodate expected variations in task execution times and prevent resets due to minor timing deviations.
Adjust the watchdog to tolerate transient delays caused by resource contention or temporary high CPU utilization.
Calibrate the watchdog to detect genuine system faults promptly and ensure timely intervention.
Fine-tune the watchdog sensitivity to balance fault detection with overall system stability.
Regularly test and monitor the system to optimize watchdog configuration for reliability and robustness.


Implementation Considerations for Watchdog Timers in RTOS

Timeout Configuration

Define timeout intervals carefully based on the criticality of tasks and the expected execution times for each task.
Use dynamic timeouts for tasks whose execution times vary.

Task Heartbeats

Have each task send periodic signals (heartbeats) to the watchdog to show that it is running as intended.
Configure the watchdog to reset the system if no heartbeats are detected within the timeout period.
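A heartbeat scheme with per-task timeouts might look like the following Python sketch. The class HeartbeatMonitor, its method names, and the task names used in the usage note are assumptions for illustration and do not correspond to any particular RTOS API.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat of each task against its own timeout (ms)."""

    def __init__(self, timeouts_ms):
        # timeouts_ms maps a task name to its allowed silence in milliseconds.
        self.timeouts_ms = dict(timeouts_ms)
        self.last_beat_ms = {name: 0 for name in self.timeouts_ms}

    def beat(self, task, now_ms):
        """Record a heartbeat from a task."""
        self.last_beat_ms[task] = now_ms

    def overdue(self, now_ms):
        """Return the sorted names of tasks whose heartbeat is too old."""
        return sorted(
            name
            for name, last in self.last_beat_ms.items()
            if now_ms - last > self.timeouts_ms[name]
        )

    def may_kick_watchdog(self, now_ms):
        """Only retrigger the system watchdog while every task is alive."""
        return not self.overdue(now_ms)
```

For example, with timeouts of 50 ms for a "sensor" task and 100 ms for a "control" task, and both tasks last heard from at 40 ms, overdue(100) returns ["sensor"]. A supervisor that checks may_kick_watchdog before retriggering lets the system watchdog expire, which produces the reset described above.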

Hierarchical Monitoring

Set up multiple watchdog timers to monitor different levels of the system:

Task-Level Monitoring: Each critical task has its dedicated watchdog.
Subsystem-Level Monitoring: Groups of tasks inside a subsystem are monitored together.
System-Level Monitoring: Maintains general health by supervising the entire RTOS system.

Use hierarchical watchdog timers for:

Granular fault detection.
Isolated recovery mechanisms to handle specific faults without affecting the entire system.
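One way to picture hierarchical monitoring is a function that maps overdue tasks to the narrowest recovery scope. The sketch below is an assumption-laden illustration: the function plan_recovery, the action names, and the rule "restart a subsystem only when all of its tasks are overdue" are hypothetical choices, not a standard policy.

```python
def plan_recovery(overdue_tasks, subsystems):
    """Choose the narrowest recovery scope for a set of overdue tasks.

    subsystems maps a subsystem name to the list of its task names.
    Returns a list of (action, target) tuples.
    """
    overdue = set(overdue_tasks)
    actions = []
    for name, tasks in subsystems.items():
        failed = [task for task in tasks if task in overdue]
        if failed and len(failed) == len(tasks):
            # Every task in the subsystem is silent: recover it as a unit.
            actions.append(("restart_subsystem", name))
        else:
            # Isolated fault: restart only the affected tasks.
            actions.extend(("restart_task", task) for task in failed)
    all_subsystems_down = (
        len(actions) == len(subsystems)
        and len(actions) > 0
        and all(action == "restart_subsystem" for action, _ in actions)
    )
    if all_subsystems_down:
        return [("reset_system", "all")]  # system-level watchdog response
    return actions
```

With subsystems {"motion": ["axis_x", "axis_y"], "comms": ["fieldbus"]}, a single overdue "axis_x" task leads to a task restart, both motion tasks overdue leads to a subsystem restart, and all three tasks overdue leads to a full system reset.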

Watchdog Integration with RTOS Scheduler

Integrate watchdog timers into the RTOS scheduler to:

Ensure seamless task monitoring.
Avoid timing conflicts with task scheduling.

Recovery Strategy Design

Define recovery mechanisms that include:

Graceful restarts for individual tasks.
Full system reboots for critical failures.
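The two recovery mechanisms can be combined in an escalation policy, sketched below in Python. The class RecoveryPolicy, the max_restarts parameter, and the rule that repeated task restarts escalate to a reboot are assumptions added for illustration.

```python
class RecoveryPolicy:
    """Restart non-critical tasks gracefully; reboot for critical failures."""

    def __init__(self, critical_tasks, max_restarts=3):
        self.critical_tasks = set(critical_tasks)
        self.max_restarts = max_restarts  # hypothetical escalation threshold
        self.restart_counts = {}

    def on_timeout(self, task):
        """Return the action to take when a task's watchdog expires."""
        if task in self.critical_tasks:
            return "reboot_system"
        count = self.restart_counts.get(task, 0) + 1
        self.restart_counts[task] = count
        if count > self.max_restarts:
            # The task keeps failing: treat it as a critical failure.
            return "reboot_system"
        return "restart_task"

    def on_healthy(self, task):
        """Clear the failure history once a task runs normally again."""
        self.restart_counts.pop(task, None)
```

Keeping a restart count per task avoids an endless restart loop for a task that can never recover on its own, while a single transient fault costs only one task restart.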

Testing and Validation

Simulate a range of failure scenarios to confirm that watchdog timers detect and respond to them correctly.
Check that watchdog configurations are consistent with system timing and fault tolerance requirements.

Types of Watchdog Timers

Hardware Watchdog Timers

Integrated into microcontrollers or external components.
Operate independently of the main processor.
Reliable for detecting and handling hardware and software faults.

Software Watchdog Timers

Implemented within the system’s software.
Provide flexibility but depend on the processor’s functionality.


Windowed Watchdog Timers

Require the timer to be reset within a specific time window.
Prevent premature or delayed resets, ensuring compliance with defined operational parameters.
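A windowed watchdog can be modelled as a timer that rejects resets arriving too early as well as too late. The following Python sketch assumes a caller-supplied millisecond clock. The class WindowedWatchdog and its window_open_ms and window_close_ms parameters are illustrative names, not a vendor API.

```python
class WindowedWatchdog:
    """Accepts a reset only between window_open_ms and window_close_ms
    after the previous accepted reset."""

    def __init__(self, window_open_ms, window_close_ms):
        if not 0 <= window_open_ms < window_close_ms:
            raise ValueError("window must satisfy 0 <= open < close")
        self.window_open_ms = window_open_ms
        self.window_close_ms = window_close_ms
        self.last_kick_ms = 0
        self.fault = None  # None, "early", or "late" (latched)

    def kick(self, now_ms):
        """Try to reset the timer. Returns True if the reset was accepted."""
        elapsed = now_ms - self.last_kick_ms
        if self.fault is None:
            if elapsed < self.window_open_ms:
                self.fault = "early"  # premature reset
            elif elapsed > self.window_close_ms:
                self.fault = "late"  # delayed reset
            else:
                self.last_kick_ms = now_ms
        return self.fault is None

    def expired(self, now_ms):
        """Return True if a fault is latched or the window has closed."""
        if self.fault is None and now_ms - self.last_kick_ms > self.window_close_ms:
            self.fault = "late"
        return self.fault is not None
```

With a window of 50 ms to 150 ms, a reset at 100 ms is accepted, but a second reset only 20 ms later is rejected as "early". This is how a windowed watchdog catches code that is stuck in a fast loop that still happens to reset the timer.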

Timeout Watchdog Timers

Trigger a system reset if not reset within a predefined timeout period.
Effective for detecting and recovering from system hangs.

Key Considerations for Watchdog Timer Settings

Setting the correct timeout for a watchdog timer is critical for its effectiveness. Factors to consider include:

Choose a timeout period that accounts for the worst-case execution time of tasks or scan cycles to prevent unnecessary resets.
Ensure the timeout period is longer than the longest normal interval between resets, not just the shortest, to avoid premature resets.
Align the timeout settings with system stability requirements to maintain functionality during periods of inactivity.
Factor in the restart time required for the system to reboot and resume normal operations after a reset.


Example: Watchdog Timer Settings

For a PLC controlling a heating appliance with a control period of 100ms and execution times ranging from 40ms to 75ms:

Shortest reset interval: 65ms (40ms execution + 25ms wait time).
Longest reset interval: 135ms (60ms wait + 75ms execution).
Watchdog timeout setting: 135ms or slightly longer to avoid false resets while maintaining fault detection capability.
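The figures above follow from simple arithmetic, shown here as a short Python sketch. It assumes that each control cycle starts on a period boundary and that the watchdog is reset at the end of each execution. The function names and the 10 percent margin in the usage note are hypothetical.

```python
def reset_interval_bounds(period_ms, exec_min_ms, exec_max_ms):
    """Return (shortest, longest) time between consecutive watchdog resets."""
    # Shortest: a slow cycle leaves a short wait, then a fast cycle follows.
    shortest = (period_ms - exec_max_ms) + exec_min_ms
    # Longest: a fast cycle leaves a long wait, then a slow cycle follows.
    longest = (period_ms - exec_min_ms) + exec_max_ms
    return shortest, longest


def suggested_timeout(longest_ms, margin_percent):
    """Add an integer safety margin (rounded up) to the longest interval."""
    margin_ms = (longest_ms * margin_percent + 99) // 100
    return longest_ms + margin_ms
```

For the heating example, reset_interval_bounds(100, 40, 75) returns (65, 135), matching the shortest and longest intervals listed above. suggested_timeout(135, 10) returns 149, one possible reading of "slightly longer" than 135ms.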

In RTOS environments, the watchdog settings should accommodate task scheduling variations, ensuring a balance between prompt fault detection and maintaining operational stability.


FAQ: Frequently Asked Questions

What is a watchdog timer used for?

A watchdog timer is used to monitor system operations, detect malfunctions, and initiate corrective actions to ensure the system remains operational and safe.

What is the role of the watchdog timer?

The role of the watchdog timer is to:

Monitor program execution.
Detect anomalies such as infinite loops or unresponsive tasks.
Trigger system resets or other corrective actions to restore normal operations.

What is the function of the watchdog?

The function of a watchdog is to:

Ensure the system operates within expected parameters.
Detect and recover from faults autonomously.
Enhance system reliability and safety.

What is the watchdog timer in RTOS?

In RTOS, the watchdog timer ensures tasks execute within their designated periods and maintains system responsiveness. It acts as a safeguard against task overruns and unresponsiveness, initiating resets if necessary.

How many types of watchdog timers are there?

There are four main types of watchdog timers:

Hardware Watchdog Timers
Software Watchdog Timers
Windowed Watchdog Timers
Timeout Watchdog Timers

What is the limit of a watchdog timer?

The limit of a watchdog timer refers to the maximum permissible time period within which the system must reset (or “kick”) the timer to avoid triggering a fault condition. If the timer is not reset within this period, it assumes a malfunction and initiates corrective actions.

For example, if the maximum permissible watchdog timer value for a given control system is 380 milliseconds, the watchdog timer period should be set to the longest available value that is shorter than 380 milliseconds. This keeps the control system stable and prevents loss of functionality.

This limit is typically determined based on:

The system’s operational characteristics.
The longest expected cycle or response time during normal operation.

Why is a watchdog timer used?

A watchdog timer is used to enhance the reliability of a system by monitoring its operation and automatically resetting it if the software freezes or hangs. This ensures that the system can recover from unexpected failures without manual intervention.

How does a watchdog timer work?

The watchdog timer operates as an independent timer that must be reset (or “kicked”) regularly by the system during normal operation. If the system fails to reset the timer within a specified period, the timer assumes a malfunction and initiates corrective actions, such as resetting the processor or triggering a safe state.

Why is it important to use a watchdog timer?

It detects software or system malfunctions, such as freezes or crashes.
It ensures automatic recovery and minimizes downtime.
It improves system reliability and safety by preparing for unexpected failures.

While software is not designed to fail, a watchdog timer ensures the system remains operational and prepared for unforeseen issues.
