US20030226056A1 - Method and system for a process manager - Google Patents
- Publication number
- US20030226056A1 (application US 10/157,567)
- Authority
- US
- United States
- Prior art keywords
- behavior
- count
- thread
- killing
- abnormal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/0757—Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0715—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a system implementing multitasking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/076—Error or fault detection not based on redundancy by exceeding limits by exceeding a count or rate limit, e.g. word- or bit count limit
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/02—Standardisation; Integration
- H04L41/0213—Standardised network management protocols, e.g. simple network management protocol [SNMP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
Definitions
- the invention relates to the field of network management.
- the invention relates to a process manager that monitors the health of processes running on a router.
- a method and system is provided to monitor the health of processes and to kill the process when the process is unhealthy.
- the system starts a process, monitors a behavior of the process, and kills the process when the behavior is abnormal.
- the behavior is abnormal when the process is non-responsive.
- the behavior of the process is monitored by setting a process timer after starting the process, registering the process and incrementing a timeout count when the process fails to register before the process timer expires, and polling the process and incrementing the timeout count when the process fails to respond to polling before the process timer expires.
- the process is killed when the timeout count exceeds a maximum number of timeouts.
- the behavior is abnormal when a thread of the process is non-responsive.
- the behavior of the process is monitored by setting up a thread timer for each thread of the process after registering the process, waiting for each thread to provide an updated status, and incrementing a corresponding thread timeout count when any of the threads fails to provide an updated status before the corresponding thread timer expires.
- the process is killed when any of the thread timeout counts exceeds a maximum number of timeouts.
- the behavior is abnormal when the process cannot start.
- the behavior of the process is monitored by setting a startup timer before starting the process and incrementing a startup count when the process fails to register before the startup timer expires.
- the process is killed when the startup count exceeds a maximum number of attempts at starting the process.
- the behavior is abnormal when the process repeatedly crashes.
- the behavior of the process is monitored by accumulating an amount of uptime for the process, incrementing a crash count when the process crashes, and calculating a crash rate from the crash count per the amount of uptime.
- the process is killed when the calculated crash rate exceeds a maximum crash rate.
- apparatus are provided to carry out the above and other methods.
- FIG. 1 is a block diagram illustrating one generalized embodiment of a process management system incorporating the invention, and the operating environment in which certain aspects of the illustrated invention may be practiced.
- FIG. 2 is a block diagram illustrating selected components of the process management system of FIG. 1 in further detail, in accordance with one embodiment of the invention.
- FIG. 3 is a block diagram illustrating a suitable computing environment in which certain aspects of the illustrated invention may be practiced.
- FIG. 4 is a flow diagram illustrating certain aspects of a method to be performed by a computer executing one embodiment of the illustrated invention.
- FIG. 5 is a flow diagram illustrating one embodiment of monitoring a behavior of a process in further detail.
- FIG. 6 is a flow diagram illustrating an alternative embodiment of monitoring a behavior of a process in further detail.
- FIG. 7 is a flow diagram illustrating another alternative embodiment of monitoring a behavior of a process in further detail.
- FIG. 8 is a flow diagram illustrating yet another alternative embodiment of monitoring a behavior of a process in further detail.
- FIG. 1 is a block diagram illustrating one generalized embodiment of a process management system 100 incorporating the invention, and the operating environment in which certain aspects of the illustrated invention may be practiced.
- the system 100 is shown to be operating in an active Route Processor (RP) card and a line card running multiple virtual routers.
- RP Active Route Processor
- the system 100 may include more components than those shown in FIG. 1. However, it is not necessary that all of these generally conventional components be shown in order to disclose an illustrative embodiment for practicing the invention.
- the process management system 100 operates on a router that has a process manager 102 .
- a typical router supports a number of applications that support protocols, network interfaces, and other components, the operation of which is maintained by the process manager 102 .
- the router includes applications to support the Border Gateway Protocol (BGP) 106 and the Open Shortest Path First protocol (OSPF) 108 .
- Border Gateway Protocol BGP
- OSPF Open Shortest Path First protocol
- Each application may also support one or more management interfaces, such as command line interface (CLI) 110 , Simple Network Management Protocol (SNMP), or Extensible Markup Language (XML) based management interfaces.
- CLI command line interface
- SNMP Simple Network Management Protocol
- XML Extensible Markup Language
- the management interfaces provide network administrators with access to the functions of the router and router applications using CLI commands, or SNMP or XML requests to update or access configuration of the process manager.
- FIG. 2 is a block diagram illustrating selected components of the process management system of FIG. 1 in further detail.
- the process manager 102 uses a configuration file 104 that allows various options of the process manager to be set.
- the configuration file 104 may be implemented in various formats, such as a flat file, a CLI, or an XML file.
- Process manager 102 includes a process monitor 202 to monitor a behavior of a process and a controller 204 to control the process and kill the process when the behavior is abnormal.
- the process manager 102 includes one or more timers 206 .
- One timer may measure a predetermined time interval for a process to perform a desired action, such as responding to polling or registering the process upon startup. Another timer may measure the amount of uptime for the process.
- the process manager 102 includes one or more counters 208 .
- One counter may count a number of times a process fails to perform the desired action during the predetermined time interval.
- the controller 204 kills a process when the counter exceeds a maximum number of failures.
- the process manager 102 may further include a calculator 210 to measure a crash rate from the number of times a process crashes per the amount of uptime.
- the controller 204 kills a process when the calculated crash rate exceeds a predetermined maximum crash rate.
- the configuration file 104 may be used to set the predetermined time interval, maximum number of failures, maximum crash rate, or other configurable options.
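The configurable options listed above can be pictured as a small settings record. The following is a minimal sketch only, assuming the flat-file variant of the configuration file uses `key = value` lines with `#` comments. The option names (`poll_interval_s`, `max_failures`, `max_crash_rate`), their default values, and the function `parse_flat_config` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass
class ProcessManagerConfig:
    # Hypothetical option names and defaults, chosen for illustration only.
    poll_interval_s: float = 5.0   # predetermined time interval
    max_failures: int = 3          # maximum number of failures
    max_crash_rate: float = 0.01   # maximum crash rate (crashes per unit of uptime)


def parse_flat_config(text: str) -> ProcessManagerConfig:
    """Parse 'key = value' lines into a config, keeping defaults for missing keys."""
    config = ProcessManagerConfig()
    casts = {"poll_interval_s": float, "max_failures": int, "max_crash_rate": float}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key not in casts:
            raise ValueError(f"unknown option: {key}")
        setattr(config, key, casts[key](value.strip()))
    return config
```

An XML or CLI-driven configuration, which the text also allows, would populate the same record from a different front end.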
- FIG. 3 is a block diagram illustrating a suitable computing environment in which certain aspects of the illustrated invention may be practiced.
- the method for a process management system 100 may be implemented on a computer system 300 having components 302 - 312 , including a processor 302 , a memory 304 , an Input/Output device 306 , a data storage 312 , and a network interface 310 , coupled to each other via a bus 308 .
- the components perform their conventional functions known in the art and provide the means for implementing the process management system 100 . Collectively, these components represent a broad category of hardware systems, including but not limited to general purpose computer systems and specialized packet forwarding devices.
- various components of computer system 300 may be rearranged, and certain implementations of the present invention may neither require nor include all of the above components.
- additional components may be included in system 300 , such as additional processors (e.g., a digital signal processor), storage devices, memories, and network or communication interfaces.
- FIG. 4 is a flowchart illustrating certain aspects of a method to be performed by a computer executing one embodiment of the invention.
- the methods to be performed by a processor on a router or other network device constitute computer programs made up of computer-executable instructions. Describing the methods by reference to a flowchart enables one skilled in the art to develop such programs including such instructions to carry out the methods on suitably configured computers, in which the processor of the computer executes the instructions from computer-accessible media.
- the computer-executable instructions may be written in a computer programming language or may be embodied in firmware logic such as an application-specific integrated circuit (ASIC).
- ASIC application-specific integrated circuit
- the method begins at 400 , where a process is started.
- the process may be running in any of the applications, protocols, or management interfaces managed by process manager 102 .
- a behavior of the process is monitored.
- the process is killed when the behavior is abnormal. In one embodiment, an administrator is notified about the abnormal behavior.
- FIG. 5 is a flowchart illustrating one embodiment of monitoring a behavior of a process in further detail.
- a maximum startup count is set to define a maximum number of startup attempts for the process.
- a startup timer is set to define a time interval to wait for the process to start up and register.
- the process manager 102 determines whether the timer has expired. When the timer has not yet expired, the waiting continues at 506 until the timer expires. When the timer expires, at 508, the process manager 102 determines whether the process has registered. When the process has registered, the behavior of the process is normal. When the process has not registered when the timer expires, at 510, the startup count is incremented.
- the process manager 102 determines whether the process has reached the maximum startup count. When the process has not yet reached the maximum startup count, then the process is restarted at 514 , and the method is reiterated from 502 . When the process has reached the maximum startup count, then the behavior of the process is abnormal, and the process is killed. Internal process control data may then be cleaned up and an administrator may be notified about the inability to start the process.
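The startup check of FIG. 5 can be sketched as a simple retry loop. This is an illustrative sketch, not the patented implementation: the callables `start`, `has_registered`, and `kill`, and the injectable `sleep`, are hypothetical stand-ins for whatever mechanism a real process manager would use. Following the flowchart wording, the process is killed once the startup count has reached the maximum.

```python
import time
from typing import Callable


def monitor_startup(
    start: Callable[[], None],
    has_registered: Callable[[], bool],
    kill: Callable[[], None],
    max_startup_count: int,
    startup_timeout_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the process registered, False if it was killed."""
    startup_count = 0
    start()
    while True:
        # Wait for the startup timer to expire, then check registration.
        sleep(startup_timeout_s)
        if has_registered():
            return True  # behavior is normal
        startup_count += 1
        if startup_count >= max_startup_count:
            kill()  # behavior is abnormal: the process cannot start
            return False
        start()  # restart the process and wait again
```

Passing `sleep` as a parameter is a design choice that lets the loop be exercised in tests without real delays.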
- FIG. 6 is a flowchart illustrating an embodiment of monitoring a behavior of a process in further detail.
- a maximum timeout count is set to define a maximum number of timeouts for the process.
- a timer is set to define a time interval to wait for the process to respond to polling.
- the process is polled. An example of polling is to send a “hello” message to the process. The process may reply by sending a “hello acknowledgment” message.
- the process manager 102 determines whether the timer has expired. When the timer has not yet expired, the waiting continues at 608 until the timer expires.
- the process manager 102 determines whether the process has registered. When the process has registered, at 612 , the process manager 102 determines whether the process has responded to polling. When the process has responded to polling, at 614 , the timeout count is reset, and the method is reiterated from 604 . When the process has not registered or the process has not responded to polling when the timer expires, at 616 , the timeout count is incremented. Then, at 618 , the process manager 102 determines whether the process has reached the maximum timeout count. When the process has not yet reached the maximum timeout count, the method is reiterated from 604 . When the process has reached the maximum timeout count, then the behavior of the process is abnormal, and the process is killed.
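The polling loop of FIG. 6 can be sketched in the same way. The sketch below assumes a hypothetical `poll` callable that sends the hello message and reports whether an acknowledgment arrived before the timer expired. The `max_rounds` bound is an addition made only so the example terminates; the flowchart itself loops for as long as the process stays healthy.

```python
from typing import Callable


def monitor_polling(
    poll: Callable[[float], bool],
    is_registered: Callable[[], bool],
    kill: Callable[[], None],
    max_timeout_count: int,
    poll_timeout_s: float,
    max_rounds: int,
) -> bool:
    """Poll up to max_rounds times; return False if the process was killed."""
    timeout_count = 0
    for _ in range(max_rounds):
        # poll() sends a hello and reports whether the acknowledgment
        # arrived before the timer expired.
        responded = poll(poll_timeout_s)
        if is_registered() and responded:
            timeout_count = 0  # healthy response: reset the timeout count
            continue
        timeout_count += 1  # not registered, or no response in time
        if timeout_count >= max_timeout_count:
            kill()  # behavior is abnormal: the process is non-responsive
            return False
    return True
```

Because the count is reset on every successful poll, only consecutive timeouts lead to the process being killed.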
- FIG. 7 is a flowchart illustrating an embodiment of monitoring a behavior of a process in further detail.
- a maximum thread timeout count is set to define a maximum number of timeouts for the threads of the process.
- a thread timer is set for each thread in the process to define a time interval to wait for each thread to provide an updated status.
- the process manager 102 waits for each thread to provide an updated status.
- the process manager 102 determines whether the corresponding thread timer has expired. When the timer has not yet expired, the waiting continues at 708 until the timer expires. When the timer expires, at 710 , the process manager 102 determines whether the process has registered.
- the process manager 102 determines whether the thread has provided an updated status. When the thread has provided an updated status, the corresponding thread timeout count is reset at 714 , and the method is reiterated from 704 . When the process has not registered or the thread has not provided an updated status when the timer expires, at 716 , the corresponding thread timeout count is incremented. Then, at 718 , the process manager 102 determines whether the thread of the process has reached the maximum thread timeout count. When the thread has not yet reached the maximum thread timeout count, the method is reiterated from 704 . When the thread has reached the maximum thread timeout count, then the behavior of the process is abnormal, and the process is killed.
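The per-thread bookkeeping of FIG. 7 can be sketched with one timeout count per thread. This is a minimal sketch under stated assumptions: the class `ThreadMonitor` and its method names are hypothetical, and the timers themselves are left to the caller, which invokes `on_timer_expired` each time a thread's timer runs out.

```python
from typing import Dict, Iterable


class ThreadMonitor:
    def __init__(self, thread_names: Iterable[str], max_thread_timeout_count: int) -> None:
        self.max_thread_timeout_count = max_thread_timeout_count
        self.timeout_counts: Dict[str, int] = {name: 0 for name in thread_names}
        self.updated: Dict[str, bool] = {name: False for name in self.timeout_counts}

    def report_status(self, name: str) -> None:
        """A thread calls this to provide an updated status."""
        self.updated[name] = True

    def on_timer_expired(self, name: str, process_registered: bool = True) -> bool:
        """Handle expiry of one thread's timer; return True if the process should be killed."""
        if process_registered and self.updated[name]:
            self.timeout_counts[name] = 0  # status arrived in time: reset the count
        else:
            self.timeout_counts[name] += 1  # missed update: count a timeout
        self.updated[name] = False  # re-arm for the next interval
        return self.timeout_counts[name] >= self.max_thread_timeout_count
```

A single stalled thread is enough to mark the whole process abnormal, which matches the statement that the process is killed when any thread reaches the maximum thread timeout count.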
- FIG. 8 is a flowchart illustrating one embodiment of monitoring a behavior of a process in further detail.
- a maximum crash rate for the process is set.
- the amount of uptime for the process is accumulated.
- the process manager 102 determines whether the process has crashed. When the process has not crashed, the method is reiterated at 802 .
- the crash count is incremented.
- the crash rate is calculated from the crash count per the amount of uptime.
- the process manager 102 determines whether the process has reached the maximum crash rate.
- when the process has not yet reached the maximum crash rate, the process is restarted at 812, and the method is reiterated from 802.
- when the process has reached the maximum crash rate, the behavior of the process is abnormal, and the process is killed. Internal process control data may then be cleaned up and an administrator may be notified about the inability to start the process.
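The crash-rate check of FIG. 8 reduces to dividing the crash count by the accumulated uptime. The sketch below is illustrative only: the class `CrashRateTracker` is hypothetical, the time unit of the rate is left to the caller, and the handling of a crash with zero accumulated uptime (treated here as abnormal) is an assumption, since the source does not address that case.

```python
class CrashRateTracker:
    def __init__(self, max_crash_rate: float) -> None:
        self.max_crash_rate = max_crash_rate  # crashes per unit of uptime
        self.uptime = 0.0
        self.crash_count = 0

    def add_uptime(self, amount: float) -> None:
        """Accumulate uptime while the process is running."""
        self.uptime += amount

    def record_crash(self) -> bool:
        """Count a crash; return True if the process should be killed instead of restarted."""
        self.crash_count += 1
        if self.uptime <= 0:
            return True  # assumption: a crash with no uptime counts as abnormal
        crash_rate = self.crash_count / self.uptime
        return crash_rate >= self.max_crash_rate
```

For example, with a maximum rate of 0.5 and 4.0 units of uptime, the first crash gives a rate of 0.25 and the process would be restarted, while a second crash with no further uptime gives 0.5 and the process would be killed.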
- the content for implementing an embodiment of the method of the invention may be provided by any machine-readable media which can store data that is accessible by system 100 , as part of or in addition to memory, including but not limited to cartridges, magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read-only memories (ROMs), and the like.
- the system 100 is equipped to communicate with such machine-readable media in a manner well-known in the art.
- the content for implementing an embodiment of the method of the invention may be provided to the system 100 from any external device capable of storing the content and communicating the content to the system 100 .
- the system 100 may be connected to a network, and the content may be stored on any device in the network.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Environmental & Geological Engineering (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Debugging And Monitoring (AREA)
Abstract
A method and system is provided for monitoring the health of processes running on a router. A behavior of a process is monitored and the process is killed if the behavior is abnormal. The behavior may be abnormal if the process is non-responsive, cannot start, or repeatedly crashes. The system may include a timer to measure a predetermined time interval for the process to perform a desired action, a counter to count a number of times the process fails to perform the desired action before the timer expires, and a controller to kill the process when the counter exceeds a maximum number of failures. Alternatively, the timer could measure an amount of uptime, the counter could count the number of times the process crashes, and the controller could kill the process when a crash rate calculated from the number of times the process crashes per the amount of uptime exceeds a maximum crash rate.
Description
- 1. Technical Field
- The invention relates to the field of network management. In particular, the invention relates to a process manager that monitors the health of processes running on a router.
- 2. Background Information and Description of Related Art
- The architecture of high-performance Internet routers has advanced in the last several years to provide increased performance in routing ever-greater volumes of network traffic. It is not uncommon for a router to support numerous protocols as well as several applications for configuration and maintenance of the router tables, protocols, and network policies. These advances have increased the complexity of the router such that the management of applications and protocols running on the router is critical for reliable network performance.
- In existing router management technology, the logic to support the applications, protocols, and associated management interfaces is centrally managed in a single master program. This can result in a single point of failure, meaning that even if there is a problem with only one protocol or application or interface, the entire program could crash, bringing the router down with it. In addition, if the master program needs to be updated, for example to accommodate a new protocol, then the master program must be brought down in order to perform the updates, thereby temporarily taking the router out of service.
- In an effort to overcome some of the limitations in existing router management technology, management of applications, protocols, and associated management interfaces may be decentralized. However, this means that there will be several independent processes running simultaneously on both the active Route Processor (RP) and line cards. These multiple processes share the resources of the processor on which they are running. Furthermore, processes often run multiple threads per process. These threads share the same address space and may be short or long-lived. Therefore, if any one process or any thread of a process is not healthy, it could be taking up valuable resources unnecessarily.
- According to one aspect of the invention, a method and system is provided to monitor the health of processes and to kill the process when the process is unhealthy. The system starts a process, monitors a behavior of the process, and kills the process when the behavior is abnormal.
- According to one aspect of the invention, the behavior is abnormal when the process is non-responsive. The behavior of the process is monitored by setting a process timer after starting the process, registering the process and incrementing a timeout count when the process fails to register before the process timer expires, and polling the process and incrementing the timeout count when the process fails to respond to polling before the process timer expires. The process is killed when the timeout count exceeds a maximum number of timeouts.
- According to one aspect of the invention, the behavior is abnormal when a thread of the process is non-responsive. The behavior of the process is monitored by setting up a thread timer for each thread of the process after registering the process, waiting for each thread to provide an updated status, and incrementing a corresponding thread timeout count when any of the threads fails to provide an updated status before the corresponding thread timer expires. The process is killed when any of the thread timeout counts exceeds a maximum number of timeouts.
- According to one aspect of the invention, the behavior is abnormal when the process cannot start. The behavior of the process is monitored by setting a startup timer before starting the process and incrementing a startup count when the process fails to register before the startup timer expires. The process is killed when the startup count exceeds a maximum number of attempts at starting the process.
- According to one aspect of the invention, the behavior is abnormal when the process repeatedly crashes. The behavior of the process is monitored by accumulating an amount of uptime for the process, incrementing a crash count when the process crashes, and calculating a crash rate from the crash count per the amount of uptime. The process is killed when the calculated crash rate exceeds a maximum crash rate.
- According to one aspect of the invention, apparatus are provided to carry out the above and other methods.
- The invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
- FIG. 1 is a block diagram illustrating one generalized embodiment of a process management system incorporating the invention, and the operating environment in which certain aspects of the illustrated invention may be practiced.
- FIG. 2 is a block diagram illustrating selected components of the process management system of FIG. 1 in further detail, in accordance with one embodiment of the invention.
- FIG. 3 is a block diagram illustrating a suitable computing environment in which certain aspects of the illustrated invention may be practiced.
- FIG. 4 is a flow diagram illustrating certain aspects of a method to be performed by a computer executing one embodiment of the illustrated invention.
- FIG. 5 is a flow diagram illustrating one embodiment of monitoring a behavior of a process in further detail.
- FIG. 6 is a flow diagram illustrating an alternative embodiment of monitoring a behavior of a process in further detail.
- FIG. 7 is a flow diagram illustrating another alternative embodiment of monitoring a behavior of a process in further detail.
- FIG. 8 is a flow diagram illustrating yet another alternative embodiment of monitoring a behavior of a process in further detail.
- In the following description, various aspects of the present invention, a method and apparatus for process management, will be described. Specific details will be set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced with only some or all of the described aspects of the present invention, and with or without some or all of the specific details. In some instances, well known architectures, steps, and techniques have not been shown to avoid unnecessarily obscuring the present invention. For example, specific details are not provided as to whether the method and apparatus is implemented in a switch, router, bridge, server or gateway, as a software routine, hardware circuit, firmware, or a combination thereof.
- Parts of the description will be presented using terminology commonly employed by those skilled in the art to convey the substance of their work to others skilled in the art, including terms of operations performed by a network operating system, and their operands, such as transmitting, receiving, routing, packets, messages, tables, command, message information base, command trees, tags and the like. As well understood by those skilled in the art, these operands take the form of electrical, magnetic, or optical signals, and the operations involve storing, transferring, combining, and otherwise manipulating the signals through electrical, magnetic or optical components of a system. The term system includes general purpose as well as special purpose arrangements of these components that are standalone, adjunct or embedded.
- Various operations will be described as multiple discrete steps performed in turn in a manner that is most helpful in understanding the present invention. However, the order of description should not be construed to imply that these operations are necessarily performed in the order they are presented, or are even order dependent. Lastly, repeated usage of the phrase “in one embodiment” does not necessarily refer to the same embodiment, although it may.
- It should be noted that while the description that follows addresses the method and apparatus as it applies to a network device such as a router, or layer 3 switch, it is appreciated by those of ordinary skill in the art that the method is generally applicable to any packet forwarding device, including a bridge (layer 2 switch), server or gateway. It should also be noted that while the method and apparatus may be discussed in the context of a local area network (LAN), the present invention may also be used in the context of other Transmission Control Protocol/Internet Protocol (TCP/IP)-based networks including, but not limited to, internetworks, Virtual Local Area Networks (VLANs), Metropolitan Area Networks (MANs), and Wide Area Networks (WANs), as well as networks organized into subnets.
- FIG. 1 is a block diagram illustrating one generalized embodiment of a
process management system 100 incorporating the invention, and the operating environment in which certain aspects of the illustrated invention may be practiced. Thesystem 100 is shown to be operating in an active Route Processor (RP) card and a line card running multiple virtual routers. Those of ordinary skill in the art will appreciate that thesystem 100 may include more components than those shown in FIG. 1. However, it is not necessary that all of these generally conventional components be shown in order to disclose an illustrative embodiment for practicing the invention. As illustrated, theprocess management system 100 operates on a router that has aprocess manager 102. - A typical router supports a number of applications that support protocols, network interfaces, and other components, the operation of which is maintained by the
process manager 102. For example, in the illustrated embodiment, the router includes applications to support the Border Gateway Protocol (BGP) 106 and the Open Shortest Path First protocol (OSPF) 108. - Each application may also support one or more management interfaces, such as command line interface (CLI)110, Simple Network Management Protocol (SNMP), or Extensible Markup Language (XML) based management interfaces. The management interfaces provide network administrators with access to the functions of the router and router applications using CLI commands, or SNMP or XML requests to update or access configuration of the process manager.
- FIG. 2 is a block diagram illustrating selected components of the process management system of FIG. 1 in further detail. The
process manager 102 uses aconfiguration file 104 to allow the process manager to be configurable to set various options. Theconfiguration file 104 may be implemented in various formats, such as a flat file, a CLI, or an XML file. -
Process manager 102 includes aprocess monitor 202 to monitor a behavior of a process and acontroller 204 to control the process and kill the process when the behavior is abnormal. In one embodiment, theprocess manager 102 includes one ormore timers 206. One timer may measure a predetermined time interval for a process to perform a desired action, such as responding to polling or registering the process upon startup. Another timer may measure the amount of uptime for the process. In one embodiment, theprocess manager 102 includes one ormore counters 208. One counter may count a number of times a process fails to perform the desired action during the predetermined time interval. In one embodiment, thecontroller 204 kills a process when the counter exceeds a maximum number of failures. Another counter may count the number of times a process crashes. Theprocess manager 102 may further include acalculator 210 to measure a crash rate from the number of times a process crashes per the amount of uptime. In one embodiment, thecontroller 204 kills a process when the calculated crash rate exceeds a predetermined maximum crash rate. Theconfiguration file 104 may be used to set the predetermined time interval, maximum number of failures, maximum crash rate, or other configurable options. - FIG. 3 is a block diagram illustrating a suitable computing environment in which certain aspects of the illustrated invention may be practiced. In one embodiment, the method for a
process management system 100 may be implemented on a computer system 300 having components 302-312, including a processor 302, a memory 304, an Input/Output device 306, a data storage 312, and a network interface 310, coupled to each other via a bus 308. The components perform their conventional functions known in the art and provide the means for implementing the process management system 100. Collectively, these components represent a broad category of hardware systems, including but not limited to general purpose computer systems and specialized packet forwarding devices. It is to be appreciated that various components of computer system 300 may be rearranged, and that certain implementations of the present invention may neither require nor include all of the above components. Furthermore, additional components may be included in system 300, such as additional processors (e.g., a digital signal processor), storage devices, memories, and network or communication interfaces.
- FIG. 4 is a flowchart illustrating certain aspects of a method to be performed by a computer executing one embodiment of the invention. The methods to be performed by a processor on a router or other network device constitute computer programs made up of computer-executable instructions. Describing the methods by reference to a flowchart enables one skilled in the art to develop such programs, including such instructions to carry out the methods on suitably configured computers, in which the processor of the computer executes the instructions from computer-accessible media. The computer-executable instructions may be written in a computer programming language or may be embodied in firmware logic such as an application-specific integrated circuit (ASIC). If written in a programming language conforming to a recognized standard, such instructions can be executed on a variety of hardware platforms and can interface to a variety of operating systems.
In addition, the invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein. Furthermore, it is common in the art to speak of software, in one form or another, for example a program, procedure, process, or application, as taking an action or causing a result. Such expressions are merely a shorthand way of saying that execution of the software by a computer causes the processor of the computer to perform an action or produce a result.
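By way of illustration only, and with Python chosen arbitrarily, the crash counter and the calculator 210 described for FIG. 2 might be sketched as follows. The class name `CrashRateTracker`, its method names, and the use of seconds as the uptime unit are assumptions made for this sketch. The description also does not say how a crash with no accumulated uptime should be treated; the sketch treats it as an unbounded rate.

```python
from dataclasses import dataclass


@dataclass
class CrashRateTracker:
    """Tracks crashes against accumulated uptime for one process."""

    max_crash_rate: float         # crashes per second (unit is an assumption)
    uptime_seconds: float = 0.0   # accumulated uptime
    crash_count: int = 0          # number of crashes seen so far

    def add_uptime(self, seconds: float) -> None:
        """Accumulate uptime measured since the last update."""
        self.uptime_seconds += seconds

    def crash_rate(self) -> float:
        """Crash count per unit of accumulated uptime."""
        if self.uptime_seconds <= 0:
            # Assumption: a crash before any uptime is an unbounded rate.
            return float("inf") if self.crash_count else 0.0
        return self.crash_count / self.uptime_seconds

    def record_crash(self) -> bool:
        """Count a crash; return True when the process should be killed
        (rate exceeds the maximum) rather than restarted."""
        self.crash_count += 1
        return self.crash_rate() > self.max_crash_rate
```

The comparison uses "exceeds", following the wording used for the controller 204; an implementation that kills on reaching the maximum would use `>=` instead.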
- The method begins at 400, where a process is started. The process may be running in any of the applications, protocols, or management interfaces managed by
process manager 102. At 402, a behavior of the process is monitored. At 404, the process is killed when the behavior is abnormal. In one embodiment, an administrator is notified about the abnormal behavior.
- FIG. 5 is a flowchart illustrating one embodiment of monitoring a behavior of a process in further detail. At 500, a maximum startup count is set to define a maximum number of startup attempts for the process. At 502, a startup timer is set to define a time interval to wait for the process to start up and register. At 504, the
process manager 102 determines whether the timer has expired. When the timer has not yet expired, the waiting continues at 506 until the timer expires. When the timer expires, at 508, the process manager 102 determines whether the process has registered. When the process has registered, the behavior of the process is normal. When the process has not registered by the time the timer expires, at 510, the startup count is incremented. Then, at 512, the process manager 102 determines whether the process has reached the maximum startup count. When the process has not yet reached the maximum startup count, the process is restarted at 514, and the method is reiterated from 502. When the process has reached the maximum startup count, the behavior of the process is abnormal, and the process is killed. Internal process control data may then be cleaned up and an administrator may be notified about the inability to start the process.
- FIG. 6 is a flowchart illustrating an embodiment of monitoring a behavior of a process in further detail. At 600, a maximum timeout count is set to define a maximum number of timeouts for the process. At 602, a timer is set to define a time interval to wait for the process to respond to polling. At 604, the process is polled. An example of polling is to send a “hello” message to the process. The process may reply by sending a “hello acknowledgment” message. At 606, the
process manager 102 determines whether the timer has expired. When the timer has not yet expired, the waiting continues at 608 until the timer expires. When the timer expires, at 610, the process manager 102 determines whether the process has registered. When the process has registered, at 612, the process manager 102 determines whether the process has responded to polling. When the process has responded to polling, at 614, the timeout count is reset, and the method is reiterated from 604. When the process has not registered or has not responded to polling by the time the timer expires, at 616, the timeout count is incremented. Then, at 618, the process manager 102 determines whether the process has reached the maximum timeout count. When the process has not yet reached the maximum timeout count, the method is reiterated from 604. When the process has reached the maximum timeout count, the behavior of the process is abnormal, and the process is killed.
- FIG. 7 is a flowchart illustrating an embodiment of monitoring a behavior of a process in further detail. At 700, a maximum thread timeout count is set to define a maximum number of timeouts for the threads of the process. At 702, a thread timer is set for each thread in the process to define a time interval to wait for each thread to provide an updated status. At 704, the
process manager 102 waits for each thread to provide an updated status. At 706, the process manager 102 determines whether the corresponding thread timer has expired. When the timer has not yet expired, the waiting continues at 708 until the timer expires. When the timer expires, at 710, the process manager 102 determines whether the process has registered. When the process has registered, at 712, the process manager 102 determines whether the thread has provided an updated status. When the thread has provided an updated status, the corresponding thread timeout count is reset at 714, and the method is reiterated from 704. When the process has not registered or the thread has not provided an updated status by the time the timer expires, at 716, the corresponding thread timeout count is incremented. Then, at 718, the process manager 102 determines whether the thread of the process has reached the maximum thread timeout count. When the thread has not yet reached the maximum thread timeout count, the method is reiterated from 704. When the thread has reached the maximum thread timeout count, the behavior of the process is abnormal, and the process is killed.
- FIG. 8 is a flowchart illustrating one embodiment of monitoring a behavior of a process in further detail. At 800, a maximum crash rate for the process is set. At 802, the amount of uptime for the process is accumulated. At 804, the
process manager 102 determines whether the process has crashed. When the process has not crashed, the method is reiterated at 802. When the process crashes, at 806, the crash count is incremented. Then, at 808, the crash rate is calculated from the crash count per the amount of uptime. Then, at 810, the process manager 102 determines whether the process has reached the maximum crash rate. When the process has not yet reached the maximum crash rate, the process is restarted at 812, and the method is reiterated from 802. When the process has reached the maximum crash rate, the behavior of the process is abnormal, and the process is killed. Internal process control data may then be cleaned up and an administrator may be notified about the inability to start the process.
- As will be appreciated by those skilled in the art, the content for implementing an embodiment of the method of the invention, for example, computer program instructions, may be provided by any machine-readable media which can store data that is accessible by
system 100, as part of or in addition to memory, including but not limited to cartridges, magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read-only memories (ROMs), and the like. In this regard, the system 100 is equipped to communicate with such machine-readable media in a manner well-known in the art.
- It will be further appreciated by those skilled in the art that the content for implementing an embodiment of the method of the invention may be provided to the
system 100 from any external device capable of storing the content and communicating the content to the system 100. For example, in one embodiment, the system 100 may be connected to a network, and the content may be stored on any device in the network.
- The above description of illustrated embodiments of the invention, including what is described in the abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.
- These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification and the claims. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.
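As one non-limiting illustration of the polling flow described for FIG. 6, the timeout-count logic might be sketched in Python as follows. The function name `monitor_by_polling`, the `poll` callable, and the `max_rounds` bound are hypothetical and introduced only for this sketch. The `poll` callable stands in for sending a "hello" message, waiting for the timer to expire, and checking that the process has registered and replied. The `max_rounds` bound exists only so that the example terminates; the flowchart itself loops for as long as the process runs.

```python
from typing import Callable


def monitor_by_polling(
    poll: Callable[[], bool],
    max_timeouts: int,
    max_rounds: int,
) -> str:
    """Run up to max_rounds polling rounds and classify the behavior.

    poll() returns True when the process registered and replied before
    the timer expired, and False otherwise.
    Returns "abnormal" once the timeout count reaches max_timeouts
    (the caller would then kill the process), or "normal" if every
    round completes without reaching that count.
    """
    timeout_count = 0
    for _ in range(max_rounds):
        if poll():
            # A reply resets the timeout count (614).
            timeout_count = 0
            continue
        # No registration or no reply: count a timeout (616).
        timeout_count += 1
        if timeout_count >= max_timeouts:
            # Maximum timeout count reached (618): behavior is abnormal.
            return "abnormal"
    return "normal"
```

Because the count is reset on every reply, only consecutive missed polls accumulate. The sketch follows the "has reached" wording of FIG. 6 and compares with `>=`; the claims speak of the count exceeding the maximum, which would correspond to `>`.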
Claims (28)
1. A method comprising:
starting a process;
monitoring a behavior of the process; and
killing the process when the behavior is abnormal.
2. The method of claim 1, further comprising notifying an administrator about the abnormal behavior.
3. The method of claim 1, wherein the behavior is abnormal when the process cannot start, and wherein monitoring the behavior comprises:
setting a startup timer before starting the process; and
incrementing a startup count when the process fails to register before the startup timer expires,
and wherein killing the process comprises killing the process when the startup count exceeds a maximum number of attempts at starting the process.
4. The method of claim 3, further comprising attempting to restart the process when the startup count does not exceed the maximum number of attempts at starting the process.
5. The method of claim 3, further comprising cleaning up internal process control data and notifying an administrator about the inability to start the process when the startup count exceeds a maximum number of attempts at starting the process.
6. The method of claim 1, wherein the behavior is abnormal when the process repeatedly crashes, and wherein monitoring the behavior comprises:
accumulating an amount of uptime for the process;
incrementing a crash count when the process crashes; and
calculating a crash rate from the crash count per the amount of uptime,
and wherein killing the process comprises killing the process when the calculated crash rate exceeds a maximum crash rate.
7. The method of claim 6, further comprising attempting to restart the process when the crash rate does not exceed the maximum crash rate.
8. The method of claim 6, further comprising cleaning up internal process control data and notifying an administrator about the inability to start the process when the crash rate exceeds a maximum crash rate.
9. The method of claim 1, wherein the behavior is abnormal when the process is non-responsive, and wherein monitoring the behavior comprises:
setting a process timer after starting the process;
registering the process and incrementing a timeout count when the process fails to register before the process timer expires; and
polling the process and incrementing the timeout count when the process fails to respond to polling before the process timer expires,
and wherein killing the process comprises killing the process when the timeout count exceeds a maximum number of timeouts.
10. The method of claim 9, further comprising resetting the timeout count each time the process responds to polling.
11. The method of claim 9, wherein polling the process and incrementing the timeout count when the process fails to respond to polling before the process timer expires comprises sending a message to the process and incrementing the timeout count when the process fails to reply to the sent message before the process timer expires.
12. The method of claim 10, wherein resetting the timeout count each time the process responds to polling comprises resetting the timeout count to zero each time the process replies to the sent message.
13. The method of claim 1, wherein the behavior is abnormal when a thread of the process is non-responsive, and wherein monitoring the behavior comprises:
setting a thread timer for each thread after registering the process;
waiting for each thread to provide an updated status; and
incrementing a corresponding thread timeout count when any of the threads fails to provide an updated status before the corresponding thread timer expires,
and wherein killing the process comprises killing the process when any of the thread timeout counts exceeds a maximum number of timeouts.
14. The method of claim 13, further comprising incrementing the timeout counts when the process fails to register before any of the thread timers expire.
15. The method of claim 13, further comprising incrementing the timeout counts when the process fails to send a message before any of the thread timers expire.
16. The method of claim 13, further comprising resetting the thread timeout count each time a corresponding thread provides an updated status.
17. An apparatus comprising:
a process monitor to monitor a behavior of a process; and
a controller to kill the process when the behavior is abnormal.
18. The apparatus of claim 17, further comprising a timer to measure a predetermined time interval for the process to perform a desired action and a counter to count a number of times the process fails to perform the desired action before the timer expires, and wherein a controller to kill the process comprises a controller to kill the process when the counter exceeds a maximum number of failures.
19. The apparatus of claim 18, wherein the desired action is to respond to polling and wherein the counter to increment each time the process does not respond to polling before the timer expires.
20. The apparatus of claim 18, wherein the desired action is to provide an update on the status of a thread of the process and wherein the counter to increment each time a thread of the process does not provide an updated status before the timer expires.
21. The apparatus of claim 18, wherein the desired action is to register and wherein the counter to increment each time the process does not register before the timer expires.
22. The apparatus of claim 17, further comprising a timer to measure an amount of uptime, a counter to count the number of times the process crashes, and a calculator to measure a crash rate from the number of times the process crashes per the amount of uptime, and wherein a controller to kill the process comprises a controller to kill the process when the calculated crash rate exceeds a predetermined maximum crash rate.
23. An article of manufacture comprising:
a machine accessible medium comprising content that when accessed by a machine causes the machine to:
start a process;
monitor a behavior of the process; and
kill the process when the behavior is abnormal.
24. The article of manufacture of claim 24, further comprising a machine accessible medium comprising content that when accessed by a machine causes the machine to notify an administrator about the abnormal behavior.
25. The article of manufacture of claim 24, wherein the behavior is abnormal when the process cannot start, and wherein monitoring the behavior comprises:
setting a startup timer before starting the process; and
incrementing a startup count when the process fails to register before the startup timer expires,
and wherein killing the process comprises killing the process when the startup count exceeds a maximum number of attempts at starting the process.
26. The article of manufacture of claim 24, wherein the behavior is abnormal when the process repeatedly crashes, and wherein monitoring the behavior comprises:
accumulating an amount of uptime for the process;
incrementing a crash count when the process crashes; and
calculating a crash rate from the crash count per the amount of uptime,
and wherein killing the process comprises killing the process when the crash rate exceeds a maximum crash rate.
27. The article of manufacture of claim 24, wherein the behavior is abnormal when the process is non-responsive, and wherein monitoring the behavior comprises:
setting a process timer after starting the process;
registering the process and incrementing a timeout count when the process fails to register before the process timer expires; and
polling the process and incrementing the timeout count when the process fails to respond to polling before the process timer expires,
and wherein killing the process comprises killing the process when the timeout count exceeds a maximum number of timeouts.
28. The article of manufacture of claim 24, wherein the behavior is abnormal when a thread of the process is non-responsive, and wherein monitoring the behavior comprises:
setting a thread timer for each thread after registering the process; and
waiting for each thread to provide an updated status and incrementing a timeout count when any of the threads fail to provide an updated status before the corresponding thread timer expires,
and killing the process comprises killing the process when the timeout count exceeds a maximum number of timeouts.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/157,567 US20030226056A1 (en) | 2002-05-28 | 2002-05-28 | Method and system for a process manager |
US10/170,246 US7017082B1 (en) | 2002-05-28 | 2002-06-11 | Method and system for a process manager |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/157,567 US20030226056A1 (en) | 2002-05-28 | 2002-05-28 | Method and system for a process manager |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/170,246 Continuation US7017082B1 (en) | 2002-05-28 | 2002-06-11 | Method and system for a process manager |
Publications (1)
Publication Number | Publication Date |
---|---|
US20030226056A1 true US20030226056A1 (en) | 2003-12-04 |
Family
ID=29582496
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/157,567 Abandoned US20030226056A1 (en) | 2002-05-28 | 2002-05-28 | Method and system for a process manager |
US10/170,246 Expired - Lifetime US7017082B1 (en) | 2002-05-28 | 2002-06-11 | Method and system for a process manager |
Country Status (1)
Country | Link |
---|---|
US (2) | US20030226056A1 (en) |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1640865A2 (en) * | 2004-08-25 | 2006-03-29 | Evolium S.A.S. | Process management system |
US20060090062A1 (en) * | 2002-01-19 | 2006-04-27 | Martin Vorbach | Reconfigurable processor |
US20060179353A1 (en) * | 2005-02-04 | 2006-08-10 | Fujitsu Limited | Computer shutoff condition monitoring method, information processing apparatus, program and computer readable information recording medium |
US20070294584A1 (en) * | 2006-04-28 | 2007-12-20 | Microsoft Corporation | Detection and isolation of data items causing computer process crashes |
US7313735B1 (en) * | 2004-04-30 | 2007-12-25 | Sun Microsystems, Inc. | In-line server health checking |
US20080155544A1 (en) * | 2006-12-20 | 2008-06-26 | Thales | Device and method for managing process task failures |
CN100555228C (en) * | 2006-11-08 | 2009-10-28 | 中兴通讯股份有限公司 | A kind of method for supervising of embedded LINUX applications progress |
US20100281235A1 (en) * | 2007-11-17 | 2010-11-04 | Martin Vorbach | Reconfigurable floating-point and bit-level data processing unit |
US20110119657A1 (en) * | 2007-12-07 | 2011-05-19 | Martin Vorbach | Using function calls as compiler directives |
US20110161977A1 (en) * | 2002-03-21 | 2011-06-30 | Martin Vorbach | Method and device for data processing |
US20110173483A1 (en) * | 2010-01-14 | 2011-07-14 | Juniper Networks Inc. | Fast resource recovery after thread crash |
US20110173596A1 (en) * | 2007-11-28 | 2011-07-14 | Martin Vorbach | Method for facilitating compilation of high-level code for varying architectures |
WO2011087924A1 (en) * | 2010-01-15 | 2011-07-21 | Apple Inc. | Method and apparatus for idling a network connection |
US8281265B2 (en) | 2002-08-07 | 2012-10-02 | Martin Vorbach | Method and device for processing data |
US8301872B2 (en) | 2000-06-13 | 2012-10-30 | Martin Vorbach | Pipeline configuration protocol and configuration unit communication |
US8310274B2 (en) | 2002-09-06 | 2012-11-13 | Martin Vorbach | Reconfigurable sequencer structure |
US8312200B2 (en) | 1999-06-10 | 2012-11-13 | Martin Vorbach | Processor chip including a plurality of cache elements connected to a plurality of processor cores |
US8312301B2 (en) | 2001-03-05 | 2012-11-13 | Martin Vorbach | Methods and devices for treating and processing data |
CN102932346A (en) * | 2012-10-26 | 2013-02-13 | 杭州迪普科技有限公司 | Method and device for detecting unavailable addresses in network address translator (NAT) address pool |
US8407525B2 (en) | 2001-09-03 | 2013-03-26 | Pact Xpp Technologies Ag | Method for debugging reconfigurable architectures |
US8471593B2 (en) | 2000-10-06 | 2013-06-25 | Martin Vorbach | Logic cell array and bus system |
USRE44365E1 (en) | 1997-02-08 | 2013-07-09 | Martin Vorbach | Method of self-synchronization of configurable elements of a programmable module |
CN103503374A (en) * | 2011-11-15 | 2014-01-08 | 华为技术有限公司 | Monitoring method and device, and network device |
US8686549B2 (en) | 2001-09-03 | 2014-04-01 | Martin Vorbach | Reconfigurable elements |
US8686475B2 (en) | 2001-09-19 | 2014-04-01 | Pact Xpp Technologies Ag | Reconfigurable elements |
US8819505B2 (en) | 1997-12-22 | 2014-08-26 | Pact Xpp Technologies Ag | Data processor having disabled cores |
US8869121B2 (en) | 2001-08-16 | 2014-10-21 | Pact Xpp Technologies Ag | Method for the translation of programs for reconfigurable architectures |
US8914590B2 (en) | 2002-08-07 | 2014-12-16 | Pact Xpp Technologies Ag | Data processing method and device |
US9021310B1 (en) * | 2012-02-14 | 2015-04-28 | Amazon Technologies, Inc. | Policy-driven automatic network fault remediation |
US20150347203A1 (en) * | 2014-05-29 | 2015-12-03 | Mediatek Inc. | Electronic device capable of configuring application-dependent task based on operating behavior of application detected during execution of application and related method thereof |
US9477490B2 (en) * | 2015-01-05 | 2016-10-25 | Dell Software Inc. | Milestone based dynamic multiple watchdog timeouts and early failure detection |
US20180107178A1 (en) * | 2016-10-17 | 2018-04-19 | Fisher-Rosemount Systems, Inc. | Methods and Systems for Streaming Process Control Data to Remote Devices |
CN108197000A (en) * | 2018-01-10 | 2018-06-22 | 武汉斗鱼网络科技有限公司 | Application program launching log preservation method, storage medium, electronic equipment and system |
CN108777631A (en) * | 2018-05-07 | 2018-11-09 | 深圳绿净网科技有限公司 | Router user's network log-in management method and system |
US11160052B2 (en) * | 2017-03-10 | 2021-10-26 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for adjusting broadcast receiver queue, storage medium and electronic device |
US20220188184A1 (en) * | 2019-07-12 | 2022-06-16 | Ebay Inc. | Corrective Database Connection Management |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10225471A1 (en) * | 2002-06-10 | 2003-12-18 | Philips Intellectual Property | Reset monitoring method for use with a microcontroller, whereby a monitoring module monitors the microcontroller and generates an acknowledgement signal when it is successfully reset |
TWI228650B (en) * | 2003-06-17 | 2005-03-01 | Acer Inc | Application program management system and method thereof |
US7657635B2 (en) * | 2004-07-30 | 2010-02-02 | Extreme Networks | Method and apparatus for converting network management protocol to markup language |
WO2010120689A2 (en) * | 2009-04-14 | 2010-10-21 | Interdigital Patent Holdings, Inc. | Method and apparatus for processing emergency calls |
US8239709B2 (en) * | 2009-08-12 | 2012-08-07 | Apple Inc. | Managing extrinsic processes |
US8639991B2 (en) * | 2010-12-17 | 2014-01-28 | Sap Ag | Optimizing performance of an application |
CN107819640B (en) * | 2016-09-14 | 2019-06-28 | 北京百度网讯科技有限公司 | Monitoring method and device for robot operating system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6425093B1 (en) * | 1998-01-05 | 2002-07-23 | Sophisticated Circuits, Inc. | Methods and apparatuses for controlling the execution of software on a digital processing system |
US6453430B1 (en) * | 1999-05-06 | 2002-09-17 | Cisco Technology, Inc. | Apparatus and methods for controlling restart conditions of a faulted process |
US20020162053A1 (en) * | 1999-03-10 | 2002-10-31 | Os Ron Van | User transparent software malfunction detection and reporting |
US20030037172A1 (en) * | 2001-08-17 | 2003-02-20 | John Lacombe | Hardware implementation of an application-level watchdog timer |
US20030074605A1 (en) * | 2001-10-11 | 2003-04-17 | Hitachi, Ltd. | Computer system and method for program execution monitoring in computer system |
US6662310B2 (en) * | 1999-11-10 | 2003-12-09 | Symantec Corporation | Methods for automatically locating url-containing or other data-containing windows in frozen browser or other application program, saving contents, and relaunching application program with link to saved data |
US6665758B1 (en) * | 1999-10-04 | 2003-12-16 | Ncr Corporation | Software sanity monitor |
US20040054984A1 (en) * | 2002-04-08 | 2004-03-18 | Chong James C. | Method and system for problem determination in distributed enterprise applications |
-
2002
- 2002-05-28 US US10/157,567 patent/US20030226056A1/en not_active Abandoned
- 2002-06-11 US US10/170,246 patent/US7017082B1/en not_active Expired - Lifetime
Cited By (58)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
USRE45223E1 (en) | 1997-02-08 | 2014-10-28 | Pact Xpp Technologies Ag | Method of self-synchronization of configurable elements of a programmable module |
USRE44365E1 (en) | 1997-02-08 | 2013-07-09 | Martin Vorbach | Method of self-synchronization of configurable elements of a programmable module |
USRE45109E1 (en) | 1997-02-08 | 2014-09-02 | Pact Xpp Technologies Ag | Method of self-synchronization of configurable elements of a programmable module |
US8819505B2 (en) | 1997-12-22 | 2014-08-26 | Pact Xpp Technologies Ag | Data processor having disabled cores |
US8468329B2 (en) | 1999-02-25 | 2013-06-18 | Martin Vorbach | Pipeline configuration protocol and configuration unit communication |
US8726250B2 (en) | 1999-06-10 | 2014-05-13 | Pact Xpp Technologies Ag | Configurable logic integrated circuit having a multidimensional structure of configurable elements |
US8312200B2 (en) | 1999-06-10 | 2012-11-13 | Martin Vorbach | Processor chip including a plurality of cache elements connected to a plurality of processor cores |
US8301872B2 (en) | 2000-06-13 | 2012-10-30 | Martin Vorbach | Pipeline configuration protocol and configuration unit communication |
US8471593B2 (en) | 2000-10-06 | 2013-06-25 | Martin Vorbach | Logic cell array and bus system |
US9075605B2 (en) | 2001-03-05 | 2015-07-07 | Pact Xpp Technologies Ag | Methods and devices for treating and processing data |
US8312301B2 (en) | 2001-03-05 | 2012-11-13 | Martin Vorbach | Methods and devices for treating and processing data |
US8869121B2 (en) | 2001-08-16 | 2014-10-21 | Pact Xpp Technologies Ag | Method for the translation of programs for reconfigurable architectures |
US8407525B2 (en) | 2001-09-03 | 2013-03-26 | Pact Xpp Technologies Ag | Method for debugging reconfigurable architectures |
US8686549B2 (en) | 2001-09-03 | 2014-04-01 | Martin Vorbach | Reconfigurable elements |
US8429385B2 (en) | 2001-09-03 | 2013-04-23 | Martin Vorbach | Device including a field having function cells and information providing cells controlled by the function cells |
US8686475B2 (en) | 2001-09-19 | 2014-04-01 | Pact Xpp Technologies Ag | Reconfigurable elements |
US8281108B2 (en) * | 2002-01-19 | 2012-10-02 | Martin Vorbach | Reconfigurable general purpose processor having time restricted configurations |
US20060090062A1 (en) * | 2002-01-19 | 2006-04-27 | Martin Vorbach | Reconfigurable processor |
US20110161977A1 (en) * | 2002-03-21 | 2011-06-30 | Martin Vorbach | Method and device for data processing |
US8914590B2 (en) | 2002-08-07 | 2014-12-16 | Pact Xpp Technologies Ag | Data processing method and device |
US8281265B2 (en) | 2002-08-07 | 2012-10-02 | Martin Vorbach | Method and device for processing data |
US8803552B2 (en) | 2002-09-06 | 2014-08-12 | Pact Xpp Technologies Ag | Reconfigurable sequencer structure |
US8310274B2 (en) | 2002-09-06 | 2012-11-13 | Martin Vorbach | Reconfigurable sequencer structure |
US7313735B1 (en) * | 2004-04-30 | 2007-12-25 | Sun Microsystems, Inc. | In-line server health checking |
EP1640865A2 (en) * | 2004-08-25 | 2006-03-29 | Evolium S.A.S. | Process management system |
EP1640865A3 (en) * | 2004-08-25 | 2009-10-14 | Alcatel Lucent | Process management system |
US20060179353A1 (en) * | 2005-02-04 | 2006-08-10 | Fujitsu Limited | Computer shutoff condition monitoring method, information processing apparatus, program and computer readable information recording medium |
US7506209B2 (en) * | 2005-02-04 | 2009-03-17 | Fujitsu Limited | Computer shutoff condition monitoring method, information processing apparatus, program and computer readable information recording medium |
US20070294584A1 (en) * | 2006-04-28 | 2007-12-20 | Microsoft Corporation | Detection and isolation of data items causing computer process crashes |
CN100555228C (en) * | 2006-11-08 | 2009-10-28 | 中兴通讯股份有限公司 | A kind of method for supervising of embedded LINUX applications progress |
US20080155544A1 (en) * | 2006-12-20 | 2008-06-26 | Thales | Device and method for managing process task failures |
FR2910656A1 (en) * | 2006-12-20 | 2008-06-27 | Thales Sa | DEVICE AND METHOD FOR PROCESS TASK FAILURE MANAGEMENT |
US20100281235A1 (en) * | 2007-11-17 | 2010-11-04 | Martin Vorbach | Reconfigurable floating-point and bit-level data processing unit |
US20110173596A1 (en) * | 2007-11-28 | 2011-07-14 | Martin Vorbach | Method for facilitating compilation of high-level code for varying architectures |
US20110119657A1 (en) * | 2007-12-07 | 2011-05-19 | Martin Vorbach | Using function calls as compiler directives |
US8627142B2 (en) * | 2010-01-14 | 2014-01-07 | Juniper Networks, Inc. | Fast resource recovery after thread crash |
US20110173483A1 (en) * | 2010-01-14 | 2011-07-14 | Juniper Networks Inc. | Fast resource recovery after thread crash |
US20130132773A1 (en) * | 2010-01-14 | 2013-05-23 | Juniper Networks, Inc. | Fast resource recovery after thread crash |
US8365014B2 (en) * | 2010-01-14 | 2013-01-29 | Juniper Networks, Inc. | Fast resource recovery after thread crash |
US8706855B2 (en) | 2010-01-15 | 2014-04-22 | Apple Inc. | Method and apparatus for idling a network connection |
WO2011087924A1 (en) * | 2010-01-15 | 2011-07-21 | Apple Inc. | Method and apparatus for idling a network connection |
CN103503374A (en) * | 2011-11-15 | 2014-01-08 | 华为技术有限公司 | Monitoring method and device, and network device |
US9021310B1 (en) * | 2012-02-14 | 2015-04-28 | Amazon Technologies, Inc. | Policy-driven automatic network fault remediation |
CN102932346A (en) * | 2012-10-26 | 2013-02-13 | 杭州迪普科技有限公司 | Method and device for detecting unavailable addresses in network address translator (NAT) address pool |
US9632841B2 (en) * | 2014-05-29 | 2017-04-25 | Mediatek Inc. | Electronic device capable of configuring application-dependent task based on operating behavior of application detected during execution of application and related method thereof |
US20150347203A1 (en) * | 2014-05-29 | 2015-12-03 | Mediatek Inc. | Electronic device capable of configuring application-dependent task based on operating behavior of application detected during execution of application and related method thereof |
US9477490B2 (en) * | 2015-01-05 | 2016-10-25 | Dell Software Inc. | Milestone based dynamic multiple watchdog timeouts and early failure detection |
US20180107178A1 (en) * | 2016-10-17 | 2018-04-19 | Fisher-Rosemount Systems, Inc. | Methods and Systems for Streaming Process Control Data to Remote Devices |
US10671032B2 (en) * | 2016-10-17 | 2020-06-02 | Fisher-Rosemount Systems, Inc. | Methods and systems for streaming process control data to remote devices |
US11353854B2 (en) | 2016-10-17 | 2022-06-07 | Fisher-Rosemount Systems, Inc. | Methods and apparatus for configuring remote access of process control data |
US11543805B2 (en) | 2016-10-17 | 2023-01-03 | Fisher-Rosemount Systems, Inc. | Systems and apparatus for distribution of process control data to remote devices |
US12078981B2 (en) | 2016-10-17 | 2024-09-03 | Fisher-Rosemount Systems, Inc. | Systems and apparatus for distribution of process control data to remote |
US11160052B2 (en) * | 2017-03-10 | 2021-10-26 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for adjusting broadcast receiver queue, storage medium and electronic device |
CN108197000A (en) * | 2018-01-10 | 2018-06-22 | 武汉斗鱼网络科技有限公司 | Application program launching log preservation method, storage medium, electronic equipment and system |
CN108777631A (en) * | 2018-05-07 | 2018-11-09 | 深圳绿净网科技有限公司 | Router user's network log-in management method and system |
US20220188184A1 (en) * | 2019-07-12 | 2022-06-16 | Ebay Inc. | Corrective Database Connection Management |
US11860728B2 (en) * | 2019-07-12 | 2024-01-02 | Ebay Inc. | Corrective database connection management |
US20240070013A1 (en) * | 2019-07-12 | 2024-02-29 | Ebay Inc. | Corrective Database Connection Management |
Also Published As
Publication number | Publication date |
---|---|
US7017082B1 (en) | 2006-03-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7017082B1 (en) | | Method and system for a process manager |
US6360260B1 (en) | | Discovery features for SNMP managed devices |
US7016955B2 (en) | | Network management apparatus and method for processing events associated with device reboot |
EP3905590A1 (en) | | System and method for obtaining network topology, and server |
EP0868800B1 (en) | | Method and apparatus for determining the status of a device in a communication network |
US7783733B1 (en) | | Method and apparatus for dynamic configuration management |
US8782211B1 (en) | | Dynamically scheduling tasks to manage system load |
US7257731B2 (en) | | System and method for managing protocol network failures in a cluster system |
US8549119B1 (en) | | Error handling for device management configuration and operational data retrieval commands |
US6757901B1 (en) | | Method and system for setting expressions in network management notifications at an agent |
CN109344014A (en) | | A kind of main/standby switching method, device and communication equipment |
WO1997023974A9 (en) | | Method and apparatus for determining the status of a device in a communication network |
WO2017215441A1 (en) | | Self-recovery method and apparatus for board configuration in distributed system |
US11683257B1 (en) | | Method and device for improving link aggregation protocol timeout |
JP2016536920A (en) | | Apparatus and method for network performance monitoring |
US20160094657A1 (en) | | Event-driven synchronization in snmp managed networks |
US10404561B2 (en) | | Network operational flaw detection using metrics |
US10992770B2 (en) | | Method and system for managing network service |
US7716320B2 (en) | | Method and apparatus for persisting SNMP MIB integer indexes across multiple network elements |
US20110320633A1 (en) | | System and methods for a managed application server restart |
US10033569B1 (en) | | Automated simple network management protocol variable reset |
US8825845B1 (en) | | Managing a network element operating on a network |
Cisco | | Troubleshooting Commands |
JP3978099B2 (en) | | Communication network system management method and network relay device |
CN101043357A (en) | | Automatic discovery method for equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: EXTREME NETWORKS, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: YIP, MICHAEL; BERENBERG, ANNA; REEL/FRAME: 012945/0638; Effective date: 20020516 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |