US20140086260A1 - Managing starvation and congestion in a two-dimensional network having flow control - Google Patents
Managing starvation and congestion in a two-dimensional network having flow control Download PDFInfo
- Publication number
- US20140086260A1 US20140086260A1 US13/629,351 US201213629351A US2014086260A1 US 20140086260 A1 US20140086260 A1 US 20140086260A1 US 201213629351 A US201213629351 A US 201213629351A US 2014086260 A1 US2014086260 A1 US 2014086260A1
- Authority
- US
- United States
- Prior art keywords
- queue
- messages
- round robin
- arbitration
- arbiter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 235000003642 hunger Nutrition 0.000 title claims description 28
- 230000037351 starvation Effects 0.000 title claims description 28
- 239000000872 buffer Substances 0.000 claims abstract description 35
- 241001522296 Erithacus rubecula Species 0.000 claims description 46
- 238000000034 method Methods 0.000 claims description 21
- 230000007246 mechanism Effects 0.000 claims description 17
- 238000012545 processing Methods 0.000 description 16
- 238000010586 diagram Methods 0.000 description 8
- 238000004891 communication Methods 0.000 description 6
- 230000015654 memory Effects 0.000 description 5
- 238000003780 insertion Methods 0.000 description 3
- 230000037431 insertion Effects 0.000 description 3
- 238000013500 data storage Methods 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 238000007726 management method Methods 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 230000004075 alteration Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 238000010295 mobile communication Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- APTZNLHMIGJTEW-UHFFFAOYSA-N pyraflufen-ethyl Chemical compound C1=C(Cl)C(OCC(=O)OCC)=CC(C=2C(=C(OC(F)F)N(C)N=2)Cl)=C1F APTZNLHMIGJTEW-UHFFFAOYSA-N 0.000 description 1
- 238000012797 qualification Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/62—Queue scheduling characterised by scheduling criteria
- H04L47/625—Queue scheduling characterised by scheduling criteria for service slots or service orders
- H04L47/6275—Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/62—Queue scheduling characterised by scheduling criteria
- H04L47/6215—Individual queue per QOS, rate or priority
Definitions
- Embodiments of the invention relate to techniques for traffic management between connected nodes. More particularly, embodiments of the invention relate to a multi-tier arbitration scheme that may be used to traffic management between the connected nodes.
- routers are utilized to manage traffic.
- routers with multiple virtual channels may use round robin priority arbitration schemes in both local arbitration (LA) and global arbitration (GA) stages.
- LA local arbitration
- GA global arbitration
- Worst case miss-alignments consist primarily of persistent loss of priority either at the LA arbiter or the GA arbiter and is a result of independent round robin policies used at each stage. For example, when a message is able to receive a highest priority at the LA arbiter, the GA arbiter may point to a different port when the message finally arrives and the LA arbiter may have already changed the highest priority to a different message that is not the oldest at the input port. This may happen multiple times for the same message, increasing the worst case latency, even up to starvation.
- FIG. 1 is a block diagram of one embodiment of an apparatus for a physical interconnect.
- FIG. 2 is a conceptual diagram of a router pipeline and arbitration scheme as described herein.
- FIG. 3 is a flow diagram of one embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control.
- FIG. 4 is a block diagram of one embodiment of an electronic system.
- processor packages with tens to hundreds of processing cores and other functional blocks can be integrated on a single die and will be widely available for use in, for example, cloud computing environments.
- the arbitration scheme described herein may be utilized in any interconnection network between these processor cores and/or other functional blocks and may provide fairness and latency predictability for messages (packets) routed through the architecture.
- Interconnection schemes can include, for example, two-dimensional router-based mesh or torus interconnect with virtual channel support, which can be very scalable.
- the techniques described herein may be applied to interconnections between packages as well.
- FIG. 1 is a block diagram of one embodiment of an apparatus for a physical interconnect.
- the apparatus depicts a physical layer for a cache-coherent, link-based interconnect scheme for a processor, chipset, and/or IO bridge components.
- the physical interconnect may be performed by each physical layer of an integrated device.
- the physical layer may provide communication between two ports over a physical interconnect comprising two uni-directional links.
- one uni-directional link 104 from first transmit port 150 in physical layer 102 of a first integrated device to a first receiver port 150 of a second integrated device.
- a second uni-directional link 106 from a first transmit port 160 in physical layer 112 of the second integrated device to a first receiver port 150 of the first integrated device.
- the claimed subject matter is not limited to two uni-directional links.
- the links described with respect to FIG. 1 may be used, for example, as interconnections between an on-chip network.
- the on-chip network may include, for example, multiple processing cores, multiple caches, multiple memories, other components and/or any combination thereof.
- the techniques described herein may significantly reduce the worst case delay experienced by a packet (message) or eliminate starvation of a packet (message) in an on-chip network with multiple virtual channels. This may result in improved latency predictability, even under heavy congestion.
- routers may be interconnected and utilized to provide the desired bandwidth.
- Routers may be designed with the availability to support multiple virtual channels to allow all message types to be mapped to the same physical network, thus providing optimal use of the available bandwidth for a given traffic mix at any time and at any point in the network.
- the router with virtual channels typically contains two different arbitration stages.
- the Local Arbiter (LA) that belongs to a particular input port, and, during a second stage, the Global Arbiter (GA) that selects a winner among all input ports that compete for a particular output port.
- LA Local Arbiter
- GA Global Arbiter
- FIG. 2 is a conceptual diagram of a router pipeline and arbitration scheme as described herein.
- the techniques described herein utilize multiple different arbitration schemes that maintain priority alignment between the LA and the GA.
- a technique utilizing two-tier priority arbitration at the LA stage and an anti-starvation mechanism with round robin arbiter at the GA stage may be utilized.
- a router may receive a packet (message) after route computation (RC), 200 .
- Route computation may be accomplished according to any technique known in the art. Route computation results in a path for a packet to travel through the interconnected nodes in the network, which may include one or more routers.
- the LA 205 utilizes a two-tier priority arbitration scheme.
- Packets messages that are received (e.g., travel across one or more links of the on-chip network) may be divided into flits.
- the first flit of a packet is referred to as the head flit
- the last flit is referred to as the tail flit
- any intervening flits are referred to as body flits.
- the input buffer is shared by all virtual channels (VCs) providing input to the router port.
- VCs virtual channels
- FIG. 2 provides four input ports (e.g., 210 , 212 , 214 , 216 ); however, any number of input ports may be supported. Further, each input port may support any number of VCs (e.g., 218 ). In one embodiment, each VC maintains a separate linked list among the flits belonging to it. There may be multiple types of VCs, for example, routing VCs (RVCs) and performance VCs (PVCs).
- RVCs routing VCs
- PVCs performance VCs
- two priority queues are maintained among all VCs that may operate according to two strategies, 215 . In alternate embodiments, more than two priority queues may be supported.
- the first priority queue may operate in a First-In, First-Out (FIFO) order.
- the second priority queue may operate in a Round Robin (RR) manner.
- the size of the RR queue equals the total number of VCs supported by a single physical port.
- the size of the FIFO queue also equals the total number of VCs supported by a single physical port.
- every VC has one slot in the RR priority queue at all times.
- any VC with a flit waiting will have a slot in the FIFO queue only if the leading flit has waited long enough. Insertion of the VC into the FIFO queue occurs at the time of age threshold crossing. Deletion of the VC from the FIFO occurs upon successful crossbar arbitration and resource (i.e., VC and credit) availability. In one embodiment, the deletion occurs only if there is no other queued flit associated with that VC at that input port.
- the FIFO slots are considered a subset of RR slots.
- each input VC context has an associated timer that starts counting in response to the leading flit arriving and waits for grant. Once the timer expires after a configurable waiting time, the corresponding VC is entered into the FIFO queue and the timer is disabled unless the VC insertion into the FIFO queue is delayed. If multiple VC timers expire at the same time, the VCs may be entered into the FIFO in a selected order, for example, based on VC index. In one embodiment, the insertion is performed one entry per clock cycle, delaying other entries for subsequent clock cycles. These timers can be considered saturation timers.
- arbitration is accomplished in the following manner. All active VC requests are filtered by pre-qualification based on the availability of their respective target port, VC and/or credit. Among the qualified requesting VCs, the one that is at the head of the FIFO queue is selected as the winner. If so, the VC is de-queued and the timer starts again if there are more flits waiting for that VC. If not the RR queue determines the winner.
- a single candidate per input port is then passed to the second (GA) stage of arbitration 230 .
- the GA selects the oldest candidate among all different input port candidates.
- anti-starvation mechanism with round robin arbiter 240 utilized at GA stage 230 provides two functions. The first is to raise the GA stage “awareness” of the starved (aged) packets at the input port and the second is to give waiting body and tail flits higher priority in using available buffer credits.
- Various embodiments of the anti-starvation mechanism within the GA stage are described below.
- a separate priority class is provided for VCs that have starved (aged) at input ports and/or have been turned back from GA stage 230 a selected number of times (programmable) because of lack of buffer credit or VC resources downstream.
- the input port tags headers for packets that have starved and when the packets are subsequently presented to GA stage 230 , they carry the tag, which causes GA stage 230 to place the request in a higher priority arbitration class that can be separate from the main arbitration scheme.
- the higher priority arbitration and the regular round robin arbitration scheme have mutually disjoint requests (i.e., no overlap amongst the requests in the two arbitration schemes).
- the requests in the higher priority arbitration scheme (after passing a resource check) will be chosen before the round robin requests. This is provided by GA priority control 265 .
- the second function mentioned above facilitates keeping flits of long packets together with their respective header flits.
- the starvation flag at the output port and credit return mechanism 245 may be utilized.
- One embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control is described in greater detail with respect to FIG. 3 .
- the flit is checked to see if it is a body flit or a tail flit. If so, the starvation flag is set at the target output VC. Otherwise, the starvation flag is set in the corresponding output routing VC context.
- the starvation flag is checked at the output port according to the following strategy. If the starvation flag is not set on the credited VC, the existing crediting policy remains unchanged. That is, a credit is put into the shared credit pool unless the occupancy is set for an in-use VC or it is reserved for a soon-to-be free routing VC. If the starvation flag is set on the credited VC, the crediting policy is changed. In one embodiment, if the reserved credit of that VC is not set, that reserved credit is set regardless of occupancy of that VC and the starvation flag of the VC is cleared. If the reserved credit for that VC is set, the returned credit is put into the shared pool.
- the technique described above relies on two different arbitration schemes that maintain priority alignment between the LA and the GA.
- the technique combines a two-tier priority arbitration at the LA stage and an anti-starvation mechanism (ASM) with round robin arbiter at the GA stage.
- the two-tier priority arbitration at the LA stage relies on a double priority scheme.
- the first priority queue works in a FIFO order and the second priority queue works in a round robin fashion. Priority is given to the VC that is first in the FIFO queue; if no VC is qualified, the round robin scheme determines the winner.
- the anti-starvation mechanism with round robin arbiter used at the GA stage provides the stages “awareness” of starved packets at the input port(s) and gives waiting body and tail flits higher priority in using available buffer credit.
- the selected flit is transmitted, 270 , via an output port, which is part of the switch transversal, 280 , portion of the router pipeline.
- FIG. 3 is a flow diagram of one embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control.
- the functionality described with respect to FIG. 3 may be provided in the GA stage to provide information to the LA stage to support the two different arbitration schemes that maintain priority alignment between the LA and the GA.
- Global arbitration is performed, 305 .
- the global arbitration is performed as described herein.
- the result of global arbitration is a determination of whether the request wins the output port. If the request does not win the output port, 310 , the anti-starvation mechanism counter for the input port is increased, 320 . If the request does not have its starved tag set, 330 , then a normal credit return occurs, 335 . If the request does have its starved tag set, 330 , then the GA stage determines if the request carries a head flit or a single flit.
- the output port starvation flag is set at the routing VC, 345 . If the request does not carry a head flit or a single flit, 340 , then the GA stage asserts body or tail flit and sets the output starvation flag at the target VC, 350 .
- the global arbitration stage determines whether a tail flit or a single flit has been delivered. If a tail flit or a single flit has been delivered, 310 , the ASM counter for the input port is reset, the output port starvation flag is cleared and the starvation tag is cleared, 315 . If a tail flit or a single flit has not been delivered, 310 , the flit is either a header flit or a body flit, 325 .
- a credit return is reserved for the VC to guarantee an advance, 365 .
- a normal credit return occurs, 370 .
- FIG. 4 is a block diagram of one embodiment of an electronic system.
- the electronic system illustrated in FIG. 4 is intended to represent a range of electronic systems (either wired or wireless) including, for example, desktop computer systems, laptop computer systems, cellular telephones, set top boxes, smart phones, tablets, ultrabooks, netbooks.
- Alternative electronic systems may include more, fewer and/or different components.
- Electronic system 400 includes interconnection network 405 or other communication device to communicate information, and processor 410 coupled to interconnection network 405 that may process information. While electronic system 400 is illustrated with a single processor, electronic system 400 may include multiple processors and/or co-processors. Electronic system 400 further may include random access memory (RAM) or other dynamic storage device 420 (referred to as main memory), coupled to interconnection network 405 and may store information and instructions that may be executed by processor 410 . Main memory 420 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 410 .
- RAM random access memory
- main memory main memory
- Electronic system 400 may also include read only memory (ROM) and/or other static storage device 430 coupled to interconnection network 405 that may store static information and instructions for processor 410 .
- Data storage device 440 may be coupled to interconnection network 405 to store information and instructions.
- Data storage device 440 such as a magnetic disk or optical disc and corresponding drive may be coupled to electronic system 400 .
- Electronic system 400 may also be coupled via interconnection network 405 to display device 450 , such as a cathode ray tube (CRT) or liquid crystal display (LCD), to display information to a user.
- display device 450 such as a cathode ray tube (CRT) or liquid crystal display (LCD)
- Alphanumeric input device 460 may be coupled to interconnection network 405 to communicate information and command selections to processor 410 .
- cursor control 470 such as a mouse, a trackball, or cursor direction keys to communicate direction information and command selections to processor 410 and to control cursor movement on display 450 .
- Electronic system 400 further may include network interface(s) 480 to provide access to a network, such as a local area network.
- Network interface(s) 480 may include, for example, a wireless network interface having antenna 485 , which may represent one or more antenna(e).
- Network interface(s) 480 may also include, for example, a wired network interface to communicate with remote devices via network cable 487 , which may be, for example, an Ethernet cable, a coaxial cable, a fiber optic cable, a serial cable, or a parallel cable.
- network interface(s) 480 may provide access to a local area network, for example, by conforming to IEEE 802.11b and/or IEEE 802.11g standards, and/or the wireless network interface may provide access to a personal area network, for example, by conforming to Bluetooth standards.
- Other wireless network interfaces and/or protocols, for example, IEEE 802.11n and Thunderbolt can also be supported.
- IEEE 802.11b corresponds to IEEE Std. 802.11b-1999 entitled “Local and Metropolitan Area Networks, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Higher-Speed Physical Layer Extension in the 2.4 GHz Band,” approved Sep. 16, 1999 as well as related documents.
- IEEE 802.11g corresponds to IEEE Std. 802.11g-2003 entitled “Local and Metropolitan Area Networks, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 4: Further Higher Rate Extension in the 2.4 GHz Band,” approved Jun. 27, 2003 as well as related documents.
- Bluetooth protocols are described in “Specification of the Bluetooth System: Core, Version 1.1,” published Feb. 22, 2001 by the Bluetooth Special Interest Group, Inc. Associated as well as previous or subsequent versions of the Bluetooth standard may also be supported.
- network interface(s) 480 may provide wireless communications using, for example, Time Division, Multiple Access (TDMA) protocols, Global System for Mobile Communications (GSM) protocols, Code Division, Multiple Access (CDMA) protocols, and/or any other type of wireless communications protocol.
- TDMA Time Division, Multiple Access
- GSM Global System for Mobile Communications
- CDMA Code Division, Multiple Access
- an apparatus in one embodiment, includes input ports, input buffers coupled with respective input ports, output ports, and routing control circuitry coupled with the input ports, the input buffers and/or the output ports.
- the plurality of input buffers and the plurality of output ports, the routing control circuitry to maintain a two-tier priority scheme having at least two queues for prioritizing requests stored in the plurality of input buffers.
- each input buffer is shared by all virtual channels supported by the plurality of input ports.
- each virtual channel has a corresponding linked list of flits belonging to the virtual channel.
- the two priority queues are a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering.
- the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- the router control circuitry includes at least a local arbiter and a global arbiter.
- the local arbiter maintains a round robin queue to provide access for messages stored in input buffers to the output ports.
- the local arbiter also maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time. The messages in the FIFO queue are given a higher priority than messages in the round robin queue.
- FIFO first-in/first-out
- the local arbiter passes to the global arbiter an oldest candidate message from all input buffer messages.
- the global arbiter operates as an anti-starvation mechanism with round robin queue arbiter to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- the virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request.
- at least one input port is coupled with a first processing core and at least one output port is coupled with a second processing core.
- the first processing core and the second processing core are within a single integrated circuit package.
- a method for managing traffic with a routing device having input ports and output ports included the following.
- a local arbiter maintains a round robin queue to provide access for messages stored in the input buffers to the output ports.
- the local arbiter also maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time.
- the messages in the FIFO queue are given a higher priority than messages in the round robin queue.
- a global arbiter operates as an anti-starvation mechanism with round robin queue to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request.
- at least one of the input ports is coupled with a first processing core and at least one of the output ports is coupled with a second processing core, wherein the first processing core and the second processing core are within a single integrated circuit package.
- each input buffer is shared by all virtual channels supported by the input ports.
- each virtual channel has a corresponding linked list of flits belonging to the virtual channel.
- the two priority queues include a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering.
- the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- an apparatus for managing traffic with a routing device having input ports and output ports includes the following. Means for maintaining, within a local arbiter, a round robin queue to provide access for messages stored in the input buffers to the output ports.
- the local arbiter further maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time. The messages in the FIFO queue are given a higher priority than messages in the round robin queue.
- FIFO first-in/first-out
- the apparatus further includes means for operating, within a global arbiter, an anti-starvation mechanism with round robin queue to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request.
- at least one of the input ports is coupled with a first processing core and at least one of the output ports is coupled with a second processing core.
- the first processing core and the second processing core are within a single integrated circuit package.
- each input buffer is shared by all virtual channels supported by the plurality of input ports.
- each virtual channel has a corresponding linked list of flits belonging to the virtual channel.
- the two priority queues comprise a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering.
- the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- a communications device arranged to operate as described above.
- a tablet computing device is arranged to operate as described above.
- a smartphone device is arranged to operate as described above.
- a laptop computing device is arranged to operate as described above.
- a desktop computing device is arranged to operate as described above.
- an ultrabook computing device is arranged to operate as described above.
- one or more integrated circuit packages contain one or more dies configured for managing traffic with a routing device having a plurality of input ports and a plurality of output ports configured to operate as described above.
- a computing device includes an interconnection of functional components within a single integrated circuit package that operate as described above and a memory coupled to store data carried in one or more messages transmitted via at least one output buffer.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Small-Scale Networks (AREA)
Abstract
Description
- Embodiments of the invention relate to techniques for traffic management between connected nodes. More particularly, embodiments of the invention relate to a multi-tier arbitration scheme that may be used to traffic management between the connected nodes.
- For traffic between connected devices, for example, nodes within a multi-node system or processing cores on a chip or processing cores and memory devices, routers are utilized to manage traffic. For example, routers with multiple virtual channels may use round robin priority arbitration schemes in both local arbitration (LA) and global arbitration (GA) stages.
- However, the combination of independent round robin schemes may result in priority miss-alignments between the LA arbiter and the GA arbiter, which can increase the interconnect latency by orders of magnitude beyond average network latency. This potential raises concerns about arbitration fairness and delivery predictability for message traffic.
- Worst case miss-alignments that may occur consist primarily of persistent loss of priority either at the LA arbiter or the GA arbiter and is a result of independent round robin policies used at each stage. For example, when a message is able to receive a highest priority at the LA arbiter, the GA arbiter may point to a different port when the message finally arrives and the LA arbiter may have already changed the highest priority to a different message that is not the oldest at the input port. This may happen multiple times for the same message, increasing the worst case latency, even up to starvation.
- Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
-
FIG. 1 is a block diagram of one embodiment of an apparatus for a physical interconnect. -
FIG. 2 is a conceptual diagram of a router pipeline and arbitration scheme as described herein. -
FIG. 3 is a flow diagram of one embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control. -
FIG. 4 is a block diagram of one embodiment of an electronic system. - In the following description, numerous specific details are set forth. However, embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure the understanding of this description.
- Based on current technology advancement trends, processor packages with tens to hundreds of processing cores and other functional blocks can be integrated on a single die and will be widely available for use in, for example, cloud computing environments. The arbitration scheme described herein may be utilized in any interconnection network between these processor cores and/or other functional blocks and may provide fairness and latency predictability for messages (packets) routed through the architecture. Interconnection schemes can include, for example, two-dimensional router-based mesh or torus interconnect with virtual channel support, which can be very scalable. The techniques described herein may be applied to interconnections between packages as well.
-
FIG. 1 is a block diagram of one embodiment of an apparatus for a physical interconnect. In one aspect, the apparatus depicts a physical layer for a cache-coherent, link-based interconnect scheme for a processor, chipset, and/or IO bridge components. For example, the physical interconnect may be performed by each physical layer of an integrated device. Specifically, the physical layer may provide communication between two ports over a physical interconnect comprising two uni-directional links. Specifically, oneuni-directional link 104 fromfirst transmit port 150 inphysical layer 102 of a first integrated device to afirst receiver port 150 of a second integrated device. Likewise, asecond uni-directional link 106 from a first transmit port 160 in physical layer 112 of the second integrated device to afirst receiver port 150 of the first integrated device. However, the claimed subject matter is not limited to two uni-directional links. - The links described with respect to
FIG. 1 may be used, for example, as interconnections between an on-chip network. The on-chip network may include, for example, multiple processing cores, multiple caches, multiple memories, other components and/or any combination thereof. The techniques described herein may significantly reduce the worst case delay experienced by a packet (message) or eliminate starvation of a packet (message) in an on-chip network with multiple virtual channels. This may result in improved latency predictability, even under heavy congestion. - In an on-chip network, for example, several routers may be interconnected and utilized to provide the desired bandwidth. Routers may be designed with the availability to support multiple virtual channels to allow all message types to be mapped to the same physical network, thus providing optimal use of the available bandwidth for a given traffic mix at any time and at any point in the network. The router with virtual channels typically contains two different arbitration stages. The Local Arbiter (LA) that belongs to a particular input port, and, during a second stage, the Global Arbiter (GA) that selects a winner among all input ports that compete for a particular output port.
-
FIG. 2 is a conceptual diagram of a router pipeline and arbitration scheme as described herein. The techniques described herein utilize multiple different arbitration schemes that maintain priority alignment between the LA and the GA. In one embodiment, a technique utilizing two-tier priority arbitration at the LA stage and an anti-starvation mechanism with round robin arbiter at the GA stage may be utilized. - A router may receive a packet (message) after route computation (RC), 200. Route computation may be accomplished according to any technique known in the art. Route computation results in a path for a packet to travel through the interconnected nodes in the network, which may include one or more routers.
- In one embodiment, the LA 205 utilizes a two-tier priority arbitration scheme. Packets (messages) that are received (e.g., travel across one or more links of the on-chip network) may be divided into flits. The first flit of a packet is referred to as the head flit, the last flit is referred to as the tail flit and any intervening flits are referred to as body flits.
- In one embodiment, for each input port in a router, the input buffer is shared by all virtual channels (VCs) providing input to the router port. The example of
FIG. 2 provides four input ports (e.g., 210, 212, 214, 216); however, any number of input ports may be supported. Further, each input port may support any number of VCs (e.g., 218). In one embodiment, each VC maintains a separate linked list among the flits belonging to it. There may be multiple types of VCs, for example, routing VCs (RVCs) and performance VCs (PVCs). - In one embodiment, two priority queues are maintained among all VCs that may operate according to two strategies, 215. In alternate embodiments, more than two priority queues may be supported. In the example of
FIG. 2 , the first priority queue may operate in a First-In, First-Out (FIFO) order. The second priority queue may operate in a Round Robin (RR) manner. In one embodiment, the size of the RR queue equals the total number of VCs supported by a single physical port. The size of the FIFO queue also equals the total number of VCs supported by a single physical port. - In one embodiment, every VC has one slot in the RR priority queue at all times. In one embodiment, any VC with a flit waiting will have a slot in the FIFO queue only if the leading flit has waited long enough. Insertion of the VC into the FIFO queue occurs at the time of age threshold crossing. Deletion of the VC from the FIFO occurs upon successful crossbar arbitration and resource (i.e., VC and credit) availability. In one embodiment, the deletion occurs only if there is no other queued flit associated with that VC at that input port. The FIFO slots are considered a subset of RR slots.
- In one embodiment, each input VC context has an associated timer that starts counting in response to the leading flit arriving and waits for grant. Once the timer expires after a configurable waiting time, the corresponding VC is entered into the FIFO queue and the timer is disabled unless the VC insertion into the FIFO queue is delayed. If multiple VC timers expire at the same time, the VCs may be entered into the FIFO in a selected order, for example, based on VC index. In one embodiment, the insertion is performed one entry per clock cycle, delaying other entries for subsequent clock cycles. These timers can be considered saturation timers.
- In one embodiment, for each input port, arbitration is accomplished in the following manner. All active VC requests are filtered by pre-qualification based on the availability of their respective target port, VC and/or credit. Among the qualified requesting VCs, the one that is at the head of the FIFO queue is selected as the winner. If so, the VC is de-queued and the timer starts again if there are more flits waiting for that VC. If not the RR queue determines the winner.
- A single candidate per input port is then passed to the second (GA) stage of
arbitration 230. The GA selects the oldest candidate among all different input port candidates. In one embodiment, anti-starvation mechanism withround robin arbiter 240 utilized atGA stage 230 provides two functions. The first is to raise the GA stage “awareness” of the starved (aged) packets at the input port and the second is to give waiting body and tail flits higher priority in using available buffer credits. Various embodiments of the anti-starvation mechanism within the GA stage are described below. - In one embodiment, a separate priority class is provided for VCs that have starved (aged) at input ports and/or have been turned back from GA stage 230 a selected number of times (programmable) because of lack of buffer credit or VC resources downstream. In one embodiment, the input port tags headers for packets that have starved and when the packets are subsequently presented to
GA stage 230, they carry the tag, which causesGA stage 230 to place the request in a higher priority arbitration class that can be separate from the main arbitration scheme. - The higher priority arbitration and the regular round robin arbitration scheme have mutually disjoint requests (i.e., no overlap amongst the requests in the two arbitration schemes). The requests in the higher priority arbitration scheme (after passing a resource check) will be chosen before the round robin requests. This is provided by
GA priority control 265. - The second function mentioned above facilitates keeping flits of long packets together with their respective header flits. In order to accomplish this, the starvation flag at the output port and
credit return mechanism 245 may be utilized. One embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control is described in greater detail with respect toFIG. 3 . - When a request with a starved (aged) tag is sent from
GA 230 toLA 205 for retry, 255, the flit is checked to see if it is a body flit or a tail flit. If so, the starvation flag is set at the target output VC. Otherwise, the starvation flag is set in the corresponding output routing VC context. - In one embodiment, when a downstream credit is returned to an output port, the starvation flag is checked at the output port according to the following strategy. If the starvation flag is not set on the credited VC, the existing crediting policy remains unchanged. That is, a credit is put into the shared credit pool unless the occupancy is set for an in-use VC or it is reserved for a soon-to-be free routing VC. If the starvation flag is set on the credited VC, the crediting policy is changed. In one embodiment, if the reserved credit of that VC is not set, that reserved credit is set regardless of occupancy of that VC and the starvation flag of the VC is cleared. If the reserved credit for that VC is set, the returned credit is put into the shared pool.
- In summary, the technique described above relies on two different arbitration schemes that maintain priority alignment between the LA and the GA. The technique combines a two-tier priority arbitration at the LA stage and an anti-starvation mechanism (ASM) with round robin arbiter at the GA stage. The two-tier priority arbitration at the LA stage relies on a double priority scheme. The first priority queue works in a FIFO order and the second priority queue works in a round robin fashion. Priority is given to the VC that is first in the FIFO queue; if no VC is qualified, the round robin scheme determines the winner. The anti-starvation mechanism with round robin arbiter used at the GA stage provides the stages “awareness” of starved packets at the input port(s) and gives waiting body and tail flits higher priority in using available buffer credit. The selected flit is transmitted, 270, via an output port, which is part of the switch transversal, 280, portion of the router pipeline.
-
FIG. 3 is a flow diagram of one embodiment of a technique for operation of anti-starvation mechanism credit return and anti-starvation counter control. The functionality described with respect toFIG. 3 may be provided in the GA stage to provide information to the LA stage to support the two different arbitration schemes that maintain priority alignment between the LA and the GA. - Global arbitration is performed, 305. The global arbitration is performed as described herein. The result of global arbitration is a determination of whether the request wins the output port. If the request does not win the output port, 310, the anti-starvation mechanism counter for the input port is increased, 320. If the request does not have its starved tag set, 330, then a normal credit return occurs, 335. If the request does have its starved tag set, 330, then the GA stage determines if the request carries a head flit or a single flit.
- If the request carries a head flit or a single flit, 340, the output port starvation flag is set at the routing VC, 345. If the request does not carry a head flit or a single flit, 340, then the GA stage asserts body or tail flit and sets the output starvation flag at the target VC, 350.
- Returning to the result of global arbitration, 305, if the request does win the output port, 310, the global arbitration stage determines whether a tail flit or a single flit has been delivered. If a tail flit or a single flit has been delivered, 310, the ASM counter for the input port is reset, the output port starvation flag is cleared and the starvation tag is cleared, 315. If a tail flit or a single flit has not been delivered, 310, the flit is either a header flit or a body flit, 325.
- For a header flit or a body flit, 325, if the remaining flits buffered at the input port are for the same message, 360, then a credit return is reserved for the VC to guarantee an advance, 365. For a header flit or a body flit, 325, if the remaining flits buffered at the input port are not for the same message, 360, then a normal credit return occurs, 370.
-
FIG. 4 is a block diagram of one embodiment of an electronic system. The electronic system illustrated inFIG. 4 is intended to represent a range of electronic systems (either wired or wireless) including, for example, desktop computer systems, laptop computer systems, cellular telephones, set top boxes, smart phones, tablets, ultrabooks, netbooks. Alternative electronic systems may include more, fewer and/or different components. -
Electronic system 400 includes interconnection network 405 or other communication device to communicate information, andprocessor 410 coupled to interconnection network 405 that may process information. Whileelectronic system 400 is illustrated with a single processor,electronic system 400 may include multiple processors and/or co-processors.Electronic system 400 further may include random access memory (RAM) or other dynamic storage device 420 (referred to as main memory), coupled to interconnection network 405 and may store information and instructions that may be executed byprocessor 410.Main memory 420 may also be used to store temporary variables or other intermediate information during execution of instructions byprocessor 410. -
Electronic system 400 may also include read only memory (ROM) and/or otherstatic storage device 430 coupled to interconnection network 405 that may store static information and instructions forprocessor 410.Data storage device 440 may be coupled to interconnection network 405 to store information and instructions.Data storage device 440 such as a magnetic disk or optical disc and corresponding drive may be coupled toelectronic system 400. -
Electronic system 400 may also be coupled via interconnection network 405 to displaydevice 450, such as a cathode ray tube (CRT) or liquid crystal display (LCD), to display information to a user.Alphanumeric input device 460, including alphanumeric and other keys, may be coupled to interconnection network 405 to communicate information and command selections toprocessor 410. Another type of user input device iscursor control 470, such as a mouse, a trackball, or cursor direction keys to communicate direction information and command selections toprocessor 410 and to control cursor movement ondisplay 450. -
Electronic system 400 further may include network interface(s) 480 to provide access to a network, such as a local area network. Network interface(s) 480 may include, for example, a wireless networkinterface having antenna 485, which may represent one or more antenna(e). Network interface(s) 480 may also include, for example, a wired network interface to communicate with remote devices vianetwork cable 487, which may be, for example, an Ethernet cable, a coaxial cable, a fiber optic cable, a serial cable, or a parallel cable. - In one embodiment, network interface(s) 480 may provide access to a local area network, for example, by conforming to IEEE 802.11b and/or IEEE 802.11g standards, and/or the wireless network interface may provide access to a personal area network, for example, by conforming to Bluetooth standards. Other wireless network interfaces and/or protocols, for example, IEEE 802.11n and Thunderbolt can also be supported.
- IEEE 802.11b corresponds to IEEE Std. 802.11b-1999 entitled “Local and Metropolitan Area Networks, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Higher-Speed Physical Layer Extension in the 2.4 GHz Band,” approved Sep. 16, 1999 as well as related documents. IEEE 802.11g corresponds to IEEE Std. 802.11g-2003 entitled “Local and Metropolitan Area Networks, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 4: Further Higher Rate Extension in the 2.4 GHz Band,” approved Jun. 27, 2003 as well as related documents. Bluetooth protocols are described in “Specification of the Bluetooth System: Core, Version 1.1,” published Feb. 22, 2001 by the Bluetooth Special Interest Group, Inc. Associated as well as previous or subsequent versions of the Bluetooth standard may also be supported.
- In addition to, or instead of, communication via wireless LAN standards, network interface(s) 480 may provide wireless communications using, for example, Time Division, Multiple Access (TDMA) protocols, Global System for Mobile Communications (GSM) protocols, Code Division, Multiple Access (CDMA) protocols, and/or any other type of wireless communications protocol.
- In one embodiment, an apparatus includes input ports, input buffers coupled with respective input ports, output ports, and routing control circuitry coupled with the input ports, the input buffers and/or the output ports. The plurality of input buffers and the plurality of output ports, the routing control circuitry to maintain a two-tier priority scheme having at least two queues for prioritizing requests stored in the plurality of input buffers. In one embodiment, each input buffer is shared by all virtual channels supported by the plurality of input ports.
- In one embodiment, each virtual channel has a corresponding linked list of flits belonging to the virtual channel. In one embodiment, the two priority queues are a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering. In one embodiment, the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- In one embodiment, the router control circuitry includes at least a local arbiter and a global arbiter. The local arbiter maintains a round robin queue to provide access for messages stored in input buffers to the output ports. The local arbiter also maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time. The messages in the FIFO queue are given a higher priority than messages in the round robin queue.
- In one embodiment, the local arbiter passes to the global arbiter an oldest candidate message from all input buffer messages. The global arbiter operates as an anti-starvation mechanism with round robin queue arbiter to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- In one embodiment, the virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request. In one embodiment, at least one input port is coupled with a first processing core and at least one output port is coupled with a second processing core. The first processing core and the second processing core are within a single integrated circuit package.
- In one embodiment, a method for managing traffic with a routing device having input ports and output ports included the following. A local arbiter maintains a round robin queue to provide access for messages stored in the input buffers to the output ports. The local arbiter also maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time. The messages in the FIFO queue are given a higher priority than messages in the round robin queue.
- A global arbiter operates as an anti-starvation mechanism with round robin queue to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- In one embodiment virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request. In one embodiment, at least one of the input ports is coupled with a first processing core and at least one of the output ports is coupled with a second processing core, wherein the first processing core and the second processing core are within a single integrated circuit package.
- In one embodiment, each input buffer is shared by all virtual channels supported by the input ports. In one embodiment, each virtual channel has a corresponding linked list of flits belonging to the virtual channel. In one embodiment, the two priority queues include a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering. In one embodiment, the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- In one embodiment, an apparatus for managing traffic with a routing device having input ports and output ports includes the following. Means for maintaining, within a local arbiter, a round robin queue to provide access for messages stored in the input buffers to the output ports. The local arbiter further maintains a first-in/first-out (FIFO) queue for messages that have been in the round robin queue for greater than a selected threshold length of time. The messages in the FIFO queue are given a higher priority than messages in the round robin queue. The apparatus further includes means for operating, within a global arbiter, an anti-starvation mechanism with round robin queue to place messages tagged having a starvation flag set to a higher priority arbitration class that is separate from one or more main arbitration classes such that the higher priority class and the main classes have mutually disjoint sets of message requests and message requests in the higher priority class are chosen ahead of the main arbitration classes.
- In one embodiment, virtual channel crediting polices are determined based on whether the starvation flag is set for a corresponding message request. In one embodiment, at least one of the input ports is coupled with a first processing core and at least one of the output ports is coupled with a second processing core. The first processing core and the second processing core are within a single integrated circuit package.
- In one embodiment, each input buffer is shared by all virtual channels supported by the plurality of input ports. In one embodiment, each virtual channel has a corresponding linked list of flits belonging to the virtual channel. In one embodiment, the two priority queues comprise a first queue maintained according to a first-in/first-out (FIFO) ordering and a second queue maintained according to a round robin ordering. In one embodiment, the two-tier priority scheme is maintained by a local arbitration stage with anti-starvation information provided by a global arbitration stage.
- In one embodiment, a communications device arranged to operate as described above. In one embodiment, a tablet computing device is arranged to operate as described above. In one embodiment, a smartphone device is arranged to operate as described above. In one embodiment, a laptop computing device is arranged to operate as described above. In one embodiment, a desktop computing device is arranged to operate as described above. In one embodiment, an ultrabook computing device is arranged to operate as described above.
- In one embodiment, one or more integrated circuit packages contain one or more dies configured for managing traffic with a routing device having a plurality of input ports and a plurality of output ports configured to operate as described above. In one embodiment, a computing device includes an interconnection of functional components within a single integrated circuit package that operate as described above and a memory coupled to store data carried in one or more messages transmitted via at least one output buffer.
- Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
- While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.
Claims (26)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/629,351 US8867559B2 (en) | 2012-09-27 | 2012-09-27 | Managing starvation and congestion in a two-dimensional network having flow control |
DE112013004750.0T DE112013004750B4 (en) | 2012-09-27 | 2013-06-18 | Management of starvation and congestion in a two-dimensional network with flow control |
PCT/US2013/046397 WO2014051758A1 (en) | 2012-09-27 | 2013-06-18 | Managing starvation and congestion in a two-dimensional network having flow control |
CN201380045277.XA CN104584497B (en) | 2012-09-27 | 2013-06-18 | Starvation and obstruction in two-dimensional network of the management with flow control |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/629,351 US8867559B2 (en) | 2012-09-27 | 2012-09-27 | Managing starvation and congestion in a two-dimensional network having flow control |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140086260A1 true US20140086260A1 (en) | 2014-03-27 |
US8867559B2 US8867559B2 (en) | 2014-10-21 |
Family
ID=50338813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/629,351 Expired - Fee Related US8867559B2 (en) | 2012-09-27 | 2012-09-27 | Managing starvation and congestion in a two-dimensional network having flow control |
Country Status (4)
Country | Link |
---|---|
US (1) | US8867559B2 (en) |
CN (1) | CN104584497B (en) |
DE (1) | DE112013004750B4 (en) |
WO (1) | WO2014051758A1 (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9742630B2 (en) * | 2014-09-22 | 2017-08-22 | Netspeed Systems | Configurable router for a network on chip (NoC) |
US10084692B2 (en) | 2013-12-30 | 2018-09-25 | Netspeed Systems, Inc. | Streaming bridge design with host interfaces and network on chip (NoC) layers |
CN109218230A (en) * | 2017-06-30 | 2019-01-15 | 英特尔公司 | For balancing the technology of the handling capacity of the input port across multistage network interchanger |
US10218580B2 (en) | 2015-06-18 | 2019-02-26 | Netspeed Systems | Generating physically aware network-on-chip design from a physical system-on-chip specification |
US10348563B2 (en) | 2015-02-18 | 2019-07-09 | Netspeed Systems, Inc. | System-on-chip (SoC) optimization through transformation and generation of a network-on-chip (NoC) topology |
US10419300B2 (en) | 2017-02-01 | 2019-09-17 | Netspeed Systems, Inc. | Cost management against requirements for the generation of a NoC |
US10452124B2 (en) | 2016-09-12 | 2019-10-22 | Netspeed Systems, Inc. | Systems and methods for facilitating low power on a network-on-chip |
US10523599B2 (en) | 2017-01-10 | 2019-12-31 | Netspeed Systems, Inc. | Buffer sizing of a NoC through machine learning |
US10547514B2 (en) | 2018-02-22 | 2020-01-28 | Netspeed Systems, Inc. | Automatic crossbar generation and router connections for network-on-chip (NOC) topology generation |
US10735335B2 (en) | 2016-12-02 | 2020-08-04 | Netspeed Systems, Inc. | Interface virtualization and fast path for network on chip |
US10896476B2 (en) | 2018-02-22 | 2021-01-19 | Netspeed Systems, Inc. | Repository of integration description of hardware intellectual property for NoC construction and SoC integration |
US10983910B2 (en) | 2018-02-22 | 2021-04-20 | Netspeed Systems, Inc. | Bandwidth weighting mechanism based network-on-chip (NoC) configuration |
US11023377B2 (en) | 2018-02-23 | 2021-06-01 | Netspeed Systems, Inc. | Application mapping on hardened network-on-chip (NoC) of field-programmable gate array (FPGA) |
US11144457B2 (en) | 2018-02-22 | 2021-10-12 | Netspeed Systems, Inc. | Enhanced page locality in network-on-chip (NoC) architectures |
US11176302B2 (en) | 2018-02-23 | 2021-11-16 | Netspeed Systems, Inc. | System on chip (SoC) builder |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8867559B2 (en) | 2012-09-27 | 2014-10-21 | Intel Corporation | Managing starvation and congestion in a two-dimensional network having flow control |
EP3066573A4 (en) * | 2013-11-06 | 2017-06-21 | Intel Corporation | Virtual retry queue |
US10050843B2 (en) * | 2015-02-18 | 2018-08-14 | Netspeed Systems | Generation of network-on-chip layout based on user specified topological constraints |
US9864728B2 (en) * | 2015-05-29 | 2018-01-09 | Netspeed Systems, Inc. | Automatic generation of physically aware aggregation/distribution networks |
CN105022717B (en) * | 2015-06-04 | 2018-11-27 | 中国航空无线电电子研究所 | The network-on-chip arbitration method and arbitration unit of additional request number priority |
US11741025B2 (en) * | 2020-12-01 | 2023-08-29 | Western Digital Technologies, Inc. | Storage system and method for providing a dual-priority credit system |
Family Cites Families (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5014265A (en) * | 1989-11-30 | 1991-05-07 | At&T Bell Laboratories | Method and apparatus for congestion control in a data network |
US6018765A (en) | 1996-01-23 | 2000-01-25 | Storage Concepts, Inc. | Multi-channel multimedia data server |
US6285679B1 (en) * | 1997-08-22 | 2001-09-04 | Avici Systems, Inc. | Methods and apparatus for event-driven routing |
US6751698B1 (en) * | 1999-09-29 | 2004-06-15 | Silicon Graphics, Inc. | Multiprocessor node controller circuit and method |
JP4879382B2 (en) * | 2000-03-22 | 2012-02-22 | 富士通株式会社 | Packet switch, scheduling device, discard control circuit, multicast control circuit, and QoS control device |
US6968392B1 (en) | 2000-06-29 | 2005-11-22 | Cisco Technology, Inc. | Method and apparatus providing improved statistics collection for high bandwidth interfaces supporting multiple connections |
US6961781B1 (en) * | 2000-08-31 | 2005-11-01 | Hewlett-Packard Development Company, L.P. | Priority rules for reducing network message routing latency |
US6975638B1 (en) | 2000-10-13 | 2005-12-13 | Force10 Networks, Inc. | Interleaved weighted fair queuing mechanism and system |
US8090869B2 (en) * | 2002-06-04 | 2012-01-03 | Alcatel Lucent | Priority-biased exit queue arbitration with fairness |
WO2004004190A2 (en) * | 2002-06-27 | 2004-01-08 | Tellabs Operations, Inc. | Apparatus and method to switch packets using a switch fabric with memory |
EP1683310A1 (en) * | 2003-10-31 | 2006-07-26 | Koninklijke Philips Electronics N.V. | Integrated circuit and method for avoiding starvation of data |
GR1004758B (en) | 2004-02-11 | 2004-12-15 | Intracomαα.Ε.Αελληνικηαβιομηχανιαατηλεπικοινωνιωνα&Ασυστηματωναπληροφορικησαα | Synthesizable vhdl model of a multi-channel dma-engine core for embedded bus systems |
US8006017B2 (en) * | 2004-12-21 | 2011-08-23 | Intel Corporation | Stream priority |
US8228930B1 (en) * | 2006-06-02 | 2012-07-24 | The Board Of Trustees Of The Leland Stanford Junior University | Interconnection network router arrangements and methods therefor |
US20080091866A1 (en) * | 2006-10-12 | 2008-04-17 | International Business Machines Corporation | Maintaining forward progress in a shared L2 by detecting and breaking up requestor starvation |
US7925816B2 (en) * | 2006-11-06 | 2011-04-12 | Oracle America, Inc. | Architecture for an output buffered switch with input groups |
US9053753B2 (en) * | 2006-11-09 | 2015-06-09 | Broadcom Corporation | Method and system for a flexible multiplexer and mixer |
US20080151894A1 (en) * | 2006-12-22 | 2008-06-26 | Intel Corporation | Selectively hybrid input and output queued router |
US20090097495A1 (en) * | 2007-10-11 | 2009-04-16 | Brocade Communications Systems, Inc. | Flexible virtual queues |
US7974278B1 (en) * | 2007-12-12 | 2011-07-05 | Integrated Device Technology, Inc. | Packet switch with configurable virtual channels |
US8539130B2 (en) * | 2009-09-24 | 2013-09-17 | Nvidia Corporation | Virtual channels for effective packet transfer |
US8626968B2 (en) * | 2009-12-26 | 2014-01-07 | Intel Corporation | Inter-queue anti-starvation mechanism with dynamic deadlock avoidance in a retry based pipeline |
US8705544B2 (en) | 2011-03-07 | 2014-04-22 | Broadcom Corporation | Method and apparatus for routing in a single tier switched network |
US9129060B2 (en) * | 2011-10-13 | 2015-09-08 | Cavium, Inc. | QoS based dynamic execution engine selection |
US8867559B2 (en) | 2012-09-27 | 2014-10-21 | Intel Corporation | Managing starvation and congestion in a two-dimensional network having flow control |
-
2012
- 2012-09-27 US US13/629,351 patent/US8867559B2/en not_active Expired - Fee Related
-
2013
- 2013-06-18 CN CN201380045277.XA patent/CN104584497B/en active Active
- 2013-06-18 WO PCT/US2013/046397 patent/WO2014051758A1/en active Application Filing
- 2013-06-18 DE DE112013004750.0T patent/DE112013004750B4/en active Active
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10084692B2 (en) | 2013-12-30 | 2018-09-25 | Netspeed Systems, Inc. | Streaming bridge design with host interfaces and network on chip (NoC) layers |
US9742630B2 (en) * | 2014-09-22 | 2017-08-22 | Netspeed Systems | Configurable router for a network on chip (NoC) |
US10348563B2 (en) | 2015-02-18 | 2019-07-09 | Netspeed Systems, Inc. | System-on-chip (SoC) optimization through transformation and generation of a network-on-chip (NoC) topology |
US10218580B2 (en) | 2015-06-18 | 2019-02-26 | Netspeed Systems | Generating physically aware network-on-chip design from a physical system-on-chip specification |
US10564704B2 (en) | 2016-09-12 | 2020-02-18 | Netspeed Systems, Inc. | Systems and methods for facilitating low power on a network-on-chip |
US10452124B2 (en) | 2016-09-12 | 2019-10-22 | Netspeed Systems, Inc. | Systems and methods for facilitating low power on a network-on-chip |
US10613616B2 (en) | 2016-09-12 | 2020-04-07 | Netspeed Systems, Inc. | Systems and methods for facilitating low power on a network-on-chip |
US10564703B2 (en) | 2016-09-12 | 2020-02-18 | Netspeed Systems, Inc. | Systems and methods for facilitating low power on a network-on-chip |
US10749811B2 (en) | 2016-12-02 | 2020-08-18 | Netspeed Systems, Inc. | Interface virtualization and fast path for Network on Chip |
US10735335B2 (en) | 2016-12-02 | 2020-08-04 | Netspeed Systems, Inc. | Interface virtualization and fast path for network on chip |
US10523599B2 (en) | 2017-01-10 | 2019-12-31 | Netspeed Systems, Inc. | Buffer sizing of a NoC through machine learning |
US10419300B2 (en) | 2017-02-01 | 2019-09-17 | Netspeed Systems, Inc. | Cost management against requirements for the generation of a NoC |
US10469337B2 (en) | 2017-02-01 | 2019-11-05 | Netspeed Systems, Inc. | Cost management against requirements for the generation of a NoC |
US10469338B2 (en) | 2017-02-01 | 2019-11-05 | Netspeed Systems, Inc. | Cost management against requirements for the generation of a NoC |
CN109218230A (en) * | 2017-06-30 | 2019-01-15 | 英特尔公司 | For balancing the technology of the handling capacity of the input port across multistage network interchanger |
US10547514B2 (en) | 2018-02-22 | 2020-01-28 | Netspeed Systems, Inc. | Automatic crossbar generation and router connections for network-on-chip (NOC) topology generation |
US10896476B2 (en) | 2018-02-22 | 2021-01-19 | Netspeed Systems, Inc. | Repository of integration description of hardware intellectual property for NoC construction and SoC integration |
US10983910B2 (en) | 2018-02-22 | 2021-04-20 | Netspeed Systems, Inc. | Bandwidth weighting mechanism based network-on-chip (NoC) configuration |
US11144457B2 (en) | 2018-02-22 | 2021-10-12 | Netspeed Systems, Inc. | Enhanced page locality in network-on-chip (NoC) architectures |
US11023377B2 (en) | 2018-02-23 | 2021-06-01 | Netspeed Systems, Inc. | Application mapping on hardened network-on-chip (NoC) of field-programmable gate array (FPGA) |
US11176302B2 (en) | 2018-02-23 | 2021-11-16 | Netspeed Systems, Inc. | System on chip (SoC) builder |
Also Published As
Publication number | Publication date |
---|---|
US8867559B2 (en) | 2014-10-21 |
DE112013004750B4 (en) | 2021-05-06 |
CN104584497A (en) | 2015-04-29 |
WO2014051758A1 (en) | 2014-04-03 |
DE112013004750T5 (en) | 2015-08-27 |
WO2014051758A8 (en) | 2014-11-13 |
CN104584497B (en) | 2017-09-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8867559B2 (en) | Managing starvation and congestion in a two-dimensional network having flow control | |
US8085801B2 (en) | Resource arbitration | |
US8718065B2 (en) | Transmission using multiple physical interface | |
US9571402B2 (en) | Congestion control and QoS in NoC by regulating the injection traffic | |
US8059671B2 (en) | Switching device | |
US9325637B2 (en) | System for performing distributed data cut-through | |
US20130179613A1 (en) | Network on chip (noc) with qos features | |
US11394664B2 (en) | Network interface device | |
US10387355B2 (en) | NoC interconnect with linearly-tunable QoS guarantees for real-time isolation | |
US20140281099A1 (en) | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR CONTROLLING FLOW OF PCIe TRANSPORT LAYER PACKETS | |
WO2019080866A1 (en) | Data transmission method and device, and computer storage medium | |
US10289598B2 (en) | Non-blocking network | |
US8116311B1 (en) | Method and system for tag arbitration in switches | |
US8671220B1 (en) | Network-on-chip system, method, and computer program product for transmitting messages utilizing a centralized on-chip shared memory switch | |
US8040907B2 (en) | Switching method | |
US8976802B2 (en) | Prediction-based switch allocator | |
KR20170015000A (en) | On-chip network and communication method thereof | |
US10185606B2 (en) | Scalable autonomic message-transport with synchronization | |
CN115695578A (en) | A data center network TCP and RDMA hybrid flow scheduling method, system and device | |
Rad et al. | Flow control and scheduling mechanism to improve network performance in wireless NoC | |
Zhang et al. | QBNoC: QoS-aware bufferless NoC architecture | |
US10419367B2 (en) | Queue buffer de-queuing | |
WO2016088371A1 (en) | Management node, terminal, communication system, communication method, and program recording medium | |
WO2020001608A1 (en) | Cache allocation method and device | |
CN115225421A (en) | System and method for reducing congestion in a network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: INTEL CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DAI, DONGLAI;MEJIA, ANDRES;PORTA, GASPAR MORA;REEL/FRAME:030002/0910 Effective date: 20121116 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.) |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20181021 |