US20140153582A1 - Method and apparatus for providing a packet buffer random access memory - Google Patents
- Publication number
- US20140153582A1 (application US 14/175,142)
- Authority
- US
- United States
- Prior art keywords
- packet
- command
- pbram
- queue
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/62—Queue scheduling characterised by scheduling criteria
- H04L47/625—Queue scheduling characterised by scheduling criteria for service slots or service orders
- H04L47/6275—Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/62—Queue scheduling characterised by scheduling criteria
- H04L47/6215—Individual queue per QOS, rate or priority
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/901—Buffering arrangements using storage descriptor, e.g. read or write pointers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/9021—Plurality of buffers per packet
Definitions
- U.S. patent application Ser. No. 13/369,593 is a continuation of U.S. patent application Ser. No. 12/718,300, filed on Mar. 5, 2010 and which is now U.S. Pat. No. 8,126,003.
- U.S. patent application Ser. No. 12/718,300 is a continuation of U.S. patent application Ser. No. 10/614,558, filed on Jul. 7, 2003 and which is now U.S. Pat. No. 7,675,925.
- U.S. patent application Ser. No. 10/614,558 is a continuation of U.S. patent application Ser. No. 09/283,778, filed on Mar.
- Local-area networks (LANs) include a bus that is shared by a number of computers.
- Local-area networks permit only one computer to send data over the bus at a given time and that computer can only utilize the bus for a certain period of time before it is required to relinquish it.
- each computer typically segments the information into packets having predefined maximum and minimum lengths. Each packet is sent during a separate bus transaction. If more than one computer needs to send information, then the computers alternately send their packets, so as to share the bus.
- Computer networks are more useful where they are connected to one another such that information can be communicated between two computers on different physical networks. This can be done by employing intermediate computers referred to as “routers”. Each router has two or more network connections to different physical networks. The routers relay packets received from one interface to the other interface and vice versa. For example, consider the network configuration depicted in FIG. 1 . Five hosts 2 , 4 , 6 , 18 and 20 , and two routers 8 , 10 are connected by networks 12 , 14 and 16 . The router R 1 is able to directly deliver any messages that are intended for delivery to hosts 2 , 4 , 18 and 20 . However, a message that is intended for host H 5 must be initially delivered to router R 2 which is able to directly deliver it to H 5 .
- LAN switching is necessary due to the increasing volume of traffic present on many corporate LANs.
- New applications such as the world-wide web (WWW) and voice-over-IP are responsible for that increased network load.
- A LAN switch resembles a router in that it relays packets received at one interface to another interface on the same device. However, the switch must perform this relay operation at high speed and therefore typically does so in hardware rather than in software, as is the case with a router. Accordingly, it is usually necessary to employ some form of memory in a network switch to handle the case where a packet's intended output port is occupied sending or receiving other traffic.
- FIG. 2 shows a situation where buffering is required. Ports P 1 and P 2 each receive traffic for the output port P 3 .
- Assuming that the input and output ports operate at the same speed, some form of buffering is required, such as queue 22. If port P3 is busy when packets arrive from ports P1 or P2, then the packets are buffered in queue 22. Once port P3 is free, the data packets will be released from queue 22 in the order that they were received.
- each network port (either input or output) has memory associated with it.
- the network port may write packets only into its dedicated memory, and read packets only from its dedicated memory. Usually, a packet must be completely transferred from an input memory to an output memory.
- this transfer methodology is the primary disadvantage of the dedicated port architecture.
- the other disadvantage is that the amount of memory allocated to a port is finite. If a port's buffer becomes filled, any further information sent to that port will be lost even though memory may be unused elsewhere in the switch.
- the primary advantage of the dedicated port memory is that there is no need for a port to arbitrate for access to memory, which can be a significant time consuming operation.
- In the shared global memory architecture, the switch has access to a single global memory and all network ports must arbitrate for access to that memory.
- the primary advantages of this architecture are that no copying of packets in memory is required, and the memory is useable by all ports such that no port will be denied any memory until all the memory is in use.
- The disadvantages of the global memory architecture are twofold. First, a very high bandwidth bus is required to permit all input ports to write into and read out of the memory at speeds that approach the data rate of the network. For example, a twenty-four-port 100 Mbit/sec Ethernet switch may perform twenty-four 100 Mbit/sec reads and twenty-four 100 Mbit/sec writes, for a total bus data rate of 4.8 Gbit/sec. It should be noted that such a data rate exceeds the capacity of a 64-bit, 66 MHz PCI bus.
- the second disadvantage of the global memory architecture is that time is lost in arbitrating for the memory among all of the ports.
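- The bandwidth comparison in the first disadvantage above can be checked with a short sketch. The PCI figure is an idealized peak that assumes one full 64-bit transfer on every 66 MHz clock with no protocol overhead; that assumption and the function names are illustrative and not taken from this document.

```python
# Aggregate bus bandwidth needed by a shared-memory switch, compared with
# the idealized peak of a 64-bit, 66 MHz PCI bus (assumption: one transfer
# per clock, no arbitration or protocol overhead).

def required_bus_rate_bps(ports: int, port_rate_bps: float) -> float:
    """Each port needs one read stream and one write stream at line rate."""
    return 2 * ports * port_rate_bps


def pci_peak_rate_bps(width_bits: int, clock_hz: float) -> float:
    """Idealized peak: bus width times clock frequency."""
    return width_bits * clock_hz


switch_rate = required_bus_rate_bps(ports=24, port_rate_bps=100e6)
pci_rate = pci_peak_rate_bps(width_bits=64, clock_hz=66e6)

print(f"switch needs {switch_rate / 1e9:.1f} Gbit/sec")
print(f"PCI peak     {pci_rate / 1e9:.3f} Gbit/sec")
print("exceeds PCI:", switch_rate > pci_rate)
```

- Under these assumptions the switch needs 4.8 Gbit/sec against a PCI peak of roughly 4.2 Gbit/sec, before any time lost to arbitration is counted.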
- an embodiment of the present invention is a packet buffer RAM (PBRAM) that provides advantages of the aforementioned memory architectures while removing the disadvantages.
- PBRAM is a single global memory arranged in a queue architecture, so it has the properties that no packet data copying is required, and that all of the memory is available to all of the ports.
- PBRAM in the preferred embodiment is a 32-port memory. This means that 32 different devices may access the memory without the need to arbitrate for the data channels.
- a method and apparatus for storing data packets, transferred across a computer network, in a packet buffer random access memory or PBRAM device.
- The PBRAM device receives a number of data packets from network controllers that are coupled to the computer network via associated input ports. After the data packets are received, portions thereof are serially transferred to different segments of serial registers that are connected between the input ports and the memory array. Lastly, the data packets are conveyed to the memory array portion of the device in a parallel manner while other portions of the packets are being conveyed to other segments of the serial registers.
- the PBRAM device further assigns input queue structures in the memory array. It also stores pointers to the packets in a packet table and stores pointers to associated locations of the packet table in the queue structures. Those queue structures are accessible by associated output ports of the PBRAM device such that said pointers are transferred from the input queue structures to associated output queue structures that deliver the data packets to the output ports.
- FIG. 1 is a schematic drawing of a typical network configuration
- FIG. 2 is a schematic diagram of a buffering operation performed between a number of network ports
- FIG. 3 is a schematic diagram of an SRAM memory configuration
- FIG. 4 is a schematic diagram of a DRAM memory configuration
- FIG. 5 is a block diagram of a two-bank DRAM device
- FIG. 6 is a block diagram of a network switch configuration that includes a PBRAM device, according to the present invention.
- FIG. 7 is a schematic diagram of the PBRAM device of FIG. 6 ;
- FIG. 8 is a schematic diagram of an internal DRAM memory array of the PBRAM device of FIG. 6 ;
- FIG. 9 is a block diagram of a twenty-four port Ethernet switch including the PBRAM device of FIG. 6 ;
- FIG. 10 is a block diagram of a configuration including a number of PBRAM devices such as shown in FIG. 6 ;
- FIG. 11 illustrates packets that have been distributed across the configuration of PBRAMs, such as shown in FIG. 10 ;
- FIG. 12 is a flow diagram of the operation of the PBRAM device shown in FIG. 6 .
- FIG. 13 depicts the structure of the Read Data Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 14 depicts the structure of the Suspend Output Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 15 depicts the structure of the Assign Queue Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 16 depicts the structure of the Assign Tag Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 17 depicts the structure of the Assign Length Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 18 depicts the structure of the Commit Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 19 depicts the structure of the Write Abort Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 20 depicts the structure of the Transfer Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 21 depicts the structure of the Drop Data Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 22 depicts the structure of the Flush Queue Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 23 depicts the structure of the Reset Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 24 depicts the structure of the No-Op Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 25 depicts the structure of the Test Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 26 depicts the structure of the Set Chip Count Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 27 depicts the structure of the Set Tag Length Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 28 depicts the structure of the Timing Reference Command that can be executed on the PBRAM device of FIG. 7 ;
- FIG. 29 depicts the structure of the Vernier Adjust Command that can be executed on the PBRAM device of FIG. 7 ;
- an embodiment of the present invention is a packet buffer random access memory (PBRAM) that provides the advantages of the aforementioned memory architectures while removing the disadvantages.
- PBRAM includes a single global memory, so it has the properties that no packet data copying is required, and that all of the memory is available to all of the ports.
- the PBRAM of the preferred embodiment includes a 32-port memory. This means that 32 different devices may access the memory without the need to arbitrate for the data channels. Each port may operate at up to 250 Mbit/sec, so the whole chip may run at 8 Gbit/sec. Further, it is much easier to increase the total bandwidth of PBRAM than it is to increase the bandwidth of a PCI bus or similar memory bus.
- Two common types of memory are static random access memory (SRAM) and dynamic random access memory (DRAM).
- Each of these memories consists of an array of wordlines and bitlines. In either configuration, a memory is accessed by turning on one of the associated wordlines. Responsively, all memory cells connected to that wordline either take a new state from the bitlines (write operation), or deliver their state to the bitlines (read operation). For read operations, circuits called sense amplifiers detect minute voltage changes on the bitlines caused by the memory cells and thereby retrieve the read data from the bitlines. The sensing speed of the device is dependent on the technology used and the load present on the bitlines. Since the bitlines and memory-cell connections are capacitive, increasing the number of memory cells connected to a bitline will slow down the sensing operation.
- FIG. 3 is a block diagram depicting a portion of a typical fast SRAM memory 29 .
- SRAM memory cell 34 is connected to wordline 32 a and bitlines 36 and 38 .
- Clamp devices 30 prevent the bitline voltage from falling below a level defined by the supply voltage (Vdd) minus the threshold voltage (Vtn) of transistors 30 a and 30 b .
- When the read cycle is complete, wordline 32a is turned off. A different wordline may then be turned on for the next read cycle depending on the data to be retrieved.
- the memory is designed such that each SRAM memory cell may rapidly pull the bitlines 36 and 38 to a proper state during a read cycle. Each bitline 36 and 38 is guaranteed to be no more than a threshold voltage Vtn away from its final value at the start of the read. Typically, the entire operation occurs within 20 ns or less, from the time that the read command is specified to the device to the time when output data is available on the data pins.
- FIG. 4 depicts a block diagram of a DRAM memory 41 .
- A single-transistor DRAM cell 42 stores a logic state as a small amount of charge on a capacitor 43. Accordingly, a read operation of a DRAM memory cell 42 proceeds much differently than a read operation of an SRAM memory cell. Since DRAM memory cell 42 is incapable of reversing the differential voltage on bitlines 44 and 46, they are precharged to a common voltage level by precharge circuit 52 before the read operation is commenced. To start the read cycle, wordline 50 is turned on, at which point the charge stored in memory cell 42 is dumped onto bitline 44. Note that only one bitline is connected to each memory cell of the DRAM memory, whereas both bitlines were connected to the SRAM memory cells.
- the small charge difference can then be sensed with the sense amp 48 .
- wordline 50 a is turned off and a precharge cycle is performed.
- a precharge cycle is always performed at the end of the read cycle so that the memory cells can respond to a new access with minimum latency.
- the read sensing operation in a typical DRAM takes 30-60 ns, with the precharge taking an additional 30 ns. Accordingly, the overall operation is much slower than that of the SRAM.
- DRAM accesses are divided up into “row cycles” and “column cycles”. During a row cycle, a wordline (e.g. 50a) is turned on and sensing occurs. Once sensing is complete, column cycles may occur. Since the DRAM memory data appears at the output of the sense amplifiers, multiple column-cycle reads can actually occur as fast as they do in an SRAM memory 29. To access a different row, a precharge cycle for the current row and a row cycle for the new row must be performed. Effective use of row and column cycles requires that adjacent memory accesses reference the same row as much as possible.
- Each DRAM bank is an independent memory device; however, all banks share the same input and output ports.
- Bank A 54 and Bank B 56 each connect to I/O circuitry 58 .
- Such an architecture permits row cycles to be started in banks A 54 and B 56 concurrently. Data may be read first from bank A 54, then from bank B 56. While data is being read from bank B, bank A is precharged and a new row cycle is started. Column cycles can then proceed from bank A 54 while bank B 56 is being precharged. In this manner, DRAM reads can proceed continuously, without an externally visible pause for a new row cycle.
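- The benefit of alternating between banks can be illustrated with a small timing model. This is a sketch under stated assumptions, not a description of any particular device: the 60 ns row cycle and 30 ns precharge are taken from the ranges quoted earlier, while the 10 ns column cycle and the function name `read_time_ns` are hypothetical.

```python
# Toy timing model: consecutive rows are read from banks in round-robin
# order. A bank must precharge and run a new row cycle before it can
# deliver data again; the shared I/O path carries one burst at a time.

def read_time_ns(rows: int, cols_per_row: int, banks: int,
                 t_row: int = 60, t_col: int = 10, t_pre: int = 30) -> int:
    """Return the time at which the last column read completes."""
    burst = cols_per_row * t_col          # time to stream one row's columns
    bank_ready = [t_row] * banks          # all banks start a row cycle at t=0
    bus_free = 0                          # shared I/O circuitry availability
    for i in range(rows):
        b = i % banks
        start = max(bank_ready[b], bus_free)
        bus_free = start + burst
        # After its burst, this bank precharges and opens its next row.
        bank_ready[b] = bus_free + t_pre + t_row
    return bus_free


print("one bank :", read_time_ns(rows=4, cols_per_row=16, banks=1), "ns")
print("two banks:", read_time_ns(rows=4, cols_per_row=16, banks=2), "ns")
```

- With two banks, the precharge and row cycle of one bank are hidden behind the column cycles of the other whenever a burst lasts at least as long as the precharge plus row time, so only the very first row cycle is externally visible.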
- PBRAM 62 is a 32-port scalable memory device used in a packet switching environment.
- Devices that interface to a network, referred to as media-access controllers or MACs 60, all connect to PBRAM 62.
- a switching ASIC 64 also connects to the PBRAM 62 .
- the switching ASIC 64 contains a hardware implementation of the network packet switching/routing algorithms. Note that all MAC devices 60 have direct access to the PBRAM through their own dedicated ports.
- A block diagram of PBRAM 62 is shown in FIG. 7.
- Thirty-two I/O ports 70 each connect to an associated one of thirty-two serial registers 72 .
- a 2048-bit wide databus 77 connects the serial registers to DRAM array 74 .
- The I/O ports 70 are half-duplex ports. Full-duplex ports, such as are required for some network protocols, can be implemented through the use of one port for each data transfer direction.
- Each data port consists of two bi-directional pins DQ 70 a and DQM 70 b .
- the thirty-two ports 70 are grouped into four groups of eight ports each. Each group runs off a common clock referred to as signal DCLK 71 a .
- each group of ports has two return clock outputs referred to as signals QS 71 b and QSCAL 71 c . Their functions will be described below.
- control ports 76 are provided to submit commands to the PBRAM 62 .
- Each control port consists of a command clock CCLK 76a, a command flag CMDF 76b and an eight-bit command port CCMD<7:0> 76c.
- the devices connected to PBRAM 62 multiplex commands onto the command ports 76 .
- an alternate embodiment of the present invention could include full-duplex I/O ports so that protocols such as the gigabit Ethernet protocol may be supported without requiring a port for each direction.
- a full-duplex PBRAM solution could be implemented by merging the command and data ports such that commands and data are intermixed on the input ports, thereby eliminating the need for arbitration of commands on the control ports 76 .
- Two data signaling techniques, referred to as single-data-rate (SDR) signaling and double-data-rate (DDR) signaling, are supported by PBRAM 62.
- When SDR signaling is utilized, a new data item is available on each rising edge of signal DCLK 71a.
- When DDR signaling is used, a new data item is available at both the rising and falling edges of signal DCLK 71a. Accordingly, DDR signaling doubles the maximum rate at which data may be transferred, at the expense of complicated timing circuitry such as a delay-locked loop (DLL).
- PBRAM 62 requires a DLL for other reasons, so this does not pose an implementation problem.
- A maximum clock speed of 125 MHz can be achieved. Such clock speeds permit I/O port 70 to operate at 125 megabits per second (Mbps) in SDR mode and 250 Mbps in DDR mode. These port speeds are sufficient for many network protocols, e.g. 10/100 Mbps Ethernet and 155 Mbps FDDI. However, such speeds are not sufficient for the gigabit Ethernet protocol.
- PBRAM permits two, four or eight I/O ports 70 to be aggregated, i.e. the ports operate in parallel.
- a gigabit Ethernet port can be formed by aggregating four I/O ports 70 that are operating in DDR mode.
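- The port-rate figures above follow from simple arithmetic, sketched here under the assumption that the single DQ data pin carries one bit per active clock edge. The function names and the `ports_needed` helper are hypothetical and only illustrate the calculation.

```python
# Per-port and aggregated data rates for a serial port clocked at 125 MHz.

CLOCK_HZ = 125_000_000              # maximum DCLK frequency stated above
ALLOWED_GROUP_SIZES = (1, 2, 4, 8)  # 1 = a single, non-aggregated port


def port_rate_bps(clock_hz: int, ddr: bool) -> int:
    """One bit per rising edge (SDR) or per rising and falling edge (DDR)."""
    return clock_hz * (2 if ddr else 1)


def aggregate_rate_bps(ports: int, clock_hz: int, ddr: bool) -> int:
    """Rate of a group of ports operating in parallel."""
    if ports not in ALLOWED_GROUP_SIZES:
        raise ValueError("ports may only be aggregated in groups of 2, 4 or 8")
    return ports * port_rate_bps(clock_hz, ddr)


def ports_needed(target_bps: int, clock_hz: int, ddr: bool):
    """Smallest allowed group that reaches the target rate, or None."""
    for n in ALLOWED_GROUP_SIZES:
        if aggregate_rate_bps(n, clock_hz, ddr) >= target_bps:
            return n
    return None


print(port_rate_bps(CLOCK_HZ, ddr=False))                 # SDR, bits/sec
print(port_rate_bps(CLOCK_HZ, ddr=True))                  # DDR, bits/sec
print(ports_needed(1_000_000_000, CLOCK_HZ, ddr=True))    # gigabit Ethernet
```

- Four DDR ports reach exactly 1 Gbit/sec, which matches the gigabit Ethernet aggregation described above; in SDR mode the same rate would require a group of eight.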
- each I/O port 70 includes two signals referred to as DQ 70 a and DQM 70 b .
- Signal DQ 70 a is a data signal that conveys packet data as a serial stream of logical zeroes and logical ones.
- Signal DQM 70 b is a mask signal that is used to qualify that packet data as follows:
- the “no data” qualification is used when the MAC devices 60 do not run at the same clock speed as I/O port 70 and hence there are some clock cycles that convey no information and should be ignored. That qualification is also necessary where the network protocol performs a “bit-stuffing” operation. For example, in the HDLC protocol used for X.25 and Frame Relay communication, a sequence of six consecutive logical one values in the user data is prohibited from occurring. When such a bit pattern occurs in data to be transferred, the HDLC transmitter inserts a logical zero bit after the fifth logical one bit to break up the prohibited sequence. The HDLC receiver will remove such bits so that the data returned to the user is the same as the data that was sent.
- If the data stream entering PBRAM 62 is synchronous with the data stream entering the HDLC receiver, then a “hole” in the data will occur when the padded “0” bit is removed. To keep the two devices in synchronization, a “no data” indication is sent to PBRAM 62 at that time.
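- The bit-stuffing behaviour described above, and the resulting “hole” in the received stream, can be sketched as follows. This is a simplified illustration: it handles payload bits only (no HDLC flag or abort sequences), and the `(bit, valid)` pair used to represent a “no data” cycle is a hypothetical stand-in for the mask signal, whose actual encoding is not reproduced here.

```python
# Simplified HDLC-style bit stuffing over lists of 0/1 integers.

def stuff(bits):
    """Insert a 0 after every run of five consecutive 1 bits."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)   # stuffed bit breaks up the prohibited sequence
            run = 0
    return out


def unstuff_with_mask(bits):
    """Return one (bit, valid) pair per received bit time.

    The bit that follows five consecutive 1s is treated as a stuffed 0 and
    marked valid=False, i.e. a "no data" cycle for the downstream memory.
    """
    out, run, skip_next = [], 0, False
    for b in bits:
        if skip_next:
            out.append((b, False))
            skip_next, run = False, 0
            continue
        out.append((b, True))
        run = run + 1 if b == 1 else 0
        if run == 5:
            skip_next = True
    return out


data = [0, 1, 1, 1, 1, 1, 1, 0, 1]
line = stuff(data)
received = unstuff_with_mask(line)
recovered = [b for b, valid in received if valid]
print(line)
print(recovered == data)
```

- The receiver still sees one bit time per transmitted bit, so the stuffed position must be flagged rather than silently dropped if the two devices are to stay in step.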
- the two-bit interface permits a fourth qualification referred to as “end-of-packet”. That qualification is used when working with protocols where the length of a packet is not known in advance. Accordingly, once a MAC device 60 detects the end of a packet, it can signal this condition to the PBRAM 62 by generating an end-of-packet signal.
- The DRAM array 74, also referred to as the core 74, consists of 8192 rows and 8192 columns for a total of 64 megabits of memory capacity.
- The core 74 is broken up into 64 banks, each including 1024 rows and 1024 columns. Each bank has its own row and column circuitry such that the banks may operate independently.
- Each serial register 72 is 2048 bits wide. The serial registers 72 are divided into eight segments of 256 bits each. There are a total of thirty-two serial registers 72, one for each of the I/O ports 70 of PBRAM 62.
- Each serial register 72 is connected to the DRAM array 74 and the adjacent registers by a 2048-bit wide data bus 77 .
- Each data bus 77 is connected through a 4:1 multiplexer 76 to an 8192-bit wide DRAM databus 79 .
- the 4:1 multiplexer 76 is utilized because the SRAM cells that make up the serial registers 72 are four times as wide as the DRAM cells in DRAM array 74 .
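- The array and register dimensions given above are mutually consistent, as the following arithmetic sketch shows. It uses only the figures stated in the text; the variable names are illustrative.

```python
# Consistency check of the core, bank, serial-register and bus dimensions.

ROWS, COLS = 8192, 8192
BANKS, BANK_ROWS, BANK_COLS = 64, 1024, 1024
SEGMENTS, SEGMENT_BITS = 8, 256
REGISTER_BUS_BITS = 2048
MUX_RATIO = 4
DRAM_BUS_BITS = 8192

total_bits = ROWS * COLS

# 64 banks of 1024 x 1024 cells cover the whole core.
assert BANKS * BANK_ROWS * BANK_COLS == total_bits
# Eight 256-bit segments make up one 2048-bit serial register.
assert SEGMENTS * SEGMENT_BITS == REGISTER_BUS_BITS
# The 4:1 multiplexer maps the 2048-bit register bus onto the 8192-bit DRAM bus.
assert REGISTER_BUS_BITS * MUX_RATIO == DRAM_BUS_BITS

print(total_bits // 2**20, "Mbit")        # capacity in binary megabits
print(total_bits // 8 // 2**20, "MB")     # the same capacity in megabytes
```

- The result is 64 Mbit, or 8 MB, of storage.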
- PBRAM 62 is addressed as if it were an array of queues.
- Each memory address supplied to PBRAM 62 represents a queue.
- a write operation appends a packet of data to the tail of such a queue, and a read operation obtains a packet of data from the head of such a queue.
- a data transfer command causes packets to be copied from one queue to another. The transfer command is processed by modifying pointers to packet data within the PBRAM 62 itself. Therefore, no packet data is actually moved around in memory. Addressing by queues transfers all responsibility for optimal address allocation from the end user, i.e. MAC devices 60 , to the PBRAM 62 itself. Each PBRAM 62 may therefore perform allocation that is optimal for its configuration. Consequently, some of the memory capacity of the PBRAM 62 is consumed by queue management operations.
- An embodiment of PBRAM 62 supports a total of 256 queues. Each queue is further broken down into sub-queues that are each associated with one of sixteen priority levels, for a total of 4096 queue/priority-level combinations.
- the sub-queues and priority levels permit quality-of-service (QoS). For example, if a queue is mapped to an output port, then the sub-queues may be used to hold regular and priority packets at different priority levels. Therefore, when data is read from the queues, it is retrieved from the highest priority sub-queue that contains data.
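- A minimal software sketch of the queue and sub-queue arrangement described above is given here. It is an illustration only: the class and method names are hypothetical, and treating priority level 0 as the highest priority is an assumption, since the numbering of priority levels is not specified.

```python
from collections import deque

NUM_QUEUES = 256
NUM_PRIORITIES = 16   # 256 x 16 = 4096 queue/priority-level combinations


class PriorityQueues:
    """Each queue holds one FIFO sub-queue per priority level."""

    def __init__(self):
        self._sub = [[deque() for _ in range(NUM_PRIORITIES)]
                     for _ in range(NUM_QUEUES)]

    def write(self, queue: int, priority: int, packet) -> None:
        """Append a packet to the tail of the selected sub-queue."""
        self._sub[queue][priority].append(packet)

    def read(self, queue: int):
        """Return the head packet of the highest-priority non-empty sub-queue.

        Assumption: priority 0 is the highest. Returns None if the queue
        holds no packets at any priority level.
        """
        for sub in self._sub[queue]:
            if sub:
                return sub.popleft()
        return None


q = PriorityQueues()
q.write(queue=3, priority=7, packet="regular-1")
q.write(queue=3, priority=7, packet="regular-2")
q.write(queue=3, priority=1, packet="urgent")
print(q.read(3), q.read(3), q.read(3), q.read(3))
```

- The priority packet is returned first even though it was written last, while packets within one sub-queue keep their arrival order.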
- When a packet is written to PBRAM 62, it is stored in a physical location in memory array 74 that is currently unused. An associated write command will identify a queue structure within that memory to which the packet should be associated. Accordingly, a pointer to the physical location in memory array 74 is maintained in a packet table. When the packet is associated with a queue structure, a pointer to the appropriate packet table entry is placed on that queue structure. Therefore, upon issuance of a read command, the pointer on the queue is transferred to an output queue such that the packet can be accessed and output via the serial register 72. More specifically, a PBRAM system has 4096 packet queues. All data in a PBRAM system is addressed through 12-bit queue descriptors. A packet switch does not need to perform its own queue management.
- a packet switch can use the queues in any number of ways. For example, each of the 32 ports can have its own input and output queue. For prioritized service, each port can be assigned multiple queues. For example, 16 input and 16 output queues may be set up per port, using only 1024 of the 4096 available queues. PBRAM puts no restrictions on queue assignment; the controller may use the queues as it sees fit.
- When data is written to PBRAM, the write command must specify a queue to write to. The packet will be appended to the tail of the requested queue. PBRAM will automatically direct packet data to an unused area on the chip.
- a read command must specify a queue to read from.
- PBRAM will return the packet at the head of the queue.
- the read command may optionally dequeue the packet. If a packet is not dequeued, then a subsequent read command for the same queue will return the same packet. If a packet is dequeued, then the memory occupied by the packet will be returned to a free pool for re-use.
- PBRAM supports a cut-through operation. If a write command is issued to an empty queue, then a read command may be issued from the same queue no earlier than 256 bit-times after the start of write data. In this case, PBRAM will return the data being written. Care must be taken not to underrun in a cut-through operation: if the write function is held up such that fewer than 256 bits separate the current read and write pointers, then the returned data is undefined.
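- The cut-through timing rule above can be expressed as a pair of checks. This is a sketch under assumptions: pointers are modelled as bit offsets from the start of the packet, the names are hypothetical, and the rule that the gap requirement no longer applies once the write has completed is an assumption rather than something stated in the text.

```python
CUT_THROUGH_GAP_BITS = 256   # minimum separation stated above


def earliest_read_start(write_start_bit_time: int) -> int:
    """Earliest bit time at which a cut-through read may begin."""
    return write_start_bit_time + CUT_THROUGH_GAP_BITS


def read_is_safe(write_ptr: int, read_ptr: int, write_done: bool) -> bool:
    """True if the read pointer is not in danger of underrunning the writer.

    Assumption: once the whole packet has been written, the reader may
    catch up to the write pointer without underrun.
    """
    if write_done:
        return read_ptr <= write_ptr
    return write_ptr - read_ptr >= CUT_THROUGH_GAP_BITS


print(earliest_read_start(1000))
print(read_is_safe(write_ptr=512, read_ptr=256, write_done=False))
print(read_is_safe(write_ptr=400, read_ptr=256, write_done=False))
```

- A controller that throttles its reads whenever `read_is_safe` would become false avoids the undefined data described above.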
- a data transfer command allows a packet at the head of one queue to be dequeued and appended to the tail of another. This operation is the only way to move packets in a PBRAM system. It is also the only way to address specific packets. If multiple read operations must be performed on a single packet, then the read commands must not dequeue the packet. If the PBRAM controller does not want repeated processing of one packet to block processing of others, then it may move the packet to an empty queue where it can be processed without blocking traffic at the source queue.
- a queue drop command causes the packet at the head of the specified queue to be dropped. This operation is useful in case PBRAM experiences congestion.
- a queue flush command causes the entire contents of a queue to be freed. Only one queue flush operation may be in effect in the entire PBRAM system at any given time.
- When a packet is written to PBRAM 62, the PBRAM 62 will allocate memory for it. When the packet is read back to the network, i.e. when it is de-queued, PBRAM 62 will return the memory occupied by the packet to a list of free memory locations, referred to as the free pool. It is possible for a packet to be present in more than one queue at the same time, for example, to broadcast a packet. Therefore, the memory the packet occupies is re-used only after the last instance of the packet is de-queued.
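- The queue-addressed commands described above (write, read with optional dequeue, transfer, drop, flush) and the free-pool behaviour can be modelled in a few lines. This is a toy software sketch, not the hardware implementation: all class, method and field names are hypothetical, packets occupy one fixed-size block each for simplicity, and the `link` method used to place one packet on a second queue is an assumed mechanism, since the text does not say how a packet comes to be present in more than one queue.

```python
from collections import deque


class PacketBuffer:
    """Queues hold packet-table indices; packet data is never copied."""

    def __init__(self, num_queues: int = 4096, num_blocks: int = 8):
        self.queues = [deque() for _ in range(num_queues)]
        self.memory = [None] * num_blocks        # stand-in for the DRAM array
        self.free_pool = deque(range(num_blocks))
        self.table = {}                          # id -> [block, ref_count]
        self._next_id = 0

    def write(self, queue: int, data) -> int:
        """Store data in an unused block and append it to the queue tail."""
        if not self.free_pool:
            raise MemoryError("no free packet memory")
        block = self.free_pool.popleft()
        self.memory[block] = data
        pid, self._next_id = self._next_id, self._next_id + 1
        self.table[pid] = [block, 1]
        self.queues[queue].append(pid)
        return pid

    def read(self, queue: int, dequeue: bool = True):
        """Return the head packet; free its memory only if dequeued."""
        if not self.queues[queue]:
            return None
        pid = self.queues[queue][0]
        data = self.memory[self.table[pid][0]]
        if dequeue:
            self.queues[queue].popleft()
            self._release(pid)
        return data

    def transfer(self, src: int, dst: int) -> None:
        """Move the head pointer of src to the tail of dst."""
        self.queues[dst].append(self.queues[src].popleft())

    def link(self, src: int, dst: int) -> None:
        """Assumed broadcast helper: share the head packet with dst."""
        pid = self.queues[src][0]
        self.table[pid][1] += 1
        self.queues[dst].append(pid)

    def drop(self, queue: int) -> None:
        """Discard the packet at the head of the queue."""
        self._release(self.queues[queue].popleft())

    def flush(self, queue: int) -> None:
        """Free the entire contents of a queue."""
        while self.queues[queue]:
            self.drop(queue)

    def _release(self, pid: int) -> None:
        entry = self.table[pid]
        entry[1] -= 1
        if entry[1] == 0:                        # last instance de-queued
            self.memory[entry[0]] = None
            self.free_pool.append(entry[0])
            del self.table[pid]


pb = PacketBuffer(num_queues=8, num_blocks=2)
pb.write(0, b"abc")
print(pb.read(0, dequeue=False))   # packet stays at the head of queue 0
pb.transfer(0, 5)                  # only the pointer moves
pb.link(5, 6)                      # packet is now on two queues
print(pb.read(5), len(pb.free_pool))
print(pb.read(6), len(pb.free_pool))
```

- The block returns to the free pool only after the second read, which mirrors the rule that memory is re-used only after the last instance of a packet is de-queued.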
- PBRAM 62 permits a MAC controller 60 to inquire about the length of a packet without reading the entire packet itself. This is done by storing the length of a packet along with its data in the memory array.
- PBRAM 62 may be configured to pre-pend the packet length to any read data it returns.
- Some network switches operate by examining incoming packets and assigning each packet a “tag” indicating how the packet is to be processed.
- PBRAM 62 allows such a packet tag (up to four bytes long) to be assigned to each packet and stored at a predetermined memory location that is associated with that packet. Again, the packet tag can be read back without reading back any of the packet data itself.
- PBRAM 62 can be configured to pre-pend the packet tag to any read data it returns.
- PBRAM 62 improves packet switching
- As shown in FIG. 9, three eight-port Ethernet MAC controllers 60 are connected to PBRAM 62.
- Each MAC controller 60 has eight data ports 104 that connect to the eight I/O ports 70 of the PBRAM 62 .
- each MAC controller 60 connects to a command channel 106 that is coupled to the command port 76 of PBRAM 62 .
- Each I/O port 70 has its own logical input queue. Queue addresses 0-23 are used for the twenty-four input queues, and the remaining eight ports are reserved for the classifier. A separate input queue is required for each Ethernet connection so that the origin of the packets can be identified. This information is often used to make filtering decisions for security reasons.
- the PBRAM device 62 actually includes thirty-two input queues that can be associated with I/O ports 70 . However, eight of those ports are typically dedicated for use by the classifier 102 , as will be described.
- The length of an Ethernet packet is not known in advance; rather, the end of the packet is detected when the physical Ethernet transceiver detects an absence of the incoming signal. For this reason, the MAC controller 60 must generate an end-of-packet signal conveyed via command port 76 to denote the end of the packet (i.e. signals DQ 70a and DQM 70b are asserted to logical “one” values as previously described) (Step 206).
- the switch ASIC 64 determines where it is intended to be transferred to (Step 208 ). This is done using the classifier 102 .
- the classifier 102 connects to the PBRAM 62 using a data channel 108 and the fourth command channel 110 .
- the classifier 102 issues a read command to read the first few bytes of the packet, i.e. the packet header, in order to determine where the packet should be sent (Step 210 ).
- a “transfer” command is issued to PBRAM 62 to move the packet to an output queue that is associated with the intended destination (step 212 ).
- Logical queue addresses 24-47 map to output queues for each I/O port 70 and therefore the classifier 102 generates one of these queue addresses (step 214). Furthermore, the switch ASIC 64 defines four service priority levels that map to four sub-queues of each output queue (step 216). By inspecting the source and destination address fields of the packet, the classifier is able to move the packet to the correct output queue and priority (step 218). Accordingly, when a packet arrives at an output queue, the corresponding MAC controller 60 is able to issue a read command to read it and send it out onto the network (step 220).
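- The queue numbering used in this example can be sketched in a few lines of Python. This is only an illustration: it assumes that ports are numbered 0-23 and that a priority sub-queue is identified by a (queue address, priority) pair. The function names and the pair representation are hypothetical, since the text does not say how sub-queues are encoded.

```python
NUM_PORTS = 24       # Ethernet ports in the FIG. 9 example
NUM_PRIORITIES = 4   # service priority levels defined by the switch ASIC


def input_queue(port: int) -> int:
    """Logical input queue address (0-23) for a network port."""
    if not 0 <= port < NUM_PORTS:
        raise ValueError("port out of range")
    return port


def output_queue(port: int, priority: int) -> tuple[int, int]:
    """Logical output queue address (24-47) plus its priority sub-queue.

    Returning a pair is an assumption made for this sketch only.
    """
    if not 0 <= port < NUM_PORTS:
        raise ValueError("port out of range")
    if not 0 <= priority < NUM_PRIORITIES:
        raise ValueError("priority out of range")
    return NUM_PORTS + port, priority
```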
- the PBRAM 62 includes 64 megabits of memory storage capacity. That memory capacity represents the current state of the art on merged DRAM logic processing. However, the resulting memory size of 8 MB is too small for many purposes. For this reason, PBRAM 62 has been designed such that it is extensible. In other words, multiple PBRAM devices can be connected together to form a larger PBRAM.
- each network port is connected to each PBRAM 62 .
- Such a merged and interconnected architecture is shown in FIG. 10.
- each of the ports 78 , 80 , 82 and 84 is connected to both PBRAMs 86 and 88 .
- the I/O ports can be utilized in conjunction such that the combination of PBRAM 86 and PBRAM 88 appear to be a single, larger version of the same device.
- Packets are distributed between PBRAMs 86 and 88 by writing those packets into one PBRAM 86 or 88 until it is full. Once it is full, the other PBRAM 86 or 88 begins to store the packet beginning with the data element that was not stored in the other PBRAM.
- It is possible for a single packet to be distributed across both PBRAMs 86 and 88.
- the PBRAMs 86 and 88 must communicate with one another to determine which one of them is nearly full (and therefore to start filling the other), and to co-ordinate the subsequent read-out of the distributed packet data.
- An alternate technique used in an embodiment of the invention, is to distribute all packets evenly across all PBRAMs 86 and 88 in the system, as diagrammatically shown in FIG. 11 .
- Two PBRAMs 94 and 96 are shown with two packets 90 and 92 .
- the packets are stored in the same bank, row and segment of each PBRAM 94 and 96 .
- the first half of packet 90 is stored in PBRAM 96
- the second half of packet 90 is stored in PBRAM 94 .
- the first half of packet 92 is stored in PBRAM 94 and the second half of packet 92 is stored in PBRAM 96 .
- Using that storage scheme, a portion of every packet is stored in each PBRAM.
- Each PBRAM 94 and 96 is connected to the command 76 c and data ports 104 in parallel. Accordingly, since all PBRAMs 94 and 96 in that configuration are subject to the exact same network traffic, and all PBRAMs 94 and 96 implement the exact same queuing and allocation algorithm, the PBRAMs 94 and 96 can operate in lock-step without any need for communication between them. With such a scheme, each PBRAM 94 and 96 is configured with a chip address using external pins. Once configured, each PBRAM 94 and 96 knows which portion of each packet it is responsible for.
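- The lock-step idea can be illustrated with a short Python sketch in which every chip derives its own share of a packet from nothing more than its chip address and the chip count. The 32-byte chunk size, the round-robin dealing and the `first_chip` rotation are assumptions made for illustration; the text states only that each chip knows which portion of each packet it is responsible for.

```python
def my_chunks(packet: bytes, chip_addr: int, chip_count: int,
              chunk_size: int = 32, first_chip: int = 0) -> list[bytes]:
    """Return the chunks of `packet` that the chip at `chip_addr` stores.

    Chunks are dealt round-robin, starting at `first_chip` (hypothetical).
    """
    chunks = [packet[i:i + chunk_size]
              for i in range(0, len(packet), chunk_size)]
    return [c for n, c in enumerate(chunks)
            if (first_chip + n) % chip_count == chip_addr]


def reassemble(per_chip: list[list[bytes]], first_chip: int = 0) -> bytes:
    """Interleave the per-chip chunk lists back into the original packet."""
    chip_count = len(per_chip)
    cursors = [0] * chip_count
    out = bytearray()
    chip = first_chip
    # Visit the chips in the same order used when dealing the chunks.
    while cursors[chip] < len(per_chip[chip]):
        out += per_chip[chip][cursors[chip]]
        cursors[chip] += 1
        chip = (chip + 1) % chip_count
    return bytes(out)
```

- Because each chip evaluates the same rule on the same inputs, no chip needs to ask another which chunks it holds, which mirrors the absence of inter-chip communication described above.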
- read data returned from the PBRAMs 62 has to appear seamless, even though the actual PBRAM 62 that is sourcing the data may change throughout the packet transfer.
- a requirement is complicated by the fact that on a circuit board, the trace length between a PBRAM and the device it is sending data to may vary.
- the variance in trace length causes a variance in data timing. In extreme situations, those variances may lead to synchronization failures at the receiver device and will typically cause momentary bus contention when one PBRAM 62 starts to drive the bus just before the previous PBRAM 62 stops.
- the PBRAM 62 includes a complex timing scheme that prevents such problems from occurring.
- each group of eight PBRAM ports is associated with a DCLK signal 71 a .
- the network controller sending data to PBRAM 62 drives DCLK 71 a and ensures that the data being written is synchronous thereto. Accordingly, each PBRAM 62 is synchronized to DCLK 71 a and latches the data at the rate indicated thereby. This mode of operation is robust since there is only one transmitter, i.e. the network controller.
- Each PBRAM 62 has two output pins QS 71 b and QSCAL 71 c associated with each group of eight I/O ports.
- the QS signal 71 b generates a clock signal to which the data output signal must be referenced.
- Each PBRAM 62 is equipped with a programmable delay-lock loop (DLL) that is used to insert a programmable phase difference between the DCLK 71 a and the QS 71 b signals.
- DLL programmable delay-lock loop
- Calibration is performed by instructing one PBRAM to output its timing reference on its QS signal 71 b , and instructing another PBRAM to output its timing reference on the QSCAL signal 71 c . Any other PBRAMs in the system are kept silent.
- the network controller may then evaluate the phase difference between QS 71 b and QSCAL 71 c . If a phase difference is detected, then the DLL on one of the PBRAMs is tuned to eliminate the phase difference.
- the network controller must tune each PBRAM 62 , to which it is connected, in turn.
- Commands are sent to the PBRAM 62 over one of the four command ports 76 .
- Although each command port is typically associated with a group of eight ports, there is no requirement that this be the case.
- All command data bytes are sampled at the rising edge of the command clock CCLK 76 a , regardless of whether the operating modes SDR or DDR are selected for any given port.
- the CMDF signal 76 b is used as a flag indication in that it is de-asserted to a logic low level at the start of a command, and is asserted to a logic high level on occurrence of the last byte of a command.
- the PBRAM command controller may issue commands back-to-back.
- the commands themselves consist of a variable-length stream of bytes wherein the shortest command is two bytes long.
- Commands are delivered to PBRAM through the command bus.
- the multiple buses permit a PBRAM system to be controlled from multiple switch controllers without having the controllers perform any arbitration procedure for the command bus.
- Commands are variable length; the shortest command is two bytes long.
- the CMDF signal is used to frame commands. It is high when the command bus is idle, and on the last byte of a command. CMDF is low otherwise.
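- The framing rule can be sketched as a small Python decoder. It assumes one (byte, CMDF) sample per rising edge of CCLK and relies on the stated minimum command length of two bytes, so that a high CMDF with no command open can only mean an idle bus. The function name and the sample representation are hypothetical.

```python
def frame_commands(samples):
    """Group (byte, cmdf) samples into complete commands.

    CMDF is 0 on every command byte except the last and 1 on the last byte.
    A sample with CMDF = 1 while no command is open is treated as idle.
    """
    commands, current = [], []
    for byte, cmdf in samples:
        if cmdf == 0:
            current.append(byte)            # first or middle byte
        elif current:
            current.append(byte)            # last byte closes the command
            commands.append(bytes(current))
            current = []
        # otherwise the bus is idle and the sample is ignored
    return commands
```

- Back-to-back commands need no idle cycle between them, because the high CMDF on the last byte of one command is followed directly by the low CMDF on the first byte of the next.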
- the PBRAM provides no acknowledgment of successful command completion.
- the controller is responsible for ensuring that all command preconditions are met. Illegal commands result in undefined operation.
- commands may take a variable amount of time to execute. Due to the internal queue management function, the time between a read command issue and the start of data is not deterministic.
- a “read” command can be issued to a PBRAM 62 in order to read data stored therein.
- the command specifies the port to send the data to, the queue identifier to read the packet from, and can optionally request a selected data format.
- the read command can include parameters that request that the returned data include the packet tag value and packet length or simply the packet data (i.e. if packet data is not requested then the read command returns only the packet tag value and packet length).
- the read command can further include a parameter that requests that the packet is removed from the head of the queue it was stored on, after the data is returned.
- the read command can further include a parameter that aborts a previous read operation that is still in progress.
- the read command requests that packet data for the packet at the head of the selected queue be returned through one of the I/O ports.
- the selected port must not be in use for a write operation.
- the latency between an issued read command and the start of packet data will be bounded, but is currently unspecified. If the “abort” flag is set and a previous read operation is still in progress, then the previous read operation will be aborted. In this case, PBRAM will generate an EOP indication to separate the previous packet data from the current packet data. If the abort flag is not set, then the read command will execute immediately after the current read command completes. At most one read command may be buffered ahead in this manner. If the aborted read command had its “free” flag set, then the packet will be lost.
- If the controller wishes to preserve a packet despite the possibility of its transfer being aborted, then it should not use the “free” flag. Rather, the “drop data” command should be used to dequeue the packet after it has been properly received. If the “free” flag is set, then the packet will be dequeued from the queue after successful delivery. If the free flag is not set, then the packet will remain queued to the head of the requested queue. If the “peek” flag is set, then only the packet length and tag data will be returned. If the peek flag is not set, then the entire packet data will be returned, prefixed with the length and tag information. If the selected queue is empty, then PBRAM will generate an immediate EOP indication on the read channel.
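- The effect of the “free” and “peek” flags can be summarized with a small behavioral model in Python. It is a sketch only: the `Packet` class, the `EOP` marker, the ordering of length before tag, and the choice to dequeue when both flags are set are assumptions, and the “abort” flag is not modeled.

```python
from collections import deque
from dataclasses import dataclass

EOP = "EOP"  # stand-in for the end-of-packet indication


@dataclass
class Packet:
    tag: bytes
    data: bytes


def read(queue: deque, free: bool = False, peek: bool = False):
    """Return what a read of the packet at the head of `queue` delivers."""
    if not queue:
        return EOP                        # empty queue: immediate EOP
    pkt = queue[0]
    header = (len(pkt.data), pkt.tag)     # length and tag prefix
    result = header if peek else header + (pkt.data,)
    if free:
        queue.popleft()                   # dequeue after delivery
    return result
```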
- a “suspend output” command can be issued to a PBRAM 62 in order to temporarily suspend packet output.
- the suspend-output command is used to transmit data over networks that employ bit-stuffing (as described above) or flow-control. When operating in conjunction with such networks, it is necessary to suspend the output from PBRAM 62 temporarily such that proper synchronization may be maintained.
- the command specifies the port that is to be suspended as well as the number of bits to be ignored before packet transmission is resumed.
- PBRAM 62 will output the “no data” indication on the DQ 70 a and DQM 70 b signals while packet output is suspended.
- This command is useful for applications where network output may occur at a variable bit rate.
- the HDLC protocol used for synchronous serial transmission makes use of “bit-stuffing” to avoid certain bit patterns in the signal. Each bit-stuffing operation delays the output of the data by one bit. If sufficient delays are incurred, then data output from PBRAM may overrun the controller. The “suspend output” command is used in these cases to flow-control the read data so that this overrun does not occur.
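- As an illustration of why the output slips, the following Python sketch counts the bits that HDLC bit-stuffing would insert into a stream: a zero is inserted after every run of five consecutive ones. The function name is hypothetical, and using the count to size a “suspend output” request is an assumption about how a controller might apply it.

```python
def count_stuffed_bits(bits):
    """Count the zero bits HDLC inserts after each run of five ones."""
    stuffed = 0
    run = 0
    for bit in bits:
        if bit == 1:
            run += 1
            if run == 5:
                stuffed += 1   # a 0 is inserted here, which ends the run
                run = 0
        else:
            run = 0
    return stuffed
```

- Each counted bit corresponds to one bit time by which the network output falls behind the data arriving from PBRAM.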
- Writing a packet into PBRAM 62 is initiated by either issuing a write command to that PBRAM 62 , or by starting to write data into one of the I/O ports.
- Writing data into an I/O port, before issuing a write command, is useful for quickly responding to network traffic.
- the network controller is permitted to transmit up to 256 bits of data to PBRAM 62 before an associated write command is issued.
- a packet write command specifies the packet length, tag (optional) and queue to append the data to. Each of these may be specified as separate commands since correct values may not be known at the time a packet arrives. Accordingly, the “assign queue” command (see FIG. 15 ) specifies the I/O port over which data is arriving, and the queue(s) to append the data to.
- the “assign queue” command assigns the packet currently being written to one of the 4096 queues. This command is most efficient if it is issued within 256 bit times of the start of the packet.
- the “assign tag” command specifies the I/O port over which the data is arriving, and the tag data to assign to the packet.
- the “assign tag” command assigns a tag value to the incoming packet. This command is variable-length. Anywhere from one to four tag bytes may follow the command word. The number of bytes that follow must match the length of the tag field configured at system start-up. The CMDF pin must be low for each of the command bytes except for the last.
- the “assign length” command specifies the length of the packet. If this command is issued the PBRAM 62 will perform the write operation immediately upon receipt of the last data bit of the packet. Alternatively, an end-of-packet indication can be applied to the input pins DQ 70 a and DQM 70 b to denote the end of the packet.
- the assign length command may be useful for protocols such as ATM where cells can appear in a so called back-to-back manner between which there is no space to place an end-of-packet signal.
- the “assign length” command sets the packet length. This command is useful when receiving gapless input data.
- the current write command will complete automatically upon receipt of the specified amount of data. This command must be issued sufficiently far in advance of the actual end of the packet. The minimum time interval between the issue of this command and the end of packet is currently unspecified. If the commit flag is set, then any subsequent write commands will apply to the following packet. Otherwise, write commands will continue to apply to the current packet, so that the tag and queue may be set.
- This command is variable-length. Anywhere from one to three length bytes may follow the command word. The number of bytes that follow must match the length of the packet length field configured at system start-up. The CMDF pin must be low for each of the command bytes except for the last.
- a write operation is started by writing data to the I/O port. Data transfer may proceed even before a write command is issued. It is also permissible to start writing a new packet before completing all write commands for the previous packet. In this case, the previous packet must be committed before 256 bits of the current packet have been received.
- the “assign queue” and “assign tag” commands have a “commit” flag. If this flag is set, then the command completes the current write. If the flag is not set, then the command is not completed; further write commands may be used to communicate additional options. All writes must be committed eventually. Each write command may be issued at most once for any given packet. If a write command is issued more than once for a packet, the results are undefined.
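- A controller could guard against the undefined cases with a small bookkeeping object such as the Python sketch below. The class and method names are hypothetical, and raising an error is a controller-side choice made for this sketch; the device itself gives no acknowledgment and its behavior in these cases is undefined.

```python
class PendingWrite:
    """Tracks the attributes assigned to the packet currently being written."""

    def __init__(self):
        self.attrs = {}
        self.committed = False

    def assign(self, name, value, commit=False):
        """Record an "assign queue", "assign tag" or "assign length" command."""
        if self.committed:
            raise RuntimeError("write already committed")
        if name in self.attrs:
            # Issuing the same write command twice is undefined on the device.
            raise RuntimeError(f"{name} assigned twice")
        self.attrs[name] = value
        if commit:
            self.committed = True

    def commit(self):
        """Counterpart of the separate "write commit" command."""
        if self.committed:
            raise RuntimeError("write already committed")
        self.committed = True
```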
- the “write commit” command indicates that no more attributes (length, tag, queue) are to be assigned, and that no more data will arrive.
- the packet may thereafter be written into the DRAM memory array 74 . It is used to indicate the end of packet data once all other write options have been given.
- the “write abort” command aborts a write operation that is currently in progress. After a write abort command is issued, PBRAM 62 will wait until the indicated length is reached, or an end-of-packet signal is received. Data received up to that point is discarded and PBRAM 62 will begin acquiring data for the next packet.
- the “transfer” command transfers data from one queue to one or more other queues.
- the command specifies the source queue and one or more destination queues.
- the command can indicate that the packet is to be de-queued from the source queue. More particularly, this command transfers one packet from the head of the source queue to the tail of the destination queue. If the source queue is empty, then this command has no effect. It is illegal to transfer a packet that is currently being read and whose “free” flag is set. The execution time of this command is independent of the length of the packet being transferred.
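- One way to see why a transfer can take the same time for any packet length is to model the queues as lists of pointers into a packet table, so that only a pointer moves. The Python sketch below assumes that model; the class name, the use of integer packet identifiers and the single-destination form are illustrative only.

```python
from collections import deque


class PointerQueues:
    """Queues hold packet identifiers; packet bytes stay in the table."""

    def __init__(self, queue_addrs):
        self.packet_table = {}
        self.queues = {addr: deque() for addr in queue_addrs}

    def enqueue(self, addr, packet_id, data):
        self.packet_table[packet_id] = data
        self.queues[addr].append(packet_id)

    def transfer(self, src, dst):
        """Move the head pointer of `src` to the tail of `dst`."""
        if not self.queues[src]:
            return False                  # empty source queue: no effect
        self.queues[dst].append(self.queues[src].popleft())
        return True
```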
- the “drop data” command (see FIG. 21 ) is used to remove data from a specified queue.
- the packet at the head of the lowest-priority sub-queue is de-queued and freed. This command is useful to free data in an emergency if the PBRAM system is nearly filled to capacity.
- the “flush queue” command de-queues all data that has previously been queued on a specified queue. That command is issued in response to serious unexpected events, such as the failure of a network interface. In that situation, any data queued to the interface should be discarded and the resulting free memory space used to buffer traffic that is arriving from other I/O ports.
- the “reset” command (see FIG. 23 ) resets the chip and causes all of the data queues to be emptied. This command causes all I/O operations to cease. Any write commands in progress are aborted. The QS and QSC outputs of each chip are disabled. If the “R” bit is zero, then data in the queue is not lost. If the “R” bit is one, then all data is cleared from the chip.
- the “No Operation” command causes the command bus to be placed in a particular state when no command is being issued.
- the “test” command puts the chip into a number of different test modes. The exact nature of the test modes is unspecified.
- the “set chip count” command informs each chip of the total number of PBRAM devices 62 in the system. Based on this information, each PBRAM 62 can determine the extent to which packets are interleaved across the PBRAM devices 62 . This command sets the number and size of the buffers on the PBRAMs. Each buffer is 32*“Buffer size” bytes in length. There will be a total of 2^(18+buffer count) buffers in the system. The sum of “Buffer size” and “buffer count” must equal the base-2 log of the number of PBRAM chips in the system.
- the “set tag length” command configures the number of bytes used to convey both the packet length and the packet tag.
- a packet length can be stored in two bytes, but if a packet exceeds 65535 bytes in length, then three bytes will be required.
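- The choice of length-field width follows from simple arithmetic, sketched here in Python. The function name is hypothetical, and the three-byte ceiling reflects the statement elsewhere in this description that one to three length bytes may follow the command word.

```python
def length_field_bytes(max_packet_len: int) -> int:
    """Smallest number of bytes that can hold `max_packet_len`."""
    if max_packet_len < 0:
        raise ValueError("length must be non-negative")
    needed = max(1, (max_packet_len.bit_length() + 7) // 8)
    if needed > 3:
        raise ValueError("length does not fit in three bytes")
    return needed
```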
- the length of the packet tag depends on the controller.
- the present embodiment of PBRAM 62 supports tags having from zero to four bytes of information.
- PBRAM will generate an EOP signal after every successful packet read. If the “E” bit is not set, then no EOP will be issued.
- the “timing reference” command (see FIG. 28 ) requests that a PBRAM 62 transmit its return clock on either the QS 71 b or QSCAL 71 c signal.
- the command specifies both the ID number of the chip that is requested to perform the operation and the port group number (0-3) for which to generate the related timing information.
- the selected chip will output the return clock on the QS pin corresponding to the selected port. Otherwise, the QS pin for the selected port will be tri-state. If the “QSC” bit is clear, then the QSCAL pin for the selected port will be tri-state. Otherwise, if the “ENC” bit is set, then the selected chip will output its return clock on the QSCAL pin corresponding to the selected port. If the “ENC” bit is clear, then the QSCAL pin will be held low.
- Each port set is calibrated by having one of the PBRAMs output its QS as a reference. The QS pins on all other chips should be tri-state. Next, another PBRAM is instructed to output its echo clock on QSCAL.
- the controller may then make phase measurements and adjust the verniers as required.
- exactly one QS pin should be running for each port.
- the QSCAL pin should be held low by setting “ENC” and “QSC” on one part. All other chips should hold their QS and QSC pins tri-state.
- the “vernier adjust” command (see FIG. 29 ) adjusts the phase offset generated by the DLL on each chip for each port group.
- the phase may be set in 1/32 clock period increments.
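- Given a measured phase offset, the nearest vernier step can be found as in the Python sketch below. It assumes the offset is expressed as a fraction of one clock period and that steps wrap around after a full period; how the step value is encoded in the command is not specified here, so the function is purely illustrative.

```python
VERNIER_STEPS = 32  # the phase is adjustable in 1/32 clock period increments


def vernier_setting(phase_fraction: float) -> int:
    """Nearest vernier step for an offset given as a fraction of a period."""
    return round(phase_fraction * VERNIER_STEPS) % VERNIER_STEPS
```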
- the “port configuration” command specifies how a port group is to operate. For example, this command sets the operation to be SDR or DDR mode and optionally aggregates two, four or all eight ports to form one or more high-speed ports, as previously described.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 13/369,593, filed on Feb. 9, 2012. U.S. patent application Ser. No. 13/369,593 is a continuation of U.S. patent application Ser. No. 12/718,300, filed on Mar. 5, 2010 and which is now U.S. Pat. No. 8,126,003. U.S. patent application Ser. No. 12/718,300 is a continuation of U.S. patent application Ser. No. 10/614,558, filed on Jul. 7, 2003 and which is now U.S. Pat. No. 7,675,925. U.S. patent application Ser. No. 10/614,558 is a continuation of U.S. patent application Ser. No. 09/283,778, filed on Mar. 31, 1999 and which is now U.S. Pat. No. 6,590,901. U.S. patent application Ser. No. 09/283,778 claims the benefit of U.S. Provisional Patent Application No. 60/080,362, filed on Apr. 1, 1998. The entire teachings of the above application(s) are incorporated herein by reference.
- As it is known in the art, computer networks permit the transfer of information from one computer to another. Some networks, referred to as local-area networks (LANs) include a bus that is shared by a number of computers. Local-area networks permit only one computer to send data over the bus at a given time and that computer can only utilize the bus for a certain period of time before it is required to relinquish it. Because of those constraints, each computer typically segments the information into packets having predefined maximum and minimum lengths. Each packet is sent during a separate bus transaction. If more than one computer needs to send information, then the computers alternately send their packets, so as to share the bus.
- On some computer networks, for example Ethernet networks, a collision resolution procedure exists that handles the case where two computers attempt to use the bus at nearly the same time. When a collision occurs, the computers involved in the collision must stop transmitting. Then, each computer re-transmits its information at separate times such that a collision is avoided.
- Computer networks are more useful where they are connected to one another such that information can be communicated between two computers on different physical networks. This can be done by employing intermediate computers referred to as “routers”. Each router has two or more network connections to different physical networks. The routers relay packets received from one interface to the other interface and vice versa. For example, consider the network configuration depicted in FIG. 1. Fivehosts routers networks hosts
- Local-area network (LAN) switching is necessary due to the increasing volume of traffic present on many corporate LANs. New applications such as the world-wide web (WWW) and voice-over-IP are responsible for that increased network load. A LAN switch resembles a router in that it relays packets received at one interface, to another interface on the same device. However, the switch must perform this relay operation at high speed and therefore typically does so in hardware rather than software as is the case with a router. Accordingly, it is usually necessary to employ some form of memory in a network switch to handle the case where a packet's intended output port is occupied sending or receiving other traffic.
- FIG. 2 shows a situation where buffering is required. Ports P1 and P2 each receive traffic for the output port P3. Assuming that the input and output ports operate at the same speed, some form of buffering is required such as queue 22. If port P3 is busy when packets arrive from ports P1 or P2, then the packets are buffered in queue 22. Once port P3 is free, the data packets will be released from queue 22 in the order that they were received.
- Two common switch memory architectures exist today that are referred to as the dedicated port memory and the shared global memory. Some switches may use either or both of those architectures to varying degrees. In the dedicated port memory architecture, each network port (either input or output) has memory associated with it. The network port may write packets only into its dedicated memory, and read packets only from its dedicated memory. Usually, a packet must be completely transferred from an input memory to an output memory. However, this transfer methodology is the primary disadvantage of the dedicated port architecture. The other disadvantage is that the amount of memory allocated to a port is finite. If a port's buffer becomes filled, any further information sent to that port will be lost even though memory may be unused elsewhere in the switch. On the other hand, the primary advantage of the dedicated port memory is that there is no need for a port to arbitrate for access to memory, which can be a significant time consuming operation.
- In the shared global memory architecture, the switch has access to a single global memory and all network ports must arbitrate for access to that memory. The primary advantages of this architecture are that no copying of packets in memory is required, and the memory is useable by all ports such that no port will be denied any memory until all the memory is in use. The disadvantages of the global memory architecture are twofold. First, a very high bandwidth bus is required to permit all input ports to write into and read out of the memory at speeds that approach the data rate of the network. For example, a twenty-four-port 100 Mbit/second Ethernet switch may perform twenty-four 100 Mbit/second reads and twenty-four 100 Mbit/second writes, for a total bus data rate of 4.8 Gbit/sec. It should be noted that such a data rate exceeds the capacity of a 64-bit, 66 MHz PCI bus. The second disadvantage of the global memory architecture is that time is lost in arbitrating for the memory among all of the ports.
- Generally, an embodiment of the present invention is a packet buffer RAM (PBRAM) that provides advantages of the aforementioned memory architectures while removing the disadvantages. PBRAM is a single global memory arranged in a queue architecture, so it has the properties that no packet data copying is required, and that all of the memory is available to all of the ports. PBRAM in the preferred embodiment is a 32-port memory. This means that 32 different devices may access the memory without the need to arbitrate for the data channels.
- More specifically, a method and apparatus is provided for storing data packets, transferred across a computer network, in a packet buffer random access memory or PBRAM device. The PBRAM device receives a number of data packets from network controllers that are coupled to the computer network via associated input ports. After the data packets are received, portions thereof are serially transferred to different segments of serial registers that are connected between the input ports and the memory array. Lastly, the data packets are conveyed to the memory array portion of the device in parallel manner while other portions of the packets are being conveyed to other segments of the serial registers.
- The PBRAM device further assigns input queue structures in the memory array. It also stores pointers to the packets in a packet table and stores pointers to associated locations of the packet table in the queue structures. Those queue structures are accessible by associated output ports of the PBRAM device such that said pointers are transferred from the input queue structures to associated output queue structures that deliver the data packets to the output ports.
- The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.
- FIG. 1 is a schematic drawing of a typical network configuration;
- FIG. 2 is a schematic diagram of a buffering operation performed between a number of network ports;
- FIG. 3 is a schematic diagram of an SRAM memory configuration;
- FIG. 4 is a schematic diagram of a DRAM memory configuration;
- FIG. 5 is a block diagram of a two-bank DRAM device;
- FIG. 6 is a block diagram of a network switch configuration that includes a PBRAM device, according to the present invention;
- FIG. 7 is a schematic diagram of the PBRAM device of FIG. 6;
- FIG. 8 is a schematic diagram of an internal DRAM memory array of the PBRAM device of FIG. 6;
- FIG. 9 is a block diagram of a twenty-four port Ethernet switch including the PBRAM device of FIG. 6;
- FIG. 10 is a block diagram of a configuration including a number of PBRAM devices such as shown in FIG. 6;
- FIG. 11 illustrates packets that have been distributed across the configuration of PBRAMs, such as shown in FIG. 10; and
- FIG. 12 is a flow diagram of the operation of the PBRAM device shown in FIG. 6.
- FIG. 13 depicts the structure of the Read Data Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 14 depicts the structure of the Suspend Output Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 15 depicts the structure of the Assign Queue Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 16 depicts the structure of the Assign Tag Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 17 depicts the structure of the Assign Length Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 18 depicts the structure of the Commit Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 19 depicts the structure of the Write Abort Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 20 depicts the structure of the Transfer Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 21 depicts the structure of the Drop Data Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 22 depicts the structure of the Flush Queue Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 23 depicts the structure of the Reset Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 24 depicts the structure of the No-Op Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 25 depicts the structure of the Test Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 26 depicts the structure of the Set Chip Count Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 27 depicts the structure of the Set Tag Length Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 28 depicts the structure of the Timing Reference Command that can be executed on the PBRAM device of FIG. 7;
- FIG. 29 depicts the structure of the Vernier Adjust Command that can be executed on the PBRAM device of FIG. 7;
- A description of example embodiments of the invention follows.
- The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.
- While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
- Generally, an embodiment of the present invention is a packet buffer random access memory (PBRAM) that provides the advantages of the aforementioned memory architectures while removing the disadvantages. PBRAM includes a single global memory, so it has the properties that no packet data copying is required, and that all of the memory is available to all of the ports. The PBRAM of the preferred embodiment includes a 32-port memory. This means that 32 different devices may access the memory without the need to arbitrate for the data channels. Each port may operate at up to 250 Mbit/sec, so the whole chip may run at 8 Gbit/sec. Further, it is much easier to increase the total bandwidth of PBRAM than it is to increase the bandwidth of a PCI bus or similar memory bus.
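- The bandwidth figures quoted in this description can be checked with simple arithmetic, shown here as a short Python calculation. The PCI figure is the nominal peak of a 64-bit bus clocked at 66 MHz and ignores protocol overhead; the variable names are illustrative.

```python
MBIT = 10**6

# Shared-bus load of a twenty-four-port 100 Mbit/second switch:
# every port both writes into and reads out of the global memory.
switch_ports = 24
port_rate = 100 * MBIT
shared_bus_load = switch_ports * port_rate * 2

# Nominal peak of a 64-bit, 66 MHz PCI bus.
pci_capacity = 64 * 66 * MBIT

# Aggregate rate of a 32-port PBRAM at up to 250 Mbit/sec per port.
pbram_ports = 32
pbram_port_rate = 250 * MBIT
pbram_total = pbram_ports * pbram_port_rate

print(shared_bus_load / 1e9, pci_capacity / 1e9, pbram_total / 1e9)
```

- The results, 4.8, about 4.2 and 8 Gbit/sec respectively, agree with the statements that the shared bus load exceeds the PCI bus capacity and that the whole chip may run at 8 Gbit/sec.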
- Two common switch memory architectures exist today that are referred to as dedicated port memory and shared global memory. In the dedicated port memory architecture, each network port (either input or output) has memory associated with it. The network port may read and write packets only into its dedicated memory. Using that architecture, a packet must be completely transferred from an input memory to an output memory. However, this transfer methodology is the primary disadvantage of the dedicated port architecture. The other disadvantage is that the amount of memory allocated to a port is finite. If a port's buffer becomes filled, any further information sent to that port will be lost even though memory may be unused elsewhere in the switch. On the other hand, the primary advantage of the dedicated port memory is that there is no need for a port to arbitrate for access to memory, which can be a significant time consuming operation.
- In the shared global memory architecture, the switch has access to a single global memory and all network ports must arbitrate for access to that memory. The primary advantages of this architecture are that no copying of packets in memory is required, and the memory is useable by all ports such that no port will be denied any memory until all the memory is in use. The disadvantages of the global memory architecture are twofold. First, a very high bandwidth bus is required to permit all input ports to write into and read out of the memory at speeds that approach the data rate of the network. Second, time is lost in arbitrating for the memory among all of the ports.
- The two primary types of volatile semiconductor memory commonly used to implement dedicated port and shared global memory architectures are static random access memory (SRAM) and dynamic random access memory (DRAM). Each of these memories consists of an array of wordlines and bitlines. In either configuration, a memory is accessed by turning on one of the associated wordlines. Responsively, all memory cells connected to that wordline either take a new state from the bitlines (write operation), or deliver their state to the bitlines (read operation). For read operations, circuits called sense amplifiers detect minute voltage changes on the bitlines caused by the memory cells and thereby retrieve the read data from the bitlines. The sensing speed of the device is dependent on the technology used and the load present on the bitlines. Since the bitlines and memory-cell connections are capacitive, increasing the number of memory cells connected to a bitline will slow down the sensing operation.
- Each SRAM memory cell is a bistable element that will retain its state as long as power is supplied to the device.
FIG. 3 is a block diagram depicting a portion of a typical fast SRAM memory 29. SRAM memory cell 34 is connected to wordline 32 a and bitlines 36 and 38. Clamp devices 30 prevent the bitline voltage from falling below a level defined by the supply voltage (Vdd) minus the threshold voltage (Vtn) of transistors 30 a and 30 b. When wordline 32 a is turned on during a read cycle, the memory cell 34 outputs complementary versions of its state on bitlines bitlines sense amp 40. When the read cycle is complete, wordline 32 a is turned off. A different wordline may then be turned on for the next read cycle depending on the data to be retrieved. The memory is designed such that each SRAM memory cell may rapidly pull the bitlines bitline - In contrast,
FIG. 4 depicts a block diagram of a DRAM memory 41. A single-transistor DRAM cell 42 stores a logic state as a small amount of charge on a capacitor 43. Accordingly, a read operation of a DRAM memory cell 42 proceeds much differently than a read operation of an SRAM memory cell. Since DRAM memory cell 42 is incapable of reversing the differential voltage on bitlines precharge circuit 52 before the read operation is commenced. To start the read cycle, wordline 50 a is turned on, at which point the charge stored in memory cell 42 is dumped onto bitline 44. Note that only one bitline is connected to each memory cell of the DRAM memory whereas both bitlines were connected to the SRAM memory cells. The small charge difference can then be sensed with the sense amp 48. After the read cycle completes, wordline 50 a is turned off and a precharge cycle is performed. A precharge cycle is always performed at the end of the read cycle so that the memory cells can respond to a new access with minimum latency. The read sensing operation in a typical DRAM takes 30-60 ns, with the precharge taking an additional 30 ns. Accordingly, the overall operation is much slower than that of the SRAM. - To make
DRAM memory 41 more attractive to users despite its slower operation, DRAM accesses are divided up into “row cycles” and “column cycles”. During each row cycle, a wordline, e.g. 50 a, is raised, and sensing occurs. At this point, column cycles may occur. Since the DRAM memory data appears at the output of the sense amplifiers, multiple column-cycle reads can actually occur as fast as they do in an SRAM memory 29. However, to change to a different row, a precharge cycle for the current row and a row cycle for the new row must be performed. Effective use of row and column cycles requires that adjacent memory accesses reference the same row as much as possible. - To further improve the performance of DRAM memory, multiple banks of DRAM memory cells are used. Each DRAM bank is an independent memory device; however, all banks share the same input and output ports. Consider the two-bank device shown in
FIG. 5. Bank A 54 and Bank B 56 each connect to I/O circuitry 58. Such an architecture permits row cycles to be started in banks A 54 and B 56 concurrently. Data may be read first from bank A 54, then from bank B 56. While data is being read from bank B, bank A is precharged and a new row cycle is started. Column cycles can then proceed from bank A 54 while bank B 56 is being precharged. In this manner, DRAM reads can proceed continuously, without an externally visible pause for a new row cycle. There is no limit to the number of banks that can be used, although the additional circuitry required for each bank uses additional silicon area. Since network traffic patterns are effectively random, it is difficult to use DRAM memory in a manner that optimizes the memory accesses for effective use of row and column cycles. - Referring now to
FIG. 6, an exemplary network switch is shown to include a Packet Buffer Random Access Memory or PBRAM 62. PBRAM 62 is a 32-port scalable memory device used in a packet switching environment. As shown, devices that interface to a network, referred to as media-access controllers or MACs 60, all connect to PBRAM 62. In addition, a switching ASIC 64 also connects to the PBRAM 62. The switching ASIC 64 contains a hardware implementation of the network packet switching/routing algorithms. Note that all MAC devices 60 have direct access to the PBRAM through their own dedicated ports. - A block diagram of
PBRAM 62 is shown in FIG. 7. Thirty-two I/O ports 70 each connect to an associated one of thirty-two serial registers 72. A 2048-bit wide data bus 77 connects the serial registers to DRAM array 74. While in the preferred embodiment the I/O ports 70 are half-duplex ports, full-duplex ports such as required for some network protocols can be implemented through the use of one port for each data transfer direction. Each data port consists of two bi-directional pins DQ 70 a and DQM 70 b. The thirty-two ports 70 are grouped into four groups of eight ports each. Each group runs off a common clock referred to as signal DCLK 71 a. There are four DCLK pins 71 a on the PBRAM device 62, one for each group of ports. In addition, each group of ports has two return clock outputs referred to as signals QS 71 b and QSCAL 71 c. Their functions will be described below. - In addition to the thirty-two data ports, four
control ports 76 are provided to submit commands to the PBRAM 62. Each control port consists of a command clock CCLK 76 a, a command flag CMDF 76 b and an eight-bit command port CCMD<7:0> 76 c. The devices connected to PBRAM 62 multiplex commands onto the command ports 76. For example, it is common to have a single semiconductor chip with eight Ethernet MAC devices 60 on it. Such a semiconductor chip would connect to eight PBRAM I/O ports 70, and one control port 76. All I/O operations initiated from that MAC chip 70 would issue over the single control port 76. - It will be recognized by one of ordinary skill in the art that an alternate embodiment of the present invention could include full-duplex I/O ports so that protocols such as the gigabit Ethernet protocol may be supported without requiring a port for each direction. In addition, a full-duplex PBRAM solution could be implemented by merging the command and data ports such that commands and data are intermixed on the input ports, thereby eliminating the need for arbitration of commands on the
control ports 76. - Two data signaling techniques, referred to as single-data rate (SDR) signaling and double-data-rate (DDR) signaling, are supported by
PBRAM 62. When SDR signaling is utilized, a new data item is available on each rising edge of signal DCLK 71 a. When DDR signaling is used, a new data item is available at both the rising and falling edges of signal DCLK 71 a. Accordingly, DDR signaling doubles the maximum rate at which data may be transferred at the expense of complicated timing circuitry such as a delay-locked loop (DLL). However, as will be shown later, PBRAM 62 requires a DLL for other reasons, so this does not pose an implementation problem. - With typical embedded DRAM process technologies, a maximum clock speed of 125 MHz can be achieved. Such clock speeds permit I/
O port 70 to operate at 125 megabits per second (Mbps) in SDR mode and 250 Mbps in DDR mode. These port speeds are sufficient for many network protocols, e.g. 10/100 Mbps Ethernet, and 155 Mbps FDDI. However, such speeds are not sufficient for the gigabit Ethernet protocol. To accommodate the gigabit Ethernet protocol, PBRAM permits two, four or eight I/O ports 70 to be aggregated, i.e. the ports operate in parallel. For example, a gigabit Ethernet port can be formed by aggregating four I/O ports 70 that are operating in DDR mode. - As previously mentioned, each I/
O port 70 includes two signals referred to as DQ 70 a and DQM 70 b. Signal DQ 70 a is a data signal that conveys packet data as a serial stream of logical zeroes and logical ones. Signal DQM 70 b is a mask signal that is used to qualify that packet data as follows: -
DQ  DQM  Qualified Meaning
0   0    Logic low
1   0    Logic high
0   1    No data
1   1    End-of-packet
- The “no data” qualification is used when the
MAC devices 60 do not run at the same clock speed as I/O port 70 and hence there are some clock cycles that convey no information and should be ignored. That qualification is also necessary where the network protocol performs a “bit-stuffing” operation. For example, in the HDLC protocol used for X.25 and Frame Relay communication, a sequence of six consecutive logical one values in the user data is prohibited from occurring. When such a bit pattern occurs in data to be transferred, the HDLC transmitter inserts a logical zero bit after the fifth logical one bit to break up the prohibited sequence. The HDLC receiver will remove such bits so that the data returned to the user is the same as the data that was sent. However, if the data stream entering PBRAM 62 is synchronous with the data stream entering the HDLC receiver, then a “hole” in the data will occur when the padded “0” bit is removed. To keep the two devices in synchronization, a “no data” indication is sent to PBRAM 62 at that time. Finally, the two-bit interface permits a fourth qualification referred to as “end-of-packet”. That qualification is used when working with protocols where the length of a packet is not known in advance. Accordingly, once a MAC device 60 detects the end of a packet, it can signal this condition to the PBRAM 62 by generating an end-of-packet signal. - Referring now to
FIG. 8, the internal DRAM array 74 architecture of PBRAM 62 is shown. The DRAM array 74, also referred to as the core 74, consists of 8192 rows and 8192 columns for a total of 64 megabits of memory capacity. The core 74 is broken up into 64 banks, each including 1024 rows and 1024 columns. Each bank has its own row and column circuitry such that the banks may operate independently. Each serial register 72 is 2048 bits wide. The serial registers 72 are divided into eight segments of 256 bits each. There are a total of thirty-two serial registers 72 or one for each of the PBRAM's 62 I/O ports 70. Each serial register 72 is connected to the DRAM array 74 and the adjacent registers by a 2048-bit wide data bus 77. Each data bus 77 is connected through a 4:1 multiplexer 76 to an 8192-bit wide DRAM data bus 79. The 4:1 multiplexer 76 is utilized because the SRAM cells that make up the serial registers 72 are four times as wide as the DRAM cells in DRAM array 74. - On packet data input, once a segment of the
serial register 72 is full, its contents may be transferred to the DRAM array 74 using a single column cycle. Typically, PBRAM 62 will input data until a segment of the serial register 72 is half-full, at which point the data will be copied into the DRAM 74, concurrent with more data being input into another segment of the serial register 72. In this manner, data transfer into the serial register can be seamless. The multi-bank architecture permits row cycles for up to eight packets, corresponding to the eight segments, to be run simultaneously. Since access to DRAM array 74 is not necessary until the contents of the serial registers 72 are ready for transfer, there is ample time to perform any required row cycles. On packet output, the reverse operations occur. In other words, a portion of a packet is transferred into one or more segments of serial register 72, from which the data may be read out from the data port. In the meantime, row cycles for additional packet data may be performed. - To keep the system flexible,
PBRAM 62 is addressed as if it were an array of queues. Each memory address supplied to PBRAM 62 represents a queue. A write operation appends a packet of data to the tail of such a queue, and a read operation obtains a packet of data from the head of such a queue. Further, a data transfer command causes packets to be copied from one queue to another. The transfer command is processed by modifying pointers to packet data within the PBRAM 62 itself. Therefore, no packet data is actually moved around in memory. Addressing by queues transfers all responsibility for optimal address allocation from the end user, i.e. MAC devices 60, to the PBRAM 62 itself. Each PBRAM 62 may therefore perform allocation that is optimal for its configuration. Consequently, some of the memory capacity of the PBRAM 62 is consumed by queue management operations. - An embodiment of
PBRAM 62 supports a total of 256 queues. Each queue is further broken down into sub-queues that are each associated with one of sixteen priority levels, for a total of 4096 queue/priority-level combinations. The sub-queues and priority levels permit quality-of-service (QoS). For example, if a queue is mapped to an output port, then the sub-queues may be used to hold regular and priority packets at different priority levels. Therefore, when data is read from the queues, it is retrieved from the highest priority sub-queue that contains data. - When a packet is written to PBRAM 62, it is stored in a physical location in
memory array 74 that is currently unused. An associated write command will identify a queue structure within that memory to which the packet should be associated. Accordingly, a pointer to the physical location in memory array 74 is maintained in a packet table. When the packet is associated with a queue structure, a pointer to the appropriate packet table entry is placed on that queue structure. Therefore, upon issuance of a read command, the pointer on the queue is transferred to an output queue such that the packet can be accessed and output via the serial register 72. More specifically, a PBRAM system has 4096 packet queues. All data in a PBRAM system is addressed through 12-bit queue descriptors. A packet switch does not need to perform its own queue management. - A packet switch can use the queues in any number of ways. For example, each of the 32 ports can have its own input and output queue. For prioritized service, each port can be assigned multiple queues. For example, 16 input and 16 output queues may be set up per port, using only 1024 of the 4096 available queues. PBRAM puts no restrictions on queue assignment; the controller may use the queues as it sees fit.
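- The relationship between 256 queues, sixteen priority levels and a 12-bit queue descriptor can be sketched as follows. The specification does not state how the twelve bits are laid out, so the layout used here (queue number in the upper eight bits, priority level in the lower four) is purely an assumption for illustration, as are the function names.

```python
from typing import Tuple

QUEUE_BITS = 8      # 256 queues
PRIORITY_BITS = 4   # 16 priority levels per queue


def pack_descriptor(queue: int, priority: int) -> int:
    """Combine queue and priority into a 12-bit value (hypothetical bit layout)."""
    if not 0 <= queue < (1 << QUEUE_BITS):
        raise ValueError("queue must be in 0..255")
    if not 0 <= priority < (1 << PRIORITY_BITS):
        raise ValueError("priority must be in 0..15")
    return (queue << PRIORITY_BITS) | priority


def unpack_descriptor(descriptor: int) -> Tuple[int, int]:
    """Split a 12-bit descriptor back into (queue, priority)."""
    return descriptor >> PRIORITY_BITS, descriptor & ((1 << PRIORITY_BITS) - 1)


# Queue budget from the example above: 16 input and 16 output queues on each of 32 ports.
queues_used = 32 * (16 + 16)
total_descriptors = 1 << (QUEUE_BITS + PRIORITY_BITS)
```

Under this assumed layout the largest descriptor is 4095, and the per-port allocation in the example consumes 1024 of the 4096 available values.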
- When data is written to PBRAM, the write command must specify a queue to write to. The packet will be appended to the tail of the requested queue. PBRAM will automatically direct packet data to an unused area on the chip.
- A read command must specify a queue to read from. PBRAM will return the packet at the head of the queue. The read command may optionally dequeue the packet. If a packet is not dequeued, then a subsequent read command for the same queue will return the same packet. If a packet is dequeued, then the memory occupied by the packet will be returned to a free pool for re-use.
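- The write and read semantics described above can be modeled in a few lines of Python. This is a behavioral sketch only, not the device's implementation: the class name, the one-block-per-packet simplification and the use of `None` for an empty queue are all assumptions made for the example.

```python
from collections import deque


class QueueModel:
    """Toy model: queues hold pointers to memory blocks, not packet data."""

    def __init__(self, num_blocks: int):
        self.free_pool = list(range(num_blocks))  # unused memory blocks
        self.memory = {}                          # block index -> packet bytes
        self.queues = {}                          # queue id -> deque of block indices

    def write(self, queue_id: int, packet: bytes) -> None:
        """Store the packet in an unused block and append it to the queue tail."""
        block = self.free_pool.pop()              # IndexError if memory is exhausted
        self.memory[block] = packet
        self.queues.setdefault(queue_id, deque()).append(block)

    def read(self, queue_id: int, dequeue: bool = False):
        """Return the packet at the queue head; optionally dequeue and free it."""
        queue = self.queues.get(queue_id)
        if not queue:
            return None                           # assumption: empty queue yields no data
        block = queue[0]
        data = self.memory[block]
        if dequeue:
            queue.popleft()
            del self.memory[block]
            self.free_pool.append(block)          # memory returns to the free pool
        return data
```

A read without dequeue can be repeated and returns the same packet each time; only a dequeuing read advances the queue and releases the block.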
- PBRAM supports a cut-through operation. If a write command is issued to an empty queue, then a read command may be issued from the same queue no earlier than 256 bit-times after the start of write data. In this case, PBRAM will return the data being written. Care must be taken not to underrun in a cut-through operation: if the write function is held up such that fewer than 256 bits separate the current read and write pointers, then the returned data is undefined.
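- A controller could guard a cut-through read with checks along the following lines. The 256-bit figure is taken from the text above; the function names, the bit-position bookkeeping and the treatment of a completed write (the gap is assumed to no longer matter once the whole packet has been written) are illustrative assumptions.

```python
CUT_THROUGH_GAP_BITS = 256  # minimum separation stated in the text above


def read_may_start(bit_times_since_write_start: int) -> bool:
    """A cut-through read may be issued no earlier than 256 bit-times into the write."""
    return bit_times_since_write_start >= CUT_THROUGH_GAP_BITS


def underrun(write_ptr_bits: int, read_ptr_bits: int, write_done: bool = False) -> bool:
    """True if the read pointer has come within 256 bits of a write still in progress."""
    if write_done:
        return False  # assumption: a fully written packet can be read to its end
    return (write_ptr_bits - read_ptr_bits) < CUT_THROUGH_GAP_BITS
```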
- A data transfer command allows a packet at the head of one queue to be dequeued and appended to the tail of another. This operation is the only way to move packets in a PBRAM system. It is also the only way to address specific packets. If multiple read operations must be performed on a single packet, then the read commands must not dequeue the packet. If the PBRAM controller does not want repeated processing of one packet to block processing of others, then it may move the packet to an empty queue where it can be processed without blocking traffic at the source queue.
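- Because queues hold pointers rather than packet data, a transfer amounts to moving one pointer from the head of one queue to the tail of another. The sketch below shows that idea with plain Python containers; the function name and the behavior on an empty source queue (the command simply has no effect) are assumptions, since the text does not specify that case.

```python
from collections import deque


def transfer(queues: dict, src: int, dst: int) -> bool:
    """Move the packet pointer at the head of queue src to the tail of queue dst."""
    source = queues.get(src)
    if not source:
        return False                      # assumption: nothing happens on an empty queue
    pointer = source.popleft()            # dequeue from the head of the source
    queues.setdefault(dst, deque()).append(pointer)  # append to the destination tail
    return True


# Example: park the head packet of queue 0 on an otherwise empty work queue 40
# so that repeated reads of it do not block the rest of queue 0.
queues = {0: deque([7, 8])}               # values are packet-table indices
moved = transfer(queues, 0, 40)
```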
- A queue drop command causes the packet at the head of the specified queue to be dropped. This operation is useful in case PBRAM experiences congestion.
- Finally, a queue flush command causes the entire contents of a queue to be freed. Only one queue flush operation may be in effect in the entire PBRAM system at any given time.
- All queues are emptied upon chip reset.
- When a packet is written to PBRAM 62, the
PBRAM 62 will allocate memory for it. When the packet is read back to the network, i.e. when it is de-queued, PBRAM 62 will return the memory occupied by the packet to a list of free memory locations, referred to as the free pool. It is possible for a packet to be present in more than one queue at the same time, for example, to broadcast a packet. Therefore, the memory the packet occupies is re-used only after the last instance of the packet is de-queued. - In addition to the packet data itself,
MAC controllers 60 often need to know the length of a packet before it is transmitted. For this reason, PBRAM 62 permits a MAC controller 60 to inquire about the length of a packet without reading the entire packet itself. This is done by storing the length of a packet along with its data in the memory array. Alternatively, PBRAM 62 may be configured to pre-pend the packet length to any read data it returns. - Some network switches operate by examining incoming packets and assigning each packet a “tag” indicating how the packet is to be processed.
PBRAM 62 allows such a packet tag (up to four bytes long) to be assigned to each packet and stored at a predetermined memory location that is associated with that packet. Again, the packet tag can be read back without reading back any of the packet data itself. Alternatively, PBRAM 62 can be configured to pre-pend the packet tag to any read data it returns. - To illustrate how PBRAM 62 improves packet switching, consider the 24-port Ethernet switch shown in
FIG. 9 and the flow diagram of FIG. 12. Three eight-port Ethernet MAC controllers 60 are connected to PBRAM 62. Each MAC controller 60 has eight data ports 104 that connect to the eight I/O ports 70 of the PBRAM 62. Also, each MAC controller 60 connects to a command channel 106 that is coupled to the command port 76 of PBRAM 62. - When a packet arrives at one of the MAC controllers 60 (step 200), that
MAC controller 60 will start writing data into the PBRAM 62 via data ports 104 (step 202). At the same time, the MAC controller 60 sends a “write” command to the PBRAM 62 via the command channel 106 and command port 76, indicating the logical queue that the packet is to be appended to (step 204). Each I/O port 70 has its own logical input queue, wherein queue addresses 0-23 are used for the twenty-four input queues, with the remaining eight ports reserved for the classifier. A separate input queue is required for each Ethernet connection so that the origin of the packets can be identified. This information is often used to make filtering decisions for security reasons. It should be noted that the PBRAM device 62 actually includes thirty-two input queues that can be associated with I/O ports 70. However, eight of those ports are typically dedicated for use by the classifier 102, as will be described. - The length of an Ethernet packet is not known in advance; rather, the end of the packet is detected when the physical Ethernet transceiver detects an absence of the incoming signal. For this reason, the
MAC controller 60 must generate an end-of-packet signal conveyed via command port 76 to denote the end of the packet (i.e. signals DQ 70 a and DQM 70 b are asserted to logical “one” values as previously described) (step 206). - In considering a data packet's trip through the system of
FIG. 6, after receiving the packet, the switch ASIC 64 determines where it is intended to be transferred to (step 208). This is done using the classifier 102. The classifier 102 connects to the PBRAM 62 using a data channel 108 and the fourth command channel 110. The classifier 102 issues a read command to read the first few bytes of the packet, i.e. the packet header, in order to determine where the packet should be sent (step 210). Once the classifier 102 has seen enough of the packet to determine where it should go, a “transfer” command is issued to PBRAM 62 to move the packet to an output queue that is associated with the intended destination (step 212). Logical queue addresses 24-47 map to output queues for each I/O port 70 and therefore the classifier 102 generates one of these queue addresses (step 214). Furthermore, the switch ASIC 64 defines four service priority levels that map to four sub-queues of each output queue (step 216). By inspecting the source and destination address fields of the packet, the classifier is able to move the packet to the correct output queue and priority (step 218). Accordingly, when a packet arrives at an output queue, the corresponding MAC controller 60 is able to issue a read command to read it and send it out onto the network (step 220). - As previously described, the
PBRAM 62 includes 64 megabits of memory storage capacity. That memory capacity represents the current state of the art on merged DRAM logic processing. However, the resulting memory size of 8 MB is too small for many purposes. For this reason, PBRAM 62 has been designed such that it is extensible. In other words, multiple PBRAM devices can be connected together to form a larger PBRAM. - To remain effective,
multiple PBRAMs 62 should be combined in parallel such that each network port is connected to each PBRAM 62. For illustration purposes, such a merged and interconnected architecture is shown in FIG. 10. Here, each of the ports PBRAMs port 78 to port 84 and therefore the I/O ports can be utilized in conjunction such that the combination of PBRAM 86 and PBRAM 88 appear to be a single, larger version of the same device. Packets are distributed between PBRAMs PBRAM other PBRAM - It is possible for a single packet to be distributed across both
PBRAMs PBRAMs - An alternate technique, used in an embodiment of the invention, is to distribute all packets evenly across all
PBRAMs FIG. 11. Two PBRAMs 94 and 96 are shown with two packets PBRAM packet 90 is stored in PBRAM 96, and the second half of packet 90 is stored in PBRAM 94. Similarly, the first half of packet 92 is stored in PBRAM 94 and the second half of packet 92 is stored in PBRAM 96. Using that storage scheme, a portion of every packet is stored in each PBRAM. Each PBRAM command 76 c and data ports 104 in parallel. Accordingly, since all PBRAMs 94 and 96 in that configuration are subject to the exact same network traffic, and all PBRAMs 94 and 96 implement the exact same queuing and allocation algorithm, the PBRAMs PBRAM PBRAM - To an external device, read data returned from the
PBRAMs 62 has to appear seamless, even though the actual PBRAM 62 that is sourcing the data may change throughout the packet transfer. Such a requirement is complicated by the fact that on a circuit board, the trace length between a PBRAM and the device it is sending data to may vary. The variance in trace length causes a variance in data timing. In extreme situations, those variances may lead to synchronization failures at the receiver device and will typically cause momentary bus contention when one PBRAM 62 starts to drive the bus just before the previous PBRAM 62 stops. However, the PBRAM 62 includes a complex timing scheme that prevents such problems from occurring. - For writes from network controllers into
PBRAM 62, each group of eight PBRAM ports is associated with a DCLK signal 71 a. The network controller sending data to PBRAM 62 drives DCLK 71 a and ensures that the data being written is synchronous thereto. Accordingly, each PBRAM 62 is synchronized to DCLK 71 a and latches the data at the rate indicated thereby. This mode of operation is robust since there is only one transmitter, i.e. the network controller. - On the other hand, read operations cause data to be generated by
PBRAM 62 and transmitted to the network controllers. These operations are much more complex since they involve multiple transmitters, as will be described. Each PBRAM 62 has two output pins QS 71 b and QSCAL 71 c associated with each group of eight I/O ports. The QS signal 71 b provides a clock signal to which the data output signal must be referenced. Each PBRAM 62 is equipped with a programmable delay-locked loop (DLL) that is used to insert a programmable phase difference between the DCLK 71 a and the QS 71 b signals. When the system is first powered up, each network controller calibrates the clocks of the PBRAM ports connected to it. Calibration is performed by instructing one PBRAM to output its timing reference on its QS signal 71 b, and instructing another PBRAM to output its timing reference on the QSCAL signal 71 c. Any other PBRAMs in the system are kept silent. The network controller may then evaluate the phase difference between QS 71 b and QSCAL 71 c. If a phase difference is detected, then the DLL on one of the PBRAMs is tuned to eliminate the phase difference. The network controller must tune each PBRAM 62 to which it is connected, in turn. - Commands are sent to the
PBRAM 62 over one of the four command ports 76. Although each command port is typically associated with a group of eight ports, there is no requirement that this be the case. All command data bytes are sampled at the rising edge of the command clock CCLK 76 a, regardless of whether the operating modes SDR or DDR are selected for any given port. The CMDF signal 76 b is used as a flag indication in that it is de-asserted to a logic low level at the start of a command, and is asserted to a logic high level on occurrence of the last byte of a command. The PBRAM command controller may issue commands back-to-back. The commands themselves consist of a variable-length stream of bytes wherein the shortest command is two bytes long. - Commands are delivered to PBRAM through the command bus. There are four independent command buses. The multiple buses permit a PBRAM system to be controlled from multiple switch controllers without having the controllers perform any arbitration procedure for the command bus. Commands are variable length; the shortest command is two bytes long. The CMDF signal is used to frame commands. It is high when the command bus is idle, and on the last byte of a command. CMDF is low otherwise. The PBRAM provides no acknowledgment of successful command completion. The controller is responsible for ensuring that all command preconditions are met. Illegal commands result in undefined operation.
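- The CMDF framing rule can be illustrated by a small parser that splits a stream of sampled bytes into commands. The sketch assumes one (CMDF, byte) pair is sampled per rising edge of CCLK; the function name and the sample representation are hypothetical, and the byte values in the example are arbitrary rather than real PBRAM opcodes.

```python
def frame_commands(samples):
    """Split (cmdf, byte) samples into commands using the CMDF framing rule.

    CMDF low  -> a command byte that is not the last byte.
    CMDF high -> the last byte of a command if one is in progress, otherwise idle.
    """
    commands = []
    current = []
    for cmdf, byte in samples:
        if cmdf == 0:
            current.append(byte)               # command continues
        elif current:
            current.append(byte)               # last byte closes the command
            commands.append(bytes(current))
            current = []
        # CMDF high with no command in progress: bus idle, byte ignored
    return commands


# An idle cycle, a two-byte command, a back-to-back three-byte command, then idle.
samples = [(1, 0x00), (0, 0x12), (1, 0x34), (0, 0xA0), (0, 0xA1), (1, 0xA2), (1, 0xFF)]
framed = frame_commands(samples)
```

Because the shortest command is two bytes, every command contains at least one CMDF-low byte, which is what lets the parser tell a final byte apart from an idle cycle.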
- It should be noted that commands may take a variable amount of time to execute. Due to the internal queue management function, the time between a read command issue and the start of data is not deterministic.
- VIII. Commands Associated with Reading Packets from PBRAM
- A “read” command can be issued to a
PBRAM 62 in order to read data stored therein. Referring to FIG. 13, the command specifies the port to send the data to, the queue identifier to read the packet from, and can optionally request a selected data format. For example, the read command can include parameters that request that the returned data include the packet tag value and packet length or simply the packet data (i.e. if packet data is not requested then the read command returns only the packet tag value and packet length). The read command can further include a parameter that requests that the packet be removed from the head of the queue it was stored on, after the data is returned. Lastly, the read command can further include a parameter that aborts a previous read operation that is still in progress. - The read command requests that packet data for the packet at the head of the selected queue be returned through one of the I/O ports. The selected port must not be in use for a write operation. The latency between an issued read command and the start of packet data will be bounded, but is currently unspecified. If the “abort” flag is set and a previous read operation is still in progress, then the previous read operation will be aborted. In this case, PBRAM will generate an EOP indication to separate the previous packet data from the current packet data. If the abort flag is not set, then the read command will execute immediately after the current read command completes. At most one read command may be buffered ahead in this manner. If the aborted read command had its “free” flag set, then the packet will be lost.
- If the controller wishes to preserve a packet despite the possibility of its transfer being aborted, then it should not use the “free” flag. Rather, the “drop data” command should be used to dequeue the packet after it has been properly received. If the “free” flag is set, then the packet will be dequeued from the queue after successful delivery. If the free flag is not set, then the packet will remain queued to the head of the requested queue. If the “peek” flag is set, then only the packet length and tag data will be returned. If the peek flag is not set, then the entire packet data will be returned, prefixed with the length and tag information. If the selected queue is empty, then PBRAM will generate an immediate EOP indication on the read channel.
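- The effect of the “peek” and “free” flags can be summarized with a small behavioral model. It is a sketch under stated assumptions, not the device's logic: the `Packet` class, the dictionary return value and the use of `None` to stand for the immediate EOP indication on an empty queue are all invented for the example, and abort handling is left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    tag: bytes
    data: bytes


def read_command(queue: deque, peek: bool = False, free: bool = False):
    """Model what a read returns for the packet at the head of the queue."""
    if not queue:
        return None                          # stands for the immediate EOP indication
    packet = queue[0]
    result = {"length": len(packet.data), "tag": packet.tag}
    if not peek:
        result["data"] = packet.data         # full read: data prefixed by length and tag
    if free:
        queue.popleft()                      # dequeue after (assumed successful) delivery
    return result
```

A controller that cannot risk losing a packet to an aborted transfer would call this with `free=False` and dequeue separately once the data has arrived intact.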
- Referring to
FIG. 14, a “suspend output” command can be issued to a PBRAM 62 in order to temporarily suspend packet output. The suspend-output command is used to transmit data over networks that employ bit-stuffing (as described above) or flow-control. When operating in conjunction with such networks, it is necessary to suspend the output from PBRAM 62 temporarily such that proper synchronization may be maintained. The command specifies the port that is to be suspended as well as the number of bits to be ignored before packet transmission is resumed. PBRAM 62 will output the “no data” indication on the DQ 70 a and DQM 70 b signals while packet output is suspended. - The “suspend output” command causes read data being output on a port to be suspended. If the “F” bit is a “1”, then output to the given port is suspended indefinitely. If the “delay” value is zero, then output to the port resumes normally. This option is used to resume output after a “suspend output” command with F=1. If the “delay” value is between 1 and 31 inclusive, then output on the port is suspended for “delay” clock cycles, after which it automatically resumes. PBRAM will drive the DQM pin high and the DQ pin low while output is suspended.
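- The three cases of the “suspend output” command (F=1, delay of zero, delay of 1 to 31) can be modeled as a small per-port state machine that emits one (DQ, DQM) pair per clock. The class and method names are hypothetical. Two behaviors are assumptions rather than statements from the text: a timed suspend is taken to replace an earlier indefinite one, and a port with nothing left to send is taken to report “no data”.

```python
NO_DATA = (0, 1)  # (DQ, DQM): DQ low and DQM high, as described above


class OutputPort:
    """Hypothetical model of one read port that honors 'suspend output'."""

    def __init__(self, bits):
        self.bits = list(bits)     # packet bits still to be sent
        self.hold = 0              # remaining cycles of a timed suspension
        self.indefinite = False    # True after a command with F = 1

    def suspend_output(self, f_bit: int, delay: int) -> None:
        if f_bit == 1:
            self.indefinite = True             # suspended until explicitly resumed
        elif delay == 0:
            self.indefinite = False            # resume normally
            self.hold = 0
        elif 1 <= delay <= 31:
            self.indefinite = False            # assumption: replaces an indefinite suspend
            self.hold = delay                  # resumes automatically afterwards
        else:
            raise ValueError("delay must be in 0..31")

    def clock(self):
        """Return the (DQ, DQM) pair driven on this clock cycle."""
        if self.indefinite:
            return NO_DATA
        if self.hold > 0:
            self.hold -= 1
            return NO_DATA
        if self.bits:
            return (self.bits.pop(0), 0)       # DQM low qualifies DQ as a data bit
        return NO_DATA                         # assumption: idle port reports no data
```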
- This command is useful for applications where network output may occur at a variable bit rate. For example, the HDLC protocol used for synchronous serial transmission makes use of “bit-stuffing” to avoid certain bit patterns in the signal. Each bit-stuffing operation delays the output of the data by one bit. If sufficient delays are incurred, then data output from PBRAM may overrun the controller. The “suspend output” command is used in these cases to flow-control the read data so that this overrun does not occur.
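- The relationship between bit-stuffing and the “delay” value can be sketched in software. The Python example below is a minimal illustration, not a description of the PBRAM hardware: `hdlc_stuff` applies the standard HDLC rule of inserting a 0 after five consecutive 1 bits, and the hypothetical helper `suspend_delays` splits an accumulated delay into values within the 1 to 31 range accepted by the command. It assumes, purely for illustration, that one serial bit corresponds to one PBRAM clock cycle.

```python
from typing import Iterable, List


def hdlc_stuff(bits: Iterable[int]) -> List[int]:
    """Insert a 0 after every run of five consecutive 1 bits (HDLC bit-stuffing)."""
    out: List[int] = []
    run = 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)  # stuffed bit; it also breaks the run of ones
            run = 0
    return out


def suspend_delays(stuffed_bits: int, max_delay: int = 31) -> List[int]:
    """Split a total delay into "delay" values of at most max_delay each.

    Assumption for this sketch: one stuffed bit costs one clock cycle.
    """
    delays: List[int] = []
    remaining = stuffed_bits
    while remaining > 0:
        step = min(remaining, max_delay)
        delays.append(step)
        remaining -= step
    return delays


payload = [1] * 10
stuffed = hdlc_stuff(payload)
extra_bits = len(stuffed) - len(payload)  # number of bits the line fell behind
```

- A controller following this approach would issue one “suspend output” command per entry returned by `suspend_delays(extra_bits)`.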
- IX. Commands Associated with Writing Packets to PBRAM
- Writing a packet into PBRAM 62 is initiated by either issuing a write command to that PBRAM 62, or by starting to write data into one of the I/O ports. Writing data into an I/O port, before issuing a write command, is useful for quickly responding to network traffic. The network controller is permitted to transmit up to 256 bits of data to PBRAM 62 before an associated write command is issued. A packet write command specifies the packet length, tag (optional) and queue to append the data to. Each of these may be specified as separate commands since correct values may not be known at the time a packet arrives. Accordingly, the “assign queue” command (see FIG. 15) specifies the I/O port over which data is arriving, and the queue(s) to append the data to. The “assign queue” command assigns the packet currently being written to one of the 4096 queues. This command is most efficient if it is issued within 256 bit times of the start of the packet.
- Referring now to FIG. 16, the “assign tag” command specifies the I/O port over which the data is arriving, and the tag data to assign to the packet. The “assign tag” command assigns a tag value to the incoming packet. This command is variable-length. Anywhere from one to four tag bytes may follow the command word. The number of bytes that follow must match the length of the tag field configured at system start-up. The CMDF pin must be low for each of the command bytes except for the last.
- The “assign length” command (see FIG. 17) specifies the length of the packet. If this command is issued, the PBRAM 62 will perform the write operation immediately upon receipt of the last data bit of the packet. Alternatively, an end-of-packet indication can be applied to the input pins DQ 70 a and DQM 70 b to denote the end of the packet. The assign length command may be useful for protocols such as ATM, where cells can appear in a so-called back-to-back manner, leaving no space between them in which to place an end-of-packet signal.
- The “assign length” command sets the packet length. This command is useful when receiving gapless input data. The current write command will complete automatically upon receipt of the specified amount of data. This command must be issued sufficiently far in advance of the actual end of the packet. The minimum time interval between the issue of this command and the end of packet is currently unspecified. If the commit flag is set, then any subsequent write commands will apply to the following packet. Otherwise, write commands will continue to apply to the current packet, so that the tag and queue may be set. This command is variable-length. Anywhere from one to three length bytes may follow the command word. The number of bytes that follow must match the length of the packet length field configured at system start-up. The CMDF pin must be low for each of the command bytes except for the last.
- A write operation is started by writing data to the I/O port. Data transfer may proceed even before a write command is issued. It is also permissible to start writing a new packet before completing all write commands for the previous packet. In this case, the previous packet must be committed before 256 bits of the current packet have been received. The “assign queue” and “assign tag” commands have a “commit” flag. If this flag is set, then the command completes the current write. If the flag is not set, then the command is not completed; further write commands may be used to communicate additional options. All writes must be committed eventually. Each write command may be issued at most once for any given packet. If a write command is issued more than once for a packet, the results are undefined.
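- The rules governing write attributes (each command issued at most once per packet, and a “commit” flag closing the current write) can be modeled on the controller side. The Python sketch below is a hypothetical bookkeeping class, not a PBRAM interface: the name `PendingWrite` and its methods are invented for this example. Where the source says that a repeated command gives undefined results, the model chooses to raise an error.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class PendingWrite:
    """Hypothetical controller-side record of the packet currently being written."""

    queue: Optional[int] = None
    tag: Optional[bytes] = None
    length: Optional[int] = None
    committed: bool = False
    issued: Set[str] = field(default_factory=set, repr=False)

    def _issue(self, name: str, commit: bool) -> None:
        if self.committed:
            raise RuntimeError("write already committed")
        if name in self.issued:
            # The source leaves a repeated command undefined; this model rejects it.
            raise RuntimeError(name + " issued twice for one packet")
        self.issued.add(name)
        self.committed = commit

    def assign_queue(self, queue: int, commit: bool = False) -> None:
        if not 0 <= queue < 4096:
            raise ValueError("queue must be in 0..4095")
        self._issue("assign queue", commit)
        self.queue = queue

    def assign_tag(self, tag: bytes, commit: bool = False) -> None:
        if not 1 <= len(tag) <= 4:
            raise ValueError("one to four tag bytes may follow the command word")
        self._issue("assign tag", commit)
        self.tag = tag

    def assign_length(self, length: int, commit: bool = False) -> None:
        if length < 0:
            raise ValueError("length must be non-negative")
        self._issue("assign length", commit)
        self.length = length

    def write_commit(self) -> None:
        """Model of a stand-alone commit with no further attributes."""
        self._issue("write commit", True)
```

- A typical sequence in this model is `assign_queue(7)`, then `assign_tag(...)`, then `assign_length(64, commit=True)`, after which further commands would apply to the following packet and therefore to a new `PendingWrite` instance.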
- Referring to FIG. 18, the “write commit” command indicates that no more attributes (length, tag, queue) are to be assigned, and that no more data will arrive. The packet may thereafter be written into the DRAM memory array 74. It is used to indicate the end of packet data once all other write options have been given.
- Referring now to FIG. 19, the “write abort” command aborts a write operation that is currently in progress. After a write abort command is issued, PBRAM 62 will wait until the indicated length is reached, or an end-of-packet signal is received. Data received up to that point is discarded and PBRAM 62 will begin acquiring data for the next packet.
- Referring now to FIG. 20, the “transfer” command transfers data from one queue to one or more other queues. The command specifies the source queue and one or more destination queues. Optionally, the command can indicate that the packet is to be de-queued from the source queue. More particularly, this command transfers one packet from the head of the source queue to the tail of the destination queue. If the source queue is empty, then this command has no effect. It is illegal to transfer a packet that is currently being read and whose “free” flag is set. This command has an execution time that is independent of the length of the packet being transferred.
- The “drop data” command (see FIG. 21) is used to remove data from a specified queue. The packet at the head of the lowest-priority sub-queue is de-queued and freed. This command is useful to free data in an emergency if the PBRAM system is nearly filled to capacity.
- The “flush queue” command (see FIG. 22) de-queues all data that has previously been queued on a specified queue. This command is issued in response to serious unexpected events, such as the failure of a network interface. In that situation, any data queued to the interface should be discarded and the resulting free memory space used to buffer traffic that is arriving from other I/O ports.
- The “reset” command (see FIG. 23) resets the chip and causes all of the data queues to be emptied. This command causes all I/O operations to cease. Any write commands in progress are aborted. The QS and QSC outputs of each chip are disabled. If the “R” bit is zero, then data in the queue is not lost. If the “R” bit is one, then all data is cleared from the chip.
- The “No Operation” command (see FIG. 24) causes the command bus to be placed in a particular state when no command is being issued.
- The “test” command (see FIG. 25) puts the chip into a number of different test modes. The exact nature of the test modes is unspecified.
- The “set chip count” command (see FIG. 26) informs each chip of the total number of PBRAM devices 62 in the system. Based on this information, each PBRAM 62 can determine the extent to which packets are interleaved across the PBRAM devices 62. This command sets the number and size of the buffers on the PBRAMs. Each buffer is 32*“Buffer size” bytes in length. There will be a total of 2^(18 + “buffer count”) buffers in the system. The sum of “Buffer size” and “buffer count” must equal the base-2 log of the number of PBRAM chips in the system.
- The “set tag length” command (see FIG. 27) configures the number of bytes used to convey both the packet length and the packet tag. Typically, a packet length can be stored in two bytes, but if a packet exceeds 65535 bytes in length, then three bytes will be required. The length of the packet tag depends on the controller. The present embodiment of PBRAM 62 supports tags having from zero to four bytes of information.
- If the “E” bit is set, then PBRAM will generate an EOP signal after every successful packet read. If the “E” bit is not set, then no EOP will be issued.
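- The choice between a two-byte and a three-byte length field described for the “set tag length” command follows directly from the largest packet length to be represented. The Python sketch below shows the arithmetic; the function names are hypothetical, and the big-endian byte order used by `encode_length` is an assumption made for illustration, since the byte order is not stated here.

```python
def length_field_bytes(max_packet_len: int) -> int:
    """Smallest number of bytes (1 to 3) able to hold max_packet_len."""
    if max_packet_len < 0:
        raise ValueError("length must be non-negative")
    needed = max(1, (max_packet_len.bit_length() + 7) // 8)
    if needed > 3:
        raise ValueError("at most three length bytes may follow the command word")
    return needed


def encode_length(length: int, field_bytes: int) -> bytes:
    """Encode a length into the configured field width.

    Big-endian order is an assumption of this sketch.
    """
    return length.to_bytes(field_bytes, "big")
```

- For example, a maximum length of 65535 fits in two bytes, whereas 65536 requires three.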
- The “timing reference” command (see FIG. 28) requests that a PBRAM 62 transmit its return clock on either the QS 71 b or QSCAL 71 c signal. The command specifies both the ID number of the chip that is requested to perform the operation and the port group number (0-3) for which to generate the related timing information.
- If the “QS” bit is set, then the selected chip will output the return clock on the QS pin corresponding to the selected port. Otherwise, the QS pin for the selected port will be tri-state. If the “QSC” bit is clear, then the QSCAL pin for the selected port will be tri-state. Otherwise, if the “ENC” bit is set, then the selected chip will output its return clock on the QSCAL pin corresponding to the selected port. If the “ENC” bit is clear, then the QSCAL pin will be held low. Each port set is calibrated by having one of the PBRAMs output its QS as a reference. The QS pins on all other chips should be tri-state. Next, another PBRAM is instructed to output its echo clock on QSCAL. The controller may then make phase measurements and adjust the verniers as required. In normal operation, exactly one QS pin should be running for each port. The QSCAL pin should be held low by setting “ENC” and “QSC” on one part. All other chips should hold their QS and QSC pins tri-state.
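- The bit-by-bit description of the “QS”, “QSC” and “ENC” bits given above amounts to a small truth table. The Python sketch below encodes only that bit-by-bit description; the names `PinState` and `timing_reference_pins` are hypothetical and chosen for this example.

```python
from enum import Enum
from typing import Tuple


class PinState(Enum):
    CLOCK = "return clock"
    LOW = "held low"
    TRISTATE = "tri-state"


def timing_reference_pins(qs: bool, qsc: bool, enc: bool) -> Tuple[PinState, PinState]:
    """Return the (QS pin, QSCAL pin) states for the selected chip and port."""
    # QS pin: driven with the return clock only when the QS bit is set.
    qs_pin = PinState.CLOCK if qs else PinState.TRISTATE

    # QSCAL pin: tri-state unless QSC is set; ENC then selects clock or low.
    if not qsc:
        qscal_pin = PinState.TRISTATE
    elif enc:
        qscal_pin = PinState.CLOCK
    else:
        qscal_pin = PinState.LOW
    return qs_pin, qscal_pin
```

- During calibration in this model, the reference chip would be configured with `qs=True` and a second chip with `qsc=True, enc=True`, so that the controller can compare the two clocks.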
- The “vernier adjust” command (see FIG. 29) adjusts the phase offset generated by the DLL on each chip for each port group. The phase may be set in 1/32 clock period increments.
- These last two commands differ from other commands in that they specify the ID of the PBRAM 62 that is to perform the related operation. All other commands are acted upon by all PBRAMs 62 that are configured in the system.
- Finally, the “port configuration” command specifies how a port group is to operate. For example, this command sets the operation to be SDR or DDR mode and optionally aggregates two, four or all eight ports to form one or more high-speed ports, as previously described.
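- The 1/32 clock period granularity of the “vernier adjust” command described above translates into an absolute phase offset once a clock frequency is chosen. The Python sketch below shows the conversion; the function name, the 0 to 31 range of the step value and the 100 MHz clock used in the example are assumptions made for illustration and are not taken from the specification.

```python
def vernier_phase_ns(step: int, clock_mhz: float) -> float:
    """Phase offset in nanoseconds for a vernier setting of step/32 of a period.

    The 0..31 range for step is an assumption of this sketch.
    """
    if not 0 <= step < 32:
        raise ValueError("step must be in 0..31")
    period_ns = 1000.0 / clock_mhz  # one clock period in nanoseconds
    return step * period_ns / 32


# Hypothetical example: a 100 MHz clock has a 10 ns period, so each
# vernier step is 10/32 ns and eight steps give a quarter-period offset.
quarter_period = vernier_phase_ns(8, 100.0)
```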
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/175,142 US20140153582A1 (en) | 1998-04-01 | 2014-02-07 | Method and apparatus for providing a packet buffer random access memory |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US8036298P | 1998-04-01 | 1998-04-01 | |
US09/283,778 US6590901B1 (en) | 1998-04-01 | 1999-03-31 | Method and apparatus for providing a packet buffer random access memory |
US10/614,558 US7675925B2 (en) | 1998-04-01 | 2003-07-07 | Method and apparatus for providing a packet buffer random access memory |
US12/718,300 US8126003B2 (en) | 1998-04-01 | 2010-03-05 | Method and apparatus for providing a packet buffer random access memory |
US13/369,593 US20120137070A1 (en) | 1998-04-01 | 2012-02-09 | Method and apparatus for providing a packet buffer random access memory |
US14/175,142 US20140153582A1 (en) | 1998-04-01 | 2014-02-07 | Method and apparatus for providing a packet buffer random access memory |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/369,593 Continuation US20120137070A1 (en) | 1998-04-01 | 2012-02-09 | Method and apparatus for providing a packet buffer random access memory |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140153582A1 (en) | 2014-06-05 |
Family
ID=26763416
Family Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/283,778 Expired - Lifetime US6590901B1 (en) | 1998-04-01 | 1999-03-31 | Method and apparatus for providing a packet buffer random access memory |
US10/614,558 Expired - Fee Related US7675925B2 (en) | 1998-04-01 | 2003-07-07 | Method and apparatus for providing a packet buffer random access memory |
US12/718,300 Expired - Fee Related US8126003B2 (en) | 1998-04-01 | 2010-03-05 | Method and apparatus for providing a packet buffer random access memory |
US13/369,593 Abandoned US20120137070A1 (en) | 1998-04-01 | 2012-02-09 | Method and apparatus for providing a packet buffer random access memory |
US14/175,142 Abandoned US20140153582A1 (en) | 1998-04-01 | 2014-02-07 | Method and apparatus for providing a packet buffer random access memory |
Country Status (1)
Country | Link |
---|---|
US (5) | US6590901B1 (en) |
Also Published As
Publication number | Publication date |
---|---|
US7675925B2 (en) | 2010-03-09 |
US20120137070A1 (en) | 2012-05-31 |
US8126003B2 (en) | 2012-02-28 |
US6590901B1 (en) | 2003-07-08 |
US20100223435A1 (en) | 2010-09-02 |
US20040008714A1 (en) | 2004-01-15 |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: CONVERSANT INTELLECTUAL PROPERTY MANAGEMENT INC., CANADA. Free format text: CHANGE OF NAME; ASSIGNOR: MOSAID TECHNOLOGIES INCORPORATED; REEL/FRAME: 032457/0560. Effective date: 20140101 |
AS | Assignment | Owner name: CONVERSANT INTELLECTUAL PROPERTY MANAGEMENT INC., CANADA. Free format text: CHANGE OF ADDRESS; ASSIGNOR: CONVERSANT INTELLECTUAL PROPERTY MANAGEMENT INC.; REEL/FRAME: 033678/0096. Effective date: 20140820 |
AS | Assignment | Owner names: CPPIB CREDIT INVESTMENTS INC., AS LENDER, CANADA; ROYAL BANK OF CANADA, AS LENDER, CANADA. Free format text: U.S. PATENT SECURITY AGREEMENT (FOR NON-U.S. GRANTORS); ASSIGNOR: CONVERSANT INTELLECTUAL PROPERTY MANAGEMENT INC.; REEL/FRAME: 033706/0367. Effective date: 20140611 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |
AS | Assignment | Owner name: CONVERSANT INTELLECTUAL PROPERTY MANAGEMENT INC., CANADA. Free format text: RELEASE OF U.S. PATENT AGREEMENT (FOR NON-U.S. GRANTORS); ASSIGNOR: ROYAL BANK OF CANADA, AS LENDER; REEL/FRAME: 047645/0424. Effective date: 20180731 |