US20070165596A1 - Creation and management of routing table for PCI bus address based routing with integrated DID - Google Patents
Creation and management of routing table for PCI bus address based routing with integrated DID
- Publication number
- US20070165596A1 (application US 11/334,678)
- Authority
- US
- United States
- Prior art keywords
- pci
- root
- switches
- switch
- specified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4022—Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/54—Organization of routing tables
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
Definitions
- the invention disclosed and claimed herein generally pertains to a method and related apparatus for routing PCIe transaction packets between multiple hosts and adapters, through a PCIe switched-fabric. More particularly, the invention pertains to a method for creating and managing the structures needed for routing PCI transaction packets between multiple hosts and adapters when using a Destination Identification (DID) that is integrated into the PBA.
- DID Destination Identification
- PCI Express is widely used in computer systems to interconnect host units to adapters or other components, by means of a PCI switched-fabric bus or the like.
- PCIe currently does not permit the sharing of input/output (I/O) adapters in topologies where there are multiple hosts with multiple shared PCIe links.
- I/O input/output
- adapters for PCIe and secondary networks e.g., FC, IB, Enet
- FC, IB, Enet secondary networks
- a mechanism is required for creating and managing the structures needed for routing PCI transaction packets between multiple hosts and adapters.
- the mechanism must be designed so that it protects memory and data in the system image of one host from being accessed by unauthorized applications in system images of other hosts. Access by other adapters in the same PCI tree must also be prevented.
- implementation of the mechanism should minimize changes that must be made to currently used PCI hardware.
- the invention is generally directed to the provision and management of tables for routing packets through an environment that includes multiple hosts and shared PCIe switches and adapters.
- the invention features modification of a conventional PCI Bus Address (PBA) by including a Destination Identification (DID) field in the PBA.
- the DID field is embedded in a transaction packet dispatched through the PCIe switches, and is integrated into the PCI address.
- a particular DID is associated with a particular host or system image, and thus identifies the physical or virtual end point of its packet.
- One useful embodiment of the invention is directed to a method for creating and managing the structures needed for routing PCIe transaction packets through PCIe switches in a distributed computer system comprising multiple root nodes, wherein each root node includes one or more hosts.
- the system further includes one or more PCI adapters.
- a physical tree that is indicative of a physical configuration of the distributed computing system is determined, and a virtual tree is created from the physical tree.
- the virtual tree is then modified to change an association between at least one source device and at least one target device in the virtual tree.
- a validation mechanism validates the changed association between the at least one source device and the at least one target device to enable routing of data from the at least one source device to the at least one target device.
- FIG. 1 is a block diagram showing a generic distributed computer system for use with an embodiment of the invention.
- FIG. 2 is a block diagram showing an exemplary logical partition platform in the system of FIG. 1 .
- FIG. 3 is a block diagram showing a distributed computer system in further detail, wherein the system of FIG. 3 is adapted to implement an embodiment of the invention.
- FIG. 4 is a schematic diagram depicting several PCI Bus Addresses, each with an integrated DID component and associated with either a Root Complex or a Virtual End Point for use in an embodiment of the invention.
- FIG. 5 is a schematic diagram showing a PCI-E transaction packet, together with a simplified Integrated Destination ID Routing Table and a simplified Integrated Destination ID Validation Table, according to an embodiment of the invention.
- FIG. 6 illustrates a PCI configuration header according to an exemplary embodiment of the present invention.
- FIG. 7 presents diagrams that schematically illustrate a system for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 8 is a flowchart that illustrates a method for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 9 is a flowchart that illustrates a method for assigning source and destination identifiers in connection with managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- the IOAs may be single function, such as IOAs 168 - 170 and 176 , or multiple function, such as IOAs 172 - 174 and 178 .
- respective IOAs may be connected to the I/O fabric 144 via single links, such as links 180 - 186 , or with multiple links for redundancy, such as links 188 - 194 .
- the RCs 110 , 120 , and 130 are integral components of RN 160 , 162 and 164 , respectively. There may be more than one RC in an RN, such as RCs 140 and 142 which are both integral components of RN 166 .
- each RN consists of one or more Central Processing Units (CPUs) 102 - 104 , 112 - 114 , 122 - 124 and 132 - 134 , memories 106 , 116 , 126 and 136 , and memory controllers 108 , 118 , 128 and 138 .
- the memory controllers respectively interconnect the CPUs, memory, and I/O RCs of their corresponding RNs, and perform such functions as handling the coherency traffic for respective memories.
- RNs may be connected together at their memory controllers, such as by a link 146 extending between memory controllers 108 and 118 of RNs 160 and 162 .
- This forms one coherency domain which may act as a single Symmetric Multi-Processing (SMP) system.
- SMP Symmetric Multi-Processing
- nodes may be independent from one another with separate coherency domains as in RNs 164 and 166 .
- FIG. 1 shows a PCI Configuration Manager (PCM) 148 incorporated into one of the RNs, such as RN 160 , as an integral component thereof.
- the PCM configures the shared resources of the I/O fabric and assigns resources to the RNs.
- Distributed computing system 100 may be implemented using various commercially available computer systems.
- distributed computing system 100 may be implemented using an IBM eServer iSeries Model 840 system available from International Business Machines Corporation.
- Such a system may support logical partitioning using an OS/400 operating system, which is also available from International Business Machines Corporation.
- the hardware depicted in FIG. 1 may vary.
- other peripheral devices such as optical disk drives and the like, also may be used in addition to or in place of the hardware depicted.
- the depicted example is not meant to imply architectural limitations with respect to the present invention.
- Logically partitioned platform 200 includes partitioned hardware 230 , operating systems 202 , 204 , 206 , 208 and hypervisor 210 .
- Operating systems 202 , 204 , 206 and 208 may be multiple copies of a single operating system, or may be multiple heterogeneous operating systems simultaneously run on platform 200 .
- These operating systems may be implemented using OS/400, which is designed to interface with a hypervisor.
- Operating systems 202 , 204 , 206 and 208 are located in partitions 212 , 214 , 216 and 218 , respectively. Additionally, these partitions respectively include firmware loaders 222 , 224 , 226 and 228 . When partitions 212 , 214 , 216 and 218 are instantiated, a copy of open firmware is loaded into each partition by the hypervisor's partition manager. The processors associated or assigned to the partitions are then dispatched to the partitions' memory to execute the partition firmware.
- Partitioned hardware 230 includes a plurality of processors 232 - 238 , a plurality of system memory units 240 - 246 , a plurality of input/output (I/O) adapters 248 - 262 , and a storage unit 270 .
- Partition hardware 230 also includes service processor 290 , which may be used to provide various services, such as processing of errors in the partitions.
- Each of the processors 232 - 238 , memory units 240 - 246 , NVRAM 298 , and I/O adapters 248 - 262 may be assigned to one of multiple partitions within logically partitioned platform 200 , each of which corresponds to one of operating systems 202 , 204 , 206 and 208 .
- Partition management firmware (hypervisor) 210 performs a number of functions and services for partitions 212 , 214 , 216 and 218 to create and enforce the partitioning of logically partitioned platform 200 .
- Hypervisor 210 is a firmware implemented virtual machine identical to the underlying hardware. Hypervisor software is available from International Business Machines Corporation. Firmware is “software” stored in a memory chip that holds its content without electrical power, such as, for example, read-only memory (ROM), programmable ROM (PROM), electrically erasable programmable ROM (EEPROM), and non-volatile random access memory (NVRAM).
- ROM read-only memory
- PROM programmable ROM
- EEPROM electrically erasable programmable ROM
- NVRAM non-volatile random access memory
- Hardware management console 280 is a separate distributed computing system from which a system administrator may perform various functions including reallocation of resources to different partitions.
- some functionality is needed in the bridges that connect IOAs to the I/O bus so as to be able to assign resources, such as individual IOAs or parts of IOAs to separate partitions; and, at the same time, prevent the assigned resources from affecting other partitions such as by obtaining access to resources of the other partitions.
- Referring to FIG. 3 , there is shown a distributed computer system 300 that includes a more detailed representation of the I/O switched-fabric 144 depicted in FIG. 1 . More particularly, to further illustrate the concept of a PCI fabric that supports multiple root nodes through the use of multiple switches, fabric 144 is shown in FIG. 3 to comprise a plurality of PCI switches (or bridges) 302 , 304 and 306 , wherein switches 302 and 304 are multi-root aware switches. FIG. 3 further shows switches 302 , 304 and 306 provided with ports 308 - 314 , 316 - 324 and 326 - 330 , respectively. It is to be understood that the term “switch”, when used herein by itself, may include both switches and bridges.
- bridge as used herein generally pertains to a device for connecting two segments of a network that use the same protocol.
- FIG. 3 further shows switch 302 provided with an Integrated Destination Identifier-to-Port Routing Table (IDIRT) 382 .
- Switch 304 is similarly provided with an IDIRT 384 .
- IDIRTs described hereinafter in greater detail in connection with FIGS. 4 and 5 , are set up for routing PCI packets using integrated DID. More particularly, each IDIRT contains entries that pertain to specific hosts and adapters.
- host CPU sets 332 , 334 and 336 each containing a single or a plurality of system images (SIs).
- host set 332 contains system images SI 1 and SI 2 .
- host set 334 contains system image SI 3 .
- host set 336 contains system images SI 4 and SI 5 .
- each system image is equivalent or corresponds to a partition, such as partitions 212 - 218 , as described above in connection with FIG. 2 .
- Each system image is also equivalent to a host.
- system images SI 1 and SI 2 are each equivalent to one of the hosts of host CPU set 332 .
- Each of the host CPU sets has an associated root complex as described above, through which the system images of respective hosts interface with or access the I/O fabric 144 . More particularly, host sets 332 - 336 are interconnected to RCs 338 - 342 , respectively. Root complex 338 has ports 344 and 346 , and root complexes 340 and 342 each has only a single port, i.e. ports 348 and 350 , respectively. Each of the host CPU sets, together with its corresponding root complex, comprises an example or instance of a root node, such as RNs 160 - 166 shown in FIG. 1 . Moreover, host CPU set 332 is provided with a PCM 370 that is similar or identical to the PCM 148 of FIG. 1 .
- FIG. 3 further shows each of the RCs 338 - 342 connected to one of the ports 316 - 320 , which respectively comprise ports of multi-root aware switch 304 .
- Each of the multi-root aware switches 304 and 302 provides the capability to configure a PCI fabric such as I/O fabric 144 with multiple routings or data paths, in order to accommodate multiple root nodes.
- Respective ports of a multi-root aware switch can be used as upstream ports, downstream ports, or both upstream and downstream ports.
- upstream ports are closer to a source of data and receive a data stream.
- Downstream ports are further from the data source and send out a data stream.
- Upstream/downstream ports can have characteristics of both upstream and downstream ports.
- ports 316 , 318 , 320 , 326 and 308 are upstream ports.
- Ports 324 , 312 , 314 , 328 and 330 are downstream ports, and ports 322 and 310 are upstream/downstream ports.
- multi-root aware switch 302 uses downstream port 312 to connect to an I/O adapter 352 , which has two virtual I/O adapters or resources 354 and 356 .
- multi-root aware switch 302 uses downstream port 314 to connect to an I/O adapter 358 , which has three virtual I/O adapters or resources 360 , 362 and 364 .
- Multi-root aware switch 304 uses downstream port 324 to connect to port 326 of switch 306 .
- Switch 306 uses downstream ports 328 and 330 to connect to I/O adapter 366 and I/O adapter 368 , respectively.
- Each of the ports configured as an upstream port is used to connect to one of the root complexes 338 - 342 .
- FIG. 3 shows multi-root aware switch 302 using upstream port 308 to connect to port 344 of RC 338 .
- multi-root aware switch 304 uses upstream ports 316 , 318 and 320 to respectively connect to port 346 of root complex 338 , to the single port 348 of RC 340 , and to the single port 350 of RC 342 .
- FIG. 3 shows multi-root aware switch 302 using upstream/downstream port 310 to connect to upstream/downstream port 322 of multi-root aware switch 304 .
- I/O adapter 352 is shown as a virtualized I/O adapter, having its function 0 (F 0 ) assigned and accessible to the system image SI 1 , and its function 1 (F 1 ) assigned and accessible to the system image SI 2 .
- I/O adapter 358 is shown as a virtualized I/O adapter, having its function 0 (F 0 ) assigned and accessible to SI 3 , its function 1 (F 1 ) assigned and accessible to SI 4 and its function 3 (F 3 ) assigned to SI 5 .
- I/O adapter 366 is shown as a virtualized I/O adapter with its function F 0 assigned and accessible to SI 2 and its function F 1 assigned and accessible to SI 4 .
- I/O adapter 368 is shown as a single function I/O adapter assigned and accessible to SI 5 .
- In a system such as distributed computer system 300 , the PCM must query a PCI switch to determine whether or not the switch supports use of integrated DID for routing packets.
- switches 302 and 304 support integrated DID as described herein, but switch 306 does not.
- Referring to FIG. 4 , there is shown a schematic representation of a section or component 400 of an IDIRT, such as IDIRT 384 of switch 304 . More particularly, FIG. 4 depicts PCI Bus Address spaces 402 - 410 , each containing a total of 64 bits. Moreover, in FIG. 4 the bits in each address space are respectively grouped into the highest 16 bits and lowest 48 bits.
- the higher order bits in the PCI address space are used to identify a destination.
- a switch receiving a PCIe Packet uses the high order bits, for example the upper 16 bits, of the address to select the port that routes to the correct destination. The remaining 48 bits of the address base will then be addresses that are used by that destination.
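- The 16/48-bit split described above can be illustrated with a minimal Python sketch. The function names `make_pba` and `split_pba` are hypothetical and chosen only for illustration; the sketch assumes the example layout given in the text (DID in the upper 16 bits of a 64-bit address) and is not the patent's implementation.

```python
from __future__ import annotations

DID_BITS = 16   # upper bits carrying the integrated DID (example width from the text)
ADDR_BITS = 48  # remaining bits, used by the destination for its own addressing
DID_MASK = (1 << DID_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1


def make_pba(did: int, local_addr: int) -> int:
    """Combine a DID and a 48-bit destination-local address into a 64-bit PBA."""
    if not 0 <= did <= DID_MASK:
        raise ValueError("DID does not fit in 16 bits")
    if not 0 <= local_addr <= ADDR_MASK:
        raise ValueError("local address does not fit in 48 bits")
    return (did << ADDR_BITS) | local_addr


def split_pba(pba: int) -> tuple[int, int]:
    """Return (did, local_addr) extracted from a 64-bit PBA."""
    return (pba >> ADDR_BITS) & DID_MASK, pba & ADDR_MASK
```

- A switch would only need the first element returned by `split_pba` to pick a port; the second element is meaningful only to the destination.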
- FIG. 4 further shows an address type for each PCI address space. This is done to emphasize that the address spaces of FIG. 4 can be used with different address types. Thus, addresses 402 , 404 and 406 are each used with a root complex, whereas addresses 408 and 410 are each used with a virtual end point.
- the PCM configures the switch so that one of the PBA address spaces of the IDIRT is assigned to the particular host.
- the PCM carries this out by creating an entry in the IDIRT for each connected host.
- an entry could be made that, as an example, assigns address space 402 of FIG. 4 to the host associated with SI 2 of host CPU set 332 .
- address space 404 could be assigned to the host associated with SI 3 of host set 334 .
- each root complex such as root complexes 338 , 340 , and 342 , is identified by the destination identifier and can use host virtualization to route incoming PCIe transactions to the appropriate host SI.
- the adapter places the integrated DID in the upper 16 bits of the PCIe memory transaction's address field.
- the switches then use the IDIRT to route the PCIe transaction to the root complex associated with the integrated DID.
- When an adapter is connected to a switch capable of supporting integrated DID, the switch reports this event to the PCM.
- the PCM places an entry in the switch IDIRT for each virtual end point and communicates to each root complex the set of virtual end points that are associated to that root complex, along with the integrated DID for each of those virtual end points.
- the adapter's virtual end points are “made visible” to each of the associated hosts, and can be accessed thereby.
- the bits x0001 of space 408 could be the assigned DID to virtual end point 354 .
- Each virtual end point such as virtual end points 354 , 356 , 360 , 362 , 364 , 350 , 351 , and 352 , is identified by the destination identifier and can use host virtualization to route incoming PCIe transactions to the appropriate virtual end point.
- a root complex such as 338
- the root complex places the integrated DID in the upper 16 bits of the PCIe memory transaction's address field.
- the switches then use the IDIRT to route the PCIe transaction to the virtual end point associated with the integrated DID.
- the PCM can query the IDIRT of a switch to determine what is in the switch configuration. Also, the PCM can modify entries in a switch IDIRT or can destroy or delete entries therein when those entries are no longer valid. Embodiments of the invention thus combine or aggregate multiple devices with a single DID number, to simplify routing lookup. Moreover, each host can only communicate to PCI addresses within its PCI address space segment. This is enforced at the switch containing the IDIRT, which is also referred to herein as a root switch. All PCIe component trees below a root switch are joined at the switch to form a single tree.
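- The management operations named above (query, modify, delete) and the per-host address space segment can be sketched as follows. The class `Idirt`, its method names, and the helper `segment_bounds` are hypothetical; the sketch assumes a 16-bit DID in the upper bits of a 64-bit PBA, so each DID owns one contiguous 48-bit segment.

```python
ADDR_BITS = 48  # low-order bits left to the destination (example split from the text)


class Idirt:
    """Hypothetical in-memory model of a switch's DID-to-port routing table."""

    def __init__(self):
        self._entries = {}  # DID -> switch port name

    def create_or_modify(self, did, port):
        """Add an entry for a DID, or overwrite the port of an existing entry."""
        self._entries[did] = port

    def delete(self, did):
        """Remove an entry that is no longer valid; unknown DIDs are ignored."""
        self._entries.pop(did, None)

    def query(self):
        """Return a copy of the current table, as a PCM query might."""
        return dict(self._entries)


def segment_bounds(did):
    """Return the lowest and highest 64-bit PBA belonging to a DID's segment."""
    base = did << ADDR_BITS
    return base, base + (1 << ADDR_BITS) - 1


def within_segment(pba, did):
    """Check that an address falls inside the segment assigned to a DID."""
    low, high = segment_bounds(did)
    return low <= pba <= high
```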
- Packet 540 includes BDF and PBA fields 544 and 546 , wherein a BDF number is an integer representing the bus, device and function of a PCI component. Packet 540 further includes an integrated DID number 542 , as described above, that is shown to be located in the PBA address field. Packet 540 further includes a PCIe component address 564 , as described above, that is shown to also be located in the PBA address field.
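- As a brief illustration of a BDF number as a single integer, the sketch below uses the conventional PCI layout of an 8-bit bus number, a 5-bit device number and a 3-bit function number. The function names are hypothetical, and the text itself does not state which layout it uses.

```python
def pack_bdf(bus, device, function):
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into one integer."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus, device or function out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf):
    """Split a 16-bit BDF number back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7
```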
- the Integrated DID number 542 of the packet is used by the switch to look up an entry in the IDIRT 500 that contains the switch port number to emit the packet out of. For example, if the Integrated DID number 542 points to IDIRT entry 1 548 , then Port A 556 on the switch is used to emit the packet.
- FIG. 5 further shows entries 550 and 552 respectively corresponding to ports 558 and 560 .
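- The port lookup described for FIG. 5 can be sketched as a dictionary lookup keyed by the DID taken from the upper address bits. This is only an illustration under stated assumptions: the table is modeled as a Python dict, the DID value is assumed to equal the entry index, and only the pairing of entry 1 with "Port A" comes from the text; the other port names are placeholders.

```python
from typing import Dict, Optional

ADDR_BITS = 48  # low-order bits left for the destination's own addressing

# Only entry 1 -> "Port A" is taken from the text; "Port B" and "Port C" are placeholders.
EXAMPLE_IDIRT: Dict[int, str] = {1: "Port A", 2: "Port B", 3: "Port C"}


def select_port(idirt: Dict[int, str], pba: int) -> Optional[str]:
    """Return the egress port for a packet address, or None if the DID is unknown."""
    did = (pba >> ADDR_BITS) & 0xFFFF
    return idirt.get(did)
```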
- Before an outbound PCIe packet can be emitted from a port, the switch checks whether the port can accept PCIe packets from the BDF# contained in the inbound PCIe packet 540 . The switch performs this function by using the Integrated DID 542 to look up an entry in the Integrated DID-to-BDF# Validation Table (IDIVT) 570 and comparing the BDF# 544 from the incoming packet 540 to the list of BDFs 590 in the IDIVT 570 . IDID numbers 584 and 588 respectively correspond to BDF numbers 595 and 598 .
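- The validation check can be sketched as a membership test: the DID selects a list of permitted source BDF numbers, and the packet's own BDF# must appear in that list. The `Packet` class, the `validate` function and the dict representation of the IDIVT are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set

ADDR_BITS = 48  # the DID is assumed to sit above the low 48 address bits


@dataclass(frozen=True)
class Packet:
    source_bdf: int  # BDF# of the component that issued the packet
    pba: int         # 64-bit PCI bus address with the DID in its upper bits


def validate(idivt: Dict[int, Set[int]], packet: Packet) -> bool:
    """Return True only if the packet's BDF# is listed for the packet's DID."""
    did = (packet.pba >> ADDR_BITS) & 0xFFFF
    allowed = idivt.get(did, set())  # an unknown DID permits no sources
    return packet.source_bdf in allowed
```

- In this sketch a packet whose DID has no table entry is rejected, which is one conservative reading of the check described above.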
- IDIVT Integrated DID-to-BDF# Validation Table
- FIG. 6 illustrates a PCI configuration header according to an exemplary embodiment of the present invention.
- the PCI configuration header is generally designated by reference number 600 , and PCI Express starts its extended capabilities 602 at a fixed address in PCI configuration header 600 . These can be used to determine if the PCI component is a multi-root aware PCI component and if the device supports Integrated DID-based routing. If the PCI Express extended capabilities 602 has multi-root aware bit 603 set and Integrated DID based routing supported bit 604 set, then the IDID# for the device can be stored in the PCI Express Extended Capabilities area 605 . It should be understood, however, that the present invention is not limited to the herein described scenario where the PCI extended capabilities are used to define the IDID. Any other field could be redefined, or reserved fields could be used, to implement the Integrated Destination ID field in other PCI specifications.
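- The two-bit capability test can be sketched as a mask comparison. The bit positions below are assumptions made purely for illustration, since the text does not give the positions of bits 603 and 604 within the extended capabilities.

```python
# Assumed bit positions; the text does not specify where bits 603 and 604 live.
MULTI_ROOT_AWARE = 1 << 0
IDID_ROUTING_SUPPORTED = 1 << 1


def supports_idid_routing(cap_flags: int) -> bool:
    """Return True only if both capability bits are set in the flags word."""
    required = MULTI_ROOT_AWARE | IDID_ROUTING_SUPPORTED
    return (cap_flags & required) == required
```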
- the present invention is directed to a method and system for managing the routing of data in a distributed computing system, for example, a distributed computing system that uses PCI Express protocol to communicate over an I/O fabric, to reflect modifications made to the distributed computing system.
- the present invention provides a mechanism for managing the Integrated Destination ID field included in the above-described data routing mechanism to ensure that the routing mechanism properly reflects modifications made in the distributed computing system that affects the routing of data through the system such as transferring IOAs from one host to another, or adding or removing hosts and/or IOAs from the system.
- FIG. 7 presents diagrams that schematically illustrate a system for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 7 illustrates a specific example of how a routing mechanism in the distributed computing system is altered to reflect a change in an association between a root complex and an IOA in the distributed computing system.
- the PCI Configuration Manager first creates an Integrated DID Routing Table (IDIDRT) representing a tree indicative of the current physical configuration of the distributed computing system.
- the PCM creates this table by discovering the current configuration of the I/O fabric so that it will have a full view of the physical configuration of the fabric, and then creates the IDIDRT from this information.
- the manner in which this may be accomplished is described in detail in commonly assigned, copending U.S. patent application entitled ______, Ser. No. ______, Attorney Docket No. AUS920050367US1, filed on ______, the disclosure of which is hereby incorporated by reference.
- the system administrator or agent for RC 1 modifies the virtual tree by deleting EP 2 so that it cannot communicate with RC 1 as shown in diagram 706 .
- the PCM then creates a new IDID Validation Table (IDIDVT) to reflect the modification of the virtual tree.
- IDIDVT IDID Validation Table
- the procedure illustrated in diagrams 704 and 706 is then repeated for RC 2 .
- the PCM presents a virtual tree to the system administrator or agent for RC 2 , and the system administrator or agent modifies the virtual tree by deleting EP 1 and EP 3 so that they cannot communicate with RC 2 as shown in diagram 708 .
- the IDIDVT in the switch will be as shown in diagram 710 wherein the IDIDVT validates RC 1 to communicate with EP 1 and EP 3 and vice versa, and validates RC 2 to communicate with EP 2 and vice versa. It should be understood that although only two RCs and three EPs are included in the physical tree in FIG. 7 , this is intended to be exemplary only, as the tree may include any desired number of RCs and EPs.
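- The FIG. 7 example can be traced with a short sketch: each RC starts from a copy of the full set of EPs, the administrator's deletions are applied, and the surviving pairs are recorded in both directions. Component names stand in for the IDID and BDF numbers that a real table would hold, and `build_validation` is a hypothetical helper.

```python
def build_validation(rcs, eps, deletions):
    """Derive a symmetric validation table from per-RC virtual trees.

    rcs, eps: iterables of component names found in the physical tree.
    deletions: maps an RC name to the EPs removed from that RC's virtual tree.
    """
    table = {name: set() for name in list(rcs) + list(eps)}
    for rc in rcs:
        virtual_tree = set(eps)                      # copy of the physical tree
        virtual_tree -= set(deletions.get(rc, ()))   # administrator deletes EPs
        for ep in virtual_tree:
            table[rc].add(ep)   # the RC may communicate with the EP
            table[ep].add(rc)   # and vice versa
    return table


# Deletions taken from the example: EP2 for RC1; EP1 and EP3 for RC2.
fig7 = build_validation(
    ["RC1", "RC2"],
    ["EP1", "EP2", "EP3"],
    {"RC1": ["EP2"], "RC2": ["EP1", "EP3"]},
)
```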
- FIG. 8 is a flowchart that illustrates a method for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- the method is generally designated by reference number 800 , and begins by the PCM creating a full table of the physical configuration of the I/O fabric utilizing the mechanism described in the above-referenced commonly assigned, copending U.S. patent application entitled ______, Ser. No. ______, Attorney Docket No. AUS920050367US1, filed on ______ (Step 802 ).
- the PCM then creates an IDIDRT from the information on physical configuration to make “IDID-to-switch port” associations (Step 804 ).
- An IDID and BDF# are then assigned to all RCs and EPs in the IDIDRT and Bus#s are assigned to all switch to switch links (Step 806 ).
- FIG. 9 is a flowchart that illustrates a method for assigning source and destination identifiers in connection with managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- the method is generally designated by reference number 900 and may be implemented as Step 806 in FIG. 8 .
- a determination is first made whether the switch is multi-root aware (Step 902 ). If the switch is not multi-root aware (No output of Step 902 ), the method finishes with an error (Step 904 ) because the switch will not support multi-root configurations.
- the PCM queries the PCIe Configuration Space of the component attached to port AP (Step 908 ).
- a determination is made whether the component is a switch (Step 910 ). If the component is a switch (Yes output of Step 910 ), a determination is made whether a Bus# has been assigned to port AP (Step 912 ). If a Bus# has been assigned to port AP (Yes output of Step 912 ), port AP is set equal to port AP - 1 (Step 914 ), and the method returns to Step 908 to repeat the method with the next port.
- a determination is made whether the component is an RC (Step 920 ). If the component is an RC (Yes output of Step 920 ), a BDF# is assigned (Step 922 ) and a determination is made whether the RC supports the IDID (Step 924 ). If the RC does support the IDID (Yes output of Step 924 ), the IDID is assigned to the RC (Step 926 ). The AP is then set to be equal to AP - 1 (Step 928 ), and a determination is made whether the AP is greater than 0 (Step 930 ). If the AP is not greater than 0 (No output of Step 930 ), the method ends. If the AP is greater than 0 (Yes output of Step 930 ), the method returns to Step 908 to query the PCIe configuration Space of the component attached to the next port.
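- The port walk of FIG. 9 can be approximated by the loop below. Several points are assumptions because the surviving text does not cover them: the walk starts at the highest port number, a switch without a Bus# simply receives the next free one, EPs are handled like RCs, and identifiers are drawn from simple counters. The `Component` class and `assign_identifiers` function are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class Component:
    kind: str                    # "switch", "rc" or "ep"
    supports_idid: bool = False
    bus: Optional[int] = None
    bdf: Optional[int] = None
    idid: Optional[int] = None


def assign_identifiers(ports: Dict[int, Component], multi_root_aware: bool) -> None:
    """Walk ports from the highest number down to 1, assigning identifiers in place."""
    if not multi_root_aware:                       # Steps 902 and 904
        raise RuntimeError("switch is not multi-root aware")
    next_bus, next_bdf, next_idid = count(1), count(1), count(1)
    ap = max(ports, default=0)                     # assumed starting port
    while ap > 0:                                  # Step 930
        comp = ports.get(ap)                       # Step 908: query attached component
        if comp is not None:
            if comp.kind == "switch":              # Step 910
                if comp.bus is None:               # Step 912
                    comp.bus = next(next_bus)      # assumed handling of the No branch
            else:                                  # Step 920; EPs treated alike (assumption)
                comp.bdf = next(next_bdf)          # Step 922
                if comp.supports_idid:             # Step 924
                    comp.idid = next(next_idid)    # Step 926
        ap -= 1                                    # Steps 914 and 928
```

- Using one loop condition for both branches is a simplification; the flowchart text returns to Step 908 directly after Step 914.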
- after an IDID and BDF# have been assigned to all RCs and EPs in the IDIDRT, and Bus#s are assigned to all switch to switch links (Step 806 ), the RCN is set to the number of RCs in the fabric (Step 808 ), and a virtual tree is created for the RCN by copying the full physical tree (Step 810 ). The virtual tree is then presented to the administrator or agent for the RC (Step 812 ). The system administrator or agent deletes EPs from the tree (Step 814 ), and a similar process is repeated until the virtual tree has been fully modified as desired.
- an IDIDVT is then created on each switch showing the RC IDID# associated with the list of EP BDF#s, and EP IDID# associated with the list of EP BDF#s (Step 816 ).
- the present invention thus provides a method and system for managing the routing of data in a distributed computing system, such as a distributed computing system that uses PCI Express protocol to communicate over an I/O fabric.
- a physical tree that is indicative of a physical configuration of the distributed computing system is determined, and a virtual tree is created from the physical tree.
- the virtual tree is then modified to change an association between at least one source device and at least one target device in the virtual tree.
- a validation mechanism validates the changed association between the at least one source device and the at least one target device to enable routing of data from the at least one source device to the at least one target device.
- the invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements.
- the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
- the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
- a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- the medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium.
- Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk.
- Current examples of optical disks include compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W) and DVD.
- a data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
- the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
- I/O devices including but not limited to keyboards, displays, pointing devices, etc.
- I/O controllers can be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
- Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Multi Processors (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Bus Control (AREA)
- Small-Scale Networks (AREA)
Abstract
Description
- 1. Field of the Invention
- The invention disclosed and claimed herein generally pertains to a method and related apparatus for routing PCIe transaction packets between multiple hosts and adapters, through a PCIe switched-fabric. More particularly, the invention pertains to a method for creating and managing the structures needed for routing PCI transaction packets between multiple hosts and adapters when using a Destination Identification (DID) that is integrated into the PBA.
- 2. Description of the Related Art
- As is well known by those of skill in the art, PCI Express (PCIe) is widely used in computer systems to interconnect host units to adapters or other components, by means of a PCI switched-fabric bus or the like. However, PCIe currently does not permit the sharing of input/output (I/O) adapters in topologies where there are multiple hosts with multiple shared PCIe links. As a result, even though such sharing capability could be very valuable when using blade clusters or other clustered servers, adapters for PCIe and secondary networks (e.g., FC, IB, Enet) are at present generally placed only into individual blades and server systems. Thus, such adapters cannot be shared between clustered blades, or even between multiple roots within a clustered system.
- In an environment containing multiple blades or blade clusters, it can be very costly to dedicate a PCI adapter for use with only a single blade. For example, a 10 Gigabit Ethernet (10 GigE) adapter currently costs on the order of $6,000. The inability to share these expensive adapters between blades has, in fact, contributed to the slow adoption rate of certain new network technologies such as 10 GigE. Moreover, there is a constraint imposed by the limited space available in blades to accommodate I/O adapters. This problem of limited space could be overcome if a PCI network were able to support attachment of multiple hosts to a single PCI adapter, so that virtual PCIe I/O adapters could be shared between the multiple hosts.
- In order to allow virtualization of PCIe adapters in the above environment, a mechanism is required for creating and managing the structures needed for routing PCI transaction packets between multiple hosts and adapters. The mechanism must be designed so that it protects memory and data in the system image of one host from being accessed by unauthorized applications in system images of other hosts. Access by other adapters in the same PCI tree must also be prevented. Moreover, implementation of the mechanism should minimize changes that must be made to currently used PCI hardware.
- The invention is generally directed to the provision and management of tables for routing packets through an environment that includes multiple hosts and shared PCIe switches and adapters. The invention features modification of a conventional PCI Bus Address (PBA) by including a Destination Identification (DID) field in the PBA. Thus, the DID field is embedded in a transaction packet dispatched through the PCIe switches, and is integrated into the PCI address. A particular DID is associated with a particular host or system image, and thus identifies the physical or virtual end point of its packet. One useful embodiment of the invention is directed to a method for creating and managing the structures needed for routing PCIe transaction packets through PCIe switches in a distributed computer system comprising multiple root nodes, wherein each root node includes one or more hosts. The system further includes one or more PCI adapters. A physical tree that is indicative of a physical configuration of the distributed computing system is determined, and a virtual tree is created from the physical tree. The virtual tree is then modified to change an association between at least one source device and at least one target device in the virtual tree. A validation mechanism validates the changed association between the at least one source device and the at least one target device to enable routing of data from the at least one source device to the at least one target device.
- FIG. 1 is a block diagram showing a generic distributed computer system for use with an embodiment of the invention.
- FIG. 2 is a block diagram showing an exemplary logical partition platform in the system of FIG. 1.
- FIG. 3 is a block diagram showing a distributed computer system in further detail, wherein the system of FIG. 3 is adapted to implement an embodiment of the invention.
- FIG. 4 is a schematic diagram depicting several PCI Bus Addresses, each with an integrated DID component and associated with either a Root Complex or a Virtual End Point for use in an embodiment of the invention.
- FIG. 5 is a schematic diagram showing a PCI-E transaction packet, together with a simplified Integrated Destination ID Routing Table and a simplified Integrated Destination ID Validation Table, according to an embodiment of the invention.
- FIG. 6 illustrates a PCI configuration header according to an exemplary embodiment of the present invention.
- FIG. 7 presents diagrams that schematically illustrate a system for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 8 is a flowchart that illustrates a method for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 9 is a flowchart that illustrates a method for assigning source and destination identifiers in connection with managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention.
- FIG. 1 shows a distributed computer system 100 comprising a preferred embodiment of the present invention. The distributed computer system 100 in FIG. 1 takes the form of multiple root complexes (RCs) 110, 120, 130, 140 and 142, respectively connected to an I/O switched-fabric bus 144 through I/O links memory controllers I/O fabric 144 via single links, such as links 180-186, or with multiple links for redundancy, such as links 188-194.
- The RCs RN RCs RN 166. In addition to the RCs, each RN consists of one or more Central Processing Units (CPUs) 102-104, 112-114, 122-124 and 132-134, memories memory controllers
- RNs may be connected together at their memory controllers, such as by a link 146 extending between memory controllers RNs RNs
- FIG. 1 shows a PCI Configuration Manager (PCM) 148 incorporated into one of the RNs, such as RN 160, as an integral component thereof. The PCM configures the shared resources of the I/O fabric and assigns resources to the RNs.
- Distributed computing system 100 may be implemented using various commercially available computer systems. For example, distributed computing system 100 may be implemented using an IBM eServer iSeries Model 840 system available from International Business Machines Corporation. Such a system may support logical partitioning using an OS/400 operating system, which is also available from International Business Machines Corporation.
- Those of ordinary skill in the art will appreciate that the hardware depicted in FIG. 1 may vary. For example, other peripheral devices, such as optical disk drives and the like, also may be used in addition to or in place of the hardware depicted. The depicted example is not meant to imply architectural limitations with respect to the present invention.
- With reference to FIG. 2, a block diagram of an exemplary logically partitioned platform 200 is depicted in which the present invention may be implemented. The hardware in logically partitioned platform 200 may be implemented as, for example, data processing system 100 in FIG. 1. Logically partitioned platform 200 includes partitioned hardware 230, operating systems hypervisor 210. Operating systems platform 200. These operating systems may be implemented using OS/400, which is designed to interface with a hypervisor. Operating systems partitions firmware loaders partitions
- Partitioned hardware 230 includes a plurality of processors 232-238, a plurality of system memory units 240-246, a plurality of input/output (I/O) adapters 248-262, and a storage unit 270. Partitioned hardware 230 also includes service processor 290, which may be used to provide various services, such as processing of errors in the partitions. Each of the processors 232-238, memory units 240-246, NVRAM 298, and I/O adapters 248-262 may be assigned to one of multiple partitions within logically partitioned platform 200, each of which corresponds to one of operating systems
- Partition management firmware (hypervisor) 210 performs a number of functions and services for partitions platform 200. Hypervisor 210 is a firmware implemented virtual machine identical to the underlying hardware. Hypervisor software is available from International Business Machines Corporation. Firmware is “software” stored in a memory chip that holds its content without electrical power, such as, for example, read-only memory (ROM), programmable ROM (PROM), electrically erasable programmable ROM (EEPROM), and non-volatile random access memory (NVRAM). Thus, hypervisor 210 allows the simultaneous execution of independent OS images platform 200.
- Operation of the different partitions may be controlled through a hardware management console, such as hardware management console 280. Hardware management console 280 is a separate distributed computing system from which a system administrator may perform various functions including reallocation of resources to different partitions.
- In an environment of the type shown in FIG. 2, it is not permissible for resources or programs in one partition to affect operations in another partition. Moreover, to be useful, the assignment of resources needs to be fine-grained. For example, it is often not acceptable to assign all IOAs under a particular PHB to the same partition, as that will restrict configurability of the system, including the ability to dynamically move resources between partitions.
- Accordingly, some functionality is needed in the bridges that connect IOAs to the I/O bus so as to be able to assign resources, such as individual IOAs or parts of IOAs, to separate partitions and, at the same time, prevent the assigned resources from affecting other partitions, such as by obtaining access to resources of the other partitions.
- Referring to FIG. 3, there is shown a distributed computer system 300 that includes a more detailed representation of the I/O switched-fabric 144 depicted in FIG. 1. More particularly, to further illustrate the concept of a PCI fabric that supports multiple root nodes through the use of multiple switches, fabric 144 is shown in FIG. 3 to comprise a plurality of PCI switches (or bridges) 302, 304 and 306, wherein switches 302 and 304 are multi-root aware switches. FIG. 3 further shows switches 302, 304 and 306 provided with ports 308-314, 316-324 and 326-330, respectively. It is to be understood that the term “switch”, when used herein by itself, may include both switches and bridges. The term “bridge” as used herein generally pertains to a device for connecting two segments of a network that use the same protocol.
- FIG. 3 further shows switch 302 provided with an Integrated Destination Identifier-to-Port Routing Table (IDIRT) 382. Switch 304 is similarly provided with an IDIRT 384. The IDIRTs, described hereinafter in greater detail in connection with FIGS. 4 and 5, are set up for routing PCI packets using integrated DID. More particularly, each IDIRT contains entries that pertain to specific hosts and adapters.
- Referring further to FIG. 3, there are shown host CPU sets 332, 334 and 336, each containing a single or a plurality of system images (SIs). Thus, host set 332 contains system images SI 1 and SI 2, host set 334 contains system image SI 3, and host set 336 contains system images SI 4 and SI 5. It is to be understood that each system image is equivalent or corresponds to a partition, such as partitions 212-218, as described above in connection with FIG. 2. Each system image is also equivalent to a host. Thus, system images SI 1 and SI 2 are each equivalent to one of the hosts of host CPU set 332.
- Each of the host CPU sets has an associated root complex as described above, through which the system images of respective hosts interface with or access the I/O fabric 144. More particularly, host sets 332-336 are interconnected to RCs 338-342, respectively. Root complex 338 has ports root complexes ports FIG. 1. Moreover, host CPU set 332 is provided with a PCM 370 that is similar or identical to the PCM 148 of FIG. 1.
- FIG. 3 further shows each of the RCs 338-342 connected to one of the ports 316-320, which respectively comprise ports of multi-root aware switch 304. Each of the multi-root aware switches I/O fabric 144 with multiple routings or data paths, in order to accommodate multiple root nodes.
- Respective ports of a multi-root aware switch, such as switches FIG. 3 ports Ports ports
- The ports configured as downstream ports are to be attached or connected to adapters or to the upstream port of another switch. In FIG. 3, multi-root aware switch 302 uses downstream port 312 to connect to an I/O adapter 352, which has two virtual I/O adapters or resources aware switch 302 uses downstream port 314 to connect to an I/O adapter 358, which has three virtual I/O adapters or resources aware switch 304 uses downstream port 324 to connect to port 326 of switch 306. Multi-root aware switch 304 uses downstream ports I/O adapter 366 and I/O adapter 368, respectively.
- Each of the ports configured as an upstream port is used to connect to one of the root complexes 338-342. Thus, FIG. 3 shows multi-root aware switch 302 using upstream port 308 to connect to port 344 of RC 338. Similarly, multi-root aware switch 304 uses upstream ports root complex 338, to the single port 348 of RC 340, and to the single port 350 of RC 342.
- The ports configured as upstream/downstream ports are used to connect to the upstream/downstream port of another switch. Thus, FIG. 3 shows multi-root aware switch 302 using upstream/downstream port 310 to connect to upstream/downstream port 322 of multi-root aware switch 304.
- I/O adapter 352 is shown as a virtualized I/O adapter, having its function 0 (F0) assigned and accessible to the system image SI 1, and its function 1 (F1) assigned and accessible to the system image SI 2. Similarly, I/O adapter 358 is shown as a virtualized I/O adapter, having its function 0 (F0) assigned and accessible to SI 3, its function 1 (F1) assigned and accessible to SI 4 and its function 3 (F3) assigned to SI 5. I/O adapter 366 is shown as a virtualized I/O adapter with its function F0 assigned and accessible to SI 2 and its function F1 assigned and accessible to SI 4. I/O adapter 368 is shown as a single function I/O adapter assigned and accessible to SI 5.
- In a system such as distributed computer system 300, the PCM must query a PCI switch to determine whether or not the switch supports use of integrated DID for routing packets. In system 300, switches
- Referring to FIG. 4, there is shown a schematic representation of a section or component 400 of an IDIRT, such as IDIRT 384 of switch 304. More particularly, FIG. 4 depicts PCI Bus Address spaces 402-410, each containing a total of 64 bits. Moreover, in FIG. 4 the bits in each address space are respectively grouped into the highest 16 bits and lowest 48 bits.
- More specifically, it is essential to understand that in connection with the IDIRT, the higher order bits in the PCI address space (selected to be the highest 16 bits in this embodiment) are used to identify a destination. Thus, a switch receiving a PCIe Packet uses the high order bits, for example the upper 16 bits, of the address to select the port that routes to the correct destination. The remaining 48 bits of the address will then be addresses that are used by that destination.
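The 16/48-bit split described above can be illustrated with a minimal Python sketch. The function names `split_pba` and `make_pba` are hypothetical and do not appear in the source; only the field widths (upper 16 bits for the DID, lower 48 bits for the destination's own addresses) are taken from the embodiment.

```python
# Illustrative only: field widths follow the embodiment described above
# (upper 16 bits = integrated DID, lower 48 bits = address used by the destination).
DID_BITS = 16
ADDR_BITS = 48
ADDR_MASK = (1 << ADDR_BITS) - 1


def split_pba(pba: int) -> tuple[int, int]:
    """Split a 64-bit PCI Bus Address into (did, destination_address)."""
    if not 0 <= pba < (1 << (DID_BITS + ADDR_BITS)):
        raise ValueError("PBA must fit in 64 bits")
    return pba >> ADDR_BITS, pba & ADDR_MASK


def make_pba(did: int, address: int) -> int:
    """Combine a 16-bit DID and a 48-bit address into one 64-bit PBA."""
    if not 0 <= did < (1 << DID_BITS):
        raise ValueError("DID must fit in 16 bits")
    if not 0 <= address <= ADDR_MASK:
        raise ValueError("address must fit in 48 bits")
    return (did << ADDR_BITS) | address


# Hypothetical values: DID x0001 with an arbitrary 48-bit address.
pba = make_pba(0x0001, 0x1000)
did, address = split_pba(pba)
```

A switch would only need the `did` half to choose a port; the `address` half is passed through unchanged to the destination.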
- FIG. 4 further shows an address type for each PCI address space. This is done to emphasize that the address spaces of FIG. 4 can be used with different address types. Thus, addresses 402, 404 and 406 are each used with a root complex, whereas addresses
- When a particular host connects to a switch that supports integrated DID, the PCM configures the switch so that one of the PBA address spaces of the IDIRT is assigned to the particular host. The PCM carries this out by creating an entry in the IDIRT for each connected host. Thus, an entry could be made that, as an example, assigns address space 402 of FIG. 4 to the host associated with SI 2 of host CPU set 332. Similarly, address space 404 could be assigned to the host associated with SI 3 of host set 334.
- As stated above, when a PBA address space is assigned to a host, the highest 16 bits of the address space are thereafter used as a destination identifier or DID that is associated with the host. For example, the bits x0000 of space 402 could be the DID assigned to root complex 338. The switch would then report to the host that the lower 48 bits of the address space 402 are available for use with packets pertaining to root complex 338. Each root complex, such as root complexes
- When an adapter is connected to a switch capable of supporting integrated DID, the switch reports this event to the PCM. The PCM then places an entry in the switch IDIRT for each virtual end point and communicates to each root complex the set of virtual end points that are associated with that root complex, along with the integrated DID for each of those virtual end points. As a result of this action, the virtual end points of the adapter are “made visible” to each of the associated hosts, and can be accessed thereby. For example, the bits x0001 of space 408 could be the DID assigned to virtual end point 354. Each virtual end point, such as virtual end points
- The PCM can query the IDIRT of a switch to determine what is in the switch configuration. Also, the PCM can modify entries in a switch IDIRT or can destroy or delete entries therein when those entries are no longer valid. Embodiments of the invention thus combine or aggregate multiple devices with a single DID number, to simplify routing lookup. Moreover, each host can only communicate to PCI addresses within its PCI address space segment. This is enforced at the switch containing the IDIRT, which is also referred to herein as a root switch. All PCIe component trees below a root switch are joined at the switch to form a single tree.
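The create, query, modify and delete operations that the PCM performs on a switch IDIRT can be modelled roughly as follows. This is a sketch under stated assumptions, not the patented implementation: the class name `Idirt`, its method names, and the port labels are all hypothetical, and the table is reduced to a plain DID-to-port mapping.

```python
class Idirt:
    """Toy model of one switch's IDIRT: integrated DID -> switch port."""

    def __init__(self) -> None:
        self._entries: dict[int, str] = {}

    def create(self, did: int, port: str) -> None:
        """Add an entry when a host or virtual end point is attached."""
        if did in self._entries:
            raise KeyError(f"DID {did:#06x} already has an entry")
        self._entries[did] = port

    def query(self) -> dict[int, str]:
        """Return a copy of the current switch configuration."""
        return dict(self._entries)

    def modify(self, did: int, port: str) -> None:
        """Change the port of an existing entry."""
        if did not in self._entries:
            raise KeyError(f"no entry for DID {did:#06x}")
        self._entries[did] = port

    def delete(self, did: int) -> None:
        """Remove an entry that is no longer valid."""
        del self._entries[did]


# Hypothetical usage: the port labels are placeholders, not reference numerals.
idirt = Idirt()
idirt.create(0x0000, "port_A")   # e.g. a DID assigned to a root complex
idirt.create(0x0001, "port_B")   # e.g. a DID assigned to a virtual end point
idirt.modify(0x0001, "port_C")   # the end point moved to another port
idirt.delete(0x0000)             # the host was removed
snapshot = idirt.query()
```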
- Referring to FIG. 5, there is shown a simplified IDIRT 500 in a root switch of system 300, wherein the root switch has received a PCI Express packet 540. Packet 540 includes BDF and PBA fields. Packet 540 further includes an integrated DID number 542, as described above, that is shown to be located in the PBA address field. Packet 540 further includes a PCIe component address 564, as described above, that is shown to also be located in the PBA address field.
- The Integrated DID number 542 of the packet is used by the switch to look up an entry in the IDIRT 500 that contains the number of the switch port from which the packet is to be emitted. For example, if the Integrated DID number 542 points to IDIRT entry 1 548, then Port A 556 on the switch is used to emit the packet. FIG. 5 further shows entries ports
- Before an outbound PCIe packet can be emitted from a port, the switch checks whether the port can accept PCIe packets from the BDF# contained in the inbound PCIe packet 540. The switch performs this function by using the Integrated DID 542 to look up an entry in the Integrated DID-to-BDF# Validation Table (IDIVT) 570 and comparing the BDF# 544 from the incoming packet 540 to the list of BDFs 590 in the IDIVT 570. IDID numbers BDF numbers
- FIG. 6 illustrates a PCI configuration header according to an exemplary embodiment of the present invention. The PCI configuration header is generally designated by reference number 600, and PCI Express starts its extended capabilities 602 at a fixed address in PCI configuration header 600. These can be used to determine if the PCI component is a multi-root aware PCI component and if the device supports Integrated DID-based routing. If the PCI Express extended capabilities 602 has both the multi-root aware bit 603 and the Integrated DID based routing supported bit 604 set, then the IDID# for the device can be stored in the PCI Express Extended Capabilities area 605. It should be understood, however, that the present invention is not limited to the herein described scenario where the PCI extended capabilities are used to define the IDID. Any other field could be redefined, or reserved fields could be used, for the Integrated Destination ID field implementation on other specifications for PCI.
- The present invention is directed to a method and system for managing the routing of data in a distributed computing system, for example, a distributed computing system that uses PCI Express protocol to communicate over an I/O fabric, to reflect modifications made to the distributed computing system. In particular, the present invention provides a mechanism for managing the Integrated Destination ID field included in the above-described data routing mechanism to ensure that the routing mechanism properly reflects modifications made in the distributed computing system that affect the routing of data through the system, such as transferring IOAs from one host to another, or adding or removing hosts and/or IOAs from the system.
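The two lookups described above in connection with FIG. 5 (DID to port in the IDIRT, then a BDF# check against the IDIVT) can be sketched in Python as shown below. The `Packet` type, the `route` function, and all table contents are hypothetical illustrations; both tables are reduced to dictionaries keyed by the integrated DID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    bdf: int  # Bus/Device/Function number of the sender
    pba: int  # 64-bit PCI Bus Address; the upper 16 bits carry the integrated DID


def route(packet: Packet,
          idirt: dict[int, str],
          idivt: dict[int, frozenset[int]]) -> Optional[str]:
    """Return the port to emit the packet from, or None if it must be dropped."""
    did = packet.pba >> 48                # extract the integrated DID
    port = idirt.get(did)                 # IDIRT lookup: DID -> switch port
    if port is None:
        return None                       # no routing entry for this DID
    allowed_bdfs = idivt.get(did, frozenset())
    if packet.bdf not in allowed_bdfs:    # IDIVT check: is this sender allowed?
        return None
    return port


# Hypothetical table contents, for illustration only.
idirt = {0x0001: "Port A"}
idivt = {0x0001: frozenset({0x0100})}

ok = route(Packet(bdf=0x0100, pba=(0x0001 << 48) | 0x2000), idirt, idivt)
bad_sender = route(Packet(bdf=0x0200, pba=(0x0001 << 48) | 0x2000), idirt, idivt)
unknown_did = route(Packet(bdf=0x0100, pba=(0x0002 << 48) | 0x2000), idirt, idivt)
```

In this sketch a packet that fails either lookup is simply dropped (`None`); how a real switch reports such a failure is not specified in the text above.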
- FIG. 7 presents diagrams that schematically illustrate a system for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention. In particular, FIG. 7 illustrates a specific example of how a routing mechanism in the distributed computing system is altered to reflect a change in an association between a root complex and an IOA in the distributed computing system.
- As shown in diagram 702, the PCI Configuration Manager (PCM) first creates an Integrated DID Routing Table (IDIDRT) representing a tree indicative of the current physical configuration of the distributed computing system. The PCM creates this table by discovering the current configuration of the I/O fabric so that it will have a full view of the physical configuration of the fabric, and then creates the IDIDRT from this information. The manner in which this may be accomplished is described in detail in commonly assigned, copending U.S. patent application entitled ______, Ser. No. ______, Attorney Docket No. AUS920050367US1, filed on ______, the disclosure of which is hereby incorporated by reference. In the physical tree shown in diagram 702, it is assumed that End Point 1 (EP 1) and EP 3 are to be assigned to RC 1, and that EP 2 is to be assigned to RC 2. The PCM then creates a virtual tree from the physical tree to be presented to an administrator or agent for RC 1 as shown in diagram 704. It will be noted that this configuration is the same as the physical configuration shown in diagram 702, but is now virtual.
- The system administrator or agent for RC 1 then modifies the virtual tree by deleting EP 2 so that it cannot communicate with RC 1 as shown in diagram 706. The PCM then creates a new IDID Validation Table (IDIDVT) to reflect the modification of the virtual tree.
- The procedure illustrated in diagrams 704 and 706 is then repeated for RC 2. In particular, the PCM presents a virtual tree to the system administrator or agent for RC 2, and the system administrator or agent modifies the virtual tree by deleting EP 1 and EP 3 so that they cannot communicate with RC 2 as shown in diagram 708.
- When the above-described process has been completed for all RCs in the physical tree, the IDIDVT in the switch will be as shown in diagram 710 wherein the IDIDVT validates RC 1 to communicate with EP 1 and EP 3 and vice versa, and validates RC 2 to communicate with EP 2 and vice versa. It should be understood that although only two RCs and three EPs are included in the physical tree in FIG. 7, this is intended to be exemplary only, as the tree may include any desired number of RCs and EPs.
- FIG. 8 is a flowchart that illustrates a method for managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention. The method is generally designated by reference number 800, and begins by the PCM creating a full table of the physical configuration of the I/O fabric utilizing the mechanism described in the above-referenced commonly assigned, copending U.S. patent application entitled ______, Ser. No. ______, Attorney Docket No. AUS92005367US1, filed on ______ (Step 802). The PCM then creates an IDIDRT from the information on physical configuration to make “IDID-to-switch port” associations (Step 804). An IDID and a BDF# are then assigned to all RCs and EPs in the IDIDRT and Bus#s are assigned to all switch to switch links (Step 806).
- FIG. 9 is a flowchart that illustrates a method for assigning source and destination identifiers in connection with managing the routing of data in a distributed computing system according to an exemplary embodiment of the present invention. The method is generally designated by reference number 900 and may be implemented as Step 806 in FIG. 8.
- Referring to FIG. 9, a determination is first made whether the switch is multi-root aware (Step 902). If the switch is not multi-root aware (No output of Step 902), the method finishes with an error (Step 904) because the switch will not support multi-root configurations.
- If the switch is multi-root aware (Yes output of Step 902), the PCM begins at Port AP (AP=Active Port) of the switch, and starts with Bus#=0 (Step 906). The PCM then queries the PCIe Configuration Space of the component attached to port AP (Step 908). A determination is made whether the component is a switch (Step 910). If the component is a switch (Yes output of Step 910), a determination is made whether a Bus# has been assigned to port AP (Step 912). If a Bus# has been assigned to port AP (Yes output of Step 912), port AP is set equal to port AP−1 (Step 914), and the method returns to Step 908 to repeat the method with the next port.
- If a Bus# has not been assigned to port AP (No output of Step 912), Bus# BN is assigned to port AP and BN is then incremented (BN=BN+1) (Step 916), and Bus#s are assigned to the I/O fabric below the switch by re-entering this method for the switch attached below the current switch (Step 918). Port AP is then set equal to port AP−1 (Step 914), and the method returns to Step 908 to repeat the method with the next port.
- If the component is determined not to be a switch (No output of Step 910), a determination is made whether the component is an RC (Step 920). If the component is an RC (Yes output of Step 920), a BDF# is assigned (Step 922) and a determination is made whether the RC supports the IDID (Step 924). If the RC does support the IDID (Yes output of Step 924), the IDID is assigned to the RC (Step 926). The AP is then set to be equal to AP−1 (Step 928), and a determination is made whether the AP is greater than 0 (Step 930). If the AP is not greater than 0 (No output of Step 930), the method ends. If the AP is greater than 0 (Yes output of Step 930), the method returns to Step 908 to query the PCIe Configuration Space of the component attached to the next port.
- If the RC does not support IDID (No output of Step 924), the AP is set to AP−1 (Step 928), and the process continues as described above.
- Meanwhile, if the component is determined not to be an RC (No output of Step 920), a BDF# is assigned (Step 932), and a determination is made whether the EP supports IDID (Step 934). If the EP supports IDID (Yes output of Step 934), the IDID is assigned to each Virtual EP (Step 936). The AP is set to AP−1 (Step 928), and the process continues from there as described above.
- If the EP does not support IDID (No output of Step 934), the AP is set to AP−1 (Step 928), and the process continues as described above.
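The port walk of FIG. 9 can be approximated by the recursive Python sketch below. Several points are assumptions rather than statements from the source: the `Component` data model, the use of simple sequential counters to hand out Bus#s, BDF#s and IDIDs, the choice of the highest populated port as the starting AP, and the application of the "AP greater than 0" test after every port (the text describes that test only on the RC/EP branch).

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Component:
    kind: str                        # "switch", "rc" or "ep"
    supports_idid: bool = True
    multi_root_aware: bool = True    # only meaningful for switches
    virtual_eps: int = 1             # only meaningful for end points
    ports: dict = field(default_factory=dict)     # port number -> Component
    port_bus: dict = field(default_factory=dict)  # port number -> Bus#
    bdf: Optional[int] = None
    idids: list = field(default_factory=list)


def assign_ids(switch, bus=None, bdf=None, idid=None):
    """Walk the ports of `switch` from the highest port number down to 1."""
    if bus is None:                                   # top-level call: fresh counters
        bus, bdf, idid = count(0), count(0), count(0)
    if not switch.multi_root_aware:                   # Steps 902 / 904
        raise ValueError("switch is not multi-root aware")
    ap = max(switch.ports, default=0)                 # Step 906 (assumed start port)
    while ap > 0:                                     # Step 930
        comp = switch.ports.get(ap)                   # Step 908
        if comp is None:
            pass                                      # empty port: nothing to assign
        elif comp.kind == "switch":                   # Step 910
            if ap not in switch.port_bus:             # Step 912
                switch.port_bus[ap] = next(bus)       # Step 916
                assign_ids(comp, bus, bdf, idid)      # Step 918 (re-enter method)
        else:                                         # RC (Step 920) or EP
            comp.bdf = next(bdf)                      # Step 922 / 932
            if comp.supports_idid:                    # Step 924 / 934
                n = comp.virtual_eps if comp.kind == "ep" else 1
                comp.idids = [next(idid) for _ in range(n)]  # Step 926 / 936
        ap -= 1                                       # Step 914 / 928


# Hypothetical fabric: an RC on port 1, a two-function EP on port 2,
# and a lower switch (with one EP) on port 3.
lower_ep = Component("ep")
lower = Component("switch", ports={1: lower_ep})
rc = Component("rc")
ep = Component("ep", virtual_eps=2)
root = Component("switch", ports={1: rc, 2: ep, 3: lower})
assign_ids(root)
```

Because the counters are shared across the recursion, identifiers stay unique over the whole fabric; the actual numbering scheme used by a PCM is not given in the text.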
- Returning to FIG. 8, after an IDID and a BDF# have been assigned to all RCs and EPs in the IDIDRT, and Bus#s are assigned to all switch to switch links (Step 806), the RCN is set to the number of RCs in the fabric (Step 808), and a virtual tree is created for the RCN by copying the full physical tree (Step 810). The virtual tree is then presented to the administrator or agent for the RC (Step 812). The system administrator or agent deletes EPs from the tree (Step 814), and a similar process is repeated until the virtual tree has been fully modified as desired.
- An IDIDVT is then created on each switch showing the RC IDID# associated with the list of EP BDFs, and EP IDID# associated with the list of EP BDF#s (Step 816). The RCN is then made equal to RCN−1 (Step 818), and a determination is made whether RCN=0 (Step 820). If the RCN=0 (Yes output of Step 820), the method ends. If RCN does not equal 0 (No output of Step 820), the method returns to Step 810, and a virtual tree is created by copying the next physical tree and repeating the subsequent steps for the next virtual tree.
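The per-root-complex loop of Steps 808 through 820 can be sketched in Python using the two-RC, three-EP example of FIG. 7. This is a simplification under stated assumptions: devices are identified by name rather than by IDID# and BDF#, the administrator's edits are supplied through a hypothetical `eps_to_delete` callback, and the function name `build_validation_table` does not come from the source.

```python
def build_validation_table(rcs, eps, eps_to_delete):
    """Return {device: set of peers it is validated to communicate with}."""
    table = {}
    rcn = len(rcs)                                    # Step 808
    while rcn != 0:                                   # Step 820
        rc = rcs[rcn - 1]
        virtual_tree = set(eps)                       # Step 810: copy of the physical tree
        virtual_tree -= eps_to_delete(rc, frozenset(virtual_tree))  # Steps 812-814
        table.setdefault(rc, set()).update(virtual_tree)            # Step 816
        for ep in virtual_tree:
            table.setdefault(ep, set()).add(rc)       # "and vice versa"
        rcn -= 1                                      # Step 818
    return table


# Edits from the FIG. 7 example: RC 1 drops EP 2; RC 2 drops EP 1 and EP 3.
deletions = {"RC 1": {"EP 2"}, "RC 2": {"EP 1", "EP 3"}}
table = build_validation_table(
    rcs=["RC 1", "RC 2"],
    eps=["EP 1", "EP 2", "EP 3"],
    eps_to_delete=lambda rc, tree: deletions[rc],
)
```

The resulting table validates RC 1 with EP 1 and EP 3, and RC 2 with EP 2, in both directions, which corresponds to diagram 710.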
- The present invention thus provides a method and system for managing the routing of data in a distributed computing system, such as a distributed computing system that uses PCI Express protocol to communicate over an I/O fabric. A physical tree that is indicative of a physical configuration of the distributed computing system is determined, and a virtual tree is created from the physical tree. The virtual tree is then modified to change an association between at least one source device and at least one target device in the virtual tree. A validation mechanism validates the changed association between the at least one source device and the at least one target device to enable routing of data from the at least one source device to the at least one target device.
- The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
- Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W) and DVD.
- A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
- Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
- The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
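The physical-tree and virtual-tree flow summarized above (determine the physical configuration, derive a virtual tree, change a source-to-target association, then validate it before routing) can be illustrated with a minimal sketch. It is not the patented implementation: the node names, the dictionary layout, and the functions `reachable`, `create_virtual_tree`, `change_association`, and `validate` are all hypothetical, and the validation rule used here (a virtual association must be backed by a physical path) is an assumption chosen for illustration.

```python
from copy import deepcopy

# Hypothetical physical configuration: each node maps to the nodes directly below it.
PHYSICAL_TREE = {
    "root_complex_1": ["switch_A"],
    "root_complex_2": ["switch_A"],
    "switch_A": ["adapter_1", "adapter_2"],
    "adapter_1": [],
    "adapter_2": [],
}


def reachable(physical, source, target):
    """Return True if target can be reached from source over physical links."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(physical.get(node, []))
    return False


def create_virtual_tree(physical, hosts):
    """Start each host with every leaf node (adapter) it can physically reach."""
    adapters = [n for n, children in physical.items() if not children]
    return {h: {a for a in adapters if reachable(physical, h, a)} for h in hosts}


def change_association(virtual, source, target, allowed):
    """Return a modified copy of the virtual tree; the input is left untouched."""
    updated = deepcopy(virtual)
    if allowed:
        updated[source].add(target)
    else:
        updated[source].discard(target)
    return updated


def validate(physical, virtual):
    """Assumed rule: every virtual association needs a physical path behind it."""
    return all(
        reachable(physical, source, target)
        for source, targets in virtual.items()
        for target in targets
    )


hosts = ["root_complex_1", "root_complex_2"]
virtual = create_virtual_tree(PHYSICAL_TREE, hosts)
# Remove adapter_1 from the second host's view, then validate before routing.
virtual = change_association(virtual, "root_complex_2", "adapter_1", allowed=False)
routing_enabled = validate(PHYSICAL_TREE, virtual)
```

Keeping the virtual tree as a separate copy means an association can be changed and checked without touching the record of the physical configuration; a change that points at a device with no physical path fails validation, so routing would not be enabled for it.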
Claims (20)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/334,678 US20070165596A1 (en) | 2006-01-18 | 2006-01-18 | Creation and management of routing table for PCI bus address based routing with integrated DID |
JP2006348980A JP2007195166A (en) | 2006-01-18 | 2006-12-26 | Method, computer program, and apparatus for creation and management of routing table for pci bus address based routing with integrated did |
TW096101491A TW200736924A (en) | 2006-01-18 | 2007-01-15 | Creation and management of routing table for PCI bus address based routing with integrated DID |
CN2007100019686A CN101013989B (en) | 2006-01-18 | 2007-01-17 | Method and apparatus for pci bus address based routing with integrated did |
US12/134,952 US7907604B2 (en) | 2006-01-18 | 2008-06-06 | Creation and management of routing table for PCI bus address based routing with integrated DID |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/334,678 US20070165596A1 (en) | 2006-01-18 | 2006-01-18 | Creation and management of routing table for PCI bus address based routing with integrated DID |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/134,952 Continuation US7907604B2 (en) | 2006-01-18 | 2008-06-06 | Creation and management of routing table for PCI bus address based routing with integrated DID |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070165596A1 (en) | 2007-07-19 |
Family
ID=38263071
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/334,678 Abandoned US20070165596A1 (en) | 2006-01-18 | 2006-01-18 | Creation and management of routing table for PCI bus address based routing with integrated DID |
US12/134,952 Expired - Fee Related US7907604B2 (en) | 2006-01-18 | 2008-06-06 | Creation and management of routing table for PCI bus address based routing with integrated DID |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/134,952 Expired - Fee Related US7907604B2 (en) | 2006-01-18 | 2008-06-06 | Creation and management of routing table for PCI bus address based routing with integrated DID |
Country Status (4)
Country | Link |
---|---|
US (2) | US20070165596A1 (en) |
JP (1) | JP2007195166A (en) |
CN (1) | CN101013989B (en) |
TW (1) | TW200736924A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080140839A1 (en) * | 2005-10-27 | 2008-06-12 | Boyd William T | Creation and management of destination id routing structures in multi-host pci topologies |
US20080235431A1 (en) * | 2005-10-27 | 2008-09-25 | International Business Machines Corporation | Method Using a Master Node to Control I/O Fabric Configuration in a Multi-Host Environment |
US20080235785A1 (en) * | 2006-02-07 | 2008-09-25 | International Business Machines Corporation | Method, Apparatus, and Computer Program Product for Routing Packets Utilizing a Unique Identifier, Included within a Standard Address, that Identifies the Destination Host Computer System |
US20080307116A1 (en) * | 2005-10-27 | 2008-12-11 | International Business Machines Corporation | Routing Mechanism in PCI Multi-Host Topologies Using Destination ID Field |
US20090100204A1 (en) * | 2006-02-09 | 2009-04-16 | International Business Machines Corporation | Method, Apparatus, and Computer Usable Program Code for Migrating Virtual Adapters from Source Physical Adapters to Destination Physical Adapters |
US20100036995A1 (en) * | 2008-08-05 | 2010-02-11 | Hitachi, Ltd. | Computer system and bus assignment method |
US20100082874A1 (en) * | 2008-09-29 | 2010-04-01 | Hitachi, Ltd. | Computer system and method for sharing pci devices thereof |
US20100211717A1 (en) * | 2009-02-19 | 2010-08-19 | Hitachi, Ltd. | Computer system, method of managing pci switch, and management server |
US20100312943A1 (en) * | 2009-06-04 | 2010-12-09 | Hitachi, Ltd. | Computer system managing i/o path and port |
US7930598B2 (en) | 2005-07-28 | 2011-04-19 | International Business Machines Corporation | Broadcast of shared I/O fabric error messages in a multi-host environment to all affected root nodes |
EP2568389A3 (en) * | 2011-09-07 | 2013-04-17 | Apple Inc. | Coherence switch for I/O traffic |
US20140310430A1 (en) * | 2013-04-10 | 2014-10-16 | Marvell World Trade Ltd. | Tunneling Transaction Packets |
US9110865B2 (en) | 2011-07-01 | 2015-08-18 | International Business Machines Corporation | Virtual machine dynamic routing |
US20160140073A1 (en) * | 2014-11-13 | 2016-05-19 | Cavium, Inc. | Programmable validation of transaction requests |
US11604742B2 (en) | 2019-06-25 | 2023-03-14 | Huawei Technologies Co., Ltd. | Independent central processing unit (CPU) networking using an intermediate device |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007105373A1 (en) * | 2006-03-10 | 2007-09-20 | Sony Corporation | Bridge, information processing system, and access control method |
US8141094B2 (en) * | 2007-12-03 | 2012-03-20 | International Business Machines Corporation | Distribution of resources for I/O virtualized (IOV) adapters and management of the adapters through an IOV management partition via user selection of compatible virtual functions |
US7752346B2 (en) * | 2007-12-21 | 2010-07-06 | Aprius, Inc. | Universal routing in PCI-Express fabrics |
JP5154238B2 (en) * | 2008-01-18 | 2013-02-27 | 株式会社日立製作所 | Management method of composite computer system and composite computer system |
JP5074274B2 (en) * | 2008-04-16 | 2012-11-14 | 株式会社日立製作所 | Computer system and communication path monitoring method |
JP5280135B2 (en) * | 2008-09-01 | 2013-09-04 | 株式会社日立製作所 | Data transfer device |
US20110047313A1 (en) * | 2008-10-23 | 2011-02-24 | Joseph Hui | Memory area network for extended computer systems |
JPWO2010084529A1 (en) * | 2009-01-23 | 2012-07-12 | 株式会社日立製作所 | Information processing system |
JP5069732B2 (en) | 2009-10-05 | 2012-11-07 | 株式会社日立製作所 | Computer device, computer system, adapter succession method |
US8775579B2 (en) * | 2010-01-13 | 2014-07-08 | Htc Corporation | Method for addressing management object in management tree and associated device management system |
US9407577B2 (en) | 2011-03-23 | 2016-08-02 | Nec Corporation | Communication control system, switch node and communication control method |
US8949474B1 (en) * | 2011-11-21 | 2015-02-03 | Marvell International Ltd. | Method for inter-chip and intra-chip addressing using port identifiers and address mapping |
US8793539B2 (en) | 2012-06-13 | 2014-07-29 | International Business Machines Corporation | External settings that reconfigure the error handling behavior of a distributed PCIe switch |
US10447575B1 (en) * | 2012-12-27 | 2019-10-15 | Sitting Man, Llc | Routing methods, systems, and computer program products |
US9858228B2 (en) * | 2015-08-10 | 2018-01-02 | Futurewei Technologies, Inc. | Dynamic assignment of groups of resources in a peripheral component interconnect express network |
US11018947B2 (en) | 2016-01-27 | 2021-05-25 | Oracle International Corporation | System and method for supporting on-demand setup of local host channel adapter port partition membership in a high-performance computing environment |
US10469621B2 (en) | 2016-01-27 | 2019-11-05 | Oracle International Corporation | System and method of host-side configuration of a host channel adapter (HCA) in a high-performance computing environment |
US10972375B2 (en) | 2016-01-27 | 2021-04-06 | Oracle International Corporation | System and method of reserving a specific queue pair number for proprietary management traffic in a high-performance computing environment |
FR3076142A1 (en) * | 2017-12-21 | 2019-06-28 | Bull Sas | METHOD AND SERVER OF TOPOLOGICAL ADDRESS ALLOCATION TO NETWORK SWITCHES, COMPUTER PROGRAM AND CLUSTER OF CORRESPONDING SERVERS |
CN112540941B (en) * | 2019-09-21 | 2024-09-20 | 华为技术有限公司 | Data forwarding chip and server |
US20210255973A1 (en) * | 2020-12-17 | 2021-08-19 | Intel Corporation | Stream routing and ide enhancements for pcie |
US20220292026A1 (en) * | 2021-03-12 | 2022-09-15 | Micron Technology, Inc. | Virtual addresses for a memory system |
US11899593B2 (en) * | 2021-12-21 | 2024-02-13 | Intel Corporation | Method and apparatus for detecting ATS-based DMA attack |
Citations (54)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US719538A (en) * | 1902-04-09 | 1903-02-03 | William C Gaston | Buggy-top attachment. |
US5257353A (en) * | 1986-07-18 | 1993-10-26 | Intel Corporation | I/O control system having a plurality of access enabling bits for controlling access to selective parts of an I/O device |
US5367695A (en) * | 1991-09-27 | 1994-11-22 | Sun Microsystems, Inc. | Bus-to-bus interface for preventing data incoherence in a multiple processor computer system |
US5392328A (en) * | 1993-02-04 | 1995-02-21 | Bell Communications Research, Inc. | System and method for automatically detecting root causes of switching connection failures in a telephone network |
US5960213A (en) * | 1995-12-18 | 1999-09-28 | 3D Labs Inc. Ltd | Dynamically reconfigurable multi-function PCI adapter device |
US5968189A (en) * | 1997-04-08 | 1999-10-19 | International Business Machines Corporation | System of reporting errors by a hardware element of a distributed computer system |
US6061753A (en) * | 1998-01-27 | 2000-05-09 | Emc Corporation | Apparatus and method of accessing target devices across a bus utilizing initiator identifiers |
US20020161937A1 (en) * | 2001-04-30 | 2002-10-31 | Odenwald Louis H. | System and method employing a dynamic logical identifier |
US20020188701A1 (en) * | 2001-06-12 | 2002-12-12 | International Business Machines Corporation | Apparatus and method for managing configuration of computer systems on a computer network |
US20030221030A1 (en) * | 2002-05-24 | 2003-11-27 | Timothy A. Pontius | Access control bus system |
US6662251B2 (en) * | 2001-03-26 | 2003-12-09 | International Business Machines Corporation | Selective targeting of transactions to devices on a shared bus |
US20040039986A1 (en) * | 2002-08-23 | 2004-02-26 | Solomon Gary A. | Store and forward switch device, system and method |
US20040123014A1 (en) * | 2002-12-19 | 2004-06-24 | Intel Corporation | System and method for communicating over intra-hierarchy and inter-hierarchy links |
US6769021B1 (en) * | 1999-09-15 | 2004-07-27 | Adaptec, Inc. | Methods for partitioning end nodes in a network fabric |
US6775750B2 (en) * | 2001-06-29 | 2004-08-10 | Texas Instruments Incorporated | System protection map |
US20040172494A1 (en) * | 2003-01-21 | 2004-09-02 | Nextio Inc. | Method and apparatus for shared I/O in a load/store fabric |
US20040179534A1 (en) * | 2003-01-21 | 2004-09-16 | Nextio Inc. | Method and apparatus for shared I/O in a load/store fabric |
US20040210754A1 (en) * | 2003-04-16 | 2004-10-21 | Barron Dwight L. | Shared security transform device, system and methods |
US20040230709A1 (en) * | 2003-05-15 | 2004-11-18 | Moll Laurent R. | Peripheral bus transaction routing using primary and node ID routing information |
US20040230735A1 (en) * | 2003-05-15 | 2004-11-18 | Moll Laurent R. | Peripheral bus switch having virtual peripheral bus and configurable host bridge |
US20050025119A1 (en) * | 2003-01-21 | 2005-02-03 | Nextio Inc. | Switching apparatus and method for providing shared I/O within a load-store fabric |
US20050044301A1 (en) * | 2003-08-20 | 2005-02-24 | Vasilevsky Alexander David | Method and apparatus for providing virtual computing services |
US20050102682A1 (en) * | 2003-11-12 | 2005-05-12 | Intel Corporation | Method, system, and program for interfacing with a network adaptor supporting a plurality of devices |
US6907510B2 (en) * | 2002-04-01 | 2005-06-14 | Intel Corporation | Mapping of interconnect configuration space |
US20050147117A1 (en) * | 2003-01-21 | 2005-07-07 | Nextio Inc. | Apparatus and method for port polarity initialization in a shared I/O device |
US20050228531A1 (en) * | 2004-03-31 | 2005-10-13 | Genovker Victoria V | Advanced switching fabric discovery protocol |
US20050270988A1 (en) * | 2004-06-04 | 2005-12-08 | Dehaemer Eric | Mechanism of dynamic upstream port selection in a PCI express switch |
US7036122B2 (en) * | 2002-04-01 | 2006-04-25 | Intel Corporation | Device virtualization and assignment of interconnect devices |
US20060179195A1 (en) * | 2005-02-03 | 2006-08-10 | International Business Machines Corporation | Method and apparatus for restricting input/output device peer-to-peer operations in a data processing system to improve reliability, availability, and serviceability |
US20060184711A1 (en) * | 2003-01-21 | 2006-08-17 | Nextio Inc. | Switching apparatus and method for providing shared i/o within a load-store fabric |
US20060195617A1 (en) * | 2005-02-25 | 2006-08-31 | International Business Machines Corporation | Method and system for native virtualization on a partially trusted adapter using adapter bus, device and function number for identification |
US20060206655A1 (en) * | 2004-12-10 | 2006-09-14 | Chappell Christopher L | Packet processing in switched fabric networks |
US20060206936A1 (en) * | 2005-03-11 | 2006-09-14 | Yung-Chang Liang | Method and apparatus for securing a computer network |
US20060212620A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | System and method for virtual adapter resource allocation |
US20060212608A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | System, method, and computer program product for a fully trusted adapter validation of incoming memory mapped I/O operations on a physical adapter that supports virtual adapters or virtual resources |
US20060212870A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | Association of memory access through protection attributes that are associated to an access control level on a PCI adapter that supports virtualization |
US20060230181A1 (en) * | 2005-03-11 | 2006-10-12 | Riley Dwight D | System and method for multi-host sharing of a single-host device |
US20060242354A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Flexible routing and addressing |
US20060239287A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Adding packet routing information without ECRC recalculation |
US20060242352A1 (en) * | 2005-04-22 | 2006-10-26 | Ola Torudbakken | Device sharing |
US20060242333A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Scalable routing and addressing |
US7134052B2 (en) * | 2003-05-15 | 2006-11-07 | International Business Machines Corporation | Autonomic recovery from hardware errors in an input/output fabric |
US20060253619A1 (en) * | 2005-04-22 | 2006-11-09 | Ola Torudbakken | Virtualization for device sharing |
US20070019637A1 (en) * | 2005-07-07 | 2007-01-25 | Boyd William T | Mechanism to virtualize all address spaces in shared I/O fabrics |
US20070027952A1 (en) * | 2005-07-28 | 2007-02-01 | Boyd William T | Broadcast of shared I/O fabric error messages in a multi-host environment to all affected root nodes |
US7188209B2 (en) * | 2003-04-18 | 2007-03-06 | Nextio, Inc. | Apparatus and method for sharing I/O endpoints within a load store fabric by encapsulation of domain information in transaction layer packets |
US20070073960A1 (en) * | 2005-03-24 | 2007-03-29 | Fujitsu Limited | PCI-Express communications system |
US20070101016A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method for confirming identity of a master node selected to control I/O fabric configuration in a multi-host environment |
US20070097948A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Creation and management of destination ID routing structures in multi-host PCI topologies |
US20070097950A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Routing mechanism in PCI multi-host topologies using destination ID field |
US20070097949A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method using a master node to control I/O fabric configuration in a multi-host environment |
US20070097871A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method of routing I/O adapter error messages in a multi-host environment |
US20070136458A1 (en) * | 2005-12-12 | 2007-06-14 | Boyd William T | Creation and management of ATPT in switches of multi-host PCI topologies |
US7457897B1 (en) * | 2004-03-17 | 2008-11-25 | Super Talent Electronics, Inc. | PCI express-compatible controller and interface for flash memory |
Family Cites Families (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3454294B2 (en) * | 1994-06-20 | 2003-10-06 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Multiple bus information processing system and bridge circuit |
US6901451B1 (en) * | 2000-10-31 | 2005-05-31 | Fujitsu Limited | PCI bridge over network |
US6611883B1 (en) * | 2000-11-16 | 2003-08-26 | Sun Microsystems, Inc. | Method and apparatus for implementing PCI DMA speculative prefetching in a message passing queue oriented bus system |
US7401126B2 (en) * | 2001-03-23 | 2008-07-15 | Neteffect, Inc. | Transaction switch and network interface adapter incorporating same |
US7363389B2 (en) * | 2001-03-29 | 2008-04-22 | Intel Corporation | Apparatus and method for enhanced channel adapter performance through implementation of a completion queue engine and address translation engine |
US20040025166A1 (en) * | 2002-02-02 | 2004-02-05 | International Business Machines Corporation | Server computer and a method for accessing resources from virtual machines of a server computer via a fibre channel |
US7194538B1 (en) | 2002-06-04 | 2007-03-20 | Veritas Operating Corporation | Storage area network (SAN) management system for discovering SAN components using a SAN management server |
JP2004258985A (en) * | 2003-02-26 | 2004-09-16 | Nec Corp | Multiprocessor system and its input/output control method |
US7058738B2 (en) * | 2004-04-28 | 2006-06-06 | Microsoft Corporation | Configurable PCI express switch which allows multiple CPUs to be connected to multiple I/O devices |
US20060174094A1 (en) * | 2005-02-02 | 2006-08-03 | Bryan Lloyd | Systems and methods for providing complementary operands to an ALU |
US20060179265A1 (en) * | 2005-02-08 | 2006-08-10 | Flood Rachel M | Systems and methods for executing x-form instructions |
US7360058B2 (en) * | 2005-02-09 | 2008-04-15 | International Business Machines Corporation | System and method for generating effective address |
US7350029B2 (en) * | 2005-02-10 | 2008-03-25 | International Business Machines Corporation | Data stream prefetching in a microprocessor |
US7380066B2 (en) * | 2005-02-10 | 2008-05-27 | International Business Machines Corporation | Store stream prefetching in a microprocessor |
US7395414B2 (en) * | 2005-02-11 | 2008-07-01 | International Business Machines Corporation | Dynamic recalculation of resource vector at issue queue for steering of dependent instructions |
US7631308B2 (en) * | 2005-02-11 | 2009-12-08 | International Business Machines Corporation | Thread priority method for ensuring processing fairness in simultaneous multi-threading microprocessors |
US7254697B2 (en) * | 2005-02-11 | 2007-08-07 | International Business Machines Corporation | Method and apparatus for dynamic modification of microprocessor instruction group at dispatch |
US20060184769A1 (en) * | 2005-02-11 | 2006-08-17 | International Business Machines Corporation | Localized generation of global flush requests while guaranteeing forward progress of a processor |
US20060184770A1 (en) * | 2005-02-12 | 2006-08-17 | International Business Machines Corporation | Method of implementing precise, localized hardware-error workarounds under centralized control |
US7694047B1 (en) * | 2005-02-17 | 2010-04-06 | Qlogic, Corporation | Method and system for sharing input/output devices |
US7493425B2 (en) * | 2005-02-25 | 2009-02-17 | International Business Machines Corporation | Method, system and program product for differentiating between virtual hosts on bus transactions and associating allowable memory access for an input/output adapter that supports virtualization |
US20060195848A1 (en) * | 2005-02-25 | 2006-08-31 | International Business Machines Corporation | System and method of virtual resource modification on a physical adapter that supports virtual resources |
US20060195663A1 (en) * | 2005-02-25 | 2006-08-31 | International Business Machines Corporation | Virtualized I/O adapter for a multi-processor data processing system |
US7260664B2 (en) * | 2005-02-25 | 2007-08-21 | International Business Machines Corporation | Interrupt mechanism on an IO adapter that supports virtualization |
US7480742B2 (en) * | 2005-02-25 | 2009-01-20 | International Business Machines Corporation | Method for virtual adapter destruction on a physical adapter that supports virtual adapters |
US7543084B2 (en) * | 2005-02-25 | 2009-06-02 | International Business Machines Corporation | Method for destroying virtual resources in a logically partitioned data processing system |
US7870301B2 (en) * | 2005-02-25 | 2011-01-11 | International Business Machines Corporation | System and method for modification of virtual adapter resources in a logically partitioned data processing system |
US7398337B2 (en) * | 2005-02-25 | 2008-07-08 | International Business Machines Corporation | Association of host translations that are associated to an access control level on a PCI bridge that supports virtualization |
US7685335B2 (en) * | 2005-02-25 | 2010-03-23 | International Business Machines Corporation | Virtualized fibre channel adapter for a multi-processor data processing system |
US7496790B2 (en) * | 2005-02-25 | 2009-02-24 | International Business Machines Corporation | Method, apparatus, and computer program product for coordinating error reporting and reset utilizing an I/O adapter that supports virtualization |
US7409589B2 (en) * | 2005-05-27 | 2008-08-05 | International Business Machines Corporation | Method and apparatus for reducing number of cycles required to checkpoint instructions in a multi-threaded processor |
US7707465B2 (en) * | 2006-01-26 | 2010-04-27 | International Business Machines Corporation | Routing of shared I/O fabric error messages in a multi-host environment to a master control root node |
US7380046B2 (en) * | 2006-02-07 | 2008-05-27 | International Business Machines Corporation | Method, apparatus, and computer program product for routing packets utilizing a unique identifier, included within a standard address, that identifies the destination host computer system |
US7484029B2 (en) * | 2006-02-09 | 2009-01-27 | International Business Machines Corporation | Method, apparatus, and computer usable program code for migrating virtual adapters from source physical adapters to destination physical adapters |
2006
- 2006-01-18 US US11/334,678 (US20070165596A1): Abandoned
- 2006-12-26 JP JP2006348980A (JP2007195166A): Pending
2007
- 2007-01-15 TW TW096101491A (TW200736924A): Unknown
- 2007-01-17 CN CN2007100019686A (CN101013989B): Expired - Fee Related
2008
- 2008-06-06 US US12/134,952 (US7907604B2): Expired - Fee Related
Patent Citations (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US719538A (en) * | 1902-04-09 | 1903-02-03 | William C Gaston | Buggy-top attachment. |
US5257353A (en) * | 1986-07-18 | 1993-10-26 | Intel Corporation | I/O control system having a plurality of access enabling bits for controlling access to selective parts of an I/O device |
US5367695A (en) * | 1991-09-27 | 1994-11-22 | Sun Microsystems, Inc. | Bus-to-bus interface for preventing data incoherence in a multiple processor computer system |
US5392328A (en) * | 1993-02-04 | 1995-02-21 | Bell Communications Research, Inc. | System and method for automatically detecting root causes of switching connection failures in a telephone network |
US5960213A (en) * | 1995-12-18 | 1999-09-28 | 3D Labs Inc. Ltd | Dynamically reconfigurable multi-function PCI adapter device |
US5968189A (en) * | 1997-04-08 | 1999-10-19 | International Business Machines Corporation | System of reporting errors by a hardware element of a distributed computer system |
US6061753A (en) * | 1998-01-27 | 2000-05-09 | Emc Corporation | Apparatus and method of accessing target devices across a bus utilizing initiator identifiers |
US6769021B1 (en) * | 1999-09-15 | 2004-07-27 | Adaptec, Inc. | Methods for partitioning end nodes in a network fabric |
US6662251B2 (en) * | 2001-03-26 | 2003-12-09 | International Business Machines Corporation | Selective targeting of transactions to devices on a shared bus |
US20020161937A1 (en) * | 2001-04-30 | 2002-10-31 | Odenwald Louis H. | System and method employing a dynamic logical identifier |
US20020188701A1 (en) * | 2001-06-12 | 2002-12-12 | International Business Machines Corporation | Apparatus and method for managing configuration of computer systems on a computer network |
US20060168361A1 (en) * | 2001-06-12 | 2006-07-27 | International Business Machines Corporation | Apparatus and method for managing configuration of computer systems on a computer network |
US20050188116A1 (en) * | 2001-06-12 | 2005-08-25 | International Business Machines Corporation | Apparatus and method for managing configuration of computer systems on a computer network |
US6775750B2 (en) * | 2001-06-29 | 2004-08-10 | Texas Instruments Incorporated | System protection map |
US6907510B2 (en) * | 2002-04-01 | 2005-06-14 | Intel Corporation | Mapping of interconnect configuration space |
US7036122B2 (en) * | 2002-04-01 | 2006-04-25 | Intel Corporation | Device virtualization and assignment of interconnect devices |
US20030221030A1 (en) * | 2002-05-24 | 2003-11-27 | Timothy A. Pontius | Access control bus system |
US20040039986A1 (en) * | 2002-08-23 | 2004-02-26 | Solomon Gary A. | Store and forward switch device, system and method |
US20040123014A1 (en) * | 2002-12-19 | 2004-06-24 | Intel Corporation | System and method for communicating over intra-hierarchy and inter-hierarchy links |
US20060184711A1 (en) * | 2003-01-21 | 2006-08-17 | Nextio Inc. | Switching apparatus and method for providing shared i/o within a load-store fabric |
US7174413B2 (en) * | 2003-01-21 | 2007-02-06 | Nextio Inc. | Switching apparatus and method for providing shared I/O within a load-store fabric |
US20050025119A1 (en) * | 2003-01-21 | 2005-02-03 | Nextio Inc. | Switching apparatus and method for providing shared I/O within a load-store fabric |
US20050147117A1 (en) * | 2003-01-21 | 2005-07-07 | Nextio Inc. | Apparatus and method for port polarity initialization in a shared I/O device |
US20040172494A1 (en) * | 2003-01-21 | 2004-09-02 | Nextio Inc. | Method and apparatus for shared I/O in a load/store fabric |
US20040179534A1 (en) * | 2003-01-21 | 2004-09-16 | Nextio Inc. | Method and apparatus for shared I/O in a load/store fabric |
US20040210754A1 (en) * | 2003-04-16 | 2004-10-21 | Barron Dwight L. | Shared security transform device, system and methods |
US7188209B2 (en) * | 2003-04-18 | 2007-03-06 | Nextio, Inc. | Apparatus and method for sharing I/O endpoints within a load store fabric by encapsulation of domain information in transaction layer packets |
US20060230217A1 (en) * | 2003-05-15 | 2006-10-12 | Moll Laurent R | Peripheral bus switch having virtual peripheral bus and configurable host bridge |
US20040230735A1 (en) * | 2003-05-15 | 2004-11-18 | Moll Laurent R. | Peripheral bus switch having virtual peripheral bus and configurable host bridge |
US7134052B2 (en) * | 2003-05-15 | 2006-11-07 | International Business Machines Corporation | Autonomic recovery from hardware errors in an input/output fabric |
US7380018B2 (en) * | 2003-05-15 | 2008-05-27 | Broadcom Corporation | Peripheral bus transaction routing using primary and node ID routing information |
US20040230709A1 (en) * | 2003-05-15 | 2004-11-18 | Moll Laurent R. | Peripheral bus transaction routing using primary and node ID routing information |
US7096305B2 (en) * | 2003-05-15 | 2006-08-22 | Broadcom Corporation | Peripheral bus switch having virtual peripheral bus and configurable host bridge |
US20050044301A1 (en) * | 2003-08-20 | 2005-02-24 | Vasilevsky Alexander David | Method and apparatus for providing virtual computing services |
US20050102682A1 (en) * | 2003-11-12 | 2005-05-12 | Intel Corporation | Method, system, and program for interfacing with a network adaptor supporting a plurality of devices |
US7457897B1 (en) * | 2004-03-17 | 2008-11-25 | Super Talent Electronics, Inc. | PCI express-compatible controller and interface for flash memory |
US20050228531A1 (en) * | 2004-03-31 | 2005-10-13 | Genovker Victoria V | Advanced switching fabric discovery protocol |
US20050270988A1 (en) * | 2004-06-04 | 2005-12-08 | Dehaemer Eric | Mechanism of dynamic upstream port selection in a PCI express switch |
US20060206655A1 (en) * | 2004-12-10 | 2006-09-14 | Chappell Christopher L | Packet processing in switched fabric networks |
US20060179195A1 (en) * | 2005-02-03 | 2006-08-10 | International Business Machines Corporation | Method and apparatus for restricting input/output device peer-to-peer operations in a data processing system to improve reliability, availability, and serviceability |
US20060212620A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | System and method for virtual adapter resource allocation |
US20060212870A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | Association of memory access through protection attributes that are associated to an access control level on a PCI adapter that supports virtualization |
US20060212608A1 (en) * | 2005-02-25 | 2006-09-21 | International Business Machines Corporation | System, method, and computer program product for a fully trusted adapter validation of incoming memory mapped I/O operations on a physical adapter that supports virtual adapters or virtual resources |
US20060195617A1 (en) * | 2005-02-25 | 2006-08-31 | International Business Machines Corporation | Method and system for native virtualization on a partially trusted adapter using adapter bus, device and function number for identification |
US20060230181A1 (en) * | 2005-03-11 | 2006-10-12 | Riley Dwight D | System and method for multi-host sharing of a single-host device |
US20060206936A1 (en) * | 2005-03-11 | 2006-09-14 | Yung-Chang Liang | Method and apparatus for securing a computer network |
US20070073960A1 (en) * | 2005-03-24 | 2007-03-29 | Fujitsu Limited | PCI-Express communications system |
US20060242333A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Scalable routing and addressing |
US20060253619A1 (en) * | 2005-04-22 | 2006-11-09 | Ola Torudbakken | Virtualization for device sharing |
US20060242352A1 (en) * | 2005-04-22 | 2006-10-26 | Ola Torudbakken | Device sharing |
US20060239287A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Adding packet routing information without ECRC recalculation |
US20060242354A1 (en) * | 2005-04-22 | 2006-10-26 | Johnsen Bjorn D | Flexible routing and addressing |
US20070019637A1 (en) * | 2005-07-07 | 2007-01-25 | Boyd William T | Mechanism to virtualize all address spaces in shared I/O fabrics |
US20070027952A1 (en) * | 2005-07-28 | 2007-02-01 | Boyd William T | Broadcast of shared I/O fabric error messages in a multi-host environment to all affected root nodes |
US20070101016A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method for confirming identity of a master node selected to control I/O fabric configuration in a multi-host environment |
US20070097948A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Creation and management of destination ID routing structures in multi-host PCI topologies |
US20070097950A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Routing mechanism in PCI multi-host topologies using destination ID field |
US20070097949A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method using a master node to control I/O fabric configuration in a multi-host environment |
US20070097871A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Method of routing I/O adapter error messages in a multi-host environment |
US20070136458A1 (en) * | 2005-12-12 | 2007-06-14 | Boyd William T | Creation and management of ATPT in switches of multi-host PCI topologies |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7930598B2 (en) | 2005-07-28 | 2011-04-19 | International Business Machines Corporation | Broadcast of shared I/O fabric error messages in a multi-host environment to all affected root nodes |
US7549003B2 (en) | 2005-10-27 | 2009-06-16 | International Business Machines Corporation | Creation and management of destination ID routing structures in multi-host PCI topologies |
US20080307116A1 (en) * | 2005-10-27 | 2008-12-11 | International Business Machines Corporation | Routing Mechanism in PCI Multi-Host Topologies Using Destination ID Field |
US7506094B2 (en) | 2005-10-27 | 2009-03-17 | International Business Machines Corporation | Method using a master node to control I/O fabric configuration in a multi-host environment |
US20080140839A1 (en) * | 2005-10-27 | 2008-06-12 | Boyd William T | Creation and management of destination id routing structures in multi-host pci topologies |
US20080235431A1 (en) * | 2005-10-27 | 2008-09-25 | International Business Machines Corporation | Method Using a Master Node to Control I/O Fabric Configuration in a Multi-Host Environment |
US20080235785A1 (en) * | 2006-02-07 | 2008-09-25 | International Business Machines Corporation | Method, Apparatus, and Computer Program Product for Routing Packets Utilizing a Unique Identifier, Included within a Standard Address, that Identifies the Destination Host Computer System |
US7831759B2 (en) | 2006-02-07 | 2010-11-09 | International Business Machines Corporation | Method, apparatus, and computer program product for routing packets utilizing a unique identifier, included within a standard address, that identifies the destination host computer system |
US20090100204A1 (en) * | 2006-02-09 | 2009-04-16 | International Business Machines Corporation | Method, Apparatus, and Computer Usable Program Code for Migrating Virtual Adapters from Source Physical Adapters to Destination Physical Adapters |
US7937518B2 (en) | 2006-02-09 | 2011-05-03 | International Business Machines Corporation | Method, apparatus, and computer usable program code for migrating virtual adapters from source physical adapters to destination physical adapters |
US20100036995A1 (en) * | 2008-08-05 | 2010-02-11 | Hitachi, Ltd. | Computer system and bus assignment method |
US8683109B2 (en) | 2008-08-05 | 2014-03-25 | Hitachi, Ltd. | Computer system and bus assignment method |
US8352665B2 (en) | 2008-08-05 | 2013-01-08 | Hitachi, Ltd. | Computer system and bus assignment method |
US8341327B2 (en) | 2008-09-29 | 2012-12-25 | Hitachi, Ltd. | Computer system and method for sharing PCI devices thereof |
US8725926B2 (en) | 2008-09-29 | 2014-05-13 | Hitachi, Ltd. | Computer system and method for sharing PCI devices thereof |
US20100082874A1 (en) * | 2008-09-29 | 2010-04-01 | Hitachi, Ltd. | Computer system and method for sharing pci devices thereof |
US20100211717A1 (en) * | 2009-02-19 | 2010-08-19 | Hitachi, Ltd. | Computer system, method of managing pci switch, and management server |
US8533381B2 (en) | 2009-02-19 | 2013-09-10 | Hitachi, Ltd. | Computer system, method of managing PCI switch, and management server |
US8407391B2 (en) | 2009-06-04 | 2013-03-26 | Hitachi, Ltd. | Computer system managing I/O path and port |
US20100312943A1 (en) * | 2009-06-04 | 2010-12-09 | Hitachi, Ltd. | Computer system managing i/o path and port |
US9110865B2 (en) | 2011-07-01 | 2015-08-18 | International Business Machines Corporation | Virtual machine dynamic routing |
KR101405751B1 (en) * | 2011-09-07 | 2014-06-10 | Apple Inc. | Coherence switch for i/o traffic |
EP2568389A3 (en) * | 2011-09-07 | 2013-04-17 | Apple Inc. | Coherence switch for I/O traffic |
US9176913B2 (en) | 2011-09-07 | 2015-11-03 | Apple Inc. | Coherence switch for I/O traffic |
US20140310430A1 (en) * | 2013-04-10 | 2014-10-16 | Marvell World Trade Ltd. | Tunneling Transaction Packets |
US9116836B2 (en) * | 2013-04-10 | 2015-08-25 | Marvell World Trade Ltd. | Tunneling transaction packets |
US20160140073A1 (en) * | 2014-11-13 | 2016-05-19 | Cavium, Inc. | Programmable validation of transaction requests |
US10013385B2 (en) * | 2014-11-13 | 2018-07-03 | Cavium, Inc. | Programmable validation of transaction requests |
US11604742B2 (en) | 2019-06-25 | 2023-03-14 | Huawei Technologies Co., Ltd. | Independent central processing unit (CPU) networking using an intermediate device |
Also Published As
Publication number | Publication date |
---|---|
CN101013989B (en) | 2011-01-12 |
TW200736924A (en) | 2007-10-01 |
US7907604B2 (en) | 2011-03-15 |
CN101013989A (en) | 2007-08-08 |
US20080235430A1 (en) | 2008-09-25 |
JP2007195166A (en) | 2007-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7907604B2 (en) | Creation and management of routing table for PCI bus address based routing with integrated DID | |
US7549003B2 (en) | Creation and management of destination ID routing structures in multi-host PCI topologies | |
US7506094B2 (en) | Method using a master node to control I/O fabric configuration in a multi-host environment | |
US7930598B2 (en) | Broadcast of shared I/O fabric error messages in a multi-host environment to all affected root nodes | |
US7430630B2 (en) | Routing mechanism in PCI multi-host topologies using destination ID field | |
US20070136458A1 (en) | Creation and management of ATPT in switches of multi-host PCI topologies | |
US7831759B2 (en) | Method, apparatus, and computer program product for routing packets utilizing a unique identifier, included within a standard address, that identifies the destination host computer system | |
US7889667B2 (en) | Method of routing I/O adapter error messages in a multi-host environment | |
US7571273B2 (en) | Bus/device/function translation within and routing of communications packets in a PCI switched-fabric in a multi-host environment utilizing multiple root switches | |
US7707465B2 (en) | Routing of shared I/O fabric error messages in a multi-host environment to a master control root node | |
US8346997B2 (en) | Use of peripheral component interconnect input/output virtualization devices to create redundant configurations | |
US7492723B2 (en) | Mechanism to virtualize all address spaces in shared I/O fabrics | |
US7631050B2 (en) | Method for confirming identity of a master node selected to control I/O fabric configuration in a multi-host environment | |
US20080137676A1 (en) | Bus/device/function translation within and routing of communications packets in a pci switched-fabric in a multi-host environment environment utilizing a root switch | |
US8225005B2 (en) | Use of peripheral component interconnect input/output virtualization devices to create high-speed, low-latency interconnect | |
US7685321B2 (en) | Native virtualization on a partially trusted adapter using PCI host bus, device, and function number for identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: MACHINES CORPORATION, INTERNATIONAL BUSINESS, NEW Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BOYD, WILLIAM T.;FREIMUTH, DOUGLAS M.;HOLLAND, WILLIAM G.;AND OTHERS;REEL/FRAME:017271/0260;SIGNING DATES FROM 20051122 TO 20051213 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE NAME TO READ: INTERNATIONAL BUSINESS MACHINES CORPORATION PREVIOUSLY RECORDED ON REEL 017271 FRAME 0260. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF ASSIGNOR'S INTEREST;ASSIGNORS:BOYD, WILLIAM T.;FREIMUTH, DOUGLAS M.;HOLLAND, WILLIAM G.;AND OTHERS;SIGNING DATES FROM 20051122 TO 20051213;REEL/FRAME:030259/0525 |