WO2024149043A1 - Reliable transmission method and apparatus for p2mp data - Google Patents
- Publication number
- WO2024149043A1 (PCT/CN2023/140662)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- p2mp
- nodes
- data
- identification information
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H04L12/16—Arrangements for providing special services to substations
- H04L12/18—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1061—Peer-to-peer [P2P] networks using node-based peer discovery mechanisms
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1074—Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
Definitions
- the present application relates to the field of communication network technology, and in particular to a reliable transmission method and device for P2MP data.
- Point-to-multipoint (P2MP) communication refers to the communication between a source node that sends multicast data and a series of receiving nodes that receive the multicast data.
- the reliable guarantee mechanism for P2MP data transmission/reception usually establishes a unicast reliable transmission connection at the transport layer between the source node and each receiving node of the P2MP data, and guarantees reliable transmission through the reliable transmission connection, or establishes a reliable transmission mechanism at the application layer based on the unreliable datagram connection at the transport layer to ensure reliable transmission.
- However, when the number of nodes in the P2MP communication domain is large, the resource consumption required for the source node to maintain the reliable transmission mechanism alone may be serious; or, when the multicast source switches frequently in the P2MP communication domain, the control-plane overhead of establishing reliable transmission connections for P2MP data may be too high. Either case can ultimately make it impossible to guarantee the reliability of multicast data transmission. Therefore, it is necessary to establish an efficient mechanism for guaranteeing the reliable transmission of P2MP data.
- the present application provides a reliable transmission method, device, electronic device, computer-readable storage medium and computer program product for P2MP data, which can efficiently ensure the reliable transmission of P2MP data when the scale of communication nodes in the P2MP communication domain is large and multicast source switching occurs frequently in the P2MP communication domain.
- the present application provides a reliable transmission method for P2MP data, which is applied to a first node, wherein the first node is a node in a P2MP communication domain, wherein the P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
- the method includes: receiving a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; determining the identification information of a third node according to a forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node having a direct connection relationship in the logical interconnection network; and sending a response message to the third node according to a state of receiving the P2MP data and the identification information of the third node.
- the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; sending a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node includes: when the state of receiving the P2MP data is erroneous, sending the response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and a state flag of receiving the P2MP data, the state flag of receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; the method also includes: receiving a correct P2MP data packet retransmitted by the third node.
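As a rough illustration of the receiver-side step described above, the following Python sketch builds a response message carrying the identification information of the first node, the sequence number of the P2MP data and a state flag. The names `ResponseMessage` and `build_response`, the use of a CRC32 checksum to decide whether the data is erroneous, and the ACK branch are assumptions made for illustration; the text only specifies the message fields and the NAK case.

```python
import zlib
from dataclasses import dataclass

ACK = "ACK"  # assumed flag value for correctly received data
NAK = "NAK"  # flag value named in the text: received data is erroneous


@dataclass(frozen=True)
class ResponseMessage:
    node_id: str  # identification information of the first node
    seq: int      # sequence number of the P2MP data
    state: str    # state flag of receiving the P2MP data


def build_response(node_id: str, seq: int, payload: bytes, checksum: int) -> ResponseMessage:
    """Build the response for one received P2MP data packet.

    The CRC32 comparison is a hypothetical integrity check; the source
    does not say how erroneous data is detected.
    """
    received_ok = zlib.crc32(payload) == checksum
    return ResponseMessage(node_id, seq, ACK if received_ok else NAK)
```

The resulting message would then be sent to the third node, which is expected to retransmit a correct copy of the packet when the flag is NAK.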
- the P2MP data packet includes at least P2MP data and identification information of the second node; determining the identification information of the third node according to the forwarding table of the first node includes: determining the identification information of the third node according to the forwarding table of the first node and the identification information of the second node; wherein the forwarding table of the first node contains at least one field {key: value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
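The forwarding-table lookup can be sketched in Python as a dictionary keyed by the identification information of the second node. The wildcard representation `"*"`, the function name `lookup_third_node`, and the rule that an exact match takes precedence over the wildcard are assumptions for illustration and are not stated in the text.

```python
from typing import Dict, Optional

WILDCARD = "*"  # assumed representation of the wildcard key


def lookup_third_node(forwarding_table: Dict[str, str], second_node_id: str) -> Optional[str]:
    """Return the identification information of the third node.

    Each entry is one {key: value} field: the key is the identification
    information of the second node (or the wildcard) and the value is the
    identification information of the third node. An exact match is tried
    first, then the wildcard (an assumed precedence).
    """
    if second_node_id in forwarding_table:
        return forwarding_table[second_node_id]
    return forwarding_table.get(WILDCARD)
```

For example, with the table `{"node2": "node3", "*": "node5"}`, a packet from `node2` is confirmed to `node3`, while a packet from any other source is confirmed to `node5`.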
- the logical interconnection network is established by connecting the multiple nodes, which includes: the multiple nodes are divided into a plurality of node groups, each node group includes a first node, the physical distance between the nodes in a node group is less than or equal to a preset distance, connections are established between the nodes in each node group, and the first nodes of the plurality of node groups are connected to one another.
- the present application provides a reliable transmission method for P2MP data, which is applied to a third node, the third node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
- the method includes: receiving a response message from a first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least the identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, and the P2MP data packet is sent by a second node in a P2MP communication domain, and the second node is one of the multiple nodes; according to the response message of the first node, the management table of the third node is updated; the management table includes at least the status of the first node receiving P2MP data.
- the method further includes: determining whether a status flag of the first node receiving P2MP data in the response message of the first node is NAK, and if so, retransmitting a correct P2MP data packet to the first node and starting a timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; receiving a response message from the first node, and stopping the timer.
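A minimal Python sketch of this NAK handling follows, assuming the third node keeps a correct copy of each packet indexed by sequence number. The class name `Retransmitter`, the injected `send` callback, the explicit `now` argument and the fixed timeout are hypothetical; what happens when the timer expires is not described in the text and is left out.

```python
from typing import Callable, Dict, Tuple

ACK = "ACK"
NAK = "NAK"


class Retransmitter:
    """Third-node logic: retransmit on NAK and track one timer per (node, seq)."""

    def __init__(self, send: Callable[[str, int, bytes], None], timeout: float = 1.0):
        self._send = send                  # hypothetical transport callback
        self._timeout = timeout            # assumed fixed timer duration in seconds
        self._store: Dict[int, bytes] = {}  # seq -> correct copy of the P2MP data
        self.timers: Dict[Tuple[str, int], float] = {}  # (node, seq) -> deadline

    def remember(self, seq: int, payload: bytes) -> None:
        """Keep a correct copy so that it can be retransmitted later."""
        self._store[seq] = payload

    def on_response(self, node_id: str, seq: int, state: str, now: float) -> None:
        key = (node_id, seq)
        # Any response from the first node stops a timer that is running.
        self.timers.pop(key, None)
        if state == NAK:
            # Retransmit the correct packet and start (or restart) the timer.
            self._send(node_id, seq, self._store[seq])
            self.timers[key] = now + self._timeout
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would more likely use a monotonic clock or an event-loop timer.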
- the P2MP data packet includes at least P2MP data and identification information of the second node; updating the management table of the third node according to the response message of the first node includes: updating the state flag of the first node receiving the P2MP data in the management table of the third node according to the identification information of the first node and the state flag of the first node receiving the P2MP data; wherein the management table of the third node includes at least one field {key: value: state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving the P2MP data.
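The management-table update can be sketched in Python by storing each {key: value: state} field as a dictionary entry mapping the pair (key, value) to a state. The `"*"` wildcard, the `"PENDING"` initial state used in the example, and the function name `update_management_table` are assumptions for illustration only.

```python
from typing import Dict, Tuple

WILDCARD = "*"  # assumed representation of the wildcard key

# (key, value) -> state, that is:
# (second node id or wildcard, first node id) -> state flag of the first node
ManagementTable = Dict[Tuple[str, str], str]


def update_management_table(
    table: ManagementTable, second_node_id: str, first_node_id: str, state: str
) -> bool:
    """Record the state flag reported by the first node.

    The entry for the specific source is tried first, then the wildcard
    entry (an assumed precedence). Returns False if no entry matches.
    """
    for key in ((second_node_id, first_node_id), (WILDCARD, first_node_id)):
        if key in table:
            table[key] = state
            return True
    return False
```

For example, starting from `{("node2", "node1"): "PENDING"}`, a NAK response from `node1` for data sourced by `node2` changes that entry's state to `"NAK"`.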
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
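One way to read this constraint is that the logical interconnection network contains no triangles: every node has at least one neighbor, and no two neighbors of the same node are directly connected to each other. The Python sketch below checks that property on an adjacency map; the function name and the adjacency-map representation are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, Set


def neighbours_not_interconnected(adj: Dict[str, Set[str]]) -> bool:
    """Check the topology rule on an undirected adjacency map.

    Returns True only if every node has at least one neighbour and no two
    neighbours of the same node are directly connected to each other.
    """
    for node, neighbours in adj.items():
        if not neighbours:
            return False  # each node needs at least one neighbour
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj.get(a, set()):
                return False  # two neighbours of `node` share a direct link
    return True
```

A chain A-B-C satisfies the rule, whereas a fully connected triple A, B, C does not.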
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in the node group is less than or equal to a preset distance, the nodes in the node group are connected, and the third nodes in multiple node groups are connected.
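The grouping step might be sketched in Python as follows. This is only one possible reading: the greedy grouping, the choice of the first member of each group as its designated node, the star connections inside a group, and the chain connecting the designated nodes are all assumptions, as are the 2-D coordinates used to stand in for physical distance.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def build_groups(positions: Dict[str, Point], max_dist: float) -> List[List[str]]:
    """Greedily group nodes so that all members of a group are within max_dist of each other."""
    groups: List[List[str]] = []
    for node in sorted(positions):
        for group in groups:
            if all(math.dist(positions[node], positions[m]) <= max_dist for m in group):
                group.append(node)
                break
        else:
            groups.append([node])  # the first member acts as the group's designated node
    return groups


def build_edges(groups: List[List[str]]) -> Set[Tuple[str, str]]:
    """Connect members to their designated node, and designated nodes to one another."""
    edges: Set[Tuple[str, str]] = set()
    for group in groups:
        for member in group[1:]:
            edges.add((group[0], member))  # intra-group connection (assumed star)
    heads = [group[0] for group in groups]
    for a, b in zip(heads, heads[1:]):
        edges.add((a, b))                  # inter-group connection (assumed chain)
    return edges
```

With nodes at (0, 0), (1, 0), (10, 0) and (10, 1) and a preset distance of 2, the sketch yields two groups whose designated nodes are linked to each other.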
- the present application provides a reliable transmission device for P2MP data, which is deployed on a first node, the first node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
- the device includes: a communication module, which is used to receive a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; a processing module, which is used to determine the identification information of a third node according to a forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the processing module is also used to send a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node.
- the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; when the processing module sends a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node, it is used to: when the state of receiving the P2MP data is erroneous, send a response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and a state flag of receiving the P2MP data, and the state flag of receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; the communication module is also used to: receive a correct P2MP data packet retransmitted by the third node.
- the P2MP data packet includes at least P2MP data and identification information of the second node; when the processing module determines the identification information of the third node according to the forwarding table of the first node, it is used to: determine the identification information of the third node according to the forwarding table of the first node and the identification information of the second node; wherein the forwarding table of the first node includes at least one field {key: value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in the node group is less than or equal to a preset distance, connections are established between the nodes in the node group, and connections are established with the first nodes in the multiple node groups.
- the present application provides a reliable transmission device for P2MP data, which is deployed on a third node, the third node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
- the device includes: a communication module, which is used to receive a response message from a first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least the identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, and the P2MP data packet is sent by the second node in a P2MP communication domain, and the second node is one of the multiple nodes; a processing module, which is used to update the management table of the third node according to the response message of the first node; the management table includes at least the status of the first node receiving P2MP data.
- the processing module is further used to: determine whether the status flag of the first node receiving the P2MP data in the response message of the first node is NAK, and if so, retransmit the correct P2MP data packet to the first node and start the timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; and the communication module is further used to: receive the response message from the first node and stop the timer.
- the P2MP data packet includes at least P2MP data and identification information of the second node; when the processing module updates the management table of the second node according to the response message of the first node, it is used to: update the state flag of the first node receiving the P2MP data in the management table of the second node according to the identification information of the first node and the state flag of the first node receiving the P2MP data; wherein the management table of the second node includes at least one field {key: value: state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving the P2MP data.
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in the node group is less than or equal to a preset distance, the nodes in the node group are connected, and the third nodes in multiple node groups are connected.
- the present application provides a computer-readable storage medium, which stores a computer program.
- the processor executes the method described in the first aspect or any possible implementation of the first aspect, or executes the method described in the second aspect or any possible implementation of the second aspect.
- the present application provides a computer program product.
- the processor executes the method described in the first aspect or any possible implementation of the first aspect, or executes the method described in the second aspect or any possible implementation of the second aspect.
- FIG1 is a schematic diagram of an encoding process for establishing an MPI communication domain
- FIG2 is a schematic diagram of a typical MPI interface using a P2MP communication model
- FIG3 is a flow chart of network layer multicast forwarding and application layer multicast forwarding P2MP data
- FIG4 is a schematic diagram of the salient features of P2MP data forwarding within a P2MP communication domain
- FIG5 is a schematic diagram illustrating the problem of reliable forwarding of P2MP data
- FIG6a is a physical interconnection network established between nodes in a P2MP communication domain provided by an embodiment of the present application
- FIG6b is a forwarding path of P2MP data in a physical interconnection network provided by an embodiment of the present application.
- FIG7a is a logical interconnection network established between nodes in a P2MP communication domain provided by an embodiment of the present application
- FIG7b is a reliable transmission confirmation relationship of P2MP data in a logical interconnection network provided by an embodiment of the present application.
- FIG8 is a logical interconnection network established based on a quick response strategy provided in an embodiment of the present application.
- FIG9 is a flow chart of a reliable transmission method for P2MP data provided in an embodiment of the present application.
- FIG10 is a flow chart of a reliable transmission method for P2MP data provided in an embodiment of the present application.
- FIG11 is a schematic diagram of a logical interconnection network and a forwarding table provided in an embodiment of the present application.
- FIG12 is a flow chart of a reliable transmission method for P2MP data provided in an embodiment of the present application.
- FIG13a is a diagram of a network layer multicast resource configuration based on RC connection in a P2MP communication domain provided by an embodiment of the present application;
- FIG13b is a schematic diagram of a logical connection based on a data plane in a P2MP communication domain provided by an embodiment of the present application;
- FIG13c is a schematic diagram of a logical connection based on a control plane in a P2MP communication domain provided by an embodiment of the present application.
- FIG14 is a schematic diagram of a reliable transmission device for P2MP data provided in an embodiment of the present application.
- FIG15 is a schematic diagram of a reliable transmission device for P2MP data provided in an embodiment of the present application.
- the term "A and/or B" herein describes an association relationship between associated objects, indicating that three relationships are possible.
- "A and/or B" can represent: A exists alone, A and B exist at the same time, or B exists alone.
- the symbol "/" in this article indicates that the associated objects are in an or relationship, for example, A/B means A or B.
- first and second in the specification and claims herein are used to distinguish different objects rather than to describe a specific order of the objects.
- a first response message and a second response message are used to distinguish different response messages rather than to describe a specific order of the response messages.
- words such as “exemplary” or “for example” are used to indicate examples, illustrations or descriptions. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be interpreted as being more preferred or more advantageous than other embodiments or designs. Specifically, the use of words such as “exemplary” or “for example” is intended to present related concepts in a specific way.
- multiple means two or more than two.
- multiple processing units refer to two or more processing units, etc.; multiple elements refer to two or more elements, etc.
- MPI: message passing interface
- P2P: point-to-point
- CC: collective communication
- point-to-point communication supports communication between a pair of processes, while group communication sets a specified process group, and all processes in the group participate in global data processing and communication operations.
- One process is carried on a node such as a CPU or server.
- MPI builds multiple different application program interfaces (APIs) based on different parallel computing demand models. These MPI interfaces complete data movement, aggregation or synchronization through different communication modes.
- an HPC or AI application that uses MPI for message passing, in group communication mode, usually first specifies a group of related processes to establish an MPI communication domain.
- the processes in the communication domain jointly implement all or part of the application functions of the HPC or AI.
- Figure 1 shows a coding process for establishing an MPI communication domain.
- the coding process for establishing communication domains comm1 and comm2 is divided into the following steps: the first step is to specify communication domain members, which include all or some processes of the system specified by themselves, for example, by calling the function MPI_Group_incl() to specify the processes in group1 in the system, or calling the function MPI_Group_excl() to exclude the processes in group2 in the system; the second step is to create a communication domain, which is implemented by calling the function MPI_Comm_create(); the third step is to send/receive multicast data, which is implemented by calling an MPI interface MPI_Bcast().
- MPI_Bcast() MPI_Allreduce()
- MPI_Scatter MPI_Reduce_scatter()
- MPI_Reduce_scatter MPI_Reduce_scatter()
- FIG2 shows a typical MPI interface using the P2MP communication model.
- the execution system builds an MPI communication domain containing four processes according to the request of the application.
- the four processes run on the hardware devices GPU0-GPU3 respectively.
- GPU0-GPU3 is used to refer to the four corresponding members in the communication domain, or nodes, or processes for explanation.
- the application execution device uses the MPI_Bcast interface and adopts the P2MP communication model of 1 sender n receivers (1SnR) in the MPI communication domain to realize data movement.
- the other three communication members GPU1-GPU3 serve as receiving nodes.
- the source node GPU0 sends the received and saved data "A" to the three receiving nodes GPU1-GPU3 in P2MP mode, and the receiving nodes GPU1-GPU3 save the received data "A".
- the hardware devices GPU0-GPU3 can perform different processing based on the received data "A" according to the allocation instructions of the execution device to complete the calculation task of data "A" in parallel.
- the execution system of the application uses the MPI_Allreduce() interface and adopts the model of multiple senders and multiple receivers (n senders n receivers, nSnR) to achieve data aggregation and movement, which can be decomposed into two steps: the first step is that a single node collects data, using the model of multiple senders and single receivers (n senders 1 receiver, nS1R); the second step is that the single node processes the collected data and sends it to multiple receiving nodes in the communication domain, using the 1SnR P2MP communication model. Specifically, the communication members GPU0/GPU1/GPU2/GPU3 first receive or establish different data "A/B/C/D" respectively.
- the nS1R model is used to select the communication members GPU1/GPU2/GPU3 in the communication domain as the source nodes for data forwarding, and collect the data of multiple groups of communication members to the communication member GPU0. For example, first select the communication node GPU1 as the source node and the communication member GPU0 as the receiving node. At this time, the source node GPU1 sends the saved data "B" to GPU0 in the P2P communication mode, and the process of GPU2 and GPU3 sending data is similar.
- after the communication member GPU0 completes the aggregation of the data "A", "B", "C", and "D" and performs related processing to obtain the data "A+B+C+D", it sends the result in P2MP mode.
- the hardware devices GPU0-GPU3 can perform different processing based on the same data "A+B+C+D" according to the system's allocation instructions to complete the calculation task of the data "A+B+C+D" in parallel.
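The two-step decomposition of the nSnR model can be illustrated with a small pure-Python simulation (no MPI library is involved). The function name `allreduce_via_root`, the use of integer values, and summation as the "related processing" are assumptions chosen to mirror the "A+B+C+D" example.

```python
from typing import Dict


def allreduce_via_root(values: Dict[str, int], root: str) -> Dict[str, int]:
    """Simulate nSnR as an nS1R collection followed by a 1SnR distribution."""
    # Step 1 (nS1R): each non-root member sends its data to the root via P2P.
    collected = [values[root]]
    for member in sorted(values):
        if member != root:
            collected.append(values[member])

    # The root aggregates what it collected (summation is an assumed operation).
    total = sum(collected)

    # Step 2 (1SnR): the root sends the result to the other members in P2MP
    # mode, so every member ends up holding the same aggregated data.
    return {member: total for member in values}
```

Calling `allreduce_via_root({"GPU0": 1, "GPU1": 2, "GPU2": 3, "GPU3": 4}, "GPU0")` leaves every member holding the value 10.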
- FIG3 shows a flowchart of network layer multicast forwarding and application layer multicast forwarding of P2MP data.
- a typical implementation method is to use network layer multicast or application layer multicast forwarding.
- the multicast distribution network will build a multicast distribution tree (MDT) for each node in the P2MP domain that serves as a multicast source.
- the network equipment in each MDT is responsible for forwarding P2MP data from the source node to each receiving node, that is, the sink node.
- when application layer multicast forwarding is used, the P2MP data is disassembled into multiple pieces of P2P data; the source node 1 sends the P2P data multiple times, and the P2P data is unicast forwarded to each of the receiving nodes 2-5.
- some receiving nodes may participate in the forwarding. For example, after receiving the P2P data, the receiving node 2 forwards it to other receiving nodes 2-3 again.
- P2MP data needs to be reliably sent from the source node to the relevant receiving node, that is, a reliable transmission guarantee mechanism needs to be established to manage the sending/receiving of P2MP data.
- Figure 4 shows the significant characteristics of P2MP data forwarding in a P2MP communication domain.
- different nodes may be selected as the source of P2MP data transmission at different times in the life cycle of the P2MP communication domain, that is, there is frequent multicast source switching in the P2MP communication domain; in addition, the P2MP data sent in the P2MP communication domain needs to be received by all other nodes except the source node.
- FIG5 shows a description of the problem of reliable forwarding of P2MP data.
- each receiving node in the P2MP communication domain will send an ACK/NAK response message to the source node to indicate whether the receiving node has correctly received the P2MP data. Therefore, the source node of the P2MP communication domain needs to maintain the connection with each receiving node on the control plane, and also needs to manage the receiving status of each receiving node and retransmit the P2MP data to the receiving node that sent the NAK response message.
- as the node scale in the P2MP communication domain increases, the resource consumption of the source node becomes very large.
- the multicast data source in the P2MP communication domain will switch. It can be understood that any node in the P2MP communication domain may serve as a P2MP source node, and the resource consumption problem of the source node will extend to any node in the P2MP communication domain.
- RDMA: remote direct memory access
- a business model is established based on business scenarios such as HPC or AI, and a P2MP communication model is used between the MPI interfaces of parallel processing nodes such as CPUs and servers.
- different reliable guarantee mechanisms can be adopted according to the scale of collective communication in the business system.
- the system establishes an RC connection at the transport layer between the source node of the P2MP data and each receiving node.
- the RC connection is similar to a TCP connection.
- P2MP data uses application layer multicast technology to forward unicast data on the RC connection.
- the RC connection at the transport layer implements packet loss retransmission to ensure reliable transmission of P2MP data.
- the system establishes a UD connection at the transport layer between the source node of the P2MP data and each receiving node to save QP resources.
- the UD connection is very similar to the UDP connection.
- P2MP uses network layer multicast or application layer multicast technology to multicast/unicast forward data packets on the UD connection.
- reliable transmission of P2MP data then requires the application layer of the receiving node to identify packet loss and request retransmission, and the source node to retransmit the lost packets.
- the following solutions can be adopted: establish RC connections on demand, that is, create connections between source node communication members and receiving node communication members only when there is a P2MP communication demand, thereby saving QP resources of hardware devices, but this solution will greatly affect the transmission efficiency of P2MP data because the establishment of RC connections is time-consuming; or in order to ensure low latency overhead for establishing RC connections after the source node is switched, establish RC connections in advance.
- N*(N-1)/2 RC connections need to be established in advance.
- RC connections use limited hardware resources on hardware devices, and the member processes of a communication domain may be concentrated on a certain PC host or server, which causes RC connections to be concentrated on a certain physical terminal, exhausts its resources, and ultimately means that reliability cannot be guaranteed.
- the embodiment of the present application provides a reliable transmission method for P2MP data. Based on the characteristic that each receiving node in the P2MP communication domain receives the same data, a logical interconnection network is established between multiple nodes in the P2MP communication domain, and a part of the tasks of reliable transmission processing performed by the source node is distributed and deployed to multiple nodes in the P2MP communication domain, which can effectively reduce the pressure of single-point resources and avoid single-point processing bottlenecks.
- with a method in which multiple nodes jointly maintain reliable transmission, there is no need to establish a Full-Mesh connection between the nodes in the P2MP communication domain, thereby saving the control plane overhead of guaranteeing reliable transmission caused by frequent switching of the source node.
- each receiving node receives the same data, so it is only necessary to establish an effective connection between each node at the signaling level, and design a reliable transmission scheme to implement a response mechanism (including sending a response message), a data verification mechanism and an order preservation mechanism (including packet loss request, data retransmission).
- the reliable transmission method provided in the embodiment of the present application decouples P2MP data forwarding and reliable transmission confirmation of P2MP data.
- a physical interconnection network is established between all nodes in the P2MP communication domain, that is, an interconnection network constructed by network devices, to realize the forwarding of P2MP data, and a logical interconnection network is established between all nodes in the P2MP communication domain to determine the reliable transmission confirmation relationship of P2MP data.
- FIG6a shows a physical interconnection network established between nodes in a P2MP communication domain provided by an embodiment of the present application.
- the P2MP communication domain includes eight nodes, each of which has its own identification information, respectively identified as nodes 1-8.
- a physical interconnection network with a two-tier (Spine-Leaf) network topology is established between the eight nodes.
- P2MP data sent by any node can be forwarded in the Spine-Leaf network and reach the other seven nodes in the P2MP communication domain.
- FIG6b shows a forwarding path of P2MP data in a physical interconnection network provided by an embodiment of the present application.
- FIG6b shows a P2MP data forwarding path with node 1 as the multicast source.
- Node 1 can use network layer multicast (source node sends P2MP data once) to send data and forward it along the forwarding path of FIG6b.
- FIG7a shows a logical interconnection network established between nodes in a P2MP communication domain provided by an embodiment of the present application. Specifically, based on the P2MP communication domain and the physical interconnection network shown in FIG6a, a logical interconnection network is established between eight nodes. In the logical interconnection network, except for the first and last nodes, all nodes are directly connected to two other nodes. Two nodes with a direct connection relationship are neighbor nodes, and each node can have multiple neighbor nodes.
- FIG. 7b shows a reliable transmission confirmation relationship of P2MP data in a logical interconnected network provided by an embodiment of the present application.
- Figure 7b shows that after node 1 sends P2MP data as a multicast source, each receiving node needs to send a response message to a neighboring node after correctly receiving the P2MP data or identifying that the P2MP data is erroneous. For example, after node 6 identifies that the P2MP data is erroneous, it needs to send a NAK response message to node 5, and node 5 completes the retransmission of the P2MP data to node 6.
- node 5 can replace source node 1 to retransmit P2MP data to receiving node 6.
- with the logical interconnection network shown in FIG7a, it is not necessary for all receiving nodes 2-7 in the P2MP communication domain to establish a connection with source node 1, thereby saving the control plane overhead of guaranteeing reliable transmission caused by frequent switching of source nodes.
- any P2MP data sent by a multicast source can be forwarded to all other receiving nodes through the physical interconnection network.
- the physical interconnection network can use the network layer multicast mechanism or the application layer multicast mechanism to forward P2MP data, which is selected based on the requirements of different application scenarios.
- the logical interconnection network establishes the adjacency relationship between the nodes in the P2MP communication domain. Based on the adjacency relationship, each node in the P2MP domain confirms the reliable transmission of the P2MP data received from the physical interconnection network to the neighboring nodes in its logical interconnection network, and the neighboring nodes are responsible for retransmitting the P2MP data.
- the logical interconnection network between the eight nodes can have a variety of different networking modes in addition to the networking mode shown in FIG7a, so that different connections between the eight nodes can be formed.
- the establishment of connections between multiple nodes in the P2MP communication domain will be constrained by different strategies, for example, the number of connections between all nodes must be minimized to save resources required for establishing connections, the physical distance of the connection must be minimized so that neighboring nodes can quickly notify and respond to each other, etc. Therefore, it is necessary to build a logical interconnection network between multiple nodes in the P2MP communication domain based on different strategies and goals.
- when a logical interconnection network is established based on a quick response strategy, the multiple nodes can be divided into multiple node groups according to physical distance, such that the physical distance between the nodes in each node group is less than or equal to a preset distance; connections are established between the nodes within each node group, and a node in one node group is connected to a node in another node group, thereby realizing the interconnection of the multiple nodes.
- FIG8 shows a logical interconnection network established based on a quick response strategy provided by an embodiment of the present application, assuming that there is a certain physical neighbor relationship between certain nodes, and the physical neighbor relationship can be the relationship between nodes in a basic physical design unit (point of delivery, PoD) of a data center.
- PoD includes servers, access networks, converged network cabinets and their supporting facilities, which is an area of the entire network.
- the entire network includes multiple PoDs, for example, nodes 1-3 are deployed in the same PoD, nodes 4-6 are deployed in the second PoD, and nodes 7-8 are deployed in the third PoD.
- Based on the quick response strategy, connections are first established between nodes in the same PoD, and then a node is selected from each PoD, such as nodes 3, 6, and 7, to establish connections between the three PoDs. This makes maximum use of the physically adjacent deployment of the nodes, so that reliable transmission control signaling can be processed nearby and answered quickly. However, the number of connections of the specific nodes 3, 6, and 7 responsible for the connection between PoDs will increase.
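- The quick response strategy above can be sketched as follows. This is a minimal illustration, not the construction used in FIG8: the function names are hypothetical, and both the intra-PoD connections and the connections between the selected nodes are assumed to form chains, which the text does not specify.

```python
from collections import Counter


def build_quick_response_network(pods, gateways):
    """Build logical connections: chains inside each PoD, then a chain of gateways.

    pods: dict mapping a PoD name to the ordered list of node ids it hosts.
    gateways: one selected node id per PoD, in the order the PoDs are linked.
    Returns a sorted list of (a, b) connections with a < b.
    """
    connections = set()
    for members in pods.values():
        # Intra-PoD connections (assumed here to form a chain).
        for a, b in zip(members, members[1:]):
            connections.add((min(a, b), max(a, b)))
    # Inter-PoD connections between the selected nodes (assumed chain).
    for a, b in zip(gateways, gateways[1:]):
        connections.add((min(a, b), max(a, b)))
    return sorted(connections)


def degree(connections):
    """Number of logical connections terminating at each node."""
    counts = Counter()
    for a, b in connections:
        counts[a] += 1
        counts[b] += 1
    return counts


pods = {"PoD1": [1, 2, 3], "PoD2": [4, 5, 6], "PoD3": [7, 8]}
connections = build_quick_response_network(pods, gateways=[3, 6, 7])
print(connections)
print(degree(connections)[6])  # node 6 links to nodes 3, 5 and 7
```

- Under these assumptions the selected nodes carry the extra inter-PoD connections, which matches the observation that their connection count increases.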
- each of the multiple nodes in the P2MP communication domain establishes a direct connection with a neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
- a chain topology can be used to connect all nodes in the P2MP communication domain, so that each node needs to establish connections with at most two neighbor nodes respectively, and the total number of connections is minimized, and the total number of connections in the P2MP communication domain is evenly distributed.
- a logical interconnection network is established between eight nodes based on the control connection number strategy, and the total number of connections between the eight nodes is minimized, which will not be repeated here.
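- To make the saving concrete, the following sketch compares the number of connections of a Full-Mesh with that of a chain topology for the eight-node example. The helper names are hypothetical and the chain is assumed to follow the node numbering.

```python
def full_mesh_connections(n: int) -> int:
    """Connections needed if every pair of n nodes is connected in advance."""
    return n * (n - 1) // 2


def chain_connections(n: int) -> int:
    """Connections needed if the n nodes are linked in a chain."""
    return max(n - 1, 0)


def chain_neighbors(node_ids: list[int]) -> dict[int, list[int]]:
    """Neighbor lists for a chain built in the order given by node_ids."""
    neighbors: dict[int, list[int]] = {node: [] for node in node_ids}
    for left, right in zip(node_ids, node_ids[1:]):
        neighbors[left].append(right)
        neighbors[right].append(left)
    return neighbors


nodes = list(range(1, 9))  # the eight nodes of the example domain
topology = chain_neighbors(nodes)
print(full_mesh_connections(len(nodes)))       # 28
print(chain_connections(len(nodes)))           # 7
print(max(len(v) for v in topology.values()))  # 2
```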
- a communication connection is established between two neighboring nodes in the logical interconnection network. This communication connection is used to transmit reliable transmission response messages between neighboring nodes.
- the communication connection between neighboring nodes is created at the beginning of the establishment of the logical interconnection network and is used throughout the life cycle of the P2MP communication domain.
- FIG9 shows a flow chart of a reliable transmission method for P2MP data provided by an embodiment of the present application, assuming that the P2MP communication domain includes at least a first node, a second node, and a third node.
- a logical interconnection network is established between the multiple nodes in the P2MP communication domain.
- the first node and the third node are neighbor nodes
- the second node and the third node are neighbor nodes.
- when the second node is used as the source node for sending P2MP data, the first node and the third node act as receiving nodes, and a confirmation process for reliable transmission of P2MP data is performed in the P2MP communication domain, including the following steps S901-S905:
- Step S901 The second node sends a P2MP data packet in the P2MP communication domain.
- the P2MP data packet at least includes P2MP data, a sequence number of the P2MP data, and identification information of the second node (source node).
- Step S902 The first node receives the P2MP data packet sent by the second node, and determines the identification information of the third node according to the forwarding table of the first node. It is assumed here that the first node determines that the received P2MP data is erroneous, and sends a response message to the third node.
- the response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the first node, and the identification information of the second node (source node).
- the status flag of receiving the P2MP data is set to NAK, indicating that the received P2MP data is erroneous.
- Step S903 The third node receives the P2MP data sent by the second node, and determines the identification information of the second node according to the forwarding table of the third node. It is assumed here that the third node determines that the received P2MP data is correct, and sends a response message to the second node.
- the response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the third node, and the identification information of the second node (source node). At this time, the status flag of receiving the P2MP data is set to ACK, indicating that the P2MP data is correctly received.
- there is no order relationship between step S902 and step S903; the actual transmission order depends on the physical interconnection network established by the multiple nodes in the P2MP communication domain.
- Step S904 The third node receives the NAK response message sent by the first node, updates the state of the first node receiving the P2MP data in the management table, and retransmits the correct P2MP data to the first node.
- Step S905 The first node receives the P2MP data retransmitted by the third node, and determines the identification information of the third node according to the forwarding table of the first node. It is assumed here that the first node determines that the received P2MP data is correct, and sends a response message to the third node.
- the response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the first node, and the identification information of the second node (source node). At this time, the status flag of receiving the P2MP data is set to ACK, indicating that the P2MP data is correctly received.
- each node in the P2MP communication domain has a forwarding table and a management table.
- the forwarding table is used to confirm a neighbor node that sends a response message after receiving a P2MP data packet sent by the source node
- the management table is used to record the state information of the managed receiving node receiving P2MP data after receiving a response message sent by the managed receiving node.
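- Steps S901-S905 and the roles of the two tables can be traced with a small simulation. This is a sketch under stated assumptions: the class and function names, the node identifiers, the `corrupted` flag standing in for an integrity check, and the packet store kept by the third node are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass

ACK, NAK = "ACK", "NAK"
FIRST, SECOND, THIRD = 1, 2, 3  # hypothetical node identifiers


@dataclass(frozen=True)
class P2MPPacket:
    source_id: int
    seq: int
    data: bytes
    corrupted: bool = False  # stand-in for a failed integrity check


@dataclass(frozen=True)
class Response:
    seq: int
    status: str
    sender_id: int
    source_id: int


def on_packet(node_id, forwarding_table, packet):
    """Receiving side: pick the neighbor from the forwarding table and answer."""
    neighbor_id = forwarding_table[packet.source_id]
    status = NAK if packet.corrupted else ACK
    return neighbor_id, Response(packet.seq, status, node_id, packet.source_id)


def on_response(management_table, stored_packets, response):
    """Managing side: record the state and return a packet to retransmit, if any."""
    management_table[(response.source_id, response.sender_id)] = response.status
    if response.status == NAK:
        return stored_packets[(response.source_id, response.seq)]
    return None


good = P2MPPacket(source_id=SECOND, seq=7, data=b"payload")      # S901
damaged = P2MPPacket(SECOND, 7, b"payl\x00ad", corrupted=True)

first_forwarding = {SECOND: THIRD}   # first node reports to the third node
third_forwarding = {SECOND: SECOND}  # third node reports to the source node
third_management = {}
third_store = {(SECOND, 7): good}    # third node keeps its correct copy

target, nak = on_packet(FIRST, first_forwarding, damaged)        # S902
_, ack_to_source = on_packet(THIRD, third_forwarding, good)      # S903
retransmitted = on_response(third_management, third_store, nak)  # S904
_, ack = on_packet(FIRST, first_forwarding, retransmitted)       # S905
on_response(third_management, third_store, ack)

print(target, nak.status, ack.status, third_management)
```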
- Figure 10 shows a flow chart of a reliable transmission method of P2MP data provided in an embodiment of the present application, which is applied to a first node, where the first node is a node in a P2MP communication domain.
- a P2MP communication domain includes multiple nodes, and the first node is one of the multiple nodes.
- a logical interconnection network is built between the multiple nodes, and each node has its own identification information.
- the reliable transmission method of P2MP data includes the following steps S1010 - S1030 .
- Step S1010 receiving a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet at least includes P2MP data, and the second node is one of the plurality of nodes.
- Step S1020 determining identification information of the third node according to the forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network.
- Step S1030 Send a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node.
- the response message includes ACK and NAK messages.
- ACK indicates that the P2MP data is correctly received
- NAK indicates that the received P2MP data is erroneous.
- the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; sending a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node includes: when the state of receiving the P2MP data is erroneous, sending a response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and the state flag of receiving the P2MP data, the state flag of receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; the method also includes: receiving a correct P2MP data packet retransmitted by the third node.
- the P2MP communication domain includes eight nodes, and a logical interconnection network is built between the eight nodes.
- Each node has its own identification information.
- node 1 is regarded as the second node
- node 6 is regarded as the first node
- node 5 is regarded as the third node to illustrate the reliable transmission method of P2MP data.
- node 1 is selected as the multicast source for sending P2MP data in the P2MP communication domain
- nodes 2-8 are selected as receiving nodes for receiving P2MP data.
- any of the eight nodes may be used as the source node for sending P2MP data.
- some additional information needs to be added to form a P2MP data packet, such as the identification information of the source node, the sequence number of the P2MP data, etc.
- node 6 has two neighbor nodes, namely node 5 and node 7.
- node 6 determines the identification information of node 5 according to the forwarding table, and then sends a response message to node 5 according to the status of receiving the P2MP data.
- the response message at least includes the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of node 6, and the identification information of node 1.
- the status flag of receiving the P2MP data is set to ACK, indicating that the P2MP data is correctly received, or is set to NAK, indicating that the received P2MP data is incorrect.
- if the status flag is set to NAK, node 6 also needs to receive the correct P2MP data retransmitted by node 5.
- a response message may be sent to the third node once after all P2MP data packets forwarded by the second node have been received, or a response message may be sent to the third node each time the first node receives a P2MP data packet, or once after every fixed number of received P2MP data packets.
- the selection of the above response message sending method is set in the control plane of the system. It can be understood that a fixed response message sending method can be set throughout the life cycle of the P2MP communication domain, or the response message sending method can be switched in real time according to the network congestion situation. For the convenience of description, in this application, a response message is sent once each time a P2MP data packet is received. The implementation methods of other message sending methods are similar and will not be repeated here.
- FIG11 shows a logical interconnection network and a forwarding table provided in an embodiment of the present application.
- the P2MP communication domain includes eight nodes, and a logical interconnection network is established between the nodes.
- a node may have multiple neighboring nodes. Therefore, after the logical interconnection network is established, it is necessary to establish a forwarding table for each node, select a neighboring node according to the forwarding table, and perform reliable transmission control signaling (ACK/NAK response message) sending and retransmission processing.
- the establishment of the forwarding table mainly considers two aspects:
- because the source node forwarding P2MP data switches frequently during the life cycle of the P2MP communication domain, the impact of source node switching can be taken into account: the source node identification information is used as an index item of the forwarding table, and the corresponding neighbor node is selected according to the source node.
- alternatively, the influence of source node switching may be ignored, and a fixed neighbor node may be designated for reliable transmission signaling interaction when the forwarding table of a node is established. It can be understood that the two approaches may be mixed within one P2MP communication domain: for some nodes the forwarding table takes source node switching into account and uses the identification information of the source node as part of the table, while for the other nodes a fixed neighbor node is designated for reliable transmission signaling interaction; it is also possible to consider, or not to consider, the influence of source node switching for all nodes.
- for any node in the P2MP communication domain, its forwarding table can be established using the field {key: value}, where the key is the identification information of the source node or a wildcard, and the value is the identification information of the neighboring node for reliable transmission signaling interaction.
- the wildcard indicates that the impact of the source node switching is not considered, and the identification information of any source node will hit the wildcard.
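- A forwarding table lookup of this kind can be sketched as below. The `"*"` marker, the function name, and the table entries are hypothetical; letting an exact source entry take precedence over the wildcard is also an assumption, since the text only states that any source node hits the wildcard.

```python
WILDCARD = "*"  # hypothetical marker for "any source node"


def lookup_neighbor(forwarding_table, source_id):
    """Return the neighbor to which responses for this source are sent."""
    if source_id in forwarding_table:
        return forwarding_table[source_id]
    if WILDCARD in forwarding_table:
        return forwarding_table[WILDCARD]
    raise KeyError(f"no forwarding entry for source {source_id}")


# Table that takes source node switching into account (hypothetical entries).
per_source_table = {1: 2, 8: 4}
# Table that ignores source node switching: one fixed neighbor.
fixed_table = {WILDCARD: 4}

print(lookup_neighbor(per_source_table, 1))  # 2
print(lookup_neighbor(per_source_table, 8))  # 4
print(lookup_neighbor(fixed_table, 6))       # 4
```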
- the establishment of a forwarding table also needs to ensure that neighboring nodes (including source nodes) that interact with each other in reliable transmission signaling can achieve effective connection on the reply message sending path.
- the forwarding tables of node 2 and node 3 are given.
- the settings of the two forwarding tables both consider the impact of source node switching.
- the neighbor node identification information of node 2 is listed when the source node is node 1.
- Figure 11(b) shows a schematic diagram of reliable transmission signaling interaction when node 1 is the source node and nodes 2-8 are receiving nodes.
- nodes 1-7, as neighbor nodes of other receiving nodes, have the task of processing signaling interaction, and based on the neighbor information configured in the forwarding table, nodes 1-7 are effectively connected on the reply message sending path, so that nodes 1-7 can correctly receive P2MP data and can thereby replace source node 1 in retransmitting P2MP data to the corresponding receiving node.
- FIG 11(c) shows a schematic diagram of reliable transmission signaling interaction when node 1 is the source node and nodes 2-8 are receiving nodes. It can be seen from Figure 11(c) that nodes 1 and 3-7, as neighbor nodes of other receiving nodes, have the task of processing signaling interaction, but under this forwarding table configuration nodes 1 and 3-7 are not effectively connected on the reply message sending path, so that two independent islands are formed in the logical interconnection network. Therefore, when nodes 3 and 4 do not correctly receive the P2MP data sent by node 1 due to a network failure, they cannot obtain retransmitted correct P2MP data packets, nor can they retransmit the correct P2MP data to the receiving nodes they manage, so it cannot be guaranteed that each receiving node in the P2MP communication domain correctly receives the P2MP data.
- FIG 11(d) shows a schematic diagram of reliable transmission signaling interaction when node 1 is the source node and nodes 2-8 are receiving nodes.
- nodes 2-7, as neighboring nodes of other receiving nodes, have the task of processing signaling interaction, but under this forwarding table configuration nodes 2-7 are not connected to node 1 on the reply message sending path, so that two independent islands are formed in the logical interconnection network. Therefore, when nodes 2 and 3 do not correctly receive the P2MP data sent by node 1 due to a network failure, they cannot obtain retransmitted correct P2MP data packets, nor can they retransmit the correct P2MP data to the receiving nodes they manage, so it cannot be guaranteed that each receiving node in the P2MP communication domain correctly receives the P2MP data.
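- The connectivity requirement on the reply message sending path can be checked mechanically. The sketch below follows, for a given source node, the neighbor that each receiving node reports to and lists the receiving nodes whose path never reaches the source. The function name and both example configurations are hypothetical; the second one is merely an island-forming configuration and is not claimed to reproduce Figure 11(c) or 11(d).

```python
def unreachable_receivers(source_id, report_to):
    """List receivers whose reply path never reaches the source node.

    report_to[node] is the neighbor that `node` sends its response to
    for this source, as selected by that node's forwarding table.
    """
    unreachable = []
    for node in report_to:
        seen = set()
        current = node
        # Follow the reply path until the source, a dead end, or a loop.
        while current != source_id and current in report_to and current not in seen:
            seen.add(current)
            current = report_to[current]
        if current != source_id:
            unreachable.append(node)
    return sorted(unreachable)


# Chain configuration: every reply path ends at source node 1.
connected = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 7}
# Hypothetical faulty configuration: nodes 3 and 4 report to each other.
islands = {2: 1, 3: 4, 4: 3, 5: 4, 6: 5, 7: 6, 8: 7}

print(unreachable_receivers(1, connected))  # []
print(unreachable_receivers(1, islands))    # [3, 4, 5, 6, 7, 8]
```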
- Figure 12 shows a flow chart of a reliable transmission method of P2MP data provided in an embodiment of the present application, which is applied to a third node, where the third node is a node in a P2MP communication domain.
- a P2MP communication domain includes multiple nodes, and the third node is one of the multiple nodes.
- a logical interconnection network is built between the multiple nodes, and each node has its own identification information.
- the reliable transmission method of P2MP data includes the following steps S1210 - S1220 .
- Step S1210 receiving a response message from the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, the P2MP data packet is sent by the second node in a P2MP communication domain, and the second node is one of the multiple nodes.
- Step S1220 updating the management table of the third node according to the response message of the first node; the management table at least includes the state of the first node receiving the P2MP data.
- the third node updates the status of the first node receiving the P2MP data in the management table according to the response message of the first node.
- the reliable transmission method of P2MP data further includes: determining whether the status flag of the first node receiving P2MP data in the response message of the first node is NAK, and if so, retransmitting the correct P2MP data packet to the first node and starting a timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; receiving the response message from the first node again, and stopping the timer.
- when the third node receives the response message from the first node again, it also needs to update the status of the first node receiving P2MP data in the management table.
- the third node (a receiving node) and the second node (source node) as neighbor nodes of other receiving nodes have the task of processing signaling interaction.
- for the second node, after sending the P2MP data, it is necessary to start a timer for each receiving node that needs to send a reply message to the second node. When the reply message from a receiving node is received, the timer started for that receiving node is stopped. If the timer times out, the correct P2MP data is retransmitted to that receiving node.
- for the third node, similar operations need to be performed, after it receives the P2MP data, for all receiving nodes that need to send a reply message to the third node, which will not be repeated here.
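- The per-receiver timer handling can be sketched as follows. The class name, the timeout value, and the use of an explicit `now` argument instead of a real clock are assumptions made so that the example is deterministic.

```python
class RetransmitTimers:
    """Tracks one reply deadline per managed receiving node."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadlines = {}  # receiver id -> time at which the timer expires

    def start(self, receiver_id, now):
        """Start (or restart) the timer for one receiving node."""
        self.deadlines[receiver_id] = now + self.timeout

    def stop(self, receiver_id):
        """Stop the timer once the reply message has been received."""
        self.deadlines.pop(receiver_id, None)

    def expired(self, now):
        """Receiving nodes whose reply is overdue and need a retransmission."""
        return sorted(r for r, d in self.deadlines.items() if now >= d)


timers = RetransmitTimers(timeout=5.0)
for receiver in (3, 4):           # hypothetical managed receiving nodes
    timers.start(receiver, now=0.0)

timers.stop(3)                    # reply message from node 3 arrived
print(timers.expired(now=4.0))    # []
print(timers.expired(now=5.0))    # [4]

timers.start(4, now=5.0)          # retransmit to node 4 and restart its timer
print(timers.expired(now=6.0))    # []
```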
- for any node in the P2MP communication domain, such as the third node, its management table can be established using the field {key: value: state}, where the key is the identification information of the source node sending the P2MP data or a wildcard, the value is the identification information of the corresponding receiving node, and the state is the state flag of the corresponding receiving node receiving the P2MP data.
- the two tables can be two independent tables or combined into one table. It is only necessary to ensure that when looking up the table, the identification information of the neighbor node that sends the response message and the receiving node that receives the response message can be obtained respectively.
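- One possible way of combining the two tables into a single table is sketched below. The layout (one entry per source node holding the neighbor to report to and the managed receiving nodes with their states), the function names, and the example entry for node 5 are assumptions; the text only requires that both pieces of information can be obtained from the lookup.

```python
def make_entry(report_to, managed):
    """One combined entry: forwarding part plus management part."""
    return {"report_to": report_to, "managed": {r: None for r in managed}}


def neighbor_for(table, source_id):
    """Forwarding lookup: the neighbor that receives this node's response."""
    return table[source_id]["report_to"]


def record_state(table, source_id, receiver_id, state):
    """Management update: store the state reported by a managed receiver."""
    managed = table[source_id]["managed"]
    if receiver_id not in managed:
        raise KeyError(f"node {receiver_id} is not managed for source {source_id}")
    managed[receiver_id] = state


def pending(table, source_id):
    """Managed receivers that have not yet confirmed correct reception."""
    managed = table[source_id]["managed"]
    return sorted(r for r, state in managed.items() if state != "ACK")


# Hypothetical combined table of node 5 in a chain, with node 1 as source:
# node 5 reports to node 4 and manages node 6.
node5_table = {1: make_entry(report_to=4, managed=[6])}

print(neighbor_for(node5_table, 1))  # 4
record_state(node5_table, 1, 6, "NAK")
print(pending(node5_table, 1))       # [6]
record_state(node5_table, 1, 6, "ACK")
print(pending(node5_table, 1))       # []
```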
- the mechanism for implementing reliable transmission of multicast data is distributed and deployed to multiple communication nodes in the P2MP communication domain, which can effectively relieve the processing pressure that arises when only the source node handles reliable transmission to multiple receiving nodes, and avoids single-point processing bottlenecks.
- FIG. 13a shows a network layer multicast resource configuration diagram based on RC connection in a P2MP communication domain provided by an embodiment of the present application.
- a P2MP communication domain includes seven communication nodes, each of which can be carried in the same or different computers, servers, clusters, storage devices, including computing processing units such as smart network cards and channel adapters, which can realize physical layer links through gateways, routers, etc.
- node 1 is selected as the source node for sending P2MP data.
- the network layer multicast resource configuration based on the RC connection is set between the source node 1 and each receiving node 2-7 through the control plane. It can be seen from Figure 13a that the number of RC connections that need to be established is 6, among which node 1 establishes 1 RC connection with receiving nodes 2, 3, and 4 respectively, and nodes 2, 4, and 5 establish 1 RC connection with receiving nodes 5, 7, and 6 respectively.
- Figure 13b shows a logical connection diagram based on the data plane within a P2MP communication domain provided by an embodiment of the present application.
- the source node 1 makes multiple copies of the P2MP data in the multicast network and then sends them to the receiving nodes 2-7 at one time.
- FIG. 13c shows a schematic diagram of a logical connection based on a control plane in a P2MP communication domain provided by an embodiment of the present application.
- nodes in the P2MP communication domain are connected, for example, as node 1 (multicast source) ↔ node 2 (multicast sink 1, a receiving node managed by the multicast source, a neighbor node of multicast sink 2) ↔ node 5 (multicast sink 2, a receiving node managed by multicast sink 1, a neighbor node of multicast sink 3) ↔ node 6 (multicast sink 3, a receiving node managed by multicast sink 2).
- node 2 (multicast sink 1): first, according to the forwarding table, the neighbor node in the reliable transmission control domain is determined to be node 1. Node 2 then needs to notify the multicast source whether it can reliably receive P2MP data, and does not need to inform multicast sink 2, the receiving node it manages, whether it can reliably receive P2MP data.
- node 5 (multicast sink 2): first, according to the forwarding table, the neighbor node in the reliable transmission control domain is determined to be node 2. Node 5 then needs to notify its neighbor node, multicast sink 1, whether it can reliably receive P2MP data; it does not need to send this information to the multicast source, and does not need to inform multicast sink 3, the receiving node it manages, whether it can reliably receive P2MP data.
- node 6 (multicast sink 3): first, according to the forwarding table, the neighbor node in the reliable transmission control domain is determined to be node 5. Node 6 then needs to notify its neighbor node, multicast sink 2, whether it can reliably receive P2MP data, and does not need to send this information to the multicast source.
- multicast sink 1 can correctly receive the P2MP data sent by the multicast source, and can also retransmit the correct P2MP data when multicast sink 2 sends a NAK response message.
- any node only needs to feed information back to its neighboring node, which may be a multicast source or a multicast sink. Therefore, through the distributed reliable transmission control domain division, the pressure on a single point can be effectively reduced and the bottleneck of single-point processing can be avoided.
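- The reduction of single-point pressure can be illustrated by counting how many response messages each node has to process per P2MP data packet. The function name is hypothetical, and the distributed mapping below assumes that the reporting relation follows the six RC connections described for Figure 13a (nodes 2, 3, 4 report to node 1; node 5 to node 2; node 7 to node 4; node 6 to node 5), of which only the chain node 1, node 2, node 5, node 6 is stated explicitly for the control plane.

```python
from collections import Counter


def responses_per_node(report_to):
    """Count the response messages each node receives per P2MP data packet."""
    return Counter(report_to.values())


receivers = [2, 3, 4, 5, 6, 7]

# Centralized confirmation: every receiving node answers source node 1.
centralized = {receiver: 1 for receiver in receivers}

# Distributed confirmation (assumed mapping, see the lead-in above).
distributed = {2: 1, 3: 1, 4: 1, 5: 2, 6: 5, 7: 4}

print(max(responses_per_node(centralized).values()))  # 6
print(max(responses_per_node(distributed).values()))  # 3
print(dict(responses_per_node(distributed)))
```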
- Figure 14 shows a schematic diagram of a reliable transmission device for P2MP data provided in an embodiment of the present application.
- the device can be deployed on a first node.
- the first node is a node in a P2MP communication domain.
- a P2MP communication domain includes multiple nodes.
- the first node is one of the multiple nodes.
- a logical interconnection network is built between the multiple nodes.
- Each node has its own identification information.
- the confirmation device 1400 includes: a communication module 1410 and a processing module 1420.
- the communication module 1410 may receive a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet includes at least P2MP data, and the second node is one of the plurality of nodes.
- the processing module 1420 can determine the identification information of the third node according to the forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the processing module 1420 can also send a response message to the third node according to the status of receiving P2MP data and the identification information of the third node.
- the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; the processing module 1420 may send a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node, and specifically, when the state of receiving the P2MP data is erroneous, send a response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and the state flag of receiving the P2MP data, and the state flag of receiving the P2MP data is set to NAK, which indicates that the received P2MP data is erroneous; the communication module 1410 may also receive a correct P2MP data packet retransmitted by the third node.
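A response message with the three fields named above might be modeled as in the sketch below. The class name `ResponseMessage`, the helper `build_response`, and the `received_ok` flag are assumptions made for illustration; the passage does not say how an error is detected or what is sent on success, so the sketch only covers the NAK case.

```python
from dataclasses import dataclass
from typing import Optional

NAK = "NAK"  # status flag named in the text: the received P2MP data is erroneous


@dataclass(frozen=True)
class ResponseMessage:
    node_id: str  # identification information of the first node
    seq: int      # sequence number of the P2MP data
    status: str   # status flag of receiving the P2MP data


def build_response(first_node_id: str, seq: int, received_ok: bool) -> Optional[ResponseMessage]:
    """Return a NAK response when reception failed, otherwise None (assumed)."""
    if received_ok:
        return None
    return ResponseMessage(node_id=first_node_id, seq=seq, status=NAK)


nak = build_response("node5", seq=42, received_ok=False)
```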
- the P2MP data packet includes at least P2MP data and identification information of the second node; the processing module 1420 can determine the identification information of the third node based on the forwarding table of the first node, specifically, determine the identification information of the third node based on the forwarding table of the first node and the identification information of the second node; wherein the forwarding table of the first node includes at least one field {key: value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
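The {key: value} lookup described above can be sketched with a plain dictionary. The `"*"` wildcard symbol, the function name, and the rule that an exact match takes precedence over the wildcard are assumptions; the text only states that a key is either the second node's identification information or a wildcard.

```python
WILDCARD = "*"  # assumed representation of the wildcard key


def lookup_third_node(forwarding_table, second_node_id):
    """Return the third node's ID for a packet sent by second_node_id.

    An exact match on the sender is tried first, then the wildcard entry
    (this precedence is an assumption). Returns None if nothing matches.
    """
    if second_node_id in forwarding_table:
        return forwarding_table[second_node_id]
    return forwarding_table.get(WILDCARD)


# Hypothetical forwarding table of a first node.
table = {"node1": "node2", WILDCARD: "node5"}
```

With this table, a packet from `node1` maps to `node2`, and a packet from any other sender falls through to the wildcard entry.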
- a logical interconnection network is established by connecting multiple nodes, and the connections between the multiple nodes include: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
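The connection rule above (every node has at least one neighbor, and the neighbors of a node are not directly connected to each other) can be checked with a short sketch. Representing the logical interconnection network as an undirected adjacency map of sets is an assumption made for illustration.

```python
from itertools import combinations


def satisfies_neighbor_rule(adjacency):
    """Check the rule on an undirected graph given as {node: set_of_neighbors}."""
    for node, neighbors in adjacency.items():
        if not neighbors:
            return False  # every node needs at least one neighbor
        for a, b in combinations(sorted(neighbors), 2):
            if b in adjacency.get(a, set()):
                return False  # two neighbors of `node` are directly connected
    return True


# The chain node 1 - node 2 - node 5 - node 6 satisfies the rule.
chain = {1: {2}, 2: {1, 5}, 5: {2, 6}, 6: {5}}
# A triangle violates it: the two neighbors of any node are themselves connected.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```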
- a logical interconnection network is established by connecting multiple nodes, and connecting the multiple nodes includes: the multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in a node group is less than or equal to a preset distance, connections are established between the nodes in the node group, and the first nodes in the multiple node groups are connected to one another.
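One possible way to form such node groups is sketched below. Everything beyond the distance constraint is an assumption: nodes are given hypothetical 2-D coordinates, grouping is greedy, the first member of a group acts as its designated node, members are linked to that node in a star, and the designated nodes are linked in a chain.

```python
import math


def group_nodes(positions, max_distance):
    """Greedily place each node in the first group whose members are all in range."""
    groups = []
    for node, pos in positions.items():
        for group in groups:
            if all(math.dist(pos, positions[m]) <= max_distance for m in group):
                group.append(node)
                break
        else:
            groups.append([node])  # no suitable group: start a new one
    return groups


def build_links(groups):
    """Link members to their group's designated node, then chain those nodes."""
    links = set()
    for group in groups:
        head = group[0]
        for member in group[1:]:
            links.add((head, member))
    heads = [group[0] for group in groups]
    for a, b in zip(heads, heads[1:]):
        links.add((a, b))
    return links


positions = {"a": (0, 0), "b": (1, 0), "c": (10, 0), "d": (10, 1)}
groups = group_nodes(positions, max_distance=2)
links = build_links(groups)
```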
- Figure 15 shows a schematic diagram of a reliable transmission device for P2MP data provided in an embodiment of the present application.
- the device can be deployed at a third node.
- the third node is a node in a P2MP communication domain.
- a P2MP communication domain includes multiple nodes.
- the third node is one of the multiple nodes.
- a logical interconnection network is built between the multiple nodes.
- Each node has its own identification information.
- the confirmation device 1500 includes: a communication module 1510 and a processing module 1520.
- the communication module 1510 can receive a response message from the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, the P2MP data packet is sent by the second node in a P2MP communication domain, and the second node is one of multiple nodes.
- the processing module 1520 may update the management table of the third node according to the response message of the first node; the management table at least includes the state of the first node receiving the P2MP data.
- the processing module 1520 can determine whether the status flag of the first node receiving the P2MP data in the response message of the first node is NAK. If so, retransmit the correct P2MP data packet to the first node and start the timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; the communication module 1510 is also used to: receive the response message from the first node and turn off the timer.
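The NAK handling with a timer can be sketched as below. This is an illustration under stated assumptions: the `Retransmitter` class, its `send` callback, the `"ACK"` flag value, and the use of caller-supplied timestamps instead of a real timer are all hypothetical, and the passage does not say what happens when the timer expires, so the sketch only reports expired entries.

```python
class Retransmitter:
    """Third-node logic: retransmit on NAK and track a timer per (node, seq)."""

    def __init__(self, send, timeout=1.0):
        self.send = send        # callback send(node_id, packet), assumed
        self.timeout = timeout  # timer duration in seconds, assumed
        self.buffer = {}        # seq -> correct packet kept for retransmission
        self.timers = {}        # (node_id, seq) -> deadline

    def on_data(self, seq, packet):
        self.buffer[seq] = packet

    def on_response(self, node_id, seq, status, now):
        key = (node_id, seq)
        self.timers.pop(key, None)  # any response turns off a running timer
        if status == "NAK":
            self.send(node_id, self.buffer[seq])    # retransmit correct packet
            self.timers[key] = now + self.timeout   # start the timer

    def expired(self, now):
        return [key for key, deadline in self.timers.items() if deadline <= now]


sent = []
r = Retransmitter(lambda node_id, packet: sent.append((node_id, packet)))
r.on_data(7, b"payload")
r.on_response("node5", 7, "NAK", now=0.0)
```

A later response from the same node for the same sequence number removes the pending timer entry.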
- the P2MP data packet includes at least P2MP data and identification information of the second node;
- the processing module 1520 can update the management table of the second node according to the response message of the first node, and specifically, update the state flag of the first node receiving P2MP data in the management table of the second node according to the identification information of the first node and the state flag of the first node receiving P2MP data;
- the management table of the second node includes at least one field {key: value: state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving P2MP data.
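A management table with {key: value: state} fields could be updated from a response message as in this sketch. Storing the table as a list of dictionaries, the `"*"` wildcard, and the initial `"PENDING"` state are assumptions; the update matches entries on the first node's identification information, as the text describes.

```python
def update_management_table(table, first_node_id, state):
    """Set the state flag of every entry whose value is first_node_id.

    Returns the number of entries updated.
    """
    updated = 0
    for entry in table:
        if entry["value"] == first_node_id:
            entry["state"] = state
            updated += 1
    return updated


# Hypothetical management table: key is the sender's ID or a wildcard,
# value is a managed receiving node, state is that node's reception status.
management_table = [
    {"key": "node1", "value": "node5", "state": "PENDING"},
    {"key": "*", "value": "node6", "state": "PENDING"},
]
count = update_management_table(management_table, "node5", "NAK")
```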
- a logical interconnection network is established by connecting multiple nodes, and the connections between the multiple nodes include: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
- a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in the node group is less than or equal to a preset distance, the nodes in the node group are connected, and the third nodes in multiple node groups are connected.
- an embodiment of the present application provides an electronic device.
- the electronic device may include: a display screen; at least one memory for storing programs; at least one processor for executing the programs stored in the memory. Wherein, when the program stored in the memory is executed, the processor is used to execute the method described in the above embodiment.
- the electronic device may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, a server, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device and/or a smart city device.
- an embodiment of the present application provides a computer-readable storage medium, which stores a computer program.
- when the computer program runs on a processor, the processor executes the method in the above embodiment.
- an embodiment of the present application provides a computer program product.
- when the computer program product runs on a processor, the processor executes the method in the above embodiment.
- processors in the embodiments of the present application may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
- the general-purpose processor may be a microprocessor or any conventional processor.
- the method steps in the embodiments of the present application can be implemented by hardware or by a processor executing software instructions.
- the software instructions can be composed of corresponding software modules, which can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disks, mobile hard disks, CD-ROMs, or any other form of storage medium known in the art.
- An exemplary storage medium is coupled to a processor so that the processor can read information from the storage medium and write information to the storage medium.
- the storage medium can also be a component of the processor.
- the processor and the storage medium can be located in an ASIC.
- all or part of the embodiments may be implemented by software, hardware, firmware or any combination thereof.
- all or part of the embodiments may be implemented in the form of a computer program product.
- the computer program product includes one or more computer instructions.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
A reliable transmission method for P2MP data, which method is applied to a first node, wherein the first node is a node in a P2MP communication domain, the P2MP communication domain comprises a plurality of nodes, the first node is one of the plurality of nodes, and a logic interconnection network is established between the plurality of nodes. The method comprises: receiving a P2MP data packet sent by a second node in a P2MP communication domain, wherein the P2MP data packet at least comprises P2MP data, and the second node is one of a plurality of nodes; determining a neighbor node in a logic interconnection network according to a forwarding table of a first node; and sending a response message to the neighbor node according to the state of the receiving of the P2MP data, and the neighbor node executing a reliable transmission processing task with regard to the first node receiving the P2MP data. In this way, a logic interconnection network is established between a plurality of nodes in a P2MP communication domain, and the task, which is executed by only a source node, of reliable transmission processing of P2MP data is deployed on the plurality of nodes in the P2MP communication domain in a distributed manner, such that the single-point processing pressure can be effectively reduced.
Description
This application claims priority to a Chinese patent application filed with the State Intellectual Property Office of China on January 10, 2023, with application number 202310037879.6 and application name “A method and device for reliable transmission of P2MP data”, the entire contents of which are incorporated by reference into this application.
The present application relates to the field of communication network technology, and in particular to a reliable transmission method and device for P2MP data.
Point-to-multipoint (P2MP) communication refers to the communication between a source node that sends multicast data and a series of receiving nodes that receive the multicast data.
At present, mechanisms for guaranteeing reliable P2MP data transmission and reception usually either establish a reliable unicast connection at the transport layer between the source node of the P2MP data and each receiving node, and rely on that connection for reliability, or build a reliable transmission mechanism at the application layer on top of an unreliable datagram connection at the transport layer. However, when the number of communication nodes in the P2MP communication domain is large, the source node may consume excessive resources maintaining the reliable transmission mechanism on its own; and when the multicast source switches frequently within the P2MP communication domain, the overhead of establishing reliable transmission connections for P2MP data on the control plane may be too high. In either case, the reliability of multicast data transmission ultimately cannot be guaranteed. Therefore, an efficient mechanism for guaranteeing reliable P2MP data transmission is needed.
Summary of the invention
The present application provides a reliable transmission method, device, electronic device, computer-readable storage medium and computer program product for P2MP data, which can efficiently ensure the reliable transmission of P2MP data when the scale of communication nodes in the P2MP communication domain is large and when multicast source switching occurs frequently in the P2MP communication domain.
In a first aspect, the present application provides a reliable transmission method for P2MP data, which is applied to a first node, wherein the first node is a node in a P2MP communication domain, the P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information. The method includes: receiving a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; determining the identification information of a third node according to a forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node having a direct connection relationship in the logical interconnection network; and sending a response message to the third node according to a state of receiving the P2MP data and the identification information of the third node.
Therefore, by establishing a logical interconnection network among the multiple nodes in the P2MP communication domain, the task of reliable transmission processing of P2MP data, which would otherwise be performed only by the source node, is distributed across multiple nodes in the P2MP communication domain, which can effectively reduce the single-point processing pressure.
In a possible implementation, the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; sending a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node includes: when the state of receiving the P2MP data is erroneous, sending the response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and a state flag of receiving the P2MP data, the state flag of receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; the method also includes: receiving a correct P2MP data packet retransmitted by the third node.
In a possible implementation, the P2MP data packet includes at least P2MP data and identification information of the second node; determining the identification information of the third node according to the forwarding table of the first node includes: determining the identification information of the third node according to the forwarding table of the first node and the identification information of the second node; wherein the forwarding table of the first node contains at least one field {key: value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
In one possible implementation, a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
In a possible implementation, the logical interconnection network is established by connecting multiple nodes, and connecting the multiple nodes includes: the multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in a node group is less than or equal to a preset distance, connections are established between the nodes in the node group, and the first nodes in the multiple node groups are connected.
In a second aspect, the present application provides a reliable transmission method for P2MP data, which is applied to a third node, the third node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information. The method includes: receiving a response message from a first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least the identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, and the P2MP data packet is sent by a second node in a P2MP communication domain, and the second node is one of the multiple nodes; according to the response message of the first node, the management table of the third node is updated; the management table includes at least the status of the first node receiving P2MP data.
Therefore, by establishing a logical interconnection network among the multiple nodes in the P2MP communication domain, the task of reliable transmission processing of P2MP data, which would otherwise be performed only by the source node, is distributed across multiple nodes in the P2MP communication domain, which can effectively reduce the single-point processing pressure.
In a possible implementation, after receiving a response message from the first node, the method further includes: determining whether a status flag of the first node receiving P2MP data in the response message of the first node is NAK, and if so, retransmitting a correct P2MP data packet to the first node and starting a timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; receiving a response message from the first node, and stopping the timer.
In a possible implementation, the P2MP data packet includes at least P2MP data and identification information of the second node; updating the management table of the third node according to the response message of the first node includes: updating the state flag of the first node receiving the P2MP data in the management table of the third node according to the identification information of the first node and the state flag of the first node receiving the P2MP data; wherein the management table of the second node includes at least one field {key: value: state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving the P2MP data.
In one possible implementation, a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
In one possible implementation, a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in the node group is less than or equal to a preset distance, the nodes in the node group are connected, and the third nodes in multiple node groups are connected.
In a third aspect, the present application provides a reliable transmission device for P2MP data, which is deployed on a first node, the first node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information. The device includes: a communication module, which is used to receive a P2MP data packet sent by a second node in a P2MP communication domain; the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; a processing module, which is used to determine the identification information of a third node according to a forwarding table of the first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the processing module is also used to send a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node.
In a possible implementation, the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data; when the processing module sends a response message to the third node according to the state of receiving the P2MP data and the identification information of the third node, it is used to: when the state of receiving the P2MP data is erroneous, send a response message to the third node; the response message includes at least the identification information of the first node, the sequence number of the P2MP data and a state flag of receiving the P2MP data, and the state flag of receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; the communication module is also used to: receive a correct P2MP data packet retransmitted by the third node.
In a possible implementation, the P2MP data packet includes at least P2MP data and identification information of the second node; when the processing module determines the identification information of the third node according to the forwarding table of the first node, it is used to: determine the identification information of the third node according to the forwarding table of the first node and the identification information of the second node; wherein the forwarding table of the first node includes at least one field {key: value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
In one possible implementation, a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with the neighbor node, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
In one possible implementation, a logical interconnection network is established by connecting multiple nodes, and the multiple nodes are connected, including: multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in the node group is less than or equal to a preset distance, connections are established between the nodes in the node group, and the first nodes in the multiple node groups are connected.
In a fourth aspect, the present application provides a reliable transmission device for P2MP data, which is deployed on a third node, the third node is a node in a P2MP communication domain, a P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information. The device includes: a communication module, which is used to receive a response message from a first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node with a direct connection relationship in the logical interconnection network; the response message includes at least the identification information of the first node and a status flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, and the P2MP data packet is sent by the second node in a P2MP communication domain, and the second node is one of the multiple nodes; a processing module, which is used to update the management table of the third node according to the response message of the first node; the management table includes at least the status of the first node receiving P2MP data.
In a possible implementation, after receiving the response message from the first node, the processing module is further used to: determine whether the status flag of the first node receiving the P2MP data in the response message of the first node is NAK, and if so, retransmit the correct P2MP data packet to the first node and start the timer; wherein NAK indicates that the P2MP data received by the first node is incorrect; and the communication module is further used to: receive the response message from the first node and stop the timer.
In a possible implementation, the P2MP data packet includes at least P2MP data and identification information of the second node; when the processing module updates the management table of the second node according to the response message of the first node, it is used to: update the state flag of the first node receiving the P2MP data in the management table of the second node according to the identification information of the first node and the state flag of the first node receiving the P2MP data; wherein the management table of the second node includes at least one field {key: value: state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving the P2MP data.
In a possible implementation, the logical interconnection network is established by connecting the multiple nodes to one another, where connecting the multiple nodes includes: each of the multiple nodes has at least one neighbor node, each node establishes a direct connection with its neighbor nodes, and no direct connection is established between the neighbor nodes of any given node.
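For illustration, the connection rule above can be checked on a candidate topology as sketched below. The chain built by `build_chain` is only one topology that satisfies the rule, and both function names are hypothetical.

```python
def build_chain(node_ids):
    """Connect the nodes in a line: each node is linked to the next one."""
    neighbors = {n: set() for n in node_ids}
    for a, b in zip(node_ids, node_ids[1:]):
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def satisfies_rule(neighbors):
    """True if every node has a neighbor and no two neighbors of a node are linked."""
    for node, adjacent in neighbors.items():
        if not adjacent:
            return False  # a node without any neighbor violates the rule
        for a in adjacent:
            for b in adjacent:
                if a != b and b in neighbors[a]:
                    return False  # two neighbors of `node` are directly connected
    return True
```

For example, `satisfies_rule(build_chain(list(range(1, 9))))` holds for an eight-node chain, whereas three mutually connected nodes fail the check.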
In a possible implementation, the logical interconnection network is established by connecting the multiple nodes to one another, where connecting the multiple nodes includes: the multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in a node group is less than or equal to a preset distance, the nodes within each node group are connected to one another, and the third nodes of the multiple node groups are connected to one another.
In a fifth aspect, the present application provides an electronic device, including: at least one memory, configured to store a program; and at least one processor, configured to execute the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform the method described in the first aspect or any possible implementation of the first aspect, or to perform the method described in the second aspect or any possible implementation of the second aspect.
In a sixth aspect, the present application provides a computer-readable storage medium that stores a computer program. When the computer program runs on a processor, the processor is caused to perform the method described in the first aspect or any possible implementation of the first aspect, or to perform the method described in the second aspect or any possible implementation of the second aspect.
In a seventh aspect, the present application provides a computer program product. When the computer program product runs on a processor, the processor is caused to perform the method described in the first aspect or any possible implementation of the first aspect, or to perform the method described in the second aspect or any possible implementation of the second aspect.
It can be understood that, for the beneficial effects of the third to seventh aspects, reference may be made to the related descriptions of the first and second aspects; details are not repeated here.
FIG. 1 is a schematic diagram of a coding process for establishing an MPI communication domain;
FIG. 2 is a schematic diagram of typical MPI interfaces that use the P2MP communication model;
FIG. 3 is a flowchart of forwarding P2MP data by network-layer multicast and by application-layer multicast;
FIG. 4 is a schematic diagram of the salient characteristics of P2MP data forwarding within a P2MP communication domain;
FIG. 5 is a schematic diagram describing the problem of reliable forwarding of P2MP data;
FIG. 6a shows a physical interconnection network established between nodes in a P2MP communication domain according to an embodiment of the present application;
FIG. 6b shows a forwarding path of P2MP data in a physical interconnection network according to an embodiment of the present application;
FIG. 7a shows a logical interconnection network established between nodes in a P2MP communication domain according to an embodiment of the present application;
FIG. 7b shows a reliable-transmission confirmation relationship for P2MP data in a logical interconnection network according to an embodiment of the present application;
FIG. 8 shows a logical interconnection network established based on a fast-response policy according to an embodiment of the present application;
FIG. 9 is a flowchart of a method for reliable transmission of P2MP data according to an embodiment of the present application;
FIG. 10 is a flowchart of a method for reliable transmission of P2MP data according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a logical interconnection network and forwarding tables according to an embodiment of the present application;
FIG. 12 is a flowchart of a method for reliable transmission of P2MP data according to an embodiment of the present application;
FIG. 13a is a diagram of a network-layer multicast resource configuration based on RC connections in a P2MP communication domain according to an embodiment of the present application;
FIG. 13b is a schematic diagram of data-plane logical connections in a P2MP communication domain according to an embodiment of the present application;
FIG. 13c is a schematic diagram of control-plane logical connections in a P2MP communication domain according to an embodiment of the present application;
FIG. 14 is a schematic diagram of an apparatus for reliable transmission of P2MP data according to an embodiment of the present application;
FIG. 15 is a schematic diagram of an apparatus for reliable transmission of P2MP data according to an embodiment of the present application.
The term "and/or" herein describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A alone, both A and B, or B alone. The symbol "/" herein indicates an "or" relationship between the associated objects; for example, A/B means A or B.
The terms "first", "second", and the like in the specification and claims herein are used to distinguish between different objects, not to describe a particular order of the objects. For example, a first response message and a second response message are used to distinguish between different response messages, not to describe a particular order of the response messages.
In the embodiments of the present application, words such as "exemplary" or "for example" are used to present an example, an illustration, or an explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as being preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner.
In the description of the embodiments of the present application, unless otherwise specified, "multiple" means two or more. For example, multiple processing units means two or more processing units, and multiple elements means two or more elements.
Parallel computing is used extensively in service scenarios such as high performance computing (HPC) and artificial intelligence (AI). The message passing interface (MPI) is a set of specifications or interfaces for passing messages between the nodes, such as CPUs and servers, that take part in parallel computing. MPI supports a point-to-point (P2P) communication mode and a collective communication (CC) mode. Point-to-point communication supports communication between a pair of processes, whereas collective communication defines a designated process group in which all processes take part in global data processing and communication operations; each process is hosted on a node such as a CPU or a server. Based on different parallel computing requirement models, MPI builds multiple application program interfaces (APIs), and these MPI interfaces move, aggregate, or synchronize data through different communication modes.
For example, an HPC or AI application that uses MPI for message passing in collective communication mode usually first designates a group of related processes to establish an MPI communication domain. The processes in the communication domain jointly implement all or part of the functions of the HPC or AI application.
FIG. 1 shows a coding process for establishing MPI communication domains. As shown in FIG. 1, the code establishes communication domains comm1 and comm2 in the following steps. Step 1: specify the members of the communication domain, which are all processes of the system or a self-specified subset of them; for example, call the function MPI_Group_incl() to include the processes located in group1, or call the function MPI_Group_excl() to exclude the processes located in group2. Step 2: create the communication domain by calling the function MPI_Comm_create(). Step 3: send and receive multicast data by calling an MPI interface, MPI_Bcast().
For example, commonly used MPI interfaces include MPI_Bcast(), MPI_Allreduce(), MPI_Scatter(), and MPI_Reduce_scatter(). These typical MPI APIs make extensive use of P2MP communication.
FIG. 2 shows typical MPI interfaces that use the P2MP communication model. As shown in FIG. 2, the execution system builds, at the request of the application, an MPI communication domain containing four processes, which run on the hardware devices GPU0-GPU3 respectively. In the following description, "GPU0-GPU3" denotes the four corresponding members of the communication domain, which are also referred to as nodes or processes.
As shown in FIG. 2(a), the execution device of the application uses the MPI_Bcast interface and moves data within the MPI communication domain using the P2MP communication model of one sender and n receivers (1SnR). For example, one communication member in the communication domain, GPU0, is selected as the source node for data forwarding, and the other three communication members, GPU1-GPU3, act as receiving nodes. The source node GPU0 sends the data "A" that it has received and saved to the three receiving nodes GPU1-GPU3 in P2MP mode, and the receiving nodes GPU1-GPU3 save the received data "A". The hardware devices GPU0-GPU3 can then each perform different processing on the data "A" according to the allocation instructions of the execution device, so as to complete the computing task on the data "A" in parallel.
As shown in FIG. 2(b), the execution system of the application uses the MPI_Allreduce() interface and aggregates and moves data using the model of n senders and n receivers (nSnR), which can be decomposed into two steps. In the first step, a single node collects data, using the model of n senders and one receiver (nS1R). In the second step, the single node processes the collected data and sends the result to multiple receiving nodes in the communication domain, using the 1SnR P2MP communication model. Specifically, the communication members GPU0/GPU1/GPU2/GPU3 first each receive or create different data "A/B/C/D". In the first step, using the nS1R model, the communication members GPU1/GPU2/GPU3 are selected in turn as the source node for data forwarding, and the data of these communication members is collected at communication member GPU0. For example, GPU1 is selected first as the source node with GPU0 as the receiving node; the source node GPU1 then sends its saved data "B" to GPU0 in P2P communication mode, and GPU2 and GPU3 send their data in the same way. In the second step, after GPU0 has gathered the data "A", "B", "C", and "D" and processed it to obtain the data "A+B+C+D", the result is sent in P2MP mode. Specifically, GPU0 is selected as the source node and sends the data "A+B+C+D" in P2MP mode to all other communication members, GPU1-GPU3, in the P2MP communication domain. The hardware devices GPU0-GPU3 can then each perform different processing on the same data "A+B+C+D" according to the allocation instructions of the system, so as to complete the computing task on the data "A+B+C+D" in parallel.
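The two-step decomposition described above (collection at one node, then P2MP distribution) can be illustrated with a small simulation. This is a sketch only: the function `allreduce_via_root` is hypothetical, it models the processing as a sum, and it does not call MPI; actual MPI_Allreduce() implementations may use other algorithms.

```python
def allreduce_via_root(values, root=0):
    """Simulate the nS1R + 1SnR decomposition for a sum.

    values maps each rank to its local number; the result maps each rank
    to the value it holds after the operation.
    """
    # Step 1 (nS1R): every non-root rank sends its value to the root over P2P.
    collected = {root: values[root]}
    for rank, value in values.items():
        if rank != root:
            collected[rank] = value
    total = sum(collected.values())
    # Step 2 (1SnR): the root sends the processed result to all other ranks (P2MP).
    return {rank: total for rank in values}

if __name__ == "__main__":
    # GPU0..GPU3 hold A=1, B=2, C=3, D=4 (illustrative numbers).
    print(allreduce_via_root({0: 1, 1: 2, 2: 3, 3: 4}))
```

After the second step every rank holds the same result, which is the property of P2MP data that the reliable transmission method relies on.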
For example, FIG. 3 shows a flowchart of forwarding P2MP data by network-layer multicast and by application-layer multicast. P2MP data can be sent in several ways; typically, network-layer multicast or application-layer multicast forwarding is used. As shown in FIG. 3(a), with network-layer multicast forwarding, the multicast distribution network builds a multicast distribution tree (MDT) for each node in the P2MP domain that acts as a multicast source, and the network devices in each MDT forward the P2MP data from the source node to each receiving node, that is, each sink node. As shown in FIG. 3(b), with application-layer multicast forwarding, the P2MP data is split into multiple pieces of P2P data; the source node 1 sends P2P data multiple times, and the P2P data is forwarded by unicast to the receiving nodes 2-5. With application-layer multicast forwarding, some receiving nodes may take part in the forwarding; for example, after receiving the P2P data, the receiving node 2 forwards it on to other receiving nodes 2-3.
It can be understood that, in a communication domain that uses the P2MP communication model, whichever of the sending methods in FIG. 3(a) and FIG. 3(b) is used, service scenarios such as HPC and AI require that P2MP data sent by the source node reach the relevant receiving nodes reliably. In other words, a reliable-transmission guarantee mechanism is needed to manage the sending and receiving of P2MP data.
FIG. 4 shows the salient characteristics of P2MP data forwarding within a P2MP communication domain. As shown in FIG. 4(a), (b), and (c), different nodes may be selected as the source of P2MP data in different periods of the life cycle of the P2MP communication domain; that is, the multicast source is switched frequently within the P2MP communication domain. In addition, P2MP data sent within the P2MP communication domain must be received by all nodes other than the source node.
FIG. 5 describes the problem of reliable forwarding of P2MP data. When a reliable transmission mechanism is established, every receiving node in the P2MP communication domain sends an ACK/NAK response message to the source node to indicate whether it has received the P2MP data correctly. The source node of the P2MP communication domain therefore has to maintain, on the control plane, a connection with every receiving node, manage the reception state of every receiving node, and retransmit the P2MP data to each receiving node that sent a NAK response message. As the number of nodes in the P2MP communication domain grows, the resource consumption on the source node becomes very large. As shown in FIG. 5(a) and (b), the multicast data source in the P2MP communication domain is switched between time 1 and time 2 of the life cycle of the domain. It can be understood that any node in the P2MP communication domain may act as the P2MP source node, so the resource consumption problem of the source node extends to every node in the P2MP communication domain.
In the following description of this application, the hardware configuration, software design, and so on for parallel computing in a distributed system are explained using, as an example, a communication procedure in which the nodes of the distributed system use remote direct memory access (RDMA) technology while performing parallel computing. It can be understood that, if the nodes of the distributed system exchange data using another data access technology (for example, conventional direct-attached storage), similar hardware configuration and software design are also needed in the distributed system to implement parallel computing for the service program. In practice, RDMA transfers data over the network directly into the memory of a computer, moving data quickly from one system into the memory of a remote system. The basic communication unit of RDMA is the queue pair (QP). QPs can be deployed on smart network interface cards, channel adapters, and the like, and the QP-based service types include reliable connection (RC) and unreliable datagram (UD).
In one possible solution, a service model is established based on a service scenario such as HPC or AI, and the P2MP communication model is used between the MPI interfaces of parallel processing nodes such as CPUs and servers. To guarantee the reliability of P2MP data forwarding, different reliability mechanisms can be adopted depending on the scale of collective communication in the service system. For example, in a small-scale collective communication scenario, the system establishes a transport-layer RC connection between the source node of the P2MP data and each receiving node. An RC connection is similar to a TCP connection. In this case the P2MP data uses application-layer multicast and is forwarded as unicast data over the RC connections, while the transport-layer RC connections retransmit lost packets and thereby guarantee reliable transmission of the P2MP data. Alternatively, in a large-scale collective communication scenario, the system establishes a transport-layer UD connection between the source node of the P2MP data and each receiving node in order to save QP resources. A UD connection is very similar to a UDP connection. In this case P2MP uses network-layer multicast or application-layer multicast and forwards data packets by multicast or unicast over the UD connections, and reliable transmission of the P2MP data has to be achieved by operations such as packet-loss detection and retransmission requests at the application layer of the receiving node and retransmission of lost packets by the source node.
It can be seen that, in a communication domain that uses the P2MP communication model, a reliability mechanism based on RC or UD connections requires the control plane to establish, at the transport layer or the application layer, a response mechanism, a data verification mechanism, and an order-preserving mechanism between the source node and each receiving node, so as to guarantee correct forwarding of the P2MP data.
For example, to address the scalability problems caused by a large number of communication nodes in the P2MP communication domain or by frequent switching of the P2MP data source, the following solutions may be adopted. One solution is to establish RC connections on demand, that is, to create the connection between a source-node communication member and a receiving-node communication member only when P2MP communication is required. This saves QP resources on the hardware devices, but because establishing an RC connection takes time, it greatly reduces the transmission efficiency of P2MP data. Another solution is to establish the RC connections in advance so that the latency of establishing them after a source node switch stays low. Because it cannot be determined which node will send P2MP data, a P2MP communication domain containing N nodes that uses a Full-Mesh connection form requires N*(N-1)÷2 RC connections to be established in advance. RC connections consume limited hardware resources on the hardware devices, and the member processes of a communication domain may be concentrated on one PC host or server. RC connections then accumulate on a single physical endpoint and exhaust its resources, so that reliability ultimately cannot be guaranteed.
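The growth of the Full-Mesh connection count can be made concrete with a short calculation. The function names are hypothetical, and the chain (each node linked only to the next) is included purely as an assumed sparse topology for comparison.

```python
def full_mesh_connections(n):
    """Connections needed when every pair of n nodes is connected: N*(N-1)/2."""
    return n * (n - 1) // 2

def chain_connections(n):
    """Connections needed when each node is linked only to the next one."""
    return max(n - 1, 0)

if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, full_mesh_connections(n), chain_connections(n))
```

In a Full-Mesh each node also terminates N-1 connections itself, which is why co-locating many member processes on one host concentrates the resource demand on that host.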
In view of this, an embodiment of the present application provides a method for reliable transmission of P2MP data. Based on the characteristic that every receiving node in a P2MP communication domain receives the same data, a logical interconnection network is established among the multiple nodes in the P2MP communication domain, and part of the reliable-transmission processing that would otherwise be performed by the source node is distributed across multiple nodes in the P2MP communication domain. This effectively relieves resource pressure on a single node and avoids a single-node processing bottleneck. In addition, because multiple nodes jointly maintain reliable transmission, Full-Mesh connections between the nodes in the P2MP communication domain are not needed, which saves the control-plane overhead of guaranteeing reliable transmission that frequent source node switching would otherwise cause.
Reliable transmission of P2MP data within the P2MP domain can be guaranteed when an effective response mechanism, data verification mechanism, and order-preserving mechanism are implemented. Because every receiving node in the P2MP communication domain receives the same data, it suffices to establish effective connections between the nodes at the signaling level only and to design a reliable transmission scheme on top of them in order to implement the response mechanism (including sending response messages), the data verification mechanism, and the order-preserving mechanism (including packet-loss requests and data retransmission). The reliable transmission method provided in the embodiments of the present application decouples the forwarding of P2MP data from the confirmation of its reliable transmission. A physical interconnection network, that is, an interconnection network built from network devices, is established between all nodes in the P2MP communication domain to forward P2MP data, and a logical interconnection network is established between all nodes in the P2MP communication domain to determine the reliable-transmission confirmation relationships for the P2MP data.
For example, FIG. 6a shows a physical interconnection network established between nodes in a P2MP communication domain according to an embodiment of the present application. Specifically, the P2MP communication domain contains eight nodes, each with its own identification information, identified as nodes 1-8. Based on the Spine-Leaf network architecture, a physical interconnection network with a two-tier topology is established between the eight nodes, and P2MP data sent by any node can be forwarded through the Spine-Leaf network to reach the other seven nodes in the P2MP communication domain. FIG. 6b shows a forwarding path of P2MP data in the physical interconnection network according to an embodiment of the present application. Based on the physical interconnection network shown in FIG. 6a, FIG. 6b shows the P2MP data forwarding path with node 1 as the multicast source; node 1 can send the data using network-layer multicast (the source node sends the P2MP data once), and the data is forwarded along the forwarding path in FIG. 6b.
For example, FIG. 7a shows a logical interconnection network established between nodes in a P2MP communication domain according to an embodiment of the present application. Specifically, based on the P2MP communication domain and the physical interconnection network shown in FIG. 6a, a logical interconnection network is established between the eight nodes. In this logical interconnection network, every node except the first and last nodes is directly connected to two other nodes; two nodes that have a direct connection relationship are neighbor nodes of each other, and a node may have multiple neighbor nodes. Within the P2MP communication domain, if a receiving node receives the P2MP data correctly, it sends an ACK response message to its neighbor node to confirm correct reception; if a receiving node detects that the received P2MP data is erroneous, it sends a NAK response message to its neighbor node to report that the P2MP data was not received correctly, and the neighbor node retransmits the P2MP data. FIG. 7b shows a reliable-transmission confirmation relationship for P2MP data in the logical interconnection network according to an embodiment of the present application. Based on the logical interconnection network shown in FIG. 7a, FIG. 7b shows, for the case in which node 1 sends P2MP data as the multicast source, the neighbor node to which each receiving node sends its response message after it has received the P2MP data correctly or detected that the data is erroneous. For example, after node 6 detects that the P2MP data is erroneous, it sends a NAK response message to node 5, and node 5 retransmits the P2MP data to node 6. In this way, the task of managing reliable transmission of P2MP data to receiving node 6, which would otherwise fall to source node 1, is offloaded to node 5, which effectively relieves resource pressure on a single node and avoids a single-node processing bottleneck. It can be understood that, because every receiving node in the P2MP communication domain receives the same P2MP data, node 5 can retransmit the P2MP data to receiving node 6 in place of source node 1. In addition, in the logical interconnection network shown in FIG. 7a, the receiving nodes 2-7 in the P2MP communication domain do not all need to establish a connection with source node 1, which saves the control-plane overhead of guaranteeing reliable transmission that frequent source node switching would otherwise cause.
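The neighbor-based confirmation described above can be sketched with a small simulation over a chain-shaped logical network. The function `confirm_over_chain` is hypothetical, and two points are assumptions of the sketch rather than statements of the embodiment: each receiving node reports to its neighbor on the source side of the chain, and nodes are processed in order of distance from the source so that a neighbor holds correct data before it retransmits.

```python
def confirm_over_chain(chain, source, received, expected):
    """Simulate ACK/NAK confirmation between neighbors on a chain.

    chain:    node ids in logical order, e.g. [1, 2, ..., 8]
    received: dict node -> payload each node currently holds (None if erroneous)
    Returns a list of (node, neighbor, "ACK" | "NAK") response messages.
    """
    src = chain.index(source)
    # Visit receivers by increasing distance from the source.
    order = sorted((i for i in range(len(chain)) if i != src),
                   key=lambda i: abs(i - src))
    responses = []
    for i in order:
        node = chain[i]
        # Assumption: report to the neighbor on the source side of the chain.
        neighbor = chain[i - 1] if i > src else chain[i + 1]
        if received[node] == expected:
            responses.append((node, neighbor, "ACK"))
        else:
            responses.append((node, neighbor, "NAK"))
            # The neighbor retransmits its own copy of the P2MP data.
            received[node] = received[neighbor]
    return responses

if __name__ == "__main__":
    nodes = list(range(1, 9))
    data = {n: "A" for n in nodes}
    data[6] = None  # node 6 received erroneous data
    for message in confirm_over_chain(nodes, source=1, received=data, expected="A"):
        print(message)
```

In this run the source node 1 handles a response from node 2 only, while the NAK from node 6 is handled by node 5.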
In this embodiment, the system builds a parallel computing demand model according to the execution requirements of the application and allocates different CPUs or server nodes to take part in the parallel computation. Based on the demand model, MPI forms a communication domain based on the P2MP communication model for all or some of the tasks. The communication domain contains multiple nodes, each hosted on a computing processing unit. These computing processing units may be located in the same or different computers, servers, clusters, or storage devices, and include smart network interface cards, channel adapters, and the like. The computing processing units hosting these communication members can be physically linked through gateways, routers, and the like; that is, a physical interconnection network is built between the members of the P2MP communication domain to forward P2MP data. On this basis, to implement a reliable transmission guarantee mechanism for P2MP data, a logical interconnection network must be established between the members of the P2MP communication domain to determine the reliable transmission confirmation relationship for P2MP data.
In the P2MP communication domain, P2MP data sent by any multicast source can be forwarded to all other receiving nodes through the physical interconnection network. The physical interconnection network may forward P2MP data using either a network-layer multicast mechanism or an application-layer multicast mechanism, chosen according to the requirements of the application scenario. The logical interconnection network establishes the adjacency relationships between the nodes in the P2MP communication domain. Based on these adjacency relationships, each node in the P2MP communication domain reports the status of the P2MP data it received from the physical interconnection network to a neighbor node in the logical interconnection network as a reliable transmission confirmation, and the neighbor node is responsible for retransmitting the P2MP data.
In one example, as shown in FIG. 6a, there are eight nodes in the P2MP communication domain. Besides the networking mode shown in FIG. 7a, the logical interconnection network between the eight nodes can take many other forms, each giving a different set of connections between the eight nodes. Establishing connections between multiple nodes in the P2MP communication domain is constrained by different policies. For example, the total number of connections may need to be minimized to save the resources required to establish them, or the physical distance of each connection may need to be minimized so that neighbor nodes can notify and respond to each other quickly. The logical interconnection network between the nodes in the P2MP communication domain therefore needs to be built according to the chosen policy and goal.
If the logical interconnection network is established based on a fast response policy, the nodes can be divided into multiple node groups according to physical distance, such that the physical distance between nodes within a node group is less than or equal to a preset distance. Connections are established between the nodes within each node group, and one node in a node group is connected to one node in another node group, so that all the nodes are interconnected. FIG. 8 shows a logical interconnection network established based on the fast response policy according to an embodiment of the present application. Assume that certain nodes are physically close to one another, for example because they sit in the same point of delivery (PoD), the basic physical design unit of a data center. A PoD comprises servers, the access network, aggregation network cabinets, and their supporting facilities, and forms one region of the overall network, which includes multiple PoDs. For example, nodes 1-3 are deployed in one PoD, nodes 4-6 in a second PoD, and nodes 7-8 in a third PoD.
Under the fast response policy, connections are first established between the nodes within each PoD, and then one node is selected from each PoD, such as nodes 3, 6, and 7, to establish connections between the three PoDs. This makes the most of the physical adjacency of the nodes: reliable transmission control signaling can be processed nearby and answered quickly. However, the specific nodes 3, 6, and 7 that are responsible for inter-PoD connections carry more connections, so a node should not take on a connection-heavy role in multiple P2MP communication domains, and connection demand should not be allowed to concentrate on a single node. For example, if the P2MP communication domain also includes a fourth PoD containing nodes 9-12, connected as shown in FIG. 8, then when node 10 is selected to connect to nodes in other PoDs, it should preferably be connected to nodes 3 and 7 rather than node 6, so that node 6 does not accumulate too many connections and become a single-point processing bottleneck.
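The fast response policy can be sketched in Python as follows. This is a minimal illustration, not the patented procedure: the helper names (`connect`, `build_pod_network`, `attach_pod`) are hypothetical, nodes inside a PoD and the selected inter-PoD nodes are assumed to be chained in list order, and a new PoD is assumed to attach to the single existing inter-PoD node that currently has the fewest connections.

```python
from collections import defaultdict


def connect(neighbors, a, b):
    """Record a bidirectional logical connection between nodes a and b."""
    neighbors[a].add(b)
    neighbors[b].add(a)


def build_pod_network(pods, gateways):
    """pods: list of node-id lists; gateways: one selected node per PoD."""
    neighbors = defaultdict(set)
    for members in pods:
        # Assumption: nodes inside a PoD are chained in list order.
        for a, b in zip(members, members[1:]):
            connect(neighbors, a, b)
    # Assumption: the selected nodes are chained in list order across PoDs.
    for a, b in zip(gateways, gateways[1:]):
        connect(neighbors, a, b)
    return neighbors


def attach_pod(neighbors, members, gateway, existing_gateways):
    """Add a PoD and link its gateway to the least-connected existing gateway."""
    for a, b in zip(members, members[1:]):
        connect(neighbors, a, b)
    # Ties are broken by the lower node id.
    target = min(existing_gateways, key=lambda g: (len(neighbors[g]), g))
    connect(neighbors, gateway, target)
    return target


network = build_pod_network([[1, 2, 3], [4, 5, 6], [7, 8]], gateways=[3, 6, 7])
chosen = attach_pod(network, [9, 10, 11, 12], gateway=10, existing_gateways=[3, 6, 7])
```

Under these assumptions node 6 already holds three connections while nodes 3 and 7 hold two each, so node 10 is attached to node 3 and node 6 is left alone, in line with the load-spreading goal described above.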
If the logical interconnection network is established based on a connection-count control policy, each node in the P2MP communication domain establishes direct connections with its neighbor nodes, and no direct connection is established between the neighbor nodes of any given node. For example, a chain topology can connect all the nodes in the P2MP communication domain, so that each node needs connections with at most two neighbor nodes, the total number of connections is minimized, and the connections are spread evenly across the P2MP communication domain. As shown in FIG. 7a, a logical interconnection network is established between the eight nodes based on the connection-count control policy, and the total number of connections between the eight nodes is minimal; details are not repeated here.
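As a small sketch of the chain topology, the Python below builds the neighbor sets for eight nodes and counts the connections. The function name `build_chain` and the ordering of nodes 1 to 8 along the chain are assumptions made for illustration.

```python
def build_chain(node_ids):
    """Return {node: set of neighbors} for a chain over node_ids, in order."""
    neighbors = {n: set() for n in node_ids}
    for a, b in zip(node_ids, node_ids[1:]):
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


chain = build_chain(list(range(1, 9)))
# Each connection is counted once from each end, so halve the sum.
total_connections = sum(len(v) for v in chain.values()) // 2
max_neighbors = max(len(v) for v in chain.values())
```

With eight nodes the chain uses seven connections, the minimum that still links every node, and no node has more than two neighbors.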
At this point, a communication connection (CC) is established between every two nodes that are neighbors in the logical interconnection network. This communication connection carries the reliable transmission response messages exchanged between the neighbor nodes. The communication connections between neighbor nodes are created when the logical interconnection network is first established and are used throughout the life cycle of the P2MP communication domain.
Based on the content shown in FIGS. 6-8, a reliable transmission method for P2MP data according to an embodiment of the present application is described below. The method can be performed by any apparatus, device, platform, or device cluster with computing and processing capabilities.
For example, FIG. 9 shows a flowchart of a reliable transmission method for P2MP data according to an embodiment of the present application. Assume that the P2MP communication domain includes at least a first node, a second node, and a third node, and that a logical interconnection network is established between the nodes in the P2MP communication domain. In the logical interconnection network, the first node and the third node are neighbor nodes, and the second node and the third node are neighbor nodes. FIG. 9 illustrates the confirmation flow for reliable transmission of P2MP data in the P2MP communication domain at a moment in the domain's life cycle when the second node is the source node sending P2MP data and the first node and the third node are receiving nodes. The flow includes the following steps S901-S905:
Step S901: The second node sends a P2MP data packet in the P2MP communication domain. The P2MP data packet includes at least the P2MP data, the sequence number of the P2MP data, and the identification information of the second node (the source node).
Step S902: The first node receives the P2MP data packet sent by the second node and determines the identification information of the third node according to the forwarding table of the first node. Assume here that the first node determines that the received P2MP data is erroneous. The first node sends a response message to the third node. The response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the first node, and the identification information of the second node (the source node). In this case, the status flag is set to NAK, indicating that the received P2MP data is erroneous.
Step S903: The third node receives the P2MP data sent by the second node and determines the identification information of the second node according to the forwarding table of the third node. Assume here that the third node determines that the received P2MP data is correct. The third node sends a response message to the second node. The response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the third node, and the identification information of the second node (the source node). In this case, the status flag is set to ACK, indicating that the P2MP data was received correctly.
It should be noted that steps S902 and S903 are not ordered relative to each other; the actual order depends on the physical interconnection network established between the nodes in the P2MP communication domain.
Step S904: The third node receives the NAK response message sent by the first node, updates the status of the first node's reception of the P2MP data in its management table, and retransmits the correct P2MP data to the first node.
Step S905: The first node receives the P2MP data retransmitted by the third node and determines the identification information of the third node according to the forwarding table of the first node. Assume here that the first node determines that the received P2MP data is correct. The first node sends a response message to the third node. The response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of the first node, and the identification information of the second node (the source node). In this case, the status flag is set to ACK, indicating that the P2MP data was received correctly.
In steps S901-S905 above, each node in the P2MP communication domain has a forwarding table and a management table. The forwarding table is used, after a P2MP data packet sent by the source node is received, to determine the neighbor node to which the response message is sent. The management table is used, after a response message from a managed receiving node is received, to record the updated status of that managed receiving node's reception of the P2MP data.
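Steps S901-S905 can be walked through with a small in-memory sketch. Everything named here (`P2MPPacket`, `Response`, `Node`, the `ok` flag that stands in for whatever integrity check a node applies, and the layout of the two tables) is a hypothetical illustration of the message fields listed above, not the actual implementation.

```python
from dataclasses import dataclass, field

ACK, NAK = "ACK", "NAK"


@dataclass(frozen=True)
class P2MPPacket:
    source_id: int   # identification information of the source node
    seq: int         # sequence number of the P2MP data
    data: bytes


@dataclass(frozen=True)
class Response:
    seq: int
    status: str      # ACK or NAK
    sender_id: int   # node reporting its reception status
    source_id: int


@dataclass
class Node:
    node_id: int
    forwarding_table: dict                                 # source id -> neighbor id
    management_table: dict = field(default_factory=dict)   # (sender, source, seq) -> status
    received: dict = field(default_factory=dict)           # (source, seq) -> packet

    def on_packet(self, packet, ok=True):
        """Handle a data packet; return (neighbor id, response to send)."""
        if ok:
            self.received[(packet.source_id, packet.seq)] = packet
        neighbor = self.forwarding_table[packet.source_id]
        status = ACK if ok else NAK
        return neighbor, Response(packet.seq, status, self.node_id, packet.source_id)

    def on_response(self, resp):
        """Update the management table; return a packet to retransmit on NAK."""
        self.management_table[(resp.sender_id, resp.source_id, resp.seq)] = resp.status
        if resp.status == NAK:
            return self.received.get((resp.source_id, resp.seq))
        return None


first = Node(1, {2: 3})     # reports to the third node when node 2 is the source
second = Node(2, {})        # the source node
third = Node(3, {2: 2})     # reports to the source itself

packet = P2MPPacket(source_id=2, seq=1, data=b"payload")   # S901
to_first, nak = first.on_packet(packet, ok=False)          # S902
to_third, ack = third.on_packet(packet, ok=True)           # S903
second.on_response(ack)
retransmitted = third.on_response(nak)                     # S904
_, final_ack = first.on_packet(retransmitted, ok=True)     # S905
third.on_response(final_ack)
```

After the run, the third node's management table shows ACK for the first node, and the source's management table shows ACK for the third node only, which is the division of responsibility the flow describes.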
It can be understood that the confirmation flow shown in FIG. 9 is only one possible reliable transmission processing flow; many other flows are possible in the P2MP communication domain.
For example, FIG. 10 shows a flowchart of a reliable transmission method for P2MP data according to an embodiment of the present application, applied to a first node. The first node is one of multiple nodes in a P2MP communication domain, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
As shown in FIG. 10, the reliable transmission method for P2MP data includes the following steps S1010-S1030.
Step S1010: Receive a P2MP data packet sent by a second node in the P2MP communication domain. The P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes.
Step S1020: Determine identification information of a third node according to the forwarding table of the first node. In the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node. A neighbor node is a node with a direct connection in the logical interconnection network.
Step S1030: Send a response message to the third node according to the status of receiving the P2MP data and the identification information of the third node. The response message is either an ACK message or a NAK message: ACK indicates that the P2MP data was received correctly, and NAK indicates that the received P2MP data is erroneous.
In one example, the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data. Sending a response message to the third node according to the status of receiving the P2MP data and the identification information of the third node includes: when the status of receiving the P2MP data is erroneous, sending a response message to the third node, where the response message includes at least the identification information of the first node, the sequence number of the P2MP data, and the status flag for receiving the P2MP data, and the status flag is set to NAK, indicating that the received P2MP data is erroneous. The method further includes: receiving the correct P2MP data packet retransmitted by the third node.
In one example, as shown in FIG. 7b, the P2MP communication domain includes eight nodes, a logical interconnection network is established between the eight nodes, and each node has its own identification information. For ease of description, node 1 is regarded as the second node, node 5 as the first node, and node 6 as the third node to illustrate the reliable transmission method for P2MP data. At a particular moment in the life cycle of the P2MP communication domain, node 1 is selected as the multicast source sending P2MP data in the P2MP communication domain, and nodes 2-8 are the receiving nodes. It can be understood that any of the eight nodes in the P2MP communication domain may act as the source node sending P2MP data. To establish a reliable transmission mechanism, some additional information must be added to the P2MP data before it is sent to form a P2MP data packet, such as the identification information of the source node and the sequence number of the P2MP data.
In the logical interconnection network shown in FIG. 7b, node 6 has two neighbor nodes, node 5 and node 7. When node 6 receives a P2MP data packet sent by node 1 in the P2MP communication domain, it determines the identification information of node 5 according to its forwarding table and then sends a response message to node 5 according to the status of receiving the P2MP data. The response message includes at least the sequence number of the P2MP data, the status of receiving the P2MP data, the identification information of node 6, and the identification information of node 1. The status flag is set to ACK, indicating that the P2MP data was received correctly, or to NAK, indicating that the received P2MP data is erroneous. When the response message sent is a NAK, node 6 also needs to receive the correct P2MP data retransmitted by node 5.
In practice, the first node may send a single response message to the third node after it has received all the P2MP data packets sent by the second node, or it may send a response message to the third node after each P2MP data packet, or each fixed number of P2MP data packets, received from the second node. The response sending mode is selected in the control plane of the system. A fixed response sending mode can be configured for the entire life cycle of the P2MP communication domain, or the mode can be switched in real time according to network congestion. For ease of description, this application assumes that a response message is sent for every P2MP data packet received; the other response sending modes are implemented similarly and are not described again here.
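The choice between responding per packet and responding once per fixed number of packets can be sketched as a small policy object. The class name `ResponseBatcher`, the `every_n` parameter, and the idea that a batched response carries a list of (sequence number, status) pairs are all assumptions for illustration; the text above does not specify the format of a batched response.

```python
class ResponseBatcher:
    """Collect per-packet reception statuses and release them every_n at a time."""

    def __init__(self, every_n=1):
        self.every_n = every_n
        self.pending = []

    def on_packet(self, seq, ok):
        """Record one reception result; return a batch when it is time to respond."""
        self.pending.append((seq, "ACK" if ok else "NAK"))
        if len(self.pending) >= self.every_n:
            batch, self.pending = self.pending, []
            return batch
        return None


per_packet = ResponseBatcher(every_n=1)
immediate = per_packet.on_packet(seq=1, ok=True)

per_two = ResponseBatcher(every_n=2)
held = per_two.on_packet(seq=1, ok=True)
released = per_two.on_packet(seq=2, ok=False)
```

Switching modes at run time, for example in response to congestion, would then amount to changing `every_n` from the control plane.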
FIG. 11 shows a logical interconnection network and forwarding tables according to an embodiment of the present application. As shown in FIG. 11(a), the P2MP communication domain includes eight nodes, and a logical interconnection network is established between the nodes. A node may have multiple neighbor nodes. Therefore, after the logical interconnection network is established, a forwarding table must be built for each node; the node uses the forwarding table to select one neighbor node with which to exchange reliable transmission control signaling (ACK/NAK response messages) and handle retransmission.
In one example, establishing the forwarding table mainly involves two considerations:
First, because the source node sending P2MP data switches frequently during the life cycle of the P2MP communication domain, the effect of source node switching can be taken into account: the forwarding table is built with the identification information of the source node as the index, and the corresponding neighbor node is selected for each source node. There are two cases. In one case, a node has only one neighbor: for example, node 8 in the logical interconnection network shown in FIG. 11(a) has only neighbor node 7, so whichever of nodes 1-7 is the source node, node 7 must be selected as the neighbor node for reliable transmission signaling. In the other case, a node has more than one neighbor: for example, node 6 has two neighbor nodes, 5 and 7, so its forwarding table can be built with the identification information of the different source nodes as index entries.
Alternatively, the effect of source node switching can be ignored, and a fixed neighbor node can be designated for reliable transmission signaling when a node's forwarding table is built. Across all the nodes in the P2MP communication domain, the two approaches can be mixed: for some nodes the forwarding table takes source node switching into account and includes the identification information of the source node, while for the other nodes it does not and a fixed neighbor node is designated for reliable transmission signaling. It is also possible to take source node switching into account for all nodes, or for none.
In either case, the forwarding table of any node in the P2MP communication domain can be built from {key: value} entries, where the key is the identification information of a source node or a wildcard, and the value is the identification information of the neighbor node used for reliable transmission signaling. The wildcard means that source node switching is not taken into account: the identification information of any source node matches the wildcard.
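The {key: value} forwarding table with a wildcard maps naturally onto a dictionary lookup. In the sketch below, the `WILDCARD` marker, the function name `lookup_neighbor`, and the specific entries for node 6 are hypothetical; the entries simply assume a chain of nodes 1 to 8 in which node 6 reports toward whichever side the source is on.

```python
WILDCARD = "*"


def lookup_neighbor(forwarding_table, source_id):
    """Return the neighbor to report to for this source; fall back to the wildcard."""
    if source_id in forwarding_table:
        return forwarding_table[source_id]
    return forwarding_table[WILDCARD]


# Node 6 has neighbors 5 and 7, so its table is indexed by source node (assumed entries).
node6_table = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 7: 7, 8: 7}
# Node 8 has only neighbor 7, so a single wildcard entry covers every source.
node8_table = {WILDCARD: 7}

neighbor_for_source_1 = lookup_neighbor(node6_table, 1)
neighbor_for_source_8 = lookup_neighbor(node6_table, 8)
node8_neighbor = lookup_neighbor(node8_table, 3)
```

A table may also mix explicit keys with a wildcard entry, in which case the explicit key wins in this sketch.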
Second, after the logical interconnection network is established between the nodes in the P2MP communication domain, the forwarding tables must also ensure that the neighbor nodes exchanging reliable transmission signaling (including the source node) are effectively connected along the path on which response messages are sent.
FIG. 11(b) gives the forwarding tables of node 2 and node 3. Both forwarding tables take source node switching into account. The forwarding table of node 2 lists the identification information of the neighbor node that node 2 selects for sending the response message when the source node is node 1. In general, if a node's forwarding table takes source node switching into account, the key entries must enumerate all the nodes in the P2MP communication domain, and the corresponding neighbor node is filled in as the value of each entry. FIG. 11(b) also shows reliable transmission signaling when node 1 is the source node and nodes 2-8 are receiving nodes. As can be seen from FIG. 11(b), nodes 1-7 each handle signaling as the neighbor node of another receiving node, and the neighbor information configured in the forwarding tables gives nodes 1-7 an effective connection along the response message path. Nodes 1-7 can therefore obtain the correct P2MP data and retransmit it to their corresponding receiving nodes in place of source node 1.
FIG. 11(c) gives the forwarding tables of node 2 and node 3. Both forwarding tables take source node switching into account. FIG. 11(c) also shows reliable transmission signaling when node 1 is the source node and nodes 2-8 are receiving nodes. As can be seen from FIG. 11(c), node 1 and nodes 3-7 each handle signaling as the neighbor node of another receiving node, but under this forwarding table configuration node 1 and nodes 3-7 are not effectively connected along the response message path, so the logical interconnection network splits into two isolated islands. Therefore, if nodes 3 and 4 both fail to receive the P2MP data sent by node 1 correctly because of a network fault, they cannot obtain a retransmission of the correct P2MP data packet and cannot retransmit the correct P2MP data to the receiving nodes they manage, so there is no guarantee that every receiving node in the P2MP communication domain receives the P2MP data correctly.
FIG. 11(d) gives the forwarding tables of node 2 and node 3. The forwarding table of node 3 takes source node switching into account, whereas the forwarding table of node 2 does not: the neighbor node to which node 2 sends its response message is configured with a wildcard. FIG. 11(d) also shows reliable transmission signaling when node 1 is the source node and nodes 2-8 are receiving nodes. As can be seen from FIG. 11(d), nodes 2-7 each handle signaling as the neighbor node of another receiving node, but under this forwarding table configuration nodes 2-7 are not connected to node 1 on the response message path, so the logical interconnection network splits into two isolated islands. Therefore, if nodes 2 and 3 both fail to receive the P2MP data sent by node 1 correctly because of a network fault, they cannot obtain a retransmission of the correct P2MP data packet and cannot retransmit the correct P2MP data to the receiving nodes they manage, so there is no guarantee that every receiving node in the P2MP communication domain receives the P2MP data correctly.
Therefore, the forwarding tables of node 2 and node 3 shown in FIG. 11(c) and FIG. 11(d) do not meet the requirements for forwarding table configuration in this application and need to be modified accordingly.
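One way to detect the isolated islands described above is to follow each receiver's response path through the forwarding tables and check that it ends at the source. The function name `unreachable_receivers` and the example tables are hypothetical; in particular, the broken configuration below only assumes that nodes 3 and 4 report to each other, which is one configuration that would produce the behavior described for FIG. 11(c), not a transcription of the figure.

```python
def unreachable_receivers(tables, source_id, wildcard="*"):
    """Return receivers whose chain of response neighbors never reaches the source."""
    bad = []
    for node in tables:
        if node == source_id:
            continue
        seen = set()
        current = node
        # Walk neighbor by neighbor until the source, a loop, or a dead end is hit.
        while current != source_id and current not in seen:
            seen.add(current)
            table = tables.get(current, {})
            current = table.get(source_id, table.get(wildcard))
        if current != source_id:
            bad.append(node)
    return sorted(bad)


# Chain 1-2-...-8 with node 1 as source: every node reports to its predecessor.
valid_tables = {n: {1: n - 1} for n in range(2, 9)}

# Assumed faulty variant: nodes 3 and 4 report to each other.
broken_tables = dict(valid_tables)
broken_tables[3] = {1: 4}
broken_tables[4] = {1: 3}

ok_result = unreachable_receivers(valid_tables, source_id=1)
broken_result = unreachable_receivers(broken_tables, source_id=1)
```

In the faulty variant only node 2 still has a response path to node 1; nodes 3-8 form a separate island, so a loss at nodes 3 and 4 could never be repaired.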
Thus, by establishing a logical interconnection network between the nodes in the P2MP communication domain, the task of handling reliable transmission of P2MP data, previously performed by the source node alone, is distributed across multiple nodes in the P2MP communication domain, which effectively relieves single-point processing pressure.
For example, FIG. 12 shows a flowchart of a reliable transmission method for P2MP data according to an embodiment of the present application, applied to a third node. The third node is one of multiple nodes in a P2MP communication domain, a logical interconnection network is established between the multiple nodes, and each node has its own identification information.
As shown in FIG. 12, the reliable transmission method for P2MP data includes the following steps S1210-S1220.
Step S1210: Receive a response message from the first node. In the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node. A neighbor node is a node with a direct connection in the logical interconnection network. The response message includes at least the identification information of the first node and the status flag indicating the first node's reception of the P2MP data. The P2MP data is contained in a P2MP data packet, the P2MP data packet is sent by a second node in the P2MP communication domain, and the second node is one of the multiple nodes.
Step S1220: Update the management table of the third node according to the response message from the first node. The management table includes at least the status of the first node's reception of the P2MP data.
在一个示例中,第三节点根据第一节点的应答消息更新管理表中关于第一节点接收P2MP数据的状态。In an example, the third node updates the status of the first node receiving the P2MP data in the management table according to the response message of the first node.
In one example, after the response message from the first node is received, the reliable transmission method for P2MP data further includes: determining whether the status flag in the response message is NAK, where NAK indicates that the P2MP data received by the first node is erroneous; if so, retransmitting the correct P2MP data packet to the first node and starting a timer; and, on receiving a further response message from the first node, stopping the timer. Likewise, when the third node receives the further response message from the first node, it again updates the first node's status of receiving the P2MP data in the management table.
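The NAK handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the class and field names are hypothetical, the `ACK` label is assumed (the text names only NAK), and the sequence number in the reply is borrowed from the embodiment in which the response message carries the sequence number of the P2MP data.

```python
import time
from dataclasses import dataclass, field

ACK, NAK = "ACK", "NAK"  # "ACK" is an assumed label; the source names only NAK


@dataclass
class Reply:
    node_id: int  # identification information of the first node
    seq: int      # sequence number of the P2MP data (assumed to be present)
    state: str    # status flag of receiving the P2MP data


@dataclass
class ThirdNode:
    packets: dict                               # seq -> buffered correct payload
    status: dict = field(default_factory=dict)  # management table: node_id -> state
    timers: dict = field(default_factory=dict)  # (node_id, seq) -> timer start time
    sent: list = field(default_factory=list)    # record of retransmissions

    def on_reply(self, reply: Reply) -> None:
        # Update the management table on every reply, including repeated ones.
        self.status[reply.node_id] = reply.state
        key = (reply.node_id, reply.seq)
        # Any reply for this packet stops a timer that is still running.
        self.timers.pop(key, None)
        if reply.state == NAK:
            # Retransmit the correct packet and start a timer for the next reply.
            self.sent.append((reply.node_id, reply.seq, self.packets[reply.seq]))
            self.timers[key] = time.monotonic()
```

In this sketch a second NAK simply triggers another retransmission and a fresh timer; what happens when the timer expires is left out.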
In one example, in the P2MP communication domain, both the third node (a receiving node) and the second node (the source node) handle signaling interaction in their role as neighbor nodes of other receiving nodes. The second node, after sending the P2MP data, starts one timer for each receiving node that is required to send it a response message. When the response message from a receiving node arrives, the second node stops the timer started for that receiving node; if the timer expires first, the second node retransmits the correct P2MP data to that receiving node. The third node performs similar operations, after it has received the P2MP data, for every receiving node that is required to send a response message to the third node; the details are not repeated here.
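The per-receiver timers can be sketched as a table of deadlines. This is an illustrative sketch only: the `AckTimers` class, the explicit `now` argument, and the choice to restart a timer after a timeout-triggered retransmission are assumptions, since the text states only that a timeout leads to retransmission.

```python
class AckTimers:
    """One timer per receiving node that must reply to this node (hypothetical helper)."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.deadline = {}  # receiver id -> time at which its timer expires

    def start_all(self, receivers, now: float) -> None:
        # Called after sending (source node) or receiving (relay node) the P2MP data.
        for r in receivers:
            self.deadline[r] = now + self.timeout

    def on_reply(self, receiver) -> None:
        # A response message stops the timer started for that receiver.
        self.deadline.pop(receiver, None)

    def expired(self, now: float) -> list:
        # Receivers whose timers ran out; the caller retransmits to each of them.
        due = sorted(r for r, d in self.deadline.items() if d <= now)
        for r in due:
            self.deadline[r] = now + self.timeout  # assumed: restart after retransmission
        return due
```

Passing the current time in explicitly keeps the sketch deterministic; a real node would more likely read a monotonic clock or use hardware timers.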
After the logical interconnection network is established among the multiple nodes in the P2MP communication domain, a management table is established for each node. Usually the effect of source-node switching is taken into account, and the management table is built with the identification information of the source node as its index; the effect of source-node switching may of course also be ignored. The principle is similar to the description of the "first aspect" in the process of establishing the forwarding table in step S1030 and is not repeated here.
For any node in the P2MP communication domain, for example the third node, the management table can be built from entries of the form {key:value:state}, where key is the identification information of the source node that sends the P2MP data, or a wildcard; value is the identification information of the corresponding receiving node; and state is the status flag of that receiving node's reception of the P2MP data.
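One way to picture the {key:value:state} entries is a small table keyed by the pair (key, value). The sketch below is a hypothetical illustration: the `ManagementTable` class, the `"*"` wildcard symbol, the `"PENDING"` initial state, and the rule that an exact source match is preferred over the wildcard are all assumptions not stated in the text.

```python
WILDCARD = "*"  # assumed representation of the wildcard key


class ManagementTable:
    """Entries {key:value:state} stored as (key, value) -> state."""

    def __init__(self):
        self.entries = {}

    def add(self, key, receiver_id, state="PENDING"):
        # key: source-node id or WILDCARD; receiver_id: managed receiving node.
        self.entries[(key, receiver_id)] = state

    def update(self, source_id, receiver_id, state):
        # Prefer the entry for this specific source, then fall back to the wildcard.
        for key in (source_id, WILDCARD):
            if (key, receiver_id) in self.entries:
                self.entries[(key, receiver_id)] = state
                return key
        raise KeyError(f"no entry for receiver {receiver_id}")

    def state_of(self, source_id, receiver_id):
        for key in (source_id, WILDCARD):
            if (key, receiver_id) in self.entries:
                return self.entries[(key, receiver_id)]
        raise KeyError(f"no entry for receiver {receiver_id}")
```

Keying by source lets a node keep different sets of managed receivers for different sources, which matters when the source node switches; a single wildcard entry covers the case where switching is ignored.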
Every node in the P2MP communication domain maintains both a forwarding table and a management table. The two may be separate tables or merged into one, provided that a lookup can separately retrieve the identification information of the neighbor node to which the response message is sent and the identification information of the receiving nodes whose response messages are received.
Thus, by establishing a logical interconnection network among the communication nodes in the P2MP communication domain, the mechanism that provides reliable transmission of multicast data is distributed across multiple communication nodes in the domain. This effectively relieves the load that arises when the source node alone handles reliable-transmission state for multiple receiving nodes, and avoids a single-point processing bottleneck.
By way of example, FIG. 13a shows a network-layer multicast resource configuration based on RC connections in a P2MP communication domain, provided by an embodiment of the present application. The P2MP communication domain contains seven communication nodes. Each communication node may be hosted on the same or different computers, servers, clusters, or storage devices, and includes computing units such as smart network cards and channel adapters, which may be linked at the physical layer through gateways, routers, and the like. At some moment in the life cycle of the P2MP communication domain, node 1 is selected as the source node that sends P2MP data. Based on the logical interconnection network established among the nodes in FIG. 13a, the control plane sets up a network-layer multicast resource configuration based on RC connections between source node 1 and receiving nodes 2 to 7. As FIG. 13a shows, six RC connections are needed: node 1 establishes one RC connection with each of receiving nodes 2, 3, and 4, and nodes 2, 4, and 5 establish one RC connection with receiving nodes 5, 7, and 6, respectively. By comparison, a full mesh among the seven communication nodes would need 7*(7-1)÷2=21 RC connections. The configuration therefore saves the control-plane overhead of guaranteeing reliable transmission that frequent source-node switching would otherwise cause.
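The connection counts in this example can be checked with a few lines of arithmetic. The sketch below is only a worked check; the function and variable names are hypothetical, and the edge list is transcribed from the description of FIG. 13a above.

```python
def full_mesh_connections(n: int) -> int:
    # Every pair of nodes is connected once: n * (n - 1) / 2.
    return n * (n - 1) // 2


# RC connections described for FIG. 13a: 1-2, 1-3, 1-4, 2-5, 4-7, 5-6.
TREE_EDGES = [(1, 2), (1, 3), (1, 4), (2, 5), (4, 7), (5, 6)]

nodes = {n for edge in TREE_EDGES for n in edge}
tree_connections = len(TREE_EDGES)
mesh_connections = full_mesh_connections(len(nodes))
saved = mesh_connections - tree_connections
```

For the seven nodes here the tree needs 6 connections against 21 for the full mesh, a difference of 15. A tree over n nodes always needs n - 1 connections, so the gap grows quadratically with the size of the domain.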
By way of example, FIG. 13b shows a schematic diagram of the data-plane logical connections in a P2MP communication domain, provided by an embodiment of the present application. As shown in FIG. 13b, source node 1 has the P2MP data replicated into multiple copies in the multicast network and sends it to receiving nodes 2 to 7 in a single pass.
By way of example, FIG. 13c shows a schematic diagram of the control-plane logical connections in a P2MP communication domain, provided by an embodiment of the present application. As shown in FIG. 13c, the P2MP communication domain contains several kinds of nodes, for example: node 1 (the multicast source) → node 2 (multicast sink 1, a receiving node managed by the multicast source and the neighbor node of multicast sink 2) → node 5 (multicast sink 2, a receiving node managed by multicast sink 1 and the neighbor node of multicast sink 3) → node 6 (multicast sink 3, a receiving node managed by multicast sink 2). Each node acts within its own reliable-transmission confirmation domain. Node 2 (multicast sink 1) first determines from its forwarding table that its neighbor node in the reliable-transmission control domain is node 1; it then reports to the multicast source whether it has reliably received the P2MP data, and does not need to tell the multicast source whether multicast sink 2, the receiving node it manages, has reliably received the P2MP data. Node 5 (multicast sink 2) first determines from its forwarding table that its neighbor node in the reliable-transmission control domain is node 2; it then notifies its neighbor node, multicast sink 1, whether it has reliably received the P2MP data. It does not need to report its own reception status to the multicast source, nor to tell the multicast source whether multicast sink 3, the receiving node it manages, has reliably received the P2MP data. Node 6 (multicast sink 3) first determines from its forwarding table that its neighbor node in the reliable-transmission control domain is node 5; it then notifies its neighbor node, multicast sink 2, whether it has reliably received the P2MP data, and does not need to report its reception status to the multicast source. The reliable-transmission mechanism for the other nodes 3, 4, and 7 works in a similar way and is not repeated here. In the logical interconnection network, multicast sink 1 can correctly receive the P2MP data sent by the multicast source, and can also retransmit the correct P2MP data when multicast sink 2 sends a NAK response message. Any node needs to send feedback only to its own neighbor node, which may be the multicast source or a multicast sink. Thus, dividing reliable-transmission control into distributed domains effectively reduces single-point load and avoids a single-point processing bottleneck.
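The division of responsibilities in this example can be sketched by deriving, for each node, the neighbor it replies to and the receivers it manages. This is a hypothetical illustration: the helper name is invented, and the pairs for nodes 3, 4, and 7 are inferred from the RC connections described for FIG. 13a rather than spelled out in the text.

```python
# (managing node, managed receiving node), with node 1 as the multicast source.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 5), (4, 7), (5, 6)]


def build_tables(edges):
    """Return (reply_to, manages) derived from the control-plane edges."""
    # Each receiving node replies only to the neighbor that manages it.
    reply_to = {child: parent for parent, child in edges}
    # Each managing node tracks only the receivers directly below it.
    manages = {}
    for parent, child in edges:
        manages.setdefault(parent, []).append(child)
    return reply_to, manages


reply_to, manages = build_tables(EDGES)
```

With these tables, node 6 replies to node 5, node 5 to node 2, and node 2 to node 1, while the source tracks only nodes 2, 3, and 4. No reception status is relayed upward beyond the direct neighbor, which is what keeps the load at the source bounded.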
Based on the methods in the foregoing embodiments, by way of example, FIG. 14 shows a schematic diagram of a reliable transmission apparatus for P2MP data provided by an embodiment of the present application. The apparatus can be deployed on a first node, which is one of multiple nodes in a P2MP communication domain. A logical interconnection network is established among the multiple nodes, and each node has its own identification information. As shown in FIG. 14, the confirmation apparatus 1400 includes a communication module 1410 and a processing module 1420.
The communication module 1410 can receive a P2MP data packet sent by a second node within the P2MP communication domain. The P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes.
The processing module 1420 can determine the identification information of a third node according to the forwarding table of the first node. In the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; a neighbor node is a node that has a direct connection in the logical interconnection network. The processing module 1420 can also send a response message to the third node according to the status of receiving the P2MP data and the identification information of the third node.
In some embodiments, the P2MP data packet includes at least the P2MP data and the sequence number of the P2MP data. The processing module 1420 can send the response message to the third node according to the status of receiving the P2MP data and the identification information of the third node; specifically, when the status of receiving the P2MP data is erroneous, it sends a response message to the third node. The response message includes at least the identification information of the first node, the sequence number of the P2MP data, and the status flag of receiving the P2MP data, which is set to NAK to indicate that the received P2MP data is erroneous. The communication module 1410 can also receive the correct P2MP data packet retransmitted by the third node.
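The response message built by the first node can be sketched as below. This is an illustration under assumptions rather than the apparatus itself: the text does not say how an error is detected, so a CRC32 comparison is assumed here, and the `Response` type, the function name, and the `"ACK"` state for correct data are hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    node_id: int  # identification information of the first node
    seq: int      # sequence number of the P2MP data
    state: str    # status flag of receiving the P2MP data


def build_response(node_id: int, seq: int, payload: bytes, checksum: int) -> Response:
    # Assumed error check: compare a CRC32 of the payload with a checksum
    # carried alongside it. The source does not specify the mechanism.
    if zlib.crc32(payload) != checksum:
        return Response(node_id, seq, "NAK")  # received data is erroneous
    return Response(node_id, seq, "ACK")      # assumed label for correct data
```

The response is then addressed to the third node found in the forwarding table; on a NAK, the first node waits for the third node to retransmit the correct packet.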
In some embodiments, the P2MP data packet includes at least the P2MP data and the identification information of the second node. The processing module 1420 can determine the identification information of the third node according to the forwarding table of the first node; specifically, it determines the identification information of the third node according to the forwarding table of the first node and the identification information of the second node. The forwarding table of the first node contains at least one entry {key:value}, where key is the identification information of the second node, or a wildcard, and value is the identification information of the third node.
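The {key:value} lookup can be sketched with a plain dictionary. The `"*"` wildcard symbol, the function name, and the rule that an exact source match takes priority over the wildcard are assumptions for illustration; the table contents in the usage are hypothetical.

```python
WILDCARD = "*"  # assumed representation of the wildcard key


def lookup_neighbor(forwarding_table: dict, source_id):
    """Return the id of the neighbor (third node) that should receive the reply."""
    # Prefer an entry keyed by the source that sent this P2MP data packet.
    if source_id in forwarding_table:
        return forwarding_table[source_id]
    # Otherwise fall back to the wildcard entry; KeyError if there is none.
    return forwarding_table[WILDCARD]


# Hypothetical table for node 5: reply to node 2 when node 1 is the source,
# and to node 6 for any other source.
table_node5 = {1: 2, WILDCARD: 6}
```

A table holding only a wildcard entry corresponds to the case where source-node switching is ignored and the node always replies to the same neighbor.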
In some embodiments, the logical interconnection network is established by connecting the multiple nodes, where connecting the multiple nodes includes: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and no direct connection is established between the neighbor nodes of any given node.
In some embodiments, the logical interconnection network is established by connecting the multiple nodes, where connecting the multiple nodes includes: the multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in a node group is less than or equal to a preset distance, connections are established between the nodes within each node group, and the first nodes of the multiple node groups establish connections with one another.
By way of example, FIG. 15 shows a schematic diagram of a reliable transmission apparatus for P2MP data provided by an embodiment of the present application. The apparatus can be deployed on a third node, which is one of multiple nodes in a P2MP communication domain. A logical interconnection network is established among the multiple nodes, and each node has its own identification information. As shown in FIG. 15, the confirmation apparatus 1500 includes a communication module 1510 and a processing module 1520.
The communication module 1510 can receive a response message from the first node. In the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; a neighbor node is a node that has a direct connection in the logical interconnection network. The response message includes at least the identification information of the first node and a status flag indicating the first node's reception of the P2MP data. The P2MP data is carried in a P2MP data packet, which is sent by the second node within the P2MP communication domain; the second node is one of the multiple nodes.
The processing module 1520 can update the management table of the third node according to the response message of the first node. The management table includes at least the status of the first node's reception of the P2MP data.
In some embodiments, after the response message from the first node is received, the processing module 1520 can determine whether the status flag in the response message is NAK, where NAK indicates that the P2MP data received by the first node is erroneous, and, if so, retransmit the correct P2MP data packet to the first node and start a timer. The communication module 1510 is further used to receive a response message from the first node, whereupon the timer is stopped.
In some embodiments, the P2MP data packet includes at least the P2MP data and the identification information of the second node. The processing module 1520 can update the management table of the second node according to the response message of the first node; specifically, according to the identification information of the first node and the status flag of the first node's reception of the P2MP data, it updates that status flag in the management table of the second node. The management table of the second node contains at least one entry {key:value:state}, where key is the identification information of the second node, or a wildcard; value is the identification information of the first node; and state is the status flag of the first node's reception of the P2MP data.
In some embodiments, the logical interconnection network is established by connecting the multiple nodes, where connecting the multiple nodes includes: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and no direct connection is established between the neighbor nodes of any given node.
In some embodiments, the logical interconnection network is established by connecting the multiple nodes, where connecting the multiple nodes includes: the multiple nodes are divided into multiple node groups, each node group includes a third node, the physical distance between the nodes in a node group is less than or equal to a preset distance, the nodes within each node group are connected, and the third nodes of the multiple node groups are connected with one another.
Based on the methods in the foregoing embodiments, an embodiment of the present application provides an electronic device. The electronic device may include: a display screen; at least one memory for storing a program; and at least one processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor performs the methods described in the foregoing embodiments. By way of example, the electronic device may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, a server, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, and/or a smart city device. The embodiments of the present application place no particular limitation on the specific type of the electronic device.
Based on the methods in the foregoing embodiments, an embodiment of the present application provides a computer-readable storage medium that stores a computer program. When the computer program runs on a processor, it causes the processor to perform the methods in the foregoing embodiments.
Based on the methods in the foregoing embodiments, an embodiment of the present application provides a computer program product. When the computer program product runs on a processor, it causes the processor to perform the methods in the foregoing embodiments.
It can be understood that the processor in the embodiments of the present application may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.
The method steps in the embodiments of the present application may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an ASIC.
The foregoing embodiments may be implemented in whole or in part in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions.
Claims (23)
- A reliable transmission method for P2MP data, characterized in that the method is applied to a first node, the first node is a node in a P2MP communication domain, the P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established among the multiple nodes, and each node has its own identification information, the method comprising: receiving a P2MP data packet sent by a second node within the P2MP communication domain, wherein the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; determining identification information of a third node according to a forwarding table of the first node, wherein, in the logical interconnection network, the first node has at least one neighbor node, the third node is one of the neighbor nodes of the first node, and a neighbor node is a node having a direct connection in the logical interconnection network; and sending a response message to the third node according to a status of receiving the P2MP data and the identification information of the third node.
- The method according to claim 1, characterized in that the P2MP data packet includes at least the P2MP data and a sequence number of the P2MP data; the sending a response message to the third node according to the status of receiving the P2MP data and the identification information of the third node comprises: when the status of receiving the P2MP data is erroneous, sending the response message to the third node, wherein the response message includes at least the identification information of the first node, the sequence number of the P2MP data, and a status flag of receiving the P2MP data, the status flag of receiving the P2MP data is set to NAK, and the NAK indicates that the received P2MP data is erroneous; and the method further comprises: receiving a correct P2MP data packet retransmitted by the third node.
- The method according to claim 1, characterized in that the P2MP data packet includes at least the P2MP data and identification information of the second node; the determining identification information of a third node according to a forwarding table of the first node comprises: determining the identification information of the third node according to the forwarding table of the first node and the identification information of the second node, wherein the forwarding table of the first node contains at least one entry {key:value}, the key in the entry is the identification information of the second node or a wildcard, and the value in the entry is the identification information of the third node.
- The method according to claim 1, characterized in that the logical interconnection network is established by connecting the multiple nodes, and the connecting the multiple nodes comprises: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and no direct connection is established between the neighbor nodes of each of the multiple nodes.
- The method according to claim 1, characterized in that the logical interconnection network is established by connecting the multiple nodes, and the connecting the multiple nodes comprises: the multiple nodes are divided into multiple node groups, each node group includes a first node, the physical distance between the nodes in a node group is less than or equal to a preset distance, connections are established between the nodes in a node group, and the first nodes of the multiple node groups establish connections with one another.
- 一种P2MP数据的可靠传输方法,其特征在于,应用于第三节点,所述第三节点是一个P2MP通信域内的节点,所述一个P2MP通信域内包括多个节点,所述第三节点是所述多个节点中的一个,所述多个节点间建有逻辑互联网络,每个节点具有各自的标识信息,所述方法包括:A reliable transmission method for P2MP data, characterized in that it is applied to a third node, the third node is a node in a P2MP communication domain, the P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established between the multiple nodes, and each node has its own identification information, the method comprising:接收来自第一节点的应答消息;在所述逻辑互联网络中,所述第一节点具有至少一个邻居节点,所述第三节点是所述第一节点的邻居节点中的一个;所述邻居节点是在逻辑互联网络中具有直连关系的节点;所述应答消息至少包括所述第一节点的标识信息和所述第一节点接收P2MP数据的状态标志;所述P2MP数据包含在P2MP数据包中,所述P2MP数据包由第二节点在所述一个P2MP通信域内发送,所述第二节点是所述多个节点中的一个;receiving a response message from a first node; in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; the neighbor node is a node having a direct connection relationship in the logical interconnection network; the response message includes at least identification information of the first node and a state flag of the first node receiving P2MP data; the P2MP data is included in a P2MP data packet, and the P2MP data packet is sent by a second node in the one P2MP communication domain, and the second node is one of the multiple nodes;根据所述第一节点的应答消息,更新第三节点的管理表;所述管理表至少包括所述第一节点接收所述P2MP数据的状态。 According to the response message of the first node, a management table of the third node is updated; the management table at least includes a state in which the first node receives the P2MP data.
- 根据权利要求6所述的方法,其特征在于,所述接收来自第一节点的应答消息之后,所述方法还包括:The method according to claim 6, characterized in that after receiving the response message from the first node, the method further comprises:判断所述第一节点的应答消息中所述第一节点接收P2MP数据的状态标志是否是NAK,如果是,则向所述第一节点重传正确的P2MP数据包,并开启定时器;其中,所述NAK表示所述第一节点接收到的P2MP数据有误;Determine whether a status flag indicating that the first node receives the P2MP data in a response message of the first node is NAK, and if so, retransmit a correct P2MP data packet to the first node and start a timer; wherein the NAK indicates that the P2MP data received by the first node is incorrect;接收来自第一节点的应答消息,关闭所述定时器。A response message is received from the first node, and the timer is turned off.
- 根据权利要求6所述的方法,其特征在于,其中,所述P2MP数据包至少包括P2MP数据、所述第二节点的标识信息;The method according to claim 6, wherein the P2MP data packet at least includes P2MP data and identification information of the second node;所述根据所述第一节点的应答消息,更新第三节点的管理表,包括:The updating the management table of the third node according to the response message of the first node includes:根据第一节点的标识信息和第一节点接收P2MP数据的状态标志,更新第三节点的管理表中第一节点接收P2MP数据的状态标志;其中,所述第二节点的管理表包含至少一条字段{key:value:state},所述字段中的key为所述第二节点的标识信息或通配符,所述字段中的value为所述第一节点的标识信息,所述字段中的state为所述第一节点接收P2MP数据的状态标志。According to the identification information of the first node and the state flag of the first node receiving the P2MP data, the state flag of the first node receiving the P2MP data in the management table of the third node is updated; wherein the management table of the second node includes at least one field {key: value: state}, the key in the field is the identification information or a wildcard of the second node, the value in the field is the identification information of the first node, and the state in the field is the state flag of the first node receiving the P2MP data.
- The method according to claim 6, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and the neighbor nodes of each of the multiple nodes do not establish direct connections with one another.
- The method according to claim 6, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: dividing the multiple nodes into multiple node groups, where each node group includes a third node, the physical distance between nodes within a node group is less than or equal to a preset distance, the nodes within each node group are connected, and the third nodes of the multiple node groups are connected.
- A reliable transmission apparatus for P2MP data, characterized in that it is deployed on a first node, where the first node is a node in a P2MP communication domain, the P2MP communication domain includes multiple nodes, the first node is one of the multiple nodes, a logical interconnection network is established among the multiple nodes, and each node has its own identification information, the apparatus comprising: a communication module, configured to receive a P2MP data packet sent by a second node within the P2MP communication domain, where the P2MP data packet includes at least P2MP data, and the second node is one of the multiple nodes; and a processing module, configured to determine identification information of a third node according to a forwarding table of the first node, where, in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; a neighbor node is a node having a direct connection relationship in the logical interconnection network; the processing module being further configured to send a response message to the third node according to the status of receiving the P2MP data and the identification information of the third node.
- The apparatus according to claim 11, wherein the P2MP data packet includes at least the P2MP data and a sequence number of the P2MP data; when sending the response message to the third node according to the status of receiving the P2MP data and the identification information of the third node, the processing module is configured to: send the response message to the third node when the status of receiving the P2MP data is erroneous, where the response message includes at least the identification information of the first node, the sequence number of the P2MP data, and a status flag for receiving the P2MP data, the status flag for receiving the P2MP data is set to NAK, and NAK indicates that the received P2MP data is erroneous; and the communication module is further configured to receive the correct P2MP data packet retransmitted by the third node.
- The apparatus according to claim 11, wherein the P2MP data packet includes at least the P2MP data and the identification information of the second node; when determining the identification information of the third node according to the forwarding table of the first node, the processing module is configured to: determine the identification information of the third node according to the forwarding table of the first node and the identification information of the second node; where the forwarding table of the first node contains at least one field {key:value}, the key in the field is the identification information of the second node or a wildcard, and the value in the field is the identification information of the third node.
- The apparatus according to claim 11, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and the neighbor nodes of each of the multiple nodes do not establish direct connections with one another.
- The apparatus according to claim 11, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: dividing the multiple nodes into multiple node groups, where each node group includes a first node, the physical distance between nodes within a node group is less than or equal to a preset distance, connections are established between the nodes within each node group, and connections are established between the first nodes of the multiple node groups.
- A reliable transmission apparatus for P2MP data, characterized in that it is deployed on a third node, where the third node is a node in a P2MP communication domain, the P2MP communication domain includes multiple nodes, the third node is one of the multiple nodes, a logical interconnection network is established among the multiple nodes, and each node has its own identification information, the apparatus comprising: a communication module, configured to receive a response message from a first node, where, in the logical interconnection network, the first node has at least one neighbor node, and the third node is one of the neighbor nodes of the first node; a neighbor node is a node having a direct connection relationship in the logical interconnection network; the response message includes at least the identification information of the first node and a status flag indicating the first node's reception of P2MP data; the P2MP data is contained in a P2MP data packet, the P2MP data packet is sent by a second node within the P2MP communication domain, and the second node is one of the multiple nodes; and a processing module, configured to update a management table of the third node according to the response message of the first node, where the management table includes at least the status of the first node's reception of the P2MP data.
- The apparatus according to claim 16, characterized in that, after the response message from the first node is received, the processing module is further configured to: determine whether the status flag for the first node's reception of P2MP data in the response message of the first node is NAK and, if so, retransmit the correct P2MP data packet to the first node and start a timer, where NAK indicates that the P2MP data received by the first node is erroneous; and the communication module is further configured to receive a response message from the first node and stop the timer.
- The apparatus according to claim 16, wherein the P2MP data packet includes at least the P2MP data and the identification information of the second node; when updating the management table of the second node according to the response message of the first node, the processing module is configured to: update, according to the identification information of the first node and the status flag for the first node's reception of P2MP data, the status flag for the first node's reception of P2MP data in the management table of the second node; where the management table of the second node contains at least one field {key:value:state}, the key in the field is the identification information of the second node or a wildcard, the value in the field is the identification information of the first node, and the state in the field is the status flag for the first node's reception of P2MP data.
- The apparatus according to claim 16, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: each of the multiple nodes has at least one neighbor node, each of the multiple nodes establishes a direct connection with its neighbor nodes, and the neighbor nodes of each of the multiple nodes do not establish direct connections with one another.
- The apparatus according to claim 16, wherein the logical interconnection network is established by connecting the multiple nodes, and connecting the multiple nodes comprises: dividing the multiple nodes into multiple node groups, where each node group includes a third node, the physical distance between nodes within a node group is less than or equal to a preset distance, the nodes within each node group are connected, and the third nodes of the multiple node groups are connected.
- An electronic device, characterized by comprising: at least one memory, configured to store a program; and at least one processor, configured to execute the program stored in the memory; wherein, when the program stored in the memory is executed, the processor is configured to perform the method according to any one of claims 1-10.
- A computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program runs on a processor, the processor is caused to perform the method according to any one of claims 1-10.
- A computer program product, characterized in that, when the computer program product runs on a processor, the processor is caused to perform the method according to any one of claims 1-10.
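
The forwarding-table lookup, the response message, and the management-table update recited in the claims above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the class and method names (`FirstNode`, `ThirdNode`, `select_third_node`, `build_response`, `on_response`), the `"*"` wildcard symbol, the `"ACK"` flag value, the packet cache keyed by sequence number, and the use of a set of pending entries in place of a real timer are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

WILDCARD = "*"  # assumed symbol; the claims only say "wildcard"


@dataclass
class FirstNode:
    node_id: str
    # Forwarding table {key: value}: key is the second node's (sender's) id
    # or the wildcard, value is the id of the neighbor acting as third node.
    forwarding_table: Dict[str, str] = field(default_factory=dict)

    def select_third_node(self, sender_id: str) -> Optional[str]:
        # Prefer an exact match on the sender, then fall back to the wildcard.
        if sender_id in self.forwarding_table:
            return self.forwarding_table[sender_id]
        return self.forwarding_table.get(WILDCARD)

    def build_response(self, sender_id: str, seq: int, received_ok: bool):
        # The response carries the first node's id, the sequence number and a
        # status flag. "ACK" is an assumed name for the non-error flag.
        target = self.select_third_node(sender_id)
        message = {
            "node_id": self.node_id,
            "seq": seq,
            "state": "ACK" if received_ok else "NAK",
        }
        return target, message


@dataclass
class ThirdNode:
    node_id: str
    # Management table {key:value:state}, modeled as {(key, value): state}.
    management_table: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # Assumed cache of correct packets, keyed by sequence number.
    cache: Dict[int, bytes] = field(default_factory=dict)
    # Stand-in for running timers: (first node id, sequence number) entries.
    pending: Set[Tuple[str, int]] = field(default_factory=set)

    def on_response(self, message: dict, sender_id: str = WILDCARD) -> Optional[bytes]:
        first_id = message["node_id"]
        seq = message["seq"]
        state = message["state"]
        # Record the first node's reception status.
        self.management_table[(sender_id, first_id)] = state
        # Any response from the first node stops a timer that was running.
        self.pending.discard((first_id, seq))
        if state == "NAK":
            # Start a timer and hand back the packet to retransmit.
            self.pending.add((first_id, seq))
            return self.cache.get(seq)
        return None
```

In this sketch a NAK leaves an entry in `pending` until the first node responds again, which mirrors the "start a timer, then stop it on the next response" sequence; a real deployment would presumably attach a timeout action to that entry, which the claims in this section do not specify.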
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310037879.6A CN118337792A (en) | 2023-01-10 | 2023-01-10 | Reliable transmission method and device for P2MP data |
CN202310037879.6 | 2023-01-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024149043A1 (en) | 2024-07-18 |
Family
ID=91770804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/140662 WO2024149043A1 (en) | 2023-01-10 | 2023-12-21 | Reliable transmission method and apparatus for p2mp data |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118337792A (en) |
WO (1) | WO2024149043A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050188107A1 (en) * | 2004-01-14 | 2005-08-25 | Piercey Benjamin F. | Redundant pipelined file transfer |
CN101005422A (en) * | 2006-12-07 | 2007-07-25 | 中国科学院计算技术研究所 | Method for establishing radio sensor network route based on route neighbour list |
CN104811387A (en) * | 2014-01-24 | 2015-07-29 | 思科技术公司 | Equal Cost Multi-path With Bit Indexed Explicit Replication |
CN112534782A (en) * | 2018-08-17 | 2021-03-19 | 瑞典爱立信有限公司 | Independent redundant path discovery for bluetooth networks |
2023
- 2023-01-10 CN CN202310037879.6A patent/CN118337792A/en active Pending
- 2023-12-21 WO PCT/CN2023/140662 patent/WO2024149043A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN118337792A (en) | 2024-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7623474B2 (en) | | Techniques for distributing information using multicast subsets |
US7031308B2 (en) | | Tree-based ordered multicasting method |
Li et al. | | Reliable multicast in data center networks |
US20170118042A1 (en) | | High availability for distributed network services in an extended bridge |
CN103125102B (en) | | For providing the system and method for the Ethernet virtual concentrator scalability based on infinite bandwidth in middleware machine environment |
CN1633647B (en) | | System and method for managing data transfers in a network |
WO2014150378A1 (en) | | Distributed database management |
Li et al. | | Exploring efficient and scalable multicast routing in future data center networks |
US20180287928A1 (en) | | Switch-based reliable multicast service |
EP1847071A2 (en) | | Layered multicast and fair bandwidth allocation and packet prioritization |
Li et al. | | RDCM: Reliable data center multicast |
JP5331898B2 (en) | | Communication method, information processing apparatus, and program for parallel computation |
US20050188107A1 (en) | | Redundant pipelined file transfer |
CN110708175B (en) | | Method for synchronizing messages in a distributed network |
WO2021244206A1 (en) | | Message processing method, device, system, and storage medium |
WO2022253087A1 (en) | | Data transmission method, node, network manager, and system |
Yu et al. | | High performance and reliable NIC-based multicast over Myrinet/GM-2 |
WO2022048328A1 (en) | | Data processing method and apparatus, device, and medium |
EP1561304A2 (en) | | Method and apparatus for updating group member views in group communication systems |
WO2024149043A1 (en) | | Reliable transmission method and apparatus for p2mp data |
WO2023133697A1 (en) | | Packet loss processing method and apparatus, switch, sending device and data transmission system |
Li et al. | | Gleam: An RDMA-accelerated multicast protocol for datacenter networks |
Wang et al. | | A cost-effective low-latency overlaid torus-based data center network architecture |
Yu et al. | | Scalable, High-performance NIC-based All-to-all Broadcast over Myrinet/GM |
Dan et al. | | SOPA: source routing based packet-level multi-path routing in data center networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23915797; Country of ref document: EP; Kind code of ref document: A1 |