CN104580476B - The method and apparatus for choosing node in a distributed system - Google Patents
The method and apparatus for choosing node in a distributed system Download PDFInfo
- Publication number
- CN104580476B CN104580476B CN201510016624.7A CN201510016624A CN104580476B CN 104580476 B CN104580476 B CN 104580476B CN 201510016624 A CN201510016624 A CN 201510016624A CN 104580476 B CN104580476 B CN 104580476B
- Authority
- CN
- China
- Prior art keywords
- nodes
- node
- none
- candidate
- distributed system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 24
- 239000012634 fragment Substances 0.000 claims 2
- 238000004364 calculation method Methods 0.000 description 7
- 238000010586 diagram Methods 0.000 description 6
- 230000002776 aggregation Effects 0.000 description 4
- 238000004220 aggregation Methods 0.000 description 4
- 238000007405 data analysis Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000005540 biological transmission Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000008447 perception Effects 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Multi Processors (AREA)
Abstract
本发明提供一种在分布式系统中选取节点的方法和装置,有助于在比较低的工作量和成本下通过提高None节点的内存容量,从而提高整个PrestoDB集群的性能。该方法包括:在所述分布式系统中,将指定的一部分节点作为候选的None节点;在需要将多个Source节点的数据片汇聚到一个节点的情况下,在所述候选的None节点中选择一个节点,然后将所述多个Source节点的数据片汇聚到选择的节点。
The present invention provides a method and device for selecting nodes in a distributed system, which helps improve the performance of the entire PrestoDB cluster by increasing the memory capacity of None nodes with relatively low workload and cost. The method includes: in the distributed system, using a specified part of the nodes as candidate None nodes; in the case that data pieces of multiple Source nodes need to be aggregated into one node, selecting among the candidate None nodes A node, and then gather the data slices of the multiple Source nodes to the selected node.
Description
技术领域technical field
本发明涉及计算机技术领域,特别地涉及一种在分布式系统中选取节点的方法和装置。The invention relates to the field of computer technology, in particular to a method and device for selecting nodes in a distributed system.
背景技术Background technique
伴随着大数据的兴起,互联网公司的业务数据量逐年上升,因此各大互联网公司都在内部推行大数据技术,并且针对于核心业务系统建设数据仓库,目前数据仓库分为两种类型:离线数据仓库和实时数据仓库。With the rise of big data, the amount of business data of Internet companies is increasing year by year. Therefore, major Internet companies are implementing big data technology internally and building data warehouses for core business systems. Currently, data warehouses are divided into two types: offline data warehouse and real-time data warehouse.
离线数据仓库的代表产品就是hive,该产品由于底层计算框架是MapReduce,因此其适合于超大数据集的离线分析和计算,对于实时性要求比较高的数据分析和计算并不适合。The representative product of offline data warehouse is hive. Since the underlying computing framework of this product is MapReduce, it is suitable for offline analysis and calculation of very large data sets, but it is not suitable for data analysis and calculation with high real-time requirements.
实时数据仓库的代表产品是PrestoDB,该产品由FaceBook开发,采用了PipeLine的分布式数据计算和传输模式,对于大数据的分析和计算能够满足在100ms-20m之内,满足了实时数据分析和计算的要求。The representative product of the real-time data warehouse is PrestoDB. This product is developed by FaceBook and adopts the distributed data calculation and transmission mode of PipeLine. The analysis and calculation of big data can meet the requirements of real-time data analysis and calculation within 100ms-20m. requirements.
由于PrestoDB是一个基于内存的分布式计算框架,在进行数据分析和计算的时候,PrestoDB首先将需要分析和计算的数据分为数据片并将每个数据片读取到PrestoDB的Source节点中的内存中,然后将每个Source节点内存中的数据通过网络汇聚到一个None节点或者多个Fixed节点中,具体是汇聚到None节点还是Fixed节点与聚合函数的类型相关,例如:如果查询中包含有order by语句,那么就需要对所有的结果进行整体排序,因此各个Source节点内存中的数据就需要汇聚到一个None节点中,然后进行整体排序;如果查询中包含有group by语句,那么就需要对所有的结果进行分组,因此各个Source节点内存中的数据就需要汇聚到多个Fixed节点中,从而进行分组。Since PrestoDB is a memory-based distributed computing framework, when performing data analysis and calculation, PrestoDB first divides the data to be analyzed and calculated into data slices and reads each data slice into the memory in the Source node of PrestoDB , and then aggregate the data in the memory of each Source node into a None node or multiple Fixed nodes through the network. The specific aggregation to the None node or the Fixed node is related to the type of aggregation function. For example: if the query contains order by statement, then all the results need to be sorted as a whole, so the data in the memory of each Source node needs to be aggregated into a None node, and then sorted as a whole; if the query contains a group by statement, then it is necessary to sort all The results are grouped, so the data in the memory of each Source node needs to be aggregated into multiple Fixed nodes for grouping.
目前PrestoDB是从整个集群中随机选取一个节点作为None节点的,具体PrestoDB各种节点的选取算法如图1所示,图1是根据现有技术中的在PrestoDB集群中选取节点的流程的示意图。如图1所示,首先判断需要选取的节点的类型,如需选取None节点或Fixed节点,则在集群中随机选取;如需选取Source节点,先判断是否需要采用硬件感知的方式,若是,则根据数据本地性来选取,否则随机选取多个节点作为Source节点。这里的硬件感知是指感知需要处理的数据所在的位置,本地性是指优先选择数据所在的节点作为工作节点。因为如果分配的工作节点,刚好就是需要处理的数据所在的节点,就能减少数据进行网络传输所需要的时间,能够减少计算任务所需要的时间。所以在一些情况下可采用硬件感知方式,按本地性原则选取节点。At present, PrestoDB randomly selects a node from the entire cluster as the None node. The specific selection algorithm of various nodes in PrestoDB is shown in Figure 1. Figure 1 is a schematic diagram of the process of selecting nodes in the PrestoDB cluster according to the prior art. As shown in Figure 1, first determine the type of node to be selected. If you need to select a None node or a Fixed node, select it randomly in the cluster; if you need to select a Source node, first determine whether you need to use hardware-aware methods. Select according to data locality, otherwise randomly select multiple nodes as Source nodes. The hardware awareness here refers to the perception of the location of the data that needs to be processed, and the locality refers to the priority selection of the node where the data is located as the working node. Because if the assigned working node happens to be the node where the data to be processed is located, the time required for data transmission over the network can be reduced, and the time required for computing tasks can be reduced. Therefore, in some cases, the hardware perception method can be used to select nodes according to the principle of locality.
因此可以看出,如果一个节点被选择作为None节点,那么对其内存容量的要求就比较大。要想保证PrestoDB大数据量分析与计算的顺利进行,就必须对集群中的所有节点进行内存升级,使各个节点在被选择为None节点时都能胜任计算要求,这种升级工作量和成本都比较大。Therefore, it can be seen that if a node is selected as a None node, the requirement for its memory capacity is relatively large. In order to ensure the smooth progress of PrestoDB’s large-scale data analysis and calculation, it is necessary to upgrade the memory of all nodes in the cluster so that each node can meet the computing requirements when it is selected as a None node. The workload and cost of this upgrade are very high. bigger.
发明内容Contents of the invention
有鉴于此,本发明提供一种在分布式系统中选取节点的方法和装置,通过只提高None节点的内存容量,从而在比较低的工作量和成本下提高整个PrestoDB集群的性能。In view of this, the present invention provides a method and device for selecting nodes in a distributed system, which improves the performance of the entire PrestoDB cluster with relatively low workload and cost by only increasing the memory capacity of None nodes.
为实现上述目的,根据本发明的一个方面,提供了一种在分布式系统中选取节点的方法。To achieve the above purpose, according to one aspect of the present invention, a method for selecting nodes in a distributed system is provided.
本发明的在分布式系统中选取节点的方法中,分布式系统为PrestoDB集群,该方法包括:在所述分布式系统中,将指定的一部分节点作为候选的None节点;在需要将多个Source节点的数据片汇聚到一个节点的情况下,在所述候选的None节点中选择一个节点,然后将所述多个Source节点的数据片汇聚到选择的节点。In the method for selecting nodes in a distributed system of the present invention, the distributed system is a PrestoDB cluster, and the method includes: in the distributed system, using a specified part of nodes as candidate None nodes; When the data pieces of the nodes are aggregated to one node, a node is selected among the candidate None nodes, and then the data pieces of the multiple Source nodes are aggregated to the selected node.
可选地,在将指定的一部分节点作为候选的None节点之后,还包括:在需要将多个Source节点的数据片汇聚到多个Fixed节点的情况下,判断当前是否允许所述候选的None节点作为候选的Fixed节点,若是,则在所述分布式系统中随机选取多个节点作为Fixed节点,否则在所述分布式系统中所述候选的None节点之外随机选取多个节点作为Fixed节点。Optionally, after the specified part of the nodes are selected as the candidate None nodes, it also includes: judging whether the candidate None nodes are currently allowed under the condition that the data slices of multiple Source nodes need to be aggregated to multiple Fixed nodes As a candidate Fixed node, if yes, randomly select a plurality of nodes in the distributed system as Fixed nodes; otherwise, randomly select a plurality of nodes in the distributed system other than the candidate None nodes as Fixed nodes.
可选地,在将指定的一部分节点作为候选的None节点之后,还包括:在需要将分片的数据保存到Source节点的情况下,判断当前是否允许所述候选的None节点作为候选的Source节点,若是,则在所述分布式系统中随机选取多个节点作为Source节点,否则在所述分布式系统中所述候选的None节点之外随机选取多个节点作为Source节点。Optionally, after the specified part of the nodes are used as the candidate None nodes, it also includes: in the case of needing to save the fragmented data to the Source node, judging whether the candidate None nodes are currently allowed to be the candidate Source nodes , if so, randomly select multiple nodes in the distributed system as Source nodes, otherwise randomly select multiple nodes other than the candidate None nodes in the distributed system as Source nodes.
可选地,在所述分布式系统中随机选取多个节点作为Source节点的步骤包括:在当前采用硬件感知方式的情况下,在所述分布式系统中按照本地性原则选取多个节点作为Source节点。Optionally, the step of randomly selecting multiple nodes as Source nodes in the distributed system includes: in the case of currently adopting a hardware-aware manner, selecting multiple nodes as Source nodes in the distributed system according to the principle of locality node.
可选地,在所述分布式系统中所述候选的None节点之外随机选取多个节点作为Source节点的步骤包括:在当前采用硬件感知方式的情况下,在所述分布式系统中所述候选的None节点之外按照本地性原则选取多个节点作为Source节点。Optionally, the step of randomly selecting a plurality of nodes other than the candidate None nodes in the distributed system as Source nodes includes: in the case of currently using a hardware-aware manner, in the distributed system the In addition to the candidate None node, multiple nodes are selected as Source nodes according to the principle of locality.
根据本发明的另一方面,提供了一种在分布式系统中选取节点的装置。According to another aspect of the present invention, a device for selecting a node in a distributed system is provided.
对于本发明的在分布式系统中选取节点的装置,分布式系统为PrestoDB集群,该装置包括:配置模块,用于记录所述分布式系统中被指定的作为候选的None节点的一部分节点;None节点选择模块,用于在需要将多个Source节点的数据片汇聚到一个节点的情况下,在所述候选的None节点中选择一个节点作为None节点。For the device for selecting nodes in a distributed system of the present invention, the distributed system is a PrestoDB cluster, and the device includes: a configuration module for recording a part of nodes designated as candidate None nodes in the distributed system; None A node selection module, configured to select a node among the candidate None nodes as a None node when the data pieces of multiple Source nodes need to be aggregated into one node.
可选地,还包括Fixed节点选择模块,用于在需要将多个Source节点的数据片汇聚到多个Fixed节点的情况下,判断当前是否允许所述候选的None节点作为候选的Fixed节点,若是,则在所述分布式系统中随机选取多个节点作为Fixed节点,否则在所述分布式系统中所述候选的None节点之外随机选取多个节点作为Fixed节点。Optionally, it also includes a Fixed node selection module, which is used to determine whether the candidate None node is currently allowed as a candidate Fixed node when the data pieces of multiple Source nodes need to be aggregated to multiple Fixed nodes, if , then randomly select multiple nodes in the distributed system as Fixed nodes, otherwise randomly select multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.
可选地,还包括Source节点选择模块,用于在需要将分片的数据保存到Source节点的情况下,判断当前是否允许所述候选的None节点作为候选的Source节点,若是,则在所述分布式系统中随机选取多个节点作为Source节点,否则在所述分布式系统中所述候选的None节点之外随机选取多个节点作为Source节点。Optionally, it also includes a Source node selection module, which is used to determine whether the candidate None node is currently allowed as a candidate Source node when the fragmented data needs to be saved to the Source node, and if so, in the A plurality of nodes are randomly selected as Source nodes in the distributed system, otherwise, a plurality of nodes other than the candidate None nodes in the distributed system are randomly selected as Source nodes.
可选地,所述Source节点选择模块还用于在当前采用硬件感知方式的情况下,在所述分布式系统中所述候选的None节点之外按照本地性原则选取多个节点作为Source节点。Optionally, the Source node selection module is further configured to select multiple nodes in the distributed system as Source nodes according to the principle of locality, in the case of currently adopting a hardware-aware manner.
可选地,所述Source节点选择模块还用于在当前采用硬件感知方式的情况下,在所述分布式系统中所述候选的None节点之外按照本地性原则选取多个节点作为Source节点。Optionally, the Source node selection module is further configured to select multiple nodes in the distributed system as Source nodes according to the principle of locality, in the case of currently adopting a hardware-aware manner.
根据本发明的技术方案,在PrestoDB集群中指定一部分节点作为候选的None节点,从而将None节点的选取限定在一定范围之内,这样可以对该范围的节点进行内存升级和扩容,使之胜任计算要求。这种方式无需对整个PrestoDB集群的所有节点进行内存升级扩容,因此升级扩容的工作量比较低,并且能够提高整个PrestoDB集群的性能。According to the technical solution of the present invention, a part of nodes are designated as candidate None nodes in the PrestoDB cluster, thereby limiting the selection of None nodes within a certain range, so that the memory upgrade and expansion of the nodes in this range can be performed to make them competent for computing Require. This method does not need to upgrade and expand the memory of all nodes in the entire PrestoDB cluster, so the workload of upgrading and expanding is relatively low, and it can improve the performance of the entire PrestoDB cluster.
附图说明Description of drawings
附图用于更好地理解本发明,不构成对本发明的不当限定。其中:The accompanying drawings are used to better understand the present invention, and do not constitute improper limitations to the present invention. in:
图1是根据本发明实施例的示意图;Fig. 1 is a schematic diagram according to an embodiment of the present invention;
图2是根据本发明实施例的在分布式系统中选取节点的方法的示意图;Fig. 2 is a schematic diagram of a method for selecting a node in a distributed system according to an embodiment of the present invention;
图3是根据本发明实施例的在分布式系统中选取节点的装置的主要模块的示意图。Fig. 3 is a schematic diagram of main modules of an apparatus for selecting a node in a distributed system according to an embodiment of the present invention.
具体实施方式Detailed ways
以下结合附图对本发明的示范性实施例做出说明,其中包括本发明实施例的各种细节以助于理解,应当将它们认为仅仅是示范性的。因此,本领域普通技术人员应当认识到,可以对这里描述的实施例做出各种改变和修改,而不会背离本发明的范围和精神。同样,为了清楚和简明,以下的描述中省略了对公知功能和结构的描述。Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
在本发明实施例的方案中,事先指定PrestoDB集群中的一部分节点作为候选的None节点,在需要选择None节点时就从这一部分节点中选择。也可以设置配置项,对于是从这一部分节点中选择None节点还是在PrestoDB集群中随机选择None节点进行配置。在PrestoDB启动的时候,对该配置项进行解析,根据配置项中的配置信息,构建一个由对应的IP-Port对组成的一个列表,并在分配None节点时进行使用。其配置规范例如:None汇聚节点=IP地址1:端口1;IP地址2:端口2。即指定了IP地址为地址1和地址2的两个节点作为候选的None节点,端口分别为端口1和端口2。在配置项中,还可以对于是否允许上述的候选的None节点作为候选的Fixed节点进行配置,对于是否允许上述的候选的None节点作为候选的Source节点也进行配置。这样,在需要选择节点时,可按图2所示流程来进行。图2是根据本发明实施例的在分布式系统中选取节点的方法的示意图。该方法可由PrestoDB中的Coordinator节点来执行。In the solution of the embodiment of the present invention, a part of nodes in the PrestoDB cluster is designated in advance as candidate None nodes, and when it is necessary to select a None node, it is selected from this part of nodes. You can also set configuration items to configure whether to select None nodes from this part of nodes or randomly select None nodes in the PrestoDB cluster. When PrestoDB is started, the configuration item is parsed, and a list of corresponding IP-Port pairs is constructed according to the configuration information in the configuration item, and used when assigning None nodes. Its configuration specifications are, for example: None aggregation node = IP address 1: port 1; IP address 2: port 2. That is, two nodes whose IP addresses are address 1 and address 2 are designated as candidate None nodes, and the ports are port 1 and port 2 respectively. In the configuration item, it is also possible to configure whether to allow the above-mentioned candidate None node as a candidate Fixed node, and to configure whether to allow the above-mentioned candidate None node to be a candidate Source node. In this way, when a node needs to be selected, it can be performed according to the process shown in FIG. 2 . Fig. 2 is a schematic diagram of a method for selecting a node in a distributed system according to an embodiment of the present invention. This method can be executed by the Coordinator node in PrestoDB.
步骤S21:判断需要选择的节点的类型。在需要将分片的数据保存到Source节点的情况下,需选择的节点是Source节点,进入步骤S24。在需要对数据片进行汇聚处理时根据聚合函数的类型来确定需选择的节点的类型,在需选择None节点时,进入步骤S22;在需选择Fixed节点时,进入步骤S23。Step S21: Determine the type of node to be selected. In the case that the fragmented data needs to be saved to the Source node, the node to be selected is the Source node, and step S24 is entered. When data slices need to be aggregated, the type of node to be selected is determined according to the type of aggregation function. When a None node needs to be selected, go to step S22; when a Fixed node needs to be selected, go to step S23.
步骤S22:判断None节点是否从指定范围中选取。该判断根据上述的配置项进行。若是,则从配置项中记录的候选的None节点中选取一个节点作为None节点(步骤S221),否则随机选取一个节点作为None节点(步骤S222)。Step S22: Determine whether the None node is selected from the specified range. The judgment is made according to the above configuration items. If so, select a node from the candidate None nodes recorded in the configuration item as the None node (step S221), otherwise select a node randomly as the None node (step S222).
步骤S23:判断是否允许候选的None节点作为候选的Fixed节点。若是,则可以随机选取多个节点作为Fixed节点(步骤S231),否则在分布式系统中的候选的None节点之外随机选取多个节点作为Fixed节点。Step S23: Judging whether to allow the candidate None node to be the candidate Fixed node. If so, a plurality of nodes may be randomly selected as Fixed nodes (step S231 ), otherwise, a plurality of nodes other than candidate None nodes in the distributed system may be randomly selected as Fixed nodes.
步骤S24:判断是否采用硬件感知的方式,若是,进入步骤S241,否则进入步骤S242。Step S24: Determine whether hardware sensing is used, if yes, go to step S241, otherwise go to step S242.
步骤S241:判断是否允许候选的None节点作为候选的Source节点。若是,则可以按照本地性原则选取多个节点作为Source节点(步骤S2411),否则在分布式系统中的候选的None节点之外按照本地性原则选取多个节点作为Source节点(步骤S2412)。Step S241: Judging whether to allow the candidate None node to be the candidate Source node. If so, multiple nodes can be selected as Source nodes according to the principle of locality (step S2411), otherwise multiple nodes can be selected as Source nodes according to the principle of locality in addition to the candidate None nodes in the distributed system (step S2412).
步骤S242:判断是否允许候选的None节点作为候选的Source节点。若是,则可以随机选取多个节点作为Source节点(步骤S2421),否则在分布式系统中的候选的None节点之外随机选取多个节点作为Source节点(步骤S2422)。Step S242: Judging whether to allow the candidate None node to be the candidate Source node. If so, multiple nodes may be randomly selected as Source nodes (step S2421), otherwise multiple nodes other than the candidate None nodes in the distributed system may be randomly selected as Source nodes (step S2422).
图3是根据本发明实施例的在分布式系统中选取节点的装置的主要模块的示意图。如图3所示,本发明实施例的在分布式系统中选取节点的装置30主要包括配置模块31和None节点选择模块32。配置模块31用于记录分布式系统中被指定的作为候选的None节点的一部分节点。None节点选择模块32用于在需要将多个Source节点的数据片汇聚到一个节点的情况下,在候选的None节点中选择一个节点作为None节点。Fig. 3 is a schematic diagram of main modules of an apparatus for selecting a node in a distributed system according to an embodiment of the present invention. As shown in FIG. 3 , the device 30 for selecting a node in a distributed system according to the embodiment of the present invention mainly includes a configuration module 31 and a None node selection module 32 . The configuration module 31 is used for recording a part of nodes designated as candidate None nodes in the distributed system. The None node selection module 32 is used for selecting a node among the candidate None nodes as a None node when the data slices of multiple Source nodes need to be aggregated into one node.
装置30还可以包括还包括Fixed节点选择模块(图中未示出),用于在需要将多个Source节点的数据片汇聚到多个Fixed节点的情况下,判断当前是否允许候选的None节点作为候选的Fixed节点,若是,则在分布式系统中随机选取多个节点作为Fixed节点,否则在分布式系统中候选的None节点之外随机选取多个节点作为Fixed节点。Apparatus 30 may also include a Fixed node selection module (not shown in the figure), which is used to determine whether the currently available None node is allowed to serve as Candidate Fixed nodes, if so, randomly select multiple nodes as Fixed nodes in the distributed system, otherwise randomly select multiple nodes as Fixed nodes other than the candidate None nodes in the distributed system.
装置30还可以包括Source节点选择模块(图中未示出),用于在需要将分片的数据保存到Source节点的情况下,判断当前是否允许候选的None节点作为候选的Source节点,若是,则在分布式系统中随机选取多个节点作为Source节点,否则在分布式系统中候选的None节点之外随机选取多个节点作为Source节点。The device 30 may also include a Source node selection module (not shown in the figure), which is used to determine whether the currently available None node is allowed as a candidate Source node when the fragmented data needs to be saved to the Source node, and if so, Then randomly select multiple nodes as Source nodes in the distributed system, otherwise randomly select multiple nodes as Source nodes other than the candidate None nodes in the distributed system.
Source节点选择模块还可用于在当前采用硬件感知方式的情况下,在分布式系统中候选的None节点之外按照本地性原则选取多个节点作为Source节点。Source节点选择模块还用于在当前采用硬件感知方式的情况下,在分布式系统中候选的None节点之外按照本地性原则选取多个节点作为Source节点。The Source node selection module can also be used to select multiple nodes as Source nodes according to the principle of locality, in addition to the candidate None nodes in the distributed system when the hardware-aware method is currently used. The Source node selection module is also used to select multiple nodes as Source nodes in accordance with the principle of locality in addition to the candidate None nodes in the distributed system under the current hardware-aware mode.
根据本发明实施例的技术方案,在PrestoDB集群中指定一部分节点作为候选的None节点,从而将None节点的选取限定在一定范围之内,这样可以对该范围的节点进行内存升级和扩容,使之胜任计算要求。这种方式无需对整个PrestoDB集群的所有节点进行内存升级扩容,因此升级扩容的工作量比较低,并且能够提高整个PrestoDB集群的性能。According to the technical solution of the embodiment of the present invention, a part of nodes are designated as candidate None nodes in the PrestoDB cluster, thereby limiting the selection of None nodes within a certain range, so that the memory upgrade and capacity expansion of the nodes in this range can be performed, so that Competent computing requirements. This method does not need to upgrade and expand the memory of all nodes in the entire PrestoDB cluster, so the workload of upgrading and expanding is relatively low, and it can improve the performance of the entire PrestoDB cluster.
上述具体实施方式,并不构成对本发明保护范围的限制。本领域技术人员应该明白的是,取决于设计要求和其他因素,可以发生各种各样的修改、组合、子组合和替代。任何在本发明的精神和原则之内所作的修改、等同替换和改进等,均应包含在本发明保护范围之内。The above specific implementation methods do not constitute a limitation to the protection scope of the present invention. It should be apparent to those skilled in the art that various modifications, combinations, sub-combinations and substitutions may occur depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (6)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510016624.7A CN104580476B (en) | 2015-01-13 | 2015-01-13 | The method and apparatus for choosing node in a distributed system |
PCT/CN2016/070551 WO2016112831A1 (en) | 2015-01-13 | 2016-01-11 | Method and device of selecting distributed system node |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510016624.7A CN104580476B (en) | 2015-01-13 | 2015-01-13 | The method and apparatus for choosing node in a distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104580476A CN104580476A (en) | 2015-04-29 |
CN104580476B true CN104580476B (en) | 2018-09-14 |
Family
ID=53095633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510016624.7A Active CN104580476B (en) | 2015-01-13 | 2015-01-13 | The method and apparatus for choosing node in a distributed system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN104580476B (en) |
WO (1) | WO2016112831A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104580476B (en) * | 2015-01-13 | 2018-09-14 | 北京京东尚科信息技术有限公司 | The method and apparatus for choosing node in a distributed system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101102225A (en) * | 2007-07-26 | 2008-01-09 | 北京航空航天大学 | Wireless sensor network node management method |
US7710884B2 (en) * | 2006-09-01 | 2010-05-04 | International Business Machines Corporation | Methods and system for dynamic reallocation of data processing resources for efficient processing of sensor data in a distributed network |
CN101924777A (en) * | 2009-06-17 | 2010-12-22 | 中国移动通信集团公司 | Method, system and device for searching active nodes in P2P streaming media system |
CN103188161A (en) * | 2011-12-30 | 2013-07-03 | 中国移动通信集团公司 | Method and system of distributed data loading scheduling |
CN104168332A (en) * | 2014-09-01 | 2014-11-26 | 广东电网公司信息中心 | Load balance and node state monitoring method in high performance computing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100242042A1 (en) * | 2006-03-13 | 2010-09-23 | Nikhil Bansal | Method and apparatus for scheduling work in a stream-oriented computer system |
CN102572809A (en) * | 2010-12-27 | 2012-07-11 | 中国移动通信集团公司 | Method, system and equipment for selecting gateway nodes |
CN104580476B (en) * | 2015-01-13 | 2018-09-14 | 北京京东尚科信息技术有限公司 | The method and apparatus for choosing node in a distributed system |
-
2015
- 2015-01-13 CN CN201510016624.7A patent/CN104580476B/en active Active
-
2016
- 2016-01-11 WO PCT/CN2016/070551 patent/WO2016112831A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7710884B2 (en) * | 2006-09-01 | 2010-05-04 | International Business Machines Corporation | Methods and system for dynamic reallocation of data processing resources for efficient processing of sensor data in a distributed network |
CN101102225A (en) * | 2007-07-26 | 2008-01-09 | 北京航空航天大学 | Wireless sensor network node management method |
CN101924777A (en) * | 2009-06-17 | 2010-12-22 | 中国移动通信集团公司 | Method, system and device for searching active nodes in P2P streaming media system |
CN103188161A (en) * | 2011-12-30 | 2013-07-03 | 中国移动通信集团公司 | Method and system of distributed data loading scheduling |
CN104168332A (en) * | 2014-09-01 | 2014-11-26 | 广东电网公司信息中心 | Load balance and node state monitoring method in high performance computing |
Also Published As
Publication number | Publication date |
---|---|
CN104580476A (en) | 2015-04-29 |
WO2016112831A1 (en) | 2016-07-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11423082B2 (en) | Methods and apparatus for subgraph matching in big data analysis | |
Hu et al. | Time-and cost-efficient task scheduling across geo-distributed data centers | |
US8886781B2 (en) | Load balancing in cluster storage systems | |
US9645756B2 (en) | Optimization of in-memory data grid placement | |
US20170097853A1 (en) | Realizing graph processing based on the mapreduce architecture | |
US11303499B2 (en) | Moving nodes in a distributed system | |
US10747764B1 (en) | Index-based replica scale-out | |
EP3014444A1 (en) | Computing connected components in large graphs | |
US10599648B2 (en) | Optimized storage solution for real-time queries and data modeling | |
CN111083179B (en) | Internet of Things cloud platform, device interaction method and device based on Internet of Things cloud platform | |
Ubarhande et al. | Novel data-distribution technique for Hadoop in heterogeneous cloud environments | |
CN101739398A (en) | Distributed database multi-join query optimization algorithm | |
Liu et al. | An adaptive approach to better load balancing in a consumer-centric cloud environment | |
EP3183848B1 (en) | Optimization framework for multi-tenant data centers | |
CN103823846A (en) | Method for storing and querying big data on basis of graph theories | |
CN107995032B (en) | Method and device for building network experiment platform based on cloud data center | |
CN104063501A (en) | Copy balancing method based HDFS | |
US20230132117A1 (en) | Handling system-characteristics drift in machine learning applications | |
Dai et al. | Research and implementation of big data preprocessing system based on Hadoop | |
US20220156324A1 (en) | Graph refactorization method and graph refactorization apparatus | |
CN105635285B (en) | A kind of VM migration scheduling method based on state aware | |
CN104580476B (en) | The method and apparatus for choosing node in a distributed system | |
Xu et al. | Balancing reducer workload for skewed data using sampling-based partitioning | |
KR101878213B1 (en) | Method, apparatus and computer program for summaring of a weighted graph | |
CN107203554A (en) | A kind of distributed search method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |