RU2459239C2 - Distributed computer system of optimum solutions

Info

Publication number: RU2459239C2
Application number: RU2010145469/08A
Authority: RU
Inventors: Кирилл Евгеньевич Чирков (RU); Алексей Петрович Сарапульцев (RU); Герман Петрович Сарапульцев (RU)
Original assignee: Кирилл Евгеньевич Чирков; Алексей Петрович Сарапульцев; Герман Петрович Сарапульцев
Priority date: 2010-11-08
Filing date: 2010-11-08
Publication date: 2012-08-20
Also published as: RU2010145469A

Abstract

FIELD: information technology.

SUBSTANCE: a distributed computer system of optimum solutions, based on the PaaS distribution of computations, has a server, a server accessibility verification unit, a master computer, n slave computers (for example, n=2, i.e. a first and a second), a task assignment unit, an information control unit, first and second two-way virtual buses for sending the task and the result, first and second two-way virtual buses for polling machines, and a virtual initialisation bus, with the following connections: the server accessibility verification unit is connected, through the master computer and the virtual initialisation bus, to the control input of the server; the master computer is connected by the first and second two-way virtual buses for sending the task and the result to the first and second slave computers, respectively; the server is connected by the first and second two-way virtual polling buses to the first and second slave computers, respectively; the task assignment unit and the information control unit are connected by two-way virtual buses to corresponding inputs of the master computer.

EFFECT: high efficiency of a distributed computer system.

2 cl, 1 dwg

Description

The invention relates to computer engineering and can be used to build distributed computing systems with an optimal architecture for solving various problems (mathematical, management, applied, and others) that require large computing power.

A common problem in building computing systems or networks is minimizing the system while simultaneously increasing its speed and simplifying its software.

A significant factor, if not one of the main ones, is reducing energy consumption. For example, the total energy costs of a modern supercomputer over 2-3 years exceed the cost of the supercomputer itself.

There is still no consensus on the optimal design of a computing system. One example is Jaguar, a massively parallel supercomputer housed at the National Center for Computational Sciences (NCCS) in Oak Ridge, Tennessee. The supercomputer has a massively parallel architecture, that is, it consists of many autonomous nodes. All nodes are divided into two partitions, XT5 and XT4, built from Cray XT5 and XT4 models, respectively. The XT5 partition contains 18688 compute nodes, plus auxiliary nodes for user login and maintenance. Each compute node contains two quad-core AMD Opteron 2356 (Barcelona) processors clocked at 2.3 GHz, 16 GB of DDR2-800 memory, and a SeaStar 2+ router. In total, the partition has 149504 compute cores, more than 300 TB of memory, more than 6 PB of disk space, and a peak performance of 1.38 petaflops. The XT4 partition contains 7832 compute nodes plus auxiliary nodes for user login and maintenance. Each node contains a quad-core AMD Opteron 1354 (Budapest) processor clocked at 2.1 GHz, 8 GB of DDR2-800 memory (DDR2-667 in some nodes), and a SeaStar2 router. In total, the partition has 31328 compute cores, more than 62 TB of memory, more than 600 TB of disk space, and a peak performance of 263 teraflops. The memory bandwidth is 578 TB/s, and the I/O subsystem, the bottleneck of many high-performance systems, can move 284 GB of data every second.

Some of the problems Jaguar solves relate to climate change; others involve modeling combustion processes, see http://ru.wikipedia.org/wiki/jaguar/.

There is also the Russian supercomputer Lomonosov, the first hybrid supercomputer of this scale in Russia and Eastern Europe. It uses three types of compute nodes and processors of different architectures. The main nodes, which provide over 90% of the system's performance, are built on the T-Blade2 blade platform. The supercomputer uses a three-tier storage system with a total capacity of up to 1350 TB and the Lustre parallel file system. The storage system gives all compute nodes simultaneous access to data, with an aggregate read speed of 20 Gb/s and an aggregate write speed of 16 Gb/s. The supercomputer is intended for resource-intensive computing tasks in fundamental scientific research, and for research into algorithms and software for powerful computing systems, see http://www/t-platforms/ru/clusters/umgue/lomonosov.html. The GRID system (collider) is taken as the prototype, see

http://www.gazeta.ru/science/2010/04/02_a_3346640.shtml

These supercomputers have two main drawbacks:

- a relatively low performance/cost ratio;

- high energy consumption (electricity costs may exceed the cost of the supercomputer within 2-4 years).

The technical objective of the invention is to optimize the cost-effectiveness ratio.

To achieve this objective, a distributed computing system of optimal solutions is proposed, based on the PaaS distribution of computations. It differs from other systems in that it contains a server (a server is a computer singled out from a group of personal computers to perform some service task without direct human involvement; in some cases the server and a workstation may have the same hardware configuration, see "All About Servers", Computer Bild No. 22, 2008), a server availability check unit, a master computer, n slave computers (for example, n = 2, i.e. a first and a second), a task setting unit, an information control unit, first and second bidirectional virtual buses for sending the task and the result, first and second bidirectional virtual buses for polling machines, and a virtual initialization bus, with the following connections: the server availability check unit is connected, through the master computer and the virtual initialization bus, to the control input of the server; the master computer is connected by the first and second bidirectional virtual buses for sending the task and the result to the first and second slave computers, respectively; the server is connected by the first and second bidirectional virtual polling buses to the first and second slave computers, respectively; the task setting unit and the information control unit are connected by bidirectional virtual buses to the corresponding inputs of the master computer.

The drawing shows a functional diagram of the system, in which: 1 is the master computer, 2 and 3 are the slave computers, 4 is the software task setting unit, 5 is the server, 6 is the server availability check unit, 7 is the information control unit, 8 and 9 are the first and second bidirectional virtual buses for sending the task and the result, 10 is the virtual initialization buses, 11 and 12 are the first and second bidirectional virtual buses for polling machines, respectively, and 13 is the virtual information control bus.

The functional diagram has the following connections: the server availability check unit 6 is connected, through the master computer 1 and the virtual initialization bus 13, to the control input of the server 5; the master computer 1 is connected by the first and second bidirectional virtual buses 8 and 9 for sending the task and the result to the first and second slave computers 2, respectively; the server 5 is connected by the first and second bidirectional virtual polling buses 11 and 12 to the first and second slave computers 2, respectively; the task setting unit 4 and the information control unit 7 are connected by bidirectional virtual buses to the corresponding inputs of the master computer 1.

The system operates as follows.

Server 5 dynamically monitors the state of the network over virtual buses 11 and 12.

When the client part starts, it checks the availability of server 5, the access time to it, and the speed of the communication channel. This is done by the server check unit, a subroutine that is part of the client. Next, the computer's capacity is measured and the amount of resources it can share is calculated. These data are sent to the server, which registers the client in the network. Server 5 records in its database the client's IP address (the computer's address on the Internet), the connection parameters, and the machine's capacity. A signal permitting further work is then sent to the client machine.
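The registration step described above can be sketched roughly as follows. This is only an illustration: the names `ClientRecord`, `Registry`, and the individual fields (`latency_ms`, `bandwidth_mbps`, `shareable_cpu`) are assumptions made for the example and do not appear in the patent, which does not specify a data format.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    """What the server is said to store per client (field names are hypothetical)."""
    ip: str                 # client's IP address
    latency_ms: float       # connection parameter: access time to the server
    bandwidth_mbps: float   # connection parameter: channel speed
    shareable_cpu: float    # share of machine capacity offered to the network


class Registry:
    """In-memory stand-in for the server's client database."""

    def __init__(self) -> None:
        self.clients: dict[str, ClientRecord] = {}

    def register(self, ip: str, latency_ms: float,
                 bandwidth_mbps: float, shareable_cpu: float) -> bool:
        # Record the client, then return True as the "permit further work" signal.
        self.clients[ip] = ClientRecord(ip, latency_ms, bandwidth_mbps, shareable_cpu)
        return True
```

A client would call `register(...)` once at start-up, after measuring its own channel and capacity, and continue only if the call returns the permit signal.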

System levels

In parallel with the processing of these tasks, which are ordinary application-level programs, a low-level part of the system operates: it inspects the running code, primarily code that consumes system resources for a long time.

If this inspection finds code whose execution can be optimized, the task setting unit is started, and at the same time a request to find available machines, together with a communication key, is sent to the server. The task setting unit is a subroutine that processes and prepares objects for further optimization. The communication key protects against information theft through machine substitution. The server receives the signal from the client, selects a machine that is currently free, checks that it is reachable on the network by sending it the communication key, and sends the connection data back to the client.
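A minimal sketch of the server-side selection just described, under stated assumptions: the patent does not say how the communication key is generated or how reachability is checked, so `new_communication_key`, `select_machine`, and the `is_reachable` callback are hypothetical names, and a random token from the standard `secrets` module stands in for the key.

```python
import secrets
from typing import Callable, Optional


def new_communication_key() -> str:
    # Assumption: the key is a random token; the patent does not define its form.
    return secrets.token_hex(16)


def select_machine(machines: dict[str, dict], key: str,
                   is_reachable: Callable[[str, str], bool]) -> Optional[dict]:
    """Pick the first free machine that answers when sent the key.

    machines maps an IP address to {"busy": bool}; is_reachable(ip, key)
    stands in for the network check performed by the server.
    """
    for ip, info in machines.items():
        if not info["busy"] and is_reachable(ip, key):
            info["busy"] = True              # reserve it for this task
            return {"ip": ip, "key": key}    # connection data sent back to the client
    return None                              # no free machine right now
```

In this sketch the server hands the same key to both sides, so the two client machines can later confirm each other's identity before any task data is exchanged.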

New tasks are set and solved continuously, regardless of the status of preceding tasks.

Solving a task on slave client machines

The client machines connect to each other and verify the communication key. If the keys match, one client machine (the master) sends the other (the slave) a task to execute. The second (slave) client machine solves the task and sends back the answer.
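The key check and task hand-off on the slave side might look like the sketch below. It is an assumption-laden illustration: `handle_task` and the shape of the reply are invented for the example, the task is modeled as a plain Python callable, and the constant-time comparison via `hmac.compare_digest` is a common precaution rather than something the patent prescribes.

```python
import hmac
from typing import Any, Callable


def handle_task(expected_key: str, presented_key: str,
                task: Callable[..., Any], args: tuple) -> dict:
    """Slave side: run the task only if the master presents the matching key."""
    # Constant-time comparison avoids leaking key contents through timing.
    if not hmac.compare_digest(expected_key.encode(), presented_key.encode()):
        return {"ok": False, "error": "key mismatch"}
    # Keys match: solve the task and send the answer back.
    return {"ok": True, "result": task(*args)}
```

For instance, `handle_task(key, key, sum, ([1, 2, 3],))` would run the task, while a call with a different presented key is refused without executing anything.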

Solving a task on the master client machine

If the answer from the machine's own computing resources arrives first, the task is considered complete; otherwise the solution from the network is used. (Whichever way the result is obtained, the result acceptor on the master computer sends a signal to the server, and the server forwards this signal to the slave computer, which stops executing the task.)
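The "first answer wins" behavior can be sketched with two concurrent solvers. This is a simplified, single-process illustration: `race` and the solver signatures are hypothetical, threads stand in for the master's own computation and the remote slave, and a `threading.Event` stands in for the stop signal that the patent routes through the server.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Any, Callable

Solver = Callable[[threading.Event], Any]


def race(local_solver: Solver, remote_solver: Solver) -> tuple[str, Any]:
    """Run both solvers and return (source, result) of whichever finishes first."""
    stop = threading.Event()  # stand-in for the stop signal relayed by the server
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            pool.submit(local_solver, stop): "local",
            pool.submit(remote_solver, stop): "network",
        }
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        stop.set()  # tell the slower side to abandon the task
        winner = next(iter(done))
        return futures[winner], winner.result()
```

Each solver is expected to check `stop` periodically and return early once it is set; a solver that ignores the event would simply run to completion while its result is discarded.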

Note: the server stores no information about the task being solved other than its status (the current state: whether the task has been solved or not).

Possible disadvantages:

- the answer to the same task run with the same variables may differ if random number generators are used in the solution;

- simultaneous operation with antivirus software may be impossible (the effect of simultaneous low-level access);

- variability in task solution time, caused by the current load of the network and of the slave client machines.

Claims

1. A distributed computing system of optimal solutions based on the distribution of computations in the PaaS format, which differs from other systems in that it contains a server, a server availability check unit, a master computer, n slave computers (for example, n = 2, i.e. a first and a second), a task setting unit, an information control unit, first and second bidirectional virtual buses for sending the task and the result, first and second bidirectional virtual buses for polling machines, and a virtual initialization bus, with the following connections: the server availability check unit is connected, through the master computer and the virtual initialization bus, to the control input of the server; the master computer is connected by the first and second bidirectional virtual buses for sending the task and the result to the first and second slave computers, respectively; the task setting unit and the information control unit are connected by bidirectional virtual buses to the corresponding inputs of the master computer; while the computing system itself is based on datagram technology, i.e. on the independent forwarding of packets in packet networks without establishing logical channels, and in datagram networks routing is performed packet by packet, each packet carrying a destination address and traveling to the destination node independently, so that the packets belonging to a single message can reach the destination node by different routes.

2. The distributed computing system according to claim 1, characterized in that tasks are solved using Internet-connected computers owned by private users that have a lower concurrent load of local tasks, especially at night, while any participant in the system (a private user) has the right, along with other participants, to request the amount of resources required to solve a local task.