Nothing Special   »   [go: up one dir, main page]

WO2024001411A1 - Multi-thread scheduling method and device - Google Patents

Multi-thread scheduling method and device Download PDF

Info

Publication number
WO2024001411A1
WO2024001411A1 PCT/CN2023/087477 CN2023087477W WO2024001411A1 WO 2024001411 A1 WO2024001411 A1 WO 2024001411A1 CN 2023087477 W CN2023087477 W CN 2023087477W WO 2024001411 A1 WO2024001411 A1 WO 2024001411A1
Authority
WO
WIPO (PCT)
Prior art keywords
thread
state
message
linked list
target
Prior art date
Application number
PCT/CN2023/087477
Other languages
French (fr)
Chinese (zh)
Inventor
沈洋
徐金林
牛新伟
韩建辉
李铮
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司 filed Critical 深圳市中兴微电子技术有限公司
Publication of WO2024001411A1 publication Critical patent/WO2024001411A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Definitions

  • the embodiments of the present application relate to the technical field of core network processors, and specifically, to a multi-thread scheduling method and device.
  • network processors as the core component of data forwarding in the field of digital communication, are specifically used in various tasks in the communication field such as packet processing, protocol analysis, route lookup, voice/data aggregation, and firewalls.
  • Embodiments of the present application provide a multi-thread scheduling method and device to at least solve the problem in related technologies that it is impossible to ensure that messages entering the micro engine first are forwarded first in the order in which the messages enter the micro engine.
  • a multi-thread scheduling method including: after each message enters the processor core, the thread number carried by each message is stored in sequence corresponding to the thread group to which the message belongs. in the thread management linked list, and establish a mapping relationship between the thread number and the node of the thread management linked list;
  • the target thread in the executable state is scheduled from the thread group in the order in which messages enter the processor core, and the target thread is Enter the pipeline corresponding to the target thread.
  • a multi-thread scheduling device including: a setting module configured to sequentially store the thread number carried by each message into the processor core after each message enters the processor core. in the thread management linked list corresponding to the thread group to which the message belongs, and establish a mapping relationship between the thread number and the node of the thread management linked list; the scheduling module is used to calculate the mapping relationship according to the mapping relationship and the thread state machine corresponding to each thread. status, schedule the target thread in the executable state from the thread group according to the order in which messages enter the processor core, and input the target thread into the pipeline corresponding to the target thread.
  • a computer-readable storage medium is also provided, wherein the computer-readable storage medium A computer program is stored in the medium, wherein the computer program is configured to execute the steps in any of the above method embodiments when running.
  • an electronic device including a memory and a processor.
  • a computer program is stored in the memory, and the processor is configured to run the computer program to perform any of the above. Steps in method embodiments.
  • the thread scheduling method is optimized by introducing a thread management linked list to ensure that messages that enter the processor core first are scheduled and executed first. Therefore, it is possible to solve the problem in the related technology that the packets that enter the kernel first are forwarded first according to the order in which the packets enter the kernel, thereby achieving the effect of reducing the execution delay of the packets.
  • Figure 1 is a hardware structure block diagram of a computer terminal running the multi-thread scheduling method according to the embodiment of the present application;
  • Figure 2 is a flow chart of a multi-thread scheduling method according to an embodiment of the present application.
  • Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present application.
  • Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present application.
  • Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present application.
  • Figure 6 is a schematic structural diagram of a coarse-grained multi-thread scheduling device according to an embodiment of the present application.
  • Figure 7 is a schematic diagram corresponding to threads and thread management linked lists according to an embodiment of the present application.
  • Figure 8 is a schematic diagram of thread state switching according to an embodiment of the present application.
  • Figure 9 is a flowchart for executing coarse-grained multi-thread scheduling according to an embodiment of the present application.
  • FIG. 1 is a hardware structure block diagram of a computer terminal running the multi-thread scheduling method according to the embodiment of the present application.
  • the computer terminal may include one or more (only one is shown in Figure 1) processors 102 (the processor 102 may include but is not limited to a microprocessor (Central Processing Unit, MCU) or a programmable logic device (Field Programmable Gate Array, FPGA) and other processing devices) and a memory 104 for storing data, wherein the above-mentioned computer terminal may also include a transmission device 106 for communication functions and an input and output device 108.
  • processors 102 may include but is not limited to a microprocessor (Central Processing Unit, MCU) or a programmable logic device (Field Programmable Gate Array, FPGA) and other processing devices) and a memory 104 for storing data
  • the above-mentioned computer terminal may also include a transmission device 106 for communication functions and an input and output device 108.
  • the structure shown in Figure 1 is only illustrative, and it does not limit the structure of the above-mentioned computer terminal.
  • the computer terminal may also include more or fewer components than
  • the memory 104 can be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the multi-thread scheduling method in the embodiment of the present application.
  • the processor 102 executes the computer program by running the computer program stored in the memory 104.
  • Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 104 may further include memory located remotely relative to the processor 102, and these remote memories may be connected to the computer terminal through a network. Examples of the above networks include, but are not limited to Internet, intranet, local area network, mobile communication network and their combinations.
  • the transmission device 106 is used to receive or send data via a network.
  • Specific examples of the above-mentioned network may include a wireless network provided by a communication provider of the computer terminal.
  • the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 may be a radio frequency (Radio Frequency, RF for short) module, which is used to communicate with the Internet wirelessly.
  • NIC Network Interface Controller
  • FIG. 2 is a flow chart of the multi-thread scheduling method according to the embodiment of the present application. As shown in Figure 2, the process includes the following steps:
  • Step S202 After each message enters the processor core, the thread number carried by each message is sequentially stored in the thread management linked list corresponding to the thread group to which the message belongs, and a node of the thread number and thread management linked list is established. the mapping relationship between;
  • Step S204 According to the mapping relationship and the state of the thread state machine corresponding to each thread, the target thread in the executable state is scheduled from the thread group according to the order in which messages enter the processor core, and the target thread is The thread input corresponds to the pipeline of the target thread.
  • the method further includes: assigning a thread number to each message entering the processor core, and dividing all threads into thread groups corresponding to the number of pipelines.
  • each thread group corresponds to a thread management linked list, and the number of nodes in each thread management linked list is the same as the number of threads included in each thread group.
  • step S202 of this embodiment the mapping relationship between the nodes of the thread management linked list and the thread numbers is represented by a bitmap.
  • step S204 of this embodiment it includes: calculating the transmission request corresponding to each node according to the value of the bitmap and the readiness status of each thread in the thread group, and performing priority scheduling on the thread with the transmission request. , so that the thread that first enters the processor core and is in the ready state is authorized, converted to the executable state, and the thread in the executable state is scheduled as the target thread; obtains the instruction corresponding to the target thread, And input the target thread into the pipeline corresponding to the target thread to execute the instruction.
  • the method further includes: scheduling the message after the instruction has been executed out of the processor core, and releasing the thread corresponding to the message; The thread number of the message is cleared from the node of the thread management linked list, and other thread numbers stored in the node of the thread management linked list are moved forward by one node in sequence.
  • each pipeline corresponds to a main control state machine
  • each thread corresponds to a thread state machine
  • each pipeline is in two states: idle and authorized
  • each thread is in four states: idle, ready, executable and waiting. Transition between states.
  • each pipeline transitions between two states: idle and authorized, and each thread transitions between four states: idle, ready, executable, and waiting, including: when the main control state machine is in an idle state, it means Allow new messages to enter the processor core. After the new message enters the processor core, the corresponding thread is in an idle state; the thread number of the message is stored in the node of the thread management linked list, and the thread number associated with the thread is retrieved from the instruction storage module.
  • the thread transfers from the idle state to the ready state; in the authorization state of the main control state machine, the thread in the ready state is authorized, and the authorized thread transfers from the ready state to the executable state; in After the thread in the executable state executes the corresponding instruction, it transitions from the executable state to the idle state; when the thread in the executable state is waiting for data, table lookup, or re-fetching instructions during the execution of the instruction, the thread is changed from the executable state to the idle state.
  • the execution state transfers to the waiting state; the thread in the waiting state waits for data After the end, or the table lookup result is returned, or the instruction is retrieved and returned, the waiting state is transferred back to the ready state; after the thread number of the thread that has completed the instruction is released, the main control state machine enters the idle state.
  • the thread scheduling method is optimized by introducing a thread management linked list to ensure that the packets that enter the processor core first are scheduled and executed first. Therefore, it is possible to solve the problem in the related technology that the packets entering the micro-engine first are forwarded first according to the order in which the packets enter the micro-engine, thereby achieving the effect of reducing the execution delay of the packets.
  • the computer software product is stored in a storage medium (such as read-only memory/random access memory).
  • the memory Read-Only Memory/Random Access Memory, ROM/RAM), magnetic disk, optical disk
  • ROM/RAM Read-Only Memory/Random Access Memory
  • magnetic disk magnetic disk
  • optical disk includes several instructions to cause a terminal device (which can be a mobile phone, computer, server, or network device, etc.) to execute this application Methods described in various embodiments.
  • module may be a combination of software and/or hardware that implements a predetermined function.
  • the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
  • Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present application. As shown in Figure 3, the device includes: a creation module 10 and a scheduling module 20.
  • the establishment module 10 is used to store the thread number carried by each message into the thread management linked list corresponding to the thread group to which the message belongs after each message enters the processor core, and establish the thread number and thread number. Manage the mapping relationship between nodes in the linked list;
  • the scheduling module 20 is configured to schedule the target thread in the executable state from the thread group according to the order in which messages enter the processor core according to the mapping relationship and the state of the thread state machine corresponding to each thread. And input the target thread into the pipeline corresponding to the target thread.
  • Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present application. As shown in Figure 4, in addition to all the modules shown in Figure 3, the device also includes:
  • the allocation module 30 is configured to allocate a thread number to each message entering the processor core, and divide all threads into thread groups corresponding to the number of pipelines.
  • each thread group corresponds to a thread management linked list
  • the number of nodes in each thread management linked list is the same as the number of threads included in each thread group.
  • Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present application. As shown in Figure 5, in addition to all the modules shown in Figure 4, the device also includes:
  • the release module 40 schedules the message that has completed the execution of the instruction corresponding to the target thread out of the processor core, releases the thread corresponding to the message, and clears the thread number of the message in the node of the thread management linked list. , and move other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.
  • each pipeline corresponds to a main control state machine
  • each thread corresponds to a thread state machine
  • each pipeline is in two states: idle and authorized, and each thread is in idle, ready, and available states. Transition between execution and waiting 4 states.
  • each of the above modules can be implemented through software or hardware.
  • it can be implemented through It can be implemented in the following manner, but is not limited to this: the above-mentioned modules are all located in the same processor; or, the above-mentioned modules are located in different processors in any combination.
  • Embodiments of the present application also provide a computer-readable storage medium that stores a computer program, wherein the computer program is configured to execute the steps in any of the above method embodiments when running.
  • the computer-readable storage medium may include but is not limited to: U disk, read-only memory (Read-Only Memory, referred to as ROM), random access memory (Random Access Memory, referred to as RAM) , mobile hard disk, magnetic disk or optical disk and other media that can store computer programs.
  • ROM read-only memory
  • RAM random access memory
  • mobile hard disk magnetic disk or optical disk and other media that can store computer programs.
  • An embodiment of the present application also provides an electronic device, including a memory and a processor.
  • a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
  • the above-mentioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the above-mentioned processor, and the input-output device is connected to the above-mentioned processor.
  • a micro-engine when a micro-engine receives a new message, it first allocates a thread number to the new message.
  • the priority of the message is not distinguished; the traditional least recently used (Least Recently Used, LRU) scheduling algorithm It can make the most frequently used threads get the highest priority, but when the number of threads is large, there is no guarantee that the packets that enter the kernel first will be executed and forwarded first.
  • LRU least recently used
  • FIG. 6 is a multi-thread scheduling device based on coarse-grainedness according to an embodiment of the present application.
  • a schematic structural diagram of the device is shown in Figure 6.
  • the multi-thread scheduling system includes: a thread scheduling module 11, an instruction storage module 12 and a completion scheduling module 13.
  • Thread scheduling module 11 is used to allocate thread numbers to new messages, divide all threads into thread groups corresponding to the number of pipeline lines, and each thread group schedules ready executable threads according to the order in which the messages enter, from the instruction storage
  • the instruction corresponding to the thread is obtained in the module, and is launched into the pipeline corresponding to the thread group for execution. After the execution is completed, the completion scheduling module 13 is notified; in this implementation, the thread scheduling module 11 functionally includes the above-mentioned establishment in the embodiment. Functions of module 10, scheduling module 20 and allocation module 30.
  • the instruction storage module 12 is used to store instructions used for thread execution, including an instruction level 2 cache and an instruction level 1 cache;
  • the completion scheduling module 13 is used to receive the message execution completion signal sent by the thread scheduling module, schedule the corresponding message out of the kernel and release the thread number information; in this implementation, the completion scheduling module 13 is functionally equivalent to the above embodiment.
  • a thread management linked list is introduced to manage the thread number information corresponding to the packet entry sequence, and schedule advanced messages and ready executable threads from each thread group to launch into the corresponding Pipeline execution; wherein, each thread group corresponds to a thread management linked list, and the number of linked list nodes is the same as the number of threads contained in each thread group;
  • Figure 7 is a schematic diagram corresponding to threads and thread management linked lists according to an embodiment of the present application, as shown in Figure As shown in Figure 7, 20 threads are divided into 2 thread groups. One thread group contains 10 threads.
  • the corresponding thread management linked list has 10 nodes node0-node9. The above nodes are used to store the thread numbers assigned to incoming packets.
  • the thread number information it carries is stored in the nodes node0-node9 from left to right. There is an existence between the thread management list node and the thread number.
  • a layer of mapping relationship The mapping relationship between thread management linked list nodes and thread numbers can be maintained through bitmap.
  • Each thread management linked list has 10 bitmap values corresponding to the nodes; according to the bitmap value and thread
  • the readiness status (rdy) of each thread in the group is calculated to determine the launch request corresponding to each node, and participates in priority scheduling (SP). Executable threads that advance messages and are ready are authorized and launched into the pipeline corresponding to the thread group. .
  • the messages are scheduled out of the kernel, the corresponding threads are released, the corresponding thread number information is retrieved in the thread management linked list, the thread number information of the matching node is cleared, and at the same time, the thread number information stored in all nodes to the right of the matching node is The thread number information is shifted one node to the left for storage.
  • the above two thread groups correspond to two pipelines respectively; the number of threads in each thread group can be 10 or any other number; the pipeline can be divided into five levels of pipelines or other levels.
  • Running water e.g., seventh level, etc.
  • the thread scheduling module 11 can also control the state transition of each thread.
  • Each pipeline corresponds to a main control state machine, and each thread corresponds to a thread state machine.
  • the specific conversion includes the following steps:
  • the main control state machine when the main control state machine is in the idle (IDLE) state, it means that new packets are allowed to enter.
  • IDLE idle
  • new packets enter the kernel they are first in the IDLE state;
  • the authorized thread is transferred from the rdy state to the running (exe) state. Only one thread in each thread group can be authorized at the same time.
  • the two thread groups can schedule the executable thread of the most advanced package from their respective groups and launch it into the corresponding Pipeline (pipeline 0 or pipeline 1) execution;
  • the corresponding thread transfers from the exe state to the idle state, the message is scheduled out of the kernel, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released;
  • GRANT authorizes the remaining threads with the most advanced package in the rdy state to enter the exe state. After the data waiting period of the thread that previously transferred to the wait state is completed, or the table lookup data has been returned, or the index is retrieved and returned, the wait state will be restarted. Transfer to rdy state; since the thread management linked list saves its packet entry sequence information, when the thread currently in exe state transfers to other states, the thread transferred from wait state to rdy state can still receive priority scheduling until it completes the execution of the package Send an instruction and the thread changes from exe state to idle state;
  • the main control state machine After releasing the corresponding thread number, the main control state machine enters the IDLE state.
  • Figure 9 is a flow chart for executing coarse-grained multi-thread scheduling according to an embodiment of the present application. As shown in Figure 9, when a new packet enters, it is first determined whether the thread group is in the IDLE state. If not, the new packet It is necessary to wait until there is an idle thread available for allocation; if so, select a thread i from the idle thread and assign it to the incoming message; then, send an instruction fetch instruction to the instruction storage module.
  • thread i is transferred from IDLE The state transfers to rdy state; several threads in rdy state in the same thread group perform SP scheduling (GRANT), so that the thread of the most advanced package obtains authorization GRANTi; after obtaining authorization, thread i transfers to exe state; thread i in exe state is in When an instruction with data dependency is found during instruction execution, or an instruction with table lookup returns data dependency, or when instructions need to be re-fetched and require a long wait, thread i transfers to wait state; after the data waiting cycle is completed, or the table lookup data has been returned, or after re-fetching and returning, thread i will re-enter the rdy state from the wait state. Due to the use of SP scheduling based on the packet entry sequence, waiting for the thread currently in the exe state When the thread transfers to other states, thread i can receive priority scheduling until it completes the execution of the package sending instructions, transfers to the idle state, and releases the thread.
  • SP scheduling GRANT
  • the main control state machine is in the IDLE state, indicating that new packets are allowed to enter.
  • a new packet enters the kernel, it is first in the idle state, sends an instruction fetch instruction to the instruction storage module, and maintains the correspondence in Figure 7 according to the assigned thread number.
  • the thread management linked list of the thread stores the thread number information in the corresponding node.
  • the corresponding thread transfers from the idle state to the rdy state.
  • Several threads in the rdy state in the same thread group combine each thread in the thread management linked list.
  • the order of incoming packet threads obtained by node mapping is performed by SP scheduling (GRANT), so that the thread of the most advanced package is authorized.
  • the authorized thread is transferred from the rdy state to the exe state. Only one thread in each thread group can be authorized at the same time.
  • the two thread groups can schedule the executable thread of the most advanced package from their respective groups and launch it into the corresponding pipeline (pipeline 0 or pipeline 1) for execution.
  • the thread in the exe state executes the package and sends the instruction, and the corresponding thread is transferred from the exe state.
  • the message is scheduled out of the kernel, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released.
  • the thread in the exe state finds instructions with data correlation during the instruction execution, or returns from the table lookup.
  • the corresponding thread When there is a data-related instruction, or when instructions need to be re-fetched and require a long wait, the corresponding thread will be transferred from the exe state to the wait state, and GRANT will authorize the remaining threads with the most advanced package in the rdy state to enter the exe state until the previous transfer.
  • the thread data waiting period in the wait state is completed, or the table lookup data has been returned, or after re-fetching and returning, the wait state is transferred back to the rdy state.
  • the thread management linked list saves its packet entry sequence information, when the thread currently in the exe state When the thread transfers to other states, the thread that transfers from the wait state to the rdy state can still receive priority scheduling until it completes the execution of the package sending instructions, the thread transfers from the exe state to the idle state, and the main control state machine enters the IDLE state.
  • the coarse-grained multi-thread scheduling method ensures that the packets that enter the kernel first are scheduled first, and are only switched when a costly pause occurs (such as re-fetching, table lookup, etc.) Other threads execute, greatly reducing the possibility of slowing down the execution speed of any message, and reducing the execution delay of any message.
  • modules or steps of the present application can be implemented using general-purpose computing devices, and they can be concentrated on a single computing device, or distributed across a network composed of multiple computing devices. They may be implemented in program code executable by a computing device, such that they may be stored in a storage device for execution by the computing device, and in some cases may be executed in a sequence different from that shown herein. Or the described steps can be implemented by making them into individual integrated circuit modules respectively, or by making multiple modules or steps among them into a single integrated circuit module. As such, the application is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present application provide a multi-thread scheduling method and device. The method comprises: after each message enters a processor core, sequentially storing a thread number carried by each message in a thread management linked list corresponding to a thread group to which the message belongs, and establishing a mapping relationship between the thread number and a node of the thread management linked list; and according to the mapping relationship and the state of a thread state machine corresponding to each thread, scheduling a target thread in an executable state from the thread group according to the order in which the messages enter the processor core, and inputting the target thread into a pipeline corresponding to the target thread.

Description

多线程调度方法及装置Multi-thread scheduling method and device
相关申请Related applications
本申请要求于2022年6月27号申请的、申请号为202210738293.8的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims priority to the Chinese patent application with application number 202210738293.8 filed on June 27, 2022, the entire content of which is incorporated into this application by reference.
技术领域Technical field
本申请实施例涉及核网络处理器技术领域,具体而言,涉及一种多线程调度方法及装置。The embodiments of the present application relate to the technical field of core network processors, and specifically, to a multi-thread scheduling method and device.
背景技术Background technique
随着通信技术的飞速发展,网络处理器作为数字通信领域中数据转发的核心部件,特定应用于包处理、协议分析、路由查找、声音/数据汇聚、防火墙等通信领域的各项任务。With the rapid development of communication technology, network processors, as the core component of data forwarding in the field of digital communication, are specifically used in various tasks in the communication field such as packet processing, protocol analysis, route lookup, voice/data aggregation, and firewalls.
为了适应不断发展的网络技术,对网络处理器的处理能力提出越来越高的要求。传统的网络处理器采用细粒度多线程结构方式,在使用并行处理技术提高微引擎内核数据处理并行度的同时,利用多线程的切换来隐藏流水线和存储器的延迟,从而提高处理器的吞吐量;细粒度多线程每个时钟周期在线程间进行一次切换,使多个线程的指令执行过程交织在一起,这种交织通常采用Round-Robin轮询调度算法对准备就绪的线程按序号进行调度,无法保证按照报文进入微引擎的顺序使先进入微引擎的报文优先得到转发,且会存在一个准备就绪、没有停顿的线程可能被其他线程的执行延迟的情况,从而减缓个体线程的执行速度。In order to adapt to the ever-evolving network technology, higher and higher requirements are placed on the processing capabilities of network processors. Traditional network processors adopt a fine-grained multi-thread structure. While using parallel processing technology to improve the parallelism of micro-engine core data processing, they also use multi-thread switching to hide pipeline and memory delays, thereby improving processor throughput; Fine-grained multi-threading switches between threads once every clock cycle, so that the instruction execution processes of multiple threads are intertwined. This kind of interleaving usually uses the Round-Robin polling scheduling algorithm to schedule ready threads according to serial numbers, which cannot It is guaranteed that the packets that enter the micro-engine first are forwarded first according to the order in which the packets enter the micro-engine. There will be a situation where a thread that is ready and has no pause may be delayed by the execution of other threads, thereby slowing down the execution speed of individual threads.
发明内容Contents of the invention
本申请实施例提供了一种多线程调度方法及装置,以至少解决相关技术中无法保证按照报文进入微引擎的顺序使先进入微引擎的报文优先得到转发的问题。Embodiments of the present application provide a multi-thread scheduling method and device to at least solve the problem in related technologies that it is impossible to ensure that messages entering the micro engine first are forwarded first in the order in which the messages enter the micro engine.
根据本申请的一个实施例,提供了一种多线程调度方法,包括:每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;According to an embodiment of the present application, a multi-thread scheduling method is provided, including: after each message enters the processor core, the thread number carried by each message is stored in sequence corresponding to the thread group to which the message belongs. in the thread management linked list, and establish a mapping relationship between the thread number and the node of the thread management linked list;
根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将所述目标线程输入与所述目标线程对应的流水线。According to the mapping relationship and the state of the thread state machine corresponding to each thread, the target thread in the executable state is scheduled from the thread group in the order in which messages enter the processor core, and the target thread is Enter the pipeline corresponding to the target thread.
根据本申请的另一个实施例,提供了一种多线程调度装置,包括:建立模块,用于在每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;调度模块,用于根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将所述目标线程输入与所述目标线程对应的流水线。According to another embodiment of the present application, a multi-thread scheduling device is provided, including: a setting module configured to sequentially store the thread number carried by each message into the processor core after each message enters the processor core. in the thread management linked list corresponding to the thread group to which the message belongs, and establish a mapping relationship between the thread number and the node of the thread management linked list; the scheduling module is used to calculate the mapping relationship according to the mapping relationship and the thread state machine corresponding to each thread. status, schedule the target thread in the executable state from the thread group according to the order in which messages enter the processor core, and input the target thread into the pipeline corresponding to the target thread.
根据本申请的又一个实施例,还提供了一种计算机可读存储介质,所述计算机可读存储 介质中存储有计算机程序,其中,所述计算机程序被设置为运行时执行上述任一项方法实施例中的步骤。According to yet another embodiment of the present application, a computer-readable storage medium is also provided, wherein the computer-readable storage medium A computer program is stored in the medium, wherein the computer program is configured to execute the steps in any of the above method embodiments when running.
根据本申请的又一个实施例,还提供了一种电子装置,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器被设置为运行所述计算机程序以执行上述任一项方法实施例中的步骤。According to yet another embodiment of the present application, an electronic device is also provided, including a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to perform any of the above. Steps in method embodiments.
通过本申请,通过引入线程管理链表优化线程调度方式,保证先进入处理器内核的报文优先得到调度执行。因此,可以解决相关技术中无法保证按照报文进入内核的顺序使先进入内核的报文优先得到转发的问题,达到了降低报文的执行延迟的效果。Through this application, the thread scheduling method is optimized by introducing a thread management linked list to ensure that messages that enter the processor core first are scheduled and executed first. Therefore, it is possible to solve the problem in the related technology that the packets that enter the kernel first are forwarded first according to the order in which the packets enter the kernel, thereby achieving the effect of reducing the execution delay of the packets.
附图说明Description of drawings
图1是运行本申请实施例的多线程调度方法的计算机终端的硬件结构框图;Figure 1 is a hardware structure block diagram of a computer terminal running the multi-thread scheduling method according to the embodiment of the present application;
图2是根据本申请实施例的多线程调度方法的流程图;Figure 2 is a flow chart of a multi-thread scheduling method according to an embodiment of the present application;
图3是根据本申请实施例的多线程调度装置的结构框图;Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present application;
图4是根据本申请另一实施例的多线程调度装置的结构框图;Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present application;
图5是根据本申请再一实施例的多线程调度装置的结构框图;Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present application;
图6是根据本申请实施例的基于粗粒度的多线程调度装置的结构示意图;Figure 6 is a schematic structural diagram of a coarse-grained multi-thread scheduling device according to an embodiment of the present application;
图7是根据本申请实施例的线程与线程管理链表对应的示意图;Figure 7 is a schematic diagram corresponding to threads and thread management linked lists according to an embodiment of the present application;
图8是根据本申请实施例的线程状态装换示意图;Figure 8 is a schematic diagram of thread state switching according to an embodiment of the present application;
图9是根据本申请实施例的执行基于粗粒度的多线程调度的流程图。Figure 9 is a flowchart for executing coarse-grained multi-thread scheduling according to an embodiment of the present application.
具体实施方式Detailed ways
下文中将参考附图并结合实施例来详细说明本申请的实施例。The embodiments of the present application will be described in detail below with reference to the accompanying drawings and in combination with the embodiments.
需要说明的是,本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。It should be noted that the terms "first", "second", etc. in the description and claims of this application and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
本申请实施例中所提供的方法实施例可以在移动终端、计算机终端或者类似的运算装置中执行。以运行在计算机终端上为例,图1是运行本申请实施例的多线程调度方法的计算机终端的硬件结构框图。如图1所示,计算机终端可以包括一个或多个(图1中仅示出一个)处理器102(处理器102可以包括但不限于微处理器(Central Processing Unit,MCU)或可编程逻辑器件(Field Programmable Gate Array,FPGA)等的处理装置)和用于存储数据的存储器104,其中,上述计算机终端还可以包括用于通信功能的传输设备106以及输入输出设备108。本领域普通技术人员可以理解,图1所示的结构仅为示意,其并不对上述计算机终端的结构造成限定。例如,计算机终端还可包括比图1中所示更多或者更少的组件,或者具有与图1所示不同的配置。The method embodiments provided in the embodiments of this application can be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking running on a computer terminal as an example, FIG. 1 is a hardware structure block diagram of a computer terminal running the multi-thread scheduling method according to the embodiment of the present application. As shown in Figure 1, the computer terminal may include one or more (only one is shown in Figure 1) processors 102 (the processor 102 may include but is not limited to a microprocessor (Central Processing Unit, MCU) or a programmable logic device (Field Programmable Gate Array, FPGA) and other processing devices) and a memory 104 for storing data, wherein the above-mentioned computer terminal may also include a transmission device 106 for communication functions and an input and output device 108. Persons of ordinary skill in the art can understand that the structure shown in Figure 1 is only illustrative, and it does not limit the structure of the above-mentioned computer terminal. For example, the computer terminal may also include more or fewer components than shown in FIG. 1 , or have a different configuration than shown in FIG. 1 .
存储器104可用于存储计算机程序,例如,应用软件的软件程序以及模块,如本申请实施例中的多线程调度方法对应的计算机程序,处理器102通过运行存储在存储器104内的计算机程序,从而执行各种功能应用以及数据处理,即实现上述的方法。存储器104可包括高速随机存储器,还可包括非易失性存储器,如一个或者多个磁性存储装置、闪存、或者其他非易失性固态存储器。在一些实例中,存储器104可进一步包括相对于处理器102远程设置的存储器,这些远程存储器可以通过网络连接至计算机终端。上述网络的实例包括但不限于 互联网、企业内部网、局域网、移动通信网及其组合。The memory 104 can be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the multi-thread scheduling method in the embodiment of the present application. The processor 102 executes the computer program by running the computer program stored in the memory 104. Various functional applications and data processing implement the above methods. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely relative to the processor 102, and these remote memories may be connected to the computer terminal through a network. Examples of the above networks include, but are not limited to Internet, intranet, local area network, mobile communication network and their combinations.
传输装置106用于经由一个网络接收或者发送数据。上述的网络具体实例可包括计算机终端的通信供应商提供的无线网络。在一个实例中,传输装置106包括一个网络适配器(Network Interface Controller,简称为NIC),其可通过基站与其他网络设备相连从而可与互联网进行通讯。在一个实例中,传输装置106可以为射频(Radio Frequency,简称为RF)模块,其用于通过无线方式与互联网进行通讯。The transmission device 106 is used to receive or send data via a network. Specific examples of the above-mentioned network may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (Radio Frequency, RF for short) module, which is used to communicate with the Internet wirelessly.
在本实施例中提供了一种运行于上述计算机终端的多线程调度方法,图2是根据本申请实施例的多线程调度方法的流程图,如图2所示,该流程包括如下步骤:This embodiment provides a multi-thread scheduling method running on the above-mentioned computer terminal. Figure 2 is a flow chart of the multi-thread scheduling method according to the embodiment of the present application. As shown in Figure 2, the process includes the following steps:
步骤S202,每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;Step S202: After each message enters the processor core, the thread number carried by each message is sequentially stored in the thread management linked list corresponding to the thread group to which the message belongs, and a node of the thread number and thread management linked list is established. the mapping relationship between;
步骤S204,根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将目标线程输入与目标线程对应的流水线。Step S204: According to the mapping relationship and the state of the thread state machine corresponding to each thread, the target thread in the executable state is scheduled from the thread group according to the order in which messages enter the processor core, and the target thread is The thread input corresponds to the pipeline of the target thread.
在本实施例的步骤S202之前,还包括:为进入所述处理器内核的每个报文分配线程号,并将所有线程分为与流水线数量对应的线程组。Before step S202 in this embodiment, the method further includes: assigning a thread number to each message entering the processor core, and dividing all threads into thread groups corresponding to the number of pipelines.
在本实施例的步骤S202中,每个线程组对应一个线程管理链表,每个线程管理链表的节点数与每个线程组包含的线程数量相同。In step S202 of this embodiment, each thread group corresponds to a thread management linked list, and the number of nodes in each thread management linked list is the same as the number of threads included in each thread group.
在本实施例的步骤S202中,所述线程管理链表的节点到线程号之间的映射关系采用比特图方式表示。In step S202 of this embodiment, the mapping relationship between the nodes of the thread management linked list and the thread numbers is represented by a bitmap.
在本实施例的步骤S204中,包括:根据所述比特图的值及所述线程组中各线程的准备状态计算得出每个节点对应的发射请求,将具有发射请求的线程进行优先级调度,使先进入所述处理器内核且处于准备就绪状态的线程获得授权,转换为可执行状态,并将所述可执行状态的线程作为所述目标线程进行调度;获取与目标线程对应的指令,并将目标线程输入与目标线程对应的流水线以执行所述指令。In step S204 of this embodiment, it includes: calculating the transmission request corresponding to each node according to the value of the bitmap and the readiness status of each thread in the thread group, and performing priority scheduling on the thread with the transmission request. , so that the thread that first enters the processor core and is in the ready state is authorized, converted to the executable state, and the thread in the executable state is scheduled as the target thread; obtains the instruction corresponding to the target thread, And input the target thread into the pipeline corresponding to the target thread to execute the instruction.
在本实施例中,将该线程发射进入对应的流水线以执行所述指令之后,还包括:将执行完指令的报文调度出所述处理器内核,并释放与该报文对应的线程;在所述线程管理链表的节点中清除该报文的线程号,并将存储在所述线程管理链表的节点中的其他线程号依次向前移动一个节点。In this embodiment, after launching the thread into the corresponding pipeline to execute the instruction, the method further includes: scheduling the message after the instruction has been executed out of the processor core, and releasing the thread corresponding to the message; The thread number of the message is cleared from the node of the thread management linked list, and other thread numbers stored in the node of the thread management linked list are moved forward by one node in sequence.
在本实施例中,每条流水线对应一个主控状态机,每个线程对应一个线程状态机,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换。In this embodiment, each pipeline corresponds to a main control state machine, each thread corresponds to a thread state machine, each pipeline is in two states: idle and authorized, and each thread is in four states: idle, ready, executable and waiting. Transition between states.
在本实施例中,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换,包括:当所述主控状态机处于空闲状态,表示允许新的报文进入处理器内核,新的报文进入处理器内核后,对应线程处于空闲状态;在报文的线程号存入线程管理链表的节点,并从指令存储模块取回与该线程对应的指令后,该线程从空闲状态转入准备就绪状态;在主控状态机的授权状态下对处于准备就绪状态的线程进行授权,获得授权的线程由准备就绪状态转入可执行状态;处于可执行状态的线程执行完对应的指令后由可执行状态转入空闲状态;处于可执行状态的线程在指令执行过程中存在数据等待、查表等待或重新取指令的情况时,该线程由可执行状态转入等待状态;处于等待状态的线程数据等待 结束、或者查表结果返回、或者重新取指返回后,由等待状态重新转入准备就绪状态;在有执行完指令的线程的线程号被释放后,主控状态机进入空闲状态。In this embodiment, each pipeline transitions between two states: idle and authorized, and each thread transitions between four states: idle, ready, executable, and waiting, including: when the main control state machine is in an idle state, it means Allow new messages to enter the processor core. After the new message enters the processor core, the corresponding thread is in an idle state; the thread number of the message is stored in the node of the thread management linked list, and the thread number associated with the thread is retrieved from the instruction storage module. After the corresponding instruction, the thread transfers from the idle state to the ready state; in the authorization state of the main control state machine, the thread in the ready state is authorized, and the authorized thread transfers from the ready state to the executable state; in After the thread in the executable state executes the corresponding instruction, it transitions from the executable state to the idle state; when the thread in the executable state is waiting for data, table lookup, or re-fetching instructions during the execution of the instruction, the thread is changed from the executable state to the idle state. The execution state transfers to the waiting state; the thread in the waiting state waits for data After the end, or the table lookup result is returned, or the instruction is retrieved and returned, the waiting state is transferred back to the ready state; after the thread number of the thread that has completed the instruction is released, the main control state machine enters the idle state.
通过上述步骤,通过引入线程管理链表优化线程调度方式,保证先进入处理器内核的报文优先得到调度执行。因此,可以解决相关技术中无法保证按照报文进入微引擎的顺序使先进入微引擎的报文优先得到转发的问题,达到了降低报文的执行延迟的效果。Through the above steps, the thread scheduling method is optimized by introducing a thread management linked list to ensure that the packets that enter the processor core first are scheduled and executed first. Therefore, it is possible to solve the problem in the related technology that the packets entering the micro-engine first are forwarded first according to the order in which the packets enter the micro-engine, thereby achieving the effect of reducing the execution delay of the packets.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到根据上述实施例的方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如只读存储器/随机存取存储器(Read-Only Memory/Random Access Memory,ROM/RAM)、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by means of software plus the necessary general hardware platform. Of course, it can also be implemented by hardware, but in many cases the former is Better implementation. Based on this understanding, the technical solution of the present application can be embodied in the form of a software product in essence or that contributes to the existing technology. The computer software product is stored in a storage medium (such as read-only memory/random access memory). The memory (Read-Only Memory/Random Access Memory, ROM/RAM), magnetic disk, optical disk) includes several instructions to cause a terminal device (which can be a mobile phone, computer, server, or network device, etc.) to execute this application Methods described in various embodiments.
在本实施例中还提供了一种多线程调度装置,该装置用于实现上述实施例及可选实施方式,已经进行过说明的不再赘述。如以下所使用的,术语“模块”可以实现预定功能的软件和/或硬件的组合。尽管以下实施例所描述的装置较佳地以软件来实现,但是硬件,或者软件和硬件的组合的实现也是可能并被构想的。This embodiment also provides a multi-thread scheduling device, which is used to implement the above embodiments and optional implementations. What has been described will not be described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
图3是根据本申请实施例的多线程调度装置的结构框图,如图3所示,该装置包括:建立模块10和调度模块20。Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present application. As shown in Figure 3, the device includes: a creation module 10 and a scheduling module 20.
建立模块10,用于在每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;The establishment module 10 is used to store the thread number carried by each message into the thread management linked list corresponding to the thread group to which the message belongs after each message enters the processor core, and establish the thread number and thread number. Manage the mapping relationship between nodes in the linked list;
调度模块20,用于根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将目标线程输入与目标线程对应的流水线。The scheduling module 20 is configured to schedule the target thread in the executable state from the thread group according to the order in which messages enter the processor core according to the mapping relationship and the state of the thread state machine corresponding to each thread. And input the target thread into the pipeline corresponding to the target thread.
图4是根据本申请另一实施例的多线程调度装置的结构框图,如图4所示,该装置除包括图3所示的所有模块外,还包括:Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present application. As shown in Figure 4, in addition to all the modules shown in Figure 3, the device also includes:
分配模块30,用于为进入所述处理器内核的每个报文分配线程号,并将所有线程分为与流水线数量对应的线程组。The allocation module 30 is configured to allocate a thread number to each message entering the processor core, and divide all threads into thread groups corresponding to the number of pipelines.
在一个示例性实施例中,其中,每个线程组对应一个线程管理链表,每个线程管理链表的节点数与每个线程组包含的线程数量相同。In an exemplary embodiment, each thread group corresponds to a thread management linked list, and the number of nodes in each thread management linked list is the same as the number of threads included in each thread group.
图5是根据本申请再一实施例的多线程调度装置的结构框图,如图5所示,该装置除包括图4所示的所有模块外,还包括:Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present application. As shown in Figure 5, in addition to all the modules shown in Figure 4, the device also includes:
释放模块40,将执行完与目标线程对应的指令的报文调度出所述处理器内核,并释放与该报文对应的线程,在所述线程管理链表的节点中清除该报文的线程号,并将存储在所述线程管理链表的节点中的其他线程号依次向前移动一个节点。The release module 40 schedules the message that has completed the execution of the instruction corresponding to the target thread out of the processor core, releases the thread corresponding to the message, and clears the thread number of the message in the node of the thread management linked list. , and move other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.
在一个示例性实施例中,其中,每条流水线对应一个主控状态机,每个线程对应一个线程状态机,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换。In an exemplary embodiment, each pipeline corresponds to a main control state machine, each thread corresponds to a thread state machine, each pipeline is in two states: idle and authorized, and each thread is in idle, ready, and available states. Transition between execution and waiting 4 states.
需要说明的是,上述各个模块是可以通过软件或硬件来实现的,对于后者,可以通过以 下方式实现,但不限于此:上述模块均位于同一处理器中;或者,上述各个模块以任意组合的形式分别位于不同的处理器中。It should be noted that each of the above modules can be implemented through software or hardware. For the latter, it can be implemented through It can be implemented in the following manner, but is not limited to this: the above-mentioned modules are all located in the same processor; or, the above-mentioned modules are located in different processors in any combination.
本申请的实施例还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序,其中,该计算机程序被设置为运行时执行上述任一项方法实施例中的步骤。Embodiments of the present application also provide a computer-readable storage medium that stores a computer program, wherein the computer program is configured to execute the steps in any of the above method embodiments when running.
在一个示例性实施例中,上述计算机可读存储介质可以包括但不限于:U盘、只读存储器(Read-Only Memory,简称为ROM)、随机存取存储器(Random Access Memory,简称为RAM)、移动硬盘、磁碟或者光盘等各种可以存储计算机程序的介质。In an exemplary embodiment, the computer-readable storage medium may include but is not limited to: U disk, read-only memory (Read-Only Memory, referred to as ROM), random access memory (Random Access Memory, referred to as RAM) , mobile hard disk, magnetic disk or optical disk and other media that can store computer programs.
本申请的实施例还提供了一种电子装置,包括存储器和处理器,该存储器中存储有计算机程序,该处理器被设置为运行计算机程序以执行上述任一项方法实施例中的步骤。An embodiment of the present application also provides an electronic device, including a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
在一个示例性实施例中,上述电子装置还可以包括传输设备以及输入输出设备,其中,该传输设备和上述处理器连接,该输入输出设备和上述处理器连接。In an exemplary embodiment, the above-mentioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the above-mentioned processor, and the input-output device is connected to the above-mentioned processor.
本实施例中的具体示例可以参考上述实施例及示例性实施方式中所描述的示例,本实施例在此不再赘述。For specific examples in this embodiment, reference may be made to the examples described in the above-mentioned embodiments and exemplary implementations, and details will not be described again in this embodiment.
为了便于对本申请所提供的技术方案的理解,下面将结合具体场景的实施例进行详细描述。In order to facilitate understanding of the technical solutions provided in this application, a detailed description will be given below with reference to embodiments of specific scenarios.
在相关技术中,微引擎接收到新报文时,首先为新报文分配一个线程号,分配线程号时不区分报文的优先级;传统的最近最少使用(Least Recently Used,LRU)调度算法能够使最频繁使用的线程获得最高优先级,但线程数量较多时,无法保证先进入内核的报文优先获得执行并转发。In related technologies, when a micro-engine receives a new message, it first allocates a thread number to the new message. When allocating the thread number, the priority of the message is not distinguished; the traditional least recently used (Least Recently Used, LRU) scheduling algorithm It can make the most frequently used threads get the highest priority, but when the number of threads is large, there is no guarantee that the packets that enter the kernel first will be executed and forwarded first.
针对上述先进入内核的报文不能优先执行完成的问题,在本申请实施例中提供了一种基于粗粒度的多线程调度装置,图6是根据本申请实施例的基于粗粒度的多线程调度装置的结构示意图,如图6所示,该多线程调度系统包括了:线程调度模块11、指令存储模块12和完成调度模块13。In order to solve the problem that messages that enter the kernel first cannot be executed first, a coarse-grained multi-thread scheduling device is provided in an embodiment of the present application. Figure 6 is a multi-thread scheduling device based on coarse-grainedness according to an embodiment of the present application. A schematic structural diagram of the device is shown in Figure 6. The multi-thread scheduling system includes: a thread scheduling module 11, an instruction storage module 12 and a completion scheduling module 13.
线程调度模块11,用于为新报文分配线程号,将所有线程分为与流水线条数对应的线程组,每个线程组按照报文进入顺序调度出准备就绪的可执行线程,从指令存储模块中获取与线程对应的指令,发射进入与线程组对应的流水线进行执行,执行完成后通知完成调度模块13;在本实施中,线程调度模块11在功能上包含了上述是实施例中的建立模块10、调度模块20以及分配模块30的功能。Thread scheduling module 11 is used to allocate thread numbers to new messages, divide all threads into thread groups corresponding to the number of pipeline lines, and each thread group schedules ready executable threads according to the order in which the messages enter, from the instruction storage The instruction corresponding to the thread is obtained in the module, and is launched into the pipeline corresponding to the thread group for execution. After the execution is completed, the completion scheduling module 13 is notified; in this implementation, the thread scheduling module 11 functionally includes the above-mentioned establishment in the embodiment. Functions of module 10, scheduling module 20 and allocation module 30.
指令存储模块12,用于存储线程执行所用的指令,包含指令二级缓存、指令一级缓存;The instruction storage module 12 is used to store instructions used for thread execution, including an instruction level 2 cache and an instruction level 1 cache;
完成调度模块13,用于接收线程调度模块发送的报文执行完成信号,将相应报文调度出内核并释放线程号信息;在本实施中,完成调度模块13在功能上相当于上述是实施例中的释放模块40的功能。The completion scheduling module 13 is used to receive the message execution completion signal sent by the thread scheduling module, schedule the corresponding message out of the kernel and release the thread number information; in this implementation, the completion scheduling module 13 is functionally equivalent to the above embodiment. The function of release module 40 in .
具体地,在线程调度模块11中,引入线程管理链表,用于管理与报文进包顺序对应的线程号信息,从各线程组中调度出先进报文且准备就绪的可执行线程发射进入对应的流水线执行;其中,每个线程组对应一个线程管理链表,链表节点数量与每个线程组包含的线程数量相同;图7是根据本申请实施例的线程与线程管理链表对应的示意图,如图7所示,20个线程分为2个线程组,一个线程组包含10个线程,对应的线程管理链表设有10个节点node0-node9,上述节点用于存储进包报文被分配的线程号信息;报文进入内核后,将其携带的线程号信息从左到右依次存入节点node0-node9中,线程管理链表节点到线程号之间存在 一层映射关系,所述线程管理链表节点到线程号之间的映射关系可以通过位图(bitmap)方式进行维护,每个线程管理链表有10个与节点对应的bitmap值;依据bitmap值及线程组中各线程准备状态(rdy)计算得出每个节点对应的发射请求,参与优先级调度(SP),先进报文且准备就绪的可执行线程获得授权,被发射进与线程组对应的流水线。Specifically, in the thread scheduling module 11, a thread management linked list is introduced to manage the thread number information corresponding to the packet entry sequence, and schedule advanced messages and ready executable threads from each thread group to launch into the corresponding Pipeline execution; wherein, each thread group corresponds to a thread management linked list, and the number of linked list nodes is the same as the number of threads contained in each thread group; Figure 7 is a schematic diagram corresponding to threads and thread management linked lists according to an embodiment of the present application, as shown in Figure As shown in Figure 7, 20 threads are divided into 2 thread groups. One thread group contains 10 threads. The corresponding thread management linked list has 10 nodes node0-node9. The above nodes are used to store the thread numbers assigned to incoming packets. Information; after the message enters the kernel, the thread number information it carries is stored in the nodes node0-node9 from left to right. There is an existence between the thread management list node and the thread number. A layer of mapping relationship. The mapping relationship between thread management linked list nodes and thread numbers can be maintained through bitmap. Each thread management linked list has 10 bitmap values corresponding to the nodes; according to the bitmap value and thread The readiness status (rdy) of each thread in the group is calculated to determine the launch request corresponding to each node, and participates in priority scheduling (SP). Executable threads that advance messages and are ready are authorized and launched into the pipeline corresponding to the thread group. .
执行完各自指令的报文被调度出内核,相应线程被释放,在线程管理链表中检索出对应的线程号信息,清除匹配节点的线程号信息,同时将该匹配节点右侧所有节点中存储的线程号信息左移一个节点进行存储。After executing the respective instructions, the messages are scheduled out of the kernel, the corresponding threads are released, the corresponding thread number information is retrieved in the thread management linked list, the thread number information of the matching node is cleared, and at the same time, the thread number information stored in all nodes to the right of the matching node is The thread number information is shifted one node to the left for storage.
在本实施例中,上述2个线程组分别与2条流水线对应;每个线程组中线程的个数可以为10或其它任意个数;所述流水线可以分为五级流水或其它级数的流水(例如,七级等)。In this embodiment, the above two thread groups correspond to two pipelines respectively; the number of threads in each thread group can be 10 or any other number; the pipeline can be divided into five levels of pipelines or other levels. Running water (e.g., seventh level, etc.).
在本实施例中,线程调度模块11还可以控制每个线程的状态转换,每条流水线对应一个主控状态机,每个线程对应一个线程状态机。如图8所示,具体转换包括如下步骤:In this embodiment, the thread scheduling module 11 can also control the state transition of each thread. Each pipeline corresponds to a main control state machine, and each thread corresponds to a thread state machine. As shown in Figure 8, the specific conversion includes the following steps:
第一,当主控状态机处于闲置(IDLE)状态时,表示允许新包进入,新包进入内核首先处于IDLE状态;First, when the main control state machine is in the idle (IDLE) state, it means that new packets are allowed to enter. When new packets enter the kernel, they are first in the IDLE state;
第二,向指令存储模块发送取指指令,依据分配的线程号,维护图7中的对应线程的线程管理链表,将线程号信息存入相应节点,待指令从指令存储模块返回后,对应线程从IDLE状态转入rdy状态;Second, send an instruction fetch instruction to the instruction storage module, maintain the thread management linked list of the corresponding thread in Figure 7 according to the assigned thread number, and store the thread number information in the corresponding node. After the instruction returns from the instruction storage module, the corresponding thread Transition from IDLE state to rdy state;
第三,同一线程组中若干处于rdy状态的线程,结合线程管理链表各节点映射得到的进包线程顺序,进行SP调度(GRANT),使得最先进包的线程获得授权;Third, several threads in the rdy state in the same thread group perform SP scheduling (GRANT) based on the order of incoming packet threads mapped by each node of the thread management chain, so that the thread with the most advanced packet is authorized;
获得授权的线程由rdy状态转入运行(exe)状态,每一线程组同一时刻只有一个线程可以获得授权,2个线程组可从各自组内调度出最先进包的可执行线程发射进对应的流水线(流水线0或者流水线1)执行;The authorized thread is transferred from the rdy state to the running (exe) state. Only one thread in each thread group can be authorized at the same time. The two thread groups can schedule the executable thread of the most advanced package from their respective groups and launch it into the corresponding Pipeline (pipeline 0 or pipeline 1) execution;
第四,处于exe状态的线程执行完包发送指令,相应线程由exe状态转入idle状态,报文调度出内核,检索并删除线程管理链表匹配节点的线程号信息,释放相应的线程号;Fourth, after the thread in the exe state executes the packet sending instruction, the corresponding thread transfers from the exe state to the idle state, the message is scheduled out of the kernel, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released;
第五,处于exe状态的线程在指令执行过程中发现存在数据相关性的指令、或查表返回数据相关性的指令,或者需要重新取指等需要长时间等待的情况时,相应线程由exe状态转入wait状态;Fifth, when a thread in the exe state finds an instruction with data dependency during the execution of the instruction, or an instruction that returns data dependency from a table lookup, or needs to re-fetch instructions and other situations that require a long wait, the corresponding thread will be changed from the exe state. Transfer to wait state;
第六,由GRANT授权其余最先进包处于rdy状态的线程进入exe状态,待先前转入wait状态的线程数据等待周期完成、或者查表数据已返回、或者重新取指返回后,由wait状态重新转入rdy状态;由于线程管理链表保存有其进包顺序信息,待当前处于exe状态的线程转入其他状态时,由wait状态转入rdy状态的线程仍可获得优先调度,直至其执行完包发送指令,线程由exe状态转入idle状态;Sixth, GRANT authorizes the remaining threads with the most advanced package in the rdy state to enter the exe state. After the data waiting period of the thread that previously transferred to the wait state is completed, or the table lookup data has been returned, or the index is retrieved and returned, the wait state will be restarted. Transfer to rdy state; since the thread management linked list saves its packet entry sequence information, when the thread currently in exe state transfers to other states, the thread transferred from wait state to rdy state can still receive priority scheduling until it completes the execution of the package Send an instruction and the thread changes from exe state to idle state;
第七,释放相应的线程号后,主控状态机进入IDLE状态。Seventh, after releasing the corresponding thread number, the main control state machine enters the IDLE state.
图9是根据本申请实施例的执行基于粗粒度的多线程调度的流程图,如图9所示,当有新包进入时,首先判断是线程组是否处于IDLE状态,如果否,则新包需等待至有空闲线程可供分配;如果是,则从空闲线程中选取一个线程i分配给新进报文;接着,向指令存储模块发送取指指令,待取指返回后,线程i由IDLE状态转入rdy状态;同一线程组中若干处于rdy状态的线程进行SP调度(GRANT),使得最先进包的线程获得授权GRANTi;获得授权后线程i转入exe状态;处于exe状态的线程i在指令执行过程中发现存在数据相关性的指令、或查表返回数据相关性的指令,或者需要重新取指等需要长时间等待的情况时,线程i转入 wait状态;待数据等待周期完成、或者查表数据已返回、或者重新取指返回后,线程i由wait状态重新转入rdy状态,由于采用基于进包顺序的SP调度,待当前处于exe状态的线程转入其他状态时,线程i可获得优先调度,直至其执行完包发送指令,转入idle状态,并释放线程。Figure 9 is a flow chart for executing coarse-grained multi-thread scheduling according to an embodiment of the present application. As shown in Figure 9, when a new packet enters, it is first determined whether the thread group is in the IDLE state. If not, the new packet It is necessary to wait until there is an idle thread available for allocation; if so, select a thread i from the idle thread and assign it to the incoming message; then, send an instruction fetch instruction to the instruction storage module. After the instruction fetch returns, thread i is transferred from IDLE The state transfers to rdy state; several threads in rdy state in the same thread group perform SP scheduling (GRANT), so that the thread of the most advanced package obtains authorization GRANTi; after obtaining authorization, thread i transfers to exe state; thread i in exe state is in When an instruction with data dependency is found during instruction execution, or an instruction with table lookup returns data dependency, or when instructions need to be re-fetched and require a long wait, thread i transfers to wait state; after the data waiting cycle is completed, or the table lookup data has been returned, or after re-fetching and returning, thread i will re-enter the rdy state from the wait state. Due to the use of SP scheduling based on the packet entry sequence, waiting for the thread currently in the exe state When the thread transfers to other states, thread i can receive priority scheduling until it completes the execution of the package sending instructions, transfers to the idle state, and releases the thread.
在本实施例中,主控状态机处于IDLE状态,表示允许新包进入,新包进入内核首先处于idle状态,向指令存储模块发送取指指令,依据分配的线程号,维护图7中的对应线程的线程管理链表,将线程号信息存入相应节点,待指令从指令存储模块返回后,对应线程从idle状态转入rdy状态,同一线程组中若干处于rdy状态的线程,结合线程管理链表各节点映射得到的进包线程顺序,进行SP调度(GRANT),使得最先进包的线程获得授权,获得授权的线程由rdy状态转入exe状态,每一线程组同一时刻只有一个线程可以获得授权,2个线程组可从各自组内调度出最先进包的可执行线程发射进对应的流水线(流水线0或者流水线1)执行,处于exe状态的线程执行完包发送指令,相应线程由exe状态转入idle状态,报文调度出内核,检索并删除线程管理链表匹配节点的线程号信息,释放相应的线程号,处于exe状态的线程在指令执行过程中发现存在数据相关性的指令、或查表返回数据相关性的指令,或者需要重新取指等需要长时间等待的情况时,相应线程由exe状态转入wait状态,由GRANT授权其余最先进包处于rdy状态的线程进入exe状态,待先前转入wait状态的线程数据等待周期完成、或者查表数据已返回、或者重新取指返回后,由wait状态重新转入rdy状态,由于线程管理链表保存有其进包顺序信息,待当前处于exe状态的线程转入其他状态时,由wait状态转入rdy状态的线程仍可获得优先调度,直至其执行完包发送指令,线程由exe状态转入idle状态,主控状态机进入IDLE状态。In this embodiment, the main control state machine is in the IDLE state, indicating that new packets are allowed to enter. When a new packet enters the kernel, it is first in the idle state, sends an instruction fetch instruction to the instruction storage module, and maintains the correspondence in Figure 7 according to the assigned thread number. The thread management linked list of the thread stores the thread number information in the corresponding node. After the instruction returns from the instruction storage module, the corresponding thread transfers from the idle state to the rdy state. Several threads in the rdy state in the same thread group combine each thread in the thread management linked list. The order of incoming packet threads obtained by node mapping is performed by SP scheduling (GRANT), so that the thread of the most advanced package is authorized. The authorized thread is transferred from the rdy state to the exe state. Only one thread in each thread group can be authorized at the same time. The two thread groups can schedule the executable thread of the most advanced package from their respective groups and launch it into the corresponding pipeline (pipeline 0 or pipeline 1) for execution. The thread in the exe state executes the package and sends the instruction, and the corresponding thread is transferred from the exe state. In the idle state, the message is scheduled out of the kernel, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released. The thread in the exe state finds instructions with data correlation during the instruction execution, or returns from the table lookup. When there is a data-related instruction, or when instructions need to be re-fetched and require a long wait, the corresponding thread will be transferred from the exe state to the wait state, and GRANT will authorize the remaining threads with the most advanced package in the rdy state to enter the exe state until the previous transfer. The thread data waiting period in the wait state is completed, or the table lookup data has been returned, or after re-fetching and returning, the wait state is transferred back to the rdy state. Since the thread management linked list saves its packet entry sequence information, when the thread currently in the exe state When the thread transfers to other states, the thread that transfers from the wait state to the rdy state can still receive priority scheduling until it completes the execution of the package sending instructions, the thread transfers from the exe state to the idle state, and the main control state machine enters the IDLE state.
通过本申请的上述实施例,基于粗粒度的多线程调度方法保证先进入内核的报文优先得到调度,并且只有在发生成本较高的停顿时(比如重新取指、查表等待等)才切换其他线程执行,大大降低减缓任一报文执行速度的可能性,以及降低了任一报文的执行延迟。Through the above embodiments of the present application, the coarse-grained multi-thread scheduling method ensures that the packets that enter the kernel first are scheduled first, and are only switched when a costly pause occurs (such as re-fetching, table lookup, etc.) Other threads execute, greatly reducing the possibility of slowing down the execution speed of any message, and reducing the execution delay of any message.
显然,本领域的技术人员应该明白,上述的本申请的各模块或各步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上,它们可以用计算装置可执行的程序代码来实现,从而,可以将它们存储在存储装置中由计算装置来执行,并且在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤,或者将它们分别制作成各个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。这样,本申请不限制于任何特定的硬件和软件结合。Obviously, those skilled in the art should understand that the above-mentioned modules or steps of the present application can be implemented using general-purpose computing devices, and they can be concentrated on a single computing device, or distributed across a network composed of multiple computing devices. They may be implemented in program code executable by a computing device, such that they may be stored in a storage device for execution by the computing device, and in some cases may be executed in a sequence different from that shown herein. Or the described steps can be implemented by making them into individual integrated circuit modules respectively, or by making multiple modules or steps among them into a single integrated circuit module. As such, the application is not limited to any specific combination of hardware and software.
以上所述仅为本申请的可选实施例而已,并不用于限制本申请,对于本领域的技术人员来说,本申请可以有各种更改和变化。凡在本申请的原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。 The above are only optional embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and changes. Any modifications, equivalent replacements, improvements, etc. made within the principles of this application shall be included in the protection scope of this application.

Claims (15)

  1. 一种多线程调度方法,包括:A multi-thread scheduling method including:
    每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;After each message enters the processor core, the thread number carried by each message is stored in the thread management linked list corresponding to the thread group to which the message belongs, and the relationship between the thread number and the node of the thread management linked list is established. Mapping relations;
    根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将所述目标线程输入与所述目标线程对应的流水线。According to the mapping relationship and the state of the thread state machine corresponding to each thread, the target thread in the executable state is scheduled from the thread group in the order in which messages enter the processor core, and the target thread is Enter the pipeline corresponding to the target thread.
  2. 根据权利要求1所述的方法,其中,在将所述每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中之前,还包括:The method according to claim 1, wherein before sequentially storing the thread number carried by each message into the thread management linked list corresponding to the thread group to which the message belongs, it further includes:
    为进入所述处理器内核的每个报文分配线程号,并将所有线程分为与流水线数量对应的线程组。Each message entering the processor core is assigned a thread number, and all threads are divided into thread groups corresponding to the number of pipelines.
  3. 根据权利要求1所述的方法,其中,每个线程组对应一个线程管理链表,每个线程管理链表的节点数与每个线程组包含的线程数量相同。The method according to claim 1, wherein each thread group corresponds to a thread management linked list, and the number of nodes in each thread management linked list is the same as the number of threads contained in each thread group.
  4. 根据权利要求1所述的方法,其中,所述线程管理链表的节点到线程号之间的映射关系采用比特图方式表示。The method according to claim 1, wherein the mapping relationship between the nodes of the thread management linked list and the thread number is represented by a bitmap.
  5. 根据权利要求4所述的方法,其中,所述根据所述映射关系以及每个线程所处的状态,按照报文进入所述处理器内核的先后顺序从所述线程组中调度出处于可执行状态的目标线程,并将所述目标线程输入与所述目标线程对应的流水线,包括:The method according to claim 4, wherein, according to the mapping relationship and the state of each thread, the executable messages are scheduled from the thread group in the order in which messages enter the processor core. status of the target thread, and input the target thread into the pipeline corresponding to the target thread, including:
    根据所述比特图的值及所述线程组中各线程的准备状态计算得出每个节点对应的发射请求;Calculate the transmission request corresponding to each node based on the value of the bitmap and the readiness status of each thread in the thread group;
    将具有发射请求的线程进行优先级调度,使先进入所述处理器内核且处于准备就绪状态的线程获得授权,转换为可执行状态,并将所述可执行状态的线程作为所述目标线程进行调度;Priority scheduling is performed on threads with emission requests, so that the thread that enters the processor core first and is in a ready state is authorized, converted to an executable state, and the thread in the executable state is used as the target thread. Scheduling;
    获取与所述目标线程对应的指令,并将所述目标线程输入与所述目标线程对应的流水线,以执行所述指令。Obtain the instruction corresponding to the target thread, and input the target thread into the pipeline corresponding to the target thread to execute the instruction.
  6. 根据权利要求5所述的方法,其中,将所述目标线程输入与所述目标线程对应的流水线以执行所述指令之后,还包括:The method according to claim 5, wherein after inputting the target thread into the pipeline corresponding to the target thread to execute the instruction, it further includes:
    将执行完所述指令的报文调度出所述处理器内核,并释放与该报文对应的线程;Schedule the message after executing the instruction out of the processor core and release the thread corresponding to the message;
    在所述线程管理链表的节点中清除该报文的线程号,并将存储在所述线程管理链表的节点中的其他线程号依次向前移动一个节点。The thread number of the message is cleared in the node of the thread management linked list, and other thread numbers stored in the node of the thread management linked list are moved forward by one node in sequence.
  7. 根据权利要求1所述的方法,其中,The method of claim 1, wherein,
    每条流水线对应一个主控状态机,每个线程对应一个线程状态机,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换。Each pipeline corresponds to a main control state machine, and each thread corresponds to a thread state machine. Each pipeline transitions between idle and authorized states, and each thread transitions between idle, ready, executable and waiting states.
  8. 根据权利要求7所述的方法,其中,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换,包括: The method according to claim 7, wherein each pipeline is in two states of idle and authorized and each thread is in four states of idle, ready, executable and waiting, including:
    当所述主控状态机处于空闲状态,表示允许新的报文进入处理器内核,新的报文进入处理器内核后,对应线程处于空闲状态;When the main control state machine is in the idle state, it means that new messages are allowed to enter the processor core. After the new messages enter the processor core, the corresponding thread is in the idle state;
    在报文的线程号存入线程管理链表的节点,并从指令存储模块取回与该线程对应的指令后,该线程从空闲状态转入准备就绪状态;After the thread number of the message is stored in the node of the thread management linked list and the instruction corresponding to the thread is retrieved from the instruction storage module, the thread transfers from the idle state to the ready state;
    在主控状态机的授权状态下对处于准备就绪状态的线程进行授权,获得授权的线程由准备就绪状态转入可执行状态;Authorize the thread in the ready state in the authorization state of the main control state machine, and the authorized thread transfers from the ready state to the executable state;
    处于可执行状态的线程执行完对应的指令后由可执行状态转入空闲状态;The thread in the executable state transitions from the executable state to the idle state after executing the corresponding instructions;
    处于可执行状态的线程在指令执行过程中存在数据等待、查表等待或重新取指令的情况时,该线程由可执行状态转入等待状态;When a thread in the executable state is waiting for data, waiting for table lookup, or re-fetching instructions during the execution of instructions, the thread will be transferred from the executable state to the waiting state;
    处于等待状态的线程数据等待结束、或者查表结果返回、或者重新取指返回后,由等待状态重新转入准备就绪状态;The thread in the waiting state transfers from the waiting state to the ready state again after the data wait ends, or the table lookup result returns, or the instruction is retrieved and returned;
    在有执行完指令的线程的线程号被释放后,主控状态机进入空闲状态。After the thread number of the thread that has completed the execution of the instruction is released, the main control state machine enters the idle state.
  9. 一种多线程调度装置,包括:A multi-thread scheduling device, including:
    建立模块,设置为在每个报文进入处理器内核后,将每个报文携带的线程号依次存入与该报文所属的线程组对应的线程管理链表中,并建立线程号与线程管理链表的节点之间的映射关系;Establish a module and set it to store the thread number carried by each message into the thread management linked list corresponding to the thread group to which the message belongs after each message enters the processor core, and establish the thread number and thread management The mapping relationship between the nodes of the linked list;
    调度模块,设置为根据所述映射关系以及每个线程所对应的线程状态机的状态,按照报文进入所述处理器内核的先后顺序从线程组中调度出处于可执行状态的目标线程,并将所述目标线程输入与所述目标线程对应的流水线。The scheduling module is configured to schedule the target thread in the executable state from the thread group according to the order in which messages enter the processor core according to the mapping relationship and the state of the thread state machine corresponding to each thread, and The target thread is input into the pipeline corresponding to the target thread.
  10. 根据权利要求9所述的装置,还包括:The device of claim 9, further comprising:
    分配模块,设置为为进入所述处理器内核的每个报文分配线程号,并将所有线程分为与流水线数量对应的线程组。An allocation module is configured to allocate a thread number to each message entering the processor core, and divide all threads into thread groups corresponding to the number of pipelines.
  11. 根据权利要求10所述的装置,其中,每个线程组对应一个线程管理链表,每个线程管理链表的节点数与每个线程组包含的线程数量相同。The device according to claim 10, wherein each thread group corresponds to a thread management linked list, and the number of nodes in each thread management linked list is the same as the number of threads included in each thread group.
  12. 根据权利要求10所述的装置,还包括:The device of claim 10, further comprising:
    释放模块,设置为将执行完与所述目标线程对应的指令的报文调度出所述处理器内核,并释放与该报文对应的线程,在所述线程管理链表的节点中清除该报文的线程号,并将存储在所述线程管理链表的节点中的其他线程号依次向前移动一个节点。A release module configured to schedule the message that has completed the execution of the instruction corresponding to the target thread out of the processor core, release the thread corresponding to the message, and clear the message in the node of the thread management linked list. thread number, and move other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.
  13. 根据权利要求10所述的装置,其中,每条流水线对应一个主控状态机,每个线程对应一个线程状态机,每条流水线在空闲和授权2个状态以及每个线程在空闲、准备就绪、可执行和等待4个状态间转换。The device according to claim 10, wherein each pipeline corresponds to a main control state machine, each thread corresponds to a thread state machine, each pipeline is in two states: idle and authorized, and each thread is in idle, ready, and Transition between executable and waiting 4 states.
  14. 一种计算机可读存储介质,其中,所述计算机可读存储介质中存储有计算机程序,其中,所述计算机程序被处理器执行时实现所述权利要求1至8任一项中所述的方法的步骤。A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, wherein when the computer program is executed by a processor, the method described in any one of claims 1 to 8 is implemented. A step of.
  15. 一种电子装置,包括存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现所述权利要求1至8任一 项中所述的方法的步骤。 An electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements claims 1 to 8 when executing the computer program. Either The steps of the method described in Item .
PCT/CN2023/087477 2022-06-27 2023-04-11 Multi-thread scheduling method and device WO2024001411A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210738293.8 2022-06-27
CN202210738293.8A CN117331655A (en) 2022-06-27 2022-06-27 Multithreading scheduling method and device

Publications (1)

Publication Number Publication Date
WO2024001411A1 true WO2024001411A1 (en) 2024-01-04

Family

ID=89294062

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087477 WO2024001411A1 (en) 2022-06-27 2023-04-11 Multi-thread scheduling method and device

Country Status (2)

Country Link
CN (1) CN117331655A (en)
WO (1) WO2024001411A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118069071B (en) * 2024-04-19 2024-08-13 苏州元脑智能科技有限公司 Resource access control method, device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1582428A (en) * 2001-11-07 2005-02-16 国际商业机器公司 Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system
CN104901901A (en) * 2014-03-07 2015-09-09 深圳市中兴微电子技术有限公司 Micro-engine and method for processing message therewith
US20150347192A1 (en) * 2014-05-29 2015-12-03 Apple Inc. Method and system for scheduling threads for execution
CN109257280A (en) * 2017-07-14 2019-01-22 深圳市中兴微电子技术有限公司 A kind of micro engine and its method for handling message

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1582428A (en) * 2001-11-07 2005-02-16 国际商业机器公司 Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system
CN104901901A (en) * 2014-03-07 2015-09-09 深圳市中兴微电子技术有限公司 Micro-engine and method for processing message therewith
US20150347192A1 (en) * 2014-05-29 2015-12-03 Apple Inc. Method and system for scheduling threads for execution
CN109257280A (en) * 2017-07-14 2019-01-22 深圳市中兴微电子技术有限公司 A kind of micro engine and its method for handling message

Also Published As

Publication number Publication date
CN117331655A (en) 2024-01-02

Similar Documents

Publication Publication Date Title
US11620255B2 (en) Time sensitive networking device
CN105511954B (en) Message processing method and device
WO2017133623A1 (en) Data stream processing method, apparatus, and system
CN109697122B (en) Task processing method, device and computer storage medium
US20100088703A1 (en) Multi-core system with central transaction control
US9632977B2 (en) System and method for ordering packet transfers in a data processor
US8848532B2 (en) Method and system for processing data
CN105516086B (en) Method for processing business and device
US8464269B2 (en) Handling and reporting of object state transitions on a multiprocess architecture
US9141436B2 (en) Apparatus and method for partition scheduling for a processor with cores
WO2024001411A1 (en) Multi-thread scheduling method and device
CN111404931B (en) Remote data transmission method based on persistent memory
WO2017185285A1 (en) Method and device for assigning graphics processing unit task
CN111416858B (en) Media resource processing platform, method, device and server
CN106411778A (en) Data forwarding method and device
CN111163140A (en) Method, apparatus and computer readable storage medium for resource acquisition and allocation
WO2024156239A1 (en) Video streaming transmission method and apparatus, electronic device, and storage medium
CN110445580A (en) Data transmission method for uplink and device, storage medium, electronic device
CN114218135A (en) Source end flow control method and system based on Redis cache
CN109257280B (en) Micro-engine and message processing method thereof
CN110445874A (en) A kind of conversation processing method, device, equipment and storage medium
CN114157619A (en) Message cache management method and device and network processor
CN111324438A (en) Request scheduling method and device, storage medium and electronic equipment
CN118410001B (en) Method, system, device, product and equipment for data transmission between graphic processing units
CN114168233B (en) Data processing method, device, server and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23829601

Country of ref document: EP

Kind code of ref document: A1