CN118210620A

CN118210620A - Memory management method, device and system

Info

Publication number: CN118210620A
Application number: CN202410123466.4A
Authority: CN
Inventors: 李波涌
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2024-01-29
Filing date: 2024-01-29
Publication date: 2024-06-18

Abstract

The embodiment of the application provides a memory management method, a device and a system, wherein the method comprises the following steps: constructing a configuration information corresponding table according to the equipment information fed back by at least one OS management module of the OS layer and the memory resource information fed back by at least one CXL management module of the CXL Switch layer; responding to a memory allocation request sent by a target host, screening the memory resource information according to a configuration information corresponding table, determining target memory resources in a memory resource pool, generating a memory allocation instruction, and allocating the target memory resources to the target host according to the memory allocation instruction and equipment information. Unified and refined management of memory resources can be realized on the upper layers of CXL FM and OS FM, the utilization rate of the memory resources is improved, and the efficiency of memory management is improved.

Description

Memory management method, device and system

Technical Field

The present application relates to the field of computer technologies, and in particular, to a memory management method, device, and system.

Background

With the development of multi-core processors, the computing speed of server processors keeps increasing. However, memory access speed and memory capacity lag far behind processor speed, so the "memory wall" problem has become increasingly pronounced. Compute Express Link (CXL) technology emerged in this context.

In the related art, after a CXL memory pool is formed, the pool is generally managed by a Fabric Manager (FM) provided by the CXL switch manufacturer or by an FM provided by the operating system (OS). However, the implementation policies of FMs from different manufacturers are not uniform, and compatibility problems may occur when a system is built from hosts of different manufacturers.

Disclosure of Invention

The application mainly provides a memory management method, a device and a system, which can improve the utilization rate of memory resources and further improve the efficiency of memory management.

The technical scheme of the application is realized as follows:

In a first aspect, an embodiment of the present application provides a memory management method, including:

Constructing a configuration information corresponding table according to the equipment information fed back by at least one OS management module of the OS layer and the memory resource information fed back by at least one CXL management module of the CXL Switch layer;

responding to a memory allocation request sent by a target host, screening the memory resource information according to a configuration information corresponding table, determining target memory resources in a memory resource pool, generating a memory allocation instruction, and allocating the target memory resources to the target host according to the memory allocation instruction and equipment information.

In a second aspect, an embodiment of the present application provides a memory management device, including:

the table construction unit is configured to construct a configuration information corresponding table according to the equipment information fed back by the at least one OS management module of the OS layer and the memory resource information fed back by the at least one CXL management module of the CXL Switch layer;

The memory allocation unit is configured to respond to a memory allocation request sent by the target host, screen the memory resource information according to the configuration information corresponding table, determine target memory resources in the memory resource pool, generate a memory allocation instruction, and allocate the target memory resources to the target host according to the memory allocation instruction and the equipment information.

In a third aspect, an embodiment of the present application provides a memory management system, including a memory resource pool, a memory management device as described in the second aspect, at least one CXL switch, and at least one host, where the at least one CXL switch is communicatively connected to the memory resource pool, the memory management device, and the corresponding host, and the memory management device is communicatively connected to the at least one host, respectively, where:

the memory resource pool is used for providing memory resources;

The at least one CXL Switch comprises a CXL Switch layer, wherein at least one CXL management module is arranged on the CXL Switch layer and used for feeding back memory resource information to the memory management device;

the at least one host comprises an OS layer, and at least one OS management module is arranged on the OS layer and used for feeding back equipment information to the memory management device;

the memory management device is used for constructing a configuration information corresponding table according to the equipment information and the memory resource information, responding to a memory allocation request sent by the target host, screening the memory resource information according to the configuration information corresponding table, determining target memory resources in the memory resource pool, generating a memory allocation instruction, and allocating the target memory resources to the target host according to the memory allocation instruction and the equipment information.

The application provides a memory management method, a device and a system, wherein the memory management device constructs a configuration information corresponding table according to equipment information fed back by an OS management module and memory resource information fed back by a CXL management module, and realizes unified management configuration of memory resources by combining the table and a custom scheduling rule. Therefore, unified and refined management of the memory resources is realized on the upper layers of the CXL FM and the OS FM, the utilization rate of the memory resources is improved, and the efficiency of the memory management is further improved.

Drawings

FIG. 1 is a schematic diagram of a memory management system according to an embodiment of the present application;

FIG. 2 is a second schematic diagram of the composition structure of a memory management system according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a memory resource pooling process according to an embodiment of the present application;

FIG. 4 is a first flowchart of a memory management method according to an embodiment of the present application;

FIG. 5 is a second flowchart of a memory management method according to an embodiment of the present application;

FIG. 6 is a third flowchart of a memory management method according to an embodiment of the present application;

Fig. 7 is a schematic diagram of a composition structure of a memory management device according to an embodiment of the present application.

Detailed Description

For a more complete understanding of the nature and the technical content of the embodiments of the present application, reference should be made to the following detailed description of embodiments of the application, taken in conjunction with the accompanying drawings, which are meant to be illustrative only and not limiting of the embodiments of the application.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.

It should also be noted that the terms "first/second/third" in the embodiments of the present application merely distinguish similar objects and do not represent a particular ordering of the objects. Where allowed, "first/second/third" may be interchanged in a particular order or sequence, so that the embodiments described herein can be practiced in an order other than that illustrated or described herein.

As the scale of servers grows, dynamic allocation of resources is a major problem for hyperscale cloud service providers and cloud builders. Because server workloads are not static, a one-time, fixed partitioning of memory resources cannot meet the needs of different servers and inevitably wastes memory resources.

On the other hand, current mainstream computing systems generally adopt a three-level storage structure of cache (SRAM), main memory (DRAM), and external storage (NAND flash). Whenever a server runs, data moves among these three levels. The closer a level is to the processor, the faster it is, so the response time of the slower levels limits overall performance, forming a "memory wall". When the data volume is too large, the data must be kept in external storage, so the processor has to fetch it through Input/Output (IO) access. IO access is several orders of magnitude slower than memory access, which seriously degrades overall performance and forms an "IO wall".

To address the above factors that limit computing system performance, CXL technology was developed. CXL is a new interconnect technology standard accepted by the industry that can effectively relieve the memory wall and IO wall bottlenecks. Through memory sharing and memory access, CXL enables multiple machines to access the same memory address at the hardware level.

After CXL pooling, different hosts can manage and access the pooled memory according to the CXL standard. In the related art, there are mainly two existing management schemes: first, the memory pool is managed by the CXL FM provided by the CXL switch; second, an OS layer is provided on the host, and the memory pool is managed by the FM on the OS layer. Here, the CXL FM is the application logic responsible for system composition and resource allocation, and the FM assigns a (logical) device to a host by using the Component Command Interface (CCI). The FM may take any form, including but not limited to software running on a host, embedded software running on a baseboard management controller (BMC), embedded firmware running on another CXL device or CXL switch, or a state machine running within the CXL device itself.

Although the above management schemes can manage the memory resource pool, the FM is open logic, so different OS vendors and different CXL switch vendors have different implementations, and it is difficult to form a unified configuration logic. Moreover, the CXL FM and OS FM currently available also lack a fine-grained, application-side management strategy for CXL pooled memory.

Based on the technical problems described above, the embodiments of the present application provide a memory management method, device and system, where the memory management device constructs a configuration information correspondence table according to device information fed back by an OS management module and memory resource information fed back by a CXL management module, and implements unified management configuration of memory resources by combining the table and a custom scheduling rule. Unified and refined management of memory resources can be realized on the upper layers of CXL FM and OS FM, the utilization rate of the memory resources is improved, and the efficiency of memory management is improved.

The application is further described in detail below with reference to the accompanying drawings and specific examples.

Fig. 1 is a schematic diagram of a composition structure of a memory management system according to an embodiment of the present application. As shown in fig. 1, the memory management system 10 may include a memory resource pool 103, a memory management device 104, at least one CXL switch (CXL Switch), and at least one host (Host). The memory management device 104 is communicatively connected to the at least one host, and the at least one CXL switch is communicatively connected to the memory resource pool 103, the memory management device 104, and the corresponding host, respectively.

Fig. 2 is a second schematic diagram of the composition structure of a memory management system according to an embodiment of the present application. As shown in fig. 2, the memory management system 10 includes a host cluster 101, a CXL switch cluster 102, and a memory resource pool 103. The host cluster 101 may include an H1 host 1011, an H2 host 1012, an H3 host 1013, an H4 host 1014, ..., and an Hn host 1015, and is connected to the CXL switch cluster 102. The CXL switch cluster 102 may include a first CXL switch 1021, ..., and an M-th CXL switch 1022. The memory resource pool 103 includes the different memories provided by the host cluster 101 as well as external memory resources, all of which are connected to the upper-layer host cluster 101 through the CXL switches.

It should be noted that, in the embodiment of the present application, the memory management system 10 includes M CXL switches and n hosts, and in an actual implementation the number of CXL switches and the number of hosts may be set as required.

The memory resource pool 103 is used for providing memory resources.

In an embodiment of the present application, the memory resource pool 103 may be a device including a plurality of memory resources, where different memory resources may include various types of memory media provided by different hosts, and all the memory media are pooled according to the CXL memory protocol, so that at least one CXL switch, at least one host, the memory management device 104, and other devices access the memory resources through unified addressing.

CXL includes three sub-protocols: CXL.io, which enables CXL to run on the physical layer of the high-speed serial computer expansion bus standard, Peripheral Component Interconnect Express (PCIe); CXL.cache, which allows a CXL device to cache host memory; and CXL.mem, which allows the host to access device-side memory with Load/Store semantics, as if it were accessing local memory.

Fig. 3 is a schematic diagram of a memory resource pooling process according to an embodiment of the present application. As shown in fig. 3, the system bus of a host includes one or more CXL Root Ports (RPs), which connect the storage media of one or more hosts, also referred to as CXL devices, as Endpoints (EPs). During pooling, step ① is performed first: the host or CXL switch sends a PCIe protocol query over the FlexBus bus to discover the memory resources in the endpoint. In step ②, the endpoint returns the sizes of its base address register (BAR) and of its internal memory, which is referred to as host-managed device memory (HDM), so that the memory resources can be enumerated. Finally, in step ③, the host or CXL switch maps the BAR and HDM into its own memory space according to their sizes and a base address (base), and returns the base address to the endpoint so that the endpoint knows where it is mapped. When the host processor accesses a memory request object (mem.Req) in the endpoint through an access instruction, that is, a Load/Store instruction, the request is passed to the RP, carried from the RP in flow control units (CXL flits), and delivered to the endpoint for processing. After address translation in the endpoint, the corresponding memory request object is returned to the host processor.
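Steps ② and ③ of the pooling process can be illustrated with a minimal Python sketch. It assumes a single contiguous address window and ignores alignment and BAR mapping; the `Endpoint` class, the `map_endpoints` function, and the sizes used are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1 << 30

@dataclass
class Endpoint:
    name: str
    hdm_size: int                 # size reported during enumeration (step 2)
    base: Optional[int] = None    # base address returned after mapping (step 3)

def map_endpoints(endpoints, window_start):
    """Assign each endpoint a contiguous base address inside one window."""
    cursor = window_start
    for ep in endpoints:
        ep.base = cursor          # the base is handed back to the endpoint
        cursor += ep.hdm_size     # next region starts right after this one
    return cursor                 # first free address after all mapped regions

# Hypothetical endpoints with illustrative sizes.
eps = [Endpoint("ep0", 16 * GIB), Endpoint("ep1", 32 * GIB)]
end = map_endpoints(eps, window_start=0x1_0000_0000)
```

After the call, each endpoint knows its mapped base address, which corresponds to the base address being returned to the endpoint in step ③.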

The at least one CXL Switch includes a CXL Switch layer having at least one CXL management module disposed thereon, the CXL management module configured to feed back memory resource information to the memory management device 104.

In the embodiment of the present application, the CXL RPs shown in fig. 3 are connected to the upstream ports of the CXL switch, and the downstream ports of the CXL switch are connected to other CXL switches or to the memory resource pool 103. The CXL management module, which may also be referred to as the CXL FM, manages the pooled memory resources accordingly. Illustratively, the CXL switch is provided with an internal routing table. When a host accesses the memory resource pool 103 via Load/Store commands, the CXL management module manages the memory resources according to the three-level virtual hierarchy of host OS layer, CXL Switch layer, and memory resource pool 103, and exposes their base address (base) and size (size) to the memory space of the application program, so that an application program running on the host can directly access the memory resources in the memory resource pool 103 through the HDM mapped into its memory space.

It should be noted that the CXL management modules are disposed on the CXL Switch layer, and although they belong to the same layer, the CXL management modules on different CXL switches may not share the same logic. For example, when the first CXL switch 1021 and the M-th CXL switch 1022 shown in fig. 2 are of different types, the CXL management module of the first CXL switch 1021 may be incompatible with that of the M-th CXL switch 1022. In that case, the CXL management modules on the different CXL switches each communicate out-of-band with their upper layer, that is, the OS layer described below, and feed the memory resource information back to the corresponding OS management module.

The memory resource information may include non-uniform memory access (Non Uniform Memory Access, NUMA) information corresponding to the CXL memory, the allocation status of the memory resource, the HDM base address and memory size, information obtained after address translation, and the like.

In the embodiment of the application, a host can only access memory resources with which it has a binding relationship, and the binding relationship can be obtained by the CXL management module. As shown in fig. 2, the marks on the memory media in the memory resource pool 103 correspond to the hosts with the same marks in the host cluster 101, and a host has a binding relationship with the memory resources that carry its mark. Illustratively, the H1 host 1011 may access the memory resources with diagonal marks, and the H2 host 1012 may access the memory resources with dot marks. In addition, a memory resource with a solid black mark is a shared memory resource (SHARED) and can be accessed by all hosts.
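The access rule just described (a host reaches only the resources bound to it, plus shared resources) can be sketched as a simple lookup. The resource names `mem0` to `mem2` and the `bindings` dictionary are hypothetical illustrations, not part of the application.

```python
SHARED = "SHARED"

# Hypothetical binding relationships: resource -> bound host, or SHARED.
bindings = {
    "mem0": "H1",     # e.g. a diagonal-marked resource
    "mem1": "H2",     # e.g. a dot-marked resource
    "mem2": SHARED,   # e.g. a solid-black resource
}

def can_access(host, resource):
    """Return True if the host is bound to the resource or the resource is shared."""
    owner = bindings.get(resource)   # None for unknown resources
    return owner == SHARED or owner == host
```

For example, `can_access("H2", "mem0")` is false because `mem0` is bound to H1, while any host can access `mem2`.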

At least one host includes an OS layer, on which at least one OS management module is disposed, for feeding back device information to the memory management apparatus 104.

In the embodiment of the present application, as shown in fig. 2, the host cluster 101 may include at least one host, where each host is provided with an OS layer, and it is understood that the OS layers of different hosts are the same layer.

In the virtual hierarchy, the OS layer is the upper layer of the CXL Switch layer, and an OS management module is disposed on the OS layer, for managing the driver and device enumeration related to the CXL Switch.

When the host accesses the memory resource, the OS management module determines a corresponding CXL management module according to the device information stored therein, and the CXL management module further determines the memory resource to be accessed.

The device information may include a CXL switch ID, a virtual bridge ID, a physical port ID, a logical device ID, and the like, and may further include information such as the base address (base) and size (size) of the memory resource shown in fig. 3, which the OS management module may obtain from the CXL management module.

The memory management apparatus 104 is configured to construct a configuration information corresponding table according to the device information and the memory resource information, and in response to a memory allocation request sent by the target host, perform screening in the memory resource information according to the configuration information corresponding table, determine a target memory resource in the memory resource pool 103, generate a memory allocation instruction, and allocate the target memory resource to the target host according to the memory allocation instruction and the device information.

Compared with prior-art schemes that provide only the CXL Switch layer or only the OS layer, the embodiment of the present application adds an application software layer in the virtual hierarchy above both the CXL Switch layer, where the CXL management module is located, and the OS layer, where the OS management module is located. The memory management device 104 on this layer can thus uniformly manage heterogeneous CXL management modules and OS management modules, achieving unified and fine-grained management of memory resources.

In an embodiment of the present application, the memory management device 104 may be a software program disposed on a software layer, where the software layer is located above the OS layer in the virtual hierarchy. If the host cluster 101 shown in fig. 2 includes only one host, the memory management device 104 may be disposed on that host. If the host cluster 101 shown in fig. 2 includes a plurality of hosts, the memory management device 104 may be disposed on one of the hosts, which can communicate with the other hosts, or on a server that manages all the hosts.

Based on the device information fed back by the OS layer and the memory resource information fed back by the CXL Switch layer, the memory management device 104 collects the binding relationships among memory resources, CXL switches, physical devices, logical devices, and hosts, and constructs a configuration information correspondence table. Based on this table, a virtual hierarchy of memory management device 104, OS management module of the OS layer, CXL management module of the CXL Switch layer, and memory resource can be established to schedule the memory resources. In some embodiments, the OS management module of the OS layer and the CXL management module of the CXL Switch layer may instead be in a parallel hierarchical relationship, and both may communicate directly with the memory management device 104.
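One possible shape for the configuration information correspondence table is sketched below. The application does not specify the join key, so this sketch assumes that the pair (switch ID, logical device ID) appears in both reports; the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DeviceInfo:                  # reported by an OS management module
    host: str
    switch_id: int
    virtual_bridge_id: int
    physical_port_id: int
    logical_device_id: int

@dataclass(frozen=True)
class MemoryResourceInfo:          # reported by a CXL management module
    switch_id: int
    logical_device_id: int
    numa_node: int
    hdm_base: int
    size: int
    allocated_to: Optional[str]    # None means not allocated to any host

def build_config_table(devices, resources):
    """Join both reports on the assumed (switch ID, logical device ID) key."""
    by_key = {(d.switch_id, d.logical_device_id): d for d in devices}
    table = {}
    for r in resources:
        key = (r.switch_id, r.logical_device_id)
        # A resource with no matching device report keeps device=None.
        table[key] = {"device": by_key.get(key), "resource": r}
    return table

# Illustrative sample reports.
devices = [DeviceInfo("H1", 1, 0, 3, 0)]
resources = [
    MemoryResourceInfo(1, 0, 1, 0x1_0000_0000, 1 << 30, None),
    MemoryResourceInfo(2, 0, 2, 0x2_0000_0000, 1 << 30, "H2"),
]
table = build_config_table(devices, resources)
```

Keeping both records side by side in one row lets later screening steps read the allocation status and the routing information (switch, port, logical device) from a single lookup.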

In the embodiment of the present application, as described above, there may be at least one OS management module and at least one CXL management module that communicate with the memory management device 104, and these modules may be incompatible with each other. That is, management modules in the same layer have no interfaces for communicating with each other, which makes it difficult for any one OS management module to know how all the memory resources are allocated. In this case, after receiving a memory allocation request sent by a host in the host cluster 101, the memory management device 104 may determine, according to the configuration information correspondence table and the allocation status reflected by the memory resource information, the memory resources that are not allocated to other hosts, that is, the accessible memory resources in the memory resource pool 103. It may then further screen the accessible memory resources according to the user's needs to determine the target memory resource and generate a correct memory allocation instruction. Next, after determining from the device information which CXL management module manages the target memory resource, the memory management device 104 either transmits the memory allocation instruction to that CXL management module through the OS management module, or transmits the instruction to that CXL management module directly, so that the CXL management module allocates the target memory resource to the corresponding host.

It should be further noted that, after the target memory resource is allocated to the corresponding host, the CXL management module may feed back the modified binding relationship to the memory management device 104, and the memory management device 104 updates the binding relationship in the configuration information correspondence table.
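The allocation flow described above (screen out resources already allocated to other hosts, screen again by the user's needs, generate an instruction for the managing CXL management module, then record the new binding) can be sketched as follows. The "smallest region that fits" rule stands in for the unspecified user need, and the `Resource` class, the instruction format, and the module names `fmA` and `fmB` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Resource:
    res_id: str
    size: int
    manager: str                      # CXL management module that controls it
    allocated_to: Optional[str] = None

def allocate(table, host, min_size):
    """Pick an unallocated resource for the host and record the new binding."""
    free = [r for r in table.values() if r.allocated_to is None]
    fitting = [r for r in free if r.size >= min_size]
    if not fitting:
        return None                   # nothing accessible satisfies the request
    target = min(fitting, key=lambda r: r.size)   # assumed rule: smallest fit
    instruction = {"op": "allocate", "resource": target.res_id,
                   "host": host, "send_to": target.manager}
    target.allocated_to = host        # binding update after allocation
    return instruction

# Illustrative table: m0 is already allocated to H2.
table = {
    "m0": Resource("m0", 8, "fmA", allocated_to="H2"),
    "m1": Resource("m1", 16, "fmA"),
    "m2": Resource("m2", 4, "fmB"),
}
first = allocate(table, "H1", min_size=6)
second = allocate(table, "H3", min_size=6)
```

Here `first` targets `m1` because `m0` is taken and `m2` is too small, and `second` is `None` because no free resource of sufficient size remains.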

The embodiment of the application provides a memory management system, which realizes unified and refined management of memory resources by arranging a memory management device on the upper layers of CXL FM and OS FM, improves the utilization rate of the memory resources and improves the efficiency of memory management.

In yet another embodiment of the present application, the at least one host further includes a Unified Extensible Firmware Interface (UEFI) layer, wherein:

The UEFI layer is configured to mark the type of each memory resource in the memory resource pool 103 as a general type or a specific type according to the attribute of each memory resource in the memory resource pool 103.

The general type memory resource can be called by each OS management module, and the specific type memory resource can be called by the appointed OS management module.

The UEFI layer is also disposed on the host, and is located at a lower level of the OS layer and at an upper level of the CXL Switch layer in the virtual hierarchy.

During power-on initialization of the memory management system 10, after the memory pooling steps described above are performed, the UEFI layer, in addition to the conventional memory resource initialization and CXL device initialization, needs to configure each memory resource in the memory resource pool 103 as general-purpose memory (general purpose memory) or specific-purpose memory (specific purpose memory), where general-purpose memory is the general type and specific-purpose memory is the specific type.

In the embodiment of the present application, as shown in fig. 3, the memory resource pool 103 includes a plurality of memory resources. Memory with a solid black mark is general-type memory and can be accessed by the OS management module on any OS layer; memory with any other mark is specific-type memory and can be accessed only by the OS management module on the host with the same mark.

It should be further noted that, after the initialization of the memory resource and the initialization of the CXL device, the UEFI layer may feed back the related initialization information to the OS management module corresponding to the OS layer.

In some embodiments, the OS management module is further configured to configure a corresponding resource association table and a memory attribute table according to initialization information fed back by the UEFI layer, where the resource association table is used to characterize a correspondence between each host and the CXL Switch of each host, and between each host and a memory resource in the memory resource pool 103; the memory attribute table is used to characterize the attributes of each memory resource in the memory resource pool 103.

The OS management module configures the ACPI _OSC (Operating System Capabilities) object according to the initialization information fed back by the UEFI layer, and constructs the resource association table, which may be the System Resource Affinity Table (SRAT), and the memory attribute table, which may be the Heterogeneous Memory Attribute Table (HMAT).

In embodiments of the present application, the OS layer may manage CXL switch-related device drivers and device enumeration based on the resource association table and the memory attribute table.

The SRAT provides information on the binding relationships between memory resources and the at least one host and at least one CXL switch in the memory management system 10, and also on the binding relationships between CXL switches and hosts. The HMAT describes memory attributes, such as storage media type, and may also provide other performance information about memory resources, such as bandwidth and latency.

In the embodiment of the present application, after the OS management module constructs the SRAT and the HMAT, it may feed the information in these tables back to the memory management device 104 as device information through in-band or out-of-band communication with the memory management device 104, so that the memory management device 104 can construct the configuration information correspondence table.
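As a rough illustration of how binding information from the SRAT and attribute information from the HMAT might be merged into device information records, consider the sketch below. Real SRAT and HMAT are binary ACPI tables; the per-resource dictionaries, field names, and numeric values here are simplified, hypothetical placeholders.

```python
def to_device_info(srat, hmat):
    """Merge per-resource SRAT and HMAT entries into flat records."""
    records = []
    for res_id in sorted(srat):
        record = {"resource": res_id}
        record.update(srat[res_id])            # binding: host and switch
        record.update(hmat.get(res_id, {}))    # attributes, if any were reported
        records.append(record)
    return records

# Placeholder values for illustration only.
srat = {
    "mem0": {"host": "H1", "switch": 1},
    "mem1": {"host": "H2", "switch": 1},
}
hmat = {
    "mem0": {"media": "DRAM", "latency_ns": 250, "bandwidth_gbps": 32},
}
device_info = to_device_info(srat, hmat)
```

A resource without an HMAT entry, such as `mem1` here, still yields a record that carries only its binding fields.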

The CXL management module is configured to determine the correspondence between the CXL switch and the memory resources in the memory resource pool 103.

It should be noted that, the CXL management module may feed back, as the memory resource information, the related information of the memory resource managed by the CXL management module, such as non-uniform memory access architecture (Non Uniform Memory Access, NUMA) information corresponding to the memory resource, an HDM base address, a memory size, information after address conversion, and the like, to the memory management device 104, so that the memory management device 104 constructs the configuration information mapping table.

In some embodiments, the CXL management module is further configured to establish a binding relationship between each host and a physical port of the CXL Switch according to the resource association table.

In the embodiment of the present application, during power-on initialization of the memory management system 10, after the SRAT and HMAT are constructed, the CXL management module assigns the CXL switch to the corresponding host through the bind/unbind command according to the resource association table. This command takes four parameters: CXL switch ID, virtual bridge ID, physical port ID, and logical device ID. The CXL management module sends the command to the designated CXL switch, which checks whether the physical port is currently bound to a host. If it is not, the switch updates its internal state to perform the binding. The CXL switch then sends a hot-add notification to the host and finally informs the CXL management module that the binding succeeded.
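The bind sequence described above can be modeled with a small state object. This is only a sketch of the described checks, not the CXL Fabric Manager API; the `BindCommand` and `CxlSwitch` classes, their fields, and the notification tuple are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class BindCommand:
    switch_id: int
    virtual_bridge_id: int
    physical_port_id: int
    logical_device_id: int

@dataclass
class CxlSwitch:
    switch_id: int
    bound_ports: dict = field(default_factory=dict)    # port -> (bridge, device)
    notifications: list = field(default_factory=list)  # hot-add notices sent

    def bind(self, cmd):
        """Bind a physical port if it is free; return True on success."""
        if cmd.switch_id != self.switch_id:
            return False                       # command addressed to another switch
        if cmd.physical_port_id in self.bound_ports:
            return False                       # port is already bound to a host
        self.bound_ports[cmd.physical_port_id] = (
            cmd.virtual_bridge_id, cmd.logical_device_id)
        self.notifications.append(("hot-add", cmd.virtual_bridge_id))
        return True                            # reported back to the management module

    def unbind(self, physical_port_id):
        """Release a port; return True if it was bound."""
        return self.bound_ports.pop(physical_port_id, None) is not None
```

A second `bind` on the same physical port returns `False` until the port is released with `unbind`, which mirrors the check that the port is not currently bound.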

It should be noted that the CXL management module also supports hot-plugging of storage media in the memory resource pool 103 by means of the Coherent Device Attribute Table (CDAT). When a storage medium is hot-plugged at run time, the host reads the corresponding CDAT and, according to the information it reports, such as the internal NUMA domain, memory range, bandwidth, and latency, establishes a memory mapping for the inserted storage medium in a free address range.
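Finding a free range for a hot-plugged storage medium can be sketched with a first-fit search over the ranges already mapped. The application does not state which placement policy is used, so first-fit is an assumption; the function name and the example addresses are hypothetical, and alignment is ignored.

```python
def find_free_range(mapped, size, window_start, window_end):
    """First-fit search for a gap of `size` bytes among mapped (base, length) ranges.

    Returns the base address of the gap, or None if no gap is large enough.
    """
    cursor = window_start
    for base, length in sorted(mapped):
        if base - cursor >= size:
            return cursor                      # gap before this mapped range fits
        cursor = max(cursor, base + length)    # skip past the mapped range
    if window_end - cursor >= size:
        return cursor                          # room left at the end of the window
    return None

# Illustrative layout: two mapped ranges with a hole between them.
mapped = [(0x1000, 0x1000), (0x3000, 0x1000)]
```

With this layout and a window of 0x1000 to 0x8000, a request of 0x1000 bytes lands in the hole at 0x2000, while a request of 0x2000 bytes is placed after the last mapped range at 0x4000.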

The above steps illustrate the initialization performed by the memory management system 10 after power-up. After the initialization is completed, the uppermost memory management device 104 may perform in-band or out-of-band communication with the at least one heterogeneous OS management module and the at least one heterogeneous CXL management module, and perform unified management of the at least one OS management module, the at least one CXL management module, and the memory resource pool 103. Specifically:

In some embodiments, the memory management device 104 is configured to determine a corresponding OS management module and a corresponding CXL management module according to the device information; and sending the memory allocation instruction to the corresponding CXL management module through the corresponding OS management module so that the corresponding CXL management module allocates the target memory resource to the target host according to the memory allocation instruction.

In some embodiments, the memory management device 104 is further configured to screen the memory resource information according to the configuration information correspondence table to determine an initial memory resource; and screening the initial memory resources according to the self-defined scheduling rules to determine target memory resources, wherein the target memory resources are consistent with the OS target memory resources determined according to the resource allocation strategy of the corresponding OS management module and the CXL target memory resources determined according to the resource allocation strategy of the corresponding CXL management module.

In some embodiments, the memory management device 104 is further configured to send a state change command to the target memory resource when the determined target memory resource is inconsistent with the OS target memory resource determined according to the resource allocation policy of the corresponding OS management module and the CXL target memory resource determined according to the resource allocation policy of the corresponding CXL management module according to the configuration information correspondence table and the custom scheduling rule, where the state change command is used to establish a correspondence between the target memory resource and the target host.

In some embodiments, the custom scheduling rules include at least one of:

a secure access rule for determining memory resources accessible by the memory management device 104 in the memory resource pool 103;

a hierarchical access rule for determining an order in which the memory management device 104 accesses the memory resources in the memory resource pool 103;

a performance guarantee rule, configured to determine, according to the history resource access record, that the memory management device 104 schedules the memory resource with the highest speed in the memory resource pool 103;

a RAS rule for performing fault detection and alarming on the memory resource pool 103.

The embodiment of the application provides a memory management system in which the different OS management modules and the different CXL management modules are initialized through the parts of the initialization flow that they have in common, so that the memory management device can uniformly manage the differentiated OS management modules and CXL management modules, and the utilization rate of the memory resource pool is improved.

The above embodiments illustrate the initialization process after the memory management system is powered on, and the following describes in detail the related functions of the memory management device for performing memory management.

In still another embodiment of the present application, fig. 4 is a flowchart illustrating a step of a memory management method according to an embodiment of the present application. As shown in fig. 4, the method may include the steps of:

S301, constructing a configuration information corresponding table according to equipment information fed back by at least one OS management module of an OS layer and memory resource information fed back by at least one CXL management module of a CXL Switch layer.

In the embodiment of the application, the memory management device establishes communication connections with at least one OS management module of the OS layer and at least one CXL management module of the CXL Switch layer, wherein the at least one OS management module may include a plurality of OS management modules with different implementation policies, and the at least one CXL management module may include a plurality of CXL management modules with different implementation policies.

As in the previous embodiment, the device information may include the SRAT and HMAT contents, the CXL switch ID, the virtual bridge ID, the physical port ID, the logical device ID, and the like, and may further include information such as the base address (base) and size (size) of the memory resource shown in fig. 3; the memory resource information may include information about the memory resources managed by the CXL management module, such as the NUMA information corresponding to the memory resource, the HDM base address and memory size, the information after address conversion, and the like.

In the embodiment of the application, after acquiring the device information and the memory resource information, the memory management device places the CXL Switch information managed by the different OS management modules and the memory resource information acquired by the different CXL management modules in one-to-one correspondence with the hosts. Complete database information is thereby established, and a configuration information corresponding table recording the correspondence between physical devices and logical devices is constructed. The table defines the virtual hierarchy formed by the memory management device, the OS management modules of the host OS layer, the CXL management modules of the CXL Switch layer, and the memory resources, together with the device binding relationships between adjacent layers.

It should be noted that in some alternative embodiments, the CXL management module may communicate out-of-band directly with the memory management device, in which case the OS layer and the CXL Switch layer may be located at the same layer in the virtual hierarchy.
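The table construction in S301 can be sketched as a join of the two feedback streams. This is a minimal illustration only: joining on the logical device ID is an assumption, and all record keys (`os_fm`, `cxl_fm`, `bound_host`, and so on) are hypothetical names rather than fields defined by the application.

```python
def build_config_table(device_info, memory_info):
    """Join per-host device records with per-resource memory records.

    device_info: list of dicts fed back by OS management modules.
    memory_info: list of dicts fed back by CXL management modules.
    Returns {logical_device_id: merged record}.
    """
    by_device = {d["logical_device_id"]: d for d in device_info}
    table = {}
    for mem in memory_info:
        dev = by_device.get(mem["logical_device_id"])
        if dev is None:
            continue  # no OS-side record yet, so leave it out of the table
        table[mem["logical_device_id"]] = {**dev, **mem}
    return table


device_info = [
    {"host": "host-a", "os_fm": "os-fm-1", "switch_id": 0,
     "virtual_bridge_id": 1, "physical_port_id": 4, "logical_device_id": 10},
]
memory_info = [
    {"logical_device_id": 10, "cxl_fm": "cxl-fm-1", "numa_node": 2,
     "hdm_base": 0x1_0000_0000, "size": 16 << 30, "bound_host": None},
    {"logical_device_id": 11, "cxl_fm": "cxl-fm-1", "numa_node": 3,
     "hdm_base": 0x5_0000_0000, "size": 16 << 30, "bound_host": None},
]
table = build_config_table(device_info, memory_info)
```

Each merged record then names both the OS management module and the CXL management module responsible for a resource, which is the information later used to route a memory allocation instruction.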

S302, responding to a memory allocation request sent by a target host, screening the memory resource information according to a configuration information corresponding table, determining target memory resources in a memory resource pool, generating a memory allocation instruction, and allocating the target memory resources to the target host according to the memory allocation instruction and equipment information.

In the embodiment of the present application, the target host may be any host of the at least one host. When the memory requirement changes, the target host can send a memory allocation request to the memory management device. The memory allocation request may include, for example, a memory size to be increased, and may further include information related to a memory type, a bandwidth, and the like.

After receiving the memory allocation request, the memory management device may consult the configuration information corresponding table, determine from the memory resource information the memory resources that are not currently allocated to other hosts, screen them further based on the user's requirements to determine a suitable target memory resource, and generate the memory allocation instruction. The memory management device then determines, according to the device information, the CXL management module that manages the target memory resource and sends the memory allocation instruction to it, so that the CXL management module allocates the target memory resource to the target host according to the instruction, that is, establishes a binding relationship between the target memory resource and the target host, and the configuration information corresponding table is updated.

If the screening finds no suitable target memory resource, the memory allocation fails and the memory management device returns allocation failure information to the target host.
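The request handling in S302, including the failure path, might look like the following sketch. The request and resource fields (`size`, `mem_type`, `bound_host`) and the first-match selection are assumptions for illustration; the application leaves the exact request format and selection order open.

```python
def screen(resources, request):
    """Return the first free resource that satisfies the request, else None."""
    for res in resources:
        if res["bound_host"] is not None:
            continue  # already allocated to another host
        if res["size"] < request["size"]:
            continue  # too small for the requested increase
        wanted_type = request.get("mem_type")
        if wanted_type is not None and res["mem_type"] != wanted_type:
            continue
        return res
    return None


def handle_request(resources, request):
    """Allocate a target resource or report allocation failure to the host."""
    target = screen(resources, request)
    if target is None:
        return {"status": "failed", "host": request["host"]}
    target["bound_host"] = request["host"]  # update the table entry
    return {"status": "ok", "host": request["host"], "resource": target["id"]}


pool = [
    {"id": "ld0", "size": 8, "mem_type": "dram", "bound_host": "host-b"},
    {"id": "ld1", "size": 16, "mem_type": "dram", "bound_host": None},
]
ok = handle_request(pool, {"host": "host-a", "size": 12, "mem_type": "dram"})
again = handle_request(pool, {"host": "host-c", "size": 12})
```

The second request fails because the only resource large enough was bound by the first request.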

The embodiment of the application provides a memory management method, wherein a memory management device constructs a configuration information corresponding table according to equipment information fed back by an OS management module and memory resource information fed back by a CXL management module, and realizes unified management configuration of memory resources by combining the table and a custom scheduling rule. Therefore, unified and refined management of the memory resources is realized on the upper layers of the CXL FM and the OS FM, the utilization rate of the memory resources is improved, and the efficiency of the memory management is improved.

In another embodiment of the present application, fig. 5 is a second flowchart illustrating a step of a memory management method according to an embodiment of the present application. As shown in fig. 5, the above-mentioned allocation of the target memory resource to the target host according to the memory allocation instruction and the device information may include the following steps:

S401, determining a corresponding OS management module and a corresponding CXL management module according to the equipment information.

In the embodiment of the present application, after the memory management device receives the memory allocation request sent by the target host and generates the memory allocation instruction, it can determine the transmission path of the instruction. Because the memory management device is at the uppermost layer of the virtual hierarchy and stores the device information representing the management relationships between the devices, it may determine, according to the device information, which CXL management module manages the target memory resource and which OS management module corresponds to it, that is, the transmission path of the memory allocation instruction.

S402, the memory allocation instruction is sent to the corresponding CXL management module through the corresponding OS management module, so that the corresponding CXL management module allocates the target memory resource to the target host according to the memory allocation instruction.

After determining the transmission path of the memory allocation instruction, the memory management device transmits the memory allocation instruction to the corresponding OS management module, and then the OS management module transmits the memory allocation instruction to the corresponding CXL management module. And finally, establishing a binding relation between the target memory resource and the target host by the CXL management module, and distributing the target memory resource to the target host.

While the memory management device issues the memory allocation instruction, each layer also re-evaluates the target memory resource in the instruction according to its own function judgment to determine whether that resource can be allocated to the target host. If the target memory resource does not belong to the OS target memory resource that the OS management module determines through its own judgment rule, the OS management module may replace the target memory resource in the instruction with the OS target memory resource. Correspondingly, when the OS management module passes the instruction to the CXL management module and the CXL management module finds that the OS target memory resource is inconsistent with the CXL target memory resource determined through its own judgment rule, it replaces the OS target memory resource in the instruction with the CXL target memory resource and finally establishes a binding relationship between the CXL target memory resource and the target host.
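The layer-by-layer re-evaluation described above can be sketched as follows. Representing each layer's judgment rule as a non-empty list of acceptable resources, and substituting the first entry when the target is rejected, are simplifying assumptions made for this example.

```python
def review(layer_candidates, instruction):
    """A layer keeps the target if it accepts it, else substitutes its own."""
    if instruction["target"] in layer_candidates:
        return instruction
    # Assumption: the layer's preferred resource is the first candidate.
    return {**instruction, "target": layer_candidates[0]}


def issue(instruction, os_candidates, cxl_candidates, bindings):
    """Pass the instruction through the OS layer, then the CXL layer."""
    after_os = review(os_candidates, instruction)       # OS management module
    after_cxl = review(cxl_candidates, after_os)        # CXL management module
    bindings[after_cxl["target"]] = after_cxl["host"]   # final binding
    return after_cxl


bindings = {}
final = issue({"host": "host-a", "target": "ld3"},
              os_candidates=["ld1", "ld2"],
              cxl_candidates=["ld2", "ld3"],
              bindings=bindings)
unchanged = issue({"host": "host-b", "target": "ld2"},
                  os_candidates=["ld1", "ld2"],
                  cxl_candidates=["ld2", "ld3"],
                  bindings={})
```

In the first call the target is rewritten twice (`ld3` to `ld1` by the OS layer, then `ld1` to `ld2` by the CXL layer), so the resource finally bound differs from the one the memory management device chose. The second call shows that a target acceptable to both layers passes through unmodified.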

In some embodiments, fig. 6 is a flowchart illustrating a step of a memory management method according to an embodiment of the present application. As shown in fig. 6, the above-mentioned filtering in the memory resource information according to the configuration information mapping table, to determine the target memory resource in the memory resource pool may include the following steps:

S501, screening is carried out in the memory resource information according to the configuration information corresponding table, and initial memory resources are determined.

In the embodiment of the present application, the initial memory resource may refer to selecting, according to the memory resource information, memory resources that are not allocated to other hosts in the memory resource pool.

S502, screening is conducted in the initial memory resources according to the self-defined scheduling rules, and the target memory resources are determined.

The target memory resource is consistent with the OS target memory resource determined according to the resource allocation strategy of the corresponding OS management module and the CXL target memory resource determined according to the resource allocation strategy of the corresponding CXL management module.

Further, the memory management device screens the initial memory resources again based on the custom scheduling rules set by the user. In the embodiment of the present application, to prevent the memory allocation instruction from being modified while it is being issued, the memory management device may, when determining the target memory resource, also take into account the resource allocation policy of the corresponding OS management module on the lower OS layer and the resource allocation rule of the corresponding CXL management module on the lower CXL layer. The target memory resource written into the memory allocation instruction is then consistent with what those modules would select, and the instruction is not modified when it is issued to the corresponding CXL management module through the corresponding OS management module.

That is, the target memory resource in the memory allocation instruction is included in the OS target memory resource determined by the OS management module through its resource allocation policy, and also included in the CXL target memory resource determined by the CXL management module through its resource allocation policy.
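A minimal sketch of the two screening stages S501 and S502 follows. Modelling each custom scheduling rule as a predicate and each lower layer's policy as a set of acceptable resources is an assumption made for illustration; the rule `not_quarantined` is hypothetical.

```python
def initial_resources(table):
    """S501: resources in the table that are not allocated to any host."""
    return [rid for rid, rec in table.items() if rec["bound_host"] is None]


def target_resources(initial, custom_rules, os_accepts, cxl_accepts):
    """S502: apply custom rules, then keep only what both lower layers accept."""
    candidates = initial
    for rule in custom_rules:
        candidates = [rid for rid in candidates if rule(rid)]
    return [rid for rid in candidates
            if rid in os_accepts and rid in cxl_accepts]


table = {
    "ld0": {"bound_host": "host-b"},
    "ld1": {"bound_host": None},
    "ld2": {"bound_host": None},
    "ld3": {"bound_host": None},
}


def not_quarantined(rid):
    # Hypothetical custom rule: ld3 is excluded from scheduling.
    return rid != "ld3"


initial = initial_resources(table)
targets = target_resources(initial, [not_quarantined],
                           os_accepts={"ld1", "ld2"},
                           cxl_accepts={"ld2", "ld3"})
```

Only `ld2` survives, because it is free, passes the custom rule, and lies in both the OS and the CXL acceptable sets, so an instruction naming it would not be rewritten on the way down.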

In some embodiments, before generating the memory allocation instruction, the method may further include: and when the determined target memory resource is inconsistent with the OS target memory resource determined according to the resource allocation strategy of the corresponding OS management module and the CXL target memory resource determined according to the resource allocation strategy of the corresponding CXL management module, sending a state change command to the target memory resource, wherein the state change command is used for establishing the corresponding relation between the target memory resource and the target host.

As in the previous embodiment, after determining the target memory resource, the memory management device may check it against the resource allocation policy of the OS management module and the memory allocation policy of the CXL management module, and, if the results are consistent, generate the memory allocation instruction and issue it.

In the embodiment of the application, in order to ensure that the target memory resource determined by the memory management device is not modified in the issuing process, the memory management device can send a state change command to the corresponding CXL switch after determining the target memory resource, so that the CXL switch establishes the binding relationship between the target memory resource and the target host in advance. In this way, in the issuing process, when the OS management module and the CXL management module determine that the target memory resource is allocated to the target host, the memory allocation instruction is not modified any more.

The state change command may be an exclusive CCI command, which is used to forcedly cancel the binding relationship between the target memory resource and the current host, and allocate the target memory resource to the target host in an exclusive manner.
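The effect of the state change command can be illustrated with the sketch below. It assumes, purely for illustration, that a lower layer leaves an instruction untouched once the binding it names already exists; `SwitchState` and `exclusive_state_change` are hypothetical names and do not correspond to an actual CCI command encoding.

```python
class SwitchState:
    def __init__(self, bindings):
        self.bindings = dict(bindings)  # resource -> host

    def exclusive_state_change(self, resource, host):
        """Drop any existing binding and bind `resource` to `host` only.

        Returns the host that was previously bound, or None.
        """
        previous = self.bindings.get(resource)
        self.bindings[resource] = host
        return previous


def lower_layer_review(state, instruction, preferred):
    """Keep the instruction if its binding already exists, else rewrite it."""
    if state.bindings.get(instruction["target"]) == instruction["host"]:
        return instruction
    return {**instruction, "target": preferred}


state = SwitchState({"ld3": "host-b"})
instruction = {"host": "host-a", "target": "ld3"}

# Without the state change command, the lower layer rewrites the target.
without_cmd = lower_layer_review(state, instruction, preferred="ld1")

# With the command issued first, the pre-established binding is honored.
evicted = state.exclusive_state_change("ld3", "host-a")
with_cmd = lower_layer_review(state, instruction, preferred="ld1")
```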

The embodiment of the application provides a memory management method in which the uppermost memory management device is unaffected by OS management modules and CXL management modules with different implementation policies. A virtual hierarchy comprising heterogeneous OS management modules and heterogeneous CXL management modules is established, and the memory management device manages the lower layers uniformly through state change commands, which improves the efficiency of the memory management system.

In still another embodiment of the present application, the custom scheduling rule includes at least one of the following:

A security access rule, used for determining the memory resources in the memory resource pool that the memory management device can access.

In the embodiment of the application, the security access rule can be used to prevent attacks such as abnormal memory access and malicious code execution, building a secure memory environment for the memory management system. By setting an access denial rule, the memory management device prevents certain hosts from accessing certain memory resources, so that malicious programs cannot obtain sensitive data in those memory resources; the memory management device can also set a security verification rule so that only a host that passes verification can access certain memory resources, improving the storage security of the memory resources.

It should be noted that a security access rule of the OS may also be set in the OS management module, and the security access rule in the memory management device may add further rules on top of the OS security access rule to enhance the security of memory management.

A hierarchical access rule, used for determining the order in which the memory management device accesses the memory resources in the memory resource pool.

As in the previous embodiment, the memory resources in the memory resource pool may include memory resources provided by the hosts and may also include external extended memory resources. When receiving a memory allocation request sent by a target host, the memory management device can determine, according to the hierarchical access rule, the priority with which the target host accesses the memory resources.

For example, the memory management device may set priority rules such as a local priority rule, so that the memory management device preferentially allocates the memory resource provided by the target host itself as the target memory resource; or the memory management device may set a service priority rule, so that a memory allocation request from a host running a key service is processed preferentially without queuing.

A performance guarantee rule, used for determining, according to the history resource access record, the memory resource in the memory resource pool that the memory management device can schedule at the highest speed.

In the embodiment of the present application, a history resource access record may be stored in the memory management device, where the history resource access record records the delay with which each host accessed the memory resources in the memory resource pool during the most recent preset time period. Thus, after receiving the memory allocation request sent by the target host, the memory management device can establish an optimal scheduling relationship by querying the history resource access record and determine the memory resource with the shortest access time in the target host's record as the target memory resource, so that the target memory resource can be accessed at the highest speed.

A RAS rule, used for performing fault detection and alarming on the memory resource pool.

RAS refers to the Reliability, Availability, and Serviceability of a system. The goal of the RAS rule is to keep the system operating reliably for as long as possible without shutdown and to reduce the occurrence of sudden system downtime.

The RAS rule provides a periodic detection function. After detecting a hardware fault in the system, the memory management device can correct the fault as far as possible through the error recovery mechanism provided by the RAS rule, so that reliable operation of the system is maintained. If the fault cannot be recovered, the memory management device can trigger an interrupt or an exception according to the RAS rule and raise an alarm to the operator before the hardware error brings the system down, reminding the operator to replace the hardware in time.
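One periodic RAS pass of the kind described above might be sketched as follows. The health states (`"ok"`, `"fault"`, `"needs-replacement"`) and the `try_recover` callback are hypothetical; a real implementation would depend on the error reporting and recovery mechanisms of the platform.

```python
def ras_cycle(modules, try_recover, alerts):
    """Run one periodic RAS pass over the memory resource pool.

    modules: {module_id: health}, where health is "ok" or "fault".
    try_recover: callable(module_id) -> bool, True if the error was corrected.
    alerts: list that collects alarms for the operator.
    """
    for module_id, health in modules.items():
        if health != "fault":
            continue
        if try_recover(module_id):
            modules[module_id] = "ok"
        else:
            # Recovery failed: flag the module and alert the operator
            # before the fault can bring the system down.
            modules[module_id] = "needs-replacement"
            alerts.append(f"replace {module_id}")
    return modules


alerts = []
state = ras_cycle({"m0": "ok", "m1": "fault", "m2": "fault"},
                  try_recover=lambda m: m == "m1", alerts=alerts)
```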

It should be further noted that the foregoing rule is an example of a custom scheduling rule, and the custom scheduling rule may also include other rules, where the content included in the custom scheduling rule and the sequence determined between the rules may be determined according to the selection of the user.
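To illustrate how user-chosen rule order can change the outcome, the sketch below chains a security rule, a local priority rule, and a performance rule. All names and the `ctx` layout are hypothetical, and implementing the ordering rules as stable sorts (so that the rule applied last dominates) is one possible design among several.

```python
def security_rule(host, candidates, ctx):
    """Drop resources that the host is denied access to."""
    denied = ctx["denied"].get(host, set())
    return [c for c in candidates if c not in denied]


def local_priority_rule(host, candidates, ctx):
    """Stable sort: resources provided by the requesting host come first."""
    return sorted(candidates, key=lambda c: ctx["provider"].get(c) != host)


def performance_rule(host, candidates, ctx):
    """Stable sort by recorded latency; unknown latency sorts last."""
    latency = ctx["latency_ns"].get(host, {})
    return sorted(candidates, key=lambda c: latency.get(c, float("inf")))


def schedule(host, candidates, rules, ctx):
    """Apply the rules in the order chosen by the user and pick the first hit."""
    for rule in rules:
        candidates = rule(host, candidates, ctx)
    return candidates[0] if candidates else None


ctx = {
    "denied": {"host-a": {"ld0"}},
    "provider": {"ld1": "host-a", "ld2": "host-b", "ld3": "host-b"},
    "latency_ns": {"host-a": {"ld1": 300, "ld2": 180, "ld3": 220}},
}
pool = ["ld0", "ld1", "ld2", "ld3"]
local_first = schedule(
    "host-a", pool,
    [security_rule, performance_rule, local_priority_rule], ctx)
fastest_first = schedule(
    "host-a", pool,
    [security_rule, local_priority_rule, performance_rule], ctx)
blocked = schedule("host-a", ["ld0"], [security_rule], ctx)
```

With the local priority rule applied last, the host's own resource `ld1` wins; with the performance rule applied last, the lowest-latency resource `ld2` wins. A request whose only candidate is denied yields no target.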

The embodiment of the application provides a memory management method that realizes fine-grained management of memory resources and improves the flexibility and security of memory management by setting security verification, management classification, and the like in the memory management device.

In still another embodiment of the present application, based on the foregoing embodiment, the initialization process of the memory management system and the functions of the memory management device are described in detail.

In the embodiment of the application, the virtual hierarchical structure of the memory management system and the execution flow of the CXL pooled memory management scheme after the memory management system is powered on are described:

The CXL memory protocol enables a device to expose host-managed device memory (HDM), which allows a host to manage and access that memory in a way similar to native Double Data Rate (DDR) memory attached to the host.

During initialization, the device memory may be integrated into system memory; fig. 3 may be used to describe how the memory of a CXL device, i.e., the memory resource, is mapped into the memory space of the host.

The virtual hierarchy of the memory management system may include, from top to bottom:

management software layer: a unified pooled memory management method is established.

OS layer: configures the ACPI _OSC object, the System Resource Affinity Table (SRAT), and the Heterogeneous Memory Attribute Table (HMAT), and manages the drivers and device enumeration of CXL-related devices.

UEFI layer: in addition to conventional memory initialization and CXL device initialization, configures the type of the CXL memory as either general-purpose memory or special-purpose memory and reports it in the corresponding fields of the HMAT.

CXL Switch layer: the FM on the CXL Switch binds devices to and unbinds devices from a host via the bind/unbind command, which takes four parameters: CXL switch ID, virtual bridge ID, physical port ID, and logical device ID. The FM sends this command to the designated switch, which checks whether the physical port is currently unbound. If the port is unbound, the switch updates its internal state to perform the binding. It then sends a hot-add indication to the host and finally informs the FM that the binding succeeded. The virtual bridge and its associated devices now appear in the virtual hierarchy of the host. The FM also manages hot-plugging of CXL devices.

After the configuration of the memory management system is completed, at the uppermost layer of the virtual hierarchy, the memory management device on the management software layer can realize the following functions:

First, in-band or out-of-band communication is established between the management software, i.e., the memory management device, and the management module (FM) on the CXL Switch. The management software collects the resource information of each host's CXL pooled memory, such as the NUMA information corresponding to the CXL memory, the HDM base address and memory size, and the related information after address conversion. The address information of the CXL device memory (HDM) obtained by the OS side of each host is placed in one-to-one correspondence with that host, and the host is bound to its CXL switch ID, virtual bridge ID, physical port ID, logical device ID, and so on. Complete database information is thereby established to form the configuration information corresponding table of physical devices and logical devices.

Secondly, a set of custom-configurable scheduling rules is established, including the following:

Security access rules, such as establishing denial of access rules, security validation rules. Security authentication may be enforced by security policies performed by the OS.

Access rights hierarchy, such as priority rules: if the local priority is given, the local CXL memory area is used preferentially; if the service is prioritized, the optimal memory allocation of the key service is ensured.

Performance guarantee and RAS functions: the management software establishes an optimal scheduling relationship by analyzing existing access performance records, and performs hot backup or removal of CXL pooled memory modules that have RAS risk warnings or faults.

Finally, the management software adds a judgment step after the host OS or an application issues an access request to the pooled memory module. It generates a suitable scheduling scheme according to the established scheduling rules and the known pooled memory resource information, and the CXL Switch FM then combines this with its own function judgment to generate the correct access command, such as binding or unbinding a logical device. The key to this step is that the management software must first put the CXL pooled module into a suitable state, for example by changing the device state with an exclusive CCI command, so that the CXL Switch FM does not change, without authorization, a logical module that has already been operated on.

The embodiment of the application provides a memory management method and a memory management system in which a unified pooled memory management method is set in the management software layer, so that access to and scheduling of the CXL pooled memory can be realized from the upper layer by controlling the OS application or the FM on the CXL Switch, without being affected by the policies of different CXL Switch FMs and different OSs. Different levels of access rights can also be generated with user-configurable rules, and a series of security allocation policies can be configured and made to interoperate with the OS and the FM on the CXL Switch to strengthen configuration management of the pooled memory. Security authentication is added to the scheduling of the CXL pooled memory, and hierarchical management of the CXL pooled memory is realized. The demand relationships are thus obtained at the system level to manage the CXL pooled memory better, and the utilization rate of the CXL pooled memory is improved.

In still another embodiment of the present application, fig. 7 is a schematic diagram illustrating a composition structure of a memory management device according to an embodiment of the present application. As shown in fig. 7, the memory management device 104 may include:

A table construction unit 1041 configured to construct a configuration information corresponding table according to the device information fed back by the at least one OS management module of the OS layer and the memory resource information fed back by the at least one CXL management module of the CXL Switch layer;

The memory allocation unit 1042 is configured to respond to the memory allocation request sent by the target host, screen the memory resource information according to the configuration information corresponding table, determine the target memory resource in the memory resource pool, generate a memory allocation instruction, and allocate the target memory resource to the target host according to the memory allocation instruction and the device information.

In some embodiments, the memory allocation unit 1042 is further configured to determine a corresponding OS management module and a corresponding CXL management module according to the device information; and sending the memory allocation instruction to the corresponding CXL management module through the corresponding OS management module so that the corresponding CXL management module allocates the target memory resource to the target host according to the memory allocation instruction.

In some embodiments, the memory allocation unit 1042 is further configured to screen the memory resource information according to the configuration information mapping table to determine an initial memory resource; and screening the initial memory resources according to the self-defined scheduling rules to determine target memory resources, wherein the target memory resources are consistent with the OS target memory resources determined according to the resource allocation strategy of the corresponding OS management module and the CXL target memory resources determined according to the resource allocation strategy of the corresponding CXL management module.

In some embodiments, the memory allocation unit 1042 is further configured to send a state change command to the target memory resource when the determined target memory resource is inconsistent with the OS target memory resource determined according to the resource allocation policy of the corresponding OS management module and the CXL target memory resource determined according to the resource allocation policy of the corresponding CXL management module according to the configuration information correspondence table and the custom scheduling rule, where the state change command is used to establish a correspondence between the target memory resource and the target host.

In some embodiments, the custom scheduling rules used by the memory allocation unit 1042 include at least one of: a security access rule for determining memory resources accessible by the memory management device in the memory resource pool;

a hierarchical access rule for determining the order of access of the memory management device to the memory resources in the memory resource pool;

a performance guarantee rule for determining, according to the history resource access record, the memory resource in the memory resource pool that the memory management device can schedule at the highest speed;

a RAS rule for performing fault detection and alarming on the memory resource pool.

It will be appreciated that in this embodiment, the "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., and may of course be a module, or may be non-modular. Furthermore, the components in the present embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional modules.

It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" or "some embodiments" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" or "in some embodiments" in various places throughout this specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application. The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments. The foregoing description of various embodiments is intended to highlight differences between the various embodiments, which may be the same or similar to each other by reference, and is not repeated herein for the sake of brevity.

It should also be noted that, in the present disclosure, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

The methods disclosed in the method embodiments provided by the application can be arbitrarily combined under the condition of no conflict to obtain a new method embodiment.

The features disclosed in the several product embodiments provided by the application can be combined arbitrarily under the condition of no conflict to obtain new product embodiments.

The features disclosed in the embodiments of the method or the apparatus provided by the application can be arbitrarily combined without conflict to obtain new embodiments of the method or the apparatus.

The above description is not intended to limit the scope of the application, but is intended to cover any modifications, equivalents, and improvements within the spirit and principles of the application.

Claims

1. A memory management method, comprising:

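constructing a configuration information corresponding table according to device information fed back by at least one OS management module of an OS layer and memory resource information fed back by at least one CXL management module of a CXL Switch layer;

responding to a memory allocation request sent by a target host, screening the memory resource information according to the configuration information corresponding table, determining target memory resources in a memory resource pool, generating a memory allocation instruction, and allocating the target memory resources to the target host according to the memory allocation instruction and the device information.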

2. The method of claim 1, wherein allocating the target memory resource to the target host according to the memory allocation instruction and the device information comprises:

determining a corresponding OS management module and a corresponding CXL management module according to the device information;

and sending the memory allocation instruction to the corresponding CXL management module through the corresponding OS management module, so that the corresponding CXL management module allocates the target memory resource to the target host according to the memory allocation instruction.

3. The method of claim 1, wherein screening the memory resource information according to the configuration information corresponding table and determining the target memory resource in the memory resource pool comprises:

screening the memory resource information according to the configuration information corresponding table to determine initial memory resources;

and screening the initial memory resources according to a custom scheduling rule to determine the target memory resource, wherein the target memory resource is consistent with the OS target memory resource determined according to the resource allocation strategy of the corresponding OS management module and with the CXL target memory resource determined according to the resource allocation strategy of the corresponding CXL management module.

4. The method of claim 3, wherein prior to generating the memory allocation instruction, the method further comprises:

and when the determined target memory resource is inconsistent with the OS target memory resource determined according to the resource allocation strategy of the corresponding OS management module and the CXL target memory resource determined according to the resource allocation strategy of the corresponding CXL management module, sending a state change command to the target memory resource, wherein the state change command is used for establishing the corresponding relation between the target memory resource and the target host.

5. The method of claim 1, wherein the custom scheduling rules comprise at least one of:

a security access rule for determining memory resources accessible by the memory management device in the memory resource pool;

a hierarchical access rule for determining an order in which the memory management device accesses memory resources in the memory resource pool;

a performance guarantee rule for determining, according to historical resource access records, the memory resource in the memory resource pool that the memory management device schedules at the highest speed;

a RAS rule for performing fault detection and alarming on the memory resource pool.

6. A memory management device, comprising:

a memory allocation unit, configured to respond to a memory allocation request sent by a target host, screen the memory resource information according to the configuration information corresponding table, determine target memory resources in a memory resource pool, generate a memory allocation instruction, and allocate the target memory resources to the target host according to the memory allocation instruction and the device information.

7. A memory management system comprising a memory resource pool, the memory management device of claim 6, at least one CXL switch, and at least one host, the at least one CXL switch communicatively coupled to the memory resource pool, the memory management device, and the corresponding host, respectively, the memory management device communicatively coupled to the at least one host, wherein:

the memory resource pool is configured to provide memory resources;

the at least one CXL Switch comprises a CXL Switch layer, wherein at least one CXL management module is arranged on the CXL Switch layer and is configured to feed back memory resource information to the memory management device;

the at least one host comprises an OS layer, wherein at least one OS management module is arranged on the OS layer and is configured to feed back device information to the memory management device;

the memory management device is configured to construct a configuration information corresponding table according to the device information and the memory resource information and, in response to a memory allocation request sent by a target host, to screen the memory resource information according to the configuration information corresponding table, determine a target memory resource in the memory resource pool, generate a memory allocation instruction, and allocate the target memory resource to the target host according to the memory allocation instruction and the device information.

8. The system of claim 7, wherein the at least one host further comprises a UEFI layer, and wherein:

the UEFI layer is configured to mark, according to an attribute of each memory resource in the memory resource pool, a type of each memory resource in the memory resource pool as a general type or a specific type;

9. The system according to claim 7, wherein:

the OS management module is further configured to configure a corresponding resource association table and a memory attribute table according to initialization information fed back by the UEFI layer, wherein the resource association table is used to characterize a correspondence between each host and the CXL Switch of that host, and between each host and the memory resources in the memory resource pool, and the memory attribute table is used to characterize the attribute of each memory resource in the memory resource pool;

and the CXL management module is configured to determine the correspondence between the CXL Switch and the memory resources in the memory resource pool.

10. The system according to claim 7, wherein:

the CXL management module is further configured to establish a binding relation between each host and a physical port of the CXL Switch according to the resource association table.