
CN110297787A - Method, apparatus, and device for I/O device memory access - Google Patents

Method, apparatus, and device for I/O device memory access Download PDF

Info

Publication number
CN110297787A
CN110297787A (application CN201810240206.XA)
Authority
CN
China
Prior art keywords
cache
access
memory access
data
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810240206.XA
Other languages
Chinese (zh)
Other versions
CN110297787B (en)
Inventor
李鹏
曾露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Loongson Technology Corp Ltd
Original Assignee
Loongson Technology Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Loongson Technology Corp Ltd filed Critical Loongson Technology Corp Ltd
Priority to CN201810240206.XA priority Critical patent/CN110297787B/en
Publication of CN110297787A publication Critical patent/CN110297787A/en
Application granted granted Critical
Publication of CN110297787B publication Critical patent/CN110297787B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present invention provides a method, apparatus, and device for I/O device memory access. The method of the invention calculates the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to an I/O memory-access write request is written into the Cache until it is read, and updates the number of Cache ways available to I/O data according to this maximum hit way count, so that after the update the number of ways available to I/O data equals the difference between the total number of Cache ways and the maximum hit way count. I/O memory-access processing is then carried out according to the number of Cache ways available to I/O data. The space occupied by I/O data in the Cache is thus adjusted dynamically, in real time, according to the CPU's actual use of the Cache, so that I/O memory-access performance and Cache space utilization are improved without degrading CPU performance, thereby improving the overall performance of the processor.

Description

Method, apparatus, and device for I/O device memory access
Technical field
The present invention relates to the field of processors, and in particular to a method, apparatus, and device for I/O device memory access.
Background technique
With the rapid development of microprocessor technology, the integration density of microprocessors has grown ever higher and their computing power has improved greatly, so that the memory-access performance of I/O devices has become the bottleneck limiting further processor performance gains.
Traditional I/O device memory access generally uses direct memory access (DMA) or direct cache access (DCA). DMA allows an I/O device to read and write main memory directly, reducing the processor core's involvement in I/O data handling; DCA, in order to improve I/O memory-access performance, allows an I/O device to read and write the cache memory (Cache) directly. Although DCA improves the memory-access performance of I/O devices, writing I/O data directly into the cache pollutes the Cache with I/O data and thus seriously affects other processes running on the processor. To reduce this pollution, the Partition-Based DMA Cache (PBDC) scheme was proposed: the Cache is statically partitioned into two regions, one for I/O data and one for processor data, so that I/O data and processor data are kept separate and Cache pollution is reduced.
However, PBDC requires fairly substantial changes to the Cache structure and the coherence protocol, so its implementation complexity is high. Moreover, because I/O memory-access behavior is diverse, if too little Cache space is allocated to I/O data, the I/O Cache space becomes insufficient and I/O data is evicted from the Cache before it is ever used, causing a serious drop in overall processor performance; if too much Cache space is allocated to I/O data, the performance of other programs suffers, which likewise degrades overall processor performance.
Summary of the invention
The present invention provides a method, apparatus, and device for I/O device memory access, to solve the problem in existing access methods that, owing to the diversity of I/O memory access, allocating too little Cache space to I/O data leaves the I/O Cache space insufficient, so that I/O data is evicted before being used and overall processor performance drops severely, while allocating too much Cache space to I/O data harms the performance of other programs and likewise degrades overall processor performance.
One aspect of the invention provides a method for I/O device memory access, comprising:
receiving an I/O memory-access write request, and calculating the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache memory (Cache) until it is read;
updating, according to the maximum hit way count, the number of Cache ways available to I/O data, so that after the update the number of ways available to I/O data equals the difference between the total number of Cache ways and the maximum hit way count;
carrying out I/O memory-access processing according to the number of Cache ways available to I/O data.
Another aspect of the invention provides an apparatus for I/O device memory access, comprising:
a first computing module, configured to receive an I/O memory-access write request and calculate the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache memory (Cache) until it is read;
an update module, configured to update, according to the maximum hit way count, the number of Cache ways available to I/O data, so that after the update the number of ways available to I/O data equals the difference between the total number of Cache ways and the maximum hit way count;
a memory-access processing module, configured to carry out I/O memory-access processing according to the number of Cache ways available to I/O data.
Another aspect of the invention provides a computer device, comprising: a processor and a memory;
and a computer program stored on the memory and executable by the processor;
wherein the processor, when executing the computer program, implements the above method for I/O device memory access.
In the method, apparatus, and device for I/O device memory access provided by the invention, the maximum number of ways hit by CPU access requests is calculated for the period from when the I/O data corresponding to an I/O memory-access write request is written into the cache memory (Cache) until it is read; according to this maximum hit way count, the number of Cache ways available to I/O data is updated so that it equals the difference between the total number of Cache ways and the maximum hit way count; and I/O memory-access processing is then carried out according to the number of Cache ways available to I/O data. The number of ways available to I/O data in the Cache is thus updated dynamically, in real time, according to the CPU's use of the Cache, so the space occupied by I/O data in the Cache can be adjusted dynamically: the number of ways available to I/O data is increased only insofar as other CPU threads are not significantly affected. I/O memory-access performance and the space utilization of the Cache are therefore improved without degrading CPU performance, further improving the overall performance of the processor.
Brief description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, show embodiments consistent with the invention and, together with the specification, serve to explain the principles of the invention.
Fig. 1 is a flowchart of the method for I/O device memory access provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the method for I/O device memory access provided by Embodiment 2 of the present invention;
Fig. 3 is a flowchart of the method for I/O device memory access provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of the apparatus for I/O device memory access provided by Embodiment 5 of the present invention;
Fig. 5 is a schematic structural diagram of the computer device provided by Embodiment 9 of the present invention.
The above drawings show specific embodiments of the present invention, which are described in more detail hereinafter. These drawings and the accompanying text are not intended to limit the scope of the inventive concept in any manner, but rather to illustrate the concept of the invention to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Example embodiments are described in detail here, and examples thereof are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following example embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with some aspects of the invention as detailed in the appended claims.
Terms used in the present invention are first explained:
Cache memory: also called the Cache, a high-speed, small-capacity memory located, in the hierarchy of a computer memory system, between the central processing unit (CPU, or processor) and main memory; together with main memory it forms one level of the memory hierarchy. The Cache is usually built from static RAM (SRAM) chips; its capacity is much smaller than main memory's, but its speed is much higher, approaching that of the CPU. Because of the locality of the instructions a processor executes, the processor achieves a very high hit rate in the Cache and goes to main memory only when data cannot be found in the Cache, which greatly increases the CPU's processing speed.
Cache structure: the Cache is usually implemented with an associative memory, in which every memory block (also called a Cache line) carries additional stored information called a tag. When the associative memory is accessed, the address is compared against every tag simultaneously, so that the memory block with the matching tag is accessed. The Cache in the embodiments of the present invention uses a multi-way set-associative structure; a set-associative Cache lies between a fully associative Cache and a direct-mapped Cache, using several groups of direct-mapped blocks, so that a given index corresponds to the positions of several Cache lines within a set, which increases the hit rate and the system's efficiency.
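The multi-way set-associative lookup described above can be illustrated with a small software model (a hypothetical sketch for exposition only, not part of the patent text; class and method names are assumed). The model returns the LRU-stack position of each hit, which is the quantity the later embodiments monitor:

```python
class SetAssociativeCache:
    """Toy model of a W-way set-associative cache with LRU ordering.

    Each set holds up to W tags, ordered from MRU (list index 0) to
    LRU (list index W-1). lookup() returns the LRU-stack position of
    a hit (0 = MRU) or None on a miss, and updates the ordering.
    """

    def __init__(self, num_sets, num_ways):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.sets = [[] for _ in range(num_sets)]  # MRU-first tag lists

    def split_address(self, address):
        # The index selects the set; the remaining bits form the tag.
        return address % self.num_sets, address // self.num_sets

    def lookup(self, address):
        index, tag = self.split_address(address)
        ways = self.sets[index]
        if tag in ways:
            position = ways.index(tag)   # LRU-stack depth of the hit
            ways.remove(tag)
            ways.insert(0, tag)          # promote the line to MRU
            return position
        # Miss: insert at MRU, evicting the LRU line if the set is full.
        if len(ways) == self.num_ways:
            ways.pop()
        ways.insert(0, tag)
        return None
```

For example, two back-to-back accesses to the same address produce a miss followed by a hit at stack depth 0; an intervening access to another line in the same set pushes the first line one position deeper.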
Direct memory access (DMA) mode: allows an I/O device to read and write main memory directly, reducing the processor core's involvement in I/O data handling.
Direct cache access (DCA) mode: allows an I/O device to read and write the cache memory (Cache) directly, improving the I/O device's memory-access performance.
In addition, the terms "first", "second", and so on are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implying the number of the technical features referred to. In the description of the following embodiments, "plurality" means two or more unless specifically defined otherwise.
The specific embodiments below may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below with reference to the drawings.
Embodiment 1
Because I/O data has strong streaming behavior, I/O accesses generally occur at consecutive addresses; that is, their spatial locality is good. This means that with a multi-way set-associative Cache structure, each Cache set receives roughly the same number of I/O requests, which improves Cache space utilization. The same property also means that a quasi-partitioning of the Cache by way is more effective for I/O data than partitioning the Cache per CPU thread, and it avoids more complex Cache-partitioning and replacement schemes such as hash-indexed addressing. The Cache in the embodiments of the present invention therefore uses a multi-way set-associative organization.
Fig. 1 is a flowchart of the method for I/O device memory access provided by Embodiment 1 of the present invention. Aiming at the problem in existing access methods that, owing to the diversity of I/O memory access, allocating too little Cache space to I/O data leaves the I/O Cache space insufficient, so that I/O data is evicted before being used and overall processor performance drops severely, while allocating too much Cache space to I/O data harms the performance of other programs and likewise degrades overall processor performance, this embodiment of the present invention provides a method for I/O device memory access. As shown in Fig. 1, the specific steps of the method are as follows:
Step S101: receive an I/O memory-access write request, and calculate the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache memory (Cache) until it is read.
Here the Cache uses a W-way set-associative organization, where W is 2 to the power n and n is a positive integer.
In this embodiment, the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache memory (Cache) until it is read is denoted the target time period. On receiving an I/O memory-access write request, the processor can track the hit behavior of CPU access requests in the Cache during the target time period, and thereby calculate the maximum number of ways hit by CPU access requests in the Cache during the target time period.
In this embodiment, the maximum hit way count of CPU access requests in the Cache during the target time period may be taken as the hit way position closest to the LRU position among the hits that other CPU threads' accesses score on CPU data in the Cache during the target time period; it reflects how intensively the other CPU threads use the Cache. The LRU position is the least-recently-used position, i.e., under the least recently used (LRU) replacement policy, the position of the Cache line that will be replaced next.
Step S102: according to the maximum hit way count, update the number of Cache ways available to I/O data, so that after the update the number of ways available to I/O data equals the difference between the total number of Cache ways and the maximum hit way count.
For a W-way set-associative Cache, suppose the maximum hit way count measured over the target time period is N (N ≤ W). If data in way N was hit then, by the LRU-stack property, the ways before way N have with very high probability also been hit; hence, if only N Cache ways are allotted to CPU data during the target time period, the hit rate will not be affected. The number of ways available to I/O data is therefore set, in the most conservative manner, to (W - N), which minimizes the impact on other CPU threads.
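The allocation rule above can be written as a one-line update (a minimal sketch under assumed names, not the patent's hardware implementation):

```python
def io_available_ways(total_ways, max_hit_way_count):
    """Return the number of Cache ways to make available to I/O data.

    By the LRU-stack property, if some CPU access hit at stack depth N,
    shallower depths have almost certainly been hit too, so reserving
    N ways for CPU data leaves the CPU hit rate unaffected; the
    remaining W - N ways can safely be granted to I/O data.
    """
    if not 0 <= max_hit_way_count <= total_ways:
        raise ValueError("max hit way count must lie in [0, W]")
    return total_ways - max_hit_way_count
```

For example, in an 8-way Cache where CPU hits during the target time period never went deeper than way 3, I/O data would be granted 8 - 3 = 5 ways.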
In this embodiment, after the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache memory (Cache) until it is read has been calculated, the number of Cache ways available to I/O data can be updated to the difference between the total number of Cache ways and the maximum hit way count, so that the number of ways available to I/O data is increased as far as possible without significantly affecting other CPU threads, improving I/O memory-access performance without degrading CPU performance.
Step S103: carry out I/O memory-access processing according to the number of Cache ways available to I/O data.
In this embodiment of the present invention, the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to an I/O memory-access write request is written into the cache memory (Cache) until it is read is calculated; according to this maximum hit way count, the number of Cache ways available to I/O data is updated so that it equals the difference between the total number of Cache ways and the maximum hit way count; and I/O memory-access processing is then carried out according to the number of Cache ways available to I/O data. The number of ways available to I/O data in the Cache is thus updated dynamically, in real time, according to the CPU's use of the Cache, so the space occupied by I/O data in the Cache can be adjusted dynamically: the number of ways available to I/O data is increased without significantly affecting other CPU threads. I/O memory-access performance is therefore improved without degrading CPU performance, further improving the overall performance of the processor and the space utilization of the Cache.
Embodiment 2
Fig. 2 is a flowchart of the method for I/O device memory access provided by Embodiment 2 of the present invention. Building on Embodiment 1, in this embodiment sample sets are preset and one monitor register is provided for each sample set; only the memory-access behavior of the sample sets is tracked and recorded in order to calculate the maximum number of ways hit by CPU access requests during the period from when the I/O data corresponding to an I/O memory-access write request is written into the cache memory (Cache) until it is read. As shown in Fig. 2, step S101 above may specifically be implemented by the following steps:
Step S201: receive an access request, the access request being an I/O memory-access write request or a CPU memory-access request.
The access request carries the target address to be accessed; the target address includes at least the index of a Cache set and the tag of a Cache line.
In practical applications, the access request received by the processor may be any of an I/O memory-access write request, an I/O memory-access read request, a CPU memory-access write request, or a CPU memory-access read request. In this embodiment, the subsequent steps are performed for the cases in which the received access request is an I/O memory-access write request, a CPU memory-access write request, or a CPU memory-access read request. I/O data exhibits a clear producer-consumer pattern: either the CPU writes and the I/O device reads, or the I/O device writes and the CPU reads.
Step S202: according to the index carried by the access request, determine whether the target Cache set accessed by the access request belongs to a sample set.
In this embodiment, an auxiliary tag directory may be added, which records the indices of the sample sets and the tags of the Cache lines in the sample sets; in this embodiment the auxiliary tag directory is used to measure how intensively other CPU threads use the Cache during the target time period from when a piece of I/O data is written into the Cache until it is read. The auxiliary tag directory has the same structure as the tag directory of an existing Cache and also uses the LRU replacement policy; the difference is that the auxiliary tag directory tracks only the hit behavior of CPU access requests in the sample sets of the Cache and simply ignores I/O access requests, so that it can be used to simulate the behavior of CPU programs in the Cache.
The sample sets in this embodiment are obtained by sampling from all Cache sets; they comprise multiple Cache sets of the Cache and may be chosen at random. Preferably, for every way of the Cache there is at least one Cache set in the sample sets belonging to that way; that is, the sample sets cover every way of the Cache, so that the memory-access behavior of the sample sets more closely approximates the overall memory-access behavior of the Cache. In addition, the more Cache sets the sample sets contain, the closer their memory-access behavior is to the overall behavior of the Cache.
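The random selection of sample sets can be sketched as follows (a hypothetical helper under assumed names; the patent does not prescribe a particular selection method):

```python
import random

def choose_sample_sets(num_sets, num_samples, seed=None):
    """Randomly sample distinct Cache-set indices to act as sample sets.

    More sample sets track the Cache's overall memory-access behavior
    more closely, at the cost of more monitor registers and more
    auxiliary-tag-directory storage.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_sets), num_samples))
```

A W-way Cache with 256 sets might, for instance, sample W of them; the chosen indices would then be the ones the auxiliary tag directory and monitor registers observe.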
The Cache in this embodiment uses a W-way set-associative organization, where W is 2 to the power n and n is a positive integer. Optionally, the number of sample sets may be W; neither the number of sample sets nor the way they are selected is specifically limited in this embodiment.
In this step, determining, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sample set may specifically be implemented as follows:
According to the index in the target address carried by the access request, determine whether the access request accesses an index entry in the auxiliary tag directory. If it is determined that the access request accesses an index entry in the auxiliary tag directory, it is determined that the target Cache set accessed by the access request belongs to a sample set; if not, it is determined that the target Cache set accessed by the access request does not belong to a sample set.
Alternatively, in this embodiment the indices of the sample sets may be recorded, and whether the index carried by the access request is among the recorded sample-set indices determines whether the target Cache set accessed by the access request belongs to a sample set.
If the target Cache set accessed by the access request belongs to a sample set, step S203 is executed to determine whether the access request is an I/O access request.
If the target Cache set accessed by the access request does not belong to a sample set, the procedure ends and no tracking is performed for this access request.
Step S203: determine whether the access request is an I/O access request.
If the access request is an I/O memory-access write request, steps S204-S205 are executed to continue tracking this access request.
If the access request is an I/O memory-access read request, the procedure ends and no tracking is performed for this access request.
If the access request is not an I/O access request, it is a CPU memory-access write request or a CPU memory-access read request, and step S206 is executed to track this CPU access request.
Step S204: if the access request is an I/O memory-access write request, determine whether the monitor register corresponding to the target Cache set is in the unused state.
In this embodiment, the monitor register can record at least the following: whether it is in use, whether monitoring has finished, a tag, and a hit way count. Initially the monitor register is set to the unused state, marked finished, with tag 0 and hit way count 0. For example, the monitor register may include the following fields: used, valid, tag, and LHW. Here used indicates whether the monitor register is in use, that is, whether some I/O data is being monitored; used can be a single flag bit, with "1" meaning in use and "0" meaning unused. valid indicates whether the monitoring process corresponding to this register has finished; the finishing condition is that the I/O data corresponding to the tag recorded in the monitor register has been accessed, and valid can likewise be a flag bit. The tag field records the tag of the I/O data to be monitored, and LHW records the hit way count. Initially, all fields of the monitor register are set to 0.
Optionally, one monitor register may be added for each sample set, or an existing register may be reused to implement the monitor register's function.
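The monitor-register state described above can be sketched as a small record (a hypothetical software model; the patent describes hardware fields named used, valid, tag, and LHW, and the method names below are assumed):

```python
from dataclasses import dataclass

@dataclass
class MonitorRegister:
    """Software model of one sample set's monitor register."""
    used: int = 0   # 1 = an I/O line is currently being monitored
    valid: int = 0  # 1 = the current monitoring round has not finished
    tag: int = 0    # tag of the I/O data being monitored
    lhw: int = 0    # deepest LRU-stack position hit by CPU accesses so far

    def start(self, tag):
        # Begin monitoring when an I/O write lands in this sample set (S205).
        self.used, self.valid, self.tag, self.lhw = 1, 1, tag, 0

    def record_hit(self, way_position):
        # Keep the maximum hit way position seen so far (S208-S209).
        if way_position > self.lhw:
            self.lhw = way_position

    def finish(self):
        # The monitored I/O data was read: report N and reset (S211).
        n = self.lhw
        self.used = self.valid = self.tag = self.lhw = 0
        return n
```

A round thus runs start, any number of record_hit calls, then finish, whose return value is the maximum hit way count N used in step S102.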
If the monitor register corresponding to the target Cache set is in the unused state, step S205 is executed.
If the monitor register corresponding to the target Cache set is in use, the procedure ends and tracking of this write request does not continue.
Step S205: record the tag carried by the I/O memory-access write request in the monitor register corresponding to the target Cache set, and mark that monitor register as in use.
If the monitor register corresponding to the target Cache set is in the unused state, the memory-access behavior of this target Cache set is not yet being traced. In this step, the tag carried by the I/O memory-access write request is recorded in the monitor register corresponding to the target Cache set, thereby recording that the I/O data of this write request has been written into the Cache, and the monitor register is marked as in use, so that from this point until the I/O data of the write request is read, the monitor register records the hit way counts of CPU access requests in the Cache, thereby tracking the CPU access requests' use of the Cache.
Step S206: the access request is a CPU access request; determine whether the monitor register corresponding to the target Cache set is in use.
If the monitor register corresponding to the target Cache set is in use, steps S207-S211 are executed to perform subsequent tracking of the CPU access request.
If the monitor register corresponding to the target Cache set is in the unused state, the memory-access behavior of this target Cache set is not yet being traced; the procedure ends, and no subsequent tracking is performed for this CPU access request.
Step S207: judge whether the CPU access request hits a Cache line in the target Cache set.
If the CPU access request hits a Cache line in the target Cache set, steps S208-S209 are executed to update the hit way count recorded by the monitor register.
If the CPU access request misses in the target Cache set, steps S210-S211 are executed.
In this embodiment, judging whether the CPU access request hits a Cache line in the target Cache set may specifically be implemented as follows:
According to the tag in the target address carried by the access request, judge whether the CPU access request accesses a tag entry in the auxiliary tag directory. If it does, it is determined that the CPU access request hits a Cache line in the target Cache set; if not, it is determined that the CPU access request misses in the target Cache set.
Alternatively, in this embodiment the indices and tags of the sample sets may be recorded, and whether the index and tag carried by the access request match a recorded sample-set tag determines whether the CPU access request hits a Cache line in the target Cache set.
Step S208: compare the way position of the currently hit Cache line with the hit way count recorded in the monitor register corresponding to the target Cache set.
If the way position of the currently hit Cache line is less than or equal to the hit way count recorded in the monitor register corresponding to the target Cache set, there is no need to update the recorded hit way count.
Step S209: if the way position of the currently hit Cache line is greater than the hit way count recorded in the monitor register corresponding to the target Cache set, update the recorded hit way count to the way position of the currently hit Cache line.
Step S210: judge whether the CPU access request hits the Cache line corresponding to the tag recorded in the monitor register of the target Cache set.
I/O data exhibits a clear producer-consumer pattern: either the CPU writes and the I/O device reads, or the I/O device writes and the CPU reads. If the CPU access request hits the Cache line corresponding to the tag recorded in the monitor register of the target Cache set, the I/O data in that Cache line has been accessed; that is, this piece of I/O data has been read by a CPU access request. Step S211 is then executed, taking the hit way count recorded in the monitor register of the target Cache set as the maximum hit way count.
If the CPU access request does not hit the Cache line corresponding to the tag recorded in the monitor register of the target Cache set, tracking of this CPU access request ends.
If the CPU access request hits the Cache line corresponding to the tag recorded in the monitor register of the target Cache set, step S211 is executed.
Step S211: take the hit way count recorded in the monitor register corresponding to the target Cache set as the maximum hit way count.
Optionally, after the hit way number recorded in the monitoring register of the target Cache group is taken as the maximum hit way number, the method further includes:
setting the hit way number in the monitoring register of the target Cache group to 0, and marking the monitoring register of the target Cache group as unused.
In addition, after the hit way number recorded in the monitoring register of the target Cache group is taken as the maximum hit way number, all fields of the monitoring register may be cleared to 0, so that the monitoring register can be used for the next round of tracking.
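The monitoring-register tracking described above (recording the tag of an I/O write, updating the hit way number on CPU hits, and finishing when the tracked I/O data themselves are read) can be sketched as a software model. This is only an illustrative sketch, not the hardware of the embodiments; the `MonitorRegister` class and its field names are assumptions.

```python
class MonitorRegister:
    """Software model of one sample set's monitoring register (hypothetical)."""

    def __init__(self):
        self.in_use = False
        self.tag = None        # tag of the I/O data being tracked
        self.hit_ways = 0      # maximum hit way number observed so far

    def on_io_write(self, tag):
        # Start tracking when an I/O write request lands in the sample set
        # and the register is unused; record the request's tag.
        if not self.in_use:
            self.in_use = True
            self.tag = tag
            self.hit_ways = 0

    def on_cpu_access(self, hit, way, tag):
        # Update the recorded hit way number on CPU hits (step S209) and,
        # when the tracked tag itself is read (step S210), return the
        # maximum hit way number (step S211) and reset for the next round.
        if not self.in_use:
            return None
        if hit and way > self.hit_ways:
            self.hit_ways = way
        if hit and tag == self.tag:
            max_hit_ways = self.hit_ways
            self.in_use = False
            self.tag = None
            self.hit_ways = 0
            return max_hit_ways
        return None
```

For example, a CPU hit on way 3 followed by a read of the tracked tag on way 1 yields a maximum hit way number of 3, after which the register is free for the next round.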
In this embodiment of the present invention, the processing flow shown in Fig. 2 is described in detail by taking the tracking of a single access request as an example. In practice, the processing flow shown in Fig. 2 is performed cyclically for multiple access requests, in parallel with the processor's normal memory access processing. Without affecting normal memory access processing, the maximum hit way number of CPU access requests during the period from when the I/O data corresponding to an I/O memory access write request are written into the Cache until they are read is thus computed dynamically in real time.
In this embodiment of the present invention, sample sets are preset and one monitoring register is provided for each sample set, so that only the access behavior of the sample sets is tracked and recorded. This sampling approach saves hardware overhead while still computing in real time the maximum hit way number of CPU access requests during the period from when the I/O data corresponding to an I/O memory access write request are written into the cache memory (Cache) until they are read. On this basis, the space occupied by I/O data in the Cache can be further adjusted dynamically, increasing the number of ways available to I/O data without significantly affecting other CPU threads. I/O memory access performance is therefore improved without degrading CPU performance, which in turn improves the overall performance of the processor and the space utilization of the Cache.
Embodiment three
Fig. 3 is a flowchart of the method for an I/O device to access memory provided by embodiment three of the present invention. On the basis of embodiment one or embodiment two, in this embodiment, after an I/O memory access write request is received: if the request hits the Cache, the I/O data corresponding to the request are written directly into the Cache; if the request misses the Cache, the number of ways already occupied by I/O data in the target Cache group accessed by the request is computed, and whether to process the request in the direct cache access (DCA) mode is determined according to whether that number is less than the number of ways available to I/O data in the Cache. As shown in Fig. 3, the specific steps of the method are as follows:
Step S301: receive an I/O memory access write request.
Step S302: judge whether the I/O memory access write request hits the Cache.
If the I/O memory access write request hits the Cache, execute step S303 and write the I/O data corresponding to the request directly into the Cache.
If the I/O memory access write request misses the Cache, execute steps S304-S305.
Step S303: if the I/O memory access write request hits the Cache, write the I/O data corresponding to the request directly into the Cache.
Step S304: if the I/O memory access write request misses the Cache, compute the number of ways already occupied by I/O data in the target Cache group accessed by the request.
In this embodiment, in order to distinguish I/O data from CPU data, two flag bits are added to the original tag register in the Cache, so that the improved tag register contains at least the following fields: type, accessed, and tag. Specifically, a flag bit in the tag register indicates the type of the stored data, distinguishing CPU data from I/O data: for example, flag = 1 indicates I/O data and flag = 0 indicates CPU data. A used bit indicates whether the stored data have been accessed: for example, used = 1 indicates the data have been accessed and used = 0 indicates they have not. The tag field records the tag corresponding to the stored data.
In this step, the number of ways already occupied by I/O data in the target Cache group accessed by the I/O memory access write request can be obtained by counting the Cache lines in the target Cache group whose corresponding tag registers have flag = 1.
Step S305: determine whether to process the I/O memory access write request in the direct cache access (DCA) mode according to whether the number of ways already occupied by I/O data in the target Cache group accessed by the request is less than the number of ways available to I/O data in the Cache.
In this embodiment, determining whether to process the I/O memory access write request in the DCA mode according to whether the number of ways already occupied by I/O data in the target Cache group is less than the number of ways available to I/O data in the Cache specifically includes the following cases:
(1) If the number of ways already occupied by I/O data in the target Cache group accessed by the I/O memory access write request is less than the number of ways available to I/O data in the Cache, process the request in the DCA mode.
(2) If the to-be-replaced position stores I/O data that have already been accessed, process the I/O memory access write request in the DCA mode. The to-be-replaced position refers to the position of the next Cache line that will be replaced according to the replacement policy in use.
Here, the to-be-replaced position storing I/O data that have been accessed means: the data stored at the to-be-replaced position are I/O data, and those I/O data have already been accessed.
Specifically, according to the tag register corresponding to the to-be-replaced position: if both the flag bit and the used bit in that tag register are 1, it can be determined that the to-be-replaced position stores I/O data that have been accessed.
(3) If the number of ways already occupied by I/O data in the target Cache group accessed by the I/O memory access write request is greater than or equal to the number of ways available to I/O data in the Cache, and the to-be-replaced position does not store I/O data that have been accessed, process the I/O memory access write request in the DMA mode.
If the current to-be-replaced position does not store I/O data, or the I/O data stored at the current to-be-replaced position have not been accessed, it can be determined that the to-be-replaced position does not store I/O data that have been accessed.
Specifically, according to the tag register corresponding to the to-be-replaced position: if the flag bit in that tag register is 0, it can be determined that the current to-be-replaced position does not store I/O data; if the used bit in that tag register is 0, it can be determined that the I/O data at the current to-be-replaced position have not been accessed. As long as either the flag bit or the used bit is determined to be 0, the to-be-replaced position does not store I/O data that have been accessed.
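The three cases of step S305 can be sketched as a single decision function. This is a hypothetical software model under stated assumptions: a target Cache group is modeled as a list of (flag, used) bit pairs, and `choose_access_mode` and `victim_index` are names invented for illustration.

```python
def choose_access_mode(lines, available_io_ways, victim_index):
    """Decide DCA vs. DMA for an I/O write that missed the Cache (a sketch).

    lines: target Cache group as a list of (flag, used) pairs per way,
           flag = 1 for I/O data, used = 1 if the data have been accessed.
    victim_index: the to-be-replaced position under the current policy.
    """
    # Step S304: count ways already occupied by I/O data (flag == 1).
    occupied = sum(1 for flag, used in lines if flag == 1)
    # Case (1): still room within the I/O quota, so use DCA.
    if occupied < available_io_ways:
        return "DCA"
    # Case (2): the victim holds I/O data that were already read, so use DCA.
    victim_flag, victim_used = lines[victim_index]
    if victim_flag == 1 and victim_used == 1:
        return "DCA"
    # Case (3): otherwise fall back to DMA.
    return "DMA"
```

For example, with a 4-way group in which all ways hold I/O data and a quota of 2 ways, the request still goes through DCA if the victim line's data were already consumed, and through DMA otherwise.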
In this embodiment of the present invention, after an I/O memory access write request is received, the I/O data corresponding to the request are written directly into the Cache when the request hits the Cache; when the request misses the Cache, whether to process it in the direct cache access (DCA) mode is determined according to whether the number of ways already occupied by I/O data in the target Cache group accessed by the request is less than the number of ways available to I/O data in the Cache. Processor performance is thereby improved.
Embodiment four
On the basis of embodiment three above, in this embodiment, while the I/O memory access write request is being processed in the DCA mode, whether the Cache replacement policy uses the LRU replacement policy or another replacement policy is determined from statistics on the number of Cache lines that are replaced without ever being accessed a second time and the number of Cache lines that are accessed a second time before being replaced.
Here, the second replacement policy is the most recently used (Most Recently Used, MRU for short) replacement policy.
Preferably, since the destination addresses of I/O write requests using the DCA mode are contiguous and the I/O data are read and used by the CPU only once, the second replacement policy is: after the I/O data in a Cache line have been accessed once, that Cache line is taken as the preferred replacement position; when target data are subsequently written into the Cache, they are preferentially written into the preferred replacement position.
In this embodiment, since the behavior of the I/O data is initially unknown, the LRU replacement policy is initially used as the Cache replacement policy. A first difference between the number of Cache lines replaced without ever being accessed a second time and the number of Cache lines accessed a second time before being replaced is counted. When the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines accessed a second time before replacement, the Cache replacement policy is updated to the second replacement policy.
Optionally, an N-bit saturating counter may be provided, with an initial value of all zeros. When the Cache line holding an I/O data item is replaced, if that I/O data item was never accessed a second time before the replacement, the saturating counter is incremented by 1; when the Cache line holding an I/O data item is accessed a second time, the saturating counter is decremented by 1. When the saturating counter becomes all ones, the Cache replacement policy for I/O data is changed to the second replacement policy.
Here, when the saturating counter reaches all ones, its count equals the first preset threshold. The first preset threshold may be set by a technician according to actual needs, and this embodiment does not specifically limit it.
Further, the improved tag register of embodiment three may also be used, with the used field extended to 2 bits: the low bit indicates whether the line has been accessed, and the high bit indicates whether a second access has occurred, being set to 1 when the second access occurs. During counting, when the Cache line holding an I/O data item (the flag bit of the corresponding tag register is 1) is replaced, if the high bit of the used field is 0, the I/O data in that Cache line were never accessed a second time before the replacement, and the saturating counter is incremented by 1. When the Cache line holding an I/O data item (the flag bit of the corresponding tag register is 1) is accessed, if the high bit of the used field is 0 and the low bit is 1, the I/O data in that Cache line have already been accessed once; the used field is then set to 11, indicating that the I/O data in the Cache line have now been accessed twice, and the value of the saturating counter is decremented by 1.
Optionally, the number of Cache lines replaced without ever being accessed a second time (denoted the first quantity) and the number of Cache lines accessed a second time before being replaced (denoted the second quantity) may also be counted separately, with the first difference computed in real time as the difference between the first quantity and the second quantity and compared against the first preset threshold. When the first difference is equal to or greater than the first preset threshold and the first quantity is greater than the second quantity, the Cache replacement policy is updated to the second replacement policy.
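The N-bit saturating counter described above can be sketched as follows. This is only a minimal software model under stated assumptions: the `PolicySwitcher` class and its method names are invented for illustration, and only the LRU-to-second-policy direction is modeled (the switch back via the auxiliary tag directory is described below).

```python
class PolicySwitcher:
    """N-bit saturating counter driving the LRU-to-second-policy switch (a sketch)."""

    def __init__(self, bits):
        self.max_value = (1 << bits) - 1  # "all ones": the first preset threshold
        self.count = 0                    # initial value: all zeros
        self.policy = "LRU"               # initial Cache replacement policy

    def on_replaced_without_second_access(self):
        # An I/O line was evicted without ever being accessed a second time.
        if self.count < self.max_value:
            self.count += 1
        if self.count == self.max_value:
            # Counter saturated at all ones: switch to the second
            # (MRU-style preferred-replacement) policy.
            self.policy = "SECOND"

    def on_second_access(self):
        # An I/O line was accessed a second time before eviction.
        if self.count > 0:
            self.count -= 1
```

With a 2-bit counter, three evictions without a second access (and no offsetting second accesses) saturate the counter at 3 and trigger the policy switch.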
In this embodiment, to decide when to switch back from the second replacement policy to the LRU replacement policy, an auxiliary tag directory needs to be preset. The Cache replacement policy corresponding to the auxiliary tag directory always uses the LRU replacement policy, and its other fields are defined exactly as the tag fields in the Cache.
After the Cache replacement policy is updated to the second replacement policy, the method further includes:
after the Cache replacement policy is updated to the second replacement policy, tracking the auxiliary tag directory, and during tracking counting a second difference between the number of Cache line tags replaced without ever being accessed a second time and the number of Cache line tags accessed a second time before being replaced; when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is less than the number of Cache line tags accessed a second time before replacement, updating the Cache replacement policy back to the LRU replacement policy.
Here, the manner of counting, during tracking, the second difference between the number of Cache line tags replaced without ever being accessed a second time and the number of Cache line tags accessed a second time before being replaced may be the same as the aforementioned counting manner for the first difference, and is not repeated in this embodiment.
For example, the auxiliary tag directory may also be tracked with a saturating counter. When the Cache replacement policy is updated to the second replacement policy, the value of the saturating counter is all ones; the increment and decrement conditions of the saturating counter are the same as in the counting process for the first difference. When the value of the saturating counter becomes all zeros, the Cache replacement policy is changed back to the LRU replacement policy.
In order to reduce hardware overhead, sampling is likewise used: the auxiliary tag directory records the indexes of its sample sets and the tags of the Cache lines in the sample sets, and configuring only a limited number of sample sets can already achieve high accuracy.
In addition, a memory address is generally organized as a three-dimensional structure of bank address, row address and column address. Owing to the physical characteristics of memory, consecutive accesses to the same row address are the most efficient and greatly reduce memory row conflicts; externally, this manifests as higher read/write bandwidth and lower latency for contiguous addresses. When high-speed I/O accesses memory in the DMA mode, the continuity of the I/O data makes full use of this advantage; when the DCA mode is used, however, how to exploit this advantage requires analysis and optimization.
Write policies generally fall into two kinds, write-through and write-back. The write-through mode writes memory at the same time as writing the Cache, while the write-back mode writes memory only when the Cache line is replaced. To reduce frequent memory reads and writes, modern microprocessors mostly adopt the write-back mode, which suits CPU data, whose spatial locality is strong and which are used repeatedly. But if the same mode is used for I/O data, the benefit brought by the DCA mode may be greatly reduced.
Since the addresses of I/O write requests using the DCA mode are contiguous while the I/O data are read by the CPU only once, and because of the filtering effect of the Cache, the moments at which I/O data with contiguous addresses in different groups are replaced are not necessarily contiguous. This splits what was originally a write of contiguous addresses into writes of many different addresses, and these write requests are written back to memory at different times, causing severe memory row conflicts. If, instead, I/O write requests using the DCA mode adopt the write-through mode and write memory at the same time as writing the Cache, then, since no further write operation will occur on those data, they can be deleted from the Cache directly upon replacement without being written back to memory, greatly improving memory access efficiency.
In this embodiment, optionally, when memory access is processed in the DCA mode: when the LRU replacement policy is in use, the write-back mode is used to process the I/O memory access write request; when the second replacement policy is in use, the write-through mode is used to process the I/O memory access write request.
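The pairing of replacement policy and write mode described above can be sketched as follows. This is a hypothetical model under stated assumptions: the `handle_dca_write` function, the policy label strings, and the `dirty` field are invented for illustration.

```python
def handle_dca_write(line, replacement_policy):
    """Sketch of how an I/O DCA write is committed under each policy pairing.

    Returns True if memory is written immediately (write-through),
    False if the write is deferred to eviction time (write-back).
    """
    if replacement_policy == "LRU":
        # Write-back mode: the line stays dirty and memory is written
        # only when the Cache line is replaced.
        line["dirty"] = True
        return False
    # Second (MRU-style) policy + write-through mode: memory is written
    # alongside the Cache, so the line is clean and can simply be dropped
    # on replacement without a write-back.
    line["dirty"] = False
    return True
```

The design point is that under the second policy the I/O line is read only once and then replaced, so keeping it clean avoids scattered write-backs and the resulting memory row conflicts.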
In this embodiment of the present invention, while an I/O memory access write request is processed in the DCA mode, whether the Cache replacement policy uses the LRU replacement policy or another replacement policy is determined from statistics on the number of Cache lines replaced without ever being accessed a second time and the number of Cache lines accessed a second time before being replaced. This realizes dynamic switching of the Cache replacement policy, and pairs each Cache replacement policy with a different write-to-memory mode, greatly improving memory access efficiency and thereby improving the overall performance of the processor.
Embodiment five
Fig. 4 is a schematic structural diagram of the apparatus for an I/O device to access memory provided by embodiment five of the present invention. The apparatus for an I/O device to access memory provided by this embodiment can execute the processing flow provided by the method embodiments of I/O device memory access. As shown in Fig. 4, the apparatus 40 includes: a first computing module 401, an updating module 402 and a memory access processing module 403.
Specifically, the first computing module 401 is configured to receive an I/O memory access write request and compute the maximum hit way number of CPU access requests during the period from when the I/O data corresponding to the request are written into the cache memory (Cache) until they are read.
The updating module 402 is configured to update the number of ways available to I/O data in the Cache according to the maximum hit way number, so that after the update the number of ways available to I/O data equals the total number of ways of the Cache minus the maximum hit way number.
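The updating module's rule reduces to a one-line formula. The function name below is an assumption for illustration only; it is a sketch of the rule stated above, not the apparatus itself.

```python
def update_available_io_ways(total_ways, max_hit_ways):
    """After the update, the ways available to I/O data equal the Cache's
    total number of ways minus the observed maximum hit way number."""
    return total_ways - max_hit_ways
```

For example, in an 8-way Cache where CPU requests hit at most way 3 during the tracked period, 5 ways remain available to I/O data.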
The memory access processing module 403 is configured to perform I/O memory access processing according to the number of ways available to I/O data in the Cache.
The apparatus provided by this embodiment of the present invention can be specifically used to execute the method embodiment provided by embodiment one above; its specific functions are not repeated here.
In this embodiment of the present invention, the maximum hit way number of CPU access requests during the period from when the I/O data corresponding to an I/O memory access write request are written into the cache memory (Cache) until they are read is computed; the number of ways available to I/O data in the Cache is updated according to the maximum hit way number, so that after the update it equals the total number of ways of the Cache minus the maximum hit way number; and I/O memory access processing is then performed according to the number of ways available to I/O data in the Cache. The number of ways available to I/O data in the Cache is thus updated dynamically in real time according to the CPU's usage of the Cache, so that the space occupied by I/O data in the Cache can be adjusted dynamically, increasing the number of ways available to I/O data without significantly affecting other CPU threads. I/O memory access performance is therefore improved without degrading CPU performance, which in turn improves the overall performance of the processor and the space utilization of the Cache.
Embodiment six
On the basis of embodiment five above, in this embodiment, the first computing module includes: a receiving submodule, a first determining submodule, a second determining submodule, a first recording submodule, a third determining submodule, a first judging submodule, a comparing submodule, a second recording submodule and a fourth determining submodule.
Here, the receiving submodule is configured to receive an access request, where the access request carries the destination address being accessed, and the destination address includes at least the index of the Cache group and the tag of the Cache line.
The first determining submodule is configured to determine, according to the index carried by the access request, whether the target Cache group accessed by the access request belongs to the sample sets.
The second determining submodule is configured to, if the target Cache group accessed by the access request belongs to the sample sets and the access request is an I/O memory access write request, determine whether the monitoring register corresponding to the target Cache group is in the unused state.
The first recording submodule is configured to, if the monitoring register corresponding to the target Cache group is in the unused state, record the tag carried by the I/O memory access write request in the monitoring register corresponding to the target Cache group, and mark that monitoring register as in use.
The third determining submodule is configured to, if the target Cache group accessed by the access request belongs to the sample sets and the access request is a CPU access request, determine whether the monitoring register corresponding to the target Cache group is in the in-use state.
The first judging submodule is configured to, if the monitoring register corresponding to the target Cache group is in the in-use state, judge whether the CPU access request hits a Cache line in the target Cache group.
The comparing submodule is configured to, if the judging result is that the CPU access request hits a Cache line in the target Cache group, compare the way number of the currently hit Cache line with the hit way number recorded in the monitoring register corresponding to the target Cache group.
The second recording submodule is configured to, if the way number of the currently hit Cache line is greater than the hit way number recorded in the monitoring register corresponding to the target Cache group, update the hit way number recorded in that monitoring register to the way number of the currently hit Cache line.
The fourth determining submodule is configured to, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register of the target Cache group, determine the hit way number recorded in that monitoring register as the maximum hit way number.
Optionally, the first computing module further includes a resetting submodule.
The resetting submodule is configured to set the hit way number in the monitoring register corresponding to the target Cache group to 0 and mark that monitoring register as unused.
The apparatus provided by this embodiment of the present invention can be specifically used to execute the method embodiment provided by embodiment two above; its specific functions are not repeated here.
In this embodiment of the present invention, sample sets are preset and one monitoring register is provided for each sample set, so that only the access behavior of the sample sets is tracked and recorded. This sampling approach saves hardware overhead while still computing in real time the maximum hit way number of CPU access requests during the period from when the I/O data corresponding to an I/O memory access write request are written into the cache memory (Cache) until they are read. On this basis, the space occupied by I/O data in the Cache can be further adjusted dynamically, increasing the number of ways available to I/O data without significantly affecting other CPU threads. I/O memory access performance is therefore improved without degrading CPU performance, which in turn improves the overall performance of the processor and the space utilization of the Cache.
Embodiment seven
On the basis of embodiment five or embodiment six above, in this embodiment, the apparatus for an I/O device to access memory further includes: a judging module, a writing module, a second computing module and a determining module.
Here, the judging module is configured to judge whether the I/O memory access write request hits the Cache.
The writing module is configured to, if the I/O memory access write request hits the Cache, write the I/O data corresponding to the request directly into the Cache.
The second computing module is configured to, if the I/O memory access write request misses the Cache, compute the number of ways already occupied by I/O data in the target Cache group accessed by the request.
The determining module is configured to determine whether to process the I/O memory access write request in the direct cache access (DCA) mode according to whether the number of ways already occupied by I/O data in the target Cache group accessed by the request is less than the number of ways available to I/O data in the Cache.
Optionally, the determining module includes: a first processing submodule and a second processing submodule.
Here, the first processing submodule is configured to, if the number of ways already occupied by I/O data in the target Cache group accessed by the I/O memory access write request is less than the number of ways available to I/O data in the Cache, process the request in the DCA mode.
The first processing submodule is further configured to, if the to-be-replaced position stores I/O data that have been accessed, process the I/O memory access write request in the DCA mode, where the to-be-replaced position refers to the position of the next Cache line that will be replaced according to the replacement policy in use.
The second processing submodule is configured to, if the number of ways already occupied by I/O data in the target Cache group accessed by the I/O memory access write request is greater than or equal to the number of ways available to I/O data in the Cache, and the to-be-replaced position does not store I/O data that have been accessed, process the I/O memory access write request in the DMA mode.
The apparatus provided by this embodiment of the present invention can be specifically used to execute the method embodiment provided by embodiment three above; its specific functions are not repeated here.
In this embodiment of the present invention, after an I/O memory access write request is received, the I/O data corresponding to the request are written directly into the Cache when the request hits the Cache; when the request misses the Cache, whether to process it in the direct cache access (DCA) mode is determined according to whether the number of ways already occupied by I/O data in the target Cache group accessed by the request is less than the number of ways available to I/O data in the Cache. Processor performance is thereby improved.
Embodiment eight
On the basis of embodiment seven above, in this embodiment, the first processing submodule is further configured to: initially use the LRU replacement policy as the Cache replacement policy; count a first difference between the number of Cache lines replaced without ever being accessed a second time and the number of Cache lines accessed a second time before being replaced; and when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines accessed a second time before replacement, update the Cache replacement policy to a second replacement policy. Here, the second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line have been accessed once, taking that Cache line as the preferred replacement position, and subsequently, when target data are written into the Cache, preferentially writing them into the preferred replacement position.
The first processing submodule is further configured to: after the Cache replacement policy is updated to the second replacement policy, obtain a preset auxiliary tag directory, where the auxiliary tag directory records the indexes of the sample sets and the tags of the Cache lines in the sample sets, and the Cache replacement policy corresponding to the auxiliary tag directory always uses the LRU replacement policy; track the auxiliary tag directory, and during tracking count a second difference between the number of Cache line tags replaced without ever being accessed a second time and the number of Cache line tags accessed a second time before being replaced; and when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is less than the number of Cache line tags accessed a second time before replacement, update the Cache replacement policy back to the LRU replacement policy.
Optionally, the first processing submodule is further configured to: when the LRU replacement policy is in use, process the I/O memory access write request in the write-back mode; when the second replacement policy is in use, process the I/O memory access write request in the write-through mode.
The apparatus provided by this embodiment of the present invention can be specifically used to execute the method embodiment provided by embodiment four above; its specific functions are not repeated here.
In this embodiment of the present invention, while an I/O memory access write request is processed in the DCA mode, whether the Cache replacement policy uses the LRU replacement policy or another replacement policy is determined from statistics on the number of Cache lines replaced without ever being accessed a second time and the number of Cache lines accessed a second time before being replaced. This realizes dynamic switching of the Cache replacement policy, and pairs each Cache replacement policy with a different write-to-memory mode, greatly improving memory access efficiency and thereby improving the overall performance of the processor.
Embodiment Nine
Fig. 5 is a schematic structural diagram of the computer device provided in Embodiment Nine of the present invention. As shown in Fig. 5, the device 50 includes a processor 501, a memory 502, and a computer program stored on the memory 502 and executable by the processor 501.
When executing the computer program stored on the memory 502, the processor 501 implements the method for an I/O device to access memory provided in any of the foregoing method embodiments.
In this embodiment of the present invention, the maximum number of CPU access-request hits during the period from when the I/O data corresponding to an I/O memory-access write request is written into the cache (Cache) until that data is read is calculated, and the number of Cache lines available to I/O data in the Cache is updated according to this maximum hit count, so that after the update the available number equals the difference between the total number of Cache lines and the maximum hit count; I/O memory-access processing is then performed according to the available number. The number of Cache lines available to I/O data is thus updated dynamically, in real time, according to the CPU's actual use of the Cache, so the space occupied by I/O data in the Cache can be adjusted on the fly; the available number is increased only on the premise that other CPU threads are not significantly affected. I/O memory-access performance is therefore improved without degrading CPU performance, which further improves the overall performance of the processor and the space utilization of the Cache.
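As a rough illustration of this bookkeeping (a minimal sketch: the function names and the clamp to zero are my assumptions, not stated by the embodiment):

```python
# Hypothetical sketch: the I/O quota per Cache set is refreshed from the
# largest CPU hit count observed while an I/O line sat in the cache, and
# DCA admission is gated on that quota.

def update_io_quota(total_lines, max_hit_count):
    """Lines left over for I/O data after reserving the hot CPU lines."""
    quota = total_lines - max_hit_count
    return max(quota, 0)   # assumed clamp; the embodiment does not state one

def use_dca(io_lines_occupied, io_quota):
    """Admit an I/O write via DCA only while under the quota."""
    return io_lines_occupied < io_quota
```

For instance, in an 8-line set where CPU requests hit a monitored interval at most 3 times, the quota becomes 5, so a set already holding 5 I/O lines would fall back to DMA for the next I/O write.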
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the functional modules described above is used only as an example; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the specific working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, and details are not repeated here.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed here. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed in this application. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (19)

1. A method for an I/O device to access memory, characterized by comprising:
receiving an I/O memory-access write request, and calculating the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory-access write request is written into a cache (Cache) until the I/O data is read;
updating, according to the maximum hit count, the number of Cache lines available to I/O data in the Cache, so that after the update the available number equals the difference between the total number of Cache lines and the maximum hit count; and
performing I/O memory-access processing according to the number of Cache lines available to I/O data in the Cache.
2. The method according to claim 1, wherein receiving the I/O memory-access write request and calculating the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory-access write request is written into the cache (Cache) until the I/O data is read comprises:
receiving an access request, the access request carrying a target address to be accessed, the target address comprising at least an index of a Cache set and a tag of a Cache line;
determining, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sample set, the sample set comprising at least one Cache set;
if the target Cache set accessed by the access request belongs to the sample set and the access request is an I/O memory-access write request, determining whether a monitoring register corresponding to the target Cache set is in an unused state; and
if the monitoring register corresponding to the target Cache set is in the unused state, recording the tag carried by the I/O memory-access write request into the monitoring register corresponding to the target Cache set, and marking the monitoring register corresponding to the target Cache set as in use.
3. The method according to claim 2, wherein after determining, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to the sample set, the method further comprises:
if the target Cache set accessed by the access request belongs to the sample set and the access request is a CPU access request, determining whether the monitoring register corresponding to the target Cache set is in use;
if the monitoring register corresponding to the target Cache set is in use, judging whether the CPU access request hits a Cache line in the target Cache set;
if the judging result is that the CPU access request hits a Cache line in the target Cache set, comparing the hit count of the currently hit Cache line with the hit count recorded in the monitoring register corresponding to the target Cache set;
if the hit count of the currently hit Cache line is greater than the hit count recorded in the monitoring register corresponding to the target Cache set, updating the hit count recorded in the monitoring register corresponding to the target Cache set to the hit count of the currently hit Cache line; and
if the judging result is that the CPU access request misses the Cache lines in the target Cache set, then, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register corresponding to the target Cache set, taking the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit count.
4. The method according to claim 3, wherein after taking the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit count, the method further comprises:
setting the hit count in the monitoring register corresponding to the target Cache set to 0; and
marking the monitoring register corresponding to the target Cache set as unused.
5. The method according to claim 1, wherein after receiving the I/O memory-access write request, the method further comprises:
judging whether the I/O memory-access write request hits the Cache;
if the I/O memory-access write request hits the Cache, writing the I/O data corresponding to the I/O memory-access write request directly into the Cache;
if the I/O memory-access write request misses the Cache, calculating the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request; and
determining, according to whether the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is smaller than the number of Cache lines available to I/O data in the Cache, whether to process the I/O memory-access write request in direct cache access (DCA) mode.
6. The method according to claim 5, wherein determining, according to whether the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is smaller than the number of Cache lines available to I/O data in the Cache, whether to process the I/O memory-access write request in direct cache access (DCA) mode comprises:
if the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is smaller than the number of Cache lines available to I/O data in the Cache, processing the I/O memory-access write request in DCA mode;
if the data stored at the to-be-replaced position is I/O data that has been accessed, processing the I/O memory-access write request in DCA mode, the to-be-replaced position being the position of the next Cache line to be replaced according to the replacement policy in use; and
if the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is greater than or equal to the number of Cache lines available to I/O data in the Cache, or the data stored at the to-be-replaced position is not I/O data that has been accessed, processing the I/O memory-access write request in DMA mode.
7. The method according to claim 5, wherein, while the I/O memory-access write request is processed in DCA mode, the method further comprises:
initially using the LRU replacement policy as the Cache replacement policy;
counting a first difference between the number of Cache lines that are replaced without being accessed a second time and the number of Cache lines that are accessed a second time before being replaced; and
when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines accessed a second time before replacement, updating the Cache replacement policy to a second replacement policy;
wherein the second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line has been accessed once, taking that Cache line as a preferred replacement position, and, when target data is subsequently written into the Cache, preferentially writing the target data to the preferred replacement position.
8. The method according to claim 7, wherein after the Cache replacement policy is updated to the second replacement policy, the method further comprises:
obtaining a preset auxiliary tag directory, the auxiliary tag directory recording the indexes of the sample sets and the tags of the Cache lines in the sample sets, the Cache replacement policy corresponding to the auxiliary tag directory always being the LRU replacement policy;
tracking the auxiliary tag directory, and during the tracking, counting a second difference between the number of Cache line tags that are replaced without being accessed a second time and the number of Cache line tags that are accessed a second time before being replaced; and
when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is smaller than the number of Cache line tags accessed a second time before replacement, updating the Cache replacement policy to the LRU replacement policy.
9. The method according to claim 7 or 8, wherein, while the I/O memory-access write request is processed in DCA mode, the method further comprises:
when the LRU replacement policy is in use, processing the I/O memory-access write request in write-back mode; and
when the second replacement policy is in use, processing the I/O memory-access write request in write-through mode.
10. An apparatus for an I/O device to access memory, characterized by comprising:
a first computing module, configured to receive an I/O memory-access write request and calculate the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory-access write request is written into a cache (Cache) until the I/O data is read;
an update module, configured to update, according to the maximum hit count, the number of Cache lines available to I/O data in the Cache, so that after the update the available number equals the difference between the total number of Cache lines and the maximum hit count; and
a memory-access processing module, configured to perform I/O memory-access processing according to the number of Cache lines available to I/O data in the Cache.
11. The apparatus according to claim 10, wherein the first computing module comprises:
a receiving submodule, configured to receive an access request, the access request carrying a target address to be accessed, the target address comprising at least an index of a Cache set and a tag of a Cache line;
a first determining submodule, configured to determine, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sample set, the sample set comprising at least one Cache set;
a second determining submodule, configured to determine, if the target Cache set accessed by the access request belongs to the sample set and the access request is an I/O memory-access write request, whether a monitoring register corresponding to the target Cache set is in an unused state; and
a first recording submodule, configured to record, if the monitoring register corresponding to the target Cache set is in the unused state, the tag carried by the I/O memory-access write request into the monitoring register corresponding to the target Cache set, and to mark the monitoring register corresponding to the target Cache set as in use.
12. The apparatus according to claim 11, wherein the first computing module further comprises:
a third determining submodule, configured to determine, if the target Cache set accessed by the access request belongs to the sample set and the access request is a CPU access request, whether the monitoring register corresponding to the target Cache set is in use;
a first judging submodule, configured to judge, if the monitoring register corresponding to the target Cache set is in use, whether the CPU access request hits a Cache line in the target Cache set;
a comparing submodule, configured to compare, if the judging result is that the CPU access request hits a Cache line in the target Cache set, the hit count of the currently hit Cache line with the hit count recorded in the monitoring register corresponding to the target Cache set;
a second recording submodule, configured to update, if the hit count of the currently hit Cache line is greater than the hit count recorded in the monitoring register corresponding to the target Cache set, the hit count recorded in the monitoring register corresponding to the target Cache set to the hit count of the currently hit Cache line; and
a fourth determining submodule, configured to determine, if the judging result is that the CPU access request misses the Cache lines in the target Cache set, then, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register corresponding to the target Cache set, the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit count.
13. The apparatus according to claim 12, wherein the first computing module further comprises:
a reset submodule, configured to set the hit count in the monitoring register corresponding to the target Cache set to 0 and to mark the monitoring register corresponding to the target Cache set as unused.
14. The apparatus according to claim 10, further comprising:
a judging module, configured to judge whether the I/O memory-access write request hits the Cache;
a writing module, configured to write, if the I/O memory-access write request hits the Cache, the I/O data corresponding to the I/O memory-access write request directly into the Cache;
a second computing module, configured to calculate, if the I/O memory-access write request misses the Cache, the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request; and
a determining module, configured to determine, according to whether the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is smaller than the number of Cache lines available to I/O data in the Cache, whether to process the I/O memory-access write request in direct cache access (DCA) mode.
15. The apparatus according to claim 14, wherein the determining module comprises:
a first processing submodule, configured to process the I/O memory-access write request in DCA mode if the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is smaller than the number of Cache lines available to I/O data in the Cache;
the first processing submodule being further configured to process the I/O memory-access write request in DCA mode if the data stored at the to-be-replaced position is I/O data that has been accessed, the to-be-replaced position being the position of the next Cache line to be replaced according to the replacement policy in use; and
a second processing submodule, configured to process the I/O memory-access write request in DMA mode if the number of Cache lines already occupied by I/O data in the target Cache set accessed by the I/O memory-access write request is greater than or equal to the number of Cache lines available to I/O data in the Cache, or the data stored at the to-be-replaced position is not I/O data that has been accessed.
16. The apparatus according to claim 15, wherein the first processing submodule is further configured to:
initially use the LRU replacement policy as the Cache replacement policy;
count a first difference between the number of Cache lines that are replaced without being accessed a second time and the number of Cache lines that are accessed a second time before being replaced; and
when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines accessed a second time before replacement, update the Cache replacement policy to a second replacement policy;
wherein the second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line has been accessed once, taking that Cache line as a preferred replacement position, and, when target data is subsequently written into the Cache, preferentially writing the target data to the preferred replacement position.
17. The apparatus according to claim 16, wherein the first processing submodule is further configured to:
after the Cache replacement policy is updated to the second replacement policy, obtain a preset auxiliary tag directory, the auxiliary tag directory recording the indexes of the sample sets and the tags of the Cache lines in the sample sets, the Cache replacement policy corresponding to the auxiliary tag directory always being the LRU replacement policy;
track the auxiliary tag directory, and during the tracking, count a second difference between the number of Cache line tags that are replaced without being accessed a second time and the number of Cache line tags that are accessed a second time before being replaced; and
when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is smaller than the number of Cache line tags accessed a second time before replacement, update the Cache replacement policy to the LRU replacement policy.
18. The apparatus according to claim 16 or 17, wherein the first processing submodule is further configured to:
when the LRU replacement policy is in use, process the I/O memory-access write request in write-back mode; and
when the second replacement policy is in use, process the I/O memory-access write request in write-through mode.
19. A computer device, characterized by comprising: a processor; a memory;
and a computer program stored on the memory and executable by the processor;
wherein, when executing the computer program, the processor implements the method for an I/O device to access memory according to any one of claims 1 to 9.
CN201810240206.XA 2018-03-22 2018-03-22 Method, device and equipment for accessing memory by I/O equipment Active CN110297787B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810240206.XA CN110297787B (en) 2018-03-22 2018-03-22 Method, device and equipment for accessing memory by I/O equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810240206.XA CN110297787B (en) 2018-03-22 2018-03-22 Method, device and equipment for accessing memory by I/O equipment

Publications (2)

Publication Number Publication Date
CN110297787A true CN110297787A (en) 2019-10-01
CN110297787B CN110297787B (en) 2021-06-01

Family

ID=68025548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810240206.XA Active CN110297787B (en) 2018-03-22 2018-03-22 Method, device and equipment for accessing memory by I/O equipment

Country Status (1)

Country Link
CN (1) CN110297787B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026682A (en) * 2019-12-26 2020-04-17 浪潮(北京)电子信息产业有限公司 Data access method and device of board card chip and computer readable storage medium
CN112069091A (en) * 2020-08-17 2020-12-11 北京科技大学 Access optimization method and device applied to molecular dynamics simulation software
CN112181864A (en) * 2020-10-23 2021-01-05 中山大学 Address tag allocation scheduling and multi-Path cache write-back method for Path ORAM
WO2021087115A1 (en) 2019-10-31 2021-05-06 Advanced Micro Devices, Inc. Cache access measurement deskew
CN113392043A (en) * 2021-07-06 2021-09-14 南京英锐创电子科技有限公司 Cache data replacement method, device, equipment and storage medium
CN114115746A (en) * 2021-12-02 2022-03-01 北京乐讯科技有限公司 Full link tracking device of user mode storage system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000013091A1 (en) * 1998-08-28 2000-03-09 Alacritech, Inc. Intelligent network interface device and system for accelerating communication
US6353877B1 (en) * 1996-11-12 2002-03-05 Compaq Computer Corporation Performance optimization and system bus duty cycle reduction by I/O bridge partial cache line write
CN102298556A (en) * 2011-08-26 2011-12-28 成都市华为赛门铁克科技有限公司 Data stream recognition method and device
CN104756090A (en) * 2012-11-27 2015-07-01 英特尔公司 Providing extended cache replacement state information
CN104781753A (en) * 2012-12-14 2015-07-15 英特尔公司 Power gating a portion of a cache memory
CN107368433A (en) * 2011-12-20 2017-11-21 英特尔公司 The dynamic part power-off of memory side cache in 2 grades of hierarchy of memory


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
唐轶轩: "《面向多线程应用的Cache优化策略及并行模拟研究》", 《中国博士学位论文全文数据库 信息科技辑》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220087459A (en) * 2019-10-31 2022-06-24 어드밴스드 마이크로 디바이시즈, 인코포레이티드 Cache Access Measure Deskew
KR102709340B1 (en) 2019-10-31 2024-09-25 어드밴스드 마이크로 디바이시즈, 인코포레이티드 Cache Access Measurement Deskew
US11880310B2 (en) 2019-10-31 2024-01-23 Advanced Micro Devices, Inc. Cache access measurement deskew
WO2021087115A1 (en) 2019-10-31 2021-05-06 Advanced Micro Devices, Inc. Cache access measurement deskew
EP4052133A4 (en) * 2019-10-31 2023-11-29 Advanced Micro Devices, Inc. Cache access measurement deskew
CN111026682B (en) * 2019-12-26 2022-03-08 浪潮(北京)电子信息产业有限公司 Data access method and device of board card chip and computer readable storage medium
CN111026682A (en) * 2019-12-26 2020-04-17 浪潮(北京)电子信息产业有限公司 Data access method and device of board card chip and computer readable storage medium
CN112069091B (en) * 2020-08-17 2023-09-01 北京科技大学 Memory access optimization method and device applied to molecular dynamics simulation software
CN112069091A (en) * 2020-08-17 2020-12-11 北京科技大学 Access optimization method and device applied to molecular dynamics simulation software
CN112181864B (en) * 2020-10-23 2023-07-25 中山大学 Address tag allocation scheduling and multipath cache write-back method for Path ORAM
CN112181864A (en) * 2020-10-23 2021-01-05 中山大学 Address tag allocation scheduling and multi-Path cache write-back method for Path ORAM
CN113392043A (en) * 2021-07-06 2021-09-14 南京英锐创电子科技有限公司 Cache data replacement method, device, equipment and storage medium
CN114115746A (en) * 2021-12-02 2022-03-01 北京乐讯科技有限公司 Full link tracking device of user mode storage system

Also Published As

Publication number Publication date
CN110297787B (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN110297787A (en) The method, device and equipment of I/O equipment access memory
CN107193646B (en) High-efficiency dynamic page scheduling method based on mixed main memory architecture
CN104115133B (en) For method, system and the equipment of the Data Migration for being combined non-volatile memory device
CN106909515B (en) Multi-core shared last-level cache management method and device for mixed main memory
CN105095116B (en) Cache method, cache controller and the processor replaced
US20010014931A1 (en) Cache management for a multi-threaded processor
CN113424160A (en) Processing method, processing device and related equipment
US20110252215A1 (en) Computer memory with dynamic cell density
CN107066393A (en) The method for improving map information density in address mapping table
CN110888600B (en) Buffer area management method for NAND flash memory
US8793434B2 (en) Spatial locality monitor for thread accesses of a memory resource
CN109582600B (en) Data processing method and device
US11093410B2 (en) Cache management method, storage system and computer program product
JP2018537770A (en) Profiling cache replacement
CN110795363B (en) Hot page prediction method and page scheduling method of storage medium
CN109684231A (en) The system and method for dsc data in solid-state disk and stream for identification
JP2009524137A (en) Cyclic snoop to identify eviction candidates for higher level cache
CN104714898B (en) A kind of distribution method and device of Cache
CN111722797B (en) SSD and HA-SMR hybrid storage system oriented data management method, storage medium and device
CN111078143B (en) Hybrid storage method and system for data layout and scheduling based on segment mapping
CN103885890B (en) Replacement processing method and device for cache blocks in caches
CN109478164A (en) For storing the system and method for being used for the requested information of cache entries transmission
US7702875B1 (en) System and method for memory compression
CN105359116B (en) Buffer, shared cache management method and controller
Shi et al. A unified write buffer cache management scheme for flash memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Applicant after: Loongson Zhongke Technology Co.,Ltd.

Address before: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Applicant before: LOONGSON TECHNOLOGY Corp.,Ltd.

GR01 Patent grant
GR01 Patent grant