CN110297787A - The method, device and equipment of I/O equipment access memory - Google Patents
- Publication number
- CN110297787A (application CN201810240206.XA)
- Authority
- CN
- China
- Prior art keywords
- cache
- access
- memory access
- data
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
Abstract
The present invention provides a method, an apparatus, and a device for memory access by an I/O device. The method calculates the maximum number of ways hit by CPU access requests in the cache (Cache) during the period from when the I/O data corresponding to an I/O write request is written into the Cache until it is read. Based on this maximum hit way count, the number of Cache ways available to I/O data is updated so that, after the update, the available way count equals the total number of Cache ways minus the maximum hit way count. I/O memory accesses are then processed according to this available way count. The space occupied by I/O data in the Cache is thus adjusted dynamically, in real time, according to the CPU's actual use of the Cache, so I/O memory access performance and Cache space utilization are improved without degrading CPU performance, thereby improving overall processor performance.
Description
Technical field
The present invention relates to the field of processors, and in particular to a method, an apparatus, and a device for memory access by an I/O device.
Background technique
With the rapid development of microprocessor technology, microprocessor integration density keeps increasing and computing capability has improved greatly; as a result, the memory access performance of I/O devices has become the bottleneck limiting further processor performance gains.
Traditional I/O device memory access generally uses direct memory access (DMA) or direct cache access (DCA). DMA allows an I/O device to read and write main memory directly, reducing the processor core's involvement in I/O data handling. DCA allows an I/O device to read and write the cache memory (Cache) directly, in order to improve I/O memory access performance. Although DCA improves the memory access performance of I/O devices, writing I/O data directly into the Cache pollutes it with I/O data, which seriously affects other processes running on the processor. To reduce this pollution, the partition-based DMA cache (PBDC) approach was introduced: it statically divides the Cache into two regions, one for I/O data and one for processor data, separating the two kinds of data and thereby reducing Cache pollution.
However, PBDC requires fairly substantial changes to the Cache structure and the coherence protocol, and is therefore complex to implement. Moreover, because I/O access patterns vary widely, a static partition is hard to size: if too little Cache space is allocated to I/O data, the I/O region runs short and I/O data in the Cache is evicted before it is ever used, severely degrading overall processor performance; if too much space is allocated, the performance of other programs suffers, which likewise degrades overall processor performance.
Summary of the invention
The present invention provides a method, an apparatus, and a device for memory access by an I/O device, to solve the problem in existing access methods that, because I/O access patterns vary, allocating too little Cache space to I/O data causes I/O data to be evicted before it is used and severely degrades overall processor performance, while allocating too much harms the performance of other programs and likewise degrades overall processor performance.
One aspect of the present invention provides a method for memory access by an I/O device, comprising:
receiving an I/O write request, and calculating the maximum number of ways hit by CPU access requests in the cache memory (Cache) during the period from when the I/O data corresponding to the I/O write request is written into the Cache until it is read;
updating, according to the maximum hit way count, the number of Cache ways available to I/O data, so that after the update the available way count equals the total number of Cache ways minus the maximum hit way count;
and processing I/O memory accesses according to the number of Cache ways available to I/O data.
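As a minimal illustration of the way-count arithmetic in the steps above (the patent describes a hardware mechanism; the function name here is hypothetical), the update step can be sketched as:

```python
def update_io_available_ways(total_ways: int, max_hit_ways: int) -> int:
    """Ways the Cache can lend to I/O data: total ways W minus the maximum
    way position N hit by CPU requests while the I/O data sat in the Cache."""
    if not 0 <= max_hit_ways <= total_ways:
        raise ValueError("max hit way count must lie in [0, W]")
    return total_ways - max_hit_ways

# A 16-way Cache in which CPU hits never went deeper than way 5:
assert update_io_available_ways(16, 5) == 11
# CPU threads using every way leaves nothing for I/O data:
assert update_io_available_ways(16, 16) == 0
```

The count is recomputed each time a monitored I/O datum is read, so it tracks the CPU's current Cache pressure rather than a static partition.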
Another aspect of the present invention provides an apparatus for memory access by an I/O device, comprising:
a first calculation module, configured to receive an I/O write request and calculate the maximum number of ways hit by CPU access requests in the cache memory (Cache) during the period from when the I/O data corresponding to the I/O write request is written into the Cache until it is read;
an update module, configured to update, according to the maximum hit way count, the number of Cache ways available to I/O data, so that after the update the available way count equals the total number of Cache ways minus the maximum hit way count;
and a memory access processing module, configured to process I/O memory accesses according to the number of Cache ways available to I/O data.
Another aspect of the present invention provides a computer device, comprising: a processor, a memory, and a computer program stored on the memory and executable by the processor; when executing the computer program, the processor implements the above method for I/O device memory access.
With the method, apparatus, and device for I/O device memory access provided by the present invention, the maximum number of ways hit by CPU access requests during the period from when the I/O data of an I/O write request is written into the cache memory (Cache) until it is read is calculated; according to this maximum hit way count, the number of Cache ways available to I/O data is updated to the total way count minus the maximum hit way count; and I/O memory accesses are then processed according to the available way count. The number of ways available to I/O data in the Cache is thus updated dynamically, in real time, according to the CPU's actual use of the Cache, so the space occupied by I/O data in the Cache can be enlarged whenever doing so has little effect on other CPU threads. I/O memory access performance and Cache space utilization are thereby improved without affecting CPU performance, further improving the overall performance of the processor.
Detailed description of the invention
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flowchart of the method for I/O device memory access provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the method for I/O device memory access provided by Embodiment 2 of the present invention;
Fig. 3 is a flowchart of the method for I/O device memory access provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of the apparatus for I/O device memory access provided by Embodiment 5 of the present invention;
Fig. 5 is a schematic structural diagram of the computer device provided by Embodiment 9 of the present invention.
The above drawings show specific embodiments of the present invention, which are described in more detail below. The drawings and accompanying text are not intended to limit the scope of the inventive concept in any way, but rather to illustrate the concept of the invention to those skilled in the art by reference to specific embodiments.
Specific embodiment
Example embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following example embodiments do not represent all embodiments consistent with the present invention; rather, they are merely examples of apparatuses and methods, consistent with some aspects of the invention as detailed in the appended claims.
Terms used in the present invention are first explained:
Cache memory: also called Cache. In the hierarchical structure of a computer memory system, it is a high-speed, low-capacity memory located between the central processing unit (CPU, or processor) and main memory, and together with main memory it forms one level of the memory hierarchy. A Cache is usually built from static RAM (SRAM) chips; its capacity is relatively small, but its speed is much higher than that of main memory and approaches the speed of the CPU. Because of the locality of the instructions a processor executes, the processor achieves a very high hit rate in the Cache and goes to main memory only on a miss, which greatly increases the CPU's effective processing speed.
Cache structure: a Cache is usually implemented with an associative memory, in which each memory block (also called a Cache line) carries additional identifying information called a tag. When the associative memory is accessed, the address is compared against all tags simultaneously, and the block whose tag matches is accessed. The Cache in the embodiments of the present invention uses a multi-way set-associative structure. A set-associative Cache is intermediate between a fully associative Cache and a direct-mapped Cache: it uses several groups of direct-mapped blocks, so a given index maps to one set containing several candidate Cache line positions, which increases the hit rate and system efficiency.
Direct memory access (DMA) mode: allows an I/O device to read and write main memory directly, reducing the processor core's involvement in I/O data handling.
Direct cache access (DCA) mode: allows an I/O device to read and write the cache memory (Cache) directly, to improve the memory access performance of the I/O device.
In addition, the terms "first", "second", and so on are used for description only and are not to be understood as indicating or implying relative importance or the number of the technical features referred to. In the description of the following embodiments, "plurality" means two or more unless otherwise specifically defined.
The specific embodiments below may be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below with reference to the drawings.
Embodiment one
Because I/O data has strong streaming behavior, I/O accesses generally occur at consecutive addresses; that is, their spatial locality is good. This means that with a multi-way set-associative Cache structure, each Cache set receives roughly the same number of I/O requests, which improves Cache space utilization. The same property also makes partitioning the Cache by way more effective for I/O data than partitioning it per CPU thread, and avoids more complex Cache partitioning and replacement schemes such as hash-indexed addressing. The Cache in the embodiments of the present invention therefore uses a multi-way set-associative organization.
Fig. 1 is a flowchart of the method for I/O device memory access provided by Embodiment 1 of the present invention. This embodiment addresses the problem in existing access methods that, because I/O access patterns vary, allocating too little Cache space to I/O data causes I/O data to be evicted before it is used and severely degrades overall processor performance, while allocating too much harms the performance of other programs and likewise degrades overall processor performance. As shown in Fig. 1, the method comprises the following steps:
Step S101: receive an I/O write request, and calculate the maximum hit way count of CPU access requests in the cache memory (Cache) during the period from when the I/O data corresponding to the I/O write request is written into the Cache until it is read.
Here the Cache uses a W-way set-associative organization, where W is 2 to the power n and n is a positive integer.
In this embodiment, the period from when the I/O data corresponding to an I/O write request is written into the Cache until it is read is denoted the target time period. On receiving an I/O write request, the processor tracks, during this target time period, the hits of CPU access requests in the Cache, and from them calculates the maximum hit way count of CPU access requests in the Cache within the target time period.
In this embodiment, the maximum hit way count of CPU access requests in the Cache during the target time period is, among the hits made by other CPU threads on CPU data in the Cache during that period, the hit way position closest to the LRU position; it reflects how heavily other CPU threads are using the Cache. Here, the LRU position is the least recently used position, i.e., the position of the Cache line that will be replaced next according to the least recently used (LRU) replacement policy.
Step S102: according to the maximum hit way count, update the number of Cache ways available to I/O data, so that after the update the available way count equals the total number of Cache ways minus the maximum hit way count.
For a W-way set-associative Cache, suppose the maximum hit way count measured in the target time period is N (N ≤ W). If a hit occurred on way N, then by the LRU stack property the ways before way N have very probably been hit as well; consequently, if CPU data had been confined to N Cache ways during the target time period, the hit rate would not have been affected. Setting the number of ways available to I/O data to (W − N) is therefore the most conservative choice, with minimal impact on other CPU threads.
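The LRU stack property above can be illustrated with a small simulation (a sketch, not the hardware mechanism; names are hypothetical): the deepest stack position at which CPU requests hit during the window is N, and W − N ways can then be handed to I/O data without costing any of those hits:

```python
def max_hit_depth(ways, accesses):
    """Simulate an LRU stack of at most `ways` entries over a tag sequence
    and return the deepest stack position (1-based) at which a hit occurred."""
    stack, deepest = [], 0
    for tag in accesses:
        if tag in stack:
            depth = stack.index(tag) + 1   # 1 = MRU way
            deepest = max(deepest, depth)
            stack.remove(tag)
        stack.insert(0, tag)
        del stack[ways:]                   # keep at most `ways` entries
    return deepest

W = 8
N = max_hit_depth(W, ["a", "b", "c", "a", "b", "c"])  # every hit lands at depth 3
assert N == 3
assert W - N == 5   # ways that can be lent to I/O data without losing CPU hits
```

By the stack property, any access that hits at depth ≤ N would still hit in a cache restricted to N ways, which is why (W − N) is a safe allocation.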
In this embodiment, after the maximum hit way count of CPU access requests during the period from when the I/O data of the I/O write request is written into the Cache until it is read has been calculated, the number of Cache ways available to I/O data is updated to the total way count minus that maximum hit way count. The ways available to I/O data are thus increased as far as possible without significantly affecting other CPU threads, improving I/O memory access performance without degrading CPU performance.
Step S103: process I/O memory accesses according to the number of Cache ways available to I/O data.
In the embodiment of the present invention, the maximum hit way count of CPU access requests during the period from when the I/O data of an I/O write request is written into the cache memory (Cache) until it is read is calculated; the number of Cache ways available to I/O data is updated according to this maximum hit way count, so that the available way count equals the total way count minus the maximum hit way count; and I/O memory accesses are then processed according to the available way count. The number of ways available to I/O data is thus updated dynamically, in real time, according to the CPU's actual use of the Cache, so the space occupied by I/O data in the Cache is adjusted dynamically: the I/O way count is increased only when this has little effect on other CPU threads. I/O memory access performance is thereby improved without affecting CPU performance, further improving the overall performance of the processor and the space utilization of the Cache.
Embodiment two
Fig. 2 is a flowchart of the method for I/O device memory access provided by Embodiment 2 of the present invention. On the basis of Embodiment 1, in this embodiment sample sets are chosen in advance and one monitoring register is provided for each sample set; only the memory accesses to the sample sets are tracked and recorded in order to calculate the maximum hit way count of CPU access requests during the period from when the I/O data of an I/O write request is written into the cache memory (Cache) until it is read. As shown in Fig. 2, step S101 above may specifically be implemented by the following steps:
Step S201: receive an access request, which is either an I/O write request or a CPU access request.
The access request carries the target address to be accessed, which includes at least the index of the Cache set and the tag of the Cache line.
In practice, the access request received by the processor may be any of an I/O write request, an I/O read request, a CPU write request, or a CPU read request. In this embodiment, the subsequent steps are performed when the access request is an I/O write request, a CPU write request, or a CPU read request. I/O data exhibits a clear producer-consumer pattern: either the CPU writes and the I/O device reads, or the I/O device writes and the CPU reads.
Step S202: according to the index carried in the access request, determine whether the target Cache set accessed by the request belongs to a sample set.
In this embodiment, an auxiliary tag directory may be added, which records the indices of the sample sets and the tags of the Cache lines in the sample sets; it is used to measure how heavily other CPU threads use the Cache during the target time period from when an I/O datum is written into the Cache until it is read. The auxiliary tag directory has the same structure as the tag directory of an existing Cache and also uses the LRU replacement policy; the difference is that it tracks only the hits of CPU access requests in the sample sets and simply ignores I/O access requests, so it simulates the behavior of the CPU program in the Cache.
The sample sets in this embodiment are obtained by sampling from all Cache sets; they comprise several Cache sets in the Cache and may be chosen at random. Preferably, for any way of the Cache, the sample sets contain at least one Cache set associated with that way — that is, the sample sets cover every way of the Cache — so that the access behavior of the sample sets is closer to the overall access behavior of the Cache. Moreover, the more Cache sets the sample sets contain, the closer their access behavior is to that of the whole Cache.
The Cache in this embodiment uses a W-way set-associative organization, where W is 2 to the power n and n is a positive integer. Optionally, the number of sample sets may be W; this embodiment does not specifically limit the number of sample sets or the way they are selected.
In this step, determining from the index carried in the access request whether the target Cache set belongs to a sample set may specifically be implemented as follows: according to the index in the target address carried by the access request, determine whether the request accesses an index entry in the auxiliary tag directory; if it does, the target Cache set accessed by the request belongs to a sample set; if it does not, the target Cache set does not belong to a sample set.
Alternatively, the indices of the sample sets may be recorded directly, and whether the index carried in the access request appears among the recorded sample-set indices determines whether the target Cache set belongs to a sample set.
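A possible software sketch of this sampling scheme (the selection policy and all names are assumptions; the text leaves both open) draws a handful of set indices at random and tests membership from the index alone:

```python
import random

def pick_sample_sets(num_sets: int, num_samples: int) -> frozenset:
    """Randomly choose `num_samples` distinct set indices out of `num_sets`."""
    return frozenset(random.sample(range(num_sets), num_samples))

def in_sample_set(index: int, samples: frozenset) -> bool:
    """Step S202: only requests whose set index falls in a sample set are tracked."""
    return index in samples

random.seed(0)                                             # deterministic example
samples = pick_sample_sets(num_sets=256, num_samples=16)   # e.g. W = 16 samples
assert len(samples) == 16
assert all(0 <= i < 256 for i in samples)
```

Tracking only sampled sets is what keeps the hardware overhead small: one monitoring register per sample set rather than per Cache set.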
If the target Cache set accessed by the request belongs to a sample set, execute step S203 to determine whether the access request is an I/O access request.
If the target Cache set does not belong to a sample set, the procedure ends and the access request is not tracked.
Step S203: determine whether the access request is an I/O access request.
If the access request is an I/O write request, execute steps S204-S205 to continue tracking it.
If the access request is an I/O read request, the procedure ends and the request is not tracked.
If the access request is not an I/O access request, it is a CPU write request or a CPU read request; execute step S206 to track the CPU access request.
Step S204: if the access request is an I/O write request, determine whether the monitoring register corresponding to the target Cache set is in the unused state.
In this embodiment, the monitoring register records at least the following: whether it is in use, whether monitoring has finished, a tag, and a hit way count. Initially, the monitoring register is set to the unused state, monitoring finished, tag 0, and hit way count 0. For example, the monitoring register may contain the fields used, valid, tag, and LHW. used indicates whether the register is in use, i.e., whether some I/O datum is currently being monitored; it can be a single flag bit, with "1" meaning in use and "0" meaning unused. valid indicates whether the monitoring pass of this register has finished; the finishing condition is that the I/O datum identified by the tag recorded in the register has been accessed; valid can likewise be a flag bit. The tag field records the tag of the I/O datum to be monitored, and LHW records the hit way count. Initially, all fields of the monitoring register are set to 0.
Optionally, one monitoring register may be added for each sample set, or an existing register may be reused to implement the monitoring function.
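The register described above might be modeled as follows (a sketch assuming only the used/valid/tag/LHW fields named in the text; field widths and encodings are not specified by the patent):

```python
from dataclasses import dataclass

@dataclass
class MonitorRegister:
    used: int = 0    # 1 = an I/O datum is currently being monitored
    valid: int = 0   # 1 = the monitored I/O datum has been read (pass finished)
    tag: int = 0     # tag of the I/O datum being monitored
    lhw: int = 0     # largest way position hit by CPU requests so far

    def start(self, tag: int) -> None:
        """Step S205: record the I/O write's tag and mark the register in use."""
        self.tag, self.used, self.valid, self.lhw = tag, 1, 0, 0

    def reset(self) -> None:
        """Clear all fields so the register can serve the next tracking round."""
        self.used = self.valid = self.tag = self.lhw = 0

reg = MonitorRegister()
reg.start(0x3F)
assert (reg.used, reg.tag) == (1, 0x3F)
reg.reset()
assert (reg.used, reg.valid, reg.tag, reg.lhw) == (0, 0, 0, 0)
```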
If the monitoring register corresponding to the target Cache set is in the unused state, execute step S205.
If the monitoring register corresponding to the target Cache set is in use, the procedure ends and this write request is not tracked further.
Step S205: record the tag carried by the I/O write request in the monitoring register corresponding to the target Cache set, and mark that monitoring register as in use.
If the monitoring register corresponding to the target Cache set is in the unused state, no access to this target Cache set is currently being traced. In this step, the tag carried by the I/O write request is recorded in the monitoring register of the target Cache set, thereby recording the event of this I/O data being written into the Cache, and the register is marked as in use. From this point on, the monitoring register records the hit way positions of CPU access requests in the Cache until the I/O data of this write request is read, thereby tracking the CPU's use of the Cache.
Step S206: the access request is a CPU access request; determine whether the monitoring register corresponding to the target Cache set is in use.
If the monitoring register corresponding to the target Cache set is in use, execute steps S207-S211 to perform the subsequent tracking of the CPU access request.
If it is in the unused state, no access to this target Cache set is currently being traced; the procedure ends and the CPU access request is not tracked further.
Step S207: determine whether the CPU access request hits a Cache line in the target Cache set.
If the CPU access request hits a Cache line in the target Cache set, execute steps S208-S209 to update the hit way count recorded in the monitoring register.
If the CPU access request misses the Cache lines in the target Cache set, execute steps S210-S211.
In this embodiment, whether the CPU access request hits a Cache line in the target Cache set may specifically be determined as follows: according to the tag in the target address carried by the request, determine whether the request matches a tag entry in the auxiliary tag directory; if it does, the CPU access request hits a Cache line in the target Cache set; if it does not, the CPU access request misses.
Alternatively, the indices and tags of the sample sets may be recorded directly, and whether the index and tag carried by the request match a recorded sample-set tag determines whether the CPU access request hits a Cache line in the target Cache set.
Step S208: compare the way number of the currently hit Cache line with the hit way count recorded in the monitoring register corresponding to the target Cache set.
If the way number of the currently hit Cache line is less than or equal to the hit way count recorded in the monitoring register of the target Cache set, the recorded value need not be updated.
Step S209: if the way number of the currently hit Cache line is greater than the hit way count recorded in the monitoring register of the target Cache set, update the recorded hit way count to the way number of the currently hit Cache line.
Step S210: determine whether the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register of the target Cache set.
I/O data exhibits a clear producer-consumer pattern: either the CPU writes and the I/O device reads, or the I/O device writes and the CPU reads. If the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register of the target Cache set, the I/O datum in that Cache line has been accessed — that is, the I/O datum has been read by this CPU access request — so execute step S211 and take the hit way count recorded in the monitoring register as the maximum hit way count.
If the CPU access request misses the Cache line corresponding to the recorded tag, the tracking of this CPU access request ends.
Step S211: take the hit way count recorded in the monitoring register corresponding to the target Cache set as the maximum hit way count.
Optionally, after the hit way count recorded in the monitoring register corresponding to the target Cache set is taken as the maximum hit way count, the method further includes: setting the hit way count in the monitoring register of the target Cache set to 0, and marking the register as unused.
In addition, after the recorded hit way count has been taken as the maximum hit way count, all fields of the monitoring register may be cleared to 0, so that the register can be used for the next round of tracking.
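Steps S206-S211 can be condensed into one update routine (a behavioral sketch under the same assumed register model, not the hardware implementation; all names are illustrative):

```python
def on_cpu_access(reg, hit, hit_way, req_tag):
    """Apply steps S206-S211 for one CPU access against a sample set.

    reg:      dict with keys 'used', 'tag', 'lhw' (the monitoring register)
    hit:      True if the request hit the auxiliary tag directory (S207)
    hit_way:  1-based way position of the hit line (meaningful when hit)
    req_tag:  tag carried by the CPU request
    Returns the maximum hit way count when the monitored I/O datum is read
    (end of the target time period), otherwise None.
    """
    if not reg["used"]:                    # S206: nothing being monitored
        return None
    if hit:                                # S208-S209: deepen the recorded LHW
        if hit_way > reg["lhw"]:
            reg["lhw"] = hit_way
        return None
    if req_tag == reg["tag"]:              # S210: the CPU reads the I/O datum
        n = reg["lhw"]                     # S211: report the max hit way count
        reg.update(used=0, tag=0, lhw=0)   # reset for the next round
        return n
    return None

reg = {"used": 1, "tag": 0x3F, "lhw": 0}
assert on_cpu_access(reg, True, 3, 0x10) is None and reg["lhw"] == 3
assert on_cpu_access(reg, True, 2, 0x11) is None and reg["lhw"] == 3
assert on_cpu_access(reg, False, 0, 0x3F) == 3   # I/O datum read: N = 3
assert reg["used"] == 0
```

The returned N then feeds step S102 of Embodiment 1, which sets the I/O-available way count to W − N.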
In the embodiment of the present invention, the processing flow shown in Fig. 2 is described using the tracking of a single access request as an example. In practice, this flow is applied cyclically to successive access requests, in parallel with the processor's normal memory access handling, so that the maximum hit way count of CPU access requests during the period from when the I/O data of an I/O write request is written into the Cache until it is read is computed dynamically, in real time, without interfering with normal memory access processing.
In this embodiment of the present invention, sample sets are preset and one monitoring register is provided for each sample set, so that only the memory accesses of the sample sets are tracked and recorded; this sampling approach saves hardware overhead while still allowing the maximum hit way count of CPU access requests in the cache memory (Cache), over the period from when the I/O data corresponding to an I/O write request are written until they are read, to be computed dynamically in real time. On this basis, the space occupied by I/O data in the Cache can be adjusted dynamically, increasing the available way count for I/O data without significantly affecting other CPU threads. I/O memory access performance is thus improved without degrading CPU performance, further improving the overall performance of the processor and the space utilization of the Cache.
Embodiment three
Fig. 3 is a flowchart of the method for an I/O device to access memory provided by embodiment three of the present invention. On the basis of embodiment one or embodiment two, in this embodiment, after an I/O write request is received: when the I/O write request hits the Cache, the I/O data corresponding to the request are written directly into the Cache; when the I/O write request misses the Cache, the number of ways already occupied by I/O data in the target Cache set accessed by the request is computed, and whether that number is less than the available way count for I/O data in the Cache determines whether the I/O write request is processed using direct cache access (DCA). As shown in Fig. 3, the specific steps of the method are as follows:
Step S301: receive an I/O write request.
Step S302: judge whether the I/O write request hits the Cache.
If the I/O write request hits the Cache, execute step S303 and write the I/O data corresponding to the request directly into the Cache.
If the I/O write request misses the Cache, execute steps S304-S305.
Step S303: if the I/O write request hits the Cache, write the I/O data corresponding to the request directly into the Cache.
Step S304: if the I/O write request misses the Cache, compute the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request.
In this embodiment, in order to distinguish I/O data from CPU data, two flag fields are added to the original tag registers of the Cache, so that the improved tag register includes at least the fields: type, accessed, and tag. Specifically, a flag bit in the tag register may indicate the type of the stored data, distinguishing CPU data from I/O data; for example, flag = 1 indicates I/O data and flag = 0 indicates CPU data. A used bit may indicate whether the stored data have been accessed; for example, used = 1 indicates that the data have been accessed and used = 0 indicates that they have not. The tag field records the tag of the stored data.
In this step, the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request can be obtained by counting the Cache lines in the target Cache set whose corresponding tag registers have flag = 1.
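The extended tag register and the occupied-way count of step S304 can be sketched as follows. The field names `flag`, `used` and `tag` follow the description above; the `dataclass` form and the helper name `occupied_io_ways` are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class TagRegister:
    flag: int   # 1 = I/O data, 0 = CPU data
    used: int   # 1 = already accessed, 0 = not yet accessed
    tag: int    # tag of the stored data

def occupied_io_ways(cache_set):
    """Count the ways of one Cache set whose lines hold I/O data (flag == 1)."""
    return sum(1 for t in cache_set if t.flag == 1)

# A 4-way target Cache set: two I/O lines, two CPU lines.
target_set = [
    TagRegister(flag=1, used=0, tag=0x10),
    TagRegister(flag=0, used=1, tag=0x20),
    TagRegister(flag=1, used=1, tag=0x30),
    TagRegister(flag=0, used=0, tag=0x40),
]
assert occupied_io_ways(target_set) == 2
```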
Step S305: according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is less than the available way count for I/O data in the Cache, determine whether the I/O write request is processed using direct cache access (DCA).
In this embodiment, the determination distinguishes the following cases:
(1) If the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is less than the available way count for I/O data in the Cache, the I/O write request is processed using DCA.
(2) If the position to be replaced stores I/O data that have already been accessed, the I/O write request is processed using DCA. The position to be replaced is the position of the Cache line that will be replaced next according to the replacement policy in use.
Here, "the position to be replaced stores I/O data that have been accessed" means that the data stored at the position to be replaced are I/O data, and that those I/O data have already been accessed.
Specifically, if both the flag bit and the used bit in the tag register corresponding to the position to be replaced are 1, it can be determined that the position to be replaced stores I/O data that have been accessed.
(3) If the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is greater than or equal to the available way count for I/O data in the Cache, and the position to be replaced does not store I/O data that have been accessed, the I/O write request is processed using DMA.
If the position to be replaced does not currently store I/O data, or the I/O data it stores have not yet been accessed, it can be determined that the position to be replaced does not store I/O data that have been accessed.
Specifically, if the flag bit in the tag register corresponding to the position to be replaced is 0, the position does not currently store I/O data; if the used bit in that tag register is 0, the I/O data currently stored there have not been accessed. As long as either the flag bit or the used bit is determined to be 0, the position to be replaced does not store I/O data that have been accessed.
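Cases (1)-(3) above amount to a small decision procedure, sketched here under the assumption that the victim line's tag register (here `Victim`, an illustrative name) exposes the flag and used bits described earlier:

```python
def choose_access_mode(occupied_io_ways, io_avail_ways, victim):
    """Return 'DCA' or 'DMA' for an I/O write request that missed the Cache.

    victim: tag register (with .flag and .used) of the line that would be
    replaced next under the current replacement policy."""
    # case (1): I/O data still below their available way count -> DCA
    if occupied_io_ways < io_avail_ways:
        return "DCA"
    # case (2): victim holds I/O data that were already read once -> DCA
    if victim.flag == 1 and victim.used == 1:
        return "DCA"
    # case (3): quota reached and victim is not accessed I/O data -> DMA
    return "DMA"

class Victim:
    def __init__(self, flag, used):
        self.flag, self.used = flag, used

assert choose_access_mode(2, 4, Victim(0, 0)) == "DCA"   # below quota
assert choose_access_mode(4, 4, Victim(1, 1)) == "DCA"   # accessed I/O victim
assert choose_access_mode(4, 4, Victim(1, 0)) == "DMA"   # unread I/O victim
assert choose_access_mode(4, 4, Victim(0, 1)) == "DMA"   # CPU-data victim
```

Case (2) is what keeps DCA productive even at the quota: replacing I/O data that have already been consumed costs nothing, since the producer-consumer pattern means they will not be read again.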
In this embodiment of the present invention, after an I/O write request is received, the I/O data corresponding to the request are written directly into the Cache when the request hits the Cache; when the request misses the Cache, whether the I/O write request is processed using direct cache access (DCA) is determined according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the request is less than the available way count for I/O data in the Cache, which improves processor performance.
Embodiment four
On the basis of embodiment three above, in this embodiment, while I/O write requests are being processed using DCA, whether the Cache replacement policy uses LRU or another replacement policy is determined from statistics on the number of Cache lines that were replaced without receiving a second access and the number of Cache lines that received a second access before being replaced.
Here, the second replacement policy is the most recently used (Most Recently Used, MRU for short) replacement policy.
Preferably, since the target addresses of I/O write requests using DCA are contiguous and I/O data are read and used by the CPU only once, the second replacement policy is: after the I/O data in a Cache line have been accessed once, that Cache line is treated as the preferred replacement position, and target data subsequently written into the Cache are preferentially written to the preferred replacement position.
In this embodiment, since the behavior of the I/O data is not known initially, the LRU replacement policy is initially used as the Cache replacement policy. A first difference, between the number of Cache lines replaced without a second access and the number of Cache lines that received a second access before being replaced, is counted. When the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines that received a second access before being replaced, the Cache replacement policy is updated to the second replacement policy.
Optionally, an N-bit saturating counter may be provided, initialized to all zeros. When the Cache line holding an item of I/O data is replaced without the I/O data ever receiving a second access, the saturating counter is incremented by 1; when the Cache line holding an item of I/O data receives a second access, the saturating counter is decremented by 1. When the saturating counter becomes all ones, the Cache replacement policy for I/O data is changed to the second replacement policy.
When the saturating counter reaches all ones, its count equals the first preset threshold. The first preset threshold can be set by a technician according to actual needs; this embodiment does not specifically limit it.
Further, the improved tag register of embodiment three can be reused by extending the used field to 2 bits: the low bit indicates whether the line has been accessed, and the high bit indicates whether a second access has occurred, the high bit being set to 1 when the second access occurs. During counting, when the Cache line holding an item of I/O data (the flag bit of the corresponding tag register is 1) is replaced, a high used bit of 0 indicates that the I/O data in the line never received a second access before being replaced, and the saturating counter is incremented by 1. When the Cache line holding an item of I/O data (the flag bit of the corresponding tag register is 1) is accessed and the used field has high bit 0 and low bit 1, the I/O data have already been accessed once; the used field is then set to 11, indicating that the I/O data in the line have now been accessed twice, and the saturating counter is decremented by 1.
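A minimal model of the saturating-counter switch described above, assuming an N-bit counter whose all-ones value stands in for the first preset threshold; the class and method names are ours, not the patent's:

```python
class PolicySwitcher:
    def __init__(self, n_bits=3):
        self.max = (1 << n_bits) - 1   # "all ones" = first preset threshold
        self.count = 0                 # initialized to all zeros
        self.policy = "LRU"

    def on_replaced_without_second_access(self):
        """Cache line holding I/O data replaced before any second access."""
        if self.count < self.max:
            self.count += 1
        if self.count == self.max:
            self.policy = "MRU"        # switch to the second replacement policy

    def on_second_access(self):
        """Cache line holding I/O data received a second access."""
        if self.count > 0:
            self.count -= 1

sw = PolicySwitcher(n_bits=2)          # threshold = 3 (binary 11)
sw.on_replaced_without_second_access()
sw.on_second_access()                  # net effect cancels out
sw.on_replaced_without_second_access()
sw.on_replaced_without_second_access()
assert sw.policy == "LRU"              # count = 2, still below all ones
sw.on_replaced_without_second_access()
assert sw.count == 3 and sw.policy == "MRU"
```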
Optionally, the number of Cache lines replaced without a second access (denoted the first quantity) and the number of Cache lines that received a second access before being replaced (denoted the second quantity) may also be counted separately, the first difference being obtained in real time as the difference between the first quantity and the second quantity and compared against the first preset threshold. When the first difference is equal to or greater than the first preset threshold, if the first quantity is greater than the second quantity, the Cache replacement policy is updated to the second replacement policy.
In this embodiment, in order to be able to switch from the second replacement policy back to the LRU replacement policy, an auxiliary tag directory needs to be preset. The Cache replacement policy corresponding to the auxiliary tag directory is always LRU, and its other fields are defined exactly as the tags in the Cache.
After the Cache replacement policy has been updated to the second replacement policy, the method further includes: tracking the auxiliary tag directory and, during the tracking, counting a second difference between the number of Cache line tags replaced without a second access and the number of Cache line tags that received a second access before being replaced; when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is less than the number of Cache line tags that received a second access before being replaced, the Cache replacement policy is updated back to the LRU replacement policy.
The second difference, between the number of Cache line tags replaced without a second access and the number of Cache line tags that received a second access before being replaced, can be counted during tracking in the same way as the first difference described above; this embodiment does not repeat the details.
For example, a saturating counter may likewise be used to track the auxiliary tag directory: when the Cache replacement policy is updated to the second replacement policy, the value of the saturating counter is all ones; the increment and decrement conditions are the same as in the counting of the first difference; and when the value of the saturating counter becomes all zeros, the Cache replacement policy is changed back to the LRU replacement policy.
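The switch-back mechanism can be sketched in the same style, under the assumption that the counter starts at all ones when the second replacement policy is adopted and that the increment and decrement events now come from the always-LRU auxiliary tag directory; all names are illustrative:

```python
class SwitchBackTracker:
    def __init__(self, n_bits=2):
        self.max = (1 << n_bits) - 1
        self.count = self.max          # all ones when the MRU policy is adopted
        self.policy = "MRU"

    def on_aux_replaced_without_second_access(self):
        """Tag in the auxiliary (always-LRU) directory evicted without reuse."""
        if self.count < self.max:
            self.count += 1

    def on_aux_second_access(self):
        """Tag in the auxiliary directory saw a second access."""
        if self.count > 0:
            self.count -= 1
        if self.count == 0:
            self.policy = "LRU"        # I/O data are being reused: revert

tr = SwitchBackTracker(n_bits=2)       # count starts at 3 (binary 11)
tr.on_aux_second_access()
tr.on_aux_second_access()
assert tr.policy == "MRU"              # count = 1, not yet all zeros
tr.on_aux_second_access()
assert tr.count == 0 and tr.policy == "LRU"
```

Tracking the always-LRU shadow directory rather than the real Cache is what makes the reversion possible: under the MRU-like policy the real Cache no longer exhibits LRU reuse behavior, so the evidence for switching back must come from a directory that still simulates LRU.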
To reduce hardware overhead, sampling is likewise used for the auxiliary tag directory: it records only the indices of the sample sets and the tags of the Cache lines in the sample sets, and a limited number of sample sets is enough to reach a high accuracy.
In addition, a memory address is generally organized as a three-dimensional structure of bank address, row address and column address. Owing to the physical characteristics of memory, consecutive accesses to the same row address are the most efficient and can greatly reduce memory row conflicts; externally, this appears as higher read/write bandwidth and lower latency for contiguous addresses. When a high-speed I/O device accesses memory using DMA, the contiguity of I/O data makes full use of this advantage; when DCA is used, however, exploiting this advantage requires the analysis and optimization below.
Write policies generally fall into write-through and write-back: write-through writes memory at the same time as the Cache, while write-back writes memory only when the Cache line is replaced. To reduce frequent memory reads and writes, modern microprocessors mostly adopt write-back, which suits CPU data with strong spatial locality that are used repeatedly. If the same policy is applied to I/O data, however, the benefit brought by DCA may be greatly reduced.
Because the addresses of I/O write requests using DCA are contiguous while the I/O data are read by the CPU only once, and because of the filtering effect of the Cache, the moments at which I/O data at contiguous addresses in different sets are replaced are not necessarily contiguous. Writes to originally contiguous addresses are thus split into writes to many different addresses, written back to memory at different times, causing severe memory row conflicts. If, instead, I/O write requests using DCA adopt the write-through policy, so that memory is written at the same time as the Cache, then since the data will not be written again, they can simply be dropped from the Cache on replacement without being written back to memory, which greatly improves memory access efficiency.
In this embodiment, optionally, when memory access is processed using DCA, the write-back policy is used for I/O write requests while the LRU replacement policy is in effect, and the write-through policy is used for I/O write requests while the second replacement policy is in effect.
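The pairing of write policy with replacement policy described above can be sketched as follows; the helper names and the dictionary-as-memory model are illustrative assumptions:

```python
def write_policy_for(replacement_policy):
    """Pick the write policy used for I/O write requests under DCA."""
    return "write-back" if replacement_policy == "LRU" else "write-through"

def on_replace(line, write_policy, memory):
    """On eviction, only dirty write-back lines still need to reach memory;
    write-through lines were already written and can simply be dropped."""
    if write_policy == "write-back" and line.get("dirty"):
        memory[line["addr"]] = line["data"]

mem = {}
assert write_policy_for("LRU") == "write-back"
assert write_policy_for("MRU") == "write-through"
on_replace({"addr": 0x100, "data": 42, "dirty": True}, "write-back", mem)
assert mem == {0x100: 42}
on_replace({"addr": 0x200, "data": 7, "dirty": True}, "write-through", mem)
assert mem == {0x100: 42}   # write-through line dropped, no write-back
```

The point of the pairing: under the second policy the I/O write stream is contiguous, so writing through preserves the contiguity of the memory writes, while eviction-time write-back would scatter them and provoke the row conflicts described above.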
In this embodiment of the present invention, while I/O write requests are processed using DCA, whether the Cache replacement policy uses LRU or another replacement policy is determined from statistics on the number of Cache lines replaced without a second access and the number of Cache lines that received a second access before being replaced, realizing dynamic switching of the Cache replacement policy; and different write policies are used with the different Cache replacement policies, which greatly improves memory access efficiency and thereby the overall performance of the processor.
Embodiment five
Fig. 4 is a schematic structural diagram of the apparatus for an I/O device to access memory provided by embodiment five of the present invention. The apparatus provided by this embodiment of the present invention can execute the processing flow provided by the method embodiments for an I/O device to access memory. As shown in Fig. 4, the apparatus 40 includes: a first computing module 401, an update module 402 and a memory access processing module 403.
Specifically, the first computing module 401 is configured to receive an I/O write request and compute the maximum hit way count of CPU access requests in the cache memory (Cache) over the period from when the I/O data corresponding to the I/O write request are written until they are read.
The update module 402 is configured to update the available way count for I/O data in the Cache according to the maximum hit way count, so that the updated available way count for I/O data equals the difference between the total way count of the Cache and the maximum hit way count.
The memory access processing module 403 is configured to perform I/O memory access processing according to the available way count for I/O data in the Cache.
The apparatus provided by this embodiment of the present invention can specifically execute the method embodiment provided by embodiment one above; its specific functions are not repeated here.
In this embodiment of the present invention, the maximum hit way count of CPU access requests in the cache memory (Cache), over the period from when the I/O data corresponding to an I/O write request are written until they are read, is computed; the available way count for I/O data in the Cache is updated according to the maximum hit way count, so that the updated available way count equals the difference between the total way count of the Cache and the maximum hit way count; and I/O memory access processing is then performed according to that available way count. The available way count for I/O data in the Cache is thus updated dynamically in real time according to the CPU's use of the Cache, so that the space occupied by I/O data in the Cache can be adjusted dynamically, increasing the available way count for I/O data without significantly affecting other CPU threads. I/O memory access performance is thereby improved without degrading CPU performance, further improving the overall performance of the processor and the space utilization of the Cache.
Embodiment six
On the basis of embodiment five above, in this embodiment the first computing module includes: a receiving submodule, a first determining submodule, a second determining submodule, a first recording submodule, a third determining submodule, a first judging submodule, a comparing submodule, a second recording submodule and a fourth determining submodule.
The receiving submodule is configured to receive an access request, the access request carrying the target address to be accessed, and the target address including at least the index of a Cache set and the tag of a Cache line.
The first determining submodule is configured to determine, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sample set.
The second determining submodule is configured to, if the target Cache set accessed by the access request belongs to a sample set and the access request is an I/O write request, determine whether the monitoring register corresponding to the target Cache set is in the unused state.
The first recording submodule is configured to, if the monitoring register corresponding to the target Cache set is in the unused state, record the tag carried by the I/O write request in the monitoring register corresponding to the target Cache set, and mark that monitoring register as in use.
The third determining submodule is configured to, if the target Cache set accessed by the access request belongs to a sample set and the access request is a CPU access request, determine whether the monitoring register corresponding to the target Cache set is in the in-use state.
The first judging submodule is configured to, if the monitoring register corresponding to the target Cache set is in the in-use state, judge whether the CPU access request hits a Cache line in the target Cache set.
The comparing submodule is configured to, if the judgment result is that the CPU access request hits a Cache line in the target Cache set, compare the way number of the Cache line currently hit with the hit way count recorded in the monitoring register corresponding to the target Cache set.
The second recording submodule is configured to, if the way number of the Cache line currently hit is greater than the hit way count recorded in the monitoring register corresponding to the target Cache set, update the hit way count recorded in that monitoring register to the way number of the Cache line currently hit.
The fourth determining submodule is configured to, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register of the target Cache set, determine the hit way count recorded in that monitoring register as the maximum hit way count.
Optionally, the first computing module further includes a resetting submodule.
The resetting submodule is configured to reset the hit way count in the monitoring register corresponding to the target Cache set to 0, and to mark that monitoring register as unused.
The apparatus provided by this embodiment of the present invention can specifically execute the method embodiment provided by embodiment two above; its specific functions are not repeated here.
In this embodiment of the present invention, sample sets are preset and one monitoring register is provided for each sample set, so that only the memory accesses of the sample sets are tracked and recorded; this sampling approach saves hardware overhead while still allowing the maximum hit way count of CPU access requests in the cache memory (Cache), over the period from when the I/O data corresponding to an I/O write request are written until they are read, to be computed dynamically in real time. On this basis, the space occupied by I/O data in the Cache can be adjusted dynamically, increasing the available way count for I/O data without significantly affecting other CPU threads. I/O memory access performance is thus improved without degrading CPU performance, further improving the overall performance of the processor and the space utilization of the Cache.
Embodiment seven
On the basis of embodiment five or embodiment six above, in this embodiment the apparatus for an I/O device to access memory further includes: a judging module, a writing module, a second computing module and a determining module.
The judging module is configured to judge whether the I/O write request hits the Cache.
The writing module is configured to, if the I/O write request hits the Cache, write the I/O data corresponding to the I/O write request directly into the Cache.
The second computing module is configured to, if the I/O write request misses the Cache, compute the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request.
The determining module is configured to determine, according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is less than the available way count for I/O data in the Cache, whether the I/O write request is processed using direct cache access (DCA).
Optionally, the determining module includes a first processing submodule and a second processing submodule.
The first processing submodule is configured to, if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is less than the available way count for I/O data in the Cache, process the I/O write request using DCA.
The first processing submodule is further configured to, if the position to be replaced stores I/O data that have been accessed, process the I/O write request using DCA; the position to be replaced is the position of the Cache line that will be replaced next according to the replacement policy in use.
The second processing submodule is configured to, if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O write request is greater than or equal to the available way count for I/O data in the Cache, and the position to be replaced does not store I/O data that have been accessed, process the I/O write request using DMA.
The apparatus provided by this embodiment of the present invention can specifically execute the method embodiment provided by embodiment three above; its specific functions are not repeated here.
In this embodiment of the present invention, after an I/O write request is received, the I/O data corresponding to the request are written directly into the Cache when the request hits the Cache; when the request misses the Cache, whether the I/O write request is processed using direct cache access (DCA) is determined according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the request is less than the available way count for I/O data in the Cache, which improves processor performance.
Embodiment eight
On the basis of embodiment seven above, in this embodiment the first processing submodule is further configured to: initially use the LRU replacement policy as the Cache replacement policy; count a first difference between the number of Cache lines replaced without a second access and the number of Cache lines that received a second access before being replaced; and, when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without a second access is greater than the number of Cache lines that received a second access before being replaced, update the Cache replacement policy to the second replacement policy. The second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line have been accessed once, that Cache line is treated as the preferred replacement position, and target data subsequently written into the Cache are preferentially written to the preferred replacement position.
The first processing submodule is further configured to: after the Cache replacement policy has been updated to the second replacement policy, obtain a preset auxiliary tag directory, the auxiliary tag directory recording the indices of the sample sets and the tags of the Cache lines in the sample sets, and the Cache replacement policy corresponding to the auxiliary tag directory always being LRU; track the auxiliary tag directory and, during the tracking, count a second difference between the number of Cache line tags replaced without a second access and the number of Cache line tags that received a second access before being replaced; and, when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without a second access is less than the number of Cache line tags that received a second access before being replaced, update the Cache replacement policy back to the LRU replacement policy.
Optionally, the first processing submodule is further configured to: when the LRU replacement policy is in use, process I/O write requests using the write-back policy; when the second replacement policy is in use, process I/O write requests using the write-through policy.
The apparatus provided by this embodiment of the present invention can specifically execute the method embodiment provided by embodiment four above; its specific functions are not repeated here.
In this embodiment of the present invention, while I/O write requests are processed using DCA, whether the Cache replacement policy uses LRU or another replacement policy is determined from statistics on the number of Cache lines replaced without a second access and the number of Cache lines that received a second access before being replaced, realizing dynamic switching of the Cache replacement policy; and different write policies are used with the different Cache replacement policies, which greatly improves memory access efficiency and thereby the overall performance of the processor.
Embodiment nine
Fig. 5 is a schematic structural diagram of the computer device provided by embodiment nine of the present invention. As shown in Fig. 5, the device 50 includes: a processor 501, a memory 502, and a computer program stored on the memory 502 and executable by the processor 501. When executing the computer program stored on the memory 502, the processor 501 implements the method for an I/O device to access memory provided by any of the method embodiments above.
In this embodiment of the present invention, the maximum number of CPU access-request hits during the period from when the I/O data corresponding to an I/O memory access write request is written into the cache memory (Cache) until that data is read is calculated; according to this maximum hit number, the number of Cache ways available to I/O data is updated so that, after the update, the available way count equals the difference between the total way count of the Cache and the maximum hit number; I/O memory access processing is then performed according to that available way count. The available way count for I/O data is thus updated dynamically according to the CPU's real-time use of the Cache, so the space occupied by I/O data in the Cache can be adjusted dynamically without enlarging the impact on other CPU threads. I/O memory access performance is therefore improved without affecting CPU performance, which further improves the overall performance of the processor and the space utilization of the Cache.
In the several embodiments provided by the present invention, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a logical functional division, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division of the above functional modules is taken as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the device described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Those skilled in the art, after considering the specification and practicing the invention disclosed here, will readily conceive of other embodiments of the present invention. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (19)
1. A method for an I/O device to access memory, characterized by comprising:
receiving an I/O memory access write request, and calculating the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory access write request is written into a cache memory (Cache) until the I/O data is read;
updating, according to the maximum hit number, the number of Cache ways available to I/O data, so that after the update the available way count equals the difference between the total way count of the Cache and the maximum hit number;
performing I/O memory access processing according to the number of Cache ways available to I/O data.
2. The method according to claim 1, wherein receiving the I/O memory access write request and calculating the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory access write request is written into the cache memory (Cache) until it is read comprises:
receiving an access request, the access request carrying a target address to be accessed, the target address including at least the index of a Cache set and the tag of a Cache line;
determining, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sampling group, the sampling group including at least one Cache set;
if the target Cache set accessed by the access request belongs to the sampling group and the access request is an I/O memory access write request, determining whether the monitoring register corresponding to the target Cache set is in an unused state;
if the monitoring register corresponding to the target Cache set is in the unused state, recording the tag carried by the I/O memory access write request in the monitoring register corresponding to the target Cache set, and marking the monitoring register corresponding to the target Cache set as in use.
3. The method according to claim 2, wherein, after determining, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to the sampling group, the method further comprises:
if the target Cache set accessed by the access request belongs to the sampling group and the access request is a CPU access request, determining whether the monitoring register corresponding to the target Cache set is in use;
if the monitoring register corresponding to the target Cache set is in use, judging whether the CPU access request hits a Cache line in the target Cache set;
if the judgment result is that the CPU access request hits a Cache line in the target Cache set, comparing the hit count of the currently hit Cache line with the hit count recorded in the monitoring register corresponding to the target Cache set;
if the hit count of the currently hit Cache line is greater than the hit count recorded in the monitoring register corresponding to the target Cache set, updating the hit count recorded in the monitoring register corresponding to the target Cache set to the hit count of the currently hit Cache line;
if the judgment result is that the CPU access request misses the Cache lines in the target Cache set, then, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register corresponding to the target Cache set, taking the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit number.
4. The method according to claim 3, wherein, after taking the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit number, the method further comprises:
setting the hit count in the monitoring register corresponding to the target Cache set to 0;
marking the monitoring register corresponding to the target Cache set as unused.
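A software model of the monitoring-register lifecycle across claims 2-4 might look as follows. The patent describes a hardware register; the class, field, and method names here are illustrative assumptions made for the sketch.

```python
class MonitorRegister:
    """Per-sampled-set monitor: records the tag of one I/O line and the
    largest per-line CPU hit count seen while that I/O line is cached."""

    def __init__(self):
        self.in_use = False
        self.tag = None
        self.hit_count = 0

    def start(self, io_tag):
        # Claim 2: on an I/O write to a sampled set, capture the tag and
        # mark the register as in use. Returns False if already busy.
        if self.in_use:
            return False
        self.in_use = True
        self.tag = io_tag
        self.hit_count = 0
        return True

    def cpu_hit(self, line_hit_count):
        # Claim 3: track the maximum hit count of any line in the set.
        if self.in_use and line_hit_count > self.hit_count:
            self.hit_count = line_hit_count

    def finish(self):
        # Claims 3-4: when the monitored I/O line is read, report the
        # maximum hit count, zero the counter, and return to unused.
        max_hits = self.hit_count
        self.hit_count = 0
        self.tag = None
        self.in_use = False
        return max_hits
```

Usage mirrors the claimed flow: `start()` on the I/O write, `cpu_hit()` on each sampled CPU hit, and `finish()` when the CPU reads the monitored I/O data, yielding the maximum hit number used by claim 1.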
5. The method according to claim 1, wherein, after receiving the I/O memory access write request, the method further comprises:
judging whether the I/O memory access write request hits the Cache;
if the I/O memory access write request hits the Cache, writing the I/O data corresponding to the I/O memory access write request directly into the Cache;
if the I/O memory access write request misses the Cache, calculating the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request;
determining, according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is less than the available way count for I/O data in the Cache, whether to perform memory access processing on the I/O memory access write request in direct cache access (DCA) mode.
6. The method according to claim 5, wherein determining, according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is less than the available way count for I/O data in the Cache, whether to perform memory access processing on the I/O memory access write request in direct cache access (DCA) mode comprises:
if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is less than the available way count for I/O data in the Cache, performing memory access processing on the I/O memory access write request in DCA mode;
if the position to be replaced stores I/O data that has already been accessed, performing memory access processing on the I/O memory access write request in DCA mode, the position to be replaced referring to the position of the Cache line that will next be replaced according to the replacement policy in use;
if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is greater than or equal to the available way count for I/O data in the Cache and the position to be replaced does not store I/O data that has been accessed, performing memory access processing on the I/O memory access write request in DMA mode.
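The three branches of claim 6 reduce to one small decision function. A hedged sketch: the machine-translated claim leaves the conjunction between the two DMA conditions ambiguous, and this sketch assumes the reading that keeps the three branches mutually consistent (DMA only when both DCA conditions fail); all names are illustrative.

```python
def choose_transfer_mode(occupied_io_ways: int,
                         available_io_ways: int,
                         victim_holds_accessed_io: bool) -> str:
    """Return 'DCA' (inject the I/O write into the cache) or
    'DMA' (write it to memory instead).

    DCA applies when I/O data still has headroom in the target set, or
    when the victim line holds I/O data that was already consumed (so
    overwriting it evicts nothing useful); otherwise fall back to DMA.
    """
    if occupied_io_ways < available_io_ways:
        return "DCA"
    if victim_holds_accessed_io:
        return "DCA"
    return "DMA"

print(choose_transfer_mode(4, 11, False))   # headroom left -> DCA
print(choose_transfer_mode(11, 11, True))   # full, victim spent -> DCA
print(choose_transfer_mode(11, 11, False))  # full, victim live -> DMA
```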
7. The method according to claim 5, wherein, when performing memory access processing on the I/O memory access write request in DCA mode, the method further comprises:
initially using the LRU replacement policy as the Cache replacement policy;
counting a first difference between the number of Cache lines replaced without having been accessed a second time and the number of Cache lines accessed a second time before being replaced;
when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without having been accessed a second time is greater than the number of Cache lines accessed a second time before being replaced, updating the Cache replacement policy to a second replacement policy;
wherein the second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line has been accessed once, taking that Cache line as a preferred replacement position, and, when target data is subsequently written into the Cache, preferentially writing the target data to the preferred replacement position.
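The claim-7 policy switch can be modeled with two eviction counters. The threshold value and function name below are illustrative assumptions; only the comparison rule comes from the claim.

```python
def should_switch_to_second_policy(dead_on_evict: int,
                                   reused_before_evict: int,
                                   first_threshold: int) -> bool:
    """Claim 7: starting from LRU, switch to the second (MRU-like)
    replacement policy when the count of lines evicted without a second
    access exceeds the count of lines reused before eviction by at
    least `first_threshold` (the first preset threshold)."""
    first_difference = dead_on_evict - reused_before_evict
    return (first_difference >= first_threshold
            and dead_on_evict > reused_before_evict)

# Many lines dying unreused means LRU is wasting capacity on them:
print(should_switch_to_second_policy(120, 30, 50))  # True: switch
print(should_switch_to_second_policy(40, 30, 50))   # False: keep LRU
```

The design rationale: streaming I/O data is typically touched once, so under LRU it lingers for a full reuse-distance before eviction; once single-use lines dominate the eviction stream, replacing the just-used line first (MRU-like) frees that capacity immediately.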
8. The method according to claim 7, wherein, after the Cache replacement policy is updated to the second replacement policy, the method further comprises:
obtaining a preset auxiliary tag directory, the auxiliary tag directory recording the index of the sampling group and the tags of the Cache lines in the sampling group, the Cache replacement policy corresponding to the auxiliary tag directory always being the LRU replacement policy;
tracking the auxiliary tag directory, and counting a second difference between the number of Cache line tags replaced without having been accessed a second time during the tracking and the number of Cache line tags accessed a second time before being replaced;
when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without having been accessed a second time is less than the number of Cache line tags accessed a second time before being replaced, updating the Cache replacement policy back to the LRU replacement policy.
9. The method according to claim 7 or 8, wherein, when performing memory access processing on the I/O memory access write request in DCA mode, the method further comprises:
when the LRU replacement policy is used, performing memory access processing on the I/O memory access write request in write-back mode;
when the second replacement policy is used, performing memory access processing on the I/O memory access write request in write-through mode.
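Claim 9 pairs each replacement policy with a write mode. A sketch of the mapping and its rationale (the function is illustrative; the policy names are the claim's):

```python
def write_mode_for_policy(replacement_policy: str) -> str:
    """Claim 9: under LRU, I/O lines stay cached long enough that dirty
    data can be flushed lazily (write-back); under the second (MRU-like)
    policy, I/O lines are evicted soon after their single use, so
    pushing data to memory immediately (write-through) avoids paying a
    write-back penalty at eviction time."""
    if replacement_policy == "LRU":
        return "write-back"
    return "write-through"  # second replacement policy (MRU-like)
```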
10. A device for an I/O device to access memory, characterized by comprising:
a first computing module, configured to receive an I/O memory access write request and calculate the maximum number of CPU access-request hits during the period from when the I/O data corresponding to the I/O memory access write request is written into a cache memory (Cache) until it is read;
an update module, configured to update, according to the maximum hit number, the number of Cache ways available to I/O data, so that after the update the available way count equals the difference between the total way count of the Cache and the maximum hit number;
a memory access processing module, configured to perform I/O memory access processing according to the number of Cache ways available to I/O data.
11. The device according to claim 10, wherein the first computing module comprises:
a receiving submodule, configured to receive an access request, the access request carrying a target address to be accessed, the target address including at least the index of a Cache set and the tag of a Cache line;
a first determining submodule, configured to determine, according to the index carried by the access request, whether the target Cache set accessed by the access request belongs to a sampling group, the sampling group including at least one Cache set;
a second determining submodule, configured to determine, if the target Cache set accessed by the access request belongs to the sampling group and the access request is an I/O memory access write request, whether the monitoring register corresponding to the target Cache set is in an unused state;
a first recording submodule, configured to record, if the monitoring register corresponding to the target Cache set is in the unused state, the tag carried by the I/O memory access write request in the monitoring register corresponding to the target Cache set, and to mark the monitoring register corresponding to the target Cache set as in use.
12. The device according to claim 11, wherein the first computing module further comprises:
a third determining submodule, configured to determine, if the target Cache set accessed by the access request belongs to the sampling group and the access request is a CPU access request, whether the monitoring register corresponding to the target Cache set is in use;
a first judging submodule, configured to judge, if the monitoring register corresponding to the target Cache set is in use, whether the CPU access request hits a Cache line in the target Cache set;
a comparing submodule, configured to compare, if the judgment result is that the CPU access request hits a Cache line in the target Cache set, the hit count of the currently hit Cache line with the hit count recorded in the monitoring register corresponding to the target Cache set;
a second recording submodule, configured to update, if the hit count of the currently hit Cache line is greater than the hit count recorded in the monitoring register corresponding to the target Cache set, the hit count recorded in the monitoring register corresponding to the target Cache set to the hit count of the currently hit Cache line;
a fourth determining submodule, configured to determine, if the judgment result is that the CPU access request misses the Cache lines in the target Cache set, then, when the CPU access request hits the Cache line corresponding to the tag recorded in the monitoring register corresponding to the target Cache set, the hit count recorded in the monitoring register corresponding to the target Cache set as the maximum hit number.
13. The device according to claim 12, wherein the first computing module further comprises:
a reset submodule, configured to set the hit count in the monitoring register corresponding to the target Cache set to 0 and to mark the monitoring register corresponding to the target Cache set as unused.
14. The device according to claim 10, further comprising:
a judgment module, configured to judge whether the I/O memory access write request hits the Cache;
a writing module, configured to write, if the I/O memory access write request hits the Cache, the I/O data corresponding to the I/O memory access write request directly into the Cache;
a second computing module, configured to calculate, if the I/O memory access write request misses the Cache, the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request;
a determining module, configured to determine, according to whether the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is less than the available way count for I/O data in the Cache, whether to perform memory access processing on the I/O memory access write request in direct cache access (DCA) mode.
15. The device according to claim 14, wherein the determining module comprises:
a first processing submodule, configured to perform memory access processing on the I/O memory access write request in DCA mode if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is less than the available way count for I/O data in the Cache;
the first processing submodule being further configured to perform memory access processing on the I/O memory access write request in DCA mode if the position to be replaced stores I/O data that has already been accessed, the position to be replaced referring to the position of the Cache line that will next be replaced according to the replacement policy in use;
a second processing submodule, configured to perform memory access processing on the I/O memory access write request in DMA mode if the number of ways already occupied by I/O data in the target Cache set accessed by the I/O memory access write request is greater than or equal to the available way count for I/O data in the Cache and the position to be replaced does not store I/O data that has been accessed.
16. The device according to claim 15, wherein the first processing submodule is further configured to:
initially use the LRU replacement policy as the Cache replacement policy;
count a first difference between the number of Cache lines replaced without having been accessed a second time and the number of Cache lines accessed a second time before being replaced;
when the first difference is equal to or greater than a first preset threshold, if the number of Cache lines replaced without having been accessed a second time is greater than the number of Cache lines accessed a second time before being replaced, update the Cache replacement policy to a second replacement policy;
wherein the second replacement policy is the MRU replacement policy, or the second replacement policy is: after the I/O data in a Cache line has been accessed once, taking that Cache line as a preferred replacement position, and, when target data is subsequently written into the Cache, preferentially writing the target data to the preferred replacement position.
17. The device according to claim 16, wherein the first processing submodule is further configured to:
after the Cache replacement policy is updated to the second replacement policy, obtain a preset auxiliary tag directory, the auxiliary tag directory recording the index of the sampling group and the tags of the Cache lines in the sampling group, the Cache replacement policy corresponding to the auxiliary tag directory always being the LRU replacement policy;
track the auxiliary tag directory, and count a second difference between the number of Cache line tags replaced without having been accessed a second time during the tracking and the number of Cache line tags accessed a second time before being replaced;
when the second difference is equal to or greater than a second preset threshold, if the number of Cache line tags replaced without having been accessed a second time is less than the number of Cache line tags accessed a second time before being replaced, update the Cache replacement policy back to the LRU replacement policy.
18. The device according to claim 16 or 17, wherein the first processing submodule is further configured to:
when the LRU replacement policy is used, perform memory access processing on the I/O memory access write request in write-back mode;
when the second replacement policy is used, perform memory access processing on the I/O memory access write request in write-through mode.
19. A computer equipment, characterized by comprising: a processor and a memory;
and a computer program stored on the memory and executable by the processor;
wherein, when executing the computer program, the processor implements the method for an I/O device to access memory according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810240206.XA CN110297787B (en) | 2018-03-22 | 2018-03-22 | Method, device and equipment for accessing memory by I/O equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110297787A true CN110297787A (en) | 2019-10-01 |
CN110297787B CN110297787B (en) | 2021-06-01 |
Family
ID=68025548
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810240206.XA Active CN110297787B (en) | 2018-03-22 | 2018-03-22 | Method, device and equipment for accessing memory by I/O equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110297787B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000013091A1 (en) * | 1998-08-28 | 2000-03-09 | Alacritech, Inc. | Intelligent network interface device and system for accelerating communication |
US6353877B1 (en) * | 1996-11-12 | 2002-03-05 | Compaq Computer Corporation | Performance optimization and system bus duty cycle reduction by I/O bridge partial cache line write |
CN102298556A (en) * | 2011-08-26 | 2011-12-28 | 成都市华为赛门铁克科技有限公司 | Data stream recognition method and device |
CN104756090A (en) * | 2012-11-27 | 2015-07-01 | 英特尔公司 | Providing extended cache replacement state information |
CN104781753A (en) * | 2012-12-14 | 2015-07-15 | 英特尔公司 | Power gating a portion of a cache memory |
CN107368433A (en) * | 2011-12-20 | 2017-11-21 | 英特尔公司 | The dynamic part power-off of memory side cache in 2 grades of hierarchy of memory |
Non-Patent Citations (1)
Title |
---|
TANG Yixuan, "Research on Cache Optimization Strategies and Parallel Simulation for Multi-threaded Applications", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20220087459A (en) * | 2019-10-31 | 2022-06-24 | 어드밴스드 마이크로 디바이시즈, 인코포레이티드 | Cache Access Measure Deskew |
KR102709340B1 (en) | 2019-10-31 | 2024-09-25 | 어드밴스드 마이크로 디바이시즈, 인코포레이티드 | Cache Access Measurement Deskew |
US11880310B2 (en) | 2019-10-31 | 2024-01-23 | Advanced Micro Devices, Inc. | Cache access measurement deskew |
WO2021087115A1 (en) | 2019-10-31 | 2021-05-06 | Advanced Micro Devices, Inc. | Cache access measurement deskew |
EP4052133A4 (en) * | 2019-10-31 | 2023-11-29 | Advanced Micro Devices, Inc. | Cache access measurement deskew |
CN111026682B (en) * | 2019-12-26 | 2022-03-08 | 浪潮(北京)电子信息产业有限公司 | Data access method and device of board card chip and computer readable storage medium |
CN111026682A (en) * | 2019-12-26 | 2020-04-17 | 浪潮(北京)电子信息产业有限公司 | Data access method and device of board card chip and computer readable storage medium |
CN112069091B (en) * | 2020-08-17 | 2023-09-01 | 北京科技大学 | Memory access optimization method and device applied to molecular dynamics simulation software |
CN112069091A (en) * | 2020-08-17 | 2020-12-11 | 北京科技大学 | Access optimization method and device applied to molecular dynamics simulation software |
CN112181864B (en) * | 2020-10-23 | 2023-07-25 | 中山大学 | Address tag allocation scheduling and multipath cache write-back method for Path ORAM |
CN112181864A (en) * | 2020-10-23 | 2021-01-05 | 中山大学 | Address tag allocation scheduling and multi-Path cache write-back method for Path ORAM |
CN113392043A (en) * | 2021-07-06 | 2021-09-14 | 南京英锐创电子科技有限公司 | Cache data replacement method, device, equipment and storage medium |
CN114115746A (en) * | 2021-12-02 | 2022-03-01 | 北京乐讯科技有限公司 | Full link tracking device of user mode storage system |
Also Published As
Publication number | Publication date |
---|---|
CN110297787B (en) | 2021-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110297787A (en) | The method, device and equipment of I/O equipment access memory | |
CN107193646B (en) | High-efficiency dynamic page scheduling method based on mixed main memory architecture | |
CN104115133B (en) | For method, system and the equipment of the Data Migration for being combined non-volatile memory device | |
CN106909515B (en) | Multi-core shared last-level cache management method and device for mixed main memory | |
CN105095116B (en) | Cache method, cache controller and the processor replaced | |
US20010014931A1 (en) | Cache management for a multi-threaded processor | |
CN113424160A (en) | Processing method, processing device and related equipment | |
US20110252215A1 (en) | Computer memory with dynamic cell density | |
CN107066393A (en) | The method for improving map information density in address mapping table | |
CN110888600B (en) | Buffer area management method for NAND flash memory | |
US8793434B2 (en) | Spatial locality monitor for thread accesses of a memory resource | |
CN109582600B (en) | Data processing method and device | |
US11093410B2 (en) | Cache management method, storage system and computer program product | |
JP2018537770A (en) | Profiling cache replacement | |
CN110795363B (en) | Hot page prediction method and page scheduling method of storage medium | |
CN109684231A (en) | The system and method for dsc data in solid-state disk and stream for identification | |
JP2009524137A (en) | Cyclic snoop to identify eviction candidates for higher level cache | |
CN104714898B (en) | A kind of distribution method and device of Cache | |
CN111722797B (en) | SSD and HA-SMR hybrid storage system oriented data management method, storage medium and device | |
CN111078143B (en) | Hybrid storage method and system for data layout and scheduling based on segment mapping | |
CN103885890B (en) | Replacement processing method and device for cache blocks in caches | |
CN109478164A (en) | For storing the system and method for being used for the requested information of cache entries transmission | |
US7702875B1 (en) | System and method for memory compression | |
CN105359116B (en) | Buffer, shared cache management method and controller | |
Shi et al. | A unified write buffer cache management scheme for flash memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 100095 Building 2, Longxin Industrial Park, Zhongguancun Environmental Protection Technology Demonstration Park, Haidian District, Beijing; Applicant after: Loongson Zhongke Technology Co.,Ltd. Address before: 100095 Building 2, Longxin Industrial Park, Zhongguancun Environmental Protection Technology Demonstration Park, Haidian District, Beijing; Applicant before: LOONGSON TECHNOLOGY Corp.,Ltd. |
GR01 | Patent grant | |