RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 12/148,940, filed Apr. 23, 2008, now U.S. Pat. No. 8,078,784, by the same inventor, which is incorporated herein by reference in its entirety.
FIELD OF THE INVENTION
The present invention relates to methods and apparatuses for a system on a chip (SOC) and is particularly concerned with data movement.
BACKGROUND OF THE INVENTION
Current Systems on a Chip (SOCs) have grown more complicated than systems of the past, and future systems will be more complex still. Complexity grows in several directions:
- 1) The system attempts to provide more features. In this situation, the number of blocks in the system increases to support the additional features.
- 2) The system attempts to do more on current metrics. For example, a camera may have more mega-pixels. Another example would be a turbo decoder that is upgraded to have a larger throughput. In this situation, each block either gets more complex or runs faster.
- 3) The system combines multiple legacy systems. For example, a simple phone has become a mobile phone, a camera, and a music player.
As systems become complex, one notices several trends:
- 1) There are many blocks that are similar across the system (e.g., multiple Turbo decoders for the different modes). The resulting system has similar blocks in several areas; however, due to the architecture, the blocks cannot be reused across the different modes.
- 2) Processors run faster and get larger to provide the required processing power. The new processors take a larger gate count, and the power requirement for the processor increases. Furthermore, as the processor becomes more complicated, the interface requirements become more complicated, and the time/cycles required to communicate outside the processor increase as the interface grows more complex.
- 3) Systems become an amalgam of disparate systems. The interfaces between the disparate systems are ad hoc and inefficient.
- 4) The power required to run the system increases.
- 5) Interfaces between each block become more specialized and cannot be reused even though the functions may be similar.
Referring to FIG. 1, there is illustrated in a block diagram a typical System on a Chip (SOC) design 10. SOCs typically have:
- 1) Processor(s)
- 2) Memory
- 3) Blocks/Peripherals
- 4) Busses
In the above, the blocks and processors are connected via the interconnect bus. Also, there are many disparate busses. As the number of blocks on a bus increases, the throughput decreases and/or the latency increases since the loading on the bus increases. Bridges are used to split the busses up so that blocks that can tolerate a larger latency or lower bandwidth can be "moved further away."
In the example in FIG. 1, the processor, the DMA, the graphics accelerator, the on-board memory, the memory controller, and the three bridges are on the main bus. Typically, this is the fastest bus with the highest bandwidth. However, there are too many blocks in the system to put them all on the main bus; therefore, the three bridges provide bridging services for the other blocks. There is a slow bus with three blocks to support external interfaces, and another bus to the communication system to talk to Block 1 to Block n. Also, there is a legacy system that is connected through the Legacy System Bridge.
There are many examples of busses; some of the popular ones are APB, AHB, and OCP. One hallmark of these busses is that registers and memories are memory mapped.
Also, blocks/peripherals that need to communicate information, status, or timing to another block use an ad hoc scheme to communicate. This ad hoc scheme is typically customized for the specific interface and cannot be reused for another interface. In FIG. 1, I/F #1 and I/F #n are the ad hoc interfaces connecting the blocks so that the blocks can communicate. The interrupts are not shown in FIG. 1.
Referring to FIGS. 2 and 3, there is illustrated an example of an ad hoc interface 20 and its timing diagram 30 for a block (for example, Block #n) in FIG. 1. The ad hoc interface in this example is for a Viterbi decoder and is specialized for it; this interface would not work for another block.
Blocks that need to communicate to the processor communicate in one of two ways:
- 1) The processor polls the blocks continuously.
- 2) The block interrupts the processor either directly or indirectly, and the processor then retrieves the information.
Furthermore, blocks can talk directly to other blocks. Typically, they share a tightly coupled interface (i.e., another specialized interface), often with a tight handshake protocol.
Systems and methods disclosed herein provide a method and apparatus for data movement in a system on a chip to obviate or mitigate at least some of the aforementioned disadvantages.
SUMMARY OF THE INVENTION
An object of the present invention is to provide an improved method and apparatus for data movement in a system on a chip.
In accordance with an aspect of the present invention there is provided a method of configuring a system on chip comprising the step of providing a destination of a message/packet at a predetermined time.
In accordance with another aspect of the present invention there is provided a system comprising a plurality of blocks, each block comprising any hardware element, and a plurality of segments for providing interconnection of the plurality of blocks.
In an embodiment of the present invention there is provided a system comprising a plurality of systems on chip, each system on chip including a plurality of blocks, each block comprising any hardware element, and a plurality of segments for providing interconnection of the plurality of blocks, at least two systems on chip being connected via a segment that extends outside the boundaries of each of the two systems on chip.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be further understood from the following detailed description with reference to the drawings in which:
FIG. 1 illustrates in a block diagram a typical System on a Chip design (SOC);
FIG. 2 illustrates an ad hoc interface of a block of FIG. 1;
FIG. 3 illustrates in a timing diagram the ad hoc interface of FIG. 2;
FIG. 4 illustrates a system on chip in accordance with an embodiment of the present invention;
FIG. 5 illustrates how the example of FIG. 2 is implemented in accordance with an embodiment of the present invention; and
FIG. 6 illustrates how the implementation of FIG. 5 may be updated in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring to FIG. 4, there is illustrated a System on Chip (SOC) in accordance with an embodiment of the present invention. The system on chip has different blocks, including processors, that communicate with each other. For example, a system 100 includes blocks 102, 104, 106 and segments 110, 112, 114, where a block is defined as any hardware element. A block may include one or more processors. A block may be a processor 120; in fact, any of the blocks 102, 104, and 106 may be processors. A block has one or more ports. There is no minimum or maximum size of a block. Each block communicates via a unified interface.
The system simplifies the interface of both the processor 120 and the blocks 102, 104, 106 through the unified interface. The concept of the segments that connect the processor(s) and blocks to each other is based upon the unified interface for blocks and the application of a hierarchy of segments to provide scalable bandwidth. A processor 120 is treated just like any other block. The implementation of the system may, in fact, not include a processor.
Embodiments of the present invention also segment the communication traffic between the blocks and processors (i.e. block-to-block, block-to-processor, processor-to-block, and processor-to-processor).
The segments can scale at different levels of the hierarchy. A segment is the connector between multiple blocks, as shown by segments 114 and other segments as shown by segments 110 and 112. A segment (not shown in FIG. 4) can also join a mix of segments and blocks at the same level of hierarchy. The segments are connected via the ports of the block or other segments. The ports use a common unified interface.
The communication of the blocks is packet based. At a minimum, the packet includes a destination block. The packet may also include data, packet/message identification, padding, etc. Packets are used to carry the message that contains information to be sent from one block to another. A single message may span multiple packets.
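The packet structure described above can be sketched as follows. This is an illustrative assumption only; the field names and sizes are not taken from the source, and only the destination field is stated as required:

```python
from dataclasses import dataclass

# Hypothetical packet layout. Only the destination is required; the other
# fields (data, message identification, padding) are optional, per the text.
@dataclass
class Packet:
    dest: int            # address of the destination block (required)
    msg_id: int = 0      # optional packet/message identification
    data: bytes = b""    # optional payload
    padding: int = 0     # optional padding

def split_message(dest, msg_id, payload, max_payload=8):
    """A single message may span multiple packets."""
    return [Packet(dest, msg_id, payload[i:i + max_payload])
            for i in range(0, len(payload), max_payload)]
```

Splitting a 20-byte message with an 8-byte payload limit, for instance, would yield three packets that all carry the same destination address.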
The blocks can have one or more associated addresses. For example, providing a block with two separate addresses facilitates the segregation of control and data on two separate ports. In other instances, a block may have multiple ports but only one address (e.g., when it is desirable to increase the data rate). A block has one or more input ports and one or more output ports; a single port can be both if desired. The number of input and output ports for either a block or a segment does not have to be the same. A block can also have only an input or only an output.
In another embodiment, the system on chip has multiple segments where:
- a). A segment is the connector between multiple blocks and/or other segments. The segments are connected via the ports of the blocks and/or other segments.
- b). Segments have the ability to route the packets to the correct destinations.
- c). The routes do not have to be unique.
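A minimal sketch of the routing behavior listed above, assuming a simple table keyed by destination address (the table shape and method names are illustrative assumptions, not the claimed implementation):

```python
class Segment:
    """Connects blocks and/or other segments via ports and routes
    packets toward their destination block."""
    def __init__(self):
        self.routes = {}   # destination address -> list of candidate ports

    def add_route(self, dest, port):
        # Routes do not have to be unique: several ports may reach one
        # destination, and any of them is an acceptable choice.
        self.routes.setdefault(dest, []).append(port)

    def route(self, dest):
        # Pick any port that reaches the destination (here, the first one).
        return self.routes[dest][0]
```

Because routes need not be unique, two ports registered for the same destination are both valid, and the selection policy is left to the implementation.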
In another embodiment, the system on a chip includes:
- a). Each of the different blocks and segments may have different (or the same) properties.
- b). Properties include but are not limited to clocks, bandwidth, bit widths, and latencies.
- c). Properties describing block-to-block logical connections do not have to be the same.
In another embodiment, the system on a chip includes
- a). If multiple segments exist, multiple packets can be active on different segments.
- b). On a single segment, multiple packets can be active on different ports of the segment.
In another embodiment, there is provided a realization of any or all of the above wherein:
- a). Multiple messages/packets can exist on block-to-block communications.
- b). Certain block-to-block pairs may or may not communicate even though a logical connection can exist.
In another embodiment, the system on a chip includes
- a). The destination of the message/packet is not known to the block until a later time.
- b). The later time includes, but is not limited to:
- i. After fabrication.
- ii. After a code update.
- iii. After provisioning.
- iv. After measuring or reading states in the environment.
- v. After a functionality change.
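One way to realize the late binding listed above is to hold the destination in a writable register that is programmed at the later time (after fabrication, provisioning, a code update, and so on). The sketch below is an assumed illustration, not the claimed implementation; the names are hypothetical:

```python
class Block:
    """Block whose output destination is unknown at design time and is
    bound later, e.g. at provisioning or after a code update."""
    def __init__(self):
        self.dest_reg = None          # destination not yet known

    def set_destination(self, dest):
        self.dest_reg = dest          # bound at the "later time"

    def emit(self, data):
        if self.dest_reg is None:
            raise RuntimeError("destination not yet provisioned")
        return (self.dest_reg, data)  # (destination, payload) pair
```

Until `set_destination` is called, the block cannot address its output; afterwards, every emitted packet carries the newly bound destination.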
In another embodiment, the system on a chip includes
- a). The data and the address of the block are transmitted on the same interface, or
- b). The data and the address of the block are transmitted on different interfaces.
In another embodiment, the system includes multiple SOCs wherein each can be connected via a segment that extends outside the boundaries of a SOC.
Referring to FIG. 5, there is illustrated how the example of FIG. 2 is implemented in accordance with an embodiment of the present invention. From the example given in FIG. 2, the following changes are made to the block; the actual computation engine of the Viterbi decoder needs no change. In the example, we take a segment containing the source of the data for the Viterbi decoder (i.e., the de-interleaver), the Viterbi block itself, and the destination of the Viterbi's output (i.e., the decrypter).
The interface of each of these blocks (including the Viterbi decoder) could be identical, as shown in Table A:
TABLE A

  Signal        Direction  Type    Description
  M_DATA[7:0]   OUT        Data    Output Data.
  M_ENABLE      OUT        Enable  Signals when M_DATA is valid.
                                   Used to flush interface out. Useful at
                                   startup.
                                   Clock for the interface. The CLK is an
                                   output if the interface is in Source
                                   Synchronous Mode. The CLK for the Master
                                   and Slave shall be phase and frequency
                                   aligned.
                                   Signal used by the Slave to indicate
                                   that the interface is busy.
  S_DATA[7:0]   IN         Data    Input Data.
  S_ENABLE      IN         Enable  Signals when S_DATA is valid.
                                   Used to flush interface out.
                                   Clock. Can be the same net as M_CLK.
                                   Signal used by the Slave to indicate
                                   that the interface is busy.
The following steps demonstrate the data flow. This is a basic example that does not demonstrate the full power of running the blocks in parallel.
- a) Through its master interface (M_*), the de-interleaver (e.g. source) sends the block (can be broken up into multiple blocks) to be decoded to the decoder on the decoder's slave interface (S_*).
- b) Upon receiving the data, the decoder engine starts and decodes the data.
- c) When the decoder is finished, the decoder sends the decoded data on its master interface to the slave interface of the decrypter.
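The three steps above can be sketched as a chain of blocks sharing one slave-side entry point. The transforms below are stand-ins for illustration, not the actual decoder or decrypter logic:

```python
class Block:
    """Block with a unified interface: data arrives on the slave side (S_*),
    the engine runs, and the result is forwarded on the master side (M_*)."""
    def __init__(self, engine, downstream=None):
        self.engine = engine            # the block's computation engine
        self.downstream = downstream    # next block's slave interface

    def slave_receive(self, data):
        result = self.engine(data)      # engine starts upon receiving data
        if self.downstream is not None: # forward on the master interface
            return self.downstream.slave_receive(result)
        return result

# Stand-in engines; real blocks would actually decode and decrypt.
decrypter = Block(lambda d: d[::-1])
decoder = Block(lambda d: d.upper(), downstream=decrypter)
de_interleaver = Block(lambda d: d, downstream=decoder)
```

Because every block exposes the same slave-side entry point, the three blocks chain together without any custom glue between them.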
A typical problem occurs when after a first generation of a product is produced, the next generation of product needs more features. For example, the new feature is to run the system with twice the amount of data.
Pre-Data Highway, the entire system would have to be re-architected. However, with the Data Highway, the large problem can be broken down into many smaller problems. In this example, since the de-interleaver and decrypter are simple enough, they can handle the increase in data rates; the decoder, however, cannot. Without a redesign, one can place two decoders and time-share between them. Since the interface for the blocks (shown in Table A) is the same, this has minimal (if any) effect on the de-interleaver or decrypter.
Referring to FIG. 6, there is illustrated how the implementation of FIG. 5 may be updated in accordance with an embodiment of the present invention. The following steps demonstrate the data flow. This is a basic example that does not demonstrate the full power of running the blocks in parallel.
- a) Through its master interface (M_*), the de-interleaver (e.g. source) sends the block (can be broken up into multiple blocks) to be decoded to the decoder on the decoder's slave interface (S_*).
- b) Upon receiving the data, the decoder engine # 1 starts and decodes the data.
- c) Meanwhile, the de-interleaver sends its next block to decoder # 2. The data is sent via the master interface of the de-interleaver to the slave interface of decoder # 2.
- d) Upon receiving the data, the decoder engine # 2 starts and decodes the data.
- e) When the decoder # 1 is finished, the decoder sends the decoded data on its master interface to the slave interface of the decrypter.
- f) When the decoder # 2 is finished, the decoder sends the decoded data on its master interface to the slave interface of the decrypter.
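Because both decoders present the identical interface, the de-interleaver can simply alternate successive data blocks between them, as in steps a) through f). The round-robin scheduling below is an assumption for illustration, and the decoder engines are stand-ins:

```python
from itertools import cycle

def time_share(data_blocks, decoders):
    """Round-robin successive data blocks across identical decoders:
    decoder #1 works on one block while decoder #2 receives the next."""
    picker = cycle(decoders)
    return [next(picker)(blk) for blk in data_blocks]

# Two stand-in decoder engines with the same interface.
decoder_1 = lambda d: d.upper()
decoder_2 = lambda d: d.upper()
results = time_share(["ab", "cd", "ef"], [decoder_1, decoder_2])
```

Doubling the number of decoders in this way roughly doubles the sustainable data rate without redesigning the de-interleaver or the decrypter.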
Numerous modifications, variations and adaptations may be made to the particular embodiments described above without departing from the scope of the patent disclosure, which is defined in the claims.