RapidChiplet is an open-source toolchain that can quickly and accurately evaluate various chiplet architectures. It estimates seven key metrics of a chiplet design: latency, throughput, manufacturing cost, thermal stability, energy consumption, area, and complexity.
Designers can use these estimates to make better decisions about chiplet placement, interconnect topology, and chiplet size, ultimately leading to optimized chiplet architectures.
It can predict the seven most relevant metrics of chiplet architectures in a few microseconds for designs with tens of chiplets and in one second for designs with more than 300 chiplets. It integrates with BookSim2, a cycle-accurate network-on-chip (NoC) simulator, for more refined performance analysis.
The framework can be used as a cost function for both conventional optimization algorithms, such as simulated annealing, and new optimization methods based on machine learning.
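To make the idea of "a fast evaluator as a cost function" concrete, here is a minimal sketch of simulated annealing in Python. It does not call RapidChiplet: the toy problem, the `TRAFFIC` weights, and the functions `toy_cost` and `swap_two` are hypothetical stand-ins, and in a real setup `cost` would wrap whichever metric estimates the designer wants to minimize.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, steps=2000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Generic simulated annealing; `cost` stands in for a fast design evaluator."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Hypothetical toy problem: order four chiplets in a row so that pairs
# exchanging a lot of traffic end up close to each other.
TRAFFIC = {(0, 3): 5.0, (1, 2): 4.0, (0, 1): 1.0}


def toy_cost(order):
    position = {chiplet: slot for slot, chiplet in enumerate(order)}
    return sum(w * abs(position[a] - position[b]) for (a, b), w in TRAFFIC.items())


def swap_two(order, rng):
    # Neighbor move: swap the slots of two randomly chosen chiplets.
    i, j = rng.sample(range(len(order)), 2)
    new_order = list(order)
    new_order[i], new_order[j] = new_order[j], new_order[i]
    return tuple(new_order)


best_order, best_cost = simulated_annealing((0, 1, 2, 3), toy_cost, swap_two)
print(best_order, best_cost)
```

Because the optimizer calls `cost` once per step, the evaluator's runtime directly bounds how many candidate designs can be explored, which is why a fast estimator matters here.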
Some benefits of using chiplets are:
- heterogeneity: chiplets can be built in different technologies and reused across multiple products.
- modularity: chiplets can be arranged in different topologies to optimize the inter-chiplet interconnect (ICI).
- cost-effectiveness: chiplet-based designs can be cheaper to manufacture than larger monolithic chips.
A chiplet is a small integrated circuit that has a specific function and can be combined with other chiplets to form a larger and more complex chip. As the demand for faster, smarter, and cheaper computing devices grows, the traditional approach of building monolithic chips faces many challenges, such as high manufacturing cost, low yield, limited scalability, and increased power consumption.
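The link between die size, yield, and cost can be illustrated with a short calculation. The sketch below uses the textbook Poisson yield model, Y = exp(-A * D0), which is an assumption made for illustration and not necessarily the model RapidChiplet uses. The defect density `D0`, the die areas, and the unit cost are made-up values, and packaging and die-to-die interface overheads are ignored.

```python
import math


def poisson_yield(area_mm2, defects_per_mm2):
    """Fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_mm2 * defects_per_mm2)


def silicon_cost_per_good_die(area_mm2, defects_per_mm2, cost_per_mm2=1.0):
    """Silicon cost of one working die: raw cost divided by yield."""
    return area_mm2 * cost_per_mm2 / poisson_yield(area_mm2, defects_per_mm2)


D0 = 0.001  # assumed defect density in defects per mm^2 (illustrative only)

# One 800 mm^2 monolithic die versus four 200 mm^2 chiplets with the same total area.
monolithic = silicon_cost_per_good_die(800.0, D0)
chiplets = 4 * silicon_cost_per_good_die(200.0, D0)

print(f"monolithic: {monolithic:.1f}  four chiplets: {chiplets:.1f}")
```

Under these assumptions the four smaller dies cost less in silicon than the single large die, because yield falls exponentially with area. A real comparison would also have to account for packaging and interconnect costs, which work against the chiplet design.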
The performance of a chiplet architecture depends on several factors, such as:
- number of chiplets
- chiplet size
- placement of chiplets
- topology of the inter-chiplet interconnect
- on-chip traffic types (what kind of data the chiplets send and receive)
There are four categories of on-chip traffic types: compute-to-compute (C2C), compute-to-memory (C2M), compute-to-IO (C2I), and memory-to-IO (M2I) traffic (see the figure below).
To compute the latency of the four traffic types C2C, C2M, C2I, and M2I, the researchers used a graph representation with nodes (the chiplets and interposer-routers) and edges (the die-to-die links and links between interposer-routers).
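A minimal sketch of such a graph-based latency computation is shown below. It uses Dijkstra's algorithm over a small hand-built graph; the node names, link latencies (in arbitrary units), and helper functions are hypothetical, and the sketch only sums link latencies, ignoring any latency inside chiplets or routers that a full model would include.

```python
import heapq


def shortest_latency(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> list of (neighbor, link_latency)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        latency, node = heapq.heappop(queue)
        if node == target:
            return latency
        if latency > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, link_latency in graph.get(node, []):
            candidate = latency + link_latency
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")  # target not reachable


def add_link(graph, a, b, latency):
    """Add a bidirectional link (edge) between two nodes."""
    graph.setdefault(a, []).append((b, latency))
    graph.setdefault(b, []).append((a, latency))


# Nodes are chiplets and interposer-routers; edges are links between them.
graph = {}
add_link(graph, "compute0", "compute1", 2.0)
add_link(graph, "compute1", "memory0", 3.0)
add_link(graph, "compute0", "router0", 1.0)
add_link(graph, "router0", "memory0", 1.5)
add_link(graph, "router0", "io0", 4.0)

c2c = shortest_latency(graph, "compute0", "compute1")
c2m = shortest_latency(graph, "compute0", "memory0")
c2i = shortest_latency(graph, "compute0", "io0")
m2i = shortest_latency(graph, "memory0", "io0")
print(c2c, c2m, c2i, m2i)
```

For a whole design, the per-traffic-type latency would be an average over all relevant source and destination pairs, for example over every compute-memory pair for C2M.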
Existing tools that estimate the cost and performance of chiplet designs are often too slow to explore the many design possibilities. RapidChiplet addresses this problem by quickly estimating the most relevant metrics in a unified framework.
The RapidChiplet architecture
RapidChiplet is a set of Python scripts that perform different tasks, such as estimating the metrics, visualizing the chiplets and the chip designs, running the simulations, and reproducing the results from the paper.
The picture below shows the RapidChiplet architecture.
RapidChiplet takes a design file containing references to seven distinct input files. It then parses and validates these input files. It calculates seven distinct metrics and the results are stored in a result file. To minimize execution time, each metric can be enabled or disabled using command-line arguments. For more accurate estimations, RapidChiplet can also run simulations using the BookSim2 network-on-chip (NoC) simulator.
RapidChiplet follows a four-step pipeline to estimate the seven metrics of a chiplet architecture:
- Input processing: take a set of seven JSON files as input, which contain the information about the chip design.
- Input validation: check the input files for any errors or inconsistencies, such as overlapping chiplets, invalid keys, or non-existing links.
- Metrics computation:
  - Area calculation: calculate the area of each chiplet and the package based on their dimensions and the package type. It also adds some margins and gaps to account for the packaging overhead.
  - Power consumption estimation: estimate the power consumption and the leakage power of each chiplet and the Inter-Chiplet Interconnect (ICI).
  - Manufacturing cost estimation: estimate the manufacturing cost of each chiplet and the package, based on their area, technology node, yield, and package type.
  - Thermal stability estimation: estimate the maximum temperature of each chiplet and the ICI.
  - ICI latency and throughput estimation: estimate the average latency and throughput of the ICI, based on the shortest path routing, the traffic types, and the congestion in the network.
- Result generation: save the estimates for the seven metrics as a JSON file in the results folder. It also creates a plot that shows the chip design and the ICI topology, and saves it as a PDF file in the visualizations folder.
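Two of the steps above, checking for overlapping chiplets and computing a package area with margins, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RapidChiplet's code: the placement format `(x, y, width, height)`, the function names, the chiplet names, and the uniform `margin` are all hypothetical.

```python
from itertools import combinations


def rectangles_overlap(a, b):
    """Each rectangle is (x, y, width, height); touching edges do not count."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def find_overlaps(placement):
    """Return all pairs of chiplet names whose rectangles overlap."""
    return [
        (name_a, name_b)
        for (name_a, rect_a), (name_b, rect_b) in combinations(placement.items(), 2)
        if rectangles_overlap(rect_a, rect_b)
    ]


def package_area(placement, margin=0.0):
    """Area of the bounding box around all chiplets plus a margin on each side."""
    x_min = min(x for x, _, _, _ in placement.values())
    y_min = min(y for _, y, _, _ in placement.values())
    x_max = max(x + w for x, _, w, _ in placement.values())
    y_max = max(y + h for _, y, _, h in placement.values())
    return (x_max - x_min + 2 * margin) * (y_max - y_min + 2 * margin)


# Hypothetical placements in millimetres.
valid = {"compute0": (0, 0, 10, 10), "compute1": (11, 0, 10, 10)}
invalid = dict(valid, memory0=(5, 5, 10, 10))  # overlaps both compute chiplets

print(find_overlaps(valid))
print(find_overlaps(invalid))
print(package_area(valid, margin=1.0))
```

The pairwise check is quadratic in the number of chiplets, which is acceptable for designs with a few hundred chiplets; a spatial index would only be worth considering for much larger designs.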
RapidChiplet was tested on a set of chip designs built around a square grid of compute chiplets, combined with memory chiplets and IO chiplets.
The team varied the size of the grid from 2×2 to 16×16 compute chiplets and used two topologies, which can be seen in the next picture.
The figure below shows the time RapidChiplet spends on input reading, input validation, and computing six metrics of chiplet architectures: area, power, cost, length of the die-to-die links, latency, and throughput.
The thermal analysis is not included in the figure, because it depends on other factors, such as the grid resolution.
The results show that RapidChiplet can estimate all metrics in a very short time, from one millisecond for the smallest design (with 2×2 chiplets) to one second for the largest design (with 16×16 chiplets). The user can also choose which metrics to estimate, which can reduce the total runtime.
RapidChiplet is a toolchain that quickly and accurately estimates seven key metrics of chiplet designs, helping designers compare and optimize different chiplet architectures.
RapidChiplet is free, open-source, and modular. You can access the code here.
- Research paper: “RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures” (on arXiv)
- Website & code