# Physical Design Implementation of a complex IP core aimed at area optimization

## Javateertha Karekar<sup>1</sup>, Ramana Reddy<sup>2</sup>

<sup>1</sup> 6<sup>th</sup> Sem, M.Tech in VLSI Design & Embedded System, VTU Regional Centre, UTL Technologies, B'lore-22

<sup>2</sup> Associate Professor, Department of E&CE, VTU Extension Centre, UTL Technologies Ltd, B'lore-22

**Abstract** - Today's electronic systems are shrinking: gadgets are expected to be as small as possible and to consume as little power as possible while still delivering high performance. With the advent of smaller technology nodes, entire systems are being integrated on a single chip, giving rise to the System on Chip (SoC). The reduction in channel length yields smaller transistors, so that roughly double the number of transistors can be placed in the same unit area. This has given designers the opportunity to add more and more application features. In this paper we demonstrate different physical design techniques that optimize area and fit the same design into a smaller die without compromising on design features. With these techniques one can reduce the design area and thereby increase the chip margin from a silicon point of view. We start with a conventional implementation flow, apply the different methods on the reduced area, and show that the design is routable with minimal shorts.


Key Words: Physical Design, Placement, Routing, congestion

## **1. INTRODUCTION**

As per Moore's law, today's ICs pack more and more features into the same unit area. At the same time, the demand to add as many design features as possible keeps increasing, which calls for more silicon area. But the cost of silicon is directly proportional to its area. Hence silicon area plays an important role for any SoC, from both a features and a cost point of view. A chip in the 32nm technology node can carry roughly double the features of one in the 45nm node, and the main reason for such feature-rich chips today is the advancement in chip fabrication. A die that once contained tens of millions of transistors now contains hundreds of millions in the same die size. Chips are becoming denser day by day. Another constraint on the designer is to route these complex designs with fewer metal layers, since the more metal layers a design uses, the higher its cost. The combined constraints of a smaller die size and fewer metal layers result in a highly congested design that is difficult to route. Although technology nodes keep advancing, designing highly complex chips carries a corresponding cost. Hence the area of any chip becomes a major constraint, and optimizing area becomes an important strategy for physical design engineers.

## **1.1 Traditional Physical Design Implementation**


In the traditional ASIC physical design flow the starting point is the netlist. Synthesis is done using a wire-load model, and the load on each logic gate is calculated from its fan-out. This leads to an optimistic and less accurately structured netlist. When such a netlist is taken to physical design, physical optimization ends up adding buffers and inverters to meet the transition requirements on the wires. Because physical design uses the actual wire resistance and capacitance for timing calculation, the buffering requirement does not match the wire-load model, and the traditional implementation ends with an increase in area. Reducing the area within the traditional flow also takes considerable effort.
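To make the mismatch concrete, the sketch below contrasts a fan-out based wire-load estimate with the capacitance of a routed wire. The lookup table, the extrapolation slope, the per-micron capacitance and the routed length are all hypothetical values chosen for illustration; they are not taken from the 28nm library used in this work.

```python
# Hypothetical wire-load table: fan-out -> estimated net length in um.
WIRE_LOAD_TABLE_UM = {1: 10.0, 2: 25.0, 3: 40.0, 4: 60.0}
SLOPE_UM_PER_FANOUT = 20.0   # hypothetical extrapolation slope beyond the table
CAP_FF_PER_UM = 0.2          # hypothetical wire capacitance per um


def wlm_length_um(fanout: int) -> float:
    """Estimate net length from fan-out alone, as a wire-load model does."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    max_fo = max(WIRE_LOAD_TABLE_UM)
    if fanout <= max_fo:
        return WIRE_LOAD_TABLE_UM[fanout]
    # Beyond the table, extend linearly with the slope.
    return WIRE_LOAD_TABLE_UM[max_fo] + (fanout - max_fo) * SLOPE_UM_PER_FANOUT


def wire_cap_ff(length_um: float) -> float:
    """Wire capacitance for a given length."""
    return length_um * CAP_FF_PER_UM


fanout = 3
estimated = wire_cap_ff(wlm_length_um(fanout))
routed = wire_cap_ff(180.0)   # hypothetical routed length after a detour
print(f"wire-load estimate: {estimated:.1f} fF, routed: {routed:.1f} fF")
```

In this made-up case the routed wire presents several times the load that synthesis assumed, which is the kind of gap that physical optimization later closes by adding buffers.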

## **1.2 Our Contribution**

In this paper we demonstrate the physical design of a complex IP core in the 28nm technology node. The main intent of the implementation is to optimize for area. The implementation specification is as follows:

- The estimated gate count of the core is nearly 600K placeable instances.
- The base line area of this core is 5.8mm<sup>2</sup>.
- Target technology node of implementation is 28nm.
- Metal stack used is 8 metal layers, 6 metals for signal route and 2 metals for power route.
- Power target is 12uW leakage and 40mW dynamic.
- All flops in the design are clock gated to save dynamic Power.

The area of the core is reduced by 10%, to 5.22mm<sup>2</sup>. Several techniques are used in the physical design methodology to achieve this reduction. A few of the techniques implemented are listed below.

- Use of physical-aware synthesis to obtain the best placement of core logic and reduce congestion.
- Choice of less pessimistic transition limits to reduce buffer and inverter counts.
- Use of 1x2 vias in the power plan to increase the routing resources.

The baseline run with 5.8mm<sup>2</sup> area was implemented first; a summary of the results is shown in Table 1.

**Table -1:** PnR results of 5.8mm<sup>2</sup> area

| Area              | Core Utilization | Horizontal Overflow | Vertical Overflow | Shorts | Buffers | Inverters | Total Cells |
|-------------------|------------------|---------------------|-------------------|--------|---------|-----------|-------------|
| 5.8mm<sup>2</sup> | 80.27%           | 43121               | 0                 | 544    | 226655  | 99993     | 966067      |

## **2. WIRE LOAD MODEL SYNTHESIS**

The first experiment used wire-load model synthesis, with the area reduced by 10% for physical design implementation. Because the area was reduced for the same design, the utilization rose sharply, from 80.27% to 89.61%. The results are shown in Table 2.

**Table -2:** PnR results of 5.2mm<sup>2</sup> area

| Area              | Core Utilization | Horizontal Overflow | Vertical Overflow | Shorts | Buffers | Inverters | Total Cells |
|-------------------|------------------|---------------------|-------------------|--------|---------|-----------|-------------|
| 5.2mm<sup>2</sup> | 89.61%           | 175516              | 21153             | 64436  | 228234  | 98591     | 966240      |
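The jump in utilization follows from simple arithmetic. As a rough check, the sketch below rescales the baseline utilization from Table 1, assuming that the total placed cell area is unchanged between the two runs and that the core area scales with the quoted area. Both are simplifying assumptions.

```python
def scaled_utilization(base_util_pct: float, base_area_mm2: float,
                       new_area_mm2: float) -> float:
    """Utilization after a change in area, with placed cell area held constant."""
    return base_util_pct * base_area_mm2 / new_area_mm2


base_area = 5.8                    # mm^2, baseline run (Table 1)
reduced_area = base_area * 0.90    # 10% reduction
expected = scaled_utilization(80.27, base_area, reduced_area)
print(f"reduced area: {reduced_area:.2f} mm^2, expected utilization: {expected:.2f}%")
```

The estimate of about 89.2% is close to the 89.61% reported in Table 2. This simple model does not account for the remaining difference.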

## **3. PHYSICAL AWARE SYNTHESIS**

In earlier tools, physical information was used only after the netlist had been created. As a result, the large majority of flows started from a netlist created without it. Traditional synthesis starts from RTL and uses a wire-load model to estimate net delay. The use of the wire-load model resulted in a big gap in physical correlation. A lot of incremental optimization was required before going to PnR, and PnR never converged quickly; it took many iterations to converge the design.

In physical-aware synthesis, interconnect delays are modelled right from the start. The physical layout abstract and congestion are considered during RTL-to-gate synthesis, and a better netlist is created. This netlist can then be used in PnR to get a closer correlation with physical data. The barrier between physical and logical data is thus removed.
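One common way to bring placement information into a net-length estimate is the half-perimeter wirelength (HPWL) of the bounding box around a net's pins. The sketch below only illustrates that idea with hypothetical pin coordinates; it does not describe how the synthesis tool used here models interconnect.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hpwl_um(pins: Iterable[Point]) -> float:
    """Half-perimeter wirelength of the bounding box enclosing all pins."""
    pins = list(pins)
    if not pins:
        raise ValueError("a net needs at least one pin")
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical net: one driver and two loads, coordinates in um.
net_pins = [(10.0, 20.0), (110.0, 35.0), (60.0, 95.0)]
print(f"HPWL estimate: {hpwl_um(net_pins):.1f} um")
```

Unlike a fan-out lookup, this estimate changes when the cells move, which is why a placement-based length correlates better with the routed result.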

Table 3 shows the PnR results for the 5.2mm<sup>2</sup> area where physical design was implemented using the physical-aware synthesized netlist. The shorts dropped sharply, from 64436 to 8865.

**Table -3:** PnR results of 5.2mm<sup>2</sup> area with DCT netlist

| Area              | Core Utilization | Horizontal Overflow | Vertical Overflow | Shorts | Buffers | Inverters | Total Cells |
|-------------------|------------------|---------------------|-------------------|--------|---------|-----------|-------------|
| 5.2mm<sup>2</sup> | 89.87%           | 88658               | 0                 | 8865   | 308318  | 102760    | 1032135     |

## **4. TRANSITION LIMIT SETTING**

A tighter transition limit increases area because the tool meets it through fixes such as the following:

- Driver cells get upsized
- Cells are moved closer together to shorten long routed wires
- Buffers are inserted
- Loads are split using buffers to decrease the fan-out

Setting a very tight transition limit therefore results in an increase in area. Bad transitions also need root cause analysis: barriers such as routing blockages, macros, or fixed cell attributes cause detours and long routed nets, which increase the load on the connected driver.

Hence for our design we set a transition limit of 300ps for data nets and 160ps for clock nets. The resulting PnR results are shown in Table 4. The shorts are reduced to about half (from 8865 to 4128) and the buffer count is also reduced (from 308318 to 286535).

**Table -4:** PnR results of 5.2mm<sup>2</sup> area with DCT netlist and less pessimistic transition limit

| Area              | Core Utilization | Horizontal Overflow | Vertical Overflow | Shorts | Buffers | Inverters | Total Cells |
|-------------------|------------------|---------------------|-------------------|--------|---------|-----------|-------------|
| 5.2mm<sup>2</sup> | 89.87%           | 85274               | 0                 | 4128   | 286535  | 102760    | 1032135     |
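The link between the transition limit and the buffer count can be illustrated with a first-order model. The sketch assumes a lumped RC load, for which the 10% to 90% transition time is ln(9)·R·C. It ignores wire resistance, assumes every repeater has the same drive resistance, and uses hypothetical values for driver resistance, wire capacitance and pin capacitance. It shows the trend only and does not reproduce the buffer counts in the tables.

```python
import math


def buffers_needed(net_length_um: float, limit_ps: float,
                   r_drv_kohm: float = 2.0,
                   c_wire_ff_per_um: float = 0.2,
                   c_pin_ff: float = 1.0) -> int:
    """Estimate repeaters needed so each wire segment meets the transition limit.

    All default electrical values are hypothetical. kOhm * fF gives ps.
    """
    # Largest load a driver can see while meeting the limit: t = ln(9) * R * C.
    c_max_ff = limit_ps / (math.log(9) * r_drv_kohm)
    # Wire length that fits in that budget once the receiving pin is accounted for.
    max_segment_um = (c_max_ff - c_pin_ff) / c_wire_ff_per_um
    if max_segment_um <= 0:
        raise ValueError("limit too tight for this driver and pin load")
    return max(0, math.ceil(net_length_um / max_segment_um) - 1)


# 300 ps is the data-net limit used in this design; 150 ps is a hypothetical tighter limit.
for limit_ps in (300.0, 150.0):
    count = buffers_needed(1000.0, limit_ps)
    print(f"limit {limit_ps:.0f} ps -> {count} buffers on a 1000 um net")
```

Under these assumptions, halving the limit more than doubles the repeaters on the same net, which is why a limit that is tighter than necessary costs area across the whole design.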

## **5. POWER MESH OPTIMIZATION**

The power mesh plays an important role in meeting the design IR drop target and in signal routability. This section deals with the routing challenges posed by a robust power mesh and the practices we followed. There is always a tradeoff between the robust power mesh required for IR drop and the detouring of signal routes that such a mesh causes.

A robust power mesh poses the following challenges:

- Pin access is blocked
- More routing resources and tracks are consumed
- Signals detour

We therefore followed a few methods in our design to overcome the congestion problem. First, we avoided high via walls, because signal routes must detour around them; instead we widened the PG wires to address IR drop. Second, we avoided clusters of via arrays: since the current requirement was low, we used 1x2 vias instead of 1x3 vias. These techniques were applied at local hot spots where the congestion was local, and this resulted in a further reduction in shorts. The results are shown in Table 5.

**Table -5:** PnR results of 5.2mm<sup>2</sup> area with DCT netlist, less pessimistic transition limit and optimum power mesh

| Area              | Core Utilization | Horizontal Overflow | Vertical Overflow | Shorts | Buffers | Inverters | Total Cells |
|-------------------|------------------|---------------------|-------------------|--------|---------|-----------|-------------|
| 5.2mm<sup>2</sup> | 89.87%           | 85274               | 0                 | 498    | 286535  | 102760    | 1008312     |
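Widening a power strap lowers its resistance and therefore its IR drop, since a wire of length L and width W on a layer with sheet resistance Rs has resistance R = Rs·L/W. A minimal sketch of that relationship follows, with hypothetical values for sheet resistance, strap length and current, and with all the current assumed to flow through the full strap length:

```python
def strap_ir_drop_mv(current_ma: float, length_um: float, width_um: float,
                     sheet_res_ohm_sq: float = 0.05) -> float:
    """IR drop along one strap; the sheet resistance default is hypothetical."""
    if width_um <= 0:
        raise ValueError("width must be positive")
    resistance_ohm = sheet_res_ohm_sq * length_um / width_um
    return current_ma * resistance_ohm   # mA * ohm = mV


narrow = strap_ir_drop_mv(current_ma=5.0, length_um=500.0, width_um=1.0)
wide = strap_ir_drop_mv(current_ma=5.0, length_um=500.0, width_um=2.0)
print(f"narrow strap: {narrow:.1f} mV, widened strap: {wide:.1f} mV")
```

Doubling the width halves the drop in this model. That is the margin which allows via walls and large via arrays to be thinned out where they block signal routing.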

## **6. OVERALL RESULT SUMMARY**

The graph below shows the overall results for utilization and shorts.
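The utilization and shorts values reported in Tables 1 to 5 can also be tabulated with a short script. The run labels are shorthand introduced here; the numbers are those reported in the tables.

```python
# (label, core utilization %, shorts) as reported in Tables 1 to 5.
runs = [
    ("5.8mm^2 baseline", 80.27, 544),
    ("5.2mm^2 wire-load netlist", 89.61, 64436),
    ("5.2mm^2 physical-aware netlist", 89.87, 8865),
    ("+ relaxed transition limits", 89.87, 4128),
    ("+ power mesh optimization", 89.87, 498),
]


def reduction_pct(before: int, after: int) -> float:
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


print(f"{'Run':<34}{'Util %':>8}{'Shorts':>8}")
for label, util, shorts in runs:
    print(f"{label:<34}{util:>8.2f}{shorts:>8d}")

# Step-by-step reduction in shorts across the three 5.2mm^2 optimizations.
for (_, _, before), (label, _, after) in zip(runs[1:], runs[2:]):
    print(f"{label}: shorts reduced by {reduction_pct(before, after):.1f}%")
```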





## **7. CONCLUSIONS**

This paper has presented different ways to optimize area without compromising the quality of the design, using the techniques described above. At the end of the implementation it is clear that physical-aware synthesis is the main contributor, giving a sharp reduction in signal shorts (from 64436 to 8865). With proper power mesh guidelines we ended up with fewer than 500 shorts, which is comparable to the count for the 5.8mm<sup>2</sup> die. Reducing the die size by 10% reduces the silicon area by the same 10%.

This directly saves silicon cost. The design has shrunk, meeting the need for gadgets that are as small as possible. Because lower technology nodes require sophisticated equipment to fabricate their transistors, the reduction in size saves silicon cost. Lower technology nodes also drive EDA tools to be developed so that, with less effort, they help optimize design parameters such as performance, power and area. With physical-aware synthesis we have made use of these advancements in synthesis to optimize for area. The constraint of routing this complex IP core with fewer metal layers is also fulfilled, as we have used only the Metal2 to Metal6 layers.

As we saw with the first, non-DCT synthesis, the constraints of a smaller die size and fewer metal layers resulted in a highly congested design that was difficult to route. With all the techniques described above, however, we were able to optimize the area and ensure that the design is routable.
