

# **Power Optimized Transmitter for Future Switched Network**

Omkar C. Mane<sup>1</sup>, Prof. Usha Jadhav<sup>2</sup>

<sup>1</sup>Dept. of Electronics and Telecommunication Engg. D. Y. Patil College of Engineering, Akurdi Pune, India <sup>2</sup>Faculty of Dept. of Electronics and Telecommunication Engg. D. Y. Patil College of Engineering, Akurdi Pune, India

\*\*\*

Abstract - Network equipment power consumption is under increased scrutiny. To understand transmitter power consumption, we use a combination of static CMOS and MOS current mode logic (MCML) circuits and characterize their power consumption using the Tanner EDA 13.0 tool. For optical transmitters, we show that photonic components and front-end drivers consume only a small fraction (<22%) of total serial transmitter power. This implies that major reductions in optical transmitter power can only be obtained by paying attention to the physical layer. We propose a physical layer protocol suitable for optically switched links that retains the beneficial transmission characteristics of 8b/10b but, even without power gating and voltage-controlled oscillator power optimization, reduces the power consumption during idle periods by 29% compared with a conventional 8b/10b transmitter. We have made the toolkit available to the community at large in the hope of stimulating work in this field.

# **1. INTRODUCTION**

The persistent growth in network traffic, driven by recent developments such as video sharing, IPTV, and cloud-based storage, is placing increased demands on network switching capacity and energy consumption at the Internet core and within data centers. Increasing the capacity of current high-bandwidth electronic switches is not only technically demanding; it also leads to higher thermal dissipation. This calls for interconnect technologies that offer not only high connectivity and capacity, but also lower latency, power consumption, and cost. Among these, the energy performance of networked systems has become a first-class property of interest to industry and researchers. It has been shown that, to make large energy savings through energy proportionality, current computer systems must be made to do nothing well: minimizing consumption when not in use. Optical networks continue to deliver on the promise of bandwidth, latency, and low power utilization, but if optical switch fabrics are to continue meeting their promise as a key component in future energy-proportional systems, we need a generation of high-speed transmitters designed with energy proportionality as a first-class property.

Transceiver design has focused on providing high reliability with ever-higher levels of link capacity (bandwidth) to meet ever-growing needs to interconnect computer devices. This has led to optical transceivers that are always on, exchanging information to remain synchronized even when carrying no data. Such designs suit point-to-point link communications, providing implicit information about the link status even when no data is carried. There is a wide range of transceivers, with electronics to drive twisted-pair copper, multichannel coaxial copper, and a range of optical systems. Current commercial optical 10 Gb/s transceivers have lower power consumption than twisted-pair serial transceivers due to a lower-complexity physical (PHY) layer [1]. Yet we showed that the popular 8b/10b coding scheme can consume more power when transmitting idle frames than when transmitting data.

Finally, a further power consumption incentive comes from the increasing move of communication endpoints on-chip in system-on-chip (SoC) processors, together with predictions that a growing proportion of the chip will need to be power gated at any one time (the so-called dark silicon effect). The serial electronic transceivers, which provide the several Tb/s of off-chip bandwidth required in high-performance SoC processors, are already consuming >20% of the total power. Silicon photonics has been widely proposed as one of the solutions to the processor communications bottleneck and energy issues. However, we show that optical transceiver power is dominated by other physical layer (PHY) functions such as serialization/deserialization (SERDES), clock recovery, and line coding. Hence, a simplistic change from an electronic to an optical transmitter will not reduce power consumption without an accompanying change to the PHY layer.

Furthermore, at the packet timescale, an optically switched system sets up a new optical pathway to each destination. Thus, the physical layer need not remain operating when idle and, without system-wide time synchronization, an optical packet-switch proposal uses burst-mode receivers capable of fast locking to incoming packets, each having a different frequency, phase, and amplitude. This requires per-packet clock recovery designs in a new PHY implementation.




# 2. Transmitter design

This section describes the main functional circuit blocks of the transmitter. A top-level representation of the transmitter is shown in Fig. 1. Initially, the transmitter design was aimed at a payload bit rate (before adding coding overhead) of 10 Gb/s. In later work, we characterized the circuits at different bit rates to investigate optimum bit rate versus power operating points.



Fig 1. Proposed block diagram of optical transmitter

#### I. Line Coding

The functions of the coding block include dc balance, byte alignment within the serial stream, and error detection. We consider two popular encoding schemes: the 8b/10b block code and the scrambler-based 64B/66B code. 8b/10b represents a class of parity-disparity dc-balanced codes that map arriving 8-b symbols into 10-b code words using predefined code groups at run time on a word-by-word basis. The code limits the run length of identical symbols to remove baseline wander in AC-coupled receivers and to guarantee the transition density required for clock synchronization. 8b/10b coding has excellent transmission properties but has a bit rate overhead of 25%. For this reason, the hybrid-scrambled 64B/66B encoding scheme was selected for 10 Gb/s Ethernet, which reduces the overhead to about 3%. The encoding module performs a framing function by transforming the 64-b data and 8-b control inputs into a 66-b block. Each 64-b data word is scrambled with a 58th-degree polynomial to ensure statistical dc balance and transition density, and a 2-b synchronization header is appended to allow frame detection and alignment to be performed.

Fig. 3 shows block diagrams of the two alternative coding schemes in the transmitter. In the 64B/66B case, the transmitter accepts 64-b data at 156.25 MHz and carries out encoding and scrambling. The resulting 66-b blocks are converted by the gearbox to a 64-b interface at 161.13 MHz for more efficient serialization. In the 8b/10b case, we implemented versions with both an 8-b wide client-side data interface running at 1.25 GHz and a dual encoder with a 16-b interface operating at 625 MHz. Fig. 3 shows the 16-b interface. In all cases, phase differences between the coding block and the client-side interface are compensated by a first-in first-out (FIFO) buffer.
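The overhead and clock figures quoted above follow directly from the code ratios and interface widths. The short Python sketch below reproduces that arithmetic; the function names are illustrative only and are not taken from the toolkit described in this paper.

```python
from fractions import Fraction

PAYLOAD_BPS = 10_000_000_000  # 10 Gb/s payload bit rate quoted in the text


def line_rate(payload_bps, data_bits, coded_bits):
    """Serial line rate once data_bits are expanded to coded_bits."""
    return Fraction(payload_bps) * coded_bits / data_bits


def overhead(data_bits, coded_bits):
    """Added bits expressed as a fraction of the payload bits."""
    return Fraction(coded_bits - data_bits, data_bits)


def parallel_clock(bit_rate, width):
    """Clock frequency of a parallel interface of the given bit width."""
    return Fraction(bit_rate) / width


rate_8b10b = line_rate(PAYLOAD_BPS, 8, 10)     # 12.5 Gb/s
rate_64b66b = line_rate(PAYLOAD_BPS, 64, 66)   # 10.3125 Gb/s

for name, d, c, rate in (("8b/10b", 8, 10, rate_8b10b),
                         ("64B/66B", 64, 66, rate_64b66b)):
    print(f"{name}: overhead {float(overhead(d, c)):.3%}, "
          f"line rate {float(rate) / 1e9} Gb/s")

# Client-side and gearbox clocks for the interface widths named in the text
print("64-b client clock  :", float(parallel_clock(PAYLOAD_BPS, 64)) / 1e6, "MHz")
print("64-b gearbox clock :", float(parallel_clock(rate_64b66b, 64)) / 1e6, "MHz")
print("8-b client clock   :", float(parallel_clock(PAYLOAD_BPS, 8)) / 1e9, "GHz")
print("16-b client clock  :", float(parallel_clock(PAYLOAD_BPS, 16)) / 1e6, "MHz")
```

Exact fractions are used so that the 64B/66B overhead appears as 1/32 (3.125%), which the text rounds to about 3%, and the gearbox clock appears as 161.1328125 MHz, which the text rounds to 161.13 MHz.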

#### Scrambler





The first bit in sequence s1 is summed modulo-2 with the modulo-2 sum of locations 2 and 5 in the shift register. This sum becomes the first bit in bit sequence s2. As this bit is presented to the channel, the contents of the shift register are shifted up one stage as follows: 5 out, 4 goes to 5, 3 goes to 4, 2 goes to 3, 1 goes to 2. The first bit in s2 is also placed in shift register stage 1. The next bit of sequence s1 arrives, and the procedure is repeated.
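The shift-register procedure described above can be modelled in a few lines of Python. This is a minimal behavioural sketch, assuming a five-stage register that starts cleared; the matching descrambler is not described in the text and is included only to show that the operation is reversible. The function names are illustrative.

```python
def scramble(bits, taps=(2, 5), stages=5):
    """Scramble s1 into s2 using feedback from the given register stages."""
    reg = [0] * stages                 # reg[0] is stage 1, reg[-1] is stage 5
    out = []
    for b in bits:
        feedback = 0
        for t in taps:                 # modulo-2 sum of the tapped stages
            feedback ^= reg[t - 1]
        s = b ^ feedback               # modulo-2 sum with the incoming s1 bit
        out.append(s)
        reg = [s] + reg[:-1]           # shift up one stage; s2 bit enters stage 1
    return out


def descramble(bits, taps=(2, 5), stages=5):
    """Recover s1 from s2; the register is fed with the received s2 bits."""
    reg = [0] * stages
    out = []
    for s in bits:
        feedback = 0
        for t in taps:
            feedback ^= reg[t - 1]
        out.append(s ^ feedback)
        reg = [s] + reg[:-1]
    return out


s1 = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
s2 = scramble(s1)
print("s1:", s1)
print("s2:", s2)
print("recovered:", descramble(s2))
```

Because the descrambler register is filled from the received sequence rather than from its own output, it reaches the same state as the scrambler after five bits even if the two start in different states.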

#### II. Serialization and Deserialization

The SERDES circuits convert between the low-speed parallel data and a high-speed serial bit stream. The multiplexing ratios depend on the coding scheme used. In the case of 64B/66B, 64-b sequences at 161.13 MHz are converted to 10.3125 Gb/s using a 64:1 ratio. In contrast, a transceiver with 8b/10b coding performs either 10:1 or 20:1 multiplexing, producing a line rate of 12.5 Gb/s. As shown in Fig. 3, the SERDES circuits are implemented in a combination of static CMOS and MCML. To find a power-efficient SERDES design, we engineered a variety of configurations. For example, for 64B/66B, we investigated 64:1 SERDES based on 64:N CMOS and N:1 MCML circuits, where N = 2, 4, or 8 (referred to throughout the rest of this paper as 64:N:1). In a similar way, we investigated 8b/10b SERDES using 20:N:1 (dual encoder) and 10:N:1 (single encoder) cases, with N = 2 or 4.



Fig 3. CMOS and MCML combination for serialization and deserialization.

The CMOS SERDES circuits are implemented as shift registers. The MCML circuits were implemented as binary tree multiplexers constructed by cascading 2:1 multiplexer cells, frequency dividers, and delay lines, which were manually optimized for the required bandwidth and timing operation. Fig. 5 shows an example of a 64:8:1 SERDES.
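As a functional illustration of the 64:N:1 partitioning, the following Python sketch models the CMOS stage as a slicer that hands N bits at a time to an N:1 binary tree of 2:1 multiplexer cells. It is a behavioural model under simplifying assumptions (no clocks, dividers, or delay lines), and the function names are hypothetical.

```python
def mux2(a, b, sel):
    """Single 2:1 multiplexer cell."""
    return b if sel else a


def tree_mux(inputs, t):
    """N:1 binary tree of 2:1 cells; returns inputs[t] for bit slot t.

    The root cell is selected by the least significant bit of t, so it
    toggles fastest, while the leaf cells see the slowest select signal.
    """
    if len(inputs) == 1:
        return inputs[0]
    even = tree_mux(inputs[0::2], t >> 1)
    odd = tree_mux(inputs[1::2], t >> 1)
    return mux2(even, odd, t & 1)


def serialize(word, n=8):
    """Serialize a parallel word through a 64:N stage and an N:1 tree."""
    assert len(word) % n == 0 and n & (n - 1) == 0, "n must be a power of two"
    out = []
    for i in range(0, len(word), n):   # low-speed stage: N bits per step
        group = word[i:i + n]
        for t in range(n):             # high-speed stage: one bit per slot
            out.append(tree_mux(group, t))
    return out


LINE_RATE = 10_312_500_000             # 64B/66B line rate from the text, b/s
word = [(i * 5 + 3) % 7 % 2 for i in range(64)]
for n in (2, 4, 8):
    assert serialize(word, n) == word
    print(f"64:{n}:1 -> CMOS stage outputs {LINE_RATE / n / 1e9:.5f} Gb/s per lane, "
          f"{n - 1} MCML 2:1 cells in the tree")
```

The printout shows the trade-off being explored: a larger N relaxes the speed required of the CMOS stage but increases the number of always-on MCML cells.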

#### III. Transmitter Phase-Locked Loop

Commonly used phase-locked loop (PLL) and CDR circuits often use multiple stages to facilitate stable and consistent operation. This redundancy usually delivers high performance, but the synchronization process takes a relatively long time to achieve a stable lock. The simplicity of our CDR design guarantees a fast locking time (≤10 clock cycles) and maximum power and area efficiency. Although a realistic CDR implementation may require some modifications to the design to account for factors such as minor impedance mismatch, capacitive and inductive resistance variations, and so on, we believe that the power figures will be representative of real circuits.

#### IV. Channel Bonding

To find the power consequences of using multiple lower bit rate serial streams rather than a single serial channel, we designed a channel bonding circuit in Verilog, which eliminates skew between multiple channels using a separate FIFO. In an optical link, these channels could be either space or wavelength division multiplexed. We tested the circuit operating on the output of two 8b/10b client-side streams, but the Verilog model is parameterized for higher numbers of channels. The circuit is designed for burst-mode operation.
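The deskew idea can be illustrated with a small behavioural model in Python. This is only a sketch of FIFO-based lane alignment in general, not the Verilog circuit itself: the alignment marker `ALIGN` and the function `bond` are assumptions introduced for illustration, since the text does not say how the start of a burst is marked.

```python
from collections import deque

ALIGN = "A"  # hypothetical alignment marker; not specified in the paper


def bond(lanes):
    """Remove inter-lane skew and return lane-aligned symbol tuples."""
    fifos = [deque(lane) for lane in lanes]      # one FIFO per channel
    for fifo in fifos:
        # Discard whatever arrived before the marker on this lane.
        while fifo and fifo[0] != ALIGN:
            fifo.popleft()
        if fifo:
            fifo.popleft()                       # consume the marker itself
    aligned = []
    # Read all FIFOs in lockstep until one of them runs empty.
    while fifos and all(fifos):
        aligned.append(tuple(fifo.popleft() for fifo in fifos))
    return aligned


# Two lanes carrying alternate symbols, with lane 1 delayed by two symbol times
lane0 = ["x", ALIGN, "d0", "d2", "d4"]
lane1 = ["x", "x", "x", ALIGN, "d1", "d3", "d5"]
print(bond([lane0, lane1]))
```

The function accepts any number of lanes, mirroring the parameterization mentioned above.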

# 3. Circuit design


#### A. Design of CMOS Circuits

Design of the static CMOS circuits started with Register Transfer Level (RTL) descriptions in the Verilog hardware description language, which were synthesized using Synopsys Design Compiler with a commercially available 45-nm standard cell library. Constraints were set to minimize power consumption at the required operating frequency. The clock frequency used for synthesis is typically taken to be at least 15% higher than the nominal frequency to provide margin. The synthesized Verilog netlist was simulated using Mentor Graphics ModelSim to verify correct operation and to store activity data for dynamic power analysis. The input stimulus for the simulations was extracted from realistic 10 Gb/s Ethernet trace files and analyzed under two input setups: 1) continuous data transmission and 2) continuous idle transmission. Synopsys PrimeTime was used to generate power consumption data for each circuit block.

#### B. Design of MCML Circuits

Although new generations of CMOS technologies continuously improve their performance and power characteristics due to scaling, CMOS circuits are prone to generating a high level of supply noise while operating at high speeds. This noise limits the on-chip integration of digital blocks with their analog counterparts. Logic families with differential signaling, such as MCML, are characterized by improved noise immunity and high-speed operation. The speed advantage arises because the current, generated by a constant current source, is steered between a pair of fully differential transistors and produces a reduced-swing voltage drop at the outputs (in combination with specific voltage gains), reducing the generation of logic-level switching noise. It must be noted, though, that the presence of the current sink implies constant power dissipation irrespective of the operating frequency or the input sequence applied. Power dissipation in an MCML circuit is dominated by static power ($P = V_{dd} \times I_{ss}$) and is independent of the operating frequency.

In this paper, an MCML cell library was developed. The design process used the transistor models supplied with the 45-nm CMOS standard cell library and a semianalytical methodology developed in the HSPICE environment for cell optimization. To satisfy the required performance criteria of high-speed operation and to minimize the power dissipation of individual gates, we used the HSPICE optimization solver. This allowed us to produce the best-case parameter variation model for a specific subset of supply voltages, voltage swings, and biasing currents selected as the input characteristics. Section 4 describes the design and optimization process for the MCML cell library in detail. Once the MCML cell library optimization process was complete, design of the serialization, deserialization, and CDR circuits was performed. Correct operation was verified and power measured using SPICE simulation.
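The contrast between constant MCML power and frequency-dependent CMOS power can be made concrete with a short Python sketch. The MCML expression is the one given above; the CMOS expression is the standard textbook switching-power formula $P = \alpha C V_{dd}^2 f$, which is not stated in this paper. The tail current, load capacitance, and activity factor are illustrative assumptions, not measured values.

```python
def mcml_static_power(vdd, iss):
    """MCML gate power, P = Vdd * Iss: fixed by the tail current."""
    return vdd * iss


def cmos_dynamic_power(alpha, c_load, vdd, freq):
    """Textbook CMOS switching power, P = alpha * C * Vdd^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


VDD = 0.9        # V, supply voltage listed in Table 1
ISS = 50e-6      # A, assumed tail current (illustrative only)
C_LOAD = 2e-15   # F, assumed switched capacitance (illustrative only)
ALPHA = 0.5      # assumed activity factor (illustrative only)

for freq in (1e9, 5e9, 10e9):
    p_mcml = mcml_static_power(VDD, ISS)
    p_cmos = cmos_dynamic_power(ALPHA, C_LOAD, VDD, freq)
    print(f"{freq / 1e9:>4.0f} GHz: MCML {p_mcml * 1e6:.2f} uW, "
          f"CMOS {p_cmos * 1e6:.2f} uW")
```

With these assumed numbers the CMOS figure grows linearly with clock frequency while the MCML figure stays flat, which is why MCML is reserved for the highest-speed stages and also why it keeps dissipating power when the link is idle.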

# 4. MCML DESIGN AND OPTIMIZATION

Design of MCML circuits requires optimization of a large number of parameters. Previous work in the field provided an analytical description of all parameters used in the MCML logic design process and reviewed their impact on performance and power. In this paper, we developed an optimization toolkit that derives the parameters of an MCML cell library in an automated way, using standard SPICE descriptions of MOSFET transistors and satisfying specific criteria for power efficiency and performance measured at the system's outputs. In this section, we review the major operating principles and properties of a typical MCML cell and describe the optimization procedure used throughout the cell development process.

A typical MCML gate is composed of three main blocks: the pull-up network, implemented as a set of resistors or active p-MOS loads; the fully differential pull-down network, which steers the current between the branches; and the current source. The performance of a gate is a function of various metrics and is determined by the corresponding adjustments made to transistor sizing, biasing voltages, reference currents and voltages, and differential voltage swings. The operation of a standard MCML inverter cell can be described as follows. Due to the presence of the active loads R, a voltage drop $V = I \times R$ is produced, permitting the logical 1 and 0 states to be represented by the voltages $V_{dd}$ and $V_{dd} - V$, respectively. The use of active loads, implemented as p-MOS transistors conducting in the linear region (assumed to provide a roughly linear transfer function response), allows online adaptability that helps compensate for any spontaneous variations inside the circuit. Typical resistance values are on the order of tens of kΩ and require sink currents on the order of a couple of hundred microamperes. Increasing the transistor sizing, i.e., the $W_P/L_P$ ratio, lowers the load resistance and, as a rule, the propagation delay of the inverter circuit; it is also accompanied by a reduction in the saturation voltage of the p-MOS loads, causing degradation in the linear response. An example of a biasing circuit that is used for parameter adjustment



Fig. 4. MCML inverter cell.
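The relation between load resistance, tail current, and output levels of the inverter can be worked through numerically. The Python sketch below uses the $V = I \times R$ relation from the text; the numerical values are chosen purely for illustration to give a 0.4 V swing from a 0.9 V supply and are not parameters of the cell library developed here.

```python
def mcml_levels(vdd, i_tail, r_load):
    """Return (logic-1 level, logic-0 level, swing) of an MCML output."""
    swing = i_tail * r_load            # V = I * R across the conducting load
    return vdd, vdd - swing, swing


def tail_current_for_swing(swing, r_load):
    """Tail current needed to hold a given swing across a given load."""
    return swing / r_load


VDD = 0.9          # V, supply voltage listed in Table 1
R_LOAD = 10e3      # ohm, assumed load resistance (illustrative only)
I_TAIL = 40e-6     # A, assumed tail current (illustrative only)

v_high, v_low, swing = mcml_levels(VDD, I_TAIL, R_LOAD)
print(f"logic 1 = {v_high:.2f} V, logic 0 = {v_low:.2f} V, swing = {swing:.2f} V")

# Halving the load resistance while keeping the same swing doubles the
# tail current, and therefore the static power Vdd * I of the gate.
i_new = tail_current_for_swing(swing, R_LOAD / 2)
print(f"tail current for the same swing at half the load: {i_new * 1e6:.0f} uA")
```

This shows the trade-off implied above: a lower load resistance shortens the propagation delay, but holding the swing constant then costs proportionally more static current.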

# 5. PREPROCESSING RESULTS AND HARDWARE IMPLEMENTATION

#### A. Front end

The proposed transmitter consists of a front end and a back end. The front end consists of blocks such as the FIFO, encoder, and bitslip, which are implemented in Verilog. The RTL schematic of the top-level module is shown in Fig. 5.



Fig 5. RTL schematic of frontend

The front end of the proposed transmitter is designed in Verilog in order to understand its performance. The design was carried out using the Xilinx 14.7 tool. The simulation result is shown in Fig. 6.



Volume: 04 Issue: 09 | Sep -2017

www.irjet.net

p-ISSN: 2395-0072



Fig. 6 Final simulation result of Front end.

#### B. Back end

The back end of the proposed transmitter consists of a 64:1 multiplexer built from a combination of MCML and CMOS circuits, and it is implemented using the Tanner EDA tool.



Fig. 7 Schematic of 2: 1 MUX MCML circuit

The back end of the proposed transmitter is designed in order to understand the performance of the MCML and CMOS circuits. Simulation was carried out in Tanner EDA 13.0 using a 45-nm technology. The S-Edit tool of Tanner EDA was used to create the circuit schematics. The simulation result is shown in Fig. 7.



Fig. 15 Final simulation result of 64:N bit MUX.

Table 1: Simulation results of MCML and CMOS 64 MUX.

| 45-nm process | Vdd [V] | Delay [ns] | Power [µW] |
|---------------|---------|------------|------------|
| AND gate      | 0.9     | 0.019      | 0.38       |
| NAND gate     | 0.9     | 0.019      | 0.38       |
| INVERTER      | 0.9     | 0.00023    | 0.00198    |
| 2:1 MUX MCML  | 0.9     | 4.75       | 41.67      |

# 6. Conclusion

We note that, as ultralow-energy silicon photonic communication components become commonplace, the power consumption of the other transceiver components must become the focus for major reductions in transceiver power. Such reductions can only be obtained with attention to the physical layer circuits and protocols, of which SERDES is the largest component. Our results show that the high-speed subsystem, incorporating SERDES, CDR, and clock recovery, can, despite relatively simple logic, consume 50%-60% of the total power. This is largely due to the integration of standard CMOS and differential MCML components operating at a high clock rate.



## References

1. R. S. Tucker, "Green optical communications—Part II: Energy limitations in networks," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 2, pp. 261–274, Mar./Apr. 2011.
2. D. A. Miller, "Rationale and challenges for optical interconnects to electronic chips," *Proc. IEEE*, vol. 88, no. 6, pp. 728–749, Jun. 2000.
3. D. Huang, T. Sze, A. Landin, R. Lytel, and H. L. Davidson, "Optical interconnects: Out of the box forever?" *IEEE J. Sel. Topics Quantum Electron.*, vol. 9, no. 2, pp. 614–623, Mar./Apr. 2003.
4. L. A. Barroso and U. Hölzle, "The case for energy-proportional computing," *IEEE Comput.*, vol. 40, no. 12, pp. 33–37, Dec. 2007.
5. O'Connor, "Optical solutions for system-level interconnect," in *Proc. Int. Workshop Syst. Level Interconnect Predict.*, 2004, pp. 79–88.

