Brent Kung Adder : Circuit, Working, Advantages, Disadvantages & Its Applications

The Brent–Kung adder was proposed in 1982 by Hsiang Te Kung & Richard Peirce Brent. It is a Parallel Prefix adder or tree adder which is used widely in Digital Design due to its flexibility. Parallel Prefix Adders can be built in several ways based on the number of logic levels, logic gates involved, the fan-out from every gate & the wiring in between the levels. There are different types of tree adders available, the fundamental Tree adders are Sklanskym KoggeStone & Brent-Kung, As compared to the KSA (Kogge–Stone adder), this adder provides high regularity to the structure of the adder and it has less wiring blocking which leads to better performance & less required chip area. This article provides brief information on a Brent Kung Adder.


What is Brent Kung Adder?

An adder that uses minimum circuitry to get the result is known as Brent Kung Adder and it is also known as a low-power adder or parallel adder. This adder is intended to save the size of the chip so that manufacturing these adders will become easier. This adder’s symmetry & usual construction structure will decrease production costs greatly and are allowable to use in pipelined topologies. The utilization of complementary pass transistor logic helps in enhancing the design performance with the multiplexer approach in various cell designs.

Brent Kung Adder Circuit

The brent-kung parallel prefix adder diagram is shown below which includes stage 1 (pre-processing stage), stages 2 to 7 are carry generation stages & stage 8 is post-processing. It is an advanced architecture and is very simple to construct & provides less wiring congestion. So, its less wiring will decrease the amount of required space to execute the architecture. In addition, routing becomes much easier due to crossing (or) overlapping of fewer wires. However, the penalty will increase in delay because of the increased number of stages, The fan out for this adder is increased, and then the delay will be increased.

Brent Kung Adder
                                                        Brent Kung Adder

How Does Brent Kung Adder Work?

Brent Kung Adder works by calculating the prefixes for two bit groups which are useful in finding the 4 bit group prefixes. These prefixes are used for computing the 8-bit group’s prefixes, etc. After that, these prefixes will be used for computing the carry-out of the specific bit stage. These carries are used with the next stage’s Group Propagate to calculate that stage’s Sum bit. Brent Kung Tree uses 2log2N – 1 stage.

32-bit Brent Kung Adder

The 32-bit Brent Kung adder layout is shown below. At the start of this layout, basic logic gates are designed like NAND, inverter, XOR, NOR, etc. After that, the necessary cells like black cells, gray cells, buffers, and PG logic are designed with the logic gates.

32-bit Brent Kung Adder
                                  32-bit Brent Kung Adder

In the below 32-bit Brent Kung adder, the inverting gates like AOI & OAI are alternatively used for mainly gray & black cells. So the black & grey cells are represented with grey & black blocks whereas the buffers are represented with circles.

Basic Cells in Adder
Basic Cells in Adder

The inputs like A & B are provided to PG logic which is shown in the block diagram. For a 32-bit adder, 32 PG logic blocks are necessary and the propagate (P) & generate (G) signals are the outputs of this block. These signals are provided to the Brent Kung adder tree structure. The structure of this adder includes grey cells & black cells.

A grey cell includes three inputs & single output. The propagate and generate signals from the present stage & generate signals from the previous stage are inputs whereas the group generate signals is the o/p. In any tree structure, every stage will end with a grey cell & the o/p of this cell is the group generate signal. This signal is considered simply as the carry of that stage. The Black cell includes four inputs & two outputs. The inputs for this cell are the present stage’s P & G signals & P, G signals from the previous stage.

A PG logic includes AND & XOR gates where the AND logic gate is used for generating the G signal & XOR logic gate provides the P signal. To eliminate needless inverters, two kinds of grey cells & black cells are utilized. The inverting gates used in one row for the gray cell are AOI or AND-OR-Inverter & the inverting gates for the black cell within the next row utilize OAI or OR-AND-Inverter. The AOI cell uses the normal inputs to provide inverted outputs whereas OAI uses inverted inputs to provide normal outputs.

Brent Kung Adder Operation

Brent Kung adder is a parallel prefix adder used for the operation of high-performance addition. This adder looks like a tree structure that performs the arithmetic operation. This adder includes black cells & gray cells. Every black cell has two AND gates & single OR gate and every gray cell has only a single AND gate.

The Brent-kung adder includes two stages; the pre-processing stage & generation stage. In the first stage, generate & propagate will be from every pair of inputs. Here the propagate provides an ‘XOR’ operation for input bits whereas generates provides an ‘AND’ operation for input bits. The propagate & generate like’ Pi’ and ‘Gi’ are given below.

Pi = Ai XOR Bi and Gi = Ai AND Bi.

In the second stage, the carry will be generated for every bit which is known as carry generate ‘Cg’ & carry is propagate for every bit is known as carry generate ‘Cp’. For the further operation, carry propagate & carry generate will be generated. The final cell available within the every bit operate provides carry. So the final bit carry will assist to sum of the next bit concurrently till the last bit. The carry generate & propagate are given as;

Cp = P1 AND P0 and Cg=G1 OR (P1 AND G0)

It is used mainly for the addition operation of two thirty-two bits & every bit experiences the preprocessing stage & generation stage then it provides the final sum.

The primary input bits go below the pre-processing stage & they produce propagate & generate. So these propagate as well as generate undergoes generation stage generates carry generates & carry propagates and provides final sum. The Brent-kung adder step-by-step process is shown below.

Efficient Block Diagram
Efficient Block Diagram

The Brent-kung adder arrangement looks like a tree structure and it is the high-speed adder that targets on gate-level logic. This adder can be designed with a decrease of the number of logic gates. Thus, it reduces the delay & memory utilized within this architecture.

Brent Kung Adder Verilog Code

The Brent Kung adder verilog code is shown below.

`define INPUTSIZE 64 //set the input size n

`define GROUPSIZE 8 //set the group size = 1, 2, 4 or 8

 

module Brent_Kung_Adder(A, B, S);

input [`INPUTSIZE – 1:0] A;

input [`INPUTSIZE – 1:0] B;

output [`INPUTSIZE:0] S;

wire [`INPUTSIZE / `GROUPSIZE * 2 – 1:0] r_temp;

wire [`INPUTSIZE / `GROUPSIZE * 2 – 1:0] r;

wire [`INPUTSIZE / `GROUPSIZE:0] cin;

wire [`INPUTSIZE / `GROUPSIZE * 2 – 1:0] q;

assign cin[0] = 1’b0;

generate

genvar i;

for (i = 0; i < `INPUTSIZE / `GROUPSIZE; i = i + 1) begin: parallel_FA_CLA_prefix

    group_q_generation #(.Groupsize(`GROUPSIZE))

    f(

        .a(A[`GROUPSIZE * (i + 1) – 1:`GROUPSIZE * i]),

        .b(B[`GROUPSIZE * (i + 1) – 1:`GROUPSIZE * i]),

        .cin(cin[i]),

        .s(S[`GROUPSIZE * (i + 1) – 1:`GROUPSIZE * i]),

        .qg(q[i * 2 + 1:i * 2])

    );

end

parallel_prefix_tree_first_half #(.Treesize(`INPUTSIZE / `GROUPSIZE))

t1(

    .q(q[`INPUTSIZE / `GROUPSIZE * 2 – 1:0]),

    .r(r_temp[`INPUTSIZE / `GROUPSIZE * 2 – 1:0])

);

parallel_prefix_tree_second_half #(.Treesize(`INPUTSIZE / `GROUPSIZE))

t2(

    .q(r_temp[`INPUTSIZE / `GROUPSIZE * 2 – 1:0]),

    .r(r[`INPUTSIZE / `GROUPSIZE * 2 – 1:0])

);

for (i = 0; i < `INPUTSIZE / `GROUPSIZE; i = i + 1) begin: cin_generation

    cin_generation_logic f(

        .r(r[2 * i + 1:2 * i]),

        .c0(1’b0),

        .cin(cin[i + 1])

    );

end

assign S[`INPUTSIZE] = cin[`INPUTSIZE / `GROUPSIZE];

endgenerate

endmodule

// First half of parallel prefix tree

module parallel_prefix_tree_first_half #(parameter Treesize = `INPUTSIZE / `GROUPSIZE)(q, r);

input [Treesize * 2 – 1:0] q;

output [Treesize * 2 – 1:0] r;

generate

genvar i;

if (Treesize == 2) begin: trivial_case

    assign r[1:0] = q[1:0];

    prefix_logic f(

        .ql(q[1:0]),

        .qh(q[3:2]),

        .r(r[3:2])

    );

end else begin: recursive_case

    wire [Treesize * 2 – 1:0] r_temp;

    parallel_prefix_tree_first_half #(.Treesize(Treesize / 2))

    recursion_lsbh(

        .q(q[Treesize – 1:0]),

        .r(r_temp[Treesize – 1:0])

    );

    parallel_prefix_tree_first_half #(.Treesize(Treesize / 2))

    recursion_msbh(

        .q(q[Treesize * 2 – 1:Treesize]),

        .r(r_temp[Treesize * 2 – 1:Treesize])

    );

    for (i = 0; i < Treesize * 2; i = i + 2) begin: parallel_stitch_up

        if (i != Treesize * 2 – 2) begin: parallel_stitch_up_pass

            assign r[i + 1:i] = r_temp[i + 1:i];

        end else begin: parallel_stitch_up_produce

            prefix_logic f(

                .ql(r_temp[Treesize – 1:Treesize – 2]),

                .qh(r_temp[Treesize * 2 – 1:Treesize * 2 – 2]),

                .r(r[Treesize * 2 – 1:Treesize * 2 – 2])

            );

        end

    end

end

endgenerate

endmodule

// Second half of parallel prefix tree

module parallel_prefix_tree_second_half #(parameter Treesize = `INPUTSIZE / `GROUPSIZE)(q, r);

input [Treesize * 2 – 1:0] q;

output [Treesize * 2 – 1:0] r;

wire [Treesize * 2 * ($clog2(Treesize) – 1) – 1:0] r_temp;

assign r_temp[Treesize * 2 – 1:0] = q[Treesize * 2 – 1:0];

generate

genvar i, j;

for (i = 0; i < $clog2(Treesize) – 2; i = i + 1) begin: second_half_level

    assign r_temp[Treesize * 2 * (i + 1) + ((Treesize / (2 ** i)) – 1 – 2 ** ($clog2(Treesize / 4) – i)) * 2 – 1:Treesize * 2 * (i + 1)] = r_temp[Treesize * 2 * i + ((Treesize / (2 ** i)) – 1 – 2 ** ($clog2(Treesize / 4) – i)) * 2 – 1:Treesize * 2 * i];

    for (j = (Treesize / (2 ** i)) – 1 – 2 ** ($clog2(Treesize / 4) – i); j < Treesize; j = j + 2 ** ($clog2(Treesize / 2) – i)) begin: second_half_level_logic

        prefix_logic f(

            .ql(r_temp[Treesize * 2 * i + (j – 2 ** ($clog2(Treesize / 4) – i)) * 2 + 1:Treesize * 2 * i + (j – 2 ** ($clog2(Treesize / 4) – i)) * 2]),

            .qh(r_temp[Treesize * 2 * i + j * 2 + 1:Treesize * 2 * i + j * 2]),

            .r(r_temp[Treesize * 2 * (i + 1) + j * 2 + 1:Treesize * 2 * (i + 1) + j * 2])

        );

        if (j != Treesize – 1 – 2 ** ($clog2(Treesize / 4) – i)) begin: second_half_level_direct_connect

            assign r_temp[Treesize * 2 * (i + 1) + (j + 2 ** ($clog2(Treesize / 2) – i)) * 2 – 1:Treesize * 2 * (i + 1) + j * 2 + 2] = r_temp[Treesize * 2 * i + (j + 2 ** ($clog2(Treesize / 2) – i)) * 2 – 1:Treesize * 2 * i + j * 2 + 2];

        end

    end

    assign r_temp[Treesize * 2 * (i + 2) – 1:Treesize * 2 * (i + 2) – (2 ** ($clog2(Treesize / 4) – i)) * 2] = r_temp[Treesize * 2 * (i + 1) – 1:Treesize * 2 * (i + 1) – (2 ** ($clog2(Treesize / 4) – i)) * 2];

end

assign r[1:0] = r_temp[Treesize * 2 * ($clog2(Treesize) – 2) + 1:Treesize * 2 * ($clog2(Treesize) – 2)];

for (i = 1; i < Treesize; i = i + 2) begin: final_r_odd

    assign r[i * 2 + 1:i * 2] = r_temp[Treesize * 2 * ($clog2(Treesize) – 2) + i * 2 + 1:Treesize * 2 * ($clog2(Treesize) – 2) + i * 2];

end

for (i = 2; i < Treesize; i = i + 2) begin: final_r_even

    prefix_logic f(

        .ql(r_temp[Treesize * 2 * ($clog2(Treesize) – 2) + i * 2 – 1:Treesize * 2 * ($clog2(Treesize) – 2) + i * 2 – 2]),

        .qh(r_temp[Treesize * 2 * ($clog2(Treesize) – 2) + i * 2 + 1:Treesize * 2 * ($clog2(Treesize) – 2) + i * 2]),

        .r(r[i * 2 + 1:i * 2])

    );

end

endgenerate

endmodule

module group_q_generation #(parameter Groupsize = `GROUPSIZE)(a, b, cin, s, qg);

input [Groupsize – 1:0] a;

input [Groupsize – 1:0] b;

input cin;

output [Groupsize – 1:0] s;

output [1:0] qg;

wire [2 * Groupsize – 1:0] q;

wire [Groupsize – 1:0] c;

assign c[0] = cin;

generate

genvar i;

for (i = 0; i < Groupsize; i = i + 1) begin: parallel_FA_CLA_prefix

    FA_CLA_prefix f(

        .a(a[i]),

        .b(b[i]),

        .cin(c[i]),

        .s(s[i]),

        .q(q[i * 2 + 1:i * 2])

    );

    if (i != Groupsize – 1) begin: special_case

        assign c[i + 1] = q[i * 2 + 1] | (q[i * 2] & c[i]);

    end

end

//group q generation based on the Groupsize

if (Groupsize == 1) begin: case_gs1

    assign qg[1] = q[1];

    assign qg[0] = q[0];

end else if (Groupsize == 2) begin: case_gs2

    assign qg[1] = q[3] | (q[1] & q[2]);

    assign qg[0] = q[2] & q[0];

end else if (Groupsize == 4) begin: case_gs4

    assign qg[1] = q[7] | (q[5] & q[6]) | (q[3] & q[6] & q[4]) | (q[1] & q[6] & q[4] & q[2]);

    assign qg[0] = q[6] & q[4] & q[2] & q[0];

end else if (Groupsize == 8) begin: case_gs8

    assign qg[1] = q[15] | (q[13] & q[14]) | (q[11] & q[14] & q[12]) | (q[9] & q[14] & q[12] & q[10]) | (q[7] & q[14] & q[12] & q[10] & q[8]) | (q[5] & q[14] & q[12] & q[10] & q[8] & q[6]) | (q[3] & q[14] & q[12] & q[10] & q[8] & q[6] & q[4]) | (q[1] & q[14] & q[12] & q[10] & q[8] & q[6] & q[4] & q[2]);

    assign qg[0] = q[14] & q[12] & q[10] & q[8] & q[6] & q[4] & q[2] & q[0];

end

endgenerate

endmodule

// Cin generation logic

module cin_generation_logic(r, c0, cin);

input [1:0] r;

input c0;

output cin;

assign cin = (r[0] & c0) | r[1];

endmodule

// Basic logic for prefix operations

module prefix_logic(ql, qh, r);

input [1:0] ql;

input [1:0] qh;

output [1:0] r;

assign r[0] = qh[0] & ql[0];

assign r[1] = (qh[0] & ql[1]) | qh[1];

endmodule

// Full Adder Cell with Carry Look-Ahead

module FA_CLA_prefix(a, b, cin, s, q);

input a;

input b;

input cin;

output s;

output [1:0] q;

assign q[0] = a ^ b;

assign s = q[0] ^ cin;

assign q[1] = a & b;

endmodule

Advantages

The advantages of Brent Kung Adder include the following.

  • This is a low-power adder because it uses a minimum circuit to get the result.
  • It is a very popular & widely used adder.
  • This kind of adder can be implemented by using fewer modules as compared to a Kogge-Stone adder.
  • Brent-Kung adder designing is very easy.
  • This adder has fewer connections with other modules.
  • These adders were proposed mainly to solve the drawbacks of Kogge-Stone adders.

Disadvantages

The disadvantages of Brent Kung Adder include the following.

  • These adders have greater delay and it need 2 log2 n − 2 logic levels for computing all the carry bits.
  • The main drawback of this adder is fanout which can cause propagating of the current throughout the adder to split up & become weaker.

Brent Kung Adder Applications

The applications of Brent Kung Adder include the following.

  • A Brent–Kung adder is used in a pipeline manner to decrease the consumption of power by decreasing the combinatorial logic depth & glitches stabilization.
  • Brent-Kung adder provides an outstanding number of stages from i/p to all o/ps but with asymmetric Intermediate stages loading.
  • This adder can be used within the multiplier as well as other data path elements.

Thus, this is an overview of Brent kung adder, its working, advantages, disadvantages, and its applications. This is a very efficient adder and its structure looks like a tree structure used mainly for high-performance arithmetic operations. This type of adder is very fast and focuses mainly on the gate-level logic. This adder is designed by using less number of logic gates. Thus, it reduces the memory & delay utilized within this architecture. Here is a question for you, Brent kung adder also known as?