# Technology mapping for Single Target Gate based Circuits using Boolean functional decomposition 

Nabila Abdessaied ${ }^{1}$, Mathias Soeken ${ }^{1,2}$, and Rolf Drechsler ${ }^{1,2}$<br>${ }^{1}$ Cyber-Physical Systems, DFKI GmbH, Bremen, Germany<br>${ }^{2}$ Institute of Computer Science, University of Bremen, Germany<br>\{nabila,msoeken, drechsle\}@informatik. uni-bremen.de


#### Abstract

Quantum computing offers a promising emerging technology due to the potential theoretical capacity of solving many important problems with exponentially less complexity. Since most of the known quantum algorithms include Boolean components, the design of quantum computers is often conducted by a two-stage approach. In a first step, the Boolean component is realized in reversible logic and then mapped to quantum gates in a second step. This paper describes a new mapping flow for determining quantum gate realizations for single-target gates (ST). Since each ST gate contains a Boolean control function, our method attempts to find a decomposition based on its BDD representation. It consists on breaking large ST gate into smaller ones using additional lines. Experiments show that we obtain smaller realizations when comparing to standard mapping.


## 1 Introduction

Quantum computers are one of the most promising emerging technologies, generating interest from the corporate sector and attracting government investment. Quantum computers exploit the often counter-intuitive rules of quantum physics to perform computations in a substantially different and often much more efficient way than classical computers, enabling computational solutions to problems that are considered intractable for classical systems [1]. The design and fabrication of these machines has progressed rapidly in the past decade, with many research groups now routinely fabricating and operating small quantum computers in multiple physical systems.

Quantum computing does not only provide challenges for physicists but also offers a variety of challenging and interesting problems to the field of computer science. Large parts of quantum computers perform classical computations which can be described in terms of classical Boolean functions instead of arbitrary unitary operations as they are used for general quantum computing. However, since all quantum computers need to be reversible, also the classical computations need to be described in terms of reversible Boolean functions [2]. In order to create a quantum circuit from such a Boolean function, a first intermediate step synthesizes a reversible circuit description. The most common gate library for this step consists of mixed-polarity multiple-controlled Toffoli gates. Toffoli
gates offer a convenient representation to model the functionality of a reversible circuit but are still too abstract to be used as quantum operations. Many aspects, particularly those considering fault tolerance and error correction properties, cannot effectively be considered on that abstraction level. For the latter, quantum gate libraries are used that consist of a few quantum gates that typically act on at most 2 qubits: one of the currently prominent libraries is the Clifford $+T$ gate library [3]. Technology mapping is performed in order to map Toffoli gates to gates from the quantum gate library and the majority of methods that have been presented so far originate from [4].

Albeit providing a high-level representation for reversible circuits, the lower bound of the size of a reversible circuit consisting of Toffoli gates is exponential [5], i.e., for every number of variables there exists a reversible function for which the size of the minimal circuit is exponential. In order to avoid this complexity when addressing large reversible functions and circuits, recently single-target gates are considered as a representation for reversible circuits. They are a generalization of Toffoli gates and a linear upper bound for reversible circuits composed of these gates has been shown in [6]. Besides that, synthesis approaches presented in [6] and [7] are based on this gate representation. However, for technology mapping into quantum circuits, so far single-target gates are mapped into cascades of Toffoli gates which are then independently mapped using the techniques described in [4].

In this paper, we present a technology mapping approach that is directly based on single-target gates and makes use of Boolean decomposition and a constant number of ancillary lines. Working on the higher level abstraction allows significant cost reductions as shown by our experimental evaluations. In the best case, we were able to reduce the costs of the quantum circuit by $75 \%$ and in the average by about $20 \%$ for the Clifford $+T$ gate library.

## 2 Preliminaries

To keep the paper self-contained, this section reviews definitions and notations from Boolean functions, function decomposition, reversible circuits, and reversible synthesis.

### 2.1 Boolean Functions

Let $\mathbb{B} \stackrel{\text { def }}{=}\{0,1\}$ denote the Boolean values. Then we refer to $\mathcal{B}_{n, m} \stackrel{\text { def }}{=}\{f \mid$ $\left.f: \mathbb{B}^{n} \rightarrow \mathbb{B}^{m}\right\}$ as the set of all Boolean multiple-output functions with $n$ inputs and $m$ outputs. There are $2^{m 2^{n}}$ such Boolean functions. We write $\mathcal{B}_{n} \xlongequal{\text { def }} \mathcal{B}_{n, 1}$ and assume that each $f \in \mathcal{B}_{n}$ is represented by a propositional formula over the variables $x_{1}, \ldots, x_{n}$. Furthermore, we assume that each function $f \in \mathcal{B}_{n, m}$ is represented as a tuple $f=\left(f_{1}, \ldots, f_{m}\right)$ where $f_{i} \in \mathcal{B}_{n}$ for each $i \in\{1, \ldots, m\}$ and hence $f(\boldsymbol{x})=\left(f_{1}(\boldsymbol{x}), \ldots, f_{m}(\boldsymbol{x})\right)$ for each $\boldsymbol{x} \in \mathbb{B}^{n}$. If we emphasize on the domain of the function we write $f(X)$ where $X$ refers to the set of input variables.

### 2.2 Exclusive Sum Of Products

Exclusive sum-of-products (ESOPs, cf. [8]) are two-level descriptions for Boolean functions in which a function is composed of $k$ product terms that are combined using the exclusive-OR (XOR, $\oplus$ ) operation. A product term is the conjunction of $l_{i}$ literals where a literal is either a propositional variable $x^{1} \stackrel{\text { def }}{=} x$ or its negation $x^{0} \stackrel{\text { def }}{=} \bar{x}$. ESOPs are the most general form of two-level AND-XOR expressions:

$$
\begin{equation*}
f=\bigoplus_{i=1}^{k} x_{i_{1}}^{p_{i_{1}}} \wedge \cdots \wedge x_{i_{l_{i}}}^{p_{i_{l_{i}}}} \tag{1}
\end{equation*}
$$

Several restricted subclasses have been considered in the past, e.g., positive polarity Reed-Muller expressions (PPRM [8]) in which all literals are positive. There are further subclasses and most of them can be defined based on applying one of the following decomposition rules. An arbitrary Boolean function $f\left(x_{1}, x_{2}, \ldots, x_{n}\right)$ can be expanded with respect to a variable $x_{i}$ as

$$
\begin{array}{lr}
f=\bar{x}_{i} f_{\bar{x}_{i}} \oplus x_{i} f_{x_{i}} & \text { (Shannon) } \\
f=f_{\bar{x}_{i}} \oplus x_{i}\left(f_{\bar{x}_{i}} \oplus f_{x_{i}}\right) & \text { (positive Davio) } \\
f=f_{x_{i}} \oplus \bar{x}_{i}\left(f_{\bar{x}_{i}} \oplus f_{x_{i}}\right) & \text { (negative Davio) }
\end{array}
$$

with co-factors $f_{\bar{x}_{i}}=f\left(x_{1}, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_{n}\right)$ and $f_{x_{i}}=f\left(x_{1}, \ldots, x_{i-1}, 1, x_{i+1}, \ldots, x_{n}\right)$.

### 2.3 Boolean Function Decomposition

Boolean function decomposition describes the problem of finding, for a Boolean function, two or more simpler functions that being composed are functionally equivalent. Several types of Boolean function decomposition have been found in the last decades with the most important ones being:

1. Ashenhurst decomposition [9]: A function $f \in \mathcal{B}_{n}$ is decomposed into $f(X)=$ $h\left(g\left(X_{1}\right), X_{2}\right)$ with $g \in \mathcal{B}_{\left|X_{1}\right|}, h \in \mathcal{B}_{\left|X_{2}\right|+1}$, and $X=X_{1} \cup X_{2}$. If $X_{1} \cap X_{2}=\emptyset$, then the decomposition is called disjoint, otherwise it is called a non-disjoint decomposition. The set $X_{1}$ is called bound set and the set $X_{2}$ is called free set.
2. Curtis decomposition [10] is a generalization of the Ashenhurst decomposition with several inner functions of which each can have multiple outputs, i.e., $f(X)=h\left(g_{1}\left(X_{1}\right), g_{2}\left(X_{2}\right), \cdots, g_{k}\left(X_{k}\right), X_{k+1}\right)$ with $g_{i} \in \mathcal{B}_{\left|X_{i}\right|, m_{i}}$, $h \in \mathcal{B}_{\left|X_{k+1}\right|+m_{1}+\cdots+m_{k}}$, and $X=X_{1} \cup X_{2} \cup \cdots \cup X_{k+1}$.
3. Factorization [11]: The function is decomposed as $f(X)=g\left(X_{1}\right) \wedge h\left(X_{2}\right) \vee$ $c\left(X_{3}\right)$, with $g \in \mathcal{B}_{\left|X_{1}\right|}, h \in \mathcal{B}_{\left|X_{2}\right|}, c \in \mathcal{B}_{\left|X_{3}\right|}$ and $X=X_{1} \cup X_{2} \cup X_{3}$.
4. Bi-decomposition [12] is also known as simple decomposition. The function is decomposed into two sub-functions $f(X)=g\left(X_{1}\right) \odot h\left(X_{2}\right)$, with $g \in \mathcal{B}_{\left|X_{1}\right|}$, $h \in \mathcal{B}_{\left|X_{2}\right|}$ and $X=X_{1} \cup X_{2}$. The ' $\odot$ ' is any binary Boolean operation (typically $\vee, \wedge, \oplus$, or $\leftrightarrow$ ).


Fig. 1. Examples of reversible gates

When $X_{1}, X_{2}$, and $X_{3}$ are disjoint, the decomposition is called algebraic, otherwise Boolean or functional. Functional decomposition is much more powerful because the majority of Boolean functions are likely to have a functional decomposition rather than an algebraic one. Much work has been presented on decomposition algorithms based on truth tables [13] or binary decision diagrams (BDDs) [14-16].

### 2.4 Reversible Circuits

Reversible functions of $n$ variables can be realized by reversible circuits that consist of at least $n$ lines and are constructed as cascades of reversible gates that belong to a certain universal gate library. Although the Toffoli gate library is the most common gate library, single-target gates are of interest as they can lead to better circuits, e.g., lower quantum cost [7] and better circuit complexity [17].

Definition 1 (single-target gate). Given a set of variables $X=\left\{x_{1}, \ldots, x_{n}\right\}$, $a$ single-target gate $(S T) \mathrm{T}_{g}(C, t)$ with control lines $C=\left\{x_{i_{1}}, \ldots, x_{i_{k}}\right\} \subset X$, $a$ target line $t \in X \backslash C$, and $a$ control function $g \in \mathcal{B}_{k}$ inverts the variable on the target line if and only if $g\left(x_{i_{1}}, \ldots, x_{i_{k}}\right)$ evaluates to true. All other variables remain unchanged.

Definition 2 (Toffoli gate). Mixed-polarity multiple control Toffoli (MPMCT) gates are a subset of the single-target gates in which the control function $g$ can be represented with one product term or $g=\bigwedge_{k=i}^{j} x_{i}^{p}=1$. Multiple-control Toffoli gates (MCT) in turn are a subset from MPMCT gates in which the product terms can only consist of positive literals.


Fig. 2. Reversible circuit


Fig. 3. Quantum mapping of a Toffoli gate

Following from synthesis algorithm implementations, it can easily be shown that any reversible function $f \in \mathcal{B}_{n, n}$ can be realized by a reversible circuit with $n$ lines when using MCT gates. That is, it is not necessary to add any temporary lines (ancilla) to realize the circuit. This can be the case if the MCT (or MPMCT) gates are restricted to a given size, e.g., three bits. For drawing circuits, we follow the established conventions of using the symbol $\oplus$ to denote the target line, solid black circles to indicate positive controls, and white circles to indicate negative controls.

Example 1. Figure 1a shows a Toffoli gate with positive controls, Figure 1b shows a Toffoli gate with mixed-polarity control lines, and Figure 1c shows the representation of a single-target gate based on Feynman's notation. Figure 2 shows different Toffoli gates in a cascade forming a reversible circuit. The annotated values demonstrate the computation of the gate for a given input assignment.

### 2.5 Cost Metrics

To compare quantum circuits, we define metrics which depend on the gate library. For the NCV gate library, the quantum cost of a circuit is used while for the Clifford $+T$ gate library, the $T$-depth is used. The motivation for that cost measure origins from the fact that the $T$ gate is significantly larger compared to the other gates in the circuit.

Definition 3 (NCV-Cost). The NCV-cost is the total number of elementary gates used in a quantum circuit.

Definition 4 ( $T$-depth). The $T$-depth is the number of $T$-stages in a quantum circuit where each stage consists of one or more $T$ or $T^{\dagger}$ gates that can be performed concurrently on separate qubits. The total number of incorporated $T$ or $T^{\dagger}$ gates in the whole circuit is denoted by $T$-count


Fig. 4. Mapping from [4]

Example 2. The NCV-cost of the circuit shown in fig. 2 is equal to 8 since the total number of the elementary gates that realize a Toffoli gate is equal to 5 . The circuit has a $T$-count of 7 and $T$-depth of 3 .

### 2.6 Young Subgroup Synthesis

The young subgroup based synthesis approach makes use of the following property. Given a variable $x$, every reversible function $f \in \mathcal{B}_{n, n}$ can be decomposed into three functions $f=g_{2} \circ f^{\prime} \circ g_{1}$ such that $f^{\prime} \in \mathcal{B}_{n, n}$ is a reversible function that does not change in $x$, and $g_{1}, g_{2} \in \mathcal{B}_{n, n}$ are reversible functions that can be realized as single-target gates that act on $x$. By recursively applying the decomposition on the inner function $f^{\prime}$, one obtains $2 n$ single-target gates that realize $f$. After at most $n$ recursive applications $f^{\prime}$ represents the identity function, i.e., it does not change in any variable anymore. Details of the proof can be found in [18]. A synthesis algorithm based on the idea has initially been proposed in [6] that takes as input a reversible function represented by its truth table. The synthesis algorithm has been extended to work symbolically using binary decision diagrams in [7], which allows for handling larger functions.

## 3 Motivation

As mentioned above, a quantum circuits are described in terms of a reversible Boolean function. In order to derive a quantum circuit for the reversible function, a two step approach is usually applied: first a circuit description in terms of reversible gates is derived, which in the second step is mapped to a quantum circuit composed of gates from a given library. In these steps, reversible gates are very general; e.g., often MCT gates are used for which the number of controls is not restricted. More recently, also the use of MPMCT gates became common practice. Quantum gate libraries are much smaller and usually consist of a few gates which can act on at most 2 qubits. Two prominent quantum gate libraries are the NCV gate library and the Clifford $+T$ gate library. In particular, the latter library is of significant interest in the design of quantum computers due
to its good properties in fault tolerant quantum computing. Minimal quantum circuit realizations are known for the 2-controlled Toffoli gate and are shown in Figure 3.

For larger Toffoli gates a procedure from [4] is applied which, according to Lemma 7.3 in [4], maps a reversible Toffoli gate with $c \geq 3$ controls to a network consisting of two identical gates with $m$ controls and two other identical gates with $c-m+1$ controls, where $m \in\{2, \ldots, c-2\}$ and each of them are placed alternately. If no free line is available for the gate, a new helper line must be added to the circuit. Its value is restored and hence can be reused afterwards. Finally, each obtained gate is mapped according to Lemma 7.2 in [4]. As a result, all Toffoli gates have at most 2 control lines. At this point, the mapping given in Figure 3 can be applied.

Example 3. The procedure is illustrated in Figure 4 for a Toffoli gate with six control lines. The first circuit depicts the result after the application of Lemma 7.2 in [4], while the second network sketches the obtained circuit from the decomposition of the first gate in the dashed rectangle after applying Lemma 7.3 in [4].

So far, there is no mapping approach into quantum circuits that directly targets the single-target gates as it is done for the MPMCT gates. To map single-target gates, we aim to decompose them into MPMCT gates so that we can afterwards map each obtained MPMCT gate using the approach explained above.

The mapping of a single-target gate to an MPMCT cascade is so far done by computing the XOR decomposition of its controlling function, then each cube in the obtained expression is represented by an MPMCT gate.

Many other Boolean decompositions do exist and have shown good efficiency [14]. Motivated by this, we want to study the impact of applying different kinds of Boolean decompositions while mapping single-target gates to MPMCT gates, i.e., unlike the standard mapping, we will not restrict the decomposition to XOR but also to bi-decomposition, Ashenhurst, and Curtis decomposition.

## 4 Mapping of Single-target Gates

This section describes how Boolean decomposition can be applied to map reversible circuits composed of single-target gates into quantum circuits. Only the Young subgroup synthesis, for both the truth table based variant [6] and the BDD-based variant [7], makes use of single-target gates, however, due to the complexity of reversible circuits based on Toffoli gates (see, e.g., [17]) single-target gates are a preferable choice especially for large circuits.

Figure 5 shows how the functional decomposition of a single-target gate's control function can be used to generate less complex circuits. In the following we assume that the control function of the single-target gate that should be mapped depends on the variables $x_{1}, \ldots, x_{n-1}$. Figure $5($ a) shows the mapping approach for a disjoint Ashenhurst-Curtis decomposition. The variables are partitioned

$x_{n}^{\prime}=x_{n} \oplus h\left(g_{1}\left(\boldsymbol{x}_{1}\right), g_{2}\left(\boldsymbol{x}_{2}\right), g_{3}\left(\boldsymbol{x}_{3}\right), \boldsymbol{x}_{4}\right)$
(a) Ashenhurst-Curtis

(c) AND ( $\wedge$ )

$x_{n}^{\prime}=x_{n} \oplus f\left(\boldsymbol{x}_{1}, \boldsymbol{x}_{2}\right) \oplus g\left(\boldsymbol{x}_{2}, \boldsymbol{x}_{3}\right)$
(e) $\operatorname{XOR}(\oplus)$

$x_{n}^{\prime}=x_{n} \oplus\left(f\left(\boldsymbol{x}_{1}\right) \wedge g\left(\boldsymbol{x}_{2}\right) \oplus \bar{f}\left(\boldsymbol{x}_{1}\right) \wedge h\left(\boldsymbol{x}_{3}\right)\right)$
(b) MUX


$$
x_{n}^{\prime}=x_{n} \oplus\left(f\left(\boldsymbol{x}_{1}, \boldsymbol{x}_{2}\right) \vee g\left(\boldsymbol{x}_{2}, \boldsymbol{x}_{3}\right)\right)
$$

(d) OR (V)

$x_{n}^{\prime}=x_{n} \oplus\left(f\left(\boldsymbol{x}_{1}, \boldsymbol{x}_{2}\right) \leftrightarrow g\left(\boldsymbol{x}_{2}, \boldsymbol{x}_{3}\right)\right)$
(f) XNOR $(\leftrightarrow)$

Fig. 5. Different types of decomposition
into four sets of variables represented as bit-vectors $\boldsymbol{x}_{1}, \boldsymbol{x}_{2}, \boldsymbol{x}_{3}$, and $\boldsymbol{x}_{4}$. First, the inner functions $g_{1}, g_{2}$, and $g_{3}$ are computed and each of their results is stored on an additional helper line that is initialized with a constant 0 value. Having the resulting values on these lines the outer function can be computed and afterwards the constant values on the helper lines are restored by reapplying the inner functions.

Figures 5(b) and (c) show non-disjoint bi-decompositions based on the and and or operation, respectively. The sub-function $f$ depends on variables in $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ and the sub-function $g$ depends on variables in $\boldsymbol{x}_{2}$ and $\boldsymbol{x}_{3}$. As can be seen, the construction follows the representation of the Ashenhurst-Curtis decomposition in Figure 5(a). Whether a decomposition is disjoint or non-disjoint does not have an effect on the circuit construction but only on the size of the single-target gates in terms of their support. Also, a decomposition based on the mux operation can analogously be performed by adding an extra helper line (see Figure 5(d)).

Fig. 6. Example for the mapping flow

When using bi-decomposition based on the XOR and XNOR operator, one can update the target line directly as can be seen in Figures 5(e) and (f).

The remainder of this section discusses an example application of the approach illustrated in Figure 6. The starting point is a single-target gate that is controlled by a control function

$$
f\left(x_{1}, x_{2}, x_{3}, x_{4}\right)=\bar{x}_{1} \bar{x}_{2} \bar{x}_{3} \bar{x}_{4} \vee \bar{x}_{1} x_{2} \bar{x}_{3} x_{4} \vee x_{1} \bar{x}_{2} x_{3} \bar{x}_{4} \vee x_{1} x_{2} x_{3} x_{4}
$$

as also illustrated in its specification.
Decomposing the single-target gate using the standard mapping requires finding an ESOP representation of the function. Since $f$ is given in terms of its minterms, it already resembles an ESOP representation. However, one can obtain a smaller one in terms of literals by applying ESOP minimization techniques finally resulting in the Toffoli gate cascade depicted in the upper box of Figure 6. The circuit consists of 4 Toffoli gates each having 2 controls. Mapping it into quantum circuits using the algorithm presented in [4] gives quantum costs of 21 for the NCV gate library (see Figure 3b and note that a Toffoli gate with two negative controls requires at most 6 NCV gates) and a $T$-depth of 12 when using the Clifford $+T$ gate library. Each Toffoli gate has a $T$-depth of 3 as it is depicted in Figure 3c.

Applying our proposed flow will first find a disjoint bi-decomposition

$$
f\left(x_{1}, x_{2}, x_{3}, x_{4}\right)=g\left(x_{1}, x_{3}\right) \wedge h\left(x_{2}, x_{4}\right)
$$

with $g\left(x_{1}, x_{3}\right)=x_{1} \leftrightarrow x_{3}$ and $h\left(x_{2}, x_{4}\right)=x_{2} \leftrightarrow x_{4}$. Next, each of the resulting single-target gates controlled by $g$ and $h$ are mapped to Toffoli cascades as it is shown in the reversible circuit, each ST gate has two Toffoli gates with only one control while the last gate computes the AND of both sub-functions using a Toffoli gate with two controls. Finally, the resulting reversible circuit is mapped to a quantum network with the same algorithm used in the standard flow. The number of NCV gates of the resulting circuit is 9 (compared to 21) and the $T$-depth is 3 (compared to 12 ).

## 5 Experimental Evaluation

In order to confirm the benefits of incorporating the Boolean decomposition technique into the mapping flow of reversible circuits to quantum circuits described in Section 4, we have implemented the proposed idea in the open source toolkit RevKit [19]. The starting point is reversible circuits obtained from applying the BDD-based version of the Young subgroup synthesis [7], which creates reversible circuits composed of single-target gates ${ }^{3}$. We used the BDS-PGA tool [20] to decompose each control fucntion of a single-target gate to smaller ones. We restricted the decomposition of each single-target gate to at most 3 smaller

[^0]single-target gates to limit the use of additional lines to at most 3. To map the resulting smaller gates into cascades of Toffoli gates we used the Xor minimization algorithm implemented in EXORCISM [21]. Finally, we applied the quantum mapping algorithm explained in [4]. The experimental evaluation has been carried out on an Intel Core i5 processor with 4 GB of main memory.

Table 1 summarizes the obtained results. All benchmark names and original lines are listed in the first and second column, respectively. Then, the number of lines (L), the number of gates G, the NCV quantum costs (NCV), the $T$ depth (TD), the $H$-count (HC), and the required run-times (TIME) are provided for the synthesized circuits based on standard mapping and the synthesized circuits based on Boolean decomposition as explained in Section 4.

We provide absolute and relative improvement in the last two columns for quantum costs in terms of the NCV and the Clifford $+T$ gate libraries. The NCV quantum cost reductions and its relative improvement of the circuits obtained by the proposed technique with respect to the realized circuits without taking into account the Boolean decomposition are given in the columns denoted by $\Delta \mathrm{NCV}$ and IMP.NCV, respectively. The procedure presented above yields circuits with lower NCV quantum cost comparing to circuits obtained by standard mapping. The table shows a percentage improvement in terms of NCV quantum cost by approx. $16 \%$. In the best case improvements of up to $67 \%$ are observed (cycle10_2_61).

The $T$-depth cost reductions and its relative improvement are provided in the columns denoted by $\Delta \mathrm{TD}$ and Imp.TD, respectively. Also for this gate library realizations with fewer $T$-depth are obtained when our technique is applied. On average, the size of the resulting quantum gate cascades was decreased by $20 \%$. In the best cases, reductions of up to 47916 in the $T$-depth for the benchmark $b w_{-} 116$ are obtained.
Remarks and Observations. When applying the BDS-PGA tool to find a decomposition for a Boolean function that controls a single-target gate, it first searches for an algebraic decomposition and only looks for a Boolean decomposition if the first attempt is not successful. This process is done recursively for each resulting sub-function until Boolean functions with at most 2 inputs are reached. We adapted the tool such that recursion stops after maximum three decompositions in order to keep a reasonable number of additional lines.

We refer to the common set as the intersection of bound set and free set in Ashenhurst-Curtis decompositions and as the intersection of supports in bidecompositions. We have observed that for large common sets the results of the standard mapping approach outperforms our approach. To ensure good results, we only decomposed functions with a small common set.

We further noticed that bi-decompositions based on XOR and XNOR are less effective compared to the standard mapping since EXORCISM finds efficient ESOP representations. Consequently, we adapted the BDS-PGA tool such that it does not try to find bi-decompositions based on XOR or XNOR.

Finally, factorization with respect to a single variable can usually not improve the overall result since the incorporation of the variable on the helper line does
Table 1. Experimental evaluation

| Benchmark <br> ID | L | Classical mapping |  |  |  |  |  | Boolean decomposition based mapping |  |  |  |  |  | Improvements |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | L | G | NCV | TD | HC | time | L |  | NCV | TD | HC | time | $\Delta \mathrm{NCV}$ | ImP.NCV \| | $\Delta$ TD | Imp.tD |
| decod24 | 4 | 5 | 12 | 74 | 42 | 36 | 0.05 | 6 | 30 | 49 | 33 | 20 | 0.10 | -25 | 33.78\% | -9 | 21.43\% |
| 4_49 | 4 | 5 | 14 | 117 | 81 | 44 | 0.04 | 7 | 57 | 105 | 63 | 52 | 0.15 | -12 | 10.26\% | -18 | 22.22\% |
| aj-e11 | 4 | 4 | 10 | 128 | 90 | 50 | 0.05 | 7 | 62 | 112 | 69 | 66 | 0.14 | -16 | 12.50\% | -21 | 23.33\% |
| mod5mils | 5 | 5 | 3 | 11 | 6 | , | 0.01 | 7 | 8 | 11 | 3 | 2 | 0.04 | 0 | 0.00\% | -3 | 50.00\% |
| $\bmod 5 \mathrm{~d} 1$ | 5 | 5 | 4 | 20 | 12 | 8 | 0.01 | 7 | 9 | 13 | 3 | 2 | 0.03 | -7 | 35.00\% | -9 | 75.00\% |
| 4gt11 | 5 | 6 | 13 | 248 | 147 | 102 | 0.06 | 7 | 40 | 171 | 108 | 76 | 0.13 | -77 | 31.05\% | -39 | 26.53\% |
| 4 mod 5 | 5 | 5 | 22 | 315 | 186 | 126 | 0.05 | 8 | 58 | 213 | 123 | 80 | 0.16 | -102 | 32.38\% | -63 | 33.87\% |
| 4 gt 12 | 5 | 6 | 16 | 304 | 183 | 124 | 0.05 | 7 | 46 | 125 | 114 | 52 | 0.14 | -179 | 58.88\% | -69 | 37.70\% |
| hwb5 | 5 | 6 | 31 | 494 | 339 | 194 | 0.06 | 8 | 89 | 422 | 264 | 162 | 0.21 | -72 | 14.57\% | -75 | 22.12\% |
| $4 \bmod 7$ | 5 | 6 | 33 | 591 | 381 | 238 | 0.07 | 8 | 91 | 510 | 303 | 196 | 0.21 | -81 | 13.71\% | -78 | 20.47\% |
| mini-alu | 5 | 6 | 23 | 368 | 279 | 150 | 0.06 | 8 | 83 | 317 | 195 | 140 | 0.19 | -51 | 13.86\% | -84 | 30.11\% |
| alu | 5 | 5 | 21 | 310 | 261 | 122 | 0.05 | 8 | 81 | 233 | 168 | 88 | 0.17 | -77 | 24.84\% | -93 | 35.63\% |
| $4 \mathrm{gt5}$ | 5 | 6 | 24 | 524 | 312 | 212 | 0.06 | 8 | 85 | 357 | 216 | 146 | 0.18 | -167 | 31.87\% | -96 | 30.77\% |
| cm82a | 6 | 7 | 55 | 1479 | 939 | 590 | 0.10 | 9 | 98 | 1468 | 894 | 580 | 0.30 | -11 | 0.74\% | -45 | 4.79\% |
| C17 | 6 | 7 | 52 | 1394 | 942 | 560 | 0.12 | 9 | 88 | 1329 | 831 | 530 | 0.31 | -65 | 4.66\% | -111 | 11.78\% |
| ex3 | 6 | 7 | 46 | 1593 | 963 | 644 | 0.12 | 9 | 96 | 1275 | 825 | 518 | 0.32 | -318 | 19.96\% | -138 | 14.33\% |
| decod24-en | 6 | 7 | 28 | 1076 | 642 | 440 | 0.09 | 9 | 87 | 727 | 480 | 306 | 0.26 | -349 | 32.43\% | -162 | 25.23\% |
| f2 | 7 | 8 | 77 | 4617 | 2763 | 1866 | 0.16 | 9 | 87 | 4509 | 2700 | 1826 | 0.36 | -108 | 2.34\% | -63 | 2.28\% |
| ham7 | 7 | 7 | 44 | 429 | 300 | 164 | 0.10 | 10 | 95 | 303 | 186 | 120 | 0.28 | -126 | 29.37\% | -114 | 38.00\% |
| z4ml | 8 | 9 | 266 | 18204 | 11568 | 7276 | 0.54 | 11 | 273 | 18041 | 11349 | 7212 | 0.92 | -163 | 0.90\% | -219 | 1.89\% |
| rd73 | 9 | 10 | 503 | 45567 | 28125 | 65625 | 2.3 | 11 | 521 | 45567 | 28038 | 18240 | 2.88 | 0 | 0.00\% | -87 | 0.31\% |
| squar5 | 9 | 10 | 440 | 31815 | 19098 | 12984 | 1.39 | 11 | 313 | 31342 | 18930 | 12566 | 1.80 | -473 | 1.49\% | -168 | 0.88\% |
| wim | 9 | 10 | 276 | 30145 | 18432 | 12082 | 1.60 | 11 | 309 | 29291 | 17955 | 11742 | 1.90 | -854 | 2.83\% | -477 | 2.59\% |
| sym9 | 10 | 11 | 1175 | 137421 | 84426 | 196994 | 3.27 | 12 | 1186 | 137367 | 84393 | 54942 | 3.67 | -54 | 0.04\% | -33 | 0.04\% |
| dc1 | 10 | 11 | 228 | 36363 | 22140 | 14576 | 1.21 | 12 | 252 | 35719 | 21750 | 14328 | 1.57 | -644 | 1.77\% | -390 | 1.76\% |
| rd84 | 11 | 12 | 2063 | 303672 | 184290 | 121468 | 50.58 | 13 | 2076 | 303672 | 184245 | 121488 | 53.27 | 0 | 0.00\% | -45 | 0.02\% |
| cycle10 | 12 | 12 | 27 | 3182 | 1908 | 1272 | 0.14 | 15 | 113 | 1025 | 1032 | 448 | 0.23 | -2157 | 67.79\% | -876 | 45.91\% |
| pm1 | 13 | 14 | 1580 | 427692 | 259134 | 171112 | 132.81 | 16 | 1624 | 425828 | 257880 | 170372 | 137.18 | -1864 | 0.44\% | -1254 | 0.48\% |
| 0410184 | 14 | 14 | 189 | 18108 | 10956 | 7238 | 0.13 | 17 | 199 | 14948 | 9057 | 5980 | 0.23 | -3160 | 17.45\% | -1899 | 17.33\% |
| ham15 | 15 | 15 | 734 | 40510 | 25566 | 16156 | 13.68 | 18 | 734 | 34967 | 21963 | 13948 | 15.06 | -5543 | 13.68\% | -3603 | 14.09\% |
| bw | 32 | 32 | 2585 | 2690620 | 1614336 | 1076344 | 4678.56 | 34 | 3176 | 2598544 | 1566420 | 1039576 | 5367.99 | -92076 | 3.42\% | -47916 | 2.97\% |
| Average |  |  |  |  |  |  |  |  |  |  |  |  |  | -3401 | 16.63\% | -1821 | 20.75\% |

not minimize the functional support. We have therefore turned off that option in the BDS-PGA tool.

To summarize, we looked for function decompositions with a small common-set and allowed bi-decompositions only for the OR, AND, and MUX operator.

## 6 Conclusions

In this work, we proposed a mapping approach that starts with single-target gates and therefore significantly differs from the standard mapping approach that has been state-of-the-art for the last two decades. We observed that incorporating Boolean decomposition in the mapping process of single-target gates often leads to better quantum realizations. Motivated by this, we introduced an improved mapping scheme which uses a constant number of ancillary lines and exploits the Boolean decomposition when generating the quantum gate cascades for a given single-target gate. Including our approach results in quantum circuits with a smaller NCV quantum cost as well as a lower $T$-depth cost comparing to standard mapping results.

## References

1. Devitt, S.J.: Classical control of large-scale quantum computers. In: Int'l Conf. Reversible Computation. (2014) 26-39
2. Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cambridge Univ. Press (2000)
3. Fowler, A.G., Stephens, A.M., Groszkowski, P.: High-threshold universal quantum computation on the surface code. Physical Review A 80 (2009) 052312
4. Barenco, A., Bennett, C.H., Cleve, R., DiVinchenzo, D., Margolus, N., Shor, P., Sleator, T., Smolin, J., Weinfurter, H.: Elementary gates for quantum computation. Physical Review A 52 (1995) 3457-3467
5. Maslov, D., Dueck, G.W., Miller, D.M.: Toffoli network synthesis with templates. IEEE Trans. on CAD of Integrated Circuits and Systems 24(6) (2005) 807-817
6. De Vos, A., Van Rentergem, Y.: Young subgroups for reversible computers. Advances in Mathematics of Communications 2(2) (2008) 183-200
7. Soeken, M., Tague, L., Dueck, G.W., Drechsler, R.: Ancilla-free synthesis of large reversible functions using binary decision diagrams. CoRR abs/1408.3955 (2014)
8. Sasao, T.: AND-EXOR expressions and their optimization. In Sasao, T., ed.: Logic Synthesis and Optimization. Kluwer Academic Publisher (1993) 287-312
9. Ashenhurst, R.L.: The decomposition of switching functions. In: Int'l Symp. on the Theory of Switching. (1957) 74-116
10. Curtis, H.A.: A new approach to the design of switching circuits. van Nostrand Princeton, NJ (1962)
11. Brayton, R.K.: Factoring logic functions. IBM Journal of Research and Development 31(2) (1987) 187-198
12. Sasao, T., Matsuura, M.: DECOMPOS: An integrated system for functional decomposition. In: Int'l Workshop on Logic Synthesis. (1998) 471-477
13. Mishchenko, A., Brayton, R.K., Chatterjee, S.: Boolean factoring and decomposition of logic networks. In: Int'l Conf. on Computer-Aided Design. (2008) 38-44
14. Mishchenko, A., Steinbach, B., Perkowski, M.A.: An algorithm for bi-decomposition of logic functions. In: Design Automation Conference. (2001) 103-108
15. Bertacco, V., Damiani, M.: The disjunctive decomposition of logic functions. In: Int'l Conf. on Computer-Aided Design. (1997) 78-82
16. Yang, C., Ciesielski, M.J.: BDS: a BDD-based logic optimization system. IEEE Trans. on CAD of Integrated Circuits and Systems 21(7) (2002) 866-876
17. Abdessaied, N., Soeken, M., Thomsen, M.K., Drechsler, R.: Upper bounds for reversible circuits based on Young subgroups. Information Processing Letters 114(6) (2014) 282-286
18. Van Rentergem, Y., De Vos, A., Storme, L.: Implementing an arbitrary reversible logic gate. Journal of Physics A: Mathematical and General 38(16) (2005) 3555-3577
19. Soeken, M., Frehse, S., Wille, R., Drechsler, R.: RevKit: A toolkit for reversible circuit design. Journal of Multiple-Valued Logic \& Soft Computing 18(1) (2012) RevKit is available at http://www.revkit.org.
20. Vemuri, N., Kalla, P., Tessier, R.: BDD-based logic synthesis for LUT-based FPGAs. ACM Trans. Design Autom. Electr. Syst. 7(4) (2002) 501-525
21. Mishchenko, A., Perkowski, M.: Fast heuristic minimization of exclusive-sums-ofproducts. In: Int'l Workshop on Applications of the Reed-Muller Expansion in Circuit Design. (2001) 242-250

[^0]:    ${ }^{3}$ Benchmarks were taken from http://webhome.cs.uvic.ca/~dmaslov/ and http: //www.revlib.org

