`

.. image:: /tutorials/images/neural_network.svg
    :align: center
    :width: 85%
    :target: javascript:void(0);

:html:`

`

Here, the neural network depth is determined by the number of layers, while the maximum width is
given by the layer with the greatest number of neurons. The network begins with an input layer of
real-valued neurons, which feed forward onto a series of one or more hidden layers. Following the
notation of [[1]_], if the :math:`n` neurons at one layer are given by the vector
:math:`\mathbf{x} \in \mathbb{R}^{n}`, the :math:`m` neurons of the next layer take the values

.. math:: \mathcal{L}(\mathbf{x}) = \varphi (W \mathbf{x} + \mathbf{b}),

where

* :math:`W \in \mathbb{R}^{m \times n}` is a matrix,
* :math:`\mathbf{b} \in \mathbb{R}^{m}` is a vector, and
* :math:`\varphi` is a nonlinear function (also known as the activation function).

The matrix multiplication :math:`W \mathbf{x}` is a linear transformation on :math:`\mathbf{x}`,
while :math:`W \mathbf{x} + \mathbf{b}` represents an **affine transformation**. In principle, any
nonlinear function can be chosen for :math:`\varphi`, but often the choice is fixed from a
`standard set of activations

`

.. image:: /tutorials/images/layer.svg
    :align: center
    :width: 70%
    :target: javascript:void(0);

:html:`

`

These layers can then be composed to form a quantum neural network. The width of the network can
also be varied between layers [[1]_].

Reproducing classical neural networks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's see how the quantum layer can embed the transformation
:math:`\mathcal{L}(\mathbf{x}) = \varphi (W \mathbf{x} + \mathbf{b})` of a classical neural network
layer. Suppose :math:`N`-dimensional data is encoded in position eigenstates so that

.. math:: \mathbf{x} \Leftrightarrow \ket{\mathbf{x}} := \ket{x_{1}} \otimes \ldots \otimes \ket{x_{N}}.

We want to perform the transformation

.. math:: \ket{\mathbf{x}} \Rightarrow \ket{\varphi (W \mathbf{x} + \mathbf{b})}.

It turns out that the quantum circuit above can do precisely this! Consider first the affine
transformation :math:`W \mathbf{x} + \mathbf{b}`. Leveraging the singular value decomposition, we
can always write :math:`W = O_{2} \Sigma O_{1}` with :math:`O_{k}` orthogonal matrices and
:math:`\Sigma` a positive diagonal matrix. These orthogonal transformations can be carried out
using interferometers without access to phase, i.e., with :math:`\boldsymbol{\phi}_{k} = 0`:

.. math:: U_{k}(\boldsymbol{\theta}_{k},\mathbf{0})\ket{\mathbf{x}} = \ket{O_{k} \mathbf{x}}.

On the other hand, the diagonal matrix
:math:`\Sigma = {\rm diag}\left(\{c_{i}\}_{i=1}^{N}\right)` can be achieved through squeezing:

.. math:: \otimes_{i=1}^{N}S(r_{i})\ket{\mathbf{x}} \propto \ket{\Sigma \mathbf{x}},

with :math:`r_{i} = \log (c_{i})`. Finally, the addition of a bias vector :math:`\mathbf{b}` is
done using position displacement gates:

.. math:: \otimes_{i=1}^{N}D(\alpha_{i})\ket{\mathbf{x}} = \ket{\mathbf{x} + \mathbf{b}},

with :math:`\mathbf{b} = \{\alpha_{i}\}_{i=1}^{N}` and :math:`\alpha_{i} \in \mathbb{R}`.
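To make the decomposition concrete, here is a minimal classical sketch in NumPy. It involves no quantum simulation and only checks the arithmetic of the sequence orthogonal map, diagonal rescaling, orthogonal map, shift. The random matrix ``W`` and the names ``O1``, ``O2``, ``c``, and ``r`` are assumptions made for illustration; normalization factors and the sign convention of any particular squeezing gate are ignored.

```python
import numpy as np

# Hypothetical example data: a random square weight matrix, bias, and input.
rng = np.random.default_rng(0)
N = 3
W = rng.normal(size=(N, N))
b = rng.normal(size=N)
x = rng.normal(size=N)

# Singular value decomposition: W = O2 @ diag(c) @ O1,
# with O1 and O2 orthogonal and c the (positive) singular values.
O2, c, O1 = np.linalg.svd(W)

# Following the text, the rescaling parameters are r_i = log(c_i).
r = np.log(c)

y = O1 @ x          # first orthogonal transformation (role of U_1)
y = np.exp(r) * y   # coordinate-wise rescaling by c_i = exp(r_i) (role of squeezing)
y = O2 @ y          # second orthogonal transformation (role of U_2)
y = y + b           # shift by the bias vector (role of displacement)

# The four steps together reproduce the affine transformation W x + b.
print(np.allclose(y, W @ x + b))
```

A square ``W`` is used here because the layer acts on a fixed number of modes; the same check works for any matrix with strictly positive singular values.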
Putting this all together, we see that the operation
:math:`\mathcal{D} \circ \mathcal{U}_{2} \circ \mathcal{S} \circ \mathcal{U}_{1}` with phaseless
interferometers and position displacement performs the transformation
:math:`\ket{\mathbf{x}} \Rightarrow \ket{W \mathbf{x} + \mathbf{b}}` on position eigenstates.

.. warning::

    The TensorFlow backend is the natural simulator for quantum neural networks in Strawberry
    Fields, but this backend cannot naturally accommodate position eigenstates, which require
    infinite squeezing. For simulation of position eigenstates in this backend, the best approach
    is to use a displaced squeezed state (:class:`prepare_displaced_squeezed_state

`

.. figure:: /tutorials/images/layer_1mode.svg
    :align: center
    :width: 31%
    :target: javascript:void(0);

    One mode layer

:html:`

`

.. figure:: /tutorials/images/layer_2mode.svg
    :align: center
    :width: 46%
    :target: javascript:void(0);

    Two mode layer

:html:`

`

.. figure:: /tutorials/images/layer_3mode.svg
    :align: center
    :width: 75%
    :target: javascript:void(0);

    Three mode layer

:html:`

`

.. figure:: /tutorials/images/layer_4mode.svg
    :align: center
    :width: 90%
    :target: javascript:void(0);

    Four mode layer

:html:`

`

Here, the multimode linear interferometers :math:`U_{1}` and :math:`U_{2}` have been decomposed
into two-mode phaseless beamsplitters (:class:`~strawberryfields.ops.BSgate`) and single-mode phase
shifters (:class:`~strawberryfields.ops.Rgate`) using the Clements decomposition [[3]_]. The Kerr
gate is used as the non-Gaussian gate.

Code
----

First, we import Strawberry Fields, TensorFlow, and NumPy:

"""

import numpy as np
import tensorflow as tf
import strawberryfields as sf
from strawberryfields import ops

######################################################################
# Before we begin defining our optimization problem, let's first create
# some convenient utility functions.
#
# Utility functions
# ~~~~~~~~~~~~~~~~~
#
# The first step to writing a CV quantum neural network layer in Strawberry Fields is to define a
# function for the two interferometers:


def interferometer(params, q):
    """Parameterised interferometer acting on ``N`` modes.

    Args:
        params (list[float]): list of length ``max(1, N-1) + (N-1)*N`` parameters.

            * The first ``N(N-1)/2`` parameters correspond to the beamsplitter angles
            * The second ``N(N-1)/2`` parameters correspond to the beamsplitter phases
            * The final ``N-1`` parameters correspond to local rotation on the first N-1 modes

        q (list[RegRef]): list of Strawberry Fields quantum registers the interferometer
            is to be applied to
    """
    N = len(q)
    theta = params[:N*(N-1)//2]
    phi = params[N*(N-1)//2:N*(N-1)]
    rphi = params[-N+1:]

    if N == 1:
        # the interferometer is a single rotation
        ops.Rgate(rphi[0]) | q[0]
        return

    n = 0  # keep track of free parameters

    # Apply the rectangular beamsplitter array
    # The array depth is N
    for l in range(N):
        for k, (q1, q2) in enumerate(zip(q[:-1], q[1:])):
            # skip even or odd pairs depending on layer
            if (l + k) % 2 != 1:
                ops.BSgate(theta[n], phi[n]) | (q1, q2)
                n += 1

    # apply the final local phase shifts to all modes except the last one
    for i in range(max(1, N - 1)):
        ops.Rgate(rphi[i]) | q[i]


######################################################################
# .. warning::
#
#     The :class:`~strawberryfields.ops.Interferometer` class in Strawberry Fields does not reproduce
#     the functionality above. Instead, :class:`~strawberryfields.ops.Interferometer` applies a given
#     input unitary matrix according to the Clements decomposition.
#
# Using the above ``interferometer`` function, an :math:`N` mode CV quantum neural network layer is
# given by the function:


def layer(params, q):
    """CV quantum neural network layer acting on ``N`` modes.

    Args:
        params (list[float]): list of length ``2*(max(1, N-1) + N**2 + N)`` containing
            the parameters for the layer
        q (list[RegRef]): list of Strawberry Fields quantum registers the layer
            is to be applied to
    """
    N = len(q)
    M = int(N * (N - 1)) + max(1, N - 1)

    int1 = params[:M]
    s = params[M:M+N]
    int2 = params[M+N:2*M+N]
    dr = params[2*M+N:2*M+2*N]
    dp = params[2*M+2*N:2*M+3*N]
    k = params[2*M+3*N:2*M+4*N]

    # begin layer
    interferometer(int1, q)

    for i in range(N):
        ops.Sgate(s[i]) | q[i]

    interferometer(int2, q)

    for i in range(N):
        ops.Dgate(dr[i], dp[i]) | q[i]
        ops.Kgate(k[i]) | q[i]


######################################################################
# Finally, we define one more utility function to help us initialize
# the TensorFlow weights for our quantum neural network layers:


def init_weights(modes, layers, active_sd=0.0001, passive_sd=0.1):
    """Initialize a 2D TensorFlow Variable containing normally-distributed
    random weights for an ``N`` mode quantum neural network with ``L`` layers.

    Args:
        modes (int): the number of modes in the quantum neural network
        layers (int): the number of layers in the quantum neural network
        active_sd (float): the standard deviation used when initializing
            the normally-distributed weights for the active parameters
            (displacement, squeezing, and Kerr magnitude)
        passive_sd (float): the standard deviation used when initializing
            the normally-distributed weights for the passive parameters
            (beamsplitter angles and all gate phases)

    Returns:
        tf.Variable[tf.float32]: A TensorFlow Variable of shape
        ``[layers, 2*(max(1, modes-1) + modes**2 + modes)]``, where the Lth
        row represents the layer parameters for the Lth layer.
    """
    # Number of interferometer parameters:
    M = int(modes * (modes - 1)) + max(1, modes - 1)

    # Create the TensorFlow variables
    int1_weights = tf.random.normal(shape=[layers, M], stddev=passive_sd)
    s_weights = tf.random.normal(shape=[layers, modes], stddev=active_sd)
    int2_weights = tf.random.normal(shape=[layers, M], stddev=passive_sd)
    dr_weights = tf.random.normal(shape=[layers, modes], stddev=active_sd)
    dp_weights = tf.random.normal(shape=[layers, modes], stddev=passive_sd)
    k_weights = tf.random.normal(shape=[layers, modes], stddev=active_sd)

    weights = tf.concat(
        [int1_weights, s_weights, int2_weights, dr_weights, dp_weights, k_weights], axis=1
    )

    weights = tf.Variable(weights)

    return weights


######################################################################
# Optimization
# ~~~~~~~~~~~~
#
# Now that we have our utility functions, let's begin defining our optimization problem.
# In this particular example, let's create a 1 mode CVQNN with 8 layers and a Fock-basis
# cutoff dimension of 6. We will train this QNN to output a desired target state:
# a single photon state.

# set the random seed
tf.random.set_seed(137)
np.random.seed(137)

# define width and depth of CV quantum neural network
modes = 1
layers = 8
cutoff_dim = 6

# defining desired state (single photon state)
target_state = np.zeros(cutoff_dim)
target_state[1] = 1
target_state = tf.constant(target_state, dtype=tf.complex64)

######################################################################
# Now, let's initialize an engine with the TensorFlow ``"tf"`` backend,
# and begin constructing our QNN program.

# initialize engine and program
eng = sf.Engine(backend="tf", backend_options={"cutoff_dim": cutoff_dim})
qnn = sf.Program(modes)

# initialize QNN weights
weights = init_weights(modes, layers)  # our TensorFlow weights
num_params = np.prod(weights.shape)  # total number of parameters in our model

######################################################################
# To construct the program, we must create and use Strawberry Fields symbolic
# gate arguments. These will be mapped to the TensorFlow variables on engine
# execution.

# Create array of Strawberry Fields symbolic gate arguments, matching
# the size of the weights Variable.
# (the ``np.str`` alias has been removed from recent NumPy releases, so use ``str``)
sf_params = np.arange(num_params).reshape(weights.shape).astype(str)
sf_params = np.array([qnn.params(*i) for i in sf_params])

# Construct the symbolic Strawberry Fields program by
# looping and applying layers to the program.
with qnn.context as q:
    for k in range(layers):
        layer(sf_params[k], q)

######################################################################
# where ``sf_params`` is a real array of size ``[layers, 2*(max(1, modes-1) + modes**2 + modes)]``
# containing the symbolic gate arguments for the quantum neural network.
#
# Now that our QNN program is defined, we can create our **cost function**.
# Our cost function simply executes the QNN on our engine using the values of the
# input weights.
#
# Since we want to maximize the fidelity
# :math:`f(w) = \left|\langle \psi(w) | \psi_t\rangle\right|^2`
# between our QNN output state :math:`|\psi(w)\rangle` and our target state
# :math:`|\psi_t\rangle`, we compute the inner product between the two statevectors,
# as well as the norm :math:`\left\lVert \psi(w) - \psi_t\right\rVert`.
#
# Finally, we also return the trace of the output QNN state. This should always
# have a value close to 1. If it deviates significantly from 1, this is an
# indication that we need to increase our Fock-basis cutoff.

def cost(weights):
    # Create a dictionary mapping from the names of the Strawberry Fields
    # symbolic gate parameters to the TensorFlow weight values.
    mapping = {p.name: w for p, w in zip(sf_params.flatten(), tf.reshape(weights, [-1]))}

    # run the engine
    state = eng.run(qnn, args=mapping).state
    ket = state.ket()

    difference = tf.reduce_sum(tf.abs(ket - target_state))
    fidelity = tf.abs(tf.reduce_sum(tf.math.conj(ket) * target_state)) ** 2
    return difference, fidelity, ket, tf.math.real(state.trace())


######################################################################
# We are now ready to minimize our cost function using TensorFlow:

# set up the optimizer
opt = tf.keras.optimizers.Adam()
cost_before, fidelity_before, _, _ = cost(weights)

# Perform the optimization
for i in range(1000):
    # reset the engine if it has already been executed
    if eng.run_progs:
        eng.reset()

    with tf.GradientTape() as tape:
        loss, fid, ket, trace = cost(weights)

    # one repetition of the optimization
    gradients = tape.gradient(loss, weights)
    opt.apply_gradients(zip([gradients], [weights]))

    # Prints progress at every rep
    if i % 1 == 0:
        print("Rep: {} Cost: {:.4f} Fidelity: {:.4f} Trace: {:.4f}".format(i, loss, fid, trace))


print("\nFidelity before optimization: ", fidelity_before.numpy())
print("Fidelity after optimization: ", fid.numpy())
print("\nTarget state: ", target_state.numpy())
print("Output state: ", np.round(ket.numpy(), decimals=3))

######################################################################
# For more applications of CV quantum neural networks, see the :doc:`state learning `
# and :doc:`gate synthesis ` demonstrations.
#
# References
# ----------
#
# .. [1] Nathan Killoran, Thomas R Bromley, Juan Miguel Arrazola, Maria Schuld, Nicolás Quesada, and
#    Seth Lloyd. Continuous-variable quantum neural networks. arXiv preprint arXiv:1806.06871,
#    2018.
#
# .. [2] Maria Schuld, Ville Bergholm, Christian Gogolin, Josh Izaac, and Nathan Killoran. Evaluating
#    analytic gradients on quantum hardware. Physical Review A, 99(3):032331, 2019.
#
# .. [3] William R Clements, Peter C Humphreys, Benjamin J Metcalf, W Steven Kolthammer, and Ian A
#    Walmsley. Optimal design for universal multiport interferometers. Optica, 3(12):1460–1465,
#    2016. doi:10.1364/OPTICA.3.001460.