Note

Click here to download the full example code

# Graph similarity¶

*Technical details are available in the API documentation:*
sf.apps.similarity

This page looks at how to use GBS to construct a similarity measure between graphs, known as a graph kernel [1]. Kernels can be applied to graph-based data for machine learning tasks such as classification using a support vector machine.

## Graph data¶

We begin by fixing a dataset of graphs to consider and loading GBS samples from these graphs, which will be needed in the following.

Let’s use the MUTAG dataset of graphs [2][3]. This is a dataset of 188 different graphs that each correspond to the structure of a chemical compound. Our goal is to use GBS samples from these graphs to measure their similarity.

The `data`

module provides pre-calculated GBS samples for selected
graphs in the MUTAG dataset. Each set of samples is generated by encoding the graph into a GBS
device, and collecting photon click events. We’ll start by loading four sets of samples and
visualizing the corresponding graphs.

```
from strawberryfields.apps import data, plot, similarity
m0 = data.Mutag0()
m1 = data.Mutag1()
m2 = data.Mutag2()
m3 = data.Mutag3()
```

These datasets contain both the adjacency matrix of the graph and the samples generated through GBS. We can access the adjacency matrix through:

Samples from these graphs can be accessed by indexing:

```
print(m0[0])
```

Out:

```
[0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

We can now plot the four graphs using the `plot`

module. To use
this module, we need to convert the adjacency matrices into NetworkX Graphs:

```
import networkx as nx
import plotly
```

Plotting `m0_a`

:

```
plot.graph(nx.Graph(m0_a))
```

Plotting `m1_a`

:

```
plot.graph(nx.Graph(m1_a))
```

Plotting `m2_a`

:

```
plot.graph(nx.Graph(m2_a))
```

Plotting `m3_a`

:

```
plot.graph(nx.Graph(m3_a))
```

The graphs of `m1_a`

and `m2_a`

look very similar. In fact,
it turns out that they are *isomorphic* to each other, which means that the graphs can be made
identical by permuting their node labels.

## Creating a feature vector¶

Following [1], we can create a *feature vector* to describe each graph.
These feature vectors contain information about the graphs and can be viewed as a mapping to a
high-dimensional feature space, a technique often used in machine learning that allows us to
employ properties of the feature space to separate and classify the vectors.

The feature vector of a graph can be composed in a variety of ways. One approach is to associate features with the relative frequencies of certain types of measurements being recorded from a GBS device configured to sample from the graph, as we now discuss.

We begin by defining the concept of an *orbit*, which is the set of all GBS samples that are
equivalent under permutation of the modes. A sample can be converted to its corresponding orbit
using the `sample_to_orbit()`

function. For example, the
first sample of `m0`

is `[0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]`

and has orbit:

```
print(similarity.sample_to_orbit(m0[0]))
```

Out:

```
[1, 1]
```

Here, `[1, 1]`

means that two photons were detected, each in a separate mode. Other samples
can be randomly generated from the `[1, 1]`

orbit using:

```
print(similarity.orbit_to_sample([1, 1], modes=m0.modes))
```

Out:

```
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```

Orbits provide a useful way to coarse-grain the samples from GBS into outcomes that are
statistically more likely to be observed. However, we are interested in coarse-graining further
into *events*, which correspond to a combination of orbits with the same photon number such
that the number of photons counted in each mode does not exceed a fixed value
`max_count_per_mode`

. To understand this, let’s look at all of the orbits with a photon
number of 5:

```
print(list(similarity.orbits(5)))
```

Out:

```
[[1, 1, 1, 1, 1], [2, 1, 1, 1], [3, 1, 1], [2, 2, 1], [4, 1], [3, 2], [5]]
```

All 5-photon samples belong to one of the orbits above. A 5-photon event with
`max_count_per_mode = 3`

means that we include the orbits: ```
[[1, 1, 1, 1, 1], [2, 1, 1, 1],
[3, 1, 1], [2, 2, 1], [3, 2]]
```

and ignore the orbits `[[4, 1], [5]]`

. For example,
the sample `[0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0]`

is a 5-photon event:

```
print(similarity.sample_to_event([0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0], 3))
```

Out:

```
5
```

Samples with more than `max_count_per_mode`

in any mode are not counted as part of the event:

```
print(similarity.sample_to_event([0, 4, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], 3))
```

Out:

```
None
```

Now that we have mastered orbits and events, how can we make a feature vector? It was shown in [1] that one way of making a feature vector of a graph is through the frequencies of orbits or events. For example, for a \(k\) photon event \(E_{k, n_{\max}}\) with maximum count per mode \(n_{\max}\) and corresponding probability \(p_{k, n_{\max}}:=p_{E_{k, n_{\max}}}(G)\) with respect to a graph \(G\), a feature vector can be written as

where \(\mathbf{k} := (k_{1}, k_{2}, \ldots , k_{K})\) is a list of different total photon numbers.

For example, if \(\mathbf{k} := (2, 4, 6, 8)\) and \(n_{\max} = 2\), we have

In this case, we are interested in the probabilities of events \(E_{2, 2}\), \(E_{4,
2}\), and \(E_{6, 2}\). Suppose we are sampling from a four-mode device and have the samples
`[0, 3, 0, 1]`

and `[1, 2, 0, 1]`

. These samples are part of the orbits `[3, 1]`

and
`[2, 1, 1]`

, respectively. However, `[3, 1]`

is not part of the \(E_{4, 2}\) event while
`[2, 1, 1]`

is.

## Calculating a feature vector¶

We provide three methods for calculating a feature vector in the
`similarity`

module of Strawberry Fields:

Through sampling.

Using exact probability calculations.

Using a Monte Carlo estimate of the probability.

The first method uses statistical inference. Hence, all one needs to do is generate some GBS samples from the graph of interest and fix the composition of the feature vector. For example, to obtain feature vector \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) for the first MUTAG graph, we use:

```
print(similarity.feature_vector_events_sampling(m0, [2, 4], 2))
```

Out:

```
[0.19035, 0.2047]
```

We can also use any orbits of our choice instead of events:

```
print(similarity.feature_vector_orbits_sampling(m0, [[1, 1], [2], [1, 1, 1, 1], [2, 1, 1]]))
```

Out:

```
[0.19035, 0.0, 0.1352, 0.05175]
```

For the second method, we calculate the orbit probabilities exactly rather than through sampling. Considering a feature vector of orbit probabilities, the probability of a single orbit \(p(O)\) is given by:

where \(S\) represents a GBS output click pattern. Calculating each \(p(S)\) requires computing a hafnian, which gets exponentially difficult with increasing photon number. The exact probability of an event \(p_{k,n_{\max}}\) can be calculated in a similar manner.

Built-in functions `feature_vector_orbits()`

and
`feature_vector_events()`

can be used to get exact feature vectors. These functions
use a keyword argument `samples`

to signal producing either exact or Monte Carlo estimated
probabilities, as shown later. `samples`

is set to `None`

to get an exact feature vector by
default. To use Monte Carlo estimation, `samples`

can be set to the number of samples desired
to be used in the estimation. For example, to get the exact event probabilities in the feature
vector example \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) seen previously, we use:

```
print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2))
```

Out:

```
[0.22406643948198376, 0.2156557698525557]
```

Although they are precise, exact calculations for large matrices can be tough to evaluate. Additionally, what makes calculating \(p_{k, n_{\max}}\) really challenging is the number of GBS samples the corresponding event contains. For example, a 6-photon event over 17 modes \(E_{k=6, n_{\max}=2}\) contains the following number of samples:

```
print(similarity.event_cardinality(6, 2, 17))
```

Out:

```
58276
```

To avoid calculating a large number of sample probabilities, an alternative is to perform Monte Carlo estimation. Here, samples within an orbit or event are selected uniformly at random and their exact probabilities are calculated. For example, for an event \(E_{k, n_{\max}}\), if \(N\) samples \(\{S_{1}, S_{2}, \ldots , S_{N}\}\) are randomly selected from all possible samples in this event, then the event probability can be approximated as

with \(|E_{k, n_{\max}}|\) denoting the cardinality of the event.

This method can be accessed using the
`feature_vector_events()`

function
with `samples`

set to the number of samples desired to be used in the estimation.
For example, to get MC-estimated probabilities for our example feature vector
\(f_{\mathbf{k} = (2, 4), n_{\max}=2}\), we use:

```
print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))
```

Out:

```
[0.23997515668520494, 0.19922220390655626]
```

Note

The results of using Monte Carlo estimation with
`feature_vector_orbits()`

and
`feature_vector_events()`

are probabilistic and may
vary between runs. Increasing the `samples`

parameter will increase the precision but
slow down the calculation.

## Machine learning with GBS graph kernels¶

The power of feature vectors that embed graphs in a vector space of real numbers is that we can now measure similarities between graphs. This is very useful in machine learning, where similar labels are assigned to graphs that are close to each other. GBS feature vectors therefore give rise to a similarity measure between graphs!

Let’s build this up a bit more. The MUTAG dataset we are considering contains not only graphs
corresponding to the structure of chemical compounds, but also a *label* of each
compound based upon its mutagenic effect. The four graphs we consider here have labels:

MUTAG0: Class 1

MUTAG1: Class 0

MUTAG2: Class 0

MUTAG3: Class 1

```
classes = [1, 0, 0, 1]
```

We can use GBS feature vectors in a support vector machine (SVM) that finds a separating hyperplane between classes in the feature space. We start by defining two-dimensional feature vectors:

```
events = [8, 10]
max_count = 2
f1 = similarity.feature_vector_events_sampling(m0, events, max_count)
f2 = similarity.feature_vector_events_sampling(m1, events, max_count)
f3 = similarity.feature_vector_events_sampling(m2, events, max_count)
f4 = similarity.feature_vector_events_sampling(m3, events, max_count)
import numpy as np
R = np.array([f1, f2, f3, f4])
print(R)
```

Out:

```
[[0.0884 0.042 ]
[0.0704 0.0285]
[0.0699 0.0294]
[0.0962 0.0459]]
```

The choice of what `events`

to use for the feature vectors can be significant and we
encourage the reader to explore different combinations. We can also use any orbits of our
choice above instead of events. Note, however, that GBS samples with odd total number of
photons have zero probability when using ideal GBS, which only generates and outputs pairs of
photons.

Given our points in the feature space and their target labels, we can use scikit-learn’s Support Vector Machine LinearSVC as our model to train:

```
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
R_scaled = StandardScaler().fit_transform(R) # Transform data to zero mean and unit variance
classifier = LinearSVC()
classifier.fit(R_scaled, classes)
```

Out:

```
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='squared_hinge', max_iter=1000,
multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
verbose=0)
```

Here, the term “linear” refers to the *kernel* function used to calculate inner products
between vectors in the space. We can use a linear SVM because we have already embedded the
graphs in a feature space based on GBS. We have also rescaled the feature vectors so that they
zero mean and unit variance using scikit-learn’s `StandardScaler`

, a technique
often used in machine learning.

We can then visualize the trained SVM by plotting the decision boundary with respect to the points:

```
w = classifier.coef_[0]
i = classifier.intercept_[0]
m = -w[0] / w[1] # finding the values for y = mx + b
b = -i / w[1]
xx = [-1, 1]
yy = [m * x + b for x in xx]
fig = plot.points(R_scaled, classes)
fig.add_trace(plotly.graph_objects.Scatter(x=xx, y=yy, mode="lines"))
fig
```

This plot shows the two classes (grey points for class 0 and red points for class 1) successfully separated by the linear hyperplane using the GBS feature space. Moreover, recall that the two MUTAG1 and MUTAG2 graphs of class 0 are actually isomorphic. Reassuringly, their corresponding feature vectors are very similar. In fact, the feature vectors of isomorphic graphs should always be identical [4] - the small discrepancy in this plot is due to the statistical approximation from sampling.

## References¶

- 1(1,2,3)
Maria Schuld, Kamil Brádler, Robert Israel, Daiqin Su, and Brajesh Gupt. A quantum hardware-induced graph kernel based on gaussian boson sampling. arXiv preprint arXiv:1905.12646, 2019.

- 2
Asim Kumar Debnath, Rosa L Lopez de Compadre, Gargi Debnath, Alan J Shusterman, and Corwin Hansch. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity. Journal of medicinal chemistry, 34(2):786–797, 1991.

- 3
Nils Kriege and Petra Mutzel. Subgraph matching kernels for attributed graphs. In Proceedings of the 29th International Conference on Machine Learning. 2012.

- 4
Kamil Bradler, Shmuel Friedland, Josh Izaac, Nathan Killoran, and Daiqin Su. Graph isomorphism and gaussian boson sampling. arXiv preprint arXiv:1810.10644, 2018.

**Total running time of the script:** ( 5 minutes 19.409 seconds)

## Contents

## Downloads

## Related tutorials