Graph similarity

Technical details are available in the API documentation: sf.apps.similarity

This page looks at how to use GBS to construct a similarity measure between graphs, known as a graph kernel [1]. Kernels can be applied to graph-based data for machine learning tasks such as classification using a support vector machine.

Graph data

We begin by fixing a dataset of graphs to consider and loading GBS samples from these graphs, which will be needed in the following.

Let’s use the MUTAG dataset of graphs [2][3]. This is a dataset of 188 different graphs that each correspond to the structure of a chemical compound. Our goal is to use GBS samples from these graphs to measure their similarity.

The data module provides pre-calculated GBS samples for selected graphs in the MUTAG dataset. Each set of samples is generated by encoding the graph into a GBS device, and collecting photon click events. We’ll start by loading four sets of samples and visualizing the corresponding graphs.

from strawberryfields.apps import data, plot, similarity

m0 = data.Mutag0()
m1 = data.Mutag1()
m2 = data.Mutag2()
m3 = data.Mutag3()

These datasets contain both the adjacency matrix of the graph and the samples generated through GBS. We can access the adjacency matrix through:

m0_a = m0.adj
m1_a = m1.adj
m2_a = m2.adj
m3_a = m3.adj

Samples from these graphs can be accessed by indexing:

print(m0[0])

Out:

[0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

We can now plot the four graphs using the plot module. To use this module, we need to convert the adjacency matrices into NetworkX Graphs:

import networkx as nx
import plotly

Plotting m0_a:

plot.graph(nx.Graph(m0_a))


Plotting m1_a:

plot.graph(nx.Graph(m1_a))


Plotting m2_a:

plot.graph(nx.Graph(m2_a))


Plotting m3_a:

plot.graph(nx.Graph(m3_a))


The graphs of m1_a and m2_a look very similar. In fact, it turns out that they are isomorphic to each other, which means that the graphs can be made identical by permuting their node labels.

Creating a feature vector

Following [1], we can create a feature vector to describe each graph. These feature vectors contain information about the graphs and can be viewed as a mapping to a high-dimensional feature space, a technique often used in machine learning that allows us to employ properties of the feature space to separate and classify the vectors.

The feature vector of a graph can be composed in a variety of ways. One approach is to associate features with the relative frequencies of certain types of measurements being recorded from a GBS device configured to sample from the graph, as we now discuss.

We begin by defining the concept of an orbit, which is the set of all GBS samples that are equivalent under permutation of the modes. A sample can be converted to its corresponding orbit using the sample_to_orbit() function. For example, the first sample of m0 is [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0] and has orbit:

Out:

[1, 1]

Here, [1, 1] means that two photons were detected, each in a separate mode. Other samples can be randomly generated from the [1, 1] orbit using:

print(similarity.orbit_to_sample([1, 1], modes=m0.modes))

Out:

[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

Orbits provide a useful way to coarse-grain the samples from GBS into outcomes that are statistically more likely to be observed. However, we are interested in coarse-graining further into events, which correspond to a combination of orbits with the same photon number such that the number of photons counted in each mode does not exceed a fixed value max_count_per_mode. To understand this, let’s look at all of the orbits with a photon number of 5:

print(list(similarity.orbits(5)))

Out:

[[1, 1, 1, 1, 1], [2, 1, 1, 1], [3, 1, 1], [2, 2, 1], [4, 1], [3, 2], [5]]

All 5-photon samples belong to one of the orbits above. A 5-photon event with max_count_per_mode = 3 means that we include the orbits: [[1, 1, 1, 1, 1], [2, 1, 1, 1], [3, 1, 1], [2, 2, 1], [3, 2]] and ignore the orbits [[4, 1], [5]]. For example, the sample [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0] is a 5-photon event:

print(similarity.sample_to_event([0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0], 3))

Out:

5

Samples with more than max_count_per_mode in any mode are not counted as part of the event:

print(similarity.sample_to_event([0, 4, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], 3))

Out:

None

Now that we have mastered orbits and events, how can we make a feature vector? It was shown in [1] that one way of making a feature vector of a graph is through the frequencies of orbits or events. For example, for a \(k\) photon event \(E_{k, n_{\max}}\) with maximum count per mode \(n_{\max}\) and corresponding probability \(p_{k, n_{\max}}:=p_{E_{k, n_{\max}}}(G)\) with respect to a graph \(G\), a feature vector can be written as

\[f_{\mathbf{k}, n_{\max}} = (p_{k_{1}, n_{\max}}, p_{k_{2}, n_{\max}}, \ldots , p_{k_{K}, n_{\max}}),\]

where \(\mathbf{k} := (k_{1}, k_{2}, \ldots , k_{K})\) is a list of different total photon numbers.

For example, if \(\mathbf{k} := (2, 4, 6, 8)\) and \(n_{\max} = 2\), we have

\[f_{(2, 4, 6, 8), 2} = (p_{2, 2}, p_{4, 2}, p_{6, 2}, p_{8, 2}).\]

In this case, we are interested in the probabilities of events \(E_{2, 2}\), \(E_{4, 2}\), and \(E_{6, 2}\). Suppose we are sampling from a four-mode device and have the samples [0, 3, 0, 1] and [1, 2, 0, 1]. These samples are part of the orbits [3, 1] and [2, 1, 1], respectively. However, [3, 1] is not part of the \(E_{4, 2}\) event while [2, 1, 1] is.

Calculating a feature vector

We provide three methods for calculating a feature vector in the similarity module of Strawberry Fields:

  1. Through sampling.

  2. Using exact probability calculations.

  3. Using a Monte Carlo estimate of the probability.

The first method uses statistical inference. Hence, all one needs to do is generate some GBS samples from the graph of interest and fix the composition of the feature vector. For example, to obtain feature vector \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) for the first MUTAG graph, we use:

Out:

[0.19035, 0.2047]

We can also use any orbits of our choice instead of events:

print(similarity.feature_vector_orbits_sampling(m0, [[1, 1], [2], [1, 1, 1, 1], [2, 1, 1]]))

Out:

[0.19035, 0.0, 0.1352, 0.05175]

For the second method, we calculate the orbit probabilities exactly rather than through sampling. Considering a feature vector of orbit probabilities, the probability of a single orbit \(p(O)\) is given by:

\[p(O) = \sum_{S \in O} p(S)\]

where \(S\) represents a GBS output click pattern. Calculating each \(p(S)\) requires computing a hafnian, which gets exponentially difficult with increasing photon number. The exact probability of an event \(p_{k,n_{\max}}\) can be calculated in a similar manner.

Built-in functions feature_vector_orbits() and feature_vector_events() can be used to get exact feature vectors. These functions use a keyword argument samples to signal producing either exact or Monte Carlo estimated probabilities, as shown later. samples is set to None to get an exact feature vector by default. To use Monte Carlo estimation, samples can be set to the number of samples desired to be used in the estimation. For example, to get the exact event probabilities in the feature vector example \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) seen previously, we use:

print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2))

Out:

[0.22406643948198376, 0.2156557698525557]

Although they are precise, exact calculations for large matrices can be tough to evaluate. Additionally, what makes calculating \(p_{k, n_{\max}}\) really challenging is the number of GBS samples the corresponding event contains. For example, a 6-photon event over 17 modes \(E_{k=6, n_{\max}=2}\) contains the following number of samples:

Out:

58276

To avoid calculating a large number of sample probabilities, an alternative is to perform Monte Carlo estimation. Here, samples within an orbit or event are selected uniformly at random and their exact probabilities are calculated. For example, for an event \(E_{k, n_{\max}}\), if \(N\) samples \(\{S_{1}, S_{2}, \ldots , S_{N}\}\) are randomly selected from all possible samples in this event, then the event probability can be approximated as

\[p(E_{k, n_{\max}}) \approx \frac{1}{N}\sum_{i=1}^N p(S_i) |E_{k, n_{\max}}|,\]

with \(|E_{k, n_{\max}}|\) denoting the cardinality of the event.

This method can be accessed using the feature_vector_events() function with samples set to the number of samples desired to be used in the estimation. For example, to get MC-estimated probabilities for our example feature vector \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\), we use:

print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))

Out:

[0.23997515668520494, 0.19922220390655626]

Note

The results of using Monte Carlo estimation with feature_vector_orbits() and feature_vector_events() are probabilistic and may vary between runs. Increasing the samples parameter will increase the precision but slow down the calculation.

Machine learning with GBS graph kernels

The power of feature vectors that embed graphs in a vector space of real numbers is that we can now measure similarities between graphs. This is very useful in machine learning, where similar labels are assigned to graphs that are close to each other. GBS feature vectors therefore give rise to a similarity measure between graphs!

Let’s build this up a bit more. The MUTAG dataset we are considering contains not only graphs corresponding to the structure of chemical compounds, but also a label of each compound based upon its mutagenic effect. The four graphs we consider here have labels:

  • MUTAG0: Class 1

  • MUTAG1: Class 0

  • MUTAG2: Class 0

  • MUTAG3: Class 1

classes = [1, 0, 0, 1]

We can use GBS feature vectors in a support vector machine (SVM) that finds a separating hyperplane between classes in the feature space. We start by defining two-dimensional feature vectors:

events = [8, 10]
max_count = 2

f1 = similarity.feature_vector_events_sampling(m0, events, max_count)
f2 = similarity.feature_vector_events_sampling(m1, events, max_count)
f3 = similarity.feature_vector_events_sampling(m2, events, max_count)
f4 = similarity.feature_vector_events_sampling(m3, events, max_count)

import numpy as np

R = np.array([f1, f2, f3, f4])

print(R)

Out:

[[0.0884 0.042 ]
 [0.0704 0.0285]
 [0.0699 0.0294]
 [0.0962 0.0459]]

The choice of what events to use for the feature vectors can be significant and we encourage the reader to explore different combinations. We can also use any orbits of our choice above instead of events. Note, however, that GBS samples with odd total number of photons have zero probability when using ideal GBS, which only generates and outputs pairs of photons.

Given our points in the feature space and their target labels, we can use scikit-learn’s Support Vector Machine LinearSVC as our model to train:

from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler

R_scaled = StandardScaler().fit_transform(R)  # Transform data to zero mean and unit variance

classifier = LinearSVC()
classifier.fit(R_scaled, classes)

Out:

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
          verbose=0)

Here, the term “linear” refers to the kernel function used to calculate inner products between vectors in the space. We can use a linear SVM because we have already embedded the graphs in a feature space based on GBS. We have also rescaled the feature vectors so that they zero mean and unit variance using scikit-learn’s StandardScaler, a technique often used in machine learning.

We can then visualize the trained SVM by plotting the decision boundary with respect to the points:

w = classifier.coef_[0]
i = classifier.intercept_[0]

m = -w[0] / w[1]  # finding the values for y = mx + b
b = -i / w[1]

xx = [-1, 1]
yy = [m * x + b for x in xx]

fig = plot.points(R_scaled, classes)
fig.add_trace(plotly.graph_objects.Scatter(x=xx, y=yy, mode="lines"))

fig


This plot shows the two classes (grey points for class 0 and red points for class 1) successfully separated by the linear hyperplane using the GBS feature space. Moreover, recall that the two MUTAG1 and MUTAG2 graphs of class 0 are actually isomorphic. Reassuringly, their corresponding feature vectors are very similar. In fact, the feature vectors of isomorphic graphs should always be identical [4] - the small discrepancy in this plot is due to the statistical approximation from sampling.

References

1(1,2,3)

Maria Schuld, Kamil Brádler, Robert Israel, Daiqin Su, and Brajesh Gupt. A quantum hardware-induced graph kernel based on gaussian boson sampling. arXiv preprint arXiv:1905.12646, 2019.

2

Asim Kumar Debnath, Rosa L Lopez de Compadre, Gargi Debnath, Alan J Shusterman, and Corwin Hansch. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity. Journal of medicinal chemistry, 34(2):786–797, 1991.

3

Nils Kriege and Petra Mutzel. Subgraph matching kernels for attributed graphs. In Proceedings of the 29th International Conference on Machine Learning. 2012.

4

Kamil Bradler, Shmuel Friedland, Josh Izaac, Nathan Killoran, and Daiqin Su. Graph isomorphism and gaussian boson sampling. arXiv preprint arXiv:1810.10644, 2018.

Total running time of the script: ( 6 minutes 24.843 seconds)

Gallery generated by Sphinx-Gallery