Note
Click here to download the full example code
Graph similarity¶
Technical details are available in the API documentation: sf.apps.similarity
This page looks at how to use GBS to construct a similarity measure between graphs, known as a graph kernel [1]. Kernels can be applied to graph-based data for machine learning tasks such as classification using a support vector machine.
Graph data¶
We begin by fixing a dataset of graphs to consider and loading GBS samples from these graphs, which will be needed in the following.
Let’s use the MUTAG dataset of graphs [2][3]. This is a dataset of 188 different graphs that each correspond to the structure of a chemical compound. Our goal is to use GBS samples from these graphs to measure their similarity.
The data
module provides pre-calculated GBS samples for selected
graphs in the MUTAG dataset. Each set of samples is generated by encoding the graph into a GBS
device, and collecting photon click events. We’ll start by loading four sets of samples and
visualizing the corresponding graphs.
from strawberryfields.apps import data, plot, similarity
m0 = data.Mutag0()
m1 = data.Mutag1()
m2 = data.Mutag2()
m3 = data.Mutag3()
These datasets contain both the adjacency matrix of the graph and the samples generated through GBS. We can access the adjacency matrix through:
Samples from these graphs can be accessed by indexing:
print(m0[0])
Out:
[0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
We can now plot the four graphs using the plot
module. To use
this module, we need to convert the adjacency matrices into NetworkX Graphs:
import networkx as nx
import plotly
Plotting m0_a
:
plot.graph(nx.Graph(m0_a))
Plotting m1_a
:
plot.graph(nx.Graph(m1_a))
Plotting m2_a
:
plot.graph(nx.Graph(m2_a))
Plotting m3_a
:
plot.graph(nx.Graph(m3_a))
The graphs of m1_a
and m2_a
look very similar. In fact,
it turns out that they are isomorphic to each other, which means that the graphs can be made
identical by permuting their node labels.
Creating a feature vector¶
Following [1], we can create a feature vector to describe each graph. These feature vectors contain information about the graphs and can be viewed as a mapping to a high-dimensional feature space, a technique often used in machine learning that allows us to employ properties of the feature space to separate and classify the vectors.
The feature vector of a graph can be composed in a variety of ways. One approach is to associate features with the relative frequencies of certain types of measurements being recorded from a GBS device configured to sample from the graph, as we now discuss.
We begin by defining the concept of an orbit, which is the set of all GBS samples that are
equivalent under permutation of the modes. A sample can be converted to its corresponding orbit
using the sample_to_orbit()
function. For example, the
first sample of m0
is [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
and has orbit:
print(similarity.sample_to_orbit(m0[0]))
Out:
[1, 1]
Here, [1, 1]
means that two photons were detected, each in a separate mode. Other samples
can be randomly generated from the [1, 1]
orbit using:
print(similarity.orbit_to_sample([1, 1], modes=m0.modes))
Out:
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Orbits provide a useful way to coarse-grain the samples from GBS into outcomes that are
statistically more likely to be observed. However, we are interested in coarse-graining further
into events, which correspond to a combination of orbits with the same photon number such
that the number of photons counted in each mode does not exceed a fixed value
max_count_per_mode
. To understand this, let’s look at all of the orbits with a photon
number of 5:
print(list(similarity.orbits(5)))
Out:
[[1, 1, 1, 1, 1], [2, 1, 1, 1], [3, 1, 1], [2, 2, 1], [4, 1], [3, 2], [5]]
All 5-photon samples belong to one of the orbits above. A 5-photon event with
max_count_per_mode = 3
means that we include the orbits: [[1, 1, 1, 1, 1], [2, 1, 1, 1],
[3, 1, 1], [2, 2, 1], [3, 2]]
and ignore the orbits [[4, 1], [5]]
. For example,
the sample [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0]
is a 5-photon event:
print(similarity.sample_to_event([0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0], 3))
Out:
5
Samples with more than max_count_per_mode
in any mode are not counted as part of the event:
print(similarity.sample_to_event([0, 4, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], 3))
Out:
None
Now that we have mastered orbits and events, how can we make a feature vector? It was shown in [1] that one way of making a feature vector of a graph is through the frequencies of orbits or events. For example, for a \(k\) photon event \(E_{k, n_{\max}}\) with maximum count per mode \(n_{\max}\) and corresponding probability \(p_{k, n_{\max}}:=p_{E_{k, n_{\max}}}(G)\) with respect to a graph \(G\), a feature vector can be written as
where \(\mathbf{k} := (k_{1}, k_{2}, \ldots , k_{K})\) is a list of different total photon numbers.
For example, if \(\mathbf{k} := (2, 4, 6, 8)\) and \(n_{\max} = 2\), we have
In this case, we are interested in the probabilities of events \(E_{2, 2}\), \(E_{4,
2}\), and \(E_{6, 2}\). Suppose we are sampling from a four-mode device and have the samples
[0, 3, 0, 1]
and [1, 2, 0, 1]
. These samples are part of the orbits [3, 1]
and
[2, 1, 1]
, respectively. However, [3, 1]
is not part of the \(E_{4, 2}\) event while
[2, 1, 1]
is.
Calculating a feature vector¶
We provide three methods for calculating a feature vector in the
similarity
module of Strawberry Fields:
Through sampling.
Using exact probability calculations.
Using a Monte Carlo estimate of the probability.
The first method uses statistical inference. Hence, all one needs to do is generate some GBS samples from the graph of interest and fix the composition of the feature vector. For example, to obtain feature vector \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) for the first MUTAG graph, we use:
print(similarity.feature_vector_events_sampling(m0, [2, 4], 2))
Out:
[0.19035, 0.2047]
We can also use any orbits of our choice instead of events:
print(similarity.feature_vector_orbits_sampling(m0, [[1, 1], [2], [1, 1, 1, 1], [2, 1, 1]]))
Out:
[0.19035, 0.0, 0.1352, 0.05175]
For the second method, we calculate the orbit probabilities exactly rather than through sampling. Considering a feature vector of orbit probabilities, the probability of a single orbit \(p(O)\) is given by:
where \(S\) represents a GBS output click pattern. Calculating each \(p(S)\) requires computing a hafnian, which gets exponentially difficult with increasing photon number. The exact probability of an event \(p_{k,n_{\max}}\) can be calculated in a similar manner.
Built-in functions feature_vector_orbits()
and
feature_vector_events()
can be used to get exact feature vectors. These functions
use a keyword argument samples
to signal producing either exact or Monte Carlo estimated
probabilities, as shown later. samples
is set to None
to get an exact feature vector by
default. To use Monte Carlo estimation, samples
can be set to the number of samples desired
to be used in the estimation. For example, to get the exact event probabilities in the feature
vector example \(f_{\mathbf{k} = (2, 4), n_{\max}=2}\) seen previously, we use:
print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2))
Out:
[0.2240664394819835, 0.21565576985255536]
Although they are precise, exact calculations for large matrices can be tough to evaluate. Additionally, what makes calculating \(p_{k, n_{\max}}\) really challenging is the number of GBS samples the corresponding event contains. For example, a 6-photon event over 17 modes \(E_{k=6, n_{\max}=2}\) contains the following number of samples:
print(similarity.event_cardinality(6, 2, 17))
Out:
58276
To avoid calculating a large number of sample probabilities, an alternative is to perform Monte Carlo estimation. Here, samples within an orbit or event are selected uniformly at random and their exact probabilities are calculated. For example, for an event \(E_{k, n_{\max}}\), if \(N\) samples \(\{S_{1}, S_{2}, \ldots , S_{N}\}\) are randomly selected from all possible samples in this event, then the event probability can be approximated as
with \(|E_{k, n_{\max}}|\) denoting the cardinality of the event.
This method can be accessed using the
feature_vector_events()
function
with samples
set to the number of samples desired to be used in the estimation.
For example, to get MC-estimated probabilities for our example feature vector
\(f_{\mathbf{k} = (2, 4), n_{\max}=2}\), we use:
print(similarity.feature_vector_events(nx.Graph(m0_a), [2, 4], 2, samples=1000))
Out:
[0.23997515668520478, 0.1992222039065559]
Note
The results of using Monte Carlo estimation with
feature_vector_orbits()
and
feature_vector_events()
are probabilistic and may
vary between runs. Increasing the samples
parameter will increase the precision but
slow down the calculation.
Machine learning with GBS graph kernels¶
The power of feature vectors that embed graphs in a vector space of real numbers is that we can now measure similarities between graphs. This is very useful in machine learning, where similar labels are assigned to graphs that are close to each other. GBS feature vectors therefore give rise to a similarity measure between graphs!
Let’s build this up a bit more. The MUTAG dataset we are considering contains not only graphs corresponding to the structure of chemical compounds, but also a label of each compound based upon its mutagenic effect. The four graphs we consider here have labels:
MUTAG0: Class 1
MUTAG1: Class 0
MUTAG2: Class 0
MUTAG3: Class 1
classes = [1, 0, 0, 1]
We can use GBS feature vectors in a support vector machine (SVM) that finds a separating hyperplane between classes in the feature space. We start by defining two-dimensional feature vectors:
events = [8, 10]
max_count = 2
f1 = similarity.feature_vector_events_sampling(m0, events, max_count)
f2 = similarity.feature_vector_events_sampling(m1, events, max_count)
f3 = similarity.feature_vector_events_sampling(m2, events, max_count)
f4 = similarity.feature_vector_events_sampling(m3, events, max_count)
import numpy as np
R = np.array([f1, f2, f3, f4])
print(R)
Out:
[[0.0884 0.042 ]
[0.0704 0.0285]
[0.0699 0.0294]
[0.0962 0.0459]]
The choice of what events
to use for the feature vectors can be significant and we
encourage the reader to explore different combinations. We can also use any orbits of our
choice above instead of events. Note, however, that GBS samples with odd total number of
photons have zero probability when using ideal GBS, which only generates and outputs pairs of
photons.
Given our points in the feature space and their target labels, we can use scikit-learn’s Support Vector Machine LinearSVC as our model to train:
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
R_scaled = StandardScaler().fit_transform(R) # Transform data to zero mean and unit variance
classifier = LinearSVC()
classifier.fit(R_scaled, classes)
Out:
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='squared_hinge', max_iter=1000,
multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
verbose=0)
Here, the term “linear” refers to the kernel function used to calculate inner products
between vectors in the space. We can use a linear SVM because we have already embedded the
graphs in a feature space based on GBS. We have also rescaled the feature vectors so that they
zero mean and unit variance using scikit-learn’s StandardScaler
, a technique
often used in machine learning.
We can then visualize the trained SVM by plotting the decision boundary with respect to the points:
w = classifier.coef_[0]
i = classifier.intercept_[0]
m = -w[0] / w[1] # finding the values for y = mx + b
b = -i / w[1]
xx = [-1, 1]
yy = [m * x + b for x in xx]
fig = plot.points(R_scaled, classes)
fig.add_trace(plotly.graph_objects.Scatter(x=xx, y=yy, mode="lines"))
fig
This plot shows the two classes (grey points for class 0 and red points for class 1) successfully separated by the linear hyperplane using the GBS feature space. Moreover, recall that the two MUTAG1 and MUTAG2 graphs of class 0 are actually isomorphic. Reassuringly, their corresponding feature vectors are very similar. In fact, the feature vectors of isomorphic graphs should always be identical [4] - the small discrepancy in this plot is due to the statistical approximation from sampling.
References¶
- 1(1,2,3)
Maria Schuld, Kamil Brádler, Robert Israel, Daiqin Su, and Brajesh Gupt. A quantum hardware-induced graph kernel based on gaussian boson sampling. arXiv preprint arXiv:1905.12646, 2019.
- 2
Asim Kumar Debnath, Rosa L Lopez de Compadre, Gargi Debnath, Alan J Shusterman, and Corwin Hansch. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity. Journal of medicinal chemistry, 34(2):786–797, 1991.
- 3
Nils Kriege and Petra Mutzel. Subgraph matching kernels for attributed graphs. In Proceedings of the 29th International Conference on Machine Learning. 2012.
- 4
Kamil Bradler, Shmuel Friedland, Josh Izaac, Nathan Killoran, and Daiqin Su. Graph isomorphism and gaussian boson sampling. arXiv preprint arXiv:1810.10644, 2018.
Total running time of the script: ( 0 minutes 14.776 seconds)
Contents
Downloads
Related tutorials