Note

Click here to download the full example code

# Maximum Clique¶

*Technical details are available in the API documentation:* sf.apps.clique

Here we’ll explore how to combine GBS samples with local search algorithms to find large cliques in graphs. Let’s get started!

A clique is a special type of subgraph where all possible connections between nodes are present; they are densest possible subgraphs of their size. The maximum clique problem, or max clique for short, asks the question: given a graph \(G\), what is the largest clique in the graph? Max clique is NP-Hard, so finding the biggest clique becomes challenging for graphs with many nodes. This is why we need clever algorithms to identify large cliques!

To get started, we’ll analyze the 24-node TACE-AS graph used in [1]. This
is the *binding interaction graph* representing the spatial compatibility of atom pairs in a
protein-molecule complex. Cliques in this graph correspond to stable docking configurations, which
are of interest in determining how the molecule interacts with the protein.

The first step is to import the Strawberry Fields `apps`

module and external dependencies:

```
from strawberryfields.apps import data, plot, sample, clique
import numpy as np
import networkx as nx
import plotly
```

The adjacency matrix of the TACE-AS graph can be loaded from the `data`

module and the
graph can be visualized using the `plot`

module:

```
TA = data.TaceAs()
A = TA.adj
TA_graph = nx.Graph(A)
plot.graph(TA_graph)
```

Can you spot any cliques in the graph? It’s not so easy using only your eyes! The TACE-AS graph
is sufficiently small that all cliques can be found by performing an exhaustive search over
all subgraphs. For example, below we highlight a small *maximal* clique, i.e., a clique
not contained inside another clique:

```
maximal_clique = [4, 11, 12, 18]
plot.graph(TA_graph, maximal_clique)
```

We’ll now use the `clique`

module to find larger cliques in the graph. We can make
use of the pre-generated samples from the TACE-AS graph in the `data`

module and
post-select samples with a specific number of clicks. Here we’ll look at samples with eight
clicks, of which there are a total of 1,984:

```
postselected = sample.postselect(TA, 8, 8)
samples = sample.to_subgraphs(postselected, TA_graph)
print(len(samples))
```

Out:

```
1984
```

GBS produces samples that correspond to subgraphs of high density. For fun, let’s confirm this by comparing the average subgraph density in the GBS samples to uniformly generated samples:

```
GBS_dens = []
u_dens = []
for s in samples:
uniform = list(np.random.choice(24, 8, replace=False)) # generates uniform sample
GBS_dens.append(nx.density(TA_graph.subgraph(s)))
u_dens.append(nx.density(TA_graph.subgraph(uniform)))
print("GBS mean density = {:.4f}".format(np.mean(GBS_dens)))
print("Uniform mean density = {:.4f}".format(np.mean(u_dens)))
```

Out:

```
GBS mean density = 0.7005
Uniform mean density = 0.5870
```

Those look like great GBS samples 💪! To obtain cliques, we shrink the samples by greedily removing nodes with low degree until a clique is found.

```
shrunk = [clique.shrink(s, TA_graph) for s in samples]
print(clique.is_clique(TA_graph.subgraph(shrunk[0])))
```

Out:

```
True
```

Let’s take a look at some of these cliques. What are the clique sizes in the first ten samples? What is the average clique size? How about the largest and smallest clique size?

```
clique_sizes = [len(s) for s in shrunk]
print("First ten clique sizes = ", clique_sizes[:10])
print("Average clique size = {:.3f}".format(np.mean(clique_sizes)))
print("Maximum clique size = ", np.max(clique_sizes))
print("Minimum clique size = ", np.min(clique_sizes))
```

Out:

```
First ten clique sizes = [4, 5, 6, 7, 4, 4, 4, 6, 5, 5]
Average clique size = 5.010
Maximum clique size = 8
Minimum clique size = 3
```

Even in the first few samples, we’ve already identified larger cliques than the 4-node clique we studied before. Awesome! Indeed, this simple shrinking strategy gives cliques with average size of roughly five. We can enlarge these cliques by searching for larger cliques in their vicinity. We’ll do this by taking ten iterations of local search and studying the results. Note: this may take a few seconds.

```
searched = [clique.search(s, TA_graph, 10) for s in shrunk]
clique_sizes = [len(s) for s in searched]
print("First two cliques = ", searched[:2])
print("Average clique size = {:.3f}".format(np.mean(clique_sizes)))
```

Out:

```
First two cliques = [[5, 11, 14, 15, 16, 20, 21, 22], [0, 1, 2, 6, 7, 8, 17, 23]]
Average clique size = 7.999
```

Wow! Local search is very helpful, we’ve found cliques with the maximum size of eight for essentially all samples 🤩. Let’s take a look at the first clique we found

```
plot.graph(TA_graph, searched[0])
```

The TACE-AS graph is relatively small, so finding large cliques is not particularly difficult. A
tougher challenge is the 300-node `p_hat300-1`

random graph from the DIMACS maximum clique
dataset. In this section, we’ll write a short program that uses GBS samples in combination with
local search to identify large cliques in this graph.

```
Phat = data.PHat() # Load data
phat_graph = nx.Graph(Phat.adj) # Obtain graph
postselected = sample.postselect(Phat, 16, 20) # Post-select samples
samples = sample.to_subgraphs(postselected, phat_graph) # Convert samples into subgraphs
shrunk = [clique.shrink(s, phat_graph) for s in samples] # Shrink subgraphs to cliques
searched = [clique.search(s, phat_graph, 10) for s in shrunk] # Perform local search
clique_sizes = [len(s) for s in searched]
largest_clique = searched[np.argmax(clique_sizes)] # Identify largest clique found
print("Largest clique found is = ", largest_clique)
```

Out:

```
Largest clique found is = [114, 121, 132, 173, 189, 199, 249, 298]
```

Let’s make a plot to take a closer look at the largest clique we found

```
plot.graph(phat_graph, largest_clique)
```