GSoC 2024 | QGANs for Monte Carlo Simulations
What is Google Summer of Code?
I spent the last summer working in GSoC 2024 with the ML4SCI organization. Google Summer of Code is an international online program designed to introduce new contributors to open-source software development. During the program, participants collaborate with an open-source organization on a programming project lasting 12 weeks or more, under the supervision of mentors. Participating in GSoC was one of the most enriching and challenging experiences of my life. My project, QGANs for Monte Carlo Simulations, aims to investigate the feasibility of using Quantum Generative Adversarial Networks to generate events for Monte Carlo simulations. The code for this project is available on GitHub.
Monte Carlo Simulations
Monte Carlo Simulation is a computational technique that uses repeated random sampling to estimate the probability of various outcomes in uncertain scenarios. Developed by John von Neumann and Stanislaw Ulam during World War II, it is named after the Monte Carlo casino due to its reliance on chance. The method is widely used in finance, project management, and AI for risk assessment and decision-making. It involves setting up a predictive model, specifying probability distributions for the input variables, and running a large number of simulations to generate the range of possible outcomes.
Monte Carlo Simulation differs from typical forecasting models by predicting a range of outcomes based on estimated values rather than fixed inputs. It builds a model using probability distributions for variables with inherent uncertainty. By recalculating results repeatedly with different random values, often thousands of times, it generates a wide array of possible outcomes.
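To make this concrete, here is a minimal sketch (using NumPy) that estimates, by repeated random sampling, the probability that two six-sided dice sum to 7, one of the scenarios revisited later in this post:

import numpy as np

rng = np.random.default_rng(seed=0)

# Model: the sum of two fair six-sided dice.
n_trials = 100_000
rolls = rng.integers(1, 7, size=(n_trials, 2)).sum(axis=1)

# Repeated random sampling turns counts into a probability estimate.
p_seven = np.mean(rolls == 7)
print(f"Estimated P(sum = 7): {p_seven:.4f}")  # analytic value: 6/36 ≈ 0.1667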
Generative Adversarial Networks
Generative Adversarial Networks (GANs) are a framework for estimating generative models through an adversarial process [1]. This framework involves training two models simultaneously: a generative model \(G\) that captures the data distribution and a discriminative model \(D\) that distinguishes between samples from the training data distribution and those produced by \(G\). The goal is to improve \(G\) until \(D\) cannot differentiate between training data and generated data. This process is similar to a minimax two-player game, where \(G\) tries to fool \(D\) while \(D\) aims to detect the fake data. We train \(D\) to maximize the probability of assigning the correct label to both the training data samples and the generated samples from \(G\). Simultaneously, we train \(G\) to minimize the chances of \(D\) correctly distinguishing between training and generated samples. This process involves optimizing the loss function: \[ \min_G \max_D V(D, G) = \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\left[\log D(\mathbf{x})\right] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\left[\log\left(1 - D(G(\mathbf{z}))\right)\right] \] where \(D(\mathbf{x})\) represents the probability that \(\mathbf{x}\) came from the true data rather than from the generator, and \(D(G(\mathbf{z}))\) the probability that \(D\) assigns to a sample generated from the latent vector \(\mathbf{z}\) being real.
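In practice, both expectations are estimated on mini-batches, and the two models are updated in alternation. Below is a minimal PyTorch sketch of one such update; `G`, `D`, their optimizers, and `latent_dim` are hypothetical names assumed to be defined elsewhere:

import torch
import torch.nn as nn

bce = nn.BCELoss()

def gan_step(G, D, opt_G, opt_D, real_batch, latent_dim):
    """One alternating update of the minimax game (hypothetical helper names)."""
    batch_size = real_batch.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z))).
    z = torch.randn(batch_size, latent_dim)
    loss_D = bce(D(real_batch), real_labels) + bce(D(G(z).detach()), fake_labels)
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Generator step: fool D (non-saturating form of the generator loss).
    z = torch.randn(batch_size, latent_dim)
    loss_G = bce(D(G(z)), real_labels)
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()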
Quantum Generative Adversarial Networks
Quantum Generative Adversarial Networks (QGANs) are a quantum extension of classical GANs, incorporating quantum mechanics to leverage the computational advantages of quantum systems [2] [3]. In QGANs, the generator is a parametrized quantum circuit that produces quantum states resembling the training data distribution, while the discriminator can be either a classical model or a parametrized quantum circuit that differentiates between the training data distribution and the generated distribution.
Simulating random variables with QGANs
One of the key steps of the Monte Carlo method is to specify the probability distribution of the independent variables, and an incorrect choice of distribution can lead to inaccuracies. In my project, I address this difficulty by implementing QGANs for Monte Carlo simulations. I used the architecture proposed by Zoufal et al. [4]: a quantum generator that prepares an equal superposition as its reference state, followed by layers of parametrized \(\text{Pauli-Y}\) rotations, each with an entangling block of \(\text{controlled-Z}\) gates, together with a classical discriminator consisting of a dense neural network.
The project implementation uses PennyLane and PyTorch. The generator circuit returns the probabilities of the basis states, each state representing a possible outcome. The circuit takes a weights parameter: as the name indicates, an array holding the rotation angle of each parametrized \(\text{Pauli-Y}\) gate.
import pennylane as qml

n_qubits = 3  # example value; set according to the number of possible outcomes
q_depth = 4   # example value; number of parametrized layers

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, diff_method="backprop")
def quantum_circuit(weights):
    """Quantum generator's parametrized circuit."""
    weights = weights.reshape(q_depth, n_qubits)

    # Prepare the equal-superposition reference state
    for i in range(n_qubits):
        qml.Hadamard(wires=i)

    # Repeat each layer
    for i in range(q_depth):
        # Parametrized layer of Pauli-Y rotations
        for y in range(n_qubits):
            qml.RY(weights[i][y], wires=y)
        # Entangling block of controlled-Z gates
        for y in range(n_qubits - 1):
            qml.CZ(wires=[y, y + 1])
        qml.Barrier(wires=list(range(n_qubits)), only_visual=True)

    # Probability of measuring each basis state, one per possible outcome
    return qml.probs(wires=list(range(n_qubits)))
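Once the circuit is trained, Monte Carlo events can be drawn by sampling basis-state indices according to the probabilities it returns. A small usage sketch, assuming `weights` holds trained parameters as a NumPy array:

import numpy as np

probs = np.asarray(quantum_circuit(weights))  # trained parameters assumed
rng = np.random.default_rng()
# Each basis-state index corresponds to one possible simulation outcome.
events = rng.choice(len(probs), size=10, p=probs)
print(events)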
The classical discriminator is a fully connected neural network with two hidden layers, containing 50 nodes in the first and 20 nodes in the second, both with a LeakyReLU activation function, and, finally, an output layer of a single node with a sigmoid activation function that returns the probability that an input is a generated sample. The input layer shape depends on the number of possible outcomes the random variable of the training data can take. The discriminator's input is the pseudo-probabilities of a batch of samples from the random variable.
import torch.nn as nn

# Assumed: one input feature per basis state of the generator
num_input_features = 2 ** n_qubits

class Discriminator(nn.Module):
    """Fully connected classical discriminator."""

    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            # Input layer to first hidden layer (num_input_features -> 50)
            nn.Linear(num_input_features, 50),
            nn.LeakyReLU(),
            # First hidden layer to second hidden layer (50 -> 20)
            nn.Linear(50, 20),
            nn.LeakyReLU(),
            # Second hidden layer to output (20 -> 1)
            nn.Linear(20, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Flatten each sample to a vector of pseudo-probabilities
        x = x.reshape(x.size(0), -1)
        return self.model(x)
I chose four simple scenarios to simulate: the rolling of two six-sided dice, coin-toss sequences, particle time decay, and two-dimensional random walks. In this implementation, the quantum circuit returns the probability of measuring each basis state, and each basis state represents a possible outcome of the simulations. Once trained, the generator produces samples simulating single events. The training process was the following:
- Data Generation: The training dataset was generated classically, consisting of \(10^6\) samples for each scenario, split into 1000 batches.
- Calculate Probabilities: The pseudo-probabilities of each batch were calculated and used to train the quantum generator, which returns the probabilities of each basis state.
- Generator and Discriminator: Two models are defined and trained adversarially, as sketched below this list. The discriminator is a classical neural network that tries to distinguish between real and fake data samples, while the generator is a quantum circuit designed to produce data samples that resemble the real data distribution.
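A minimal sketch of this adversarial loop, assuming the quantum_circuit and Discriminator defined above and a tensor `real_probs` holding the pseudo-probabilities of each batch; the hyperparameter values here are illustrative, not the project's exact settings:

import torch

# Illustrative hyperparameters; the actual values live in the project repository.
lr, n_epochs = 0.01, 50
bce = torch.nn.BCELoss()

weights = torch.rand(q_depth * n_qubits, requires_grad=True)
discriminator = Discriminator()
opt_G = torch.optim.SGD([weights], lr=lr)
opt_D = torch.optim.SGD(discriminator.parameters(), lr=lr)

for epoch in range(n_epochs):
    for batch in real_probs:  # one row of pseudo-probabilities per batch
        real = batch.unsqueeze(0).float()
        fake = quantum_circuit(weights).unsqueeze(0).float()

        # Discriminator step: label real pseudo-probabilities 1, generated 0.
        loss_D = bce(discriminator(real), torch.ones(1, 1)) \
               + bce(discriminator(fake.detach()), torch.zeros(1, 1))
        opt_D.zero_grad()
        loss_D.backward()
        opt_D.step()

        # Generator step: push the generated distribution towards "real".
        fake = quantum_circuit(weights).unsqueeze(0).float()
        loss_G = bce(discriminator(fake), torch.ones(1, 1))
        opt_G.zero_grad()
        loss_G.backward()
        opt_G.step()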
Using a QGAN to generate gluon-initiated jets
Moving to a more complex data distribution, I took detector images of gluon-initiated jets from the quark and gluon jets dataset [5]. Specifically, I used the electromagnetic calorimeter (ECAL) images, which measure energy deposits from electromagnetic particles.
Due to the high complexity of the dataset, the first architecture was not powerful enough to reproduce the training data distribution. For this reason, I implemented the architecture proposed by Huang et al. [6], which uses an auxiliary register to provide the parametrized quantum circuit with more possible solution states, along with a post-processing step that frees the quantum generator's output from the normalization constraint of the measurement. This architecture also uses multiple generators, each responsible for producing a fraction of the image. This approach allows each generator to focus on a simpler distribution rather than the more complex complete image.
Quantum gates are unitary operators and, by definition, linear transformations. For the simplest generative tasks, like the examples from the section above, these transformations are good enough. However, for more complex data distributions, non-linear transformations may be needed. For the pre-measurement state of the generator, we have: \[ |\Psi(z)\rangle = U_G(\theta)\,|z\rangle \] where \(U_G(\theta)\) represents the overall unitary operator of the parametrized layers. If we take a partial measurement \(\Pi\) on the ancillary subsystem and trace it out: \[ \rho(z) = \frac{\operatorname{Tr}_A\!\left[(\Pi \otimes I)\,|\Psi(z)\rangle\langle\Psi(z)|\right]}{\operatorname{Tr}\!\left[(\Pi \otimes I)\,|\Psi(z)\rangle\langle\Psi(z)|\right]} \] The post-measurement state \(\rho(z)\) depends on \(z\) in both the numerator and the denominator, which implies that a non-linear transformation was performed on \(|\Psi(z)\rangle\).
On the other hand, I wanted the output of the quantum generator to represent energy deposits, so the output should be able to take values larger than 1, and its elements should not be forced to sum to 1. These limitations can be overcome by applying the following transformation to the quantum circuit's output: \[ \tilde{x} = \frac{g}{y} \] where \(g\) is the output of the generator and \(y \in (0, 1]\). In this work, I used \(y = 0.3\).
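Putting the two pieces together, the sketch below shows how the raw probabilities over feature and auxiliary qubits could be post-selected and rescaled. It assumes the auxiliary qubits sit on the leading wires; the wire layout is an assumption of this sketch:

import torch

n_feature, n_ancilla = 7, 2  # feature and auxiliary qubits, as in this project
y = 0.3                      # post-processing constant used in this work

def post_process(probs):
    """Post-select the auxiliary register on |0...0> and rescale by 1/y."""
    # With the ancillas on the leading wires, the outcomes where they read
    # |0...0> are the first 2**n_feature entries of the distribution.
    post_measurement = probs[: 2 ** n_feature]
    # Renormalizing implements the non-linear map described above.
    post_measurement = post_measurement / post_measurement.sum()
    # Dividing by y lifts the normalization constraint, so energy-deposit
    # values can exceed 1.
    return post_measurement / y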
The training process was the following:
- Data Preprocessing: From the ECAL channel, 512 images were down-scaled from \(125 \times 125\) to \(8 \times 8\) pixels using a sum-pooling transformation.
- Generator and Discriminator: The model was trained for 30 epochs using a stochastic gradient descent optimizer. The classical discriminator has 50 nodes in its first hidden layer and 20 in its second, both with a LeakyReLU activation function, and an output layer with a single node and a sigmoid activation function. Each of the two generators is a circuit with seven feature qubits, two auxiliary qubits, and ten layers; a sketch of how their outputs can be assembled appears after this list.
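To illustrate the patch strategy, here is a sketch of how the two sub-generators' post-processed outputs could be assembled into one \(8 \times 8\) image. The patch layout, the cropping, and the quantum_circuit_patch QNode are hypothetical illustrations, not necessarily the project's exact mapping:

import torch

n_generators, image_size = 2, 8
patch_pixels = image_size * image_size // n_generators  # 32 pixels per patch

def generate_image(generator_weights):
    """Assemble one image from the patch generators (hypothetical layout)."""
    patches = []
    for w in generator_weights:  # one weight set per sub-generator
        # quantum_circuit_patch: assumed QNode with 7 feature + 2 auxiliary qubits
        g = post_process(quantum_circuit_patch(w))
        patches.append(g[:patch_pixels])  # keep one patch worth of pixels
    # Stack the patches: each generator covers half of the image rows.
    return torch.cat(patches).reshape(image_size, image_size)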
Future work and conclusion
From the simple simulations and the gluon-initiated jet image generation, we observe that implementing QGANs for data generation with both simple and complex distributions is feasible. It would be interesting to investigate how well this implementation generates different yet correlated data distributions, such as the three subdetector channels of the detector images [5]. The most exciting part of this project was exploring a promising method for future high-energy physics simulations. I am grateful for the opportunity to be part of the ML4SCI organization; its supportive community and cutting-edge projects are among the best I have encountered, inspiring people to pursue a career in this field. Thank you all for your work.
References
[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative Adversarial Networks. arXiv:1406.2661.
[2] Pierre-Luc Dallaire-Demers and Nathan Killoran. Quantum Generative Adversarial Networks. arXiv:1804.08641.
[3] Seth Lloyd and Christian Weedbrook. Quantum Generative Adversarial Learning. arXiv:1804.09139.
[4] Christa Zoufal, Aurélien Lucchi, and Stefan Woerner. Quantum Generative Adversarial Networks for learning and loading random distributions. npj Quantum Information, 5(1), 2019. https://doi.org/10.1038/s41534-019-0223-2.
[5] M. Andrews, J. Alison, S. An, P. Bryant, B. Burkle, S. Gleyzer, M. Narain, M. Paulini, B. Poczos, and E. Usai. End-to-End Jet Classification of Quarks and Gluons with the CMS Open Data. Nucl. Instrum. Methods Phys. Res. A 977, 164304 (2020). https://doi.org/10.1016/j.nima.2020.164304.
[6] He-Liang Huang, Yuxuan Du, Ming Gong, Youwei Zhao, Yulin Wu, Chaoyue Wang, Shaowei Li, Futian Liang, et al. Experimental Quantum Generative Adversarial Networks for Image Generation. arXiv:2010.06201.