Performance regression in cudaq.observe and spin_operator creation #2437

Open
3 of 4 tasks
bebora opened this issue Nov 28, 2024 · 1 comment
Labels
performance python-lang Anything related to the Python CUDA Quantum language implementation

Comments

@bebora
Contributor

bebora commented Nov 28, 2024

Required prerequisites

  • Consult the security policy. If reporting a security vulnerability, do not report the bug using this form. Use the process described in the policy to report the issue.
  • Make sure you've read the documentation. Your issue may be addressed there.
  • Search the issue tracker to verify that this hasn't already been reported. +1 or comment there if it has.
  • If possible, make a PR with a failing test to give us a starting point to work on!

Describe the bug

Both creating a SpinOperator and calling cudaq.observe to compute the expectation value of a kernel for a given operator became much slower between v0.8.0 and v0.9.0.

I'm experimenting on a GH200 and see the following slowdowns with a 20-qubit kernel and a Hamiltonian built from 3000 random SpinOperator terms:
SpinOperator creation: 22x
cudaq.observe duration: 46x

Steps to reproduce the bug

Create the file obs.py with the following content:

import random
import time

import numpy as np

import cudaq


def random_pauli_word(qubit_count: int) -> str:
    ops = "IXYZ"
    return "".join(random.choices(ops, k=qubit_count))


def random_hamiltonian(qubit_count: int, terms: int) -> dict[str, float]:
    res = {}
    for _ in range(terms):
        res[random_pauli_word(qubit_count)] = random.uniform(-0.1, 0.1)
    return res


def create_spin_operator(
    h: dict[str, float]
):  # -> cudaq.mlir._mlir_libs._quakeDialects.cudaq_runtime.SpinOperator
    res = 0
    for pauli_word, coeff in h.items():
        term = cudaq.SpinOperator.from_word(pauli_word)
        res += term * coeff

    return res


@cudaq.kernel
def ansatz(
    qubit_count: int,
):
    qc = cudaq.qvector(qubit_count)
    param = 0

    for i in range(qubit_count):
        ry(param, qc[i])
        param += 1


def observe_ansatz(qubit_count: int) -> tuple[float, float]:
    random.seed(0xE4)

    random_h = random_hamiltonian(qubit_count, 3000)
    operator_start = time.time()
    random_spin_operator = create_spin_operator(random_h)
    operator_time = time.time() - operator_start

    observe_start = time.time()
    value = cudaq.observe(ansatz, random_spin_operator, qubit_count).expectation()
    observe_time = time.time() - observe_start

    return operator_time, observe_time


if __name__ == "__main__":
    op_time_acc = []
    obs_time_acc = []
    for _ in range(10):
        op_time, obs_time = observe_ansatz(20)
        op_time_acc.append(op_time)
        obs_time_acc.append(obs_time)
    print(f"SpinOperator creation: {np.mean(op_time_acc)}")
    print(f"cudaq.observe duration: {np.mean(obs_time_acc)}")
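
The script above times each step with time.time() and averages all 10 runs, including the first. A possible refinement, sketched below and not part of the original report, is a small helper that uses a monotonic clock and discards warm-up calls, so that any one-time cost on the first call does not skew the mean. The helper name `time_call` and its parameters are hypothetical, and whether the first call actually carries extra one-time cost here is an assumption.

```python
import statistics
import time
from typing import Callable


def time_call(
    fn: Callable[[], object], repeats: int = 10, warmup: int = 1
) -> dict[str, float]:
    """Time fn() with time.perf_counter, ignoring the first `warmup` calls."""
    # Warm-up calls are executed but not measured.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # Report several summaries; the median is less sensitive to outliers.
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
    }
```

With the functions from obs.py in scope, it could be used as `time_call(lambda: create_spin_operator(random_h))` or `time_call(lambda: cudaq.observe(ansatz, random_spin_operator, qubit_count).expectation())`.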

I am getting the following times (in seconds, averaged over 10 runs) on a GH200 machine:
Version: 0.8.0

$ python obs.py
SpinOperator creation: 0.010332965850830078
cudaq.observe duration: 0.042797398567199704

Version: cu12-0.9.0

$ python obs.py
SpinOperator creation: 0.22831881046295166
cudaq.observe duration: 1.961019015312195

cu12-latest, cu11-latest, and cu12-0.9.0 have comparable running times.
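
The slowdown factors quoted in the description follow directly from the timings above. A minimal sketch of the arithmetic, using only the numbers printed by obs.py (the dictionary layout and the `slowdown` helper are illustrative, not part of the report):

```python
# Mean timings in seconds, copied from the obs.py output above.
timings = {
    "0.8.0": {"create": 0.010332965850830078, "observe": 0.042797398567199704},
    "cu12-0.9.0": {"create": 0.22831881046295166, "observe": 1.961019015312195},
}


def slowdown(baseline: dict[str, float], candidate: dict[str, float]) -> dict[str, float]:
    """Return candidate/baseline for every step timed in the baseline."""
    return {step: candidate[step] / baseline[step] for step in baseline}


factors = slowdown(timings["0.8.0"], timings["cu12-0.9.0"])
for step, factor in factors.items():
    print(f"{step}: {factor:.1f}x slower")
```

The two ratios round to 22x for SpinOperator creation and 46x for cudaq.observe.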

Expected behavior

I expect the latest version to have comparable running times to the previous stable version (v0.8.0).

Is this a regression? If it is, put the last known working version (or commit) here.

Performance regression from v0.8.0

Environment

  • CUDA-Q version: cu11-0.9.0, cu12-0.9.0, and cu12-latest
  • Python version: 3.10.12
  • Operating system: Docker container

Suggestions

No response

@1tnguyen 1tnguyen added performance python-lang Anything related to the Python CUDA Quantum language implementation labels Nov 28, 2024
@boschmitt
Collaborator

For those looking into this: I can reproduce the issue. The problem seems to lie in the new SpinOperator and its lazy evaluation capabilities. Using the old one shows no performance regression.

Using the following create_spin_operator function:

def create_spin_operator(
    h: dict[str, float]
):  # -> cudaq.mlir._mlir_libs._quakeDialects.cudaq_runtime.SpinOperator
    res = 0
    for pauli_word, coeff in h.items():
        term = cudaq.mlir._mlir_libs._quakeDialects.cudaq_runtime.SpinOperator.from_word(pauli_word)
        res += term * coeff

    return res

Results:

# Version 0.8.0
SpinOperator creation: 0.007181382179260254
cudaq.observe duration: 0.03874294757843018

# Version 0.9.0-cu12 (new SpinOperator)
SpinOperator creation: 0.18058569431304933
cudaq.observe duration: 1.308878755569458

# Version 0.9.0-cu12 (old SpinOperator)
SpinOperator creation: 0.007892823219299317
cudaq.observe duration: 0.03526647090911865
