

Quantum computing systems and software company D-Wave Quantum Inc. has partnered with the pharmaceutical division of Japan Tobacco Inc. to build a proof-of-concept artificial intelligence model using quantum technology to enhance drug discovery.
The goal of the project, announced Monday, is to improve the training accuracy of AI models called generative pretrained transformers used in drug discovery. These are similar to the engines behind large language models such as OpenAI’s ChatGPT, but instead of generating words, they generate drug molecules.
The researchers used D-Wave’s quantum processing units to assist with training rather than relying on classical computation alone. The work demonstrated that the quantum-assisted AI models produced more valid molecules than models trained only on graphics processing units, the industry standard for training AI models. The companies added that the new models also discovered more molecules likely to be better drug candidates than those in the training dataset, outperforming models trained with classical methods on that measure.
The chemical space for discovering new drugs is extremely large, creating a broad range of potential properties that drugs could exhibit. For example, properties could include cell membrane permeability, potency, ability to move through the body, absorption rate, metabolic uptake and potential toxicity. Changing one property of a drug molecule can have a dramatic effect on any number of other properties.
“To the best of our knowledge, this is the first work for annealing quantum computation to outperform classical results concerning LLM training in drug discovery,” said Dr. Masaru Tateno, chief scientific officer of Central Pharma Research Institute at Japan Tobacco.
D-Wave’s QPUs are built on a type of quantum architecture known as “quantum annealing,” which works well for applications such as materials science simulations and optimization. In this case, the QPUs were used to optimize candidate druglike molecules, helping the AI model learn to generate better molecules by tackling the underlying challenges in training and generation.
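D-Wave exposes this kind of annealing workflow through its open-source Ocean SDK in Python. The sketch below is only a minimal illustration of the general pattern, assuming a made-up formulation in which binary variables select candidate molecular fragments and a QUBO energy scores the combination; the fragment names and all coefficients are hypothetical, not the actual JT/D-Wave model.

```python
# Minimal sketch of the annealing-style optimization pattern, using
# D-Wave's open-source Ocean SDK (pip install dimod).
# The "fragment" variables and all coefficients are hypothetical
# illustrations, not the actual JT/D-Wave formulation.
import dimod

# One binary variable per candidate fragment: 1 = include it.
# Negative linear terms reward promising fragments (lower energy = better).
linear = {"frag_a": -1.0, "frag_b": -0.8, "frag_c": -0.5}

# Positive quadratic terms penalize incompatible fragment pairs.
quadratic = {("frag_a", "frag_b"): 2.0, ("frag_b", "frag_c"): 0.5}

bqm = dimod.BinaryQuadraticModel(linear, quadratic, 0.0, dimod.BINARY)

# An exhaustive solver suffices at toy scale; on real hardware the same
# model would be sampled with dwave.system.DWaveSampler instead.
best = dimod.ExactSolver().sample(bqm).first
print(best.sample, best.energy)
```

The appeal of the pattern is that the problem definition stays the same whether a classical sampler or a quantum annealer draws the low-energy samples.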
Japan Tobacco said it intends to use the project as a proof of concept for discovering first-in-class small-molecule compounds faster and more cheaply. Small-molecule drugs are important because they are typically administered by mouth and can be used to treat a wide variety of diseases. For example, JT could be interested in discovering drugs to treat metabolic diseases such as diabetes or obesity, autoimmune diseases such as arthritis or Crohn’s disease, or in developing new pain medications.
“Moving forward, with D-Wave’s quantum annealing machines, we aim to maximize the use of quantum computing hardware characteristics and accelerate our efforts in achieving Quantum AI-driven drug discovery,” added Tateno.
Diffusion AI models, the type of model behind text-to-image generators, are trained on massive datasets of images so they can produce vivid, imaginative renderings of what a user asks for. However, this also means that the neural nets underlying them are expensive to train and can take a long time to generate images.
Trevor Lanting, chief development officer at D-Wave, told SiliconANGLE in an interview that the company has another AI project in the works using quantum annealing technology and an architecture proposed by Nvidia Corp. engineers that would enhance models with “discrete latent variable” spaces.
Essentially, when a user asks a diffusion model to produce an image of a cat, it must, in effect, draw on everything it learned from its entire dataset of images to determine what is “cat-like” and produce a new cat image. A latent space captures the essential features and relationships that represent “cat” qualities, allowing a model to analyze, interpret and generate a new cat image efficiently.
With these included, a model doesn’t need to search the entire continuous space of its training dataset. The semantic information within a given image that contains a cat can be mapped, as a relatively small number of relationships, into a discrete latent space. This means the model doesn’t need to learn everything about every image to produce a cat; it just needs to grasp the “cat context” at a high level.
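As a rough picture of what a discrete latent space looks like in code, the NumPy sketch below snaps continuous feature vectors to their nearest entries in a small codebook, in the spirit of vector quantization. The codebook size, dimensions and random data are arbitrary stand-ins, and this is not the specific Nvidia-proposed architecture Lanting describes, just the general idea of replacing a continuous representation with a small set of discrete codes.

```python
# Toy discrete latent space: continuous features are replaced by the
# nearest entry in a small codebook (vector-quantization style).
# All shapes and values here are arbitrary stand-ins.
import numpy as np

rng = np.random.default_rng(0)

codebook = rng.normal(size=(16, 8))  # 16 discrete codes, 8 dimensions each
features = rng.normal(size=(4, 8))   # 4 continuous feature vectors

# Squared Euclidean distance from every feature to every code.
dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)

codes = dists.argmin(axis=1)  # the discrete latent: one code index per vector
quantized = codebook[codes]   # what downstream layers would actually consume

print(codes)             # four small integers summarizing the inputs
print(quantized.shape)   # (4, 8)
```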
“What we’re doing right now is investigating co-training the diffusion models enhanced with these discrete latent spaces, to see if we can reduce the overall model complexity and the size of the model,” said Lanting. “By reducing the number of parameters in the continuous part of the neural net, we can potentially do training faster at lower cost.”
Parameters are configuration settings that determine how an AI model processes data. They can be adjusted during training to change how the model behaves. The greater the number of parameters, the larger and more complex the model.
At the same time, models with an extremely large number of parameters can be expensive and time-consuming to train. For context, the Stable Diffusion 3.5 family of open-source image generation models ranges from 2.5 billion to 8 billion parameters.
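To make that scale concrete, the short sketch below counts the parameters in a single hypothetical fully connected layer; models in the billions of parameters stack thousands of such layers along with attention blocks. The layer sizes are invented for illustration.

```python
# Parameter count for one hypothetical fully connected layer:
# a weight for every input-output pair, plus one bias per output.
in_features, out_features = 1024, 4096

weights = in_features * out_features  # 4,194,304 weights
biases = out_features                 # 4,096 biases

print(f"{weights + biases:,} parameters in this one layer")  # 4,198,400
```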
According to Lanting, Nvidia is currently using classical computing and GPUs for this work, but he believes D-Wave’s QPUs could handle the process more efficiently because it is fundamentally an optimization problem.
“It’s a direct example of the sort of an architecture that would be hybrid, where the goal is to get equal or better performance, but with potentially dramatically lower costs,” explained Lanting.