
Can't load Flux ONNX model components using InferenceSession #23770

Open
Kishimita opened this issue Feb 20, 2025 · 1 comment

@Kishimita
Describe the issue

Since there isn't a pipeline for running inference with FLUX.1-dev-onnx, I am working on using onnxruntime and a custom script to load the model components and run inference with them.

Below is what my custom Flux pipeline script looks like.

import onnxruntime as ort
import numpy as np
from PIL import Image
import os

class FluxONNXPipeline:
	def __init__(self, model_onnx_paths, num_steps=50, guidance_scale=7.5):
		"""
		Initialize the pipeline by loading each ONNX component.

		Parameters:
		  model_onnx_paths (dict): Dictionary where the keys are model names and the values are paths to the ONNX files.
		  num_steps (int): Number of diffusion iterations.
		  guidance_scale (float): Guidance scale (if classifier-free guidance is used).
		"""
		# get_available_providers() is a module-level function, not a method
		# on InferenceSession.
		print(f"Available execution providers: {ort.get_available_providers()}")

		try:
			# Load the CLIP text encoder with ONNX Runtime.
			self.clip_session = ort.InferenceSession(
				model_onnx_paths['CLIP'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading CLIP model: {e}")

		try:
			# Load the T5 text encoder used for the second prompt embedding.
			self.t5_session = ort.InferenceSession(
				model_onnx_paths['T5'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading T5 model: {e}")

		try:
			# This session is the core diffusion (denoising) transformer.
			self.transformer_session = ort.InferenceSession(
				model_onnx_paths['TransformerFP4'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading Transformer model: {e}")

		try:
			# The VAE decodes the final latent representation to an image.
			self.vae_session = ort.InferenceSession(
				model_onnx_paths['VAE'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading VAE model: {e}")

		# Store diffusion parameters.
		self.num_steps = num_steps
		self.guidance_scale = guidance_scale
# Example usage:
if __name__ == "__main__":
	t5_onnx_path = "/Flux2/ai-toolkit/model_weights/t5/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/t5.opt/model.onnx"
	transformerbfp16_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/bfp16/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/bf16/model.onnx"
	transformerfp4_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/fp4/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp4/model.onnx"
	transformerfp8_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/fp8/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp8/model.onnx"
	clip_onnx_path = "/Flux2/ai-toolkit/model_weights/clip/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/clip.opt/model.onnx"
	vae_onnx_path = "/jcerutti/Flux2/ai-toolkit/model_weights/vae/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/vae.opt/model.onnx"
	model_onnx_weights = {"CLIP": clip_onnx_path,
						  "T5": t5_onnx_path,
						  "TransformerBFP16": transformerbfp16_onnx_path,
						  "TransformerFP4": transformerfp4_onnx_path,
						  "TransformerFP8": transformerfp8_onnx_path,
						  "VAE": vae_onnx_path}
	# Build the pipeline from the component paths defined above.
	pipeline = FluxONNXPipeline(model_onnx_paths=model_onnx_weights)

Once I run this, below is the main error I've been trying to fix and haven't been able to:

Script started at Thu Feb 20 11:43:26 AM EST 2025
Environment: /Flux2/.flux2-venv/bin/python
Python 3.11.2
Error loading CLIP model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/clip/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/clip.opt/model.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:180 onnxruntime::Model::Model(onnx::ModelProto&&, const onnxruntime::PathString&, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 11, max supported IR version: 10

Error loading T5 model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/t5/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/t5.opt/model.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:180 onnxruntime::Model::Model(onnx::ModelProto&&, const onnxruntime::PathString&, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 11, max supported IR version: 10

Error loading Transformer model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/transformer/fp4/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp4/model.onnx failed:Invalid tensor data type 23.
Error loading VAE model: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /Flux2/ai-toolkit/model_weights/vae/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/vae.opt/model.onnx failed:This is an invalid model. Type Error: Type 'tensor(bfloat16)' of input parameter (latent) of operator (Conv) in node (/decoder/conv_in/Conv) is invalid.

<__main__.FluxONNXPipeline object at 0x7f8aec1bfa50>
Traceback (most recent call last):
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 196, in <module>
    generated_image = pipeline(prompt)
                      ^^^^^^^^^^^^^^^^
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 171, in __call__
    text_embedding = self.encode_text(prompt)
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 98, in encode_text
    input_name = self.clip_session.get_inputs()[0].name
                 ^^^^^^^^^^^^^^^^^
AttributeError: 'FluxONNXPipeline' object has no attribute 'clip_session'
Script ended at Thu Feb 20 11:43:28 AM EST 2025
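
For diagnosis, here is a minimal sketch (assuming the onnx Python package is installed; model_onnx_weights is the dict from the script above) that prints the IR version and opsets each component declares, which is what the first two load errors complain about:

import onnx

# Print the IR version and opset imports that ONNX Runtime is rejecting.
for name, path in model_onnx_weights.items():
	# load_external_data=False reads only the graph proto, not the weight files.
	model = onnx.load(path, load_external_data=False)
	print(name, "ir_version:", model.ir_version,
		"opsets:", [(op.domain, op.version) for op in model.opset_import])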

To reproduce

The script and full error output are identical to the description above: running the custom pipeline script against the FLUX.1-dev ONNX component paths causes every session to fail to load (unsupported IR version 11 for CLIP and T5, invalid tensor data type 23 for the FP4 transformer, and a bfloat16 Conv type error for the VAE), followed by the AttributeError when the pipeline is called.

Urgency

No response

Platform

Linux

OS Version

Debian GNU/Linux 12

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

I've used 1.17, 1.18, and 1.20

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Other / Unknown

Execution Provider Library Version

No response

@tianleiwu
Contributor

"Unsupported model IR version: 11, max supported IR version: 10" means that you need to use a nightly build.
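
To confirm which runtime build is actually being imported (a stale wheel in another environment is a common cause), a quick sketch; nothing here is specific to Flux:

import onnxruntime as ort

print(ort.__version__)                # the reporter's builds (1.17-1.20) cap IR support at version 10
print(ort.get_available_providers())  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']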

I think the Flux ONNX models are optimized for TensorRT. You might need to use the TensorRT Execution Provider, or TensorRT itself, to run them.
See https://github.com/black-forest-labs/flux/pull/410/files#diff-5ae3b79e3afd3516d247448cfe13d20acd242002fe557d2dad4e966342d41fbc
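
As a minimal sketch of that suggestion (assuming a GPU build of onnxruntime with the TensorRT EP available; model_path and the trt_fp16_enable option are illustrative, not taken from this issue):

import onnxruntime as ort

model_path = "transformer.opt/fp8/model.onnx"  # hypothetical; substitute one of the paths above

session = ort.InferenceSession(
	model_path,
	providers=[
		# Try TensorRT first, then fall back to CUDA and CPU.
		("TensorrtExecutionProvider", {"trt_fp16_enable": True}),
		"CUDAExecutionProvider",
		"CPUExecutionProvider",
	],
)
print(session.get_providers())  # providers actually assigned to this session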
