
Can't load Flux ONNX model components using InferenceSession #23770

Open
Kishimita opened this issue Feb 20, 2025 · 1 comment

@Kishimita
Describe the issue

Since there isn't a pipeline for running inference with FLUX.1-dev-onnx, I am working on using onnxruntime and a custom script to load the model components and run inference with them.

Below is what my custom Flux pipeline script looks like.

import onnxruntime as ort
import numpy as np
from PIL import Image
import os

class FluxONNXPipeline:
	def __init__(self, model_onnx_paths, num_steps=50, guidance_scale=7.5):
		"""
		Initialize the pipeline by loading each ONNX component.

		Parameters:
		  model_onnx_paths (dict): Dictionary where the keys are model names and the values are paths to the ONNX files.
		  num_steps (int): Number of diffusion iterations.
		  guidance_scale (float): Guidance scale (if classifier-free guidance is used).
		"""
		# get_available_providers() is a module-level function, not a method
		# on InferenceSession.
		print(f"Available execution providers: {ort.get_available_providers()}")

		try:
			# Load the CLIP text encoder with ONNX Runtime.
			self.clip_session = ort.InferenceSession(
				model_onnx_paths['CLIP'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading CLIP model: {e}")

		try:
			# Load the T5 text encoder used for the second prompt embedding.
			self.t5_session = ort.InferenceSession(
				model_onnx_paths['T5'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading T5 model: {e}")

		try:
			# This session is the core diffusion (denoising) transformer.
			self.transformer_session = ort.InferenceSession(
				model_onnx_paths['TransformerFP4'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading Transformer model: {e}")

		try:
			# The VAE decodes the final latent representation to an image.
			self.vae_session = ort.InferenceSession(
				model_onnx_paths['VAE'],
				providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
			)
		except Exception as e:
			print(f"Error loading VAE model: {e}")

		# Store diffusion parameters.
		self.num_steps = num_steps
		self.guidance_scale = guidance_scale
# Example usage:
if __name__ == "__main__":
	t5_onnx_path = "/Flux2/ai-toolkit/model_weights/t5/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/t5.opt/model.onnx"
	transformerbfp16_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/bfp16/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/bf16/model.onnx"
	transformerfp4_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/fp4/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp4/model.onnx"
	transformerfp8_onnx_path = "/Flux2/ai-toolkit/model_weights/transformer/fp8/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp8/model.onnx"
	clip_onnx_path = "/Flux2/ai-toolkit/model_weights/clip/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/clip.opt/model.onnx"
	vae_onnx_path = "/jcerutti/Flux2/ai-toolkit/model_weights/vae/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/vae.opt/model.onnx"
	model_onnx_weights = {"CLIP": clip_onnx_path,
						  "T5": t5_onnx_path,
						  "TransformerBFP16": transformerbfp16_onnx_path,
						  "TransformerFP4": transformerfp4_onnx_path,
						  "TransformerFP8": transformerfp8_onnx_path,
						  "VAE": vae_onnx_path}
	# Build the pipeline from the component paths defined above.
	pipeline = FluxONNXPipeline(model_onnx_paths=model_onnx_weights)

Once I run this, below is the main error I've been trying to fix and haven't been able to:

Script started at Thu Feb 20 11:43:26 AM EST 2025
Environment: /Flux2/.flux2-venv/bin/python
Python 3.11.2
Error loading CLIP model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/clip/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/clip.opt/model.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:180 onnxruntime::Model::Model(onnx::ModelProto&&, const onnxruntime::PathString&, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 11, max supported IR version: 10

Error loading T5 model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/t5/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/t5.opt/model.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:180 onnxruntime::Model::Model(onnx::ModelProto&&, const onnxruntime::PathString&, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 11, max supported IR version: 10

Error loading Transformer model: [ONNXRuntimeError] : 1 : FAIL : Load model from /Flux2/ai-toolkit/model_weights/transformer/fp4/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/transformer.opt/fp4/model.onnx failed:Invalid tensor data type 23.
Error loading VAE model: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /Flux2/ai-toolkit/model_weights/vae/models--black-forest-labs--FLUX.1-dev-onnx/snapshots/b566cc0360f26cdbbbabec71621a9f9260835cdd/vae.opt/model.onnx failed:This is an invalid model. Type Error: Type 'tensor(bfloat16)' of input parameter (latent) of operator (Conv) in node (/decoder/conv_in/Conv) is invalid.

<__main__.FluxONNXPipeline object at 0x7f8aec1bfa50>
Traceback (most recent call last):
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 196, in <module>
    generated_image = pipeline(prompt)
                      ^^^^^^^^^^^^^^^^
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 171, in __call__
    text_embedding = self.encode_text(prompt)
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Flux2/ai-toolkit/pipelines/custom_flux_pipeline.py", line 98, in encode_text
    input_name = self.clip_session.get_inputs()[0].name
                 ^^^^^^^^^^^^^^^^^
AttributeError: 'FluxONNXPipeline' object has no attribute 'clip_session'
Script ended at Thu Feb 20 11:43:28 AM EST 2025
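
For diagnosis, here is a minimal sketch (assuming the onnx Python package is installed; model_onnx_weights is the dict from the script above) that prints the IR version and opsets each component declares, which is what the first two load errors complain about:

import onnx

# Print the IR version and opset imports that ONNX Runtime is rejecting.
for name, path in model_onnx_weights.items():
	# load_external_data=False reads only the graph proto, not the weight files.
	model = onnx.load(path, load_external_data=False)
	print(name, "ir_version:", model.ir_version,
		"opsets:", [(op.domain, op.version) for op in model.opset_import])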

To reproduce

The script and full error output are identical to the description above: running the custom pipeline script against the FLUX.1-dev ONNX component paths causes every session to fail to load (unsupported IR version 11 for CLIP and T5, invalid tensor data type 23 for the FP4 transformer, and a bfloat16 Conv type error for the VAE), followed by the AttributeError when the pipeline is called.

Urgency

No response

Platform

Linux

OS Version

Debian GNU/Linux 12

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

I've used 1.17, 1.18, and 1.20

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Other / Unknown

Execution Provider Library Version

No response

@tianleiwu
Contributor

"Unsupported model IR version: 11, max supported IR version: 10" means that you need to use a nightly build.
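
To confirm which runtime build is actually being imported (a stale wheel in another environment is a common cause), a quick sketch; nothing here is specific to Flux:

import onnxruntime as ort

print(ort.__version__)                # the reporter's builds (1.17-1.20) cap IR support at version 10
print(ort.get_available_providers())  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']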

I think the Flux ONNX models are optimized for TensorRT. You might need to use the TensorRT Execution Provider, or TensorRT itself, to run them.
See https://github.com/black-forest-labs/flux/pull/410/files#diff-5ae3b79e3afd3516d247448cfe13d20acd242002fe557d2dad4e966342d41fbc
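
As a minimal sketch of that suggestion (assuming a GPU build of onnxruntime with the TensorRT EP available; model_path and the trt_fp16_enable option are illustrative, not taken from this issue):

import onnxruntime as ort

model_path = "transformer.opt/fp8/model.onnx"  # hypothetical; substitute one of the paths above

session = ort.InferenceSession(
	model_path,
	providers=[
		# Try TensorRT first, then fall back to CUDA and CPU.
		("TensorrtExecutionProvider", {"trt_fp16_enable": True}),
		"CUDAExecutionProvider",
		"CPUExecutionProvider",
	],
)
print(session.get_providers())  # providers actually assigned to this session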
