Image Output Configuration
Complete configuration guide for image generation and reconstruction tasks.
Overview
Image outputs handle visual data generation including:
Image synthesis - Creating new images from learned distributions
Image reconstruction - Restoring degraded or incomplete images
Style transfer - Converting images between different styles
Medical image generation - Synthetic medical imaging data
Quick Example
output_info:
  output_source: "my//images/"
  output_name: "synthetic_xray"
  output_type: "image"
output_type_info:
  size: [256, 256]  # Matches the size field of ImageOutputTypeConfig
  num_channels: 1
model_config:
  model_type: "array"  # Images use array-based architectures
  model_init_config:
    fc_repr_dim: 1024
Output Type Configuration
class eir.setup.schema_modules.output_schemas_image.ImageOutputTypeConfig(
    size: Sequence[int] = (64,),
    resize_approach: Literal['resize', 'randomcrop', 'centercrop'] = 'resize',
    adaptive_normalization_max_samples: int | None = None,
    mean_normalization_values: None | Sequence[float] = None,
    stds_normalization_values: None | Sequence[float] = None,
    mode: Literal['RGB', 'L', 'RGBA'] | None = None,
    num_channels: int | None = None,
    loss: Literal['mse', 'diffusion'] = 'mse',
    diffusion_time_steps: int | None = 1000,
    diffusion_beta_schedule: Literal['linear', 'scaled_linear', 'squaredcos_cap_v2', 'sigmoid'] = 'linear',
)
Parameters:
adaptive_normalization_max_samples – If using adaptive normalization (channel / element), how many samples to use to compute the normalization parameters. If None, will use all samples.
resize_approach – The method used for resizing the images. Options are:
    resize: Directly resize the image to the target size.
    randomcrop: Resize the image to a larger size than the target and then apply a random crop to the target size.
    centercrop: Resize the image to a larger size than the target and then apply a center crop to the target size.
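The three resize approaches differ only in the intermediate size and the crop window. The following is a minimal sketch of that geometry, not the library's implementation: the function name plan_resize and the oversize factor of 1.25 are hypothetical, since the text above does not say how much larger the intermediate size is.

```python
import random


def plan_resize(target_size, approach, oversize=1.25, rng=None):
    """Return (resize_to, crop_box) for one image.

    target_size: (width, height) of the final image.
    approach: 'resize', 'randomcrop' or 'centercrop'.
    oversize: hypothetical factor for the intermediate resize (an assumption).
    crop_box: (left, top, right, bottom), or None when no crop is applied.
    """
    width, height = target_size
    if approach == "resize":
        # Resize straight to the target, no cropping.
        return (width, height), None

    # Both crop variants first resize to something larger than the target.
    big_w, big_h = round(width * oversize), round(height * oversize)
    if approach == "centercrop":
        left, top = (big_w - width) // 2, (big_h - height) // 2
    elif approach == "randomcrop":
        rng = rng or random.Random()
        left = rng.randint(0, big_w - width)
        top = rng.randint(0, big_h - height)
    else:
        raise ValueError(f"Unknown resize_approach: {approach!r}")
    return (big_w, big_h), (left, top, left + width, top + height)


resize_to, crop_box = plan_resize((256, 256), "centercrop")
```

With these assumed settings, a 256 x 256 target is first resized to 320 x 320 and then cropped back to a 256 x 256 window.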
mean_normalization_values – Mean channel values to normalize images with. This can be a sequence matching the number of channels, or None. If None and using a pretrained model, the values used for the model pretraining will be used. If None and training from scratch, the training data is iterated over to compute the running average per channel.
stds_normalization_values – Standard deviation channel values to normalize images with. This can be a sequence matching the number of channels, or None. If None and using a pretrained model, the values used for the model pretraining will be used. If None and training from scratch, the training data is iterated over to compute the running standard deviation per channel.
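When no normalization values are given and training starts from scratch, per-channel statistics have to be gathered in a pass over the training data. One way to do that in a single streaming pass is Welford's algorithm, sketched below. This is an illustration under assumptions, not the library's code: RunningChannelStats and normalize are hypothetical names, and images are represented as plain nested lists (one flat list of pixel values per channel).

```python
import math


class RunningChannelStats:
    """Per-channel running mean and standard deviation (Welford's algorithm)."""

    def __init__(self, num_channels):
        self.count = [0] * num_channels
        self.mean = [0.0] * num_channels
        self.m2 = [0.0] * num_channels  # running sum of squared deviations

    def update(self, image):
        """image: list of channels, each a flat list of pixel values."""
        for c, pixels in enumerate(image):
            for value in pixels:
                self.count[c] += 1
                delta = value - self.mean[c]
                self.mean[c] += delta / self.count[c]
                self.m2[c] += delta * (value - self.mean[c])

    def std(self):
        """Population standard deviation per channel."""
        return [
            math.sqrt(m2 / n) if n else 0.0
            for m2, n in zip(self.m2, self.count)
        ]


def normalize(image, means, stds):
    """Apply (x - mean) / std channel by channel."""
    return [
        [(value - m) / s for value in pixels]
        for pixels, m, s in zip(image, means, stds)
    ]


stats = RunningChannelStats(num_channels=1)
for img in [[[0.0, 1.0]], [[1.0, 0.0]]]:  # two tiny single-channel images
    stats.update(img)
normalized = normalize([[0.0, 1.0]], stats.mean, stats.std())
```

A cap such as adaptive_normalization_max_samples would correspond to stopping the update loop after a fixed number of images.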
mode – An explicit mode to convert loaded images to. Useful when working with input data with a mixed number of channels, or when you want to convert images to a specific mode. Options are:
    RGB: Red, Green, Blue (channels=3)
    L: Grayscale (channels=1)
    RGBA: Red, Green, Blue, Alpha (channels=4)
num_channels – Number of channels in the images. If None, the number of channels is inferred from a random image in the training data. Setting this is useful when the channel count is known ahead of time: an error is raised if an image with a different number of channels is encountered.
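Since each mode implies a channel count, mode and num_channels can be cross-checked before training. The helper below is a hypothetical sketch of such a check (resolve_num_channels is not a library function); only the mode-to-channel mapping comes from the options listed above.

```python
MODE_TO_CHANNELS = {"RGB": 3, "L": 1, "RGBA": 4}


def resolve_num_channels(mode=None, num_channels=None):
    """Return the channel count implied by mode and num_channels.

    Returns None when neither is set, meaning the count would have to be
    inferred from the data instead.
    """
    if mode is None:
        return num_channels
    if mode not in MODE_TO_CHANNELS:
        raise ValueError(f"Unknown mode: {mode!r}")
    implied = MODE_TO_CHANNELS[mode]
    if num_channels is not None and num_channels != implied:
        raise ValueError(
            f"mode={mode!r} implies {implied} channels, "
            f"but num_channels={num_channels}"
        )
    return implied


channels = resolve_num_channels(mode="L", num_channels=1)
```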
loss – Which loss to use for training the model. Either mse or diffusion.
diffusion_time_steps – Number of time steps to use for diffusion loss. Only used if loss is set to diffusion.
diffusion_beta_schedule – Scheduler type to use for the diffusion process. Options are:
    linear
    scaled_linear
    squaredcos_cap_v2
    sigmoid
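To give a feel for how the four schedule names differ, the sketch below computes one noise level (beta) per time step for each. The formulas are the ones commonly associated with these names in diffusion scheduler implementations. Whether this library computes them identically is an assumption, as are the function name make_betas and the beta_start and beta_end defaults, none of which appear in the parameter list above.

```python
import math


def make_betas(schedule, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Return a list of num_steps beta values for the named schedule."""

    def linspace(start, stop, n):
        if n == 1:
            return [start]
        step = (stop - start) / (n - 1)
        return [start + i * step for i in range(n)]

    if schedule == "linear":
        # Evenly spaced betas.
        return linspace(beta_start, beta_end, num_steps)
    if schedule == "scaled_linear":
        # Evenly spaced in sqrt(beta), then squared.
        return [
            b ** 2
            for b in linspace(beta_start ** 0.5, beta_end ** 0.5, num_steps)
        ]
    if schedule == "squaredcos_cap_v2":
        # Cosine schedule on the cumulative alpha product, capped at 0.999.
        def alpha_bar(t):
            return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

        return [
            min(1 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), 0.999)
            for i in range(num_steps)
        ]
    if schedule == "sigmoid":
        # Sigmoid ramp between beta_start and beta_end.
        return [
            beta_start + (beta_end - beta_start) / (1 + math.exp(-x))
            for x in linspace(-6.0, 6.0, num_steps)
        ]
    raise ValueError(f"Unknown diffusion_beta_schedule: {schedule!r}")


betas = make_betas("linear", num_steps=1000)
```

In this sketch, num_steps plays the role of diffusion_time_steps.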
Output Module Configuration
Image outputs typically use array-based output modules. See Array Output Configuration for detailed configuration options.