Image Output Configuration

Complete configuration guide for image generation and reconstruction tasks.

Overview

Image outputs handle visual data generation including:

  • Image synthesis - Creating new images from learned distributions

  • Image reconstruction - Restoring degraded or incomplete images

  • Style transfer - Converting images between different styles

  • Medical image generation - Synthetic medical imaging data

Quick Example

output_info:
  output_source: "my//images/"
  output_name: "synthetic_xray"
  output_type: "image"
output_type_info:
  size: [256, 256]
  num_channels: 1
model_config:
  model_type: "array"  # Images use array-based architectures
  model_init_config:
    fc_repr_dim: 1024
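
Before training on a configuration like the one above, it can help to sanity-check the output_type_info section against the fields listed in the ImageOutputTypeConfig signature documented in the next section. The following is a minimal sketch of such a check. The helper name check_image_output_type_info is hypothetical and not part of EIR, and the mode/num_channels consistency rule is an assumption based on the channel counts listed for each mode.

```python
# Hypothetical helper, NOT part of EIR: compares an output_type_info mapping
# against the fields shown in the ImageOutputTypeConfig signature.
ALLOWED_FIELDS = {
    "size", "resize_approach", "adaptive_normalization_max_samples",
    "mean_normalization_values", "stds_normalization_values",
    "mode", "num_channels", "loss",
    "diffusion_time_steps", "diffusion_beta_schedule",
}

# Literal options copied from the documented signature.
ALLOWED_VALUES = {
    "resize_approach": ("resize", "randomcrop", "centercrop"),
    "mode": ("RGB", "L", "RGBA", None),
    "loss": ("mse", "diffusion"),
    "diffusion_beta_schedule": (
        "linear", "scaled_linear", "squaredcos_cap_v2", "sigmoid",
    ),
}

# Channel counts as listed in the description of the mode parameter.
MODE_CHANNELS = {"RGB": 3, "L": 1, "RGBA": 4}


def check_image_output_type_info(type_info: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in sorted(set(type_info) - ALLOWED_FIELDS):
        problems.append(f"unknown field: {key}")
    for key, allowed in ALLOWED_VALUES.items():
        if key in type_info and type_info[key] not in allowed:
            problems.append(f"invalid value for {key}: {type_info[key]!r}")
    mode = type_info.get("mode")
    channels = type_info.get("num_channels")
    # Assumption: an explicit mode and an explicit channel count should agree.
    if mode in MODE_CHANNELS and channels is not None:
        if MODE_CHANNELS[mode] != channels:
            problems.append(
                f"mode {mode} implies {MODE_CHANNELS[mode]} channels, got {channels}"
            )
    return problems
```

For example, check_image_output_type_info({"size": [256, 256], "num_channels": 1}) returns an empty list, while a misspelled key such as "resize_aproach" is reported as an unknown field.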

Output Type Configuration

class eir.setup.schema_modules.output_schemas_image.ImageOutputTypeConfig(
    size: Sequence[int] = (64,),
    resize_approach: Literal['resize', 'randomcrop', 'centercrop'] = 'resize',
    adaptive_normalization_max_samples: int | None = None,
    mean_normalization_values: None | Sequence[float] = None,
    stds_normalization_values: None | Sequence[float] = None,
    mode: Literal['RGB', 'L', 'RGBA'] | None = None,
    num_channels: int | None = None,
    loss: Literal['mse', 'diffusion'] = 'mse',
    diffusion_time_steps: int | None = 1000,
    diffusion_beta_schedule: Literal['linear', 'scaled_linear', 'squaredcos_cap_v2', 'sigmoid'] = 'linear',
)
Parameters:
  • adaptive_normalization_max_samples – If using adaptive normalization (channel / element), how many samples to use to compute the normalization parameters. If None, will use all samples.

  • resize_approach

    The method used for resizing the images. Options are:

    • resize: Directly resize the image to the target size.

    • randomcrop: Resize the image to a larger size than the target and then apply a random crop to the target size.

    • centercrop: Resize the image to a larger size than the target and then apply a center crop to the target size.

  • mean_normalization_values – Average channel values to normalize images with. This can be a sequence matching the number of channels, or None. If None and using a pretrained model, the values used for the model pretraining will be used. If None and training from scratch, the training data is iterated over to compute the running average per channel.

  • stds_normalization_values – Standard deviation channel values to normalize images with. This can be a sequence matching the number of channels, or None. If None and using a pretrained model, the values used for the model pretraining will be used. If None and training from scratch, the training data is iterated over to compute the running standard deviation per channel.

  • mode

    An explicit mode to convert loaded images to. Useful when working with data that has a mixed number of channels, or when you want to convert images to a specific mode. Options are:

    • RGB: Red, Green, Blue (channels=3)

    • L: Grayscale (channels=1)

    • RGBA: Red, Green, Blue, Alpha (channels=4)

  • num_channels – Number of channels in the images. If None, will try to infer the number of channels from a random image in the training data. Setting this is useful when the number of channels is known ahead of time; an error is raised if an image with a different number of channels is encountered.

  • loss – Which loss to use for training the model. Either mse or diffusion.

  • diffusion_time_steps – Number of time steps to use for diffusion loss. Only used if loss is set to diffusion.

  • diffusion_beta_schedule

    Scheduler type to use for the diffusion process. Options are:

    • linear

    • scaled_linear

    • squaredcos_cap_v2

    • sigmoid
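
To make the four schedule names concrete, the sketch below computes the per-step noise variances (betas) for each one in plain Python. It assumes the names carry the same meaning as the identically named options in the Hugging Face diffusers schedulers; whether EIR delegates to that library is not stated here. The function name make_betas and the default range beta_start=1e-4, beta_end=0.02 are likewise assumptions made for illustration.

```python
import math


def make_betas(schedule: str, num_steps: int = 1000,
               beta_start: float = 1e-4, beta_end: float = 0.02) -> list:
    """Return one beta per diffusion step for the named schedule (illustrative only)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    # Evenly spaced fractions from 0.0 to 1.0 across the steps.
    fracs = [i / (num_steps - 1) for i in range(num_steps)]

    if schedule == "linear":
        # Betas increase linearly from beta_start to beta_end.
        return [beta_start + f * (beta_end - beta_start) for f in fracs]

    if schedule == "scaled_linear":
        # Linear in sqrt(beta), then squared.
        lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
        return [(lo + f * (hi - lo)) ** 2 for f in fracs]

    if schedule == "squaredcos_cap_v2":
        # Cosine schedule: betas derived from a squared-cosine cumulative
        # signal level, capped at 0.999 to avoid a singular final step.
        def alpha_bar(t: float) -> float:
            return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

        return [
            min(1 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), 0.999)
            for i in range(num_steps)
        ]

    if schedule == "sigmoid":
        # Sigmoid evaluated on [-6, 6], rescaled into [beta_start, beta_end].
        return [
            beta_start + (beta_end - beta_start) / (1 + math.exp(-(-6 + 12 * f)))
            for f in fracs
        ]

    raise ValueError(f"unknown schedule: {schedule}")
```

Calling make_betas("linear", num_steps=1000) corresponds to the documented defaults for diffusion_beta_schedule and diffusion_time_steps. The schedules differ mainly in how quickly noise is added in the early steps, with the cosine variant adding it more gradually than the linear one.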

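Two of the preprocessing ideas in the parameter list above, crop placement for resize_approach and per-channel statistics for the normalization values, can also be sketched in plain Python. This is an illustration under stated assumptions rather than EIR's implementation: the function names crop_box and channel_stats are hypothetical, images are represented as nested lists of pixels, and max_samples is assumed to take the first images encountered (the description of adaptive_normalization_max_samples does not say which samples are used).

```python
import math
import random
from typing import Optional


def crop_box(resized: tuple, target: tuple, approach: str,
             rng: Optional[random.Random] = None) -> tuple:
    """Return (left, top, right, bottom) of the crop inside a resized image.

    Sizes are (width, height). The image is assumed to have already been
    resized to be at least as large as the target.
    """
    (w, h), (tw, th) = resized, target
    if tw > w or th > h:
        raise ValueError("target must fit inside the resized image")
    if approach == "centercrop":
        left, top = (w - tw) // 2, (h - th) // 2
    elif approach == "randomcrop":
        rng = rng or random.Random()
        left, top = rng.randint(0, w - tw), rng.randint(0, h - th)
    else:
        raise ValueError(f"no crop for approach: {approach}")
    return left, top, left + tw, top + th


def channel_stats(images, max_samples: Optional[int] = None) -> tuple:
    """Running per-channel mean and population std (Welford's algorithm).

    Each image is an iterable of pixels; each pixel is a sequence with one
    value per channel. Assumption: max_samples keeps the first N images.
    """
    count, means, m2 = 0, None, None
    for idx, image in enumerate(images):
        if max_samples is not None and idx >= max_samples:
            break
        for pixel in image:
            if means is None:
                means, m2 = [0.0] * len(pixel), [0.0] * len(pixel)
            count += 1
            for c, value in enumerate(pixel):
                delta = value - means[c]
                means[c] += delta / count
                m2[c] += delta * (value - means[c])
    if count == 0:
        raise ValueError("no pixels to compute statistics from")
    return means, [math.sqrt(v / count) for v in m2]
```

With a 300x280 resized image and a 256x256 target, crop_box((300, 280), (256, 256), "centercrop") places the crop with equal margins on opposite sides. The running update in channel_stats avoids holding all pixels in memory, which is the practical reason a sample cap can be useful on large datasets.
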
Output Module Configuration

Image outputs typically use array-based output modules. See Array Output Configuration for detailed configuration options.