Array Data Configuration
Configuration guide for multi-dimensional array and tensor data. Generally refers to NumPy arrays stored on disk (or streamed if using the streaming functionality), all of them with the same shape and type.
Overview
Array data in EIR handles multi-dimensional numerical data:
Scientific data - Sensor readings, measurements
Signal processing - Audio spectrograms, time-frequency data
Multi-dimensional features - Engineered feature matrices
Tensor data - Any N-dimensional numerical array
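As a rough illustration of the "same shape and type" requirement described above, the sketch below writes a few NumPy arrays to a folder and checks that they agree. The folder name, the sample IDs, and the one-file-per-sample layout are assumptions made for this example, not something this page specifies.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical sample IDs and array shape, chosen only for illustration.
sample_ids = ["sample_0", "sample_1", "sample_2"]
shape = (1, 16, 32)  # (channels, height, width)

rng = np.random.default_rng(seed=0)
folder = Path(tempfile.mkdtemp()) / "array_data"
folder.mkdir()

# Write one .npy file per sample, all with the same shape and dtype.
for sample_id in sample_ids:
    array = rng.normal(size=shape).astype(np.float32)
    np.save(folder / f"{sample_id}.npy", array)

# Load everything back and collect the distinct shapes and dtypes.
loaded = [np.load(path) for path in sorted(folder.glob("*.npy"))]
shapes = {a.shape for a in loaded}
dtypes = {a.dtype for a in loaded}

# A consistent dataset has exactly one shape and one dtype.
assert len(shapes) == 1 and len(dtypes) == 1
```

A check like this can be run before training to catch a stray array with a different shape or dtype early.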
Quick Example
input_info:
  input_source: "my/array/data/folder/"
  input_name: "sensor_data"
  input_type: "array"
model_config:
  model_type: "cnn"
  model_init_config:
    channel_exp_base: 3
    kernel_width: 5
    kernel_height: 1
Input Data Configuration
Base Configuration
- class eir.setup.schemas.ArrayInputDataConfig(
- mixing_subtype: Literal['mixup'] = 'mixup',
- modality_dropout_rate: float = 0.0,
- normalization: Literal['element', 'channel'] | None = 'channel',
- adaptive_normalization_max_samples: int | None = None,
- Parameters:
mixing_subtype – Which type of mixing to use on the array data, given that mixing_alpha is set >0.0 in the global configuration.
modality_dropout_rate – Dropout rate to apply to the modality, e.g., 0.2 means that 20% of the time, this modality will be dropped out during training.
normalization – Which type of normalization to apply to the array data. If 'element', will normalize each element in the array independently. If 'channel', will normalize each channel in the array independently. For 'channel', assumes PyTorch format where the channel dimension is the first dimension.
adaptive_normalization_max_samples – If using adaptive normalization (channel / element), how many samples to use to compute the normalization parameters. If None, uses all samples.
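The difference between the two normalization modes can be sketched with toy data. This is a minimal illustration and not EIR's implementation: it assumes that 'channel' pools statistics over all samples and positions within a channel, that 'element' computes statistics across samples for each position separately, and the helper names are hypothetical.

```python
from statistics import fmean, pstdev

# Three toy samples, each of shape (channels=2, width=2).
samples = [
    [[1.0, 2.0], [10.0, 20.0]],
    [[3.0, 4.0], [30.0, 40.0]],
    [[5.0, 6.0], [50.0, 60.0]],
]


def channel_stats(samples):
    """One (mean, std) pair per channel, pooled over samples and positions."""
    n_channels = len(samples[0])
    stats = []
    for c in range(n_channels):
        values = [v for sample in samples for v in sample[c]]
        stats.append((fmean(values), pstdev(values)))
    return stats


def element_stats(samples):
    """One (mean, std) pair per array element, computed across samples only."""
    n_channels = len(samples[0])
    width = len(samples[0][0])
    stats = []
    for c in range(n_channels):
        row = []
        for i in range(width):
            values = [sample[c][i] for sample in samples]
            row.append((fmean(values), pstdev(values)))
        stats.append(row)
    return stats


per_channel = channel_stats(samples)  # 2 pairs, one per channel
per_element = element_stats(samples)  # 2 x 2 pairs, one per element
```

With 'channel' there is one mean and standard deviation per channel, while 'element' keeps a separate pair for every position in the array, so it needs enough samples for each position's statistics to be stable.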
Model Selection
- class eir.models.input.array.array_models.ArrayModelConfig(
- model_type: Literal['cnn', 'lcl', 'lcl-informed-moe', 'transformer'],
- model_init_config: CNNModelConfig | LCLModelConfig | LCLInformedMoEModelConfig | ArrayTransformerConfig,
- pre_normalization: Literal['instancenorm', 'layernorm'] | None = None,
- Parameters:
model_type – Which type of array model to use.
model_init_config – Configuration used to initialise model.
Available Feature Extractors
CNN Models
- class eir.models.input.array.models_cnn.CNNModelConfig(
- layers: None | list[int] = None,
- num_output_features: int = 0,
- channel_exp_base: int = 2,
- first_channel_expansion: int = 1,
- kernel_width: int = 12,
- first_kernel_expansion_width: float = 1.0,
- down_stride_width: int = 4,
- first_stride_expansion_width: float = 1.0,
- dilation_factor_width: int = 1,
- kernel_height: int = 4,
- first_kernel_expansion_height: float = 1.0,
- down_stride_height: int = 1,
- first_stride_expansion_height: float = 1.0,
- dilation_factor_height: int = 1,
- allow_first_conv_size_reduction: bool = True,
- down_sample_every_n_blocks: int | None = 2,
- cutoff: int = 32,
- rb_do: float = 0.0,
- stochastic_depth_p: float = 0.0,
- attention_inclusion_cutoff: int = 256,
- l1: float = 0.0,
- Parameters:
layers – A list that controls the number of layers and channels in the model. Each element in the list represents a layer group with a specified number of layers and channels. Specifically:
The first element in the list refers to the number of layers with the number of channels exactly as specified by the channel_exp_base parameter.
The subsequent elements in the list correspond to an increased number of channels, doubling with each step.
For instance, if channel_exp_base=3 (i.e., 2**3=8 channels), and the layers list is [5, 3, 2], the model would be constructed as follows:
First case: 5 layers with 8 channels
Second case: 3 layers with 16 channels (doubling from the previous case)
Third case: 2 layers with 32 channels (doubling from the previous case)
The model currently supports a maximum of 4 elements in the list.
If set to None, the model will automatically set up the number of layer groups until a certain width and height (stride * 8 for both) are met. In this automatic setup, channels will be increased as the input gets propagated through the network, while the width/height get reduced due to stride.
Future work includes adding a parameter to control the target width and height.
num_output_features – Output dimension of the last FC layer in the network which accepts the outputs from the convolutional layer. If set to 0, the output will be passed through directly to the fusion module.
channel_exp_base – Which power of 2 to use in order to set the number of channels in the network. For example, setting channel_exp_base=3 means that 2**3=8 channels will be used.
first_channel_expansion – Factor to extend the first layer channels.
kernel_width – Base kernel width of the convolutions.
first_kernel_expansion_width – Factor to extend the first kernel’s width. The result of the multiplication will be rounded to the nearest integer.
down_stride_width – Down stride of the convolutional layers along the width.
first_stride_expansion_width – Factor to extend the first layer stride along the width. The result of the multiplication will be rounded to the nearest integer.
dilation_factor_width – Base dilation factor of the convolutions along the width in the network.
kernel_height – Base kernel height of the convolutions.
first_kernel_expansion_height – Factor to extend the first kernel’s height. The result of the multiplication will be rounded to the nearest integer.
down_stride_height – Down stride of the convolutional layers along the height.
first_stride_expansion_height – Factor to extend the first layer stride along the height. The result of the multiplication will be rounded to the nearest integer.
dilation_factor_height – Base dilation factor of the convolutions along the height in the network.
allow_first_conv_size_reduction – If set to False, the first convolutional layer is not allowed to reduce the size of the input. Set this to True if you want the first convolutional layer to reduce the size of the input, for example when the input is very large and should be compressed early.
cutoff – If the resulting dimension of width * height of adding a successive block is less than this value, will stop adding residual blocks to the model in the automated case (i.e., if the layers argument is not specified).
rb_do – Dropout in the convolutional residual blocks.
stochastic_depth_p – Probability of dropping input.
attention_inclusion_cutoff – If the dimension of width * height is less than this value, attention will be included in the model across channels and width * height as embedding dimension after that point (with the channels representing the length of the sequence).
l1 – L1 regularization to apply to the first layer.
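The arithmetic behind the layers, channel_exp_base and first_*_expansion_* parameters can be sketched as follows. The helper names are hypothetical and this is not the library's code; it only mirrors the doubling rule and the "rounded to the nearest integer" rule described above.

```python
def channels_per_group(layers, channel_exp_base):
    """Return (num_layers, num_channels) for each layer group.

    The first group uses 2**channel_exp_base channels and every
    following group doubles the channel count.
    """
    if len(layers) > 4:
        raise ValueError("At most 4 layer groups are supported.")
    base_channels = 2 ** channel_exp_base
    return [
        (n_layers, base_channels * 2 ** index)
        for index, n_layers in enumerate(layers)
    ]


def first_layer_size(base_size, expansion):
    """Scale a base kernel size or stride for the first layer and round it.

    Note: Python's round() sends exact halves to the nearest even integer,
    which may differ from the rounding the library applies.
    """
    return round(base_size * expansion)


groups = channels_per_group(layers=[5, 3, 2], channel_exp_base=3)
print(groups)  # [(5, 8), (3, 16), (2, 32)]

first_kernel_width = first_layer_size(base_size=12, expansion=1.5)
print(first_kernel_width)  # 18
```

The first call reproduces the [5, 3, 2] example from the layers description; the second shows a kernel_width of 12 with first_kernel_expansion_width=1.5.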
Locally Connected Models
- class eir.models.input.array.models_locally_connected.LCLModelConfig(
- patch_size: tuple[int, int, int] | None = None,
- layers: None | list[int] = None,
- kernel_width: int | Literal['patch'] = 12,
- first_kernel_expansion: int = -2,
- channel_exp_base: int = 2,
- first_channel_expansion: int = 1,
- num_lcl_chunks: None | int = None,
- rb_do: float = 0.1,
- stochastic_depth_p: float = 0.0,
- l1: float = 0.0,
- cutoff: int | Literal['auto'] = 1024,
- direction: Literal['down', 'up'] = 'down',
- attention_inclusion_cutoff: int | None = None,
This is what the "genome-local-net" model refers to. See https://academic.oup.com/nar/article/51/12/e67/7177885 for more details on the model architecture.
Note that when using the automatic network setup, kernel widths will get expanded to ensure that the feature representations become smaller as they are propagated through the network.
- Parameters:
patch_size – Controls the size of the patches used in the first layer. If set to None, the input is flattened according to the torch flatten function. Note that when using this parameter, we generally want the kernel width to be set to the product of the patch size dimensions. Order follows PyTorch convention, i.e., [channels, height, width].
layers – Controls the number of layers in the model. If set to None, the model will automatically set up the number of layers according to the cutoff parameter value.
kernel_width – Width of the locally connected kernels. Note that in the context of genomic inputs this refers to the flattened input, meaning that if we have a one-hot encoding of 4 values (e.g. SNPs), 12 refers to 12/4 = 3 SNPs per locally connected window. Can be set to None if the num_lcl_chunks parameter is set, in which case the kernel width is set automatically.
first_kernel_expansion – Factor to extend the first kernel. This value can be either positive or negative. For example, in the case of kernel_width=12, setting first_kernel_expansion=2 means that the first kernel will have a width of 24, whereas other kernels will have a width of 12. When using a negative value, divides the first kernel by the value instead of multiplying.
channel_exp_base – Which power of 2 to use in order to set the number of channels/weight sets in the network. For example, setting channel_exp_base=3 means that 2**3=8 weight sets will be used.
first_channel_expansion – Whether to expand / shrink the number of channels in the first layer as compared to other layers in the network. Works analogously to the first_kernel_expansion parameter.
num_lcl_chunks – Controls the number of splits applied to the input. E.g. with an input width of 800, using num_lcl_chunks=100 will result in a kernel width of 8, meaning 8 elements in the flattened input. If using SNP inputs with a one-hot encoding of 4 possible values, this will result in 8/4 = 2 SNPs per locally connected area.
rb_do – Dropout in the residual blocks.
stochastic_depth_p – Probability of dropping input.
l1 – L1 regularization applied to the first layer in the network.
cutoff – Feature dimension cutoff where the automatic network setup stops adding layers. The ‘auto’ option is only supported when using the model for array outputs, and will set the cutoff to roughly the number of output features.
direction – Whether to use a “down” or “up” network. “Down” means that the feature representation will get smaller as it is propagated through the network, whereas “up” means that the feature representation will get larger.
attention_inclusion_cutoff – Cutoff to start including attention blocks in the network. If set to None, no attention blocks will be included. The cutoff here refers to the "length" dimension of the input after reshaping according to the output_feature_sets in the preceding layer. For example, if we have 1024 output features and 4 output feature sets, the length dimension will be 1024/4 = 256. With an attention cutoff >= 256, the attention block will be included.
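The kernel-width and attention-length arithmetic in the parameter descriptions above can be checked with a few small helpers. These functions are hypothetical and only restate the documented examples; in particular, the use of integer division is an assumption about how non-divisible cases would be handled.

```python
def first_kernel_width(kernel_width, first_kernel_expansion):
    """Width of the first kernel given the expansion factor.

    Positive factors multiply the base width, negative factors divide it.
    """
    if first_kernel_expansion == 0:
        raise ValueError("first_kernel_expansion must be non-zero.")
    if first_kernel_expansion > 0:
        return kernel_width * first_kernel_expansion
    # Assumption: integer division for negative factors.
    return kernel_width // abs(first_kernel_expansion)


def kernel_width_from_chunks(input_width, num_lcl_chunks):
    """Kernel width implied by splitting the flattened input into chunks."""
    return input_width // num_lcl_chunks


def attention_length(num_output_features, num_feature_sets):
    """Length dimension that is compared against attention_inclusion_cutoff."""
    return num_output_features // num_feature_sets


print(first_kernel_width(12, 2))   # 24
print(first_kernel_width(12, -2))  # 6

width = kernel_width_from_chunks(input_width=800, num_lcl_chunks=100)
snps_per_window = width // 4  # one-hot encoding with 4 values per SNP
print(width, snps_per_window)  # 8 2

length = attention_length(num_output_features=1024, num_feature_sets=4)
print(length)  # 256
```

With the default kernel_width=12 and first_kernel_expansion=-2, the first kernel would under these assumptions span 6 flattened elements.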
Transformer Models
- class eir.models.input.array.models_transformers.ArrayTransformerConfig(
- patch_size: tuple[int, int, int],
- embedding_dim: int,
- num_heads: int = 8,
- num_layers: int = 2,
- dim_feedforward: int | Literal['auto'] = 'auto',
- dropout: float = 0.1,
- position: Literal['encode', 'embed'] = 'encode',
- position_dropout: float = 0.1,
- Parameters:
patch_size – Controls the size of the patches used in the first layer. If set to None, the input is flattened according to the torch flatten function. Note that when using this parameter, we generally want the kernel width to be set to the multiplication of the patch size. Order follows PyTorch convention, i.e., [channels, height, width].
embedding_dim – The embedding dimension each patch is projected to. This is also the dimension of the transformer encoder layers.
num_heads – The number of heads in the multi-head attention layers.
num_layers – The number of transformer encoder layers.
dim_feedforward – The dimension of the feedforward layers in the transformer model.
dropout – The dropout rate to use in the transformer encoder layers.
position – Whether to encode the token position or use learnable position embeddings.
position_dropout – The dropout rate to use in the position encoding/embedding.
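To get a feel for how patch_size, embedding_dim and num_heads interact, the sketch below counts the patches (tokens) produced for a given input shape. It assumes non-overlapping patches and input dimensions that divide evenly by the patch size; the helper name and the example shapes are hypothetical.

```python
from math import prod


def patch_grid(input_shape, patch_size):
    """Return (number of patches, elements per patch).

    Both arguments follow the [channels, height, width] order.
    """
    if any(dim % p != 0 for dim, p in zip(input_shape, patch_size)):
        raise ValueError("Each input dimension must be divisible by the patch size.")
    n_patches = prod(dim // p for dim, p in zip(input_shape, patch_size))
    elements_per_patch = prod(patch_size)
    return n_patches, elements_per_patch


input_shape = (1, 64, 64)
patch_size = (1, 8, 8)
n_patches, elements_per_patch = patch_grid(input_shape, patch_size)
print(n_patches, elements_per_patch)  # 64 64

# Multi-head attention splits the embedding evenly across heads,
# so embedding_dim should be divisible by num_heads.
embedding_dim = 128
num_heads = 8
assert embedding_dim % num_heads == 0
head_dim = embedding_dim // num_heads
print(head_dim)  # 16
```

Here each of the 64 patches holds 64 values, which would be projected to embedding_dim=128 before entering the encoder layers. Smaller patches give longer sequences and therefore a higher attention cost.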