msd_pytorch package¶

Submodules¶

msd_pytorch.bench module¶

class msd_pytorch.bench.TimeitResult(name, loops, repeat, best, worst, all_runs, precision=3)[source]¶

Bases: object

Object returned by the timeit magic with info about the run.

Contains the following attributes:

loops: (int) number of loops done per measurement
repeat: (int) number of times the measurement has been repeated
best: (float) best execution time / number
all_runs: (list of float) execution time of each run (in s)

__init__(name, loops, repeat, best, worst, all_runs, precision=3)[source]¶: Initialize self. See help(type(self)) for accurate signature.

property average¶

property stdev¶

msd_pytorch.bench.bench(name, timer, repeat=3)[source]¶

msd_pytorch.conv module¶

class msd_pytorch.conv.Conv2dInPlaceFunction[source]¶

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]¶

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, weight, bias, output, stride, dilation)[source]¶

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.conv.Conv2dInPlaceModule(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶

Bases: torch.nn.modules.module.Module

__init__(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶: Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.conv.conv2dInPlace()¶

msd_pytorch.conv_relu module¶

class msd_pytorch.conv_relu.ConvRelu2dInPlaceFunction[source]¶

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]¶

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, weight, bias, output, stride, dilation)[source]¶

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.conv_relu.ConvRelu2dInPlaceModule(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶

Bases: torch.nn.modules.module.Module

__init__(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶: Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.conv_relu.conv_relu2dInPlace()¶

msd_pytorch.errors module¶

exception msd_pytorch.errors.Error[source]¶

Bases: Exception

Base class for exceptions in msd_pytorch.

exception msd_pytorch.errors.InputError(message)[source]¶

Bases: msd_pytorch.errors.Error

Exception raised for errors in the input.

Attributes: message – explanation of the error

__init__(message)[source]¶: Initialize self. See help(type(self)) for accurate signature.
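As a usage sketch, the hierarchy above lets a caller catch every package-specific failure through the base class. The two classes below are local stand-ins that mirror the documented hierarchy, not the package source, and check_positive is a hypothetical helper introduced only for illustration.

```python
class Error(Exception):
    """Stand-in for msd_pytorch.errors.Error."""


class InputError(Error):
    """Stand-in for msd_pytorch.errors.InputError."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message  # explanation of the error


def check_positive(value):
    # Hypothetical validation helper, not part of msd_pytorch.
    if value <= 0:
        raise InputError(f"expected a positive value, got {value}")
    return value


try:
    check_positive(-1)
except Error as exc:  # catching the base class also catches InputError
    caught = exc
```

Catching Error rather than Exception keeps unrelated failures, such as a KeyboardInterrupt or a bug elsewhere, from being swallowed.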

msd_pytorch.image_dataset module¶

class msd_pytorch.image_dataset.ImageDataset(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]¶

Bases: torch.utils.data.dataset.Dataset

A dataset for images stored on disk.

__init__(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]¶

Create a new image dataset.

Parameters: input_path_specifier – string

A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.

If the path points to a directory, then all files in the directory are included in the image stack.

If the path points to a file, then that single file is included in the image stack.

Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.

Examples:

"~/train_images/"
"~/train_images/cats*.png"
"~/train_images/*.tif"
"~/train_images/scan*"
"~/train_images/just_one_image.jpeg"

Parameters: target_path_specifier – string

A pattern that describes the target data. Format is similar to the input path specification.
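The path-specifier rules described above (directory, single file, or glob pattern) can be sketched with the standard library. This is an illustration of the documented behaviour only: resolve_paths is a hypothetical helper, and the package may resolve and order paths differently.

```python
import glob
import os
import tempfile


def resolve_paths(path_specifier):
    """Hypothetical helper: expand a path specifier into a list of file paths."""
    # Expand "~" and resolve symlinks, as the description above states.
    path = os.path.realpath(os.path.expanduser(path_specifier))
    if os.path.isdir(path):
        # Directory: include every regular file inside it.
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        )
    # File or glob pattern: a pattern without wildcards matches at most one file.
    return sorted(glob.glob(path))


with tempfile.TemporaryDirectory() as root:
    for name in ("cats1.png", "cats2.png", "scan1.tif"):
        open(os.path.join(root, name), "w").close()
    all_files = [os.path.basename(p) for p in resolve_paths(root)]
    cats = [os.path.basename(p) for p in resolve_paths(os.path.join(root, "cats*.png"))]
    single = [os.path.basename(p) for p in resolve_paths(os.path.join(root, "scan1.tif"))]
```

Input and target specifiers should match the same number of files so that each input image has a corresponding target.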

Parameters: collapse_channels – bool

By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.

If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.

If collapse_channels=False, any channels in the image will be retained.

In either case, the returned images have at least one channel.
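To make the effect of collapse_channels=True concrete, the following sketch averages a CxHxW image over its channel axis and keeps a leading channel of size one. It uses nested lists instead of tensors so that it runs without dependencies; the function name is illustrative and is not the package's implementation.

```python
def collapse_channels(image):
    """Average a C x H x W nested list over C, keeping a leading channel axis."""
    num_channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    averaged = [
        [
            sum(image[c][y][x] for c in range(num_channels)) / num_channels
            for x in range(width)
        ]
        for y in range(height)
    ]
    return [averaged]  # shape 1 x H x W


# A 3 x 1 x 2 "color" image: three channels, one row, two columns.
rgb = [[[3.0, 6.0]], [[6.0, 9.0]], [[9.0, 12.0]]]
gray = collapse_channels(rgb)
```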

Parameters: labels – int or list(int)

By default, both input and target image pixel values are converted to float32.

If you want to retrieve the target image pixels as integral values instead, set:

labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};
labels=[1,2,5] if the labels are contained in the set {1,2,5}.

Setting labels is useful for segmentation.
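The two accepted forms of labels can be read as two ways of naming the set of permitted pixel values. The sketch below spells out that reading; label_values is a hypothetical helper, and whether the package remaps an explicit list such as [1, 2, 5] to consecutive indices is not stated here, so no remapping is shown.

```python
def label_values(labels):
    """Hypothetical helper: the label values implied by the `labels` argument."""
    if isinstance(labels, int):
        return list(range(labels))  # labels=k means {0, 1, ..., k-1}
    return sorted(set(labels))      # explicit values, e.g. [1, 2, 5]


def uses_only(pixels, labels):
    """Check that every pixel value is one of the permitted labels."""
    allowed = set(label_values(labels))
    return all(p in allowed for p in pixels)


from_int = label_values(3)
from_list = label_values([1, 2, 5])
```

Under this reading, the number of labels is simply the length of the returned list.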


property num_labels¶

The number of labels in this image stack.

If the stack is not labeled, this property access raises a RuntimeError.

Returns: The number of labels in this image stack.
Return type: int

class msd_pytorch.image_dataset.ImageStack(path_specifier, *, collapse_channels=False, labels=None)[source]¶

Bases: object

A stack of images stored on disk.

An image stack describes a collection of images matching the file path specifier path_specifier.

The images can be tiff files, or any other image filetype supported by imageio.

The image paths are sorted using a natural sorting mechanism. So “scan1.tif” comes before “scan10.tif”.
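Natural sorting compares the digit runs in a file name as numbers rather than character by character. A minimal sketch of one common way to build such a sort key is shown below; natural_key is an illustrative name, and the package may rely on a different implementation.

```python
import re


def natural_key(path):
    """Split a string into text and integer chunks so digits compare numerically."""
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", path)
    ]


paths = ["scan10.tif", "scan2.tif", "scan1.tif"]
naturally_sorted = sorted(paths, key=natural_key)
plain_sorted = sorted(paths)  # plain string order, for comparison
```

With a plain string sort, "scan10.tif" lands between "scan1.tif" and "scan2.tif", which would pair inputs and targets incorrectly if the two stacks were numbered differently.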

Images can be retrieved by indexing into the stack. For example:

ImageStack("*.tif")[i]

These images are returned as torch tensors with three dimensions CxHxW.

__init__(path_specifier, *, collapse_channels=False, labels=None)[source]¶

Create a new ImageStack.

Parameters: path_specifier – string

A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.

If the path points to a directory, then all files in the directory are included in the image stack.

If the path points to a file, then that single file is included in the image stack.

Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.

Examples:

"~/train_images/"
"~/train_images/cats*.png"
"~/train_images/*.tif"
"~/train_images/scan*"
"~/train_images/just_one_image.jpeg"

Parameters: collapse_channels – bool

By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.

If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.

If collapse_channels=False, any channels in the image will be retained.

In either case, the returned images have at least one channel.

Parameters: labels – int or list(int)

By default, all image pixel values are converted to float32.

If you want to retrieve the image pixels as integral values instead, set:

labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};
labels=[1,2,5] if the labels are contained in the set {1,2,5}.

Setting labels is useful for segmentation.

Returns: An ImageStack

find_images()[source]¶

property num_labels¶

The number of labels in this image stack.

If the stack is not labeled, this property access raises a RuntimeError.

Returns: The number of labels in this image stack.
Return type: int

msd_pytorch.main module¶

msd_pytorch.main.benchmark(msd, batch_size, input_size)[source]¶

msd_pytorch.main.experiment_main()[source]¶

msd_pytorch.main.main_function()[source]¶

msd_pytorch.main.regression(msd, epochs, batch_size, train_input_glob, train_target_glob, val_input_glob, val_target_glob, weights_path)[source]¶

msd_pytorch.main.segmentation(msd, epochs, labels, batch_size, train_input_glob, train_target_glob, val_input_glob, val_target_glob, weights_path)[source]¶

msd_pytorch.main.train(model, epochs, train_dl, val_dl, weights_path)[source]¶

msd_pytorch.msd_block module¶

class msd_pytorch.msd_block.MSDBlock2d(in_channels, dilations, width=1)[source]¶

Bases: torch.nn.modules.module.Module

__init__(in_channels, dilations, width=1)[source]¶

Multi-scale dense block

in_channels (int): Number of input channels.
dilations (tuple of int): Dilation for each convolution-block.
width (int): Number of channels per convolution.

The number of output channels is in_channels + depth * width
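As a worked example of the channel count, the sketch below assumes that depth equals the number of entries in dilations, which is consistent with one convolution-block per dilation but is an assumption rather than something stated above. The function name is illustrative.

```python
def msd_block_out_channels(in_channels, dilations, width=1):
    """Output channel count, assuming depth == len(dilations)."""
    depth = len(dilations)
    return in_channels + depth * width


# One input channel, four dilations, two channels per convolution.
out_channels = msd_block_out_channels(in_channels=1, dilations=(1, 2, 4, 8), width=2)
```

Here the block returns 1 + 4 * 2 = 9 channels: the input plus everything each convolution-block produced.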

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]¶

class msd_pytorch.msd_block.MSDBlockImpl2d[source]¶

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]¶

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, dilations, bias, *weights)[source]¶

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.msd_block.MSDModule2d(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Bases: torch.nn.modules.module.Module

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Create a 2-dimensional MSD Module

Parameters

c_in – The number of input channels.
c_out – The number of output channels.
depth – The number of layers.
width – The width of the module.
dilations – list(int)

A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Returns: an MSD module
Return type: MSDModule2d

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]¶

msd_pytorch.msd_block.msdblock2d()¶

msd_pytorch.msd_model module¶

class msd_pytorch.msd_model.MSDModel(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Bases: object

Base class for MSD models.

This class provides methods for

training the network
calculating validation scores
loading and saving the network parameters to disk.
computing normalization for input and target data.

Note

Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Create a new MSDModel base class.

Note

Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.

Parameters

c_in – The number of input channels.
c_out – The number of output channels.
depth – The depth of the MSD network.
width – The width of the MSD network.
dilations – list(int)

A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.
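The repetition rule can be illustrated by cycling through the supplied list. The sketch assumes that the first layer takes the first dilation and that the list wraps around in order; the helper name is hypothetical.

```python
def layer_dilations(depth, dilations):
    """Dilation assigned to each of `depth` layers, cycling through `dilations`."""
    return [dilations[i % len(dilations)] for i in range(depth)]


# Six layers but only four dilations: the list wraps around.
per_layer = layer_dilations(depth=6, dilations=[1, 2, 4, 8])
```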


forward(input=None, target=None)[source]¶

Calculate the loss for a single input-target pair.

Both input and target are optional. If one of these parameters is not set, a previous value of these parameters is used.

Parameters: input – torch.Tensor

A BxCxHxW-dimensional torch input tensor.

Parameters: target – torch.Tensor

A BxCxHxW-dimensional torch target tensor.

Returns: The loss on target

get_loss()[source]¶

Get the mean loss of the last forward calculation.

Gets the mean loss of the last (input, target) pair. The loss function that is used depends on whether the model is doing regression or segmentation.

Returns: The loss.
Return type: float

get_output()[source]¶

Get the output of the network.

Note

The output is only defined after a call to forward(), learn(), train(), or validate(). If none of these methods has been called, None is returned.

Returns: A torch tensor containing the output of the network or None.
Return type: torch.Tensor or NoneType

init_optimizer(trainable_net)[source]¶

learn(input=None, target=None)[source]¶

Train on a single input-target pair.

Parameters: input – torch.Tensor

A BxCxHxW-dimensional torch input tensor.

Parameters: target – torch.Tensor

A BxCxHxW-dimensional torch target tensor.

load(path)[source]¶

Load network parameters from disk.

Parameters: path – The filesystem path where the network parameters are stored.
Returns: the number of epochs the network has trained for.
Return type: int

print()[source]¶: Print the network.

save(path, epoch)[source]¶

Save network to disk.

Parameters

path – A filesystem path where the network parameters are stored.
epoch – The number of epochs the network has trained for. This is useful for reloading!

Returns: Nothing

set_input(data)[source]¶

Set input data.

Parameters: data – torch.Tensor

A BxCxHxW-dimensional torch input tensor.


set_normalization(dataloader)[source]¶

Normalize input and target data.

This function goes through all the training data to compute the mean and std of the training data.

It modifies the network so that all future invocations of the network first normalize input data and target data to have mean zero and a standard deviation of one.

These modified parameters are not updated after this step and are stored in the network, so that they are not lost when the network is saved to and loaded from disk.

Normalizing in this way makes training more stable.
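The idea of fitting a mean and standard deviation once on the training data and then applying them as fixed parameters can be sketched on a flat list of values. The sketch uses the population standard deviation and ignores channels and batching; both are simplifying assumptions, and the function names are illustrative.

```python
import statistics


def fit_normalization(values):
    """Compute the mean and (population) standard deviation of the training data."""
    return statistics.fmean(values), statistics.pstdev(values)


def normalize(values, mean, std):
    """Shift and scale values with fixed, previously fitted parameters."""
    return [(v - mean) / std for v in values]


train_pixels = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_normalization(train_pixels)
normalized = normalize(train_pixels, mean, std)
```

Because mean and std are fitted once and then held fixed, new data passed through normalize is scaled exactly as the training data was.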

Parameters: dataloader – The dataloader associated to the training data.

set_target(data)[source]¶

Set target data.

Parameters: data – torch.Tensor

A BxCxHxW-dimensional torch target tensor.


train(dataloader, num_epochs)[source]¶

Train on a dataset.

Trains the network for num_epochs epochs on the dataset supplied by dataloader.

Parameters

dataloader – A dataloader for a dataset to train on.
num_epochs – The number of epochs to train for.


validate(dataloader)[source]¶

Calculate validation score for dataset.

Calculates the mean loss per (input, target) pair in dataloader. The loss function that is used depends on whether the model is doing regression or segmentation.

Parameters: dataloader – A dataloader for a dataset to calculate the loss on.

msd_pytorch.msd_model.scaling_module(c_in, c_out, *, conv3d=False)[source]¶

Make a Module that normalizes the input data.

This part of the network can be used to renormalize the input data. Its parameters are

saved when the network is saved;
not updated by the gradient descent solvers.

Parameters

c_in – The number of input channels.
c_out – The number of output channels.
conv3d – Indicates that the input data is 3D instead of 2D.

Returns: A scaling module.
Return type: torch.nn.ConvNd

msd_pytorch.msd_module module¶

class msd_pytorch.msd_module.MSDFinalLayer(c_in, c_out)[source]¶

Bases: torch.nn.modules.module.Module

Documentation for MSDFinalLayer

Implements the final 1x1 multiplication and bias addition for all intermediate layers to get to the output layer.

Initializes the weight and bias to zero.

__init__(c_in, c_out)[source]¶: Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]¶

class msd_pytorch.msd_module.MSDLayerModule(buffer, c_in, layer_depth, width, dilation)[source]¶

Bases: torch.nn.modules.module.Module

A hidden layer of the MSD module.

The primary responsibility of this module is to define the forward() method.

This module is used by the MSDModule.

This module is not responsible for

Buffer management
Weight initialization

__init__(buffer, c_in, layer_depth, width, dilation)[source]¶

Initialize the hidden layer.

Parameters

buffer – a StitchBuffer object for storing the L and G buffers.
c_in – The number of input channels of the MSD module.
layer_depth – The depth of this layer in the MSD module. This index is zero-based: the first hidden layer has index zero.
width – The width of the MSD module.
dilation – An integer describing the dilation factor for the convolutions in this layer.

Returns: A module for the MSD hidden layer.
Return type: MSDLayerModule

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.msd_module.MSDModule(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Bases: torch.nn.modules.module.Module

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶

Create an MSD module

Parameters

c_in – The number of input channels.
c_out – The number of output channels.
depth – The number of layers.
width – The width of the module.
dilations – list(int)

A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Returns: an MSD module
Return type: MSDModule

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_buffers(input)[source]¶

msd_pytorch.msd_module.init_convolution_weights(conv_weight, c_in, c_out, width, depth)[source]¶

Initialize MSD convolution kernel weights

Based on:

Pelt, Daniel M., & Sethian, J. A. (2017). A mixed-scale dense convolutional neural network for image analysis. Proceedings of the National Academy of Sciences, 115(2), 254–259. http://dx.doi.org/10.1073/pnas.1715832114

Parameters

conv_weight – The kernel weight data
c_in – Number of input channels of the MSD module
c_out – Number of output channels of the MSD module
width – The width of the MSD module
depth – The depth of the MSD module. This is the number of hidden layers.

Returns: Nothing

msd_pytorch.msd_module.stitchLazy()¶

msd_pytorch.msd_module.units_in_front(c_in, width, layer_depth)[source]¶

Calculate how many intermediate images are in front of the current layer.

The input channels count as intermediate images.
The layer_depth index is zero-based: the first hidden layer has index zero.

Parameters

c_in – The number of input channels of the MSD module
width – The width of the MSD module
layer_depth – The depth of the layer for which we are calculating the units in front. This index is zero-based: the first hidden layer has index zero.
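A plausible reading of this count is sketched below. It assumes that every hidden layer in front of the current one contributes width intermediate images in addition to the c_in input channels; the formula is inferred from the description and is not taken from the package source.

```python
def units_in_front(c_in, width, layer_depth):
    """Intermediate images in front of a layer, assuming `width` images per hidden layer."""
    return c_in + width * layer_depth


first_layer = units_in_front(c_in=1, width=2, layer_depth=0)   # only the input channel
fourth_layer = units_in_front(c_in=1, width=2, layer_depth=3)  # input + 3 layers of width 2
```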


msd_pytorch.msd_regression_model module¶

class msd_pytorch.msd_regression_model.MSDRegressionModel(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]¶

Bases: msd_pytorch.msd_model.MSDModel

An MSD network for regression.

This class provides helper methods for using the MSD network module for regression.

Refer to the documentation of MSDModel for more information on the helper methods and attributes.

__init__(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]¶

Create a new MSD network for regression.

Parameters

c_in – The number of input channels.
c_out – The number of output channels.
depth – The depth of the MSD network.
width – The width of the MSD network.
dilations – list(int)

A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Parameters: loss – string

A string describing the loss function that should be used. Currently, the following losses are supported:

“L1” - nn.L1Loss()
“L2” - nn.MSELoss()

Parameters: parallel – bool

Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.


msd_pytorch.msd_segmentation_model module¶

class msd_pytorch.msd_segmentation_model.MSDSegmentationModel(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]¶

Bases: msd_pytorch.msd_model.MSDModel

An MSD network for segmentation.

This class provides helper methods for using the MSD network module for segmentation.

Refer to the documentation of MSDModel for more information on the helper methods and attributes.

__init__(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]¶

Create a new MSD network for segmentation.

Parameters

c_in – The number of input channels.
num_labels – The number of labels to divide the segmentation into.
depth – The depth of the MSD network
width – The width of the MSD network
dilations – list(int)

A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Parameters: parallel – bool

Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.


set_normalization(dataloader)[source]¶

Normalize input data.

This function goes through all the training data to compute the mean and std of the training data. It modifies the network so that all future invocations of the network first normalize input data. The normalization parameters are saved.

Parameters: dataloader – The dataloader associated to the training data.

set_target(target)[source]¶

Set target data.

Parameters: target – torch.Tensor

A BxCxHxW-dimensional torch target tensor.


msd_pytorch.relu_inplace module¶

class msd_pytorch.relu_inplace.ReLUInplaceFunction[source]¶

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]¶

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input)[source]¶

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.relu_inplace.ReLUInplaceModule[source]¶

Bases: torch.nn.modules.module.Module

__init__()[source]¶: Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.stitch module¶

Stitch Functions and Modules for threading the gradient

Stitching refers to the practice of copying and / or reusing shared buffers in a network to improve efficiency. It handles distributing the gradient transparently.

In this module, we implement three types of stitching:

Slow stitching: concatenates the inputs in the forward pass and distributes the gradient output in the backward pass. It is inefficient and is used for testing.
Copy stitching: copies the input into a layer buffer L and returns all layers up to and including the newly copied input. More efficient than slow stitching, but preferably used sparingly.
Lazy stitching: assumes that the input has already been copied into the layer buffer L and returns all layers up to and including the input. The gradient is accumulated in a gradient buffer G. This is fast and efficient.

class msd_pytorch.stitch.StitchBuffer[source]¶

Bases: object

__init__()[source]¶

Holds the L and G buffers for a stitched module.

The intermediate layers are stored in L for the forward pass. The gradients are stored in the G buffer.

like_(tensor, new_shape)[source]¶

Change the L and G buffers to match tensor.

Matches the tensor’s data type and device (cpu, cuda, cuda:0, cuda:i).

The shape is taken from the new_shape parameter.

Parameters

tensor – An input tensor
new_shape – The new shape that the buffer should have.

Returns: Nothing

zero_()[source]¶

Set buffers to zero.


class msd_pytorch.stitch.StitchCopyFunction[source]¶

Bases: torch.autograd.function.Function

Copy stitching:

Stores output in buffer L in the forward pass and adds the grad_output to buffer G in the backward pass.

The buffer L is a tensor of dimensions B x C x ? where

B is the minibatch size, and
C is the number of channels.

The buffer G has the same dimension as L.

The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) must be copied.

In the forward pass:

write the input into L at channel i
return L up to and including channel i

In the backward pass:

add the grad_output to G
return channel i of G

It is good practice to zero the G buffer before the backward pass. Sometimes, this is not possible since some methods, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. Therefore, when grad_output is the same size as G, the buffer G is zeroed in the backward function.
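The forward and backward steps listed above can be traced with a toy model in which each channel is a single number and the buffers are plain lists. This is a dependency-free illustration of the bookkeeping only, not the autograd function itself; it leaves out the automatic zeroing of G, and the function names are illustrative.

```python
def stitch_copy_forward(layer_input, L, i):
    """Write the input into L at channel i; return channels 0..i."""
    L[i] = layer_input
    return L[: i + 1]


def stitch_copy_backward(grad_output, G, i):
    """Add grad_output to G; return the accumulated gradient for channel i."""
    for channel, grad in enumerate(grad_output):
        G[channel] += grad
    return G[i]


L = [0.0] * 3  # layer buffer, one scalar per channel
G = [0.0] * 3  # gradient buffer, zeroed before the backward pass

out0 = stitch_copy_forward(1.0, L, 0)  # channels 0..0
out1 = stitch_copy_forward(2.0, L, 1)  # channels 0..1

# The backward pass visits the stitches in reverse order.
grad_in1 = stitch_copy_backward([0.5, 0.25], G, 1)
grad_in0 = stitch_copy_backward([0.25], G, 0)
```

Channel 0 is consumed by both stitches, so its gradient is the sum of both contributions (0.5 + 0.25), which is why G must start from zero.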

static backward(ctx, grad_output)[source]¶

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, L, G, i)[source]¶

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.stitch.StitchCopyModule(buffer, i)[source]¶

Bases: torch.nn.modules.module.Module

__init__(buffer, i)[source]¶

Make a new StitchCopyModule

Parameters

buffer – A StitchBuffer
i – index of the output channel of the stitch


forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.stitch.StitchLazyFunction

Bases: torch.autograd.function.Function

StitchLazyFunction is similar to StitchCopyFunction, but it does not copy the output of the previous layer into L. Hence the name. StitchLazyFunction assumes that the output of the previous layer has already been copied into L. This can be accomplished with conv_cuda.conv2dInPlace, for instance.

The buffer L is a tensor of dimensions B x C x ? where

B is the minibatch size, and
C is the number of channels.

The buffer G has the same dimension as L.

The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) is expected to already reside in L.

In the forward pass:

no copy is performed; the input is assumed to be present in L at channel i already

In the backward pass:

add the grad_output to G
return channel i of G

It is good practice to zero the G buffer before the backward pass. This is not always possible, because some utilities, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. For that reason, when grad_output has the same size as G, the backward function zeroes the buffer G itself.
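
The buffer handling described above can be sketched with plain tensor operations. This is not the library's implementation: sketch_forward and sketch_backward are hypothetical helpers, and two details are assumptions made for the example, namely that the forward pass returns the first i + width channels of L and that grad_output is added to the leading channels of G.

```python
import torch


def sketch_forward(input, L, i, copy=True):
    """Hypothetical forward step; the return value is an assumption."""
    width = input.shape[1]
    if copy:
        # Copy variant: write the input into L at channel i.
        L[:, i:i + width] = input
    # Lazy variant (copy=False): L[:, i:i + width] was filled beforehand.
    return L[:, :i + width]


def sketch_backward(grad_output, G, i, width):
    """Hypothetical backward step following the list above."""
    if grad_output.shape == G.shape:
        # grad_output covers the whole buffer: clear stale gradients first.
        G.zero_()
    # Assumption: accumulate into the leading channels of G.
    G[:, :grad_output.shape[1]] += grad_output
    # Return the slice of G that belongs to the input at channel i.
    return G[:, i:i + width]


B, C, H, W = 2, 3, 4, 4
L = torch.zeros(B, C, H, W)
G = torch.ones(B, C, H, W)           # deliberately left with stale values
x = torch.full((B, 1, H, W), 5.0)

i = 2
L[:, i:i + 1] = x                    # lazy usage: data is written in advance
out = sketch_forward(x, L, i, copy=False)
grad_in = sketch_backward(torch.ones_like(out), G, i, width=1)
```

Because out spans every channel of L in this example, grad_output has the same shape as G, so the stale values in G are cleared before the new gradient is added.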

static backward(ctx, grad_output)

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs a gradient computed w.r.t. the output.

static forward(ctx, input, L, G, i)

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can then be retrieved during the backward pass.

class msd_pytorch.stitch.StitchLazyModule(buffer, i)

Bases: torch.nn.modules.module.Module

__init__(buffer, i)

Make a new StitchLazyModule

Parameters

buffer – A StitchBuffer
i – index of the output channel of the stitch

forward(input)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.stitch.StitchSlowFunction

Bases: torch.autograd.function.Function

Naive stitching: concatenates two inputs in the channel dimension.
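
As a point of reference, concatenation in the channel dimension and its gradient can be written with standard PyTorch calls. This is a minimal sketch of the idea, not the body of StitchSlowFunction; the tensor shapes are arbitrary values chosen for the example.

```python
import torch

input1 = torch.randn(2, 3, 5, 5, requires_grad=True)
input2 = torch.randn(2, 1, 5, 5, requires_grad=True)

# Forward: concatenate along dim=1, the channel dimension of a B x C x H x W tensor.
stitched = torch.cat([input1, input2], dim=1)

# Backward by hand: split grad_output at the channel boundary.
grad_output = torch.ones_like(stitched)
grad1, grad2 = torch.split(
    grad_output, [input1.shape[1], input2.shape[1]], dim=1
)

# Autograd computes the same gradients for this concatenation.
stitched.backward(grad_output)
```

Each input receives the slice of grad_output that corresponds to its own channels.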

static backward(ctx, grad_output)

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs a gradient computed w.r.t. the output.

static forward(ctx, input1, input2)

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can then be retrieved during the backward pass.

msd_pytorch.stitch.stitchCopy()

msd_pytorch.stitch.stitchLazy()

msd_pytorch.stitch.stitchSlow()

Module contents

Top-level package for Mixed-scale Dense Networks for PyTorch.