msd_pytorch package

Submodules

msd_pytorch.bench module

class msd_pytorch.bench.TimeitResult(name, loops, repeat, best, worst, all_runs, precision=3)[source]

Bases: object

Object returned by the timeit magic with info about the run.

Contains the following attributes:

  • loops: (int) number of loops done per measurement

  • repeat: (int) number of times the measurement has been repeated

  • best: (float) best execution time / number

  • all_runs: (list of float) execution time of each run (in s)
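The average and stdev properties are not described in this reference. The sketch below shows one way such statistics could be derived from all_runs and loops. It assumes, as IPython's timeit result object does, that each entry of all_runs is the total time of one repeat and is divided by loops to obtain a per-loop time. The helper name per_loop_stats is hypothetical and not part of msd_pytorch.

```python
import math


def per_loop_stats(all_runs, loops):
    """Return (best, worst, average, stdev) of the per-loop times in seconds."""
    # Assumption: each run is the total time for `loops` iterations.
    timings = [run / loops for run in all_runs]
    average = math.fsum(timings) / len(timings)
    # Population standard deviation of the per-loop timings.
    variance = math.fsum((t - average) ** 2 for t in timings) / len(timings)
    return min(timings), max(timings), average, math.sqrt(variance)


best, worst, average, stdev = per_loop_stats([0.30, 0.20, 0.40], loops=10)
```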

__init__(name, loops, repeat, best, worst, all_runs, precision=3)[source]

Initialize self. See help(type(self)) for accurate signature.

property average
property stdev
msd_pytorch.bench.bench(name, timer, repeat=3)[source]

msd_pytorch.conv module

class msd_pytorch.conv.Conv2dInPlaceFunction[source]

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, weight, bias, output, stride, dilation)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.
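As an illustration of the forward()/backward() contract described above, here is a minimal sketch of a custom autograd function. The Square class is a hypothetical example and is not part of msd_pytorch; it relies only on the standard torch.autograd.Function API.

```python
import torch
from torch.autograd import Function


class Square(Function):
    """Hypothetical example: computes input ** 2 with a hand-written gradient."""

    @staticmethod
    def forward(ctx, input):
        # Store the input so that backward() can retrieve it.
        ctx.save_for_backward(input)
        return input * input

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        grad_input = None
        # Only compute the gradient if the first input requires it.
        if ctx.needs_input_grad[0]:
            grad_input = 2 * input * grad_output
        # One return value per input of forward().
        return grad_input


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = Square.apply(x)
y.sum().backward()
```

Custom functions are invoked through apply() rather than by calling forward() directly.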

class msd_pytorch.conv.Conv2dInPlaceModule(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]

Bases: torch.nn.modules.module.Module

__init__(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]

Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.conv.conv2dInPlace()

msd_pytorch.conv_relu module

class msd_pytorch.conv_relu.ConvRelu2dInPlaceFunction[source]

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, weight, bias, output, stride, dilation)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.conv_relu.ConvRelu2dInPlaceModule(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]

Bases: torch.nn.modules.module.Module

__init__(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]

Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.conv_relu.conv_relu2dInPlace()

msd_pytorch.errors module

exception msd_pytorch.errors.Error[source]

Bases: Exception

Base class for exceptions in msd_pytorch.

exception msd_pytorch.errors.InputError(message)[source]

Bases: msd_pytorch.errors.Error

Exception raised for errors in the input.

Attributes:

message – explanation of the error

__init__(message)[source]

Initialize self. See help(type(self)) for accurate signature.

msd_pytorch.image_dataset module

class msd_pytorch.image_dataset.ImageDataset(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]

Bases: torch.utils.data.dataset.Dataset

A dataset for images stored on disk.

__init__(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]

Create a new image dataset.

Parameters

input_path_specifier (string)

A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.

If the path points to a directory, then all files in the directory are included in the image stack.

If the path points to a file, then that single file is included in the image stack.

Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.

Examples:

  • "~/train_images/"

  • "~/train_images/cats*.png"

  • "~/train_images/*.tif"

  • "~/train_images/scan*"

  • "~/train_images/just_one_image.jpeg"
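The three cases above (directory, single file, glob pattern) can be sketched with the standard library as follows. This is an illustration only: resolve_path_specifier is a hypothetical helper, not the function msd_pytorch uses, and it sorts lexicographically for simplicity.

```python
import glob
import os
import tempfile
from pathlib import Path


def resolve_path_specifier(path_specifier):
    """Return the list of file paths described by a path specifier."""
    path = Path(path_specifier).expanduser()
    if path.is_dir():
        # A directory: include every regular file inside it.
        return sorted(p.resolve() for p in path.iterdir() if p.is_file())
    if path.is_file():
        # A single file: include just that file.
        return [path.resolve()]
    # Otherwise, treat the specifier as a glob pattern.
    return sorted(Path(p).resolve() for p in glob.glob(str(path)))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("cats1.png", "cats2.png", "dog.tif"):
        Path(tmp, name).touch()
    all_files = [p.name for p in resolve_path_specifier(tmp)]
    cats = [p.name for p in resolve_path_specifier(os.path.join(tmp, "cats*.png"))]
    single = [p.name for p in resolve_path_specifier(os.path.join(tmp, "dog.tif"))]
```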

Parameters

target_path_specifier (string)

A pattern that describes the target data. Format is similar to the input path specification.

Parameters

collapse_channels (bool)

By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.

If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.

If collapse_channels=False, any channels in the image will be retained.

In either case, the returned images have at least one channel.

Parameters

labels (int or list(int))

By default, both input and target image pixel values are converted to float32.

If you want to retrieve the target image pixels as integral values instead, set:

  • labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};

  • labels=[1,2,5] if the labels are contained in the set {1,2,5}.

Setting labels is useful for segmentation.
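A minimal sketch of how the two forms of the labels argument could be turned into a label set. The helper names label_set_from and count_labels are hypothetical, and the assumption that the number of labels equals the size of this set is an illustration, not a statement about the library's internals. The RuntimeError for unlabeled data mirrors the behavior documented for num_labels.

```python
def label_set_from(labels):
    """Convert the `labels` argument into a set of allowed label values."""
    if labels is None:
        return None  # unlabeled: pixel values stay floating point
    if isinstance(labels, int):
        return set(range(labels))  # labels=k means {0, 1, ..., k-1}
    return set(labels)  # labels=[1, 2, 5] means {1, 2, 5}


def count_labels(labels):
    """Number of labels; raises RuntimeError when no labels were given."""
    label_set = label_set_from(labels)
    if label_set is None:
        raise RuntimeError("This image stack is not labeled.")
    return len(label_set)


k_labels = label_set_from(3)
explicit_labels = label_set_from([1, 2, 5])
```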

property num_labels

The number of labels in this image stack.

If the stack is not labeled, this property access raises a RuntimeError.

Returns

The number of labels in this image stack.

Return type

int

class msd_pytorch.image_dataset.ImageStack(path_specifier, *, collapse_channels=False, labels=None)[source]

Bases: object

A stack of images stored on disk.

An image stack describes a collection of images matching the file path specifier path_specifier.

The images can be tiff files, or any other image filetype supported by imageio.

The image paths are sorted using a natural sorting mechanism. So “scan1.tif” comes before “scan10.tif”.
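Natural sorting can be sketched with a sort key that compares digit runs numerically. The natural_key function below is a hypothetical illustration of the idea, not necessarily the mechanism used by ImageStack.

```python
import re


def natural_key(path):
    """Split a path into text and integer chunks so numbers sort by value."""
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path)]


names = ["scan10.tif", "scan2.tif", "scan1.tif"]
lexicographic = sorted(names)
natural = sorted(names, key=natural_key)
```

With plain lexicographic sorting, "scan10.tif" would come before "scan2.tif"; with the natural key it comes last.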

Images can be retrieved by indexing into the stack. For example:

ImageStack("*.tif")[i]

These images are returned as torch tensors with three dimensions CxHxW.

__init__(path_specifier, *, collapse_channels=False, labels=None)[source]

Create a new ImageStack.

Parameters

path_specifier (string)

A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.

If the path points to a directory, then all files in the directory are included in the image stack.

If the path points to a file, then that single file is included in the image stack.

Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.

Examples:

  • "~/train_images/"

  • "~/train_images/cats*.png"

  • "~/train_images/*.tif"

  • "~/train_images/scan*"

  • "~/train_images/just_one_image.jpeg"

Parameters

collapse_channels (bool)

By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.

If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.

If collapse_channels=False, any channels in the image will be retained.

In either case, the returned images have at least one channel.

Parameters

labels (int or list(int))

By default, all image pixel values are converted to float32.

If you want to retrieve the image pixels as integral values instead, set:

  • labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};

  • labels=[1,2,5] if the labels are contained in the set {1,2,5}.

Setting labels is useful for segmentation.

Returns

An ImageStack

find_images()[source]
property num_labels

The number of labels in this image stack.

If the stack is not labeled, this property access raises a RuntimeError.

Returns

The number of labels in this image stack.

Return type

int

msd_pytorch.main module

msd_pytorch.main.benchmark(msd, batch_size, input_size)[source]
msd_pytorch.main.experiment_main()[source]
msd_pytorch.main.main_function()[source]
msd_pytorch.main.regression(msd, epochs, batch_size, train_input_glob, train_target_glob, val_input_glob, val_target_glob, weights_path)[source]
msd_pytorch.main.segmentation(msd, epochs, labels, batch_size, train_input_glob, train_target_glob, val_input_glob, val_target_glob, weights_path)[source]
msd_pytorch.main.train(model, epochs, train_dl, val_dl, weights_path)[source]

msd_pytorch.msd_block module

class msd_pytorch.msd_block.MSDBlock2d(in_channels, dilations, width=1)[source]

Bases: torch.nn.modules.module.Module

__init__(in_channels, dilations, width=1)[source]

Multi-scale dense block

in_channels (int)

Number of input channels

dilations (tuple of int)

Dilation for each convolution-block

width (int)

Number of channels per convolution.

The number of output channels is in_channels + depth * width
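The formula above refers to a depth that is not a constructor argument. The sketch below assumes that depth is the number of entries in dilations, i.e. one convolution block per dilation; this is an assumption, and msd_block_out_channels is a hypothetical helper.

```python
def msd_block_out_channels(in_channels, dilations, width=1):
    """Output channel count, assuming depth == len(dilations)."""
    depth = len(dilations)  # assumption: one convolution block per dilation
    return in_channels + depth * width


out_channels = msd_block_out_channels(1, (1, 2, 4, 8), width=2)
```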

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]
class msd_pytorch.msd_block.MSDBlockImpl2d[source]

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, dilations, bias, *weights)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.msd_block.MSDModule2d(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Bases: torch.nn.modules.module.Module

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Create a 2-dimensional MSD Module

Parameters
  • c_in – # of input channels

  • c_out – # of output channels

  • depth – # of layers

  • width – the width of the module

  • dilations (list(int)) – A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Returns

an MSD module

Return type

MSDModule2d

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]
msd_pytorch.msd_block.msdblock2d()

msd_pytorch.msd_model module

class msd_pytorch.msd_model.MSDModel(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Bases: object

Base class for MSD models.

This class provides methods for

  • training the network

  • calculating validation scores

  • loading and saving the network parameters to disk.

  • computing normalization for input and target data.

Note

Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Create a new MSDModel base class.

Note

Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.

Parameters
  • c_in – The number of input channels.

  • c_out – The number of output channels.

  • depth – The depth of the MSD network.

  • width – The width of the MSD network.

  • dilations (list(int)) – A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.
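A minimal sketch of how dilations might be assigned to layers when depth exceeds the number of supplied dilations. It assumes the list is cycled in order, starting from its first entry; layer_dilations is a hypothetical helper, not part of the documented API.

```python
def layer_dilations(depth, dilations=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)):
    """Return one dilation per layer, cycling through `dilations`."""
    # Assumption: layer i uses the dilation at position i modulo the list length.
    return [dilations[i % len(dilations)] for i in range(depth)]


per_layer = layer_dilations(6, [1, 2, 4, 8])
```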

forward(input=None, target=None)[source]

Calculate the loss for a single input-target pair.

Both input and target are optional. If one of these parameters is not set, a previous value of these parameters is used.

Parameters

input (torch.Tensor)

A BxCxHxW-dimensional torch input tensor.

Parameters

target (torch.Tensor)

A BxCxHxW-dimensional torch target tensor.

Returns

The loss on target

get_loss()[source]

Get the mean loss of the last forward calculation.

Gets the mean loss of the last (input, target) pair. The loss function that is used depends on whether the model is doing regression or segmentation.

Returns

The loss.

Return type

float

get_output()[source]

Get the output of the network.

Note

The output is only defined after a call to forward(), learn(), train(), or validate(). If none of these methods has been called, None is returned.

Returns

A torch tensor containing the output of the network or None.

Return type

torch.Tensor or NoneType

init_optimizer(trainable_net)[source]
learn(input=None, target=None)[source]

Train on a single input-target pair.

Parameters

input (torch.Tensor)

A BxCxHxW-dimensional torch input tensor.

Parameters

target (torch.Tensor)

A BxCxHxW-dimensional torch target tensor.

load(path)[source]

Load network parameters from disk.

Parameters

path – The filesystem path where the network parameters are stored.

Returns

the number of epochs the network has trained for.

Return type

int

print()[source]

Print the network.

save(path, epoch)[source]

Save network to disk.

Parameters
  • path – A filesystem path where the network parameters are stored.

  • epoch – The number of epochs the network has trained for. This is useful for reloading!

Returns

Nothing

set_input(data)[source]

Set input data.

Parameters

data (torch.Tensor)

A BxCxHxW-dimensional torch input tensor.

set_normalization(dataloader)[source]

Normalize input and target data.

This function goes through all the training data to compute the mean and std of the training data.

It modifies the network so that all future invocations of the network first normalize input data and target data to have mean zero and a standard deviation of one.

These modified parameters are not updated after this step and are stored in the network, so that they are not lost when the network is saved to and loaded from disk.

Normalizing in this way makes training more stable.
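The idea behind this normalization can be sketched on plain numbers: compute the mean and standard deviation once over the training data, then apply the same fixed transform to every later input. The sketch uses the population standard deviation and a flat list of values, both of which are simplifying assumptions; the function names are hypothetical.

```python
import statistics


def fit_normalization(values):
    """Compute the mean and (population) standard deviation of the data."""
    return statistics.fmean(values), statistics.pstdev(values)


def normalize(values, mean, std):
    """Shift and scale values to mean zero and standard deviation one."""
    return [(v - mean) / std for v in values]


train_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_normalization(train_values)
normalized = normalize(train_values, mean, std)
```

Because mean and std are fixed after fitting, new data is scaled with the training statistics rather than its own.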

Parameters

dataloader – The dataloader associated to the training data.

set_target(data)[source]

Set target data.

Parameters

data (torch.Tensor)

A BxCxHxW-dimensional torch target tensor.

train(dataloader, num_epochs)[source]

Train on a dataset.

Trains the network for num_epochs epochs on the dataset supplied by dataloader.

Parameters
  • dataloader – A dataloader for a dataset to train on.

  • num_epochs – The number of epochs to train for.

validate(dataloader)[source]

Calculate validation score for dataset.

Calculates the mean loss per (input, target) pair in dataloader. The loss function that is used depends on whether the model is doing regression or segmentation.

Parameters

dataloader – A dataloader for a dataset to calculate the loss on.
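The methods train(), validate(), and save() can be combined into a simple checkpointing loop. The sketch below uses only the documented method names and argument order. It assumes that validate() returns the mean loss as a number, which this reference does not state explicitly. FakeModel is a hypothetical stand-in so that the example runs without msd_pytorch installed.

```python
def train_with_checkpoints(model, train_dl, val_dl, weights_path, epochs):
    """Train one epoch at a time and save whenever validation improves."""
    best_loss = float("inf")
    for epoch in range(1, epochs + 1):
        model.train(train_dl, 1)             # train(dataloader, num_epochs)
        val_loss = model.validate(val_dl)    # assumed to return the mean loss
        if val_loss < best_loss:
            best_loss = val_loss
            model.save(weights_path, epoch)  # save(path, epoch)
    return best_loss


class FakeModel:
    """Hypothetical stand-in with the same method names, for demonstration."""

    def __init__(self, val_losses):
        self._val_losses = iter(val_losses)
        self.saved = []

    def train(self, dataloader, num_epochs):
        pass  # a real model would update its weights here

    def validate(self, dataloader):
        return next(self._val_losses)

    def save(self, path, epoch):
        self.saved.append((path, epoch))


model = FakeModel([0.9, 0.5, 0.7])
best = train_with_checkpoints(model, [], [], "weights.torch", epochs=3)
```

Storing the epoch alongside the weights lets a later load() report how long the network had trained.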

msd_pytorch.msd_model.scaling_module(c_in, c_out, *, conv3d=False)[source]

Make a Module that normalizes the input data.

This part of the network can be used to renormalize the input data. Its parameters are

  • saved when the network is saved;

  • not updated by the gradient descent solvers.

Parameters
  • c_in – The number of input channels.

  • c_out – The number of output channels.

  • conv3d – Indicates that the input data is 3D instead of 2D.

Returns

A scaling module.

Return type

torch.nn.ConvNd

msd_pytorch.msd_module module

class msd_pytorch.msd_module.MSDFinalLayer(c_in, c_out)[source]

Bases: torch.nn.modules.module.Module

Documentation for MSDFinalLayer

Implements the final 1x1 multiplication and bias addition for all intermediate layers to get to the output layer.

Initializes the weight and bias to zero.

__init__(c_in, c_out)[source]

Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset_parameters()[source]
class msd_pytorch.msd_module.MSDLayerModule(buffer, c_in, layer_depth, width, dilation)[source]

Bases: torch.nn.modules.module.Module

A hidden layer of the MSD module.

The primary responsibility of this module is to define the forward() method.

This module is used by the MSDModule.

This module is not responsible for

  • Buffer management

  • Weight initialization

__init__(buffer, c_in, layer_depth, width, dilation)[source]

Initialize the hidden layer.

Parameters
  • buffer – a StitchBuffer object for storing the L and G buffers.

  • c_in – The number of input channels of the MSD module.

  • layer_depth – The depth of this layer in the MSD module. This index is zero-based: the first hidden layer has index zero.

  • width – The width of the MSD module.

  • dilation – An integer describing the dilation factor for the convolutions in this layer.

Returns

A module for the MSD hidden layer.

Return type

MSDLayerModule

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.msd_module.MSDModule(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Bases: torch.nn.modules.module.Module

__init__(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]

Create a msd module

Parameters
  • c_in – # of input channels

  • c_out – # of output channels

  • depth – # of layers

  • width – the width of the module

  • dilations (list(int)) – A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Returns

an MSD module

Return type

MSDModule

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_buffers(input)[source]
msd_pytorch.msd_module.init_convolution_weights(conv_weight, c_in, c_out, width, depth)[source]

Initialize MSD convolution kernel weights

Based on:

Pelt, Daniel M., & Sethian, J. A. (2017). A mixed-scale dense convolutional neural network for image analysis. Proceedings of the National Academy of Sciences, 115(2), 254–259. http://dx.doi.org/10.1073/pnas.1715832114

Parameters
  • conv_weight – The kernel weight data

  • c_in – Number of input channels of the MSD module

  • c_out – Number of output channels of the MSD module

  • width – The width of the MSD module

  • depth – The depth of the MSD module. This is the number of hidden layers.

Returns

Nothing

msd_pytorch.msd_module.stitchLazy()
msd_pytorch.msd_module.units_in_front(c_in, width, layer_depth)[source]

Calculate how many intermediate images are in front of current layer

  • The input channels count as intermediate images

  • The layer_depth index is zero-based: the first hidden layer has index zero.

Parameters
  • c_in – The number of input channels of the MSD module

  • width – The width of the MSD module

  • layer_depth – The depth of the layer for which we are calculating the units in front. This index is zero-based: the first hidden layer has index zero.
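Given the two conventions above, the count can be sketched as the input channels plus width channels for each earlier hidden layer. This is an interpretation of the description, not a copy of the library code, and the name units_in_front_sketch is hypothetical.

```python
def units_in_front_sketch(c_in, width, layer_depth):
    """Intermediate images in front of the hidden layer at `layer_depth`."""
    # The input channels count, plus `width` channels per preceding layer.
    return c_in + width * layer_depth


first_layer = units_in_front_sketch(c_in=1, width=2, layer_depth=0)
fourth_layer = units_in_front_sketch(c_in=1, width=2, layer_depth=3)
```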

msd_pytorch.msd_regression_model module

class msd_pytorch.msd_regression_model.MSDRegressionModel(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]

Bases: msd_pytorch.msd_model.MSDModel

An MSD network for regression.

This class provides helper methods for using the MSD network module for regression.

Refer to the documentation of MSDModel for more information on the helper methods and attributes.

__init__(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]

Create a new MSD network for regression.

Parameters
  • c_in – The number of input channels.

  • c_out – The number of output channels.

  • depth – The depth of the MSD network.

  • width – The width of the MSD network.

  • dilations (list(int)) – A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Parameters

loss (string)

A string describing the loss function that should be used. Currently, the following losses are supported:

  • “L1” - nn.L1Loss()

  • “L2” - nn.MSELoss()

Parameters

parallel (bool)

Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.

msd_pytorch.msd_segmentation_model module

class msd_pytorch.msd_segmentation_model.MSDSegmentationModel(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]

Bases: msd_pytorch.msd_model.MSDModel

An MSD network for segmentation.

This class provides helper methods for using the MSD network module for segmentation.

Refer to the documentation of MSDModel for more information on the helper methods and attributes.

__init__(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]

Create a new MSD network for segmentation.

Parameters
  • c_in – The number of input channels.

  • num_labels – The number of labels to divide the segmentation into.

  • depth – The depth of the MSD network

  • width – The width of the MSD network

  • dilations (list(int)) – A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.

Parameters

parallel (bool)

Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.

set_normalization(dataloader)[source]

Normalize input data.

This function goes through all the training data to compute the mean and std of the training data. It modifies the network so that all future invocations of the network first normalize input data. The normalization parameters are saved.

Parameters

dataloader – The dataloader associated to the training data.

set_target(target)[source]

Set target data.

Parameters

target (torch.Tensor)

A BxCxHxW-dimensional torch target tensor.

msd_pytorch.relu_inplace module

class msd_pytorch.relu_inplace.ReLUInplaceFunction[source]

Bases: torch.autograd.function.Function

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.relu_inplace.ReLUInplaceModule[source]

Bases: torch.nn.modules.module.Module

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

msd_pytorch.stitch module

Stitch Functions and Modules for threading the gradient

Stitching refers to the practice of copying and / or reusing shared buffers in a network to improve efficiency. It handles distributing the gradient transparently.

In this module, we implement three types of stitching:

  1. Slow stitching: concatenates two inputs in the forward pass and distributes the gradient output in the backward pass. Inefficient. Slow stitching is used for testing.

  2. Copy Stitching: copies the input into a layer buffer L and returns all layers up to and including the newly copied input. More efficient than slow stitching, but preferably used sparingly.

  3. Lazy Stitching: assumes that the input has already been copied in the layer buffer L and returns all layers up to and including the input. The gradient is accumulated in a gradient buffer G. This is fast and efficient.

class msd_pytorch.stitch.StitchBuffer[source]

Bases: object

__init__()[source]

Holds the L and G buffers for a stitched module.

The intermediate layers are stored in L for the forward pass. The gradients are stored in the G buffer.

like_(tensor, new_shape)[source]

Change the L and G buffers to match tensor.

Matches the tensor’s

  • data type

  • device (cpu, cuda, cuda:0, cuda:i)

The shape is taken from the new_shape parameter.

Parameters
  • tensor – An input tensor

  • new_shape – The new shape that the buffer should have.

Returns

Nothing

zero_()[source]

Set buffers to zero.

class msd_pytorch.stitch.StitchCopyFunction[source]

Bases: torch.autograd.function.Function

Copy stitching:

Stores output in buffer L in the forward pass and adds the grad_output to buffer G in the backward pass.

The buffer L is a tensor of dimensions B x C x ? where

  • B is the minibatch size, and

  • C is the number of channels.

The buffer G has the same dimension as L.

The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) must be copied.

In the forward pass:

  • write the input into L at channel i

  • return L up to and including channel i

In the backward pass:

  • add the grad_output to G

  • return channel i of G

It is good practice to zero the G buffer before the backward pass. Sometimes, this is not possible since some methods, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. Therefore, when grad_output is the same size as G, the buffer G is zeroed in the backward function.
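The buffer bookkeeping described above can be illustrated in plain Python, without autograd. In this sketch the batch dimension is omitted, each channel is a flat list of floats, and the function names are hypothetical; it shows only the data movement, not the actual implementation.

```python
def stitch_copy_forward(L, layer_input, i):
    """Write layer_input into channel i of L and return channels 0..i."""
    L[i] = list(layer_input)
    return L[: i + 1]


def stitch_copy_backward(G, grad_output, i):
    """Add grad_output (channels 0..i) to G and return channel i of G."""
    if len(grad_output) == len(G):
        # grad_output covers the whole buffer: zero G first.
        for c in range(len(G)):
            G[c] = [0.0] * len(G[c])
    for c, grad_channel in enumerate(grad_output):
        G[c] = [g + d for g, d in zip(G[c], grad_channel)]
    return G[i]


L = [[0.0, 0.0] for _ in range(3)]  # layer buffer with 3 channels
G = [[0.0, 0.0] for _ in range(3)]  # gradient buffer, same shape as L

out0 = stitch_copy_forward(L, [1.0, 2.0], 0)
out1 = stitch_copy_forward(L, [3.0, 4.0], 1)

# Backward runs in reverse order: the later stitch first.
grad1 = stitch_copy_backward(G, [[0.25, 0.25], [1.0, 1.0]], 1)
grad0 = stitch_copy_backward(G, [[0.5, 0.5]], 0)
```

The gradient returned for channel 0 combines contributions from both stitches, which is how the shared buffer G distributes the gradient.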

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, L, G, i)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class msd_pytorch.stitch.StitchCopyModule(buffer, i)[source]

Bases: torch.nn.modules.module.Module

__init__(buffer, i)[source]

Make a new StitchCopyModule

Parameters
  • buffer – A StitchBuffer

  • i – index of the output channel of the stitch

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.stitch.StitchLazyFunction[source]

Bases: torch.autograd.function.Function

StitchLazyFunction is similar to StitchCopyFunction, but it does not copy the output of the previous layer into L; hence the name. StitchLazyFunction assumes that the output of the previous layer has already been copied into L. This can be accomplished with conv_cuda.conv2dInPlace, for instance.

The buffer L is a tensor of dimensions B x C x ? where

  • B is the minibatch size, and

  • C is the number of channels.

The buffer G has the same dimension as L.

The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) must be copied.

In the forward pass:

  • write the input into L at channel i

In the backward pass:

  • add the grad_output to G

  • return channel i of G

It is good practice to zero the G buffer before the backward pass. Sometimes, this is not possible since some methods, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. Therefore, when grad_output is the same size as G, the buffer G is zeroed in the backward function.
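The "lazy" idea can be sketched as follows: a producer writes its result directly into a slice of L, so the stitch only has to hand back a view of the buffer. This is an illustration under assumptions, not the package's code. lazy_stitch_forward is a hypothetical helper, the in-place add_ stands in for an in-place producer such as the conv2dInPlace mentioned above, and the returned slice of i + width channels is assumed.

```python
import torch

B, C, H, W = 2, 3, 4, 4
L = torch.zeros(B, C, H, W)
i, width = 2, 1

# Stand-in for an in-place producer: the result lands directly in L.
target = L[:, i:i + width]
target.add_(torch.ones(B, width, H, W))


def lazy_stitch_forward(x, L, i):
    """Return the filled part of L without copying x (hypothetical helper)."""
    width = x.shape[1]
    # Assumption: x already lives in L[:, i:i + width], so no copy is made.
    return L[:, :i + width]


out = lazy_stitch_forward(target, L, i)
```

The returned tensor shares storage with L, so no additional memory is allocated for the stitched result.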

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the corresponding output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. For example, backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input, L, G, i)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can then be retrieved during the backward pass.

class msd_pytorch.stitch.StitchLazyModule(buffer, i)[source]

Bases: torch.nn.modules.module.Module

__init__(buffer, i)[source]

Make a new StitchLazyModule

Parameters
  • buffer – A StitchBuffer

  • i – index of the output channel of the stitch

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of calling forward() directly, since the former takes care of running the registered hooks while the latter silently ignores them.

class msd_pytorch.stitch.StitchSlowFunction[source]

Bases: torch.autograd.function.Function

Naive stitching: concatenates two inputs in the channel dimension.
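A minimal sketch of naive stitching as a custom autograd function, assuming that the channel dimension is dimension 1 and that the backward pass simply splits the gradient at the boundary between the two inputs. The class name NaiveStitch is hypothetical, and the package's own implementation may differ.

```python
import torch
from torch.autograd import Function


class NaiveStitch(Function):
    """Hypothetical example: concatenate two tensors along dim 1."""

    @staticmethod
    def forward(ctx, input1, input2):
        # Remember where input1 ends so the gradient can be split later.
        ctx.channels1 = input1.shape[1]
        return torch.cat([input1, input2], dim=1)

    @staticmethod
    def backward(ctx, grad_output):
        c1 = ctx.channels1
        # Each input receives the slice of the gradient it contributed to.
        return grad_output[:, :c1], grad_output[:, c1:]


a = torch.randn(2, 3, 4, 4, requires_grad=True)
b = torch.randn(2, 5, 4, 4, requires_grad=True)
out = NaiveStitch.apply(a, b)
out.sum().backward()
```

Unlike the buffer-based variants, this allocates a new output tensor on every call, which is presumably why it is labelled slow.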

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many arguments as forward() returned outputs, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the corresponding output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. For example, backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, input1, input2)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can then be retrieved during the backward pass.

msd_pytorch.stitch.stitchCopy()
msd_pytorch.stitch.stitchLazy()
msd_pytorch.stitch.stitchSlow()

Module contents

Top-level package for Mixed-scale Dense Networks for PyTorch.