msd_pytorch package¶
Submodules¶
msd_pytorch.bench module¶
-
class
msd_pytorch.bench.
TimeitResult
(name, loops, repeat, best, worst, all_runs, precision=3)[source]¶ Bases:
object
Object returned by the timeit magic with info about the run.
Contains the following attributes :
loops: (int) number of loops done per measurement
repeat: (int) number of times the measurement has been repeated
best: (float) best execution time / number
all_runs: (list of float) execution time of each run (in s)
-
__init__
(name, loops, repeat, best, worst, all_runs, precision=3)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
property
average
¶
-
property
stdev
¶
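The average and stdev properties are listed without a description. A minimal sketch of how such statistics could be derived from all_runs, assuming a plain mean and a population standard deviation (the reference does not state the formulas, and the helper names below are illustrative only):

```python
import math


def average(all_runs):
    """Mean execution time over all runs (in s)."""
    return math.fsum(all_runs) / len(all_runs)


def stdev(all_runs):
    """Population standard deviation of the run times (in s)."""
    mean = average(all_runs)
    return math.sqrt(math.fsum((x - mean) ** 2 for x in all_runs) / len(all_runs))


runs = [0.10, 0.12, 0.11, 0.13]  # made-up timings for illustration
print(average(runs), stdev(runs))
```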
-
msd_pytorch.conv module¶
-
class
msd_pytorch.conv.
Conv2dInPlaceFunction
[source]¶ Bases:
torch.autograd.function.Function
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input, weight, bias, output, stride, dilation)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
class
msd_pytorch.conv.
Conv2dInPlaceModule
(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
msd_pytorch.conv.
conv2dInPlace
()¶
msd_pytorch.conv_relu module¶
-
class
msd_pytorch.conv_relu.
ConvRelu2dInPlaceFunction
[source]¶ Bases:
torch.autograd.function.Function
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input, weight, bias, output, stride, dilation)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
class
msd_pytorch.conv_relu.
ConvRelu2dInPlaceModule
(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(output, in_channels, out_channels, kernel_size=3, dilation=1)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
msd_pytorch.conv_relu.
conv_relu2dInPlace
()¶
msd_pytorch.errors module¶
-
exception
msd_pytorch.errors.
Error
[source]¶ Bases:
Exception
Base class for exceptions in msd_pytorch.
-
exception
msd_pytorch.errors.
InputError
(message)[source]¶ Bases:
msd_pytorch.errors.Error
Exception raised for errors in the input.
- Attributes:
message – explanation of the error
msd_pytorch.image_dataset module¶
-
class
msd_pytorch.image_dataset.
ImageDataset
(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]¶ Bases:
torch.utils.data.dataset.Dataset
A dataset for images stored on disk.
-
__init__
(input_path_specifier, target_path_specifier, *, collapse_channels=False, labels=None)[source]¶ Create a new image dataset.
- Parameters
input_path_specifier – string
A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.
If the path points to a directory, then all files in the directory are included in the image stack.
If the path points to a file, then that single file is included in the image stack.
Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.
Examples:
"~/train_images/"
"~/train_images/cats*.png"
"~/train_images/*.tif"
"~/train_images/scan*"
"~/train_images/just_one_image.jpeg"
- Parameters
target_path_specifier – string
A pattern that describes the target data. Format is similar to the input path specification.
- Parameters
collapse_channels – bool
By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.
If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.
If collapse_channels=False, any channels in the image will be retained.
In either case, the returned images have at least one channel.
- Parameters
labels – int or list(int)
By default, both input and target image pixel values are converted to float32.
If you want to retrieve the target image pixels as integral values instead, set:
labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};
labels=[1,2,5] if the labels are contained in the set {1,2,5}.
Setting labels is useful for segmentation.
- Returns
- Return type
-
property
num_labels
¶ The number of labels in this image stack.
If the stack is not labeled, this property access raises a RuntimeError.
- Returns
The number of labels in this image stack.
- Return type
int
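The interaction between the labels parameter and num_labels can be sketched as follows. This is a hypothetical stand-in class, not the library's implementation; it assumes that labels=k counts k labels and that a list counts one label per entry:

```python
class LabeledStackSketch:
    """Hypothetical stand-in for a labeled image stack (illustration only)."""

    def __init__(self, labels=None):
        if labels is None:
            self.labels = None  # unlabeled: pixels stay float32
        elif isinstance(labels, int):
            # labels=k means the label set {0, 1, ..., k-1}
            self.labels = list(range(labels))
        else:
            # labels=[1, 2, 5] means exactly that label set
            self.labels = list(labels)

    @property
    def num_labels(self):
        if self.labels is None:
            raise RuntimeError("This stack is not labeled.")
        return len(self.labels)


print(LabeledStackSketch(labels=4).num_labels)
print(LabeledStackSketch(labels=[1, 2, 5]).num_labels)
```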
-
-
class
msd_pytorch.image_dataset.
ImageStack
(path_specifier, *, collapse_channels=False, labels=None)[source]¶ Bases:
object
A stack of images stored on disk.
An image stack describes a collection of images matching the file path specifier path_specifier.
The images can be tiff files, or any other image filetype supported by imageio.
The image paths are sorted using a natural sorting mechanism. So “scan1.tif” comes before “scan10.tif”.
Images can be retrieved by indexing into the stack. For example:
ImageStack("*.tif")[i]
These images are returned as torch tensors with three dimensions CxHxW.
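The natural ordering mentioned above can be illustrated with a small sort key. This is a sketch of the general technique, not necessarily the mechanism the library uses; natural_key is a name chosen for this example:

```python
import re


def natural_key(path):
    """Sort key that compares runs of digits as integers."""
    # re.split with a capturing group alternates text and digit chunks,
    # always starting with a (possibly empty) text chunk.
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", path)]


paths = ["scan10.tif", "scan2.tif", "scan1.tif"]
print(sorted(paths))                   # ['scan1.tif', 'scan10.tif', 'scan2.tif']
print(sorted(paths, key=natural_key))  # ['scan1.tif', 'scan2.tif', 'scan10.tif']
```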
-
__init__
(path_specifier, *, collapse_channels=False, labels=None)[source]¶ Create a new ImageStack.
- Parameters
path_specifier – string
A path with optional glob pattern describing the image file paths. Tildes and other HOME directory specifications are expanded with expanduser and symlinks are resolved.
If the path points to a directory, then all files in the directory are included in the image stack.
If the path points to a file, then that single file is included in the image stack.
Alternatively, one may specify a “glob pattern” to match specific files in the directory. Of course, if the glob pattern does not contain a ‘*’, then it may match a single file.
Examples:
"~/train_images/"
"~/train_images/cats*.png"
"~/train_images/*.tif"
"~/train_images/scan*"
"~/train_images/just_one_image.jpeg"
- Parameters
collapse_channels – bool
By default, the images are returned in the CxHxW format, where C is the number of channels and H and W specify the height and width, respectively.
If collapse_channels=True, then all channels in the image will be averaged to a single channel. This can be used to convert color images to gray-scale images, for instance.
If collapse_channels=False, any channels in the image will be retained.
In either case, the returned images have at least one channel.
- Parameters
labels – int or list(int)
By default, all image pixel values are converted to float32.
If you want to retrieve the image pixels as integral values instead, set
- labels=k for an integer k if the labels are contained in the set {0, 1, …, k-1};
- labels=[1,2,5] if the labels are contained in the set {1,2,5}.
Setting labels is useful for segmentation.
- Returns
An ImageStack
- Return type
-
property
num_labels
¶ The number of labels in this image stack.
If the stack is not labeled, this property access raises a RuntimeError.
- Returns
The number of labels in this image stack.
- Return type
int
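The path specifier rules described for ImageStack (directory, single file, or glob pattern, with home-directory expansion and symlink resolution) could be implemented roughly as below. resolve_paths is a hypothetical helper written for this illustration using only the standard library; the library's own resolution code may differ, and this sketch sorts lexicographically rather than naturally:

```python
import glob
from pathlib import Path


def resolve_paths(path_specifier):
    """Hypothetical helper: list the files matched by a path specifier."""
    path = Path(path_specifier).expanduser()  # expand "~"
    if path.is_dir():
        # A directory: include every regular file in it.
        matches = [p for p in path.iterdir() if p.is_file()]
    else:
        # A single file, or a glob pattern such as "~/train_images/*.tif".
        matches = [Path(p) for p in glob.glob(str(path))]
    # Resolve symlinks and return a deterministic order.
    return sorted(p.resolve() for p in matches)
```

Calling resolve_paths("~/train_images/*.tif") would then return only the TIFF files in that directory, while resolve_paths("~/train_images/") would return every file in it.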
-
msd_pytorch.main module¶
-
msd_pytorch.main.
regression
(msd, epochs, batch_size, train_input_glob, train_target_glob, val_input_glob, val_target_glob, weights_path)[source]¶
msd_pytorch.msd_block module¶
-
class
msd_pytorch.msd_block.
MSDBlock2d
(in_channels, dilations, width=1)[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(in_channels, dilations, width=1)[source]¶ Multi-scale dense block
- Parameters
in_channels – int
Number of input channels
dilations – tuple of int
Dilation for each convolution-block
width – int
Number of channels per convolution.
The number of output channels is in_channels + depth * width
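As a worked example of the output channel formula, the sketch below assumes that depth equals the number of entries in dilations (one convolution block per dilation); the reference does not define depth for this class, so treat that as an assumption. The function name is illustrative only:

```python
def msd_block_out_channels(in_channels, dilations, width=1):
    """Output channel count under the assumption depth == len(dilations)."""
    depth = len(dilations)
    return in_channels + depth * width


# One input channel, four convolution blocks of width 2: 1 + 4 * 2 = 9
print(msd_block_out_channels(1, (1, 2, 4, 8), width=2))
```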
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
msd_pytorch.msd_block.
MSDBlockImpl2d
[source]¶ Bases:
torch.autograd.function.Function
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input, dilations, bias, *weights)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
class
msd_pytorch.msd_block.
MSDModule2d
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Create a 2-dimensional MSD Module
- Parameters
c_in – # of input channels
c_out – # of output channels
depth – # of layers
width – # the width of the module
dilations – list(int)
A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated.
- Returns
an MSD module
- Return type
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
msd_pytorch.msd_block.
msdblock2d
()¶
msd_pytorch.msd_model module¶
-
class
msd_pytorch.msd_model.
MSDModel
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Bases:
object
Base class for MSD models.
This class provides methods for
training the network
calculating validation scores
loading and saving the network parameters to disk.
computing normalization for input and target data.
Note
Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.
-
__init__
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Create a new MSDModel base class.
Note
Do not initialize MSDModel directly. Use MSDSegmentationModel or MSDRegressionModel instead.
- Parameters
c_in – The number of input channels.
c_out – The number of output channels.
depth – The depth of the MSD network.
width – The width of the MSD network.
dilations – list(int)
A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.
- Returns
- Return type
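The repetition of dilations for networks deeper than the supplied list can be pictured as below. The sketch assumes the list is reused cyclically in order, which is one natural reading of "repeated"; layer_dilations is a name made up for this example:

```python
def layer_dilations(depth, dilations):
    """Assign a dilation to each of `depth` layers by cycling through the list."""
    return [dilations[i % len(dilations)] for i in range(depth)]


print(layer_dilations(6, [1, 2, 4, 8]))  # [1, 2, 4, 8, 1, 2]
```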
-
forward
(input=None, target=None)[source]¶ Calculate the loss for a single input-target pair.
Both input and target are optional. If one of these parameters is not set, its previous value is used.
- Parameters
input – torch.Tensor
A BxCxHxW-dimensional torch input tensor.
- Parameters
target – torch.Tensor
A BxCxHxW-dimensional torch target tensor.
- Returns
The loss on target
- Return type
-
get_loss
()[source]¶ Get the mean loss of the last forward calculation.
Gets the mean loss of the last (input, target) pair. The loss function that is used depends on whether the model is doing regression or segmentation.
- Returns
The loss.
- Return type
float
-
get_output
()[source]¶ Get the output of the network.
Note
The output is only defined after a call to forward(), learn(), train(), or validate(). If none of these methods has been called, None is returned.
- Returns
A torch tensor containing the output of the network or None.
- Return type
torch.Tensor or NoneType
-
learn
(input=None, target=None)[source]¶ Train on a single input-target pair.
- Parameters
input – torch.Tensor
A BxCxHxW-dimensional torch input tensor.
- Parameters
target – torch.Tensor
A BxCxHxW-dimensional torch target tensor.
-
load
(path)[source]¶ Load network parameters from disk.
- Parameters
path – The filesystem path where the network parameters are stored.
- Returns
the number of epochs the network has trained for.
- Return type
int
-
save
(path, epoch)[source]¶ Save network to disk.
- Parameters
path – A filesystem path where the network parameters are stored.
epoch – The number of epochs the network has trained for. This is useful for reloading!
- Returns
Nothing
- Return type
-
set_input
(data)[source]¶ Set input data.
- Parameters
data – torch.Tensor
A BxCxHxW-dimensional torch input tensor.
- Returns
- Return type
-
set_normalization
(dataloader)[source]¶ Normalize input and target data.
This function goes through all the training data to compute the mean and std of the training data.
It modifies the network so that all future invocations of the network first normalize input data and target data to have mean zero and a standard deviation of one.
These modified parameters are not updated after this step and are stored in the network, so that they are not lost when the network is saved to and loaded from disk.
Normalizing in this way makes training more stable.
- Parameters
dataloader – The dataloader associated to the training data.
- Returns
- Return type
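The normalization that set_normalization describes (one pass over the training data to obtain mean and std, then rescaling to zero mean and unit standard deviation) can be sketched in plain Python. This illustration works on flat lists of numbers and uses the population standard deviation; the library operates on torch tensors and stores the statistics inside the network, so the details here are assumptions:

```python
import math


def mean_and_std(batches):
    """Mean and standard deviation over every value in every batch."""
    values = [v for batch in batches for v in batch]
    mean = math.fsum(values) / len(values)
    var = math.fsum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def normalize(batch, mean, std):
    """Shift and scale a batch with fixed, precomputed statistics."""
    return [(v - mean) / std for v in batch]


batches = [[2.0, 4.0], [6.0, 8.0]]  # made-up training data
mean, std = mean_and_std(batches)
normalized = [normalize(batch, mean, std) for batch in batches]
print(mean, std, normalized)
```

Because the statistics are computed once and then held fixed, later inputs are rescaled with the training-set mean and std rather than their own.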
-
set_target
(data)[source]¶ Set target data.
- Parameters
data – torch.Tensor
A BxCxHxW-dimensional torch target tensor.
- Returns
- Return type
-
train
(dataloader, num_epochs)[source]¶ Train on a dataset.
Trains the network for num_epochs epochs on the dataset supplied by dataloader.
- Parameters
dataloader – A dataloader for a dataset to train on.
num_epochs – The number of epochs to train for.
- Returns
- Return type
-
validate
(dataloader)[source]¶ Calculate validation score for dataset.
Calculates the mean loss per (input, target) pair in dataloader. The loss function that is used depends on whether the model is doing regression or segmentation.
- Parameters
dataloader – A dataloader for a dataset to calculate the loss on.
- Returns
- Return type
-
msd_pytorch.msd_model.
scaling_module
(c_in, c_out, *, conv3d=False)[source]¶ Make a Module that normalizes the input data.
This part of the network can be used to renormalize the input data. Its parameters are
saved when the network is saved;
not updated by the gradient descent solvers.
- Parameters
c_in – The number of input channels.
c_out – The number of output channels.
conv3d – Indicates that the input data is 3D instead of 2D.
- Returns
A scaling module.
- Return type
torch.nn.ConvNd
msd_pytorch.msd_module module¶
-
class
msd_pytorch.msd_module.
MSDFinalLayer
(c_in, c_out)[source]¶ Bases:
torch.nn.modules.module.Module
Documentation for MSDFinalLayer
Implements the final 1x1 multiplication and bias addition for all intermediate layers to get to the output layer.
Initializes the weight and bias to zero.
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
msd_pytorch.msd_module.
MSDLayerModule
(buffer, c_in, layer_depth, width, dilation)[source]¶ Bases:
torch.nn.modules.module.Module
A hidden layer of the MSD module.
The primary responsibility of this module is to define the forward() method.
This module is used by the MSDModule.
This module is not responsible for
Buffer management
Weight initialization
-
__init__
(buffer, c_in, layer_depth, width, dilation)[source]¶ Initialize the hidden layer.
- Parameters
buffer – a StitchBuffer object for storing the L and G buffers.
c_in – The number of input channels of the MSD module.
layer_depth – The depth of this layer in the MSD module. This index is zero-based: the first hidden layer has index zero.
width – The width of the MSD module.
dilation – An integer describing the dilation factor for the convolutions in this layer.
- Returns
A module for the MSD hidden layer.
- Return type
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
class
msd_pytorch.msd_module.
MSDModule
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(c_in, c_out, depth, width, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])[source]¶ Create a msd module
- Parameters
c_in – # of input channels
c_out – # of output channels
depth – # of layers
width – # the width of the module
dilations – list(int)
A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated.
- Returns
an MSD module
- Return type
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
msd_pytorch.msd_module.
init_convolution_weights
(conv_weight, c_in, c_out, width, depth)[source]¶ Initialize MSD convolution kernel weights
Based on:
Pelt, Daniel M., & Sethian, J. A. (2017). A mixed-scale dense convolutional neural network for image analysis. Proceedings of the National Academy of Sciences, 115(2), 254–259. http://dx.doi.org/10.1073/pnas.1715832114
- Parameters
conv_weight – The kernel weight data
c_in – Number of input channels of the MSD module
c_out – Number of output channels of the MSD module
width – The width of the MSD module
depth – The depth of the MSD module. This is the number of hidden layers.
- Returns
Nothing
- Return type
-
msd_pytorch.msd_module.
stitchLazy
()¶
-
msd_pytorch.msd_module.
units_in_front
(c_in, width, layer_depth)[source]¶ Calculate how many intermediate images are in front of current layer
The input channels count as intermediate images
The layer_depth index is zero-based: the first hidden layer has index zero.
- Parameters
c_in – The number of input channels of the MSD module
width – The width of the MSD module
layer_depth – The depth of the layer for which we are calculating the units in front. This index is zero-based: the first hidden layer has index zero.
- Returns
- Return type
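The count described for units_in_front can be worked out by hand: the input channels count as intermediate images, and each hidden layer in front of the current one is assumed to contribute width more. The sketch below encodes that reading under a different name, since the library's actual body is not shown in this reference:

```python
def units_in_front_sketch(c_in, width, layer_depth):
    """Intermediate images in front of a layer (layer_depth is zero-based)."""
    # Assumption: every earlier hidden layer contributes `width` images.
    return c_in + width * layer_depth


print(units_in_front_sketch(1, 1, 0))  # first hidden layer sees only the input: 1
print(units_in_front_sketch(3, 2, 4))  # 3 + 2 * 4 = 11
```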
msd_pytorch.msd_regression_model module¶
-
class
msd_pytorch.msd_regression_model.
MSDRegressionModel
(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]¶ Bases:
msd_pytorch.msd_model.MSDModel
An MSD network for regression.
This class provides helper methods for using the MSD network module for regression.
Refer to the documentation of MSDModel for more information on the helper methods and attributes.
-
__init__
(c_in, c_out, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], loss='L2', parallel=False)[source]¶ Create a new MSD network for regression.
- Parameters
c_in – The number of input channels.
c_out – The number of output channels.
depth – The depth of the MSD network.
width – The width of the MSD network.
dilations – list(int)
A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.
- Parameters
loss – string
A string describing the loss function that should be used. Currently, the following losses are supported:
“L1” - nn.L1Loss()
“L2” - nn.MSELoss()
- Parameters
parallel – bool
Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.
- Returns
- Return type
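To make the difference between the two supported losses concrete, the sketch below computes them on plain Python lists. It mirrors the default mean reduction of nn.L1Loss() and nn.MSELoss() but is not the PyTorch implementation, and the function names are illustrative:

```python
def l1_loss(output, target):
    """Mean absolute error, matching the default reduction of nn.L1Loss()."""
    return sum(abs(o - t) for o, t in zip(output, target)) / len(output)


def l2_loss(output, target):
    """Mean squared error, matching the default reduction of nn.MSELoss()."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


output, target = [1.0, 2.0], [2.0, 4.0]
print(l1_loss(output, target))  # (1 + 2) / 2 = 1.5
print(l2_loss(output, target))  # (1 + 4) / 2 = 2.5
```

The squared error penalizes the larger deviation more heavily, which is the usual reason to choose between loss='L1' and loss='L2'.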
-
msd_pytorch.msd_segmentation_model module¶
-
class
msd_pytorch.msd_segmentation_model.
MSDSegmentationModel
(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]¶ Bases:
msd_pytorch.msd_model.MSDModel
An MSD network for segmentation.
This class provides helper methods for using the MSD network module for segmentation.
Refer to the documentation of MSDModel for more information on the helper methods and attributes.
-
__init__
(c_in, num_labels, depth, width, *, dilations=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], parallel=False)[source]¶ Create a new MSD network for segmentation.
- Parameters
c_in – The number of input channels.
num_labels – The number of labels to divide the segmentation into.
depth – The depth of the MSD network
width – The width of the MSD network
dilations – list(int)
A list of dilations to use. Default is [1, 2, ..., 10]. A good alternative is [1, 2, 4, 8]. The dilations are repeated when there are more layers than supplied dilations.
- Parameters
parallel – bool
Whether or not to execute the model on multiple GPUs. Note that the batch size must be a multiple of the number of available GPUs.
- Returns
- Return type
-
set_normalization
(dataloader)[source]¶ Normalize input data.
This function goes through all the training data to compute the mean and std of the training data. It modifies the network so that all future invocations of the network first normalize input data. The normalization parameters are saved.
- Parameters
dataloader – The dataloader associated to the training data.
- Returns
- Return type
-
msd_pytorch.relu_inplace module¶
-
class
msd_pytorch.relu_inplace.
ReLUInplaceFunction
[source]¶ Bases:
torch.autograd.function.Function
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
class
msd_pytorch.relu_inplace.
ReLUInplaceModule
[source]¶ Bases:
torch.nn.modules.module.Module
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
msd_pytorch.stitch module¶
Stitch Functions and Modules for threading the gradient
Stitching refers to the practice of copying and/or reusing shared buffers in a network to improve efficiency. It handles distributing the gradient transparently.
In this module, we implement three types of stitching:
Slow stitching: concatenates two inputs in the forward pass and distributes the gradient output in the backward pass. Inefficient. Slow stitching is used for testing.
Copy stitching: copies the input into a layer buffer L and returns all layers up to and including the newly copied input. More efficient than slow stitching, but preferably used sparingly.
Lazy stitching: assumes that the input has already been copied into the layer buffer L and returns all layers up to and including the input. The gradient is accumulated in a gradient buffer G. This is fast and efficient.
-
class
msd_pytorch.stitch.
StitchBuffer
[source]¶ Bases:
object
-
__init__
()[source]¶ Holds the L and G buffers for a stitched module.
The intermediate layers are stored in L for the forward pass. The gradients are stored in the G buffer.
-
like_
(tensor, new_shape)[source]¶ Change the L and G buffers to match tensor.
Matches the tensor’s
- data type
- device (cpu, cuda, cuda:0, cuda:i)
The shape is taken from the new_shape parameter.
- Parameters
tensor – An input tensor
new_shape – The new shape that the buffer should have.
- Returns
Nothing
- Return type
-
-
class
msd_pytorch.stitch.
StitchCopyFunction
[source]¶ Bases:
torch.autograd.function.Function
Copy stitching:
Stores output in buffer L in the forward pass and adds the grad_output to buffer G in the backward pass.
The buffer L is a tensor of dimensions B x C x ? where
B is the minibatch size, and
C is the number of channels.
The buffer G has the same dimension as L.
The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) must be copied.
In the forward pass:
write the input into L at channel i
return L up to and including channel i
In the backward pass:
add the grad_output to G
return channel i of G
It is good practice to zero the G buffer before the backward pass. Sometimes, this is not possible since some methods, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. Therefore, when grad_output is the same size as G, the buffer G is zeroed in the backward function.
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input, L, G, i)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
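The copy-stitching bookkeeping described for StitchCopyFunction (write into L at channel i, return L up to channel i, accumulate gradients in G) can be traced with a toy model. The class below is a hypothetical illustration that stores one scalar per channel in Python lists; the library works on torch tensors with batch and spatial dimensions, and this sketch omits the rule that zeroes G when grad_output has the same size as G:

```python
class CopyStitchSketch:
    """Toy copy stitching with one scalar per channel (illustration only)."""

    def __init__(self, num_channels):
        self.L = [0.0] * num_channels  # layer buffer
        self.G = [0.0] * num_channels  # gradient buffer

    def forward(self, value, i):
        self.L[i] = value        # write the input into L at channel i
        return self.L[: i + 1]   # return L up to and including channel i

    def backward(self, grad_output, i):
        for c, g in enumerate(grad_output):
            self.G[c] += g       # add the grad_output to G
        return self.G[i]         # return channel i of G


stitch = CopyStitchSketch(num_channels=3)
stitch.forward(1.0, 0)
out = stitch.forward(2.0, 1)            # [1.0, 2.0]
grad = stitch.backward([0.5, 0.25], 1)  # 0.25
print(out, grad, stitch.G)              # [1.0, 2.0] 0.25 [0.5, 0.25, 0.0]
```

The gradient left in channel 0 of G is what an earlier layer would pick up when its own backward step runs, which is how the buffer distributes gradients without concatenation.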
-
class
msd_pytorch.stitch.
StitchCopyModule
(buffer, i)[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(buffer, i)[source]¶ Make a new StitchCopyModule
- Parameters
buffer – A StitchBuffer
i – index of the output channel of the stitch
- Returns
- Return type
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
msd_pytorch.stitch.
StitchLazyFunction
[source]¶ Bases:
torch.autograd.function.Function
StitchLazyFunction is similar to StitchCopyFunction, but it does not copy the output of the previous layer into L. Hence the name.
StitchLazyFunction supposes that the output of the previous layer has already been copied into L. This can be accomplished with conv_cuda.conv2dInPlace, for instance.
The buffer L is a tensor of dimensions B x C x ? where
B is the minibatch size, and
C is the number of channels.
The buffer G has the same dimension as L.
The parameter i is an index in the C dimension and points to where the input (the output of the previous layer) must be copied.
In the forward pass:
write the input into L at channel i
In the backward pass:
add the grad_output to G
return channel i of G
It is good practice to zero the G buffer before the backward pass. Sometimes, this is not possible since some methods, such as torch.autograd.gradcheck, repeatedly call .grad() on the output. Therefore, when grad_output is the same size as G, the buffer G is zeroed in the backward function.
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input, L, G, i)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
class
msd_pytorch.stitch.
StitchLazyModule
(buffer, i)[source]¶ Bases:
torch.nn.modules.module.Module
-
__init__
(buffer, i)[source]¶ Make a new StitchLazyModule
- Parameters
buffer – A StitchBuffer
i – index of the output channel of the stitch
- Returns
- Return type
-
forward
(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
msd_pytorch.stitch.
StitchSlowFunction
[source]¶ Bases:
torch.autograd.function.Function
Naive stitching: concatenates two inputs in the channel dimension.
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad, a tuple of booleans representing whether each input needs a gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
-
static
forward
(ctx, input1, input2)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can be then retrieved during the backward pass.
-
msd_pytorch.stitch.
stitchCopy
()¶
-
msd_pytorch.stitch.
stitchLazy
()¶
-
msd_pytorch.stitch.
stitchSlow
()¶
Module contents¶
Top-level package for Mixed-scale Dense Networks for PyTorch.