Traditional approaches to training exploit data parallelism (dividing up the training samples), model parallelism (dividing up the model parameters), or expert-designed hybrids for particular situations. FlexFlow encompasses both data and model parallelism in its sample (data parallelism) and parameter (model parallelism) dimensions. It also adds an operator dimension (more model parallelism) describing how operators within a DNN should be parallelized, and an attribute dimension which defines how different attributes within a sample should be partitioned (e.g., height and width of an image).
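To make the four dimensions concrete, here is a minimal Python sketch of how a per-operator parallelization choice could be written down. It is an illustration only, not FlexFlow's actual API: `ParallelConfig`, `split_range`, `partition_output`, and the operator names `conv1` and `fc1` are all hypothetical, and the operator's output is assumed to be a simple (batch, height, channels) tensor.

```python
from itertools import product
from typing import NamedTuple


class ParallelConfig(NamedTuple):
    """Hypothetical per-operator config: how many ways to split each dimension."""
    sample: int = 1     # split across training samples (data parallelism)
    attribute: int = 1  # split within a sample, e.g. along image height
    parameter: int = 1  # split across model parameters, e.g. output channels

    def num_tasks(self) -> int:
        return self.sample * self.attribute * self.parameter


def split_range(size, parts):
    """Split range(size) into `parts` contiguous, near-equal (start, stop) chunks."""
    base, extra = divmod(size, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partition_output(batch, height, channels, cfg):
    """List the (batch, height, channel) slice that each task would compute."""
    return list(product(
        split_range(batch, cfg.sample),
        split_range(height, cfg.attribute),
        split_range(channels, cfg.parameter),
    ))


# Operator dimension: each operator gets its own config, so different
# operators can be parallelized differently (and placed on different devices).
strategy = {
    "conv1": ParallelConfig(sample=2, attribute=2),  # split batch and height
    "fc1": ParallelConfig(parameter=4),              # split the weights
}

for name, cfg in strategy.items():
    print(name, cfg, "tasks:", cfg.num_tasks())
```

Pure data parallelism is the special case where every operator uses only the sample dimension, and pure model parallelism the case where only the parameter dimension is split. Mixing degrees per operator is what makes the search space larger than either.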