Mastrave project

The module "mdeal" of the Mastrave modelling library

Daniele de Rigo

Copyright and license notice of the function mdeal

The file mdeal.m is part of Mastrave.

Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.

Function declaration

 [ ... ] = mdeal( values                   ,
                  block_size   = []        ,
                  dim          = 'columns' ,
                  fitting_mode = '--check' ,
                  groups       = []        ,
                  groups_dim   = []        )

Description

Utility extending the function @deal so that multiple output variables can be assigned from a single input matrix or a single multidimensional array (md-array). Instead of copying the entire input argument values to each output, this utility splits values into a partition of blocks (sub-matrices or sub md-arrays) whose concatenation along dim is values. The size of each block along dim is defined by the vector block_size. If the sum of block_size does not equal the size of values along dim, the fitting_mode argument can be used to define the exact splitting of values.

This utility does not support the behavior of the standard function @deal when more than one input argument is passed. If you need to dispatch multiple input variables to multiple output ones, use the function @deal directly.
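
The partition described above can be illustrated without Mastrave. The following is a minimal sketch that uses only the built-in functions mat2cell, deal and cat; it is not the implementation of mdeal, and the variable names (parts, blocks, is_partition) are chosen here for illustration only.

```matlab
% Sketch only: emulate the documented block partition with built-ins.
values     = reshape( 1:12 , 3 , 4 );   % 3-by-4 example matrix
block_size = [ 1 2 1 ];                 % sizes along dim; they sum to 4
dim        = 2;                         % split along columns

% Build the per-dimension size lists expected by mat2cell: every
% dimension is kept whole except dim, which is cut by block_size.
parts      = num2cell( size( values ) );
parts{dim} = block_size;
blocks     = mat2cell( values , parts{:} );

% One block per output variable.
[ M1 , M2 , M3 ] = deal( blocks{:} );

% Concatenating the blocks along dim rebuilds the original input.
is_partition = isequal( values , cat( dim , M1 , M2 , M3 ) );
```

Here M1, M2 and M3 hold 1, 2 and 1 columns of values respectively, and is_partition confirms that no element is lost or duplicated.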

Input arguments


 values             ::numeric::
                    Numeric vector, matrix or multidimensional-array.

 block_size         ::vector,numel::
                    Size of the blocks of  values  to be returned one 
                    per output argument.  The sizes are computed along
                     dim  and are expected to enable the creation
                    of a partition of sub-matrices.  This implies that
                     block_size  must sum to the size of  values  along 
                    the dimension  dim .

 dim                ::scalar_index|string::
                    Dimension along which to split  values  into blocks
                    (default: 'columns').
                    In case a string is passed, valid options are:

                          option      │   meaning
                       ───────────────┼────────────────────────────────
                          'rows'      │ split  values  along rows.
                       ───────────────┼────────────────────────────────
                          'columns'   │ split  values  along columns.

 fitting_mode       ::string::
                    Policy to adopt when selecting the size of each
                    output variable  (default: '--check').
                    Valid options are:

                          option      │   meaning
                       ───────────────┼────────────────────────────────
                        '--check'     │ Check whether  block_size  sum
                                      │ equals the number of  values 
                                      │ elements along the dimension 
                                      │  dim .
                                      │ If not, an error is thrown.
                       ───────────────┼────────────────────────────────
                        '--fit-all'   │ Adapt  block_size  values to
                                      │ be considered weights driving
                                      │ the size of each output
                                      │ variable.   values  elements
                                      │ will be always entirely split
                                      │ into output arguments even if
                                      │  block_size  sum doesn't equal
                                      │ the size of  values  along the
                                      │ dimension  dim .
                       ───────────────┼────────────────────────────────
                        '--fit-head'  │ Ensure the first output 
                                      │ variables have their size
                                      │ corresponding to the first
                                      │ elements of  block_size  even
                                      │ if  block_size  sum does not
                                      │ equal the size of  values 
                                      │ along the dimension  dim .
                                      │ Last output arguments adapt 
                                      │ their size to ensure all
                                      │ elements of  values  are 
                                      │ returned in some output
                                      │ variable.
                       ───────────────┼────────────────────────────────
                        '--fit-tail'  │ Ensure the last output 
                                      │ variables have their size
                                      │ corresponding to the last
                                      │ elements of  block_size  even
                                      │ if  block_size  sum does not
                                      │ equal the size of  values 
                                      │ along the dimension  dim .
                                      │ First output arguments adapt
                                      │ their size to ensure all
                                      │ elements of  values  are 
                                      │ returned in some output
                                      │ variable.
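
The table above does not state how block sizes are adjusted when block_size does not sum to the size of values along dim. The sketch below shows one possible reading of each policy using only built-in functions. The rounding and clamping rules, as well as the names fit_head, head_sizes, tail_sizes and all_sizes, are assumptions made for illustration and may differ from what mdeal actually does.

```matlab
% Sketch only: one possible reading of the documented fitting policies.
% The exact rules applied by mdeal are NOT stated here (assumption).
n          = 4;             % size of values along dim
block_size = [ 2 3 9 ];     % requested sizes; their sum (14) differs from n

% '--check': the sizes must already sum to n, otherwise an error is thrown.
check_ok   = ( sum( block_size ) == n );

% '--fit-all' (assumed rule): treat block_size as weights and round the
% cumulative block boundaries so that the sizes sum exactly to n.
all_sizes  = diff( [ 0 , round( cumsum( block_size ) / sum( block_size ) * n ) ] );

% '--fit-head' (assumed rule): honour the leading sizes while elements
% remain, then let the last block absorb whatever is left.
fit_head   = @( b ) diff( [ 0 , min( cumsum( b(1:end-1) ) , n ) , n ] );
head_sizes = fit_head( block_size );

% '--fit-tail' (assumed rule): the mirror image of '--fit-head'.
tail_sizes = fliplr( fit_head( fliplr( block_size ) ) );
```

Under every policy the resulting sizes sum to n, which is the property the table guarantees: all elements of values end up in some output variable.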

 groups             ::finite::
                    Optional argument (default: []) to permute the 
                    elements of  values  along the dimension  groups_dim 
                    and split  values  not only in blocks along the
                    dimension  dim  but also in homogeneous groups along
                    the dimension  groups_dim .
                    The groups are identified by associating the i-th
                    group to the i-th unique value of  groups  so that
                    the set of repeated instances of the i-th value
                    within  groups  is the i-th group.
                     values  is permuted by separating all the elements
                    along the dimension  groups_dim  whose position is
                    the position of an element in the first group, then
                    repeating the procedure with the elements associated
                    with the second group, and so on.

 groups_dim         ::scalar_index|empty::
                    Optional argument (default: []) defining the dimension
                    along which to permute and split the elements of
                     values  in homogeneous groups.
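
The grouping step described for groups and groups_dim can be sketched with the built-in functions unique, sort, accumarray and mat2cell. This is an illustration of the idea, not the code of mdeal: the names labels, id, order, permuted, counts and by_group are hypothetical, and keeping the original order of the rows within each group is an assumption of this sketch.

```matlab
% Sketch only: group the rows of values (groups_dim = 1) by the unique
% entries of groups, using built-ins instead of mdeal.
values = [ 10 11 ; 20 21 ; 30 31 ; 40 41 ; 50 51 ];
groups = [ 1 ; 0 ; 1 ; 0 ; 0 ];          % one label per row of values

% unique returns the labels in sorted order, so the i-th group is the
% i-th unique value; id(k) is the group number of the k-th row.
[ labels , ~ , id ] = unique( groups );

% Permute the rows so that all rows of the first group come first,
% then those of the second group, and so on.
[ ~ , order ] = sort( id(:) );
permuted      = values( order , : );

% Count the rows of each group and cut the permuted matrix accordingly.
counts   = accumarray( id(:) , 1 )';
by_group = mat2cell( permuted , counts , size( values , 2 ) );
```

In this example by_group{1} collects the rows labelled 0 and by_group{2} the rows labelled 1. Each group could then be split further into blocks along dim, as described for block_size.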

Example of usage


   % Motivational example: quick implementation of random walks
   [ x, y ]    = mdeal( cumsum( randn( 1000, 10 ) ) );
   figure(1); plot( x, y )
   [ x, y, z ] = mdeal( cumsum( randn( 1000, 15 ) ) );
   figure(2); plot3( x, y, z )


   % Straightforward cases: input number of columns (rows)
   % is an exact multiple of the number of output variables
   siz = [ 1 3 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 )

   siz = [ 4 3 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 )

   siz = [ 6 5 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 , [] , 1 )


   % Automatic or user-defined balancing in case the input number
   % of columns (rows, nth-dimension size) is not an exact multiple
   % of the number of output variables.
   siz = [ 1 7 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 )
   [M1, M2, M3] = mdeal( M0 , []      )
   [M1, M2, M3] = mdeal( M0 , [1 0 6] )

   siz = [ 4 7 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 )
   [M1, M2, M3] = mdeal( M0 , [] )
   [M1, M2, M3] = mdeal( M0 , [] , 'columns' )
   [M1, M2, M3] = mdeal( M0 , [] , 2         )
   [M1, M2, M3] = mdeal( M0 , [] , 'rows'    )
   [M1, M2, M3] = mdeal( M0 , [] , 1         )


   % Support for sparse matrices.
   M0  = sparse( M0 );
   [M1, M2, M3] = mdeal( M0 )
   [M1, M2, M3] = mdeal( M0 , [] )
   [M1, M2, M3] = mdeal( M0 , [] , 'columns' )
   [M1, M2, M3] = mdeal( M0 , [] , 2         )
   [M1, M2, M3] = mdeal( M0 , [] , 'rows'    )
   [M1, M2, M3] = mdeal( M0 , [] , 1         )


   % @mdeal can manage multidimensional arrays.
   siz = [ 4 7 3 ]
   d   = 2;
   M0  = mat2multi( 1:prod(siz) , d , siz );
   [M1, M2, M3] = mdeal( M0 )
   [M1, M2, M3] = mdeal( M0 , []      )
   [M1, M2, M3] = mdeal( M0 , []      , d )
   [M1, M2, M3] = mdeal( M0 , [1 0 6] , d )
   d   = 1;
   M0  = mat2multi( 1:prod(siz) , d , siz );
   [M1, M2, M3] = mdeal( M0 , []      , d )
   [M1, M2, M3] = mdeal( M0 , [1 1 2] , d )
   d   = 3;
   M0  = mat2multi( 1:prod(siz) , d , siz );
   [M1, M2, M3] = mdeal( M0 , []      , d )
   [M1, M2, M3] = mdeal( M0 , [0 1 2] , d )
   
   % correctness test:
   isequal( M0 , cat( d , M1 , M2 , M3 ) )


   % using different values for 'fitting_mode'
   siz = [ 7 4 ]
   M0  = mat2multi( 1:prod(siz) , 2 , siz );
   [M1, M2, M3] = mdeal( M0 , [] )
   [M1, M2, M3] = mdeal( M0 , [2 1 1] )
   [M1, M2, M3] = mdeal( M0 , [2 0 1] , 2 , '--fit-all'  )
   [M1, M2, M3] = mdeal( M0 , [2 0 1] , 2 , '--fit-head' )
   [M1, M2, M3] = mdeal( M0 , [2 0 1] , 2 , '--fit-tail' )
   [M1, M2, M3] = mdeal( M0 , [2 3 9] , 2 , '--fit-all'  )
   [M1, M2, M3] = mdeal( M0 , [2 3 9] , 2 , '--fit-head' )
   [M1, M2, M3] = mdeal( M0 , [2 3 9] , 2 , '--fit-tail' )


   % Motivational example (advanced): quick subdivision in training
   % and validation sets for a multivariate regression
   N = 100
   % Let us suppose M is a dataset loaded from file, composed of three
   % columns of values representing measures of three quantities.
   % If the third quantity is supposed to be correlated with and causally
   % dependent on the first two quantities, it could be interesting
   % to model its dependency on them.
   % A linear and a quadratic model are compared by repeating the
   % model training with several subsets of data and by validating
   % the model generalization each time with the unused subset of
   % data.
   M         = rand(N,2);
   M( :, 3 ) = M(:,1:2).^1.5*[2;3] .*( 1 + randn(N,1)/5 );

   n_subtrain          = 100;
   training_error_1    = zeros( n_subtrain , 1 )*nan;
   validation_error_1  = zeros( n_subtrain , 1 )*nan;
   training_error_2    = zeros( n_subtrain , 1 )*nan;
   validation_error_2  = zeros( n_subtrain , 1 )*nan;
   for i=1:n_subtrain
      % Training   set: x, y, z (about 70% of the data)
      % Validation set: X, Y, Z (about 30% of the data)
      [x,X,y,Y,z,Z]    = mdeal( M, [], 2, '--check', rand(N,1)>.7, 1);

      % Linear model.
      param            = [x y]\z;
      approximated_z   = [x y]*param;
      predicted_Z      = [X Y]*param;
      training_error_1(i)   = mean( ( approximated_z - z ).^2 )^.5;
      validation_error_1(i) = mean( ( predicted_Z    - Z ).^2 )^.5;

      % Quadratic model.
      param            = [x y x.^2 y.^2]\z;
      approximated_z   = [x y x.^2 y.^2]*param;
      predicted_Z      = [X Y X.^2 Y.^2]*param;
      training_error_2(i)   = mean( ( approximated_z - z ).^2 )^.5;
      validation_error_2(i) = mean( ( predicted_Z    - Z ).^2 )^.5;
   end

   % The title is set after hist, which would otherwise reset the axes.
   subplot( 2, 2, 1 ); hist( training_error_1 , n_subtrain^.5 )
   title('training: linear m.')
   subplot( 2, 2, 2 ); hist( training_error_2 , n_subtrain^.5 )
   title('training: quadratic m.')
   subplot( 2, 2, 3 ); hist( validation_error_1 , n_subtrain^.5 )
   title('validation: linear m.')
   subplot( 2, 2, 4 ); hist( validation_error_2 , n_subtrain^.5 )
   title('validation: quadratic m.')

See also:
   score, mat2multi, multi2mat



Keywords:
   multidimensional-array, multiple variables, sub-matrices



Version: 0.4.5

Support

The Mastrave modelling library is committed to providing reusable and general - but also robust and scalable - modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedback on unexpected behaviours of this module. Despite all efforts, all of us - either developers or users - (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.

This page is licensed under a Creative Commons Attribution-NoDerivs 3.0 Italy License.

This document is also part of the book:
de Rigo, D. (2012). Semantic Array Programming with Mastrave - Introduction to Semantic Computational Modelling. http://mastrave.org/doc/MTV-1.012-1