Mastrave project

The unchecked module "rand_idx_" of the Mastrave modelling library

Daniele de Rigo

Copyright and license notice of the function rand_idx_

The file rand_idx_.m is part of Mastrave.

Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.

Function declaration

 [ random_vals, rnd ] = rand_idx_( weights        ,
                                   siz       = 1  ,
                                   value_set = [] ,
                                   rnd       = [] )

Description

Warning: rand_idx_ should not be used in end-user code because it deliberately skips input and output arguments checks and other pre- and post-condition testing. End-user code should use instead the rand_idx function (without the ending underscore).

Utility to generate random samples extracted from a given finite set of values ( value_set ) with a frequency proportional to a set of weights each of them associated to the corresponding element of value_set . The higher the i-th weight, the higher the probability for the corresponding i-th element of value_set to be randomly extracted. Multiple instances of the same element of value_set may be extracted. The size of the returned random-sample array random_vals is siz . To generate random_vals , a random-sample of real numbers uniformly distributed between 0 and 1 is required. The random-sample can be provided as an input with the array rnd , otherwise it is generated within the utility. In both cases, rnd is also returned as optional output argument.

Input arguments


 weights            ::nonnegative::
                    Array of non-negative elements each of them representing
                    the weight (proportional to the frequency) with which 
                    the corresponding element of  value_set  has to be
                    randomly sampled.

 siz                ::numel::
                    Size of the output argument  random_vals .
                    If omitted, the default value is 1.

 value_set          ::numstring::
                    Array providing the set of values to be randomly
                    sampled. If empty, the default value is the sequence
                    of position-indexes associated with  weights 
                    i.e. 1:numel(  weights  ) .
                    If omitted, the default value is [] (empty).

 rnd                ::probability::
                    Optional array of real numbers between 0 and 1 which
                    can be passed as input to ensure the random-sampling
                    is deterministically reproducible.  If  rnd  is not
                    uniformly distributed in [0 1], the average frequency 
                    of the  value_set  elements in the returned array 
                     random_vals  will not be proportional to  weights .
                    If empty, an internal random-sampling ─ uniformly 
                    distributed ─ is generated.
                    If omitted, the default value is [] (empty).

Example of usage


   % Motivational example:
   % Sensitivity analysis of a polynomial model of order 3 (cubic model)
   % against data generated with an higher-order polynomial.

   % Original (syntetic) data:
   xi  = mean( sort( rand( 20, 5 ) ), 2 )* 8 + 1;
   yi  = .01 * xi.^5 - xi.^3 + pi;
   
   % Cubic model:
   M   = bsxfun( @power, xi, [0:3] )
   th  = M \ yi

   hold off; plot( xi , yi , 'o-k' , xi , M*th , '+k' )
   legend( 'original data' , 'cubic model' )

   % Fine-grid model (spatial resolution: 0.1 units).
   xii = [0:0.1:10].';
   MM  = bsxfun( @power, xii, [0:3] );

   % Sensitivity analysis (simplified bootstrapping).
   % Bootstrapping extractions: 300.
   nb  = 300
   id  = rand_idx_( ones(20,1), [20,nb] );
   Xi  = xi( id ); Yi = yi( id );
   thi = zeros(     4      , nb );
   yii = zeros( numel(xii) , nb );
   for i=1:nb
      thi(:,i) = bsxfun( @power, Xi(:,i), [0:3] ) \ Yi(:,i);
      yii(:,i) = MM * thi(:,i);
   end  
   hold on
   plot( xii, quantile( yii.' , [.05 .5 .95] ).' )
   legend( '5%' , '50%' , '95%' ); hold off;



   % Comparison between rand_idx_( . ) and one of the possible
   % trivial implementations.  The one being considered is:
   %    O( numel( weights ) * numel( rnd ) )
   % Its comparison with rand_idx_( . ) is stopped when the size of
   % the involved temporary matrix to test  weights  against  rnd 
   % is approaching 50000 x 50000.  This is done to prevent memory
   % exhaustion.
   Ni = floor( 10.^[3:.25:6] +.5 )
   n  = numel( Ni );
   [ t1, t2, t3 ] = deal( zeros( n, 1 ) * nan );
   for i = 1:n
      N       = Ni( i );
      weights = rand( 1 , N );
      rnd     = rand( N , 1 );
      if N < 50000
         tic;
         w       = cumsum([ 0 weights ]);
         w       = w./w(end);
         id1     = sum( bsxfun( @le, w , rnd ), 2 );
         t1( i ) = toc;
      end

      tic;
      id2     = rand_idx( weights, N, [], rnd ); 
      t2( i ) = toc;

      tic;
      id3     = rand_idx_( weights, N, [], rnd ); 
      t3( i ) = toc;

      if N < 50000
         assert( isequal( id1 , id2 , id3 ) )
      end

   end
   hold off; loglog( Ni, [ t1, t2, t3 ], 'o-' )
   legend( 'trivial implementation' , 'rand_idx( . )' , 'rand_idx_( . )' )

See also:
   create_contiguous_intervals



Keywords:
   intervals, keys, indexes, random



Version: 0.8.8

Support

The Mastrave modelling library is committed to provide reusable and general - but also robust and scalable - modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedbacks on unexpected behaviours of this module. Despite all efforts, all of us - either developers or users - (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity for collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please, be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.

This page is licensed under a Creative Commons Attribution-NoDerivs 3.0 Italy License.

This document is also part of the book:
de Rigo, D. (2012). Semantic Array Programming with Mastrave - Introduction to Semantic Computational Modelling. http://mastrave.org/doc/MTV-1.012-1