The module "frequency_resampling" of the Mastrave modelling library
Copyright and license notice of the function frequency_resampling
Copyright © 2007,2008,2009,2010,2011,2012,2013,2014 Daniele de Rigo
The file frequency_resampling.m is part of Mastrave.
Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.
Function declaration
[ resampled_freqs, resampled_obs, siz ] = frequency_resampling( frequency_vals , tot_observations = [] , N_runs = [] , rand_func = [] , do_sparse_resampling = [] , use_binomial = [] , do_obs_resampling = [] )
Description
Module supporting the unbiased statistical resampling of binary observations (i.e. observations whose value may be true/positive or false/negative). Given an array frequency_vals recording the frequency of positive observations and an array tot_observations with the corresponding total number of recorded observations (irrespective of whether they are positive or negative), the module generates a set of N_runs statistical resampling runs for both frequency_vals and tot_observations.
Each run randomly selects the observations following a uniform pseudo-random sequence (by default, generated with @rand) or a custom randomisation function rand_func, which may be provided as an optional input. The observations are selected with repetition (i.e. a given observation may be selected multiple times). While the overall number of observations selected within each run is the same as that of tot_observations, the total number of positive observations may differ from run to run.
frequency_vals and tot_observations must have the same size siz. The resampled frequencies resampled_freqs and the correspondingly resampled total observations resampled_obs are returned as matrices where each column provides the output of a run. The number of elements in each column (i.e. the number of matrix rows) is the same as the number of elements of frequency_vals and tot_observations.
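The resampling with repetition described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, and it is not the Mastrave implementation: the function name bootstrap_counts, its argument names and the nested-list output layout are assumptions made for illustration. Each element's counts are expanded into individual positive/negative records, and each run draws as many records as the original total.

```python
import random

def bootstrap_counts(frequency_vals, tot_observations, n_runs, rng=None):
    """Resample binary observations with repetition (illustrative sketch).

    Returns (freqs, obs) as nested lists: row i corresponds to element i
    of the inputs, column j to resampling run j.
    """
    rng = rng or random.Random()
    n_elems = len(frequency_vals)
    # Expand the counts into one (element index, is_positive) record
    # per recorded observation.
    pool = []
    for i, (pos, tot) in enumerate(zip(frequency_vals, tot_observations)):
        pool.extend([(i, True)] * pos)
        pool.extend([(i, False)] * (tot - pos))
    freqs = [[0] * n_runs for _ in range(n_elems)]
    obs = [[0] * n_runs for _ in range(n_elems)]
    for run in range(n_runs):
        # Draw as many records as the original total, with repetition.
        for i, is_positive in rng.choices(pool, k=len(pool)):
            obs[i][run] += 1
            freqs[i][run] += int(is_positive)
    return freqs, obs

P = [0, 2, 4, 6]    # positive observations per element
PN = [6, 6, 6, 6]   # total observations per element
fr, ob = bootstrap_counts(P, PN, n_runs=5, rng=random.Random(0))
```

In this sketch every column of ob sums to sum(PN), while the column sums of fr vary from run to run, which mirrors the behaviour stated in the description.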
Input arguments
frequency_vals ::numel:: Array counting the positive occurrences (presences). For each element of the array, the local cumulated number of presences is provided. An element's value of zero is interpreted as an absence of positive observations, irrespective of whether the element lacks observations altogether or the locally available observations are all negative.
tot_observations ::numel:: Array counting the total observations (including both positive and negative occurrences). An element's value of zero is interpreted as an absence of local observations. Default: [] (empty array). If empty, the same array as frequency_vals is used (i.e. all observations are considered positive).
N_runs ::scalar_index:: Number of runs of the statistical resampling. Default: [] (empty array). If empty, only one run is generated.
rand_func ::function_handle:: Optional handle to a custom randomisation function. Default: [] (empty array). If empty, the function @rand is used.
do_sparse_resampling ::scalar_binary:: Flag setting whether the returned values of resampled_freqs and resampled_obs must be sparse matrices. Default: [] (empty array). If empty, the flag is set to false.
use_binomial ::scalar_binary:: Flag setting whether the returned values of resampled_freqs must be computed by explicit bootstrap or instead by exploiting a binomial Monte Carlo extraction. The binomial method may be more efficient where frequency_vals and tot_observations are characterised by a high number of observations per element. Default: [] (empty array). If empty, the flag is set to false.
do_obs_resampling ::scalar_binary:: Flag setting whether the returned values of resampled_obs must be based on statistical resampling, as resampled_freqs is, or not. If true, the total number of observations in each resampled run is preserved. If false, each column (i.e. run) of resampled_obs has the same elements as tot_observations. Default: [] (empty array). If empty, the flag is set to true.
{ frequency_vals , tot_observations } ::same_size::
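To make the difference between the explicit bootstrap and a binomial extraction more concrete, the following Python sketch (standard library only) draws the positives of each element from a binomial distribution while keeping the local totals fixed. This is only one plausible reading of combining use_binomial with do_obs_resampling set to false; how Mastrave actually implements these flags is not stated here, and the name binomial_counts and its arguments are hypothetical.

```python
import random

def binomial_counts(frequency_vals, tot_observations, n_runs, rng=None):
    """Per-element binomial extraction with fixed totals (illustrative sketch).

    For element i, each run draws the number of positives from a binomial
    distribution with n = tot_observations[i] trials and success
    probability p = frequency_vals[i] / tot_observations[i].
    """
    rng = rng or random.Random()
    freqs = []
    for pos, tot in zip(frequency_vals, tot_observations):
        # An element without observations cannot yield positives.
        p = pos / tot if tot > 0 else 0.0
        # A binomial draw as the sum of `tot` independent Bernoulli trials.
        row = [sum(rng.random() < p for _ in range(tot)) for _ in range(n_runs)]
        freqs.append(row)
    # The local totals are not resampled: every run repeats tot_observations.
    obs = [[tot] * n_runs for tot in tot_observations]
    return freqs, obs

P = [0, 5, 10]
PN = [10, 10, 10]
fr_b, ob_b = binomial_counts(P, PN, n_runs=4, rng=random.Random(1))
```

Summing Bernoulli trials keeps the sketch dependency-free; a dedicated binomial sampler would be preferable for large totals, which is the case where the description suggests the binomial method pays off.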
Example of usage
    P  = 0:2:20                   % Positive observations
    PN = ones(size(P))*max(P)     % All observations (positive + negative ones)
    ni = 10000                    % Number of resampling runs
    x  = linspace(0,1,ni);

    % Bootstrap of local positive and negative observations
    % (the overall number of observations is preserved)
    [fr,ob] = frequency_resampling( P, PN, ni );

    % Binomial distribution
    % (without bootstrapping the local number of observations)
    for i=1:numel(P), bino(:,i)=binoinv(rand(ni,1),PN(i),P(i)/PN(i)); end

    figure(1); hold off; plot( x, sort(fr./ob,2).' );
    hold on; plot( x, sort(bino)./PN(1), 'o' );
    title( sprintf( [ ...
       '.\nthin: bootstrapped frequencies (P,PN, 10000 runs); \n' ...
       'thick: binomial with fixed PN' ] ) )

    % Bootstrap of only local positive observations
    % (the local number of observations is preserved)
    [fr,ob] = frequency_resampling( P, PN, ni, [], [], [], false );

    figure(2); hold off; plot( x, sort(fr./ob,2).' );
    hold on; plot( x, sort(bino)./PN(1), 'o' );
    title( sprintf( [ ...
       '.\nthin: bootstrapped frequencies (only P, 10000 runs); \n' ...
       'thick: binomial with fixed PN' ] ) )
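The plots in the example sort the resampled ratios fr./ob of each element across the runs and draw them against evenly spaced values in [0, 1], which amounts to plotting an empirical quantile curve per element. A minimal Python sketch of that single step follows; the helper name empirical_quantile_curve and the sample ratios are made up for illustration and do not come from Mastrave.

```python
def empirical_quantile_curve(values):
    """Pair the sorted values with evenly spaced probabilities in [0, 1]."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 1:
        return [(0.0, ordered[0])]
    # Same abscissae as linspace(0, 1, n) in the Octave/MATLAB example.
    return [(k / (n - 1), v) for k, v in enumerate(ordered)]

# Hypothetical ratios of positives over totals for one element, five runs.
ratios = [0.5, 0.2, 0.9, 0.4, 0.7]
curve = empirical_quantile_curve(ratios)
```

Reading the curve at abscissa 0.5 gives the median of the resampled ratios; the spread of the curve around it indicates how uncertain the local frequency is.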
See also: rand_idx, frequency2index, mloop
Keywords: data-transformation, matrix, blocks
Version: 0.6.12
Support
The Mastrave modelling library is committed to providing reusable and general (but also robust and scalable) modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedback on unexpected behaviours of this module. Despite all efforts, all of us, whether developers or users, (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that the freedom of scientific knowledge also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.