## The module "frequency_resampling" of the Mastrave modelling library

**Daniele de Rigo**

#### Copyright and license notice of the function frequency_resampling

Copyright © 2007,2008,2009,2010,2011,2012,2013,2014 Daniele de Rigo

The file frequency_resampling.m is part of Mastrave.

Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.

#### Function declaration

[resampled_freqs,resampled_obs,siz] = frequency_resampling(frequency_vals,tot_observations= [] ,N_runs= [] ,rand_func= [] ,do_sparse_resampling= [] ,use_binomial= [] ,do_obs_resampling= [] )

#### Description

Module supporting the unbiased statistical resampling of binary observations
(i.e. observations whose value may be true/positive or false/negative).
Given an array recording the frequency of positive observations
`frequency_vals` and an array with the corresponding total amount of
recorded observations `tot_observations` (irrespective of whether they are
positive or negative), the module generates a set of `N_runs` statistical
resampling runs for both `frequency_vals` and `tot_observations`.

A given run randomly selects the observations following a uniform
pseudo-random sequence (by default, generated with @rand) or a custom
randomisation function `rand_func` which might be provided as optional
input. The observations are selected with repetition (i.e. a given
observation may be selected multiple times). While the overall amount of
observations selected within each run is the same as that of
`tot_observations`, this might not be true for the total number of
positive observations.

`frequency_vals` and `tot_observations` must have the same size `siz`.
The resampled frequencies `resampled_freqs` and the correspondingly
resampled total observations `resampled_obs` are returned as matrices
where each column provides the output of a run. The number of elements in
each column (i.e. the number of matrix rows) is the same as the number of
elements of `frequency_vals` and `tot_observations`.
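The selection with repetition described above can be illustrated with a minimal sketch. It is only one possible reading of the description, not the code of `frequency_resampling`: the variable names `elem_id`, `is_pos`, `pick`, `run_obs` and `run_freqs` are hypothetical, and the expansion of the counts into one record per observation is an assumption made for clarity rather than efficiency.

```octave
% Hypothetical illustration of a single resampling run (not the Mastrave code).
frequency_vals   = [0 2 5 3];     % positive observations per element
tot_observations = [4 4 6 3];     % all observations per element
rand_func        = @rand;         % default generator named in the description

n_elem  = numel( frequency_vals );
n_total = sum( tot_observations );

% Expand the counts into one record per observation:
% elem_id(k) is the element owning observation k, is_pos(k) its binary value.
elem_id = zeros( n_total, 1 );
is_pos  = false( n_total, 1 );
k = 0;
for e = 1:n_elem
   idx = k + (1:tot_observations(e));
   elem_id(idx) = e;
   is_pos(idx(1:frequency_vals(e))) = true;
   k = k + tot_observations(e);
end

% One run: draw n_total observations with repetition.
pick = floor( rand_func( n_total, 1 ) * n_total ) + 1;
pick = min( pick, n_total );      % guard against floating-point round-up

% Count, for each element, the selected observations and the positive ones.
run_obs   = accumarray( elem_id(pick), 1, [n_elem 1] );
run_freqs = accumarray( elem_id(pick), double( is_pos(pick) ), [n_elem 1] );
```

In this sketch `sum(run_obs)` always equals `sum(tot_observations)`, whereas `sum(run_freqs)` varies from run to run, which mirrors the remark that the total number of positive observations need not be preserved.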

#### Input arguments

**frequency_vals** `::numel::` Array counting the positive occurrences (presences). For each element of the array, the local cumulated amount of presences is provided. An element's value of zero is interpreted as an absence of positive observations, irrespective of whether within the element there is a lack of observations or instead the local available observations are all negative.

**tot_observations** `::numel::` Array counting the total observations (including both positive and negative occurrences). An element's value of zero is interpreted as an absence of local observations. Default: [] (empty array). If empty, the same array of `frequency_vals` is used (i.e. all observations are considered as positive).

**N_runs** `::scalar_index::` Number of runs of the statistical resampling. Default: [] (empty array). If empty, only one run is generated.

**rand_func** `::function_handle::` Optional handle to a custom randomisation function. Default: [] (empty array). If empty, the function @rand is used.

**do_sparse_resampling** `::scalar_binary::` Flag setting whether the returned values of `resampled_freqs` and `resampled_obs` must be sparse matrices. Default: [] (empty array). If empty, the flag is set as false.

**use_binomial** `::scalar_binary::` Flag setting whether the returned values of `resampled_freqs` must be computed by explicit bootstrap or instead by exploiting a binomial Monte Carlo extraction. The binomial method may be more efficient where a high number of observations per element characterise `frequency_vals` and `tot_observations`. Default: [] (empty array). If empty, the flag is set as false.

**do_obs_resampling** `::scalar_binary::` Flag setting whether the returned values of `resampled_obs` must be based on statistical resampling as `resampled_freqs` is, or not. If yes, the total number of observations in each resampled run is respected. If not, each column (i.e. run) of `resampled_obs` has the same elements as `tot_observations`. Default: [] (empty array). If empty, the flag is set as true.

**{tot_observations,frequency_vals}** `::same_size::`
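To make the binomial alternative to the explicit bootstrap more concrete, the following sketch draws, for each element, the number of positive observations from a binomial distribution while keeping the local totals fixed. It is an assumption about what a binomial Monte Carlo extraction could look like, not the code executed when `use_binomial` is true; the names `p`, `binom_freqs` and `fixed_obs` are hypothetical, and only the built-in `rand` is used.

```octave
% Hypothetical sketch of a binomial Monte Carlo extraction (not the Mastrave code).
frequency_vals   = [0 2 5 3];     % positive observations per element
tot_observations = [4 4 6 3];     % all observations per element
N_runs           = 1000;
n_elem           = numel( frequency_vals );

% Local probability of a positive observation (zero where nothing was observed).
p       = zeros( 1, n_elem );
has_obs = tot_observations > 0;
p(has_obs) = frequency_vals(has_obs) ./ tot_observations(has_obs);

binom_freqs = zeros( n_elem, N_runs );
for e = 1:n_elem
   % Each column sums tot_observations(e) Bernoulli trials with probability p(e).
   trials = rand( tot_observations(e), N_runs ) < p(e);
   binom_freqs(e,:) = sum( trials, 1 );
end

% Totals left unresampled: every column (run) repeats tot_observations.
fixed_obs = repmat( tot_observations(:), 1, N_runs );
```

Each column of `fixed_obs` repeats the input totals, which corresponds to the behaviour documented for `resampled_obs` when `do_obs_resampling` is false. Summing Bernoulli trials is simple but costly for large totals; a dedicated binomial generator would be preferable there.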

#### Example of usage

    P  = 0:2:20                    % Positive observations
    PN = ones(size(P))*max(P)      % All observ. (positive+negative ones)
    ni = 10000                     % Number of resampling runs
    x  = linspace(0,1,ni);

    % Bootstrap of local positive and negative observations
    % (the overall number of observation is preserved)
    [fr,ob] = frequency_resampling( P, PN, ni );

    % Binomial distribution
    % (without bootstrapping the local number of observations)
    for i=1:numel(P), bino(:,i)=binoinv(rand(ni,1),PN(i),P(i)/PN(i)); end

    figure(1); hold off; plot( x, sort(fr./ob,2).' );
    hold on; plot( x, sort(bino)./PN(1), 'o' );
    title( sprintf( [ ...
       '.\nthin: bootstrapped frequencies (P,PN, 10000 runs); \n' ...
       'thick: binomial with fixed PN' ] ) )

    % Bootstrap of only local positive observations
    % (the local number of observation is preserved)
    [fr,ob] = frequency_resampling( P, PN, ni, [], [], [], false );

    figure(2); hold off; plot( x, sort(fr./ob,2).' );
    hold on; plot( x, sort(bino)./PN(1), 'o' );
    title( sprintf( [ ...
       '.\nthin: bootstrapped frequencies (only P, 10000 runs); \n' ...
       'thick: binomial with fixed PN' ] ) )

See also: rand_idx, frequency2index, mloop

Keywords: data-transformation, matrix, blocks

Version: 0.6.12

#### Support

The Mastrave modelling library is committed to providing reusable and general - but also robust and scalable - modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedback on unexpected behaviours of this module. Despite all efforts, all of us - either developers or users - (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.