## The module "mstream" of the Mastrave modelling library

Daniele de Rigo

The file mstream.m is part of Mastrave.

Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.

#### Function declaration

    answer = mstream( func               ,
                      in_streams         ,
                      out_streams        ,
                      block_size         ,
                      precision   = []   ,
                      skip        = []   ,
                      arch        = []   ,
                      header      = []   ,
                      header_fill = []   ,
                      same_size   = true )



#### Description

Utility to apply a data-transformation function func to a set of input streams in_streams and save the result in a set of output streams out_streams.

The data-transformation function is identified by the function handle func and is expected to receive as input a matrix of block_size rows (or fewer) and n_in columns, where n_in is the number of input streams. The expected output of func is another matrix of block_size rows (or fewer, in accordance with the input matrix) and n_out columns, where n_out is the number of output streams.
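As a minimal sketch of this contract, the following plain Octave snippet defines a transformation from three input streams to two output streams and applies it to a single block. The handle name two_out and the sample block are illustrative assumptions, not part of Mastrave.

```octave
% Hypothetical transformation: n_in = 3 input columns -> n_out = 2 output columns
two_out = @(x) [ sum(x,2), max(x,[],2) ];

% A sample block of 4 rows (block_size or fewer) and n_in = 3 columns
block  = [ 1 2 3 ; 4 5 6 ; 7 8 9 ; 10 11 12 ];
result = two_out( block );

% The output keeps the rows of the input block and has n_out columns
assert( isequal( size( result ), [ 4 2 ] ) )
```

Because the function works row by row, it returns as many rows as it receives, whatever the size of the block.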

The n_in input streams can be one of the following:

- the n_in columns of a matrix passed as the in_streams argument;
- the n_in files identified by the corresponding file IDs passed as the in_streams argument in the form of a cell-array of file IDs (which are integer numbers: see @fopen, @fread and @fwrite for more details);
- the n_in files identified by the corresponding file paths passed as the in_streams argument in the form of a cell-array of strings.

The same alternatives apply to the n_out output streams.

func is iteratively applied to blocks of the input streams and the result is iteratively saved in corresponding blocks of the output streams. block_size identifies the number of rows to be processed per iteration.
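The block-wise iteration can be pictured with a plain Octave loop over the rows of a matrix. This is only a sketch of the idea under simple assumptions (matrix input and output, row-preserving function); it is not the actual mstream implementation, and all variable names are illustrative.

```octave
% Plain-Octave sketch of block-wise processing on a matrix input
% (illustrative only: not the actual mstream implementation)
func       = @(x) 10*x;
in_data    = reshape( (1:40).', 10, 4 );   % 10 rows, n_in = 4 streams
block_size = 3;
n_rows     = size( in_data, 1 );
out_data   = zeros( n_rows, 4 );           % n_out = 4 in this sketch

for first = 1:block_size:n_rows
   % The last block may contain fewer than block_size rows
   last = min( first + block_size - 1, n_rows );
   out_data( first:last, : ) = func( in_data( first:last, : ) );
end

% For a function acting row by row, the result does not depend on block_size
assert( isequal( out_data, func( in_data ) ) )
```

With 10 rows and a block size of 3, the loop processes three blocks of 3 rows and a final block of 1 row.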

The optional input arguments precision, skip and arch have the same meaning as the third, fourth and fifth arguments of @fread and @fwrite, respectively.
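For readers less familiar with these three arguments, the sketch below uses them directly with fwrite and fread on a small temporary file. The file and the values are illustrative assumptions; only standard Octave file functions are used.

```octave
% Sketch: the precision / skip / arch arguments used directly with
% fwrite and fread (the temporary file is an illustrative assumption)
fname = tempname();

fid = fopen( fname, 'wb' );
fwrite( fid, 1:6, 'int16', 0, 'ieee-le' );   % six 2-byte little-endian integers
fclose( fid );

fid  = fopen( fname, 'rb' );
% precision = 'int16', skip = 2 bytes after each value, arch = 'ieee-le'
vals = fread( fid, Inf, 'int16', 2, 'ieee-le' );
fclose( fid );
delete( fname );

disp( vals.' )
```

Skipping 2 bytes after each 2-byte value means that every other stored integer is read.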

The optional input argument header specifies the number of bytes to skip before reading/writing the first block of values from in_streams and to out_streams, respectively. This is useful for manipulating binary file formats in which the data are prefixed with an initial set of bytes (of known size) used as metadata (header). For example, the Binary Terrain file format (Discoe, 2007) uses a fixed header of 256 bytes, followed by the binary data (single-precision values in the example file used below).
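The effect of skipping a fixed-size header can be reproduced by hand with fseek followed by fread. The snippet below is a sketch under stated assumptions: the temporary file, its dummy 256-byte header and the three sample values are made up for illustration and do not form a valid Binary Terrain file.

```octave
% Sketch: reading single-precision values stored after a fixed-size header
% (file name, header content and values are illustrative assumptions)
fname    = tempname();
n_header = 256;                           % header size in bytes
values   = single( [ 1.5 ; 2.5 ; 3.5 ] );

% Write a dummy header followed by the data (little-endian)
fid = fopen( fname, 'wb', 'ieee-le' );
fwrite( fid, zeros( n_header, 1 ), 'uint8' );
fwrite( fid, values, 'float' );
fclose( fid );

% Skip the header, then read the data that follow it
fid  = fopen( fname, 'rb', 'ieee-le' );
fseek( fid, n_header, 'bof' );
data = fread( fid, Inf, 'float' );
fclose( fid );
delete( fname );

assert( isequal( data, double( values ) ) )
```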

When files are to be written as requested with out_streams, the optional argument header_fill identifies a reference file whose header is copied into the files defined in out_streams.
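Copying a header from a reference file amounts to reading its first bytes verbatim and writing them at the beginning of the new file. The following is a hedged sketch of that idea with standard file functions; it is not the code used by mstream, and the file names, the 16-byte header and the payload are illustrative assumptions.

```octave
% Sketch: copying the first n_header bytes of a reference file into a new file
% (file names and header content are illustrative assumptions)
ref_name = tempname();
out_name = tempname();
n_header = 16;

% Build a small reference file: a 16-byte header followed by some payload
fid = fopen( ref_name, 'wb' );
fwrite( fid, uint8( 1:n_header ), 'uint8' );
fwrite( fid, [ 10 20 30 ], 'double' );
fclose( fid );

% Read the header bytes verbatim from the reference file ...
fid    = fopen( ref_name, 'rb' );
header = fread( fid, n_header, 'uint8=>uint8' );
fclose( fid );

% ... and write them at the beginning of the output file
fid = fopen( out_name, 'wb' );
fwrite( fid, header, 'uint8' );
fclose( fid );

% Read the output file back to check the copied bytes
fid    = fopen( out_name, 'rb' );
copied = fread( fid, Inf, 'uint8=>uint8' );
fclose( fid );
delete( ref_name );
delete( out_name );

assert( isequal( copied(:), uint8( 1:n_header ).' ) )
```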

References
Discoe, B.: The BT (Binary Terrain) File Format, 2007
http://www.vterrain.org/Implementation/Formats/BT.html

#### Input arguments


func               ::function_handle::
Function handle identifying the data-transformation
function to be applied to in_streams to transform
them into out_streams.

in_streams         ::matrix|cellnumstring-1::
Input streams.  If a cell-array of strings (file
paths) is passed, the  corresponding files will be
opened in read-mode and closed at the end of the
data-transformation.

out_streams        ::matrix|cellnumstring-1::
Output streams.  If a cell-array of strings (file
paths) is passed, the  corresponding files will be
opened in write-mode and closed at the end of the
data-transformation.

block_size         ::scalar_index::
Number of rows to be processed within each iteration.
See the description of the second argument of @fread.

precision          ::string::
Type of data to be read and written.
If in_streams (or out_streams) is a matrix,
precision has no effect on its manipulation.
See the description of the third argument of @fread
and @fwrite functions.  If empty, the default
value is 'double'.
If omitted, the default value is [] (empty).

skip               ::scalar_numel::
Number of bytes to skip before reading/writing each
value from in_streams and to out_streams, respectively.
If in_streams (or out_streams) is a matrix,
skip has no effect on its manipulation.
See the description of the fourth argument of @fread
and @fwrite functions.  If empty, the default
value is 0.
If omitted, the default value is [] (empty).

arch               ::string::
Type of data-format of the files (order of bytes).
If in_streams (or out_streams) is a matrix,
arch has no effect on its manipulation.
See the description of the fifth argument of @fread
and @fwrite functions.  If empty, the default
value is 'native'.
If omitted, the default value is [] (empty).

header
Number of bytes to skip before reading/writing the
first block of values from in_streams and to
out_streams, respectively.  Passing this optional
argument allows file formats containing data in
binary format to be manipulated even when the data
are prefixed with an initial set of bytes (of known
size) used as metadata (header).
If in_streams (or out_streams) is a matrix,
header has no effect on its manipulation.
If empty, the default value is 0.
If omitted, the default value is [] (empty).

header_fill
Reference file whose header is read and copied as
header into the files defined in out_streams.
If empty or omitted, the headers in out_streams are
filled with spaces.

same_size          ::scalar,logical::
If set to true, this flag requires the size of the
output blocks to be the same as that of the input ones.
In particular, the number of rows of the input and
output matrix is expected to be the same in each block.
Different blocks may still have different numbers of
rows. If omitted, the default value is true.
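To make the same_size requirement concrete, the sketch below shows a transformation that does not preserve the number of rows of a block. The handle keep_pos and the sample block are illustrative assumptions; the snippet only compares block sizes and does not call mstream.

```octave
% Hypothetical row filter: keeps only the rows whose first value is positive
keep_pos = @(x) x( x(:,1) > 0, : );

block = [ 1 2 ; -1 3 ; 4 5 ];     % input block with 3 rows
out   = keep_pos( block );        % output block with fewer rows

% Input and output blocks differ in their number of rows, so this
% function does not satisfy the requirement implied by same_size = true
assert( size( out, 1 ) < size( block, 1 ) )
```

A function of this kind would presumably need same_size set to false, whereas row-wise functions such as @(x)10*x are compatible with the default.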



#### Example of usage

    % Creating a series of files from a pre-existing matrix
    vals  = reshape( [1:40].', 10, 4 )
    func  = @(x)10*x
    mstream( func, vals, {'t1','t2','t3','t4'}, 3 )

    vals2 = mstream( @(x)x, {'t1','t2','t3','t4'}, [], 3 )

    % Transforming a series of files into another series of files
    func2 = @(x)x/10
    mstream( func2, {'t1','t2','t3','t4'}, {'T1','T2','T3','T4'}, 3 )
    vals3 = mstream( @(x)x, {'T1','T2','T3','T4'}, [], 3 )

    % Functions changing the order of rows are allowed.  However,
    % for such functions the result of the data-transformation depends
    % on the value passed as block_size
    func2_rev = @(x)x(end:-1:1,:)/10
    mstream( func2_rev, {'t1','t2','t3','t4'}, {'T1','T2','T3','T4'}, 3 )
    vals3 = mstream( @(x)x, {'T1','T2','T3','T4'}, [], 3 )

    % Transforming a series of files into another file
    func3 = @(x)sum(x,2)
    mstream( func3, {'T1','T2','T3','T4'}, {'Tall'}, 3 )

    % Reading a file with the classic @fopen -> @fread -> @fclose
    fid   = fopen( 'Tall', 'rb' )
    vals4 = fread( fid, 'double' )
    fclose( fid )

    % Consistency check
    assert( vals4 == func3( vals3    ) )
    assert( vals4 == sum(   vals3, 2 ) )

    % Importing/manipulating a Binary Terrain (Discoe, 2007) file.
    % Obtaining a reference file (Island of Maui - USGS 10m DEM)
    url      = 'http://vterrain.org/repo/elev/maui_1k.zip'
    file     = 'maui_1k.bt'
    download = @(url,file) unix([ 'wget -O - ' url ' | gunzip -c > ' file ])
    % Actually fetching and decompressing the reference file
    download( url, file )

    % Utility function to read the Binary Terrain file size
    function  siz = btsize( fn )
       fid        = fopen( fn, 'rb', 'ieee-le' ); fseek(  fid, 10, 'bof' );
       siz([2 1]) = fread( fid, 2, 'int' );       fclose( fid );
    end

    siz      = btsize( file )
    data     = mstream( @(x)x, {file}, [], 1e5, 'float', 0, 'ieee-le', 256 );
    data     = reshape( data, siz );
    imagesc( data(end:-1:1,:) )
    title( 'Island of Maui - USGS 10m DEM' )

    diff50   = @(x)reshape(                                          ...
       mblk_fun( reshape(x,siz(1),[]), @(x)x-mean(x), 50 ), [], 1    ...
    )
    fil2     = 'diff50.bt';
    % Using a reference input file to fill the output files' header
    mstream( diff50, {file}, {fil2}, siz(1)*50,                      ...
       'float', 0, 'ieee-le', 256, file                              ...
    );
    data     = mstream( @(x)x, {fil2}, [], 1e5, 'float', 0, 'ieee-le', 256 );
    data     = reshape( data, siz );
    imagesc( data(end:-1:1,:) )
    title( 'Island of Maui - USGS 10m DEM: difference from 50x50 mean' )


See also:
mdeal, mbash

Keywords:
data-transformation, streams, blocks

Version: 0.8.4

#### Support

The Mastrave modelling library is committed to providing reusable and general - but also robust and scalable - modules for research modellers dealing with computational science.  You can help the Mastrave project by providing feedback on unexpected behaviours of this module.  Despite all efforts, all of us - either developers or users - (should) know that errors are unavoidable.  However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based.  Reporting a problem that you found using Mastrave may help the developer team to find a possible bug.  Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems.  Thank you for your collaboration.

Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Daniele de Rigo