The module "mstream" of the Mastrave modelling library
Copyright and license notice of the function mstream
Copyright © 2008,2009,2010,2011,2012,2013,2014 Daniele de Rigo
The file mstream.m is part of Mastrave.
Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.
Function declaration
answer = mstream( func , in_streams , out_streams , block_size , precision = [] , skip = [] , arch = [] , header = [] , header_fill = [] , same_size = true )
Description
Utility to apply a data-transformation function func to a set of input streams in_streams and save the result inside a set of output streams out_streams .
The data-transformation function is identified by the function handle func and is expected to receive as input a matrix of block_size rows (or fewer) and n_in columns, where n_in is the number of input streams. The expected output of func is another matrix of block_size rows (or fewer, in accordance with the input matrix) and n_out columns, where n_out is the number of output streams.
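As a minimal sketch of this contract, the following defines a function that maps n_in = 4 input columns to n_out = 2 output columns while preserving the number of rows. The name row_stats is hypothetical and not part of Mastrave.

```octave
% Hypothetical data-transformation function: 4 input columns -> 2 output columns.
% The number of output rows always matches that of the input block.
row_stats = @(x) [ sum( x, 2 ), max( x, [], 2 ) ];

block = reshape( 1:12, 3, 4 );   % a block of 3 rows and n_in = 4 columns
out   = row_stats( block );      % 3 rows and n_out = 2 columns

short_out = row_stats( block(1,:) );   % a shorter (1-row) block is handled too
```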
The n_in input streams can be one of the following:
- the n_in columns of a matrix passed as in_streams argument;
- the n_in files identified by the corresponding file IDs passed
as in_streams argument in the form of a cell-array of file IDs
(which are integer numbers: see @fopen, @fread and @fwrite for
more details);
- the n_in files identified by the corresponding file paths passed
as in_streams argument in the form of a cell-array of strings.
The same alternatives apply to the n_out output streams.
func is iteratively applied to blocks of the input streams and the result is iteratively saved in corresponding blocks of the output streams. block_size identifies the number of rows to be processed per iteration.
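The iteration can be pictured with the following sketch, which works on a plain matrix and uses only core language features. It is an illustration under simplifying assumptions, not the actual implementation of mstream; all variable names are illustrative.

```octave
% Sketch of block-wise processing on a matrix stream (illustrative only).
func       = @(x) 10 * x;
in_matrix  = reshape( (1:40).', 10, 4 );
block_size = 3;

n_rows     = size( in_matrix, 1 );
out_matrix = [];
for first = 1:block_size:n_rows
   last       = min( first + block_size - 1, n_rows );  % the last block may be shorter
   block      = in_matrix( first:last, : );
   out_matrix = [ out_matrix; func( block ) ];          % append the transformed block
end
```

With a function that changes the order of rows, such as @(x)x(end:-1:1,:), the same loop reverses rows only within each block, which is why the result of such functions depends on block_size.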
The optional input arguments precision , skip and arch have the same meaning as the third, fourth and fifth arguments, respectively, of @fread and @fwrite.
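For reference, the sketch below shows where these three arguments sit in direct calls to @fwrite and @fread. It uses only core functions and a temporary file. The values are illustrative.

```octave
% precision, skip and arch are the third, fourth and fifth arguments
% of both fwrite and fread.
fn  = tempname();                                  % temporary file (illustrative)
fid = fopen( fn, 'wb' );
fwrite( fid, [1 2 3], 'uint16', 0, 'ieee-be' );    % 16-bit integers, big-endian
fclose( fid );

fid = fopen( fn, 'rb' );
same_arch  = fread( fid, Inf, 'uint16', 0, 'ieee-be' );  % read with the matching byte order
frewind( fid );
other_arch = fread( fid, Inf, 'uint16', 0, 'ieee-le' );  % same bytes, opposite byte order
fclose( fid );
delete( fn );
```

Reading with the wrong arch does not raise an error: it silently yields byte-swapped values, so the byte order of the file format has to be known in advance.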
The optional input argument header specifies the number of bytes to skip before reading (from in_streams ) or writing (to out_streams ) the first block of values. This is useful for manipulating binary file formats in which the data are prefixed with an initial set of bytes (of known size) used as metadata (header). For example, the Binary Terrain file format (Discoe, 2007) uses a fixed header of 256 bytes, followed by single-precision binary data.
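The effect of skipping a header can be sketched with core functions only. The file below is a made-up stand-in (a 256-byte placeholder header followed by three single-precision values), not a valid Binary Terrain file.

```octave
% Sketch: a file made of a fixed-size header followed by single-precision data.
header_bytes = 256;
values       = single( [1.5; 2.5; 3.5] );

fn  = tempname();                                  % temporary file (illustrative)
fid = fopen( fn, 'wb' );
fwrite( fid, repmat( uint8(' '), header_bytes, 1 ), 'uint8' );  % placeholder header
fwrite( fid, values, 'float', 0, 'ieee-le' );
fclose( fid );

fid = fopen( fn, 'rb' );
fseek( fid, header_bytes, 'bof' );                 % skip the header once, before the first block
data = fread( fid, Inf, 'float', 0, 'ieee-le' );
fclose( fid );
delete( fn );
```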
When files are written as requested with out_streams , the optional
argument header_fill names a reference file whose header is copied
into the files defined in out_streams .
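Copying a header from a reference file can be sketched as follows, using core functions only. The file names, the 8-byte header size and its content are illustrative assumptions. How mstream performs the copy internally is not shown here.

```octave
% Sketch: copy the header of a reference file into a new output file.
header_bytes = 8;
ref_fn = tempname();
out_fn = tempname();

fid = fopen( ref_fn, 'wb' );
fwrite( fid, uint8('METADATA'), 'uint8' );         % 8-byte reference header
fwrite( fid, [1 2 3], 'double' );
fclose( fid );

fid = fopen( ref_fn, 'rb' );
hdr = fread( fid, header_bytes, 'uint8=>uint8' );  % read only the header bytes
fclose( fid );

fid = fopen( out_fn, 'wb' );
fwrite( fid, hdr, 'uint8' );                       % write the copied header first
fwrite( fid, [10 20 30], 'double' );               % then the transformed data
fclose( fid );

fid = fopen( out_fn, 'rb' );
hdr_copy = fread( fid, header_bytes, 'uint8=>uint8' );
copied   = char( hdr_copy.' );
result   = fread( fid, Inf, 'double' );
fclose( fid );
delete( ref_fn );
delete( out_fn );
```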
References
Discoe, B.: The BT (Binary Terrain) File Format, 2007
http://www.vterrain.org/Implementation/Formats/BT.html
Input arguments
func ::function_handle::
    Function handle identifying the data-transformation function to be applied to in_streams to transform them into out_streams .

in_streams ::matrix|cellnumstring-1::
    Input streams. If a cell-array of strings (file paths) is passed, the corresponding files will be opened in read-mode and closed at the end of the data-transformation.

out_streams ::matrix|cellnumstring-1::
    Output streams. If a cell-array of strings (file paths) is passed, the corresponding files will be opened in write-mode and closed at the end of the data-transformation.

block_size ::scalar_index::
    Number of rows to be processed within each iteration. See the description of the second argument of @fread.

precision ::string::
    Type of data to be read and written. If in_streams (or out_streams ) is a matrix, precision has no effect on its manipulation. See the description of the third argument of the @fread and @fwrite functions. If empty, the default value is 'double'. If omitted, the default value is [] (empty).

skip ::scalar_numel::
    Number of bytes to skip before reading (from in_streams ) or writing (to out_streams ) each value. If in_streams (or out_streams ) is a matrix, skip has no effect on its manipulation. See the description of the fourth argument of the @fread and @fwrite functions. If empty, the default value is 0. If omitted, the default value is [] (empty).

arch ::string::
    Data format of the files (order of bytes). If in_streams (or out_streams ) is a matrix, arch has no effect on its manipulation. See the description of the fifth argument of the @fread and @fwrite functions. If empty, the default value is 'native'. If omitted, the default value is [] (empty).

header ::scalar_numel::
    Number of bytes to skip before reading (from in_streams ) or writing (to out_streams ) the first block of values. Passing this optional argument makes it possible to manipulate binary file formats even when the data are prefixed with an initial set of bytes, of known size, used as metadata (header). If in_streams (or out_streams ) is a matrix, header has no effect on its manipulation. If empty, the default value is 0. If omitted, the default value is [] (empty).

header_fill ::string::
    Reference file whose header is copied as header into the files defined in out_streams . If empty or omitted, the headers in out_streams are filled with spaces.

same_size ::scalar,logical::
    If set to true, this flag requires the size of the output blocks to be the same as that of the input ones. In particular, the number of rows of the input and output matrix is expected to be the same in each block. Different blocks may still have a different number of rows. If omitted, the default value is true.
Example of usage
% Creating a series of files from a pre-existing matrix
vals = reshape( [1:40].', 10, 4 )
func = @(x)10*x
mstream( func, vals, {'t1','t2','t3','t4'}, 3 )

% Loading a series of files as a matrix
vals2 = mstream( @(x)x, {'t1','t2','t3','t4'}, [], 3 )

% Transforming a series of files in another series of files
func2 = @(x)x/10
mstream( func2, {'t1','t2','t3','t4'}, {'T1','T2','T3','T4'}, 3 )
vals3 = mstream( @(x)x, {'T1','T2','T3','T4'}, [], 3 )

% Functions changing the order of rows are allowed. However,
% for such functions the result of the data-transformation depends
% on the value passed as block_size
func2_rev = @(x)x(end:-1:1,:)/10
mstream( func2_rev, {'t1','t2','t3','t4'}, {'T1','T2','T3','T4'}, 3 )
vals3 = mstream( @(x)x, {'T1','T2','T3','T4'}, [], 3 )

% Transforming a series of files in another file
func3 = @(x)sum(x,2)
mstream( func3, {'T1','T2','T3','T4'}, {'Tall'}, 3 )

% Reading a file with the classic @fopen -> @fread -> @fclose
fid = fopen( 'Tall', 'rb' )
vals4 = fread( fid, 'double' )
fclose( fid )

% Consistency check
assert( vals4 == func3( vals3 ) )
assert( vals4 == sum( vals3, 2 ) )

% Advanced usage
% Importing/manipulating a Binary Terrain (Discoe, 2007) file.

% Obtaining a reference file (Island of Maui - USGS 10m DEM)
url = 'http://vterrain.org/repo/elev/maui_1k.zip'
file = 'maui_1k.bt'
command = [ 'wget -O - ' url '| gunzip -c > ' file ]
download = @(url,file) unix([ 'wget -O - ' url '| gunzip -c > ' file ])
assert( download(url,file) == 0 )

% Utility function to read the Binary Terrain file size
function siz = btsize( fn )
   fid = fopen( fn, 'rb', 'ieee-le' );
   fseek( fid, 10, 'bof' );
   siz([2 1]) = fread( fid, 2, 'int' );
   fclose( fid );
end

siz = btsize( file )
data = mstream( @(x)x, {file}, [], 1e5, 'float', 0, 'ieee-le', 256 );
data = reshape( data, siz );
imagesc( data(end:-1:1,:) )
title( 'Island of Maui - USGS 10m DEM' )

diff50 = @(x)reshape( ...
   mblk_fun( reshape(x,siz(1),[]), @(x)x-mean(x), 50 ), [], 1 ...
)
fil2 = 'diff50.bt';

% Using a reference input file to fill the output files' header
mstream( diff50, {file}, {fil2}, siz(1)*50, ...
   'float', 0, 'ieee-le', 256, file ...
);
data = mstream( @(x)x, {fil2}, [], 1e5, 'float', 0, 'ieee-le', 256 );
data = reshape( data, siz );
imagesc( data(end:-1:1,:) )
title( 'Island of Maui - USGS 10m DEM: difference from 50x50 mean' )
See also: mdeal, mbash
Keywords: data-transformation, streams, blocks
Version: 0.8.4
Support
The Mastrave modelling library is committed to providing reusable and general, but also robust and scalable, modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedback on unexpected behaviours of this module. Despite all efforts, all of us, whether developers or users, (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.