## The module "msplit" of the Mastrave modelling library

Daniele de Rigo

The file msplit.m is part of Mastrave.

Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.

#### Function declaration

```
[tokens, start_id, end_id, id, token_id] =
   msplit( array                 ,
           separator             ,
           option     = 'single' ,
           map        = []       )
```



#### Description

Utility for splitting a string or a numerical array into a cell array, using separator as the delimiter between tokens. The string is transposed before splitting, in order to preserve the correct order for multiline strings (please take this convention into account when passing a non-string array). start_id and end_id contain the original start and end positions of each token (substring or sequence of contiguous elements) returned in tokens. id is the set of indices of all array elements that are not separators. token_id has the same size as id and contains the index of the token to which each non-separator element of array is assigned. A sequence of more than one contiguous separator in array is processed differently depending on the value of the option argument. If option is omitted or has the value 'single', each internal separator of the sequence delimits an empty token; if option has the value 'multiple', contiguous separators are considered a single separator.
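
The relationship among the five outputs can be illustrated with a minimal sketch that uses only core Octave/MATLAB functions. It is not the msplit implementation: it assumes a single-character separator, and the convention chosen here for the positions of an empty token (an end position one less than its start position) is an assumption of the sketch, not something stated by the module. The variable names `is_sep`, `cut`, `n_before`, `keep` and `tokens_multi` are hypothetical.

```octave
% Sketch only: core functions, not the msplit implementation.
% Assumption: the separator is a single character.
s   = 'ab,,cd';
sep = ',';

is_sep = ( s == sep );                          % mask of separator positions
cut    = [ 0, find( is_sep ), numel( s ) + 1 ]; % virtual cuts around each token

% 'single'-like behaviour: every separator closes a token, possibly an empty one.
start_id = cut( 1:end-1 ) + 1;                  % [1 4 5]
end_id   = cut( 2:end   ) - 1;                  % [2 3 6] (assumed convention)
tokens   = arrayfun( @(a,b) s( a:b ), start_id, end_id, ...
                     'UniformOutput', false );  % 'ab', empty, 'cd'

% Indices of the non-separator elements and the token each one belongs to.
id       = find( ~is_sep );                     % [1 2 5 6]
n_before = cumsum( is_sep );                    % separators seen up to each position
token_id = n_before( id ) + 1;                  % [1 1 3 3]

% 'multiple'-like behaviour: a run of separators yields no empty token.
keep         = end_id >= start_id;
tokens_multi = tokens( keep );                  % {'ab','cd'}
```

In the 'single'-like case the two adjacent commas produce an empty second token, so the characters of 'cd' are assigned to token 3. In the 'multiple'-like case the token indices would have to be renumbered after dropping the empty tokens, which the sketch leaves out.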

#### Input arguments


array           ::realstring::
Array of real numbers or of characters on which
the  separator  has to be used to locate contiguous
sets of elements whose indices are used to split  map .

separator        ::realstring,nonempty::
Element used as token-separator.

option           ::string::
Sets the behaviour in case of sequences of separators.
Valid options are:

option     │      meaning
──────────────┼──────────────────────────────────
'single'     │ Each separator of the sequence
'--single'   │ splits an empty token (default).
──────────────┼──────────────────────────────────
'multiple'   │ Consider the sequence as a
'--multiple' │ single separator.

map             ::realstring::
Array of real numbers or of characters to be split
using the indices of the contiguous sets of elements
of  array  not containing  separator .
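
The role of map can be sketched with core Octave/MATLAB functions alone. The fragment below locates the runs of non-separator elements in a logical array and uses their positions to cut a second array. It is a hedged illustration of the idea, not the msplit implementation; under the reading given above it would correspond to the 'multiple' behaviour with 0 as separator, but that correspondence has not been verified against the module. The names `d` and `groups` are hypothetical.

```octave
% Sketch only: core functions, not the msplit implementation.
is_group = logical( [ 1 1 0 0 1 0 1 1 1 ] );   % 0 plays the role of the separator
map      = 1:numel( is_group );                % array to be split

% A rise (+1) marks the first element of a run, a fall (-1) the element after it.
d        = diff( [ 0, is_group, 0 ] );
start_id = find( d ==  1 );                    % [1 5 7]
end_id   = find( d == -1 ) - 1;                % [2 5 9]

% Cut map at the positions found in is_group.
groups = arrayfun( @(a,b) map( a:b ), start_id, end_id, ...
                   'UniformOutput', false );   % { [1 2], 5, [7 8 9] }
```

Any array with the same number of elements as is_group could take the place of map here; the positions come from one array and the returned values from the other.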



#### Example of usage


```octave
% Basic usage
t =  '[tokens,start_id, end_id]=   msplit(  array,separator, option )'  ;
msplit( t ,  ' '  )
msplit( t ,  'ar'  )
msplit( t ,  'ara'  )

elems                 = msplit( t ,  '='  )
[ which_punct , id ]  = mfind( ' (,)'  , elems{ 2 } );
elems{ 2 }( id )      =  ' '  ;

msplit( elems{ 2 } ,  ' '  )
token = msplit( elems{ 2 } ,  ' '  ,  '--multiple'  )

% Passing a numerical array
m = [ 7 8 3 5 ; 2 4 8 3 ; 0 1 3 8 ]
msplit( m , 6 )
msplit( m , 8 )
msplit( m , [8 3] )
msplit( m , [8 3 0] )

% Passing a map-array
is_group = rand( 1, 20 ) > .5
msplit( is_group, 0, '--single'  , 1:20 )
msplit( is_group, 0, '--multiple', 1:20 )

% Initial and final positions of the tokens
tt =  '1234 6   01      8'
[c,in,fin] = msplit( tt,  ' '  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end
[c,in,fin] = msplit( tt,  '  '  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end
[c,in,fin] = msplit( tt,  '   '  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end

[c,in,fin] = msplit( tt,  ' '  ,  '--multiple'  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end
[c,in,fin] = msplit( tt,  '  '  ,  '--multiple'  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end
[c,in,fin] = msplit( tt,  '   '  ,  '--multiple'  )
for i=1:numel(c) fprintf(1,'"%s"\t"%s"\n',c{i},tt(in(i):fin(i))); end

% Indices of the positions of valid elements
tt   = 'oooooofooo bar booooooazoooooo'
[cells, start_id, end_id, id, cell_id]=msplit( tt , 'ooo' )
[cells, start_id, end_id, id, cell_id]=msplit( tt , 'ooo', '--multiple' )
```

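The numerical-array example passes a matrix, so the transposition convention mentioned in the description matters there. The fragment below uses only core Octave/MATLAB indexing to show, as a hedged illustration, why transposing first yields the row-by-row reading order; it does not call msplit, and how the module linearises its input internally is an assumption here. The names `col_major`, `row_major`, `txt` and `tt_rows` are hypothetical.

```octave
% Sketch only: core indexing, not the msplit implementation.
m = [ 7 8 3 5 ; 2 4 8 3 ; 0 1 3 8 ];

% Plain linearisation walks down the columns.
col_major = m( : ).';        % [7 2 0 8 4 1 3 8 3 5 3 8]

% Transposing first makes the same linearisation walk along the rows.
mt        = m.';
row_major = mt( : ).';       % [7 8 3 5 2 4 8 3 0 1 3 8]

% The same holds for a multiline string stored as a character matrix.
txt     = [ 'ab' ; 'cd' ];
tt_rows = txt.';
tt_rows = tt_rows( : ).';    % 'abcd', whereas txt( : ).' is 'acbd'
```

With the row-by-row order, the sequence [8 3] occurs at the boundary-free positions 2-3 and 7-8 of row_major, which is the order a reader of the matrix would expect.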

Version: 0.4.5

#### Support

The Mastrave modelling library is committed to providing reusable and general, but also robust and scalable, modules for research modellers dealing with computational science.  You can help the Mastrave project by providing feedback on unexpected behaviours of this module.  Despite all efforts, all of us, whether developers or users, (should) know that errors are unavoidable.  However, the free software paradigm successfully highlights that scientific knowledge freedom also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based.  Reporting a problem that you found using Mastrave may help the developer team to find a possible bug.  Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems.  Thank you for your collaboration.

Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Daniele de Rigo