The module "train_pca" of the Mastrave modelling library
Copyright © 2007,2008,2009,2010 Daniele de Rigo
The file train_pca.m is part of Mastrave.
Mastrave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with Mastrave. If not, see http://www.gnu.org/licenses/.
[pc, mse, w, val2pc, pc2val] = train_pca( values )
Training engine to model a given numeric matrix values by applying
principal component analysis (PCA).
values is composed of N row vectors representing N points
in an n-dimensional space (so that the n adjacent columns of values
refer to the different coordinates).
The principal components pc are returned along with the coefficients
pc2val, which transform subsets of the principal components into the
corresponding approximations of the original values.
The returned coefficients val2pc enable the inverse transformation,
from values to pc. The i-th element of mse is the mean square
error associated with the use of the first i principal components.
w is the vector of weights associated with each principal component,
scaled such that prod( w ) == 1.
w is proportional to the diagonal of the S matrix returned by
[U,S,V] = svd( values ) such that
values == U * S * V'
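The relations described above can be illustrated with a minimal sketch built only on the built-in svd. This is not the Mastrave source: it assumes an uncentered decomposition (consistent with values == U * S * V'), the economy-size form of svd, and a geometric-mean rescaling to obtain prod( w ) == 1. The small matrix values is made up for illustration.

```octave
% Hypothetical sketch, not the train_pca implementation.
values = [ 2 0 1 ; 0 1 3 ; 1 1 0 ; 4 2 1 ; 3 0 2 ];  % 5 points in 3-D

[ U , S , V ] = svd( values , 'econ' );   % values == U * S * V'
s      = diag( S );                       % singular values, descending
pc     = U * S;                           % principal components
val2pc = V;                               % pc     == values * val2pc
pc2val = V';                              % values == pc * pc2val

% Assumed normalisation: divide by the geometric mean so prod( w ) == 1
w = s / exp( mean( log( s ) ) );

% mse(k): mean square error when only the first k components are used
n   = numel( s );
mse = zeros( n , 1 );
for k = 1:n
   approx = pc(:,1:k) * pc2val(1:k,:);
   mse(k) = mean( ( values(:) - approx(:) ) .^ 2 );
end
```

Under these assumptions mse(k) equals the sum of the discarded squared singular values divided by numel( values ), so mse is non-increasing and its last element is zero up to rounding.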
values ::numeric,matrix:: Numeric matrix, each row of which represents a point in an n-dimensional space (so that the n adjacent columns of values are expected to refer to the different coordinates).
% Small example on how to use part of the available data M for training,
% to obtain both a pca decomposition pc_train for the training set and
% the coefficients v2p to validate the pca decomposition with the
% validation set of data.
N_train = 20, N_valid = 10, N = N_train + N_valid
rnd_id = randperm( N );
[ x , y ] = mdeal( rand( N , 2 ) );
M = [ sin(x) exp(y)-x x.*log(y+1) y cos(y).*x ];
[ M_train , M_valid ] = mdeal( M(rnd_id,:), [ N_train N_valid ] , 1 );
[ id_train , id_valid ] = mdeal( rnd_id(:), [ N_train N_valid ] , 1 );
[ pc_train , mse_train , w , v2p , p2v ] = train_pca( M_train );
isequal( pc_train , M_train * v2p )

% Run an algorithm equivalent (only less efficient) to that used
% by @train_pca to estimate for each k the mean square error which
% is associated with the use of the first k principal components to
% reconstruct values.
n = numel( mse_train )
o = ones( 1 , n );
M_aprox = zeros( [ size(M_train) , n ] );
for k=1:n
   M_aprox(:,:,k) = pc_train(:,1:k) * p2v(1:k,:);
end
mse_t = zeros( n , 1 );
mse_t(:) = mean( reshape( (M_train(:,:,o)-M_aprox).^2 , [] , 1 , n ) )
[ mse_train mse_t ]

% Finally, compute the mean square errors associated with the
% validation set.
pc_valid = M_valid * v2p;
M_aprox = zeros( [ size(M_valid) , n ] );
for k=1:n
   M_aprox(:,:,k) = pc_valid(:,1:k) * p2v(1:k,:);
end
mse_v = zeros( n , 1 );
mse_v(:) = mean( reshape( (M_valid(:,:,o)-M_aprox).^2 , [] , 1 , n ) )
[ mse_train mse_t mse_v ]
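For readers without Mastrave at hand, the validation step of the example can be approximated with built-in functions only. The sketch below is an assumption-laden stand-in: plain row indexing replaces mdeal, and an economy-size svd of the training set replaces train_pca (taking v2p = V and p2v = V', which may differ from what train_pca actually returns, for instance in sign or scaling).

```octave
% Hypothetical stand-in for the example above, using built-ins only.
N_train = 20;  N_valid = 10;  N = N_train + N_valid;
x = rand( N , 1 );
y = rand( N , 1 );
M = [ sin(x), exp(y)-x, x.*log(y+1), y, cos(y).*x ];

% Random split of the rows into training and validation sets
rnd_id  = randperm( N );
M_train = M( rnd_id(1:N_train)     , : );
M_valid = M( rnd_id(N_train+1:end) , : );

% Assumed replacement for train_pca: coefficients from the training SVD
[ U , S , V ] = svd( M_train , 'econ' );
v2p = V;
p2v = V';

% Project the validation set and measure the reconstruction error
% obtained with the first k components, for each k
pc_valid  = M_valid * v2p;
n         = size( M , 2 );
mse_valid = zeros( n , 1 );
for k = 1:n
   resid        = M_valid - pc_valid(:,1:k) * p2v(1:k,:);
   mse_valid(k) = mean( resid(:) .^ 2 );
end
```

Because the training set has more rows than columns, V is a square orthogonal matrix here, so using all n components reconstructs the validation set up to rounding; the informative part of mse_valid is how quickly it decays for small k.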
Memory requirements:
   max( ...
      O( numel( values ) * min( size( values ) ) ) , ...
      O( @svd ) ...
   )

See also:
   screed

Keywords:
   training engine, modeling, space transformation, principal component analysis

Version: 0.4.8
The Mastrave modelling library is committed to providing reusable and general, but also robust and scalable, modules for research modellers dealing with computational science. You can help the Mastrave project by providing feedback on unexpected behaviours of this module. Despite all efforts, all of us, whether developers or users, (should) know that errors are unavoidable. However, the free software paradigm successfully highlights that the freedom of scientific knowledge also implies an impressive opportunity to collectively evolve the tools and ideas upon which our daily work is based. Reporting a problem that you found using Mastrave may help the developer team to find a possible bug. Please be aware that Mastrave is entirely based on voluntary efforts: in order for your help to be as effective as possible, please read carefully the section on reporting problems. Thank you for your collaboration.