How to pass a new sequence to a trained neural network in order to predict protein structure from sequence?

Hello. A fellow student wrote a MATLAB function that uses a series of amino acid (AA) sequences to train a neural network to predict protein secondary structure from the AA sequence:
function net= ssp_train( aa, ss, halfwindow, net )
% Train a neural network on the given amino acid sequence aa and secondary structure sequence ss.
% If halfwindow is not provided, use halfwindow=1.
if ~exist('halfwindow','var')
halfwindow = 1;
end
% If net is not provided, create a neural network with 5 hidden units.
if ~exist('net', 'var')
num_hidden_units = 5;
% If net is provided but is an integer, create a neural network with that many hidden units.
% All integers are divisible by 1. So a good test for non-integer could use the mod function:
elseif mod(net,1)==0
num_hidden_units = abs(net);
else
error('Please provide hidden unit number as an integer!');
% Alternatively, replace the elseif/else branches above with: else num_hidden_units = abs(round(net));
end
% arg represents row vector of one or more hidden layer sizes
net = feedforwardnet(num_hidden_units);
% Set net.trainParam.showWindow=false to prevent the GUI.
net.trainParam.showWindow=false;
% Change net.divideParam so that
% 80% of the data is used for training,
% 20% is used for validation, and 0% is used for testing.
net.divideParam.trainRatio = 0.8;
net.divideParam.valRatio = 0.2;
net.divideParam.testRatio = 0.0;
W = 2*halfwindow + 1;
% Figure 11.21: A number of nodes are present in the input layer,
% which can be fired by certain types of residues, e.g., the D (Asp).
% There are often 20 nodes per residue, with just one having the value 1 (which means it is activated).
% The nodes that are activated then pass a signal to the hidden layers, where
% conditional and additive calculations are performed on the information.
%
% Code adapted from http://www.mathworks.com/help/nnet/ug/create-neural-network-object.html
%=== binarization of the input matrix
seqInt = double(aa2int(aa));
seqInt(seqInt>20)=0; % Any amino acid with integer representation >20 is represented with all zeros
winP = hankel(seqInt(1:W),seqInt(W:end)); % concurrent inputs (sliding windows)
P = double(kron(winP,ones(20,1)) == kron(ones(size(winP)),(1:20)')); % TBD: Get Input P
%=== binarization of the target matrix
ssInt = zeros(size(ss));
ssInt(ss=='H') = 1;
ssInt(ss=='E') = 2;
ssInt(ss=='T') = 3;
winT = hankel(ssInt(1:W),ssInt(W:end)); % concurrent targets (sliding windows)
T = double(kron(winT,ones(3,1)) == kron(ones(size(winT)),(1:3)'));
%%=== train neural network with input and target matrices
net = train( net, P, T );
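As a small illustration of the windowing and binarization steps above, here is a toy sketch. It uses a made-up integer sequence in place of the aa2int output, so it does not need the Bioinformatics Toolbox. The values and the variable numWindows are assumptions for illustration only.

```matlab
% Toy stand-in for double(aa2int(aa)); real values would range over 1..20.
seqInt = [1 2 3 4 5];
halfwindow = 1;
W = 2*halfwindow + 1;                       % window width, here 3

% Each column of winP is one sliding window over the sequence.
winP = hankel(seqInt(1:W), seqInt(W:end));  % [1 2 3; 2 3 4; 3 4 5]
numWindows = size(winP, 2);                 % numel(seqInt) - W + 1

% One-hot encode: 20 rows per window position, so P is (20*W)-by-numWindows.
P = double(kron(winP, ones(20,1)) == kron(ones(size(winP)), (1:20)'));

disp(size(P));     % 60 rows, 3 columns
disp(sum(P, 1));   % one active row per window position, so W ones per column
```

Each column of P is therefore one window, and the network sees one window per input sample.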
My problem is this: now that the neural network has been created and trained (it is represented by the network variable 'net', I think?), it is unclear to me how to feed in a query AA sequence so that the network analyzes it and returns the sequence's predicted secondary structure as output.
Any help would be much appreciated. Thanks so much and all the best.
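One possible way to query the trained network is sketched below. It assumes the query sequence is encoded exactly as in ssp_train and that the same halfwindow is used. The function name ssp_predict, the choice to read out only the center position of each window, and the 'HET' label order are assumptions made for illustration, not part of the original code.

```matlab
function ssPred = ssp_predict(net, aa, halfwindow)
% Hypothetical helper (not from the original post): predict a secondary
% structure string for the query sequence aa with a network from ssp_train.
if ~exist('halfwindow', 'var')
    halfwindow = 1;                 % must match the value used for training
end
W = 2*halfwindow + 1;

% Same input encoding as in ssp_train (aa2int needs Bioinformatics Toolbox).
seqInt = double(aa2int(aa));
seqInt(seqInt > 20) = 0;
winP = hankel(seqInt(1:W), seqInt(W:end));
P = double(kron(winP, ones(20,1)) == kron(ones(size(winP)), (1:20)'));

% Calling the network object evaluates it; Y is (3*W)-by-numWindows.
Y = net(P);

% Assumption: use only the three outputs for the center residue of each
% window, following the row layout of T in ssp_train.
centerRows = 3*halfwindow + (1:3);
[~, idx] = max(Y(centerRows, :), [], 1);

labels = 'HET';                     % 1 = H, 2 = E, 3 = T, as in ssInt
ssPred = labels(idx);
end
```

A call such as ssPred = ssp_predict(net, aa, halfwindow) would then return one letter per window, covering residues halfwindow+1 through numel(aa)-halfwindow. Because the training targets encode every other structure class as all zeros, this sketch always returns H, E, or T; a threshold on the maximum output would be needed to report "none of these".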
3 Comments
Greg Heath on 20 Mar 2014
Do not use the term 'net' to indicate the number of hidden nodes. Why not just use H or numhidden?
Greg Heath on 20 Mar 2014 (edited 20 Mar 2014)
Please give two non-trivial examples of aa and the corresponding ss.

Answers (0)