How to use more than one training set for training a NARX neural network?
Robert
on 4 Jul 2014
Commented: Dominic
on 30 Jul 2014
I have a question concerning the training of NARX neural networks using the Neural Network Toolbox. The task is to obtain a neural black box model of a time-dependent system and, as such, to predict time series.
Now I want to train my NARX network not with just one training set, i.e. one time series, but with several. This matters in my case because the network has to capture the real system's behavior at the beginning of a time series. If I use only one training series, the network gets only one opportunity to learn that behavior.
Again, to clarify: Can I somehow join several time series into one and use it for training, or can I do the training with several time series?
If an example helps, consider this: say I want to train the NARX network to approximate a time series y(t) = t^2 + u(t). It does not make full sense to create only one training data set with a single u varying over time, say u(t) = t: since the system itself is time-dependent, how could the network ever learn what would happen, e.g. at the beginning, if I used u(t) = sqrt(t) and then predicted the corresponding time series? Hence, I would like to create several training data sets, say with u(t) = t, u(t) = sqrt(t), and u(t) = t^2, so that the network can interpolate between them. The final question is: how can I train the network using all three training data sets at once?
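Building the multiple training series can be sketched in Python (a stand-in for the MATLAB workflow; `make_series`, the sample count of 50, and the step size are illustrative assumptions, not toolbox code). In the Neural Network Toolbox itself, separate sequences are typically combined with `catsamples` before training.

```python
import math

def make_series(u, n=50, dt=0.1):
    """Generate one training sequence y(t) = t^2 + u(t), sampled at t = k*dt."""
    ts = [k * dt for k in range(n)]
    return [t**2 + u(t) for t in ts]

# Three separate training sequences, one per choice of u(t).
# They stay separate so each contributes its own start-of-series
# transient for the network to learn from.
series = [
    make_series(lambda t: t),             # u(t) = t
    make_series(lambda t: math.sqrt(t)),  # u(t) = sqrt(t)
    make_series(lambda t: t**2),          # u(t) = t^2
]
```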
I hope I have described my issue in enough detail.
Thanks a lot in advance! Robert
0 comments
Accepted Answer
Greg Heath
on 5 Jul 2014
1. The system is not time-dependent because none of the weights depend on time.
2. Polynomial functions, sinusoids and their products satisfy time-invariant difference equations. Therefore they can be represented by time-invariant nets.
3. Consider
y(t) = (a*t^2 + b*t + c)*( d*cos(t) + e*sin(t))
It is not difficult to show that y(t) can be determined knowing y(t-dt) and y(t-2*dt).
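For the pure sinusoidal factor d*cos(t) + e*sin(t), the lagged-sample claim is easy to verify numerically: the samples satisfy y(t) = c*y(t-dt) - y(t-2*dt) with a constant coefficient c = 2*cos(dt) that does not depend on t. A minimal Python check (the step size dt and the amplitudes are arbitrary choices, not from the thread):

```python
import math

dt = 0.1
d, e = 1.5, -0.7  # arbitrary amplitudes for the sinusoidal factor
y = [d * math.cos(k * dt) + e * math.sin(k * dt) for k in range(200)]

# Time-invariant two-lag recurrence: y(t) = c*y(t-dt) - y(t-2*dt),
# with a constant coefficient that does not depend on t.
c = 2 * math.cos(dt)
max_err = max(abs(y[k] - (c * y[k-1] - y[k-2])) for k in range(2, 200))
```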
4. My approach
a. Subtract a polynomial fit.
b. Standardize to zero-mean/unit-variance and delete or modify outliers.
c. Find the statistically significant lags of the target auto-correlation and target-input cross-correlation functions.
d. Use the lags to fit Ntrials candidate narx models for each value of hidden nodes in the range Hmin:dH:Hmax.
e. Choose the designs with the best validation set performance.
f. Obtain unbiased estimates of performance and confidence levels on unseen data from the test set performances.
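Steps a–c above can be sketched as follows (a Python/NumPy sketch with a toy series, not Greg's code; the 1.96/sqrt(N) cutoff is the usual large-sample 95% significance band, an assumption not spelled out in the answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: quadratic trend + oscillation + noise.
t = np.arange(400)
y = 0.01 * t**2 + 5 * np.sin(0.3 * t) + rng.normal(0, 0.5, t.size)

# a. Subtract a polynomial fit (degree 2 here; choose by inspection).
trend = np.polyval(np.polyfit(t, y, 2), t)
r = y - trend

# b. Standardize to zero mean / unit variance.
z = (r - r.mean()) / r.std()

# c. Significant lags of the autocorrelation: keep lags whose
#    |acf| exceeds the approximate 95% band 1.96/sqrt(N).
N = z.size
acf = np.correlate(z, z, mode="full")[N - 1:] / (N * z.var() + 1e-12)
threshold = 1.96 / np.sqrt(N)
sig_lags = np.flatnonzero(np.abs(acf) > threshold)
```

The same band is commonly applied to the target–input cross-correlation to pick candidate input delays.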
Hope this helps.
Thank you for formally accepting my answer
Greg
3 comments
Dominic
on 30 Jul 2014
Dear Greg,
Your approach seems straightforward, although I must admit I have no idea how to do step c. I am working with a 6-dimensional input, stored as an array of 200 matrices, each with a couple of thousand timesteps.
When I use the crosscorr function as you mentioned in one of your other posts, I need to set the number of lags to 500 in order to see any kind of peak. That seems to be quite a high number.
So far I have used the basic configuration with 10 hidden nodes and 2 delays. It seemed to work quite well on the training data, but the performance on unseen data was really bad.
Let me check whether I understood the rest of your approach.
a. Subtract a polynomial fit. — I guess the dimension has no correlation with the network itself, does it?
b. Standardize to zero-mean/unit-variance and delete or modify outliers. — Standardize the output from a)?
c. Find the statistically significant lags of the target auto-correlation and target-input cross-correlation functions. — Use the data from b) with crosscorr, but which are my significant lags, and how do I get the threshold?
d. Use the lags to fit Ntrials candidate narx models for each value of hidden nodes in the range Hmin:dH:Hmax. — I guess it is the number of lags you are speaking of, isn't it? And I use them as the delay configuration of the network?
e. Choose the designs with the best validation set performance.
f. Obtain unbiased estimates of performance and confidence levels on unseen data from the test set performances.
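The threshold question in step c can be sketched for the cross-correlation case (a Python/NumPy sketch with synthetic data, not thread code; the 1.96/sqrt(N) band is the standard large-sample 95% limit assuming roughly white residuals, a stand-in for crosscorr's built-in bounds):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy input u and target y that lags u by 3 steps (plus noise),
# standing in for one input/target channel pair.
N = 2000
u = rng.normal(0, 1, N)
y = np.roll(u, 3) + 0.1 * rng.normal(0, 1, N)
y[:3] = 0.0  # discard the wrap-around samples from roll

# Normalized cross-correlation at non-negative lags 0..maxlag.
uz = (u - u.mean()) / u.std()
yz = (y - y.mean()) / y.std()
maxlag = 20
xcf = np.array([np.dot(yz[k:], uz[:N - k]) / N for k in range(maxlag + 1)])

# Approximate 95% significance band; lags exceeding it are
# candidate input delays for the NARX model.
threshold = 1.96 / np.sqrt(N)
sig_lags = np.flatnonzero(np.abs(xcf) > threshold)
```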
Thank you for your answers.
Kind regards
Dominic
More Answers (1)
Greg Heath
on 6 Jul 2014
You are not going to be able to do it that way.
1. The net has to recognize which waveform is the input.
a. How many waveforms do you have?
b. How long are they?
2. The weights, not to mention the significant delays, for each input will be different.
I have designed two-stage classifiers where the first stage determines which of several parallel second-stage classifiers will complete the classification.
In your case, you could design a first stage classifier to determine which waveform is being input. Its output will then be directed to a second stage containing several parallel nets where each is designed for one type of input.
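The two-stage idea can be sketched as plain dispatch (a Python sketch; `classify_waveform` and the specialist lambdas are hypothetical placeholders for trained nets, with a toy curvature rule standing in for the stage-1 classifier):

```python
def classify_waveform(u):
    """Stage 1 placeholder: decide which input waveform family u belongs to."""
    # Toy rule for illustration: inspect the mean second difference.
    d2 = [u[i + 1] - 2 * u[i] + u[i - 1] for i in range(1, len(u) - 1)]
    mean_d2 = sum(d2) / len(d2)
    if mean_d2 > 1e-3:
        return "quadratic"   # u(t) = t^2 curves upward
    if mean_d2 < -1e-3:
        return "sqrt"        # u(t) = sqrt(t) curves downward
    return "linear"          # u(t) = t is (nearly) straight

# Stage 2: one specialist predictor per waveform family
# (placeholder functions, not trained networks).
specialists = {
    "linear":    lambda u: [x * 2 for x in u],
    "sqrt":      lambda u: [x + 1 for x in u],
    "quadratic": lambda u: [x - 1 for x in u],
}

def predict(u):
    """Route the input through stage 1, then the matching stage-2 net."""
    return specialists[classify_waveform(u)](u)
```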
Hope this helps.
Thank you for formally accepting my answer
Greg