Using Probabilistic Finite Automata to Simulate Hourly Series of Global Radiation
M. Sidrach-de-Cardona
Shah Jayesh
Valentino Crespi
Data Set.
Probabilistic Finite Automata.
Global Radiation.
Generalization of Model.
Questions, Comments?
• Model to generate synthetic series of hourly exposure of
global radiation.
• Based on a subclass of Probabilistic Finite Automata (PFA)
for variable-length Markov processes.
• Checks the “variable memory” of cloudiness.
• Traditionally, the analysis is based on stochastic process theory.
• Negative values that appear in the series should be eliminated.
• PFA: a mathematical model developed in Artificial Intelligence
and Machine Learning.
Why a Machine Learning Model?
• Useful for studying systems in which the goal concept presents
probabilistic behavior.
Global Radiation
Data Set
Hourly exposure series of global radiation were recorded
over several years at 9 Spanish meteorological stations.
Data Set
• Weather characteristics of locations are very different.
• Moderate
• Continental
• Coastal
Atlantic Climate (Oviedo).
Continental Climate (Madrid, Tortosa).
Mediterranean Climate both in winter and summer
(Malaga, Mallorca).
Probabilistic Finite Automata.
• First application used for universal data compression.
• Used for the analysis of biological sequences (for example, DNA).
• Used for the analysis of natural languages, handwriting and speech.
Building PFA for Hourly Global Radiation
• Hourly clearness index:
$K_h = G_h / G_{h,0}$
where $G_h$ is the hourly global radiation and
$G_{h,0}$ is the extraterrestrial hourly global radiation.
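The ratio above can be illustrated with a minimal sketch. The function name and the choice to return None when the extraterrestrial value is not positive (night hours) are assumptions made for illustration, not part of the slides.

```python
def hourly_clearness_index(g_h, g_h0):
    """Return K_h = G_h / G_h0 for each hour.

    g_h  : hourly global radiation values
    g_h0 : extraterrestrial hourly global radiation values
    Hours with g_h0 <= 0 have no defined index and yield None (an assumption).
    """
    if len(g_h) != len(g_h0):
        raise ValueError("both series must have the same length")
    return [g / g0 if g0 > 0 else None for g, g0 in zip(g_h, g_h0)]


if __name__ == "__main__":
    # Hypothetical values in arbitrary but matching units.
    print(hourly_clearness_index([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))
```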
Building PFA for Hourly Global Radiation
Why Is the Constructed Series “Artificial”?
• Data from different days are linked together.
• Last observation of each day is followed by the first
observation of the following day.
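The linking of days can be sketched as a simple concatenation. The function name and the list-of-lists input layout are hypothetical; the slides only state that the last observation of a day is followed by the first observation of the next day.

```python
def link_days(daily_series):
    """Concatenate per-day lists of hourly values into one 'artificial' series.

    daily_series : list of lists, one inner list per day, in calendar order.
    The last value of each day is immediately followed by the first value
    of the next day; night hours are simply absent.
    """
    linked = []
    for day in daily_series:
        linked.extend(day)
    return linked


if __name__ == "__main__":
    # Two hypothetical days with three daylight hours each.
    print(link_days([[0.3, 0.6, 0.4], [0.5, 0.7, 0.2]]))
```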
Building PFA for Hourly Global Radiation
Number of hours?
• The number of hours considered for each series (month) is
constant, and equal for all months in each of these groups:
January, February, November, December.
March, April, September, October.
May, June, July, August.
• To discretize the continuous values of the clearness
index, only 8 different discrete values are used.
Building PFA for Hourly Global Radiation
• Relationship between the values of the clearness index
and the symbols of the alphabet.
• The intervals are not uniform.
• In the lowest and highest intervals, the frequency of values is
lower than in the other intervals.
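A minimal sketch of mapping clearness index values to 8 symbols with non-uniform intervals follows. The interval edges below are hypothetical placeholders chosen only to illustrate the idea; the actual boundaries used in the slides are not recoverable from the text.

```python
from bisect import bisect_right

# Seven interior edges define 8 intervals. These values are hypothetical,
# not the ones used in the original work.
EDGES = [0.20, 0.35, 0.45, 0.55, 0.65, 0.72, 0.80]


def discretize(k_values, edges=EDGES):
    """Map each clearness index to a symbol 0..len(edges).

    A value x gets symbol i when edges[i-1] <= x < edges[i].
    """
    return [bisect_right(edges, x) for x in k_values]


if __name__ == "__main__":
    print(discretize([0.10, 0.50, 0.90]))
```

Wider intervals at both ends are one way to compensate for the lower frequency of very low and very high values.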
1. Compute the series of discrete values.
2. Initialize the PFA with a node labeled with the null sequence.
3. The set PSS (Possible Subsequence Set) is initialized with all
sequences of order 1. Each element in this set corresponds to a
sequence of discrete values. Take o = 1 as the initial value of the
order, that is, the size of the subsequences to consider.
4. If there are elements of order o in PSS, pick any of these
elements, Y. Using all discrete sequences in the series,
compute the frequency of Y. If 4a and 4b are true, go to 5;
else go to 6.
4a. The frequency of this sequence is greater than
the threshold frequency.
4b. For some symbol $x_p$, the probability of occurrence of
the subsequence $Y x_p$ is not equal to the probability
of the subsequence $\mathrm{final}(Y)\, x_p$, that is,
$P(Y x_p) \neq P(\mathrm{final}(Y)\, x_p)$
(not equal: when the ratio between the probabilities is
significantly greater than one; for instance, greater than 1.2).
5. Do:
Add to the PFA a node, labeled Y, and compute its
corresponding probability vector.
For each amplified sequence, $Y x_p$: if the probability
of this amplified sequence is greater than the
threshold probability, include it in PSS.
6. Remove the analyzed subsequence, Y, from PSS.
7. If there are no elements of order o in PSS, add 1 to the value of
o. If o <= N and there are elements of length o in PSS, go to
4; else stop.
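The steps above can be sketched in Python as follows. This is one reading of the procedure, not the authors' implementation. Several points are assumptions: final(Y) is taken to be Y without its first symbol, condition 4b is evaluated on next-symbol probabilities estimated from counts, a single threshold `f_min` serves as both the threshold frequency and the threshold probability, and all function and parameter names are hypothetical.

```python
from collections import Counter


def count_subsequences(series, max_len):
    """Count every contiguous subsequence of length 1..max_len."""
    counts = Counter()
    for length in range(1, max_len + 1):
        for i in range(len(series) - length + 1):
            counts[tuple(series[i:i + length])] += 1
    return counts


def next_probs(counts, context, symbols):
    """Estimate P(x | context) from the counts of context followed by x."""
    follow = [counts[context + (x,)] for x in symbols]
    total = sum(follow)
    return [c / total for c in follow] if total else None


def frequency(counts, seq, n):
    """Relative frequency of seq among all windows of the same length."""
    windows = n - len(seq) + 1
    return counts[seq] / windows if windows > 0 else 0.0


def build_pfa(series, n_symbols, max_order, f_min=0.01, ratio=1.2):
    """Return a dict mapping node labels (tuples) to probability vectors."""
    symbols = range(n_symbols)
    n = len(series)
    counts = count_subsequences(series, max_order + 1)
    pfa = {(): next_probs(counts, (), symbols)}       # step 2: null-sequence node
    pss = {(x,) for x in symbols}                     # step 3: order-1 sequences
    order = 1
    while order <= max_order and pss:                 # step 7: loop over orders
        for y in [s for s in pss if len(s) == order]:  # step 4: pick Y
            p_y = next_probs(counts, y, symbols)
            p_suffix = next_probs(counts, y[1:], symbols)  # final(Y), assumed
            frequent = frequency(counts, y, n) > f_min     # condition 4a
            differs = (p_y is not None and p_suffix is not None and
                       any(a > ratio * b for a, b in zip(p_y, p_suffix)))  # 4b
            if frequent and differs:                  # step 5: add node Y
                pfa[y] = p_y
                for x in symbols:
                    ext = y + (x,)                    # amplified sequence Y x_p
                    if len(ext) <= max_order and frequency(counts, ext, n) > f_min:
                        pss.add(ext)
            pss.discard(y)                            # step 6
        order += 1
    return pfa


if __name__ == "__main__":
    # A strictly alternating series needs only order-1 memory.
    for node, probs in sorted(build_pfa([0, 1] * 50, 2, 2).items()):
        print(node, probs)
```

On the alternating example, the order-2 candidates are discarded because their next-symbol probabilities match those of their order-1 suffixes, which is the "variable memory" behavior the procedure aims for.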
Generating New Series of Hourly Global Radiation
• Generate a synthetic series.
• Test the null hypothesis that the synthetic and recorded series
have the same mean and variance, with significance level 0.05.
• If the null hypothesis is not rejected, the series is selected as a
proxy for the recorded one; otherwise, generate another synthetic
series, until one is found for which the null hypothesis is not
rejected.
• In all cases, less than 10 synthetic series had to be generated.
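The generate-and-test loop described above might look like the sketch below. The slides do not name the tests used for the mean and the variance, so a two-sided permutation test is swapped in here as an assumption. The PFA is assumed to be a dict from node labels (tuples of symbols) to next-symbol probability vectors, and all names are hypothetical.

```python
import random
import statistics


def generate_series(pfa, length, rng):
    """Sample symbols; the state is the longest recent suffix that is a node."""
    max_order = max(len(node) for node in pfa)
    series = []
    for _ in range(length):
        state = ()
        for k in range(min(max_order, len(series)), 0, -1):
            candidate = tuple(series[-k:])
            if candidate in pfa:
                state = candidate
                break
        probs = pfa[state]
        series.append(rng.choices(range(len(probs)), weights=probs)[0])
    return series


def permutation_p_value(a, b, stat, rng, n_perm=500):
    """Two-sided permutation p-value for the difference in stat(a), stat(b)."""
    observed = abs(stat(a) - stat(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(stat(pooled[:len(a)]) - stat(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def select_proxy(pfa, recorded, rng, alpha=0.05, max_tries=10):
    """Generate series until equal mean and variance are not rejected."""
    for attempt in range(1, max_tries + 1):
        synthetic = generate_series(pfa, len(recorded), rng)
        p_mean = permutation_p_value(recorded, synthetic, statistics.fmean, rng)
        p_var = permutation_p_value(recorded, synthetic, statistics.pvariance, rng)
        if p_mean >= alpha and p_var >= alpha:
            return synthetic, attempt
    return None, max_tries


if __name__ == "__main__":
    toy_pfa = {(): [0.5, 0.5], (0,): [0.0, 1.0], (1,): [1.0, 0.0]}
    proxy, tries = select_proxy(toy_pfa, [0, 1] * 20, random.Random(0))
    print(tries, proxy[:6])
```

The `max_tries` cap of 10 mirrors the observation that fewer than 10 synthetic series were needed in all cases; it is a convenience for the sketch, not a stated rule.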
Generating New Series of Hourly Global Radiation
• For each selected synthetic series, compare its cumulative
probability distribution function (cpdf) with the cpdf of the
recorded series.
• The comparison is based on the Kolmogorov-Smirnov two-sample
test statistic, which focuses on the absolute value of the
maximum difference between two empirical distribution functions.
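As an illustration, the two-sample statistic can be computed directly from the two empirical distribution functions. This is a minimal standard-library sketch with a hypothetical function name; it returns only the statistic, not a p-value.

```python
from bisect import bisect_right


def ks_two_sample_statistic(a, b):
    """Maximum absolute difference between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the difference at every observation of both samples.
    for x in a_sorted + b_sorted:
        f_a = bisect_right(a_sorted, x) / len(a_sorted)
        f_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(f_a - f_b))
    return d


if __name__ == "__main__":
    print(ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```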
Generalization of the Simulated Model
• To generate a new series of the hourly clearness index, the model
uses as input data:
• Mean monthly value of the hourly clearness index.
• Cpdf of the recorded month.
• At most meteorological stations these values are not available, and
only mean monthly values of the daily global radiation are
usually recorded.
• One of the aims of the paper:
To characterize the observed relationship between the
recorded data and the parameters used by the proposed model.
Generalization of the Simulated Model
• The correlation coefficient between these two parameters was
computed and found to be 0.992.
• Conclusion:
• The mean monthly daily clearness index, which is available, can
be used in the model instead of the mean monthly hourly
clearness index.
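For reference, a correlation coefficient of this kind can be computed as in the sketch below, assuming the Pearson coefficient is meant (the slides do not say which one). The sample values are hypothetical and are not the recorded data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical monthly means: daily vs. hourly clearness index.
    daily = [0.45, 0.52, 0.58, 0.61, 0.66]
    hourly = [0.43, 0.51, 0.56, 0.60, 0.65]
    print(round(pearson_r(daily, hourly), 3))
```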
• List of tests:
1. The hypothesis that both series have the same mean and
variance was tested (significance level 0.05).
2. The cpdfs of the recorded and simulated series were
compared using the Kolmogorov-Smirnov two-sample test
statistic with a bootstrap p-value.
• It is observed that 97.8 % of the month it is
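A bootstrap p-value for the Kolmogorov-Smirnov two-sample statistic can be sketched as follows. The resampling scheme (drawing both samples with replacement from the pooled data) is one common choice and an assumption here; the slides do not describe the scheme actually used. All names are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)


def bootstrap_ks_p_value(recorded, simulated, rng, n_boot=1000):
    """Share of pooled resamples whose KS statistic reaches the observed one."""
    observed = ks_statistic(recorded, simulated)
    pooled = list(recorded) + list(simulated)
    hits = 0
    for _ in range(n_boot):
        # Under the null hypothesis both samples come from the same
        # distribution, approximated here by the pooled data.
        a = rng.choices(pooled, k=len(recorded))
        b = rng.choices(pooled, k=len(simulated))
        if ks_statistic(a, b) >= observed:
            hits += 1
    return (hits + 1) / (n_boot + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    print(bootstrap_ks_p_value([1, 2, 3, 4], [1, 2, 3, 4], rng, n_boot=200))
```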
• The model uses only the monthly mean value of global
radiation and provides the following:
• Constructed PFA.
• Proposed standard cpdf.
• Generation of new series of hourly global radiation similar to
the real ones.
• Conclusion: Probabilistic Finite Automata can
be used to characterize and predict new series of
hourly global solar radiation.
Questions?
Thank you.
