Evaluating Distributional Forecasts

**TomDoan** » Tue Jul 05, 2022 7:53 am

If your simulations are giving results similar to the analytical forecasts, then the "directional accuracy" should also be similar. If it isn't, then you are probably doing the calculation wrong when inferring directional accuracy from the analytical results.

**TomDoan** » Thu Jul 14, 2022 10:46 pm

I would note that one of the methods you are using is Normal simulations. Given a high enough number of simulations, those should converge to the "analytical" solutions. Normal simulations are useful if you are doing something that requires calculations across forecast horizons (such as P(min value < x over the range [t0,t1])), but if you are looking only at single forecasts, they will differ from the analytical values only by experimental error.
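The point about simulations can be checked numerically. Below is a minimal sketch in Python (not RATS), assuming an AR(1) with known, made-up parameters; the names `analytic_prob_below` and `simulate` are hypothetical helpers, not anything from the thread. It compares the simulated P(y < x) at single horizons with the Normal closed form, then computes the across-horizon probability P(min value < x), which has no equally simple closed form.

```python
import math
import random

def normal_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def analytic_prob_below(x, y0, mu, phi, sigma, h):
    """P(y_h < x) from the h-step Normal forecast of an AR(1) started at y0."""
    mean_h = mu + phi**h * (y0 - mu)
    var_h = sigma**2 * (1.0 - phi**(2 * h)) / (1.0 - phi**2)
    return normal_cdf((x - mean_h) / math.sqrt(var_h))

def simulate(x, y0, mu, phi, sigma, horizon, ndraws, seed=100):
    """Simulate paths; return P(y_h < x) per horizon and P(min over the path < x)."""
    rng = random.Random(seed)
    below_at_h = [0] * horizon
    min_below = 0
    for _ in range(ndraws):
        y = y0
        path_min = float("inf")
        for h in range(horizon):
            y = mu + phi * (y - mu) + rng.gauss(0.0, sigma)
            if y < x:
                below_at_h[h] += 1
            path_min = min(path_min, y)
        if path_min < x:
            min_below += 1
    return [c / ndraws for c in below_at_h], min_below / ndraws

# Illustrative (assumed) values, not estimates from any data set
mu, phi, sigma, y0, x, horizon = 6.0, 0.9, 1.0, 6.0, 5.0, 12
sim_by_h, sim_min = simulate(x, y0, mu, phi, sigma, horizon, ndraws=20000)
for h in (1, 6, 12):
    exact = analytic_prob_below(x, y0, mu, phi, sigma, h)
    print(h, round(exact, 3), round(sim_by_h[h - 1], 3))
print("P(min < x over steps 1..12):", round(sim_min, 3))
```

The single-horizon columns should agree up to simulation error, while the last line is the kind of quantity for which the simulations are actually needed.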

**ac_1** » Sun Feb 12, 2023 8:11 am

Thanks.

TomDoan wrote: Why are you using two DO loops rather than two SET instructions? Neither one of those is saving the calculations into different slots in a time series. You can use this for the LPDS:

set scores_rwwd iend+1 iend+steps = %LOGDENSITY(stderrs_rwwd^2,y-yhat_rwwd)
sstats(mean) iend+1 iend+steps scores_rwwd>>lpds_rwwd
disp 'lpds_rwwd' lpds_rwwd

Note, BTW, that the LPDS measure is likely to favor forecasting procedures which get a better value of the variance, rather than a better value of the mean.
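A small worked illustration of that remark, as a Python sketch (not RATS) under made-up assumptions: the outcome is truly N(0,1), forecast A has a biased mean but the right variance, and forecast B has the right mean but a standard error that is twice too large. The helper `expected_log_score` is hypothetical and uses the closed-form expectation of a Normal log density.

```python
import math

def expected_log_score(fc_mean, fc_sd, true_mean=0.0, true_sd=1.0):
    """Expected log density of a N(fc_mean, fc_sd^2) forecast
    when the outcome is really N(true_mean, true_sd^2)."""
    # E[(y - fc_mean)^2] = true variance + squared bias
    mse = true_sd**2 + (true_mean - fc_mean)**2
    return -0.5 * math.log(2.0 * math.pi * fc_sd**2) - mse / (2.0 * fc_sd**2)

score_a = expected_log_score(fc_mean=0.3, fc_sd=1.0)  # biased mean, correct variance
score_b = expected_log_score(fc_mean=0.0, fc_sd=2.0)  # correct mean, variance too large
print("A (biased mean, right variance):", round(score_a, 4))
print("B (right mean, wrong variance): ", round(score_b, 4))
```

In this setup B is the better point forecast in mean squared error terms, yet A gets the higher expected log score, which is the behavior the note warns about.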

The above is for a normal distribution.

Using the transformations from @BJTRANS, what would be the LPDS for
(a) a log-normal distribution?
(b) a non-central chi-squared distribution?

LPDS is just an average of the log densities of each forecast distribution, evaluated at the actual values. For (b) I have tried the following:

Code:

set ncsq iend+1 iend+12 = yhat_D^2.0
sstats iend+1 iend+12 ncsq>>nc
set scores iend+1 iend+12 = %chisqrncdensity(stderrs_D,%ndf,nc)
sstats(mean) iend+1 iend+12 scores>>lpds
disp *.############ 'lpds' lpds

yhat_D and stderrs_D are the forecasts and forecast SE's in the square-root scale, and the non-centrality parameter nc is the sum of the squared forecasts, but I get 0.

Please advise.

**ac_1** » Wed Oct 09, 2024 2:24 am

Hi Tom,

Revisiting this topic, I have a MAIN program and a PROCEDURE to calculate LPDS for the forecasts using 3 transformations: linear, log, sqrt:

Code:

*===============================
* Linear transformation
seed 100
set(first=0.0) x 1 100 = .9*x{1}+%ran(1.0)
set x = x+6.0

boxjenk(const,ar=1,define=lineareq,MAXL) x 1 80 resids
uforecast(equation=lineareq,stderrs=stderrs_lin) linearf 81 100

sstats 81 100 %logdensity(stderrs_lin^2,x-linearf)>>linlpds; * sum

disp linlpds
@acLPDS x linearf stderrs_lin 81 100


*===============================
* Log transformation
seed 100
set(first=0.0) x 1 100 = .9*x{1}+%ran(1.0)
set x = x+6.0

set logx = log(x)
boxjenk(const,ar=1,define=logeq,MAXL) logx 1 80
uforecast(equation=logeq,stderrs=stderrs_log) logf 81 100

set log_density 81 100 = -log((1.0 / (logx * sqrt(stderrs_log^2) * sqrt(2.0 * %pi))) * exp(-0.5 * ((logx - logf) / sqrt(stderrs_log))^2))
sstats 81 100 log_density>>loglpds; * sum

disp loglpds
@acLPDS(dist=1) logx logf stderrs_log 81 100


*===============================
* Square root transformation
seed 100
set(first=0.0) x 1 100 = .9*x{1}+%ran(1.0)
set x = x+6.0

set sqrtx = sqrt(x)
boxjenk(const,ar=1,define=sqrteq,MAXL) sqrtx 1 80
uforecast(equation=sqrteq,stderrs=stderrs_sqrt) sqrtf 81 100

set sqrt_density 81 100 = %logdensity(stderrs_sqrt^2, sqrtx - sqrtf) - log(2 * sqrtx)
sstats 81 100 sqrt_density>>sqrtlpds; * sum

disp sqrtlpds
@acLPDS(dist=2) sqrtx sqrtf stderrs_sqrt 81 100

Code:

*
* acLPDS Log Predictive Density Score
*
* @acLPDS( options ) actual fcast fstd start end
*
*
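* Parameters:
*  actual     series of actual values (in the same scale as fcast)
*  fcast      series of forecasts
*  fstd       series of forecast standard errors
*  start end  range over which the score is computed
*
*
* Options:
*   TITLE=title for report ["Log Predictive Density Score"]
*   DIST=[0]=normal, 1=log-normal, 2=non-central chi-squared, 3=box-cox
*        (3 is listed here but has no branch in the code below)
*
*
* Description:
*   Sums the log predictive densities of the forecasts over start to end,
*   saves the sum in %%LPDS and displays it in a report.
*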
* Reference:
* https://estima.com/forum/viewtopic.php?t=3589
*
*
* Variables Defined:
*   %%LPDS  Log Predictive Density Score
*
*
* Revision Schedule:
*

procedure acLPDS actual fcast fstd start end
*
   type series    actual fcast fstd
   type integer   start end
*
   local integer  startl endl
   local real     LPDS
*
   local series   scores
   local string   stitle
   local report   dreport
*
   option string  title
   option integer dist 0
*
   inquire(reglist) startl<<start endl<<end
   # actual fcast

* Calculate LPDS based on distribution type
   if dist == 0
      set scores startl endl = %LOGDENSITY(fstd^2, actual - fcast)
   else if dist == 1
      set scores startl endl = -log((1.0 / (actual * sqrt(fstd^2) * sqrt(2.0 * %pi))) * exp(-0.5 * ((actual - fcast) / sqrt(fstd))^2))
   else if dist == 2
      set scores startl endl = %LOGDENSITY(fstd^2, actual - fcast) - log(2 * actual)


   sstats startl endl scores>>LPDS; * sum
   comp %%LPDS = LPDS

   * Generate report
   if %defined(title)
      compute stitle=title
   else
      compute stitle="Log Predictive Density Score"

   report(use=dreport,action=define,title=stitle)
   report(use=dreport,atrow=1,atcol=1,span) stitle
   report(use=dreport,atrow=2,atcol=1,span) $
   "Forecast Analysis for "+%l(actual)+" vs "+%l(fcast)
   report(use=dreport,atrow=3,atcol=1,span) $
   "From "+%datelabel(startl)+" to "+%datelabel(endl)
   report(use=dreport,atrow=4,atcol=1) "LPDS                         " %%LPDS
   report(use=dreport,action=format,picture="*.######################")
   report(use=dreport,action=show)


end acLPDS

Is the LPDS calculated correctly, or should the forecasts using log & sqrt be back-transformed and then LPDS calculated?

thanks,
Amarjit

**TomDoan** » Wed Oct 09, 2024 10:14 am

You can't aggregate the forecast densities across forecast steps. (You're doing dynamic forecasts from 81 to 100, so you have 1 step and 20 steps combined).
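One arrangement that avoids mixing horizons is to move the forecast origin and keep a separate set of scores for each step ahead. The Python sketch below (not RATS) assumes an AR(1) with known, made-up parameters so that the h-step forecast mean and variance are available in closed form; `ar1_forecast` and `scores_by_h` are hypothetical names, and this is only one possible layout, not necessarily the one recommended in the linked thread.

```python
import math
import random

def normal_logpdf(u, variance):
    """Log of the N(0, variance) density at u."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + u * u / variance)

def ar1_forecast(y_origin, mu, phi, sigma, h):
    """Mean and variance of the h-step forecast from an AR(1) with known parameters."""
    mean_h = mu + phi**h * (y_origin - mu)
    var_h = sigma**2 * (1.0 - phi**(2 * h)) / (1.0 - phi**2)
    return mean_h, var_h

# Simulated data with assumed parameters (illustration only)
rng = random.Random(100)
mu, phi, sigma, nobs, max_h = 6.0, 0.9, 1.0, 3100, 4
y = [mu]
for _ in range(nobs - 1):
    y.append(mu + phi * (y[-1] - mu) + rng.gauss(0.0, sigma))

# One list of log scores per horizon; the origin moves, the horizon stays fixed
scores_by_h = {h: [] for h in range(1, max_h + 1)}
for origin in range(100, nobs - max_h):
    for h in scores_by_h:
        mean_h, var_h = ar1_forecast(y[origin], mu, phi, sigma, h)
        scores_by_h[h].append(normal_logpdf(y[origin + h] - mean_h, var_h))

lpds_by_h = {h: sum(s) / len(s) for h, s in scores_by_h.items()}
for h, value in lpds_by_h.items():
    print("steps ahead:", h, "mean log score:", round(value, 3))
```

Each horizon then has its own average, so a 1-step score is only ever compared with other 1-step scores.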

You have another thread that's almost entirely about LPDS calculations where I corrected a number of errors. You would probably do well to review it:

https://www.estima.com/forum/viewtopic. ... 731#p18731

And no, you can't compare forecasts of the log or sqrt with forecasts of the levels. They need to all be adjusted for the transformations.
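For reference, the adjustment in question is the usual change-of-variables (Jacobian) term. The Python sketch below (not RATS) assumes the forecast distribution is Normal in the transformed scale; `normal_logpdf` stands in for what %LOGDENSITY(variance,u) does in the posts, and the other function names and the numbers are made up for illustration. As a sanity check it integrates each adjusted density over the levels, which should give roughly 1.

```python
import math

def normal_logpdf(u, variance):
    """Log of the N(0, variance) density at u."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + u * u / variance)

def logdens_levels_from_log(y, fc_mean, fc_sd):
    """log f(y) when log(y) ~ N(fc_mean, fc_sd^2); Jacobian d log(y)/dy = 1/y."""
    return normal_logpdf(math.log(y) - fc_mean, fc_sd**2) - math.log(y)

def logdens_levels_from_sqrt(y, fc_mean, fc_sd):
    """log f(y) when sqrt(y) ~ N(fc_mean, fc_sd^2); Jacobian = 1/(2*sqrt(y)).
    Ignores the negative root, so it is only a fair approximation
    when fc_mean is many standard errors above zero."""
    return normal_logpdf(math.sqrt(y) - fc_mean, fc_sd**2) - math.log(2.0 * math.sqrt(y))

def integrate(logdens, lo, hi, n=20000):
    """Midpoint-rule integral of exp(logdens(y)) over [lo, hi]."""
    step = (hi - lo) / n
    return sum(math.exp(logdens(lo + (i + 0.5) * step)) for i in range(n)) * step

# Assumed forecast means and standard errors in the transformed scales
total_log = integrate(lambda y: logdens_levels_from_log(y, 1.8, 0.2), 0.01, 40.0)
total_sqrt = integrate(lambda y: logdens_levels_from_sqrt(y, 2.45, 0.2), 0.01, 40.0)
print("integral of log-scale density over levels: ", round(total_log, 4))
print("integral of sqrt-scale density over levels:", round(total_sqrt, 4))
```

With these terms in place, the scores from the linear, log and square-root models are all log densities of the same variable (the level), which is what makes them comparable.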

**ac_1** » Thu Oct 10, 2024 1:19 am

TomDoan wrote (Wed Oct 09, 2024 10:14 am): You can't aggregate the forecast densities across forecast steps. (You're doing dynamic forecasts from 81 to 100, so you have 1 step and 20 steps combined).

You have another thread that's almost entirely about LPDS calculations where I corrected a number of errors. You would probably do well to review it:

https://www.estima.com/forum/viewtopic. ... 731#p18731

And no, you can't compare forecasts of the log or sqrt with forecasts of the levels. They need to all be adjusted for the transformations.

Sorry. As you say in the related post, I'll use appropriate loss measures rather than LPDS to choose the "best" model.

What about plotting a histogram of the PIT (probability integral transform) values? They should be uniformly distributed between 0 and 1 if the forecasts are well calibrated.
Do I back-transform the predicted means and variances from the log and sqrt scales before calculating

%CDF((y(t)-predicted_mean(t))/sqrt(predicted_variance(t)))

and are they applicable to static and dynamic forecasts?

**TomDoan** » Thu Oct 10, 2024 7:50 am

https://www.estima.com/forum/viewtopic. ... 502#p18502
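As a general property (stated here independently of the linked post): if the forecast distribution is Normal in the log scale, the CDF of the implied log-normal forecast at y is the Normal CDF of (log(y) - mean)/se, so the PIT can be computed in the transformed scale. The Python sketch below (not RATS) uses made-up values and hypothetical helper names to compare that with plugging back-transformed moments into a Normal CDF in levels.

```python
import math
import random

def normal_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pit_transformed(y, m, s):
    """PIT from the log scale; equals the log-normal CDF of the level y."""
    return normal_cdf((math.log(y) - m) / s)

def pit_naive_levels(y, m, s):
    """Normal CDF in levels using the log-normal mean and standard deviation."""
    mean = math.exp(m + 0.5 * s * s)
    sd = mean * math.sqrt(math.exp(s * s) - 1.0)
    return normal_cdf((y - mean) / sd)

def decile_shares(values):
    """Share of values falling in each tenth of [0, 1]."""
    counts = [0] * 10
    for v in values:
        counts[min(int(v * 10), 9)] += 1
    return [c / len(values) for c in counts]

# Outcomes drawn from the forecast distribution itself, i.e. a calibrated forecast
rng = random.Random(100)
m, s, ndraws = 1.8, 0.5, 20000
ys = [math.exp(rng.gauss(m, s)) for _ in range(ndraws)]
shares_ok = decile_shares([pit_transformed(y, m, s) for y in ys])
shares_naive = decile_shares([pit_naive_levels(y, m, s) for y in ys])
print("transformed-scale PIT:", [round(v, 3) for v in shares_ok])
print("moments-in-levels PIT:", [round(v, 3) for v in shares_naive])
```

The first histogram should be close to flat; the second is distorted even though the forecast is calibrated, because a Normal CDF with log-normal moments is not the forecast distribution of the level.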

The RATS Software Forum
