* * A (partially non-linear) dynamic model of COVID infection. Based upon the SEIR * model, which has the population partitioned into S (susceptible), E (exposed), I * (infectious) and R (recovered, and assumed to be no longer suspectible). * * The key equations in the evolution in a (fixed parameter) SEIR model are * * E(t)=(1-sigma)E(t-1)+beta*I(t-1)*S(t-1)/Pop * I(t)=(1-gamma)I(t-1)+sigma*E(t-1) * * sigma*E(t-1) is the number of exposed persons who become infectious. * beta*I(t-1)*S(t-1)/Pop is the number of suspectible persons who become exposed. * gamma*I(t-1) is the number of infectious people who recover. * * sigma and gamma are largely determined by the properties of the disease. (Better * treatments could reduce both, but it's probably a reasonable strategy to assume * those are constant over a relatively short time interval). beta, however, * depends upon behavior; for instance, a perfect quarantine would make beta=0 * since no suspectible person would come in contact with an infectious person, so * making beta time varying makes sense. * * For simplicity, we'll assume S=Pop, since the number infected and since * recovered is still relative to population. * * Taking an SEIR model to time series data is not a simple process because the * error terms will not have a constant variance---at the early part of a contagion * all the errors will be small and the rate at which the component numbers (and * errors) become large is determined by the parameters governing the process. * * We add to the EI (S is assumed to be pop, and R doesn't matter), N as the number * of new infections, which is the second term in the I equation. With state shocks * added, we have * * beta(t)=beta(t-1)+noise_beta * E(t)=(1-sigma)E(t-1)+beta(t-1)*I(t-1)+noise_E * N(t)=sigma*E(t-1)+noise_N * I(t)=(1-gamma)I(t-1)+N(t) * * Note well that I(t) is the (unobservable) number of people who are currently * *infectious*---someone who was infected but has recovered and is no longer * contagious isn't included. The "dashboard" number of total infected since day 1 * is the accumulation of the N(t) values. * * Data from https://covid.ourworldindata.org/data/owid-covid-data.xlsx * open data "uscovid.xlsx" calendar(7) 2020:1:21 data(format=xlsx,org=columns,left=3) 2020:01:21 * total_cases new_cases total_deaths new_deaths $ total_cases_per_million new_cases_per_million total_deaths_per_million new_deaths_per_million total_tests $ new_tests total_tests_per_thousand new_tests_per_thousand tests_units * dec real sigma gamma beta0 compute rawR0=3.5 compute gamma=1.0/14.0 compute sigma=.7 * * Starting guess for beta (based upon early estimates of reproduction rate) * compute beta0=rawR0*gamma * * Long-term death rate * compute d0=.03 * * Parameters for the Pascal distribution * dec real dr dp compute np=1 * * Because N(t) appears on the right side of the I equation, we have to substitute * the definition to create a proper state-space model: * * beta(t)=beta(t-1)+noise_beta * E(t)=(1-sigma)E(t-1)+beta(t-1)*I(t-1)+noise_E * I(t)=(1-gamma)I(t-1)+sigma*E(t-1)+noise_N * N(t)=sigma*E(t-1)+noise_N * * Because the states enter multiplicatively in the E equation, that has to be * linearized (around the smoothed estimates for beta(t-1) and I(t-1)) * * What one should use as an observable (or observables) will depend upon the data * available. For US data, the obvious one would be number of new cases, but that's * heavily influenced by the testing regime, which differs from state to state and * has evolved over time. As a start, however, this assumes N(t)=the new_cases * variable on the data set. * * States are arranged in the order * * beta(t),E(t),I(t),N(t) * * The E equation needs linearization of the beta(t-1)*I(t-1) term. It's linearized * around the smoothed estimates of beta(t-1) and I(t-1). * dec rect a(np+3,np+3) dec vect z(np+3) * * Take care of the shifts (if any) * ewise a(i,j)=%if(i>4,(i==j+1),0.0) ewise z(i)=0.0 * * Loadings on shocks to states * dec rect f(np+3,3) ewise f(i,j)=(i==j).or.(i==4.and.j==3) * compute c=||0.0,0.0,0.0,1.0|| * dec frml[rect] afrml dec frml[vect] zfrml * dec series itilde betatilde * function afnc t type rect afnc type integer t * compute a(1,1)=1.0 compute a(2,1)=itilde(t-1) compute a(2,2)=1-sigma compute a(2,3)=betatilde(t-1) compute a(3,2)=sigma compute a(3,3)=1-gamma compute a(4,2)=sigma compute afnc=a end * function zfnc t type vect zfnc type integer t * compute z(2)=-itilde(t-1)*betatilde(t-1) compute zfnc=z end * filter(type=lagging,span=7,extend=zeros) new_deaths / smooth_deaths set betatilde 1 * = beta0 set itilde 1 * = 50*smooth_deaths set etilde 1 * = 10*smooth_deaths * * This treats gamma as known as a 14 day infectious period. * nonlin(parmset=virusspecs) sigma sigma<=.99 * * There are three noise variances which need the ability to evolve with the * contagion. For simplicity, all are treated as scales of the lagged * smoothed value of E. (The variance of beta is pegged at .0001). * nonlin(parmset=varparms) sn se nonlin(parmset=deathparms) dr dp compute sn=.05,se=1.55,sd=.50 * * Initial guesses for the states, assuming cases are basically zero at the start * of the sample. * compute x0=||beta0,0.0,0.0||~%zeros(1,np) compute sx0=%diag(||1.0,0.0,1.0||~%zeros(1,np)) * * This does the estimates assuming the death process is known up to a d0 which is * the long-term death rate. (If dr and dp are freely estimated, they tend to go * towards an unnatural front-loaded process.) * frml vscalef = %max(.01,etilde{1}) find(parmset=virusspecs+varparms,method=simplex,trace,reject=(se<0.or.sn<0)) max %logl * * This does three subiterations to try to get the expansion points to converge * on each function evaluation. * do iters=1,3 dlm(y=new_cases,a=afnc(t),z=zfnc(t),c=c,f=f,$ sw=%diag(||.0001,se*vscalef(t),sn*vscalef(t)||),$ type=smooth,yhat=yhat,x0=x0,sx0=sx0) 2 * xstates set betatilde 2 * = xstates(t)(1) set etilde 2 * = xstates(t)(2) set itilde 2 * = xstates(t)(3) set ntilde 2 * = xstates(t)(4) end do iters end find graph(footer="Smoothed Estimate of Exposed") # etilde graph(footer="New Infections") # ntilde graph(footer="Smoothed Estimate of Infectious") # itilde set(first=0.0) totalcases = totalcases{1}+ntilde graph(footer="Total Cases") # totalcases * set r0 = betatilde/gamma graph(footer="Estimated R0") # r0