Bayesian methods for complex statistical applications

$ \newcommand{\red}{\textcolor{red}} \newcommand{\blue}{\textcolor{blue}} \newcommand{\zerob} {{\bf 0}} \newcommand{\oneb} {{\bf 1}} \newcommand{\expect} {{\mathbb{E}}} \newcommand{\pt} {\tilde{p}} \newcommand{\alphat} {\tilde{\alpha}} \newcommand{\betat} {\tilde{\beta}} \newcommand{\pb} {\bar{p}} \newcommand{\thetab} {{\boldsymbol{\theta}}} \newcommand{\alphab} {{\boldsymbol{\alpha}}} \newcommand{\kappab} {{\boldsymbol{\kappa}}} \newcommand{\sigmab} {{\boldsymbol{\sigma}}} \newcommand{\nub} {{\boldsymbol{\nub}}} \newcommand{\gammab} {{\boldsymbol{\gamma}}} \newcommand{\deltab} {{\boldsymbol{\delta}}} \newcommand{\Deltab} {{\boldsymbol{\Delta}}} \newcommand{\etab} {{\boldsymbol{\eta}}} \newcommand{\Thetab} {{\boldsymbol{\Theta}}} \newcommand{\varthetab} {{\boldsymbol{\vartheta}}} \newcommand{\Yset} {\mathcal{Y}} \newcommand{\Xset} {\mathcal{X}} \newcommand{\intd} {\textrm{d}} \newcommand{\phib} {\boldsymbol{\phi}} \newcommand{\zetab} {\boldsymbol{\zeta}} \newcommand{\psib} {\boldsymbol{\psi}} \newcommand{\Sigmawinv} {{\boldsymbol{\it \Sigma}}_w^{-1}} \newcommand{\Sigmamat} {{\bm \Sigma}} \newcommand{\Sigmamatt} {\widetilde{\boldsymbol{\Sigma}}} \newcommand{\Qmatt} {\widetilde{\textbf{Q}}} \newcommand{\muvect} {\widetilde{\boldsymbol{\mu}}} \newcommand{\Psib} {{\bm \Psi}} \newcommand{\Upsilonmat} {{\boldsymbol{\it \Upsilon}}} \newcommand{\Lambdamat} {\mathbf{\Lambda}} \newcommand{\Gammamat} {{\boldsymbol{\it \Gamma}}} \renewcommand{\Gammamat} {{\boldsymbol{\Gamma}}} \newcommand{\Pimat} {{\boldsymbol{\it \Pi}}} \newcommand{\Amat} {\textbf{A}} \newcommand{\Bmat} {\textbf{B}} \newcommand{\Dmat} {\textbf{D}} \newcommand{\Gmat} {\textbf{G}} \newcommand{\Lmat} {\textbf{L}} \newcommand{\Qmat} {\textbf{Q}} \newcommand{\Rmat} {\textbf{R}} \newcommand{\Tmat} {\textbf{T}} \newcommand{\Qt} {\widetilde{\textbf{Q}}} \newcommand{\Qtinv} {\widetilde{\textbf{Q}}^{-1}} \newcommand{\Mmat} {\textbf{M}} \newcommand{\Cmat} {\mathbf{C}} \newcommand{\Jmat} {\mathbf{J}} \newcommand{\cmat} {\textbf{c}} \newcommand{\Kmat} {\textbf{K}} \newcommand{\Zmat} {\textbf{Z}} \newcommand{\Xmat} {\textbf{X}} \newcommand{\Xvec} {\mathbf{X}} \newcommand{\Imat} {\textbf{I}} \newcommand{\Umat} {\textbf{U}} \newcommand{\Pmat} {\textbf{P}} \newcommand{\Hmat} {\textbf{H}} \newcommand{\Vmat} {\textbf{V}} \newcommand{\bvec} {\textbf{b}} \newcommand{\dvec} {\textbf{d}} \newcommand{\avec} {\textbf{a}} \newcommand{\evec} {\textbf{e}} \newcommand{\hvec} {\textbf{h}} \newcommand{\xvec} {\textbf{x}} \newcommand{\yvec} {\textbf{y}} \newcommand{\zvec} {\textbf{z}} \newcommand{\wvec} {\textbf{w}} \newcommand{\vvec} {\textbf{v}} \newcommand{\svec} {\textbf{s}} \newcommand{\uvec} {\textbf{u}} \newcommand{\gvec} {\textbf{g}} \newcommand{\fvec} {\textbf{f}} \newcommand{\rvec} {\textbf{r}} \newcommand{\muvec} {\boldsymbol{\mu}} \newcommand{\Psix} {{\boldsymbol{\it \Psi}}_{\xvec}} \newcommand{\Phimat} {{\boldsymbol{\it \Phi}}} \newcommand{\Psitheta} {{\boldsymbol{\it \Psi}}_{\varthetab}} \newcommand{\Psia} {{\boldsymbol{\it \Psi}}_{A}} \newcommand{\Psixinv} {{\boldsymbol{\it \Psi}}_{\xvec}^{-1}} \newcommand{\vvm} {\boldsymbol {\mathcal \upsilon}} \newcommand{\upsilonb} {\boldsymbol {\upsilon}} \newcommand{\betab} {\boldsymbol {\beta}} \newcommand{\omegab} {\boldsymbol {\omega}} \newcommand{\Aop}{\boldsymbol{\mathcal{A}}} \newcommand{\ICE} {\textit{ICE}} \newcommand{\GIA} {\textit{GIA}} \newcommand{\GPS} {\textit{GPS}} \newcommand{\ERS} {\textit{ERS}} \newcommand{\GR} {\textit{GR}} \newcommand{\IS} {\textit{IS}} \newcommand{\ES} {\textit{ES}} \newcommand{\zeroes}{\mathop{\textrm{zeroes}}} \newcommand{\odd}{\mathop{\textrm{odd}}} \newcommand{\even}{\mathop{\textrm{even}}} \newcommand{\ff} {\textit{ff}} \newcommand{\fm} {\textit{fm}} \newcommand{\mf} {\textit{mf}} \newcommand{\inv} {\textit{inv}} \renewcommand{\zerob}{\mathbf{0}} \renewcommand{\v}{\mathbf{v}} \renewcommand{\u}{\mathbf{u}} \newcommand{\w}{\mathbf{w}} \renewcommand{\d}{\mathrm{d}} \newcommand{\Z}{\mathbf{Z}} \newcommand{\X}{\mathbf{X}} \newcommand{\x}{\mathbf{x}} \newcommand{\Y}{\mathbf{Y}} \newcommand{\Yvec}{\mathbf{Y}} \newcommand{\Yt}{\widetilde{\mathbf{Y}}} \newcommand{\Zvec}{\mathbf{Z}} \newcommand{\epsilonb}{\mbox{\boldmath{$\varepsilon$}}} \newcommand{\epsilonb}{\boldsymbol{\varepsilon}} \newcommand{\bS}{\mathbf{S}} \newcommand{\bK}{\mathbf{K}} \newcommand{\bI}{\mathbf{I}} \newcommand{\bR}{\mathbf{R}} \newcommand{\bC}{\mathbf{C}} \newcommand{\bB}{\mathbf{B}} \newcommand{\bP}{\mathbf{P}} \newcommand{\bQ}{\mathbf{Q}} \renewcommand{\L}{\mathbf{L}} \newcommand{\E}{\mathrm{E}} \newcommand{\cov}{\mathrm{cov}} \newcommand{\var}{\mathrm{var}} \newcommand{\Dist}{\mathrm{Dist}} \renewcommand{\prec}{\mathrm{prec}} \newcommand{\tr}{\mathrm{tr}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\sgn}{\mathrm{sgn}} \newcommand{\trace}{\mathrm{tr}} \newcommand{\vect}{\mathrm{vec}} \newcommand{\Gau}{\mathrm{Gau}} \newcommand{\RR}{\mathbb{R}} \newcommand{\s}{\mathbf{s}} \newcommand{\p}{\mathbf{p}} \renewcommand{\a}{\mathbf{a}} \newcommand{\h}{\mathbf{h}} \renewcommand{\b}{\mathbf{b}} \renewcommand{\c}{\mathbf{c}} \newcommand{\z}{\mathbf{z}} \newcommand{\blambda}{\boldsymbol{\lambda}} \newcommand{\btheta}{\boldsymbol{\theta}} \newcommand{\balpha}{\boldsymbol{\alpha}} \newcommand{\bgamma}{\boldsymbol{\gamma}} \newcommand{\bbeta}{\boldsymbol{\beta}} \newcommand{\bzero}{\boldsymbol{0}} \newcommand{\bmu}{\boldsymbol{\mu}} \newcommand{\bSigma}{\bm{\Sigma}} \newcommand{\s}{\mathbf{s}} $

Motivation

We wish to develop flexible, scalable, exact and approximate, Bayesian posterior simulation methods, such as Markov chain Monte Carlo (MCMC), sequential Monte Carlo (SMC), particle Markov chain Monte Carlo (PMCMC), variational approximation (VA), and approximate Bayesian methods to solve real-world problems in areas, such as economic inequality and poverty measurement, health, economics, cognitive psychology, and finance. Note that many of the Bayesian approaches developed in this research can be useful in solving environmental problems.

Statistical Models and Applications

Economic Inequality and Poverty Measurements

Figure 1: Estimated Income Densities in Australia for the years 2001, 2006, 2010, 2014, and 2017. The units are hundreds of dollars.

In this research, we develop a range of Bayesian approaches for measuring inequality and welfare, where we make comparisons over time, over different populations and before and after the introduction of new policies such as taxes.

Statistical tests of dominance regularly appear in the economics and finance literature. They employ statistical methods to compare the distributions of two random variables (or one random variable at two points in time) in order to determine if one "dominates" another. Lander et al. (2020) and Gunawan et al. (2021a) develop Bayesian approaches for assessing Lorenz and stochastic dominance between two income distributions $X$ and $Y$.

Our approach begins with flexible specifications for income distributions $X$ and $Y$, such as finite or infinite mixtures of gamma densities (see Lander et al. (2020) and Gunawan et al. (2021a), for further details). Figure 1 shows the estimated income densities in Australia for the years 2001, 2006, 2010, 2014, and 2017. The densities are all bimodal with a sharp peak between $10,000-$20,000 and a lower peak in the range $20,000-$40,000. Prior to 2010 there have been clear shifts to the right, but the distributions for 2010, 2014, and 2017 are similar, particularly those in 2014 and 2017. Once the income distributions have been estimated, we can then apply the Bayesian dominance approaches in Lander et al. (2020) and Gunawan et al. (2021a) to compare pairs of income distributions.

It is important to note that income alone may not be sufficient to characterise the level of well-being of a person in a society. The need to understand welfare and inequality beyond the dimension of income has resulted in a significant increase in research on non-income well-being distributions. Gunawan et al. (2018) and Gunawan et al. (2022a) proposed a Bayesian approach to compare two ordinal categorical distributions commonly occuring with data on non-income well-being distributions, such as self-reported health status, life satisfaction, education, and other variables. They show how to compute probabilities for inequality and stochastic dominance along with methodology for obtaining posterior densities for an inequality index for ordinal health data. Figure 2 shows that inequality is similar in 2005, 2008, and 2010, but greater in 2002.

Figure 2: Posterior densities of the Abul-Naga Inequality Index for self-reported health status data in Australia

The studies mentioned above are only based on independent analysis of individual well-being attributes. They do not explicitly represent a multidimensional approach as they do not allow for possible degrees of correlation between different well-being attributes. To understand the multidimensional nature of poverty and inequality, research on the joint distribution of several measures of well-being, such as income, education, health status, housing conditions, and happiness, must be conducted in a way that allows for statistical dependence of the measures. Future research will extend the framework to multivariate frameworks using copulas.

Time Series State Space Models

In this research, we consider Bayesian inference over parameters and the latent states in non-linear, non-Gaussian, state space models. In particular, we use a multivariate factor stochastic volatility model, which is an example of a high-dimensional state space model having both a large number of parameters and a large number of latent states. Most of the applications of the multivariate factor stochastic volatility (SV) models are in the area of financial econometrics (Zhou et al., 2014). In general, joint inference over both the parameters and the unobserved latent states can be very challenging because the likelihood involves calculating an integral over a very high-dimensional latent state space. We now describe the multivariate SV model.

Suppose that $\boldsymbol{P}_{t}$ is an $S$-dimensional vector of daily stock prices, and define $\mathbf{y}_{t}\equiv\log\boldsymbol{P}_{t}-\log\boldsymbol{P}_{t-1}$, which is the log-return of the stocks. We model $\boldsymbol{y}_{t}$ as a multivariate factor SV model \[ \boldsymbol{y}_{t}=\boldsymbol{\beta}\boldsymbol{f}_{t}+\boldsymbol{V}_{t}^{\frac{1}{2}}\boldsymbol{\epsilon}_{t},\left(t=1,...,T\right), \] where $\boldsymbol{f}_{t}$ is an $K$-dimensional vector of latent factors (with $K\ll S$), and $\boldsymbol{\beta}$ is an $S\times K$ factor loading matrix of unknown parameters. We model the latent factors as, $\boldsymbol{f}_{t}\sim N\left(0,\boldsymbol{D}_{t}\right)$, and $\boldsymbol{\epsilon}_{t}\sim N\left(0,I\right)$. The time-varying variance matrices, $\boldsymbol{V}_{t}$ and $\boldsymbol{D}_{t}$, depend on unobserved log-volatilities $\boldsymbol{h}_{t}=\left(h_{1t},...,h_{St}\right)'$ and $\boldsymbol{\lambda}_{t}=\left(\lambda_{1t},...,\lambda_{Kt}\right)^{'}$ such that \[ \boldsymbol{V}_{t}=\textrm{diag}\left(\exp\left(h_{1t}\right),...,\exp\left(h_{St}\right)\right),\;\;\boldsymbol{D}_{t}=\textrm{diag}\left(\exp\left(\lambda_{1t}\right),...,\exp\left(\lambda_{Kt}\right)\right). \] Each of the unobserved log-volatilities $\lambda_{kt}$ and $h_{st}$ are assumed to follow independent, first-order autoregressive process with \[ h_{st}-\mu_{\epsilon s}=\phi_{\epsilon s}\left(h_{st-1}-\mu_{\epsilon s}\right)+\eta_{\epsilon st},\;\textrm{for}\;\eta_{\epsilon st}\sim N\left(0,\tau_{\epsilon s}^{2}\right);s=1,...,S, \] and \[ \lambda_{kt}=\phi_{fk}\left(\lambda_{kt-1}-\mu_{fk}\right)+\eta_{fkt},\;\textrm{for}\;\eta_{fkt}\sim N\left(0,\tau_{fk}^{2}\right),k=1,...,K. \] We denote the parameter vector for the multivariate factor SV model by \[ \theta=\left(\beta,\left\{ \left(\phi_{fk},\tau_{fk}^{2}\right):k=1,...,K\right\} ,\left\{ \left(\mu_{\epsilon s},\phi_{\epsilon s},\tau_{\epsilon s}^{2}\right):s=1,...,S\right\} \right). \] Mendes et al. (2020), Gunawan et al. (2023a) and Gunawan et al. (2023b) proposed efficient particle Markov chain Monte Carlo methods that are scalable in terms of a number of parameters and a number of latent states. Gunawan et al. (2022c) also proposed a flexible density tempering approach to estimate a high-dimensional factor SV model. Gunawan et al. (2021b) develop variational Bayes methods to speed up the computation of high-dimensional factor stochastic volatility models. They apply the variational Bayes methods to a panel of 90 daily stock returns from the S&P 100 stocks. Gunawan et al. (2023c) develop a flexible variational approximation based on a copula of a mixture that can approximate complex posterior distributions.

Evidence Accumulation Models (EAMs)

Figure 3: Linear Ballistic Accumulator (LBA) Model.

Evidence accumulation models (EAMs) have been used to address theoretical and applied questions about human cognition, both in the general population and in clinical groups. For example, we can use the EAM to understand how ageing affects the decision making. It has been long known that in many cognitive tasks, older adults respond more slowly than the younger adults. For many decades, age-related slowing was attributed to the decrease in the rate of information processing (Salthouse, 1996). Ratcliff et al. (2004) found that age-related slowing of older adults was very often caused by increased caution rather than the decrease in the processing rate. In these cases, older adults can process the information as quickly as the younger participants, but they responded more slowly because they were more cautious. This kind of understanding could not have been reached without the use of the evidence accumulation model (EAM).

There are many EAMs, but almost all applied work has used the linear ballistic accumulator model (LBA; see Brown and Heathcote, 2008) and the diffusion decision model (DDM; see Ratcliff and McKoon, 2008). These models assume that decisions are made by accumulating evidence for each response (the speed of this accumulation is called “drift rate”). A decision is triggered as soon as the amount of evidence in favour of one response exceeds a threshold value (see Figure 3 for the LBA model).

The key inferences are drawn in two ways: either from parameter estimates or by comparing different versions of the EAMs estimated from the same data. These inferences rely on accurate parameter estimates and valid model-selection procedures, which can be difficult problems. Most modern applications of the model use hierarchical structures estimated in a Bayesian framework.

In this research, we develop a range of Bayesian methods to fit hierarchical EAMs and their extensions. Gunawan et al. (2020) propose two new methods, the particle Metropolis within Gibbs (PMwG) and the density-tempered sequential Monte Carlo (DT-SMC) approaches, for fitting the hierarchical LBA model that are more efficient than the differential evoluation Markov chain Monte Carlo (DE-MCMC) sampler used for the estimation of the LBA since Turner et al. (2013). Gunawan et al. (2022b) extend the hierarchical LBA model to allow the individual-level parameters for each subject to change over blocks of the trials. Tran et al. (2021) propose an efficient method for estimating marginal likelihood for models where the likelihood is intractable but can be estimated unbiasedly. They illustrate their method to estimate the marginal likelihood of different versions of the hierarchical LBA model. Dao et al. (2022) develop novel algorithms that make variational Bayes (VB) inference for hierarchical models feasible and computationally efficient for complex cognitive models of substantive theoretical interest. They also develop a novel VB algorithm with Bayesian prediction as a tool to perform model comparison by cross-validation, which we refer to as CVVB. In particular, CVVB can be used as a model screening device that quickly identifies bad models. Dao et al. (2023) extend upon previous work with methods for estimating the DDM, which is less tractable than LBA, in hierarchical Bayesian frameworks that include random effects which are correlated between people, and include regression-model links between decisionrelevant covariates and model parameters. Our methods work equally well in cases where the covariates are measured once per person (e.g., personality traits or psychological tests) or once per decision (e.g., neural or physiological data). We provide methods for exact Bayesian inference, using particle-based MCMC, and also approximate methods based on variational Bayesian (VB) inference. The VB methods are sufficiently fast and efficient that they can address large-scale estimation problems, such as with very large data sets. Future work will consider cognitive models with intractable density.

Brown, S. and Heathcote, A. (2008). The simple complete model of choice reaction time: linear ballistic accumulation. Cognitive Psychology57, 153-178.

Dao, V. H., Gunawan, D., Brown, S. D., Kohn, R., Tran, M.-N., and Hawkins, G. E. (2023). Bayesian inference for evidence accumulation models with regressors. arXiv preprint arXiv:2302.10389.

Dao, V. H., Gunawan, D., Tran, M.-N., Kohn, R., Hawkins, G. E., and Brown, S. D. (2022). Efficient selection between hierarchical cognitive models: Cross-validation with variational Bayes. Psychological Methods.

Gunawan, D., Carter, C., and Kohn, R. (2023a). The correlated particle hybrid sampler for state space model. arXiv preprint arXiv:1804.04359v4.

Gunawan, D., Chatterjee, P., and Kohn, R. (2023b). The block correlated pseudo marginal method for state space model. arXiv preprint arXiv:2109.14194.

Gunawan, D., Griffiths, W., and Chotikapanich, D. (2022a). Inequality in education: A comparison of Australian indigenous and nonindigenous populations. Statistics, Politics and Policy, 13(1), 57–72.

Gunawan, D., Griffiths, W. E., and Chotikapanich, D. (2018). Bayesian inference for health inequality and welfare using qualitative data. Economics Letters, 162, 76–80.

Gunawan, D., Griffiths, W. E., and Chotikapanich, D. (2021a). Posterior probabilities for Lorenz and stochastic dominance of Australian income distributions. Economic Record, 97(319), 504–524.

Gunawan, D., Hawkins, G. E., Kohn, R., Tran, M.-N., and Brown, S. D. (2022b). Time-evolving psychological processes over repeated decisions. Psychological review, 129(3), 438.

Gunawan, D., Hawkins, G. E., Tran, M.-N., Kohn, R., and Brown, S. (2020). New estimation approaches for the hierarchical linear ballistic accumulator model. Journal of Mathematical Psychology, 96, 102368.

Gunawan, D., Kohn, R., and Nott, D. (2021b). Variational Bayes approximation of factor stochastic volatility models. International Journal of Forecasting, 37(4), 1355–1375.

Gunawan, D., Kohn, R., and Nott, D. (2023c). Flexible variational Bayes based on a copula of a mixture. Journal of Computational and Graphical Statistics, pages 1–16.

Gunawan, D., Kohn, R., and Tran, M. N. (2022c). Flexible and robust particle tempering for state space models. Econometrics and Statistics.

Lander, D., Gunawan, D., Griffiths, W. E., and Chotikapanich, D. (2020). Bayesian assessment of Lorenz and stochastic dominance. Canadian Journal of Economics.

Mendes, E. F., Carter, C. K., Gunawan, D., and Kohn, R. (2020). A flexible particle Markov chain Monte Carlo method. Statistics and Computing, 30, 783–798.

Ratcliff, R. and McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision tasks. Neural Computation, 20, 873–922.

Ratcliff, R., Thapar, A., Gomez, P., and McKoon, G. (2004). A diffusion model analysis of the effects of aging in the lexical-decision task. Psychology and Aging, 19(2), 278.

Salthouse, T. A. (1996). The processing-speed theory of adult age differences in cognition. Psychological Review, 103(3), 403.

Tran, M.-N., Scharth, M., Gunawan, D., Kohn, R., Brown, S. D., and Hawkins, G. E. (2021). Robustly estimating the marginal likelihood for cognitive models via importance sampling. Behavior Research Methods, 53, 1148–1165.

Turner, B. M., Sederberg, P. B., Brown, S. D., and Steyvers, M. (2013). A method for efficiently sampling from distributions with correlated dimensions. Psychological Methods, 18(3), 368–384.

Wall, L., Gunawan, D., Brown, S. D., Tran, M.-N., Kohn, R., and Hawkins, G. E. (2021). Identifying relationships between cognitive processes across tasks, contexts, and time. Behavior Research Methods, 53(1), 78–95.

Zhou, X., Nakajima, J., and West, M. (2014). Bayesian forecasting and portfolio decisions using dynamic dependent sparse factor models. International Journal of Forecasting, 30(4), 963–980.

Authored by D. Gunawan, 2020.