Addressing overdispersion and zero-inflation for clustered count data via new multilevel heterogenous hurdle models, Journal of Applied Statistics


Journal of Applied Statistics (IF 1.5), Pub Date: 2022-07-26, DOI: 10.1080/02664763.2022.2096875
Yasin Altinisik

ABSTRACT

Unobserved heterogeneity, which causes overdispersion, and an excessive number of zeros figure prominently in the methodological development of count modeling. Understanding overdispersion requires insight into the mechanisms that induce heterogeneity. When the heterogeneity originates in the stochastic component of the model, using a heterogeneous Poisson distribution for this part is an elegant solution. The hierarchical design of a study also produces heterogeneity, because unobservable effects at various levels contribute to the overdispersion. Zero-inflation, heterogeneity and the multilevel nature of count data each present special challenges; their joint presence in one study makes the modeling harder still. This study is therefore designed to merge the attractive features of these separate strands of solutions in order to meet such a comprehensive challenge. It differs from previous attempts in its choice of two recently developed heterogeneous distributions, namely Poisson–Lindley (PL) and Poisson–Ailamujia (PA), for the truncated part. Using generalized linear mixed modeling settings, the predictive performances of the multilevel PL and PA models and their hurdle counterparts were assessed in a comprehensive simulation study in terms of bias, precision and accuracy measures. The multilevel models were applied to two separate real-world examples to assess the practical implications of the new models proposed in this study.
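To make the hurdle construction concrete, the following is a minimal sketch of a single-level hurdle model whose positive counts follow a zero-truncated Poisson–Lindley distribution. It uses the standard one-parameter PL probability mass function, P(Y = y) = θ²(y + θ + 2)/(θ + 1)^(y+3); the function names `pl_pmf` and `hurdle_pl_pmf` and the parameter `p_zero` are illustrative assumptions, not notation from the paper. The sketch leaves out covariates and the random effects of the multilevel models, and the paper's exact parameterization may differ.

```python
def pl_pmf(y: int, theta: float) -> float:
    """Standard one-parameter Poisson-Lindley pmf for y = 0, 1, 2, ..."""
    if y < 0 or theta <= 0:
        raise ValueError("y must be >= 0 and theta must be > 0")
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)


def hurdle_pl_pmf(y: int, theta: float, p_zero: float) -> float:
    """Hurdle pmf (illustrative): zeros occur with probability p_zero,
    positive counts follow the zero-truncated Poisson-Lindley."""
    if not 0.0 <= p_zero < 1.0:
        raise ValueError("p_zero must lie in [0, 1)")
    if y == 0:
        return p_zero
    # Renormalize the PL mass over y >= 1 (the "truncated part").
    return (1.0 - p_zero) * pl_pmf(y, theta) / (1.0 - pl_pmf(0, theta))


if __name__ == "__main__":
    theta, p_zero = 1.5, 0.4  # arbitrary illustrative values
    total = sum(hurdle_pl_pmf(y, theta, p_zero) for y in range(200))
    print(f"P(Y=0) = {hurdle_pl_pmf(0, theta, p_zero)}")
    print(f"sum of pmf over 0..199 = {total:.6f}")
```

Because the zero probability is modeled separately from the positive counts, `p_zero` can sit above or below the zero mass that the untruncated PL distribution would imply, which is what lets a hurdle model absorb excess zeros.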


Updated: 2022-07-26
