TY - JOUR
T1 - Analysis of zero inflated dichotomous variables from a Bayesian perspective:
T2 - application to occupational health
AU - Moriña, David
AU - Puig, Pedro
AU - Navarro Giné, Albert
PY - 2021
Y1 - 2021
N2 - Background: Zero-inflated models generally aim to address the problem that arises when two different sources generate the zero values observed in a distribution. In practice, this happens because the population studied actually consists of two subpopulations: one in which the value zero occurs by default (structural zero) and another in which it is circumstantial (sample zero). Methods: This work proposes a new methodology to fit zero-inflated Bernoulli data from a Bayesian approach, able to distinguish between two potential sources of zeros (structural and non-structural). Results: The performance of the proposed methodology has been evaluated through a comprehensive simulation study, and it has been compiled as an R package freely available to the community. Its usage is illustrated by means of a real example from the field of occupational health, the phenomenon of sickness presenteeism, in which it is reasonable to think that some individuals will never be at risk of suffering it because they have not been sick in the period of study (structural zeros). Without separating structural and non-structural zeros, one would be jointly studying general health status and presenteeism itself, and would therefore obtain potentially biased estimates, because the phenomenon is implicitly underestimated by diluting it into general health status. Conclusions: The proposed methodology is able to distinguish two different sources of zeros (structural and non-structural) from dichotomous data with or without covariates in a Bayesian framework, and has been made available to any interested researcher in the form of the bayesZIB R package (https://cran.r-project.org/package=bayesZIB).
AB - Background: Zero-inflated models generally aim to address the problem that arises when two different sources generate the zero values observed in a distribution. In practice, this happens because the population studied actually consists of two subpopulations: one in which the value zero occurs by default (structural zero) and another in which it is circumstantial (sample zero). Methods: This work proposes a new methodology to fit zero-inflated Bernoulli data from a Bayesian approach, able to distinguish between two potential sources of zeros (structural and non-structural). Results: The performance of the proposed methodology has been evaluated through a comprehensive simulation study, and it has been compiled as an R package freely available to the community. Its usage is illustrated by means of a real example from the field of occupational health, the phenomenon of sickness presenteeism, in which it is reasonable to think that some individuals will never be at risk of suffering it because they have not been sick in the period of study (structural zeros). Without separating structural and non-structural zeros, one would be jointly studying general health status and presenteeism itself, and would therefore obtain potentially biased estimates, because the phenomenon is implicitly underestimated by diluting it into general health status. Conclusions: The proposed methodology is able to distinguish two different sources of zeros (structural and non-structural) from dichotomous data with or without covariates in a Bayesian framework, and has been made available to any interested researcher in the form of the bayesZIB R package (https://cran.r-project.org/package=bayesZIB).
KW - Presenteeism
KW - Bayesian methods
KW - Zero-inflation
KW - Simulation study
KW - Bernoulli mixture models
U2 - 10.1186/s12874-021-01427-2
DO - 10.1186/s12874-021-01427-2
M3 - Article
C2 - 34895155
SN - 1471-2288
VL - 21
JO - BMC Medical Research Methodology
JF - BMC Medical Research Methodology
IS - 1
ER -