A bibliometric study is presented whose main aims are to evaluate how frequently count variables are used in different research areas of Psychology and to identify the statistical models habitually applied to count data. A random sample of 168 articles was selected from two of the ten journals with the highest impact factor (JCR-SCI index) in each area of Psychology. The results show that count variables are common in Psychology, appearing in 38.1% of the articles reviewed, and that the general linear model is applied massively whereas specific models for count data are not applied. Having established the substantial presence of count variables in Psychology and the notable problem of applying suitable statistical models to count data, we propose analyzing the data through a modelling strategy, an approach that is well known but little applied in Psychology. On this basis, and after discussing modelling from an epistemological point of view, the main features of statistical modelling and of the generalized linear model (GLM) are reviewed, since they are the theoretical bases of this work. Next, the distributional characteristics of count variables that justify applying suitable generalized linear models to this kind of variable are introduced. To begin, the Poisson distribution is described, together with the benchmark regression model for count variables: the Poisson regression model (PRM). The set of assumptions on which the PRM is based restricts its scope of application to situations that are not common in practice. Perhaps the most important of these assumptions is equidispersion, that is, equality of the conditional mean and variance. When equidispersion does not hold, the most common situation is overdispersion. 
In the presence of overdispersion, models or procedures must be applied that allow at least one of the following: modelling the source of the overdispersion, relaxing the conditional mean-variance assumption, or correcting the standard errors of the PRM estimates. Nevertheless, there is a prior step of vital importance: the diagnosis of overdispersion. The empirical part addresses several issues related to the diagnosis and treatment of overdispersion: the study of the nominal error rate and power of overdispersion diagnostic tests; the comparison of procedures for correcting the standard errors of PRM estimates in the presence of overdispersion; and, additionally, the verification of the effect of overdispersion on the coefficient estimates and their standard errors. To meet these objectives, 5 Monte Carlo simulation experiments were implemented in R and organized into 3 studies. The results show the efficiency, consistency and power of the LR and χ² tests, as well as the superiority of bootstrap and jackknife estimates for correcting the standard error.
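As a rough illustration of what diagnosing overdispersion involves, the sketch below simulates equidispersed (Poisson) and overdispersed (gamma-Poisson, i.e. negative binomial) counts and computes the Pearson dispersion statistic for an intercept-only Poisson model. It is not the thesis's procedure: the thesis ran its simulations in R and studied LR and χ² tests, whereas this is a standard-library Python sketch of one common dispersion check. The function names, sample size and parameter values are all assumptions chosen for illustration.

```python
import math
import random


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) count with Knuth's multiplication method.

    Adequate for the small means used here; not meant for large mu.
    """
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def negbin_draw(rng, mu, size):
    """Draw one overdispersed count as a gamma-Poisson mixture.

    The resulting mean is mu and the variance is mu + mu**2 / size.
    """
    lam = rng.gammavariate(size, mu / size)  # shape=size, scale=mu/size
    return poisson_draw(rng, lam)


def pearson_dispersion(counts):
    """Pearson chi-square divided by residual degrees of freedom.

    For an intercept-only Poisson model the fitted mean is the sample
    mean. Values near 1 are consistent with equidispersion; values
    well above 1 suggest overdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    if n < 2 or mean == 0:
        raise ValueError("need at least two counts and a positive mean")
    chi2 = sum((y - mean) ** 2 / mean for y in counts)
    return chi2 / (n - 1)


if __name__ == "__main__":
    rng = random.Random(2002)  # arbitrary seed, for reproducibility only
    equi = [poisson_draw(rng, 3.0) for _ in range(2000)]
    over = [negbin_draw(rng, 3.0, 1.0) for _ in range(2000)]
    print("Poisson data, dispersion:", round(pearson_dispersion(equi), 3))
    print("Gamma-Poisson data, dispersion:", round(pearson_dispersion(over), 3))
```

Repeating such draws many times and recording how often a test rejects equidispersion is the basic idea behind a Monte Carlo study of a diagnostic's error rate and power.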
El diagnóstico de la sobredispersión en modelos de análisis de datos de recuento [The diagnosis of overdispersion in count data analysis models]
Vives Brosa, J. (Author). 5 Sept 2002
Student thesis: Doctoral thesis