TY - JOUR

T1 - Centile estimation for a proportion response variable

AU - Enea, Marco

AU - Stasinopoulos, Mikis

AU - Hossain, Abu

AU - Rigby, Robert

PY - 2016

Y1 - 2016

N2 - This paper introduces two general models for computing centiles when the response variable Y can take values between 0 and 1, inclusive of 0 or 1. The models developed are more flexible alternatives to the beta inflated distribution. The first proposed model employs a flexible four parameter logit skew Student t (logitSST) distribution to model the response variable Y on the unit interval (0, 1), excluding 0 and 1. This model is then extended to the inflated logitSST distribution for Y on the unit interval, including 1. The second model developed in this paper is a generalised Tobit model for Y on the unit interval, including 1. Applying these two models to (1-Y) rather than Y enables modelling of Y on the unit interval including 0 rather than 1. An application of the new models to real data shows that they can provide superior fits.

KW - Beta inflated distribution; Fractional data; GAMLSS; Generalised Tobit model; Logit skew Student t distribution; Computer Simulation; Humans; Least-Squares Analysis; Logistic Models; Lung; Male; Statistical Distributions; Models, Statistical

KW - Epidemiology; Statistics and Probability

UR - http://hdl.handle.net/10447/219851

UR - http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1097-0258

M3 - Article

VL - 35

SP - 895

EP - 904

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

ER -