ARTICLE

Health-related quality of life in Brazil: normative data for the SF-36 in a general population sample in the south of the country

Luciane Nascimento Cruz (I); Marcelo Pio de Almeida Fleck (II); Michele Rosana Oliveira (II); Suzi Alves Camey (II); Juliana Feliciati Hoffmann (II); Ângela Maria Bagattini (II); Carisi Anne Polanczyk (II)

(I) Hospital de Clínicas de Porto Alegre, Instituto de Avaliação de Tecnologias em Saúde. R. Ramiro Barcelos 2350 Prédio 21. 90035-903 Porto Alegre RS. lncruz@hcpa.ufrgs.br
(II) Universidade Federal do Rio Grande do Sul

ABSTRACT

The objective of this study was to provide normative SF-36 scores in a general population sample in Brazil and to describe differences in mean scores according to socio-demographic characteristics. The SF-36 questionnaire was distributed to a randomly selected sample of the general population of Porto Alegre in the State of Rio Grande do Sul. The response rate was 68% and 755 subjects were included (38% male, 62% female). Lower health status was revealed among females in the 30 to 44 year age bracket, from the lower income class, with less education and self-reported chronic medical conditions. The results and percentiles of scores of the SF-36 are reported as normative data for the general population. The SF-36 was an acceptable and practical instrument for measuring health-related quality of life in a sample of Brazilians. The results of this study can be useful for researchers using the SF-36 questionnaire in other groups to compare the scores with normative data. The SF-36 may prove a valuable tool for discovering vulnerable groups in epidemiological studies due to the ability to discriminate between different population subgroups.

Key words  Quality of life, Epidemiology, Health status indicators

Introduction

The importance of quality of life (QOL) assessment has grown considerably over the past 50 years. Factors that have contributed to its increased use include the accumulation of evidence that QOL measures are valid and reliable, the publication of clinical studies demonstrating that these measures are responsive to clinical changes, and the development of shorter instruments that are easier to use and understand [1].

One of the most widely used health-related quality of life (HRQOL) instruments worldwide is the Medical Outcomes Study Short-Form 36 (SF-36) [2]. The SF-36 was created from the need for a standardized instrument that would address general health concepts not specific to any medical condition, and that would be understandable, easy to use and psychometrically appropriate. The conceptual basis for the development of the SF-36 was the set of concepts of functional status and well-being described in accepted definitions of "health" [2]. Thus, the concept of quality of life considered in the elaboration of this instrument was health-related quality of life, emphasizing the specific impact that prevention and treatment of a disease have on the "value of being alive".

The health concepts assessed by the SF-36 are: physical functioning, social functioning, role functioning, general health and mental health perceptions, pain and vitality. As a generic instrument, it is useful for comparing general and specific populations, comparing the relative impact of diseases, differentiating the benefits produced by different treatments and screening individual patients [3].

The SF-36 has been translated into several languages and adapted to several cultures. The International Quality of Life Assessment (IQOLA) project, run by a group of researchers from Europe and the United States, set out the guidelines for the translation and cultural adaptation of the SF-36. The process consists of 3 stages: 1. Translation; 2. Psychometric evaluation of the items; 3. Empirical validation and norming of scores [4]. Normative data enable the interpretation of the scores of an individual or the average score of a group, since there is no "gold standard" against which to compare the results obtained with this instrument. Population norms are available for many developed countries [4], but only a limited number of studies report these data in developing ones [5,6]. In Latin America, translations and validations of the SF-36 are available for a few countries [7,8], but this is the first study, to our knowledge, to report population normative data in this context. This seems to be an important research question, since Brazil is the only Latin American country that speaks Portuguese and encompasses several ethnic groups and cultures within its territory, thus requiring regional normative data for comparison of health-related quality of life scores.

Methods

Sampling

The sample consisted of individuals selected from the general population of Porto Alegre, a state capital in the South of Brazil. The city has 1,436,123 inhabitants and is the capital of one of the most developed states of the country, with 97% of the population living in urban areas, a per capita GDP of approximately US$13,000.00 and a literacy rate of 96.7% [9]. The estimated sample size was 800 individuals, the minimum sample size recommended by the IQOLA project [4]. A two-stage cluster random selection design was used. In the first stage, a random sample of 108 of the city's census sectors, as delimited by the Brazilian Institute of Geography and Statistics (IBGE), was obtained. To calculate the number of households to be visited, we considered the average number of adults per household and the proportion of the population in each of the strata the study aimed to reach, that is, men and women in the age ranges of 20-29, 30-44 and 45-64 years. In each sector, 7 households were systematically selected, and all residents were invited to participate in the study if they met the following inclusion criteria: age from 20 to 64 years; literacy; and absence of any physical or mental limitation that could prevent the reading and understanding of the instruments.

If the residents were not found on the first visit, two further visits were made on different days and at different times, including non-business hours. A cover letter containing the identification of the team, the purposes of the study, the expected duration of the interview and contact phone numbers was handed to residents present at the first visit or left in the mailbox of the selected households.

Instruments

The SF-36 is a generic instrument whose conceptual basis is "health-related quality of life". This construct is represented by 36 questions divided into eight domains: physical functioning, role physical, pain, general health, vitality, social functioning, role emotional and mental health.
Items are scored on a Likert scale. All items of the SF-36 are used to score the eight domains, except for item 2, which is a self-report of health transition. Each item contributes to only one domain. After two items are recalibrated and the scores of nine items are reversed, the item responses are summed. Higher scores represent better health status. When some items of a scale are left unanswered, a value is computed for the missing items. Scores range from 0 to 100, with 0 indicating the least favorable health status and 100 the most favorable. The SF-36 can be self-administered, administered by computer, or administered in person or by telephone by a trained interviewer, and it is suitable for individuals above 14 years of age. It can be completed in 5 to 10 minutes with a high degree of acceptability and data quality [3].

The SF-36 employed in this study was previously translated into Portuguese and validated in Brazil by Ciconelli et al. [7]. That validation study was performed in a population of patients with rheumatoid arthritis, using a protocol elaborated in compliance with some of the steps proposed by the IQOLA coordinators [10].

A standardized questionnaire was used to obtain socio-economic and demographic data. It contained the following variables: gender, age, race, marital status, practice of any religion, employment status, economic class, number of medical consultations and hospital admissions in the last year, smoking and alcohol use. The presence of chronic diseases was assessed with a list of diseases with dichotomous responses (yes/no): hypertension, diabetes, ischemic heart disease (infarction/angina), heart failure, arthrosis/arthritis, stroke, chronic bronchitis/emphysema, asthma, kidney disease, cancer, HIV/AIDS, back pain, depression and anxiety, plus one open question coded as "others".

Economic class was assessed with the Brazil Criterion (Critério Brasil), an index that divides the population into classes according to purchasing power and the schooling of the head of the family [11]. The classes and their approximate equivalent mean family incomes in US dollars are: Class A1, US$3,800; Class A2, US$2,300; Class B1, US$1,400; Class B2, US$800; Class C, US$460; Class D, US$212; and Class E, US$103.
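
The 0 to 100 scale scoring described above can be illustrated with a minimal Python sketch. It is not the official SF-36 scoring algorithm: the function name `scale_score` is hypothetical, the item recalibration and reversal steps are assumed to have been applied already, and the handling of missing items (score a scale only when at least half of its items were answered, and replace each missing item by the mean of the answered ones) is an assumption, since the text does not detail it.

```python
from typing import Optional, Sequence


def scale_score(
    responses: Sequence[Optional[float]],
    item_min: float,
    item_max: float,
) -> Optional[float]:
    """Transform the recoded item responses of one scale to a 0-100 score.

    None marks an unanswered item. Assumption (not stated in the article):
    a scale is scored only if at least half of its items were answered, and
    each missing item is replaced by the mean of the answered items.
    """
    answered = [r for r in responses if r is not None]
    if not answered or 2 * len(answered) < len(responses):
        return None  # too many missing items to score this scale
    mean_answered = sum(answered) / len(answered)
    completed = [mean_answered if r is None else r for r in responses]
    raw = sum(completed)                     # summed item responses
    lowest = item_min * len(responses)       # lowest possible raw sum
    highest = item_max * len(responses)      # highest possible raw sum
    return 100.0 * (raw - lowest) / (highest - lowest)


# Illustrative scale: ten items, each coded from 1 (worst) to 3 (best).
print(scale_score([3] * 10, 1, 3))  # best possible health on this scale
print(scale_score([2] * 10, 1, 3))  # midpoint of the scale
```

With this transformation, a respondent at the lowest level on every item of a scale scores 0 and a respondent at the highest level on every item scores 100, which is what makes floor and ceiling effects directly readable from the score distribution.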

Statistical analysis

Continuous data are expressed as mean ± standard deviation and categorical data as percentages. Mean QOL scores were compared among groups defined by socio-demographic characteristics using ANOVA, the Brown-Forsythe test or the t-test. ANOVA was used when variances were homogeneous and the Brown-Forsythe test when they were not; both were applied to comparisons between two or more groups. Homogeneity of variances was assessed with Levene's test.

For all tests, the significance level was set at p < 0.05. Data were analyzed using SPSS for Windows, version 13.0 (IBM Company, Chicago) and Microsoft Office Excel 2003.
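
To make the difference between the two tests of means concrete, the sketch below computes the classical one-way ANOVA F statistic and the Brown-Forsythe F* statistic in plain Python. It is an illustration only and not the SPSS procedure used in the study: the function names and the toy data are hypothetical, and p-values, the adjusted degrees of freedom of the Brown-Forsythe test and Levene's test itself are omitted because they require an F distribution from a statistics package.

```python
from statistics import mean, variance
from typing import Sequence

Groups = Sequence[Sequence[float]]


def _between_groups_ss(groups: Groups) -> float:
    """Sum of squares between group means, weighted by group size."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    return sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)


def anova_f(groups: Groups) -> float:
    """Classical one-way ANOVA F; pools variances (assumes homogeneity)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return (_between_groups_ss(groups) / (k - 1)) / (ss_within / (n_total - k))


def brown_forsythe_f(groups: Groups) -> float:
    """Brown-Forsythe F*; weights each group's own variance (no pooling)."""
    n_total = sum(len(g) for g in groups)
    denom = sum((1 - len(g) / n_total) * variance(g) for g in groups)
    return _between_groups_ss(groups) / denom


# Hypothetical scores: equal group sizes and equal variances.
balanced = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
# Hypothetical scores: unequal group sizes and unequal variances.
unbalanced = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0, 10.0]]

print(anova_f(balanced), brown_forsythe_f(balanced))
print(anova_f(unbalanced), brown_forsythe_f(unbalanced))
```

With equal group sizes the two statistics coincide; they diverge when group sizes and variances differ, which is the situation in which the Brown-Forsythe test is preferred.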

Results

From July 2007 to July 2009, 1057 households were visited. Interviews could be performed in 514 (49%) of them; in the others, residents could not be contacted after 3 consecutive visits or refused to receive the study team. Of 1119 eligible individuals identified and contacted, 758 participated in the project, a response rate of 68%. The number of eligible individuals includes all households that the research team was able to contact, even those where people refused to participate, because information on the number of people aged 20 to 64 living at the contacted addresses could still be collected.

Three individuals had to be excluded from the sample, 2 (0.3%) because of errors in the age record and 1 (0.1%) for not responding to more than 50% of the SF-36 items, leaving 755 participants with data available for analysis.
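
The participation figures reported above can be checked with a few lines of arithmetic. The variable names are illustrative; the counts are those given in the text.

```python
households_visited = 1057
households_interviewed = 514
eligible_individuals = 1119
participants = 758
excluded_age_error = 2
excluded_incomplete = 1

# Share of visited households in which interviews were performed.
household_rate = 100 * households_interviewed / households_visited
# Individual response rate among eligible individuals contacted.
response_rate = 100 * participants / eligible_individuals
# Participants remaining after the three exclusions.
analysed = participants - excluded_age_error - excluded_incomplete

print(f"Households with interviews: {household_rate:.0f}%")  # 49%
print(f"Response rate: {response_rate:.0f}%")                # 68%
print(f"Participants analysed: {analysed}")                  # 755
```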

The distribution of the sample in terms of gender and age group was similar to that of the general population, except for males aged 30 to 44 years, who were represented by a smaller percentage of individuals. In relation to economic class, the lower classes, D and E, were under-represented, probably because of criteria in the study protocol such as the exclusion of illiterate individuals, who usually belong to these strata of the population. Additionally, 8 (6.7%) census sectors had to be excluded from the sample because they had high rates of urban violence that could jeopardize the safety of the team members. Because these exclusions involved neighborhoods possibly inhabited by people of lower purchasing power, they may also explain the low proportion of class D and the absence of class E.

The socio-demographic characteristics of the sample are described in Table 1. The mean age of the sample was 41 ± 13 years and 62% of the participants were female. Respondents were mostly married, white, practicing a religion and formally employed. The mean number of years of study was 11.3 ± 5.1, and 37% of the sample had 12 or more years of study.

Forty-nine percent of participants reported having some chronic medical condition, the most common being hypertension (13.5%), arthritis (8.3%), asthma (7.5%) and diabetes mellitus (4.6%). Depression was reported by 14% of the sample and anxiety by 21%.

Seven hundred forty-eight participants (99%) responded to all questions of the SF-36. The items with the highest number of missing values (1%) were PF4 ("Climbing several flights of stairs", physical functioning domain) and RP3 ("Accomplished less than would like", role physical domain). The mean time spent answering the SF-36 was 10 ± 5.2 minutes.

Descriptive statistics for the 8 domains of the SF-36 are available in Table 2. As expected for data from a general population sample, most respondents scored in the favorable range, as shown by the high medians in all domains and by the negative skewness, which indicates that scores cluster toward the upper end of the scale. This finding is also evidenced by the high percentage of ceiling effects, that is, respondents with the highest possible score, especially in the role physical and role emotional domains. A high ceiling effect was also seen in the social functioning domain. Conversely, in a sample of the general population the percentage of participants with scores at the bottom of the scale should be minimal, as seen in the present study: in 6 of the 8 domains, the floor effect was 1.5% or less. The exceptions were the role physical and role emotional domains, which had higher percentages of individuals with the minimum score, 11% and 16%, respectively.
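
Floor and ceiling effects, as used above, are simply the percentages of respondents at the lowest and highest possible scores of a domain. A minimal sketch follows; the function name `floor_ceiling` and the scores are hypothetical and are not data from this study.

```python
from typing import Sequence, Tuple


def floor_ceiling(
    scores: Sequence[float],
    lowest: float = 0.0,
    highest: float = 100.0,
) -> Tuple[float, float]:
    """Return (floor %, ceiling %) for the scores of one domain."""
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == lowest) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == highest) / n
    return floor_pct, ceiling_pct


# Hypothetical domain scores for ten respondents.
toy_scores = [0, 25, 50, 100, 100, 100, 75, 100, 100, 100]
floor_pct, ceiling_pct = floor_ceiling(toy_scores)
print(f"floor: {floor_pct:.1f}%, ceiling: {ceiling_pct:.1f}%")
```

A domain with few possible score levels, such as one built from a handful of yes/no items, concentrates respondents at the two extremes, which is why coarse scales tend to show both effects.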

Normative data for the 8 domains of the SF-36 according to gender, age group, economic class, educational level and presence of disease are available in Table 3. Results are presented as the mean and standard deviation of the scores.

Women had worse health status than men, with statistically significant differences in all domains (p < 0.001 for 5 of them). The largest differences occurred in the pain and vitality domains and the smallest in the general health and mental health domains. Regarding age, statistically significant differences between mean scores were found only in domains related to physical health (physical functioning, pain, role physical and general health), with scores decreasing as age increased.

The mean scores also varied according to economic class and educational level. Scores decreased as educational level decreased, with statistical significance in most domains, the exceptions being pain, social functioning and role emotional. Worse health status was also seen in individuals of lower economic classes, with statistically significant differences in scores in all domains except pain and role emotional.

As a measure of health status, the SF-36 was able to differentiate the group of individuals who reported having some chronic health condition from the group that considered itself healthy, with worse health status in the former. The differences in scores were statistically significant (p < 0.001) in all 8 domains. The largest discrepancies occurred between the means of individuals who reported having depression and those who did not, mainly in the role emotional and mental health domains, as expected.

Table 4 shows the scores in each domain of the SF-36 at percentiles 5, 10, 25, 50, 75, 90 and 95 for the total sample and for each subgroup according to gender and age group. Percentiles are reported to make the scores presented here more practical for future comparisons.
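
For readers who want to produce the same percentile layout for their own samples, the sketch below uses Python's standard library. The function name `score_percentiles` and the example scores are hypothetical, and the interpolation method ("inclusive") is an assumption: the article does not state which percentile definition was used, and statistical packages differ on this point.

```python
from statistics import quantiles
from typing import Dict, Sequence

REPORTED_PERCENTILES = (5, 10, 25, 50, 75, 90, 95)


def score_percentiles(scores: Sequence[float]) -> Dict[int, float]:
    """Return the reported percentiles of a set of domain scores."""
    # n=20 yields 19 cut points: the 5th, 10th, ..., 95th percentiles.
    cuts = quantiles(scores, n=20, method="inclusive")
    return {p: cuts[p // 5 - 1] for p in REPORTED_PERCENTILES}


# Hypothetical scores spread evenly from 0 to 100.
example_scores = list(range(101))
print(score_percentiles(example_scores))
```

A score obtained in another cohort can then be located relative to these cut points, for example by noting that it falls below the 25th percentile of the reference group of the same gender and age.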

Graph 1 shows the mean scores for the 8 domains of the SF-36 obtained in this study compared with the normative scores of 4 other countries with different cultures. Brazil had lower scores than the developed countries and Turkey in nearly all domains, except for vitality, where its score was higher than all the others. Compared with Croatia, a developing country, the Brazilian population studied had higher mean scores.

Discussion

The results of our research provide regional normative data for the SF-36 to be used by researchers in comparisons of cohorts of individuals in different clinical situations. In the absence of "gold standards" for health measures, normative scores can be very useful in interpreting the scale scores of an individual respondent or the average score of a group in comparison with the distribution of scores for individuals from the general population [4].

This study sought to meet the requirements recommended by the guidelines for standardization of SF-36 scores. The number of participants was close to the 800 individuals suggested by IQOLA, and the study complied with other criteria such as a response rate above two thirds and the collection of demographic information including age, sex, employment status, education and marital status, together with a checklist of self-reported chronic conditions [4].

Although the sample size, which is important for detecting differences in mean scores between groups, was slightly smaller than recommended, the ability of the instrument to differentiate individuals according to demographic variables and presence of disease was similar to that found in other countries that used larger samples [5,6,12,13].

The quality of the data was high, taking as criterion the percentage of missing values for items and domains of the SF-36, which was below 2% [14]. This percentage was lower than that found in the Medical Outcomes Study (MOS), which used the original version of the SF-36 and in which it ranged from 1.1 to 5.9% [14]. The authors of the SF-36 emphasize that scores cannot be estimated with the same level of confidence if there is a large amount of missing data [15]. Additionally, the non-response rate also reflects the understanding and acceptance of the questionnaire by the participants [14]. In this sample of the general population of Porto Alegre, the SF-36 seems to have been well accepted, and it was quick to administer, with a mean completion time of 10 minutes.

The low number of missing values in this research might be explained by the mode of administration of the SF-36, which was self-administered but completed in the presence of the interviewer, who checked whether all questions had been answered. The few missing items that occurred were probably due to the respondent's refusal to complete that item.

The distribution of SF-36 scores in the total sample is comparable to that found when the original instrument was applied to the general population of the United States [3], with most respondents having higher scores. The domains with the highest percentages of floor and ceiling effects were the same, role physical and role emotional. These two domains are considered the "coarsest" of the eight scales, enumerating only five or four levels of health each. One way to overcome this limitation would be to replace dichotomous responses by responses with more categories that measure finer gradations in role disability aside from the mere presence or absence of limitation [14]. Because the highest level of functioning is defined merely by the absence of physical or emotional limitations, the ceiling effect in these domains will always be a limitation when the SF-36 is applied to samples of non-diseased individuals. Younger individuals also had more domains at the highest score than older individuals, confirming a possible reduction in sensitivity at the upper limits of the scale in people with fewer functional limitations. Data for the SF-36 in populations of patients with chronic diseases showed a lower prevalence of ceiling effect [16].

The observed differences in mean SF-36 scores among different population strata emphasize the need to use the norms described for each subgroup for comparison. The main discrepancies were related to gender, with women presenting a worse health status in all domains of the SF-36. This finding seems to be independent of culture and socioeconomic status, since it was unanimous in norming studies conducted in different countries of Western Europe [13-17], Canada [12], New Zealand [18] and Mexico [19]. For the other socio-demographic variables, older individuals reported a worse health status only in domains related to physical health, while respondents with less education and of lower socioeconomic class had the lowest scores in almost all domains. These findings were also seen in other studies performed in developed countries [20] and developing ones [5,21], demonstrating the advantage of using the SF-36 in population studies to identify groups of vulnerable individuals. The description of health-related quality of life in different domains also allows identification of which aspects of an individual's life might be more affected. The graph showing the score curves of different countries showed that the populations of some countries reported better health in the physical domains but worse health in domains such as vitality. Additionally, since this is a generic instrument, it enables cross-cultural comparisons.

The SF-36 was able to clearly differentiate between the subjects with self-reported diseases and the group that declared itself healthy, suggesting good construct validity of this instrument developed to measure health status. The group that reported having some chronic condition had worse health status in all domains. Individuals who reported depression and anxiety, the only two psychiatric conditions included in the protocol, had significantly lower mean scores in the 8 domains, with the largest differences in role emotional and mental health.

One important aspect to be highlighted is that the sample used is not representative of all regions of Brazil. Due to the great cultural diversity, this research should be replicated in the other regions so that the national normative tables become available.

Another caveat of this study is that the lower socioeconomic classes were under-represented in the sample in relation to the general population. Some criteria followed by the protocol, such as excluding illiterate individuals and keeping the team away from the areas at greatest risk of urban violence, may have led to a smaller representation of classes D and E. Since quality of life scores decreased progressively in lower socioeconomic classes, one can infer that classes D and E would score even lower. Therefore, when using the results presented in this paper to compare quality of life between groups, researchers should be aware that, for individuals belonging to classes D and E, the values are probably overestimated.

While we recognize some caveats of the study, it is important to emphasize the difficulty of conducting a population survey in our country. Because of the high rates of urban violence in our city, many people live in buildings with security systems that greatly hinder access to residents. For these reasons, it was necessary to adopt the strategy of replacing losses and refusals, visiting a larger number of households than planned in order to obtain the required number of interviews.

To conclude, the SF-36 seems to be an acceptable instrument that is easily applied to the general population, and its performance proved to be similar to that found in other general population samples around the world. It is a useful tool to measure health status in cross-sectional epidemiological studies, but some of its scales have limitations in detecting positive changes in health status in longitudinal studies of populations without chronic diseases. The normative values available in this study can be used as a reference for comparison of scores obtained from different cohorts of patients.

Collaborations

LN Cruz, MPA Fleck, MR Oliveira, SA Camey, JF Hoffmann, AM Bagattini and CA Polanczyk made substantive contributions to conception and design, acquisition of data, and analysis and interpretation of data. LN Cruz drafted the article and MPA Fleck, SA Camey and CA Polanczyk revised the final version. All the authors endorse the data and conclusions.

References

1. Wilson IB, Cleary PD. Linking clinical variables with health-related quality of life. A conceptual model of patient outcomes. JAMA 1995; 273(1):59-65.

2. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992; 30(6):473-483.

3. Ware JE Jr. SF-36 health survey update. Spine (Phila Pa 1976) 2000; 25(24):3130-3139.

4. Gandek B, Ware JE Jr. Methods for validating and norming translations of health status questionnaires: the IQOLA Project approach. International Quality of Life Assessment. J Clin Epidemiol 1998; 51(11):953-959.

5. Demiral Y, Ergor G, Unal B, Semin S, Akvardar Y, Kivircik B, Alptekin K. Normative data and discriminative properties of short form 36 (SF-36) in Turkish urban population. BMC Public Health 2006; 6:247.

6. Maslic SD, Vuletic G. Psychometric evaluation and establishing norms of Croatian SF-36 health survey: framework for subjective health research. Croat Med J 2006; 47(1):95-102.

7. Ciconelli RM, Ferraz MB, Santos W, Meinão I, Quaresma MR. Brazilian-Portuguese version of the SF-36. A reliable and valid quality of life outcome measure. Revista Brasileira de Reumatologia 1999; 39(3):143-150.

8. Augustovski FA, Lewin G, Elorrio EG, Rubinstein A. The Argentine-Spanish SF-36 Health Survey was successfully validated for local outcome research. J Clin Epidemiol 2008; 61(12):1279-1284.

9. Instituto Brasileiro de Geografia e Estatística (IBGE). IBGE Cidades. [cited 2010 Feb 12]. Available from: www.ibge.gov.br

10. Bullinger M, Alonso J, Apolone G, Leplège A, Sullivan M, Wood-Dauphinee S, Gandek B, Wagner A, Aaronson N, Bech P, Fukuhara S, Kaasa S, Ware JE Jr. Translating health status questionnaires and evaluating their quality: the IQOLA Project approach. International Quality of Life Assessment. J Clin Epidemiol 1998; 51(11):913-923.

11. Associação Brasileira de Empresas de Pesquisa (ABEP). Critério Econômico Brasil 2003. [cited 2010 Feb 12]. Available from: www.abep.org/novo/default.aspx

12. Hopman WM, Towheed T, Anastassiades T, Tenenhouse A, Poliquin S, Berger C, Joseph L, Brown JP, Murray TM, Adachi JD, Hanley DA, Papadimitropoulos E. Canadian normative data for the SF-36 health survey. Canadian Multicentre Osteoporosis Study Research Group. CMAJ 2000; 163(3):265-271.

13. Pappa E, Kontodimopoulos N, Niakas D. Validating and norming of the Greek SF-36 Health Survey. Qual Life Res 2005; 14(5):1433-1438.

14. McHorney CA, Ware JE Jr, Lu JF, Sherbourne CD. The MOS 36-item Short-Form Health Survey (SF-36): III. Tests of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care 1994; 32(1):40-66.

15. Gandek B, Ware JE Jr, Aaronson NK, Alonso J, Apolone G, Bjorner J, Brazier J, Bullinger M, Fukuhara S, Kaasa S, Leplège A, Sullivan M. Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51(11):1149-1158.

16. Cruz LN, Camey SA, Fleck MP, Polanczyk CA. World Health Organization quality of life instrument-brief and Short Form-36 in patients with coronary artery disease: do they measure similar quality of life concepts? Psychol Health Med 2009; 14(5):619-628.

17. Aaronson NK, Muller M, Cohen PD, Essink-Bot ML, Fekkes M, Sanderman R, Sprangers MA, te Velde A, Verrips E. Translation, validation, and norming of the Dutch language version of the SF-36 Health Survey in community and chronic disease populations. J Clin Epidemiol 1998; 51(11):1055-1068.

18. Scott KM, Tobias MI, Sarfati D, Haslett SJ. SF-36 health survey reliability, validity and norms for New Zealand. Aust N Z J Public Health 1999; 23(4):401-406.

19. Duran-Arenas L, Gallegos-Carrillo K, Salinas-Escudero G, Martínez-Salgado H. Towards a Mexican normative standard for measurement of the short format 36 health-related quality of life instrument. Salud Publica Mex 2004; 46(4):306-315.

20. Sullivan M, Karlsson J. The Swedish SF-36 Health Survey III. Evaluation of criterion-based validity: results from normative population. J Clin Epidemiol 1998; 51(11):1105-1113.

21. Wang R, Wu C, Zhao Y, Yan X, Ma X, Wu M, Liu W, Gu Z, Zhao J, He J. Health related quality of life measured by SF-36: a population-based study in Shanghai, China. BMC Public Health 2008; 8:292.