Reproducibility, Relative Validity and Calibration of a Food Frequency Questionnaire for Adults


Abstract
The purpose of the present study was to evaluate the reproducibility and relative validity of a food frequency questionnaire (FFQ), and to calibrate its dietary intake estimates, using a random sample of 195 adults aged 20 to 50 years from the Central-West Region of Brazil. The reference method was two 24-hour recalls (24hR), which provided energy-adjusted, deattenuated intake data for comparison. With respect to reproducibility, the average weighted kappa was 0.43 and exact agreement was 41.5%. With regard to relative validity, correlation coefficients ranged from 0.32 (thiamin) to 0.51 (carbohydrates), with a mean of 0.41. Deattenuation and adjustment for energy intake decreased most correlation coefficients in relation to crude values. The FFQ showed good reliability and moderate validity for most nutrients based on classification into quartiles of energy and nutrient intake. The calibrated means of the FFQ were closer to the means estimated from the 24hR and had lower standard deviations.

Introduction
Measuring exposure to dietary factors in epidemiological studies is a complex task because eating is influenced by the interaction of several factors, including an individual's biological characteristics, behavioral and affective issues, culture and socioeconomic status. These factors complicate not only the reporting of food consumption but also data interpretation. The growing prevalence of diet-related and nutrition-related diseases, such as obesity and noncommunicable chronic diseases, has stimulated the development of more appropriate dietary assessment tools 1,2.
Food frequency questionnaires (FFQ) have proven to be an essential tool for epidemiological studies, especially cohort and case-control studies, investigating the role of diet in the etiology, prevention and treatment of diseases, and for the evaluation of interventions aimed at promoting healthy dietary habits 3. FFQs are widely used to estimate dietary intake in prospective studies because they allow researchers to estimate usual intake, are relatively inexpensive, and yield data that are practical and simple to analyze 4,5.
However, drawing up the food lists used in the questionnaires is time-consuming, and the tool is subject to measurement errors that often lead to biased estimates of the association between diet and disease development. Therefore, the evaluation of reproducibility, validation and calibration of FFQs is a critical element of effective dietary intake assessment 6.
Evaluation of reproducibility estimates the similarity between the results of the same FFQ administered on two separate occasions, assuming no significant changes in dietary habits between the two time points 7. Validity and calibration are assessed by comparing the results of the FFQ with those of another dietary assessment method, which serves as a reference method to quantify measurement error 8. Validation assesses the accuracy of dietary intake estimates 9, while calibration identifies correction factors to improve the accuracy of the food and nutrient intake estimated from FFQ data 10,11.
In Brazil, FFQs have been developed for the populations of the following cities: Rio de Janeiro (two questionnaires) 12,13, São Paulo (three questionnaires) 14,15,16, Goiânia, Goiás State (one questionnaire) 17, Brasília (one questionnaire) 18 and Porto Alegre, Rio Grande do Sul State (two questionnaires) 19,20. However, the majority of these FFQs have not been validated through population-based studies, and deattenuated coefficients have not been calculated to correct the data for intrapersonal variability 9.
This study uses internationally recommended statistical techniques to evaluate the reproducibility and validity of an FFQ designed for adults from the Central-West Region of Brazil, and to calibrate its dietary intake estimates 21. This tool will help in the study of diet-related diseases, especially obesity, metabolic syndrome and cardiovascular diseases, which have increased markedly in this region 22.

Study design
Two 24-hour dietary recalls were conducted at an interval of 30 days. The 24-hour dietary recall (24hR) was chosen as the reference method since it is the most suitable method for the assessment of individual food intake in large samples 23. Reproducibility of the FFQ was evaluated by conducting the questionnaire on two separate occasions (FFQ1 and FFQ2), also at an interval of 30 days.

Sample
A total of 115 men and 115 women were randomly selected from a sub-sample of 686 adults aged between 20 and 50 years from a population-based survey conducted to estimate the prevalence of arterial hypertension 24. This sample size was chosen based on the findings of Cade et al. 9, who established that a sample size of 100 individuals is sufficient for the analysis of FFQ validity.

Data collection
Data were collected between July and September 2007 through in-home interviews conducted by two trained nutritionists. Home visits were made on randomly selected weekdays and weekend days. If the individual was not located by the third visit, the interview was conducted with the next-door neighbor on the right-hand side of the original interviewee's home.
During the first visit, a socioeconomic questionnaire was filled out prior to completing the FFQ and 24hR. A second visit to conduct the second FFQ and 24hR was scheduled for a date at least 30 days after the first visit. As recommended by Willett 7, the FFQ was conducted before the reference method on both occasions.

The FFQ
The food list used in the FFQ was developed from a dietary survey (24hR) of a sample of 104 individuals aged between 20 and 50 years 25. A total of 81 food items, comprising those items reported by at least 15% of sample members, were included in the list 25. The questionnaire included the following frequency of consumption options: more than three times a day; twice or three times a day; once a day; five or six times a week; twice to four times a week; once a week; once to three times a month; and never or almost never. For 63 of the 81 food items, intake was recorded using three portion size options. For the remaining items, including onions, peppers, butter and cream cheese, only information on intake frequency was requested. Reference portions were defined based on the most frequently reported portion sizes in the 24hR. Standard portion sizes or units were adopted for certain foods, such as bread rolls or eggs. The time frame for the FFQ was the six months preceding the interview 25.

The 24hR
Individuals were asked to describe the food consumed the day before the interview, including the type of food consumed, portion size, mode of preparation and time and location of consumption. Interview techniques were used to help interviewees recall food intake by associating consumption with the activities performed during the day. Pictures of food items and utensils 26 were also used to help individuals describe the amount consumed. Individuals were probed to remember foods that are often forgotten in 24hR, such as beverages, sauces, spreads, snacks and sugar 27.

Data treatment
A detailed review of the questionnaires was performed to screen out implausible energy intake reports (outliers with values below the second percentile or above the 98th percentile in both recalls, n = 13). Food consumption frequency reported in both FFQs was converted into a daily equivalent frequency of consumption by assigning each frequency option a value proportional to a base value of 1.0 for the "once a day" option. For example, the value 2.5 was assigned to the option "twice or three times a day", calculated as [(2 + 3)/2] = 2.5, while the value 0.79 was assigned to the "five or six times a week" option, calculated as the average frequency [(5 + 6)/2] divided by seven (the number of days in a week).
Serving sizes were presented in their respective units of measurement based on standard amounts determined by previous studies 28 and measurements taken in the laboratory. Daily intake was calculated by multiplying the daily equivalent frequency of consumption by the respective amount per serving in grams or milliliters.
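The conversion described above can be sketched in a few lines of Python. This is only an illustration: the paper gives the values for three options (1.0, 2.5 and 0.79), so the value for the open-ended "more than three times a day" option and the 30-day month are assumptions, and the names `DAILY_EQUIVALENT` and `daily_intake` are hypothetical.

```python
DAYS_PER_WEEK = 7
DAYS_PER_MONTH = 30  # assumption: the paper does not state the month length used


def midpoint(low, high):
    """Average of the two ends of a frequency range."""
    return (low + high) / 2


# Daily-equivalent frequency for each FFQ response option ("once a day" = 1.0).
DAILY_EQUIVALENT = {
    "more than three times a day": 4.0,  # assumption: value not given in the paper
    "twice or three times a day": midpoint(2, 3),
    "once a day": 1.0,
    "five or six times a week": midpoint(5, 6) / DAYS_PER_WEEK,
    "twice to four times a week": midpoint(2, 4) / DAYS_PER_WEEK,
    "once a week": 1 / DAYS_PER_WEEK,
    "once to three times a month": midpoint(1, 3) / DAYS_PER_MONTH,
    "never or almost never": 0.0,
}


def daily_intake(option, serving_amount):
    """Daily intake (g or mL) = daily-equivalent frequency x amount per serving."""
    return DAILY_EQUIVALENT[option] * serving_amount
```

For instance, a 30 g serving reported "twice or three times a day" would correspond to 2.5 x 30 = 75 g per day under this scheme.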
The NutWin software (Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil) was used to estimate daily energy and nutrient intake (carbohydrate; protein; total, saturated and unsaturated fats; cholesterol; fiber; calcium; iron; vitamin C; thiamin; and folate) based on data from both the FFQs and 24hR. The NutWin software food composition table is based on data obtained from the United States Department of Agriculture (USDA). The nutritional composition of foods not on the NutWin list was taken from the Brazilian Food Composition Table 29. Nutrient intake was energy-adjusted using the residual method 30.
Mean usual energy and nutrient intake based on data from the 24hR was estimated using the PC-SIDE program (Department of Statistics, Iowa State University, Ames, USA), which uses the Iowa State University method 31,32 to correct data for intraindividual variability. The means were deattenuated considering the ratio of within-subject to between-subject variance.
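As a rough illustration of the residual method, the sketch below regresses nutrient intake on total energy intake and adds each person's residual to the intake predicted at the mean energy intake of the sample. It is a minimal sketch in plain Python, not the authors' actual procedure; the function names are hypothetical and the paper's log-transformation step for skewed variables is not shown.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def energy_adjust(nutrient, energy):
    """Residual method: residual from the nutrient-on-energy regression
    plus the nutrient intake expected at the mean energy intake."""
    intercept, slope = fit_line(energy, nutrient)
    expected_at_mean = intercept + slope * mean(energy)
    return [
        n - (intercept + slope * e) + expected_at_mean
        for n, e in zip(nutrient, energy)
    ]
```

By construction, the adjusted values keep the original mean nutrient intake but are uncorrelated with total energy intake.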
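The paper does not print the deattenuation formula it used. A commonly used correction multiplies the observed correlation by sqrt(1 + ratio / n), where ratio is the within-subject to between-subject variance ratio of the reference method and n is the number of replicate recalls per person (two in this study). The sketch below assumes that formula and a simple variance-components estimate; the function names are hypothetical, and PC-SIDE itself uses a more elaborate procedure.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(replicates):
    """Within/between-subject variance ratio from replicate recalls.

    `replicates` is a list with one list per subject, each holding the
    same number n (>= 2) of recall values for a single nutrient.
    """
    n = len(replicates[0])
    within = mean([variance(r) for r in replicates])
    # The variance of subject means still contains within/n of day-to-day noise.
    # Note: with noisy data this estimate can be zero or negative.
    between = variance([mean(r) for r in replicates]) - within / n
    return within / between


def deattenuate(r_observed, ratio, n_replicates):
    """Correct an observed correlation for within-subject variation."""
    return r_observed * sqrt(1 + ratio / n_replicates)
```

With two recalls and a variance ratio of 2.0, for example, an observed correlation of 0.40 would be corrected to about 0.57.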

Data analysis
Reproducibility of the FFQ was assessed using the weighted kappa statistic to evaluate the degree of agreement for classification into quartiles of energy and nutrient intake based on data from the two FFQs. Exact agreement (the proportion of individuals categorized into the same quartile), adjacent agreement (the proportion of individuals categorized into adjacent quartiles) and disagreement (the proportion of individuals categorized into opposite and distant quartiles) were calculated. Furthermore, crude and energy-adjusted intraclass correlation coefficients (ICCs) were calculated to assess the degree of agreement between energy and nutrient intake estimated from the FFQ1 and the FFQ2. We also calculated the ICCs for portions and food items.
The relative validity of the FFQ was assessed by comparing the estimates obtained using data from the FFQ2 with the crude and deattenuated means of the two 24hR. Student's t-test was used to ascertain differences between mean energy and nutrient intake based on data from the FFQ2 and crude mean energy and nutrient intake based on data from the two 24hR. Pearson's correlation coefficients were also calculated between energy and nutrient intake from the FFQ2 and the crude, and the deattenuated and energy-adjusted, estimates from the 24hR. Finally, the study evaluated the degree of agreement between the two methods based on the classification of individuals into quartiles of energy and nutrient intake using the weighted kappa. Exact agreement and disagreement were evaluated by comparing the proportion of individuals classified into the same and opposite quartiles.
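The quartile-based agreement measures can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' code: quartiles are assigned by rank, "disagreement" is taken to mean a difference of two or more quartiles, and the kappa uses linear weights, which the paper does not specify. All function names are hypothetical.

```python
def quartiles(values):
    """Assign each value a quartile 0..3 by rank (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank * 4 // len(values)
    return result


def agreement(qa, qb):
    """Proportions of exact agreement, adjacent agreement and disagreement."""
    n = len(qa)
    diffs = [abs(a - b) for a, b in zip(qa, qb)]
    return {
        "exact": sum(d == 0 for d in diffs) / n,
        "adjacent": sum(d == 1 for d in diffs) / n,
        "disagreement": sum(d >= 2 for d in diffs) / n,  # assumption: 2+ quartiles apart
    }


def weighted_kappa(qa, qb, k=4):
    """Cohen's weighted kappa with linear weights (assumption) for k categories."""
    n = len(qa)

    def weight(i, j):
        return 1 - abs(i - j) / (k - 1)

    pa = [qa.count(c) / n for c in range(k)]
    pb = [qb.count(c) / n for c in range(k)]
    observed = sum(weight(a, b) for a, b in zip(qa, qb)) / n
    expected = sum(weight(i, j) * pa[i] * pb[j] for i in range(k) for j in range(k))
    return (observed - expected) / (1 - expected)
```

In use, `quartiles` would be applied separately to the FFQ1 and FFQ2 estimates of a nutrient, and the two resulting lists passed to `agreement` and `weighted_kappa`.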
Calibration was performed using a linear regression model. Mean deattenuated energy and nutrient intake values from the 24hR were used as dependent variables and estimates based on data from the FFQ2 were used as independent variables. Values for each nutrient were calibrated using the following equation:
$$\text{calibrated intake} = \alpha + \lambda Q$$
where α is the regression constant, λ is the slope of the regression line, and Q is the estimated energy and nutrient intake from the FFQ. The significance level was set at p = 0.05 for all analyses. The Kolmogorov-Smirnov test was used to test the distribution symmetry of continuous variables. Where necessary, variables with asymmetric distributions were log-transformed to obtain symmetric distributions to fit the model.
This study was conducted in accordance with the Declaration of Helsinki guidelines and all relevant procedures were approved by the Ethics Committee of the Júlio Muller University Hospital (application number # 234/CEP-HU-JM/05). Participants signed an informed consent form after being informed of the goals of the study.
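The calibration step amounts to regressing the reference intake on the FFQ intake and then replacing each FFQ value Q by α + λQ, with α and λ as defined above. The sketch below shows this for a single nutrient in plain Python; it is a minimal illustration with hypothetical function names and does not include the log transformation mentioned in the text.

```python
from statistics import mean


def fit_calibration(ffq, reference):
    """Regress reference intake (dependent) on FFQ intake (independent).

    Returns (alpha, lam): the regression constant and the slope.
    """
    mq, mr = mean(ffq), mean(reference)
    sxy = sum((q - mq) * (r - mr) for q, r in zip(ffq, reference))
    sxx = sum((q - mq) ** 2 for q in ffq)
    lam = sxy / sxx
    alpha = mr - lam * mq
    return alpha, lam


def calibrate(ffq, alpha, lam):
    """Calibrated intake = alpha + lam * Q for each FFQ estimate Q."""
    return [alpha + lam * q for q in ffq]
```

Because the fitted line passes through the two sample means, the calibrated values have the same mean as the reference data, and a slope below 1.0 shrinks their spread relative to the crude FFQ values.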

Results
The FFQs and 24hR were completed by 208 subjects (108 women and 100 men). The main reasons for loss were change of address and inability to attend the second interview. Thirteen individuals were excluded after data screening due to inconsistencies in dietary information. The final sample therefore consisted of 195 individuals (100 women and 95 men), corresponding to 85% of the original sample. The mean time interval between the first and second FFQ was 36 days (standard deviation = six days). Mean age was 34 years (standard deviation = 10 years), with a predominance of individuals in the 20 to 29 year age range (42%). The majority of individuals (52%) had completed high school and 68% were living with a per capita household income of less than the Brazilian monthly minimum wage at the time of the study (R$380 equivalent to US$203.92). Compared to the two 24hR, the FFQ1 and FFQ2 overestimated energy and nutrient intake (Table 1).
The weighted kappa of agreement of classification into quartiles of energy and nutrient intake between the FFQ1 and FFQ2 ranged from 0.33 (protein) to 0.60 (energy) (p < 0.01). Mean exact agreement was 41.5%, while mean disagreement was 4.5% (Table 2).
Pearson's correlation coefficients between energy and nutrient intake from the FFQ2 and crude mean energy and nutrient intake from the 24hR varied between 0.32 (thiamin) and 0.51 (carbohydrates). When the FFQ2 estimates were correlated with the energy-adjusted and deattenuated 24hR means, coefficients ranged from 0.12 (unsaturated fat) to 0.41 (calcium) (Table 3).
For most nutrients, coefficients dropped relative to the crude values after adjustment for energy intake, but they increased for calcium (r = 0.40 to r = 0.41), vitamin C (r = 0.34 to r = 0.39), thiamin (r = 0.32 to r = 0.33) and folate (r = 0.35 to r = 0.36). After deattenuation and adjustment for energy, the strongest correlation was found for calcium and the weakest for unsaturated fat (r = 0.12).
The weighted kappa of agreement of classification into quartiles of energy and nutrient intake between the FFQ2 and 24hR varied from 0.06 (unsaturated fat) to 0.53 (energy). Among the macronutrients, agreement was highest for fiber (k = 0.38). The results also revealed a reasonable degree of agreement for the micronutrients calcium and vitamin C (k = 0.36). Exact agreement was highest for energy (45.1%) and lowest for unsaturated fat (26.2%). Disagreement was highest for unsaturated fat (12.8%) (Table 4).
Table 5 presents the results of calibration, with values ranging from -0.75 (iron) to 0.46 (vitamin C). The results of calibration for energy, carbohydrates, protein, fat, unsaturated fat, cholesterol, fiber, thiamin and folate were statistically significant. As expected, the mean calibrated values based on data from the FFQ were similar to the energy-adjusted mean values from the 24hR, and were associated with a considerable reduction in standard deviation.

Discussion
The FFQ assessed by this study showed good reproducibility and moderate relative validity. The results of calibration to correct estimates indicate that this FFQ is an appropriate tool for assessing food intake in adults from the Central-West Region of Brazil. The analysis of reproducibility reveals that the FFQ is generally reliable considering that more than 75% of the participants were classified into the same or adjacent quartiles of energy and nutrient intake. ICCs observed by this study were similar to those found by Salvo & Gimeno 16 and Kesse-Guyot 33, but slightly lower than values reported by Zanolla et al. 19 and Marques-Vidal et al. 34.
The average time interval between conducting the two FFQs (30 days) is considered adequate because it is unlikely to influence eating habits and does not give individuals enough time to become familiar with the questionnaire 9.
The evaluation of relative validity was based on correlation coefficients and agreement for classification into quartiles of energy and nutrient intake. Correlation coefficients are widely used by studies evaluating dietary assessment methods. Cade et al. 9 reported that 83% of studies assessing the validity of FFQs used this statistical measure in their analyses. In a review of FFQ validation studies, Thompson & Byers 35 observed that correlation coefficients ranged from 0.4 to 0.7. The mean correlation coefficient for crude estimates of energy and nutrient intake in the present study is within an acceptable level (mean r = 0.41), and is similar to mean values found by Jackson et al. 36 (r = 0.44), Kusama et al. 37 (r = 0.46), Marques-Vidal et al. 34 (r = 0.42), Block et al. 38 and Deschamps et al. 39 (r = 0.52). FFQ validation studies carried out in Brazil also showed similar values: Fornés et al. 17 and Ribeiro et al. 18 found a mean Pearson correlation coefficient of 0.50, and Henn et al. 20 observed a mean value of 0.49.
For most nutrients, correlation coefficients decreased after energy adjustment and deattenuation, remaining below 0.30. However, the correlation coefficients for calcium, vitamin C, thiamin and folate remained constant or increased after adjustment. Similarly low deattenuated correlation coefficients were reported by Fornés et al. 17 in an assessment of an FFQ for construction workers in the Central-West Region of Brazil.
The correlation coefficients observed for fiber, calcium, vitamin C, thiamin and folate were higher than those reported by Chen et al. 40, who evaluated the validity of an FFQ developed for adults in Bangladesh and observed correlation coefficients below 0.3 for these and other nutrients, such as iron and vitamin B6.
According to Freedman et al. 41, the low magnitude of the correlations may be a result of several factors, including: biased reporting, where people with high food intake tend to underestimate consumption; reference method bias; variations in food intake during the study period; and difficulties in recalling food intake and correctly estimating food portions.
Decreases in correlation coefficients after deattenuation and adjustment for energy intake were also observed in other studies in Brazil 12,17,19,20. Adjustment for total energy intake is a common procedure 7. In the present study this process led to the elimination of large variations for most nutrients, except calcium, showing that this may not be a necessary step in validation studies where a strong correlation is observed with total food intake.
Our review of research conducted in Brazil, including studies carried out by Salvo & Gimeno 16, Fornés et al. 17, and Lima et al. 42, shows that adjustment for energy intake led to lower correlation coefficients for most items, while deattenuation minimally altered crude correlation coefficients. Based on the findings of other studies 16,38,43,44, a greater increase in correlation coefficients following deattenuation could have been expected for most nutrients. The use of only two 24hR may be insufficient for estimating usual energy and nutrient intake, thus limiting the relevance of the correlation coefficients. Higher correlation coefficients have been observed in studies that used a greater number of reference method replications. For example, Kroke et al. 45, in a study using twelve 24hR, observed coefficients ranging from 0.54 (fiber) to 0.84 (energy), while Johansson et al. 46, in a study using 10 replications of the 24hR, found values between 0.44 (protein and cholesterol) and 0.63 (carbohydrate). A study of a sample of adults in Brazil carried out by Crispin et al. 47 using four 24hR found low correlations for only two nutrients (calcium = 0.13 and lipids = 0.39), even in individuals with a low level of education. The values for the other nutrients assessed by the study were above 0.40 and ranged from 0.40 (protein) to 0.98 (retinol) among individuals with a higher level of education.
According to Cade et al. 9, an increase in the time interval between conducting dietary recalls is associated with improvements in the validity of the FFQ. Furthermore, Molag et al. 48 observed that correlation coefficients of studies that used a time interval of between eight and 14 days were significantly higher than in studies that used a time interval of between one and seven days. Pereira et al. 49, in an analysis of dietary intake in adults, observed a high ratio of within-subject to between-subject variance, with values of up to 15.6 for cholesterol in men. These authors showed that the number of replications needed to estimate usual energy and nutrient intake depends on the nutrient being analyzed and on the expected correlation coefficient: for example, achieving an expected correlation coefficient of 0.7 requires an average of eight replications, within a range of one (for carbohydrates) to 15 (for cholesterol).
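The dependence of the number of replications on the variance ratio and on the expected correlation can be illustrated with a formula commonly used in nutritional epidemiology, n = [r² / (1 − r²)] × (within-subject variance / between-subject variance). The text does not state which formula Pereira et al. applied, so the sketch below is an assumption; it does, however, reproduce the figure of about 15 replications quoted for cholesterol.

```python
def replicates_needed(expected_r, variance_ratio):
    """Number of replicate recalls needed so that the observed mean intake
    correlates with true usual intake at `expected_r`, given the
    within/between-subject variance ratio of the nutrient."""
    r2 = expected_r ** 2
    return r2 / (1 - r2) * variance_ratio


# Variance ratio of 15.6 (cholesterol in men) and expected correlation of 0.7.
cholesterol_days = replicates_needed(0.7, 15.6)
```

Under this formula, a nutrient whose within-subject and between-subject variances are roughly equal needs only about one recall to reach the same expected correlation.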
The classification of individuals into quantile categories (tertiles, quartiles, or quintiles) reveals the capacity of both methods to allocate individuals according to level of nutrient intake and is useful for estimating disease risk or relative protection measures 19. Masson et al. 50 recommend that no more than 10% of individuals should be classified into opposite quartiles and at least 50% should be classified into the same quartile in epidemiological dietary intake studies. In the present study, average exact agreement for energy and nutrient intake was 34%, and disagreement was below 10% for most of the items analyzed, except saturated and unsaturated fats and thiamin. It should be noted that the sum of exact and adjacent agreement was more than 70% for most nutrients, indicating a good degree of concordance and emphasizing that energy adjustment is not always necessary in validation studies. The rate of exact agreement was identical to that found by Henn et al. 20. In the same study the rate of disagreement was slightly lower (6.1%) than in the present study. Similarly, in a study by Zanolla et al. 19, exact agreement ranged from 35% (vitamins A and C) to 40% (energy), and average disagreement was 4%, and in a study carried out by Pakseresht & Sharma 51 exact agreement ranged from 30% (protein) to 45% (energy) and average disagreement was 4.5%.
The moderate validity of the FFQ may be partially explained by the reference method (24hR) used by this study. It is widely recognized that, in comparison to the doubly labeled water method that assesses energy expenditure, the 24hR underestimates energy intake 53. Underreporting in the 24hR may also explain why the FFQ means of energy and nutrient intake were significantly higher than the 24hR means. A review of the literature found that the typical level of underreporting among adults is approximately 20% and that higher levels are common among obese individuals 54. According to Buzzard 55, underreporting of consumption is highest among obese individuals, women, teenagers and the elderly. Furthermore, underreporting seems to be selective, and foods with more calories, such as cakes, soft drinks, cheese, sandwiches, butter, sauces and sugar, are generally poorly reported 43,55. Behavior related to underreporting of food intake is complex and includes perceptual, emotional, cognitive, social, moral and physical factors. Several aspects of underreporting remain largely unexplored, significantly affecting food intake assessment 43.
With respect to calibration using linear regression, intercept and slope (λ) values of approximately zero and 1.0, respectively, indicate absence of bias in the questionnaire, i.e., mean FFQ intake is similar to the mean intake estimated using the reference method 56. As observed in other calibration studies 12,57, slope values under 1.0 were observed in the present study. In this respect, it should be noted that, even under optimal study conditions, food consumption assessment is subject to several sources of error that affect data quality 7,58, including misreporting, within-subject variability, use of a limited number of food items, and excessive aggregation of items during grouping, which may induce subjects to under- or over-report intake. Furthermore, the attenuated slope may be attributed to violation of the theoretical assumptions essential to the calibration process, namely independence between the errors of the two dietary assessment methods and absence of systematic error in the reference method 7.
It was observed that standard deviations were lower for calibrated data than for crude data, corroborating findings of other studies 11,56,57. Calibration allows researchers to correct errors that occur in the classification of individuals since extreme values are affected by linear correction due to the linearity between FFQ and the reference method values 59.
For both the FFQ and 24hR, effective data collection depends on the ability of individuals to recall and describe their dietary intake. Certain sources of error are common to participants, resulting in positive correlations between measurement errors. Thus, variance in intake levels may be somewhat overestimated when recalls are used as a reference method 57,60.
This study represents a valuable contribution to nutritional epidemiology research in this region. Further studies are necessary to help refine the present FFQ and revise the food list and reference portions. Continuous evaluation of dietary assessment methods is essential considering current changes in food intake and the importance of FFQ for monitoring food habits and detecting groups at risk due to dietary intake and nutritional deficiencies.

Table 1
Mean energy intake, adjusted mean * and 95% confidence interval (95%CI) of nutrients included in the food frequency questionnaire (FFQ2) and means of the two 24-hour recalls (24hR). Adults from the Central-West Region of Brazil, 2007 (N = 195).

Table 2
Weighted kappa of energy adjusted intake and agreement (%) between the two food frequency questionnaires (FFQ1/FFQ2). Adults from the Central-West Region of Brazil, 2007 (N = 195).
*** Corrected using the variance ratio (intra-individual and inter-individual variability); # Energy adjusted using the residual method.

Table 4
Weighted kappa of energy adjusted intake and agreement (%) between the food frequency questionnaire (FFQ2) and the means of the two 24-hour recalls (24hR) based on classification into quartiles of energy and nutrient intake. Adults from the Central-West Region of Brazil, 2007 (N = 195).
Masson et al. 50 suggested a minimum kappa value for reasonable agreement of 0.40, whereas Crewson 52 noted that kappa values between 0.21 and 0.40 indicate reasonable agreement. The kappa values observed by this study were reliable for energy (0.53) and reasonable for protein (0.30), cholesterol (0.33), fiber (0.38), calcium (0.36), iron (0.30), vitamin C (0.36) and folate (0.26). Reasonable values were also reported by Zanolla et al. 19 and Pakseresht & Sharma 51 for most nutrients.