# Abstract

This paper describes the sample design for the National Survey into Labor and Birth in Brazil. The hospitals with 500 or more live births in 2007 were stratified by the five Brazilian regions, state capital or not, and type of governance. They were then selected with probability proportional to the number of live births in 2007. An inverse sampling method was used to select as many days (minimum of 7) as necessary to reach 90 interviews in the hospital. Postnatal women were sampled with equal probability from the set of eligible women who had entered the hospital on the sampled days. Initial sample weights were computed as the reciprocals of the sample inclusion probabilities and were calibrated to ensure that total estimates of the number of live births from the survey matched the known figures obtained from the Brazilian System of Information on Live Births. For the two telephone follow-up waves (6 and 12 months later), the postnatal woman’s response probability was modelled using baseline covariate information in order to adjust the sample weights for nonresponse in each follow-up wave.

Sampling Studies; Stratified Sampling; Statistical Models; Parturition

# Introduction

According to do Carmo Leal et al. ^{1}, the objectives of the National Survey into Labour and Birth were: (1) to describe the incidence of excessive caesarean section (according to Robson’s groups) and examine the consequences on women’s and new-borns’ health; (2) to investigate the relationship between excessive caesarean section and late preterm birth and low birth weight; and (3) to investigate the relationship between excessive caesarean section and the use of technological procedures after birth.

This article describes the sample design used in the survey including the definition of the survey population, the stratification of primary sampling units, the criteria for selection of hospitals, days and postnatal women, the base sample weights calculation and their calibration. It also describes the strategy used for estimating the response probabilities of respondents in the two additional telephone follow-up waves six and 12 months after the interview in the hospital, in order to calculate the sampling weights for the respondents in each follow-up wave.

# Survey population, first stage sampling frame and stratification

The survey population ^{2} corresponds to the set of postnatal women who gave birth in 2011 in hospitals with 500 or more live births in 2007, according to the Information System on Live Births (SINASC; http://portal.saude.gov.br/portal/saude/visualizar_texto.cfm?idtxt=21379). The SINASC was created by the Brazilian Department of Health in 1990 to gather epidemiological information on live births in hospitals and households all over the country.

For operational reasons, a number of groups were excluded from the survey population, including postnatal women with severe mental health disorders, those who were homeless or were foreigners who did not understand Portuguese, deaf/mutes, and women sectioned by court order. Given the survey population definition, only hospitals with 500 or more live births in 2007 were included in the first stage sampling frame. In the end, 1,403 of the 3,961 hospitals registered in 2007 were eligible for the study, accounting for 2,228,534 (77.1%) of the 2,891,328 live births that year.

The sample was required to cover the different types of hospital governance (public, private and mixed) in all five macro-regions of the country, both in state capitals and in other cities, which differ markedly in the size and kinds of health services available. For this reason, the hospitals in the first stage sampling frame were stratified by the combination of macro-region, capital or non-capital location, and type of hospital governance, defining the strata presented in Table 1. Private hospitals that had beds contracted by the public sector were classified as mixed governance.

# Sample size and its allocation by stratum

According to do Carmo Leal et al. ^{1}, the sample size in each stratum was calculated based on the caesarean section rate in Brazil in 2007 of 46.6%, with 5% significance to detect differences of 14% between public, mixed and private hospitals and power of 95%. The minimum sample per stratum was 341 postnatal women. Since the sample was clustered by hospital, a design effect of approximately 1.3 was used to inflate the initial sample sizes, leading to a minimum sample size of 450 postnatal women per stratum.

Although unusual in sample surveys, this way of determining sample size is common in clinical trials and randomized experiments. It derives from a two-tailed test of the hypothesis of equality between the proportions within treatment and control groups ^{3}. For this calculation, expression 3.14 from Fleiss ^{4} was used.
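The calculation can be sketched as follows. This is an illustration under stated assumptions, not the authors' own computation: it assumes that the comparison proportion is 46.6% plus the 14% difference (60.6%) and that the continuity-corrected form of the two-proportion sample size formula was applied. The function name `fleiss_sample_size` and its parameters are introduced here for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def fleiss_sample_size(p1, p2, alpha=0.05, power=0.95, continuity=True):
    """Per-group sample size for a two-sided test of equality of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    if continuity:
        # Continuity correction (assumed here; the source only cites the formula number).
        n = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return n


# Assumed inputs: 46.6% caesarean rate versus 46.6% + 14%.
n_min = ceil(fleiss_sample_size(0.466, 0.466 + 0.14))
n_inflated = n_min * 1.3  # design effect of approximately 1.3
```

Under these assumptions `n_min` evaluates to 341, matching the minimum reported in the text, and the inflated value of about 443 is consistent with the rounded figure of 450 postnatal women per stratum.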

According to do Carmo Leal et al. ^{1}, the sample size has a power of 80% to detect adverse outcomes in the order of 3%, and differences of at least 1.5% among large geographic regions or type of hospital governance (public/private/mixed).

Given the minimum size of 450 postnatal women per stratum, it was decided to select at least five hospitals per stratum, leading to a sample size of 90 postnatal women per hospital. Under an equal allocation among the strata, these parameters would lead to a sample size of 210 hospitals. However, an allocation proportional to the number of hospitals in each stratum was used, which led to a sample size of 266 hospitals, since in every stratum with an allocated sample size smaller than five hospitals the sample size was increased to five in order to ensure a minimum of five hospitals and 450 postnatal women, as indicated in Table 1.
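The effect of the five-hospital floor on a proportional allocation can be illustrated with a minimal sketch. The stratum labels, frame counts and starting total below are hypothetical and are not taken from Table 1; the function `allocate_hospitals` is likewise an illustrative name.

```python
def allocate_hospitals(hospitals_per_stratum, total_sample, minimum=5):
    """Proportional allocation of hospitals with a per-stratum floor."""
    frame_total = sum(hospitals_per_stratum.values())
    allocation = {}
    for stratum, count in hospitals_per_stratum.items():
        proportional = round(total_sample * count / frame_total)
        # Raise small allocations to the floor, but never above the
        # number of hospitals that exist in the stratum (take-all).
        allocation[stratum] = min(count, max(minimum, proportional))
    return allocation


# Hypothetical frame: three strata with very different numbers of hospitals.
frame = {"A": 200, "B": 40, "C": 10}
allocation = allocate_hospitals(frame, total_sample=50)
total_allocated = sum(allocation.values())
```

In this toy frame the proportional shares are 40, 8 and 2 hospitals; raising the last stratum to five gives a total of 53 rather than 50, which is the same mechanism that pushes the total above the starting figure in the survey.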

# Hospital selection

In the first stage, the hospitals were selected with probability proportional to size (PPS), with size defined as the number of live births in the hospital according to SINASC 2007. As usual in PPS selection, the hospitals with large numbers of live births (more than 13 per day on average, in this case) were included with certainty in the sample and treated as selection strata for sampling days and postnatal women. In strata with five or fewer hospitals, a take-all procedure was used and each hospital was also treated as a selection stratum for the subsequent sampling stages.

The hospital selection was done systematically ^{5}, after sorting the hospitals in each stratum in ascending order by number of live births in 2007. The sample inclusion probabilities of hospitals are provided in expressions (1a) and (1b) of Figure 1.
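A systematic PPS draw within one stratum can be sketched as below. This is a generic textbook version, not a transcription of expressions (1a) and (1b): it assumes that certainty hospitals have already been set aside, and that the inclusion probability of each remaining hospital is the sample size times its share of the stratum's live births. The hospital identifiers and counts are hypothetical.

```python
import random


def pps_systematic(births, n, seed=None):
    """Systematic PPS draw of n hospitals; births maps hospital id -> live births."""
    rng = random.Random(seed)
    # Sort in ascending order of size, as described in the text.
    ordered = sorted(births.items(), key=lambda item: item[1])
    total = sum(size for _, size in ordered)
    step = total / n
    if any(size >= step for _, size in ordered):
        raise ValueError("remove certainty hospitals before the systematic draw")
    start = rng.random() * step  # random start in [0, step)
    selected, cumulative, k = [], 0, 0
    for hospital, size in ordered:
        cumulative += size
        # A hospital is selected when a selection point falls in its size interval.
        while k < n and start + k * step < cumulative:
            selected.append(hospital)
            k += 1
    probabilities = {h: n * size / total for h, size in births.items()}
    return selected, probabilities


# Hypothetical stratum with five non-certainty hospitals.
births = {"h1": 600, "h2": 900, "h3": 1500, "h4": 2000, "h5": 1000}
selected, probabilities = pps_systematic(births, n=2, seed=1)
```

Because every hospital is smaller than the sampling interval, no hospital can be hit twice, and the inclusion probabilities sum to the stratum sample size.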

# Selection of survey days

In the second stage of sampling, an inverse sampling method ^{2,6} was used to select as many days as necessary to reach 90 postnatal women interviewed in the hospital. This method, originally proposed by Haldane ^{6} to estimate frequencies and proportions, can be defined as a technique to sample as many units (in this case, days) as needed to be observed in order to obtain a pre-specified number of successes or, in this case, 90 interviews performed with postnatal women in the hospital.

It is called inverse sampling because, rather than defining a fixed number of days sufficient to yield an expected sample size of 90 interviews, as done by Veloso et al. ^{7}, it uses the number of interviews performed as the stopping rule for the consecutive sample of survey days. The first survey day in each hospital was always selected with equal probability during the year, as indicated by expression (2) of Figure 1. The -1 terms in the numerator and denominator of expression (2) are explained by the loss of one degree of freedom due to the stopping rule, as defined by Haldane ^{6}.

To account for differences in the number of live births between weekends and weekdays, a minimum of seven consecutive days was mandatory, and the size of the field team was set to ensure that this rule was met.
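The stopping rule can be illustrated with a small simulation. This is a sketch under assumptions: the daily interview quota is treated as fixed and always achieved, and the sequence of days is allowed to wrap around the end of the year, neither of which is stated in the text. The name `inverse_sample_days` is illustrative.

```python
import random


def inverse_sample_days(daily_quota, target=90, seed=None):
    """Walk consecutive days from a random start until `target` interviews are done."""
    rng = random.Random(seed)
    n_days = len(daily_quota)
    start = rng.randrange(n_days)  # first day chosen with equal probability
    days, completed = [], 0
    while completed < target and len(days) < n_days:
        day = (start + len(days)) % n_days  # assumption: wrap around the year
        days.append(day)
        # On the last day only the interviews still needed are counted.
        completed += min(daily_quota[day], target - completed)
    return days, completed


# Hypothetical hospital with the largest quota (12 interviews per day).
days, completed = inverse_sample_days([12] * 365, seed=42)
```

With a quota of 12 interviews per day the target is reached on the eighth day, so even the largest team configuration respects the seven-day minimum; a quota of four interviews per day would take 23 days.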

# Selection of postnatal women

The number of postnatal women to be selected per day and hospital depended on the number of live births and the numbers of interview shifts and interviewers per day in the hospital. To establish the number of shifts and interviewers, the mean number of live births per day per hospital in 2007 was used and four combinations were defined: (1) one interviewer and one shift for four interviews; (2) one interviewer and two shifts for six interviews; (3) two interviewers and one shift for eight interviews; and (4) two interviewers and two shifts for twelve interviews.

To ensure a random selection of postnatal women, the survey central office prepared tables with the order numbers of the women to be interviewed, according to the number of live births (up to 40) and the number of interviews per day and hospital (4, 6, 8 and 12). The order number of each postnatal woman was defined by her order of entrance in the hospital. Some additional order numbers were selected as replacements for non-responses.
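A selection table of this kind could be generated as in the sketch below. The number of replacement order numbers (`n_replacements`) and the function name are assumptions made for illustration; the text does not state how many replacements were drawn.

```python
import random


def selection_table(max_births=40, quotas=(4, 6, 8, 12), n_replacements=2, seed=None):
    """Map (live births, interviews per day) -> (selected order numbers, replacements)."""
    rng = random.Random(seed)
    table = {}
    for births in range(1, max_births + 1):
        for quota in quotas:
            order = list(range(1, births + 1))  # order of entrance in the hospital
            rng.shuffle(order)                  # equal-probability selection
            main = sorted(order[:quota])
            spare = order[quota:quota + n_replacements]  # hypothetical reserve
            table[(births, quota)] = (main, spare)
    return table


table = selection_table(seed=7)
main, spare = table[(40, 12)]
```

Under equal-probability selection, a woman on a day with `births` eligible women and a quota of `quota` interviews is included with probability `min(1, quota / births)`; when there are fewer women than the quota, all of them are taken.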

Unfortunately, the number of live births per hospital and survey day was not recorded during the field work. To overcome this problem, the SINASC 2011 and 2012 files were processed to determine the number of live births in each hospital and survey day, as required to calculate the inclusion probabilities described in expression (3) of Figure 1.

# Treatment of non-responses

Nine sampled hospitals refused to take part in the survey, and three had closed their maternity service before the start of the fieldwork. The established replacement procedure for hospital non-response consisted of replacing the non-responding hospital with the next hospital in the stratum, according to the sort order of hospitals in the first stage sampling frame. Despite this, it was not possible to replace two non-responding hospitals among private hospitals located in non-capital cities in the Northeast region, as indicated in Table 1.

Postnatal women’s non-response was treated, if possible, by replacement according to selection tables prepared for each hospital or by the inverse sampling procedure used in survey day selection (more days added to the sample until 90 complete interviews were achieved per hospital). In the case of closure of the maternity service during the field work, the inverse sampling procedure was interrupted, restarting as soon as the maternity service was open.

A total of 1,356 (5.7%) postnatal women selected were replaced, 15% due to early hospital discharge and 85% due to refusal to participate. The sample size was composed of 23,940 postnatal women interviewed in 266 hospitals. During processing, records with no data from the woman or no new-born medical records were excluded and the final sample size accounted for 23,894 postnatal women (Table 1).

# Sample weighting and calibration of sample weights

As indicated in Figure 1, the base sample weights were calculated as the reciprocals of the product of the inclusion probabilities in each sampling stage.

As usual in official statistical surveys (according to Silva ^{8}), calibration of the base sample weights was performed to enforce coherence between sample estimates and known population totals obtained from an external source. In addition, up to a point, calibration helps to compensate for potential sampling and nonresponse biases.

Since the field work was conducted in 2011 (and at the beginning of 2012 for a few hospitals), it seemed appropriate to keep the coherence between sample-based estimates and the total number of live births as obtained from the SINASC 2011 for the hospitals in the sampling frame, i.e. those with 500 or more live births in 2007.

For this reason, a ratio type calibration procedure of the base sample weights was performed within each of the selection strata, as indicated in expression (6) of Figure 1.
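The base weighting and ratio calibration can be sketched as follows. This is a generic ratio adjustment, not a transcription of the expressions in Figure 1; the function names, the toy weights and the stratum totals are all hypothetical.

```python
def base_weight(p_hospital, p_days, p_woman):
    """Reciprocal of the product of the inclusion probabilities of the three stages."""
    return 1.0 / (p_hospital * p_days * p_woman)


def ratio_calibrate(weights, strata, live_births, known_totals):
    """Scale weights within each stratum so weighted live births match known totals."""
    estimated = {}
    for w, s, y in zip(weights, strata, live_births):
        estimated[s] = estimated.get(s, 0.0) + w * y
    # One calibration factor per selection stratum: known total / estimated total.
    return [w * known_totals[s] / estimated[s] for w, s in zip(weights, strata)]


# Toy data: four women in two selection strata (all values hypothetical).
weights = [10.0, 20.0, 30.0, 40.0]
strata = ["a", "a", "b", "b"]
live_births = [1, 1, 1, 2]  # the last woman had twins
known_totals = {"a": 60, "b": 55}
calibrated = ratio_calibrate(weights, strata, live_births, known_totals)
```

After calibration the weighted number of live births in each stratum equals the external total, at the cost of changing the spread of the weights.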

Results comparing population data with estimates obtained using both the base and calibrated sample weights are presented in Table 2. These results show the coherence between estimates based on calibrated weights and the known population totals, as expected. Also as expected, calibration leads to a slight increase in the variation of the sample weights, as shown in Table 3. This increase in sample weight variation is the price paid to ensure coherent estimates.

# Sample weights for the two telephone follow-up waves

As expected, it was not possible to contact all postnatal women interviewed in the baseline survey during the two telephone interview follow-up waves. Some possibilities could be used to correct the non-response: (1) probabilistic imputation of non-respondents’ data; (2) treating the responding sample as a subsample of the baseline sample; or (3) modelling the probability of response in each follow-up wave as a function of some covariates obtained in the baseline survey and using these to derive nonresponse weight adjustments for responding women in each follow-up wave.

Considering the information on responses achieved in each follow-up wave as provided in Table 3, note that 67.4% and 49.9% of the women interviewed in the baseline survey responded in the first and second follow-up waves respectively. Due to the high nonresponse rates, the first two options were not considered suitable alternatives for nonresponse compensation.

Thus the solution adopted was to model the response probabilities using the covariate information available from the baseline survey. The procedure used was proposed by Little ^{9}, and is also described in Lepkowski ^{10} and Brick & Montaquila ^{11}.

The general idea behind the procedure used to obtain the sample weights in each telephone interview follow-up wave can be described in four steps, as presented in Figure 2.

**Figure 2**

Modeling response probabilities to calculate adjustments to the weights of the two segments.

In the first step, a model was fitted to explain the probability of responding to each follow-up wave for each postnatal woman in the baseline sample using the baseline covariate information as well as the follow-up wave response indicator. This procedure was applied independently for each follow-up wave.

In the second step, the predicted values of the response probabilities in each follow-up wave were estimated using the model fitted in step one.

In the third step, for each follow-up wave the quintiles of the predicted response probabilities were used to define five weight adjustment classes in which a response rate was estimated by the ratio of the sum of respondents’ baseline calibrated sample weights to the total of baseline calibrated sample weights of postnatal women of the class, as indicated by expression (9) of Figure 2.

In the last step, the reciprocals of the response rates estimated by follow-up wave and weight adjustment class were used to adjust the baseline calibrated sample weights of the postnatal women interviewed in each follow-up wave.
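Steps three and four can be sketched as below, taking the predicted response probabilities from steps one and two as given (the model fitting itself is not shown). The quintile classes are formed here by rank, which is a simplifying assumption, and the function names and toy data are hypothetical.

```python
def quintile_classes(predicted):
    """Assign each woman to one of five classes by rank of predicted probability."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    classes = [0] * len(predicted)
    for rank, i in enumerate(order):
        classes[i] = rank * 5 // len(predicted)
    return classes


def nonresponse_adjust(weights, responded, predicted):
    """Divide respondents' weights by the weighted response rate of their class."""
    classes = quintile_classes(predicted)
    total, resp = [0.0] * 5, [0.0] * 5
    for w, r, c in zip(weights, responded, classes):
        total[c] += w
        if r:
            resp[c] += w
    if any(t > 0 and x == 0 for t, x in zip(total, resp)):
        raise ValueError("an adjustment class has no respondents")
    # Nonrespondents get weight zero; respondents carry the weight of their class.
    return [w * total[c] / resp[c] if r else 0.0
            for w, r, c in zip(weights, responded, classes)]


# Toy baseline sample of ten women with equal calibrated weights (hypothetical).
weights = [10.0] * 10
predicted = [0.30, 0.35, 0.45, 0.50, 0.60, 0.65, 0.70, 0.75, 0.85, 0.90]
responded = [False, True, True, False, True, True, True, False, True, True]
adjusted = nonresponse_adjust(weights, responded, predicted)
```

Within each class the adjusted weights of the respondents add up to the calibrated weights of all women in the class, so the overall weighted total is preserved.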

For the models of response probability, the set of potential predictor variables initially considered included: macro-region; located in capital city or not; type of hospital governance; postnatal woman’s socioeconomic class (A+B, C, or D+E); delivery payment (public, private health insurance, or directly out of pocket); postnatal woman’s age class (12-19 years, 20-34 years, and 35 years or more); “*Have you got any work where you get paid?*” (yes or no); “*Were you satisfied with your pregnancy at its beginning?*” (yes or no); “*Still birth or neonatal death of child?*” (yes or no); race or skin color (white, black, brown, yellow, or indigenous); “*Were there obstetric complications during gestation leading to negative perinatal outcomes?*” (yes or no); and, for the second follow-up wave only, whether the woman had responded to the first follow-up wave (yes or no).

For the first follow-up wave, the significant predictor variables were the three variables that defined sample strata (macro-region, capital or not and type of hospital governance), postnatal woman’s socioeconomic class and postnatal woman’s age class.

For the second follow-up wave, the significant variables were the same five variables listed above plus “*Have you got any work where you get paid?*”, “*Were you satisfied with your pregnancy at its beginning?*” and “*Still birth or neonatal death of child?*”.

In the correction of the follow-up sample weights (third step), the predicted response probabilities were not used directly to adjust the baseline calibrated sample weights in each follow-up wave, in order to avoid undesirable variation in the final weights. In fact, Kish ^{12} demonstrates that sample weights may reduce bias but often increase the variance of weighted estimators, since the ratio between the variance of the weighted estimator and the variance of the corresponding unweighted estimator is equal to 1 plus the square of the coefficient of variation of the sample weights. Thus the approach adopted in the third and fourth steps corrects the follow-up sample weights for nonresponse while keeping the increase in weight variation to a minimum (Table 3).
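The variance inflation factor described by Kish can be computed directly from a set of weights, as in this minimal sketch (the function name and the example weights are illustrative):

```python
from statistics import mean, pstdev


def kish_variance_inflation(weights):
    """1 + CV^2 of the weights, using the population standard deviation."""
    cv = pstdev(weights) / mean(weights)
    return 1 + cv ** 2


equal = kish_variance_inflation([1.0, 1.0, 1.0, 1.0])  # no variation in weights
unequal = kish_variance_inflation([1.0, 3.0])           # CV = 0.5
```

Equal weights give a factor of 1 (no inflation), while the weights 1 and 3 give 1.25; the same quantity can be written as n times the sum of squared weights divided by the squared sum of weights.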

# Acknowledgments

To the regional and state coordinators, supervisors, interviewers and crew of the study and the mothers who participated and made this study possible.

# References

^{1} do Carmo Leal M, da Silva AA, Dias MA, da Gama SG, Rattner D, Moreira ME, et al. Birth in Brazil: national survey into labour and birth. Reprod Health 2012; 9:15.

^{2} Cochran WG. Sampling techniques. 3rd Ed. New York: John Wiley & Sons; 1977.

^{3} Altman DG. Practical statistics for medical research. London: Chapman and Hall; 1991.

^{4} Fleiss JL. Statistical methods for rates and proportions. 2nd Ed. New York: John Wiley & Sons; 1981.

^{5} Madow WG. On the theory of systematic sampling, II. Annals of Mathematical Statistics 1949; 20:333-54.

^{6} Haldane JBS. On a method of estimating frequencies. Biometrika 1945; 33:222-5.

^{7} Veloso VG, Portela MC, Vasconcellos MTL, Matzenbacher LA, Vasconcelos ALR, Grinsztejn B, et al. HIV testing among pregnant women in Brazil: rates and predictors. Rev Saúde Pública 2008; 42:859-67.

^{8} Silva PLN. Calibration estimation: when and why, how much and how. Rio de Janeiro: Instituto Brasileiro de Geografia e Estatística; 2004. (Textos para Discussão da Diretoria de Pesquisas, 14).

^{9} Little RJ. Survey nonresponse adjustments. International Statistical Review 1986; 54:139-57.

^{10} Lepkowski J. Non-observation error in household surveys in developing countries. In: Department of Economic and Social Affairs, Statistics Division, editor. Household surveys in developing and transition countries. New York: United Nations; 2005. p. 149-69. (Series F, 96).

^{11} Brick JM, Montaquila JM. Nonresponse and weighting. In: Pfeffermann D, Rao CR, editors. Handbook of statistics 29A. Sample surveys: design, methods and applications. Philadelphia: Elsevier; 2009. p. 163-85.

^{12} Kish L. Weighting for unequal Pi. Journal of Official Statistics 1992; 8:183-200.

**Funding**

National Council for Scientific and Technological Development (CNPq); Science and Technology Department, Secretariat of Science, Technology, and Strategic Inputs, Brazilian Ministry of Health; National School of Public Health, Oswaldo Cruz Foundation (INOVA Project); and Foundation for Supporting Research in the State of Rio de Janeiro (Faperj).

# Publication Dates

**Publication in this collection**

Aug 2014

# History

**Received**

09 Oct 2013

**Reviewed**

26 Feb 2014

**Accepted**

24 Mar 2014