SIPA INAF U8145
Spring 2024
Problem Set 3: Poverty and Inequality in Guatemala
Due Fri. April 5, 11:59pm, uploaded in a single pdf file on Courseworks
In this exercise, you will conduct an assessment of poverty and inequality in Guatemala. The data come from the
Encuesta de Condiciones de Vida (ENCOVI) 2000, collected by the Instituto Nacional de Estadistica (INE), the
national statistical institute of Guatemala, with assistance from the World Bank’s Living Standards Measurement
Study (LSMS). Information on this and other LSMS surveys is on the World Bank’s website at
http://www.worldbank.org/lsms. These data were used in the World Bank’s official poverty assessment for
Guatemala in 2003, available here.
Two poverty lines have been calculated for Guatemala using these ENCOVI 2000 data. The first is an extreme
poverty line, defined as the annual cost of purchasing the minimum daily caloric requirement of 2,172 calories.
By this definition, the extreme poverty line is 1,912 Quetzals (Q), or approximately I$649 (PPP conversion), per
person per year. The second is a full poverty line, defined as the extreme poverty line plus an allowance for non-food items, where the allowance is calculated from the average non-food budget share of households whose
calorie consumption is approximately the minimum daily requirement. (In other words, the full poverty line is the
average per-capita expenditures of households whose per-capita food consumption is approximately at the
minimum.) By this definition, the full poverty line is 4,319 Q, or I$1,467.
Note on sampling design: the ENCOVI sample was not a random sample of the entire population. First, clusters
(or “strata”) were defined, and then households were sampled within each cluster. Given the sampling design, the
analysis should technically be carried out with different weights for different observations. Stata has a special set
of commands to do this sort of weighting (svymean, svytest, svytab, etc.). But for the purpose of this exercise, we
will ignore the fact that the sample was stratified, and assign equal weight for all observations.1 As a result, your
answers will not be the same as in the World Bank’s poverty assessment, and will in some cases be unreliable.
1. Get the data. From the course website, download the dataset ps3.dta, which contains a subset of the variables
available in the ENCOVI 2000. Variable descriptions are contained in ps3vardesc.txt.
2. Start a new do file. My suggestion is that you begin again from the starter Stata program for Problem Set 1 (or
from your own code for Problem Set 1), keep the first set of commands (the “housekeeping” section) changing
the name of the log file, delete the rest, and save the do file under a new name.
3. Open the dataset in Stata (“use ps3.dta”), run the “describe” command, and check that you have 7,230
observations on the variables in ps3vardesc.txt.
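Steps 2 and 3 might look something like the sketch below. The housekeeping commands are a generic guess, not the actual starter program from Problem Set 1, and the log file name ps3.log is a placeholder. The code is written for Stata's default line delimiter; if your do file uses "#delimit ;", end each command with a semicolon.

```stata
* Housekeeping (generic sketch; the Problem Set 1 starter may differ)
clear all
set more off
capture log close
log using ps3.log, replace text   // ps3.log is a hypothetical file name

* Load the data and check its contents against ps3vardesc.txt
use ps3.dta, clear
describe
assert _N == 7230                 // stops the do file if the count is off
```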
4. Calculate the income rank for each household in the dataset (egen incrank = rank(incomepc)). Graph the
poverty profile. Include horizontal lines corresponding to the full poverty line and the extreme poverty line.
(Hint: you may want to create new variables equal to the full and extreme poverty lines.) When drawing the
poverty profile, only include households up to the 95th percentile in income per capita on the graph. (That is,
leave the top 5% of households off the graph.) Eliminating the highest-income households in this way will allow
you to use a sensible scale for the graph, and you will be able to see better what is happening at lower income
levels.
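One possible way to draw the graph in step 4 is sketched below. It assumes that incomepc is measured in Quetzals per person per year (the same units as the poverty lines); check ps3vardesc.txt before relying on that. The names fullline, extline, and cutoff are made up for illustration.

```stata
* Rank households by income per capita (ties receive average ranks)
egen incrank = rank(incomepc)

* Poverty lines as constants, in Q per person per year (from the text above)
gen fullline = 4319
gen extline  = 1912

* Rank cutoff for the 95th percentile among non-missing incomes
count if !missing(incomepc)
local cutoff = 0.95 * r(N)

* Poverty profile: income against rank, with both poverty lines overlaid
twoway (line incomepc incrank, sort) ///
       (line fullline incrank, sort) ///
       (line extline  incrank, sort) ///
       if incrank <= `cutoff', ///
       ytitle("Income per capita") xtitle("Household rank by income per capita") ///
       legend(order(1 "Income per capita" 2 "Full poverty line" 3 "Extreme poverty line"))
```

The yline(4319 1912) option of twoway would be a shorter alternative to creating the two constant variables.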
5. Using the full poverty line and the consumption per capita variable, calculate the poverty measures P0, P1, P2.
(Note: to sum a variable over all observations, use the command “egen newvar = total(oldvar);”.)
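For reference, P0, P1, and P2 are presumably the Foster-Greer-Thorbecke measures with parameter 0, 1, and 2. Under that assumption, and with notation chosen here for illustration (y_i is per-capita consumption of household i, z is the poverty line, N is the number of households), they can be written as:

```latex
P_\alpha \;=\; \frac{1}{N} \sum_{i=1}^{N}
  \left( \frac{z - y_i}{z} \right)^{\alpha} \mathbf{1}[\, y_i < z \,],
\qquad \alpha = 0, 1, 2 .
```

With alpha = 0 this is the headcount ratio, with alpha = 1 the average normalized poverty gap, and with alpha = 2 the average squared gap. Whether a household exactly at the line counts as poor is a convention; the strict inequality here is one common choice.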
6. Using the extreme poverty line and the consumption per capita variable, again calculate P0, P1, and P2.
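A minimal sketch of steps 5 and 6 follows. The variable name conspc is hypothetical; substitute the per-capita consumption variable listed in ps3vardesc.txt. The sketch uses means from summarize rather than the egen total() approach suggested above; dividing a total by the number of households gives the same numbers.

```stata
* ASSUMPTION: conspc is a placeholder name for per-capita consumption
foreach z in 4319 1912 {
    * Indicator for being below the line (missing consumption stays missing)
    gen poor`z'  = conspc < `z' if !missing(conspc)
    * Normalized gap: (z - y)/z for the poor, 0 for the non-poor
    gen gap`z'   = poor`z' * (`z' - conspc) / `z'
    * Squared normalized gap
    gen gapsq`z' = gap`z'^2
    * The three reported means are P0, P1, and P2 for this line
    summarize poor`z' gap`z' gapsq`z'
}
```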
Footnote 1: In all parts, you should treat each household as one observation. That is, do not try to adjust for the fact that
some households are larger than others. You will thus be calculating poverty statistics for households, using
per-capita consumption within the household as an indicator of the well-being of the household as a whole.
7. Using the full poverty line and the consumption per capita variable, calculate P2 separately for urban and rural
households.
8. Using the full poverty line and the consumption per capita variable, calculate P2 separately for indigenous and
non-indigenous households.
9. Using the full poverty line and the consumption per capita variable, calculate P2 separately for each region.
(Three bonus points for doing this in a “while” loop in Stata, like the one you used in Problem Set 1.)
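Steps 7 to 9 could be approached along the lines below. As before, conspc is a hypothetical name for the consumption variable, and the loop assumes region is stored as a numeric code. The "while" loop used in Problem Set 1 is not shown in this document, so this loop is only one possible shape for it.

```stata
* ASSUMPTION: conspc is a placeholder name; region is assumed numeric
* Squared normalized gap at the full poverty line (0 for the non-poor)
gen gapsq = cond(conspc < 4319, ((4319 - conspc) / 4319)^2, 0) if !missing(conspc)

* P2 by subgroup is the subgroup mean of gapsq
tabstat gapsq, by(urban) statistics(mean n)
tabstat gapsq, by(indig) statistics(mean n)

* P2 by region in a while loop over the distinct region codes
levelsof region, local(regs)
local n : word count `regs'
local i = 1
while `i' <= `n' {
    local r : word `i' of `regs'
    quietly summarize gapsq if region == `r'
    display "region `r': P2 = " r(mean)
    local i = `i' + 1
}
```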
10. Using one of your comparisons from parts 7-9, compute the contribution that each subgroup makes to
overall poverty. Note that if P2 is the poverty measure for the entire population (of households or of individuals),
and P2_j and s_j are the poverty measure and population share of sub-group j of the population, then the
contribution of each sub-group to overall poverty can be written: s_j * P2_j / P2.
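As an illustration of step 10, the sketch below computes the contributions for the urban/rural comparison. It assumes urban is coded 0/1 and again uses the placeholder name conspc; the names gapsq10, P2all, Nall, s_j, and c_j are invented for this example.

```stata
* ASSUMPTIONS: conspc is a placeholder name; urban is assumed coded 0/1
gen gapsq10 = cond(conspc < 4319, ((4319 - conspc) / 4319)^2, 0) if !missing(conspc)

* Overall P2 and number of households with non-missing consumption
quietly summarize gapsq10
scalar P2all = r(mean)
scalar Nall  = r(N)

foreach g in 0 1 {
    quietly summarize gapsq10 if urban == `g'
    scalar s_j = r(N) / Nall               // population share of subgroup
    scalar c_j = s_j * r(mean) / P2all     // contribution s_j * P2_j / P2
    display "urban = `g': share = " s_j ", contribution = " c_j
}
```

Because overall P2 is the share-weighted average of the subgroup values, the contributions should sum to 1 when the subgroups cover every household, which gives a quick check on the calculation.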
11. Summarize your results for parts 4-10 in a paragraph, noting which calculations you find particularly
interesting or important and why.
12. In many cases, detailed consumption or income data is not available, or is available only for a subset of
households, and targeting of anti-poverty programs must rely on poverty indices based on a few easy-to-observe correlates of poverty. Suppose that in addition to the ENCOVI survey, Guatemala has a population
census with data on all households, but suppose also that the census contains no information on per capita
consumption and only contains information on the following variables: urban, indig, spanish, n0_6, n7_24,
n25_59, n60_plus, hhhfemal, hhhage, ed_1_5, ed_6, ed_7_10, ed_11, ed_m11, and dummies for each region.
(In Stata, a convenient command to create dummy variables for each region is “xi i.region;”.) Calculate a
“consumption index” using the ENCOVI by (a) regressing log per-capita consumption on the variables
available in the population census, (b) recovering the predicted values (command: predict), and (c) converting
from log to level using the “exp( )” function in Stata. These predicted values are your consumption index. Note
that an analogous consumption index could be calculated for all households in the population census, using the
coefficient estimates from this regression using the ENCOVI data. Explain how.
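Steps (a) to (c) might be coded roughly as follows. The regressors are the census variables listed above; conspc, lncons, lnconshat, and consindex are hypothetical names. The "xi:" prefix is used here so that the region dummies are created and included in one command, which is an alternative to running “xi i.region” first.

```stata
* ASSUMPTION: conspc is a placeholder name for per-capita consumption
gen lncons = ln(conspc)

* (a) Regress log consumption on the census variables plus region dummies
xi: regress lncons urban indig spanish n0_6 n7_24 n25_59 n60_plus ///
    hhhfemal hhhage ed_1_5 ed_6 ed_7_10 ed_11 ed_m11 i.region

* (b) Recover the predicted values of log consumption
predict lnconshat, xb

* (c) Convert from logs to levels; this is the consumption index
gen consindex = exp(lnconshat)
```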
13. Calculate P2 using your index (using the full poverty line) and compare to the value of P2 you calculated in
question 5.
14. Using the per-capita income variable, calculate the Gini coefficient for households (assuming that each
household enters with equal weight). Some notes: (1) Your bins will be 1/N wide, where N is the number of
households. (2) The value of the Gini coefficient you calculate will not be equal to the actual Gini coefficient for
Guatemala, because of the weighting issue described above. (3) To generate a cumulative sum of a variable in Stata,
use the syntax “gen newvar = sum(oldvar);”. Try it out. (4) If you are interested (although it is not strictly
necessary in this case), to create the difference between the value of a variable in one observation and its value
in the previous observation, use the command “gen xdiff = x - x[_n-1];”. Be careful
about how the data are sorted when you do this.
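One way to put these notes together is sketched below: sort by income, build the Lorenz curve from a cumulative sum, and approximate the area under it with trapezoids of width 1/N. This is a sketch under the equal-weight assumption, not necessarily the method intended in class, and it takes incomes to be non-negative. The names cuminc, lorenz, trap, area, and gini are made up for illustration.

```stata
preserve
keep if !missing(incomepc)

* Order households from poorest to richest
sort incomepc

* Cumulative income and cumulative income share (the Lorenz curve)
gen double cuminc = sum(incomepc)
gen double lorenz = cuminc / cuminc[_N]

* Trapezoid under the Lorenz curve for each bin of width 1/N
gen double trap = (lorenz + cond(_n == 1, 0, lorenz[_n-1])) / (2 * _N)
egen double area = total(trap)

* Gini = 1 - 2 * (area under the Lorenz curve)
scalar gini = 1 - 2 * area[1]
display "Gini coefficient = " gini
restore
```

The preserve and restore pair keeps the sort and the dropped observations from affecting the rest of the do file.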
What to turn in: In your write-up, you should report for each part any calculations you made, as well as written
answers to any questions. Remember that you are welcome to work in groups but you must do your write-up on
your own, and note whom you worked with. You should also attach a print-out of your Stata code.