Returns to College Education Reexamined: Individual Treatment Effects, Selection Bias, and Sorting Gain

Shu-Ling Tsai* (Academia Sinica) and Yu Xie** (University of Michigan)

Population Studies Center Research Report 08-631

* Shu-Ling Tsai, Ph.D., is a Research Fellow at the Institute of Sociology, Academia Sinica, Nankang, Taipei, Taiwan 11529 (e-mail: tsai@gate.sinica.edu.tw; Tel: +886-2-2652-5142; Fax: +886-2-2652-5050).

** Yu Xie, Ph.D., is Otis Dudley Duncan Professor of Sociology and Statistics at the University of Michigan, a Research Professor at the Population Studies Center and the Survey Research Center of the Institute for Social Research, and a Faculty Associate at the Center for Chinese Studies, 426 Thompson, Ann Arbor, MI 48109 (e-mail: yuxie@umich.edu).

Abstract

In reexamining the earnings return to college education, we consider and compare a Mincer-type productivity model and a Heckman-type selection model with essential heterogeneity. We apply the two methodological approaches to an empirical setting in a transitional economy that has recently experienced a rapid expansion in higher education: contemporary Taiwan. Our empirical results reveal substantial individual heterogeneity in the Taiwanese data used. We find not only profound gender differences but also heterogeneity among women. Among women, the downward biases in the Mincer coefficient for both the average treatment effect (ATE) and the effect for the treated (TT) are statistically significant. The results show that women's schooling decisions are based on unobserved gains. Female college attendees would be much worse off if they had not gone to college.

1. Introduction

One of the best-established empirical findings in social science research is that college graduates attain higher earnings in the labor market than do high school graduates (e.g., Morris and Western 1999), irrespective of gender (Mare 1995; McCall 2000; Bobbitt-Zeher 2007) or social context (Glewwe 2002; Gerber and Schaefer 2004). Nevertheless, the underlying cause of the observed relationship between education and earnings has long been a subject of debate. A widely held view in both economics and sociology maintains that schooling causally affects earnings positively, as part of human capital that raises a worker's skill and productivity (Becker 1964; Blau and Duncan 1967; Mincer 1974). Critics, however, contend that the empirically documented relationship between the two is not necessarily causal. They argue that some portion of the schooling-earnings relationship is spurious, but they disagree as to how much of the observed relationship is spurious (see, e.g., Card 1995, 1999). It has been suggested that the typical person who chooses to go to college would have relatively high earnings due to his/her higher unobserved endowments, whether or not s/he actually went to college. This is called "ability bias" or, simply, "selection bias." Further, as Heckman (2001a) convincingly shows, the presence of individual heterogeneity and self-selection gives rise to a sorting gain, another form of selection bias in the standard estimator for the causal effect of schooling with observational data.

In this paper, we borrow language from the causal inference literature to reassess the causal effects, particularly heterogeneous effects, of college education on earnings, using data from recent Taiwanese surveys in the early 2000s for young workers aged 25-34.
Treating college education as a treatment versus high-school and junior-college education as untreated (control), we ask the question: What would the economic outcome be if a given person received the treatment (i.e., attained college education) compared to the case where the person had not received the treatment (i.e., stopped education after high school or junior college)?1 Of course, this counterfactual question is impossible to answer at the individual level, as a person is observed either to have received a college education or not.2 Thus, attempts to answer this causal question empirically always invoke statistical analyses of observational data under some assumptions. Such attempts, called statistical approaches to causal inference, can only be made at an aggregate level (Holland 1986).3 We define the returns to college education as the differences between two sub-populations in the mean of logged earnings, conditional on relevant covariates. We focus on identifying causal parameters with differences in means for specific populations or sub-populations such as the average treatment effect (ATE, i.e., the effect of randomly assigning a person to college), the treatment effect for the treated (TT, i.e., the effect of treatment for college attendees, compared with what they would experience if they had not gone to college), and the treatment effect for the untreated (TUT, i.e., the effect of treatment for those who did not attain college, compared with what they would have experienced had they gone to college).

1 In this study, we focus on the treatment effect of university education (i.e., at least four-year college education) so that our results are comparable with the literature on the college premium.

2 See Sobel (1995), Winship and Morgan (1999), and Morgan and Winship (2007) for the general problem of using observational data to make causal inferences.

3 The counterfactual model of causality has become increasingly popular in sociological empirical studies; see, e.g., Morgan (2001, 2004), Brand and Halaby (2006), Xie and Wu (2005), and Brand and Xie (2007).

As in most studies of the economic returns to education (e.g., Hauser and Xie 2005; McCall 2000; Xie and Hannum 1996), we begin with the classic "Mincer equation" as a point of departure. The "Mincer equation" is a simple linear regression of logged earnings on schooling and a separable quadratic function in work experience. The equation can easily be estimated via ordinary least squares (OLS) regression with observed data. Despite its simplicity, the coefficient of the education variable in the Mincer equation has the desirable property of being easily interpretable as the rate of economic return to schooling. However, the causal interpretation of the Mincer model of education relies on a number of strong assumptions that are unlikely to hold true. For example, it has long been recognized that if some unobserved factor, such as ability, both correlates with schooling choice and affects earnings positively, the OLS estimator of the return to schooling will be upwardly biased (Griliches 1977). Based on notions of comparative advantage, Willis and Rosen (1979) also argue that persons may self-select into college versus non-college educational levels based on their anticipated economic benefits from their educational decisions. More recently, Heckman and his associates challenge – among other things – an underlying assumption of the Mincer model that the causal effects of education are homogeneous (e.g., Heckman, Lochner, and Todd 2006), and they propose methods that help examine "essential heterogeneity" at the individual level (e.g., Heckman and Vytlacil 1999, 2001, 2005).
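The three treatment parameters defined in this section (ATE, TT, and TUT) can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that both potential outcomes are known for six hypothetical persons, which is never the case in real data; all numbers are invented and are not taken from the Taiwanese surveys.

```python
# Hypothetical potential log earnings for six persons (invented numbers).
# y0: outcome without college, y1: outcome with college,
# d: 1 if the person actually attended college, 0 otherwise.
people = [
    {"y0": 10.0, "y1": 10.6, "d": 1},
    {"y0": 10.2, "y1": 10.7, "d": 1},
    {"y0": 10.1, "y1": 10.5, "d": 1},
    {"y0": 9.8, "y1": 10.0, "d": 0},
    {"y0": 9.9, "y1": 10.0, "d": 0},
    {"y0": 9.7, "y1": 10.0, "d": 0},
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def treatment_parameters(persons):
    """Return (ATE, TT, TUT) as means of individual gains y1 - y0."""
    gains = [(p["y1"] - p["y0"], p["d"]) for p in persons]
    ate = mean(g for g, _ in gains)             # everyone
    tt = mean(g for g, d in gains if d == 1)    # college attendees only
    tut = mean(g for g, d in gains if d == 0)   # non-attendees only
    return ate, tt, tut


ate, tt, tut = treatment_parameters(people)
print(f"ATE={ate:.2f} TT={tt:.2f} TUT={tut:.2f}")
# prints: ATE=0.35 TT=0.50 TUT=0.20
```

In this invented example the attendees are the persons with the largest gains, so TT > ATE > TUT. This is the pattern that selection on gains would produce. Under homogeneous effects the three numbers would coincide.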
Thus, Heckman and his associates show that self-selection may arise in two forms: selection based on heterogeneous background characteristics and selection based on heterogeneous effects, the former giving rise to a "selection bias" and the latter to a "sorting gain."4 The concepts of heterogeneity and self-selection of agents (i.e., individuals and their families) are fundamental for studies of the causes and consequences of higher education.

4 The separation of a "selection bias" and a "sorting gain" in the econometrics literature corresponds to that of the two well understood sources of bias in naive estimators of causal effects with observational data: (1) the average difference in outcome between the treated and the control groups even without treatment and (2) the average difference in the effects of the treatment between the two groups (see Winship and Morgan 1999).

In this study, we rely on a methodological approach developed in Heckman, Urzua, and Vytlacil (2006) to explore heterogeneous treatment effects at different levels of unobserved educational selectivity. This approach allows us to estimate the "marginal treatment effect" (MTE) either under assumptions for local instrumental variables (LIV) or under the normality assumption. We can aggregate the MTE to derive, separately, the average treatment effect if everyone goes to college (ATE), if only the treated group goes to college (TT), and if the untreated group counterfactually goes to college (TUT). We also test whether the relationship between the propensity of receiving college education and logged earnings is linear or nonlinear, whether the treated group benefits more from going to college than the untreated group (TT > TUT), whether self-selection is based on unobserved gains, whether the OLS coefficient obtained from a simple version of the Mincer equation is biased, whether the results obtained are driven by the particular empirical strategy utilized, whether selection bias and sorting gain are positive or negative, and, finally, whether these effects and biases are statistically significant.

In the remainder of the paper, we first discuss explanations for the college premium through a selective review of the literature. Then we highlight the setting investigated and present analytical models considered in our framework for causal inference, along with the rationale for using them. After illustrating methods used and giving formal definitions of parameters to be identified and estimated, we report our data source and empirical findings, then conclude with a discussion of our findings.

2. Explanations for Economic Returns to College Education

The earnings premium for college graduates over high school graduates in the labor market is well documented. Why do the college-educated earn more? There are a number of possible explanations. One is a pure productivity story: college education raises an individual's human capital and thus improves his/her skill and productivity, which, in turn, leads to a significant increase in earnings. Another explanation is one of selection: the type of persons who select (or are selected) into college have certain characteristics – both observable and unobservable – that enable them to earn more in the labor market. In this contrarian view, it is the selection in the allocation of educational resources and economic rewards that matters.5 Below, we briefly review the two explanations.
5 The interpretation that stresses the role of selection does not rule out productivity as a partial explanation for the college premium, but it distinctly differs from the productivity story in causal interpretations.

The Productivity Explanation

One prevailing explanation for the positive relation between education and earnings views education as a source of marketable skills. It is commonly believed that the economic value of schooling lies in the human capital it instills in students. In particular, higher education is believed to provide students with skills that are valued and rewarded in the labor market. The thesis of industrialism, for example, suggests that school systems expand in modern society to meet the increased need for a trained labor force, and this expansion, in turn, leads to a more meritocratic allocation of both schooling and economic rewards (e.g., Blau and Duncan 1967; Treiman 1970).6 In industrialized societies, high economic rewards are allocated to those jobs requiring high degrees of skill. Educational credentials represent a type of "capital" – be it human capital or status capital – that allows individuals to "buy" their way into more lucrative jobs (Sewell and Hauser 1975; Grusky 1983; Shavit and Müller 1997).7 Accordingly, individuals invest in higher education, for which the earnings premium represents a justifiable return to a prior investment.

6 Conflict and reproduction theorists have long challenged the common argument that advances in technology and the upgrading of the occupational structure have resulted in a need for higher levels of skill and training, which education is said to supply (e.g., Boudon 1974; Collins 1979). They argue that the critical role played by education in industrialized societies is not to provide training but to preserve the status culture. Educational credentials serve as a signal, and employers select those job candidates whom they believe will fit best into the status culture of the elite. Thus, the schools are used to control membership in economic institutions. As a result, educational systems may expand without either increasing the equality of educational opportunity or fostering the meritocratic allocation of economic rewards.

7 In developing countries, education also provides pathways into the evolving labor market; see, e.g., Morgan and Morgan (1998, 2004) and Buchmann and Hannum (2001).

The productivity explanation is best articulated by Becker's (1964) human capital theory, which offers an economic conceptual apparatus for explaining why earnings inequality is a necessity in an economy where some activities require more costly investments than others. According to this theory, differential wages are assumed to result in large part from differences in the amounts of human capital possessed by workers over their life cycles, as human capital determines productivity. Thus, human capital theory explains earnings disparities between college graduates and high school graduates as attributable to their differences in productive capacity. Becker emphasizes that if one views college education as an investment, then persons decide whether or not to invest in it in the expectation of maximizing a positive return on their investment.

Within the human capital framework, Mincer (1974) develops a "standard" equation to empirically estimate the return to schooling, using OLS regressions with logged earnings as the dependent variable and years of schooling as a primary independent variable, along with a separable quadratic function in work experience. The Mincer equation has been "one of the great success stories of modern labor economics" (Willis 1986: 526). It was not until recently that the Mincer model was seriously criticized, by Heckman, Lochner, and Todd (2006).

The Selection Explanation

Alternatively, the positive association between education and earnings can be attributable to educational selectivity.
Recent studies in sociology show that resources and incentives are two major determinants of college attainment, regardless of gender (Buchmann and DiPrete 2006) and ethnicity (Morgan 2005). The economist's view is that college attendance is the result – at least in part – of optimizing behavior by agents within a certain opportunity structure and based on anticipated returns to schooling. Not only may opportunities vary across individuals, but individuals respond differently to opportunities. Among persons situated in the same opportunity structure, some select (or are selected) into college and others do not, in part reflecting their differential expected returns to college education. When differential schooling choices and outcomes are based on characteristics not observed by the researcher, self-selection arises. One form of selection assumes that persons with higher endowments (such as innate ability) are more likely to attend college, and that they also tend to have higher earnings. This form of selection is often called "selection bias." Another form assumes that persons who benefit the most from college education are most likely to attend college, so that the average student going to college should have higher earnings returns to college than the marginal student who is indifferent between going or not (Heckman and Vytlacil 1999; Card 2001). This form of bias is called "sorting gain." Estimating the marginal return for a latent group at the margin represents an important but difficult research task.

It was Roy (1951) who provided a prototypical model of sorting gain selection for a two-sector choice. The Roy model posits that self-selection due to comparative advantage in skills reduces earnings differences by sector relative to those that would result if workers were randomly assigned to the sectors.
The importance of Roy's work was not widely recognized by economists until the 1970s (Neal and Rosen 2000), and it was not implemented in empirical work until Quandt (1972), Heckman (1974), and Gronau (1974) provided the econometric foundations for estimating switching regressions, selection models, and selectivity bias. Willis and Rosen (1979) extended the Roy model to allow for endogenous skill acquisition through education. Their empirical work revealed that expected lifetime earnings gains influence the decision to attend college: those who did not attend college would have earned less than observably similar persons who did attend, while those who attended college would have earned less as high school graduates than observably similar persons who stopped schooling after high school. Willis and Rosen emphasized this positive sorting as a selection mechanism instead of the common "ability bias." They said that the ability bias might actually be zero or even negative because quitting school early was indicative of good earnings prospects.

While separating those who attend college from those who do not, Willis and Rosen (1979) assumed homogeneous education effects within a sector. The pioneering work by Heckman and Robb (1985) established the importance of heterogeneous treatment effects. Responding to this new emphasis, Björklund and Moffitt (1987) modified the then "standard" selection model by allowing "heterogeneity of rewards" in the model for the effects of education and other economic activities on earnings. They argued that such heterogeneity creates a new form of selection bias, namely, sorting on the gain (slope), which is distinct from sorting on the level (intercept). They also demonstrated how to use a selection model to identify the marginal gain to persons induced into a treatment status by a marginal change in the cost of treatment.
Björklund and Moffitt (1987) thus introduced the parameter of marginal treatment effect into the literature in a parametric context. Later, Imbens and Angrist (1994) showed how to identify a discrete approximation to this parameter as a local average treatment effect (LATE) using the instrumental variable (IV) approach.

The Roy model has been further clarified and extended by Heckman and his associates (e.g., Heckman and Honoré 1990). This body of recent work (e.g., Heckman and Vytlacil 1999, 2000, 2005; Carneiro and Heckman 2002; Carneiro, Hansen, and Heckman 2003; Heckman and Li 2004; Heckman, Urzua, and Vytlacil 2006) extends the "marginal treatment effect" approach to a semiparametric context with a local instrumental variable (LIV), which is essentially the propensity score for treatment consisting of at least some instrumental variables. Heckman (2001a, 2001b) argues that returns to college education should be conceptualized as heterogeneous at the individual level, with an emphasis on unobservables: due to unobserved heterogeneity, observationally identical people make different schooling choices and earn different wages, and hence there should be a wide range of causal effects of college education for different members in a population. Furthermore, Heckman and his associates (cited above) present new methods for modelling the essential heterogeneity in responses to schooling, i.e., persons select into college based on their own idiosyncratic return, conditional on observed characteristics.

To conclude, there are two distinct perspectives in the literature. The productivity story says that in the absence of unmeasured heterogeneity bias, all individuals get roughly the same benefit from college education. A key assumption of this perspective is TT = ATE. A stricter version of this perspective assumes homogeneous treatment effects so that individual-level treatment effects are the same as the average treatment effect for everyone.
In contrast, the Heckman-type selection model assumes not only individual heterogeneity but also that people act on it when making schooling choices. Consequently, persons who receive the treatment are those who get more out of it (TT > ATE). This self-selection assumption implies a positive sorting gain for college attendees as a whole, if the principle of comparative advantage is at work.

In what follows, we apply methodologies recently developed by Heckman, Urzua, and Vytlacil (2006) to recent Taiwanese survey data to ascertain the causal effects of college education on earnings in the Taiwanese context. The methodologies allow us to consider essential heterogeneity at the individual level and to separate out the selection bias component and the sorting gain component. We next highlight the setting investigated.

3. The Taiwanese Context

The Taiwanese educational system has a basic structure of 6-3-3-4 years of schooling, with 9 years of compulsory education. Students completing compulsory education take competitive examinations and are assigned to tracked high schools according to their results. A second hurdle stands between high school and university or college: stringent eligibility examinations. Prior to 1995, the "united college entrance examination" held in summer was the only mechanism for selection into colleges and universities. Those who passed the entrance examination were assigned to specific institutions and departments within these institutions, and those who failed could retake the examination in subsequent years. During the period of 1995-2002, certain departments in some universities were allowed to hold their own matriculation examinations and to recruit preferred students up to a certain proportion of the total intake (from 5 to 30 percent).
Since 2002, almost all institutions of higher education have been granted the freedom to select preferred students, up to a preferred proportion, in spring first – using students' performance in the nationwide "basic academic test" held in winter as a major qualification consideration – and then to recruit the remaining intake in summer through the united college entrance examination. In 2000, 70% of female and 68% of male high-school graduates moved on to the tertiary level (Tsai and Shavit 2007: 147), making Taiwan one of the most highly educated societies in the world.

In Taiwan, educational expansion is an important factor that has rapidly increased the supply of college-educated workers to the labor market. To meet the growing demand for skilled workers generated by industrialization, national manpower planning has been part of the economic development plans implemented by the Taiwanese government since as early as the 1960s. Prior to the lifting of martial law in 1987, higher education was highly centralized and the "low-tuition" policy was enforced with the explicit purpose of reducing class inequality in educational opportunity by lowering the economic barrier to higher education. During the 1990s, the state exercised less and less control over educational policies, and civil society in Taiwan became more influential. To meet the increasing social demand, higher education has expanded rapidly since the 1990s. In 1990, there were 121 institutions of higher education with a total of 576,623 students; by 2004, the number of institutions was 159, serving 1,285,867 students. As a result, Taiwanese higher education was transformed from "elite" to "mass" education. This expansion of higher education coincided with a period of declining gender, class and even some ethnic stratification, but was accompanied by growing inequality between categories of parental education (Tsai and Shavit 2007).

A note of caution should be made here.
In Taiwan, higher education refers to education provided by junior colleges, colleges, universities, and graduate schools. Junior colleges constitute the lowest tier of education at the tertiary level.8 In this study, we limit our focus to earnings disparities between college attendees9 and those whose highest education level is high school or junior college. Clearly, this is an oversimplification. Yet, by treating college education as a simple dichotomous treatment, this simplification allows us to borrow from the literature on causal inference and to focus on the main points of the paper.

8 There are three types of junior colleges in Taiwan: five-year junior colleges that admit graduates from junior-high schools; two-year junior colleges that admit graduates of vocational high schools; and three-year junior colleges that admit graduates of both academic and vocational high schools (Tsai and Shavit 2007: 144).

9 Due to data limitations, we do not distinguish between college attendance and college completion, i.e., we assume that all college attendees complete college. This assumption is largely true in Taiwan, with a drop-out rate of 9.1% (temporary: 4.8%; permanent: 4.3%) at the tertiary level of education; see http://www.edu.tw/EDU_WEB/EDU_MGT/STATISTICS/EDU7220001/service/sts4-2.htm.

4. Analytical Models and Rationale

A Conventional Mincer-type Model

To test who benefits more from going to college, we start our empirical analysis by estimating a conventional Mincer-type model that treats the return as invariant, although we can also reinterpret it as a weighted average of heterogeneous treatment effects (Angrist and Krueger 1999). The earnings equation takes the form

$$Y_i = \beta D_i + \gamma X_i + U_i \qquad (1)$$

where Yi is earnings in logarithmic form; i (= 1, …, n) is the subscript for person i; Di is a dummy variable representing whether or not the person is a college attendee (Di = 1 if yes; Di = 0 if no); β is the return to college attendance, after controlling for the effects of Xi, a vector of other earnings determinants including constant, gender, Mincer experience, and Mincer experience squared; γ is a vector of coefficients; Ui is the disturbance component of log earnings, which includes such unobserved factors as ability, effort, and market luck. It is hardly a new hypothesis but an empirical regularity established around the world that β > 0; see, e.g., Psacharopoulos (1985) for international comparisons.

The real question of social science interest is this: Does the magnitude of β estimated in equation (1) accurately reflect the causal effect of college education on earnings? There can be two potential sources of bias: (1) Di is correlated with Ui (i.e., if high-ability people choose to go to college, then there is the problem of ability bias); and (2) if the return varies by individual, β is correlated with Di (i.e., schooling decisions are made on the basis of the expected gain β, resulting in a sorting gain). See Heckman, Lochner, and Todd (2006) for details.

A long-nagging concern in the literature is that high-ability people would go to college and the same group would attain higher earnings even if they had not received college education. In such a case, the schooling-earnings connection may be a mirage; it is just a reflection of the fact that high-ability people are rewarded with an earnings premium for their (unobservable) innate skills in the labor market. Another possibility is that high ability is associated with advantaged family background. The result is that the Mincer coefficient may be biased due to the omission of ability or family background. In this study, we are limited by lack of an ability measure.
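The first source of bias in equation (1), correlation between Di and Ui, can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' analysis: the return to college is set to a homogeneous 0.3, ability is treated as if it were observable (it is not in the Taiwanese data), the experience terms of the Mincer equation are left out for brevity, and every coefficient is invented. The "long" regression is computed by partialling ability out of both variables, which gives the same coefficient on D as a regression of Y on D and ability together.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x with an intercept."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def ability_bias_demo(n=20000, true_beta=0.3, seed=0):
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical selection rule: high-ability persons attend more often.
    d = [1.0 if 0.8 * a + rng.gauss(0.0, 1.0) > 0 else 0.0 for a in ability]
    # Log earnings with a homogeneous return and a direct payoff to ability.
    y = [10.0 + true_beta * di + 0.2 * a + rng.gauss(0.0, 0.3)
         for di, a in zip(d, ability)]
    beta_short = slope(d, y)  # ability omitted, as in equation (1)
    # Ability controlled: partial it out of D and Y, then regress.
    beta_long = slope(residuals(ability, d), residuals(ability, y))
    return beta_short, beta_long


beta_short, beta_long = ability_bias_demo()
print(f"ability omitted: {beta_short:.3f}; ability controlled: {beta_long:.3f}")
```

With these invented parameters the coefficient that omits ability overstates the true return of 0.3, while the coefficient that controls for ability comes out close to it. This fix works here only because the simulated return is homogeneous; it does not address the second source of bias, sorting on heterogeneous gains.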
We can, however, retain the assumption that ability is unobservable and test whether, after the inclusion of the background variable in the model, there is still evidence of self-selection.10

10 The self-selection problem is intimately tied to the ability bias problem, but more serious and more complicated. Whereas conventional conceptualization of the ability bias is concerned with heterogeneity in an individual's ability to earn money in the labor market without additional education (i.e., a pre-treatment predictor), self-selection is caused by individual heterogeneity in returns to education that is unknown to the researcher but may be partially known to the agent. Thus, self-selection cannot be addressed by measures of ability as a pre-treatment covariate in empirical studies (Carneiro and Heckman 2002).

A Heckman-type Selection Model

Sociological theories of educational attainment based on rational choice often posit that agents choose among the different educational options available to them on the basis of their evaluations of costs and benefits and their perceived probabilities of more or less successful outcomes (e.g., Breen and Goldthorpe 1997; Boudon 1974; Buchmann and DiPrete 2006; Morgan 2005; Raftery and Hout 1993). This approach implies that agents would "rationally" choose college graduation over high school graduation when economic benefits can be anticipated from this choice. Nevertheless, the anticipated gain to a particular person from going to college is unknown. In this study, the latent gain is assumed to be determined by the individual's observed and unobserved characteristics.

We assume two mechanisms involved in the determination of college attendance: observed social selection (i.e., inequality of educational opportunities due to ascribed characteristics), and unobserved individual heterogeneity, such as high or low ability (i.e., a general concept encompassing such factors as intelligence, aspiration, and effort). We know that the family plays an important role in shaping educational opportunities and educational expectations of its members. A key unverifiable assumption for our study is the assumption of exclusion restrictions: some observed background variables can serve as instrumental variables (IVs) in order to identify the effects of the unobserved component.

We then employ a Heckman-type two-stage selection model that consists of two equations: an (observed) earnings outcome equation and a treatment selection equation. Note that equation (1) is now revised in the random coefficient form to allow for self-selection based on individual idiosyncratic returns. The two equations in the revised model can be expressed as

$$Y_i = \beta_i D_i + \gamma X_i + U_i \qquad (2)$$

$$\mathrm{Prob}(D_i = 1) = F(Z_i \delta), \qquad (3)$$

where βi represents the heterogeneous return to college attendance, which varies among persons; Di is an endogenous dummy variable denoting whether or not person i is a college attendee; Zi is a vector of observed exogenous covariates like gender, ethnicity, family background, and birth cohort; δ is a vector of coefficients; F is a function that transforms (Ziδ) into a probability; other notations remain the same. Note that the following decision rule is used to predict the binary selection into college:

$$D_i = 1 \ \text{if } D_i^* > 0; \quad D_i = 0 \ \text{otherwise}, \qquad D_i^* = P_i(Z_i) - U_{Di}, \qquad (4)$$

where D*i is an unobserved latent variable indicating the net gain to person i from receiving college education; Pi(Zi) is the person's "propensity score" of receiving college education, which is a linear function of Z passed through the transformation function F; UDi is the unobserved individual heterogeneity in the selection equation. Within this framework, Pi(Zi) and UDi in the schooling choice equation (4) may be interpreted as observed and unobserved costs of education, respectively (Carneiro and Heckman 2002).
The higher the propensity score Pi(Zi), the more advantaged the family background; thus the lower the observed costs of education, and the larger the person's educational opportunity. By contrast, the larger the unobserved individual heterogeneity UDi, the larger the unobserved costs of education, and the less likely it is that the person will receive college education. If Pi(Zi) = UDi, then person i is assumed to be indifferent between going to college or not.

We use this schooling decision rule to break the earnings equation (2) into two switching equations representing the two potential outcomes (Y0i, Y1i) for each person i:

$$Y_{0i} = \gamma_0 X_i + U_{0i} \quad \text{if } D_i = 0 \qquad (5a)$$

$$Y_{1i} = \gamma_1 X_i + U_{1i} \quad \text{if } D_i = 1 \qquad (5b)$$

where E(U0i | Xi) = 0 and E(U1i | Xi) = 0 in the population. At the individual level, the treatment effect is Δi = Y1i – Y0i = (γ1 – γ0)Xi + (U1i – U0i) = βi, which is the causal effect of college education. Although a person may experience two potential outcomes, in practice only one of them is realized (or can be observed) for any person. Therefore, the two equations (5a and 5b) can be combined in a single-equation form with a switching weight:

$$Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}, \quad \text{or} \quad Y_i = Y_{0i} + D_i (Y_{1i} - Y_{0i}). \qquad (6)$$

Using equations (5a), (5b) and (6), we can rewrite equation (2) as

$$Y_i = \beta_i D_i + \gamma_0 X_i + U_{0i}, \quad \text{where} \quad \beta_i = (\gamma_1 - \gamma_0) X_i + (U_{1i} - U_{0i}). \qquad (7)$$

Generally speaking, individual heterogeneity either in the observed term (γ1 – γ0)Xi or in the unobserved term (U1i – U0i) gives rise to heterogeneity in βi in the population even after controlling for X. Thus, the return to college education (conditional on X) is a random variable with a distribution, and there may be sorting on the gain (see, e.g., Heckman 2001a; Heckman, Urzua, and Vytlacil 2006). In the presence of heterogeneity and self-selection of agents, the use of conventional methods (such as OLS or IV) fails to identify the treatment effects of concern.
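To see how the decision rule (4) and the switching equations (5a), (5b), and (6) interact, it may help to simulate them. The sketch below is illustrative only and rests on several assumptions that are not in the paper: X is reduced to a constant, F is taken to be logistic, UDi is built as the normal CDF of a latent term v so that it lies between 0 and 1, and every numerical coefficient is invented. It then splits the naive treated-versus-untreated contrast in mean log earnings into ATE, selection bias, and sorting gain.

```python
import math
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def simulate(n=50000, seed=1):
    """Draw (d, y0, y1, y) for n hypothetical persons."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                 # hypothetical background index Z
        p = 1.0 / (1.0 + math.exp(-0.8 * z))    # P(Z) = F(Z*delta), logistic F assumed
        v = rng.gauss(0.0, 1.0)                 # latent unobserved cost of schooling
        u_d = 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))  # U_D in (0, 1)
        d = 1 if p - u_d > 0 else 0             # decision rule, equation (4)
        u0 = -0.10 * v + rng.gauss(0.0, 0.3)    # low-cost persons earn more anyway
        u1 = -0.30 * v + rng.gauss(0.0, 0.3)    # and also gain more from college
        y0 = 10.0 + u0                          # equation (5a), X is a constant
        y1 = 10.4 + u1                          # equation (5b)
        y = y0 + d * (y1 - y0)                  # observed outcome, equation (6)
        rows.append((d, y0, y1, y))
    return rows


def decompose(rows):
    """Return (naive contrast, ATE, selection bias, sorting gain)."""
    treated = [r for r in rows if r[0] == 1]
    untreated = [r for r in rows if r[0] == 0]
    naive = mean(r[3] for r in treated) - mean(r[3] for r in untreated)
    ate = mean(r[2] - r[1] for r in rows)
    tt = mean(r[2] - r[1] for r in treated)
    selection_bias = mean(r[1] for r in treated) - mean(r[1] for r in untreated)
    sorting_gain = tt - ate
    return naive, ate, selection_bias, sorting_gain


naive, ate, selection_bias, sorting_gain = decompose(simulate())
print(f"naive={naive:.3f} ATE={ate:.3f} "
      f"selection bias={selection_bias:.3f} sorting gain={sorting_gain:.3f}")
```

By construction the naive contrast equals ATE plus selection bias plus sorting gain in the simulated sample. Because the invented coefficients on v make low-cost persons both better paid without college and larger gainers from it, both bias terms come out positive here. Other sign patterns are possible with other parameter choices.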
We next illustrate the methods used in this study.

5. Methods

The fundamental reason why a simple Mincer model may yield a biased estimate is that college education is not randomly assigned, so the treated group (i.e., college attendees) and the untreated group (i.e., high-school or junior-college graduates) may systematically differ in important factors other than observed pre-treatment covariates. These differences may exhibit complex correlations with the outcome variable or the coefficient representing returns to college education, making it difficult to ascertain the average causal effect of the treatment. To resolve this problem, several approaches have been adopted in the literature.11 See Morgan and Winship (2007) for a variety of methods used for counterfactuals and causal inference. Matching models have become popular in the sociological literature.12 If the average differences between treated and untreated groups can be fully captured by observed pre-treatment covariates, we can apply matching methods and use the conditional probability of receiving the treatment given a set of observed pretreatment variables (namely, the propensity score13) to account for such differences. Nevertheless, unobservables play a part in this study. Hence, departing from the previous sociological literature, we borrow the methodological approach (and software14) developed by Heckman, Urzua, and Vytlacil (2006) to gauge heterogeneous treatment effects in the presence of self-selection and essential heterogeneity via both the semiparametric approach of local instrumental variables (LIV) and the parametric approach. Our empirical work thus involves two stages, the second of which consists of several steps.

In the first stage, we follow the literature of educational stratification (e.g., Mare 1980; Shavit and Blossfeld 1993; Hauser and Andrew 2006; Shavit, Arum, and Gamoran 2007), using ascriptive characteristics Z (e.g., gender, ethnicity, birth cohort, parental education, place where the respondent grew up prior to age 15, and some interaction terms) to predict a person's probability of attaining college/university education, as opposed to high-school/junior-college education, and thereby obtain the estimated propensity score. We treat this set of variables (Z) as instrumental variables (IV), which are assumed to affect earnings only indirectly by affecting college attendance. To the extent that this IV assumption is not true in practice, our results would be subject to alternative explanations. However, under the provisional IV assumption, this approach allows us to examine heterogeneous treatment effects at different levels of unobserved selectivity. We condition our analyses on gender, the X variable in previous equations.

11 For example, Blundell, Dearden, and Sianesi (2005) employ regression, matching, control function and instrumental variables methods to recover the effect of education on earnings, using British data.

12 See DiPrete and Engelhardt (2004), DiPrete and Gangl (2004), Morgan and Harding (2006), and Morgan and Winship (2007) for recent advances in matching models and the practical limitations of matching estimators of causal effects.

13 Rosenbaum and Rubin (1983) establish the central role of the propensity score in matching models, whereas Heckman (1980) and Heckman and Robb (1985, 1986) establish the central role of the propensity score in selection models. The propensity score also plays a central role in instrumental variable estimation of treatment effects, even when unobserved selection bias and sorting effect are present (Heckman and Vytlacil 1999).
The propensity score P(Z) then serves as a local instrumental variable (LIV) for marginal treatment effects in the second stage of the analysis, in which the nonlinear relationship between the propensity score and logged earnings can be written as

$$E(Y \mid X = x, P(Z) = p) = \gamma_0 x + (\gamma_1 - \gamma_0)\, x\, p + K(p), \tag{8}$$

where p is a particular evaluation value of the propensity score and

$$K(p) = E(U_1 - U_0 \mid D = 1, P(Z) = p)\, p. \tag{9}$$

The marginal treatment effect (MTE), defined as the average effect of treatment given the unobserved characteristics in the decision rule of schooling choice, plays a fundamental role in the identification and estimation of treatment effects of concern. In order to compute the MTE, we need to estimate values for $\gamma_0$ and $(\gamma_1 - \gamma_0)$ in equation (8), using a nonparametric version of the double residual regression procedure. We also test for a parametric model of essential heterogeneity under the assumption of joint normality for the error terms ($U_0$, $U_1$). Note that the MTE has two interpretations. First, the MTE defined as $E(\Delta \mid X, U_D)$ is the expected effect of treatment conditional on observed characteristics X and conditional on $U_D$, the unobservables from the first-stage decision rule. That is:

$$MTE(X_i = x, U_{Di} = u_D) = E(\Delta_i \mid X_i = x, U_{Di} = u_D) = (\gamma_1 - \gamma_0)\, x + E(U_{1i} - U_{0i} \mid U_{Di} = u_D). \tag{10}$$

In such a case, the parameter of local average treatment effect (LATE) is a version of MTE. LATE, defined as $E[(Y_1 - Y_0) \mid D(z) - D(z') = 1]$, is the average treatment effect for individuals whose treatment status is influenced by changing an exogenous regressor included in the treatment equation. Alternatively, we can also interpret the MTE as the mean gain in terms of $\Delta$ (= $Y_1 - Y_0$) for persons with observed characteristics X who would be indifferent between treatment or not if they were exogenously assigned a value of Z, say z, such that $U_D(z) = u_D$.

14 See http://jenni.uchicago.edu/underiv/.
Heckman and Vytlacil (1999, 2005) show that the MTE can be identified by taking derivatives of E(Y | Z = z) with respect to P(z). This method defines the local instrumental variable (LIV) estimator.

$$MTE(X_i = x, U_{Di} = P_i = p) = LIV(X_i = x, P_i = p) = \frac{\partial E(Y_i \mid X_i = x, P_i = p)}{\partial p}, \tag{11}$$

$$\Delta^{LIV}(x, u_D) = \left.\frac{\partial E(Y \mid X = x, P(Z) = p)}{\partial p}\right|_{p = u_D} = \Delta^{MTE}(x, u_D). \tag{12}$$

From this, we observe that the estimation of MTE involves the partial derivative of the expectation of the outcome Y (conditional on X = x and P(Z) = p) with respect to p. This is the method of local instrumental variables introduced in Heckman and Vytlacil (2001). For the model of essential heterogeneity, Heckman, Urzua, and Vytlacil (2006) consider a linear and separable version of the form:

$$\left.\frac{\partial E(Y \mid X = x, P(Z) = p)}{\partial p}\right|_{p = u_D} = (\gamma_1 - \gamma_0)\, x + \left.\frac{\partial K(p)}{\partial p}\right|_{p = u_D}, \tag{13}$$

and it requires the utilization of nonparametric techniques to estimate the last term, $\partial K(p) / \partial p$. That is to say, within their semiparametric approach the LIV estimator of the MTE is ultimately computed as

$$\widehat{\Delta}^{LIV}(x, u_D) = \widehat{(\gamma_1 - \gamma_0)}\, x + \left.\widehat{\frac{\partial K(p)}{\partial p}}\right|_{p = u_D} = \widehat{MTE}(x, u_D), \tag{14}$$

and is evaluated over the set of p's contained in P. All treatment parameters of concern can be identified by using weighted averages of the MTE. Heckman, Urzua, and Vytlacil (2006: 396) show that

$$\Delta^{ATE}(x) = E(Y_1 - Y_0 \mid X = x) = \int_0^1 \Delta^{MTE}(x, u_D)\, du_D,$$

$$\Delta^{TT}(x) = E(Y_1 - Y_0 \mid X = x, D = 1) = \int_0^1 \Delta^{MTE}(x, u_D)\, w_{TT}(x, u_D)\, du_D,$$

$$\Delta^{TUT}(x) = E(Y_1 - Y_0 \mid X = x, D = 0) = \int_0^1 \Delta^{MTE}(x, u_D)\, w_{TUT}(x, u_D)\, du_D,$$

where the weights are

$$w_{ATE}(x, u_D) = 1,$$

$$w_{TT}(x, u_D) = \left[\int_{u_D}^{1} f(p \mid X = x)\, dp\right] \frac{1}{E(P \mid X = x)},$$

$$w_{TUT}(x, u_D) = \left[\int_{0}^{u_D} f(p \mid X = x)\, dp\right] \frac{1}{E(1 - P \mid X = x)},$$

where f is a density function.
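To make the weighting scheme concrete, the following is a minimal sketch of how ATE, TT, and TUT could be approximated as weighted averages of an MTE curve on a grid of $u_D$ values. It is not the software used in this study. The function names, the linear MTE curve, and the six propensity scores are hypothetical and chosen only for illustration. The sketch replaces the density integral in the TT weight with the empirical share of propensity scores above $u_D$, and uses the complementary share for TUT.

```python
def treatment_parameters(mte, pscores, n_grid=1000):
    """Approximate ATE, TT and TUT as weighted averages of MTE(u_D).

    mte     -- function mapping u_D in (0, 1) to a marginal treatment effect
    pscores -- propensity scores P(Z) for one value of X (hypothetical here)
    """
    n = len(pscores)
    mean_p = sum(pscores) / n                    # sample analogue of E(P | X = x)
    ate = tt = tut = 0.0
    for k in range(n_grid):
        u = (k + 0.5) / n_grid                   # midpoint of the k-th grid cell
        # Empirical analogue of the integral of f(p | X = x) from u to 1.
        share_above = sum(1 for p in pscores if p > u) / n
        w_tt = share_above / mean_p
        w_tut = (1.0 - share_above) / (1.0 - mean_p)
        effect = mte(u)
        ate += effect / n_grid                   # ATE weight is 1 everywhere
        tt += effect * w_tt / n_grid
        tut += effect * w_tut / n_grid
    return ate, tt, tut


def declining_mte(u):
    # Hypothetical MTE curve that falls as unobserved costs u_D rise.
    return 0.6 - 0.4 * u


# Hypothetical propensity scores; not taken from the TSCS data.
pscores = [0.1, 0.2, 0.3, 0.3, 0.5, 0.8]
ate, tt, tut = treatment_parameters(declining_mte, pscores)
mean_p = sum(pscores) / len(pscores)
print(ate, tt, tut)
```

With a declining MTE curve the sketch returns TT above ATE and TUT below it, because the TT weight is largest at low $u_D$ and the TUT weight is largest at high $u_D$. The two weights also satisfy $E(P)\,w_{TT} + E(1 - P)\,w_{TUT} = 1$ at every grid point, so ATE is the corresponding mixture of TT and TUT.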
We finally decompose the conventional bias (i.e., the difference in magnitude between OLS and ATE estimators) into two components: the selection bias (i.e., the mean bias of selection in the absence of treatment) and the sorting gain (i.e., the mean difference in the return to college education between persons who went to college and persons who did not). Note that

sorting gain + selection bias = $E(U_1 - U_0 \mid X, D = 1) + [E(U_0 \mid X, D = 1) - E(U_0 \mid X, D = 0)] = E(U_1 \mid X, D = 1) - E(U_0 \mid X, D = 0)$ = bias arising in the OLS estimate.

6. Empirical Results

6.1. Data Source and Variables

This analysis is based on data from the Taiwan Social Change Survey (TSCS), which is a series of island-wide surveys conducted by the survey office at Academia Sinica. TSCS is an ongoing project designed to create data sets on the main themes of Taiwan's changing society; see http://www.ios.sinica.edu.tw/sc1/ for details of the surveys. For this analysis we use the TSCS data collected during the period from 2001 to 2003.15 We limit our focus to young entrants (aged 25-34 when surveyed) to the labor market who attained at least 12 years of schooling (i.e., high school or higher). Our analysis is thus based on the information ascertained from 1,484 respondents (808 males and 676 females) born between 1967 and 1978, who provided complete information on earnings, education, parental education, gender, and birth year. Table 1 presents measurements and descriptive statistics of most of the variables used in the analysis. The share of college attendees in the sample is about the same for men (28.2%) and for women (29.0%). For both men and women, the average earnings of the treated group are significantly higher than those of the untreated group. Although in both groups men's earnings are significantly higher than those of women, the gender gap is much smaller among college attendees (NT$5,861) than among those who did not make it to college (NT$11,736).
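The decomposition can be checked numerically in a toy setting where both potential outcomes are known. The sketch below is an illustration only: the data-generating process, the parameter values, and the function names are hypothetical and unrelated to the TSCS data. It omits covariates X, so the naive treated-untreated difference in mean outcomes plays the role of the OLS coefficient on the college dummy.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def simulate(n=20000, seed=0):
    """Draw (Y0, Y1, D) from a toy model with selection on gains."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u0 = rng.gauss(0.0, 1.0)
        u1 = 0.5 * u0 + rng.gauss(0.0, 1.0)   # hypothetical error structure
        y0 = 10.0 + u0                        # log earnings without college
        y1 = 10.4 + u1                        # log earnings with college
        cost = 0.5 + rng.gauss(0.0, 1.0)      # hypothetical cost of attending
        d = 1 if (y1 - y0) > cost else 0      # attend when gain exceeds cost
        rows.append((y0, y1, d))
    return rows


def decompose(rows):
    treated = [(y0, y1) for y0, y1, d in rows if d == 1]
    untreated = [(y0, y1) for y0, y1, d in rows if d == 0]
    # Naive contrast: observed Y1 of attendees minus observed Y0 of the rest.
    naive = mean(y1 for _, y1 in treated) - mean(y0 for y0, _ in untreated)
    ate = mean(y1 - y0 for y0, y1, _ in rows)
    tt = mean(y1 - y0 for y0, y1 in treated)
    selection_bias = (mean(y0 for y0, _ in treated)
                      - mean(y0 for y0, _ in untreated))
    sorting_gain = tt - ate
    return {"naive": naive, "ate": ate, "tt": tt,
            "selection_bias": selection_bias, "sorting_gain": sorting_gain}


result = decompose(simulate())
print(result)
```

In any such sample the gap between the naive contrast and ATE equals the selection bias plus the sorting gain, up to floating-point error. With real data the untreated outcome of attendees is never observed, which is why the paper relies on the estimated MTE rather than on a direct calculation of this kind.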
We observe that irrespective of gender, college attendees are more likely to come from better-educated families, with both father's and mother's average years of schooling significantly higher than those in the untreated group. Parental education is used as an important indicator of family socioeconomic background.16

(Table 1 about here)

15 TSCS uses two different types of questionnaire each year. We use data derived from surveys 2001(I), 2002(I), 2002(II), 2003(I), and 2003(II), as they provide information useful for this analysis.

16 Due to data limitations, unfortunately, we are unable to consider father's occupation or family income in this analysis.

6.2. Results of OLS Regressions Predicting Logged Earnings

We first gauge the return to college attendance, using a simple version of the "standard" Mincer earnings equation. This model includes only gender, years of Mincer experience, and Mincer experience-squared as explanatory variables, in addition to a dummy variable indicating whether or not the respondent is a college attendee. Table 2 presents the OLS coefficients estimated for men and women, separately. All the coefficients reported in the table are statistically significant at the level of α = .05.

(Table 2 about here)

The OLS estimate of the mean return to college attendance is 31.1% in the log form for Taiwan's young male entrants to the labor market in the early 2000s. This OLS coefficient implies that the average rate of return to schooling is about 8.2% annually in the male sample, as the treated group stayed in school longer than did the untreated group (men: 3.8 years; women: 3.4 years). The estimated return to college education for females (43.2%, about 12.9% annually) is significantly higher than that for males. We thus find that college credentials are more important for women than for men in the pursuit of labor market achievements.
This finding is consistent with the literature (e.g., Gerber and Schaefer 2004; Mare 1995; Xie and Hannum 1996).

6.3. Propensity Score Estimation

We then consider both causes and consequences of college education. We start by estimating the probability of receiving college education for every observation in the analysis sample, using a probit model. The model and coefficient estimates for men and women, separately, are presented in Table 3. The last column of the table gives the mean marginal effect for each explanatory variable Z. The marginal effects derived from the probit model, Pr(D = 1 | Z) = Φ(δ'Z), are of the form

$$\text{Marginal effects} = \frac{\partial \Pr(D = 1 \mid Z)}{\partial Z} = \phi(\delta' Z)\, \delta,$$

where δ is the vector of coefficients estimated in the probit model, Φ(·) is the standard normal cumulative distribution and φ(·) is the standard normal density.

(Table 3 about here)

Figure 1 depicts the density function for the estimated propensity score of college attendance by gender and for the treated group and the untreated group, respectively. It is the support of P(Z), albeit not full, that helps identify a variety of treatment effects of concern. The larger the support of P(Z) conditional on X, the bigger the set of LIV and LATE parameters that can be identified. In this part of the analysis, we lose a few cases due to lack of fully common support over the range [0, 1]. In the situation of limited common support, we can still use LIV or LATE to identify ATE and TT and construct bounds on ATE and TT; see Heckman and Vytlacil (1999) for details. In short, the following analysis pertains to 770 males and 633 females, whose estimated propensity scores are bounded within the ranges [0.08, 0.89] and [0.09, 0.86], respectively.

(Figure 1 about here)

6.4. Tests on Essential Heterogeneity Using Semiparametric (LIV) Approach

We first test the linearity of the conditional expectation of Y in terms of P(Z), i.e., E(Y | X = x, P(Z) = p).
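The probit marginal-effect formula in Section 6.3 can be illustrated with a small sketch. The coefficient vector, the covariate values, and the function names below are hypothetical and are not the estimates in Table 3. The sketch also assumes that a "mean marginal effect" is the per-observation effect averaged over the sample; the paper does not state whether effects were averaged in this way or evaluated at covariate means.

```python
import math


def norm_cdf(t):
    """Standard normal cumulative distribution, Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def norm_pdf(t):
    """Standard normal density, phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def probit_prob(delta, z):
    """Pr(D = 1 | Z = z) = Phi(delta'z)."""
    return norm_cdf(sum(d * x for d, x in zip(delta, z)))


def marginal_effects(delta, z):
    """phi(delta'z) * delta, one entry per element of z."""
    index = sum(d * x for d, x in zip(delta, z))
    return [norm_pdf(index) * d for d in delta]


def mean_marginal_effects(delta, sample):
    """Average the per-observation marginal effects over the sample."""
    per_obs = [marginal_effects(delta, z) for z in sample]
    return [sum(col) / len(sample) for col in zip(*per_obs)]


# Hypothetical coefficients: intercept and father's years of schooling.
delta = [-1.0, 0.08]
# Hypothetical observations: a constant of 1 and father's schooling in years.
sample = [[1.0, 6.0], [1.0, 9.0], [1.0, 12.0], [1.0, 16.0]]
mean_me = mean_marginal_effects(delta, sample)
print(mean_me[1])  # effect of one more year of father's schooling
```

The entry for the constant term has no substantive interpretation. For dummy regressors, a discrete change in the predicted probability is often reported instead of the derivative.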
A test of whether the relationship between the propensity score and logged earnings is linear or nonlinear is a test of whether the conventional model or the model of essential heterogeneity is consistent with the data (Heckman, Urzua, and Vytlacil 2006: 397). Our test results indicate that our Taiwanese data do not support the linearity assumption. Polynomial regressions that respectively fit the male and female data well are

Male: E(Y | X = x, P(Z) = p) = (10.5)* – (.70)*p + (10.6)*p^2 – (50.2)*p^3 + (106)*p^4 – (99.6)*p^5 + (32.9)*p^6 (R^2 = .999);

Female: E(Y | X = x, P(Z) = p) = (10.2)* + (.72)*p – (.118)*p^2 (R^2 = 1.0).

Accordingly, we find that for both genders, the Heckman-type selection model of essential heterogeneity fits our data better than does the conventional Mincer-type model. We therefore employ the Heckman-Urzua-Vytlacil semiparametric approach of estimation via the method of local instrumental variables (LIV), using the probability of receiving college education as the instrument. The following results are obtained from local linear regressions with a Gaussian kernel and cross-validation optimal bandwidth, through a nonparametric version of the double residual regression procedure.

6.5. Gender-Specific Marginal Treatment Effects

The OLS results imply that going to college is an important channel for women to achieve in the labor market and to overcome the disadvantages associated with their gender. Given this interesting finding on gender, we proceed to estimate the gender-specific marginal treatment effects. Figure 2 plots the estimated marginal treatment effect as a function of unobserved heterogeneity $U_D$ in the schooling choice equation, along with its 95% confidence intervals.

(Figure 2 about here)

Inspection of Figure 2 reveals that the confidence intervals are rather wide at the two ends (in particular, at the right tail).
Besides, the shape of the MTE differs between men and women, as the depicted line for men is non-monotone, and the one for women appears linear.17 In fact, the relationship between the MTE and $U_D$ is nonlinear for both genders, as shown below:

Male: MTE = (1.4)* – (13.9)*$U_D$ + (42.5)*$U_D^2$ – (36.2)*$U_D^3$ (R^2 = .973);

Female: MTE = (.75)* – (.39)*$U_D$ + (.25)*$U_D^2$ – (.11)*$U_D^3$ (R^2 = 1.0).

That is to say, overall, the MTE declines with increasing unobserved component $U_D$ in the schooling choice equation, for both men and women. Recall that within this framework, the higher the unobserved individual heterogeneity $U_D$, the higher the unobserved costs of attending college, and thus the lower the probability of attending college. Accordingly, the declining pattern of MTE with $U_D$ means that those who have the highest probability of going to college (i.e., those who are most advantaged in social selection) have the largest marginal returns. By contrast, those who have the lowest probability of going to college (i.e., those who are disadvantaged in educational attainment due to ascribed characteristics Z) have the lowest marginal returns. The declining pattern of MTE in $U_D$ not only confirms heterogeneity in the return to college education for Taiwan but suggests that the average college attendee earns more than the marginal participant in Taiwanese higher education. Figure 3 depicts the estimated weights used to gauge the treatment parameters ATE, TT, and TUT, whose estimates will be discussed later. As shown in the figure, the shapes of the weights for men and women are similar to each other. ATE weights MTE evenly, while TT overweights the MTE for persons with low values of $U_D$ (who are more likely to attend college), and TUT overweights the MTE for persons with high values of $U_D$ (who are less likely to attend college).

17 The line plotted in the figure has a few blanks, because the MTE cannot be estimated at points where the support of P(Z) is short.
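As a rough check on how the even ATE weighting works, the fitted cubic for women reported above can be integrated over the unit interval. This is a back-of-the-envelope calculation under strong simplifications: it uses the rounded coefficients of the fitted curve and treats the curve as valid on the whole of [0, 1], whereas the estimates in the paper are computed over the limited support of P(Z).

```latex
\begin{align*}
\int_0^1 \bigl(0.75 - 0.39\,u_D + 0.25\,u_D^2 - 0.11\,u_D^3\bigr)\,du_D
  &= 0.75 - \frac{0.39}{2} + \frac{0.25}{3} - \frac{0.11}{4} \\
  &\approx 0.75 - 0.195 + 0.083 - 0.028 \\
  &\approx 0.61 .
\end{align*}
```

The result is close to the semiparametric ATE of 62.1% reported for women in Section 6.7. The same exercise is not informative for men, whose fitted cubic has large offsetting coefficients and is therefore sensitive to rounding and to the range of integration.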
In light of the shape of MTE and the shape of the weights, TT > ATE > TUT. In such a case, the homogeneity assumption does not fit our data, and neither does the conventional approach assuming a homogeneous treatment effect (OLS or IV).

(Figure 3 about here)

The analysis up to this point was carried out by using a semiparametric-LIV approach, and the findings are in support of the essential heterogeneity model, irrespective of gender. To test whether the results are driven by the particular empirical strategy utilized (e.g., kernel function and bandwidth used), we next estimate the model of essential heterogeneity using a parametric approach instead.

6.6. Tests on Essential Heterogeneity Using Parametric Approach

In this section we report tests on essential heterogeneity using a parametric approach in which the marginal treatment effect is estimated under the assumption of joint normality for the error terms ($U_0$, $U_1$); see Heckman, Urzua, and Vytlacil (2006). Under this specification, we find that the relationship between the propensity score and logged earnings is linear among men, whereas the linearity assumption is, again, rejected for the case of women, as shown below:

Male: E(Y | X = x, P(Z) = p) = (10.5)* + (.42)*p (R^2 = 1.00)

Female: E(Y | X = x, P(Z) = p) = (10.2)* + (.82)*p – (.12)*p^2 (R^2 = 1.00).

Figure 4 depicts the marginal treatment effects in the parametric form. As we can see in the figure, the male MTE appears constant. This is different from what we saw in Figure 2. In contrast, the female MTE declines with increasing unobserved component $U_D$, similar to what we reported earlier. Among women, there is a clear pattern that the higher the propensity of receiving college education, the higher the marginal returns to college education. The parametric results suggest that advantages in (observed) educational selectivity may lead to differential returns among women, but not among men.
(Figure 4 about here)

All in all, we find that the Mincer-type model fits the male data better, if the normality assumption holds true. By contrast, the Heckman-type selection model of essential heterogeneity fits the female data better, irrespective of which estimation approach is used.

6.7. Summary: Sorting Gain and Selection Bias

To summarize the results, Table 4 presents comparisons of various treatment parameters of interest, along with the conventional OLS and IV estimators obtained from the same analysis samples. A few findings emerge from the table.

(Table 4 about here)

First of all, generally speaking, the treatment effect for the treated (TT) is larger than the average treatment effect (ATE), which is larger than the treatment effect for the untreated (TUT). The pattern TT > ATE > TUT holds for both genders and for the two approaches used, with one exception. The parametric estimation yields a result that TUT > ATE > TT among men. This result seems odd, but the differences across these three estimates are, in fact, statistically insignificant. We will come back to this point later.

Second, the OLS coefficients are smaller than the ATE estimated, with one exception (i.e., the male semiparametric-LIV estimation). By contrast, the conventional IV estimates are larger than ATE, with one exception (i.e., men's IV is the same as their ATE estimated parametrically). It seems that conventional approaches (OLS and IV) may yield misleading estimates of the returns to schooling, because our data, in particular the female data, fit the selection model of essential heterogeneity better than the conventional Mincer-type model. Among women, the bias (for either ATE or TT) in the Mincer coefficient is both substantial and significant.18 This is true for both semiparametric-LIV and parametric estimations.

Third, we find a negative selection bias and a positive sorting gain in the female sample, whereas in the male sample sorting gain and selection bias are neither consistent in sign nor statistically significant. In contrast, women's negative selection bias is both significant and substantial. Both semiparametric and parametric estimations suggest that women who attend college would have low incomes in the labor market if they had not made it to college. Besides, it seems that there is a purposive sorting into college on the basis of unobserved gains among women. Yet, likely due to a moderate sample size, women's sorting gain is not statistically significant, regardless of the estimation method used. In other words, we do not find strong evidence in support of the principle of comparative advantage at work in Taiwan.

Fourth, the treatment effects are larger in magnitude for women than for men, no matter which estimation method is used. On the one hand, the semiparametric-LIV estimation indicates that the average treatment effect for women (ATE = 62.1%, about 18.8% annually) is much higher than that for men (ATE = 14.4%, about 3.9% annually), with a significant gender gap (47.8%). Although the female TT and TUT are also higher than their male counterparts, gender disparities in these two parameters (43.3% and 49.6%, respectively) are not statistically significant. The higher education returns for women should be interpreted as resulting from women's severe disadvantage in the low-skill labor market rather than their advantage over men in the high-skill labor market (Xie and Hannum 1996). On the other hand, the parametric estimation reveals that gender differentials (in TT, ATE, and TUT) are all statistically significant.

18 We use the parametric bootstrap method to test whether or not the difference in magnitude between two parameters of concern is statistically significant at the level of α = .05. See Johnston and DiNardo (1997: 365-6) for the method used.
Note that for the male sample, the parametric approach yields a similar rate of return to schooling for TT, ATE, TUT, and IV, which is around 42% (about 11% annually). This finding implies that the male case may be approximated at the aggregate level by the homogeneous model in which $U_1 = U_0$ and thus IV = ATE = TT (> OLS). In contrast, the female case is best described by the selection model of essential heterogeneity, in which TT > IV > ATE > TUT > OLS.

7. Conclusion and Discussion

Both economists and sociologists have had a long-standing interest in estimating the causal effect of education on labor market outcomes such as earnings. While there is a consensus in the relevant literature that higher education is associated with higher earnings, there are disagreements over the nature of this observed relationship and the proper way to estimate the true magnitude of the causal effect of education on earnings. In this paper, we test and distinguish between the Mincer-type productivity model and the Heckman-type selection model of essential heterogeneity. Our empirical results reveal substantial individual heterogeneity in the Taiwanese data, especially in the female data. In Taiwan, as elsewhere (see, e.g., Gerber and Schaefer 2004; Mare 1995; Xie and Hannum 1996), the gender gap in earnings is smaller among university degree holders, but the returns to a university education are greater for women than for men. Not only do we find profound gender differences in the returns to schooling, we also find heterogeneity within the female sample. Among women, TT > ATE > TUT with a positive sorting gain and negative selection bias. In such a case, the Heckman-type selection model of essential heterogeneity definitely fits the female data better, regardless of the estimation approach used. The downward biases in the Mincer coefficient for both ATE and TT are statistically significant.
It seems that among women, not only do the treated group and the untreated group differ in unmeasured heterogeneity, but schooling choices are based on unobserved gains. In such a situation, a significant selection bias in terms of sorting gain is observed. Women who attend college would have low earnings if they had not made it to college. By contrast, the parametric estimation indicates that the Mincer-type model may fit the male data under the normality assumption, implying no significant group differences in unmeasured heterogeneity between college attendees and high-school/junior-college graduates. In this situation, $U_{1i} = U_{0i}$ so $\Delta_i = \Delta$, and thus TT is close to ATE, with a trivial sorting gain. Besides, there is no significant selection bias among men. Although in many respects our results are consistent with the relevant literature, there are some limitations in this study. For example, our data are limited by a small sample size and few covariates. We use a very simple version of the Mincer model with no other background variable in the earnings equation. We consider parental education an important indicator of family socioeconomic background that directly affects the attainment of college education and only indirectly influences labor force outcomes through college education. Besides, we retain the assumption of unobservable ability throughout the analysis and do not use a proxy of ability. Future studies would benefit from either collecting new and better Taiwanese data or using existing rich data sets in which not only parental income but direct measures of intelligence (e.g., IQ) and achievement aspirations are available. A well-known tenet of status attainment research is that education is a crucial intervening link between the social background of individuals and their later socioeconomic achievements.
In a stylized form of path analysis, parental education is assumed to affect one's educational attainment, which in turn affects one's occupational attainment and earnings (Blau and Duncan 1967; Sewell and Hauser 1975). A college degree mediates the direct effect of social origins on destinations in the United States (Hout 1988), whereas the gap in earnings between persons with a college education and those with only a high school education has widened since the 1980s (Mare 1995; Morris and Western 1999). The American case indicates that education acted as a great divider during the last century (Fisher and Hout 2006). Similarly, education has long served as a main vehicle of social mobility and a key divider as well in Taiwan.

Finally, the connection between educational selectivity and socioeconomic inequality is complex. As is well known, allocation and selection processes into positions in different dimensions of social stratification, such as education and earnings, are contingent on the particular social-political context under study, and so are the socioeconomic consequences of educational selectivity. We propose three new directions for future research. First, a careful study of temporal trends in Taiwan is needed to understand the impact of the educational expansion in Taiwan on labor force outcomes of college graduates. Second, the Taiwanese case should be compared to the experiences in other societal contexts, such as Western developed countries, Japan, and newly industrializing countries in East Asia, such as Korea and China. Third, it will also be useful to expand the causal analysis from two potential outcomes to multiple potential outcomes, as educational outcomes are not binary. The sequential nature of educational qualification and options has been emphasized in the literature of educational stratification (e.g., Breen and Jonsson 2000; Lucas 2001; Morgan 2005) and should be brought into the study of returns to education.
REFERENCES

Angrist, Joshua D. and Alan B. Krueger. 1999. "Empirical Strategies in Labor Economics." Pp. 1277–1366 in Handbook of Labor Economics, vol. 3A, edited by O. Ashenfelter and D. Card. Amsterdam: Elsevier.
Becker, Gary S. 1964. Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education. New York: Columbia University Press.
Björklund, Anders and Robert Moffitt. 1987. "The Estimation of Wage Gains and Welfare Gains in Self-Selection Models." The Review of Economics and Statistics 69: 42–49.
Blau, Peter M. and Otis D. Duncan. 1967. The American Occupational Structure. New York: Free Press.
Blundell, Richard, Lorraine Dearden, and Barbara Sianesi. 2005. "Evaluating the Effect of Education on Earnings: Models, Methods and Results from the National Child Development Survey." Journal of the Royal Statistical Society 168: 473–512.
Bobbitt-Zeher, Donna. 2007. "The Gender Income Gap and the Role of Education." Sociology of Education 80: 1–22.
Boudon, Raymond. 1974. Education, Opportunity, and Social Inequality: Changing Prospects in Western Society. New York: Wiley.
Brand, Jennie E. and Charles N. Halaby. 2006. "Regression and Matching Estimates of the Effects of Elite College Attendance on Educational and Career Achievement." Social Science Research 35: 749–770.
Brand, Jennie E. and Yu Xie. 2007. "Identification and Estimation of Causal Effects with Time-Varying Treatments and Time-Varying Outcomes." Sociological Methodology 37: 393–434.
Breen, Richard and John H. Goldthorpe. 1997. "Explaining Educational Differentials: Towards a Formal Rational Action Theory." Rationality and Society 9: 275–305.
Breen, Richard and Jan O. Jonsson. 2000. "Analyzing Educational Careers: A Multinomial Transitional Model." American Sociological Review 65: 754–772.
Buchmann, Claudia and Thomas A. DiPrete. 2006. "The Growing Female Advantage in College Completion: The Role of Family Background and Academic Achievement." American Sociological Review 71: 515–541.
Buchmann, Claudia and Emily Hannum. 2001. "Education and Stratification in Developing Countries: A Review of Theories and Research." Annual Review of Sociology 27: 77–102.
Card, David. 1995. "Earnings, Schooling, and Ability Revisited." Research in Labor Economics 14: 23–48.
——. 1999. "The Causal Effect of Education on Earnings." Pp. 1801–1863 in Handbook of Labor Economics, vol. 3A, edited by O. Ashenfelter and D. Card. Amsterdam: Elsevier.
——. 2001. "Estimating the Return to Schooling: Progress on Some Persistent Econometric Problems." Econometrica 69: 1127–1160.
Carneiro, Pedro, Karsten T. Hansen, and James J. Heckman. 2003. "Estimating Distributions of Treatment Effects with an Application to the Returns to Schooling and Measurement of the Effects of Uncertainty on College Choice." International Economic Review 44: 361–422.
Carneiro, Pedro and James J. Heckman. 2002. "The Evidence on Credit Constraints in Post-Secondary Schooling." The Economic Journal 112: 705–734.
Collins, Randall. 1979. The Credential Society: A Historical Sociology of Education and Stratification. New York: Academic Press.
DiPrete, Thomas A. and Henriette Engelhardt. 2004. "Estimating Causal Effects with Matching Methods in the Presence and Absence of Bias Cancellation." Sociological Methods and Research 32: 501–528.
DiPrete, Thomas A. and Markus Gangl. 2004. "Assessing Bias in the Estimation of Causal Effects: Rosenbaum Bounds on Matching Estimators and Instrumental Variables Estimation With Imperfect Instruments." Sociological Methodology 34: 271–310.
Fisher, Claude S. and Michael Hout. 2006. Century of Difference: How America Changed in the Last One Hundred Years. New York: Russell Sage Foundation.
Gerber, Theodore P. and David R. Schaefer. 2004. "Horizontal Stratification of Higher Education in Russia: Trends, Gender Differences, and Labor Market Outcomes." Sociology of Education 77: 32–59.
Glewwe, Paul. 2002. "Schools and Skills in Developing Countries: Education Policies and Socioeconomic Outcomes." Journal of Economic Literature 40: 436–482.
Griliches, Zvi. 1977. "Estimating the Returns to Schooling: Some Econometric Problems." Econometrica 45: 1–22.
Gronau, Reuben. 1974. "Wage Comparisons—A Selectivity Bias." Journal of Political Economy 82: 1119–1143.
Grusky, David B. 1983. "Industrialization and the Status Attainment Process: the Thesis of Industrialism Reconsidered." American Sociological Review 48: 494–506.
Hauser, Robert M. and Megan Andrew. 2006. "Another Look at the Stratification of Educational Transitions: the Logistic Response Model with Partial Proportionality Constraints." Sociological Methodology 36: 1–26.
Hauser, Seth M. and Yu Xie. 2005. "Temporal and Regional Variation in Earnings Inequality: Urban China in Transition between 1988 and 1995." Social Science Research 34: 44–79.
Heckman, James J. 1974. "Shadow Prices, Market Wages, and Labor Supply." Econometrica 42: 679–694.
——. 1980. "Addendum to 'Sample Selection Bias as a Specification Error'." Pp. 69–74 in Evaluation Studies Review Annual, vol. 5, edited by E. Stromsdorfer and G. Farkas. Beverly Hills: Sage Publications.
——. 2001a. "Micro Data, Heterogeneity, and the Evaluation of Public Policy: Nobel Lecture." Journal of Political Economy 109: 673–748.
——. 2001b. "Accounting for Heterogeneity, Diversity and General Equilibrium in Evaluating Social Programmes." The Economic Journal 111: F654–F699.
Heckman, James J. and Bo E. Honoré. 1990. "The Empirical Content of the Roy Model." Econometrica 58: 1121–1149.
Heckman, James J. and Xuesong Li. 2004. "Selection Bias, Comparative Advantage and Heterogeneous Returns to Education: Evidence from China in 2000." Pacific Economic Review 9: 155–171.
Heckman, James J., Lance J. Lochner and Petra E. Todd. 2006. "Earnings Functions, Rates of Return and Treatment Effects: The Mincer Equation and Beyond." Pp. 307–458 in Handbook of the Economics of Education, vol. 1, edited by E. Hanushek and F. Welch. Amsterdam: North-Holland.
Heckman, James J. and Richard Robb. 1985. "Alternative Methods for Estimating the Impact of Interventions." Pp. 156–245 in Longitudinal Analysis of Labor Market Data, edited by J.J. Heckman and B. Singer. New York: Cambridge University Press.
——. 1986. "Alternative Methods for Solving the Problem of Selection Bias in Evaluating the Impact of Treatments on Outcomes." Pp. 63–107 in Drawing Inferences from Self-Selected Samples, edited by H. Wainer. New York: Springer-Verlag.
Heckman, James J., Sergio Urzua, and Edward Vytlacil. 2006. "Understanding Instrumental Variables in Models with Essential Heterogeneity." The Review of Economics and Statistics 88: 389–432.
Heckman, James J. and Edward J. Vytlacil. 1999. "Local Instrumental Variables and Latent Variable Models for Identifying and Bounding Treatment Effects." Proceedings of the National Academy of Sciences of the United States of America 96: 4730–4734.
——. 2000. "The Relationship between Treatment Parameters within a Latent Variable Framework." Economics Letters 66: 33–39.
——. 2001. "Instrumental Variables, Selection Models, and Tight Bounds on the Average Treatment Effect." Pp. 1–15 in Econometric Evaluation of Labour Market Policies, edited by M. Lechner and F. Pfeiffer. New York: Physica-Verlag.
——. 2005. "Structural Equations, Treatment Effects, and Econometric Policy Evaluation." Econometrica 73: 669–738.
Holland, Paul W. 1986. "Statistics and Causal Inference." Journal of the American Statistical Association 81: 945–960.
Hout, Michael. 1988. "More Universalism, Less Structural Mobility: The American Occupational Structure in the 1980s." American Journal of Sociology 93: 1358–1400.
Imbens, Guido W. and Joshua D. Angrist. 1994. "Identification and Estimation of Local Average Treatment Effects." Econometrica 62: 467–475.
Johnston, Jack and John DiNardo. 1997. Econometric Methods.
New York: McGraw-Hill. Lucas, Samuel R. 2001. "Effectively Maintained Inequality: Education Transitions, Track Mobility, and Social Background Effects." American Journal of Sociology 106: 1642–1690. Mare, Robert D. 1980. "School Background and Social Continuation Decisions." Journal of the American Statistical Association 75: 295–305. ——. 1995. "Changes in Educational Attainment and School Enrollment." Pp. 155–213 in State of the Union: American in the 1990s, vol. 1, edited by R. Farley. New York: Russell Sage Foundation. McCall, Leslie. 2000. "Gender and the New Inequality: Explaining the College/Non-College Wage Gap." American Sociological Review 65: 234–255. 42 Mincer, Jacob. 1974. Schooling, Experience, and Earnings. New York: National Bureau of Economic Research. Morgan, Stephen L. 2001. "Counterfactuals, Causal Effect Heterogeneity, and the Catholic School Effect on Learning." Sociology of Education 74: 341–374. ——. 2004. "Methodologist as Arbitrator: Five Methods for Black-White Differences in the Causal Effect of Expectations on Attainment." Sociological Methods and Research 33: 3–53. ——. 2005. On the Edge of Commitment: Educational Attainment and Race in the United States. Stanford: Stanford University Press. Morgan, Stephen L. and David J. Harding. 2006. "Matching Estimators of Causal Effects: Prospects and Pitfalls in Theory and Practice." Sociological Methods and Research 35: 3–60. Morgan, Stephen L. and William R. Morgan. 1998. "Education and Earnings in Nigeria, 1974-1992." Research in Social Stratification and Mobility 16: 3–26. ——. 2004. "Educational Pathways into the Evolving Labour Market of West Africa." Research in Sociology of Education 14: 225–245. Morgan, Stephen L. and Christopher Winship. 2007. Counterfactuals and Causal Inference: Methods and Principles for Social Research. New York: Cambridge University Press. Morris, Martina and Bruce Western. 1999. "Inequality in Earnings at the Close of the Twentieth Century." 
Annual Review of Sociology 25: 623–657. Neal, Derek and Sherwin Rosen. 2000. "Theories of the Distribution and Earnings." Pp. 379–427 in Handbook of Income Distribution, vol. 1, edited by A.B. Atkinson and F. Bourguignon. Amsterdam: Elsevier Science. Psacharopoulos, George. 1985. "Returns to Education: A Further International Update and Implications." The Journal of Human Resources 20: 583–604. Quandt, Richard E. 1972. "A New Approach to Estimating Switching Regressions." Journal of the American Statistical Association 67: 306–310. Raftery, Adrian E. and Michael Hout. 1993. "Maximally Maintained Inequality: Expansion, Reform and Opportunity in Irish Education 1921-75." Sociology of Education 66: 41–62. Rosenbaum, Paul R. and Donald B. Rubin. 1983. "The Central Role of the Propensity Score in Observational Studies for Causal Effects." Biometrika 70: 41–55. Roy, Andrew D. 1951. "Some Thoughts on the Distribution of Earnings." Oxford Economic 43 Paper 3: 135–146. Sewell , William H. and Robert M. Hauser. 1975. Education, Occupation, and Earnings: Achievement in the Early Career. New York: Academic Press. Shavit, Yossi, Richard Arum, and Adam Gamoran. 2007. Stratification in Higher Education: A Comparative Study. Stanford: Stanford University Press. Shavit, Yossi and Hans-Peter Blossfeld. 1993. Persistent Inequality: Changing Educational Attainment in Thirteen Countries. Boulder: Westview Press. Shavit, Yossi and Walter Müller. 1997. From School to Work: A Comparative Study of Educational Qualifications and Occupational Destinations. Oxford: Clarendon Press. Sobel, Michael E. 1995. "Causal Inference in the Social and Behavioral Sciences." Pp. 1–38 in Handbook of Statistical Modeling for the Social and Behavioral Sciences, edited by G. Arminger, C.C. Clogg, and M.E. Sobel. New York : Plenum Press. Treiman, Donald J. 1970. "Industrialization and Social Stratification." Pp. 207–234 in Social Stratification: Research and Theory for the 1970's, edited by E.O. Laumann. 
Indianapolis: Bobbs-Merrill.
Tsai, Shu-Ling and Yossi Shavit. 2007. "Taiwan: Higher Education—Expansion and Equality of Educational Opportunity." Pp. 140–164 in Stratification in Higher Education: A Comparative Study, edited by Y. Shavit, R. Arum, and A. Gamoran. Stanford: Stanford University Press.
Willis, Robert J. 1986. "Wage Determinants: A Survey and Reinterpretation of Human Capital Earnings Functions." Pp. 525–602 in Handbook of Labor Economics, vol. 1, edited by O. Ashenfelter and R. Layard. Amsterdam: Elsevier Science.
Willis, Robert J. and Sherwin Rosen. 1979. "Education and Self-Selection." Journal of Political Economy 87: S7–S36.
Winship, Christopher and Stephen L. Morgan. 1999. "The Estimation of Causal Effects from Observational Data." Annual Review of Sociology 25: 659–707.
Xie, Yu and Emily Hannum. 1996. "Regional Variation in Earnings Inequality in Reform-Era Urban China." American Journal of Sociology 101: 950–992.
Xie, Yu and Xiaogang Wu. 2005. "Market Premium, Social Process, and Statisticism." American Sociological Review 70: 865–870.

Table 1. Variables and Descriptive Statistics

                                                      Male                           Female
                                             Treated        Untreated        Treated        Untreated
Variables                                   Mean      SD    Mean      SD    Mean      SD    Mean      SD
Years of schooling                        16.623   1.019  12.841   1.077  16.357    .794  13.006   1.139
Monthly earnings                          46,754  31,085  40,121  27,731  40,893  15,911  28,385  12,673
Log of earnings                           10.611    .523  10.471    .479  10.542    .405  10.168    .413
Mincer experience                          6.583   3.028  10.755   3.084   7.015   2.853  10.496   3.211
Father's years of schooling               10.447   4.201   7.840   3.476  10.270   3.911   7.700   3.440
Mother's years of schooling                8.237   3.862   6.116   3.265   8.087   4.032   5.871   3.192
Ethnicity (= 1 if yes)
  Hokkien                                   .732            .729            .679            .775
  Hakka                                     .118            .162            .128            .113
  Mainlander                                .149            .102            .189            .096
  Aborigine                                 .000            .007            .005            .017
Residence prior to age 15 (= 1 if yes)
  Major city                                .246            .207            .327            .200
  Not major city                            .500            .584            .474            .588
  Not in Taiwan                             .000            .003            .010            .013
  Missing data                              .254            .205            .189            .200
Birth cohort (= 1 if yes)
  1967                                      .018            .021            .010            .033
  1968                                      .048            .053            .056            .054
  1969                                      .075            .097            .102            .090
  1970                                      .083            .117            .071            .100
  1971                                      .105            .095            .092            .088
  1972                                      .105            .097            .122            .104
  1973                                      .088            .119            .122            .108
  1974                                      .110            .084            .102            .106
  1975                                      .083            .083            .082            .094
  1976                                      .140            .112            .133            .100
  1977                                      .070            .078            .046            .085
  1978                                      .075            .045            .061            .038
Sample size (N)                              228             580             196             480

Note: Mincer experience = Age – years of schooling – 6.

Table 2. OLS Regressions Predicting Logged Earnings

Independent Variables                   Male      Female
Intercept                             9.596*      9.696*
                                      (.110)      (.113)
College attendee (= 1 if yes)          .311*       .432*
                                      (.044)      (.039)
Mincer experience                      .163*       .094*
                                      (.022)      (.024)
Experience squared                    -.007*      -.004*
                                      (.001)      (.001)
R²                                     .091        .169
N                                       808         676

* Significant at the level of α = .05; Numbers in parentheses are standard errors.

Table 3. Estimated Probit Model for College Attendance

                                       Male (N = 808)                     Female (N = 676)
                                                   Mean marginal                       Mean marginal
Independent Variables         Coefficient      SE         effect  Coefficient      SE         effect
Intercept                         -1.485*    .484             —       -1.765*    .527             —
Father's schooling (FS)             .100*    .042          .032*        .043     .038          .014
Mother's schooling (MS)            -.009     .049         -.003        -.052     .053         -.017
Ethnicity (relative to Hokkien)
  Hakka                           -1.444    1.335         -.305       -1.058    1.299         -.247
  Mainlander                        .912     .794          .338        -.362    1.128         -.107
  Aborigine                            —        —             —        -.572     .582         -.151
Residence prior to age 15 (relative to not in major city)
  Major city                        .115     .701          .038       -1.899    1.056         -.410
  Not in Taiwan                        —        —             —         .345     .555          .123
  Missing data                      .224     .128          .075        -.096     .155         -.031
Birth cohort (relative to 1967)
  1968                              .143     .414          .048         .590     .486          .217
  1969                             -.208     .391         -.064         .819     .463          .305
  1970                             -.272     .388         -.082         .429     .464          .153
  1971                             -.005     .385         -.002         .678     .462          .250
  1972                             -.045     .385         -.014         .675     .456          .248
  1973                             -.342     .386         -.101         .742     .455          .274
  1974                              .016     .384          .005         .492     .460          .177
  1975                             -.121     .391         -.038         .413     .469          .147
  1976                              .026     .378          .009         .738     .456          .272
  1977                             -.128     .395         -.040         .270     .481          .094
  1978                              .077     .408          .026         .898     .500          .339
Interaction terms
  FS * MS                           .003     .004          .001         .007     .005          .002
  FS * Hakka                        .074     .137          .024         .020     .128          .006
  FS * Mainlander                  -.176*    .077         -.057*        .007     .103          .002
  FS * Major city                  -.038     .080         -.012         .185     .103          .060
  MS * Hakka                        .340     .194          .111         .253     .189          .083
  MS * Mainlander                  -.045     .121         -.015         .028     .169          .009
  MS * Major city                  -.040     .098         -.013         .349*    .145          .114*
  Hakka * Major City                .480     .558          .175         .357     .521          .128
  Mainlander * Major City           .273     .365          .095        -.885*    .414         -.207*
  FS * MS * Hakka                  -.028     .017         -.009        -.012     .014         -.004
  FS * MS * Mainlander              .013     .010          .004         .006     .014          .002
  FS * MS * Major city              .005     .009          .002        -.027*    .012         -.009*

* Significant at the level of α = .05.

Table 4. Comparisons of Different Treatment Parameters: Using Two Different Estimation Approaches

                                   Semiparametric–LIV                  Parametric
Parameter                       Male (N=770)  Female (N=633)    Male (N=770)  Female (N=633)
1. ATE                                 .144            .621*           .418*           .715*
                                      (.215)          (.082)          (.100)          (.084)
2. TT                                  .245            .678*           .414*           .769*
                                      (.216)          (.084)          (.136)          (.105)
3. TUT                                 .102            .598*           .420*           .693*
                                      (.283)          (.092)          (.098)          (.091)
4. OLS                                 .298*           .432*           .298*           .432*
                                      (.045)          (.039)          (.045)          (.039)
5. IV                                  .240            .634*           .418*           .729*
                                      (.140)          (.077)          (.105)          (.084)
6. Bias = OLS–ATE                      .154           -.190*          -.120           -.283*
                                      (.220)          (.091)          (.109)          (.092)
7. Selection bias = OLS–TT             .053           -.247*          -.116           -.337*
                                      (.220)          (.092)          (.143)          (.112)
8. Sorting gain = TT–ATE               .102            .057           -.005            .054
                                      (.305)          (.117)          (.169)          (.134)

* Significant at the level of α = .05; Numbers in parentheses are standard errors.

Figure 1. Density of Estimated Propensity Score: P(Z) = Prob(D = 1 | Z)
[Figure not reproduced. Panels: Male, Female. Horizontal axis: P; vertical axis: f(P). Series: Untreated Group, Treated Group.]

Figure 2. MTE as a Function of Unobserved Heterogeneity UD: Semiparametric–LIV Approach
[Figure not reproduced. Panels: Male, Female. Horizontal axis: UD; vertical axis: MTE.]

Figure 3. Weights as a Function of Unobserved Heterogeneity UD: Semiparametric–LIV Approach
[Figure not reproduced. Panels: Male, Female. Horizontal axis: UD; vertical axis: Weight. Series: TT, TUT, ATE.]

Figure 4. MTE as a Function of Unobserved Heterogeneity UD: Parametric Approach
[Figure not reproduced. Panels: Male, Female. Horizontal axis: UD; vertical axis: MTE.]
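
Rows 6 to 8 of Table 4 are arithmetic combinations of rows 1, 2, and 4, so the conventional bias (OLS – ATE) splits exactly into selection bias (OLS – TT) plus sorting gain (TT – ATE). The following Python sketch is not part of the original report; it recomputes the three quantities from the rounded point estimates printed in Table 4. The names `TABLE4` and `decompose` are hypothetical, and the small third-decimal differences from the printed rows are assumed to come from rounding in the source.

```python
# Point estimates copied from Table 4 (rounded to three decimals in the source).
# The dictionary layout and all names here are illustrative assumptions.
TABLE4 = {
    ("semiparametric_liv", "male"):   {"ATE": 0.144, "TT": 0.245, "OLS": 0.298},
    ("semiparametric_liv", "female"): {"ATE": 0.621, "TT": 0.678, "OLS": 0.432},
    ("parametric", "male"):           {"ATE": 0.418, "TT": 0.414, "OLS": 0.298},
    ("parametric", "female"):         {"ATE": 0.715, "TT": 0.769, "OLS": 0.432},
}


def decompose(est):
    """Return (bias, selection_bias, sorting_gain) as defined in Table 4."""
    bias = est["OLS"] - est["ATE"]            # row 6: OLS - ATE
    selection_bias = est["OLS"] - est["TT"]   # row 7: OLS - TT
    sorting_gain = est["TT"] - est["ATE"]     # row 8: TT - ATE
    return bias, selection_bias, sorting_gain


for (approach, sex), est in TABLE4.items():
    bias, selection_bias, sorting_gain = decompose(est)
    # The identity bias = selection bias + sorting gain holds by construction.
    assert abs(bias - (selection_bias + sorting_gain)) < 1e-12
    print(f"{approach:20s} {sex:6s} bias={bias:+.3f} "
          f"selection={selection_bias:+.3f} sorting={sorting_gain:+.3f}")
```

For women under the semiparametric–LIV approach, the recomputed bias is about -.189, made up of about -.246 selection bias and .057 sorting gain, against printed values of -.190, -.247, and .057.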