Equilibrium stock return dynamics under alternative rules of learning about hidden states


Journal of Economic Dynamics & Control 28 (2004) 1925–1954
www.elsevier.com/locate/econbase

Equilibrium stock return dynamics under alternative rules of learning about hidden states

Michael W. Brandt (a,*), Qi Zeng (b), Lu Zhang (c)

a The Fuqua School of Business, Duke University and NBER, Durham, NC 27708, USA
b Faculty of Economics and Commerce, University of Melbourne, Melbourne, VIC 3010, Australia
c William E. Simon Graduate School of Business Administration, University of Rochester, Rochester, NY 14627, USA

Received 6 August 2002; accepted 16 September 2003

Abstract

We examine the properties of equilibrium stock returns in an economy in which agents need to learn the hidden state of the endowment process. We consider Bayesian and suboptimal learning rules, including near-rational learning, conservatism, representativeness, optimism, and pessimism. Bayesian learning produces realistic variation in the conditional equity risk premium, return volatility, and Sharpe ratio. Alternative learning behaviors significantly alter the level and variation of the conditional return moments. However, when agents are allowed to be conscious of their learning mistakes and to price assets accordingly, the properties of returns under Bayesian and alternative learning rules are virtually indistinguishable. © 2004 Elsevier B.V. All rights reserved.

JEL classification: G0; G12; G14

Keywords: Time-varying moments of returns; Behavioral biases

1. Introduction

The equity risk premium is time-varying, and understanding why and how it varies is a lively research field. Intuitively, there are two reasons for the risk premium to vary in a rational expectations equilibrium (REE) framework: either the compensation required by agents to take on a marginal unit of risk (the market price of risk) changes, or the amount of risk in the economy changes. It is relatively straightforward to generate endogenous changes in the market price of risk through changing aggregate preferences (induced, for example, by habit formation or heterogeneous agents), but it is more difficult to generate endogenous changes in the variance-covariance structure of a REE model. One mechanism is incomplete information, where agents must learn about unobservable features of the economy, such as parameters or latent state variables, from observables. [1] As agents become more or less sure about the true values of the unobservables, the uncertainty in the economy fluctuates and, as a result, the risk premium varies.

* Corresponding author. Tel.: +1-919-660-1948; fax: +1-919-660-8038. E-mail address: mbrandt@duke.edu (M.W. Brandt). 0165-1889/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.jedc.2003.09.003

Relative to time-varying preferences, where the variation in the risk premium is essentially controlled by the researcher's modeling of the preferences, the incomplete information setting is significantly less flexible. The fact that there is only one way to learn optimally, namely through Bayesian updating, ties the researcher's hands. Rather than being a modeling choice, the learning process, which generates the time-variation of the risk premium in an economy with incomplete information, is fixed by the assumption of rational expectations.

Despite being optimal and therefore rational, Bayesian learning is not the only learning process advocated in the literature. In fact, it has recently become fashionable to explain empirical irregularities which are difficult to explain in a fully rational model through alternative forms of learning motivated by the psychology literature. For example, Barberis et al. (1998) and Brav and Heaton (2002) explain over- and under-reaction of stock prices to news with 'representativeness' and 'conservatism', where agents place too much or too little weight on the most recent data relative to Bayesian learning. Daniel et al.
(1998, 2001) and Odean (1998) use 'overconfidence', where agents are too confident in the quality of private information, to explain the same phenomena. Cecchetti et al. (2000) resolve the equity risk premium puzzle with 'optimism' about the duration of recessions and 'pessimism' about the duration of expansions. Finally, Abel (2002) studies the effect of pessimism and 'doubt' on expected returns.

Since the learning process controls the dynamics of the risk premium in a REE model with incomplete information (but constant aggregate preferences), and there are a variety of alternative learning rules advocated in the finance literature, it is natural to consider the effects of these alternative learning rules on the dynamics of the risk premium. This is the aim of this paper. We conduct a systematic study of the quantitative effects of alternative learning, as opposed to Bayesian learning, on the conditional distribution of stock returns in an otherwise REE model. The three key features of our approach are:

• Common economic model. We study a variety of learning rules in the context of a common economic model. We consider a Lucas (1978) fruit-tree economy with identical agents that have recursive preferences (Epstein and Zin, 1989, 1991; Weil, 1989) and an exogenous endowment that follows a four-state Markov switching process. The agents know the structure of the economy and all of its parameters but cannot observe the current or past states of the economy. The only difference between the versions of this model we consider is the updating rule the agents use to incorporate new information into their beliefs about the hidden state of the economy. [2]

• Broad set of learning rules. The alternative learning rules we consider cover a broad spectrum of the literature on bounded rationality and learning: near-rational learning, in which the agents update their beliefs about the hidden state using Bayes' rule but occasionally make random mistakes; conservatism and representativeness, in which the agents update their beliefs with too little or too much emphasis on the most recent data; and optimism and pessimism, in which the agents systematically bias their beliefs toward or away from the good states. [3]

• Distinction between ignorant and conscious agents. We argue that there are two ways to introduce alternative learning into an otherwise REE model. The first is to assume that the agents follow a suboptimal learning rule but think that they learn optimally, which implies that the assets are priced the same way as in the Bayesian benchmark model except with different state-beliefs. We refer to these agents as ignorant, since they are unaware of their own limitations. The second way is to assume that the agents knowingly follow a suboptimal learning rule and account for this fact in setting the asset prices (effectively trying to compensate for or hedge against their learning mistakes). We refer to these agents as conscious, and note that assuming irrational but conscious agents represents a far less severe breach of full rationality than irrational and ignorant agents. We further argue that consciousness can be justified from a costs versus benefits perspective of correcting either the learning behavior, which is an on-going effort, or the asset pricing rule, which involves only a one-time correction. It may well be optimal and rational for the agents to be consciously irrational.

Footnote 1: Incomplete information models include Detemple (1986, 1991), Wang (1993a), Moore and Schaller (1996), David (1997), Brennan and Xia (1998), and Veronesi (1999, 2000).

Depending on whether one believes in the ideal of full rationality or not (we deliberately do not take a stance on this issue here), there are two ways to interpret the contributions of this paper.
From a bounded rationality perspective, we compare a broad range of behavioral learning rules within a common economic model and study the implications of allowing agents to be conscious. Our results can be used to assess the equilibrium implications of a given behavioral learning rule or, from a reverse-engineering perspective, to determine which behavioral learning rule is best suited for matching the stylized features of the data. From a full-rationality perspective, we check the robustness of the incomplete information model to deviations from Bayesian learning. In that sense, our analysis contributes to the extensive literature on the robustness of REE models to deviations from optimal behavior. [4]

Footnote 2: We impose the same learning rule uniformly on all agents, or equivalently on the representative agent, and hence disregard the interesting issue of how agents with different learning rules and/or heterogeneous beliefs interact and aggregate. The role of competing learning rules and heterogeneous beliefs is studied by Brock and Hommes (1997, 1998), Brock and LeBaron (1996), Detemple and Murthy (1994), LeBaron (2001), and Wang (1993b), among others. A related issue (which we also sidestep) is whether rational and irrational agents can co-exist in a competitive market. For this research topic, see Bernardo and Welch (2001), DeLong et al. (1990), Hirshleifer and Luo (2001), and Shleifer and Vishny (1997).

Footnote 3: See Camerer (1995) and Conlisk (1996) for detailed surveys of this literature.

Our findings are easy to summarize. Bayesian learning performs reasonably well in matching the unconditional moments of stock returns and in producing realistic variation in the conditional equity premium, return volatility, and Sharpe ratio. Alternative learning of ignorant agents affects both the level and time-variation of the moments of stock returns. However, allowing agents to be conscious of their suboptimal learning behavior eliminates virtually all of these differences in the return dynamics. This suggests that the benefits of considering alternative learning rules depend crucially on the assumption of ignorance.

The remainder of this paper is structured as follows. In Section 2, we set up the economic model and describe how asset prices are determined under full and incomplete information. Section 3 reviews Bayesian learning and formalizes the alternative learning rules. We present our quantitative results in Section 4 and conclude in Section 5.

2. Economic model

We consider a Lucas (1978) fruit-tree economy populated by a large number of identical and infinitely lived individuals that can be aggregated into a single representative agent. The only source of income in the economy is a large number of identical and infinitely lived fruit trees. Without loss of generality, we assume that there exists one tree per individual, so that the amount of fruit produced by a tree in period t, denoted D_t, represents the output or dividend per capita. The fruits are non-storable and cannot be used to increase the number of trees. In equilibrium, all fruits are therefore consumed during the period in which they are produced, i.e., C_t = D_t, where C_t is the per-capita consumption in period t. Finally, we assume that each tree has a single perfectly divisible claim outstanding on it and that this claim can be freely traded at a price P_t in a competitive equity market.

The dividends are exogenously stochastic. [5] We define d_t ≡ ln D_t and assume that the dividend growth rate Δd_t ≡ d_t − d_{t−1} follows a Markov mean-switching process: [6]

    Δd_t = μ(S_{t−1}) + σ ε_t,   (1)

where ε_t is iid standard normal. S_t follows a finite-state Markov chain with transition matrix {p_{ij}}_{N×N}, where N is the number of states and p_{ij} is the conditional probability of the process being in state j next period given that it is in state i this period:

    p_{ij} = Prob[S_{t+1} = j | S_t = i]   (2)

with p_{ij} ∈ [0, 1]. For notational convenience, we let μ(i) denote μ(S_t = i).

Footnote 4: This literature includes Muth (1961), Akerlof and Yellen (1985a, b), Cochrane (1989), Day et al. (1974), Ingram (1990), Krusell and Smith (1996), Lettau and Uhlig (1999), and Wang (1993b).

Footnote 5: Timmermann (1994) argues more generally that there may exist a feedback from stock prices to dividends which can lead to the existence of multiple rational expectations equilibria. Incorporating this feedback effect into our incomplete information framework is beyond the scope of this paper.

Footnote 6: Cecchetti et al. (1990) provide empirical justification for modeling the dividend growth rate as a mean-switching process. Related models with similar endowment processes include Abel (1994), Cecchetti et al. (1993), and Kandel and Stambaugh (1990, 1991). Driffill and Sola (1998) present evidence that the volatility of dividend growth is also state-dependent. However, to keep the model simple we keep the dividend growth volatility constant.

Following Epstein and Zin (1989) and Weil (1989), we assume that the preferences of the representative agent are defined recursively by

    U_t = [(1 − β) C_t^{(1−γ)/θ} + β (E_t[U_{t+1}^{1−γ}])^{1/θ}]^{θ/(1−γ)},   (3)

where β ≤ 1 is the subjective discount factor, γ > 0 is the coefficient of relative risk aversion, and θ = (1 − γ)/(1 − (1/ψ)) with ψ > 0 being the elasticity of intertemporal substitution. The first-order condition with Epstein–Zin–Weil preferences can be expressed as

    E_t[β^θ (C_{t+1}/C_t)^{−θ/ψ} R_{t+1}^θ] = 1,   (4)

where R_{t+1} ≡ (P_{t+1} + D_{t+1})/P_t denotes the return on the market portfolio. If the agents have full information (i.e., know the structure of the economy, its parameters, and the current state S_t), we can solve for the equilibrium asset price P_t by the method of undetermined coefficients (see Appendix A for details).
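Appendix A is not reproduced here, but the full-information solution is easy to sketch in the power-utility special case (θ = 1, so that γ = 1/ψ). Conditional on S_t = i, Eq. (4) then reduces to φ(i) = β E[(D_{t+1}/D_t)^{1−γ} | S_t = i] Σ_j p_{ij}(φ(j) + 1), a linear system in the N values φ(i). The parameter values below are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

# Illustrative parameters (not the paper's calibration)
beta, gamma, sigma = 0.95, 2.0, 0.01            # discount factor, risk aversion, growth vol
mu = np.array([-0.02, -0.005, 0.005, 0.02])     # state-dependent mean growth mu(i)
P = np.array([[0.90, 0.10, 0.00, 0.00],         # transition matrix {p_ij}
              [0.05, 0.85, 0.10, 0.00],
              [0.00, 0.10, 0.85, 0.05],
              [0.00, 0.00, 0.10, 0.90]])

# E[(D_{t+1}/D_t)^{1-gamma} | S_t=i] for the conditionally log-normal growth of Eq. (1)
g = np.exp((1 - gamma) * mu + 0.5 * (1 - gamma) ** 2 * sigma ** 2)

# phi(i) = beta * g(i) * sum_j p_ij * (phi(j) + 1): a linear system in phi
A = beta * g[:, None] * P
phi = np.linalg.solve(np.eye(4) - A, A @ np.ones(4))
print(phi)   # the N full-information price-dividend ratios phi(i)
```

With recursive preferences (θ ≠ 1) the system becomes non-linear in the φ(i), but the same fixed point can still be solved by iteration.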
Specifically, the price-dividend ratio φ_t and the risk-free rate R_{f,t} take on only N values, φ(i) and R_f(i), for i = 1, 2, ..., N. In more realistic economies in which the process S_t is unobservable and the agents learn about the current state of the economy, the price-dividend ratio and risk-free rate are continuous functions. Intuitively, they are convex combinations of the full-information values.

Consider economies with incomplete information in which the agents know the structure and parameters of the model but do not observe the state variable S_t. [7] Formally, the agents know that Δd_t follows the Markov switching process in Eq. (1) with parameters μ(i) and σ and with transition probabilities p_{ij}. However, the agents must form an opinion about the probability that the economy is currently in any particular state, using the information filtration generated by the observed dividend series F_t = {d_0, d_1, ..., d_t} and a set of updating rules for their subjective beliefs (such as Bayes' rule). The agents' subjective probability assessment π_t ≡ {π_t(1), π_t(2), ..., π_t(N − 1)}, where π_t(i) ≡ Prob[S_t = i | F_t], determines the demands for the assets and, through market-clearing, sets their equilibrium prices.

To price the risky asset, we conjecture a solution of the form P_t = φ_t D_t, where the price-dividend ratio φ_t now depends on the subjective state-belief π_t as well as on the observed dividend growth rate Δd_t. From the first-order condition in Eq. (4), the price-dividend ratio satisfies the equation

    φ(π_t, Δd_t)^θ = E_t[β^θ (D_{t+1}/D_t)^{θ(1−1/ψ)} (φ(π_{t+1}, Δd_{t+1}) + 1)^θ].   (5)

Footnote 7: We do not consider learning about the parameters or structure of the model. The role of learning about parameters, considered by Detemple (1986, 1991), Timmermann (1993, 1996), and Cassano (1999), among others, is asymptotically degenerate, unless the true model changes periodically.
Since the agents use Δd_{t+1} to form the belief π_{t+1}, the two terms in the expectation are not independent and Eq. (5) generally does not have an analytical solution. [8] We solve the model numerically using the projection method of Judd (1992) (see Appendix B for details).

3. Learning

3.1. Bayesian learning

The benchmark case of Bayesian learning works as follows. The agents leave period t − 1 with the information F_{t−1} summarized by the subjective belief π_{t−1}. Once the dividend D_t is observed, the agents use Bayes' theorem to update their beliefs to π_t. The updating is simplified by the fact that the current state S_t has no contemporaneous effect on D_t. As a result, the agents use the newly observed data only to update their beliefs about the state for the previous period, denoted π_{t|t−1}(i) ≡ Prob[S_{t−1} = i | F_t], and then use the transition probabilities p_{ij} to form their beliefs π_t(i) ≡ Prob[S_t = i | F_t] about the current state.

Formally, starting at the end of period t − 1 with the subjective and so-called prior belief π_{t−1}, the agent enters period t and observes the new information D_t, or equivalently Δd_t ≡ d_t − d_{t−1}. From the mean-switching specification in Eq. (1), the probability density function of Δd_t conditional on the information at time t − 1 is

    f(Δd_t | S_{t−1} = i, F_{t−1}) = (1/√(2πσ²)) exp(−(Δd_t − μ(i))²/(2σ²)).   (6)

We define

    ξ_t(i) ≡ f(Δd_t | S_{t−1} = i, F_{t−1}) π_{t−1}(i)   (7)

and let π^B_{t|t−1}(i) denote the updated belief Prob[S_{t−1} = i | F_t] under Bayesian learning. The updating is done optimally through Bayes' rule:

    π^B_{t|t−1}(i) = (Prob[Δd_t | S_{t−1} = i, F_{t−1}] × Prob[S_{t−1} = i | F_{t−1}]) / Prob[Δd_t | F_{t−1}] = ξ_t(i) / Σ_{j=1}^N ξ_t(j).   (8)

Finally, the agents combine the output from the updating step in Eq.
(8) with the transition probabilities p_{ij} to form the Bayesian belief π^B_t(i) ≡ Prob[S_t = i | F_t] about the current state:

    π^B_t(i) = Σ_{j=1}^N p_{ji} × π^B_{t|t−1}(j).   (9)

Footnote 8: In the nested case of Bayesian learning with power utility, the price-dividend ratio is available analytically. Veronesi (2000) provides the solution in a continuous-time model and David and Veronesi (2001) solve the corresponding discrete-time model. Specifically, the price-dividend ratio φ_t is a belief-weighted average of the φ(i) that solve the first-order condition under full information.

3.2. Alternative learning rules

We now turn to alternative learning behavior, which is suboptimal and in some cases even biased relative to the optimal Bayesian learning. Consistent with our representative agent framework, we impose the same suboptimal learning rule uniformly on all agents, or equivalently on the representative agent (see footnote 2).

3.2.1. Near-rational learning

The first suboptimal learning rule we consider is near-rational learning, in which the agents update their beliefs about the hidden state using Bayes' rule but occasionally make mistakes. The mistakes are assumed to be random in such a way that the subjective belief π_t is still conditionally unbiased, meaning that the agents do not deviate from the benchmark case of Bayesian learning on average. Formally, we maintain that

    E[π_t | F_t] = π^B_t,   (10)

where π^B_t is the Bayesian belief about the state. This unbiasedness property distinguishes near-rational learning from the other alternative learning rules, which are biased. We formalize near-rational learning as follows.
Once the agents observe the dividend D_t, they update their prior belief π_{t−1} about the previous state to π_{t|t−1} not through Bayes' rule but instead through a weighted average of Bayes' rule and a random error term:

    π_{t|t−1}(i) = (1 − ω) π^B_{t|t−1}(i) + ω η_t(i),   (11)

where η_t(i) denotes a random error with a state-dependent distribution, the weight ω, which is assumed to be state-independent, takes a value in [0, 1], and π^B_{t|t−1}(i) denotes the Bayesian updating process (as opposed to the Bayesian belief) described in Eq. (8). [9] Given the updated belief about the state in the previous period, the belief about the current state is again formed using the transition probabilities:

    π_t(i) = Σ_{j=1}^N p_{ji} × π_{t|t−1}(j).   (12)

We need to impose more structure on the random noise term η_t to guarantee that the posterior beliefs π_{t|t−1} are valid probabilities and sum to one across states. Specifically, we assume that for a fixed benchmark state i the error η_t(i) follows a Beta distribution with parameters ξ_t(i) and Σ_{j≠i} ξ_t(j), where the vector ξ_t is defined in Eq. (7). Moreover, we assume that the errors are perfectly correlated across all states and that for any state j other than the benchmark state i:

    η_t(j) = (1 − η_t(i)) ξ_t(j) / Σ_{k≠i} ξ_t(k).   (13)

Footnote 9: We distinguish between the Bayesian updating process and the Bayesian belief, which is the outcome of the Bayesian updating process when used in conjunction with the Bayesian belief from the previous period. By acting on the updating process, an error feeds into all future periods because the contaminated belief serves as the prior for the next period. If the error acts directly on the belief, its effects last only one period.

This particular way of distributing the error across the states guarantees that the resulting beliefs π_t satisfy the unbiasedness condition in Eq. (10).
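A minimal sketch of the perturbed update in Eqs. (11)–(13), with the Bayesian ingredients of Eqs. (6)–(8) built in. The four-state parameters, benchmark state, and weight ω are illustrative assumptions, not the paper's calibration; the Monte Carlo at the end checks the unbiasedness condition of Eq. (10):

```python
import numpy as np

def xi_vector(pi_prev, dd, mu, sigma):
    """Eqs. (6)-(7): likelihood of the observed growth dd in each state,
    times the prior belief pi_{t-1}(i)."""
    lik = np.exp(-(dd - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    return lik * pi_prev

def near_rational_update(xi, omega, bench, rng):
    """Eqs. (11)-(13): Bayesian update perturbed by a Beta-distributed error."""
    pi_bayes = xi / xi.sum()                       # Eq. (8)
    rest = xi.sum() - xi[bench]
    eta = np.empty_like(xi)
    eta[bench] = rng.beta(xi[bench], rest)         # eta(i) ~ Beta(xi(i), sum_{j!=i} xi(j))
    other = np.arange(len(xi)) != bench
    eta[other] = (1.0 - eta[bench]) * xi[other] / rest   # Eq. (13)
    return (1.0 - omega) * pi_bayes + omega * eta  # Eq. (11)

# Monte Carlo check of the unbiasedness condition, Eq. (10)
rng = np.random.default_rng(0)
mu, sigma = np.array([-0.02, -0.005, 0.005, 0.02]), 0.01
xi = xi_vector(np.full(4, 0.25), 0.01, mu, sigma)
draws = np.array([near_rational_update(xi, 0.3, 0, rng) for _ in range(20000)])
print(np.abs(draws.mean(axis=0) - xi / xi.sum()).max())  # small: E[pi] matches Bayes
```

A last step, omitted here, would map the updated beliefs through the transition matrix as in Eq. (12); that map is linear, so it preserves the unbiasedness.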
It is also straightforward to verify that π_{t|t−1}(i) ∈ [0, 1] for all states and that the updated beliefs sum to one across states.

3.2.2. Conservatism and representativeness

Conservatism and representativeness are psychologically motivated alternatives to Bayesian learning (Edwards, 1968; Kahneman and Tversky, 1972) that have recently attracted attention in behavioral finance. [11] Conservatism leads individuals to place too much emphasis on old data or the status quo and too little emphasis on recent data or the possibility of change. [12] Representativeness refers to the exact opposite behavior: individuals tend to think relatively short data sequences are representative of the underlying distribution.

To formalize conservatism and representativeness, we assume that the agents update their beliefs π_{t−1} about the state in the previous period to π_{t|t−1} not through Bayes' rule but instead through the following updating rules:

    π_{t|t−1}(i) = (1 − ω) π^B_{t|t−1}(i) + ω π_{t−1}(i)   (14)

for conservatism and

    π_{t|t−1}(i) = (1 − ω) π^B_{t|t−1}(i) + ω f(Δd_t | S_{t−1} = i) / Σ_{j=1}^N f(Δd_t | S_{t−1} = j)   (15)

for representativeness, where ω is a parameter that takes a value in [0, 1]. For conservatism, the updated belief in Eq. (14) is a convex combination of the Bayesian belief and the prior belief. Since Bayesian updating reflects an optimal weighting of the likelihood of the data Δd_t and the prior belief π_{t−1} [Eqs. (7) and (8)], conservative agents place more weight on their prior belief and less weight on the data in the updating process. The parameter ω measures the degree of conservatism. Analogously, the updated belief in Eq. (15) for representativeness is a convex combination of the Bayesian belief and the likelihood of the data. Agents that suffer from representativeness place less weight on the prior belief and more weight on the data than Bayesian agents. Note that both conservatism and representativeness lead to conditionally biased state-beliefs.
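Both updates are one-line tilts of the Bayesian posterior. A sketch with illustrative two-state parameters (the normalizing constant of the Gaussian density cancels in Eqs. (8) and (15), so it is dropped):

```python
import numpy as np

def biased_update(pi_prev, dd, mu, sigma, omega, mode):
    """Eq. (14) (conservatism) and Eq. (15) (representativeness)."""
    lik = np.exp(-(dd - mu) ** 2 / (2 * sigma ** 2))    # f(dd | S_{t-1}=i), up to a constant
    pi_bayes = lik * pi_prev / (lik * pi_prev).sum()    # Eqs. (7)-(8)
    if mode == "conservatism":
        target = pi_prev                                # over-weight the prior, Eq. (14)
    else:
        target = lik / lik.sum()                        # over-weight the data, Eq. (15)
    return (1 - omega) * pi_bayes + omega * target

mu, sigma = np.array([-0.02, 0.02]), 0.01
prior = np.array([0.8, 0.2])                            # agent leans toward the bad state
good_news = 0.02                                        # growth typical of the good state
bayes = biased_update(prior, good_news, mu, sigma, 0.0, "conservatism")
cons  = biased_update(prior, good_news, mu, sigma, 0.5, "conservatism")
rep   = biased_update(prior, good_news, mu, sigma, 0.5, "representativeness")
print(bayes[1], cons[1], rep[1])   # conservative < Bayesian <= representative
```

With these numbers, the conservative agent reacts far less to the good news than the Bayesian agent, while the representativeness agent reacts at least as strongly.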
Footnote 10: Recall that if y follows a Beta distribution with parameters {α, β}, then E[y] = α/(α + β) and Var[y] = αβ/((α + β + 1)(α + β)²). It then follows that E[η_t(i) | F_t] = ξ_t(i) / Σ_{j=1}^N ξ_t(j).

Footnote 11: See, for example, DeBondt and Thaler (1985), Lakonishok et al. (1994), Barberis et al. (1998), and Brav and Heaton (2002).

Footnote 12: Conservatism can also be interpreted as 'overconfidence' in one's ability to learn. Overconfidence typically refers to individuals placing too much weight on their private information relative to the public information. Although there is no real distinction between public and private information in our model, the dividend realizations are obviously public while the prior beliefs are arguably private. Conservative agents place too much weight on their prior belief because they irrationally overrate its accuracy. It therefore appears that conservative agents are overconfident in their ability to learn.

3.2.3. Optimism and pessimism

Optimistic agents systematically bias their beliefs toward good states, and pessimistic agents tend to think the economy is in bad states. [13] We define good states to be states with μ(i) > μ̄, where μ̄ denotes the unconditional median dividend growth rate, and bad states to be states with μ(i) < μ̄. We order the state index i such that states with larger indices correspond to higher conditional mean dividend growth rates, which means that good states correspond to the indices i > N/2 and bad states to i < N/2. [14]

To capture the notion of optimism, we remove mass of the Bayesian posterior beliefs from the bad states, in proportion to the conditional probabilities of being in each of the bad states, and then distribute this mass, again proportionally, across the good states. Formally, we define the optimistic beliefs as

    π_t(i) = (1 − ω) π^B_t(i) + ω π^B_t(i) / Σ_{j>N/2} π^B_t(j)   for i > N/2 (good states),
    π_t(i) = (1 − ω) π^B_t(i)                                      for i < N/2 (bad states),   (16)

where ω, which takes a value in [0, 1], measures the degree of optimism. For pessimism, we remove mass from the good states and distribute it proportionally across the bad states:

    π_t(i) = (1 − ω) π^B_t(i)                                      for i > N/2 (good states),
    π_t(i) = (1 − ω) π^B_t(i) + ω π^B_t(i) / Σ_{j<N/2} π^B_t(j)   for i < N/2 (bad states),   (17)

where, in this case, ω measures the degree of pessimism.

3.3. Ignorant versus conscious learning

There are two ways to introduce alternative learning into an otherwise REE model. The first is to assume that the agents follow a suboptimal learning rule but think that they learn optimally. We refer to these agents as ignorant because they are unaware of their limitations. With irrational and ignorant agents, the assets are priced by the same price-dividend ratio φ(π_t) as with rational agents. Therefore, the only differences between the two versions of the model are the realizations and the evolution of the state-beliefs. Conditional on the same belief realization π_t = π^B_t, they are identical. The second way to introduce alternative learning is to assume that the agents knowingly follow a suboptimal learning rule and incorporate this fact into asset prices by using a different price-dividend ratio function to compensate for or hedge against their learning mistakes. We refer to these agents as conscious.

Footnote 13: On one hand, psychological studies find that people tend to be optimistic about their future prospects (Weinstein, 1980; Kunda, 1987) and that, perhaps somewhat counter-intuitively, optimism is more pronounced among more intelligent people (Klaczynski and Fauth, 1996). On the other hand, Cecchetti et al. (2000) and Abel (2002) show that pessimistic behavior can help explain the equity premium and risk-free rate puzzles, respectively.

Footnote 14: The classification of states according to the median growth rate is notationally convenient. Alternatively, we could use the mean growth rate, but the two approaches are equivalent for our quantitative results.

To intuitively understand how irrational but conscious agents can partially correct their learning mistakes through the price-dividend ratio function, consider an economy with two states S_t ∈ {0, 1} and a pessimistic agent with state-beliefs that are unconditionally biased, E[π_t] = 0.9 E[π^B_t]. Furthermore, assume that the price-dividend ratio for the Bayesian agents is φ^B(π_t) = 100 + 50 π^B_t. In this example, irrational and ignorant agents underprice the asset (relative to the dividends) by an average of E[φ_t] − E[φ^B_t] = −5 E[π^B_t]. Irrational but conscious agents, in contrast, recognize that with their particular learning mistakes a price-dividend ratio of φ(π_t) = 100 + 55.56 π_t results in unconditionally unbiased valuations. However, unless the belief of the irrational agents is proportional to that of the rational agents in all states, which is not the case with the alternative learning rules described above, conscious agents cannot fully correct their mistakes through the price-dividend ratio. Intuitively, they can only compensate for systematic biases through their own price-dividend ratio function. [15]

Technically, conscious agents solve for the function φ(π_t, Δd_t) that satisfies the first-order condition in Eq. (5) when the conditional expectations are taken with respect to the dynamics of their suboptimal state-beliefs. Ignorant agents, in contrast, use the price-dividend ratio function of the Bayesian agents, which solves the first-order condition when the conditional expectations are taken with respect to the dynamics of the Bayesian beliefs.

Perhaps the most intuitive reason for considering conscious agents is internal consistency.
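The proportional reallocation of mass in Eqs. (16)–(17) can be sketched as follows. The four-state beliefs and the degree ω are illustrative; N is assumed even, as in the paper's four-state setup, so no state sits exactly at the median:

```python
import numpy as np

def tilt_beliefs(pi_b, omega, direction):
    """Eq. (16) (optimism) / Eq. (17) (pessimism): remove a fraction omega of the
    Bayesian posterior mass and redistribute it proportionally across the good
    (upper half) or bad (lower half) states."""
    n = len(pi_b)
    good = np.arange(n) >= n // 2          # states are ordered by conditional mean growth
    pi = (1.0 - omega) * pi_b
    boost = good if direction == "optimism" else ~good
    pi[boost] += omega * pi_b[boost] / pi_b[boost].sum()
    return pi

pi_b = np.array([0.1, 0.2, 0.3, 0.4])      # Bayesian beliefs pi_t^B (illustrative)
opt = tilt_beliefs(pi_b, 0.2, "optimism")
pes = tilt_beliefs(pi_b, 0.2, "pessimism")
print(opt, pes)  # both sum to one; optimism shifts mass to the two good states
```

Because the removed mass is redistributed in proportion to the Bayesian posterior, the tilted beliefs remain valid probabilities for any ω in [0, 1].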
Since the agents in our model are infinitely lived, it is difficult to imagine that they employ a set of learning and pricing rules that consistently misprices the asset relative to the historical dividend realizations. At the same time, psychologists argue convincingly that it is difficult for naturally pessimistic individuals, for example, not to be pessimistic in their probability assessments. One can interpret conscious agents as adjusting their pricing rule to be at least partially consistent with the data.

One way to justify why conscious agents adjust their pricing rule to be internally consistent, rather than correct their learning behavior, is from a costs versus benefits perspective. As Simon (1955), Marschak (1968), and Einhorn (1970, 1971) suggest, computational costs are an important consideration in deciding whether to act rationally or according to a behavioral heuristic (see also Payne et al., 1990). It is arguably less costly for agents to adjust their pricing rule once than to correct and monitor each period (for an infinite number of periods) their natural tendency to be pessimistic. As long as the expected utility loss from being irrational but conscious does not exceed the costs of not being pessimistic, it can therefore be optimal (and rational) to be irrational but conscious.

Footnote 15: This intuition is not quite correct due to the non-linearities in the first-order condition (5). Unless θ = 1 (power utility), conscious agents may also be able to partially correct for too much or too little variation in their state-beliefs. The price-dividend ratio can also differ due to a correlation between dividend growth realizations and the learning errors, which generates positive or negative hedging demands for the asset (e.g., Merton, 1969). However, the first-order corrections are for systematic biases in the belief.
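The two-state pessimism example of Section 3.3 can be checked numerically. The uniform law for the Bayesian beliefs and the multiplicative noise are illustrative assumptions; all the example requires is E[π_t] = 0.9 E[π^B_t]:

```python
import numpy as np

rng = np.random.default_rng(0)

pi_b = rng.uniform(0.0, 1.0, 100_000)         # Bayesian beliefs pi_t^B (illustrative law)
pi = pi_b * rng.uniform(0.8, 1.0, pi_b.size)  # pessimistic beliefs: E[pi_t] = 0.9 E[pi_t^B]

phi_bayes     = 100 + 50 * pi_b               # Bayesian pricing rule phi^B
phi_ignorant  = 100 + 50 * pi                 # ignorant: Bayesian rule, distorted beliefs
phi_conscious = 100 + (50 / 0.9) * pi         # conscious: slope 55.56 undoes the mean bias

print(np.mean(phi_ignorant - phi_bayes))      # about -5 * E[pi^B]: average underpricing
print(np.mean(phi_conscious - phi_bayes))     # about 0: unbiased on average
```

Note that the conscious correction is exact only on average: state by state, phi_conscious still differs from phi_bayes, which is the partial-correction point the example is meant to illustrate.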