Unshrouding: Evidence from Bank Overdrafts in Turkey

Alan, S., Cemalcilar, M., Karlan, D., & Zinman, J. (2018). Unshrouding: Evidence from Bank Overdrafts in Turkey. The Journal of Finance, 73(2), 481-522. pdf (via Ruben Cox).

Abstract

 

Lower prices produce higher demand… or do they? A bank’s direct marketing to holders of “free” checking accounts shows that a large discount on 60% APR overdrafts reduces overdraft usage, especially when bundled with a discount on debit card or auto-debit transactions. In contrast, messages mentioning overdraft availability without mentioning price increase usage. Neither change persists long after messages stop. These results do not square easily with classical models of consumer choice and firm competition. Instead they support behavioral models where consumers both underestimate and are inattentive to overdraft costs, and firms respond by shrouding overdraft prices in equilibrium.

Why would a bank hide information on overdraft costs? After all, a classically rational consumer would simply infer that shrouded prices are high prices. But recent behavioral theories show that shrouded and high prices can persist if consumers tend to underestimate their add-on costs and firms cannot profit from de-biasing consumers with more transparent pricing or information about competitors’ high add-on prices (Gabaix and Laibson 2006; Grubb 2015; Heidhues, Köszegi, and Murooka 2017).

What drives overdraft pricing, advertising, and usage?

Retail banking in much of the world: “free if nonnegative balance, very expensive if in overdraft”.

 

Experimental setup

The experiment varies the promotions that Yapi Kredi sent via SMS from September to December 2012 to 108,000 existing checking account clients who had not overdrafted in the previous few months.

                                     overdraft discount
                                     yes     no
no bundle                            A       C
with debit card/auto bill payment    B       D

H1: The key comparison is between overdraft-promoting messages that mention price (A & B) and those that do not (C & D). Our key hypothesis is that drawing attention to the cost of overdrafting will depress demand for it. We test this hypothesis by comparing overdraft usage, during the experimental period, between customers sent an Overdraft Interest Discount message and customers sent an Overdraft Availability message (A vs C). (Debit card and auto bill payment were separate conditions, which I grouped in the matrix.)

H2: A second behavioral hypothesis is that bundling the overdraft discount with a discount on debit card or automated bill-payment usage will further depress demand (A vs B).

H3: Third, we hypothesize that promoting overdraft availability, without mentioning price, will change demand. We test whether and how promoting overdraft availability changes demand by comparing overdraft usage among customers sent the Overdraft Availability message to customers sent messages that promote only debit card or automated bill payment usage and do not mention overdraft at all. (C vs D)

H4: To test the dynamics of attention and overdraft behavior, we examine data from the post-campaign period (January-May 2013). Treatment effects will persist if consumer learning or attention re: add-ons is durable. Treatment effects will not persist if consumers quickly forget about add-ons or only attend to them when induced to by external stimuli like advertising.

Results

We find that mentioning overdraft price lowers overdraft demand. E.g., the likelihood of overdrafting during the experiment is 1.2 percentage points lower (se=0.4 pp) for those receiving the discount offer relative to those receiving a message that mentions overdraft without mentioning its price, on a baseline likelihood of 31%. (H1 supported: C overdrafts more than A.) These results support the hypothesis that drawing attention to overdraft costs reduced demand (even while offering a 50% discount!).

Offering the overdraft discount alone reduces overdraft likelihood by only 0.7pp (se=0.5) relative to messages that mention overdraft without mentioning price, while the bundled reductions are 1.4pp for automated bill payment (se=0.5) and 1.7pp for debit card (se=0.5). (H2: A overdrafts more than B.) Do All Promotional Discounts Backfire? No. Offering the debit discount alone (cell D) weakly increases debit card usage, by 0.5pp (se=0.4), and the auto-pay discount alone increases auto-pay signup (by 0.4pp, se=0.1).

The overdraft availability message increases overdraft likelihood, by about 0.9 pp (se=0.4). (H3: overdraft C > overdraft D.)
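To get a feel for how strong these effects are, the reported point estimates and standard errors can be turned into rough z-statistics and 95% intervals. This is only a back-of-the-envelope sketch: the normal approximation and the 1.96 multiplier are my assumptions, not the paper's reported inference, and the labels are my own shorthand for the comparisons above.

```python
# Effects as quoted in the notes above: (label, estimate in pp, standard error in pp).
effects = [
    ("discount vs. availability message (H1)", -1.2, 0.4),
    ("discount alone vs. availability message", -0.7, 0.5),
    ("discount + auto bill payment vs. availability message", -1.4, 0.5),
    ("discount + debit card vs. availability message", -1.7, 0.5),
    ("availability message vs. no overdraft mention (H3)", 0.9, 0.4),
]

BASELINE_PP = 31.0  # baseline overdraft likelihood quoted for the H1 comparison


def summarize(estimate, se):
    """Return the z-statistic and a normal-approximation 95% interval."""
    z = estimate / se
    interval = (estimate - 1.96 * se, estimate + 1.96 * se)
    return z, interval


for label, est, se in effects:
    z, (low, high) = summarize(est, se)
    print(f"{label}: {est:+.1f} pp, z = {z:+.2f}, 95% CI [{low:+.2f}, {high:+.2f}]")

# Relative size of the H1 effect compared with the 31% baseline.
relative_h1 = -1.2 / BASELINE_PP
print(f"H1 effect relative to baseline: {relative_h1:.1%}")
```

Under these assumptions the H1 effect sits about three standard errors from zero, while the interval for the discount-alone effect spans zero, which fits the reading that the bundled offers drive most of the reduction.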

Ad H4: Treatment effects do not persist: the lack of persistent effects suggests that consumer learning and/or attention concerning overdrafts depreciates quickly, and hence that advertising and de-biasing campaigns must persist to be effective.

One message is not enough to generate an effect, and repeating messaging does influence demand, with diminishing marginal effects from messaging every 10 vs. 20 days.

 

Shrouded equilibrium

Altogether our results are consistent with the models of shrouded equilibrium and limited/reactive consumer attention. In particular, they support:

  1. the key modeling assumption that consumers tend to underestimate add-on costs (if consumers’ estimates were unbiased then offering a discount would weakly increase demand);
  2. the key assumption that firms lack incentives to unshroud prices;
  3. a key prediction of reactive attention models that consumers respond differently when advertising highlights different add-on attributes (price or availability).

Policy implications

Our results should give pause to third parties seeking to improve overdraft markets with messages (like social marketing campaigns) that draw attention to overdraft costs. To fix ideas, imagine messaging around the theme of “Beware of big overdraft fees!”, delivered by an entity that might actually benefit from unshrouding; e.g., a regulator, a firm with social objectives or a product-differentiation strategy, or a personal financial management service.

Our results also suggest that unshrouding could be quite costly to sustain, since its effects do not persist. Moreover, our results suggest that incumbent suppliers could effectively counter unshrouding campaigns by advertising non-price attributes (like availability/credit lines in our case). Hence we are sympathetic to Heidhues, Köszegi, and Murooka’s conjecture that third-parties, or deviating firms, will be outgunned in a messaging arms race with incumbent add-on suppliers.

I should probably read more on Heidhues & Köszegi to understand that.

Ethics of Experimenting with High-Cost Credit

We are frequently asked in seminars whether researchers should partner with a lender that is seeking to sell more high-interest rate loans. We think yes, for four key reasons:

  1. First, an ethical concern here presumes that high-cost consumer credit harms consumers. We emphasize the presumption; extensive research on this question suggests that a different assumption is warranted– (weakly) beneficial impacts for consumers (Karlan and Zinman 2010; Zinman 2014; Banerjee, Karlan, and Zinman 2015).
  2. Second, YK’s advertising was truthful and its terms were competitive. Thus, combining the first and second points, the experiment was not trying to convince consumers to accept a bad deal in either absolute terms or compared to market alternatives.
  3. Third, YK was going to promote overdraft usage among its existing customers with or without the participation of the research team; we helped convince bank management to feature prices despite its skepticism about the effectiveness of past overdraft price promotions.
  4. Fourth, YK and the research team contracted ex-ante that the academic co-authors would have unrestricted intellectual freedom to report the results and disseminate them publicly to benefit regulators and further scientific knowledge.

Anchoring in payment: Evaluating a judgmental heuristic in field experimental settings

Jung, Minah H., Hannah Perfecto, and Leif D. Nelson. “Anchoring in payment: Evaluating a judgmental heuristic in field experimental settings.” Journal of Marketing Research 53.3 (2016): 354-368. [suggested by Job van Wolferen]

Executive summary Key Takeaways:

  • Anchors exert more influence in hypothetical payment settings than in the field.
  • The size of the subjective gap between anchors matters more than the objective gap.
  • Low anchors exert more influence than comparable high anchors.

 

Abstract

Anchoring, the biasing of estimates toward a previously considered value, is a long-standing and oft-studied phenomenon in consumer research. However, most anchoring work has been in the lab, and the results from field work have been mixed. Here, the authors use real transactions from an empirically investigated and commercially-employed pricing scheme (“pay what you want”) to better understand how anchors influence payments.

Sixteen field studies (N = 21,997) and four hypothetical studies (N = 3,174) reveal four main points:

(1) Although anchoring replicates both with and without financial consequences (Studies 1–2), the percentile rank gap between anchors in the distribution of payments is a much stronger predictor of anchoring emerging than merely the absolute gap between the anchors on a number line (Studies 3–5).

(2) Low anchors influence payments more than high anchors (Studies 6a–b).

(3) Findings from the literature that should enhance anchoring effects—anchor precision, descriptive and injunctive norms, nonsuggestions—yield null results in payment (Studies 7–13).

(4) The above patterns do not emerge in hypothetical settings (Studies 14a–d), in which anchoring is as big and reliable as the literature has previously suggested.

The persevering reader will see our general inferences (effects of anchors):

distributional gap > absolute gap. First, although anchoring researchers agree that larger anchor gaps generate larger effects, we consider these gaps in terms of both the absolute gap (the numeric difference between the anchor values) and the distributional gap (the difference in the anchors’ percentile ranks in the distribution of payments). We find that the latter [distributional gap] is a much better predictor of anchoring effects than the former [absolute gap]. (…) Not all anchor gaps lead to anchoring effects. (…) Past research on anchoring has generally been indifferent to the type of anchor gap. (…) the distributional gap between the anchors must be quite wide to elicit significant effects [studies 4-6]

low > high. Second, with extreme anchors, low anchors pull down payments more than high anchors inflate them.

lab > field. Finally, when these exact paradigms are taken back into the lab (where no money leaves the participant’s wallet), the range of payments widens in the distribution: as a result, previously inert extremely high anchors now appear reasonable and become influential. (…) [Anchoring in the field] is far less reliable than in the lab, but it is hardly absent either.

 

Pay what you want (PWYW)

Experimental set-up

[Image: donuts]

[Image: jungmuseum]

General discussion

Detailed investigation into the operation of a core judgment process (anchoring) in a meaningful real-life setting (payment).

Published literature makes anchoring effects look unrealistically large and easy to find.

Despite conscientious efforts to find a reliable anchoring paradigm, it still took many attempts (and tens of thousands of participants) before we had some sense of the variables that mattered:

  • The Lab Versus the Field: real payments are less sensitive to anchors than are hypothetical payments
  • Asymmetry of Magnitude Perception: low anchor licenses a low payment. High anchors lack such influence, and could even motivate some reactance.
  • Size of Anchor Gap: that anchor gap needs to be considered in terms of the anchors’ places in the distribution (i.e., their percentile ranks across all payments), rather than in terms of their absolute magnitudes. For every anchor gap smaller than 50 percentile points, we observed nonsignificant effects and very small effect sizes.
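The difference between the absolute gap and the distributional gap is easy to see in a small sketch. The payments below are made up for illustration (they are not data from the paper), and `percentile_rank` and `anchor_gaps` are hypothetical helper names; the percentile rank is computed here as the share of payments at or below the anchor, which is one of several possible definitions.

```python
from bisect import bisect_right


def percentile_rank(payments, value):
    """Share (0-100) of observed payments that are at or below `value`."""
    ordered = sorted(payments)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def anchor_gaps(payments, low_anchor, high_anchor):
    """Return (absolute gap, distributional gap in percentile points)."""
    absolute = high_anchor - low_anchor
    distributional = (percentile_rank(payments, high_anchor)
                      - percentile_rank(payments, low_anchor))
    return absolute, distributional


# Hypothetical pay-what-you-want payments in dollars (illustrative only).
payments = [0.25, 0.50, 0.50, 1.00, 1.00, 1.00, 1.50, 2.00, 2.00, 5.00]

# Small absolute gap, large distributional gap.
print(anchor_gaps(payments, 0.50, 2.00))   # (1.5, 60.0)

# Large absolute gap, no distributional gap: both anchors sit at the top.
print(anchor_gaps(payments, 5.00, 10.00))  # (5.0, 0.0)
```

In this toy distribution the first pair of anchors would clear the 50-percentile-point threshold mentioned above, while the second pair, despite being much further apart on the number line, would not.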

These three explanations (real vs. hypothetical payments, asymmetries of influence for low and high anchors, and the distributional anchor gap) don’t rule out the operation of a more mundane concern with simple noisiness of measurement in the field. (…) Defaults, although seemingly arbitrary (i.e., lacking explicit justification), are often perceived as recommendations (McKenzie, Liersch, and Finklestein 2006).

Anchoring effects are extraordinarily robust and replicable when studied in the lab, but they can become more subtle and fragile when taken into the field, especially into a monetary domain. The gap between anchors needs to be wide enough (wider than previously believed) to elicit a difference, but perhaps not so wide that the high anchors become extreme and less influential. There is anchoring in payment, but, despite the large literature from the lab, more work is still needed to fully understand it.

 

 

Proportional and effective supervision (DNB)

Yesterday (May 31st), the Dutch Central Bank (DNB) published a report: Proportional and effective supervision.

The accompanying Dutch press release: Effectiviteit van toezicht gebaat bij goede maatvoering [Effectiveness of supervision benefits from good proportionality]. With the sound bite: ‘Simpeler, maar niet soepeler’ [simpler, not more lenient], says Jan Sijbrand, Director of Supervision at De Nederlandsche Bank (DNB).

I summarized some interesting bits in tweets:

‘Objective’ compliance costs

Perceived burden

Survey participants

The Application of Behavioural Insights to Financial Literacy and Investor Education Programmes and Initiatives (OECD & IOSCO)

Approaches

Based on the literature review and survey responses, C8 and the OECD/INFE developed a set of approaches considered to be effective for regulators, policy-makers, and other organisations and practitioners that are considering whether or how to apply insights from behavioural sciences to investor and financial education programmes and initiatives. These approaches are, in summary, to:

  • Establish a concrete understanding of the problem
  • Design the intervention taking the context into account (& mind the intention-behaviour gap)
  • Start small: perform small-scale field tests to gather feedback and make adjustments
  • Evaluate rigorously (Ideally experimentally, e.g., randomised control trials)
  • Interact, learn, and keep track
  • Create thought leadership (“This document is intended to add to the thought leadership” [see Ritson])
  • Consider combining traditional approaches and those based on behavioural insights (combine behavioural insights and cognitive-based approaches)
  • Review programs and initiatives regularly

 

Literature review

The review focused on strategies to mitigate or eliminate the effects of behavioural biases (i.e., debiasing). Debiasing strategies can be classified according to the object of intervention:

  • the investor
  • the decision environment
    • placing (economic or non-economic) incentives
      Policy-makers can also make use of non-economic incentives, including introducing accountability, providing new information, and conveying social norms to decision-makers. Requiring disclosure of more information is a typical stimulus provided by regulation
    • altering the choice context

Other applications of behavioural insights to investor education and financial literacy [3.2]: rules of thumb, mass media, visual tools, counselling and financial coaching.

Offering financial literacy or investor education at “teachable moments” helps to ensure relevance and prompt action. A recent meta-regression analysis considers this a key factor to success, as lack of motivation explains low financial literacy. Another meta-analysis posits that the effects of financial literacy initiatives decay over time and suggests giving “just-in-time” education (Fernandes et al).

 

Six frameworks

This report describes the six behavioural frameworks that are most currently used by academics, economists and others to design behaviour change interventions:

  • COM-B: Capability Opportunity Motivation – Behaviour
  • Behaviour Change Wheel (BCW)
  • MINDSPACE
  • EAST: Easy Attractive Social Timely
  • TEST: Target, Explore, Solution, and Trial
  • CREATE: Cue Reaction Evaluation Ability Timing Execute action

[Image: mindspace]

 

Gamification

Companies have been increasingly adopting games to strengthen customer relationships and increase employee participation in training. In the financial literacy field, gamification is a promising tool to engage learners and drive good financial habits, such as saving. Games can be distributed online and played without teachers, so dissemination cost is low. As well-designed gaming tools can provide an entertaining and encouraging frame within which financial concepts and behaviours can be tested and experienced, they are potentially capable of engaging learners for long periods and enhancing financial self-efficacy. Personal finance apps can utilise gamification elements (e.g., badges, challenges, quizzes) that take advantage of behavioural biases, such as loss aversion and mental accounting, to stimulate savings.

Gamification potentially also allows organisations to collect data from learners’ game movements and develop a set of analytics on their decision-making and performance. These insights can be applied to further game improvement and new educational materials, as well as used to inform supporting classroom activities. Sophisticated games could be used to provide real-time information or offer appropriate financial products to encourage players to action, but such additions should take into account the need to clearly differentiate education and marketing.

Rigorous evaluation of such programmes is still unusual: although a wide range of financial literacy games is freely available, most impact results are reported in the form of absolute quantity of engaged players or usage time, with no use of experiments or data analysis to understand the effect of such games on future behaviour. (p.30)

 

Misc

Effective financial literacy and investor education initiatives contain elements to access both mental systems. They deliver information and education to raise programme participants’ awareness and motivate them to change their behaviour consciously (System 2). Such initiatives also create contexts and environments to induce individuals to behave in ways they consider to be in their best interests by accessing their System 1 (e.g., nudging).

It is important to note that the RCT methodology was not developed by behavioural economists, but its application to studies of behavioural insights can be considered as an innovation in social policy evaluation. (footnote 261)

 

How to become a Bayesian in eight easy steps: An annotated reading list

I finally got to reading How to become a Bayesian in eight easy steps: An annotated reading list by Alexander Etz, Quentin Gronau, Fabian Dablander, Peter Edelsbrunner & Beth Baribault.

Saw this article first in this tweet from August 2017:

In April 2018 it was published in a special issue:

Abstract

In this guide, we present a reading list to serve as a concise introduction to Bayesian data analysis. The introduction is geared toward reviewers, editors, and interested researchers who are new to Bayesian statistics. We provide commentary for eight recommended sources, which together cover the theoretical and practical cornerstones of Bayesian statistics in psychology and related sciences. The resources are presented in an incremental order, starting with theoretical foundations and moving on to applied issues. In addition, we outline an additional 32 articles and books that can be consulted to gain background knowledge about various theoretical specifics and Bayesian approaches to frequently used models. Our goal is to offer researchers a starting point for understanding the core tenets of Bayesian analysis, while requiring a low level of time commitment. After consulting our guide, the reader should understand how and why Bayesian methods work, and feel able to evaluate their use in the behavioral and social sciences.

The 40 sources are presented in an awesome graph (1-8 in bold are the recommended sources, 1-4 are theoretical sources, 5-8 applied sources):

[Figure: BayesEtz]

Theoretical sources

[1] What is Bayesian inference? Key takeaways:

  • The Bayesian approach depends only on the observed data, so the results are interpretable regardless of whether the sampling plan was rigid or flexible or even known at all.
  • The Bayesian approach is inherently comparative: hypotheses are tested against one another and never in isolation.
  • Since the posterior probability that the null is true will often be higher than the p-value, p-values will discount null hypotheses more easily in general.

[2] Bayesian credibility assessments

  • explains the fundamental Bayesian principle of reallocation of probability, or “credibility,” across possible states of nature.
  • Sequential updating from prior to posterior as data are collected. “Today’s posterior is tomorrow’s prior”
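“Today’s posterior is tomorrow’s prior” can be made concrete with a conjugate example. The sketch below assumes a Beta prior on a success probability with binomial data, which is my choice of illustration rather than an example taken from the reading list; the batch counts are made up and `update_beta` is a hypothetical helper name.

```python
def update_beta(prior, successes, failures):
    """Conjugate Beta-binomial update: add the observed counts to the prior."""
    a, b = prior
    return (a + successes, b + failures)


def beta_mean(params):
    """Mean of a Beta(a, b) distribution."""
    a, b = params
    return a / (a + b)


prior = (1, 1)                     # flat Beta(1, 1) prior
batches = [(7, 3), (4, 6), (9, 1)]  # hypothetical (successes, failures) per batch

# Update batch by batch: each posterior becomes the prior for the next batch.
posterior = prior
for successes, failures in batches:
    posterior = update_beta(posterior, successes, failures)
    print(posterior, round(beta_mean(posterior), 3))

# Updating once with all the data pooled gives the same posterior.
total_successes = sum(s for s, _ in batches)
total_failures = sum(f for _, f in batches)
all_at_once = update_beta(prior, total_successes, total_failures)
print(posterior == all_at_once)
```

The final comparison shows why the order and grouping of the data do not matter here: sequential and pooled updating land on the same Beta distribution.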

[3] Implications of Bayesian statistics for experimental psychology

  • The probabilities of data given theory and of theory given data. Frequentist statistics only allow for statements to be made about P(data|theory)
  • Bayesian approach is more liberating than the frequentist approach with regard to:
    • stopping rule; with Bayesian inference, one is allowed to continue or stop collecting participants at any time while maintaining the validity of one’s results
    • planned versus post hoc comparisons; in the frequentist approach it matters whether the hypothesis was formulated before or after data collection, in the Bayesian approach it does not
    • multiple testing; when conducting multiple tests in the classical approach, it is important to correct for the number of tests performed. Within the Bayesian approach, the number of hypotheses tested does not matter. But note: “cherry picking is wrong on all statistical approaches”
  • there are two main schools of Bayesian thought: default (or objective) Bayes and context-dependent (or subjective) Bayes.

[4] Structure and motivation of Bayes factors

  • In classical statistics this is generally not possible as significance tests are asymmetric; they can only serve to reject the null hypothesis and never to affirm it. One benefit of Bayesian analysis is that inference is perfectly symmetric, meaning evidence can be obtained that favors the null hypothesis as well as the alternative hypothesis.
  • The Bayes factor (BF) is a fundamental measure of relative evidence. If the result of a study is BF01 = 10, then the data are ten times more probable under H0 than under H1.
  • Various benchmarks have been suggested to help researchers interpret Bayes factors, with values between 1 and 3, between 3 and 10, and greater than 10 generally taken to indicate inconclusive, weak, and strong evidence, respectively. But: the difference between a Bayes factor of, say, 8 and 12 is more a difference of degree than of category
  • To evaluate which model is better supported by the data, we need to find out which model has done the best job predicting the data we observe
  • Bayesians specify a range of plausible values that the parameter might take under the alternative hypothesis (prior distribution)
  • The default Bayesian (see [3]) tries to specify prior distributions that convey little information while maintaining certain desirable properties. Context-dependent prior distributions are often used because they more accurately encode our prior information
  • A Cauchy distribution is now a common default prior on the alternative hypothesis, giving rise to what is now called the default Bayes factor. The suggested scale is 1, which implies that the effect size has a prior probability of 50% to be between d=−1 and d=1.
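Two of the points above can be checked with a few lines of code. The sketch first verifies the 50% claim for a Cauchy prior with scale 1, and then computes a Bayes factor for a deliberately simple case. Note the swap: instead of the default t-test Bayes factor discussed in [4], it uses a binomial test of a point null (θ = 0.5) against a uniform prior on θ, because that has a closed form. The function names and the example counts are my own, and the `strength` labels simply encode the benchmarks quoted above.

```python
import math


def cauchy_cdf(x, scale=1.0):
    """CDF of a Cauchy distribution centred on zero."""
    return 0.5 + math.atan(x / scale) / math.pi


def prior_mass_between(low, high, scale=1.0):
    """Prior probability that the effect size lies between `low` and `high`."""
    return cauchy_cdf(high, scale) - cauchy_cdf(low, scale)


def bf01_binomial(k, n):
    """BF01 for k successes in n trials: H0 theta = 0.5 vs H1 theta ~ Uniform(0, 1)."""
    p_data_h0 = math.comb(n, k) * 0.5 ** n
    p_data_h1 = 1.0 / (n + 1)  # marginal likelihood under the uniform prior
    return p_data_h0 / p_data_h1


def strength(bf):
    """Label evidence using the 1-3 / 3-10 / >10 benchmarks, in either direction."""
    ratio = max(bf, 1.0 / bf)
    if ratio < 3:
        return "inconclusive"
    if ratio <= 10:
        return "weak"
    return "strong"


print(round(prior_mass_between(-1, 1), 3))  # prior mass on -1 < d < 1

for k in (10, 18):
    bf01 = bf01_binomial(k, 20)
    favoured = "H0" if bf01 > 1 else "H1"
    print(f"k={k}, n=20: BF01 = {bf01:.3f}, {strength(bf01)} evidence for {favoured}")
```

With 10 successes in 20 trials the data favour the null (BF01 of roughly 3.7), which illustrates the symmetry point: a Bayes factor can express evidence for H0, not just against it.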

Applied sources

“We believe that the existence of divisions speaks to the intellectual vibrancy of the field and its practitioners.”

[5] How do we select a model that both fits the data well and generalizes adequately to new data? Trade-off between goodness-of-fit and parsimony

[6] The choice of priors and how those choices influence the posterior estimates for parameters of interest. When testing hypotheses in the Bayesian framework one should calculate a model comparison metric.

[7] In contrast to classical statistics, Bayesian statistics allows one to formalize and use this prior knowledge for analysis. What possibilities are there to formalize and uncover prior knowledge? Excellent overview of why and how one can specify prior distributions for cognitive models.

[8] Bayesian cognitive modeling

Best of the rest

I always favor easy & applied, so the lower right-hand corner of the graph. And [15] does not disappoint:

Wagenmakers, Morey, and Lee (2016): Bayesian Benefits for the Pragmatic Researcher. Applied focus (9), low difficulty (1).

Provides pragmatic arguments for the use of Bayesian inference with two examples featuring fictional characters Eric Cartman and Adam Sandler. This paper is clear, witty, and persuasive.

The (free, statistical) JASP package ties into this article very well, see https://jasp-stats.org/

What does Behavioural Economics mean for Competition Policy?

What does Behavioural Economics mean for Competition Policy? Office of Fair Trading, March 2010. Matthew Bennett, John Fingleton, Amelia Fletcher, Liz Hurley & David Ruck

Abstract

This paper looks at whether behavioural economics fundamentally changes our understanding of competition policy. We argue that behavioural economics is an important incremental advance in our understanding, just as informational economics was before it. But this does not mean that all previous economic models of competition and markets are now irrelevant. For the most part, they still provide valid and valuable insights. Importantly, behavioural economics does not question our belief in competition policy as a tool for making markets work well for consumers.

Nevertheless, the existence of behavioural biases does have a number of implications for the way in which markets work. Behavioural biases on the consumer side emphasize the importance of the demand side in making markets work well, and the important synergies between consumer policy and competition policy. Behavioural biases may also have implications for anti-competitive behaviour. In spite of this, behavioural economics does not necessarily imply more intervention. Markets can often solve their own problems and even where they can’t, there are dangers inherent in over-paternalism limiting consumer choice. Behavioural economics also emphasizes the difficulties that authorities can have in trying to correct for such biases.

Homo Sapiens exhibits systematic biases in the way he views both the world and markets. (…) are there ways in which behavioural biases might lead to systematic biases in the models of markets and competition on which we have been relying?

Behavioural economics is no fundamental shift, because/but:

  1. Behavioural economics does not mean that all previous economic models are negated
  2. Both competition policy and demand-side intervention are crucial tools for making markets work well for consumers
  3. The market may find its own solutions to any problems, but we can not blindly assume the market will solve everything
  4. Competition (or consumer) authorities can face difficulties in trying to correct for such biases. E.g., behavioural economics tells us that simply providing more information may not be a good solution when consumers have problems assessing such information

Ad #4: It is well documented that consumers do not always read and understand the information provided to them. For example, see Warning: Too much information can harm (November 2007), a final report by the Better Regulation Executive and National Consumer Council on maximising the positive impact of regulated information for consumers and markets.

Market failures

  1. Market power
  2. Asymmetries in information between consumers and firms
  3. Externalities not captured within consumers’ preferences
  4. (?) Behavioural biases

Behavioural economics (& biases):

  • highlights that consumers may find it hard to assess information and compare across products
  • allows us to better understand the underlying causes of search costs (which affect access) and switching (which limits ability to act)
  • makes clear that existing problems within the consumer decision-making process are more entrenched and prevalent than we had believed

Virtuous circle

Markets work well when there are efficient interactions on both the demand (consumer) side and the supply (firm) side. On the demand side, confident consumers activate competition by making well-informed and well-reasoned decisions which reward those firms which best satisfy their needs. On the supply side, vigorous competition provides firms with incentives to deliver what consumers want as efficiently and innovatively as possible. When both sides function well, a virtuous circle is created between consumers and competition.

[Figure: virtuous circle]

Failure of either side of the circle can harm the effectiveness of markets

Dynamic competition may also be affected by consumer biases within the market. Over time this evolutionary role of competition implies that the average efficiency of the market increases for all consumers. This role is diminished when consumers no longer reward those firms that provide them with what they really want but, instead, reward those that best play on their biases.

Consumers drive

In order for consumers to drive competition by their active, effective, and rational part in this virtuous circle, they ideally need to:

  • Access information about the various offers available in the market. Affected by biases e.g.: consumers tend to look at relative costs rather than absolute search costs.
  • Assess these offers in a well-reasoned way. Affected by biases e.g.:
    • incorrectly anticipating risk, underestimating or overestimating future use, or overweighting the present
    • use rules of thumb
    • distracted by the way in which information is framed and presented
  • Act on this information and analysis by purchasing the good or service that offers the best value to the customer. Affected by biases: e.g. overconfidence, can create inertia

Firms’ Reactions to Consumer Biases

  • Accessing information. Firms can make it more difficult for consumers to perform optimal search. E.g. add-on services, adding clauses, drip-pricing
  • Assessing offers. E.g. obfuscating prices or increasing choice or complexity
  • Acting on information and analysis. E.g. increase switching costs (play on inertia), use defaults and automatic enrolments, or use time limited offers to inhibit switching

But: there is a growing empirical literature that provides evidence to support the notion of non-rational behaviour by firms, see Armstrong & Huck (2010) Behavioral Economics as Applied to Firms: A Primer.

Problems in Markets can be Self-correcting

Market Solutions: the market may require a catalyst in order to change from an equilibrium in which all firms want to exploit consumer biases to an equilibrium in which all firms want to help consumers by revealing their prices. Potential catalysts: media, or advisors & intermediaries (e.g. consumer organizations).

Power of Learning: Even if firms have an incentive to mislead consumers this may not be possible (for long) if consumers learn from their mistakes. There are clearly limits to learning.

Self-regulation occurs where firms opt to join schemes that require them to behave in particular ways.

Intervention Can Potentially Do More Harm than Good

All errors which [man] is likely to commit against advice and warning, are far outweighed by the evil of allowing others to constrain him to what they deem his good.

John Stuart Mill, (1859), On Liberty.

  1. We want solutions that solve the problem, but we do not want to remove consumer choice
  2. There is no guarantee that authorities will necessarily improve the market or not create unforeseen consequences elsewhere. It may be that authorities simply do not have the level of expertise required to make delicate interventions
  3. Authorities might have behavioural biases as well

These points caution us against being too paternalistic even when behavioural biases point to problems within the market.

Lessons for Design of Remedies

  • There will always be times – just as there always have been – when intervention is necessary.
  • Other tools include consumer enforcement, consumer education, and (in the UK at least) market studies and investigations. There is also potential for authorities to advocate legislation in a particular market
  • An example of a positive intervention may be an obligation on firms to help consumers make decisions

A further concern that can arise around interventions to solve problems associated with consumer biases is that such interventions can be inherently redistributive. In many markets, the gains that firms make from exploiting consumer biases will be to some extent passed back, through the competitive process, to customers who do not exhibit those biases. In this case, there is effectively a form of cross-subsidy between customers, and this may be unwound with intervention. This does not imply that such interventions should not be made, but it is important to be aware that there can be losers as well as winners in such situations.

Conclusion

Where behavioural biases appear to be creating problems, some may advocate abandoning competition for regulation. We discussed above the dangers of over-paternalism and limiting choice. Competition authorities have a key role in reminding government of the benefits that competition and choice bring. In doing so, however, they need to be cognisant of the available evidence on behavioural economics and its implications.

Repaying the mortgage with an interest-rate benefit

Together with Stefanie de Beer I wrote a short piece for the Statistiek (Statistics) section of ESB: Hypotheek aflossen met rentevoordeel (Repaying the mortgage with an interest-rate benefit) [paywall].

It elaborates on what appears on page 36 of this earlier AFM report: Rapport Experimenteren: samen leren activeren [accompanying press release: AFM, ING en Florius innoveren met gedragswetenschappelijke experimenten].

Some practitioners (mortgage advisers and financial planners) shared some of their experiences via Twitter:

SMaRT

Because the idea is based on Save More Tomorrow: Using Behavioral Economics to Increase Employee Saving, I also posted a tweet in English mentioning the two creators of SMaRT:

pdf: ESB_Hypotheek aflossen met rentevoordeel

Museums

In the first months of this year I visited a few museums and took some snapshots there. I am collecting them here in this blog post.

The visit to the Haags Fotomuseum (March) has its own post: Michael Wolf – Life in Cities, as does the Jenevermuseum in Schiedam (January).

February: “Photo-phylles” at the Jardin Botanique, Bordeaux:

February: Bernd, Hilla en de anderen / Fotografie uit Düsseldorf (Bernd, Hilla and the Others / Photography from Düsseldorf) at Huis Marseille:

March: NEMO Amsterdam (the kids had a day off school for a teacher training day; went together with Mees):

March: André Volten – Utopia at Beelden aan Zee:

April: Centraal Museum Utrecht:

April: Fashion Cities Africa in Tropenmuseum Amsterdam:

April: Hollandse Meesters uit de Hermitage in Hermitage Amsterdam:

Central Bank Communication and the General Public

Central Bank Communication and the General Public. Andy Haldane (of Dog and Frisbee fame) and Michael McMahon. 2018. Forthcoming, AEA Papers and Proceedings.

Blinder (2009) wrote that “It may be time for both central banks and researchers to pay more attention to communication with a very different audience: the general public.” Communication can aid expectations, and hence economic, management; central bank communication is now itself a powerful lever of monetary policy.

Haldane (2017) stresses a deficit of public understanding as well as public trust in central banks – a twin deficits problem. Facing these twin deficits, a number of central banks have recently acknowledged the need to adapt their communications strategies to improve their reach to the general public, including through more accessible language and more direct engagement (Haldane, 2017). Because such efforts are not costless, however, two important considerations arise: feasibility and desirability.

Desirability

Four reasons why it may be desirable to speak directly to a wider audience.

  1. A better understanding of the factors driving the economy, and economic policy, could help to reduce the incidence of such self-reinforcing expectational swings in sentiment and behaviour.
    To become convincing and credible, communications may need to be simple, relevant and story-based. Typical central bank communications tend to fail on all three fronts.
    Households who report greater knowledge and greater satisfaction with monetary policy are also likely to have one-year, two-year and five-year inflation expectations that are closer to the inflation target.
  2. Building public understanding may be important as a means of establishing trust and credibility about central banks and their policies.
    It is also important for reasons of political accountability.
    Satisfaction in central banks’ actions is positively correlated with institutional understanding. It is also positively correlated with measures of central bank credibility.
  3. Traditional information intermediaries, such as the mainstream media and financial markets, may benefit from new, simpler narrative communication.
  4. To engage in more listening to messages from the general public, given that aggregating information is one of a monetary policy committee’s key roles.

Feasibility

We examine a recent communication initiative by the Bank of England. In November 2017 the Bank of England launched a new, broader-interest version of its quarterly Inflation Report (IR), augmented with new layers of content aimed explicitly at speaking to a less-specialist audience.

Overall, the analysis yields a nuanced good-news message.

  1. Website activity over the course of the 24 hours after the announcement increased markedly in November 2017, almost doubling compared with earlier IRs.
  2. Numbers of tweets and retweets associated with the IR were materially higher than in August 2017, but slightly lower than in August 2016. Monetary policy news itself, rather than the means by which it is communicated, is the largest single factor determining the reach of Twitter activity.
  3. More than 70% of respondents [in a survey of BoE business contacts] felt the new layered summary helped them to better understand the IR’s messages. And around 60% of respondents felt the new communication improved their perceptions of the Bank.

Experiment

We now assess the impact of the new Bank of England communications more directly through a controlled experiment. N=285 UK general public, plus sample of first-year Oxford economics graduate students.

Participants were then randomly assigned to read either the traditional Monetary Policy Summary that accompanies the IR or the new, simplified layered content.

Three questions:

(1) understand the content and messages?

The results confirm that the new layered content is easier to read and understand, even for technically-advanced MPhil students.

(2) IR summary changed your views or expectations?

In the case of the general public survey, we find that more straightforward communication boosts the chances that the participant’s beliefs move more closely into alignment with the Bank’s forecasts. For MPhil students, the coefficient is also positive but not statistically significant.

(3) How has the IR summary affected your perceptions of the Bank of England?

Those that read the new layered content tended to develop an improved perception of the institution (BoE).

Policy implications

On a practical level, central banks aiming to reach a broader audience will need to continue to innovate and experiment with different methods and media for engaging the general public. This will, inevitably, require a degree of trial and error.

Success should be measured, not by the ability to reach everyone, but rather to influence beyond the small minority of technical specialists and information intermediaries who currently form the core of central banks’ audience.

***

Summarized in The Telegraph: There are good reasons why the Bank of England is trying to speak to ‘ordinary people’.

For central banks, communication is a powerful policy tool. The way central bankers talk about their thinking and decisions influences even long-term interest rates as investors price credit according to their expectations of the central bank’s behaviour.

(…)

At the same time, the Bank is changing the way it communicates, and more specifically changing the people it communicates to. As well as the traders, economists and strategists in the markets, the Bank wants to talk to the wider public. This approach has led to a more accessible version of the quarterly Inflation Report and Governor Mark Carney going on ITV’s Peston on Sunday show. There are several reasons why the Bank might try to broaden the audience.

For one, clearer, simpler messaging may help media and markets to understand policy. Experience suggests the cryptic code of an earlier generation of central bankers can be misunderstood even by sophisticated market participants.

Perhaps more important, talking directly to “ordinary people” confirms that household actions matter to the economy and to Bank policy. Just as much as City investors, consumers need to form sensible expectations of the future economy when they make decisions on borrowing and spending.

A more accessible approach may build public confidence in the Bank at a time when trust in public institutions is weak. It could also open a dialogue that facilitates the flow of information from households to the central bank. The Bank surveys businesses extensively, but households less so. MPC members knowing more about what households think and feel about the economy can only be a good thing.

So there are good reasons for the Bank to try harder to talk to the wider public, and I think that approach will continue. Market participants should get used to the fact that they’re not the only people the Bank wants to talk to – and learn to read its utterances in that context.

 

Rethinking Traditional Methods of Survey Validation – Maul (2017)

Andrew Maul (2017) Rethinking Traditional Methods of Survey Validation, Measurement: Interdisciplinary Research and Perspectives, 15:2, 51-6 (Found via Tweet that is now protected)

Abstract

It is commonly believed that self-report, survey-based instruments can be used to measure a wide range of psychological attributes, such as self-control, growth mindsets, and grit. Increasingly, such instruments are being used not only for basic research but also for supporting decisions regarding educational policy and accountability. The validity of such instruments is typically investigated using a classic set of methods, including the examination of reliability coefficients, factor or principal components analyses, and correlations between scores on the instrument and other variables. However, these techniques may fall short of providing the kinds of rigorous, potentially falsifying tests of relevant hypotheses commonly expected in scientific research. This point is illustrated via a series of studies in which respondents were presented with survey items deliberately constructed to be uninterpretable, but the application of the aforementioned validation procedures nonetheless returned favorable-appearing results. In part, this disconnect may be traceable to the way in which operationalist modes of thinking in the social sciences have reinforced the perception that attributes do not need to be defined independently of particular sets of testing operations. It is argued that affairs might be improved via greater attention to the manner in which definitions of psychological attributes are articulated and greater openness to treating beliefs about the existence and measurability of psychological attributes as hypotheses rather than assumptions—in other words, as beliefs potentially subject to revision.

Procedures of analysis and quality control of measurement instruments are often grouped under the heading of “validation” in the social sciences. In the case of self-report, survey-based instruments, such validation activities commonly consist of essentially three steps:

  1. Estimation of overall reliability or measurement precision, via estimation of Cronbach’s alpha
  2. Some form of latent variable modeling, via exploratory factor analysis (or sometimes principal components analysis), possibly followed by confirmatory factor analysis and, more rarely, other latent variable models
  3. Estimation of associations between the measured variable and external variables, by inspection and interpretation of correlation matrices of scores on the new instrument and scores from existing instruments designed to measure similar or theoretically related attributes or outcomes of interest.
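
For reference, step 1 can be written out. For k items, Cronbach’s alpha is k/(k−1) × (1 − sum of the item variances / variance of the total score). The sketch below is a minimal illustration under that standard definition; the function name cronbach_alpha and the toy responses are my own assumptions, not data or code from Maul (2017).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total (sum) score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: four respondents, three items on a 1-6 scale.
toy = [[1, 2, 1], [3, 3, 4], [5, 4, 5], [6, 6, 5]]
print(round(cronbach_alpha(toy), 3))
```

Note that the formula only looks at how item scores co-vary. Nothing in it refers to what the items say.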

 

Why is this trinity successful?

An optimistic explanation for the longevity and popularity of these techniques could be that they are, in fact, reliably successful in achieving their intended scientific and quality-control aims.

A less optimistic observer might note various extra-scientific factors that might contribute to the popularity of these techniques, such as the fact that they have a clear track record of success in facilitating the publication of manuscripts in academic journals and providing a socially accepted warrant for claims of validity; additionally, these techniques are relatively easy to understand and implement, especially by comparison to many other psychometric models (which are not as easily accessible via common software programs such as SPSS).

Three studies with items without theory

In the three studies (described below), items were written in the complete absence of a theory concerning what they measured and how they worked.

  1. In the first study, the items closely resembled items from a widely used survey instrument intended to measure growth mindsets, with the notable exception that the key noun in the sentence (“intelligence”) had been replaced with a nonsense word (“gavagai”). To help ensure that any results were not driven by peculiarities of the word “gavagai,” two additional versions of the survey were also used, where the word “gavagai” was replaced with “kanin” or “quintessence” [result: wording did not matter].
  2. In the second study, items consisted only of meaningless gibberish. Study 2 items were constructed so as to entirely lack even the semblance of semantics. Eight items were constructed, of approximately equal length, consisting of stock lorem ipsum text (e.g., “sale mollis qualisque eum id, molestie constituto ei ius”).
  3. In the third, they were simply absent. The items (if they could even be called that) simply consisted of an item number (e.g., “1.”), followed only by the six response options as described in the previous studies, ranging from strongly disagree to strongly agree.

Prima facie, it would seem difficult to take seriously the claim that any of these sets of items constituted a valid measure of a psychological attribute, and if such a claim were made, one might reasonably expect any quality-control procedure worthy of the name to provide an unequivocal rejection. To state this in Popperian language: If ever there were a time when a theory deserved to be falsified, this would appear to be it.

Yet, this is not what occurred. In all three studies above, reliability estimates for the deliberately-poorly-designed item blocks were quite high by nearly any standard found in the social sciences.

These validation procedures returned results roughly in line with what is commonly provided as positive evidence of validity throughout the social sciences. This would appear to cast doubt on the adequacy of these methods for providing the kind of rigorous test of beliefs usually expected of scientific studies. Indeed, if response data from nonsensical and blank items can meet classically accepted criteria for validity, one might wonder under what conditions such procedures would not return encouraging results.

It was argued and shown that traditional validation approaches may commonly fail to provide rigorous, potentially falsifying tests of key hypotheses involved in the construction of measures; it was demonstrated that it is not only possible but also apparently fairly easy to obtain favorable-seeming values of common statistical criteria for validity even in the absence of a theory concerning what an instrument measures and how it operates and, in fact, even in the absence of actual items.

The validation activities themselves (in particular, the aforementioned trinity of reliability estimates, factor analyses, and inspection of correlations with other variables) are essentially unreactive to theory.

Favorable-looking results as a default expectation

The results of this study suggest that, at least in the context of responding to survey questions, respondents often choose to behave consistently unless there is a clear reason not to do so. As such, it may be that favorable-looking results of covariance-based statistical procedures should be regarded more as a default expectation for survey response data than as positive evidence for the validity of an instrument as a measure of a psychological attribute.
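
The “default expectation” point can be illustrated with a small simulation. This is a hypothetical sketch, not a reproduction of Maul’s studies: it assumes that each respondent has a habitual answer on the six-point scale and deviates from it by at most one point per item, regardless of item content. The function names, sample size and noise model are all my own assumptions.

```python
import random
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def simulate_consistent_responders(n_respondents=200, n_items=8, seed=0):
    """Simulate answers to content-free items (assumed response model)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_respondents):
        habit = rng.randint(1, 6)  # the respondent's habitual answer
        # Each answer is the habit plus -1, 0 or +1, clipped to the 1-6 scale.
        row = [min(6, max(1, habit + rng.choice([-1, 0, 0, 1])))
               for _ in range(n_items)]
        data.append(row)
    return data


alpha = cronbach_alpha(simulate_consistent_responders())
print(f"alpha for content-free items: {alpha:.2f}")
```

Under these assumptions alpha comes out high even though the simulated items have no content at all, because the statistic only rewards respondents for answering consistently.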

Ad hoc explanations

[A] number of interesting correlations surfaced, including the correlation between scores on the “Theory of Gavagai” items and scores on the original Theory of Intelligence items, and the correlation between the nonsense items and Big Five Agreeableness. If one were inclined to do so, one might be able to provide ad hoc explanations regarding how these correlations constitute evidence of validity.

Misconceptions regarding the nature of scientific inquiry in general and measurement in particular

The process of “validating” a measure seems to be thought of by many as separate from the process of defining the attribute to be measured and articulating hypotheses concerning the nature of the connection between variation in the attribute and variation in the outcomes of the proposed testing procedures; that is, the classic trinity of analytic methods used in traditional survey validation applications seem to be fixed a priori and independently of the substantive area of application, background psychological theory, and motivating goals for the creation of the instrument.

In many applications of psychological measurement, the definition of the attribute of interest is vague at best and incoherent or entirely absent at worst.

Operationalism and other strong forms of empiricism may have encouraged the perception that psychological attributes do not need to be rigorously defined independently of a particular set of testing operations.

There may be good reason to be suspicious of strong claims regarding the accuracy, precision, and coherence of many survey-based instruments, at least to the extent to which such claims are justified with reference to traditional validation strategies, and especially in the presence of unclear or poorly formulated definitions of target attributes and theories regarding their connection to the outcomes of proposed measurement procedures.

Michell (e.g., 1999; Measurement in psychology: A critical history of a methodological concept) refers to this belief [measurement is a universally necessary component of scientific inquiry] as the quantitative imperative, and gives a thorough historical account of its origins and development and of the ways in which it has shaped methodological reasoning in the psychological sciences since their inception.

***
Factoid from footnote 5: Lorem ipsum text, which is commonly used as placeholder text in publishing and graphic design applications, is itself derived from sections 1.10.32 and 1.10.33 of “de Finibus Bonorum et Malorum” (The Extremes of Good and Evil) by Cicero, written in 45 BC.