Stress testing: conservative calibration and regular verification.
Stress tests are used by commercial financial institutions, regulators and central banks as a means of testing the resilience of institutions or the entire sector to adverse changes in the economic environment. The global financial crisis uncovered the deficiencies of the stress-testing methodologies used in many countries. Before the crisis, many tests were wrongly indicating that the sector would remain stable even in the event of sizeable shocks (Haldane, 2009). These deficiencies related not only to the configuration of the adverse scenarios used, which had initially seemed implausibly strong but were often exceeded in reality, but also to the shock combinations assumed (Breuer et al., 2009). A role was also played by deficiencies in model calibration and in the assumed behaviour of banks and markets, and by the absence of testing of liquidity risk alongside traditional financial risks (in particular credit risk and interest rate risk).
Consequently, the assumptions and parameters used in stress tests are gradually being re-examined so that the tests can better analyze the impacts of strong shocks to the financial system. In defense of stress testing, however, it should be mentioned that this is a relatively new tool and hence it still requires ongoing methodological development and refinement.
This paper focuses on how to calibrate models used to stress test the most important risks in the banking system. We argue that stress tests should be calibrated conservatively and slightly overestimate the risks. However, to ensure that the stress test framework is conservative enough over time, a process of verification, i.e. comparison of the actual values of key banking sector variables with predictions generated by the stress-testing models, should become a standard part of the stress-testing framework. Direct verification of adverse scenarios is in the majority of cases (i.e. non-crisis periods) not possible. Thus, the verification should be performed on baseline scenarios. However, the whole stress-testing model should be calibrated conservatively in order to take into account the uncertainty related to the possible changes in estimated relationships in the case of adverse economic development. Hence, ex-post comparison between reality and predictions generated by baseline scenarios should indicate systematic risk overestimation.
To illustrate our point we present the results of the verification of the Czech National Bank's (CNB) stress testing framework. The CNB has been performing bank stress tests since 2003 and has significantly expanded its methodology over the past few years. The most recent major update was done in mid-2009 and involved the introduction of dynamic features in the system (see section 2). On this occasion, a verification of the overall stress-testing methodology was conducted in the context of the aforementioned international debate on the reliability of the predictions of the impacts of shocks to the banking sector. The aims were to demonstrate whether the stress test assumptions were correctly configured and to identify any deficiencies in those assumptions.
The analysis reveals that the current CNB stress-testing system generally errs on the right--i.e. pessimistic--side and slightly overestimates the risks. This leads on average to estimates of key financial soundness indicators (in particular capital adequacy) that are lower (more conservative) than the actual values. Some verification results were used to further develop the stress tests.
To our knowledge, no other study systematically and transparently presents the verification of an institution's stress-testing methodology. With this paper we would like to make a contribution to the debate on how to develop and calibrate reliable stress testing frameworks.
The paper is structured as follows. Section 2 briefly describes the CNB's stress-testing methodology as of end-2009 that was subsequently verified. Section 3 summarizes the verification methodology and presents summary conclusions of the verification for capital adequacy and some other key banking sector variables used in the stress tests. The conclusion summarizes the verification results and proposes a medium-term plan for further developing the tests.
Current banking sector stress-testing methodology of the CNB
The original banking sector stress-testing methodology applied at the CNB was based on the IMF methodology (Cihak, 2005). The CNB later switched from testing historical ad-hoc scenarios defined by a combination of shocks to using consistent macroeconomic scenarios generated by the CNB's prediction model and related credit risk and credit growth sub-models (Cihak, Hermanek and Hlavacek, 2007; Jakubik and Schmieder, 2008).
In the second half of 2009, the CNB significantly updated the banking sector stress-testing methodology in three respects. First, the tests were "dynamised", in the sense of switching to quarterly modelling of shocks and their impacts on banks' portfolios. This change was described in a box in the CNB Financial Stability Report 2008/2009 (CNB, 2009, pp. 63-64). Second, in the credit risk area there was a changeover to "Basel II terminology", i.e. to capturing the credit risk of several separate portfolios using the standard parameters PD, LGD and EAD and relating risk-weighted assets to those parameters using procedures specified in the IRB approach to calculating capital requirements. (2) The final major innovation was the extension of the shock impact horizon from one to two years (or eight subsequent quarters).
Alternative macroeconomic scenarios
Alternative macroeconomic scenarios still serve as the starting point for stress testing in the updated methodological framework. The scenarios are designed using the CNB's official prediction model supplemented with an estimate of the evolution of some additional variables, which are not directly generated by the model. "Stress scenarios" are constructed based on the identification of risks to the Czech economy in the near future. To compare the stress outcome with the most probable outcome, the stress tests use a baseline scenario, i.e. the current official macroeconomic prediction of the CNB. The predictions for GDP growth, inflation and other macroeconomic variables enter credit risk and credit growth models. These sub-models were developed to capture changes in banks' credit portfolios and credit risk. The stress tests work explicitly with the four main loan portfolio segments by debtor and/or credit type (non-financial corporations, loans to households for house purchase, consumer credit and other loans), to which the sub-models are also adjusted. The credit risk models are used to predict PD for the individual loan segments, whereas the credit growth models are used to estimate the growth in bank portfolios in relation to the macroeconomic situation and (after certain adjustments) to estimate the evolution of risk-weighted assets.
In the stress tests, the prediction for macroeconomic and financial variables for individual quarters is reflected directly in the prediction for the main balance-sheet and flow indicators of banks. The tests are dynamic, i.e. for each item of assets, liabilities, income and expenditure there is an initial (the last actually known) stock, to which the impact of the shock in one quarter is added/deducted, and this final stock is then used as the initial stock for the following quarter. This logic is repeated in all eight quarters for which the prediction is being prepared. The consistency between stocks and flows is thus ensured.
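The stock-flow logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `roll_forward`, the opening stock and the quarterly impacts are hypothetical and are not taken from the CNB framework.

```python
from typing import List


def roll_forward(initial_stock: float, quarterly_impacts: List[float]) -> List[float]:
    """Return the closing stock of one balance-sheet item for each quarter."""
    path = []
    stock = initial_stock  # last actually known stock
    for impact in quarterly_impacts:
        stock = stock + impact  # closing stock = opening stock + shock impact
        path.append(stock)      # closing stock is the next quarter's opening stock
    return path


# Hypothetical item of CZK 1,000 million with assumed impacts over eight quarters.
impacts = [-10.0, -15.0, -20.0, -20.0, -15.0, -10.0, -5.0, 0.0]
loan_path = roll_forward(1000.0, impacts)
print(loan_path)
```

Because each quarter starts from the previous quarter's closing stock, the stocks and flows stay consistent by construction.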
Credit risk testing is the most important area of stress testing. This testing is based on the use of PD for each of the four main segments of the loan portfolio. The second credit risk parameter is LGD, which is currently determined by expert judgement, with different amounts being set for different scenarios and different credit segments in line with the regulatory rules, commercial bank practices, the approaches applied by some rating agencies (Moody's, 2009) and existing estimates based on market data (Seidler and Jakubik, 2009). The third parameter is EAD, which is determined as the volume of the non-default part of the portfolio (i.e. excluding non-performing loans).
An increase in PD and LGD has two main effects on individual banks.
First, the expected loan losses (in CZK millions), against which banks will create new provisions of an equal amount and record them on the expenses side of the profit and loss statement as impairment losses, are calculated as the product of PD, LGD and EAD for each credit segment and quarter. Total assets are then symmetrically reduced by the amount of these expenses.
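As a minimal sketch of this calculation, the following Python snippet computes the expected loss per segment as PD x LGD x EAD and deducts the resulting impairment from total assets. All figures and the name `expected_loss` are hypothetical, chosen only to make the arithmetic visible.

```python
from typing import Dict, Tuple


def expected_loss(pd_quarter: float, lgd: float, ead: float) -> float:
    """Expected loss for one segment and one quarter: PD x LGD x EAD."""
    return pd_quarter * lgd * ead


# Hypothetical inputs per segment: (quarterly PD, LGD, EAD in CZK millions).
segments: Dict[str, Tuple[float, float, float]] = {
    "non-financial corporations": (0.010, 0.45, 800.0),
    "loans for house purchase": (0.005, 0.20, 600.0),
    "consumer credit": (0.020, 0.55, 200.0),
    "other loans": (0.010, 0.45, 100.0),
}

total_assets = 3000.0  # hypothetical opening total assets, CZK millions

# New provisions are booked as impairment losses in the profit and loss statement...
impairment = sum(expected_loss(*params) for params in segments.values())
# ...and total assets are reduced symmetrically by the same amount.
total_assets_after = total_assets - impairment

print(round(impairment, 2), round(total_assets_after, 2))
```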
The product of PD and the volume of the non-default portfolio forms the volume of new non-performing loans (NPLs) for each quarter. This allows us to generate the volume of total NPLs in the following eight quarters for each bank, and subsequently for the banking sector as a whole, according to the following equation:
(1) $NPL_t = NPL_{t-1} + \sum_{i=1}^{4} PD_{i,t} \cdot NP_{i,t-1} - a \cdot NPL_{t-1}$
where NPL are non-performing loans, PD is the probability of default, NP is the non-default portfolio in the four segments defined above (indexed by i), t denotes the quarter and a is an NPL outflow parameter (i.e. write-offs or sales of existing NPLs, the default part of the portfolio). Parameter a is set by expert judgement at 15% for all segments, i.e. 15% of NPLs are written off/sold each quarter and subsequently disappear from the total volume of NPLs and (gross) assets of the bank.
The credit growth model leads to an estimate of the gross volume of loans in individual segments. Using relation (1) for NPL modeling, this allows us to determine for each bank, and subsequently for the banking sector as a whole, the NPL/total loans ratio, a standard indicator of the banking sector's health.
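The NPL recursion and the resulting NPL ratio can be sketched as follows. The timing convention is an assumption made for illustration: each quarter's PD is applied to the opening non-default portfolio, and the 15% outflow is applied to the opening NPL stock. The function names and all input figures are hypothetical.

```python
from typing import List, Sequence


def npl_path(
    npl_start: float,
    pd_by_quarter: Sequence[Sequence[float]],
    non_default_by_quarter: Sequence[Sequence[float]],
    outflow: float = 0.15,
) -> List[float]:
    """Roll the NPL stock forward quarter by quarter.

    Each inner sequence holds one value per loan segment.
    """
    path = []
    npl = npl_start
    for pd_q, np_q in zip(pd_by_quarter, non_default_by_quarter):
        new_npl = sum(p * v for p, v in zip(pd_q, np_q))  # newly defaulted loans
        npl = npl + new_npl - outflow * npl               # inflow minus write-offs/sales
        path.append(npl)
    return path


def npl_ratio(npl: float, gross_loans: float) -> float:
    """Share of non-performing loans in the gross loan volume."""
    return npl / gross_loans


# Hypothetical two-segment example for two quarters (CZK millions).
demo_path = npl_path(
    npl_start=100.0,
    pd_by_quarter=[[0.01, 0.02], [0.01, 0.02]],
    non_default_by_quarter=[[1000.0, 500.0], [1000.0, 500.0]],
)
print([round(x, 2) for x in demo_path], round(npl_ratio(demo_path[-1], 1600.0), 4))
```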
Second, in the case of banks applying the Basel II IRB approach to the calculation of capital requirements for credit risk, the capital requirements (or risk-weighted assets, RWA (3)) for credit risk are a function of PD, LGD and EAD. Given that the largest banks in the Czech Republic apply this approach, this relation is applied to all banks for the sake of simplicity. Given a constant non-default portfolio volume, i.e. EAD, an increase in PD and LGD thus generally results in an increase in RWA and therefore a decrease in capital adequacy.
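To illustrate how RWA respond to PD and LGD, the sketch below implements the Basel II IRB risk-weight function for corporate exposures. The text above does not state which asset-class formula is applied to each segment, so the choice of the corporate function, the 2.5-year maturity and the omission of the 1.06 scaling factor are assumptions made here for illustration; PD is the one-year probability of default.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def irb_capital_requirement(pd: float, lgd: float, maturity: float = 2.5) -> float:
    """Capital requirement K per unit of EAD (Basel II corporate function)."""
    weight = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    corr = 0.12 * weight + 0.24 * (1.0 - weight)       # asset correlation R
    b = (0.11852 - 0.05478 * log(pd)) ** 2             # maturity adjustment
    # PD conditional on a 99.9% adverse realisation of the systematic factor.
    stressed_pd = _STD_NORMAL.cdf(
        (_STD_NORMAL.inv_cdf(pd) + sqrt(corr) * _STD_NORMAL.inv_cdf(0.999))
        / sqrt(1.0 - corr)
    )
    unexpected_loss = lgd * (stressed_pd - pd)
    return unexpected_loss * (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)


def risk_weighted_assets(pd: float, lgd: float, ead: float, maturity: float = 2.5) -> float:
    """RWA = capital requirement x 12.5 (see footnote 3)."""
    return 12.5 * irb_capital_requirement(pd, lgd, maturity) * ead


# Holding EAD constant, a higher PD or LGD raises RWA.
base = risk_weighted_assets(pd=0.01, lgd=0.45, ead=1000.0)
higher_pd = risk_weighted_assets(pd=0.03, lgd=0.45, ead=1000.0)
higher_lgd = risk_weighted_assets(pd=0.01, lgd=0.55, ead=1000.0)
print(round(base, 1), round(higher_pd, 1), round(higher_lgd, 1))
```

With capital unchanged, the higher RWA in the second and third cases translate directly into a lower capital adequacy ratio.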
Interest rate and currency risk
The macroeconomic scenarios contain a prediction of the evolution of the simplified koruna and euro yield curves (rates with 3M, 1Y and 5Y maturities). A change in interest rates has a direct effect on bank balance sheets in two main items, namely interest profit and the value of bond holdings. A rise in short-term rates thus reduces the interest profit of those banks that have an excess of short-term liabilities over short-term assets. However, the calculation is adjusted by expert judgement to take account of the business policies of commercial banks, which respond relatively little to market interest rate changes on the deposit side.
The quarter-on-quarter change in the CZK/EUR exchange rate is applied to the net open foreign currency position (including off-balance-sheet items), generating either a loss or a profit depending on the sign of the net open position and the direction of the exchange rate change. (4)
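A minimal sketch of this calculation follows, assuming the exchange rate is quoted as koruna per euro and the net open position is expressed in euro; both conventions and the name `fx_profit_or_loss` are assumptions for illustration.

```python
def fx_profit_or_loss(net_open_position_eur: float, rate_prev: float, rate_now: float) -> float:
    """Profit (+) or loss (-) in CZK from the quarterly exchange rate change.

    Rates are quoted as CZK per EUR, so a lower rate means koruna appreciation.
    """
    return net_open_position_eur * (rate_now - rate_prev)


# Hypothetical long position of EUR 100 million, koruna appreciates from 25 to 24.
long_position = fx_profit_or_loss(100.0, rate_prev=25.0, rate_now=24.0)
# The same exchange rate move with a short position of EUR 100 million.
short_position = fx_profit_or_loss(-100.0, rate_prev=25.0, rate_now=24.0)
print(long_position, short_position)
```

The sign pattern matches footnote 4: a positive open position combined with koruna appreciation produces a loss.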
Interbank contagion risk
Interbank contagion risk is modeled in two selected periods (in the fourth and eighth quarters). The test uses data on interbank exposures, with the capital adequacy of individual banks being used to determine their probability of default (PD). As interbank exposures are mostly unsecured, LGD is assumed to be 100%. The expected losses due to interbank exposures are calculated for each bank according to the formula PD x LGD x EAD, where EAD is the net interbank exposure. If these losses are high enough to reduce the bank's capital adequacy and thus increase its PD, another iteration follows in which the negative effects are transmitted to other banks through an increase in their expected losses. These iterations are performed until this "domino effect" of interbank contagion stops, i.e. until the rise in PD induced in one bank or group of banks does not lead to a rise in the PD of other banks.
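The iteration can be sketched as below. The mapping from capital adequacy to PD is not specified in the text, so `pd_from_car` is a purely hypothetical step function; the sketch also holds RWA fixed during the iterations, which is a simplifying assumption.

```python
from typing import List, Sequence, Tuple


def pd_from_car(car: float) -> float:
    """Hypothetical step mapping from capital adequacy to default probability."""
    if car <= 0.0:
        return 1.0
    if car < 0.08:
        return 0.25
    return 0.005


def interbank_contagion(
    capital: Sequence[float],
    rwa: Sequence[float],
    exposures: Sequence[Sequence[float]],
    lgd: float = 1.0,
    max_rounds: int = 100,
) -> Tuple[List[float], List[float]]:
    """Iterate until no bank's PD rises; exposures[i][j] is bank i's net claim on bank j."""
    n = len(capital)

    def losses_given(pds: Sequence[float]) -> List[float]:
        # Expected loss of each bank: sum over counterparties of PD x LGD x EAD.
        return [
            sum(pds[j] * lgd * exposures[i][j] for j in range(n) if j != i)
            for i in range(n)
        ]

    pds = [pd_from_car(c / w) for c, w in zip(capital, rwa)]
    for _ in range(max_rounds):
        losses = losses_given(pds)
        stressed = [pd_from_car((capital[i] - losses[i]) / rwa[i]) for i in range(n)]
        if all(s <= p for s, p in zip(stressed, pds)):
            break  # the domino effect has stopped
        pds = [max(s, p) for s, p in zip(stressed, pds)]
    return pds, losses_given(pds)


# Hypothetical three-bank system: bank 1 lends to weak bank 0, bank 2 lends to bank 1.
final_pds, final_losses = interbank_contagion(
    capital=[5.0, 10.0, 20.0],
    rwa=[100.0, 100.0, 100.0],
    exposures=[[0.0, 0.0, 0.0], [20.0, 0.0, 0.0], [0.0, 40.0, 0.0]],
)
print(final_pds, final_losses)
```

In this example the weakness of bank 0 pushes bank 1 below the assumed 8% threshold, which in turn raises the expected loss of bank 2, but not by enough to raise its PD, so the iteration stops.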
Profit, regulatory capital and capital adequacy
The stress test assumes that banks will continue to generate revenues even in the stress period, particularly net interest income (interest profit) and net fee income. For these purposes, an analytical item of the profit and loss account called "adjusted operating profit" has been constructed. This consists of interest profit (+), fee profit (+), administrative expenses (-) and some other (non-shock) items. The volume of adjusted operating profit was initially determined by expert judgement for the individual scenarios. A model estimate of this item was introduced only in mid-2010 (CNB, 2010).
Regulatory capital is modelled in accordance with the applicable CNB regulations. Each bank enters the first predicted quarter with initial capital equal to that recorded in the last known quarter. If a bank generates a profit in the first predicted quarter (i.e. its adjusted operating profit is higher than its losses due to the shocks), its regulatory capital remains at the same level (is not increased). If, however, it generates a loss, its regulatory capital is reduced by the amount of that loss. The impacts of the shocks are thus reflected in a reduction of capital only if they exceed adjusted operating profit and the bank generates a loss.
Total capital adequacy is calculated for the individual quarters as the ratio of regulatory capital to total RWA. The portion of RWA relating to credit risk is modelled on the basis of the credit risk parameters (see above), while the other components of RWA (or of the capital requirements for other risks) for the individual quarters are determined by expert judgement.
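The asymmetric capital rule and the capital adequacy ratio described in this subsection can be sketched as follows. Function names and figures are hypothetical and serve only to illustrate the rule that profits do not raise regulatory capital while losses reduce it.

```python
def update_capital(capital: float, adjusted_operating_profit: float, shock_losses: float) -> float:
    """Regulatory capital at the end of a quarter.

    A quarterly profit leaves capital unchanged; a quarterly loss reduces it.
    """
    result = adjusted_operating_profit - shock_losses
    if result < 0.0:
        return capital + result  # result is negative, so capital falls
    return capital


def capital_adequacy(capital: float, total_rwa: float) -> float:
    """Capital adequacy ratio: regulatory capital over total risk-weighted assets."""
    return capital / total_rwa


# Hypothetical bank with CZK 100 million of capital and CZK 1,000 million of RWA.
profitable_quarter = update_capital(100.0, adjusted_operating_profit=10.0, shock_losses=4.0)
loss_quarter = update_capital(100.0, adjusted_operating_profit=10.0, shock_losses=25.0)
print(capital_adequacy(profitable_quarter, 1000.0), capital_adequacy(loss_quarter, 1000.0))
```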
Verification of the stress tests
The objective of the verification is to examine to what extent the assumptions and sub-models used in the stress testing framework are in line with reality. A problematic aspect of the verification is that the tests use stress--i.e. unlikely--scenarios, which may not occur in reality. Hence, we cannot subsequently compare predictions based on adverse scenarios with reality. For this reason, only the scenario that represents the most likely evolution of the economic environment, i.e. the no-stress baseline scenario of the CNB forecast, could be used for the verification. (5)
The prediction using the baseline (i.e. likely) scenario should indicate slightly higher risks than those that occur in reality. This is because the whole system should have a "conservative" buffer to offset the uncertainty associated with estimating losses given adverse economic developments, when relations (for example between GDP growth and risk parameters such as PD) estimated by standard econometric techniques on data from mainly calm periods can change suddenly for the worse. This requirement implies that stress test prediction errors should be evaluated differently from the errors of standard macroeconomic predictions, where deviations in either direction are regarded as "equally bad". In verifications using baseline scenarios, it is appropriate to apply an asymmetric view in the stress tests and tolerate prediction errors towards modest overestimation of the risks.
The verification was conducted on quarterly data in the period 2004 Q4-2009 Q2, i.e. for 19 periods in all. The actual values of key variables for the banking sector as a whole are compared with the predictions generated by the current stress-testing methodology for the individual quarters using the relevant baseline scenario of the forecast. The predictions for past quarters were therefore created subsequently using the updated stress-testing methodology in order to verify that methodology and do not match the values published in CNB Financial Stability Reports.
Two statistics based on the mean prediction errors were used to verify the selected variables: the mean absolute error (MAE) defined by equation (2):
(2) $MAE = \frac{1}{T} \sum_{t=1}^{T} |P_t - A_t|$
and the mean error in direction (MED) defined as:
(3) $MED = \frac{1}{T} \sum_{t=1}^{T} \frac{P_t - A_t}{A_t}$
where $P_t$ denotes the value of the prediction of the estimated variable for the given quarter, $A_t$ denotes the actual value, t represents the quarter for which the prediction is being made and T is the number of quarters evaluated. MED is reported in per cent.
MAE serves for simple presentation of the mean prediction error in the units in which the given variable is expressed, while MED expresses whether the given variable was overestimated or underestimated on average and thus gives the degree of "conservatism".
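Both statistics are straightforward to compute. The sketch below assumes that MAE is the average absolute deviation of the prediction from the actual value and that MED is the average relative deviation expressed in per cent; the relative form of MED is an assumption consistent with the statistic being reported in % and being negative when actual values exceed the predictions. The sample figures are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of |P_t - A_t|, in the units of the variable."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def mean_error_in_direction(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of (P_t - A_t) / A_t, in per cent; negative means underprediction."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical capital adequacy ratios in per cent for two quarters.
predicted_car = [11.4, 12.0]
actual_car = [13.0, 12.0]
mae = mean_absolute_error(predicted_car, actual_car)
med = mean_error_in_direction(predicted_car, actual_car)
print(round(mae, 2), round(med, 2))
```

For capital adequacy, a negative MED is the conservative direction, since it means the tests predicted a lower ratio than was actually observed.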
The prediction error of the capital adequacy ratio and other key banking sector variables can be split into two main factors. The first is the potential prediction error caused by inaccuracy in the estimates of the macroeconomic variables entering the stress-testing mechanism (interest rates and the exchange rate), and the second concerns the assumptions and sub-models used in the stress test itself (e.g. the assumptions about how the bank raises its regulatory capital, what interest and non-interest yields it achieves and how sensitive it is to interest rate risk). The macroeconomic prediction error can be eliminated in the verification by using the actual (ex post) values of macroeconomic variables. The residual error is then due to inaccuracies in the assumptions and sub-models of the stress-testing framework and the intentional conservative buffer.
The most important output variable of the tests is the estimate of the capital adequacy ratio (CAR). The mean absolute error (MAE) for CAR equates to roughly 1.6 p.p. of the capital adequacy ratio (see Table 1). This means, for example, that the test predicts CAR of 11.4% instead of 13%.
This prediction error equates to roughly 1.8 standard deviations. In the individual shorter periods this error gradually shrinks to 0.8 p.p. (i.e. 1 standard deviation) but then grows again slightly from 2007 onwards. Only a small part of the error is due to errors in the macroeconomic forecast, as the MAE statistic decreases only modestly with knowledge of actual macroeconomic developments.
The negative MED statistic of -10.8% shows that the real values were higher on average in the period as a whole and the stress tests thus tended to generate undervalued CAR estimates (see Table 1). This fact is also demonstrated by Chart 1, which reveals that a lower-than-actual CAR is predicted from the end of 2006 onwards. The resulting CAR was thus underestimated for most periods, in line with the conservative design of the tests. This conclusion remains valid even when the predictions are adjusted for the error in the prediction of macroeconomic variables.
[GRAPHIC 1 OMITTED]
[GRAPHIC 2 OMITTED]
The estimate of a lower-than-actual CAR is due to inaccuracy in the estimate of both RWA and regulatory capital. With few exceptions the stress test overestimated RWA (see Chart 2) and simultaneously tended to underestimate regulatory capital (see Chart 3). The decomposition of the error in the CAR estimate into the part caused by inaccurate prediction of RWA and the part caused by inaccurate prediction of regulatory capital shows that the contributions of the two items to the error are balanced on average.
[GRAPHIC 3 OMITTED]
[GRAPHIC 4 OMITTED]
The overestimation of risk-weighted assets has two sources. First, the credit growth model tends to predict higher credit volumes than those that materialised ex post. While at first sight an underestimation of credit growth might seem to be the conservative calibration, the opposite is true, at least from the point of view of risk-weighted assets: a larger loan portfolio means higher RWA and hence lower capital adequacy. Second, the framework uses the estimates of PD and LGD as the basis for risk weights (IRB approach), and these are also overestimated.
Regulatory capital is regularly increased out of after-tax profits, so the estimate of profits is an important parameter for the evolution of capital. Profits are calculated as the difference between adjusted operating profit and losses due to the individual shocks tested (see section 2). The verification of this variable revealed that the stress test systematically underestimates after-tax profit (Chart 4). This is due to two factors. First, the test systematically underestimates adjusted operating profit directly through the assumption about its level (for the baseline it was assumed that adjusted operating profit will be 90% of the average for the previous two years). This is also in line with the more conservative approach to risk assessment. The second cause is that the stress test tends to overestimate the impact of the main risk tested, i.e. credit risk, in the form of higher-than-actual PD and related higher provisioning for NPLs (recorded in the "losses from impairment" category), partly also due to overly conservative expert estimates of LGD.
Despite the relatively positive message of the verification results, further gradual refinement of the predictions is desirable. The main problem in the credit risk area is with the sub-models and assumptions used, as they excessively overestimate the impact of credit risk in the form of losses on impaired loans. While the direction towards overestimation is correct, the degree of overestimation should be kept within a reasonable range.
The further development of the stress tests should be based on regular verification. This should become an integral part of the banking sector stress-testing framework to enable ongoing assessment of whether the assumptions are realistic and a conservative buffer is being maintained in the risk predictions.
This paper focused on how to calibrate parameters used in stress tests. It argued that the parameters should be calibrated conservatively and should slightly overestimate risks in order to take into account the uncertainty related to the possible changes in estimated elasticities in the case of adverse economic development. We used the case study of the CNB's banking sector stress-testing methodology and presented the results of a verification of that methodology. Such verification is a tool that should be used regularly as a guide for refining the assumptions and models used. The results of the verification, conducted at the end of 2009, reveal that the CNB stress tests err on the right--i.e. pessimistic--side and slightly overestimate the risks. This leads on average to capital adequacy estimates that are lower (more conservative) than the actual values. This is consistent with the design of the stress tests, which should be built on conservative assumptions.
Breuer, T., Jandacka, M., Rheinberger, K., Summer, M. (2009), "How to Find Plausible, Severe and Useful Stress Scenarios," International Journal of Central Banking, Vol. 5, pp. 205-224.
CNB (2010), Financial Stability Report 2009/2010, Czech National Bank.
Cihak, M. (2005), "Stress Testing of Banking Systems," Czech Journal of Economics and Finance, Vol. 55, pp. 418-440.
Cihak, M., Hermanek, J., Hlavacek, M. (2007), "New Approaches to Stress Testing the Czech Banking Sector," Czech Journal of Economics and Finance, Vol. 57, pp. 41-59.
CNB (2009), Financial Stability Report 2008/2009, Czech National Bank.
Haldane, A. G. (2009), "Why Banks Failed the Stress Test," speech given at the Marcus Evans Conference on Stress-Testing, 9-10 February 2009.
Jakubik, P., Schmieder, C. (2008), "Stress Testing Credit Risk: Is the Czech Republic Different from Germany?," CNB Working Paper 9/2008.
Moody's (2009), "Approach to Estimating Czech Banks' Credit Losses," Moody's Global Banking, July 2009.
Seidler, J., Jakubik, P. (2009), "Estimation of Expected LGD," Financial Stability Report 2008/2009, Czech National Bank.
(1.) Both authors acknowledge the support by the Grant Agency of the Czech Republic (GACR 403/10/1235). Jakub Seidler also acknowledges the support by Grant Agency of the Charles University (GAUK 2009/47509). The findings, interpretations and conclusions expressed in this paper are entirely those of the authors and do not represent the views of any of the above-mentioned institutions.
(2.) PD--probability of default; LGD--loss given default; EAD--exposure at default; IRB--internal ratings based.
(3.) Risk-weighted assets = capital requirements (in CZK millions) x 12.5.
(4.) For example, a positive open foreign currency position and appreciation of the koruna leads to losses.
(5.) The first attempt to verify the stress tests using the baseline forecast scenario was made back in 2007 (Hlavacek et al., 2007), when the capital adequacy ratio and NPL growth predictions generated by the 2006 stress-testing methodology were compared with their real counterparts.
Czech National Bank & Charles University in Prague (1)
Table 1: Deviation of capital adequacy ratio estimate (estimate for 1-year horizon)

                                      2004-2009  2004-2005  2005-2006  2006-2007  2007-2008  2008-2009
Mean absolute error (MAE) in p.p.
  Prediction--stress test                   1.6        1.0        0.8        1.6        2.1        1.9
  Prediction--known macro                   1.5        0.9        0.6        1.1        2.0        2.5
Mean error in direction (MED) in %
  Prediction--stress test                 -10.8       -1.7       -6.5      -13.1      -17.2      -15.3
  Prediction--known macro                  -8.8        1.9       -1.3       -7.1      -16.3      -20.0
|Title Annotation:||PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON BUSINESS AND ECONOMY|
|Author:||Seidler, Jakub; Gersl, Adam|
|Publication:||Economics, Management, and Financial Markets|
|Date:||Mar 1, 2011|