Regression tests and the efficiency of fixed odds betting markets.

Sports markets have been a very fruitful object of research and of application of economic theories. Data on sports are usually readily available, and objectives of market participants are often easily specified. Also, economists are human, and many humans have a greater or lesser interest in sports.

In this paper, we set out to apply one particular economic theory, the efficient market hypothesis, to the market of fixed odds betting in soccer. The central question is whether prices reflect all available information. Prices are understood to be the inverse of odds offered by bookmakers, an issue we return to in the next section. Suppose odds do not reflect all information. In that case, fans or professional gamblers could exploit such inefficiencies and develop betting strategies with positive expected returns. One would expect that the advent of internet gambling has reduced the possibilities of pursuing profitable betting strategies.

In the literature, two general approaches to testing betting efficiency in sports markets are distinguished: a statistical approach and an economic approach (see, for example, Zuber, Gandar, & Bowers, 1985). In the first approach, one examines whether betting odds are an unbiased estimator of the outcome of a contest. The second approach takes a different route: different betting strategies are formulated, and one tests whether these strategies yield excessive profits. This paper follows the first approach, but relates to the second as well.

This paper makes three main contributions. First, we discuss a simple yet flexible approach to testing efficiency, which allows us to combine the statistical approach and the economic approach to testing market efficiency. The second contribution is the extent of the empirical analysis. Most studies of betting efficiency published so far focus on one particular market. Instead, we perform the same tests for betting on soccer games in 10 different countries. By using so many different data sets, for much longer periods than typically used in other papers, we hope to avoid drawing conclusions that do not hold up when a particular data set is extended. Finally, we examine the significance of one particular variable: are past returns on bets on contestants in a current match a predictor of current performance? This is similar to assessing whether or not past stock returns can help predict future performance.

Terminology and Literature Review

Many people are actively engaged in betting on the outcome of sports contests. In fact, some sports mainly exist because of associated betting opportunities. The main focus of this paper is fixed odds betting. In these markets, a bookmaker offers a payout on a certain event that can be taken by a punter by staking some amount on that bet. The odds are fixed at the moment the punter and the bookmaker enter the contract, hence the name 'fixed odds betting market' (as opposed to, for example, pari-mutuel betting where the actual odds are not known at the time the contract is entered). To develop notation and introduce concepts, we take the example of a soccer game that may end in a home win (HW), a draw (D), or an away win (AW).

In this paper, odds are understood to be decimal odds, that is, the total payout (stake and profit) per unit wagered. Odds as posted by bookmakers are denoted by $\tilde{O}^{HW}$. The 'implied probability' that would make a bet fair is then $1/\tilde{O}^{HW}$. Similarly, 'implied probabilities' for draws and away wins can be calculated, and invariably the sum of these 'probabilities' exceeds 1:

$$\frac{1}{\tilde{O}^{HW}} + \frac{1}{\tilde{O}^{D}} + \frac{1}{\tilde{O}^{AW}} = 1 + \lambda. \qquad (1)$$

$\lambda$ is known as the overround, and $\lambda$ is usually positive if odds are obtained from the same bookmaker. If $\lambda$ were negative for a particular match, a punter would be able to earn a sure profit. This could be possible if quotes are offered by different bookmakers who have different beliefs about the outcome of a game. Probabilities that do sum up to 1 are now obtained by scaling (Pope & Peel, 1989; Goddard & Asimakopoulos, 2004):

$$\pi^{HW} = \frac{1}{(1+\lambda)\,\tilde{O}^{HW}}. \qquad (2)$$

Probabilities for a draw and an away win are calculated similarly. This scaling is consistent with bookmakers that earn an expected profit of $\lambda/(1+\lambda)$ times total stakes on the game (Haigh, 1999). Corresponding to the scaled probabilities, one can calculate scaled odds by $O^{HW} = (1+\lambda)\,\tilde{O}^{HW}$. These odds are known as fair odds, as the overround corresponding to these odds is zero by construction. The probabilities derived in equation (2) are also known as prices (amounts to be wagered to collect one unit after the event occurs). From here on, we omit the adjectives 'fair' and 'scaled', so odds are understood to be scaled odds $O^{HW}$, and prices $\pi^{HW}$ are based on scaled odds as in equation (2). In the remainder of this paper we examine whether these prices are unbiased predictors of the probability that the event actually occurs.
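The arithmetic behind equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `scale_odds` and the posted odds in the example are hypothetical and are not taken from the dataset used in this paper.

```python
def scale_odds(posted):
    """Return the overround, scaled prices and fair odds for posted decimal odds.

    `posted` maps outcome labels to decimal odds (total payout per unit staked).
    """
    implied = {k: 1.0 / o for k, o in posted.items()}
    overround = sum(implied.values()) - 1.0                            # lambda in equation (1)
    prices = {k: p / (1.0 + overround) for k, p in implied.items()}    # equation (2)
    fair_odds = {k: (1.0 + overround) * o for k, o in posted.items()}  # scaled odds
    return overround, prices, fair_odds


# Hypothetical posted odds for one match (illustrative numbers only).
posted = {"HW": 2.10, "D": 3.30, "AW": 3.40}
overround, prices, fair_odds = scale_odds(posted)

print("overround:", round(overround, 4))
print("prices:", {k: round(v, 4) for k, v in prices.items()})
# Expected bookmaker profit per unit of total stakes, lambda / (1 + lambda).
print("margin:", round(overround / (1.0 + overround), 4))
```

By construction the scaled prices sum to 1, and each price is the reciprocal of the corresponding fair odds.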

The literature on tests of efficiency has developed between two extremes, indicated by Malkiel's definition of efficiency of capital markets:

A capital market is said to be efficient if it fully and correctly reflects all relevant information in determining prices. Formally, the market is said to be efficient with respect to some information set ... if security prices would be unaffected by revealing that information to all participants. Moreover, efficiency with respect to an information set ... implies that it is impossible to make economic profits by trading on the basis [of that information set]. (Malkiel, 1992)

Using this definition of market efficiency, two approaches to testing efficiency of betting markets have developed. In the first approach, one tests efficiency by examining whether prices (2) are unbiased estimates for the probability of the event. The alternative approach is economic tests of efficiency that look for the existence of profitable betting rules (or trading rules in a stock market). Of course, in practice both approaches are combined frequently.

If the information set contains past price information only, the market is said to be weakly efficient. If it contains other, publicly available information as well, the market is semi-strong efficient. If the information set also contains private, non-public information, the market is strongly efficient. We focus on testing weak and semi-strong efficiency.

Informational efficiency of fixed betting odds in football markets has been examined in many different papers; we restrict ourselves to mentioning a few of the papers most relevant to this study. One of the first studies is the one by Pope and Peel (1989). They analyze odds offered by four betting firms in the 1981-82 season. One of their statistical tests consists of estimating a linear probability model where an outcome indicator is regressed on the price, and they test whether the slope is 1. They find that posted odds do not fully reflect available information, especially in the case of draws. However, they also conclude that these inefficiencies cannot be used to formulate betting strategies that generate post-tax profits. In a more extensive empirical study based on data of the 1993-94 and 1994-95 seasons, Kuypers (2000) uses an ordered probit model for match outcomes to formulate a betting strategy. One of the variables that he includes is recent match performance. He finds 'exploitable betting opportunities'; the model can be used to generate a betting strategy with positive post-tax returns. He explains this lack of efficiency by profit-maximizing odds-setting by the bookmakers. If bettors' expectations are biased (for example, because of team allegiance), it may be optimal for the bookmaker to post odds that are not fully informationally efficient. These two studies are based on relatively old data sets, from a time when bookmakers posted odds on paper, in shops. More recently, betting has become an internet-based activity and opportunities for arbitrage are easier to identify. Forrest, Goddard, and Simmons (2005) use this argument, and the increasing amounts of money at stake, to explain why bookmakers' forecasts have become more difficult to improve upon over time. We also mention the study of Goddard and Asimakopoulos (2004), who also estimate an ordered probit model for match outcomes, using an impressive variety of explanatory variables. Recent performance indicators are among these. They find some evidence of inefficiency, in particular of odds of games played during the later months of the season. We discuss their approach in more detail below. Finally, this paper is related to Vlastakis, Dotsis, and Markellos (2009), who compare odds between bookmakers of different countries. They estimate models to identify profitable bets, and show that "... econometric models lead to more accurate forecasts which can be employed successfully to form profitable betting strategies" (p. 441). They document existence of a favorite-longshot bias, and arbitrage opportunities that exist in international matches between teams of unequal quality.

Regression Based Tests

In this section we discuss the regression testing procedure that we use to assess efficiency of betting markets. The approach is related to the approaches used by Zuber et al. (1985) and Pope and Peel (1989). The methodological approach sketched here can also be applied to other cases that study the calibration of probability models; see, for example, Medema, Koning, and Lensink (2009). To facilitate the discussion, we first introduce some notation.

Consider team $i$ that plays a home game against team $j$. Date of play is indicated by $t$, and $t-1$ is the date of the previous game. Note that this date may differ between teams $i$ and $j$. The game is played in season $s$. From the names of the teams it is clear to which league they belong. Outcome of the game is either a home win (HW), a draw (D), or an away win (AW). These events are indicated by dummy variables $Y^{HW}_{ijs} = 1$, $Y^{D}_{ijs} = 1$, and $Y^{AW}_{ijs} = 1$, where the index indicates the game and the superscript the outcome; if the outcome does not occur, these variables take value 0. When the game is finished, the three dummy variables sum to 1. Odds for a home win are denoted by $O^{HW}_{ijs}$.

The expected payout (including stake) on a 1 unit bet on a home win is

$$\Pr(Y^{HW}_{ijs} = 1) \cdot O^{HW}_{ijs}. \qquad (3)$$

We assume that bets have expected return equal to 1. Forrest et al. (2005) stress that "intensifying competition is likely to have increased the financial penalties for bookmakers of imprecise odds-setting" (p. 251). Hence, if odds reflect all information, we have

$$\Pr(Y^{HW}_{ijs} = 1) \cdot O^{HW}_{ijs} = 1. \qquad (4)$$

In other words, informationally efficient odds imply the following relation between the odds and the probability of a home win: $\Pr(Y^{HW}_{ijs} = 1) = 1/O^{HW}_{ijs}$. This suggests the following approach to test efficiency of the odds. Estimate the logit model

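$$\Pr(Y^{HW}_{ijs} = 1) = \frac{\exp\left(\beta_0 + \beta_1 \log(O^{HW}_{ijs} - 1)\right)}{1 + \exp\left(\beta_0 + \beta_1 \log(O^{HW}_{ijs} - 1)\right)} \qquad (5)$$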

and test whether or not $\beta_0 = 0$ and $\beta_1 = -1$. A more powerful test (Sauer, Brajer, Ferris, & Marr, 1988) may be obtained by extending the logit model with variables $z_{ijs}$ as in

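$$\Pr(Y^{HW}_{ijs} = 1) = \frac{\exp\left(\beta_0 + \beta_1 \log(O^{HW}_{ijs} - 1) + \gamma' z_{ijs}\right)}{1 + \exp\left(\beta_0 + \beta_1 \log(O^{HW}_{ijs} - 1) + \gamma' z_{ijs}\right)} \qquad (6)$$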

and test whether $\beta_0 = 0$, $\beta_1 = -1$, and $\gamma = 0$. The variables in $z_{ijs}$ are any variables that may determine the probability of a home win, but whose effects are not (fully) incorporated in the odds $O^{HW}_{ijs}$.

Model (6) to test efficiency differs from the ones in, for example, Pope and Peel (1989) and Goddard and Asimakopoulos (2004), who use linear probability models.

The advantage of using logit model (6) is that estimated probabilities are guaranteed to be between 0 and 1. This may not be the case in a linear probability model, especially for low and high probabilities. Moreover, we do not have to worry about heteroscedasticity when we estimate model (6) by maximum likelihood.
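As a rough illustration of how model (5) can be estimated by maximum likelihood, the following Python sketch fits a two-parameter logit by Newton-Raphson. It assumes the regressor is log(O - 1), the transformation under which the values 0 and -1 for the two coefficients reproduce Pr = 1/O. The function name `fit_logit`, the simulated odds, and the sample size are illustrative assumptions; they are not the data or the software used in this paper.

```python
import math
import random


def fit_logit(x, y, iterations=25):
    """Maximum likelihood for Pr(y=1) = 1 / (1 + exp(-(b0 + b1*x))).

    Uses Newton-Raphson and returns the estimates (b0, b1) together with
    the 2x2 inverse information matrix from the last iteration, which
    serves as the estimated covariance matrix once the steps have converged.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score with respect to b0
            g1 += (yi - p) * xi     # score with respect to b1
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    cov = [[h11 / det, -h01 / det], [-h01 / det, h00 / det]]
    return (b0, b1), cov


# Simulated data under the null hypothesis: outcomes occur with probability 1/odds.
random.seed(1)
odds = [random.uniform(1.2, 8.0) for _ in range(5000)]
x = [math.log(o - 1.0) for o in odds]
y = [1 if random.random() < 1.0 / o else 0 for o in odds]

(b0, b1), cov = fit_logit(x, y)
print("b0 =", round(b0, 3), "b1 =", round(b1, 3))
print("standard errors:", round(math.sqrt(cov[0][0]), 3), round(math.sqrt(cov[1][1]), 3))
```

Because the data are simulated under efficiency, the estimates should lie close to 0 and -1; with real odds, departures from these values are what the test looks for.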

An alternative to these regression-based tests are tests of efficiency that look for trading rules that provide the basis for a profitable betting strategy. Examples of such trading rules are: always bet on the home team, or bet on the favorite; but they can also be more complex, such as: bet on the home team if the expected return of the bet is positive, the expectation being taken with respect to some statistical model. Efficiency is then tested by calculating the average return of the bets following the trading rule, and comparing that average to 0. An example of this approach is Goddard and Asimakopoulos (2004). However, such a comparison is implicitly based on a one-sided hypothesis test, and most papers do not establish whether any trading rule yields statistically significant positive returns. Variables that enter the trading rule can be included in the regression variables $z_{ijs}$ in equation (6), so testing efficiency by looking for profitable trading strategies, and by testing $\gamma = 0$ in (6), does not seem to be fundamentally different. However, the tests based on trading rules test the null hypothesis that the bets on games satisfying the trading rule have a positive return. Such tests may not incorporate observations that do not satisfy the trading rule, and therefore may have somewhat lower power than the regression-based tests because they are based on fewer observations.
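A minimal sketch of such a trading-rule test is given below. It computes the net return on one unit bets selected by some rule and the t statistic for comparing the average return to 0. The function name `trading_rule_test` and the odds and outcomes in the example are hypothetical, and the t statistic relies on a large-sample approximation that a handful of bets does not justify; the example only shows the mechanics.

```python
import math
from statistics import mean, stdev


def trading_rule_test(odds, won):
    """Average net return on unit bets and the t statistic for mean return = 0.

    `odds` are decimal odds of the bets selected by the trading rule and
    `won` flags whether each bet paid out.
    """
    returns = [o - 1.0 if w else -1.0 for o, w in zip(odds, won)]
    average = mean(returns)
    standard_error = stdev(returns) / math.sqrt(len(returns))
    return average, average / standard_error


# Hypothetical bets selected by a rule such as "always bet on the home team".
odds = [1.8, 2.2, 1.5, 3.0, 2.0, 1.9]
won = [True, False, True, False, True, False]

average, t_stat = trading_rule_test(odds, won)
print("average return per unit staked:", round(average, 4))
print("t statistic:", round(t_stat, 4))
```

A one-sided test of profitability compares the t statistic with an upper-tail critical value only; games that do not satisfy the rule never enter the calculation.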

Most tests in the literature are based on trading rules that are based on the outcome of a statistical model, see, for example, Dixon and Pope (2004), Goddard and Asimakopoulos (2004), Goddard and Thomas (2006), and Vlastakis et al. (2009). In particular, Goddard and Asimakopoulos (2004) test the efficiency of a betting market by estimating the following regression

$$Y^{HW}_{ijs} = \beta_0 + \beta_1 \frac{1}{O^{HW}_{ijs}} + \gamma \left( p^{HW}_{ijs} - \frac{1}{O^{HW}_{ijs}} \right) + \varepsilon_{ijs} \qquad (7)$$

by weighted least squares. In this regression, $p^{HW}_{ijs}$ is the probability of a home win based on their statistical model. The term in parentheses measures the extent to which the statistical model contains relevant information, not included in the probabilities implied by the odds. Their test of efficiency is then $\beta_0 = 0$, $\beta_1 = 1$, and $\gamma = 0$. There are two problems with this approach. First, the coefficients of the statistical model are not known, but estimated (using games played earlier). Hence, Goddard and Asimakopoulos (2004) estimate

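$$Y^{HW}_{ijs} = \beta_0 + \beta_1 \frac{1}{O^{HW}_{ijs}} + \gamma \left( \hat{p}^{HW}_{ijs} - \frac{1}{O^{HW}_{ijs}} \right) + \varepsilon_{ijs} + \eta_{ijs} \qquad (8)$$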

with $\eta_{ijs}$ the prediction error, $\gamma(p^{HW}_{ijs} - \hat{p}^{HW}_{ijs})$. Assuming that the statistical model is correctly specified, this term reflects estimation uncertainty of the coefficients of the statistical model, which may or may not be negligible and/or homoscedastic (Pagan, 1984). The standard errors and test results in Goddard and Asimakopoulos (2004) are difficult to interpret for this reason. Also, correlation between the prediction error $\eta_{ijs}$ and the odds error $(\hat{p}^{HW}_{ijs} - 1/O^{HW}_{ijs})$ in the extended regression (8) cannot be ruled out a priori, and this may cause the estimator of $\gamma$ to be inconsistent. A second difficulty is that equation (7) is estimated for home wins, draws, and away wins separately, without taking into account the restriction that one of these events occurs.

However, the constraint $Y^{HW}_{ijs} + Y^{D}_{ijs} + Y^{AW}_{ijs} = 1$ implies restrictions on the specification of equation (7) if it is to hold for all values of the covariates (Van Perlo, Steerneman, & Koning, 2006).

To avoid the first problem, we do not specify a specific statistical model for outcomes and include predicted results of such a model in the covariates $z$ in equation (6). Instead, the effects of variables in $z$ are added to the logit model (6) directly. If necessary, nonlinear terms can be added to allow the marginal effect of a variable on the index to vary with the level of that variable; see Harrell (2001). This comes at the cost of estimating some additional parameters, so this strategy may affect the power of the test adversely. However, coefficients and standard errors are estimated consistently using this strategy, and we need that for the interpretation of the test results. As in other papers, we ignore the second issue, except at a basic, descriptive level.

Informational Efficiency, Empirical Results

In the application, we analyze fixed odds on soccer results. The dataset is obtained from http://www.football-data.co.uk and consists of national games from 10 different highest level European leagues: Belgium, England, France, Germany, Greece, Italy, Netherlands, Portugal, Spain, and Turkey. The dataset does not cover international games such as Champions League games or games between national teams. Seasons covered are 2002-03 to 2009-10. The dataset consists of 25,744 different soccer games. For each game, multiple odds offered by different bookmakers are available. The number of bookmakers varies by season and country. Bookmakers that appear in the dataset are Bet365, Blue Square, Bet & Win, Gamebookers, Interwetten, Ladbrokes, Sporting Odds, Sportingbet, Stan James, Stanleybet, Victor Chandler, and William Hill. These are all well-known European bookmakers. In the dataset, 47% of all games end in a home win, 26% in a draw, and 27% in an away win.

As a first check, we examine whether there are combinations of odds that offer a sure profit. This could be possible because odds of the same event vary between bookmakers. We find this is the case for 47 games, only 0.2% of all games. Thirty-four of these games are from the 2004-05 season or earlier, when arbitrage by trading through the internet was perhaps less common, so we do not consider this to be pervasive evidence against full informational efficiency of betting odds. In fact, these games may not have provided actual arbitrage opportunities, since bookmakers have started to allow bets on single games and single outcomes only from 2002 onwards (Buchdahl, 2003).
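The check for a sure profit can be sketched as follows: take the best odds per outcome across bookmakers and see whether the implied probabilities sum to less than 1. The function names, bookmaker labels, and quotes below are hypothetical and chosen so that an arbitrage exists; they are not taken from the dataset.

```python
OUTCOMES = ("HW", "D", "AW")


def best_odds_overround(quotes):
    """Best decimal odds per outcome across bookmakers and the resulting overround.

    `quotes` maps a bookmaker label to a dict of decimal odds per outcome.
    """
    best = {k: max(q[k] for q in quotes.values()) for k in OUTCOMES}
    overround = sum(1.0 / best[k] for k in OUTCOMES) - 1.0
    return best, overround


def arbitrage_stakes(best, total=1.0):
    """Split `total` so that the payout is the same whatever the outcome."""
    inverse = {k: 1.0 / o for k, o in best.items()}
    scale = sum(inverse.values())
    return {k: total * v / scale for k, v in inverse.items()}


# Hypothetical quotes from three bookmakers for one match.
quotes = {
    "bookmaker_a": {"HW": 2.30, "D": 3.10, "AW": 3.20},
    "bookmaker_b": {"HW": 2.05, "D": 3.60, "AW": 3.50},
    "bookmaker_c": {"HW": 2.10, "D": 3.30, "AW": 4.20},
}

best, overround = best_odds_overround(quotes)
stakes = arbitrage_stakes(best)
payouts = {k: stakes[k] * best[k] for k in OUTCOMES}

print("overround at best odds:", round(overround, 4))
print("payout per unit staked:", {k: round(v, 4) for k, v in payouts.items()})
```

A negative overround at the best available odds means that the equal-payout stakes return more than the total amount staked for every outcome.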

As mentioned above, our dataset contains odds for a particular event offered by up to 12 different bookmakers. There is some variation of the odds that are offered for the same event (game and outcome). Since our focus is on the betting market in general (and not on one particular bookmaker), we take the average odds (over bookmakers) to be the market consensus and we base our calculation of prices (equation (2)) on these averaged odds. Our results are not sensitive to this choice, and below we examine whether there is any additional explanatory power of disagreement on the outcome (measured by the highest odds offered minus the lowest odds offered).
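The construction of consensus odds, prices, and the disagreement measure can be sketched as below. The function name `consensus_prices` and the quotes are hypothetical; the sketch only assumes that the consensus is the plain average of decimal odds over bookmakers, as described above, followed by the scaling of equation (2).

```python
from statistics import mean

OUTCOMES = ("HW", "D", "AW")


def consensus_prices(quotes):
    """Average odds over bookmakers, spread (highest minus lowest), and prices."""
    average = {k: mean([q[k] for q in quotes.values()]) for k in OUTCOMES}
    spread = {
        k: max(q[k] for q in quotes.values()) - min(q[k] for q in quotes.values())
        for k in OUTCOMES
    }
    total = sum(1.0 / average[k] for k in OUTCOMES)  # equals 1 + overround
    prices = {k: (1.0 / average[k]) / total for k in OUTCOMES}
    return average, spread, prices


# Hypothetical quotes from three bookmakers for one match.
quotes = {
    "bookmaker_a": {"HW": 2.30, "D": 3.10, "AW": 3.20},
    "bookmaker_b": {"HW": 2.05, "D": 3.60, "AW": 3.50},
    "bookmaker_c": {"HW": 2.10, "D": 3.30, "AW": 4.20},
}

average, spread, prices = consensus_prices(quotes)
print("average odds:", {k: round(v, 3) for k, v in average.items()})
print("spread:", {k: round(v, 3) for k, v in spread.items()})
print("prices:", {k: round(v, 4) for k, v in prices.items()})
```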

We proceed to test informational efficiency in two steps. First, we estimate model (5) separately by country. Then, we extend that model with covariates that may reflect information not incorporated in the odds, that is, we estimate equation (6) for different choices of variables.

[FIGURE 1 OMITTED]

As a first description of the data, we provide a non-parametric loess regression of the outcome on the standardized prices of home wins and home losses in Figure 1. The results for draws are not shown, as these are more erratic. Other authors have also documented that the betting market for draws is more erratic; see, for instance, Pope and Peel (1989). If prices are an unbiased estimator for the likelihood of an event, the nonparametric regressions should be on the 45-degree line. We notice that there is some tendency of events with higher prices to occur more often than the price indicates. That is, more likely events tend to occur slightly more frequently than the prices suggest. This is the favorite-longshot bias: favorites are more likely to win than prices suggest. This conclusion holds across different countries and applies both to home wins and away wins. In soccer betting, the favorite-longshot bias has been identified before by Cain, Law, and Peel (2000) and Dixon and Pope (2004), and it is usually attributed to gamblers who are risk-loving, or to bookmakers who guard themselves against possible insider trading.

The existence of the favorite-longshot bias is confirmed by a more formal test, when we estimate logit model (5). In the Appendix we show that, for a fixed value of the odds, $\beta_1 < -1$ implies that the probability of an event occurring is larger than implied by the odds as long as the odds are smaller than 2 (in other words, as long as the posted price is larger than 0.5). This is indeed what we find in Table 1. In that table, we give for each event (blocks of columns) and each country (blocks of rows) the estimated values of $\beta_1$ and $\beta_0$, their standard errors, and the p-value of the Wald test of the hypothesis $\beta_0 = 0$, $\beta_1 = -1$. In all cases, the estimate of $\beta_0$ is positive, and the one of $\beta_1$ is smaller than -1, so these estimation results imply the favorite-longshot bias. The hypothesis that the coefficients are equal across results (home win/draw/away win) is rejected. However, the hypothesis that the coefficients are equal between countries is not rejected, so from now on we will not distinguish between different countries anymore, but we do distinguish between types of result. Similarly, we tested whether the estimated coefficients are stable over seasons, and that null hypothesis could not be rejected either. The estimates and p-value for the final model are given in Table 1. For all three outcomes (that is, in all three markets), the hypothesis that $\beta_0 = 0$ and $\beta_1 = -1$ is rejected at any reasonable level of significance.
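The Wald test of the joint hypothesis that the intercept is 0 and the slope is -1 can be sketched as follows. With two restrictions the statistic is compared with a chi-square distribution with 2 degrees of freedom, whose upper tail probability is exp(-W/2). The function name `wald_test_2df` is hypothetical, the point estimates are the illustrative values used in the Appendix, and the covariance matrix is invented for the example; none of these numbers are estimates from Table 1.

```python
import math


def wald_test_2df(estimate, cov, null=(0.0, -1.0)):
    """Wald statistic and p-value for a joint hypothesis on two coefficients.

    `estimate` is (b0, b1), `cov` their 2x2 covariance matrix, and `null`
    the hypothesized values.
    """
    d0 = estimate[0] - null[0]
    d1 = estimate[1] - null[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form d' * inverse(cov) * d for a 2x2 covariance matrix.
    stat = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    p_value = math.exp(-stat / 2.0)  # chi-square upper tail with 2 degrees of freedom
    return stat, p_value


# Illustrative point estimates with a hypothetical covariance matrix.
estimate = (0.1, -1.2)
cov = [[0.0004, 0.0001], [0.0001, 0.0009]]

stat, p_value = wald_test_2df(estimate, cov)
print("Wald statistic:", round(stat, 2))
print("p-value:", p_value)
```

If the estimates coincide with the null values the statistic is 0 and the p-value is 1, so small p-values indicate a departure from efficiency.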

To improve the power of the test, we extend the regression with covariates whose effects may not be fully incorporated in the odds posted. We use the following additional variables:

return: average return on one unit bets on last three games, for home team and away team

return H/A: average return on one unit bets during last three home games for home team, and during away games for the away team

spread: best odds offered minus worst odds offered

time: dummy when game is played (season divided into 10 deciles)

points H/A: average points obtained during last three games, for home team and away team

position: position of home team and away team

goals: average number of goals scored during last three games, for home team and away team

goals H/A: average number of goals scored during last three home games for home team, and during away games for the away team

all: all variables above

Most of these variables enter in pairs, one for the home team and one for the away team. The estimation results are summarized in Table 3.
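As an illustration of how a covariate such as 'return' could be constructed, the sketch below computes, for every game, the average net return on one unit bets over each team's previous three games. It assumes that a bet "on" a team means a bet on that team to win, which is one possible reading of the definition above; the function name `recent_returns`, the team labels, and the odds are hypothetical.

```python
from collections import defaultdict, deque


def recent_returns(games, window=3):
    """Average net return on unit bets on each team to win, over the team's
    previous `window` games (None if fewer games are available).

    `games` must be in chronological order; each game is a dict with keys
    "home", "away", "odds" (decimal odds for "HW" and "AW") and "result".
    """
    history = defaultdict(lambda: deque(maxlen=window))

    def average(team):
        past = history[team]
        return sum(past) / window if len(past) == window else None

    rows = []
    for game in games:
        home, away = game["home"], game["away"]
        # Record the covariates first: they use pre-match information only.
        rows.append((average(home), average(away)))
        odds, result = game["odds"], game["result"]
        history[home].append(odds["HW"] - 1.0 if result == "HW" else -1.0)
        history[away].append(odds["AW"] - 1.0 if result == "AW" else -1.0)
    return rows


# Hypothetical sequence of games (teams A, B, C).
games = [
    {"home": "A", "away": "B", "odds": {"HW": 2.0, "AW": 4.0}, "result": "HW"},
    {"home": "C", "away": "A", "odds": {"HW": 2.5, "AW": 3.0}, "result": "AW"},
    {"home": "A", "away": "C", "odds": {"HW": 1.8, "AW": 4.5}, "result": "D"},
    {"home": "A", "away": "B", "odds": {"HW": 1.9, "AW": 4.2}, "result": "HW"},
]

rows = recent_returns(games)
print(rows[3])  # covariates for the last game: (home team, away team)
```

The home/away variants (return H/A) would follow the same pattern with separate histories for home games and away games.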

In all cases, the joint hypothesis $\beta_0 = 0$, $\beta_1 = -1$, and $\gamma = 0$ is rejected. This is unsurprising given the earlier results when we did not include additional covariates (Table 1). However, we also calculate the p-value for the test whether the additional covariates have significant effects, that is, $\gamma = 0$. In most cases, this p-value is large, and considering the size of the dataset, a significance level of $\alpha = 0.01$ seems a reasonable cutoff point.

We conclude that the markets are not efficient in the semi-strong sense, because of both the favorite-longshot bias and the significance of some additional, publicly available covariates. Point estimates of the significant variables are given in Table 4. A variable significant in one market need not be significant in another market. The marginal effects have the same sign as the point estimate (with the exception of the odds). An increase in a variable with a negative point estimate reduces the probability of the event, keeping other factors such as odds constant. Interestingly, from Table 4 it is clear that the effects of some variables measuring recent performance are not fully incorporated in the odds. Financial returns on recent bets are significant, both in the home win market and the draw market, and so are points gained in recent matches in the market for home wins. Also, the timing of the match in the season is significant in the markets for home wins and draws. In particular, recent performance of the away team is not fully covered by the odds. The significant, positive effect of recent performance by the away team in the betting market for home wins implies that the true probability of a home win is higher if the away team has had a strong, unexpected, recent performance, keeping other factors constant. Apparently, the odds overestimate the persistence of a run of good performances by the away team. Whether this effect would result in profitable betting opportunities is an open question that is beyond the scope of this paper.
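
As a rough reading aid for Table 4, the sketch below converts a few of the reported point estimates and standard errors for the home win market into z-statistics and two-sided p-values. This is a per-coefficient Wald test under a normal approximation, which is an assumption made here for illustration; it is not the joint test reported in Table 3, so the two sets of p-values need not coincide. The function name is hypothetical.

```python
import math

def wald_p_value(estimate, std_err, null_value=0.0):
    """Return (z, two-sided p-value) for H0: coefficient = null_value."""
    z = (estimate - null_value) / std_err
    return z, math.erfc(abs(z) / math.sqrt(2.0))

# (label, estimate, std.err.) copied from Table 4, home win market.
table4_home_win = [
    ("return at", 0.070, 0.027),
    ("return at away", 0.052, 0.016),
    ("points ht", -0.014, 0.038),
    ("points at away", 0.087, 0.028),
]

ALPHA = 0.01  # cutoff used in the text
for label, est, se in table4_home_win:
    z, p = wald_p_value(est, se)
    print(f"{label:15s} z = {z:6.2f}  p = {p:.4f}  below cutoff: {p < ALPHA}")
```

The same helper can be pointed at the slope on the odds, for instance `wald_p_value(-1.222, 0.026, null_value=-1.0)`, to check the favorite-longshot restriction [[beta].sub.1]=-1 on its own.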

Conclusions

In this paper we assessed the informational content of odds posted in fixed odds betting markets. Using a methodological approach that addresses some issues of earlier approaches, and a large database, we showed that odds in these markets are not unbiased. In all three markets (home win, draw, and away win) we find a favorite-longshot bias: outcomes that are likely to occur according to the odds happen more frequently than expected. Based on this information we conclude that fixed odds betting markets are not efficient in the sense that prices are not unbiased estimates of the probability that events occur.

We also tested explicitly whether these markets are efficient in the semi-strong sense: is public information available that is not incorporated in the prices and helps to predict the likelihood of an event? We found such evidence in the markets for home wins and draws: odds suffer from the same favorite-longshot bias, and it seems that recent performance of the away team is not captured fully by the odds. The financial return on bets on (recent) away games of the away team is significant, and so is the number of points obtained by the away team in recent away games. A sequence of good results in away games is unlikely to last forever, though.

Appendix: The Favorite-Longshot Bias in the Logit Model

The estimated relation between odds and the actual probability is graphed in Figure 2. The solid line reflects the relation under the null hypothesis ([[beta].sub.0] = 0, [[beta].sub.1] = -1). The dashed line reflects the estimation results ([[beta].sub.0] = 0.1, [[beta].sub.1] = -1.2). The dashed line indicates that more likely events (those with low odds) have a higher probability of being observed than their price implies.

[FIGURE 2 OMITTED]
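
Since the figure is not reproduced here, the two curves can be tabulated instead. The sketch below assumes the logit specification Pr(Y = 1) = 1/(1 + exp(-([[beta].sub.0] + [[beta].sub.1] log(O - 1)))), which is consistent with the null hypothesis because it returns the price 1/O when [[beta].sub.0] = 0 and [[beta].sub.1] = -1. The parameter values are the ones quoted for the two lines; the function name and the grid of odds are illustrative.

```python
import math

def implied_probability(odds, beta0=0.0, beta1=-1.0):
    """Pr(Y = 1) under the assumed logit in log(odds - 1).

    The defaults correspond to the null hypothesis, where the result is 1/odds."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * math.log(odds - 1.0))))

print(" odds   price   null    estimated")
for odds in (1.25, 1.5, 2.0, 3.0, 5.0, 10.0):
    price = 1.0 / odds
    null = implied_probability(odds)                 # solid line
    estimated = implied_probability(odds, 0.1, -1.2)  # dashed line
    print(f"{odds:5.2f}  {price:6.3f}  {null:6.3f}  {estimated:9.3f}")
```

Under these assumptions the estimated probability lies above the price for short odds and below it for long odds, which is the favorite-longshot pattern described in the text.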

This favorite-longshot bias in the logit model can be derived more formally as follows. Consider the case with one additional covariate, so we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This can be written as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

with the factor f defined implicitly. If this factor is less than 1, the outcome probability Pr(Y = 1) is larger than the price 1/O, or the true odds are larger than the posted odds. The opposite holds when f exceeds 1. The favorite-longshot bias appears as follows. Consider a high-probability event, so log(O - 1) < 0. If [[beta].sub.1] < -1, we have exp(-([[beta].sub.1] + 1) log(O - 1)) < 1, so the probability of the event is larger than the price for high-probability events. This is what we find in the empirical results section. Furthermore, note that, if [gamma] > 0, an increase in z reduces f, so it exacerbates the favorite-longshot bias.
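
The two expressions that could not be reproduced can plausibly be reconstructed from the surrounding argument. The following is a reconstruction under the assumption that the model is a logit in log(O - 1) with a single covariate z; it is not copied from the original article, but it reproduces the price 1/O under the null hypothesis and the stated properties of f.

```latex
\begin{align}
\Pr(Y = 1)
  &= \frac{1}{1 + \exp\bigl\{-\bigl(\beta_0 + \beta_1 \log(O - 1) + \gamma z\bigr)\bigr\}} \\
  &= \frac{1}{1 + f \,(O - 1)},
  \qquad
  f = \exp\bigl\{-\beta_0 - (\beta_1 + 1)\log(O - 1) - \gamma z\bigr\}.
\end{align}
% Check: f (O - 1) = \exp\{-\beta_0 - \beta_1 \log(O - 1) - \gamma z\},
% so both lines agree. With \beta_0 = 0, \beta_1 = -1, \gamma = 0 we get
% f = 1 and \Pr(Y = 1) = 1/O. Because O - 1 > 0, f < 1 implies
% 1 + f (O - 1) < O and hence \Pr(Y = 1) > 1/O. Finally,
% \partial f / \partial z = -\gamma f < 0 whenever \gamma > 0.
```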

References

Buchdahl, J. (2003). Fixed odds sports betting. London, UK: High Stakes.

Cain, M., Law, D., & Peel, D. (2000). The favourite-longshot bias and market efficiency in UK football betting. Scottish Journal of Political Economy, 47(1), 25-36.

Dixon, M. J., & Pope, P. F. (2004). The value of statistical forecasts in the UK association football market. International Journal of Forecasting, 20, 697-711.

Forrest, D., Goddard, J., & Simmons, R. (2005). Odds-setters as forecasters: The case of English football. International Journal of Forecasting, 21, 551-564.

Goddard, J., & Asimakopoulos, I. (2004). Forecasting football results and the efficiency of fixed-odds betting. Journal of Forecasting, 23, 51-66.

Goddard, J., & Thomas, S. (2006). The efficiency of the UK fixed-odds betting market for Euro 2004. International Journal of Sport Finance, 1, 21-32.

Haigh, J. (1999). (Performance) index betting and fixed odds. The Statistician, 48(3), 425-434.

Harrell, Jr, F. E. (2001). Regression modelling strategies. New York, NY: Springer.

Van Perlo-ten Kleij, F., Steerneman, A. G. M., & Koning, R. H. (2006). Logical consistency and sum-constrained linear models. Retrieved from http://www.rhkoning.com

Kuypers, T. (2000). Information and efficiency: An empirical study of a fixed odds betting market. Applied Economics, 32, 1353-1363.

Malkiel, B. (1992). Efficient market hypothesis. In P. Newman, P. M. Milgate, & J. Eatwell (Eds.), New Palgrave dictionary of money and finance. London, UK: MacMillan.

Medema, L., Koning, R. H., & Lensink, B. W. (2009). A practical approach to validating a PD model. Journal of Banking and Finance, 33(4), 701-708.

Pagan, A. R. (1984). Econometric issues in the analysis of regressions with generated regressors. International Economic Review, 25, 221-247.

Pope, P. F., & Peel, D. A. (1989). Information, prices and efficiency in a fixed-odds betting market. Economica, 56, 323-341.

Sauer, R. D., Brajer, V., Ferris, S. P., & Marr, W. M. (1988). Hold your bets: Another look at the efficiency of the gambling market for National Football League games. Journal of Political Economy, 96(1), 206-213.

Vlastakis, N., Dotsis, G., & Markellos, R. N. (2009). How efficient is the European football betting market? Evidence from arbitrage and trading strategies. Journal of Forecasting, 28, 426-444.

Zuber, R. A., Gandar, J. M., & Bowers, B. D. (1985). Beating the spread: Testing the efficiency of the gambling market for National Football League games. Journal of Political Economy, 93(4), 800-806.

Author's Note

I thank David Forrest and Brad Humphreys for comments, as well as participants at the XIII IASE and III ESEA Conferences on Sports Economics in Prague. Three anonymous referees are thanked for their constructive criticism. An appendix with descriptive statistics of the dataset is available from http://www.rhkoning.com.

Ruud H. Koning

University of Groningen, The Netherlands

Ruud H. Koning, PhD, is a professor of sport economics in the Department of Economics, Econometrics and Finance. His research interests include sport, economics, statistics, and applied micro econometrics, with a special focus on models for decision making for financial institutions and insurance companies.
```
Table 1: Estimates of Single Logit Model for Soccer Bets, by Result

                     home win              draw             away win

                    est. std.err.      est. std.err.      est. std.err.

Belgium

[[beta].sub.0]     0.155    0.047     0.779    0.349     0.095    0.081
[[beta].sub.1]    -1.252    0.078    -1.839    0.336    -1.173    0.080
p - value          0.000              0.010              0.066

England

[[beta].sub.0]     0.164    0.041     0.151    0.245     0.118    0.067
[[beta].sub.1]    -1.143    0.061    -1.192    0.238    -1.314    0.068
p - value          0.000              0.421              0.000

France

[[beta].sub.0]     0.077    0.041     0.266    0.286     0.072    0.100
[[beta].sub.1]    -1.043    0.085    -1.284    0.322    -1.184    0.093
p - value          0.169              0.626              0.008

Germany

[[beta].sub.0]     0.060    0.045     0.172    0.349     0.061    0.085
[[beta].sub.1]    -1.124    0.079    -1.226    0.345    -1.060    0.081
p - value          0.195              0.401              0.750

Greece

[[beta].sub.0]     0.159    0.053     0.817    0.242     0.182    0.085
[[beta].sub.1]    -1.371    0.077    -1.847    0.238    -1.351    0.080
p - value          0.000              0.002              0.000

Italy

[[beta].sub.0]     0.140    0.043     0.670    0.180     0.118    0.077
[[beta].sub.1]    -1.310    0.069    -1.718    0.196    -1.281    0.073
p - value          0.000              0.001              0.000

Netherlands

[[beta].sub.0]     0.163    0.046     0.642    0.311     0.228    0.072
[[beta].sub.1]    -1.280    0.068    -1.754    0.289    -1.269    0.071
p - value          0.000              0.000              0.001

Portugal

[[beta].sub.0]     0.068    0.048     0.448    0.290     0.153    0.085
[[beta].sub.1]    -1.252    0.079    -1.407    0.286    -1.277    0.084
p - value          0.005              0.251              0.002

Spain

[[beta].sub.0]     0.111    0.040     0.686    0.294     0.028    0.076
[[beta].sub.1]    -1.153    0.072    -1.788    0.298    -1.045    0.073
p - value          0.010              0.004              0.794

Turkey

[[beta].sub.0]     0.087    0.045     0.503    0.301     0.070    0.076
[[beta].sub.1]    -1.153    0.071    -1.527    0.289    -1.123    0.074
p - value          0.038              0.132              0.212

Table 2: Estimates of Single Logit Model for Soccer Bets, by Result

                     home win              draw             away win

                    est. std.err.      est. std.err.      est. std.err.

[[beta].sub.0]     0.121    0.014     0.550    0.081     0.121    0.025
[[beta].sub.1]    -1.213    0.023    -1.600    0.081    -1.213    0.024
p - value          0.000              0.000              0.000

Table 3: p-values of Test of Efficiency with Additional Variables

                     home win                     draw                     away win

             [beta], [gamma]   [gamma]   [beta], [gamma]   [gamma]   [beta], [gamma]   [gamma]

base model        0.000                       0.000                       0.000
return            0.000         0.007         0.000         0.234         0.000         0.160
return H/A        0.000         0.000         0.000         0.004         0.000         0.365
time              0.000         0.026         0.000         0.009         0.000         0.780
points            0.000         0.000         0.000         0.027         0.000         0.251
position          0.000         0.223         0.000         0.150         0.000         0.594
goals             0.000         0.817         0.000         0.918         0.000         0.424
goals H/A         0.000         0.321         0.000         0.253         0.000         0.085
all               0.000         0.010         0.000         0.375         0.000         0.105

Table 4: Point Estimates of Significant Variables

                     home win              draw             away win

                    est. std.err.      est. std.err.      est. std.err.

return

[[beta].sub.0]     0.144    0.016     0.582    0.090     0.114    0.028
[[beta].sub.1]    -1.222    0.026    -1.650    0.090    -1.212    0.027
return ht         -0.048    0.026     0.019    0.028     0.040    0.029
return at          0.070    0.027    -0.046    0.029    -0.042    0.031

return H/A

[[beta].sub.0]     0.149    0.016     0.569    0.089     0.111    0.028
[[beta].sub.1]    -1.223    0.026    -1.641    0.089    -1.211    0.027
return ht home    -0.054    0.023     0.029    0.024     0.034    0.025
return at away     0.052    0.016    -0.055    0.018    -0.008    0.019

points

[[beta].sub.0]     0.187    0.049     0.576    0.100     0.127    0.066
[[beta].sub.1]    -1.277    0.033    -1.683    0.095    -1.248    0.035
points ht         -0.014    0.038     0.020    0.040     0.003    0.044
points at         -0.008    0.037     0.056    0.037    -0.037    0.042
points ht home    -0.055    0.027     0.016    0.029     0.046    0.031
points at away     0.087    0.028    -0.087    0.030    -0.008    0.031
```
COPYRIGHT 2012 Fitness Information Technology Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.