Improved trade classification rules: estimates using a logit model based on misclassified data.
Because most high-frequency data sets do not provide information on trade direction, researchers have developed the tick rule, the quote rule, and the LR rule to classify each trade as either a "buy" or a "sell." Recent research has begun to examine how accurately these rules predict trade direction, and in particular whether empirical work based on misclassified trade data is biased. Studies by Ellis, Michaely, and O'Hara [Journal of Financial and Quantitative Analysis, 31, 2000, pp. 529-51] and Finucane [Journal of Financial and Quantitative Analysis, 31, 2000, pp. 553-76] establish that the commonly used quote rule, tick rule, and LR rule do not predict trade direction very well, sometimes classifying "buys" as "sells" and vice versa. Finucane shows that using the predictions from these rules in empirical studies can lead to biased parameter estimates. Both studies also show that the prediction errors of the three rules are systematically related to several exogenous variables.
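The three rules of thumb named above can be sketched in a few lines of Python. This is a minimal illustration of the textbook definitions, not the authors' code: the function names and the `BUY`/`SELL` encoding are assumptions, and the sketch takes the quotes aligned with each trade as given, leaving out any quote-timing adjustment that particular implementations of the LR rule apply.

```python
from typing import List, Optional

BUY, SELL = 1, -1  # illustrative encoding; None means "unclassified"


def tick_rule(prices: List[float]) -> List[Optional[int]]:
    """Classify each trade from the direction of the last price change."""
    labels: List[Optional[int]] = []
    last_direction: Optional[int] = None
    for i, price in enumerate(prices):
        if i > 0:
            if price > prices[i - 1]:
                last_direction = BUY    # uptick
            elif price < prices[i - 1]:
                last_direction = SELL   # downtick
            # zero tick: carry the previous direction forward
        labels.append(last_direction)
    return labels


def quote_rule(price: float, bid: float, ask: float) -> Optional[int]:
    """Classify a trade by comparing its price with the quote midpoint."""
    midpoint = (bid + ask) / 2
    if price > midpoint:
        return BUY
    if price < midpoint:
        return SELL
    return None  # trades at the midpoint are left unclassified


def lr_rule(prices: List[float], bids: List[float],
            asks: List[float]) -> List[Optional[int]]:
    """Quote rule first; fall back to the tick rule for midpoint trades."""
    ticks = tick_rule(prices)
    labels = []
    for price, bid, ask, tick in zip(prices, bids, asks, ticks):
        by_quote = quote_rule(price, bid, ask)
        labels.append(by_quote if by_quote is not None else tick)
    return labels


if __name__ == "__main__":
    prices = [10.00, 10.25, 10.25, 10.00, 10.125]
    bids = [10.00] * 5
    asks = [10.25] * 5
    print(tick_rule(prices))
    print(lr_rule(prices, bids, asks))
```

The sketch makes the source of misclassification visible: each rule infers direction from price movements or quote position alone, so a buy executed on a downtick or below the midpoint is labeled a sell.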
Together, these studies establish the usefulness of good predictions of trade direction and show that the prediction errors of the rules of thumb are systematic. These facts lead the authors to develop a multinomial logit model based on the misclassified trade data given by the quote rule, tick rule, and LR rule. The aim is a model capable of predicting trade direction more accurately than any of the methods currently in use. To that end, the authors develop and estimate a logit model that allows for misclassification in the dependent variable and includes several explanatory variables used by Finucane. The model is estimated using the NYSE TORQ database, in which the true trade direction is known, so its predictive performance can be assessed directly. Unfortunately, the model fails to yield improvements over the tick and LR rules. The disappointing performance is likely due to the weak set of explanatory variables used in the model: any model for misclassified data can only be as good as its explanatory variables. The authors believe that their approach is sound and that the performance of the model can be improved by adding more informative explanatory variables. (JEL G18)
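For readers unfamiliar with logit models for a misclassified dependent variable, one standard binary formulation is sketched below. It is offered only as an illustration of the general idea and is not necessarily the specification the authors estimate. The symbols are assumptions introduced here: $y_i^{*}$ is the true trade direction (1 for a buy), $y_i$ is the direction assigned by a classification rule, $x_i$ is the vector of explanatory variables, $\alpha_0$ is the probability that a true sell is recorded as a buy, and $\alpha_1$ is the probability that a true buy is recorded as a sell.

```latex
\begin{align}
  \Pr(y_i^{*} = 1 \mid x_i) &= \Lambda(x_i'\beta)
    = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \\
  \Pr(y_i = 1 \mid x_i) &= \alpha_0 \bigl[1 - \Lambda(x_i'\beta)\bigr]
    + (1 - \alpha_1)\,\Lambda(x_i'\beta) \\
  &= \alpha_0 + (1 - \alpha_0 - \alpha_1)\,\Lambda(x_i'\beta).
\end{align}
```

Under this setup the parameters $\beta$, $\alpha_0$, and $\alpha_1$ are estimated jointly by maximum likelihood from the observed labels $y_i$. Because $\beta$ is identified only through the variation in $x_i$, the quality of the predictions depends heavily on how informative the explanatory variables are.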
|Author:||Caudill, Steven B.; Marshall, Beverly B.; Garner, Jacqueline|
|Publication:||Atlantic Economic Journal|
|Date:||Sep 1, 2004|