The development and evaluation of a hybrid decision support system for clinical decision making: The case of discharge from the military.
Key words: clinical judgment; decision support systems; expert rules; mental health officers; military
Social workers and other helping professionals make clinical judgments on a routine basis. Many of these decisions have important consequences for clients. Relying on clinical judgment and expertise, practitioners make daily decisions whether to accept clients for treatment, render services, hospitalize, discharge, or commit to involuntary care; such decisions can be of immeasurable significance to people's lives. Empirical study of the important judgments and decisions made in the helping professions has revealed that these judgments are fallible and prone to error (Gambrill, 1990; Gibbs & Gambrill, 1996; Nisbett & Ross, 1980). Almost all clinical decisions are made under some degree of uncertainty (Hogarth, 1980). Many factors in the decision-making process, such as incomplete or unreliable information relevant to the decision and errors or inconsistencies in the application of decision-making rules, introduce uncertainty. Much of the uncertainty results from the probabilistic nature of the outcomes of decisions: Decisions made in what seem to be identical situations can lead to different outcomes (Cooksey, 1996). Fallibility, therefore, is an inseparable feature of clinical judgment under uncertainty.
Our study focused on the decisions made by social workers, psychologists, and psychiatrists serving as military mental health officers (MHOs) to recommend discharge from compulsory duty in the Israeli military services because of mental or emotional difficulties. This decision has significant consequences for the individual soldier, as well as for the soldier's social environment. Compulsory service in the military is a major duty and right of all Jewish citizens in Israel (most non-Jewish citizens are exempt from compulsory service). A discharge on the basis of psychiatric dysfunction stigmatizes the individual in civilian life. Consequences range from difficulties in obtaining a driver's license or in securing employment to less blatant discriminatory attitudes in Israeli society. On the other hand, a misguided decision not to discharge a dysfunctional soldier may lead to dire outcomes such as psychotic breakdown, suicide, and violent behavior within the unit, which in turn could lead to failure in battle (Benbenishty, Zirlin-Shemesh, & Kaplan, 1993).
Military MHOs, the majority of whom are social workers, assess the soldier and decide whether or not to lower the soldier's medical rating on psychiatric grounds, thereby causing an immediate dismissal from the military. This decision is based on clinical judgment and not on any preset requirements or guidelines. Thus, it is open to the same problems in judgment identified in so many other contexts. Furthermore, given the consequences of the decision, the need to maintain consistency among MHOs making the decision is of special importance. It is unacceptable for a decision with such social implications to reflect the idiosyncratic style of a particular MHO rather than a consistent policy applied to all client soldiers (Shapira, 1990).
The consequences and limitations of judgments create a strong need to improve the judgment process, so that decisions will result in positive outcomes. One approach to this improvement in decision making is to sharpen the thinking skills of practitioners. Thus, for instance, Gambrill (1990) and Gibbs and Gambrill (1996) suggested teaching social workers "critical thinking" skills. Another approach is to implement and use a "decision support system" (Seilheimer, 1988; Shapira, 1990; Sprague & Watson, 1986). A decision support system (DSS) can be defined as a computerized user-interactive system that uses data or models or both to generate information that will support (and not replace) a decision maker (Eom & Lee, 1990; Seilheimer, 1988).
Although a DSS in clinical decision making is intended to make decision making more normative by eliminating indiscriminate or idiosyncratic judgments, it is important to emphasize that the decisions generated by the system are by no means meant to supplant those made by the clinical workers. In fact, optimal decisions combine the workers' analytical judgments with the DSS output (Seilheimer, 1988). If the DSS output differs from the worker's intended decision, the worker is compelled to rethink the decision and to retrace the steps that led him or her to it. It is this interaction between the worker and the DSS that is most effective in keeping decisions normative: The worker double-checks to make sure that his or her judgments are based only on relevant factors and that the decision is aligned with professional standards.
DSS outputs can range from simple taxonomic classification to complex scheduling or design strategies. This study focused on a DSS designed to recommend a decision based on given client information. There are several mechanisms a DSS can use to generate a recommended decision (Benbenishty, 1992). For the purposes of this study, two such mechanisms are described: (1) the statistical model and (2) the knowledge-based expert system (Clark, 1992).
Numerous studies have shown that statistical integration of information (statistical modeling) outperforms unaided clinical judgments (for example, Dawes, 1980, 1989; Dawes, Faust, & Meehl, 1980). A statistical model generally is a mathematical formula that calculates the value of a dependent variable on the basis of one or more independent variables. The predicted dependent variable becomes the output of the DSS and can be translated into a concrete recommended decision. The statistical model usually is designed by collecting large amounts of data on previous decisions, analyzing that data, and computing weights (such as coefficients in a multiple linear regression equation) assigned to the independent variables connected to that decision, the dependent variable. This process is often called "policy capturing" (Cooksey, 1996; Stewart, 1988).
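To make the policy-capturing procedure concrete, the following Python sketch fits a linear model to a handful of past decisions and scores cases against the captured weights. The data, variable names, and coding are invented for illustration and are not from the study.

```python
# Illustrative "policy capturing": fit a linear model to a set of past
# decisions, then use the captured weights to score cases.
# All data and variable names here are hypothetical.
import numpy as np

# Each row is one past case: [motivation for discharge, suicidal gestures,
# discipline problems], coded 1 = present, 0 = absent (invented coding).
X = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 1],
], dtype=float)
# The practitioner's past decision: 1 = discharge, 0 = do not discharge.
y = np.array([1, 1, 0, 0, 1, 0], dtype=float)

# Append an intercept column and solve ordinary least squares.
X1 = np.hstack([X, np.ones((len(X), 1))])
weights, *_ = np.linalg.lstsq(X1, y, rcond=None)

# The weights now "capture" the decision policy; a case is scored with the
# same linear combination and recoded at a cut-off (here 0.5).
scores = X1 @ weights
predicted = (scores >= 0.5).astype(int)
agreement = float((predicted == y).mean())
print(f"agreement with past decisions: {agreement:.0%}")
```

In practice the weights would be estimated from a large sample of past cases and then applied to new ones; the small round trip above only shows the mechanics of capturing and reapplying a decision policy.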
An alternative approach to statistical modeling is the expert system. Expert systems are based on attempts to model the thinking processes of experts (Goodall, 1989; Goodman, Gingerich, & De Shazer, 1989; Schuerman, 1987). For the most part, experts rely on two types of knowledge when making decisions. One type is "public" or textbook knowledge--facts that anyone can look up in some literature in some library. The other type of knowledge is "private," consisting of intuitive judgment and rules of thumb (heuristics) that enable the expert to make educated guesses and deal with the uncertainty of incomplete or inexact information (Sicoly, 1989). The strength of an expert system is that it bases its decision making on both types of information, and the logic underlying the decision-making process is available to the user. Expert systems contain an "inference engine," which applies expert rules to process the information and move from input data to a recommendation (Forsyth, 1989). Although expert systems are built on human experience, Clark (1992) found that, under certain conditions, expert systems "outperform" individual human experts.
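The inference-engine idea can be illustrated with a toy sketch in which each expert rule pairs a condition with a conclusion and the engine fires the first rule whose condition holds. The rules and field names below are hypothetical illustrations, not the rules elicited in this study.

```python
# A toy inference engine: each rule is (conclusion, condition), where the
# condition is a predicate over the case data. The engine returns the
# conclusion of the first rule that matches. Rules are hypothetical.

rules = [
    ("discharge", lambda case: case.get("drug_addiction", False)),
    ("likely discharge", lambda case: case.get("suicide_attempt", False)),
    ("continue service", lambda case: True),  # default when nothing fires
]

def infer(case):
    """Return the conclusion of the first matching rule."""
    for conclusion, condition in rules:
        if condition(case):
            return conclusion

print(infer({"drug_addiction": True}))   # → discharge
print(infer({"suicide_attempt": True}))  # → likely discharge
print(infer({}))                         # → continue service
```

A production expert system would add chaining among rules and an explanation trace (the "explainability" discussed below), but the rule-matching core is the same.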
The expert system mimics the human reasoning process more closely than statistical modeling does, yet the same predictions and performance can be achieved with both types of decision aid (Benbenishty, 1992). Given that fact, which should one choose in developing a DSS? Carroll (1987) argued that, because expert systems attempt to model and imitate human experts, they can perform at best as well as human experts. Given that human experts are outperformed by simple linear models in making decisions in uncertain environments, and given the degree of error that exists in the real world (and the cost of developing expert systems), linear models are a more cost-effective means of modeling expert reasoning (Clark, 1992). On the other hand, expert systems have many advantages, among them the "explainability" criterion--their ability to explain the logic behind their recommendations (Benbenishty, 1992).
Some people suggest that the choice between knowledge-based and statistical models be rooted in the characteristics of the decision situation (for example, Clark, 1992). Others suggest a hybrid--the combination of an expert system and a statistical model. Sicoly (1989) described such a possibility: "It is proposed that an integration of these procedures, exploiting their unique strengths, would enhance both the performance and acceptability of computer-aided supports to decision-making" (p. 47).
In other words, a DSS could contain aspects from each type of design approach. For example, a statistical model addresses the "normal" cases: When cases are straightforward and do not contain any extreme or unusual data, the statistical model can be an excellent predictor. However, those rare and extreme cases (or statistical outliers) overlooked by the model can be identified immediately and accurately by an expert (Sicoly, 1989). This hybrid approach was adopted in our study.
The aim of this study was to develop, implement, and evaluate a DSS, combining a statistical model and expert rules to support the decisions of MHOs in the Israeli Defense Forces regarding discharge on the basis of mental health difficulties. In a series of previous studies, we investigated various aspects of this particular decision (Benbenishty et al., 1993; Dekel, 1993; Zirlin-Shemesh, 1991). The most important aspect of our findings relevant to this study was our ability to predict the decision accurately based on a linear regression equation (Benbenishty et al., 1993).
Developing and then implementing a DSS in a clinical setting cannot be completed without evaluating its performance. There are two central components in the complete evaluation of a DSS: (1) the validation of the system (measuring whether the program satisfactorily performs the real-world tasks for which it was created) and (2) the evaluation of user acceptance (Preece, 1990). Regarding the validation of the system, Le Blanc (1987) discussed the concept of "critical success factors"--predetermined objective measures that verify whether the implementation of a DSS is successful. These factors (which can be anything from improved cost efficiency to uniformity in decision making) are meant to measure changes in organizational performance as a result of the implementation of the DSS.
Analyzing the DSS, in this case seeing if it was accomplishing what it was meant to do, involves checking that it produces normative and appropriate decisions. As mentioned previously, knowing whether a clinician makes the "right" decision is in many cases almost impossible, as is the situation in this context (no information on differential outcomes of the decision is available). However, because experience eventually standardizes clinical decision making, especially in recurrent situations, the actual decisions of the experts become the "norm" against which future decision making is measured (Shapira, 1990). Evaluation of DSS performance, therefore, is dependent on how close it comes to the normative decision-making behavior of the expert users (Andriole, 1989; Yates, 1990). More specifically, the critical success factor determining the validation of the DSS is a predetermined percentage of correct predictions of the experts' decisions (Furse, 1989; Preece, 1990; Shaw & Woodward, 1988; Sicoly, 1989).
Evaluation of a DSS does not stop at objective system performance. The DSS can be valid as far as decision prediction is concerned, yet the implementation may still fail. According to Seilheimer (1988) most DSS development and implementation efforts are not successful because of failure to secure user satisfaction and acceptance. When evaluating the implementation of a DSS, it is crucial to include an evaluation of user satisfaction with the system. This can be accomplished by using a survey approach, such as questionnaires or interviews administered during or after implementation and validation (Le Blanc, 1987).
RESEARCH AND DEVELOPMENT PROCESS
The development, implementation, and evaluation of this decision support system involved a three-stage process: (1) modeling the decisions of the mental health officers through statistical modeling (analysis of past decisions and prospective study of current decisions) and eliciting expert rules; (2) developing the computerized decision support system and interface; and (3) evaluating the DSS through empirical validation using a new "test" sample of present cases, supplementary empirical validation using old data from a past study, and assessment of user reactions.
MODELING OF DECISIONS OF MENTAL HEALTH OFFICERS
A preliminary study performed content analysis of existing clinical files. The study was described in detail by Benbenishty et al. (1993). Briefly, a stratified sample was used, consisting of 82 randomly selected files of male soldiers who were discharged and 72 cases who were not discharged. The content analysis was performed using a very detailed guide (17 chapters and more than 400 items) developed for that study. The findings indicated that the decision to discharge from the military could be predicted well: A simple linear model explained more than 73 percent of the variance in decisions. The predictive variables included information about the soldiers' social relationships before the military service, difficulties in school, mental health problems before the military service, discipline problems in the military, and suicidal behavior.
For this research we decided not to rely on the statistical model identified in the earlier study. We wished to develop a more current model. Furthermore, we wanted to overcome the limitations inherent in analyzing existing clinical files, which vary in the completeness of their documentation.
Procedure and Participants
Mental health officers in a central clinic filled out forms at the end of the intake process of soldiers referred to the clinic for a period of several months (exact periods and numbers of MHOs cannot be revealed because of security considerations). MHOs filled out 92 consecutive forms. Thirty-nine of them contained a recommendation for discharge, and 53 recommended continued military service.
The form developed for this stage followed the content analysis schedule developed in the earlier study but was much shorter, based on the findings of that study (Zirlin-Shemesh, 1991). Most of the questions and their response choices on the form were presented in a dichotomous yes-no format, making the filling-out process simple and efficient for the MHOs. The form covered 81 independent variables that could be classified under the following five headings: (1) family background and relationships, (2) social history of the soldier before his or her draft, (3) soldier's military service, (4) soldier's recent case history and psychological status at intake, and (5) present psychosocial diagnosis. The final variable on the form was the decision itself (the dependent variable)--the recommendation for discharge or for continued military service.
To identify the factors correlated with the decision, we conducted a series of bivariate correlations. Seventeen of the 81 independent variables had a statistically significant relationship with the dependent variable. Subsequently, multivariate procedures were performed to build an equation that reliably postdicted the recoded dependent variable. The 17 variables were entered into a stepwise multiple linear regression, which produced a predicted value Y between 0 (do not discharge) and 1 (discharge): Y = 0.295(motivation for discharge) + 0.198(suicidal gestures) + 0.50(military functioning) + 0.131(premilitary social ties) - 0.123(motivation at draft) - 0.359(motivation for treatment) - 0.396. Because multiple linear regression is not the most appropriate technique for predicting and postdicting a dichotomous variable, we used logistic regression (Hosmer & Lemeshow, 1989) throughout the study, parallel with the linear regression analysis. Overall, the findings from these sets of analyses were very similar. The predictive ability of the model based on the logistic regression was slightly lower than that of the linear regression equation (for example, 84.6 percent compared with 86.5 percent). To save space we report here only the findings of the more commonly known linear regression analyses. (All findings regarding the application of the logistic regression are available on request from the second author.)
The overall R² was .783. This figure compares favorably with the R² of .730 in the study by Benbenishty et al. (1993), which was achieved on the basis of content analysis of case files. After recoding the predicted variable with a cut-off point of 0.5 (because the actual decision was a dichotomous yes-no variable), it was possible to cross-tabulate the actual recommendations made by the practitioners with the postdictions of the regression equation. We found 93.5 percent agreement between the regression postdictions and the practitioners' actual recommendations.
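The recoding and cross-tabulation step can be sketched as follows. The eight cases below are invented; only the procedure (recode the model output at 0.5, then tabulate agreement with the actual recommendations) mirrors the one described in the text.

```python
# Recode the model's predicted values at the 0.5 cut-off and cross-tabulate
# them against the practitioners' actual recommendations.
# The case data below are invented for illustration.

actual = [1, 1, 0, 0, 1, 0, 1, 0]                         # 1 = discharge
y_hat = [0.83, 0.61, 0.22, 0.55, 0.90, 0.10, 0.70, 0.40]  # regression output

postdicted = [1 if y >= 0.5 else 0 for y in y_hat]

# 2 x 2 cross-tabulation: table[actual][postdicted]
table = [[0, 0], [0, 0]]
for a, p in zip(actual, postdicted):
    table[a][p] += 1

agreement = sum(a == p for a, p in zip(actual, postdicted)) / len(actual)
print(table)                         # → [[3, 1], [0, 4]]
print(f"{agreement:.1%} agreement")  # → 87.5% agreement
```

The off-diagonal cells of the table are the disagreements; the percentage agreement reported in the text is the diagonal share of this kind of table.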
Eliciting Expert Rules
As mentioned previously, the decision generator for the DSS in the IDF mental health clinic is a combination of a statistical model and expert rules. Because statistical models work on averages, they do not encompass extreme or unusual cases. It is therefore hoped that unusual but "obvious" cases missed by the statistical model will be covered by identified expert rules.
One way of eliciting expert rules is through "think-aloud" sessions (Ericsson & Simon, 1980). During such sessions, decision makers are asked to voice all thoughts that come to mind when making a decision--which variables they look at first and why, why they skip others, how they weigh the evidence, and so forth. The analysis of the protocols of such sessions is very enlightening with regard to the identification of unwritten rules that go into decision making (Benbenishty, 1992).
For this study there were three separate audiotaped hour-long interviews with two different MHOs. Both MHOs have had years of experience in this decision-making process and function on the supervisory level of the clinic. In each interview the MHO viewed data about one or more soldiers through a computerized information display board, which simultaneously recorded the information search pattern (Dekel, 1993). This computer interface was ideal for such a think-aloud session, because, although all of the information items on the soldier were available at the touch of the MHOs' fingertips, they may not have viewed it all, and what they did decide to view may have been in some order of importance. As they made a selection to view certain data about a case when making the decision, they were asked to verbalize why they chose that data and not other data and how they weighed that information.
Besides being asked to trace the decision-making process, the MHOs were asked two specific questions: (1) Which data (if any) about a soldier would precipitate a definite decision to discharge him or her from the army, regardless of other data? and (2) Which data (if any) are considered positive enough to override other facts that point to the discharge of a soldier?
Results of Think-Aloud Sessions
The most important conclusion resulting from the think-aloud sessions with the MHOs was that the decision-making process is far from simple and straightforward. In fact, it became obvious that not only does much information go into the decision, but the integration and consideration of that information is highly particular, depending on the individual case and on the orientation of the MHO (for complete and detailed documentation, see Dekel, 1993). Given that conclusion, it was very difficult to discern a substantial number of specific rules that could guide the formation of an expert system decision generator that must reflect "shared" rules.
Several rules, however, were identified from the content of the personal interviews with the MHOs. If a soldier is addicted to drugs, regulations require discharge on psychiatric grounds regardless of any other data on the soldier. If a soldier has attempted or threatened suicide, it is highly likely that he or she will be discharged (but not necessarily). If a soldier is known to have had severe psychological, emotional, or mental pathology in his or her past (yet was drafted anyway) or if the pathology appears during the military service, it is highly likely that the soldier will be discharged (the number of soldiers who fall under these two categories is quite low and estimated as being significantly lower than 10 percent). The final negative rule--one that leads to discharge--involves the soldier's motivation for discharge. For a case in which there is consideration of discharge, if the soldier very much wants to be released, discharge is highly probable. This rule results in a much larger group of soldiers, and informal estimates would put it close to one-third of the cases.
In addition to the previously mentioned negative rules, there is a positive rule that, when present, may keep a soldier from being discharged, even when other factors point to that decision. This rule involves family support: If the MHO has the impression that the family is willing to support the soldier in an effort to improve his or her situation within the framework of the military, the MHO will tend to recommend continued military service. It should be noted again that these rules were derived from only two workers, and more should be done to assess to what extent these decision rules are shared by other MHOs.
Of the five rules mentioned, two already appeared in the statistical model. Suicide issues and motivation for discharge did not need to be implemented as expert-system rules in the DSS generator, because these variables already exert a significant influence on the DSS output through the statistical model.
DEVELOPING THE COMPUTERIZED DECISION SUPPORT SYSTEM AND INTERFACE
Following the development of a statistical model and the identification of expert rules, a computer interface for the DSS was designed. The program itself was written in the database language Clipper 5.1 and consisted of five data input screens. The fields and their choices on the screens were identical to those in the original forms (in the first stage of the research and development process). Data were stored in a database and could be retrieved when necessary for further statistical analysis or report making. The final field on the final input screen was the MHO's recommendation for the soldier. After the MHO entered his or her choice into this field, the computer screen was cleared, and the recommendation generated by the DSS appeared. Thus, in this study, when the MHOs made their initial recommendations, they were not aware of the system recommendation.
It is imperative to note that the calculations of the recommendation, whether done by rules or by a model, were done completely behind the scenes as far as the system user was concerned. The user saw merely his or her input data and then a recommendation by the system. How the system generated the recommendation was not at all visible to the user.
The DSS generated its recommendation by combining the value calculated through the statistical model with the expert decision-making rules identified previously. The first step toward arriving at any decision was entering all data on the soldier into the screens. From the input data, the value of Y (the predicted decision) was then calculated with the regression equation presented previously; Y was a value between 0 (do not discharge) and 1 (discharge). Taking the result of the statistical model (Y), the DSS applied the expert rules and arrived at the recommendation in the following manner: discharge if Y > 0.6 or if the soldier is addicted to drugs or alcohol; discharge with reservation if (0.5 < Y < 0.6) and fewer than two family members show genuine support for the soldier; and do not discharge if Y < 0.5 or if (0.5 < Y < 0.6) and two or more family members show genuine support for the soldier.
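The hybrid decision logic just described can be sketched as a small function. The function and parameter names are ours, and the text does not specify how Y values of exactly 0.5 or 0.6 are handled; this sketch assigns 0.6 to the borderline region and 0.5 to "do not discharge."

```python
# A sketch of the hybrid decision generator: the regression output Y is
# combined with the two expert rules (drug or alcohol addiction, and the
# family-support override). Names and boundary handling are our assumptions.

def recommend(y, addicted, supportive_family_members):
    """Map the model output Y plus the two rule variables to a recommendation."""
    if addicted or y > 0.6:
        return "discharge"
    if y > 0.5:  # borderline region: 0.5 < Y <= 0.6
        if supportive_family_members >= 2:
            return "do not discharge"
        return "discharge with reservation"
    return "do not discharge"

print(recommend(0.80, False, 0))  # → discharge
print(recommend(0.55, False, 1))  # → discharge with reservation
print(recommend(0.55, False, 2))  # → do not discharge
print(recommend(0.30, True, 3))   # → discharge (rule overrides the low Y)
```

Note how the expert rules take effect only at the extremes and in the borderline region, leaving the statistical model to decide the clear-cut cases, which is exactly the division of labor proposed by the hybrid design.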
EVALUATING THE DSS
Empirical Validation Using a New Test Sample of Present Cases
Sample and Procedure. Fifty-two new cases were entered into the DSS to assess its ability to predict the recommendations given by the MHOs. In half of these cases, the recommendation was to discharge.
The input process was divided into two parts. In the first part, 20 cases were input into the computer by the researcher, with the MHO sitting alongside and selecting the value of each variable on the entry screens. In other words, the MHO performed all the necessary steps of the input process without actually pressing the keys of the computer. Each MHO who participated in this part of the study input at least two cases into the DSS in this manner. After the input, the MHOs were asked to comment on the DSS and its output.
Before the outset of this research project, it was determined that the new sample must contain at least 50 cases. However, after the input of the first 20 cases by six MHOs, it was very difficult to continue the study in this manner because of internal employment and time constraints in the Mental Health Clinic at that time. It was therefore decided in the second part of the input process that for the remaining 32 cases the researcher would retrieve the data directly from the soldiers' files without the intervention of the MHOs. Having access to the soldiers' files provided an element of control regarding the type of cases selected for the validation: Only the most recent cases were selected, and the number of cases in which soldiers were discharged (26) was controlled to match the number of soldiers who were not discharged. Because the information required for the DSS was minimal, missing information was not a major problem. In the few cases in which additional information was needed, we received it from the relevant MHOs.
Empirical validation of the DSS involved a statistical analysis of the "hits" (the cases in which the decisions of the DSS matched those of the users) compared with the "misses." A DSS can be considered valid if the proportion of hits exceeds a predetermined ratio or percentage of the total number of cases. Previous studies of decision-making behavior in the clinic found that, when present decisions were compared with decisions made on the same cases in the past, or when interrater reliability on the same decisions was examined, agreement fell at around 70 percent and never exceeded 75 percent (Benbenishty et al., 1993; Dekel, 1993). One cannot expect a system to agree with the MHOs much more often than the MHOs agree with themselves and with one another. However, because a DSS can perform better than individual experts (Clark, 1992) and because decision makers may regard the DSS output highly, it is preferable to set one's expectations of system performance higher than one's expectations of the decision makers. We therefore set the predetermined proportion at 75 percent.
Results. Forty-five of the 52 cases were hits; that is, 86.5 percent of the cases were postdicted correctly by the DSS. A significance test against the baseline of 75 percent hits yielded p < .01.
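One way such a hit rate can be tested against the predetermined baseline is with an exact one-sided binomial test, sketched below. The article does not state which significance test was used, so this is only an illustrative choice and may not reproduce the reported p value.

```python
# Test the 45/52 hit rate against the predetermined 75 percent baseline
# using an exact one-sided binomial test (our choice of test; the text
# does not specify which test was actually applied).
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

hits, n, baseline = 45, 52, 0.75
hit_rate = hits / n
p_value = binom_sf(hits, n, baseline)
print(f"hit rate: {hit_rate:.1%}")  # → 86.5%
print(f"one-sided p: {p_value:.3f}")
```

The null hypothesis is that the DSS matches the MHOs no more often than the 75 percent baseline; a small p value supports the claim that the observed 86.5 percent exceeds that baseline by more than chance.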
Supplementary Empirical Validation Using Old Data from a Past Study
Sample and Method. The sample used in this part was 185 soldiers' files analyzed in the study by Benbenishty et al. (1993). Each of the cases in the sample was entered into the DSS (with some minor recoding to fit the needs of the new study). Here, the hit-miss check examined how well the DSS postdicted the decisions that were made about three years before the present study.
Results. The DSS produced hits in 130 of the 185 cases, an agreement rate of 70.3 percent. This additional validation shows that, even with older data, the DSS was as consistent as the decision makers themselves.
Assessment of User Reactions
Sample and Method. Each of six MHOs, all of whom had considerable experience in making this particular type of decision, participated in the input of several cases. After obtaining the recommendation of the DSS in at least two cases, the MHOs were asked to comment on the DSS by answering several questions:
* What is your general reaction to the feedback given by the DSS?
* If the DSS output differed from yours, would you reconsider your decision?
* If the DSS were sitting in the clinic available for any user, would you use it to examine any cases?
* What are your general comments on ways to improve the DSS?
Results. The overall reaction to the recommendations given by the DSS was positive. Five of the six MHOs seemed to derive some pleasure when the machine gave the same recommendation as they did. One MHO went so far as to exclaim, "Great!" when she saw that the recommendation of the DSS matched hers. The one MHO who did not have such a positive reaction commented, "There are just too many factors that go into the decision by a human being that the DSS cannot possibly cover them." He further stated that although the process of inputting the data is very healthy and positive for the decision maker (in that it forces the decision maker to review the information again), the recommendation given at the end is "worthless." To prove his point, this MHO took it upon himself to find a case that would stump the DSS--a task he accomplished.
In response to the question about reconsidering the decision if the decision made by the DSS were different, three of the MHOs commented that they would definitely reconsider their decision. Two MHOs specifically stated that if they recommended discharge but the DSS did not, they would feel "compelled" to rethink their recommendation. Two of the MHOs did not feel that the DSS and its recommendation were convincing enough to necessitate a re-evaluation of the decision. The other four MHOs felt more confident in their decision when the DSS output matched their recommendation.
If the DSS were presently available for regular use in the Mental Health Clinic, only two of the MHOs said they would consult it, and only for cases in which they felt ambivalent about the decision. The other four MHOs showed an overwhelming reliance on themselves as decision makers and were almost insulted at the prospect of consulting a computer for support in their decisions. The general feeling from the MHOs was that if a DSS were sitting in the clinic for use at all times, very few MHOs (if any) would approach it voluntarily.
All six MHOs had numerous suggestions about how to improve the DSS (although none stated that such changes would cause him or her to want to use it more). The suggestions covered issues ranging from the computer interface to the generality of the recommendation itself. One MHO claimed that the recommendation was not precise enough: If the recommendation were for discharge, what would be the severity or strength of this recommendation? This MHO thought the intensity of the recommendation should appear as a percentage or even graphically as a shade of color. The MHO also believed that the recommendations were too general. Another MHO conceded that it is the MHO's task to recommend for or against discharge but noted that there is more involved: The DSS should be able, the MHO added, to recommend treatment when necessary and the specific type of treatment, based on the input variables. Yet another MHO insisted that the DSS should tell the user which variables to recheck when the recommendations of the DSS and the user did not match. Some comments concerned the independent variables: The categorical choices in each were too "big" or too general, according to some, and perhaps should have been more detailed. Taken together, the variables should paint a clearer picture of the soldier than they did at the time.
In conclusion, it was obvious that in the minds of the MHOs, the DSS had room for improvement. However, even if improvements were introduced into the DSS, it did not seem that the MHOs would change their basic approach to the system. Five of the six did not appear to be convinced by the concept underlying the DSS, namely, decision support. Their own decision making had, until now, seemed dependable to them, so the added dimension of a computer-generated recommendation seemed superfluous.
After our study of the development, implementation, and evaluation of a DSS in the IDF Mental Health Clinic, we reached a two-part conclusion: (1) as a tool for predicting or replicating real decisions in the mental health clinic, the DSS proved valid, on the basis of a high percentage of agreement in the new sample of cases and an adequate percentage of agreement in the sample of older cases; and (2) despite this accuracy in decision making, the user assessment component of the study suggested that the DSS was not accepted as a tool for decision support.
One main objective of this study was to improve the DSS based on a statistical model by integrating expert rules that lay outside the results of the model (as suggested by Sicoly, 1989). When we analyzed the MHOs' think-aloud sessions, it became apparent that they applied some consistent rules of thumb in their decisions. For example, every case in which a soldier was addicted to drugs resulted in a decision for discharge. However, because there were relatively few drug users in the original sample, the statistical models did not incorporate such a variable. Because this rule was solid enough to base a decision on but in danger of being excluded from the model, it had to be identified separately and built into the DSS. The think-aloud sessions thus provided a safety net against the exclusion of important variables. In contexts in which outliers are more frequent, the advantages of adding such rules to the statistical equation would be even more pronounced.
Another advantage of the integration between statistical models and expert-system principles is the use of expert rules when the statistical analysis does not provide a strong enough recommendation in either direction. That difficult midpoint can be a trigger to implement expert rules so that a definite recommendation can be made despite the ambiguity of the statistical midpoint. Using expert rules in such cases would replicate the everyday practice of calling on experts in difficult cases. Thus, this study indicated that this hybrid approach can and should be implemented.
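The hybrid logic described in the two paragraphs above can be sketched in a few lines. This is only an illustrative sketch: the predictor names, weights, and cutoffs below are hypothetical and do not reproduce the actual variables or statistical model of the DSS in this study.

```python
import math

# A minimal sketch of the hybrid approach: a statistical score combined
# with hand-elicited expert rules. All predictors, weights, and cutoffs
# are hypothetical illustrations, not the model fitted in the study.

def statistical_score(case):
    """Stand-in for the statistical model: returns P(discharge)."""
    z = (1.2 * case.get("prior_psychiatric_care", 0)
         + 0.9 * case.get("severe_adjustment_problems", 0)
         - 0.7 * case.get("motivation_to_serve", 0))
    return 1.0 / (1.0 + math.exp(-z))

def expert_rule(case):
    """Rules of thumb elicited in think-aloud sessions (illustrative).

    For example, every drug-addiction case in the sample ended in
    discharge, but the variable was too rare to enter the model.
    """
    if case.get("drug_addiction", 0):
        return "discharge"
    return None

def hybrid_recommendation(case, low=0.4, high=0.6):
    # 1. Expert rules act as a safety net outside the model.
    rule = expert_rule(case)
    if rule is not None:
        return rule
    # 2. Otherwise the statistical score decides...
    p = statistical_score(case)
    if p >= high:
        return "discharge"
    if p <= low:
        return "retain"
    # 3. ...unless it lands in the ambiguous midpoint, which triggers
    # further expert input, mimicking referral to an expert colleague.
    return "refer for expert review"

print(hybrid_recommendation({"drug_addiction": 1}))       # rule fires
print(hybrid_recommendation({"motivation_to_serve": 1}))  # model decides
```

The design choice mirrors everyday practice: the model handles the routine middle of the distribution, while rules cover patterns too rare for the sample and cases the model cannot settle decisively.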
There were two categories of limitations in this study that must be addressed: One pertains to the development of the DSS, and the other to its evaluation. MHOs filled out 92 forms, which were the basis of the statistical models. The low number of cases calls into question the power of certain types of statistical analyses. Because of the low number of cases on which the models were based, adding only a few eccentric cases to the sample can drastically alter the nature of the statistical models to the point that they may no longer reflect reality. Such a limitation can be addressed in future studies by analyzing the "residuals" to ensure that unique cases do not overwhelm the composition of the statistical models (Cooksey, 1996).
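The residual check suggested above (after Cooksey, 1996) can be sketched as follows. The data and cutoff are invented for illustration; the point is only that, with a sample as small as 92 cases, a few eccentric cases can dominate a fitted model and should be flagged for inspection.

```python
import statistics

def fit_ols(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return b, my - b * mx

def flag_outliers(xs, ys, z_cut=2.0):
    """Return indices of cases whose residual exceeds z_cut std. deviations."""
    b, a = fit_ols(xs, ys)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = statistics.stdev(resid)
    return [i for i, r in enumerate(resid) if abs(r) > z_cut * sd]

# Toy data: one eccentric case (index 4) sits far from the trend and
# would pull the fitted line toward itself in a small sample.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 2.0, 2.9, 4.2, 15.0, 6.1, 7.0, 8.2]
print(flag_outliers(xs, ys))  # flags the eccentric case
```

In practice the same check would be run on the actual regression or logistic model behind the DSS, with flagged cases reviewed by hand rather than silently retained or dropped.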
Another limitation regarding the DSS development involved the low number of think-aloud sessions with the experts. Having more opportunities to talk to more decision makers about the mechanisms behind their decisions may have yielded more rules that were not encompassed in the model. The number of rules that emerged from the sessions was very small in relation to the many factors that can affect the MHOs' decision-making behavior. Although the addition of more rules may not have improved the predictive validity of the DSS, it could increase the acceptability of the DSS by practitioners who felt uncomfortable with a DSS that did not seem to reflect the complexities of their thought process.
The validity of this DSS was established, but user acceptance was not. Although the users' evaluations were moderately positive, their willingness to use such a system in the future was minimal. There are several explanations for this lack of enthusiasm. First, there were the reasons stated by the MHOs themselves. It was widely believed that, because so many factors enter into the decision (including the personality, mood, and idiosyncratic style of the decision maker), the DSS could not possibly incorporate all of them; it was preferable to rely only on oneself and not on an impassive machine that cannot think as extensively as a human being. Numerous studies have shown that this view is mistaken: Computer integration of information outperforms human processing of the same information (Fischhoff, Goitein, & Shapira, 1983; Sicoly, 1989). Nevertheless, as long as such findings are not accepted by potential users of a DSS, it will be difficult to win their approval of the system.
Another possible reason that the MHOs did not accept the system was that it was not user friendly. The MHOs were inexperienced with computers and most likely required a DSS that had more of an interactive dialogue with the user (Andriole, 1989). Improving the dialogue between the user and the DSS was actually one of the suggestions made by an MHO for improvement of the system.
However, there appears to be a means by which user acceptance of a DSS can be increased, and it can be found in the fact that every MHO had at least one suggestion for how to improve the DSS. When potential users are part of the design team, or when they feel at least involved in the development of the system, their willingness to use the resulting DSS increases considerably (Andriole, 1989; Seilheimer, 1988; Zinkhan, Joachimsthaler, & Kinnear, 1987). It is possible that the development of the DSS in the military Mental Health Clinic was too isolated from the MHOs.
This unsuccessful implementation can be seen from the wider perspective of other attempts to introduce change, especially changes relevant to information technology and structured decision making (for example, Gleeson, 1987; Mandell, 1989; Neugeborn, 1991, 1995). Because of space limitations we will mention just one important factor: lack of management commitment and support. Although the project received all the necessary approvals and blessings of upper management, it was never perceived as being backed by management. The MHOs' participation was strictly voluntary. The study was seen as an "interesting project" conducted by a researcher associated with the Military Department of Mental Health (the first author), with no commitment for implementation regardless of the project's outcomes. Under these circumstances the computer system died immediately after we submitted the report.
It is important to note that some time after the project was completed, shelved, and apparently forgotten, it experienced somewhat of a revival. Recently, the Israeli public and the IDF became aware of the magnitude, characteristics, and social impact of decisions to discharge soldiers from the military because of psychological difficulties. Consequently, there were calls to use this DSS to support decisions in this area. It is too early to comment on this new effort, but on the basis of our experience we can predict that interest by the upper echelons is a necessary but insufficient condition for the adoption of a DSS. Much has to be done to transform a valid DSS into a system accepted by clinicians as part of their everyday work.
References

Andriole, S. J. (1989). Handbook of decision support systems. Blue Ridge Summit, PA: Tab Books.
Benbenishty, R. (1992). An overview of methods to elicit and model expert clinical judgment and decision making. Social Service Review, 66, 598-616.
Benbenishty, R., Zirlin-Shemesh, N., & Kaplan, Z. (1993). Policy capturing: Discharge from the Israeli army due to mental difficulties. Military Psychology, 5(3), 159-172.
Carroll, B. (1987). Expert systems for clinical diagnosis: Are they worth the effort? Behavioral Science, 32, 274-292.
Clark, D. A. (1992). Human expertise, statistical models, and knowledge systems. In G. Wright & F. Bolger (Eds.), Expertise and decision support (pp. 227-249). New York: Plenum Press.
Cooksey, R. W. (1996). Judgment analysis: Theory, methods and applications. San Diego: Academic Press.
Dawes, R. M. (1980). Apologies for using what works. American Psychologist, 35, 678.
Dawes, R. M. (1989). Experience and validity of clinical judgment: The illusory correlation. Behavioral Science and the Law, 7, 457-467.
Dawes, R. M., Faust, D., & Meehl, P. (1989). Clinical versus actuarial judgment. Science, 243, 1668-1674.
Dekel, R. (1993). Process tracing: Discharge of soldiers from IDF on psychiatric grounds. Unpublished MSW thesis, Hebrew University, Jerusalem.
Eom, H. B., & Lee, S. M. (1990). A survey of decision support systems applications (1971-April 1988). Interfaces, 20(3), 65-79.
Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215-251.
Fischhoff, B., Goitein, B., & Shapira, Z. (1983). Subjective expected utility: A model of decision making. In R. W. Scholz (Ed.), Advances in psychology: Decision making under uncertainty (pp. 183-207). Amsterdam: Elsevier Science.
Forsyth, R. (1989). The expert systems phenomenon. In R. Forsyth (Ed.), Expert systems: Principles and case studies (pp. 3-21). London: Chapman & Hall Computing.
Furse, G. (1989). Debugging knowledge base. In R. Forsyth (Ed.), Expert systems: Principles and case studies (pp. 184-196). London: Chapman & Hall Computing.
Gambrill, E. (1990). Critical thinking in clinical practice. San Francisco: Jossey-Bass.
Gibbs, L., & Gambrill, E. (1996). Critical thinking for social workers: A workbook. Thousand Oaks, CA: Pine Forge Press.
Gleeson, J. P. (1987). Implementing structured decision-making procedures at child welfare intake. Child Welfare, 56(2), 101-112.
Goodall, A. (1989). An introduction to expert systems: Principles and case studies. In R. Forsyth (Ed.), Expert systems: Principles and case studies (pp. 22-30). London: Chapman & Hall Computing.
Goodman, H., Gingerich, W., & De Shazer, S. (1989). Briefer: An expert system for clinical practice. In R. Cnaan & P. Parsloe (Eds.), The impact of information technology on social work practice (pp. 53-68). New York: Haworth Press.
Hogarth, R. M. (1980). Judgment and choice: The psychology of decision. New York: John Wiley & Sons.
Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression. New York: John Wiley & Sons.
Le Blanc, L. (1987). An analysis of critical success factors for a decision support system. Evaluation Review, 11, 73-83.
Mandell, S. F. (1989). Resistance and power: The perceived effect that computerization has on a social agency's power relationships. Computers in Human Services, 4, 29-40.
Neugeborn, B. (1991). Organization, policy and practice in the human services. Binghamton, NY: Haworth Press.
Neugeborn, B. (1995). Organizational influences on management information systems in the human services. Computers in Human Services, 12, 295-310.
Nisbett, R., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice Hall.
Preece, A. D. (1990). Toward a methodology for evaluating expert systems. Expert Systems, 7, 215-223.
Schuerman, J. (1987). Expert consulting systems in social welfare. Social Work Research & Abstracts, 23(3), 14-18.
Seilheimer, S. D. (1988). Current state of DSS and ES technology. Journal of Systems Management, 39(8), 14-19.
Shapira, M. (1990). Computerized decision technology in social services: Decision support system improves decision practice in youth probation service. Journal of Sociology and Social Policy, 10, 138-153.
Shaw, M. L., & Woodward, J. B. (1988). Validation in a knowledge support system: Construing and consistency with multiple experts. International Journal of Man-Machine Studies, 29, 329-350.
Sicoly, F. (1989). Computer-aided decisions in human services: Expert systems and multivariate models. Computers in Human Services, 5, 47-60.
Sprague, R. J., & Watson, H. J. (1986). A framework for the development of decision support systems. In R. J. Sprague & H. J. Watson (Eds.), Decision support systems: Putting theory into practice (pp. 7-34). Englewood Cliffs, NJ: Prentice Hall.
Stewart, T. R. (1988). Judgment analysis: Procedures. In B. Brehemer & C. R. Joyce (Eds.), Human judgment: The SJT view (pp. 41-74). Amsterdam: Elsevier Science.
Yates, J. F. (1990). Judgment and decision making. Englewood Cliffs, NJ: Prentice Hall.
Zinkhan, G. M., Joachimsthaler, E. A., & Kinnear, T. C. (1987). Individual differences and marketing decision support system usage and satisfaction. Journal of Marketing Research, 24, 208-214.
Zirlin-Shemesh, N. (1991). Information use in the decision whether to discharge from IDF for psychiatric reasons. Unpublished MSW thesis, Hebrew University, Jerusalem.
Part of the work on this manuscript was completed while the first author was employed by the Israeli Defense Force (IDF). The views expressed in this manuscript are the views of the authors. They do not necessarily represent the views and policies of the IDF. The manuscript is based to a large extent on the second author's MSW thesis prepared in the Paul Baerwald School of Social Work under the supervision of the first author. The authors wish to acknowledge gratefully the financial support of the Israeli Defense Ministry and to thank all the practitioners who participated in this study.
Original manuscript received November 13, 1997; final revision received May 14, 1998; accepted July 21, 1998.
Rami Benbenishty, PhD, ACSW, is associate professor, School of Social Work, Hebrew University, Mount Scopus, Jerusalem, Israel 91905; e-mail: firstname.lastname@example.org. Robin Treistman, MSW, ACSW, is a social worker and director, Neve Daniel Seminary, Jerusalem.
Authors: Benbenishty, Rami; Treistman, Robin
Publication: Social Work Research
Date: December 1, 1998