Proposals to enhance the quality of observational cohort studies.


For herbal medicinal products the methodology of observational cohort studies (observational studies, drug monitoring studies, Anwendungsbeobachtung) represents a useful addition to clinical trials. The key objectives are the documentation of efficacy, in particular under conditions of everyday medical practice in authentic patients, and the documentation of the safety profile. Supplementary to earlier activities, members of the "Clinical Trials of Herbal Medicinal Products" Working Party of the German Society for Phytotherapy have therefore again addressed the issue of observational cohort studies in order to enhance the informative value and importance of this clinical research methodology. Recommendations were developed on quality aspects, on methodological approaches to observation parameters, and on the reporting of study results. Properly planned and conducted observational cohort studies may contribute to the documentation and proof of well-established medicinal use according to EU Directive 2001/83/EC.

Key words: herbal medicinal products, observational cohort studies, Anwendungsbeobachtung, well-established medicinal use, efficacy, safety, quality aspects

* Introduction

According to international consensus, ongoing clinical research on medicinal products that is conducted under practical conditions after marketing approval is termed Phase IV research, a category that is covered in English by the term post-marketing surveillance (PMS). In terms of the methods involved, PMS includes Phase IV clinical trials as well as observational cohort studies (OCS), sometimes also referred to as observational studies or drug monitoring studies (Benson and Hartz 2000, MCA 1994, Waller et al. 1992). The German term for an OCS is Anwendungsbeobachtung (AWB).

In the light of Article 4 no. 8 a) ii of EU Directive 75/318/EEC and, as amended, of Commission Directives 1999/83/EC and 2001/83/EC where reference is made to "well-established medicinal use", OCS occupy a special place in the sphere of herbal medicinal products. According to these directives, proof of well-established medicinal use may be furnished with bibliographical references (publications). Along with post-marketing studies in the true sense, bibliographical applications may also include OCS, epidemiological studies and comparative epidemiological studies. It should be stated unambiguously that "true" Phase II-III or even Phase IV clinical trials do not reflect routine medicinal use because patient selection is based on strict inclusion and exclusion criteria, on informed consent, and on the a priori definition of diagnostic and therapeutic measures. Well-established medicinal use can be documented only by study designs that also take account of routine practice, prescribing behaviour and patients' use of self-medication remedies. To this extent an OCS that is carefully planned, conducted, analysed and documented or reported has a crucial role in proving well-established medicinal use.

The key objectives of Phase IV research as a whole are as follows (BfArM 1998, Denes and Gorbauch 1990, GPHY 1996, Kraft et al. 1997, Victor et al. 1997):

* To obtain new data on efficacy, in particular under conditions of everyday medical practice in authentic patients; to extend the patient populations studied; to identify desired long-term effects; and to compare the treatment with other pharmacological or non-pharmacological measures.

* To obtain new data on safety; to identify rare adverse events; to discover interactions; and to identify unwanted long-term effects.

* To find new indications: examples that may be cited here include the cardioprotective platelet aggregation-inhibiting activity of acetylsalicylic acid, the antihypertensive activity of thiazide diuretics/beta-blockers/calcium antagonists, and the lipid-lowering effects of artichoke extracts.

* To provide data on practicability of use, and patient acceptance and compliance.

* To generate data on prescribing behaviour under conditions of everyday medical practice.

Apart from specific studies conducted to demonstrate new or extended indications, all these objectives can be met using the methodology of the OCS.

In the past, and still today, the medical profession, scientists and the regulatory authorities have had only a very low opinion of the scientific value of OCS. Extensive misuse of OCS, predominantly in the sphere of product marketing, brought the method into disrepute. In particular, studies were designed with a purely marketing emphasis, i.e. to promote the use of the products by the medical profession. The principal concern in such studies was not to carefully gather scientific data but to heighten familiarity with the product. OCS of this kind (if published at all) are worthless as further data for inclusion in a scientific assessment. Furthermore, many OCS published in the past have provided minimal information or details on planning and conduct, making it virtually impossible to comment on or assess their content.

However, together with open, uncontrolled studies (many of which should instead be reclassified as OCS when the chosen study design is assessed retrospectively by present-day standards) from previous years, OCS have been an important and popular method of obtaining data on the use of medicinal products that have been licensed or granted marketing approval. For example, the results obtained made a crucial contribution to the acceptance of the herbal medicinal products in question for inclusion in the German Commission E-prepared herbal drug monographs up to 1994.

* Recommendations issued by expert bodies and the BfArM

In the light of this unsatisfactory situation various working parties in the past have focused on the methodology of OCS and have developed recommendations for their conduct, analysis and results presentation (Honig et al. 1998, Linden et al. 1994, Manniche et al. 1994, Selbmann 1996, Victor et al. 1997). A Working Party of the German Society for Phytotherapy has also elaborated recommendations, in this case relating primarily to the requirements for herbal medicines (GPHY 1996, Kraft et al. 1997).

In collaboration with the German Society for Medical Information Science, Biometrics and Epidemiology, the German Federal Institute for Drugs and Medical Devices (Bundesinstitut für Arzneimittel und Medizinprodukte, BfArM) then drew up criteria for the planning and conduct of OCS and published these as an official announcement in the Federal Gazette (Bundesanzeiger) (BfArM 1998). Here the regulatory authority accepted for the first time the value of the OCS methodology. According to the wording of this official announcement, properly conducted OCS can be included in the risk-benefit assessment as "other" scientific data, in accordance with § 22 (3) of the German Drug Law. This is expressly confirmed in the European Commission Directive 1999/83/EC, which states: "Whereas it is in particular necessary to clarify that 'bibliographic reference' to other sources of evidence (for example, post-marketing studies, epidemiological studies, studies conducted with similar products, etc.) and not just tests and trials may serve as a valid proof of safety and efficacy of a product if an applicant explains and justifies the use of these sources of information satisfactorily." However, the use of OCS as proof of efficacy and safety presupposes that they are planned, conducted and analysed in accordance with scientifically accepted criteria.

* Current importance of OCS for herbal medicines

For various reasons, in the context of herbal medicines, the OCS methodology represents a useful addition to clinical trials.

Firstly it should be noted that the majority of proprietary herbal medicines are used for the treatment of health problems, complaints that are perceived as illnesses, and diseases that follow a chronic course. In many cases this is because their effects are only "mild", being simultaneously coupled with good tolerability and excellent patient acceptance.

Examples that may be cited in this context include the subjective symptoms (i.e. those troublesome to the patients) of chronic venous insufficiency, dyspepsia or musculoskeletal diseases. Because of the wide range of standard deviations and errors, objective measurement techniques (if available at all) for these types of conditions are imprecise and do not accurately reflect outcome in terms of the improvement in patients' subjective well-being. The high turnover achieved by oral and topical venous therapeutic agents, the many herbal remedies in gastroenterology, the tonics and the rubs for rheumatism typically illustrates consumers' preference for herbal medicines.

OCS can yield data that precisely reflect the user spectrum and the activity characteristics of the products in question during routine administration in everyday practice. In addition, the claimed indications from earlier Commission E monographs can be substantiated and confirmed. Similarly, OCS readily yield data that permit an estimate and assessment of long-term efficacy and tolerability. In all these cases the OCS can make a substantial contribution to the body of knowledge and, for many enquiries, offer an advantage over the clinical trial.

An assessment of efficacy by means of an OCS presupposes that observations are recorded not only in patients who have been treated with the medicinal product of interest (the study treatment), but also in patients who have received comparable treatments or--where a study treatment is being administered in addition to a standard treatment--the standard treatment only (control treatment). Such studies, in which a representative selection is made from a defined patient population without restriction to a particular treatment, are known as cohort studies. According to Feinstein (1985), a cohort study is described as retrolective if the treatment had already been started before the study, and as prolective if allocation to the study or control treatment was made only after the study had commenced. Data collection is defined as retrospective where the data are obtained from medical records that have already been completed at the time of study commencement, and as prospective where the data are collected after the start of the study in accordance with predefined rules. Whereas data are also generally collected prospectively in studies with prolective treatment allocation, prospective data collection is also possible in studies with retrolective treatment allocation, although only the treatment data generated after study commencement are collected ("prospective OCS with a deferred starting point" (BfArM 1998)).

One essential feature of an OCS (and hence also of a cohort study) is that no influence should be brought to bear on the treatments used in the individual patients during the course of the study. Instead, the treatments may be freely chosen by the physician and/or patient. In individual cases the preferences of the physician and the characteristics of the treatment facility and of the patient will influence the decision for a particular treatment. In general, therefore, it cannot be assumed that the patients in the study group will be comparable with those in the control group in terms of their baseline characteristics and other treatment conditions. However, since these characteristics and conditions may also influence the treatment outcome, direct comparison of results between the study group and the control group is not unbiased and is thus unsuitable for deriving valid conclusions concerning efficacy.

One of the most important tasks when analysing and evaluating the results of an OCS is to identify possible bias factors and to make compensatory adjustments for any effect these may have. Two equivalent procedures are available to achieve such compensation: stratification and regression analysis. In stratification all cases studied are divided into subgroups (strata) with similar baseline and treatment conditions. Comparison of outcomes between study treatment and control treatment is performed in each case within homogeneous subgroups and is therefore free from bias. The results of the comparison are then summarised in a suitable form across the subgroups. In regression analysis the dependence of the treatment outcome on baseline and treatment conditions is determined using a suitable regression function that is estimated from the study data. The results in both groups are then converted (adjusted) to identical conditions (usually the mean conditions of the study) and these "cleaned" results are compared between the study and control group.
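The stratification procedure described above can be sketched in a few lines of Python. This is an illustration only: the data, the stratum labels ("mild", "severe") and the function name stratified_difference are hypothetical, and weighting each stratum by its size is just one possible way of summarising the within-stratum comparisons across subgroups.

```python
from collections import defaultdict
from statistics import mean


def stratified_difference(records):
    """Return the stratum-size-weighted mean difference in outcome
    (study minus control), comparing the groups only within each stratum."""
    by_stratum = defaultdict(lambda: {"study": [], "control": []})
    for stratum, group, outcome in records:
        by_stratum[stratum][group].append(outcome)

    total, weighted_sum = 0, 0.0
    for groups in by_stratum.values():
        if not groups["study"] or not groups["control"]:
            continue  # no within-stratum comparison is possible
        n = len(groups["study"]) + len(groups["control"])
        diff = mean(groups["study"]) - mean(groups["control"])
        weighted_sum += n * diff
        total += n
    if total == 0:
        raise ValueError("no stratum contains both treatment groups")
    return weighted_sum / total


# Hypothetical records: (baseline stratum, treatment group, outcome score)
records = [
    ("mild", "study", 4.0), ("mild", "study", 5.0), ("mild", "study", 6.0),
    ("mild", "control", 4.0),
    ("severe", "study", 1.0),
    ("severe", "control", 0.0), ("severe", "control", 1.0),
    ("severe", "control", 2.0),
]

# Crude comparison ignores the baseline imbalance between the groups.
crude = (mean(o for _, g, o in records if g == "study")
         - mean(o for _, g, o in records if g == "control"))
adjusted = stratified_difference(records)
print("crude difference:", crude)
print("stratified difference:", adjusted)
```

In this invented example the study group contains mostly mild cases, so the crude difference overstates the benefit; the stratified comparison removes that part of the difference which is due to baseline severity alone.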

In general, since many conditions may influence allocation to treatment and outcome, direct stratification or regression by all such conditions is difficult and the result is often unsatisfactory. This problem can be overcome by performing compensatory adjustment in two steps. In the first step the influence of the conditions on the allocation to study treatment or control treatment is determined and expressed using an appropriate function. Such a function is the propensity score introduced by Rosenbaum and Rubin (1983), which reflects the probability of giving a patient the study treatment, as a function of his or her individual baseline and treatment conditions. The propensity score can be estimated from the study data. For each patient it condenses into a single score the multitude of conditions that potentially influence both his or her treatment and the treatment outcome. In the second step the treatment outcome then needs to be adjusted only by this score. The stratification and regression analysis of treatment outcomes is thus appreciably simplified. The procedure has now been employed successfully in numerous OCS.
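The two-step procedure can be illustrated with a minimal sketch on simulated data. Several points are assumptions made for the example and are not taken from the recommendations: the propensity score is estimated with a logistic model fitted by simple gradient ascent, the outcome is adjusted by stratifying into five equally sized score strata, and all names (fit_propensity, stratified_effect, severity, age, TRUE_EFFECT) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(X, treated, steps=200, lr=1.0):
    """Step 1: fit a logistic model P(study treatment | x) by gradient
    ascent on the mean log-likelihood. Returns [intercept, coef_1, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, t in zip(X, treated):
            err = t - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def stratified_effect(scores, treated, outcome, n_strata=5):
    """Step 2: compare outcomes within propensity-score strata and
    average the differences, weighted by stratum size."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    size = len(order) // n_strata
    total, weighted = 0, 0.0
    for s in range(n_strata):
        last = s == n_strata - 1
        idx = order[s * size:] if last else order[s * size:(s + 1) * size]
        study = [outcome[i] for i in idx if treated[i] == 1]
        control = [outcome[i] for i in idx if treated[i] == 0]
        if study and control:
            diff = sum(study) / len(study) - sum(control) / len(control)
            weighted += len(idx) * diff
            total += len(idx)
    return weighted / total


# Simulated cohort: standardised severity and age influence both the
# choice of treatment and the outcome (confounding by indication).
random.seed(0)
TRUE_EFFECT = 2.0
X, treated, outcome = [], [], []
for _ in range(1000):
    severity, age = random.gauss(0, 1), random.gauss(0, 1)
    t = 1 if random.random() < sigmoid(-severity + 0.5 * age) else 0
    y = TRUE_EFFECT * t - 1.5 * severity + 0.5 * age + random.gauss(0, 1)
    X.append((severity, age))
    treated.append(t)
    outcome.append(y)

w = fit_propensity(X, treated)
scores = [sigmoid(w[0] + w[1] * x[0] + w[2] * x[1]) for x in X]

study_y = [y for y, t in zip(outcome, treated) if t == 1]
control_y = [y for y, t in zip(outcome, treated) if t == 0]
crude = sum(study_y) / len(study_y) - sum(control_y) / len(control_y)
adjusted = stratified_effect(scores, treated, outcome)
print("crude:", round(crude, 2), "adjusted:", round(adjusted, 2))
```

Because the simulated effect is known, the sketch allows the crude and the adjusted estimate to be compared with the true value. In a real OCS the true effect is unknown, and the adjustment can only account for conditions that were actually recorded.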

Reference may additionally be made to the future EC Directive on traditional herbal medicines (EC 2002). In this context, as well as providing evidence to prove that the products in question have been in practical use for many years, properly planned and conducted OCS may also offer a crucial advantage for a successful licensing procedure: according to Article 3, paragraph 3d of this Directive, a bibliographic review of safety data and an expert report on that data are required for the assessment of safety of use of these products.

In contrast with the requirements of the ICH or EC Guidelines on Good Clinical Practice (GCP), no comparable regulatory framework yet exists for OCS. Despite the recommendations that have been published by the BfArM and many expert bodies, OCS of rather mediocre importance in scientific terms continue to be conducted. Members of the "Clinical Trials of Herbal Medicinal Products" Working Party of the German Society for Phytotherapy have therefore again addressed the issue of OCS quality and have developed further recommendations so that the informative value and importance of this clinical research method can be further enhanced.

* Proposals for the further enhancement of OCS quality

Organisational quality

Firstly, in order to eliminate confusion, the terminology used in the context of an OCS should be applied in a consistent manner. The terms observation protocol and observation form should be used in an OCS (in contrast to study protocol and case report form or CRF in a clinical trial). The approved or registered proprietary medicine whose use is to be documented should be described as the study medication.

In many OCS the quality of the documentation received or produced is poor. Commonly, there is a lack of information concerning methods of physician recruitment and observation form distribution, numbers of observation forms distributed/issued, return rates, and physician liaison contact (including specialist fields) while the OCS is in progress. These data are of fundamental importance for assessing the quality of an OCS as well as for allowing comparison with other studies or OCS and should therefore also always be collected.

In general, in order to overcome these and other deficiencies and to ensure quality, it is recommended that standard operating procedures (SOPs) should be written to cover the planning, conduct and analysis of OCS. These SOPs should describe the full sequence of events in an OCS, detailing those aspects requiring special attention, but they need not reproduce clinical trial SOPs in terms of volume.

Repeated reference has been made to the need for regular monitoring (by telephone, by letter, or in person) to manage and direct the progress of an OCS during the practical phase in order to guarantee quality. Ideally, this should be done by a clinical monitor, but the objective can also be achieved by sales staff of pharmaceutical manufacturers who have been carefully trained and briefed.

Full details relating to planning, conduct and analysis cannot be extracted from some OCS reports (final in-house reports as well as publications). In such cases it is virtually impossible to evaluate the methodological and organisational aspects of the study and hence to assess to what extent the aforementioned recommendations of the German BfArM or other expert bodies have been taken into account. The final reports as well as the publications derived from them should therefore contain sufficient detailed information.

Observation parameters

In line with the methodology of non-interventional routine medical use, technical or other objective measurement procedures are rarely practicable in the context of OCS. On the other hand, the importance of claims based on the content of OCS is minimal if individually designed or constructed instruments are used to score findings or symptoms. Comparison of results on efficacy and tolerability with results from studies with other medicinal products in the same indication, explicitly stated as one of the objectives of OCS, is hardly possible in such circumstances. In many cases the scores or scales selected permit only statements regarding "general" or "global" efficacy and tolerability (ranging from "very good" to "very poor"), and this is altogether unsatisfactory for a thorough assessment and discussion of results. It is difficult to derive specific conclusions on therapeutic effects that may manifest themselves differently for individual typical symptoms.

However, if OCS utilise the same methodological approaches as those employed in clinical trials, i.e. internationally accepted and standardised instruments (if available) such as the Hamilton Depression Scale (HAMD) score for St. John's wort extracts, it becomes possible to make comparisons with results from other studies (comparator, reference, placebo therapy; also with findings on chemically defined medicinal substances).

The determination of therapeutic outcomes can be an important approach in the context of OCS. Health is defined not only as the simple absence of diseases, but also as well-being at a physical, psychological and social level. Quality of life and patient-centred health status have thus become increasingly important focuses for any therapeutic intervention, particularly in chronic disease states. The development of high-quality health care requires evidence-based and patient-oriented medicines that are simultaneously efficient and hence favourably priced. It is no longer sufficient for studies to investigate the efficacy and efficiency of preventive, therapeutic or rehabilitative approaches. Following the methods of quality research, it is also important to subject the results of efficacy research to critical review in practice. The overall therapeutic result--the outcome--is determined in this way.

It is now beyond dispute that patients are entirely capable of making a reliable assessment of their symptoms, everyday limitations and functional impairment, provided they are asked relevant questions in a standardised form. Expert medical bodies and associations have also recently insisted that conclusions concerning the clinical effectiveness of medicinal products should be coupled increasingly with the effects of long-term therapy appropriate to the indication--i.e., with the outcome--and less with effects that are achievable in the short term simply at the symptom level. A comprehensive listing may be found, for example, in the treatment guidelines issued by the German Working Party of Scientific and Medical Expert Organisations (Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften, AWMF). The licensing authorities have embraced such guidelines and recommend that these be taken into account for the demonstration of therapeutic efficacy; for more on this subject, see e.g. the FDA Guidelines (FDA). These guidelines are even more applicable to administration in chronic disease processes, which have become the focus of pharmacological research in recent years and constitute a particular indication area for herbal medicines--in the fields of rheumatology and gastroenterology, for example. In this context it is important to select appropriate instruments carefully for the individual target enquiry and, where these have been newly developed or translated into other languages, to conduct (pilot) tests with the outcome instruments to verify their validity, reliability and sensitivity. This will be discussed below in greater detail using four specific examples.

Psychiatric and psychological illnesses

A quite extensive body of experience has now accumulated in the fields of neuropsychopharmacology and pharmacopsychiatry. Working groups in these fields have long been pioneering the quantification of therapeutic outcomes using appropriate standardised assessment or measurement scales. Examples of indications for which validated, reliable and sensitive scales are available are: depression, assessed using the Hamilton Depression Scale (HAMD); anxiety, assessed using the Anxiety Status Inventory (ASI) or the State-Trait Anxiety Inventory (STAI); dementia, assessed using the Evaluation Scale for Geriatric Patients (Beurteilungsskala für Geriatrische Patienten, BGP); and sleep disorders, assessed using the Sleep Questionnaire A and B (Schlaffragebogen A und B, SF-A/B). An excellent collection of the scales available, together with extensive documentation, has been published by the Collegium Internationale Psychiatriae Scalarum (CIPS 1996).

Musculoskeletal diseases

The principal goals of treatment are to improve patients' quality of life and to facilitate the normal activities of daily living. These goals can be achieved by alleviating or eliminating pain, by improving mobility functions and preventing permanent functional impairment, and by slowing or even arresting the progressive course of the disease. In determining health status and quality of life, three different techniques may be distinguished in principle: assessment by a physician, the performance of standardised activities by the patient and their assessment (e.g., the Keitel Index), and the completion of standardised questionnaires by the patient (Stucki 1997). Questionnaires for the assessment of disease-specific health status are suitable in particular for identifying clinically relevant changes as a response to treatment. Since such questionnaires usually contain typical elements for a defined disease, they have greater sensitivity than more general questionnaires (such as the Clinical Global Impressions (CGI) scale that is commonly used in a modified or simplified form). Disease-specific questionnaires here are just as reliable as traditional clinical parameters (e.g., measurements of walking distance) or laboratory values (Sangha and Stucki 1997).

Various validated "standard instruments" (self-assessment questionnaires) already exist for the determination of health status in patients with rheumatic diseases. They all assess pain and physical function. Thus, for example, the Lequesne Index (Lequesne et al. 1987) and the Western Ontario and McMaster Universities (WOMAC) Osteoarthritis Index (Stucki et al. 1996) have been developed and recommended for use in osteoarthritis; the Health Assessment Questionnaire (HAQ) (Bruhlmann et al. 1994, Fries et al. 1980), the American College of Rheumatology (ACR) criteria (Felson et al. 1993, 1995) and the Paulus criteria (Paulus et al. 1990) for rheumatoid arthritis; and the Arhus Back Pain Index (Manniche et al. 1994) for low back pain. All these self-assessment questionnaires have already been and are being used in clinical trials and are therefore also suitable in principle for the assessment of efficacy in OCS.

Gastrointestinal diseases

Many herbal medicines are indicated for use in dyspeptic disorders. Regrettably, the concept of antidyspeptic activity covers a variety of activity profiles, and these at first have to be defined more precisely in therapeutic terms on the basis of pharmacological data.

Relevant and typical symptom profiles have now been defined on the basis of an internationally accepted classification system for dyspepsia and current understanding of the pathophysiology (Colin-Jones 1988, Hotz et al. 1999, Muller-Lissner and Koelz 1992, Muller-Lissner and Klauser 1999, Talley et al. 1999b). More recently, therefore, various standardised measurement instruments have been elaborated, although the clinico-diagnostic research and development of self-assessment scales has not yet been completed. Some of the proposed scales should also be suitable for use in OCS as well as in clinical trials (Chassany et al. 1999, Leidy et al. 2000, Mearin et al. 1999, Sandha et al. 1999, Talley et al. 1999a, 1999c, Wiklund 1998, Yacavone et al. 2001). There is still a lack of conclusive published studies that have been conducted with an appropriate design and would thus permit comparison with the results of an OCS.

Final assessment of efficacy and tolerability

In the past the methods used by patients and those treating them to determine or provide information on the final assessment of "general" or "global" efficacy and tolerability have been highly heterogeneous. This fact unnecessarily complicates or prevents comparison of such an assessment with that from other OCS or clinical trials.

Use of the CGI scale is to be recommended for a standardised final assessment of efficacy, status change (overall status of the patient) and tolerability (adverse effects and risks of treatment). Although not without criticism (Beneke and Rasmus 1992), this scale has now been used in many clinical trials as well as in OCS. In many cases therefore comparison with other study results is already possible.

Practical use of self-assessment scales

Provided that the structural characteristics of the populations are comparable, results obtained with self-assessment scales in OCS with herbal medicines could permit comparison with other pharmacotherapies or treatment methods. The importance of OCS conducted in this way would therefore increase.

However, it should be noted that the use of self-assessment questionnaires or instruments to measure outcome presupposes adequate experience or training on the part of physicians, as well as the availability of motivated patients. It is advantageous to collaborate with physicians who are already familiar with such methods from other clinical trials. Where this is not possible, physicians must undergo intensive familiarisation with the special features of such a study. This will require individual training and comprehensive instruction before the treatment phase commences.

Investigators and sponsors should be aware of the ethical aspects of their OCS. In particular, it may be advisable to submit the observation protocol for consideration, comment or guidance to an independent ethics committee.

Presentation of results and publication

Reference has already been made to the common lack of information on quality in terms of planning and conduct, rendering final assessment impossible without further details. In contrast, reporting of the results of an OCS is often accompanied by an extensive presentation of other scientific data on the herbal drug, pharmaceutical substance or product, an approach that is more consistent with a review article. The results of the OCS itself are therefore difficult to analyse. Where intended as an original or principal article, publication of the results of an OCS should restrict itself primarily to the actual study objectives and results. Extensive general presentations on a subject or a pharmaceutical substance should preferably be summarised in a separately authored review article.

For many patients whose treatment is documented in OCS with herbal medicines, initial symptom severity tends to be mild. However, it is a deficiency of results presentation when statements of the actual initial intensity or severity of an individual symptom are omitted, for example, in favour of citing the frequently considerable regression rates. Information on initial symptom intensity is of fundamental importance for the assessment of therapeutic outcome.

Discussion of the results of an OCS must be approached carefully. Excessive importance should not be attached to the data obtained by this method. In particular, speculative statements here concerning new indications should be regarded critically. While proof of new indications is not possible in an uncontrolled comparison, helpful hints may nevertheless be uncovered.

As well as describing the data, the biometric analysis should in particular provide a careful evaluation of those conditions that form the basis for allocating patients to the study or control treatment. The influence of these conditions on treatment outcome should be determined and compensatory adjustments should be made before treatment outcome is compared between the study group and control group. Appropriate techniques for this purpose have been outlined briefly above. Particular reference may be made here to the use of the propensity score to identify those factors influencing treatment allocation and to adjust for the bias in treatment outcome arising from these factors. Statistical analysis of the differences in the efficacy variable (adjusted to identical baseline and treatment conditions) between the study and control groups using confidence intervals or significance probabilities is just as possible and indeed necessary in an OCS as in a controlled trial. Adjustment is performed to compensate for any lack of homogeneity between the treatment groups. The validity of the comparison depends on the quality of the adjustment. This should be verified. It is strongly urged that a Bayesian approach be used: for this a detailed model should be developed to describe the interaction of the various factors influencing the treatment outcome and probabilities should be assigned a priori to the variables contained in the model.
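One possible way of verifying the quality of the adjustment, given here only as an illustrative assumption and not as a method prescribed by these recommendations, is to compare the balance of a baseline characteristic between the groups before and after stratification. The sketch below uses the standardised mean difference for this purpose; the severity scores, the stratum labels and the function name are hypothetical.

```python
from statistics import mean, variance


def standardized_mean_difference(study, control):
    """Difference in group means divided by the pooled standard deviation.
    Each group needs at least two values for the sample variance."""
    pooled_sd = ((variance(study) + variance(control)) / 2) ** 0.5
    return (mean(study) - mean(control)) / pooled_sd


# Hypothetical baseline severity scores, grouped by adjustment stratum.
strata = {
    "low": {"study": [1, 2, 3, 2], "control": [1, 3]},
    "high": {"study": [5, 7], "control": [5, 6, 7, 6]},
}

all_study = [v for s in strata.values() for v in s["study"]]
all_control = [v for s in strata.values() for v in s["control"]]

# Imbalance in the whole cohort, before adjustment.
overall = standardized_mean_difference(all_study, all_control)

# Imbalance remaining inside each stratum, after adjustment.
within = {
    name: standardized_mean_difference(s["study"], s["control"])
    for name, s in strata.items()
}
print("overall:", round(overall, 2))
print("within strata:", within)
```

A clear imbalance in the whole cohort that largely disappears within the strata suggests that the stratification has made the groups comparable with respect to this characteristic; a remaining imbalance would indicate that the adjustment needs to be refined.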

* Conclusions

The scientific methodology of the OCS is an important component in research following the granting of marketing approval for a medicinal product. This is particularly true for herbal medicines: experience gained in everyday practice can be used to accurately describe their importance as safe and effective routine therapy. Overall, the proposals outlined in this article are intended to further enhance methodological quality in terms of the planning, conduct, analysis and reporting of OCS. This methodology can make an important contribution in raising the therapeutic status of herbal medicines--and for the future it is certain that well planned, organised and analysed OCS will also help to extend the body of knowledge relating to herbal medicines.

