Clinical Study Design – Critical Issues

Back to Psychopharmacology - The Fourth Generation of Progress

Donald S. Robinson, M.D., and Nina R. Schooler, Ph.D.

INTRODUCTION

This chapter on study design focuses on methodological issues of importance in designing studies to evaluate the treatment effects of psychotropic drugs. Critical choices facing the clinical investigator in the planning of trials to assess an investigational or approved drug are discussed, and recommendations are given for enhancing the reliability and validity of experiments. For a comprehensive discussion of clinical methodology in psychopharmacology, the reader is referred to a compendium volume on clinical evaluation of psychotropic drugs jointly sponsored by the National Institute of Mental Health and the American College of Neuropsychopharmacology (75).

Prior to approval, new drugs undergo stringent clinical testing in studies whose design, analysis, and method of reporting in a New Drug Application (NDA) are shaped to a large extent by U.S. Food and Drug Administration (FDA) guidelines (22, 23). The clinical experience with a new agent accrued during phase I, II, and III trials is limited to exposure of only a few thousand patients who receive the drug. Much of the clinically important knowledge about a new drug is gained after its approval for marketing during its use in clinical practice, or in postmarketing studies, which are subject to fewer restrictions than pre-approval trials and usually are more consonant with actual practice. The underlying principles governing clinical evaluation of psychotherapeutic agents apply equally to investigations conducted during the development of a new drug or post-approval.

BASIC DESIGN ELEMENTS OF CLINICAL DRUG STUDIES

Controlled and Uncontrolled Studies

Psychopharmacology, more than most other areas of clinical research, is subject to a degree of experimental variability that can tax the skills of the most experienced and careful investigator. Effect size of treatments with psychotropic drugs tends to be modest, at best, and statistical variance is usually substantial. These factors in concert make it more difficult to show a treatment effect in a psychopharmacologic trial than for many other types of drugs, even with agents known to be efficacious. This dilemma underscores how critical sound clinical methodology is in conducting a psychotropic drug trial.
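The interplay of modest effect size and substantial variance can be illustrated with a rough sample size calculation. The sketch below is an illustration only, not a method taken from this chapter: it assumes a two-arm parallel comparison of means, a two-sided test, and the standard normal approximation, and the function name `n_per_group` and the effect sizes shown are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm parallel comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (mean difference / common SD).
    Dropouts and multi-site variance are ignored (an assumption of this sketch).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical standardized effect sizes, from large to modest.
for d in (0.8, 0.5, 0.3):
    print(f"effect size {d}: about {n_per_group(d)} patients per arm")
```

Because the required sample grows with the inverse square of the standardized effect size, halving the effect size (or doubling the standard deviation) roughly quadruples the number of patients needed per arm.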

In psychopharmacology, hypothesis-testing experiments must be well controlled, employing one or more control groups (55). The duration of treatment required to show a therapeutic effect (weeks or months) and the highly variable course of psychiatric disorders mandate a double-blind, placebo-controlled study with parallel group design for most drug trials.

A clinical trial that involves more than one control treatment, typically a standard drug and a placebo (Table 1, for example), is one of the most informative designs, for several reasons. Such designs provide a useful efficacy standard by quantifying the treatment effect against both a known drug and placebo. This study design also serves to reduce experimental bias by better protecting the double-blind mode throughout conduct of the trial. In addition, it increases the probability that on randomization a patient will receive an active treatment rather than placebo, which facilitates patient recruitment and improves blinding of both research staff and patients to treatment assignment.

Generating definitive data about dose-response relationships for a new psychotropic drug requires a development program comprising a series of controlled trials. Because a single experiment can address only a limited number of the hypotheses that bear on how to use a drug therapeutically, especially its dosing regimen, trials of differing designs are needed. Additional studies after drug approval are generally needed to establish the complete therapeutic spectrum and the optimal regimen of a new drug.

Uncontrolled studies have rather limited value in clinical psychopharmacology. Most often they serve as exploratory and hypothesis-generating studies. Using an open-trial design in early phase II testing, one can gain some indication of a new drug's efficacy in addition to safety data. However, in an uncontrolled psychopharmacological trial, one can place more confidence in a finding of "no benefit" than one of "possible benefit". Uncontrolled, non-blinded studies can also generate useful supplemental information about a drug, for example, its effects in special populations, such as elderly or chronically ill patients, who can be difficult to recruit into a controlled trial in sufficient numbers. Nevertheless, well-controlled studies are essential to obtaining definitive dose-response and efficacy data in all patient groups, including special populations.

Standard Clinical Trial Paradigms

Randomized controlled trials (RCTs) are the mainstay of studies to define the efficacy and safety of a drug, as well as to delineate dose-response relationships. Studies of double-blind (and triple-blind) design with placebo controls, randomized assignment to treatment, and three (or more) treatment arms, represent the methodological standard in clinical psychopharmacology (55). This basic paradigm is amenable to many design variations, depending on specific study objectives and the properties of the drug being studied (50).

Once the objectives of a study have been set, and the patient sample, entry criteria, and the desired statistical power specified, an especially critical design decision is whether to stratify treatment assignment (randomization) based on some important characteristic of the target patient population. Even though stratification can add considerable complexity to a trial, it may offer methodological advantages that offset potential disadvantages (44). It may be crucial to safeguard against chance (but critical) baseline imbalances in treatment groups that might limit inferences to be drawn about results of a trial. Employing a matching variable or stratification strategy in protocol design may enhance statistical power, if the matching variable is itself an important independent predictor of outcome (44, 69). The decision to stratify randomization is dependent on identification of variables with the foremost influence on outcome. Because of the complexity that stratification imposes on study design (which increases with each additional stratum), thereby affecting patient recruitment and study implementation, patients should be stratified on the fewest possible and most critical variables. Using a stratification design complicates drug packaging and procedures for assigning treatments, especially in multi-site studies, and adds complexity to the data analyses. Therefore, before choosing this design strategy, it should be clear that the statistical advantages outweigh the logistical and other disadvantages of the more complicated design.
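One common way to implement stratified assignment is permuted-block randomization within each stratum. The sketch below is a minimal illustration under stated assumptions, not a procedure prescribed by this chapter: the function name, the stratifying variable ("first episode" versus "multiple episodes"), and the three arm labels are all hypothetical.

```python
import random


def stratified_block_schedule(strata, arms, blocks_per_stratum, seed=0):
    """Return {stratum: [arm, ...]} built from permuted blocks within each stratum.

    Every block contains each arm exactly once in random order, so the arms
    remain balanced within a stratum after each completed block.
    """
    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    schedule = {}
    for stratum in strata:
        assignments = []
        for _ in range(blocks_per_stratum):
            block = list(arms)
            rng.shuffle(block)  # random order of arms inside this block
            assignments.extend(block)
        schedule[stratum] = assignments
    return schedule


schedule = stratified_block_schedule(
    strata=["first episode", "multiple episodes"],  # hypothetical stratifier
    arms=["test drug", "standard drug", "placebo"],
    blocks_per_stratum=4,
)
for stratum, assignments in schedule.items():
    print(stratum, assignments)
```

Each additional stratifying variable multiplies the number of separate schedules that must be prepared, packaged, and tracked at every site, which is one concrete form of the logistical burden described above.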

The ideal circumstance is to conduct an RCT as a single-site study. This design offers many advantages but often is impractical for reasons of required sample size and patient recruitment rates, which may exceed the capabilities of a single site. Inevitably, adding sites increases variance and complicates implementation and logistics. Controlling inter-site differences becomes essential when electing to conduct a multi-site trial. Ensuring uniformity of protocol implementation and high inter-site rater reliability is critical. Statistical testing for treatment-by-site interactions can sometimes help account for a seemingly aberrant outcome at one site. The ability to test for site-by-treatment interaction becomes more limited as the number of sites increases, and sufficient statistical power to show a site interaction may be lost. When planning a trial, this is another important reason for restricting the number of performance sites.

Alternatives to Parallel Group Placebo Controls

There is no completely satisfactory alternative to employing a concurrent placebo control group when seeking to establish the efficacy of psychopharmacological agents. However, if the use of a placebo is impractical, or inappropriate for ethical or other reasons, a multiple fixed-dose study without a placebo may be an acceptable option. This approach involves comparison of responses to a range of doses of the test drug. If significant statistical differences are observed between treatment groups, such an outcome is interpreted as evidence of a drug's efficacy. This can be an economical approach to drug development because it generates valuable dose-response data in addition to supporting drug efficacy. However, it carries the potential risk of a false negative error if the wrong doses are selected because of uncertainty about the therapeutic dose range of a drug. Choosing inappropriate doses to evaluate can lead to failure, especially if the lowest dose selected has some therapeutic benefit. If significant differences between doses are not detected, no inference can be drawn about drug efficacy. Another criticism of this study design is that purposely selecting a low dose outside the therapeutic range, or even a dose that is thought to be marginally effective, raises the same ethical concerns as use of a placebo.

Another option is to compare the test drug against a standard drug using a 2-arm double-blind design. However, as is the case with the multiple fixed-dose design without placebo control, a finding of no significant differences in outcome is ambiguous. This outcome could result from non-drug treatment factors, insensitive outcome measures, or investigator or patient bias. The test drug may appear as effective as the reference drug because both treatments are efficacious, or neither. A major disadvantage of this 2-arm design is that it requires a much larger sample to adequately test the statistical hypothesis of no true difference in efficacy between the test and reference drug. Relatively small differences in treatment effects (effect size) are extremely hard to detect using a 2-arm study design because of the inherently large variance of psychopharmacological studies. In general, 2-arm trials lacking a placebo control cannot be relied on to establish efficacy of a psychopharmacological agent.

A third option is to employ an historical placebo control group (49). With this option, a standard placebo response is assumed for a given research facility and study design. This improvement rate is then used for purposes of comparison with results of subsequent non-placebo controlled studies involving the same investigator and protocol design but different drugs. If this strategy were adopted, it is recommended that a small placebo treatment arm still be included to serve as a control group for safety data purposes, and to help protect against observer bias. One can also assess whether the pattern of placebo response corresponds with previous experience. Major drawbacks to this approach include the statistical inadequacies of a small or absent placebo control group and the fact that factors affecting placebo (non-drug) treatment response often do not remain static over time. This makes comparisons of recent with prior studies problematic, even by the same investigator ("one cannot step into the same river twice"). Employing historical placebo controls is also subject to the ethical criticism that the study would be less scientifically conclusive, and therefore would not justify the risks of exposing human subjects to an experimental agent. Despite these drawbacks, there might be a situation where a large placebo control group is impractical and a modified design using historical controls would be warranted.

Longitudinal drug trials sometimes include intermittent periods of single-blind placebo treatment. This approach can provide some indication of the efficacy of a new drug, as well as information about how long treatment needs to be continued to sustain clinical response. This study design, however, cannot provide unequivocal evidence of efficacy and requires confirmatory studies using parallel group design.

Specialized Trial Designs

Cross-Over Designs

A balanced and randomized cross-over design has theoretical appeal because it factors out the largest source of experimental variance, inter-individual variability. This could significantly enhance statistical power and permit much smaller sample sizes to detect a treatment effect. Unfortunately, this design is rarely appropriate for efficacy trials in psychopharmacology. This is due to the highly variable course of psychiatric disorders during a course of therapy and to carryover effects from prior treatment. These limitations preclude valid statistical analyses of outcomes beyond the first treatment period.
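The theoretical appeal can be made concrete with a simple variance comparison. The sketch below rests on idealized assumptions that, as noted above, rarely hold in psychopharmacology: no carryover effects, no period effects, and an outcome whose variance splits into a between-subject and a within-subject component. The function name and the variance values are hypothetical.

```python
def crossover_to_parallel_ratio(between_var, within_var):
    """Ratio of total subjects needed: two-period cross-over / two-arm parallel.

    Idealized assumptions (no carryover, no period effect):
      parallel, n per arm:       Var(estimate) = 2 * (between + within) / n
      cross-over, N subjects:    Var(estimate) = 2 * within / N
    Equating the two variances gives N / (2n) = (1 - icc) / 2, where icc is
    the share of total variance that lies between subjects.
    """
    if between_var < 0 or within_var <= 0:
        raise ValueError("variances must be non-negative, within_var positive")
    icc = between_var / (between_var + within_var)
    return (1 - icc) / 2


# Hypothetical case: between-subject variance three times the within-subject variance.
print(crossover_to_parallel_ratio(between_var=3.0, within_var=1.0))
```

Under these assumptions the saving grows as between-subject variability dominates. The calculation says nothing about the bias introduced when carryover is present, which is the reason the design is rarely suitable for efficacy trials.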

A cross-over design might be applicable for studies dealing with chronic stable conditions, where within-subject variability is much less than between-subject variability, and if patients return to baseline condition after each intervention. Unfortunately, the latter is rarely the case for psychopharmacological agents, which have complex and lingering effects on neuronal systems. On the other hand, the cross-over paradigm has utility in phase I clinical pharmacology studies, where the focus is on pharmacokinetics and safety assessment. In such studies, where therapeutic benefit is not an outcome being measured and objective endpoints (such as drug concentrations) are used, carryover effects are less of a concern.

A variation of the cross-over design is the so-called 'N of one' study (30), wherein a single subject (or a few subjects) undergoes several alternating courses of drug and placebo, preferably under double-blind conditions. This study design presupposes that a patient returns to baseline status following each course of treatment (as with a balanced cross-over study design). Because of the obvious methodological limitations of generalizing to a population from a sample of one, proponents of the 'N of one' study design generally advocate that a small series of patients be studied, with each patient undergoing several periods of crossover (usually between active drug and placebo). Despite drawbacks to this design in psychopharmacology, there may be occasional circumstances involving special populations where intensive observation of a single patient by a skilled investigator is a preferred (or the only possible) approach, e.g., if the target disorder is extremely rare or a concurrent placebo control group is impractical (48, 49).

Longitudinal Study Designs of Long-term Treatment

There is a growing interest in, and importance of, evaluation of the entire therapeutic course of psychopharmacological treatment, as opposed to assessing short-term efficacy. A study design to assess long-term benefits and risks differs from that of short-term, acute efficacy studies. Long-term drug studies pose arduous methodological issues and added complexities, some of which remain unresolved despite growing emphasis on continuation and maintenance treatment. Health agencies today seek comparative data about long-term treatment alternatives in making judgments about managed care, and for making drug formulary decisions utilizing pharmacoeconomic data. Study designs in this category are also discussed in the section of this chapter on "Future Directions in Methodologic Research".

A frequently employed design for evaluation of long-term treatment efficacy is a placebo-substitution paradigm. This study design selects patients who are acute responders and randomly assigns them either to continue study medication or to switch to placebo, in a double-blind manner. This design is adaptable to evaluating prevention of relapse (same episode) or recurrence (new episode). Most studies of this type admit patients during an acute phase or episode of illness, and require that patients attain adequate control of symptoms (response) before randomization to long-term treatment with drug or placebo. This type of enrichment design has inherent limitations regarding generalizability of results to an unselected patient population. Attrition from the acute treatment phase in these studies may reach 50% or more. This can produce patient samples in the long-term treatment phase of the study that are unrepresentative of the original intake criteria and end up inadvertently favoring one treatment over another (28).

Few studies report data on differences between the sample entering the acute phase and the subset of survivors that end up randomized to long-term treatment. Although patient attrition is a concern in any double-blind drug trial, the dropout problem is more critical for these studies, in which the inevitable attrition is sufficient to skew the sample and "censoring" of dropout data is employed for purposes of survival analyses. Lavori (52, 54) has extensively discussed these vexing problems of dropouts and lost data, which can adversely affect all drug trials but are especially critical in long-term treatment studies.
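To show what censoring means in practice, the sketch below computes a Kaplan-Meier (product-limit) estimate of the proportion of patients remaining relapse-free. It is an illustration rather than the analysis used in any study cited here, and the small data set is invented. Note the assumption built into the method: a censored dropout is treated as uninformative about that patient's later risk of relapse, which is exactly the assumption that heavy or selective attrition calls into question.

```python
def kaplan_meier(times, relapsed):
    """Return [(time, relapse_free_probability)] at each time with a relapse.

    times:    weeks to relapse, or to last observation if no relapse was seen
    relapsed: True if relapse was observed, False if the patient was censored
              (dropped out or reached the end of the study relapse-free)
    """
    if len(times) != len(relapsed):
        raise ValueError("times and relapsed must have the same length")
    data = sorted(zip(times, relapsed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all patients who share the same time point.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # relapsed and censored patients both leave the risk set
    return curve


# Invented example: five patients, two of them censored (weeks 8 and 20).
weeks = [4, 8, 8, 12, 20]
relapse_seen = [True, False, True, True, False]
for week, prob in kaplan_meier(weeks, relapse_seen):
    print(f"week {week}: estimated relapse-free proportion {prob:.2f}")
```

Censored patients contribute to the risk set only up to the time they leave, so the estimate at later time points rests on progressively fewer patients.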

Another problem with multiphasic, long-term studies is that the pre-randomization attrition may favor one treatment over another in the long-term phase of the study (28). Various efforts to deal with this problem have been proposed (54). One proposed solution to the problem of pre-randomization attrition is to enter patients into the study during an inter-episode period when they are asymptomatic and perhaps less likely to drop out. In this approach patients in long-term treatment are taken off whatever therapy they are receiving and are randomized to a study treatment. There are disadvantages to this design: patients who have been episode-free for a significant period of time may be reluctant to participate in a study where they might receive an inferior drug, or worse a placebo, and recruitment of patients presents practical difficulties. One long-term study that attempted this approach with patients receiving treatment for a recurrent mood disorder fell far short of attaining the requisite sample size (76).

Another type of design that is employed to evaluate long-term treatment is the single-blind, longitudinal trial. A variation of this design is to periodically discontinue and/or replace study treatment with placebo. Another similar type of longitudinal study is the "mirror image" trial, in which the course of illness during the study treatment is compared to the course of illness during an equivalent period of prior treatment. These designs have the advantage of smaller sample sizes. They can provide some useful information about a drug's long-term efficacy, but are not regarded as being conclusive due to the lack of a concurrent control group. Longitudinal, non-blinded studies are a practical way to accrue much valuable safety data during chronic use of a new drug.

Overall, none of the designs for evaluating long-term treatment are flawless or completely satisfactory; each has methodological limitations that restrict generalizability of results. Implementation of long-term studies is more demanding because of the need to maintain integrity of the study and quality of the research over an extended period during which staff changes and patient attrition occur. In undertaking such studies, it is important to recognize the difficulties and limitations of these approaches. Long-term treatment strategies are discussed in other chapters in this volume (see also Short- and Long-Term Psychopharmacological Treatment Strategies; Long-Term Treatment of Mood Disorders; Methodological and Statistical Progress in Psychiatric Clinical Research: A Statistician's Perspective).

Dose-Response Studies

Accurate dose-response data are needed to provide prescribing information essential to clinicians. Such information may be a requirement of drug approval in many countries. No single study design can address all aspects of dose-response relationships for a new drug, so a number of different trials are needed, each supplying complementary dose information. The combined phase I-III study data are one source of information on how the dose of a new drug relates to safety and side effects. More precise information on safety and side effects related to dose requires well-controlled dose-response studies. A multiple-dose RCT design can be a rich source of efficacy and safety data relating to drug dose.

Multiple-dose RCTs directly compare concurrent treatment at different dosages of the test drug. Defining how dose and efficacy are related requires a rigorous experimental approach, primarily involving placebo-controlled study designs. It is desirable to identify the optimal dosage and therapeutic dose range for a new drug in different patient populations. Study designs that compare either different fixed doses or different dose ranges of a drug represent one experimental approach to obtaining such information (14). As discussed in the foregoing section on alternatives to placebo controls, a variation of the multiple-dose RCT design involves substituting a marginally active (low) dose treatment arm for a placebo treatment arm. This has been proposed as a way to obviate some of the ethical and logistical concerns regarding placebo use. Incorporating a low (marginally or minimally effective) dose arm in a study protocol presupposes some knowledge of dose-response and therapeutic dose range. An incorrect choice of low dose could prove to be a costly error if it fails to discriminate from higher doses. Obviously, inclusion of a placebo treatment arm in a dose-response study is preferred whenever possible because it better quantifies treatment effects, both safety and efficacy.

Fixed-dose studies using multiple doses have gained popularity for evaluating dose-response relationships. While this study design yields valuable information, it does not necessarily reflect the way a drug is likely to be used in clinical practice. First-pass metabolism and systemic bioavailability vary widely among individuals, especially with psychotherapeutic agents (90). Therefore, any given dose produces widely variable plasma and tissue concentrations, depending on each patient's body habitus and metabolic clearance. To optimize therapeutic benefit during clinical management of a patient, the drug dose is normally titrated within a defined range based on observed response (benefit and tolerability). Dose-titration studies where dose is individualized in this manner, as opposed to fixed-dose studies, better approximate actual clinical practice, and arguably can provide a better assessment of dosage and therapeutic benefit for patient populations treated in usual practice settings.

Concentration-controlled Studies

A modification of the standard RCT design has been proposed by Peck (71). This concentration-controlled study design has been advocated as a more cost-effective and efficient way to investigate the clinical pharmacology, safety, and efficacy of an investigational drug, especially during the early stages of development (84). This methodological approach is predicated on the assumption that the plasma concentration of a drug correlates better with clinical response than does dosage (50). Unfortunately, in psychopharmacology this has not proven to be the case, except for a few drugs such as lithium (73, 93). Given the fact that centrally-acting drugs are highly lipophilic with high brain-plasma ratios and selective but variable neuroreceptor affinities, the low predictive value of plasma drug levels for therapeutic effect is understandable. At best, plasma levels constitute an indirect index of target tissue (brain) concentrations, and for most psychotherapeutic agents they have proven to be of limited usefulness for optimizing therapeutic response.

The concentration-controlled study design has several practical drawbacks, as well. Its implementation requires modifying the standard RCT design in a way that runs counter to accepted, sound research design. Compromising the blind and thereby introducing bias is a significant risk due to the need to adjust dose based on drug level (presumably by an unblinded third party). An elaborate procedure would have to be superimposed on the standard RCT to allow for collecting and assaying blood samples in a standardized way, and then to adjust a patient's dose based on drug level. During early stages of human testing there is little information to go on about target dose levels in man. There is scant evidence that this experimental paradigm can be adapted successfully to efficacy trials, especially during early phases of drug development when the goal is to provide a pivotal study. It is unclear whether the more complicated logistics and expense of this design can be successfully applied to drug development in psychopharmacology. More experience is needed to assess the practicality and utility of this approach for clinical research (59).

Studies to Evaluate Anti-psychotic Medication in Schizophrenia: Design and Assessment

There is currently unprecedented activity in the development of anti-psychotic medications. For this reason the chapter discusses clinical methodologies for evaluating anti-psychotic drugs. It highlights some issues that are specific to research with anti-psychotic medications and the patient populations for such studies as well as some more general topics of clinical research. For more detailed discussion of research with other classes of medications and other patient populations the reader is referred to the recent compendium on the clinical evaluation of psychotropic drugs (75).

The present interest in the development of new anti-psychotic medications can be traced to the approval of clozapine for the treatment of "treatment refractory schizophrenia" in 1989 (72). The results of a novel study reporting clozapine's efficacy in the improvement of a wide range of symptoms (38), coupled with its complex receptor binding profile, encouraged the expectation that new drugs based on non-traditional pre-clinical screening models might have anti-psychotic activity. In particular, the development of new compounds was stimulated by the possibility that drugs with either clozapine's regional specificity and/or certain of its many and diverse receptor interactions might be effective anti-psychotic agents. Subsequently, it has been hypothesized that various portions of that spectrum of activity are the source of clozapine's apparent unique effects. In 1994, risperidone, an anti-psychotic with affinity for serotonin-2 and dopamine-2 receptors was approved by the Food and Drug Administration (FDA) for "management of symptoms of psychosis". In 1996 olanzapine, an anti-psychotic with a receptor binding profile and chemical structure similar to that of clozapine, was also approved for treatment of psychosis. As of February 1997 two other anti-psychotic medications are under review by the FDA, and a number of other drugs are under development, several of them in late phase III. Because discovery efforts and development of anti-psychotic drugs continue at a rapid pace, attention to the methodology of clinical trials in this area is of particular importance.

The Patient Populations Studied

The illnesses for which anti-psychotic medications are administered are generally long-term and chronic. Schizophrenia, a chronic relapsing disorder, represents the paradigmatic illness in which anti-psychotic drugs are used, and most medication development programs are carried out using schizophrenic subjects almost exclusively. Remarkably, with the exception of clozapine, no other marketed drug or compound under development is formally indicated for treatment of schizophrenia. All are labeled for use in the treatment or management of psychosis; long-acting forms of anti-psychotics are indicated for use "in the management of patients requiring prolonged parenteral anti-psychotic therapy, e.g., patients with chronic schizophrenia" (72).

Patient samples studied in recent drug development programs of anti-psychotic medications tend to be male, with an average age in the late 30s and an illness duration of about 10 years (62). The predominance of men in such studies reflects both past FDA policies restricting use of experimental medication in women of childbearing potential and the fact that Veterans Administration hospitals are a common venue for such clinical trials. Even among eligible patients, schizophrenic men are less likely to refuse to participate in trials than schizophrenic women (79). Reasons for refusal include unwillingness to accept the risk of randomization, resistance to the idea of change, or unwillingness to consider experimental medication. Since some studies also suggest that men are less responsive to anti-psychotic medication than women (27), refusal of women to participate in trials may lead to a less treatment-responsive study population.

Trials of anti-psychotic medication are not conducted in samples drawn from defined populations such that generalization to those populations can be confidently made. The report by Robinson and colleagues (79) is a rare report comparing participants with those who were eligible but rejected the invitation to participate. Failing the possibility of drawing patient samples from enumerated populations as is sometimes done in prevention trials (8), the comparisons of trial participants and those who do not enter trials can provide valuable information regarding the generalization of findings to populations of interest. The inclusion of such data in trial reports would be valuable.

Kane and colleagues (39) have commented on the importance of diagnostic specificity for clinical trials in schizophrenia. Most drug development trials still rely on chart diagnosis, rather than on more formal diagnostic assessment requiring a structured interview linked to a formal diagnostic system. The Structured Clinical Interview for DSM-IV Diagnosis (89) is the most widely used diagnostic assessment instrument that allows confident diagnosis if completed by a well-trained clinical assessor. In evaluating the results of clinical trials in schizophrenia conducted over time, it must be recognized that there have been changes in diagnostic specificity and diagnostic fashion. Early studies of anti-psychotic medications in schizophrenia may have included patients who would now be diagnosed as DSM-IV Psychotic Disorder NOS, Schizoaffective Disorder (bipolar or depressive type), or Bipolar Disorder. From one perspective, this suggests that findings from early trials conducted in so-called schizophrenia patients may well be applicable to a broader spectrum of psychotic disorders. However, in the absence of specific information, this is speculative. It is more certain that the composition of patient populations in studies of anti-psychotic medications has changed over time.

A further concern regarding trial participants is whether they are medication responsive or refractory. The Kane et al. study (38) of clozapine was specifically designed to include only treatment refractory patients, but many trials that are not specifically designed for treatment refractory patients may have an overrepresentation of such subjects because of the settings in which they are conducted and biases in subject recruitment. One way in which this occurs is through the decision by physicians or others not to present a particular study to a potential subject. The most severely ill patients may be excluded because of a concern regarding the risk of further clinical worsening. This bias presumably excludes the least medication responsive subjects. The most successfully treated patients may be excluded because of a fear of jeopardizing fragile clinical stability. Excluding these patients means that subjects in trials of new anti-psychotic agents will be less rather than more medication responsive. Usually, these exclusions are tacit and their extent is not measured, but Leff and colleagues made exclusions explicit in an early maintenance treatment trial (52). A further recommendation, therefore, is that trials of anti-psychotics in schizophrenia record the range of exclusions related to medication responsiveness.

Trial Duration

As noted earlier, schizophrenia and other psychotic illnesses tend to be chronic and have a relapsing course. Despite this, most clinical trials tend to be relatively short (six to eight weeks). Such trials may be too brief to evaluate efficacy for specific symptom complexes such as negative symptoms or cognitive dysfunction and cannot address important questions regarding maintenance of overall clinical efficacy or emergence of long term side effects such as tardive dyskinesia. There is a substantial literature on maintenance treatment with older anti-psychotic medications (85, 86) but long term outcome data for newer compounds come primarily from open label extension studies. These studies provide only limited information regarding the role of newer agents in maintenance treatment and do not allow precise estimates of risk of relapse during extended periods with these newer drugs compared to older agents.

A major consideration in determining the length of a trial is whether subjects can be kept in their randomized treatment condition for the full trial duration. High dropout rates across all treatment arms limit the interpretation of study results. Differences in dropout rates among treatment arms can yield information about the differential efficacy or toxicity of specific treatment regimens being compared.

Clinical Trial Designs

Within the general paradigm of phase III clinical trials that entails randomization to treatment and provision of treatment and clinical assessment under double-blind conditions, a range of different trial designs are used as shown in the table. Several of the more important designs used in the development of anti-psychotic medications will be described and their advantages and disadvantages identified.

Flexible Dose Designs

As shown in the table, flexible dose designs involve up to three treatment groups: the experimental drug, a reference drug, and a placebo. The reference medication may be omitted early in a drug development program and the placebo condition may be omitted in later studies. If a reference medication is used, a dosage equivalence between the new medication and the reference drug must be established on a priori grounds. Within the limits of the established dosage equivalence, dosage is titrated according to clinical response, often with some caveats regarding speed of dose titration and always with minimum and maximum dosage limits.

This design offers several advantages. First, titrating dosage to a desired clinical effect or until side effects intervene mirrors clinical management. Second, the placebo control, if present, makes it possible to determine whether the experimental condition being studied is an effective antipsychotic in comparison to no medication. The use of both a placebo and a reference control provides what has been termed "assay sensitivity". Despite the clear demonstration in numerous studies that anti-psychotic medication offers superior efficacy to placebo treatment, there is a wide range of placebo response across studies (40) so that in the absence of a placebo control, a finding of no difference between a new drug and a standard agent can be interpreted as indicating efficacy for both or for neither. Third, the design also addresses the question of whether the new medication is comparable to or better than a currently available medication both in terms of anti-psychotic effect and side effects.

This study design has a number of limitations. First, dose titration makes it impossible in a single study to determine whether there is an optimal dosage within the range studied. Correlation of dose and response may well show that those subjects who received the highest dose showed the poorest response since dosage may be increased if patients do not respond at lower doses. Second, depending on the speed of titration, a response that is attributed to the dose the subject was receiving when improvement first appeared may actually be due to the passage of time and response to a lower dose. Differences that are attributed to dose may be a function of time. Further, flexible dose designs do not allow for the clear assessment of the relationship of plasma drug concentrations and clinical response.
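
The first limitation can be illustrated with a small deterministic sketch. The titration schedule, dose steps, and patient data below are hypothetical, chosen only to show the mechanism: dose is raised weekly until a patient responds, and response in this toy model depends only on elapsed time, so non-responders necessarily finish at the highest dose.

```python
# Hypothetical titration schedule: dose (mg/day) in effect during weeks 1-4.
DOSE_BY_WEEK = {1: 5, 2: 10, 3: 15, 4: 20}

# Hypothetical patients: week of first response, or None for non-responders.
# In this toy model response depends only on elapsed time, not on dose.
RESPONSE_WEEK = [1, 2, 2, 3, None, None]


def final_dose(response_week):
    """Dose is raised weekly until the patient responds, then held."""
    if response_week is None:
        return DOSE_BY_WEEK[max(DOSE_BY_WEEK)]
    return DOSE_BY_WEEK[response_week]


def response_rate_by_final_dose(response_weeks):
    """Return {final dose: fraction of patients at that dose who responded}."""
    counts = {}
    for week in response_weeks:
        dose = final_dose(week)
        responded, total = counts.get(dose, (0, 0))
        counts[dose] = (responded + (week is not None), total + 1)
    return {dose: r / t for dose, (r, t) in sorted(counts.items())}


rates = response_rate_by_final_dose(RESPONSE_WEEK)
for dose, rate in rates.items():
    print(f"final dose {dose} mg/day: response rate {rate:.0%}")
```

In this sketch every patient at the highest final dose is a non-responder, even though dose has no effect on response at all, which is why a naive dose-response correlation from a flexible dose study can mislead.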

Fixed Dose Designs

The Table also presents a model study design that incorporates more than one fixed dose of the new experimental drug, a placebo control, and a single fixed dose of reference drug. This design offers advantages over the flexible design and is amenable to clinical trials of anti-psychotic agents. First, it is not necessary to determine milligram equivalence between the experimental and the reference drug. Second, dosage and time are now independent of each other for the experimental medication so that time to response can be better assessed. Third, it is possible to determine, within the range of doses studied, an optimal fixed dose for the new drug. Fourth, this design allows study of the relationship of plasma drug concentrations to both clinical response and side effects. The advantages of placebo control described above still apply, and in addition, the likelihood of randomization to placebo medication for a given subject has been reduced from one in three to one in four or less, depending on the number of doses of the experimental medication being studied.
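
The placebo allocation figures quoted above follow from simple arithmetic. The sketch below assumes one placebo arm and equal allocation across all arms (an assumption implied by the figures in the text but not stated); the function name is illustrative.

```python
from fractions import Fraction


def placebo_probability(n_experimental_arms, n_reference_arms=1):
    """Chance of assignment to placebo with one placebo arm and equal allocation."""
    total_arms = n_experimental_arms + n_reference_arms + 1  # +1 for placebo
    return Fraction(1, total_arms)


designs = [("flexible dose", 1), ("two fixed doses", 2), ("three fixed doses", 3)]
for label, n_experimental in designs:
    print(f"{label}: {placebo_probability(n_experimental)}")
```

With unequal allocation ratios the calculation would instead weight each arm by its share of the randomization schedule.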

Some disadvantages remain. Only a single dose is evaluated for the reference drug and that may not be the optimal dose. If it is too high, side effects may be unduly high; if it is too low, efficacy may be underestimated. Both of these conditions limit the ability to draw conclusions regarding comparative effects of the experimental drug and the standard drug.

Fixed Dose Designs with Multiple Doses of New Medication and Reference Drug

A more ambitious design for an anti-psychotic drug trial incorporates multiple fixed doses of both the new medication and the reference compound as well as a placebo control. This design addresses a number of the disadvantages of using only a single dose of the reference medication described above. If three doses of each medication are used, it also would have the advantage of reducing the probability of randomization to placebo to one in seven. The major disadvantage of designs with large numbers of cells is that they require large numbers of subjects in order to have adequate power to make all appropriate comparisons. Such a study design is more costly and time-consuming. If the number of subjects per cell is inadequate, a clinically meaningful difference may go undetected.
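
How quickly total enrollment grows with the number of cells can be sketched with the standard normal-approximation formula for comparing two means. The effect size (0.5 standard deviations), two-sided alpha of 0.05, and 80% power used here are illustrative assumptions, not values from the studies discussed, and the sketch ignores dropouts and any adjustment for multiple comparisons.

```python
import math
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Subjects per arm for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha/2 + z_power) / d) ** 2,
    where d is the standardized difference between arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def total_subjects(n_cells, effect_size, alpha=0.05, power=0.80):
    """Total enrollment if every cell is sized for a pairwise comparison."""
    return n_cells * n_per_arm(effect_size, alpha, power)


for cells in (3, 4, 7):
    print(f"{cells} cells: {total_subjects(cells, effect_size=0.5)} subjects")
```

Under these assumptions the seven-cell design needs more than twice the enrollment of a three-cell design, before any allowance for attrition.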

Measuring Efficacy

The Brief Psychiatric Rating Scale (BPRS) (67) represents one of the most secure and stable assessment instruments in clinical psychopharmacology. Two early factor analytic studies of the scale (29, 68) yielded a series of sub-scale scores that, in combination with a Total BPRS score, have been widely used over a generation and distinguish both between anti-psychotic medications and placebo as well as between newer anti-psychotics and older ones. More recently, two factor analytic studies of the BPRS using confirmatory factor analytic methods (33, 66) suggest that the traditionally used factor structures of the BPRS may not provide the most sensitive measures of outcomes. Each of these articles provides alternative factor solutions that may be more sensitive to change. Further, both of these solutions identify a Disorganization factor, similar to the one identified by Liddle and colleagues (60).

At the same time, other instruments have been developed that provide greater attention to symptoms that are of concern in schizophrenia, in particular negative symptoms. The Positive and Negative Syndrome Scale (PANSS) developed by Kay (42) was explicitly designed to be a successor to the BPRS. It incorporates all 18 item names of the BPRS, but the item definitions do not correspond fully to the BPRS items. Therefore, the current practice of extracting a BPRS Total Score from PANSS ratings is not appropriate. The PANSS provides expanded coverage of negative symptoms, including some measures of cognitive function and items that assess mood. The PANSS has three sub-scales: positive, negative, and general psychopathology. However, factor analytic studies of the PANSS (e.g., 61) consistently show a five factor structure: delusions and hallucinations; negative symptoms; disorganization; anxiety and depression; activation and excitement.

The Comprehensive Assessment of Symptoms and History (CASH) (4) incorporates detailed scales for the assessment of positive and negative symptoms of schizophrenia, the Scale for the Assessment of Positive Symptoms (SAPS) and the Scale for the Assessment of Negative Symptoms (SANS) (5). A factor analytic study of the SAPS and SANS items (Andreasen 1995) reveals three factors: negative, disorganized, and psychotic (specifically delusions and hallucinations).

It appears that factor analytic studies of scales that are restricted to core positive and negative symptoms of schizophrenia show a pattern of three dimensions: delusions and hallucinations; disorganization; and negative symptoms. Analyses that include a broader range of symptomatology identify one or two additional factors: an anxiety-depression factor and an activation factor. On balance, these studies suggest that in the evaluation of new anti-psychotic agents, broad coverage of symptomatology is valuable and may increase ability to detect therapeutic differences between older and newer anti-psychotic medications.

Degrees of Blinding Studies

Rigorous blinding procedures for RCTs are essential in psychopharmacology because outcome measures are based on subjectively-rated items (26). The double-blind designs of early clinical research have undergone evolution and sophistication to protect against and minimize experimental bias. An example of bias would be receiving cues about treatment assignment by study personnel from the actual or perceived side effects of the treatments (20). It is important to instruct research staff to resist drawing inferences about a patient's treatment assignment when assessing clinical response. Personnel need to be encouraged to maintain objectivity by using a standardized approach to eliciting symptoms and making ratings.

Another concern is the possibility that outcomes of patients enrolled in the early phase of a trial may bias data of later subjects, if treatment assignment is made known to the investigator as each patient completes the study. Emerging patterns of side effects or response might influence the clinician's ratings and decision-making as the trial progresses, thereby introducing a systematic experimental bias. This is obviously undesirable from a research standpoint and is a source of bias that is difficult to detect or quantify.

A third concern is that knowing the randomization of completing patients increases the likelihood that unplanned and inappropriate interim analyses will be done. This also can introduce bias as a study progresses, in addition to raising the issue of multiple probability testing, which calls for statistical penalties in setting P values. The pressure to publish interim results to enhance curriculum vitae or to provide progress reports to a granting agency should be resisted. For all of these reasons, many now advocate that key RCTs be implemented in a "triple-blind" mode, i.e., with neither the clinical/research staff nor the study sponsor having access to the randomization code until the clinical phase of a study is complete and all data are in a verified data base. A procedure for emergency codebreaking for an individual patient should be provided for in the event of a serious adverse event, but this should be rarely invoked.
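
The statistical penalty for repeated looks at the data can be sketched with the Bonferroni inequality, used here only as a simple and conservative illustration; it is not presented as the method any particular trial used, and group-sequential boundaries applied in practice are generally less conservative. The function names and the choice of five looks are assumptions for the example.

```python
def bonferroni_alpha(overall_alpha, n_looks):
    """Per-look significance threshold that keeps the overall error rate
    at or below overall_alpha (Bonferroni correction)."""
    return overall_alpha / n_looks


def worst_case_familywise_error(per_look_alpha, n_looks):
    """Upper bound on the chance of at least one false positive when each
    look is tested at per_look_alpha with no correction (union bound)."""
    return min(1.0, per_look_alpha * n_looks)


looks = 5  # hypothetical: four interim analyses plus the final analysis
print(f"uncorrected bound on error rate: {worst_case_familywise_error(0.05, looks):.2f}")
print(f"Bonferroni per-look threshold:   {bonferroni_alpha(0.05, looks):.3f}")
```

Because successive looks at accumulating data are correlated, the true inflation is smaller than the union bound, but it still exceeds the nominal level, which is the reason unplanned interim analyses are discouraged.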

In planning a study, particularly a lengthy or large multi-center trial, an independent safety committee should be considered for monitoring very high risk treatments, or when there are compelling reasons for periodic review of data by knowledgeable experts. The study protocol should specify the functions of the monitoring group, its responsibilities for maintaining the study blind, and decision-making procedures to be followed.

It is more or less the standard within the pharmaceutical industry to conduct RCTs in psychopharmacology as triple-blind studies, especially if they may be potentially pivotal in a regulatory sense. There are some disadvantages to this approach: triple-blind conditions add complexity and increase drug development costs. This is because triple-blinding delays review and analysis of often useful data, which might be helpful in planning follow-on, overlapping phase II studies. In drug development, where studies are often sequential, this delayed access to clinical data leads to inefficiencies in an already costly clinical development process. In some instances it proves disadvantageous to patients in a study as well. The clinician can have a compelling clinical need to know which drug was given so that optimal continuation therapy can be decided upon. One approach to this dilemma, which better ensures continuity of care, is to allow double-blind continuation treatment for responding patients, if the clinician feels it would benefit the patient (3). This design strategy avoids disrupting an effective therapy, protects the blinding, and can generate valuable data about a drug's long-term effectiveness.

Occasionally, it may not be feasible, or ethical, to conduct a clinical investigation of an agent where all of the treatment providers are blinded. In this circumstance, it has been common practice to either have all formal ratings made by an independent rater unaware of treatment assignment, or else to give access to clinical data to a senior staff member, with all others remaining blind. Such an approach would be needed, for example, with concentration-controlled RCTs. Obviously, these modifications of true double-blinding risk compromising the validity of a study by introducing experimental bias. Designs that threaten the integrity of a double-blind protocol should be approached with the greatest caution, and other alternatives should be explored before settling for something less than a triple-blind mode. A degree of skepticism is to be expected from reviewers and regulators if there have been deviations from strict protocol-defined maintenance of study blind.

Treatment Outcome Measures

Standardized Rating Instruments

Many psychometric rating instruments have been in use for two or three decades or more to clinically evaluate psychotropic drugs. While this attests to the validity of these scales, it does not follow that they are necessarily the ideal, or most sensitive, scales for all applications. Many rating scales do not adequately reflect current understanding of the phenomenology of disorders, or they omit core symptoms (50). It is unfortunate that more resources are not allocated to scale development in clinical psychopharmacology. This could facilitate the drug discovery process, as well as promote clinical research in general. To improve and validate a scale, or to develop a new one, is a daunting task that often exceeds the capabilities of the individual investigator. The pharmaceutical industry sponsor of a new drug is similarly limited in ability to create a new outcome measure that might be better suited to novel drugs. Data based on proven instruments are required in a New Drug Application (NDA). Outcome data derived from a new rating scale would be regarded as supplemental evidence only. The emphasis placed on use of standard instruments by regulatory agencies, while understandable, has drawbacks for the field of psychometric scale development. For a truly novel compound, standard scales could be insensitive to, or even biased against, detecting therapeutic effect (50).

Another dilemma is the fact that many symptoms of psychiatric disorders and treatment-emergent side effects of psychotropic drugs overlap each other. This could result in a new therapeutic agent being disadvantaged compared to standard or placebo treatment if efficacy ratings are affected by side effects, for example, somatic side effects perceived as worsening of depressive symptomatology or transient anxiety or psychic activation caused by drug that later resolves with ongoing treatment.

Another potential bias of existing rating instruments is apparent in the case of the universally used Hamilton scales for depression (31) and anxiety (32). These rating instruments were developed during the era of tricyclic antidepressants and benzodiazepines, when validity and assay sensitivity were established based on response to these drugs. Greater latitude in the use of modified scales and new rating instruments seems justified to avoid overlooking useful and novel properties of investigational drugs.

Choice of Outcome Measures

In assessing efficacy, it is desirable to measure both change in symptoms and change in global functioning (83). The number of primary outcome measures specified in the protocol should be held to a minimum, primarily for statistical reasons, to avoid a penalty for testing multiple hypotheses. An overabundance of scales can be intrusive on the clinical interview, adversely affecting the quality of the data. A well-validated symptom rating instrument and a global improvement scale constitute appropriate primary outcome measures for efficacy evaluations in RCTs.

Well-being and level of social functioning are outcomes of drug treatment that are of interest and deserve study (83). With the recent emphasis on clinical services and effectiveness research, quality of life (86), and cost effectiveness (15, 92), it is desirable to consider outcome research measures for incorporation into a drug trial. Scales in these domains are relatively untested, especially in psychopharmacology. Increasingly, health authorities and formulary committees require cost-effectiveness data, in addition to efficacy and safety data, as a condition of drug registration and reimbursement. For this reason, there is a shift in focus to encompass long-term outcomes and costs, ideally in comparison to a treatment standard (16). Such studies are still in the rudimentary design stages in the case of psychotropic drugs, but inclusion of ancillary measures of clinical effectiveness in planning large-scale drug trials deserves consideration.

Safety Outcome Measures

Recording safety data is a key element of every drug study, irrespective of study objectives or design. This is the case for protocols designed to evaluate efficacy, to make biological or pharmacokinetic measurements, or to assess tolerability. Methods for quantifying side effects are less advanced than those for measuring efficacy (9, 51). Nevertheless, collection of safety data is an essential element of every drug trial. All adverse experiences should be captured either on a symptom rating scale or on an adverse event (AE) form.

Although structured AE rating instruments are available, it is usual practice to elicit AEs by general questioning and by voluntary report of the patient. A standard glossary of AE dictionary terms can be supplied to each investigator as a supplement to the study protocol. This serves the purpose of standardizing the terminology used by all investigators involved in a drug development program, and it results in improved quality and precision of safety data. It also facilitates data entry and allows for more meaningful analyses of the diverse safety terms when consolidating safety results across studies (51).

A comprehensive but lengthy interview such as SAFTEE (58) can be time-consuming, thereby limiting how frequently it is practical to administer it. A leading concern about use of this instrument is that it generates more AEs than an unstructured approach to elicitation. In their methodological comparison study, Rabkin et al. (77) concluded that general elicitation of AEs is appropriate and more practical for most clinical trials than is SAFTEE. Abbreviated versions of SAFTEE have been successfully adapted for use in specialized research applications, for example, phase I tolerability studies. Structured safety instruments are also valuable for assessing extrapyramidal symptoms and tardive dyskinesia in antipsychotic drug studies and for withdrawal symptoms in antianxiety drug trials.

Most health authorities agree on the regulatory definition of a serious AE, which requires special reporting. Adverse events in this category include any event within 30 days of exposure to the test drug that is life-threatening or fatal or involves hospitalization, overdose, cancer, congenital abnormality, or permanent disability. Regulations require prompt notification of appropriate regulatory agencies and institutional review boards of such an occurrence if there is a reasonable possibility that the AE could be caused by drug (or if the AE is unexpected, i.e., not in the investigator's brochure or in the package labeling for an approved drug). In the case of marketed drugs, a suspected drug reaction in this category should be brought to the attention of the manufacturer by the investigator or practitioner, and/or it may be directly reported to the FDA.

Rare serious AEs, that is, those with an incidence less than one per 1,000, are difficult to detect during the pre-approval development stage of a new drug because the number of patients exposed is too small (51). After a drug comes into wider clinical use, spontaneous AE reports remain the mainstay of detection for rare and unexpected side effects and toxicity. This spontaneous AE reporting network functions as an important early warning system, providing valuable information to the drug sponsor and alerting regulatory agencies.
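
The difficulty of detecting rare events before approval can be made concrete with a short calculation. The sketch assumes each exposed patient independently has the same probability of experiencing the event, which is a simplification; the function names and the example exposure of 3,000 patients are illustrative.

```python
import math


def prob_at_least_one(incidence, n_exposed):
    """Probability of observing one or more events among n_exposed patients."""
    return 1.0 - (1.0 - incidence) ** n_exposed


def exposure_needed(incidence, confidence=0.95):
    """Smallest number of exposed patients giving the stated probability
    of observing at least one event."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - incidence))


for label, incidence in [("1 per 1,000", 1 / 1000), ("1 per 10,000", 1 / 10000)]:
    seen = prob_at_least_one(incidence, 3000)
    needed = exposure_needed(incidence)
    print(f"{label}: P(seen in 3,000 patients) = {seen:.2f}, "
          f"patients needed for 95% = {needed}")
```

Under these assumptions roughly 3,000 exposed patients are needed for a 95% chance of seeing even a single case of an event with an incidence of one per 1,000, and a single case rarely suffices to attribute the event to the drug.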

Several approaches to the post-marketing surveillance of new psychotropic drugs have been taken, with uneven success (35). Creation of an FDA center for assessing pharmaceutical effectiveness has been advocated that would collect post-approval data from clinicians (78). One promising study design relies on telephone queries of pharmacy-based cohorts of patients who agree to report on their response to a psychotropic agent, using another standard drug as a concurrent comparator (19). In general, psychotherapeutics has suffered from a paucity of post-marketing studies, in part because of the cost and methodological and logistical complexities of such studies. It is of interest that in psychopharmacology, formal post-marketing surveillance studies so far have failed to detect unsuspected AEs and toxicities of clinical import, while the spontaneous AE reporting system has served the purpose well (35).

STUDY IMPLEMENTATION, MONITORING, AND DOCUMENTATION

Industry-sponsored Trials

Today drug discovery and development are heavily influenced by good laboratory and clinical practice guidelines that have been adopted by regulatory agencies worldwide (43). These guidelines clarify the responsibilities of both individual investigators and drug sponsors undertaking pharmaceutical research. In addition to requiring peer review and informed consent, clinical guidelines specify the importance of adherence to the study protocol and stress the need to promptly report any unexpected or serious AEs. Furthermore, drug supplies must be accounted for and research records maintained until at least 2 years after an NDA is approved or the investigation of a new drug (IND) is discontinued.

Most industry sponsors of drug trials have adopted standard operating procedures, which include a requirement for pre-investigation visits to investigators to ensure that they fully understand the protocol and their obligations as required by the regulations (41). Investigators' meetings and frequent monitoring of studies by research sponsors, inter-rater reliability training, and ongoing monitoring during the progress of the study are critical functions, especially with multi-site trials.

Documentation of Clinical Trials

Clinical research, whether sponsored or unsponsored, requires attention to quality control through proper documentation. An RCT protocol should specify objectives, state hypotheses, define response criteria, and present the statistical power calculations and proposed methods of data analyses. Subsequent changes of the protocol should be kept to a minimum, with protocol amendments furnished to the appropriate institutional review committees, regulatory agencies, and research sponsors.

Other research documentation is required to supplement research case report forms and the study protocol. For example, an overall research plan should summarize the project, providing relevant background information, stating overarching hypotheses of interest (both a priori and exploratory), and describing general research directions. As an effective way to improve precision and reduce variability, both of which are paramount concerns in psychopharmacology, methodologists generally advocate creating an operations manual that contains specific instructions for making the measurements (36). Included would be written directions for carrying out every procedure: for example, how to carry out and record an interview or make a measurement, how to calibrate the rating instruments, and how to maintain and document reliability among raters/observers across sites and over the duration of the trial. It is essential to maintain a daily log for each separate trial so as to document the day-to-day activities of a study. Some method of cross-referencing research records to clinical records without compromising confidentiality should be devised for study audit purposes. The goal of comprehensive research documentation is to facilitate and validate the data collection and to ensure the integrity of the study. This allows one to certify the basis for arriving at conclusions and for claims made about a drug that is subject to regulatory review and approval.

REPORTING RESULTS OF TRIALS

Results of RCTs are reported in different ways depending on the purpose and intended audience. Federally sponsored research is generally summarized in progress reports, usually annually. Often, ongoing trial results and interim analyses are included in progress reports. In accordance with regulations, results of industry-sponsored trials are reported periodically to the Food and Drug Administration. Regulations require a degree of detail in these reports that exceeds what would be appropriate for publication purposes but is necessary for regulatory oversight.

How clinical research results should be disseminated by publication has been addressed by Kupfer et al. (46, 47). These authors and others have pointed out how the resistance of journal editors to overly long and detailed presentations of study results imposed by space limitations runs counter to the readers' need for more informative and detailed documentation of methods employed, problems encountered, and so on (10, 54, 65). Sufficient details of study design, a priori hypotheses, statistical power analyses, study implementation, and rationale for methods of data analysis employed should be included in a publication to allow one to judge the authors' conclusions. Documentation of the quality of the measurement procedures and adequate justification of statistical tests used should be provided. Information should also be supplied about patients excluded as well as those enrolled. "Intent-to-treat" patient sample data should always be presented, as well as other types of analyses.

Authors should discuss possible compromises and any limitations to the scientific validity of the conclusions drawn, either as a result of decisions made during planning or due to lack of protocol compliance by investigators or patients. Kupfer and colleagues (46) stress that even in the most carefully designed and executed studies, unanticipated problems arise and accidents happen, and when these affect the results of a study, they must be reported. As Mosteller et al. (65) state, "We must encourage authors to report, and editors to publish, work that includes a thoughtful discussion of mistakes and difficulties encountered in the research. Without more systematic reporting of problems encountered in clinical trials, we will be unable to do much about them."

Many authors (e.g., 1, 11, 17, 46, 53) have admonished editors and reviewers to focus more on the proper description of the methods and a complete reporting of results of the protocol-defined experiment. Editors should resist requests by referees for an author to address ancillary issues that the study was not designed to test, no matter how intriguing. A common problem confronting the reader is the lack of a clear distinction in the report between the confirmatory (hypothesis-testing) and the exploratory (hypothesis-generating) aspects of a study.

Presentation of results needs to be made more informative, for example, using confidence intervals and box-and-whisker plots more frequently, in addition to reporting P values (82, 87). Better data presentation allows the reader to more accurately assess whether the findings of a clinical trial are justified by the methodology employed and to judge the clinical importance of the effect size.
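
As one illustration of reporting an interval alongside a point estimate, the sketch below computes a confidence interval for the difference in mean improvement between two arms. The change scores are invented for the example, and the interval uses a normal approximation; with samples this small a t-based interval would be more appropriate.

```python
import math
from statistics import NormalDist, mean, variance


def mean_difference_ci(drug, placebo, level=0.95):
    """Difference in means (drug minus placebo) with a normal-approximation
    confidence interval. Returns (difference, lower bound, upper bound)."""
    diff = mean(drug) - mean(placebo)
    std_error = math.sqrt(variance(drug) / len(drug)
                          + variance(placebo) / len(placebo))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * std_error, diff + z * std_error


# Hypothetical improvement scores (baseline minus endpoint) for each arm.
drug_change = [12, 15, 9, 14, 10, 13, 11, 16]
placebo_change = [8, 10, 6, 9, 11, 7, 5, 8]

diff, low, high = mean_difference_ci(drug_change, placebo_change)
print(f"mean difference {diff:.1f} points (95% CI {low:.1f} to {high:.1f})")
```

Unlike a bare P value, the interval shows both the size of the estimated effect and the precision with which it was measured, which is what a reader needs in order to judge clinical importance.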

More complete and informative reporting of RCTs in publications is needed for a full and accurate evaluation of the effects of a psychotropic drug. To accomplish this, both investigators and editors must come to appreciate that adequate description and self critique of trial design and implementation in addition to presentation of results is essential to the scientific process and deserves sufficient space.

ETHICAL CONSIDERATIONS

There is a rich literature encompassing all aspects of the ethics governing clinical trials in medical research (24, 34, 37). Extensive discussions have addressed such diverse issues as requiring the clinician to act simultaneously as physician and scientist, accepting randomization of a treatment condition, use of placebo, publication redundancy, and complete disclosure of potential conflicts of interest. In the field of psychotropic drug evaluation, there has been a particular focus on the ethical aspects of double-blind, placebo-controlled trials, which are a more common standard in the U.S. than elsewhere. While this topic is too extensive to discuss here, it can be stated that the ethical basis for placebo-controlled trials is well established. This is a preferred study design for most efficacy trials for psychiatric disorders, including the depressive and anxiety disorders, psychoses, and others. Justification for using this design includes minimizing experimental bias and the number of patients exposed to an investigational drug, and maximizing the scientific validity of a study.

Informed Consent

The subject of informed consent has assumed increasing importance in psychiatric research over the past decade and there is a developing body of case law governing this area (81). To become a participant in a clinical trial, subjects must be fully knowledgeable about the purpose and potential benefits and risks of the study before granting informed consent, which is an ethical prerequisite for all medical research. Some patient populations pose special ethical dilemmas because they may be more vulnerable due to impaired ability to comprehend all relevant information concerning participation in a study (24). Examples are patients with certain psychiatric disorders such as schizophrenia, organic mental disorders, e.g., degenerative dementia of the Alzheimer's type, and mental retardation. Such populations require extra precautions that may necessitate legal guardians or patient surrogates to protect their rights. Patient representatives have responsibility for adequate awareness of a mentally incompetent patient's background and attitude in order to decide if the subject would have been likely to consent if he or she were competent. This proxy consent is morally valid insofar as it is a reasonable presumption of what the patient would have wished (24, 63). Another criterion for surrogate granting of consent is whether the proposed research would be objectionable to mentally competent persons. In any case, if the mentally incompetent individual objects to the proposal, withholding consent, the subject's decision holds precedence over a surrogate's.

An approach that applies to patients with disorders, such as Alzheimer's disease, who may go on to develop impaired comprehension as their illness progresses, is the use of a durable power of attorney. This procedure allows an appointed representative to act in the patient's best interests in exercising consent when the individual becomes impaired (21, 24). If there is no advance directive in place for a mentally incompetent subject, a legally authorized representative must be obtained through court procedures under existing law.

It has been proposed that substitute consent be permissible only for research where potential harm or risk to the subject is minor or minimal (2). However, in the case of disorders such as schizophrenia and Alzheimer's disease, which are serious and debilitating illnesses, it has been argued that the potential therapeutic benefits of new treatments may be greater than for other patients. This might justify assuming greater research risks for subjects with these disorders than would apply to other patient populations (13).

Confidentiality of Records

Maintaining confidentiality of research records is a responsibility of every investigator, but there are added concerns for subjects in psychopharmacological trials (24). A subject may be placed at risk if participation in a study for treatment of a psychiatric disorder became known to employers, insurers, or peers. Subjects with impaired psychomotor skills due to drug effects may put themselves and others at risk on the highway or in their occupation. The investigator must carefully instruct research subjects about special precautions to be taken, especially in the situation where a subject's research participation is confidential information and not made known to others. It is incumbent on institutional review committees to satisfy themselves that adequate safeguards are in place to protect the rights of the individual and the public.

A special concern is maintaining confidentiality of substance abusers because merely identifying them as subjects represents a significant risk in many cases. Regulations have been issued by the U.S. Department of Health and Human Services mandating confidentiality measures that apply specifically to the handling of research records of alcohol and drug abuse patients (12). These regulations, if adhered to, can provide considerable protection from subpoena of patient records. However, there may be extenuating medical reasons for releasing information about a subject's substance abuse. The potential for disclosure needs to be discussed with all research subjects agreeing to participate in a study, especially those involving study and treatment of substance abuse (12, 91).

Use of Financial Inducements in Drug Trials

In some instances, in-kind or monetary compensation of research subjects may be justified. Reimbursement of travel expenses may be appropriate for subjects who have to travel considerable distances or for those with extremely limited resources. It is customary to pay a moderate stipend to normal volunteers and research subjects in phase I studies requiring housing in a 24-hour surveillance facility, as compensation for extensive time commitments and for assuming risk in the absence of any potential benefit. For high-risk research, the study sponsor and the institutional review committee should determine that compensation will not compromise the voluntariness of potential subjects, especially those who may be more vulnerable because of economic need (24). Analogous situations may arise with recruitment of prisoners and drug abusers, when other forms of implied coercion, in addition to financial, may be a concern. Subjects in these groups have the right to participate in drug research, but it must be established that they are not put at undue risk, and that their voluntary status is not influenced by the criminal justice system.

Advertising for research patients in drug studies is becoming commonplace. This practice can be justified in part because, for many psychiatric disorders with high prevalence in the community, patients may never seek or receive treatment despite long-standing illness and disability. Such research patients, so-called patient volunteers, can benefit greatly from diagnosis of a treatable mental disorder. Typically, in return for participation, patients in drug trials undergo comprehensive clinical evaluation and receive care for their disorder at no cost. This often includes aftercare for the disorder for a reasonable period of time (up to 6 months or a year) after the patient finishes the formal trial. In any case, investigators are ethically bound to make adequate follow-up arrangements or referral of patients as they complete a protocol. Care at no cost, including appropriate follow-up, is regarded as being reasonable research compensation, without the stigma of excessive or unethical financial inducement. This practice has been found to contribute to better patient compliance and to enhance scientific validity through improved quality of data collection and better long-term follow-up.

Ethical concerns have been raised where the practice of advertising for patients for a drug trial included offers of substantial financial compensation. Financial reward should never be used to balance significant risk to human subjects, when it would constitute a form of enticement. This practice also raises methodological concerns because it may attract less representative patients into a drug study, adversely affecting the quality of the data. When used in a study, advertisements for subjects should be reviewed and receive prior approval by the responsible institutional review committee.

Drug trials involving psychiatric disorders are especially susceptible to experimental bias, as discussed earlier in this chapter. It has been argued by some that advertising for patients for psychopharmacological studies may solicit inappropriate subjects, leading to an unrepresentative patient sample and a scientifically flawed trial. This raises a different type of ethical concern because it would place subjects at risk without potential societal benefit, and in fact, might lead to erroneous results that could prove detrimental to others. It is interesting that several investigators recently compared the records of patients with depressive disorders recruited by advertising with those of patients obtained by clinical referral and from office-based practice (7, 57, 64). They found no meaningful differences between groups and concluded that patients recruited by advertisement constituted a representative sample of the target disorder. Of interest was the fact that many of these patients had disorders of long duration that had never been treated, often because patients were unaware of the nature of their symptoms until viewing the advertisement and being motivated to seek out treatment.

The practice of offering physicians "finder's fees" for referring patients to research is unethical because it creates an inherent conflict of interest for the physician and may interfere with the physician's judgment in informing patients about alternative therapies. Fortunately, this appears to be an uncommon practice. In any case, financial incentives as part of a research project require full disclosure in the consent form, as approved by the institutional review committee. Failure to disclose such information to patients by the referring physician, or by those offering to pay for the referral, could result in liability in the event of harmful effects (80).

Fortunately, advertising for subjects, when the content is primarily educational and does not involve monetary incentives, seems to be an acceptable and practical way to find patients for psychopharmacological drug studies. Patients in general benefit from accurate psychiatric assessment and becoming informed about their symptoms, often for the first time. Eligible research patients go on to receive treatment with careful monitoring and evaluation of response, and often subsequent follow-up.

FUTURE DIRECTIONS IN CLINICAL PSYCHOPHARMACOLOGY TRIALS

Conducting a clinical evaluation of a new or existing drug is a sophisticated and demanding process if one is to ensure accurate and unbiased safety and efficacy data. Rigorously controlled trials provide the most reliable basis for decision making about new drug development as well as how best to use a drug in practice.

The past decade has seen improved clinical methodologies for drug trials. Contributing to these advances has been adoption of good clinical practice guidelines, better documentation, guidelines for protection of human subjects, and audit procedures (43). Innovative study designs and more critical statistical methods have helped to diminish the inherent variability of psychopharmacological drug studies. These advancements have served to reduce bias and to avoid type I errors by requiring protocol-defined samples to be analyzed and primary outcome variables to be defined. More uniform adverse event reporting has been implemented to improve safety information.
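The link between predefining primary outcome variables and limiting type I error can be illustrated with a short calculation. The sketch below is not taken from the chapter: it assumes, purely for illustration, that each outcome is tested independently at the same significance level, and the function name is hypothetical. Under that assumption the chance of at least one false-positive finding grows quickly with the number of outcomes tested.

```python
def familywise_error_rate(alpha: float, n_outcomes: int) -> float:
    """Chance of at least one false positive across n_outcomes tests.

    Assumes (for illustration only) that the tests are independent and
    each is run at significance level alpha with no true drug effect.
    """
    return 1.0 - (1.0 - alpha) ** n_outcomes


if __name__ == "__main__":
    # Compare a single predefined primary outcome with several
    # outcomes examined without a predefined primary variable.
    for k in (1, 5, 10):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:2d} outcome(s): {rate:.3f}")
```

With a single predefined primary outcome the rate stays at the nominal level, whereas testing many outcomes and reporting whichever reaches significance inflates it. Real outcome measures are usually correlated, so the independence assumption gives only a rough upper guide.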

Efforts are being pursued to make the clinical evaluation of a psychotropic drug more efficient. The discovery and development of a new agent is a time-consuming and costly process that affects health care costs. More efficient development by truncating the time-consuming phase II and III programs for psychopharmacological agents is needed. The concentration-controlled clinical trial design proposed by Sanathanan and Peck (84) is a suggested approach that requires further study to establish its practicality.

There is recent emphasis on longer term treatment studies to assess the full risk/benefit profile of a drug. This new focus in drug evaluations highlights a need for more methodological research on conducting long-term outcome trials. Sample enrichment and dropout issues raise concerns about the generalizability of results of long-term trials, as discussed earlier in this chapter, and deserve attention.

A growing emphasis on health care research has fostered so-called outcome research and effectiveness studies. A distinction is made between a drug's "efficacy" (i.e., an outcome based on measures of clinical improvement in controlled trials using carefully defined samples) and clinical "effectiveness" (i.e., how well the treatment works overall in typical practice settings). The latter can be variously defined depending on the health services research question, for example, comparative costs of alternative treatments, service utilization, quality of life years, work productivity, and so on. Increasingly, clinical investigators are asked to consider incorporating into drug studies some outcome measures of effectiveness, such as health care costs, functional capacity, or quality of life (6, 88, 92). Some of these evaluations can be obtained as part of, or immediately following, a controlled clinical trial (16). Imposing these additional outcome measures inevitably adds complexity to a clinical study, which poses methodological risks. Health care methodologies are still evolving and require validation as to the best way to quantify overall treatment costs and outcomes. Although imperfect, such studies still provide valuable data about therapeutic choices facing the clinician and are helpful in health care planning. The coming years will see growing emphasis on evaluating the effectiveness and health care costs of new drugs, in addition to establishing efficacy (11; also see Autism and Pervasive Developmental Disorders and Tic Disorders, this volume).

published 2000