Additional related information may be found at: Neuropsychopharmacology: The Fifth Generation of Progress
Strategies for Multimodality Research
Ellen Frank, David J. Kupfer, and Jordan Karp
Department of Psychiatry,
University of Pittsburgh,
Western Psychiatric Institute and Clinic,
Pittsburgh, Pennsylvania 15213-2593.
This chapter addresses treatment outcome research involving both pharmacotherapeutic and psychotherapeutic treatment modalities. We will begin by describing the rationale for such studies and the patient populations to which they might be applicable. We will then address a number of questions relating to the design and implementation of such investigations, including patient selection and the preparation of patients for study participation. We will discuss a number of issues relating to outcome assessment in multimodality research, including the choice of outcome measures, the timing of assessments, and the importance of assessing variables which might be important mediators or moderators of outcome. We discuss magnitude of effects in terms of both statistical and clinical significance. We note the importance of consistency of terminology in describing outcomes and the methodological problems associated with assessing outcome in uncontrolled follow-up. A number of specific design issues are discussed, including types of trials, appropriate "doses" and durations of treatments, strategies for achieving comparability across treatments, and the possible utility of assessing psychosocial variables in pharmacotherapy-only trials. We raise methodologic concerns specific to the study of nonsomatic therapies, including the choice of appropriate controls for psychosocial treatments, assessment of the adequacy of the psychosocial treatment, and standardization of the nonpharmacotherapeutic treatments. Finally, we discuss several ethical considerations relevant to multimodality research.
Surveys of actual clinical practice suggest that the majority of psychiatric practitioners, regardless of their theoretical persuasion, practice some form of combination treatment with the majority of their patients. Today, even the most psychoanalytically oriented physician prescribes medication for a portion of her or his patients, whereas the biologically oriented physician typically dispenses both psychoeducation and supportive psychotherapy along with medication.
There have been a number of reasons for the trend toward combination treatment strategies throughout the last two decades. Following the early enthusiasm for psychopharmacologic agents, pharmacotherapeutically oriented clinicians recognized that medications were less than completely effective for a large percentage of their patients. Furthermore, they discovered that by educating the patient about the disorder under treatment and by providing substantial amounts of psychological support, they could improve the patient's adherence to the prescribed drug regimen while awaiting evidence of a therapeutic response. Psychotherapeutically oriented clinicians, on the other hand, came to recognize that pharmacotherapy did not necessarily lead to negative interactions when used along with psychotherapeutic techniques. Rather than diminishing the patients' capacity to participate in the psychotherapeutic process, particularly in the case of the mood disorders, pharmacotherapy actually facilitated patients' participation in the therapeutic work. Thus, clinicians of both orientations came to suspect that they were most likely to achieve optimal results through a combination of medication and psychosocial intervention.
Rush and Hollon (51) have argued that there are two ways in which the combination of pharmacotherapy and psychotherapy might result in an outcome superior to that achieved by either treatment alone: the "magnitude" model and the "frequency" model. According to the magnitude model, the combination provides a greater degree of symptom relief for most or all patients being treated when compared to either modality offered alone. The frequency model posits that combined treatment leads to an increased likelihood that any given patient will receive the treatment modality to which he or she would have been responsive had that treatment been offered alone. In truth, any benefit observed for the combination of medication and psychotherapy in a randomized controlled trial undoubtedly results from a mixture of "magnitude" and "frequency" effects.
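The distinction between the two models can be made concrete with a toy calculation. The sketch below is illustrative only: the function names, the response probabilities, the symptom-change values, and above all the assumption that response to one modality is independent of response to the other are hypothetical and are not taken from Rush and Hollon.

```python
def frequency_model_response_rate(p_drug: float, p_psychotherapy: float) -> float:
    """Probability that a patient responds to at least one modality.

    Assumption (hypothetical): the two responses are independent, and
    combining the treatments changes neither response probability.
    """
    return 1.0 - (1.0 - p_drug) * (1.0 - p_psychotherapy)


def magnitude_model_mean_change(change_drug: float,
                                change_psychotherapy: float,
                                overlap: float = 0.0) -> float:
    """Mean symptom reduction when both effects accrue to the same patient.

    `overlap` is a hypothetical amount of improvement that both treatments
    would produce and that is therefore counted only once.
    """
    return change_drug + change_psychotherapy - overlap


# Illustrative values only: half of patients respond to each modality alone.
combined_rate = frequency_model_response_rate(0.5, 0.5)

# Illustrative values only: rating-scale points of improvement.
combined_change = magnitude_model_mean_change(8.0, 6.0, overlap=2.0)
```

Under the frequency model the combination raises the proportion of responders without making any individual response larger; under the magnitude model the proportion may be unchanged while each patient improves more.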
The Rush and Hollon arguments apply well to acute treatment strategies. When long-term maintenance strategies are under consideration, the typical argument advanced for the superiority of combination treatment is that pharmacotherapy prevents relapse/recurrence directly through symptomatic prophylaxis whereas psychotherapy prevents return of symptomatology indirectly through improvement in social functioning and capacity for coping with stressful life events. Interestingly, in controlled trials where both pharmacotherapy and psychotherapy have been maximized, it has often been difficult to demonstrate a statistically significant and sustained advantage for combination treatment with respect to relapse/recurrence outcomes (7, 14, 24).
As implied above, combination treatment strategies are appropriate for both short-term treatment aimed at the resolution of acute symptomatology and prophylactic maintenance aimed at the prevention of new episodes of illness. Combination treatment strategies have been applied successfully in patients suffering from schizophrenia (22, 23, 24) and unipolar major depression (7, 14, 15, 58). Studies examining the benefits of combination strategies in bipolar illness and anxiety disorders are ongoing. The eating disorders represent another area where combination treatment might be applicable.
Treatment Responsiveness and Representativeness of Patients
Two sometimes conflicting requirements face the investigator selecting patients for a combination treatment trial. Patients selected should, at minimum, represent a population with some likelihood of responsiveness to either treatment alone and, perhaps, particularly to the combination. Thus, they must represent a relatively homogeneous, well-defined diagnostic group. On the other hand, for the study results to be of practical value, the patient population selected must be sufficiently representative of populations seen in actual practice to allow for generalization. As Elkin et al. (8) have pointed out, such requirements have implications for both the inclusion and exclusion criteria for combination treatment trials. Structured diagnostic interviews such as the Schedule for Affective Disorders and Schizophrenia (53) and the accompanying Research Diagnostic Criteria (54), the Diagnostic Interview Schedule (49), and the Structured Clinical Interview for DSM (55) have markedly enhanced the reliability of psychiatric diagnoses and the interpretability of outcomes from controlled treatment trials. Nonetheless, even diagnostic categories as apparently distinct as bipolar I disorder are plagued with considerable heterogeneity. In their 1988 paper, Elkin et al. (8) argued for the importance of the inclusion of subtypes of patients presumed to be responsive to each type of treatment in any combination treatment trial in depression. They pointed, for example, to the need to include endogenous patients (presumed to be responsive to drug therapy) but not to the exclusion of other subtypes. More recent studies, however, such as those of Thase et al. (56), suggest that the same subtypes are likely to be overrepresented among the responders to both psychotherapy and drugs. Some of our own studies (11, 13, 47) suggest that it is the inclusion of patients with Axis II comorbidity which may be most helpful in demonstrating the efficacy of combination treatments.
In terms of exclusion criteria, many investigators have used the strategy of excluding patients with a previous history of nonresponse to the experimental treatment. In selecting patients for combination treatment studies, however, it might be advantageous to include at least a subset of patients with a history of nonresponse to one or the other treatment alone, stratifying on this variable in order to be certain that equivalent numbers of such patients are included in each treatment condition. Indeed, a failure to find large effects for the combination in comparison to either treatment alone may be a result of the failure in previous trials to include a sufficient number of patients who had previously been nonresponsive to either treatment alone.
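Stratifying on prior nonresponse can be sketched as a stratified, permuted-block allocation. The example below is a minimal illustration, not a description of any cited trial: the arm labels, the stratum labels, the block size (one patient per arm per block), and the function name `stratified_assign` are all assumptions.

```python
import random
from collections import defaultdict

# Hypothetical treatment arms for a three-cell combination trial.
ARMS = ("drug", "psychotherapy", "combination")


def stratified_assign(patients, arms=ARMS, seed=0):
    """Assign patients to arms in shuffled blocks within each stratum.

    patients: list of (patient_id, stratum) tuples in order of enrollment.
    Returns a dict mapping patient_id to arm. Within a stratum, every
    consecutive block of len(arms) patients contains each arm exactly once,
    so arm sizes in a stratum never differ by more than one.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for patient_id, stratum in patients:
        by_stratum[stratum].append(patient_id)

    assignment = {}
    for ids in by_stratum.values():
        block = []
        for patient_id in ids:
            if not block:              # start a new permuted block
                block = list(arms)
                rng.shuffle(block)
            assignment[patient_id] = block.pop()
    return assignment


# Illustrative enrollment: six patients with a history of nonresponse to
# one treatment alone, six without (labels are hypothetical).
patients = [(i, "prior_nonresponse" if i < 6 else "no_prior_nonresponse")
            for i in range(12)]
allocation = stratified_assign(patients, seed=1)
```

With this scheme the previously nonresponsive patients are spread evenly across the treatment conditions, so any advantage for the combination in that subgroup is not confounded with unequal allocation.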
Determination of Who Will Benefit from Combination Therapy
Determination of benefit in any treatment protocol will always be dependent on the choice of the outcome measures. Choosing the outcome measures for combination treatment trials can be especially difficult because those measures must reflect the expected mode of action of each of the treatments individually. For example, in a trial combining cognitive therapy for depression (1) and some form of antidepressant pharmacotherapy, one would probably want to use both (a) the Beck Depression Inventory to assess the more cognitive aspects of depression and (b) a measure such as the Hamilton Depression Rating Scale (21) to assess the more somatic aspects of the depressive syndrome. In addition, one might choose to assess dysfunctional attitudes, the presumed etiologic mechanism according to Beck's theory, more directly through a measure such as the Dysfunctional Attitudes Scale (A. N. Weissman, unpublished dissertation). The problem is how to combine or weight these measures as outcomes and, as the number of measures expands, how to avoid multiplicity. Needless to say, in such a protocol, the weighting or means of combining outcome measures should be determined a priori.
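Two common ways of fixing the analysis plan in advance are to divide the significance level across the outcomes or to collapse them into one composite with prespecified weights. The sketch below shows both under stated assumptions: the equal weights, the standardized change scores, and the function names are hypothetical, and the Bonferroni division is only one of several possible multiplicity adjustments.

```python
def bonferroni_alpha(alpha: float, n_outcomes: int) -> float:
    """Per-outcome significance threshold when n_outcomes are each tested."""
    if n_outcomes < 1:
        raise ValueError("n_outcomes must be at least 1")
    return alpha / n_outcomes


def weighted_composite(z_scores: dict, weights: dict) -> float:
    """Weighted mean of standardized change scores.

    Both dicts must have the same keys; the weights are meant to be fixed
    in the protocol before any outcome data are examined.
    """
    if set(z_scores) != set(weights):
        raise ValueError("z_scores and weights must name the same measures")
    total_weight = sum(weights.values())
    return sum(weights[k] * z_scores[k] for k in z_scores) / total_weight


# Hypothetical a priori weights for a clinician-rated and a self-report scale.
PROTOCOL_WEIGHTS = {"HRSD": 0.5, "BDI": 0.5}

# Hypothetical standardized changes for one patient (negative = improvement).
patient_change = {"HRSD": -1.0, "BDI": -2.0}

per_outcome_alpha = bonferroni_alpha(0.05, n_outcomes=2)
composite = weighted_composite(patient_change, PROTOCOL_WEIGHTS)
```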
The treatment paradigm will also, in some respects, determine who benefits from combination therapy. In their earliest trial, Weissman et al. (58) assigned depressed women who had responded to 4 weeks of amitriptyline alone to the combination of amitriptyline and psychotherapy in comparison to medication alone, psychotherapy alone, psychotherapy with a placebo tablet, placebo alone, or low-contact without a tablet. In this design it is likely that those who benefited from the combination would differ from those who would benefit from the combination in a protocol in which patients were assigned to the various treatments from the outset.
A large number of potential biological and psychosocial predictor variables could also enter into the determination of who will benefit from combination treatment strategies. For example, in the area of schizophrenia research, some studies suggest that the level of expressed emotion in the patient's family may predict benefit from family therapy and/or family psychoeducational intervention in combination with pharmacotherapy (18, 31). Data from the laboratories of Michael Goldstein and David Miklowitz suggest that the same variables may be important in the prediction of relapse in young manic patients (42). Personality variables (21, 42) and recent experience of a stressful life event (29) appear to be related to a slower response to combination treatment in recurrent unipolar patients. It remains an empirical question whether patients with personality pathology or a recent severe life event would have failed to respond altogether had those same investigations included only pharmacotherapy.
In the area of mood disorders, for example, a host of biological predictors has been identified for prediction of response to pharmacotherapy. Electroencephalographic (EEG) sleep parameters (33), failure to suppress cortisol following dexamethasone administration (19), and thyroid function (34) have all been related to outcome in pharmacotherapy trials. Biological predictors have rarely been examined in relation to response to psychosocial interventions for mood disorder patients. Exceptions to this rule are the examination of EEG sleep variables and response to acute cognitive therapy (52) and interpersonal psychotherapy (2, 3), maintenance interpersonal psychotherapy in depression (32), and psychophysiologic testing in treatment of anxiety disorders (4, 39, 40, 41). The investigator who has a strong belief in the predictive capacity of any or several of these variables may wish to carry out pretreatment assessments in order to stratify on these variables.
There are three important aspects to preparing patients for a combination treatment study. The first is explaining the rationale for each of the individual treatment modalities, as well as the rationale for combining them. The second aspect of patient preparation has to do with those maneuvers that the investigator goes through to ensure patient adherence to each of the treatment modalities. Finally, and not unrelated to ensuring adherence, is the investigator's responsibility to address possible patient biases regarding each of the individual treatments.
Explaining the Rationale for Each Modality and the Combination
One of the important advances of modern psychiatric treatment is the general demystification of the treatment process for the patient. Whether modern clinicians choose a pharmacotherapeutic approach, a disorder-specific psychotherapeutic approach, or a combination of the two, they typically spend some part of the early phase of treatment in explaining the rationale for the treatment strategy they have chosen. This can be even more important in the conduct of treatment research where a good understanding on the part of the patient of the theoretical basis of the treatment appears to have positive effects on adherence to the treatment regimen and on remaining in the study. Needless to say, explanations of the rationale and/or mechanism of action of each of the treatments should be provided in a manner consistent with the intelligence and education level of the individual patient. It is also necessary to take into account the patient's clinical state at the time such an explanation is offered. For symptomatic patients whose ability to concentrate and remember may be seriously impaired, it is often helpful to provide both an oral and a written description of the rationale for the treatment, giving the patient the opportunity to read and reread the written description as cognitive functioning improves. It may also be important to repeat the oral description of the treatment rationale at several points in time as the patient's clinical state is improving. The clinician should ask for feedback as to what the patient understands the rationale to be so that any misconceptions can be corrected.
Once the clinician feels reasonably confident that the patient understands the rationale for each treatment individually, she or he can then develop the rationale for combining the two treatments. In many instances this rationale will simply take the form of increasing the chance that the patient will respond (the "frequency" model of Rush and Hollon) (51); however, there may actually be a rationale for the "magnitude" model—that is, that the pharmacotherapeutic intervention is likely to act on one aspect of the disorder while the psychotherapeutic intervention is likely to act on a second.
Maneuvers to Ensure Adherence to Each Type of Treatment
A second important maneuver for ensuring patient compliance is the setting of reasonable expectations for the time course of recovery. As Elkin et al. (8) have pointed out, expectations about the time course of recovery can influence both therapist and patient morale and may have subtle and not-so-subtle effects on the behavior of both. In the treatment of depression, for example, it has been helpful not only to set reasonable expectations for how long either drug therapy or psychotherapy is likely to take, but also to explain that the course of recovery is rarely a linear trajectory and more often saw-toothed in nature. Such an explanation prevents patients from becoming demoralized when a modest improvement is followed by a temporary setback. Another important aspect of expectation-setting is informing the patient that different symptoms will remit on different time courses and that this is also modality-specific. For example, in the treatment of mood and anxiety disorders, the symptoms which remit earliest with pharmacotherapy may be different from the symptoms which remit early in the course of treatment with psychotherapy.
In addition to the maneuvers described above, we have found it helpful to involve family members as "ex officio" members of the treatment team. This can be done by meeting with one family member at a time or through workshops conducted for groups of family members and significant others. In our own work we have found that such educational workshops offered early in the treatment course have led to some of the lowest reported attrition rates of modern treatment research (12, 27). By providing family members and friends with a rationale for each of the treatments, a rationale for the conduct of the investigation (i.e., what the investigator hopes to learn), and reasonable expectations about the course of recovery, it appears possible to make family members important allies of the treatment team. In this role they can support the patient when she or he becomes discouraged, remind the patient to take medication, and watch for changes in the patient's clinical state of which the patient may be unaware but which may be important to continued protocol adherence.
Addressing Patient Biases Regarding Each Treatment Modality
Patients' willingness to consent to a treatment trial in which they might receive either pharmacotherapy, psychotherapy, or the combination does not mean that they have no treatment preferences or biases regarding which treatment is likely to be most effective. Such attitudes and expectations can affect patient behavior (and ultimately treatment outcome) in a variety of important ways. If a treatment facility is known for one form of treatment or another, patients coming to that facility are likely to be seeking that form of treatment and, although they agree to random assignment, are hoping for the form of treatment they originally sought. Patient expectations and biases can affect patient behavior in the treatment itself, response to any self-report evaluations of treatment efficacy, and attrition rates. The most sensible approach to these potential attitudinal confounds is to conduct an explicit assessment of patient expectations and attitudes and analyze outcome results as a function of such attitudes. An investigator might even consider stratification of subjects on the basis of their beliefs and biases regarding the biological versus psychosocial nature of the etiology of their illness and/or their desire for one form of treatment or the other.
An important counterbalance to biases on the part of patients is the neutral or, more properly, equally enthusiastic stance of the treating clinicians with respect to the treatments being provided. In the area of psychotherapy research, there have been endless debates as to whether clinicians should be crossed with treatments or whether clinicians should provide only the treatment at which they are most expert and about which they are most enthusiastic. This debate applies equally to the combination treatment trials in which medication, psychotherapy, and the combination are being provided. Whichever choice an investigator makes, it is essential to ensure that clinicians convey enthusiasm and optimism about each form of treatment they are being asked to provide.
Treatment Combinations Suggest a Spectrum of Outcomes
Outcome Measurement Choices
Kazdin (30) has argued that in the ideal case outcome assessment should be multifaceted, involving the perspectives of (a) the patients under study, (b) their significant others, and (c) the clinician. In addition, assessments should address different facets of the individual and the consequences of the individual's psychopathology. Thus, assessments should be made of subjective distress, overt symptomatology, and impairment in functioning. In the ideal case, assessment will also involve self-report instruments, clinician ratings, and observational measures. Finally, as Waskow and Parloff (57) pointed out, a core assessment battery, particularly in a combination or multimodality treatment study, should be applicable to outcomes viewed from a variety of theoretical orientations.
Having argued for such a battery, one must acknowledge that the analysis of such a large and diversified assessment battery presents both practical and philosophical problems. It is incumbent on the investigator using such a battery to specify at the outset how such data will be examined. Does the investigator, for example, have a specific hypothesis that different treatments will affect different domains (i.e., distress versus symptoms versus functioning)? Does the investigator hypothesize that only one of the modalities, perhaps the combination, will lead to effects observable by significant others? While combination treatment studies, in particular, appear to require such multifaceted assessment, unless a clearly defined set of hypotheses is stated a priori, the investigator using such a battery runs the risk of having different analyses lead to different conclusions about the relative efficacy of the treatments under study.
A final issue in the area of measurement choices has to do with the tendency in the field of treatment research to continue to use old, established measures despite clear evidence that they are not particularly sensitive to the outcomes of interest. A prominent example of this is the continued use of the Hamilton Rating Scale for Depression (HRSD) (21). The HRSD was originally designed for the assessment of change in inpatient populations with melancholic unipolar depression. Wanting to be in a position to compare their results to those of previous studies, investigators have continued to use this measure despite considerable evidence that it is relatively insensitive to change in milder, outpatient nonmelancholic depressions and particularly inadequate in the substantial proportion of unipolar outpatients with reverse vegetative signs.
Timing of Assessments
Another critical decision in the development of an assessment protocol for all treatment outcome research is that of the timing of assessments. Particularly in combination or multimodality research where the presumed effects of the different treatments are also presumed to operate on differing time courses, the more frequently the core assessment battery can be performed, the more likely it is that such differential effects can be observed. The additional value of multiple assessment points is that a much better understanding of the effects of treatment on those patients who ultimately drop out or are terminated from the study can be obtained. This is especially true when survival analytic or random effects models can be applied to the data. Under ideal circumstances, short-term treatment studies should include weekly or biweekly assessments. Longer-term studies, as well as long-term follow-up of acute treatment effects, should obtain at least some portion of the assessment battery on a monthly or bimonthly basis.
End-of-treatment and follow-up evaluations provide a particular dilemma in studies in which one of the modalities under investigation is short-term pharmacotherapy. While Elkin et al. (8) argued in 1988 that one might consider end-of-treatment evaluation of such subjects both prior to and following drug taper, today we would probably consider any evaluation following the tapering of pharmacotherapy to be a follow-up rather than end-of-treatment evaluation. Follow-up evaluations are discussed more fully below.
Other Measures to Be Considered
As implied above in the discussion of patient preparation for a treatment protocol, a number of variables which may mediate or moderate treatment outcome should be assessed as well as outcome per se. As noted above, patient expectations and biases regarding the likely efficacy of biological versus psychosocial versus combination treatments fall into this category. In our own work, measures of treatment specificity have proven to be particularly useful in the interpretation of psychotherapy and combination treatment outcome (15). Particularly important in interpreting "placebo" effects in studies crossing psychotherapy versus no psychotherapy with drug versus placebo would be measures of treatment "process" in the non-psychotherapeutic condition. For example, had data from the Treatment of Depression Collaborative Research Program (9) been analyzed in this way, it might be possible to make a more sensible interpretation of the equivalence of the clinical management with placebo condition in the less severely depressed subjects.
One aspect of measurement which is almost universally lacking in treatment outcome studies is an assessment of the costs versus benefits of the interventions as seen from the perspective of health economists. Kazdin (30) pointed out that a better understanding of such variables as the cost of providing the treatment, the cost of training clinicians to carry out the treatment, and the requirements for ongoing supervision of clinicians would complement more traditional measures of efficacy and facilitate a more complete assessment of the actual costs and benefits of each of the treatments under study. Health economists tend to be interested in other "costs" of treatment, such as the time required for patient participation, the side-effect and nuisance profile of each of the treatments under consideration, and the evaluation of extreme outcomes such as suicide. Under such an analysis, treatments that were less effective from the symptom reduction or probability of remission standpoint but were more effective from the suicide prevention standpoint might be judged more effective overall.
Magnitude of Effect/Response
Assessment of Statistical Significance
The conventional approach to the assessment of treatment outcome has been to examine the statistical significance of pre- to post-treatment change and/or of differences between the treatment groups. Typically, in studies of this nature, the outcome of primary interest has been the end-of-treatment score on a single measure or a variety of measures. This outcome evaluation is then compared to baseline in a paired t test within a group of subjects or between groups using a t test or analysis of variance, sometimes adjusting for baseline scores. There are numerous problems with this method of examining treatment effects (none of which are specific to multimodality research). The most prominent of these problems have to do with (a) how dropouts and early terminators should be handled in such an analysis and (b) how much confidence one should have in any value or combination of values taken from a single assessment.
With respect to the problem of handling dropouts and early terminators, one approach has been to analyze data by the principle of intention-to-treat, carrying forward the last observation available on any subject who drops out or is terminated prematurely. All too often the observation being carried forward is the baseline, pretreatment evaluation of the subject and, thus, really tells us nothing about the efficacy of the treatment for that particular subject. A second common approach has been to analyze only those subjects who complete the full course or a specified minimum proportion of the course of treatment. Finally, some investigators (26) have chosen to present analyses based on all three of these approaches. Such a presentation may leave the reader confused as to what lesson is, in fact, to be drawn from the investigation, particularly if the three sets of analyses lead to somewhat different conclusions.
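The weakness of carrying the last observation forward is easy to see in a small example. The sketch below assumes weekly ratings stored as a list with `None` for visits missed after dropout; the scores and function names are hypothetical.

```python
def locf_endpoint(scores):
    """Return the last observed rating, ignoring missing (None) visits."""
    observed = [s for s in scores if s is not None]
    if not observed:
        raise ValueError("subject has no observations")
    return observed[-1]


def locf_change(scores):
    """Endpoint minus baseline, with the endpoint carried forward.

    Assumes scores[0] is the baseline rating and is always present.
    """
    return locf_endpoint(scores) - scores[0]


# Hypothetical weekly ratings on a depression scale (lower is better).
completer = [24, 18, 12, 8]
early_dropout = [24, None, None, None]   # left the study after baseline

completer_change = locf_change(completer)
dropout_change = locf_change(early_dropout)
```

For the subject who drops out immediately after baseline, the carried-forward "endpoint" is the baseline itself, so the computed change is zero whatever the treatment might have done.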
A relatively new, but much more satisfying, approach to the analysis of such data involves the use of random effects models such as those proposed by Gibbons et al. (17). By analyzing treatment response trajectories averaged across subjects within a treatment for as long as any given subject is under treatment, these models allow the investigator to incorporate all of the data obtained from a given subject for as long as the subject is continued in the protocol. Random effects models thus provide for a more rational way of handling the problems associated with dropouts and early terminators. This can be particularly important in multimodality research where subjects in different treatment groups are likely to drop out or be terminated at different rates and for different reasons. The use of a random effects or random regression analysis approach allows the investigator to model those differences.
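A generic random regression model of this kind can be written as follows. This is a textbook form offered as a sketch; the notation is an assumption and is not reproduced from Gibbons et al. Here $y_{ij}$ is the rating for subject $i$ at assessment $j$, $t_{ij}$ is the time of that assessment, and $g_i$ is a treatment indicator.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3 \, g_i t_{ij}
         + u_{0i} + u_{1i} t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}
  \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The fixed effect $\beta_3$ carries the treatment difference in rate of change, while the subject-specific intercept $u_{0i}$ and slope $u_{1i}$ let each subject contribute however many assessments were completed before dropout or termination.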
Assessment of Clinical Significance
While generally thought of as another approach to the assessment of statistical significance, methods involving life-table or survival analytic techniques might be conceptualized as falling somewhere between the analysis of clinical and statistical significance. When such models are used, the investigator is required to make a single, categorical determination with respect to the outcome of interest. In the case of a short-term treatment study, this outcome would likely be response or clinical remission. In prophylactic maintenance studies the outcome of interest is likely to be relapse, recurrence, or rehospitalization. In any case, a clinically based determination as to whether a subject meets criteria for this categorical outcome represents the outcome of interest. To the extent that this determination is made on the basis of clinical judgment, the statistical analysis of such outcomes through survival analytic methods can be thought of as having substantial clinical significance as well.
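The product-limit (Kaplan-Meier) estimate is one standard life-table technique for such categorical outcomes. The sketch below is a minimal pure-Python version under stated assumptions: the follow-up times, the choice of recurrence as the event, and the function name are hypothetical, and subjects censored at the same time as an event are counted as still at risk at that time.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the proportion remaining event-free.

    durations: time (e.g. weeks) to the event or to censoring.
    events: True if the categorical outcome (e.g. recurrence) was observed,
            False if the subject was censored (dropout, end of study).
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical maintenance-phase data for five subjects.
weeks = [4, 8, 8, 12, 20]
recurred = [True, True, False, True, False]
curve = kaplan_meier(weeks, recurred)
```

Each `True` entry corresponds to the single clinical judgment described above, which is why the resulting curve reflects clinical as well as statistical significance.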
The more common meaning of the term clinical significance grows out of the work of several investigators (26, 28, 60). These investigators have sought to define ways of (a) measuring and analyzing the extent to which a treatment results in a return to functioning comparable to that of some normative sample, (b) examining the magnitude of change, and/or (c) determining the degree to which change is apparent to nonclinical observers, such as family members or friends. While some of these methods have been well worked out from a statistical standpoint, they have not gained wide acceptance. Few papers, other than those written by the developers of these methods themselves, have presented analyses of clinical significance.
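One widely used formulation of points (a) and (b) is the reliable change index and normative cutoff associated with Jacobson and Truax; whether it matches the specific methods cited above is an assumption. The sketch below uses hypothetical scale statistics and function names.

```python
import math


def reliable_change_index(pre, post, sd_baseline, reliability):
    """Change score divided by the standard error of a difference score.

    sd_baseline: standard deviation of the measure at baseline.
    reliability: test-retest reliability of the measure (0 to 1).
    An absolute value above 1.96 is conventionally read as change larger
    than measurement error alone would be expected to produce.
    """
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / s_diff


def normative_cutoff(mean_clinical, sd_clinical, mean_normative, sd_normative):
    """Score at which a subject is equally likely to belong to the clinical
    and the normative distribution (weighted midpoint of the two means)."""
    return ((sd_clinical * mean_normative + sd_normative * mean_clinical)
            / (sd_clinical + sd_normative))


# Hypothetical values for a symptom scale on which lower scores are better.
rci = reliable_change_index(pre=24, post=10, sd_baseline=5.0, reliability=0.84)
cutoff = normative_cutoff(mean_clinical=24.0, sd_clinical=5.0,
                          mean_normative=4.0, sd_normative=3.0)
returned_to_normative_range = 10 < cutoff
```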
There is also a more informal way in which clinical significance is discussed among researchers and clinicians familiar with a given outcome measure. For example, such individuals might conclude that despite the fact that an investigator achieved a statistically significant difference between two large groups of subjects on the Brief Psychiatric Rating Scale (BPRS) (45), the difference is not clinically meaningful in terms of the size of the change achieved. They would therefore conclude that the result is statistically but not clinically significant.
Distinguishing Response/Remission/Recovery and Relapse/Recurrence
Prien et al. (48) have demonstrated the extent to which the mood disorders outcome literature is riddled with inconsistent use of terms describing outcome. One investigator's "response" is another's "remission." We suspect that a similar situation exists in the outcome literature for other major psychiatric disorders.
As an attempt at an antidote to this problem, a task force of the MacArthur Foundation Network I on the Psychobiology of Affective Disorders offered conceptual definitions differentiating among these terms in order to reduce the amount of confusion in the mood disorders literature (16). Similar efforts have been made or are underway with respect to several other diagnostic categories including substance abuse and anxiety disorders. At the very least, within a single report or series of reports from a single study, investigators should attempt to make clear how such terms are being used and to use them in a consistent fashion.
Problems Associated with Follow-Up in Acute Treatment Studies
It has recently become apparent how problematic the interpretation of follow-up evaluations of treatment responders in a randomized trial can be. Several years ago, Elkin et al. (8) argued that such evaluations may reveal delayed effects of one treatment or the other. More recently, however, a number of methodologists including Lavori (unpublished paper, presented 1992) have argued that any follow-up evaluation is of limited interpretability. Because only responders to the various modalities are included in the follow-up assessments, the various groups can no longer be considered to have been randomly sampled from the same population and thus cannot be compared statistically. An important area of future work for multimodality research will be to define acceptable methods for the examination of uncontrolled, naturalistic follow-up data. If such methods cannot be developed, one must question whether the time and expense involved in such evaluations are warranted.
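The loss of comparability among responder groups can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any actual trial: each patient is given a hypothetical "prognosis" score, treatment A is assumed to raise the chance of response more than treatment B, and all names and parameter values are invented for illustration.

```python
import random
from statistics import mean

def simulate_follow_up(n_per_arm=20000, seed=0):
    rng = random.Random(seed)
    summary = {}
    for arm, treatment_boost in (("A", 1.0), ("B", 0.0)):
        # Randomization: both arms are drawn from the same prognosis distribution.
        prognosis = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        # Response depends on prognosis, the treatment received, and noise.
        responders = [p for p in prognosis
                      if p + treatment_boost + rng.gauss(0.0, 1.0) > 0.5]
        summary[arm] = {
            "all_randomized": mean(prognosis),
            "responders_only": mean(responders),
            "response_rate": len(responders) / n_per_arm,
        }
    return summary

result = simulate_follow_up()
for arm, stats in result.items():
    print(arm, {key: round(value, 2) for key, value in stats.items()})
```

In this simulation the two arms have essentially the same mean prognosis at randomization, but the responders to the less potent treatment have a better mean prognosis than the responders to the more potent one. Any difference between the responder groups at follow-up is therefore confounded with who responded in the first place.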
This section on issues relating to study design will address the types of trials the multimodality researcher may consider conducting, what the optimal design for such trials might be, the appropriate dose and duration of the various types of interventions, the problems associated with equating the various interventions, and, finally, the value of examining psychosocial outcomes in studies involving only pharmacotherapy.
Types of Trials
At the most basic level, treatment trials sort themselves into two varieties: (i) those that assess the extent to which a treatment or treatments improve patients' clinical condition and/or bring about a full remission of symptoms and (ii) those that assess the extent to which treatments prevent a return of symptoms. Studies of the first variety, often referred to as short-term or acute treatment studies in the psychopharmacology literature, have generally taken the form of a randomized controlled comparison of one or more active treatments with a placebo tablet in acutely ill patients. Typically carried out primarily for the purpose of obtaining FDA approval for the use of a compound for a particular indication, these trials have often been very brief (a few weeks in duration) and have simply explored the question of whether there is a difference between the active treatment(s) and the control condition on some rating scale. Moving this paradigm to multimodality research has changed the nature of the research enterprise, because psychotherapy generally operates on a much more extended time course. While the pharmacotherapy-only trials tended to look simply for response to the treatment (i.e., a difference between the active and control condition), multimodality trials involving a psychotherapy comparison have typically defined the minimum or ideal length of the psychotherapeutic intervention and then compared the drug, the psychotherapy, and, in some cases, the combination over the time course needed to complete the short-term psychotherapeutic intervention (37, 50, 59).
Particularly in the areas of mood and substance abuse disorders, there has been increasing recognition of the importance of relapse following initial treatment success. This has led to the emergence of the concept of continuation treatment, that is, interventions aimed at the prevention of relapse once a remission of symptoms has been achieved. While various parameters of continuation treatment, including its appropriate duration for unipolar depression (20), have been described, to our knowledge we have yet to see a trial in which patients brought to remission by various controlled or uncontrolled routes are randomly assigned to a set of continuation treatment strategies. Although the term "relapse" has often been applied to severe symptomatic exacerbations in patients suffering from schizophrenia, the duration of most such trials (i.e., several years) makes them more like maintenance treatment studies.
The concept of maintenance treatment probably first emerged in the mood disorder area with the early study of Weissman et al. (58), which they characterized as a "maintenance" trial. Because this followed on only 4 weeks of amitriptyline treatment, however, in current parlance it would be considered a continuation treatment study. As implied above, the true maintenance studies have focused on years rather than months of prophylaxis and, as will be described below, have led to some very substantial controversies with respect to appropriate design.
Optimal Design
The issue of primary concern in multimodality research is which and how many treatment conditions ought to be considered. Issues of secondary concern relate to blocking or stratifying variables. In the area of continuation and maintenance treatment studies, the investigator not only must make a determination as to which treatment conditions are to be evaluated, but also must determine at which point in the course of treatment the experiment is to begin.
The naive (or, at least, simple) answer to the question of the ideal design for multimodality research is the two-by-two factorial in which active pharmacotherapy and placebo are crossed with active psychotherapy and some form of psychotherapy control condition. From a pure design standpoint, this is clearly the ideal paradigm. Unfortunately, however, the treatment conditions created in such a design (especially the no-active drug/no-active psychotherapy condition and the psychotherapy with placebo tablet condition) may not actually be applicable to several clinical conditions or generalize to real clinical practice. For example, the placebo tablet/placebo psychotherapy condition would probably rarely be applied to patients with psychotic disorders, including schizophrenia and bipolar I disorder. With respect to the representativeness of design conditions vis-à-vis clinical practice, Hollon and Beck (25) and many other psychotherapy researchers have argued that psychotherapy with a placebo tablet is not representative of psychotherapy in actual clinical practice, where a placebo tablet would never be given.
Is the Six-Cell Design the Solution?
This has led Hollon and Beck (25) to propose a six-cell design in which psychotherapy and a psychotherapy-control condition are crossed with active medication, placebo tablet, and no pill. While from a purely objective standpoint this could be considered an "ideal" design, it raises at least as many practical problems as the two-by-two factorial. For example, is the no-pill/no-active psychosocial treatment condition a viable option? For some conditions (e.g., depression following bereavement) it might be possible to convince subjects that a regularly scheduled visit intended simply to check on the progress of their recovery is a reasonable treatment option. However, it is difficult to imagine how one would present the rationale for such a condition to, for example, an individual with severe, incapacitating panic disorder.
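To make the structure of the two balanced designs explicit, the following sketch simply enumerates their cells. The condition labels are shorthand chosen for this illustration rather than terms taken from Hollon and Beck.

```python
from itertools import product

psychotherapy_arms = ["active psychotherapy", "psychotherapy control"]

# Two-by-two factorial: psychotherapy arms crossed with drug versus placebo.
two_by_two = list(product(psychotherapy_arms,
                          ["active medication", "placebo tablet"]))

# Six-cell design: the same psychotherapy arms crossed with three pill conditions.
six_cell = list(product(psychotherapy_arms,
                        ["active medication", "placebo tablet", "no pill"]))

for name, cells in (("two-by-two", two_by_two), ("six-cell", six_cell)):
    print(f"{name} design: {len(cells)} cells")
    for psychotherapy, pill in cells:
        print(f"  {psychotherapy} + {pill}")
```

The listing makes it easy to see that the six-cell design adds two no-pill cells, one of which is the no-pill/no-active psychosocial treatment condition whose viability is questioned above.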
Recognition of the difficulties associated with the perfectly balanced factorial designs leads to the recommendation of a variety of unbalanced designs for the conduct of multimodality research. For example, in our own work we have carried out a five-cell comparison of psychotherapy and medication in which the conditions were (a) maintenance interpersonal psychotherapy (IPT-M) and active medication, (b) IPT-M and placebo tablet, (c) IPT-M alone, (d) medication clinic visits and active imipramine, and (e) medication clinic visits and placebo tablet. Post-hoc analyses revealed absolutely no differences between the IPT-M alone and the IPT-M-plus-placebo condition for any of the outcome or outcome-mediating variables (10). In light of the amount of time required to recruit the additional 20% of subjects needed for this fifth condition and in light of our difficulty in analyzing a variety of interaction effects because of the reduced power resulting from the smaller cell sizes, we concluded in retrospect that it would have been preferable to limit the study to four treatment conditions.
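The cost of the fifth cell can be sketched with simple arithmetic. The example below assumes equal allocation across cells, a fixed total sample, and a common standard deviation; the total sample size and the function names are hypothetical and are not taken from the study described above.

```python
import math

def per_cell_n(total_n, n_cells):
    """Subjects per cell under equal allocation."""
    return total_n / n_cells

def se_of_cell_difference(sd, n_per_cell):
    """Standard error of the difference between two cell means."""
    return sd * math.sqrt(2.0 / n_per_cell)

total_n, sd = 200, 1.0            # hypothetical total sample and SD
share_of_fifth_cell = 1 / 5       # fraction of all subjects in any one of five cells

se_four = se_of_cell_difference(sd, per_cell_n(total_n, 4))
se_five = se_of_cell_difference(sd, per_cell_n(total_n, 5))
inflation = se_five / se_four     # equals sqrt(5/4) under these assumptions

print(f"fifth cell share: {share_of_fifth_cell:.0%}")
print(f"standard error inflation for pairwise contrasts: {inflation:.3f}")
```

Under these assumptions, spreading a fixed sample over five cells rather than four widens the standard error of every pairwise contrast by roughly 12%, which is one way of seeing how the extra cell erodes the power available for interaction analyses.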
In the end the investigator must balance scientific purity, interpretability, and generalizability. Certain designs may be less than ideal from a scientific standpoint. The information yield may, nonetheless, be high relative to the cost of mounting the trial. In contrast, other designs with perfect scientific credibility may be so costly as to render them useless in an environment of shrinking resources for clinical research.
A discussion of design decisions would not be complete without addressing the problem of statistical power in multimodality treatment research. Studies examining multiple treatment modalities are almost always underpowered when one considers anything beyond the primary outcome measures. The presumed moderators and mediators of treatment outcome invariably differ for the pharmacotherapy and psychotherapy conditions, leading to a profusion of assessment measures. While the investigator typically has an appropriate rationale for each measure, there is rarely sufficient power to examine the relationship of these measures to outcome.
The most common solution to the problem of inadequate power has been the multicenter trial. However, multicenter trials raise serious problems of their own. Achieving consistency among treatment providers and raters is difficult enough in a single-site investigation, but it represents a major challenge in multicenter trials. How investigators and funding agencies should make a determination as to whether the likely additional yield from a multicenter trial outweighs the additional complexity and expense is not clear. The development of criteria for decision-making in this arena could be a useful addition to the field.
Appropriate Doses and Durations of Interventions in Trials Involving Both Pharmacotherapy and Psychotherapy
One of the many decisions facing the investigator, once the specific treatments to be investigated have been decided upon, is what the duration of those treatments should be. As Elkin et al. (8) pointed out, the appropriate length of most of the disorder-specific, manualized psychotherapies is reasonably well defined. In contrast, there is still considerable controversy over what the ideal length of any given pharmacotherapeutic intervention should be. Equally, there are disputes as to the appropriate "dose" of each intervention. For most of the disorder-specific psychotherapies the modal frequency of contact appears to be once a week. However, there are circumstances under which treatment might be more or less frequent. In the pharmacotherapy domain, the investigator must consider whether to employ a titrated dose or a fixed dose or whether to key the intervention to a targeted blood level of the compound.
For studies of long-term prophylaxis, the duration of treatment may be more easily defined by the expected time to relapse/recurrence in the study population. However, the question of dose remains problematic for both the pharmacotherapeutic and the psychotherapeutic intervention. Should the medication be maintained at a constant dose comparable to that used in acute therapy or is a reduced, "maintenance" dose (or blood level) more appropriate to the study population in question? How frequently should the psychosocial intervention be offered: as often as practical for subject retention or as often as seems likely to be required for prophylaxis?
Equally perplexing in the design of combination trials is the issue of the equivalence of the interventions, both in terms of the duration of individual sessions and in terms of the amount and source of attention given to individuals in each treatment condition. A recurrent theme in the critiques of multimodality treatment research is the time differential between the active psychosocial intervention and the psychosocial control condition. For example, when a psychotherapy is being contrasted with a clinical management/medication clinic condition, the investigator must decide whether the control condition sessions will be of a duration equal to that of the active psychotherapy. Choosing to make them equivalent removes the confound of time differential but may introduce the very practical problem of how the clinician providing the control condition can fill the time of the session without providing something that (a) is not representative of a real-world pharmacotherapeutic clinical management and (b) is not contaminated with the aspects of either the active psychosocial treatment under study or some other active psychosocial intervention. In the end, in most circumstances we would come down on the side of recommending the more ecologically valid choice; that is, each modality should be representative of how it is likely to be provided in actual clinical practice. Because the ultimate goal of all clinical research is to provide information that is generalizable to the real world of treatment provision, the more representative the modality, the better.
Achieving Comparability Among Psychosocial, Medication, and Combination Treatments
A good argument can be made that the goal in multimodality research should be achieving comparably good operationalization of each of the modalities rather than achieving comparability across modalities. When the operationalization of each of the individual treatments is maximized—that is, when pharmacotherapy, psychotherapy, and the combination are each being provided by well-trained, well-monitored enthusiastic clinicians—the fairest test of the treatments in comparison to one another can occur. In order to do this, the investigator must specify each of the interventions with equal clarity. The interpersonal interaction through which pharmacotherapy is to be provided must be defined as clearly as the manner in which the psychotherapy is to be provided. Clinicians carrying out these treatments (and the combination) must be trained to equivalent levels of competence, and ongoing monitoring (preferably in the form of objective rating of audio or video tapes of treatment sessions) should be employed to verify the continued adherence of the clinicians in each modality.
Multimodality treatment studies invariably raise the issue of what the specific duties of physician and nonphysician clinicians will be during the trial. Will both physicians and nonphysicians provide the psychosocial treatment? If so, does the investigator need to have sufficient power to examine clinician discipline as a variable in the analysis? The more common approach is probably to have physicians provide pharmacotherapy, while nonphysicians provide the psychotherapy. Does this then mean that those individuals assigned to a combined treatment condition will have two clinicians rather than one? And how, then, should any superior efficacy of the combination be interpreted: as a function of the intervention or of the additional clinician's presence?
One solution which undoubtedly adds to the expense and complexity of the trial but provides for a certain amount of equivalence across conditions is to assign each subject a physician and a nonphysician clinician. The nonphysician might then be given the role of primary clinician, and the physician might be given the role of consulting psychiatrist. The nonphysician has responsibility for the provision of the psychosocial treatment or its contrast condition, and the physician has the responsibility for prescribing in those treatment conditions where medication or pill placebo is being provided and for monitoring the symptoms and side effects being experienced by the subject even in those treatment conditions where no pill is offered. Depending upon the disorder under investigation, such a design may have considerable ecological validity. For example, in the long-term management of schizophrenia or bipolar disorder, it is quite likely that most of the treatment will be provided by a nonphysician, with the physician seeing the patient less frequently and primarily for the purpose of adjusting medication. In the treatment of panic disorder, however, a purely behavioral intervention might be offered in the absence of any pharmacotherapy. In the "real world" there would be no role for the physician in the provision of this treatment, making the physician's participation in the subject's care questionable in terms of generalizability. In designing a study where psychosocial treatment of panic disorder is compared with pharmacotherapy, the investigator must decide whether to opt for balance across conditions versus ecological validity in giving the physician a consulting role in the psychosocial intervention.
A related issue is whether, setting aside discipline as a variable, all clinicians should be crossed with treatments or whether clinicians should provide only those interventions in which they have the greatest expertise and for which they have the highest level of enthusiasm. For example, in a study where family psychoeducation is compared with an individual medication management approach, one might choose to have very different types of clinicians carrying out the two treatments because this is more consistent with the relative expertise and enthusiasms of, for example, psychiatric social workers and psychiatrists.
Assessment of Psychosocial Outcomes in Pharmacotherapy-Only Studies
It is altogether possible that one might wish to examine a variety of psychosocial parameters in a study in which only pharmacotherapy is being provided. These parameters might occur in two domains: (i) in the patient–clinician interaction itself and (ii) in the patient's life outside of the treatment setting. Even with very severe illnesses such as schizophrenia and psychotic mania, it would appear that the nature of the patient–clinician interaction can have an impact on outcome. Thus, we would argue that even in pharmacotherapy-only studies it is critical that the investigator specify the nature of the clinician–patient interaction and monitor this interaction throughout the trial in order to investigate adherence to the specifications of the interaction and the extent to which adherence or other "process" parameters affect outcome. "Clinician" is almost never inspected as a variable in a single-site drug–placebo study. If it were, interesting differences might emerge. In many multicenter medication trials (D. Stangl, unpublished dissertation) the center differences are actually more striking than the differences between treatment conditions. This is almost certainly attributable to uncontrolled differences in the nature of the clinician–patient interactions at various sites. Had the nature of that interaction been more carefully specified and monitored throughout the trial, we would at least be able to determine whether the center differences are attributable to unmeasured differences in the patient populations recruited at the various sites or, more probably, to unmeasured differences in clinician behavior.
A related problem is how one makes the determination as to whether a supportive medication management intervention has somehow crossed into the realm of psychotherapy. A frequent observation in recent years has been the steady decline in the drug–placebo difference observed in pharmacotherapy trials (C. M. Beasley, Jr., unpublished paper, 1991). This has been particularly true in outpatient treatment of mood and anxiety disorders. One hypothesis is that this decline is attributable to the increasing sophistication with which pharmacotherapeutically oriented clinicians are providing psychological support along with medication management. This raises the question of when such support, in fact, becomes supportive psychotherapy. In their search for adequate and ethical control conditions, psychotherapy researchers have noted the vagueness of this boundary; however, little attention has been paid to this issue in the pharmacotherapy-alone research enterprise.
The measurement of change in the psychosocial realm has rarely been a feature of pharmacotherapy-only trials. The possible exceptions are those studies which have included a "quality of life" measure. Occasionally, assessment is made of more specific aspects of social functioning or social adjustment. There are particular areas of pharmacotherapy research where such assessment seems especially relevant. For example, it would appear that clozapine not only provides a level of symptomatic relief not seen with other antipsychotic medication but also leads to improvements in social adjustment that are unlike those observed with other treatments for schizophrenia (6, 35, 38). Prior to clozapine there was probably little reason to think that one agent would lead to psychosocial outcomes that are markedly different from those observed with another. With the advent of the new antipsychotics and the selective serotonin reuptake inhibitor antidepressants, much more sophisticated assessment of various aspects of social and interpersonal functioning would seem warranted.
METHODOLOGICAL CONSIDERATIONS SPECIFIC TO THE STUDY OF NONSOMATIC THERAPIES
Establishment of Appropriate Controls for the Psychotherapy/Psychosocial Treatment
As implied above, in recent years the psychotherapy research community has been particularly concerned about the development of appropriate controls for active psychotherapeutic conditions. Up to now most contrast or control conditions for active psychosocial interventions have been inadequate in one way or another. This concern applies equally to multimodality research. As Parloff (46) pointed out in his classic paper on psychotherapy "placebos," the term placebo historically has referred to inactive medications prescribed primarily for the purpose of placating or soothing the patient rather than directly treating any real disorder. In trying to conceptualize a psychotherapy placebo we run directly into the traditional medical distinction between core somatic pathology and psychological symptoms which might be associated with the experience of that pathology. In psychiatry, whether we are treating with medication, with psychotherapy, or with the combination, we are interested in both the somatic pathology and the psychological symptoms. Therefore, what constitutes a "placebo effect" in physical medicine may be synonymous with a treatment goal in psychiatry. The reduction of anxiety is a clear example of this. Up to the present time there have been approximately a half-dozen types of controls used in psychotherapy research, only some of which apply to the multimodality treatment research enterprise. These have included: (a) dropout waiting-list controls, (b) attention controls (the nonspecific support, medication clinic, clinical management paradigm would fit in here), (c) alternative treatment controls, (d) crossover controls, (e) the mirror image or "patient as own control" paradigm, and (f) dismantling the treatment package. With respect to multimodality research, the most applicable of these paradigms are the attention controls, alternative treatment controls, and, in the case of highly chronic disorders, the mirror image design. 
While a dismantling paradigm has many attractive features in a psychotherapy-alone study, the level of complexity involved in a study involving both medication and psychotherapy which at the same time attempts to dismantle the psychotherapeutic treatment package seems to preclude the actual conduct of such a trial.
Multimodality investigators do generally have the advantage of being able to contrast the psychotherapeutic condition or the combined treatment condition with an attention control (clinical or medication management). While this has many of the features of an adequate control condition, as noted above, there are design problems which must be addressed in terms of "equating" this condition with the psychotherapy condition both with respect to the amount of time spent in the treatment session and with respect to the discipline of the clinician providing the control intervention.
Measuring the "Blood Level" or "Take" of the Psychosocial Treatment
It has been typical of well-designed pharmacotherapy trials to examine the blood level of the compound under investigation both as a measure of patient compliance and as a measure of the potential for therapeutic effect. Recently, psychotherapy researchers have begun to address similar issues.
Luborsky et al. (36) investigated the relationship of treatment quality to efficacy of drug counseling, cognitive-behavioral psychotherapy, and supportive–expressive psychotherapy with methadone-maintained, opioid-dependent patients. These investigators found that there was a positive relationship between the presence of supportive–expressive elements and 7-month outcome even among patients who were assigned to the other two conditions. However, the strongest relationship to outcome was for their measure of "purity" (the ratio of the intended therapy rating to the total of all ratings). This was especially true for the two psychotherapy conditions. They noted that this finding can be interpreted as follows: The more therapists did what they were "supposed to do," the better the outcome, or the better the outcome, the more therapists did what they were supposed to do.
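The "purity" measure described above is a simple ratio and can be sketched as follows. The rating labels and values are hypothetical and are not data from Luborsky et al.; the function name is likewise an assumption made for the illustration.

```python
def purity(ratings, intended):
    """Ratio of the intended-therapy rating to the total of all ratings."""
    total = sum(ratings.values())
    if total == 0:
        raise ValueError("ratings sum to zero; purity is undefined")
    return ratings[intended] / total

# Hypothetical ratings of one session from a supportive-expressive case.
session_ratings = {
    "supportive_expressive": 6.0,
    "cognitive_behavioral": 2.0,
    "drug_counseling": 2.0,
}
session_purity = purity(session_ratings, "supportive_expressive")
print(f"purity = {session_purity:.2f}")
```

A session in which the therapist delivered only elements of the intended therapy would have a purity of 1.0 on this measure, and the value falls as elements of the other treatments enter the session.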
O'Malley et al. (44) examined the relationship of therapist competence to efficacy of IPT in patients with unipolar depression. These investigators used multiple regression analyses to predict outcome and found that supervisors' ratings of the fourth treatment session, which took into account quality of problem-oriented strategies, quality of techniques, general IPT skills, and overall session quality, made a significant contribution to explained variance in patient-rated change at termination beyond the contribution of pretreatment patient characteristics. While competence ratings did not add to the explained variance (after pretreatment characteristics) in total Hamilton Depression Rating Scale (HDRS) (21) scores at termination, they did contribute significantly to the termination apathy factor of the HDRS. Thus, they found that higher treatment specificity was related to greater improvement on at least some measures. In this study, treatment purity was not examined directly.
More recently, Crits-Christoph et al. (5) examined the extent to which accuracy of interpretations, errors in technique, and "positive-helping alliance" scores were related to treatment outcomes of residual gain and rated benefits among patients in individual psychodynamic psychotherapy. Accuracy of interpretation and positive-helping alliance were both significantly related to outcome, with one measure of accuracy ("wish plus response from other") being the best predictor of outcome. Although the results did not lend themselves quite as well as the findings from the earlier study of Luborsky et al. (36) to a conceptualization in terms of specificity versus purity, it would appear that, in this instance, it was specificity of the therapy that was more strongly related to outcome. For individual studies, then, it remains an empiric question whether, if outcome is found to be related to treatment quality, it is specificity or purity that is the better predictor of outcome.
Finally, our own research group has reported that in a long-term trial examining the prophylactic capacity of IPT-M, patients whose monthly therapy sessions were rated above the median on specificity of IPT had a median survival time of almost 2 years, while those below the median on IPT specificity had a median survival time of less than 5 months (45). Clearly, some measure of the extent to which the specific ingredients of a therapy are being provided by the clinician and/or absorbed by the patient should be included in any multimodality research design.
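The kind of median-split comparison described above can be sketched in a few lines. The data below are invented for illustration and do not reproduce the study's results, and the function name is an assumption. The sketch also assumes that every patient's time to recurrence is observed; a real analysis would need a survival method that handles censored observations.

```python
from statistics import median

def median_split_survival(patients):
    """patients: list of (specificity_rating, weeks_until_recurrence).

    Returns the median survival time for patients rated above the median
    specificity and for those rated at or below it. Censoring is ignored.
    """
    cut = median([rating for rating, _ in patients])
    high = [weeks for rating, weeks in patients if rating > cut]
    low = [weeks for rating, weeks in patients if rating <= cut]
    return median(high), median(low)

# Hypothetical (rating, weeks) pairs, for illustration only.
patients = [
    (2.1, 10), (2.5, 18), (3.0, 22), (3.2, 30),
    (4.0, 70), (4.4, 95), (4.8, 110), (5.0, 130),
]
high_median, low_median = median_split_survival(patients)
print(f"above-median specificity: {high_median} weeks")
print(f"at or below median specificity: {low_median} weeks")
```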
Standardizing Nonpharmacotherapeutic Treatments
As noted above and as Elkin et al. (8) have pointed out, it is critical to multimodality research that the pharmacotherapy provision condition be "defined with sufficient clarity to ensure the adequate delivery of the active ingredients of this treatment." Treatment manuals and specific treatment training, along with ongoing monitoring of therapists' behavior, represent important methodologic advances in the direction of ensuring the adequate delivery of such treatment. Furthermore, actual ratings of the content of the pharmacotherapy/clinical management sessions have the potential to greatly enhance the interpretability of the results.
Justification for the Double-Placebo Condition
In relatively short-term trials with relatively milder conditions it is not difficult to justify a treatment condition in which the patient can expect to receive neither active pharmacotherapy nor active psychotherapy, provided that the subject is fully informed as to the nature of all the treatment conditions included in the trial. When it comes to more severe conditions and to longer-term trials even in less severe conditions, the investigator must consider carefully whether there is ethical justification for the double-placebo condition. In discussing this issue, O'Leary and Borkovec (43) argued that one approach to this ethical dilemma (admittedly, a rationalization) is to argue that if there is no scientific evidence of the efficacy of the active treatment, then it is not unethical to withhold it. The problem comes in the severe disorders where we do have treatments known to be effective. In these circumstances, asking a subject to consent to random assignment in a trial testing one or several new treatments and to forgo known effective treatments represents an ethical dilemma with no easy rationalization. It is around this very issue that the demands of science and the values of an ethical clinician investigator most often come into conflict, often with no fully satisfying solution.
If the investigator concludes that the scientific gains of a no-active-treatment condition outweigh the ethical concerns associated with it, it is then incumbent on the investigator to (a) do everything possible to maximize the extent to which the patient and family members are fully informed about the nature of the investigation and the probability of receiving the inactive drug/inactive therapy condition and (b) be prepared to provide active intervention as soon as treatment failure has been established. In our own work we have employed an ongoing consent process involving an initial informed consent while the patient is acutely ill, whenever possible including both the patient and a well family member in the consent process. This initial consent is followed by a second full disclosure of the nature of the investigation which takes place at a patient/family educational workshop timed to coincide with the early phase of clinical remission. At this time, subjects and their family members are reminded of the existence of the double-placebo condition and of the patient's freedom to withdraw from the investigation at any time. Finally, just prior to random assignment, the various treatment conditions are once again reviewed with the subject. While not all investigators may wish to mount a family educational workshop, at a minimum a process of ongoing consent which reminds patients and, when possible, family members of what it is they have consented to seems a reasonable way to proceed, particularly when one is concerned about the inclusion of a fully inactive condition.
Weighing the Costs and Benefits of Combination Strategies in Describing Results
A second ethical problem arises for the multimodality researcher in describing the results of an investigation in which the combination strategy has proven more effective than either of the single modalities alone. Here we must once again concern ourselves with the distinction between statistical and clinical significance. First, the investigator must decide whether the standard of clinical as well as statistical significance of superiority of the combination has been achieved. If this is the case, then it is incumbent on the investigator to describe the added therapeutic benefit measured against the additional cost incurred in the provision of the combined treatment. Only when the results are described in this way is the investigator fully meeting her or his responsibility to assess the results of the investigation.
This work is supported in part by NIMH grants MH29618, MH49115 (Dr. Frank), and MH30915 (Dr. Kupfer). Portions of this chapter were adapted from the following sources:
Frank E, Kupfer DJ, Levenson J. Continuation therapy of unipolar depression: the case for combined treatment. In: Manning D, Frances A, eds. Combination drug and psychotherapy in depression. Washington, DC: American Psychiatric Press, 1990;133–149.
Frank E, Kupfer DJ, Wagner EF, McEachran A, Cornes C. Efficacy of interpersonal psychotherapy as a maintenance treatment for recurrent depression: contributing factors. Arch Gen Psychiatry 1991;48:1053–1059.
Frank E, Kupfer DJ, Thase ME. Combining psychotherapy and psychopharmacology. In: Elliott GR, Ciaranello RD, Barchas JD, eds. Psychopharmacology: from theory to practice. New York: Oxford University Press (in press).
published 2000