CONDEMNATION OF A SCIENTIFIC ARTICLE: 
A CHRONOLOGY AND REFUTATION OF THE ATTACKS AND A DISCUSSION OF THREATS TO THE INTEGRITY OF SCIENCE

Bruce Rind
Department of Psychology, Temple University, Philadelphia, PA 19122 (rind@vm.temple.edu)

Philip Tromovitch
Graduate School of Education, University of Pennsylvania

Robert Bauserman
Department of Health and Mental Hygiene, State of Maryland

In: Sexuality & Culture, 4-2, Spring 2000

Content

[Abstract]
Introduction
Background
Summary of our two meta-analytic literature reviews
Meta-Analysis of National Probability Samples
Meta-Analysis of College Samples 
Chronology of the attacks
Refutation of Criticisms Concerning Methodology and Data Analysis
NARTH

"Dr. Laura" and the Criticism of Unpublished Studies

The Leadership Council and its Criticism that We "Loaded our Analysis"
Additional Criticisms by the Leadership Council
Sample bias
Inclusion of non-contact CSA
Poor measurement

Poor choice and interpretation of effect sizes

Deck stacked against CSA compared to FE
No way to infer causality or lack of it
Double standard of interpretation
Misleading interpretation of misrepresented data

Flawed conclusions not well supported by the data

Ignored majority of sample in arguing for lack of harmfulness
Summary
Refutation of Criticisms Concerning Conceptual Issues

Terminology

Our recommendations

Other sexologists concur that the term "abuse" is problematic
Consent
Simple vs. informed consent
Empirical validation of the consent construct
Children vs. adolescents
Summary
Response to the Congressional Resolution
Threat to Science
NARTH
The Leadership Council
"Dr. Laura"
The Family Research Council
Congress
Synthesis

Concluding Remarks

Need to separate morality and science
References

[Abstract]

On July 12, 1999, our meta-analysis on child sexual abuse published in Psychological Bulletin, one of the American Psychological Association's (APA) premier journals, was condemned by the U.S. Congress (H. Con. Res. 107). The condemnation followed months of attacks on the article, the APA, and us by various social conservatives and psychoanalytically-oriented clinicians. The American Association for the Advancement of Science (AAAS) was asked by the APA to independently review our article. After considering criticisms of it and the article itself, AAAS declined, but commented that it was the criticisms, not our methods or analyses, that troubled them because these criticisms misrepresented what we wrote.

The current article chronicles this whole affair. First, we provide background, explaining why an article such as ours was needed. Then we accurately summarize the article, given that it has been so widely misrepresented. Next we present a chronology of the events leading up to and following the condemnation. We then present and refute all the major criticisms of the article, which have included both methodological and conceptual attacks. Next we discuss the threat to science that these events portend. We conclude by discussing the need to separate moral judgments from scientific research, the conflation of which formed the basis for the distortions and condemnation.


Introduction

For over one hundred years the American Psychological Association (APA) has been publishing psychological theory and research in its various journals. July 12, 1999 may come to be seen as one of the most extraordinary dates in APA's publication history. On this day, the United States Congress, in an unprecedented move, condemned an article published by the APA in one of its most prestigious journals, Psychological Bulletin. The vote for House Concurrent Resolution 107 (H. Con. Res. 107) was 355-0, with 13 voting "present" (i.e., abstaining).

On July 30, 1999, the Senate unanimously passed it as well. The article, entitled "A Meta-Analytic Examination of Assumed Properties of Child Sexual Abuse Using College Samples," was published exactly one year earlier and was authored by us (Rind et al., 1998). In short, the resolution proclaimed our study to be "severely flawed," condemning and denouncing all suggestions in the article indicating that "sexual relationships between adults and 'willing' children are less harmful than believed."

In response to pressure from Congress, the APA pledged to seek independent review of the article, another unprecedented event in the long history of APA publications. The APA asked the American Association for the Advancement of Science (AAAS) to conduct the review. AAAS is the largest science organization in America and is publisher of the prestigious journal Science. In an October 1999 letter, the AAAS Committee on Scientific Freedom and Responsibility informed the APA that after "considerable deliberations taking into account the views of ... two consultants and extensive background materials on reactions to the published article," the Committee would not formally review the article (McCarty, 1999). They commented that they saw "no reason to second-guess the peer review process" used by Psychological Bulletin in its decision to publish the research. Importantly, the Committee went on to state that "[a]fter examining all the materials available to the Committee, we saw no clear evidence of improper application of methodology or other questionable practices on the part of the article's authors." Additionally, they commented that:

The Committee also wishes to express its grave concerns with the politicization of the debate over the article's methods and findings. In reviewing the set of background materials available to us, we found it deeply disconcerting that so many of the comments made by those in the political arena and in the media indicate a lack of understanding of the analysis presented by the authors or misrepresented the article's findings. All citizens, especially those in a position of public trust, have a responsibility to be accurate about the evidence that informs their public statements. We see little indication of that from the most vocal on this matter, behavior that the Committee finds very distressing.

Although the AAAS Committee ended its letter by noting that its decision not to review the article "should not be seen as either endorsement or criticism of it," the letter's other comments clearly point to the criticisms of the article, rather than the article itself, as the problem area. In an interview with the Philadelphia Inquirer (November 17, 1999, A20), the chair of the Committee, physicist Irving Lerch, commented that "[s]ome of the political statements were clearly self serving. I think some politicians tried to inflame or cash in on public sentiment by purposely distorting what the authors said."

In the current article we will document the distortions and misrepresentations made by our most vocal critics and discuss the threat posed to sexual science by these attacks. First, we will provide background on why a study such as ours was needed. Second, because of the numerous distortions, we will present an accurate summary of it. Third, we will detail the chronology of events that led to congressional condemnation. Fourth, we will present and refute the major methodological and conceptual criticisms of our review. Fifth, we will discuss the problems created by the attacks and distortions, which include not only the threat to science, but to individuals as well. Finally, we will conclude by arguing for the need to separate moral judgments from scientific research as a prerequisite for achieving a valid understanding in science in general and sexual science in particular.

Background

In the 1970s, the feminist movement raised consciousness first about rape and then incest, two problem areas they believed society had not taken seriously enough up to that time. Feminist ideas about the nature of rape served as the model for conceptualizing incest, which in turn came to serve as the model for understanding all forms of age-discrepant sexual relations involving minors (Finkelhor, 1984; Okami, 1990). Thus, the anti-incest campaign was a central factor in the evolution of thinking about child sexual abuse (CSA).

Anti-incest advocates helped to revive Freud's seduction theory, which had lain dormant for seven decades (Frontline, 1995a; Nathan & Snedeker, 1995; Pendergrast, 1996). Seduction theory held that childhood sexual seduction is the cause of all adult neuroses (Crews, 1998; Esman, 1994). Freud claimed that seduction is so disturbing that the psyche tends to repress memories of it. Though submerged, he maintained, these memories nevertheless express themselves in the form of symptoms, which can only be relieved by bringing the memories back to conscious awareness. Freud quickly abandoned this theory, however, claiming instead that the wish for seduction, rather than actual seduction, produces repressed memories.

By the mid-1970s, feminist advocates challenged his decision as lacking in moral courage, vowing that victims would no longer be silenced (Crews, 1998; Frontline, 1995a). As a consequence of this advocacy, Freud's unitary concept of pathogenesis reemerged, and therapists increasingly came to view the evocation and abreaction of repressed memories as the sine qua non of therapeutic success (Esman, 1994).

But advocacy is not science. Freud's unitary concept, though politically expedient in the 1970s and beyond, was extreme at best and completely invalid at worst (Crews, 1998; Frontline, 1995a; Tavris, 1993). A major problem with this advocacy was the exaggerated claims about the effects of CSA that it was prone to fostering (Best, 1997). Such exaggeration has iatrogenic potential, as Browne and Finkelhor (1986) concisely cautioned:

... it is important that advocates not exaggerate or overstate the intensity or inevitability of these consequences. In addition to policymakers, victims and their families wait for research findings on the effects of sexual abuse and they may be further victimized by exaggerated claims about the effects of sexual abuse. It is not possible to maintain two sets of conclusions about the effects of abuse: a dire one for political purposes, and a hopeful one for family members. Thus the presence of both audiences requires that those who conduct and interpret research in this field maintain a posture of objectivity and balance (p. 178).

Other researchers similarly cautioned against hasty generalizations regarding CSA effects because secondary victimization stemming from overreaction is a real and serious threat (Baurmann, 1983; Seligman, 1994), and labeling someone as a victim and then treating him or her accordingly can be self-fulfilling (Constantine, 1983).

Many victimologists and other commentators on CSA effects did not heed these warnings, however. A sort of tunnel vision developed, in which researchers and others uncritically interpreted all data as evidence for CSA's pathogenicity. Numerous examples can be cited (see Okami, 1990, 1994; Tavris, 1993); here we provide three. 

In a widely cited study, Burgess et al. (1984) found that the majority of their child and adolescent subjects showed vague or no symptoms while their CSA was occurring, but they developed many symptoms after the intervention of the criminal justice and social service systems. Rather than parsimoniously attributing these symptoms to the intervention, they interpreted them as posttraumatic stress stemming from the CSA. 

Kelly (1993) reported that nearly half of his child subjects fell in the clinical range of traumatic stress after being sexually and ritually abused. However, his subjects were children in the McMartin preschool case, which has been thoroughly discredited. The stress responses were iatrogenic, stemming from therapeutic and criminal justice intervention, as well as parental anxiety and panic (Nathan, 1990; Nathan & Snedeker, 1995).

In their often-cited Psychological Bulletin literature review, Kendall-Tackett et al. (1993) used these two studies among others to argue that posttraumatic stress disorder (PTSD) is a relatively frequent consequence of CSA. They also inflated the impression that CSA is pathogenic by including sexualized behavior as a symptom -- it was the most frequently reported "symptom" in the studies they reviewed.

This designation as a symptom, however, is a value judgment (cf. Constantine, 1983), whose scientific validity is not supported by cross-species and cross-cultural data. In their seminal review of these data, Ford and Beach (1951) observed that "[a]s long as the adult members of a society permit them to do so, immature males and females engage in practically every type of sexual behavior found in grown men and women" (p. 197) and concluded that "tendencies toward sexual behavior before maturity and even before puberty are genetically determined in many primates, including humans" (p. 198).

The overstatement of child abuse researchers and advocates regarding CSA effects on psychological adjustment and memory led to profound iatrogenesis in the 1980s and 1990s (Nathan & Snedeker, 1995; Pendergrast, 1996). 

Between 1983 and 1993, staff members in dozens of day care centers around the country were accused by preschoolers of bizarre acts of CSA often accompanied by satanic ritual abuse (SRA). The appearance of a rash or ambiguous remark made by a single preschooler might ignite a full-scale police investigation of all the children at the center. Initially, the children would deny any abuse. But after repeated interrogations by therapists, the children would begin speaking of being molested in tunnels or magic rooms, sodomized with curling irons or swords, forced to consume feces or drink the blood of sacrificed babies, and made to watch animals being ritually tortured and killed. Children's initial denials were attributed to PTSD, wherein it was believed that they failed to reveal the truth because of fear or amnesia. The complete absence of physical evidence to corroborate claims of brutal abuse was ignored. Dozens upon dozens of staff members were put on trial -- sometimes the lengthiest and most expensive in their state's history -- and were frequently convicted and sentenced to lengthy prison terms (Frontline, 1991, 1993, 1998; Nathan, 1990; Nathan & Snedeker, 1995).

Academic research has essentially debunked the validity of these cases. Researchers have empirically demonstrated that false memories can readily be implanted in preschoolers (e.g., Ceci & Bruck, 1993; Garven et al., 1998). While Ceci and his colleagues have focused on simple suggestion as a powerful implantation tool when suggestion is repeated over time, Garven et al. showed that the full range of techniques employed by the social workers in the McMartin case is substantially more powerful. Aside from suggestion, these techniques included social influence (e.g., misinforming children that others had already made allegations), positive and negative consequences (e.g., praise for "correct" answers, disappointment for "incorrect" ones), asked-and-answered questions (i.e., asking a child a question again that he or she has just unambiguously answered, which conveys to the child that his or her first answer was wrong), and inviting speculation (i.e., asking the child to pretend or imagine what might have happened). Using preschoolers as subjects, Garven et al. showed that the full package of techniques more than tripled false memories (58% error rate) compared to suggestibility alone (17% error rate) during a single session. Investigative reports (e.g., Nathan, 1990; Nathan & Snedeker, 1995) and television documentaries (e.g., Frontline, 1991, 1993, 1998) have detailed the coercive techniques employed by therapists and prosecutors that elicited false allegations.

From the mid- to late 1980s and continuing well into the 1990s, thousands of mostly female adult patients began "recovering" memories of CSA while in therapy. Typically, the patient entered therapy complaining of unhappiness, but not knowing why. After the patient was told that CSA might be the cause, encouraged to remember CSA experiences, and advised to read books such as Bass and Davis' (1988) The Courage to Heal, vague memories of abuse might develop, which would later coalesce into vivid, detailed memories after many more therapy sessions.

In thousands of such cases, patients proceeded to sue their assumed perpetrator (e.g., father), wreaking havoc in as many families (Jaroff, 1993; Ofshe & Watters, 1993; Pendergrast, 1996). Frontline's (1995a) documentary "Divided Memories" provided numerous vivid examples. In one, a young woman not knowing how to relate to her mother went into therapy. After several months, with the encouragement of her therapist, she developed memories that eventually crystallized into recollections of satanic rituals involving her parents and grandparents. She was diagnosed as suffering from multiple personality disorder with twenty-seven alters. With her therapist's support, she sued her parents for $20 million. In a deposition, she claimed that her mother inserted a broomstick into her vagina, as well as spiders, wires, and vegetables, and that her father sexually assaulted her with tools from a hardware store. When reminded that she had perfect school attendance and pediatric reports, the daughter reacted, "Oh, that doesn't matter."

Although many clinicians continue to believe that recovered memories in adult patients are valid, academic researchers have generally concluded that recovered memories are not supported by empirical data (e.g., Brandon et al., 1998; Crews, 1998; Kihlstrom, 1997; Loftus, 1993, 1997; Loftus & Polage, 1999; Ofshe & Watters, 1993; Pope & Hudson, 1995).

In a recent comprehensive review of the literature, Brandon et al. (1998) found that opinions have often been expressed with great conviction but without evidence; they concluded that there is a high probability that memories "recovered" after long periods of amnesia are false. Their conclusions formed the basis of the official report issued by the Royal College of Psychiatrists in Great Britain in 1997. Frontline's (1995a, 1995b) in-depth documentaries for PBS television have vividly portrayed the highly questionable nature of recovered memory therapy and the extensive iatrogenic damage it has engendered.

These dramatic events stemmed from social beliefs about the effects of CSA based on advocacy rather than empirical science. Some critics have referred to these events as the third wave of hysteria in America, the first two being the Salem witch trials and McCarthyism (e.g., Gardner, 1993). These occurrences highlight the importance of Browne and Finkelhor's (1986) caution that exaggeration and overstatement have iatrogenic potential, and thus objectivity and balance are imperative in research on CSA. Our Psychological Bulletin article was in part a response to the problems just discussed. Our goal was to strive for greater objectivity and balance by strict adherence to sound methodological principles and analytical techniques, notably missing in many previous treatments of this issue.

Summary of Our Two Meta-Analytic Literature Reviews

Since the early 1980s, several dozen literature reviews examining the relationship between CSA and psychological adjustment have been conducted. Most have been qualitative reviews, providing narrative summaries of results from studies based mainly on clinical and legal samples. Consistent with the pathogenic view of CSA, these reviews have generally concluded that CSA causes such diverse problems as depression, anxiety, low self-esteem, sexual dysfunction, dissociation, suicidal ideation, and PTSD, and that these effects are pervasive in the general population of persons who have experienced CSA. However, these conclusions and the methodology used to achieve them are problematic.

First, given that CSA has been consistently shown in the primary studies to be confounded with other problems such as physical abuse and emotional neglect, it cannot safely be assumed that CSA-symptom associations reflect the effects of CSA. 

Second, it cannot be safely assumed that findings from clinical and legal samples generalize. Such samples may be biased subsets of CSA experiences in the general population, overestimating negative effects or correlates.

Third, narrative reviews are imprecise and subjective. It is not enough to know that CSA subjects are more poorly adjusted than controls; it is also important to know by how much. The narrative approach is also more vulnerable to confirmation bias, meaning that a reviewer with an initial belief may invalidly confirm the belief by emphasizing supportive findings while minimizing or ignoring unsupportive findings.

To address these shortcomings, we conducted two literature reviews examining the psychological correlates of CSA (Rind & Tromovitch, 1997; Rind et al., 1998). We focused on non-clinical/non-legal samples for better generalizability.

To obtain greater precision and objectivity we analyzed the data in a quantitative fashion by using meta-analysis, a technique that statistically summarizes results across studies. In our meta-analyses, each result represented the magnitude of association, as measured by Pearson's r, between CSA and adjustment, with small, medium, and large effect sizes corresponding to rs of .10, .30, and .50, respectively (Cohen, 1988). 
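As a rough illustration of what "statistically summarizes results across studies" can mean in practice, the sketch below pools correlations with a sample-size-weighted average of Fisher z values and labels the result against Cohen's benchmarks. This is one common textbook approach and is only assumed here; it is not presented as the exact procedure used in either review. The (r, n) pairs are hypothetical values invented for the example, and the function names are ours.

```python
import math

def fisher_z(r):
    """Fisher's r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)

def weighted_mean_r(studies):
    """Pool (r, n) pairs: average Fisher z weighted by n - 3, then back-transform to r."""
    total_weight = sum(n - 3 for _, n in studies)
    mean_z = sum((n - 3) * fisher_z(r) for r, n in studies) / total_weight
    return math.tanh(mean_z)

def cohen_label(r):
    """Label |r| by the nearest of Cohen's (1988) benchmarks: .10, .30, .50."""
    benchmarks = {"small": 0.10, "medium": 0.30, "large": 0.50}
    return min(benchmarks, key=lambda name: abs(abs(r) - benchmarks[name]))

# Hypothetical (r, n) pairs, invented purely for illustration; not data from any study.
hypothetical_studies = [(0.05, 400), (0.12, 250), (0.09, 1000)]

pooled = weighted_mean_r(hypothetical_studies)
print(f"pooled r = {pooled:.3f} ({cohen_label(pooled)})")
# r squared gives the proportion of variance shared by the two variables.
print(f"proportion of shared variance = {pooled ** 2:.4f}")
```

Weighting by n - 3 gives larger studies more influence on the pooled estimate, and squaring r shows how little variance a small correlation accounts for (an r of .10 corresponds to 1 percent).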

We analyzed results separately for males and females, and examined whether CSA-adjustment relations could be causally construed. Each literature review served as a test of four hypotheses:

(a) CSA causes harm; 

(b) this harm is pervasive in the population of persons with a history of CSA; 

(c) this harm is likely to be intense; and 

(d) CSA is equivalent for boys and girls in terms of its widespread and intensely negative effects.

 These four hypotheses were derived directly from victimology and trauma theory, originating in turn from Freud's seduction theory.


Meta-Analysis of National Probability Samples

In our first meta-analysis (Rind & Tromovitch, 1997), we included only studies based on national probability samples. Because these samples were selected to be representative of national populations, their results are clearly more relevant to inferring effects in the general population than are clinical or legal samples, which represents a major strength of this review.

Prevalence rates for CSA for males and females were 11 percent and 19 percent, respectively. Although statistically significant, the association between CSA and adjustment was small: effect size r = .07 for males and r = .10 for females. Females reported more negative effects and reactions than males did. Although 68 percent of females reported some sort of negative effect at some point in their lives since the CSA occurred, only 42 percent of males did. This difference was statistically significant and of medium-small magnitude, r = .23.

One study inquired about perceptions of permanent harm, finding it rare for both females (13%) and males (4%) (Baker & Duncan, 1985). Another study asked subjects about emotional reactions at the time of the CSA (Lopez et al., 1995). Whereas only a minority of males reacted with negative emotions such as fear, disgust, and hostility at the time of the CSA, the majority of females did. Conversely, more males than females reacted with pleasure (27% vs. 10%). Across all emotions, there was a statistically significant sex difference of medium magnitude, r = .31. Finally, CSA was confounded with other environmental problems. One study found that sexually abused (SA) girls tended to have disruption in their family, school, and social environments before their CSA experience (Ageton, 1988).

Results contradicted or failed to support the four hypotheses. The confounding between CSA and third variables weakened assumptions about causality. The self-reported effects suggested that lasting harm is far from pervasive in the general population. The small effect sizes indicated that effects are not typically intense. 

Finally, consistent sex differences in self-reported effects and emotional reactions indicated that males and females, on average, experience CSA quite differently. The importance of these findings is bolstered by use of national probability samples and precise statistical techniques. Weaknesses included the small number of studies used and scant data relevant to assessing causality.

Meta-Analysis of College Samples 

What about the review of college studies -- the research that led to the congressional resolution? The small number of national probability samples suggested the need to extend this research by examining data from a larger collection of non-clinical/non-legal samples where a greater amount of data relevant to assessing causality could be obtained.

We chose to focus on studies based on college samples because: 

(a) these studies comprise the largest body of non-clinical/non-legal research on CSA;

(b) they are more representative of CSA in the general population than are clinical or legal samples because about 50 percent of adults in the U.S. have had some degree of college experience; 

(c) they provide the most extensive data on moderators of CSA-adjustment relations, which are relevant to assessing causality; and 

(d) they provide a rich amount of data on self-reports of effects and reactions, which are relevant to assessing pervasiveness of effects and sex differences. Another important advantage of these studies is the large amount of data available on males; most previous literature reviews have focused on females, including two meta-analytic reviews (Jumper, 1995; Neumann et al., 1996).

Potential shortcomings of using college samples are that students may be too young for symptoms to appear and students' CSA experiences might be less intense or severe than in the general population. The first possibility is doubtful; Neumann et al. (1996) showed that women younger and older than 30 were the same in terms of CSA-adjustment relations. As we demonstrated in our Psychological Bulletin article, the latter possibility is also unsupported. Prevalence rates were at least as high in the college population (14% for males and 27% for females) as in the general population, as indicated by results taken from national probability samples. Types of CSA (i.e., exhibitionism, fondling, oral sex, and intercourse) in the college and national samples were similar, as were prevalence rates of incest and the frequency of CSA. Meta-analysis of the college data produced the same effect size estimates as the national probability data: rs = .07 and .10 for males and females, respectively.

Across eighteen different symptoms, effect sizes were generally homogeneous and were all small, ranging from r = .04 for self-esteem to r = .13 for anxiety. The magnitude of CSA-adjustment relations varied across studies as a function of the interaction of subjects' gender and reported willingness to participate, but not as a function of whether the CSA involved physical contact.

Males who were involved in willing CSA were not different from controls in terms of adjustment. Willing and unwilling females and unwilling males, however, were somewhat more poorly adjusted than controls. Symptoms, reactions, and/or self-reported effects were worse when incest or the use of force was involved, but were unrelated to the occurrence of penetration or the duration and frequency of the CSA.

Positive, neutral, and negative reactions to CSA were, respectively, 37, 29, and 33 percent for males and 11, 18, and 72 percent for females. Negative reactions were significantly more frequent for females than males, with a medium effect size, r = .31.

The vast majority of males and females reported no negative effects on their sex lives. Reports of lasting general effects were uncommon but were more frequent for females, and temporary negative effects were reported by a minority of males and a majority of females. Overall, reports of negative effects were significantly more frequent for females than for males, with a small-to-medium effect size, r = .22. Effect sizes for sex differences in reactions and self-reported effects were virtually identical to those in the national probability samples (rs = .31 and .23, respectively).

To investigate causality, we examined family environment (FE) as a third variable that might account for some or all of the association found between CSA and adjustment, which was r = .09 across male and female subjects. We meta-analyzed FE-CSA relations and found that the two variables were reliably confounded with a small effect size, r = .13. 

Next, we meta-analyzed FE-adjustment relations, finding that they were statistically significantly related with a medium effect size, r = .29.

These latter two findings suggested that the CSA-adjustment relation would be even smaller if FE were statistically controlled for. Thus, we examined all studies that employed statistical control and found that 41 percent of CSA-adjustment relations were significant before statistical control but only 17 percent were significant afterwards -- a 59 percent reduction of statistical significance when factoring out FE.
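The reasoning in this passage can be made concrete with a small sketch. It assumes, purely for illustration, that the three meta-analytic correlations reported above can be treated as if they came from a single sample and were coded in the same direction, which allows a first-order partial correlation to be computed; this is a textbook formula, not the statistical control used in the primary studies. The second function simply reproduces the percentage-reduction arithmetic. Function and variable names are ours.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def relative_reduction(before, after):
    """Proportional drop from `before` to `after`."""
    return (before - after) / before

# Correlations reported in the text; combining them in one formula is an assumption.
r_csa_adj = 0.09  # CSA with adjustment
r_fe_csa = 0.13   # family environment (FE) with CSA
r_fe_adj = 0.29   # family environment (FE) with adjustment

# Illustrative CSA-adjustment association after partialling out FE.
print(round(partial_r(r_csa_adj, r_fe_csa, r_fe_adj), 3))

# 41 percent of relations significant before control, 17 percent afterwards.
print(round(relative_reduction(41, 17) * 100))
```

Under these assumptions the partial correlation comes out near .055, smaller than the uncontrolled .09, and the drop from 41 to 17 percent corresponds to the 59 percent reduction stated above.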

The small CSA-adjustment associations indicated that CSA does not produce lasting, intense effects on average in the college population. The self-reported effects and reaction data indicated that negative effects are far from pervasive and the two sexes respond quite differently to CSA.

Analyses of confounding and statistical control indicated that causality is not well supported in the typical case. Nevertheless, CSA's potential for harm was indicated in a portion of the cases by self-reports of negative effects and the fact that some measures remained significant after statistical control. 

Results suggested that negative effects are increased by incest and use of force and, conversely, are lessened or eliminated by willingness of participation. 

Finally, the consistent similarities between results from the college and national samples suggest that these findings may be generalizable. Not only were prevalence rates, types of CSA, extent of incest, and frequency of episodes similar, but so were the magnitude of CSA-adjustment relations, self-reported effects and reactions, and sex differences in self-reported effects and reactions. 

These were the methods, findings, and conclusions that led to the later attacks.

Chronology of the Attacks

Our Psychological Bulletin article appeared in print in July 1998. In December of that same year, the National Association for Research and Therapy of Homosexuality (NARTH), an organization with psychoanalytically-oriented clinicians at its core dedicated to the cure and prevention of homosexuality, offered the first critique, which appeared on its web page (http://www.narth.com). This critique, though scientifically superficial and morally based, was highly instrumental in initiating a chain of events that led to congressional condemnation of our article. It formed the basis of numerous subsequent attacks by social conservatives, the first of which appeared in early March 1999 in The Wanderer, a conservative Catholic newspaper. This attack, drawn heavily from NARTH's critique, called our review a "pseudo-professional, pseudo-academic analysis." It claimed that "a team of academics from Temple university has endorsed the view that adult-child sexual relations are beneficial ... and recommends overhauling and euphemizing the language of sexual abuse." It expressed regrets that homosexuality was depathologized and feared the same would now happen to pedophilia.

A listener to a Philadelphia talk station sent The Wanderer article to one of the station's hosts, who contacted one of us (Tromovitch), who agreed to appear on his radio show to discuss the study. After this appearance, the host launched an attack on Temple University, the paper, and us along the lines of the critique presented in The Wanderer. Another listener who heard this interview and the host's ensuing attacks wrote a letter to "Dr. Laura" Schlessinger, who runs a nationwide syndicated radio talk show, broadcast daily by about 485 radio stations in the U.S. and Canada, reaching about 20 million listeners.

"Dr. Laura" is a social conservative who staunchly espouses family-values positions and has been dubbed "the poster girl of the Christian fundamentalists" by Vanity Fair magazine (Der Spiegel, August 2, 1999). The writer of the letter began by thanking "Dr. Laura" for her "constant fight against the assault on our families" and then characterized our study as a "new assault" (Schlessinger, 1999).

On March 22, "Dr. Laura" began a series of attacks against our article and the APA that lasted for months. In developing her criticisms, she "had three renowned, licensed clinical psychologists and a scientist" review the article, who unanimously described it as "junk science" (Schlessinger, 1999). Three of these reviews, authored by clinical psychologists Samenow and NARTH members Nicolosi and Van den Aardweg, appeared on her web page (http://www.drlaura.com/monologue). Finally, she made extensive use of criticisms provided to her by Paul J. Fink, former president of the American Psychiatric Association and current president of a new group called the Leadership Council for Mental Health, Justice, and the Media.

The Leadership Council consists largely of persons whose psychological thinking is based on the psychoanalytic paradigm, who practice or advocate recovered memory therapy and the diagnosis and treatment of multiple personality disorder. They represent themselves as a nonprofit organization composed of "many of the nation's most prominent mental health leaders" whose mission it is to "insure the public receives accurate information about mental health issues" (press release, May 24, 1999).

Fink, writing on behalf of his organization, sent a letter to "Dr. Laura" asserting that we "loaded" our analysis with data involving primarily mild adult-child interactions with no physical contact, and that 60 percent of our data came from one single study done over forty years ago. Because these erroneous claims had a major influence on the fate of our paper, we will discuss them in detail later.

Shortly after "Dr. Laura" began attacking our study, the Family Research Council (FRC), a socially conservative lobbying group in Washington, D.C., joined in the attack. Its stated purpose is to "reaffirm and promote nationally ... the traditional family unit and the Judeo-Christian value system upon which it is built." To achieve this, it "keeps watch over political and cultural forces that threaten the traditional family, with a special focus on the homosexual agenda." As such, the FRC has much common ground with NARTH and "Dr. Laura."

Attacks by the FRC, in turn, prompted the Alaska State Legislature to react, the first governmental body to do so. In mid-April 1999 a press release announced House Joint Resolution 36, which rejected the conclusions of our study, noting that "peer review has identified several questionable assumptions and methodologies in this research paper" (the only peer review performed on our article was that by Psychological Bulletin; presumably this "peer review" referred to "Dr. Laura's" three clinical psychologists and a scientist).

This resolution in turn served as the blueprint for numerous subsequent state resolutions as well as H. Con. Res. 107. It stated that the legislature "condemns and denounces all suggestions ... [that] sexual relationships between adults and willing children are less harmful than believed and might even be positive for 'willing' children," and concluded by encouraging "competent investigations to continue to research the effects of child sexual abuse using the best methodology so that the public and public policymakers may act upon accurate information."

On May 12, 1999, the FRC held a press conference in Washington, D.C. to demand that the APA repudiate our study. The FRC's press release for this event stated that our study was "based on the premise that a child can actually consent to sex with an adult." Participants included, among others, "Dr. Laura" via satellite, a representative from NARTH, and three conservative Republican congressmen (Reps. DeLay-TX, Salmon-AZ, and Weldon-FL).

Tom DeLay said the "lack of judgment shown by the American Psychological Association in publishing it absolutely confounds me" and "challenged" the APA to publicly admit it erred in publishing the article (Washington Times, May 13, 1999). According to the FRC's web page, Salmon called on the APA to denounce the study as "sick and twisted."

Two days after the press conference, Raymond Fowler, CEO of the APA, appeared on MSNBC with Congressman Weldon. Weldon said that our study was "a very, very bad study ... based on some very, very bad data" and that it should never have been published. Fowler replied that, "Well, with all due respect, it isn't a bad study. It's been peer-reviewed by the same principles as any kind of scientific publication. It's been examined by statistical experts. It's a good study." Weldon disagreed, saying that our study was based on "what they call meta-analysis, where they take a whole bunch of studies and put them together. But a whole bunch of studies that they put in this study were never peer-reviewed, and 60% of them were based on one study done over 40 years ago in 1946 [sic]." Weldon then read a quote from Fink, the source of the latter criticism.

Shortly after the MSNBC debate, Fowler contacted us, asking about the meaning of the 60 percent figure, adding that members of Congress were using it as "major data for discrediting" both the APA and us. We provided a complete refutation of Fink's comments (detailed below), but to no avail. The APA found itself in the middle of a growing firestorm thanks to a coalition of social conservatives, two sorts of psychoanalysts -- anti-homosexual and repressed memory advocates -- and conservative Republican congressmen. As Fowler commented to us on June 8, he was "in hand to hand combat with congressmen, talk show hosts, the Christian Right, and the American Psychiatric Association."

This pressure, especially from Congress -- with whom the APA must negotiate for political support and funding for both clinical treatment and behavioral research -- proved to be too great. On June 9, Fowler wrote a letter to Congressman DeLay that started by commending DeLay for his strong stand against sexual abuse. Later he wrote that our article "included opinions of the authors that are inconsistent with APA's stated and deeply held positions." He then specified the "inconsistencies":

It is the position of the Association that sexual activity between children and adults should never be considered or labeled as harmless or acceptable. Furthermore, it is the position of the Association that children cannot consent to sexual activities with adults.

Finally, Fowler offered a series of unprecedented concessions, among them that the APA would seek independent evaluation of the scientific quality of our article and that its journal editors would be asked to "fully consider the social policy implications of articles on controversial topics." As a result, APA was congratulated rather than condemned in H. Con. Res. 107 for "clarifying its opposition to any adult-child sexual relations ... and for resolving to evaluate the scientific articles it publishes in light of their potential social, legal, and political implications."

Critics of our review hailed Fowler's concessions and Congress' condemnation as a major victory. In an August 2, 1999 newsletter, the FRC listed its recent accomplishments, one being its role in Congress' condemnation of our article. The Leadership Council also took credit for Congress' condemnation, as well as the APA's decision to have our paper reevaluated. In an unpublished letter written to the Los Angeles Times, Fink, Silberg, and Dallam (1999) wrote that

Our in depth analysis of the study was presented to the APA and members of the United States Congress. Soon after reading our analysis, APA took the unprecedented step of sending the study to an independent scientific organization for review. And Congress passed a bill condemning the study...

They later emphasized that "Congress passed the bill only after receiving our analysis. Thus they had firm data to support their position" (italics in the original).

But our critics' celebration is not the end of the story. As discussed previously, AAAS rebuked their misunderstandings and distortions, finding no fault in our methods or analyses. AAAS's opinion was far from isolated.

Once Congress got involved, prompting news coverage in the mainstream media, numerous supportive commentaries began to emerge and continue as of this writing. These include, among others, articles in

The New Republic (May 28, 1999, by J. Zenergle, feed service),

Der Spiegel (August 2, 1999, by R. Paul, pp. 190-91),

National Journal (August 7, 1999, by J. Rauch, pp. 2269-70),

New York Times Magazine (November 7, 1999, by A. Sullivan, pp. 38-40),

the Skeptical Inquirer (January/February 2000, by Berry & Berry, p. 20),

The Public Interest (Winter 2000, by G. Zuriff, pp. 29-39), and

Lingua Franca (February, 2000, by S. Cole, pp. 12-14).

Supportive newspaper columns or op-ed pieces have included, among many others,

the San Francisco Chronicle (June 17, 1999, by J. Carroll, E12),

Los Angeles Times (July 19, 1999, by C. Tavris), and

Boston Globe (August 1, 1999, by S. Lamb, E1).

The essence of these commentaries was that our review was properly done and made important and valid distinctions, but had been seriously distorted by social conservatives and exploited by political opportunists. For example, in his popular syndicated column "Savage Love" (July 29, 1999), Dan Savage commented:

Why is this controversial? Speaking as a survivor of CSA at 14 with a 22-year-old woman; sex at 15 with a 30-year-old man -- I can back the researchers up; I was not traumatized by these technically illegal sexual encounters; indeed, I initiated them and cherish their memory. It's absurd to think that what I did at 15 would be considered "child sexual abuse," or lumped together by lazy researchers with the incestuous rape of a 5-year-old girl.

Refutation of Criticisms Concerning Methodology and Data Analysis

The AAAS opinion released in the fall of 1999 characterized the vocal critics of our article as lacking understanding of the analyses, misrepresenting the findings, and/or failing in their responsibility to be accurate about the evidence that informed their public statements.

In this section, we document the misunderstandings and misrepresentations concerning our methods and analyses. In the next section, we will document the misunderstandings and misrepresentations concerning two conceptual issues that led to especially harsh denunciations: our recommendations regarding terminology and our use of the construct of willingness or consent.

NARTH

NARTH's critique, as mentioned earlier, was highly influential in igniting the attacks on our article. Its specific criticisms, however, did not deal directly with our methodology or data analysis -- points with which any valid scientific rebuttal ought to be concerned. Instead, NARTH's critique focused on the fear that pedophilia would be normalized in the same way that they allege homosexuality was -- by suggesting the use of value-neutral terms. NARTH dealt with our findings not by challenging them, but by presenting its own view as universal truth -- i.e., CSA is invariably intensely harmful.

Citing mostly clinical research, NARTH claimed that the consequences of childhood seduction include guilt, shame, anxiety, lowered self-esteem, depression, vulnerability to drug and alcohol abuse, increased risk for suicide, and sexual problems (e.g., homosexuality).

In making these claims, NARTH repeated the very same errors we described in the introductions to our two meta-analytic reviews. To reiterate, we argued for the need to conduct yet another literature review because previous ones predominantly focused on clinical samples, reported extensive lists of symptoms without qualifications, assumed CSA was the cause of all problems despite confounding, and inappropriately generalized from clinical samples to all experiences of CSA. We discussed the need for more representative samples and caution in causal interpretation, and in our own reviews used national probability and college samples and paid careful attention to causal interpretation through statistical analysis and control. We demonstrated the usefulness of college samples by their high degree of correspondence with national samples in all relevant measures. NARTH completely ignored our logic, methods, and analyses. As if in "denial," NARTH covered its eyes and just reiterated the prevailing dogma. Remarkably, although especially superficial, its critique had a major impact.

"Dr. Laura" and the Criticism of Unpublished Studies

One of "Dr. Laura's" sources (Samenow) complained that our review was weak because 38 percent of the studies were never peer reviewed. "Dr. Laura" popularized this criticism, and Congressman Weldon used it to attack our study in his debate with Fowler on MSNBC. However, this criticism is methodologically and empirically without merit.

As a methodological issue, using unpublished studies in a meta-analysis is not only accepted practice, but is actually encouraged, provided the studies are well conducted (Rosenthal, 1994). Including unpublished studies is important because journals may have a bias toward publishing studies with statistically significant results, which in turn inflates estimates of the magnitude of relationships between variables (Rosenthal, 1984). To overcome this bias, referred to as the "file drawer problem" (i.e., studies with non-significant results are stashed away in file drawers rather than being published), meta-analysts who use only published studies and find an overall statistically significant relationship often compute the number of unpublished studies with null results that it would take to make the result non-significant.
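The fail-safe computation just described can be sketched in a few lines. This is a minimal illustration, assuming the common form of the fail-safe N usually attributed to Rosenthal, in which the standard normal deviates (Z scores) of the k located studies are combined; the function name and the example Z scores are hypothetical and are not taken from the review.

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Estimate how many unlocated null-result studies would be needed to
    bring the combined result down to the critical value z_crit.

    Assumption: studies are combined as Z = sum(z) / sqrt(k), so adding X
    studies that average Z = 0 gives sum(z) / sqrt(k + X) = z_crit.
    The default z_crit = 1.645 is the one-tailed .05 level (an assumption).
    """
    k = len(z_scores)
    total = sum(z_scores)
    if total <= 0:
        # The combined result is not in the predicted direction at all.
        return 0.0
    return max(0.0, total ** 2 / z_crit ** 2 - k)


# Hypothetical Z scores for four published studies (illustration only).
example = [2.0, 1.5, 2.5, 1.0]
print(round(fail_safe_n(example), 1))
```

A large fail-safe N relative to the number of located studies suggests that the file drawer is unlikely to overturn the result; a small one suggests the opposite.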

An alternative is simply to locate and use unpublished studies, as we did. Most (21 out of 23) of the unpublished studies we used were doctoral dissertations and two were master's theses. As most academics know, doctoral dissertations are generally well supervised by a group of Ph.D.s through all stages of the study, and must be defended before a panel of Ph.D.s who may question the methods, analyses, and conclusions. Thus, dissertations appear to meet Rosenthal's (1994) "well-conducted" criterion.

The argument just made, however, may not satisfy critics. But there is a more compelling argument -- an empirical one. If the unpublished studies are anomalous compared to published studies -- in this case, vastly underestimating CSA-symptom associations -- then there should be a difference in CSA-symptom associations in the published versus unpublished studies.

We conducted such an analysis and presented its results in our review (Rind et al., 1998, p. 34). We found that these associations did not statistically differ. The mean effect size was r = .11 for 27 published samples and r = .08 for 24 unpublished samples. From a practical point of view, these mean effect sizes are the same -- both small and both hovering around r = .09, the mean effect size obtained in the published studies using national probability samples. Additionally, among the 54 effect sizes that we meta-analyzed, all but three were homogeneous -- i.e., statistically consistent with the overall mean effect size for the college samples, which was also r = .09.

The three outliers, two above the mean and one below, were all published. Thus, all unpublished studies were consistent with the overall trend, showing they were not anomalous and that they did not bias the results.

The Leadership Council and its Criticism that We "Loaded our Analysis"

As noted previously, the Leadership Council and its president, Paul J. Fink, were highly influential in characterizing our study as flawed and having it condemned. Congressman Weldon cited Fink's comments as authoritative in showing that our study was based on very bad data, should never have been published by the APA, and should be repudiated by the APA.

In an opinion piece, which we will discuss later in more detail, psychologist Romy Cawood defended our article as sound in all respects (Greensboro News & Record, July 4, 1999, H7). In response, one letter argued that Cawood left out some important facts, foremost of which was that "Dr. Paul J. Fink, past president of the American Psychiatric Association, has condemned the study" (Dayton Daily News, June 26, 1999, 13A).

Another, written by Keven Bellows, the vice president and general manager of the "Dr. Laura" program, complained that Cawood's ridiculing of "Dr. Laura" was unfounded because she, rather than relying on her own analysis to debunk our study, consistently referenced "an analysis refuting the study by the Leadership Council. ... It was Fink's scathing critique of the study that prompted Congress to pass a near-unanimous resolution condemning it" (Dayton Daily News, July 1, 1999, 11A). Because of the enormous influence wielded by Fink, his group, and their critique, we now devote considerable attention to their criticisms.

Right after he appeared on MSNBC, Fowler of the APA asked us for clarification on Fink's assertions that, of the 59 studies we used, over 60 percent of the data came from one single study done 40 years ago, and that we "loaded" our analyses with data that were based on primarily mild interactions involving no physical contact. Fink was referring to the Landis (1956) study.

The following is a summary of the points we made to demonstrate that these assertions are specious.

First, the Landis study was not used in our meta-analyses of CSA-symptom associations, which constituted the primary and most important analyses of our study. We used 54 samples comprised of 15,912 participants in these analyses, not one of whom came from Landis. Thus, we did not "load" our primary analyses with the Landis study.

Second, we did use the Landis data for our analyses of self-reported reactions to CSA. Our summary table (Rind et al., 1998, p. 36) shows that data were drawn from 9 male and 9 female samples. The Landis samples comprised more cases than all other samples (493 of 1,421 female cases; 183 of 606 male cases), but far less than 60 percent (in fact, Landis represented 35% of female and 30% of male cases). Among all samples, the Landis reaction data were the most negative. In computing mean reactions across samples, we employed weighted means, thereby giving the most weight to Landis' negatively skewed data. This approach maximized reports of negative reactions relative to alternative methods.
Using the weighted approach, mean positive and negative reactions for males were 37 and 33 percent, respectively. Using an unweighted approach would have yielded more positive and less negative reactions: 43 and 30 percent.
Dropping the Landis data completely would have yielded even more positive and less negative reactions: 50 and 24 percent.
The critics' implication that we used Landis' data to minimize reports of negative reactions is in fact the opposite of what happened; we handled them in such a way as to maximize negative reports.
Notably, anyone could use the data from our table to determine the exact effect of Landis' study on the overall findings; the critics apparently failed to make these simple calculations.
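The share of cases contributed by Landis can be checked directly from the counts quoted above. A minimal sketch, using only those case counts; the function and variable names are illustrative.

```python
def share_percent(part, total):
    """Percentage of `total` cases contributed by `part`."""
    return 100.0 * part / total


# Case counts quoted in the text for the self-reported reaction analysis.
landis_female, all_female = 493, 1421
landis_male, all_male = 183, 606

female_share = share_percent(landis_female, all_female)
male_share = share_percent(landis_male, all_male)
combined_share = share_percent(landis_female + landis_male,
                               all_female + all_male)

print(f"female: {female_share:.1f}%  male: {male_share:.1f}%  "
      f"combined: {combined_share:.1f}%")
```

Each share rounds to the figure given in the text and falls well short of 60 percent.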

Third, we analyzed self-reported effects from the 6 male and 5 female samples that had this information. Here, the Landis data made up 53 percent of the total N for males and 68 percent for females (combined was 63%). It appears that Fink and his group derived their criticism specifically from this analysis -- but that was not implied in Fink's letter to "Dr. Laura" or subsequent public representations of it, in which all our analyses were implicated, which is clearly false.
In our review, we first examined self-reported negative effects on subjects' current sex lives or attitudes.
For males, negative effects ranged from 0.4 percent (Landis, 1956) to 16 percent (Condy et al., 1987). We computed their unweighted mean, obtaining 8.5 percent. Given that the Landis data were the least negative but by far most numerous in terms of cases, our approach maximized self-reported negative effects.
Using a weighted approach would have yielded 4.4 percent. For females, the Landis data were similarly the least negative (2.2%) and most numerous. The unweighted mean that we reported was 13 percent, whereas the weighted mean would have been only 3.8 percent.
Once again, our approach maximized rather than minimized negative reports.
Additionally, we examined general reports of lasting negative effects using three male samples (0% for Landis; 0% and 27% for the other two) and three female samples (3% for Landis; 20% and 25% for the other two). In these cases, we did not give means, but instead concluded properly that lasting self-reported negative effects occurred for only a minority of the students -- a conclusion that holds independently of the inclusion of the Landis study.
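The contrast between weighted and unweighted means that runs through the second and third points can be illustrated with a small sketch. The rates and sample sizes below are hypothetical, chosen only to mimic one large sample with a low rate alongside smaller samples with higher rates; they are not the figures from the review's tables.

```python
def unweighted_mean(rates):
    """Simple average: every sample counts equally."""
    return sum(rates) / len(rates)


def weighted_mean(rates, sizes):
    """Average weighted by sample size: large samples dominate."""
    return sum(r * n for r, n in zip(rates, sizes)) / sum(sizes)


# Hypothetical data: percent reporting a negative effect, and sample sizes.
rates = [1.0, 10.0, 16.0]
sizes = [600, 100, 100]

print(unweighted_mean(rates))        # 9.0
print(weighted_mean(rates, sizes))   # 4.0
```

When the largest sample has the lowest rate, the unweighted mean is the higher of the two, which is the pattern described for the self-reported effects. When the largest sample has the highest rate, the weighted mean is the higher one, as with the reaction data.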

Fourth, the claim that we "loaded" our analysis "primarily with mild adult-child interactions involving no physical contact" is false. We included all the studies that were available at the time, 16 of which included cases exclusively involving physical contact. We examined whether CSA-symptom associations differed between contact and non-contact CSA. They did not (see Rind et al., 1998, p. 33). The studies we included, in contradiction to Fink's claim, represented well the severity of CSA found in the general population, as we demonstrated in our review (Rind et al., 1998, pp. 29-31).

Fifth, we were criticized for using the Landis study in the first place. This criticism is selective, however, given that abuse researchers examining college samples have often cited this study for background (e.g., Finkelhor, 1979; Fromuth & Burkhart, 1989).
Fink's implication that the Landis study is invalid because of high rates of non-contact CSA is itself misleading. Victimologists have expanded the definition of CSA to include non-contact experiences ranging up to age 18 in some studies. Even with this expanded definition, they still have characterized all CSA as pathogenic.
We simply tested assumptions about CSA using the researchers' own definitions of CSA. Even if we assume that all self-reports of negative effects in the Landis study involved contact CSA, given that about 10 percent of male and 41 percent of female experiences were contact CSA, reports of lasting sexual harm would involve only 4 percent of male and 6 percent of female contact cases and reports of lasting general harm would involve 0 percent of male and 7 percent of female contact cases. There is no practical difference between these figures and the ones we used in our review; they are all in the single digits.

Fink's group has also asserted that Landis' data should be dismissed because, in the case of males, they consisted mainly of "mild" non-contact CSA -- teenagers rejecting unwanted approaches. The implication is that if the teenage males had been involved in contact CSA, then self-reported effects would have been highly negative. Fromuth and Burkhart (1987) provided data directly relevant to this implication. They found that approximately 30 percent of their SA male students had experienced oral sex and another 25 percent intercourse -- a fair degree of "severe" contact CSA. Instead of primarily negative self-reported effects, 39 percent reported a positive effect on their lives, 46 percent a neutral effect, and only 15 percent negative. Restricting the results to teenagers (aged 13 to 16 at the time of the CSA) resulted in an even more positively skewed distribution: 60 percent positive, 37 percent neutral, and 3 percent negative. These results, missing from our original review where we relied on Fromuth and Burkhart's 1989 report, flatly contradict Fink's contention.

In short, the Landis data played no role at all in the key analyses of our study -- the meta-analyses of CSA-symptom relations. For self-reported reactions and effects, we analyzed the Landis data in ways that maximized reporting of negative outcomes. Overall, the studies we used accurately represented levels of contact CSA in the general population, and restricting Landis' results to contact CSA left our basic conclusions untouched. The Leadership Council's claim that we "loaded" our analyses fails at every turn.

Additional Criticisms by the Leadership Council

In addition to the false claim that we loaded our analyses, which lay critics easily absorbed and widely cited, the Leadership Council made a number of more "technical" criticisms that have not been widely cited but nevertheless played a key role in our article's fate, as Fink et al. (1999) claimed in their unpublished letter to the Los Angeles Times, discussed previously.

To reiterate, they claimed that their group's in-depth analysis, authored by Dallam et al. (1999), prompted the APA to seek an independent review of our article and Congress to condemn it. We recognize that this analysis was in draft form. But because it, and not some future version, had the alleged impact of prompting two historical events -- i.e., re-review of an already rigorously peer-reviewed article in a premier psychological journal and congressional condemnation -- a careful scrutiny of this critique is in order for archival purposes. As we will show, this critique has a patina of scientific rigor to it, but it is in fact severely flawed. Because of its significance, we present and then address nearly all its points.

Sample bias

The authors called unconvincing our argument that college samples are representative of the general population because the prevalence of abuse was similar. They argued that similar prevalence does not necessarily mean similar experience (e.g., severity) or outcomes and that those who made it to college may have had less severe CSA, been less distressed, coped better, and so on. As described in our Psychological Bulletin article and in the summary presented above, not only did we find similar prevalence rates between the college and national probability samples, but we also found similar experiences in terms of severity (i.e., types of CSA; degree of incest; frequency) and outcomes (e.g., identical mean effect sizes, similar reactions and self-reported effects). Given our documentation of these extensive similarities in the original article, the authors' speculation about lesser severity, distress, or better coping is not only empirically baseless but negligent as well.

Inclusion of non-contact CSA

The authors criticized our use of studies including non-contact experiences, claiming that this minimized effect sizes, and cited our use of the Landis (1956) and Risin and Koss (1987) studies as particularly problematic because of their large sample sizes.

First, this criticism is better directed at victimologists who established the inclusion of non-contact experiences in definitions of CSA, leading the researchers of the studies we reviewed to use such overbroad definitions. We were testing victimological assumptions and therefore followed their definitions. 

Second, their citation of Landis (1956) and Risin and Koss (1987) is highly misleading, given that these two studies were not included in the meta-analyses of CSA-symptom relations. 

Third, despite this definitional weakness, we found that over all the studies, effect sizes did not differ between studies including only contact CSA and those with broader definitions. 
We note further that in Laumann, Gagnon, Michael, and Michaels' (1994) face-to-face interview study using a national probability sample, arguably the best and most important study ever conducted for understanding sexual behavior and its correlates in the general American population, CSA-adjustment effect sizes were only r = .07 for males and r = .05 for females.
This study included only contact CSA occurring before puberty. 
Moreover, symptoms were uncorrelated with CSA severity (e.g., touching vs. oral sex or penetration).

These results further support the conclusion that we were not reducing effect size estimates by including non-contact cases. These conceptual and empirical points show this criticism to be without merit.


Poor measurement

The authors claimed the studies we included were not uniform with respect to issues such as purpose, questions asked, and definition of CSA. 

In fact, the studies did have a common purpose: to examine outcomes of CSA. True, the studies varied widely in questions asked and definitions, but this problem applies to all CSA literature reviews. Nevertheless, the authors selectively applied this criticism to our review. 

More importantly, from an empirical view, had the non-uniformity been problematic, then heterogeneous effect sizes should have emerged. But across 94 percent of the samples, the effect sizes were homogeneous, which empirically invalidates this criticism. 

The authors also claimed that identification of CSA based on retrospective self-report, as done in the college studies, is biased toward underreporting CSA because of factors such as shame, denial of victimization, or amnesia. 

Once again, this is a selective criticism that could be applied to all reviews. More importantly, the college studies were almost all anonymous and usually asked about "experiences" rather than "abuse" to minimize non-responding, which weakens shame or denial as a biasing factor. Given that claims for amnesia for CSA are based more on conviction than scientific evidence, and that recovered memories after long periods of amnesia are most likely false (Brandon et al., 1998), amnesia cannot be accepted as a valid threat to self-reporting CSA. 

The authors then claimed that our reporting different outcomes separately was a "divide and conquer" technique that attenuated effect sizes. 

Setting aside the obvious point that one should analyze different symptoms separately because CSA might have an impact in one area but not another, the homogeneity of effect sizes across samples and the uniformity across symptoms leave this criticism without empirical support. It is also selective, in that analyzing and reporting outcomes separately is standard practice in CSA literature reviews. The authors claimed that only some CSA victims will have any given symptom (but all will have some symptom), minimizing assessed effects of CSA. Again, this represents selective attention to our review, when it applies to all others as well. This criticism also suffers from unfalsifiability ("data picking" and "multiple endpoints," in which researchers seize upon any observed symptom and attribute it to CSA), which can produce spurious confirmation of cause and effect (e.g., Gilovich, 1991).

The authors also argued that the college studies used general measures of symptoms rather than CSA-specific ones, most notably PTSD. 

This is a tired criticism that has already been answered in other forums (e.g., Nash et al., 1993). Because CSA has come once again to be seen as the cause of all adult neuroses (Esman, 1994), it is valid to examine all types of CSA-symptom relations. Moreover, given that PTSD implies severe psychopathology, it is unreasonable to argue that measures such as anxiety and depression are not valid for examining CSA effects. 

Finally, "sexual problems" is often viewed as a CSA-specific outcome. Yet its effect size (r = .09) was completely consistent with other CSA-symptom relations rather than being larger.

The authors went on to complain that some of our coding of data was misleading. As an example, for neutral reactions, they claimed that we coded Condy et al.'s (1987) "mixed" category and Long and Jackson's (1993) "low" and "ambivalent" responders as neutral reactions, yet nowhere did we reveal this to our readers.

This claim is simply false; in Table 7 (Rind et al., 1998, p. 36), we used superscript a for our neutral percents for both of these studies, clarifying the coding procedure. 

They also complained that we coded Fromuth and Burkhart's (1989) "surprised" category as neutral. 

We had good reason to do so: these researchers based their measure of reactions on that of Finkelhor (1979), which in turn reflected that of Landis (1956), both of whom interpreted "surprised" as a neutral response. In our recent reading of Fromuth and Burkhart (1987) -- their other publication on the same data -- we found that neutral reactions were 30 percent, as opposed to the 28 percent we reported in our review. Using the coding of neutral data as an example, the authors suggested that all our conclusions regarding the reaction data were invalid. But even if we removed all three of the disputed studies, the overall percents from the remaining studies are virtually identical to what we reported in our review; none would differ by more than 2 percent. This did not stop the authors from claiming that all our conclusions regarding the proportion of negative reactions were invalid, charging that our statements based on these data were "ill responsible [sic] and seem to be intent on misleading the reader."

Poor choice and interpretation of effect sizes

The authors noted that Cohen (1988) described d = .8 as a large effect size and then claimed that such a value, when converted to r, can fall anywhere between .00 (when CSA prevalence is 0%) and .39 (when CSA prevalence is 50%). They argued that given the low prevalence of CSA in the college population, the use of r rather than d reduced the appearance of the effect sizes -- claiming that rs of .1 or .2 may still be large effect sizes, that is, if expressed in d.

Their assertion, however, is purely speculative. Furthermore, this speculative criticism is negligent in that they completely ignored the fact that we directly addressed this issue in our review (Rind et al., 1998, p. 41) in a section entitled "statistical validity." 

We noted that effect size attenuation is quite small for a 27-73 split (CSA prevalence for females). It is somewhat larger for a 14-86 split (CSA prevalence for males), but is still small in absolute magnitude for small effect sizes -- i.e., an r = .07 based on a 14-86 split increases at most by .03 (to r = .10) in a 50-50 split. Furthermore, for the current article we examined their speculation empirically by converting all rs to ds using Rosenthal's (1984) formula that takes into account population prevalence rates -- we converted using sample prevalences from the individual samples.

For the 14 male samples, with mean r = .07, we obtained mean d = .22; for the 33 female samples, with mean r = .10, we obtained mean d = .25. Both of these mean ds are small, not large, according to Cohen's (1988) designation in which d = .2 is small and d = .8 is large. This empirical result flatly contradicts the authors' speculation.
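The size of this attenuation can be checked with a short sketch. It assumes the standard conversion between a point-biserial r and Cohen's d for two groups of unequal size, which may not be the exact formula given in Rosenthal (1984); the function names are illustrative only. It also converts a single r at a single prevalence, whereas the means reported above were obtained by converting each sample at its own prevalence, so the d obtained here is close to, but not identical with, the reported mean d.

```python
import math


def r_to_d(r: float, p: float) -> float:
    """Convert a point-biserial r to Cohen's d.

    p is the proportion of the sample in one group (here, CSA prevalence).
    Assumes the standard unequal-group conversion, not a specific textbook's.
    """
    q = 1.0 - p
    return r / math.sqrt(p * q * (1.0 - r * r))


def d_to_r(d: float, p: float) -> float:
    """Inverse conversion: Cohen's d to point-biserial r at prevalence p."""
    q = 1.0 - p
    return d / math.sqrt(d * d + 1.0 / (p * q))


# r = .07 observed with a 14-86 split (approximate male CSA prevalence)
d_male = r_to_d(0.07, 0.14)

# The same standardized difference expressed as r under a 50-50 split
r_balanced = d_to_r(d_male, 0.50)

print(f"d = {d_male:.2f}, r at 50-50 split = {r_balanced:.2f}")
```

Under these assumptions, an r of .07 at 14 percent prevalence corresponds to a d of about .20 and to an r of about .10 at a 50-50 split, in line with the increase of at most .03 noted above.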

As a side note, they further claimed, without providing evidence or a citation, that converting d to r does not mean that the converted r has the properties of the Pearson correlation coefficient. As we obtained all rs from ts, Fs, or chi-squares, not from ds, this point is irrelevant.

Deck stacked against CSA compared to FE

The authors argued that the deck was stacked against CSA and in favor of family environment (FE) in accounting for symptom variance, because CSA was measured dichotomously while FE was measured continuously.

However, we fully addressed this issue as well in the section on statistical validity. We reviewed several studies where CSA was constructed on a continuous scale and was then compared with FE in terms of accounting for adjustment. Based on this, we concluded that "[r]esults from these studies in which CSA was constructed to be continuous are consistent with results from studies in which CSA was treated dichotomously in terms of pointing to family environment, rather than CSA, as a significant contributor to current adjustment" (p. 41).

The authors also argued that the deck was stacked because measures of CSA were unreliable compared to measures of FE. 

Again, we directly addressed this issue in our review, providing evidence that allowed us to conclude that "[t]hese results point to acceptable reliabilities for measures of CSA, which are comparable to reliabilities for family environment measures" (p. 41). Once again, the authors ignored our discussion of this issue.

The authors then argued that the statistical control we used was suspect because CSA and FE were confounded, citing Briere (1992) to argue that such statistical control is invalid under certain circumstances. 

In our treatment of this issue, we thoroughly addressed this point (pp. 43-44), showing why Briere's criteria, although relevant in the clinical population, do not obtain in the college population.

No way to infer causality or lack of it

The authors then argued that there is no way to infer causality or the absence of it from retrospective observational data. All we should have been doing, they argued, was to examine the magnitude of relations. 

Once again we see selective criticism of our review when this criticism could, and should, be applied to all other reviews. Considering their later causal assertions about the "growing and well documented" literature on "the harmful and long-lasting effects of adult-child sexual contact" (italics added), we find this criticism disingenuous and indicative of extreme bias. In addition, the authors' reasoning is erroneous. Correlation alone can never prove causality, but a demonstrable lack of correlation disproves causality.

Double standard of interpretation

The authors next argued that we claimed that young people 

(a) cannot be counted on to determine if their experience was harmful, but 

(b) can be relied upon to determine if it was not harmful. 

Supposedly, we supported the first claim by citing Nisbett and Wilson (1977), who concluded that people are often unaware of the causes of their behavior when causal relations are ambiguous or complex. And we supposedly supported the second by suggesting that a willing encounter experienced positively be labeled adult-child sex, which they referred to as a "rather bizarre recommendation."

Once again, the authors misrepresented what we wrote. 

First, we cited Nisbett and Wilson (1977) to suggest that students may have been harmed even though they perceived no harm -- the opposite of what the authors charged. 

Second, we explicitly stated in the Discussion that self-reports of lasting harm imply genuinely negative effects of CSA (p. 44). 

Below we address in detail our so-called "bizarre recommendation." Given that we acknowledged that perceptions of harm imply negative effects, it is completely consistent to assume that perceptions of positive effects or reactions do not imply harm. It is the authors and victimologists in general who have the double standard of interpretation, because they accept the validity only of perceptions of harm.

Misleading interpretation of misrepresented data

The authors began by charging that we "loaded" our analysis with the Landis (1956) data, which they stated "represents a serious misreporting of data." 

This characterization is itself a serious misrepresentation, as we have already shown (see above). 

They next once again attacked us for our "misuse" of data we coded as neutral in a few of the studies. Based on this, they claimed that all our conclusions about reactions were faulty, calling us "ill responsible [sic]" with seeming "intent on misleading the reader." 

It is thus instructive to show precisely what happens when we remove the disputed studies and re-compute mean reactions. For males, positive, neutral, and negative reactions become 36, 31, and 34 percent (we reported them in our review as 37%, 29%, and 33%); for females, they become 12, 16, and 72 percent (we reported them as 11%, 18%, and 72%). These virtually identical results highlight Dallam et al.'s bias.

We are also excoriated for not including the types of disclaimers found, for example, in West and Woodhouse (1993). According to the authors, these researchers wrote in their conclusion that most of the incidents reported by male subjects were either trivial approaches that were soon rejected or minor indecencies that the boys saw as unimportant.

The authors' selective paraphrase is what is in fact misleading. West and Woodhouse's (1993) most important conclusions coincide with ours. They stated that their "research and others like it have shown that boys' sexual encounters with older males and females are far from rare and for the most part fairly innocuous" (p. 126, italics added), while "the more serious kinds of adult-child sexual interaction are unusual" (p. 127).

Furthermore, although many cases in West and Woodhouse's survey involved trivial approaches that were soon rejected, many other cases involved sexual contact perceived positively by the boys, which Dallam et al. seem incapable of acknowledging.

West and Woodhouse (1993, pp. 32-88) reported lengthy interviews of 24 students of whom 7 had oral sex or intercourse with adults. Of these, 6 experienced these encounters positively (cases 008, 050, 249, 261, 297, 337), while only one reacted negatively (case 239). Two of these students had heterosexual encounters, four had homosexual encounters, and one had both. Negative or neutral reactions occurred almost exclusively in the less "severe" cases involving approaches or fondling. Our own discussion of West and Woodhouse accords well with these findings.

Flawed conclusions not well supported by the data

The authors then claimed that research does not support our contention that boys are not typically harmed by CSA. To support this, they argued that boys who do not perceive harm are committing misattribution. They cited Myers (1989) to argue that sexually abused boys tend to deny or normalize their abuse. 

Their claim of misattribution is circular in that harm is assumed to occur, and lack of perceived harm is therefore inferred to be cognitive distortion, thereby "proving" the initial assumption. Myers' (1989) research was a clinical study involving unusual cases of unambiguous and severe abuse; their use of this citation shows an obstinacy in insisting that the typical case is validly characterized by the extreme. 

Our narrative literature review of nonclinical cases of boy-adult sex (Bauserman & Rind, 1997) coincided in its conclusions with those of West and Woodhouse (1993) just discussed: serious cases are rare and typical cases are fairly innocuous. The authors then used their misattribution claim and citation of Myers to argue that "we cannot assume that boys were not harmed based solely on their self-reports." This is a double standard, because the authors clearly accept reports of harm as evidence for harm. Ironically, the authors falsely accused us of having a double standard (see above), while exhibiting one themselves.

Ignored majority of sample in arguing for lack of harmfulness

The authors claimed that we minimized evidence for harm in our discussion, despite having found that all CSA-symptom associations but one were significant. 

What the authors ignored is that we demonstrated that all of these associations were small; FE was confounded with CSA; FE accounted for nearly ten times as much adjustment variance as CSA; and statistical control tended to reduce CSA-symptom associations to non-significance. 

They then stated that it was a "small minority -- mainly older, willing adolescent males -- who appeared to have suffered less harm" (italics added). 

First, in view of their very next criticism, where they dismissed the possibility of consent in age-discrepant sexual relations involving minors, it is remarkable that they use the term "willing" in the current criticism. 

Second, their phrase "suffered less harm" shows that their basic position is unscientific because it is unfalsifiable -- harm is a premise that data cannot dispute.

Summary

Methodological or statistical criticisms were absent in NARTH's critique, which instead focused on reiterating clinically-based findings and dogma. 

"Dr. Laura's" major criticism of the use of unpublished studies was flawed on both conceptual and empirical grounds. 

The Leadership Council's claim that we "loaded" our analyses to minimize harm was blatantly wrong. Its in-depth critique was flawed by such problems as patently false assertions, speculations shown to be faulty by empirical examination, and bias in terms of selective attention and misrepresentation.

Refutation of Criticisms Concerning Conceptual Issues

The major attacks on our review have not been methodological or statistical, however. They have concerned our suggestion that the term "child sexual abuse" is not always appropriate and our use of the construct of willingness or "consent." In this section, we present detailed discussions of these conceptual issues, showing that our treatment of them was scientifically sound.

Terminology

NARTH was the first to attack the suggestion in our discussion that certain types of CSA should be relabeled by researchers with the value-neutral terms "adult-child sex" or "adult-adolescent sex" (see Rind et al., 1998, p. 46). 

NARTH misrepresented what we wrote, falsely claiming that we recommended that psychologists should stop using terms such as "sexual abuse" and should use the phrase "level of sexual intimacy" instead of "severity of abuse." Regarding the latter point, what we actually wrote, in discussing the progression from exhibitionism to masturbation to intercourse, was that "many authors referred to this increasing level of sexual intimacy as 'severity'" (Rind et al., 1998, p. 29). This distortion was repeated numerous times in opinion pieces around the country, spreading a false impression of irresponsibility and lack of sensitivity. NARTH also attacked our view that science should separate itself from moral language, and complained that replacing the term "abuse" with neutral terms is "a repetition of the steps by which homosexuality was normalized." Their logic resonated with many subsequent critics.

The Wanderer article asserted that we recommended "overhauling and euphemizing the language of sexual abuse." 

Dallam et al., writing for the Leadership Council, charged that the "insistence on 'value-neutral' terminology normalizes child sexual abuse." 

Steven Mirin, Medical Director of the American Psychiatric Association, wrote in a letter to the FRC that "academic hair-splitting over whether the act should be considered adult-child sex or child sexual abuse ... is not in the public interest and obfuscates the moral issue involved." 

And the APA agreed with them all, stating in its letter to Congressman DeLay that some of "the language in the article ... is inflammatory." In the end, the APA characterized this usage as our opinion, inconsistent with APA's position, and then added that sexual activity between children and adults should never be "labeled as harmless."

In fact, our suggestions for terminology resulted directly from the editorial and peer review process with APA's premiere journal. In our original two drafts, we did not suggest the use of value-neutral terms. This recommendation came only after the action editor requested an in-depth discussion of the "child sexual abuse" term. In response, we added comments in the Introduction (pp. 22-23), wrote a new section in the Discussion entitled "Child sexual abuse as a construct reconsidered" (pp. 45-46), and added recommendations in the "Summary and Conclusion" section (pp. 46-47).

In accepting our article for publication, the action editor wrote that the "major and most difficult issue to address is the central one raised by Reviewer A concerning the conceptual and definitional issues ... [that] might need to be reconsidered in light of your findings." After noting that 37 percent of males reacted positively to their CSA experience at the time, and 42 percent in retrospect, he wrote:

Although these experiences might meet legal and social definitions, the data suggest that the operationalizations employed might not sufficiently contextualize the events in such a way that adequately captures the essence of "abuse." 
Please note that I am not condoning behaviors that meet current definitions of CSA any more than I condone illicit substance use in minors. Indeed, both types of behaviors are legally and socially proscribed. Both, however, need to be contextualized in order to carefully assess their pathogenicity. ... [P]erhaps we need to be more thoughtful about how we define CSA at a psychological level.
That is, current definitions may not be sufficiently probing. I base this conclusion on the data regarding the extent that such experiences were positive, and the extent that such experiences correlate with outcome in men if the CSA was unwanted. ... I'm not encouraging a conceptual definition that requires harm as an effect ... but one which captures the essence of "abuse" ... With respect to the "big picture," I think you need to ... spend more time in your discussion elaborating the conceptual and operational implications of your review. I believe that, in doing this, you can make the substantive contribution sufficient to warrant publication in Psychological Bulletin.

Reviewer A had noted that definitions for CSA have been too diffuse and inconsistent, resulting in "poor predictive utility." The idea is that differences in adjustment would likely be better accounted for if the term CSA were restricted to a subset of the very wide range of experiences currently labeled CSA, which would advance understanding of CSA and prediction of its effects.

In fact, another reviewer on a later draft reinforced the notion that the term "abuse" is problematic, recommending that we change "child sexual abuse" in our title to "child and adolescent sexual experiences" -- a step we chose not to take.

Hence, our assignment was to reconceptualize the term "child sexual abuse." We carefully outlined the problems caused in the past by the mixing of morality and science in other areas of sexuality, such as the seventeenth century transformation of masturbation from sin to sickness and medical representation of it as "self-abuse," hindering scientific understanding of this behavior and creating iatrogenic victims in the process. 

We noted how several researchers in the college studies came to question their broad definition of the term "child sexual abuse" after gathering empirical data. We then discussed the problems the term CSA created for scientific validity, based on the results of our review. We noted that over-inclusive definitions of CSA (combining experiences perceived as willing and positive with those perceived as coercive or negative) produced poor predictive validity -- that is, CSA was not particularly predictive of negative outcomes, which it should have been had it been properly conceptualized. Because terms or constructs are scientifically useless if they lack validity, we argued that "[t]o achieve better scientific validity, a more thoughtful approach is needed by researchers when labeling events that have heretofore been defined socio-legally as CSA" (p. 46) -- note the use of the word "thoughtful," which was taken directly from the action editor's letter.

Our recommendations

Finally, we made our recommendations, which were directed to scientific use. We suggested "adult-child sex" or "adult-adolescent sex" whenever the minor was willing and had positive reactions, and "child sexual abuse" or "adolescent sexual abuse" whenever the minor felt that he or she did not freely participate or experienced negative reactions to it. We then commented that, "[b]y drawing these distinctions, researchers are likely to achieve a more scientifically valid understanding of the ... nature, causes, and consequences of the heterogeneous collection of behaviors heretofore labeled CSA" (pp. 46-47).

From an empirical perspective, the data from our college samples as well as many other non-clinical samples (cf. Bauserman & Rind, 1997) make it clear that these recommendations do improve predictive validity.

Bogan's (1992) college study provides one example: all participants had experienced CSA, but those who did not label their experience as "abuse" were significantly better adjusted on all measures than those who did. 

From a methodological perspective, emotionally loaded terms are primes that can lead to invalid inferences (Rind & Bauserman, 1993). From a conceptual perspective, drawing distinctions between children and adolescents, as we did, is long overdue (see discussion below on consent for elaboration). All these points imply that our recommendations were completely defensible from a scientific point of view.

Other sexologists concur that the term "abuse" is problematic

The attacks on our recommendations concerning terminology may have created the false impression that our suggestions were radical and unprecedented in the scientific arena. The fact is that many other researchers have expressed concern with CSA terminology because of problems of scientific validity (e.g., Fishman, 1991; Fromuth & Burkhart, 1987; Green, 1992; Kilpatrick, 1987; Long & Jackson, 1993; Money & Weinrich, 1983; Nelson, 1989; Okami, 1990; Sandfort, 1992; West, 1998).

For example, D. J. West (1998), a prominent criminologist and sexologist from Cambridge University (emeritus) and co-author of two of the college studies in our review, recently commented that professional use of terms such as abuse, perpetrator, victim, and survivor has incorrectly reinforced the idea that any kind of sexual incident with a child is likely to cause great and lasting harm. He noted further that this usage has "introduced a tone of moral revulsion alien to scientific inquiry" (p. 539).

Richard Green (1992), ps