“Side Effects of Contingent Shock Treatment”, an examination
This post examines a “scientific” paper that the doctors at the JRC put together and linked to on their website. Their website lists it as “recently approved for publication,” but the article was published in 2007.
- The first thing to notice is that the “side effects” the study measures are all behaviors that contingent shock (CS) therapy directly or indirectly targets. Depression, constant fear, and sleep loss are examples of potential side effects; complying with instructions is the intended therapeutic benefit, i.e., not a side effect. Showing that a therapy accomplishes its intended primary effects cannot, even in principle, show that it has no serious side effects, and claiming otherwise is dishonest.
- The “side effects” measured were all behaviors, whereas most of the side effects reported to result from CS therapy are mental in nature, such as feelings of helplessness, great fear, depression, suicidal thoughts, etc. Behavior does not necessarily reflect internal mental state, especially given that what is being tested is a therapy primarily intended to modify behavior. For example, it is conceivable that a patient who is punished severely for crying might eventually learn to stop expressing sadness while becoming even more depressed, or that a patient might fake being happy by affecting a smile and friendly demeanor to avoid punishment while secretly planning violent revenge the instant the threat of punishment vanishes. A person who reports thoughts of suicide and is punished for it would presumably no longer report feeling suicidal, but there is no reason to think that this would correspond to an actual reduction in suicidal thoughts, or even a reduced risk of actually committing suicide.
- The nine test subjects were not chosen at random. This implies that someone went through the students and selected those they thought most suitable for the study, i.e., those predicted to respond best to CS therapy.
- The subjects were not evaluated on all of their behaviors, but only on those that were “target” behaviors, which varied by individual. So if crying is a target behavior for Jack but not Jill, then Jack crying less frequently constitutes an improvement, while Jill can start crying 24/7 after treatment begins without having any effect on the results. Again, side effects are by definition not effects that are targeted, but rather unintentional effects.
- Additionally, because target behaviors were determined before the treatment phase of the study began, unanticipated behaviors possibly triggered by CS therapy would not be recorded in the study. If a subject replaces old prohibited behaviors with new negative behaviors of equal or greater severity due to CS therapy, and the team at JRC, which does not believe that CS has any negative side effects, did not predict that specific behavior, the study would count that as an improvement. Thus, the study cannot falsify the hypothesis that CS causes subjects to begin exhibiting self-destructive or aggressive behaviors that were not previously present.
- There was no control group. Rather, each subject first had a brief baseline period of “randomly” (more on this next) determined length. The lengths of the baseline periods were not reported in the article, which makes it impossible to determine how statistically significant the baseline measurements were (more on statistical analysis later).
- The baseline periods were not of random length. Subjects who began exhibiting behaviors that the JRC judged too severe were immediately started on CS therapy, which would skew the baselines more positively and the treatment periods more negatively, against the intended conclusion of the article? Weird.
- Subjects still received all other punishments as normal during the baseline and treatment periods, including mechanical restraint and food deprivation. Therefore, the study cannot falsify the hypothesis that CS has negative side effects taken by itself, which is what the title would suggest is being investigated. It could be the case that the other aversives already being used cause severe side effects, that CS therapy causes severe side effects, and that the two together cause severe side effects, but not significantly more than the other aversives alone.
- The frequency and severity of CS administered was not reported anywhere in the paper. The hypothesis that CS has side effects would imply some level of correlation between the amount of its use and the incidence of its proposed side effects. The paper later makes the assertion (unfounded, as will be explained below) that some subjects made improvements in some areas while others did not. The paper cannot falsify the hypothesis that those who made fewer or no improvements received more frequent or more severe CS than the subjects who did improve, which would imply a positive correlation between the frequency/severity of CS and a lack of improvement in targeted areas.
- The only measure of target behaviors was frequency and kind – no severity. Thus, a subject who rolled his eyes several times an hour during the baseline period, but replaced hourly eye-rolling with physically attacking someone daily after CS therapy began would be marked as an improvement under the system of the study. The paper cannot falsify the hypothesis that the use of CS, a painful disciplinary measure, instills the belief that violence is an effective and appropriate expression of negative emotions, which causes subjects to replace forbidden negative behaviors, such as crying, with less frequent but more serious behaviors such as violent aggression.
- Behaviors were only monitored during a 10-minute period chosen “randomly” each weekday, but which never included times when subjects were receiving behavior reinforcements (such as CS). This alone implies that the monitoring periods were not random. By definition, there is a strong timing correlation between contingent punishments and the behaviors they are contingent upon, so by excluding episodes where subjects were administered contingent punishments, it is highly likely that the target behavior which triggered the punishment was also excluded. Thus, during the treatment period, the negative behaviors that are responded to immediately with contingent punishments are excluded from the data, while the positive targeted behaviors that are not reinforced immediately (such as rewarding a subject for being polite the entire day) are left in. This creates an obvious bias toward reporting incidents of positive behavior and against reporting incidents of negative behavior during the treatment period.
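To see how large this bias can be, here is a toy simulation with entirely made-up numbers (nothing below comes from the paper): a target behavior occurs in some fraction of ten-minute slots, most occurrences trigger an immediate contingent punishment in the same slot, and monitoring never samples punished slots.

```python
import random

random.seed(0)

# Toy model, hypothetical numbers only: each weekday has 144 ten-minute
# slots; a target behavior occurs in a slot with probability 0.25, and
# 80% of behaviors draw an immediate punishment in that same slot.
SLOTS = 144
DAYS = 10_000
BEHAVIOR_RATE = 0.25
IMMEDIATE_PUNISHMENT = 0.80

hits_uniform = 0    # truly random monitoring: any slot may be sampled
hits_excluding = 0  # the study's scheme: punished slots are never sampled

for _ in range(DAYS):
    behavior = {s for s in range(SLOTS) if random.random() < BEHAVIOR_RATE}
    punished = {s for s in behavior if random.random() < IMMEDIATE_PUNISHMENT}

    # Uniform sampling: pick any slot at random.
    if random.randrange(SLOTS) in behavior:
        hits_uniform += 1

    # Biased sampling: pick only among slots with no punishment.
    eligible = [s for s in range(SLOTS) if s not in punished]
    if random.choice(eligible) in behavior:
        hits_excluding += 1

print(f"observed rate, uniform sampling:   {hits_uniform / DAYS:.3f}")
print(f"observed rate, excluding punished: {hits_excluding / DAYS:.3f}")
```

Uniform sampling recovers the true rate of about 0.25; excluding punished slots drives the observed rate down toward P(behavior without punishment)/P(no punishment) ≈ 0.06, even though the behavior itself has not changed at all.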
There’s more (the staff around the subjects were aware of the monitoring and could influence results; the JRC monitors students with video 24/7 and documents every single incident of targeted behavior, yet the study inexplicably ignores this and uses only a few hours of footage per subject), but I have to go to bed, so I’ll cut to the most damning flaw (after the fact that the study couldn’t even in principle measure side effects, point #1):
The data was not statistically significant. Specifically, the baselines were so short that the author admitted that statistical analysis could not be done. Instead, he “analyzed” the data by plotting graphs (incidence of targeted behaviors vs. time), removing the time labels so that you cannot tell from the graph whether it shows a significant timescale, and asking clinicians whether they thought the pictures of the graphs indicated significant change. The clinicians were not said to be independent or unaware of the purpose of the study, so we can assume that they were all JRC employees who were aware of which answers would benefit the institution. Further, no mention is made of the clinicians having any sort of mathematical literacy (although we know that JRC training includes a one-hour section on graphing data), or of their being told how long a timescale each graph represents (that information is not even given to readers of the paper), how many data points it contains, or what the margins of error are. In other words, the results were not statistically significant, so the paper instead reports what laymen, with no statistical or scientific training and every reason to bias the results, think the statistically insignificant data signifies.
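To illustrate why the short baselines matter, here is a toy simulation with made-up numbers (none of them from the paper): a behavior occurs at the same true rate during both a three-day baseline and a twenty-day “treatment” period, yet ordinary day-to-day noise alone frequently produces a graph that looks like a clear improvement.

```python
import random

random.seed(1)

def daily_count():
    """Hypothetical daily behavior count: binomial(50, 0.1), mean about 5."""
    return sum(random.random() < 0.1 for _ in range(50))

RUNS = 5_000
BASELINE_DAYS = 3     # a very short baseline, as criticized above
TREATMENT_DAYS = 20

apparent_improvements = 0
for _ in range(RUNS):
    # The true behavior rate is IDENTICAL in both periods: no real effect.
    baseline = [daily_count() for _ in range(BASELINE_DAYS)]
    treatment = [daily_count() for _ in range(TREATMENT_DAYS)]
    base_mean = sum(baseline) / BASELINE_DAYS
    treat_mean = sum(treatment) / TREATMENT_DAYS
    # Count runs where an unlabeled graph would *look* like a 20%+ drop.
    if base_mean > 0 and treat_mean <= 0.8 * base_mean:
        apparent_improvements += 1

print(f"runs showing a spurious 20%+ 'improvement': "
      f"{apparent_improvements / RUNS:.1%}")
```

With no real effect whatsoever, a substantial fraction of runs (on the order of one in five under these assumed numbers) shows what an eyeballed, unlabeled graph would call an improvement, which is exactly the kind of artifact that a proper statistical test, impossible here by the author’s own admission, exists to rule out.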