Reporting of SUSARs Based on Aggregate Analysis – Statistical Considerations

Michael Zelasky

In December 2015, the FDA released the draft guidance Summary of Assessment for IND Safety Reporting [1]. The draft guidance includes recommendations on reporting of SUSARs (Serious and Unexpected Suspected Adverse Reactions) based on aggregate analysis. The Code of Federal Regulations (CFR) [2] identifies aggregate analysis as an important consideration in deciding whether to report a SUSAR. This article summarizes the guidance with respect to aggregate reporting and highlights a few statistical methods that can facilitate the process and help with drawing accurate conclusions. It also discusses some of the limitations of these methods and considerations that should be taken into account.

The draft guidance is a follow-up to the previous guidance, Safety Reporting Requirements for INDs and BA/BE Studies, from 2012 [3]. Both guidances provide recommendations on complying with title 21, part 312 of the CFR. Specifically, sections § 312.32(c)(1)(i)(C) and § 312.32(c)(1)(iv) of title 21 of the CFR pertain to reporting of SUSARs based on aggregate analysis:

(i)(C) An aggregate analysis of specific events observed in a clinical trial (such as known consequences of the underlying disease or condition under investigation or other events that commonly occur in the study population independent of drug therapy) that indicates those events occur more frequently in the drug treatment group than in a concurrent or historical control group.
(iv) Increased rate of occurrence of serious suspected adverse reactions. The sponsor must report any clinically important increase in the rate of a serious suspected adverse reaction over that listed in the protocol or investigator brochure.

The recommendations from the draft guidance are related to:

(1) The composition and role of a safety assessment committee,
(2) Aggregate analyses for comparison of adverse event rates across treatment groups,
(3) Planned unblinding of safety data,
(4) Reporting thresholds for IND safety reporting, and
(5) The development of a safety surveillance plan.

Items 2 and 4 above relate to the analysis of aggregate data. Additionally, the draft guidance emphasizes a parsimonious approach to reporting SUSARs. Not every serious adverse event needs to be reported: events need not be reported if there is little reason to consider them suspected adverse reactions (i.e., caused by the drug).

Statistical Considerations in Aggregate Analyses:

Aggregate analyses are used to detect whether events occur more frequently in the treatment arm [(i)(C)] or whether there is a clinically important increase in the rate of the adverse event [(iv)]. Per the draft guidance, the primary aggregate analysis should be a pooled analysis, but individual studies should be reviewed for consistency. When deciding which studies to pool, careful consideration should be given to factors such as length of study and heterogeneity of patient populations. The more studies that are included, the more informative the aggregate analysis will be. The draft guidance recommends unblinding to allow comparisons across treatment groups, as long as steps are taken to maintain the overall blinding of the study.

The draft guidance does not give specific numerical recommendations for the reporting threshold of aggregate analyses, but it does offer the following factors to consider:

  • The size of the difference in frequency between the test and control groups
  • Consistent increase in multiple trials
  • Pre-clinical evidence to support the finding
  • Evidence of a dose response
  • Plausible mechanism of action
  • Known class effect
  • Occurrence of other related adverse events (e.g., both strokes and transient ischemic attacks)

Meta-analysis refers to the aggregate analysis of several studies. The draft guidance lists several analysis methods, including fractional reporting ratios, standardized incidence ratios, network meta-analyses, data visualization tools, the Multi-Item Gamma Poisson Shrinker, and disproportionality analyses.

A standardized incidence ratio (SIR) is the ratio of observed cases to expected cases, multiplied by 100. The expected number of cases is taken from a matched background population or derived from an expected rate. SIR is commonly used in aggregate analyses of cancer incidence.
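As a rough illustration of the SIR definition above, the following minimal sketch computes expected cases from background rates and then the ratio. The function names, the two strata, and all numbers are hypothetical and chosen only for illustration; they do not come from the draft guidance.

```python
def expected_cases(reference_rates, person_years):
    """Expected count: sum over strata of background rate x person-years at risk."""
    if len(reference_rates) != len(person_years):
        raise ValueError("one reference rate is needed per stratum")
    return sum(rate * py for rate, py in zip(reference_rates, person_years))


def standardized_incidence_ratio(observed, expected, scale=100.0):
    """Observed / expected cases, multiplied by `scale` (100, as in the text)."""
    if expected <= 0:
        raise ValueError("expected cases must be positive")
    return scale * observed / expected


# Hypothetical strata (for example, two age bands): background rates per
# person-year and the person-years of exposure in the pooled studies.
rates = [0.001, 0.004]
exposure = [2000.0, 1500.0]

expected = expected_cases(rates, exposure)        # roughly 8 expected cases
sir = standardized_incidence_ratio(12, expected)  # roughly 150
print(f"expected={expected:.2f}, SIR={sir:.1f}")
```

On this scale a value near 100 means the observed count matches the background expectation, and a value well above 100 suggests an excess that may merit closer review.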

Network meta-analysis, also known as mixed-treatment meta-analysis or multiple treatment comparison meta-analysis, analyzes both direct comparisons of treatments and indirect comparisons when there are multiple treatments across the studies to be pooled. For example, if one study compares treatment A versus B and another study compares treatment B versus C, then network meta-analysis allows the indirect comparison of treatment A versus C. This is an advantage over traditional pairwise meta-analysis.
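The A versus C example can be sketched with the simplest form of indirect comparison, often called the adjusted indirect comparison, which is only a building block of a full network meta-analysis and not a substitute for one. The sketch assumes the two studies are independent and similar enough for the comparison to be meaningful; the function name and input values are hypothetical.

```python
import math


def indirect_comparison(log_rr_ab, se_ab, log_rr_bc, se_bc, z=1.96):
    """Indirect A-vs-C relative risk from A-vs-B and B-vs-C estimates.

    Inputs are log relative risks and their standard errors. On the log
    scale the effects add, and for independent studies the variances add.
    """
    log_rr_ac = log_rr_ab + log_rr_bc
    se_ac = math.sqrt(se_ab ** 2 + se_bc ** 2)
    lower = math.exp(log_rr_ac - z * se_ac)
    upper = math.exp(log_rr_ac + z * se_ac)
    return math.exp(log_rr_ac), lower, upper


# Hypothetical inputs: RR(A vs B) = 1.5 and RR(B vs C) = 2.0.
rr_ac, lower, upper = indirect_comparison(math.log(1.5), 0.3, math.log(2.0), 0.4)
print(f"indirect RR(A vs C) = {rr_ac:.2f} ({lower:.2f}, {upper:.2f})")
```

Because the variances add, the indirect estimate is always less precise than either direct comparison it is built from.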

Two models used in network meta-analysis are the fixed effects model, which assumes no heterogeneity exists between studies, and the random effects model, which does account for heterogeneity. Bayesian methods can also be used.
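The difference between the two models can be illustrated for the simpler pairwise case. The sketch below pools study-level effects (for example, log relative risks) by inverse-variance weighting, and uses the DerSimonian-Laird moment estimator for the between-study variance; that estimator is one common choice assumed here for illustration, not one named in the draft guidance. The input values are hypothetical.

```python
def pool_fixed(effects, variances):
    """Fixed effects: inverse-variance weighted mean and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, 1.0 / total


def between_study_variance(effects, variances):
    """DerSimonian-Laird moment estimate of tau^2, truncated at zero."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool_fixed(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    if c <= 0:  # happens with a single study
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)


def pool_random(effects, variances):
    """Random effects: add tau^2 to each study variance, then pool as before."""
    tau2 = between_study_variance(effects, variances)
    estimate, variance = pool_fixed(effects, [v + tau2 for v in variances])
    return estimate, variance, tau2


# Hypothetical log relative risks and their variances from three studies.
effects = [0.1, 0.5, 0.9]
variances = [0.04, 0.04, 0.04]
fixed_est, fixed_var = pool_fixed(effects, variances)
random_est, random_var, tau2 = pool_random(effects, variances)
print(f"fixed:  {fixed_est:.3f} (variance {fixed_var:.4f})")
print(f"random: {random_est:.3f} (variance {random_var:.4f}, tau^2 {tau2:.3f})")
```

When the studies disagree, the random effects variance is larger than the fixed effects variance, so the pooled confidence interval widens to reflect the heterogeneity.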

Disproportionality analysis seeks to identify events that are being reported unusually frequently compared to the background of other reports in the same database. Disproportionality analysis methods include the Multi-Item Gamma Poisson Shrinker, proportional reporting ratios, reporting odds ratios, and the Bayesian confidence propagation neural network.
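Two of these measures can be sketched from a 2x2 table of report counts. The cell labels a, b, c, d follow a common convention that is assumed here (drug of interest versus all other drugs, event of interest versus all other events), and the counts are hypothetical.

```python
import math


def proportional_reporting_ratio(a, b, c, d):
    """PRR: share of the drug's reports that mention the event, divided by
    the same share among reports for all other drugs.

    a: drug and event      b: drug, other events
    c: other drugs, event  d: other drugs, other events
    """
    return (a / (a + b)) / (c / (c + d))


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR with an approximate confidence interval computed on the log scale."""
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Hypothetical counts from a safety database.
a, b, c, d = 20, 980, 100, 9900
prr = proportional_reporting_ratio(a, b, c, d)
ror, lower, upper = reporting_odds_ratio(a, b, c, d)
print(f"PRR = {prr:.2f}")
print(f"ROR = {ror:.2f} ({lower:.2f}, {upper:.2f})")
```

Both functions require non-zero cell counts; a real implementation would need a rule for sparse tables.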

The Multi-Item Gamma Poisson Shrinker (MGPS), developed by DuMouchel, is a disproportionality method that uses an empirical Bayesian model to estimate the magnitude of drug-event associations in drug safety databases. It works by fitting a statistical model to the entire database of counts and expected values. The method assumes that each observed count is drawn from a Poisson distribution. It uses the relative reporting ratio values for all associations to estimate the parameters of a prior distribution, which is a mixture of two Gamma distributions. The main statistic is the Empirical Bayes Geometric Mean (EBGM), which is an empirical Bayes estimate of the reporting ratio [5, 6].
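The raw quantity that MGPS starts from, the relative reporting ratio, can be sketched as observed count over expected count. The sketch assumes the expected count is computed under independence of drug and event from the database margins, which is a common choice but an assumption here. It deliberately stops short of MGPS itself: it does not fit the Gamma mixture prior or compute the EBGM. All counts are hypothetical.

```python
def expected_count(drug_total, event_total, grand_total):
    """Expected reports for a drug-event pair if drug and event were independent."""
    return drug_total * event_total / grand_total


def relative_reporting_ratio(observed, drug_total, event_total, grand_total):
    """Observed count divided by the expected count (no Bayesian shrinkage)."""
    return observed / expected_count(drug_total, event_total, grand_total)


# Hypothetical database: 11000 reports in total, 1000 for the drug,
# 120 mentioning the event, 20 mentioning both.
expected = expected_count(1000, 120, 11000)
rrr = relative_reporting_ratio(20, 1000, 120, 11000)
print(f"expected = {expected:.2f}, relative reporting ratio = {rrr:.2f}")
```

With small expected counts this raw ratio is unstable, which is the motivation for shrinking it toward the prior as MGPS does.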

Sometimes, simple descriptive statistics may also suffice. However, simple pooling of data may lead to wrong conclusions as illustrated through the following examples.

Example 1:

Aggregate analysis may be performed by simply summing across studies the number of occurrences of a particular AE and dividing by the total number of patients to compare frequencies across treatment arms. Doing this could lead to false conclusions.

For example, in the table of hypothetical data below, treatment A has a higher rate than treatment B in both study 1 and study 2. However, when the data are combined across studies, treatment B has the higher rate. This is known as Simpson’s Paradox, which occurs when a set of studies reviewed individually suggests one conclusion, but the combined data suggest the opposite.

              Study 1           Study 2           Both Studies
Treatment A   9/1000 (0.90%)    20/3000 (0.67%)   29/4000 (0.73%)
Treatment B   26/3000 (0.87%)   6/1000 (0.60%)    32/4000 (0.80%)
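The reversal in the table can be reproduced with a few lines of code. The sketch below uses the hypothetical counts from the table and also computes a stratified (Mantel-Haenszel) relative risk, which is a technique added here for illustration and not one named in the draft guidance, to show how an analysis that respects study boundaries avoids the reversal.

```python
# (events, patients) per treatment arm, taken from the hypothetical table.
studies = {
    "Study 1": {"A": (9, 1000), "B": (26, 3000)},
    "Study 2": {"A": (20, 3000), "B": (6, 1000)},
}


def pooled_rate(arm):
    """Naive pooling: total events over total patients, ignoring study."""
    events = sum(arms[arm][0] for arms in studies.values())
    patients = sum(arms[arm][1] for arms in studies.values())
    return events / patients


def mantel_haenszel_rr():
    """Relative risk of A versus B, stratified by study."""
    numerator = denominator = 0.0
    for arms in studies.values():
        events_a, n_a = arms["A"]
        events_b, n_b = arms["B"]
        total = n_a + n_b
        numerator += events_a * n_b / total
        denominator += events_b * n_a / total
    return numerator / denominator


for name, arms in studies.items():
    rate_a = arms["A"][0] / arms["A"][1]
    rate_b = arms["B"][0] / arms["B"][1]
    print(f"{name}: A = {rate_a:.2%}, B = {rate_b:.2%}")

naive_rr = pooled_rate("A") / pooled_rate("B")
stratified_rr = mantel_haenszel_rr()
print(f"naive pooled RR (A vs B): {naive_rr:.3f}")
print(f"stratified RR (A vs B):   {stratified_rr:.3f}")
```

The naive pooled ratio falls below 1 while the stratified ratio stays above 1, matching the direction seen within each study. The reversal arises because the two arms are allocated in opposite proportions across studies with different baseline rates.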

Example 2:

Relative risk, or the logarithmic mean of the relative risk, is a common statistic used to compare incidences of adverse events. The relative risk calculated by simply pooling the data may show a significant difference between treatments, whereas the relative risk weighted by the inverse of its variance may not. Lièvre, Cucherat and Leizorovicz [4] illustrate this in proposing the logarithmic mean of the relative risk weighted by the inverse of its variance, as shown in the table below.

Method                                                  RR (Confidence Interval)   p-value
Relative Risk by Simply Pooling the Data                5.20 (2.07, 13.08)         0.00004
Relative Risk Weighted by the Inverse of its Variance   2.20 (0.76, 6.32)          0.14
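The two calculations compared in the table can be sketched from raw counts. The data below are hypothetical and are not the data analyzed by Lièvre et al.; the variance formula for the log relative risk and the normal-approximation p-value are standard choices assumed for this illustration.

```python
import math


def log_rr_and_variance(events_t, n_t, events_c, n_c):
    """Log relative risk (treatment vs control) and its approximate variance."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance


def simple_pooled_rr(studies):
    """Sum events and patients across studies, then take a single ratio."""
    events_t = sum(s[0] for s in studies)
    n_t = sum(s[1] for s in studies)
    events_c = sum(s[2] for s in studies)
    n_c = sum(s[3] for s in studies)
    return (events_t / n_t) / (events_c / n_c)


def weighted_rr(studies, z=1.96):
    """Mean log RR weighted by inverse variance, with CI and two-sided p-value."""
    pairs = [log_rr_and_variance(*s) for s in studies]
    weights = [1.0 / variance for _, variance in pairs]
    total = sum(weights)
    mean = sum(w * log_rr for w, (log_rr, _) in zip(weights, pairs)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(mean) / se / math.sqrt(2.0))
    return math.exp(mean), math.exp(mean - z * se), math.exp(mean + z * se), p_value


# Hypothetical studies: (treatment events, treatment n, control events, control n).
# Each study has a relative risk of 2, but allocation ratios differ.
studies = [(8, 400, 2, 200), (6, 200, 6, 400)]
pooled = simple_pooled_rr(studies)
rr, lower, upper, p_value = weighted_rr(studies)
print(f"simple pooling: RR = {pooled:.2f}")
print(f"weighted:       RR = {rr:.2f} ({lower:.2f}, {upper:.2f}), p = {p_value:.2f}")
```

Here every study has the same relative risk, yet simple pooling gives a different value because the allocation ratios and baseline rates differ between studies. The weighted estimate recovers the common within-study value. The sketch requires at least one event in every arm; studies with zero events would need separate handling.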


Aggregate analysis is an important component of SUSAR reporting. However, care must be taken to ensure that proper statistical methods are used. Simple pooling of frequencies or other statistical measures has inherent limitations and should be used cautiously. It is important to review each study individually for consistency of conclusions. More sophisticated meta-analysis methods should be used as appropriate, based on factors such as the number of studies, the frequencies of events, and heterogeneity across the pooled studies.


  1. Summary of Assessment for IND Safety Reporting, FDA Draft Guidance, Dec 2015.
  2. The Code of Federal Regulations, title 21, part 312.
  3. Safety Reporting Requirements for INDs and BA/BE Studies, FDA Guidance, Dec 2012.
  4. Lièvre M, Cucherat M, Leizorovicz A. Pooling, meta-analysis, and the evaluation of drug safety. Curr Control Trials Cardiovasc Med. 2002; 3(1): 6. Copyright © 2002 Lièvre et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article’s original URL.
  5. Kajungu D, Speybroeck N. Implementation of Signal Detection Methods in Pharmacovigilance – A Case for their Application to Safety Data from Developing Countries. Ann Biom Biostat. 2015; 2(2): 1017.
  6. Van Manen RP. Challenges in Safety Signal Detection and Management. First Arab Meeting of Pharmacovigilance, Rabat, Morocco, 22-26 September 2014.