Mastering Method Development: A Comprehensive Guide to Sampling Protocols and Robust CCI Testing
Method development and validation are often regarded
as the most challenging aspects of quantitative and
deterministic Container Closure Integrity (CCI) testing. Despite
the availability of turnkey solutions, contract laboratories, and
extensive industry support, PTI frequently receives client questions
regarding the process, sample sizes, and practical recommendations for
implementing method development and validation. A robust test method requires
a well-characterized measurement system, a thorough analysis of a representative sample baseline, and effective
testing of positive controls. This article explores each topic, aiming to provide a framework for making informed
decisions in the method development process.
The importance of characterization cannot be overstated. Quantitative detection assumes that the behavior of the
measurement system can be characterized. Before any other testing is conducted, the operational qualifications
and instrument specifications must be met.

The first step following system characterization is to evaluate baseline sample behavior. Negative control
characterization may also include master sample testing. Whether system characterization occurs with or without
master samples, the primary objective is to test a sufficient number of negative samples to represent negative
control behavior. When considering sampling, the number of samples is often only the first question. Clients must
also determine whether their plan accounts for differences between production runs. Are they attempting to set
up the same method for two packages that are actually handled differently? Any potential difference warrants
increasing the sample size and evaluating samples separately. PTI recommends using at least 30 samples for a
basic negative control set. This value comes from the point at which a Student’s t-distribution converges on a
normal distribution. Classically, 30 samples is considered the point where the t-distribution, designed
for handling smaller sample sizes, begins to approximate the behavior of a normal distribution. This allows for
the use of straightforward Gaussian analysis instead of a more complex model.
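The convergence described above can be checked numerically. The following is a minimal sketch, not PTI's procedure: it compares two-sided 95% critical values of the t-distribution against the normal value for a given sample size. The function names and the 95% level are illustrative assumptions, and the t quantile is computed by simple numerical integration so that only the Python standard library is needed (a statistics package such as SciPy would normally be used instead).

```python
import math
from statistics import NormalDist


def t_pdf(x, dof):
    """Density of Student's t-distribution with `dof` degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coef * (1 + x * x / dof) ** (-(dof + 1) / 2)


def t_cdf(x, dof, steps=2000):
    """CDF for x >= 0: 0.5 plus a Simpson's-rule integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, dof) + t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 + total * h / 3


def t_quantile(p, dof):
    """Upper quantile (p > 0.5) found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def critical_values(n_samples, confidence=0.95):
    """Two-sided critical values (t, z) for a sample of size n_samples."""
    p = 1 - (1 - confidence) / 2
    return t_quantile(p, n_samples - 1), NormalDist().inv_cdf(p)


if __name__ == "__main__":
    for n in (5, 15, 30):
        t_crit, z_crit = critical_values(n)
        print(f"n={n:2d}  t={t_crit:.3f}  z={z_crit:.3f}")
```

With 5 samples the t critical value is roughly 2.78 against 1.96 for the normal distribution; at 30 samples it falls to roughly 2.05, which is why Gaussian analysis is usually treated as adequate from that point on.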
The Central Limit Theorem (CLT) provides an alternate basis for evaluating sample size sufficiency. According
to the CLT, for a population characterized by a mean (μ) and standard deviation (σ), if sufficiently large random
sample sets are taken from this population with replacement, the distribution of the sample set means will
approach a normal distribution. However, if the underlying population is not normal and the sample set is too
small, even the sample set means will not result in a normal distribution. The typical size recommendation remains
30 samples. This method is only effective if a representative set of test samples is provided.
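The effect of sample set size on the distribution of set means can be illustrated with a small simulation. This is a sketch under stated assumptions: the skewed exponential population, the number of simulated sets, and the function names are all hypothetical choices made for illustration and do not represent CCI test data.

```python
import random
import statistics


def skewness(values):
    """Population skewness: zero for a symmetric (e.g. normal) distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def skewness_of_set_means(set_size, n_sets=2000, seed=0):
    """Draw n_sets sample sets from a skewed population and measure
    how skewed the distribution of the set means still is."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(set_size)])
        for _ in range(n_sets)
    ]
    return skewness(means)


if __name__ == "__main__":
    for size in (5, 30):
        print(f"set size {size:2d}: skewness of set means = {skewness_of_set_means(size):.2f}")
```

For this particular population the skewness of the set means shrinks in proportion to one over the square root of the set size, so sets of 30 give means that are much closer to symmetric than sets of 5, though still not perfectly normal.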
Once the negative sample set is evaluated, appropriate confidence intervals may be computed using a normal
distribution. Confidence intervals estimate a range of values likely to encompass an unknown population
parameter. This estimated range is derived from specific sample data. In the context of Container Closure Integrity
Testing (CCIT), this sample data represents test results. Furthermore, the direction of failure is often known. This
knowledge makes it possible to assess the probability of a false negative, which occurs when a test fails to detect
a specified breach size in container integrity.
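Because the direction of failure is known, the false negative probability can be estimated as a one-sided tail area. The sketch below is illustrative only: it assumes that a leak raises the reading, that readings are normally distributed, and that the rejection limit is set a chosen number of standard deviations above the negative control mean. The function names and the `k_sigma` parameter are hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def rejection_limit(negative_readings, k_sigma=6.0):
    """Assumed limit: negative control mean plus k_sigma sample standard deviations."""
    return fmean(negative_readings) + k_sigma * stdev(negative_readings)


def false_negative_probability(limit, defect_mean, defect_sd):
    """Probability that a defective sample reads at or below the limit
    (one-sided, assuming defect readings are normally distributed)."""
    return NormalDist(defect_mean, defect_sd).cdf(limit)
```

For example, if a defect type reads on average 4.5 standard deviations above the rejection limit, `false_negative_probability` returns a value on the order of a few in a million.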
Positive control samples are also required to develop a method.
Their purpose is to demonstrate that specific defect types are
statistically distinguishable from the general population. Before
selecting the quantity of positive controls, the types of positive
controls must be appropriately selected. Leaks calibrated to a required size and leaks simulating natural defects
may be necessary for positive control evaluation. A comprehensive risk assessment may help identify defect
profiles and their underlying causes, forming the basis for a robust positive control strategy.
Defects fall into three categories: catastrophic, gross, and micro. Catastrophic defects are apparent upon visual
inspection and are often associated with visible container damage. Gross defects, larger than 100 μm, may not
be readily visible upon inspection. Micro defects refer to leaks smaller than gross defects. To effectively challenge
the method, it is advisable to include four distinct positive control sizes: three sizes of micro leaks and one gross
leak profile. This strategy demonstrates that the test method reliably detects a range of leak sizes and
types, improving the robustness of the method and the validity of its results. Depending on the production or laboratory situation,
testing for catastrophic defects may also be necessary.
Positive controls are evaluated with the assumption that a given defect type will perform consistently according
to a normal distribution. Defect type refers to characteristics such as size, location, creation method, or defect
channel length. PTI recommends a set of three sizes and 15 samples for each defect type. For mass-flow-based
technologies, the assumption of normality aligns with the measurement method. Exceptions, such as alternative
fluids or detection methods like visual inspection or electric current-based detection, require increasing the
number of positive controls. Regardless of the assumption of normality, baseline signal variation for positive
controls must be assumed to match the negative control baseline variation. Without this assumption, the number
of positive controls could not be reduced below 30 samples per defect type.
Positive control evaluations must also consider harmonization needs, allowable drift, and rejection limits. Recipes
are typically established for long-term use or multi-site applications. A well-validated method with reasonable
margins is essential. A general rule of thumb is to allow at least 1.5 standard deviations for long-term variability,
on top of the classic six-sigma rejection criterion. For normally distributed detection technologies, this long-term
variability allowance should be added to the minimum of 3 to 6 standard deviations that should separate the positive
control sample average from the rejection limit. For non-normal distributions, such as high-voltage detection,
PTI recommends using the lowest measured positive control value instead of the average. If additional margin is
required, PTI suggests targeting a lowest positive control result of twice the rejection limit.
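The margin rules above reduce to simple arithmetic. The following sketch is one possible reading of them, not a PTI-supplied calculation: it assumes that leaks raise the reading, that the baseline standard deviation from the negative controls is used for both checks, and all function and parameter names are hypothetical.

```python
def meets_sigma_margin(positive_mean, rejection_limit, baseline_sd,
                       k_sigma=3.0, drift_sd=1.5):
    """Normal case: the positive control average should sit at least
    (k_sigma + drift_sd) baseline standard deviations above the limit."""
    separation = (positive_mean - rejection_limit) / baseline_sd
    return separation >= k_sigma + drift_sd


def meets_doubled_limit(positive_readings, rejection_limit):
    """Non-normal case with extra margin: the lowest positive control
    result should be at least twice the rejection limit."""
    return min(positive_readings) >= 2 * rejection_limit
```

With a rejection limit of 10, a baseline standard deviation of 2, and a positive control average of 20, the separation is 5 standard deviations. That satisfies a 3-sigma minimum plus the 1.5-sigma drift allowance (4.5), but not a 6-sigma minimum plus drift (7.5).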
For processes where speed is critical or the limit of detection (LOD) is low, clients are encouraged to contact
PTI for assistance. Methods using a well-characterized system in which the lowest positive control results fall
15–30 standard deviations above the negative sample average are considered robust. Clients using recipes with
results below this range, especially for multi-site considerations, should conduct a careful risk assessment. Where
defect type normality is well-characterized, or theoretically justified, positive sample sets may be reduced to 8–15
controls per defect type, compensated for by using a Student’s t-distribution evaluation.
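The separation figure quoted above can be computed directly from the two data sets. This is a minimal sketch assuming that higher readings indicate a leak and that the sample standard deviation of the negative controls is the yardstick. The function names and the 15-sigma threshold argument are illustrative.

```python
from statistics import fmean, stdev


def separation_in_sd(negative_readings, positive_readings):
    """Distance from the negative control mean to the lowest positive
    control result, in units of the negative control standard deviation."""
    baseline_mean = fmean(negative_readings)
    baseline_sd = stdev(negative_readings)
    return (min(positive_readings) - baseline_mean) / baseline_sd


def needs_risk_assessment(negative_readings, positive_readings, robust_threshold=15.0):
    """True when the separation falls below the assumed robustness threshold."""
    return separation_in_sd(negative_readings, positive_readings) < robust_threshold
```

A result below the threshold does not by itself invalidate a method. Following the text above, it flags the recipe for a careful risk assessment.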
Establishing a robust method for CCIT requires
understanding baseline sample behavior, positive
control evaluation, and appropriate statistical
techniques. While Gaussian statistics, Student’s
t-distributions, and the Central Limit Theorem provide
valuable tools, thorough risk assessment is essential
for evaluating defect profiles accurately. A carefully
developed method supports long-term success in
product line testing. Implementing a positive control
strategy that includes various leak sizes enhances
testing reliability, ensuring the detection of potential
breaches in container integrity. This safeguards product
quality and patient safety while improving robustness,
transferability, and long-term applicability of recipes.
REFERENCES