Censored data

Most types of data are 'complete', meaning that the value of each sample is known. However, in settings such as reliability analysis, experiments with minimum and maximum thresholds of detection, or studies in which units are observed only for a limited time or only at periodic inspections, we may know only that some observations were above a certain value (right censored), below a certain value (left censored) or between two values (interval censored).

Complete Data

Complete data means that the value of each sample unit is observed or known. Complete data is much easier to work with than censored data, and basic statistical analysis techniques assume that we have complete data.

Censored Data

There are three possible censoring schemes: right censored data (also called suspended data), interval censored data, and left censored data.

Right Censored (Suspended)

These are data for which we know only a minimum value. In reliability testing, for example, not all of the tested units will necessarily fail within the testing period. Then all we know is that the failure time exceeds the testing time. In microbiology, there is a practical threshold above which we cannot count colonies on a Petri dish. In sequential sifting, we know only the minimum diameter of the largest particles that don't pass through the first sieve. This type of data is commonly called right-censored or suspended data.
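
As an illustration of the reliability-testing case, the following is a minimal sketch of how right censoring might be recorded when a test is stopped at a fixed time. The function name right_censor, the failure times and the test duration are all hypothetical and chosen only for this example.

```python
def right_censor(failure_times, test_duration):
    """Return (recorded_time, failed) pairs for a test stopped at test_duration.

    Units that have not failed by the end of the test are recorded at
    test_duration with failed=False: we know only that their true failure
    time exceeds that value.
    """
    records = []
    for t in failure_times:
        if t <= test_duration:
            records.append((t, True))               # exact failure time observed
        else:
            records.append((test_duration, False))  # suspended (right censored)
    return records


# Hypothetical true failure times in hours; in a real test the last one
# would never be observed because the test stops at 1000 hours.
true_times = [120.0, 340.0, 510.0, 980.0, 1500.0]
records = right_censor(true_times, 1000.0)
print(records)
```

The last unit is stored as (1000.0, False): its recorded value is a lower bound on the failure time, not the failure time itself.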

Interval Censored

These are data for which we know only that they lie between a certain minimum and maximum. Interval censoring arises commonly when we assign measurements into categories or intervals. For example, a survey may ask people which income range they fall into, offering several contiguous intervals, rather than asking for their exact income. In reliability testing, we may inspect the units only every T hours, so we can record only that a unit failed between nT and (n+1)T hours. This is sometimes called inspection data.
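
The inspection example can be sketched as follows. The function name inspection_interval and the numbers are hypothetical, and the code assumes positive failure times and the convention that a failure occurring exactly at an inspection is detected at that inspection.

```python
import math


def inspection_interval(failure_time, T):
    """Return the (lower, upper) inspection times that bracket a failure.

    Assumes failure_time > 0 and inspections at T, 2T, 3T, ...
    The failure is first seen at the earliest inspection at or after
    failure_time, so all we can record is the interval ending there.
    """
    detected_at = math.ceil(failure_time / T)  # index of the detecting inspection
    return ((detected_at - 1) * T, detected_at * T)


# Hypothetical unit failing at 250 hours with inspections every 100 hours:
# it is found working at 200 hours and failed at 300 hours.
interval = inspection_interval(250.0, 100.0)
print(interval)
```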

Left Censored

These are data for which we know only a maximum value. In scientific experiments, for example, we may not be able to measure some quantity because it is below the threshold of detection (e.g. chemical concentration).
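
A minimal sketch of the detection-threshold case is shown below. The function name apply_detection_limit, the concentrations and the limit of 0.10 are hypothetical values used only to illustrate how left-censored readings might be flagged.

```python
def apply_detection_limit(true_values, detection_limit):
    """Return (reported_value, is_left_censored) pairs.

    Values below detection_limit cannot be measured, so they are reported
    at the limit and flagged: we know only that the true value is below it.
    """
    records = []
    for v in true_values:
        if v < detection_limit:
            records.append((detection_limit, True))  # left censored
        else:
            records.append((v, False))               # measured exactly
    return records


# Hypothetical concentrations and a hypothetical detection limit of 0.10.
concentrations = [0.02, 0.40, 0.07, 1.30]
reported = apply_detection_limit(concentrations, 0.10)
print(reported)
```

Keeping the flag alongside the value matters: treating the reported limit as if it were an exact measurement would bias any subsequent fit upwards.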

Fitting to truncated, censored or binned data requires optimizing an adjusted likelihood function.
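
For censored data, the usual adjustment is that an exactly observed value contributes the probability density at that value, while a censored value contributes the probability of the range in which it is known to lie. The sketch below illustrates this for an exponential lifetime distribution, chosen here purely for simplicity. Each observation is written as a hypothetical (lower, upper) pair: equal bounds for a complete value, an infinite upper bound for right censoring, and a zero lower bound for left censoring. The function names and data are assumptions for this example, not part of any particular software.

```python
import math


def exp_cdf(x, rate):
    """Cumulative distribution function of an exponential distribution."""
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    return 1.0 - math.exp(-rate * x)


def censored_loglik(rate, observations):
    """Log-likelihood of (lower, upper) observations under Exponential(rate).

    lower == upper : complete observation, contributes log f(x)
    otherwise      : censored observation, contributes log(F(upper) - F(lower))
    """
    total = 0.0
    for lower, upper in observations:
        if lower == upper:
            total += math.log(rate) - rate * lower
        else:
            total += math.log(exp_cdf(upper, rate) - exp_cdf(lower, rate))
    return total


# Hypothetical test: four observed failures and one unit suspended at 1000 hours.
data = [(120.0, 120.0), (340.0, 340.0), (510.0, 510.0), (980.0, 980.0),
        (1000.0, math.inf)]

# With only complete and right-censored values, the exponential maximum
# likelihood estimate has a closed form: failures divided by total time on test.
n_failed = sum(1 for lo, hi in data if lo == hi)
total_time = sum(lo for lo, hi in data)
rate_hat = n_failed / total_time

print(rate_hat, censored_loglik(rate_hat, data))
```

For distributions without a closed-form estimate, the same log-likelihood would instead be passed to a numerical optimizer.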
