Take advantage of your time: International Six Sigma Inc. offers both Instructor-led Live Virtual classes and Online Self-Paced training. Enroll Today!

Phone: (866) 409-1363

Basic Statistics in Six Sigma

In this module, we’ll discuss the basic statistical approaches used in Six Sigma.

1. The Purpose of Basic Statistics

The purposes of statistics in Six Sigma are the following:

  • Provide a numerical summary of the data being analyzed.
  • Provide the basis for making inferences about the future.
  • Provide the foundation for assessing process capability.
  • Provide a common language to be used throughout the organization.
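As a minimal sketch of the first purpose, a numerical summary, the snippet below uses Python's standard `statistics` module. The cycle-time values and the variable names are hypothetical and only serve as an illustration.

```python
import statistics

# Hypothetical cycle-time measurements in minutes (illustrative values only).
cycle_times = [12.1, 11.8, 12.4, 13.0, 11.9, 12.2, 12.6, 12.0]

summary = {
    "mean": statistics.mean(cycle_times),
    "median": statistics.median(cycle_times),
    "stdev": statistics.stdev(cycle_times),  # sample standard deviation (divides by n - 1)
    "min": min(cycle_times),
    "max": max(cycle_times),
}

print(f"n: {len(cycle_times)}")
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A handful of numbers like these is often enough to give a team a shared, compact description of how a process is behaving.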

Statistics is the basic language of Six Sigma. A good understanding of statistics is the foundation on which many of the subsequent tools are built.

1.2. Statistical Notation Cheat Sheet

Don’t bother memorizing any of this; just refer to it as needed.

[Image: statistical notation cheat sheet, shmula]

1.3. Parameters vs Statistics

Let’s go through a few definitions, with the help of the chart below:

  • Population: All the items that have the “property of interest” under study.
  • Frame: An identifiable subset of the population.
  • Sample: A significantly smaller subset of the population used to make an inference.

[Image: population parameters in statistics]

1.4. Purpose of Sampling

The purpose of sampling is to obtain a sufficiently accurate inference using considerably less time, money, and other resources than measuring the entire population would require, and to provide a basis for statistical inference. If sampling is done well, and the sample is large enough, we can infer that what we see in the sample is representative of the population.

A population parameter is a numerical value that summarizes the data for an entire population; the corresponding numerical value for a sample is called a statistic.

The population is the collection of all the individuals of interest. It must be defined carefully, for example, all the trades completed in 2001. If there are distinct subsets of trades, it may be appropriate to define each as its own population, such as all sub-custodial market trades completed in 2001, or all emerging market trades.

A sampling frame is a complete list that should be identical to the population, with every element listed only once. It sounds very similar to a population, and it is; the difference is in how it is used. A sampling frame such as the list of registered voters could be used to represent the population of the adult general public. There may be reasons why this would not be a good sampling frame: for example, adults who are not registered to vote would be left out. Perhaps a list of licensed drivers would be a better frame to represent the general public.

The sampling frame is the source from which a sample is drawn.
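To make the relationship between frame, sample, parameter, and statistic concrete, here is a rough sketch that draws a simple random sample from a frame. The trade IDs and settlement times are simulated, and the names `frame`, `sample_ids`, and the sample size of 200 are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: one record per trade, each listed exactly once.
# The settlement times (in hours) are simulated, not real data.
frame = {f"trade-{i:05d}": random.gauss(24.0, 4.0) for i in range(10_000)}

# Population parameter: the mean over every element in the frame.
population_mean = statistics.mean(frame.values())

# Simple random sample without replacement, drawn from the frame.
sample_ids = random.sample(sorted(frame), k=200)

# Sample statistic: the mean over the sampled elements only.
sample_mean = statistics.mean(frame[trade_id] for trade_id in sample_ids)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

With a well-drawn sample, the statistic should land close to the parameter while requiring only a fraction of the measurements.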

It is important to recognize the difference between a sample and a population because we are typically working with a sample of what the population could be in order to make an inference. The formulas for describing samples and populations are slightly different. In most cases we will be dealing with the formulas for samples.
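One common place where the formulas differ is the variance: the population formula divides the sum of squared deviations by N, while the sample formula divides by n - 1. The sketch below, using made-up data, computes both by hand and checks them against Python's standard `statistics` module.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical measurements

n = len(data)
mean = sum(data) / n
sum_of_squares = sum((x - mean) ** 2 for x in data)

population_variance = sum_of_squares / n        # divide by N
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1

# The standard library provides both versions directly.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))

print(f"population variance: {population_variance}")  # 3.5
print(f"sample variance:     {sample_variance}")      # 4.0
print(f"sample std dev:      {math.sqrt(sample_variance)}")  # 2.0
```

The sample version is slightly larger, which compensates for the fact that a sample tends to understate the spread of the full population.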

2. Types of Data

The nature of data is important to understand. As we discussed in the video Data Types in Six Sigma, the data type determines which analyses are available to you.

2.1. Attribute Data (Qualitative)

Attribute data is always binary – only two possible values.

  • Yes/No
  • Go/No Go
  • Pass/Fail
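Because attribute data has only two possible values, it is usually summarized as counts and proportions. A minimal sketch, assuming a hypothetical list of ten inspection results:

```python
from collections import Counter

# Hypothetical inspection results for ten units (illustrative only).
results = ["Pass", "Pass", "Fail", "Pass", "Pass",
           "Pass", "Fail", "Pass", "Pass", "Pass"]

counts = Counter(results)
proportion_failed = counts["Fail"] / len(results)

print(f"passed: {counts['Pass']}, failed: {counts['Fail']}")  # passed: 8, failed: 2
print(f"proportion failed: {proportion_failed:.2f}")          # 0.20
```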

2.2. Variable Data (Quantitative)

Discrete Data is data that can be counted, categorized, and classified based on counts. For example:

  • Number of Defects
  • Number of Defective Units
  • Number of Customer Returns

Continuous Data is data that can be measured on a continuum and has decimal subdivisions. For example:

  • Time, pressure, conveyor speed, material feed rate
  • Money

Here are several real-world examples that may help you:

[Images: table of continuous data examples; table of variable data examples]

2.3. Scaled Data

The scale on which data is measured affects the types of statistical tests available to you. Here are a few scales to keep in mind:

  • Nominal Scale: Data consists of names, labels, or categories. These cannot be arranged in an ordering scheme, and no arithmetic operations are performed on this type of data.
  • Ordinal Scale: Data is arranged in some order, but differences between data values either cannot be determined or are not meaningful.
  • Interval Scale: Data can be arranged in some order, and differences between data values are meaningful and can be interpreted. There is no absolute zero, so ratios are not meaningful.
  • Ratio Scale: Data can be ranked, and all arithmetic operations, including division, can be performed (division by zero is of course excluded). Ratio-level data has an absolute zero, and a value of zero indicates a complete absence of the characteristic of interest.
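The four scales above can be sketched in terms of which summaries are meaningful on each. All values below are hypothetical, and the choice of examples (defect types, ratings, Celsius temperatures, cycle times) is an assumption made for illustration.

```python
import statistics

# Nominal: labels with no order; the mode (most frequent label) is meaningful.
defect_types = ["scratch", "dent", "scratch", "crack", "scratch"]
nominal_mode = statistics.mode(defect_types)

# Ordinal: ordered categories; a median rank is meaningful, a mean is not.
rating_order = ["poor", "fair", "good", "excellent"]
ratings = ["good", "fair", "excellent", "good", "poor"]
ranks = sorted(rating_order.index(r) for r in ratings)
ordinal_median = rating_order[ranks[len(ranks) // 2]]  # odd count: middle rank

# Interval: differences are meaningful, but there is no true zero,
# so 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius.
temps_celsius = [20.0, 22.5, 21.0]
interval_range = max(temps_celsius) - min(temps_celsius)

# Ratio: a true zero exists, so ratios are meaningful.
cycle_times_sec = [30.0, 60.0]
ratio = cycle_times_sec[1] / cycle_times_sec[0]

print("nominal mode:", nominal_mode)      # scratch
print("ordinal median:", ordinal_median)  # good
print("interval range:", interval_range)  # 2.5
print("ratio:", ratio)                    # 2.0
```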

Now let’s go through several examples of scaled data.

[Images: tables of example nominal, ordinal, interval, and ratio scaled data]

At this point, I recommend you review the section on Distributions in Six Sigma to give you a better idea of how data can look given the data type you have.

In the next section, we’ll introduce you to Z-Values.

