Statistical Inference



- Overview

Statistical inference is the process of using data analysis to draw conclusions about a population or process. It involves making decisions about the parameters of a population based on random sampling. 

Statistical inference is different from descriptive statistics. Descriptive statistics only considers the properties of the observed data. Statistical inference, on the other hand, assumes that the observed data set is sampled from a larger population. 

Many inferential methods assume simple random sampling, in which each individual within the population of interest has the same probability of being included in a given sample. 

One purpose of statistical inference is to quantify uncertainty, that is, how much an estimate would vary from sample to sample. It also helps to assess the relationship between dependent and independent variables. 

For example, the result of a poll about the president's current approval rating can be used to estimate (or predict) his or her true current approval rating nationwide. 
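
The poll example can be made concrete with a short sketch. The figures are hypothetical (520 approvals among 1,000 respondents), the function name is made up for illustration, and the interval uses the normal approximation, which is only one of several ways to build an interval for a proportion.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n                            # sample proportion
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided critical value
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)     # estimated standard error
    return p_hat, p_hat - z * std_err, p_hat + z * std_err


# Hypothetical poll: 520 of 1,000 respondents approve.
p_hat, low, high = proportion_confidence_interval(520, 1000)
print(f"estimate = {p_hat:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The sample proportion is the point estimate of the nationwide rating, and the interval expresses how far that estimate could plausibly be from the true value because of sampling variation.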

The steps of an inferential test depend on: 

  • The type of data (categorical or numerical)
  • Whether you are looking at a single group, comparing one group with another, or comparing the same group before and after a change 
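
As a rough illustration of how these two factors narrow the choice, the lookup below pairs each combination with a test that is conventionally used for it. The pairings are common textbook defaults rather than something stated on this page, and the names `SUGGESTED_TESTS` and `suggest_test` are hypothetical.

```python
# Conventional starting points only; the right test also depends on
# sample size, distributional assumptions, and the study design.
SUGGESTED_TESTS = {
    ("numerical", "single group"): "one-sample t-test",
    ("numerical", "two groups"): "two-sample t-test",
    ("numerical", "before and after"): "paired t-test",
    ("categorical", "single group"): "chi-square goodness-of-fit test",
    ("categorical", "two groups"): "chi-square test of independence",
    ("categorical", "before and after"): "McNemar's test",
}


def suggest_test(data_type, design):
    """Return a commonly used test for the given data type and design."""
    key = (data_type.lower(), design.lower())
    if key not in SUGGESTED_TESTS:
        raise ValueError(f"no suggestion for {key!r}")
    return SUGGESTED_TESTS[key]


print(suggest_test("numerical", "before and after"))
```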
 
 

- Inference

Inference is the step of reasoning from premises to logical consequences; etymologically, the word "infer" comes from the Latin inferre, meaning "to bring in" or "carry forward". 

Reasoning has traditionally been divided theoretically into deduction and induction, a distinction that dates back at least as far as Aristotle (300s BC) in Europe. 

Deduction is reasoning that draws logical conclusions from premises known or assumed to be true; the rules of valid deductive reasoning are studied in logic. 

Induction is the inference of general conclusions from specific evidence. 

A third type of inference, abduction, is sometimes distinguished from induction, notably by Charles Sanders Peirce.

 

- The Main Types of Inferential Statistics

Inferential statistics use sample data to estimate the characteristics of the whole population from which the sample was drawn, and to make educated guesses about that population. Some types of inferential statistics include: 

  • Hypothesis testing: Taking a sample of data from a population, calculating a test statistic, and using it to judge whether the data are consistent with a stated hypothesis about the population.
  • T-tests: Hypothesis tests that determine whether there is a significant difference between the means of two groups.
  • ANOVA: A popular test that compares the means of a study's target groups (typically three or more) to identify whether they are statistically different.
  • Regression analysis: Modeling the relationship between a dependent variable and one or more independent variables.
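
To show what "calculating a statistic" looks like for a t-test, here is a minimal sketch that computes Welch's t statistic and its approximate degrees of freedom for two small samples. The data are invented, the function name is hypothetical, and Welch's unequal-variance form is chosen here only as one common variant of the two-sample t-test.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Invented measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide; in practice a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) is typically used for that step.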


Other types of inferential statistics include: 

  • Linear regression analysis
  • Logistic regression analysis
  • Analysis of covariance (ANCOVA)

Inferential statistics requires that the samples used be representative of the population as a whole.

 

- The Four Pillars of Inferential Statistics

The four pillars of inferential statistics are significance, estimation, generalization, and causation. 

Inferential statistics is a branch of statistics that uses the properties of data to test hypotheses and draw conclusions. Each pillar corresponds to a question that those conclusions try to answer: 

  • Significance: How strong is the evidence?
  • Estimation: How large is the effect?
  • Generalization: How broadly do the conclusions apply?
  • Causation: Can we say what caused the observed difference?


Some examples of inferential statistics tools include: 

  • Hypothesis testing
  • Confidence intervals
  • Regression analysis
  • Analysis of variance (ANOVA)
  • Chi-square tests

 

- Statistical Estimation and Statistical Hypothesis Testing

Statistical inference can be divided into two broad areas: statistical estimation and statistical hypothesis testing. 

Hypothesis testing is a type of statistical inference that uses data from a sample to draw conclusions about a population parameter or a population probability distribution. It starts from a tentative assumption about the parameter or distribution (the null hypothesis) and then uses the sample to decide whether that assumption should be rejected. 
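
A minimal sketch of this procedure, assuming a hypothetical poll in which 520 of 1,000 respondents approve and a tentative assumption that the true approval rate is 50%. The function name is made up, and the one-sample z-test for a proportion is used only because it needs nothing beyond the normal distribution.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p_null):
    """Two-sided z-test of H0: population proportion equals p_null."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis.
    std_err = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 520 approvals out of 1,000; H0: true rate is 0.50.
z_stat, p_value = one_proportion_z_test(520, 1000, 0.50)
print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")
if p_value < 0.05:
    print("Reject the tentative assumption at the 5% level.")
else:
    print("The data are consistent with the tentative assumption.")
```

The 5% threshold is a convention rather than a rule; a large p-value means the sample does not contradict the assumption, not that the assumption has been proven.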

Statistical estimation produces point estimates and confidence intervals. A confidence interval is a range of values, computed from sample data, that is expected to contain the true value of a population parameter at a stated confidence level (for example, 95%). It measures the uncertainty associated with a point estimate by giving a range of plausible values for the parameter.
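
What a confidence level means can be checked by simulation. The sketch below assumes a normal population whose mean and standard deviation are known (values chosen arbitrarily), draws many samples, builds a 95% interval for the mean from each, and counts how often the interval contains the true mean. The function name and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def coverage_rate(true_mean=50.0, sigma=10.0, n=30, trials=2000,
                  confidence=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)                        # fixed seed for repeatability
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)                # known-sigma margin of error
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The share should land close to the nominal 95%. Any single interval either contains the true mean or does not; the confidence level describes the long-run behavior of the procedure.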

 
 

[More to come ...]


