What is a marginal distribution in statistics?


A marginal distribution is the probability distribution of one variable in a dataset, considered on its own and ignoring the other variables recorded alongside it. It gives the probability of each value of that variable, regardless of the values the other variables take. Marginal distributions are useful because, although we often collect data on two or more variables together, we sometimes have questions about just one of them.

In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables in the subset without reference to the values of the other variables. Marginal variables are those variables in the subset of variables being retained. These concepts are "marginal" because they can be found by summing values in a table along rows or columns, and writing the sum in the margins of the table. The distribution of the marginal variables (the marginal distribution) is obtained by marginalizing over the distribution of the variables being discarded, and the discarded variables are said to have been marginalized out.
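For two variables, marginalizing can be written out as a short formula. The notation here is an assumption chosen for illustration, not taken from the text above: X is the variable being kept, Y is the variable being discarded, p_{X,Y} is their joint probability mass function and f_{X,Y} is their joint density.

```latex
% Discrete case: sum the joint probabilities over every value y of the discarded variable
p_X(x) = \sum_{y} p_{X,Y}(x, y)

% Continuous case: integrate the joint density over the discarded variable
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy
```

In a table of joint probabilities, the discrete sum is exactly the row or column total written in the margin.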

To find a marginal distribution, you "marginalize out" the other variables, that is, you sum (or, for continuous variables, integrate) over all of their values. This is easiest to see in a two-way contingency table: the row totals and column totals, written in the margins of the table, give the marginal distribution of each variable once they are divided by the grand total.
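As a rough sketch of the two-way table approach, the Python snippet below computes both margins by hand. The counts, the category labels (sport and gender) and all variable names are invented for illustration; they do not come from any real dataset.

```python
# Hypothetical two-way contingency table (made-up counts).
# Rows: favourite sport. Columns: gender.
counts = {
    "baseball":   {"male": 13, "female": 23},
    "basketball": {"male": 15, "female": 16},
    "football":   {"male": 20, "female": 13},
}

# Grand total of all cells in the table.
total = sum(sum(row.values()) for row in counts.values())

# Row margin: sum each row across the columns (gender is marginalized out).
sport_totals = {sport: sum(row.values()) for sport, row in counts.items()}

# Column margin: sum each column across the rows (sport is marginalized out).
gender_totals = {}
for row in counts.values():
    for gender, n in row.items():
        gender_totals[gender] = gender_totals.get(gender, 0) + n

# Dividing the margin totals by the grand total gives the marginal distributions.
sport_marginal = {sport: n / total for sport, n in sport_totals.items()}
gender_marginal = {gender: n / total for gender, n in gender_totals.items()}

print("Marginal distribution of sport:", sport_marginal)
print("Marginal distribution of gender:", gender_marginal)
```

With these made-up counts the grand total is 100, so the sport margin works out to 0.36, 0.31 and 0.33, and the gender margin to 0.48 and 0.52. Each marginal distribution sums to 1, which is a quick sanity check on any table.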