D. G. Schrausser / Thesis chapter 2: Random selection as correlative construct (2022)
Thesis chapter 2: Random selection as correlative construct
Dietmar G. Schrausser
Institute of Psychology, Karl Franzens University, Universitätsplatz 2, 8010 Graz, Austria
Thesis¹ July 1996; English translation July 2022
Introduction
Since the random selection (or assignment) of elements (persons) from populations is of fundamental
importance for both approximate and exact procedures, this concept is discussed in more detail. Two examples
are given for illustration: Scenario A (Section 2.1) illustrates the formal mechanism of random selection. In
Scenario B (Section 2.2) this formal mechanism is applied to the selection of samples from a population.
Analogies to error variance and treatment variance are discussed. The mechanism of random selection is defined
as a non-relationship between a selection process A and the characteristics of a population P from which a
sample S is selected (Section 2.3).
2.1 Scenario A: Of panels and cuboids
2.1.1 Case 1
Let set P be a panel with a uniform thickness of e entities. Furthermore, we assume that set A is an
arrangement of cuboids of different diameters. The cuboids are all of the same length. This set A shall be called
the 'shaping matrix'. Now imagine that set A is positioned parallel to the left of the flat panel P and penetrates
it from left to right. Figure 2.1 illustrates what has been described:
Figure 2.1
Pictorial representation of a selection process of elements S from a set P determined by a set A. The sizes of the set S are completely
determined by the size property of set A, since P is constant with respect to this property.
¹ Schrausser, D. G. (1996). Permutationstests: Theoretische und praktische Arbeitsweise von Permutationsverfahren beim
unverbundenen 2 Stichprobenproblem. Thesis. Institute of Psychology, Karl Franzens University, Austria.
DOI: 10.13140/RG.2.2.24500.32640/1
The individual parts that result from this 'shaping process' are the cuboids of a set S. This set S will be called
the shaped quantity. The side length distribution of the resulting quantity S is defined by the thickness of panel
P and the cuboid diameters of shaping matrix A. The cuboids of quantity S have the thickness e of panel P as
one side length (a), and the varying diameter of the cuboids of matrix A as the other side length (b), see
Figure 2.2.
Figure 2.2
Since one side length (a) of quantity S is constant or invariant (always e entities wide), its distribution
follows a constant. If one neglects the constant side length (a) of the resulting S, defined by P, then only the
distribution of matrix A becomes apparent in the distribution of quantity S. In other words, only the distribution
information² of shaping matrix A is contained in the distribution information of the shaped quantity S.
2.1.2 Case 2
Again two sets are defined: a panel P and a shaping matrix A, which forms cuboids out of P. This time,
however, P does not have a uniform thickness e but varies in thickness. Let us vary the thickness in the same
way as the diameters of the cuboids of shaping matrix A in case 1. We get an unevenly formed panel (see
Figure 2.3). All cuboids of matrix A now have a constant diameter e. The cuboids are again of the same length.
Furthermore, matrix A contains the same number of cuboids as in case 1. If cuboids are formed out of the
panel, the result is a quantity S of cuboids whose side lengths are determined by the two basic sets A and P.
Again, one side length of each cuboid (b) is the constant e, while the other side length (a) varies (see
Figure 2.2). This time, however, the situation is exactly reversed: the distribution information that can be
neglected (because it is constant) does not stem from panel P but from shaping matrix A, which consists of
cuboids with equal diameter. If one neglects the constant side length (b) of the resulting S, defined by matrix A,
then only the distribution of P becomes apparent in the distribution of S; that is, only the distribution
information of panel P is contained in the distribution information of the shaped quantity S.
Figure 2.3
Pictorial representation of case 2. Again a set A and a set P determining a set S. Here, the size properties of set S are determined by the
size properties of set P, since the sizes of the boxes from set A are constant and can be kept that way for S.
As a result, both cases (1 and 2) are equivalent; only the prerequisites differ. In case 1 it was the 'selecting'
set (shaping matrix A) that varied in its distribution information, i.e. had the higher information content
(cf. Shannon and Weaver, 1949); in case 2 it was the set that was 'selected' from (panel P) that varied in its
distribution information. In both cases, the distribution information of the cuboids of set S can be used to
deduce the distribution information of one of the initial sets A or P: in case 1, that of set A; in case 2, that
of set P.
² The way in which elements of a set differ.
2.1.3 Case 3
Two sets A and P are defined as in the previous two cases. Now, however, both sets vary in their
distribution information, i.e. in thickness or diameter (see Figure 2.4): panel P is of unequal thickness (as in
case 2), and the cuboids of matrix A differ in diameter (as in case 1). The shaping process proceeds as before.
The cuboids of the resulting quantity S again obtain their side lengths (a) and (b) from the two initial sets,
namely from the varying thickness of panel P and the cuboid diameters of matrix A (see Figure 2.2). However,
since this time both initial sets have variable distributions and therefore no side length can be kept constant,
it is no longer possible (if the cuboids of S were mixed) to identify the distribution pattern of one of the
initial sets (A or P) from the distribution pattern of the cuboid sizes of S. The side lengths of the cuboids of S
reflect both the different diameters of the cuboids of A and the different thicknesses of P: the distribution of
panel P and the distribution of shaping matrix A are both contained in the distribution of the shaped
quantity S.
Figure 2.4
Pictorial representation of case 3. The size properties of set S are determined by the size properties of set P and set A.
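The three cases of Scenario A can be made concrete with a small simulation. The following is only a minimal sketch under stated assumptions, not part of the original thesis: the observable 'size' of a shaped cuboid is taken to be the product of its two side lengths, the values are drawn from an arbitrary uniform distribution, and the names `shape` and `recovers` are hypothetical helpers introduced for illustration.

```python
import random

def shape(thicknesses, diameters):
    """Return the size a*b of each shaped cuboid in S.

    Assumption: the observable size of a cuboid is the product of side (a),
    taken from panel P, and side (b), taken from shaping matrix A.
    """
    return [a * b for a, b in zip(thicknesses, diameters)]

def recovers(sizes, constant, source):
    """True if dividing the sizes by the constant side reproduces `source`."""
    rescaled = sorted(s / constant for s in sizes)
    return all(abs(x - y) < 1e-9 for x, y in zip(rescaled, sorted(source)))

rng = random.Random(0)
n, e = 1000, 5.0                                        # e: the constant side length
varying_p = [rng.uniform(1.0, 9.0) for _ in range(n)]   # uneven panel P
varying_a = [rng.uniform(1.0, 9.0) for _ in range(n)]   # uneven matrix A
constant = [e] * n

case1 = shape(constant, varying_a)    # case 1: P constant, A varies
case2 = shape(varying_p, constant)    # case 2: P varies, A constant
case3 = shape(varying_p, varying_a)   # case 3: both vary

print("case 1 recovers A:", recovers(case1, e, varying_a))
print("case 2 recovers P:", recovers(case2, e, varying_p))
print("case 3 recovers A:", recovers(case3, e, varying_a))
print("case 3 recovers P:", recovers(case3, e, varying_p))
```

In cases 1 and 2 the constant side can be divided out and the distribution of the varying set reappears unchanged in S. In case 3 no such constant exists, so the sizes in S match neither initial set.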
2.2 Scenario B: From panel to population, from cuboid to person.
2.2.1 For case 1
Let panel P be a population of people in whom a hypothetical variable is to be measured, say
long-term memory performance. Memory performance (e.g. the number of pictures a person can retain) is taken
to be analogous to the thickness³ of panel P. In case 1, where P had a uniform thickness, all people would
have the same memory capacity: all could memorize e images (the analogue of panel thickness e). Shaping
matrix A (cuboids of different diameters) now represents a selection process. This set shall be called the
selection set A (cf. Table 2.2.1). The purpose of this construction is to select individuals from the total
population P. The selection process takes place in the same way as above: the cuboids of selection set A
form cuboids out of the panel, now population P. These formed cuboids, now selected subjects, constitute a
random sample S (analogous to the shaped quantity S).
Persons selected into sample S by the shaping process would be determined on the one hand by their
memory performance in population P (a) and on the other hand by the cuboid diameters of selection set A
(b), analogously to the side lengths (a) and (b) (cf. Figure 2.2) of the cuboids of the shaped quantity S in
case 1, which were determined by the panel thickness (panel P) and the cuboid diameters (shaping matrix A).
The panel thickness (= memory performance) is the constant e and can therefore be neglected. In S only the
cuboid diameter distribution of set A is visible: only the distribution information of selection set A is
contained in the distribution information of sample S.
³ The memory performance (= panel thickness) is continuously distributed.
2.2.2 For case 2
This time the people in population P differ in their ability to retain images, i.e. the subjects vary in
their memory performance. This is analogous to panel P of unequal thickness from case 2. Accordingly, the
selection set A has the properties of the shaping matrix A from case 2 (= constant cuboid diameter). If
people are selected as described above, the random sample S will in turn imply the distribution information
of selection set A (diameter of the cuboids) and of population P (memory performance). The cuboid diameter
is constant and can be neglected. In sample S only the memory performance distribution of population P is
evident. Only the population distribution information is contained in the sample distribution information:
sample S is an image of population P in terms of distribution information.
2.2.3 For case 3
A panel P of varying thickness and a shaping matrix A with cuboids of different diameters result in a
shaped quantity S with cuboids of different side lengths (cf. 2.1.3). This corresponds to a varying memory
performance in population P and different cuboid diameters of selection set A, which together result in a
different distribution of memory performance in sample S. Just as the cuboids of quantity S imply the
distribution information of both the thickness of panel P and the cuboid diameters of shaping matrix A,
sample S implies the memory performance distribution of population P and the distribution information of
selection set A. Both are implicitly contained in the distribution information of sample S and cannot be
considered separately. In terms of distribution information, sample S is an image of both population P and
selection set A. The distribution of memory performance in sample S can therefore no longer be used to
determine the distribution of memory performance in population P, since the distribution of selection set A
is unknown. In this context Bohm (1987) describes implications concerning holograms.
The following analogies are noticeable:
(1) The cuboid diameter distribution is analogous to the error variance: the more the cuboid diameters of
selection set A differ, the more the cuboid diameters in sample S differ.
(2) The panel thickness distribution is analogous to the person-related error variance: the more the panel
thicknesses of population P differ, the more the cuboid diameters in sample S differ.
(3) Instead of person-related error variance, treatment variance can also be used: the thicker a part of the
panel, the more effect a treatment (= experimental treatment) achieves for a specific individual.
Further analogies (e.g. the mutual relationship of A and P, the definition of A1 by A0, S as P, etc.) and
possible extensions of the principle to other scenarios (e.g. conditional probabilities, sequences of letters or
color mixtures) will not be discussed here.
Table 2.2.1 Comparative representation of the terms from scenarios A and B and analogies.
2.3 Random selection or representative selection
If the thickness distribution of P (person-related error variance or treatment variance) is to be apparent
in the distribution of the cuboids (individuals) of sample S, then there must be no connection between
diameter A (selection process) and thickness P (the characteristic to be surveyed in the population). It must be
ensured that large cuboid diameters A encounter small panel thicknesses with the same frequency as large
panel thicknesses. The same applies to small cuboid diameters. Only if this non-relationship between the
distribution information of A and P is given will the distribution information of P be redundantly contained
in S. This is all the more true the smaller the variance in cuboid diameter A (= error variance) and the greater
the variance in thickness P (= person-related error variance or treatment variance).
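The role of the two variances can be illustrated numerically. The sketch below rests on assumptions that are not in the original text: the observed size in S is modelled additively as thickness plus diameter (in the spirit of the error-variance analogy), both are drawn independently from normal distributions with arbitrary parameters, and `simulate` and `pearson` are hypothetical helper names.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate(sd_a, sd_p, n=5000, seed=1):
    """Return (r(A,P), r(S,P)) for independently drawn A and P."""
    rng = random.Random(seed)
    thickness = [rng.gauss(10.0, sd_p) for _ in range(n)]  # P
    diameter = [rng.gauss(10.0, sd_a) for _ in range(n)]   # A, independent of P
    size = [t + d for t, d in zip(thickness, diameter)]    # assumed additive model
    return pearson(diameter, thickness), pearson(size, thickness)

# Hold the spread of P fixed and shrink the spread of A.
results = {sd_a: simulate(sd_a, sd_p=2.0) for sd_a in (4.0, 1.0, 0.25)}
for sd_a, (r_ap, r_sp) in results.items():
    print(f"sd(A)={sd_a:<5} r(A,P)={r_ap:+.3f}  r(S,P)={r_sp:.3f}")
```

Because A and P are drawn independently, r(A,P) stays close to zero in every run, while the agreement between S and P grows as the spread of A shrinks relative to that of P.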
2.4 Discussion
If one sets the condition that only the distribution information of a population P is transferred to a
sample S (the sample is an image of the population on a small scale), then the distribution information of a
selection set A (selection process) must not be included in sample S. This can be clarified by formulating the
occurrence of panel characteristics in a population (thickness of P, memory ability of people in population P)
as a statement of probability: certain feature intervals occur with distribution-related probabilities or
frequencies. If the selection process (forming cuboids out of a panel) has no connection whatsoever with the
feature to be surveyed (cuboid diameter with panel thickness), then each feature interval (panel thickness) is
selected with the same probability (cuboid diameter); i.e. no feature interval is preferentially selected (no
panel thickness is preferentially 'distorted' by a particular cuboid diameter). The only factor that determines
the frequency with which a feature or feature interval is selected is the probability or frequency of that
interval in the population. Since this applies to all intervals, all intervals in the sample have the same
probability of occurrence as in the population: sample and population have approximately the same
distribution information. Representative samples S of a certain population P are only given if there is no
correlation between population P and selection process A:
r(A,P) = 0.    (2.4.1)
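Condition (2.4.1) can be illustrated with a simulated population. This is a sketch under assumptions made only for illustration: memory scores are drawn from a normal distribution with arbitrary parameters, each person receives a hypothetical 'selection propensity', and the selection process simply picks the persons with the highest propensity. The helper names `pearson` and `draw_sample` are not from the original.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def draw_sample(scores, propensity, k):
    """Select the scores of the k persons with the highest propensity."""
    order = sorted(range(len(scores)), key=lambda i: propensity[i], reverse=True)
    return [scores[i] for i in order[:k]]

rng = random.Random(42)
population = [rng.gauss(20.0, 5.0) for _ in range(20000)]  # P: images retained

processes = {
    # Selection process unrelated to the trait: r(A,P) close to 0.
    "unrelated": [rng.random() for _ in population],
    # Selection process that partly follows the trait: r(A,P) clearly above 0.
    "related": [s + rng.gauss(0.0, 5.0) for s in population],
}

summary = {}
for label, propensity in processes.items():
    sample = draw_sample(population, propensity, k=500)
    summary[label] = (pearson(propensity, population),
                      statistics.mean(sample), statistics.stdev(sample))
    r, m, sd = summary[label]
    print(f"{label:<10} r(A,P)={r:+.3f}  sample mean={m:.2f}  sample sd={sd:.2f}")

print(f"population mean={statistics.mean(population):.2f}  "
      f"sd={statistics.stdev(population):.2f}")
```

With the unrelated process the sample mean and spread stay close to those of the population. With the related process the sample is shifted toward high scores, so it no longer mirrors P.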
We speak of random selection if sample S contains the distribution information of population P from
which it was selected and if the distribution information of P is not related to the distribution information of
the selection set A. In psychological experiments, where one depends almost exclusively on random sampling
in order to be able to make general statements, genuine random selection of elements (= subjects) is
essential. Bortz (1993) sums up this problem in the context of conditional probabilities:
Genaugenommen müßte die Aussage „In diesem Zufallsexperiment hat das Ereignis A eine Wahrscheinlichkeit von p(A)“
ersetzt werden durch die Aussage „In diesem Zufallsexperiment hat das Ereignis A eine Wahrscheinlichkeit von p(A)
vorausgesetzt, das Zufallsexperiment wurde korrekt durchgeführt (Ereignis B)“. ... (d. h. daß die Wahrscheinlichkeit eines
korrekten Zufallsexperimentes 1 ist bzw. daß p(B)=1, ... . (S.54)
Strictly speaking, the statement "In this random experiment, event A has a probability of p(A)" should be replaced by the
statement "In this random experiment, event A has a probability of p(A) assuming the random experiment was carried out
correctly (event B)". ... (i.e. that the probability of a correct random experiment is 1 or that p(B)=1, ... . (p.54)
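One way to read this remark is through the law of total probability. The notation below is introduced here only for illustration: B denotes the event that the random experiment was carried out correctly, and \bar{B} denotes its complement.

```latex
\begin{align}
p(A) &= p(A \mid B)\,p(B) + p(A \mid \bar{B})\,p(\bar{B}), \\
p(B) = 1 \;&\Rightarrow\; p(\bar{B}) = 0 \;\Rightarrow\; p(A) = p(A \mid B).
\end{align}
```

Only under the assumption p(B) = 1 can the condition on B be dropped. If there is any doubt about the correctness of the selection procedure, p(B) < 1, the unknown term p(A | \bar{B}) enters p(A).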
The Bayesian approach represents an alternative to the approach via random experiments. However, when
estimating the a priori probabilities, one has to rely on assumptions, which makes an objective justification of
the results problematic. In addition, it is not possible to infer the distribution pattern of populations if one
relativizes the random element⁴ and thus tolerates a hybrid distribution structure (cf. case 3) of the sample. So
the approach probably has more scientific-theoretical than practical value. This is fundamentally illustrated by
Chernoff and Moses (1959), Pratt et al. (1965), Bühlmann et al. (1967), de Groot (1970), LaValle (1970) and
Bortz (1984).
⁴ It should be noted that there are some doubts as to whether one can make valid statements at all on the basis
of random mechanisms, since chance can also be randomly non-random (cf. Urbach, 1985).
However, randomness does not only have to refer to the selection of elements; it can also be
transferred to their allocation to certain experimental conditions (the selection set A would only have to be
renamed allocation set A). Random assignment of subjects to treatment conditions is of crucial importance in
the context of permutation methods in cases where random selection does not exist or is not possible
(cf. Edgington, 1995).
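Random assignment and the permutation reasoning that builds on it can be sketched in a few lines. This is a generic illustration for two independent groups, not the procedure of the thesis itself: the scores are hypothetical, the test statistic (difference in means, one-sided) is an assumption, and `random_assignment` and `permutation_p_value` are names introduced here.

```python
import itertools
import random
import statistics

def random_assignment(subjects, n_treatment, rng):
    """Randomly allocate subjects to a treatment and a control group."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    return shuffled[:n_treatment], shuffled[n_treatment:]

def permutation_p_value(treatment, control):
    """Exact one-sided p-value for the mean difference over all allocations."""
    pooled = treatment + control
    k = len(treatment)
    observed = statistics.mean(treatment) - statistics.mean(control)
    count = total = 0
    for idx in itertools.combinations(range(len(pooled)), k):
        chosen = set(idx)
        t = [pooled[i] for i in idx]
        c = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count allocations at least as extreme as the observed one.
        if statistics.mean(t) - statistics.mean(c) >= observed - 1e-12:
            count += 1
    return count / total

# Eight hypothetical subjects, allocated at random (allocation set A).
subjects = list(range(8))
group_t, group_c = random_assignment(subjects, 4, random.Random(7))
print("treatment:", group_t, "control:", group_c)

# Hypothetical memory scores observed after the experiment.
treatment_scores = [24, 27, 25, 29]
control_scores = [20, 22, 19, 23]
p_value = permutation_p_value(treatment_scores, control_scores)
print(f"one-sided p = {p_value:.4f}")
```

The reference distribution here consists of all 70 possible allocations of eight scores to two groups of four. It is justified by the random assignment alone and does not require a random sample from a population.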
References
Bohm, D. (1987). Wholeness and the implicate order. London: Routledge & Kegan Paul PLC.
Bortz, J. (1984). Lehrbuch der empirischen Forschung. Berlin: Springer.
Bortz, J. (1993). Statistik für Sozialwissenschaftler (4. Aufl.). Berlin: Springer.
Bühlmann, H., Löffel, H., Nievergelt, E. (1967). Einführung in die Theorie und Praxis der Entscheidung bei Unsicherheit.
Heidelberg: Springer.
Chernoff, H., Moses, L. E. (1959). Elementary decision theory. New York: Wiley.
De Groot, M. H. (1970). Optimal statistical decisions. New York: McGraw Hill.
Edgington, E. S. (1995). Randomization tests (3rd ed). New York: Marcel Dekker.
LaValle, I. H. (1970). An introduction to probability, decision and inference. New York: Holt, Rinehart and Winston.
Pratt, J. W., Raiffa, H., Schlaifer, R. (1965). Introduction to statistical decision theory. New York: McGraw Hill.
Shannon, C. E., Weaver, W. (1949). The Mathematical Theory of Communication. Urbana.
Urbach, P. (1985). Randomization and the design of experiments. Philosophy of Science. 52 (2), 256-273.