Chapter 4

THEORY BUILDING IN INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY

Jane Webster

and

William H. Starbuck

New York University

Pages 93-138 in C. L. Cooper and I. T. Robertson (eds.), International Review of Industrial and Organizational Psychology 1988; Wiley, 1988.

SUMMARY

I/O psychology has been progressing slowly. This slowness arises partly from a three-way imbalance: a lack of substantive consensus, insufficient use of theory to explain observations, and excessive confidence in induction from empirical evidence. I/O psychologists could accelerate progress by adopting and enforcing a substantive paradigm.

Staw (1984: 658) observed:

The micro side of organizational behavior historically has not been strong on theory. Organizational psychologists have been more concerned with research methodology, perhaps because of the emphasis upon measurement issues in personnel selection and evaluation. As an example of this methodological bent, the I/O Psychology division of the American Psychological Association, when confronted recently with the task of improving the field’s research, formulated the problem as one of deficiency in methodology rather than theory construction.... It is now time to provide equal consideration to theory formulation.

This chapter explores the state of theory in I/O psychology and micro-Organizational Behavior (OB).[1] The chapter argues that these fields have progressed very slowly, and that progress has occurred so slowly partly because of a three-way imbalance: a lack of theoretical consensus, inadequate attention to using theory to explain observations, and excessive confidence in induction from empirical evidence. As a physicist, J. W. N. Sullivan (1928; quoted by Weber, 1982, p. 54), remarked: ‘It is much easier to make measurements than to know exactly what you are measuring.’

Well-informed people hold widely divergent opinions about the centrality and influence of theory. Consider Dubin’s (1976, p. 23) observation that managers use theories as moral justifications, that managers may endorse job enlargement, for example, because it permits more complete delegation of responsibilities, raises morale and commitment, induces greater effort, and implies a moral imperative to seek enlarged jobs and increased responsibilities. Have these consequences anything to do with theory? Job enlargement is not a theory, but a category of action. Not only do these actions produce diverse consequences, but the value of any single consequence is frequently debatable. Is it exploitative to elicit greater effort, because workers contribute more but receive no more pay? Or is it efficient, because workers contribute more but receive no more pay? Or is it humane, because workers enjoy their jobs more? Or is it uplifting, because work is virtuous and laziness despicable? Nothing compels managers to use job enlargement; they adopt it voluntarily. Theory only describes the probable consequences if they do use it. Furthermore, there are numerous theories about work redesigns such as job enlargement and job enrichment, so managers can choose the theories they prefer to espouse. Some theories emphasize the consequences of work redesign for job satisfaction; others highlight its consequences for efficiency, and still others its effects on accident rates or workers’ health (Campion and Thayer, 1985).

We hold that theories do make a difference, to non-scientists as well as to scientists, and that theories often have powerful effects. Theories are not neutral descriptions of facts. Both prospective and retrospective theories shape facts. Indeed, the consequences of actions may depend more strongly on the actors’ theories than on the overt actions. King’s (1974) field experiment illustrates this point. On the surface, the study aimed at comparing two types of job redesign: a company enlarged jobs in two plants, and began rotating jobs in two similar plants. But the study had a 2 x 2 design. Their boss told two of the plant managers that the redesigns ought to raise productivity but have no effects on industrial relations; and he told the other two plant managers that the redesigns ought to improve industrial relations and have no effects on productivity. The observed changes in productivity and absenteeism matched these predictions: productivity rose significantly while absenteeism remained stable in those two plants, and absenteeism dropped while productivity remained constant in the other two plants. Job rotation and job enlargement, however, yielded the same levels of productivity and absenteeism. Thus, the differences in actual ways of working produced no differences in productivity or absenteeism, but the different rationales did induce different outcomes.

Theories shape facts by guiding thinking. They tell people what to expect, where to look, what to ignore, what actions are feasible, what values to hold. These expectations and beliefs then influence actions and retrospective interpretations, perhaps unconsciously (Rosenthal, 1966). Kuhn (1970) argued that scientific collectivities develop consensus around coherent theoretical positions, which he called paradigms. Because paradigms serve as frameworks for interpreting evidence, for legitimating findings, and for deciding what studies to conduct, they steer research into paradigm-confirming channels, and so they reinforce themselves and remain stable for long periods. For instance, in 1909, Finkelstein reported in his doctoral dissertation that he had synthesized benzocyclobutene (Jones, 1966). Finkelstein’s dissertation was rejected for publication because chemists believed, at that time, that such chemicals could not exist, and so his finding had to be erroneous. Theorists elaborated the reasons for the impossibility of these chemicals for another 46 years, until Finkelstein’s thesis was accidentally discovered in 1955.

Although various observers have argued that the physical sciences have stronger consensus about paradigms than do the social sciences, the social science findings may be even more strongly influenced by expectations and beliefs. Because these expectations and beliefs do not win consensus, they may amplify the inconsistencies across studies. Among others, Chapman and Chapman (1969), Mahoney and DeMonbreun (1977) and Snyder (1981) have presented evidence that people holding prior beliefs emphasize confirmatory strategies of investigation, they rarely use disconfirmatory strategies, and they discount disconfirming observations: these confirmatory strategies turn theories into self-fulfilling prophecies in situations where investigators’ behaviors can elicit diverse responses or where investigators can interpret their observations in many ways (Tweney et al., 1981). Mahoney (1977) demonstrated that journal reviewers tend strongly to recommend publication of manuscripts that confirm their beliefs and to give these manuscripts high ratings for methodology, whereas reviewers tend strongly to recommend rejection of manuscripts that contradict their beliefs and to give these manuscripts low ratings for methodology. Faust (1984) extrapolated these ideas to theory evaluation and to review articles, such as those in this volume, but he did not take the obvious next step of gathering data to confirm his hypotheses.

Thus, theories may have negative consequences. Ineffective theories sustain themselves and tend to stabilize a science in a state of incompetence, just as effective theories may suggest insightful experiments that make a science more powerful. Theories about which scientists disagree foster divergent findings and incomparable studies that claim to be comparable. So scientists could be better off with no theories at all than with theories that lead them nowhere or in incompatible directions. On the other hand, scientists may have to reach consensus on some base-line theoretical propositions in order to evaluate adequately the effectiveness of these base-line propositions and the effectiveness of newer theories that build on these propositions. Consensus on base-line theoretical propositions, even ones that are somewhat erroneous, may also be an essential prerequisite to the accumulation of knowledge because such consensus leads scientists to view their studies in a communal frame of reference (Kuhn, 1970). Thus, it is an interesting question whether the existing theories or the existing degrees of theoretical consensus have been aiding or impeding scientific progress in I/O psychology.

Consequently and paradoxically, this chapter addresses theory building empirically, and the chapter’s outline matches the sequence in which we pose questions and seek answers for them.

First we ask: How much progress has occurred in I/O psychology? If theories are becoming more and more effective over time, they should explain higher and higher percentages of variance. Observing the effect sizes for some major variables, we surmise that I/O theories have not been improving.

Second, hunting an explanation for no progress or negative progress, we examine indicators of paradigm consensus. To our surprise, I/O psychology does not look so different from chemistry and physics, fields that are perceived as having high paradigm consensus and as making rapid progress. However, physical science paradigms embrace both substance and methodology, whereas I/O psychology paradigms strongly emphasize methodology and pay little attention to substance.

Third, we hypothesize that I/O psychology’s methodological emphasis is a response to a real problem, the problem of detecting meaningful research findings against a background of small, theoretically meaningless, but statistically significant relationships. Correlations published in the Journal of Applied Psychology seem to support this conjecture. Thus, I/O psychologists may be de-emphasizing substance because they do not trust their inferences from empirical evidence.

In the final section, we propose that I/O psychologists accelerate the field’s progress by adopting and enforcing a substantive paradigm. We believe that I/O psychologists could embrace some base-line theoretical propositions that are as sound as Newton’s laws, and using base-line propositions would project findings into shared perceptual frameworks that would reinforce the collective nature of research.

PROGRESS IN EXPLAINING VARIANCE

Theories may be evaluated in many ways. Webb (1961) said good theories exhibit knowledge, skepticism and generalizability. Lave and March (1975) said good theories are metaphors that embody truth, beauty and justice; whereas unattractive theories are inaccurate, immoral or unaesthetic. Daft and Wiginton (1979) said that influential theories provide metaphors, images and concepts that shape scientists’ definitions of their worlds. McGuire (1983) noted that people may appraise theories according to internal criteria, such as their logical consistency, or according to external criteria, such as the statuses of their authors. Miner (1984) tried to rate theories’ scientific validity and usefulness in application. Landy and Vasey (1984) pointed out tradeoffs between parsimony and elegance and between literal and figurative modeling.

Effect sizes measure theories’ effectiveness in explaining empirical observations or predicting them. Nelson et al. (1986) found that psychologists’ confidence in research depends primarily on significance levels and secondarily on effect sizes. But investigators can directly control significance levels by making more or fewer observations, so effect sizes afford more robust measures of effectiveness.
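The dependence of significance levels on the number of observations can be illustrated with a short calculation. The following is a minimal sketch, not part of the original analysis: it assumes a hypothetical correlation of r = .10 and uses the Fisher z approximation, chosen here only because it needs nothing beyond the Python standard library. The function name and the sample sizes are illustrative assumptions.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p-value for H0: rho = 0.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal deviate. This is an approximation, not an exact
    t-test.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    r = 0.10  # hypothetical effect size; r squared is only .01
    for n in (30, 100, 1000):  # hypothetical sample sizes
        p = fisher_z_p_value(r, n)
        print(f"n = {n:4d}  r = {r:.2f}  p = {p:.4f}")
```

Under these assumptions the effect size stays fixed at r = .10 while the p-value falls from well above .05 with 30 observations to well below .05 with 1000, which is why the effect size is the more robust indicator.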

According to the usual assumptions about empirical research, theoretical progress should produce rising effect sizes: for example, correlations should get larger and larger over time. Kaplan (1963: 351-5) identified eight ways in which explanations may be open to further development; his arguments imply that theories can be improved by:

1. taking account of more determining factors,

2. spelling out the conditions under which theories should be true,

3. making theories more accurate by refining measures or by specifying more precisely the relations among variables,

4. decomposing general categories into more precise subclasses, or aggregating complementary subclasses into general categories,

5. extending theories to more instances,

6. building up evidence for or against theories’ assumptions or predictions,

7. embedding theories in theoretical hierarchies, and

8. augmenting theories with explanations for other variables or situations.

The first four of these actions should increase effect sizes if the theories are fundamentally correct, but not if the theories are incorrect. Unless it is combined with the first four actions, action (5) might decrease effect sizes even for approximately correct theories. Action (6) could produce low effect sizes if theories are incorrect.

Social scientists commonly use coefficients of determination, r², to measure effect sizes. Some methodologists have been advocating that the absolute value of r affords a more dependable metric than r² in some instances (Ozer, 1985; Nelson et al., 1986). For the purposes of this chapter, these distinctions make no difference because r² and the absolute value of r increase and decrease together. We do, however, want to recognize the differences between positive and negative relationships, so we use r.

Of the nine effect measures we use, six are bivariate correlations. One can argue that, to capture the total import of a stream of research, one has to examine the simultaneous effects of several independent variables. Various researchers have advocated multivariate research as a solution to low correlations (Tinsley and Heesacker, 1984; Hackett and Guion, 1985). However, in practice, multivariate research in I/O psychology has not fulfilled these expectations, and the articles reviewing I/O research have not noted any dramatic results from the use of multivariate analyses. For instance, McEvoy and Cascio (1985) observed that the effect sizes for turnover models have remained small despite research incorporating many more variables. One reason is that investigators deal with simultaneous effects in more than one way: they can observe several independent variables that are varying freely; they can control for moderating variables statistically; and they can control for contingency variables by selecting sites or subjects or situations. It is far from obvious that multivariate correlations obtained in uncontrolled situations should be higher than bivariate correlations obtained in controlled situations. Indeed, the rather small gains yielded by multivariate analyses suggest that careful selection and control of sites or subjects or situations may be much more important than we have generally recognized.

Scientists’ own characteristics afford another reason for measuring progress with bivariate correlations. To be useful, scientific explanations have to be understandable by scientists; and scientists nearly always describe their findings in bivariate terms, or occasionally trivariate terms. Even those scientists who advocate multivariate analyses most fervently fall back upon bivariate and trivariate interpretations when they try to explain what their analyses really mean. This brings to mind a practical lesson that Box and Draper (1969) extracted from their efforts to use experiments to discover more effective ways to run factories: Box and Draper concluded that practical experiments should manipulate only two or three variables at a time because the people who interpret the experimental findings have too much difficulty making sense of interactions among four or more variables. Speaking directly of the inferences drawn during scientific research, Faust (1984) too pointed out the difficulties that scientists have in understanding four-way interactions (Meehl, 1954; Goldberg, 1970). He noted that the great theoretical contributions to the physical sciences have been distinguished by their parsimony and simplicity rather than by their articulation of complexity. Thus, creating theories that psychologists themselves will find satisfying probably requires the finding of strong relationships among two or three variables.

To track progress in I/O theory building, we gathered data on effect sizes for five variables that I/O psychologists have often studied. Staw (1984) identified four heavily researched variables: job satisfaction, absenteeism, turnover and job performance. I/O psychologists also regard leadership as an important topic: three of the five annual reviews of organizational behavior have included it (Mitchell, 1979; Schneider, 1985; House and Singh, 1987).

Other evidence supports the centrality of these five variables for I/O psychologists. De Meuse (1986) made a census of dependent variables in I/O psychology, and identified job satisfaction as one of the most frequently used measures; it had been the focus of over 3000 studies by 1976 (Locke, 1976). Psychologists have correlated job satisfaction with numerous variables: Here, we examine its correlations with job performance and with absenteeism. Researchers have made job performance I/O psychology’s most important dependent variable, and absenteeism has attracted research attention because of its costs (Hackett and Guion, 1985). We look at correlations of job satisfaction with absenteeism because researchers have viewed absenteeism as a consequence of employees’ negative attitudes (Staw, 1984).

Investigators have produced over 1000 studies on turnover (Steers and Mowday, 1981). Recent research falls into one of two categories: turnover as the dependent variable when assessing a new work procedure, and correlations between turnover and stated intentions to quit (Staw, 1984).

Although researchers have correlated job performance with job satisfaction for over fifty years, more consistent performance differences have emerged in studies of behavior modification and goal setting (Staw, 1984). Miner (1984) surveyed organizational scientists, who nominated behavior modification and goal setting as two of the most respected theories in the field. Although these two theories overlap (Locke, 1977; Miner, 1980), they do have somewhat different traditions, and so we present them separately here.

Next to job performance, investigators have studied leadership most often (Mitchell, 1979; De Meuse, 1986). Leadership research may be divided roughly into two groups: theories about the causes of leaders’ behaviors, and theories about contingencies influencing the effectiveness of leadership styles. Research outside these two groups has generated too few studies for us to trace effect sizes over time (Van Fleet and Yukl, 1986).

Many years ago, psychologists seeking ways to identify effective leaders focused their research on inherent traits. This work, however, turned up very weak relationships, and no set of traits correlated consistently with leaders’ effectiveness. Traits also offended Americans’ ideology espousing equality of opportunity (Van Fleet and Yukl, 1986). Criticisms of trait approaches directed research towards contingency theories (Lord et al., 1986). But these studies too turned up very weak relationships, so renewed interest in traits has surfaced (Kenny and Zaccaro, 1983; Schneider, 1985). As an example of the trait theories, we examine the correlations of intelligence with perceptions of leadership, because these have demonstrated the highest and most consistent relationships.

It is impossible to summarize the effect sizes of contingency theories of leadership in general. First, even though leadership theorists have proposed many contingency theories, little research has resulted (Schriesheim and Kerr, 1977), possibly because some of the contingency theories may be too unclear to suggest definitive empirical studies (Van Fleet and Yukl, 1986). Second, different theories emphasize different dependent variables (Campbell, 1977; Schriesheim and Kerr, 1977; Bass, 1981). Therefore, one must focus on a particular contingency theory. We examine Fiedler’s (1967) theory because Miner (1984) reported that organizational scientists respect it highly.

Sources

A manual search of thirteen journals[2] turned up recent review articles concerning the five variables of interest; Borgen et al. (1985) identified several of these review articles as exemplary works. We took data from articles that reported both the effect sizes and the publication dates of individual studies. Since recent review articles did not cover older studies well, we supplemented these data by examining older reviews, in books as well as journals. In all, data came from the twelve sources listed in Table 1; these articles reviewed 261 studies.

Table 1 – Review Article Sources
Job satisfaction	Iaffaldano and Muchinsky (1985)
	Vroom (1964)
	Brayfield and Crockett (1955)

Absenteeism	Hackett and Guion (1985)
	Vroom (1964)
	Brayfield and Crockett (1955)

Turnover	McEvoy and Cascio (1985)
	Steel and Ovalle (1984)

Job Performance	Hopkins and Sears (1982)
	Locke et al. (1980)

Leadership	Lord et al. (1986)
	Peters et al. (1985)
	Mann (1959)
	Stogdill (1948)

Measures

Each research study is represented by a single measure of effect: for a study that measured the concepts in more than one way, we averaged the reported effect sizes.

To trace changes in effect sizes over time, we divided time into three equal periods. For instance, for studies from 1944 to 1983, we compare the effect sizes for 1944-57, 1958-70 and 1971-83.
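The two steps just described (one averaged effect per study, then three equal time periods) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure actually used for the chapter: the function names, the integer rule for assigning years to periods, and the sample data are all hypothetical. The rule was chosen because it reproduces the 1944-57, 1958-70 and 1971-83 split given above.

```python
from statistics import mean


def period_index(year, first_year, last_year, n_periods=3):
    """Assign a publication year to one of n_periods near-equal periods."""
    span = last_year - first_year + 1
    return min((year - first_year) * n_periods // span, n_periods - 1)


def summarize_by_period(studies, first_year, last_year, n_periods=3):
    """Return (minimum, maximum, average) effect size for each period.

    studies is a list of (year, [effect sizes]) pairs. A study reporting
    several effect sizes is first reduced to their average, so each study
    contributes a single measure. Empty periods yield None.
    """
    bins = [[] for _ in range(n_periods)]
    for year, effects in studies:
        idx = period_index(year, first_year, last_year, n_periods)
        bins[idx].append(mean(effects))
    return [(min(b), max(b), mean(b)) if b else None for b in bins]


if __name__ == "__main__":
    # Hypothetical studies, invented purely for illustration.
    studies = [
        (1944, [0.30, 0.10]),
        (1957, [0.40]),
        (1960, [0.25]),
        (1983, [0.10, 0.20]),
    ]
    for i, row in enumerate(summarize_by_period(studies, 1944, 1983)):
        print(i, row)
```

Averaging within a study before pooling keeps a study that reports many measures from outweighing a study that reports only one.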

Results

Figures 1-4 present the minimum, maximum and average effect sizes for the five variables of interest. Three figures (1(a), 3(b) and 4) seem to show that no progress has occurred over time; and four figures (1(b), 2(a), 2(b) and 3(a)) seem to indicate that effect sizes have gradually declined toward zero over time. The largest of these correlations is only .22 in the most recent time period, so all of these effects account for less than five per cent of the variance.

Moreover, four of these relationships (2(a), 2(b), 3(a) and 3(b)) probably incorporate Hawthorne effects: They measure the effects of interventions. Because all interventions should yield some effects, the differential impacts of specific interventions would be less than these effect measures suggest. That is, the effects of behavior modification, for example, should not be compared with inaction, but compared with those of an alternative intervention, such as goal setting.

Figure 2(c) is the only one suggesting significant progress. Almost all of this progress, however, occurred between the first two time periods: Because only one study was conducted during the first of these periods, the apparent progress might be no more than a statement about the characteristics of that single study. This relationship is also stronger than the others, although not strong enough to suggest a close causal relationship: The average correlation in the most recent time period is .40. What this correlation says is that some of the people who say in private that they intend to quit actually do quit.

Progress with respect to Fiedler’s contingency theory of leadership is not graphed. Peters et al. (1985) computed the average correlations (corrected for sampling error) of leadership effectiveness with the predictions of this theory. The absolute values of the correlations averaged .38 for the studies from which Fiedler derived this theory (approximately 1954-65); but for the studies conducted to validate this theory (approximately 1966-78), the absolute values of the correlations averaged .26. Thus, these correlations too have declined toward zero over time.
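For readers who prefer to think in terms of variance explained, the correlations quoted in this section can be converted to coefficients of determination by squaring them. The snippet below is only a worked illustration of that arithmetic; the function name and the labels are ours, and the correlations are the ones reported above.

```python
def variance_explained(r: float) -> float:
    """Coefficient of determination for a bivariate correlation r."""
    return r * r


if __name__ == "__main__":
    reported = [
        ("largest of the other correlations, latest period", 0.22),
        ("intentions to quit and turnover, latest period", 0.40),
        ("Fiedler, derivation studies", 0.38),
        ("Fiedler, validation studies", 0.26),
    ]
    for label, r in reported:
        print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.3f}")
```

Squaring shows that even the strongest relationship, the correlation of .40, leaves well over four-fifths of the variance unexplained.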

I/O psychologists have often argued that effects do not have to be absolutely large in order to produce meaningful economic consequences (Zedeck and Cascio, 1984; Schneider, 1985). For example, goal setting produced an average performance improvement of 21.6 per cent in the seventeen studies conducted from 1969 to 1979. If performance has a high economic value and goal setting costs very little, then goal setting would be well worth doing on the average. And because the smallest performance improvement was 2 per cent, the risk that goal setting would actually reduce performance seems very low.

This chapter, however, concerns theoretical development; and so the economic benefits of relations take secondary positions to identifying controllable moderators, to clarifying causal links, and to increasing effect sizes. In terms of theoretical development, it is striking that none of these effect sizes rose noticeably after the first years. This may have happened for any of five reasons, or more likely a combination of them:

(a) Researchers may be clinging to incorrect theories despite disconfirming evidence (Staw, 1976). This would be more likely to happen where studies’ findings can be interpreted in diverse ways. Absolutely small correlations nurture such equivocality, by making it appear that random noise dominates any systematic relationships and that undiscovered or uninteresting influences exert much more effect than the known ones.

(b) Researchers may be continuing to elaborate traditional methods of information gathering after these stop generating additional knowledge. For example, researchers developed very good leadership questionnaires during the early 1950s. Perhaps these early questionnaires picked up all the information about leadership that can be gathered via questionnaires. Thus, subsequent questionnaires may not have represented robust improvements; they may merely have mistaken sampling variations for generalities.

(c) Most studies may fail to take advantage of the knowledge produced by the very best studies. As a sole explanation, this would be unlikely even in a world that does not reward exact replication, because research journals receive wide distribution and researchers can easily read reports of others’ projects. However, retrospective interpretations of random variations may obscure real knowledge in clouds of ad hoc rationalizations, so the consumers of research may have difficulty distinguishing real knowledge from false.

Because we wanted to examine as many studies as possible and studies of several kinds of relationships, we did not attempt to evaluate the methodological qualities of studies. Thus, we are using time as an implicit measure of improvement in methodology. But time may be a poor indicator of methodological quality if new studies do not learn much from the best studies. Reviewing studies of the relationship between formal planning and profitability, Starbuck (1985) remarked that the lowest correlations came in the studies that assessed planning and profitability most carefully and that obtained data from the most representative samples of firms.

(d) Those studies obtaining the maximum effect sizes may do so for idiosyncratic or unknown reasons, and thus produce no generalizable knowledge. Researchers who provide too little information about studied sites, subjects, or situations make it difficult for others to build upon their findings (Orwin and Cordray, 1985); several authors have remarked that many studies report too little information to support meta-analyses (Steel and Ovalle, 1984; Iaffaldano and Muchinsky, 1985; Scott and Taylor, 1985). The tendencies of people, including scientists, to use confirmatory strategies mean that they attribute as much of the observed phenomena as possible to the relationships they expect to see (Snyder, 1981; Faust, 1984; Klayman and Ha, 1987). Very few studies report correlations above .5, so almost all studies leave much scope for misattribution and misinterpretation.

(e) Humans’ characteristics and behaviors may actually change faster than psychologists’ theories or measures improve. Stagner (1982) argued that the context of I/O psychology has changed considerably over the years: the economy has shifted from production to service industries, jobs have evolved from heavy labor to cognitive functions, employees’ education levels have risen, and legal requirements have multiplied and changed, especially with respect to discrimination. For instance, Haire et al. (1966) found that managers’ years of education correlate with their ideas about proper leadership, and education alters subordinates’ concepts of proper leadership (Dreeben, 1968; Kunda, 1987). In the US, median educational levels have risen considerably, from 9.3 years in 1950 to 12.6 years in 1985 (Bureau of the Census, 1987). Haire et al. also attributed 25 per cent of the variance in managers’ leadership beliefs to national differences: so, as people move around, either between countries or within a large country, they break down the differences between regions and create new beliefs that intermingle beliefs that used to be distinct. Cummings and Schmidt (1972) conjectured plausibly that beliefs about proper leadership vary with industrialization; thus, the ongoing industrialization of the American south-east and southwest and the concomitant deindustrialization of the north-east are altering Americans’ responses to leadership questionnaires.

Whatever the reasons, the theories of I/O psychology explain very small fractions of observed phenomena, I/O psychology is making little positive progress, and it may actually be making some negative progress. Are these the kinds of results that science is supposed to produce?

PARADIGM CONSENSUS

Kuhn (1970) characterized scientific progress as a sequence of cycles, in which occasional brief spurts of innovation disrupt long periods of gradual incremental development. During the periods of incremental development, researchers employ generally accepted methods to explore the implications of widely accepted theories. The researchers supposedly see themselves as contributing small but lasting increments to accumulated stores of well-founded knowledge; they choose their fields because they accept the existing methods, substantive beliefs and values, and consequently they find satisfaction in incremental development within the existing frames of reference. Kuhn used the term paradigm to denote one of the models that guide such incremental developments. Paradigms, he (1970, p. 10) said, provide ‘models from which spring particular coherent traditions of scientific research’.

Thus, Kuhn defined paradigms, not by their common properties, but by their common effects. His book actually talks about 22 different kinds of paradigm (Masterman, 1970), which Kuhn placed into two broad categories: (a) a constellation of beliefs, values and techniques shared by a specific scientific community; and (b) an example of effective problem-solving that becomes an object of imitation by a specific scientific community.

I/O psychologists have traditionally focused on a particular set of variables: the nucleus of this set would be those examined in the previous section, namely job satisfaction, absenteeism, turnover, job performance and leadership. Also, we believe that substantial majorities of I/O psychologists would agree with some base-line propositions about human behavior. However, Campbell et al. (1982) found a lack of consensus among American I/O psychologists concerning substantive research goals. They asked respondents to suggest ‘the major research needs that should occupy us during the next 10-15 years’ (p. 155): 105 respondents contributed 146 suggestions, of which 106 were unique. Campbell et al. (1982, p. 71) inferred: ‘The field does not have very well worked out ideas about what it wants to do. There was relatively little consensus about the relative importance of substantive issues.’

Shared Beliefs, Values and Techniques

I/O psychologists do seem to have a type (a) paradigm, one of shared beliefs, values, and techniques, but it would seem to be a methodological paradigm rather than a substantive one. For instance, Watkins et al.’s (1986) analysis of the 1984-85 citations in three I/O journals revealed that a methodologist, Frank L. Schmidt, has been by far the most cited author. In this methodological orientation, I/O psychology fits a general pattern: numerous authors have remarked on psychology’s methodological emphasis (Deese, 1972; Koch, 1981; Sanford, 1982). For instance, Brackbill and Korten (1970, p. 939) observed that psychological ‘reviewers tend to accept studies that are methodologically sound but uninteresting, while rejecting research problems that are of significance for science or society but for which faultless methodology can only be approximated.’ Bakan (1974) called psychology ‘methodolatrous’. Contrasting psychology’s development with that of physics, Kendler (1984, p. 9) argued that ‘Psychological revolutions have been primarily methodological in nature.’ Shames (1987, p. 264) characterized psychology as ‘the most fastidiously committed, among the scientific disciplines, to a socially dominated disciplinary matrix which is almost exclusively centred on method.’

I/O psychologists not only emphasize methodology, they exhibit strong consensus about methodology. Specifically, I/O psychologists speak and act as if they believe they should use questionnaires, emphasize statistical hypothesis tests, and raise the validity and reliability of measures. Among others, Campbell (1982, p. 699) expressed the opinion that I/O psychologists have been relying too much on ‘the self-report questionnaire, statistical hypothesis testing, and multivariate analytic methods at the expense of problem generation and sound measurement’. As Campbell implied, talk about reliability and especially validity tends to be lip-service: almost always, measurements of reliability are self-reflexive facades and no direct means even exist to assess validity. I/O psychologists are so enamored of statistical hypothesis tests that they often make them when they are inappropriate, for instance when the data are not samples but entire sub-populations, such as all the employees of one firm, or all of the members of two departments. Webb et al. (1966) deplored an overdependence on interviews and questionnaires, but I/O psychologists use interviews much less often than questionnaires (Stone, 1978).

An emphasis on methodology also characterizes the social sciences at large. Garvey et al. (1970) discovered that editorial processes in the social sciences place greater emphasis on statistical procedures and on methodology in general than do those in the physical sciences; and Lindsey and Lindsey (1978) factor analysed social science editors’ criteria for evaluating manuscripts and found that a quantitative-methodological orientation arose as the first factor. Yet, other social sciences may place somewhat less emphasis on methodology than does I/O psychology. For instance, Kerr et al. (1977) found little evidence that methodological criteria strongly influence the editorial decisions by management and social science journals. According to Kerr et al., the most influential methodological criterion is statistical insignificance, and the editors of three psychological journals express much stronger negative reactions to insignificant findings than do editors of other journals.

Mitchell et al. (1985) surveyed 139 members of the editorial boards of five journals that publish work related to organizational behavior, and received responses from 99 editors. Table 2 summarizes some of these editors’ responses.[3] The average editor said that ‘importance’ received more weight than other criteria; that methodology and logic were given nearly equal weights, and that presentation carried much less weight. When asked to assign weights among three aspects of ‘importance’, most editors said that scientific contribution received much more weight than practical utility or readers’ probable interest in the topic. Also, they assigned nearly equal weights among three aspects of methodology, but gave somewhat more weight to design.

Table 2 compares the editors of two specialized I/O journals, Journal of Applied Psychology (JAP) and Organizational Behavior and Human Decision Processes (OBHDP), with the editors of three more general management journals: Academy of Management Journal (AMJ), Academy of Management Review (AMR) and Administrative Science Quarterly (ASQ). Contrary to our expectations, the average editor of the two I/O journals said that he or she allotted more weight to ‘importance’ and less weight to methodology than did the average editor of the three management journals. It did not surprise us that the average editor of the I/O journals gave less weight to presentation than did the average editor of the management journals. Among aspects of methodology, the average I/O editor placed slightly more weight on design and less on measurement than did the average management editor. When assessing ‘importance’, the average I/O editor said that he or she gave distinctly less weight to readers’ probable interest in a topic and more weight to practical utility than did the average management editor. Thus, the editors of I/O journals may be using practical utility to make up for I/O psychologists’ lack of consensus concerning substantive research goals: if readers disagree about what is interesting, it makes no sense to take account of their preferences (Campbell et al., 1982).

Table 2 – Review Article Sources

Relative weights among four general criteria
	All five journals	JAP and OBHDP	AMJ, AMR, and ASQ
‘Importance’	35	38	34
Methodology	26	25	27
Logic	24	24	24
Presentation	15	13	16

Relative weights among three aspects of importance
	All five journals	JAP and OBHDP	AMJ, AMR, and ASQ
Scientific contribution	53	54	53
Practical utility	28	31	26
Readers’ interest in topic	19	14	21

Relative weights among three aspects of methodology
	All five journals	JAP and OBHDP	AMJ, AMR, and ASQ
Design	38	39	37
Measurement	31	30	32
Analysis	31	31	31

Editors’ stated priority of ‘importance’ over methodology contrasts with the widespread perception that psychology journals emphasize methodology at the expense of substantive importance. Does this contrast imply that the actual behaviors of journal editors diverge from their espoused values? Not necessarily. If nearly all of the manuscripts submitted to journals use accepted methods, editors would have little need to emphasize methodology. And if, like I/O psychologists in general, editors disagree about the substantive goals of I/O research, editors’ efforts to emphasize ‘importance’ would work at cross-purposes and have little net effect. Furthermore, editors would have restricted opportunities to express their opinions about what constitutes scientific contribution or practical utility if most of the submitted manuscripts pursue traditional topics and few manuscripts actually address ‘research problems that are of significance for science or society’.

Objects of Imitation

I/O psychology may also have a few methodological and substantive paradigms of type (b), examples that become objects of imitation. For instance, Griffin (1987, pp. 82-3) observed:

The [Hackman and Oldham] job characteristics theory was one of the most widely studied and debated models in the entire field during the late 1970s. Perhaps the reasons behind its widespread popularity are that it provided an academically sound model, a packaged and easily used diagnostic instrument, a set of practitioner-oriented implementation guidelines, and an initial body of empirical support, all within a relatively narrow span of time. Interpretations of the empirical research pertaining to the theory have ranged from inferring positive to mixed to little support for its validity. (References omitted.)

Watkins et al. (1986) too found evidence of interest in Hackman and Oldham’s (1980) job-characteristics theory: five of the twelve articles that were most frequently cited by I/O psychologists during 1984-85 were writings about this theory, including Roberts and Glick’s (1981) critique of its validity. Although its validity evokes controversy, Hackman and Oldham’s theory seems to be the most prominent current model for imitation. As well, the citation frequencies obtained by Watkins et al. (1986), together with nominations of important theories collected by Miner (1984), suggest that two additional theories attract considerable admiration: Katz and Kahn’s (1978) open-systems theory and Locke’s (1968) goal-setting theory. It is hard to see what is common among these three theories that would explain their roles as paradigms; open-systems theory, in particular, is much less operational than job-characteristics theory, and it is more a point of view than a set of propositions that could be confirmed or disconfirmed.

To evaluate more concretely the paradigm consensus among I/O psychologists, we obtained several indicators that others have claimed relate to paradigm consensus.

Measures

As indicators of paradigm consensus, investigators have used: the ages of references, the percentages of references to the same journal, the numbers of references per article, and the rejection rates of journals.

Kuhn proposed that paradigm consensus can be evaluated through literature references. He hypothesized that during normal-science periods, references focus upon older, seminal works; and so the numbers and types of references indicate connectedness to previous research (Moravcsik and Murugesan, 1975). First, in a field with high paradigm consensus, writers should cite the key works forming the basis for that field (Small, 1980). Conversely, a field with a high proportion of recent references exhibits a high degree of updating, and so has little paradigm consensus. One measure of this concept is the Citing Half-Life, which shows the median age of the references in a journal. Second, referencing to the same journal should reflect an interaction with research in the same domain, so higher referencing to the same journal should imply higher paradigm consensus. Third, since references reflect awareness of previous research, a field with high paradigm consensus should have a high average number of references per article (Summers, 1979).
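The three reference-based indicators just described can be illustrated with a minimal sketch. It assumes that a citing half-life is simply the median age of a journal's references in a given year, as stated above; the function names and all of the data are hypothetical and are not drawn from the SSCI Journal Citation Reports.

```python
from statistics import mean, median


def citing_half_life(publication_year, cited_years):
    """Median age, in years, of the references cited in one journal-year."""
    ages = [publication_year - year for year in cited_years]
    return median(ages)


def same_journal_share(cited_journals, citing_journal):
    """Percentage of references that point back to the citing journal."""
    same = sum(1 for journal in cited_journals if journal == citing_journal)
    return 100.0 * same / len(cited_journals)


def references_per_article(reference_counts):
    """Average number of references per article."""
    return mean(reference_counts)


# Hypothetical reference data for one journal in 1985 (illustration only).
cited_years = [1984, 1983, 1981, 1980, 1978, 1975, 1970, 1962]
cited_journals = ["JAP", "AMJ", "JAP", "ASQ", "OBHDP", "AMR", "ASQ", "Other"]
reference_counts = [22, 31, 27, 40]

half_life = citing_half_life(1985, cited_years)
same_share = same_journal_share(cited_journals, "JAP")
average_refs = references_per_article(reference_counts)

print(half_life, same_share, average_refs)
```

Under the arguments summarized above, a longer half-life, a larger same-journal share, and more references per article would each be read as a sign of higher paradigm consensus.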

Journals constitute the accepted communication networks for transmitting knowledge in psychology (Price, 1970; Pinski and Narin, 1979), and high paradigm consensus means agreement about what research deserves publication. Zuckerman and Merton (1971) said that the humanities demonstrate their pre-paradigm states through very high rejection rates by journals, whereas the social sciences exhibit their low paradigm consensus through high rejection rates, and the physical sciences show their high paradigm consensus through low rejection rates. That is, paradigm consensus supposedly enables physical scientists to predict reliably whether their manuscripts are likely to be accepted for publication, and so they simply do not submit manuscripts that have little chance of publication.

Results

Based partly on work by Sharplin and Mabry (1985), Salancik (1986) identified 24 ‘organizational social science journals’. He divided these into five groups that cite one another frequently; the group that Salancik labeled Applied corresponds closely to I/O psychology.[4] Figure 5 compares these groups with respect to citing half-lives, references to the same journal, and numbers of references per article. The SSCI Journal Citation Reports (Social Science Citation Index, Garfield, 1981-84b) provided these three measures, although a few data were missing. We use four-year averages in order to smooth the effects of changing editorial policies and of the publication of seminal works (Blackburn and Mitchell, 1981). Figure 5 also includes comparable data for three fields that do not qualify as ‘organizational social science’: chemistry, physics, and management information systems (MIS).[5] Data concerning chemistry, physics and MIS hold special interest because they are generally believed to be making rapid progress; MIS may indeed be in a pre-paradigm state.

Seven of the eight groups of journals have average citing half-lives longer than five years, the figure that Line and Sandison (1974) proposed as signaling a high degree of updating. Only MIS journals have a citing half-life below five years; this field is both quite new and changing with extreme rapidity. I/O psychologists update references at the same pace as chemists and physicists, and only slightly faster than other psychologists and OB researchers.

Garfield (1972) found that referencing to the same journal averages around 20 per cent across diverse fields, and chemists and physicists match this average. All five groups of ‘organizational social science’ journals average below 20 per cent references to the same journal, so these social scientists do not focus publications in specific journals to the same degree as physical scientists, although the OB researchers come close to the physical-science pattern. The I/O psychologists, however, average less than 10 per cent references to the same journal, so they focus publications even less than most social scientists. MIS again has a much lower percentage than the other fields.

Years ago, Price (1965) and Line and Sandison (1974) said 15-20 references per article indicated strong interaction with previous research. Because the numbers of references have been increasing in all fields (Summers, 1979), strong interaction probably implies 25-35 references per article today. I/O psychologists use numbers of references that fall within this range, and that look much like the numbers for chemists, physicists and other psychologists.

We could not find rejection rates for management, organizations and sociology journals, but Jackson (1986) and the American Psychological Association (1986) published rejection rates for psychology journals during 1985. In that year, I/O psychology journals rejected 82.5 per cent of the manuscripts, which is near the 84.3 per cent average for other psychology journals. By contrast, Zuckerman and Merton (1971) reported that the rejection rates for chemistry and physics journals were 31 and 24 per cent respectively. Similarly, Garvey et al. (1970) observed higher rejection rates and higher rates of multiple rejections in the social sciences than in the physical sciences. However, these differences in rejection rates may reflect the funding and organization of research more than its quality or substance: American physical scientists receive much more financial support than do social scientists, most grants for physical science research go to rather large teams, and physical scientists normally replicate each other’s findings. Thus, most physical science research is evaluated in the process of awarding grants as well as in the editorial process, teams evaluate and revise their research reports internally before submitting them to journals, and researchers have incentives to replicate their own findings before they publish them. The conciseness of physical science articles reduces the costs of publishing them. Also, since the mid-1950s, physical science journals have asked authors to pay voluntary page charges, and authors have characteristically drawn upon research grants to pay these charges.

Peters and Ceci (1982) demonstrated for psychology in general that a lack of substantive consensus shows up in review criteria. They chose twelve articles that had been published in psychology journals, changed the authors’ names, and resubmitted the articles to the same journals that had published them: The resubmissions were evaluated by 38 reviewers. Eight per cent of the reviewers detected that they had received resubmissions, which terminated review of three of the articles. The remaining nine articles completed the review process, and eight of these were rejected. The reviewers stated mainly methodological reasons rather than substantive ones for rejecting articles, but Mahoney’s (1977) study suggests that reviewers use methodological reasons to justify rejection of manuscripts that violate the reviewers’ substantive beliefs.

Figure 6 graphs changes in four indicators from 1957 to 1984 for the Journal of Applied Psychology and, where possible, for other I/O psychology journals.[6] Two of the indicators in Figure 6 have remained quite constant; one indicator has risen noticeably; and one has dropped noticeably. According to the writers on paradigm consensus, all four of these indicators should rise as consensus increases. If these indicators actually do measure paradigm consensus, I/O psychology has not been developing distinctly more paradigm consensus over the last three decades.

Overall, the foregoing indicators imply that I/O psychology looks much like management, sociology, and other areas of psychology. In two dimensions, citing half-lives and references per article, I/O psychology also resembles chemistry and physics, fields that are usually upheld as examples of paradigm consensus (Lodahl and Gordon, 1972). I/O psychology differs from chemistry and physics in references to the same journal and in rejection rates, but the latter difference is partly, perhaps mainly, a result of government policy. Hedges (1987) found no substantial differences between physics and psychology in the consistency of results across studies, and Knorr-Cetina’s (1981) study suggests that research in chemistry incorporates the same kinds of uncertainties, arbitrary decisions and interpretations, social influences, and unproductive tangents that mark research in psychology.

Certainly, these indicators do not reveal dramatic differences between I/O psychology and chemistry or physics. However, these indicators make no distinctions between substantive and methodological paradigms. The writings on paradigms cite examples from the physical sciences that are substantive at least as often as they are methodological; that is, the examples focus upon Newton’s laws or phlogiston or evolution, as well as on titration or dropping objects from the Tower of Pisa. Though far from a representative sample, this suggests that physical scientists place more emphasis on substantive paradigms than I/O psychologists do; but since I/O psychology seems to be roughly as paradigmatic as chemistry and physics, this in turn suggests that I/O psychologists place more emphasis on methodological paradigms than physical scientists do.

Perhaps I/O psychologists tend to de-emphasize substantive paradigms and to emphasize methodological ones because they put strong emphasis on trying to discover relationships by induction. But can analyses of empirical evidence produce substantive paradigms where no such paradigms already exist?

INDUCING RELATIONSHIPS FROM OBSERVATIONS

Our colleague Art Brief has been heard to proclaim, ‘Everything correlates .1 with everything else.’ Suppose, for the sake of argument, that this were so. Then all observed correlations would deviate from the null hypothesis of a correlation of zero or less, and a sample of 272 or more would produce statistical significance at the .05 level with a one-tailed test. If researchers would make sure that their sample sizes exceed 272, all observed correlations would be significantly greater than zero. Psychologists would be inundated with small, but statistically significant, correlations.

In fact, psychologists could inundate themselves with small, statistically significant correlations even if Art Brief is wrong. By making enough observations, researchers can be certain of rejecting any point null hypothesis that defines an infinitesimal point on a continuum, such as the hypothesis that two sample means are exactly equal, as well as the hypothesis that a correlation is exactly zero. If a point hypothesis is not immediately rejected, the researcher need only gather more data. If an observed correlation is .04, a researcher would have to make 2402 observations to achieve significance at the .05 level with a two-tailed test; and if the observed correlation is .2, the researcher would need just 97 observations.
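The sample sizes quoted in the two preceding paragraphs can be checked with a short sketch. It assumes Fisher's z approximation, under which atanh(r) multiplied by the square root of (n - 3) is treated as a standard normal deviate; this is one common way to obtain such figures and is not necessarily the calculation that produced the numbers in the text. The function name min_sample_size is hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size(r, alpha=0.05, tails=2):
    """Smallest n at which an observed correlation r reaches significance.

    Assumes Fisher's z approximation: atanh(r) * sqrt(n - 3) is compared
    with the critical value of the standard normal distribution.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must lie strictly between 0 and 1 in absolute value")
    z_critical = NormalDist().inv_cdf(1 - alpha / tails)
    fisher_z = math.atanh(abs(r))
    return math.ceil((z_critical / fisher_z) ** 2 + 3)


print(min_sample_size(0.10, tails=1))  # 272
print(min_sample_size(0.04, tails=2))  # 2402
print(min_sample_size(0.20, tails=2))  # 97
```

Because the required n grows roughly with the inverse square of r, halving the observed correlation roughly quadruples the number of observations needed, which is why any nonzero correlation eventually becomes significant.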

Induction requires distinguishing meaningful relationships (signals) against an obscuring background of confounding relationships (noise). The background of weak and meaningless or substantively secondary correlations may not have an average value of zero and may have a variance greater than that assumed by statistical tests. Indeed, we hypothesize that the distributions of correlation coefficients that researchers actually encounter diverge quite a bit from the distributions assumed by statistical tests, and that the background relationships have roughly the same order of magnitude as the meaningful ones, partly because researchers’ nonrandom behaviors construct meaningless background relationships. These meaningless relationships make induction untrustworthy.

In many tasks, people can distinguish weak signals against rather strong background noise. The reason is that both the signals and the background noise match familiar patterns. People have trouble making such distinctions where signals and noise look much alike or where signals and noise have unfamiliar characteristics. Psychological research has the latter characteristics. The activity is called research because its outputs are unknown; and the signals and noise look a lot alike in that both have systematic components and both contain components that vary erratically. Therefore, researchers rely upon statistical techniques to make these distinctions. But these techniques assume: (a) that the so-called random errors really do cancel each other out so that their average values are close to zero; and (b) that the so-called random errors in different variables are uncorrelated. These are very strong assumptions because they presume that the researchers’ hypotheses encompass absolutely all of the systematic effects in the data, including effects that the researchers have not foreseen or measured. When these assumptions are not met, the statistical techniques tend to mistake noise for signal, and to attribute more importance to the researchers’ hypotheses than they deserve. It requires very little in the way of systematic ‘errors’ to distort or confound correlations as small as those I/O psychologists usually study.

One reason to expect confounding background relationships is that a few broad characteristics of people and social systems pervade psychological data. One such characteristic is intelligence: Intelligence correlates with many other characteristics and behaviors, such as leadership, job satisfaction, job performance, social class, income, education and geographic location during childhood. These correlates of intelligence tend to correlate with each other, independently of any direct causal relations among them, because of their common relation to intelligence. Other broad characteristics that correlate with many variables include sex, age, social class, education, group or organizational size, and geographic location.

A group of related organization-theory studies illustrates how broad characteristics may mislead researchers. In 1965, Woodward hypothesized that organizations employing different technologies adopt different structures, and she presented some data supporting this view. There followed many studies that found correlations between various measures of organization-level technology and measures of organizational structure. Researchers devoted considerable effort to refining the measures of technology and structure and to exploring variations on this general theme. After some fifteen years of research, Gerwin (1981) pulled together all the diverse findings: Although a variety of significant correlations had been observed, virtually all of them differed insignificantly from zero when viewed as partial correlations with organizational size controlled.
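A minimal sketch of the first-order partial correlation may clarify how controlling for a broad characteristic such as size can erase an apparently meaningful relationship. The formula is the standard one; the three input correlations are hypothetical values chosen for illustration and are not taken from Gerwin (1981) or from any of the studies he reviewed.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations (illustration only).
r_technology_structure = 0.30
r_technology_size = 0.60
r_structure_size = 0.50

controlled = partial_correlation(
    r_technology_structure, r_technology_size, r_structure_size
)
print(round(controlled, 3))
```

In this constructed case the zero-order correlation of .30 equals the product of the two correlations with size, so the partial correlation is essentially zero: the whole technology-structure relationship is attributable to their common relation to size.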

Researchers’ control is a second reason to expect confounding background relationships. Researchers often aggregate numerous items into composite variables; and the researchers themselves decide (possibly indirectly via a technique such as factor analysis) which items to include in a specific variable and what weights to give to different items. By including in two composite variables the same items or items that differ quite superficially from each other, researchers generate controllable but substantively meaningless correlations between the composites. Obviously, if two composite variables incorporate many very similar items, the two composites will be highly correlated. In a very real sense, the correlations between composite variables lie entirely within the researchers’ control; researchers can construct these composites such that they correlate strongly or weakly, and so the ‘observed’ correlations convey more information about the researchers’ beliefs than about the situations that the researchers claim to have observed.
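A small simulation can illustrate the point about overlapping composites. It is a sketch under strong assumptions that the text does not make: every item is an independent standard normal variable, items are equally weighted, and the two composites differ only in how many items they share. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def composite_correlation(n_cases, items_per_composite, shared_items, seed=0):
    """Correlate two composites that share some of their (unrelated) items."""
    rng = random.Random(seed)
    unique_items = items_per_composite - shared_items
    composite_a, composite_b = [], []
    for _ in range(n_cases):
        shared = sum(rng.gauss(0, 1) for _ in range(shared_items))
        composite_a.append(shared + sum(rng.gauss(0, 1) for _ in range(unique_items)))
        composite_b.append(shared + sum(rng.gauss(0, 1) for _ in range(unique_items)))
    return pearson(composite_a, composite_b)


# The items are mutually independent, so any correlation between the
# composites comes only from the researcher's choice of shared items.
for shared in (0, 5, 9):
    print(shared, round(composite_correlation(20000, 10, shared), 2))
```

Under these assumptions the expected correlation equals the fraction of shared items, so a researcher could obtain a correlation near 0, .5 or .9 from substantively unrelated items simply by choosing how much the two composites overlap.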

The renowned Aston studies show how researchers’ decisions may determine their findings (Starbuck, 1981). The Aston researchers made 1000-2000 measurements of each organization, and then aggregated these into about 50 composite variables. One of the studies’ main findings was that four of these composite variables (functional specialization, role specialization, standardization and formalization) correlate strongly: The first Aston study found correlations ranging from .57 to .87 among these variables. However, these variables look a lot alike when one looks into their compositions: Functional specialization and role specialization were defined so that they had to correlate positively, and so that a high correlation between them indicated that the researchers observed organizations having different numbers of specialities. Standardization measured the presence of these same specialities, but did so by noting the existence of documents; and formalization too was measured by the presence of documents, frequently the same documents that determined standardization. Thus, the strong positive correlations were direct consequences of the researchers’ decisions about how to construct the variables.

Focused sampling is a third reason to anticipate confounding background relationships. So-called samples are frequently not random, and many of them are complete sub-populations. If, for example, a study obtains data from every employee in a single firm, the number of employees should not be a sample size for the purposes of statistical tests: For comparisons among these employees, complete sub-populations have been observed, the allocations of specific employees to these sub-populations are not random but systematic, and statistical tests are inappropriate. For extrapolation of findings about these employees to those in other firms, the sample size is one firm. This firm, however, is unlikely to have been selected by a random process from a clearly defined sampling frame and it may possess various distinctive characteristics that make it a poor basis for generalization - such as its willingness to allow psychologists entry, or its geographic location, or its unique history.

These are not unimportant quibbles about the niceties of sampling. Study after study has turned up evidence that people who live close together, who work together, or who socialize together tend to have more attitudes, beliefs, and behaviors in common than do people who are far apart physically and socially. That is, socialization and interaction create distinctive sub-populations. Findings about any one of these sub-populations probably do not extrapolate to others that lie far away or that have quite dissimilar histories or that live during different ages. It would be surprising if the blue-collar workers in a steel mill in Pittsburgh were to answer a questionnaire in the same way as the first-level supervisors in a steel mill in Essen, and even more surprising if the same answers were to come from executives in an insurance company in Calcutta. The blue-collar workers in one steel mill in Pittsburgh might not even answer the questionnaire in the same way as the blue-collar workers in another steel mill in Pittsburgh if the two mills had distinctly different histories and work cultures.

Subjective data obtained from individual respondents at one time and through one method provide a fourth reason to watch for confounding background relationships. By including items in a single questionnaire or a single interview, researchers suggest to respondents that they ought to see relationships among these items; and by presenting the items in a logical sequence, the researchers suggest how the items ought to relate. Only an insensitive respondent would ignore such strong hints. Moreover, respondents have almost certainly made sense of their worlds, even if they do not understand these worlds in some objective sense. For instance, Lawrence and Lorsch (1967) asked managers to describe the structures and environments of their organizations; they then drew inferences about the relationships of organizations’ structures to their environments. These inferences might be correct statements about relationships that one could assess with objective measures; or they might be correct statements about relationships that managers perceive, but managers’ perceptions might diverge considerably from objective measures (Starbuck, 1985). Would anyone be surprised if it turned out that managers perceive what makes sense because it meshes into their beliefs? In fact, two studies (Tosi et al., 1973; Downey et al., 1975) have attempted to compare managers’ perceptions of their environments with other measures of those environments: both studies found no consistent correlations between the perceived and objective measures. Furthermore, Downey et al. (1977) found that managers’ perceptions of their firms’ environments correlate more strongly with the managers’ personal characteristics than with the measurable characteristics of the environments. As to perceptions of organization structure, Payne and Pugh (1976) compared people’s perceptions with objective measures: they surmised (a) that the subjective and objective measures correlate weakly; and (b) that people often have such different perceptions of their organization that it makes no sense to talk about shared perceptions.

Foresight is a fifth and possibly the most important reason to anticipate confounding background relationships. Researchers are intelligent, observant people who have considerable life experience and who are achieving success in life. They are likely to have sound intuitive understanding of people and of social systems; they are many times more likely to formulate hypotheses that are consistent with their intuitive understanding than ones that violate it; they are quite likely to investigate correlations and differences that deviate from zero; and they are less likely than chance would imply to observe correlations and differences near zero. This does not mean that researchers can correctly attribute causation or understand complex interdependencies, for these seem to be difficult, and researchers make the same kinds of judgement, interpretation, and attribution errors that other people make (Faust, 1984). But prediction does not require real understanding. Foresight does suggest that psychological differences and correlations have statistical distributions very different from the distributions assumed in hypothesis tests. Hypothesis tests assume no foresight.

If the differences and correlations that psychologists test have distributions quite different from those assumed in hypothesis tests, psychologists are using tests that assign statistical significance to confounding background relationships. If psychologists then equate statistical significance with meaningful relationships, which they often do, they are mistaking confounding background relationships for theoretically important information. One result is that psychological research may be creating a cloud of statistically significant differences and correlations that not only have no real meaning but that impede scientific progress by obscuring the truly meaningful ones.

Measures

To get an estimate of the population distribution of correlations that I/O psychologists study, we tabulated every complete matrix of correlations that appeared in the Journal of Applied Psychology during 1983-86. This amounts to 6574 correlations from 95 articles.

We tabulated only complete matrices of correlations in order to observe the relations among all of the variables that I/O psychologists perceive when drawing inductive inferences, not only those variables that psychologists actually include in hypotheses. Of course, some studies probably gathered and analysed data on additional variables beyond those published, and then omitted these additional variables because they correlated very weakly with the dependent variables. It seems well established that the variables in hypotheses are filtered by biases against publishing insignificant results (Sterling, 1959; Greenwald, 1975; Kerr et al., 1977). These biases partly explain why some authors revise or create their hypotheses after they compute correlations, and we know from personal experiences that editors sometimes improperly ask authors to restate their hypotheses to make them fit the data. None the less, many correlation matrices include correlations about which no hypotheses have been stated, and some authors make it a practice to publish the intercorrelation matrices for all of the variables they observed, including variables having expected correlations of zero.

To estimate the percentage of correlations in hypotheses, we examined a stratified random sample of 21 articles. We found it quite difficult to decide whether some relations were or were not included in hypotheses. Nevertheless, it appeared to us that four of these 21 intercorrelation matrices included no hypothesized relations, that seven matrices included 29-70 per cent hypothesized relations, and that ten matrices were made up of more than 80 per cent hypothesized relations. Based on this sample, we estimate that 64 per cent of the correlations in our data represented hypotheses.

Results

Figure 7 shows the observed distribution of correlations. This distribution looks much like the comparable ones for Administrative Science Quarterly and the Academy of Management Journal, for which we also have data, so the general pattern is not peculiar to I/O psychology.

It turns out that Art Brief was nearly right on average, for the mean correlation is .0895 and the median correlation is .0956. The distribution seems to reflect a strong bias against negative correlations: 69 per cent of the correlations are positive and 31 per cent are negative, so the odds are better than 2 to 1 that an observed correlation will be positive. This strong positive bias provides quite striking evidence that many researchers prefer positive relationships, possibly because they find these easier to understand. To express this preference, researchers must either be inverting scales retrospectively or be anticipating the signs of hypothesized relationships prospectively, either of which would imply that these studies should not use statistical tests that assume a mean correlation of zero.

Table 3 – Differences Associated with Numbers of Observations

	N<70	70<N<180	N>180
Mean number of observations	40	120	542
Mean correlations	.140	.117	.064
Numbers of correlations	1195	1457	3922

Percentage of correlations that are:
Positive	71%	71%	67%
Negative	29%	29%	33%

Percentage of correlations that are statistically significant at .05 using two tails:
Positive correlations	34%	64%	72%
Negative correlations	18%	41%	56%

Studies with large numbers of observations exhibit slightly less positive bias. Table 3 compares studies having fewer than 70 observations, those with 70 to 180 observations, and those with more than 180 observations. Studies with over 180 observations report 67 per cent positive correlations and 33 per cent negative ones, making the odds of a positive correlation almost exactly 2 to 1. The mean correlation found in studies with over 180 observations is .064, whereas the mean correlation in studies with fewer than 70 observations is .140.

Figure 8 compares the observed distributions of correlations with the distributions assumed by a typical hypothesis test. The test distributions in Figure 8 assume random samples equal to the mean numbers of observations for each category. Compared to the observed distributions, the test distributions assume much higher percentages of correlations near zero, so roughly 65 per cent of the reported correlations are statistically significant at the 5 per cent level. The percentages of statistically significant correlations change considerably with numbers of observations because of the different positive biases and because of different test distributions. For studies with more than 180 observations, 72 per cent of the positive correlations and 56 per cent of the negative correlations are statistically significant; whereas for studies with less than 70 observations, 34 per cent of the positive correlations and only 18 per cent of the negative correlations are statistically significant (Table 3). Thus, positive correlations are noticeably more likely than negative ones to be judged statistically significant.

Figure 9a shows that large-N studies and small-N studies obtain rather similar distributions of correlations. The small-N studies do produce more correlations above + .5, and the large-N studies report more correlations between - .2 and + .2. Both differences fit the rationale that researchers make more observations when they are observing correlations near zero. Some researchers undoubtedly anticipate the magnitudes of hypothesized relationships and set out to make numbers of observations that should produce statistical significance (Cohen, 1977); other researchers keep adding observations until they achieve statistical significance for some relationships; and still other researchers stop making observations when they obtain large positive correlations. Again, graphs for Administrative Science Quarterly and the Academy of Management Journal strongly resemble these for the Journal of Applied Psychology.

Figure 9b graphs the test distributions corresponding to Figure 9a. These graphs provide a reminder that large-N studies and small-N studies differ more in the criteria used to evaluate statistical significance than in the data they produce, and Figures 9a and b imply that an emphasis on statistical significance amounts to an emphasis on absolutely small correlations.
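The dependence of the significance criterion on N can be illustrated with a short calculation. The following Python sketch is not part of the original analysis: it assumes the Fisher z approximation for a two-tailed test of zero correlation at the 5 per cent level (the chapter does not say which test underlies Figure 9b), and the function name critical_r is merely illustrative. It computes the smallest absolute correlation that such a test would call significant at the mean numbers of observations shown in Table 3.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha=0.05):
    """Smallest |r| judged significant in a two-tailed test of rho = 0.

    Uses the Fisher z approximation, which is an assumption made for
    this illustration rather than the test used in the chapter.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))


# Mean numbers of observations for the three categories in Table 3.
for n in (40, 120, 542):
    print(n, round(critical_r(n), 3))
```

Under these assumptions the threshold falls from roughly .31 at N = 40 to roughly .08 at N = 542, so a correlation of .1 would count as significant in the large-N category but not in the small-N one.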

The pervasive correlations among variables make induction undependable: starting with almost any variable, an I/O psychologist finds it extremely easy to discover a second variable that correlates with the first at least .1 in absolute value. In fact, if the psychologist were to choose the second variable utterly at random, the psychologist’s odds would be 2 to 1 of coming up with such a variable on the first try, and the odds would be 24 to 1 of discovering such a variable within three tries. This is a cross-sectional parallel to a finding by Ames and Reiter (1961) relating to the analyses of historical economic statistics: Starting with one time series and choosing a second series at random, an economist would need only three trials on average to discover a correlation of .71 or more; even if the economist would correct each series for linear trend, finding a correlation of .71 or more would require only five trials on average.
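The odds quoted above follow from elementary probability if successive random choices are treated as independent. The sketch below is an illustration rather than the authors' own calculation; the function names are hypothetical, and reading ‘2 to 1’ as a probability of exactly two-thirds is an assumption.

```python
def odds_within(p, tries):
    """Odds (for : against) of at least one success in `tries` independent tries."""
    p_all_fail = (1.0 - p) ** tries
    return (1.0 - p_all_fail) / p_all_fail


def implied_p(odds, tries):
    """First-try probability implied by given odds of success within `tries` tries."""
    p_all_fail = 1.0 / (odds + 1.0)
    return 1.0 - p_all_fail ** (1.0 / tries)


p_first = 2.0 / 3.0  # assumption: 'odds of 2 to 1' read as exactly two-thirds
print(round(odds_within(p_first, 1), 2))  # odds on the first try
print(round(odds_within(p_first, 3), 2))  # odds within three tries
print(round(implied_p(24, 3), 3))         # first-try probability implied by 24 to 1
```

With a first-try probability of exactly two-thirds, the three-try odds come to 26 to 1. Odds of 24 to 1 correspond to a first-try probability near .66, which suggests that the ‘2 to 1’ figure is a rounded one.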

Induction becomes even less dependable if a psychologist uses hypothesis tests to decide what correlations deserve attention, and especially so if the psychologist tries to make enough observations to guarantee statistical significance. If the psychologist also defines or redefines variables so as to make positive correlations more likely than negative correlations, hypothesis tests based on the null hypothesis of zero correlation become deceptive rituals.

Suppose that roughly 10 per cent of all observable relations could be theoretically meaningful and that the remaining 90 per cent either have no meanings or can be deduced as implications of the key 10 per cent. But we do not now know which relations constitute the key 10 per cent, and so our research resembles a search through a haystack in which we are trying to separate needles from more numerous straws. Now suppose that we adopt a search method that makes every straw look like a needle and that turns up thousands of apparent needles annually; 90 per cent of these apparent needles are actually straws, but we have no way of knowing which ones. Next, we fabricate a theory that ‘explains’ these apparent needles. Some of the propositions in our theory are likely to be correct, merely by chance; but many, many more propositions are incorrect or misleading in that they describe straws. Even if this theory were to account rationally for all of the needles that we have supposedly discovered in the past, which is extremely unlikely, the theory has very little chance of making highly accurate predictions about the consequences of our actions unless the theory itself acts as a powerful self-fulfilling prophecy (Eden and Ravid, 1982). Our theory would make some correct predictions, of course; with so many correlated variables, even a completely false theory would have a reasonable chance of generating predictions that come true, so we dare not even take correct predictions as dependable evidence of our theory’s correctness (Deese, 1972, pp.61-7).

I/O psychologists with applied orientations might protest that they primarily need to make correct predictions and that doing this does not require a correct and parsimonious theory. Two responses are in order. First, this chapter concerns theory building, not practical utility. Second, the predictive accuracies of I/O relationships, which are not very high, may already be as high as they can be made solely on the basis of blind statistical methods. Making major improvements in predictive accuracies probably requires actual theoretical insights that will not come through purely statistical methods.

Undependable induction may be a cause of I/O psychology’s methodological emphasis as well as a consequence of it. Facing a world of unstable ‘facts’ and weak relationships, we have reason to distrust substantive propositions and to view methods as sounder, more deserving of admiration. We can control our methods better than substance, so emphasizing methods reduces our risks; and because we evaluate our methods ritualistically, we find it much easier to meet methodological standards than to demonstrate the theoretical significance of our findings. Indeed, if our world changes rapidly, ‘facts’ are ephemeral and theoretical significance becomes very elusive.

Because we doubt that methodological improvements are what I/O psychology needs most, we do this with reluctance, but we cannot resist pointing out some of the methodological opportunities that exist:

(a) Statistical significance is a very dangerous criterion. It probably causes more harm than good, by inducing researchers who have few observations to discount strong relationships and encouraging those who have many observations to highlight weak relationships. Moreover, a researcher can be certain of rejecting any point null hypothesis, and point null hypotheses usually look quite implausible if one treats them as genuine descriptions of phenomena (Gilpin and Diamond, 1984; Shames, 1987). Deese (1972, pp. 56-9), among others, has advocated that researchers replace hypothesis tests with statements about confidence limits. But confidence limits too exaggerate the significance of numbers of observations. In I/O psychology, and in social science research more generally, the numbers of observations are rarely equivalent to the sample sizes assumed in statistical theories, both because truly random sampling is rare and because statistical theories assume that sample sizes are basically the only observable characteristics by which to judge data’s dependability, generality, or representativeness. Real life offers researchers many characteristics by which to evaluate data, and carefully chosen observations may be more informative than random samples. Thus, researchers could improve their analyses by using statistical procedures that allow them to assign different weights to observations reflecting their dependability, generality or representativeness; more dependable or more representative observations would receive more weight. As much as from the weights’ arithmetic effects, the improvements in induction would come from researchers’ efforts to analyse data’s dependability or representativeness and from the researchers’ efforts to communicate rationales for these weights. Researchers could also improve their analyses by paying more attention to percentage differences between categories: Are males 1 per cent different from females, or 30 per cent? 
And yet more improvement could come from less use of averages to represent heterogeneous groups and more use of distributions (Brandt, 1982). What fraction of males are 30 per cent different from what fraction of females? Speaking of measures of organizational climate, Payne and Pugh (1976) remarked that respondents’ opinions generally vary so greatly that it makes no sense to use averages to characterize groups or organizations.

(b) Statistical analyses would have greater credibility and greater theoretical significance if researchers would base their analyses on naive hypotheses or realistic hypotheses instead of null hypotheses (Fombrun and Starbuck, 1987). Virtually the entire apparatus of classical statistics was created when high-speed computers did not yet exist and statisticians had to manipulate distributions algebraically. Thus, statisticians built an analytic rationale around distributions that are algebraically pliant even though these distributions make incredible assumptions such as point null hypotheses. With modern computers, however, researchers can generate statistical distributions that reflect either realistic assumptions or naive ones, even if these distributions cannot be manipulated algebraically. For example, computer simulations could generate the distributions of observed correlations in samples of size N from a hypothetical bivariate Normal population with a correlation of .1. To assess the plausibility of alternative theories where several influences interact, some biologists (Connor and Simberloff, 1986) have begun to compare data with computer-generated multinomial distributions that incorporate combinations of several probable influences; such distributions reduce the need for simplifying assumptions such as normality and linearity, and they make it more practical to examine entire distributions of data.

A key value judgement, however, is how challenging should a researcher make the naive or credible hypothesis? How high should the jumper place the crossbar? In science, the crossbar’s height has implications for the research community as a whole as well as for an individual researcher: Low crossbars make it easier to claim that the researcher has learned something of significance, but they also lead to building scientific theories on random errors.

(c) Even traditional hypothesis tests and confidence limits could support better induction than they do, but no statistical procedure can surmount inappropriate assumptions, biased samples, overgeneralization, or misrepresentation. Researchers should either eschew the appearance of statistical methods or try to approximate the assumptions underlying these methods.

(d) Researchers should often attempt to replicate others’ studies, basing these replications solely on the published reports. Frequent replication would encourage researchers to describe their work completely and to characterize its generality modestly. Replication failures and successes would clarify the reasons for exceptional findings, and thus provide grounds on which to design better studies and to discard inexplicably deviant ones.

(e) I/O psychology has been bounded by two data-acquisition methods: questionnaires and interviews. Although cheap and easy, these methods emphasize subjective perceptions that people recognize and understand well enough to express verbally. These are a part of life. But verbal behavior is bounded by socialization and social constraints that make I/O psychology prone to observe clichés and stereotypes, and it is altogether too easy to find observable behaviors that people do not recognize that they exhibit or that they describe in misleading terms. Thus, researchers should remain skeptical about the validity of subjective data, and they should supplement questionnaires and interviews with their personal observations of behavior, with documents such as letters, memoranda and grievances, and with quantitative data such as costs, turnover statistics, and production volumes (Campbell and Fiske, 1959; Phillips, 1971; Denzin, 1978). Jick (1979) has discussed the advantages and problems of reconciling different kinds of data. However, the greatest payoffs may come from discovering that different kinds of data simply cannot be reconciled.

(f) New sciences tend to begin timidly by gathering data through passive observation and then constructing retrospective explanations for these data (Starbuck, 1976). Unfortunately, most spontaneous events are uninteresting; the more interesting objects of study are unusual, complex, dynamic and reactive; and postdiction makes weak discriminations between alternative theories. Consequently, as sciences gain confidence, they gradually move from the passive, postdictive mode toward a more active and predictive mode: They make more and more efforts to anticipate future events and to manipulate them. Interventions enable scientists to create interesting situations and dynamic reactions. Predictions tend to highlight differences between alternative theories, and trying to make predictions come true may be the only practical way to find out what would happen if. Giving theories visible consequences puts scientists under pressure to attempt innovations (Gordon and Marquis, 1966; Starbuck and Nystrom, 1981).

Thus, potential advantages inhere in I/O psychology’s applied orientation and in the numerous I/O psychologists holding non-academic jobs. Compared to academic areas of psychology and to most social sciences, I/O psychology could be more innovative, quicker to discard ineffective theories, more interested in dynamic theories, and more strongly oriented toward prediction and intervention. I/O psychology probably does pay more attention than most social sciences to prediction and intervention, but prediction seems to be associated mainly with personnel selection, interventions have focused on goal-setting and behavior modification, and it is doubtful that I/O psychology is exploiting its other potential advantages. We examined several studies of goal-setting and behavior modification published during 1986, and we turned up only static before-and-after comparisons and no analyses that were truly dynamic.
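Point (b) above suggests that computer simulations could generate the distribution of observed correlations in samples of size N from a bivariate Normal population with a correlation of .1. A minimal sketch of such a simulation follows. It is an illustration under stated assumptions, not a procedure taken from the studies cited: the function names, the number of replications, the random seed, and the use of an upper 5 per cent cutoff as the ‘crossbar’ are all hypothetical choices.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_r(n, rho, rng):
    """Correlation observed in one sample of n pairs from a bivariate Normal
    population whose true correlation is rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
    return pearson(xs, ys)


def simulate(n, rho=0.1, reps=1000, seed=0):
    """Sorted list of simulated sample correlations (reps and seed are arbitrary)."""
    rng = random.Random(seed)
    return sorted(sample_r(n, rho, rng) for _ in range(reps))


def upper_cutoff(rs, tail=0.05):
    """Value exceeded by roughly `tail` of the sorted simulated correlations."""
    return rs[int(len(rs) * (1.0 - tail))]


# Mean numbers of observations for the three categories in Table 3.
for n in (40, 120, 542):
    rs = simulate(n)
    print(n, round(statistics.mean(rs), 3), round(upper_cutoff(rs), 3))
```

A researcher adopting this naive hypothesis would treat an observed correlation as noteworthy only if it exceeded the simulated cutoff for the relevant N, rather than the cutoff implied by a null hypothesis of zero correlation.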

SUMMARY

We started by asking: How much has I/O psychology progressed? Partly because a number of I/O psychologists have been expressing dissatisfaction with the field’s progress and asking for more innovation (Hackman, 1982; Nord, 1982), we had an initial impression that progress has been quite slow since the early 1950s. We had also seen a sequence of business-strategy studies that had achieved negative progress, in the sense that relationships became less and less clear as the studies accumulated, and we wondered whether this might have happened in some areas of I/O psychology. We appraised progress by observing the historical changes in effect sizes for some of I/O’s major variables. If theories are becoming more and more effective, they should explain higher and higher percentages of variance over time. We found that I/O theories have not been improving by this measure. For the reasons just stated, this did not surprise us, but we were surprised to find such small percentages of variance explained and such consistent changes in variance explained.

Seeking an explanation for no progress or negative progress, we turned to the literature on paradigm development. These writings led us to hypothesize that I/O psychology might be inconsistent with itself: various reviews have suggested that I/O psychologists disagree with each other about the substance of theories. Perhaps I/O psychologists have low paradigm consensus but employ quantitative, large-sample research methods that presume high paradigm consensus. So we assembled various indicators of paradigm consensus. According to these indicators, I/O psychology looks much like Organizational Behavior and psychology in general. This is no surprise, of course. I/O psychology also looks different from Management Information Systems (MIS), which appears to be a field that both lacks paradigm consensus and makes rapid progress. But, to our astonishment, I/O psychology does not look so very different from chemistry and physics, two fields that are widely perceived as having high paradigm consensus and as making rapid progress. I/O psychology may, however, differ significantly from the physical sciences in the content of paradigms. Physical science paradigms evidently embrace both substance and methodology, whereas I/O psychology paradigms strongly emphasize methodology and pay little attention to substance. I/O psychologists act as if they do not agree with each other concerning the substance of human behavior, although we believe that this lack of substantive consensus is unnecessary and probably superficial.

Why might the paradigms of I/O psychologists deemphasize substance? We hypothesized that this orientation is probably an intelligent response to a real problem. This real problem, we conjectured, is that I/O psychologists find it difficult to detect meaningful research findings against a background of small, theoretically meaningless, but statistically significant relationships. Thus, I/O psychologists dare not trust their inferences from empirical evidence. To assess the plausibility of this conjecture, we tabulated all of the correlation matrices reported in the Journal of Applied Psychology over four years. We found that two-thirds of the reported correlations are statistically significant at the 5 per cent level, and a strong bias makes positive correlations more likely to be reported and to be judged statistically significant than negative ones.

Thus, I/O psychology faces a Catch-22. A distrust of undependable substantive findings may be leading I/O psychologists to emphasize methodology. This strategy, however, assumes that induction works, whereas it is induction that is producing the undependable substantive findings.

CONSTRUCTING A SUBSTANTIVE PARADIGM

Our survey of effect sizes seems to say that I/O theories are not very effective and they are not improving significantly over time. We psychologists seem to have achieved very little agreement among ourselves concerning the substantive products of our research; and it is easy to see why this might be the case, for almost everything in our worlds appears to be somewhat related to everything else, and we use criteria that say almost every relationship is important.

We could make this situation a springboard for despair. People are simple creatures seeking to comprehend worlds more complex than themselves. Scientists attempt to construct rational explanations; but rationality is a human characteristic, not an intrinsic characteristic of nature, so scientists have no guarantee that science will prove adequate to the demands they place on it. Psychological research itself details the cognitive limitations that confine and warp human perceptions (Faust, 1984). The complexity of people’s worlds may also be a human characteristic, for people who think they comprehend some aspects of their worlds tend to react by complicating their worlds until they no longer understand them. Thus, social scientists have reason to doubt the adequacy of rational explanations to encompass most phenomena (Starbuck, 1988). Within our limitations, we psychologists may find it impossible to achieve complete explanations without reducing actions and measures to trivial tautologies. For example, we can decide that we will only teach in school what we can measure with an aptitude test, or that we will select and promote leaders solely on the basis of leadership questionnaires.

We need not despair, however. Studies of progress in the physical sciences emphasize the strong effects of social construction (Sullivan, 1928; Knorr-Cetina, 1981; Latour, 1987). Although it is true that physical scientists discard theories that do not work, the scientists themselves exercise a good deal of choice about what aspects of phenomena they try to explain and how they measure theories’ efficacies. Newton’s laws are one of the best known substantive paradigms. Physicists came to accept these laws because they enabled better predictions concerning certain phenomena, but the laws say nothing whatever about some properties of physical systems, and the laws fail to explain some of the phenomena that physicists expected them to explain, such as light or sub-atomic interactions. In no sense are Newton’s laws absolute truths. Rather they are statements that physicists use as base lines for explanation: physicists attempt to build explanations upon Newton’s laws first. If these explanations work, the physicists are satisfied, and their confidence in Newton’s laws has been reaffirmed. If these base-line explanations do not work, physicists try to explain the deviations from Newton’s laws. Are there, for instance, exogenous influences that had not previously been noticed? Finally, if some inexplicable deviations from Newton’s laws recur systematically, but only in this extreme circumstance, physicists contemplate alternative theories.

The contrast to I/O psychology is striking ... and suggestive. The difference between physics and psychology may be more in the minds of physicists and psychologists than in the phenomena they study (Landy and Vasey, 1984). After arguing that psychological facts are approximately as stable over time as physical ones, Hedges (1987, pp. 453-4) observed:

New physical theories are not sought on every occasion in which there is a modest failure of experimental consistency. Instead, reasons for the inconsistency are likely to be sought in the methodology of the research studies. At least tentative confidence in theory stabilizes the situation so that a rather extended series of inconsistent results would be required to force a major reconceptualization. In social sciences, theory does not often play this stabilizing role.

Campbell (1982, p. 697) characterized the theories of I/O psychology as ‘collections of statements that are so general that asserting them to be true conveys very little information.’ But, of course, the same could be said of the major propositions of the physical sciences such as Newton’s laws: any truly general proposition can convey no information about where it applies because it applies everywhere (Smedslund, 1984). General theoretical propositions are necessarily heuristic guidelines rather than formulae with obvious applications in specific instances, and it is up to scientists to apply these heuristics in specific instances. But general theoretical propositions are more than heuristics because they serve social functions as well.

Scientific progress is a perception by scientists, and theories need not be completely correct in order to support scientific progress. As much as correctness, theories need the backing of consensus and consistency. When scientists agree among themselves to explain phenomena in terms of base-line theories, they project their findings into shared perceptual frameworks that reinforce the collective nature of research by facilitating communication and comparison and by defining what is important or irrelevant. Indeed, in so far as science is a collective enterprise, abstractions do not become theoretical propositions until they win widespread social support. A lack of substantive consensus is equivalent to a lack of theory, and scientists must agree to share a theory in order to build on each others’ work. Making progress depends upon scientists’ agreeing to make progress.

The absence of a strong substantive paradigm may be more a cause of slow progress than a consequence of it, and I/O psychologists could dramatically accelerate the field’s progress by adopting and enforcing a substantive paradigm. Of course, conformity to a seriously erroneous paradigm might delay progress until dissatisfaction builds up to a high state and one of Kuhn’s revolutions takes place; but so little progress is occurring at present that the prospect of non-progress hardly seems threatening.

Moreover, I/O psychology could embrace some theoretical propositions that are roughly as sound as Newton’s laws. At least, these propositions are dependable enough to serve as base lines: they describe many phenomena, and deviations from them point to contingencies. For example, we believe that almost all I/O psychologists could accept the following propositions as base lines:

Pervasive Characteristics. Almost all characteristics of individual people correlate with age, education, intelligence, sex, and social class; and almost all characteristics of groups and organizations correlate with age, size, and wealth. (Implication: every study should measure these variables and take them into account.)

Cognitive Consonance. Simultaneously evoked cognitions (attitudes, beliefs, perceptions and values) tend to become logically consistent (Festinger, 1957; Heider, 1958; Abelson et al., 1968). Corollary 1: Retrospection makes what has happened appear highly probable (Fischhoff, 1980). Corollary 2: Social status, competence, control, and organizational attitudes tend toward congruence (Sampson, 1969; Payne and Pugh, 1976). Corollary 3: Dissonant cognitions elicit subjective sensations such as feelings of inequity, and strong dissonance may trigger behaviors such as change initiatives or reduced participation (Walster et al., 1973). Corollary 4: Simultaneously evoked cognitions tend to polarize into one of two opposing clusters (Cartwright and Harary, 1956). Corollary 5: People and social systems tend to resist change (Marx, 1859; Lewin, 1943).

Social Propositions:

Activities, interactions, and sentiments reinforce each other (Homans, 1950). Corollary 1: People come to resemble their neighbors (Coleman et al., 1966; Industrial Democracy in Europe International Research Group, 1981). Corollary 2: Collectivities develop distinctive norms and shared beliefs (Roethlisberger and Dickson, 1939; Seashore, 1954; Beyer, 1981). (These propositions too can be viewed as corollaries of cognitive consonance.)

Idea evaluation inhibits idea generation (Maier, 1963).

Participation in the implementation of new ideas makes them more acceptable (Lewin, 1943; Kelley and Thibaut, 1954). Corollary 1: Participation in goal setting fosters the acceptance of goals (Maier, 1963; Locke, 1968; Vroom and Yetton, 1973; Latham and Yukl, 1975). Corollary 2: Participation in the design of changes reduces resistance to change (Coch and French, 1948; Marrow et al., 1967; Lawler and Hackman, 1969). Corollary 3: Opportunities to voice dissent make exit less likely (Hirschman, 1970).

Reinforcement Propositions:

Rewards make behaviors more likely, punishments make behaviors less likely (Thorndike, 1911; Skinner, 1953). (This is a tautology, of course [Smedslund, 1984], but so is Newton’s F = ma. A proposition need not convey information in order to facilitate consensus.)

The more immediate a reinforcement the stronger is its impact (Hull, 1943).

Continuous reinforcements produce faster learning that is more quickly unlearned, whereas intermittent reinforcements produce slower learning that is more slowly unlearned (Hull, 1943; Estes, 1957).

Other propositions doubtless could be added to the list, but these illustrate what we mean. We would be exceedingly happy to have some august body take responsibility for formulating dogma.

I/O psychologists are quite unlikely to adopt and use a set of base-line propositions voluntarily. Many I/O psychologists hold vested interests in specific propositions that do not qualify for base-line status or that would become redundant. I/O psychologists are not accustomed to projecting everything they do onto a shared framework, so they would have to learn new ways of thinking and speaking. Some I/O psychologists have expressed doubts about the validity of theoretical propositions in the field. Thus, we surmise that constructing a consensus requires explicit actions by the key journals that act as professional gatekeepers. Specifically, to promote progress in I/O psychology, the key journals could adhere to three policies:

1. Journals should refuse to publish studies that purport to contradict the base-line propositions.[7] Since the propositions are known laws of nature, valid evidence cannot contradict them. Apparent discrepancies from these laws point to exogenous influences, to interactions among influences, or to observational errors.

2. Journals should refuse to publish studies that do no more than reaffirm the base-line propositions. Known laws of nature need no more documentation. However, there may be need to explain the implications of these laws in circumstances where those implications are not self-evident.

3. Journals should insist that all published studies refer to any of the base-line propositions that are relevant. There is no need for new theoretical propositions where the existing laws are already adequate, so any phenomena that can be explained in terms of these laws must be so explained.

Will base-line propositions such as those we have listed prove to be adequate psychological laws in the long run? No, unquestionably not. First, because we are simple creatures trying to comprehend complex worlds, it behooves us to expect our theories to prove somewhat wrong; and because we are hopeful creatures, we intend to do better. Secondly, in order to integrate multiple propositions, I/O psychology will have to move from qualitative propositions to quantitative ones. Attempts to apply base-line propositions would likely produce demands for standardized measures, and then more specific propositions. How rapidly do cognitions become consistent, and how can one judge whether they have attained consistency? Thirdly, processes that tend to alter some characteristics of a social system also tend to evoke antithetical processes that affect these characteristics oppositely (Fombrun and Starbuck, 1987). Stability creates pressures for change, consensus arouses dissent, constraint stirs up rebellion, conformity brings out independence, and conviction evokes skepticism. Thus, the very existence of a scientific paradigm would call forth efforts to overthrow that paradigm.

But we believe I/O psychology should try using a consistent paradigm for a few decades before overthrowing it. Moreover, history suggests that I/O psychologists do not actually overthrow theoretical propositions. Instead, they react to unsatisfactory propositions by integrating them with their antitheses.

For example, during the early part of the twentieth century, many writers and managers held that successful organizations require firm leaders and obedient subordinates (Starbuck and Nystrom, 1981, pp. xvii-xviii). Leadership was viewed as a stable characteristic of individuals: some fortunate people have leadership traits, and other unlucky souls do not. This orthodoxy attracted challenges during the 1920s and 1930s: Weber (1947) noted that some organizations depersonalize leadership and that subordinates sometimes judge leaders illegitimate. The Hawthorne studies argued that friendly supervision increases subordinates’ productivity (Roethlisberger and Dickson, 1939; Mayo, 1946). Barnard (1938) asserted that authority originates in subordinates rather than superiors. By the early 1950s, various syntheses were being proposed. Bales (1953), Cartwright and Zander (1953), and Gibb (1954) analysed leadership as an activity shared among group members. Coch and French (1948) and Lewin (1953) spoke of democratic leadership, and Bales (1953) distinguished leaders’ social roles from their task roles. Cattell and Stice (1954) and Stogdill (1948) considered the distinctive personality attributes of different kinds of leaders. By the late 1950s, the Ohio State studies had factored leadership into two dimensions: initiating structure and consideration (Fleishman et al., 1955; Stogdill and Coons, 1957). Initiating structure corresponds closely to the leadership concepts of 1910, and consideration corresponds to the challenges to those concepts. Thus, views that had originally been seen as antithetical had eventually been synthesized into independent dimensions of multiple, complex phenomena.

REFERENCES

Abelson, R. P., Aronson, E., McGuire, W. J., Newcomb, T. M., Rosenberg, M. J. and Tannenbaum, P. H. (1968) Theories of Cognitive Consistency. Chicago: Rand-McNally.

American Psychological Association (1968) Summary report of journal operations: 1967, American Psychologist, 23, 872.

American Psychological Association (1978) Summary report of journal operations for 1977, American Psychologist, 33, 608.

American Psychological Association (1982) Summary report of journal operations: 1981, American Psychologist, 37, 709.

American Psychological Association (1983) Summary report of journal operations: 1982, American Psychologist, 38, 739.

American Psychological Association (1984) Summary report of journal operations, American Psychologist, 39, 689.

American Psychological Association (1985) Summary report of journal operations, American Psychologist, 40, 707.

American Psychological Association (1986) Summary report of journal operations: 1985, American Psychologist, 41, 701.

Ames, E. and Reiter, S. (1961) Distributions of correlation coefficients in economic time series. Journal of the American Statistical Association, 56, 637-656.

Bakan, D. (1974) On Method: Toward a Reconstruction of Psychological Investigation. San Francisco: Jossey-Bass.

Bales, R. F. (1953) The equilibrium problem in small groups. In T. Parsons, R. F. Bales and E. A. Shils (eds.) Working Papers in the Theory of Action. Glencoe, Ill.: Free Press, 111-161.

Barnard, C. I. (1938) The Functions of the Executive. Cambridge, Mass.: Harvard University Press.

Bass, B. M. (1981) Stogdill’s Handbook of Leadership. New York: The Free Press.

Beyer, J. M. (1978) Editorial policies and practices among leading journals in four scientific fields. The Sociological Quarterly, 19, 68-88.

Beyer, J. M. (1981) Ideologies, values, and decision-making in organizations. In P. C. Nystrom and W. H. Starbuck (eds) Handbook of Organizational Design. Oxford: Oxford University Press, 166-202.

Blackburn, R. S. and Mitchell, M. (1981) Citation analysis in the organizational sciences. Journal of Applied Psychology, 66, 337-342.

Borgen, F. H., Layton, W. L., Veenhuizen, D. L. and Johnson, D. J. (1985) Vocational behavior and career development, 1984: A review. Journal of Vocational Behavior, 27, 218-269.

Box, G. E. P. and Draper, N. R. (1969) Evolutionary Operation. New York: John Wiley.

Brackbill, Y. and Korten, F. (1970) Journal reviewing practices: Authors’ and APA members’ suggestions for revision. American Psychologist, 25, 937-940.

Brandt, L. W. (1982) Psychologists Caught: A psycho-logic of psychology. Toronto: University of Toronto Press.

Brayfield, A. H. and Crockett, W. H. (1955) Employee attitudes and employee performance. Psychological Bulletin, 52, 396-424.

Bureau of the Census (1987) Statistical Abstract of the United States 1987. Washington, DC: US Department of Commerce.

Campbell, D. T. and Fiske, D. W. (1959) Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

Campbell, J. P. (1977) The cutting edge of leadership: An overview. In J. G. Hunt and L. L. Larson (eds) Leadership: The cutting edge. Carbondale, Ill.: Southern Illinois University Press.

Campbell, J. P. (1982) Editorial: Some remarks from the outgoing editor. Journal of Applied Psychology, 67, 691-700.

Campbell, J. P., Daft, R. L. and Hulin, C. L. (1982) What to Study: Generating and developing research questions. New York: Sage.

Campion, M. A. and Thayer, P. W. (1985) Development and field evaluation of an interdisciplinary measure of job design. Journal of Applied Psychology, 62, 29-43.

Cartwright, D. and Harary, F. (1956) Structural balance: A generalization of Heider’s theory. Psychological Review, 63, 277-293.

Cartwright, D. and Zander, A. (1953) Leadership: Introduction. In D. Cartwright and A. Zander (eds) Group Dynamics. Evanston, Ill.: Row, Peterson, 535-550.

Cattell, R. B. and Stice, G. F. (1954) Four formulae for selecting leaders on the basis of personality. Human Relations, 7, 493-507.

Chapman, L. J. and Chapman, J. P. (1969) Illusory correlation as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology, 74, 271-280.

Coch, L. and French, J. R. P. Jr (1948) Overcoming resistance to change. Human Relations, 1, 512-532.

Cohen, J. (1977) Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press.

Coleman, J. S., Katz, E. and Menzel, H. (1966) Medical Innovation. Indianapolis: Bobbs-Merrill.

Connor, E. F. and Simberloff, D. (1986) Competition, scientific method, and null models in ecology. American Scientist, 74, 155-162.

Cummings, L. L. and Schmidt, S. M. (1972) Managerial attitudes of Greeks: The roles of culture and industrialization. Administrative Science Quarterly, 17, 265-272.

Daft, R. L. and Wiginton, J. (1979) Language and organization. Academy of Management Review, 4, 179-191.

De Meuse, K. P. (1986) A compendium of frequently used measures in industrial/organizational psychology. The Industrial-Organizational Psychologist, 23 (2), 53-59.

Deese, J. (1972) Psychology as Science and Art. New York: Harcourt.

Denzin, N. K. (1978) The Research Act. New York: McGraw-Hill.

Downey, H. K., Hellriegel, G. and Slocum, J. W., Jr (1975) Environmental uncertainty: The construct and its application. Administrative Science Quarterly, 20, 613-629.

Downey, H. K., Hellriegel, G. and Slocum, J. W., Jr (1977) Individual characteristics as sources of perceived uncertainty. Human Relations, 30, 161-174.

Dreeben, R. (1968) On What is Learned in School. Reading, Mass.: Addison-Wesley.

Dubin, R. (1976) Theory building in applied areas. In M. D. Dunnette (ed.), Handbook of Industrial and Organizational Psychology. Chicago: Rand-McNally, 17-39.

Eden, D. and Ravid, G. (1982) Pygmalion versus self-expectancy: Effects of instructor- and self-expectancy on trainee performance. Organizational Behavior and Human Performance, 30, 351-364.

Estes, W. K. (1957) Theory of learning with constant, variable, or contingent probabilities of reinforcement. Psychometrika, 22, 113-132.

Faust, D. (1984) The Limits of Scientific Reasoning. Minneapolis, MN: University of Minnesota Press.

Festinger, L. (1957) A Theory of Cognitive Dissonance. Evanston, Ill.: Row, Peterson.

Fiedler, F. E. (1967) A Theory of Leadership Effectiveness. New York: McGraw-Hill.

Fischhoff, B. (1980) For those condemned to study the past: Reflections on historical judgment. In R. A. Shweder and D. W. Fiske (eds) New Directions for Methodology of Behavioral Science. San Francisco: Jossey-Bass, 79-93.

Fleishman, E. A., Harris, E. F. and Burtt, H. E. (1955) Leadership and Supervision in Industry. Columbus, Ohio: Ohio State University, Bureau of Educational Research.

Fombrun, C. J. and Starbuck, W. H. (1987) Variations in the Evolution of Organizational Ecology. Working paper, New York University.

Garfield, E. (1972) Citation analysis as a tool in journal evaluation. Science, 178, 471-479.

Garfield, E. (1981-84a) SSCI Journal Citation Reports. Philadelphia, Penn.: Institute for Scientific Information.

Garfield, E. (1981-84b) SSCI Journal Citation Reports. Philadelphia, Penn.: Institute for Scientific Information.

Garvey, W. D., Lin, N. and Nelson, C. E. (1970) Some comparisons of communication activities in the physical and social sciences. In C. E. Nelson and D. K. Pollock (eds) Communication among Scientists and Engineers. Lexington, Mass.: Heath Lexington, 61-84.

Gerwin, D. (1981) Relationships between structure and technology. In P. C. Nystrom and W. H. Starbuck (eds) Handbook of Organizational Design. New York: Oxford University Press, 3-38.

Gibb, C. A. (1954) Leadership. In G. Lindzey (ed.), Handbook of Social Psychology. Cambridge, Mass.: Addison-Wesley.

Gilpin, M. E. and Diamond, J. M. (1984) Are species co-occurrences on islands non-random, and are null hypotheses useful in community ecology? In D. R. Strong and others (eds) Ecological Communities: Conceptual issues and the evidence. Princeton, N.J.: Princeton University Press, 297-315.

Goldberg, L. R. (1970) Man versus model of man: A rationale, plus some evidence, for a method of improving on clinical inferences. Psychological Bulletin, 73, 422-432.

Gordon, G. and Marquis, S. (1966) Freedom, visibility of consequences, and scientific innovation. American Journal of Sociology, 72, 195-202.

Greenwald, A. G. (1975) Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1-20.

Griffin, R. W. (1987) Toward an integrated theory of task design. In L. L. Cummings and B. M. Staw (eds) Research in Organizational Behavior (pp. 79-120). Greenwich, Conn.: JAI Press.

Hackett, R. D. and Guion, R. M. (1985) A reevaluation of the absenteeism - job satisfaction relationship. Organizational Behavior and Human Decision Processes, 35, 340-381.

Hackman, J. R. (1982) Preface. In Campbell, J. P., Daft, R. L. and Hulin, C. L. (eds) What to Study: Generating and developing research questions. New York: Sage.

Hackman, J. R. and Oldham, G. R. (1980) Work Redesign. Reading, Mass.: Addison-Wesley.

Haire, M., Ghiselli, E. E. and Porter, L. W. (1966) Managerial Thinking. New York: John Wiley.

Hedges, L. V. (1987) How hard is hard science, how soft is soft science? American Psychologist, 42, 443-455.

Heider, F. (1958) The Psychology of Interpersonal Relations. New York: John Wiley.

Hirschman, A. O. (1970) Exit, Voice, and Loyalty. Cambridge, Mass.: Harvard University Press.

Homans, G. C. (1950) The Human Group. New York: Harcourt, Brace.

Hopkins, B. L. and Sears, J. (1982) Managing behavior for productivity. In L. W. Frederiksen (ed.) Handbook of Organizational Behavior Management. New York: John Wiley, 393-425.

House, R. J. and Singh, J. V. (1987) Organizational behavior: Some new directions for I/O psychology. Annual Review of Psychology, 38.

Hull, C. L. (1943) Principles of Behavior. New York: D. Appleton Century.

Iaffaldano, M. T. and Muchinsky, P. M. (1985) Job satisfaction and job performance: A meta-analysis. Psychological Bulletin, 97, 251-273.

Industrial Democracy in Europe International Research Group (1981) Industrial Democracy in Europe. Oxford: Oxford University Press.

Ives, B. and Hamilton, S. (1982) Knowledge utilization among MIS researchers. MIS Quarterly, 6 (4), 61-77.

Jackson, S. E. (1986) Workshop: Results from a survey of editors. Paper presented at the Washington, DC meeting of the APA Annual Convention.

Jick, T. J. (1979) Mixing qualitative and quantitative methods: Triangulation in action. Administrative Science Quarterly, 24, 602-611.

Jones, D. E. H. (1966) On being blinded with science - being a ruthless enquiry into scientific methods, complete with 29 genuine references and a learned footnote. New Scientist, 24 November, 465-467.

Kaplan, A. (1963) The Conduct of Inquiry: Methodology for behavioral science. San Francisco: Chandler.

Katz, D. and Kahn, R. L. (1978) The Social Psychology of Organizations. New York: John Wiley.

Kelley, H. H. and Thibaut, J. W. (1954) Experimental studies of group problem solving and process. In G. Lindzey (ed.), Handbook of Social Psychology. Cambridge, Mass.: Addison-Wesley, 735-786.

Kendler, H. H. (1984) Evolutions or revolutions? In K. M. J. Lagerspetz and P. Niemi (eds) Psychology in the 1990's. Amsterdam: North-Holland.

Kenny, D. A. and Zaccaro, S. J. (1983) An estimate of variance due to traits in leadership. Journal of Applied Psychology, 68, 678-685.

Kerr, S., Tolliver, J. and Petree, D. (1977) Manuscript characteristics which influence acceptance for management and social science journals. Academy of Management Journal, 20, 132-141.

King, A. S. (1974) Expectation effects in organizational change. Administrative Science Quarterly, 19, 221-230.

Klayman, J. and Ha, Y.-W. (1987) Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211-228.

Knorr-Cetina, K. D. (1981) The Manufacture of Knowledge: An essay on the constructivist and contextual nature of science. Oxford: Pergamon.

Koch, S. (1981) The nature and limits of psychological knowledge. American Psychologist, 36, 257-269.

Kuhn, T. S. (1970) The Structure of Scientific Revolutions. Chicago: The University of Chicago Press.

Kunda, G. (1987) Engineering Culture: Culture and control in a high-tech organization. PhD thesis, Alfred P. Sloan School of Management, MIT.

Landy, F. J. and Vasey, J. (1984) Theory and logic in human resources research. In K. M. Rowland and G. R. Ferris (eds) Research in Personnel and Human Resources Management. Greenwich, Conn.: JAI Press, 1-34.

Latham, G. P. and Yukl, G. A. (1975) A review of research on the application of goal setting in organizations. Academy of Management Journal, 18, 824-845.

Latour, B. (1987) Science in Action. Cambridge, Mass.: Harvard University Press.

Lave, C. and March, J. (1975) An Introduction to Models in the Social Sciences. New York: Harper and Row.

Lawler, E. E. III and Hackman, J. (1969) Impact of employee participation in the development of pay incentive plans: A field experiment. Journal of Applied Psychology, 53, 467-471.

Lawrence, P. R. and Lorsch, J. W. (1967) Organization and Environment. Boston, Mass.: Harvard Business School.

Lewin, K. (1943) Forces behind food habits and methods of change. National Research Council, Bulletin, 108, 35-65.

Lewin, K. (1953) Studies in group decision. In D. Cartwright and A. Zander (eds) Group Dynamics. Evanston, Ill.: Row, Peterson, 287-301.

Lindsey, D. and Lindsey, T. (1978) The outlook of journal editors and referees on the normative criteria of scientific craftsmanship. Quality and Quantity, 12, 45-62.

Line, M. B. and Sandison, A. (1974) ‘Obsolescence’ and changes in the use of literature with time. Journal of Documentation, 30, 283-350.

Locke, E. A. (1968) Toward a theory of task motivation and incentives. Organizational Behavior and Human Performance, 3, 157-189.

Locke, E. (1976) The nature and causes of job satisfaction. In M. D. Dunnette (ed.), Handbook of Industrial and Organizational Psychology. New York: John Wiley, 1297-1349.

Locke, E. A. (1977) The myths of behavior mod in organizations. Academy of Management Review, 4, 543-553.

Locke, E. A., Feren, D. B., McCaleb, V. M., Shaw, K. N. and Denny, A. T. (1980) The relative effectiveness of four methods of motivating employee performance. In K. D. Duncan, M. M. Gruneberg and D. Wallis (eds) Changes in Working Life. New York: John Wiley, 363-388.

Lodahl, J. B. and Gordon, G. (1972) The structure of scientific fields and the functioning of university graduate departments. American Sociological Review, 37, 57-72.

Lord, R. G., De Vader, C. L. and Alliger, G. M. (1986) A meta-analysis of the relation between personality traits and leadership perceptions: An application of validity generalization procedures. Journal of Applied Psychology, 71, 402-410.

Mahoney, M. J. (1977) Publication prejudices: An experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research, 1, 161-175.

Mahoney, M. J. and DeMonbreun, B. G. (1977) Psychology of the scientist: An analysis of problem-solving bias. Cognitive Therapy and Research, 1, 229-238.

Maier, N. R. F. (1963) Problem-solving Discussions and Conferences: Leadership methods and skills. New York: McGraw-Hill.

Mann, R. D. (1959) A review of the relationships between personality and performance in small groups. Psychological Bulletin, 56, 241-270.

Marrow, A. J., Bowers, D. G. and Seashore, S. E. (1967) Management by Participation. New York: Harper and Row.

Marx, K. (1859) A Contribution to the Critique of Political Economy. Chicago: Kerr.

Masterman, M. (1970) The nature of a paradigm. In I. Lakatos and A. Musgrave (eds) Criticism and the Growth of Knowledge. London: Cambridge University Press.

Mayo, E. (1946) The Human Problems of an Industrial Civilization. Boston, Mass.: Harvard University Press, Graduate School of Business Administration.

McEvoy, G. M. and Cascio, W. F. (1985) Strategies for reducing employee turnover: A meta-analysis. Journal of Applied Psychology, 70, 342-353.

McGuire, W. J. (1983) A contextualist theory of knowledge: Its implications for innovation and reform in psychological research. In L. Berkowitz (ed.), Advances in Experimental Social Psychology. Orlando: Academic Press, 1-47.

Meehl, P. E. (1954) Clinical versus Statistical Prediction: A theoretical analysis and review of the evidence. Minneapolis, Minn.: University of Minnesota Press.

Miner, J. B. (1980) Theories of Organizational Behavior. Hinsdale, Ill.: Dryden.

Miner, J. B. (1984) The validity and usefulness of theories in an emerging organizational science. Academy of Management Review, 9, 296-306.

Mitchell, T. R. (1979) Organizational behavior. Annual Review of Psychology, 30, 243-281.

Mitchell, T. R., Beach, L. R. and Smith, K. 0. (1985) Some data on publishing from the authors’ and reviewers’ perspectives. In L. L. Cummings and P. J. Frost (eds) Publishing in the Organizational Sciences. Homewood, Ill.: Richard D. Irwin, 248-264.

Moravcsik, M. J. and Murugesan, P. (1975) Some results on the function and quality of citations. Social Studies of Science, 5, 86-92.

Nelson, N., Rosenthal, R. and Rosnow, R. L. (1986) Interpretation of significance levels and effect sizes by psychological researchers. American Psychologist, 41, 1299-1301.

Nord, W. R. (1982) Continuity and change in Industrial/Organizational psychology: Learning from previous mistakes. Professional Psychology, 13, 942-952.

Orwin, R. G. and Cordray, D. S. (1985) Effects of deficient reporting on meta-analysis: A conceptual framework and reanalysis. Psychological Bulletin, 97, 134-147.

Ozer, D. J. (1985) Correlation and the coefficient of determination. Psychological Bulletin, 97, 307-315.

Payne, R. L. and Pugh, D. S. (1976) Organizational structure and climate. In M. D. Dunnette (ed.) Handbook of Industrial and Organizational Psychology. Chicago: Rand-McNally, 1125-1173.

Peters, D. P. and Ceci, S. J. (1982) Peer-review practices of psychological journals: The fate of published articles, submitted again. The Behavioural and Brain Sciences, 5, 187-195.

Peters, L. H., Hartke, D. D. and Pohlmann, J. T. (1985) Fiedler’s contingency theory of leadership: An application of the meta-analysis procedures of Schmidt and Hunter. Psychological Bulletin, 97, 274-285.

Phillips, D. L. (1971) Knowledge from What?: Theories and methods in social research. Chicago: Rand-McNally.

Pinski, G. and Narin, F. (1979) Structure of the psychological literature. Journal of the American Society for Information Science, 30, 161-168.

Price, D. J. de S. (1965) Networks of scientific papers. Science, 149, 510-515.

Price, D. J. de S. (1970) Citation measures of hard science, soft science, technology, and nonscience. In C. E. Nelson and D. K. Pollock (eds) Communication among Scientists and Engineers. Lexington, Mass.: Heath Lexington, 3-22.

Roberts, K. H. and Glick, W. (1981) The job characteristics approach to task design: A critical review. Journal of Applied Psychology, 66, 193-217.

Roethlisberger, F. J. and Dickson, W. J. (1939) Management and the Worker. Cambridge, Mass.: Harvard University Press.

Rosenthal, R. (1966) Experimenter Effects in Behavioral Research. New York: Appleton-Century-Crofts.

Salancik, G. R. (1986) An index of subgroup influence in dependency networks. Administrative Science Quarterly, 31, 194-211.

Sampson, E. E. (1969) Studies in status congruence. In L. Berkowitz (ed.), Advances in Experimental Social Psychology. New York: Academic Press, 225-270.

Sanford, N. (1982) Social psychology: Its place in personology. American Psychologist, 37, 896-903.

Schneider, B. (1985) Organizational behavior. Annual Review of Psychology, 36, 573-611.

Schriesheim, C. A. and Kerr, S. (1977) Theories and measures of leadership: A critical appraisal of current and future directions. In J. G. Hunt and L. L. Larson (eds) Leadership: The cutting edge. Carbondale, Ill.: Southern Illinois University Press, 9-45.

Scott, K. D. and Taylor, G. S. (1985) An examination of conflicting findings on the relationship between job satisfaction and absenteeism: A meta-analysis. Academy of Management Journal, 28, 599-612.

Seashore, S. E. (1954) Group Cohesiveness in the Industrial Work Group. Ann Arbor, Mich.: Institute for Social Research.

Shames, M. L. (1987) Methodocentricity, theoretical sterility, and the socio-behavioral sciences. In W. J. Baker, M. E. Hyland, H. Van Rappard, and A. W. Staats (eds) Current Issues in Theoretical Psychology. Amsterdam: North-Holland.

Sharplin, A. D. and Mabry, R. H. (1985) The relative importance of journals used in management research: An alternative ranking. Human Relations, 38, 139-149.

Skinner, B. F. (1953) Science and Human Behavior. New York: Macmillan.

Small, H. (1980) Co-citation context analysis and the structure of paradigms. Journal of Documentation, 36, 183-196.

Smedslund, J. (1984) What is necessarily true in psychology? In J. R. Royce and L. P. Mos (eds) Annals of Theoretical Psychology. New York: Plenum Press, 241-272.

Snyder, M. (1981) Seek, and ye shall find: Testing hypotheses about other people. In E. T. Higgins, C. P. Herman and M. P. Zanna (eds) Social Cognition. The Ontario Symposium. Hillsdale, N.J.: Lawrence Erlbaum, 277-303.

Stagner, R. (1982) Past and Future of Industrial/Organizational Psychology. Professional Psychology, 13, 892-902.

Starbuck, W. H. (1976) Organizations and their environments. In M. D. Dunnette (ed.) Handbook of Industrial and Organizational Psychology. Chicago: Rand-McNally, 1069-1123.

Starbuck, W. H. (1981) A trip to view the elephants and rattlesnakes in the garden of Aston. In A. H. Van de Ven and W. F. Joyce (eds) Perspectives on Organization Design and Behavior. New York: Wiley-Interscience, 167-198.

Starbuck, W. H. (1985) Acting first and thinking later: Theory versus reality in strategic change. In J. M. Pennings and Associates, Organizational Strategy Decision and Change. San Francisco: Jossey-Bass, 336-372.

Starbuck, W. H. (1988) Surmounting our human limitations. In R. Quinn and K. Cameron (eds) Paradox and Transformation: Toward a theory of change in organization and management. Cambridge, Mass.: Ballinger.

Starbuck, W. H. and Nystrom, P. C. (1981) Designing and understanding organizations. In P. C. Nystrom and W. H. Starbuck (eds) Handbook of Organizational Design. Oxford: Oxford University Press, ix-xxii.

Staw, B. M. (1976) Knee deep in the Big Muddy: A study of escalating commitment to a chosen course of action. Organizational Behavior and Human Performance, 16, 27-44.

Staw, B. M. (1984) Organizational behavior: A review and reformulation of the field’s outcome variables. Annual Review of Psychology, 35, 627-666.

Steel, R. P. and Ovalle, N. K. II (1984) A review and meta-analysis of research on the relationship between behavioral intentions and employee turnover. Journal of Applied Psychology, 69, 673-686.

Steers, R. M. and Mowday, R. T. (1981) Employee turnover and post-decision accommodation process. In B. M. Staw and L. L. Cummings (eds) Research in Organizational Behavior. Greenwich, Conn.: JAI Press, 235-282.

Sterling, T. D. (1959) Publication decisions and their possible effects on inferences drawn from tests of significance - or vice versa. Journal of the American Statistical Association, 54, 30-34.

Stogdill, R. M. (1948) Personal factors associated with leadership: A survey of the literature. The Journal of Psychology, 25, 35-71.

Stogdill, R. M. and Coons, A. E. (1957) Leader Behavior. Columbus, Ohio: Ohio State University, Bureau of Business Research.

Stone, E. F. (1978) Research Methods in Organizational Behavior. Glenview, Ill.: Scott, Foresman.

Sullivan, J. W. N. (1928) The Bases of Modern Science. London: Benn.

Summers, E. G. (1979) Information characteristics of the ‘Journal of Reading’ (1957-1977). Journal of Reading, 23, 39-49.

Thorndike, E. L. (1911) Animal Intelligence. New York: Macmillan.

Tinsley, H. E. A. and Heesacker, M. (1984) Vocational behavior and career development, 1983: A review. Journal of Vocational Behavior, 25, 139-190.

Tosi, H., Aldag, R. and Storey, R. (1973) On the measurement of the environment: An assessment of the Lawrence and Lorsch environmental uncertainty subscale. Administrative Science Quarterly, 18, 27-36.

Tweney, R. D., Doherty, M. E. and Mynatt, C. R. (1981) On Scientific Thinking. New York: Columbia University Press.

Van Fleet, D. D. and Yukl, G. A. (1986) A century of leadership research. In D. A. Wren and J. A. Pearce (eds) Papers Dedicated to the Development of Modern Management. Academy of Management, 12-23.

Vroom, V. H. (1964) Work and Motivation. New York: John Wiley.

Vroom, V. H. and Yetton, P. W. (1973) Leadership and Decision-making. Pittsburgh: University of Pittsburgh Press.

Walster, E., Berscheid, E. and Walster, G. W. (1973) New directions in equity research. Journal of Personality and Social Psychology, 25, 151-176.

Watkins, C. E., Jr., Bradford, B. D., Mitchell, B., Christiansen, T. J., Marsh, G., Blumentritt, J. and Pierce, C. (1986) Major contributors and major contributions to the industrial/organizational literature. The Industrial-Organizational Psychologist, 24 (1), 10-12.

Webb, E. J., Campbell, D. T., Schwartz, R. D. and Sechrest, L. (1966) Unobtrusive Measures. Skokie, Ill.: Rand McNally.

Webb, W. B. (1961) The choice of the problem. American Psychologist, 16, 223-227.

Weber, R. L. (1982) More Random Walks in Science. London: The Institute of Physics.

Weber, M. (1947) The Theory of Social and Economic Organization. London: Collier-Macmillan.

Woodward, J. (1965) Industrial Organization: Theory and practice. London: Oxford University Press.

Xhignesse, L. V. and Osgood, C. E. (1967) Bibliographical citation characteristics of the psychological journal network in 1950 and 1960. American Psychologist, 22, 778-791.

Zedeck, S. and Cascio, W. F. (1984) Psychological issues in personnel decisions. Annual Review of Psychology, 35, 461-518.

Zuckerman, H. and Merton, R. K. (1971) Patterns of evaluation in science: Institutionalisation, structure and functions of the referee system. Minerva, IX, 66-100.