? *******************************************************************
? Part 1. Modifications of least squares to estimate inefficiency
? *******************************************************************
? a. Define a production function. We omit the Materials variable
?    for reasons explored in part 5 below.
?
NAMELIST ; X = ONE,LF,LE,LL,LP $
?
? Fit the production function by least squares, then shift the constant
? term up so that all points are below it. Then change the sign, to
? make it easy to interpret the residuals as the inefficiencies.
? The mean inefficiency is 0.418 or about 42%. Seems high. What
? are the minimum and maximum?
?
REGRESS  ; Lhs = LQ ; Rhs = X ; Res = e $
CALC     ; Maxe = Max(e) $
CREATE   ; uicols = Maxe - e $
DSTAT    ; Rhs = uicols $
FRONTIER ; Lhs = LQ ; Rhs = X ; Model = COLS ; Techeff = eucols $
KERNEL   ; Rhs = eucols ; Grid $
?
?
? b. Now, assume a distribution for u(i) and shift the function
? up by an estimate of the mean of u(i) instead. This is an alternative
? way to estimate u(i). It lowers the mean inefficiency to something
? more plausible, 18.7%, but it leaves some of them negative, which
? is difficult to reconcile with the theory. This is not a very
? favorable way to fit the model. This procedure manipulates the
? same OLS residuals.
?
? Modified OLS. For the exponential model, the mean and the standard
? deviation are both 1/theta. We already have an estimator of this
? parameter, the standard deviation of the OLS residuals.
?
? Mean inefficiency is now about 18.7%.
CALC   ; thetainv = sdv(e) $
CREATE ; uimols = thetainv - e $
DSTAT  ; rhs=uimols $
? *******************************************************************
? Part 2. Stochastic frontier model
? *******************************************************************
?
? Now fit the normal - half normal stochastic frontier model.
?
FRONTIER ; Lhs = LQ ; Rhs = X ; Techeff = euihn ; Eff=uihn$
KERNEL   ; Rhs = euihn, eucols ; Grid
         ; Title=Half Normal Model vs. Corrected OLS $
? Coefficients and variance parameters look mostly ok.
? Negative output elasticity (also significant) for labor is
? a problem, however. From the output for the model, we have the
? estimates of the variance parameters,
?| Sigma(v) = .14824 |
?| Sigma(u) = .18661 |
? so the "u" part appears to be larger. But, be careful. The
? standard deviation of u is sqr[(pi - 2)/pi] * sigma(u) = 0.11249.
?
CALC ; List ; SDU = Sqr((pi - 2)/pi) * .18661 $
?
? To test the hypothesis of the frontier model, we use the log
? likelihoods for the SF model and for OLS. From the results already
? computed,
?
? logL(frontier) = 68.54675. LogL(OLS) is 67.01375.
? 2*difference is 2*1.533 = 3.066 < 3.84, the 5% critical value for
? chi-squared with one degree of freedom. So, it looks like the
? inefficiency part of the frontier model is not significant. But, in
? the results for the SF model, the t ratio for lambda is far greater
? than 2. The only way lambda can be nonzero is if sigma(u) is nonzero,
? so on this basis, the hypothesis of the regression model against the
? frontier model is rejected. This is a contradictory result that,
? we assume, results from having a finite sample. There is one more
? statistic, the LM statistic, which equals 4.882. This is consistent
? with the rejection of the null hypothesis of no inefficiency.
? *******************************************************************
? Part 3. Exponential and Rayleigh Models
? *******************************************************************
?
? To fit the exponential model, we just add ;MODEL=E to the command.
FRONTIER ; Lhs = lq ; Rhs = one,lf,le,ll,lp ; Model=Exponential
         ; Techeff=euiexp ; Eff = uiexp $
?
? The Rayleigh model is done likewise.
?
FRONTIER ; Lhs = lq ; Rhs = one,lf,le,ll,lp ; Model=Rayleigh
         ; Techeff=euiray ; Eff = uiray $
?
? Now, examine the distribution of the estimates of E[u(i)|e(i)].
?
KERNEL ; Rhs = euihn,euiexp,eucols,euiray ; Grid
       ; Title=Estimated Technical Efficiency from Several Models $
DSTAT  ; Rhs = uihn,uiexp,uicols,uimols,uiray $
FRONTIER ; lhs = lq ; rhs=one,lf,le,ll,lp ; model=r ; techeff=euiray $
?
? The sample mean is .098, the standard deviation is .064. From the
? exponential model, since theta = 10.2, the implied mean of ui is
? 1/theta=.098. The implied standard deviation is also .098. The
? standard deviation of the computed uis is only .064, but this is
? because the estimates are not of ui but of E[ui|epsilon_i]. So they
? should have a smaller variance.
? *******************************************************************
? Part 4. Observed heterogeneity in the model. Explaining u(i)
? *******************************************************************
?
? The three extra variables (loadfctr, lstage, points) are included in
? the model to attempt to account for the inefficiency based on known
? factors. We first repeat the earlier estimation to get u(i).
?
FRONTIER ; Lhs = lq ; rhs = x ; techeff = euihn $
REGRESS  ; Lhs = euihn ; rhs = one,loadfctr,lstage,points $
?
? Now add the observed heterogeneity to the model and recompute the
? inefficiencies. The correlation is very high, but the differences
? are visible in a plot. One airline in particular seems to stand out.
?
FRONTIER ; Lhs = lq ; rhs = x,loadfctr,lstage,points ; techeff=euihnz $
CALC     ; List ; Cor (euihn, euihnz) $
?
? Plotting UI against UIA (here, euihn against euihnz). Including UI
? in the plot adds a 45 degree line to the figure. Now it is clear
? that UIA is always less than UI. The explanation is that the
? heterogeneity, load factor, etc., has accounted for some of what we
? called inefficiency.
?
PLOT ; Lhs = euihn ; Rhs = euihnz ; Rh2 = euihn ; Grid
     ; Title=Estimated Efficiency with Heterogeneity vs. No Heterogeneity $
?
? How do the 'z' variables affect efficiency? The following plots the
? average efficiency for the sample against the range of values of
? load factor.
?
SIMULATE ; Scenario: & loadfctr = .4(.05).95 ; Plot(ci) $
? *******************************************************************
? Part 5. A vexing problem. The dreaded error 315.
? *******************************************************************
?
? We now try to complete the production function by including the fifth
? factor, materials, in it. Notice that the model "doesn't work" any
? more. The MLE is OLS with zero inefficiency. At this point, one
? should question the model. What do you think we should do next?
?
FRONTIER ; Lhs = lq ; rhs = x,lm,loadfctr,lstage,points $
REGRESS  ; Quietly ; Lhs = lq ; Rhs = x,lm,loadfctr,lstage,points
         ; Res = u315$
KERNEL   ; Rhs = u315;normal $
?
? One thing we might do is add a restriction to the model. Note, we did
? this earlier, implicitly, by omitting LM from it. Now, instead, we
? impose constant returns to scale. Now it works. Is constant returns
? to scale a reasonable restriction? Keep in mind, the labor coefficient
? in our results is persistently negative, so in any event, the whole
? model is suspect. Note the 315 warning is given based on the
? unconstrained OLS residuals.
?
FRONTIER ; Lhs = lq ; rhs = x,lm,loadfctr,lstage,points
         ; cml: lf+lm+le+ll+lp = 1 $
? Here we do an experiment. The OLS residuals produce the 315 error.
? When the CRTS restriction is imposed, the residuals are negatively
? skewed, which is what we need. The following examines the two sets
? of residuals.
REGRESS ; Quietly ; Lhs = lq ; rhs = x,lm,loadfctr,lstage,points
        ; Res = OLS $
REGRESS ; Quietly ; Lhs = lq ; rhs = x,lm,loadfctr,lstage,points
        ; cls: lf+lm+le+ll+lp = 1 ; Res=CNS_OLS$
DSTAT   ; Rhs = OLS,CNS_OLS ; Normality $
KERNEL  ; Rhs = OLS,CNS_OLS $
? *******************************************************************
? Part 6. Comparing stochastic frontier and DEA.
? *******************************************************************
?
? We now use DEA as an alternative method of examining inefficiency.
? To begin, we recompute the normal-half normal model and its estimates
? of the inefficiency terms. These are then translated into estimates
? of efficiency.
?
FRONTIER ; Lhs = LQ ; Rhs = X ; techeff= euisf $
?
? We use data envelopment rather than SF to compute efficiency firm
? by firm.
? Note, for DEA, we use levels, not logs. Also, of course,
? there is no constant term. (There is no "function." LHS and RHS here
? just tell LIMDEP which variable is the output and which are the
? inputs.) We pick up the input oriented efficiency from the
? computation.
?
FRONTIER ; Lhs = output ; Rhs = fuel,eqpt,labor,prop ; alg=dea $
CREATE   ; euidea=deaeff_o$
?
DSTAT    ; Rhs = euisf,euidea $
?
? Additional ways to compare the two. The two sets of results are not
? actually all that different. They are closer than in many other
? studies.
?
PLOT   ; Lhs=euidea ; Rhs=euisf ; Rh2 = euidea $
CALC   ; List ; Cor(euidea,euisf)$
KERNEL ; Rhs=euisf,euidea ; Grid
       ; Title=Densities for SF and DEA Efficiencies $
?
? The DEA computation has no direct way to incorporate heterogeneity
? in the computation. Some researchers compute a second step analysis
? by regressing the estimated efficiencies on the interesting variables.
? Since some of the efficiency values are 1.0 by construction, some have
? used a tobit model to account for this. (This is not necessarily a
? good idea, as the data are not at all generated by a tobit model.
? But, it has been done.) In a compromise, others have used a truncated
? regression. Not necessarily a better idea, but it has been done.
? What do you find?
REGRESS ; lhs = euidea;rhs=one,loadfctr,lstage,points$
?
? A followup exercise: The DEA computation has produced a second
? efficiency measure, DEAEFF_I. We analyzed the input based measure
? above, DEAEFF_O. Repeat the exercise with the output oriented
? measure, DEAEFF_I. Do you get the same results?
?********************************************************************
?
? This exercise uses the Spanish Dairy Data. We compare SFA and
? DEA. The similarity of the predictions is striking.
?
?********************************************************************
?
? Declare the panel, then compute farm means of the inputs and output.
SETPANEL ; Group = farm ; Pds=T $
NAMELIST ; (new) ; means=cowsbar,landbar,laborbar,feedbar $
NAMELIST ; factors = cows,land,labor,feed $
CREATE   ; means = Group Mean (factors,pds=t)$
CREATE   ; milkbar=Group Mean (milk,pds=t) $
? Logs of the farm means for the stochastic frontier model.
CREATE   ; yb=log(milkbar) ; x1b=log(cowsbar) ;x2b=log(landbar)
         ; x3b=log(laborbar) ;x4b=log(feedbar)$
? Rescaled levels of milk and feed for DEA.
CREATE   ; Output = Milkbar/10000 ; Food = feedbar/10000 $
? DEA, then the stochastic frontier, both on the year = 98 rows.
FRONTIER ; If [ year = 98] ; Lhs = output
         ; Rhs = cowsbar,landbar,laborbar,food ; Alg=DEA ; List $
FRONTIER ; If[year=98] ;Lhs = yb ; Rhs = one,x1b,x2b,x3b,x4b
         ; techeff = eusf $
? Compare the two sets of efficiency estimates.
DSTAT    ; if[year=98] ; Rhs = eusf,deaeff_o $
PLOT     ; if[year=98] ; Lhs=eusf ; Rhs = deaeff_o ; Rh2=eusf
         ; Title=Stochastic Frontier Efficiency vs. DEA ; Grid $
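?
? A possible follow-up, sketched by analogy with the KERNEL comparison
? in Part 6: overlay the kernel density estimates of the two sets of
? efficiency values for the dairy farms, to see how similar the two
? distributions are. This is only a sketch and has not been run.
? Assumption: KERNEL accepts the same ; If[...] restriction that
? FRONTIER, DSTAT and PLOT use above. If it does not, restrict the
? current sample to the year = 98 rows first. The title is made up
? for this illustration.
?
KERNEL ; If[year=98] ; Rhs = eusf,deaeff_o ; Grid
       ; Title=SF and DEA Efficiency Densities for Dairy Farms $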