Recently a vendor supporting Colorado, a State using Burning Glass, reached out to us because the occupational coding provided by Burning Glass was deemed unacceptable. After we coded 50k records for them using O*NET-SOC AutoCoder v8.0, the vendor compared the results and licensed our software … it was one of the fastest sales we’ve ever made.
Since we guarantee 85% accuracy to our customers, one of the things we do is provide an accuracy evaluation back to them … below you’ll find the results. Burning Glass was asked to participate in the evaluation, but declined.
No occupational coder is perfect, but as you’ll see below, the results vary widely.
As you review the following results, please keep in mind the results are biased by two facts: 1) ‘expert’ codes were assigned only by me; and 2) the AutoCoder methodologies were developed based on my interpretation of the O*NET-SOC coding guidelines … so the AutoCoder-assigned codes are likely to appear ‘correct’ to me. That said, I provided the spreadsheet with all of the ‘expert’ codes to the vendor and to ETA, knowing that I’d have to defend my choices to others who might not agree with me. This knowledge that others would be ‘looking over my shoulder’ helped reduce the personal bias, and in fact may have led me to overcompensate.
So, here are the results:
Overall accuracy scores:
O*NET-SOC AutoCoder v8.0: 82.0%
Colorado staff: 67.9%
OccuCoder v1.2: 51.0%
Notes: The O*NET-SOC AutoCoder v8.0 accuracy score was produced with National tuning factors rather than the State-specific tuning factors for Colorado, which were still under development. After the Colorado factors were integrated, the overall results exceeded our 85% accuracy guarantee. By the way, the Colorado staff accuracy is the highest staff-assigned accuracy I’ve ever seen … far higher than the national average in the low 50% range … congratulations!
Percent of Zero-Score Records:
O*NET-SOC AutoCoder v8.0: 5.5%
Colorado staff: 9.3%
OccuCoder v1.2: 31.5%
Notes: Zero-score records are the worst-case results … the codes assigned are completely off target since they don’t match any of the ‘expert’ codes, even at the 2-digit level. Thus a high rate of occurrence can seriously undermine confidence in all of the assigned codes. Note that almost 1/3 of the codes assigned by OccuCoder v1.2 earned no points.
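For readers who want to run the same check on their own data, here is a minimal sketch of the zero-score test described above. It is not the evaluation code we used. The function names, the record layout (one assigned code plus a list of ‘expert’ codes) and the sample records are all assumptions made for illustration, and the 2-digit level is taken to be the first two characters of the code.

```python
from typing import Iterable, List, Tuple


def two_digit_group(code: str) -> str:
    """Return the 2-digit prefix of a code-shaped string such as '29-1141.00'."""
    return code.strip()[:2]


def is_zero_score(assigned: str, expert_codes: List[str]) -> bool:
    """True when the assigned code shares no 2-digit group with any expert code."""
    assigned_group = two_digit_group(assigned)
    return all(two_digit_group(e) != assigned_group for e in expert_codes)


def zero_score_rate(records: Iterable[Tuple[str, List[str]]]) -> float:
    """Percent of (assigned, expert_codes) records that are zero-score."""
    records = list(records)
    if not records:
        return 0.0
    misses = sum(is_zero_score(assigned, experts) for assigned, experts in records)
    return 100.0 * misses / len(records)


# Made-up records for illustration only (not taken from the Colorado sample).
sample = [
    ("29-1141.00", ["29-1141.00"]),                # exact match
    ("29-2061.00", ["29-1141.00", "31-1014.00"]),  # shares a 2-digit group with one expert code
    ("51-9199.00", ["15-1132.00"]),                # no 2-digit match: zero-score
    ("41-2031.00", ["41-2031.00", "43-4051.00"]),  # exact match
]

print(zero_score_rate(sample))  # 1 of 4 records is zero-score -> 25.0
```

Only the third record misses every expert code at the 2-digit level, so the sketch reports a zero-score rate of 25%.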
Occupational Distribution Consistency:
O*NET-SOC AutoCoder v8.0 Correlation: .9780
Colorado staff Correlation: .9403
OccuCoder v1.2 Correlation: .5594
Correlating the 2-digit frequency count for each coding method against the distribution of the expert top codes shows how well each option does at painting a clear picture of the actual distribution of occupations. In contrast to the overall accuracy numbers, these correlation coefficients indicate consistency … e.g. ‘is AutoCoder as accurate at identifying medical occupations as it is at identifying computer occupations?’
Clearly O*NET-SOC AutoCoder v8.0 and the Colorado staff are quite consistent, given the high correlation values, while OccuCoder is much less consistent. The associated graph illustrates the major problem for OccuCoder … it classifies far too many records as Manufacturing.
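If you’d like to reproduce this kind of consistency check, the sketch below counts codes by 2-digit group for one coding method and for the expert top codes, then computes a Pearson correlation between the two sets of counts. Treat it as an illustration under stated assumptions: the function names and sample codes are made up, the groups compared are simply those that appear in either list, and I’m assuming a standard Pearson coefficient, which may differ in detail from the spreadsheet calculation behind the numbers above.

```python
from collections import Counter
from math import sqrt
from typing import List, Sequence


def group_counts(codes: Sequence[str], groups: Sequence[str]) -> List[int]:
    """Count codes per 2-digit group, in the order given by `groups`."""
    counts = Counter(code[:2] for code in codes)
    return [counts.get(g, 0) for g in groups]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one distribution is flat")
    return cov / sqrt(var_x * var_y)


def distribution_correlation(assigned: Sequence[str], expert_top: Sequence[str]) -> float:
    """Correlate the 2-digit frequency counts of assigned vs. expert top codes."""
    # Assumption: compare over every 2-digit group seen in either list.
    groups = sorted({c[:2] for c in assigned} | {c[:2] for c in expert_top})
    return pearson(group_counts(assigned, groups), group_counts(expert_top, groups))


# Made-up codes for illustration only.
expert_top = ["29-1141.00", "29-1141.00", "15-1132.00"]
consistent = ["29-2061.00", "29-1141.00", "15-1132.00"]  # same 2-digit mix as the experts
skewed = ["51-9199.00", "51-9199.00", "29-1141.00"]      # piles records into group 51

print(distribution_correlation(consistent, expert_top))
print(distribution_correlation(skewed, expert_top))
```

The `consistent` list matches the expert mix at the 2-digit level and correlates at 1.0 even though one detailed code differs, while the `skewed` list over-assigns a single group and comes out at -0.5. That is the pattern the correlation is meant to expose … a coder can be wrong on individual records and still paint an accurate picture of the distribution, or the reverse.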
Fit Scores as a Predictor of Accuracy:
O*NET-SOC AC v8.0: Avg Fit Score = 78.3 (Overall Accuracy = 82.0%)
Colorado staff: Not Available
OccuCoder v1.2: Avg Fit Score = 99.9 (Overall Accuracy = 51.0%)
Notes: When an automated coding tool is used as a staff aid, it is helpful if the tool returns a score indicating the level of confidence in the returned occupations … staff can use these ‘fit’ scores to focus on the cases where confidence is low. AutoCoder v8.0 fit scores are designed to be conservative; thus the average fit score of 78.3 is slightly below the overall accuracy of 82.0%. OccuCoder scores, on the other hand, are non-predictive … virtually every record receives a top score of 100, rather than a number close to the overall accuracy of 51%.
A closer look at the accuracy of the codes assigned by O*NET-SOC AutoCoder v8.0 to the records in this sample, broken out by fit score, shows that fit scores are indeed a strong predictor of accuracy:
Accuracy when Fit Score >=50 and <60: 50.2%
Accuracy when Fit Score >=60 and <70: 59.8%
Accuracy when Fit Score >=70 and <80: 80.5%
Accuracy when Fit Score >=80 and <90: 91.5%
Accuracy when Fit Score >=90 and <=100: 98.4%
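A breakdown like the one above can be produced by grouping records into fit-score bands and averaging the accuracy within each band. The sketch below shows one way to do that. It assumes each record carries a fit score and a per-record accuracy score on a 0-100 scale … the record layout, the function names and the sample values are hypothetical, and records with a fit score outside 50-100 are simply left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def band_label(fit_score: float) -> Optional[str]:
    """Map a fit score to its 10-point band, or None if it falls outside 50-100."""
    if fit_score < 50 or fit_score > 100:
        return None
    low = min(int(fit_score // 10) * 10, 90)  # a score of 100 belongs to the top band
    upper = "<=100" if low == 90 else f"<{low + 10}"
    return f">={low} and {upper}"


def accuracy_by_band(records: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Average per-record accuracy (0-100) for each fit-score band."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for fit_score, accuracy in records:
        label = band_label(fit_score)
        if label is not None:
            scores[label].append(accuracy)
    return {label: sum(vals) / len(vals) for label, vals in scores.items()}


# Made-up (fit score, accuracy) pairs for illustration only.
sample_records = [(55, 50.0), (58, 100.0), (95, 100.0), (100, 90.0), (42, 0.0)]

for label, avg in sorted(accuracy_by_band(sample_records).items()):
    print(f"Accuracy when Fit Score {label}: {avg:.1f}%")
```

With these sample values the 50s band averages 75.0% and the top band 95.0%; the record with a fit score of 42 is ignored. If the fit scores are doing their job, the band averages should rise steadily from the lowest band to the highest.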
Wrapping up, I hope you find this review useful … we certainly do. All of the records that coded poorly have been reviewed, and our dictionary has been updated to handle these records more accurately. In addition, we ran a text-mining program to identify words that occur more frequently in ads for Colorado employers … these too were used to update the State-specific tuning for Colorado.