Sunday, January 31, 2010

Lessons from the NYC Fire case - part 1

Part 1 of 2

New York City, like the cities of New Haven and Chicago, has a long history of employment discrimination litigation related to its firefighter testing.

Since the 1970s and cases like Guardians, the city has been under scrutiny for its woefully low number of black firefighters.

In 2007 the city found itself faced with another lawsuit over its firefighter hiring practices, and in July of 2009, a U.S. District Court judge found that the city had violated Title VII by administering written exams from 1999-2007 that had high levels of adverse impact. The city marshaled an inadequate defense. In January of 2010, the same judge (Nicholas Garaufis) found the city liable for a pattern and practice of disparate treatment for those same exams. An adverse impact finding, particularly for written exams, and especially for public safety tests, is not earth-shattering. But a finding of disparate treatment in this situation is less common.

This case, while only one example and limited in its impact, has some valuable lessons for test users and sheds some light on how judges look at our field. In particular, I describe below the first five of nine points the judge specifically made and what lessons we can draw from them:

1) While the city conducted a job analysis with an "extensive" list of tasks and surveyed incumbents, the city offered "no evidence of 'the relationship of abilities to tasks.'" They conducted a linkage, but the judge found that the SMEs were confused about what they were supposed to do and didn't understand several of the abilities they were rating.

Lesson: simply having subject matter experts (SMEs) link essential tasks and knowledge, skills, and abilities (KSAs) is not sufficient. You need to ensure they understand the statements they are linking as well as how exactly they are supposed to be linking them.

2) In conducting the job analysis, the city inappropriately retained tasks and KSAs that could be learned on the job. It is quite clear (e.g., per the Uniform Guidelines) that only tasks and KSAs that are required upon entry to the job should be identified as critical in terms of exam development.

Lesson: make sure that when you are developing exams based on job analysis results that you focus only on those tasks and KSAs that are required upon entry to the job. This should be determined by your SMEs.

3) The city relied to some extent upon the work of a previous test developer, Dr. Frank Landy (who sadly recently passed away). In addition to noting the tenuous link between Dr. Landy's work and the current exams, the judge made it clear that "reliance on the stature of a test-maker cannot stand in for a proper showing of validity." At the same time, the judge emphasized that exams should be constructed by "testing professionals."

Lesson: tests should be developed by people who know what they're doing. This means HR professionals with the requisite background in test validation and construction in conjunction with job experts. Do not rely solely on previous efforts, particularly when (as in this case) the results of those efforts were either incomplete or not fully relevant to your current situation.

4) The city performed no "sample testing" to ensure that the questions were reliable as well as "comprehensible and unambiguous."

Lesson: few steps in the test development process are as easy--or as valuable--as pilot testing. I have yet to see an exam that didn't benefit from a "trial run" with a group of incumbents. Not only will you catch unintended flaws, you will verify that the exam is doing what you claim it is.

5) There was insufficient evidence that the exams actually measured the (nine cognitive) KSAs the city claimed they intended to measure. Plaintiffs were able to suggest the opposite through analyzing convergent and discriminant validity as well as by conducting a factor analysis.

Lesson: there are two linkages of primary importance in test development. The first was described in #1. The second is the link between critical KSAs and the exam(s). At the very least, you must be able to show evidence that there is a logical link between the two. When you claim to be measuring cognitive abilities, you incur an additional responsibility, which is gathering statistical evidence that supports this claim.
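To make the convergent/discriminant idea concrete, here is a minimal sketch with invented scores (the KSA names and numbers are mine, not from the case). Two items intended to tap the same ability should correlate strongly with each other (convergent evidence) and weakly with an item intended to tap a different ability (discriminant evidence):

```python
# Toy illustration of convergent vs. discriminant validity evidence.
# All data below are invented for demonstration purposes only.

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Two hypothetical items meant to measure deductive reasoning,
# and one meant to measure an unrelated ability:
deductive_a = [4, 7, 6, 9, 3, 8, 5, 7]
deductive_b = [5, 8, 6, 9, 2, 7, 4, 8]   # tracks deductive_a closely
spatial     = [9, 3, 5, 4, 8, 2, 7, 3]   # unrelated score pattern

convergent   = pearson_r(deductive_a, deductive_b)  # expect high
discriminant = pearson_r(deductive_a, spatial)      # expect low or negative

print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```

If the pattern runs the other way--items correlating more strongly across claimed abilities than within them--that is exactly the kind of statistical evidence plaintiffs used to argue the exams did not measure what the city said they measured.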

Next time: more lessons and the relief order.

Friday, January 22, 2010

Jan '10 issue of JAP, plus APA gets stingy

The January 2010 issue of the Journal of Applied Psychology is out, and there are some good articles to take a look at. It just may be more difficult to see them. More on that in a minute.

First, here are some of the titles in this issue:

Emotional intelligence: An integrative meta-analysis and cascading model. A must for anyone interested in EI; posits and supports a cascading model whereby emotion perception-->emotion understanding-->emotion regulation-->job performance.

Time is on my side: Time, general mental ability, human capital, and extrinsic career success. GMA shown to have strong links to two extrinsic measures of career success, income and prestige. (ah, but are smart people happier?)

I won’t let you down… or will I? Core self-evaluations, other-orientation, anticipated guilt and gratitude, and job performance. Core self-evaluations' impact on job performance may depend on how much they focus on others.

Understanding performance ratings: Dynamic performance, attributions, and rating purpose. Performance ratings are influenced by a variety of things, including overall performance variance and purpose of the ratings.

Okay, so back to my earlier comment: it appears that APA has restricted viewing abstracts of their journals to registered members (hence the lack of links in this post). On the one hand, no big deal, it appears you can simply register to gain access. On the other hand...why should someone have to do this? This is another unfortunate example of research being restricted (first by charging exorbitant fees for articles, now through personal identification) and contributes to the field being insular.

Granted, APA's not the only one that does this (hey buddy, got $400 for the CRL?) but that doesn't excuse it. Our field benefits from sharing of information, not just among professionals but with the general public. Requiring registration does not further that goal. Thankfully some individual researchers (see the sidebar on the main page) allow access to their work--something we should all be grateful for.

Monday, January 18, 2010

How to get r = 1.0

Recruiters have a variety of measures of their success, often including process outcomes (time-to-fill, number of requisitions filled, etc.).

And although assessment professionals have a variety of success measures, some in common with recruiters (e.g., tenure), there is one measure that stands above all others: job performance.

The "gold standard" of this measurement is to correlate test scores with job performance measures (called criterion-related validation evidence). A correlation of, say, .50 between these two is considered outstanding. Square that and you have the proportion of variance in job performance explained. So in other words, when we can explain 25% of job performance with assessments, we call that success (and with good reason, because it's a heck of a lot better than 0%).
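The arithmetic is simple enough to show in a few lines. This sketch uses made-up test scores and performance ratings (not real data) to illustrate how r and r-squared relate:

```python
# Toy illustration: the squared correlation between test scores and a
# criterion gives the proportion of variance explained.
# All numbers below are invented for demonstration only.

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

test_scores = [62, 75, 81, 90, 55, 70, 88, 66]   # hypothetical exam scores
performance = [3.1, 3.4, 4.0, 4.2, 2.8, 3.6, 4.1, 3.0]  # supervisor ratings

r = pearson_r(test_scores, performance)
print(f"r = {r:.2f}, variance explained = {r ** 2:.0%}")
```

An r of .50 explains 25% of the variance; even an impressive-sounding r of .70 leaves more than half the variance in performance unexplained.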

Why not higher than 25%? What would it take to get r = 1.0--in other words, a perfect correlation between test scores and performance? Here is a somewhat tongue-in-cheek recipe for achieving this impossible dream:

1. An accurate identification of the top competencies/KSAs required for the job. Qualified subject matter experts reach consensus on a handful of far and away the most important qualities that impact job performance.

2. Perfectly constructed and administered, perfectly reliable and accurate measures of the top KSAs.

3. Variability among applicants in terms of amount of the relevant KSAs possessed.

4. Test scores combined and weighted appropriately given the job analysis results.

5. Variability in scores for those hired.

6. A clear description of the work to be performed and competencies to be demonstrated so the individuals understand expectations.

7. Perfectly reliable, accurate measures of job performance that capture behaviors one would logically relate to the critical KSAs.

8. A supportive work environment (e.g., high quality supervision, adequate resources) so this doesn't interfere with work performance.

9. Variability in job performance among those hired using the assessments.

10. Elimination of outside factors that may contribute to lower job performance (e.g., family emergencies, medical/psychological changes).

As you can see, some of these are achievable (1, 4, 6), others are challenging and depend on circumstances, but are not impossible to achieve (3, 5, 8, 9), and some are practically impossible (2, 7, 10). I said earlier this was tongue-in-cheek because obviously we'll never have a situation where all of these conditions (as well as ones I'm sure I forgot) are true.
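Items 2 and 7 alone put a hard ceiling on the correlation. The classical attenuation formula (often attributed to Spearman) says the observed validity coefficient is the true correlation shrunk by the square roots of the reliabilities of the predictor and the criterion. A quick sketch, using reliability values I chose for illustration:

```python
# Classical attenuation formula: observed r equals the true correlation
# multiplied by the square roots of predictor and criterion reliability.
# Even a truly perfect relationship (r_true = 1.0) yields an observed
# r below 1.0 unless both measures are perfectly reliable.

def observed_r(r_true, rel_test, rel_criterion):
    """Observed validity coefficient given true r and two reliabilities."""
    return r_true * (rel_test * rel_criterion) ** 0.5

# A "perfect" true relationship measured with good-but-imperfect tools
# (reliability values here are illustrative assumptions):
print(f"{observed_r(1.0, 0.90, 0.70):.2f}")  # prints 0.79
```

So with a quite respectable test (reliability .90) and a typical supervisory rating criterion (reliability .70), a perfect underlying relationship could never show up as better than roughly r = .79.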

Does this mean we should abandon the correlation between test score(s) and job performance? Absolutely not. It should continue to be one of our "gold standards" for measuring our success as assessment professionals. But we--and our customers--should have our eyes wide open before pressing "compute."

Wednesday, January 13, 2010

Meet the new SIOP...same as the old SIOP

The votes are in, and the new name for the Society for Industrial and Organizational Psychology (SIOP) is...the same.

After over a thousand votes from members, the existing acronym beat The Society for Organizational Psychology (TSOP) by a tally of 51% to 49%--a difference of 15 votes. You can read my comments about this option--and my prediction of the outcome--here.

Why is this non-news, news? Because it's problematic that the main professional, scientific body that devotes itself to researching the psychology of organizations and work (POW!) repeatedly has identity issues. This is in large part because of the word "industrial", which makes it sound like we're all studying factory workers. I am not alone in having people look at me sideways when I attempt to explain our field.

To be perfectly honest, I am reluctant to describe my focus as "psychology", except to others in the same field. It sidetracks the conversation (perhaps due to my insufficient skill). It's much easier to connect with people by saying I'm in Human Resources. This isn't to say that the focus on psychology isn't important, or that others in I/O psychology might not mind using this phrase, or that there isn't some brand value in SIOP. But call me crazy, if you're reluctant to name your field (and attorneys don't count--people know, or think they know, what you do), the profession has a problem.

So, our identity struggle continues. An interesting follow-up study might be to ask SIOP members how they describe their field of work to non-I/O folk and break that down by area of focus. It's a big tent.

Personally, I prefer something that includes Work and Organizational. Mix and match letters as you will.

On a positive note, did you know you can access all of SIOP's quarterly news publication, TIP, here? The January 2010 issue has pieces on integrated performance management, a preview of Lewis v. City of Chicago, and a lot more.