If the research community at large is not already well aware of it, I recommend taking a moment to check out Amazon’s Mechanical Turk for its data collection potential. Like its namesake, the historical chess-playing “machine” known as the Turk, the site works by harnessing people behind the scenes to do bite-sized “Human Intelligence Tasks,” or HITs. HITs are small tasks still better performed by human than by machine intelligence; they include tagging photos with the proper labels, eliminating duplicate entries from catalogs, writing reviews, and, most importantly for academic researchers, filling out surveys.
As an example, I just ran a trial of Mechanical Turk for a study I’m working on about curiosity for inherently positive or negative information and the circumstances in which subjects are better able to control the satisfaction of their curiosity. I created a HIT from the control version of the two curiosity questionnaires and asked that the survey be completed by up to 100 unique subjects for a reward of 25 cents each. Less than four hours after submitting the HIT I had 100 responses at a cost of $27.50 ($25.00 in rewards plus the site’s small usage fee). Additionally, Mechanical Turk allows you to reject, and not compensate, any responders who did not complete their HIT satisfactorily. So, as a quality check, I mixed in a question that helped ensure the responders were paying attention. The vast majority of participants correctly answered a question very similar to the following: “If one hundred thousand and nine is greater than nine thousand enter ‘Q’ otherwise enter ‘T’.”
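For anyone who wants to check the arithmetic or screen responses offline, here is a minimal sketch in Python. It is not Mechanical Turk’s API: the function names and the sample responses are made up for illustration, and the 10% fee rate is only what the figures above imply ($27.50 on $25.00 of rewards), not a quoted fee schedule.

```python
from decimal import Decimal


def hit_cost(assignments, reward, fee_rate):
    """Total cost: rewards paid to workers plus the site's fee on those rewards."""
    rewards = Decimal(assignments) * reward
    return rewards + rewards * fee_rate


def passes_attention_check(answer, expected="Q"):
    """True when the answer to the attention question matches the expected letter.

    One hundred thousand and nine is greater than nine thousand, so 'Q' is correct.
    Ignoring case and surrounding spaces is a leniency assumption, not a site rule.
    """
    return answer.strip().upper() == expected


# Assumed fee rate of 10%, inferred from the totals reported in this post.
total = hit_cost(100, Decimal("0.25"), Decimal("0.10"))

# Hypothetical responses: (worker ID, answer to the attention question).
responses = [("W1", "Q"), ("W2", "q "), ("W3", "T")]
approved = [worker for worker, answer in responses if passes_attention_check(answer)]
rejected = [worker for worker, answer in responses if not passes_attention_check(answer)]
```

Using `Decimal` rather than floats keeps the money arithmetic exact. Whether a lowercase “q” should count as a pass is a judgment call for the researcher.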
Though it is a quick, powerful, and cheap way to collect human subject data, Mechanical Turk does appear to have some major limitations. Most importantly, I have yet to figure out a way to bar past respondents from answering subsequent, altered versions of a survey, as a between-subjects study design requires; because each respondent has a unique ID, however, repeat participants can be eliminated after the fact. Additionally, the baseline demographics of the typical Mechanical Turk worker and the self-selection of participants may require special statistical treatment. Finally, the interface for creating surveys is rather limited, so HTML skills are required.
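Eliminating repeat participants after the fact can be sketched in a few lines of Python. This assumes the responses have been exported into a list of records that each carry the respondent’s unique ID; the `worker_id` field name, the function name, and the sample data are hypothetical, not the site’s own export format.

```python
def drop_repeat_participants(new_responses, prior_worker_ids):
    """Keep only responses from workers who have not taken an earlier version.

    new_responses: list of dicts, each with a 'worker_id' key (assumed field name).
    prior_worker_ids: IDs of workers who answered any earlier survey version.
    A worker who appears twice in new_responses is kept only the first time.
    """
    seen = set(prior_worker_ids)
    kept = []
    for row in new_responses:
        worker_id = row["worker_id"]
        if worker_id in seen:
            continue  # repeat participant: exclude from the analysis
        seen.add(worker_id)
        kept.append(row)
    return kept


# Hypothetical data: W2 already answered the control version.
control_ids = ["W1", "W2"]
treatment = [
    {"worker_id": "W2", "score": 4},
    {"worker_id": "W3", "score": 5},
    {"worker_id": "W3", "score": 2},
    {"worker_id": "W4", "score": 3},
]
clean = drop_repeat_participants(treatment, control_ids)
```

This only cleans the data; the repeat respondents have still seen both versions and, unless their HITs are rejected, have still been paid.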
Even with its limitations, Mechanical Turk seems at the very least like a great vehicle for pilot studies. I plan to use it next on an outcome-bias study to see if there is any merit in pursuing research on an alternative theory to Moral Luck written about previously in this blog.