On Research - Survey Research and Bots

Season 1 – Episode 6 – Survey Research and Bots

In this episode, we delve into the effects of online bots on survey research, exploring ethical considerations and the ramifications of incorporating bots into surveys, while also examining the potential advantages and unintended outcomes that may arise.

Episode Transcript

Darren Gaddis: From CITI Program, I’m Darren Gaddis, and this is On Research. Today, survey research and bots. I spoke with Myra Luna-Lucero, the Research Compliance Director at Teachers College, Columbia University. As a reminder, this podcast is for educational purposes only. It is not intended to provide legal advice or guidance. You should consult with your organization’s attorneys if you have questions or concerns about relevant laws and regulations discussed in this podcast. Additionally, the views expressed in this podcast are solely those of the guest and do not represent the views of their employer. Hi, Myra. Thank you for joining me today.

Myra Luna-Lucero: Hi.

Darren Gaddis: Myra, to get us started today, would you be willing to share a little bit about your own background and your role currently at Teachers College?

Myra Luna-Lucero: Sure. I am a teacher and researcher by trade. I have done multiple studies looking at gender STEM education, and then I landed into research compliance. I feel like very often the research compliance specialists and administrators sort of land in this position. I don’t remember being in fourth grade and being asked, “What do you want to be when you grow up?” And saying, “I wanted to be a research compliance specialist,” but here I am. Now I currently hold this role at my institution, but I do have a background in research. I think that’s a nice merging of the two worlds with the research background as well as the research compliance background.

Darren Gaddis: To help us better ground today’s conversation, could you briefly define what a bot is and how it broadly impacts survey research and research at large?

Myra Luna-Lucero: Sure. I will start by saying I’m not an expert in bots, but I have a very keen interest in understanding bots as they impact internet-mediated research. But the short definition is that bots are a software or internet robot. They are designed to not be easily detected and they’re really this computer program to simulate human activity and operate as an agent for a user or another program. Bots are not necessarily bad or good. They can do things that compromise the normal flow of an interaction, but they may not always have an intent to harm. There’s lots of bots that we may interact with on a daily basis. There’s chatbots. We have computer software platforms that chat with us if we are trying to engage through, say, customer service. There’s social bots on social media platforms. There’s shop bots that cull content for finding the best airline price. There’s also bot harvesters that will harvest content and lots of information from websites.

These are the kinds of considerations that as a research compliance specialist and also a researcher need to have when they’re thinking about putting any research study in an online context, that there really is that diligence of what a bot is, how it exists in our typical everyday life, and then also some of the risk factors and considerations that the bot may have a capability to interact with a research study and potentially compromise that study.

Darren Gaddis: Within survey research, what is the most common application of bots? And should we be aware of any ethical implications?

Myra Luna-Lucero: I think one of the challenges is that it really is just observing what’s happening in the world and how bots may be impacting a research study. These computer designers create malleable responsive computer designs and simulations to appear very much like human activity. Being able to identify the existence of a bot does pose some challenges. It’s not necessarily common applications, but it’s really just trying to think through does this survey response in my online survey look off? Are there considerations that appear more bot-like and less human-like?

Sometimes those nuances can be very subtle, but it’s more just trying to think that bots exist. They are very ever present. They are in our everyday lives. Researchers are designing a research study that they’re putting out into the world with the best intentions to engage a human in that study, but that may not always be the case. Those bots may infiltrate a study, and the researcher needs to be able to be observant in that context. It’s less, to me, I think, about common applications and more just thinking about what is going on in the survey, what are some key signs in the data that may raise questions about that particular data point or that particular response in the survey, questions about whether or not it is a bot.

Darren Gaddis: Are you aware of any specific ethical implications that we should be aware of when utilizing or using bots in survey research?

Myra Luna-Lucero: When individuals create a bot, they have a wide range of intentions, not always malicious, and so it’s very important as a researcher to not think of a bot as only out to get their study. But the ethical considerations really are rooted into is the presence of a bot posing risk to an eligible participant who may want to engage in that research study? What safeguards can the researcher create in the design of the study that welcome the eligible human who the researcher is interested in understanding and trying to design a study for that target population?

So, how can the researcher design it to welcome that eligible human participant safeguarding their online study to decrease the chances for the risk of a bot circumventing that eligible participant to be in the study? That sometimes becomes a very challenging balance because the researcher may want to include a lot of safeguards in the design of their online study, which could very well eliminate or limit the possibility for the human to even interact with that study because there’s just so many safeguards that it becomes laborious for that human who’s eligible to be in the study to even get through the process.

It’s trying to think of as research compliance specialists always do, as researchers are trained to always do, mitigate the risks and the benefits. The design itself should be accessible to the population of interest that researcher has designed the study for, the eligible participant, the human participant, to be in that study while also balancing in as best as possible the limitation of the vulnerability of that study to be targeted by a bot.

The ethical considerations are in a lot of ways on a case-by-case basis and malleable, but they really just anchor to being observant about what is happening in the world of internet-mediated research. That’s the research. We’re spending time looking at the existence of these challenges that are already there, building the safeguards that are reasonable for the study design without disqualifying an eligible study participant or creating too many obstacles to prevent the eligible human study participant to engage in the online project, but also consulting others and thinking through, if I design the study this way, is this a reasonable pathway to mitigate that risk of bots infiltrating the study, but also making sure that it’s accessible to that human who I want to know from in that study.

Darren Gaddis: In my opinion, I think we have all heard to some degree about bots impacting the sale of concert tickets, to social media websites, to crowdsourcing websites in the past. For survey research, what are the positives and potential negatives of utilizing bots?

Myra Luna-Lucero: As I was mentioning earlier, bots are really pervasive, and we can maybe even think about calling customer service in our recent lifetime and getting access to a bot that kind of gets us to the answer that we were seeking. Sometimes that’s easier than just going through the large amounts of content that, say, a company may have to sort through, and a bot is just sometimes easier to go, “Hey, I just need this one bit of information and I can get it easily by a chatbot or easily with a phone bot that gives me the information that I need.” We, as professionals, use software all the time that has those call-and-response bot prompts that you may be looking for. Sometimes that’s really handy. That’s, I think, some benefits to this human-like engagement that a bot could provide, but really the negatives don’t have a clear marker of this is bad.

It’s more just trying to understand the nuances of what a bot is capable of and how that bot could impact the integrity of the research that the investigator is putting out into the world. When you think about the collection of data that is possible from humans in an internet-mediated research project, it’s breathtaking. It’s unimaginable to think about how much data could be collected, hundreds and hundreds of thousands of participants who may be eligible to participate in a study. The scientific impact of that is tremendous and laudatory, but those bots need to be a strong consideration for that researcher when they’re designing that content because it could appear that the data is sound or legitimate, but in reality there could just be bots having infiltrated that study design and there may appear a normal distribution of data when in reality, culling that data, it may show that there were not eligible humans participating in that study, rather there were bots participating in that study and that could impact the integrity of the data.

When you think about those nuances, it doesn’t mean that every internet-mediated research project or any online study is vulnerable to bots, but it does mean that the researcher needs to do their due diligence to examine that data in the analysis of it and in the process of the data collection to assess is there a legitimate human taking this study as I intended it in my research design, or is there now a concern that this survey is vulnerable or this online study is vulnerable to a bot and we need to stop and reassess what’s happening?

The dichotomy of positive and negative is a little bit tough to just really identify. It’s more just, in my opinion, a gray area and a potential positive and a potential negative, and assessing in every instance that is possible, is this impacting my study design? Is this impacting my confidentiality and privacy in a negative way? Is this okay? What is feasible? What is not feasible? Have I created enough safeguards to move forward in the design of this study? Is this possible to collect the data that I would like to collect with the eligible human subject and still create the safeguard to keep the data clear and clean as best as you can from bots? It’s these conversations rooted in being observant, building in those safeguards and consulting others as you’re designing the study.

Darren Gaddis: How can bots continue to impact survey research, and what are some potential implications for the field of research as bots become more commonplace?

Myra Luna-Lucero: I think this is a great question, and I think really one of the biggest motivators for me as a research compliance specialist is to spend time talking with researchers and sometimes balancing the job and the expectation of the job, and then also trying to engage researchers is a tough balance. But I think there’s such importance in engaging the research community. I have gathered researchers who study internet-mediated research and really just had these candid conversations of like, “All right, lay it all out. What are the challenges that you’re experiencing? And how can I, as a research compliance specialist who is familiar with the regulations, who’s familiar with ethical conduct in the protection of human subjects, better understand what you as a researcher are going through in the field?”

To me, it’s tapping, I think, two very distinct populations of researchers. One research group who have long-term experience with internet research and have gone through the ebbs and flows and the technology booms for years, they’ve developed these patterns of conduct and these data security plans that could translate very well to the novice researcher who may be embarking on internet-mediated research for the first time or using a study design in a way for the first time in an online capacity.

In that question, trying to find those examples to help you as a research compliance specialist apply the regulations, apply the ethical standards, but in a way that is possible for the researcher to attain and possible and relevant for the researcher to develop. That’s trying to think of those examples and trying to pull those examples from the research community themselves.

Darren Gaddis: What else should we know about bots in survey research?

Myra Luna-Lucero: I think this has been an emerging conversation. It has been something that takes time to really understand. As a research compliance specialist, I do consult with the colleagues that I have in the research compliance field. I work and talk with researchers in the field to understand what’s going on in their lived experiences when they’re going out and collecting internet-mediated research. Then I have conversations and am active in working with the IT office, the Information Technology Office, because they really have the content expertise to help me understand how to apply the regulations and the ethics when considering a research study. What I’ve learned in all of that is a data security plan is absolutely necessary. We have examples of what that data security plan will look like, but many IRB offices, institutional review board offices have a data security plan already.

It’s the confidentiality, privacy, it’s the securing of the data, its identifiers, all the big picture stuff. But I’ve also learned a lot about online data collection that I think is more nuanced than just the overall data security plan. Just broad strokes, a data security plan really is being observant, building in safeguards and consulting experts in designing the data security plan as it’s relevant for your own study. But beyond that, in the nuanced capacity of an online study, there’s also considerations for the researcher to make about compensation. If the researcher is compensating an eligible participant in an online setting, what considerations are they making about disqualifying a potential suspicious respondent? And does that take away from an eligible respondent because that eligible respondent may not have been verbose or may not have been 100% attentive? It’s balancing that compensation with not becoming punishment to an eligible participant.

And so, having candid conversations with the researcher about compensation and what the compensation protocol is: if the participant does not pass the attention checks over the course of three times, they will not be compensated. Then making sure that’s clear in the consent form as opposed to just saying, you will not be paid for some arbitrary reason. There needs to be that delineation between compensation and a potential attention check. The researcher also can weave into their survey itself or their design itself these if-then conditional logic questions, and this is something I’ve learned a lot from the IT office, is thinking through these conditional logic questions where things branch outward tend to disrupt the flow of a bot, because bots don’t always have the nuance to follow logically those branching if-then statements. A human may be able to reason logically through them, but some consideration should be that the logic shouldn’t be so complicated that it makes the study inaccessible to an eligible participant, so it’s balancing that.
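The compensation rule mentioned above (no payment after three failed attention checks, stated up front in the consent form) could be written down roughly as follows. This is only a sketch of one possible reading of that rule: the `Response` type, the `is_compensated` function, and the default threshold are hypothetical names and values chosen for illustration, not part of any survey platform or of the protocol discussed in the episode.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Response:
    """One survey submission (hypothetical structure for illustration)."""
    respondent_id: str
    attention_checks_passed: List[bool]  # one entry per attention check


def is_compensated(response: Response, max_failures: int = 3) -> bool:
    """Return True unless the respondent failed `max_failures` or more checks.

    Assumption: "three times" is read as three failed checks in total.
    A real protocol would define this precisely in the consent form.
    """
    failures = sum(1 for passed in response.attention_checks_passed if not passed)
    return failures < max_failures


# A respondent who misses one check is still paid; one who misses three is not.
attentive = Response("r-001", [True, False, True])
inattentive = Response("r-002", [False, False, False])
```

Keeping the threshold as an explicit parameter makes it easy to match whatever number the consent form actually states, so that the rule applied to the data is the same rule participants agreed to.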

The researcher can also identify oddities in the data. I am very adamant that researchers engaged in internet-mediated research are observing the data as it’s coming in and really watching as the data is happening within reason, sometimes it’s not possible to do it in every case, but if it is possible, because then they can identify potential unanswered questions, inconsistent responses, incomplete surveys, things that are impossible for a potential human to have answered at the speed at which they were answered, or even those illogical responses to open-ended questions, providing safeguards to the best of your capacity. For a research compliance specialist, really trying to understand from the researcher perspective what a day in the life of their field data collection looks like, and then tying it to the regulations and the ethical considerations to help that research move forward.
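Several of the warning signs listed above (unanswered questions, inconsistent responses, implausibly fast completion) lend themselves to simple automated flags that a researcher can review as data comes in. The sketch below is illustrative only: the function name, the field layout, and the 60-second threshold are assumptions, and a flag is meant to prompt human review, not to disqualify a respondent automatically.

```python
from typing import Dict, List, Optional, Tuple

MIN_PLAUSIBLE_SECONDS = 60.0  # assumed threshold; depends on survey length


def screening_flags(
    answers: Dict[str, Optional[str]],
    duration_seconds: float,
    required: List[str],
    repeated_pairs: List[Tuple[str, str]],
    min_seconds: float = MIN_PLAUSIBLE_SECONDS,
) -> List[str]:
    """Return human-readable flags for one response; an empty list means none."""
    flags: List[str] = []

    # Unanswered or blank required questions.
    missing = [q for q in required if not (answers.get(q) or "").strip()]
    if missing:
        flags.append("unanswered: " + ", ".join(missing))

    # Inconsistent answers to a question that was deliberately asked twice.
    for first, second in repeated_pairs:
        a = (answers.get(first) or "").strip().lower()
        b = (answers.get(second) or "").strip().lower()
        if a and b and a != b:
            flags.append(f"inconsistent: {first} vs {second}")

    # Completed faster than a person could plausibly read the questions.
    if duration_seconds < min_seconds:
        flags.append("implausibly fast")

    return flags


suspect = screening_flags(
    {"q1": "yes", "q2": "", "q1_repeat": "no"},
    duration_seconds=12.0,
    required=["q1", "q2"],
    repeated_pairs=[("q1", "q1_repeat")],
)
```

Illogical answers to open-ended questions are deliberately left out here, since judging them usually needs a person reading the text.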

Darren Gaddis: Myra, thank you for joining me today.

Myra Luna-Lucero: Sure. Thanks for having me.

Darren Gaddis: Be sure to follow, like and subscribe to On Research with CITI Program to stay in the know. If you enjoyed this podcast, you may also be interested in other podcasts from CITI Program, including On Campus and On Tech Ethics. You can listen to all of our podcasts on Apple Podcasts, Spotify, and other streaming services. I also invite you to review our course offerings regularly as we are continually adding new courses, subscriptions, and webinars that may be of interest to you, like CITI Program’s Research Study Design course.

How to Listen and Subscribe to the Podcast

You can find On Research with CITI Program available from several of the most popular podcast services. Subscribe on your favorite platform to receive updates when episodes are newly released. You can also subscribe to this podcast by pasting “https://feeds.buzzsprout.com/2112707.rss” into your podcast app.

Meet the Guest

Myra Luna-Lucero, EdD – Columbia University

Dr. Myra Luna-Lucero is the Research Compliance Director at Teachers College, Columbia University. In addition to supporting researchers, she has recently launched an ethics internship program and an extensive transformation of the College’s IRB website. She regularly offers seminars and workshops on research compliance and IRB leadership.

Meet the Host

Darren Gaddis, Host, On Campus Podcast – CITI Program

He is the host of the CITI Program’s higher education podcast. Mr. Gaddis received his BA from University of North Florida, MA from The George Washington University, and is currently a doctoral student at Florida State University.