Do I Trust You? Effects of Trust Calibration on Human-Machine Team Performance in Operational Environments

By Shane Klestinski, Associate Editor


NTSA and Tech Grove held "Do I Trust You? Effects of Trust Calibration on Human-Machine Team Performance in Operational Environments," the second in a series of webinars based on I/ITSEC 2023's best papers and tutorials, on March 14.

Two of the paper's co-authors, Sandro Scielzo, Ph.D., human systems technical authority and learning science fellow at CAE USA, and Beth Hartzler, Ph.D., research psychologist with the Warfighter Learning Technologies Section of the Air Force Research Laboratory's 711th Human Performance Wing, presented the content.

Scielzo began by providing their paper's backstory, which involved a comprehensive literature review conducted from 2020 to 2022. Scielzo said they wanted to create a multidomain theoretical framework for human-machine teaming (HMT) for the warfighter, and they identified three main elements that would be critical in creating interfaces.

Scielzo explained that first, they had to model trust in real time, which was a key construct, along with situational awareness. Second, trust would need to be maintained. Third, they had to facilitate trust, which required determining what kind of modes were available to coordinate and collaborate with a team of humans and machines.

“We were able to have this framework guide the research we wanted to do, and what we’ve done in 2023, is to build that HMT virtual task range,” Scielzo said. “It basically puts on a multidomain, constructive network, a number of simulated platforms where we can have constructive platforms or human-operated platforms, and we decided this will be our proving grounds of all that theory that’s being generated. We wanted to accelerate applied and advanced human subject research in human-machine teaming to come up with the requirements of what works and what doesn’t work depending on use cases, constraints and so forth. Our I/ITSEC paper was based on an applied view of HMT research.”

Scielzo went on to give a brief history of human-machine interaction since the 1960s. He said that in the 2020s, “things are getting a lot more complicated because we’re anthropomorphizing a lot of our relationships with machines because of the advances in artificial intelligence.” This results in a more human-like, collaborative interface with synthetic agents because, according to Scielzo, humans have a need for trust in something that is human-like.

“In the next frontier in human factors [going into the 2030s], we’re introducing a lot more complexity…this concept is called ‘from tools to teammates,’” Scielzo said. “It’s the concept that machines used to be tools to support us and achieve our goals, but now they’re more like human teammates whereby we need to collaborate and coordinate to achieve those goals. This is, in part, driven by future warfare requirements, and the idea is to reduce the cognitive burden on the human operators while promoting that coordination. For effective HMT interaction, [the right level of] trust is that key construct that mediates those interactions and decides whether they’re effective.”

Scielzo said that trust is a willingness to be vulnerable and depend on another teammate, whether human or machine, to meet expectations. "Calibrated trust" describes humans trusting machines at the right level for what the machines are designed to do; when trust is properly calibrated, people have optimal situational awareness and a manageable workload in using the system as intended. If people "overtrust" or distrust the system, negative outcomes such as human error can result, according to Scielzo.

Hartzler said establishing a real-time measurement of trust involved creating "CONS" (pronounced "cones"), the "Continuous Online Numerical Score," using Likert-scale ratings. A "1" indicated a lack of trust on the user's part. A "5" meant the user had "total trust." She explained that their overarching thesis was that any objective indicators the team included would correspond to the subjective ratings obtained through the CONS.
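The article does not describe how CONS ratings were collected or aggregated; as a rough illustration only, the idea of logging timestamped 1-5 Likert ratings over a mission and summarizing them might be sketched as follows (all names here are hypothetical, not from the paper):

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ConsLog:
    """Hypothetical log of Continuous Online Numerical Score (CONS) ratings.

    Each entry is a (timestamp_seconds, rating) pair, where rating is a
    1-5 Likert value: 1 = lack of trust, 5 = total trust.
    """
    entries: list = field(default_factory=list)

    def record(self, t: float, rating: int) -> None:
        # CONS ratings are constrained to the 1-5 Likert scale.
        if not 1 <= rating <= 5:
            raise ValueError("CONS ratings must be between 1 and 5")
        self.entries.append((t, rating))

    def average(self) -> float:
        # Mean rating over the mission so far.
        return mean(r for _, r in self.entries)

log = ConsLog()
log.record(30.0, 2)   # rating prompted 30 seconds into the mission
log.record(60.0, 3)
print(log.average())  # 2.5
```

A running average like this is one plausible way to relate continuous subjective ratings to behavior over a mission, but the authors' actual analysis may differ.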

“To explore this, we developed virtual urban search and rescue missions that represented the aftermath of an F-4 tornado going through downtown Houston, Texas,” Hartzler said. “Participants received flight path recommendations to help locate and deliver supplies to survivors [through drones], but it was up to [the humans] to either accept or reject the suggestions.”

Hartzler noted that participants' behavior in response to the recommendations was one of the main objective indicators of their trust in completing the task, particularly whether they chose to review a suggestion before responding. Accepting a recommendation without reviewing the suggested path indicated compliance, or "high trust." Reading the recommendation before accepting or rejecting it indicated imperfect trust, categorized as "verification." Dismissing a recommendation without reading it at all indicated rejection.
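The three behaviors described above amount to a simple mapping from two observable actions (did the participant review the suggestion, and did they accept it) to a trust category. A minimal sketch of that mapping, with hypothetical names not taken from the paper:

```python
from enum import Enum

class TrustBehavior(Enum):
    COMPLIANCE = "accepted without review"     # high trust
    VERIFICATION = "reviewed before deciding"  # imperfect trust
    REJECTION = "dismissed without review"     # rejection

def classify(accepted: bool, reviewed: bool) -> TrustBehavior:
    # Hypothetical encoding of the behaviors described in the webinar:
    # reading the recommendation first is verification regardless of the
    # final decision; otherwise the decision itself signals the category.
    if reviewed:
        return TrustBehavior.VERIFICATION
    return TrustBehavior.COMPLIANCE if accepted else TrustBehavior.REJECTION

print(classify(accepted=True, reviewed=False).name)   # COMPLIANCE
print(classify(accepted=False, reviewed=True).name)   # VERIFICATION
print(classify(accepted=False, reviewed=False).name)  # REJECTION
```

This is only a schematic reading of the categories; the study's actual coding scheme may account for additional behaviors.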

"Participants indicated overall 'low trust' in the system with an average rating of just below 2.5 on a 5-point scale," Hartzler said. "The analyses of these behaviors revealed that the quality of the recommendations during the first three-minute block of the missions was the heaviest influencer of the participants' likelihood of accepting or rejecting flight path suggestions throughout the rest of the 12-minute mission."

After describing the study, Scielzo said they were "pretty excited" at having found a measure that yielded data indicative of trust-related behaviors.

“Looking at the conclusions and takeaways, the summary for the CONS metric is that we demonstrated its utility during operational tasks, and one of the main takeaways was that we found that the level of trust is directly proportional to recommendation compliance and rejection,” Scielzo said. “We also found that capturing continuous trust attitudes would predict trust behaviors, so we know we’re on the right track.”
