Salesforce strives to support business users with little to no knowledge of predictive analytics or Artificial Intelligence (AI). There is limited understanding of how such users would respond to recommendations made by AI, especially when a prediction leads to unsuccessful business outcomes. Therefore…
how might we overcome the potential to distrust AI after a failure in predictive analytics?
Through in-depth interviews, a competitive analysis, and usability testing, our team surfaced a set of key findings and recommendations to help Salesforce address the problem of trusting AI after failures in predictive analytics.
*Due to ongoing work, I detail the overall process here and exclude the specific findings and recommendations produced from this study.
DETERMINING RESEARCH QUESTIONS
Our main research goal was to explore how users react to AI tools failing to provide relevant insights for aiding decision making in business contexts.
With these research questions, we set out to understand our problem by first conducting in-depth expert interviews.
UNDERSTANDING THE CONTEXT
The interviews helped us gain perspective into people’s current processes in making business decisions with data, how they feel about AI in general, and what potential issues they may face with using AI in their business insights processes.
Taking on the duty to design the interview protocol, I decided a semi-structured approach was best to gather an initial understanding of the participants and their work, and then be able to dig deeper based on their responses as the session unfolded.
Since each team member took turns facilitating the interviews, I designed an interview guide outlining the introduction to our project, the warm-up and focus questions, and the retrospective so that we all provided the same information to the participants. Providing a common definition and examples of AI also helped curb potential bias and differences between the sessions.
For both the interviews and the usability testing later in our study, we recruited people with characteristics that match Salesforce’s target audience for their tool.
As the person in charge of recruiting and scheduling participants, I reached out to our respective networks to find people who met the participant requirements, and I posted on Salesforce’s Trailblazer Community to find usability testing participants. In total, I scheduled 2 remote and 3 in-person interviews, and 4 remote and 1 in-person usability sessions for the team to conduct.
A team member and I collaborated on building an affinity diagram with the crucial points of each interview. After categorizing and sub-categorizing the points, we noted the key patterns and converted them into a set of important criteria in building and maintaining trust with an AI tool in a business analytics context.
ANALYZING THE COMPETITION
Using our set of criteria from the interviews as a guide, we wanted to see how our client's tool measures up against similar tools. The criteria were broken down into explicit features, and each tool was evaluated on whether or not it contained each feature.
We also subjectively scored them on the criteria overall (on a scale from 1=low to 5=high). The competitive analysis helped identify any opportunity gaps that Salesforce can fill, as well as any weaknesses it should address.
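As a rough illustration of the feature checklist described above, the sketch below records whether each tool contains each feature and then lists the features a given tool lacks. The tool names, feature names, and values are hypothetical placeholders, not data from this study, and the helper functions are only one possible way to tabulate such a comparison.

```python
# Hypothetical tools and features; none of these values come from the study.
feature_matrix = {
    "Tool A": {"explains_recommendation": True, "accepts_feedback": False, "shows_confidence": True},
    "Tool B": {"explains_recommendation": False, "accepts_feedback": False, "shows_confidence": True},
    "Tool C": {"explains_recommendation": True, "accepts_feedback": True, "shows_confidence": False},
}


def feature_coverage(matrix):
    """Return the share of checklist features that each tool contains."""
    return {
        tool: sum(features.values()) / len(features)
        for tool, features in matrix.items()
    }


def opportunity_gaps(matrix, tool):
    """Return the features the given tool lacks, sorted by name."""
    return sorted(name for name, present in matrix[tool].items() if not present)


coverage = feature_coverage(feature_matrix)
gaps = opportunity_gaps(feature_matrix, "Tool B")
```

The subjective 1-to-5 scores would sit alongside this table rather than be derived from it, since they reflect the team's overall judgment of each criterion.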
TESTING THE IDEAS
We transformed our criteria into how-might-we questions to brainstorm ways to improve the AI tool’s current interface in maintaining trust. We narrowed down the ideas to:
Feature #1. An explanation for how the tool arrived at a specific recommendation
Feature #2. A feedback mechanism
We incorporated the two features into a mockup for usability testing.
I led 2 of the 4 usability test sessions, showing the participants the current AI tool’s UI and our version. A within-subjects test design allowed us to gain more feedback on both versions, and we swapped which version was presented first in order to avoid bias from the presentation order.
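The order swapping described above can be sketched as a simple alternating assignment. This is a minimal illustration that assumes participants are assigned in the order they are scheduled; the participant IDs and version labels are placeholders, and the study's actual assignment procedure may have differed.

```python
def counterbalance(participants, versions=("current UI", "mockup")):
    """Alternate which version each participant sees first."""
    first, second = versions
    orders = {}
    for index, participant in enumerate(participants):
        # Even-indexed participants start with the first version,
        # odd-indexed participants start with the second.
        orders[participant] = (first, second) if index % 2 == 0 else (second, first)
    return orders


# Placeholder participant IDs, used only for illustration.
orders = counterbalance(["P1", "P2", "P3", "P4"])
```

With an even number of participants, each version is shown first equally often, which is what keeps presentation order from favoring one version.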
Pulling from existing literature on how to measure trust in technology systems, we devised a set of quantitative questions to measure the level of trust towards the product after presenting each test version. During the session, I moderated a failure scenario and asked participants how they would then interact with each version given a task.
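One way such per-version trust ratings could be summarized is sketched below. It assumes each question is answered on a 1-to-5 scale and that a participant's trust score for a version is the mean of their answers; both the scale and the ratings are placeholders for illustration, not the study's instrument or results.

```python
from statistics import mean

# Placeholder ratings on an assumed 1-5 scale; not data from the study.
ratings = {
    "P1": {"current UI": [2, 3, 2], "mockup": [4, 4, 3]},
    "P2": {"current UI": [3, 3, 4], "mockup": [3, 4, 4]},
}


def trust_scores(ratings):
    """Average each participant's answers for each version."""
    return {
        participant: {version: mean(answers) for version, answers in by_version.items()}
        for participant, by_version in ratings.items()
    }


def trust_change(scores, baseline="current UI", variant="mockup"):
    """Return each participant's score for the variant minus the baseline."""
    return {
        participant: by_version[variant] - by_version[baseline]
        for participant, by_version in scores.items()
    }


scores = trust_scores(ratings)
changes = trust_change(scores)
```

Because the design is within-subjects, comparing each participant against their own baseline is more informative than comparing group averages, particularly with a small sample.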
In addition to the close-ended questions, we analyzed the participants' actions and comments as they "thought aloud" throughout the exercise. This, in addition to the following questions, made up our qualitative analysis of the reactions toward the added features:
• Given that prototype Z provides an explanation to the analysis process whereas prototype X doesn’t provide the explanation, does the explanation affect your trust in the tool in any way? Why?
• Given that prototype Z provides a feature to intake feedback to the tool whereas prototype X doesn’t, does the ability to input feedback affect your trust in the tool in any way? Why?
• Throughout the project, I learned that research doesn't always go as planned, even if you have a study protocol. We often encountered logistical issues during the first session, and learned the importance of conducting pilot runs to address such issues before the real sessions, especially for remote sessions.
• During our usability testing stage, it was challenging to figure out how to measure trust. We decided a mixed-method approach would be best so that we could triangulate quantitative and qualitative measurements. Because we had a small sample size, there were certain measurements that we decided to refrain from using; one was time to complete the tasks. We saw that some participants would scrutinize every part of the interface and ask a lot of questions, while others quickly rushed through the screens.
• We had the amazing opportunity to present our project to a variety of audiences, from classmates to our clients to the larger Salesforce UX group in AI. Through these presentations, I learned the importance of tailoring the content to each audience. Practicing and incorporating feedback really helped ensure our main points were getting across to our specific audience at the time.