Though recent sentiment seems to have turned more bearish on AI, many continue to see enormous potential in the technology and in how it could fundamentally change business and life in general. There are no silver bullets, and all beginnings are hard.
From the many conversations I have with business leaders and dedicated teams looking to unlock AI’s potential for their organisations, a handful of simple practices stand out that make a lot of sense today. Yet I see very few people implementing them, regardless of whether they are sceptics or careful adopters.
If you are a sceptic – and there are good reasons to be sceptical of the hype over the last 12-18 months – you might simply take a ‘wait and see’ approach until you witness tangible value from this extraordinarily expensive technology. However, I believe you would be missing out on early benefits and placing your organisation in a less than desirable position.
Similarly, if you are already evaluating POCs or moving to production with AI tools, the following four proposals strike me as intuitively right, valuable and simple steps that have the potential to significantly benefit your organisation. Despite their simplicity, I haven’t come across a single organisation implementing even one of them. This may be due to my admittedly small sample size, but the frequent response (‘Ah, interesting! Yeah, we should look into this!’) speaks volumes.
1. Record and archive all internal conversations
One of the most valuable resources in any business is the collective knowledge shared during conversations. By recording and archiving all your discussions, you create a rich repository of ideas, decisions, and insights that you can revisit and leverage in the future.
Why it matters
During management or team meetings, 80-90% of the content may be routine, but the remaining 10-20% could contain valuable brainstorming and problem-solving ideas. These ideas often get lost in the haste of rushing into your next meeting, catching up with emails or finalising whatever document is suddenly urgent.
Now consider archiving all these conversations, transcribing them and every few months going back and retrieving a list of all these thoughts and ideas. Do you think there might be value in that?
How to implement
Use meeting recording tools: Record meetings with tools like Zoom, Microsoft Teams or Google Meet.
Transcription services: Employ transcription services like Otter.ai or OpenAI’s Whisper to convert spoken conversations into text.
Centralised archive: Maintain a centralised archive of these recordings and transcripts, for example as a dedicated vector store to augment a conversational AI with retrieval-augmented generation (RAG); a minimal sketch follows below.
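To make this concrete, here is a minimal sketch of the transcribe-and-index pipeline in Python. It assumes the open-source `openai-whisper`, `sentence-transformers` and `faiss-cpu` packages; the file name, chunk size and example query are illustrative placeholders, not recommendations for any particular stack.

```python
# Minimal sketch: transcribe a recorded meeting and index it for retrieval.
# Assumes: pip install openai-whisper sentence-transformers faiss-cpu
import whisper
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# 1. Transcribe the recording (Whisper runs locally, so nothing leaves your machine).
stt_model = whisper.load_model("base")
transcript = stt_model.transcribe("2024-06-board-meeting.mp4")["text"]

# 2. Chunk the transcript so retrieval returns focused passages, not whole meetings.
chunks = [transcript[i:i + 1000] for i in range(0, len(transcript), 1000)]

# 3. Embed the chunks and add them to a vector index for later RAG queries.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = np.asarray(embedder.encode(chunks), dtype="float32")
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# 4. Months later: surface the ideas that got lost in the haste.
query = np.asarray(
    embedder.encode(["ideas and proposals we never followed up on"]),
    dtype="float32",
)
_, hits = index.search(query, 5)
for i in hits[0]:
    print(chunks[i][:200], "...")
```

In production you would persist the index and keep speaker and date metadata alongside each chunk, but the shape of the pipeline stays the same.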
2. Provide AI users with confidence scores
AI models appear powerful, but they confabulate.
Nota bene: I would love it if you considered ditching the ‘hallucination’ lingo and started speaking of ‘confabulations’. I am not the first to point this out, but it is a much more accurate description of what is happening when LLMs make stuff up. This is not mere academic pedantry, but a much-needed distinction between (a) the generation of plausible-sounding but potentially inaccurate or fabricated information (confabulation) and (b) the suggestion of sensory perceptions in the absence of external stimuli (hallucination). With AI technology in 2024 we are talking about the former, most certainly not the latter.
I believe that we won’t be able to eliminate confabulations altogether, because of the very way the technology works. Precisely for this reason I suggest that you are transparent with your users about the model’s outputs. Give them a confidence score, i.e. an indication of the likelihood that a given output is accurate. Machine learning is enthralled with evaluation metrics, which is unsurprising when you have a bunch of mathematicians huddling over which formula turns out better results. This kind of single-number score won’t be easy in every use case, but give it a try and users will respond favourably.
Why it matters
To enable the kind of augmentation you are likely looking for in your business, you need to guide the user’s attention and circumspection as they deal with the model’s outputs. Transparency about its limitations fosters trust and honest communication about AI’s capabilities, and, more importantly, an open recognition of what you expect your people to be doing productively with AI technology.
How to implement
Confidence display: Integrate a confidence indicator directly into the user interface where AI outputs are presented (see the sketch after this list).
User training: Educate users on how to interpret and use such an indicator in their decision-making processes.
Feedback loop: Implement a feedback mechanism where users can report discrepancies, helping to improve the model’s accuracy over time.
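There is no universal confidence formula, but token log-probabilities are one naive starting point when the model you call exposes them. The sketch below uses the OpenAI Python client’s `logprobs` option and averages per-token probabilities into a single indicator; the model name, the thresholds and the averaging itself are illustrative assumptions, not a validated metric.

```python
# A naive confidence proxy from token log-probabilities. Averaged token
# probability is a crude, uncalibrated signal; illustrative only.
# Assumes: pip install openai, with OPENAI_API_KEY set in the environment.
import math
from openai import OpenAI

client = OpenAI()

def answer_with_confidence(question: str) -> tuple[str, float]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": question}],
        logprobs=True,  # ask the API to return per-token log-probabilities
    )
    choice = response.choices[0]
    token_probs = [math.exp(t.logprob) for t in choice.logprobs.content]
    # Mean token probability in [0, 1]; low values warrant extra scrutiny.
    confidence = sum(token_probs) / len(token_probs)
    return choice.message.content, confidence

answer, score = answer_with_confidence("When was our Q3 report published?")
label = "high" if score > 0.9 else "medium" if score > 0.7 else "low"
print(f"{answer}\n[confidence: {label} ({score:.0%})]")
```

For retrieval-augmented setups, a complementary signal is whether the retrieved passages actually support the answer; combining several signals beats trusting any single number.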
3. Track behavioural drift
As people start interacting with AI tools, their behaviour tends to change over time. You may have witnessed this for yourself: very likely the first time you interacted with ChatGPT, Claude or Gemini you thought ‘meh, this is really kind of so-so’. If you stuck with it, I suspect you use it every day now. You do that because you have found it to be useful. Do you think you are as critical of its outputs today as you were when you first started using it?
Why it matters
Initially, users may scrutinise AI’s outputs closely, but as they grow more comfortable and used to the technology, they may become more trusting. Monitoring this behavioural drift ensures that users maintain a healthy balance of trust and scepticism.
Also, let’s not forget the setting: you have equipped your team with a new (expensive) tool. The expectation is likely that they will accomplish more, right? Given this general tendency, we shouldn’t be surprised if, in the heat of battle (‘because I have so much on my plate’, ‘just this one time’), even the best AI governance rules deteriorate and become somewhat optional.
How to implement
Behavioural analytics: Use analytics tools to capture how users interact with AI over time (a simple sketch follows below).
Capture user sentiment: Require users’ explicit feedback on their trust and confidence (like all self-reporting this won’t be perfect, but it is a good start).
Training and updates: Provide training and updates to users, reinforcing best practices and the importance of critical evaluation.
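As an illustration of what behavioural analytics could look like at its simplest, the sketch below logs whether users accepted an AI output unchanged, edited it or rejected it, and compares acceptance rates across months. The event schema, the sample data and the drift threshold are assumptions for the sake of the example.

```python
# Minimal drift tracking: log how users handle AI outputs and flag when the
# share of outputs accepted without edits creeps upward over time.
# Event schema and the 10-point threshold are illustrative assumptions.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ReviewEvent:
    user: str
    month: str   # e.g. "2024-06"
    action: str  # "accepted", "edited" or "rejected"

def acceptance_rate_by_month(events: list[ReviewEvent]) -> dict[str, float]:
    totals, accepted = defaultdict(int), defaultdict(int)
    for e in events:
        totals[e.month] += 1
        if e.action == "accepted":
            accepted[e.month] += 1
    return {m: accepted[m] / totals[m] for m in sorted(totals)}

def drift_detected(rates: dict[str, float], threshold: float = 0.10) -> bool:
    """True if unedited acceptance rose by more than `threshold` overall."""
    months = list(rates)
    return len(months) >= 2 and rates[months[-1]] - rates[months[0]] > threshold

events = [
    ReviewEvent("ana", "2024-04", "edited"),
    ReviewEvent("ana", "2024-04", "rejected"),
    ReviewEvent("ana", "2024-05", "accepted"),
    ReviewEvent("ana", "2024-06", "accepted"),
    ReviewEvent("ana", "2024-06", "accepted"),
]
rates = acceptance_rate_by_month(events)
if drift_detected(rates):
    print("Acceptance without edits is rising; time for a refresher:", rates)
```

A rising acceptance rate is not bad in itself (the tool may simply have improved); it tells you where to look, not what to conclude.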
4. Segregate clean ‘training data’
If you know a little bit about how models are trained, you have enough intuition about how critical training data is. The internet is being flooded with AI-generated content, and while the same may not yet hold true for businesses, I am convinced that the percentage of AI-infused content and documents is continually rising in business as well.
Why it matters
Maybe you have come across ‘synthetic training data’. It is (often) AI-generated data, produced because more data is generally considered to be better. Alas, quality matters just as much, if not more.
In this sense, do yourself and your business a favour: ensure the integrity of your future AI models by maintaining a clear distinction between original, ‘clean’ training documents and content that has been influenced (tainted?) or generated by AI. This segregation helps prevent the inadvertent degradation of future models’ performance due to ‘AI-muddied’ content.
How to implement
Metadata tagging: Implement a robust tagging system to label documents based on their content origin (sketched below, after this list).
CMS: A content management system (or even Google Workspace) that supports advanced tagging and filtering might do the trick.
Audit trails: Maintain audit trails to track the provenance of documents.
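If your CMS does not offer provenance fields out of the box, even a JSON sidecar file per document goes a long way. The sketch below writes an origin tag plus an append-only audit trail next to each file; the tag vocabulary (‘human’, ‘ai_assisted’, ‘ai_generated’) and the file layout are my own suggestions, not a standard.

```python
# Minimal provenance tagging via JSON sidecar files. The tag vocabulary and
# sidecar layout are illustrative suggestions, not an established standard.
import json
from datetime import datetime, timezone
from pathlib import Path

ALLOWED_ORIGINS = {"human", "ai_assisted", "ai_generated"}

def tag_document(doc_path: str, origin: str, actor: str, note: str = "") -> None:
    if origin not in ALLOWED_ORIGINS:
        raise ValueError(f"origin must be one of {ALLOWED_ORIGINS}")
    sidecar = Path(doc_path).with_suffix(".provenance.json")
    record = (json.loads(sidecar.read_text()) if sidecar.exists()
              else {"document": doc_path, "origin": origin, "audit_trail": []})
    record["origin"] = origin
    # Append-only audit trail: who set which tag, when, and why.
    record["audit_trail"].append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "origin": origin,
        "note": note,
    })
    sidecar.write_text(json.dumps(record, indent=2))

tag_document("reports/q3_summary.docx", "ai_assisted", "ana",
             note="first draft generated, figures verified by hand")
```

Filtering a ‘clean’ training corpus then reduces to selecting the documents whose sidecar says the origin is ‘human’.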
Conclusion
Implementing AI in a business setting is a complex but exciting endeavour. Consider starting by recording and archiving internal meetings, providing confidence scores to users, tracking potential behavioural drift and segregating training data. These steps will help you reap more benefits from AI (if not yet today, then tomorrow) than complacent competitors while avoiding potential pitfalls.
Parting note
I intend for this to be a somewhat regular publication. I will never claim to possess absolute truth or genius insight about these topics. I find AI technology fascinating and mildly terrifying at the same time. There is a lot of hype in the media and from self-proclaimed experts, which I find incredibly frustrating (just think of the Rabbit r1). This publication is an attempt to provide well-balanced thoughts on the possibilities and limitations of this technology.
If you have constructive criticism, recommendations or requests, please reach out.
I came across a paper today that substantiates my last point, that you should segregate clean, human-created content. I think I’ll cover this in more detail in a coming post.
https://www.nature.com/articles/s41586-024-07566-y