OpenAI recently allowed paying ChatGPT Plus and Enterprise subscribers to create their own custom chatbots, known as GPTs, built on GPT-4, OpenAI's latest generative AI model.
I took advantage of this opportunity to move beyond the standard, general-purpose ChatGPT model by creating two specialized chatbots for international development practitioners using my own time and personal resources.
USAID ADS and AIDAR Chatbots
Each of these chatbots analyzes the full ADS and AIDAR regulations, respectively, as published on November 15. They are free to use by anyone and should improve over time as the GenAI “learns” from its interactions.
- ADS Chatbot: focuses on the USAID Automated Directives System (ADS) that contains over 200 chapters organized in six functional series describing the organization and functions of USAID, along with the policies and procedures that guide the Agency’s programs and operations.
- AIDAR Chatbot: looks only at the AIDAR document that provides a Mandatory Reference to the Acquisition Chapters of the ADS 300 Series.
Please engage with both of these custom chatbots at your leisure and let me know their benefits, limitations, and how I can improve them in the comments section.
For those wanting to go deeper into GenAI technology, these GPTs use Retrieval Augmented Generation (RAG). The source documents are split into shorter passages, and each passage is turned into a vector embedding – a string of numbers representing the text.
When you enter a query, the GPT converts it into a vector as well, searches the vector database for the passages closest to your query, and then uses the retrieved text to generate an answer in plain language.
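The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's actual implementation: the toy bag-of-words `embed` function stands in for a real learned embedding model, and the sample passages are invented for the example.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words count
    # vector. Production RAG systems use dense learned embeddings.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Split the source document into short passages and embed each one.
passages = [
    "ADS 302 covers USAID direct contracting procedures.",
    "AIDAR 752.7001 lists contract clauses for USAID awards.",
    "ADS 201 describes the program cycle operational policy.",
]
index = [(p, embed(p)) for p in passages]

# 2. Embed the user's query and retrieve the closest passage.
query = "Which chapter covers direct contracting?"
best = max(index, key=lambda item: cosine(embed(query), item[1]))

# 3. In a real RAG system, the retrieved passage is then handed to the
#    language model as context for generating the final answer.
print(best[0])  # → "ADS 302 covers USAID direct contracting procedures."
```

The key design point is that retrieval grounds the model's answer in the actual regulation text rather than in whatever the model memorized during training.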
GenAI Solution Disclaimer
As anyone who has interacted with Generative AI should know by now, public solutions like these can have inaccuracies and misinterpretations, often politely described as “hallucinations.”
Hence, you should not rely solely on either chatbot for conclusive interpretations of US government regulations. These tools are not meant to replace detailed reviews of the underlying regulations or the expertise of USAID Mission Executive Officers (EXOs).
In addition, these are public chatbots running on a private company’s servers. OpenAI claims it will not use input or responses from custom GPTs to inform its work. Regardless, please do not put any private or confidential information into any public GenAI solution.
My aim in creating these chatbots is to reduce confusion and repetitive questions that tire us all. I want chatbots like these to free up EXOs, USAID staff, and implementing partners to focus on the complex regulation questions and issues that make our work interesting.
What About FAM/FAH?
You may be wondering about creating a custom GPT to explore the Foreign Affairs Manual (FAM) and associated Handbooks (FAHs) – the authoritative source for structures, policies, and procedures that govern the operations of the State Department, the Foreign Service and, when applicable, other federal agencies.
Custom GPTs originally had a 100 MB limit, now a 30 MB limit, on the size of files that can be used for source documentation. The ADS and AIDAR were below that limit. The FAM alone is over 150 MB and 10,300 pages when saved as a PDF, greatly exceeding the custom GPT limit.
Please see the ChatGPT vs Claude post to find the best publicly available FAM/FAH analysis tool for you.
The image above is ChatGPT 4’s rendering of USAID staff interpreting the ADS and AIDAR – a curious visualization that totally breaks the USAID logo style guide.
Hi Wayan,
Thank you very much for sharing your experience. I was wondering how you addressed the challenges you list in the disclaimer section.
It is very good (and too often missing) to provide information about the risks and limitations of GenAI tools, but it would be better to provide clear guidelines on how those are addressed. For instance with hallucinations. As a user, I ask a question and get an answer. Then what? What indication do I have that the answer is or is not a possible hallucination? If I need to check the source document or ask an EXO, then it defeats the purpose, doesn’t it? Also, we know that in general users will not check. So is the warning only a way to absolve oneself from accountability or liability?
Regarding the risk of data leak, one could argue that whoever puts a tool live should actually verify with OpenAI whether or not they actually walk the talk and do not use the data for further processing.
Lastly, how do you think that you -as the provider- should actually verify that the aim of the tool (here: “creating these chatbots is to reduce confusion and repetitive questions”) is actually achieved?
Take care.
Vincent
Hi Vincent. Thanks for your questions around these LLM AI systems. I feel they probably warrant their own post since these are questions larger than me or my efforts. Maybe you can write a guest post to bring them to our 25,000 email subscribers?
In the meantime, I would suggest our long-standing coverage of artificial intelligence in general, and chatbot lessons learned in particular, to see how we are looking at the issues you raise. I’d stress the many digital principles in existence (and in development) to help digital development practitioners navigate new technology usage – obviously including the ICRC Handbook on Data Protection in Humanitarian Action.
Hi, Wayan. Thanks for sharing the results of your interesting work. I’m a little unclear about the objective. You say that the AI Chatbots will ‘analyze’ the AIDAR etc. You also say “My aim in creating these chatbots is to reduce confusion and repetitive questions that tire us all. I want chatbots like these to free up EXOs, USAID staff, and implementing partners to focus on the complex regulation questions…”
I’m not sure what these points mean in practical terms. Perhaps you can share a couple of examples showing input (e.g. the type of repetitive questions you refer to) versus output.
Thanks again.
Nigel, I am not privy to the specific questions asked of EXOs and USAID staff, and I doubt I could share them publicly if I did know. That said, I’ve had plenty of regulation-related questions swirl around me over my 20-year career in implementing partner organizations. Most of them relate to procurement and are covered by the FAR or AIDAR. Others are process related and can be in the ADS or FAM/FAH.
A majority of questions are only asked once – see the ChatGPT vs Claude post for examples – but multiply a dozen only-ask-once questions by the dozens of people hired into international development every day, and you have thousands of similar questions occupying hundreds of contract compliance officers’ time every month.
Would be interested in this but it looks like it’s only available to ChatGPT Plus subscribers? So it’s not quite “free”
Just gonna second what Alex said…and I myself was not aware of this until now. Apparently you can only interact with custom GPTs if you’re a Plus subscriber. I thought a subscription was only required for making the custom bots, but apparently I was wrong.
Jared, I too thought that a subscription was only required for making the custom bots; however, OpenAI changed its rules to say you need a Plus subscription, and then stopped accepting new subscribers. Ugh. This is another reason to move solutions like this in-house, so you’re not at the whim of a third party’s business decisions.
I also just tried to access and failed. I look forward to trying it. Perhaps it can be posted here or elsewhere like “Work with USAID” social media when accessible.
Has anyone here signed up to try ChatGPT Plus? Is it worth paying for a subscription?
I saw this as well, and they have also put a hold on new ChatGPT Plus subscriptions as of the past few weeks, so we’ll have to wait even if we want to pony up.
Wayan, how can the Chat GPT update the AIDAR and other files? How does one guarantee that the files ChatGPT is accessing are in fact accurate?
I am sure that there would be interest in my agency for finding FAM/FAH information that is better than a word search, as long as it is truly up to date. The FAMs and FAHs get added to almost every month.
I also have to ask about the cost of maintaining such a specialized database, as government agencies are notoriously slow to adopt new tech if there is a new budget line needed to do that.
David, your question is very appropriate, hence my note in the post that these are the public rules that I could find, as published on November 15. On this or a private LLM, one could have the relevant regulations ingested at whatever interval (daily, weekly, monthly) deemed necessary and time stamped, so users would know how recently the LLM updated its regulations files.
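That timestamping idea can be sketched simply. This is a hypothetical illustration, not a description of how custom GPTs actually work: the function names and log structure are invented, and a real private LLM deployment would tie this to its document-ingestion pipeline.

```python
from datetime import datetime, timezone

def record_ingestion(document: str, log: list) -> dict:
    # Record each refresh of a regulation file with a UTC timestamp,
    # so users can see how current the chatbot's sources are.
    entry = {
        "document": document,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    log.append(entry)
    return entry

# On each scheduled ingestion (daily, weekly, monthly), log the update.
ingestion_log = []
record_ingestion("AIDAR", ingestion_log)

# Surface the freshness to users alongside every answer.
latest = ingestion_log[-1]
print(f"{latest['document']} regulations last ingested {latest['ingested_at']}")
```

Surfacing the ingestion date with each answer lets users judge for themselves whether the chatbot's sources are recent enough for their question.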
At a previous employer, we developed a FAM/FAH Bot that was demonstrably better than the State Department’s keyword search. I wasn’t privy to the pricing strategy they wanted to use with LLM chatbots tuned to regulations, but the costs were marginal when compared with other software services and therefore could be (and often were) incorporated into existing contracts.