A “Goldilocks” Level of Complexity
With the recent public release of ChatGPT, there have been many examples of people using GPT models for a range of tasks, from writing poems to building websites to answering questions. These highlight both the incredible range of capabilities of these models and the shortcomings of the models in their current state.
I recently integrated easy access to GPT-3 into my Emacs workflow and have been using the model for tasks that are significantly less complex than the examples above, but just a little above the complexity level of the tasks I generally turn to Emacs keyboard macros for. I’ve come to the conclusion that there is a range of text-manipulation tasks that lie within a “Goldilocks” level of complexity, and these have made the GPT API practical and useful to me today, out-of-the-box:
Semantic Macros
gpt-emacs-macro package: GitHub | Discussion on Reddit | Twitter
Keyboard macros are one of my favorite features in Emacs — being able to string together a sequence of keyboard commands, store that sequence, and then apply it on arbitrary selections of text is extremely useful. However, there’s a limit to the complexity of the actions I want to perform, weighed alongside the amount of time it will take me to formulate my objective as a macro, before I just decide to do it manually.
For example, repeated tasks like “make all text after the 5th word in a line lowercase” or “add a comma after every second word in a sequence” are perfect candidates for recording macros: the logic is fixed, and there are no edge cases or exceptions to the rules. Thus, I can record a macro as quickly as I can perform the edits on the first instance, then repeat the macro for subsequent instances.
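To make “the logic is fixed” concrete, here is a minimal Python sketch of those two example tasks. It is only an illustration of how mechanical the rules are, not how an Emacs keyboard macro works internally; the function names are hypothetical, and words are assumed to be separated by single spaces.

```python
def lowercase_after_nth_word(line: str, n: int = 5) -> str:
    """Lowercase every word that comes after the first n words of a line."""
    words = line.split(" ")
    return " ".join(words[:n] + [w.lower() for w in words[n:]])


def comma_after_every_second_word(line: str) -> str:
    """Append a comma to the 2nd, 4th, 6th, ... word of a line."""
    words = line.split(" ")
    return " ".join(
        w + "," if i % 2 == 0 else w
        for i, w in enumerate(words, start=1)
    )
```

Each rule fits in one or two lines with no special cases, which is exactly why recording it as a macro costs no more than doing the first edit by hand.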
However, for things that are a little more complex, I generally give up and just do them manually, as it takes more effort and time to build these macros than it would to complete the tasks. A recent example is taking an author list of full names and abbreviating each to first initial and last name. There are multiple edge cases: some names in the list already meet this format, some include a middle name, and words such as “and” in the string need to be ignored. Other times, I don’t recall how to do something (how do I convert this org-mode hyperlink to a Markdown-format URL?) and end up looking it up and manually making the adjustments.
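For reference, the hyperlink conversion mentioned above is small once the two syntaxes are in front of you: an org-mode link with a description is written `[[url][description]]`, and the Markdown equivalent is `[description](url)`. A minimal Python sketch follows, under the assumption that every link has a description; bare `[[url]]` links are left untouched, and the function name is hypothetical.

```python
import re

# Matches org-mode links of the form [[url][description]].
ORG_LINK = re.compile(r"\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]")


def org_link_to_markdown(text: str) -> str:
    """Rewrite [[url][description]] links as [description](url)."""
    return ORG_LINK.sub(r"[\2](\1)", text)
```

For example, `org_link_to_markdown("See [[https://example.com][Example]].")` yields `See [Example](https://example.com).` The code is short, but only if you already remember both syntaxes, which is the lookup cost described above.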
Both these examples are real tasks I needed to perform over this past week, which I would have ordinarily just done manually. However, with my gpt-emacs-macro package, I could just select the text, press C-x g, and ask GPT-3 to “convert the list of names to first initial, last name”. In a few seconds, the text was replaced with what I wanted! Similarly, I highlighted a URL and asked to “format this as a Markdown URL”, without having to look up the syntax!
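The general shape of this workflow (selected text plus a natural-language instruction goes in, replacement text comes out) can be sketched in a few lines of Python. This is not the gpt-emacs-macro implementation, which lives in Emacs; the prompt layout and the names `build_prompt`, `semantic_macro`, and `complete` are assumptions made for illustration. `complete` stands in for whatever function sends a prompt to a GPT model and returns its text.

```python
from typing import Callable


def build_prompt(instruction: str, selection: str) -> str:
    """Combine the instruction and the selected text into one prompt.

    The layout here is a hypothetical choice, not the package's format.
    """
    return (
        f"Instruction: {instruction}\n\n"
        f"Text:\n{selection}\n\n"
        "Rewritten text:\n"
    )


def semantic_macro(
    selection: str,
    instruction: str,
    complete: Callable[[str], str],
) -> str:
    """Return the model's rewrite of the selection.

    `complete` is a caller-supplied function (hypothetical here) that
    maps a prompt string to the model's completion string. If the model
    returns nothing, the original selection is kept unchanged.
    """
    result = complete(build_prompt(instruction, selection)).strip()
    return result if result else selection
```

Passing the model call in as a function keeps the sketch independent of any particular API client, and it can be tried with a stub such as `lambda prompt: "J. Doe, J. Smith"` before wiring up a real model.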
Some Reflections
I like calling these tiny actions “semantic macros” because they allow me to think about interfacing with text at a fundamentally different level than I do when working with keyboard macros. Instead of having to think from big picture → logical steps to implementation → recording kbd-macro, I start and stop my thought process at big picture and have GPT-3 take care of the rest.
I think this is a perfect use case for these models in their current state: the wide capabilities of the model allow it to perform a huge range of tasks that can be formulated in natural language, but the tasks are limited enough in scope that the generations do not contain significant errors. Using GPT-3 for these tasks has saved me seconds or minutes at a time, bit by bit; as my intuition about what sorts of things I can leverage the model for grows, I continue to offload more and more manual work to GPT.
Some conditions I’ve thought of for these tasks that make them well-suited for automation via GPT:
They can be performed in a single step/clearly explained with a single command, rather than a sequence of commands.
They are simple enough to verify (as a human) such that I’m not too worried about missing any mistakes in the generation.
For these sorts of tasks, even if the model is able to complete 80% of a task and requires me to complete the remaining 20%, that is still a time-saver compared to manually doing the whole thing.
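One way to keep the verification step cheap is to look at a diff between the original text and the generation instead of rereading both. The sketch below uses Python's standard `difflib` module; it is an illustration of the idea, not a feature of the package, and the function name is hypothetical.

```python
import difflib


def review_diff(original: str, generated: str) -> str:
    """Return a unified diff between the original and generated text.

    An empty string means the model changed nothing.
    """
    diff = difflib.unified_diff(
        original.splitlines(),
        generated.splitlines(),
        fromfile="original",
        tofile="generated",
        lineterm="",
    )
    return "\n".join(diff)
```

Lines starting with `-` were removed and lines starting with `+` were added, so a name the model skipped or mangled stands out at a glance, and the remaining 20% of the work is easy to locate.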
Do I need to pay for ChatGPT to be able to use this?