Last weekend could not come fast enough, after what could well be one of the most important weeks in AI history. A number of Large Language Model (LLM) releases and technical breakthroughs refined the picture of a society entering an era of commoditized AI. And it became clear that major disruptions to our day-to-day lives are only months away, starting with our workplaces.
Here are some of the key innovations, questions, and lessons from the past week.
The week in AI, featuring a new paradigm of multimodal AI and impressive LLM hacks
If Google gets a mention in this section, it is only to note that they did not make the highlights of the week, despite opening up their PaLM model and rolling out AI across their Google Workspace suite. Whether this is down to communication strategy or indicative of a fundamental issue with their AI capabilities, Google was overshadowed by, amongst others, Midjourney v5, Anthropic's Claude (Google-backed), Stanford's Alpaca and, of course, OpenAI & Microsoft with GPT-4 and Microsoft 365 Copilot.
GPT-4 is a multimodal LLM, where “multimodal” refers to its inputs, and that is a real gamechanger. The model accepts not only text as prompts but also images. You can expect GPT-4 to identify what is odd or funny in a picture, or to propose recipes based on a photo of the contents of your fridge.
It is fair to say that, when GPT-4 was released last week, Twitter was blown away. The peak of excitement was reached when Greg Brockman, President and Co-Founder of OpenAI, drew a sketch of an imaginary website, fed a picture of that sketch to GPT-4, and got back the code for a fully operational website replicating the design. Although the stunt has not yet been replicated independently, that demo was a big statement from OpenAI to their peers.
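To make the fridge-photo example concrete, here is a hypothetical sketch of what a multimodal prompt could look like in Python. GPT-4's image input was not generally available through the public API at the time, so the model name and the exact request format (an image sent as a base64-encoded content part) are assumptions, not a documented recipe.

```python
# Hypothetical sketch of a multimodal prompt: the model name and the
# image content-part format are assumptions, since GPT-4's image input
# was not generally available through the public API at the time.
import base64
import openai

openai.api_key = "sk-..."  # your API key


def suggest_recipes(image_path: str) -> str:
    """Ask an assumed multimodal GPT-4 endpoint for recipes based on a fridge photo."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = openai.ChatCompletion.create(
        model="gpt-4",  # assumed multimodal-capable model name
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Here is a photo of the contents of my fridge. "
                             "Suggest three recipes I could cook tonight."},
                    # Assumed image content-part format
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }
        ],
    )
    return response["choices"][0]["message"]["content"]


print(suggest_recipes("fridge.jpg"))
```

The point is less the exact syntax than the workflow: an image plus a one-line instruction replaces what used to require manual inventory-taking and a search engine.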
And whilst the big AI players' arms race was unfolding in front of our eyes, independent developers continued their relentless efforts to tweak and optimize whatever was available publicly or on BitTorrent.
The Stanford team falls into that second camp, as their Alpaca model is derived from LLaMA 7B (the “small” one) and achieves ChatGPT-like results through fine-tuning. It is all the more impressive as the fine-tuning “only” required c.50k input-output pairs, for a total cost of $600 (of which $500 was paid to generate the input/output pairs via … OpenAI's API).
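For readers who want a sense of how small that recipe is, here is a simplified sketch of Alpaca-style instruction fine-tuning using the Hugging Face stack. The checkpoint ID, dataset file, prompt template and hyperparameters are illustrative assumptions, not the Stanford team's exact setup.

```python
# Simplified sketch of Alpaca-style instruction fine-tuning.
# Checkpoint ID, dataset file and hyperparameters are illustrative,
# not the Stanford team's exact configuration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "decapoda-research/llama-7b-hf"  # assumed LLaMA 7B checkpoint ID
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# ~50k instruction/output pairs, one JSON object per example (assumed filename)
data = load_dataset("json", data_files="alpaca_data.json", split="train")


def to_prompt(example):
    # Instruction followed by the expected response, Alpaca-style
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)


tokenized = data.map(to_prompt, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="alpaca-7b",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Nothing here is exotic: the value is in the base model and in the 50k demonstration pairs, which is exactly why the approach is so cheap to reproduce.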
Also last week, 4-bit quantization gained momentum across several LLMs, including LLaMA, Alpaca and BLOOM, which now run on consumer hardware. The progress made over just a few days is astonishing, as the smaller models reportedly run smoothly on smartphones.
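To see why 4-bit weights fit on a phone, here is a toy, self-contained sketch of group-wise absmax quantization in PyTorch. Real implementations (llama.cpp's formats, GPTQ) are considerably more sophisticated, but the principle is the same: store small integers plus one scale per group instead of full-precision floats.

```python
# Toy sketch of group-wise 4-bit (absmax) weight quantization in PyTorch.
# Real implementations pack two 4-bit codes per byte and use smarter
# rounding; this only illustrates the principle.
import torch


def quantize_4bit(weights: torch.Tensor, group_size: int = 64):
    """Quantize a flat weight tensor to signed 4-bit integers, group by group."""
    w = weights.reshape(-1, group_size)
    scales = w.abs().max(dim=1, keepdim=True).values / 7  # int4 range is [-8, 7]
    q = torch.clamp(torch.round(w / scales), -8, 7).to(torch.int8)
    return q, scales


def dequantize_4bit(q: torch.Tensor, scales: torch.Tensor) -> torch.Tensor:
    """Reconstruct an approximate float tensor from the 4-bit codes and scales."""
    return (q.float() * scales).reshape(-1)


w = torch.randn(4096)          # a slice of an LLM weight matrix
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
print(f"mean absolute error: {(w - w_hat).abs().mean():.4f}")
```

With two 4-bit codes packed per byte, a 7B-parameter model shrinks from roughly 13 GB in float16 to around 4 GB, which is what brings it within reach of laptops and high-end smartphones.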
GPT-4 System Card and the Alignment Research Center
So fine-tuning and/or side-tuning may be the next big theme, opening the door to a multitude of private (and/or open-source) LLMs. Not GPT-4-equivalent LLMs, but certainly good enough, given they can easily be integrated into other apps to actually do something. In any case, if, as Simon Willison inferred, LLMs can now be “trained” for a fraction of the carbon footprint and at a cost an order of magnitude lower, the scenario of private LLMs within 12-24 months does not seem so unrealistic.
The environmental aspect definitely falls on the positive side of things. Then there is the valid argument that such an important technology should not be concentrated in the hands of a few large players (and China will develop their own anyway). But no one can deny that, after witnessing GPT-4 at work, a number of threats suddenly felt not so remote.
Without falling into the trap of AI panic fuelled by influential groups, it is clear that the leaders in LLMs understand the threats, or at least use them as an excuse not to disclose all information about their models. In an interview with The Verge, Ilya Sutskever, OpenAI's Chief Scientist and Co-Founder, declared that great harm could be caused by their technology and that the company's past approach of openly sharing research was wrong.
I would say that the safety side is not yet as salient a reason as the competitive side. But it’s going to change, and it’s basically as follows. These models are very potent and they’re becoming more and more potent. At some point it will be quite easy, if one wanted, to cause a great deal of harm with those models. And as the capabilities get higher it makes sense that you don’t want to disclose them
Ilya Sutskever, OpenAI’s Chief Scientist and Co-Founder - The Verge - March 20, 2023
Upon release of GPT-4, OpenAI issued a System Card along with a Technical Report. The System Card shows that the company did look into risks and checked that the most imminent threats were appropriately addressed. It starts with a content warning: “This document contains content that some may find disturbing”. And indeed, the read does at times trigger uncomfortable feelings.
The scary part is not so much that the Alignment Research Center (ARC), which evaluated the model, concluded that it was “probably not yet capable of autonomously [replicating]”, but that GPT-4, without any task-specific fine-tuning, was apparently capable of convincing a human being to do something it could not do on its own (in that case, solving a CAPTCHA).
Read @Algon_33's brilliant summary of the Safety Card here
In the workplace: reshaping hierarchies and redefining careers
Having seen GPT-4 achieve so much, one might even find it odd that it could not solve a CAPTCHA. In reality, understanding everything these powerful models can do will take time, even for computer scientists, as capabilities seem to emerge unexpectedly above a certain model size. These “emergent” abilities can be observed but cannot really be explained yet.
GPT-4 is surely not AGI but, if the demos are even half-true, it displays a high level of intelligence, and that level is only going to increase. Such intelligence, combined with very short execution times, means that humans will soon be regarded as an obsolete technology for many tasks. That is why the effects of powerful LLMs should first be felt in the workplace.
If, at any point in your career, you have been part of a workflow and had to crunch data, replicate documents, or draft memos, e-mails and presentations, you must see how multimodal LLMs change everything: productivity is significantly enhanced through, amongst other things, photo-to-text capability, bullet-point expansion, automatic translation, and photo- or text-to-code capability. Within companies, functions are therefore likely to be polarized along a vertical axis, with Pilots at the top, AI-Copilots in the middle, and human assistants of the AI-Copilots at the bottom.
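To make the AI-Copilot layer concrete, here is a minimal sketch of “bullet-point expansion” through the OpenAI chat API: terse meeting notes in, a draft memo out. The prompt wording and model choice are illustrative assumptions, not a prescribed workflow.

```python
# Minimal sketch of bullet-point expansion: turn terse notes into a draft memo
# via the OpenAI chat completions API. Prompt wording and model name are
# illustrative assumptions.
import openai

openai.api_key = "sk-..."  # your API key


def expand_bullets(bullets: list[str]) -> str:
    """Expand a list of bullet points into a short, polished memo."""
    prompt = ("Expand the following bullet points into a concise, professional "
              "memo addressed to the whole team:\n- " + "\n- ".join(bullets))
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
    )
    return response["choices"][0]["message"]["content"]


print(expand_bullets([
    "Q2 launch moved to July",
    "hiring freeze lifted for data team",
    "weekly demo now on Thursdays",
]))
```

A few lines of glue code like this are all it takes to wire an LLM into an existing workflow, which is precisely why the middle of the org chart is the first place to feel the change.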
If you see the glass as half-full, there will be plenty of opportunities, since AI can help you become a Pilot faster than ever. From a technical perspective, the playing field may never be truly level, but multimodal LLMs should considerably reduce certain asymmetries, so that technical expertise may no longer constitute a real barrier, nor a sufficient differentiator. It is clear that software development will become more accessible, leading to more product-oriented solutions, which is another positive outcome for end-users. However, DIY software will inevitably raise serious security questions, meaning that IT security expertise should remain in high demand. AI compliance expertise, too.
Notwithstanding the opportunities, the net effect of commoditized AI on labor is likely to be negative, as recent research suggests. OpenAI's CEO Sam Altman recognizes the threat, saying in an interview with ABC News last week that he was worried about how quickly this could happen.
"I think over a couple of generations, humanity has proven that it can adapt wonderfully to major technological shifts.” ”But if this happens in a single-digit number of years, some of these shifts ... That is the part I worry about the most."
Sam Altman, CEO of OpenAI - ABC News - March 16, 2023
AI is expected to completely derail some careers (think of professional photographers competing with the likes of Midjourney v5), sometimes before they have even started (think of students in Natural Language Processing now that it has effectively become a commodity). But what is equally, if not more, concerning are the psychological consequences, for an individual, of realizing that they have become useless. If they are perfectly cynical about it, psychologists must be looking forward to the AI revolution, as there will be a lot of business for them…