With little fanfare, researchers from Apple and Columbia University
released an open source multimodal LLM, called Ferret, in October 2023. At the time, the release — which included the code and weights, but for research use only, not a commercial license — did not receive much attention. But now that may be changing: With open source models from
Mistral making recent headlines and Google’s Gemini model coming to the Pixel Pro and eventually to Android, there has been increased chatter about the potential for local LLMs to power small devices.
That chatter grew louder recently after Apple announced a key breakthrough in deploying LLMs on iPhones: the company
released two new research papers introducing techniques for 3D avatars and efficient language model inference. The advancements were hailed as potentially enabling more immersive visual experiences and allowing complex AI systems to run on consumer devices such as the iPhone and iPad.