Date: 2024-04-24

Ahead of iOS 18’s debut at WWDC in June, Apple has released a family of open-source large language models. Apple calls the models OpenELM and describes them as “a family of Open-source Efficient Language Models.”

Apple says that, in its testing, OpenELM performs comparably to other open language models while using less training data.

Apple explains in the release:

To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens. Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations. We also release code to convert models to MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors.

iOS 18 will include a collection of new artificial intelligence features, and today’s OpenELM release is just the latest piece of Apple’s behind-the-scenes work in preparation.

Bloomberg reported last week that iOS 18’s AI features will be powered by an entirely on-device large language model, which will offer privacy and speed benefits.

