Apple, in a technical paper discussing the models that power Apple Intelligence, has said that the training data for the Apple Foundation Models was sourced in a “responsible way”. The paper gives the company an opportunity to respond indirectly to accusations that it took an ethically questionable approach to training some of its models.
“[The] pre-training data set consists of … data we have licensed from publishers, curated publicly available or open-sourced datasets and publicly available information crawled by our web crawler, Applebot. Given our focus on protecting user privacy, we note that no private Apple user data is included in the data mixture,” the company noted in the paper.
It’s a response to a Proof News report from earlier this month, which said that Apple used a data set containing subtitles from YouTube videos to train models designed for on-device processing. The report prompted content creators, such as MKBHD, to make statements.
Apple said that the AI models were pretrained on “Cloud TPU clusters”, meaning Apple rented servers from a cloud provider to perform the calculations. The choice of Google’s homegrown Tensor Processing Unit (TPU) for training is notable because Nvidia’s expensive graphics processing units (GPUs) dominate the market for high-end AI training chips.
Meta CEO Mark Zuckerberg and Alphabet CEO Sundar Pichai have both said recently that the tech industry may be overinvesting in AI infrastructure, but, at the same time, said the business risk of doing otherwise was too high. “The downside of being behind is that you’re out of position for like the most important technology for the next 10 to 15 years,” Zuckerberg told Bloomberg.
Apple has rolled out the first version of Apple Intelligence in the developer beta of iOS 18.1. It is also available in similar releases for iPad and Mac. At the moment, it is only available to registered Apple developers.
The developer preview currently includes the reimagined Siri interface, which allows switching between voice commands and typing; Writing Tools; new categories and smart replies in Mail; smart replies in Messages that use the context of a conversation; and the ability to create summaries for transcripts and content. A few Apple Intelligence features are not yet available, including Genmoji, Image Playground, the ChatGPT integration with Siri, Priority notifications and Siri personal context.