Research | Apple Intelligence – Ultimate Winner in AI? – Part 2

Apple vs. Android, Reasoning, Search, PCC, iPhone Sales, Regulations, Other Hardware, Semiconductor

FundamentalBottom · Nov 13, 2024
In our previous article, we discussed Apple’s ambition to become everyone’s personalized assistant and the entry point for AI user traffic. In this article, we continue our discussion of Apple Intelligence.

Apple vs. Android

Apple’s and Google’s ecosystems each present distinct competitive advantages. Apple operates a fully closed ecosystem loop, while Google is actively building its own: its latest self-developed chips and Pixel devices are steadily evolving into AI entry points. Google recently released Gemini Live, which it positions at GPT-4 level, and the Pixel 9 adds many upgrades and new AI functions. This reflects how model makers in the Android ecosystem are working hard to find more scenarios and entry points to keep pace with technological and ecosystem demands.

Figure 1. Ecosystem comparison: Apple and Google, with integrated hardware, software, and proprietary AI, hold an edge in on-device AI over third-party Android makers. Apple is “phone + AI” while Google is “AI + phone.”

Google currently has two main projects: Project Astra, a voice agent, and an AI tool platform that lets models call specialized tools as needed, enabling personalized customization. This approach lets users apply AI models to their individual needs and builds a more complete AI ecosystem. Google recently reorganized its Android and Pixel teams to better advance its AI hardware ecosystem.

In addition, both Apple and Google are exploring robotics, especially home robots, which may offer new growth on top of their existing smart-home footprints. Google has an extensive presence in on-device technology and cooperates closely with Samsung. It has long pursued in-house development of both on-device and cloud models, has partially self-developed chips, and owns a strong software, hardware, and OS ecosystem. Google has launched many AI functions for on-device use and plans to fully self-develop future chips to give developers more convenient tools. Although Pixel sales still trail Apple’s by a wide margin, Google’s planning is clear, and it is expected to narrow the gap through deeper integration with the broader Android team.

Google’s current strategic core is to build an “AI + phone” ecosystem centered on AI, while Apple’s strategy is “phone + AI”: the phone comes first, supplemented by AI functions. Google recently launched the brand-new Gemini Live, hoping to combine it with its strong software and hardware ecosystem to create a new AI entry point. Other Android manufacturers, especially those without their own models, often need to rely on third-party technology to iterate products. This report focuses mainly on overseas markets.

Figure 2. Google has launched the Pixel 9 series AI smartphones, featuring comprehensive upgrades in chip, storage, and other hardware

Samsung is expected to achieve significant improvements in on-device AI models by 2025. The on-device model in the Samsung S24 is built on the previous-generation architecture, derived from the Bard model, and its overall performance is not ideal. The S25 plans to adopt a 3.5B-parameter native multimodal model based on Gemini. This model uses unified tokens to process all inputs, avoiding the information loss of modality conversion, so performance and accuracy are expected to improve significantly. In use, the model can accept new dialogue prompts at any time, much like GPT-4, making interactions more natural. At the application level, the S25’s main focus is imaging: AI image processing will improve markedly. Multilingual support will also be strengthened, which may lift sales in the European market in particular, and offline voice capabilities will be greatly improved.
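As a rough illustration of what “unified tokens” means, here is a minimal sketch in which text tokens and image patches are embedded into one shared sequence for a single model, rather than converting the image to text first. All shapes and function names are illustrative stand-ins, not Samsung or Google APIs.

```python
import numpy as np

D_MODEL = 256  # illustrative embedding width

def embed_text(tokens: list[str]) -> np.ndarray:
    # Stand-in for a learned text-embedding table.
    return np.random.rand(len(tokens), D_MODEL)

def embed_image(patches: list[int]) -> np.ndarray:
    # Stand-in for a learned image-patch projector.
    return np.random.rand(len(patches), D_MODEL)

# One interleaved sequence feeds a single model; there is no lossy
# image -> caption -> text round trip.
sequence = np.concatenate([
    embed_text(["describe", "this", "photo"]),
    embed_image(list(range(16))),  # 16 image patches as tokens
])
print(sequence.shape)  # (19, 256)
```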

Reasoning Models at the Edge

I believe Chain-of-Thought (CoT) is crucial: essentially, it determines how an agent’s task is broken down and executed step by step. This is a focus not only at Apple but across the entire AI industry. To truly achieve AGI, models must be able to decompose complex tasks autonomously and determine the steps needed to complete them. Each step involves a self-check, which improves task-execution accuracy.
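As a minimal sketch of that loop, assuming hypothetical `plan`, `execute`, and `verify` callables standing in for model calls (none of these are Apple APIs):

```python
from typing import Callable

def run_with_self_check(task: str,
                        plan: Callable[[str], list[str]],
                        execute: Callable[[str], str],
                        verify: Callable[[str, str], bool],
                        max_attempts: int = 3) -> list[str]:
    """Decompose `task` into steps, run each one, and retry any step
    whose output fails its self-check."""
    results = []
    for step in plan(task):                 # 1. autonomous decomposition
        for _ in range(max_attempts):
            output = execute(step)          # 2. execute one step
            if verify(step, output):        # 3. per-step self-check
                results.append(output)
                break
        else:
            raise RuntimeError(f"step failed all attempts: {step!r}")
    return results
```

However, implementing this method on a device poses several challenges.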

First, there’s the issue of scaling down complex reasoning models. Currently, only OpenAI has this capability, while other companies are researching how to make it possible.

Second, reasoning demands a significant KV cache, as both the input and output context windows become lengthy, raising challenges in handling these on-device. Memory management becomes critical, and researchers are investigating if some parts could be stored in NAND or other storage solutions.

Third, further customization of reasoning models for specific vertical scenarios is a key focus and one of the most critical research areas today. Alongside multimodality and safety, reasoning-model development for agents is a priority.

Reinforcement learning is also significant: improving model performance based on usage outcomes. This research focus aligns somewhat with OpenAI’s o1, which enhances model efficacy through synthetic data and reinforcement learning.

OpenAI has two versions of its reasoning model: the full o1 and a preview version. The full model likely has hundreds of billions of parameters, while o1-preview may have tens of billions. OpenAI has also developed o1-mini, which we understand has fewer than ten billion parameters. Further compression could feasibly bring it down to a three-billion-parameter model. However, the context window must remain quite long to hold the context and the entire chain of operations, necessitating substantial KV cache storage. Both the KV cache and the model weights need to sit in memory, driving memory demands higher.
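A back-of-envelope calculation shows why. The configuration below is an assumed 3B-class layout; the layer and head counts are illustrative, not published specs of any Apple or OpenAI model.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 seq_len: int, bytes_per_elem: int = 2) -> float:
    # Keys and values (the factor of 2), cached per layer, head, and token.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem / 2**30

# Hypothetical 3B-class config: 28 layers, 8 KV heads, head_dim 128,
# fp16 cache, 32k-token context.
cache = kv_cache_gib(n_layers=28, n_kv_heads=8, head_dim=128, seq_len=32_768)
weights = 3e9 * 4 / 8 / 2**30        # 3B parameters at 4-bit quantization
print(f"KV cache ~{cache:.1f} GiB, weights ~{weights:.1f} GiB")
# KV cache ~3.5 GiB, weights ~1.4 GiB
```

Roughly 5 GiB combined is a heavy ask on a phone whose DRAM is shared with the OS and every other app, which is why offloading parts of the cache to NAND is being investigated.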

Figure 3. Reasoning benchmark for o1-mini. o1-mini is a smaller model optimized for STEM reasoning during pretraining. After training with the same high-compute reinforcement learning (RL) pipeline as o1, o1-mini achieves comparable performance on many useful reasoning tasks, while being significantly more cost efficient.

For smaller models, implementing multi-step reasoning presents product-design challenges, such as how long processing and reasoning should take. Nevertheless, the approach looks promising. By decoupling knowledge from reasoning ability, a small model might lack broad knowledge (it may not know today’s date, for example) yet still make sound independent judgments.

Apple Search?

Siri was set as the primary entry point by Steve Jobs, a choice rarely altered within Apple. The Siri product team has advocated internally for collaboration with OpenAI, showing their intent to boost AI capabilities rather than relying solely on Apple’s in-house model team. Siri may evolve into more than a voice assistant, potentially becoming a comprehensive entry point for intelligent search and API calls. This interaction would extend beyond simple command execution, involving deep understanding of user intent and context-specific responses. For example, if a user says, “I want to go out for dinner,” the system would not only understand this request but also recommend suitable restaurants based on the user’s historical preferences and habits. This requires the system to access and process user history to deliver more personalized services.

This type of search is known as “vertical search” because it focuses on specific application scenarios and tasks, such as food recommendations or scheduling, differing from traditional broad internet searches. Vertical search integrates workflows and scenario-specific requirements to make task execution more efficient and accurate. On Apple devices, implementing vertical search is crucial, as it leverages on-device data and functions to achieve more complex solutions.
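As a minimal sketch of this pattern, assuming a toy keyword classifier standing in for the on-device intent model and a local visit history standing in for user data (none of this reflects actual Apple APIs):

```python
from collections import Counter

# Illustrative on-device history; in a real system this would be local
# user data the assistant is permitted to read.
DINING_HISTORY = ["Sushi Bar", "Thai Kitchen", "Sushi Bar", "Pizzeria", "Sushi Bar"]

def classify_vertical(query: str) -> str:
    """Keyword stand-in for an on-device intent model."""
    if any(w in query.lower() for w in ("dinner", "eat", "restaurant")):
        return "dining"
    return "web"  # fall through to general internet search

def handle(query: str) -> str:
    if classify_vertical(query) == "dining":
        favorite, _ = Counter(DINING_HISTORY).most_common(1)[0]
        return f"Based on your history, how about {favorite}?"
    return f"Web search: {query}"

print(handle("I want to go out for dinner"))  # recommends Sushi Bar
```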

The current focus is on applications of both on-device and cloud-side search. Apple’s vertical search covers functions related to images, photos, and apps, all developed and maintained by Apple due to their close integration with the device, making them essential for users. Additionally, voice functionality is naturally integrated with search, becoming a gateway for user interaction.

As for internet search, despite Google paying Apple billions annually to remain the default search engine, Apple does not currently plan to develop its own internet search engine. Unless U.S. laws change the cooperation model due to antitrust concerns, Apple likely won’t invest in creating a search engine to compete with Google. However, in the future, Apple may consider a new AI-specific search entry point outside of Safari.

Apple’s on-device models might focus on providing quick knowledge queries, which can be seen as a form of vertical search. For instance, users may ask Siri specific questions, and Siri would need to source answers either from a local knowledge base or through the internet. Rapid iteration of this model could give Apple a greater competitive advantage in future search applications, particularly in AI and machine learning integration.

Some suggest Apple is considering transforming its internal device search into a system that directly provides answers, akin to an offline, intelligent version of Wikipedia. I believe Apple’s current strategy is to decouple on-device processing capabilities from content, meaning the device will have basic processing abilities, while specific knowledge content can be integrated through continuous updates.
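A minimal sketch of that decoupling, with a hypothetical dictionary standing in for the updatable content package:

```python
# The query pipeline ("processing") is fixed code; the knowledge base
# ("content") is plain data that can be replaced by an over-the-air
# update without touching the pipeline.
KNOWLEDGE_BASE = {                 # shipped and refreshed as data
    "everest height": "8,849 m",
    "speed of light": "299,792,458 m/s",
}

def answer(query: str) -> str:
    key = query.lower().strip("?").strip()
    return KNOWLEDGE_BASE.get(key, "No local answer; falling back to search.")

print(answer("Everest height?"))   # "8,849 m", answered fully offline
```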

PCC (Private Cloud Compute) as a major moat

Apple is currently implementing a complex architecture around PCC (Private Cloud Compute), with a three-layer model. First, there is a 2.7-billion-parameter on-device model with inference precision of about 2 to 4 bits. Second, there is an on-device intent model that determines whether a task is executed on the device, processed by Apple’s model on the PCC, or handed to GPT-4. This intent model still uses relatively traditional machine-learning methods rather than a large model. Third, the server-side model is constrained by the PCC hardware, which is built on M2 Ultra chips, so its parameter count is relatively small. Internally, Apple is developing a model with hundreds of billions of parameters, comparable to GPT-4.
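A minimal sketch of this three-tier routing, with the thresholds and signals invented for illustration (the text describes the real intent model only as traditional machine learning):

```python
from enum import Enum

class Tier(Enum):
    ON_DEVICE = "2.7B on-device model"
    PCC = "Apple server model on PCC"
    EXTERNAL = "third-party model, e.g. GPT-4"

def route(task_complexity: float, needs_world_knowledge: bool) -> Tier:
    """Stand-in for the on-device intent model."""
    if needs_world_knowledge:
        return Tier.EXTERNAL      # open-domain questions leave the Apple stack
    if task_complexity < 0.3:
        return Tier.ON_DEVICE     # light tasks: proofreading, summarization
    return Tier.PCC               # heavier generation, e.g. sketch-to-illustration

print(route(0.1, False).value)    # 2.7B on-device model
print(route(0.8, False).value)    # Apple server model on PCC
```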

On-device tasks usually cover functions such as proofreading, summarization, and rewriting, while tasks executed on the PCC may involve more complex generation tasks, such as generating complete illustrations based on sketches. For such complex tasks, AI's question-answering function is particularly critical.

At present, in the architecture of Apple Intelligence, on-device calls account for the highest share; relative to the SoC’s compute, device performance is more likely to be limited by DRAM. For the future charging model, Apple may adopt a subscription value-added service similar to iCloud, and may also launch a model store similar to the App Store, where small models are free and large models are paid.

Apple’s on-device models are still mainly single-modal, and the company is devoting large amounts of compute to developing them. Apple’s strategy is to put as many tasks as possible on the device: first, because the on-device model is the entry point; second, because hardware improvements and rapid iteration of small models are quickly raising on-device capabilities. As those capabilities improve, more complex tasks can be handled locally, reducing dependence on the cloud and on third-party servers. This aligns with Apple’s consistent strategy of not letting its products be controlled by any external company; the historical decision to replace Intel chips with M-series chips reflects the same thinking.

Processing data on-device also better protects user privacy and optimizes the use of device compute, cutting unnecessary network communication and server resource consumption. Notably, Apple’s device-side chips have long carried surplus performance, leaving NPU utilization low; the arrival of large models will significantly raise NPU utilization. However, the existing NPU architecture is relatively old, and Apple is developing a new architecture better suited to Transformer-based large language models.

In its Private Cloud Compute (PCC) buildout, Apple provides strong physical isolation for global users, building a moat in data protection. Apple plans to build PCC data centers worldwide, running on systems based on M2 Ultra chips. Several data centers outside North America are expected to complete infrastructure construction in the first quarter and to start operations in the third quarter.

Figure 4. The Secure Enclave, part of the M2 Ultra, is a dedicated security subsystem within Apple’s SoC. Isolated from the main processor, it ensures sensitive data remains secure even if the application processor is compromised.

At the same time, other Android manufacturers have relatively little investment in cloud infrastructure, so they may still need some time and resources to catch up with market leaders.

In the future, Apple’s PCC will be deployed on self-developed M-series chips plus some GPUs, with the M chips focused on inference and the GPUs used to iterate and train personalized models in the PCC environment. Apple plans to mass-produce its next-generation AI cloud inference chip in the third quarter of 2025, built on TSMC’s 3nm process with InFO_LSI packaging and paired with LPDDR5X memory for inference computing. The cost of a single chip is expected to be about $3,000, and Apple expects to order 300,000 to 400,000 units in 2025 to serve its primary user base, with cost-performance expected to beat NVIDIA’s H100 series. As chip technology iterates, costs should fall further, improving the cost-performance of Apple’s inference capabilities.
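The order figures above imply a chip spend on the order of a billion dollars; a quick check:

```python
unit_cost = 3_000                # USD per chip, per the estimate above
low, high = 300_000, 400_000     # expected 2025 order volume
print(f"Implied 2025 spend: ${low * unit_cost / 1e9:.2f}B – ${high * unit_cost / 1e9:.2f}B")
# Implied 2025 spend: $0.90B – $1.20B
```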

iPhone Sales

At present, AI-enhanced intelligent features may have a relatively limited effect on phone sales in the short term. Our research on North American users shows that although features such as call recording, photo search, and “memory movies” get good feedback, and the photo clean-up feature has also been well received, users generally feel these features are not particularly leading; some even consider them merely comparable to Google’s equivalents. As for Siri’s upgrade, the improvement is mainly in natural-language understanding, letting it grasp user intent more accurately and optimize iPhone settings accordingly. More specific features, however, such as integration with ChatGPT and the ability to call ChatGPT automatically, still need time to develop and may not appear in the current generation of products, possibly not until 2025. These intelligent features therefore cannot drive users to replace their phones immediately.

Apple has hinted that iPhone 16 order volumes will be lower than expected. The current AI features have not yet reached a truly smooth level, so they may struggle to lift iPhone 16 demand significantly in the short term. We remain very optimistic about Apple’s long-run prospects, however: by 2028, its earnings per share (EPS) may reach $11.90. Although iPhone 16 sales may trail previous generations, as infrastructure improves and the ecosystem fills out, Apple is working hard to secure a place in consumers’ minds, which is also why it is promoting Apple Intelligence so actively.

For the iPhone 17: because the iPhone 16’s design and chip development were completed before the current wave of large models arrived, there was relatively little room to add new features. We expect the iPhone 17 to expand much further around AI features, which may push Apple’s fiscal-year 2026 shipments above 260 million units. As the user base grows, Apple may also launch a subscription service for AI models, with pricing logic similar to iCloud’s.

Assuming that by 2028 this service reaches 20% penetration, iCloud’s current level, it could contribute about $17 billion in additional revenue to Apple. In the long run, as smartphones become new traffic entry points, whether through traffic referral or other means, there are opportunities to create more revenue for Apple.
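The text does not spell out the assumptions behind the $17 billion figure; one illustrative combination of installed base and price that lands in that range:

```python
iphone_users = 1.4e9     # assumed 2028 installed base (not given in the text)
penetration = 0.20       # the stated iCloud-like penetration rate
monthly_price = 5.00     # assumed USD subscription price (not given in the text)
annual_revenue = iphone_users * penetration * monthly_price * 12
print(f"~${annual_revenue / 1e9:.0f}B per year")  # ~$17B
```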

International regulations and entry into China

A major current issue is dealing with strict international regulations, especially Europe’s Digital Markets Act (DMA). The act imposes particularly strict compliance requirements in non-English regions, and Apple and the EU have not yet agreed on the specific details of the DMA; resolving these regulatory issues may take more time. Historically Apple has eventually found solutions, but this remains a challenge.

From the server-construction perspective, Apple builds its inference servers mainly in North America, with several data centers under construction in other regions, expected to be completed in the first quarter and brought online in the third quarter, aligning with the release of the next-generation iPhone 17. In China, Apple’s deployment strategy may involve a “local version of PCC,” a “Cloud Guizhou” version, which would mean transferring Apple’s resources and chips into China. This strategy faces challenges including U.S. restrictions on exporting high-end chips to China and Chinese policies prohibiting data from leaving the country.

Furthermore, model registration in China is a core issue for international companies, involving not only technical questions but also complex business and policy ones. Apple also needs to decide which Chinese partner to work with, such as Baidu’s ERNIE Bot or Tencent. Given the existing payment issues between Apple and Tencent, these business decisions will shape Apple’s operations and strategy in the Chinese market.

The biggest problem is the incoming Trump administration: it is unknown whether it will impose stricter bans on AI chips entering China. For Apple, is PCC absolutely necessary? PCC has become part of its architecture and security story, including chips specially designed for it. At present, this chip appears to sit at the boundary of the existing export ban. If stricter rules emerge in the future, such as new regulations the Trump administration may introduce, this chip may fall outside what is permitted. That raises the question of whether Apple must custom-build a PCC chip for certain markets, which would undoubtedly complicate the situation. Moreover, given the complexity of policy and the market, cooperation between Apple and companies like Huawei seems unlikely. This is the main issue.

Apple plans to officially bring Apple Intelligence to the Chinese market within the next one or two years, and this strategy may significantly affect the brand’s global influence. In places like Hong Kong, salespeople often emphasize that even if a phone is taken back to mainland China, it can still use Apple Intelligence, implying that AI-enabled phones will eventually be officially sold on the mainland. This expectation may affect consumers’ purchasing decisions, especially considering that Apple’s future products, such as the iPhone 17 and 18, may be specifically designed and optimized around AI features.

If Apple cannot launch smoothly in the Chinese market, its new AI-centered generation of phones may be restricted there, significantly affecting its sales in China. Consumers may be reluctant to buy a device whose headline features cannot be fully used locally. If the new iPhone cannot offer all its intended features in China, for example, it may be regarded as an ordinary phone rather than a high-end smartphone. That would force Apple to consider special-edition products adapted to the Chinese market’s specific needs and regulations, possibly with lower prices and adjusted feature sets.

Apple is therefore doing its best to ensure that its AI technologies and services, such as Apple Intelligence and the new Siri, can enter the Chinese market.

Other Consumer Hardware

Future interaction paradigms will change significantly as software and hardware integration deepens, expanding from phones to a broader device ecosystem that includes wearables. Hardware form factors may be the first to be liberated: more unobtrusive wearables and AI-native hardware may appear. As technology advances, on-device model capabilities are expected to rise sharply, greatly enriching interaction between AI and humans, and new projection technologies and personalized systems may make that interaction closer and more personal. Beyond phone makers, many top companies are developing wearables, investing heavily in voice-interaction technology in particular. Meta, for example, is quite aggressive in wearables, launching various smart glasses and accessories; more than a dozen new hardware products are expected in the next one to two years, including glasses, necklaces, and pendants. Although some emerging manufacturers perform modestly in the North American hardware market, continued innovation from giants like Apple, Google, and Meta will determine the market’s future direction.

Figure 5. On-device wearable AI hardware: Apple, Google, and Meta lead in wearable AI, focusing on glasses, headphones, and watches. Other form factors are being explored but have not yet scaled.

In smart homes, Apple is pushing deep into robotics, covering desktop, mobile, and humanoid robots. Reportedly, after the Apple Car project was wound down, some team members moved into robotics R&D. Apple’s robot projects reflect its long-term goal of bringing AI technology into home scenarios.

Google is also active in robotics research. Its Google Home product line keeps expanding, and it is actively positioning itself in home robots. Google’s robotics technology has deep roots, thanks to the integration of DeepMind and Google Brain and the development of embodied-intelligence models such as RT-2 and RT-X. Although Google’s Everyday Robots project was once suspended, its home-robot effort is still advancing; Google sees this as a new opportunity in the North American market. The home-robot market is expected to become another growth point for AI hardware.

Figure 6. In robotics, Google has a strong foundation in models and product development, backed by a top-tier robotics algorithm team, giving it significant competitive potential. Pictured: Google’s Everyday Robots project.

Semiconductor
