This Open-Source Phone AI Agent Sees, Hears and Acts—All Without Touching the Cloud

Decrypt
4 hours ago

Your phone already has a camera, a microphone, and a screen. It can see what you're looking at in real life and what's happening on its own display. And now, the AI team from Chinese smartphone manufacturer Oppo has figured out that all that hardware, which mostly sits underused, is exactly what you need to build a genuinely useful mobile AI agent.


That project is X-OmniClaw, published by the Multi-X Team. It's an open-source AI agent framework for Android that turns your phone into a hands-free, context-aware assistant capable of running real tasks across real apps, without routing everything through a cloud copy of your device.


Most mobile AI systems don't actually run on your phone. They run on cloud servers that host virtual copies of Android, letting an AI tap and scroll through apps remotely. The result: no access to your real camera, your actual photos, or your local files—just a stranger using a copy of your phone.





X-OmniClaw takes the opposite approach. Per the technical report, it introduces "an edge-native architecture that executes directly on the user's physical device, thereby eliminating the gap between simulated environments and real-world interaction contexts."


The report uses a car analogy: The smartphone is "the vehicle," X-OmniClaw is "the internal engine for control and perception," and the cloud-based language model is only called in as "the fuel" when heavy reasoning is needed. Everything else stays local.


How the Oppo AI phone agent works


According to Oppo, X-OmniClaw's overall architecture rests on three pillars, Omni Perception, Omni Action, and Omni Memory, which work as one continuous loop. Cloud LLMs are called in only for heavy reasoning.



Source: OPPO AI Center
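
To make the loop concrete, here is a minimal Python sketch of how perception, memory, and action might feed one another, with a cloud model consulted only when a request looks heavy. It is an illustration under stated assumptions, not Oppo's implementation: the `Observation` and `Agent` classes, the word-count heuristic, and the injected `cloud_reasoner` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Observation:
    """One fused snapshot of what the phone senses (hypothetical structure)."""
    camera_caption: str  # e.g. a caption from an on-device vision-language model
    screen_text: str     # e.g. text read from the current screen
    voice_request: str   # e.g. the transcribed user request


@dataclass
class Agent:
    # Stand-in for a remote LLM call; injected so the sketch runs offline.
    cloud_reasoner: Callable[[str], str]
    memory: List[str] = field(default_factory=list)
    cloud_calls: int = 0

    def needs_heavy_reasoning(self, obs: Observation) -> bool:
        # Toy heuristic (an assumption): long requests count as "heavy".
        return len(obs.voice_request.split()) > 6

    def step(self, obs: Observation) -> str:
        # Perception: fuse the three inputs into one context string.
        context = (
            f"sees={obs.camera_caption}; "
            f"screen={obs.screen_text}; "
            f"asks={obs.voice_request}"
        )
        # Memory: store the context and carry recent history forward.
        self.memory.append(context)
        history = " | ".join(self.memory[-3:])
        # Reasoning: stay local unless the request looks heavy.
        if self.needs_heavy_reasoning(obs):
            self.cloud_calls += 1
            plan = self.cloud_reasoner(history)
        else:
            plan = f"local:{obs.voice_request}"
        # Action: a real agent would tap, scroll, or launch an app here.
        self.memory.append(f"did={plan}")
        return plan
```

Calling `step` repeatedly with new observations keeps appending to `memory`, which is the property the report describes as runtime continuity. The routing rule here is deliberately crude; the article does not say how X-OmniClaw decides when to call the cloud.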

Omni Perception covers everything the phone can sense. It combines camera feeds, screen content, and voice input into a single pipeline. A vision-language model interprets the scene before the agent does anything else. So if you point your camera at a bottle and ask, "how much does this cost?", the agent first figures out what you're looking at, then opens the relevant shopping app and starts searching. No guessing required.


Omni Memory is what separates X-OmniClaw from a one-shot chatbot. The agent maintains context across tasks, app switches, and sessions. It also builds a long-term semantic memory from your photo gallery, turning raw images into structured notes about objects, scenes, and events. The report states that "runtime continuity is what lets X-OmniClaw operate as an ongoing device agent rather than a one-shot response system."
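
The idea of turning photos into structured notes can be sketched in a few lines of Python. This is a simplified illustration, not the report's method: the `PhotoNote` schema and the `build_note` and `search` helpers are hypothetical, the tags are supplied by hand rather than by a vision model, and plain keyword matching stands in for whatever semantic retrieval the real system uses.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PhotoNote:
    """Structured note derived from one image (hypothetical schema)."""
    path: str
    objects: FrozenSet[str]
    scene: str
    event: str


def build_note(path: str, tags: List[str], scene: str, event: str) -> PhotoNote:
    # In a real system the tags would come from a vision model; here they are given.
    return PhotoNote(path, frozenset(t.lower() for t in tags), scene.lower(), event.lower())


def search(notes: List[PhotoNote], query: str) -> List[str]:
    """Return paths whose note mentions every query word, in gallery order."""
    terms = set(query.lower().split())
    hits = []
    for note in notes:
        # Flatten objects, scene, and event into one bag of words.
        vocabulary = {
            word
            for text in (*note.objects, note.scene, note.event)
            for word in text.split()
        }
        if terms <= vocabulary:
            hits.append(note.path)
    return hits
```

With notes like these in place, a request such as "find my parrot photos" reduces to a lookup over text instead of a scan over raw pixels each time.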


Omni Action handles execution. It combines XML interface data with an on-device visual model and OCR, a character-recognition layer, to figure out exactly what to tap, even on ad-heavy screens where structure alone isn't enough. It also includes behavior cloning: record yourself navigating to a buried app page once, and the agent can replay that route instantly using an Android deeplink shortcut next time.
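
A rough Python sketch of the "structure first, pixels second" idea follows. It assumes the interface data arrives as XML with `node` elements carrying `text`, `clickable`, and `bounds` attributes, modeled on the layout dumps Android's UI Automator tooling produces, and that OCR results arrive as a list of text and bounding-box pairs. The helper names and the fallback order are assumptions made for illustration; the article does not describe how X-OmniClaw actually merges the two signals, and the sketch leaves out the visual model and deeplink replay entirely.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels

# Matches bounds strings of the form "[left,top][right,bottom]".
BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def center(box: Box) -> Tuple[int, int]:
    left, top, right, bottom = box
    return ((left + right) // 2, (top + bottom) // 2)


def find_in_xml(ui_xml: str, label: str) -> Optional[Box]:
    """Look for a clickable node whose text equals the label (case-insensitive)."""
    for node in ET.fromstring(ui_xml).iter("node"):
        if node.get("clickable") == "true" and node.get("text", "").lower() == label.lower():
            match = BOUNDS.fullmatch(node.get("bounds", ""))
            if match:
                left, top, right, bottom = (int(v) for v in match.groups())
                return (left, top, right, bottom)
    return None


def find_in_ocr(ocr_results: List[Tuple[str, Box]], label: str) -> Optional[Box]:
    """Fallback for text that exists only as pixels, e.g. inside a banner image."""
    for text, box in ocr_results:
        if label.lower() in text.lower():
            return box
    return None


def resolve_tap(ui_xml: str, ocr_results: List[Tuple[str, Box]], label: str) -> Optional[Tuple[int, int]]:
    """Prefer the structured match; fall back to OCR; return tap coordinates."""
    box = find_in_xml(ui_xml, label) or find_in_ocr(ocr_results, label)
    return center(box) if box else None
```

The design choice worth noting is the ordering: structured data is cheap and precise when it exists, so OCR is consulted only when the hierarchy has no matching element.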


What the Oppo AI agent can actually do


Oppo shared several demonstrations of the agent at work. In one, the agent identifies a physical product via camera, opens Taobao, scrolls through the results, and returns a price summary, with no typing required.



Oppo also demoed a floating on-screen companion that helps a user work through math exercises step by step: autonomously reading the screen, processing each question, and advancing when done.


In another example, a user asks the agent to assemble a highlight video from parrot-themed photos. The system scans the gallery, finds matching photos using its semantic memory, opens CapCut's video editor via deeplink, batch-selects the files, and generates the video. What used to take "a few minutes or longer" becomes a handful of automated steps.



Source: OPPO AI Center

2026: The year of agentic AI


AI agents have become one of the most discussed categories in tech. OpenClaw—the open-source agent framework that reached over 373,000 GitHub stars and was eventually backed by OpenAI—launched the current wave by showing what persistent, locally-run agents could do on PCs. Hermes Agent by Nous Research took things further with a self-improving learning loop that compounds capabilities over time.


Both run primarily on desktop hardware. X-OmniClaw extends the same architecture to the device you actually carry everywhere. The team built on the open-source HermesApp codebase. The paper explicitly credits OpenClaw's structured skill model as foundational inspiration, which the team then adapted for the multimodal, always-on nature of a smartphone.


The code is on GitHub now. Oppo says it will release all assets and keep updating the project as the system evolves.


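Disclaimer: This article represents only the author's personal views and does not represent the position or views of this platform. It is provided for information sharing only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If an article or image published on this page involves infringement, please email the relevant proof of rights and proof of identity to support@aicoin.com, and the platform's staff will investigate.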
