MoFA Mobile Solution Exploration

Technical Challenges

Differences between MoFA’s mobile application scenarios and traditional robot control:

Robot Control Scenario Characteristics:

Low-latency hardware communication (microsecond level)
Real-time processing of large sensor data
Data exchange through Dora-RS shared memory

Mobile AI Agent Scenario Characteristics:

Logical reasoning and decision-making
Network operations and screen interaction as the main focus
Dialogue and task planning oriented
Relatively high latency tolerance (millisecond level is acceptable)
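
The latency contrast above can be checked with a quick measurement. The following is a minimal sketch, assuming Python is available on the target device; the function name `socket_round_trip_ms` is hypothetical and not part of MoFA or Dora. It measures the mean round-trip time over a local socket pair, which gives a rough idea of whether plain socket IPC fits a millisecond-level latency budget on a given device.

```python
import socket
import time


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed unexpectedly")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def socket_round_trip_ms(rounds: int = 1000, payload: bytes = b"x" * 64) -> float:
    """Return the mean round-trip time in milliseconds over a local socket pair."""
    a, b = socket.socketpair()
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            a.sendall(payload)                      # "agent" sends a message
            echoed = _recv_exact(b, len(payload))   # "bridge" receives it
            b.sendall(echoed)                       # and echoes it back
            _recv_exact(a, len(payload))
        elapsed = time.perf_counter() - start
    finally:
        a.close()
        b.close()
    return elapsed / rounds * 1000.0


if __name__ == "__main__":
    print(f"mean round trip: {socket_round_trip_ms():.4f} ms")
```

The result depends on the device and its load, so it is only an indication, not a benchmark.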

Dora-RS Porting Issues

Some Problems:

// Dora shared memory
shared-memory-server (Dora's own)
├── shared_memory_extended = "0.13.0"
├── raw_sync_2 = "0.1.5"

Shared Memory Dependencies: The shared_memory_extended crate that Dora depends on has incomplete support on mobile platforms
Inter-process Communication Mechanism: IPC mechanisms on mobile operating systems differ from those on desktop systems
ARM Architecture Adaptation: Dora and its native dependencies must build and run on ARM targets

Solutions

Solution 1: Termux + Root Permissions for Complete Porting (Demo Phase)

Technical Architecture

Android Host
├── Termux (complete Linux environment)
│   ├── libandroid-shmem (shared memory)
│   ├── Rust toolchain
│   ├── Dora-RS complete version
│   └── MoFA agent framework
└── Host Bridge App (referencing Auto.JS)
    ├── Screenshot API
    ├── Touch injection API
    └── Socket communication bridge

Layered Validation Plan

Basic Environment

□ Termux installation and Root permission acquisition
□ Basic Linux command availability verification
□ Network connection and file system permission testing
pkg update && pkg install -y git curl wget
su -c "whoami"  # Verify root permissions
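
The command availability check can also be scripted. This is a minimal sketch, assuming Python has been installed in Termux (for example with `pkg install python`); the helper name `check_tools` and the tool list are illustrative only.

```python
import shutil


def check_tools(tools):
    """Map each tool name to True if it is found on PATH, else False."""
    return {name: shutil.which(name) is not None for name in tools}


if __name__ == "__main__":
    # The list mirrors the packages installed above, plus su for the root check
    for name, found in check_tools(["git", "curl", "wget", "su"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Finding `su` on PATH only shows that the binary exists; the `su -c` call above is still needed to confirm that root is actually granted.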

Rust Toolchain

pkg install rust
rustc --version && cargo --version

Shared Memory

# Kernel support check
cat /proc/version
ls -la /dev/shm
git clone https://github.com/termux/libandroid-shmem
cd libandroid-shmem && make && make install
gcc test/test.c -o test_shmem && ./test_shmem
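
As a complementary check, the sketch below probes whether POSIX shared memory works from user space. It assumes Python 3.8+ is installed in the environment and uses only the standard library; the function name `posix_shm_available` is hypothetical. It does not exercise libandroid-shmem itself, so a failure here only indicates that the platform's native POSIX shared memory is unavailable to Python.

```python
def posix_shm_available(size: int = 4096) -> bool:
    """Return True if a shared memory segment can be created, written, and read back."""
    try:
        # The import itself fails on platforms without POSIX shared memory support
        from multiprocessing import shared_memory
    except ImportError:
        return False
    try:
        segment = shared_memory.SharedMemory(create=True, size=size)
    except OSError:
        return False
    try:
        segment.buf[:4] = b"dora"
        return bytes(segment.buf[:4]) == b"dora"
    finally:
        segment.close()
        segment.unlink()  # Remove the segment so nothing is left behind


if __name__ == "__main__":
    print("POSIX shared memory available:", posix_shm_available())
```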

Dora

git clone https://github.com/dora-rs/dora
cd dora
cargo build --bin dora-daemon
cargo build --bin dora-coordinator

# Basic operation test
./target/debug/dora-daemon --help

MoFA

pip install -e . # Install MoFA, details omitted
python examples/hello_world/hello_world.py

Host-VM Bridge Layer Design

Screen Data Flow:

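// Android Host side, demo code, not yet implemented
import android.graphics.Bitmap
import java.io.ByteArrayOutputStream
import java.io.OutputStream

class ScreenBridge(private val socket: OutputStream) {
    // Compress one captured frame to JPEG. Acquiring the frame (MediaProjection ->
    // VirtualDisplay -> ImageReader with PixelFormat.RGBA_8888 -> Bitmap) is omitted.
    fun captureScreen(frame: Bitmap): ByteArray {
        val compressedImageData = ByteArrayOutputStream()
        frame.compress(Bitmap.CompressFormat.JPEG, 80, compressedImageData)
        return compressedImageData.toByteArray()
    }

    fun sendToVM(data: ByteArray) {
        socket.write(data) // Send to VM via Socket
        socket.flush()
    }
}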
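
On the receiving side, a stream socket does not preserve message boundaries, so the agent needs some way to tell where one screenshot ends and the next begins. The following is a minimal sketch of one option, a 4-byte big-endian length prefix. This wire format is an assumption for illustration, not something the design above specifies, and the helper names are hypothetical.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # Assumed wire format: 4-byte big-endian length


def send_frame(sock: socket.socket, data: bytes) -> None:
    """Send one frame as length prefix + payload."""
    sock.sendall(HEADER.pack(len(data)) + data)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return partial data."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock: socket.socket) -> bytes:
    """Receive one length-prefixed frame."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


if __name__ == "__main__":
    host, agent = socket.socketpair()  # Stand-in for the real bridge socket
    send_frame(host, b"\xff\xd8fake-jpeg-bytes")
    print(len(recv_frame(agent)), "bytes received")
    host.close()
    agent.close()
```

If this framing were adopted, the Android side would have to write a matching prefix before each image, for example with `DataOutputStream.writeInt`, which also emits a big-endian 4-byte integer.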

Operation Command Callback:

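# MoFA Agent in VM, demo code, not yet implemented
import json
import socket

class AndroidOperationNode:
    def __init__(self, path: str = "/tmp/android_bridge"):
        # Connect to the host bridge over a Unix domain socket
        self.bridge_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.bridge_socket.connect(path)

    def click(self, x: int, y: int):
        command = {"action": "click", "x": x, "y": y}
        # Sockets carry bytes, so encode the JSON text before sending
        self.bridge_socket.sendall(json.dumps(command).encode("utf-8"))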
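
To show how the bridge could separate consecutive commands, here is a minimal sketch that uses newline-delimited JSON. The delimiter and the helper names `encode_command` and `read_commands` are assumptions for illustration. The sketch uses a local socket pair in place of the real bridge so it can run without an Android host.

```python
import json
import socket


def encode_command(action: str, **params) -> bytes:
    """Serialize one command as a single JSON line."""
    return (json.dumps({"action": action, **params}) + "\n").encode("utf-8")


def read_commands(sock: socket.socket):
    """Yield decoded commands until the peer closes the connection."""
    with sock.makefile("r", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)


if __name__ == "__main__":
    agent, host = socket.socketpair()  # Stand-in for the real bridge socket
    agent.sendall(encode_command("click", x=120, y=340))
    agent.close()  # Closing signals end of stream to the reader
    for command in read_commands(host):
        print(command)
    host.close()
```

JSON never contains a raw newline inside a serialized object when produced by `json.dumps` with default settings, which is what makes the line delimiter safe here.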

Solution 2: Native Android Redesign

New Architecture Design

Android Native MoFA
├── Zenoh-Kotlin communication layer
├── Agent Runtime (Kotlin/Java)
├── Agent template system (simplified version)
├── Android Services
│   ├── AccessibilityService (screen operations)
│   ├── ScreenCapture (screenshot service)
│   └── ForegroundService (background persistence)
└── Mobile Node Hub
    ├── AutoJS integration node
    ├── UI Automator wrapper
    ├── Intent caller
    └── API call node

Comparison of the First Two Solutions

Evaluation Dimension	Termux + Linux	Native Redesign
Development Speed	🟢 Rapid reuse of existing code	🔴 Requires complete rewrite
Runtime Performance	🔴 Double virtualization overhead	🟢 Native performance
Functional Completeness	🟢 100% compatible	🔴 Functionally limited
User Experience	🔴 Extremely complex installation	🟢 Native experience
Maintenance Cost	🟢 Low maintenance	🔴 Dual maintenance

Further Scenario Analysis

An on-device solution is only justified when developers have specific needs, such as agents that must operate mobile-exclusive apps (Taobao, WeChat, etc.). That market is relatively small, and pursuing it could easily pull the project away from its core direction.

Building a full on-device mobile solution is resource-intensive: the technical complexity is high, and the return on investment is poor.

A more feasible later step is to package MoFA directly into containers. Instead of performing operations on users’ phones, the containers would simply run flows for information crawling, document processing, and similar tasks.

Solution 3: Containerized Deployment

Container-based MoFA Mobile Stage
├── Docker/Podman container
│   ├── Complete MoFA environment
│   ├── Pre-installed common dependencies
│   └── Standardized runtime
├── Common Agents/Flows
│   ├── Web data crawling
│   ├── Document processing Flow
│   ├── API call integration
│   └── Data analysis tasks
└── Mobile adaptation
    ├── Responsive Web UI
    ├── Touch-optimized interactions
    └── Offline cache support

Benefits:

Technology stack fully reused, no need to reinvent the wheel
Simple deployment

Solution 4: Cloud + Web Wrapper App

Once MoFA is deployed in the cloud, build a web wrapper app for MoFA Stage.

Cloud MoFA + Mobile Client
├── Cloud MoFA
│   ├── Agent runtime environment
│   ├── Task scheduling system
│   └── Result cache layer
├── Mobile App (web wrapper)
│   ├── MoFA Stage mobile version
│   ├── Task creation and monitoring
│   ├── Result display interface
│   └── Offline mode support
└── Communication layer
    ├── WebSocket real-time communication
    ├── RESTful API
    └── Message queue synchronization
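
The task creation, monitoring, and result cache pieces of this architecture can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the `Task` and `TaskStore` names, the status values, and the flow name are hypothetical, and a real deployment would put this behind the RESTful API and back it with persistent storage.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Task:
    flow: str                      # Name of the flow to run (hypothetical)
    status: str = "pending"        # "pending" until a result is stored
    result: Optional[Any] = None   # Cached result, served to the mobile client


class TaskStore:
    """In-memory task registry that doubles as a result cache."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}

    def create(self, flow: str) -> str:
        """Register a new task and return its id."""
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = Task(flow=flow)
        return task_id

    def complete(self, task_id: str, result: Any) -> None:
        """Store the result and mark the task as done."""
        task = self._tasks[task_id]
        task.status, task.result = "done", result

    def get(self, task_id: str) -> dict:
        """Return a JSON-serializable view for the client to poll."""
        task = self._tasks[task_id]
        return {"id": task_id, "flow": task.flow,
                "status": task.status, "result": task.result}


if __name__ == "__main__":
    store = TaskStore()
    tid = store.create("document_processing")
    store.complete(tid, {"pages": 3})
    print(store.get(tid))
```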