Mirror Sync vs. Merge-on-Read

October 22, 2025

You often have to choose between Mirror Sync and Merge-on-Read when you want your data to stay current. Mirror Sync keeps a full copy of your data on every device and moves only the changes after the first sync, so every device ends up the same. Merge-on-Read appends changes to a log instead of rewriting whole files, so it saves time if you edit a lot. Think about needing offline access on your tablet, or needing to handle many updates fast. The right sync method cuts waiting and keeps your data correct.

Choosing the right way is important for quick access and correct records.

Key Takeaways

Mirror Sync gives near real-time updates on all devices. You keep a local copy that works offline and catches up when you reconnect.
Merge-on-Read works well with big datasets that change a lot. It lets you write fast and manage data easily.
Pick Mirror Sync if you need the same data right away on many devices. This is good for mobile apps or disaster recovery.
Use Merge-on-Read for big data analytics or streaming data. It handles lots of changes fast without rewriting whole files.
Always look at your needs and data goals. This helps you pick the best sync method for your work.

Mirror Sync Overview

What Is Mirror Sync

Mirror Sync keeps your data the same on all devices. You use it when you want updates to be fast and dependable. It is a good fit for offline use: you can reach your files on iPad, iPhone, macOS, or a Windows PC. Mirror Sync copies changes in near real time, so you see current information without waiting for long transfers. Every device converges on one authoritative copy, which keeps your records consistent.

Mirror Sync gives you one main copy of your data. This helps you avoid mistakes and confusion.

How It Works

You set up Mirror Sync to copy your data to each device. When you change something, the other devices update almost right away. Because Mirror Sync moves only changed data, you do not wait for a big transfer, and you can keep working while it syncs. You can also customize the process through an open API.

Near real-time data copying
No ETL needed for moving data
Fast syncing
Customizable through the Open Mirroring API
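The idea of moving only changed data can be sketched in a few lines. This is a minimal illustration, not how any Mirror Sync product is implemented: the `Replica` class, the per-record version numbers, and the `sync` function are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A hypothetical device-side copy: record id -> (version, value)."""
    records: dict = field(default_factory=dict)


def changed_since(source: dict, replica: Replica) -> dict:
    """Return only the records that are missing or older on the replica."""
    return {
        rid: (version, value)
        for rid, (version, value) in source.items()
        if rid not in replica.records or replica.records[rid][0] < version
    }


def sync(source: dict, replica: Replica) -> int:
    """Apply the changed records to the replica; return how many moved."""
    delta = changed_since(source, replica)
    replica.records.update(delta)
    return len(delta)


source = {"a": (1, "alpha"), "b": (1, "beta")}
tablet = Replica()
first = sync(source, tablet)     # both records are new to the tablet
source["b"] = (2, "beta v2")     # one edit on the source
second = sync(source, tablet)    # only the edited record moves
print(first, second)
```

The first sync moves everything and later syncs move only what changed. Deletions are left out of this sketch; a real system would need a way to propagate them too.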

Benefits

Mirror Sync has several practical advantages. The table below summarizes them:

Advantage	Description
Smart Updates	Only changed data moves, so you save time.
Non-blocking Internal Writes	You can keep working while data syncs in the background.
Deferred Flushing	Writes are batched and flushed later instead of one at a time, which makes sync faster.
Optimized Clones	Smaller data sets move fast, so the process is quicker.
Improved Solution Downloads	You get the newest app version easily.
Performance Benchmark	Sync time drops from 28 minutes to less than 7 minutes, a speedup of at least 4x.
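Non-blocking writes and deferred flushing from the table above can be illustrated with a queue and a background thread. This is a sketch under stated assumptions: `DeferredWriter`, its `batch_size` parameter, and the in-memory `flushed_batches` list (standing in for the remote store) are hypothetical names, not part of any Mirror Sync API.

```python
import queue
import threading


class DeferredWriter:
    """Hypothetical writer: accepts writes immediately, flushes in batches."""

    def __init__(self, batch_size: int = 3):
        self.batch_size = batch_size
        self.pending = queue.Queue()
        self.flushed_batches = []   # stands in for the remote store
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def write(self, record) -> None:
        """Return right away; the background thread does the slow part."""
        self.pending.put(record)

    def close(self) -> None:
        """Tell the worker to flush what is left, then wait for it to stop."""
        self.pending.put(None)
        self._worker.join()

    def _run(self) -> None:
        batch = []
        while True:
            record = self.pending.get()
            if record is None:      # sentinel sent by close()
                break
            batch.append(record)
            if len(batch) >= self.batch_size:
                self.flushed_batches.append(batch)
                batch = []
        if batch:                   # flush the final partial batch
            self.flushed_batches.append(batch)


writer = DeferredWriter(batch_size=3)
for i in range(7):
    writer.write(i)
writer.close()
print(writer.flushed_batches)   # [[0, 1, 2], [3, 4, 5], [6]]
```

The caller never waits on a flush, and seven writes reach the store as three batches rather than seven separate transfers.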

Drawbacks

Mirror Sync has some limits, especially for large deployments:

Drawback	Description
Limited scalability	You must add capacity manually as your data or device count grows.
Single point of failure	If one part breaks, the whole system can stop. This is risky for important work.

Use Cases

Mirror Sync is best when you need fast, identical data on several devices. Here are some common ways to use it:

Industry/Application Scenario	Description
Offline Mobile Access	You work on a phone or tablet without internet and sync when you reconnect.
Field Sales Teams	Traveling staff see up-to-date information on every device.
Disaster Recovery	A current copy on another system keeps downtime short.
FileMaker Solutions	You keep FileMaker data current across iPad, iPhone, macOS, and Windows.

Merge-on-Read Overview

What Is Merge-on-Read

Merge-on-Read (MoR) helps you handle tables with many updates. You do not need to rewrite the whole file each time. MoR keeps a main file and a log of changes. It is the default mode for primary key tables in Apache Paimon. The table below summarizes how it behaves there:

Aspect	Description
Definition	Merge-on-Read is the default mode for primary key tables in Apache Paimon.
Read Process	Every read merges all files with a multi-way merge and deduplicates rows by primary key.
Read Performance	Reads can be slow for big jobs because a single LSM tree is read by only one thread, which limits read parallelism.
Write Performance	Writing is usually fast.
Data Volume Recommendation	Keep each bucket between 200MB and 1GB for the best read speed.
Filtering Limitation	Filtering on non-primary-key columns before the merge is not safe: it can drop the newest row and return stale data.
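The read process and the filtering limitation in the table can be shown with a toy multi-way merge. This is a simplified sketch, not Paimon's reader: the three sorted runs, the `(primary_key, sequence_number, value)` row layout, and the `merge_read` function are assumptions made for illustration.

```python
import heapq
from itertools import groupby

# Each sorted run is a list of (primary_key, sequence_number, value),
# sorted by primary key. Higher sequence numbers are newer.
run_old = [(1, 1, "a"), (2, 1, "b"), (4, 1, "d")]
run_mid = [(2, 2, "b2"), (3, 2, "c")]
run_new = [(1, 3, "a3"), (4, 3, "d3")]


def merge_read(*runs):
    """Merge sorted runs and keep the newest value per primary key."""
    merged = heapq.merge(*runs)   # multi-way merge, ordered by (key, seq)
    result = []
    for key, versions in groupby(merged, key=lambda row: row[0]):
        *_, newest = versions     # last row has the highest sequence number
        result.append((key, newest[2]))
    return result


rows = merge_read(run_old, run_mid, run_new)
print(rows)    # [(1, 'a3'), (2, 'b2'), (3, 'c'), (4, 'd3')]

# Filtering on a non-key column BEFORE the merge drops the newer row
# for key 1 and surfaces the stale one.
stale = merge_read(
    [r for r in run_old if r[2] == "a"],
    [r for r in run_mid if r[2] == "a"],
    [r for r in run_new if r[2] == "a"],
)
print(stale)   # [(1, 'a')], although the current value for key 1 is 'a3'
```

The second call shows why a filter on a value column cannot be pushed below the merge: the row that would have replaced the old value is filtered out before it gets the chance.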

How It Works

Merge-on-Read uses two file types: base files and delta logs. The base file holds the main data. The delta log records every change. When you update a record, you append to the log instead of rewriting the base file. Later, the system merges the log into the base file. This step is called compaction. You get quick updates and save space.

MoR keeps base files and delta logs.
Delta logs record each change, so updates are fast.
The system waits to rewrite files until compaction, which saves time and space.
This works well for streaming data and lots of updates.
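The base file, delta log, and compaction steps described above can be modeled with a small in-memory class. This is a toy sketch under stated assumptions: `MorTable` and its methods are hypothetical and leave out file formats, deletes, and concurrency.

```python
class MorTable:
    """Toy merge-on-read table: a base snapshot plus an append-only delta log."""

    def __init__(self):
        self.base = {}        # key -> value, rewritten only on compaction
        self.delta_log = []   # (key, value) pairs appended on each write

    def upsert(self, key, value) -> None:
        """Writing is an append; the base is left untouched."""
        self.delta_log.append((key, value))

    def read(self) -> dict:
        """Reading replays the log over the base, so later entries win."""
        view = dict(self.base)
        for key, value in self.delta_log:
            view[key] = value
        return view

    def compact(self) -> None:
        """Fold the log into a new base and start an empty log."""
        self.base = self.read()
        self.delta_log = []


table = MorTable()
table.upsert("k1", "v1")
table.upsert("k2", "v2")
table.upsert("k1", "v1b")      # an update is just another log entry
before = table.read()
table.compact()
after = table.read()
print(before == after, len(table.delta_log))
```

Reads return the same result before and after compaction. What changes is the work a read has to do: after compaction there is no log left to replay.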

Benefits

Merge-on-Read has clear advantages when you have lots of updates:

🚀 Faster Writes: You log changes, so updates are quick.
💰 Cost-Efficiency: You use less space and fewer resources.
🕒 Real-Time Processing: You can handle streaming data and fast updates.
🔄 Change Data Capture: You can track and check changes as they happen.
📊 Enhanced Features: You get time travel, step-by-step processing, and real-time analytics.

Drawbacks

Merge-on-Read also has some drawbacks:

MoR can be slow to read if you have lots of data, because it must merge files on each read.
Many updates can cause write conflicts, which may slow you down.
The system makes lots of snapshots, which can make tables bigger and slow down data loading.
Some tools do not support all delete types, so you might see old data by mistake.

Use Cases

Merge-on-Read is best when you need to handle many changes in big datasets. It works well for:

Big analytics and large data platforms.
Streaming data pipelines, like IoT or live logs.
Jobs with lots of updates, such as transactional data lakes.
Times when you want to avoid changing whole files for small edits.

Merge-on-Read helps you keep up with fast-changing data and saves you time and space.

Comparison

Performance

You want your data to move fast and stay fresh. Mirror Sync gives you near real-time updates. You see changes on every device almost right away. This works well when you need the same data everywhere, like on your phone and computer. Merge-on-Read focuses on fast writes. It logs each change quickly, so you do not have to rewrite the whole file. This helps when you have lots of updates, like in big data jobs or streaming logs. Reading data with Merge-on-Read can take longer because the system must combine the main file and the log each time.

Here is a quick look at how both methods perform:

Factor	Mirror Sync	Merge-on-Read
Write Speed	Fast for small changes	Very fast for many updates
Read Speed	Fast, served from the local copy	Slower, needs to merge files
Data Freshness	High, near real-time while connected	High if reads merge the log; otherwise depends on compaction timing
Resource Use	More for syncing devices	Less for writing, more for reading

Consistency

You need your data to stay correct, even when many people make changes at once. Mirror Sync keeps all devices the same by copying changes right away. If two people change the same thing, the system merges the changes from both local and remote sources. It uses a default policy to pick the winner. Sometimes, you can solve conflicts by hand or check a special table that stores the losing data for review.

Merge-on-Read also faces conflicts when many updates happen at once. The system merges changes from different sources to create a new, correct version. If two versions of the same record exist, it uses a default rule to decide which one wins. You can also use tools to log conflicts or resolve them during sync.

You can trust both methods to keep your data correct, but you may need to check conflict tables or use special tools if many people update the same record.
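A default conflict policy with a table for the losing data might look like the sketch below. The policy shown is last-writer-wins, which is an assumption chosen for illustration; the article does not say which default rule a given product uses. `Change`, `resolve`, and the logical `modified_at` timestamp are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    record_id: str
    value: str
    modified_at: int   # hypothetical logical timestamp
    origin: str        # "local" or "remote"


def resolve(local: Change, remote: Change, conflict_table: list) -> Change:
    """Last-writer-wins; the remote side wins ties. The loser is kept for review."""
    if local.modified_at > remote.modified_at:
        winner, loser = local, remote
    else:
        winner, loser = remote, local
    conflict_table.append(loser)
    return winner


conflicts = []
local = Change("42", "draft", modified_at=10, origin="local")
remote = Change("42", "final", modified_at=12, origin="remote")
winner = resolve(local, remote, conflicts)
print(winner.value, [c.value for c in conflicts])
```

Keeping the loser in `conflicts` rather than discarding it is what makes a later manual review possible.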

Complexity

Setting up data sync can be hard. Mirror Sync often needs you to build connectors and handle authentication, pagination, and edge cases yourself. You must also plan for retries and permissions. Merge-on-Read can be easier to adopt because it gives you one way to handle data, with built-in tools for bulk updates and retries. You get a simpler way to sync, even with lots of data.

Mirror Sync setup needs more planning and custom work.
Merge-on-Read gives you ready-made tools for common sync jobs.

If you want a simple setup, Merge-on-Read may save you time and effort.

Scalability

You may want to grow your system to handle more users or bigger data. Both Mirror Sync and Merge-on-Read have limits when you work with very large datasets. Mirror Sync may slow down if you add many devices or lots of data. You might need to add capacity manually. Merge-on-Read also faces slowdowns when reading large tables, because it must merge many files.

Factor	Mirror Sync	Merge-on-Read
Handles large datasets	Needs manual scaling	May slow down on big reads
Performance with many users	Can hit limits, needs more servers	Can slow with many updates
Product teams work to improve these limits	Yes	Yes
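For Merge-on-Read, one way to plan for growth is to turn the 200MB to 1GB per-bucket guidance from the Paimon table earlier in this article into a bucket count. The helper below is a rough sizing sketch that assumes data is spread evenly across buckets; `bucket_range` is a hypothetical name, not a Paimon function.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def bucket_range(table_bytes: int,
                 min_bucket: int = 200 * MB,
                 max_bucket: int = 1 * GB) -> tuple:
    """Return (fewest, most) buckets that keep each bucket in the size range."""
    fewest = max(1, math.ceil(table_bytes / max_bucket))   # buckets of 1GB each
    most = max(1, table_bytes // min_bucket)               # buckets of 200MB each
    return fewest, max(fewest, most)


print(bucket_range(50 * GB))   # (50, 256)
```

For a 50GB table, anything from 50 to 256 buckets keeps each bucket inside the recommended range. Skewed keys would break the even-spread assumption, so treat the result as a starting point.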

Scenarios

You should pick the right method for your needs. Here are some examples:

If you need offline access on your phone or tablet, Mirror Sync works best. You get the same data everywhere, even without internet.
If you handle lots of updates, like in a streaming data job or a big analytics platform, Merge-on-Read saves you time and space.
For a sales team that travels and needs up-to-date info on every device, Mirror Sync keeps everyone in sync.
For a company that tracks millions of sensor readings every minute, Merge-on-Read handles the high update rate without slowing down.

Think about your main goal: fast access everywhere, or handling lots of changes. Your answer will help you choose the best sync method.
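The scenarios above boil down to two questions, which can be written as a small rule. This is only a sketch of the reasoning in this section: `suggest_method` is a hypothetical helper, and the `high_update_rate` threshold is an arbitrary placeholder you would replace with a number from your own workload.

```python
def suggest_method(needs_offline_access: bool,
                   updates_per_minute: int,
                   high_update_rate: int = 100_000) -> str:
    """Map the two main questions to a suggestion. The threshold is a placeholder."""
    heavy_updates = updates_per_minute >= high_update_rate
    if needs_offline_access and heavy_updates:
        return "both"            # device sync plus a merge-on-read store
    if needs_offline_access:
        return "Mirror Sync"
    if heavy_updates:
        return "Merge-on-Read"
    return "either"


print(suggest_method(True, 50))            # traveling sales team
print(suggest_method(False, 2_000_000))    # sensor readings
```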

Head-to-Head Summary

Feature	Mirror Sync	Merge-on-Read
Speed	Fast sync, real-time updates	Fast writes, slower reads
Data Freshness	Near real-time while connected	High if reads merge the log; otherwise depends on compaction timing
Resource Use	More for syncing, less for writes	Less for writes, more for reads
Typical Use	Offline mobile, identical datasets	High-frequency updates, big data

Decision Guide

When to Use Mirror Sync

Pick Mirror Sync if you need your data available at all times and cannot wait for it to load. Many people use Mirror Sync for disaster recovery because it helps keep downtime short. You also get the same data on every device almost right away.

You need high availability.
You want low-latency access.
You need disaster recovery with little downtime.

If you use FileMaker or need offline access, Mirror Sync helps keep your data current on all devices.

When to Use Merge-on-Read

Merge-on-Read works well when you have lots of updates. It helps you process data fast and keeps it safe. This method is good for big datasets. Writes are fast, and indexing can help make up for slower reads.

Indicator	Description
Workload Suitability	Merge-On-Read is best for jobs with many updates and real-time data processing.
Data Consistency	It keeps data correct and safe using ACID transactions. This is important for big data systems.
Data Integrity	It uses checks like checksum verification to stop data problems and handle large datasets.
Performance	Indexing makes queries and data loading faster and easier.
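Checksum verification, mentioned in the Data Integrity row, works by storing a checksum next to each block and recomputing it on read. The sketch below uses CRC32 from Python's standard `zlib` module as one possible checksum; the article does not say which algorithm a given system uses, and `write_block` and `verify_block` are hypothetical helpers.

```python
import zlib


def write_block(payload: bytes) -> tuple:
    """Store the payload together with its CRC32 checksum."""
    return payload, zlib.crc32(payload)


def verify_block(payload: bytes, expected_crc: int) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return zlib.crc32(payload) == expected_crc


payload, crc = write_block(b"sensor=17;temp=21.5")
corrupted = b"sensor=17;temp=91.5"     # one byte changed in transit

print(verify_block(payload, crc))      # True
print(verify_block(corrupted, crc))    # False
```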

Use Merge-on-Read for big analytics, streaming data, or tracking many changes quickly.

Key Factors

When you choose a sync method, think about your main needs. Ask yourself these questions:

Do you need real-time sync?
How often do you update records?
Is it important to find and fix conflicts?
Do you want flexible sync schedules?
How do you keep data safe and follow security rules?

Your answers will help you pick the best method.

Tip: Always choose your sync method based on your business needs and how your system works.

Mirror Sync works best when you need identical data on every device. Merge-on-Read fits jobs with frequent updates and large datasets. To avoid costly mistakes, watch for data conflicts and poor configuration, because bad data quality is expensive to fix. To choose the right method, follow these steps:

Review your current data management.
Pick a tool that matches your goals.
Build a clear sync strategy.
Monitor and adjust your plan.

Choose the method that matches your workflow and keeps your data safe.

FAQ

What happens if you lose your internet connection during sync?

You can keep working offline with Mirror Sync. The system saves your changes. When you reconnect, it updates your data. Merge-on-Read also stores updates and merges them later.
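The offline behavior can be pictured as an outbox that fills while you are disconnected and is replayed when you reconnect. This is a minimal sketch, not a real client: `OfflineClient` is hypothetical, a plain dictionary stands in for the server, and conflict handling is left out.

```python
class OfflineClient:
    """Hypothetical client that queues edits offline and replays them later."""

    def __init__(self, server: dict):
        self.server = server        # stands in for the remote store
        self.local = dict(server)   # local copy used for reads
        self.outbox = []            # edits made while offline
        self.online = True

    def edit(self, key, value) -> None:
        """Always update the local copy; send now or queue for later."""
        self.local[key] = value
        if self.online:
            self.server[key] = value
        else:
            self.outbox.append((key, value))

    def reconnect(self) -> int:
        """Replay queued edits in order, then pull the server state."""
        self.online = True
        sent = len(self.outbox)
        for key, value in self.outbox:
            self.server[key] = value
        self.outbox.clear()
        self.local.update(self.server)
        return sent


server = {"note": "v1"}
client = OfflineClient(server)
client.online = False            # connection drops
client.edit("note", "v2")        # the edit is saved locally and queued
server_while_offline = server["note"]
sent = client.reconnect()
print(server_while_offline, server["note"], sent)
```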

Can you use both Mirror Sync and Merge-on-Read together?

You can use both in some systems. Mirror Sync works well for device syncing. Merge-on-Read helps with fast updates. You get the best of both if your platform supports it.

Does Merge-on-Read slow down your reports?

Yes, reading can take longer with Merge-on-Read. The system must combine files each time you read data. For fast reports, you may need to compact data often.
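How often to compact is usually a threshold decision. The function below sketches one possible rule: compact when there are too many delta files or when the log is large relative to the base. `should_compact` and both default thresholds are assumptions for illustration, not settings from any particular engine.

```python
def should_compact(delta_files: int, delta_bytes: int, base_bytes: int,
                   max_delta_files: int = 5,
                   max_delta_ratio: float = 0.3) -> bool:
    """Return True when reads are likely paying too much merge cost."""
    if delta_files >= max_delta_files:
        return True                      # too many files to merge per read
    if base_bytes == 0:
        return delta_bytes > 0           # nothing but log: build a base
    return delta_bytes / base_bytes >= max_delta_ratio


print(should_compact(2, 10, 1000))    # False: small log, few files
print(should_compact(6, 10, 1000))    # True: too many delta files
print(should_compact(2, 400, 1000))   # True: log is 40% of the base
```

Lower thresholds give faster reports at the cost of more background rewriting, so the right values depend on how read-heavy your workload is.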

How do you handle data conflicts?

Mirror Sync and Merge-on-Read both use rules to solve conflicts. You can set up custom rules or review conflict tables. This helps you keep your data correct.

Which method is better for mobile apps?

Mirror Sync works best for mobile apps. You get the same data on every device. You can work offline and sync later. Merge-on-Read fits big data jobs more than mobile use.
