How to Make Quran Recitation Videos with Synced Captions

By the AyahFlow team · Updated June 2026 · 9 min read

Recitation videos with Arabic text and a translation on screen are the most-shared format in Islamic social media. The recitation is the easy part — you already have that. The captions are what take people hours. This guide covers the three ways to do them, what each actually takes, and the styling details that separate clean videos from amateur ones.

Why Quran captions are harder than normal subtitles

If you have ever tried captioning a recitation by hand, you have hit at least one of these problems:

Generic auto-caption tools fail on all four points at once, because speech-to-text models are trained on conversational Arabic, not tajwid-governed classical recitation. We wrote a separate breakdown of why auto-captioning fails on Quranic Arabic if you want the technical detail.

First: record a recitation that captions well

Whatever method you use, the recording matters. A few habits make every later step easier:

Method 1: Automatic captioning (minutes)

AyahFlow automates the entire caption layer. The flow:

  1. Upload your video (MP4/MOV) or audio file (MP3, WAV, M4A and others) in the browser.
  2. Detection. The AI listens to the recitation, identifies the surah, then verifies word-by-word against the Uthmani text to find the exact ayah range — including partial ayahs and repeated verses. You never type verse numbers.
  3. Alignment. A speech model specialized for Quranic recitation matches every word of the known text to its exact timestamp in your audio. Because the text is known in advance, this is forced alignment rather than transcription — it doesn't guess words, only timing.
  4. Review and style. A segment editor shows each caption with its timing. Captions split at waqf marks by default. You pick fonts, sizes, colors, text position, background dimming, and one of 16 translation languages. The preview matches the final render exactly.
  5. Render. The final video renders in the cloud in about a minute and downloads ready to post in 9:16, 1:1, 4:5, or 16:9.
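
The default split at waqf marks in step 4 can be sketched in a few lines of Python. This is an illustration only, not AyahFlow's actual implementation: the Word and Caption types and the split_at_waqf function are hypothetical names, and the sketch assumes the aligner emits one timestamped entry per word, with any pause sign attached to that word or following it as its own token. The pause signs are taken to be the Unicode Quranic annotation marks U+06D6 to U+06DC.

```python
from dataclasses import dataclass

# Small high Quranic pause (waqf) signs, U+06D6..U+06DC (assumed set).
WAQF_MARKS = {chr(cp) for cp in range(0x06D6, 0x06DD)}


@dataclass
class Word:
    text: str     # one word of the known text, pause sign included if present
    start: float  # seconds
    end: float    # seconds


@dataclass
class Caption:
    text: str
    start: float
    end: float


def _close(group):
    """Merge a run of words into one caption spanning first start to last end."""
    return Caption(
        text=" ".join(w.text for w in group),
        start=group[0].start,
        end=group[-1].end,
    )


def split_at_waqf(words):
    """Group word-level timings into captions, ending a caption at each waqf mark."""
    captions, current = [], []
    for w in words:
        current.append(w)
        if any(ch in WAQF_MARKS for ch in w.text):
            captions.append(_close(current))
            current = []
    if current:  # trailing words after the last pause sign
        captions.append(_close(current))
    return captions
```

Each caption inherits the start of its first word and the end of its last word, so word-level alignment carries over to segment boundaries without any manual scrubbing.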

Three videos are free (no card), so the honest advice is to test it on your own recitation rather than take our word for accuracy. Start here.

Method 2: Manual editing in CapCut or Premiere (2–4 hours)

The traditional route, and still what most tutorials on YouTube teach. Summarized honestly:

  1. Find your recited passage on Quran.com or Tanzil and copy the Uthmani text ayah by ayah.
  2. Install an Uthmani-compatible font (KFGQPC Uthmanic Hafs). In CapCut mobile this means importing the font file; in Premiere, installing it system-wide. Verify the harakat render correctly — many fonts quietly break them.
  3. Create a text layer per segment, pasting the Arabic, deciding yourself where to break lines (use the waqf marks in the mushaf as your guide).
  4. Add a second text layer per segment for the translation, copied from a published translation of your choice.
  5. Scrub the audio waveform and set in/out points for every segment by ear. This is the slow part — expect several minutes of fiddling per ayah to get word-level precision, and most people settle for ayah-level timing instead.
  6. Add a dim layer between the footage and text so the text stays readable, then export.
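
If you write down the in/out points from step 5 before building text layers, a short script can turn them into an SRT file, a plain-text subtitle format that many editors can import. This is a minimal sketch under that assumption. The function names are hypothetical, the segment texts are placeholders rather than real ayahs, and whether your editor keeps Uthmani styling on imported captions is something to check yourself.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # SRT separates cues with a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Placeholder texts; paste the Uthmani text for each segment instead.
    timings = [(0.0, 4.2, "segment 1 text"), (4.2, 9.75, "segment 2 text")]
    with open("captions.srt", "w", encoding="utf-8") as f:
        f.write(to_srt(timings))
```

Writing the file as UTF-8 matters here, since the Arabic text and its harakat fall outside ASCII.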

Done carefully, this produces excellent results, and you keep full manual control: any font, any animation, any layout. The cost is time: creators we've talked to report 2–4 hours per video. Timing precision also rarely matches forced alignment, because scrubbing by ear usually works at the segment level, not the word level.

Method 3: QuranCaption (free desktop app)

QuranCaption is a free, open-source desktop app for Windows, macOS, and Linux built for exactly this task. You install it, load your recitation, and it assists with subtitle timing and translations in many languages, then exports a styled video. It is a genuinely good free option, particularly if you produce long-form videos at a desk and want to keep everything local.

The trade-offs against a web tool: you need a computer (no phone workflow), the pipeline involves more manual steps, and rendering happens on your own hardware. We cover the options in more depth in our comparison of the best Quran video makers.

Styling that looks right

These defaults come from rendering thousands of recitation videos. They hold regardless of which tool you use:

Mistakes that mark a video as amateur

Caption your next recitation in about a minute

Upload a clip — AyahFlow detects the ayahs, syncs every word, and renders it ready to post.

Try AyahFlow Free

3 free videos · No credit card required

Common questions

Do I need to know which ayahs I recited?

Not with automatic detection — AyahFlow identifies the surah and exact ayah range from the audio alone, including partial ayahs. For manual methods, yes: you'll look the passage up yourself.

Can I caption someone else's recitation?

Technically yes — the tools don't care whose voice it is. Get the reciter's permission before publishing, and credit them. Many famous reciters' recordings are rights-managed.

What if the detection gets an ayah wrong?

Review before you render. AyahFlow shows every detected segment in the editor, where you can correct text and timing. Whatever tool you use, never publish a Quran video without checking the text against a mushaf.
