How to Create AI Videos with Synthesia (Step by Step)

Updated · May 16, 2026

Most Synthesia tutorials spend three paragraphs explaining what Synthesia is. You’re here because you already know that; you want to know what to actually click. This walkthrough takes you from blank screen to exported MP4, covering the settings most guides skip and the gotchas that waste your first 30 minutes.

Before you start, you’ll need three things: a Synthesia account (the free trial gets you one video), your script in any format (rough notes are fine), and roughly 15–20 minutes for a two-minute finished video.

1. Create your account and understand what you’re paying for

Go to Synthesia’s homepage and click Get started free. The free trial gets you one video export at no cost — enough to run the full workflow before committing to a plan. After that, the Starter tier runs around $22/month billed annually and gives you approximately 10 videos per month with no watermarks. The Creator plan (around $67/month annually) adds global brand kit settings, more avatar variety, and the ability to order a custom avatar built from your own likeness.

One thing to check before entering card details: avatar access. Synthesia’s stock library of 230+ avatars is available on all paid plans. But custom avatars — where you record yourself once and generate an AI version — are Creator and above only. Decide upfront which you need, because upgrading mid-month just for that feature isn’t worth it.

After signup, you land on the dashboard. It looks like a stripped-down presentation tool. That’s the right mental model — each video is a deck of slides, each slide driving one chunk of script.

2. Start a new project and pick a template

Click New video from the dashboard. Synthesia gives you two starting points: choose a template or start from scratch. For your first video, pick a template. Look in the Business or Training category — these pre-configure slide layout, avatar position, and background so you’re not making design decisions while also learning the interface.

The editor opens with three main areas: slide panel on the left, canvas in the center, element controls on the right. The script input lives below the canvas — whatever you type there is what the avatar speaks.

If no template fits, start blank and choose a solid-color background from the Backgrounds tab. Avoid image backgrounds with detail — they compete with the avatar and make any text overlays nearly unreadable. Flat colors or soft gradients work best.

3. Write your script directly in the editor

Click the script field below the canvas and paste your text. Each slide should carry one idea — roughly 30–60 words, which lands at about 15–25 seconds of spoken video. Go longer than that and the avatar either rushes or the pacing feels crammed. When a slide hits 60 words, split it into two.
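If your script already exists as one long block of text, you can pre-split it before pasting. The following is a minimal sketch, not part of Synthesia: it packs whole sentences into slides of at most 60 words and estimates spoken length. The function names and the 2.4 words-per-second rate are assumptions (the rate is derived from the rough "60 words is about 25 seconds" figure above).

```python
import re

MAX_WORDS = 60           # per-slide ceiling suggested in this guide
WORDS_PER_SECOND = 2.4   # assumption: 60 words in roughly 25 seconds


def split_into_slides(script: str, max_words: int = MAX_WORDS) -> list[str]:
    """Greedily pack whole sentences into slides of at most max_words words.

    A single sentence longer than max_words is kept whole on its own slide.
    """
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    slides, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new slide when this sentence would push us over the limit.
        if current and count + n > max_words:
            slides.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        slides.append(" ".join(current))
    return slides


def estimate_seconds(slide: str, wps: float = WORDS_PER_SECOND) -> float:
    """Rough spoken duration of one slide, in seconds."""
    return round(len(slide.split()) / wps, 1)
```

Paste each returned chunk into its own slide, then still read it aloud: the split only respects sentence boundaries, not ideas.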

Synthesia reads punctuation, so periods create natural pauses. If you need a deliberate beat before a key point, insert a pause marker inline: <break time="1s" />. Use straight quotes in the marker, not curly ones. Most people don’t know this exists and wonder why their avatar sounds like it’s racing through a legal disclaimer.
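If you prepare scripts outside the editor, a small helper can drop the pause marker in front of a key phrase and guarantee straight quotes. This is a sketch under the assumption that the marker syntax is exactly the one shown above; `add_pause` is a hypothetical name, not a Synthesia function.

```python
def add_pause(script: str, before: str, seconds: float = 1.0) -> str:
    """Insert a pause marker immediately before the first occurrence of `before`.

    Returns the script unchanged if the phrase is not found.
    """
    marker = f'<break time="{seconds:g}s" />'  # straight quotes on purpose
    idx = script.find(before)
    if idx == -1:
        return script
    return script[:idx] + marker + " " + script[idx:]


print(add_pause("Here is the thing. It works.", "It works."))
```

Only the first match is changed, which keeps a repeated phrase from picking up a pause everywhere it appears.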

Write for speaking, not reading. Read each sentence aloud before you paste it. If you trip over a word, rewrite it. The avatar will deliver whatever you give it, awkward phrasing included — there’s no AI layer smoothing the script for you. In our testing, scripts ported directly from slide decks almost always needed at least one pass to convert bullet logic into spoken sentences.

4. Choose your avatar and voice

Click the avatar thumbnail on the canvas — or find Avatar in the right panel — to open the full library. Filter by gender, age, and apparent ethnicity. Before committing to any avatar, click the preview button and watch the full clip. Some avatars have subtly stiff body language that reads as uncanny over a two-minute video; you want one whose movements feel natural before you build 10 slides around them.

Voice and avatar are linked by default but separable. Under the Voice tab, you can change language and accent without touching the avatar’s appearance. Synthesia supports over 140 languages with auto-synced lip movement — meaning you can produce an English version and a Spanish version of the same video by swapping the voice setting and replacing the script. No re-rendering the avatar, no re-recording anything.

For tone, some voices offer a Conversational vs Formal toggle. Pick conversational for training content. Formal voice sounds like an automated phone system, which is the exact feeling you’re trying to avoid.

5. Set up your scene — background, text, and media

The right panel has tabs for Elements, Media, Text, and Backgrounds. Here’s what’s actually worth using:

  • Text overlays — add a headline or key takeaway, ideally six words or fewer. More than that and it competes with what the avatar is saying rather than reinforcing it.
  • Media uploads — drop in a screenshot, product image, or diagram and Synthesia positions it on a split-screen layout next to the avatar. This is the right move for software walkthroughs where you want to show a UI while explaining it.
  • Brand colors — in the Backgrounds tab, use the custom color option and enter your hex code directly.

To add a logo, go to Elements > Upload image, then drag it to a corner of the canvas and resize using the corner handles. On the Starter plan, there’s no global brand kit, so you’ll need to add your logo to each slide individually. On longer videos this gets tedious fast. It’s one of the more frustrating omissions at that price tier, and a genuine reason to consider Creator if you’re making branded content regularly.

6. Generate, preview, and export your video

When your slides are ready, click Generate video in the top-right corner. A two-minute video typically renders in 3–5 minutes; longer videos (10+ minutes) can take 15–20. Synthesia sends an email notification when it’s done, so you don’t need to sit on the tab waiting.

When the preview loads, watch the full video before downloading. Listen specifically for mispronounced words and slides where pacing feels rushed. To fix a mispronunciation, go back to that slide’s script and use phonetic spelling — if “Synthesia” is being mangled, try writing it as “Sin-theez-ia” in the script field. Then click Regenerate on just that slide; you don’t have to re-render the entire video.
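If the same brand names get mangled in every video, it can help to keep a small table of phonetic respellings and apply it to the script before pasting. A minimal sketch follows; the `PRONUNCIATIONS` table and `apply_pronunciations` name are illustrative, and the respelling is the one suggested above, so adjust it by ear.

```python
import re

# Hypothetical override table: written form -> phonetic respelling.
PRONUNCIATIONS = {
    "Synthesia": "Sin-theez-ia",
}


def apply_pronunciations(script: str, overrides: dict[str, str] = PRONUNCIATIONS) -> str:
    """Replace whole-word matches with their phonetic respelling."""
    for word, spoken in overrides.items():
        pattern = re.compile(rf"\b{re.escape(word)}\b")
        # A function replacement avoids treating backslashes in `spoken` as escapes.
        script = pattern.sub(lambda _match, s=spoken: s, script)
    return script
```

Apply this only to the text the avatar speaks. On-screen text overlays should keep the normal spelling.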

To download, click Download > MP4. The default export is 1080p. No standard plan offers 4K, but for training videos, internal communications, or social clips, 1080p is fine. If you’d rather share a link than a file, click Share: Synthesia hosts the video and gives you a direct URL with optional privacy controls.

What do you do when the video won’t generate?

Three problems come up repeatedly, and all of them have straightforward fixes:

  • Avatar mouth out of sync with words — almost always caused by hidden characters or smart quotes pasted from Word or Google Docs. Paste your script into a plain text editor (Notepad on Windows, TextEdit in plain text mode on Mac) first, clean it up, then paste into Synthesia.
  • “Video generation failed” error — refresh and try once more. If it fails a second time, Synthesia’s live chat support is genuinely responsive, typically within a few minutes during business hours. Don’t wait hours trying to diagnose it yourself.
  • Slide hits a character limit — Synthesia caps each slide at around 1,000 characters, which isn’t clearly documented anywhere in the UI. If you’re getting a generation error on a specific slide, check whether it’s unusually long and split it.
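The first and third fixes above can be scripted if you paste from Word or Google Docs often. This is a rough sketch, not a Synthesia feature: it swaps curly quotes for straight ones, strips invisible formatting characters, and flags slides over the approximate 1,000-character cap. The function names and the exact limit are assumptions.

```python
import unicodedata

SLIDE_CHAR_LIMIT = 1000  # approximate cap mentioned above; treat as an assumption

REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u00a0": " ",                  # non-breaking space
}


def clean_script(text: str) -> str:
    """Normalize smart quotes and remove invisible format characters."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    # Unicode category "Cf" covers zero-width spaces, byte-order marks, soft hyphens.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def over_limit(slides: list[str], limit: int = SLIDE_CHAR_LIMIT) -> list[int]:
    """Return 1-based slide numbers whose script exceeds the character limit."""
    return [i for i, s in enumerate(slides, start=1) if len(s) > limit]
```

Run `clean_script` on each slide first, then `over_limit` on the cleaned list, and split any slide it reports.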

What can you build once you have the basics?

The most immediate efficiency gain is building a reusable template. Duplicate any finished video from the dashboard and replace only the script — avatar, background, branding, and layout stay identical. For teams producing weekly training content, this cuts production time from 20 minutes to closer to 5.

For higher-volume workflows, Synthesia has an API that accepts script text and returns a rendered video URL. It requires the Enterprise plan, but it maps cleanly to automation tools like Zapier — you can trigger a new video from a spreadsheet row, a CRM update, or a form submission without touching the editor at all.
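To make the automation idea concrete, here is a sketch of the step that sits between a spreadsheet row and an API call: turning the row into a JSON request body. Everything about the shape is hypothetical. The field names (`title`, `slides`, `script`, `avatar`) are placeholders, not Synthesia’s actual schema, and no request is sent; check the official API reference for the real endpoint, authentication, and field names before wiring this up.

```python
import json


def row_to_payload(row: dict) -> dict:
    """Build an illustrative request body from one spreadsheet row.

    Assumes the row has a "title" and a "script" column, with slides
    separated by blank lines. Field names are placeholders only.
    """
    slides = [s.strip() for s in row["script"].split("\n\n") if s.strip()]
    return {
        "title": row["title"],
        "slides": [
            {"script": text, "avatar": row.get("avatar", "default")}
            for text in slides
        ],
    }


row = {"title": "Onboarding", "script": "Welcome.\n\nLet's begin."}
print(json.dumps(row_to_payload(row), indent=2))
```

Keeping this mapping in one small function means that when the real schema differs, only the returned dictionary needs to change.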

If Synthesia’s avatar library doesn’t match your use case — say, you need a more cinematic look or you’re generating short-form creative content rather than training videos — HeyGen is worth a look as an alternative. It has a similar avatar-plus-script workflow but with stronger support for talking-head video styles and a more generous free tier. Neither tool handles everything; which one fits depends on what “professional video” means for your specific context.

Frequently asked questions

Can you use Synthesia without paying?

The free trial lets you create one video and export it — enough to evaluate the full workflow. After that, you’ll need the Starter plan (around $22/month billed annually) to produce videos without watermarks on a recurring basis.

How many languages does Synthesia support?

As of mid-2026, over 140 languages with lip-sync that adjusts automatically when you swap the voice setting. You can localize a video into multiple languages by changing the voice and replacing the script text — no re-recording or re-configuring the scene needed.

Can Synthesia create a video from a PowerPoint or PDF?

Not directly — it doesn’t import files and convert them automatically. You can upload slides as images and place them in the media panel alongside your avatar, but you still write the script manually. There’s no automatic script generation from uploaded documents on standard plans.

Bottom line

If you need to produce training videos, product walkthroughs, or internal communications without camera equipment or editing skills, Synthesia is the most reliable path from script to finished MP4 we’ve found — provided you go in with realistic expectations about what the avatars look like and what the Starter plan actually includes.

This article contains affiliate links. If you subscribe through one, we may earn a commission at no extra cost to you. It never changes what we recommend: we only link to tools we actually use.