IPA-safe conversion rules for diphone filenames, validator logic, and rebuild stability
This page documents the filename-safe mapping layer that connects IPA output to diphone audio files. It is one of the most important practical bridges between pronunciation logic, validator checks, and working TTS playback.
A TTS system needs more than correct IPA. It also needs filenames that are readable, portable across environments, and stable across rebuilds.
IPA symbols such as ʃ, ŋ, ə, or boundary markers like # are linguistically useful, but not ideal as raw filenames.
The safe mapping layer converts them into stable equivalents.
IPA diphone → Safe symbol mapping → Filename-safe diphone → WAV file lookup
IPA diphone: ʃ-aː
Safe form: sh-aa
Filename: sh-aa.wav
Word boundaries are crucial in diphone-based synthesis.
These are usually represented in phonological notation as #.
In safe filenames, a readable replacement should be used.
| IPA / Symbol | Safe Form | Example |
|---|---|---|
| # | sil | #-d → sil-d.wav |
| # | sil | a-# → a-sil.wav |
Using sil is practical because it is readable and clearly signals
word-initial or word-final silence/boundary behavior.
| IPA Symbol | Safe Form | Reason |
|---|---|---|
| # | sil | Boundary marker |
| aː | aa | Long vowel made ASCII-safe |
| iː | ii | Long vowel made ASCII-safe |
| uː | uu | Long vowel made ASCII-safe |
| eː | ee | Long vowel made ASCII-safe |
| oː | oo | Long vowel made ASCII-safe |
| ʃ | sh | Readable fricative mapping |
| ŋ | ng | Readable nasal mapping |
| ɔ | aw | Readable vowel mapping |
| ə | schwa | Explicit reduced vowel name |
| ɽ | rr | Avoid raw IPA in filename |
| j | y | Readable glide mapping if needed |
| IPA Diphone | Safe Form | Filename |
|---|---|---|
| #-d | sil-d | sil-d.wav |
| d-i | d-i | d-i.wav |
| i-ʃ | i-sh | i-sh.wav |
| ʃ-a | sh-a | sh-a.wav |
| a-# | a-sil | a-sil.wav |
| k-ɔ | k-aw | k-aw.wav |
| ʃ-aː | sh-aa | sh-aa.wav |
| aː-# | aa-sil | aa-sil.wav |
| a-ŋ | a-ng | a-ng.wav |
| k-ə | k-schwa | k-schwa.wav |
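The two tables above can be expressed as a small lookup function. The following is a minimal sketch, not the project's actual implementation: the names `SAFE_SYMBOLS`, `to_safe_symbol`, and `to_filename` are hypothetical, and rejecting unmapped non-ASCII symbols is an assumption about how strict the mapping should be.

```python
# Symbol table copied from the mapping table above.
SAFE_SYMBOLS = {
    "#": "sil",
    "aː": "aa",
    "iː": "ii",
    "uː": "uu",
    "eː": "ee",
    "oː": "oo",
    "ʃ": "sh",
    "ŋ": "ng",
    "ɔ": "aw",
    "ə": "schwa",
    "ɽ": "rr",
    "j": "y",
}


def to_safe_symbol(symbol: str) -> str:
    """Map one IPA symbol to its safe form; plain ASCII symbols pass through."""
    safe = SAFE_SYMBOLS.get(symbol, symbol)
    if not safe.isascii():
        # Assumption: an unmapped non-ASCII symbol is an error,
        # not something to pass silently into a filename.
        raise ValueError(f"no safe mapping for IPA symbol {symbol!r}")
    return safe


def to_filename(ipa_diphone: str) -> str:
    """Convert an IPA diphone such as 'ʃ-aː' into 'sh-aa.wav'."""
    left, right = ipa_diphone.split("-")
    return f"{to_safe_symbol(left)}-{to_safe_symbol(right)}.wav"


for diphone in ["#-d", "ʃ-aː", "k-ə", "aː-#"]:
    print(diphone, "->", to_filename(diphone))
```

Failing loudly on an unmapped symbol is one possible design choice; it surfaces gaps in the table at generation time rather than as missing audio during playback.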
The safest workflow is to treat the mapping as a one-way transformation:
IPA → safe filename form
The TTS engine, validator, batch tools, and deployment scripts should all use the same mapping logic. Do not maintain slightly different versions in different pages.
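One way to keep every tool on the same mapping is to hold the table in a single place and export it as data. The sketch below assumes a JSON export that other components could load; the function names and the idea of a JSON file are illustrative assumptions, not something this page prescribes.

```python
import json

# Excerpt of the mapping table; the full table would live here once.
SAFE_SYMBOLS = {"#": "sil", "aː": "aa", "ʃ": "sh", "ŋ": "ng", "ə": "schwa"}


def export_mapping(mapping: dict) -> str:
    """Serialize the mapping so that other tools can load the same table."""
    # ensure_ascii=False keeps the IPA keys readable in the exported text;
    # sort_keys=True keeps the output stable between exports.
    return json.dumps(mapping, ensure_ascii=False, sort_keys=True, indent=2)


def load_mapping(text: str) -> dict:
    """Load a mapping previously produced by export_mapping."""
    return json.loads(text)


exported = export_mapping(SAFE_SYMBOLS)
print(exported)
# Round trip: the loaded table is identical to the source table.
print(load_mapping(exported) == SAFE_SYMBOLS)
```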
Long vowels should not be stored using IPA length marks in filenames. A doubled-letter form is safer and easier to read.
aː → aa
iː → ii
uː → uu
eː → ee
oː → oo
This makes filenames more portable and avoids encoding issues in some environments.
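The doubling rule can also be written as a single substitution. This is a sketch under the assumption that length is always written with the IPA length mark (U+02D0) directly after one of the five plain vowel letters; the helper name `double_long_vowels` is hypothetical.

```python
import re

LENGTH_MARK = "\u02d0"  # IPA length mark, which is not the ASCII colon

# Matches a plain vowel letter followed by the length mark.
LONG_VOWEL = re.compile(f"([aeiou]){LENGTH_MARK}")


def double_long_vowels(text: str) -> str:
    """Replace 'vowel + length mark' with the doubled vowel letter."""
    return LONG_VOWEL.sub(r"\1\1", text)


for vowel in "aiueo":
    long_form = vowel + LENGTH_MARK
    print(long_form, "->", double_long_vowels(long_form))
```

Short vowels and consonants are left untouched, so the function can be applied to a whole diphone string before the remaining symbols are mapped.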
Word → IPA → Phoneme sequence → Diphone sequence → Safe mapping → Expected filenames → Validator / playback
This means safe filename generation should happen after diphone construction, not during early grapheme or IPA processing.
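To illustrate that ordering, the sketch below builds the diphone sequence first and applies the safe mapping only as the last step. The phoneme sequence d, i, ʃ, a is an assumption chosen because it reproduces the first five rows of the diphone table; the function names and the two-entry `SAFE` excerpt are hypothetical.

```python
# Excerpt of the mapping table, enough for this example.
SAFE = {"#": "sil", "ʃ": "sh"}


def build_diphones(phonemes, boundary="#"):
    """Pad the phoneme sequence with boundaries and pair adjacent symbols."""
    padded = [boundary, *phonemes, boundary]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


def expected_filenames(phonemes):
    """Apply the safe mapping only after the diphones have been built."""
    names = []
    for diphone in build_diphones(phonemes):
        left, right = diphone.split("-")
        names.append(f"{SAFE.get(left, left)}-{SAFE.get(right, right)}.wav")
    return names


phonemes = ["d", "i", "ʃ", "a"]
print(build_diphones(phonemes))
# ['#-d', 'd-i', 'i-ʃ', 'ʃ-a', 'a-#']
print(expected_filenames(phonemes))
# ['sil-d.wav', 'd-i.wav', 'i-sh.wav', 'sh-a.wav', 'a-sil.wav']
```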
When the mapping is applied inconsistently, the typical failures are:
- The validator expects one filename but the audio folder contains another.
- JavaScript tries to load a file that does not exist because the mapping changed.
- Old files remain from a previous naming system and create confusion during rebuilds.
- One page uses sh while another uses raw ʃ or a different alias.
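A mismatch between expected and actual filenames, and leftover files from an older naming system, can both be detected with a set comparison. The following is a minimal sketch: `check_folder` is a hypothetical helper, and `i-sch.wav` stands in for a file left over from an invented older alias.

```python
import tempfile
from pathlib import Path


def check_folder(expected: set, folder: Path) -> dict:
    """Compare expected filenames against the WAV files actually present."""
    actual = {path.name for path in folder.glob("*.wav")}
    return {
        "missing": sorted(expected - actual),  # expected but not on disk
        "stale": sorted(actual - expected),    # on disk but no longer expected
    }


# Demonstration in a temporary folder so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["sil-d.wav", "d-i.wav", "i-sch.wav"]:
        (folder / name).touch()
    report = check_folder({"sil-d.wav", "d-i.wav", "i-sh.wav"}, folder)

print(report)
# {'missing': ['i-sh.wav'], 'stale': ['i-sch.wav']}
```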
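Before rebuilding a diphone folder, the mapping rules should be frozen so that every generated filename comes from one agreed system. A clean rebuild then follows this sequence:
Freeze rules → Back up old folder → Create fresh diphone folder → Generate / copy only current-system files → Run validator → Deploy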
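The backup, fresh-folder, and copy steps of the sequence above could be scripted roughly as follows. This is a sketch under stated assumptions: the folder layout, the `rebuild` function, and the idea of a separate source folder holding the current recordings are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def rebuild(source: Path, target: Path, backup: Path, expected: set) -> list:
    """Back up the old folder, recreate it empty, and copy only expected files.

    Returns the expected filenames that were not found in the source folder.
    """
    if backup.exists():
        # Assumption: never overwrite an earlier backup.
        raise FileExistsError(f"backup folder already exists: {backup}")
    if target.exists():
        shutil.move(str(target), str(backup))
    target.mkdir(parents=True)
    missing = []
    for name in sorted(expected):
        candidate = source / name
        if candidate.is_file():
            shutil.copy2(candidate, target / name)
        else:
            missing.append(name)
    return missing


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source, target, backup = root / "recordings", root / "diphones", root / "backup"
    source.mkdir()
    target.mkdir()
    (source / "sil-d.wav").touch()
    (source / "legacy-name.wav").touch()  # not in the current system, never copied
    (target / "old-name.wav").touch()     # leftover from a previous build
    missing = rebuild(source, target, backup, {"sil-d.wav", "d-i.wav"})
    rebuilt = sorted(path.name for path in target.iterdir())
    backed_up = sorted(path.name for path in backup.iterdir())

print(missing, rebuilt, backed_up)
```

Because the target folder is recreated empty, files from an earlier naming system cannot survive into the new build; they remain available in the backup.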
A validator or spreadsheet tracker should include explicit safe filename columns.
| Field | Purpose |
|---|---|
| IPA diphone | Linguistic form |
| Safe diphone form | Mapped deployment form |
| Filename | Actual expected WAV filename |
| Exists? | Yes/No validation |
| Status | Pass / Missing / Mismatch |
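A tracker with these columns can be generated rather than filled in by hand. In the sketch below, the meaning of "Mismatch" is an assumption: it is used when the safe filename is absent but a file under the raw IPA name is present. The function name and the sample data are hypothetical.

```python
import csv
import io

FIELDS = ["IPA diphone", "Safe diphone form", "Filename", "Exists?", "Status"]


def tracker_row(ipa: str, safe: str, present: set) -> dict:
    """Build one tracker row for a diphone, given the filenames on disk."""
    filename = f"{safe}.wav"
    exists = filename in present
    if exists:
        status = "Pass"
    elif f"{ipa}.wav" in present:
        # Assumption: a file stored under the raw IPA name counts as a mismatch.
        status = "Mismatch"
    else:
        status = "Missing"
    values = [ipa, safe, filename, "Yes" if exists else "No", status]
    return dict(zip(FIELDS, values))


present = {"sil-d.wav", "ʃ-a.wav"}  # sample folder contents
pairs = [("#-d", "sil-d"), ("ʃ-a", "sh-a"), ("a-#", "a-sil")]
rows = [tracker_row(ipa, safe, present) for ipa, safe in pairs]

# Write the rows as CSV so they can be opened in a spreadsheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```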