Validator Workflow
Detailed explanation of validation logic and coverage checks.
Curated validation words for diphone coverage and TTS rebuild testing
Validation
This page documents the curated validation word list used to test diphone coverage, safe filename mapping, validator logic, and live TTS playback during system rebuilds.
Running the validator on thousands of words is useful, but failures become hard to debug at that scale. A curated list of about 30–50 representative words makes it easier to isolate rule mismatches and missing diphone files.
| # | BPM Word | Purpose |
|---|---|---|
| 1 | দিশা | tests d-i and i-ʃ transitions |
| 2 | কথা | k-ɔ and th-a transitions |
| 3 | কাম | k-a and a-m |
| 4 | মান | m-a and a-n |
| 5 | দিন | d-i and i-n |
| 6 | পান | p-a and a-n |
| 7 | গান | g-a and a-n |
| 8 | ভাল | bh-a and a-l |
| 9 | তর | t-ɔ and ɔ-r |
| 10 | ধর | dh-ɔ and ɔ-r |
| 11 | শর | ʃ onset coverage |
| 12 | সার | s onset coverage |
| 13 | হর | h onset coverage |
| 14 | লাল | l onset and coda |
| 15 | রাম | r onset coverage |
| 16 | নাম | nasal onset |
| 17 | মালা | m-a-l transitions |
| 18 | বন | b-ɔ-n sequence |
| 19 | চাল | c onset coverage |
| 20 | ঝাল | jh onset coverage |
| 21 | জল | j onset coverage |
| 22 | টাল | retroflex onset |
| 23 | ডাল | retroflex voiced onset |
| 24 | তাল | dental onset |
| 25 | দাল | dental voiced onset |
| 26 | কাজ | z/j coda environment |
| 27 | গাছ | cʰ/ʧ coda test |
| 28 | রাত | t final diphone |
| 29 | ঘর | gh onset test |
| 30 | নদী | n-ɔ-d-i chain |
| 31 | শিশু | ʃ-i repetition |
| 32 | পথ | p-ɔ-th sequence |
| 33 | ভাষা | bh-a-ʃ-a |
| 34 | দেশ | d-e-ʃ |
| 35 | কাল | k-a-l |
| 36 | গরু | g-ɔ-r-u |
| 37 | চাকা | c-a-k-a chain |
| 38 | নগর | n-ɔ-g-ɔ-r chain |
| 39 | পদ | p-ɔ-d coda |
| 40 | বনজ | cluster-like transitions |
| 41 | কৃষি | cluster environment |
| 42 | প্রকাশ | cluster onset test |
| 43 | ত্রাণ | complex cluster |
| 44 | গ্রাম | cluster onset |
| 45 | শ্রদ্ধা | learned cluster |
| 46 | স্বাধীন | sv cluster test |
| 47 | জ্ঞাত | learned consonant sequence |
| 48 | অন্তর | nasal cluster environment |
| 49 | সংগীত | ŋ environment |
| 50 | বাংলা | ŋ-g transition |
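
The Purpose column names transitions such as d-i and i-ʃ. The sketch below shows one way a phoneme sequence could expand into such labels. It assumes hyphen-joined labels like those in the table. The function name `to_diphones` and the transcription of দিশা are illustrative and are not taken from the validator itself.

```python
from typing import List


def to_diphones(phonemes: List[str]) -> List[str]:
    """Pair each phoneme with its successor, e.g. d, i -> "d-i"."""
    return [f"{a}-{b}" for a, b in zip(phonemes, phonemes[1:])]


# Assumed transcription of row 1; the real validator derives this from its own rules.
disha = ["d", "i", "ʃ", "a"]
print(to_diphones(disha))  # ['d-i', 'i-ʃ', 'ʃ-a']
```

This sketch covers only word-internal transitions. Whether the inventory also stores word-boundary diphones (silence to first phoneme, last phoneme to silence) is not stated here, so they are left out.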
1. Run the validator.
2. Check the diphone sequence.
3. Check safe filenames.
4. Check diphone WAV files.
5. Measure coverage.
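
The steps above could be wired together roughly as follows. This is a minimal sketch, not the project's validator. The `SAFE_SYMBOLS` table, the `.wav` naming scheme, the `diphones` directory, and the phoneme transcriptions are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical ASCII replacements for symbols that are awkward in filenames.
SAFE_SYMBOLS: Dict[str, str] = {"ʃ": "sh", "ɔ": "aw", "ŋ": "ng", "ʰ": "h"}


def safe_filename(diphone: str) -> str:
    """Map a label such as "i-ʃ" to an ASCII filename such as "i_sh.wav"."""
    ascii_label = "".join(SAFE_SYMBOLS.get(ch, ch) for ch in diphone)
    return ascii_label.replace("-", "_") + ".wav"


def check_word(phonemes: List[str], wav_dir: Path) -> List[str]:
    """Return the diphone labels whose WAV file is missing from wav_dir."""
    diphones = [f"{a}-{b}" for a, b in zip(phonemes, phonemes[1:])]
    return [d for d in diphones if not (wav_dir / safe_filename(d)).is_file()]


def coverage(words: Dict[str, List[str]], wav_dir: Path) -> float:
    """Fraction of distinct required diphones that have a WAV file."""
    required = {f"{a}-{b}" for p in words.values() for a, b in zip(p, p[1:])}
    if not required:
        return 1.0
    found = sum((wav_dir / safe_filename(d)).is_file() for d in required)
    return found / len(required)


if __name__ == "__main__":
    # Assumed transcriptions for two rows of the table.
    words = {"দিশা": ["d", "i", "ʃ", "a"], "দিন": ["d", "i", "n"]}
    wav_dir = Path("diphones")  # hypothetical location of the WAV inventory
    for word, phonemes in words.items():
        print(word, "missing:", check_word(phonemes, wav_dir))
    print(f"coverage: {coverage(words, wav_dir):.0%}")
```

Reporting missing files per word, rather than only an overall percentage, is what makes a short curated list easy to debug.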
If this list passes validation, the core TTS system is usually stable enough for wider testing.
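
One way to turn "passes validation" into a concrete check is to require that no word in the list has a missing diphone. The sketch below assumes the validator can produce a mapping from each word to its missing diphone labels. The function name `summarize` and that input shape are hypothetical.

```python
from typing import Dict, List


def summarize(results: Dict[str, List[str]]) -> bool:
    """Print failing words and return True only if every word passed."""
    failures = {word: missing for word, missing in results.items() if missing}
    for word, missing in failures.items():
        print(f"FAIL {word}: missing {', '.join(missing)}")
    print(f"{len(results) - len(failures)}/{len(results)} words passed")
    return not failures
```

A rebuild script could call `summarize` and stop before wider testing when it returns `False`.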