Sample Test Word List

Curated validation words for diphone coverage and TTS rebuild testing

This page documents the curated validation word list used to test diphone coverage, safe filename mapping, validator logic, and live TTS playback during system rebuilds.

Purpose of this list. A small but carefully selected set of words can reveal most system-level problems in a diphone-based speech synthesizer. These words test boundary diphones, common consonant–vowel transitions, clusters, and rare phonetic environments.

1. Why a Test Word List Is Important

Running the validator on thousands of words is useful, but the resulting volume of output makes debugging difficult. A curated list of about 30–50 representative words makes it easier to isolate rule mismatches and missing diphone files.

Good validation words should test:
  • word-initial diphones
  • word-final diphones
  • common CV (consonant–vowel) transitions
  • VC (vowel–consonant) transitions
  • nasal environments
  • fricatives and affricates
  • clusters and learned forms
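
As an illustration of why word-initial and word-final diphones appear in this checklist, the following minimal sketch expands a phoneme sequence into diphones. The function name `to_diphones`, the `#` boundary symbol, and the `left-right` naming are assumptions made for this example, not the validator's actual conventions.

```python
def to_diphones(phonemes, boundary="#"):
    """Return the diphones for one word, including boundary diphones.

    `boundary` is a hypothetical silence marker placed before the first
    phoneme and after the last one.
    """
    padded = [boundary, *phonemes, boundary]
    # Pair every unit with its right-hand neighbour.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# দিন, assuming the phoneme sequence d, i, n
print(to_diphones(["d", "i", "n"]))  # ['#-d', 'd-i', 'i-n', 'n-#']
```

A three-phoneme word yields four diphones, two of which exist only at word edges. This is why short words with varied onsets and codas give good boundary coverage.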

2. Core Validation Word List

#    BPM Word    Purpose
1    দিশা    tests d-i and i-ʃ transitions
2    কথা    k-ɔ and th-a transitions
3    কাম    k-a and a-m
4    মান    m-a and a-n
5    দিন    d-i and i-n
6    পান    p-a and a-n
7    গান    g-a and a-n
8    ভাল    bh-a and a-l
9    তর    t-ɔ and ɔ-r
10   ধর    dh-ɔ and ɔ-r
11   শর    ʃ onset coverage
12   সার    s onset coverage
13   হর    h onset coverage
14   লাল    l onset and coda
15   রাম    r onset coverage
16   নাম    nasal onset
17   মালা    m-a-l transitions
18   বন    b-ɔ-n sequence
19   চাল    c onset coverage
20   ঝাল    jh onset coverage
21   জল    j onset coverage
22   টাল    retroflex onset
23   ডাল    retroflex voiced onset
24   তাল    dental onset
25   দাল    dental voiced onset
26   কাজ    z/j coda environment
27   গাছ    cʰ/ʧ coda test
28   রাত    t final diphone
29   ঘর    gh onset test
30   নদী    n-ɔ-d-i chain
31   শিশু    ʃ-i repetition
32   পথ    p-ɔ-th sequence
33   ভাষা    bh-a-ʃ-a
34   দেশ    d-e-ʃ
35   কাল    k-a-l
36   গরু    g-ɔ-r-u
37   চাকা    c-a-k-a chain
38   নগর    n-ɔ-g-ɔ-r chain
39   পদ    p-ɔ-d coda
40   বনজ    cluster-like transitions
41   কৃষি    cluster environment
42   প্রকাশ    cluster onset test
43   ত্রাণ    complex cluster
44   গ্রাম    cluster onset
45   শ্রদ্ধা    learned cluster
46   স্বাধীন    sv cluster test
47   জ্ঞাত    learned consonant sequence
48   অন্তর    nasal cluster environment
49   সংগীত    ŋ environment
50   বাংলা    ŋ-g transition

3. How This List Is Used

Run validator
   ↓
Check diphone sequence
   ↓
Check safe filenames
   ↓
Check diphone WAV files
   ↓
Measure coverage

If this list passes validation, the core TTS system is usually stable enough for wider testing.
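
The filename, WAV, and coverage checks in the flow above could look roughly like the sketch below. Everything named here is hypothetical: the `SAFE` replacement table, the `left_right.wav` naming scheme, and the functions `safe_filename` and `check_coverage` are assumptions for illustration, not the project's real mapping.

```python
from pathlib import Path

# Hypothetical ASCII replacements for symbols that are awkward in filenames.
SAFE = {"ʃ": "sh", "ɔ": "aw", "ŋ": "ng", "#": "sil"}


def safe_filename(diphone):
    """Map a diphone such as 'i-ʃ' to an ASCII filename such as 'i_sh.wav'."""
    left, right = diphone.split("-")
    return f"{SAFE.get(left, left)}_{SAFE.get(right, right)}.wav"


def check_coverage(diphones, wav_dir):
    """Return (coverage ratio, missing diphones) for a list of diphones.

    A diphone counts as covered when its WAV file exists in `wav_dir`.
    Duplicates are counted once.
    """
    wav_dir = Path(wav_dir)
    unique = sorted(set(diphones))
    missing = [d for d in unique if not (wav_dir / safe_filename(d)).is_file()]
    coverage = (len(unique) - len(missing)) / len(unique) if unique else 1.0
    return coverage, missing


print(safe_filename("i-ʃ"))  # i_sh.wav
```

Lowercase-only replacements are used so that two distinct diphones cannot collide on a case-insensitive filesystem.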

4. Recommended Workflow

  • run validator on this list
  • note missing diphones
  • rebuild or segment missing audio
  • rerun validator
  • confirm stable coverage
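
For the "note missing diphones" step, it can help to record which test words depend on each missing diphone, so that rebuilt audio can be rechecked against exactly those words. The sketch below assumes the diphone sequences and the set of available diphones have already been computed elsewhere. The function name `missing_report` and the sample data are hypothetical.

```python
from collections import defaultdict


def missing_report(word_diphones, available):
    """Map each missing diphone to the words that need it.

    word_diphones: dict of word -> list of diphone names
    available:     set of diphone names that have audio
    """
    report = defaultdict(list)
    for word, diphones in word_diphones.items():
        for diphone in diphones:
            if diphone not in available and word not in report[diphone]:
                report[diphone].append(word)
    return dict(report)


# Illustrative data only; the diphone names are assumed, not validator output.
words = {
    "দিন": ["#-d", "d-i", "i-n", "n-#"],
    "পান": ["#-p", "p-a", "a-n", "n-#"],
}
have = {"#-d", "d-i", "i-n", "#-p", "a-n"}
print(missing_report(words, have))
# {'n-#': ['দিন', 'পান'], 'p-a': ['পান']}
```

An empty report after a rerun corresponds to the "confirm stable coverage" step.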

Test list note. This list should evolve as the dictionary grows and new phonetic environments are discovered. It should remain small enough to run quickly but broad enough to detect rule mismatches during rebuilds.