A practical guide to the archive structure, shared files, and TTS-related code responsibilities
This page explains the structure of the Bishnupriya Manipuri research archive and speech-technology codebase, including shared infrastructure, article pages, toolkit pages, and the logic layers that must remain synchronized.
/che/
    pages
    shared templates
    metadata helpers
    publication helpers
    contributor helpers
    glossary / reference / index systems
    search system
    toolkit and dataset pages
The archive is designed as a shared PHP publication framework rather than a collection of unrelated pages. That means many pages depend on the same shared logic.
Site-wide settings, publication metadata, and shared contributor/reference/index data.
Functions used across the site for metadata, references, search, glossary linking, publication display, and contributor rendering.
Header, footer, templates, print styles, and site-wide CSS.
Article pages, book page, glossary, references, and public-facing archive pages.
IPA toolkit, diphone inventory, recording protocol, validator workflow, safe filename mapping, rebuild checklist, and related operational pages.
Search, book index, glossary, resources, downloads, and archive landing pages.
| File | Role | Importance |
|---|---|---|
| site_config.php | Site-wide configuration | Core |
| meta_helpers.php | SEO/meta/canonical/JSON-LD output | Core |
| publication_config.php | Publication defaults | Core |
| publication_helpers.php | Edition/version/revision display | Core |
| contributors_data.php | Contributor master data | Core |
| contributors_helpers.php | Contributor rendering functions | Core |
| articles_data.php | Master article listing | Core |
| article_helpers.php | Article lookup/navigation helpers | Core |
| references_data.php | Shared bibliography data | Core |
| references_helpers.php | Reference formatting and linking | Core |
| index_terms.php | Master index term list | Core |
| index_helpers.php | Auto-index logic | Core |
| glossary_data.php | Glossary term data | Core |
| glossary_helpers.php | Glossary rendering and auto-linking | Core |
| search_helpers.php | Archive search logic | Core |
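The data/helper pairing in this table can be illustrated with the glossary. The following is only a minimal sketch, assuming glossary_data.php holds a simple term-to-anchor array and that auto-linking means linking the first occurrence of each term. The array shape, the function name `glossary_autolink`, and the `glossary.php#anchor` link format are hypothetical and are not taken from the archive's actual code.

```php
<?php
// Hypothetical glossary data (stand-in for glossary_data.php): term => anchor id.
$glossary = [
    'diphone' => 'diphone',
    'IPA'     => 'ipa',
];

/**
 * Hypothetical auto-linker (the kind of function glossary_helpers.php might hold).
 * Links the first whole-word, case-sensitive occurrence of each term.
 * Assumes $text is plain text; it is HTML-escaped piece by piece.
 */
function glossary_autolink(string $text, array $glossary): string
{
    if ($glossary === []) {
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    $quoted  = array_map(fn($t) => preg_quote($t, '/'), array_keys($glossary));
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/u';

    // With DELIM_CAPTURE, odd-numbered parts are the matched terms.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $seen  = [];
    $out   = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        if ($i % 2 === 1 && !isset($seen[$part])) {
            $seen[$part] = true;
            $out .= '<a href="glossary.php#' . $glossary[$part] . '">' . $safe . '</a>';
        } else {
            $out .= $safe;
        }
    }
    return $out;
}

echo glossary_autolink('A diphone uses IPA; each diphone is recorded.', $glossary), PHP_EOL;
```

Splitting the text once, rather than running one replacement per term, avoids a later term matching inside markup that an earlier replacement inserted.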
| File | Role |
|---|---|
| header.php | Site header, navigation, meta loading, global page start |
| footer.php | Site footer, page close |
| style.css | Main archive styling |
| print.css | Print/PDF formatting |
| article_template.php | Shared article wrapper |
| book_chapter_template.php | Shared chapter/book wrapper |
| chapter_nav.php | Previous/next chapter navigation |
Site-wide layout issues usually trace back to header.php, footer.php, and style.css.
| File | Role |
|---|---|
| index.php | Landing page / home page for the archive |
| about.php | About the language and project |
| book.php | Combined book page |
| references.php | Shared bibliography page |
| glossary.php | Glossary page |
| book_index.php | Auto-generated archive/book index |
| contributors.php | Contributors and credits page |
| updates.php | Project updates / news |
| downloads.php | Download / export center |
| resources.php | Datasets / resources hub |
| search.php | Site-wide search results page |
The article series pages are the main long-form research content:
article1.php article2.php ... article10.php
Each article should ideally do only a few things:
The less page-specific infrastructure repeated inside article files, the more stable the system remains.
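One way to keep neighbour links out of individual article files is to derive them from the shared article list. The following is a minimal sketch, assuming the master list is an ordered array of filenames. The array shape and the function name `chapter_neighbours` are hypothetical, not the archive's actual API.

```php
<?php
// Hypothetical master list, standing in for articles_data.php.
$articles = ['article1.php', 'article2.php', 'article3.php'];

/**
 * Hypothetical helper: derive previous/next links for the current page
 * from the shared list, so no article hard-codes its neighbours.
 */
function chapter_neighbours(array $articles, string $current): array
{
    $pos = array_search($current, $articles, true);
    if ($pos === false) {
        // Page is not in the master list: no navigation can be derived.
        return ['prev' => null, 'next' => null];
    }
    return [
        'prev' => $articles[$pos - 1] ?? null,
        'next' => $articles[$pos + 1] ?? null,
    ];
}

$nav = chapter_neighbours($articles, 'article2.php');
echo 'prev: ', $nav['prev'] ?? '-', ', next: ', $nav['next'] ?? '-', PHP_EOL;
// prints "prev: article1.php, next: article3.php"
```

With this shape, adding or reordering an article means editing the list once; every page's navigation follows from it.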
These pages document the practical working system behind the archive:
| File | Role |
|---|---|
| ipa_toolkit.php | Orthography-to-IPA and pronunciation toolkit |
| diphone_inventory.php | Diphone structure, layers, priorities, and coverage |
| recording_protocol.php | Recording and normalization workflow |
| validator_workflow.php | Validation logic and pass/fail workflow |
| safe_filename_mapping.php | IPA-to-safe-filename rules |
| rebuild_checklist.php | Operational rebuild deployment checklist |
| test_word_list.php | Curated validation words for rebuild testing |
| tts_architecture.php | Full technical TTS pipeline overview |
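To make the idea of IPA-to-safe-filename rules concrete, here is a small sketch. The mapping entries below are placeholders invented for illustration; the real rules are whatever safe_filename_mapping.php documents. The constant `IPA_SAFE_MAP` and the function `ipa_to_safe_filename` are likewise hypothetical names.

```php
<?php
// Hypothetical IPA-to-ASCII table. Placeholder entries only; the
// project's actual rules are documented on safe_filename_mapping.php.
const IPA_SAFE_MAP = [
    'ʃ' => 'sh',
    'ŋ' => 'ng',
    'ɔ' => 'oh',
    'ə' => 'ax',
];

/**
 * Convert an IPA unit label into a filename-safe ASCII token.
 * Throws if any character has no mapping, so gaps in the table
 * surface early instead of producing ambiguous filenames.
 */
function ipa_to_safe_filename(string $ipa): string
{
    // strtr with an array replaces the longest matching keys first.
    $ascii = strtr($ipa, IPA_SAFE_MAP);
    if (preg_match('/^[a-z0-9_-]+$/', $ascii) !== 1) {
        throw new InvalidArgumentException("Unmapped symbol in: $ipa");
    }
    return $ascii;
}

echo ipa_to_safe_filename('ʃ-ə'), PHP_EOL; // prints "sh-ax"
```

Failing loudly on unmapped symbols fits the guidance to freeze naming rules before rebuilding: a dataset cannot be silently generated from an incomplete table.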
articles_data.php ↔ article pages ↔ navigation
glossary_data.php ↔ glossary_helpers.php ↔ glossary auto-linking
references_data.php ↔ references_helpers.php ↔ references page
index_terms.php ↔ index_helpers.php ↔ book index
header.php ↔ navigation links ↔ actual page paths

When something looks correct on one page but wrong on another, the issue is often one of these synchronization points.
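Some of these synchronization points can be checked mechanically. As a rough sketch, the function below compares a list of page paths against the files that actually exist. The function name `missing_pages` is hypothetical, and the example list simply reuses page names from this guide; in practice the list would come from wherever navigation or article paths are defined.

```php
<?php
/**
 * Hypothetical consistency check: return the listed pages that have no
 * matching file in $dir. The list stands in for paths referenced by
 * articles_data.php or by the navigation in header.php.
 */
function missing_pages(array $listedPages, string $dir): array
{
    $missing = [];
    foreach ($listedPages as $page) {
        if (!is_file($dir . DIRECTORY_SEPARATOR . $page)) {
            $missing[] = $page;
        }
    }
    return $missing;
}

// Example run against the directory containing this script.
$listed = ['index.php', 'glossary.php', 'references.php'];
foreach (missing_pages($listed, __DIR__) as $page) {
    echo "Listed but not found: $page", PHP_EOL;
}
```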
Do not duplicate core logic across multiple pages unless absolutely necessary.
Keep one shared source of truth for article listings, references, glossary terms, and index terms.
Freeze naming and pronunciation rules before rebuilding dependent datasets.
When many pages fail at once, inspect shared files before page files.
When one page fails but others work, inspect that page’s local includes and metadata first.
Use smoke-test pages when debugging shared include chains.
Page fails
↓
Check which include is last known good
↓
Test shared helper chain
↓
Inspect data file or helper file
↓
Confirm encoding and syntax
↓
Reload page
A small smoke test can isolate errors much faster than guessing across many files.
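A smoke test of this kind might look like the sketch below, which loads shared files one at a time and stops at the first failure. The function name `smoke_test_includes` is hypothetical and the include order is an assumption. Note also that some PHP fatal errors (for example, redeclaring a function) cannot be caught this way and will still halt the script.

```php
<?php
/**
 * Hypothetical smoke test: load each shared file in order and stop at the
 * first one that is missing or throws, so the last known good include is
 * obvious. Returns the list of files that loaded cleanly.
 */
function smoke_test_includes(array $files, string $dir): array
{
    $loaded = [];
    foreach ($files as $file) {
        $path = $dir . DIRECTORY_SEPARATOR . $file;
        if (!is_readable($path)) {
            echo "MISSING: $file", PHP_EOL;
            break;
        }
        try {
            // Syntax errors surface as ParseError, which is a Throwable.
            require_once $path;
        } catch (Throwable $e) {
            echo "FAILED: $file (", $e->getMessage(), ')', PHP_EOL;
            break;
        }
        echo "OK: $file", PHP_EOL;
        $loaded[] = $file;
    }
    return $loaded;
}

// File names come from this guide; adjust the order to the real include chain.
smoke_test_includes(['meta_helpers.php', 'article_helpers.php'], __DIR__);
```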
meta_helpers.php
article_helpers.php
references_helpers.php
index_helpers.php
glossary_helpers.php
publication_helpers.php
contributors_helpers.php
Not every page needs every helper; each page should load only what it actually uses.
| If the problem is... | Patch here first |
|---|---|
| navigation or top bar | header.php, style.css |
| footer across pages | footer.php |
| article previous/next links | articles_data.php, chapter_nav.php |
| glossary linking | glossary_data.php, glossary_helpers.php |
| reference formatting | references_data.php, references_helpers.php |
| index page results | index_terms.php, index_helpers.php |
| search results | search_helpers.php |
| page titles / canonical / SEO | meta_helpers.php, page metadata arrays |
| publication versioning | publication_config.php, publication_helpers.php |
| contributors / credits | contributors_data.php, contributors_helpers.php |
If one of those layers changes while the others are not updated to match, the system becomes unreliable.
As the project grows, it may be useful to group code more explicitly:
/che/
    /core/
    /pages/
    /data/
    /toolkit/
    /assets/
The current flat structure is still workable, but this future structure can help once the archive becomes larger.
Central technical overview of the speech pipeline.
Dataset and toolkit overview for the whole project.
Operational deployment and rebuild workflow.