MetaLang is a multilingual, ontology-based framework for linguistic metalanguage — the vocabulary used to describe language itself. It serves as a canonical pivot layer between academic, pedagogical, and computational grammar systems (e.g., Universal Dependencies, EAGLES, national school grammars).
Linguistic terminology is fragmented. The same concept (e.g., an "article") might be labeled ART, DET, lidwoord, or άρθρο depending on the tradition, standard, or language. MetaLang resolves this by mapping these disparate labels to stable, globally unique identifiers (GUIDs).
MetaLang provides:
- Canonical Ontology: A stable hierarchy of linguistic concepts across domains (POS, Morphology, Syntax, etc.).
- Multilingual Labels: Localized terms and abbreviations for end-users and software (EN, NL, EL, DE, FR, PT, ES, IT, RU).
- Plugin Architecture: A system for mapping any external tagset (e.g., UD, CELEX, PTB) to the MetaLang core.
MetaLang is positioned within the emerging subdiscipline of Language Resource Infrastructure and Standardization. It addresses the following "backbone infrastructure" problems:
- Linguistic Data Infrastructure (LDI): Organizing and disclosing heterogeneous linguistic data.
- Interoperability: Bridging the gap between legacy institutional data (e.g., Greek-specific INTERA) and modern standards like Universal Dependencies (UD) and CLDF.
- Lexicography & Morphology: Providing a machine-readable path for historical lexicographic terms to modern NLP pipelines.
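The interoperability point above can be pictured as a pivot lookup: a tag from one tagset resolves to canonical concept IDs, which are then re-expressed in another tagset. The following is a minimal TypeScript sketch under stated assumptions, not MetaLang's actual implementation. The TagsetPlugin shape, the helper names, the "ud" system ID, and the sample mappings are hypothetical; only the concept IDs ML_POS_NOUN and ML_POS_ARTICLE, the system ID nl-generic, and the Dutch abbreviations come from this README.

```typescript
// Hypothetical shapes for illustration only; not the actual @metalang/schema types.
type ConceptId = string;

interface TagsetPlugin {
  systemId: string;
  // Each external tag maps to one or more canonical concept IDs.
  mappings: Record<string, ConceptId[]>;
}

// Sample Dutch school-grammar abbreviations (assumed mapping data).
const nlGeneric: TagsetPlugin = {
  systemId: "nl-generic",
  mappings: {
    "znw.": ["ML_POS_NOUN"],
    "lw.": ["ML_POS_ARTICLE"],
  },
};

// Assumed system ID "ud"; the tag names follow the UD universal POS tagset.
const ud: TagsetPlugin = {
  systemId: "ud",
  mappings: {
    NOUN: ["ML_POS_NOUN"],
    DET: ["ML_POS_ARTICLE"],
  },
};

// Step 1: external tag -> canonical concept IDs (empty if the tag is unknown).
function resolveTag(tag: string, source: TagsetPlugin): ConceptId[] {
  return source.mappings[tag] ?? [];
}

// Step 2: canonical concept IDs -> every tag in the target system that maps to one of them.
function tagsForConcepts(ids: ConceptId[], target: TagsetPlugin): string[] {
  const wanted = new Set(ids);
  return Object.entries(target.mappings)
    .filter(([, conceptIds]) => conceptIds.some((id) => wanted.has(id)))
    .map(([tag]) => tag);
}

// The pivot: the source and target tagsets never reference each other directly.
function translateViaPivot(tag: string, source: TagsetPlugin, target: TagsetPlugin): string[] {
  return tagsForConcepts(resolveTag(tag, source), target);
}

const articleInUd = translateViaPivot("lw.", nlGeneric, ud); // ["DET"]
const unknownInUd = translateViaPivot("xyz", nlGeneric, ud); // []
console.log(articleInUd, unknownInUd);
```

Because each tagset only maps to the canonical layer, adding a new tagset in this sketch requires one mapping table rather than one table per pair of tagsets.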
MetaLang complements and integrates with:
- UD: Standardized syntactic annotation.
- BabelNet: The world's largest multilingual encyclopedic dictionary and semantic network.
- LexInfo: An ontology for linguistic annotations in the LLOD (Linguistic Linked Open Data) cloud.
Recent additions:
- New Plugins: @metalang/plugin-babelnet (64 tags) and @metalang/plugin-lexinfo (119 tags).
- Ontology Alignment: Expanded PoS taxonomy with granular concepts for pronouns, particles, and specialized verb types to match LexInfo's high-fidelity classification.
- Schema: Added retrievedAt to BibliographicSource to support web-based resource citations.
This is a monorepo containing the following components:
- packages/schema: Core TypeScript interfaces and JSON schemas.
- packages/core: The central ontology engine and registry.
- packages/plugin-ud: Universal Dependencies (UD) tag mapping provider.
- packages/plugin-BabelNet: BabelNet Universal POS tag mapping provider.
- packages/plugin-LexInfo: LexInfo PartOfSpeech tag mapping provider.
- docs/: Comprehensive specifications and concept notes.
- ontology/: The single source of truth for definitions, describing the entire "world" of MetaLang: the domains, the concepts, and their hierarchical relationships.
- Concept Note: Philosophical and architectural introduction.
- Core Specification: Functional and technical requirements for the engine.
- GUI Specification: Requirements for the MetaLang authoring and governance tool.
pnpm install
MetaLang provides a programmatic interface for resolving, translating, and exploring linguistic metalanguage data.
Quickly find concepts or resolve tags from specific systems.
import { defaultRegistry as registry } from '@metalang/core';
// Search across all plugins for any tag or term
const results = registry.search("znw");
// [ { systemId: "nl-generic", tag: "znw.", conceptId: "ML_POS_NOUN", matchType: "partial" }, ... ]
// Resolve a tag in a specific context
const concepts = registry.resolve("v", "nl-taalunie");
// Returns full Concept objects for ML_MORPH-VALUE_GENDER_FEMININE
Map terminology directly from one tradition to another.
// Translate a Dutch school grammar tag to its English pedagogical equivalent
const enTags = registry.translateTag("znw", "nl-generic", "en-generic");
// Returns: ["noun", "n.", "noun phrase"]
// Get all tags for a concept in a specific system
const elTags = registry.translateConcept("ML_POS_NOUN", "el-generic");
// Returns: ["ουσιαστικό", "ουσ.", ...]
Retrieve localized singular, plural, and abbreviated forms with a robust fallback chain.
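To make the idea of a fallback chain concrete, here is a small self-contained sketch. It is not MetaLang's implementation: the data shapes, the getFormsWithFallback helper, the sample forms, and the specific chain order (exact system, then the language's generic system, then en-generic) are assumptions chosen for illustration.

```typescript
// Hypothetical data shapes; the real registry may store forms differently.
interface Forms {
  singular: string;
  plural?: string;
  abbreviations: string[];
}

interface ResolvedForms extends Forms {
  sourceSystemId: string; // the system that actually supplied the forms
}

// conceptId -> systemId -> forms (sample data only).
const formsTable: Record<string, Record<string, Forms>> = {
  ML_POS_ARTICLE: {
    "nl-generic": { singular: "lidwoord", plural: "lidwoorden", abbreviations: ["lw."] },
    "en-generic": { singular: "article", plural: "articles", abbreviations: ["art."] },
  },
};

// Assumed chain: exact system, then "<language>-generic", then "en-generic".
function fallbackChain(systemId: string): string[] {
  const language = systemId.split("-")[0];
  const chain = [systemId, `${language}-generic`, "en-generic"];
  return Array.from(new Set(chain)); // drop repeats, keep first-seen order
}

// Return the first forms entry found along the chain, recording where it came from.
function getFormsWithFallback(conceptId: string, systemId: string): ResolvedForms | undefined {
  const bySystem = formsTable[conceptId] ?? {};
  for (const candidate of fallbackChain(systemId)) {
    const found = bySystem[candidate];
    if (found) return { ...found, sourceSystemId: candidate };
  }
  return undefined; // nothing along the chain defines this concept
}

const resolved = getFormsWithFallback("ML_POS_ARTICLE", "nl-taalunie");
console.log(resolved?.singular, resolved?.sourceSystemId);
```

In this sketch, nl-taalunie has no entry of its own, so the lookup falls through to nl-generic and reports that system as the source. The usage example that follows shows the same behaviour through registry.getForms.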
// Get forms for 'article' in a specific system, with automatic fallbacks
const forms = registry.getForms("ML_POS_ARTICLE", "nl-taalunie");
console.log(forms.singular); // "lidwoord"
console.log(forms.abbreviations); // ["lw."]
console.log(forms.sourceSystemId); // "nl-generic" (resolves via language fallback)
Traverse the concept hierarchy and link to global knowledge bases.
// Navigate the ontology
const children = registry.getChildren("ML_POS_NOUN");
// [Concept(ML_POS_NOUN-COMMON), Concept(ML_POS_PROPER-NOUN), ...]
// External Links
const wikidata = registry.getWikiDataId("ML_POS_NOUN"); // "Q1401131"
const wikiUrl = registry.getWikipediaUrl("ML_POS_NOUN", "nl");
// "https://nl.wikipedia.org/wiki/zelfstandig_naamwoord"
Turn a raw array of database strings directly into a multilingual, nested taxonomy tree ready for UI rendering.
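As a rough picture of what such a transformation involves, the sketch below resolves flat labels to concept IDs, drops duplicates, collects unrecognized strings, and nests the result under domain roots. It is an illustration under assumptions, not the registry's code: buildTree, the parentOf and labelIndex tables, and every ID other than ML_POS_NOUN are made up here, and the sketch handles only single-parent concepts and omits multilingual labels.

```typescript
// Hypothetical hierarchy: concept ID -> parent ID (null marks a domain root).
const parentOf = new Map<string, string | null>([
  ["ML_POS", null],
  ["ML_SYNTAX", null],
  ["ML_POS_NOUN", "ML_POS"],
  ["ML_POS_ADJECTIVE", "ML_POS"],
  ["ML_SYNTAX_SENTENCE", "ML_SYNTAX"],
]);

// Hypothetical label index: raw database label -> concept ID.
const labelIndex = new Map<string, string>([
  ["bnw.", "ML_POS_ADJECTIVE"],
  ["znw.", "ML_POS_NOUN"],
  ["zin", "ML_SYNTAX_SENTENCE"],
]);

interface TreeNode {
  id: string;
  children: TreeNode[];
}

function buildTree(rawLabels: string[]): { nodes: TreeNode[]; unmapped: string[] } {
  const unmapped: string[] = [];
  const included = new Set<string>(); // a Set drops duplicate concepts

  for (const label of rawLabels) {
    // An entry may already be a concept ID, or a label that needs resolving.
    const id = parentOf.has(label) ? label : labelIndex.get(label);
    if (id === undefined) {
      unmapped.push(label);
      continue;
    }
    // Include the concept and all of its ancestors so the tree stays connected.
    let current: string | null = id;
    while (current !== null) {
      included.add(current);
      current = parentOf.get(current) ?? null;
    }
  }

  // Create one node per included concept, then attach each node to its parent.
  const nodeById = new Map<string, TreeNode>();
  included.forEach((id) => nodeById.set(id, { id, children: [] }));

  const nodes: TreeNode[] = [];
  nodeById.forEach((node, id) => {
    const parent = parentOf.get(id);
    if (parent == null) nodes.push(node); // domain root
    else nodeById.get(parent)!.children.push(node);
  });

  return { nodes, unmapped };
}

const result = buildTree(["bnw.", "zin", "ML_POS_NOUN", "bnw.", "???"]);
console.log(JSON.stringify(result, null, 2));
```

With the sample input, the repeated "bnw." contributes a single node and "???" ends up in unmapped. The processDataset example that follows shows the corresponding registry call.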
// Pass a flat array of database labels (mixed with MetaLang IDs)
const rawLabels = ["bnw.", "zin", "ML_POS_NOUN"];
// MetaLang automatically resolves variants, drops duplicates, constructs the domain taxonomy,
// and maps any known multi-parent relationships down to a fully structured nested object.
const dataset = registry.processDataset(rawLabels, "nl", {
  format: "tree",
  languages: ["nl", "en", "el"],
  includeAbbreviations: true
});
console.log(dataset.nodes); // Array of Domain roots containing nested cleanly-mapped children
console.log(dataset.unmapped); // Array of strings that could not be recognized
Run the comprehensive API stress test:
npx tsx scripts/verify_api.ts
If you use MetaLang in your research or project, please cite it as follows:
@software{Soudan_MetaLang_2026,
  author = {Soudan, Wouter},
  affiliation = {Independent Scholar; PhD from KU Leuven; former postdoc in Computational Linguistics, Universiteit Antwerpen, Belgium},
  license = {ISC},
  month = {2},
  title = {{MetaLang: Cross-Linguistic Terminology Alignment Layer}},
  url = {https://github.com/rhythmus/MetaLang},
  version = {1.0.0},
  year = {2026}
}
Alternatively, you can use the CITATION.cff file in this repository for other formats.