Silent Speech Recognition
The recognition model performs direct EMG-to-text transcription using CTC loss and a KenLM-backed beam search decoder.
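CTC decoding maps frame-level character predictions to text by collapsing repeated tokens and removing blanks; the project's decoder layers a KenLM language model and lexicon on top of this via beam search. A minimal greedy-collapse sketch of the underlying CTC convention (illustrative only, not the project's beam search decoder):

```python
def ctc_greedy_collapse(frame_tokens: list, blank: str = "-") -> str:
    """Collapse a CTC best path: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for tok in frame_tokens:
        if tok != prev:          # merge consecutive repeats
            if tok != blank:     # drop the blank symbol
                out.append(tok)
        prev = tok
    return "".join(out)

# Best-path frames for "hello"; the blank separates the double "l"
frames = ["h", "h", "e", "-", "l", "l", "-", "l", "o", "o"]
print(ctc_greedy_collapse(frames))  # hello
```

The beam search decoder replaces this greedy best-path rule with a search over candidate paths scored jointly by the acoustic model and the KenLM language model, constrained to words in the lexicon.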
Core API
Main Script
The recognition_model.py script manages the training loop and Word Error Rate (WER) evaluation using character-based CTC.
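WER is the word-level edit distance between hypothesis and reference, normalized by reference length. A minimal sketch of the metric (the script's own evaluation code is not shown here):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat", "the bat sat"))  # one substitution over three words
```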
Lexicon & LM Setup
Utilities for preparing the decoding environment.
get_lexicon
download_kenlm
Downloads pre-trained KenLM 4-gram language model files (lm.bin, lexicon.txt, tokens.txt).
Uses the torchaudio download utility, with a fallback to direct requests downloads on systems with SSL issues.
Parameters:

- output_dir (str) – Directory where files will be saved.

Returns:

- None – Files are saved to the specified output directory.
Source code in get_lexicon.py
```python
import logging
import os

import requests
from torchaudio.models.decoder import download_pretrained_files


def download_kenlm(output_dir: str) -> None:
    """
    Downloads pre-trained KenLM 4-gram language model files
    (lm.bin, lexicon.txt, tokens.txt).

    Uses torchaudio utility with a fallback to direct requests for systems
    with SSL issues.

    Args:
        output_dir: Directory where files will be saved.

    Returns:
        None. Files are saved to the specified output directory.
    """
    os.makedirs(output_dir, exist_ok=True)
    original_cwd = os.getcwd()
    os.chdir(output_dir)
    logging.info(f"Downloading KenLM files to {output_dir}...")
    try:
        download_pretrained_files("librispeech-4-gram")
    except Exception as e:
        logging.warning(f"Torchaudio download failed ({e}), attempting manual download...")
        files = {
            "lm.bin": "https://download.pytorch.org/torchaudio/decoder-assets/librispeech-4-gram/lm.bin",
            "lexicon.txt": "https://download.pytorch.org/torchaudio/decoder-assets/librispeech-4-gram/lexicon.txt",
            "tokens.txt": "https://download.pytorch.org/torchaudio/decoder-assets/librispeech-4-gram/tokens.txt",
        }
        for filename, url in files.items():
            if not os.path.exists(filename):
                logging.info(f"Downloading {filename}...")
                response = requests.get(url, stream=True, timeout=60)
                response.raise_for_status()
                with open(filename, "wb") as f:
                    f.write(response.content)
    finally:
        os.chdir(original_cwd)
```
get_lexicon
Generates a lexicon file where each word is mapped to its character sequence.
Format: word c h a r s | (the trailing "|" is the word-boundary token).
Parameters:

- vocab (set) – Set of unique words.
- output_file (str) – Path to save the lexicon.

Returns:

- None – Writes the lexicon to the specified output file.
Source code in get_lexicon.py
```python
def get_lexicon(vocab: set, output_file: str) -> None:
    """
    Generates a lexicon file where each word is mapped to its character sequence.

    Format: word c h a r s |

    Args:
        vocab: Set of unique words.
        output_file: Path to save the lexicon.

    Returns:
        None. Writes the lexicon to the specified output file.
    """
    with open(output_file, "w", encoding="utf-8") as fout:
        for word in sorted(vocab):
            # split word into char tokens
            chars = list(word)
            fout.write(f"{word} " + " ".join(chars) + " |\n")
```
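For example, the vocabulary {"hello", "cat"} yields a two-line lexicon. The self-contained sketch below reproduces the same write loop against an in-memory buffer (write_lexicon is a stand-in name, not part of the project API):

```python
import io

def write_lexicon(vocab: set, fout) -> None:
    # Same format as get_lexicon: the word, its characters, then the "|" boundary
    for word in sorted(vocab):
        fout.write(f"{word} " + " ".join(word) + " |\n")

buf = io.StringIO()
write_lexicon({"hello", "cat"}, buf)
print(buf.getvalue())
# cat c a t |
# hello h e l l o |
```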
get_unigram
Extracts all unique words (unigrams) from the dataset's text samples.
Parameters:

- dataset (H5EmgDataset) – The H5EmgDataset to process.

Returns:

- set – A set of unique lowercase words.
Source code in get_lexicon.py
```python
def get_unigram(dataset: H5EmgDataset) -> set:
    """
    Extracts all unique words (unigrams) from the dataset's text samples.

    Args:
        dataset: The H5EmgDataset to process.

    Returns:
        A set of unique lowercase words.
    """
    unigram = set()
    for example in tqdm.tqdm(dataset, "Building unigram", leave=False):
        clean = transform.clean_text(example["text"])
        for w in clean.split():
            unigram.add(w)
    return unigram
```
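End to end, get_unigram feeds get_lexicon: raw transcripts are cleaned, split into unique words, and then written out as a character lexicon. The sketch below uses a simplified stand-in for transform.clean_text (lowercasing plus punctuation stripping; the project's actual cleaning rules may differ):

```python
import string

def clean_text_stub(text: str) -> str:
    """Hypothetical stand-in for transform.clean_text."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def build_unigram(texts: list) -> set:
    # Mirrors get_unigram, but over plain strings instead of an H5EmgDataset
    unigram = set()
    for text in texts:
        unigram.update(clean_text_stub(text).split())
    return unigram

print(sorted(build_unigram(["Hello, world!", "Hello again."])))
# ['again', 'hello', 'world']
```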