Translation Model

This project implements a simple natural language translation system using pre-trained word embeddings. It supports basic English-to-French and English-to-Spanish word and sentence translation by leveraging cosine similarity between word vectors.

Project Structure

ics_project/

Header files

├── include/
│   ├── common.h
│   ├── eval.h
│   ├── globals.h
│   ├── io.h
│   ├── translate.h
│   └── vector.h

Source files

├── src/
│   ├── *_eval.c
│   ├── *_globals.c
│   ├── *_io.c
│   ├── *_main.c
│   ├── *_translate.c
│   └── *_vector.c

Object files

├── obj/
├── makefile      # Build instructions
├── main          # Compiled binary (after build)
├── .gitignore
└── LICENSE

Build Instructions

If the data is not already downloaded, download it from here.

Unzip it and add it to the /data folder.

Build the project using:

make

If this doesn't work, compile directly with:

gcc -Iinclude src/*.c -o main -lm

Running Instructions

./main

Features included in the project:

Load and process word embeddings from files
Translate individual words using top-k cosine similarity
Translate entire sentences word-by-word
Evaluate similarity and semantic closeness of translations
Modular code architecture with clear separation of concerns

Sources

This project uses FastText word embeddings developed by Facebook AI Research (FAIR). Unlike Word2Vec, FastText represents each word as a bag of character n-grams, which lets it build vectors for out-of-vocabulary words from subword information. We used pre-trained aligned word vectors trained on Wikipedia and Common Crawl, which place words from different languages (e.g., English, French, Spanish) in the same vector space. Because the spaces are aligned, a source word can be translated by finding the target-language words whose vectors have the highest cosine similarity to it.
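
To make the character n-gram idea concrete, here is a small sketch of how a word can be split into n-grams with the `<` and `>` boundary markers that FastText places around each word. It is an illustration only and is not taken from this project's source; `char_ngrams` and `MAX_WORD` are hypothetical names, and the code works on bytes, so accented UTF-8 characters would be split across n-grams.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD 64 /* assumed upper bound on word length, in bytes */

/* Hypothetical helper: writes the byte-level n-grams of "<word>" into `out`,
 * one NUL-terminated string per row, up to max_out rows.
 * Returns the number of n-grams written, or 0 if the word is too long. */
size_t char_ngrams(const char *word, size_t n,
                   char out[][MAX_WORD + 3], size_t max_out)
{
    assert(word != NULL && n > 0);

    size_t len = strlen(word);
    if (len > MAX_WORD)
        return 0;

    /* Surround the word with boundary markers: "where" -> "<where>". */
    char padded[MAX_WORD + 3];
    snprintf(padded, sizeof padded, "<%s>", word);
    size_t plen = len + 2;

    /* Slide a window of n bytes across the padded word. */
    size_t count = 0;
    for (size_t i = 0; i + n <= plen && count < max_out; i++) {
        memcpy(out[count], padded + i, n);
        out[count][n] = '\0';
        count++;
    }
    return count;
}
```

For the word "where" with n = 3, this yields the five n-grams `<wh`, `whe`, `her`, `ere`, and `re>`. FastText sums the vectors of such n-grams (over a range of n) to form the word vector.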
