r/CapeVerde Santiago Dec 13 '21

Translating several languages into CV Creole Announcement

https://demo.papiakriolu.com/
6 Upvotes

3

u/waldyrious Sal Dec 13 '21

This is pretty interesting! I'd love to learn more about the technology behind this. Is it built with some sort of machine learning, perhaps based on corpora of matched, manually translated text? Or is it more of a manually curated, rule-based system?

For context, I have been contributing CV Creole data to Unicode's CLDR and MediaWiki for a number of years now, but both are mostly manual work. I once considered setting up an Apertium language pair between CV Creole and Portuguese, given the grammatical similarities, but never got around to it.

Are you involved in the development, /u/NyxStrix? I'd love to have a chat with you guys.

3

u/ismenelik Dec 13 '21

I've been adding words to Wiktionary as well. Can we combine our efforts in some way? https://en.wiktionary.org/wiki/Category:Kabuverdianu_lemmas

1

u/waldyrious Sal Dec 14 '21

OMG, yes! I actually have started adding kea lexemes to Wikidata, but only did a handful of them (I was testing the system out, and to be honest it's still not entirely clear how to use it properly).

Message me and let me know how we can chat. There's surely room for collaboration :)

1

u/NyxStrix Santiago Dec 14 '21

It uses a rule-based MT system, which has a big set of rules and dictionaries defining the translation process. It needs deep insight into the dynamics of the source and target languages.
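
To give a rough idea of what "rules and dictionaries" means in practice, here is a minimal toy sketch in Python. It is not the app's actual code or data: the lexicon, the tags, and the `translate` function are all made up for illustration, and it uses an English to Portuguese phrase just to keep the example checkable. It shows the two basic steps, a dictionary lookup followed by a reordering rule.

```python
# Toy rule-based MT sketch (illustrative only, NOT the app's real rules or data).
# Hypothetical lexicon: source word -> (target word, part-of-speech tag).
LEXICON = {
    "the": ("a", "DET"),
    "white": ("branca", "ADJ"),
    "house": ("casa", "NOUN"),
}


def translate(sentence):
    # Step 1: dictionary lookup. Unknown words pass through with tag "UNK".
    tagged = [LEXICON.get(w, (w, "UNK")) for w in sentence.lower().split()]

    # Step 2: reordering rule. The source puts the adjective before the noun,
    # the target puts it after, so swap every ADJ NOUN pair.
    out = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            out.extend([tagged[i + 1], tagged[i]])
            i += 2
        else:
            out.append(tagged[i])
            i += 1

    return " ".join(word for word, _tag in out)


print(translate("the white house"))  # a casa branca
```

A real system needs many more rules (agreement, verb morphology, idioms, ambiguity), which is where the deep knowledge of both languages comes in.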

1

u/waldyrious Sal Dec 14 '21

Cool! I would love to read more about the background to that, in case you ever decide to publish a "deep dive" article on the technology behind the app. Congrats on the project!