5.1 Communicating with Communities

To develop language technology such as machine translation, you need a very large amount of language data. This is a particular challenge for low-resource languages, because few datasets exist for them. People speak these languages across Asia, South America and Africa, yet good-quality data is often unavailable. These languages are rarely seen as languages of the internet or as languages of power.

When building language technology to improve communication in low-resource languages, it’s important to work with the local communities. They can help to create the data that the technology needs. They can also make sure the data is accurate and includes the cultural nuances that play a key role in communication in these languages. This means, for example, asking Swahili-speaking communities for help when setting up a project to build speech-to-text (STT) in Swahili, also known as automatic speech recognition (ASR). Involving the community in this way helps to ensure that end users can easily understand the resulting solution.

Building datasets means setting up teams within the community to create the data that language technology depends on. For example, you can hire translators to translate large volumes of text, or pay local people to translate shorter pieces. You can also ask volunteers or other members of the community to translate in their free time.

The following are some key points about the approach to take when creating datasets. They will help you to deliver language technology that aids communication:

  • It’s very important to explain to contributors how their data helps to build better language technology. Models can only learn a language effectively and accurately from the examples that people provide. Language technology plays a key role in bridging the gap between different cultures, languages, and societies. But it’s important to understand that it is not just a mechanical process of converting words from one language to another. It also needs to convey meaning and cultural nuance.

  • If you want to communicate effectively when introducing language technology, tell people what you expect and give them clear guidelines. Organizations should give translators or volunteers detailed instructions that explain the purpose, target audience, and desired tone of their translations. This helps them to understand the context and to adapt their work if needed, so they can provide better data.

  • Communication between the volunteers themselves is also vital. It keeps terminology and style consistent across projects. Native speakers understand that there is often more than one way to say the same thing in languages like Swahili, Hausa and Somali. Platforms that allow volunteers and translators to work together mean they can share knowledge, ask questions, and ask one another for advice. Such platforms are very important in ensuring a smooth and effective data collection process when building language technology solutions.

  • Translators or volunteers also need to communicate with organizations if they come across any challenges or things that are not clear. If any terms or phrases have multiple meanings or a special cultural meaning, they should ask for clarification. Open dialogue between volunteers and organizations means that any issues can be dealt with quickly. The resulting translations will then be more accurate.

To sum up, effective communication about translation and the data collection process depends on volunteers, translators and organizations setting clear expectations. There also needs to be open dialogue between the volunteers themselves. If we keep these channels of communication open, we can ensure accurate translations with the correct meanings. We can then use these to break down language barriers.
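One way to act on the point about consistent terminology is to flag source terms that different volunteers have translated in different ways, so the team can discuss them. The following is a minimal sketch only. The function name `find_inconsistent_terms`, the tuple layout, and the sample Swahili entries are illustrative assumptions, not part of any specific platform or dataset.

```python
from collections import defaultdict


def find_inconsistent_terms(submissions):
    """Return source terms that have more than one distinct translation.

    `submissions` is an iterable of (volunteer, source_term, translation)
    tuples. This layout is an assumption made for the example.
    """
    # source term -> translation -> set of volunteers who used it
    renderings = defaultdict(lambda: defaultdict(set))
    for volunteer, source_term, translation in submissions:
        term = source_term.strip().lower()
        renderings[term][translation.strip().lower()].add(volunteer)

    # Keep only the terms where volunteers disagree.
    return {
        term: {text: sorted(people) for text, people in options.items()}
        for term, options in renderings.items()
        if len(options) > 1
    }


# Hypothetical sample data for illustration only.
submissions = [
    ("volunteer_a", "computer", "kompyuta"),
    ("volunteer_b", "computer", "tarakilishi"),
    ("volunteer_c", "computer", "kompyuta"),
    ("volunteer_a", "water", "maji"),
    ("volunteer_b", "water", "maji"),
]

for term, options in find_inconsistent_terms(submissions).items():
    print(term, options)
```

A flagged term is not necessarily an error. Both renderings may be valid, so the output is best treated as a list of items for volunteers and the organization to talk through, not as something to correct automatically.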
