2.4 Key terminology and concepts

This section looks at the most common terms and concepts used in language technology. Knowing them will help readers make good use of the playbook and give them a clear understanding of the terminology, abbreviations, and concepts used in the field.

Language Technology (LT): Information technologies that focus on human language, in both spoken and written forms. Language technology, also called Human Language Technology, includes various technologies that are used to process, understand, and generate language. Imagine the helpful tools on your phone or computer that understand and generate words, like language translation apps or voice assistants. These technologies help us to communicate and interact with devices using language. When you type a message on your phone and it suggests the next word, or you talk to a virtual assistant like Siri or Alexa, this is language technology in action. It's like having a smart friend who understands and responds to what you say. It makes technology more accessible and user-friendly.

Artificial Intelligence (AI): Computer systems that can do tasks that normally require human intelligence. AI is like giving computers the ability to do things that usually only humans can do. Imagine having a smart friend who can learn, make decisions, and solve problems but doesn’t need to be programmed for each task.

Automatic Speech Recognition (ASR) or Speech-to-Text (STT): ASR converts spoken language into text. It's used in voice assistants and transcription services, for example. The terms ASR and STT are both used and mean almost the same thing. The difference is that STT can be a semi-manual process, while ASR is fully automated. ASR is like a smart listener that turns spoken words into written text on your device. With STT, the process is usually automated but can involve a bit of human touch, such as checking and correcting the text to make sure every word is right.

Text-to-Speech (TTS): TTS technology converts written text into spoken language. It's used in accessibility tools like screen readers, in virtual assistants, and to generate audio content. It brings written content to life through spoken words. Imagine your phone reading out loud the messages you get, or an audiobook telling your favorite story. This is made possible by TTS technology.

Machine Translation (MT): MT is the technology that uses algorithms and models to automatically translate text or speech from one language to another. It's the tech behind apps and systems that make your friend's text in Swahili appear in English. It lets you connect with people around the world with far fewer language barriers.

Speech Translation (STS): STS technology translates spoken language from one language to another in spoken or written form. It uses ASR, MT, and TTS and allows you to have a seamless conversation. If you're talking, it turns your words into written text (ASR) and translates them into the other language (MT). It can then even read the translated message out loud (TTS). You can have a smooth and natural chat with someone who speaks a language you don't. Your communication will feel effortless.
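
The chain of ASR, MT, and TTS described above can be sketched in a few lines of Python. This is only an illustration of how the three steps connect, not a working speech translator: the functions `recognize_speech`, `translate_text`, and `synthesize_speech` and the tiny phrase table are hypothetical stand-ins for real components.

```python
from dataclasses import dataclass

# Toy phrase table standing in for a real translation model (assumption).
TOY_PHRASE_TABLE = {"habari": "hello", "asante": "thank you"}


def recognize_speech(audio: bytes) -> str:
    """ASR stand-in: pretend the audio bytes already contain the spoken words."""
    return audio.decode("utf-8").lower()


def translate_text(text: str) -> str:
    """MT stand-in: translate word by word, keeping unknown words unchanged."""
    return " ".join(TOY_PHRASE_TABLE.get(word, word) for word in text.split())


def synthesize_speech(text: str) -> bytes:
    """TTS stand-in: return bytes that represent the spoken output."""
    return f"<audio:{text}>".encode("utf-8")


@dataclass
class SpeechTranslation:
    transcript: str    # what was said, as text
    translation: str   # the same message in the target language
    audio: bytes       # the translated message as (pretend) audio


def translate_speech(audio: bytes) -> SpeechTranslation:
    transcript = recognize_speech(audio)       # step 1: speech to text (ASR)
    translation = translate_text(transcript)   # step 2: text to text (MT)
    spoken = synthesize_speech(translation)    # step 3: text to speech (TTS)
    return SpeechTranslation(transcript, translation, spoken)


result = translate_speech(b"Habari asante")
print(result.transcript)
print(result.translation)
```

In a real system each stand-in would be replaced by a trained model, and an error in an early step (for example, a misheard word) carries through to the later ones.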

Language Model (LM): A language model is a statistical or machine learning model that looks at the structure and patterns of written language. Large Language Models (LLMs) are a type of LM that has attracted a lot of attention because they can generate meaningful text that is relevant to a context. These models are often trained on massive amounts of text data. They can be fine-tuned for specific tasks such as completing texts, generating language, and even translation.
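
To make the idea of a statistical language model concrete, here is a minimal sketch of one of the simplest kinds, a bigram model, which counts which word tends to follow which. The three-sentence corpus and the function name `suggest_next` are made up for illustration. Large Language Models are far more sophisticated, but they share the basic idea of learning patterns from text.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; real language models train on far more text.
corpus = [
    "see you soon",
    "see you tomorrow",
    "see you soon my friend",
]

# For each word, count how often each other word comes right after it.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def suggest_next(word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = next_word_counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


print(suggest_next("you"))     # prints: soon
print(suggest_next("friend"))  # prints: None
```

This is roughly how a basic next-word suggestion could work: "soon" follows "you" twice in the corpus and "tomorrow" only once, so "soon" is suggested.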

Chatbot: An application that uses Artificial Intelligence (AI) to hold conversations that feel like human conversations. The interactions are usually text-based. Imagine having a friendly virtual assistant on your computer or phone who is trained to give you certain information. A chatbot is like a digital buddy who uses artificial intelligence to chat with you. It’s like chatting with a real person but via text messages.

Conversational AI: A branch of artificial intelligence that aims to create natural and engaging conversations between machines and humans. This technology makes interactions with machines feel more like talking to a person. It makes it easier and more fun to get things done. Whether you need answers, want to set a reminder, or just have a casual chat, Conversational AI adds a human touch to your interactions with technology.

Dialogue system: The technical framework that allows chatbots to have back-and-forth conversations with users. A dialogue system is the technology behind a chatbot. It makes the chatbot capable of having meaningful back-and-forth conversations with you. Imagine interacting with a virtual assistant that not only understands what you say but also responds in a way that feels like a real conversation.

Knowledge technologies: Technologies that form a bridge between language and concepts or tasks in the real world. They connect an understanding of language to knowledge and information in a specific field. Knowledge technologies are like digital interpreters. They connect language with real-world concepts and tasks. This makes information easier to access and more useful. Imagine you're talking to a super-smart assistant who not only understands what you're saying but also knows a lot about specific topics.

Natural Language Processing (NLP): A field of artificial intelligence that gives computers the ability to understand, interpret, and generate human language. It includes various language technology tasks, such as sentiment analysis and text summarization. Natural Language Processing is like giving super-smart brains to computers so they can understand, talk, and write like humans. Imagine your computer not just reading your words but understanding the feelings behind them. Or writing a summary of a long article for you.

Natural Language Understanding (NLU): The process of giving machines the ability to understand the meaning and context of human language. This makes it possible to have more complicated interactions. NLU is what allows computers or systems to understand a voice request, for example when you ask them to find the right music without using specific technical terms. It's like giving machines the ability to understand the deeper meaning and essence of our words. Interactions with technology can be much smoother and more human-like.

Annotated Data: Data that has been manually labeled with specific information. This might be named entities, transcription, sentiment labels, or part-of-speech tags. Annotated data is information that has been carefully tagged or labeled by humans so that computers can understand it more easily. It's like providing special notes or labels on a map to help someone (in this case, a computer) find their way and understand different aspects of a large amount of information.
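
As an illustration, one piece of annotated data could be stored as shown below. The field names (`text`, `sentiment`, `entities`) and the label names are assumptions chosen for this sketch; real projects and annotation tools each define their own format.

```python
# One annotated example: the raw text plus labels added by a human annotator.
# Field and label names are illustrative, not a standard format.
annotated_example = {
    "text": "Amina loves Nairobi",
    "sentiment": "positive",
    "entities": [
        # start/end are character positions in the text (end is exclusive).
        {"start": 0, "end": 5, "label": "PERSON"},
        {"start": 12, "end": 19, "label": "LOCATION"},
    ],
}


def entity_texts(example):
    """Return (text span, label) pairs for every annotated entity."""
    return [
        (example["text"][entity["start"]:entity["end"]], entity["label"])
        for entity in example["entities"]
    ]


print(entity_texts(annotated_example))
# prints: [('Amina', 'PERSON'), ('Nairobi', 'LOCATION')]
```

A model learns from many thousands of such examples, which is why careful and consistent labeling matters.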

Graphics Processing Units (GPUs): Specialized electronic circuits originally designed to speed up graphics rendering. They are widely used in language technology for parallel processing tasks, such as training models. Similarly, a Tensor Processing Unit (TPU) is a specialized kind of hardware that speeds up machine learning workloads, giving better performance for deep learning tasks.
