AI GLOSSARY - T
t-SNE (t-distributed Stochastic Neighbor Embedding)
Definition: A machine learning algorithm for dimensionality reduction that is particularly well suited to visualising high-dimensional datasets. It converts similarities between data points into joint probabilities and tries to minimise the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data.
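A minimal sketch of how t-SNE is typically applied for visualisation, here assuming scikit-learn and its bundled digits dataset; the perplexity value is an illustrative choice, not a recommendation.

```python
# Illustrative only: project the 64-dimensional digits data down to 2-D with t-SNE.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)           # 1797 samples, 64 features each

# perplexity trades off local vs. global structure; 30 is a common starting point
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_2d = tsne.fit_transform(X)                  # shape (1797, 2), ready for a scatter plot

print(X_2d.shape)
```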
Tagging (Part-of-Speech Tagging)
Definition: In natural language processing, tagging refers to the process of identifying and labelling the grammatical parts of speech in text, such as nouns, verbs, and adjectives. It is commonly used in text analysis and linguistic processing.
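As a sketch, NLTK's off-the-shelf English tagger labels each token with a part-of-speech tag; the library choice and the download of its tokeniser and tagger models are assumptions for illustration.

```python
# Illustrative only: part-of-speech tagging with NLTK's default English tagger.
import nltk
nltk.download("punkt", quiet=True)                        # tokeniser model
nltk.download("averaged_perceptron_tagger", quiet=True)   # POS tagger model

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ...]
```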
Tensor
Definition: A generalisation of vectors and matrices that is most easily understood as a multidimensional array. In machine learning, tensors are the data structure used to generalise matrix operations across higher dimensions.
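A small NumPy sketch of the idea: a scalar, a vector, a matrix, and a 3-D array are tensors of rank 0, 1, 2, and 3 respectively.

```python
# Illustrative only: tensors of increasing rank represented as NumPy arrays.
import numpy as np

scalar = np.array(3.0)                       # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])           # rank 1: shape (3,)
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])  # rank 2: shape (2, 2)
tensor3 = np.zeros((2, 3, 4))                # rank 3: shape (2, 3, 4)

for t in (scalar, vector, matrix, tensor3):
    print(t.ndim, t.shape)                   # ndim is the tensor's rank
```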
TensorFlow
Definition: An open-source software library for dataflow programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks.
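A tiny sketch of the library's flavour using the Keras API; the layer sizes and four-feature input are made up for illustration.

```python
# Illustrative only: a minimal TensorFlow/Keras model and a tensor passed through it.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Tensors are the library's core data structure
x = tf.constant([[5.1, 3.5, 1.4, 0.2]])
print(model(x).shape)   # (1, 3): class scores for one sample
```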
Text Mining
Definition: The process of deriving high-quality information from text. It involves the discovery by computer of new, previously unknown information through the automatic extraction of information from different written resources.
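A common first step in text mining is turning raw documents into numeric features; here is a sketch using scikit-learn's TF-IDF vectoriser, with a made-up two-document corpus.

```python
# Illustrative only: extract TF-IDF features from a toy corpus.
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "Text mining derives high-quality information from text.",
    "It automatically extracts information from written resources.",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(corpus)         # sparse matrix: documents x vocabulary
print(vectorizer.get_feature_names_out())    # the extracted vocabulary
print(X.shape)
```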
Text-to-Speech (TTS)
Definition: A type of assistive technology that reads digital text aloud, sometimes called “read aloud” technology. With a click of a button or the touch of a finger, TTS can take words on a computer or other digital device and convert them into audio.
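A minimal sketch using pyttsx3, one of several offline TTS libraries; the choice of library is an assumption made purely for illustration.

```python
# Illustrative only: speak a string aloud with the pyttsx3 offline TTS engine.
import pyttsx3

engine = pyttsx3.init()             # picks the platform's default speech driver
engine.say("Text to speech converts written words into audio.")
engine.runAndWait()                 # blocks until speaking has finished
```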
Thompson Sampling
Definition: An algorithm for sequential decision problems that balances exploitation (choosing actions with known rewards) against exploration (choosing actions whose values are uncertain).
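A compact sketch of Thompson sampling on a Bernoulli bandit: each arm keeps a Beta posterior over its reward probability, and the arm with the highest sampled value is played. The reward probabilities are made up for the demo.

```python
# Illustrative only: Thompson sampling on a 3-armed Bernoulli bandit.
import numpy as np

rng = np.random.default_rng(0)
true_probs = np.array([0.2, 0.5, 0.7])   # unknown to the agent; made up for the demo
successes = np.ones(3)                    # Beta(1, 1) prior on each arm
failures = np.ones(3)

for _ in range(1000):
    samples = rng.beta(successes, failures)   # explore: sample each arm's posterior
    arm = int(np.argmax(samples))             # exploit: play the best-looking arm
    reward = rng.random() < true_probs[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

print(successes / (successes + failures))     # posterior means concentrate on the best arm
```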
Thresholding
Definition: A technique used in image processing and other applications that converts a grayscale image into a binary image, where each pixel becomes 1 or 0 depending on whether it is above or below a threshold value.
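A minimal NumPy sketch: pixels above a chosen threshold become 1, the rest 0. The image values and the threshold are made up.

```python
# Illustrative only: binarise a small grayscale "image" at a fixed threshold.
import numpy as np

image = np.array([[ 12, 200,  90],
                  [180,  60, 240],
                  [ 30, 150,  75]], dtype=np.uint8)

threshold = 128
binary = (image > threshold).astype(np.uint8)   # 1 where pixel > threshold, else 0
print(binary)
```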
Tokenisation
Definition: The process of converting a sequence of characters into a sequence of tokens (small pieces of the whole). It is often the first step in text analysis and natural language processing tasks.
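A simple sketch of word-level tokenisation using a regular expression; real NLP pipelines often use more sophisticated tokenisers (subword or byte-pair encoding, for example).

```python
# Illustrative only: split a sentence into word tokens with a regular expression.
import re

text = "Tokenization is often the first step in NLP."
tokens = re.findall(r"\w+", text.lower())
print(tokens)
# ['tokenization', 'is', 'often', 'the', 'first', 'step', 'in', 'nlp']
```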
Topological Data Analysis (TDA)
Definition: A method of data analysis that uses techniques from topology to infer the underlying structure of data. It helps to identify shapes and patterns, providing insights that are not readily available through traditional statistical methods.
Transfer Learning
Definition: A research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognise cars could apply when trying to recognise trucks.
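A hedged Keras sketch of the usual recipe: reuse a network pre-trained on ImageNet as a frozen feature extractor and train only a new classification head. The two-class head (echoing the cars-versus-trucks example) is made up for illustration.

```python
# Illustrative only: transfer learning by reusing a pre-trained image backbone.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                      # freeze the knowledge learned on ImageNet

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # new task: e.g. cars vs. trucks
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```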
Transformer
Definition: A model architecture that eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. It is the basis for many state-of-the-art models in NLP, including BERT and GPT.
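The core of the architecture is scaled dot-product attention; here is a NumPy sketch of that single operation. The shapes are made up, and real transformers add multiple heads, learned projections, and positional encodings.

```python
# Illustrative only: scaled dot-product attention, the transformer's core operation.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # weighted sum of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 8))   # 5 positions, dimension 8
K = rng.normal(size=(5, 8))
V = rng.normal(size=(5, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (5, 8)
```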
Tree-Based Model
Definition: A type of machine learning algorithm that makes predictions by learning simple decision rules inferred from the data features. Decision trees and random forests are examples of tree-based models.
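A small scikit-learn sketch fitting both kinds of model named above on the bundled iris dataset; the hyperparameters are illustrative.

```python
# Illustrative only: a single decision tree and a random forest on the iris dataset.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

print(tree.predict(X[:2]), forest.predict(X[:2]))
```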
Trigram
Definition: A type of n-gram where n equals three: a sequence of three adjacent elements from a string of tokens, which could be letters, syllables, or words. Trigrams are often used in text prediction and cryptanalysis.
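A tiny sketch generating word-level trigrams from a token sequence.

```python
# Illustrative only: build word-level trigrams from a list of tokens.
tokens = "to be or not to be".split()
trigrams = [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
print(trigrams)
# [('to', 'be', 'or'), ('be', 'or', 'not'), ('or', 'not', 'to'), ('not', 'to', 'be')]
```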
True Negative
Definition: In the context of classification tasks, a true negative is an outcome where the model correctly predicts the negative class. For example, if the model correctly predicts that a given email is not spam, that email counts as a true negative.
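A small sketch counting true negatives for a toy spam classifier; the labels and predictions are made up for illustration.

```python
# Illustrative only: count true negatives for a toy spam/not-spam classification.
# 1 = spam (positive class), 0 = not spam (negative class); labels are made up.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]

true_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
print(true_negatives)   # 4 emails correctly predicted as not spam
```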

