12. Glossary

Activation function

A neuron calculates a “weighted sum” of its inputs and adds a bias; the activation function is then applied to that value and decides whether the neuron should be “fired” or not.

See Activation Functions

See Understanding Activation Functions in Neural Network

y = f(\sum_{i} w_i x_i + b)
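
The formula above can be sketched in plain Python. This is a minimal illustration, assuming a sigmoid as the activation f; the names `sigmoid` and `neuron_output` and the sample weights are hypothetical and do not come from the referenced articles.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute y = f(sum_i w_i * x_i + b) for a single neuron."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Hypothetical example values: two inputs, two weights, one bias.
y = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(y)  # weighted sum is 0.5 - 0.5 + 0.0 = 0, and sigmoid(0) = 0.5
```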

alt-f1

ALT-F1 designs, implements, deploys and supports secure, large-scale software solutions for diverse industries: Manufacturing, MRO, Warehouse, Broadcasting, Bank, Insurance, Law Enforcement, Justice & Serious International Crime

See http://www.alt-f1.be

autograd

Module that PyTorch uses to calculate gradients for training neural networks

See https://pytorch.org/docs/stable/notes/autograd.html

Back office

The back office is all the resources of the company that are devoted to actually producing a product or service and all the other labor that isn’t seen by customers, such as administration or logistics.

Source: Wikipedia contributors. (2019, July 19). Back office. In Wikipedia, The Free Encyclopedia. Retrieved 07:53, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Back_office&oldid=906961159

Broker

“An insurance broker sells, solicits, or negotiates insurance for compensation.”

Source: Wikipedia contributors. (2019, September 12). Insurance broker. In Wikipedia, The Free Encyclopedia. Retrieved 10:33, September 13, 2019, from https://en.wikipedia.org/w/index.php?title=Insurance_broker&oldid=915277342

Business Process Management

Business Process Management is a discipline aimed at managing all aspects of business processes, from process design to modeling and analysis to execution and improvement.

Source: https://www.ipdsolution.com/ipdblog/bpm-workflows

CDI

See Customer Data Integration

Chatbot

“A chatbot is a piece of software that conducts a conversation via auditory or textual methods.”

Source: Wikipedia contributors. (2019, September 9). Chatbot. In Wikipedia, The Free Encyclopedia. Retrieved 14:26, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Chatbot&oldid=914875664

Chatbot - book a train

Conda

Package, dependency and environment management for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN.

Contact center

A contact center, a further extension of the call center, administers the centralized handling of individual communications, including letters, faxes, live support software, social media, instant messages, and e-mail.

Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349

Cross-entropy loss

Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1.

A perfect model would have a log loss of 0.

See Neural networks - Cross Entropy

See PyTorch - Cross entropy loss function
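
A minimal sketch of the binary case in plain Python, assuming labels in {0, 1} and predicted probabilities; the function name `binary_cross_entropy` and the clamping constant `eps` are illustrative choices, not part of the referenced material.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average log loss for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1]
confident = binary_cross_entropy(labels, [0.9, 0.1, 0.8])
hesitant = binary_cross_entropy(labels, [0.6, 0.4, 0.5])
print(confident < hesitant)  # True: better predictions give a lower loss
```

Predictions that match the labels exactly drive the loss towards 0, which is the "perfect model" case.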

CUDA

CUDA is NVIDIA's parallel computing platform and programming model. PyTorch uses it to accelerate operations on the GPU.

Customer Data Integration

“Customer data integration (CDI) is the process of defining, consolidating and managing customer information across an organization’s business units and systems to achieve a “single version of the truth” for customer data.”

Source: https://searchdatamanagement.techtarget.com/definition/customer-data-integration

Digitization

“Digitization, less commonly digitalization, is the process of converting information into a digital (i.e. computer-readable) format, in which the information is organized into bits.”

Source: Wikipedia contributors. (2019, August 28). Digitization. In Wikipedia, The Free Encyclopedia. Retrieved 07:13, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Digitization&oldid=912864588

Epoch

One epoch is when the entire dataset is passed forward and backward through the neural network exactly once.

See Epoch vs Batch Size vs Iterations
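
The relation between epochs, batches and iterations can be sketched as a loop. This is an illustration only: the helper `iterate_batches`, the 10-sample dataset and the batch size are hypothetical, and the training step itself is left as a comment.

```python
def iterate_batches(dataset, batch_size):
    """Yield consecutive slices of the dataset, each at most batch_size long."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


dataset = list(range(10))  # hypothetical dataset of 10 samples
epochs, batch_size = 3, 4
iterations = 0
for epoch in range(epochs):
    samples_seen = 0
    for batch in iterate_batches(dataset, batch_size):
        # A real training loop would run the forward and backward pass here.
        samples_seen += len(batch)
        iterations += 1
    assert samples_seen == len(dataset)  # every sample is seen once per epoch

print(iterations)  # 3 batches per epoch (4 + 4 + 2 samples) * 3 epochs = 9
```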

Gradient descent

The gradient is the slope of the loss function and points in the direction of fastest change. To get to the minimum in the least amount of time, we then want to follow the gradient (downwards). You can think of this like descending a mountain by following the steepest slope to the base.

See Intro to PyTorch - Notebook Workspace

Gradients

Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient.

In machine learning, we use Gradient descent to update the parameters of our model. Parameters refer to coefficients in Linear Regression and weights in neural networks.

See https://ml-cheatsheet.readthedocs.io/en/latest/gradient_descent.html

A gradient is a partial derivative. Why partial? Because one computes it with respect to (w.r.t.) a single parameter. A model with two parameters, a and b, therefore needs two partial derivatives.

See Understanding PyTorch with an example: a step-by-step tutorial
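
A minimal sketch of gradient descent with two parameters, a and b, written in plain Python rather than PyTorch. It assumes a linear model y = a + b * x and a mean squared error loss; the function name `fit_line`, the learning rate and the step count are illustrative choices.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = a + b * x by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(a + b * x) - y for x, y in zip(xs, ys)]
        # Partial derivative of the loss with respect to a.
        grad_a = 2.0 * sum(errors) / n
        # Partial derivative of the loss with respect to b.
        grad_b = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        # Step against the gradient, i.e. downhill on the loss surface.
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))  # close to 1.0 and 2.0
```

A framework such as PyTorch computes `grad_a` and `grad_b` automatically; here they are derived by hand to make the two partial derivatives visible.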

Hidden Layers

A hidden layer sits between the input and output layers and applies an activation function before passing on the results.

There are often multiple hidden layers in a network.

In traditional networks, hidden layers are typically fully-connected layers - each neuron receives input from all the previous layer’s neurons and sends its output to every neuron in the next layer.

See https://ml-cheatsheet.readthedocs.io/en/latest/nn_concepts.html?highlight=hidden#layers
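
A fully-connected hidden layer can be sketched in plain Python as follows. The helper `dense_layer`, the ReLU activation and all weights are hypothetical values chosen for illustration.

```python
def relu(z):
    """Rectified linear unit: negative values become 0."""
    return max(0.0, z)


def identity(z):
    return z


def dense_layer(inputs, weights, biases, activation):
    """Each output neuron combines every input, then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


x = [1.0, 2.0]  # input layer: 2 values
hidden = dense_layer(
    x,
    weights=[[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]],  # 3 hidden neurons
    biases=[0.0, 0.0, 0.0],
    activation=relu,
)
output = dense_layer(hidden, weights=[[1.0, 1.0, 1.0]], biases=[0.5],
                     activation=identity)
print(hidden)  # [0.0, 1.5, 1.0]
print(output)  # [3.0]
```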

Inbound call center

An inbound call center is operated by a company to administer incoming product or service support or information enquiries from consumers.

Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349

Jaccard

The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient (originally given the French name coefficient de communauté by Paul Jaccard), is a statistic used for gauging the similarity and diversity of sample sets. The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets

Source: https://en.wikipedia.org/wiki/Jaccard_index
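
The definition translates directly to Python sets. In this small sketch the function name `jaccard_index` is illustrative, and returning 1.0 for two empty sets is an assumed convention.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(union)


print(jaccard_index({"a", "b", "c"}, {"b", "c", "d"}))  # 2 shared / 4 total = 0.5
```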

Kolmogorov–Smirnov test

In statistics, the Kolmogorov–Smirnov test (K–S test or KS test) is a nonparametric test of the equality of continuous (or discontinuous), one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test).

Source: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
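
A minimal sketch of the two-sample case: the test statistic is the largest absolute gap between the two empirical cumulative distribution functions. The function name `ks_statistic` is illustrative, and the sketch computes only the statistic, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # Both empirical CDFs only change at observed values,
    # so it is enough to compare them at those points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


print(ks_statistic([1, 2, 3], [1, 2, 3]))        # 0.0: identical samples
print(ks_statistic([1, 2, 3], [4, 5, 6]))        # 1.0: no overlap at all
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```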

Layers

The first layer holds the inputs and is understandably called the input layer. The middle layer is called the hidden layer, and the final layer is the output layer.

Source: Intro to PyTorch - Notebook Workspace

Logit

In statistics, the logit function or the log-odds is the logarithm of the odds p/(1 - p), where p is a probability. It maps probability values from (0, 1) to (-\infty, +\infty).

It is the inverse of the sigmoidal “logistic” function or logistic transform used in mathematics, especially in statistics.

See https://en.wikipedia.org/wiki/Logit
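
A short Python sketch of the logit and of its inverse, the logistic function. The function names `logit` and `logistic` are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic(x):
    """Inverse of the logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


print(logit(0.5))            # 0.0: odds of 1 to 1
print(logistic(logit(0.8)))  # approximately 0.8: the two functions undo each other
```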

Loss function

A measure of our prediction error (also called the cost).
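
One common example of such a measure is the mean squared error, sketched below in plain Python. The function name `mean_squared_error` and the sample values are illustrative.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Only the last prediction is off (by 2), so the loss is (0 + 0 + 4) / 3.
loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(loss)
```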

Mann–Whitney U test

In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a nonparametric test of the null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.

Source: https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test
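
The U statistic behind the test can be sketched by comparing every pair of values. The function name `mann_whitney_u` is illustrative, ties are counted as one half, and the sketch computes only the statistic, not a p-value.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: each pair with x > y counts 1, each tie counts 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


xs, ys = [1, 4, 6], [2, 3, 5]
u_x = mann_whitney_u(xs, ys)
u_y = mann_whitney_u(ys, xs)
print(u_x, u_y)                     # 5.0 4.0
print(u_x / (len(xs) * len(ys)))    # 5/9, the share of pairs in which x is greater than y
```

Under the null hypothesis both shares are expected to be close to one half.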

mathjax

See Short Math Guide for LaTeX

See Math into LaTeX: An Introduction to LaTeX and AMS-LaTeX

Middle office

The middle office is made up of the risk managers and the information technology managers who manage risk and maintain the information resources.

Source: Wikipedia contributors. (2019, August 9). Middle office. In Wikipedia, The Free Encyclopedia. Retrieved 08:36, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Middle_office&oldid=910135163

MNIST

The Modified National Institute of Standards and Technology database is a large database of handwritten digits that is commonly used for training various image processing systems.

Source: https://en.wikipedia.org/wiki/MNIST_database

NumPy

Interacts with PyTorch. NumPy is the fundamental package for scientific computing with Python. It contains among other things:

  • a powerful N-dimensional array object

  • sophisticated (broadcasting) functions

  • tools for integrating C/C++ and Fortran code

  • useful linear algebra, Fourier transform, and random number capabilities

See https://numpy.org/

OpenMined

OpenMined is an open-source community focused on researching, developing, and promoting tools for secure, privacy-preserving, value-aligned artificial intelligence.

See https://www.openmined.org

Outbound call center

An outbound call center is operated for telemarketing, for solicitation of charitable or political donations, debt collection, market research, emergency notifications, and urgent/critical needs blood banks.

Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349

PyTorch

An open source machine learning framework that accelerates the path from research prototyping to production deployment.

See https://pytorch.org/

Robo advisor

A class of financial adviser that provides financial advice or investment management online with moderate to minimal human intervention.

Source: Wikipedia contributors. (2019, August 29). Robo-advisor. In Wikipedia, The Free Encyclopedia. Retrieved 14:22, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Robo-advisor&oldid=912998258

Sigmoid function

A sigmoid function is a mathematical function having a characteristic “S”-shaped curve or sigmoid curve.

See https://en.wikipedia.org/wiki/Sigmoid

SIREMIS

Web Management Interface for Kamailio (OpenSER) SIP Server

See https://siremis.asipto.com

tf-idf
TF-IDF
Term Frequency–Inverse Document Frequency

“In information retrieval, tf–idf, TF*IDF, or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in searches of information retrieval, text mining, and user modeling. The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general.”

Source: https://en.wikipedia.org/wiki/Tf%E2%80%93idf
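
A minimal sketch of one common variant: term frequency as the share of words in the document, and inverse document frequency as log(N / df). Other weighting schemes exist; the function name `tf_idf` and the three-sentence corpus are illustrative.

```python
import math


def tf_idf(term, document, corpus):
    """tf-idf with relative term frequency and idf = log(N / df)."""
    tf = document.count(term) / len(document)
    df = sum(1 for doc in corpus if term in doc)  # documents containing the term
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


# Hypothetical corpus: each document is a list of words.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
doc = corpus[0]
print(tf_idf("the", doc, corpus))  # 0.0: "the" appears in every document
print(tf_idf("mat", doc, corpus) > tf_idf("cat", doc, corpus))  # True: "mat" is rarer
```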

tensors

The main data structure of PyTorch. A tensor is a multi-dimensional array: a vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and an array with three indices is a 3-dimensional tensor (RGB color images, for example).
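
The idea of dimensions can be illustrated without PyTorch, using nested Python lists as a stand-in for tensors. The helper `shape` is hypothetical, and the rows x columns x channels layout of the image is an assumption made for this example.

```python
def shape(array):
    """Return the size of each dimension of a regular, non-empty nested list."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


vector = [1, 2, 3]                                       # 1 index
matrix = [[1, 2, 3], [4, 5, 6]]                          # 2 indices
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]  # 3 indices: row, column, RGB channel

print(shape(vector), shape(matrix), shape(image))  # (3,) (2, 3) (2, 4, 3)
```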

torchvision

The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision.

See torchvision

Underwriter

“Insurance underwriters evaluate the risk and exposures of potential clients. They decide how much coverage the client should receive, how much they should pay for it, or whether even to accept the risk and insure them. Underwriting involves measuring risk exposure and determining the premium that needs to be charged to insure that risk.”

See https://en.wikipedia.org/wiki/Underwriting#Insurance_underwriting

Source: Wikipedia contributors. (2019, August 9). Underwriting. In Wikipedia, The Free Encyclopedia. Retrieved 08:26, September 13, 2019, from https://en.wikipedia.org/w/index.php?title=Underwriting&oldid=910020948

Validation

The action of checking or proving the validity or accuracy of the model generated by the Artificial Intelligence.

Validation Dataset

The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.

See About Train, Validation and Test Sets in Machine Learning
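
A minimal sketch of how a dataset might be divided into training, validation and test sets. The function name `split_dataset`, the 70/15/15 proportions and the fixed seed are illustrative assumptions, not a recommendation from the referenced article.

```python
import random


def split_dataset(samples, train=0.7, validation=0.15, seed=0):
    """Shuffle the samples, then cut them into train, validation and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains is the test set
    )


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

The validation set is used while tuning hyperparameters; the test set is kept aside until the end so that the final evaluation stays unbiased.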

Web Scraping

Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser.

Source: https://en.wikipedia.org/wiki/Web_scraping
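
A minimal sketch of the extraction step, using only the Python standard library. The class name `LinkExtractor` and the page content are hypothetical; a real scraper would first download the page over HTTP and should respect the site's terms of use.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content used instead of a live download.
page = (
    '<html><body>'
    '<a href="/about">About</a> '
    '<a href="https://example.org">External</a>'
    '</body></html>'
)
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/about', 'https://example.org']
```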