12. Glossary¶
- Activation function¶
An activation function takes a neuron's pre-activation value (the “weighted sum” of its inputs plus a bias) and decides whether, and how strongly, the neuron should be “fired”.
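As a minimal sketch of this idea, the snippet below computes the weighted sum plus bias for a single neuron and passes it through an activation. The function names (`neuron_output`, `step`, `relu`) and the numbers are illustrative assumptions, not part of any library.

```python
def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


def step(z):
    # Hypothetical "fire or not" activation: 1.0 only when z is positive.
    return 1.0 if z > 0 else 0.0


def relu(z):
    # Rectified linear unit: passes positive values, clamps the rest to 0.
    return max(0.0, z)


# Illustrative values: 1.0*0.5 + 2.0*(-0.25) + 0.1 = 0.1, so the step neuron fires.
fired = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, step)
```

Swapping `step` for `relu` keeps the same weighted sum but returns a graded value instead of a binary decision.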
- alt-f1¶
ALT-F1 designs, implements, deploys and supports secure, large-scale software solutions for diverse industries: Manufacturing, MRO, Warehouse, Broadcasting, Bank, Insurance, Law Enforcement, Justice & Serious International Crime
- autograd¶
Module that PyTorch uses to calculate gradients for training neural networks
- Back office¶
The back office is all the resources of the company that are devoted to actually producing a product or service and all the other labor that isn’t seen by customers, such as administration or logistics.
Source: Wikipedia contributors. (2019, July 19). Back office. In Wikipedia, The Free Encyclopedia. Retrieved 07:53, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Back_office&oldid=906961159
- Broker¶
“An insurance broker sells, solicits, or negotiates insurance for compensation.”
Source: Wikipedia contributors. (2019, September 12). Insurance broker. In Wikipedia, The Free Encyclopedia. Retrieved 10:33, September 13, 2019, from https://en.wikipedia.org/w/index.php?title=Insurance_broker&oldid=915277342
- Business Process Management¶
Business Process Management is a discipline aimed at managing all aspects of business processes, from process design through modeling and analysis to execution and improvement.
- CDI¶
See Customer Data Integration
- Chatbot¶
“A chatbot is a piece of software that conducts a conversation via auditory or textual methods.”
Source: Wikipedia contributors. (2019, September 9). Chatbot. In Wikipedia, The Free Encyclopedia. Retrieved 14:26, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Chatbot&oldid=914875664
- Conda¶
Package, dependency and environment management for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN
- Contact center¶
A contact center, a further extension of the call center, administers centralized handling of individual communications, including letters, faxes, live support software, social media, instant messaging, and e-mail.
Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349
- Cross-entropy loss¶
Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1.
A perfect model would have a log loss of 0.
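A small sketch of the binary case may help. The function name `log_loss` and the clipping constant `eps` are assumptions made for illustration; libraries differ in how they handle probabilities of exactly 0 or 1.

```python
import math


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over a list of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so that log() never receives 0 (assumed convention).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


confident_right = log_loss([1, 0], [0.9, 0.1])
confident_wrong = log_loss([1, 0], [0.1, 0.9])
```

Confident correct predictions give a loss close to 0, while confident wrong predictions are penalized heavily.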
- CUDA¶
CUDA is NVIDIA's parallel computing platform and programming model. PyTorch uses it to accelerate operations on the GPU.
- Customer Data Integration¶
“Customer data integration (CDI) is the process of defining, consolidating and managing customer information across an organization’s business units and systems to achieve a “single version of the truth” for customer data.”
Source: https://searchdatamanagement.techtarget.com/definition/customer-data-integration
- Digitization¶
“Digitization, less commonly digitalization, is the process of converting information into a digital (i.e. computer-readable) format, in which the information is organized into bits.”
Source: Wikipedia contributors. (2019, August 28). Digitization. In Wikipedia, The Free Encyclopedia. Retrieved 07:13, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Digitization&oldid=912864588
- Epoch¶
One epoch is one complete pass of the entire dataset forward and backward through the neural network.
- Gradient descent¶
Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient.
In machine learning, we use gradient descent to update the parameters of our model. Parameters refer to coefficients in linear regression and weights in neural networks.
The gradient is the slope of the loss function and points in the direction of fastest change. To get to the minimum in the least amount of time, we want to follow the gradient downwards. You can think of this like descending a mountain by following the steepest slope to the base.
See https://ml-cheatsheet.readthedocs.io/en/latest/gradient_descent.html
- Gradients¶
A gradient is made up of partial derivatives. Why partial? Because each one is computed with respect to (w.r.t.) a single parameter. For example, a model with two parameters, a and b, requires two partial derivatives.
See Understanding PyTorch with an example: a step-by-step tutorial
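The following sketch illustrates gradient descent with two partial derivatives, assuming a linear model y = a + b·x and a mean squared error loss. The function name `fit_line`, the learning rate and the data are illustrative assumptions, not taken from the tutorial referenced above.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = a + b*x by full-batch gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):  # each full pass over the data is one epoch
        errors = [y - (a + b * x) for x, y in zip(xs, ys)]
        # Partial derivatives of the loss w.r.t. a and w.r.t. b.
        grad_a = -2.0 * sum(errors) / n
        grad_b = -2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        # Move against the gradient, i.e. in the direction of steepest descent.
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Illustrative data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
```

With this data, `a` and `b` should approach 1 and 2. A learning rate that is too large would make the updates diverge instead.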
- Hidden Layers¶
A hidden layer sits between the input and output layers and applies an activation function before passing on the results.
There are often multiple hidden layers in a network.
In traditional networks, hidden layers are typically fully-connected layers - each neuron receives input from all the previous layer’s neurons and sends its output to every neuron in the next layer.
See https://ml-cheatsheet.readthedocs.io/en/latest/nn_concepts.html?highlight=hidden#layers
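A fully-connected hidden layer can be sketched in a few lines of plain Python. The helper `dense`, the weights and the layer sizes below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dense(inputs, weights, biases, activation):
    """Fully-connected layer: every output neuron receives every input."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(z):
    return max(0.0, z)


def identity(z):
    return z


x = [1.0, 2.0]
# Hidden layer with three neurons, each connected to both inputs.
hidden = dense(x, [[1.0, -1.0], [0.5, 0.5], [-1.0, 0.0]], [0.0, 0.0, 1.0], relu)
# Output layer with one neuron connected to all three hidden neurons.
output = dense(hidden, [[1.0, 2.0, 3.0]], [0.5], identity)
```

Here `hidden` evaluates to `[0.0, 1.5, 0.0]` and `output` to `[3.5]`.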
- Inbound call center¶
An inbound call center is operated by a company to administer incoming product or service support or information enquiries from consumers.
Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349
- Jaccard¶
The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient (originally given the French name coefficient de communauté by Paul Jaccard), is a statistic used for gauging the similarity and diversity of sample sets. The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets
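A minimal sketch of the definition, using Python sets. The name `jaccard_index` is illustrative, and returning 1.0 for two empty sets is an assumed convention.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


similarity = jaccard_index({1, 2, 3}, {2, 3, 4})  # 2 shared items out of 4 distinct
```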
- Kolmogorov–Smirnov test¶
In statistics, the Kolmogorov–Smirnov test (K–S test or KS test) is a nonparametric test of the equality of continuous (or discontinuous), one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test).
Source: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
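The two-sample test statistic is the largest gap between the two empirical distribution functions. The sketch below computes only that statistic, not a p-value; the function name `ks_statistic` is an illustrative assumption.

```python
from bisect import bisect_right


def ks_statistic(sample1, sample2):
    """Maximum absolute difference between the two empirical CDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    for x in s1 + s2:
        # Fraction of each sample that is less than or equal to x.
        cdf1 = bisect_right(s1, x) / n1
        cdf2 = bisect_right(s2, x) / n2
        d = max(d, abs(cdf1 - cdf2))
    return d


d = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A value of 0 means the empirical distributions coincide. A value of 1 means the samples do not overlap at all.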
- Layers¶
The first layer holds the inputs and is, understandably, called the input layer. The middle layer is called the hidden layer, and the final layer is the output layer.
- Logit¶
In statistics, the logit function or the log-odds is the logarithm of the odds p/(1 - p), where p is a probability. It is a type of function that maps probability values from (0, 1) to (−∞, +∞).
It is the inverse of the sigmoidal “logistic” function or logistic transform used in mathematics, especially in statistics.
See https://en.wikipedia.org/wiki/Logit
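The inverse relationship between the logit and the logistic (sigmoid) function can be checked numerically. This is a minimal sketch; the function names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Logistic function: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


round_trip = sigmoid(logit(0.8))  # should recover 0.8 up to rounding error
```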
- Loss function¶
A measure of our prediction error. (also called the cost)
- Mann–Whitney U test¶
In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a nonparametric test of the null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.
Source: https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test
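The U statistic can be sketched by direct pairwise comparison, counting ties as one half. This is an illustrative brute-force version (the name `mann_whitney_u` is an assumption) and it does not compute a p-value.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: each pair with x > y counts 1, each tie counts 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


u_xy = mann_whitney_u([1, 3, 5], [2, 3, 4])
u_yx = mann_whitney_u([2, 3, 4], [1, 3, 5])
```

The two statistics always sum to the number of pairs, `len(xs) * len(ys)`. Under the null hypothesis each is expected to be about half of that.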
- mathjax¶
- Middle office¶
The middle office is made up of the risk managers and the information technology managers who manage risk and maintain the information resources.
Source: Wikipedia contributors. (2019, August 9). Middle office. In Wikipedia, The Free Encyclopedia. Retrieved 08:36, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Middle_office&oldid=910135163
- MNIST¶
The Modified National Institute of Standards and Technology database is a large database of handwritten digits that is commonly used for training various image processing systems. Source: https://en.wikipedia.org/wiki/MNIST_database
- NumPy¶
Interacts with PyTorch. NumPy is the fundamental package for scientific computing with Python. It contains among other things:
a powerful N-dimensional array object
sophisticated (broadcasting) functions
tools for integrating C/C++ and Fortran code
useful linear algebra, Fourier transform, and random number capabilities
- OpenMined¶
OpenMined is an open-source community focused on researching, developing, and promoting tools for secure, privacy-preserving, value-aligned artificial intelligence. https://www.openmined.org
- Outbound call center¶
An outbound call center is operated for telemarketing, for solicitation of charitable or political donations, debt collection, market research, emergency notifications, and urgent/critical needs blood banks.
Source: Wikipedia contributors. (2019, September 15). Call centre. In Wikipedia, The Free Encyclopedia. Retrieved 08:59, September 19, 2019, from https://en.wikipedia.org/w/index.php?title=Call_centre&oldid=915792349
- PyTorch¶
An open source machine learning framework that accelerates the path from research prototyping to production deployment.
- Robo advisor¶
A class of financial adviser that provides financial advice or investment management online with moderate to minimal human intervention.
Source: Wikipedia contributors. (2019, August 29). Robo-advisor. In Wikipedia, The Free Encyclopedia. Retrieved 14:22, September 12, 2019, from https://en.wikipedia.org/w/index.php?title=Robo-advisor&oldid=912998258
- Sigmoid function¶
A sigmoid function is a mathematical function having a characteristic “S”-shaped curve or sigmoid curve.
- SIREMIS¶
Web Management Interface for Kamailio (OpenSER) SIP Server
- tf-idf¶
- TF-IDF¶
- Term Frequency–inverse Document Frequency¶
“In information retrieval, tf–idf, TF*IDF, or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in searches of information retrieval, text mining, and user modeling. The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general.”
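Several tf–idf variants exist. The sketch below assumes one common, unsmoothed form: relative term frequency multiplied by the natural logarithm of (number of documents / number of documents containing the term). The function name `tf_idf` and the tiny corpus are illustrative.

```python
import math


def tf_idf(term, document, corpus):
    """document is a list of tokens; corpus is a list of such documents."""
    tf = document.count(term) / len(document)
    containing = sum(1 for doc in corpus if term in doc)
    if containing == 0:
        return 0.0  # the term appears nowhere in the corpus
    idf = math.log(len(corpus) / containing)
    return tf * idf


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "and", "the", "dog"],
]
common = tf_idf("the", corpus[0], corpus)  # appears in every document
rare = tf_idf("sat", corpus[0], corpus)    # appears in one document only
```

A word that occurs in every document gets a weight of 0, while a word unique to one document gets the highest weight for its frequency.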
- tensors¶
The main data structure of PyTorch. A tensor is a multi-dimensional array: a vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and an array with three indices is a 3-dimensional tensor (an RGB color image, for example).
- torchvision¶
The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision.
See torchvision
- Underwriter¶
“Insurance underwriters evaluate the risk and exposures of potential clients. They decide how much coverage the client should receive, how much they should pay for it, or whether even to accept the risk and insure them. Underwriting involves measuring risk exposure and determining the premium that needs to be charged to insure that risk.”
See https://en.wikipedia.org/wiki/Underwriting#Insurance_underwriting
Source: Wikipedia contributors. (2019, August 9). Underwriting. In Wikipedia, The Free Encyclopedia. Retrieved 08:26, September 13, 2019, from https://en.wikipedia.org/w/index.php?title=Underwriting&oldid=910020948
- Validation¶
The action of checking or proving the validity or accuracy of the model generated by the artificial intelligence.
- Validation Dataset¶
The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
See About Train, Validation and Test Sets in Machine Learning
- Web Scraping¶
Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser.