Python is one of the most widely used programming languages for data science, data ingestion, and data transformation, all of which are core technology areas of the Business Intelligence IT vertical. Inzata uses Python in the following modules and components:
- InFlow – a fully interactive NoSQL module for data ingestion and transformation within the comprehensive Inzata BI platform. Two ingestion/transformation functions let a user take full advantage of Python:
- Row-based transformation method – lets the user define Python transformation code directly; typically used for transforming streamed data (e.g. IoT).
- Complex transformation method – uses a Docker container wrapper within an InFlow function; typically used when the transformation logic needs access to the whole data set.
- AI/ML Module – The whole AI/ML module is written in Python. It lets the user either apply a predefined set of AI/ML methods (see the list below) or code their own AI/ML method in an interactive Python development environment (e.g. Anaconda), building on Python's large collection of AI and neural network (NN) library functions. The greatest benefit is that the Python development environment is integrated with Inzata, which significantly simplifies the typically time-consuming preparation of data samples. The following predefined, out-of-the-box AI/ML methods are available:
- AI Neural Net model with both regression and classification modules
- SARIMA modules for time series forecasting
- ANOVA variance analysis
- Entity Matching – for deduplication and record linkage at scale using an unsupervised learning approach.
- Text classification with the ability to parametrize it for Nearest Neighbors, SVM, Gaussian Process, Decision Tree, Random Forest, etc.
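
To make the difference between the two InFlow transformation methods listed earlier more concrete, here is a minimal sketch in plain Python. It does not use the Inzata or InFlow API; the function names (`transform_row`, `stream_transform`, `transform_dataset`), the row shape, and the sample sensor readings are all hypothetical and chosen only for illustration. The row-based style touches one record at a time, which suits streamed data, while the whole-data-set style needs every row before it can compute anything.

```python
from statistics import mean
from typing import Dict, Iterable, Iterator, List


def transform_row(row: Dict) -> Dict:
    """Row-based style: the result depends only on the current record."""
    out = dict(row)  # copy so the incoming record is left untouched
    out["temp_f"] = round(row["temp_c"] * 9 / 5 + 32, 1)
    return out


def stream_transform(rows: Iterable[Dict]) -> Iterator[Dict]:
    """Apply the row-based transformation lazily, one record at a time."""
    for row in rows:
        yield transform_row(row)


def transform_dataset(rows: List[Dict]) -> List[Dict]:
    """Whole-data-set style: the mean requires all rows to be available."""
    avg = mean(r["temp_c"] for r in rows)
    return [{**r, "delta_from_mean": round(r["temp_c"] - avg, 2)} for r in rows]


# Hypothetical sample data standing in for IoT sensor readings.
readings = [
    {"sensor": "a", "temp_c": 20.0},
    {"sensor": "b", "temp_c": 22.0},
    {"sensor": "c", "temp_c": 27.0},
]

streamed = list(stream_transform(readings))  # could run on an unbounded stream
enriched = transform_dataset(readings)       # needs the complete data set
```

The first function could be applied to records as they arrive, whereas the second cannot start until the full data set is loaded, which is the kind of situation the complex transformation method is described as addressing.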