Understanding the Core Modules

The core modules are custom packages that add support for different types of models, model pipelines, serialized data, and data types. When you run your algorithm, we read in your model and data files. We then traverse the test_engine_core_modules directory to check whether it contains support packages that can handle those files. If it does, we can process the model and data correctly.
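The directory traversal described above can be sketched roughly as follows. This is a minimal illustration, not the test engine's actual implementation: the function name `find_support_packages`, the rule that a support package is a subdirectory containing an `__init__.py`, and the plugin names in the demo are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import List


def find_support_packages(root: Path) -> List[str]:
    """Return the names of subdirectories of root that look like packages.

    Assumption for this sketch: a support package is any immediate
    subdirectory that contains an __init__.py file.
    """
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "__init__.py").is_file()
    )


# Demo on a temporary directory standing in for test_engine_core_modules.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("hypothetical_model_plugin", "hypothetical_data_plugin"):
        (root / name).mkdir()
        (root / name / "__init__.py").write_text("")
    (root / "notes").mkdir()  # no __init__.py, so it is skipped
    found = find_support_packages(root)
```

In a real setup, each discovered package would then be asked whether it can handle the given model or data file. That check is left out here because its interface is not described in this section.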

There are four categories of support for algorithms:

Model

Models are frameworks that are run by algorithms.

Models currently supported:

LightGBM
scikit-learn
XGBoost

Model Pipeline

Model pipelines are models that apply a list of transforms, followed by a final estimator, to the data.

Model pipelines currently supported:

scikit-learn
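To make the idea of "a list of transforms and a final estimator" concrete, here is a toy sketch in plain Python. It is not scikit-learn's API. The helper `run_pipeline` and the three small functions are hypothetical names invented for this illustration, and they only mimic the general shape of a pipeline.

```python
from typing import Callable, List, Sequence

Transform = Callable[[List[float]], List[float]]
Estimator = Callable[[List[float]], List[int]]


def run_pipeline(
    transforms: Sequence[Transform], estimator: Estimator, data: List[float]
) -> List[int]:
    """Apply each transform in order, then pass the result to the estimator."""
    for transform in transforms:
        data = transform(data)
    return estimator(data)


def center(values: List[float]) -> List[float]:
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def scale_by_max_abs(values: List[float]) -> List[float]:
    """Divide every value by the largest absolute value (or 1.0 if all are zero)."""
    largest = max(abs(v) for v in values) or 1.0
    return [v / largest for v in values]


def sign_classifier(values: List[float]) -> List[int]:
    """Toy final estimator: label positive values 1 and everything else 0."""
    return [1 if v > 0 else 0 for v in values]


predictions = run_pipeline([center, scale_by_max_abs], sign_classifier, [1.0, 2.0, 6.0])
```

The point of the structure is that the transforms and the estimator travel together as one object, so the same preprocessing is always applied before prediction.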

Deserializer

Deserializers convert serialized data into readable objects. Model and data files are sometimes passed in as a serialized file type (e.g. Joblib). A serialized file is not easy for humans to read or modify. If we have the right deserializer for the serialized file, it will deserialize the file into an object, such as a pandas DataFrame, that users can modify.

Deserializers currently supported:

Delimiter
Joblib
Pickle
TensorFlow
Image
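As a rough illustration of what a deserializer does, the sketch below uses Python's standard `pickle` module to turn an object into opaque bytes and back again. The helper `deserialize_pickle` and the sample table are assumptions made for this example. The actual deserializer packages may be structured quite differently.

```python
import pickle
from typing import Any, Tuple

# A small table standing in for a dataset (hypothetical sample data).
table = {"age": [31, 45], "income": [52000, 61000]}

# Serialized form: a byte string that is not meant to be read or edited by hand.
serialized = pickle.dumps(table)


def deserialize_pickle(raw: bytes) -> Tuple[bool, Any]:
    """Return (True, obj) if raw is a valid pickle, otherwise (False, None).

    Note: only unpickle data from a trusted source, because unpickling
    can execute arbitrary code.
    """
    try:
        return True, pickle.loads(raw)
    except Exception:
        return False, None


ok, restored = deserialize_pickle(serialized)
bad_ok, _ = deserialize_pickle(b"not a pickle")
```

Returning a success flag instead of raising lets a caller try several deserializers in turn and stop at the first one that recognizes the file.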

Data Type

Data type refers to the type of the data after it has been deserialized. If the data passed in does not require deserializing (e.g. the data file is a CSV file), the data type will be whatever is in the data file.

Data types currently supported:

Delimiter (colon-, comma-, pipe-, semicolon-, space- or tab-separated values)
Pandas
Image (JPG, JPEG, PNG)
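One way to picture the delimiter data type is a check that tries each supported separator and keeps the first one that splits every row into the same number of columns. The sketch below uses only the standard `csv` module. The function `detect_delimiter`, the `DELIMITERS` table, and the detection rule itself are assumptions for illustration and may not match how the actual package decides.

```python
import csv
import io
from typing import Optional

# The six separators listed above, keyed by name.
DELIMITERS = {
    "colon": ":",
    "comma": ",",
    "pipe": "|",
    "semicolon": ";",
    "space": " ",
    "tab": "\t",
}


def detect_delimiter(text: str) -> Optional[str]:
    """Return the name of the first delimiter that gives every non-empty row
    the same number of fields (more than one), or None if none does."""
    for name, char in DELIMITERS.items():
        rows = [row for row in csv.reader(io.StringIO(text), delimiter=char) if row]
        widths = {len(row) for row in rows}
        if rows and len(widths) == 1 and widths.pop() > 1:
            return name
    return None


sample = "age|income\n31|52000\n45|61000\n"
detected = detect_delimiter(sample)
```

A simple rule like this can be fooled, for example by comma-separated data whose values also contain spaces, so treat it as a starting point, not a robust detector.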