Pipeline

Pipeline defines a set of steps that can be executed together.

Attributes

| Name | Type | Description |
| --- | --- | --- |
| pipeline_id | uuid.UUID | technical id of the pipeline. |
| experiment_id | uuid.UUID | technical id of the experiment linked to the pipeline. |
| key | str | logical unique string id of the pipeline. |
| steps | list | list of steps connected to each other through their inputs and outputs. |
| template_id | uuid.UUID | template id associated with the pipeline. |
| variables | dict | dictionary mapping each variable name to its wizata_dsapi.VarType. |
| createdById | uuid.UUID | unique identifier of the creating user. |
| createdDate | int | timestamp of the creation date. |
| updatedById | uuid.UUID | unique identifier of the updating user. |
| updatedDate | int | timestamp of the last update date. |

Methods

add_model()

add a model step

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| config | MLModelConfig | | model configuration to define how the pipeline should train and use your model. |
| input_df | | | str, dict or Pipeline I/O defining input dataframe properties. |
| output_df | | None | str, dict or Pipeline I/O defining output dataframe properties. |

add_plot()

add a plot step

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| script | | | script configuration to define how the pipeline should execute the plot script. |
| df_name | str | None | deprecated usage. |
| input_df | PipelineIO | None | str, dict or Pipeline I/O defining input dataframe properties. |

add_query()

add a query step

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| request | Request | | request definition to add. |
| df_name | str | query_df | output name to use for the dataframe; use output_df for more features. |
| use_template | bool | True | by default, if the pipeline is linked to a template, the query is linked to it too. Set to False to disable forcing it. |
| output_df | PipelineIO | None | output dataframe; can set a mapping. |

add_transformation()

add a transformation script

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| script | | | name, Script or ScriptConfig. |
| inputs | list | None | list of Pipeline I/O or dict or str for dataframe input names. |
| outputs | list | None | list of Pipeline I/O or dict or str for dataframe output names. |
| input_df_names | list | None | deprecated support. |
| output_df_names | list | None | deprecated support. |

add_writer()

add a writer step

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| config | WriteConfig | | writer configuration to define how the pipeline should write data into the platform. |
| input_df | | | str, dict or Pipeline I/O defining input dataframe properties. |

api_id()

Id of the pipeline

return: string formatted UUID of the Pipeline.
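
As a minimal sketch of what a "string formatted UUID" looks like, the snippet below converts a `uuid.UUID` to its canonical text form using only the Python standard library. The variable names and the id value are illustrative assumptions; whether the library applies further normalisation (for example to letter case) is not stated here.

```python
import uuid

# Hypothetical stand-in for a pipeline's technical id (pipeline_id is a uuid.UUID).
pipeline_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

# str() yields the canonical 36-character hyphenated representation.
formatted_id = str(pipeline_id)
print(formatted_id)
```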

check_path()

validate that the steps form a valid path.

return: True if the path is valid; otherwise an error is raised.
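
One plausible reading of "valid path" is that every dataframe a step consumes has been produced by an earlier step. The sketch below illustrates that idea with plain dictionaries. It is not the library's implementation: the step layout (`name`, `inputs`, `outputs`), the function body and the `ValueError` are all assumptions made for illustration.

```python
def check_path(steps):
    """Return True if every input dataframe is produced by an earlier step.

    Illustrative only: steps are plain dicts, not wizata_dsapi step objects.
    """
    available = set()  # dataframe names produced so far
    for index, step in enumerate(steps):
        missing = [name for name in step["inputs"] if name not in available]
        if missing:
            raise ValueError(
                f"step {index} ({step['name']}) needs missing dataframes: {missing}"
            )
        available.update(step["outputs"])
    return True


# Hypothetical three-step pipeline: query -> transformation -> writer.
steps = [
    {"name": "query", "inputs": [], "outputs": ["query_df"]},
    {"name": "transform", "inputs": ["query_df"], "outputs": ["clean_df"]},
    {"name": "writer", "inputs": ["clean_df"], "outputs": []},
]
print(check_path(steps))
```

Raising on the first broken link, rather than returning False, mirrors the documented behaviour of returning true or raising.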

check_variables()

verify that variables dict is a valid { "name" : "VarType" } dictionary.
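
The check described above can be sketched as follows. The `VarType` enum here is a hypothetical stand-in with illustrative members, not the real `wizata_dsapi.VarType`, and the error types are assumptions.

```python
from enum import Enum


class VarType(Enum):
    """Hypothetical stand-in for wizata_dsapi.VarType (members are illustrative)."""
    STRING = "string"
    INTEGER = "integer"
    FLOAT = "float"


def check_variables(variables):
    """Return True if variables is a { "name": "VarType" } dictionary."""
    if not isinstance(variables, dict):
        raise TypeError("variables must be a dict")
    allowed = {member.value for member in VarType}
    for name, var_type in variables.items():
        if not isinstance(name, str):
            raise TypeError(f"variable name must be a str, got {name!r}")
        if not isinstance(var_type, str) or var_type not in allowed:
            raise ValueError(f"variable {name!r} has unknown type {var_type!r}")
    return True


print(check_variables({"threshold": "float", "label": "string"}))
```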

endpoint()

Name of the endpoint used to manipulate pipelines.

return: Endpoint name.

from_json()

load from JSON dictionary representation

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| obj | | | |

set_id()

specify the id_value neutrally

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| id_value | | | |

return:

to_json()

Convert to a JSON version of the pipeline definition.

By default, use DS API format.

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| target | str | None | |
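
To illustrate how `to_json()` and `from_json()` pair up, here is a round-trip sketch using a small stand-in class. `PipelineSketch` and the JSON field names (`id`, `key`, `variables`) are hypothetical and chosen only for the example; the actual DS API format is not described in this section.

```python
import json
import uuid


class PipelineSketch:
    """Hypothetical stand-in; not the wizata_dsapi.Pipeline class."""

    def __init__(self, key=None, pipeline_id=None, variables=None):
        self.pipeline_id = pipeline_id or uuid.uuid4()
        self.key = key
        self.variables = variables or {}

    def to_json(self, target=None):
        # target mirrors the documented signature but is ignored in this sketch.
        return {
            "id": str(self.pipeline_id),
            "key": self.key,
            "variables": dict(self.variables),
        }

    def from_json(self, obj):
        # Load state from a JSON dictionary representation.
        self.pipeline_id = uuid.UUID(obj["id"])
        self.key = obj["key"]
        self.variables = dict(obj.get("variables", {}))


original = PipelineSketch(key="my_pipeline", variables={"threshold": "float"})
payload = json.dumps(original.to_json())  # the dictionary is JSON-serialisable

restored = PipelineSketch()
restored.from_json(json.loads(payload))
print(restored.key)
```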