Pipeline

Pipeline data model.

Defines Pipeline — a networkx.DiGraph of Step nodes — along with the configuration classes that describe pipeline-level settings such as the pipeline name, KFP host, volumes, and Katib experiments.

class kale.pipeline.VolumeConfig(*args, **kwargs)[source]

Bases: Config

Used for validating the volumes field of NotebookConfig.

name = <kale.config.config.Field object>
mount_point = <kale.config.config.Field object>
snapshot = <kale.config.config.Field object>
snapshot_name = <kale.config.config.Field object>
size = <kale.config.config.Field object>
size_type = <kale.config.config.Field object>
type = <kale.config.config.Field object>
annotations = <kale.config.config.Field object>
storage_class_name = <kale.config.config.Field object>
volume_access_mode = <kale.config.config.Field object>

class kale.pipeline.KatibConfig(*args, **kwargs)[source]

Bases: Config

Used to validate the katib_metadata field of NotebookConfig.

parameters = <kale.config.config.Field object>
objective = <kale.config.config.Field object>
algorithm = <kale.config.config.Field object>
maxTrialCount = <kale.config.config.Field object>
maxFailedTrialCount = <kale.config.config.Field object>
parallelTrialCount = <kale.config.config.Field object>

class kale.pipeline.SecurityContextConfig(*args, **kwargs)[source]

Bases: Config

Configuration for Kubernetes security context settings.

These settings control the security context applied to all pipeline steps. Can be configured via JupyterLab settings or KALE_* environment variables.

enabled = <kale.config.config.Field object>
run_as_user = <kale.config.config.Field object>
run_as_group = <kale.config.config.Field object>
run_as_non_root = <kale.config.config.Field object>

class kale.pipeline.PipelineConfig(*args, **kwargs)[source]

Bases: Config

Main config class to validate the pipeline metadata.

pipeline_name = <kale.config.config.Field object>
experiment_name = <kale.config.config.Field object>
pipeline_description = <kale.config.config.Field object>
base_image = <kale.config.config.Field object>
enable_caching = <kale.config.config.Field object>
volumes = <kale.config.config.Field object>
katib_run = <kale.config.config.Field object>
katib_metadata = <kale.config.config.Field object>
abs_working_dir = <kale.config.config.Field object>
marshal_volume = <kale.config.config.Field object>
marshal_path = <kale.config.config.Field object>
steps_defaults = <kale.config.config.Field object>
kfp_host = <kale.config.config.Field object>
storage_class_name = <kale.config.config.Field object>
volume_access_mode = <kale.config.config.Field object>
timeout = <kale.config.config.Field object>
security_context = <kale.config.config.Field object>
output_path = <kale.config.config.Field object>

property source_path

Get the path to the main entry point script.

class kale.pipeline.Pipeline(config: PipelineConfig, *args, **kwargs)[source]

Bases: DiGraph

A Pipeline that can be converted into a KFP pipeline.

This class is used to define a pipeline, its steps and all its configurations. It extends nx.DiGraph to exploit some graph-related algorithms but provides helper functions to work with Step objects instead of standard networkx “nodes”. This makes it simpler to access the steps of the pipeline and their attributes.

pipeline_parameters: dict[str, PipelineParam]

run()[source]

Run the steps locally, in topological order.
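As a rough illustration of what running steps in topological order means, the sketch below uses the standard library's graphlib instead of Kale or networkx. The step names, the dependencies mapping, and the run_locally helper are all hypothetical and are not part of the Kale API; they only show that every step runs after all of its parents.

```python
from graphlib import TopologicalSorter

# Hypothetical stand-in for a pipeline: each key is a step name and each
# value is the set of steps it depends on (its parents).
dependencies = {
    "load": set(),
    "train": {"load"},
    "evaluate": {"train"},
    "report": {"train", "evaluate"},
}

# Hypothetical step bodies: each one only records that it ran.
executed = []
step_functions = {
    name: (lambda n=name: executed.append(n)) for name in dependencies
}


def run_locally(deps, functions):
    """Call every step function once, parents before children."""
    for name in TopologicalSorter(deps).static_order():
        functions[name]()


run_locally(dependencies, step_functions)
print(executed)
```

Because this example graph is a single chain of dependencies, only one execution order is valid; in a graph with independent branches, several orders can satisfy the same constraint.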

add_step(step: Step)[source]

Add a new Step to the pipeline.

add_dependency(parent: Step, child: Step)[source]

Link two Steps in the pipeline.

get_step(name: str) → Step[source]

Get the Step with the provided name.

property steps: Iterable[Step]

Get the Step objects, sorted topologically.

property steps_names

Get all Steps’ names, sorted topologically.

property all_steps_parameters

Create a dict with step names and their parameters.

property pipeline_dependencies_tasks

Generate a dictionary of Pipeline dependencies.

property pps_names

Get the names of the pipeline parameters, sorted.

property pps_types

Get the types of the pipeline parameters, sorted by name.

property pps_values

Get the values of the pipeline parameters, sorted by name.
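The three pps_* properties are described as sharing one ordering, so that the i-th name, type, and value belong to the same parameter. A minimal sketch of that idea follows; the Param tuple is a hypothetical stand-in for PipelineParam (its field names are assumptions), and the parameter names and values are made up for illustration.

```python
from typing import Any, NamedTuple


class Param(NamedTuple):
    """Hypothetical stand-in for PipelineParam: a type label and a value."""
    param_type: str
    param_value: Any


# Example parameters, deliberately declared out of alphabetical order.
pipeline_parameters = {
    "lr": Param("float", 0.01),
    "epochs": Param("int", 10),
    "dataset": Param("str", "mnist"),
}

# Sort the names once, then derive types and values from that same ordering
# so the three lists stay aligned index by index.
names = sorted(pipeline_parameters)
types = [pipeline_parameters[n].param_type for n in names]
values = [pipeline_parameters[n].param_value for n in names]

for name, type_, value in zip(names, types, values):
    print(name, type_, value)
```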

get_ordered_ancestors(step_name: str) → Iterable[Step][source]

Return the ancestors of a step in an ordered manner.

Wrapper of graphutils.get_ordered_ancestors.

Returns:

An iterable of Steps.

Return type:

Iterable[Step]
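The behaviour of graphutils.get_ordered_ancestors is not spelled out here. Assuming "ordered" means topological order, the idea can be sketched with the standard library alone. The ordered_ancestors function, the deps mapping, and the step names below are hypothetical and do not reproduce Kale's implementation.

```python
from graphlib import TopologicalSorter


def ordered_ancestors(deps, step_name):
    """Return every ancestor of step_name, parents before children.

    deps maps each step name to the set of its direct parents.
    """
    # Collect all ancestors by walking parent links upward.
    ancestors = set()
    stack = list(deps[step_name])
    while stack:
        current = stack.pop()
        if current not in ancestors:
            ancestors.add(current)
            stack.extend(deps[current])
    # Keep only the ancestors, in a topological order of the whole graph.
    order = TopologicalSorter(deps).static_order()
    return [name for name in order if name in ancestors]


# Hypothetical pipeline: load -> clean -> (train, plot) -> report
deps = {
    "load": set(),
    "clean": {"load"},
    "train": {"clean"},
    "plot": {"clean"},
    "report": {"train", "plot"},
}

print(ordered_ancestors(deps, "train"))
```

For a step such as "report", whose parents "train" and "plot" are independent of each other, more than one valid ordering exists, so callers should not rely on the relative position of unrelated ancestors.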

get_leaf_steps()[source]

Get the list of leaf steps of the pipeline.

A step is considered a leaf when its in-degree is > 0 and its out-degree is 0.

Returns (list): A list of leaf Steps.
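One consequence of the definition above is that an isolated step, with no incoming and no outgoing edges, is not reported as a leaf. The sketch below applies the stated rule to a plain edge list; the leaf_steps function and the step names are hypothetical and stand in for the networkx-based implementation.

```python
def leaf_steps(nodes, edges):
    """Return nodes with in-degree > 0 and out-degree == 0, in input order."""
    in_degree = {n: 0 for n in nodes}
    out_degree = {n: 0 for n in nodes}
    for parent, child in edges:
        out_degree[parent] += 1
        in_degree[child] += 1
    return [n for n in nodes if in_degree[n] > 0 and out_degree[n] == 0]


# Hypothetical pipeline with one step that is not connected to anything.
nodes = ["load", "train", "evaluate", "plot", "standalone"]
edges = [("load", "train"), ("train", "evaluate"), ("train", "plot")]

leaves = leaf_steps(nodes, edges)
print(leaves)
```

Here "standalone" has out-degree 0 but also in-degree 0, so the rule excludes it, while "evaluate" and "plot" both qualify.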

override_pipeline_parameters_from_kwargs(**kwargs)[source]

Overwrite the current pipeline parameters with provided inputs.
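The exact rules this method applies are not documented here. The following sketch shows one plausible reading: keyword arguments replace the values of existing parameters while their declared types are kept. The Param tuple, the override_parameters function, and the choice to raise KeyError for unknown names are all assumptions made for illustration, not Kale's actual behaviour.

```python
from typing import Any, NamedTuple


class Param(NamedTuple):
    """Hypothetical stand-in for PipelineParam: a type label and a value."""
    param_type: str
    param_value: Any


def override_parameters(current, **kwargs):
    """Return a copy of current with values replaced by matching kwargs."""
    updated = dict(current)
    for name, value in kwargs.items():
        if name not in updated:
            # Assumption: unknown names are rejected rather than added.
            raise KeyError(f"unknown pipeline parameter: {name}")
        # Keep the declared type, swap only the value.
        updated[name] = updated[name]._replace(param_value=value)
    return updated


params = {"lr": Param("float", 0.01), "epochs": Param("int", 10)}
overridden = override_parameters(params, lr=0.1)
print(overridden["lr"])
```

Returning a new dictionary keeps the example free of side effects; the real method is described as overwriting the pipeline's current parameters in place.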

show()[source]

Print the pipeline nodes and dependencies in a table.