Checkpoint 3.8.0 Download

0 views

Skip to first unread message

Hilary Laite

unread,

Aug 5, 2024, 2:37:24 AM8/5/24

to fomrworkprofcount

Checkpointproteins, such as PD-L1 on tumor cells and PD-1 on T cells, help keep immune responses in check. The binding of PD-L1 to PD-1 keeps T cells from killing tumor cells in the body (left panel). Blocking the binding of PD-L1 to PD-1 with an immune checkpoint inhibitor (anti-PD-L1 or anti-PD-1) allows the T cells to kill tumor cells (right panel).

One such drug acts against a checkpoint protein called CTLA-4. Other immune checkpoint inhibitors act against a checkpoint protein called PD-1 or its partner protein PD-L1. Some tumors turn down the T cell response by producing lots of PD-L1.

Immune checkpoint inhibitors can cause side effects that affect people in different ways. The side effects you may have and how they make you feel will depend on how healthy you are before treatment, your type of cancer, how advanced it is, the type of immune checkpoint inhibitor you are receiving, and the dose.

The goal of checkpoint is to solve the problem of package reproducibility in R. Specifically, checkpoint allows you to install packages as they existed on CRAN on a specific snapshot date as if you had a CRAN time machine. To achieve reproducibility, the checkpoint() function installs the packages required or called by your project and scripts to a local library exactly as they existed at the specified point in time. Only those packages are available to your project, thereby avoiding any package updates that came later and may have altered your results. In this way, anyone using checkpoint's checkpoint() can ensure the reproducibility of your scripts or projects at any time. To create the snapshot archives, once a day (at midnight UTC) Microsoft refreshes the Austria CRAN mirror on the "Microsoft R Archived Network" server (). Immediately after completion of the rsync mirror process, the process takes a snapshot, thus creating the archive. Snapshot archives exist starting from 2014-09-17.

A checkpoint is a point in the write-ahead log sequence at which all data files have been updated to reflect the information in the log. All data files will be flushed to disk. Refer to Section 30.5 for more details about what happens during a checkpoint.

The CHECKPOINT command forces an immediate checkpoint when the command is issued, without waiting for a regular checkpoint scheduled by the system (controlled by the settings in Section 20.5.2). CHECKPOINT is not intended for use during normal operation.

If you see anything in the documentation that is not correct, does not match your experience with the particular feature or requires further clarification, please use this form to report a documentation issue.

Checkpoints provide a convenient abstraction for bundling the ValidationThe act of applying an Expectation Suite to a Batch. of a Batch (or Batches)A selection of records from a Data Asset. of data against an Expectation SuiteA collection of verifiable assertions about data. (or several), as well as the ActionsA Python class with a run method that takes a Validation Result and does something with it that should be taken after the validation.

Like Expectation Suites and Validation ResultsGenerated when data is Validated against an Expectation or Expectation Suite., Checkpoints are managed using a Data ContextThe primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components., and have their own Store which is used to persist their configurations to YAML files. These configurations can be committed to version control and shared with your team.

A Checkpoint uses a ValidatorUsed to run an Expectation Suite against data. to run one or more Expectation Suites against one or more Batches provided by one or more Batch RequestsProvided to a Data Source in order to create a Batch.. Running a Checkpoint produces Validation Results and will result in optional Actions being performed if they are configured to do so.

In the Validate Data step of working with Great Expectations, there are two points in which you will interact with Checkpoints in different ways: First, when you create them. And secondly, when you use them to actually Validate your data.

You do not need to re-create a Checkpoint every time you Validate data. If you have created a Checkpoint that covers your data Validation needs, you can save and re-use it for your future Validation needs. Since you can set Checkpoints up to receive some of their required information (like Batch Requests) at run time, it is easy to create Checkpoints that can be readily applied to multiple disparate sources of data.

One of the most powerful features of Checkpoints is that they can be configured to run Actions, which will do some process based on the Validation Results generated when a Checkpoint is run. Typical uses include sending email, slack, or custom notifications. Another common use case is updating Data Docs sites. However, Actions can be created to do anything you are capable of programing in Python. This gives you an incredibly versatile tool for integrating Checkpoints in your pipeline's workflow!

Creating a Checkpoint is part of the initial setup for data validation. Checkpoints are reusable and only need to be created once, although you can create multiple Checkpoints to cover multiple Validation use cases. For more information about creating Checkpoints, see How to create a new Checkpoint.

After you create a Checkpoint, you can use it to Validate data by running it against a Batch or Batches of data. The Batch Requests used by a Checkpoint during this process may be pre-defined and saved as part of the Checkpoint's configuration, or the Checkpoint can be configured to accept one or more Batch Request at run time. For more information about data validation, see How to validate data by running a Checkpoint.

In its most basic form, a Checkpoint accepts an expectation_suite_name identfying the test suite to run, and a batch_request identifying the data to test. Checkpoint can be directly directly in Python as follows:

A Checkpoint uses its configuration to determine what data to Validate against which Expectation Suite(s), and what actions to perform on the Validation Results - these validations and Actions are executed by calling a Checkpoint's run method (analogous to calling validate with a single Batch). Checkpoint configurations are very flexible. At one end of the spectrum, you can specify a complete configuration in a Checkpoint's YAML file, and simply call my_checkpoint.run(). At the other end, you can specify a minimal configuration in the YAML file and provide missing keys as kwargs when calling run.

Checkpoint configurations follow a nested pattern, where more general keys provide defaults for more specific ones. For instance, any required validation dictionary keys (e.g. expectation_suite_name) can be specified at the top-level (i.e. at the same level as the validations list), serving as runtime defaults. Starting at the earliest reference template, if a configuration key is re-specified, its value can be appended, updated, replaced, or cause an error when redefined.

This configuration specifies full validation dictionaries - no nesting (defaults) are used. When run, this Checkpoint will perform one validation of a single batch of data, against a single Expectation Suite ("my_expectation_suite").

This configuration specifies four top-level keys ("expectation_suite_name", "action_list", "evaluation_parameters", and "runtime_configuration") that can serve as defaults for each validation, allowing the keys to be omitted from the validation dictionaries. When run, this Checkpoint will perform two Validations of two different Batches of data, both against the same Expectation Suite ("my_expectation_suite"). Each Validation will trigger the same set of Actions and use the same Evaluation ParametersA dynamic value used during Validation of an Expectation which is populated by evaluating simple expressions or by referencing previously generated metrics. and runtime configuration.

This configuration omits the "validations" key from the YAML, which means a "validations " list must be provided when the Checkpoint is run. Because "action_list", "evaluation_parameters", and "runtime_configuration" appear as top-level keys in the YAML configuration, these keys may be omitted from the validation dictionaries, unless a non-default value is desired. When run, this Checkpoint will perform two validations of two different batches of data, with each batch of data validated against a different Expectation Suite ("my_expectation_suite" and "my_other_expectation_suite", respectively). Each Validation will trigger the same set of actions and use the same Evaluation ParametersA dynamic value used during Validation of an Expectation which is populated by evaluating simple expressions or by referencing previously generated metrics. and runtime configuration.

The return object of a Checkpoint run is a CheckpointResult object. The run_results attribute forms the backbone of this type and defines the basic contract for what a Checkpoint's run method returns. It is a dictionary where the top-level keys are the ValidationResultIdentifiers of the Validation Results generated in the run. Each value is a dictionary having at minimum, a validation_result key containing an ExpectationSuiteValidationResult and an actions_results key containing a dictionary where the top-level keys are names of Actions performed after that particular Validation, with values containing any relevant outputs of that action (at minimum and in many cases, this would just be a dictionary with the Action's class_name).

The run_results dictionary can contain other keys that are relevant for a specific Checkpoint implementation. For example, the run_results dictionary from a WarningAndFailureExpectationSuiteCheckpoint might have an extra key named "expectation_suite_severity_level" to indicate if the suite is at either a "warning" or "failure" level.

CheckpointResult objects include many convenience methods (e.g. list_data_asset_names) that make working with Checkpoint results easier. You can learn more about these methods in the documentation for class: great_expectations.checkpoint.types.checkpoint_result.CheckpointResult.