Yury Morris

Aug 3, 2024, 1:34:34 AM

The training script is very similar to a training script you might run outside of SageMaker, but you can access useful properties about the training environment through various environment variables, including the following:

- SM_MODEL_DIR: a string representing the path to the directory to write model artifacts to, so they can be deployed for inference later.
- SM_NUM_GPUS: an integer representing the number of GPUs available to the host.
- SM_OUTPUT_DATA_DIR: a string representing the path to the directory to write output artifacts to, such as checkpoints and evaluation results.
- SM_CHANNEL_XXXX: a string representing the path to the directory that contains the input data for the specified channel, e.g. SM_CHANNEL_TRAIN for a channel named train.

A typical training script loads data from the input channels, configures training with hyperparameters, trains a model, and saves a model to model_dir so that it can be deployed for inference later. Hyperparameters are passed to your script as arguments and can be retrieved with an argparse.ArgumentParser instance. For example, a training script might start with the following:

Because SageMaker imports your training script, you should put your training code in a main guard (if __name__ == '__main__':) if you are using the same script to host your model, so that SageMaker does not inadvertently run your training code at the wrong point in execution.

The example above will eventually delete both the SageMaker endpoint and endpoint configuration through delete_endpoint(). If you want to keep your SageMaker endpoint configuration, use the value False for the delete_endpoint_config parameter, as shown below.
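For example, assuming predictor is the Predictor object returned by an earlier deploy() call:

```python
# Delete the SageMaker endpoint but keep its endpoint configuration.
predictor.delete_endpoint(delete_endpoint_config=False)
```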

Additionally, it is possible to deploy a different endpoint configuration, which links to your model, to an already existing SageMaker endpoint. This can be done by specifying the existing endpoint name for the endpoint_name parameter along with the update_endpoint parameter as True within your deploy() call.

For GitHub or other Git repositories, if repo is an SSH URL, you should either have no passphrase for the SSH key pairs, or have the ssh-agent configured so that you are not prompted for the SSH passphrase when you run a git clone command with SSH URLs. For SSH URLs, it does not matter whether two-factor authentication (2FA) is enabled. If repo is an HTTPS URL, 2FA matters. When 2FA is disabled, either token or username and password will be used for authentication if provided (token is prioritized). When 2FA is enabled, only token will be used for authentication if provided. If the required authentication information is not provided, the Python SDK will try to use local credential storage to authenticate. If that also fails, an error message is thrown.

For CodeCommit repos, please make sure you have completed the authentication setup. 2FA is not supported by CodeCommit, so 2FA_enabled should not be provided. There are no tokens in CodeCommit, so token should not be provided either. If repo is an SSH URL, the requirements are the same as for GitHub repos. If repo is an HTTPS URL, username and password will be used for authentication if they are provided; otherwise, the Python SDK will try to use either the CodeCommit credential helper or local credential storage for authentication.

Git support can be used not only for training jobs, but also for hosting models. The usage is the same as above, and git_config should be provided when creating model objects, e.g. TensorFlowModel, MXNetModel, PyTorchModel.
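As a sketch, a git_config for a public GitHub repository over HTTPS with 2FA disabled might look like the following; the repository URL, branch, and commit are placeholders:

```python
# Hypothetical git_config for a public GitHub repository (HTTPS URL, 2FA disabled).
# repo, branch, and commit below are placeholder values.
git_config = {
    "repo": "https://github.com/example-user/example-repo.git",
    "branch": "main",
    "commit": "0123456789abcdef0123456789abcdef01234567",
}

# The same dictionary is passed via the git_config keyword when constructing an
# estimator or a model object, for example:
#   PyTorchModel(..., entry_point="inference.py", git_config=git_config)
```

If branch or commit are omitted, the default branch and its latest commit are used.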

Amazon SageMaker supports using Amazon Elastic File System (EFS) and FSx for Lustre as data sources to use during training. If you want to use these data sources, create a file system (EFS/FSx) and mount the file system on an Amazon EC2 instance. For more information about setting up EFS and FSx, see the following documentation:

The SageMaker Python SDK allows you to specify a name and a regular expression for metrics you want to track for training. A regular expression (regex) matches what is in the training algorithm logs, like a search function. Here is an example of how to define metrics:
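A sketch of metric definitions, where each entry pairs a metric name with a regex whose first capture group extracts the value from a training log line; the log format shown is a hypothetical example:

```python
import re

# Each metric has a Name and a Regex; the first capture group extracts the value.
metric_definitions = [
    {"Name": "train:error", "Regex": "Train_error=(.*?);"},
    {"Name": "validation:error", "Regex": "Valid_error=(.*?);"},
]

# Illustration of how such a regex picks the value out of a log line:
log_line = "Train_error=0.138; Valid_error=0.355;"
match = re.search(metric_definitions[0]["Regex"], log_line)
train_error = float(match.group(1))

# metric_definitions is passed to the Estimator constructor, for example:
#   Estimator(..., metric_definitions=metric_definitions)
```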

To use a Docker image that you created with the SageMaker SDK for training, the easiest way is to use the dedicated Estimator class. You can create an instance of the Estimator class with the desired Docker image and use it as described in the previous sections.

You can also find this notebook in the Advanced Functionality folder of the SageMaker Examples section in a notebook instance. For information about using sample notebooks in a SageMaker notebook instance, see Use Example Notebooks in the AWS documentation.

To use incremental training with SageMaker algorithms, you need model artifacts compressed into a tar.gz file. These artifacts are passed to a training job via an input channel configured with the pre-defined settings Amazon SageMaker algorithms require.

This is converted into an input channel with the specifications mentioned above once you call fit() on the estimator. In bring-your-own cases, model_channel_name can be overridden if you need to change the name of the channel while using the same settings.

You can also find this notebook in the Advanced Functionality section of the SageMaker Examples section in a notebook instance. For information about using sample notebooks in a SageMaker notebook instance, see Use Example Notebooks in the AWS documentation.

SageMaker Model Packages are a way to specify and share information for how to create SageMaker Models. With a SageMaker Model Package that you have created or subscribed to in the AWS Marketplace, you can use the specified serving image and model data for Endpoints and Batch Transform jobs.

The SageMaker Python SDK provides built-in algorithms with pre-trained models from popular open source model hubs, such as TensorFlow Hub, PyTorch Hub, and HuggingFace. You can deploy these pre-trained models as-is or first fine-tune them on a custom dataset and then deploy to a SageMaker endpoint for inference.

All JumpStart foundation models are available to use programmatically with the SageMaker Python SDK. For a list of available example notebooks related to JumpStart foundation models, see JumpStart foundation models example notebooks.

SageMaker built-in algorithms with pre-trained models support 15 different machine learning problem types. Below is a list of all the supported problem types with a link to a Jupyter notebook that provides example usage.

This example uses the foundation model FLAN-T5 XL, which is suitable for a wide range of text generation use cases including question answering, summarization, chatbot creation, and more. For more information about model use cases, see Choose a foundation model in the Amazon SageMaker Developer Guide.

You can optionally include specific model versions or instance types when deploying a pre-trained model using the JumpStartModel class. All JumpStart models have a default instance type. Retrieve the default deployment instance type using the following code:
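A sketch of that retrieval; the model ID below is the JumpStart identifier for FLAN-T5 XL used earlier in this document, and "*" selects the latest model version:

```python
from sagemaker import instance_types

# Placeholder JumpStart model identifier; "*" means the latest version.
model_id, model_version = "huggingface-text2text-flan-t5-xl", "*"

instance_type = instance_types.retrieve_default(
    model_id=model_id,
    model_version=model_version,
    scope="inference",
)
print(instance_type)
```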

In this section, you learn how to take a pre-trained model and deploy it directly to a SageMaker Endpoint and understand what happens behind the scenes if you deployed your model as a JumpStartModel. The following assumes familiarity with SageMaker models and their deploy functions.

Next, pass the URIs and other key parameters as part of a new SageMaker Model class. The entry_point is a JumpStart script named inference.py. SageMaker handles the implementation of this script. You must use this value for model inference to be successful. For more information about the Model class and its parameters, see Model.

Save the output from deploying the model to a variable named predictor. The predictor is used to make queries on the SageMaker endpoint. Currently, the generic model.deploy call requires the predictor_cls parameter to define the predictor class. Pass in the default SageMaker Predictor class for this parameter. Deployment may take about 5 minutes.

Because the model and script URIs are distributed by SageMaker JumpStart, the endpoint, endpoint configuration, and model resources will be prefixed with sagemaker-jumpstart. Refer to the model Tags to inspect the model artifacts involved in the model creation.

Finally, use the predictor instance to query your endpoint. For catboost-classification-model, for example, the predictor accepts a CSV. For more information about how to use the predictor, see the Appendix.

Preparing your model for deployment on a SageMaker endpoint can take multiple steps, including choosing a model image, setting up the endpoint configuration, coding your serialization and deserialization functions to transfer data between server and client, identifying model dependencies, and uploading them to S3. SageMaker ModelBuilder can reduce the complexity of initial setup and deployment to help you create a SageMaker-deployable model in a single step. For an in-depth explanation of ModelBuilder and its supporting classes and examples, you can also refer to Create a Model in Amazon SageMaker Studio with ModelBuilder.

ModelBuilder takes a framework model (such as XGBoost or PyTorch) or an inference specification (as discussed in the following sections) and converts it into a SageMaker-deployable model. ModelBuilder provides a build function that generates the artifacts for deployment. The model artifact generated is specific to the model server, which you can specify as one of the inputs. For more details about the ModelBuilder class, see ModelBuilder.

At minimum, the model builder expects a model, input, output and the role. In the following code example, ModelBuilder is called with a framework model and an instance of SchemaBuilder with minimum arguments (to infer the corresponding functions for serializing and deserializing the endpoint input and output).
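A minimal sketch of such a call; the model variable stands for an already-trained framework model, the role ARN is a placeholder, and sample_input/sample_output are example payloads from which SchemaBuilder infers the serialization and deserialization functions:

```python
from sagemaker.serve import ModelBuilder, SchemaBuilder

# Example payloads used by SchemaBuilder to infer (de)serialization functions.
sample_input = {"data": [[1.0, 2.0, 3.0]]}
sample_output = [0]

model_builder = ModelBuilder(
    model=model,  # an already-trained framework model, e.g. XGBoost or PyTorch
    schema_builder=SchemaBuilder(sample_input, sample_output),
    role_arn="arn:aws:iam::111122223333:role/ExampleSageMakerRole",  # placeholder
)

# build() generates the model-server-specific deployment artifacts.
built_model = model_builder.build()
```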

If you want to bring your own container that is extended from a SageMaker container, you can also specify the image URI as shown in the following example. It is also advised that you identify the model server which corresponds to the image using the model_server argument.
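A sketch of that configuration; the image URI and model variable are placeholders, and the ModelServer enum import path is an assumption based on the SageMaker Python SDK's serve module:

```python
from sagemaker.serve import ModelBuilder, ModelServer, SchemaBuilder

sample_input = {"data": [[1.0, 2.0, 3.0]]}  # example payloads for SchemaBuilder
sample_output = [0]

model_builder = ModelBuilder(
    model=model,  # your framework model, assumed to exist
    schema_builder=SchemaBuilder(sample_input, sample_output),
    # Placeholder URI for a container extended from a SageMaker image.
    image_uri="111122223333.dkr.ecr.us-west-2.amazonaws.com/my-extended-image:latest",
    # Identify the model server baked into that image.
    model_server=ModelServer.TORCHSERVE,
)
```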

When invoking a SageMaker endpoint, the data is sent through HTTP payloads with different MIME types. For example, an image sent to the endpoint for inference needs to be converted to bytes by the client and sent through HTTP payload to the endpoint. The endpoint deserializes the bytes before model prediction, and serializes the prediction to bytes that are sent back through the HTTP payload to the client. The client performs deserialization to convert the bytes data back to the expected data format, such as JSON.
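The round trip described above can be sketched in plain Python, independent of any SageMaker API; the function names here are purely illustrative:

```python
import json

def client_serialize(payload):
    # Client side: convert a Python object to bytes for the HTTP request body.
    return json.dumps(payload).encode("utf-8")

def endpoint_deserialize(body):
    # Endpoint side: recover the Python object from the request bytes
    # before handing it to the model.
    return json.loads(body.decode("utf-8"))

def endpoint_serialize(prediction):
    # Endpoint side: convert the model's prediction to bytes for the response.
    return json.dumps(prediction).encode("utf-8")

def client_deserialize(body):
    # Client side: convert response bytes back to the expected format (JSON here).
    return json.loads(body.decode("utf-8"))

request = client_serialize({"instances": [[1.0, 2.0]]})
received = endpoint_deserialize(request)            # what the model sees
response = endpoint_serialize({"predictions": [0.7]})
result = client_deserialize(response)               # what the client gets back
```

In practice the SageMaker SDK's serializer and deserializer classes play these roles, with the MIME type carried in the HTTP Content-Type and Accept headers.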
