MLflow model artifacts are not being stored when running an Airflow DAG, so I am unable to fetch the experiment details


Vasant Chanukya

May 1, 2022, 3:05:49 AM
to mlflow-users
import mlflow

from mlflow.tracking import MlflowClient
client = MlflowClient()

# Train the model and save the model artifacts
mlflow.set_registry_uri('postgresql://postgres:postgres@localhost/mlflow')
mlflow.set_experiment('testing_mlflow_with_airflow')
with mlflow.start_run():
    # creating the training dataframe
    train_x = self.train_data[0]
    train_y = self.train_data[1]

    # training the given model
    model.fit(train_x, train_y)
               
    mlflow.sklearn.log_model(model, "model")

# Get the experiment details by experiment name
experiment_id = client.get_experiment_by_name('testing_mlflow_with_airflow').experiment_id
experiment_results = mlflow.search_runs(experiment_ids=[experiment_id])

Airflow code:


training = BashOperator(
    task_id='mlflow_training',
    bash_command='python3 /home/vasanth/airflow/scripts/mlproject/src/models/train_mlflow.py',
    do_xcom_push=False,
)

Airflow error:


[2022-04-29, 13:01:08 UTC] {subprocess.py:74} INFO - Running command: ['bash', '-c', 'python3 /home/vasanth/airflow/scripts/mlproject/src/models/train_mlflow.py']
[2022-04-29, 13:01:08 UTC] {subprocess.py:85} INFO - Output:
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - WARNING:root:Malformed experiment '2'. Detailed error Yaml file '/tmp/airflowtmpzjvuldm6/mlruns/2/meta.yaml' does not exist.
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - Traceback (most recent call last):
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -   File "/usr/local/lib/python3.8/dist-packages/mlflow/store/tracking/file_store.py", line 262, in list_experiments
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -     experiment = self._get_experiment(exp_id, view_type)
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -   File "/usr/local/lib/python3.8/dist-packages/mlflow/store/tracking/file_store.py", line 341, in _get_experiment
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -     meta = read_yaml(experiment_dir, FileStore.META_DATA_FILE_NAME)
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -   File "/usr/local/lib/python3.8/dist-packages/mlflow/utils/file_utils.py", line 179, in read_yaml
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -     raise MissingConfigException("Yaml file '%s' does not exist." % file_path)
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - mlflow.exceptions.MissingConfigException: Yaml file '/tmp/airflowtmpzjvuldm6/mlruns/2/meta.yaml' does not exist.
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - priniting the testing data <class 'pandas.core.frame.DataFrame'>
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - priniting the testing data    fixed_acidity  volatile_acidity  citric_acid  ...  sulphates  alcohol  quality
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - 0            7.4              0.70         0.00  ...       0.56      9.4        5
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - 1            7.8              0.88         0.00  ...       0.68      9.8        5
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - 2            7.8              0.76         0.04  ...       0.65      9.8        5
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - 3           11.2              0.28         0.56  ...       0.58      9.8        6
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO -
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - [4 rows x 12 columns]
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - tracking uri ***ql://***:***@localhost/mlflow
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - Traceback (most recent call last):
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - File "/home/vasanth/airflow/scripts/mlproject/src/models/train_mlflow.py", line 44, in <module>
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - File "/home/vasanth/airflow/scripts/mlproject/src/models/mlflow_class.py", line 88, in __init__
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - experiment_id = client.get_experiment_by_name('testing_mlflow_with_airflow').experiment_id
[2022-04-29, 13:01:28 UTC] {subprocess.py:89} INFO - AttributeError: 'NoneType' object has no attribute 'experiment_id'
[2022-04-29, 13:01:29 UTC] {subprocess.py:93} INFO - Command exited with return code 1
[2022-04-29, 13:01:29 UTC] {taskinstance.py:1774} ERROR - Task failed with exception

How can I set a directory where the artifacts of all my experiment runs will be stored? Where are the MLflow artifacts being stored now? And how can I fetch the details of all runs with the MLflow client, as in the code above?
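On the question of where things are going now: the path in the warning, `/tmp/airflowtmpzjvuldm6/mlruns`, looks like a relative `mlruns` directory resolved against a temporary working directory. A minimal sketch of that effect, using only the standard library and assuming (not confirmed by the log) that at least one MLflow call fell back to a relative local `mlruns` store while the `BashOperator` command ran in a temporary directory. The helper name `resolve_from` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def resolve_from(cwd, relative="mlruns"):
    """Return the absolute path that `relative` maps to when the process runs in `cwd`."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return Path(relative).resolve()
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


# Stand-in for the temporary directory the task appears to run in.
with tempfile.TemporaryDirectory(prefix="airflowtmp") as scratch:
    print(resolve_from(scratch))  # an mlruns path inside the temporary directory
```

If that assumption holds, anything written to a relative store lands in a directory that differs on each task run, which would explain why it cannot be found afterwards. An absolute path or a database URI does not depend on the working directory.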

I have tried several different approaches, but none of them worked.

I tried setting the tracking URI and registry URI as below:
mlflow.set_tracking_uri('postgresql://postgres:postgres@localhost/mlflow')
mlflow.set_tracking_uri('file:///tmp/mlruns')

mlflow.set_tracking_uri('http://localhost:5000')

mlflow.set_registry_uri('postgresql://postgres:postgres@localhost/mlflow')
mlflow.set_tracking_uri('/home/vasanth/airflow/scripts/mlproject/src/models')
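One thing that might be worth checking, as a sketch rather than a confirmed fix: the posted script calls `set_registry_uri` but not `set_tracking_uri` before `start_run`, and it creates `MlflowClient()` before any URI is set, so the client and the run may be looking at different stores. The example below assumes that is the cause of the `NoneType` error. The helper names `make_client` and `get_or_create_experiment_id` are hypothetical; `MlflowClient(tracking_uri=...)`, `get_experiment_by_name`, and `create_experiment(..., artifact_location=...)` are existing MLflow calls.

```python
TRACKING_URI = "postgresql://postgres:postgres@localhost/mlflow"  # URI from the post
EXPERIMENT_NAME = "testing_mlflow_with_airflow"


def make_client(tracking_uri=TRACKING_URI):
    """Point the fluent API and an explicit client at the same tracking store."""
    # Imported lazily so the helper below can be exercised without MLflow installed.
    import mlflow
    from mlflow.tracking import MlflowClient

    # Set the tracking URI before any experiment or run call is made.
    mlflow.set_tracking_uri(tracking_uri)
    return MlflowClient(tracking_uri=tracking_uri)


def get_or_create_experiment_id(client, name, artifact_location=None):
    """Return the experiment ID, creating the experiment if the lookup returns None."""
    experiment = client.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    # artifact_location is only honoured when the experiment is first created.
    return client.create_experiment(name, artifact_location=artifact_location)
```

Usage would be `client = make_client()`, then `experiment_id = get_or_create_experiment_id(client, EXPERIMENT_NAME)`, then `mlflow.search_runs(experiment_ids=[experiment_id])`. Passing an absolute directory as `artifact_location` is one way to choose where run artifacts go, but since it applies only at creation time, an experiment that already exists keeps its original location.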

