log_artifact not working


Franklin Sarkett

Aug 23, 2018, 1:43:03 PM
to mlflow-users
Hey folks,

I got the mlflow ui server set up, and it's saving everything except what I log with log_artifact.

The server is running mlflow in a docker container with continuum/miniconda3 as the base image.

From the command line, I'm launching the server like this:

$ mlflow ui -h 0.0.0.0 -p 5000

I can see and interact with the data.

When I try to save the params, metrics and artifacts, I do this:

import mlflow

# neumann
mlflow_server = '52.89....'

# Tracking URI
mlflow_tracking_URI = 'http://' + mlflow_server + ':5000'
print("MLflow Tracking URI: %s" % mlflow_tracking_URI)

# set tracking URI
mlflow.set_tracking_uri(mlflow_tracking_URI)

with mlflow.start_run(experiment_id=3):
    mlflow.log_param("depth", 5)
    mlflow.log_metric("roc_auc", 0.8)
    mlflow.log_artifact(local_path='curve.png')

This is my FileNotFoundError error message:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-113-f5870bc80ffe> in <module>()
      2     mlflow.log_param("depth", 5)
      3     mlflow.log_metric("roc_auc", 0.8)
----> 4     mlflow.log_artifact(local_path='curve.png')

~/py3/lib/python3.7/site-packages/mlflow/tracking/fluent.py in log_artifact(local_path, artifact_path)
    131     """Log a local file or directory as an artifact of the currently active run."""
    132     artifact_uri = _get_or_start_run().info.artifact_uri
--> 133     get_service().log_artifact(artifact_uri, local_path, artifact_path)
    134 
    135 

~/py3/lib/python3.7/site-packages/mlflow/tracking/service.py in log_artifact(self, artifact_uri, local_path, artifact_path)
    105         :param artifact_path: If provided, will be directory in artifact_uri to write to"""
    106         artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
--> 107         artifact_repo.log_artifact(local_path, artifact_path)
    108 
    109     def log_artifacts(self, artifact_uri, local_dir, artifact_path=None):

~/py3/lib/python3.7/site-packages/mlflow/store/local_artifact_repo.py in log_artifact(self, local_file, artifact_path)
     14             if artifact_path else self.artifact_uri
     15         if not exists(artifact_dir):
---> 16             mkdir(artifact_dir)
     17         shutil.copy(local_file, artifact_dir)
     18 

~/py3/lib/python3.7/site-packages/mlflow/utils/file_utils.py in mkdir(root, name)
     99             return target
    100     except OSError as e:
--> 101         raise e
    102 
    103 

~/py3/lib/python3.7/site-packages/mlflow/utils/file_utils.py in mkdir(root, name)
     96     try:
     97         if not exists(target):
---> 98             os.mkdir(target)
     99             return target
    100     except OSError as e:

FileNotFoundError: [Errno 2] No such file or directory: '/mlruns/3/1053b732c0a14d6cb8c07ee4320fd781/artifacts'


When I look at the filesystem, everything is saving except artifacts:

~/mlruns/3$ tree
.
└── 5dcd18160aa74e6e8e405a6257a13177
    ├── artifacts
    ├── meta.yaml
    ├── metrics
    │   └── roc_auc
    └── params
        └── depth

Any suggestions?

Thanks!
Franklin

Aaron Davidson

Aug 23, 2018, 1:49:48 PM
to Franklin Sarkett, mlflow-users
"mlflow ui" is not really suitable for running on a remote server; you should use "mlflow server" instead, which lets you specify further options. Either way, the problem you are running into is that the "--default-artifact-root" is "/mlruns". The client writes artifacts directly to that location, and the path differs between the server and the client.

Please take a look at this section of the docs:

(and for a more complete explanation of the problem, see the highlighted "Important" section there)
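
To illustrate why only log_artifact fails: judging by the traceback above, the artifact directory is created with os.mkdir on the machine that calls log_artifact, not on the server. The sketch below mimics that behaviour with a hypothetical helper, copy_artifact_locally. It is not MLflow's actual code, and the paths are temporary stand-ins.

```python
import os
import shutil
import tempfile


def copy_artifact_locally(artifact_dir, local_file):
    """Hypothetical stand-in for a client-side local artifact repository.

    Like the code in the traceback, it creates only the last path component
    with os.mkdir and then copies the file on the *calling* machine.
    """
    if not os.path.exists(artifact_dir):
        os.mkdir(artifact_dir)  # fails if the parent directory is missing
    shutil.copy(local_file, artifact_dir)


with tempfile.TemporaryDirectory() as client_fs:
    # A file the client wants to log.
    local_file = os.path.join(client_fs, "curve.png")
    with open(local_file, "wb") as f:
        f.write(b"fake image bytes")

    # The server reported this artifact location, but the run directory
    # exists only on the server, not on the client's filesystem.
    server_only_dir = os.path.join(client_fs, "mlruns", "3", "run_id", "artifacts")
    try:
        copy_artifact_locally(server_only_dir, local_file)
        failed = False
    except FileNotFoundError:
        failed = True

    # Once the same path exists on the client (e.g. a shared mount), it works.
    os.makedirs(os.path.dirname(server_only_dir))
    copy_artifact_locally(server_only_dir, local_file)
    copied = os.path.exists(os.path.join(server_only_dir, "curve.png"))
```

Params and metrics are unaffected because they go to the tracking server over HTTP; only the artifact copy touches the client's filesystem.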

--
You received this message because you are subscribed to the Google Groups "mlflow-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlflow-users+unsubscribe@googlegroups.com.
To post to this group, send email to mlflow...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mlflow-users/1db93876-9c50-4c3d-94e9-04d67f0165e6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Franklin Sarkett

Aug 23, 2018, 9:42:54 PM
to mlflow-users
Thank you! That fixed it.

Arnab Biswas

Sep 4, 2018, 12:55:47 AM
to mlflow-users
It looks like I am facing the same issue, even after applying the suggested solution. I am not sure what I am missing.

1. Started MLFlow Server on a RHEL 6.10 server (say server 1) using the following command. I have specified two different locations for default-artifact-root and file-store.

$ cd  /home/arnab/mlflow_install
$ mlflow server --host 0.0.0.0 --port 8090 --default-artifact-root /home/arnab/artifact_location/ --file-store /home/arnab/file_store_location/

MLFlow starts successfully.

2. From Server 2, in a conda environment, I tried to run "mlflow/example/tutorial/train.py", with the following code added so that it uses the MLflow server. Please note that while creating the experiment, I have NOT set "artifact_location".

mlflow.set_tracking_uri("http://<Server_1_IP>:8090")
exp_id = mlflow.create_experiment("Yet Another Sklearn wine experiment")
......
with mlflow.start_run(experiment_id = exp_id):
.........
    # Log parameter, metrics, and model to MLflow
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    mlflow.sklearn.log_model(lr, "model")

When I run the code, mlflow.sklearn.log_model fails with the following stack trace. However, the params and metrics were saved on the MLflow server and are visible through the mlflow ui.

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-5-4c49cbe9b70c> in <module>()
----> 1 train(0.5, 0.5)

<ipython-input-4-ec3e6c2563bd> in train(in_alpha, in_l1_ratio)
     73         mlflow.log_metric("mae", mae)
     74 
---> 75         mlflow.sklearn.log_model(lr, "model")

~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/tracking/fluent.py in log_artifacts(local_dir, artifact_path)
    135     """Log all the contents of a local directory as artifacts of the run."""
    136     artifact_uri = _get_or_start_run().info.artifact_uri
--> 137     get_service().log_artifacts(artifact_uri, local_dir, artifact_path)
    138 
    139 

~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/tracking/service.py in log_artifacts(self, artifact_uri, local_dir, artifact_path)
    113         :param artifact_path: If provided, will be directory in artifact_uri to write to"""
    114         artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
--> 115         artifact_repo.log_artifacts(local_dir, artifact_path)
    116 
    117     def set_terminated(self, run_id, status=None, end_time=None):

~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/store/local_artifact_repo.py in log_artifacts(self, local_dir, artifact_path)
     21             if artifact_path else self.artifact_uri
     22         if not exists(artifact_dir):
---> 23             mkdir(artifact_dir)
     24         dir_util.copy_tree(src=local_dir, dst=artifact_dir)
     25 

~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/utils/file_utils.py in mkdir(root, name)
     99             return target
    100     except OSError as e:
--> 101         raise e
    102 
    103 

~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/utils/file_utils.py in mkdir(root, name)
     96     try:
     97         if not exists(target):
---> 98             os.mkdir(target)
     99             return target
    100     except OSError as e:

FileNotFoundError: [Errno 2] No such file or directory: '/home/arnab/artifact_location/1/365b8fd4692d47f8bc611ab3c5cfce24/artifacts/model'

Below are the details of the directories for this experiment (on Server 1). No file or directory was created under the "default-artifact-root" location:

$ pwd
/home/arnab/artifact_location
$ ls
$

The artifacts directory is under the "file-store" location, but there is no "model" directory inside it.

$ pwd
/home/arnab/file_store_location/1/365b8fd4692d47f8bc611ab3c5cfce24
$ ls
artifacts  meta.yaml  metrics  params

Please let me know if I am missing anything.

I have tried various combinations with "default-artifact-root" and create_experiment locations. But, no success. 

Thanks,
Arnab

Arnab Biswas

Sep 4, 2018, 5:05:57 AM
to mlflow-users

Well... from this issue (https://github.com/mlflow/mlflow/issues/212), I understood that the artifact location should be an NFS mount shared by both the client and the server. I was not sure about the "client and server sharing" part, and so I assumed that a local directory on the server would be considered a valid location. My bad!

In the meantime, I tried the SFTP option for the artifact location and ran into several issues:

1. I started the MLflow server in the following way:

mlflow server --host 0.0.0.0 --port 8090 --default-artifact-root sftp://arnab@<Server_3>:2222/home/arnab/artifact_location --file-store /home/arnab/file_store_location/

Server 3 is different from Server 1 (where the MLflow server is running) and Server 2 (where the client code is running). Also note that I am using a custom SFTP port (2222) here.

2. The first issue was the following error, which appears to be caused by a paramiko incompatibility with Python 3.7, where "async" became a reserved keyword (https://github.com/paramiko/paramiko/issues/1108, https://github.com/unbit/sftpclone/issues/26). The paramiko version needs to be bumped to pick up the fix (https://github.com/paramiko/paramiko/commit/0e0b2b87b547d97860ccf5962ad030df640b692f).

      ................................
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/mlflow/server/handlers.py", line 198, in _list_artifacts
        artifact_entities = _get_artifact_repo(run).list_artifacts(path)
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/mlflow/server/handlers.py", line 249, in _get_artifact_repo
        return ArtifactRepository.from_artifact_uri(run.info.artifact_uri, store)
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/mlflow/store/artifact_repo.py", line 82, in from_artifact_uri
        return SFTPArtifactRepository(artifact_uri)
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/mlflow/store/sftp_artifact_repo.py", line 26, in __init__
        import pysftp
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/pysftp/__init__.py", line 12, in <module>
        import paramiko
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/paramiko/__init__.py", line 31, in <module>
        from paramiko.transport import SecurityOptions, Transport
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/paramiko/transport.py", line 70, in <module>
        from paramiko.sftp_client import SFTPClient
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/paramiko/sftp_client.py", line 43, in <module>
        from paramiko.sftp_file import SFTPFile
      File "/home/arnab/.conda/envs/mlflow/lib/python3.7/site-packages/paramiko/sftp_file.py", line 68
        self._close(async=True)
                        ^
    SyntaxError: invalid syntax

3. After resolving that, I started getting the following error. It occurs because the host key lookup used by pysftp does not handle custom ports (https://bitbucket.org/dundeemt/pysftp/issues/106/no-hostkey-for-host). I was able to get past it with the workaround mentioned there:
"A workaround available to users, assuming that they don't also use the same server on the standard port, is to copy the entry in known_hosts and put it in again labeled with only host."
     ...........................
    ~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/store/artifact_repo.py in from_artifact_uri(artifact_uri, store)
         80         elif artifact_uri.startswith("sftp:/"):
         81             from mlflow.store.sftp_artifact_repo import SFTPArtifactRepository
    ---> 82             return SFTPArtifactRepository(artifact_uri)
         83         elif artifact_uri.startswith("dbfs:/"):
         84             from mlflow.store.dbfs_artifact_repo import DbfsArtifactRepository

    ~/anaconda3/envs/python-skl/lib/python3.7/site-packages/mlflow/store/sftp_artifact_repo.py in __init__(self, artifact_uri, client)
         47                 self.config['private_key'] = user_config['identityfile'][0]
         48 
    ---> 49             self.sftp = pysftp.Connection(**self.config)
         50 
         51         super(SFTPArtifactRepository, self).__init__(artifact_uri)

    ~/anaconda3/envs/python-skl/lib/python3.7/site-packages/pysftp/__init__.py in __init__(self, host, username, private_key, password, port, private_key_pass, ciphers, log, cnopts, default_path)
        130         # check that we have a hostkey to verify
        131         if self._cnopts.hostkeys is not None:
    --> 132             self._tconnect['hostkey'] = self._cnopts.get_hostkey(host)
        133 
        134         self._sftp_live = False

    ~/anaconda3/envs/python-skl/lib/python3.7/site-packages/pysftp/__init__.py in get_hostkey(self, host)
         69         kval = self.hostkeys.lookup(host)  # None|{keytype: PKey}
         70         if kval is None:
    ---> 71             raise SSHException("No hostkey for host %s found." % host)
         72         # return the pkey from the dict
         73         return list(kval.values())[0]

    SSHException: No hostkey for host <IP_Address> found.
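
The quoted workaround could be scripted roughly as follows. This is only a sketch: it assumes the usual OpenSSH convention of storing keys for non-default ports under "[host]:port", the helper name strip_port_from_known_hosts_line is made up, and the host and key in the example are placeholders. Hashed known_hosts entries are not handled.

```python
def strip_port_from_known_hosts_line(line, host, port):
    """Return a copy of a known_hosts line labelled with the bare host.

    OpenSSH stores keys for non-default ports as "[host]:port"; this returns
    the same key entry labelled "host" only, or None if the line does not
    match. Hashed entries (starting with "|1|") are not handled.
    """
    fields = line.strip().split(None, 1)
    if len(fields) != 2:
        return None
    hosts, key = fields
    bracketed = "[%s]:%d" % (host, port)
    if bracketed not in hosts.split(","):
        return None
    return "%s %s" % (host, key)


# Made-up host and truncated key, for illustration only.
example = "[sftp.example.com]:2222 ssh-rsa AAAAB3NzaC1yc2E"
extra_line = strip_port_from_known_hosts_line(example, "sftp.example.com", 2222)
```

The returned line would then be appended to known_hosts by hand. As the quoted workaround notes, this is only safe when the same host does not also run an SSH server on the standard port.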

I am still running into issues with SFTP on a custom port (most probably because of my setup), but I wanted to write this email first (before I start forgetting things :-))

Thanks,
Arnab

Aaron Davidson

Sep 4, 2018, 2:04:55 PM
to Arnab Biswas, Toon KBC, mlflow-users
Thanks for the detailed investigation. If you have an NFS mount, that should work if and only if it's the same path on the client and server, which you might be able to fake by using symlinks/remounting.
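
A minimal sketch of the symlink idea, using temporary directories as stand-ins for the real locations. The "mnt/shared_artifacts" mount point is made up; the "home/arnab/artifact_location" part mirrors the artifact root from earlier in the thread. On a real client the link would be created once, with whatever privileges the target path requires.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Stand-in for where the shared storage is actually mounted on the client.
    client_mount = os.path.join(root, "mnt", "shared_artifacts")
    os.makedirs(client_mount)

    # Stand-in for the path the server hands out as its artifact root.
    server_path = os.path.join(root, "home", "arnab", "artifact_location")
    os.makedirs(os.path.dirname(server_path))

    # Make the server's path resolve to the client's mount point.
    os.symlink(client_mount, server_path)

    # A write through the server-style path lands on the shared mount.
    run_dir = os.path.join(server_path, "1", "run_id", "artifacts")
    os.makedirs(run_dir)
    visible_on_mount = os.path.isdir(
        os.path.join(client_mount, "1", "run_id", "artifacts")
    )
    same_target = os.path.realpath(server_path) == os.path.realpath(client_mount)
```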

For the first SFTP issue you mentioned with paramiko, good catch -- we should probably publish the versions of the dependent libraries we've tested against (e.g., paramiko and pysftp versions) for each artifact store, and make them available like "pip install mlflow[sftp]", similar to Airflow.
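
For anyone hitting the same SyntaxError: "async" became a reserved keyword in Python 3.7, so a call that passes it as a keyword argument no longer parses. The check below is a small sketch that reproduces this without importing paramiko; the helper name compiles is made up for illustration.

```python
import keyword
import sys

# The call from the traceback, reduced to a one-line snippet.
OLD_STYLE_CALL = "self._close(async=True)"


def compiles(source):
    """Return True if `source` is syntactically valid for this interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


async_is_reserved = keyword.iskeyword("async")
old_call_compiles = compiles(OLD_STYLE_CALL)

print("Python %d.%d: async reserved=%s, old call compiles=%s"
      % (sys.version_info[0], sys.version_info[1],
         async_is_reserved, old_call_compiles))
```

If the old call does not compile on your interpreter, the installed paramiko needs to be new enough to include the fix linked earlier in the thread.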

Regarding the problems with getting a non-default port to work, unfortunately I too have little experience with this artifactory. I CC'd Toon, who wrote the initial SFTP artifactory, in case he has any experience with this.


Aaron Davidson

Sep 4, 2018, 2:12:05 PM
to Arnab Biswas, mlflow-users
Oops, Toon's Github email does not seem to point anywhere, so a response is unlikely :)

Arnab Biswas

Sep 5, 2018, 2:31:18 AM
to mlflow-users
Hi Aaron,

Understood. Thank you for clarifying. 

I was able to make SFTP work, and that gave me an idea of how remote artifact storage works with the MLflow server.

[Aaron]  If you have an NFS mount, that should work if and only if it's the same path on the client and server, which you might be able to fake by using symlinks/remounting.
[Arnab] Shouldn't this be clearly documented?

For the other two issues, do you want me to file bugs on GitHub?

Thanks,
Arnab 

Aaron Davidson

Sep 5, 2018, 11:31:51 AM
to Arnab Biswas, mlflow-users
Yes, please do. We should probably explicitly document NFS in the Storage section here: https://github.com/mlflow/mlflow/blob/master/docs/source/tracking.rst


Arnab Biswas

Sep 7, 2018, 4:58:21 AM
to mlflow-users
I have raised the following two issues:

https://github.com/mlflow/mlflow/issues/446

For correcting the documentation, should we re-open : https://github.com/mlflow/mlflow/issues/212?

Thanks,
Arnab