Are artifacts only files?

Veeral Patel

Sep 30, 2019, 5:36:48 PM
to go-cd
I'm trying to create a GoCD pipeline that takes the URL of a file in S3, computes the MD5 of it, and does a few different things with the MD5.

I was thinking of setting it up so computing the file's MD5 is one job and the other operations using the MD5 are additional jobs (which run in parallel).

My question is, how do I allow the additional jobs to access the MD5 computed from the previous job? I could create a text file artifact which contains the MD5, but I'm wondering if I could make the MD5 a "string artifact" if that makes sense.

Aravind SV

Oct 1, 2019, 6:19:23 PM
to go...@googlegroups.com
Hello Veeral,

On Mon, Sep 30, 2019 at 14:36:48 -0700, Veeral Patel wrote:
> I was thinking of setting it up so computing the file's MD5 is one job and
> the other operations using the MD5 are additional jobs (which run in
> parallel).
>
> My question is, how do I allow the additional jobs to access the MD5
> computed from the previous job? I could create a text file artifact which
> contains the MD5, but I'm wondering if I could make the MD5 a "string
> artifact" if that makes sense.

As you mentioned, jobs run in parallel. So, it's possible that your MD5-computing-job has not finished while your other jobs which need it have started. That's the reason jobs cannot share artifacts. So, your MD5-computing-job should be in a previous stage.

Between jobs in different stages, the usual way to share information is through file artifacts. Of course, you can also upload that information to a different system (for example, a Docker image can be considered an artifact that resides in a Docker registry).

GoCD has the concept of a pluggable, external artifact where the artifact is not necessarily a file. For instance, the Docker registry artifact plugin (https://github.com/gocd/docker-registry-artifact-plugin) allows you to publish and fetch Docker images as artifacts. GoCD manages the metadata using regular artifact files in the background.

Cheers,
Aravind

Veeral Patel

Oct 1, 2019, 8:01:33 PM
to go-cd
> As you mentioned, jobs run in parallel. So, it's possible that your MD5-computing-job has not finished while your other jobs which need it have started. That's the reason jobs cannot share artifacts. So, your MD5-computing-job should be in a previous stage.

Yes, I was envisioning putting the compute MD5 job into a different stage. What's the best way to share string data like the MD5 between stages, in your view?

Aravind SV

Oct 2, 2019, 1:40:10 PM
to go...@googlegroups.com
Hey Veeral,

On Tue, Oct 01, 2019 at 17:01:32 -0700, Veeral Patel wrote:
> > As you mentioned, jobs run in parallel. So, it's possible that your
> > MD5-computing-job has not finished while your other jobs which need it have
> > started. That's the reason jobs cannot share artifacts. So, your
> > MD5-computing-job should be in a previous stage.
>
> Yes, I was envisioning putting the compute MD5 job into a different stage.
> What's the best way to share string data like the MD5 between stages, in
> your view?

Yes, it would be easiest to share a file with that information. Of course an ideal modeling approach would be:

Stage 1: "Save" string to some key-value store.

Stage 2: "Retrieve" string from the key-value store for "Stage 1".

But, since it's not possible to model it that way, you will need to save it to a file. It is possible to write a pluggable artifact plugin which can do that. :) But, it seems like it would be overkill.
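
The file approach can be sketched roughly as follows. This is only an illustration, not GoCD-specific code: the file name `md5.txt` and the function names are made up for the example, and it assumes the first stage's job publishes that file as an artifact while a later stage's job fetches it before reading.

```python
import hashlib
from pathlib import Path

# Hypothetical artifact file name; any name works as long as both stages agree.
MD5_FILE = Path("md5.txt")


def compute_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_md5(md5_hex: str, out_file: Path = MD5_FILE) -> None:
    """Stage 1: write the digest to a file that the job then publishes."""
    out_file.write_text(md5_hex + "\n", encoding="ascii")


def load_md5(in_file: Path = MD5_FILE) -> str:
    """Stage 2: read the digest back once the artifact has been fetched."""
    md5_hex = in_file.read_text(encoding="ascii").strip()
    # Basic sanity check: an MD5 hex digest is 32 lowercase hex characters.
    if len(md5_hex) != 32 or any(c not in "0123456789abcdef" for c in md5_hex):
        raise ValueError(f"{in_file} does not contain an MD5 hex digest")
    return md5_hex
```

The first stage would call `save_md5(compute_md5(downloaded_file))`, and each job in the later stage would call `load_md5()` after fetching the file.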

Cheers,
Aravind

Veeral Patel

Oct 2, 2019, 1:42:34 PM
to go-cd
Can you share a file between stages?

Aravind SV

Oct 2, 2019, 1:53:53 PM
to go...@googlegroups.com
On Wed, Oct 02, 2019 at 10:42:34 -0700, Veeral Patel wrote:
> Can you share a file between stages?

Yes. You can publish artifacts in a stage and fetch them from any stage following it.

https://docs.gocd.org/current/configuration/managing_dependencies.html#fetching-artifacts-from-an-upstream-pipeline

It can even be done across pipelines. GoCD ensures that what it fetches is the correct version of the upstream artifact, meaning, it will go up the dependency graph and find the artifact published during the correct run of an upstream pipeline. Of course, if it's a stage in the same pipeline, it doesn't need to go up the dependency graph.

Veeral Patel

Oct 2, 2019, 5:45:15 PM
to go-cd
Great, so for future readers, my options are either to:

1. write the MD5 to a file and read it in a subsequent stage
2. write an artifact plugin which reads/writes the MD5 from a key/value store

Let me know if this is incorrect!

Aravind SV

Oct 3, 2019, 9:50:42 AM
to go...@googlegroups.com
On Wed, Oct 02, 2019 at 14:45:15 -0700, Veeral Patel wrote:
> Great, so for future readers, my options are either to:
>
> 1. write the MD5 to a file and read it in a subsequent stage
> 2. write an artifact plugin which reads/writes the MD5 from a key/value store

That seems about right.

For option 2, GoCD's artifact endpoint allows you to store a key-value pair inside itself. See the "metadata" in the response in: https://plugin-api.gocd.org/current/artifacts/#publish-artifact
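
To make the "metadata" idea concrete, here is a rough sketch of the JSON shape only. It is not a working plugin: GoCD plugins run on the JVM, so Python is used here purely for illustration. The top-level `"metadata"` field is the one mentioned above; the `"md5"` key inside it and both function names are assumptions made up for this example, and the linked plugin API docs remain the authority on the actual request and response formats.

```python
import json


def publish_artifact_response(md5_hex: str) -> str:
    """Build a JSON response body that carries the MD5 under "metadata".

    The "md5" key is a hypothetical name chosen for this example.
    """
    return json.dumps({"metadata": {"md5": md5_hex}})


def md5_from_metadata(metadata: dict) -> str:
    """Pull the MD5 back out of a previously stored metadata mapping."""
    return metadata["md5"]
```

A fetching job would then receive the stored metadata back and could recover the string with `md5_from_metadata(...)`, without the MD5 ever being written to a file by the pipeline author.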