chad, thanks again for the quick response.
(ignore the tl;dr if you find that pretentious or high maintenance or whatever, any help appreciated)
i don't think it's downloading an empty artifact. typically when it can't find an artifact it will give a 404 error, and the stage will fail.
that happens far more rarely, and it's almost always because the artifact max size setting got hit and gocd cleared some artifacts.
(i'm guessing this is why you're asking about size; we've tuned our max artifact size carefully, so we probably haven't had missing artifacts for almost a year.)
to answer your question: our largest artifact store is about 80MB, the smallest are around 3-4MB, and the mean is probably 20-30MB. the disk holding the artifact store has about 700-800GB free, and we don't clean artifacts until we're down to around 200-300GB free, so there's more than enough room to hold several previous revisions of every pipeline's artifacts.
the other reason i don't think this is the issue (and the same reason i have no reason to suspect it's downloading old artifacts) is:
- we would see more failed tests (and, to a lesser extent, failed packages), because a new test run against old binaries would usually cause a test failure. this never happens
- also (and this is the ultimate problem i'm trying to fix here) we never have failed pipelines. it's just that occasionally a downstream gets packaged with an old upstream and fails at runtime (not in the pipeline, or at least never in these particular pipelines, which don't run code outside of tests)
UPSTREAM ARTIFACTS VS SAME-PIPELINE ARTIFACTS (w/variable upstream pipelines)
i don't think i explained the scenario i believe is happening clearly enough, so here it is again:
project2 depends on project1
(project here is synonymous with pipeline)
two agents, agentA and agentB, each with this directory tree:
agent-working/project1-working
agent-working/project2-working
assume both agents have built both projects' most recent revisions
(in other words, both of their 'agent-working' directories are essentially identical)
agentA-project1-version=1
agentA-project2-version=1
agentB-project1-version=1
agentB-project2-version=1
then comes a new commit to project1 (commit-aka-version=2)
- project1 commit#2's build stage/job is assigned to agentA by gocd; it builds and uploads its artifacts
- the rest of the stages complete by fetching the artifacts from commit#2
- project2 gets its second commit, which gets assigned to agentB by gocd
- agentB builds it fine, but recall that agentB wasn't involved in project1-commit#2 and so has only built project1-commit#1
- because project2 isn't a stage of project1, it can't fetch project1's build artifacts (unless you untemplate everything, or unless i can figure out how to templatize multiple upstream artifact fetches)
- so in this instance, project2 builds, but it uses the builds from "../project1-working", which is commit#1
- this works in most cases and gets packaged up, but then fails at runtime because it's still got an old build from project1 mixed in with project2's second commit
i'm pretty sure this is what's happening, because when it does happen, if i go look at the versions of the dlls, the package has the old ones.
that's also why the workaround of periodically re-pulling and refetching the pipelines works, though it's messy/inefficient and done outside of gocd.
what i want to do to fix this is add extra artifact fetches to project2.
if i can change project2 to always fetch project1's artifacts before building, it should work (this is essentially what the localized refetch-and-rebuild-everything hack does).
with templates though, i have to parameterize which project is upstream (the "project2 depends on project1 build artifacts" relationship).
if it were just one upstream per pipeline, i know how to do this with templates and parameters.
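for reference, this is roughly what i mean by the single-upstream case: a minimal cruise-config.xml sketch, where the template name, stage/job names, param name, and paths are all made-up placeholders, not our real config:

<templates>
  <pipeline name="build-template">
    <stage name="build">
      <jobs>
        <job name="compile">
          <tasks>
            <!-- fetch the upstream's build output before our own build runs.
                 (this assumes the named pipeline is an upstream dependency
                 of the pipeline using this template) -->
            <fetchartifact pipeline="#{UPSTREAM_PIPELINE}" stage="build"
                           job="compile" srcdir="dist" dest="deps">
              <runif status="passed" />
            </fetchartifact>
            <exec command="./build.sh" />
          </tasks>
        </job>
      </jobs>
    </stage>
  </pipeline>
</templates>

<!-- each downstream pipeline binds the parameter to its single upstream -->
<pipeline name="project2" template="build-template">
  <params>
    <param name="UPSTREAM_PIPELINE">project1</param>
  </params>
  ...
</pipeline>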
however, for almost all of the pipelines, there are MULTIPLE upstream pipelines whose artifacts a given downstream pipeline needs before it builds.
and it's not just more than one, it's a variable number (some have zero, some 1 or 2, and one even has 7-8)
how can i parameterize an unknown number of artifact fetches through a template?
if i could do this, then this is what i believe would happen in the above scenario:
- project1 commit#2's build stage/job is assigned to agentA by gocd; it builds and uploads its artifacts
- the rest of the stages complete by fetching the artifacts from commit#2
- project2 gets its second commit, which gets assigned to agentB by gocd
- agentB looks at its 'upstream-pipeline-list' and sees it has to pull artifacts from 'project1', so it does, and gets the correct upstream version
- then everything builds and works.
is this possible?
superficially it seems like it's not, but i thought something similar about having different resource requirements per pipeline, and you and some other people explained how to do it in the config.
not sure if it's the same situation, but this seems like it should be possible, and maybe it's just not exposed through the gocd web pipeline editor.
other solutions i can think of are:
- (least preferred) stop using pipeline templates, rebuild all pipelines as stand-alone and just put arbitrary #s of 'fetch artifacts' tasks on each pipeline (one for each upstream project as needed)
- (still bad but better) hack something to mass rsync directories between agents
- (more controlled but still bad and outside of gocd) using our hack script to rebuild pipelines and just automating it further to run on every agent periodically
- (still very complicated but at least inside gocd) somehow scheduling/forcing gocd to periodically build all pipelines on all agents (eg so for any given project/pipeline, agentA and agentN all have identical trees)
- (hacky but more deterministic and closer to "GOCD WAY") trying to put some fixed # of upstreams (upstream1, upstream2, upstream3, etc.) into the template and see if it will properly ignore empty parameters (see the sketch after this list)
- ("GOCD WAY" as i understand it) being able to somehow create a single parameter that holds multiple upstream pipelines and have gocd fetch them all before it builds/tests a downstream stage