I wrote a Python program to generate grids for the up/down/left/right (udlr) version of this Grid Arithmetic Problem (gap).
I stopped the a(4) run after it reached 100,000 terms, which took about 5 hours.
The program did a depth-first search, and chose directions in an up/right/down/left (i.e., clockwise) order.
This ended up generating a grid that on the large scale looks like a line with a slope of about 2.48.
I have attached a picture showing what the first 437 terms look like, with the last point in the picture at x,y coordinates 83,217.
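One way to put a number on a large-scale slope like this is an ordinary least-squares fit through the list of [x,y] terms. The sketch below is only an illustration and not necessarily how the 2.48 figure was obtained; the function name fit_slope is made up for this example.

```python
def fit_slope(points):
    """Least-squares slope of y against x for a list of (x, y) pairs.

    Hypothetical helper for illustration; assumes the x values are not all equal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared x deviations and sum of x*y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Three points lying exactly on y = 2x.
    print(fit_slope([(0, 0), (1, 2), (2, 4)]))
```

A fit over all the terms need not equal the y/x ratio of the last term, since the walk wanders on either side of the trend line.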
I think one of the biggest time-saving strategies for this search was to realize that,
if you know that your current grid is free of ap4's (arithmetic progressions of length 4), or of any apN, really,
then when adding the next point, you only have to check whether the new point is part of an ap4.
You don't have to re-check the whole grid for ap4's.
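A minimal sketch of this incremental check follows. It assumes that an apN means N occupied cells equally spaced along a straight line in the grid; that definition, and the name creates_ap, are assumptions made for illustration and may differ from what the attached program does. The idea carries over to other definitions: only progressions that pass through the new point need to be examined.

```python
def creates_ap(occupied, p, k):
    """Return True if adding cell p would complete k equally spaced,
    collinear occupied cells that include p.

    `occupied` is a set of (x, y) cells assumed to be free of such
    progressions already; p is the candidate cell (not yet in the set).
    """
    px, py = p
    for qx, qy in occupied:
        dx, dy = qx - px, qy - py
        if dx == 0 and dy == 0:
            continue
        # q could be 1..k-1 steps away from p within a progression,
        # so try every step vector that divides the offset evenly.
        for j in range(1, k):
            if dx % j or dy % j:
                continue
            sx, sy = dx // j, dy // j
            # Count the unbroken run of occupied cells through p
            # along +step and -step.
            run = 1
            for sign in (1, -1):
                i = 1
                while i < k and (px + sign * i * sx, py + sign * i * sy) in occupied:
                    run += 1
                    i += 1
            if run >= k:
                return True
    return False


if __name__ == "__main__":
    cells = {(0, 0), (1, 0), (2, 0)}
    print(creates_ap(cells, (3, 0), 4))  # fourth cell in a row
    print(creates_ap(cells, (3, 1), 4))  # off the line
```

Each call only looks at lines through the new point, so the cost per added term grows with the number of occupied cells rather than with the number of pairs of cells.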
My first version was recursive and quickly ran into Python's default recursion limit of about 1000.
I converted it to an iterative version and was able to get up to 100,000 terms.
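The conversion from recursion to iteration can be sketched with an explicit stack that remembers, for each term on the current path, which direction to try next. This is an illustration rather than the attached program: the names iterative_walk and is_allowed are hypothetical, the walk is assumed to start at (0, 0) and never revisit a cell, and the validity check is passed in as a function.

```python
# up, right, down, left (clockwise), as (dx, dy) offsets
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def iterative_walk(target_len, is_allowed, directions=DIRECTIONS):
    """Depth-first search for a walk of target_len cells without recursion.

    is_allowed(occupied, cell) is a caller-supplied check (for example an
    ap test). Returns the path found, or [] if the search is exhausted.
    """
    path = [(0, 0)]
    occupied = {(0, 0)}
    next_dir = [0]  # next_dir[i]: index of the next direction to try from path[i]
    while path and len(path) < target_len:
        i = next_dir[-1]
        if i == len(directions):
            # Every direction from the last cell failed: backtrack one step.
            occupied.discard(path.pop())
            next_dir.pop()
            continue
        next_dir[-1] += 1
        x, y = path[-1]
        dx, dy = directions[i]
        cand = (x + dx, y + dy)
        if cand not in occupied and is_allowed(occupied, cand):
            path.append(cand)
            occupied.add(cand)
            next_dir.append(0)
    return path


if __name__ == "__main__":
    # With no constraint the walk just heads straight up, far past
    # the depth a recursive version could reach by default.
    print(len(iterative_walk(5000, lambda occupied, cell: True)))
```

Because the search state lives in ordinary lists, the depth is limited by memory rather than by the interpreter's recursion limit.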
I also tried the variant with diagonals included.
I let the search for a(3) of this version get up to 100,000 terms before stopping the program.
The diagonal version produced a grid that, overall, looks like a line with a slope of about 1.19.
I also tried the variant where each next term was a knight's move, or L-move (2 cells in one direction, then 1 cell across), away.
I let the search for a(3) of this version get up to 100,000 terms before stopping the program.
The knight's move version produced a grid that, overall, looks like a line with a slope of about 0.56.
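For reference, the three variants differ only in the set of allowed steps. The sketch below writes each set as (dx, dy) offsets. The clockwise udlr order comes from the description above; the orders used for the diagonal and knight's move sets are assumptions made for illustration, since the search order for those variants is not stated here.

```python
# up, right, down, left
UDLR = [(0, 1), (1, 0), (0, -1), (-1, 0)]

# udlr plus diagonals, clockwise starting from up (order assumed)
UDLR_DIAG = [(0, 1), (1, 1), (1, 0), (1, -1),
             (0, -1), (-1, -1), (-1, 0), (-1, 1)]

# knight's moves, clockwise starting just right of up (order assumed)
KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2),
          (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def neighbors(cell, moves):
    """Cells reachable from `cell` in one step, in the order given by `moves`."""
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in moves]


if __name__ == "__main__":
    print(neighbors((0, 0), KNIGHT))
```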
I have attached a picture showing what the first 204 terms look like (instead of the L shape, each line segment is the "2x1" diagonal).
The last point in this picture is at x,y coordinates 253,164.
I don't know for sure, but it seems like the unbounded grid versions of these 3 variants
(udlr, udlr+diagonals, and L-moves) will grow indefinitely for a(4), a(3), and a(3), respectively.
I like the question that Alex suggested: can any variation completely fill the grid,
and if not, how densely can each one fill it? It'll be a while before I can look at this,
but I'd be interested to hear if anyone else tackles this question.
I have also attached my Python program, and a zip file with the 100k terms for each of the 3 variants. Each file lists the [x,y] coordinates of each term in that grid.
-David C.