Dear Tobias,
BFGS should be fine up to a few thousand atoms, depending on your computational resources. Beyond that it quickly becomes a bottleneck, because it involves the diagonalisation of a matrix whose dimension is 3 times the number of atoms.
For large systems I would recommend LBFGS, which is nearly as efficient as BFGS and works for 100,000s of atoms. CG can converge very slowly, but it is usually quite robust and can be used in cases where LBFGS has trouble.
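To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a dense 3N x 3N matrix stored in double precision for BFGS, and a generic limited-memory scheme that keeps a short history of vector pairs of length 3N for LBFGS. The function names and the history length of 10 are illustrative assumptions, not CP2K internals or defaults.

```python
def dense_matrix_bytes(n_atoms: int, bytes_per_value: int = 8) -> int:
    """Memory for a dense (3N x 3N) matrix, as a BFGS-type optimizer would hold."""
    dim = 3 * n_atoms  # three Cartesian coordinates per atom
    return dim * dim * bytes_per_value


def limited_memory_bytes(n_atoms: int, history: int = 10,
                         bytes_per_value: int = 8) -> int:
    """Memory for `history` pairs of length-3N vectors (generic L-BFGS storage).

    `history` is a hypothetical value chosen for illustration only.
    """
    dim = 3 * n_atoms
    return 2 * history * dim * bytes_per_value


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        dense_gb = dense_matrix_bytes(n) / 1e9
        limited_mb = limited_memory_bytes(n) / 1e6
        print(f"{n:>7} atoms: dense matrix {dense_gb:10.3f} GB, "
              f"limited-memory history {limited_mb:8.2f} MB")
```

The dense matrix grows with the square of the number of atoms (and its diagonalisation roughly with the cube), whereas the limited-memory storage grows only linearly. This is why BFGS stops being practical at some thousands of atoms while LBFGS remains usable for much larger systems.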
You may also dump the trajectory in the formats dcd or dcd_aligned_cell, see
http://manual.cp2k.org/trunk/CP2K_INPUT/MOTION/PRINT/TRAJECTORY.html#desc_FORMAT
DCD is a binary dump, which saves time and disk space for large systems. DCD files can be displayed, for instance, with vmd, but they contain no atomic information. Thus you first have to load a pdb or xyz file of your system (just one configuration, but with the same atomic order, e.g. the initial one) and then load the dcd file into this molecule. This allows the atoms and the cell to be plotted for each configuration, which is useful for variable cells.
Best,
Matthias