Copying a directory from one remote host to another

Tin Tvrtković

Mar 8, 2013, 6:08:15 PM
to ansible...@googlegroups.com
Hi all,

here's a problem that's been on my mind lately. I'm trying to set up a master-slave pair of Postgres servers using Ansible. Near the end of the process, I need to recursively copy a directory from the master to the slave. At the moment, while I still don't have an automated solution, I log into the slave and run an rsync command, entering the password manually. This is the only step I've not been able to elegantly automate so far.

I'd like to pick the community's brain for the most elegant and simple (not necessarily easy, but simple) solution. I've considered automating rsync with expect, and setting up passwordless ssh between the master and slave. The expect option feels really hacky; I'd have to ask the user for the password again (vars_prompt, or can I access the password from -k?), generate the script from a template, upload the script, make sure expect is installed, execute the script, remove the script.

I'm not entirely sure how I'd go about setting up temporary SSH access from one machine to the other. I suppose it would have to go like this: generate (or use prepared) key pair, upload the public key to one machine (using the authorized_key module), upload the private key to the other machine, execute rsync, remove both keys (don't think I'd like to leave the passwordless connection set up).
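A very rough, untested sketch of what those tasks might look like (hostnames, users, and paths here are all invented):

- name: generate a throw-away keypair on the ansible machine
  local_action: command ssh-keygen -t rsa -N "" -f /tmp/throwaway creates=/tmp/throwaway

- name: authorize the public key on the master (the host being read from)
  authorized_key: user=postgres key='{{ lookup("file", "/tmp/throwaway.pub") }}'
  delegate_to: pgmaster

- name: install the private key on the slave
  copy: src=/tmp/throwaway dest=/var/lib/pgsql/.ssh/id_throwaway owner=postgres mode=0600

- name: pull the directory across
  command: rsync -a -e 'ssh -i /var/lib/pgsql/.ssh/id_throwaway -o StrictHostKeyChecking=no' pgmaster:/var/lib/pgsql/data/ /var/lib/pgsql/data/

- name: revoke the public key on the master again
  authorized_key: user=postgres key='{{ lookup("file", "/tmp/throwaway.pub") }}' state=absent
  delegate_to: pgmaster

- name: remove the private key from the slave
  file: path=/var/lib/pgsql/.ssh/id_throwaway state=absent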

Am I missing a really simple alternative? Has anyone dealt with this issue, and how have you approached it, and with what success?

Thanks in advance.

Romeo Theriault

Mar 8, 2013, 6:24:53 PM
to ansible...@googlegroups.com
On Fri, Mar 8, 2013 at 1:08 PM, Tin Tvrtković <tinch...@gmail.com> wrote:

Am I missing a really simple alternative? Has anyone dealt with this issue, and how have you approached it, and with what success?

One possible other option, which may or may not work well depending on the size of what you need to move, is rsyncing everything up to the ansible server from the master, then rsyncing it all down to the slave.

--
Romeo

Lester Wade

Mar 8, 2013, 6:36:14 PM
to ansible...@googlegroups.com

Another alternative is to tar the directory, fetch to the ansible server and then copy back down to the other postgres server and untar it.
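For example, as an untested sketch (paths and hostnames invented) -- the first two tasks run against the master, the last two against the slave:

- name: tar up the data directory on the master
  command: tar czf /tmp/pgdata.tar.gz -C /var/lib/pgsql data

- name: fetch the tarball to the ansible server
  fetch: src=/tmp/pgdata.tar.gz dest=/tmp/fetched/
  # fetch saves under dest/<hostname>/<path>, e.g. /tmp/fetched/pgmaster/tmp/pgdata.tar.gz

- name: push the tarball down to the slave
  copy: src=/tmp/fetched/pgmaster/tmp/pgdata.tar.gz dest=/tmp/pgdata.tar.gz

- name: untar it on the slave
  command: tar xzf /tmp/pgdata.tar.gz -C /var/lib/pgsql creates=/var/lib/pgsql/data/PG_VERSION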


Matt Coddington

Mar 8, 2013, 6:38:52 PM
to ansible...@googlegroups.com
Depending on what kind of access/traffic you're willing to allow from the slave to the master, you could set up an rsyncd.conf on the master and run an rsync daemon there (lock access down to the IPs of the slaves only, or however you want to restrict things, but that works without a password). Then use the rsync protocol to pull the data down via the Ansible "command" module against the slave. You could even make it idempotent, assuming you know what files will be created. E.g.:

- name: get slave files from master
  command: rsync -a master::rsyncmodule/stuff /dir/on/slave creates=/dir/on/slave/somefile

If you're uncomfortable giving the slave that access to the master, you could do it in reverse (push from master -> slave via rsyncd running on the slave), but I'm not sure how to make that idempotent in a "simple" way.
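That reverse direction might look roughly like this (hostname and rsync module name invented), run against the slave but with the push delegated to the master:

- name: push files from master to slave over the rsync protocol
  command: rsync -a /dir/on/master/ $inventory_hostname::rsyncmodule/
  delegate_to: master

The creates= trick doesn't help there, since the command runs on the master while the files land on the slave.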

matt




Tin Tvrtković

Mar 10, 2013, 7:17:14 PM
to ansible...@googlegroups.com
Thanks for the suggestions, Romeo, Lester, Matt.

I think I'm leaning towards just tarring up the directory, fetching the tar and pushing the tar to the slave. Rsyncing master -> ansible and then ansible -> slave still runs into the issue of passwords if I'm using -k instead of ssh-agent/passwordless SSH.

The idea of temporarily popping up an rsync daemon that allows passwordless access on the master is interesting. I'd have to spawn a new rsync daemon process using a custom rsyncd.conf (also taking into consideration there might already be an rsync daemon running, for whatever reason, on the default port), do the sync, and stop the daemon. Is there an Ansible idiom for doing this sort of thing? (Starting something up in one task, then stopping it in a later task, with the thing not being a service but just a process started by shell/command.) I could use an inline shell command that would start the server up and echo the PID into a temp file, then later feed that to kill, but this is getting kind of elaborate.
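Something like this, perhaps (port, paths, and config file invented, untested):

- name: start a one-shot rsync daemon on the master, stashing its pid
  shell: nohup rsync --daemon --no-detach --port=1873 --config=/tmp/oneshot.conf & echo $! > /tmp/oneshot.pid creates=/tmp/oneshot.pid

# ... transfer tasks go here ...

- name: stop the daemon again
  shell: kill $(cat /tmp/oneshot.pid); rm -f /tmp/oneshot.pid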

Michael DeHaan

Mar 10, 2013, 8:22:57 PM
to ansible...@googlegroups.com
I'd be very tempted to just use the git module if it's not a lot of binary data.

What's the use case?

Romeo Theriault

Mar 10, 2013, 9:27:42 PM
to ansible...@googlegroups.com
On Sun, Mar 10, 2013 at 1:17 PM, Tin Tvrtković <tinch...@gmail.com> wrote:
Thanks for the suggestions, Romeo, Lester, Matt.

I think I'm leaning towards just tarring up the directory, fetching the tar and pushing the tar to the slave. Rsyncing master-ansible and then ansible-slave still runs into the issue of passwords if I'm using -k instead of ssh-agent/passwordless SSH.

Another possibility that I thought of today while driving to the grocery store :) is (temporarily?) installing paramiko on the master (or slave) server and having ansible push down a simple python script, using the script module, which does the rsync through ssh for you. That way you don't have to set up keys, etc. Just an idea.


Hmm, I wonder if there's a use-case for a paramiko ansible module?

--
Romeo

Michael DeHaan

Mar 10, 2013, 9:31:48 PM
to ansible...@googlegroups.com
I would find that a little weird.

Romeo Theriault

Mar 10, 2013, 9:47:53 PM
to ansible...@googlegroups.com
On Sun, Mar 10, 2013 at 3:31 PM, Michael DeHaan <michael...@gmail.com> wrote:
I would find that a little weird.


Well, they can't all be winners, right? ;) Though I do think the paradigm of being able to direct one minion to do something on another minion via ssh, without keys, would be a useful ability. Maybe that's just installing ansible on the minions and having ansible on the master call ansible on the minions, though. I'll need to think about it some more.

--
Romeo

Michael DeHaan

Mar 10, 2013, 9:58:05 PM
to ansible...@googlegroups.com
I had, at one point, considered creating a task-only ephemeral fileserver that required token-based access, but I thought it was a bit of a distraction and I don't want to maintain bonus security systems.

While I understand the idea of using --ask-pass and so on, this is really what locked SSH keys and ssh-agent excel at.

Michael DeHaan <mic...@ansibleworks.com>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Tin Tvrtković

Mar 11, 2013, 9:04:12 AM
to ansible...@googlegroups.com
Hi guys, thanks again for the comments.

It's not a lot of data, just a freshly initialized Postgres database. (Almost no rows, just empty tables, indexes, etc.) My particular use case, as mentioned in the OP, is setting up Postgres streaming replication (step 6: http://wiki.postgresql.org/wiki/Streaming_Replication), which involves jump-starting the replication process by making a base backup.

How would the git module help here? I'm thinking, on the master:

ensure git is installed
git init
git add
git daemon upload-archive
punch a hole in the firewall

then on the slave:

ensure git is installed
git archive

then shutdown the daemon on the master, remove firewall hole and delete the .git directory. (Or set up a script that would automatically shut it down and clean up in one minute.)

That might work. Could also work with hg and the hg serve command.
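As a rough shell sketch of the git variant (paths and hostname invented, untested):

# on the master
cd /var/lib/pgsql && git init data && (cd data && git add -A && git commit -m snapshot)
git daemon --export-all --enable=upload-archive --base-path=/var/lib/pgsql &

# on the slave
git archive --remote=git://pgmaster/data HEAD | tar xf - -C /var/lib/pgsql/data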

If we're brainstorming, wouldn't the fireball semi-daemon be good for things like this? It seems to already be able to deal with time-based auth and auto shutdown. If the ephemeral daemon could serve rsync (or scp, or a custom protocol that supports recursive directory fetching with excludes) requests, that would be just great. (Hopefully all the dependencies are available on CentOS :)

Michael DeHaan

Mar 11, 2013, 9:33:26 AM
to ansible...@googlegroups.com
FYI, fireball file transfer is, right now, very unoptimized for large file transfers.

Really the super easiest way is still:

local_action: command rsync -avz /where/from user@$ansible_hostname:/where/to

of course using keys.

We could, of course, explore that fileserver option, but I'd want to run some benchmarks. It's hard to beat rsync!

--Michael

Tin Tvrtković

Mar 11, 2013, 4:01:12 PM
to ansible...@googlegroups.com
Here's an idea: expand on the authorized_key module, and find a way to set up passwordless SSH between two nodes temporarily (during a playbook, during an Ansible run, or for a limited time like the fireball daemon?). Instead of implementing a file server, just allow the full SSH arsenal to function between nodes unhampered during deployment.

This could probably be done via normal tasks (and a handler to clean up?), but I'm thinking a module (or modules) could make this sort of thing easier and more robust.

Linking two remote nodes (node 1 needs to have access to node 2):
1. generate a temporary, throw-away SSH key pair, keep it in memory (or temp file?). No passphrase.
2. using the authorized_key module, install the public key on node 2
3. on node 1, back up the default identity if it exists, install the generated identity as the default identity
4. let the playbook(s) do their thing
5. clean up

The cleanup step is the tricky one. It should get rid of the authorized key installed at #2, delete the installed identity, and reinstate the default identity backed up in step #3. This should be fairly robust. Going by the fireball process, two temporary processes could be spawned on the hosts, with a timeout and hardcoded rules for what to do when the timeout expires. The cleanup should be performed even if the playbooks fail during step #4, or Ansible loses connectivity during any of the steps, or even if the user terminates Ansible mid-run.

Does this sound interesting or have I gone off the deep end here?

Brian Coca

Mar 11, 2013, 4:04:55 PM
to ansible...@googlegroups.com
How about using tar cz | nc on one host and, with delegate_to, nc | tar xz on the other?

--
Brian Coca
Stultorum infinitus est numerus
0110000101110010011001010110111000100111011101000010000001111001011011110111010100100000011100110110110101100001011100100111010000100001
Pedo mellon a minno

Greg Andrews

Mar 11, 2013, 5:37:08 PM
to ansible...@googlegroups.com
Having recently written a (non-Ansible) shell script to perform this kind of one-shot file transfer between remote hosts, I agree with Michael.  A temporary rsync daemon on one end and an rsync command on the other is the simplest approach that offers speed, recovery from an interrupted transfer, and ease of setup/teardown.  It's simpler than I thought it would be when I started to write my script.

To illustrate, here are the main steps my script performs on Ubuntu 10.x/12.x:
  • On the sending machine (db master), create a temporary rsync config file with a suitable name (e.g., /tmp/send-to-slave.conf).  The file has only 3 lines:
[db-image]
path = /path/to/master/image/dir
use chroot = no
  • On the sending machine, start rsync in daemon mode using the above config file and a custom port.  Note: no need for '&' at the end of the command:
rsync --daemon --port=1873 --config=/tmp/send-to-slave.conf
  • On the receiving machine (db slave), run rsync to pull the files:
rsync -av rsync://sending-machine:1873/db-image/ /path/to/slave/image/dir

When the transfer is done, an ordinary kill or pkill command works to end the rsync daemon, and a pgrep command works to verify it exited.

Since the port number is not the usual rsync port (873), there's little risk of conflict with a normal rsync service daemon.  Since the port number is above 1024, the one-shot rsync daemon doesn't have to run as root.
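The teardown can match on that custom command line, e.g.:

pkill -f 'rsync --daemon --port=1873'
pgrep -f 'rsync --daemon --port=1873' || echo "daemon gone"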

  -Greg

Michael DeHaan

Mar 11, 2013, 6:14:30 PM
to ansible...@googlegroups.com
Rsync over SSH makes more sense -- it's secure.

This is what local_action rsync already does.

Tin Tvrtković

Mar 11, 2013, 6:36:23 PM
to ansible...@googlegroups.com
Thanks for the input, guys.

Since my needs are very simple, I'm leaning towards the netcat option (very clever!). On the server side, timeout 30 tar cz . | nc -lp 2222 (maybe wrap it with commands to open a firewall port and close it), and the reverse on the client side. I don't mind it being insecure and inefficient in this case, and the amount of data is very low. I like that it's reasonably clean (no temporary files, shuts down after one request or the timeout). If I needed anything more complex I'd have gone with the temp rsync daemon, also wrapped in a timeout command.
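In task form that might look roughly like this (hostname, port, and paths invented, untested; this assumes the coreutils timeout command is available, and nc flag spellings vary between netcat variants):

- name: offer the directory once over netcat from the master
  shell: cd /var/lib/pgsql/data && timeout 30 sh -c 'tar czf - . | nc -l -p 2222'
  delegate_to: pgmaster
  async: 45
  poll: 0

- name: pull it down on the slave
  shell: cd /var/lib/pgsql/data && nc pgmaster 2222 | tar xzf - creates=/var/lib/pgsql/data/PG_VERSION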

I think this might be it, nice ideas.

Lorin Hochstein

Mar 11, 2013, 11:02:42 PM
to ansible...@googlegroups.com
Tin:

I might be missing something here, but can't you use SSH agent forwarding to achieve this?  You'd need to use the "ssh" connection instead of paramiko. 
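A minimal sketch of what that takes (assuming OpenSSH and a local ssh-agent already holding the key):

# ansible.cfg -- use the OpenSSH connection plugin instead of paramiko
[defaults]
transport = ssh

# ~/.ssh/config -- forward the local agent on to the managed hosts
Host *
    ForwardAgent yes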






--
Lorin Hochstein
Lead Architect - Cloud Services
Nimbis Services, Inc.

Jim Kleckner

Mar 13, 2013, 10:34:03 AM
to ansible...@googlegroups.com


On Sunday, March 10, 2013 6:58:05 PM UTC-7, Michael DeHaan wrote:
I had, at one point, considered creating a task-only ephemeral fileserver that required token-based access, but I thought it was a bit of a distraction and I don't want to maintain bonus security systems.

While I understand the idea of using --ask-pass and so on, this is really what locked SSH keys and ssh-agent excel at.


At the risk of hijacking a thread, another line of reasoning about the OP's use case is to provide a mechanism to supply passwords the way that ssh-agent provides tokens. Yes, it is different in that the private key never leaves the ssh-agent process.

The issue with --ask-pass is really that it is manual.

There was some discussion of integrating a keyring here:


This is a bit like the situation with secure passwords in a browser
that leads to a solution like LastPass.

Just a thought.

Michael DeHaan

Mar 13, 2013, 10:49:32 AM
to ansible...@googlegroups.com
ssh-agent and keys are great. Just saying.

Jim Kleckner

Mar 13, 2013, 10:51:44 AM
to ansible...@googlegroups.com
I prefer that as well.

I have seen some people shy away and am curious why.

Also, some people have situations where they are "fitting in" to
an existing setup where they just have passwords.

Tin Tvrtković

Mar 14, 2013, 4:48:31 AM
to ansible...@googlegroups.com
In this case, the automated setting up of keys between two remote machines (and reliably tearing the setup down afterwards) strikes me as too complex. Since we'll be using Ansible for automated integration testing (which might include fresh, virgin VM snapshots), we'd really like things to be as automated as possible. SSH agent forwarding looks great (I wasn't familiar with it until the poster above mentioned it), but as far as I can see it requires both key auth and ssh-agent (plus some additional configuration), which I can't necessarily impose on either our integration test setup or on the person who'll eventually be executing the deployment playbooks on client premises.

I realize passwords are less secure and more cumbersome to use, but they simply fit our current requirements better. Yes, there are things we could do to transition away from using passwords (and maybe we'll do them in the future), but right now, passwords work for us.

Michael DeHaan

Mar 14, 2013, 9:45:40 AM
to ansible...@googlegroups.com
The solution is to not tear those keys down :)

ssh-copy-id is a pretty useful program, though it's even better to deploy keys at provisioning time, which all cloud solutions make /super/ easy.
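For reference, the usual invocation is just:

ssh-copy-id -i ~/.ssh/id_rsa.pub user@newhost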



Tin Tvrtković

Mar 14, 2013, 10:08:52 AM
to ansible...@googlegroups.com
I see your point; just as now we have VM images with a preset password, we could have VM images that trust a particular key, or deploy a particular key using the password as the first step. We just don't at the moment, due to lack of resources/education/whatever.

As for tearing down keys, if we're talking about ansible -> hosts communication, sure, I'd be fine with leaving the hosts trusting a particular key. If we're talking remote machine -> remote machine (i.e. Postgres slave -> Postgres master), I'd really like those two machines to not have a passwordless SSH connection between them outside of the short deployment time.