t11 = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    endpoint="http://192.168.1.10:8300/services/myapi?",
    data={"startTime": "2016-03-01T00:00:00",
          "endTime": "2016-03-01T01:00:00"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    response_check=lambda response: True if len(response.json()) == 0 else False,
    dag=dag)
[2016-04-21 16:53:56,669] {http_operator.py:57} INFO - Calling HTTP method
[2016-04-21 16:53:56,673] {base_hook.py:53} INFO - Using connection to: https://www.google.com/
[2016-04-21 16:53:56,675] {http_hook.py:63} INFO - Sending 'GET' to url: https://www.google.com/http://192.168.1.10:8300/services/myapi
t11 = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    http_conn_id='http://192.168.1.10:8300',
    endpoint="/generate/pipeline",
    data={"startTime": "2016-03-01T00:00:00",
          "endTime": "2016-03-01T01:00:00"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    response_check=lambda response: True if len(response.json()) == 0 else False,
    dag=dag)
I get this:
[2016-04-21 17:10:58,507] {models.py:1108} ERROR - The conn_id `http://192.168.59.103:8900` isn't defined
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1067, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow/operators/http_operator.py", line 61, in execute
    self.extra_options)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/http_hook.py", line 45, in run
    session = self.get_conn(headers)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/http_hook.py", line 24, in get_conn
    conn = self.get_connection(self.http_conn_id)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/base_hook.py", line 51, in get_connection
    conn = random.choice(cls.get_connections(conn_id))
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/base_hook.py", line 39, in get_connections
    "The conn_id `{0}` isn't defined".format(conn_id))
AirflowException: The conn_id `http://192.168.59.103:8900` isn't defined
[2016-04-21 17:10:58,514] {models.py:1144} ERROR - The conn_id `http://192.168.59.103:8900` isn't defined
http_conn_id should be the ID of an Airflow connection (not a URL), as described here:
http://pythonhosted.org/airflow/concepts.html#connections
Max
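To make the two failures above concrete, here is a minimal standard-library sketch of how the lookup and URL building appear to behave, judging only from the logs and traceback in this thread. It is a simplified model, not Airflow's actual code: the CONNECTIONS dict stands in for the connections defined under Admin -> Connections, build_url is a hypothetical helper, and my_api is a hypothetical conn_id that would have to be created first.

```python
# Stand-in for the connections managed in the Airflow UI.
# "http_default" uses the host seen in the first log; "my_api" is hypothetical.
CONNECTIONS = {
    "http_default": "https://www.google.com/",
    "my_api": "http://192.168.1.10:8300",
}


def build_url(http_conn_id, endpoint):
    """Look up the base URL by conn_id, then append the endpoint to it."""
    if http_conn_id not in CONNECTIONS:
        raise LookupError(
            "The conn_id `{0}` isn't defined".format(http_conn_id))
    return CONNECTIONS[http_conn_id] + endpoint


# First attempt: a full URL passed as the endpoint gets appended
# to the default connection's host.
wrong = build_url("http_default", "http://192.168.1.10:8300/services/myapi")

# Second attempt: a URL passed as the conn_id fails the lookup.
try:
    build_url("http://192.168.1.10:8300", "/generate/pipeline")
    lookup_error = None
except LookupError as exc:
    lookup_error = exc.args[0]

# Intended usage: conn_id names a defined connection, endpoint is only the path.
right = build_url("my_api", "/services/myapi")

print(wrong)
print(lookup_error)
print(right)
```

In the DAG, this would correspond to passing http_conn_id='my_api' and endpoint='/services/myapi' to SimpleHttpOperator, assuming a connection with that ID has been defined with the API's host.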
The connection information to external systems is stored in the Airflow metadata database and managed in the UI (Menu -> Admin -> Connections). A conn_id is defined there, with hostname / login / password / schema information attached to it. Airflow pipelines can simply refer to the centrally managed conn_id without having to hard-code any of this information anywhere.
Many connections with the same conn_id can be defined. When that is the case, and when the hooks use the get_connection method from BaseHook, Airflow will choose one connection randomly, allowing for some basic load balancing and fault tolerance when used in conjunction with retries.
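The random selection described above can be sketched as follows. This is an illustration only, modeled on the random.choice(cls.get_connections(conn_id)) line visible in the traceback. The list of dicts, the host names, and the two helper functions are hypothetical stand-ins, not Airflow's implementation.

```python
import random

# Hypothetical connection rows; two of them share the same conn_id.
CONNECTIONS = [
    {"conn_id": "postgres_master", "host": "db1.example.com"},
    {"conn_id": "postgres_master", "host": "db2.example.com"},
    {"conn_id": "http_default", "host": "https://www.google.com/"},
]


def get_connections(conn_id):
    """Return every connection registered under conn_id."""
    matches = [c for c in CONNECTIONS if c["conn_id"] == conn_id]
    if not matches:
        raise LookupError("The conn_id `{0}` isn't defined".format(conn_id))
    return matches


def get_connection(conn_id):
    """Pick one of the matching connections at random."""
    return random.choice(get_connections(conn_id))


# Repeated lookups (for example across task retries) may land on either host.
picked = {get_connection("postgres_master")["host"] for _ in range(200)}
print(sorted(picked))
```

Because each lookup picks independently, a retry after a failure has a chance of reaching a different host, which is where the basic fault tolerance comes from.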
Airflow also has the ability to reference connections via environment variables from the operating system. The environment variable needs to be prefixed with AIRFLOW_CONN_ to be considered a connection. When referencing the connection in the Airflow pipeline, the conn_id should be the name of the variable without the prefix. For example, if the conn_id is named POSTGRES_MASTER, the environment variable should be named AIRFLOW_CONN_POSTGRES_MASTER. Airflow assumes the value returned from the environment variable to be in a URI format (e.g. postgres://user:password@localhost:5432/master).
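As a rough illustration of the naming convention and URI format just described, the sketch below sets such a variable and splits its value into connection fields using only the standard library. The helpers env_var_for and parse_conn_uri are hypothetical, and the field names are chosen for readability. Airflow's own parsing may differ in detail.

```python
import os
from urllib.parse import urlsplit


def env_var_for(conn_id):
    """Build the environment variable name for a conn_id."""
    return "AIRFLOW_CONN_" + conn_id


def parse_conn_uri(uri):
    """Split a connection URI into its parts (illustrative field names)."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,
        "login": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "schema": parts.path.lstrip("/"),
    }


# Define the connection through the environment, using the example from the text.
os.environ[env_var_for("POSTGRES_MASTER")] = (
    "postgres://user:password@localhost:5432/master")

# Read it back and split it into fields.
fields = parse_conn_uri(os.environ["AIRFLOW_CONN_POSTGRES_MASTER"])
print(fields)
```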