HTTP Operator always uses www.google.com?


r0ger

Apr 21, 2016, 1:05:36 PM
to Airflow
I am trying to call my API service using SimpleHttpOperator. But why does it always call Google? :)


t11 = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    endpoint="http://192.168.1.10:8300/services/myapi?",
    data={"startTime": "2016-03-01T00:00:00",
          "endTime": "2016-03-01T01:00:00"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    response_check=lambda response: len(response.json()) == 0,
    dag=dag)



[2016-04-21 16:53:56,669] {http_operator.py:57} INFO - Calling HTTP method
[2016-04-21 16:53:56,673] {base_hook.py:53} INFO - Using connection to: https://www.google.com/
[2016-04-21 16:53:56,675] {http_hook.py:63} INFO - Sending 'GET' to url: https://www.google.com/http://192.168.1.10:8300/services/myapi
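That last log line hints at what is happening: with no http_conn_id set, the operator seems to fall back to the default http_default connection (which points at https://www.google.com/, as the "Using connection to" line shows), and the hook simply prepends that host to whatever is passed as endpoint. A rough, simplified sketch of that concatenation (not the actual hook code):

# Simplified illustration of how the URL in the log line above gets built.
# Assumption: http_default is the fallback connection and its host is
# https://www.google.com/, as the "Using connection to" line suggests.
base_url = "https://www.google.com/"                    # host of http_default
endpoint = "http://192.168.1.10:8300/services/myapi"    # value passed as endpoint
print(base_url + endpoint)
# -> https://www.google.com/http://192.168.1.10:8300/services/myapi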

r0ger

Apr 21, 2016, 1:12:41 PM
to Airflow
When I use the below:

t11 = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    http_conn_id='http://192.168.1.10:8300',
    endpoint="/generate/pipeline",
    data={"startTime": "2016-03-01T00:00:00",
          "endTime": "2016-03-01T01:00:00"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    response_check=lambda response: len(response.json()) == 0,
    dag=dag)


I get this:

[2016-04-21 17:10:58,507] {models.py:1108} ERROR - The conn_id `http://192.168.59.103:8900` isn't defined
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1067, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow/operators/http_operator.py", line 61, in execute
    self.extra_options)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/http_hook.py", line 45, in run
    session = self.get_conn(headers)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/http_hook.py", line 24, in get_conn
    conn = self.get_connection(self.http_conn_id)
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/base_hook.py", line 51, in get_connection
    conn = random.choice(cls.get_connections(conn_id))
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/base_hook.py", line 39, in get_connections
    "The conn_id `{0}` isn't defined".format(conn_id))
AirflowException: The conn_id `http://192.168.59.103:8900` isn't defined
[2016-04-21 17:10:58,514] {models.py:1144} ERROR - The conn_id `http://192.168.59.103:8900` isn't defined

Maxime Beauchemin

Apr 21, 2016, 3:10:19 PM
to Airflow

r0ger

Apr 21, 2016, 9:10:23 PM
to Airflow
Aaah, that's helpful. Thanks a lot.
Is this only a UI feature? Can I create new connections at run time, i.e., in the code itself?

Maxime Beauchemin

Apr 21, 2016, 9:13:15 PM
to Airflow
http://pythonhosted.org/airflow/concepts.html#connections

Connections

The connection information to external systems is stored in the Airflow metadata database and managed in the UI (Menu -> Admin -> Connections). A conn_id is defined there, and hostname / login / password / schema information is attached to it. Airflow pipelines can simply refer to the centrally managed conn_id without having to hard-code any of this information anywhere.

Many connections with the same conn_id can be defined; when that is the case, and when the hooks use the get_connection method from BaseHook, Airflow will choose one connection randomly, allowing for some basic load balancing and fault tolerance when used in conjunction with retries.

Airflow also has the ability to reference connections via environment variables from the operating system. The environment variable needs to be prefixed with AIRFLOW_CONN_ to be considered a connection. When referencing the connection in the Airflow pipeline, the conn_id should be the name of the variable without the prefix. For example, if the conn_id is named POSTGRES_MASTER the environment variable should be named AIRFLOW_CONN_POSTGRES_MASTER. Airflow assumes the value returned from the environment variable to be in a URI format (e.g. postgres://user:password@localhost:5432/master).
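To tie that back to the snippets above: once a connection is defined in Admin -> Connections, the operator references it by name and keeps the endpoint relative to the connection's host. A minimal sketch, assuming a connection named my_api with Conn Type HTTP and Host http://192.168.1.10:8300 has been created there (the conn_id is made up for illustration):

from airflow.operators.http_operator import SimpleHttpOperator

# Sketch only: 'my_api' is a hypothetical conn_id created in Admin -> Connections
# with Host http://192.168.1.10:8300; `dag` is the DAG defined earlier in the file.
t11 = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    http_conn_id='my_api',          # the connection's name, not a URL
    endpoint='/services/myapi',     # path relative to the connection's host
    data={"startTime": "2016-03-01T00:00:00",
          "endTime": "2016-03-01T01:00:00"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    response_check=lambda response: len(response.json()) == 0,
    dag=dag)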
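And a hypothetical sketch of the environment-variable route described above, which is also one way to define connections without touching the UI:

import os

from airflow.operators.http_operator import SimpleHttpOperator

# Hypothetical name and host. Normally the variable would be exported in the
# shell environment of the scheduler/webserver/workers, e.g.
#   export AIRFLOW_CONN_MY_API='http://192.168.1.10:8300'
# Setting it from Python here just keeps the sketch self-contained.
os.environ['AIRFLOW_CONN_MY_API'] = 'http://192.168.1.10:8300'

# The task refers to the connection by the name without the AIRFLOW_CONN_ prefix;
# no row in Admin -> Connections is needed.
t12 = SimpleHttpOperator(
    task_id='get_via_env_conn',
    method='GET',
    http_conn_id='MY_API',
    endpoint='/services/myapi',
    dag=dag)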

r0ger

Apr 21, 2016, 9:23:03 PM
to Airflow
Awesome, so any env variable with the "AIRFLOW_CONN_" prefix would work!
Thanks. Sorry, I don't know why I did not look at the link in the first place. Thanks, Maxime.