Cannot connect to Memorystore from Dataflow


thomas.o...@gmail.com

Jul 15, 2019, 10:45:31 AM
to Google Cloud Memorystore Discuss
Hi,
I'm attempting to use GCP Memorystore to handle session IDs for an event streaming job running on GCP Dataflow. The job fails with a timeout when trying to connect to Memorystore:

[error] redis.clients.jedis.exceptions.JedisConnectionException: Failed connecting to host 10.0.0.4:6379
[error] at redis.clients.jedis.Connection.connect(Connection.java:207)
[error] at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:101)
[error] at redis.clients.jedis.Connection.sendCommand(Connection.java:126)
[error] at redis.clients.jedis.Connection.sendCommand(Connection.java:117)
[error] at redis.clients.jedis.Jedis.get(Jedis.java:155)

My Memorystore instance has these properties:

Version is 4.0
Authorized network is default-auto
Master is in us-central1-b. Replica is in us-central1-a.
Connection properties: IP address: 10.0.0.4, Port number: 6379 
> gcloud redis instances list --region us-central1
INSTANCE_NAME  VERSION    REGION       TIER         SIZE_GB  HOST      PORT  NETWORK       RESERVED_IP  STATUS  CREATE_TIME
memorystore    REDIS_4_0  us-central1  STANDARD_HA  1        10.0.0.4  6379  default-auto  10.0.0.0/29  READY   2019-07-15T11:43:14
 
My Dataflow job has these properties:

runner: org.apache.beam.runners.dataflow.DataflowRunner
zone: us-central1-b
network: default-auto
> gcloud dataflow jobs list   
JOB_ID                                    NAME                        TYPE       CREATION_TIME        STATE      REGION
2019-06-17_02_01_36-3308621933676080017   eventflow                   Streaming  2019-06-17 09:01:37  Running    us-central1

My "default" network could not be used since it is a legacy network, which Memorystore would not accept. I failed to find a way to upgrade the default network from legacy to auto and did not want to delete the existing default network since this would require messing with production services. Instead I created a new network "default-auto" of type auto, with the same firewall rules as the default network. The one I believe is relevant for my Dataflow job is this:

Name: default-auto-internal
Type: Ingress
Targets: Apply to all
Filters: IP ranges: 10.0.0.0/20
Protocols/ports: 
  tcp:0-65535
  udp:0-65535
  icmp
Action: Allow
Priority: 65534

I can connect to Memorystore using "telnet 10.0.0.4 6379" from a Compute Engine instance.
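For anyone who wants to run the same reachability check from JVM code instead of telnet, here is a minimal sketch using only the Java standard library. The class name RedisReachability and the 3-second timeout are my own choices for illustration; the host and port are the ones listed above. It only tests that a TCP connection opens and does not speak the Redis protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Jedis, Lettuce, or Beam.
class RedisReachability {

    // Returns true if a TCP connection to host:port opens within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port as reported for the Memorystore instance above.
        System.out.println(canConnect("10.0.0.4", 6379, 3000));
    }
}
```

A false result only says that the machine running this code cannot open the port; it does not say anything about other machines in the network.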

Things I have tried, which did not change anything:
- Switched Redis library, from Jedis 2.9.3 to Lettuce 5.1.7
- Deleted and re-created the Memorystore instance

Is Dataflow not supposed to be able to connect to Memorystore, or am I missing something?

thomas.o...@gmail.com

Jul 16, 2019, 11:55:16 AM
to Google Cloud Memorystore Discuss
Figured it out. I was trying to connect to Memorystore from code called directly from the main method of my Dataflow job. Connecting from code running in a Dataflow step worked. On second thought (well, actually more like 1002nd thought) this makes sense, because main() runs on the driver machine (my desktop in this case) whereas the steps of the Dataflow graph run on GCP. I have confirmed this theory by connecting to Memorystore on localhost:6379 in my main(). This works since I have an SSH tunnel to Memorystore running on port 6379 (using this trick).
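One way to see the driver/worker split for yourself is to log the hostname from both places. This is a rough sketch with the standard library only; ExecutionSite and describe() are names I made up. The assumption is that you call describe() once from main() and once from inside a pipeline step, then compare the two log lines.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not a Beam or Dataflow API.
class ExecutionSite {

    // Builds a log line naming the machine that executes the calling code.
    static String describe(String label) {
        String host;
        try {
            host = InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Fall back to a placeholder if the local hostname cannot be resolved.
            host = "unknown-host";
        }
        return label + " runs on host " + host;
    }

    public static void main(String[] args) {
        // Called here, this reports the machine that launches the job.
        System.out.println(describe("main()"));
    }
}
```

If the line logged from main() names your desktop and the line logged from a step names a different machine, any connection opened in main() is being attempted from outside the VPC.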

Thomas

Machindra Rithe

Jan 26, 2022, 8:43:29 AM
to Google Cloud Memorystore Discuss
Hi Thomas,

Thanks for the solution. I am getting a timeout error when trying to connect to Redis Memorystore from a Dataflow step. The Memorystore instance and the Dataflow job are in the same VPC and the same project, but I am still facing the timeout issue.
Thanks in advance.

Regards,
Machindra Rithe

Thomas Oldervoll

Jan 26, 2022, 9:25:52 AM
to Google Cloud Memorystore Discuss
Is your VPC network a legacy network, by any chance? I had to create a new "auto" VPC because my existing VPC (that we have used for 6 years) was of the old "legacy" type. Redis Memorystore does not support legacy networks. For me the fix was to create a new VPC and connect both Redis and the Dataflow jobs to this VPC. See last line of https://cloud.google.com/memorystore/docs/redis/networking.

Thomas


