MoveTable from unmanaged mysql to vitess managed keyspace

247 views
Skip to first unread message

Yohan Bismuth

unread,
Sep 15, 2020, 10:42:42 AM9/15/20
to vitess
Hello,
I'm trying to switch from a single mysql server to a new vitess managed cluster.

I have two keyspaces available.
The first one is named brandnew, which is a new keyspace started with the script 101_initial_cluster.sh and a custom vschema.json and sql schema. 

The second one is called legacy and is created following this documentation : https://vitess.io/docs/user-guides/unmanaged-tablet/
It is an unmanaged mysql server containing my data.

I've been trying to use the MoveTable process from the legacy keyspace to the brandnew keyspace in order to migrate my data from a regular mysql server to a vitess cluster.

Here is my vttablet state:
+-------+----------+-------+------------+---------+------------------+----------+
| Cell  | Keyspace | Shard | TabletType | State   | Alias            | Hostname |
+-------+----------+-------+------------+---------+------------------+----------+
| zone1 | brandnew     | 0     | MASTER     | SERVING | zone1-0000000100 | BXDGK13  |
| zone1 | brandnew     | 0     | REPLICA    | SERVING | zone1-0000000101 | BXDGK13  |
| zone1 | brandnew     | 0     | RDONLY     | SERVING | zone1-0000000102 | BXDGK13  |
| zone1 | legacy   | 0     | MASTER     | SERVING | zone1-0000000401 | BXDGK13  |
+-------+----------+-------+------------+---------+------------------+----------+

I run the MoveTable process like this:
 vtctlclient MoveTables -tablet_types=master -workflow=legacy2brandnew legacy brandnew '{<my tables list>}'

I've got some errors I don't understand like 
E0915 16:20:45.480627    8237 main.go:67] remote error: rpc error: code = Unknown desc = source table TRM_VALUESET_C_DESIGNATIONES_REINDEX_ARCHIVED does not exist

I dont understand the error because the table exists in legacy keyspace, but anyway I gave it a try and it seemed to work (at least for some other tables I was interested in).
Example:
mysql brandnew -e 'select ID from RESOURCE LIMIT 1';
+--------+
| ID |
+--------+
|      1 |
+--------+

So now trouble begins. I kept following the documentation and ran the following commands:
vtctlclient SwitchReads -tablet_type=rdonly brandnew.legacy2brandnew 
vtctlclient SwitchReads -tablet_type=replica brandnew.legacy2brandnew
vtctlclient SwitchWrites brandnew.legacy2brandnew
vtctlclient DropSources brandnew.legacy2brandnew

All of them ended up with the following error:
I0915 16:26:16.160432    8453 trace.go:151] successfully started tracing with [noop]
E0915 16:26:16.174098    8453 main.go:64] E0915 14:26:16.171597 traffic_switcher.go:112] buildTrafficSwitcher failed: no streams found in keyspace brandnew for: legacy2brandnew
E0915 16:26:16.175093    8453 main.go:67] remote error: rpc error: code = Unknown desc = no streams found in keyspace brandnew for: legacy2brandnew

Then I shutdown my remote unmanaged mysql server and try the same request as before:
mysql brandnew -e 'select ID from RESOURCE LIMIT 1';

which results with error because the unmanaged server was shut down.

E0915 16:29:00.740713    7574 tabletserver.go:1522] net.Dial(127.0.0.1:4306) failed: dial tcp 127.0.0.1:4306: connect: connection refused (errno 2003) (sqlstate HY000) (CallerID: userData1): Sql: "select ID from RESOURCE limit :vtg1", BindVars: {#maxLimit: "type:INT64 value:\"10001\" "vtg1: "type:INT64 value:\"1\" "}
I0915 16:29:00.741149    7574 tabletserver.go:693] Check MySQL failed. Shutting down query service
I0915 16:29:00.741214    7574 tabletserver.go:332] TabletServer state: SERVING -> SHUTTING_DOWN (Shutting Down)
I0915 16:29:00.741251    7574 tabletserver.go:593] Executing complete shutdown.
I0915 16:29:00.741334    7574 tx_engine.go:407] No grace period specified: performing normal wait.
I0915 16:29:00.742552    7574 tabletserver.go:601] Shutdown complete.
I0915 16:29:00.742600    7574 tabletserver.go:332] TabletServer state: SHUTTING_DOWN (Shutting Down) -> NOT_SERVING (Not Connected)
ERROR 2003 (HY000) at line 1: vtgate: http://BXDGK13:15001/: target: legacy.0.master, used tablet: zone1-401 (BXDGK13): vttablet: rpc error: code = Unknown desc = net.Dial(127.0.0.1:4306) failed: dial tcp 127.0.0.1:4306: connect: connection refused (errno 2003) (sqlstate HY000) (CallerID: userData1): Sql: "select ID from RESOURCE limit :vtg1", BindVars: {#maxLimit: "type:INT64 value:\"10001\" "vtg1: "type:INT64 value:\"1\" "}

I tried to reset the Routing Rules with vtctlclient ApplyRoutingRules -rules='{}'
After that, I don't have errors anymore (since queries aren't routed anymore to legacy server), but I can't retrieve any data. So i guess no data was copied from legacy to brandnew keyspace, which was my initial goal.

Do you have any idea what I'm doing wrong or misunderstood ?

Thank you

Rohit Nayak

unread,
Sep 15, 2020, 3:42:04 PM9/15/20
to vitess
Hi, which version of Vitess are you using?

Overall you have understood it perfectly! Maybe there is some issue with the streams. To see if there is a problem with the vreplication streams, can you check their status (and paste the result?) using one of  :

vtctlclient Workflow brandnew.legacy2brandnew show 

vtctlclient VExec brandnew.legacy2brandnew "select * from _vt.vreplication"

The message column is used to store any errors. The pos column tracks the last gtid of the source which got vreplicated.

Yohan Bismuth

unread,
Sep 16, 2020, 5:15:48 AM9/16/20
to vitess
Hello, thank you for your answer.
I've tried it with vitess-6.0.20 and vitess 7. I had the same results in both cases.

I can't run your commands, they also end up with errors:

vtctlclient Workflow brandnew.legacy2brandnew show 
I0916 11:13:39.821929    7618 trace.go:151] successfully started tracing with [noop]
Unknown command: Workflow
E0916 11:13:39.823194    7618 main.go:67] remote error: rpc error: code = Unknown desc = unknown command

same error with the other command : vtctlclient VExec brandnew.legacy2brandnew "select * from _vt.vreplication"
I0916 11:14:43.239182    7679 trace.go:151] successfully started tracing with [noop]
Unknown command: VExec
E0916 11:14:43.245277    7679 main.go:67] remote error: rpc error: code = Unknown desc = unknown command

Rohit Nayak

unread,
Sep 16, 2020, 6:00:06 AM9/16/20
to vitess
Hmm. Are you trying with a local install or docker? Does the local_example: test/local_example.sh run to completion on your setup?

Can you check which version has been built using: vtctld -version

Yohan Bismuth

unread,
Sep 16, 2020, 10:26:06 AM9/16/20
to vitess
I'm using a local install with the latest binary release available on https://github.com/vitessio/vitess/releases

I cloned the repository since local_example.sh isn't par of the tar.gz
Then I tried to run ./test/local_example.sh but it's stuck at step 1 because of password issue I think:
+ ./101_initial_cluster.sh
add /vitess/global
add /vitess/zone1
add zone1 CellInfo
etcd start done...
Starting vtctld...
Starting MySQL for tablet zone1-0000000100...
Resuming from existing vttablet dir:
    /home/yohanbismuth/lifen/git/vitess/vtdataroot/vt_0000000100

And in the logs : 2020-09-16T14:20:09.769853Z 279 [Note] Access denied for user 'vt_dba'@'localhost' (using password: NO)

Here is the output of vtctld -version

Version: 9438d77 (Git branch 'HEAD') built on Wed Jun 24 21:21:50 UTC 2020 by runner @ fv-az52 using go1.13.12 linux/amd64

Rohit Nayak

unread,
Sep 17, 2020, 3:52:22 AM9/17/20
to vitess
The users should be created by the scripts themselves. I see the "Resuming from existing vttablet dir ..." line. Please delete everything under the vtdataroot directory try again. Otherwise it uses the previous state of the tablet which may have been inconsistent.

Also the version I see after downloading the latest release (7.0.1) is 
Version: 19c92a5 (Git branch 'HEAD') built on Tue Aug 25 16:28:38 UTC 2020 

You seem to have the 6.x version where VExec/Workflow were not available.

Yohan Bismuth

unread,
Sep 17, 2020, 6:36:14 AM9/17/20
to vitess
I cleaned everything in vtdataroot/*10* and reran the example.
local_example.sh works fine and completes.

Also,
you were right I was still using 6.x version (I moved the 7.x bins to the wrong path). I fixed it and ran your commands again :

vtctlclient Workflow brandnew.legacy2brandnew show
E0917 12:19:53.344066    5194 main.go:67] remote error: rpc error: code = Unknown desc = invalid action found: show
=> It seems the only available actions are start/stop/delete

vtctlclient VExec brandnew.legacy2brandnew "select * from _vt.vreplication"
E0917 12:18:25.223656    5129 main.go:67] remote error: rpc error: code = Unknown desc = 
no matching streams found for workflow legacy2brandnew, tablet cell:"zone1" uid:100 , query select * from _vt.vreplication where db_name = 'vt_brandnew' and workflow = 'legacy2brandnew'

Yohan Bismuth

unread,
Sep 18, 2020, 4:09:21 AM9/18/20
to vitess
I think I managed to make it work. I changed two things. 
I was setting my vschema before doing my migration, so I stopped doing that, like in the example. Also, I did run workflow only on tables actually containing data (I was previously trying to run it on every tables with no distinction)

Thank you for your help

Rohit Nayak

unread,
Sep 18, 2020, 7:00:23 AM9/18/20
to vitess
That is great! 

So does MoveTables stall if there are empty tables? 

Yohan Bismuth

unread,
Sep 18, 2020, 12:03:10 PM9/18/20
to vitess
I guess. The routing rules worked just fine, but the data copy didn't happen.
Reply all
Reply to author
Forward
0 new messages