Hi Marc,
GPT-3 has been trained on a huge dataset scraped from the internet. It has picked up an implicit understanding of entities and relationships without being given any specific context. It's a statistical model, so the output is not guaranteed to be correct, but the expectation is that it gets stronger as it is trained on more data, to the point that it might eventually be comparable to human expertise. If you search Twitter for GPT-3, you will find many fascinating examples of people building automatic code generators with it.
I believe that if we fine-tune the model with enough examples from many schemas, it will have a better chance of inferring the Gremlin structure that corresponds to a given statement. If I prime the model with the following examples:
"""
What is the full name of Alice?
g.V().hasLabel(<object>).has(<identifier>, 'Alice').values(<result>);
result="full name"
<end>
"""
Who is the boss of Alice?
g.V().hasLabel(<object>).has(<identifier>, 'Alice').in(<relationship>).values('name');
relationship="is the boss"
<end>
"""
Who is the manager of Bob?
g.V().hasLabel(<object>).has(<identifier>, 'Bob').in(<relationship>).values('name');
relationship="is the manager"
<end>
"""
What is the birthday of Alice?
g.V().hasLabel(<object>).has(<identifier>, 'Alice').values(<result>);
result="birthday"
<end>
"""
What is the current salary of Charlie?
g.V().hasLabel(<object>).has(<identifier>, 'Charlie').values(<result>);
result="current salary"
<end>
"""
Then when I ask the question:
What is the first name of Mrs. Jackson?
It replies with:
g.V().hasLabel(<object>).has(<identifier>, 'Mrs. Jackson').values(<result>);
result="first name"
So even with a tiny set of examples it already starts to pick up the pattern. If we could feed it more complex examples, the model would become much stronger.
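As a rough sketch of how this could be wired up, the Python below assembles the prompt from examples like the ones above and then fills the placeholders in the model's reply. The names build_prompt, fill_template, SCHEMA and DEFAULTS are made up for illustration, and the mapping from "first name" to a property key such as firstName is an assumption about a schema that is not described here. No GPT-3 call is made; the reply is pasted in as a string, with straight quotes so that the result is a valid Gremlin string literal.

```python
import re

# Two of the few-shot examples from the message: (question, template, slot line).
EXAMPLES = [
    ("What is the full name of Alice?",
     "g.V().hasLabel(<object>).has(<identifier>, 'Alice').values(<result>);",
     'result="full name"'),
    ("Who is the boss of Alice?",
     "g.V().hasLabel(<object>).has(<identifier>, 'Alice').in(<relationship>).values('name');",
     'relationship="is the boss"'),
]

SEPARATOR = '"""'  # line that opens each example in the prompt
END = "<end>"      # marker that closes each example


def build_prompt(examples, question):
    """Join the examples in the layout used above and append the new question."""
    parts = []
    for example_question, template, slot in examples:
        parts += [SEPARATOR, example_question, template, slot, END]
    parts += [SEPARATOR, question]
    return "\n".join(parts)


def fill_template(reply, schema, defaults):
    """Replace <placeholders> in the reply's query with quoted schema names."""
    # Everything up to the first ';' is the query; the rest holds slot lines.
    query, _, rest = reply.partition(";")
    # Slot lines look like: result="first name"
    slots = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    values = dict(defaults)
    for name, phrase in slots.items():
        values[name] = schema[phrase]  # KeyError if the phrase is not mapped

    def substitute(match):
        return "'" + values[match.group(1)] + "'"

    return re.sub(r"<(\w+)>", substitute, query) + ";"


# Hypothetical schema mapping and defaults; these are assumptions, not part of the message.
SCHEMA = {"first name": "firstName"}
DEFAULTS = {"object": "person", "identifier": "name"}

# The reply quoted above, pasted in as a string.
REPLY = ("g.V().hasLabel(<object>).has(<identifier>, 'Mrs. Jackson').values(<result>);\n"
         'result="first name"')

print(build_prompt(EXAMPLES, "What is the first name of Mrs. Jackson?"))
print(fill_template(REPLY, SCHEMA, DEFAULTS))
```

Keeping the schema lookup outside the model means GPT-3 only has to recognise the shape of the question, while the mapping from phrases to real property names stays something we control.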
I have also experimented with a sort of discourse representation structure, and with some form of semantic understanding. The results were quite promising too. With the prompt:
Q: What is the birthday of Alice?
A: Alice is a person. Alice has a birthday. What is the birthday of Alice?
Q: What is the current salary of Robert?
A: Robert is a person. Robert has a salary. Of all salaries, Robert has a current salary. What is the current salary of Robert?
Q: Who is the boss of Charlie?
A: Charlie is a person. Charlie has a boss. Charlie's boss is a person. What is the name of Charlie's boss?
Q: Who is the director of Pulp Fiction?
A: Pulp Fiction is a movie. Pulp Fiction has a director. Pulp Fiction's director is a person. What is the name of the director of Pulp Fiction?
Q: What are the tasks of Douglas?
A: Douglas is a person. Douglas has tasks assigned. Douglas' assigned tasks have titles. What are the titles of Douglas' tasks?
Q: How many people work on team Yankee?
A: Yankee is a team. The team Yankee has people working on it. What is the number of people working on team Yankee?
Then, when I ask new questions, it generates the answers by itself:
Q: What are the sizes of all teams in department Delta?
A: Delta is a department. The department Delta has teams in it. All teams in department Delta have sizes. What are the sizes of all teams in department Delta?
Q: What is the most important task I need to work on?
A: I am a person. I have tasks assigned. I need to work on tasks. What is the most important task I need to work on?
Q: Who is my next appointment with?
A: I am a person. I have appointments scheduled. I have an appointment scheduled with someone. Who is my next appointment with?
Q: What are Josephine's main skills?
A: Josephine is a person. Josephine has skills. What are Josephine's main skills?
Q: What is the meaning of life?
A: Life is a concept. Life has a meaning. What is the meaning of life?
Q: What is the last thing I talked to with Mary?
A: I am a person. I talked to someone. I talked to someone last. What is the last thing I talked to with Mary?
Not 100%, but quite impressive, imho.
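To work with these answers programmatically, a minimal sketch (assuming every answer is a run of statements ending in "." followed by a restated question ending in "?") could split them apart like this. The helper names split_answer and is_echo are made up for illustration, and the echo check is only one possible sanity test: it flags answers whose final question simply repeats the input, rather than rephrasing it the way the "boss of Charlie" example does.

```python
import re


def split_answer(answer):
    """Split an answer into its declarative statements and its question(s)."""
    # Break after each '.' or '?' that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.?])\s+", answer.strip()) if s.strip()]
    facts = [s for s in sentences if s.endswith(".")]
    questions = [s for s in sentences if s.endswith("?")]
    return facts, questions


def is_echo(question, answer):
    """True if the answer's last question is just the input question repeated."""
    _, questions = split_answer(answer)
    return bool(questions) and questions[-1] == question.strip()


answer = ("Delta is a department. The department Delta has teams in it. "
          "All teams in department Delta have sizes. "
          "What are the sizes of all teams in department Delta?")
facts, questions = split_answer(answer)
print(facts)
print(questions)
print(is_echo("What are the sizes of all teams in department Delta?", answer))
```

The simple split would break on abbreviations such as "Mrs.", so it is only suitable for answers shaped like the ones above.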