PK improvements

Sean Colsen

unread,

Feb 19, 2025, 12:07:19 PMFeb 19

to Mathesar Developers

Team,

During today’s product meeting we decided to work on improving Mathesar’s support for various PK configurations as part of 0.2.2. After the meeting, I spent a little time thinking this through, in an effort to move this initiative forward.

Here’s what I would suggest:

Proposal

Aim to do the following for 0.2.2:

Allow customizing the PK options when creating tables from Mathesar

Possible new functionality:
- Customize the type of the PK column (e.g. TEXT, UUID)
- Customize the name of the PK column
- Customize the default value of the PK column (e.g. manually-generated)
- Choose an existing column from an import as the PK
- Create the table with no PK
Possible entry points:
- Create table via import
- Create table from scratch
- Create many-to-many relationship (maybe? …probably not worth it)
Relevant threads: discussion, issue
Improve support for manually-generated PKs
- Allow inserting records (issue)
- Allow updating the PK values themselves
Improve support for the UUID type
- Add a proper “UI Type” (on the front end) for UUIDs (e.g. to render an icon)
- Fix filtering on partially-entered UUID values: (issue)
Allow users to add and remove primary key constraints
Fix sequence setup in our sample data (so users don’t encounter this bug)

Do not try doing these things right now

Support for multi-column PKs (issue) — Effort seems too high. Front end would need a ton of refactoring.
Support tables without PKs — Effort seems too high
Support tables with no PK and at least one un-orderable column (bug) — Impact seems too low

Next steps

I think we need clarity on scope first. Perhaps we could get that async, if people can respond to this email. But if people don’t respond (or if we have differing opinions), then I think a meeting would be wise. I would suggest we don’t wait a whole week (until the next product meeting) to get this figured out. I think we can and should move faster, which is why I’ve tried to jump start this discussion via email.
Once we have clarity on scope, we need to identify the work that requires design. In my proposal, I think only the first bullet point above would require design (custom PK options for Mathesar-created tables). Even if the team does need to discuss scope, I still think it’s pretty likely that we’ll want to tackle this feature. So this brings me to the biggest reason I’m sending this email:

If we can get enough clarity on scope asynchronously, then Ghislaine could potentially get started designing the custom-PK-options feature soon. That would seem useful to me since I hear she needs work.

Zack Krida

unread,

Feb 19, 2025, 3:07:31 PMFeb 19

to Mathesar Developers, se...@mathesar.org

Sean, this is an awesome list and while I'll wait for other engineers to chime in, your scope makes sense to me. I had one small clarifying question:

In "Aim to do the following for 0.2.2" you mention "Create the table with no PK" but in "Do not try doing these things right now" you mention "Support tables without PKs — Effort seems too high". Those seem a bit contradictory to me, could you clarify what you mean? It's likely I'm missing something obvious.

Barring detailed feedback from the rest of the team, I'd like to try to get on a call on Friday to talk through this a bit more to 1. Define the scope with certainty and 2. Get necessary design work scheduled and on Ghislaine's plate for next week. I'll send out an invite for that tomorrow morning (again, unless this thread reaches some kind of consensus before then).

Thanks for putting this together!

Sean Colsen

unread,

Feb 19, 2025, 7:09:41 PMFeb 19

to Zack Krida, Mathesar Developers

Those seem a bit contradictory to me

Oh, good catch! Let’s take “create table with no PK” out of scope then.

Brent Moran

unread,

Feb 20, 2025, 4:12:54 AMFeb 20

to Mathesar Developers

Hey,

The list is a good start. Out of curiosity, did you do a thorough search through our issues for "primary key", "pkey", "pk", etc.?

Regarding creating tables w/o primary keys

I think being able to create a table with no pkey would be useful even without full support, and it's conceptually a prerequisite for choosing an existing column as a pkey. The process would be:

- Import the table

- Choose the types

- Choose the pkey column

- Choose the default for that column

The reason to import, then choose the pkey column is that it's (much) more performant, and the error scenarios if the column is inappropriate for a primary key are less irritating. If you're importing a large file, you don't want to have the whole import fail because you chose a primary key column that had a duplicate value 90% of the way through. Long term, you might even want to fix the duplicate value (not possible until we can edit with no pkey; see below) and then make the column your pkey. You don't want to reimport everything after fixing the problem through some other tool.

Also, to support allowing users to add and remove primary key constraints, we need to at least not throw red everywhere if they're looking at a table without a pkey. This could be avoided if instead of add/remove we have "change" (i.e., "change the primary key column"), but I don't think that's the best way.

To be clear, we already support tables with no pkey for reading. Just not editing. An achievable half-measure would be to more gracefully handle situations where things break when there's no pkey. E.g., show a little "it's not yet possible to edit a table with no primary key" message in the UI if they perform some action that indicates they're trying to edit the table.

Regarding customizing default values for primary keys

Customizing the default value for the pkey column could easily turn into a boondoggle, since we don't currently have any functionality for setting dynamic defaults on columns. We'd need to come up with a UI that let them set dynamic defaults for pkey columns. Then why not let them use that for other columns of the same type? Etc. So, "dynamic default setting" is kind of a prerequisite for custom default values. There's also the problem of setting integer sequences to the proper value (or at least according to their expectation) if they choose a preexisting integer column. For example, they may choose an integer column as pkey, and want to generate sequential integers. The sequence can't start at 1. All that is to say, I suggest we restrict the scope to either "use Mathesar's auto-generated integer ID, Mathesar's auto-generated UUID id, or manually enter pkey values into any type pkey column you'd like" for this release.

Regarding sequence permissions

The sequence setup bug in our sample data is already fixed in this PR, visible from the issue. After re-reading our conversations in that issue, it's not clear to me that there's anything to do here that we'd want in-scope for this release. Specifically, we'd need to introduce the concept of "sequence permissions" for sequences on preexisting DBs.

Regarding supporting preexisting tables without primary keys

I think it would be nice to at least smooth out the error experience without actually increasing any functionality for these tables. This seems pretty low-effort, and would help less technical users understand their situation better. Also, improved error information here would go well with adding primary key constraints to preexisting tables without one.

Additional: Adding a primary key column

This is different from just adding the constraint to a preexisting column, and I think we should have this in the milestone. In fact, I'd put this at a higher priority than just adding the constraint to a preexisting column, since you can always add a pkey column, but you may not have a column which is appropriate as a single-column pkey (e.g., in a many-to-many mapping table. Typically the pairs are unique, but not the single columns.).

Additional: removing a multi-column pkey constraint

This, plus letting them add a primary key column, would get them out of the trap where their many-to-many mapping table just uses the unique pairs as the pkey (this does happen). They'd drop that constraint, add a generated ID primary key column, and then be back on their feet able to add mappings to their table.

That's all I can think of for now.

Brent

Sean Colsen

unread,

Feb 20, 2025, 10:26:56 AMFeb 20

to Brent Moran, Mathesar Developers

Excellent thoughts, Brent! My responses below:

Issue search

did you do a thorough search through our issues for “primary key”, “pkey”, “pk”, etc.?

Yeah, reasonably thorough. I could have missed some, but I’m fairly confident that I surfaced the important ones.
Import flow

You make some good points about the import flow which I hadn’t considered.

I think if we really stepped back and considered our ideal import flow along with all the pk-related features we’d want, it would certainly be too hefty for us to implement now. The good news is that I can think of a lot of different ways to strike a compromise. But the bad news is that all those different scope-cutting strategies lead to a somewhat open-ended problem with the risk of our team getting stuck in a design quagmire. We’ll need to carefully balance technical limitations with UX considerations, and I think we’ll have an easier time doing that on a call. I have some creative ideas to add to the mix, but for the sake of efficiency I’ll save them for the call.

The table import flow is where I think we need to focus our synchronous discussion efforts.

I would hope that we can (roughly) spec out much of the surrounding work asynchronously.
Default values

I suggest we restrict the scope to either “use Mathesar’s auto-generated integer ID, Mathesar’s auto-generated UUID id, or manually enter pkey values into any type pkey column you’d like” for this release.

Yeah, this sounds good to me.
Sequences

The sequence setup bug in our sample data is already fixed in this PR

Ahh, right. Thanks. Let’s strike that from the list then!
Non-PK tables

Regarding supporting preexisting tables without primary keys, I think it would be nice to at least smooth out the error experience without actually increasing any functionality for these tables.

Gosh, yeah you’re right — the experience is quite bad currently! I just opened this issue to track that: Provide better error experience for tables without primary keys. And yeah, I agree it’d be easy to do, so we ought to take this up for 0.2.2.

Adding a primary key column

Neat idea. I would suggest we roll this idea into the design discussion for the import flow, since I imagine the two would be related.
Removing multi-column PKs

Yeah, this makes sense. And I imagine it would be covered under Allow users to add and remove primary key constraints.

Zack Krida

unread,

Feb 20, 2025, 10:55:40 AMFeb 20

to Mathesar Developers, se...@mathesar.org, br...@mathesar.org

Just sharing that I've sent out an invite for a Monday meeting to wrap up this discussion with a focus on refining the Import flow.

Pavish Kumar Ramani Gopal

unread,

Feb 21, 2025, 3:18:46 AMFeb 21

to Zack Krida, se...@mathesar.org, br...@mathesar.org, Mathesar Developers

> I would suggest we roll this idea into the design discussion for the import flow, since I imagine the two would be related.

> with a focus on refining the Import flow

I think we might be considering this the other way around. The import flow uses an underlying table for all its operations. We might benefit from first figuring out how to add, change, and remove a primary key column of a table, and then fitting it in with the import flow.

I agree with Brent on limiting the scope of default values and only setting them for primary key columns that we auto-create.

I do have a number of opinions and suggestions on this and I think it would be best to save them for our synchronous discussion on Monday.

Zack Krida

unread,

Feb 24, 2025, 7:19:08 PMFeb 24

to Mathesar Developers

Hi team,

I've prepared an agenda for tomorrow's continued discussion of primary key work for 0.2.2. Please review the agenda in advance and update it if I've missed anything.