Thoughts after the code deletion project

Brent Moran

unread,

Oct 24, 2024, 3:42:58 AM10/24/24

to Mathesar Developers

Hello!

Since the dead code deletion project is complete for the time being, I thought this would be a good moment to write up some thoughts that arose over the course of the project.

First, some stats about the current state:

Since the start of the dead code deletion project, the repo has had a net change of -52,833 lines of code.
In the db library, the small amount of remaining code using SQLAlchemy has been moved to a deprecated namespace.
The only code in the mathesar library using SQLAlchemy is now Exception Handling code for the REST API
The tests now run in less than 13 seconds on my laptop, and less than 45 seconds in the CI pipeline.

Details about the new form of the db library:

There's a new db.deprecated namespace. All code there is related to Explorations.

We can get rid of this namespace if we move Exploration query building to the SQL layer.
This namespace is the only place that can produce SQLAlchemy exceptions, and so forces SQLAlchemy imports in the mathesar layer for error handling.

SQLAlchemy is no longer allowed in the non-deprecated namespace.
psycopg2 is no longer allowed in the non-deprecated namespace.
Try to avoid adding any new uses of the now-deprecated `PostgresType` Enum in the non-deprecated namespace.
Functions in the non-deprecated areas are collected into single files per object

I.e., All functions under db/columns/operations/... are now in a db/columns.py file
This is because the only remaining modules were all under "operations", and each module typically had only one function.

Details about the mathesar library:

Getting rid of the remaining REST endpoints (DataFiles and Users) would massively simplify this area

The majority of the code complexity here is just Exception handling necessitated by DRF.
Would also let us get rid of many dependencies
Wouldn't be that hard to pull off before beta (depending on release date)

The only remaining models are related to Connections, DataFiles, and Users
Same for serializers, viewsets, etc.

Ad-hoc thoughts

We should really figure out a way to get rid of the DataFile model. It's not actually useful after an import is complete.
If we want to keep Explorations around, I think it's worth moving the query building logic to the SQL layer.

Some TODOs noticed during this project:

Need to verify and fix docstrings of RPC functions
Need to decide (for the final time) on RPC function names, and consider unifying them throughout the codebase
Need to decide on function parameter order convention and unify them throughout the python codebase.
Need more SQL layer tests, especially around type casting
Need some E2E scenario-based tests of the back end to make sure everything is working as expected.
Need to make sure all RPC functions have a mocked test that checks to make sure they're generating the correct SQL query on the user DB

Brent

Sean Colsen

unread,

Oct 24, 2024, 6:38:57 AM10/24/24

to Brent Moran, Mathesar Developers

Very cool! Thanks for the report!

Kriti Godey

unread,

Oct 25, 2024, 4:17:31 PM10/25/24

to Sean Colsen, Brent Moran, Mathesar Developers

This is pretty awesome, thanks Brent! 50,000+ lines of deleted code is awesome, and that speed for running tests is amazing.

Can you track the todos and potential removal of models / serializers / viewsets in GitHub (if sufficiently specced out) or Asana (if still ideas)? Thanks!

Reply all

Reply to author

Forward