Thoughts after the code deletion project

6 views
Skip to first unread message

Brent Moran

unread,
Oct 24, 2024, 3:42:58 AM10/24/24
to Mathesar Developers
Hello!

Since the dead code deletion project is complete for the time being, I thought this would be a good moment to write up some thoughts that arose over the course of the project.

First, some stats about the current state:
  • Since the start of the dead code deletion project, the repo has had a net change of -52,833 lines of code.
  • In the db library, the small amount of remaining code using SQLAlchemy has been moved to a deprecated namespace.
  • The only code in the mathesar library using SQLAlchemy is now Exception Handling code for the REST API
  • The tests now run in less than 13 seconds on my laptop, and less than 45 seconds in the CI pipeline.
Details about the new form of the db library:
  • There's a new db.deprecated namespace. All code there is related to Explorations.
    • We can get rid of this namespace if we move Exploration query building to the SQL layer.
    • This namespace is the only place that can produce SQLAlchemy exceptions, and so forces SQLAlchemy imports in the mathesar layer for error handling.
  • SQLAlchemy is no longer allowed in the non-deprecated namespace.
  • psycopg2 is no longer allowed in the non-deprecated namespace.
  • Try to avoid adding any new uses of the now-deprecated `PostgresType` Enum in the non-deprecated namespace.
  • Functions in the non-deprecated areas are collected into single files per object
    • I.e., All functions under db/columns/operations/... are now in a db/columns.py file
    • This is because the only remaining modules were all under "operations", and each module typically had only one function.
Details about the mathesar library:
  • Getting rid of the remaining REST endpoints (DataFiles and Users) would massively simplify this area
    • The majority of the code complexity here is just Exception handling necessitated by DRF.
    • Would also let us get rid of many dependencies
    • Wouldn't be that hard to pull off before beta (depending on release date)
  • The only remaining models are related to Connections, DataFiles, and Users
  • Same for serializers, viewsets, etc.
Ad-hoc thoughts
  • We should really figure out a way to get rid of the DataFile model. It's not actually useful after an import is complete.
  • If we want to keep Explorations around, I think it's worth moving the query building logic to the SQL layer.
Some TODOs noticed during this project:
  • Need to verify and fix docstrings of RPC functions
  • Need to decide (for the final time) on RPC function names, and consider unifying them throughout the codebase
  • Need to decide on function parameter order convention and unify them throughout the python codebase.
  • Need more SQL layer tests, especially around type casting
  • Need some E2E scenario-based tests of the back end to make sure everything is working as expected.
  • Need to make sure all RPC functions have a mocked test that checks to make sure they're generating the correct SQL query on the user DB
Brent

Sean Colsen

unread,
Oct 24, 2024, 6:38:57 AM10/24/24
to Brent Moran, Mathesar Developers
Very cool! Thanks for the report!

Kriti Godey

unread,
Oct 25, 2024, 4:17:31 PM10/25/24
to Sean Colsen, Brent Moran, Mathesar Developers
This is pretty awesome, thanks Brent! 50,000+ lines of deleted code is awesome, and that speed for running tests is amazing.

Can you track the todos and potential removal of models / serializers / viewsets in GitHub (if sufficiently specced out) or Asana (if still ideas)? Thanks!
Reply all
Reply to author
Forward
0 new messages