Skip to content

Conversation

zackkrida
Copy link
Member

@zackkrida zackkrida commented Dec 17, 2024

This pull request introduces six sample datasets under the beta_use_cases directory. These datasets were created to support marketing copy and produce screenshots for an upcoming redesign of the https://mathesar.org site.

Testing instructions

To test this PR, I would recommend looking at the README markdown previews in GitHub or an editor like VSCode to get a feel for each schema via its corresponding mermaid ER diagram, and then loading them into a local mathesar instance. Here's how you could load them all quickly after cd beta_use_cases locally:

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < bike_service_shop/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < bike_service_shop/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < hardware_store/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < hardware_store/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < ice_cream_timesheets/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < ice_cream_timesheets/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < library_makerspace/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < library_makerspace/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < museum_exhibits/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < museum_exhibits/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < nonprofit_grant_tracking/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < nonprofit_grant_tracking/generated_data.sql

docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < {{DATASET}}/schema.sql/schema.sql
docker exec -i mathesar_dev_db bash -c 'psql -U mathesar' < {{DATASET}}/generated_data.sql/generated_data.sql

The python code here to generate data is pretty inconsequential. I wrote one and copy-pasted for the rest. I think it's fine but I'm not someone who writes Python daily so suggestions for best, idiomatic practices are welcomed.

@zackkrida zackkrida self-assigned this Dec 17, 2024
@zackkrida zackkrida force-pushed the case-studies/hardware-store branch from 008df48 to cdffb28 Compare January 9, 2025 15:50
@zackkrida zackkrida changed the title WIP: Hardware store case study WIP: Beta case studies Jan 9, 2025
@zackkrida zackkrida force-pushed the case-studies/hardware-store branch from 21ed3dc to 219368e Compare January 9, 2025 22:50
@zackkrida zackkrida changed the title WIP: Beta case studies WIP: Beta use cases Jan 10, 2025
@zackkrida zackkrida force-pushed the case-studies/hardware-store branch 8 times, most recently from 9ea35a0 to af9378c Compare January 17, 2025 00:11
@zackkrida zackkrida force-pushed the case-studies/hardware-store branch from af9378c to 0ff30db Compare January 17, 2025 00:17
@zackkrida zackkrida marked this pull request as ready for review January 17, 2025 00:24
@zackkrida zackkrida added the pr-status: review A PR awaiting review label Jan 17, 2025
@zackkrida zackkrida changed the title WIP: Beta use cases New datasets for marketing use cases and Mathesar inclusion Jan 17, 2025
Copy link
Contributor

@mathemancer mathemancer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: I did not review:

  • The python code, since it's not that relevant in this use-case
  • The actual data in the SQL files.

Given our time constraints, and the size/scope of this PR, I instead restricted my attention to the data schemas.

The most important concern I have is that you cannot use DROP SCHEMA ... CASCADE; in scripts being run against user databases. I suppose they're okay here in the data playground, but it makes me nervous that we'll forget one when copying these over.

Please use

id INTEGER PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY

for id columns rather than the current definition. (Note that it's generated by default, not always)

Please use Mathesar's custom types where possible. Also, consider adding a Website column (or similar) somewhere, to use the mathesar_types.uri type.

When dumping these data sets, please use the --inserts flag to dump as INSERT statements rather than COPY. While it is possible to wire these up in python as COPY statements, it doesn't work in the form output by pg_dump (i.e., the form in these files). INSERT statements are also more flexible in general, if we use these data sets in other places.

We should not encourage using the WITH TIME ZONE date/time types so much. They don't do what you think. I feel like I keep having the TZ conversation... (though it's the first time with you, @zackkrida , haha). While we could have a column or two that demonstrates the behavior, the TZs are used all over these data sets.

You need to reset all sequences for all tables after inserting data. I'm getting insert errors on these data sets when adding rows in Mathesar:

UniqueViolation: duplicate key value violates unique constraint "Transactions_pkey" DETAIL: Key (id)=(5) already exists. CONTEXT: SQL statement " WITH insert_cte AS (INSERT INTO "Hardware Store"."Transactions" (asset_id, customer_id, transaction_type, transaction_date, total_charge, note) VALUES ('7', '7', 'Sale', '2025-01-17 15:00:00 +08:00', NULL, NULL) RETURNING id) SELECT * FROM insert_cte " PL/pgSQL function msar.add_record_to_table(oid,jsonb,boolean,jsonb) line 17 at EXECUTE

Finally, if we're getting rid of the Library Management data set (I don't think we should at this point), we should increase the scale of one of these data sets to at least show how Mathesar deals with pagination, etc.

I made many more small notes throughout.

id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL,
email TEXT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the mathesar_types.email type here.

SET search_path = "Bike Shop";

CREATE TABLE "Customers" (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using this definition to align with our standard id columns.

Suggested change
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
id INTEGER PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,

equipment_id BIGINT REFERENCES "Equipment" (id),
mechanic_id BIGINT REFERENCES "Mechanics" (id),
request_description TEXT NOT NULL,
cost NUMERIC(10, 2),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cost NUMERIC(10, 2),
cost mathesar_types.mathesar_money,

@@ -0,0 +1,53 @@
DROP SCHEMA IF EXISTS "Bike Shop" CASCADE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a DROP SCHEMA ... CASCADE; statement in these scripts in the actual app can (and absolutely will, given enough users) lead to user data loss.

@@ -0,0 +1,56 @@
DROP SCHEMA IF EXISTS "Hardware Store" CASCADE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No schema dropping please.

CREATE TABLE "Patrons" (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mathesar_types.email, por favor.

Comment on lines +66 to +68
FROM "Equipment Training"
WHERE "Equipment Training".patron_id = NEW.patron_id
AND "Equipment Training".equipment_id = NEW.equipment_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works if the INSERT query is run with the search_path set to "Library Makerspace". For example, if you're in the public schema and try

INSERT INTO "Library Makerspace"."Jobs" ...

breakage occurs. This is particularly bad in Mathesar, since we pretty much always operate in that fashion.

Comment on lines +42 to +46
CREATE TABLE "Item_Collections" (
item_id BIGINT NOT NULL REFERENCES "Items" (id) ON DELETE CASCADE,
collection_id BIGINT NOT NULL REFERENCES "Collections" (id) ON DELETE CASCADE,
PRIMARY KEY (item_id, collection_id)
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently support tables without a single-column primary key. We should, therefore, avoid that situation in demo data.

id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name TEXT NOT NULL,
description TEXT,
amount NUMERIC(12, 2) NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mathesar_types.mathesar_money please.

Comment on lines +37 to +38
allocated_amount NUMERIC(12, 2) NOT NULL,
spent_amount NUMERIC(12, 2) DEFAULT 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mathesar_types.mathesar_money

@kgodey
Copy link
Contributor

kgodey commented Jan 17, 2025

@zackkrida I talked to @mathemancer and he's going to take over the remaining work here.

@kgodey kgodey assigned mathemancer and unassigned zackkrida Jan 17, 2025
@kgodey kgodey added pr-status: revision A PR awaiting follow-up work from its author after review and removed pr-status: review A PR awaiting review labels Jan 17, 2025
@zackkrida zackkrida marked this pull request as draft September 5, 2025 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-status: revision A PR awaiting follow-up work from its author after review
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

3 participants