Is having union in σ the same as having two queries? - relational-algebra

I have this question:
What are the names of Employees in Boston or Chicago?
With these relations:
employees(id, name) and workIn(id, city)
Where the id in both relations refer to the same thing (the id of the employee)
The query I wrote was:
Π name (σ city="Boston" U city="Chicago"(employees ⋈ workIn))
The solution given to the question was:
Π name (σ city="Boston"(employees ⋈ workIn)) U
Π name (σ city="Chicago"(employees ⋈ workIn))
Would the two queries return the same result? Or is my query just wrong?
If my query is wrong, what would the difference be in values returned?

Your query is wrong because it puts the union operator (U) between two logical conditions, city="Boston" U city="Chicago". That does not make sense: union is a set operator, not a logical operator.
The logical operator to use in a condition is “or” (written ∨), which makes a compound condition true when either of the two components is true (or both are true, but that is not possible here, since a single tuple has only one city).
So a correct expression is:
Π name (σ city="Boston" ∨ city="Chicago"(employees ⋈ workIn))
and this is equivalent to the expression with the Union:
Π name (σ city="Boston"(employees ⋈ workIn)) U
Π name (σ city="Chicago"(employees ⋈ workIn))
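To see the equivalence concretely, here is a minimal Python sketch that models the two relations as sets of tuples. The sample rows and the helper names (natural_join, select, project_name) are invented for illustration and are not part of the original question.

```python
# Toy relations as sets of tuples; the rows are invented for illustration.
employees = {(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Ann")}
work_in = {(1, "Boston"), (2, "Chicago"), (3, "Denver"), (4, "Chicago")}

def natural_join(emps, works):
    """employees ⋈ workIn on the shared id attribute, giving (id, name, city)."""
    return {(i, name, city) for (i, name) in emps for (j, city) in works if i == j}

def select(rel, pred):
    """σ: keep the tuples for which the condition holds."""
    return {t for t in rel if pred(t)}

def project_name(rel):
    """Π name: keep only the name attribute; the set drops duplicates."""
    return {(name,) for (_, name, _) in rel}

joined = natural_join(employees, work_in)

# Π name (σ city="Boston" ∨ city="Chicago" (employees ⋈ workIn))
with_or = project_name(
    select(joined, lambda t: t[2] == "Boston" or t[2] == "Chicago"))

# Π name (σ city="Boston"(...)) U Π name (σ city="Chicago"(...))
with_union = (project_name(select(joined, lambda t: t[2] == "Boston"))
              | project_name(select(joined, lambda t: t[2] == "Chicago")))

print(sorted(with_or))        # [('Ann',), ('Bob',)]
print(with_or == with_union)  # True
```

Both forms produce the same set of names on this data, because selecting with a ∨ b keeps exactly the tuples kept by selecting with a or by selecting with b.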

Null in Relational Algebra

I want to query the id of all apartments that were never rented
I tried something like this:
(π a_id
(apartments))
-
(π a_id
σ from_date Exists ∧ end_date Exists
(rental) ⨝ rental.a_id on apartment.a_id (apartment))
But I think I cannot use Exist in relational algebra or null or anything.
How could I do it?
Thanks
I attach the schema here
Presumably apartment ids in Apartment are for apartments & apartment ids in Rental are for rented apartments. Then the unrented apartments are the ones in Apartment but not in Rental. Their ids are the ones in a relational difference between projections of those relations.
Guessing at the legend/key for your ERD, there is a FK (foreign key) in Rental referencing Apartment. That confirms that an apartment in Rental is also in Apartment. So Apartment ⨝ Rental has the same apartments as Rental. That confirms that you don't need to join.
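The difference of projections described above can be sketched in Python with sets. The attribute names (a_id, sqft, occupant, from_date, end_date) and the rows are guesses at the schema, used only for illustration.

```python
# Hypothetical sample data; attribute names are guesses at the schema.
apartment = [
    {"a_id": 1, "sqft": 500},
    {"a_id": 2, "sqft": 750},
    {"a_id": 3, "sqft": 900},
]
rental = [
    {"a_id": 1, "occupant": "Kim", "from_date": "2020-01-01", "end_date": "2020-12-31"},
    {"a_id": 1, "occupant": "Lee", "from_date": "2021-02-01", "end_date": None},
]

def project(rows, attr):
    """π attr: the set of values of one attribute (duplicates collapse)."""
    return {row[attr] for row in rows}

# π a_id (apartment) - π a_id (rental)
never_rented = project(apartment, "a_id") - project(rental, "a_id")
print(sorted(never_rented))  # [2, 3]
```

No join and no test on the date columns is needed: an apartment id appears in rental exactly when the apartment has been rented at least once.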
Here is how we can query & reason more generally based on what rows in tables mean.
You mention NULL & EXISTS. Maybe you are talking about SQL NULL & EXISTS and/or you are trying to find a relational algebra version of an SQL query and/or you are reasoning in SQL. And/or maybe you are talking about logic EXISTS & whether values exist in columns or tuples.
From common sense about renting & from you not saying otherwise, Rental is rows where occupant O rented apartment A from date F to date T. But you mention NULL. From common sense & guessing T can be NULL, Rental is rows where occupant O rented apartment A from date F to date T OR occupant O rented apartment A from date F ongoing & T is null. A tuple membership condition like these is a (characteristic) predicate. We cannot update or query about the business situation without being told each base relation's predicate.
NULL is a value that is treated specially by SQL operators & syntax. We don't know how your algebra & language treat NULL. In mathematics EXISTS X [p] & FOR SOME X [p] say that a value exists that we can name X that satisfies condition p. SQL EXISTS (R) says whether rows exist in table R. That is whether EXISTS t [t IN R]. When R is (X,...) rows where r, that is whether EXISTS X,... [r].
When R is rows where r, π x (R) is by definition rows where EXISTS *non-x attributes of R* [r]. So π A (Rental) is rows where EXISTS O,F,T [occupant O rented apartment A from date F to date T OR occupant O rented apartment A from date F ongoing & T is null].
When R is rows where r, σ p (R) is by definition rows where r & p. Rows where occupant O rented apartment A from date F ongoing is rows where (occupant O rented apartment A from date F to date T OR occupant O rented apartment A from date F ongoing & T is null) & T is null which is σ T is null (R).
When R is rows where r & S is rows where s, R - S is by definition rows where r & NOT s. Suppose Apartment is rows where apartment A has S square feet .... You want rows where EXISTS S,... [apartment A has S square feet ...] & NOT EXISTS O,F,T [occupant O rented apartment A from date F to date T OR occupant O rented apartment A from date F ongoing & T is null]. That's the relation difference at the start of this answer.
PS
Every query expression has an associated (characteristic) predicate--statement template parameterized by attributes. The tuples that make the predicate into a true proposition--statement--are in the relation.
We are given the predicates for expressions that are relation names.
Let query expression E have predicate e. For the most straightforward relational algebra, where a relation has an attribute set as heading & tuple set as body:
R ⨝ S has predicate / is rows satisfying r and s
R ∪ S has predicate / is rows satisfying r or s
R - S has predicate / is rows satisfying r and not s
σ p (R) has predicate / is rows satisfying r and p
π x (R) has predicate / is rows satisfying exists attributes of R other than x [r]
When we want the tuples satisfying a certain predicate we find a way to express that predicate in terms of relation operator transformations of given relation predicates. The corresponding query returns/calculates the tuples.
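As a rough illustration of the operator-to-predicate correspondence listed above, the operators can be sketched in Python over relations stored as sets of tuples. The representation (each tuple a frozenset of attribute/value pairs), the function names and the sample rows are assumptions made for this sketch.

```python
def rel(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def join(r, s):
    """R ⨝ S: rows satisfying r and s (tuples that agree on shared attributes)."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def union(r, s):
    """R ∪ S: rows satisfying r or s."""
    return r | s

def difference(r, s):
    """R - S: rows satisfying r and not s."""
    return r - s

def select(p, r):
    """σ p (R): rows satisfying r and p."""
    return {t for t in r if p(dict(t))}

def project(attrs, r):
    """π x (R): rows for which values of the other attributes exist that satisfy r."""
    return {frozenset((k, v) for k, v in t if k in attrs) for t in r}

# Hypothetical rows: apartment A has S square feet; occupant O rented A from F to T.
apartment = rel({"A": 1, "S": 500}, {"A": 2, "S": 750})
rental = rel(
    {"O": "Kim", "A": 1, "F": "2020-01-01", "T": "2020-12-31"},
    {"O": "Lee", "A": 1, "F": "2021-02-01", "T": None},
)

# σ T is null (Rental): the ongoing rentals.
ongoing = select(lambda t: t["T"] is None, rental)

# With the foreign key holding, Apartment ⨝ Rental has the same apartments as Rental.
same_apartments = project({"A"}, join(apartment, rental)) == project({"A"}, rental)
print(len(ongoing), same_apartments)  # 1 True
```

Python's None stands in for NULL here only as a marker value; it does not reproduce SQL's three-valued logic.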
Re relational algebra querying.

Expressing (parameterized) ANY(array) query for Postgres in SQLKorma

I'm currently using SQLKorma for a project, and I'm running into a bit of a snag with it.
I have constructed a query with two left-joins; one of them contains an array with entries that I wish to use in my WHERE clause.
This is trivial to express in SQL. Note that this is a primarily redacted query.
SELECT
cu.name,
c.description,
c.created_at AT TIME ZONE 'utc'
FROM calendar_users cu LEFT JOIN calendars c ON cu.id = c.user_id
LEFT JOIN meetings m ON c.id = m.id
WHERE 'status_report' ILIKE ANY (m.meeting_metadata)
GROUP BY m.meeting_metadata, c.created_at, cu.name, cu.description
ORDER BY c.created_at DESC
The portion in regards to ILIKE ANY is what I'd like to be able to translate to Korma.
I understand from the docs that the ANY clause isn't supported from the WHERE clause, and I should look into using raw or exec-raw instead.
With that, I want to pass in a parameterized raw string into the WHERE clause to accomplish what I'm trying to go for.
This I've attempted, but it fails with a syntax error in Postgres:
(select calendars
(fields calendar-user-cols)
(join :calendar_users (= :calendars.user_id :calendar_users.id))
(join :meetings (= :calendars.id :meetings.id))
(where (raw ["? ILIKE ANY(meetings.meeting_metadata)" metadata])))
Specifically:
PSQLException:
Message: ERROR: syntax error at or near "["
Position: 1006
SQLState: 42601
Error Code: 0
How would I go about this using Korma? Do I have to resort to a full-blown exec-raw query?
Korma has a very helpful function korma.core/sql-only which will render the SQL string that would be executed.
(defentity calendars)
=> #'korma-test.core/calendars
(sql-only
(select calendars
(fields :x :y)
(join :calendar_users (= :calendars.user_id :calendar_users.id))
(join :meetings (= :calendars.id :meetings.id))
(where (raw ["? ILIKE ANY(meetings.meeting_metadata)" "status_report"]))))
=> "SELECT \"calendars\".\"x\", \"calendars\".\"y\" FROM (\"calendars\" LEFT JOIN \"calendar_users\" ON \"calendars\".\"user_id\" = \"calendar_users\".\"id\") LEFT JOIN \"meetings\" ON \"calendars\".\"id\" = \"meetings\".\"id\" WHERE [\"? ILIKE ANY(meetings.meeting_metadata)\" \"status_report\"]"
or the more readable:
SELECT "calendars"."x",
"calendars"."y"
FROM ("calendars"
LEFT JOIN "calendar_users" ON "calendars"."user_id" = "calendar_users"."id")
LEFT JOIN "meetings" ON "calendars"."id" = "meetings"."id"
WHERE ["? ILIKE ANY(meetings.meeting_metadata)" "status_report"]
As you can see, the ILIKE clause is surrounded by []. Korma's raw just takes a raw string and doesn't support parameterisation the way exec-raw does. The vector around the ILIKE string was simply rendered as text, brackets and contents included. This is why you got a Postgres error about [.
You need to remove the [] from around the ILIKE string if you want to continue using raw, or see whether exec-raw fits your needs better. Be aware that raw carries a very real danger of SQL injection, which you will need to address.
;; require clojure.string :as str in your ns
;; change your clause from
(where (raw ["? ILIKE ANY(meetings.meeting_metadata)" "status_report"]))
;; to this
(where (raw (str/join " " ["'status_report'" "ILIKE ANY(meetings.meeting_metadata)"])))
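The injection risk mentioned above is not specific to Korma. As an analogy only, the following sketch uses Python's sqlite3 module (not Korma or Postgres) with a hypothetical meetings table to contrast splicing a value into the SQL text, which is what building a string for raw amounts to, with binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meetings (id INTEGER, tag TEXT)")
conn.executemany("INSERT INTO meetings VALUES (?, ?)",
                 [(1, "status_report"), (2, "retro")])

# A hostile value that closes the string literal and adds its own condition.
user_input = "x' OR '1'='1"

# Unsafe: the value becomes part of the SQL text.
unsafe_sql = "SELECT id FROM meetings WHERE tag = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the driver binds the value, so it is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT id FROM meetings WHERE tag = ?", (user_input,)).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
```

If the value in the raw string ever comes from user input, it needs the same kind of protection, for example by using a parameterised exec-raw query instead.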

Combine SUM and CAST - not working?

PostgreSQL Unicode 9.01 doesn't like:
SELECT table1.fielda,
SUM (CAST (table2.fielda AS INT)) AS header.specific
FROM *etc*
What is wrong with SUM-CAST?
Error Message:
Incorrect column expression: 'SUM (CAST
(specifics_nfl_3pl_work_order_item.delivery_quantity AS INT))
Query:
SELECT specifics_nfl_3pl_work_order.work_order_number,
specifics_nfl_3pl_work_order.goods_issue_date,
specifics_nfl_3pl_work_order.order_status_id,
SUM (CAST (specifics_nfl_3pl_work_order_item.delivery_quantity AS INT)) AS units
FROM public.specifics_nfl_3pl_work_order specifics_nfl_3pl_work_order,
public.specifics_nfl_3pl_work_order_item specifics_nfl_3pl_work_order_item,
public.specifics_nfl_order_status specifics_nfl_order_status
WHERE specifics_nfl_3pl_work_order.order_status_id In (3,17,14)
AND specifics_nfl_3pl_work_order_item.specifics_nfl_work_order_id=
specifics_nfl_3pl_work_order.id
AND ((specifics_nfl_3pl_work_order.sold_to_id<>'0000000000')
AND (specifics_nfl_3pl_work_order.goods_issue_date>={d '2013-08-01'}))
It would be really great if you can help.
If I were you, I would take these steps:
give your tables short aliases
format the query
use proper ANSI joins
remove spaces between function name and (
select
o.work_order_number,
o.goods_issue_date,
o.order_status_id,
sum(cast(oi.delivery_quantity as int)) as units
from public.specifics_nfl_3pl_work_order as o
inner join public.specifics_nfl_3pl_work_order_item as oi on
oi.specifics_nfl_work_order_id = o.id
-- inner join public.specifics_nfl_order_status os -- seems redundant
where
o.order_status_id In (3,17,14) and
o.sold_to_id <> '0000000000' and
o.goods_issue_date >= {d '2013-08-01'}
Actually, I think you need a GROUP BY clause here:
select
o.work_order_number,
o.goods_issue_date,
o.order_status_id,
sum(cast(oi.delivery_quantity as int)) as units
from public.specifics_nfl_3pl_work_order as o
inner join public.specifics_nfl_3pl_work_order_item as oi on
oi.specifics_nfl_work_order_id = o.id
where
o.order_status_id In (3,17,14) and
o.sold_to_id <> '0000000000' and
o.goods_issue_date >= {d '2013-08-01'}
group by
o.work_order_number,
o.goods_issue_date,
o.order_status_id
If it still doesn't work, try commenting out the sum and see whether the query runs.
Do you have a table2, or only table1?
Try:
SELECT table1.fielda,
SUM (CAST (table1.fielda AS INT)) AS "header.specific"
FROM etc
In addition to what @Roman already cleared up, there are more problems here:
SELECT o.work_order_number
,o.goods_issue_date
,o.order_status_id
,SUM(CAST(oi.delivery_quantity AS INT)) AS units -- suspicious
FROM public.specifics_nfl_3pl_work_order o
JOIN public.specifics_nfl_3pl_work_order_item oi
ON oi.specifics_nfl_work_order_id = o.id
CROSS JOIN public.specifics_nfl_order_status os -- probably wrong
WHERE o.order_status_id IN (3,17,14)
AND o.sold_to_id <> '0000000000' -- suspicious
AND o.goods_issue_date >= {d '2013-08-01'} -- nonsense
GROUP BY 1, 2, 3
o.goods_issue_date >= {d '2013-08-01'} is not valid PostgreSQL syntax. ({d '...'} is an ODBC date escape sequence, so it only works when the driver rewrites it.) Maybe you mean:
o.goods_issue_date >= '2013-08-01'
You have the table specifics_nfl_order_status in your FROM list, but without any expression connecting it to the rest. This effectively results in a CROSS JOIN, which results in a Cartesian product and is almost certainly wrong in a very expensive way: every row is combined with every row of the rest:
CROSS JOIN public.specifics_nfl_order_status os
Either remove the table (since you don't use it) or add a WHERE or ON clause to connect it to the rest. Note, that it is not just redundant, it has a dramatic effect on the result as it is.
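To illustrate how an unconnected table distorts the aggregate, here is a small sketch using Python's sqlite3 module rather than PostgreSQL. The shortened table names and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_order (id INTEGER, work_order_number TEXT);
CREATE TABLE work_order_item (work_order_id INTEGER, delivery_quantity TEXT);
CREATE TABLE order_status (id INTEGER, name TEXT);
INSERT INTO work_order VALUES (1, 'WO-1');
INSERT INTO work_order_item VALUES (1, '5'), (1, '7');
INSERT INTO order_status VALUES (3, 'a'), (14, 'b'), (17, 'c');
""")

# order_status is listed in FROM but never connected: an implicit cross join.
with_stray_table = conn.execute("""
    SELECT o.work_order_number, SUM(CAST(oi.delivery_quantity AS INT)) AS units
    FROM work_order o, work_order_item oi, order_status os
    WHERE oi.work_order_id = o.id
    GROUP BY o.work_order_number
""").fetchall()

# Same query with only the tables that are actually joined.
joined_only = conn.execute("""
    SELECT o.work_order_number, SUM(CAST(oi.delivery_quantity AS INT)) AS units
    FROM work_order o
    JOIN work_order_item oi ON oi.work_order_id = o.id
    GROUP BY o.work_order_number
""").fetchall()

print(with_stray_table)  # [('WO-1', 36)]
print(joined_only)       # [('WO-1', 12)]
```

With three rows in the stray table, every item row is counted three times, so the sum is tripled.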
This WHERE clause is suspicious:
AND o.sold_to_id <> '0000000000'
Seems like you are storing numbers as strings or otherwise confusing the two.
Also, CAST (oi.delivery_quantity AS INT) should not be needed to begin with. The column itself should be of data type integer or some other appropriate numeric type. Be sure to use proper data types.
The default setting of search_path includes public, and you may not need to schema-qualify tables. Instead of public.specifics_nfl_3pl_work_order, it may suffice to use:
specifics_nfl_3pl_work_order
GROUP BY 1, 2, 3 uses positional references to the items of the SELECT list, just a notational shortcut for:
GROUP BY o.work_order_number, o.goods_issue_date, o.order_status_id
Details in the manual.
According to comments, you are using MS Query to create the query. This is not the best of ideas: it produces the kind of inferior code you presented us with. You may want to get rid of it while you are working with PostgreSQL.

SQL and regular expression to check if string is a substring of larger string?

I have a database filled with some codes like
EE789323
990
78000
These numbers are ALWAYS endings of a larger code. Now I have a function that needs to check if the larger code contains the subcode.
So if I have codes 90 and 990 and my full code is EX888990, it should match both of them.
However I need to do it in the following way:
SELECT * FROM tableWithRecordsWithSubcode
WHERE subcode MATCHES [reg exp with full code];
Is a regular expression like this even possible?
EDIT:
To clarify the issue I'm having, I'm not using SQL here. I just used that to give an example of the type of query I'm using.
In fact I'm using iOS with CoreData, and I need a predicate to fetch me only the records that match.
In the way that is mentioned below.
Given the observations from a comment:
Do you have two tables, one called tableWithRecordsWithSubcode and another that might be tableWithFullCodeColumn? So the matching condition is in part a join - you need to know which subcodes match any of the full codes in the second table? But you're only interested in the information in the tableWithRecordsWithSubcode table, not in which rows it matches in the other table?
and the laconic "you're correct" response, then we have to rewrite the query somewhat.
SELECT DISTINCT S.*
FROM tableWithRecordsWithSubcode AS S
JOIN tableWithFullCodeColumn AS F
ON F.Fullcode ...ends-with... S.Subcode
or maybe using an EXISTS sub-query:
SELECT S.*
FROM tableWithRecordsWithSubcode AS S
WHERE EXISTS(SELECT * FROM tableWithFullCodeColumn AS F
WHERE F.Fullcode ...ends-with... S.Subcode)
This uses a correlated sub-query but avoids the DISTINCT operation; it may mean the optimizer can work more efficiently.
That just leaves the magical 'X ...ends-with... T' operator to be defined. One possible way to do that is with LENGTH and SUBSTR. However, SUBSTR does not behave the same way in all DBMS, so you may have to tinker with this (possibly adding a third argument, LENGTH(s.subcode)). The version here assumes the common convention that SUBSTR positions start at 1, hence the + 1:
LENGTH(f.fullcode) >= LENGTH(s.subcode) AND
SUBSTR(f.fullcode, LENGTH(f.fullcode) - LENGTH(s.subcode) + 1) = s.subcode
This leads to two possible formulations:
SELECT DISTINCT S.*
FROM tableWithRecordsWithSubcode AS S
JOIN tableWithFullCodeColumn AS F
ON LENGTH(F.Fullcode) >= LENGTH(S.Subcode)
AND SUBSTR(F.Fullcode, LENGTH(F.Fullcode) - LENGTH(S.Subcode) + 1) = S.Subcode;
and
SELECT S.*
FROM tableWithRecordsWithSubcode AS S
WHERE EXISTS(
SELECT * FROM tableWithFullCodeColumn AS F
WHERE LENGTH(F.Fullcode) >= LENGTH(S.Subcode)
AND SUBSTR(F.Fullcode, LENGTH(F.Fullcode) - LENGTH(S.Subcode) + 1) = S.Subcode);
This is not going to be a fast operation; joins on computed results such as required by this query seldom are.
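The EXISTS formulation can be tried out with Python's sqlite3 module, as in the sketch below. The short table names (subcodes, fullcodes) are placeholders chosen for this sketch, and the start position carries a + 1 because SQLite's SUBSTR counts positions from 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subcodes (subcode TEXT);
CREATE TABLE fullcodes (fullcode TEXT);
INSERT INTO subcodes VALUES ('EE789323'), ('990'), ('78000'), ('90');
INSERT INTO fullcodes VALUES ('EX888990');
""")

# Keep each subcode that is the ending of at least one full code.
rows = conn.execute("""
    SELECT s.subcode
    FROM subcodes AS s
    WHERE EXISTS (
        SELECT * FROM fullcodes AS f
        WHERE LENGTH(f.fullcode) >= LENGTH(s.subcode)
          AND SUBSTR(f.fullcode,
                     LENGTH(f.fullcode) - LENGTH(s.subcode) + 1) = s.subcode)
    ORDER BY s.subcode
""").fetchall()

print(rows)  # [('90',), ('990',)]
```

As the question requires, the full code EX888990 matches both 90 and 990, while EE789323 and 78000 are rejected.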
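I'm not sure why you think that you need a regular expression. To test whether the subcode occurs anywhere inside the code, just use the charindex function (SQL Server syntax; the string to find comes first):
select something
from table
where charindex(subcode, code) <> 0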
Edit:
To find strings at the end, you can create a pattern with the % wildcard from the subcode and match the code against it:
select something
from table
where code like '%' + subcode

List all the operator families associated with a schema and all the operators within an operator family?

G'day,
How can you list all the operator families associated with a database/schema, and all the operators within an operator family, in PostgreSQL (8.3 if it matters)?
Thanks!
Operator families in a schema:
SELECT *
FROM pg_opfamily opf JOIN pg_namespace n ON n.oid = opf.opfnamespace
WHERE n.nspname = 'something';
Getting all the operators within an operator family is trickier, because an operator family contains some operators directly and some through the operator classes that it contains. To get the former, join pg_opfamily with pg_amop; to get the latter, join pg_opfamily with pg_opclass; then in both cases join against pg_operator. It's questionable how useful this information will be, though, because in order to assess the usability of an operator family for query planning and optimization, you also need information about access methods, data types, and a few other things.
From the manual, for use at the psql command prompt:
\do [ pattern ]
Lists available operators with their operand and return types. If pattern is specified, only operators whose names match the pattern are listed.
