How do you do date math that ignores the year?
How do you do date math that ignores the year?
I am trying to select dates that have an anniversary in the next 14 days. How can I select based on dates excluding the year? I have tried something like the following.
SELECT * FROM events WHERE EXTRACT(month FROM "date") = 3 AND EXTRACT(day FROM "date") < EXTRACT(day FROM "date") + 14
The problem with this is that months wrap.
I would prefer to do something like this, but I don't know how to ignore the year.
SELECT * FROM events WHERE (date > '2013-03-01' AND date < '2013-04-01')
How can I accomplish this kind of date math in Postgres?
Answer by Andrew Hubbs for How do you do date math that ignores the year?
I found a way to do it.
SELECT EXTRACT(DAYS FROM age('1999-04-10', '2003-05-12')), EXTRACT(MONTHS FROM age('1999-04-10', '2003-05-12')); date_part | date_part -----------+----------- -2 | -1
I can then just check that the month is 0 and the days are less than 14.
If you have a more elegant solution, please do post it. I'll leave the question open for a bit.
Answer by Gordon Linoff for How do you do date math that ignores the year?
How about this?
select * from events e where to_char(e."date", 'MM-DD') between to_char(now(), 'MM-DD') and to_char(date(now())+14, 'MM-DD')
You can do the comparison as strings.
To take year ends into account, we'll convert back to dates:
select * from events e where to_date(to_char(now(), 'YYYY')||'-'||to_char(e."date", 'MM-DD'), 'YYYY-MM-DD') between date(now()) and date(now())+14
You do need to make a slight adjustment for Feb 29. I might suggest:
select * from (select e.*, to_char(e."date", 'MM-DD') as MMDD from events ) e where to_date(to_char(now(), 'YYYY')||'-'||(case when MMDD = '02-29' then '02-28' else MMDD), 'YYYY-MM-DD') between date(now()) and date(now())+14
Answer by Dan Bracuk for How do you do date math that ignores the year?
I don't work with postgresql so I googled it's date functions and found this: http://www.postgresql.org/docs/current/static/functions-datetime.html
If I read it correctly, looking for events in the next 14 days is as simple as:
where mydatefield >= current_date and mydatefield < current_date + integer '14'
Of course I might not be reading it correctly.
Answer by Mike Sherrill 'Cat Recall' for How do you do date math that ignores the year?
You can generate a virtual table of anniversaries, and select from it.
with anniversaries as ( select event_date, (event_date + (n || ' years')::interval)::date anniversary from events, generate_series(1,10) n ) select event_date, anniversary from anniversaries where anniversary between current_date and current_date + interval '14' day order by event_date, anniversary
The call to generate_series(1,10)
has the effect of generating 10 years of anniversaries for each event_date. I wouldn't use the literal value 10 in production. Instead, I'd either calculate the right number of years to use in a subquery, or I'd use a large literal like 100.
You'll want to adjust the WHERE clause to fit your application.
If you have a performance problem with the virtual table (when you have a lot of rows in "events"), replace the common table expression with a base table having the identical structure. Storing anniversaries in a base table makes their values obvious (especially for, say, Feb 29 anniversaries), and queries on such a table can use an index. Querying an anniversary table of half a million rows using just the SELECT statement above takes 25ms on my desktop.
Answer by Daniel Vrit for How do you do date math that ignores the year?
I believe the following test works in all cases, assuming a column named anniv_date
:
select * from events where extract(month from age(current_date+interval '14 days', anniv_date))=0 and extract(day from age(current_date+interval '14 days', anniv_date)) <= 14
As an example of how it works when crossing a year (and also a month), let's say an anniversary date is 2009-01-04
and the date at which the test is run is 2012-12-29
.
We want to consider any date between 2012-12-29
and 2013-01-12
(14 days)
age('2013-01-12'::date, '2009-01-04'::date)
is 4 years 8 days
.
extract(month...)
from this is 0
and extract(days...)
is 8
, which is lower than 14
so it matches.
Answer by wildplasser for How do you do date math that ignores the year?
For convenience, I created two functions that yield the (expected or past) birsthday in the current year, and the upcoming birthday.
CREATE OR REPLACE FUNCTION this_years_birthday( _dut DATE) RETURNS DATE AS $func$ DECLARE ret DATE; BEGIN ret = date_trunc( 'year' , current_timestamp) + (date_trunc( 'day' , _dut) - date_trunc( 'year' , _dut) ) ; RETURN ret; END; $func$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION next_birthday( _dut DATE) RETURNS DATE AS $func$ DECLARE ret DATE; BEGIN ret = date_trunc( 'year' , current_timestamp) + (date_trunc( 'day' , _dut) - date_trunc( 'year' , _dut) ) ; IF (ret < date_trunc( 'day' , current_timestamp)) THEN ret = ret + '1year'::interval; END IF; RETURN ret; END; $func$ LANGUAGE plpgsql; -- -- call the function -- SELECT date_trunc( 'day' , t.topic_date) AS the_date , this_years_birthday( t.topic_date::date ) AS the_day , next_birthday( t.topic_date::date ) AS next_day FROM topic t WHERE this_years_birthday( t.topic_date::date ) BETWEEN current_date AND current_date + '2weeks':: interval ;
NOTE: the casts are needed because I only had timestamps available.
Answer by a_horse_with_no_name for How do you do date math that ignores the year?
This should handle wrap-arounds at the end of the year as well:
with upcoming as ( select name, event_date, case when date_trunc('year', age(event_date)) = age(event_date) then current_date else cast(event_date + ((extract(year from age(event_date)) + 1) * interval '1' year) as date) end as next_event from events ) select name, next_event, next_event - current_date as days_until_next from upcoming order by next_event - current_date
You can filter than on the expression next_event - current_date
to apply the "next 14 days"
The case ...
is only necessary if you consider events that would be "today" as "upcoming" as well. Otherwise, that can be reduced to the else
part of the case statement.
Note that I "renamed" the column "date"
to event_date
. Mainly because reserved words shouldn't be used as an identifier but also because date
is a terrible column name. It doesn't tell you anything about what it stores.
Answer by Erwin Brandstetter for How do you do date math that ignores the year?
If you don't care for explanation and details, use the "Black magic version" below.
All queries presented so far operate with conditions that are not sargable - they cannot use an index and have to compute an expression for every single row in the base table to find matching rows. With small tables, this doesn't matter much. With big tables, however, this matters a lot.
Given the following simple table:
CREATE TABLE event ( event_id serial PRIMARY KEY ,event_date date );
Query
Version 1. and 2. can use a simple index of the form:
CREATE INDEX event_event_date_idx ON event(event_date);
But the following solutions are even faster without index.
1. Simple version
SELECT * FROM ( SELECT ((current_date + d.d) - interval '1y' * y.y)::date AS event_date FROM (SELECT generate_series(0, 14) AS d) d CROSS JOIN (SELECT generate_series(13, 113) AS y) y ) x JOIN event USING (event_date);
Subquery x
computes all possible dates over a given range of years from a CROSS JOIN
of two generate_series()
calls. The selection is done with a simple equi-join.
2. Advanced version
WITH val AS ( SELECT extract(year FROM age(now()::date + 14, min(event_date)))::int AS max_y ,extract(year FROM age(now()::date, max(event_date)))::int AS min_y FROM event ) SELECT e.* -- count(*) -- FROM ( SELECT ((current_date + d.d) - interval '1y' * y.y)::date AS event_date FROM (SELECT generate_series(0, 14) AS d) d ,(SELECT generate_series(min_y, max_y) AS y FROM val) y ) x JOIN event e USING (event_date);
Range of years is deduced from the table automatically - thereby minimizing the generated years.
You could even go one step further and distill a list of existing years if you have gaps in your range of years.
Effectiveness co-depends on the distribution of dates. Few years with many rows each make my solution more useful. Many years with few rows each make it less useful.
Simple SQLfiddle to play with.
3. Black magic version
Later I had an idea for this radically new approach.
Simple SQL function to calculate an integer for the pattern 'MMDD'
. It has to be IMMUTABLE
so it can be used in an index further down.
CREATE OR REPLACE FUNCTION mmdd(date) RETURNS int LANGUAGE sql IMMUTABLE AS $func$ SELECT ((extract('month' FROM $1)::int * 100) + extract('day' FROM $1)::int) $func$;
"Generated column" using above function. This costs a tiny bit of performance, but makes the syntax short and intuitive. Read here and here.
CREATE OR REPLACE FUNCTION mmdd(event) RETURNS int LANGUAGE sql IMMUTABLE AS 'SELECT mmdd($1.event_date)';
Multi-column index on above generated column. (Index on mmdd(event_date)
would achieve the same.)
Multi-column for a number of reasons. Can help with ORDER BY
or selecting from given years. Read here. At almost no additional cost for the index. A date
fits into the 4 bytes that would otherwise be lost to padding due to data alignment. Read here.
Also, since both index columns reference the same table column, no drawback with regard to H.O.T. Read here.
CREATE INDEX event_event_mmdd_idx ON event(mmdd(event), event_date);
One PL/pgSQL table function to rule them all.
Forking to one of two queries to cover the turn of the year.
CREATE OR REPLACE FUNCTION f_anniversary(date = current_date, int = 14) RETURNS SETOF event LANGUAGE plpgsql AS $func$ DECLARE d int := mmdd($1); d1 int := mmdd($1 + $2); BEGIN IF d1 > d THEN RETURN QUERY SELECT * FROM event e WHERE e.mmdd BETWEEN d AND d1 ORDER BY e.mmdd, e.event_date; -- including ORDER BY ELSE -- wrap around end of year RETURN QUERY SELECT * FROM event e WHERE e.mmdd >= d OR e.mmdd <= d1 ORDER BY (e.mmdd >= d) DESC, e.mmdd, event_date; -- chronological across turn of the year END IF; END $func$;
Call using defaults: 14 days beginning "today":
SELECT * FROM f_anniversary();
Call for 7 days beginning '2014-08-23':
SELECT * FROM f_anniversary('2014-08-23'::date, 7);
Quite a bit in this query I could comment on, but this post is already long enough.
SQLfiddle comparing EXPLAIN ANALYZE.
February 29
When dealing with anniversaries or "birthdays", you need to define how to deal with the special case February 29 in leap years.
When testing for ranges of dates, Feb 29
is usually included automatically, even if the current year is not a leap year. The range of days is extended by 1 retroactively when it covers this day.
On the other hand, if the current year is a leap year, and you want to look for 15 days, you may end up getting results for 14 days in leap years if your data is from non-leap years.
Say, Bob is born on the 29th of February:
My query 1. and 2. include February 29 only in leap years. Bob has birthday only every ~ 4 years.
My query 3. includes February 29 in the range. Bob has birthday every year.
There is no magical solution. You have to define what you want for every case.
Test
To substantiate my point I ran an extensive test with all the presented solutions. I adapted each of the queries to the given table and to yield identical results without ORDER BY
.
The good news: all of them are correct and yield the same result - except for Gordon's query that had syntax errors, and @wildplasser's query that fails when the year wraps around (easy to fix).
Insert 108000 rows with random dates from the 20th century, which is similar to a table of living people (13 or older).
INSERT INTO event (event_date) SELECT '2000-1-1'::date - (random() * 36525)::int FROM generate_series (1, 108000);
Delete ~ 8 % to create some dead tuples and make the table more "real life".
DELETE FROM event WHERE random() < 0.08; ANALYZE event;
My test case had 99289 rows, 4012 hits.
C - Catcall
WITH anniversaries as ( SELECT event_id, event_date ,(event_date + (n || ' years')::interval)::date anniversary FROM event, generate_series(13, 113) n ) SELECT event_id, event_date -- count(*) -- FROM anniversaries WHERE anniversary BETWEEN current_date AND current_date + interval '14' day;
C1 - Catcall's idea rewritten
Aside from minor optimizations, the major difference is to add only the exact amount of years date_trunc('year', age(current_date + 14, event_date))
to get this year's anniversary, which avoids the need for a CTE altogether:
SELECT event_id, event_date FROM event WHERE (event_date + date_trunc('year', age(current_date + 14, event_date)))::date BETWEEN current_date AND current_date + 14;
D - Daniel
SELECT * -- count(*) -- FROM event WHERE extract(month FROM age(current_date + 14, event_date)) = 0 AND extract(day FROM age(current_date + 14, event_date)) <= 14;
E1 - Erwin 1
See "1. Simple version" above.
E2 - Erwin 2
See "2. Advanced version" above.
E3 - Erwin 3
See "3. Black magic version" above.
G - Gordon
SELECT * -- count(*) FROM (SELECT *, to_char(event_date, 'MM-DD') AS mmdd FROM event) e WHERE to_date(to_char(now(), 'YYYY') || '-' || (CASE WHEN mmdd = '02-29' THEN '02-28' ELSE mmdd END) ,'YYYY-MM-DD') BETWEEN date(now()) and date(now()) + 14;
H - a_horse_with_no_name
WITH upcoming as ( SELECT event_id, event_date ,CASE WHEN date_trunc('year', age(event_date)) = age(event_date) THEN current_date ELSE cast(event_date + ((extract(year FROM age(event_date)) + 1) * interval '1' year) AS date) END AS next_event FROM event ) SELECT event_id, event_date FROM upcoming WHERE next_event - current_date <= 14;
W - wildplasser
CREATE OR REPLACE FUNCTION this_years_birthday(_dut date) RETURNS date AS $func$ DECLARE ret date; BEGIN ret := date_trunc( 'year' , current_timestamp) + (date_trunc( 'day' , _dut) - date_trunc( 'year' , _dut)); RETURN ret; END $func$ LANGUAGE plpgsql;
Simplified to return the same as all the others:
SELECT * FROM event e WHERE this_years_birthday( e.event_date::date ) BETWEEN current_date AND current_date + '2weeks'::interval;
W1 - wildplasser's query rewritten
The above suffers from a number of inefficient details (beyond the scope of this already sizable post). The rewritten version is much faster:
CREATE OR REPLACE FUNCTION this_years_birthday(_dut INOUT date) AS $func$ SELECT (date_trunc('year', now()) + ($1 - date_trunc('year', $1)))::date $func$ LANGUAGE sql; SELECT * FROM event e WHERE this_years_birthday(e.event_date) BETWEEN current_date AND (current_date + 14);
Test results
I ran this test with a temporary table on PostgreSQL 9.1.7. Results were gathered with EXPLAIN ANALYZE
, best of 5.
Results
Without index C: Total runtime: 76714.723 ms C1: Total runtime: 307.987 ms -- ! D: Total runtime: 325.549 ms E1: Total runtime: 253.671 ms -- ! E2: Total runtime: 484.698 ms -- min() & max() expensive without index E3: Total runtime: 213.805 ms -- ! G: Total runtime: 984.788 ms H: Total runtime: 977.297 ms W: Total runtime: 2668.092 ms W1: Total runtime: 596.849 ms -- ! With index E1: Total runtime: 37.939 ms --!! E2: Total runtime: 38.097 ms --!! With index on expression E3: Total runtime: 11.837 ms --!!
All other queries perform the same with or without index because they use non-sargable expressions.
Conclusio
So far, @Daniel's query was the fastest.
@wildplassers (rewritten) approach performs acceptably, too.
@Catcall's version is something like the reverse approach of mine. Performance gets out of hand quickly with bigger tables.
The rewritten version performs pretty well, though. The expression I use is something like a simpler version of @wildplassser'sthis_years_birthday()
function.My "simple version" is faster even without index, because it needs fewer computations.
With index, the "advanced version" is about as fast as the "simple version", because
min()
andmax()
become very cheap with an index. Both are substantially faster than the rest which cannot use the index.My "black magic version" is fastest with or without index. And it is very simple to call.
With a real life table an index will make even greater difference. More columns make the table bigger, and sequential scan more expensive, while the index size stays the same.
Fatal error: Call to a member function getElementsByTagName() on a non-object in D:\XAMPP INSTALLASTION\xampp\htdocs\endunpratama9i\www-stackoverflow-info-proses.php on line 72
0 comments:
Post a Comment