I’m storing two dates in a PostgreSQL database. The first is the date of a visit to a webpage, and the second is the date the webpage was last modified (this is obtained as a long).
I’m not sure what the best strategy is for storing these values.
I only need day/month/year and hour:seconds, and this will only be used for statistical purposes.
So, some questions:
- Is it better to store the value as a long and convert it when reading it back, or to store it in the date format above?
- Is it better to set the date of the visit in the software, or at insertion time in the database?
- In Java, what are the best classes for handling dates?
Answer
Any strategy for storing date-and-time data in PostgreSQL should, IMO, rely on these two points:
- Your solution should never depend on the server or client timezone setting.
- Currently, PostgreSQL (like most databases) doesn’t have a datatype to store a full date-and-time with timezone. So, you need to decide between an Instant or a LocalDateTime datatype.
My recipe follows.
If you want to record the physical instant at which a particular event occurred (a true “timestamp”, typically some creation/modification/deletion event), then use:
- Java: Instant (Java 8, or Joda-Time).
- JDBC: java.sql.Timestamp
- PostgreSQL: TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ)
(Don’t let PostgreSQL’s peculiar datatype names WITH TIME ZONE / WITHOUT TIME ZONE confuse you: neither of them actually stores a timezone.)
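The question mentions that the last-modification date arrives as a long. A minimal sketch of turning that into an Instant follows, assuming the long counts milliseconds since 1970-01-01T00:00:00Z (the question does not say so; that is the convention of, for example, File.lastModified()). The class and method names here are hypothetical.

```java
import java.time.Instant;

final class EpochMillisToInstant {

    // Assumption: the long is milliseconds since the Unix epoch (UTC).
    static Instant fromEpochMillis(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    public static void main(String[] args) {
        long lastModified = 1_320_019_170_000L; // sample value for illustration only
        Instant instant = fromEpochMillis(lastModified);
        System.out.println(instant);
    }
}
```

The resulting Instant can then go through the same path as any other timestamp, so nothing needs to be stored as a raw long.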
Some boilerplate code: the following assumes that ps is a PreparedStatement, rs is a ResultSet, and tzUTC is a static Calendar object corresponding to the UTC timezone.
public static final Calendar tzUTC = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
Write Instant to database TIMESTAMPTZ:
Instant instant = ...;
Timestamp ts = instant != null ? Timestamp.from(instant) : null;
ps.setTimestamp(col, ts, tzUTC); // column is TIMESTAMPTZ!
Read Instant from database TIMESTAMPTZ:
Timestamp ts = rs.getTimestamp(col, tzUTC); // column is TIMESTAMPTZ
Instant inst = ts != null ? ts.toInstant() : null;
This works safely if your PG type is TIMESTAMPTZ. (In that case, tzUTC has no effect in that code, but it’s always advisable not to depend on default timezones.)
“Safely” means that the result will not depend on the server or database timezone, or on timezone information: the operation is fully reversible, and whatever happens to timezone settings, you’ll always get the same “instant of time” you originally had on the Java side.
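The Java half of that claim can be checked without a database. The sketch below is only an illustration: it converts an Instant to a java.sql.Timestamp under one JVM default timezone and converts it back under another. It does not exercise the JDBC driver or PostgreSQL, and the class name, method names and zone IDs are hypothetical choices.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.util.TimeZone;

final class InstantRoundTrip {

    // Same null-safe conversions as in the JDBC snippets, minus the database call.
    static Timestamp toTimestamp(Instant instant) {
        return instant != null ? Timestamp.from(instant) : null;
    }

    static Instant toInstant(Timestamp ts) {
        return ts != null ? ts.toInstant() : null;
    }

    // Converts under one default timezone, converts back under another,
    // then restores the JVM default that was in place before the call.
    static Instant roundTripAcrossZones(Instant original, String writeZone, String readZone) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(writeZone));
            Timestamp ts = toTimestamp(original);
            TimeZone.setDefault(TimeZone.getTimeZone(readZone));
            return toInstant(ts);
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        Instant original = Instant.parse("2011-10-30T23:59:30Z");
        Instant back = roundTripAcrossZones(original, "America/Sao_Paulo", "Asia/Tokyo");
        System.out.println(original.equals(back));
    }
}
```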
If, instead of a timestamp (an instant on the physical timeline), you are dealing with a “civil” local date-time (that is, the set of fields {year-month-day hour:min:sec(:msecs)}), you’d use:
- Java: LocalDateTime (Java 8, or Joda-Time).
- JDBC: java.sql.Timestamp
- PostgreSQL: TIMESTAMP WITHOUT TIME ZONE (TIMESTAMP)
Read LocalDateTime from database TIMESTAMP:
Timestamp ts = rs.getTimestamp(col, tzUTC);
LocalDateTime localDt = null;
if (ts != null)
    localDt = LocalDateTime.ofInstant(Instant.ofEpochMilli(ts.getTime()), ZoneOffset.UTC);
Write LocalDateTime to database TIMESTAMP:
Timestamp ts = null;
if (localDt != null)
    ts = new Timestamp(localDt.toInstant(ZoneOffset.UTC).toEpochMilli());
ps.setTimestamp(colNum, ts, tzUTC);
Again, this strategy is safe and you can sleep peacefully: if you stored 2011-10-30 23:59:30, you’ll always retrieve those precise fields (hour=23, minute=59, etc.), no matter what, even if tomorrow the timezone of your PostgreSQL server (or client) changes, or your JVM or OS timezone changes, or your country modifies its DST rules.
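As a rough check of the Java side only (no database involved), the following sketch pins a LocalDateTime to UTC when converting to a Timestamp and unpins it the same way when converting back, under two different JVM default timezones. It is a variation on the snippets above, not the same code: it goes through Timestamp.from and toInstant instead of epoch milliseconds, so sub-millisecond digits are kept. The class name, method names and zone IDs are hypothetical.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.TimeZone;

final class LocalDateTimeRoundTrip {

    // Treats the civil fields as if they were UTC; no default timezone is consulted.
    static Timestamp toTimestamp(LocalDateTime localDt) {
        if (localDt == null) return null;
        return Timestamp.from(localDt.toInstant(ZoneOffset.UTC));
    }

    // Inverse of toTimestamp: reads the fields back out in UTC.
    static LocalDateTime toLocalDateTime(Timestamp ts) {
        if (ts == null) return null;
        return LocalDateTime.ofInstant(ts.toInstant(), ZoneOffset.UTC);
    }

    // Writes under one JVM default timezone and reads under another.
    static LocalDateTime roundTripAcrossZones(LocalDateTime original, String writeZone, String readZone) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(writeZone));
            Timestamp ts = toTimestamp(original);
            TimeZone.setDefault(TimeZone.getTimeZone(readZone));
            return toLocalDateTime(ts);
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        LocalDateTime stored = LocalDateTime.of(2011, 10, 30, 23, 59, 30);
        LocalDateTime read = roundTripAcrossZones(stored, "Europe/Madrid", "America/New_York");
        System.out.println(stored.equals(read));
    }
}
```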
Added: If you want (it seems a natural requirement) to store the full datetime specification (a ZonedDateTime: the timestamp together with the timezone, which implicitly also includes the full civil datetime info)… then I have bad news for you: PostgreSQL doesn’t have a datatype for this (nor do other databases, to my knowledge). You must devise your own storage, perhaps in a pair of fields: it could be the two types above (highly redundant, though efficient for retrieval and calculation), or one of them plus the time offset (you lose the timezone info, some calculations become difficult, and some impossible), or one of them plus the timezone (as a string; some calculations can be extremely costly).
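To make the last option concrete, here is a minimal sketch of splitting a ZonedDateTime into an Instant plus a timezone ID string and joining them again. The ZonedPair class and its field names are hypothetical; mapping the two fields to, say, a TIMESTAMPTZ column and a text column is an assumption and is not shown.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ZonedPair {
    final Instant instant; // assumed to be stored in a TIMESTAMPTZ column
    final String zoneId;   // assumed to be stored in a text column

    ZonedPair(Instant instant, String zoneId) {
        this.instant = instant;
        this.zoneId = zoneId;
    }

    // Splits the zoned value into the two storable parts.
    static ZonedPair split(ZonedDateTime zdt) {
        return new ZonedPair(zdt.toInstant(), zdt.getZone().getId());
    }

    // Rebuilds the zoned value; the civil fields are recomputed from the zone rules.
    ZonedDateTime join() {
        return ZonedDateTime.ofInstant(instant, ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        ZonedDateTime original =
                ZonedDateTime.of(2011, 10, 30, 23, 59, 30, 0, ZoneId.of("Europe/Lisbon"));
        ZonedPair pair = split(original);
        System.out.println(pair.join().equals(original));
    }
}
```

Because the civil fields are recomputed from the timezone rules on every read, this layout keeps the instant stable even if those rules change later, at the cost of a lookup per row.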