<div dir="ltr">Background: I&#39;m the primary developer for <a href="http://nodatime.org">Noda Time</a> which consumes the tz data. I&#39;m currently refactoring the code to do this... and I&#39;ve come across some code (originally ported from Joda Time) which I now understand in terms of what it&#39;s doing, but not exactly why.<div><br></div><div>For a little while now, the Noda Time source repo has included a <a href="https://github.com/nodatime/nodatime/blob/master/src/NodaTime.Test/TestData/tzdb-dump.txt">text dump file</a>, containing a text dump of every transition (up to 2100, at the moment) for every time zone. It looks like this, picking just one example:</div><div><pre style="color:rgb(0,0,0)">Zone: Africa/Maseru

LMT: [StartOfTime, 1892-02-07T22:08:00Z) +01:52 (+00)

SAST: [1892-02-07T22:08:00Z, 1903-02-28T22:30:00Z) +01:30 (+00)

SAST: [1903-02-28T22:30:00Z, 1942-09-20T00:00:00Z) +02 (+00)

SAST: [1942-09-20T00:00:00Z, 1943-03-20T23:00:00Z) +03 (+01)

SAST: [1943-03-20T23:00:00Z, 1943-09-19T00:00:00Z) +02 (+00)

SAST: [1943-09-19T00:00:00Z, 1944-03-18T23:00:00Z) +03 (+01)

SAST: [1944-03-18T23:00:00Z, EndOfTime) +02 (+00)</pre>I use this file for confidence when refactoring my time zone handling code - if the new code comes up with the same set of transitions as the old code, it&#39;s probably okay. (This is just one line of defence, of course - there are unit tests, though not as many as I&#39;d like.)</div><div><br></div><div>It strikes me that having a similar file (I&#39;m not wedded to the format, but it should have all the same information, one way or another) released alongside the main data files would be really handy for <i>all</i> implementors - it would be a good way of validating consistency across multiple platforms, with the release data being canonical. For any platforms which didn&#39;t want to actually consume the rules as rules, but just wanted a list of transitions, it could even effectively replace their use of the data.</div><div><br></div><div>One other benefit: diffing the dump between two releases would make it clear what had changed in <i>effect</i>, rather than just in terms of rules.</div><div><br></div><div>One sticking point is size. The current file for Noda Time is about 4MB, although it zips down to about 300KB. Some thoughts around this:</div><div><ul><li>We wouldn&#39;t need to distribute it in the same file as the data - just as we have separate data and code files, there could be a &quot;textdump&quot; file or whatever we&#39;d want to call it. 
These could be retroactively generated for previous releases, too.</li><li>As you can see, there&#39;s redundancy in the format above, in that it&#39;s a list of &quot;zone intervals&quot; (as I call them in Noda Time) rather than a list of transitions - the end of each interval is always the start of the next interval.</li><li>For zones which settle into an infinite daylight saving pattern, I currently generate from the start of time to 2100 (and then a single zone interval for the end of time as Noda Time understands it; we&#39;d need to work out what form that would take, if any). If we decided that &quot;year of release + 30 years&quot; was enough, that would cut down the size considerably.</li></ul><div>Any thoughts? If the feeling is broadly positive, the next step would be to nail down the text format, then find a willing victim/volunteer to write the C code. (You really don&#39;t want me writing C...)</div></div><div><br></div><div>Jon</div><div><br></div></div>
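<div>To illustrate the redundancy point in the list above, here is a minimal Python sketch (not part of Noda Time; <code>parse_zone</code>, <code>to_transitions</code> and <code>Interval</code> are hypothetical names) that parses zone-interval lines in the format shown and collapses them into a list of transitions. It assumes the two trailing fields are the wall offset and the daylight saving component, and that each interval starts exactly where the previous one ends.</div>

```python
import re
from typing import NamedTuple

# Matches e.g. "SAST: [1892-02-07T22:08:00Z, 1903-02-28T22:30:00Z) +01:30 (+00)"
LINE_RE = re.compile(
    r"^(?P<name>[^:]+): \[(?P<start>[^,]+), (?P<end>[^)]+)\) "
    r"(?P<offset>\S+) \((?P<savings>[^)]+)\)$"
)


class Interval(NamedTuple):
    name: str
    start: str    # instant as text, or "StartOfTime"
    end: str      # instant as text, or "EndOfTime"
    offset: str   # assumed: total (wall) offset from UTC
    savings: str  # assumed: daylight saving part of that offset


def parse_zone(text):
    """Parse one 'Zone:' block; return (zone_id, list of Interval)."""
    zone_id = None
    intervals = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # the dump shown has blank lines between entries
        if line.startswith("Zone: "):
            zone_id = line[len("Zone: "):]
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        intervals.append(Interval(**match.groupdict()))
    return zone_id, intervals


def to_transitions(intervals):
    """Drop the redundant end of each interval, keeping only its start."""
    transitions = []
    previous_end = None
    for interval in intervals:
        # Each interval should begin where the previous one ended.
        if previous_end is not None and interval.start != previous_end:
            raise ValueError(f"gap or overlap before {interval.start}")
        transitions.append(
            (interval.start, interval.name, interval.offset, interval.savings)
        )
        previous_end = interval.end
    return transitions


SAMPLE = """\
Zone: Africa/Maseru
LMT: [StartOfTime, 1892-02-07T22:08:00Z) +01:52 (+00)
SAST: [1892-02-07T22:08:00Z, 1903-02-28T22:30:00Z) +01:30 (+00)
SAST: [1903-02-28T22:30:00Z, 1942-09-20T00:00:00Z) +02 (+00)
"""

if __name__ == "__main__":
    zone_id, intervals = parse_zone(SAMPLE)
    for transition in to_transitions(intervals):
        print(zone_id, *transition)
```

<div>The result has one entry per interval, with the first keyed by <code>StartOfTime</code>. Comparing these lists for the same zone across two releases would be one way to do the &quot;changed in effect&quot; diff, though a plain text diff of the dump files would work as well.</div>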