The year is 2004. Google launches Gmail; Google Earth will follow a year later. In two years, Facebook will move to open registration and transform social communication. The iPhone is three years away. The first release of the iPad is still six years away. It’s a leap year.
At Parliament House in Canberra, a selection of politicians and geospatial industry stakeholders assemble to mark the birth of something new – a Geocoded National Address File for Australia. They call this new thing ‘G-NAF’, and discuss the potential future applications of a world-leading processing methodology capable of building a quality-assured national address index from ten diverse data sources.
Fast forward to 2014: PSMA Australia is celebrating the 10th birthday of G-NAF. Ten years is not a long time… but in this hectic 21st century, so much can change! It can be a revelation to cast our minds back and realise that even ten years ago, we had access to significantly different technology and had a different set of expectations for our digital platforms. Within this time, PSMA’s G-NAF has evolved from an unknown quantity utilised only by its patrons, the Australian Bureau of Statistics (ABS), the Australian Electoral Commission (AEC) and Australia Post, into a dataset utilised widely throughout the economy.
In 2013, the Australian Government announced its intention to explore options for providing open access to G-NAF – making it freely available to all Australians. This was recognition of G-NAF’s fundamental place within the Australian digital economy and a sign of the growing acknowledgement that some flavours of national geospatial data are an essential digital infrastructure – an invisible and virtual parallel to our physical infrastructures such as the roads we drive on and the utilities that provide us with essential services.
Addressing the challenge of addressing
This virtual nature of the address has created some interesting challenges for authoritative addressing. This is because the address itself is simply an idea: an artefact of human communication; a shared syntax for referring to location; an agreement between our institutions regarding names, numbers, and invisible administrative boundaries.
Given its human origins, we must understand that the address is transient and subjective – and may or may not be accepted by all in the community all of the time. Many people will, either intentionally or unintentionally, ‘select’ their address based upon social and cultural preferences rather than an official reference. Because people will adopt an unofficial address and use it consistently enough that it appears in government and commercial databases, the address becomes a data type that lacks the consistency required for rigorous matching and analysis. Compounding this, thousands of new Australian addresses are captured each week by multiple organisations and stored in a variety of formats, generating a multitude of raw address datasets that vary widely in content, quality and accuracy.
In the years leading up to the launch of G-NAF, the will to create a geocoded addressing reference that users could have confidence in compelled PSMA Australia stakeholders to accept the challenge of designing a process that would match and resolve the range of official addresses with their unofficial aliases, and then assign an officially recognised geocode.
Olaf Hedberg (Chairman, PSMA Australia, 2001-2012) remembers the genesis of G-NAF as “All about taking a risk. What we were proposing had not been done before – anywhere. It was ambitious and there was no way to really know what it would cost or if it would work. While there was strong support for the concept, the risks made it virtually impossible for the stakeholders in the initiative to fund it. Ultimately the Board was left with the choice: either to proceed with the venture carrying all the cost or to let the concept lapse. The decision was made to provide funding and time has proven this to be the right decision. This was possible because of how PSMA Australia was established and the monetary reserves that it held.”
Once support was established, the vision for the methodology that would build G-NAF was a simple instruction: construct a comprehensive database of addressing knowledge that connects the officially recognised address of government, the commonly used address adopted by citizens, and the precise latitude and longitude of the geocode. It took six years of research, including three pilots and a feasibility study, followed by two years of development to create it. The development phase was undertaken with the support and assistance of Australia Post, the AEC, the ABS and the land administration agencies of the states and territories. The feasibility study that devised the methodology was developed by a team within Geometry Pty Ltd, headed by Ashley Maher; while the implementation and ongoing maintenance were performed by a team at Logica Australia, headed by Brian Marwick. It was a unique and ambitious project.
Ashley Maher remembers the development of G-NAF as “A fascinating project – and not a simple matter of automating an existing process. I recall our developers analysing the structure of an address in depth and employing a range of geometric principles, prompting extensive debate around the coffee machine. Interestingly, the fundamental address matching and geocoding philosophies deployed 10 years ago are still valid today, although considerable peripheral processes have been added to improve the quality of the final product.”
Brian Marwick has reflected on the first outputs of G-NAF: “What amazed us when merging data from the jurisdictions, AEC and Australia Post for the first time was the lack of alignment between the addresses from the contributors and the gazetted localities in each state and territory. The discrepancy was far greater than expected. The establishment of a process that generates rules to align addresses with the reference datasets proved to be a significant success, and has provided the rich alias tables that exist today.”
Speaking for the Geometry team, Maher has stated: “We are grateful to PSMA Australia for the opportunity to participate in this stimulating design and development challenge. It is pleasing to take a step back and see the impact G-NAF has had on the spatial and business communities in Australia.”
The end product was a world-leading processing methodology that assembles national data from a set of ten authoritative but relatively independent datasets, each with its own strengths and weaknesses. The process differs from the normal approach, which is to hold one dataset as a reference to which others are compared. In the case of G-NAF, all contributing datasets are weighted equally.
G-NAF begins by testing the logical consistency of every address from every contributor and comparing the address components against the other geospatial datasets managed by PSMA. These components are then confirmed as valid: first the locality of the address is confirmed, and then the road name and road type of the address are confirmed to exist within that locality. Addresses that fail this test are subject to a variety of processes resulting in the generation of rules and the population of alias tables. Thus, the logic and consistency of addresses is tested and an accurate geocode is allocated. This, in turn, makes it possible to merge apparently identical addresses from different contributors and to assess address usage based on the number of occurrences of an address in different datasets.
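The validation and merging steps just described can be illustrated with a small sketch. It is not PSMA's implementation: the reference data, the alias table, the contributor names and the functions `validate` and `merge` are all hypothetical, and geocode allocation is left out. It only shows the order of the checks (locality first, then road name and type within the locality) and how treating every contributor equally lets usage be measured by counting the sources that supplied an address.

```python
from collections import defaultdict

# Hypothetical reference data standing in for gazetted localities and the
# roads known to exist within each one.
LOCALITY_ROADS = {
    "BRADDON": {("LONSDALE", "STREET"), ("MORT", "STREET")},
    "CANBERRA": {("LONDON", "CIRCUIT")},
}

# Hypothetical alias table: commonly used locality -> gazetted locality.
LOCALITY_ALIASES = {"CIVIC": "CANBERRA"}


def validate(address):
    """Return the address under its gazetted locality, or None if it fails."""
    number, road_name, road_type, locality = address
    # Resolve an unofficial locality name through the alias table.
    locality = LOCALITY_ALIASES.get(locality, locality)
    roads = LOCALITY_ROADS.get(locality)
    if roads is None:
        return None  # locality is not recognised
    if (road_name, road_type) not in roads:
        return None  # road name and type do not exist in this locality
    return (number, road_name, road_type, locality)


def merge(contributions):
    """Merge validated addresses, recording which contributors supplied each.

    No contributor is treated as the reference: every source is checked
    against the same reference data and counted equally.
    """
    merged = defaultdict(set)
    rejected = []  # candidates for further rule and alias generation
    for contributor, addresses in contributions.items():
        for address in addresses:
            valid = validate(address)
            if valid is None:
                rejected.append((contributor, address))
            else:
                merged[valid].add(contributor)
    return dict(merged), rejected


contributions = {
    "jurisdiction": [("1", "LONDON", "CIRCUIT", "CANBERRA")],
    "electoral_roll": [("1", "LONDON", "CIRCUIT", "CIVIC")],
    "postal": [
        ("1", "LONDON", "CIRCUIT", "CANBERRA"),
        ("9", "MORT", "ROAD", "BRADDON"),
    ],
}
merged, rejected = merge(contributions)
for address, sources in merged.items():
    print(address, "supplied by", len(sources), "contributors")
```

In this sketch the three records for 1 London Circuit collapse into one address once the alias is resolved, while the record with an unknown road type is set aside rather than discarded, mirroring the way failed addresses feed the rule and alias processes.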
A remarkable outcome of this process is the evidence that no single custodian holds all the valid addresses for the country. Each custodian is biased by their core business and consequently, addresses that are not useful to the business will fall through the gaps. For the AEC, it is addresses that do not correspond to a residence. For Australia Post, it is addresses that do not receive mail. For the state and territory governments, it is sub-address units within multi-address sites that often fall outside the land registration system.
This fragmented nature of address collection makes the quality of G-NAF processing and its continuous improvement model especially critical: each G-NAF release is enhanced and modified depending on the outputs from the previous iterations, with each maintenance update removing further anomalies and discrepancies from the candidate data. However, the inexorable law of diminishing returns applies to this improvement process, making successive changes of the same magnitude harder, more complex and more expensive to achieve.
This is a major reason why the G-NAF of the future will need to be different to the G-NAF of the past. If G-NAF is to keep pace with maturing market demands, a step change in address processing is required to deliver further improvements.
G-NAF is used extensively across government and industry, underpinning many policy and business systems that provide important services to Australians including emergency response, social services, insurance, telecommunications and navigation. However, the weeks of time and effort associated with the current maintenance approach, while leading to unsurpassed quality, make G-NAF unsuitable for an update schedule more frequent than quarterly.
The longer-term vision for G-NAF, however, is not only to achieve continuous maintenance with the same robust approach, but also to incorporate real-time address validation failure notifications from client address validation services. Such an approach closes the loop on the address maintenance process, empowers the citizen and guarantees the highest levels of quality and currency, as are expected of G-NAF.
This is an ambitious vision. And while its realisation is some way off, numerous steps towards it have already been taken. The most significant of these steps will be the launch of G-NAF Live before the end of the 2014 financial year. G-NAF Live closes the G-NAF currency gap by providing web-based address verification for the addresses changed or added since the date of the last G-NAF supply. This will ensure that the most current and authoritative geocoded addresses are available. The service references a continuously maintained resource of jurisdictional data with a refresh rate equal to the highest update rate available in each jurisdiction (in some cases, daily), works with G-NAF and maintains linkages to other PSMA Australia datasets.
G-NAF Live does not incorporate data from the AEC or Australia Post – referencing the most relevant address attributes rather than the full attribution of G-NAF. Through the multiple award-winning PSMA Systems web interface, it allows users to customise and orchestrate address verification services with OGC Web Feature and Web Map services (WFS and WMS) into a single consumable workflow.
Geospatial futures and G-NAF
The G-NAF technology has enabled efficiency, accuracy and innovation for many vital functions across industry and government. Yet, as transformative as the introduction of G-NAF has been, PSMA Australia believes that the next ten years will be equally transformative. The G-NAF that has served us this past decade is nearing the end of its life – but the future of the geocoded address reference is already emerging and evolving to support the growing range of requirements within the digital knowledge economy.
Rivers of data flow through our economies, and within the previous decade, the capacity to collect and use this data for future business and service delivery design has multiplied exponentially. Data users are looking for data that is structurally as well as financially accessible. They are looking to layer diverse data types into intelligent services to create knowledge and value. They are looking to do this efficiently, and in anticipation of market needs. But above all else, they value reliable data and ask for quality and certainty as sustaining benchmarks.
“Consequently,” says PSMA Australia’s current Chairman, Glenn Appleyard, “PSMA Australia is driving forward with G-NAF Live and other ‘data as a service’ offerings, while continuing to explore the fringes of geocoded addressing and geospatial data management as we know it. With the growing interest in location data beyond specialist industries, PSMA has actively engaged with major users of location data as well as their value added reseller network to understand the market’s future needs for foundation spatial data.”
PSMA Australia CEO Daniel Paull added that “the most useful approach to our Australian geospatial future must be one that recognises that the digital age of geospatial is very early in its lifecycle. And as far as G-NAF has come in these past ten years, there is so much more ground ahead for this foundation data resource – as there is for the geospatial industry itself…
“But in the meantime, Happy 10th Birthday G-NAF – and many more to come!”
 The Australian Government announced its intention to provide G-NAF as an open access resource through its 2013 strategy paper Advancing Australia as a Digital Economy: An Update to the National Digital Economy Strategy, Action 11.