ATIP Tools Tech Corner - Introduction to the ATIP Online Request Service and the use of artificial intelligence

Introduction and background

Welcome to the ATIP Tools Tech Corner, where information and updates about the new ATIP Online Request Service (AORS) are shared.

The AORS is a simple, centralized website that enables users to complete access to information and personal information requests and submit them to any of the institutions that are subject to the Government of Canada’s Access to Information Act and Privacy Act.

Onboarding institutions

The AORS went live in October 2018 with 6 institutions. All institutions subject to the Access to Information Act and the Privacy Act will continue to be onboarded to the application.

What is “onboarding”?

"Onboarding" in this context means that the institutions will be set up to leverage all the features of the application, and users will be able to send initial access to information and privacy information requests to them through the application.

ATIP Online Request Service (AORS) onboarding data

The following is an update on the state of onboarding:

Onboarding target number: 265
Number of institutions onboarded to date: 194
Number of institutions in progress: 71
See detailed list of onboarded institutions to date
  1. Administrative Tribunals Support Services of Canada
    • Canada Agricultural Review Tribunal
    • Canada Industrial Relations Board
    • Canadian Cultural Property Export Review Board
    • Canadian International Trade Tribunal
    • Competition Tribunal
    • Human Rights Tribunal of Canada
    • Public Servants Disclosure Protection Tribunal Canada
    • Public Service Labour Relations and Employment Board
    • Registry of the Specific Claims Tribunal of Canada
    • Social Security Tribunal of Canada
    • Transportation Appeal Tribunal of Canada
  2. Agriculture and Agri-Food Canada
  3. Asia-Pacific Foundation of Canada
  4. Atlantic Canada Opportunities Agency
  5. Atlantic Pilotage Authority Canada
  6. British Columbia Treaty Commission
  7. Canada Council for the Arts
  8. Canada Deposit Insurance Corporation
  9. Canada Development Investment Corporation
    • Canada Eldor Inc.
    • Canada Hibernia Holding Corporation
  10. Canada Economic Development for Quebec Regions
  11. Canada Energy Regulator
  12. Canada Foundation for Innovation
  13. Canada Post
    • 2875039 Canada Limited
    • 3906949 Canada Inc
  14. Canada School of Public Service
  15. Canada-Nova Scotia Offshore Petroleum Board
  16. Canadian Centre for Occupational Health and Safety
  17. Canadian Food Inspection Agency
  18. Canadian Grain Commission
  19. Canadian Heritage
  20. Canadian Human Rights Commission
  21. Canadian Institutes of Health Research
  22. Canadian Museum of History
  23. Canadian Northern Economic Development Agency
  24. Canadian Nuclear Safety Commission
  25. Canadian Race Relations Foundation
  26. Canadian Radio-television and Telecommunications Commission
  27. Canadian Space Agency
  28. Canadian Transportation Agency
  29. Civilian Review and Complaints Commission for the Royal Canadian Mounted Police
  30. Communications Security Establishment Canada
  31. Copyright Board Canada
  32. Correctional Investigator Canada
  33. Crown-Indigenous Relations and Northern Affairs Canada
  34. Department of Finance Canada
  35. Department of Justice Canada
  36. Elections Canada
  37. Environment and Climate Change Canada
  38. Farm Products Council of Canada
  39. Federal Bridge Corporation
    • Seaway International Bridge Corporation
  40. Federal Economic Development Agency for Southern Ontario
  41. Federal Public Service Health Care Plan Administration Authority
  42. Financial Consumer Agency of Canada
  43. Financial Transactions and Reports Analysis Centre of Canada
  44. First Nations Tax Commission
  45. Fisheries and Oceans Canada
  46. Global Affairs Canada
  47. Gwich'in Land and Water Board
  48. Halifax Port Authority
  49. Hamilton Port Authority
  50. Health Canada
  51. Historic Sites and Monuments Board of Canada
  52. Immigration and Refugee Board of Canada
  53. Impact Assessment Agency
  54. Indigenous Services Canada
  55. Infrastructure Canada
  56. Ingenium – Canada’s Museums of Science and Innovation
  57. Innovation, Science and Economic Development Canada
  58. Mackenzie Valley Land and Water Board
  59. Military Grievances External Review Committee
  60. Military Police Complaints Commission
  61. Nanaimo Port Authority
  62. National Battlefields Commission
  63. National Film Board of Canada
  64. National Research Council Canada
  65. Natural Resources Canada
    • Energy Supplies Allocation Board
  66. Northern Pipeline Agency Canada
  67. Nunavut Impact Review Board
  68. Nunavut Water Board
  69. Office of the Administrator of the Fund for Railway Accidents Involving Designated Goods
  70. Office of the Administrator of the Ship-source Oil Pollution Fund
  71. Office of the Commissioner of Lobbying of Canada
  72. Office of the Commissioner of Official Languages
  73. Office of the Information Commissioner of Canada
  74. Office of the Privacy Commissioner of Canada
  75. Office of the Public Sector Integrity Commissioner of Canada
  76. Office of the Superintendent of Financial Institutions Canada
  77. Office of the Veterans Ombudsman
  78. Pacific Pilotage Authority Canada
  79. Parks Canada Agency
  80. Parole Board of Canada
  81. Patented Medicine Prices Review Board
  82. Pierre Elliott Trudeau Foundation
  83. Polar Knowledge Canada
  84. Port Alberni Port Authority
  85. Privy Council Office
  86. Public Health Agency of Canada
  87. Public Prosecution Service of Canada
  88. Public Safety Canada
  89. Public Sector Pension Investment Board
    • 3Net Indy Holdings
    • 3Net Indy Investments Inc.
    • 7986386 Canada Inc.
    • 8599963 Canada Inc.
    • Argentia Private Investments
    • AviAlliance Canada Inc.
    • Indo-Infra Inc.
    • Infra H20 GP Partners Inc.
    • Infra H20 LP Partners Inc.
    • Infra TM Investments Inc.
    • Infra-PSP Canada Inc.
    • Infra-PSP Credit Inc.
    • Infra-PSP ECEF Inc.
    • Infra-PSP Partners Inc.
    • Ivory Private Investments Inc.
    • Kings Island Private Investments Inc.
    • Northern Fjord Holdings Inc.
    • Port-aux-Choix Private Investments Inc.
    • Potton Holdings Inc.
    • PSP Capital Inc.
    • PSP Finco Inc.
    • PSP H2O FL GP INC.
    • PSP Public Credit I Inc.
    • PSP Public Credit Opportunities Inc.
    • PSP Public Markets Inc.
    • PSPIB Baltimore G.P. Inc.
    • PSPIB Bromont Investments Inc.
    • PSPIB Deep South Inc.
    • PSPIB DevCol Inc.
    • PSPIB Emerald Inc.
    • PSPIB G.P. Finance Inc.
    • PSPIB G.P. Inc.
    • PSPIB G.P. Partners Inc.
    • PSPIB Golden Range Cattle II Inc.
    • PSPIB Golden Range Cattle Inc.
    • PSPIB Homes Inc.
    • PSPIB IRP60 Inc.
    • PSPIB Michigan G.P. Inc.
    • PSPIB Orchid Inc.
    • PSPIB Paisas Inc.
    • PSPIB Pennsylvania Investments Inc.
    • PSPIB WEXFORD INVESTMENTS INC.
    • PSPIB-Andes Inc.
    • PSPIB-CCR Inc.
    • PSPIB-Condor Inc.
    • PSPIB-Eldorado Inc.
    • PSPIB-LSF Inc.
    • PSPIB-Newbury G.P. Inc.
    • PSPIB-RE Finance Inc.
    • PSPIB-RE Finance Partners II Inc.
    • PSPIB-RE Finance Partners Inc.
    • PSPIB-RE MANCHESTER INC.
    • PSPIB-RE Partners II Inc.
    • PSPIB-RE Partners Inc.
    • PSPIB-RE UK Inc.
    • PSPIB-SDL Inc.
    • PSPIB-Star Inc.
    • Red Isle Private Investments Inc.
    • Revera Inc.
    • Trinity Bay Private Investments Inc.
    • VOP Investments Inc.
  90. Public Service Commission of Canada
  91. Public Services and Procurement Canada
  92. RCMP External Review Committee
  93. Royal Canadian Mint
  94. Saguenay Port Authority
  95. Sahtu Land and Water Board
  96. Sahtu Land Use Planning Board
  97. Security Intelligence Review Committee
  98. Sept-Îles Port Authority
  99. Shared Services Canada
  100. Social Sciences and Humanities Research Council of Canada
  101. St. John’s Port Authority
  102. Statistics Canada
  103. Sustainable Development Technology Canada
  104. Telefilm Canada
  105. Thunder Bay Port Authority
  106. Toronto Port Authority
  107. Transportation Safety Board of Canada
  108. Treasury Board of Canada Secretariat
  109. Vancouver Fraser Port Authority
  110. Veterans Affairs Canada
  111. Veterans Review and Appeal Board Canada
  112. Western Economic Diversification Canada
  113. Windsor Port Authority
  114. Windsor-Detroit Bridge Authority
  115. Women and Gender Equality Canada
  116. Yukon Surface Rights Board

Institutions onboarded from the IRCC ATIP Online Pilot

Institutions that have been hosted by the IRCC ATIP Online Pilot are being migrated to the AORS.

Total number of institutions previously available on the pilot: 33
Total number of institutions onboarded on AORS from the pilot: 27
Number of institutions in progress: 6

Artificial intelligence

In this update, we will explain how the ATIP Online Request Service is leveraging artificial intelligence (AI).

This update is also one of our first efforts to explain the use of AI in government, so please give us feedback on how we can make this as clear as possible. Contact us at open.ouvert@tbs-sct.gc.ca.

What is the impact of our use of artificial intelligence?

To assess the impact of our use of AI, we have used the Algorithmic Impact Assessment Tool.

The assessment indicates that our use of AI has little socio-economic impact on citizens and little impact on government operations.

Using artificial intelligence

The search functionality provided on the AORS uses AI to improve the user experience.

The first instance where AI is used is when searching for information that may have already been released in response to another request. The search results are based on information readily available on the Open Government website.

The second instance where AI is used is when helping to identify which institution may have the information pertaining to the request. The search will recommend institutions that are most suitable for the type of request. The data used to make this recommendation comes from the following locations:

  • Open Government summaries
  • departmental reports
  • "scraping" on government websites
  • institutions’ ATIP web pages
  • Government of Canada taxonomies
  • unified master data organization schema
  • Part III of Departmental Results Reports for the 2016 to 2017 fiscal year

How are we using AI?

Ensuring that a web search finds all the correct documents can be a difficult task. The search system leverages machine learning to identify contextual and latent relationships that are more fundamental than keywords. To do this, the search looks at concepts and the relationship between past searches to improve result quality.

The search system uses advanced natural language processing and machine learning techniques to enhance searches across multiple sources. The solution can cover websites, forums or any other publicly accessible content. By going beyond simple word similarity and instead "understanding" the meaning of search terms, it can compare a user’s search needs to the corpus of documents in near real time and return the relevant documents, or components of documents, that relate to a given search query or comparable document.

Synonyms, abbreviations and typos often cause key documents to be overlooked. Using machine learning and natural language processing, the algorithm can read an entire corpus of documents (for example, an enterprise website, or a course curriculum with its textbooks and related documents). After reading the documents, the AI search system can semantically "understand" phrases and ideas in a way that keyword matching cannot.

AI algorithm

The following will give more technical information about the algorithm that was used:

  • category of algorithm used: natural language processing
  • models used: tf-idf (term frequency-inverse document frequency) and cosine similarity models
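As an illustration of the second model named above, the following is a minimal sketch of cosine similarity between two short texts. It assumes a simple whitespace tokenizer and raw word counts; the function name and the sample texts are hypothetical and are not taken from the AORS implementation.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the words that the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


query = "fishing licence records"
print(cosine_similarity(query, "fishing licence records"))    # identical texts
print(cosine_similarity(query, "records of fishing quotas"))  # partial overlap
print(cosine_similarity(query, "museum exhibit budget"))      # no shared words
```

The score is highest for identical texts, lower for partial overlap and zero when no words are shared. In practice the raw counts would be replaced by tf-idf weights.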

Tf-idf improvements

Tf-idf is a method to score how related two pieces of text are. The input text from the user will be matched against all documents previously released under ATIP. Public documents that have the highest scores will be suggested to the user.

The tf-idf algorithm begins by counting the words in the request that are also present in each public document. Each match is then weighted down according to how common the matched word is across all documents. This weighting reduces noise: common words such as "Canada" are likely to match many documents regardless of the ATIP request, so a match on them carries little information. It is more valuable to know that a less common word is found in both the user submission and the publicly available document.
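The scoring described above can be sketched as follows. This is a simplified illustration, not the production algorithm: the tokenizer, the smoothed logarithmic weighting and the three sample documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def idf_weights(documents: list[str]) -> dict[str, float]:
    """Give rarer words larger weights; words found in every document get the smallest."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each word once per document
    # Smoothed so that no weight is zero or negative (an assumption of this sketch).
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tfidf_score(request: str, document: str, idf: dict[str, float]) -> float:
    """Sum, over words shared by request and document, of count times rarity weight."""
    doc_counts = Counter(tokenize(document))
    return sum(doc_counts[w] * idf.get(w, 0.0) for w in set(tokenize(request)))


# Hypothetical stand-ins for previously released documents.
documents = [
    "canada fisheries quota report",
    "canada museum budget report",
    "canada border wait times",
]
request = "canada fisheries quota"

idf = idf_weights(documents)
scores = [tfidf_score(request, doc, idf) for doc in documents]
best_score, best_doc = max(zip(scores, documents))
print(best_doc)
```

Because "canada" appears in every sample document, it adds the same small amount to every score. The ranking is decided by the rarer words "fisheries" and "quota".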

At its core, tf-idf is a word-matching algorithm. Words that are similar but not identical in the query and the documents will not register as a match with tf-idf alone. In fact, tf-idf powers many off-the-shelf search platforms, including Apache Solr, which ATIP has suggested does not return relevant results. Therefore, a number of improvements to the tf-idf algorithm had to be made.

Stemming

The most common way of improving tf-idf is to use a technique called stemming. Stemming is the process of simplifying a word to its "stem" or root. For example, the root word of "stemming" is "stem." If we reduce all words to their base and then look for matches, we will count two words such as "fishing" and "fisher" as matching. This technique works similarly in English and French.
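To make the idea concrete, here is a deliberately simple suffix-stripping sketch. It is not the stemmer used by the service: the suffix list is a hypothetical English-only example, and an established stemmer (for example, a Snowball stemmer, which is available for both English and French) handles many cases that this toy version does not, such as doubled consonants.

```python
SUFFIXES = ("ing", "ers", "er", "ed", "es", "s")  # illustrative, English only


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # The length check keeps short words such as "red" intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for word in ("fishing", "fisher", "fishers", "fish"):
    print(word, "->", toy_stem(word))
```

All four sample words reduce to "fish", so they would count as matches once both the request and the documents have been stemmed.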

Stop words

As we move through the content to reduce words to their stem, we can also remove stop words. A stop word is a common word that does not contribute meaning to the phrase. For example, if we remove "the" and "a" from any sentence, we can still infer the general meaning. Removing stop words improves the speed of our algorithm and reduces false matches.
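A minimal sketch of stop-word removal is shown below. The stop-word list is a short hypothetical sample. A real list would be longer and would need a French counterpart.

```python
# Hypothetical short list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on"}


def remove_stop_words(text: str) -> list[str]:
    """Lowercase the text, split on whitespace and drop the stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


print(remove_stop_words("A copy of the briefing notes on the fishing quota"))
# -> ['copy', 'briefing', 'notes', 'fishing', 'quota']
```

Half of the words in the sample request are dropped, which leaves fewer terms to compare against every document.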

Word embeddings

Stemming is useful when two words share the same root. But often there are words that are practically the same but do not share a common root. For example, "Access to Information and Privacy" and "ATIP" have the exact same meaning but share no words in common. In order for tf-idf to register matches for similar words, we need a way of measuring the similarity, or distance, between any two words. For example, "kids" and "children" should be close together, and "sheep" and "lion" should be far apart. In order to measure the distance between words, we can use a method called word embedding.

Word embedding is a tool that converts a word to a vector. Typically, this vector has hundreds of dimensions. We tend to use word embeddings that have 100 to 300 dimensions. Despite having a high number of dimensions, we can calculate the distance between any two words the same way we calculate distance in a smaller number of dimensions.

To combine tf-idf and embeddings, we convert every word to a vector by way of an embedding. We then measure the distance between every word in the request (the source) and every word in the document being compared (the target). Words that are very close together are given a score close to 1 (or exactly 1 if they’re the same word), and words that are very far apart are given a score of 0. In this way, a word will be considered a match if the meaning of the word is similar.
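The following sketch illustrates this kind of soft matching. The three-dimensional vectors are invented for the example (real embeddings have 100 to 300 dimensions and are learned from text), and the use of cosine similarity clipped to the range 0 to 1 is an assumption about how closeness could be scored.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for this example.
EMBEDDINGS = {
    "kids":     (0.90, 0.10, 0.00),
    "children": (0.85, 0.15, 0.05),
    "sheep":    (0.10, 0.90, 0.10),
    "lion":     (0.05, 0.20, 0.95),
}


def word_similarity(a: str, b: str) -> float:
    """Cosine similarity of two word vectors, clipped to the range 0 to 1."""
    if a == b:
        return 1.0
    if a not in EMBEDDINGS or b not in EMBEDDINGS:
        return 0.0  # words without a vector can only match exactly
    va, vb = EMBEDDINGS[a], EMBEDDINGS[b]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return max(0.0, dot / norm)


def soft_match_score(request_words, document_words):
    """For each request word, add its best similarity to any document word."""
    return sum(
        max((word_similarity(r, d) for d in document_words), default=0.0)
        for r in request_words
    )


print(word_similarity("kids", "children"))  # close to 1
print(word_similarity("sheep", "lion"))     # much lower
print(soft_match_score(["kids"], ["children", "lion"]))
```

With exact matching, a request for "kids" would score 0 against a document that only mentions "children". With the soft match it scores close to 1. The same calculation works unchanged for vectors with hundreds of dimensions.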

Dimensionality reduction

A word embedding converts a single word to a vector of hundreds of numbers. This is done for all words in all publicly available government documents. Ultimately this generates a tremendous amount of data, and this data must be searched and analyzed with every ATIP request. We can reduce the amount of computation (and therefore increase search performance) by using an algorithm called singular value decomposition (SVD). In short, SVD can be used to compress the information in each document, and so the total data generated, while retaining most of the information and search accuracy.
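As a rough illustration of the compression step, the sketch below applies a truncated SVD with NumPy to a small synthetic matrix. The matrix sizes, the random data and the choice to keep two dimensions are all assumptions made for the example, not details of the AORS system.

```python
import numpy as np

# Hypothetical document-by-feature matrix: 6 documents, 5 features each.
rng = np.random.default_rng(0)
topics = rng.normal(size=(6, 2))    # each document mixes 2 hidden topics
loadings = rng.normal(size=(2, 5))  # how each topic shows up in the 5 features
documents = topics @ loadings + 0.01 * rng.normal(size=(6, 5))

# Thin SVD: documents = U @ diag(S) @ Vt, singular values sorted largest first.
U, S, Vt = np.linalg.svd(documents, full_matrices=False)

k = 2                               # number of dimensions to keep
reduced = U[:, :k] * S[:k]          # 6 x 2 compressed representation
approx = reduced @ Vt[:k, :]        # map back to 5 features to measure the loss

relative_error = np.linalg.norm(documents - approx) / np.linalg.norm(documents)
print(reduced.shape, float(relative_error))
```

Each document is stored with two numbers instead of five, and the reconstruction error stays small because the synthetic data was built from two underlying topics. A new request vector could be projected into the same reduced space by multiplying it by the transpose of Vt[:k, :], so that searches compare short vectors only.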