ATIP Tools Tech Corner - Introduction to the ATIP Online Request Service and the use of artificial intelligence

Introduction and background

Welcome to the ATIP Tools Tech Corner, where information and updates about the new ATIP Online Request Service (AORS) are shared.

The AORS is a simple, centralized website that enables users to complete access to information and personal information requests and submit them to any of the institutions that are subject to the Government of Canada’s Access to Information Act and Privacy Act.

Onboarding institutions

The AORS went live in October 2018 with 6 institutions. All institutions subject to the Access to Information Act and the Privacy Act will continue to be onboarded to the application.

What is “onboarding”?

"Onboarding" in this context means that the institutions will be set up to leverage all the features of the application, and users will be able to send initial access to information and personal information requests to them through the application.

ATIP Online Request Service (AORS) onboarding data

The following is an update on the state of onboarding:

Onboarding target number: 265
Number of institutions onboarded to date: 198
Number of institutions in progress: 67
Detailed list of institutions onboarded to date:
  1. Administrative Tribunals Support Services of Canada
    • Canada Agricultural Review Tribunal
    • Canada Industrial Relations Board
    • Canadian Cultural Property Export Review Board
    • Canadian International Trade Tribunal
    • Competition Tribunal
    • Human Rights Tribunal of Canada
    • Public Servants Disclosure Protection Tribunal Canada
    • Public Service Labour Relations and Employment Board
    • Registry of the Specific Claims Tribunal of Canada
    • Social Security Tribunal of Canada
    • Transportation Appeal Tribunal of Canada
  2. Agriculture and Agri-Food Canada
  3. Asia-Pacific Foundation of Canada
  4. Atlantic Canada Opportunities Agency
  5. Atlantic Pilotage Authority Canada
  6. British Columbia Treaty Commission
  7. Canada Council for the Arts
  8. Canada Deposit Insurance Corporation
  9. Canada Development Investment Corporation
    • Canada Eldor Inc.
    • Canada Hibernia Holding Corporation
  10. Canada Economic Development for Quebec Regions
  11. Canada Energy Regulator
  12. Canada Foundation for Innovation
  13. Canada Mortgage and Housing Corporation
    • Canada Housing Trust
  14. Canada Post
    • 2875039 Canada Limited
    • 3906949 Canada Inc
  15. Canada School of Public Service
  16. Canada-Nova Scotia Offshore Petroleum Board
  17. Canadian Centre for Occupational Health and Safety
  18. Canadian Dairy Commission
  19. Canadian Food Inspection Agency
  20. Canadian Grain Commission
  21. Canadian Heritage
  22. Canadian Human Rights Commission
  23. Canadian Institutes of Health Research
  24. Canadian Museum of History
  25. Canadian Northern Economic Development Agency
  26. Canadian Nuclear Safety Commission
  27. Canadian Race Relations Foundation
  28. Canadian Radio-television and Telecommunications Commission
  29. Canadian Space Agency
  30. Canadian Transportation Agency
  31. Civilian Review and Complaints Commission for the Royal Canadian Mounted Police
  32. Communications Security Establishment Canada
  33. Copyright Board Canada
  34. Correctional Investigator Canada
  35. Crown-Indigenous Relations and Northern Affairs Canada
  36. Department of Finance Canada
  37. Department of Justice Canada
  38. Elections Canada
  39. Environment and Climate Change Canada
  40. Farm Products Council of Canada
  41. Federal Bridge Corporation
    • Seaway International Bridge Corporation
  42. Federal Economic Development Agency for Southern Ontario
  43. Federal Public Service Health Care Plan Administration Authority
  44. Financial Consumer Agency of Canada
  45. Financial Transactions and Reports Analysis Centre of Canada
  46. First Nations Tax Commission
  47. Fisheries and Oceans Canada
  48. Global Affairs Canada
  49. Gwich'in Land and Water Board
  50. Halifax Port Authority
  51. Hamilton Port Authority
  52. Health Canada
  53. Historic Sites and Monuments Board of Canada
  54. Immigration and Refugee Board of Canada
  55. Impact Assessment Agency
  56. Indigenous Services Canada
  57. Infrastructure Canada
  58. Ingenium – Canada’s Museums of Science and Innovation
  59. Innovation, Science and Economic Development Canada
  60. Mackenzie Valley Land and Water Board
  61. Military Grievances External Review Committee
  62. Military Police Complaints Commission
  63. Nanaimo Port Authority
  64. National Battlefields Commission
  65. National Film Board of Canada
  66. National Research Council Canada
  67. Natural Resources Canada
    • Energy Supplies Allocation Board
  68. Natural Sciences and Engineering Research Council
  69. Northern Pipeline Agency Canada
  70. Nunavut Impact Review Board
  71. Nunavut Water Board
  72. Office of the Administrator of the Fund for Railway Accidents Involving Designated Goods
  73. Office of the Administrator of the Ship-source Oil Pollution Fund
  74. Office of the Commissioner of Lobbying of Canada
  75. Office of the Commissioner of Official Languages
  76. Office of the Information Commissioner of Canada
  77. Office of the Privacy Commissioner of Canada
  78. Office of the Public Sector Integrity Commissioner of Canada
  79. Office of the Superintendent of Financial Institutions Canada
  80. Office of the Veterans Ombudsman
  81. Pacific Pilotage Authority Canada
  82. Parks Canada Agency
  83. Parole Board of Canada
  84. Patented Medicine Prices Review Board
  85. Pierre Elliott Trudeau Foundation
  86. Polar Knowledge Canada
  87. Port Alberni Port Authority
  88. Privy Council Office
  89. Public Health Agency of Canada
  90. Public Prosecution Service of Canada
  91. Public Safety Canada
  92. Public Sector Pension Investment Board
    • 3Net Indy Holdings
    • 3Net Indy Investments Inc.
    • 7986386 Canada Inc.
    • 8599963 Canada Inc.
    • Argentia Private Investments
    • AviAlliance Canada Inc.
    • Indo-Infra Inc.
    • Infra H20 GP Partners Inc.
    • Infra H20 LP Partners Inc.
    • Infra TM Investments Inc.
    • Infra-PSP Canada Inc.
    • Infra-PSP Credit Inc.
    • Infra-PSP ECEF Inc.
    • Infra-PSP Partners Inc.
    • Ivory Private Investments Inc.
    • Kings Island Private Investments Inc.
    • Northern Fjord Holdings Inc.
    • Port-aux-Choix Private Investments Inc.
    • Potton Holdings Inc.
    • PSP Capital Inc.
    • PSP Finco Inc.
    • PSP H2O FL GP INC.
    • PSP Public Credit I Inc.
    • PSP Public Credit Opportunities Inc.
    • PSP Public Markets Inc.
    • PSPIB Baltimore G.P. Inc.
    • PSPIB Bromont Investments Inc.
    • PSPIB Deep South Inc.
    • PSPIB DevCol Inc.
    • PSPIB Emerald Inc.
    • PSPIB G.P. Finance Inc.
    • PSPIB G.P. Inc.
    • PSPIB G.P. Partners Inc.
    • PSPIB Golden Range Cattle II Inc.
    • PSPIB Golden Range Cattle Inc.
    • PSPIB Homes Inc.
    • PSPIB IRP60 Inc.
    • PSPIB Michigan G.P. Inc.
    • PSPIB Orchid Inc.
    • PSPIB Paisas Inc.
    • PSPIB Pennsylvania Investments Inc.
    • PSPIB WEXFORD INVESTMENTS INC.
    • PSPIB-Andes Inc.
    • PSPIB-CCR Inc.
    • PSPIB-Condor Inc.
    • PSPIB-Eldorado Inc.
    • PSPIB-LSF Inc.
    • PSPIB-Newbury G.P. Inc.
    • PSPIB-RE Finance Inc.
    • PSPIB-RE Finance Partners II Inc.
    • PSPIB-RE Finance Partners Inc.
    • PSPIB-RE MANCHESTER INC.
    • PSPIB-RE Partners II Inc.
    • PSPIB-RE Partners Inc.
    • PSPIB-RE UK Inc.
    • PSPIB-SDL Inc.
    • PSPIB-Star Inc.
    • Red Isle Private Investments Inc.
    • Revera Inc.
    • Trinity Bay Private Investments Inc.
    • VOP Investments Inc.
  93. Public Service Commission of Canada
  94. Public Services and Procurement Canada
  95. RCMP External Review Committee
  96. Royal Canadian Mint
  97. Saguenay Port Authority
  98. Sahtu Land and Water Board
  99. Sahtu Land Use Planning Board
  100. Security Intelligence Review Committee
  101. Sept-Îles Port Authority
  102. Shared Services Canada
  103. Social Sciences and Humanities Research Council of Canada
  104. St. John’s Port Authority
  105. Statistics Canada
  106. Sustainable Development Technology Canada
  107. Telefilm Canada
  108. Thunder Bay Port Authority
  109. Toronto Port Authority
  110. Transport Canada
  111. Transportation Safety Board of Canada
  112. Treasury Board of Canada Secretariat
  113. Vancouver Fraser Port Authority
  114. Veterans Affairs Canada
  115. Veterans Review and Appeal Board Canada
  116. Western Economic Diversification Canada
  117. Windsor Port Authority
  118. Windsor-Detroit Bridge Authority
  119. Women and Gender Equality Canada
  120. Yukon Surface Rights Board

Institutions onboarded from the IRCC ATIP Online Pilot

Institutions that have been hosted by the IRCC ATIP Online Pilot are being migrated to the AORS.

Total number of institutions previously available on the pilot: 33
Total number of institutions onboarded on AORS from the pilot: 27
Number of institutions in progress: 6

Artificial intelligence

In this update, we will explain how the ATIP Online Request Service is leveraging artificial intelligence (AI).

What is the impact of our use of artificial intelligence?

To assess the impact of our use of AI, we have used the Algorithmic Impact Assessment Tool.

The assessment indicates that our use of AI has little socio-economic impact on citizens and little impact on government operations.

Using artificial intelligence

The search functionality provided on the AORS uses AI to improve the user experience.

The first instance where AI is used is when searching for information that may have already been released in response to another request. The search results are based on information readily available on the Open Government website.

The second instance where AI is used is when helping to identify which institution may have the information pertaining to the request. The search will recommend institutions that are most suitable for the type of request. The data used to make this recommendation comes from the following locations:

  • Open Government summaries
  • departmental reports
  • "scraping" on government websites
  • institutions’ ATIP web pages
  • Government of Canada taxonomies
  • unified master data organization schema
  • Part III of Departmental Results Reports for the 2016 to 2017 fiscal year

How we are using AI

Ensuring that a web search finds all the correct documents can be a difficult task. The search system leverages machine learning to identify contextual and latent relationships that are more fundamental than keywords. To do this, the search looks at concepts and the relationship between past searches to improve result quality.

The search system uses advanced natural language processing and machine learning techniques to enhance searches across multiple sources. It can cover websites, forums or any other publicly accessible content. By going beyond simple word similarity and instead "understanding" the meaning of search terms, the system can compare a user’s search needs to the corpus of documents in near real time, returning all relevant documents, or components of documents, that relate to a given search query or comparable document.

Synonyms, abbreviations and typos often cause key documents to be overlooked. Using advanced machine learning and natural language processing, the algorithm is able to read an entire corpus of documents (for example, an enterprise website, or a course curriculum with its textbooks and related documents). After reading the documents, the AI search system is able to semantically "understand" phrases and ideas, going beyond keyword matching.

AI algorithm

The following gives more technical information about the algorithm used:

  • category of algorithm used: natural language processing
  • models used: tf-idf (term frequency-inverse document frequency) and cosine similarity models
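
As an illustration of the second model named above, cosine similarity scores two weighted term vectors by the angle between them rather than by their length. The sketch below is a minimal, generic version; the function name and the dictionary representation of a vector are assumptions made for this example, not details of the AORS implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors.

    Each vector is a dict mapping a term to its weight (for example,
    a tf-idf weight). This representation is an assumption of the sketch.
    """
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical weights for a request and a released document
request = {"salmon": 1.1, "quota": 0.7}
document = {"salmon": 2.2, "quota": 1.4, "report": 0.4}
similarity = cosine_similarity(request, document)
```

Because the score depends on direction only, a long document is not favoured over a short one simply for containing more words.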

Tf-idf improvements

Tf-idf is a method to score how related two pieces of text are. The input text from the user will be matched against all documents previously released under ATIP. Public documents that have the highest scores will be suggested to the user.

The tf-idf algorithm begins by counting the number of words in the request that are also present in each public document. The count for each matched word is then divided by how common that word is across all documents. This division reduces noise: common words such as "Canada" are likely to match many documents regardless of the ATIP request, so a match on such a word is less important. It is more valuable to know that a less common word is found in both the user submission and the publicly available document.
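
The scoring described above can be sketched with the common textbook formulation of tf-idf, in which the down-weighting of common words is done with a logarithm of the inverse document frequency. The exact weighting used by the AORS is not stated here, so the formula, function names and sample documents below are assumptions for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def idf_table(documents):
    """Map each word to log(N / number of documents containing it)."""
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    total = len(documents)
    return {word: math.log(total / count) for word, count in doc_freq.items()}


def tfidf_score(query, document, idf):
    """Sum, over query words found in the document, of count * idf."""
    counts = Counter(tokenize(document))
    return sum(counts[word] * idf.get(word, 0.0)
               for word in set(tokenize(query)))


# Hypothetical summaries of previously released documents
documents = [
    "canada salmon fishing quotas",
    "canada passport renewal delays",
    "canada border wait times",
]
idf = idf_table(documents)
scores = [tfidf_score("canada salmon", doc, idf) for doc in documents]
best = documents[scores.index(max(scores))]
```

In this sample, "canada" appears in every document, so its weight is zero and only the rarer word "salmon" separates the documents.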

At its core, tf-idf is a word-matching algorithm. Words in the query and the documents that are similar but not identical will not register as a match with tf-idf alone. In fact, tf-idf powers many off-the-shelf search platforms, including Apache Solr, which ATIP has suggested does not return relevant results. Therefore, a number of improvements to the tf-idf algorithm had to be made.

Stemming

The most common way of improving tf-idf is to use a technique called stemming. Stemming is the process of simplifying a word to its "stem" or root. For example, the root word of "stemming" is "stem." If we reduce all words to their base and then look for matches, we will count two words such as "fishing" and "fisher" as matching. This technique works similarly in English and French.
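
To make the idea concrete, here is a toy English suffix-stripping stemmer. It is only a sketch: the suffix list and rules are invented for this example, and production systems typically rely on an established algorithm such as the Porter or Snowball stemmers, whose rules and outputs differ from this toy.

```python
# Illustrative suffix list, checked in order (not a real stemmer's rule set)
SUFFIXES = ("ing", "ers", "er", "ed", "es", "s")


def toy_stem(word):
    """Strip one common suffix, then collapse a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # "stemming" -> "stemm" -> "stem"
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def stems_match(first, second):
    """Two words match if they reduce to the same toy stem."""
    return toy_stem(first) == toy_stem(second)
```

With these rules, `stems_match("fishing", "fisher")` is true because both words reduce to "fish". A French stemmer would follow the same pattern with a different set of suffix rules.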

Stop words

As we move through the content to reduce words to their stem, we can also remove stop words. A stop word is a common word that does not contribute meaning to the phrase. For example, if we remove "the" and "a" from any sentence, we can still infer the general meaning. Removing stop words improves the speed of our algorithm and reduces false matches.
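
A minimal sketch of stop-word removal follows. The stop-word set is a short illustrative sample chosen for this example, not the list used by the AORS; real lists are longer and language-specific.

```python
# Illustrative English stop words (an assumption of this sketch)
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "for"}


def remove_stop_words(text):
    """Return the words of the text, in order, without stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


remaining = remove_stop_words("The cost of a passport")
```

Here `remaining` keeps only "cost" and "passport", which is enough to infer what the request is about while giving the matcher fewer words to compare.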

Word embeddings

Stemming is useful when two words share the same root. But often there are words that are practically the same but do not share a common root. For example, "Access to Information and Privacy" and "ATIP" have the exact same meaning but share no words in common. In order for tf-idf to register matches for similar words, we need a way of measuring the similarity, or distance, between any two words. For example, "kids" and "children" should be close together, and "sheep" and "lion" should be far apart. In order to measure the distance between words, we can use a method called word embedding.

Word embedding is a tool that converts a word to a vector. Typically, this vector has hundreds of dimensions. We tend to use word embeddings that have 100 to 300 dimensions. Despite having a high number of dimensions, we can calculate the distance between any two words the same way we calculate distance in a smaller number of dimensions.

To combine tf-idf and embeddings, we convert every word to a vector by way of an embedding. We then measure the distance between every word in the request (source) and every word in the document (target). Words that are very close together are given a score close to 1 (or exactly 1 if they’re the same word), and words that are very far apart are given a score of 0. In this way, a word will be considered a match if the meaning of the word is similar.
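
The soft matching described above can be sketched as follows. The three-dimensional vectors are made up for this example (real embeddings have 100 to 300 dimensions and are learned from text), and the choice of cosine similarity clamped at 0 as the closeness score is an assumption, as are all function names.

```python
import math

# Hypothetical toy embeddings; real ones are learned, not hand-written
EMBEDDINGS = {
    "kids": [0.9, 0.1, 0.0],
    "children": [0.8, 0.2, 0.0],
    "sheep": [0.0, 1.0, 0.0],
    "lion": [0.0, 0.0, 1.0],
}


def word_similarity(first, second):
    """Score in [0, 1]: 1 for identical words, near 1 for close meanings."""
    if first == second:
        return 1.0
    if first not in EMBEDDINGS or second not in EMBEDDINGS:
        return 0.0  # unknown words can only match exactly
    u, v = EMBEDDINGS[first], EMBEDDINGS[second]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return max(0.0, dot / norm)


def soft_match_score(source_words, target_words):
    """For each source word, take its best similarity to any target word."""
    return sum(max(word_similarity(s, t) for t in target_words)
               for s in source_words)


score = soft_match_score(["kids"], ["children", "sheep"])
```

In this sketch "kids" and "children" score close to 1 even though they share no characters, while "sheep" and "lion" score 0. In a combined scheme, each soft match would then be weighted by the word's tf-idf weight.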

Dimensionality reduction

A word embedding converts a single word to a vector of hundreds of numbers. This is done for all words in all publicly available government documents. Ultimately this generates a tremendous amount of data, and this data must be searched and analyzed with every ATIP request. We can reduce the amount of computation (and therefore increase search performance) by using an algorithm called singular value decomposition (SVD). In short, SVD can be used to compress the information in each document (and so the total amount of data generated) while still retaining most of the information and search accuracy.
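
The compression step can be sketched in plain Python. To stay dependency-free, this example approximates the leading right singular vectors with power iteration instead of calling a library SVD routine; that substitution, the function names and the sample matrix are all assumptions of the sketch, not a description of the AORS code.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


def normalize(vector):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def top_right_singular_vectors(matrix, k, iterations=200, seed=0):
    """Approximate the k leading right singular vectors of the matrix.

    Runs power iteration on A^T A, removing directions already found.
    Assumes the matrix has rank of at least k.
    """
    rng = random.Random(seed)
    transposed = [list(column) for column in zip(*matrix)]
    basis = []
    for _ in range(k):
        v = normalize([rng.random() + 0.1 for _ in range(len(matrix[0]))])
        for _ in range(iterations):
            w = matvec(transposed, matvec(matrix, v))  # apply A^T A
            for found in basis:  # deflate earlier singular directions
                overlap = sum(x * y for x, y in zip(w, found))
                w = [x - overlap * y for x, y in zip(w, found)]
            v = normalize(w)
        basis.append(v)
    return basis


def reduce_rows(matrix, basis):
    """Project each row (document vector) onto the kept directions."""
    return [[sum(x * y for x, y in zip(row, b)) for b in basis]
            for row in matrix]


# Hypothetical document vectors: 4 numbers each, but only 2 independent directions
docs = [
    [2.0, 2.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [3.0, 3.0, 1.0, 1.0],
]
basis = top_right_singular_vectors(docs, k=2)
reduced = reduce_rows(docs, basis)
```

Each document is now described by 2 numbers instead of 4. Because the sample vectors only vary along two directions, dot products between documents are unchanged after the reduction; on real data, keeping fewer directions than the data contains trades a small loss of information for less computation.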
