Introduction and background
Welcome to the ATIP Tools Tech Corner, where information and updates about the new ATIP Online Request Service (AORS) are shared.
The AORS is a simple, centralized website that enables users to complete access to information and personal information requests and submit them to any of the institutions that are subject to the Government of Canada’s Access to Information Act and Privacy Act.
The AORS went live in October 2018 with 6 institutions. All institutions subject to the Access to Information Act and the Privacy Act will continue to be onboarded to the application.
What is “onboarding”?
"Onboarding" in this context means that the institutions will be set up to leverage all the features of the application, and users will be able to send initial access to information and privacy information requests to them through the application.
ATIP Online Request Service (AORS) onboarding data
The following is an update on the state of onboarding:
|Onboarding target number:||265|
|Number of institutions onboarded to date:||194|
|Number of institutions in progress:||71|
See detailed list of onboarded institutions to date
- Administrative Tribunals Support Services of Canada
- Canada Agricultural Review Tribunal
- Canada Industrial Relations Board
- Canadian Cultural Property Export Review Board
- Canadian International Trade Tribunal
- Competition Tribunal
- Human Rights Tribunal of Canada
- Public Servants Disclosure Protection Tribunal Canada
- Public Service Labour Relations and Employment Board
- Registry of the Specific Claims Tribunal of Canada
- Social Security Tribunal of Canada
- Transportation Appeal Tribunal of Canada
- Agriculture and Agri-Food Canada
- Asia-Pacific Foundation of Canada
- Atlantic Canada Opportunities Agency
- Atlantic Pilotage Authority Canada
- British Columbia Treaty Commission
- Canada Council for the Arts
- Canada Deposit Insurance Corporation
- Canada Development Investment Corporation
- Canada Eldor Inc.
- Canada Hibernia Holding Corporation
- Canada Economic Development for Quebec Regions
- Canada Energy Regulator
- Canada Foundation for Innovation
- Canada Post
- 2875039 Canada Limited
- 3906949 Canada Inc
- Canada School of Public Service
- Canada-Nova Scotia Offshore Petroleum Board
- Canadian Centre for Occupational Health and Safety
- Canadian Food Inspection Agency
- Canadian Grain Commission
- Canadian Heritage
- Canadian Human Rights Commission
- Canadian Institutes of Health Research
- Canadian Museum of History
- Canadian Northern Economic Development Agency
- Canadian Nuclear Safety Commission
- Canadian Race Relations Foundation
- Canadian Radio-television and Telecommunications Commission
- Canadian Space Agency
- Canadian Transportation Agency
- Civilian Review and Complaints Commission for the Royal Canadian Mounted Police
- Communications Security Establishment Canada
- Copyright Board Canada
- Correctional Investigator Canada
- Crown-Indigenous Relations and Northern Affairs Canada
- Department of Finance Canada
- Department of Justice Canada
- Elections Canada
- Environment and Climate Change Canada
- Farm Products Council of Canada
- Federal Bridge Corporation
- Seaway International Bridge Corporation
- Federal Economic Development Agency for Southern Ontario
- Federal Public Service Health Care Plan Administration Authority
- Financial Consumer Agency of Canada
- Financial Transactions and Reports Analysis Centre of Canada
- First Nations Tax Commission
- Fisheries and Oceans Canada
- Global Affairs Canada
- Gwich'in Land and Water Board
- Halifax Port Authority
- Hamilton Port Authority
- Health Canada
- Historic Sites and Monuments Board of Canada
- Immigration and Refugee Board of Canada
- Impact Assessment Agency
- Indigenous Services Canada
- Infrastructure Canada
- Ingenium – Canada’s Museums of Science and Innovation
- Innovation, Science and Economic Development Canada
- Mackenzie Valley Land and Water Board
- Military Grievances External Review Committee
- Military Police Complaints Commission
- Nanaimo Port Authority
- National Battlefields Commission
- National Film Board of Canada
- National Research Council Canada
- Natural Resources Canada
- Energy Supplies Allocation Board
- Northern Pipeline Agency Canada
- Nunavut Impact Review Board
- Nunavut Water Board
- Office of the Administrator of the Fund for Railway Accidents Involving Designated Goods
- Office of the Administrator of the Ship-source Oil Pollution Fund
- Office of the Commissioner of Lobbying of Canada
- Office of the Commissioner of Official Languages
- Office of the Information Commissioner of Canada
- Office of the Privacy Commissioner of Canada
- Office of the Public Sector Integrity Commission of Canada
- Office of the Superintendent of Financial Institutions Canada
- Office of the Veterans Ombudsman
- Pacific Pilotage Authority Canada
- Parks Canada Agency
- Parole Board of Canada
- Patented Medicine Prices Review Board
- Pierre Elliott Trudeau Foundation
- Polar Knowledge Canada
- Port Alberni Port Authority
- Privy Council Office
- Public Health Agency of Canada
- Public Prosecution Service of Canada
- Public Safety Canada
- Public Sector Pension Investment Board
- 3Net Indy Holdings
- 3Net Indy Investments Inc.
- 7986386 Canada Inc.
- 8599963 Canada Inc.
- Argentia Private Investments
- AviAlliance Canada Inc.
- Indo-Infra Inc.
- Infra H20 GP Partners Inc.
- Infra H20 LP Partners Inc.
- Infra TM Investments Inc.
- Infra-PSP Canada Inc.
- Infra-PSP Credit Inc.
- Infra-PSP ECEF Inc.
- Infra-PSP Partners Inc.
- Ivory Private Investments Inc.
- Kings Island Private Investments Inc.
- Northern Fjord Holdings Inc.
- Port-aux-Choix Private Investments Inc.
- Potton Holdings Inc.
- PSP Capital Inc.
- PSP Finco Inc.
- PSP H2O FL GP INC.
- PSP Public Credit I Inc.
- PSP Public Credit Opportunities Inc.
- PSP Public Markets Inc.
- PSPIB Baltimore G.P. Inc.
- PSPIB Bromont Investments Inc.
- PSPIB Deep South Inc.
- PSPIB DevCol Inc.
- PSPIB Emerald Inc.
- PSPIB G.P. Finance Inc.
- PSPIB G.P. Inc.
- PSPIB G.P. Partners Inc.
- PSPIB Golden Range Cattle II Inc.
- PSPIB Golden Range Cattle Inc.
- PSPIB Homes Inc.
- PSPIB IRP60 Inc.
- PSPIB Michigan G.P. Inc.
- PSPIB Orchid Inc.
- PSPIB Paisas Inc.
- PSPIB Pennsylvania Investments Inc.
- PSPIB WEXFORD INVESTMENTS INC.
- PSPIB-Andes Inc.
- PSPIB-CCR Inc.
- PSPIB-Condor Inc.
- PSPIB-Eldorado Inc.
- PSPIB-LSF Inc.
- PSPIB-Newbury G.P. Inc.
- PSPIB-RE Finance Inc.
- PSPIB-RE Finance Partners II Inc.
- PSPIB-RE Finance Partners Inc.
- PSPIB-RE MANCHESTER INC.
- PSPIB-RE Partners II Inc.
- PSPIB-RE Partners Inc.
- PSPIB-RE UK Inc.
- PSPIB-SDL Inc.
- PSPIB-Star Inc.
- Red Isle Private Investments Inc.
- Revera Inc.
- Trinity Bay Private Investments Inc.
- VOP Investments Inc.
- Public Service Commission of Canada
- Public Services and Procurement Canada
- RCMP External Review Committee
- Royal Canadian Mint
- Saguenay Port Authority
- Sahtu Land and Water Board
- Sahtu Land Use Planning Board
- Security Intelligence Review Committee
- Sept-Îles Port Authority
- Shared Services Canada
- Social Sciences and Humanities Research Council of Canada
- St. John’s Port Authority
- Statistics Canada
- Sustainable Development Technology Canada
- Telefilm Canada
- Thunder Bay Port Authority
- Toronto Port Authority
- Transportation Safety Board of Canada
- Treasury Board of Canada Secretariat
- Vancouver Fraser Port Authority
- Veterans Affairs Canada
- Veterans Review and Appeal Board Canada
- Western Economic Diversification Canada
- Windsor Port Authority
- Windsor-Detroit Bridge Authority
- Women and Gender Equality Canada
- Yukon Surface Rights Board
Institutions onboarded from the IRCC ATIP Online Pilot
Institutions that have been hosted by the IRCC ATIP Online Pilot are being migrated to the AORS.
|Total number of institutions previously available on the pilot:||33|
|Total number of institutions onboarded on AORS from the pilot:||27|
|Number of institutions in progress:||6|
In this update, we will explain how the ATIP Online Request Service is leveraging artificial intelligence (AI).
This update is also one of our first efforts to explain the use of AI in governments, so please give us feedback so that we can understand how to make this as clear as possible. Contact us at email@example.com.
What is the impact of our use of artificial intelligence?
To assess the impact of our use of AI, we have used the Algorithmic Impact Assessment Tool.
The assessment indicates that our use of AI has little socio-economic impact on citizens and little impact on government operations.
Using artificial intelligence
The search functionality provided by the service uses AI to improve the user experience.
The first instance where AI is used is when searching for information that may have already been released in response to another request. The search results are based on information readily available on the Open Government website.
The second instance where AI is used is when helping to identify which institution may have the information pertaining to the request. The search will recommend institutions that are most suitable for the type of request. The data used to make this recommendation comes from the following locations:
- Open Government summaries
- departmental reports
- "scraping" on government websites
- institutions’ ATIP web pages
- Government of Canada taxonomies
- unified master data organization schema
- Part III of Departmental Results Reports for the 2016 to 2017 fiscal year
How are we using AI?
Ensuring that a web search finds all the correct documents can be a difficult task. The search system leverages machine learning to identify contextual and latent relationships that go deeper than keywords. To do this, the search looks at concepts and at the relationships between past searches to improve result quality.
The search system uses advanced natural language processing and machine learning techniques to enhance searches across multiple sources. These sources can include websites, forums or anything else that is publicly accessible. By going beyond simple word similarity and instead "understanding" the meaning of search terms, this solution can compare a user’s search needs to the corpus of documents in near real time, returning all relevant documents, or components of documents, that relate to a given search query or comparable document.
Synonyms, abbreviations and typos often mean that key documents are overlooked. By using advanced machine learning and natural language processing, the algorithm is able to read an entire corpus of documents (such as an enterprise website, or a course curriculum with its textbooks and related documents). After reading the documents, the AI search system is able to semantically "understand" the phrases and ideas, more so than with keyword matching.
The following will give more technical information about the algorithm that was used:
- category of algorithm used: natural language processing
- models used: tf-idf (term frequency-inverse document frequency) and cosine similarity models
Tf-idf is a method to score how related two pieces of text are. The input text from the user will be matched against all documents previously released under ATIP. Public documents that have the highest scores will be suggested to the user.
The tf-idf algorithm begins by counting the number of words in the request that are also present in each public document. This count is then divided by how common each of the matched words is. This division reduces noise: common words such as "Canada" are likely to match many documents regardless of the ATIP request, so a match on such a word is less important. It is more valuable to know that a less common word is found in both the user submission and the publicly available document.
At its core, tf-idf is a word-matching algorithm. Words in the query and the documents that are similar but not identical will not register as a match with tf-idf alone. In fact, tf-idf is what powers many off-the-shelf search platforms, including Apache Solr, which ATIP has suggested does not return relevant results. Therefore, a number of improvements to the tf-idf algorithm had to be made.
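The scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by the service: the function names (`tokenize`, `idf_weights`, `score`), the sample documents and the logarithmic form of the rarity weight are all chosen for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def idf_weights(documents):
    """Weight each word by how rare it is across the collection."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    # Assumed weighting: log(N / df). A word found in every document gets 0.
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}

def score(query, document, idf):
    """Sum (count in document x rarity) for every distinct query word."""
    counts = Counter(tokenize(document))
    return sum(counts[w] * idf.get(w, 0.0) for w in set(tokenize(query)))

# Hypothetical sample collection.
documents = [
    "Briefing notes on salmon fishing quotas in Canada",
    "Contracts for office furniture in Canada",
    "Travel expenses for staff in Canada",
]
idf = idf_weights(documents)
query = "salmon quotas Canada"
ranked = sorted(documents, key=lambda d: score(query, d, idf), reverse=True)
print(ranked[0])
```

Because "Canada" appears in every sample document, its weight is zero and it contributes nothing to the ranking; only the rarer words "salmon" and "quotas" separate the first document from the others.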
The most common way of improving tf-idf is to use a technique called stemming. Stemming is the process of simplifying a word to its "stem" or root. For example, the root word of "stemming" is "stem." If we reduce all words to their base and then look for matches, we will count two words such as "fishing" and "fisher" as matching. This technique works similarly in English and French.
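As a rough illustration of stemming, the sketch below strips a handful of English suffixes. The rule list and the names `SUFFIXES`, `stem` and `matching_stems` are hypothetical and far cruder than an established stemmer such as the Porter or Snowball families; nothing here is taken from the service itself.

```python
# Toy suffix list, checked longest first (an assumption for this example).
SUFFIXES = ("ings", "ing", "ers", "er", "ed", "es", "s")

def stem(word):
    """Reduce a word to a crude root by removing one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            root = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "stemm" -> "stem".
            if root[-1] == root[-2] and root[-1] not in "aeioulsz":
                root = root[:-1]
            return root
    return word

def matching_stems(query, document):
    """Return the stems shared by the query and the document."""
    query_stems = {stem(w) for w in query.split()}
    doc_stems = {stem(w) for w in document.split()}
    return query_stems & doc_stems

print(stem("fishing"), stem("fisher"), stem("stemming"))
print(matching_stems("fishing permits", "fisher permit records"))
```

With these toy rules, "fishing" and "fisher" both reduce to "fish", so a request that mentions one can match a document that mentions the other.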
As we move through the content to reduce words to their stem, we can also remove stop words. A stop word is a common word that does not contribute meaning to the phrase. For example, if we remove "the" and "a" from any sentence, we can still infer the general meaning. Removing stop words improves the speed of our algorithm and reduces false matches.
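Stop-word removal can be sketched as a simple filter. The word list below is a small hypothetical sample in English and French, and `remove_stop_words` is a name invented for this example; a production list would be much longer.

```python
# Small illustrative stop-word list (an assumption, not the list used in practice).
STOP_WORDS = {
    "the", "a", "an", "of", "to", "and", "in", "for",
    "le", "la", "les", "de", "des", "et",
}

def remove_stop_words(text):
    """Return the words of the text, in order, without stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

print(remove_stop_words("A copy of the contract for the bridge"))
```

The remaining words ("copy", "contract", "bridge") still convey what the request is about, while the matcher has fewer words to compare.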
Stemming is useful when two words share the same root. But often there are words that are practically the same but do not share a common root. For example, "Access to Information and Privacy" and "ATIP" have the exact same meaning but share no words in common. In order for tf-idf to register matches for similar words, we need a way of measuring the similarity, or distance, between any two words. For example, "kids" and "children" should be close together, and "sheep" and "lion" should be far apart. In order to measure the distance between words, we can use a method called word embedding.
Word embedding is a tool that converts a word to a vector. Typically, this vector has hundreds of dimensions. We tend to use word embeddings that have 100 to 300 dimensions. Despite having a high number of dimensions, we can calculate the distance between any two words the same way we calculate distance in a smaller number of dimensions.
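To make the idea of distance between word vectors concrete, the sketch below uses cosine similarity, one of the models listed earlier. The four-dimensional vectors are made-up numbers for illustration only; real embeddings are learned from text and have far more dimensions. The names `EMBEDDINGS` and `cosine_similarity` are assumptions for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: near 1 means very similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (invented values, 4 dimensions instead of 100 to 300).
EMBEDDINGS = {
    "kids": [0.1, 0.9, 0.1, 0.8],
    "children": [0.2, 0.8, 0.0, 0.9],
    "sheep": [0.9, 0.1, 0.0, 0.2],
    "lion": [0.0, 0.2, 0.9, 0.1],
}

close = cosine_similarity(EMBEDDINGS["kids"], EMBEDDINGS["children"])
far = cosine_similarity(EMBEDDINGS["sheep"], EMBEDDINGS["lion"])
print(round(close, 2), round(far, 2))
```

The same function works unchanged for vectors of any length, which is why the number of dimensions does not complicate the distance calculation.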
To combine tf-idf and embeddings, we convert every word to a vector by way of an embedding. We then measure the distance between every word in the request (source) and every word in the document (target). Words that are very close together are given a score close to 1 (or exactly 1 if they’re the same word), and words that are very far apart are given a score of 0. In this way, a word will be considered a match if the meaning of the word is similar.
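One way to picture this combination is a "soft" match: each source word is compared with every target word, the best similarity is kept, and that value is multiplied by the word's rarity weight. The sketch below is an assumption-laden illustration, not the actual implementation: the toy vectors, the weights in `idf`, the cut-off `min_similarity` and the function names are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def word_match(a, b, embeddings, min_similarity=0.5):
    """1.0 for identical words, cosine similarity for close words, 0.0 otherwise."""
    if a == b:
        return 1.0
    if a not in embeddings or b not in embeddings:
        return 0.0
    similarity = cosine_similarity(embeddings[a], embeddings[b])
    # Words that are far apart are treated as not matching at all.
    return similarity if similarity >= min_similarity else 0.0

def soft_score(source_words, target_words, embeddings, idf):
    """Sum, over source words, rarity weight x best match among target words."""
    total = 0.0
    for s in source_words:
        best = max((word_match(s, t, embeddings) for t in target_words), default=0.0)
        total += idf.get(s, 1.0) * best
    return total

# Invented 3-dimensional vectors and weights, for illustration only.
embeddings = {
    "kids": [0.1, 0.9, 0.8],
    "children": [0.2, 0.8, 0.9],
    "lion": [0.9, 0.1, 0.0],
}
idf = {"programs": 1.0, "kids": 2.0}

source = ["programs", "kids"]
score_a = soft_score(source, ["programs", "children"], embeddings, idf)
score_b = soft_score(source, ["programs", "lion"], embeddings, idf)
print(score_a > score_b)
```

With exact word matching alone, both targets would score the same, because each shares only "programs" with the source. The embedding lets "children" count as a near match for "kids", which lifts the first target above the second.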
A word embedding converts a single word to a vector of hundreds of numbers. This is done for all words in all publicly available government documents. Ultimately this generates a tremendous amount of data, and this data must be searched and analyzed with every ATIP request. We can reduce the amount of computation (and therefore increase search performance) by using an algorithm called singular value decomposition (SVD). In short, SVD can be used to compress the information in each document (and the total data generated) while still retaining most of the information and search accuracy.
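The compression step can be illustrated with a truncated SVD. This sketch assumes the NumPy library is available and uses a small invented matrix; `truncated_svd` is a hypothetical helper name, and the real system's data and dimensions are not known from this page.

```python
import numpy as np

# Rows are documents, columns are features (toy numbers, built from
# only two underlying patterns so that two components are enough).
doc_matrix = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 2.0, 6.0],
    [2.0, 1.0, 1.0, 3.0],
])

def truncated_svd(matrix, k):
    """Keep only the k largest singular values and their vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]

u_k, s_k, vt_k = truncated_svd(doc_matrix, k=2)

# Each document is now described by 2 numbers instead of 4.
compressed_docs = u_k * s_k

# Mapping back to the original space shows how much was retained.
approximation = compressed_docs @ vt_k
print(compressed_docs.shape, np.allclose(approximation, doc_matrix))
```

In this constructed example nothing is lost, because the matrix only contains two independent patterns. With real document data, keeping fewer components discards some detail, so the number of components is a trade-off between speed and accuracy.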