After completing high school, I earned a Bachelor's degree in Electrical and Computer Engineering at the Universidade de Lisboa - Instituto Superior Técnico (IST). Immediately afterwards, I started a Master's degree in Data Science and Engineering at the same institution, which I completed in December 2022. Throughout my studies I aimed to perform at a high level, and this effort was recognized with the Academic Merit Diploma, awarded to me in the final year of my Bachelor's and in both years of my Master's program. In January 2023, I joined INESC-ID - a leading R&D organization in the fields of Computer Science and Electrical and Computer Engineering - as a full-time researcher under the Center for Responsible AI consortium. My time here has been marked by two significant projects: 1) creating a novel address matching method based on a BERT Bi-Encoder, in collaboration with the Portuguese Post Office Company, and 2) developing a Retrieval-Augmented-Generation conversational assistant that lets users explore the life and career of the former Portuguese Prime Minister Francisco Pinto Balsemão through an interactive, interview-like experience. From October to December 2023, I spent two months at Carnegie Mellon University as a visiting student in Professor Lei Li's group, working on a project about detecting copyrighted content in the data used to train language models. I am currently a first-year PhD student at IST and have been accepted to the IST-CMU Dual Degree program, starting in September 2024 at the Language Technologies Institute (LTI). My work has resulted in two papers published and orally presented in 2023, at EPIA and ICMLA, and a third paper accepted for publication at ICML 2024.
Identification

Personal identification

Full name
André Vicente Duarte

Citation names

  • Duarte, André Vicente

Author identifiers

Ciência ID
FF1B-2F2F-3527
ORCID iD
0000-0001-5987-0789

Email addresses

  • andre.v.duarte@tecnico.ulisboa.pt (Professional)

Telephones

Mobile phone
  • (+351) 966515729 (Personal)

Knowledge fields

  • Exact Sciences - Computer and Information Sciences - Information Science
  • Engineering and Technology - Electrotechnical Engineering, Electronics and Informatics

Languages

Language | Speaking | Reading | Writing | Listening | Peer-review
Portuguese (Mother tongue)
English | Advanced (C1) | Advanced (C1) | Advanced (C1) | Advanced (C1) | Advanced (C1)
French | Elementary (A2) | Elementary (A2) | Beginner (A1) | Elementary (A2) | Beginner (A1)
Spanish; Castilian | Upper intermediate (B2) | Upper intermediate (B2) | Intermediate (B1) | Upper intermediate (B2) | Intermediate (B1)
Education
Degree Classification
2024 - 2029
Ongoing
Dual Degree Ph.D. in Language Technologies (Doutoramento)
Universidade de Lisboa Instituto Superior Técnico, Portugal

Carnegie Mellon University Language Technologies Institute, United States
2020/09 - 2022/12/07
Concluded
Data Science and Engineering (Master's) (Mestrado)
Universidade de Lisboa Instituto Superior Técnico, Portugal
"Siamese Transformer Networks for Improving Address Matching" (THESIS/DISSERTATION)
17/20
2017/09 - 2020/08
Concluded
Electrical and Computer Engineering (Bachelor's) (Licenciatura)
Universidade de Lisboa Instituto Superior Técnico, Portugal
15/20
2014/09 - 2017/08
Concluded
Science & Technology Course (High School) (Ensino secundário)
Colégio Pedro Arrupe, Portugal
17/20
Affiliation

Science

Category
Host institution
Employer
2023/01/11 - Current Researcher (Research) Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa, Portugal
INESC-ID, Portugal
2023/10 - 2023/12 Visiting Researcher (Research) Carnegie Mellon University Language Technologies Institute, United States
Carnegie Mellon University Language Technologies Institute, United States
Projects

Contract

Designation Funders
2023/01/11 - Current NextGenAI: Center for Responsible AI
2022-C05i0102-02
Researcher
INESC-ID, Portugal
Agência para a Competitividade e Inovação IP
Ongoing
Outputs

Publications

Book chapter
  1. Duarte, André Vicente; Oliveira, Arlindo. "Improving Address Matching Using Siamese Transformer Networks". In Progress in Artificial Intelligence, 413-425. Cham, Switzerland: Springer Nature Switzerland, 2023.
    Published • 10.1007/978-3-031-49011-8_33
Conference paper
  1. Duarte, André Vicente; Marques, João; Graça, Miguel; Freire, Miguel; Li, Lei; Oliveira, Arlindo L. "LumberChunker: Long-Form Narrative Document Segmentation". Paper presented at EMNLP 2024 (Findings), Miami, Florida, 2024.
    Accepted
  2. Duarte, André Vicente; Zhao, Xuandong; Oliveira, Arlindo; Li, Lei. "DE-COP: Detecting Copyrighted Content in Large Language Models Training Data". Paper presented at The Forty-first International Conference on Machine Learning (ICML 2024), Vienna, 2024.
    Accepted
  3. Duarte, André Vicente; Oliveira, Arlindo. "Improving embeddings for high-accuracy transformer-based address matching using a multiple in-batch negatives loss". Paper presented at the 22nd International Conference on Machine Learning and Applications (ICMLA), Jacksonville, Florida, 2023.
    In press
Thesis / Dissertation
  1. "Siamese Transformer Networks for Improving Address Matching". Master, 2022.

Other

Software
  1. Duarte, André Vicente. "Memories - Talking with Francisco Pinto Balsemão. Design of a Retrieval-Augmented-Generation conversational assistant that lets users explore Dr. Balsemão's life and career as if they were personally interviewing him. This project was developed for the former Portuguese Prime Minister Francisco Pinto Balsemão himself.". 2023.
  2. Duarte, André Vicente. "Development of Data Visualizations about the Sustainable Development Goals (SDGs). Development of data visualizations for PORDATA, using JavaScript and the D3.js library. This project was a collaboration between Instituto Superior Técnico and Fundação Francisco Manuel dos Santos; url: https://www.youtube.com/watch?v=ZUMnwXOu3LI". 2022.
  3. Duarte, André Vicente. "Puphluns: Awesome Parties 101. Development of an Android app using Kotlin as the main programming language. Puphluns is a card game to be used as an icebreaker at a party; url: https://play.google.com/store/apps/details?id=com.boss.myapplication_1233&hl=pt&gl=US". Android. 2021.
Activities

Oral presentation

Presentation title Event name
Host (Event location)
2023/12/20 Improving embeddings for high-accuracy transformer-based address matching using a multiple in-batch negatives loss. 22nd International Conference on Machine Learning and Applications (ICMLA)
(Jacksonville, United States)
2023/09/05 Improving Address Matching using Siamese Transformer Networks EPIA Conference on Artificial Intelligence
(Faial, Portugal)

Event organisation

Event name
Type of event (Role)
Institution / Organization
2022/03/21 - 2022/03/25 JEEC (Electrical and Computer Engineering Week) is an event dedicated to the cutting edge of technology, science, and engineering. This week of networking, lectures, workshops, among others, provides an excellent opportunity for companies to introduce themselves to students of Electrical Engineering and similar courses at IST, thus narrowing the gap between the academic world and the business world. Role: Management of meal logistics (planning and distribution) (2022/03/21 - 2022/03/25)
Congress (Co-organiser)
Universidade de Lisboa Instituto Superior Técnico, Portugal
Distinctions

Award

2024 Carnegie Mellon - Dual Degree Ph.D. Scholarship
Carnegie Mellon University Portugal Office, Portugal
2023 Carnegie Mellon - Visiting Student Scholarship
Carnegie Mellon University Portugal Office, Portugal
2022 Academic Merit Diploma (2nd year of Master's)
Universidade de Lisboa Instituto Superior Técnico, Portugal
2021 Academic Merit Diploma (1st year of Master's)
Universidade de Lisboa Instituto Superior Técnico, Portugal
2020 Academic Merit Diploma (3rd year of Bachelor's)
Universidade de Lisboa Instituto Superior Técnico, Portugal