
Anup Roy
Development
Karnataka, India
Skills
Data Science
About
Anup Roy's skills align with Programmers (Information and Communication Technology). Anup also has skills associated with Sellers (Sales and Trading). Anup Roy appears to be a low-to-mid-level candidate with 3 years of experience.
Work Experience
Data Science Intern
Research Scientists @ETS-USA AI Labs
January 2024 - Present
- January 2024 - Present: Exploring LLMs such as T5 and Llama for Automated Essay Scoring (AES). Implemented multiple research papers related to AES and worked on adapting T5's output for use as a classification model.

PUBLICATIONS
- Fidelity Investment | Data Science Intern, NLPIR 2023 paper: Worked on text classification (specifically utterance classification) on Fidelity-provided data using a graph-based classification model. Implemented the InducT-GCN research paper and our own InducT-GAT model, which beat the short-text SOTA scores on three datasets: NICE-45, TagMyNews, and NICE-2. Built an inductive graph attention network that surpassed transformer-based baselines, achieving an F1 score of 83.08 on a proprietary dataset provided by Fidelity Investment.
- Methodologies for Legal Evaluation, ACL 2023 paper: Our submission concentrated on three subtasks: Legal Named Entity Recognition (L-NER) for Task-B, Legal Judgment Prediction (LJP) for Task-C1, and Court Judgment Prediction with Explanation (CJPE) for Task-C2. Our team obtained competitive world rankings of 15th, 11th, and 1st in Task-B, Task-C1, and Task-C2, respectively, as reported on the SemEval competition leaderboard.
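A minimal sketch of the idea behind using a text-to-text model such as T5 as a classifier: the model generates a label string, which is then mapped back to a discrete class. The label names and mapping here are purely hypothetical, not taken from the actual project.

```python
# Hypothetical label vocabulary for essay-score classification; a real
# system would derive this from the fine-tuning data, not hard-code it.
LABELS = {"score one": 1, "score two": 2, "score three": 3}

def decode_label(generated_text: str, fallback: int = 1) -> int:
    """Map a generated label string (e.g. from T5) to a discrete score.

    Unrecognized generations fall back to a default class, a common
    safeguard when a generative model is repurposed for classification.
    """
    return LABELS.get(generated_text.strip().lower(), fallback)
```

The same decoding step works regardless of the underlying model; only the label vocabulary and the generation call change.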
Guide
June 2021 - Present
- Design and development of multilingual NER and POS tagging systems for multiple Indian languages, mainly Hindi, Bengali, Telugu, Malayalam, and Marathi, with an emphasis on detecting semantically ambiguous and complex entities in brief, low-context scenarios. Experimented with different models, namely Bi-LSTM-CRF, Bi-LSTM-Attention-CRF, and mBERT-CRF, compared using the weighted F1 score; mBERT-CRF performed exceptionally well, with accuracy comparable to or exceeding the state of the art.
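A small sketch of the weighted F1 metric used for the model comparison above: per-class F1 scores averaged with weights proportional to each class's support in the gold labels. The tag names are illustrative; in practice a library such as scikit-learn's `f1_score(average="weighted")` would be used.

```python
from collections import Counter

def weighted_f1(gold, pred):
    """Weighted F1: per-class F1, averaged weighted by gold-class support."""
    support = Counter(gold)
    total = len(gold)
    score = 0.0
    for c in set(gold):
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        score += (support[c] / total) * f1  # weight by class frequency
    return score

# Illustrative NER tag sequences (hypothetical, not project data):
print(weighted_f1(["PER", "O", "O", "LOC"], ["PER", "O", "LOC", "LOC"]))
```

Weighting by support keeps frequent tags (like `O` in NER) from being drowned out by rare-class noise, which matters when comparing sequence taggers on imbalanced label sets.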
Data Scientist @Ascendum Solutions (Carelon Solutions)
April 2023 - January 2024
- Worked on medical document classification. Built a proof of concept for medical summarisation using LLMs such as T5.
Intern
Vel-AI Labs
May 2021 - December 2022
- Worked on combining stock-returns data with GDELT events data for one-day-ahead prediction using LSTM models. Explored preprocessing and data collection from the GDELT database.
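One-day-ahead forecasting with an LSTM typically starts by slicing the time series into (lookback window, next value) pairs. A minimal sketch of that windowing step, independent of any particular model or of the actual project data:

```python
def make_windows(series, lookback):
    """Build (input_window, next_value) pairs for one-step-ahead forecasting.

    Each training example pairs `lookback` consecutive observations with
    the value that immediately follows them.
    """
    return [(series[i:i + lookback], series[i + lookback])
            for i in range(len(series) - lookback)]

# Toy daily-returns series (illustrative values only):
print(make_windows([0.01, -0.02, 0.03, 0.00, 0.01], 2))
```

The resulting pairs would then be batched and fed to an LSTM, with the window also carrying any aligned exogenous features (such as GDELT event counts) stacked alongside the returns.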
Applied Scientist Intern
Amazon
May 2022 - August 2022
- Worked on building a foundational model for address text (last-mile use cases). Explored large language models including T5, Llama, and Falcon. Delivered results on multi-task fine-tuning of LLMs and its performance relative to baselines, showing that multi-task fine-tuned Llama performed better on address matching than on address parsing and beat the baseline score for Washington state addresses.