Yufei Wang
Development
California, United States
Skills
Machine Learning (ML)
About
Yufei Wang's skills align with Programmers (Information and Communication Technology). Yufei also has skills associated with IT R&D Professionals (Information and Communication Technology). Yufei Wang appears to be an entry-level candidate, with 15 months of experience.
Work Experience
LLaMA
December 2023 - February 2024
- Fine-tuned the LLaMA (3B) LLM with T4 GPU acceleration on a 20k+ row code dataset to "teach" it to write code. Constructed a Multi-LoRA configuration file and combined the parameters to avoid additional delays during inference.
YouTube Clone
November 2023 - January 2024
- Built a scalable website for video viewing and authenticated video uploading. Transcoded videos using Express, TypeScript and FFmpeg, leveraging Cloud Pub/Sub for message communication. Stored videos in Google Cloud Storage and retrieved them via Firebase Functions. Built the web client for user interaction using React and Next.js. Deployed services in a containerized environment with Docker and GCP Cloud Run.
YouTube Clone
August 2023 - November 2023
- Performed spatial and time series analysis on a 10-year dataset of reported crimes from the LA government. Built a data ETL pipeline based on Spark SQL for big data OLAP and generated insights for police patrols and tourists. Trained a Spark ML K-means model to cluster the spatial crime data and visualized the results with seaborn. Built an ARIMA model to predict crime occurrences 1 year ahead and computed confidence intervals for statistical analysis.
AI Software Engineering Intern
Ping'an Science & Technology
May 2023 - August 2023
- Developed, in Python, an AI-driven recommendation assistant using ChatGLM, a Large Language Model (LLM), to recommend products and provide rationale for agents at a leading insurance company. Cleaned 2K+ data entries and extracted keywords using idf.txt. Collaborated with the product manager to build test sets across 3 real-world scenarios. Achieved 100% on the helpfulness metric on the test set by optimizing Elasticsearch (ES) with synonym expansion. Improved the model's honesty metric from 70% to 100% on the test set through prompt engineering. Enhanced operational efficiency by 20% for 70k+ agents, contributing to a 10% improvement in revenue.
YouTube Clone
February 2023 - May 2023
- Built a distributed file system based on Python, Flask and Firebase. Created a metadata server and data servers emulating Hadoop DFS and implemented file operations using REST APIs. Built the web page using JavaScript and Vue, integrating sockets to process shell commands.
Data Science Intern
Tencent Holdings Ltd
November 2021 - February 2022
- Predicted the likelihood of a mobile ad being clicked using 11M+ user click logs and advertising information. Pre-processed data with Python and built a data ETL pipeline with Spark. Trained an XGBoost model, then optimized the approach by implementing the Wide&Deep neural network model with PyTorch. Enhanced the models' AUC scores by 15% by engineering features through PySpark SQL data analysis.