Whether you are a lone wolf or part of a team, you will take part in a 3-week virtual competition running from the 26th of September to the 17th of October, 2022. If you perform well in this phase (phase 1), you will be invited to a 2-day Hackathon in Dublin on the 21st-22nd of October and get a chance to win the Grand Prize!
You will have to process the dataset, train machine learning models, and submit solutions throughout this period. Mentoring and knowledge sharing will also be provided to help you build top-tier, competitive solutions. Welcome to the Huawei TechArena Ireland 2022!
Don't miss out on this year's DATA SCIENCE COMPETITION! We are thrilled to launch this new edition, concluding with an ON-SITE hackathon in Dublin!
All STEM students from Ireland (and nearby) are invited to take on this new online challenge for a chance to be selected for the final event in Dublin, and compete for the first-place title. Spoiler: We have some generous prizes for you to take home (and all participants will receive their participation certificate).
COMPETE WITH LIKE-MINDED INDIVIDUALS
NETWORK WITH HUAWEI EXPERTS
Looking for an internship? Show off your skills!
WIN AWESOME CUTTING-EDGE TECH PRIZES
GAIN TOP-TIER EXPERIENCE
CROSS-DOMAIN CTR PREDICTION
Ad recommendation models are typically built on historical user-to-ads interaction data such as impressions and clicks. Only the data from user-to-ads activities is used. However, this approach may not predict user-to-ad behaviours satisfactorily, which makes it less likely that relevant ads are recommended to users.
You are asked to enhance ads click-through rate (CTR) prediction accuracy by leveraging ad logs, user profiles, and user behaviour with the newsfeed.
Can you optimise your abilities?
All the participants (those who submit solutions) will receive a participation certificate.
All the participants will be considered for an internship!
*1 per team member. Product photos are for reference only; colour or details may vary.
23RD SEPTEMBER
REGISTRATIONS OPEN & DATASET RELEASE
3RD OCTOBER
SUBMISSIONS OPEN ON THE CHALLENGE PLATFORM
OCTOBER
TECHNICAL WEBINAR
17TH OCTOBER
FINAL SUBMISSION AND FINALISTS ANNOUNCEMENT
21ST-22ND OCTOBER @ DUBLIN
HACKATHON AND AWARDS CEREMONY
Ad recommendation models are typically built on historical user-to-Ads interaction data such as impressions and clicks. Only the data from user-to-Ads activities is used. However, this approach may not predict user-to-Ad behaviors satisfactorily, which makes it less likely that relevant Ads are recommended to users.
Introducing data from users' interactions with other apps may help to build a better prediction model. It will bring more features to profile the users and build a better CTR prediction model.
In our context, we have provided user info (in the *user_data* file) and users' interaction data with Ads (in the *train_ads* file). In the "clicked" column, a "0" indicates that the user did not click the Ad, and a "1" indicates that the user clicked it. We have also provided users' interaction data from "newsfeed" (in the *newsfeeds_log* file). The newsfeed dataset records users' behaviors when certain news is pushed to their mobile phones.
You are asked to enhance ads click-through rate (CTR) prediction accuracy by leveraging ad logs, user profiles, and user behavior with newsfeed.
How we score:
Scoring method: Collect the predicted CTR values of the samples in the ads domain, then calculate the GAUC and the AUC (area under the ROC curve).
Scoring indicator:
The weighted sum of GAUC and AUC will serve as the scoring indicator.
The formula is as follows:
xAUC = 0.7*GAUC + 0.3*AUC
A higher xAUC means a better result, and thus a higher ranking.
AUC in the formula is the AUC computed over all samples together, without grouping.
GAUC is a weighted sum of group AUCs, grouped by user. The group weight is the number of ad impressions in the group divided by the total impressions.
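The scoring formula above can be sketched in plain Python as follows. This is only an illustration, not the organizers' scoring code: the function names are hypothetical, tied scores are given average ranks, and users whose samples are all clicks or all non-clicks are skipped when computing GAUC (with weights normalized over the remaining users), because AUC is undefined for them and the rules do not say how such groups are handled.

```python
from collections import defaultdict


def auc(labels, scores):
    """ROC AUC via the rank-sum formulation; tied scores get average ranks.

    Returns None when only one class is present (AUC is undefined there).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum_pos = 0.0
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of samples tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            if labels[order[k]] == 1:
                rank_sum_pos += avg_rank
        i = j + 1
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def gauc(user_ids, labels, scores):
    """Impression-weighted mean of per-user AUCs.

    Assumption: users with an undefined AUC are skipped and the weights are
    normalized over the users that remain.
    """
    groups = defaultdict(lambda: ([], []))
    for user, label, score in zip(user_ids, labels, scores):
        groups[user][0].append(label)
        groups[user][1].append(score)
    weighted_sum, impressions = 0.0, 0
    for group_labels, group_scores in groups.values():
        group_auc = auc(group_labels, group_scores)
        if group_auc is None:
            continue
        weighted_sum += len(group_labels) * group_auc
        impressions += len(group_labels)
    return weighted_sum / impressions if impressions else None


def xauc(user_ids, labels, scores):
    """xAUC = 0.7 * GAUC + 0.3 * AUC, as in the formula above."""
    g = gauc(user_ids, labels, scores)
    a = auc(labels, scores)
    if g is None or a is None:
        raise ValueError("AUC is undefined: need both clicks and non-clicks")
    return 0.7 * g + 0.3 * a
```

For instance, `xauc(["a", "a", "b", "b"], [1, 0, 0, 1], [0.9, 0.1, 0.8, 0.2])` combines a GAUC of 0.5 (user "a" is ranked perfectly, user "b" is ranked in reverse) with an overall AUC of 0.75.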
This challenge is based on real industry scenarios, and therefore you are not permitted to use information from the future (especially from the test set). For example, when a feature is constructed for a sample at time T, it should use only information from *strictly* before time T (and no information after T) to ensure the solution is practical. If you use information from after T, the result will be ruled invalid during the manual code verification.
We also secretly split the test set into two parts to create two leaderboards: the public and the private leaderboard. The public leaderboard can be seen by everyone. To determine your score and ranking, however, we will use the private leaderboard. The private leaderboard is hidden from the participants and can only be seen by the organizers. We will share the private leaderboard with the participants only at the end of the competition.
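As an illustration of the "strictly before time T" rule, the sketch below builds one cross-domain feature: for each ad sample, the number of newsfeed events the same user generated strictly before the ad's timestamp. The tuple layouts, the `user_id` and timestamp fields, and the function name are assumptions made for this example; the actual column names in the data files may differ.

```python
from bisect import bisect_left
from collections import defaultdict


def newsfeed_counts_before(ad_samples, feed_events):
    """Count each user's newsfeed events strictly before each ad sample.

    ad_samples:  iterable of (log_id, user_id, timestamp)   # hypothetical layout
    feed_events: iterable of (user_id, timestamp)           # hypothetical layout
    Returns a dict mapping log_id to the count.
    """
    times_by_user = defaultdict(list)
    for user_id, timestamp in feed_events:
        times_by_user[user_id].append(timestamp)
    for timestamps in times_by_user.values():
        timestamps.sort()

    features = {}
    for log_id, user_id, timestamp in ad_samples:
        # bisect_left returns how many sorted values are < timestamp, so
        # events at exactly time T or later are never counted.
        features[log_id] = bisect_left(times_by_user.get(user_id, []), timestamp)
    return features
```

Using `bisect_left` rather than `bisect_right` is what makes the comparison strict: a newsfeed event with the same timestamp as the ad sample is excluded.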
You must upload your work with the file name submission.csv.
We are going to update the leaderboard in near real-time. If your solution makes it to phase 2 (the final phase), you will be invited to Dublin for a 2-day live hackathon event where you can work with mentors, refine your submission, and improve your score.
In this phase, you will also be asked to submit your code for manual verification. If you make it to the top 6, you will also be invited to make a presentation on your solution.
Submit a submission.csv file containing two columns, log_id and pctr. The column log_id is the log_id of the corresponding test sample, and pctr is the CTR that your model predicts for that sample. The value of pctr shall contain six decimal places.
The file format example is as follows:
log_id,pctr
1,0.002345
2,0.012345
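A file in this format could be written with Python's standard `csv` module, as in the minimal sketch below. The function name and the `(log_id, pctr)` pair layout are assumptions for illustration; only the file name, the column names, and the six-decimal requirement come from the rules above.

```python
import csv


def write_submission(predictions, path="submission.csv"):
    """Write (log_id, pctr) pairs to a CSV file with pctr at six decimal places."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["log_id", "pctr"])
        for log_id, pctr in predictions:
            # The fixed-point format pads or rounds to exactly six decimals.
            writer.writerow([log_id, f"{pctr:.6f}"])
```

Calling `write_submission([(1, 0.002345), (2, 0.012345)])` would produce the example file shown above.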
Data Science PhD Student
Nam is a PhD student at Dublin City University, where his main research interest is applying machine learning to brain signal analysis under uncertainty. He is also a research data scientist intern in the AISafe Team at Huawei Ireland Research Centre, where he works mainly on text-to-speech research. Previously, he spent three years working in AI/ML, where he used machine learning to diagnose speech-related disorders.
Working at Huawei is both an interesting and a challenging journey for me. At Huawei, I have the chance to work on state-of-the-art machine learning research with a strong hardware platform for model training and deployment. Working with the most advanced technology challenges me to be my best every day, and I have learnt so much from senior members of the team, not only technically but also in time management and teamwork.
Data Science Ph.D. Student
Dung Tien Pham is a third-year Ph.D. student in Statistics at Trinity College Dublin, currently working as a Research Intern at Huawei IRC. Before embarking on the Ph.D. journey, he spent three years in industry and developed a passion for exploring data, finding insights in it, and turning them into useful products. That passion has led him to many achievements, including the second prize in the Huawei University Challenge last year (2021).
Working at a world-class company like Huawei opens up great opportunities for me to sharpen my skills, strengthen my network, and prepare for life after my Ph.D. Here we are working on state-of-the-art solutions for real-world problems, with an enormous amount of data, and, most importantly, making an impact on millions of people. Huawei has successfully built an open and motivated workplace, where we can develop ourselves while contributing to the company’s success.
AI Solution Architect & Data Scientist
Lei is an experienced and passionate AI solution architect & data scientist who has worked in academia and in the banking and telecommunications industries for more than 10 years. He specialises in AI, cloud computing, and system modelling.
I apply AI and data analytics solutions to network management to pave the way toward fully autonomous networks.
Lead Data Scientist
Tri Kurniawan Wijaya works at Huawei Ireland Research Centre, where he leads several research groups working on Natural Language Processing, Computer Vision, and Recommender Systems. He has 10+ years of experience in the field.
Working with Huawei has been a lot of fun, working with people from various backgrounds (cross-culture), communicating with stakeholders, and developing and improving state-of-the-art solutions to timely problems.
AI Research Scientist
Stefano is an AI researcher at Huawei IRC with more than 6 years of experience working on time series and anomaly detection.
Senior Data Scientist
David Lynch has been working as a Senior Data Scientist with the Huawei IRC team for 2 years. His research focuses on how machine learning can be used to achieve the vision of fully autonomous communications networks. Previously, he completed a Master's and a PhD in Computer Science at University College Dublin, and he has experience in topics including recommender systems, genetic programming, reinforcement learning, optimization, and deep learning.
A typical week at Huawei involves:
1. Meeting with stakeholders to understand business requirements for key technologies.
2. Reading about state-of-the-art approaches in the literature.
3. Designing machine learning algorithms to solve the problem at hand.
4. Evaluating the proposed algorithms through extensive experiments.
AI Software Engineer
Alexandros Agapitos obtained a BSc in Software Engineering in 2003 and a Ph.D. in Computer Science in 2009 from the University of Essex, UK. He joined Huawei Ireland Research Centre in 2016, where he works on the application of AI methods to telecom networks.
Lead Engineer
Manuel Loureiro is a Principal Engineer at Huawei, currently in his second year working on NLP in the AINews team. He has a Ph.D. in Engineering and Public Policy from Carnegie Mellon University.