About this course
4.7
438 ratings
98 reviews
Specialization

Course 2 of 7

100% online

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines to fit your schedule.
Advanced level

Approx. 49 hours to complete

Suggested: 6-10 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Data Analysis, Feature Extraction, Feature Engineering, XGBoost

Syllabus - What you will learn from this course

Week 1
6 hours to complete

Introduction & Recap

This week we will introduce you to competitive data science. You will learn about competition mechanics, the differences between competitions and real-life data science, and the hardware and software that people usually use in competitions. We will also briefly recap the major ML models frequently used in competitions....
8 videos (total 46 min), 7 readings, 6 quizzes
Video: 8 videos
Meet your lecturers (2 min)
Course overview (7 min)
Competition Mechanics (6 min)
Kaggle Overview [screencast] (7 min)
Real World Application vs Competitions (5 min)
Recap of main ML algorithms (9 min)
Software/Hardware Requirements (5 min)
Reading: 7 readings
Welcome! (10 min)
Week 1 overview (10 min)
Disclaimer (10 min)
Explanation for quiz questions (10 min)
Additional Materials and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 5 practice exercises
Practice Quiz (8 min)
Recap (8 min)
Recap (12 min)
Software/Hardware (6 min)
Graded Soft/Hard Quiz (8 min)
2 hours to complete

Feature Preprocessing and Generation with Respect to Models

In this module we will summarize approaches to working with features: preprocessing, generation, and extraction. We will see that the choice of machine learning model affects both the preprocessing we apply to the features and our approach to generating new ones. We will also discuss feature extraction from text with Bag of Words and Word2vec, and feature extraction from images with Convolutional Neural Networks....
7 videos (total 73 min), 4 readings, 4 quizzes
Video: 7 videos
Overview (6 min)
Numeric features (13 min)
Categorical and ordinal features (10 min)
Datetime and coordinates (8 min)
Handling missing values (10 min)
Bag of words (10 min)
Word2vec, CNN (13 min)
Reading: 4 readings
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 4 practice exercises
Feature preprocessing and generation with respect to models (8 min)
Feature preprocessing and generation with respect to models (8 min)
Feature extraction from text and images (8 min)
Feature extraction from text and images (8 min)
29 minutes to complete

Final Project Description

This is just a reminder that it is best to start the final project for this course soon! The final project is in fact a competition, and in this module you can find information about it....
1 video (total 4 min), 2 readings
Video: 1 video
Reading: 2 readings
Final project (10 min)
Final project advice #1 (10 min)
Week 2
2 hours to complete

Exploratory Data Analysis

We will start this week with Exploratory Data Analysis (EDA). It is a very broad and exciting topic and an essential component of the solving process. Besides the regular videos, you will find a walkthrough of the EDA process for the Springleaf competition data and an example of prolific EDA for the NumerAI competition with extraordinary findings....
8 videos (total 80 min), 2 readings, 1 quiz
Video: 8 videos
Building intuition about the data (6 min)
Exploring anonymized data (15 min)
Visualizations (11 min)
Dataset cleaning and other things to check (7 min)
Springleaf competition EDA I (8 min)
Springleaf competition EDA II (16 min)
Numerai competition EDA (6 min)
Reading: 2 readings
Week 2 overview (10 min)
Additional material and links (10 min)
Quiz: 1 practice exercise
Exploratory data analysis (12 min)
2 hours to complete

Validation

In this module we will discuss various validation strategies. We will see that the strategy we choose depends on the competition setup and that a correct validation scheme is one of the building blocks of any winning solution. ...
4 videos (total 51 min), 3 readings, 2 quizzes
Video: 4 videos
Validation strategies (7 min)
Data splitting strategies (14 min)
Problems occurring during validation (20 min)
Reading: 3 readings
Validation strategies (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Validation (8 min)
Validation (8 min)
5 hours to complete

Data Leakages

Finally, in this module we will cover something unique to data science competitions. We will see examples of how it is sometimes possible to get a top position in a competition with very little machine learning, just by exploiting a data leakage. ...
3 videos (total 26 min), 3 readings, 3 quizzes
Video: 3 videos
Leaderboard probing and examples of rare data leaks (9 min)
Expedia challenge (9 min)
Reading: 3 readings
Comments on quiz (10 min)
Additional material and links (10 min)
Final project advice #2 (10 min)
Quiz: 1 practice exercise
Data leakages (8 min)
Week 3
3 hours to complete

Metrics Optimization

This week we will first study another component of competitions: the evaluation metrics. We will recap the most prominent ones and then see how we can efficiently optimize the metric given in a competition....
8 videos (total 83 min), 3 readings, 2 quizzes
Video: 8 videos
Motivation (8 min)
Regression metrics review I (14 min)
Regression metrics review II (8 min)
Classification metrics review (20 min)
General approaches for metrics optimization (6 min)
Regression metrics optimization (10 min)
Classification metrics optimization I (7 min)
Classification metrics optimization II (6 min)
Reading: 3 readings
Week 3 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Metrics (12 min)
Metrics (12 min)
4 hours to complete

Advanced Feature Engineering I

In this module we will study a very powerful technique for feature generation. It goes by many names, but here we call it "mean encodings". We will see the intuition behind them and how to construct, regularize, and extend them. ...
3 videos (total 27 min), 2 readings, 2 quizzes
Video: 3 videos
Regularization (7 min)
Extensions and generalizations (10 min)
Reading: 2 readings
Comments on quiz (10 min)
Final project advice #3 (10 min)
Quiz: 1 practice exercise
Mean encodings (8 min)
Week 4
3 hours to complete

Hyperparameter Optimization

In this module we will talk about the hyperparameter optimization process. We will also have a special video with practical tips and tricks, recorded by four instructors....
6 videos (total 86 min), 4 readings, 2 quizzes
Video: 6 videos
Hyperparameter tuning II (12 min)
Hyperparameter tuning III (13 min)
Practical guide (16 min)
KazAnova's competition pipeline, part 1 (18 min)
KazAnova's competition pipeline, part 2 (17 min)
Reading: 4 readings
Week 4 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Additional materials and links (10 min)
Quiz: 2 practice exercises
Practice quiz (6 min)
Graded quiz (8 min)
4 hours to complete

Advanced Feature Engineering II

In this module we will learn about a few more advanced feature engineering techniques....
4 videos (total 22 min), 2 readings, 2 quizzes
Video: 4 videos
Matrix factorizations (6 min)
Feature Interactions (5 min)
t-SNE (5 min)
Reading: 2 readings
Comments on quiz (10 min)
Additional Materials and Links (10 min)
Quiz: 1 practice exercise
Graded Advanced Features II Quiz (12 min)
10 hours to complete

Ensembling

Nowadays it is hard to find a competition won by a single model! Every winning solution incorporates ensembles of models. In this module we will talk about the main ensembling techniques in general and, of course, how best to ensemble models in practice. ...
8 videos (total 92 min), 4 readings, 4 quizzes
Video: 8 videos
Bagging (9 min)
Boosting (16 min)
Stacking (16 min)
StackNet (14 min)
Ensembling Tips and Tricks (14 min)
CatBoost 1 (7 min)
CatBoost 2 (7 min)
Reading: 4 readings
Validation schemes for 2-nd level models (10 min)
Comments on quiz (10 min)
Additional materials and links (10 min)
Final project advice #4 (10 min)
Quiz: 2 practice exercises
Ensembling (8 min)
Ensembling (12 min)
4.7
98 reviews
Career direction

17%

started a new career after completing these courses
Career benefit

14%

got a tangible career benefit from this course

Top reviews

By MS, Mar 29th 2018

Top Kagglers gently introduce one to Data Science Competitions. One will have a great chance to learn various tips and tricks and apply them in practice throughout the course. Highly recommended!

By MM, Nov 10th 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

Instructors

Dmitry Ulyanov

Visiting lecturer
HSE Faculty of Computer Science

Alexander Guschin

Visiting lecturer at HSE, Lecturer at MIPT
HSE Faculty of Computer Science

Mikhail Trofimov

Visiting lecturer
HSE Faculty of Computer Science

Dmitry Altukhov

Visiting lecturer
HSE Faculty of Computer Science

Marios Michailidis

Research Data Scientist
H2O.ai

About National Research University Higher School of Economics

National Research University - Higher School of Economics (HSE) is one of the top research universities in Russia. Established in 1992 to promote new research and teaching in economics and related disciplines, it now offers programs at all levels of university education across an extraordinary range of fields of study including business, sociology, cultural studies, philosophy, political science, international relations, law, Asian studies, media and communications, IT, mathematics, engineering, and more. Learn more on www.hse.ru...

About the Advanced Machine Learning Specialization

This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings....
Advanced Machine Learning

Frequently Asked Questions

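  • Once you enroll for a certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can be submitted and reviewed only after your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.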

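  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic course certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.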

More questions? Visit the Learner Help Center.