Jan 02, 2025

Data Mining


The practical work below is divided into sessions. In each session, read the material, complete the exercises when there are any, and ask me about anything that is unclear. At the end of each session, send your work to: cguyeux@femto-st.fr.

At the end of the course, I will randomly draw two practical works per student to set the final grade for the practical part. The last session will include a final written test covering all the sessions.

First session: introduction to Python

Second session: The Pandas library

Third session: Clustering (Unsupervised learning)

Fourth session: Supervised learning

General presentation

Supervised learning preparation

Project

Take your clustering code from the previous session and add a dimensionality reduction step upstream. Try several methods, and try to improve your clustering results according to the various metrics proposed previously.
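As a starting point, here is a minimal sketch of what "a dimensionality reduction step upstream" can look like. It assumes scikit-learn is available, and it uses PCA, k-means, and the silhouette score purely as examples: the synthetic data, the number of clusters, and the number of components are all placeholders for your own choices from the previous session.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for your own data: 300 points, 20 features, 4 groups.
X, _ = make_blobs(n_samples=300, n_features=20, centers=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)


def cluster_and_score(features, n_clusters=4):
    """Run k-means on `features` and return (labels, silhouette score)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(features)
    return labels, silhouette_score(features, labels)


# Baseline: clustering on all 20 standardized features.
_, score_raw = cluster_and_score(X_scaled)

# Same clustering after projecting onto the first 2 principal components.
X_reduced = PCA(n_components=2, random_state=0).fit_transform(X_scaled)
_, score_pca = cluster_and_score(X_reduced)

print(f"silhouette without reduction: {score_raw:.3f}")
print(f"silhouette with PCA (2 components): {score_pca:.3f}")
```

Note that each silhouette score here is computed in the space where the clustering was run, so the two values are not strictly comparable. Scoring both sets of labels against the same feature matrix, or using a metric based on known labels when you have them, gives a fairer comparison.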

Fifth session

Sixth session: time series

Choose a time series from the UCI Machine Learning Repository and use it as your running example for the following points. At the end of the day, you can send me what you have achieved on your chosen time series, covering both its analysis and its prediction.
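To make the "prediction" part concrete, the sketch below shows one possible baseline workflow, using only NumPy. Everything in it is an assumption for illustration: the series is synthetic (replace it with the one you chose), the number of lags and the test size are arbitrary, and a linear autoregressive model is only one of many possible predictors. The point is the structure: lagged features, a chronological split, and a comparison against a naive forecast.

```python
import numpy as np

# Synthetic stand-in for a real series: trend + seasonality (period 12) + noise.
rng = np.random.default_rng(0)
t = np.arange(240)
series = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0.0, 0.2, t.size)

n_lags = 12  # hypothetical choice: one full seasonal period


def make_lagged(values, n_lags):
    """Build rows [y(t-n_lags), ..., y(t-1)] with target y(t)."""
    n_rows = len(values) - n_lags
    X = np.column_stack([values[i:i + n_rows] for i in range(n_lags)])
    y = values[n_lags:]
    return X, y


X, y = make_lagged(series, n_lags)

# Chronological split: the last 48 points are held out, with no shuffling.
n_test = 48
X_train, X_test = X[:-n_test], X[-n_test:]
y_train, y_test = y[:-n_test], y[-n_test:]

# Fit a linear autoregressive model by least squares (with an intercept).
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# One-step-ahead predictions on the held-out part.
A_test = np.column_stack([np.ones(len(X_test)), X_test])
pred_ar = A_test @ coef

# Naive baseline: the next value equals the last observed one.
pred_naive = X_test[:, -1]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


mae_ar = mae(y_test, pred_ar)
mae_naive = mae(y_test, pred_naive)
print(f"MAE, autoregressive model: {mae_ar:.3f}")
print(f"MAE, naive forecast:       {mae_naive:.3f}")
```

Keeping the split chronological matters: shuffling before splitting would let the model see values from the future of the test points.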

Machine Learning for IoT - M2 IoT

  • MIT Deep Sequence modeling labs
  • Classification of text
    • Sentiment classification - IMDB dataset - Colab file to complete
    • Number of weights: IMDB-LSTM versus IMDB-GRU
    • A focus on word embedding
    • SMS classification - Link to CSV file in "Classification of text"
      • Colab file SMS-LSTM-classif
      • Try to learn an embedding with Word2Vec based on Colab file Word2Vec
    • Complaint classification - Link to CSV file in "Classification of text"
      • Use SMS classification file to perform multiclass classification of complaints
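For the "Number of weights" item above, the following sketch counts the trainable parameters of a single LSTM layer and a single GRU layer from the standard formulas. The layer sizes are hypothetical, since the Colab files are not reproduced here; plug in the embedding dimension and number of units you actually use. The `reset_after` flag is modeled on the Keras GRU variant that keeps separate input and recurrent biases.

```python
def lstm_param_count(input_dim, units):
    """Parameters of one LSTM layer.

    Four weight blocks (input gate, forget gate, cell candidate, output gate),
    each with input weights, recurrent weights, and one bias vector.
    """
    return 4 * (units * (input_dim + units) + units)


def gru_param_count(input_dim, units, reset_after=False):
    """Parameters of one GRU layer.

    Three weight blocks (update gate, reset gate, candidate state).
    With reset_after=True, each block has two bias vectors instead of one.
    """
    biases_per_block = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + biases_per_block)


# Hypothetical sizes: 32-dimensional word embeddings feeding 32 recurrent units.
embedding_dim, units = 32, 32

n_lstm = lstm_param_count(embedding_dim, units)
n_gru = gru_param_count(embedding_dim, units)
n_gru_reset_after = gru_param_count(embedding_dim, units, reset_after=True)

print(f"LSTM:                   {n_lstm}")             # 8320
print(f"GRU (single bias):      {n_gru}")              # 6240
print(f"GRU (reset_after=True): {n_gru_reset_after}")  # 6336
```

With the same sizes, the GRU layer has roughly three quarters of the LSTM layer's weights, because it uses three weight blocks instead of four.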
