
Hello, the project that I was talking about earlier is finally over. But it took me (and my colleagues) so much time that I wasn’t able to post about it.

Anyway, the project website is online and can be seen here:

OK, it is in French but, first, it is for a French diploma at a French university, so… And second, I’ll post translations of the posts here. See? Everything is gonna be alright.

All the original source code (website and project) is available here:

I’m not happy with it: the code can be optimised (lots of loops, not-so-pretty maps and graphs).

I don’t know exactly what needs to change yet, but I won’t translate it without improving it first.

So, here is the plan. Each part will get its own post:

  • Research question, data source, and variable selection
  • Data management and handling
  • Visualisation of the variables
  • Principal components analysis
  • Hierarchical clustering
  • Correspondence analysis
  • Results and conclusion
  • Tools

We were 3 people working on this project: Ekaterina, Mathieu and me. They did all the hard work with the analysis; I did the variable visualisations and provided the tools (git, liftr and blogdown). I would like to thank them for all the work they did.

There was another team working on the same data set but on a different subject. They handled this huge quantity of data differently and their website is great too. It can be seen here:

This project is over and the results were presented to the examining board on January 11th, 2018, along with 4 other teams working on 2 different subjects and datasets. So the code won’t change anymore, and I think I’ll keep it that way, as a milestone of my coding techniques and my knowledge at that moment.

But here, I’ll try to redo all of it, especially the parts I didn’t do, and improve its quality, both in my own code (I didn’t have enough time to improve some parts) and in my colleagues’ (automating some things, using fewer loops, etc.). That way, I’ll get a better understanding of the parts I didn’t do.

Stay tuned!