39 Commits

Author SHA1 Message Date
913968b8c6 fix typo 2024-11-28 08:36:33 +01:00
b99c187bc5 add details for directories 2024-11-23 18:57:07 +01:00
ec4ad03fca add execution duration 2024-11-20 13:07:30 +01:00
9398f8fae3 adapt instructions with directory change 2024-11-20 11:56:51 +01:00
6cb7913af2 main folder to put files in it, replace category name space, fix / issue in img name 2024-11-20 11:56:10 +01:00
cece9d1874 remove the print() for test 2024-11-20 10:15:58 +01:00
b74090865e add info and structure 2024-11-20 10:12:21 +01:00
6fa035fc1a add final folder, creates category dir and image name from title, full readme 2024-11-20 09:31:07 +01:00
aa0d3a7819 re-organize code to show ETL phases, add comments 2024-11-19 12:40:23 +01:00
73b302a2bc improve comments, indicate phases 2024-11-19 12:25:34 +01:00
90f3b22efb add a screenshot of the result 2024-11-14 15:08:05 +01:00
4785b2e6d8 add way to retrieve images : use requests and write binary in file. Name it with category and incremental number 2024-11-14 15:03:30 +01:00
22ccd97fa3 add content :/ 2024-11-14 14:49:50 +01:00
852c0e781b init phase4 2024-11-14 14:48:59 +01:00
c9aaef7222 add screenshot of the result 2024-11-14 14:27:23 +01:00
549291cd6c Upload files to "phase3"
Screenshot of the result

Signed-off-by: Yann <yann@needsome.coffee>
2024-11-14 13:22:29 +00:00
b34a5d123c refactor output counters 2024-11-14 14:16:36 +01:00
ebd5f5acd4 works. Add processed book and book to go counters displayed 2024-11-14 14:07:04 +01:00
d020998add all functions in same place, and loop in main 2024-11-14 13:56:55 +01:00
12dd0c9dfc init phase3 2024-11-14 13:24:12 +01:00
dd370cca8d add title 2024-11-14 13:22:36 +01:00
c35f7454a2 add text for fancy output and remove previous print (were testing) 2024-11-14 13:20:33 +01:00
4247f1ac83 manage exception when no description 2024-11-14 13:19:21 +01:00
2bbf684c26 build main to call function from phase 1 : build data from each page and write file 2024-11-14 12:37:52 +01:00
27d37fb5d3 copy phase1/main.py as phase1.py to import in main 2024-11-14 12:35:35 +01:00
50ca4fccd8 just one loop to fill the list, "extend" with each page list 2024-11-14 10:47:58 +01:00
c92ce51aa0 refactor some comments 2024-11-13 17:24:44 +01:00
8213f0849c test if multiple page, get URL, create list of product, and refactor main 2024-11-13 17:09:06 +01:00
e3ac12ff9b get category_list from home and get product url from a category (if one page) 2024-11-13 15:46:48 +01:00
7e6875a497 create phase2 folder+readme 2024-11-13 13:48:34 +01:00
c0fcd21346 add get_category 2024-11-13 13:44:27 +01:00
7b7f216be8 Merge branch 'main' of https://mcstn.fr/gitea/Yann/Projet2
I don't really know what this is going to do... I committed, pushed, then
amended my local commit... I'm trying to sync
2024-11-13 13:38:13 +01:00
3a6cf9b87e remove url_base, refactor list get_data, fix comment and PEP8 2024-11-13 11:06:11 +01:00
5d6a9bc263 remove url_base, refactor list get_data, fix comment and PEP8 2024-11-13 11:03:25 +01:00
1adcf0b224 data dict to list and write header then list 2024-11-12 19:17:30 +01:00
9d7edf3e9a improve function, added comment... writer to do 2024-11-12 19:02:32 +01:00
fd53a2d704 add tests, first try to scrape all title, prices 2024-11-12 18:20:27 +01:00
c5b1114e70 Init, README and main with main functions 2024-11-12 17:58:29 +01:00
cffae25b0c Initial commit 2024-11-11 19:43:33 +01:00