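A fast way to upload a dataset to Kaggle
How it works
From mainland China, first upload the files to a host served by a CDN (such as GitHub). Then download them inside a Kaggle kernel and upload them to a dataset directly from there.
Method
First, get familiar with kaggle-api, the official Kaggle command-line tool. From the command line it can download competition data, download and upload datasets, fetch leaderboards, and more.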
https://github.com/Kaggle/kaggle-api
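The operations listed above map onto kaggle-api subcommands. As a minimal sketch, the hypothetical helper `kaggle` below (not part of kaggle-api) builds those command lines and can run them through `subprocess`. `some-competition` and `./my-dataset-folder` are placeholders, while `finlay/shopee-models` is the dataset used later in this post.

```python
import shutil
import subprocess


def kaggle(*args, dry_run=False):
    """Build a kaggle CLI command line; run it unless dry_run is set."""
    cmd = ["kaggle", *args]
    if dry_run:
        return cmd
    if shutil.which("kaggle") is None:
        raise RuntimeError("kaggle CLI not found; try: pip install kaggle")
    subprocess.run(cmd, check=True)
    return cmd


# dry_run=True only builds the argument lists, so nothing is executed here.
examples = [
    # Download the data of a competition (placeholder slug).
    kaggle("competitions", "download", "-c", "some-competition", dry_run=True),
    # Download an existing dataset.
    kaggle("datasets", "download", "-d", "finlay/shopee-models", dry_run=True),
    # Upload a new version of a dataset from a local folder (placeholder path).
    kaggle("datasets", "version", "-p", "./my-dataset-folder",
           "-m", "Updated data", dry_run=True),
    # Show the leaderboard of a competition (placeholder slug).
    kaggle("competitions", "leaderboard", "some-competition", "--show",
           dry_run=True),
]
for cmd in examples:
    print(" ".join(cmd))
```

Dropping `dry_run=True` runs the command for real, which requires the CLI to be installed and the API credentials to be in place.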
Local install: pip install kaggle
Kaggle kernels already have it installed, so there is no need to install it there.
Step 1: Download your account API JSON
https://www.kaggle.com/me/account
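The downloaded file is a small JSON document holding `username` and `key`, and the kaggle CLI reads it from `~/.kaggle/kaggle.json`. A minimal sketch of placing it there from Python follows. `write_kaggle_credentials` is a hypothetical helper and the credential values are placeholders. The demo writes into a temporary directory so that no real credentials are touched.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_kaggle_credentials(username, key, config_dir=None):
    """Write kaggle.json into config_dir (default: ~/.kaggle)."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}))
    # Restrict the file to its owner; the CLI warns when others can read it.
    os.chmod(path, 0o600)
    return path


# Demo with placeholder values in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cred_path = write_kaggle_credentials("your-username", "your-key", tmp)
    saved = json.loads(cred_path.read_text())
    mode = stat.S_IMODE(cred_path.stat().st_mode)
print(saved["username"], oct(mode))
```

In a kernel, calling the helper without `config_dir` would write to the home directory of the current user, which is `/root/.kaggle` in the complete code of this post.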
Step 2: Create a dataset on the website
https://www.kaggle.com/datasets
Step 3: Download the dataset's metadata
Run: kaggle datasets metadata shopee-models
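The metadata file is plain JSON, so it can also be generated instead of typed by hand. The sketch below assumes the same fields that appear in this post's `dataset-metadata.json`. `build_dataset_metadata` and `write_dataset_metadata` are hypothetical helpers, and the Kaggle API may accept or require other fields.

```python
import json
import tempfile
from pathlib import Path


def build_dataset_metadata(owner, slug, title, id_no=None):
    """Return a metadata dict shaped like the one used in this post."""
    metadata = {
        "id": f"{owner}/{slug}",
        "title": title,
        "subtitle": "",
        "description": "",
        "keywords": [],
        "resources": [],
    }
    if id_no is not None:
        # Numeric id as reported by the downloaded metadata.
        metadata["id_no"] = id_no
    return metadata


def write_dataset_metadata(folder, metadata):
    """Write dataset-metadata.json into folder and return its path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "dataset-metadata.json"
    path.write_text(json.dumps(metadata, indent=2))
    return path


metadata = build_dataset_metadata(
    "finlay", "shopee-models", "shopee_models", id_no=122348
)
# Demo: write into a temporary folder and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = write_dataset_metadata(tmp, metadata)
    round_trip = json.loads(meta_path.read_text())
print(round_trip["id"])
```

Using `json.dumps` avoids the quoting and comma mistakes that are easy to make in a hand-written JSON string.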
Step 4: Download the data and upload it to the dataset
Complete code:
# Write the API credentials where the kaggle CLI looks for them.
!mkdir -p /root/.kaggle
lines = '''{"username":"your-username","key":"your-key"}'''
with open('/root/.kaggle/kaggle.json', 'w') as up:
    up.write(lines)
# The CLI warns if other users can read the key file.
!chmod 600 /root/.kaggle/kaggle.json

# Create the upload folder and write the metadata obtained in step 3.
!mkdir -p hubmapkidneysegmentation
lines = '''{
  "id": "finlay/shopee-models",
  "id_no": 122348,
  "title": "shopee_models",
  "subtitle": "",
  "description": "",
  "keywords": [],
  "resources": []
}'''
with open('hubmapkidneysegmentation/dataset-metadata.json', 'w') as up:
    up.write(lines)

# Install axel and download the file over 12 parallel connections.
!apt-get install -y axel
!axel -n 12 https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b7-dcc49843.pth -o hubmapkidneysegmentation/baseline_fold0_densenet_224_epoch50.pth

# Upload the folder contents as a new version of the dataset.
!kaggle datasets version -p ./hubmapkidneysegmentation -m "Updated data fcn"
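If axel cannot be installed in the kernel, a single-connection download with the Python standard library is a possible fallback. It will usually be slower than `axel -n 12`, which opens 12 connections. This is only a sketch: `download` is a hypothetical helper, and the demo uses a local `file://` URL as a stand-in for the release asset so that it runs offline.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download(url, dest):
    """Stream url to dest over a single connection and return dest."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # Copy in chunks so large files are never held fully in memory.
        shutil.copyfileobj(response, out)
    return dest


# Offline demo: a local file:// URL stands in for the GitHub release asset.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "weights.pth"
    source.write_bytes(b"fake weights")
    target = download(source.as_uri(), Path(tmp) / "dataset" / "weights.pth")
    copied_bytes = target.read_bytes()
print(len(copied_bytes))
```

In the kernel, the same call would take the GitHub release URL and a path inside the upload folder, followed by the `kaggle datasets version` command as before.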