Transfer learning a TensorRT (TRT) model with the NVIDIA TLT toolkit
1. Configure environment variables
# Setting up env variables for cleaner command line commands.
print("Update directory paths if needed")
%env KEY=tlt_encode
# User directory - pre-trained/unpruned/pruned/final models will be saved here
%env USER_EXPERIMENT_DIR=../tlt_cv_samples_v1.1.0/detectnet_v2
# Download directory - tfrecords will be generated here
%env DATA_DOWNLOAD_DIR=../data
# Spec Directory
%env SPECS_DIR=../face-mask-detection-master/tlt_specs_resnet50
# Number of GPUs used for training
%env NUM_GPUS=2
2. Prepare the dataset
Dataset conversion spec file:
print("TFrecords conversion spec file for kitti training")
!cat $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt
TLT takes datasets in KITTI format as input, so the KITTI data must be converted to TFRecords before training:
# Creating a new directory for the output tfrecords dump.
print("Converting Tfrecords for kitti trainval dataset")
!tlt-dataset-convert -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval
Check the conversion output:
!ls -rlt $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/
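For reference, each line of a KITTI label file describes one object with 15 space-separated fields. The sketch below parses one such line in Python. The names `KittiLabel` and `parse_kitti_line` are illustrative and not part of TLT, and the sample line uses made-up values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class KittiLabel:
    class_name: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # xmin, ymin, xmax, ymax in pixels
    dimensions: Tuple[float, ...]  # height, width, length
    location: Tuple[float, ...]    # x, y, z in camera coordinates
    rotation_y: float


def parse_kitti_line(line: str) -> KittiLabel:
    """Parse one line of a KITTI label file (15 space-separated fields)."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(fields)))
    nums = [float(v) for v in fields[1:]]
    return KittiLabel(
        class_name=fields[0],
        truncated=nums[0],
        occluded=int(nums[1]),
        alpha=nums[2],
        bbox=tuple(nums[3:7]),
        dimensions=tuple(nums[7:10]),
        location=tuple(nums[10:13]),
        rotation_y=nums[13],
    )


# Made-up example line for a 2D detection dataset (3D fields left at zero).
sample = "mask 0.00 0 0.00 100.00 120.00 220.00 260.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
label = parse_kitti_line(sample)
print(label.class_name, label.bbox)
```

Running a check like this over the label files before conversion is a cheap way to catch malformed lines early.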
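3. Choose a pretrained model
List the DetectNet_v2 pretrained model versions available on NGC. Pick a version with a ResNet-18 or ResNet-50 backbone; the commands below use ResNet-18.
# List models available in the model registry.
!ngc registry model list nvidia/tlt_pretrained_detectnet_v2:*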
Create the directory where the model will be saved:
# Create the target destination to download the model.
!mkdir -p $USER_EXPERIMENT_DIR/pretrained_resnet18/
Download the model:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tlt_pretrained_detectnet_v2:resnet18 \
    --dest $USER_EXPERIMENT_DIR/pretrained_resnet18
Check the downloaded files:
!ls -rlt $USER_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18
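4. Edit the spec file
The training spec file has to be adapted following the official documentation. The samples ship with a spec file; see the previous TLT article for how to download the samples.
!cat $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt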
5. Train
!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS
Check the training output:
print('Model for each epoch:')
print('---------------------')
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights
6. Evaluate
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
                           -k $KEY
7. Prune the model
Usually only the -pth (threshold) parameter needs tuning. A suggested starting value is 5.2e-6, while the command below uses 0.8. Increasing the value yields a smaller model; decreasing it preserves more accuracy.
# Create an output directory if it doesn't exist.
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_pruned
print("Change Threshold (-pth) value according to your experiments")
!tlt-prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
           -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt \
           -eq union \
           -pth 0.8 \
           -k $KEY
Check the pruning output:
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_pruned/
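To build intuition for the -pth trade-off, the sketch below applies a normalized-norm threshold to a toy layer: a filter is kept only if its L2 norm, divided by the largest norm in the layer, is at least the threshold. This is a simplified illustration under that assumption, not the actual `tlt-prune` implementation; `keep_filters` and the toy weights are hypothetical.

```python
import math
from typing import List, Sequence


def keep_filters(filters: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of filters whose max-normalized L2 norm is >= threshold."""
    if not filters:
        return []
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    max_norm = max(norms)
    if max_norm == 0.0:
        return []
    return [i for i, n in enumerate(norms) if n / max_norm >= threshold]


# Toy layer with three filters of very different magnitude.
toy_layer = [
    [3.0, 4.0],    # L2 norm 5.0, normalized 1.0
    [0.6, 0.8],    # L2 norm about 1.0, normalized about 0.2
    [0.03, 0.04],  # L2 norm about 0.05, normalized about 0.01
]

for pth in (0.005, 0.1, 0.8):
    kept = keep_filters(toy_layer, pth)
    print("pth={}: kept {} of {} filters".format(pth, len(kept), len(toy_layer)))
```

A higher threshold discards more filters, which is why the pruned model shrinks as -pth grows and why retraining is needed afterwards to recover accuracy.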
8. Retrain the pruned model
Retrain spec file:
# Printing the retrain experiment file.
# Note: the experiment file has been updated to use the
# newly pruned model as pretrained weights, and the
# load_graph option is set to true.
!cat $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt
Retrain:
# Retraining using the pruned model as pretrained weights
!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                        -k $KEY \
                        -n resnet18_detector_pruned \
                        --gpus $NUM_GPUS
Check the retrained model:
# Listing the newly retrained model.
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights
9. Evaluate the retrained model
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
                           -k $KEY
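Besides accuracy, it can be useful to see how much smaller the retrained model file is than the unpruned one. The helper below is a minimal sketch that compares file sizes on disk; `size_reduction` is a hypothetical name, and the paths assume the output locations used by the commands in this walkthrough.

```python
import os


def size_reduction(original_path: str, pruned_path: str) -> float:
    """Return the fraction by which the pruned file is smaller than the original."""
    original = os.path.getsize(original_path)
    pruned = os.path.getsize(pruned_path)
    if original == 0:
        raise ValueError("original file is empty")
    return 1.0 - pruned / original


# Paths assumed from the training and retraining steps in this walkthrough.
base = os.environ.get("USER_EXPERIMENT_DIR", ".")
unpruned = os.path.join(base, "experiment_dir_unpruned/weights/resnet18_detector.tlt")
retrained = os.path.join(base, "experiment_dir_retrain/weights/resnet18_detector_pruned.tlt")

if os.path.exists(unpruned) and os.path.exists(retrained):
    print("Size reduction: {:.1%}".format(size_reduction(unpruned, retrained)))
else:
    print("Model files not found; run training and retraining first.")
```

File size is only a rough proxy for inference cost, so it complements the evaluation metrics rather than replacing them.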
10. Visualize inference results
Inference spec file:
!cat $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt
Run inference on a batch of images:
# Running inference for detection on n images
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt \
                        -o $USER_EXPERIMENT_DIR/tlt_infer_testing \
                        -i $DATA_DOWNLOAD_DIR/test/images \
                        -k $KEY
View the results in the Jupyter notebook:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil

valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['USER_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even with a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80, 60], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path)
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        print(idx, img_path)
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

# Visualizing the first 12 images.
OUTPUT_PATH = 'tlt_infer_testing/images_annotated'  # relative path from $USER_EXPERIMENT_DIR.
COLS = 4  # number of columns in the visualizer grid.
IMAGES = 12  # number of images to visualize.
visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)
11. Deploy
Export the .tlt model to the encrypted deployment format, .etlt:
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_final
# Removing a pre-existing copy of the etlt if there has been any.
import os
output_file = os.path.join(os.environ['USER_EXPERIMENT_DIR'],
                           "experiment_dir_final/resnet18_detector.etlt")
if os.path.exists(output_file):
    os.remove(output_file)
# Export the unpruned model for now, not the retrained one.
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY
Use tlt-converter to convert the .etlt model into a TensorRT engine, written as a .trt file:
# Precision is not reduced here.
!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k $KEY \
               -d 3,544,960 \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -i nchw \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt
Note: if this step fails, check that $KEY is set:
!echo $KEY
If nothing is printed, set the variable in the Docker workspace.
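The same check can be done for every variable the notebook relies on. The sketch below lists the variables from step 1 that are unset or empty; `missing_vars` and `REQUIRED_VARS` are illustrative names, not part of TLT.

```python
import os
from typing import Iterable, List, Mapping

# Variables defined in step 1 of this walkthrough.
REQUIRED_VARS = ("KEY", "USER_EXPERIMENT_DIR", "DATA_DOWNLOAD_DIR", "SPECS_DIR", "NUM_GPUS")


def missing_vars(names: Iterable[str], environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or contain only whitespace."""
    return [name for name in names if not environ.get(name, "").strip()]


missing = missing_vars(REQUIRED_VARS)
if missing:
    print("Unset or empty:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Running this at the top of the notebook surfaces a missing key before any long-running command is launched.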
12. Verify the TRT model
Run inference:
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt \
                        -o $USER_EXPERIMENT_DIR/etlt_infer_testing \
                        -i $DATA_DOWNLOAD_DIR/test/images \
                        -k $KEY
View the results in Jupyter:
# Visualize the first 12 inferenced images.
OUTPUT_PATH = 'etlt_infer_testing/images_annotated'  # relative path from $USER_EXPERIMENT_DIR.
COLS = 4  # number of columns in the visualizer grid.
IMAGES = 12  # number of images to visualize.
visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)